Customize load balancing with KongUpstreamPolicy

TL;DR

Create a KongUpstreamPolicy resource, then add the konghq.com/upstream-policy annotation to your Service.

Prerequisites

If you don’t have a Konnect account, you can get started quickly with our onboarding wizard.

  1. The following Konnect items are required to complete this tutorial:
    • Personal access token (PAT): Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.
  2. Set the personal access token as an environment variable:

    export KONNECT_TOKEN='YOUR KONNECT TOKEN'
    

Deploy additional echo replicas

To demonstrate Kong’s load balancing functionality, we need multiple echo Pods. Scale out the echo deployment:

kubectl scale -n kong --replicas 2 deployment echo

Use KongUpstreamPolicy with a Service resource

By default, Kong will round-robin requests between upstream replicas. If you run curl -s $PROXY_IP/echo | grep "Pod" repeatedly, you should see the reported Pod name alternate between two values.

You can configure the Kong Upstream associated with the Service to use a different load balancing strategy, such as consistently sending requests to the same upstream based on a header value. See the KongUpstreamPolicy reference for the full list of supported algorithms and their configuration options.
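The property this strategy relies on (the same input always selects the same backend) can be sketched locally. The following toy script is only an illustration, not Kong’s actual ring-based algorithm; cksum plus a modulo are stand-ins for the real hash, and they lack the "minimal remapping on scaling" behavior that true consistent hashing provides:

```shell
# Toy illustration of hash-based backend selection: the same header value
# always lands on the same backend index. Kong's real consistent-hashing
# ring is more sophisticated, but "same input -> same backend" is the
# property this tutorial demonstrates.
backends=2

bucket() {
  # cksum emits a stable checksum of the input; modulo picks a backend index.
  printf '%s' "$1" | cksum | awk -v n="$backends" '{print $1 % n}'
}

echo "foo -> backend $(bucket foo)"
echo "foo -> backend $(bucket foo)"   # same value, same backend
echo "bar -> backend $(bucket bar)"
```

Repeated calls with the same value always return the same index, which is exactly what Kong does (with a better hash) when hashing on a header.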

Let’s create a KongUpstreamPolicy resource defining the new behavior:

echo '
apiVersion: configuration.konghq.com/v1beta1
kind: KongUpstreamPolicy
metadata:
  name: sample-customization
  namespace: kong
spec:
  algorithm: consistent-hashing
  hashOn:
    header: demo
  hashOnFallback:
    input: ip
  ' | kubectl apply -f -

Now, let’s associate this KongUpstreamPolicy resource with our Service resource using the konghq.com/upstream-policy annotation.

kubectl patch -n kong service echo \
  -p '{"metadata":{"annotations":{"konghq.com/upstream-policy":"sample-customization"}}}'

With consistent hashing and the client IP fallback, repeated requests that don’t set the demo header are now all routed to the same Pod:

for n in {1..5}; do curl -s $PROXY_IP/echo | grep "Pod"; done
Running on Pod echo-965f7cf84-frpjc.
Running on Pod echo-965f7cf84-frpjc.
Running on Pod echo-965f7cf84-frpjc.
Running on Pod echo-965f7cf84-frpjc.
Running on Pod echo-965f7cf84-frpjc.

If you add the demo header, Kong hashes its value and routes requests with the same header value to the same replica:

for n in {1..3}; do
  curl -s $PROXY_IP/echo -H "demo: foo" | grep "Pod";
  curl -s $PROXY_IP/echo -H "demo: bar" | grep "Pod";
  curl -s $PROXY_IP/echo -H "demo: baz" | grep "Pod";
done
Running on Pod echo-965f7cf84-wlvw9.
Running on Pod echo-965f7cf84-frpjc.
Running on Pod echo-965f7cf84-wlvw9.
Running on Pod echo-965f7cf84-wlvw9.
Running on Pod echo-965f7cf84-frpjc.
Running on Pod echo-965f7cf84-wlvw9.
Running on Pod echo-965f7cf84-wlvw9.
Running on Pod echo-965f7cf84-frpjc.
Running on Pod echo-965f7cf84-wlvw9.

Increasing the replica count redistributes some subsequent requests onto the new replica:

kubectl scale -n kong --replicas 3 deployment echo
for n in {1..3}; do
  curl -s $PROXY_IP/echo -H "demo: foo" | grep "Pod";
  curl -s $PROXY_IP/echo -H "demo: bar" | grep "Pod";
  curl -s $PROXY_IP/echo -H "demo: baz" | grep "Pod";
done
Running on Pod echo-965f7cf84-5h56p.
Running on Pod echo-965f7cf84-5h56p.
Running on Pod echo-965f7cf84-wlvw9.
Running on Pod echo-965f7cf84-5h56p.
Running on Pod echo-965f7cf84-5h56p.
Running on Pod echo-965f7cf84-wlvw9.
Running on Pod echo-965f7cf84-5h56p.
Running on Pod echo-965f7cf84-5h56p.
Running on Pod echo-965f7cf84-wlvw9.

Kong’s load balancer doesn’t directly distribute requests to each of the Service’s endpoints. It first distributes them evenly across a number of equal-size buckets. These buckets are then distributed across the available endpoints according to their weight. For Ingresses, however, there is only one Service, and the controller assigns each endpoint (represented by a Kong Upstream Target) equal weight. In this case, requests are evenly hashed across all endpoints.

Gateway API HTTPRoute rules support distributing traffic across multiple Services. The rule can assign weights to the Services to change the proportion of requests an individual Service receives. In Kong’s implementation, all endpoints of a Service have the same weight. Kong calculates a per-endpoint Upstream Target weight such that the aggregate target weight of the endpoints is equal to the proportion indicated by the HTTPRoute weight.
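As a sketch of what such a weighted rule looks like, an HTTPRoute splitting traffic 50/50 might resemble the following. The Gateway name, Service names, and port here are placeholders, not resources created earlier in this tutorial:

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: echo-split       # hypothetical route name
  namespace: kong
spec:
  parentRefs:
  - name: kong           # your Gateway's name
  rules:
  - backendRefs:
    - name: echo-v1      # placeholder Service, four endpoints in the example below
      port: 1027
      weight: 50
    - name: echo-v2      # placeholder Service, two endpoints in the example below
      port: 1027
      weight: 50
```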

For example, say you have two Services with the following configuration:

  • One Service has four endpoints
  • The other Service has two endpoints
  • Each Service has weight 50 in the HTTPRoute

The Targets created for the two-endpoint Service have double the weight of the Targets created for the four-endpoint Service (two weight 16 Targets and four weight 8 Targets). Scaling the four-endpoint Service to eight would halve the weight of its Targets (two weight 16 Targets and eight weight 4 Targets).
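The arithmetic above can be checked directly. The common per-Service aggregate of 32 is illustrative; the point is that Kong chooses per-Target weights so each Service’s Targets sum to the same total, matching the 50/50 split:

```shell
# Per-Target weight = (common per-Service aggregate) / (number of endpoints).
# With a 50/50 HTTPRoute split, each Service's Targets must sum to the same total.
aggregate=32            # illustrative common total per Service

for endpoints in 2 4 8; do
  echo "$endpoints endpoints -> Target weight $((aggregate / endpoints))"
done
```

This reproduces the numbers above: two endpoints get weight 16, four get weight 8, and eight get weight 4.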

KongUpstreamPolicy can also configure Upstream health checking behavior. See the KongUpstreamPolicy reference for the health check fields.
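As a sketch of the shape such configuration takes (verify field names against the reference; the /health path and thresholds here are arbitrary, and the echo app may not serve that path), active health checking might look like:

```yaml
apiVersion: configuration.konghq.com/v1beta1
kind: KongUpstreamPolicy
metadata:
  name: sample-customization
  namespace: kong
spec:
  algorithm: consistent-hashing
  hashOn:
    header: demo
  hashOnFallback:
    input: ip
  healthchecks:
    active:
      type: http
      httpPath: /health     # hypothetical probe path
      healthy:
        interval: 5
        successes: 3        # probes needed to mark a Target healthy
      unhealthy:
        interval: 5
        httpFailures: 3     # failures needed to mark a Target unhealthy
```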

Cleanup

kubectl delete -n kong -f https://developer.konghq.com/manifests/kic/echo-service.yaml
