Load Balancing Reference
Kong provides multiple ways of load balancing requests to multiple backend
services: the default DNS-based method, and an advanced set of load-balancing
algorithms using the Upstream entity.
The DNS load balancer is enabled by default and is limited to round-robin
load balancing. The upstream entity adds health-check and circuit-breaker
functionality, as well as more advanced algorithms such as least-connections,
consistent-hashing, and lowest-latency.
Refer to the DNS caveats depending on your infrastructure.
DNS-based load balancing
Every Service that has been defined with a host containing a hostname
(instead of an IP address) will automatically use DNS-based load balancing
if the name resolves to multiple IP addresses.
The DNS record ttl setting (time to live) determines how often the information
is refreshed. When using a ttl of 0, every request will be resolved using its
own DNS query. This carries a performance penalty, but the latency of
updates/changes will be very low.
The round-robin algorithm used, weighted or not, depends on the DNS record type of the
hostname.
An A record contains one or more IP addresses. Hence, when a hostname
resolves to an A record, each backend service must have its own IP address.
Because there is no weight information, all entries will be treated as equally
weighted in the load balancer, and the balancer will do a straightforward
round-robin.
An SRV record contains weight and port information for all of its IP addresses.
A backend service can be identified by a unique combination of IP address
and port number. Hence, a single IP address can host multiple instances of the
same service on different ports.
SRV records also feature a priority property. Kong will only use the entries with
the highest priority, and ignore all others (note that the “highest priority” in an
SRV record is actually the record with the lowest priority value).
Because weight information is available, each entry will get its own
weight in the load balancer and it will perform a weighted round-robin.
Similarly, any given port information will be overridden by the port information from
the DNS server. If a Service has attributes host=myhost.com and port=123, and
myhost.com resolves to an SRV record with 127.0.0.1:456, then the request
will be proxied to http://127.0.0.1:456/somepath, as port 123 will be
overridden by 456.
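The SRV handling described above can be illustrated with a minimal Python sketch. It is not Kong's implementation: the `SrvEntry` type and both helper functions are hypothetical names introduced here, and the weighting is done naively by repeating each entry once per unit of weight.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class SrvEntry:
    priority: int  # a lower value means a higher priority
    weight: int
    port: int
    target: str


def usable_entries(entries):
    """Keep only the entries that share the lowest priority value."""
    best = min(e.priority for e in entries)
    return [e for e in entries if e.priority == best]


def weighted_round_robin(entries):
    """Cycle endlessly over (target, port), repeating each entry per its weight."""
    slots = [(e.target, e.port) for e in entries for _ in range(e.weight)]
    return cycle(slots)


records = [
    SrvEntry(priority=10, weight=2, port=456, target="127.0.0.1"),
    SrvEntry(priority=10, weight=1, port=457, target="127.0.0.1"),
    SrvEntry(priority=20, weight=5, port=458, target="127.0.0.2"),  # ignored
]
picker = weighted_round_robin(usable_entries(records))
first_three = [next(picker) for _ in range(3)]
```

Note that the port comes from the SRV entry, not from the Service configuration, which mirrors the override rule above.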
Kong will trust the nameserver. This means that information retrieved via a DNS
query will have higher precedence than the configured values. This mostly relates
to SRV records which carry port and weight information.
Whenever the DNS record is refreshed a list is generated to handle the
weighting properly. Try to keep the weights as multiples of each other to keep
the algorithm performant, e.g., 2 weights of 17 and 31 would result in a structure
with 527 entries, whereas weights 16 and 32 (or their smallest relative
counterparts 1 and 2) would result in a structure with merely 3 entries. This is
especially relevant with a very small (or even 0) ttl value.
DNS is carried over UDP with a default limit of 512 bytes. If there are many entries
to be returned, a DNS server will respond with partial data and set a truncate flag,
indicating there are more entries unsent.
DNS clients, including Kong’s, will then make a second request over TCP to retrieve the full
list of entries.
- Some nameservers by default do not respond with the truncate flag, but trim the response
to fit within the 512-byte UDP limit.
- Consul is an example. Consul, in its default configuration, returns up to the first
three entries only, and does not set the truncate flag to indicate there are remaining entries unsent.
Consul includes an option to enable the truncate flag. Please refer to the Consul documentation
for more information.
If a deployed nameserver does not provide the truncate flag, the pool
of upstream instances might be loaded inconsistently. The Kong node is effectively
unaware of some of the instances, due to the limited information provided by the nameserver.
To mitigate this, use a different nameserver, use IP addresses instead of names, or make sure
you use enough Kong nodes to still keep all upstream services in use.
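To check whether a nameserver sets the truncate flag, you can inspect the header of a raw DNS response. The following Python sketch assumes you have already captured the response bytes; the function name and the hand-built sample headers are illustrative only. The TC bit position follows the standard DNS header layout.

```python
import struct

TC_FLAG = 0x0200  # truncation bit in the 16-bit DNS header flags field


def is_truncated(response: bytes) -> bool:
    """Return True if the DNS response header has the TC bit set."""
    if len(response) < 12:
        raise ValueError("a DNS message header is 12 bytes long")
    _msg_id, flags = struct.unpack("!HH", response[:4])
    return bool(flags & TC_FLAG)


# Hand-built 12-byte headers: ID, flags, then four zeroed count fields.
truncated = struct.pack("!HHHHHH", 0x1234, 0x8380, 0, 0, 0, 0)
complete = struct.pack("!HHHHHH", 0x1234, 0x8180, 0, 0, 0, 0)
```

A response that carries fewer records than expected and for which `is_truncated` returns `False` suggests the nameserver is trimming silently.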
When the nameserver returns a “3 name error” (response code 3, meaning the name
does not exist), that is a valid response for Kong. If this is unexpected, first
verify that the correct name is being queried, then check your nameserver
configuration.
- The initial pick of an IP address from a DNS record (A or SRV) is not
randomized. So when using records with a
ttl of 0, the nameserver is
expected to randomize the record entries.
Advanced load-balancing algorithms are available through the upstream entity.
When using these load balancers, the adding and removing of backend services will
be handled by Kong, and no DNS updates will be necessary. Kong will act as the
service registry.
Configuring the load balancers is done through the upstream and target entities:
upstream: a ‘virtual hostname’ which can be used in a Service host
field, e.g., an upstream named weather.v2.service would get all requests
from a Service with host=weather.v2.service. The upstream carries the
properties that determine the load-balancing behaviour (as well
as the health-checks and circuit-breaker configuration).
target: an IP address or hostname with a port number where a backend
service resides, e.g. “192.168.100.12:80”. Each target gets an additional
weight to indicate the relative load it gets. IP addresses can be
in both IPv4 and IPv6 format.
Each upstream can have many target entries attached to it, and requests proxied
to the ‘virtual hostname’ will be load balanced over the targets.
Adding and removing targets can be done with a simple HTTP request on the
Admin API. This operation is relatively cheap. Changing the upstream
itself is more expensive because the balancer must be rebuilt, for example
when the number of slots changes.
Detailed information on adding and manipulating upstreams is available in the
upstream section of the Admin API reference.
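As a rough sketch of what those HTTP requests might look like, the Python snippet below builds (but does not send) a request to create an upstream and a request to attach a target to it. The Admin API address `http://localhost:8001` and the helper names are assumptions; check the paths and field names against the Admin API reference for your Kong version before relying on them.

```python
import json
from urllib.request import Request

ADMIN_URL = "http://localhost:8001"  # assumption: Admin API address of your node
JSON_HEADERS = {"Content-Type": "application/json"}


def create_upstream(name: str) -> Request:
    """Build a request that creates an upstream with the given virtual hostname."""
    body = json.dumps({"name": name}).encode()
    return Request(f"{ADMIN_URL}/upstreams", data=body,
                   headers=JSON_HEADERS, method="POST")


def add_target(upstream: str, target: str, weight: int = 100) -> Request:
    """Build a request that attaches a target (host:port) to an upstream."""
    body = json.dumps({"target": target, "weight": weight}).encode()
    return Request(f"{ADMIN_URL}/upstreams/{upstream}/targets", data=body,
                   headers=JSON_HEADERS, method="POST")


upstream_req = create_upstream("weather.v2.service")
target_req = add_target("weather.v2.service", "192.168.100.12:80", weight=100)
```

Passing either object to `urllib.request.urlopen()` would send it to a running Kong node.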
A target is an IP address/hostname with a port that identifies an instance of
a backend service. Each upstream can have many targets.
Detailed information on adding and manipulating targets is available in the
target section of the Admin API reference.
The targets will be automatically cleaned when there are 10x more inactive
entries than active ones. Cleaning will involve rebuilding the balancer, and
hence is more expensive than just adding a target entry.
A target can also have a hostname instead of an IP address. In that case
the name will be resolved and all entries found will individually be added to
the ring balancer, e.g., adding api.host.com:123 with weight=100. The
name ‘api.host.com’ resolves to an A record with 2 IP addresses. Then both
IP addresses will be added as targets, each getting weight=100 and port 123.
NOTE: the weight is used for the individual entries, not for the whole!
If the name resolved to an SRV record instead, the port and weight fields
from the DNS record would also be picked up, and would overrule the given port
123 and weight=100.
Note: similar to the DNS based load-balancing, only the highest priority
entries (the lowest values) in an SRV record will be used.
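The expansion of a hostname target into individual balancer entries can be sketched as follows. This is a simplified illustration, not Kong's code: `expand_target` is a hypothetical helper, and the DNS answers are passed in as plain lists rather than resolved live.

```python
def expand_target(port, weight, a_records=None, srv_records=None):
    """Return the (ip, port, weight) entries a hostname target expands to.

    a_records: list of IP address strings.
    srv_records: list of (priority, weight, port, ip) tuples.
    """
    if srv_records:
        best = min(record[0] for record in srv_records)
        # SRV data overrules the configured port and weight; only the
        # entries with the lowest priority value are used.
        return [(ip, p, w) for prio, w, p, ip in srv_records if prio == best]
    # A records carry no port or weight, so every IP gets the configured values.
    return [(ip, port, weight) for ip in (a_records or [])]


from_a = expand_target(123, 100, a_records=["10.0.0.1", "10.0.0.2"])
from_srv = expand_target(
    123, 100,
    srv_records=[(1, 20, 8080, "10.0.0.1"), (2, 50, 9090, "10.0.0.3")],
)
```

With the A record, both addresses receive weight 100 individually, so the hostname as a whole carries twice the configured weight.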
The balancer will honor the DNS record’s ttl setting; upon expiry it queries the
nameserver and updates the balancer.
Exception: When a DNS record has ttl=0, the hostname will be added
as a single target, with the specified weight. Upon every proxied request
to this target it will query the nameserver again.
As described in the target paragraph, the targets can be specified as hostnames.
In orchestrated environments like k8s or docker-compose, the IP addresses and ports
are mostly ephemeral and SRV records must be used to find the appropriate backends and
to stay up to date.
On a DNS level many infrastructure tools can also provide load-balancing type features.
These are mostly service-discovery tools that will have their own health-checks and
will randomize DNS records, or only return a small subset of available peers.
The Kong load balancers and the DNS-based tools often work against each other. The nameserver will
provide as little information as possible to force clients to follow its scheme, whereas
Kong tries to get all backends in order to properly set up its load balancers and health-checks.
In your environment, ensure that:
- the nameserver sets the truncation flag on the responses when it cannot fit all
records in the UDP response. This will force Kong to retry using TCP.
- TCP queries are allowed on the nameserver.
The load balancers support several load-balancing algorithms, including
round-robin, consistent-hashing, and least-connections, which are described below.
These algorithms are only available when using the upstream entity, see
the advanced load-balancing description above.
Note: for all these algorithms it is important to understand how the weights
and ports of the individual backends are being set up. See the Target
paragraph on how the actual weights and ports are being determined based on user
configuration as well as DNS results.
The round-robin algorithm will be done in a weighted manner. It will be identical
in results to the DNS based load-balancing, but due to it being an upstream,
the additional features for health-checks and circuit-breakers will be available
in this case.
When choosing this algorithm, consider the following:
- good distribution of requests.
- fairly static, as only DNS updates or
target updates can influence the
distribution of traffic.
- does not improve cache-hit ratios.
With the consistent-hashing algorithm a configurable client-input will be used to
calculate a hash-value. This hash-value will then be tied to a specific backend
server.
A common example would be to use the consumer as a hash-input. Since this ID is
the same for every request from that user, it will ensure that the same user will
consistently be dealt with by the same backend server. This will allow for cache
optimizations on the backend, since each of the servers only serves a fixed subset
of the users, and hence can improve its cache-hit-ratio for user related data.
This algorithm implements the ketama principle to
maximize hashing stability and minimize consistency loss upon changes to the list
of known backends.
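To illustrate why a ring-based scheme keeps assignments stable, here is a minimal consistent-hashing sketch in Python. It is written in the spirit of ketama but is not Kong's implementation: the `HashRing` class, the use of SHA-256, and the number of points per backend are all assumptions made for the example.

```python
import bisect
import hashlib


def _point(value: str) -> int:
    """Map a string to a 32-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, backends, points_per_backend=160):
        # Each backend is placed on the ring many times to even out the load.
        self._ring = sorted(
            (_point(f"{backend}#{i}"), backend)
            for backend in backends
            for i in range(points_per_backend)
        )
        self._keys = [point for point, _ in self._ring]

    def lookup(self, hash_input: str) -> str:
        """Return the backend owning the first ring point after the input's hash."""
        idx = bisect.bisect(self._keys, _point(hash_input)) % len(self._ring)
        return self._ring[idx][1]


backends = ["10.0.0.1:80", "10.0.0.2:80", "10.0.0.3:80"]
ring = HashRing(backends)
smaller = HashRing(backends[:2])  # the third backend has been removed
users = [f"consumer-{n}" for n in range(200)]
# Users that were NOT on the removed backend but still changed backend.
moved = [u for u in users
         if ring.lookup(u) != "10.0.0.3:80" and ring.lookup(u) != smaller.lookup(u)]
```

In this sketch `moved` stays empty: removing a backend only reassigns the inputs that were mapped to it, while every other input keeps its backend.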
When using the consistent-hashing algorithm, the input for the hash can be either
none, consumer, ip, header, or cookie. When set to none, the
round-robin scheme will be used, and hashing will be disabled. The
algorithm supports a primary and a fallback hashing attribute; in case the primary
fails (e.g., if the primary is set to consumer, but no Consumer is authenticated),
the fallback attribute is used. This maximizes upstream cache hits.
Supported hashing attributes are:
none: Do not use consistent-hashing, use round-robin instead (default).
consumer: Use the Consumer ID as the hash input. If no Consumer ID is available,
it will fall back on the Credential ID (for example, in case of an external authentication mechanism like LDAP).
ip: Use the originating IP address as the hash input. Review the configuration
settings for determining the real IP when using this.
header: Use a specified header as the hash input. The header name is
specified in either hash_on_header or hash_fallback_header, depending on whether
header is a primary or fallback attribute, respectively.
cookie: Use a specified cookie with a specified path as the hash input.
The cookie name is specified in the hash_on_cookie field and the path is
specified in the hash_on_cookie_path field. If the specified cookie is not
present in the request, it will be set by the response. Hence, the
hash_fallback setting is invalid if cookie is the primary hashing mechanism.
The generated cookie will have a random UUID value. So the first assignment will
be random, but then sticks because it is preserved in the cookie.
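The primary and fallback selection described above can be sketched in a few lines of Python. The `request` dictionary, with one already-extracted value per attribute name, and the `pick_hash_input` helper are hypothetical; the sketch only shows the order in which the attributes are consulted.

```python
def pick_hash_input(request: dict, hash_on: str, hash_fallback: str = "none"):
    """Return the value to hash on, or None when there is nothing to hash."""
    if hash_on == "none":
        return None  # hashing is disabled, round-robin is used instead
    for attribute in (hash_on, hash_fallback):
        if attribute != "none" and request.get(attribute) is not None:
            return request[attribute]
    return None  # neither attribute is available; handling this is out of scope here


# No Consumer is authenticated, so the fallback attribute (ip) is used.
anonymous = {"consumer": None, "ip": "203.0.113.7"}
chosen = pick_hash_input(anonymous, hash_on="consumer", hash_fallback="ip")
```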
The consistent-hashing balancer is designed to work both with a single node and
in a cluster. When using the hash-based algorithm it is important that all nodes
build the exact same balancer layout to make sure they all work identically. To do
this, the balancer must be built in a deterministic way.
When choosing this algorithm, consider the following:
- improves backend cache-hit ratios.
- requires enough cardinality in the hash-inputs to distribute evenly (for example, hashing on
a header that only has 2 possible values does not make sense).
- the cookie based approach will work well for browser based requests, but less so
for machine-to-machine clients which will often omit the cookie.
- avoid using hostnames in the balancer as the
balancers might/will slowly diverge because the DNS ttl has only second precision
and renewal is determined by when a name is actually requested. On top of this is
the issue with some nameservers not returning all entries, which exacerbates
this problem. So when using the hashing approach in a Kong cluster, preferably add
target entities by their IP address. This problem can be mitigated by balancer
rebuilds and higher ttl settings.
This algorithm keeps track of the number of in-flight requests for each backend.
The weights are used to calculate “connection-capacity” of a backend. Requests are
routed towards the backend with the highest spare capacity.
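A minimal Python sketch of this selection follows. How spare capacity is calculated is not specified here, so the sketch assumes it is the unused fraction of a backend's weight; the function names and the `state` mapping are hypothetical, and weights are assumed to be positive.

```python
def spare_capacity(weight: int, in_flight: int) -> float:
    """Assumed definition: the unused fraction of a backend's capacity."""
    return (weight - in_flight) / weight


def pick_backend(backends: dict) -> str:
    """backends maps a name to (weight, in_flight); return the name to use."""
    return max(backends, key=lambda name: spare_capacity(*backends[name]))


state = {
    "a": (100, 30),  # 70% spare
    "b": (50, 10),   # 80% spare
    "c": (100, 10),  # 90% spare
}
chosen = pick_backend(state)
```

In a real balancer the in-flight count would be incremented when a request is dispatched and decremented when it completes.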
When choosing this algorithm, consider the following:
- good distribution of traffic.
- does not improve cache-hit ratios.
- more dynamic since slower backends will have more connections open, and hence
new requests will be routed to other backends automatically.