Establish a Kong Gateway performance benchmark
While Kong Gateway is optimized out-of-the-box, there are still situations where tweaking some configuration
options for Kong Gateway can substantially increase its performance.
You can establish a baseline for performance by running an initial benchmark of Kong Gateway, optimizing the
kong.conf file using the recommendations in this guide, and then conducting several additional benchmark tests.
This guide explains the following:
- How to establish an initial Kong Gateway performance benchmark
- How to optimize Kong Gateway performance before performing additional benchmarks
- How to configure your
kong.conf for benchmarking
You must be running a supported version of Kong Gateway.
Before you conduct a benchmark test, you must make sure the testbed is configured correctly.
Here are a few general recommendations before you begin the benchmark tests:
- Use fewer nodes of Kong Gateway with 4 or 8 NGINX workers with corresponding CPU resource
allocations rather than many smaller Kong Gateway nodes.
- Run Kong Gateway in DB-less or hybrid mode.
In these modes, Kong Gateway’s proxy nodes aren’t connected to a database, which can become another
variable that might affect performance.
Perform a baseline Kong Gateway performance benchmark
Once you have implemented the recommendations in the prerequisites, you can begin the benchmark test:
- Configure a route with a Request Termination plugin and measure Kong Gateway’s performance.
In this case, Kong Gateway responds to the request and doesn’t send any traffic to the upstream server.
- Run this test a few times to spot unexpected bottlenecks.
Either Kong Gateway, the benchmarking client (such as k6 or Apache JMeter), or some other component will likely be an unexpected bottleneck.
You should not expect higher performance from Kong Gateway until you solve these bottlenecks.
Proceed to the next step only after this baseline performance is acceptable to you.
- Once you have established the baseline, configure a route to send traffic to the upstream server without any plugins.
This measures Kong Gateway’s proxy and your upstream server’s performance.
- Verify that no components are unexpectedly causing a bottleneck before proceeding.
- Run the benchmark multiple times to gain confidence in the data.
Ensure that the difference between observations isn’t high (there’s a low standard deviation).
- Discard the stats collected by the benchmark’s first one or two iterations.
We recommend doing this to ensure that the system is operating at an optimal and stable level.
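The baseline route with the Request Termination plugin (step 1) can be sketched as a DB-less declarative config. This is a minimal illustration, assuming Kong Gateway 3.x declarative format; the service, route, and file names here are made up:

```shell
# Write a hypothetical kong.yml for the request-termination baseline test.
# The upstream URL is a placeholder: request-termination responds first,
# so no traffic ever reaches it.
cat > kong-baseline.yml <<'EOF'
_format_version: "3.0"
services:
  - name: baseline
    url: http://upstream.invalid
    routes:
      - name: baseline-route
        paths:
          - /baseline
    plugins:
      - name: request-termination
        config:
          status_code: 200
          message: "baseline"
EOF
echo "wrote kong-baseline.yml"
```

Point a DB-less Kong Gateway node at this file (for example via the declarative_config parameter) and drive load at the /baseline route to measure Kong Gateway alone.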
Only after the previous steps are completed should you proceed with benchmarking Kong Gateway with your plugins and full configuration.
Carefully read the optimization recommendations in the following sections and make any changes to the configuration
as needed before performing additional benchmarks.
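The run-to-run variance check described in the steps above can be scripted. A minimal sketch with made-up latency numbers; substitute the p99 latencies (in ms) your benchmarking client reports across runs:

```shell
# Compute mean and standard deviation across benchmark runs.
# The values below are illustrative, not real measurements.
runs="12.1 11.8 12.4 12.0 11.9"
echo "$runs" | tr ' ' '\n' | awk '
  { sum += $1; sumsq += $1 * $1; n++ }
  END {
    mean = sum / n
    sd = sqrt(sumsq / n - mean * mean)
    printf "mean=%.2f ms, stddev=%.2f ms\n", mean, sd
  }'
```

A high standard deviation relative to the mean suggests an unstable testbed; investigate before trusting the numbers.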
Optimize Kong Gateway performance
The subsections in this section detail recommendations to improve your Kong Gateway performance for
additional benchmark tests.
Read each section carefully and make any necessary adjustments to your configuration file.
ulimit
Action: Increase the
ulimit if it's less than 16384.
Explanation: While Kong Gateway can use as many resources as it can get from the system, the
operating system (OS) limits the number of connections Kong Gateway can open with the upstream
(or any other) server, or that it can accept from the client.
The number of open connections in Kong Gateway defaults to the
ulimit with an upper bound of 16384.
This means that if the
ulimit is unlimited or is a value higher than 16384, Kong Gateway limits itself to 16384.
You can shell into Kong Gateway's container or VM and run
ulimit -n to check the system's limit.
If Kong Gateway is running inside a container on top of a VM, you must shell into the container.
If the value of
ulimit is less than 16384, increase it.
Also check and set the appropriate
ulimit in the client and upstream server, since a connection bottleneck
in these systems leads to suboptimal performance.
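The check and the fix above can be sketched as follows. The persistence advice in the comments is general guidance, not Kong-specific:

```shell
# Check the open-file limit on the machine (or container) running Kong Gateway.
# If Kong Gateway runs in a container on a VM, run this inside the container.
limit=$(ulimit -n)
echo "current open-file limit: $limit"
if [ "$limit" != "unlimited" ] && [ "$limit" -lt 16384 ]; then
  # Raising it here only affects the current shell; persist the change via
  # /etc/security/limits.conf, systemd unit settings, or your container
  # runtime's configuration.
  echo "limit is below 16384; raise it, for example: ulimit -n 16384"
fi
```

Run the same check on the client and upstream machines as well.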
Increase connection reuse
Action: Set
upstream_keepalive_max_requests = 100000 and
nginx_http_keepalive_requests = 100000.
Explanation: In high-throughput scenarios with 10,000 or more RPS, the overhead of setting up TCP and TLS
connections, or an insufficient number of connections, can result in underutilization of network bandwidth or the upstream server.
To increase connection reuse, increase both parameters to
100000, or higher if needed.
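In kong.conf, the keep-alive settings described above would look like this:

```
# Increase connection reuse for high-throughput benchmarks
upstream_keepalive_max_requests = 100000
nginx_http_keepalive_requests = 100000
```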
Autoscaling during benchmarks
Action: Ensure that Kong Gateway is not scaled in/out (horizontally) or up/down (vertically) during a benchmark run.
Explanation: Autoscalers interfere with the statistics in a benchmark and introduce unnecessary noise.
In Kubernetes, scaling is commonly done using a Horizontal or Vertical Pod Autoscaler.
Scale Kong Gateway out before testing to avoid autoscaling during the benchmark.
Monitor the number of Kong Gateway nodes to ensure that no new nodes are spawned during the benchmark and
that existing nodes are not replaced.
Use multiple cores effectively
Action: On most VM setups, set
nginx_worker_processes to
auto. On Kubernetes, set it
to one or two less than the number of worker node CPUs.
Explanation: Make sure
nginx_worker_processes is configured correctly:
- On most VM setups, set this to
auto. This is the default setting. This ensures that NGINX spawns one
worker process for each CPU core, which is desired.
We recommend setting this explicitly in Kubernetes. Ensure CPU requests and limits for Kong Gateway
match the number of workers configured in Kong Gateway. For example, if you configure 4 worker processes,
you must request 4 CPUs in your pod spec.
If you run Kong Gateway pods on Kubernetes worker nodes with n CPUs, allocate n-2 or n-1 to
Kong Gateway, and configure a worker process count equal to this number.
This ensures that any configured daemons and Kubernetes processes, like kubelet, don’t contend
for resources with Kong Gateway.
Each additional worker uses additional memory, so you must ensure that Kong Gateway isn’t triggering
the Linux Out-of-Memory Killer.
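A kong.conf sketch of the worker-process guidance above; the Kubernetes value is illustrative:

```
# VM: one NGINX worker per CPU core (the default)
nginx_worker_processes = auto

# Kubernetes: set an explicit count matching the pod's CPU request,
# leaving 1-2 CPUs on the node for kubelet and system daemons.
# For example, on an 8-CPU worker node:
# nginx_worker_processes = 6
```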
Separate machines for the client, Kong Gateway, and upstream servers
Action: Make sure the client (like Apache JMeter or k6), Kong Gateway, and upstream servers are on different
machines (VM or bare metal) and run on the same local network with low latencies.
- Ensure that the client (like Apache JMeter or k6), Kong Gateway, and the upstream servers run on different
machines (VM or bare-metal). If these are all running in a Kubernetes cluster, ensure that the pods for these three systems
are scheduled on dedicated nodes.
Resource contention (usually CPU and network) between these can lead to suboptimal performance of any system.
- Ensure the client, Kong Gateway, and upstream servers run on the same local network with low latencies.
If requests between the client and Kong Gateway or Kong Gateway and the upstream server traverse the internet,
then the results will contain unnecessary noise.
Upstream servers maxing out
Action: Verify that the upstream server isn’t maxing out.
Explanation: You can verify that the upstream server isn’t maxing out by checking the CPU and memory
usage of the upstream server.
If you deploy additional Kong Gateway nodes and the throughput or error rate remains the same,
the upstream server or a system other than Kong Gateway is likely the bottleneck.
You must also ensure that upstream servers are not autoscaled.
Client maxing out
Action: The client must use keep-alive connections.
Explanation: Sometimes, the clients (such as k6 and Apache JMeter) max themselves out.
To tune them, you need to understand the client. Increasing the CPU, threads, and connections on clients results in
higher resource utilization and throughput.
The client must also use keep-alive connections. For example, k6
and the HTTPClient4 implementation in Apache JMeter both enable keep-alive by default.
Verify that this is set up appropriately for your test setup.
Custom plugins
Action: Ensure custom plugins aren't interfering with performance.
Explanation: Custom plugins can sometimes cause issues with performance.
First, you should determine if custom plugins are the source of the performance issues.
You can do this by measuring three configuration variations:
- Measure Kong Gateway’s performance without enabling any plugins. This provides a baseline for Kong Gateway’s performance.
- Enable necessary bundled plugins (plugins that come with the product), and then measure Kong Gateway’s performance.
- Next, enable custom plugins (in addition to bundled plugins), and then measure Kong Gateway’s performance once again.
If Kong Gateway’s baseline performance is poor, then it’s likely that either Kong Gateway’s configuration needs
tuning or external factors are affecting it. For external factors, see the other sections in this guide.
A large difference between the performance in the second and third steps indicates that performance problems could be due to custom plugins.
Cloud-provider performance issues
Action: Ensure you aren’t using burstable instances or hitting bandwidth, TCP connection per unit time, or PPS limits.
Explanation: While AWS is mentioned in the following, the same recommendations apply to most cloud providers:
- Ensure that you are not using burstable instances, like T type instances, in AWS.
In this case, the CPU available to applications is variable, which leads to noise in the stats.
For more information, see the Burstable performance instances AWS documentation.
- Ensure you are not hitting bandwidth limits, TCP connections per unit time limits, or Packet Per Second (PPS) limits.
For more information, see the Amazon EC2 instance network bandwidth AWS documentation.
Configuration changes during benchmark tests
Action: Don’t change the Kong Gateway configuration during a benchmark test.
Explanation: If you change the configuration during a test, Kong Gateway’s tail latencies can increase sharply.
Avoid doing this unless you are measuring Kong Gateway’s performance under a configuration change.
Large request and response bodies
Action: Keep request bodies below 8 KB and response bodies below 32 KB.
Explanation: Most benchmarking setups generally consist of an HTTP request with a small HTTP body and a corresponding
HTTP response with a JSON or HTML response body.
A request body of less than 8 KB and a response body of less than 32 KB is considered small.
If your request or response bodies are larger, Kong Gateway will buffer the request and response using the disk,
which significantly impacts Kong Gateway’s performance.
Bottlenecks in third-party systems
Explanation: More often than not, the bottlenecks in Kong Gateway are caused by bottlenecks in third-party
systems used by Kong Gateway.
The following sections explain common third-party bottlenecks and how to fix them.
Redis
Action: If you use Redis with any plugins that rely on it, the CPU can become a bottleneck. Scale Redis vertically by giving
it an additional CPU.
Explanation: If plugins use Redis, ensure that Redis is not a bottleneck.
The CPU is generally the bottleneck for Redis, so check CPU usage first.
If that is the case, scale Redis vertically by giving it an additional CPU.
DNS
Action: Increase the stale DNS TTL to 300, or up to 86400.
Explanation: DNS servers can bottleneck Kong Gateway since Kong Gateway depends on DNS to determine
where to send requests.
In the case of Kubernetes, DNS TTLs are 5 seconds long and can cause problems.
You can increase the stale DNS TTL to 300, or up to
86400, to rule out DNS as the issue.
If DNS servers are the root cause, you will see the
coredns pods creating a bottleneck on the CPU.
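Assuming the classic DNS resolver's dns_stale_ttl parameter in kong.conf (check the configuration reference for your Kong Gateway version), the change sketched above is:

```
# Serve stale DNS records to rule out DNS as a bottleneck
# (300 seconds, or up to 86400 to rule it out entirely)
dns_stale_ttl = 300
```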
Blocking I/O for access logs
Action: Disable access logs for high-throughput benchmarking tests by setting the
proxy_access_log configuration parameter to
off.
Explanation: Kong Gateway and the underlying NGINX are programmed for non-blocking network I/O and they
avoid blocking disk I/O as much as possible.
However, access logs are enabled by default, and if the disk powering a Kong Gateway node is slow for any reason,
it can result in performance loss.
Disable access logs for high-throughput benchmarking tests by setting the
proxy_access_log configuration parameter to
off.
Internal errors in Kong Gateway
Action: Make sure that there are no errors in Kong Gateway’s error log.
Explanation: Check Kong Gateway’s error log for internal errors.
Internal errors can highlight issues within Kong Gateway or a third-party system that Kong Gateway relies on to proxy traffic.
Example kong.conf for benchmarking
The following
kong.conf example contains the recommended parameters from the previous sections:
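A sketch combining the recommendations from this guide. The worker-process count is illustrative, and dns_stale_ttl assumes the classic DNS resolver:

```
# Example kong.conf for benchmarking

# Use multiple cores effectively; on Kubernetes, set an explicit
# count matching the pod's CPU request (e.g. 4) instead of auto.
nginx_worker_processes = auto

# Increase connection reuse
upstream_keepalive_max_requests = 100000
nginx_http_keepalive_requests = 100000

# Rule out DNS as a bottleneck (300, or up to 86400)
dns_stale_ttl = 300

# Avoid blocking disk I/O for access logs during high-throughput tests
proxy_access_log = off
```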
Now that you’ve optimized the performance of Kong Gateway, you can perform additional benchmarks.
Always measure, make some changes, and measure again.
Maintain a log of changes to help you figure out the next steps when you get stuck or trace back to another approach.