Did you know that you can try this plugin without talking to anyone with a free trial of Kong Konnect? Get started in under 5 minutes.
Looking for the plugin's configuration parameters? You can find them in the Rate Limiting Advanced configuration reference doc.
Rate limit how many HTTP requests can be made in a given time frame.
The Rate Limiting Advanced plugin offers more functionality than the Kong Gateway (OSS) Rate Limiting plugin, such as:
- Enhanced capabilities to tune the rate limiter, provided by the parameters
limit
andwindow_size
. Learn more in Multiple Limits and Window Sizes - Support for Redis Sentinel, Redis cluster, and Redis SSL
- Increased performance: Rate Limiting Advanced has better throughput performance with better accuracy. The plugin allows you to tune performance and accuracy via a configurable synchronization of counter data with the backend storage. This can be controlled by setting the desired value on the
sync_rate
parameter. - More limiting algorithms to choose from: These algorithms are more accurate and they enable configuration with more specificity. Learn more about our algorithms in How to Design a Scalable Rate Limiting Algorithm.
- More control over which requests contribute to incrementing the rate limiting counters via the
disable_penalty
parameter - Consumer groups support: Apply different rate limiting configurations to select groups of consumers. Learn more in Rate limiting for consumer groups
Headers sent to the client
When this plugin is enabled, Kong sends some additional headers back to the client indicating the allowed limits, how many requests are available, and how long it will take until the quota will be restored.
For example:
RateLimit-Limit: 6
RateLimit-Remaining: 4
RateLimit-Reset: 47
The plugin also sends headers indicating the limits in the time frame and the number of remaining minutes:
X-RateLimit-Limit-Minute: 10
X-RateLimit-Remaining-Minute: 9
You can optionally hide the limit and remaining headers with the hide_client_headers
option.
If more than one limit is being set, the plugin returns a combination of more time limits:
X-RateLimit-Limit-Second: 5
X-RateLimit-Remaining-Second: 4
X-RateLimit-Limit-Minute: 10
X-RateLimit-Remaining-Minute: 9
If any of the limits configured has been reached, the plugin returns an HTTP/1.1 429
status
code to the client with the following JSON body:
{ "message": "API rate limit exceeded" }
The [Retry-After
] header will be present on 429
errors to indicate how long the service is
expected to be unavailable to the client. When using window_type=sliding
and RateLimit-Reset
, Retry-After
may increase due to the rate calculation for the sliding window.
The headers
RateLimit-Limit
,RateLimit-Remaining
, andRateLimit-Reset
are based on the Internet-Draft RateLimit Header Fields for HTTP and may change in the future to respect specification updates.
Multiple limits and window sizes
An arbitrary number of limits/window sizes can be applied per plugin instance. This allows you to create multiple rate limiting windows (e.g., rate limit per minute and per hour, and per any arbitrary window size). Because of limitations with Kong’s plugin configuration interface, each nth limit will apply to each nth window size. For example:
curl -X POST http://localhost:8001/services/example-service/plugins \
--data "name=rate-limiting-advanced" \
--data "config.limit=10" \
--data "config.limit=100" \
--data "config.window_size=60" \
--data "config.window_size=3600"
This example applies rate limiting policies, one of which will trip when 10 hits have been counted in 60 seconds, or the other when 100 hits have been counted in 3600 seconds. For more information, see the Enterprise Rate Limiting Library.
The number of configured window sizes and limits parameters must be equal (as shown above); otherwise, an error occurs:
You must provide the same number of windows and limits
Implementation considerations
Limit by IP Address
If limiting by IP address, it’s important to understand how the IP address is determined. The IP address is determined by the request header sent to Kong from downstream. In most cases, the header has a name of X-Real-IP
or X-Forwarded-For
.
By default, Kong uses the header name X-Real-IP
. If a different header name is required, it needs to be defined using the real_ip_header Nginx property. Depending on the environmental network setup, the trusted_ips Nginx property may also need to be configured to include the load balancer IP address.
Strategies
The plugin supports three strategies.
Strategy | Pros | Cons |
---|---|---|
local |
Minimal performance impact. | Less accurate. Unless there’s a consistent-hashing load balancer in front of Kong, it diverges when scaling the number of nodes. |
cluster |
Accurate, no extra components to support. | Each request forces a read and a write on the data store. Therefore, relatively, the biggest performance impact. |
redis |
Accurate, less performance impact than a cluster policy. |
Needs a Redis installation. Bigger performance impact than a local policy. |
Two common use cases are:
- Every transaction counts. The highest level of accuracy is needed. An example is a transaction with financial consequences.
- Backend protection. Accuracy is not as relevant. The requirement is only to protect backend services from overloading that’s caused either by specific users or by attacks.
Every transaction counts
In this scenario, because accuracy is important, the local
policy is not an option. Consider the support effort you might need
for Redis, and then choose either cluster
or redis
.
You could start with the cluster
policy, and move to redis
if performance reduces drastically.
Do remember that you cannot port the existing usage metrics from the data store to Redis. This might not be a problem with short-lived metrics (for example, seconds or minutes) but if you use metrics with a longer time frame (for example, months), plan your switch carefully.
Backend protection
If accuracy is of lesser importance, choose the local
policy. You might need to experiment a little
before you get a setting that works for your scenario. As the cluster scales to more nodes, more user requests are handled.
When the cluster scales down, the probability of false negatives increases. So, adjust your limits when scaling.
For example, if a user can make 100 requests every second, and you have an
equally balanced 5-node Kong cluster, setting the local
limit to something like 30 requests every second
should work. If you see too many false negatives, increase the limit.
To minimize inaccuracies, consider using a consistent-hashing load balancer in front of Kong. The load balancer ensures that a user is always directed to the same Kong node, thus reducing inaccuracies and preventing scaling problems.
Fallback from Redis
When the redis
strategy is used and a Kong Gateway node is disconnected from Redis, the rate-limiting-advanced
plugin will fall back to local
. This can happen when the Redis server is down or the connection to Redis broken.
Kong Gateway keeps the local counters for rate limiting and syncs with Redis once the connection is re-established.
Kong Gateway will still rate limit, but the Kong Gateway nodes can’t sync the counters. As a result, users will be able
to perform more requests than the limit, but there will still be a limit per node.
Rate limiting for consumer groups
As of Kong Gateway 3.4, you can use the consumer groups entity to manage custom rate limiting configurations for
subsets of consumers. This is enabled by default without using the /consumer_groups/:id/overrides
endpoint.
You can see an example of this in the Enforcing rate limiting tiers with the Rate Limiting Advanced plugin guide.