AI Rate Limiting Advanced

AI License Required

Overview Examples Configuration reference Changelog

Configuration

configobjectrequired

Hide Child Parameters

dictionary_namestring

The shared dictionary where counters are stored. When the plugin is configured to synchronize counter data externally (that is config.strategy is cluster or redis and config.sync_rate isn’t -1), this dictionary serves as a buffer to populate counters in the data store on each synchronization cycle.

Default:kong_rate_limiting_counters

disable_penaltyboolean

If set to true, this doesn’t count denied requests (status = 429). If set to false, all requests, including denied ones, are counted. This parameter only affects the sliding window_type and the request prompt provider.

Default:false

error_codenumber

Set a custom error code to return when the rate limit is exceeded.

Default:429

>= 0

error_hide_providersboolean

Optionally hide informative response that would otherwise provide information about the provider in the error message.

Default:false

error_messagestring

Set a custom error message to return when the rate limit is exceeded.

Default:AI token rate limit exceeded for provider(s):

header_namestring

A string representing an HTTP header name.

hide_client_headersboolean

Optionally hide informative response headers that would otherwise provide information about the current status of limits and counters.

Default:false

identifierstring

The type of identifier used to generate the rate limit key. Defines the scope used to increment the rate limiting counters. Can be ip, credential, consumer, service, header, path or consumer-group. Note if identifier is consumer-group, the plugin must be applied on a consumer group entity. Because a consumer may belong to multiple consumer groups, the plugin needs to know explicitly which consumer group to limit the rate.

Allowed values:consumerconsumer-groupcredentialheaderippathservice

Default:consumer

llm_formatstring

LLM input and output format and schema to use

Allowed values:bedrockcoheregeminihuggingfaceopenai

Default:openai

llm_providersarray[object]required

The provider config. Takes an array of name, limit and window size values.

Hide Child Parameters

limitarray[number]required

One or more requests-per-window limits to apply. There must be a matching number of window limits and sizes specified.

namestringrequired

The LLM provider to which the rate limit applies.

Allowed values:anthropicazurebedrockcoheregeminihuggingfacellama2mistralopenairequestPrompt

window_sizearray[number]required

One or more window sizes to apply a limit to (defined in seconds). There must be a matching number of window limits and sizes specified.

namespacestring

The rate limiting library namespace to use for this plugin instance. Counter data and sync configuration is isolated in each namespace. NOTE: For the plugin instances sharing the same namespace, all the configurations that are required for synchronizing counters, e.g. strategy, redis, sync_rate, dictionary_name, need to be the same.

pathstring

A string representing a URL path, such as /path/to/resource. Must start with a forward slash (/) and must not contain empty segments (i.e., two consecutive forward slashes).

redisobject

Hide Child Parameters

cluster_max_redirectionsinteger

Maximum retry attempts for redirection.

Default:5

cluster_nodesarray[object]

Cluster addresses to use for Redis connections when the redis strategy is defined. Defining this field implies using a Redis Cluster. The minimum length of the array is 1 element.

>= 1 characters

Hide Child Parameters

ipstring

A string representing a host name, such as example.com.

Default:127.0.0.1

portinteger

An integer representing a port number between 0 and 65535, inclusive.

Default:6379

>= 0<= 65535

connect_timeoutinteger

An integer representing a timeout in milliseconds. Must be between 0 and 2^31-2.

Default:2000

>= 0<= 2147483646

connection_is_proxiedboolean

If the connection to Redis is proxied (e.g. Envoy), set it true. Set the host and port to point to the proxy address.

Default:false

databaseinteger

Database to use for the Redis connection when using the redis strategy

Default:0

hoststring

A string representing a host name, such as example.com.

Default:127.0.0.1

keepalive_backloginteger

Limits the total number of opened connections for a pool. If the connection pool is full, connection queues above the limit go into the backlog queue. If the backlog queue is full, subsequent connect operations fail and return nil. Queued operations (subject to set timeouts) resume once the number of connections in the pool is less than keepalive_pool_size. If latency is high or throughput is low, try increasing this value. Empirically, this value is larger than keepalive_pool_size.

>= 0<= 2147483646

keepalive_pool_sizeinteger

The size limit for every cosocket connection pool associated with every remote server, per worker process. If neither keepalive_pool_size nor keepalive_backlog is specified, no pool is created. If keepalive_pool_size isn’t specified but keepalive_backlog is specified, then the pool uses the default value. Try to increase (e.g. 512) this value if latency is high or throughput is low.

Default:256

>= 1<= 2147483646

passwordstring

Password to use for Redis connections. If undefined, no AUTH commands are sent to Redis.
This field is referenceable.
This field is encrypted.

portinteger

An integer representing a port number between 0 and 65535, inclusive.

Default:6379

>= 0<= 65535

read_timeoutinteger

An integer representing a timeout in milliseconds. Must be between 0 and 2^31-2.

Default:2000

>= 0<= 2147483646

send_timeoutinteger

An integer representing a timeout in milliseconds. Must be between 0 and 2^31-2.

Default:2000

>= 0<= 2147483646

sentinel_masterstring

Sentinel master to use for Redis connections. Defining this value implies using Redis Sentinel.

sentinel_nodesarray[object]

Sentinel node addresses to use for Redis connections when the redis strategy is defined. Defining this field implies using a Redis Sentinel. The minimum length of the array is 1 element.

>= 1 characters

Hide Child Parameters

hoststring

A string representing a host name, such as example.com.

Default:127.0.0.1

portinteger

An integer representing a port number between 0 and 65535, inclusive.

Default:6379

>= 0<= 65535

sentinel_passwordstring

Sentinel password to authenticate with a Redis Sentinel instance. If undefined, no AUTH commands are sent to Redis Sentinels.
This field is referenceable.
This field is encrypted.

sentinel_rolestring

Sentinel role to use for Redis connections when the redis strategy is defined. Defining this value implies using Redis Sentinel.

Allowed values:anymasterslave

sentinel_usernamestring

Sentinel username to authenticate with a Redis Sentinel instance. If undefined, ACL authentication won’t be performed. This requires Redis v6.2.0+.
This field is referenceable.

server_namestring

A string representing an SNI (server name indication) value for TLS.

sslboolean

If set to true, uses SSL to connect to Redis.

Default:false

ssl_verifyboolean

If set to true, verifies the validity of the server SSL certificate. If setting this parameter, also configure lua_ssl_trusted_certificate in kong.conf to specify the CA (or server) certificate used by your Redis server. You may also need to configure lua_ssl_verify_depth accordingly.

Default:false

usernamestring

Username to use for Redis connections. If undefined, ACL authentication won’t be performed. This requires Redis v6.0.0+. To be compatible with Redis v5.x.y, you can set it to default.
This field is referenceable.

request_prompt_count_functionstring

If defined, it use custom function to count requests for the request prompt provider

retry_after_jitter_maxnumber

The upper bound of a jitter (random delay) in seconds to be added to the Retry-After header of denied requests (status = 429) in order to prevent all the clients from coming back at the same time. The lower bound of the jitter is 0; in this case, the Retry-After header is equal to the RateLimit-Reset header.

Default:0

strategystring

The rate-limiting strategy to use for retrieving and incrementing the limits. Available values are: local and cluster.

Allowed values:clusterlocalredis

Default:local

sync_ratenumber

How often to sync counter data to the central data store. A value of 0 results in synchronous behavior; a value of -1 ignores sync behavior entirely and only stores counters in node memory. A value greater than 0 will sync the counters in the specified number of seconds. The minimum allowed interval is 0.02 seconds (20ms).

tokens_count_strategystring

What tokens to use for cost calculation. Available values are: total_tokens prompt_tokens, completion_tokens or cost.

Allowed values:completion_tokenscostprompt_tokenstotal_tokens

Default:total_tokens

window_typestring

Sets the time window type to either sliding (default) or fixed. Sliding windows apply the rate limiting logic while taking into account previous hit rates (from the window that immediately precedes the current) using a dynamic weight. Fixed windows consist of buckets that are statically assigned to a definitive time range, each request is mapped to only one fixed window based on its timestamp and will affect only that window’s counters.

Allowed values:fixedsliding

Default:sliding

consumerobject

If set, the plugin will activate only for requests where the specified has been authenticated. (Note that some plugins can not be restricted to consumers this way.). Leave unset for the plugin to activate regardless of the authenticated Consumer.

* Additional properties are NOT allowed.

Hide Child Parameters

idstring

consumer_groupobject

If set, the plugin will activate only for requests where the specified consumer group has been authenticated. (Note that some plugins can not be restricted to consumers groups this way.). Leave unset for the plugin to activate regardless of the authenticated Consumer Groups

* Additional properties are NOT allowed.

Hide Child Parameters

idstring

protocolsarray[string]

A set of strings representing HTTP protocols.

Allowed values:grpcgrpcshttphttps

Default:grpc, grpcs, http, https

routeobject

If set, the plugin will only activate when receiving requests via the specified route. Leave unset for the plugin to activate regardless of the route being used.

* Additional properties are NOT allowed.

Hide Child Parameters

idstring

serviceobject

If set, the plugin will only activate when receiving requests via one of the routes belonging to the specified Service. Leave unset for the plugin to activate regardless of the Service being matched.

* Additional properties are NOT allowed.

Hide Child Parameters

idstring

AI Rate Limiting Advanced

Configuration

Did this doc help?

Help us make these docs great!

Still need help