AI Semantic Cache

AI License Required

Overview Examples Configuration reference Changelog API reference

Configuration

configobjectrequired

Hide Child Parameters

cache_controlboolean

When enabled, respect the Cache-Control behaviors defined in RFC7234.

Default:false

cache_ttlinteger

TTL in seconds of cache entities. Must be a value greater than 0.

Default:300

>= 0

embeddingsobjectrequired

Hide Child Parameters

authobject

Hide Child Parameters

allow_overrideboolean

If enabled, the authorization header or parameter can be overridden in the request by the value configured in the plugin.

Default:false

aws_access_key_idstring

Set this if you are using an AWS provider (Bedrock) and you are authenticating using static IAM User credentials. Setting this will override the AWS_ACCESS_KEY_ID environment variable for this plugin instance.
This field is encrypted.
This field is referenceable.

aws_secret_access_keystring

Set this if you are using an AWS provider (Bedrock) and you are authenticating using static IAM User credentials. Setting this will override the AWS_SECRET_ACCESS_KEY environment variable for this plugin instance.
This field is encrypted.
This field is referenceable.

azure_client_idstring

If azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the client ID.
This field is referenceable.

azure_client_secretstring

If azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the client secret.
This field is encrypted.
This field is referenceable.

azure_tenant_idstring

If azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the tenant ID.
This field is referenceable.

azure_use_managed_identityboolean

Set true to use the Azure Cloud Managed Identity (or user-assigned identity) to authenticate with Azure-provider models.

Default:false

gcp_service_account_jsonstring

Set this field to the full JSON of the GCP service account to authenticate, if required. If null (and gcp_use_service_account is true), Kong will attempt to read from environment variable GCP_SERVICE_ACCOUNT.
This field is encrypted.
This field is referenceable.

gcp_use_service_accountboolean

Use service account auth for GCP-based providers and models.

Default:false

header_namestring

If AI model requires authentication via Authorization or API key header, specify its name here.
This field is referenceable.

header_valuestring

Specify the full auth header value for ‘header_name’, for example ‘Bearer key’ or just ‘key’.
This field is encrypted.
This field is referenceable.

param_locationstring

Specify whether the ‘param_name’ and ‘param_value’ options go in a query string, or the POST form/JSON body.

Allowed values:bodyquery

param_namestring

If AI model requires authentication via query parameter, specify its name here.
This field is referenceable.

param_valuestring

Specify the full parameter value for ‘param_name’.
This field is encrypted.
This field is referenceable.

modelobjectrequired

Hide Child Parameters

namestringrequired

Model name to execute.

optionsobject

Key/value settings for the model

Hide Child Parameters

azureobjectrequired

Hide Child Parameters

api_versionstring

‘api-version’ for Azure OpenAI instances.

Default:2023-05-15

deployment_idstring

Deployment ID for Azure OpenAI instances.

instancestring

Instance name for Azure OpenAI hosted models.

bedrockobject

Hide Child Parameters

aws_assume_role_arnstring

If using AWS providers (Bedrock) you can assume a different role after authentication with the current IAM context is successful.

aws_regionstring

If using AWS providers (Bedrock) you can override the AWS_REGION environment variable by setting this option.

aws_role_session_namestring

If using AWS providers (Bedrock), set the identifier of the assumed role session.

aws_sts_endpoint_urlstring

If using AWS providers (Bedrock), override the STS endpoint URL when assuming a different role.

embeddings_normalizeboolean

If using AWS providers (Bedrock), set to true to normalize the embeddings.

Default:false

performance_config_latencystring

Force the client’s performance configuration ‘latency’ for all requests. Leave empty to let the consumer select the performance configuration.

geminiobject

Hide Child Parameters

api_endpointstring

If running Gemini on Vertex, specify the regional API endpoint (hostname only).

location_idstring

If running Gemini on Vertex, specify the location ID.

project_idstring

If running Gemini on Vertex, specify the project ID.

huggingfaceobject

Hide Child Parameters

use_cacheboolean

Use the cache layer on the inference API

wait_for_modelboolean

Wait for the model if it is not ready

upstream_urlstring

upstream url for the embeddings

providerstringrequired

AI provider format to use for embeddings API

Allowed values:azurebedrockgeminihuggingfacemistralopenai

exact_cachingboolean

When enabled, a first check for exact query will be done. It will impact DB size

Default:false

ignore_assistant_promptsboolean

Ignore and discard any assistant prompts when Vectorizing the request

Default:false

ignore_system_promptsboolean

Ignore and discard any system prompts when Vectorizing the request

Default:false

ignore_tool_promptsboolean

Ignore and discard any tool prompts when Vectorizing the request

Default:false

llm_formatstring

LLM input and output format and schema to use

Allowed values:bedrockcoheregeminihuggingfaceopenai

Default:openai

message_countbacknumber

Number of messages in the chat history to Vectorize/Cache

Default:1

>= 1<= 1000

stop_on_failureboolean

Halt the LLM request process in case of a caching system failure

Default:false

vectordbobjectrequired

Hide Child Parameters

dimensionsintegerrequired

the desired dimensionality for the vectors

distance_metricstringrequired

the distance metric to use for vector searches

Allowed values:cosineeuclidean

pgvectorobject

Hide Child Parameters

databasestring

the database of the pgvector database

Default:kong-pgvector

hoststring

the host of the pgvector database

Default:127.0.0.1

passwordstring

the password of the pgvector database
This field is referenceable.
This field is encrypted.

portinteger

the port of the pgvector database

Default:5432

sslboolean

whether to use ssl for the pgvector database

Default:false

ssl_certstring

the path of ssl cert to use for the pgvector database

ssl_cert_keystring

the path of ssl cert key to use for the pgvector database

ssl_requiredboolean

whether ssl is required for the pgvector database

Default:false

ssl_verifyboolean

whether to verify ssl for the pgvector database

Default:false

ssl_versionstring

the ssl version to use for the pgvector database

Allowed values:anytlsv1_2tlsv1_3

Default:tlsv1_2

timeoutnumber

the timeout of the pgvector database

Default:5000

userstring

the user of the pgvector database
This field is referenceable.

Default:postgres

redisobject

Hide Child Parameters

cluster_max_redirectionsinteger

Maximum retry attempts for redirection.

Default:5

cluster_nodesarray[object]

Cluster addresses to use for Redis connections when the redis strategy is defined. Defining this field implies using a Redis Cluster. The minimum length of the array is 1 element.

>= 1 characters

Hide Child Parameters

ipstring

A string representing a host name, such as example.com.

Default:127.0.0.1

portinteger

An integer representing a port number between 0 and 65535, inclusive.

Default:6379

>= 0<= 65535

connect_timeoutinteger

An integer representing a timeout in milliseconds. Must be between 0 and 2^31-2.

Default:2000

>= 0<= 2147483646

connection_is_proxiedboolean

If the connection to Redis is proxied (e.g. Envoy), set it true. Set the host and port to point to the proxy address.

Default:false

databaseinteger

Database to use for the Redis connection when using the redis strategy

Default:0

hoststring

A string representing a host name, such as example.com.

Default:127.0.0.1

keepalive_backloginteger

Limits the total number of opened connections for a pool. If the connection pool is full, connection queues above the limit go into the backlog queue. If the backlog queue is full, subsequent connect operations fail and return nil. Queued operations (subject to set timeouts) resume once the number of connections in the pool is less than keepalive_pool_size. If latency is high or throughput is low, try increasing this value. Empirically, this value is larger than keepalive_pool_size.

>= 0<= 2147483646

keepalive_pool_sizeinteger

The size limit for every cosocket connection pool associated with every remote server, per worker process. If neither keepalive_pool_size nor keepalive_backlog is specified, no pool is created. If keepalive_pool_size isn’t specified but keepalive_backlog is specified, then the pool uses the default value. Try to increase (e.g. 512) this value if latency is high or throughput is low.

Default:256

>= 1<= 2147483646

passwordstring

Password to use for Redis connections. If undefined, no AUTH commands are sent to Redis.
This field is referenceable.
This field is encrypted.

portinteger

An integer representing a port number between 0 and 65535, inclusive.

Default:6379

>= 0<= 65535

read_timeoutinteger

An integer representing a timeout in milliseconds. Must be between 0 and 2^31-2.

Default:2000

>= 0<= 2147483646

send_timeoutinteger

An integer representing a timeout in milliseconds. Must be between 0 and 2^31-2.

Default:2000

>= 0<= 2147483646

sentinel_masterstring

Sentinel master to use for Redis connections. Defining this value implies using Redis Sentinel.

sentinel_nodesarray[object]

Sentinel node addresses to use for Redis connections when the redis strategy is defined. Defining this field implies using a Redis Sentinel. The minimum length of the array is 1 element.

>= 1 characters

Hide Child Parameters

hoststring

A string representing a host name, such as example.com.

Default:127.0.0.1

portinteger

An integer representing a port number between 0 and 65535, inclusive.

Default:6379

>= 0<= 65535

sentinel_passwordstring

Sentinel password to authenticate with a Redis Sentinel instance. If undefined, no AUTH commands are sent to Redis Sentinels.
This field is referenceable.
This field is encrypted.

sentinel_rolestring

Sentinel role to use for Redis connections when the redis strategy is defined. Defining this value implies using Redis Sentinel.

Allowed values:anymasterslave

sentinel_usernamestring

Sentinel username to authenticate with a Redis Sentinel instance. If undefined, ACL authentication won’t be performed. This requires Redis v6.2.0+.
This field is referenceable.

server_namestring

A string representing an SNI (server name indication) value for TLS.

sslboolean

If set to true, uses SSL to connect to Redis.

Default:false

ssl_verifyboolean

If set to true, verifies the validity of the server SSL certificate. If setting this parameter, also configure lua_ssl_trusted_certificate in kong.conf to specify the CA (or server) certificate used by your Redis server. You may also need to configure lua_ssl_verify_depth accordingly.

Default:false

usernamestring

Username to use for Redis connections. If undefined, ACL authentication won’t be performed. This requires Redis v6.0.0+. To be compatible with Redis v5.x.y, you can set it to default.
This field is referenceable.

strategystringrequired

which vector database driver to use

Allowed values:pgvectorredis

thresholdnumberrequired

the default similarity threshold for accepting semantic search results (float)

consumerobject

If set, the plugin will activate only for requests where the specified has been authenticated. (Note that some plugins can not be restricted to consumers this way.). Leave unset for the plugin to activate regardless of the authenticated Consumer.

* Additional properties are NOT allowed.

Hide Child Parameters

idstring

consumer_groupobject

If set, the plugin will activate only for requests where the specified consumer group has been authenticated. (Note that some plugins can not be restricted to consumers groups this way.). Leave unset for the plugin to activate regardless of the authenticated Consumer Groups

* Additional properties are NOT allowed.

Hide Child Parameters

idstring

protocolsarray[string]

A set of strings representing HTTP protocols.

Allowed values:grpcgrpcshttphttps

Default:grpc, grpcs, http, https

routeobject

If set, the plugin will only activate when receiving requests via the specified route. Leave unset for the plugin to activate regardless of the route being used.

* Additional properties are NOT allowed.

Hide Child Parameters

idstring

serviceobject

If set, the plugin will only activate when receiving requests via one of the routes belonging to the specified Service. Leave unset for the plugin to activate regardless of the Service being matched.

* Additional properties are NOT allowed.

Hide Child Parameters

idstring

AI Semantic Cache

Configuration

Did this doc help?

Help us make these docs great!

Still need help