You are browsing unreleased documentation.
Configuration
This plugin is compatible with DB-less mode.
Compatible protocols
The AI Proxy plugin is compatible with the following protocols:
grpc
, grpcs
, http
, https
Parameters
Here's a list of all the parameters which can be used in this plugin's configuration:
-
name or plugin
string requiredThe name of the plugin, in this case
ai-proxy
.- If using the Kong Admin API, Konnect API, declarative configuration, or decK files, the field is
name
. - If using the KongPlugin object in Kubernetes, the field is
plugin
.
- If using the Kong Admin API, Konnect API, declarative configuration, or decK files, the field is
-
instance_name
stringAn optional custom name to identify an instance of the plugin, for example
ai-proxy_my-service
.The instance name shows up in Kong Manager and in Konnect, so it's useful when running the same plugin in multiple contexts, for example, on multiple services. You can also use it to access a specific plugin instance via the Kong Admin API.
An instance name must be unique within the following context:
- Within a workspace for Kong Gateway Enterprise
- Within a control plane or control plane group for Konnect
- Globally for Kong Gateway (OSS)
-
service.name or service.id
stringThe name or ID of the service the plugin targets. Set one of these parameters if adding the plugin to a service through the top-level
/plugins
endpoint. Not required if using/services/{serviceName|Id}/plugins
. -
route.name or route.id
stringThe name or ID of the route the plugin targets. Set one of these parameters if adding the plugin to a route through the top-level
/plugins
endpoint. Not required if using/routes/{routeName|Id}/plugins
. -
consumer.name or consumer.id
stringThe name or ID of the consumer the plugin targets. Set one of these parameters if adding the plugin to a consumer through the top-level
/plugins
endpoint. Not required if using/consumers/{consumerName|Id}/plugins
. -
consumer_group.name or consumer_group.id
stringThe name or ID of the consumer group the plugin targets. If set, the plugin will activate only for requests where the specified group has been authenticated
/plugins
endpoint. Not required if using/consumer_groups/{consumerGroupName|Id}/plugins
. -
enabled
boolean default:true
Whether this plugin will be applied.
-
config
record required-
route_type
string required Must be one of:llm/v1/chat
,llm/v1/completions
,preserve
The model’s operation implementation, for this provider. Set to
preserve
to pass through without transformation.
-
auth
record-
header_name
string referenceableIf AI model requires authentication via Authorization or API key header, specify its name here.
-
header_value
string referenceable encryptedSpecify the full auth header value for ‘header_name’, for example ‘Bearer key’ or just ‘key’.
-
param_name
string referenceableIf AI model requires authentication via query parameter, specify its name here.
-
param_value
string referenceable encryptedSpecify the full parameter value for ‘param_name’.
-
param_location
string Must be one of:query
,body
Specify whether the ‘param_name’ and ‘param_value’ options go in a query string, or the POST form/JSON body.
-
azure_use_managed_identity
boolean default:false
Set true to use the Azure Cloud Managed Identity (or user-assigned identity) to authenticate with Azure-provider models.
-
azure_client_id
string referenceableIf azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the client ID.
-
azure_client_secret
string referenceable encryptedIf azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the client secret.
-
azure_tenant_id
string referenceableIf azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the tenant ID.
-
gcp_use_service_account
boolean default:false
Use service account auth for GCP-based providers and models.
-
gcp_service_account_json
string referenceable encryptedSet this field to the full JSON of the GCP service account to authenticate, if required. If null (and gcp_use_service_account is true), Kong will attempt to read from environment variable
GCP_SERVICE_ACCOUNT
.
-
aws_access_key_id
string referenceable encryptedSet this if you are using an AWS provider (Bedrock) and you are authenticating using static IAM User credentials. Setting this will override the AWS_ACCESS_KEY_ID environment variable for this plugin instance.
-
aws_secret_access_key
string referenceable encryptedSet this if you are using an AWS provider (Bedrock) and you are authenticating using static IAM User credentials. Setting this will override the AWS_SECRET_ACCESS_KEY environment variable for this plugin instance.
-
allow_override
boolean default:false
If enabled, the authorization header or parameter can be overridden in the request by the value configured in the plugin.
-
-
model
record required-
provider
string required Must be one of:openai
,azure
,anthropic
,cohere
,mistral
,llama2
,gemini
,bedrock
,huggingface
AI provider request format - Kong translates requests to and from the specified backend compatible formats.
-
name
stringModel name to execute.
-
options
recordKey/value settings for the model
-
max_tokens
integer default:256
Defines the max_tokens, if using chat or completion models.
-
input_cost
numberDefines the cost per 1M tokens in your prompt.
-
output_cost
numberDefines the cost per 1M tokens in the output of the AI.
-
temperature
number between:0
5
Defines the matching temperature, if using chat or completion models.
-
top_p
number between:0
1
Defines the top-p probability mass, if supported.
-
top_k
integer between:0
500
Defines the top-k most likely tokens, if supported.
-
anthropic_version
stringDefines the schema/API version, if using Anthropic provider.
-
azure_instance
stringInstance name for Azure OpenAI hosted models.
-
azure_api_version
string default:2023-05-15
‘api-version’ for Azure OpenAI instances.
-
azure_deployment_id
stringDeployment ID for Azure OpenAI instances.
-
llama2_format
string Must be one of:raw
,openai
,ollama
If using llama2 provider, select the upstream message format.
-
mistral_format
string Must be one of:openai
,ollama
If using mistral provider, select the upstream message format.
-
upstream_url
stringManually specify or override the full URL to the AI operation endpoints, when calling (self-)hosted models, or for running via a private endpoint.
-
upstream_path
stringManually specify or override the AI operation path, used when e.g. using the ‘preserve’ route_type.
-
gemini
record-
api_endpoint
stringIf running Gemini on Vertex, specify the regional API endpoint (hostname only).
-
project_id
stringIf running Gemini on Vertex, specify the project ID.
-
location_id
stringIf running Gemini on Vertex, specify the location ID.
-
-
bedrock
record-
aws_region
stringIf using AWS providers (Bedrock) you can override the
AWS_REGION
environment variable by setting this option.
-
-
huggingface
record-
use_cache
booleanUse the cache layer on the inference API
-
wait_for_model
booleanWait for the model if it is not ready
-
-
-
-
logging
record required-
log_statistics
boolean required default:false
If enabled and supported by the driver, will add model usage and token metrics into the Kong log plugin(s) output.
-
log_payloads
boolean required default:false
If enabled, will log the request and response body into the Kong log plugin(s) output.
-
-
response_streaming
string default:allow
Must be one of:allow
,deny
,always
Whether to ‘optionally allow’, ‘deny’, or ‘always’ (force) the streaming of answers via server sent events.
-
max_request_body_size
integer default:8192
max allowed body size allowed to be introspected
-
model_name_header
boolean default:true
Display the model name selected in the X-Kong-LLM-Model response header
-