Configuration

config (object, required)

auth (object)

allow_override (boolean)

If enabled, the authorization header or parameter can be overridden in the request by the value configured in the plugin.

Default: false

aws_access_key_id (string)

Set this if you are using an AWS provider (Bedrock) and you are authenticating using static IAM User credentials. Setting this will override the AWS_ACCESS_KEY_ID environment variable for this plugin instance.
This field is encrypted.
This field is referenceable.

aws_secret_access_key (string)

Set this if you are using an AWS provider (Bedrock) and you are authenticating using static IAM User credentials. Setting this will override the AWS_SECRET_ACCESS_KEY environment variable for this plugin instance.
This field is encrypted.
This field is referenceable.
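As an illustration only, a Bedrock setup with static IAM User credentials might look like the following declarative-config sketch; the vault references and model name are placeholders, and since both credential fields are encrypted and referenceable, storing them in a vault is the usual approach:

```yaml
# Hypothetical kong.yml fragment: AI Proxy authenticating to Bedrock with
# static IAM User credentials. {vault://...} values are placeholder references.
plugins:
  - name: ai-proxy
    config:
      auth:
        aws_access_key_id: "{vault://env/aws-access-key-id}"
        aws_secret_access_key: "{vault://env/aws-secret-access-key}"
      model:
        provider: bedrock
        name: example-bedrock-model   # placeholder model identifier
      route_type: llm/v1/chat
```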

azure_client_id (string)

If azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the client ID.
This field is referenceable.

azure_client_secret (string)

If azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the client secret.
This field is encrypted.
This field is referenceable.

azure_tenant_id (string)

If azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the tenant ID.
This field is referenceable.

azure_use_managed_identity (boolean)

Set to true to use Azure Cloud Managed Identity (or a user-assigned identity) to authenticate with Azure-provider models.

Default: false
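A sketch of how these Azure fields might fit together, assuming a user-assigned managed identity; the client ID, instance, and deployment names are all placeholders:

```yaml
# Hypothetical kong.yml fragment: authenticating to Azure OpenAI with a
# user-assigned managed identity. All identifiers below are placeholders.
plugins:
  - name: ai-proxy
    config:
      auth:
        azure_use_managed_identity: true
        azure_client_id: 00000000-0000-0000-0000-000000000000  # placeholder
      model:
        provider: azure
        name: example-deployment-model
        options:
          azure_instance: my-azure-instance       # placeholder instance name
          azure_deployment_id: my-deployment-id   # placeholder deployment ID
      route_type: llm/v1/chat
```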

gcp_service_account_json (string)

Set this field to the full JSON of the GCP service account to authenticate, if required. If null (and gcp_use_service_account is true), Kong will attempt to read from environment variable GCP_SERVICE_ACCOUNT.
This field is encrypted.
This field is referenceable.

gcp_use_service_account (boolean)

Use service account auth for GCP-based providers and models.

Default: false
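For GCP-based providers, the two fields above might be combined as in this illustrative sketch; the vault reference and model name are placeholders:

```yaml
# Hypothetical kong.yml fragment: GCP service-account auth for a Gemini model.
plugins:
  - name: ai-proxy
    config:
      auth:
        gcp_use_service_account: true
        gcp_service_account_json: "{vault://env/gcp-service-account}"  # placeholder
      model:
        provider: gemini
        name: example-gemini-model   # placeholder
      route_type: llm/v1/chat
```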

header_name (string)

If the AI model requires authentication via an Authorization or API key header, specify its name here.
This field is referenceable.

header_value (string)

Specify the full auth header value for `header_name`, for example 'Bearer key' or just 'key'.
This field is encrypted.
This field is referenceable.

param_location (string)

Specify whether the `param_name` and `param_value` options go in a query string, or the POST form/JSON body.

Allowed values: body, query

param_name (string)

If the AI model requires authentication via a query parameter, specify its name here.
This field is referenceable.

param_value (string)

Specify the full parameter value for `param_name`.
This field is encrypted.
This field is referenceable.
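The two credential-injection mechanisms above can be sketched side by side; the header name, parameter name, and vault references are illustrative placeholders:

```yaml
# Hypothetical kong.yml fragment: header-based auth.
plugins:
  - name: ai-proxy
    config:
      auth:
        header_name: Authorization
        header_value: "{vault://env/llm-api-key}"  # e.g. resolves to "Bearer <key>"
---
# Alternative sketch: query-parameter auth.
plugins:
  - name: ai-proxy
    config:
      auth:
        param_name: api_key       # placeholder parameter name
        param_location: query
        param_value: "{vault://env/llm-api-key}"
```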

genai_category (string)

Generative AI category of the request.

Allowed values: audio/speech, audio/transcription, image/generation, text/embeddings, text/generation

Default: text/generation

llm_format (string)

LLM input and output format and schema to use.

Allowed values: bedrock, cohere, gemini, huggingface, openai

Default: openai

logging (object)

log_payloads (boolean)

If enabled, will log the request and response body into the Kong log plugin(s) output.

Default: false

log_statistics (boolean)

If enabled and supported by the driver, will add model usage and token metrics into the Kong log plugin(s) output.

Default: false
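A common pattern, sketched here for illustration, is to enable token metrics while keeping request and response bodies out of the logs:

```yaml
# Hypothetical kong.yml fragment: emit token metrics but not payloads.
plugins:
  - name: ai-proxy
    config:
      logging:
        log_statistics: true   # adds model usage and token counts to log output
        log_payloads: false    # keeps request/response bodies out of the logs
```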

max_request_body_size (integer)

Maximum body size allowed to be introspected. 0 means unlimited, but the body size will still be limited by Nginx's client_max_body_size.

Default: 8192

>= 0

model (object, required)

name (string)

Model name to execute.

options (object)

Key/value settings for the model.

anthropic_version (string)

Defines the schema/API version, if using Anthropic provider.

azure_api_version (string)

`api-version` for Azure OpenAI instances.

Default: 2023-05-15

azure_deployment_id (string)

Deployment ID for Azure OpenAI instances.

azure_instance (string)

Instance name for Azure OpenAI hosted models.

bedrock (object; child parameters not shown)

cohere (object; child parameters not shown)

embeddings_dimensions (integer)

If using embeddings models, set the number of dimensions to generate.

>= 0

gemini (object; child parameters not shown)

huggingface (object; child parameters not shown)

input_cost (number)

Defines the cost per 1M tokens in your prompt.

>= 0

llama2_format (string)

If using the llama2 provider, select the upstream message format.

Allowed values: ollama, openai, raw

max_tokens (integer)

Defines the max_tokens, if using chat or completion models.

mistral_format (string)

If using the mistral provider, select the upstream message format.

Allowed values: ollama, openai

output_cost (number)

Defines the cost per 1M tokens in the output of the AI.

>= 0

temperature (number)

Defines the matching temperature, if using chat or completion models.

>= 0, <= 5

top_k (integer)

Defines the top-k most likely tokens, if supported.

>= 0, <= 500

top_p (number)

Defines the top-p probability mass, if supported.

>= 0, <= 1

upstream_path (string)

Manually specify or override the AI operation path; used, for example, with the `preserve` route_type.

upstream_url (string)

Manually specify or override the full URL to the AI operation endpoints, when calling (self-)hosted models or running via a private endpoint.

provider (string, required)

AI provider request format. Kong translates requests to and from the specified backend-compatible formats.

Allowed values: anthropic, azure, bedrock, cohere, gemini, huggingface, llama2, mistral, openai
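Pulling the model fields together, a tuning sketch might look like this; the model name and all numeric values are illustrative, not recommendations:

```yaml
# Hypothetical kong.yml fragment: tuning options for an OpenAI chat model.
plugins:
  - name: ai-proxy
    config:
      model:
        provider: openai
        name: example-chat-model   # placeholder
        options:
          max_tokens: 512
          temperature: 0.7     # must be between 0 and 5
          top_p: 0.9           # must be between 0 and 1
          input_cost: 2.5      # cost per 1M prompt tokens
          output_cost: 10.0    # cost per 1M output tokens
      route_type: llm/v1/chat
```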

model_name_header (boolean)

Display the selected model name in the X-Kong-LLM-Model response header.

Default: true

response_streaming (string)

Whether to optionally allow ('allow'), deny ('deny'), or force ('always') the streaming of answers via server-sent events.

Allowed values: allow, always, deny

Default: allow

route_type (string, required)

The model's operation implementation for this provider.

Allowed values: audio/v1/audio/speech, audio/v1/audio/transcriptions, audio/v1/audio/translations, image/v1/images/edits, image/v1/images/generations, llm/v1/assistants, llm/v1/batches, llm/v1/chat, llm/v1/completions, llm/v1/embeddings, llm/v1/files, llm/v1/responses, preserve, realtime/v1/realtime
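As a final sketch, a minimal complete configuration might combine the required `route_type`, `model.provider`, and an auth block; the Service name, model name, and vault reference are placeholders:

```yaml
# Hypothetical kong.yml fragment: a minimal complete ai-proxy configuration.
plugins:
  - name: ai-proxy
    service: example-llm-service   # placeholder Service name
    config:
      route_type: llm/v1/chat
      auth:
        header_name: Authorization
        header_value: "{vault://env/openai-api-key}"  # placeholder vault reference
      model:
        provider: openai
        name: example-chat-model   # placeholder
      response_streaming: allow
      max_request_body_size: 8192
```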

consumer (object)

If set, the plugin will activate only for requests where the specified Consumer has been authenticated. (Note that some plugins cannot be restricted to Consumers this way.) Leave unset for the plugin to activate regardless of the authenticated Consumer.

* Additional properties are NOT allowed.

id (string)

consumer_group (object)

If set, the plugin will activate only for requests where the specified Consumer Group has been authenticated. (Note that some plugins cannot be restricted to Consumer Groups this way.) Leave unset for the plugin to activate regardless of the authenticated Consumer Group.

* Additional properties are NOT allowed.

id (string)
protocols (array[string])

A list of the request protocols that will trigger this plugin. The default value, as well as the possible values allowed on this field, may change depending on the plugin type. For example, plugins that only work in stream mode will only support tcp and tls.

Allowed values: grpc, grpcs, http, https, ws, wss

Default: grpc, grpcs, http, https, ws, wss

route (object)

If set, the plugin will only activate when receiving requests via the specified route. Leave unset for the plugin to activate regardless of the route being used.

* Additional properties are NOT allowed.

id (string)

service (object)

If set, the plugin will only activate when receiving requests via one of the routes belonging to the specified Service. Leave unset for the plugin to activate regardless of the Service being matched.

* Additional properties are NOT allowed.

id (string)
