Configuration

config (object, required)

http_proxy_host (string)

A string representing a host name, such as example.com.

http_proxy_port (integer)

An integer representing a port number between 0 and 65535, inclusive.

>= 0, <= 65535

http_timeout (integer)

Timeout in milliseconds for the AI upstream service.

Default: 60000

https_proxy_host (string)

A string representing a host name, such as example.com.

https_proxy_port (integer)

An integer representing a port number between 0 and 65535, inclusive.

>= 0, <= 65535

https_verify (boolean)

Verify the TLS certificate of the AI upstream service.

Default: true
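
For example, routing upstream AI traffic through an egress proxy with a tighter timeout might look like the following sketch in declarative configuration (the plugin name and proxy host are assumptions, not prescriptive):

```yaml
plugins:
  - name: ai-response-transformer            # assumed plugin name; substitute your plugin's actual name
    config:
      http_proxy_host: egress-proxy.internal # hypothetical egress proxy host
      http_proxy_port: 3128
      https_proxy_host: egress-proxy.internal
      https_proxy_port: 3128
      http_timeout: 30000                    # fail AI upstream calls after 30 seconds
      https_verify: true                     # keep TLS verification enabled
```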

llm (object, required)

auth (object)

allow_override (boolean)

If enabled, the authorization header or parameter can be overridden in the request by the value configured in the plugin.

Default: false

aws_access_key_id (string)

Set this if you are using an AWS provider (Bedrock) and you are authenticating using static IAM User credentials. Setting this will override the AWS_ACCESS_KEY_ID environment variable for this plugin instance.
This field is encrypted.
This field is referenceable.

aws_secret_access_key (string)

Set this if you are using an AWS provider (Bedrock) and you are authenticating using static IAM User credentials. Setting this will override the AWS_SECRET_ACCESS_KEY environment variable for this plugin instance.
This field is encrypted.
This field is referenceable.

azure_client_id (string)

If azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the client ID.
This field is referenceable.

azure_client_secret (string)

If azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the client secret.
This field is encrypted.
This field is referenceable.

azure_tenant_id (string)

If azure_use_managed_identity is set to true, and you need to use a different user-assigned identity for this LLM instance, set the tenant ID.
This field is referenceable.

azure_use_managed_identity (boolean)

Set to true to use the Azure Cloud Managed Identity (or a user-assigned identity) to authenticate with Azure-provider models.

Default: false

gcp_service_account_json (string)

Set this field to the full JSON of the GCP service account to authenticate, if required. If null (and gcp_use_service_account is true), Kong will attempt to read from the GCP_SERVICE_ACCOUNT environment variable.
This field is encrypted.
This field is referenceable.

gcp_use_service_account (boolean)

Use service account auth for GCP-based providers and models.

Default: false

header_name (string)

If the AI model requires authentication via an Authorization or API key header, specify its name here.
This field is referenceable.

header_value (string)

Specify the full auth header value for ‘header_name’, for example ‘Bearer key’ or just ‘key’.
This field is encrypted.
This field is referenceable.

param_location (string)

Specify whether the ‘param_name’ and ‘param_value’ options go in a query string, or the POST form/JSON body.

Allowed values: body, query

param_name (string)

If the AI model requires authentication via a query parameter, specify its name here.
This field is referenceable.

param_value (string)

Specify the full parameter value for ‘param_name’.
This field is encrypted.
This field is referenceable.
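
As a sketch, header-based authentication for an OpenAI-compatible provider could look like this fragment of the plugin’s config block. The vault reference is illustrative; because header_value is referenceable, it can resolve from a vault instead of holding the raw key:

```yaml
llm:
  auth:
    allow_override: false                                 # default behavior
    header_name: Authorization
    header_value: "Bearer {vault://env/openai-api-key}"   # referenceable: may point at a vault
```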

logging (object)

log_payloads (boolean)

If enabled, the request and response bodies will be logged in the Kong log plugin(s) output.

Default: false

log_statistics (boolean)

If enabled and supported by the driver, model usage and token metrics will be added to the Kong log plugin(s) output.

Default: false
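
For example, to record token metrics without writing potentially sensitive payloads to the logs, a fragment of the plugin’s config block might look like this:

```yaml
llm:
  logging:
    log_payloads: false     # prompts and completions often contain sensitive data
    log_statistics: true    # token usage metrics, where the driver supports them
```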

model (object, required)

name (string)

Model name to execute.

options (object)

Key/value settings for the model.

anthropic_version (string)

Defines the schema/API version, if using the Anthropic provider.

azure_api_version (string)

‘api-version’ for Azure OpenAI instances.

Default: 2023-05-15

azure_deployment_id (string)

Deployment ID for Azure OpenAI instances.

azure_instance (string)

Instance name for Azure OpenAI hosted models.

bedrock (object)

aws_assume_role_arn (string)

If using AWS providers (Bedrock), you can assume a different role after authentication with the current IAM context is successful.

aws_region (string)

If using AWS providers (Bedrock), you can override the AWS_REGION environment variable by setting this option.

aws_role_session_name (string)

If using AWS providers (Bedrock), set the identifier of the assumed role session.

aws_sts_endpoint_url (string)

If using AWS providers (Bedrock), override the STS endpoint URL when assuming a different role.

embeddings_normalize (boolean)

If using AWS providers (Bedrock), set to true to normalize the embeddings.

Default: false

performance_config_latency (string)

Force the client’s performance configuration ‘latency’ for all requests. Leave empty to let the consumer select the performance configuration.

cohere (object)

embedding_input_type (string)

The purpose of the input text to calculate embedding vectors.

Allowed values: classification, clustering, image, search_document, search_query

Default: classification

wait_for_model (boolean)

Wait for the model if it is not ready.

embeddings_dimensions (integer)

If using embeddings models, set the number of dimensions to generate.

>= 0

gemini (object)

api_endpoint (string)

If running Gemini on Vertex, specify the regional API endpoint (hostname only).

location_id (string)

If running Gemini on Vertex, specify the location ID.

project_id (string)

If running Gemini on Vertex, specify the project ID.

huggingface (object)

use_cache (boolean)

Use the cache layer on the inference API.

wait_for_model (boolean)

Wait for the model if it is not ready.

input_cost (number)

Defines the cost per 1M tokens in your prompt.

>= 0

llama2_format (string)

If using the llama2 provider, select the upstream message format.

Allowed values: ollama, openai, raw

max_tokens (integer)

Defines the max_tokens, if using chat or completion models.

mistral_format (string)

If using the mistral provider, select the upstream message format.

Allowed values: ollama, openai

output_cost (number)

Defines the cost per 1M tokens in the output of the AI.

>= 0

temperature (number)

Defines the matching temperature, if using chat or completion models.

>= 0, <= 5

top_k (integer)

Defines the top-k most likely tokens, if supported.

>= 0, <= 500

top_p (number)

Defines the top-p probability mass, if supported.

>= 0, <= 1

upstream_path (string)

Manually specify or override the AI operation path. Used, for example, with the ‘preserve’ route_type.

upstream_url (string)

Manually specify or override the full URL to the AI operation endpoints, when calling (self-)hosted models, or for running via a private endpoint.

provider (string, required)

AI provider request format. Kong translates requests to and from the specified backend-compatible formats.

Allowed values: anthropic, azure, bedrock, cohere, gemini, huggingface, llama2, mistral, openai

route_type (string, required)

The model’s operation implementation, for this provider.

Allowed values: audio/v1/audio/speech, audio/v1/audio/transcriptions, audio/v1/audio/translations, image/v1/images/edits, image/v1/images/generations, llm/v1/assistants, llm/v1/batches, llm/v1/chat, llm/v1/completions, llm/v1/embeddings, llm/v1/files, llm/v1/responses, preserve, realtime/v1/realtime
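
Putting the llm block together, a minimal sketch for an OpenAI chat model might look like the following (the model name and credential are illustrative assumptions, not prescriptive values):

```yaml
llm:
  route_type: llm/v1/chat
  auth:
    header_name: Authorization
    header_value: "Bearer <your-api-key>"   # illustrative placeholder credential
  model:
    provider: openai
    name: gpt-4o                            # illustrative model name
    options:
      max_tokens: 512
      temperature: 1.0
      top_p: 0.9
```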

max_request_body_size (integer)

Maximum allowed body size to be introspected. 0 means unlimited, but the size of this body will still be limited by Nginx’s client_max_body_size.

Default: 8192

>= 0

parse_llm_response_json_instructions (boolean)

Set to true to read a specific response format from the LLM, and accordingly set the status code, body, and headers that are proxied back to the client. You need to engineer your LLM prompt to return the correct format; see the plugin docs ‘Overview’ page for usage instructions.

Default: false

prompt (string, required)

Use this prompt to tune the LLM system/assistant message for the returned proxy response (from the upstream), and to specify what response format you are expecting.

transformation_extract_pattern (string)

Defines the regular expression that must match to indicate a successful AI transformation at the response phase. The first match will be set as the returning body. If the AI service’s response doesn’t match this pattern, a failure is returned to the client.
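For instance, a hypothetical redaction transformation might pair a prompt that demands bare JSON with an extraction pattern that accepts only a JSON object (both values are illustrative, under the assumption that the upstream returns JSON):

```yaml
prompt: >
  Rewrite the upstream JSON response so that any email addresses are replaced
  with "[REDACTED]". Return only the transformed JSON object, with no
  surrounding text or markdown.
transformation_extract_pattern: '\{.*\}'    # first match becomes the body returned to the client
parse_llm_response_json_instructions: false
```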

consumer (object)

If set, the plugin will activate only for requests where the specified Consumer has been authenticated. (Note that some plugins cannot be restricted to Consumers this way.) Leave unset for the plugin to activate regardless of the authenticated Consumer.

* Additional properties are NOT allowed.

id (string)
consumer_group (object)

If set, the plugin will activate only for requests where the specified Consumer Group has been authenticated. (Note that some plugins cannot be restricted to Consumer Groups this way.) Leave unset for the plugin to activate regardless of the authenticated Consumer Group.

* Additional properties are NOT allowed.

id (string)
protocols (array[string])

A set of strings representing HTTP protocols.

Allowed values: grpc, grpcs, http, https

Default: grpc, grpcs, http, https

route (object)

If set, the plugin will only activate when receiving requests via the specified route. Leave unset for the plugin to activate regardless of the route being used.

* Additional properties are NOT allowed.

id (string)
service (object)

If set, the plugin will only activate when receiving requests via one of the routes belonging to the specified Service. Leave unset for the plugin to activate regardless of the Service being matched.

* Additional properties are NOT allowed.

id (string)
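
Finally, a sketch of scoping: attaching the plugin to a single route in declarative configuration (the route name is illustrative; leave route, service, consumer, and consumer_group unset to run the plugin globally). The required config fields shown above are omitted here for brevity:

```yaml
plugins:
  - name: ai-response-transformer    # assumed plugin name
    route: my-ai-route               # illustrative route name; a route id also works
    protocols:
      - http
      - https
    # config: add the required prompt and llm settings shown above
```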
