Overview
The AI Response Transformer plugin is designed to operate in two ways:
- As a transformer/security arbiter for your existing upstream APIs
- As an extension of another AI Proxy LLM route, inspecting and transforming the responses from the upstream LLM service before they are returned to the client
The plugin configuration consists of two distinct sections:
- The LLM configuration, which uses the same configuration options as the AI Proxy plugin.
- The prompt (and additional options) containing the instructions for the LLM, which will transform your response.
The same `llm` configuration block is used by both the AI Proxy plugin and the AI Response Transformer plugin.
When the plugin is enabled in any scope (global, service, route, or consumer), it always sets the upstream's response body as the `user` prompt in a chat message, and then sends it to the model defined in the `llm` configuration block for inspection or transformation.
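To make the overall shape concrete, here is a hedged sketch of a complete plugin configuration. The `llm` block follows the AI Proxy plugin's schema, so see that plugin's documentation for the authoritative field list; the provider, model name, and API key shown here are placeholders, not recommendations:

```yaml
config:
  prompt: >
    Mask all credit card numbers in my JSON message with '*'.
    Return me ONLY the resulting JSON.
  llm:
    # These fields mirror the AI Proxy plugin schema; values are placeholders.
    route_type: "llm/v1/chat"
    auth:
      header_name: "Authorization"
      header_value: "Bearer <OPENAI_API_KEY>"   # placeholder secret
    model:
      provider: "openai"
      name: "gpt-4o"
```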
Examples
Transforming existing API traffic
This example uses `ai-response-transformer` on an existing API, for example, one that you have already developed and maintain internally.
- Design the prompt.

  For this example, we want to intercept responses from the `customers` API. On each client response, the plugin needs to forward it to the configured large language model, and ask the LLM to mask all credit card numbers with asterisk characters.

  The plugin would be configured like this:

  ```yaml
  config:
    prompt: >
      Mask all credit card numbers in my JSON message with '*'.
      Return me ONLY the resulting JSON.
    llm:
      # see `ai-proxy` plugin documentation for compatible fields for the "llm" block
  ```
- Attach the plugin.

  Attach the `ai-response-transformer` plugin at the global level, or to the route, service, or consumer on which you want to inspect or transform all responses. It can even be used on APIs that already have the `ai-request-transformer` plugin, or it can be used on its own. (A declarative sketch of attaching the plugin to a service follows these steps.)
- What happens next?

  First, an upstream API responds to a client request. For example:

  ```json
  {
    "user": {
      "name": "Kong User",
      "city": "London",
      "credit_card_no": "1234-5678-9012-3456"
    }
  }
  ```

  Next, Kong parses this into an `llm/v1/chat` type message, based on your `config.prompt`:

  ```json
  {
    "messages": [
      {
        "role": "system",
        "content": "Mask all credit card numbers in my JSON message with '*'. Return me ONLY the resulting JSON."
      },
      {
        "role": "user",
        "content": "{\n\"user\":{\n\"name\":\"Kong User\",\n\"city\":\"London\",\n\"credit_card_no\":\"1234-5678-9012-3456\"}\n}"
      }
    ]
  }
  ```

  Finally, it sends this to the configured LLM. On the response, it takes the trailing `assistant` message back from the LLM, and sets it as the HTTP body that is returned to the original client:

  ```json
  {
    "user": {
      "name": "Kong User",
      "city": "London",
      "credit_card_no": "****-****-****-****"
    }
  }
  ```
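For reference, here is a minimal sketch of attaching the plugin in Kong's declarative (decK) configuration format. The service name, upstream URL, and route path are hypothetical, and the `llm` block is elided; fill it in per the AI Proxy plugin documentation:

```yaml
_format_version: "3.0"
services:
  - name: customers                      # hypothetical service name
    url: http://customers.internal:8080  # hypothetical upstream
    routes:
      - name: customers-route
        paths:
          - /customers
    plugins:
      - name: ai-response-transformer
        config:
          prompt: >
            Mask all credit card numbers in my JSON message with '*'.
            Return me ONLY the resulting JSON.
          llm:
            # see `ai-proxy` plugin documentation for compatible fields
```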
Extraction patterns
If your LLM is a chatbot type, or is unpredictable in its responses, you can configure the additional field `transformation_extract_pattern` with a PCRE regular expression to extract the first match from the LLM's response.
For example, if you have asked for a JSON response but you know that your LLM may add its own text around your answer, use this extraction pattern to pull out only the JSON object from the LLM's response:
```yaml
config:
  prompt: >
    Mask all credit card numbers in my JSON message with '*'.
    Return me ONLY the resulting JSON.
  transformation_extract_pattern: '\\{((.|\n)*)\\}'
```
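As a hypothetical illustration of what the pattern does, suppose the LLM wraps the answer in conversational text:

```text
Sure! Here is your masked JSON:

{"user": {"name": "Kong User", "city": "London", "credit_card_no": "****-****-****-****"}}

Let me know if you need anything else!
```

The pattern's first match, the braced JSON object, becomes the transformed response; the surrounding chat text is discarded.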
Setting body, headers, and status code
The AI Response Transformer can modify any of the following response sections independently:
- Headers
- Status code
- Body
This allows the Kong admin to configure the LLM to fully orchestrate the response phase inside Kong Gateway.
Enable this feature by setting the config option `parse_llm_response_json_instructions` to `true`.
- Design the prompt.

  For this example, we again intercept responses from the `customers` API. On each client response, the plugin forwards the response to the configured large language model, and asks the LLM to return a fixed set of response instructions when the user's name is 'Kong User'.

  The plugin would be configured like this:

  ```yaml
  config:
    prompt: >
      If my JSON message has the user's name 'Kong User', then return this exact JSON message:
      {"status": 400, "headers": {"x-failed": "true"}, "body": "VALIDATION_FAILURE"}
    parse_llm_response_json_instructions: true
    llm:
      # see `ai-proxy` plugin documentation for compatible fields for the "llm" block
  ```
- Attach the plugin.

  Attach the `ai-response-transformer` plugin at the global level, or to the route, service, or consumer on which you want to inspect or transform all responses. It can even be used on APIs that already have the `ai-request-transformer` plugin, or it can be used on its own.
- What happens next?

  First, an upstream API responds to a client request, for example:

  ```json
  {
    "user": {
      "name": "Kong User",
      "city": "London",
      "credit_card_no": "1234-5678-9012-3456"
    }
  }
  ```

  Once sent to the LLM, the response will be:

  ```json
  {
    "status": 400,
    "headers": {
      "x-failed": "true"
    },
    "body": "VALIDATION_FAILURE"
  }
  ```
  Kong then performs the following actions (the resulting client response is sketched after this list):

  - Set each response header from the `headers` object
  - Set the HTTP status code to the `status` integer
  - Set the response body to the `body` string
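Putting it all together, the client would receive a response along these lines (an illustrative sketch; any additional headers depend on your deployment):

```http
HTTP/1.1 400 Bad Request
x-failed: true

VALIDATION_FAILURE
```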