Overview
As a platform owner, you might need to moderate all user request content against a reputable moderation service, screening for specific sensitive categories, when using Kong to proxy your Large Language Model (LLM) traffic.
This plugin integrates with the Azure Content Safety REST API, sending every user LLM request to the Azure Content Safety service before proxying it to the upstream LLM.
The plugin uses the text moderation operation and supports only REST API version 2023-10-01.
To configure the plugin, you set an array of categories and severity levels.
If Azure finds that a piece of content breaches one or more of these levels,
the request is rejected with a 400 status and the failure is written to the Kong log file for auditing.
Prerequisites
Authentication
Each instance of the plugin supports one of the following authentication methods:
- Content Safety key (a static key generated in the Azure Portal)
- Managed Identity authentication
Content Safety Key Auth
To use a Content Safety key, you must set the config.content_safety_key parameter.
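For example, here's a minimal sketch of a plugin create call using key-based auth; the instance URL is illustrative, the key is read from an env vault reference, and the single category is included only to make the sketch complete (see the Examples section below for full configurations):
curl -X POST http://localhost:8001/services/{serviceName|Id}/plugins \
  --header "Content-Type: application/json" \
  --data '
  {
    "name": "ai-azure-content-safety",
    "config": {
      "content_safety_url": "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze",
      "use_azure_managed_identity": false,
      "content_safety_key": "{vault://env/AZURE_CONTENT_SAFETY_KEY}",
      "categories": [
        { "name": "Hate", "rejection_level": 2 }
      ]
    }
  }
  '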
Managed Identity Auth
To use Managed Identity auth (for example, a Machine Identity on an Azure VM or AKS Pod), you must set config.use_azure_managed_identity to true.
Depending on your environment, three more parameters may also be required:
config.azure_client_id
config.azure_client_secret
config.azure_tenant_id
The client ID is normally required when you want to use a different user-assigned identity instead of the
managed identity assigned to the resource on which Kong is running.
The client secret and tenant ID are usually only needed when you are running Kong outside
of Azure but still want to use Entra ID (ADFS) to authenticate with Content Safety.
See the cloud provider authentication guide to learn more.
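For example, here's a sketch of the managed identity fields for Kong running outside Azure and authenticating through Entra ID with a client secret; all of the ID, secret, and URL values are illustrative placeholders:
curl -X POST http://localhost:8001/services/{serviceName|Id}/plugins \
  --header "Content-Type: application/json" \
  --data '
  {
    "name": "ai-azure-content-safety",
    "config": {
      "content_safety_url": "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze",
      "use_azure_managed_identity": true,
      "azure_client_id": "YOUR_CLIENT_ID",
      "azure_client_secret": "{vault://env/AZURE_CLIENT_SECRET}",
      "azure_tenant_id": "YOUR_TENANT_ID",
      "categories": [
        { "name": "Hate", "rejection_level": 2 }
      ]
    }
  }
  '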
Examples
Configure the plugin with an array of supported categories, as defined by Azure Content Safety.
Azure's harm categories map to categories.name in the plugin's configuration, and the severity levels map to categories.rejection_level.
For example, here's what it looks like if you use all four supported categories in this API version:
Kong Admin API
Konnect API
Kubernetes
Declarative (YAML)
Konnect Terraform
Make the following request:
curl -X POST http://localhost:8001/services/{serviceName|Id}/plugins \
  --header "accept: application/json" \
  --header "Content-Type: application/json" \
  --data '
  {
    "name": "ai-azure-content-safety",
    "config": {
      "content_safety_url": "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze",
      "use_azure_managed_identity": false,
      "content_safety_key": "{vault://env/AZURE_CONTENT_SAFETY_KEY}",
      "categories": [
        {
          "name": "Hate",
          "rejection_level": 2
        },
        {
          "name": "SelfHarm",
          "rejection_level": 2
        },
        {
          "name": "Sexual",
          "rejection_level": 2
        },
        {
          "name": "Violence",
          "rejection_level": 2
        }
      ],
      "text_source": "concatenate_user_content",
      "reveal_failure_reason": true,
      "output_type": "FourSeverityLevels"
    }
  }
  '
Replace {serviceName|Id} with the id or name of the service that this plugin configuration will target.
Make the following request, substituting your own access token, region, control plane ID, and service ID:
curl -X POST \
  https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/{serviceId}/plugins \
  --header "accept: application/json" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer TOKEN" \
  --data '{"name":"ai-azure-content-safety","config":{"content_safety_url":"https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze","use_azure_managed_identity":false,"content_safety_key":"{vault://env/AZURE_CONTENT_SAFETY_KEY}","categories":[{"name":"Hate","rejection_level":2},{"name":"SelfHarm","rejection_level":2},{"name":"Sexual","rejection_level":2},{"name":"Violence","rejection_level":2}],"text_source":"concatenate_user_content","reveal_failure_reason":true,"output_type":"FourSeverityLevels"}}'
See the Konnect API reference to learn about region-specific URLs and personal access tokens.
First, create a KongPlugin resource:
echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
name: ai-azure-content-safety-example
plugin: ai-azure-content-safety
config:
content_safety_url: https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze
use_azure_managed_identity: false
content_safety_key: "{vault://env/AZURE_CONTENT_SAFETY_KEY}"
categories:
- name: Hate
rejection_level: 2
- name: SelfHarm
rejection_level: 2
- name: Sexual
rejection_level: 2
- name: Violence
rejection_level: 2
text_source: concatenate_user_content
reveal_failure_reason: true
output_type: FourSeverityLevels
" | kubectl apply -f -
Next, apply the KongPlugin resource to the service by annotating it as follows:
kubectl annotate service SERVICE_NAME konghq.com/plugins=ai-azure-content-safety-example
Replace SERVICE_NAME with the name of the service that this plugin configuration will target.
You can see your available services by running kubectl get service.
Note: The KongPlugin resource only needs to be defined once and can be applied to any service, consumer, or route in the namespace. If you want the plugin to be available cluster-wide, create the resource as a KongClusterPlugin instead of KongPlugin.
Add this section to your declarative configuration file:
plugins:
- name: ai-azure-content-safety
  service: SERVICE_NAME|ID
  config:
    content_safety_url: https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze
    use_azure_managed_identity: false
    content_safety_key: "{vault://env/AZURE_CONTENT_SAFETY_KEY}"
    categories:
    - name: Hate
      rejection_level: 2
    - name: SelfHarm
      rejection_level: 2
    - name: Sexual
      rejection_level: 2
    - name: Violence
      rejection_level: 2
    text_source: concatenate_user_content
    reveal_failure_reason: true
    output_type: FourSeverityLevels
Replace SERVICE_NAME|ID with the id or name of the service that this plugin configuration will target.
Prerequisite: Configure your Personal Access Token
terraform {
  required_providers {
    konnect = {
      source = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "kpat_YOUR_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}
Add the following to your Terraform configuration to create a Konnect Gateway Plugin:
resource "konnect_gateway_plugin_ai_azure_content_safety" "my_ai_azure_content_safety" {
enabled = true
config = {
content_safety_url = "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze"
use_azure_managed_identity = false
content_safety_key = "{vault://env/AZURE_CONTENT_SAFETY_KEY}"
categories = [
{
name = "Hate"
rejection_level = 2
},
{
name = "SelfHarm"
rejection_level = 2
},
{
name = "Sexual"
rejection_level = 2
},
{
name = "Violence"
rejection_level = 2
} ]
text_source = "concatenate_user_content"
reveal_failure_reason = true
output_type = "FourSeverityLevels"
}
control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
service = {
id = konnect_gateway_service.my_service.id
}
}
Now, given the following AI Chat request:
{
  "messages": [
    {
      "role": "system",
      "content": "You are a mathematician."
    },
    {
      "role": "user",
      "content": "What is 1 + 1?"
    },
    {
      "role": "assistant",
      "content": "The answer is 3."
    },
    {
      "role": "user",
      "content": "You lied, I hate you!"
    }
  ]
}
The plugin folds the text to inspect by concatenating the contents into the following:
You are a mathematician.; What is 1 + 1?; The answer is 3.; You lied, I hate you!
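To reproduce the analysis outside Kong, you can submit the same folded text to the text:analyze operation directly. This sketch assumes REST API version 2023-10-01 and key-based auth via Azure's standard Ocp-Apim-Subscription-Key header:
curl -X POST "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze?api-version=2023-10-01" \
  --header "Ocp-Apim-Subscription-Key: YOUR_AZURE_KEY" \
  --header "Content-Type: application/json" \
  --data '
  {
    "text": "You are a mathematician.; What is 1 + 1?; The answer is 3.; You lied, I hate you!",
    "categories": ["Hate", "SelfHarm", "Sexual", "Violence"],
    "outputType": "FourSeverityLevels"
  }
  '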
Based on the plugin’s configuration, Azure responds with the following analysis:
{
  "categoriesAnalysis": [
    {
      "category": "Hate",
      "severity": 2
    }
  ]
}
The severity meets the plugin's configured threshold of 2 for Hate (the check is inclusive: a severity greater than or equal to the configured rejection_level counts as a breach), so the plugin rejects the request and returns a 400 error to the client:
{
  "error": {
    "message": "request failed content safety check: breached category [Hate] at level 2"
  }
}
Hiding the failure from the client
If you don't want to reveal to the caller why their request failed, you can set config.reveal_failure_reason to false, in which case the response looks like this:
{
  "error": {
    "message": "request failed content safety check"
  }
}
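You can flip this flag on an existing plugin instance with a PATCH against the Kong Admin API. In this sketch, {pluginId} is a placeholder for the ID returned when you created the plugin:
curl -X PATCH http://localhost:8001/plugins/{pluginId} \
  --header "Content-Type: application/json" \
  --data '{ "config": { "reveal_failure_reason": false } }'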
Using blocklists
The plugin supports previously created blocklists in Azure Content Safety.
Using the Azure Content Safety API or the Azure Portal, you can create a series of blocklists for banned phrases or patterns.
You can then reference their unique names in the plugin configuration.
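If the blocklists don't exist yet, you can create them with the Azure Content Safety REST API. The following sketch creates a company_competitors blocklist and adds a single illustrative item; it assumes API version 2023-10-01, so check Azure's API reference for the exact request shapes:
# Create (or update) the blocklist
curl -X PATCH "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text/blocklists/company_competitors?api-version=2023-10-01" \
  --header "Ocp-Apim-Subscription-Key: YOUR_AZURE_KEY" \
  --header "Content-Type: application/json" \
  --data '{ "description": "Names of competitor companies" }'

# Add an item to it
curl -X POST "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text/blocklists/company_competitors:addOrUpdateBlocklistItems?api-version=2023-10-01" \
  --header "Ocp-Apim-Subscription-Key: YOUR_AZURE_KEY" \
  --header "Content-Type: application/json" \
  --data '{ "blocklistItems": [ { "text": "Acme Corp" } ] }'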
In the following example, the plugin takes two existing blocklists from Azure, company_competitors and financial_properties:
Kong Admin API
Konnect API
Kubernetes
Declarative (YAML)
Konnect Terraform
Make the following request:
curl -X POST http://localhost:8001/services/{serviceName|Id}/plugins \
  --header "accept: application/json" \
  --header "Content-Type: application/json" \
  --data '
  {
    "name": "ai-azure-content-safety",
    "config": {
      "content_safety_url": "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze",
      "use_azure_managed_identity": false,
      "content_safety_key": "{vault://env/AZURE_CONTENT_SAFETY_KEY}",
      "categories": [
        {
          "name": "Hate",
          "rejection_level": 2
        }
      ],
      "blocklist_names": [
        "company_competitors",
        "financial_properties"
      ],
      "halt_on_blocklist_hit": true,
      "text_source": "concatenate_user_content",
      "reveal_failure_reason": true,
      "output_type": "FourSeverityLevels"
    }
  }
  '
Replace {serviceName|Id} with the id or name of the service that this plugin configuration will target.
Make the following request, substituting your own access token, region, control plane ID, and service ID:
curl -X POST \
  https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/{serviceId}/plugins \
  --header "accept: application/json" \
  --header "Content-Type: application/json" \
  --header "Authorization: Bearer TOKEN" \
  --data '{"name":"ai-azure-content-safety","config":{"content_safety_url":"https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze","use_azure_managed_identity":false,"content_safety_key":"{vault://env/AZURE_CONTENT_SAFETY_KEY}","categories":[{"name":"Hate","rejection_level":2}],"blocklist_names":["company_competitors","financial_properties"],"halt_on_blocklist_hit":true,"text_source":"concatenate_user_content","reveal_failure_reason":true,"output_type":"FourSeverityLevels"}}'
See the Konnect API reference to learn about region-specific URLs and personal access tokens.
First, create a KongPlugin resource:
echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
name: ai-azure-content-safety-example
plugin: ai-azure-content-safety
config:
content_safety_url: https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze
use_azure_managed_identity: false
content_safety_key: "{vault://env/AZURE_CONTENT_SAFETY_KEY}"
categories:
- name: Hate
rejection_level: 2
blocklist_names:
- company_competitors
- financial_properties
halt_on_blocklist_hit: true
text_source: concatenate_user_content
reveal_failure_reason: true
output_type: FourSeverityLevels
" | kubectl apply -f -
Next, apply the KongPlugin resource to the service by annotating it as follows:
kubectl annotate service SERVICE_NAME konghq.com/plugins=ai-azure-content-safety-example
Replace SERVICE_NAME with the name of the service that this plugin configuration will target.
You can see your available services by running kubectl get service.
Note: The KongPlugin resource only needs to be defined once and can be applied to any service, consumer, or route in the namespace. If you want the plugin to be available cluster-wide, create the resource as a KongClusterPlugin instead of KongPlugin.
Add this section to your declarative configuration file:
plugins:
- name: ai-azure-content-safety
  service: SERVICE_NAME|ID
  config:
    content_safety_url: https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze
    use_azure_managed_identity: false
    content_safety_key: "{vault://env/AZURE_CONTENT_SAFETY_KEY}"
    categories:
    - name: Hate
      rejection_level: 2
    blocklist_names:
    - company_competitors
    - financial_properties
    halt_on_blocklist_hit: true
    text_source: concatenate_user_content
    reveal_failure_reason: true
    output_type: FourSeverityLevels
Replace SERVICE_NAME|ID with the id or name of the service that this plugin configuration will target.
Prerequisite: Configure your Personal Access Token
terraform {
  required_providers {
    konnect = {
      source = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "kpat_YOUR_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}
Add the following to your Terraform configuration to create a Konnect Gateway Plugin:
resource "konnect_gateway_plugin_ai_azure_content_safety" "my_ai_azure_content_safety" {
enabled = true
config = {
content_safety_url = "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze"
use_azure_managed_identity = false
content_safety_key = "{vault://env/AZURE_CONTENT_SAFETY_KEY}"
categories = [
{
name = "Hate"
rejection_level = 2
} ]
blocklist_names = ["company_competitors", "financial_properties"]
halt_on_blocklist_hit = true
text_source = "concatenate_user_content"
reveal_failure_reason = true
output_type = "FourSeverityLevels"
}
control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
service = {
id = konnect_gateway_service.my_service.id
}
}
Kong Gateway then instructs Content Safety to evaluate these blocklists against the content. The config.halt_on_blocklist_hit property tells Content Safety to stop analyzing the content as soon as any blocklist item matches. This can save analysis costs, at the expense of completeness in the response: for example, if the content also breaches the Hate category, that breach will not be reported.
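For reference, here's a sketch of the equivalent direct text:analyze request that exercises this behavior (assuming API version 2023-10-01, an illustrative text payload, and the blocklist item added earlier; Azure documents that when haltOnBlocklistHit is true, harm-category analysis is skipped for content that hits a blocklist):
curl -X POST "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze?api-version=2023-10-01" \
  --header "Ocp-Apim-Subscription-Key: YOUR_AZURE_KEY" \
  --header "Content-Type: application/json" \
  --data '
  {
    "text": "Some user content mentioning Acme Corp",
    "categories": ["Hate"],
    "blocklistNames": ["company_competitors", "financial_properties"],
    "haltOnBlocklistHit": true,
    "outputType": "FourSeverityLevels"
  }
  '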