Using the AI Azure Content Safety plugin

You are browsing unreleased documentation.

Overview

As a platform owner, you might need to moderate all user request content against a reputable service when using Kong to proxy your Large Language Model (LLM) traffic, for example to enforce limits on specific sensitive content categories.

This plugin integrates with the Azure REST API and sends every user LLM request to the Azure Content Safety SaaS before proxying it to the upstream LLM.

The plugin uses the text moderation operation, and only supports REST API version 2023-10-01.
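To make the moderation call more concrete, the following sketch assembles (but does not send) the kind of request that the text moderation operation expects. The api-version query parameter, the Ocp-Apim-Subscription-Key header, and the body field names follow Azure's public REST API for version 2023-10-01. The helper name build_analyze_request and the way it assembles the URL are illustrative assumptions, not the plugin's internal code.

```python
import json
from urllib.parse import urlencode

API_VERSION = "2023-10-01"  # the only REST API version the plugin supports


def build_analyze_request(content_safety_url, key, text, categories,
                          output_type="FourSeverityLevels"):
    """Return (url, headers, body) for an Azure text moderation call.

    Hypothetical helper: it only assembles the request, it never sends it.
    """
    url = f"{content_safety_url}?{urlencode({'api-version': API_VERSION})}"
    headers = {
        "Content-Type": "application/json",
        "Ocp-Apim-Subscription-Key": key,  # static key authentication
    }
    body = json.dumps({
        "text": text,
        "categories": categories,
        "outputType": output_type,
    })
    return url, headers, body


url, headers, body = build_analyze_request(
    "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze",
    "example-key",  # placeholder, not a real key
    "What is 1 + 1?",
    ["Hate", "SelfHarm", "Sexual", "Violence"],
)
print(url)
```

The URL passed in matches the config.content_safety_url value used in the examples on this page; only the query string is appended.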

To configure the plugin, you set an array of categories and levels. If Azure finds that a piece of content has breached one or more of these levels, the request is stopped with a 400 status and reported to the Kong log file for auditing.

Prerequisites

An Azure subscription and a Content Safety instance. You can follow Microsoft's quickstart to get set up.
A service, a route, and an ai-proxy plugin that serve as your LLM access point.

Authentication

Each instance of the plugin supports one of the following authentication methods:

Content Safety Key (static key generated from Azure Portal)
Managed Identity Authentication

Content Safety Key Auth

To use a content safety key, you must set the config.content_safety_key parameter.

Managed Identity Auth

To use Managed Identity auth (e.g. Machine Identity on an Azure VM or AKS Pod), you must set config.use_azure_managed_identity to true.

Depending on your setup, three more parameters may be required:

config.azure_client_id
config.azure_client_secret
config.azure_tenant_id

The client ID is normally required when you want to use a user-assigned identity instead of the managed identity assigned to the resource on which Kong is running.

The client secret and tenant ID are usually only needed when you run Kong outside of Azure but still want to use Microsoft Entra ID (formerly Azure Active Directory) to authenticate with Content Safety.

See the cloud provider authentication guide to learn more.
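As a rough illustration of how the two authentication modes translate into plugin configuration, the sketch below returns the auth-related config fields for one plugin instance. The field names come from this page. The helper build_auth_config and its either/or check are assumptions made for the example and do not describe the plugin's own validation.

```python
def build_auth_config(content_safety_key=None, use_managed_identity=False,
                      client_id=None, client_secret=None, tenant_id=None):
    """Return the auth-related `config` fields for one plugin instance.

    Hypothetical helper: the either/or check mirrors the text above and is
    not the plugin's own schema validation.
    """
    if use_managed_identity == (content_safety_key is not None):
        raise ValueError("choose exactly one: content safety key or managed identity")
    if not use_managed_identity:
        return {
            "use_azure_managed_identity": False,
            "content_safety_key": content_safety_key,
        }
    config = {"use_azure_managed_identity": True}
    optional = {
        "azure_client_id": client_id,
        "azure_client_secret": client_secret,
        "azure_tenant_id": tenant_id,
    }
    # Only include the optional identity fields that were actually supplied.
    config.update({k: v for k, v in optional.items() if v is not None})
    return config


key_auth = build_auth_config(content_safety_key="{vault://env/AZURE_CONTENT_SAFETY_KEY}")
vm_identity = build_auth_config(use_managed_identity=True)
user_assigned = build_auth_config(
    use_managed_identity=True,
    client_id="00000000-0000-0000-0000-000000000000",  # placeholder client ID
)
```

Here vm_identity relies on the identity assigned to the machine or pod, while user_assigned adds a client ID to select a different identity.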

Examples

Configure the plugin with an array of supported categories, as defined by Azure Content Safety.

Azure’s harm categories map to categories.name in the plugin’s configuration, and the severity levels map to categories.rejection_level.
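The mapping can be sketched as a small helper that turns a name-to-level dictionary into the categories array. The helper build_categories is hypothetical. The level check reflects the fact that Azure reports severities 0, 2, 4, and 6 for the FourSeverityLevels output type; it is a convenience of this sketch, not a rule enforced by the plugin.

```python
FOUR_LEVELS = (0, 2, 4, 6)  # severities Azure reports for FourSeverityLevels


def build_categories(thresholds):
    """Turn {"Hate": 2, ...} into the plugin's `categories` array.

    Hypothetical helper: rejects levels that Azure never reports for the
    FourSeverityLevels output type, to catch likely typos early.
    """
    categories = []
    for name, level in thresholds.items():
        if level not in FOUR_LEVELS:
            raise ValueError(f"{name}: {level} is not a FourSeverityLevels value")
        categories.append({"name": name, "rejection_level": level})
    return categories


categories = build_categories({"Hate": 2, "SelfHarm": 2, "Sexual": 2, "Violence": 2})
```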

For example, here’s what it looks like if you use all four supported categories in this API version:

Enable on a service

The same configuration is shown below for each deployment method, in this order: Kong Admin API, Konnect API, Kubernetes, declarative (YAML), and Konnect Terraform.

Kong Admin API

Make the following request:

curl -X POST http://localhost:8001/services/{serviceName|Id}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
{
  "name": "ai-azure-content-safety",
  "config": {
    "content_safety_url": "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze",
    "use_azure_managed_identity": false,
    "content_safety_key": "{vault://env/AZURE_CONTENT_SAFETY_KEY}",
    "categories": [
      {
        "name": "Hate",
        "rejection_level": 2
      },
      {
        "name": "SelfHarm",
        "rejection_level": 2
      },
      {
        "name": "Sexual",
        "rejection_level": 2
      },
      {
        "name": "Violence",
        "rejection_level": 2
      }
    ],
    "text_source": "concatenate_user_content",
    "reveal_failure_reason": true,
    "output_type": "FourSeverityLevels"
  }
}
    '

Replace {serviceName|Id} with the ID or name of the service that this plugin configuration will target.

Konnect API

Make the following request, substituting your own access token, region, control plane ID, and service ID:

curl -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/{serviceId}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer TOKEN" \
    --data '{"name":"ai-azure-content-safety","config":{"content_safety_url":"https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze","use_azure_managed_identity":false,"content_safety_key":"{vault://env/AZURE_CONTENT_SAFETY_KEY}","categories":[{"name":"Hate","rejection_level":2},{"name":"SelfHarm","rejection_level":2},{"name":"Sexual","rejection_level":2},{"name":"Violence","rejection_level":2}],"text_source":"concatenate_user_content","reveal_failure_reason":true,"output_type":"FourSeverityLevels"}}'

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

Kubernetes

First, create a KongPlugin resource:

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-azure-content-safety-example
plugin: ai-azure-content-safety
config:
  content_safety_url: https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze
  use_azure_managed_identity: false
  content_safety_key: '{vault://env/AZURE_CONTENT_SAFETY_KEY}'
  categories:
  - name: Hate
    rejection_level: 2
  - name: SelfHarm
    rejection_level: 2
  - name: Sexual
    rejection_level: 2
  - name: Violence
    rejection_level: 2
  text_source: concatenate_user_content
  reveal_failure_reason: true
  output_type: FourSeverityLevels
" | kubectl apply -f -

Next, apply the KongPlugin resource to a service by annotating it as follows:

kubectl annotate service SERVICE_NAME konghq.com/plugins=ai-azure-content-safety-example

Replace SERVICE_NAME with the name of the service that this plugin configuration will target. You can list your available services by running kubectl get service.

Note: The KongPlugin resource only needs to be defined once and can be applied to any service, consumer, or route in the namespace. If you want the plugin to be available cluster-wide, create the resource as a KongClusterPlugin instead of KongPlugin.

Declarative (YAML)

Add this section to your declarative configuration file:

plugins:
- name: ai-azure-content-safety
  service: SERVICE_NAME|ID
  config:
    content_safety_url: https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze
    use_azure_managed_identity: false
    content_safety_key: "{vault://env/AZURE_CONTENT_SAFETY_KEY}"
    categories:
    - name: Hate
      rejection_level: 2
    - name: SelfHarm
      rejection_level: 2
    - name: Sexual
      rejection_level: 2
    - name: Violence
      rejection_level: 2
    text_source: concatenate_user_content
    reveal_failure_reason: true
    output_type: FourSeverityLevels

Replace SERVICE_NAME|ID with the ID or name of the service that this plugin configuration will target.

Konnect Terraform

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "kpat_YOUR_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_azure_content_safety" "my_ai_azure_content_safety" {
  enabled = true

  config = {
    content_safety_url = "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze"
    use_azure_managed_identity = false
    content_safety_key = "{vault://env/AZURE_CONTENT_SAFETY_KEY}"
    categories = [
      {
        name = "Hate"
        rejection_level = 2
      },
      {
        name = "SelfHarm"
        rejection_level = 2
      },
      {
        name = "Sexual"
        rejection_level = 2
      },
      {
        name = "Violence"
        rejection_level = 2
      }
    ]
    text_source = "concatenate_user_content"
    reveal_failure_reason = true
    output_type = "FourSeverityLevels"
  }

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  service = {
    id = konnect_gateway_service.my_service.id
  }
}

Now, given the following AI Chat request:

{
  "messages": [
    {
      "role": "system",
      "content": "You are a mathematician."
    },
    {
      "role": "user",
      "content": "What is 1 + 1?"
    },
    {
      "role": "assistant",
      "content": "The answer is 3."
    },
    {
      "role": "user",
      "content": "You lied, I hate you!"
    }
  ]
}

The plugin builds the text to inspect by concatenating the message contents into the following:

You are a mathematician.; What is 1 + 1?; The answer is 3.; You lied, I hate you!
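The concatenation can be reproduced with a one-line join. This is a minimal sketch: the "; " separator and the inclusion of every message are inferred from the example output above, and the function name fold_messages is made up for illustration.

```python
def fold_messages(messages, separator="; "):
    """Join every message's content into one string, preserving order.

    Sketch only: the separator is inferred from the example output,
    not taken from the plugin's source.
    """
    return separator.join(message["content"] for message in messages)


messages = [
    {"role": "system", "content": "You are a mathematician."},
    {"role": "user", "content": "What is 1 + 1?"},
    {"role": "assistant", "content": "The answer is 3."},
    {"role": "user", "content": "You lied, I hate you!"},
]

folded = fold_messages(messages)
print(folded)
```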

Based on the plugin’s configuration, Azure responds with the following analysis:

{
    "categoriesAnalysis": [
        {
            "category": "Hate",
            "severity": 2
        }
    ]
}

Under Azure’s ruleset, a severity of 2 breaches the plugin’s configured threshold of 2 for Hate, because the threshold is inclusive (any severity equal to or greater than it counts as a breach). The plugin then returns a 400 status code to the client:

{
    "error": {
        "message": "request failed content safety check: breached category [Hate] at level 2"
    }
}
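The inclusive comparison can be sketched as follows. The function find_breaches is a hypothetical illustration of the rule described above (severity greater than or equal to rejection_level), using the same response and configuration shapes shown on this page.

```python
def find_breaches(analysis, categories):
    """Return (category, severity) pairs that meet or exceed their threshold.

    Sketch of the inclusive rule: severity >= rejection_level is a breach.
    Categories that are not configured are ignored.
    """
    thresholds = {c["name"]: c["rejection_level"] for c in categories}
    breaches = []
    for item in analysis.get("categoriesAnalysis", []):
        limit = thresholds.get(item["category"])
        if limit is not None and item["severity"] >= limit:
            breaches.append((item["category"], item["severity"]))
    return breaches


configured = [{"name": "Hate", "rejection_level": 2}]
azure_analysis = {"categoriesAnalysis": [{"category": "Hate", "severity": 2}]}

breaches = find_breaches(azure_analysis, configured)
```

With a threshold of 2, a reported severity of 0 passes, while 2, 4, or 6 is a breach.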

Hiding the failure from the client

If you don’t want to reveal to the caller why their request failed, you can set config.reveal_failure_reason to false, in which case the response looks like this:

{
    "error": {
        "message": "request failed content safety check"
    }
}
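The difference between the two responses can be sketched with a small function that mirrors the message formats shown on this page. The function build_error_body is hypothetical and only illustrates the effect of the flag.

```python
def build_error_body(category, severity, reveal_failure_reason=True):
    """Build the 400 response body, with or without the failure reason.

    Sketch only: the message strings are copied from the examples above.
    """
    message = "request failed content safety check"
    if reveal_failure_reason:
        message += f": breached category [{category}] at level {severity}"
    return {"error": {"message": message}}


revealed = build_error_body("Hate", 2, reveal_failure_reason=True)
hidden = build_error_body("Hate", 2, reveal_failure_reason=False)
```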

Using blocklists

The plugin supports previously created blocklists in Azure Content Safety.

Using the Azure Content Safety API or the Azure Portal, you can create a series of blocklists for banned phrases or patterns. You can then reference their unique names in the plugin configuration.
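For orientation, the sketch below assembles (without sending) the two requests typically used to create a blocklist and add phrases to it. It assumes the blocklist operations of the 2023-10-01 REST API: a PATCH on the blocklist path, followed by a POST to addOrUpdateBlocklistItems. Verify the paths against the Azure reference before relying on them. The helper name blocklist_requests and the sample phrase are made up for illustration.

```python
import json

API_VERSION = "2023-10-01"


def blocklist_requests(endpoint, name, description, phrases):
    """Return two (method, url, body) tuples: create the list, then add items.

    Hypothetical helper: nothing is sent, and the paths are an assumption
    based on Azure's 2023-10-01 blocklist operations.
    """
    base = f"{endpoint}/contentsafety/text/blocklists/{name}"
    create = (
        "PATCH",
        f"{base}?api-version={API_VERSION}",
        json.dumps({"description": description}),
    )
    add_items = (
        "POST",
        f"{base}:addOrUpdateBlocklistItems?api-version={API_VERSION}",
        json.dumps({"blocklistItems": [{"text": phrase} for phrase in phrases]}),
    )
    return [create, add_items]


requests_to_send = blocklist_requests(
    "https://my-acs-instance.cognitiveservices.azure.com",
    "company_competitors",
    "Names of competing companies",
    ["ExampleCorp"],  # placeholder phrase
)
```

The blocklist name used here (company_competitors) is the value you would later reference in the plugin configuration.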

In the following example, the plugin takes two existing blocklists from Azure, company_competitors and financial_properties:

Enable on a service

The same configuration is shown below for each deployment method, in this order: Kong Admin API, Konnect API, Kubernetes, declarative (YAML), and Konnect Terraform.

Kong Admin API

Make the following request:

curl -X POST http://localhost:8001/services/{serviceName|Id}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --data '
{
  "name": "ai-azure-content-safety",
  "config": {
    "content_safety_url": "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze",
    "use_azure_managed_identity": false,
    "content_safety_key": "{vault://env/AZURE_CONTENT_SAFETY_KEY}",
    "categories": [
      {
        "name": "Hate",
        "rejection_level": 2
      }
    ],
    "blocklist_names": [
      "company_competitors",
      "financial_properties"
    ],
    "halt_on_blocklist_hit": true,
    "text_source": "concatenate_user_content",
    "reveal_failure_reason": true,
    "output_type": "FourSeverityLevels"
  }
}
    '

Replace {serviceName|Id} with the ID or name of the service that this plugin configuration will target.

Konnect API

Make the following request, substituting your own access token, region, control plane ID, and service ID:

curl -X POST \
https://{us|eu}.api.konghq.com/v2/control-planes/{controlPlaneId}/core-entities/services/{serviceId}/plugins \
    --header "accept: application/json" \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer TOKEN" \
    --data '{"name":"ai-azure-content-safety","config":{"content_safety_url":"https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze","use_azure_managed_identity":false,"content_safety_key":"{vault://env/AZURE_CONTENT_SAFETY_KEY}","categories":[{"name":"Hate","rejection_level":2}],"blocklist_names":["company_competitors","financial_properties"],"halt_on_blocklist_hit":true,"text_source":"concatenate_user_content","reveal_failure_reason":true,"output_type":"FourSeverityLevels"}}'

See the Konnect API reference to learn about region-specific URLs and personal access tokens.

Kubernetes

First, create a KongPlugin resource:

echo "
apiVersion: configuration.konghq.com/v1
kind: KongPlugin
metadata:
  name: ai-azure-content-safety-example
plugin: ai-azure-content-safety
config:
  content_safety_url: https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze
  use_azure_managed_identity: false
  content_safety_key: '{vault://env/AZURE_CONTENT_SAFETY_KEY}'
  categories:
  - name: Hate
    rejection_level: 2
  blocklist_names:
  - company_competitors
  - financial_properties
  halt_on_blocklist_hit: true
  text_source: concatenate_user_content
  reveal_failure_reason: true
  output_type: FourSeverityLevels
" | kubectl apply -f -

Next, apply the KongPlugin resource to a service by annotating it as follows:

kubectl annotate service SERVICE_NAME konghq.com/plugins=ai-azure-content-safety-example

Replace SERVICE_NAME with the name of the service that this plugin configuration will target. You can list your available services by running kubectl get service.

Note: The KongPlugin resource only needs to be defined once and can be applied to any service, consumer, or route in the namespace. If you want the plugin to be available cluster-wide, create the resource as a KongClusterPlugin instead of KongPlugin.

Declarative (YAML)

Add this section to your declarative configuration file:

plugins:
- name: ai-azure-content-safety
  service: SERVICE_NAME|ID
  config:
    content_safety_url: https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze
    use_azure_managed_identity: false
    content_safety_key: "{vault://env/AZURE_CONTENT_SAFETY_KEY}"
    categories:
    - name: Hate
      rejection_level: 2
    blocklist_names:
    - company_competitors
    - financial_properties
    halt_on_blocklist_hit: true
    text_source: concatenate_user_content
    reveal_failure_reason: true
    output_type: FourSeverityLevels

Replace SERVICE_NAME|ID with the ID or name of the service that this plugin configuration will target.

Konnect Terraform

Prerequisite: Configure your Personal Access Token

terraform {
  required_providers {
    konnect = {
      source  = "kong/konnect"
    }
  }
}

provider "konnect" {
  personal_access_token = "kpat_YOUR_TOKEN"
  server_url            = "https://us.api.konghq.com/"
}

Add the following to your Terraform configuration to create a Konnect Gateway Plugin:

resource "konnect_gateway_plugin_ai_azure_content_safety" "my_ai_azure_content_safety" {
  enabled = true

  config = {
    content_safety_url = "https://my-acs-instance.cognitiveservices.azure.com/contentsafety/text:analyze"
    use_azure_managed_identity = false
    content_safety_key = "{vault://env/AZURE_CONTENT_SAFETY_KEY}"
    categories = [
      {
        name = "Hate"
        rejection_level = 2
      }
    ]
    blocklist_names = ["company_competitors", "financial_properties"]
    halt_on_blocklist_hit = true
    text_source = "concatenate_user_content"
    reveal_failure_reason = true
    output_type = "FourSeverityLevels"
  }

  control_plane_id = konnect_gateway_control_plane.my_konnect_cp.id
  service = {
    id = konnect_gateway_service.my_service.id
  }
}

Kong Gateway then instructs Content Safety to enable and run these blocklists against the content. The config.halt_on_blocklist_hit property tells Content Safety to stop analyzing the content as soon as any blocklist matches. This can save analysis costs, at the expense of accuracy in the response: for example, if the content also breaches the Hate category, that breach is not reported.
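The trade-off can be illustrated with a sketch that maps the two plugin settings onto the corresponding Azure request fields (blocklistNames and haltOnBlocklistHit in the 2023-10-01 REST API) and then summarizes a response. The helper names and the sample response are hypothetical; the sample only shows the shape a halted analysis might take, with a blocklist match and no category results.

```python
def blocklist_options(blocklist_names, halt_on_blocklist_hit):
    """Map the plugin's blocklist settings onto Azure request body fields."""
    return {
        "blocklistNames": list(blocklist_names),
        "haltOnBlocklistHit": halt_on_blocklist_hit,
    }


def summarize(response):
    """Return (matched blocklist names, {category: severity}) from a response."""
    hits = [match["blocklistName"] for match in response.get("blocklistsMatch", [])]
    severities = {
        item["category"]: item["severity"]
        for item in response.get("categoriesAnalysis", [])
    }
    return hits, severities


options = blocklist_options(["company_competitors", "financial_properties"], True)

# Illustrative response only: analysis halted on the first blocklist hit,
# so no category severities are available for this content.
sample_response = {
    "blocklistsMatch": [
        {
            "blocklistName": "company_competitors",
            "blocklistItemId": "example-item-id",
            "blocklistItemText": "ExampleCorp",
        }
    ],
    "categoriesAnalysis": [],
}

hits, severities = summarize(sample_response)
```

In this sample, hits names the matching blocklist, while severities is empty, which is why a simultaneous Hate breach would go unreported.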
