Kong AI Gateway

Connectivity and governance layer for modern AI-native applications built on top of Kong Gateway

Introducing Kong AI Gateway

As AI adoption accelerates, applications are evolving beyond basic LLM calls into complex, multi-actor systems—including user apps, agents, orchestration layers, and context servers—all interacting with foundation models in real time.

To support this shift, developers are adopting protocols like Model Context Protocol (MCP) and Agent2Agent (A2A) to standardize how components exchange tools, data, and decisions.

But infrastructure often falls behind, with challenges around authentication, rate limiting, data security, observability, and constant provider changes.

Kong AI Gateway addresses these challenges with a high-performance control plane that secures, governs, and observes AI-native systems end to end. Whether serving LLM traffic, exposing structured context via MCP, or coordinating agents through A2A, Kong AI Gateway ensures scalable, secure, and reliable AI infrastructure.

Quickstart

Launch a demo instance of Kong Gateway running AI Proxy:

curl -Ls https://get.konghq.com/ai | bash

Copied to clipboard!

Get started

Run the Kong Gateway quickstart and enable the AI Proxy plugin.

Video tutorials

Learn how to use AI plugins with video tutorials.

AI plugins

Learn about all the AI plugins.

AI providers

Learn about the various providers supported by AI Gateway.

Tools to manage AI Gateway

Kong AI Gateway, working alongside Kong Gateway, supports multiple tools for managing configuration and resources. Use the following tools to automate, integrate, or streamline AI Gateway operations in a way that best fits your deployment model.

AI Manager: GUI for managing all your Kong AI Gateway resources in one place.
decK: Manage Kong AI Gateway and Kong Gateway configuration through declarative state files.
Terraform: Manage infrastructure as code and automated deployments to streamline setup and configuration of Konnect and Kong Gateway.
KIC: Manage ingress traffic and routing rules for your services.
Kong Gateway Admin API: Manage on-prem Kong Gateway entities via an API.
Control Plane Config API: Manage Kong Gateway entities within Konnect Control Planes via an API.

AI Gateway capabilities

You can enable the AI Gateway features through a set of modern and specialized plugins, using the same model you use for any other Kong Gateway plugin. When deployed alongside existing Kong Gateway plugins, Kong Gateway users can quickly assemble a sophisticated AI management platform without custom code or deploying new and unfamiliar tools.

Universal API

Route client requests to various AI providers.

Rate limiting

Manage traffic to your LLM API.

Semantic caching

Semantically cache responses from LLMs.

Semantic routing

Semantically distribute requests to different LLM models.

MCP traffic gateway

Gain control and visibility over AI agent infrastructure with AI Gateway-driven MCP capabilities

Automated RAG injection

Automatically embed RAG logic into your workflows.

Data governance

Use AI plugins to control AI data and usage.

Guardrails

Inspect requests and configure content safety and moderation.

Prompt engineering

Create prompt templates and manipulate client prompts.

Load balancing

Learn about the load balancing algorithms available for AI Gateway.

Audit log

Learn about AI Gateway logging capabilities.

LLM metrics

Expose and visualize LLM metrics.

Konnect Advanced Analytics

Visualize LLM metrics in Konnect.

Streaming

Stream user requests with AI Gateway

Secrets management

Use Konnect Config Store to store and reference your LLM provider API keys

Prompt compression

Keep your prompts lean, reduce latency, and optimize LLM usage for cost efficiency

LLM cost control

Reduce LLM usage costs by giving you control over how prompts are built and routed

Request transformations

Use AI to transform requests and responses.

Universal API

Kong’s AI Gateway Universal API, delivered through the AI Proxy and AI Proxy Advanced plugins, simplifies AI model integration by providing a single, standardized interface for interacting with models across multiple providers.

Easy to use: Configure once and access any AI model with minimal integration effort.
Load balancing: Automatically distribute AI requests across multiple models or providers for optimal performance and cost efficiency.
Retry and fallback: Optimize AI requests based on model performance, cost, or other factors.
Cross-plugin integration: Leverage AI in non-AI API workflows through other Kong Gateway plugins.

AI Proxy

The AI Proxy plugin lets you transform and proxy requests to a number of AI providers and models.

See plugin →

AI Proxy Advanced

The AI Proxy Advanced plugin lets you transform and proxy requests to multiple AI providers and models at the same time. This lets you set up load balancing between targets.

See plugin →

AI Usage Governance

As AI technologies see broader adoption, developers and organizations face new risks—most notably, the risk of sensitive data leaking to AI providers, exposing businesses and their customers to potential breaches and security threats.

Managing how data flows to and from AI models has become critical not just for security, but also for compliance and reliability. Without the right controls in place, organizations risk losing visibility into how AI is used across their systems.

Kong AI Gateway helps mitigate these challenges by offering a suite of plugins that extend beyond basic AI traffic management.

Data governance: Control how sensitive information is handled and shared with AI models.
Prompt engineering: Customize and optimize prompts to deliver consistent, high-quality AI outputs.
Guardrails and content safety: Enforce policies to prevent inappropriate, unsafe, or non-compliant responses.
Automated RAG injection: Seamlessly inject relevant, vetted data into AI prompts without manual RAG implementations.
Load balancing: Distribute AI traffic efficiently across multiple model endpoints to ensure performance and reliability.
LLM cost control: Use the AI Compressor, RAG Injector, and Prompt Decorator to compress and structure prompts efficiently. Combine with AI Proxy Advanced to route requests across OpenAI models by semantic similarity—optimizing for cost and performance.