Horizontally autoscale a Data Plane

Kong Gateway Operator can deploy data planes that horizontally autoscale based on user-defined criteria.

This guide shows how to autoscale data planes based on their average CPU utilization.

Before we begin

Kong Gateway Operator uses Kubernetes HorizontalPodAutoscaler to perform horizontal autoscaling of data planes.

To use HorizontalPodAutoscaler in your cluster, you need a metrics server installed. For more information about the metrics server, see the official Kubernetes docs.

Create a DataPlane with horizontal autoscaling enabled

To enable horizontal autoscaling, you must specify the spec.deployment.scaling section in your DataPlane resource to indicate which metrics should be used for decision making.

In the example below, autoscaling is triggered based on CPU utilization. The DataPlane resource can have between 2 and 10 replicas, and replicas are added whenever the average CPU utilization across its Pods (measured as a percentage of the requested CPU) is above 50%.

The scaleUp configuration states that either 100% of existing replicas, or 5 new pods (whichever is higher) may be launched every 10 seconds. If you have 3 replicas, 5 pods may be created. If you have 50 replicas, up to 50 more pods may be launched.

The scaleDown configuration states that up to 100% of pods may be removed every 10 seconds, although the replica count never drops below the minReplicas value of 2.

echo '
apiVersion: gateway-operator.konghq.com/v1beta1
kind: DataPlane
metadata:
  name: horizontal-autoscaling
spec:
  deployment:
    scaling:
      horizontal:
        minReplicas: 2
        maxReplicas: 10
        metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 50
        behavior:
          scaleUp:
            stabilizationWindowSeconds: 1
            policies:
            - type: Percent
              value: 100
              periodSeconds: 10
            - type: Pods
              value: 5
              periodSeconds: 10
            selectPolicy: Max
          scaleDown:
            stabilizationWindowSeconds: 1
            policies:
            - type: Percent
              value: 100
              periodSeconds: 10
    podTemplateSpec:
      spec:
        containers:
        - name: proxy
          image: kong/kong-gateway:3.10.0.1
          resources:
            requests:
              memory: "64Mi"
              cpu: "250m"
            limits:
              memory: "1024Mi"
              cpu: "1000m"
          # Add any Konnect-related configuration here: environment variables, volumes, and so on.
' | kubectl apply -f -

Please consult the CRD reference for all scaling options.
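
The effect of the scaleUp and scaleDown policies in the manifest can be illustrated with a short calculation. This is a minimal sketch, not the HorizontalPodAutoscaler implementation: the function names are hypothetical, the policy values are copied from the manifest, and it assumes the limit is computed from the current replica count and that scaleDown uses the Kubernetes default selectPolicy of Max.

```javascript
// Policy values copied from the manifest; function names are hypothetical.
const scaleUpPolicies = [
  { type: "Percent", value: 100, periodSeconds: 10 },
  { type: "Pods", value: 5, periodSeconds: 10 },
];
const scaleDownPolicies = [{ type: "Percent", value: 100, periodSeconds: 10 }];

// Largest replica change a single policy permits within one period.
function allowedChange(policy, currentReplicas) {
  if (policy.type === "Percent") {
    return Math.ceil((currentReplicas * policy.value) / 100);
  }
  return policy.value; // "Pods": a fixed number of pods
}

// selectPolicy: Max takes the policy that permits the largest change.
function maxChange(policies, currentReplicas) {
  return Math.max(...policies.map((p) => allowedChange(p, currentReplicas)));
}

// Highest replica count reachable in one scale-up period.
function scaleUpCeiling(currentReplicas, maxReplicas) {
  return Math.min(currentReplicas + maxChange(scaleUpPolicies, currentReplicas), maxReplicas);
}

// Lowest replica count reachable in one scale-down period.
function scaleDownFloor(currentReplicas, minReplicas) {
  return Math.max(currentReplicas - maxChange(scaleDownPolicies, currentReplicas), minReplicas);
}

console.log(maxChange(scaleUpPolicies, 3));  // 5
console.log(maxChange(scaleUpPolicies, 50)); // 50
console.log(scaleUpCeiling(3, 10));          // 8
console.log(scaleDownFloor(10, 2));          // 2
```

With 3 replicas the Pods policy wins (5 new pods), with 50 replicas the Percent policy wins (50 new pods), and in both directions the result stays within minReplicas and maxReplicas.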

A DataPlane is created when the manifest above is applied. This creates 2 Pods running Kong Gateway, as well as a HorizontalPodAutoscaler which manages the replica count of those Pods to keep the average CPU utilization around 50%. To verify, list the HorizontalPodAutoscaler resources:

kubectl get hpa

The output will show the HorizontalPodAutoscaler resource:

NAME                     REFERENCE                                           TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontal-autoscaling   Deployment/dataplane-horizontal-autoscaling-4q72p   2%/50%    2         10        2          30s
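
To read the TARGETS column, it can help to see how a replica count follows from a utilization value. The sketch below applies the formula given in the Kubernetes HorizontalPodAutoscaler documentation, desiredReplicas = ceil(currentReplicas × currentMetric / targetMetric), with the default 10% tolerance. The function names are hypothetical, and the real controller applies further rules (stabilization windows, scaling policies, unready Pods) that are left out here.

```javascript
// CPU utilization is reported as a percentage of the container's CPU request.
function utilizationPercent(usageMillicores, requestMillicores) {
  return (usageMillicores / requestMillicores) * 100;
}

// Simplified form of the documented HPA formula, clamped to the replica bounds.
function desiredReplicas(current, currentUtil, targetUtil, minReplicas, maxReplicas) {
  const ratio = currentUtil / targetUtil;
  const tolerance = 0.1; // Kubernetes default; assumed unchanged in this cluster
  if (Math.abs(ratio - 1) <= tolerance) {
    return current; // close enough to the target: no scaling
  }
  const raw = Math.ceil(current * ratio);
  return Math.min(Math.max(raw, minReplicas), maxReplicas);
}

// With the 250m request from the manifest, 125m of usage is exactly 50%.
console.log(utilizationPercent(125, 250)); // 50

// Idle state from the output above: 2% of 50% on 2 replicas stays at minReplicas.
console.log(desiredReplicas(2, 2, 50, 2, 10)); // 2

// Hypothetical load: 150% utilization on 2 replicas asks for 6 replicas.
console.log(desiredReplicas(2, 150, 50, 2, 10)); // 6
```

The 150% figure is illustrative only; it is not taken from a measured run.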

Test autoscaling with a load test

You can test whether autoscaling works by using a load testing tool (for example, k6) to generate traffic.

  1. Fetch the DataPlane address and store it in the PROXY_IP variable:

     export PROXY_IP=$(kubectl get dataplanes.gateway-operator.konghq.com -o jsonpath='{.status.addresses[0].value}' horizontal-autoscaling)
    
  2. Install k6, then create a file named k6.js containing the following code:

     import http from "k6/http";
     import { check } from "k6";
    
     export const options = {
       insecureSkipTLSVerify: true,
       stages: [
         { duration: "120s", target: 5 },
       ],
     };
    
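     // Simulated user behavior: request the proxy root over HTTPS.
     // k6 exposes the exported PROXY_IP shell variable through __ENV.
     export default function () {
       const res = http.get(`https://${__ENV.PROXY_IP}`);
       // With no routes configured, Kong Gateway responds with 404.
       check(res, { "status was 404": (r) => r.status === 404 });
     }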
    
  3. Start the load test:

     k6 run k6.js
    
  4. Observe the scaling events in the cluster while the test is running.

     kubectl get events --field-selector involvedObject.name=horizontal-autoscaling,involvedObject.kind=HorizontalPodAutoscaler
    

    The output will show the scaling events:

     LAST SEEN   TYPE      REASON                         OBJECT                                           MESSAGE
     3m55s       Normal    SuccessfulRescale              horizontalpodautoscaler/horizontal-autoscaling   New size: 6; reason: cpu resource utilization (percentage of request) above target
     3m25s       Normal    SuccessfulRescale              horizontalpodautoscaler/horizontal-autoscaling   New size: 7; reason: cpu resource utilization (percentage of request) above target
     2m55s       Normal    SuccessfulRescale              horizontalpodautoscaler/horizontal-autoscaling   New size: 10; reason: cpu resource utilization (percentage of request) above target
     85s         Normal    SuccessfulRescale              horizontalpodautoscaler/horizontal-autoscaling   New size: 2; reason: All metrics below target
    

    The DataPlane’s status field will also be updated with the number of ready/target replicas:

     kubectl get dataplanes.gateway-operator.konghq.com horizontal-autoscaling -o jsonpath-as-json='{.status}'
    
     [
         {
             ...
             "readyReplicas": 2,
             "replicas": 2,
             ...
         }
     ]
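
The status check in the last step can also be scripted. The sketch below is a hypothetical helper, not part of Kong Gateway Operator: it takes the text printed by the jsonpath-as-json command (a JSON array holding one status object) and reports whether readyReplicas has caught up with replicas. It assumes those two fields carry the ready and target counts, as the sample output suggests.

```javascript
// Hypothetical helper: returns true when every target replica is ready.
// statusJson is the text printed by the jsonpath-as-json command above.
function dataPlaneConverged(statusJson) {
  const [status] = JSON.parse(statusJson); // the command wraps the status in an array
  return (
    status !== undefined &&
    Number.isInteger(status.replicas) &&
    status.readyReplicas === status.replicas
  );
}

// Sample input reduced to the two fields shown in the output above.
const settled = '[{"readyReplicas": 2, "replicas": 2}]';
const scaling = '[{"readyReplicas": 6, "replicas": 10}]';

console.log(dataPlaneConverged(settled)); // true
console.log(dataPlaneConverged(scaling)); // false
```

While a scale-up is in progress the two counts differ, so the helper returns false until the new Pods are ready.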
    