How to: Autoscale workloads with Prometheus

Prerequisites

Kong Konnect

If you don’t have a Konnect account, you can get started quickly with our onboarding wizard.

The following Konnect items are required to complete this tutorial:
- Personal access token (PAT): Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.
Set the personal access token as an environment variable:
```
export KONNECT_TOKEN='YOUR KONNECT TOKEN'
```

Enable the Gateway API

Install the Gateway API CRDs before installing Kong Gateway Operator.

kubectl apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.3.0/standard-install.yaml

Create a Gateway and GatewayClass instance to use.

echo "
apiVersion: v1
kind: Namespace
metadata:
  name: kong
---
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: kong
  annotations:
    konghq.com/gatewayclass-unmanaged: 'true'
spec:
  controllerName: konghq.com/gateway-operator
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: kong
spec:
  gatewayClassName: kong
  listeners:
  - name: proxy
    port: 80
    protocol: HTTP
    allowedRoutes:
      namespaces:
         from: All
" | kubectl apply -n kong -f -

Kong Gateway Operator running (with an Enterprise license)

Add the Kong Helm charts:

helm repo add kong https://charts.konghq.com
helm repo update

Create a kong namespace:

kubectl create namespace kong --dry-run=client -o yaml | kubectl apply -f -

Install Kong Gateway Operator using Helm:

helm upgrade --install kgo kong/gateway-operator -n kong-system --create-namespace  \
  --set image.tag=1.5 \
  --set kubernetes-configuration-crds.enabled=true \
  --set env.ENABLE_CONTROLLER_KONNECT=true

Apply a KongLicense. This assumes that your license is available in ./license.json:

echo "
apiVersion: configuration.konghq.com/v1alpha1
kind: KongLicense
metadata:
 name: kong-license
rawLicenseString: '$(cat ./license.json)'
" | kubectl apply -f -

Required Kubernetes resources

This how-to requires some Kubernetes services to be available in your cluster. These services will be used by the resources created in this how-to.

kubectl apply -f https://developer.konghq.com/manifests/kic/command-service.yaml -n kong

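This how-to also requires one pre-configured route. Apply one of the following manifests, depending on whether you route traffic with the Gateway API (HTTPRoute) or with Ingress: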

echo "
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: command
  namespace: kong
  annotations:
    konghq.com/strip-path: 'true'
spec:
  parentRefs:
  - name: kong
    namespace: kong
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /command
    backendRefs:
    - name: command
      kind: Service
      port: 80
" | kubectl apply -f -

echo "
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: command
  namespace: kong
  annotations:
    konghq.com/strip-path: 'true'
spec:
  ingressClassName: kong
  rules:
  - http:
      paths:
      - path: /command
        pathType: ImplementationSpecific
        backend:
          service:
            name: command
            port:
              number: 80
" | kubectl apply -f -

Autoscaling Workloads

This tutorial shows how to autoscale workloads based on Service latency. The command service created in the prerequisites lets us inject an artificial delay into responses to trigger autoscaling.

Create a `DataPlaneMetricsExtension`

The DataPlaneMetricsExtension allows Kong Gateway Operator to monitor Service latency and expose it on the /metrics endpoint.

Create a DataPlaneMetricsExtension that points to the command service:

 echo '
 kind: DataPlaneMetricsExtension
 apiVersion: gateway-operator.konghq.com/v1alpha1
 metadata:
   name: kong
   namespace: kong
 spec:
   serviceSelector:
     matchNames:
     - name: command
   config:
     latency: true
 ' | kubectl apply -f -

Create a GatewayConfiguration that uses it:

 echo '
 kind: GatewayConfiguration
 apiVersion: gateway-operator.konghq.com/v1beta1
 metadata:
   name: kong
   namespace: kong
 spec:
   controlPlaneOptions:
     extensions:
     - kind: DataPlaneMetricsExtension
       group: gateway-operator.konghq.com
       name: kong
 ' | kubectl apply -f -

Patch the GatewayClass to use the config:

 kubectl patch -n kong --type=json gatewayclass kong -p='[
     {
         "op":"add",
         "path":"/spec/parametersRef",
         "value":{
                 "group": "gateway-operator.konghq.com",
                 "kind": "GatewayConfiguration",
                 "name": "kong",
                 "namespace": "kong"
         }
     }
 ]'

Install Prometheus

Note: You can reuse an existing Prometheus installation and skip this step. If you do, make sure it can scrape Kong Gateway Operator’s metrics (for example, through a ServiceMonitor), and note the namespace in which it is deployed.

Add the prometheus-community helm charts:

helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

Install Prometheus via kube-prometheus-stack helm chart:

helm upgrade --install --create-namespace -n prometheus prometheus prometheus-community/kube-prometheus-stack

Create a ServiceMonitor to scrape Kong Gateway Operator

To make Prometheus scrape Kong Gateway Operator’s /metrics endpoint, we’ll need to create a ServiceMonitor:

echo '
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    release: prometheus
  name: gateway-operator
  namespace: kong-system
spec:
  endpoints:
  - port: https
    scheme: http
    path: /metrics
    bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
  selector:
    matchLabels:
      control-plane: controller-manager
' | kubectl apply -f -

After applying the above manifest you can check one of the metrics exposed by Kong Gateway Operator to verify that the scrape config has been applied.

To access the Prometheus UI, create a port-forward and visit http://localhost:9090:

kubectl port-forward service/prometheus-kube-prometheus-prometheus 9090:9090 -n prometheus

In the Prometheus UI, run the following query:

up{service=~"kgo-gateway-operator-metrics-service"}

Prometheus metrics can take up to 2 minutes to appear.
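
Because the target can take a couple of minutes to show up, polling may be more convenient than re-running the query by hand. The following is a minimal sketch of a generic retry helper in POSIX shell. The function names retry and kgo_target_up are hypothetical and not part of any Kong or Prometheus tooling, and the check assumes the port-forward above is still serving Prometheus on localhost:9090.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
  attempts=$1
  delay=$2
  shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    "$@" && return 0
    i=$((i + 1))
    # Only sleep if another attempt is coming.
    if [ "$i" -le "$attempts" ]; then sleep "$delay"; fi
  done
  return 1
}

# Hypothetical check: succeeds once Prometheus reports the operator target as up.
# Assumes the port-forward to localhost:9090 is active.
kgo_target_up() {
  curl -sfG 'http://localhost:9090/api/v1/query' \
    --data-urlencode 'query=up{service=~"kgo-gateway-operator-metrics-service"} == 1' |
    grep -q '"value"'
}

# Example: poll every 5 seconds for up to 2 minutes (24 attempts).
#   retry 24 5 kgo_target_up && echo "target is up"
```

The check queries the Prometheus HTTP API instead of the UI and treats an empty result as "not yet scraped", so the same retry helper can wrap any other command that exits non-zero until it is ready.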

Install prometheus-adapter

The prometheus-adapter package makes Prometheus metrics usable in Kubernetes.

To deploy prometheus-adapter, you’ll need to decide what time series to expose so that Kubernetes can consume it.

Note: Kong Gateway Operator enriches specific metrics for use with prometheus-adapter. See the overview for a complete list.

Create a values.yaml file to deploy the prometheus-adapter helm chart. This configuration calculates a kong_upstream_latency_ms_60s_average metric, which exposes a 60s moving average of upstream response latency:

echo $'
prometheus:
  # Update this value if Prometheus is installed in a different namespace
  url: http://prometheus-kube-prometheus-prometheus.prometheus.svc

rules:
  default: false
  custom:
  - seriesQuery: \'{__name__=~"^kong_upstream_latency_ms_(sum|count)",kubernetes_namespace!="",kubernetes_name!="",kubernetes_kind!=""}\'
    resources:
      overrides:
        exported_namespace:
          resource: "namespace"
        exported_service:
          resource: "service"
    name:
      as: "kong_upstream_latency_ms_60s_average"
    metricsQuery: |
      sum by (exported_service) (rate(kong_upstream_latency_ms_sum{<<.LabelMatchers>>}[60s:10s]))
        /
      sum by (exported_service) (rate(kong_upstream_latency_ms_count{<<.LabelMatchers>>}[60s:10s]))
' > values.yaml

Install prometheus-adapter using Helm:

helm upgrade --install --create-namespace -n prometheus --values values.yaml prometheus-adapter prometheus-community/prometheus-adapter

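To make the metricsQuery in values.yaml more concrete, the sketch below reproduces its arithmetic in plain shell: the per-second increase of the latency sum divided by the per-second increase of the request count. The function name avg_latency_ms and all sample numbers are invented for illustration. PromQL's rate() also extrapolates at the window edges, so real results will only be approximately equal to this simplified calculation.

```shell
# avg_latency_ms SUM_START SUM_END COUNT_START COUNT_END WINDOW_SECONDS
# Illustrative only: mirrors rate(sum) / rate(count) over one window.
avg_latency_ms() {
  awk -v s0="$1" -v s1="$2" -v c0="$3" -v c1="$4" -v w="$5" 'BEGIN {
    sum_rate   = (s1 - s0) / w   # roughly rate(kong_upstream_latency_ms_sum)
    count_rate = (c1 - c0) / w   # roughly rate(kong_upstream_latency_ms_count)
    if (count_rate == 0) { print "NaN"; exit }   # no requests in the window
    printf "%.1f\n", sum_rate / count_rate
  }'
}

# Invented samples: 60 requests added 6120 ms of total latency over 60 seconds.
avg_latency_ms 1000 7120 10 70 60
```

With these made-up samples the function prints 102.0, that is, an average of 102 ms per request over the window.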

Send traffic

To trigger autoscaling, run the following command in a new terminal window. This will cause the underlying deployment to sleep for 100ms on each request and thus increase the average response time to that value.

while curl -k "http://$(kubectl get -n kong gateway kong -o custom-columns='name:.status.addresses[0].value' --no-headers)/command/shell?cmd=sleep%200.1" ; do sleep 1; done

Keep this running while you move on to the next steps.
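
If you want to vary the injected delay or see per-request timings, the one-liner above can be split into smaller pieces. This is only a sketch: the function names command_url and send_traffic are hypothetical, and the commented usage assumes the Gateway from the prerequisites has an address assigned.

```shell
# command_url GATEWAY_ADDRESS DELAY_SECONDS
# Builds the request URL for the command service.
command_url() {
  # %20 is the URL-encoded space between "sleep" and its argument.
  printf 'http://%s/command/shell?cmd=sleep%%20%s\n' "$1" "$2"
}

# send_traffic GATEWAY_ADDRESS DELAY_SECONDS
# Sends one request per second until curl fails, printing status code and total time.
send_traffic() {
  url=$(command_url "$1" "$2")
  while curl -s -o /dev/null -w '%{http_code} %{time_total}s\n' "$url"; do
    sleep 1
  done
}

# Example (requires the cluster from this how-to):
#   GATEWAY_IP=$(kubectl get -n kong gateway kong -o jsonpath='{.status.addresses[0].value}')
#   send_traffic "$GATEWAY_IP" 0.1
```

Printing the total time per request makes it easy to confirm that the delay you asked for is the delay the service is actually adding.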

Verify metrics are exposed in Kubernetes

When all is configured, you should be able to see the metric you’ve configured in prometheus-adapter exposed via the Kubernetes Custom Metrics API:

kubectl get --raw '/apis/custom.metrics.k8s.io/v1beta1/namespaces/kong/services/command/kong_upstream_latency_ms_60s_average' | jq

Note: The prometheus-adapter may take up to 2 minutes to populate the custom metrics.

This should result in:

{
  "kind": "MetricValueList",
  "apiVersion": "custom.metrics.k8s.io/v1beta1",
  "metadata": {},
  "items": [
    {
      "describedObject": {
        "kind": "Service",
        "namespace": "kong",
        "name": "command",
        "apiVersion": "/v1"
      },
      "metricName": "kong_upstream_latency_ms_60s_average",
      "timestamp": "2024-03-06T13:11:12Z",
      "value": "102312m",
      "selector": null
    }
  ]
}

Note: 102312m is Kubernetes quantity notation, in which the m suffix means thousandths. The value is therefore 102.312, so the reported latency is approximately 102 milliseconds (ms).
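
As a quick sanity check, the conversion can be scripted. The helper below is a minimal sketch with a hypothetical name (quantity_to_decimal); it only understands plain numbers and the m (milli) suffix, not the other suffixes that Kubernetes quantities allow.

```shell
# quantity_to_decimal QUANTITY
# Handles plain numbers and the "m" (milli) suffix only.
quantity_to_decimal() {
  case "$1" in
    *m) awk -v v="${1%m}" 'BEGIN { printf "%.3f\n", v / 1000 }' ;;
    *)  awk -v v="$1" 'BEGIN { printf "%.3f\n", v }' ;;
  esac
}

quantity_to_decimal 102312m   # the value from the response above
quantity_to_decimal 40        # a plain value without a suffix
```

This prints 102.312 and 40.000, which puts the metric value and a suffix-free value such as an autoscaling target on the same scale.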

Use exposed metric in HorizontalPodAutoscaler

When the metric configured in prometheus-adapter is available through Kubernetes’s Custom Metrics API, we can use it in HorizontalPodAutoscaler to autoscale our workload, specifically the command Deployment.

This can be done by using the following manifest, which will scale the underlying command Deployment between 1 and 10 replicas, trying to keep the average latency across the last 60s at 40ms.

echo '
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: command
  namespace: kong
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: command
  minReplicas: 1
  maxReplicas: 10
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 1
      policies:
      - type: Percent
        value: 100
        periodSeconds: 10
    scaleUp:
      stabilizationWindowSeconds: 1
      policies:
      - type: Percent
        value: 100
        periodSeconds: 2
      - type: Pods
        value: 4
        periodSeconds: 2
      selectPolicy: Max
  metrics:
  - type: Object
    object:
      metric:
        name: "kong_upstream_latency_ms_60s_average"
      describedObject:
        apiVersion: v1
        kind: Service
        name: command
      target:
        type: Value
        value: "40"
' | kubectl apply -f -
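
To get a feel for how the autoscaler reacts to this target, the sketch below applies the replica formula from the Kubernetes HorizontalPodAutoscaler documentation: desired = ceil(current replicas x current value / target value), clamped to minReplicas and maxReplicas. The function name desired_replicas is hypothetical. The calculation assumes the default 10% tolerance and deliberately ignores the behavior policies and stabilization windows in the manifest above, so it will not reproduce the exact step sizes you observe.

```shell
# desired_replicas CURRENT_REPLICAS CURRENT_VALUE TARGET_VALUE MIN MAX
# Simplified: no behavior policies, no stabilization window.
desired_replicas() {
  awk -v r="$1" -v cur="$2" -v tgt="$3" -v min="$4" -v max="$5" 'BEGIN {
    ratio = cur / tgt
    # Within the assumed 10% tolerance the replica count is left alone.
    if (ratio >= 0.9 && ratio <= 1.1) { print r; exit }
    d = r * ratio
    n = int(d); if (n < d) n++   # ceil()
    if (n < min) n = min
    if (n > max) n = max
    print n
  }'
}

# One replica, about 102 ms measured latency, 40 ms target, bounds 1..10.
desired_replicas 1 102.312 40 1 10
```

For one replica at roughly 102 ms against the 40 ms target this prints 3. Because the metric describes the Service as a whole, the autoscaler keeps adding replicas for as long as the measured latency stays above the target, up to maxReplicas.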

Observe Kubernetes `SuccessfulRescale` events

You can watch SuccessfulRescale events using the following kubectl command:

kubectl get events -n kong --field-selector involvedObject.name=command,involvedObject.kind=HorizontalPodAutoscaler -w

If everything went well we should see the SuccessfulRescale events:

12m          Normal   SuccessfulRescale   horizontalpodautoscaler/command   New size: 2; reason: Service metric kong_upstream_latency_ms_60s_average above target
12m          Normal   SuccessfulRescale   horizontalpodautoscaler/command   New size: 4; reason: Service metric kong_upstream_latency_ms_60s_average above target
12m          Normal   SuccessfulRescale   horizontalpodautoscaler/command   New size: 8; reason: Service metric kong_upstream_latency_ms_60s_average above target
12m          Normal   SuccessfulRescale   horizontalpodautoscaler/command   New size: 10; reason: Service metric kong_upstream_latency_ms_60s_average above target

Then when latency drops (when you stop sending traffic with the curl command) you should observe the SuccessfulRescale events scaling your workloads down:

4s          Normal   SuccessfulRescale   horizontalpodautoscaler/command   New size: 1; reason: All metrics below target
