Configure Service health checks

TL;DR

Configure spec.healthchecks in a KongUpstreamPolicy resource, then attach that resource to a Kubernetes Service using the konghq.com/upstream-policy annotation.

Prerequisites

If you don’t have a Konnect account, you can get started quickly with our onboarding wizard.

  1. The following Konnect items are required to complete this tutorial:
    • Personal access token (PAT): Create a new personal access token by opening the Konnect PAT page and selecting Generate Token.
  2. Set the personal access token as an environment variable:

    export KONNECT_TOKEN='YOUR KONNECT TOKEN'
    

Health check types

Kong Gateway supports active and passive health checks. These allow Kong Gateway to automatically short-circuit requests to specific Pods that are misbehaving in your Kubernetes cluster. The process for re-enabling these Pods differs between active and passive health checks.

Passive health checks

Pods that are marked as unhealthy by Kong Gateway are permanently marked as unhealthy.

If a passive health check is configured for a Service and the Pod that runs the Service returns enough errors to be marked unhealthy, Kong Gateway stops proxying requests to that Pod. If no healthy Pod remains, Kong Gateway returns a 503, indicating that the service is unavailable.

There is no way to mark the Pod as healthy again using Kong Ingress Controller and passive health checks. To resolve the issue, choose one of the following options:

  • Delete the current Pod: Kong Gateway then sends proxy requests to the new Pod that is in its place.
  • Scale the deployment: Kong Gateway then sends proxy requests to the new Pods and leaves the short-circuited Pod out of the loop.
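
As a rough illustration, the two options could be run with kubectl as follows. The kong namespace matches the rest of this guide, but the app=httpbin label, the httpbin Deployment name, and the two helper function names are assumptions about how the demo workload is deployed. Adjust them to match your cluster.

```shell
# Sketch only: the label selector and Deployment name below are assumptions.

# Option 1: delete the Pods matching the label so that their controller
# creates replacements.
replace_httpbin_pod() {
  kubectl delete pod -n kong -l app=httpbin
}

# Option 2: scale the Deployment so that new Pods can receive traffic.
# Usage: scale_httpbin REPLICAS
scale_httpbin() {
  kubectl scale deployment httpbin -n kong --replicas="$1"
}
```

For example, scale_httpbin 2 would add a second Pod next to the short-circuited one.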

Active health checks

Pods that are marked as unhealthy by Kong Gateway are temporarily marked as unhealthy.

Kong Gateway periodically sends a request to the health check path. Once it has received enough healthy responses, it re-enables the Pod in the load balancer and routes traffic to it again.

Enable passive health checking

  1. All health checks are configured at the Service level. To configure Kong Gateway to short-circuit requests to a Pod if it throws three consecutive errors, add a KongUpstreamPolicy resource:

    echo '
    apiVersion: configuration.konghq.com/v1beta1
    kind: KongUpstreamPolicy
    metadata:
      name: demo-health-checking
      namespace: kong
    spec:
      healthchecks:
        passive:
          healthy:
            successes: 3
          unhealthy:
            httpFailures: 3
    ' | kubectl apply -f -
    
  2. Associate the KongUpstreamPolicy resource with the httpbin Service:

     kubectl patch -n kong svc httpbin -p '{"metadata":{"annotations":{"konghq.com/upstream-policy":"demo-health-checking"}}}'
    
  3. Test the Ingress rule by sending two requests to /status/500 that simulate a failure from the upstream service:

     curl -i "$PROXY_IP/httpbin/status/500"
     curl -i "$PROXY_IP/httpbin/status/500"
    

    The results should look like this:

     HTTP/1.1 500 INTERNAL SERVER ERROR
     Content-Type: text/html; charset=utf-8
     Content-Length: 0
     Connection: keep-alive
     Server: gunicorn/19.9.0
     Access-Control-Allow-Origin: *
     Access-Control-Allow-Credentials: true
     X-Kong-Upstream-Latency: 1
     X-Kong-Proxy-Latency: 0
     Via: kong/3.11
    
  4. Send a third request, this time to /status/200. Because it is a healthy response, it resets the circuit breaker counter:

     curl -i "$PROXY_IP/httpbin/status/200"
    

    The results should look like this:

     HTTP/1.1 200 OK
     Content-Type: text/html; charset=utf-8
     Content-Length: 0
     Connection: keep-alive
     Server: gunicorn/19.9.0
     Access-Control-Allow-Origin: *
     Access-Control-Allow-Credentials: true
     X-Kong-Upstream-Latency: 1
     X-Kong-Proxy-Latency: 0
     Via: kong/3.11
    

    Kong Gateway didn’t short-circuit because there were only two failures.
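
The counting behavior in the steps above can be sketched as a small shell function. This is a simplified model for illustration, not Kong Gateway's actual implementation: passive_state is a hypothetical name, and treating every 5xx status as a failure is an assumption made to keep the example short.

```shell
# Simplified model of a passive health check counter (not Kong Gateway's code).
# Usage: passive_state THRESHOLD STATUS...
# Prints "unhealthy" once THRESHOLD consecutive failures are seen, else "healthy".
passive_state() {
  threshold=$1
  shift
  failures=0
  for status in "$@"; do
    if [ "$status" -ge 500 ]; then
      failures=$((failures + 1))   # assumption: any 5xx counts as a failure
    else
      failures=0                   # a healthy response resets the counter
    fi
    if [ "$failures" -ge "$threshold" ]; then
      echo unhealthy
      return 0
    fi
  done
  echo healthy
}

passive_state 3 500 500 200   # prints "healthy": the 200 resets the counter
passive_state 3 500 500 500   # prints "unhealthy": three consecutive failures
```

The first call mirrors the requests sent so far: two failures followed by a success leave the Pod healthy.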

Trip the circuit breaker

  1. Send three requests to /status/500 to mark the pod as unhealthy. We need three requests as this is the number provided in unhealthy.httpFailures in the KongUpstreamPolicy resource:

     curl -i "$PROXY_IP/httpbin/status/500"
     curl -i "$PROXY_IP/httpbin/status/500"
     curl -i "$PROXY_IP/httpbin/status/500"
    

    The results should look like this:

     HTTP/1.1 500 INTERNAL SERVER ERROR
     Content-Type: text/html; charset=utf-8
     Content-Length: 0
     Connection: keep-alive
     Server: gunicorn/19.9.0
     Access-Control-Allow-Origin: *
     Access-Control-Allow-Credentials: true
     X-Kong-Upstream-Latency: 1
     X-Kong-Proxy-Latency: 0
     Via: kong/3.11
    
  2. Make a request to /status/200 and note that Kong Gateway returns an HTTP 503:

     curl -i "$PROXY_IP/httpbin/status/200"
    

    The results should look like this:

     HTTP/1.1 503 Service Temporarily Unavailable
     Content-Type: application/json; charset=utf-8
     Connection: keep-alive
     Content-Length: 62
     X-Kong-Response-Latency: 0
     Server: kong/3.11
    
     {
       "message":"failure to get a peer from the ring-balancer"
     }
    

    Because there’s only one Pod of the httpbin Service running in the cluster, and that Pod is throwing errors, Kong Gateway doesn’t proxy any additional requests. To resolve this, you can use active health checks, where each instance of Kong Gateway actively probes Pods to check whether they are healthy.
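
When repeating these requests, it can be easier to look only at the status codes. The following is a small sketch of a helper that does this. The name send_requests is hypothetical, and the function assumes PROXY_IP is already set, as in the steps above.

```shell
# Hypothetical helper (not part of the original tutorial).
# Usage: send_requests COUNT REQUEST_PATH
# Sends COUNT requests through the proxy and prints one status code per line.
send_requests() {
  count=$1
  req_path=$2
  i=0
  while [ "$i" -lt "$count" ]; do
    # -s silences progress output, -o /dev/null discards the body,
    # -w prints only the HTTP status code of the response.
    curl -s -o /dev/null -w '%{http_code}\n' "$PROXY_IP/httpbin$req_path"
    i=$((i + 1))
  done
}
```

With the passive policy from this guide applied, running send_requests 3 /status/500 followed by send_requests 1 /status/200 would be expected to show three 500 responses and then a 503.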

Enable active health checking

Active health checking can automatically mark an upstream service as healthy again once it receives enough healthy responses.

  1. Update the KongUpstreamPolicy resource to use active health checks:

    echo '
    apiVersion: configuration.konghq.com/v1beta1
    kind: KongUpstreamPolicy
    metadata:
      name: demo-health-checking
      namespace: kong
    spec:
      healthchecks:
        active:
          healthy:
            interval: 5
            successes: 3
          httpPath: /status/200
          type: http
          unhealthy:
            httpFailures: 1
            interval: 5
        passive:
          healthy:
            successes: 3
          unhealthy:
            httpFailures: 3
    ' | kubectl apply -f -
    

    This configures Kong Gateway to actively probe /status/200 every five seconds. If a Pod is unhealthy from Kong Gateway’s perspective, three successful probes change the status of the Pod to healthy, and Kong Gateway again starts to forward requests to that Pod. Wait 15 seconds for the pod to be marked as healthy before continuing.

  2. Make a request to /status/200 after 15 seconds:

     sleep 15 && curl -i "$PROXY_IP/httpbin/status/200"
    

    The results should look like this:

     HTTP/1.1 200 OK
     Content-Type: text/html; charset=utf-8
     Content-Length: 0
     Connection: keep-alive
     Server: gunicorn/19.9.0
     Access-Control-Allow-Origin: *
     Access-Control-Allow-Credentials: true
     X-Kong-Upstream-Latency: 1
     X-Kong-Proxy-Latency: 1
     Via: kong/3.11
    
  3. Trip the circuit again by sending three requests that return an HTTP 500 from httpbin:

    curl -i "$PROXY_IP/httpbin/status/500"
    curl -i "$PROXY_IP/httpbin/status/500"
    curl -i "$PROXY_IP/httpbin/status/500"
    

    The httpbin Pod is now marked as unhealthy for about 15 seconds. This is the time active health checks need to reclassify the httpbin Pod as healthy (three successful probes at a five-second interval). A request sent during this window returns an HTTP 503:

    curl -i "$PROXY_IP/httpbin/status/200"
    

    The results should look like this:

    HTTP/1.1 503 Service Temporarily Unavailable
    Content-Type: application/json; charset=utf-8
    Connection: keep-alive
    Content-Length: 62
    X-Kong-Response-Latency: 0
    Server: kong/3.11
    
    {
      "message":"failure to get a peer from the ring-balancer"
    }
    
  4. Wait 15 seconds, then make another request:

    sleep 15 && curl -i "$PROXY_IP/httpbin/status/200"
    

    The results should look like this:

    HTTP/1.1 200 OK
    Content-Type: text/html; charset=utf-8
    Content-Length: 0
    Connection: keep-alive
    Server: gunicorn/19.9.0
    Access-Control-Allow-Origin: *
    Access-Control-Allow-Credentials: true
    X-Kong-Upstream-Latency: 1
    X-Kong-Proxy-Latency: 1
    Via: kong/3.11
    

Active health checking has marked the upstream as healthy again in Kong Gateway.
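
The 15-second wait used in the steps above follows from the active health check settings: three successful probes at a five-second interval. A minimal sketch of that arithmetic is shown below, using a hypothetical recovery_seconds helper. Treat the result as an estimate rather than a guarantee, because the timing of the first probe relative to the failure isn't fixed.

```shell
# Hypothetical helper: approximate time for active probes to mark a Pod healthy.
# Usage: recovery_seconds SUCCESSES INTERVAL_SECONDS
recovery_seconds() {
  echo $(( $1 * $2 ))
}

recovery_seconds 3 5   # prints 15
```

If you change healthy.successes or healthy.interval in the KongUpstreamPolicy, the sleep durations in the test commands would need to change accordingly.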

Cleanup

kubectl delete -n kong -f https://developer.konghq.com/manifests/kic/httpbin-service.yaml
