Configure Service health checks

Uses: Kong Ingress Controller

Health check types

Kong Gateway supports active and passive health checks. This allows Kong Gateway to automatically short-circuit requests to specific Pods that are misbehaving in your Kubernetes Cluster. The process to re-enable these pods is different between active and passive health checks.

Passive health checks

With passive health checks, a Pod that Kong Gateway marks as unhealthy stays unhealthy permanently.

If a passive health check is configured for a Service that runs in a cluster, and the Pod that runs the Service reports errors, Kong Gateway returns a 503, indicating that the Service is unavailable. Kong Gateway doesn’t proxy any requests to the unhealthy Pod.

There is no way to mark the Pod as healthy again using Kong Ingress Controller and passive health checks. To resolve the issue, choose one of the following options:

Delete the current Pod: Kong Gateway then sends proxy requests to the new Pod that is in its place.
Scale the deployment: Kong Gateway then sends proxy requests to the new Pods and leaves the short-circuited Pod out of the loop.
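The two options above can be sketched as shell helpers. This is a minimal sketch, assuming the workload is a Deployment named httpbin in the kong namespace whose Pods carry the label app=httpbin. Those names and the helper function names are assumptions for illustration, so adjust them to match your manifests.

```shell
#!/usr/bin/env bash
# Assumed names (not confirmed by this guide): namespace "kong",
# Deployment "httpbin", Pod label "app=httpbin".
NAMESPACE="${NAMESPACE:-kong}"

# Option 1: delete the current Pod(s); the Deployment creates replacements.
replace_httpbin_pods() {
  kubectl delete pod -n "$NAMESPACE" -l app=httpbin
}

# Option 2: scale the Deployment so that new Pods can receive traffic.
scale_httpbin() {
  local replicas="${1:-2}"
  kubectl scale -n "$NAMESPACE" deployment/httpbin --replicas="$replicas"
}
```

For example, `scale_httpbin 2` asks Kubernetes for two replicas, while `replace_httpbin_pods` removes every Pod that matches the assumed label.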

Active health checks

With active health checks, a Pod that Kong Gateway marks as unhealthy stays unhealthy only temporarily.

Kong Gateway will make a request to the healthcheck path periodically. When it has received enough healthy responses, it will re-enable the Pod in the load balancer and traffic will be routed to the Pod again.

Enable passive health checking

All health checks are configured at the Service level. To configure Kong Gateway to short-circuit requests to a Pod if it throws 3 consecutive errors, add a KongUpstreamPolicy resource:

echo '
apiVersion: configuration.konghq.com/v1beta1
kind: KongUpstreamPolicy
metadata:
  name: demo-health-checking
  namespace: kong
spec:
  healthchecks:
    passive:
      healthy:
        successes: 3
      unhealthy:
        httpFailures: 3
' | kubectl apply -f -

Associate the KongUpstreamPolicy resource with httpbin Service:

 kubectl patch -n kong svc httpbin -p '{"metadata":{"annotations":{"konghq.com/upstream-policy":"demo-health-checking"}}}'
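To confirm that the annotation was applied, you can read it back from the Service. The helper below is a minimal sketch; the function name `upstream_policy_of` is hypothetical, and the default namespace kong is taken from the commands in this guide.

```shell
#!/usr/bin/env bash
# Print the konghq.com/upstream-policy annotation of a Service.
# Empty output means the annotation is not set.
# The dots in the annotation key are escaped for kubectl's JSONPath parser.
upstream_policy_of() {
  local service="$1" namespace="${2:-kong}"
  kubectl get svc "$service" -n "$namespace" \
    -o jsonpath='{.metadata.annotations.konghq\.com/upstream-policy}'
}
```

Running `upstream_policy_of httpbin` should print demo-health-checking if the patch succeeded.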

Test the Ingress rule by sending two requests to /status/500 that simulate a failure from the upstream service:

 for _  in {1..2}; do
  curl  -i "$PROXY_IP/httpbin/status/500" \
      --no-progress-meter --fail-with-body ; done

The results should look like this:

 HTTP/1.1 500 INTERNAL SERVER ERROR
 Content-Type: text/html; charset=utf-8
 Content-Length: 0
 Connection: keep-alive
 Server: gunicorn/19.9.0
 Access-Control-Allow-Origin: *
 Access-Control-Allow-Credentials: true
 X-Kong-Upstream-Latency: 1
 X-Kong-Proxy-Latency: 0
 Via: kong/3.12

Send a third request, this time to /status/200. This resets the circuit breaker counter because it is a healthy response:

  curl  -i "$PROXY_IP/httpbin/status/200" \
      --no-progress-meter --fail-with-body

The results should look like this:

 HTTP/1.1 200 OK
 Content-Type: text/html; charset=utf-8
 Content-Length: 0
 Connection: keep-alive
 Server: gunicorn/19.9.0
 Access-Control-Allow-Origin: *
 Access-Control-Allow-Credentials: true
 X-Kong-Upstream-Latency: 1
 X-Kong-Proxy-Latency: 0
 Via: kong/3.12

Kong Gateway didn’t short-circuit because there were only two failures.
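The counting behavior in this walkthrough can be illustrated with a small simulation. This is a simplified model, not Kong Gateway's implementation: it assumes that any status code of 500 or above counts as a failure, that any other response resets the counter, and that the state never recovers once the threshold is reached. The function name `passive_state` is hypothetical.

```shell
#!/usr/bin/env bash
# Simplified model of a passive health check counter (an illustration only).
# Usage: passive_state THRESHOLD STATUS_CODE...
# Prints "unhealthy" once THRESHOLD consecutive failures are seen,
# otherwise prints "healthy".
passive_state() {
  local threshold="$1"
  shift
  local failures=0 code
  for code in "$@"; do
    if [ "$code" -ge 500 ]; then
      failures=$((failures + 1))
      if [ "$failures" -ge "$threshold" ]; then
        # Passive checks never re-enable the target, so stop here.
        echo unhealthy
        return 0
      fi
    else
      # A healthy response resets the consecutive failure counter.
      failures=0
    fi
  done
  echo healthy
}

passive_state 3 500 500 200               # prints "healthy"
passive_state 3 500 500 200 500 500 500   # prints "unhealthy"
```

The first call mirrors the three requests sent so far: two failures followed by a success leave the Pod healthy. The second call adds three more consecutive failures, which reaches the threshold.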

Trip the circuit breaker

Send three requests to /status/500 to mark the pod as unhealthy. We need three requests as this is the number provided in unhealthy.httpFailures in the KongUpstreamPolicy resource:

 for _  in {1..3}; do
  curl  -i "$PROXY_IP/httpbin/status/500" \
      --no-progress-meter --fail-with-body ; done

The results should look like this:

 HTTP/1.1 500 INTERNAL SERVER ERROR
 Content-Type: text/html; charset=utf-8
 Content-Length: 0
 Connection: keep-alive
 Server: gunicorn/19.9.0
 Access-Control-Allow-Origin: *
 Access-Control-Allow-Credentials: true
 X-Kong-Upstream-Latency: 1
 X-Kong-Proxy-Latency: 0
 Via: kong/3.12

Make a request to /status/200 and note that Kong Gateway returns an HTTP 503:

  curl  -i "$PROXY_IP/httpbin/status/200" \
      --no-progress-meter --fail-with-body

The results should look like this:

 HTTP/1.1 503 Service Temporarily Unavailable
 Content-Type: application/json; charset=utf-8
 Connection: keep-alive
 Content-Length: 62
 X-Kong-Response-Latency: 0
 Server: kong/3.12

 {
   "message":"failure to get a peer from the ring-balancer"
 }

Because only one Pod of the httpbin Service is running in the cluster, and that Pod is throwing errors, Kong Gateway doesn’t proxy any additional requests. To resolve this, you can use active health checks, where each instance of Kong Gateway actively probes Pods to check whether they are healthy.

Enable active health checking

Active health checking can automatically mark an upstream service as healthy again once it receives enough healthy responses.

Update the KongUpstreamPolicy resource to use active health checks:

echo '
apiVersion: configuration.konghq.com/v1beta1
kind: KongUpstreamPolicy
metadata:
  name: demo-health-checking
  namespace: kong
spec:
  healthchecks:
    active:
      healthy:
        interval: 5
        successes: 3
      httpPath: /status/200
      type: http
      unhealthy:
        httpFailures: 1
        interval: 5
    passive:
      healthy:
        successes: 3
      unhealthy:
        httpFailures: 3
' | kubectl apply -f -

This configures Kong Gateway to actively probe /status/200 every five seconds. If a Pod is unhealthy from Kong Gateway’s perspective, three successful probes change the status of the Pod to healthy, and Kong Gateway again starts to forward requests to that Pod. Wait 15 seconds for the pod to be marked as healthy before continuing.
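The 15-second wait follows from the policy values: three successful probes at a five-second interval. The sketch below only does that arithmetic; the function name `recovery_window` is hypothetical, and the real delay can be somewhat longer depending on when the first probe fires.

```shell
#!/usr/bin/env bash
# Estimate how many seconds of consecutive healthy probes are needed
# before a Pod is re-enabled: successes multiplied by the probe interval.
recovery_window() {
  local successes="$1" interval="$2"
  echo $((successes * interval))
}

# Values from healthchecks.active.healthy in the policy above.
recovery_window 3 5   # prints 15
```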

Make a request to /status/200 after 15 seconds:

 sleep 15 &&  curl  -i "$PROXY_IP/httpbin/status/200" \
      --no-progress-meter --fail-with-body

The results should look like this:

 HTTP/1.1 200 OK
 Content-Type: text/html; charset=utf-8
 Content-Length: 0
 Connection: keep-alive
 Server: gunicorn/19.9.0
 Access-Control-Allow-Origin: *
 Access-Control-Allow-Credentials: true
 X-Kong-Upstream-Latency: 1
 X-Kong-Proxy-Latency: 1
 Via: kong/3.12

Trip the circuit breaker again by sending three requests that return an HTTP 500 from httpbin:

for _  in {1..3}; do
 curl  -i "$PROXY_IP/httpbin/status/500" \
     --no-progress-meter --fail-with-body ; done

The httpbin Pod is now marked as unhealthy for about 15 seconds. This is the time that active health checks need to re-classify the httpbin Pod as healthy (three successful probes at a five-second interval). A request sent immediately is rejected:

curl -i $PROXY_IP/httpbin/status/200

The results should look like this:

HTTP/1.1 503 Service Temporarily Unavailable
Content-Type: application/json; charset=utf-8
Connection: keep-alive
Content-Length: 62
X-Kong-Response-Latency: 0
Server: kong/3.12

{
  "message":"failure to get a peer from the ring-balancer"
}

Wait 15 seconds, then make another request:

sleep 15 &&  curl  -i "$PROXY_IP/httpbin/status/200" \
     --no-progress-meter --fail-with-body

The results should look like this:

HTTP/1.1 200 OK
Content-Type: text/html; charset=utf-8
Content-Length: 0
Connection: keep-alive
Server: gunicorn/19.9.0
Access-Control-Allow-Origin: *
Access-Control-Allow-Credentials: true
X-Kong-Upstream-Latency: 1
X-Kong-Proxy-Latency: 1
Via: kong/3.12

Active health checking has marked the upstream as healthy again in Kong Gateway.
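A fixed sleep works for a demo, but in scripts it can be more robust to poll until the expected status comes back. The helper below is a minimal sketch of that idea; the function name `wait_for_status` and its default values are hypothetical and are not part of Kong Gateway or Kong Ingress Controller.

```shell
#!/usr/bin/env bash
# Poll a URL once per second until it returns the expected HTTP status
# or the timeout (in seconds) expires.
# Returns 0 when the status is seen, 1 on timeout.
wait_for_status() {
  local url="$1" expected="${2:-200}" timeout="${3:-30}"
  local elapsed=0 code
  while [ "$elapsed" -lt "$timeout" ]; do
    # -s silences progress, -o discards the body, -w prints only the status.
    # "|| true" keeps a failed connection from aborting scripts run with set -e.
    code="$(curl -s -o /dev/null -w '%{http_code}' "$url" || true)"
    if [ "$code" = "$expected" ]; then
      return 0
    fi
    sleep 1
    elapsed=$((elapsed + 1))
  done
  return 1
}
```

For example, `wait_for_status "$PROXY_IP/httpbin/status/200" 200 30` returns as soon as the Pod is re-enabled, or fails after roughly 30 seconds.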

Cleanup

Delete created Kubernetes resources

kubectl delete -n kong -f https://developer.konghq.com/manifests/kic/httpbin-service.yaml
