AWS EKS/K8s healthcheck uses underlying server IP address and fails

Hi, we are migrating our Keycloak to version 19, which runs as a Kubernetes pod, with health checks enabled on port 8080. I have found that the health checks are failing because the check is using the underlying host's IP address, not the pod IP address. To get the health checks working we have exposed them via a NodePort, but we don't like exposing services via NodePorts. Is this normal behaviour? Below is the error message found in the Kubernetes events.

Warning   Unhealthy           pod/cipher-login-01-6984fb88fb-xzww2    Liveness probe failed: Get "http://10.30.4.183:8080/auth/health/live": dial tcp 10.30.4.183:8080: connect: connection refused

Deployment health-check spec (NodePort being used, a bit of a botch):

        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /auth/health/live
            port: 31629
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 30
        name: keycloak
        ports:
        - containerPort: 8080
          name: application
          protocol: TCP
        readinessProbe:
          failureThreshold: 3
          httpGet:
            path: /auth/health/ready
            port: 31629
            scheme: HTTP
          initialDelaySeconds: 30
          periodSeconds: 30
          successThreshold: 1
          timeoutSeconds: 30

Is there a way to configure Keycloak not to use the server IP? Our current version is 16.1.0 and it does not have this problem, nor do any of our other k8s deployments.

Any advice would be appreciated.

Regards Mark Day

I could be wrong here, but I'm pretty sure liveness, readiness and startup probes talk directly to the pod, not to a service.

Also, from version 17 onwards, Keycloak is served under /, not /auth, unless you override it to keep the old behaviour.

So a readinessProbe would look something like this:

readinessProbe:
  failureThreshold: 3
  httpGet:
    path: /health/ready
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 30
  successThreshold: 1
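
And if you keep the old /auth prefix, the relative path is controlled by KC_HTTP_RELATIVE_PATH; a minimal sketch, assuming you set it as a container environment variable:

env:
  # serve Keycloak (and its health endpoints) under /auth again
  - name: KC_HTTP_RELATIVE_PATH
    value: "/auth"

The probe paths then become /auth/health/ready and /auth/health/live to match.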

Hi, yes, I thought that too, but the developers kept /auth for the old behaviour. When testing inside the pod, it responds to curl -v http://localhost:8080/auth/health:

sh-4.4$ curl -v http://localhost:8080/auth/health
*   Trying 127.0.0.1...
* TCP_NODELAY set
* Connected to localhost (127.0.0.1) port 8080 (#0)
> GET /auth/health HTTP/1.1
> Host: localhost:8080
> User-Agent: curl/7.61.1
> Accept: */*
> 
< HTTP/1.1 200 OK
< content-type: application/json; charset=UTF-8
< content-length: 46
< 

{
    "status": "UP",
    "checks": [
    ]
* Connection #0 to host localhost left intact

@weltonrodrigo it failed even for

httpGet:
    path: /health/ready
    port: 8080

in my case. I have logged https://github.com/keycloak/keycloak/issues/13839 to track it. Could you please weigh in there?
Thanks,
Mohammed Adain

Have you tried path: /auth/health/ready ?

Also, the health endpoints are not enabled by default; you have to enable them with the environment variable KC_HEALTH_ENABLED=true.
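
In a deployment that would look roughly like this (a sketch of the container's env block):

env:
  # health endpoints are off by default on the Quarkus distribution
  - name: KC_HEALTH_ENABLED
    value: "true"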

Hi, yes, we have enabled the environment variable and the /auth/health/ready path works. The EKS IP address has turned out to be a bit of a red herring. I did some testing with a standard Quay image: first I started with /health/ready and /health/live, and both health checks worked as expected. But once I set KC_HTTP_RELATIVE_PATH: "/auth" and changed the health-check paths to /auth/health/ready and /auth/health/live, the pod would start to log intermittent health-check failures, which were captured in kubectl get events, and every now and then these failures would cause the pod to restart.
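
For reference, the combination I was testing was along these lines (a sketch of the relevant parts; thresholds and timings taken from our original spec):

env:
  - name: KC_HEALTH_ENABLED
    value: "true"
  - name: KC_HTTP_RELATIVE_PATH
    value: "/auth"
livenessProbe:
  httpGet:
    # probe the container port directly, not a NodePort
    path: /auth/health/live
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 30
  failureThreshold: 3
readinessProbe:
  httpGet:
    path: /auth/health/ready
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 30
  failureThreshold: 3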

see my response below

After setting KC_HEALTH_ENABLED=true, /health/ready seems to work.
For the liveness probe it would be /health/live.

Thanks
