Excessive memory usage in HA cluster on GKE


We run a Keycloak 11.0.3 cluster on GKE, deployed with the codecentric/keycloak Helm chart, version 9.5.0. The cluster has 3 Keycloak replicas and 1 PostgreSQL DB.

We have observed some behaviour that we cannot explain:

  1. Traffic into our cluster increases at specific times of the day. Each time traffic increases, Keycloak's memory usage grows by around 2GB and never goes back down. Is this expected behaviour for Keycloak?

  2. We expected GC to stabilize memory usage; however, memory usage keeps increasing over several days until the JBoss server receives a kill signal.

  3. We are using offline tokens with default settings (refresh token revocation disabled, 30-day idle timeout, 5-minute access token lifespan, etc.). The cluster has only around 50 offline sessions and 50 active sessions. Will the offline sessions (sticky) keep consuming more and more memory?

  4. We also noticed that the Last Refresh date/time is always shown as a date in 1970. Is this related to any of our settings?
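Regarding point 4, our guess (not confirmed) is that Last Refresh is stored as a Unix timestamp and is 0 or unset for these sessions, since a zero timestamp renders as 1 January 1970. A minimal sketch of that effect follows; the function name render_last_refresh is hypothetical and is not part of Keycloak:

```python
from datetime import datetime, timezone


def render_last_refresh(epoch_seconds: int) -> str:
    """Format a Unix timestamp (in seconds) as a UTC date/time string.

    Hypothetical helper for illustration only; it assumes the value is
    stored as seconds since the Unix epoch.
    """
    moment = datetime.fromtimestamp(epoch_seconds, tz=timezone.utc)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


# A zero (or never-set) timestamp lands on the Unix epoch.
print(render_last_refresh(0))  # 1970-01-01 00:00:00
```

If that guess is right, the 1970 date would mean the value was never written rather than that a wrong value was written, but we have not verified this.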

Below is the configuration for our deployment:

Any help and information is appreciated. Thanks in advance!

This screenshot shows the Last Refresh date in 1970:

This screenshot shows our Keycloak instances crashing once in a while after running out of resources:

This shows that whenever traffic grows in the cluster during the day, Keycloak's memory usage increases by around 2GB each time and then remains relatively stable for the rest of the day: