Performance problems after load tests on Kubernetes

Hello all.

We have Keycloak deployed in Kubernetes (this helm chart: https://github.com/codecentric/helm-charts/tree/master/charts/keycloak ) and we are performing load tests at the moment, but we are having problems.

Our setup is the following:
4 pods with 500m CPU (half a core) and 1 GB of memory each;
1 external PostgreSQL instance.

When we run the load tests, Keycloak becomes super slow (it takes 10 to 20 seconds to get a response to a request), which could make sense if the problem is the database.

However, the reason I’m posting this topic is that Keycloak is also super slow AFTER the load tests are over. For example, it takes ~10 seconds to get the user list for a given realm. And we found out that if we clear the realm cache, Keycloak becomes fast again.
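A minimal sketch of how this before/after comparison could be reproduced from a script, assuming the Admin REST API is reachable and an admin user in the master realm is available. The base URL, realm name, and environment variable names are placeholders, not values from our setup. It times the user list request, clears the realm cache through the `clear-realm-cache` admin endpoint, and times the same request again.

```python
import json
import os
import time
import urllib.parse
import urllib.request


def admin_url(base_url, realm, path):
    """Build an Admin REST API URL for the given realm."""
    realm_part = urllib.parse.quote(realm)
    return f"{base_url.rstrip('/')}/admin/realms/{realm_part}/{path.lstrip('/')}"


def get_admin_token(base_url, username, password):
    """Fetch an access token with the built-in admin-cli client (password grant)."""
    url = f"{base_url.rstrip('/')}/realms/master/protocol/openid-connect/token"
    body = urllib.parse.urlencode({
        "grant_type": "password",
        "client_id": "admin-cli",
        "username": username,
        "password": password,
    }).encode()
    with urllib.request.urlopen(urllib.request.Request(url, data=body)) as resp:
        return json.load(resp)["access_token"]


def timed_call(url, token, method="GET"):
    """Send one authenticated request; return (elapsed seconds, response size in bytes)."""
    req = urllib.request.Request(
        url, method=method, headers={"Authorization": f"Bearer {token}"}
    )
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        payload = resp.read()
    return time.perf_counter() - start, len(payload)


def compare_user_list_latency(base_url, realm, token):
    """Time GET /users before and after clearing the realm cache."""
    users_url = admin_url(base_url, realm, "users")
    before, _ = timed_call(users_url, token)
    timed_call(admin_url(base_url, realm, "clear-realm-cache"), token, method="POST")
    after, _ = timed_call(users_url, token)
    return before, after


if __name__ == "__main__":
    # Placeholder values; older Keycloak versions serve under "/auth", newer ones may not.
    base = os.environ.get("KC_BASE_URL", "https://keycloak.example.com/auth")
    realm = os.environ.get("KC_REALM", "myrealm")
    token = get_admin_token(base, os.environ["KC_ADMIN_USER"], os.environ["KC_ADMIN_PASSWORD"])
    before, after = compare_user_list_latency(base, realm, token)
    print(f"user list: {before:.2f}s before cache clear, {after:.2f}s after")
```

Running it once right after a load test and once on an idle cluster would show whether the slowdown really follows the cache state rather than leftover load.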

So, my question is: can someone explain why the realm cache has such a big performance impact in Keycloak? And can someone give advice on how to solve this?

Thank you!

Hi again.

Sorry for the bump, but this is really a mystery to us.

We were checking the network IN/OUT graphs and we can’t explain the values.
We have 3 environments: Unstable, Staging and Stable. In Staging and Stable we have no performance problems, but in Unstable we do.

The Network IN/OUT for Stable/Staging shows values below 500 kB. For Unstable, it’s 10x more (which could explain the performance problems).

This is the graph of our DB network when we try to get a list of users in Keycloak (the request was taking ~2 seconds to be served):

Then we cleared the realm cache and the network in/out fell by a factor of more than 10:

How is the cache related to the amount of data sent to the DB? Why is the request size 20 kB when the cache is cleared, but 1 MB when the cache is full?

Any help will be highly appreciated! :smiley:

P.S.: sorry for hosting the image on another site, but I can’t upload media here as a new user.