Performance of Keycloak with many realms

Hi.
This is one of my first posts, so I hope this is the right place.
I was looking into whether we could get support from Red Hat, for example, but let me try here first.

We are running Keycloak 11.0.3 on Azure AKS as a 3-node setup, backed by an Azure PostgreSQL database, version 11.
We currently have 373 realms. We host customer environments, and each environment has its own realm.
Performance is currently pretty bad. Simply loading all realms when logging in takes 20-30 seconds, creating new realms takes longer and longer, and sometimes we get timeouts.
Over a period of approximately one month, memory usage keeps growing and both GC count and GC time increase a lot (!), until we have to restart the KC nodes (pods).
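
To put a number on the GC growth, one option is to compare two samples of the JVM's cumulative GC counters. Below is a minimal sketch in Python, assuming cumulative GC count and GC time (in milliseconds) can be read from whatever monitoring is already in place. The `GcSample` type, the function names, and the sample values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class GcSample:
    """One reading of the JVM's cumulative GC counters (hypothetical container)."""
    wall_clock_s: float  # when the sample was taken, in seconds
    gc_count: int        # cumulative number of collections so far
    gc_time_ms: float    # cumulative time spent in GC so far, in milliseconds


def gc_overhead(earlier: GcSample, later: GcSample) -> float:
    """Return the fraction of wall-clock time spent in GC between two samples."""
    elapsed_ms = (later.wall_clock_s - earlier.wall_clock_s) * 1000.0
    if elapsed_ms <= 0:
        raise ValueError("samples must be in chronological order")
    return (later.gc_time_ms - earlier.gc_time_ms) / elapsed_ms


def avg_pause_ms(earlier: GcSample, later: GcSample) -> float:
    """Return the average time per collection between two samples, in milliseconds."""
    collections = later.gc_count - earlier.gc_count
    if collections <= 0:
        return 0.0
    return (later.gc_time_ms - earlier.gc_time_ms) / collections


# Illustrative values only, not real measurements.
a = GcSample(wall_clock_s=0.0, gc_count=100, gc_time_ms=2_000.0)
b = GcSample(wall_clock_s=600.0, gc_count=160, gc_time_ms=32_000.0)
print(f"GC overhead: {gc_overhead(a, b):.1%}")
print(f"Average time per collection: {avg_pause_ms(a, b):.0f} ms")
```

Tracking these two numbers week by week would show whether the JVM is spending a steadily larger share of its time in GC before each restart, which would suggest heap pressure inside Keycloak rather than a slow database.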

I am not an expert on Keycloak or PostgreSQL at all, but I am hoping that someone can help and suggest how to troubleshoot this.
Are we using KC in a bad way?
Are there known issues with KC v11?
How do I check PostgreSQL to see if it’s a database problem?
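
For the PostgreSQL question, one possible starting point is to look at which statements take the most time. This is a minimal sketch, assuming the `pg_stat_statements` extension is enabled on the server and that the `psycopg2` driver is installed. The connection string is a placeholder and the function names are hypothetical. The column names `total_time` and `mean_time` are the ones used up to PostgreSQL 12; newer versions renamed them to `total_exec_time` and `mean_exec_time`.

```python
# SQL for PostgreSQL 11; assumes the pg_stat_statements extension is enabled.
TOP_STATEMENTS_SQL = """
SELECT query, calls, total_time, mean_time
FROM pg_stat_statements
ORDER BY total_time DESC
LIMIT %s
"""


def fetch_top_statements(dsn: str, limit: int = 10):
    """Fetch the statements with the highest cumulative execution time."""
    import psycopg2  # assumption: psycopg2 is installed; imported lazily

    conn = psycopg2.connect(dsn)
    try:
        with conn.cursor() as cur:
            cur.execute(TOP_STATEMENTS_SQL, (limit,))
            return cur.fetchall()
    finally:
        conn.close()


def format_report(rows) -> str:
    """Render (query, calls, total_time_ms, mean_time_ms) rows as plain text."""
    lines = []
    for query, calls, total_ms, mean_ms in rows:
        one_line = " ".join(query.split())[:80]  # collapse whitespace, truncate
        lines.append(
            f"{total_ms:>12.1f} ms total | {mean_ms:>9.1f} ms avg | "
            f"{calls:>8} calls | {one_line}"
        )
    return "\n".join(lines)


# Usage (placeholder DSN, replace with real connection details):
#   rows = fetch_top_statements("host=... dbname=... user=... password=...")
#   print(format_report(rows))
```

If a handful of statements dominate the total time, that would point toward the database side; if nothing stands out, the slowness is more likely inside Keycloak itself.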

Thanks in advance,
Morten

Performance degradation with 100+ realms is a known issue, and there’s a task for it on the Keycloak side:

https://issues.redhat.com/browse/KEYCLOAK-4593

Hi yalpertem.
We are aware of that Jira issue. Unfortunately it looks to be on the backlog, and there have been no updates for a long time.
I also found these two issues, which seem related to caching and could be relevant as well:
https://issues.redhat.com/browse/KEYCLOAK-17774
https://issues.redhat.com/browse/KEYCLOAK-18518 “Expired cache objects in infinispan cache are never garbage collected and lead to out of memory”

The problem described in the second issue (KEYCLOAK-18518) seems to be fixed in version 15.0.0.

Personally, though, I am not skilled enough to tell whether the high GC count and GC time we see are caused by having (too) many realms, by a caching issue, or by both.
