We have run into an issue with the Keycloak server cache. During normal server operation the JVM heap grows steadily day after day. We took two heap dumps several days apart, and comparing them showed that the heap memory is consumed by objects of class org.keycloak.models.cache.infinispan.events.UserUpdatedEvent and related Infinispan classes.
As far as I can see, the lifetime of these objects is set to 120000 ms, i.e. 2 minutes. However, the heap analysis shows that the UserUpdatedEvent objects are not being removed for some reason, even after a week.
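In case it helps anyone reproduce this: live heap dumps like the ones we compared can be captured from inside the JVM via the HotSpotDiagnostic MXBean, without external tooling. This is just a sketch (the output path is arbitrary; `jmap -dump:live,...` on the Keycloak process gives an equivalent result):

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class HeapDump {
    public static void main(String[] args) throws Exception {
        // Proxy for the HotSpotDiagnostic MXBean of the current JVM.
        HotSpotDiagnosticMXBean bean = ManagementFactory.newPlatformMXBeanProxy(
                ManagementFactory.getPlatformMBeanServer(),
                "com.sun.management:type=HotSpotDiagnostic",
                HotSpotDiagnosticMXBean.class);
        // live = true dumps only reachable objects, which is what matters
        // when comparing two snapshots for a leak.
        bean.dumpHeap(args.length > 0 ? args[0] : "keycloak-heap.hprof", true);
    }
}
```

Comparing two such dumps in a tool like Eclipse MAT is how we spotted the growing UserUpdatedEvent population.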
A few words about our use case. All our users are federated from LDAP storage, and we do not import them into the local Keycloak database. Additionally, we have a legacy client integration that generates a huge number of SAML assertions for users, which may trigger extra executions of the LDAP mappers.
Could you please advise where to look? The Infinispan cache configuration is at its defaults, as shipped in the official Docker image and Helm chart.
A small investigation showed that cluster events stored under the “USER_INVALIDATION_EVENTS” key are filling up the Infinispan cache.
Currently the Cache Policy option for the LDAP user federation provider is set to MAX_LIFESPAN=60000, i.e. every minute each user entry loaded from LDAP into the cache is invalidated, and the invalidation events are sent to every node in the cluster.
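A rough back-of-the-envelope model of the resulting event volume (all numbers below are illustrative assumptions, not measurements from our cluster):

```java
public class InvalidationLoad {
    // Assumption: under MAX_LIFESPAN=60000 every cached user entry expires
    // once per minute, and each expiry produces one invalidation event that
    // is delivered to each node. If those events are never reaped, this is
    // roughly how many accumulate per node per day.
    public static long eventsPerNodePerDay(long cachedUsers) {
        long minutesPerDay = 24 * 60; // 1440 expiry cycles per day
        return cachedUsers * minutesPerDay;
    }

    public static void main(String[] args) {
        // e.g. 10,000 active federated users
        System.out.println(eventsPerNodePerDay(10_000)); // 14,400,000 per day
    }
}
```

Even for a modest user base this adds up to millions of retained objects per day if the 2-minute expiry is not enforced, which matches the heap growth we observed.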
The question is: why doesn’t the reaper remove the UserUpdatedEvent objects automatically? Or is it not supposed to, and I’m missing something?
Is there a way to inspect the metrics of this cache (via JMX or similar)?
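To sketch what I mean: assuming the Keycloak JVM is started with JMX enabled and Infinispan statistics turned on, something like this small client could list cache sizes. The object-name pattern `org.infinispan:type=Cache,*` and the `numberOfEntries` attribute are my guesses and may differ between Infinispan versions:

```java
import java.lang.management.ManagementFactory;
import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;

public class CacheMetrics {
    // Prints the given attribute for every MBean matching the pattern,
    // silently skipping MBeans that do not expose that attribute.
    static void dump(MBeanServerConnection conn, String pattern, String attr) throws Exception {
        Set<ObjectName> names = conn.queryNames(new ObjectName(pattern), null);
        for (ObjectName name : names) {
            try {
                System.out.println(name + " " + attr + "=" + conn.getAttribute(name, attr));
            } catch (Exception e) {
                // Attribute not present on this MBean; ignore.
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // For a remote Keycloak JVM, obtain the connection via
        // JMXConnectorFactory instead; the local platform MBean server
        // is used here only as a stand-in.
        MBeanServerConnection conn = ManagementFactory.getPlatformMBeanServer();
        dump(conn, "org.infinispan:type=Cache,*", "numberOfEntries");
    }
}
```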
Thanks for the advice! We will definitely try the new version, but that will require additional integration testing on our side. Did I understand correctly that the fixed versions are 13.0.0 and newer?