Hi,
I am running Keycloak 8.0.1 in HA (2 replicas) on AWS ECS Fargate.
CPU: 1024 units (1 vCPU)
Memory: 2048 MiB
There are about 20 realms, each with at least 100 users and 100 groups (coming from an external LDAP server, OpenLDAP).
One of those realms is huge and has ~2000 users and ~7000 groups.
Everything works fine, but we notice a linear increase in CPU and memory usage over time until they reach the limits: CPU ~98% and memory ~71% on both running instances.
I am really worried about this behavior.
Attached is the CPU graph (not the memory one, because I can only attach one image).
Do you have any advice?
Note:
Since it is running on Fargate, I have no way to take a thread dump (TD) or even a heap dump (HD).
However, I could run it on ECS (EC2 instance) if that would help to diagnose the problem.
Hi lmolinaro,
My team is running Keycloak 7.0.0 in standalone HA mode (2 nodes) on EC2 instances, and we had a similar experience. We managed to slow the growth so that it takes days rather than hours, as in your case.
Can you check/answer the following?
How often do your User Federations (LDAP connections) sync?
Do you use full and changed users syncs?
Do you have any LDAP errors in your logs?
Are you sure that all users and groups from LDAP are synced?
Are your Keycloak nodes in sync?
Do you have a single LDAP instance or a cluster?
What we discovered:
LDAP cluster: If the LDAP nodes are out of sync, the Keycloak members won’t stop updating/syncing the users and will fill the heap.
Low sync periods: The longer the sync period, the lower the usage. I’d guess from our CPU utilization that your User Federations sync every hour.
Don’t use the “Periodic Changed Users” setting if you use LDAP as the source of truth (no other tools like AD, ADFS, … are enabled). There is no need for syncing if you don’t do management in your LDAP system.
Wrong User Federation configuration: If you have any errors in the LDAP configuration, it may leak open TCP connections, slowly filling up your memory.
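With around 20 realms, checking every sync period by hand in the admin console gets tedious. Below is a rough sketch, not a tested tool, of how the periods could be listed through the Admin REST API. It assumes the `components` endpoint filtered by the user storage provider type, and the LDAP provider config keys `fullSyncPeriod` and `changedSyncPeriod` (in seconds, `-1` meaning disabled). The base URL, the token handling, and the one-day threshold are placeholders to adapt.

```python
import json
import urllib.request

# Component type under which Keycloak stores User Federation providers.
USER_STORAGE_TYPE = "org.keycloak.storage.UserStorageProvider"


def fetch_components(base_url, realm, token, component_type=USER_STORAGE_TYPE):
    """Fetch the components of one type for a realm.

    base_url is assumed to include the context path, e.g. the hypothetical
    https://keycloak.example.com/auth; token is an admin access token.
    """
    url = f"{base_url}/admin/realms/{realm}/components?type={component_type}"
    req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def sync_periods(component):
    """Return (full, changed) sync periods in seconds; -1 means disabled."""
    config = component.get("config", {})

    def first_int(key):
        # Component config values are lists of strings; a missing key is
        # treated here as "disabled".
        values = config.get(key) or ["-1"]
        return int(values[0])

    return first_int("fullSyncPeriod"), first_int("changedSyncPeriod")


def frequent_syncs(components, min_period_s=86400):
    """Names of providers with an enabled sync period shorter than min_period_s."""
    flagged = []
    for comp in components:
        if any(0 < period < min_period_s for period in sync_periods(comp)):
            flagged.append(comp.get("name", "<unnamed>"))
    return flagged
```

Calling `frequent_syncs(fetch_components(base_url, realm, token))` once per realm should give a quick list of the federations that sync more often than once a day.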
We run LDAP in master/slave mode. The first URL is the master. According to the Keycloak user sync log, users are not updated when there is no need.
We use LDAP as the source of truth for authentication, but we also use LDAP to get user groups; in this case the source of truth is Keycloak. Since user groups change in LDAP, we need to sync them often.
I do not notice any error in the Keycloak logs.
I made some configuration changes in the LDAP federation:
Set a higher sync period.
Removed the “Preserve Group Inheritance” flag from the groups-mapper (this reduces the sync time by ~30%).
Removed a second groups-to-role-mapper (no longer needed by our application).
After those changes I restarted both Keycloak instances, and the following is the CPUUtilization graph:
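For anyone wanting to find the same mapper settings across many realms, here is a minimal sketch. It assumes the mapper list has already been fetched as JSON from the Admin REST API `components` endpoint (type `org.keycloak.storage.ldap.mappers.LDAPStorageMapper`), that group mappers carry the provider id `group-ldap-mapper`, and that the flag is stored under the config key `preserve.group.inheritance`. Please verify those names against your Keycloak version before relying on them.

```python
from collections import defaultdict


def mappers_by_provider(mappers):
    """Group mapper names by provider id, to spot duplicate or unused mappers."""
    grouped = defaultdict(list)
    for mapper in mappers:
        grouped[mapper.get("providerId", "<unknown>")].append(mapper.get("name", "<unnamed>"))
    return dict(grouped)


def group_mappers_preserving_inheritance(mappers):
    """Names of group mappers that still have the inheritance flag enabled."""
    names = []
    for mapper in mappers:
        if mapper.get("providerId") != "group-ldap-mapper":
            continue
        # Config values are lists of strings; a missing key is treated here
        # as "false", which is an assumption of this sketch.
        values = mapper.get("config", {}).get("preserve.group.inheritance") or ["false"]
        if values[0].lower() == "true":
            names.append(mapper.get("name", "<unnamed>"))
    return names
```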
Hi
I am really convinced that there is some leak somewhere.
I was running Keycloak 4.8.3 in production. CPU utilization was always under 1% and RAM under 30% (of 2 GB).
Since we do NOT have a big realm in production, I upgraded Keycloak to version 8.0.1 without changing any configuration.
Now CPU utilization is growing day by day, as is RAM (now at 35%).
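To put a number on "growing day by day", a simple least-squares line through the utilization samples gives a rough estimate of when a limit would be reached. This is only a sketch: it assumes the growth really is linear, and the samples have to come from your own metrics (for example, exported CloudWatch datapoints) as `(hours, percent)` pairs.

```python
def linear_fit(samples):
    """Least-squares line through (hours, percent) points; returns (slope, intercept)."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def hours_until(samples, limit):
    """Hours after the last sample until the fitted line reaches `limit`.

    Returns None when the trend is flat or decreasing.
    """
    slope, intercept = linear_fit(samples)
    if slope <= 0:
        return None
    last_x = samples[-1][0]
    return max(0.0, (limit - intercept) / slope - last_x)
```

Passing the memory samples with `limit=100` (or whatever threshold triggers a restart) gives an idea of how much time is left before an instance needs attention.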