Hi,
We recently configured our Keycloak cluster to use a dedicated Infinispan cluster as a remote store for Keycloak's various caches. Since then, we have been seeing server errors that appear to coincide with Keycloak clearing expired cache entries. The errors occur exactly every 15 minutes. Each time they occur, we see:
- an increase in the number of JVM threads in Keycloak
- an increase in CPU usage in Infinispan
- an increase in network traffic between Keycloak and Infinispan
- a decrease in the number of cache entries in Infinispan
Does this ring a bell for anyone? Could this have something to do with how Keycloak does garbage collection? Is there any way to avoid this periodic cleanup operation every 15 minutes?
Here are some example errors we found in the Keycloak and Infinispan logs:
Keycloak Logs:
- ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (timeout-thread--p15-t1) ISPN000136: Error executing command RemoveCommand on Cache 'sessions', writing keys [0e62ff2a-0528-4bc3-80c5-1749007080ca]: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 1876114186 from keycloak-prod-269626-i-0e2bb963b00d91414
- ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (timeout-thread--p15-t1) ISPN000136: Error executing command RemoveCommand on Cache 'authenticationSessions', writing keys [c154475c-b137-416f-9dea-cddebd6db35a]: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 1872915524 from keycloak-prod-269626-i-07543a23b2f9de6d0
Infinispan Logs:
- ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (timeout-thread--p3-t1) ISPN000136: Error executing command RemoveCommand on Cache 'clientSessions', writing keys [WrappedByteArray{bytes=[B0x0304090000000E6A…[85], hashCode=-419468116}]: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 11713381 from ip-10-177-5-57
- ERROR [org.infinispan.interceptors.impl.InvocationContextInterceptor] (timeout-thread--p3-t1) ISPN000136: Error executing command GetCacheEntryCommand on Cache 'sessions', writing keys []: org.infinispan.util.concurrent.TimeoutException: ISPN000476: Timed out waiting for responses for request 9399385 from ip-10-177-18-145
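In case it helps anyone compare against their own logs, here is a minimal sketch of how the ISPN000136 errors above could be tallied by command and cache. The function name `tally_timeouts` and the `SAMPLE` list are made up for illustration only; the sample lines are abbreviated from the log excerpts above, and the pattern accepts either straight or typographic quotes around the cache name.

```python
import re
from collections import Counter

# Captures the command name and the cache name from an ISPN000136 line.
# Straight and typographic quotes are both accepted around the cache name.
PATTERN = re.compile(
    r"ISPN000136: Error executing command (\w+) on Cache ['‘’](\w+)['‘’]"
)


def tally_timeouts(lines):
    """Count ISPN000136 errors per (command, cache) pair (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = PATTERN.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Abbreviated from the four log lines quoted above (illustration only).
SAMPLE = [
    "ISPN000136: Error executing command RemoveCommand on Cache 'sessions'",
    "ISPN000136: Error executing command RemoveCommand on Cache 'authenticationSessions'",
    "ISPN000136: Error executing command RemoveCommand on Cache 'clientSessions'",
    "ISPN000136: Error executing command GetCacheEntryCommand on Cache 'sessions'",
]

if __name__ == "__main__":
    for (command, cache), n in sorted(tally_timeouts(SAMPLE).items()):
        print(f"{command} on {cache}: {n}")
```

Run over a full log file instead of `SAMPLE`, this would show whether the timeouts are concentrated in particular caches or commands.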
Here is more information about our setup:
- Number of instances in the Keycloak cluster: 3 (on AWS EC2)
- Number of instances in the Infinispan cluster: 4 (on AWS Fargate)
- Both Keycloak and Infinispan use distributed caches with the number of cache owners set to 2
Any help would be deeply appreciated. Thanks in advance.
Jia