Hello Keycloak Community,
I am running Keycloak in a Docker Swarm environment, and the Keycloak instances crash at random. The logs point to Infinispan Hot Rod operations failing with SocketTimeoutException, along with blocked Vert.x event-loop threads.
Environment Details:
- Keycloak Version: [Provide your Keycloak version]
- Deployment: Docker Swarm
- Database: [Specify your database, e.g., PostgreSQL, MySQL]
- Operating System: [Specify the OS, e.g., Ubuntu 20.04]
Issue Description:
Keycloak crashes randomly, and the following warnings and stack traces appear in the logs:
2024-06-19 12:00:51,712 WARN [org.infinispan.HOTROD] (Thread-0) ISPN004098: Closing connection [id: 0xc10ae209, L:/10.0.3.139:39214 - R:10.0.3.110/10.0.3.110:11222] due to transport error: java.net.SocketTimeoutException: ReplaceIfUnmodifiedOperation{offlineSessions, key=[B0x033E2466653061613936352D65376465..[39], value=[B0x03040B000000446F72672E6B6579636C..[1141], flags=0, connection=10.0.3.110/10.0.3.110:11222} timed out after 60000 ms
at org.infinispan.client.hotrod.impl.operations.HotRodOperation.run(HotRodOperation.java:182)
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98)
at io.netty.util.concurrent.ScheduledFutureTask.run(ScheduledFutureTask.java:170)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:164)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:469)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:384)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
2024-06-19 12:00:55,227 WARN [io.vertx.core.impl.BlockedThreadChecker] (vertx-blocked-thread-checker) Thread Thread[vert.x-eventloop-thread-5,5,main] has been blocked for 3548 ms, time limit is 2000 ms: io.vertx.core.VertxException: Thread blocked
at io.vertx.core.net.impl.ConnectionBase.lambda$handleException$4(ConnectionBase.java:357)
at io.vertx.core.net.impl.ConnectionBase$$Lambda$1733/0x0000000841090840.handle(Unknown Source)
at io.vertx.core.impl.EventLoopContext.emit(EventLoopContext.java:50)
at io.vertx.core.impl.ContextImpl.emit(ContextImpl.java:274)
at io.vertx.core.impl.EventLoopContext.emit(EventLoopContext.java:22)
at io.vertx.core.net.impl.ConnectionBase.handleException(ConnectionBase.java:354)
at io.vertx.core.http.impl.Http1xServerConnection.handleException(Http1xServerConnection.java:466)
at io.vertx.core.net.impl.VertxHandler.exceptionCaught(VertxHandler.java:136)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
at io.netty.channel.AbstractChannelHandlerContext.fireExceptionCaught(AbstractChannelHandlerContext.java:273)
at io.netty.channel.DefaultChannelPipeline$HeadContext.exceptionCaught(DefaultChannelPipeline.java:1377)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:302)
at io.netty.channel.AbstractChannelHandlerContext.invokeExceptionCaught(AbstractChannelHandlerContext.java:281)
at io.netty.channel.DefaultChannelPipeline.fireExceptionCaught(DefaultChannelPipeline.java:907)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.handleReadException(AbstractNioByteChannel.java:125)
at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:177)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:722)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:658)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:584)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:496)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:986)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
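To see how often these two warnings occur in a longer log, they can be pulled out with a small script. This is only a rough sketch: the regular expressions are written against the two WARN lines above and may need adjusting for other message variants, and the names `summarize_warnings`, `TIMEOUT_RE` and `BLOCKED_RE` are made up for illustration.

```python
import re

# Patterns modelled on the two WARN lines quoted above (assumption: other
# Keycloak/Infinispan versions may word these messages differently).
TIMEOUT_RE = re.compile(
    r"ISPN004098: .*?(\w+Operation)\{(\w+),.*timed out after (\d+) ms"
)
BLOCKED_RE = re.compile(
    r"Thread\[([^,\]]+),[^\]]*\] has been blocked for (\d+) ms, "
    r"time limit is (\d+) ms"
)


def summarize_warnings(lines):
    """Extract Hot Rod timeouts and blocked-thread warnings from log lines."""
    events = []
    for line in lines:
        m = TIMEOUT_RE.search(line)
        if m:
            # (kind, operation, cache name, timeout in ms)
            events.append(("hotrod_timeout", m.group(1), m.group(2), int(m.group(3))))
            continue
        m = BLOCKED_RE.search(line)
        if m:
            # (kind, thread name, blocked ms, limit ms)
            events.append(("blocked_thread", m.group(1), int(m.group(2)), int(m.group(3))))
    return events
```

Running it over a log file, for example `summarize_warnings(open("keycloak.log"))` with a hypothetical file name, gives a list that can be grouped by cache name or thread to check whether the timeouts cluster around `offlineSessions`.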
Despite these changes, the issue persists.
Additional Information:
- The issue appears to occur randomly under load.
- Network connectivity between Keycloak and Infinispan nodes has been verified as stable.
- Resource allocation for both Keycloak and Infinispan nodes appears sufficient based on current monitoring tools.
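For anyone who wants to repeat a basic connectivity check, here is a minimal sketch of a TCP connect-latency probe. It is an illustration only and not necessarily how the verification above was done. The function name `probe_connect_latency` is hypothetical, and the address and port in the usage note are the ones that appear in the HOTROD warning. It measures only TCP connection setup, not Hot Rod operation latency.

```python
import socket
import time


def probe_connect_latency(host, port, attempts=5, timeout=2.0):
    """Return TCP connect times in milliseconds; None marks a failed attempt."""
    samples = []
    for _ in range(attempts):
        start = time.monotonic()
        try:
            # Open and immediately close a TCP connection to the endpoint.
            with socket.create_connection((host, port), timeout=timeout):
                samples.append((time.monotonic() - start) * 1000.0)
        except OSError:
            # Covers refused connections, unreachable hosts and timeouts.
            samples.append(None)
    return samples
```

Called from inside a Keycloak container as `probe_connect_latency("10.0.3.110", 11222)`, it returns one value per attempt. Occasional `None` entries or large spikes under load would suggest the overlay network is worth a closer look.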
Request for Assistance:
I would appreciate any guidance on resolving these issues. Specifically, I’m looking for recommendations on further configuration adjustments or insights into potential underlying causes.
Thank you in advance for your assistance!