Hi,
My company is currently on Keycloak v16.1.1, and has been hit (we think) by KEYCLOAK-13340: Performance Issues with many offline sessions. In extreme cases this causes servers to fail liveness checks and be restarted, and it seems to be the root of user logouts. This behavior seems to have been introduced with lazy offline session loading in v15.
A fix for this has been merged into main at b104dc7, but has not been merged into a release branch. As such, I assume that it’s considered unstable for now; nevertheless, unless there is no alternative, it’s not feasible for us to migrate to v20 at this time.
I have a few followup questions:
- We have a relatively small number of accounts with large numbers of offline sessions. Why this is the case is still a matter of internal investigation, but given that, is there a recommended database-level method for manually cleaning up sessions? The theory here is that even if there are adverse effects to a small group of users, they would be able to re-login as necessary.
- Mostly a mirror of the GH issue discussion comment: While my search has been non-exhaustive and mostly centered upon changes to PersistentClassSessionEntity.java, and specifically the query findClientSessionsOrderedById, I haven’t found changes associated with this issue between 16.1.1 and the proposed fix mentioned above. I’ve created a patch for 16.1.1 based upon the original fix, but have yet to attempt to build it. Regardless, how horrible of an idea is this?
- Are there any other short-term remediations that come to mind?
Thanks!
Cora