I’ve stumbled upon a very weird performance issue with Keycloak on the latest version, 22.
We have multiple deployments of Keycloak in production and we decided to upgrade to the latest version. Our current version is Keycloak 19 on Quarkus distribution, and it works great!
However, as soon as I upgraded the simplest deployment to the latest version, 22.0.4, the performance dropped significantly. I can provide many more details but the gist of it is this:
- Run a couple of EC2 instances with Keycloak 19 Quarkus, running in edge mode behind an ALB. We only have between 10-20 clients and they are all used for service logins, with a very simple client_credentials grant type. These are confidential clients with a simple secret (no JWT or certificates, just plain client_id and client_secret). Our typical load varies during the day between 30 and 90 requests per minute; we have around 30 client_credentials logins per minute (not a very big load).
However, with Keycloak 19 our p99 for request times varies between 20-30 milliseconds (a very decent result), compared to Keycloak 22 with p99 for request times varying between 100-200 milliseconds. This is a huge performance penalty, almost 10-fold worse. And the most interesting fact is that the mentioned load does not matter: the p99 was constant for Keycloak 19 and it is constant for Keycloak 22. So our load is not enough to worsen the response times.
However, the baseline is severely shifted for the worse. I am baffled and wondering if anyone has noticed such a thing? Could it be the upgrade to Quarkus 3? Any help is greatly appreciated! We are afraid to upgrade the rest of the deployments as those are much more heavily used with a huge variety of loads, so if anything gets 10 times slower it will be noticeable.