Dear all,
In order to figure out whether Keycloak can handle the load we foresee at some point, I’ve been doing a stress test with a single local instance (v.10.0.1). For the database I use PostgreSQL 12; the JRE is 11.0.6, on Mac OS 10.13.6.
Our first tests with Keycloak were with Thomas Darimont’s Spring Boot packaged Keycloak v.4.8.3 (because we want to deploy KC on IBM Cloud), and there we found that KC had a bit of trouble handling ~2500 clients: when, after startup, you first try to list the Clients in the GUI, it freezes for several minutes. After that, it worked fine.
I found that it’s not that hard to deploy the Keycloak Docker image directly on Cloud Foundry, which also made it a lot easier to upgrade to a more recent version of Keycloak. I found that if I migrated a v.4.8.3 version to v.10.0.1, it would not start up any longer, so I started with a much smaller set of clients, and it worked OK.
With v.10 up and running, I then added 10K users and 10K clients with JMeter, to see how it would hold up. Adding all that to Keycloak went pretty smoothly, on the order of about 2000 / minute.
And indeed, the “list clients” freezing seemed to be fully resolved in v.10; and listing that many users was also immediate.
But … then I tried adding one more client and one user, manually via the Keycloak UI. The client was no problem at all; but adding the user does not work (any more).
The configuration is like this:
- standalone.xml
- defined the datasource according to the setup guide:
standalone.xml: datasource config
<datasource jndi-name="java:jboss/datasources/KeycloakDS"
pool-name="KeycloakDS" enabled="true" use-java-context="true"
statistics-enabled="${wildfly.datasources.statistics-enabled:${wildfly.statistics-enabled:false}}">
<connection-url>jdbc:postgresql://localhost:5432/<DB>?currentSchema=keycloakprod-v10</connection-url>
- added Postgres driver to modules, added the config to standalone.xml
- changed the ExampleDS in default-bindings to KeycloakDS, and removed the ExampleDS references
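For readers unfamiliar with the last two steps, they look roughly like the sketch below. It follows the pattern from the Keycloak relational database setup guide; the driver name "postgresql" and the module name "org.postgresql" are assumptions for illustration, not copied from the actual configuration.

```xml
<!-- standalone.xml, datasources subsystem: register the JDBC driver.
     The names "postgresql" and "org.postgresql" are assumed here. -->
<drivers>
    <driver name="postgresql" module="org.postgresql">
        <xa-datasource-class>org.postgresql.xa.PGXADataSource</xa-datasource-class>
    </driver>
</drivers>

<!-- standalone.xml, ee subsystem: the default datasource binding
     now points at KeycloakDS instead of ExampleDS. -->
<default-bindings context-service="java:jboss/ee/concurrency/context/default"
                  datasource="java:jboss/datasources/KeycloakDS"
                  managed-executor-service="java:jboss/ee/concurrency/executor/default"
                  managed-scheduled-executor-service="java:jboss/ee/concurrency/scheduler/default"
                  managed-thread-factory="java:jboss/ee/concurrency/factory/default"/>
```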
We also added our own BCrypt hashing implementation as a .jar deployment, plus the libraries it needed (commons-codec, spring-security-crypto) as two other modules (which worked fine).
When I add another user in the Keycloak UI, the interface freezes. For about 5 minutes nothing happens at all, and then the operation fails (stack trace abbreviated for readability):
warn & error stacktrace
WARN [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffff7f000001:-67f70a68:5ee0fb65:c0 in state RUN
(...)
ERROR [org.keycloak.services.error.KeycloakErrorHandler] (default task-1) Uncaught server error: javax.persistence.PersistenceException: org.hibernate.HibernateException: Transaction was rolled back in a different thread!
at org.hibernate@5.3.15.Final//org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:154)
at org.hibernate@5.3.15.Final//org.hibernate.query.internal.AbstractProducedQuery.list(AbstractProducedQuery.java:1515)
at org.hibernate@5.3.15.Final//org.hibernate.query.Query.getResultList(Query.java:132)
at org.keycloak.keycloak-model-jpa@10.0.1//org.keycloak.models.jpa.ClientAdapter.getClientScopes(ClientAdapter.java:381)
(...)
Caused by: org.hibernate.HibernateException: Transaction was rolled back in a different thread!
at org.hibernate@5.3.15.Final//org.hibernate.resource.transaction.backend.jta.internal.synchronization.SynchronizationCallbackCoordinatorTrackingImpl.processAnyDelayedAfterCompletion(SynchronizationCallbackCoordinatorTrackingImpl.java:90)
at org.hibernate@5.3.15.Final//org.hibernate.internal.SessionImpl.delayedAfterCompletion(SessionImpl.java:658)
(...)
[com.arjuna.ats.arjuna] (default task-1) ARJUNA012077: Abort called on already aborted atomic action 0:ffff7f000001:-67f70a68:5ee0fb65:c0
… no user was created (I also checked in the user_entity table directly).
I searched for a solution and found some hints suggesting that this might be a configuration issue. One was a recommendation to set the transaction timeout (default 300 sec.) to a longer value, though I thought that unlikely to help in this case: creating those 10000 initial users in JMeter took maybe 20 milliseconds per user, so you’d think that 300 seconds should be more than enough to create one more.
Still, I tried adding it to this section of standalone.xml, like this:
standalone.xml: subsystem xmlns="urn:jboss:domain:transactions:5.0"
<subsystem xmlns="urn:jboss:domain:transactions:5.0">
<core-environment node-identifier="${jboss.tx.node.id:1}">
<process-id>
<uuid/>
</process-id>
</core-environment>
<recovery-environment socket-binding="txn-recovery-environment" status-socket-binding="txn-status-manager"/>
<coordinator-environment default-timeout="600" <--- there
though the only result was that I had to wait twice as long for adding a user to fail with the above error.
Another suggestion was to add an idle timeout to the datasource, like this:
<timeout>
    <idle-timeout-minutes>1</idle-timeout-minutes>
</timeout>
and yet another one was:
<validation>
    <check-valid-connection-sql>select 1</check-valid-connection-sql>
    <background-validation>true</background-validation>
    <background-validation-millis>15000</background-validation-millis>
</validation>
… neither of which made a difference.
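As a sketch of where these two fragments sit inside the datasource element: the driver name and the DB_NAME, DB_USER and DB_PASSWORD values below are placeholders assumed for illustration, not the real settings, and the attributes are reduced to the essentials.

```xml
<datasource jndi-name="java:jboss/datasources/KeycloakDS"
            pool-name="KeycloakDS" enabled="true" use-java-context="true">
    <!-- DB_NAME is a placeholder for the actual database name -->
    <connection-url>jdbc:postgresql://localhost:5432/DB_NAME?currentSchema=keycloakprod-v10</connection-url>
    <!-- assumed driver name; must match the driver registered under <drivers> -->
    <driver>postgresql</driver>
    <security>
        <user-name>DB_USER</user-name>
        <password>DB_PASSWORD</password>
    </security>
    <!-- validate idle connections in the background every 15 seconds -->
    <validation>
        <check-valid-connection-sql>select 1</check-valid-connection-sql>
        <background-validation>true</background-validation>
        <background-validation-millis>15000</background-validation-millis>
    </validation>
    <!-- close connections that have been idle for more than one minute -->
    <timeout>
        <idle-timeout-minutes>1</idle-timeout-minutes>
    </timeout>
</datasource>
```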
Then I found one other suggestion, to add jta="false" to the <datasource> element, like this:
<datasource jta="false" jndi-name="java:jboss/datasources/KeycloakDS" ...
which had a most curious effect: this time, a formidable spike could be seen in the connections monitor of the database, which tapered off asymptotically (like the function of 1/x does). Looking in the database I could see that now, indeed the user was created: though the email address I had filled in appeared as the username, and both the first_name and last_name were left empty.
That was pretty much the point where I decided that I had better ask here on the forum.
So, if anyone could point me towards something I haven’t yet tried, that would be great!
Thanks,
Lúthien