DB Connections to MySQL are not being released

Hi there

I have Keycloak and mysql running in docker containers on my local. I’m attempting to use the Benchmarking tools (GitHub - keycloak/keycloak-benchmark) so we can fine tune settings. However, when I run the command to create a number of realms (say 200), I end up maxing out the connections to the database. It seems that the threads are not letting go of their connections (or re-using the ones it already has) and seem to be requesting new ones from the pool.

Below are my two docker container setups with Docker Compose

version: '3'

services:

  mysql_keycloak:
    container_name: mysql_keycloak
    image: percona:5.7-centos
    #  see https://hub.docker.com/_/perconafor settings
    environment:
      - MYSQL_ALLOW_EMPTY_PASSWORD=1
      - MYSQL_HOST=localhost
      - DEV_MYSQL_USER=dev
      - DEV_MYSQL_PASSWORD=@ch13v3rs
    volumes:
      - data:/var/lib/mysql
      - ${PWD}/mysql/configuration/threading.cnf:/etc/percona-server.conf.d/threading.cnf
      - ${PWD}/scripts/initialize-db.sh:/docker-entrypoint-initdb.d/initialize-db.sh
      - ${PWD}/scripts/initial-database.sql:/scripts/initial-database.sql
    ports:
      - "33066:3306"
    networks: 
      - test_network

  keycloak:
    container_name: keycloak
    image: jboss/keycloak:latest
    command: -b 0.0.0.0 -Dkeycloak.profile.feature.upload_scripts=enabled -Dkeycloak.profile.feature.admin_fine_grained_authz=enabled -Dkeycloak.profile.feature.token_exchange=enabled -Djboss.as.management.blocking.timeout=3600 -D 
    environment:
      - DB_VENDOR=mysql
      - DB_ADDR=mysql_keycloak
      - DB_PORT=3306
      - DB_USER=dev
      - DB_PASSWORD=supersecretpassword
      - KEYCLOAK_USER=admin
      - KEYCLOAK_PASSWORD=admin
      - KEYCLOAK_FRONTEND_URL=https://localhost:8443/auth/
      - keycloak.profile.feature.upload_scripts=enabled
    depends_on: 
      - mysql_keycloak
    volumes:
      - ${PWD}/deployments:/opt/jboss/keycloak/standalone/deployments
      - ${PWD}/keycloak/configuration/standalone-ha.xml:/opt/jboss/keycloak/standalone/configuration/standalone-ha.xml
    ports:
      - "8443:8443"
    networks: 
      - test_network

volumes:
  data:
    driver: local
  dbdata:
    driver: local

# you will need to update the network
networks:
  test_network:
    driver: bridge
    external: true

This is the datasource entry for the standalone-ha.xml

        <datasource jndi-name="java:jboss/datasources/KeycloakDS" pool-name="KeycloakDS" enabled="true" use-java-context="true" use-ccm="false">
            <connection-url>jdbc:mysql://${env.DB_ADDR:mysql}:${env.DB_PORT:3306}/${env.DB_DATABASE:keycloak}${env.JDBC_PARAMS:}</connection-url>
            <driver>mysql</driver>
            <pool>
                <flush-strategy>IdleConnections</flush-strategy>
                <min-pool-size>1</min-pool-size>
                <max-pool-size>1000</max-pool-size>
                <initial-pool-size>20</initial-pool-size>
                <use-strict-min>true</use-strict-min>
            </pool>
            <timeout>
                <idle-timeout-minutes>1</idle-timeout-minutes>
            </timeout>
            <security>
                <user-name>${env.DB_USER:keycloak}</user-name>
                <password>${env.DB_PASSWORD:password}</password>
            </security>
            <validation>
                <check-valid-connection-sql>SELECT 1</check-valid-connection-sql>
                <background-validation>true</background-validation>
                <background-validation-millis>6000</background-validation-millis>
            </validation>
        </datasource>

On the mysql side of things, this is the settings I have for the connection pools in the threading.cnf file

[mysqld]
# Enable Thread Pooling
thread_handling=pool-of-threads
thread_pool_idle_timeout=30
thread_pool_size=20
max_connections=150

If we execute the REST end point for the dataset to create a number of realms and set that too 100 (https://localhost:8443/auth/realms/master/dataset/create-realms?count=100&clients-per-realm=10), it will create all 100 realms. Two minutes After the realms have all been created, I check on mysql to see how many connections are running show status where variable_name = 'Threads_connected'; and it shows 100.

If I execute the REST call again, to create another 100 realms, it will get about 50 deep and then it will blow up. On the MySQL side of things, I get too many connection errors:

mysql_keycloak    | 2021-03-09T20:54:49.747437Z 0 [Warning] Too many connections
mysql_keycloak    | 2021-03-09T20:54:49.752921Z 0 [Warning] Too many connections
mysql_keycloak    | 2021-03-09T20:54:49.755953Z 0 [Warning] Too many connections

On the keycloak side, I get the following error message:

keycloak          | 20:54:49,748 WARN  [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (pool-20-thread-5) IJ000604: Throwable while attempting to get a new connection: null: javax.resource.ResourceException: IJ031084: Unable to create connection
keycloak          | 	at org.jboss.ironjacamar.jdbcadapters@1.4.23.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createLocalManagedConnection(LocalManagedConnectionFactory.java:345)
keycloak          | 	at org.jboss.ironjacamar.jdbcadapters@1.4.23.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.getLocalManagedConnection(LocalManagedConnectionFactory.java:352)
keycloak          | 	at org.jboss.ironjacamar.jdbcadapters@1.4.23.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createManagedConnection(LocalManagedConnectionFactory.java:287)
keycloak          | 	at org.jboss.ironjacamar.impl@1.4.23.Final//org.jboss.jca.core.connectionmanager.pool.mcp.SemaphoreConcurrentLinkedDequeManagedConnectionPool.createConnectionEventListener(SemaphoreConcurrentLinkedDequeManagedConnectionPool.java:1322)
keycloak          | 	at org.jboss.ironjacamar.impl@1.4.23.Final//org.jboss.jca.core.connectionmanager.pool.mcp.SemaphoreConcurrentLinkedDequeManagedConnectionPool.getConnection(SemaphoreConcurrentLinkedDequeManagedConnectionPool.java:499)
keycloak          | 	at org.jboss.ironjacamar.impl@1.4.23.Final//org.jboss.jca.core.connectionmanager.pool.AbstractPool.getTransactionNewConnection(AbstractPool.java:714)
keycloak          | 	at org.jboss.ironjacamar.impl@1.4.23.Final//org.jboss.jca.core.connectionmanager.pool.AbstractPool.getConnection(AbstractPool.java:613)
keycloak          | 	at org.jboss.ironjacamar.impl@1.4.23.Final//org.jboss.jca.core.connectionmanager.AbstractConnectionManager.getManagedConnection(AbstractConnectionManager.java:624)
keycloak          | 	at org.jboss.ironjacamar.impl@1.4.23.Final//org.jboss.jca.core.connectionmanager.tx.TxConnectionManagerImpl.getManagedConnection(TxConnectionManagerImpl.java:440)
keycloak          | 	at org.jboss.ironjacamar.impl@1.4.23.Final//org.jboss.jca.core.connectionmanager.AbstractConnectionManager.allocateConnection(AbstractConnectionManager.java:789)
keycloak          | 	at org.jboss.ironjacamar.jdbcadapters@1.4.23.Final//org.jboss.jca.adapters.jdbc.WrapperDataSource.getConnection(WrapperDataSource.java:151)
keycloak          | 	at org.jboss.as.connector@21.0.2.Final//org.jboss.as.connector.subsystems.datasources.WildFlyDataSource.getConnection(WildFlyDataSource.java:64)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.engine.jdbc.connections.internal.DatasourceConnectionProviderImpl.getConnection(DatasourceConnectionProviderImpl.java:122)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.internal.NonContextualJdbcConnectionAccess.obtainConnection(NonContextualJdbcConnectionAccess.java:35)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.acquireConnectionIfNeeded(LogicalConnectionManagedImpl.java:106)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.resource.jdbc.internal.LogicalConnectionManagedImpl.getPhysicalConnection(LogicalConnectionManagedImpl.java:136)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.engine.jdbc.internal.StatementPreparerImpl.connection(StatementPreparerImpl.java:50)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.engine.jdbc.internal.StatementPreparerImpl$1.doPrepare(StatementPreparerImpl.java:90)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.engine.jdbc.internal.StatementPreparerImpl$StatementPreparationTemplate.prepareStatement(StatementPreparerImpl.java:176)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.engine.jdbc.internal.StatementPreparerImpl.prepareStatement(StatementPreparerImpl.java:81)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:3173)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:3706)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.action.internal.EntityInsertAction.execute(EntityInsertAction.java:90)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:604)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.engine.spi.ActionQueue.executeActions(ActionQueue.java:478)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.event.internal.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:356)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.event.internal.DefaultFlushEventListener.onFlush(DefaultFlushEventListener.java:39)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.internal.SessionImpl.doFlush(SessionImpl.java:1472)
keycloak          | 	at org.hibernate@5.3.20.Final//org.hibernate.internal.SessionImpl.flush(SessionImpl.java:1458)
keycloak          | 	at jdk.internal.reflect.GeneratedMethodAccessor364.invoke(Unknown Source)
keycloak          | 	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
keycloak          | 	at java.base/java.lang.reflect.Method.invoke(Method.java:566)
keycloak          | 	at org.keycloak.keycloak-model-jpa@12.0.4//org.keycloak.connections.jpa.PersistenceExceptionConverter.invoke(PersistenceExceptionConverter.java:49)
keycloak          | 	at javax.persistence.api@2.2.3//com.sun.proxy.$Proxy91.flush(Unknown Source)
keycloak          | 	at org.keycloak.keycloak-model-jpa@12.0.4//org.keycloak.models.jpa.JpaRealmProvider.createRealm(JpaRealmProvider.java:93)
keycloak          | 	at org.keycloak.keycloak-model-infinispan@12.0.4//org.keycloak.models.cache.infinispan.RealmCacheSession.createRealm(RealmCacheSession.java:395)
keycloak          | 	at org.keycloak.keycloak-services@12.0.4//org.keycloak.services.managers.RealmManager.importRealm(RealmManager.java:516)
keycloak          | 	at org.keycloak.keycloak-services@12.0.4//org.keycloak.services.managers.RealmManager.importRealm(RealmManager.java:501)
keycloak          | 	at deployment.keycloak-benchmark-dataset-0.4-SNAPSHOT.jar//org.keycloak.benchmark.dataset.DatasetResourceProvider.createAndSetRealm(DatasetResourceProvider.java:897)
keycloak          | 	at deployment.keycloak-benchmark-dataset-0.4-SNAPSHOT.jar//org.keycloak.benchmark.dataset.DatasetResourceProvider.lambda$createRealmsImpl$2(DatasetResourceProvider.java:178)
keycloak          | 	at org.keycloak.keycloak-server-spi-private@12.0.4//org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:228)
keycloak          | 	at org.keycloak.keycloak-server-spi-private@12.0.4//org.keycloak.models.utils.KeycloakModelUtils.runJobInTransactionWithTimeout(KeycloakModelUtils.java:268)
keycloak          | 	at deployment.keycloak-benchmark-dataset-0.4-SNAPSHOT.jar//org.keycloak.benchmark.dataset.DatasetResourceProvider.lambda$createRealmsImpl$5(DatasetResourceProvider.java:177)
keycloak          | 	at org.keycloak.keycloak-server-spi-private@12.0.4//org.keycloak.models.utils.KeycloakModelUtils.runJobInTransaction(KeycloakModelUtils.java:228)
keycloak          | 	at org.keycloak.keycloak-server-spi-private@12.0.4//org.keycloak.models.utils.KeycloakModelUtils.runJobInTransactionWithTimeout(KeycloakModelUtils.java:268)
keycloak          | 	at deployment.keycloak-benchmark-dataset-0.4-SNAPSHOT.jar//org.keycloak.benchmark.dataset.ExecutorHelper.lambda$addTask$0(ExecutorHelper.java:53)
keycloak          | 	at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
keycloak          | 	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
keycloak          | 	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
keycloak          | 	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
keycloak          | 	at java.base/java.lang.Thread.run(Thread.java:834)
keycloak          | Caused by: java.sql.SQLNonTransientConnectionException: Data source rejected establishment of connection,  message from server: "Too many connections"
keycloak          | 	at com.mysql.jdbc@8.0.22//com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:110)
keycloak          | 	at com.mysql.jdbc@8.0.22//com.mysql.cj.jdbc.exceptions.SQLError.createSQLException(SQLError.java:97)
keycloak          | 	at com.mysql.jdbc@8.0.22//com.mysql.cj.jdbc.exceptions.SQLExceptionsMapping.translateException(SQLExceptionsMapping.java:122)
keycloak          | 	at com.mysql.jdbc@8.0.22//com.mysql.cj.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:836)
keycloak          | 	at com.mysql.jdbc@8.0.22//com.mysql.cj.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:456)
keycloak          | 	at com.mysql.jdbc@8.0.22//com.mysql.cj.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:246)
keycloak          | 	at com.mysql.jdbc@8.0.22//com.mysql.cj.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:198)
keycloak          | 	at org.jboss.ironjacamar.jdbcadapters@1.4.23.Final//org.jboss.jca.adapters.jdbc.local.LocalManagedConnectionFactory.createLocalManagedConnection(LocalManagedConnectionFactory.java:321)
keycloak          | 	... 50 more
keycloak          |
keycloak          | 20:54:49,750 WARN  [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (pool-20-thread-5) SQL Error: 0, SQLState: null
keycloak          | 20:54:49,750 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (pool-20-thread-5) javax.resource.ResourceException: IJ000453: Unable to get managed connection for java:jboss/datasources/KeycloakDS

Something else that comes up, regardless if I executed the REST end point to create another 100 realms and triggering the error, of if I just left at the inital 100 realms, the following warnings come up 5 mins after starting the initial request.

keycloak          | 20:55:12,843 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffffac130003:15e6a115:6047df6f:52 in state  RUN
keycloak          | 20:55:12,844 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffffac130003:15e6a115:6047df6f:56 in state  RUN
keycloak          | 20:55:12,845 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffffac130003:15e6a115:6047df6f:52
keycloak          | 20:55:12,847 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffffac130003:15e6a115:6047df6f:56
keycloak          | 20:55:12,849 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffffac130003:15e6a115:6047df6f:5c in state  RUN
keycloak          | 20:55:12,850 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffffac130003:15e6a115:6047df6f:5c
keycloak          | 20:55:12,854 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffffac130003:15e6a115:6047df6f:66 in state  RUN
keycloak          | 20:55:12,855 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffffac130003:15e6a115:6047df6f:66
keycloak          | 20:55:14,037 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffffac130003:15e6a115:6047df6f:7f in state  RUN
keycloak          | 20:55:14,039 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffffac130003:15e6a115:6047df6f:7f
keycloak          | 20:55:14,832 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper) ARJUNA012117: TransactionReaper::check timeout for TX 0:ffffac130003:15e6a115:6047df6f:a0 in state  RUN
keycloak          | 20:55:14,834 WARN  [org.hibernate.resource.transaction.backend.jta.internal.synchronization.SynchronizationCallbackCoordinatorTrackingImpl] (Transaction Reaper Worker 0) HHH000451: Transaction afterCompletion called by a background thread; delaying afterCompletion processing until the original thread can handle it. [status=4]
keycloak          | 20:55:14,834 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffffac130003:15e6a115:6047df6f:a0

The about 1-2 mins after the warning messages, the DB connections start to be released and I can see the number go down when running show status where variable_name = 'Threads_connected';

I’m not 100% sure, but I don’t think the problem is on the dataset side of the code, cause it’s ultimately executing code in the main codebase to create the realms and the DB connections seem to be created and maintained in the main code base too. It seem like some sort of transaction is not completing properly and that the DB connection is never freed up.

I’m hoping this sort of bug is isolated to creating realms, but I’ll be creating other datasets soon. If it’s a more generic error, then this really doesn’t bode well when we’re expecting to see high traffic on the server. We’re looking to create 400-500 realms initially in production and are probably going to scale up to 1000 in the near future.

Is this something that can be fixed with a config change? I feel like I’ve tried everything already, but I might have missed something. If not and it is a bug, can we can a issue created for it?

Thanks

Do you have a thread dump?

From the error, I suspect keycloak is updating db but can’t finish within the transaction timeout period…

I see there are some posts complain that keycloak/rhsso cannot handle many realms…

Okay, so I fired up an instance again and executed the REST end point to create 200 realms and dumped the full log, which can be found here: KeyCloak Log Dump - Pastebin

I posted a thread dump on pastebin, but the post is still waiting approval. In the mean time, I’ll highlight some issues that I seem to have found in the code base:

In the DataResourceProvider.java (keycloak-benchmark/DatasetResourceProvider.java at 586a9e0ebb03175b6376149310d68b6edc841fab · keycloak/keycloak-benchmark · GitHub), there is the following block of code:

        try {
            DefaultRoles.getDefaultRoles(realm);
        } catch (NoClassDefFoundError ncdfe) {
            // Since Keycloak 13
        }

This seems to cause issues. First off the getDefaultRoles function has a return, which isn’t being assigned here, so I question why it’s even there. Secondly, I think it’s “dying” in that function. I don’t have any solid proof, other than I added logging before the block of code and after. I would always see the logs I wrote before, but never after. I added a extra catch for a general exception too, but that didn’t seem to catch anything either, so I’m not sure what’s going on there. Commenting out the block of code seems to help.

After commenting out that block and doing the following request (https://localhost:8443/auth/realms/master/dataset/create-realms?count=200&clients-per-realm=10&users-per-realm=50). the max DB connections was 22, but it mainly hovered around 7 connections.