Keycloak Upgrade Docker Container From 18.0.2 to 21.0.2 With +200 Realms

Hi dear community, I am reaching out because we need some help upgrading our KC version from 18.0.2 to 21.0.2. We run a multi-tenancy approach where each organisation with users is represented by a realm. Since we serve multiple organisations, we have more than 300 realms on our KC. I am aware that KC suffers significant performance issues at around 300 realms, and we do have a plan to move away from this approach and use groups instead. However, we decided to upgrade KC first.

We tried to upgrade our dev and integration environments. This worked seamlessly, and we experienced no issues. However, production is a different story: it did not work. The only difference is that production has 434 realms. I have asked Ops for a list of inactive businesses, i.e. abandoned realms, so that we can clean up and reduce the size, but I reckon we will still end up with between 150 and 200 realms. We do this kind of cleanup from time to time when we experience performance issues. Anyway, there is still a lot to optimize here. I took this over from a senior dev who set everything up. The more important issue now is upgrading while preserving the realms, of course.

Further info: We run our KC as an image in a Docker container configured with env vars.

Has anyone faced a similar challenge? What do we need to do to make this migration work?

I can post more information, configs and error logs.

Thanks in advance!!!

Hey Boris!

Very nice challenge you have here. Can you share what the symptoms of the upgrade failure in production are? Did you get any errors?

Are multiple docker instances trying to update the DB at the same time?

(Figuring one instance in dev, multiple in prod)

@Carl Thanks for getting back. No, we only have one KC instance running on our server. Hence, it is just this one trying to update the DB. But the issue has been solved :). Thank you very much, anyways!

@Carl @gmolaire I managed to upgrade it. I went to 19.0 first and then from there to 20.0.5. The upgrade from 18.0.2 to 19.0.3 worked easily, but from 19.0.3 to 20.0.5 I had the same errors.

The following helped me to upgrade from 19.0.2 to 20.0.5.
I found the solution for upgrading with a large number of realms here: Align `quarkus.transaction-manager.default-transaction-timeout` with storage lock timeouts · Issue #19453 · keycloak/keycloak · GitHub.

The reporter there faced a similar issue with more than 600 realms.

The solution is to add this to quarkus.properties:

quarkus.transaction-manager.default-transaction-timeout=35M
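
In case someone wants to script this step, here is a minimal sketch in Python that sets the property in a properties file and replaces an older value if one is already there. The helper name `set_property` and the local file name are only illustrative assumptions, not something from Keycloak itself, and the file still has to end up in the `conf` directory of the Keycloak installation inside the container.

```python
from pathlib import Path

# Property name taken from the GitHub issue mentioned above.
KEY = "quarkus.transaction-manager.default-transaction-timeout"


def set_property(path: Path, key: str, value: str) -> None:
    """Set key=value in a .properties file, replacing an existing entry for key."""
    lines = path.read_text().splitlines() if path.exists() else []
    # Keep every line that does not define the same key (comments and blanks stay).
    kept = [line for line in lines if line.split("=", 1)[0].strip() != key]
    kept.append(f"{key}={value}")
    path.write_text("\n".join(kept) + "\n")
```

Calling `set_property(Path("quarkus.properties"), KEY, "35M")` twice leaves a single entry in the file, so it should be safe to run before every upgrade step.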

I am now trying the same from 20.0.5 to 21.0.2, meaning that I just change the image version and bring the compose setup up again.


UPDATE: The same solution worked for upgrading from 20.0.5 to 21.0.2.

Configure your quarkus.properties with this setting:
quarkus.transaction-manager.default-transaction-timeout=35M

The value can be adjusted based on the number of realms. 35M worked seamlessly with more than 400 realms, as well as with more than 600, see: Align `quarkus.transaction-manager.default-transaction-timeout` with storage lock timeouts · Issue #19453 · keycloak/keycloak · GitHub
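
To make the value easier to reason about when adjusting it: as far as I understand the Quarkus simplified duration format, a bare number is read as seconds and a number followed by h, m or s as hours, minutes or seconds, in upper or lower case. Under that assumption, 35M is 35 minutes. The small Python sketch below only illustrates that reading; the function name is made up and it covers just the single number-plus-unit form.

```python
import re

_UNIT_SECONDS = {"h": 3600, "m": 60, "s": 1}


def simplified_duration_to_seconds(value: str) -> int:
    """Convert values such as '35M', '90s', '2h' or '300' to seconds."""
    match = re.fullmatch(r"(\d+)([hms]?)", value.strip().lower())
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    amount, unit = match.groups()
    # A bare number is assumed to mean seconds.
    return int(amount) * _UNIT_SECONDS[unit or "s"]
```

For example, `simplified_duration_to_seconds("35M")` gives 2100 seconds, which is the time the whole migration transaction is allowed to take before it gets cancelled.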

We would have been able to help you here if you had answered @gmolaire's question about which error message you were confronted with. When you just write "it does not work", one can hardly tell you anything about possible root causes!
Being more precise in your problem description could have led you to the proper solution faster!

Hi @dasniko,

That would have been my next step. I did not want to clutter the problem description with a long stack trace. It’s not that I didn’t think to provide the error message, but I was initially seeking some form of feedback on my problem, such as a question about the error or the environment.

I had indeed planned to answer @gmolaire’s questions. However, before I returned to this thread, I had already found the solution after spending about two hours researching various forums and documentation. Fortunately, I wasn’t the only one facing this issue. If I hadn’t found a solution, I would have provided more error messages. :)

Regardless, you are absolutely right, and I will definitely provide the error message immediately in the future.

Best,
Boris

Hi @gmolaire,

Thanks for getting back to this so quickly. I have already figured out the problem and the upgrade has been completed successfully. Nevertheless, here is the error I received:

2024-07-08 10:54:39,220 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012108: CheckedAction::check - atomic action 0:ffffac145016:9a89:668bc443:0 aborting with 1 threads active!
2024-07-08 10:54:39,223 WARN  [io.agroal.pool] (Transaction Reaper Worker 0) Datasource '<default>': JDBC resources leaked: 1 ResultSet(s) and 1 Statement(s)
2024-07-08 10:54:39,228 WARN  [org.hibernate.resource.transaction.backend.jta.internal.synchronization.SynchronizationCallbackCoordinatorTrackingImpl] (Transaction Reaper Worker 0) HHH000451: Transaction afterCompletion called by a background thread; delaying afterCompletion processing until the original thread can handle it. [status=4]
2024-07-08 10:54:39,229 WARN  [com.arjuna.ats.arjuna] (Transaction Reaper Worker 0) ARJUNA012121: TransactionReaper::doCancellations worker Thread[Transaction Reaper Worker 0,5,main] successfully canceled TX 0:ffffac145016:9a89:668bc443:0
2024-07-08 10:54:39,291 WARN  [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (main) SQL Error: 0, SQLState: null
2024-07-08 10:54:39,292 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (main) Connection is closed
2024-07-08 10:54:39,359 WARN  [com.arjuna.ats.arjuna] (main) ARJUNA012077: Abort called on already aborted atomic action 0:ffffac145016:9a89:668bc443:0
2024-07-08 10:54:39,522 INFO  [org.infinispan.CLUSTER] (main) ISPN000080: Disconnecting JGroups channel ISPN
2024-07-08 10:54:39,758 ERROR [org.keycloak.quarkus.runtime.cli.ExecutionExceptionHandler] (main) ERROR: Failed to start server in (production) mode
2024-07-08 10:54:39,758 ERROR [org.keycloak.quarkus.runtime.cli.ExecutionExceptionHandler] (main) ERROR: org.hibernate.exception.GenericJDBCException: could not prepare statement
2024-07-08 10:54:39,759 ERROR [org.keycloak.quarkus.runtime.cli.ExecutionExceptionHandler] (main) ERROR: could not prepare statement
2024-07-08 10:54:39,759 ERROR [org.keycloak.quarkus.runtime.cli.ExecutionExceptionHandler] (main) ERROR: Connection is closed
2024-07-08 10:54:39,759 ERROR
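
For anyone who wants to check their own startup log for the same pattern: the "Transaction Reaper" lines in the log above are what you see when the transaction manager cancels a transaction that ran past its timeout, and the "Connection is closed" errors follow from that. Below is a minimal Python sketch that looks for the two ARJUNA codes from my log in a saved log text. The function name and the short descriptions are my own, and it assumes the same timestamp layout as in the excerpt.

```python
import re
from typing import List, Tuple

# Codes copied from the log excerpt above; the descriptions are paraphrased.
MARKERS = {
    "ARJUNA012108": "transaction aborted while a thread was still active",
    "ARJUNA012121": "transaction reaper cancelled the transaction",
}

_TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}")


def find_reaper_events(log_text: str) -> List[Tuple[str, str]]:
    """Return (timestamp, code) pairs for every log line containing a known marker."""
    events = []
    for line in log_text.splitlines():
        for code in MARKERS:
            if code in line:
                match = _TIMESTAMP.match(line)
                # Lines without a leading timestamp get an empty string instead.
                events.append((match.group(0) if match else "", code))
    return events
```

If this returns anything for a failed upgrade, raising the transaction timeout as described earlier in the thread is probably worth a try.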