Troubleshooting Docker JDBC_PING 504 Error

Hi,

I’ve had a pair of AWS Tasks running Keycloak 24 for a month, and today they started returning a 504 Gateway Timeout. The Tasks coordinate in a cluster with JDBC_PING. Before I got the 504, I noticed that the browser was redirected to /admin.

The Tasks looked healthy to me. CPU and RAM were low. There’s a custom SPI that checks DB connections and that was operating ok. The database supporting KC was up too.

I was wondering if anyone knew where I might start looking. Source code line numbers would be much appreciated.

Thanks in advance,
Carl

Hey Carl, when the 504s occur, what do you get in your logs?

Unfortunately, no errors except the 504 that the load balancer is getting. My guess is that the distributed cache is taking too long to look up a realm. The DB supporting Keycloak looked good.

Are the logs in DEBUG mode?

If there is nothing in the logs, it may be infrastructure.

  • Do you have security rules somewhere that block access to the service?
  • Do you see your target group reaching the task? Are they healthy?
  • Do you still get the issue if you reboot/re-create the service completely?

Hi,

Thanks for sticking with me and my vague question.

A restart fixed everything.

There weren’t any errors in the logs. Also, (this is on AWS) there weren’t any bad health checks involving the ALB, cluster, services, or tasks besides the 504s.

I’ve been looking at org/keycloak/services/resources/admin/AdminRoot.java, since the browser did redirect to /admin when I tried to reach the console. I had checked the DB supporting Keycloak, but maybe the distributed cache or another component never returned from this call.

RealmModel master = new RealmManager(session)
    .getKeycloakAdminstrationRealm();
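
If the call really never returns, a request thread should still be sitting inside it, and that shows up in a thread dump taken before the restart. Below is a minimal sketch of that check, assuming it runs inside the Keycloak JVM (for example from the custom SPI). The class name StuckThreadFinder and the method threadsIn are made up for illustration and are not part of Keycloak. Running jstack against the Keycloak process gives the same information if the JDK tools are available in the container.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Keycloak: lists live threads that
// currently have a frame from a given class on their stack.
public class StuckThreadFinder {

    /**
     * Returns one line per live thread whose stack contains a frame whose
     * class name starts with classPrefix. Each line holds the thread name,
     * its state (RUNNABLE, WAITING, BLOCKED, ...) and the matching frame.
     */
    public static List<String> threadsIn(String classPrefix) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            for (StackTraceElement frame : entry.getValue()) {
                if (frame.getClassName().startsWith(classPrefix)) {
                    hits.add(thread.getName()
                            + " [" + thread.getState() + "] at " + frame);
                    break; // one line per thread is enough
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Run standalone this prints nothing, because no Keycloak classes
        // are loaded. Inside the Keycloak JVM it would list threads that
        // are currently somewhere under RealmManager.
        threadsIn("org.keycloak.services.managers.RealmManager")
                .forEach(System.out::println);
    }
}
```

A thread that appears in this list across several checks a few seconds apart, always in WAITING or BLOCKED state, would point at the lookup rather than at the load balancer. Passing a broader prefix such as org.infinispan or org.jgroups would show whether threads are parked in the cache or cluster layer instead.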

I have expanded logging to include more org.keycloak messages. I’ll report back if I find anything.

Thanks again,
Carl