SQL Connection closed after I/O Exception

I’m running Keycloak clustered with a PostgreSQL JDBC driver (both DB and Keycloak VM are running on Azure).
After a couple days of uptime, Keycloak will eventually stop working. According to the log file, each time the originating problem is an I/O error during a scheduled task (“ClearExpiredEvents”):

2020-03-27 23:27:44,272 ERROR [org.keycloak.services] (Timer-2) KC-SERVICES0089: Failed to run scheduled task ClearExpiredEvents: javax.persistence.PersistenceException: org.hibernate.exception.JDBCConnectionException: could not extract ResultSet
at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:154)
	at org.hibernate.query.internal.AbstractProducedQuery.list(AbstractProducedQuery.java:1515)
	at org.hibernate.query.Query.getResultList(Query.java:132)
	at org.keycloak.models.jpa.JpaRealmProvider.getRealms(JpaRealmProvider.java:117)
	at org.keycloak.models.jpa.JpaRealmProvider.getRealms(JpaRealmProvider.java:113)
	at org.keycloak.models.cache.infinispan.RealmCacheSession.getRealms(RealmCacheSession.java:466)
	at org.keycloak.services.scheduled.ClearExpiredEvents.run(ClearExpiredEvents.java:34)
        [ ... ]

Caused by the underlying exception:

Caused by: org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
	at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:358)
	at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:448)
	at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:369)
	at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:159)
	at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:109)
	at org.jboss.jca.adapters.jdbc.WrappedPreparedStatement.executeQuery(WrappedPreparedStatement.java:504)
	at org.hibernate.engine.jdbc.internal.ResultSetReturnImpl.extract(ResultSetReturnImpl.java:60)
	... 28 more
Caused by: java.net.SocketException: Connection reset by peer: socket write error
	at java.net.SocketOutputStream.socketWrite0(Native Method)
	at java.net.SocketOutputStream.socketWrite(Unknown Source)
	at java.net.SocketOutputStream.write(Unknown Source)
       [ ... ]

Now one error like this wouldn’t be a problem. After all, connection to the database might break once in a while… But unfortunately, Keycloak seems to stop working after this. Consecutive error messages might look like this (here another scheduled task “ClearExpiredUserSessions”):

> 2020-03-27 23:27:45,336 WARN  [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (Timer-2) SQL Error: 0, SQLState: 08003
> 2020-03-27 23:27:45,336 ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (Timer-2) This connection has been closed.
> 2020-03-27 23:27:45,336 ERROR [org.keycloak.services] (Timer-2) KC-SERVICES0089: Failed to run scheduled task ClearExpiredUserSessions: javax.persistence.PersistenceException: org.hibernate.exception.JDBCConnectionException: could not prepare statement
> 	at org.hibernate.internal.ExceptionConverterImpl.convert(ExceptionConverterImpl.java:154)
> 	at org.hibernate.query.internal.AbstractProducedQuery.list(AbstractProducedQuery.java:1515)
> 	at org.hibernate.query.Query.getResultList(Query.java:132)
> 	at org.keycloak.models.jpa.JpaRealmProvider.getRealms(JpaRealmProvider.java:117)
> 	at org.keycloak.models.jpa.JpaRealmProvider.getRealms(JpaRealmProvider.java:113)
> 	at org.keycloak.models.cache.infinispan.RealmCacheSession.getRealms(RealmCacheSession.java:466)
> 	at org.keycloak.services.scheduled.ClearExpiredUserSessions.run(ClearExpiredUserSessions.java:36)
>        [ ... ]

It seems to me that some part of Keycloak should re-open the connection? Now every database access just seems to fail…?

Hi sweber,

Any news on it ? Do you solved this issue ?

I found the same error on my keycloak v10 and it’s ok on my keycloak v8.

Thanks

Hi

As I mentioned, the Keycloak VM as well as the database were running on Azure. Our current workaround is to run Keycloak against a MSSQL database instead of a PostgreSQL. With MSSQL, the problem has not happened again so far.

So unfortunately, from my point of view, the issue in Keycloak is not resolved at all. If a connection loss happens, it’s possible that Keycloak does not react correctly in my opinion. This should still be addressed in order to build a reliable system.

Update: As suspected, the above mentioned workaround did not last forever. Keycloak was running for about 2 weeks without issues, until the SQL Server connection was disrupted once again.
Then it once again did not work until manually restarted.

Hibernate seems to recognize that the connection is closed:
WARN [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (Timer-2) SQL Error: 0, SQLState: null
ERROR [org.hibernate.engine.jdbc.spi.SqlExceptionHelper] (Timer-2) The connection is closed.

But instead of reopening the connection, all future SQL statements fail.

Can you post the configuration of your datasource in your standalone-ha.xml file?
Do you have the following present? Maybe you can tweak the configuration a bit.
Also check the max allowed connection in Azure and the one set in your pool config.
Also take a look at https://docs.jboss.org/jbossas/docs/Server_Configuration_Guide/beta500/html/ch13s04s03s01.html

<validation>
    <check-valid-connection-sql>SELECT 1</check-valid-connection-sql>
    <background-validation>true</background-validation>
    <background-validation-millis>60000</background-validation-millis>
</validation>

Thank you for your inputs. I appreciate you helping me out.
I had not included the validation block that you posted. I have added that now and will have to see how stability develops.

My datasource is now configured like this:

                <datasource jndi-name="java:jboss/datasources/KeycloakDS" pool-name="KeycloakDS" enabled="true" use-java-context="true">
                <connection-url>jdbc:sqlserver://my-obfuscated-azure-db.database.windows.net:1433;database=my-obfuscated-azure-db;encrypt=true;trustServerCertificate=false;hostNameInCertificate=*.database.windows.net;loginTimeout=30;</connection-url>
                <driver>sqlserver</driver>
                <pool>
                    <max-pool-size>20</max-pool-size>
                </pool>
                <security>
                    <user-name>user@my-obfuscated-azure-db</user-name>
                    <password>my-password</password>
                </security>
                <validation>
                    <valid-connection-checker class-name="org.jboss.jca.adapters.jdbc.extensions.mssql.MSSQLValidConnectionChecker"></valid-connection-checker>
                    <background-validation>true</background-validation>
                    <background-validation-millis>120000</background-validation-millis>
                </validation>
            </datasource>
1 Like

Did you find a solution in the meantime?

Yes, with the solution mentioned in the previous post the problem no longer occurred.

1 Like

Same problem happened to us following a network switch restart. Keycloak was stuck in this state until I restarted the instance. Thanks for this thread and the information. I have updated the configuration as suggested.

I had fetched same issue. Now its resolved by using following validation block.


<datasource jndi-name="java:jboss/datasources/KeycloakDS" pool-name="KeycloakDS" enabled="true" use-java-context="true">
   <connection-url>jdbc:postgresql://host.docker.internal:5432/postgres_db</connection-url>
   <driver>postgresql</driver>
   <pool>
	   <max-pool-size>100</max-pool-size>
   </pool>
   <security>
	   <user-name>postgres_user</user-name>
	   <password>postgres_password</password>
   </security>				   
   <validation>
		<valid-connection-checker class-name="org.jboss.jca.adapters.jdbc.extensions.postgres.PostgreSQLValidConnectionChecker"></valid-connection-checker>
		<background-validation>true</background-validation>
		<background-validation-millis>120000</background-validation-millis>
   </validation>				
</datasource>