High Availability: Standalone HA vs Domain Clustered mode

Based on the documentation:

Domain Clustered Mode:
Domain mode is a way to centrally manage and publish the configuration for your servers.
Running a cluster in standard mode can quickly become aggravating as the cluster grows in size. Every time you need to make a configuration change, you have to perform it on each node in the cluster. Domain mode solves this problem by providing a central place to store and publish configurations. It can be quite complex to set up, but it is worth it in the end. This capability is built into the WildFly Application Server which Keycloak derives from.

I tried the example setup from the user manual, and it really simplifies the maintenance of multiple configurations.

However, as far as High Availability is concerned, this is not quite resilient. When the master node goes down, the Auth Server will stop functioning, since all the slave nodes listen to the domain controller.

Is my understanding correct here? Or am I missing something?

If this is the case, then Standalone-HA is the way to go to ensure High Availability, right?

Yes, I’m using standalone-HA all the way. As all my environments are scripted, automated, and run in Docker, maintaining the configuration is not an issue, and I avoid the management overhead and configuration know-how that domain mode requires.

I set up 2 nodes with standalone-ha.xml, a physical network load balancer in front, and a shared MySQL data source on RHEL 8.

The nodes are not aware of the tokens issued by other nodes: if the load balancer sends the user to a node other than the one that issued his token, his authentication is denied.

If a user registers via Node 1, logs in immediately, and is sent to Node 2, his information takes about a minute to become visible on Node 2, and he gets a message that the user doesn’t exist. I think the cache is not replicated instantly to the DB. As a workaround we had to disable realmCache and userCache.

Basically these nodes are not aware of each other’s existence. I don’t know what I’m doing wrong here.
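
To make the symptom concrete, here is a toy sketch of what I think is happening. It is not Keycloak code: the Node class and its join, issue_token, and validate methods are hypothetical names made up for illustration. It assumes that issued sessions live in a per-node in-memory cache rather than in the shared DB, so a node can only validate another node's token if the two discovered each other and replicate cache entries.

```python
from dataclasses import dataclass, field
import uuid


@dataclass
class Node:
    """Hypothetical auth node that keeps issued sessions in an in-memory cache."""
    name: str
    sessions: dict = field(default_factory=dict)
    # Stays empty when cluster discovery fails (assumption for this sketch).
    peers: list = field(default_factory=list, repr=False)

    def join(self, other: "Node") -> None:
        # Simulates successful cluster discovery: both nodes learn of each other.
        self.peers.append(other)
        other.peers.append(self)

    def issue_token(self, user: str) -> str:
        token = uuid.uuid4().hex
        self.sessions[token] = user
        # Replicate the new session to every known peer.
        for peer in self.peers:
            peer.sessions[token] = user
        return token

    def validate(self, token: str) -> bool:
        # A node can only accept tokens whose session it holds locally.
        return token in self.sessions


# Case 1: two isolated nodes behind a load balancer (no discovery).
a, b = Node("node1"), Node("node2")
isolated_result = b.validate(a.issue_token("alice"))

# Case 2: the same two nodes after they have formed a cluster.
c, d = Node("node1"), Node("node2")
c.join(d)
clustered_result = d.validate(c.issue_token("alice"))

print("isolated:", isolated_result)
print("clustered:", clustered_result)
```

In the isolated case the second node rejects the token, which matches what I see behind the load balancer. In the clustered case it is accepted, so my guess is that the real problem is that the two nodes never form a cluster.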

Also, I see this message on one of the two nodes. If both nodes are stopped and we bring up Node 1 first and then Node 2 a couple of minutes later, this message keeps coming up on Node 2, but not on Node 1. Probably Node 1 acquired some type of lock on the DB and Node 2 couldn’t, though it’s bound to the datasource.

2020-10-31 21:38:07,979 WARN [org.jboss.jca.core.connectionmanager.pool.strategy.OnePool] (Timer-2) IJ000621: Destroying connection that could not be validated: org.jboss.jca.core.connectionmanager.listener.TxConnectionListener@15d28c4e[state=NORMAL managed connection=org.jboss.jca.adapters.jdbc.local.LocalManagedConnection@3d559a6b connection handles=0 lastReturned=1604193789095 lastValidated=1604193788053 lastCheckedOut=1604193789088 trackByTx=false pool=org.jboss.jca.core.connectionmanager.pool.strategy.OnePool@523baa80 mcp=SemaphoreConcurrentLinkedQueueManagedConnectionPool@1c1a2ce9[pool=KeycloakDS] xaResource=LocalXAResourceImpl@36d86162[connectionListener=15d28c4e
connectionManager=2b63566b warned=false currentXid=null productName=MySQL productVersion=8.0.20 jndiName=java:/jboss/datasources/KeycloakDS] txSync=null]

Any help or suggestions would be a great help!

No! You can directly select domain clustered mode.