I have spent some considerable time attempting to run the Keycloak container clustered in swarm mode.
TL;DR: I ended up using dns.DNS_PING for discovery and changing docker-entrypoint.sh so that the bind address matches the container's IP as returned by the service's dnsrr DNS records. The details are at the bottom.
Adventures
My first run was in a local Docker without swarm, scaled to 2 with docker-compose. This worked well once I added the environment variables CACHE_OWNERS_COUNT and CACHE_OWNERS_AUTH_SESSIONS_COUNT.
Having read some background, and having confirmed that multicast does not work in swarm mode/overlay networks, I settled on dns.DNS_PING for discovery and changed the service's endpoint_mode to dnsrr.
I could see the DNS queries (via ROOT_LOGLEVEL=DEBUG or tcpdump) and the correct responses, but the result was two independent JGroups/Infinispan clusters.
Resultant command from entrypoint.
/bin/sh /opt/jboss/keycloak/bin/standalone.sh -Djboss.bind.address=10.0.1.41 -Djboss.bind.address.private=10.0.1.41 -Djboss.bind.address=172.19.0.4 -Djboss.bind.address.private=172.19.0.4 -c=standalone-ha.xml -b 0.0.0.0
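The doubled bind properties in that command come from the entrypoint looping over every address that hostname --all-ip-addresses returns (the loop is visible in the diffs further down). A minimal sketch of that loop as a standalone function follows; the name build_bind_opts is mine and not part of the image. It shows how two attached networks yield two pairs of properties. Judging by the logs below, the last pair appears to be the one that takes effect.

```shell
# Hypothetical helper mirroring the entrypoint's loop over $BIND.
# $1: whitespace-separated list of IP addresses.
build_bind_opts() {
    opts=""
    for ip in $1; do
        opts="$opts -Djboss.bind.address=$ip -Djboss.bind.address.private=$ip"
    done
    # Strip the single leading space before printing.
    printf '%s\n' "${opts# }"
}

# The two addresses from the command above: overlay network, then docker_gwbridge.
build_bind_opts "10.0.1.41 172.19.0.4"
```

With more than one address in the list, which one ends up being used depends on the order in which hostname reports them.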
Jgroups/Infinispan logs
18:13:07,149 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool – 39) WFLYCLINF0001: Activating Infinispan subsystem.
18:13:07,191 INFO [org.jboss.as.clustering.jgroups] (ServerService Thread Pool – 43) WFLYCLJG0001: Activating JGroups subsystem. JGroups version 4.2.4
18:13:12,086 INFO [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool – 60) 90ac043bdf26: no members discovered after 3020 ms: creating cluster as coordinator
18:13:12,859 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-6) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
18:13:12,871 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-7) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
18:13:12,872 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-8) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
18:13:12,859 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-5) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
18:13:12,873 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-4) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
18:13:12,894 INFO [org.infinispan.CONTAINER] (MSC service thread 1-8) ISPN000128: Infinispan version: Infinispan ‘Turia’ 10.1.8.Final
18:13:13,115 INFO [org.infinispan.CLUSTER] (MSC service thread 1-6) ISPN000078: Starting JGroups channel ejb
18:13:13,115 INFO [org.infinispan.CLUSTER] (MSC service thread 1-7) ISPN000078: Starting JGroups channel ejb
18:13:13,115 INFO [org.infinispan.CLUSTER] (MSC service thread 1-4) ISPN000078: Starting JGroups channel ejb
18:13:13,117 INFO [org.infinispan.CLUSTER] (MSC service thread 1-8) ISPN000078: Starting JGroups channel ejb
18:13:13,117 INFO [org.infinispan.CLUSTER] (MSC service thread 1-5) ISPN000078: Starting JGroups channel ejb
18:13:13,124 INFO [org.infinispan.CLUSTER] (MSC service thread 1-6) ISPN000094: Received new cluster view for channel ejb: [90ac043bdf26|0] (1) [90ac043bdf26]
18:13:13,124 INFO [org.infinispan.CLUSTER] (MSC service thread 1-5) ISPN000094: Received new cluster view for channel ejb: [90ac043bdf26|0] (1) [90ac043bdf26]
18:13:13,124 INFO [org.infinispan.CLUSTER] (MSC service thread 1-8) ISPN000094: Received new cluster view for channel ejb: [90ac043bdf26|0] (1) [90ac043bdf26]
18:13:13,124 INFO [org.infinispan.CLUSTER] (MSC service thread 1-4) ISPN000094: Received new cluster view for channel ejb: [90ac043bdf26|0] (1) [90ac043bdf26]
18:13:13,133 INFO [org.infinispan.CLUSTER] (MSC service thread 1-7) ISPN000094: Received new cluster view for channel ejb: [90ac043bdf26|0] (1) [90ac043bdf26]
18:13:13,140 INFO [org.infinispan.CLUSTER] (MSC service thread 1-4) ISPN000079: Channel ejb local address is 90ac043bdf26, physical addresses are [172.19.0.4:7600]
18:13:13,149 INFO [org.infinispan.CLUSTER] (MSC service thread 1-5) ISPN000079: Channel ejb local address is 90ac043bdf26, physical addresses are [172.19.0.4:7600]
18:13:13,150 INFO [org.infinispan.CLUSTER] (MSC service thread 1-8) ISPN000079: Channel ejb local address is 90ac043bdf26, physical addresses are [172.19.0.4:7600]
18:13:13,151 INFO [org.infinispan.CLUSTER] (MSC service thread 1-7) ISPN000079: Channel ejb local address is 90ac043bdf26, physical addresses are [172.19.0.4:7600]
18:13:13,151 INFO [org.infinispan.CLUSTER] (MSC service thread 1-6) ISPN000079: Channel ejb local address is 90ac043bdf26, physical addresses are [172.19.0.4:7600]
I noticed in the logs that the ‘wrong’ IP was being bound: it belongs to the docker_gwbridge network (the default route), not the network defined in compose. So I tried setting the BIND variable to 0.0.0.0.
The results were identical: two independent clusters were created.
Resultant command from entrypoint.
/bin/sh /opt/jboss/keycloak/bin/standalone.sh -Djboss.bind.address=0.0.0.0 -Djboss.bind.address.private=0.0.0.0 -c=standalone-ha.xml -b 0.0.0.0
sudo nsenter -t $(docker inspect $(docker ps --filter name=kc_keycloak.1 -q) | jq '.[].State.Pid') -n ss -tnl 'sport = :7600'
State Recv-Q Send-Q Local Address:Port Peer Address:Port Process
LISTEN 0 50 0.0.0.0:7600 0.0.0.0:*
JGroups/Infinispan logs
18:19:36,229 INFO [org.jgroups.protocols.pbcast.GMS] (ServerService Thread Pool – 60) 9261f2605131: no members discovered after 3032 ms: creating cluster as coordinator
18:19:36,780 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-4) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
18:19:36,780 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-3) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
18:19:36,791 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-6) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
18:19:36,801 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-8) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
18:19:36,802 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-1) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
18:19:36,823 INFO [org.infinispan.CONTAINER] (MSC service thread 1-6) ISPN000128: Infinispan version: Infinispan ‘Turia’ 10.1.8.Final
18:19:37,026 INFO [org.infinispan.CLUSTER] (MSC service thread 1-4) ISPN000078: Starting JGroups channel ejb
18:19:37,026 INFO [org.infinispan.CLUSTER] (MSC service thread 1-3) ISPN000078: Starting JGroups channel ejb
18:19:37,026 INFO [org.infinispan.CLUSTER] (MSC service thread 1-6) ISPN000078: Starting JGroups channel ejb
18:19:37,026 INFO [org.infinispan.CLUSTER] (MSC service thread 1-1) ISPN000078: Starting JGroups channel ejb
18:19:37,026 INFO [org.infinispan.CLUSTER] (MSC service thread 1-8) ISPN000078: Starting JGroups channel ejb
18:19:37,033 INFO [org.infinispan.CLUSTER] (MSC service thread 1-4) ISPN000094: Received new cluster view for channel ejb: [9261f2605131|0] (1) [9261f2605131]
18:19:37,033 INFO [org.infinispan.CLUSTER] (MSC service thread 1-3) ISPN000094: Received new cluster view for channel ejb: [9261f2605131|0] (1) [9261f2605131]
18:19:37,033 INFO [org.infinispan.CLUSTER] (MSC service thread 1-8) ISPN000094: Received new cluster view for channel ejb: [9261f2605131|0] (1) [9261f2605131]
18:19:37,035 INFO [org.infinispan.CLUSTER] (MSC service thread 1-6) ISPN000094: Received new cluster view for channel ejb: [9261f2605131|0] (1) [9261f2605131]
18:19:37,035 INFO [org.infinispan.CLUSTER] (MSC service thread 1-1) ISPN000094: Received new cluster view for channel ejb: [9261f2605131|0] (1) [9261f2605131]
18:19:37,041 INFO [org.infinispan.CLUSTER] (MSC service thread 1-6) ISPN000079: Channel ejb local address is 9261f2605131, physical addresses are [0.0.0.0:7600]
18:19:37,043 INFO [org.infinispan.CLUSTER] (MSC service thread 1-8) ISPN000079: Channel ejb local address is 9261f2605131, physical addresses are [0.0.0.0:7600]
18:19:37,049 INFO [org.infinispan.CLUSTER] (MSC service thread 1-4) ISPN000079: Channel ejb local address is 9261f2605131, physical addresses are [0.0.0.0:7600]
18:19:37,053 INFO [org.infinispan.CLUSTER] (MSC service thread 1-1) ISPN000079: Channel ejb local address is 9261f2605131, physical addresses are [0.0.0.0:7600]
18:19:37,056 INFO [org.infinispan.CLUSTER] (MSC service thread 1-3) ISPN000079: Channel ejb local address is 9261f2605131, physical addresses are [0.0.0.0:7600]
I also tried JDBC_PING and TCPPING, with no success.
How I got this working.
For some reason I thought this might be related to the bind ip and/or routes interacting with jgroups.
First of all I updated the docker-entrypoint.sh:
docker-entrypoint.sh diff
--- docker-entrypoint.sh.orig 2020-09-15 05:01:53.000000000 -0400
+++ docker-entrypoint.sh.test 2021-01-07 13:41:01.645836780 -0500
@@ -77,7 +77,7 @@
 ########################
 if [[ -z ${BIND:-} ]]; then
-    BIND=$(hostname --all-ip-addresses)
+    BIND=$(hostname --ip-address)
 fi
 if [[ -z ${BIND_OPTS:-} ]]; then
     for BIND_IP in $BIND
Combining this with endpoint_mode: dnsrr, JGROUPS_DISCOVERY_PROTOCOL: dns.DNS_PING and JGROUPS_DISCOVERY_PROPERTIES: dns_query=keycloak, I managed to get a cluster.
I don’t know why this is and will happily receive some education.
JGroups/Infinispan logs
Setting JGroups discovery to dns.DNS_PING with properties {dns_query=>keycloak}
19:14:15,505 INFO [org.jboss.as.clustering.infinispan] (ServerService Thread Pool – 39) WFLYCLINF0001: Activating Infinispan subsystem.
19:14:15,568 INFO [org.jboss.as.clustering.jgroups] (ServerService Thread Pool – 43) WFLYCLJG0001: Activating JGroups subsystem. JGroups version 4.2.4
19:14:25,017 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-3) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
19:14:25,027 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-4) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
19:14:25,022 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-6) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
19:14:25,030 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-2) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
19:14:25,041 INFO [org.infinispan.PERSISTENCE] (MSC service thread 1-5) ISPN000556: Starting user marshaller ‘org.wildfly.clustering.infinispan.marshalling.jboss.JBossMarshaller’
19:14:25,070 INFO [org.infinispan.CONTAINER] (MSC service thread 1-2) ISPN000128: Infinispan version: Infinispan ‘Turia’ 10.1.8.Final
19:14:25,360 INFO [org.infinispan.CLUSTER] (MSC service thread 1-4) ISPN000078: Starting JGroups channel ejb
19:14:25,360 INFO [org.infinispan.CLUSTER] (MSC service thread 1-2) ISPN000078: Starting JGroups channel ejb
19:14:25,360 INFO [org.infinispan.CLUSTER] (MSC service thread 1-6) ISPN000078: Starting JGroups channel ejb
19:14:25,360 INFO [org.infinispan.CLUSTER] (MSC service thread 1-3) ISPN000078: Starting JGroups channel ejb
19:14:25,360 INFO [org.infinispan.CLUSTER] (MSC service thread 1-5) ISPN000078: Starting JGroups channel ejb
19:14:25,372 INFO [org.infinispan.CLUSTER] (MSC service thread 1-6) ISPN000094: Received new cluster view for channel ejb: [eae0f6d94438|1] (2) [eae0f6d94438, af39ff385235]
19:14:25,372 INFO [org.infinispan.CLUSTER] (MSC service thread 1-2) ISPN000094: Received new cluster view for channel ejb: [eae0f6d94438|1] (2) [eae0f6d94438, af39ff385235]
19:14:25,373 INFO [org.infinispan.CLUSTER] (MSC service thread 1-4) ISPN000094: Received new cluster view for channel ejb: [eae0f6d94438|1] (2) [eae0f6d94438, af39ff385235]
19:14:25,374 INFO [org.infinispan.CLUSTER] (MSC service thread 1-5) ISPN000094: Received new cluster view for channel ejb: [eae0f6d94438|1] (2) [eae0f6d94438, af39ff385235]
19:14:25,381 INFO [org.infinispan.CLUSTER] (MSC service thread 1-3) ISPN000094: Received new cluster view for channel ejb: [eae0f6d94438|1] (2) [eae0f6d94438, af39ff385235]
19:14:25,395 INFO [org.infinispan.CLUSTER] (MSC service thread 1-4) ISPN000079: Channel ejb local address is af39ff385235, physical addresses are [10.0.2.66:7600]
19:14:25,401 INFO [org.infinispan.CLUSTER] (MSC service thread 1-6) ISPN000079: Channel ejb local address is af39ff385235, physical addresses are [10.0.2.66:7600]
19:14:25,423 INFO [org.infinispan.CLUSTER] (MSC service thread 1-2) ISPN000079: Channel ejb local address is af39ff385235, physical addresses are [10.0.2.66:7600]
19:14:25,428 INFO [org.infinispan.CLUSTER] (MSC service thread 1-5) ISPN000079: Channel ejb local address is af39ff385235, physical addresses are [10.0.2.66:7600]
19:14:25,436 INFO [org.infinispan.CLUSTER] (MSC service thread 1-3) ISPN000079: Channel ejb local address is af39ff385235, physical addresses are [10.0.2.66:7600]
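The line to look for is ISPN000094: the number in parentheses is the member count of the cluster view, (1) in the failed runs above and (2) here. As a rough sketch, a small filter can pull that number out of a log stream; the function name cluster_size is my own, and the pattern assumes the log format shown in this post.

```shell
# Hypothetical helper: read log lines on stdin and print the member count
# from the most recent ISPN000094 "Received new cluster view" line.
cluster_size() {
    sed -n 's/.*ISPN000094.*(\([0-9][0-9]*\)) \[.*/\1/p' | tail -n 1
}

# Example usage (assumes the stack is named "kc", as in the nsenter command earlier):
#   docker service logs kc_keycloak 2>&1 | cluster_size
```

It prints nothing if no cluster view line has been logged yet.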
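If a container is attached to multiple networks, this method of picking the IP is not deterministic, and my testing shows that a cluster is then not successfully created/joined.
With the following update to docker-entrypoint.sh, which binds to the address that appears both in the service's DNS records and in the container's own addresses, it works again.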
docker-entrypoint.sh
--- docker-entrypoint.sh.orig 2020-09-15 05:01:53.000000000 -0400
+++ docker-entrypoint.sh 2021-01-07 14:29:41.843423045 -0500
@@ -80,6 +80,12 @@
     BIND=$(hostname --all-ip-addresses)
 fi
 if [[ -z ${BIND_OPTS:-} ]]; then
+    if [[ -n ${DOCKER_SWARM:-} ]]; then
+        SVCIP=$(getent hosts ${SVC_NAME:-keycloak} | awk '{print $1}'| uniq)
+        THIS_IP=$(echo ${SVCIP} ${BIND} | sed 's/ /\n/g' | sort | uniq -d)
+        echo INFO: Using bindip ${THIS_IP} for jgroups
+        BIND=${THIS_IP}
+    fi
     for BIND_IP in $BIND
     do
         BIND_OPTS+=" -Djboss.bind.address=$BIND_IP -Djboss.bind.address.private=$BIND_IP "
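The added lines boil down to a set intersection: the addresses the service name resolves to (every task, thanks to dnsrr) intersected with the addresses this container owns. Here is a minimal standalone sketch of that idea; the function name common_ips is hypothetical, and apart from 10.0.2.66 the sample addresses are made up for illustration. Unlike the patch, it de-duplicates each list first so that a repeated entry in one list cannot be mistaken for a match.

```shell
# Hypothetical helper: print the addresses present in both lists.
# $1: addresses the service name resolves to (all tasks, via dnsrr)
# $2: addresses assigned to this container
common_ips() {
    {
        printf '%s\n' $1 | sort -u
        printf '%s\n' $2 | sort -u
    } | sort | uniq -d
}

# Made-up sample: two task IPs from DNS, three local IPs on this container.
common_ips "10.0.2.66 10.0.2.67" "10.0.2.66 10.0.3.5 172.19.0.4"
```

If more than one address is common to both lists, the entrypoint loop would again emit several bind properties, so this only pins the address down when the service name resolves on a single shared network.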
docker-compose.yaml
version: '3.8'
networks:
  keycloak:
  foo:
secrets:
  pg-password:
    file: ./pg-password
services:
  postgres:
    image: postgres:12
    environment:
      POSTGRES_PASSWORD_FILE: /run/secrets/pg-password
      POSTGRES_DB: keycloak
      POSTGRES_USER: keycloak
    secrets:
      - pg-password
    networks:
      - keycloak
  keycloak:
    image: keycloak:test
    deploy:
      endpoint_mode: dnsrr
      replicas: 2
      placement:
        constraints:
          - "node.platform.os==linux"
        # max_replicas_per_node: 1
      labels:
        traefik.enable: "true"
        traefik.http.routers.keycloak.rule: Host(`keycloak`)
        traefik.http.routers.keycloak.entrypoints: http
        traefik.http.services.keycloak.loadbalancer.server.port: 8080
        traefik.http.services.keycloak.loadbalancer.healthcheck.path: /auth/
    secrets:
      - pg-password
    environment:
      # ROOT_LOGLEVEL: DEBUG
      CACHE_OWNERS_COUNT: 2
      CACHE_OWNERS_AUTH_SESSIONS_COUNT: 2
      DB_ADDR: postgres
      DB_DATABASE: keycloak
      DB_PASSWORD_FILE: /run/secrets/pg-password
      DB_SCHEMA: public
      DB_USER: "keycloak"
      DB_VENDOR: postgres
      DOCKER_SWARM: "true"
      JGROUPS_DISCOVERY_PROTOCOL: dns.DNS_PING
      JGROUPS_DISCOVERY_PROPERTIES: dns_query=keycloak
      KEYCLOAK_PASSWORD: password
      KEYCLOAK_USER: admin
      PROXY_ADDRESS_FORWARDING: "true"
    networks:
      - keycloak
      - foo
  traefik:
    image: traefik:v2.3
    ports:
      - '80:80'
    command:
      - --entrypoints.http.address=:80
      - --providers.docker=true
      - --providers.docker.swarmMode=true
      - --providers.docker.exposedbydefault=false
      - --accesslog
    networks:
      - keycloak
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro