Hi everybody,
I have been working for a few days on installing a Keycloak cluster in Azure on Red Hat machines, as preparation for a later installation on corporate bare-metal Red Hat machines.
The installation on a single server in Azure, with SSL, a Postgres database and an nginx proxy in production mode, works perfectly, but when I try to run a cluster of 2 Red Hat machines with a distributed cache, it doesn’t work.
When I check the Keycloak logs, the servers that should form my cluster never discover each other as “members”, and each one ends up being shown as a “coordinator”, which is incorrect.
I have created rules in firewalld, and ports 80/tcp, 443/tcp, 7800/tcp and 7800/udp are open.
server1 log
2022-10-26 18:54:28,918 INFO [org.jgroups.protocols.pbcast.GMS] (keycloak-cache-init) redhattest1-35666: no members discovered after 2053 ms: creating cluster as coordinator
server2 log
2022-10-26 18:42:50,695 INFO [org.jgroups.protocols.pbcast.GMS] (keycloak-cache-init) redhattest2-34025: no members discovered after 2003 ms: creating cluster as coordinator
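For anyone comparing logs from several nodes, the symptom above can be spotted mechanically. This is only a rough sketch: the function name lone_coordinators is made up for illustration, and the regular expression assumes the GMS message format shown in the two log lines above.

```python
import re

# Matches the JGroups GMS line a node logs when discovery finds no peers.
# The pattern is based only on the two log lines quoted in this post.
COORDINATOR_LINE = re.compile(
    r"\[org\.jgroups\.protocols\.pbcast\.GMS\].*?\s(?P<node>\S+): "
    r"no members discovered after (?P<ms>\d+) ms: creating cluster as coordinator"
)


def lone_coordinators(log_lines):
    """Return {node_name: discovery_ms} for every node that started its own cluster."""
    found = {}
    for line in log_lines:
        match = COORDINATOR_LINE.search(line)
        if match:
            found[match.group("node")] = int(match.group("ms"))
    return found
```

If this returns more than one node for logs that are supposed to belong to the same cluster, the nodes did not find each other during discovery and each one formed its own single-member cluster.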
Here is my main configuration.
nginx.conf
server {
listen 80;
server_name x.y.z;
return 301 https://x.y.z$request_uri;
}
server {
listen 443 ssl;
server_name x.y.z;
ssl_protocols TLSv1.2 TLSv1.3;
ssl_certificate /etc/ssl/x.y.z.fullchain.pem;
ssl_certificate_key /etc/ssl/x.y.z.private.key;
# https://itnext.io/nginx-as-reverse-proxy-in-front-of-keycloak-21e4b3f8ec53
proxy_set_header X-Forwarded-For $proxy_protocol_addr; # To forward the original client's IP address
proxy_set_header X-Forwarded-Proto $scheme; # to forward the original protocol (HTTP or HTTPS)
proxy_set_header Host $host; # to forward the original host requested by the client
location / {
proxy_pass https://10.3.0.4:8443;
}
}
#####################################################################################
}
keycloak.conf
db=postgres
db-url-host=XXXX
db-username=YYYY
db-password=ZZZZ
https-port=8443
https-protocols=TLSv1.3,TLSv1.2
hostname=x.y.z
https-certificate-file=/etc/ssl/certs/x.y.z.fullchain.pem
https-certificate-key-file=/etc/ssl/certs/x.y.z.private.key
proxy=edge
cache=ispn
cache-stack=tcp
cache-config-file=cache-ispn-ha.xml
log=file
log-file=/home/redhat/keycloak-prima/keycloak.out
log-level=INFO,org.infinispan:DEBUG, org.jgroups:DEBUG
cache-ispn-ha.xml
Can you help me please?
Thanks so much,
Xavier.
- You probably checked that, but the cache-config-file location is relative to the conf directory inside the Keycloak installation. That means your keycloak.conf is looking for the file <keycloak_installation>/conf/cache-ispn-ha.xml.
- With a custom infinispan config, I suppose you shouldn’t set cache-stack, as it will take precedence over your custom configuration. That means you are using the default tcp stack defined in the keycloak distribution.
That being said, I see your two instances are in the same subnet. The default, which corresponds to cache=ispn, uses the jgroups UDP transport (take a look at Chapter 7. List of Protocols if curious about how it works), so instances in the same subnet should be able to find each other with IP multicast. Check the OS firewall rules to see whether that is disabled.
Also, according to Configuring distributed caches - Keycloak, the tcp stack uses udp for the discovery phase. That document is pretty complete, but you can also take a look at this issue, as it shows what a working custom jgroups configuration should look like (ignore the fact that this is a bug report, because the problem there was the cache-stack overwriting the settings).
Hope this helps.
The most probable reason your custom cache-ispn-ha.xml configuration isn’t used is that you have configured this:
When using cache-config-file, DON’T use cache-stack at the same time. If cache-stack is given, cache-config-file is ignored.
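A quick way to catch this combination before starting the server is to scan keycloak.conf for both options. This is a minimal sketch: the helper names parse_keycloak_conf and cache_option_conflict are invented here, and the rule it checks (cache-stack overriding cache-config-file) is the one described in this thread, not something the script verifies against Keycloak itself.

```python
def parse_keycloak_conf(text):
    """Parse key=value lines, skipping blank lines and # comments."""
    options = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        options[key.strip()] = value.strip()
    return options


def cache_option_conflict(options):
    """True when both cache-stack and cache-config-file are set.

    Per the advice in this thread, a set cache-stack makes Keycloak
    ignore the custom file named by cache-config-file.
    """
    return "cache-stack" in options and "cache-config-file" in options
```

Running it against the keycloak.conf from the first post would report a conflict, because that file sets cache-stack=tcp and cache-config-file=cache-ispn-ha.xml together.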
Hi @weltonrodrigo and @dasniko, thanks so much for your help.
Yes, I have the cache-ispn-ha.xml in the conf directory. The application reads this configuration correctly.
I have also removed cache-stack=tcp from my configuration.
The final snippet in keycloak.conf is:
cache=ispn
cache-config-file=cache-ispn-ha.xml
The problem is in the discovery of the nodes.
I have this snippet in the cache config file because I would like a simple TCPPING connection for node discovery.
<jgroups>
<stack name="tcpping" extends="tcp">
<TCP bind_port="7800" />
<TCPPING initial_hosts="redhattest1[7800],redhattest2[7800]" port_range="0" max_dynamic_hosts="2"/>
</stack>
</jgroups>
<cache-container name="keycloak">
<transport cluster="mykeycloak" lock-timeout="60000" stack="tcpping" node-name="redhattestttt1"/>
<local-cache name="realms">
I also tested with the internal IP addresses.
It should work, but it doesn’t: when I launch the 2 Keycloak instances, both nodes appear as coordinator (not OK).
I think the problem may be that I am using Azure virtual machines.
I have firewalld open for these ports on both servers, and the AllowVnetInBound rule is enabled in the network settings of each virtual machine in the Azure Portal.
[redhat@redhattest1 ~]$ sudo firewall-cmd --zone=public --permanent --list-ports
80/tcp 443/tcp 7800/tcp 7800/udp
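Since the firewalld listing only shows what is allowed locally, it may help to test from each node whether the other node’s JGroups port actually accepts TCP connections. Here is a small sketch using only the Python standard library; parse_initial_hosts and can_connect are hypothetical helper names, and the host[port] format is taken from the initial_hosts attribute in the snippet above.

```python
import socket


def parse_initial_hosts(value):
    """Split a TCPPING initial_hosts string like 'a[7800],b[7800]' into (host, port) pairs."""
    pairs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, rest = entry.partition("[")
        pairs.append((host, int(rest.rstrip("]"))))
    return pairs


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections and timeouts alike.
        return False
```

For example, on redhattest1 one could loop over parse_initial_hosts("redhattest1[7800],redhattest2[7800]") and call can_connect for each pair while Keycloak is running on both machines. A False result only says that the connection failed; it does not distinguish between a firewall or network rule blocking the port, a name that does not resolve, and nothing listening on that address.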
I want to define this custom TCP solution (or something similar) for discovery, rather than the Azure solution with AZURE_PING, because this is a POC in Azure and I want to install Keycloak Quarkus on a bare-metal corporate infrastructure.
It ought to be possible, but maybe I’ll try the solution with AZURE_PING for now, assume that the discovery problem is specific to Azure, and do the hard work on the bare-metal machines when I get to them.
I think the documentation of the new Keycloak Quarkus (Guides - Keycloak) is simple and clear, but it is insufficient for use cases as common as TCPPING on “normal” Unix servers (not Azure, EC2 or Google).
Any idea?
Thanks for your help.
Xavier.
I’m not that familiar with the infinispan xml configuration, so I’m not sure what is going wrong here. Maybe the extends semantics behave differently from what you expect here?
I’d suggest you turn on trace or debug logging for the jgroups package via quarkus (see Set Quarkus Logging Category Level via Environment Variables - Stack Overflow) and see if anything pops up. I suppose org.jgroups should be enough; if not, add org.infinispan as well.
Hi, @weltonrodrigo,
Thanks so much. I have activated the debug logs, and the issue is clear: it is a jgroups connection problem.
I will create a new post in this group about the Azure connection.
Thanks so much!!
Xavier.