Keycloak operator in OpenShift: HA with infinispan

I’m having trouble understanding what’s required to gracefully restart my Keycloak deployment on OpenShift. Currently I’m testing with 3 instances, and nothing I have tried so far has made sessions survive a deployment rollout.

I read things about affinity and sticky sessions and I’m not sure exactly what I should be wanting, but I think it’s a situation where online user/client sessions are always distributed, so I guess sticky sessions are out the window for me then. Please advise.

I read about many different things, such as “embedded infinispan” and connecting to a remote infinispan cluster. But I can’t get my head around what the actual minimum requirement is for a proper HA setup, however scalable or unscalable.

Even after reading this page, Configuring distributed caches - Keycloak
…I really can’t work out whether it’s implying that I need a remote infinispan cluster, or that I don’t need one because magically (in the context of running Keycloak using the k8s operator, or even regardless of that) each Keycloak instance will have a companion “embedded infinispan” buddy, and those buddies can synchronize their state somehow.

I’ve also seen that people have asked for the ability to share user and client (online) sessions by storing them in a central database, and how this is actually implemented and slated for release in v25.
Now, I’m not sure what I should be doing at this point. It’s either waiting for v25 which seems like a bad idea because I can’t work out when that will be released, or it’s trying to work out how to get the infinispan thing working. Or it’s something else.

Anyone that can help me sort out this mess is my hero. And a nice Friday to the rest of you too :heart:

I’ve been looking at this warning, wondering if I should do what it implies - try to enable a global state.

ISPN000569: Unable to persist Infinispan internal caches as no global state enabled

Maybe I should be more concerned with these messages?

keycloak-0-19748: no members discovered after 2003 ms: creating cluster as coordinator
keycloak-1-47132: no members discovered after 2004 ms: creating cluster as coordinator
keycloak-2-11089: no members discovered after 2004 ms: creating cluster as coordinator

Doesn’t that look like the discovery mechanism times out at 2s? I might have a look at network policies in the namespace, they might be too strict!

Now I’m getting somewhere :slight_smile:

So I tried opening a bunch of ports, but I haven’t worked out yet which ones are actually needed. I know that 7800 is essential, but the others, not sure yet.

kind: NetworkPolicy
apiVersion: networking.k8s.io/v1
metadata:
  name: allow-keycloak-cluster-communication
  namespace: my-namespace
spec:
  podSelector:
    matchLabels:
      app: keycloak
  ingress:
    - ports:
        - protocol: TCP
          port: 7600
        - protocol: TCP
          port: 7800
        - protocol: TCP
          port: 57600
        - protocol: TCP
          port: 8443
      from:
        - podSelector:
            matchLabels:
              app: keycloak
  egress:
    - ports:
        - protocol: TCP
          port: 7600
        - protocol: TCP
          port: 7800
        - protocol: TCP
          port: 57600
        - protocol: TCP
          port: 8443
      to:
        - podSelector:
            matchLabels:
              app: keycloak
  policyTypes:
    - Ingress
    - Egress

Turns out all I needed was to open TCP 7800 between the Keycloak pods, for traffic in both directions (ingress and egress) :partying_face:
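For anyone landing here later, a trimmed-down policy that opens only that port would look roughly like the sketch below. It reuses the namespace and the app: keycloak label from my policy above; the policy name is made up, and it assumes that other policies in the namespace already allow DNS, database and HTTP(S) traffic. NetworkPolicies are additive, but if this were the only egress policy selecting the pods, it would block everything except 7800.

```yaml
# Minimal sketch: allow only cluster traffic (TCP 7800) between Keycloak pods.
# Assumptions: pods carry the label app: keycloak, and DNS/database/HTTP(S)
# traffic is already permitted by other policies in the namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-keycloak-jgroups   # hypothetical name
  namespace: my-namespace
spec:
  podSelector:
    matchLabels:
      app: keycloak
  policyTypes:
    - Ingress
    - Egress
  ingress:
    # Accept cluster traffic from other Keycloak pods
    - from:
        - podSelector:
            matchLabels:
              app: keycloak
      ports:
        - protocol: TCP
          port: 7800
  egress:
    # Allow cluster traffic to other Keycloak pods
    - to:
        - podSelector:
            matchLabels:
              app: keycloak
      ports:
        - protocol: TCP
          port: 7800
```

After applying something like this and restarting the pods, the "no members discovered ... creating cluster as coordinator" lines should stop appearing on every instance if discovery works.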

That’s all folks. Carry on :policeman:
