If you post 1) what you have tried, and 2) what specific problems you are having (including full logs and configurations), then it might be possible to help you. “the keycloak is not working” is neither informative nor helpful.
And I get the following warnings in the error logs of one of the Fargate cluster’s tasks:
WARN [org.infinispan.PERSISTENCE] (keycloak-cache-init) ISPN000554: jboss-marshalling is deprecated and planned for removal
WARN [org.infinispan.CONFIG] (keycloak-cache-init) ISPN000569: Unable to persist Infinispan internal caches as no global state enabled
WARN [io.quarkus.agroal.runtime.DataSources] (main) Datasource <default> enables XA but transaction recovery is not enabled. Please enable transaction recovery by setting quarkus.transaction-manager.enable-recovery=true, otherwise data may be lost if the application is terminated abruptly
WARN [com.arjuna.ats.arjuna] (main) ARJUNA012210: Unable to use InetAddress.getLocalHost() to resolve address.
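As an aside, the XA warning above names its own fix: quarkus.transaction-manager.enable-recovery=true is a raw Quarkus property, which Keycloak can pick up from a conf/quarkus.properties file (a minimal sketch, assuming the Quarkus distribution):

# conf/quarkus.properties
# Enable transaction recovery so XA transactions survive abrupt termination
quarkus.transaction-manager.enable-recovery=true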
Also, I used the following environment variables in the Fargate task definition:
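(Roughly the kind of variables involved, as a sketch with placeholder values; the KC_* names are standard Keycloak options, but the exact set depends on version and setup:)

KC_DB=postgres
KC_DB_URL=jdbc:postgresql://keycloak-db.internal:5432/keycloak
KC_HOSTNAME=auth.example.com
KC_PROXY=edge
KC_CACHE=ispn
KC_CACHE_STACK=ec2

Note that Keycloak’s default udp cache stack relies on multicast, which an AWS VPC (and thus Fargate) does not support, so clustering needs a discovery mechanism that works on AWS, e.g. the ec2 stack or a custom JGroups configuration such as JDBC_PING.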
That happens in the code-to-token flow when the code isn’t valid. By “Sometimes” do you mean it only fails/succeeds intermittently? Are you trying to run 2 separate Keycloak instances behind the same hostname and load balancer? This is the kind of thing I see when someone is balancing traffic between two Keycloak instances that are not actually connected via Infinispan.
I don’t understand. Are you trying to connect them in the same cluster? Is there any indication from the logs that they are discovering each other via Infinispan? If not, you have a problem with your network setup on AWS, and this is not a Keycloak question.
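For reference, when the nodes do discover each other, the Infinispan logs show the cluster view growing, roughly like this (node names invented for illustration; the member count in parentheses should match the number of tasks):

INFO [org.infinispan.CLUSTER] (thread-name) ISPN000094: Received new cluster view for channel ISPN: [node-a|1] (2) [node-a, node-b]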
The network configuration was flawless, and Keycloak worked as expected when I ran the ECS service with only one task. It did not work while Keycloak was running on multiple ECS tasks.
Great. How did you validate that the network configuration was flawless? What did you do to verify that both tasks could communicate over the Infinispan ports?
My guess is that it’s an Infinispan problem: the tasks can’t communicate over the network. If you can validate that those ports are open between the containers, that’s a good first step.
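A quick way to check that (a sketch; assumes ECS Exec is enabled on the service, that the default tcp JGroups stack is in use, which listens on TCP 7800, and that the task IP is a placeholder):

# Open a shell in one of the running Fargate tasks
aws ecs execute-command --cluster my-cluster --task <task-id> --interactive --command "/bin/sh"

# From there, probe the JGroups port of the other task (nc must exist in the image)
nc -vz 10.0.1.23 7800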
Thanks for the pointer. After digging deeper into all the security groups, we found that the security group the Fargate service uses was missing an inbound rule to allow traffic within the security group itself. After adding that, it works like a charm.
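For anyone landing here later, that kind of self-referencing rule looks like this with the AWS CLI (group ID is a placeholder):

# Allow all traffic between resources that share this security group
aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol -1 \
  --source-group sg-0123456789abcdef0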
Do you know if I can tighten down the inbound rule, e.g. which ports and traffic type to specify, rather than allowing all traffic on all ports as it is now? Thanks, @xgp!
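(A possible tightening, assuming the default tcp JGroups stack: restrict the self-referencing rule to TCP 7800, the port that stack listens on. Verify against your actual cache stack first, since failure-detection protocols such as FD_SOCK may need an additional port depending on the version.)

aws ec2 authorize-security-group-ingress \
  --group-id sg-0123456789abcdef0 \
  --protocol tcp \
  --port 7800 \
  --source-group sg-0123456789abcdef0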