Configuring Keycloak on AWS ECS Fargate with Postgres RDS

Hi All

I have set up Terraform for Keycloak 19.x. Keycloak runs on ECS Fargate in private subnets (multi-AZ), and I have configured a simple RDS instance to store all the authentication goodies. After this I noticed two things:

  1. Private zone DNS does not resolve. If I use the IP address of the RDS instance I can connect, or alternatively I can set up a public DNS record with the private IP. So there seems to be an issue with DNS.
  2. Clustering does not work. I will be honest here: I have no idea how to set this up. I tried reading a couple of posts and I cannot wrap my head around it. Following the docs I have tried the EC2 cache stack, but that complains about groups.

Any suggestions?

Many thanks!

I can’t debug without seeing your Terraform, but here are two examples that might be helpful:

Hi!

Thanks for that. The deadlysyn example uses WildFly, so I used it as a guideline for my own implementation. The second one, I am going to be honest, I am not sure how that works.

I created a public repo so you can have a look at the code I have published. Any pointers on the Terraform would also be appreciated, as this is my first attempt at it :laughing: GitHub - ivan-navi-studios/debug-terraform-aws-ecs: Used as a debugging repo. Do not use this for anything

Just to clarify the issues that I am seeing:

  1. The Keycloak docker container is not resolving internal DNS records in the private hosted zone, so when attempting to connect to my RDS instance I get a JDBC connection error. When I run a postgres image in place of the keycloak image and create a psql connection to my RDS instance, it succeeds. So this looks like an issue with the Keycloak docker container itself.

  2. When I start the server using an IP address for the RDS instance, all is well and it connects as expected (the database settings in question are sketched after this list). When I then have a look at the logs, I can’t see that clustering is happening.

  3. When I log into the Keycloak instance, I get redirected to 0.0.0.0, which makes no sense to me.
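For reference, the database connection mentioned in point 2 is configured with the standard Keycloak env vars, roughly like this (a sketch only; the values, database name, and endpoint placeholder are not my real config):

FROM quay.io/keycloak/keycloak:19.0.1
# Standard Quarkus-distribution database settings; point 2 above is about
# swapping the RDS hostname in KC_DB_URL for its private IP.
ENV KC_DB=postgres
ENV KC_DB_URL=jdbc:postgresql://<rds-endpoint-or-private-ip>:5432/keycloak
ENV KC_DB_USERNAME=keycloak
ENV KC_DB_PASSWORD=change-me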

Hi

Just checking in to see if you managed to have a look at this. I really appreciate the help.

OK, this is some strange behaviour.

I dropped the docker container version to v17.0.0 and all of a sudden the private zone DNS resolution started working. After some tinkering with JDBC_PING for JGroups, I managed to get the clustering to work as expected. (It turns out the exposed port for that is 7800 by default, not 7600.)
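For anyone following along, the port change amounts to something like this (a sketch assuming the official image; the ECS security group also needs to allow 7800 between tasks):

FROM quay.io/keycloak/keycloak:17.0.0
# The Quarkus distribution's default JGroups TCP stack binds to port 7800,
# not 7600 as on the old WildFly distribution.
EXPOSE 8080 7800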

I then decided to bump the version to 18.0.0. Now the DNS resolution no longer works when attempting to connect to the RDS instance, but for some reason the JGroups connection to the same RDS instance with the same hostname works perfectly fine.

I then bumped to version 19.0.1. This resulted in nothing connecting… Strange indeed.

Hi Ivan,

We are already running Keycloak 19.0.1 on ECS Fargate with Aurora Postgres in clustering mode, and it is working as expected, so this seems like a config issue on your side. This guide helped us with our infra; maybe you should take a look at it.


Hey Akbar, thanks for this. I had a look at the link provided and cannot see any differences from what we are doing, besides that it is a WildFly installation. Would you be willing to share your ECS config? And perhaps your JGroups configuration?

Hi Ivan,

Please find the JGroups configuration attached; just replace the username and password accordingly. You also need to set ENV JGROUPS_DISCOVERY_PROTOCOL=JDBC_PING in your Dockerfile, and don’t forget to specify the config file name with ENV KC_CACHE_CONFIG_FILE=cache-ispn-jdbc-ping.xml.

<?xml version="1.0" encoding="UTF-8"?>
<infinispan
        xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
        xsi:schemaLocation="urn:infinispan:config:11.0 http://www.infinispan.org/schemas/infinispan-config-11.0.xsd"
        xmlns="urn:infinispan:config:11.0">

  <!-- custom stack goes into the jgroups element -->
  <jgroups>
    <stack name="jdbc-ping-tcp" extends="tcp">
      <JDBC_PING connection_driver="org.postgresql.Driver"
                 connection_username="${env.KC_DB_USERNAME}" connection_password="${env.KC_DB_PASSWORD}"
                 connection_url="${env.KC_DB_URL}"
                 initialize_sql="CREATE TABLE IF NOT EXISTS JGROUPSPING (own_addr varchar(200) NOT NULL, cluster_name varchar(200) NOT NULL, ping_data BYTEA, constraint PK_JGROUPSPING PRIMARY KEY (own_addr, cluster_name));"
                 info_writer_sleep_time="500"
                 remove_all_data_on_view_change="true"
                 stack.combine="REPLACE"
                 stack.position="MPING"/>
    </stack>
  </jgroups>

  <cache-container name="keycloak">
    <!-- custom stack must be referenced by name in the stack attribute of the transport element -->
    <transport lock-timeout="60000" stack="jdbc-ping-tcp"/>
    <local-cache name="realms">
      <encoding>
        <key media-type="application/x-java-object"/>
        <value media-type="application/x-java-object"/>
      </encoding>
      <memory max-count="10000"/>
    </local-cache>
    <local-cache name="users">
      <encoding>
        <key media-type="application/x-java-object"/>
        <value media-type="application/x-java-object"/>
      </encoding>
      <memory max-count="10000"/>
    </local-cache>
    <distributed-cache name="sessions" owners="2">
      <expiration lifespan="-1"/>
    </distributed-cache>
    <distributed-cache name="authenticationSessions" owners="2">
      <expiration lifespan="-1"/>
    </distributed-cache>
    <distributed-cache name="offlineSessions" owners="2">
      <expiration lifespan="-1"/>
    </distributed-cache>
    <distributed-cache name="clientSessions" owners="2">
      <expiration lifespan="-1"/>
    </distributed-cache>
    <distributed-cache name="offlineClientSessions" owners="2">
      <expiration lifespan="-1"/>
    </distributed-cache>
    <distributed-cache name="loginFailures" owners="2">
      <expiration lifespan="-1"/>
    </distributed-cache>
    <local-cache name="authorization">
      <encoding>
        <key media-type="application/x-java-object"/>
        <value media-type="application/x-java-object"/>
      </encoding>
      <memory max-count="10000"/>
    </local-cache>
    <replicated-cache name="work">
      <expiration lifespan="-1"/>
    </replicated-cache>
    <local-cache name="keys">
      <encoding>
        <key media-type="application/x-java-object"/>
        <value media-type="application/x-java-object"/>
      </encoding>
      <expiration max-idle="3600000"/>
      <memory max-count="1000"/>
    </local-cache>
    <distributed-cache name="actionTokens" owners="2">
      <encoding>
        <key media-type="application/x-java-object"/>
        <value media-type="application/x-java-object"/>
      </encoding>
      <expiration max-idle="-1" lifespan="-1" interval="300000"/>
      <memory max-count="-1"/>
    </distributed-cache>
  </cache-container>
</infinispan>

@akbar1214 Thanks so much for sharing this! Pardon the complete n00b question, but where would I put the JGroups XML chunk you described? Is there a spot on the docker filesystem I should copy it to during the build?

Into the conf/ dir, and then tell Keycloak it’s there by setting the appropriate env var:

KC_CACHE_CONFIG_FILE=cache-whatever-you-named-it.xml
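For example, a minimal Dockerfile sketch (the image tag and file name are illustrative; /opt/keycloak/conf/ is the conf dir in the official image):

FROM quay.io/keycloak/keycloak:19.0.1
# Ship the custom Infinispan/JGroups config inside the image...
COPY cache-ispn-jdbc-ping.xml /opt/keycloak/conf/
# ...and tell Keycloak to use it by file name.
ENV KC_CACHE_CONFIG_FILE=cache-ispn-jdbc-ping.xml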