Getting 401 on whoami when logging in, in standalone-ha mode on ECS with 2 Fargate nodes

I’m getting a 401 on the whoami request:

  • only in this environment

  • only when more than a single Fargate node is active

This is for Keycloak running in a container in standalone-ha mode on ECS.

We use Terraform to configure the ECS service/task(s), so the configurations (apart from environment-specific security groups, subnets, etc.) are the same.

We don’t have this issue in our test environment, but it is chronic in our staging environment.

This suggests that our standalone-ha.xml file is correct. Here are the env vars passed into the task:

CHECKIT_ENV staging
DB_ADDR
DB_DATABASE
DB_PASSWORD
DB_PORT 5432
DB_USER
DB_VENDOR postgres
JGROUPS_DISCOVERY_PROPERTIES datasource_jndi_name=java:jboss/datasources/KeycloakDS,info_writer_sleep_time=500,remove_old_coords_on_view_change=true
JGROUPS_DISCOVERY_PROTOCOL JDBC_PING
KEYCLOAK_FRONTEND_URL
KEYCLOAK_LOGLEVEL INFO
PROXY_ADDRESS_FORWARDING true

Any support and hints would be welcome.

I would use standard debugging: increase the log level and check the logs. The behaviour indicates a clustering problem, so check whether the nodes are discovered correctly and can talk to each other without any issues.
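
If DEBUG on the root category is too noisy, you can raise just the clustering-related categories with the WildFly CLI on a running node (a sketch, assuming the stock jboss/keycloak image paths):

# Raise the JGroups/Infinispan log categories to DEBUG on a running
# server; run inside the container.
/opt/jboss/keycloak/bin/jboss-cli.sh --connect <<'EOF'
/subsystem=logging/logger=org.jgroups:add(level=DEBUG)
/subsystem=logging/logger=org.infinispan:add(level=DEBUG)
EOF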

Thanks @jangaraj, I should have mentioned that I’ve done that, at DEBUG level, and it reveals nothing.
I should also state that I’m connecting to an external PostgreSQL instance.

OK.
I’ve done more investigation and it seems that JDBC_PING is not being invoked.
I have made sure that the right JDBC_PING.cli file is in /opt/jboss/tools/cli/jgroups/discovery/ and have updated standalone-ha.xml to use JDBC_PING:

<stacks>
    <stack name="tcp">
        <transport type="TCP" socket-binding="jgroups-tcp">
            <property name="external_addr">${env.EXTERNAL_ADDR}</property>
        </transport>
        <!-- JDBC_PING discovery replaces the default MPING (left commented out below) -->
        <protocol type="org.jgroups.protocols.JDBC_PING">
            <property name="connection_driver">org.postgresql.Driver</property>
            <property name="connection_url">jdbc:postgresql://${env.DB_ADDR:postgres}:${env.DB_PORT:5432}/${env.DB_DATABASE:keycloak}</property>
            <property name="connection_username">${env.DB_USER:keycloak}</property>
            <property name="connection_password">${env.DB_PASSWORD:password}</property>
            <property name="datasource_jndi_name">java:jboss/datasources/KeycloakDS</property>
            <property name="clear_table_on_view_change">true</property>
            <property name="initialize_sql">CREATE TABLE IF NOT EXISTS JGROUPSPING (own_addr varchar(200) NOT NULL,bind_addr varchar(200) NOT NULL,created timestamp NOT NULL,cluster_name varchar(200) NOT NULL,ping_data BYTEA,constraint PK_JGROUPSPING PRIMARY KEY (own_addr, cluster_name))</property>
            <property name="insert_single_sql">INSERT INTO JGROUPSPING (own_addr, bind_addr, created, cluster_name, ping_data) values (?,'${jgroups.bind.address:127.0.0.1}',NOW(), ?, ?)</property>
            <property name="select_all_pingdata_sql">SELECT ping_data FROM JGROUPSPING WHERE cluster_name=?;</property>
            <property name="delete_single_sql">DELETE FROM JGROUPSPING WHERE own_addr=? AND cluster_name=?</property>
        </protocol>
        <!-- <transport type="TCP" socket-binding="jgroups-tcp"/> -->
        <!-- <socket-protocol type="MPING" socket-binding="jgroups-mping"/> -->
        <protocol type="MERGE3"/>
        <socket-protocol type="FD_SOCK" socket-binding="jgroups-tcp-fd"/>
        <protocol type="FD_ALL"/>
        <protocol type="VERIFY_SUSPECT"/>
        <protocol type="pbcast.NAKACK2"/>
        <protocol type="UNICAST3"/>
        <protocol type="pbcast.STABLE"/>
        <protocol type="pbcast.GMS"/>
        <protocol type="MFC"/>
        <protocol type="FRAG3"/>
    </stack>
</stacks>

I’ve also set the JGROUPS_DISCOVERY_PROTOCOL env var to “JDBC_PING”.
But the JGROUPSPING table never gets created in the PostgreSQL DB.
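
A quick way to check that from the database side (a sketch, assuming psql can reach the instance with the same DB_* values as the task environment; if the table was never created, this fails with a “relation … does not exist” error):

# Does the discovery table exist, and has any node registered itself?
PGPASSWORD="${DB_PASSWORD}" psql \
  -h "${DB_ADDR}" -p "${DB_PORT:-5432}" \
  -U "${DB_USER}" -d "${DB_DATABASE}" \
  -c "SELECT own_addr, bind_addr, cluster_name, created FROM jgroupsping;"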

Any pointers would be very, VERY gratefully received.

Use the current official Keycloak Docker image (older images didn’t have JDBC Ping) and configure the proper env variables; see the Keycloak blog post “Keycloak and JDBC Ping”.

Usually you don’t need to update standalone-ha.xml manually.

Jan, I did exactly that.
Here is the Dockerfile:

FROM jboss/keycloak:15.0.2

# jq (and its oniguruma dependency) is needed by the startup script
# to parse the ECS task metadata endpoint.
USER root
RUN curl -O http://mirror.centos.org/centos/8/AppStream/x86_64/os/Packages/oniguruma-6.8.2-2.el8.x86_64.rpm \
    && rpm -ivh oniguruma-6.8.2-2.el8.x86_64.rpm
RUN curl -O http://mirror.centos.org/centos/8/AppStream/x86_64/os/Packages/jq-1.5-12.el8.x86_64.rpm \
    && rpm -ivh jq-1.5-12.el8.x86_64.rpm

USER jboss
WORKDIR /opt/jboss

# Postgres-aware JDBC_PING discovery CLI, HA config, Postgres driver
# module, custom themes, and our own entrypoint.
COPY cli/JDBC_PING.cli tools/cli/jgroups/discovery/
COPY standalone-ha.xml keycloak/standalone/configuration/
COPY postgresql/ keycloak/modules/system/layers/keycloak/org/
COPY themes/ keycloak/themes/
COPY docker-entrypoint.sh .

# JGroups TCP port for cluster traffic.
EXPOSE 7600

ENTRYPOINT [ "/opt/jboss/docker-entrypoint.sh", "-b", "0.0.0.0" ]

We had to do this because:

  1. we use custom themes
  2. we have to load the Postgres JDBC driver (see the layout sketch after this list)
  3. we load up the JDBC_PING.cli that supports Postgres
  4. we install jq to support getting the ECS task’s IP address in the startup script
  5. we point to the Postgres JDBC driver in standalone-ha.xml
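
For reference on point 2, the copied postgresql/ tree is laid out as a standard WildFly JDBC driver module. A sketch of what that looks like, where the module path and jar version are assumptions rather than our exact contents:

# Assumed layout of the copied module tree (module path and jar
# version are illustrative, not our exact contents):
find keycloak/modules/system/layers/keycloak/org/postgresql
# keycloak/modules/system/layers/keycloak/org/postgresql/main
# keycloak/modules/system/layers/keycloak/org/postgresql/main/module.xml
# keycloak/modules/system/layers/keycloak/org/postgresql/main/postgresql-42.2.24.jar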

The startup script is:

#!/bin/bash

# Use when testing jq locally:
# ECS_CONTAINER_METADATA_URI_V4=$(cat aws_test.json)

echo "getting the container's IP address for Keycloak clustering"
# echo "ECS_CONTAINER_METADATA_URI_V4 is ${ECS_CONTAINER_METADATA_URI_V4}"
curl -s "${ECS_CONTAINER_METADATA_URI_V4}" > container_metadata.json
# cat container_metadata.json
export EXTERNAL_ADDR=$(jq -r '.Networks[0].IPv4Addresses[0]' container_metadata.json)
echo "EXTERNAL_ADDR is ${EXTERNAL_ADDR}"
export JGROUPS_DISCOVERY_EXTERNAL_IP="${EXTERNAL_ADDR}"
# exec replaces this shell, so nothing after this line runs.
exec /opt/jboss/keycloak/bin/standalone.sh --server-config=standalone-ha.xml "$@" \
    -Djboss.bind.address="${EXTERNAL_ADDR}" \
    -Djboss.bind.address.private="${EXTERNAL_ADDR}"

Here we:

  1. use jq to get the container’s (ECS task) IP address from the ECS metadata endpoint (see the sketch after this list)
  2. export that address as the JGROUPS_DISCOVERY_EXTERNAL_IP env var (although that seems to have done nothing)
  3. use the IP address in -Djboss.bind.address="${EXTERNAL_ADDR}" -Djboss.bind.address.private="${EXTERNAL_ADDR}" (more on these later)
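
For anyone reproducing this, the slice of the ECS v4 task metadata that the script reads looks roughly like the abbreviated sketch below (only the fields the jq path touches are shown; values illustrative):

# Abbreviated sketch of the ECS v4 container metadata payload and the
# jq path the startup script applies to it.
cat <<'EOF' | jq -r '.Networks[0].IPv4Addresses[0]'
{
  "Networks": [
    { "NetworkMode": "awsvpc", "IPv4Addresses": ["10.0.1.23"] }
  ]
}
EOF
# prints: 10.0.1.23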

What we found was:

  1. NOTHING happens re JDBC_PING unless this element in standalone-ha.xml is changed (replace the ‘udp’ stack value with ‘tcp’; a scripted alternative is sketched after this list):
<channels default="ee">
     <channel name="ee" stack="tcp" cluster="ejb"/>
</channels>
  2. The JDBC_PING.cli just isn’t run (maybe because the jboss user has to own it and JGroups doesn’t see it while root still owns it after the Dockerfile COPY?), hence the change above not being made automatically and the other changes needing to be made in standalone-ha.xml by hand.
  3. Unless BOTH of these parameters are passed in, you’ll get our friend “401 on whoami”, even with the JGROUPSPING table created and used:
-Djboss.bind.address="${EXTERNAL_ADDR}" -Djboss.bind.address.private="${EXTERNAL_ADDR}"
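
Incidentally, the channel change from point 1 can be scripted in the same style as the discovery .cli files rather than hand-edited (a sketch, assuming the stock image layout; run offline, e.g. at image build time):

# Sketch: switch the "ee" channel from the udp to the tcp stack with
# jboss-cli against an embedded (offline) server, instead of editing
# standalone-ha.xml by hand.
/opt/jboss/keycloak/bin/jboss-cli.sh <<'EOF'
embed-server --server-config=standalone-ha.xml --std-out=echo
/subsystem=jgroups/channel=ee:write-attribute(name=stack, value=tcp)
stop-embedded-server
EOF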

However, none of this information is highlighted, or even documented. The last point I found purely by chance when searching on another issue (and it was in an obscure user post)!

You don’t have to copy & specify a Postgres driver manually; this is done automatically when setting DB_VENDOR=postgres.

Configuring JDBC_PING can also be done via env vars; no need to copy or modify anything manually. See here for an example:
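
For the env-var route, a minimal sketch using only the variables already listed at the top of this thread (values illustrative, not real endpoints):

# Minimal sketch: JDBC_PING configured purely via env vars, no file
# edits or custom image. Values are illustrative.
docker run \
  -e DB_VENDOR=postgres \
  -e DB_ADDR=mydb.example.com -e DB_PORT=5432 \
  -e DB_DATABASE=keycloak -e DB_USER=keycloak -e DB_PASSWORD=secret \
  -e JGROUPS_DISCOVERY_PROTOCOL=JDBC_PING \
  -e JGROUPS_DISCOVERY_PROPERTIES=datasource_jndi_name=java:jboss/datasources/KeycloakDS \
  jboss/keycloak:15.0.2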

@dasniko unfortunately, as the start of this issue states, I used the env vars but it did not automate anything.
Which is why I opened this issue.
All this stuff may work nicely in standalone Docker containers, but not when orchestrated as Fargate instances in ECS.

All this stuff works nicely in my clustered AWS EC2 environment. No manual stuff needed.
It’s a bit of work, yes, but it’s possible!

Have you tried it with Fargate in ECS, not standalone EC2 instances?

See also Keycloak clustering · Issue #95 · aws-samples/keycloak-on-aws on GitHub. There is usually no need for any customization; I guess the problem comes from your own customization.
Recommendation: start simple with the vanilla Docker image, and then, when standard Keycloak is running in clustered mode properly, add your own customizations.

Thanks Jan, but we were migrating from an existing set of implementations (test, staging, production) that were just single standalone implementations. The people who set these up had left the organisation, so we found it hard to find anywhere that described a standard cluster in ECS to do a base implementation from.

That link you gave, although referencing Fargate, uses TypeScript and the AWS CDK, so it would take almost as much unpacking as doing what I did.

But the most important aspect is that it is only using standalone Fargate instances, like standalone EC2 instances, NOT managed by ECS.

I really don’t understand. You are complaining that something is not working, but you didn’t mention in the first place that you have a lot of customizations.

For example, a custom entrypoint, which apparently doesn’t execute the vendor’s jgroups.sh, which sets up the JGroups configuration in the container.
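
A custom entrypoint can still hand off to the vendor script so that handling runs; a sketch, assuming the stock image’s /opt/jboss/tools/docker-entrypoint.sh:

# Sketch: compute the address, then delegate to the vendor entrypoint
# (which runs the jgroups discovery handling) instead of calling
# standalone.sh directly.
export EXTERNAL_ADDR=$(curl -s "${ECS_CONTAINER_METADATA_URI_V4}" \
    | jq -r '.Networks[0].IPv4Addresses[0]')
export JGROUPS_DISCOVERY_EXTERNAL_IP="${EXTERNAL_ADDR}"
exec /opt/jboss/tools/docker-entrypoint.sh "$@" \
    -Djboss.bind.address="${EXTERNAL_ADDR}" \
    -Djboss.bind.address.private="${EXTERNAL_ADDR}"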

You should provide a reproducible example (+ infrastructure, + hire someone). This looks like a very long shot for a community forum (so you didn’t open a real issue, only a discussion with random people who are willing to read about your problem). I have autoscaled apps on AWS ECS Fargate in prod (but not Keycloak), but this problem is 🤷.

I really don’t see a reason for any of your customizations (only the custom theme, but that can be delivered from a sidecar container, so it wouldn’t need a custom Keycloak image build at all).


Please don’t get aggressive because you can’t answer the question or solve the problem, given that Keycloak has NO working example or documentation for running in Fargate on ECS.
If they tried it they would find that it doesn’t work out of the box, because the JDBC_PING.cli file is not run for an existing implementation (read: one with data already in the database).
The only customisation we have is the themes, nothing else.
No one who has responded has been able to refute what I’m saying.
It’s not my fault that it doesn’t work automatically while the documentation assumes it will.

In case you have misunderstood: Keycloak IS working in Fargate on ECS, BUT only with the manual configs I’ve pointed out.

Sorry to be aggressive. I’m ignoring this thread.