Can't restart docker container

Hello, I am using the official Docker image for the latest version (10.0.2) and trying to set up a standalone-HA cluster using TCPPING, as explained here: https://www.keycloak.org/2019/04/keycloak-cluster-setup.html

I’ve managed to create the cluster and access it through a load balancer/proxy; realm creation and authentication all work fine. The problem is that if a container crashes or restarts for any reason (manual or not), it won’t come up again, and the logs show nothing useful. If I docker rm the container and start a new one, it joins the cluster normally and everything works. But I can’t have this be the standard procedure for every docker restart or random crash.

Here are the logs after the cluster is formed and I issue a docker restart command:
https://hatebin.com/cuidimuihn

I’ve searched for WFLYCTL0212-related issues, and from what I can tell this is caused by the change-database.cli script found under …/tools/cli/databases/postgres/change-database.cli, which is called twice (once for standalone and once for standalone-ha, hence the failure appearing twice in the logs).

After I patched the entrypoint to check whether this is the first run, I’ve managed to suppress the duplicate resource errors, but the container still won’t start and produces no additional logs.
It just shows:

Setting JGroups discovery to TCPPING with properties {initial_hosts=>"auth-server01[7600],auth-server02[7600]",port_range=>"100"}
User with username 'admin' already added to '/opt/jboss/keycloak/standalone/configuration/keycloak-add-user.json'

and then keeps restarting, repeating the "admin already added" line over and over.
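
For illustration, a first-run check of the kind mentioned above could be sketched roughly like this. It is not the exact patch I applied: the run_once helper and the marker file are hypothetical names, and the sketch assumes the marker lives on the container's own filesystem, which survives a docker restart but not a docker rm.

```shell
#!/bin/sh
# Hypothetical first-run guard (names are illustrative, not from the image).
# Usage: run_once MARKER COMMAND [ARGS...]
# Runs COMMAND only if MARKER does not exist yet, and creates MARKER
# only when COMMAND succeeds, so a failed step is retried on the next start.
run_once() {
    marker="$1"
    shift
    if [ -e "$marker" ]; then
        echo "skipping: $marker already present"
        return 0
    fi
    "$@" && touch "$marker"
}
```

In the entrypoint, the one-time database configuration step would then be wrapped as something like run_once /some/marker/file configure_database, where both the marker path and configure_database are placeholders for whatever the entrypoint actually runs.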

This is how I start the container:

- name: "Install and start keycloak container"
  docker_container:
    name: keycloak
    image: "{{ docker_registry }}keycloak:10.0.2"
    state: started
    pull: "{{ docker_pull }}"
    volumes:
      - "/opt/TCPPING.cli:/opt/jboss/tools/cli/jgroups/discovery/TCPPING.cli"
    published_ports:
      - 7600-7700:7600-7700 # TCPPING
      - 8080:8080 # frontend
      - 9990:9990 # wildfly mgmt & metrics
    restart_policy: unless-stopped
    env:
      KEYCLOAK_USER: "{{ keycloak_server_user }}"
      KEYCLOAK_PASSWORD: "{{ keycloak_server_password }}"
      DB_VENDOR: postgres
      DB_USER: "{{ keycloak_db_user }}"
      DB_PASSWORD: "{{ keycloak_db_password }}"
      DB_ADDR: "{{ groups['keycloak_db_servers'][0] }}:5432"
      PROXY_ADDRESS_FORWARDING: "true"
      JGROUPS_DISCOVERY_PROTOCOL: TCPPING
      JGROUPS_DISCOVERY_PROPERTIES_DIRECT: |
        {initial_hosts=>"{{ keycloak_server_initial_hosts }}",port_range=>"100"}
      JGROUPS_DISCOVERY_EXTERNAL_IP: "{{ inventory_hostname }}"
      KEYCLOAK_STATISTICS: all

Here is a question with a similar (possibly the same) issue, over on SO: https://stackoverflow.com/questions/62402630/keycloak-docker-container-fails-to-start-after-restarting-the-container


I’ve just run into this issue as well. I know it’s all well and good to expect the container to be destroyed and recreated, but I ran into this when the container crashed.

It was my fault for opening too many connections to a Postgres database, which caused Keycloak to crash, but I would expect the container to be able to recover :(