cardano-db-sync container doesn't gracefully shut down #1945

Open
@TrevorBenson

Description

OS
Your OS: Ubuntu & RockyLinux

Versions
The db-sync version (eg cardano-db-sync --version): 13.6.0.4
PostgreSQL version: 15.10

Build/Install Method
The method you use to build or install cardano-db-sync: ghcr.io/intersectmbo/cardano-db-sync:13.6.0.4 container image.

Run method
The method you used to run cardano-db-sync (eg Nix/Docker/systemd/none): [docker|podman] run or by creating Quadlet (podman-systemd) based unit files.

Problem Report
The cardano-db-sync Docker container doesn't shut down gracefully when docker stop is used. Instead of responding to SIGTERM, it eventually gets killed by SIGKILL after the 10-second grace period. This may lead to unclean shutdowns.

Expected behavior

The container should shut down gracefully upon receiving SIGTERM. For cardano-db-sync, a graceful shutdown is typically triggered by the application responding to a SIGINT signal.

Current behavior

The cardano-db-sync process inside the container doesn't respond to SIGTERM. The scripts that launch cardano-db-sync don't forward signals or handle shutdown requests, so the main process never receives a SIGINT, and the container is forcibly killed.

Additional context

The TL;DR breakdown:

# podman top preview-db-sync 
USER        PID         PPID        %CPU        ELAPSED              TTY         TIME        COMMAND
root        1           0           0.000       20h12m56.722117998s  ?           0s          /nix/store/izpf49b74i15pcr9708s3xdwyqs4jxwl-bash-5.2p32/bin/bash /nix/store/4k0flcssdaway1pzs39valv7278r585s-cardano-db-sync-preview/bin/cardano-db-sync-preview 
root        6           1           4.659       20h12m56.722279625s  ?           56m31s      /nix/store/8yrmmrdl0zgpzish0pfcji3630lgqrmx-cardano-db-sync-exe-cardano-db-sync-13.6.0.4/bin/cardano-db-sync --config /nix/store/f3mggncdz0284z9cykzv8nd1ccq31n7i-db-sync-config.json --socket-path /node-ipc/node.socket --schema-dir /nix/store/npsidz34y67jp7sc07b2iw7s2n3fp9lj-schema --state-dir /var/lib/cexplorer 
  1. The container entrypoint uses exec when calling the cardano-db-sync-${network} bash script.
    elif [[ "$NETWORK" == "${env}" ]]; then
    echo "Connecting to network: ${env}"
    exec ${dbSyncScript}/bin/${dbSyncScript.name}
    echo "Cleaning up"
  2. The cardano-db-sync-${network} bash script:
    • Does not use exec, so it remains the primary process of the container (PID 1), and does not use trap to pass signals along to the cardano-db-sync process it starts.
      #!${runtimeShell}
      set -euo pipefail
      ${service.script} $@
  3. The cardano-db-sync binary does not appear to handle SIGTERM.
  4. Even correcting the stop signal via --stop-signal=SIGINT when creating the container would not change the behavior, as the signal would only reach the bash wrapper script for the given network/environment, not the binary.
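
One possible direction for item 2, as a minimal sketch rather than a tested fix for the image: have the wrapper exec the real command so that no intermediate bash process remains. The launch function and the demonstration below are hypothetical; in the real script the command would be the Nix-interpolated ${service.script}, which is not reproduced here.

```shell
#!/usr/bin/env bash
# Sketch of a wrapper that hands its PID over to the real process.
# Assumption: the arguments stand in for ${service.script} and its options.
set -euo pipefail

launch() {
  # exec replaces the current shell instead of forking a child, so no
  # intermediate bash process is left between the runtime and the command.
  exec "$@"
}

# Demonstration: print the PID of a subshell, then the PID seen by the
# command launched from that subshell. With exec, both lines are the same PID.
pids=$(echo "$BASHPID"; launch bash -c 'echo "$$"')
echo "$pids"
```

If the wrapper exec'd the binary like this, cardano-db-sync would inherit PID 1 from the entrypoint chain, and a stop signal chosen with --stop-signal=SIGINT would be delivered to the binary directly. Whether the binary then shuts down cleanly is not something this sketch verifies.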

To Reproduce

  1. Run the cardano-db-sync Docker container.
  2. Issue [docker|podman] stop <container_id>.
  3. Observe that:
    • With docker the container stop takes 10 seconds, reaching its stop timeout, so SIGKILL would be sent.
    • With podman the container stop returns a warning after 10 seconds, notifying the user it resorts to SIGKILL.
      WARN[0010] StopSignal SIGTERM failed to stop container cardano-db-sync in 10 seconds, resorting to SIGKILL.
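
The reproduction steps can be timed with a small helper, sketched here under the assumption that docker or podman is on the PATH. The time_stop function is a hypothetical name; the container name in the example comes from the podman top output above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# time_stop RUNTIME CONTAINER
# Print how many whole seconds `RUNTIME stop CONTAINER` takes to return.
time_stop() {
  local runtime=$1 container=$2 start end
  start=$(date +%s)
  "$runtime" stop "$container" >/dev/null
  end=$(date +%s)
  echo $(( end - start ))
}

# Example invocation (not run here):
#   time_stop podman preview-db-sync
```

A result of about 10 seconds would match the default stop timeout described in step 3, meaning the container was killed rather than stopped gracefully.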
      

Labels: bug (Something isn't working)
