Debug: Service Restarts And Dependency Exits¶
Use this guide when a service unexpectedly exits, keeps restarting, or appears to hang while waiting for another component.
What Controls Restart Behavior¶
There are two layers involved:
- scripts/lib/watch-dependencies.sh: This wrapper waits for declared dependencies before starting the real process. If a dependency disappears later, it stops the child process and exits with status 1.
- Docker restart policy: Most core services in docker-compose.yml use SERVICE_RESTART_POLICY.
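To make the wrapper's described behavior concrete, here is a minimal sketch of a wait-then-watch loop. It is not the contents of scripts/lib/watch-dependencies.sh. The function name `watch_and_run`, the probe argument, and the `POLL_SECONDS` variable are all hypothetical names chosen for illustration.

```shell
# Illustrative sketch only; this is NOT the real watch-dependencies.sh.
# dep_check: hypothetical probe command string that exits 0 while the dependency is reachable.
# POLL_SECONDS: hypothetical name for the poll interval (defaults to 2 here).
watch_and_run() {
  dep_check="$1"
  shift
  interval="${POLL_SECONDS:-2}"

  # Phase 1: block until the dependency probe succeeds.
  until sh -c "$dep_check"; do
    echo "[sketch] dependency unavailable, retrying in ${interval}s" >&2
    sleep "$interval"
  done

  # Phase 2: start the real process in the background and remember its PID.
  "$@" &
  child=$!

  # Phase 3: keep probing while the child is alive.
  while kill -0 "$child" 2>/dev/null; do
    if ! sh -c "$dep_check"; then
      echo "[sketch] dependency lost, stopping child" >&2
      kill "$child" 2>/dev/null || true
      wait "$child" 2>/dev/null || true
      return 1
    fi
    sleep "$interval"
  done

  # The child ended on its own: pass its exit status through.
  wait "$child"
}
```

The exit status 1 on dependency loss is what the Docker restart policy then reacts to: with `no` the container stays down, with `unless-stopped` Docker starts it again.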
Default behavior by environment:
- Mainnet: SERVICE_RESTART_POLICY=unless-stopped
- Galleon testnet: SERVICE_RESTART_POLICY=no
- Dev: SERVICE_RESTART_POLICY=no
Two services currently pin restart: unless-stopped directly in Compose:
- node-health-check-client
- atan-uploader
Dependency Map¶
flowchart LR
executionLayer["execution-layer"] --> kaspad["kaspad"]
executionLayer --> rpcProvider["rpc-provider-*"]
kaspad --> kaswallet["kaswallet-*"]
kaswallet -->|"RPC_READ_ONLY=false"| rpcProvider
rpcProvider --> traefik["traefik"]
Restart Frontend Without Touching Backend¶
frontend-w* profiles do not need to activate backend services directly. For a full stack start, bring up both profiles:
If backend is already running and you only want to bounce frontend services, restart the frontend profile directly:
To remove and recreate frontend containers without touching backend:
For fewer workers, replace frontend-w5 with frontend-w1 through frontend-w4.
This works because kaspad and execution-layer are not part of the frontend-w* profile set, so profile-scoped restart and down only target frontend services.
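The operations described in this section could look like the sketch below. The function names are hypothetical. The backend profile name does not appear in this guide, so it is passed in as an argument rather than guessed. `frontend-w5` is used as the default frontend profile because it is the one named above.

```shell
# Hypothetical helpers; function names and the FRONTEND_PROFILE variable are assumptions.
FRONTEND_PROFILE="${FRONTEND_PROFILE:-frontend-w5}"

# Full stack start: $1 is the backend profile name (check docker-compose.yml for it).
stack_up() {
  docker compose --profile "$1" --profile "$FRONTEND_PROFILE" up -d
}

# Bounce frontend services in place; the containers are kept.
frontend_restart() {
  docker compose --profile "$FRONTEND_PROFILE" restart
}

# Remove and recreate frontend containers only.
frontend_recreate() {
  docker compose --profile "$FRONTEND_PROFILE" down
  docker compose --profile "$FRONTEND_PROFILE" up -d
}
```

For example, `FRONTEND_PROFILE=frontend-w2 frontend_restart` bounces a two-worker frontend.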
Quick Triage¶
1. Check current container state¶
docker compose ps
This tells you whether the container is currently Up, Exited, or flapping.
2. Check whether Docker actually restarted it¶
docker inspect -f 'name={{.Name}} restart_count={{.RestartCount}} status={{.State.Status}} exit_code={{.State.ExitCode}} started={{.State.StartedAt}} finished={{.State.FinishedAt}} error={{.State.Error}}' rpc-provider-0
Interpretation:
- restart_count=0 with status=running: the container has not restarted yet
- restart_count=0 with status=exited: it exited and stayed down
- restart_count>0: Docker restarted the same container at least once
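The interpretation rules can be encoded in a small helper if you want to script the check. This is a sketch; the function name `interpret_restart_state` and the output labels are hypothetical.

```shell
# Hypothetical helper: maps (restart count, status) to the interpretations in this guide.
interpret_restart_state() {
  count="$1"
  status="$2"
  if [ "$count" -gt 0 ]; then
    echo "restarted: Docker restarted the same container at least once"
  elif [ "$status" = "running" ]; then
    echo "running: the container has not restarted yet"
  elif [ "$status" = "exited" ]; then
    echo "exited: it exited and stayed down"
  else
    echo "other: status=$status with no restarts recorded"
  fi
}

# Example call against a live container (left commented so the block has no side effects):
# interpret_restart_state \
#   "$(docker inspect -f '{{.RestartCount}}' rpc-provider-0)" \
#   "$(docker inspect -f '{{.State.Status}}' rpc-provider-0)"
```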
3. Read recent logs with timestamps¶
docker compose logs --timestamps --since 30m rpc-provider-0
docker compose logs --timestamps --since 30m kaspad
docker compose logs --timestamps --since 30m traefik
Use -f if you want to keep watching:
docker compose logs -f --timestamps rpc-provider-0
4. Check Docker restart events¶
docker events --since 15m --filter container=rpc-provider-0
This helps confirm whether Docker restarted the container or whether it simply stayed up and kept polling for dependencies.
How To Tell What Happened¶
Service is waiting for a dependency¶
Typical signals:
- docker compose ps shows the container as Up
- restart_count=0
- logs repeatedly show [watch-dependencies] ... unavailable ...
- you do not see the real application startup logs yet
This means the wrapper is still running and the child command has not started.
Service exited and stayed down¶
Typical signals:
- docker compose ps shows Exited
- restart_count=0
- logs end with a dependency-loss message from watch-dependencies.sh
This is expected when restart policy is no.
Service restarted after an error¶
Typical signals:
- restart_count>0
- logs contain an earlier failure, then a fresh startup sequence later
- docker events shows restart-related lifecycle events
This is expected on mainnet for services using SERVICE_RESTART_POLICY=unless-stopped.
Are Logs Persistent After Restart¶
Yes, in the current default setup they are.
The environment examples set:
With the json-file driver:
- docker logs and docker compose logs continue to show earlier log lines after a restart of the same container
- the log stream is appended across restarts of that container
- log rotation still applies because Compose sets max-size, max-file, and compress
Important distinction:
- Restarted container: same container, logs remain visible via docker logs
- Recreated container: new container ID, old logs stay attached to the old container until it is removed
To compare container identity:
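One way to do this, as a minimal sketch: print the container ID and creation time before and after the event you are investigating. The helper name `container_identity` is hypothetical; `.Id` and `.Created` are standard `docker inspect` fields.

```shell
# Hypothetical helper: prints the full container ID and creation timestamp.
container_identity() {
  docker inspect -f 'id={{.Id}} created={{.Created}}' "$1"
}

# Example (commented out so the block has no side effects):
# before="$(container_identity rpc-provider-0)"
# ... restart or recreate happens ...
# after="$(container_identity rpc-provider-0)"
# [ "$before" = "$after" ] && echo "same container (restarted)" || echo "new container (recreated)"
```

An unchanged ID means the container was restarted and its logs are still attached. A changed ID means it was recreated.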
If you switch to LOGGING_DRIVER=syslog, persistence depends on the host logging system instead of Docker's json-file storage.
Common Signatures¶
[watch-dependencies] execution-layer is unavailable at execution-layer:8545¶
Meaning:
- the wrapper could not reach the execution layer TCP endpoint
What to check:
docker compose ps
docker compose logs --timestamps --since 15m execution-layer
docker inspect -f 'restart_count={{.RestartCount}} status={{.State.Status}}' execution-layer
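To reproduce the wrapper's TCP check by hand, a simple port probe may help. This sketch assumes an `nc` (netcat) build that supports `-z` and `-w`. The helper name `tcp_reachable` is hypothetical, and the name `execution-layer` only resolves from inside the Compose network.

```shell
# Hypothetical helper; assumes nc (netcat) with -z (scan only) and -w (timeout) support.
# Returns 0 if a TCP connection to host:port succeeds within 3 seconds.
tcp_reachable() {
  nc -z -w 3 "$1" "$2"
}

# Example (commented out so the block has no side effects):
# if tcp_reachable execution-layer 8545; then echo "reachable"; else echo "unreachable"; fi
```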
[watch-dependencies] rpc-backends has no healthy endpoints in ...¶
Meaning:
- none of the configured RPC health URLs responded successfully
What to check:
docker compose ps
docker compose logs --timestamps --since 15m rpc-provider-0 rpc-provider-1 traefik
Note:
- traefik uses --http-any, so only one healthy RPC backend is required
- if you run watch-dependencies.sh manually from the host shell, Docker-internal names like rpc-provider-0 only work if they resolve from your current network context
[watch-dependencies] missing command¶
Meaning:
- the wrapper was invoked directly without -- command ...
Correct pattern:
Repeated dependency messages every few seconds¶
Meaning:
- the wrapper is still polling and has not started the child process yet
The poll interval is controlled by:
Optional timeout knobs:
Recommended Debug Flow¶
When a service looks unhealthy:
- Run docker compose ps
- Inspect RestartCount and state timestamps
- Read recent logs with --timestamps --since ...
- Check the upstream service named in the watch-dependencies message
- Use docker events if you need to confirm actual restart behavior
For example, if rpc-provider-0 is failing:
docker compose ps
docker inspect -f 'restart_count={{.RestartCount}} status={{.State.Status}} started={{.State.StartedAt}} finished={{.State.FinishedAt}}' rpc-provider-0
docker compose logs --timestamps --since 15m rpc-provider-0 execution-layer kaswallet-0
docker events --since 15m --filter container=rpc-provider-0
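If you run this sequence often, it can be wrapped in a small function. The sketch below uses the hypothetical name `triage`. It also adds `--until` with the current Unix timestamp so that `docker events` prints past events and returns instead of streaming indefinitely.

```shell
# Hypothetical helper: runs the triage commands for one service.
# $1 = container/service name, $2 = look-back window (defaults to 15m).
triage() {
  svc="$1"
  since="${2:-15m}"

  # Current state of all containers in the project.
  docker compose ps

  # Restart count, status, and state timestamps for the service.
  docker inspect -f 'restart_count={{.RestartCount}} status={{.State.Status}} started={{.State.StartedAt}} finished={{.State.FinishedAt}}' "$svc"

  # Recent logs with timestamps.
  docker compose logs --timestamps --since "$since" "$svc"

  # Lifecycle events in the same window; --until makes the command exit.
  docker events --since "$since" --until "$(date +%s)" --filter "container=$svc"
}
```

For example, `triage rpc-provider-0 30m` widens the window to 30 minutes. Upstream services named in a watch-dependencies message still need to be checked separately.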