Problem
When svcinit receives SIGINT/SIGTERM, the configured shutdown_signal and shutdown_timeout are bypassed because exec.CommandContext automatically sends SIGKILL when the context is cancelled.
Root Cause
In runner/runner.go:258:
cmd := exec.CommandContext(ctx, s.Exe, s.Args...)
When the signal handler calls cancelFunc(), Go's exec package immediately sends SIGKILL to the child process via pidfd_send_signal. This happens before StopAll() can send SIGTERM with the configured grace period.
Timeline (from strace analysis)
- SIGINT received → signal handler calls cancelFunc()
- pidfd_send_signal(fd, SIGKILL, ...) sent by Go runtime
- Child process killed immediately
- "Shutting down services." printed
- StopAll() tries to send SIGTERM to already-dead process
Impact
Services that need graceful shutdown (like docker run/docker compose up) are killed immediately without time to clean up. The shutdown_signal = "SIGTERM" and shutdown_timeout = "5s" configuration is effectively ignored.
Proposed Fixes
I have a couple of ideas here, but would like some input on which feels best:
- Set cmd.Cancel (Go 1.20+) to send the configured ShutdownSignal instead of SIGKILL, and set cmd.WaitDelay to the configured shutdown_timeout. Not sure if the WaitDelay is necessary or not.

      // Send the configured signal on context cancellation instead of SIGKILL.
      cmd.Cancel = func() error {
          var sig syscall.Signal
          switch s.ShutdownSignal {
          case "SIGTERM":
              sig = syscall.SIGTERM
          default:
              sig = syscall.SIGKILL
          }
          return killGroup(cmd, sig)
      }
      // Bound how long Wait waits after cancellation.
      cmd.WaitDelay = shutdownTimeout

- Use a separate context for commands, distinct from the root context, and propagate cancellation manually. Since the rest of the code seems to be handling the shutdown manually, perhaps this is the right move?
- Don't use exec.CommandContext. In a similar vein, the code is already handling this lifecycle stuff, so perhaps it makes more sense to just remove the Context propagation and handle things ourselves?
Reproduction
- Create a service that needs graceful shutdown (e.g., something that wraps docker run)
- Configure shutdown_signal = "SIGTERM" and shutdown_timeout = "5s"
- Run with bazel run
- Send SIGINT (Ctrl+C)
- Observe that the service is killed immediately without graceful shutdown
I'll also see about getting this into a test.