kill all children on SIGTERM #106
Any news?
Could you confirm whether or not ninja has a similar behavior when it receives SIGTERM?
Extract it from jobwork() so that build() can call it on a signal. Signed-off-by: Paolo Bonzini <[email protected]>
Keep the system clean by propagating SIGTERM to all children, and by not starting new jobs on both SIGTERM and SIGINT. The only tricky bit is that previously fd[i].revents was used to skip both jobs that are not in use and jobs that did not have output; that's because negative file descriptors do not cause POLLNVAL and therefore fd[i].revents is zero for inactive jobs as well. But because all jobs must be killed, build() now has to check fd[i].fd == -1 explicitly. While at it, also clean up jobdone() by clearing job[i].edge; it's not nice to leave a dangling pointer in the jobs array, even if it's harmless. Signed-off-by: Paolo Bonzini <[email protected]>
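The point about negative file descriptors can be checked in isolation. The following is a minimal sketch, not code from the patch: idle_slot_revents() is a hypothetical helper that polls one unused slot (fd == -1) next to a pipe with pending data. POSIX specifies that poll() ignores entries with a negative fd and sets their revents to 0, so revents alone cannot distinguish an idle slot from a running job that simply has no output yet.

```c
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of samurai): poll an idle slot and a
 * readable pipe together. Returns the revents of the idle slot and
 * stores the revents of the live slot in *live. Returns -1 on error.
 */
static short
idle_slot_revents(short *live)
{
	int p[2];
	struct pollfd fds[2];
	short idle = -1;

	if (pipe(p) < 0)
		return -1;
	if (write(p[1], "x", 1) == 1) {
		fds[0].fd = -1;          /* idle job slot */
		fds[0].events = POLLIN;
		fds[0].revents = POLLIN; /* stale value; poll() resets it to 0 */
		fds[1].fd = p[0];        /* "job" with output waiting */
		fds[1].events = POLLIN;
		fds[1].revents = 0;
		if (poll(fds, 2, 0) >= 0) {
			idle = fds[0].revents;
			*live = fds[1].revents;
		}
	}
	close(p[0]);
	close(p[1]);
	return idle;
}
```

Because the idle slot reports 0 rather than POLLNVAL, a loop that has to visit every running job (for example, to kill it) needs an explicit fd == -1 test, as the commit message describes.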
Thinking about this some more, I'm worried about a race condition where the signal arrives outside of the poll. If this happens, then we won't forward the signal to the subprocesses until the next one produces output or finishes. I think writing to a self-pipe in the handler and adding that to the pollfd array is probably the simplest way to solve this. I also did some digging into ninja to see how it deals with these signals and found a few things:
This leaves me with a few questions. Since ninja doesn't forward the signal to any foreground job, doesn't it have the same issue you're fixing here? If SIGTERM is sent to ninja only, what happens? Does the subprocess remain running? I also wonder if whoever sent the SIGTERM ought to have sent it to samurai's process group. Currently, samurai doesn't make new process groups for jobs. In this PR, we're making the assumption that SIGINT is usually sent to the whole process group due to a Ctrl-C. However, if SIGINT is sent to only samurai, then I believe it will just stop starting new jobs and wait until any active jobs finish naturally. Similarly, if SIGTERM is sent to samurai's process group, then I think the subprocesses will end up seeing SIGTERM twice (once from the initial signal, once from samurai). |
That is caught by:
but indeed there is a microscopic window between this line and the poll() right below. I can fix it once we agree on what to do.
ninja forwards the signal to the non-console processes (using process groups):
But I think for SIGTERM it should send it to console processes as well. Unlike SIGINT and SIGHUP, which are sent by the OS, SIGTERM is usually sent by the user with kill, and you cannot assume that the user sent it to the process group. In fact, I'd argue that because the idea of SIGTERM is to let the process clean up after itself, 1) it should not be sent to a process group, 2) it is a bug to not trap it if you spawn processes. The behavior of moving the children in their process group was implemented for ninja-build/ninja#110 with no particular explanation; then it was changed to move the process into a session (ninja-build/ninja#909) and reverted (ninja-build/ninja#1097, but see also ninja-build/ninja#1001). Frankly I wouldn't take it as a good example even if samurai is a "ninja clone". Make instead does the same as my implementation: it does not place processes in separate process groups, and forwards SIGTERM.
That's a problem of the user that sent it to the process group; it's not samurai's problem. Generally I don't think that it would be an issue, because the child will either exit on the first SIGTERM or start an orderly cleanup. In the latter case the cleanup would be triggered twice but, because signal handlers in general don't do much work themselves, it should be safe to consider SIGTERM idempotent; and if a handler is not, that should be considered a bug in the program.
Any news?
Thanks for the ping. I'm happy with this, but would like to fix the signal race before merging. Sorry for the delay in reviewing. I appreciate your patience.
I think the window is bigger than you suggest, but due to the poll timeout, it hangs for at most 5 seconds before sending the signal to jobs and exiting itself. For example, say we are reading some input for the last job in the array when the signal arrives. We finish the jobwork loop and continue in the outer loop. I was able to reproduce this pretty reliably with a job that wrote to stdout and then sent SIGTERM to samu:
I think the cleanest solution is to write a byte to a pipe in the handler, and add the read end to the pollfd array.
Thanks for doing this investigation. I agree with your analysis.
Makes sense to me.
Regarding job[i].edge, my preference is just to leave it be. I'm not sure it makes sense to partially reinitialize the job struct (why |
Yeah, you're right. I will fix it (in fact most of the logic is already in the jobserver patches...)
Handle SIGTERM by forwarding it to all children and waiting for them to stop. This is a better behavior than letting the children continue in the background.
The price to pay is that if a program does not respond to SIGTERM, samurai will have to be killed with SIGKILL. This, however, is consistent with many other programs that invoke and manage child processes, and the reason will be apparent from e.g. ps or top output, so overall I think this is an improvement.
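The forward-and-wait behavior described above can be sketched as follows. This is an illustration under assumptions, not the patch itself: spawn_sleepers() and kill_and_reap() are hypothetical helpers, and the children are stand-ins that block in pause() with the default SIGTERM disposition.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Start n children that block until signaled. Returns how many started. */
static int
spawn_sleepers(pid_t *pids, int n)
{
	int i, started = 0;

	for (i = 0; i < n; i++) {
		pids[i] = fork();
		if (pids[i] == 0) {
			signal(SIGTERM, SIG_DFL); /* make sure SIGTERM terminates */
			for (;;)
				pause();
		}
		if (pids[i] > 0)
			started++;
	}
	return started;
}

/*
 * Forward SIGTERM to every live child, then wait for each one.
 * Returns the number of children that were terminated by SIGTERM.
 */
static int
kill_and_reap(pid_t *pids, int n)
{
	int i, status, count = 0;
	pid_t r;

	for (i = 0; i < n; i++) {
		if (pids[i] > 0)
			kill(pids[i], SIGTERM);
	}
	for (i = 0; i < n; i++) {
		if (pids[i] <= 0)
			continue;
		do
			r = waitpid(pids[i], &status, 0);
		while (r < 0 && errno == EINTR);
		if (r == pids[i] && WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM)
			count++;
		pids[i] = -1; /* mark the slot as free */
	}
	return count;
}
```

The blocking waitpid() is where the trade-off mentioned above shows up: a child that ignores SIGTERM keeps the parent waiting until someone resorts to SIGKILL.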