Add capability for goroutine dumping on SIGQUIT/USR1#10786
Add capability for goroutine dumping on SIGQUIT/USR1#10786thaJeztah merged 1 commit intomoby:masterfrom
Conversation
|
Ping @vieux IIRC he either did it already or it was planned. Let me know |
|
can we please add this to the |
|
@tiborvass after talking to @vieux and testing it locally--this is built into Go itself now, and our @SvenDowideit I could change this PR to add a note about the ability to dump stacktraces with SIGQUIT in the docs, or just close the PR without anything given this is a Go feature that Docker is allowing to work properly when DEBUG="" in the environment (which is set with |
|
@vieux can you confirm that the only scenario where it makes sense to trap SIGQUIT is when attached to a container? |
|
If I recall correctly, if you have container(s) running sigquit is carched because you have to press it 3 times to actually quit. Do you remember this @tiborvass? |
|
@vieux I thought that's for SIGINT not SIGQUIT. SIGQUIT is |
There was a problem hiding this comment.
I'm not sure if \n\n is necessary here, looks like log.Debugf should add EOL itself.
|
@estesp Also on rebase note, that we changed all |
There was a problem hiding this comment.
Probably makes sense to write one big log message. Because actually other messages can be between :)
|
I tried and I like it. |
|
@estesp FYI currently SIGQUIT should be already handled (try Ctrl+\ on a daemon in the foreground). Looking at the code now. |
|
NOT LGTM @estesp this code is already available in pkg/signal/trap.go Unfortunately it checks for the DEBUG envvar, which I believe we set when If I'm missing something, let me know :) |
|
@tiborvass yes.. I've been lazy to get back to this, but I confirmed last month with @vieux that there is no need to set up another handler as it is already happening in base Go lang debug traps. Why I left the PR open was to either (a) clarify about |
|
@estesp IMO t should not be only on DEBUG. I actually had a great chat with @gabrtv in Denver about this, and we agreed that we should probably have another signal handler that just dumps all the stack traces just like SIGQUIT would but without quitting. This would allow production environments to get a lot more feedback if they're ever stuck in a weird case. A hacky solution would be to discriminate on DEBUG. If it's in debug mode, SIGQUIT quits, if not, then just dumps output without quitting. It's not very intuitive though cause SIGQUIT wouldn't quit. However, I couldn't find conclusive documentation on what to do on SIGQUIT. I even found contradictions: some say we should clean up after SIGQUIT others say no, some say it's okay to ignore SIGQUIT, others say it should always terminate the process. Also we should probably write both to stdout and to disk, the stack traces from the runtime package just like core dump. What do you think? Also golang/go@5146a93 can be interesting to read. |
|
It would be straightforward to use something like |
|
@estesp i'm fine with SIGUSR1 :) |
|
👍 for a |
|
@estesp Can you please update the PR to use |
|
Ping @estesp for SIGUSR1, I think this will be a 1.7 feature of its own :) |
d34d8e7 to
b4452f6
Compare
b4452f6 to
3e8dcde
Compare
3e8dcde to
2f103d1
Compare
|
@icecrime @tiborvass thanks for the reminders! I had almost forgotten about this. I've updated the code and the first comment with the current implementation based on the feedback. I realize from CI output that we will need a build tag for a separate trap.go for Windows as SIGUSR1 is not available there. I'll fix that tomorrow. |
|
@icecrime @tiborvass this should be ready for review--I refactored to only set up a SIGUSR1 handler for build |
|
Thanks @estesp! Pushing to |
There was a problem hiding this comment.
I think we're missing a loop of some sort here, because the way it's written you can only dump the stack once here (unless that's the intent?).
There was a problem hiding this comment.
Wait, no one submitted the requirements document with a specific feature requests to dump more than once. 😢 Embarrassed.. my original code was correct but in all the refactoring I hastily recreated this part and forgot the loop.. fixed now!! :)
f5dcb1c to
f0b6296
Compare
|
ping @tiborvass |
|
code LGTM and I like this feature 👍 |
|
LGtm |
|
ping @tiborvass ok with you? we will not merge without you liking it :) |
|
Behavior is good, reading code now |
|
Does this really work? I thought you could only have one signal handler per process and here you are making two. edit: maybe it's one signal handler per type of signal??? |
There was a problem hiding this comment.
@estesp I agree with crosbymichael, why can't we just add syscall.SIGUSR1 to the list below? If we want it to be cross platform, we could make it in a const sigdump = syscall.SIGUSR1 and make it or const sigdump = syscall.Signal(-1) on windows maybe?
There was a problem hiding this comment.
Given our IRC discussion and the comments here ( https://golang.org/pkg/os/signal/#Notify ), I think it is fully reasonable to have a separate handler func for the SIGUSR1 channel notifier, and while the other method would work, I'm not sure it has value over this model.
There was a problem hiding this comment.
https://golang.org/pkg/os/signal/#Notify shows indeed that it doesn't matter, so it's okay
EDIT: Sorry I got github cached...
|
|
|
@estesp Actually this package is starting to become pretty weird, can't we just move all the debugtrap stuff to docker/daemon.go ? |
Add handler for SIGUSR1 based on feedback regarding when to dump goroutine stacks. This will also dump goroutine stack traces on SIGQUIT followed by a hard-exit from the daemon. Docker-DCO-1.1-Signed-off-by: Phil Estes <[email protected]> (github: estesp)
f0b6296 to
95fcf76
Compare
|
@crosbymichael let me know if this is fine. We could put DumpStacks() to its own debug package... but i really wanna merge this. |
|
LGTM |
1 similar comment
|
LGTM |
Add capability for goroutine dumping on SIGQUIT/USR1
This allows "kill -QUITpidof docker" to be used to dump thegoroutine stacktraces from the docker daemon to the debug log. The
SIGQUIT listener is only created if the daemon is started in debug mode.
Updated based on feedback -- 4/21:
Goroutine dumps are created on
SIGQUIT(matching Go-lang builtin dumper), andSIGUSR1. The user signal 1 handler is not destructive: it only dumps goroutines without exiting the daemon. The quit handler is a hard-exit (no cleanup routine called) after dumping the goroutines to allow a method to exit a hung daemon after capturing debug information. Both handlers are installed without any reliance on "DEBUG" being set in the daemon. The routines are logged at "info" level, matching the default log-level set at start time. If the log-level is set to something more restrictive (warn or error), then the goroutine logging will not be visible. Potentially we might want to "escalate" the output to a higher log level if we think that will be a problem with usage.Docker-DCO-1.1-Signed-off-by: Phil Estes [email protected] (github: estesp)