Move TTOU processing #1165
Conversation
```ruby
@workers.delete_if(&:dead?)
cull_workers
```
Is there really ever a case where we should cull and spawn?
Maybe this should be rearchitected so that cull_workers and spawn_workers takes an integer argument, and then you calculate the value to give to those methods here. You could then call cull_workers(1) in your TTOU signal trap.
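In sketch form, that suggestion might look like this (illustrative only — `Cluster` here is a stand-in, not Puma's actual `Puma::Cluster`, and the method bodies are placeholders):

```ruby
# Sketch of the delta-based shape suggested above: one place computes
# the difference and hands plain integers to spawn/cull.
class Cluster
  attr_accessor :desired
  attr_reader :workers

  def initialize(desired)
    @desired = desired
    @workers = []
  end

  def check_workers
    diff = desired - workers.size
    spawn_workers(diff) if diff > 0
    cull_workers(-diff) if diff < 0
  end

  def spawn_workers(count)
    count.times { workers << :worker } # stand-in for forking a worker
  end

  def cull_workers(count)
    count.times { workers.pop }        # stand-in for TERMing a worker
  end
end
```

A TTOU trap could then call `cull_workers(1)` directly, as suggested.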
As I understand it, if the signals come fast enough, they can "queue up" before an indirect handler such as this one has a chance to run. But you're right, maybe having a difference counter is a better approach. I'll have a look at it this week.
I think this approach is fine, no need to combine them for now.
```ruby
Signal.trap "TTOU" do
  @options[:workers] -= 1 if @options[:workers] >= 2
  @workers.last.term
  wakeup!
end
```
So this ends up delaying the response to TTOU until the next run of check_workers, right?
Derp, you said that in your description.
> Can you explain a little more about how/why this might occur?
It could happen if the signals are fired in quick succession (perhaps even by some automated process that wants to scale out by, say, 5 workers). If the signal handler isn't fast enough and the process is busy, signals can get dropped by the kernel, or worse, the signal handler can be interrupted by the same signal being delivered again (so we have a race condition in the signal handler). The more I think about it, the more it seems better to rearchitect the entire signal handling block to use a self pipe (q.v. https://www.sitepoint.com/the-self-pipe-trick-explained/) in order to minimize the race condition potential (which already exists in the worker counter increment/decrement code as it stands). But that may be too much change in one go. Would you rather have a second pull request introducing a self pipe, or should I try to rewrite everything at once?
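For reference, the self-pipe trick mentioned here can be sketched in a few lines of Ruby (illustrative only, not Puma's code): the trap handler does nothing but write a byte to a pipe, and the main loop does the real work outside signal-handler context, coalescing a burst of signals into one wakeup.

```ruby
# Self-pipe sketch: the handler stays trivial; the main loop drains
# the pipe and handles any number of queued signals in one pass.
read_end, write_end = IO.pipe

Signal.trap("TTOU") do
  begin
    write_end.write_nonblock(".")
  rescue IO::WaitWritable
    # Pipe is full: a wakeup is already pending, dropping is fine.
  end
end

# Simulate a burst of TTOU signals arriving in quick succession.
3.times { Process.kill("TTOU", Process.pid) }
sleep 0.1 # let Ruby run the deferred trap handlers

# Main loop body: block on the pipe instead of polling on a timer.
if IO.select([read_end], nil, nil, 1)
  pending = read_end.read_nonblock(64) # drain all queued wakeups
  puts "handling #{pending.bytesize} pending signal(s) in one pass"
end
```

Note that the kernel may coalesce identical pending signals, so a burst of three kills can surface as fewer bytes — which is exactly the behaviour you want when adjusting a counter.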
Ah right, this is probably a fairly common case. I have no opinion on whether or not a self pipe is better or worse here. Not including a self pipe with this PR would not make the code any worse than it already stands, so let's do the changes separately.
You heard the man! MERGE!
This pull request changes a few things in the cluster.
This of course changes the behaviour of Puma in dealing with TTOU signals: they are no longer processed immediately; instead there is a delay until the next WORKER_CHECK_INTERVAL expires. On the upside, multiple TTOU signals can be handled at once, and the signal handler is leaner and more symmetric to the TTIN signal handler.
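In sketch form, the delayed flow described above looks roughly like this (names and values are illustrative, including the interval value; only the shape mirrors the description):

```ruby
WORKER_CHECK_INTERVAL = 5 # seconds (value assumed for the sketch)

$target_workers = 4 # desired worker count; a global keeps the sketch simple

# The traps only adjust the target; no worker is spawned or TERMed here.
Signal.trap("TTIN") { $target_workers += 1 }
Signal.trap("TTOU") { $target_workers -= 1 if $target_workers >= 2 }

# Periodic reconciliation: any number of TTOU signals received within
# one interval collapse into a single spawn/cull pass.
# loop do
#   check_workers # spawn or cull until the live count matches the target
#   sleep WORKER_CHECK_INTERVAL
# end
```

The trade-off named in the description is visible here: a TTOU takes effect only at the next pass of the loop, but the handler itself is minimal and mirrors the TTIN handler.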
Unfortunately I did not come up with a way to test these changes (or the signal handlers themselves) without changing the internal structure of Puma::Launcher and Puma::Cluster. If you can give me pointers on how you'd like this tested, I'd be happy to add the tests.