-
Notifications
You must be signed in to change notification settings - Fork 1.5k
add out_of_band hook #1648
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add out_of_band hook #1648
Conversation
evanphx
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the impact of this feature, namely that it enables folks to do work out of band from requests but doesn't explicitly do any of that work, I'm inclined to accept this.
I'd like some thoughts from @schneems and @nateberkopec too.
schneems
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall seems like an good experiment. Have you tried running with this patch in production? Any numbers to share?
| # spin up the max number of threads. | ||
| return if @todo.size - @waiting < @max - @spawned | ||
| busy_threads = @spawned - @waiting + @todo.size | ||
| return busy_threads if @max > busy_threads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm really jumpy around this line of code. This controls a disproportionate amount of what puma does compared to it's size. While the method is only a few lines of code there's 30 lines of docs (most of which I wrote).
Original code
@todo.size - @waiting < @max - @spawned
Can be re-written as
@todo.size - @waiting + @spawned < @max
or flipping it
@max > @todo.size - @waiting + @spawned
re-aranging
@max > @spawned - @waiting + @todo.size
And finally
busy_threads = @spawned - @waiting + @todo.size
@max > busy_threads
So that looks fine. Just makes me nervous.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other question to ask: is this the correct logic.
@waiting is the count of threads that have been spawned but are idle. So busy count would be the spawned count minus the idle count.
We then need to handle for the case where a request has been processed but not picked up yet so we subtract the backlog.
So that makes sense.
Edit 7 years later: "so we subtract the backlog." is incorrect, we're adding the backlog (@todo.size). Which: if there's one idle thread and there's one request in the queue then we count the number of "busy" threads as one because that thread is about to be busy (even if it isn't right this second). A bug here is that we're not accounting for the case where @todo.size is > @waiting. When this code was written, it was possible, but usually only by a small count. This code has persisted and lives on in Puma 7 in the form ofbusy_threads methods and stats and now the todo queue is no longer as constrained. So I that bug has a larger impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm less confident the logic is correct to begin with, only that it's unchanged by the algebraic refactoring.
In fact, during some local benchmarking I noticed some weird anomalies where busy_threads was logged as 2 after only a single request was added to the worker-process, so I do suspect there might be some concurrency edge-case not handled perfectly by the current logic.
Anyway, it's certainly possible to leave the equation untouched by this PR if that makes it any less nerve-wracking. I figured making the 'busy threads' count more explicit in this method makes the logic easier to follow, but the change isn't strictly necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
after only a single request was added to the worker-process
Was this a web page load or did you make the request via curl? if it's a webpage don't forget there are assets that each require a different request.
there might be some concurrency edge-case not handled perfectly by the current logic.
Yes there's a known race condition between the time where a request is added to the @todo array and when a new thread picks it up and starts working on it. So it's possible that @todo.size might be greater than 0 but all threads be idle.
Anyway, it's certainly possible to leave the equation untouched by this PR if that makes it any less nerve-wracking. I figured making the 'busy threads' count more explicit in this method makes the logic easier to follow, but the change isn't strictly necessary.
I'm fine with the change just wanted to be sure I talked through it out loud and it preserved behavior.
I just ran a ~12-hour side-by-side comparison of two Puma servers running Ruby 2.5, with/without the OOBGC from tmm1/gctools#17 along with this PR. The difference on my current production workload is slight but still noticeable.
So both response duration and memory utilization are slightly improved by about ~7-8% running out-of-band GC on my workload. |
This PR adds an
out_of_bandhook that is invoked when the worker is idle.The hook runs immediately after a request has finished processing and there are no busy threads on the worker. The worker doesn't accept new requests until this code finishes.
This hook can be used for running out-of-band garbage collection as a solution to #450, e.g.:
It could also be useful for scheduling/deferring other asynchronous tasks to be run during idle periods on the server.