-
Notifications
You must be signed in to change notification settings - Fork 38.7k
http: speed up shutdown #6990
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
http: speed up shutdown #6990
Conversation
Are you sure this isn't symptomatic of something else? edit: Right, never mind, you were using |
src/httpserver.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of interest, why do we have to wait [a hard coded timeout]? Why can't we wait for all events to finish (and maybe just ignore all new incoming connections?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
all events to finish
may take a long time (minutes) depending what was going in in the past?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarcoFalke but this doesn't stop new connections (or is that done elsewhere?) in that waiting time. So conceptually, the status quo could be the same?
That is, new connections coming in, old ones still finishing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
New connections are stopped just above, see where the accept sockets are torn down.
Though it does look like we should close connected sockets in http_reject_request_cb as well for good measure. @laanwj Is there a reason to keep them open once they've had a req rejected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Waiting for all events to finish could take forever. There's tons of ways that someone could keep a connection option by slowly sending data. Also, timing events (such as the
walletpassphrasetimeout) may hold up the event queue. - Just breaking off everything in progress immediately prevents the answer to
stopfrom getting back. We could just declare this a feature of course and be done with it... But it was a race condition, sometimes you would get a reply, sometimes not. We could definestopto cease all RPC activity immediately. But I think I like this better. - There AFAIK is no way to ask libevent, nor libevent_http if the events in the queue are important enough to wait for or not. And specially tagging the connection that sends 'stop' is quite ugly, not sure if even possible without diving into internal data structures.
For context read #6719.
I'm not 100% happy with this solution either, but I wasn't able to find another way to do it, and it's better than the previous one.
@cfields sure, you can close the connection afterwards there, but that doesn't solve this issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarcoFalke but this doesn't stop new connections (or is that done elsewhere?) in that waiting time. So conceptually, the status quo could be the same?
It does:
// Unlisten sockets
BOOST_FOREACH (evhttp_bound_socket *socket, boundSockets) {
evhttp_del_accept_socket(eventHTTP, socket);
}
// Reject requests on current connections
evhttp_set_gencb(eventHTTP, http_reject_request_cb, NULL);
This closes the port and makes new requests error (so that no new work items are created). But that's best-effort. Current connections could still do e.g. a slowloris attack. The two seconds are a belt and suspenders, nothing more.
|
utACK |
|
utACK. |
|
Some more information: it appears that with Edit: so I think we should still do this for backward compat, but at least I'm not crazy :) |
|
@laanwj maybe just inline a comment to mention this might be fixed in latest version of libevent? Could be useful for some future refactoring, save someone trying to establish whether there was a reason we moved away from it that they aren't aware of. |
|
Will do Edit: OK, added comment and updated commit message. |
|
Huh! I pushed this branch to bitcoin/bitcoin instead of my own repository. Oh well, no harm done, I'll just remove it again when it's merged. |
1ec9198 to
febcc13
Compare
This continues/fixes #6719. `event_base_loopbreak` was not doing what I expected it to, at least in libevent 2.0.21. What I expected was that it sets a timeout, given that no other pending events it would exit in N seconds. However, what it does was delay the event loop exit with 10 seconds, even if nothing is pending. Solve it in a different way: give the event loop thread time to exit out of itself, and if it doesn't, send loopbreak. This speeds up the RPC tests a lot, each exit incurred a 10 second overhead, with this change there should be no shutdown overhead in the common case and up to two seconds if the event loop is blocking. As a bonus this breaks dependency on boost::thread_group, as the HTTP server minds its own offspring.
febcc13 to
a264c32
Compare
|
|
|
ACK. |
a264c32 http: speed up shutdown (Wladimir J. van der Laan)
|
Awesome, thanks @laanwj :) |
|
Post merge tested ACK! @laanwj this makes travis a lot faster. (runtime reduced for some rpc tests by 90%) Before: After: |
Fix various thread assertion errors caused during shutdown. Cherry-picked from the following upstream PRs: - bitcoin/bitcoin#6719 - bitcoin/bitcoin#6990 - bitcoin/bitcoin#8421 - Second commit only in this PR - bitcoin/bitcoin#11006 I've cherry-picked the relevant commits, along with a note in each commit referring to the original Bitcoin commit ID (and the Zcash issue numbers where applicable). I've tested each issue with/without these patches applied. Closes #2214, #2334, and #2554.
HTTP Server cherries from Core: bitcoin/bitcoin#6719 - Make HTTP server shutdown more graceful bitcoin/bitcoin#6859 - http: Restrict maximum size of http + headers bitcoin/bitcoin#6990 - http: speed up shutdown bitcoin/bitcoin#7966 - http: Do a pending c++11 simplification handling work items bitcoin/bitcoin#8421 - httpserver: drop boost (#8023 dependency) bitcoin/bitcoin#11006 - Improve shutdown process
HTTP Server cherries from Core: bitcoin/bitcoin#6719 - Make HTTP server shutdown more graceful bitcoin/bitcoin#6859 - http: Restrict maximum size of http + headers bitcoin/bitcoin#6990 - http: speed up shutdown bitcoin/bitcoin#7966 - http: Do a pending c++11 simplification handling work items bitcoin/bitcoin#8421 - httpserver: drop boost (#8023 dependency) bitcoin/bitcoin#11006 - Improve shutdown process
Pardon a bit of iteration. This continues/fixes #6719.
event_base_loopexitwas not doing what I expected it to. What I expected was that it sets a timeout, given that no other pending events it would exit in N seconds. However, what it does was delay the event loop exit with N seconds, even if nothing is pending.Solve it in a different way: give the event loop thread time to exit out of itself, and if it doesn't, send loopbreak (and then join it).
This speeds up the RPC tests a lot, each exit incurred a 10 second overhead, with this change there should be no shutdown overhead in the common case and up to two seconds if the event loop is blocking.
As a bonus this breaks dependency on the global boost::thread_group, as the HTTP server minds its own offspring.
Time for rpctests with this patch:
Without: