Make HTTP server shutdown more graceful #6719

laanwj · 2015-09-24T15:46:18Z

Shutting down the HTTP server currently breaks off all current requests. This can create a race condition with the RPC `stop` command, where the calling process never receives confirmation.

This change removes the listening sockets on shutdown so that no new requests can come in, but no longer breaks off requests in progress.

Meant to fix #6717.
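
To illustrate the intended behaviour, here is a minimal standard-library model of "stop listening, but let requests in progress finish". `ToyServer` and all of its members are hypothetical names invented for this sketch; it does not use libevent and is not the code from this PR. (libevent does offer `evhttp_del_accept_socket` for removing a bound socket, but this thread does not state which call the change uses.)

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for an HTTP server; not Bitcoin Core's actual classes.
class ToyServer {
public:
    // Returns false once the (modelled) listening sockets have been removed.
    bool Accept() {
        if (!listening_) return false;
        ++inFlight_;
        return true;
    }

    // Completing a request is still allowed after shutdown has begun,
    // so a client that sent `stop` can still receive its reply.
    std::string Finish(const std::string& reply) {
        assert(inFlight_ > 0);
        --inFlight_;
        return reply;
    }

    // Graceful step: refuse new connections, leave in-flight requests alone.
    void StopListening() { listening_ = false; }

    // True once every accepted request has been answered.
    bool Idle() const { return inFlight_ == 0; }

private:
    bool listening_ = true;
    int inFlight_ = 0;
};
```

In this model a request accepted before `StopListening()` can still call `Finish()` afterwards, while any later `Accept()` is refused.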

dcousens · 2015-09-25T06:49:47Z

> but no longer breaks off requests in progress.

Possible attack vector by denying the service the ability to shut down / restart by keeping a connection open?

laanwj · 2015-09-25T07:40:13Z

After -rpcservertimeout current connections will get booted, unless there is activity. If they do any requests they get a server unavailable error. But yes, if your application software is truly malicious they could keep the server busy by sending e.g. one header at a time, every timeout-1 seconds.

Note that this solution is cleaner in any case, and it doesn't rule out queuing an `event_base_loopbreak(eventBase);` call, say, 30 seconds in the future to force a shutdown.

jonasschnelli · 2015-09-25T09:17:52Z

utACK.

I agree that the attack vectors are out of scope for this PR, and maybe for the whole http-json-rpc-daemon (it's not hardened enough to expose to potentially malicious environments).

dcousens · 2015-09-25T10:00:32Z

utACK, just thought I'd mention it.

laanwj · 2015-09-25T12:00:03Z

@dcousens It's great that you mentioned it. It reminded me of a potential issue with this: what happens if a timer is still running? Will it block the event queue exit until it triggers?
Will test this scenario by stopping with an unlocked wallet.

Pushed another related fix:

http: Wait for worker threads to exit

Add a WaitExit() call to http's WorkQueue to make StopHTTPServer delete the work queue only when all worker threads have stopped.

This fixes a problem that was reproducible by pressing Ctrl-C during AppInit2:

/usr/include/boost/thread/pthread/condition_variable_fwd.hpp:81: boost::condition_variable::~condition_variable(): Assertion `!ret' failed.
/usr/include/boost/thread/pthread/mutex.hpp:108: boost::mutex::~mutex(): Assertion `!posix::pthread_mutex_destroy(&m)' failed.

I was assuming that threadGroup->join_all(); would always have been called when entering the Shutdown(). However this is not the case in bitcoind's AppInit2-non-zero-exit case "was left out intentionally here".
(now that the http module takes care of waiting for its own worker threads, we could even make it stop using threadGroup completely, but I leave that for a future refactor)
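
A rough sketch of the idea, using `std::thread` and `std::condition_variable` rather than the boost types named in the assertions above. The member names, `Start()`, and `ActiveThreads()` are assumptions made for this example and are not taken from the actual patch. The point is that the queue tracks how many workers are still inside `Run()`, so its owner can block in `WaitExit()` before destroying the mutex and condition variables.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Sketch of a work queue whose owner can wait until every worker has left Run().
class WorkQueue {
public:
    using WorkItem = std::function<void()>;

    // Spawn n detached workers. The count is raised before the threads start,
    // so WaitExit() cannot miss a worker that has not been scheduled yet.
    void Start(int n) {
        assert(n > 0);
        {
            std::lock_guard<std::mutex> lock(cs_);
            numThreads_ += n;
        }
        for (int i = 0; i < n; ++i)
            std::thread([this] { Run(); }).detach();
    }

    void Enqueue(WorkItem item) {
        std::lock_guard<std::mutex> lock(cs_);
        queue_.push_back(std::move(item));
        workCond_.notify_one();
    }

    // Ask all workers to stop; items still queued are dropped.
    void Interrupt() {
        std::lock_guard<std::mutex> lock(cs_);
        running_ = false;
        workCond_.notify_all();
    }

    // Block until every worker has left Run(); only then is deletion safe.
    void WaitExit() {
        std::unique_lock<std::mutex> lock(cs_);
        exitCond_.wait(lock, [this] { return numThreads_ == 0; });
    }

    int ActiveThreads() {
        std::lock_guard<std::mutex> lock(cs_);
        return numThreads_;
    }

private:
    void Run() {
        std::unique_lock<std::mutex> lock(cs_);
        while (true) {
            workCond_.wait(lock, [this] { return !running_ || !queue_.empty(); });
            if (!running_) break;
            WorkItem item = std::move(queue_.front());
            queue_.pop_front();
            lock.unlock();  // run the item without holding the lock
            item();
            lock.lock();
        }
        --numThreads_;
        exitCond_.notify_all();  // wake a thread blocked in WaitExit()
    }

    std::mutex cs_;
    std::condition_variable workCond_;
    std::condition_variable exitCond_;
    std::deque<WorkItem> queue_;
    bool running_ = true;
    int numThreads_ = 0;
};
```

A shutdown path would call `Interrupt()`, then `WaitExit()`, and only afterwards destroy the queue, which is the ordering the assertion failures above suggest was missing.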

laanwj · 2015-09-25T13:39:10Z

Pushed another commit to force-exit the event loop after a predefined time after interruption. This addresses @dcousens 's issue, as well as the case that other events are still lingering (such as a wallet unlock timer).

dcousens · 2015-09-25T14:19:56Z

utACK, well done @laanwj

theuni · 2015-09-25T21:35:10Z

Concept ACK, btw. Definitely an improvement.

laanwj · 2015-09-28T13:09:47Z

Processed @theuni 's comments. Will merge as soon as Travis passes.

theuni · 2015-09-28T13:35:42Z

Thanks. utACK.

ec908d5 http: Force-exit event loop after predefined time (Wladimir J. van der Laan)
de9de2d http: Wait for worker threads to exit (Wladimir J. van der Laan)
5e0c221 Make HTTP server shutdown more graceful (Wladimir J. van der Laan)

This continues/fixes #6719. `event_base_loopbreak` was not doing what I expected it to, at least in libevent 2.0.21. I expected it to set a timeout so that, given no other pending events, the loop would exit in N seconds. Instead it delayed the event loop exit by 10 seconds, even if nothing was pending. Solve it in a different way: give the event loop thread time to exit on its own, and if it doesn't, send loopbreak. This speeds up the RPC tests a lot: each exit incurred a 10 second overhead. With this change there should be no shutdown overhead in the common case and up to two seconds if the event loop is blocking. As a bonus this breaks the dependency on boost::thread_group, as the HTTP server minds its own offspring.
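
The "wait first, then force" approach described above can be sketched with the standard library alone. `ToyEventLoop`, `StopLoop`, and the use of `std::future` are assumptions made for illustration; the loop is a stand-in for the libevent dispatch thread, not the real one, and the grace period is a parameter rather than the two seconds mentioned above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical stand-in for the event loop thread.
struct ToyEventLoop {
    std::atomic<bool> breakRequested{false};
    std::atomic<bool> hasPendingEvents{false};

    // Returns on its own once nothing is pending, or as soon as a break is requested.
    void Dispatch() {
        while (hasPendingEvents && !breakRequested)
            std::this_thread::sleep_for(std::chrono::milliseconds(5));
    }

    void LoopBreak() { breakRequested = true; }
};

// Give the loop `grace` time to exit by itself; break it only if it does not.
// Returns true if the loop had to be broken forcibly.
bool StopLoop(ToyEventLoop& loop, std::future<void>& done,
              std::chrono::milliseconds grace) {
    assert(done.valid());
    bool forced = false;
    if (done.wait_for(grace) != std::future_status::ready) {
        loop.LoopBreak();
        forced = true;
    }
    done.get();  // the loop thread has finished once this returns
    return forced;
}
```

Here `done` would come from something like `std::async(std::launch::async, [&] { loop.Dispatch(); })`. In the common case the wait returns immediately and no delay is added; the grace period is only spent when an event is holding the loop up.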

Fix various thread assertion errors caused during shutdown.

Cherry-picked from the following upstream PRs:

- bitcoin/bitcoin#6719
- bitcoin/bitcoin#6990
- bitcoin/bitcoin#8421
  - Second commit only in this PR
- bitcoin/bitcoin#11006

I've cherry-picked the relevant commits, along with a note in each commit referring to the original Bitcoin commit ID (and the Zcash issue numbers where applicable). I've tested each issue with/without these patches applied.

Closes #2214, #2334, and #2554.

HTTP Server cherries from Core:

- bitcoin/bitcoin#6719 - Make HTTP server shutdown more graceful
- bitcoin/bitcoin#6859 - http: Restrict maximum size of http + headers
- bitcoin/bitcoin#6990 - http: speed up shutdown
- bitcoin/bitcoin#7966 - http: Do a pending c++11 simplification handling work items
- bitcoin/bitcoin#8421 - httpserver: drop boost (#8023 dependency)
- bitcoin/bitcoin#11006 - Improve shutdown process

laanwj added the RPC/REST/ZMQ and Tests labels Sep 24, 2015

laanwj force-pushed the 2015_09_rpc_shutdown_race branch from 049587c to bfc9707 September 25, 2015 12:10

laanwj added 3 commits September 28, 2015 15:06

http: Force-exit event loop after predefined time

ec908d5

This makes sure that the event loop eventually terminates, even if an event (like an open timeout, or a hanging connection) happens to be holding it up.

laanwj force-pushed the 2015_09_rpc_shutdown_race branch from 48c8a95 to ec908d5 September 28, 2015 13:06

laanwj merged commit ec908d5 into bitcoin:master Sep 28, 2015

laanwj mentioned this pull request Nov 11, 2015

http: speed up shutdown #6990

Merged

jnewbery mentioned this pull request Aug 9, 2017

Improve shutdown process #11006

Merged

jasondavies mentioned this pull request Oct 23, 2017

Fix various thread assertion errors caused during shutdown. zcash/zcash#2555

Merged

dagurval mentioned this pull request Jan 27, 2018

HTTP Server cherries bitcoinxt/bitcoinxt#315

Merged

sickpig mentioned this pull request Mar 12, 2018

HTTP servers Core ports BitcoinUnlimited/BitcoinUnlimited#1005

Merged

bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
