-
Notifications
You must be signed in to change notification settings - Fork 38.7k
rpc: work-around an upstream libevent bug #11593
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
A rare race condition may trigger while awaiting the body of a message, see upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details. This may fix some reported rpc hangs/crashes.
50bbb41 to
6b58360
Compare
|
utACK. |
|
utACK |
dcousens
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
utACK 6b58360. For Lines 291 to 296 in 1b8c884
|
|
I'm trying to reproduce the bug, but not having luck so far. I'm building 6b58360 with the fix commit reverted. My src/bitcoind -regtestwith while true; do curl -s -XGET --data-binary . 127.0.0.1:18443; done |
TheBlueMatt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks to at least mimic the fix at libevent/libevent#567
src/httpserver.cpp
Outdated
| static void http_request_cb(struct evhttp_request* req, void* arg) | ||
| { | ||
| // Disable reading to work around a libevent bug, fixed in 2.2.0. | ||
| if (event_get_version_number() < 0x02020001) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are they not going to backport the fix? It looks like they haven't released a 2.2.0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unsure if they'll backport, but it shouldn't hurt anything if they do. The reason for the version check is in case the behavior changes drastically in the future and enabling the read/write may be harmful. 2.2.0 isn't released, but the fix is in master.
| evbuffer_add(evb, strReply.data(), strReply.size()); | ||
| HTTPEvent* ev = new HTTPEvent(eventBase, true, | ||
| std::bind(evhttp_send_reply, req, nStatus, (const char*)nullptr, (struct evbuffer *)nullptr)); | ||
| auto req_copy = req; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we now need to copy req where we weren't previously?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::bind copied in req by value, but passing "this" into the lambda and using req would use this->req in the callback, which would blow up because req is set to nullptr a few lines below.
This is clunky, but I'm not sure how else to pass a class member into a lambda by value in c++11.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the lambda were changed to simply say "req" (not "req&") then it is copied by value (in this case the pointer is copied).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Class members need to be captured with 'this', so unfortunately that doesn't work.
|
utACK 6b58360, assuming libevent/libevent#567 gets merged upstream. |
The bug was introduced in 2.1.6-beta, versions before that don't need the workaround.
|
Thanks to @ryanofsky for the version info. I bisected the upstream bug and found that it was introduced in 2.1.6-beta, so the workaround has been constrained to the appropriate range. |
|
utACK 97932cd |
|
utACK 97932cd |
97932cd rpc: further constrain the libevent workaround (Cory Fields) 6b58360 rpc: work-around an upstream libevent bug (Cory Fields) Pull request description: A rare race condition may trigger while awaiting the body of a message. This may fix some reported rpc hangs/crashes. This work-around mimics what libevent does internally once a write has started, which is what usually happens, but not always due to the processing happening on a different thread: https://github.com/libevent/libevent/blob/e7ff4ef2b4fc950a765008c18e74281cdb5e7668/http.c#L373 Fixed upstream at: libevent/libevent@5ff8eb2 Tree-SHA512: b9fa97cae9da2a44101c5faf1e3be0b9cbdf722982d35541cf224be31430779c75e519c8ed18d06ab7487bfb1211069b28f22739f126d6c28ca62d3f73b79a52
A rare race condition may trigger while awaiting the body of a message, see upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details. This may fix some reported rpc hangs/crashes. Github-Pull: bitcoin#11593 Rebased-From: 6b58360
The bug was introduced in 2.1.6-beta, versions before that don't need the workaround. Github-Pull: bitcoin#11593 Rebased-From: 97932cd
Summary: A rare race condition may trigger while awaiting the body of a message, see upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details. This may fix some reported rpc hangs/crashes. Backport of Bitcoin Core PR#11593. Completes T466. Test Plan: ```make check``` ```test_runner.py``` Following the only documented attempt I could find for replicating the bug: bitcoin/bitcoin#11593 (comment) Running `bitcoind -regtest` in with `while true; do curl -XGET --data-binary . 127.0.0.1:18444; done` while doing rpc commands This should output `curl: (52) Empty reply from server` (bug would rarely cause a crash) Reviewers: jasonbcox, deadalnix, Fabien, O1 Bitcoin ABC, #bitcoin_abc Reviewed By: jasonbcox, O1 Bitcoin ABC, #bitcoin_abc Subscribers: teamcity, schancel Differential Revision: https://reviews.bitcoinabc.org/D2642
Summary: A rare race condition may trigger while awaiting the body of a message, see upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details. This may fix some reported rpc hangs/crashes. Backport of Bitcoin Core PR#11593. Completes T466. Test Plan: ```make check``` ```test_runner.py``` Following the only documented attempt I could find for replicating the bug: bitcoin/bitcoin#11593 (comment) Running `bitcoind -regtest` in with `while true; do curl -XGET --data-binary . 127.0.0.1:18444; done` while doing rpc commands This should output `curl: (52) Empty reply from server` (bug would rarely cause a crash) Reviewers: jasonbcox, deadalnix, Fabien, O1 Bitcoin ABC, #bitcoin_abc Reviewed By: jasonbcox, O1 Bitcoin ABC, #bitcoin_abc Subscribers: teamcity, schancel Differential Revision: https://reviews.bitcoinabc.org/D2642
Summary: A rare race condition may trigger while awaiting the body of a message, see upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details. This may fix some reported rpc hangs/crashes. Backport of Bitcoin Core PR#11593. Completes T466. Test Plan: ```make check``` ```test_runner.py``` Following the only documented attempt I could find for replicating the bug: bitcoin/bitcoin#11593 (comment) Running `bitcoind -regtest` in with `while true; do curl -XGET --data-binary . 127.0.0.1:18444; done` while doing rpc commands This should output `curl: (52) Empty reply from server` (bug would rarely cause a crash) Reviewers: jasonbcox, deadalnix, Fabien, O1 Bitcoin ABC, #bitcoin_abc Reviewed By: jasonbcox, O1 Bitcoin ABC, #bitcoin_abc Subscribers: teamcity, schancel Differential Revision: https://reviews.bitcoinabc.org/D2642
97932cd rpc: further constrain the libevent workaround (Cory Fields) 6b58360 rpc: work-around an upstream libevent bug (Cory Fields) Pull request description: A rare race condition may trigger while awaiting the body of a message. This may fix some reported rpc hangs/crashes. This work-around mimics what libevent does internally once a write has started, which is what usually happens, but not always due to the processing happening on a different thread: https://github.com/libevent/libevent/blob/e7ff4ef2b4fc950a765008c18e74281cdb5e7668/http.c#L373 Fixed upstream at: libevent/libevent@5ff8eb2 Tree-SHA512: b9fa97cae9da2a44101c5faf1e3be0b9cbdf722982d35541cf224be31430779c75e519c8ed18d06ab7487bfb1211069b28f22739f126d6c28ca62d3f73b79a52
A rare race condition may trigger while awaiting the body of a message, see upsteam commit libevent/libevent@5ff8eb2 for details. This may fix some reported rpc hangs/crashes. Issue: bitcoin/bitcoin#11593 Commits: - bitcoin/bitcoin@6b58360 - bitcoin/bitcoin@97932cd
A rare race condition may trigger while awaiting the body of a message, see upsteam commit libevent/libevent@5ff8eb2 for details. This may fix some reported rpc hangs/crashes. Issue: bitcoin/bitcoin#11593 Commits: - bitcoin/bitcoin@6b58360 - bitcoin/bitcoin@97932cd
A rare race condition may trigger while awaiting the body of a message, see upsteam commit libevent/libevent@5ff8eb2 for details. This may fix some reported rpc hangs/crashes. Issue: bitcoin/bitcoin#11593 Commits: - bitcoin/bitcoin@6b58360 - bitcoin/bitcoin@97932cd
aab15d7 ReplayBlocks: use find instead of brackets operator to access to the element. (furszy) e898353 [Refactoring] Use const CBlockIndex* where appropriate (random-zebra) c76fa04 qa: Extract rpc_timewait as test param (furszy) 0f832e3 shutdown: Stop threads before resetting ptrs (MarcoFalke) 67aebbf http: Remove numThreads and ThreadCounter (Wladimir J. van der Laan) e24c710 http: Remove WaitExit from WorkQueue (Wladimir J. van der Laan) b8f7364 http: Join worker threads before deleting work queue (Wladimir J. van der Laan) 7d68769 rpc: further constrain the libevent workaround (Cory Fields) 75af065 rpc: work-around an upstream libevent bug (Cory Fields) 50e5833 Always return true if AppInitMain got to the end (Matt Corallo) bd70dcc [qa] Test non-atomic chainstate writes (furszy) 8f04970 Dont create pcoinsTip until after ReplayBlocks. (Matt Corallo) 93f2b15 Random db flush crash simulator (Pieter Wuille) 72f3b17 Adapt memory usage estimation for flushing (Pieter Wuille) 8540113 Non-atomic flushing using the blockchain as replay journal (Pieter Wuille) 8d6625f [MOVEONLY] Move LastCommonAncestor to chain (Pieter Wuille) Pull request description: > This patch adds an extra "head blocks" to the chainstate, which gives the range of blocks for writes may be incomplete. At the start of a flush, we write this record, write the dirty dbcache entries in 16 MiB batches, and at the end we remove the heads record again. If it is present at startup it means we crashed during flush, and we rollback/roll forward blocks inside of it to get a consistent tip on disk before proceeding. > If a flush completes succesfully, the resulting database is compatible with previous versions. If the node crashes in the middle of a flush, a version of the code with this patch is needed to recovery. An adaptation of the following PRs with further modifications to the `feature_dbcrash.py` test to be up-to-date with upstream and solve RPC related bugs. * bitcoin#10148. * Increase RPC wait time. * bitcoin#11831 * bitcoin#11593 * bitcoin#12366 * bitcoin#13837 * bitcoin#13894 ACKs for top commit: random-zebra: ACK aab15d7 Fuzzbawls: ACK aab15d7 Tree-SHA512: 898806746f581a9eb8deb0155c558481abf4454c6f3b3c3ad505c557938d0700fe9796e98e36492286ae869378647072c3ad77ad65e9dd7075108ff96469ade1
A rare race condition may trigger while awaiting the body of a message, see upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details. This may fix some reported rpc hangs/crashes. zcash: currently, we build with libevent-2.1.12, so this fix is needed zcash: cherry picked from commit 6b58360 zcash: bitcoin/bitcoin#11593
The bug was introduced in 2.1.6-beta, versions before that don't need the workaround. zcash: currently, we build with libevent-2.1.12, so this fix is needed zcash: cherry picked from commit 97932cd zcash: bitcoin/bitcoin#11593
Bitcoin 0.16 locking PRs These are locking changes from upstream (bitcoin core) release 0.16 (Aug 14, 2017, to Feb 16, 2018), oldest to newest (when merged to the master branch). Each commit also includes a reference both to the PR and the upstream commit. - bitcoin/bitcoin#11126 - Excludes changes to wallet tests that we don't have. - bitcoin/bitcoin#10916 - first commit only; second commit already merged by d9fcc2b - bitcoin/bitcoin#11107 - Only the last commit. - bitcoin/bitcoin#11593 - bitcoin/bitcoin#11585 - bitcoin/bitcoin#11618 - bitcoin/bitcoin#10286 - Only the third and last commits. - bitcoin/bitcoin#11870 - bitcoin/bitcoin#12330 - bitcoin/bitcoin#12366 - bitcoin/bitcoin#12368 - bitcoin/bitcoin#12333 - Only the first commit.
A rare race condition may trigger while awaiting the body of a message.
This may fix some reported rpc hangs/crashes.
This work-around mimics what libevent does internally once a write has started, which is what usually happens, but not always due to the processing happening on a different thread: https://github.com/libevent/libevent/blob/e7ff4ef2b4fc950a765008c18e74281cdb5e7668/http.c#L373
Fixed upstream at: libevent/libevent@5ff8eb2