Skip to content

Conversation

@theuni
Copy link
Member

@theuni theuni commented Nov 1, 2017

A rare race condition may trigger while awaiting the body of a message.

This may fix some reported rpc hangs/crashes.

This work-around mimics what libevent does internally once a write has started, which is what usually happens, but not always due to the processing happening on a different thread: https://github.com/libevent/libevent/blob/e7ff4ef2b4fc950a765008c18e74281cdb5e7668/http.c#L373

Fixed upstream at: libevent/libevent@5ff8eb2

@laanwj laanwj added this to the 0.15.0.2 milestone Nov 1, 2017
A rare race condition may trigger while awaiting the body of a message, see
upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details.

This may fix some reported rpc hangs/crashes.
@gmaxwell
Copy link
Contributor

gmaxwell commented Nov 1, 2017

utACK.

@achow101
Copy link
Member

achow101 commented Nov 1, 2017

utACK

Copy link
Contributor

@dcousens dcousens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@promag
Copy link
Contributor

promag commented Nov 2, 2017

utACK 6b58360.

For http_set_gencb there is no need to enable/disable since the reply is sent right away in the callback:

bitcoin/src/httpserver.cpp

Lines 291 to 296 in 1b8c884

/** Callback to reject HTTP requests after shutdown. */
static void http_reject_request_cb(struct evhttp_request* req, void*)
{
LogPrint(BCLog::HTTP, "Rejecting request while shutting down\n");
evhttp_send_error(req, HTTP_SERVUNAVAIL, nullptr);
}

@ryanofsky
Copy link
Contributor

ryanofsky commented Nov 2, 2017

I'm trying to reproduce the bug, but not having luck so far. I'm building 6b58360 with the fix commit reverted. My event_get_version_number() is 0x2001500 and I'm running

src/bitcoind -regtest

with

while true; do curl -s -XGET --data-binary . 127.0.0.1:18443; done

@morcos morcos mentioned this pull request Nov 2, 2017
Copy link
Contributor

@TheBlueMatt TheBlueMatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks to at least mimic the fix at libevent/libevent#567

static void http_request_cb(struct evhttp_request* req, void* arg)
{
// Disable reading to work around a libevent bug, fixed in 2.2.0.
if (event_get_version_number() < 0x02020001) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are they not going to backport the fix? It looks like they haven't released a 2.2.0?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsure if they'll backport, but it shouldn't hurt anything if they do. The reason for the version check is in case the behavior changes drastically in the future and enabling the read/write may be harmful. 2.2.0 isn't released, but the fix is in master.

evbuffer_add(evb, strReply.data(), strReply.size());
HTTPEvent* ev = new HTTPEvent(eventBase, true,
std::bind(evhttp_send_reply, req, nStatus, (const char*)nullptr, (struct evbuffer *)nullptr));
auto req_copy = req;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we now need to copy req where we weren't previously?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::bind copied in req by value, but passing "this" into the lambda and using req would use this->req in the callback, which would blow up because req is set to nullptr a few lines below.

This is clunky, but I'm not sure how else to pass a class member into a lambda by value in c++11.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the lambda were changed to simply say "req" (not "req&") then it is copied by value (in this case the pointer is copied).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class members need to be captured with 'this', so unfortunately that doesn't work.

@TheBlueMatt
Copy link
Contributor

utACK 6b58360, assuming libevent/libevent#567 gets merged upstream.

The bug was introduced in 2.1.6-beta, versions before that don't need the
workaround.
@theuni
Copy link
Member Author

theuni commented Nov 2, 2017

Thanks to @ryanofsky for the version info. I bisected the upstream bug and found that it was introduced in 2.1.6-beta, so the workaround has been constrained to the appropriate range.

@jonasschnelli
Copy link
Contributor

utACK 97932cd
A squash of the two commits would be preferable IMO.

@laanwj
Copy link
Member

laanwj commented Nov 2, 2017

utACK 97932cd

@laanwj laanwj merged commit 97932cd into bitcoin:master Nov 2, 2017
laanwj added a commit that referenced this pull request Nov 2, 2017
97932cd rpc: further constrain the libevent workaround (Cory Fields)
6b58360 rpc: work-around an upstream libevent bug (Cory Fields)

Pull request description:

  A rare race condition may trigger while awaiting the body of a message.

  This may fix some reported rpc hangs/crashes.

  This work-around mimics what libevent does internally once a write has started, which is what usually happens, but not always due to the processing happening on a different thread: https://github.com/libevent/libevent/blob/e7ff4ef2b4fc950a765008c18e74281cdb5e7668/http.c#L373

  Fixed upstream at: libevent/libevent@5ff8eb2

Tree-SHA512: b9fa97cae9da2a44101c5faf1e3be0b9cbdf722982d35541cf224be31430779c75e519c8ed18d06ab7487bfb1211069b28f22739f126d6c28ca62d3f73b79a52
maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Nov 2, 2017
A rare race condition may trigger while awaiting the body of a message, see
upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details.

This may fix some reported rpc hangs/crashes.

Github-Pull: bitcoin#11593
Rebased-From: 6b58360
maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Nov 2, 2017
The bug was introduced in 2.1.6-beta, versions before that don't need the
workaround.

Github-Pull: bitcoin#11593
Rebased-From: 97932cd
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Mar 9, 2019
Summary:
A rare race condition may trigger while awaiting the body of a message, see
upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details.

This may fix some reported rpc hangs/crashes.

Backport of Bitcoin Core PR#11593.
Completes T466.

Test Plan:
```make check```
```test_runner.py```
Following the only documented attempt I could find for replicating the bug: bitcoin/bitcoin#11593 (comment)
Running `bitcoind -regtest` in with `while true; do curl -XGET --data-binary . 127.0.0.1:18444; done` while doing rpc commands
This should output `curl: (52) Empty reply from server` (bug would rarely cause a crash)

Reviewers: jasonbcox, deadalnix, Fabien, O1 Bitcoin ABC, #bitcoin_abc

Reviewed By: jasonbcox, O1 Bitcoin ABC, #bitcoin_abc

Subscribers: teamcity, schancel

Differential Revision: https://reviews.bitcoinabc.org/D2642
jonspock pushed a commit to devaultcrypto/devault that referenced this pull request Mar 10, 2019
Summary:
A rare race condition may trigger while awaiting the body of a message, see
upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details.

This may fix some reported rpc hangs/crashes.

Backport of Bitcoin Core PR#11593.
Completes T466.

Test Plan:
```make check```
```test_runner.py```
Following the only documented attempt I could find for replicating the bug: bitcoin/bitcoin#11593 (comment)
Running `bitcoind -regtest` in with `while true; do curl -XGET --data-binary . 127.0.0.1:18444; done` while doing rpc commands
This should output `curl: (52) Empty reply from server` (bug would rarely cause a crash)

Reviewers: jasonbcox, deadalnix, Fabien, O1 Bitcoin ABC, #bitcoin_abc

Reviewed By: jasonbcox, O1 Bitcoin ABC, #bitcoin_abc

Subscribers: teamcity, schancel

Differential Revision: https://reviews.bitcoinabc.org/D2642
jonspock pushed a commit to devaultcrypto/devault that referenced this pull request Mar 11, 2019
Summary:
A rare race condition may trigger while awaiting the body of a message, see
upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details.

This may fix some reported rpc hangs/crashes.

Backport of Bitcoin Core PR#11593.
Completes T466.

Test Plan:
```make check```
```test_runner.py```
Following the only documented attempt I could find for replicating the bug: bitcoin/bitcoin#11593 (comment)
Running `bitcoind -regtest` in with `while true; do curl -XGET --data-binary . 127.0.0.1:18444; done` while doing rpc commands
This should output `curl: (52) Empty reply from server` (bug would rarely cause a crash)

Reviewers: jasonbcox, deadalnix, Fabien, O1 Bitcoin ABC, #bitcoin_abc

Reviewed By: jasonbcox, O1 Bitcoin ABC, #bitcoin_abc

Subscribers: teamcity, schancel

Differential Revision: https://reviews.bitcoinabc.org/D2642
codablock pushed a commit to codablock/dash that referenced this pull request Sep 29, 2019
97932cd rpc: further constrain the libevent workaround (Cory Fields)
6b58360 rpc: work-around an upstream libevent bug (Cory Fields)

Pull request description:

  A rare race condition may trigger while awaiting the body of a message.

  This may fix some reported rpc hangs/crashes.

  This work-around mimics what libevent does internally once a write has started, which is what usually happens, but not always due to the processing happening on a different thread: https://github.com/libevent/libevent/blob/e7ff4ef2b4fc950a765008c18e74281cdb5e7668/http.c#L373

  Fixed upstream at: libevent/libevent@5ff8eb2

Tree-SHA512: b9fa97cae9da2a44101c5faf1e3be0b9cbdf722982d35541cf224be31430779c75e519c8ed18d06ab7487bfb1211069b28f22739f126d6c28ca62d3f73b79a52
DeckerSU added a commit to DeckerSU/komodo that referenced this pull request Jan 26, 2020
A rare race condition may trigger while awaiting the body of a message, see
upsteam commit libevent/libevent@5ff8eb2 for details.

This may fix some reported rpc hangs/crashes.

Issue: bitcoin/bitcoin#11593

Commits:
- bitcoin/bitcoin@6b58360
- bitcoin/bitcoin@97932cd
DeckerSU added a commit to DeckerSU/KomodoOcean that referenced this pull request Jan 26, 2020
A rare race condition may trigger while awaiting the body of a message, see
upsteam commit libevent/libevent@5ff8eb2 for details.

This may fix some reported rpc hangs/crashes.

Issue: bitcoin/bitcoin#11593

Commits:
- bitcoin/bitcoin@6b58360
- bitcoin/bitcoin@97932cd
Fair-Exchange pushed a commit to Fair-Exchange/SafecoinClassic that referenced this pull request May 16, 2020
A rare race condition may trigger while awaiting the body of a message, see
upsteam commit libevent/libevent@5ff8eb2 for details.

This may fix some reported rpc hangs/crashes.

Issue: bitcoin/bitcoin#11593

Commits:
- bitcoin/bitcoin@6b58360
- bitcoin/bitcoin@97932cd
random-zebra added a commit to PIVX-Project/PIVX that referenced this pull request Feb 21, 2021
aab15d7 ReplayBlocks: use find instead of brackets operator to access to the element. (furszy)
e898353 [Refactoring] Use const CBlockIndex* where appropriate (random-zebra)
c76fa04 qa: Extract rpc_timewait as test param (furszy)
0f832e3 shutdown: Stop threads before resetting ptrs (MarcoFalke)
67aebbf http: Remove numThreads and ThreadCounter (Wladimir J. van der Laan)
e24c710 http: Remove WaitExit from WorkQueue (Wladimir J. van der Laan)
b8f7364 http: Join worker threads before deleting work queue (Wladimir J. van der Laan)
7d68769 rpc: further constrain the libevent workaround (Cory Fields)
75af065 rpc: work-around an upstream libevent bug (Cory Fields)
50e5833 Always return true if AppInitMain got to the end (Matt Corallo)
bd70dcc [qa] Test non-atomic chainstate writes (furszy)
8f04970 Dont create pcoinsTip until after ReplayBlocks. (Matt Corallo)
93f2b15 Random db flush crash simulator (Pieter Wuille)
72f3b17 Adapt memory usage estimation for flushing (Pieter Wuille)
8540113 Non-atomic flushing using the blockchain as replay journal (Pieter Wuille)
8d6625f [MOVEONLY] Move LastCommonAncestor to chain (Pieter Wuille)

Pull request description:

  > This patch adds an extra "head blocks" to the chainstate, which gives the range of blocks for writes may be incomplete. At the start of a flush, we write this record, write the dirty dbcache entries in 16 MiB batches, and at the end we remove the heads record again. If it is present at startup it means we crashed during flush, and we rollback/roll forward blocks inside of it to get a consistent tip on disk before proceeding.

  > If a flush completes succesfully, the resulting database is compatible with previous versions. If the node crashes in the middle of a flush, a version of the code with this patch is needed to recovery.

  An adaptation of the following PRs with further modifications to the `feature_dbcrash.py` test to be up-to-date with upstream and solve RPC related bugs.

  * bitcoin#10148.
  * Increase RPC wait time.
  * bitcoin#11831
  * bitcoin#11593
  * bitcoin#12366
  * bitcoin#13837
  * bitcoin#13894

ACKs for top commit:
  random-zebra:
    ACK aab15d7
  Fuzzbawls:
    ACK aab15d7

Tree-SHA512: 898806746f581a9eb8deb0155c558481abf4454c6f3b3c3ad505c557938d0700fe9796e98e36492286ae869378647072c3ad77ad65e9dd7075108ff96469ade1
LarryRuane pushed a commit to LarryRuane/zcash that referenced this pull request Apr 13, 2021
A rare race condition may trigger while awaiting the body of a message, see
upsteam commit 5ff8eb26371c4dc56f384b2de35bea2d87814779 for details.

This may fix some reported rpc hangs/crashes.

zcash: currently, we build with libevent-2.1.12, so this fix is needed
zcash: cherry picked from commit 6b58360
zcash: bitcoin/bitcoin#11593
LarryRuane pushed a commit to LarryRuane/zcash that referenced this pull request Apr 13, 2021
The bug was introduced in 2.1.6-beta, versions before that don't need the
workaround.

zcash: currently, we build with libevent-2.1.12, so this fix is needed
zcash: cherry picked from commit 97932cd
zcash: bitcoin/bitcoin#11593
zkbot added a commit to zcash/zcash that referenced this pull request Apr 17, 2021
Bitcoin 0.16 locking PRs

These are locking changes from upstream (bitcoin core) release 0.16 (Aug 14, 2017, to Feb 16, 2018), oldest to newest (when merged to the master branch).

Each commit also includes a reference both to the PR and the upstream commit.

- bitcoin/bitcoin#11126
  - Excludes changes to wallet tests that we don't have.
- bitcoin/bitcoin#10916
  - first commit only; second commit already merged by d9fcc2b
- bitcoin/bitcoin#11107
  - Only the last commit.
- bitcoin/bitcoin#11593
- bitcoin/bitcoin#11585
- bitcoin/bitcoin#11618
- bitcoin/bitcoin#10286
  - Only the third and last commits.
- bitcoin/bitcoin#11870
- bitcoin/bitcoin#12330
- bitcoin/bitcoin#12366
- bitcoin/bitcoin#12368
- bitcoin/bitcoin#12333
  - Only the first commit.
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants