Skip to content

Conversation

@promag
Copy link
Contributor

@promag promag commented Mar 31, 2020

Release locks before calling rpcRunLater.

Quick explanation: rpcRunLater leads to event_free which calls event_del which can wait for the event callback to finish if it's already running and that callback will try to lock wallet mutex - which is already locked in http thread.

Fixes #14995 , fixes #18482. Best reviewed with whitespace changes hidden.

@promag promag force-pushed the 2020-04-fix-rpcrunlater-race branch from 10a1eb2 to 0944caf Compare April 1, 2020 00:13
@promag promag closed this Apr 1, 2020
@promag promag reopened this Apr 1, 2020
@promag promag changed the title wip: rpc: Fix rpcRunLater race in walletpassphrase rpc: Fix rpcRunLater race in walletpassphrase Apr 1, 2020
@laanwj laanwj added this to the 0.20.0 milestone Apr 1, 2020
@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 1, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@maflcko
Copy link
Member

maflcko commented Apr 2, 2020

Can be tested with #18507

@promag promag force-pushed the 2020-04-fix-rpcrunlater-race branch from 0944caf to 47dd37c Compare April 2, 2020 15:29
@maflcko
Copy link
Member

maflcko commented Apr 2, 2020

ACK 47dd37cc83bdcaf19a3abe95950b2409baae6249 only tested that this avoids the node freezing. Didn't look at how libevent works or how the deadlock happens or if this breaks other stuff.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 0944cafe294d4ee009b615fc8d8caca77c117481. This is a nice, clean bug fix, but agree with Marco's suggestions to improve the comment. Adding the explanation of why the lock is released from the PR description would be very helpful. And it doesn't seem right to say "Currently this must be called without locks held", when I don't think it was or will be safe to safe to add a timer with locks held.

@promag promag force-pushed the 2020-04-fix-rpcrunlater-race branch from 47dd37c to 7b8e157 Compare April 2, 2020 16:25
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 7b8e157. Just updated comment since last review

@maflcko
Copy link
Member

maflcko commented Apr 2, 2020

ACK 7b8e157, only tested that this avoids the node freezing. Did not look at how libevent works or how the deadlock happens or if this breaks other stuff. 📞

Show signature and timestamp

Signature:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

ACK 7b8e15728d, only tested that this avoids the node freezing. Did not look at how libevent works or how the deadlock happens or if this breaks other stuff. 📞
-----BEGIN PGP SIGNATURE-----

iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
pUjKCgwAjQesMVDRk/0HJqT63QzYXgsky/+kGB7rdd3glG/ytl06M6dgM3bgepAv
t2Jfe/uGUugB9v0ettQ5/MkXewKi/r9dllozXIf6ZYyW0tzQKN22kSO6ybRcn0bL
GJDJeK35QG5CEoRB8kJiwohGhxPBeI0HF9LkkfEb7eLsXJGH4hliuQ9xpc0KGr32
0Sj3SUttKlK4xUbyDVyC7VOGeD53ZphJFiq+HPMY3lm6FmXn5Z2wXv6/FeV2jFQb
oxCCJ52gBzijv5YImikXjeOBEElUhI2IuIKCmTvNgXFv3T/c/ToSPShFpH2j2btP
cCzsIK3pex8fJSZI1ZE+PszAeh2FejdDN144fP7Ds/9fOwXIARU/vPVCgXeDsHZe
0yl17Qtl6p1YAGK2NGDXNM8UOA9MtzqAUyDBl4jRblIux3qEwEsxskjsDUv5d3JX
Im6mmvKOk/5loNPGXNhIE150YdbYZ4PNKjy4ke6rLvMW60tX0PnhfwOKWGdVM5iw
Lxj6BnQy
=jUhE
-----END PGP SIGNATURE-----

Timestamp of file with hash 72e45144d2300c2bfb37bf5663a6b1678638fd7ac770b9ac957658a6a77b4224 -

@promag
Copy link
Contributor Author

promag commented Apr 2, 2020

I didn't want to write libevent details in rpcwallet and also how RPCRunLater is implemented.

@promag
Copy link
Contributor Author

promag commented Apr 2, 2020

@MarcoFalke the deadlock happens in this case:

  • (we are in one of the http thread)
  • walletpassphrase
  • RPCRunLater
  • ~HTTPEvent()
  • event_free()
  • event_del()
  • event_del_(, EVENT_DEL_AUTOBLOCK);

And from libevent header

/** Argument for event_del_nolock_. Tells event_del to block on the event
 * if it is running in another thread and it doesn't have EV_FINALIZE set.
 */
#define EVENT_DEL_AUTOBLOCK 2

And the event callback (triggered from the event base thread) will lock cs_wallet and deadlock:

pwallet->chain().rpcRunLater(strprintf("lockwallet(%s)", pwallet->GetName()), [weak_wallet] {
if (auto shared_wallet = weak_wallet.lock()) {
LOCK(shared_wallet->cs_wallet);
shared_wallet->Lock();
shared_wallet->nRelockTime = 0;
}
}, nSleepTime);

Also, I think its reasonable to reduce lock scope (which we already did in other cases for other reasons) than to change the EV_FINALIZE flag.

or if this breaks other stuff.

I can't see how that could happen. but..

@laanwj laanwj merged commit 75021e8 into bitcoin:master Apr 6, 2020
@promag promag deleted the 2020-04-fix-rpcrunlater-race branch April 6, 2020 18:36
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Apr 8, 2020
7b8e157 rpc: Fix rpcRunLater race in walletpassphrase (João Barbosa)

Pull request description:

  Release locks before calling `rpcRunLater`.

  Quick explanation: `rpcRunLater` leads to `event_free` which calls `event_del` which can wait for the event callback to finish if it's already running and that callback will try to lock wallet mutex - which is already locked in http thread.

  Fixes bitcoin#14995 , fixes bitcoin#18482. Best reviewed with whitespace changes hidden.

ACKs for top commit:
  MarcoFalke:
    ACK 7b8e157, only tested that this avoids the node freezing. Did not look at how libevent works or how the deadlock happens or if this breaks other stuff. 📞
  ryanofsky:
    Code review ACK 7b8e157. Just updated comment since last review

Tree-SHA512: 17874a2fa7b0e164fb0d7ee4cb7d59650275b8c03476fb291d60af8b758495457660d3912623fb26259fefe84aeba21c0a9e0c6467982ba511f19344ed5413ab
@promag
Copy link
Contributor Author

promag commented Apr 29, 2020

or if this breaks other stuff.

I can't see how that could happen. but..

@MarcoFalke but #18811 happened.

fanquake added a commit that referenced this pull request May 13, 2020
9f59dde rpc: Relock wallet only if most recent callback (João Barbosa)
a2e6db5 rpc: Add mutex to guard deadlineTimers (João Barbosa)

Pull request description:

  This PR fixes an early relocking race condition from #18811 where old relock callback runs after new wallet unlock code and nRelockTime update but before rpcRunLater call, causing early relock and incorrect nRelockTime time

  Issue introduced in #18487.
  Fixes #18811.

ACKs for top commit:
  MarcoFalke:
    ACK 9f59dde
  ryanofsky:
    Code review ACK 9f59dde. No changes since last review except squashing commits.
  jonatack:
    ACK 9f59dde

Tree-SHA512: 2f7fc03e5ab6037337f2d82dfad432495cc337c77d07c968ee2355105db6292f24543c03456f5402e0e759577a4327758f9372f7ea29de6d56dc3695fda9b379
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 14, 2020
9f59dde rpc: Relock wallet only if most recent callback (João Barbosa)
a2e6db5 rpc: Add mutex to guard deadlineTimers (João Barbosa)

Pull request description:

  This PR fixes an early relocking race condition from bitcoin#18811 where old relock callback runs after new wallet unlock code and nRelockTime update but before rpcRunLater call, causing early relock and incorrect nRelockTime time

  Issue introduced in bitcoin#18487.
  Fixes bitcoin#18811.

ACKs for top commit:
  MarcoFalke:
    ACK 9f59dde
  ryanofsky:
    Code review ACK 9f59dde. No changes since last review except squashing commits.
  jonatack:
    ACK 9f59dde

Tree-SHA512: 2f7fc03e5ab6037337f2d82dfad432495cc337c77d07c968ee2355105db6292f24543c03456f5402e0e759577a4327758f9372f7ea29de6d56dc3695fda9b379
jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Oct 13, 2020
Summary:
7b8e15728d1ad058a4b7d7569fd5d5ba6806ca28 rpc: Fix rpcRunLater race in walletpassphrase (João Barbosa)

Pull request description:

  Release locks before calling `rpcRunLater`.

  Quick explanation: `rpcRunLater` leads to `event_free` which calls `event_del` which can wait for the event callback to finish if it's already running and that callback will try to lock wallet mutex - which is already locked in http thread.

  Fixes #14995 , fixes #18482. Best reviewed with whitespace changes hidden.

Backport of Core [[bitcoin/bitcoin#18487 | PR18487]]
Fixes #375

Test Plan: `ninja check check-functional`

Reviewers: #bitcoin_abc, deadalnix

Reviewed By: #bitcoin_abc, deadalnix

Differential Revision: https://reviews.bitcoinabc.org/D7908
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

RPC stuck on walletpassphrase call possible deadlock with walletpassphrase

5 participants