Skip to content

Conversation

@hebasto
Copy link
Member

@hebasto hebasto commented Oct 31, 2021

The dcd6eeb commit (#23288) introduced an intermittent failure in the psbt_wallet_tests/psbt_updater_test unit test. See #23368.

The test failure can be easily made reproducible with the following patch:

--- a/src/scheduler.cpp
+++ b/src/scheduler.cpp
@@ -57,6 +57,8 @@ void CScheduler::serviceQueue()
             Function f = taskQueue.begin()->second;
             taskQueue.erase(taskQueue.begin());
 
+            UninterruptibleSleep(100ms);
+
             {
                 // Unlock before calling f, so it can reschedule itself or another task
                 // without deadlocking:

This PR implements an idea which was mentioned in the comment:

Yes, as I said before this looks like a race where the wallet is deleted before stopping the scheduler: #23368 (comment)

IIRC, the order should be:

  • stop scheduler

  • delete wallet

  • delete scheduler

The second commit introduces a refactoring with no behavior change.

Fixes #23368.

@mjdietzx
Copy link
Contributor

Code review ACK 68018e4

@maflcko maflcko merged commit 5adc5c0 into bitcoin:master Nov 1, 2021
@hebasto hebasto deleted the 210331-psbt branch November 1, 2021 13:40
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Nov 1, 2021
…_updater_test

68018e4 test: Avoid excessive locking of `cs_wallet` (Hennadii Stepanov)
7986faf test: Fix segfault in the psbt_wallet_tests/psbt_updater_test (Hennadii Stepanov)

Pull request description:

  The dcd6eeb commit (bitcoin#23288) introduced an intermittent failure in the `psbt_wallet_tests/psbt_updater_test` unit test. See bitcoin#23368.

  The test failure can be easily made reproducible with the following patch:
  ```diff
  --- a/src/scheduler.cpp
  +++ b/src/scheduler.cpp
  @@ -57,6 +57,8 @@ void CScheduler::serviceQueue()
               Function f = taskQueue.begin()->second;
               taskQueue.erase(taskQueue.begin());

  +            UninterruptibleSleep(100ms);
  +
               {
                   // Unlock before calling f, so it can reschedule itself or another task
                   // without deadlocking:
  ```

  This PR implements an idea which was mentioned in the [comment](bitcoin#23368 (comment)):
  > Yes, as I said before this looks like a race where the wallet is deleted before stopping the scheduler: [bitcoin#23368 (comment)](bitcoin#23368 (comment))
  >
  > IIRC, the order should be:
  >
  >    * stop scheduler
  >
  >    * delete wallet
  >
  >    * delete scheduler

  The second commit introduces a refactoring with no behavior change.

  Fixes bitcoin#23368.

ACKs for top commit:
  mjdietzx:
    Code review ACK 68018e4

Tree-SHA512: d9103f6252aab807453628159dec243bc543a2595eecaa04ec761dca3c2370085592c55d6f50967d69a4ac6e8b5827eec30dd9b025132c99b0bb9aa5911ad915
maflcko pushed a commit that referenced this pull request Nov 29, 2021
459e208 Exit early for an empty vChecks in CCheckQueue::Add (Hennadii Stepanov)
c43aa62 Avoid excessive lock contention in CCheckQueue::Add (Hennadii Stepanov)

Pull request description:

  This PR significantly reduces lock contention in the `CCheckQueue` class by releasing a mutex before calling `std::condition_variable::notify_one` and `std::condition_variable::notify_all`.

  From C++ [docs](https://en.cppreference.com/w/cpp/thread/condition_variable/notify_one):
  > The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock.

  Related to:
  - #23167
  - #23223

ACKs for top commit:
  martinus:
    ACK 459e208, codereview and tested. I first thought this introduced a segfault in `psbt_wallet_tests/psbt_updater_test` because that test failed for me, but thats a different issue fixed in #23403.
  vasild:
    ACK 459e208
  theStack:
    Code-review ACK 459e208

Tree-SHA512: c197858656392ba3ebcd638d713cf93c9fb48b7b3bad193209490d2828f9c7e3ae4dee6f84674f2f34dceed894139562e29579ee7299e06756c8c990caddc5ed
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Nov 29, 2021
459e208 Exit early for an empty vChecks in CCheckQueue::Add (Hennadii Stepanov)
c43aa62 Avoid excessive lock contention in CCheckQueue::Add (Hennadii Stepanov)

Pull request description:

  This PR significantly reduces lock contention in the `CCheckQueue` class by releasing a mutex before calling `std::condition_variable::notify_one` and `std::condition_variable::notify_all`.

  From C++ [docs](https://en.cppreference.com/w/cpp/thread/condition_variable/notify_one):
  > The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock.

  Related to:
  - bitcoin#23167
  - bitcoin#23223

ACKs for top commit:
  martinus:
    ACK 459e208, codereview and tested. I first thought this introduced a segfault in `psbt_wallet_tests/psbt_updater_test` because that test failed for me, but thats a different issue fixed in bitcoin#23403.
  vasild:
    ACK 459e208
  theStack:
    Code-review ACK 459e208

Tree-SHA512: c197858656392ba3ebcd638d713cf93c9fb48b7b3bad193209490d2828f9c7e3ae4dee6f84674f2f34dceed894139562e29579ee7299e06756c8c990caddc5ed
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jan 25, 2022
…scheduler

Summary:
> The order should be:
>  - stop scheduler
>  - delete wallet
>  - delete scheduler

This is a partial backport of [[bitcoin/bitcoin#23403 | core#23403]]
bitcoin/bitcoin@68018e4

The other commit in that pull request is not applicable to Bitcoin ABC, for now. It has a dependency on [[bitcoin/bitcoin#23288 | core#23288]], which has itself a lot of dependencies on recent descriptor wallets developments.

Test Plan:
With TSAN:
`ninja && ninja check`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10883
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Apr 11, 2022
459e208 Exit early for an empty vChecks in CCheckQueue::Add (Hennadii Stepanov)
c43aa62 Avoid excessive lock contention in CCheckQueue::Add (Hennadii Stepanov)

Pull request description:

  This PR significantly reduces lock contention in the `CCheckQueue` class by releasing a mutex before calling `std::condition_variable::notify_one` and `std::condition_variable::notify_all`.

  From C++ [docs](https://en.cppreference.com/w/cpp/thread/condition_variable/notify_one):
  > The notifying thread does not need to hold the lock on the same mutex as the one held by the waiting thread(s); in fact doing so is a pessimization, since the notified thread would immediately block again, waiting for the notifying thread to release the lock.

  Related to:
  - bitcoin#23167
  - bitcoin#23223

ACKs for top commit:
  martinus:
    ACK 459e208, codereview and tested. I first thought this introduced a segfault in `psbt_wallet_tests/psbt_updater_test` because that test failed for me, but thats a different issue fixed in bitcoin#23403.
  vasild:
    ACK 459e208
  theStack:
    Code-review ACK 459e208

Tree-SHA512: c197858656392ba3ebcd638d713cf93c9fb48b7b3bad193209490d2828f9c7e3ae4dee6f84674f2f34dceed894139562e29579ee7299e06756c8c990caddc5ed
@bitcoin bitcoin locked and limited conversation to collaborators Nov 2, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

qa: Intermittent failures in the psbt_wallet_tests/psbt_updater_test unit test

3 participants