Skip to content

Conversation

@furszy
Copy link
Member

@furszy furszy commented May 2, 2023

Basically, with bdb, we can't make a write operation while we are traversing the db with the same db handler. These two operations are performed in different txn contexts and cause a deadlock.

Added coverage by using EraseRecords() which is the simplest function that executes this process.

To replicate it, need bdb support and drop the first commit. The test will run forever.

@DrahtBot
Copy link
Contributor

DrahtBot commented May 2, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
ACK hebasto, achow101

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #27286 (wallet: Keep track of the wallet's own transaction outputs in memory by achow101)
  • #24914 (wallet: Load database records in a particular order by achow101)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't look closer at the changes yet, but can confirm that running the newly introduced unit test via

$ ./src/test/test_bitcoin --log_level=all --run_test=walletdb_tests/walletdb_read_write_deadlock

succeeds on the PR branch head and hangs if the bugfix commit bda562ddba13ee8f94147734ab9901ab07357f2f is reverted.

@achow101
Copy link
Member

achow101 commented May 3, 2023

ACK d3419f4108669d37a4ccecab77750e70c2f6237c

@achow101
Copy link
Member

achow101 commented May 3, 2023

Backport shouldn't be required as I don't think there's any user action that can trigger the deadlock.

@furszy furszy force-pushed the 2023_wallet_db_deadlock branch from d3419f4 to de202f8 Compare May 3, 2023 15:59
@furszy
Copy link
Member Author

furszy commented May 3, 2023

Updated per feedback. Thanks achow101.

Only diff is BerkeleyCursor requiring a WalletBatch ref instead of a pointer.

@achow101
Copy link
Member

achow101 commented May 3, 2023

ACK de202f860b3f79774d5235dec59a140fa5c82fa7

@hebasto
Copy link
Member

hebasto commented May 11, 2023

Concept ACK.

@hebasto
Copy link
Member

hebasto commented May 11, 2023

Does this PR makes

outdated?

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach ACK de202f860b3f79774d5235dec59a140fa5c82fa7.

@furszy
Copy link
Member Author

furszy commented May 12, 2023

Does this PR makes

outdated?

@hebasto hmm, interesting. It seems to be outdated in master. Just ran the test suite there, without the suppression, and it passed.

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK de202f860b3f79774d5235dec59a140fa5c82fa7, tested on Ubutnu 23.04.

To replicate it, need bdb support and drop the first commit. The test will run forever.

The test is deadlocked with the second commit dropped as well.

furszy added 3 commits May 15, 2023 12:23
…yCursor

If the batch ptr is not passed, the cursor will not use the db active
txn context which could lead to a deadlock if the code tries to modify
the db while it is traversing it.

E.g. the 'EraseRecords()' function.
so we erase all the records atomically or abort the entire
procedure.

and, at the same time, we can share the same db txn context
for the db cursor and the erase functionality.

extra note from the Db.cursor doc:
"If transaction protection is enabled, cursors must be
opened and closed within the context of a transaction"

thus why added a `CloseCursor` call before calling to
`TxnAbort/TxnCommit`.
@furszy furszy force-pushed the 2023_wallet_db_deadlock branch from de202f8 to 69d4390 Compare May 15, 2023 15:31
@furszy
Copy link
Member Author

furszy commented May 15, 2023

rebased, conflicts solved.

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 69d4390

@achow101
Copy link
Member

ACK 69d4390

@DrahtBot DrahtBot removed the request for review from achow101 May 18, 2023 15:02
@achow101 achow101 merged commit 6cc136b into bitcoin:master May 18, 2023
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 19, 2023
69d4390 test: add coverage for wallet read write db deadlock (furszy)
12daf6f walletdb: scope bdb::EraseRecords under a single db txn (furszy)
043fcb0 wallet: bugfix, GetNewCursor() misses to provide batch ptr to BerkeleyCursor (furszy)

Pull request description:

  Decoupled from bitcoin#26644 so it can closed in favor of bitcoin#26715.

  Basically, with bdb, we can't make a write operation while we are traversing the db with the same db handler. These two operations are performed in different txn contexts and cause a deadlock.

  Added coverage by using `EraseRecords()` which is the simplest function that executes this process.

  To replicate it, need bdb support and drop the first commit. The test will run forever.

ACKs for top commit:
  achow101:
    ACK 69d4390
  hebasto:
    re-ACK 69d4390

Tree-SHA512: b3773be78925f674e962f4a5c54b398a9d0cfe697148c01c3ec0d68281cc5c1444b38165960d219ef3cf1a57c8ce6427f44a876275958d49bbc0808486e19d7d
@furszy furszy deleted the 2023_wallet_db_deadlock branch August 9, 2023 15:40
@bitcoin bitcoin locked and limited conversation to collaborators Aug 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants