Log explicit error message when coindb is found in inconsistent state #28350

fjahr · 2023-08-27T11:29:15Z

While doing manual testing on assumeutxo this week, I managed to put the coindb into an inconsistent state twice. For a normal user, this can also happen if their computer crashes during a flush, or if they try to stop their node during a flush, get tired of waiting, and shut down their computer or kill the process. It's an edge case, but I wouldn't be surprised if it happens more often once assumeutxo is used more widely: several flushes may happen while the UTXO set is being loaded at the beginning, and users may think something is going wrong because of the unexpected wait, or they may have forgotten some config options and want to start over quickly.

The problem is that when this happens, the node at first starts up normally until it's time to flush again, and then it hits an assert that the user cannot understand:

2023-08-25T16:31:09Z [httpworker.0] [snapshot] 52000000 coins loaded (43.30%, 6768 MB)
2023-08-25T16:31:16Z [httpworker.0] Cache size (7272532192) exceeds total space (7256510300)
2023-08-25T16:31:16Z [httpworker.0] FlushSnapshotToDisk: flushing coins cache (7272 MB) started
Assertion failed: (old_heads[0] == hashBlock), function BatchWrite, file txdb.cpp, line 126.
Abort trap: 6

We should at least log an error message that gives users a hint of what the problem is and what they can do to resolve it. I am keeping this separate from the assumeutxo project since this issue can also happen during any regular flush.
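A minimal sketch of the log-before-assert pattern described above, not the actual patch. It assumes the coins database records two head blocks while a flush is in progress and that the first one must match the block currently being flushed to, as the failed assertion `old_heads[0] == hashBlock` suggests. The names `BlockHash`, `DescribeInconsistentHeads`, and `CheckHeadsOrAbort`, the use of `std::cerr` instead of the node's logging macros, and the message wording are all hypothetical.

```cpp
#include <cassert>
#include <iostream>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a block hash; the real code uses its own hash type.
using BlockHash = std::string;

// Returns a user-facing explanation if the recorded head blocks indicate an
// interrupted flush that does not match the block being flushed to now.
// Returns std::nullopt when the state looks consistent.
std::optional<std::string> DescribeInconsistentHeads(
    const std::vector<BlockHash>& old_heads, const BlockHash& hash_block)
{
    // Assumption: exactly two head blocks means an earlier flush was interrupted.
    if (old_heads.size() == 2 && old_heads[0] != hash_block) {
        return "The coins database detected an inconsistent state, likely due "
               "to a previous crash or shutdown. A reindex is needed to recover.";
    }
    return std::nullopt;
}

// Logs the explanation first, then lets the existing assertion fire.
void CheckHeadsOrAbort(const std::vector<BlockHash>& old_heads,
                       const BlockHash& hash_block)
{
    const std::optional<std::string> msg =
        DescribeInconsistentHeads(old_heads, hash_block);
    if (msg) {
        std::cerr << "Error: " << *msg << '\n';
    }
    // Behavior is unchanged: the process still aborts on an inconsistent
    // state, but the log now tells the user why.
    assert(!msg.has_value());
}
```

The check is split from the abort so that the message can be inspected on its own; the assertion itself stays in place, so the only change in behavior is the extra log line.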

DrahtBot · 2023-08-27T11:29:18Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	jonatack, ryanofsky, jamesob, achow101

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

jonatack

utACK be1b6d5122355e2eff6bde4efd88a51c2740761e

src/txdb.cpp

jonatack · 2023-08-27T18:40:51Z

ACK df60de7

ryanofsky

Code review ACK df60de7

This is an improvement, but I would think just killing the process should not put the coindb in an inconsistent state that would require a reindex. Am I wrong about that, or is there more work that could be done here to debug the issue and update the database atomically?

jamesob

Code review ACK df60de7

achow101 · 2023-09-01T17:17:36Z

ACK df60de7

luke-jr · 2023-09-05T20:33:46Z

"For a normal user, this can also happen if their computer crashes during a flush or if they try to stop their node during a flush and then get tired of waiting and just shut their computer down or kill the process."

It shouldn't... That would be a bug.

…in inconsistent state

df60de7 log: Print error message when coindb is in inconsistent state (Fabian Jahr)

Pull request description:

While doing manual testing on assumeutxo this week I managed to put the coindb into an inconsistent state twice. For a normal user, this can also happen if their computer crashes during a flush or if they try to stop their node during a flush and then get tired of waiting and just shut their computer down or kill the process. It's an edge case but I wouldn't be surprised if this does happen more often when assumeutxo gets used more widely because there might be multiple flushes happening during loading of the UTXO set in the beginning and users may think something is going wrong because of the unexpected wait or they forgot some configs and want to start over quickly.

The problem is, when this happens at first the node starts up normally until it's time to flush again and then it hits an assert that the user can not understand.

```
2023-08-25T16:31:09Z [httpworker.0] [snapshot] 52000000 coins loaded (43.30%, 6768 MB)
2023-08-25T16:31:16Z [httpworker.0] Cache size (7272532192) exceeds total space (7256510300)
2023-08-25T16:31:16Z [httpworker.0] FlushSnapshotToDisk: flushing coins cache (7272 MB) started
Assertion failed: (old_heads[0] == hashBlock), function BatchWrite, file txdb.cpp, line 126.
Abort trap: 6
```

We should at least log an error message that gives users a hint of what the problem is and what they can do to resolve it. I am keeping this separate from the assumeutxo project since this issue can also happen during any regular flush.

ACKs for top commit:
jonatack: ACK df60de7
achow101: ACK df60de7
ryanofsky: Code review ACK df60de7
jamesob: Code review ACK df60de7

Tree-SHA512: b546aa0b0323ece2962867a29c38e014ac83ae8f1ded090da2894b4ff2450c05229629c7e8892f7b550cf7def4038a0b4119812e548e11b00c60b1dc3d4276d2

DrahtBot added the CI failed label Aug 27, 2023

fjahr mentioned this pull request Aug 27, 2023

assumeutxo (2) #27596

Merged

DrahtBot removed the CI failed label Aug 27, 2023

jonatack reviewed Aug 27, 2023


log: Print error message when coindb is in inconsistent state

df60de7

fjahr force-pushed the 202308-coindb-error branch from be1b6d5 to df60de7 Compare August 27, 2023 18:16

ryanofsky approved these changes Aug 29, 2023


jamesob approved these changes Aug 29, 2023


glozow added the Utils/log/libs label Sep 1, 2023

achow101 merged commit df98a12 into bitcoin:master Sep 1, 2023

bitcoin locked and limited conversation to collaborators Sep 4, 2024
