index: During sync, commit best block after indexing #25074

mzumsande · 2022-05-06T12:34:40Z

This changes the periodic commit of the best block during the index sync phase to use the already indexed predecessor of the current block index, instead of committing the current one that will only be indexed (by calling WriteBlock()) after committing the best block.

The previous code would leave the index database in an inconsistent state until the block is actually indexed - if an unclean shutdown happened at just this point in time, the index could get corrupted because at next startup, we'd assume that we have already indexed this block.

theStack · 2022-05-06T12:54:17Z

Concept ACK

w0xlt

ACK 75f53f1

pk-b2

ACK 75f53f1

luke-jr · 2022-05-09T21:33:57Z

I wonder if it would make sense to always re-index the initial best block too, just to avoid accidentally running with such corruption undetected.

luke-jr · 2022-05-09T21:39:13Z

src/index/base.cpp

current_time is stale here. Maybe instead of moving this, it'd be better to just SetBestBlockIndex(pindex->pprev);?

re: #25074 (comment)

current_time is stale here. Maybe instead of moving this, it'd be better to just SetBestBlockIndex(pindex->pprev);?

Yeah probably the reason this bug was introduced is that original author was trying to avoid current_time being stale. But pindex->pprev would be the wrong best block index to use in the Rewind() case above. It would be pointing to parent of the rewind block instead of the block that was actually rewound to. I guess this would be better than current behavior, but still technically incorrect, and seems like it would be fragile even if current index subclasses are ok with it.

I think using stale rewind time is ok and probably better than the alternatives, even if it seems a little wonky. This is all replaced anyway in 1819bdb from #24230, as Martin noted in #24230 (comment)

But pindex->pprev would be the wrong best block index to use in the Rewind() case above. It would be pointing to parent of the rewind block instead of the block that was actually rewound to. I guess this would be better than current behavior, but still technically incorrect,

I don't see how it would even be incorrect... After the rewind, pindex is changed to the block one beyond that which the index should be current to...?

I don't see how it would even be incorrect... After the rewind, pindex is changed to the block one beyond that which the index should be current to...?

Oops, you're right. I didn't notice NextSyncBlock was returning the next block after the fork. So I think your pindex->pprev suggestion is good and makes sense, and probably a little better than the current fix, though I think both fixes are fine.

I changed it to Luke's suggestion and adjusted the commit message, thanks.
While this is simpler, current_time being a bit stale shouldn't ever be a serious problem in my opinion - I think of updating the best block / committing it every 30s as a non-essential but nice-to-have feature that may save users some time in case of shutdowns, if we removed this block completely, it shouldn't break anything.

ryanofsky

Code review ACK 75f53f1e1fc6917eac40e51ce7b6bec2717bd526. Nice catch. This just changes the index sync thread to commit after writing instead of committing before writing, so the block referenced in the commit that is supposed to have been written is actually written.

ryanofsky · 2022-05-11T02:02:26Z

src/index/base.cpp

re: #25074 (comment)

current_time is stale here. Maybe instead of moving this, it'd be better to just SetBestBlockIndex(pindex->pprev);?

Yeah probably the reason this bug was introduced is that original author was trying to avoid current_time being stale. But pindex->pprev would be the wrong best block index to use in the Rewind() case above. It would be pointing to parent of the rewind block instead of the block that was actually rewound to. I guess this would be better than current behavior, but still technically incorrect, and seems like it would be fragile even if current index subclasses are ok with it.

I think using stale rewind time is ok and probably better than the alternatives, even if it seems a little wonky. This is all replaced anyway in 1819bdb from #24230, as Martin noted in #24230 (comment)

naumenkogs · 2022-05-12T07:48:21Z

Concept ACK

DrahtBot · 2022-05-12T09:17:42Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#24230 (indexes: Stop using node internal types and locking cs_main, improve sync logic by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

mzumsande · 2022-05-13T00:31:50Z

I pushed an update, implementing Luke's suggestion.

I wonder if it would make sense to always re-index the initial best block too, just to avoid accidentally running with such corruption undetected.

It does get detected for the coinstatsindex, leading to an init error - that's how I encountered this, playing around with uprobes that send kill signals to bitcoind (branch). But other indexes may not.
Since this should not be possible unless there is a logic bug in the index code, checking on startup that there is an indexed entry for the best block / aborting otherwise may be another way instead of attempting to recover ?

luke-jr

utACK

ryanofsky

Code review ACK 9d4509b64ee333ef4587c77fb0d52e8e720a28c4, but it would be good to update commit message & PR description.

I was thinking probably it would be possible write a unit test for this with a mock BaseIndex subclass that overrides the CommitInternal() and WriteBlock() methods and checks that commit is called with the last block written, not the next block about to be written during sync. I don't think this is necessary however, since the bug is pretty theoretical and seems unlikely to happen in practice.

re: #25074 (comment)

This rearranges the order during the index sync phase to do the periodic update of the best block (Commit()) only after actually indexing this block (WriteBlock()). The previous order would leave the index database temporarily in an inconsistent state - if an unclean shutdown happened at just this point in time, the index could get corrupted because at next startup, we'd assume that we have already indexed this block. Indexing the block ahead of committing the best block is the typical order and not a problem - at worst, we'd have to index this block again.

This PR description is out of date and would be good to fix because it will become part of the merge history.

src/index/base.cpp

Committing a block prior to indexing would leave the index database in an inconsistent state until it is indexed, which could corrupt the index in case of a unclean shutdown. Thus commit its predecessor. Co-authored-by: Luke Dashjr <[email protected]>

mzumsande · 2022-05-19T11:39:59Z

9d4509b -> 7171ebc: Updated the commit message and the PR description.

I was thinking probably it would be possible write a unit test for this with a mock BaseIndex subclass that overrides the CommitInternal() and WriteBlock() methods and checks that commit is called with the last block written, not the next block about to be written during sync. I don't think this is necessary however, since the bug is pretty theoretical and seems unlikely to happen in practice.

I think that having a test-only mock index is a really interesting idea, and that it could help with increasing the test coverage. I'd be interested to look into this as a follow-up, with the goal of improving the test coverage of the base index logic more broadly (not only with respect to this issue, I think if the mock index class tested only this it might be a bit overkill).

ryanofsky

Code review ACK 7171ebc. Looks great! Just commit message changes since last review

Committing a block prior to indexing would leave the index database in an inconsistent state until it is indexed, which could corrupt the index in case of a unclean shutdown. Thus commit its predecessor. Co-authored-by: Luke Dashjr <[email protected]> Github-Pull: bitcoin#25074 Rebased-From: 7171ebc

…xing 7171ebc index: Don't commit a best block before indexing it during sync (Martin Zumsande) Pull request description: This changes the periodic commit of the best block during the index sync phase to use the already indexed predecessor of the current block index, instead of committing the current one that will only be indexed (by calling `WriteBlock()`) after committing the best block. The previous code would leave the index database in an inconsistent state until the block is actually indexed - if an unclean shutdown happened at just this point in time, the index could get corrupted because at next startup, we'd assume that we have already indexed this block. ACKs for top commit: ryanofsky: Code review ACK 7171ebc. Looks great! Just commit message changes since last review Tree-SHA512: a008de511dd6a1731b7fdf6a90add48d1e53f7f7d6402672adb83e362677fc5b9f5cd021d3111728cb41d73f1b9c2140db79d7e183df0ab359cda8c01b0ef928

fanquake added the UTXO Db and Indexes label May 6, 2022

mzumsande mentioned this pull request May 6, 2022

indexes: Stop using node internal types and locking cs_main, improve sync logic #24230

Draft

fanquake requested a review from ryanofsky May 6, 2022 12:39

w0xlt approved these changes May 6, 2022

View reviewed changes

pk-b2 approved these changes May 9, 2022

View reviewed changes

luke-jr reviewed May 9, 2022

View reviewed changes

ryanofsky approved these changes May 11, 2022

View reviewed changes

mzumsande force-pushed the 202205_index_sync_order branch from 75f53f1 to 9d4509b Compare May 13, 2022 00:08

luke-jr approved these changes May 13, 2022

View reviewed changes

ryanofsky approved these changes May 19, 2022

View reviewed changes

src/index/base.cpp Outdated Show resolved Hide resolved

mzumsande force-pushed the 202205_index_sync_order branch from 9d4509b to 7171ebc Compare May 19, 2022 11:21

ryanofsky approved these changes May 19, 2022

View reviewed changes

fanquake merged commit e18fd47 into bitcoin:master May 19, 2022

mzumsande deleted the 202205_index_sync_order branch May 19, 2022 13:26

bitcoin locked and limited conversation to collaborators May 19, 2023

index: During sync, commit best block after indexing #25074

index: During sync, commit best block after indexing #25074

Uh oh!

Conversation

mzumsande commented May 6, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

theStack commented May 6, 2022

Uh oh!

w0xlt left a comment

Choose a reason for hiding this comment

Uh oh!

pk-b2 left a comment

Choose a reason for hiding this comment

Uh oh!

luke-jr commented May 9, 2022

Uh oh!

luke-jr May 9, 2022

Choose a reason for hiding this comment

Uh oh!

ryanofsky May 11, 2022

Choose a reason for hiding this comment

Uh oh!

luke-jr May 11, 2022

Choose a reason for hiding this comment

Uh oh!

ryanofsky May 11, 2022

Choose a reason for hiding this comment

Uh oh!

mzumsande May 13, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

ryanofsky left a comment

Choose a reason for hiding this comment

Uh oh!

ryanofsky May 11, 2022

Choose a reason for hiding this comment

Uh oh!

naumenkogs commented May 12, 2022

Uh oh!

DrahtBot commented May 12, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Conflicts

Uh oh!

mzumsande commented May 13, 2022

Uh oh!

luke-jr left a comment

Choose a reason for hiding this comment

Uh oh!

ryanofsky left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

mzumsande commented May 19, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ryanofsky left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

9 participants

mzumsande commented May 6, 2022 •

edited

Loading

mzumsande May 13, 2022 •

edited

Loading

DrahtBot commented May 12, 2022 •

edited

Loading

mzumsande commented May 19, 2022 •

edited

Loading