Skip to content

Conversation

@ryanofsky
Copy link
Contributor

@ryanofsky ryanofsky commented Feb 1, 2022

This PR lets indexing code mostly run outside of the node process. It also improves indexing sync logic, which is moved out of indexing code to a new node::SyncChain() function.

Almost all the commits in this PR are small refactoring changes that move code from src/index/ to src/node/, or replace references to node types like CBlockIndex, CChain, CChainState in index code. There are only two commits affecting indexing sync logic which make complicated or substantive changes:

The commit messages have more details about these and other changes. Followups to this PR will reuse indexing sync code for wallets (#15719, #11756) and let indexes run in separate processes (#10102)

@ryanofsky ryanofsky changed the title indexes: Stop using node API and locking cs_main, improve sync logic indexes: Stop using node internal types and locking cs_main, improve sync logic Feb 1, 2022
@ryanofsky
Copy link
Contributor Author

Rebased 8d8cdcb -> 1fcfc73 (pr/indexy.1 -> pr/indexy.2, compare) due to conflict with #22932. Also fixed some bugs in the last commit 🙄

@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 2, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/24230.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK fjahr, mzumsande, jamesob, aureleoules, josibake
Approach ACK TheCharlatan

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #33324 (blocks: add -reobfuscate-blocks arg to xor existing blk/rev on startup by l0rinc)
  • #33042 (refactor: inline constant return values from dbwrapper write methods by l0rinc)
  • #32885 (Cache m_cached_finished_ibd where SetTip is called. by pstratem)
  • #32875 (index: handle case where pindex_prev equals chain tip in NextSyncBlock() by HowHsu)
  • #31308 (ci, iwyu: Treat warnings as errors for specific directories by hebasto)
  • #31260 (scripted-diff: Type-safe settings retrieval by ryanofsky)
  • #29770 (index: Check all necessary block data is available before starting to sync by fjahr)
  • #29640 (Fix tiebreak when loading blocks from disk (and add tests for comparing chain ties) by sr-gi)
  • #28690 (build: Introduce internal kernel library by TheCharlatan)
  • #26966 (index: initial sync speedup, parallelize process by furszy)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

  • "the error will be already be logged." -> "the error will already be logged." [duplicate "be" makes the sentence ungrammatical]
  • "the error will be already be logged." -> "the error will already be logged." [same duplicate "be" appears elsewhere]
  • "and there will be a fatal error if there an attempt to connect a another block to the index." -> "and there will be a fatal error if there is an attempt to connect another block to the index." [missing "is" and extra "a" make the sentence ungrammatical]
  • "a users fault." -> "a user's fault." [missing possessive apostrophe makes the phrase grammatically incorrect]
  • "If the locator does not point to the best blocks or one of its ancestors," -> "If the locator does not point to the best block or one of its ancestors," [plural "blocks" is not grammatically correct in this context]

drahtbot_id_5_m

@DrahtBot DrahtBot mentioned this pull request Feb 2, 2022
18 tasks
@ryanofsky
Copy link
Contributor Author

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some questions and comments while reviewing the first 8 commits.

My brain got friend while trying to understand 15398b1, even with --color-moved and probably because the original logic is also a bit confusing. So I'll try that again later.

Copy link
Contributor Author

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated 3ab1ebe -> d27072d (pr/indexy.4 -> pr/indexy.5, compare) with various suggested changes and a few other tweaks, all minor

My brain got friend while trying to understand 15398b1, even with --color-moved and probably because the original logic is also a bit confusing. So I'll try that again later.

This is good feedback. I split the commit up into two parts, so the boilerplate code moving part is separate from the other changes which are a little less obvious.

@ryanofsky
Copy link
Contributor Author

Thanks for the review!

Avoid a race condition in index startup where indexes set m_synced = true too
early, while there are still notifications from previously connected blocks in
the validationinterface queue, causing those blocks to trigger the Rewind()
call in the BaseIndex::BlockConnected notification handler, and pointlessly
remove and then re-append the same blocks.

This race condition was easy to reproduce just by running the
test/functional/feature_coinstatsindex.py "Test that the index works with
-reindex-chainstate" test.

This change also allows simplifying code and getting rid of the "WARNING: Block %s
does not connect to an ancestor of known best chain (tip=%s); not updating
index" warning.
Previous commit 01b95ac from
bitcoin#33212 fixed one case where index state
referring to blocks the node had not flushed to disk yet could be commited,
leading to index corruption if the node was not shut down cleanly.

However, as pointed out
bitcoin#33212 (review) and
bitcoin#33208 (comment) there
are other remaining cases where index state could be committed referring to
unflushed blocks, and this commit fixes those cases.
Add new interfaces::Chain::attachChain method and start moving index sync logic
out of src/index/ to src/node/. In this commit, move code that registers for
index validationinterface notifications. In later commits, move the actual sync
thread and add real blockConnected, blockDisconnected, and chainStateFlushed
implementations.

This commit does not change behavior in any way.
Split up and rename validationinterface callback functions
BaseIndex::BlockConnected to BaseIndex::IgnoreBlockConnected and
BaseIndex::ChainStateFlushed to BaseIndex IgnoreChainStateFlushed.

Most of the code in these functions was just ignoring spurious notifications,
so leave that code in place, but move the rest of the code to
BaseIndexNotifications::blockConnected and
BaseIndexNotifications::chainStateFlushed. In an upcoming commit the spurious
notifications will be dropped and the new BaseIndex::Ignore... methods will be
deleted entirely.

This commit does not change behavior in any way.
…BaseIndex destructor

This code never worked and was never used. Now that an m_handler member is
added, the incorrect usage is easier to detect and an assert can be added.

This commit does not change behavior in any way.
This commit does not change behavior in any way. Goal of this change is to
document behavior better and make it more explicit.
…ected

Start moving indexing code from BaseIndex::Sync() to blockConnected() so
indexes can be separated from the node and do not need to implement syncing.
More index-specific code will be moved from Sync() to blockConnected() in
upcoming commits and then the Sync method will be dropped.

This commit does not change behavior in any way.
…n handlers

This commit does not change behavior in any way.
…ion handler

This commit does not change behavior in any way.
…ed and Sync

This main thing this commit code does is make indexes start using
blockDisconnected() notifications instead of ignoring them and calling Rewind()
in blockConnected() after the fact. It also changes Sync() code to stop calling
Rewind() and instead call `blockDisconnected` in a loop. It moves most of the
code was that previously in Rewind() to blockDisconnected(), and some of it to
blockConnected(), which does not change behavior because Rewind was always
called before connecting a new block.

This commit tries to avoid changing behavior in any way. In particular, the
best block pointer is still not updated during the rewind, so indexes still
report status in the same way they did before this commit. (In the future it
would be good to consider updating best block pointer in a more consistent way,
but that is beyond the scope of this PR). Also failures during rewind still
cause fatal errors and interrupt the Sync() loop.

The only actual change in behavior should be that when blocks are disconnected,
the disconnects are processed earlier: as soon as blockDisconnected events
arrive instead of right before the next new block is connected. Additionally if
rewind occurs during an initial sync, it is now interruptible, and the rewind
loop will exit if `m_interrupt` is set.
Add BlockFilterIndex::m_filter_mutex to avoid TSAN errors in next commit moving
index sync thread from indexes to node. This mutex was previously neccessary,
but TSAN did not detect errors until the sync thread was moved.
No changes in behavior, this change is mostly move-only.
This commit does not change behavior in any way.
Add SyncChain method and improve Chain::attachChain behavior so it no longer
sends notifications from the sync thread at the same time as notifications from
validationinterface queue. This allows BaseIndex::m_ready variable and
surrounding logic to be dropped and indexing code to be simplified.

It is also a potential performance improvement since it avoids holding cs_main
while processing all indexing notifications. Previously cs_main was held while
sending some notifications in the attachChain sync thread.
In attachChain, avoid locking cs_main while calling prepare_sync method to run
index CustomInit code.

This is implemented by unconditionally spawning index sync threads, instead of
only spawning them conditionally when the index best block is not the chain
tip. An alternative implementation could avoid the cost of spawning the sync
thread by calling prepare_sync from the validationinterface notification
thread. But this would be slightly more complex, and also potentially slower if
any current or future index takes longer to initialize, since it would
initialize serially instead of in parallel with other index and node init code.
This commit does not change behavior in any way.
…ng code

This commit does not change behavior in any way.
Replace last m_blockman.UpdatePruneLock call with
interfaces::Chain::updatePruneLock call.
@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.