-
Notifications
You must be signed in to change notification settings - Fork 38.7k
validation: improve performance of CheckBlockIndex #28339
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
validation: improve performance of CheckBlockIndex #28339
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code CoverageFor detailed information about the code coverage, see the test coverage report. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsNo conflicts as of last run. |
|
cc @martinus |
|
Hi, I wonder how you measured the runtime, is there a benchmark? Another optimization that might be worth it, it is to replace the |
So far only manually, by adding log statements to the beginning and end of CheckBlockIndex and running with
Thanks, I'll look into that! |
|
While you're at it, I noticed that the help text for |
Good point, I think that an interval argument similar to addrman or mempool consistency checks makes sense. Will try that in addition to the other points above (it might take a few weeks though, therefore put the PR into draft). |
3e0f001 to
87d6155
Compare
|
I added a commit with a benchmark, and another one that allow to specify the frequency for
I tried that in mzumsande@51aae54 (directly on master without the changes of this PR), however it doesn't seem to improve performance - I'll post benchmark results below. |
|
Benchmark results: master
PR
martinus' ideaUses
The benchmark currently uses a linear chain of 1100 headers. In general, the efficiency gain becomes larger with a longer chain, for example with just 100 headers the speedup is less. So there is a speedup of a facto >2.5 with the benchmark (that becomes larger on larger chains, see results for mainnet / testnet in OP). |
|
Concept ACK |
87d6155 to
efdc8e3
Compare
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May be better to do this deterministically to avoid non-determinism in tests?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be nice to have consistency among -checkaddrman, -checkmempool and -checkblockindex. Since non-determinism would only occur for values other than 0 and 1, maybe it'd be best not to use other values than that in tests? Or change all three do be deterministic?
7015231 to
e880ee0
Compare
|
b51eb42 to e880ee0 Still haven't had the time to write more unit tests for |
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK e880ee0a634f0be5079d39328fa5e118452553a9. Just rebase and suggested simplifications since the last review
|
@DrahtBot, I just reviewed it.. |
e880ee0 to
3c3895c
Compare
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 3c3895ccfa108e10f4a0b1a78f64ffedade1fd11
Left another suggestion, but this looks good in its current form.
by not saving all indexes in a std::multimap, but only those that are not part of the best header chain. The indexes of the best header chain are stored in a vector, which, in the typical case of a mostly linear chain with a few forks, results in a much smaller multimap, and increases performance noticeably for long chains. This does not change the actual consistency checks that are being performed for each index, just the way the block index tree is stored and traversed. Co-authored-by: Ryan Ofsky <[email protected]>
This makes it similar to -checkaddrman and -checkmempool, which also allow to run the check occasionally instead of always / never. Co-authored-by: Ryan Ofsky <[email protected]>
3c3895c to
5bc2077
Compare
furszy
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 5bc2077
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK 5bc2077. Just added suggested assert and simplification since last review
|
ACK 5bc2077 |
|
Ported to the CMake-based build system in hebasto#290. |
CheckBlockIndex()are consistency checks that are currently enabled by default on regtest.The function is rather slow, which is annoying if you
-checkblockindexor don't know it existed.One reason why it's slow is that in order to be able to traverse the block tree depth-first from genesis, it inserts pointers to all block indices into a
std::multimap- for which inserts and lookups become slow once there are hundred thousands of entries.However, typically the block index is mostly chain-like with just a few forks so a multimap isn't really needed for the most part. This PR suggests to store the block indices of the chain ending in the best header in a vector instead, and store only the rest of the indices in a multimap. This does not change the actual consistency checks that are being performed for each index, just the way the block index tree is stored and traversed.
This adds a bit of complication to make sure each block is visited (note that there are asserts that check it), making sure that the two containers are traversed correctly, but it speeds up the function considerably:
On master, a single invocation of
CheckBlockIndextakes ~1.4s on mainnet for me (4.9s on testnet which has >2.4 million blocks).With this branch, the runtime goes down to ~0.27s (0.85s on testnet).This is a speedup by a factor ~5.