-
Notifications
You must be signed in to change notification settings - Fork 38.6k
[kernel 2a/n] Split hashing/index GetUTXOStats codepaths, decouple from coinstatsindex
#24410
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[kernel 2a/n] Split hashing/index GetUTXOStats codepaths, decouple from coinstatsindex
#24410
Conversation
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach ACK (a7883ac7a497f3d1cb52f5f71e71de5e5e9301f1). Looked through various commits and I think the changes all make sense
a7883ac to
c46aafa
Compare
|
Pushed a7883ac7a4 -> c46aafaf11
|
|
Would an alternative be to treat assumeutxo as not part of the kernel for now (and completely move it out?) |
Sure, but I don't prefer that alternative. I think it would actually be more work to extract assumeutxo from the kernel, plus if you look at the |
|
Concept ACK |
c2800d8 to
2b6443e
Compare
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
This comment was marked as resolved.
This comment was marked as resolved.
2b6443e to
756f1a7
Compare
This comment was marked as resolved.
This comment was marked as resolved.
756f1a7 to
30503c9
Compare
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
This comment was marked as resolved.
|
@willcl-ark Okay for me to hide your comments so as to not confuse future reviewers? |
Please go ahead! |
|
Pushed 756f1a755d138e36a9e50e7e4e46ae0bcd3975ca -> 30503c98a8bc97b582f072c34168b5279d65a492
Reviewers: Please let me know if the new "fuzz: Remove useless GetUTXOStats fuzz case" commit looks okay. Ping @practicalswift |
mzumsande
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK
I was able to follow the commits step by step and the changes look sensible to me. One suggestion below about the coinstatsindex codepath.
|
Pushed e6e2b165fd377806e257fbb5378e22067a9a3957 -> e42583f928d5461827a946db3baa7b3da9487253
|
theStack
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
re-ACK e42583f928d5461827a946db3baa7b3da9487253
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code review ACK e42583f928d5461827a946db3baa7b3da9487253. Looks good, I left new suggestions but they are not important. Changes since last review were all minor: adding comment, adding static, dropping unused parameter
src/node/coinstats.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "coinstats: Return purely out-param CCoinsStats" (f735c92edc08047bdbe0f47ea0e7c319bcc62eeb)
Agree with Marco, I think. I don't understand why you wouldn't initialize success to false. If function is called with (CoinStatsHashType)123 compiler is correct to warn that it has undefined behavior. It seems like it would be best to assert false in this case, second best to return nullopt if asserting is inconvenient, and third best to have undefined behavior.
In previous commits in this patchset, we removed all in-param members of CCoinsStats. Now that that's done, we can modify GetUTXOStats to return an optional CCoinsStats instead of a status bool. Callers are modified accordingly. In rpc/blockchain.cpp, we discover that GetUTXOStats' status bool when getting UTXO stats for pprev was not checked for error. We fix this as well.
Split out ComputeUTXOStats and LookupUTXOStatsWithIndex from
GetUTXOStats, since the hashing and indexing codepaths are quite
disparate in practice.
Also allow add a constructor to CCoinsStats for it to be constructed
from a a block height and hash. This is used in both codepaths.
Also add a note in GetUTXOStats documenting a behaviour quirk that
predates this patchset.
[META] This allows the hashing codepath to be moved to a separate file
in a future commit, decoupling callers that only rely on the
hashing codepath from the indexing one. This is key for
libbitcoinkernel, which needs to have the hashing codepath for
AssumeUTXO, but does not wish to be coupled with indexes.
The indexing codepath logic in node/coinstats.cpp is simple enough to be moved into CoinStatsIndex::LookUpStats, avoiding an additional layer of function calls. Callers are modified accordingly. Also, add 2 missed BOOST_CHECKs to the coinstatsindex_initial_sync unit test.
As mentioned in a previous commit, the hashing codepath can now be moved to a separate file. This decouples callers that only rely on the hashing codepath from the indexing one. This is key for libbitcoinkernel, which needs to have the CoinsStats hashing codepath for AssumeUTXO, but does not wish to be coupled with indexes. Note that only the .cpp file is split in this commit, the header files will be split in a subsequent commit and the #includes to node/coinstats.h will be adjusted to only #include the necessary headers.
Most of this commit is pure-move.
After this change:
- kernel/coinstats.h
-> Contains declarations for:
- enum class CoinStatsHashType
- struct CCoinsStats
- GetBogoSize(...)
- TxOutSer(...)
- ComputeUTXOStats(...)
- node/coinstats.h
-> Just GetUTXOStats, which will be removed as we change callers to
directly use the hashing/indexing codepaths in future commits.
Removes a circular dependency, horray!
Introduces a new kernel:: namespace and move all of src/kernel/coinstats
under it.
In the verify script, lines like:
line="$(grep -n 'namespace node {' -- src/kernel/coinstats.h | tail -n1 | cut -d: -f1)"
sed -i -e "${line}s@namespace node {@namespace kernel {@" -- src/kernel/coinstats.h
Are intended to replace only the last instance of "namespace node" with
"namespace kernel", this is to avoid replacing forward declarations of
things inside the node:: namespace.
-BEGIN VERIFY SCRIPT-
sed -E -i 's@namespace node@namespace kernel@g' -- src/kernel/coinstats.cpp
line="$(grep -n 'namespace node {' -- src/kernel/coinstats.h | tail -n1 | cut -d: -f1)"
sed -i -e "${line}s@namespace node {@namespace kernel {@" -- src/kernel/coinstats.h
line="$(grep -n '// namespace node' -- src/kernel/coinstats.h | tail -n1 | cut -d: -f1)"
sed -i -e "${line}s@// namespace node@// namespace kernel@" -- src/kernel/coinstats.h
things='(CCoinsStats|CoinStatsHashType|GetBogoSize|TxOutSer|ComputeUTXOStats)'
git grep -lE 'node::'"$things" | xargs sed -E -i 's@node::'"$things"'@kernel::\1@g'
sed -E -i 's@'"$things"'@kernel::\1@g' -- src/node/coinstats.cpp src/node/coinstats.h
sed -E -i 's@BlockManager@node::\0@g' -- src/kernel/coinstats.cpp
-END VERIFY SCRIPT-
This is the "fruit of our labor" for this patchset. ChainstateManager::PopulateAndValidateSnapshot can now directly call ComputeUTXOStats(...). Our consensus engine is now fully decoupled from all indices. See the src/Makefile.am for some satisfying removals.
e42583f to
c7e8803
Compare
|
Push e42583f928d5461827a946db3baa7b3da9487253 -> c7e8803e47eadc68e46e9384a7e2c4543404cd1b |
rpc/blockchain.cpp is now the only user of the vestigial GetUTXOStats(...). And since GetUTXOStats(...)'s special fallback logic was only really relevant/meant for rpc/blockchain.cpp, we can just move it there.
c7e8803 to
664a14b
Compare
|
Pushed c7e8803e47eadc68e46e9384a7e2c4543404cd1b -> 664a14b
|
|
Code review ACK 664a14b |
theStack
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
post-merge re-ACK 664a14b
…ific block with `use_index=false` 27c8056 rpc: Disallow gettxoutsetinfo queries for a specific block with use_index=false (Martin Zumsande) Pull request description: In the `gettxoutsetinfo` RPC, if we set `use_index` to false but specify `hash_or_height`, we currently hit a nonfatal error, e.g. `gettxoutsetinfo "muhash" "1" "false"` results in: ``` Internal bug detected: "!pindex || pindex->GetBlockHash() == view->GetBestBlock()" rpc/blockchain.cpp:836 (GetUTXOStats) ``` The failing check was added in [bitcoin#24410](bitcoin@664a14b), but the previous behavior, returning the specified height together with data corresponding to the tip's height, was very confusing too in my opinion. Fix this by disallowing the interaction of `use_index=false` and `hash_or_height` and add a RPC help example with `-named` because users might ask themselves how to use the `use_index` flag witout hitting an error. An alternative way would be to allow the interaction if the specified `hash_or_height` happens to correspond to the tip (which should then also be applied to the `HASH_SERIALIZED` check before). If reviewers would prefer that, please say so. ACKs for top commit: fjahr: utACK 27c8056 shaavan: ACK 27c8056 Tree-SHA512: 1d81c34eaa48c86134a2cf7193246d5de6bfd819d413c3b3fae9cb9290e0297a336111eeaecede2f0f020b0f9a181d240de0da4493e1b387fe63b8189154442b
…ific block with `use_index=false` 27c8056 rpc: Disallow gettxoutsetinfo queries for a specific block with use_index=false (Martin Zumsande) Pull request description: In the `gettxoutsetinfo` RPC, if we set `use_index` to false but specify `hash_or_height`, we currently hit a nonfatal error, e.g. `gettxoutsetinfo "muhash" "1" "false"` results in: ``` Internal bug detected: "!pindex || pindex->GetBlockHash() == view->GetBestBlock()" rpc/blockchain.cpp:836 (GetUTXOStats) ``` The failing check was added in [bitcoin#24410](bitcoin@664a14b), but the previous behavior, returning the specified height together with data corresponding to the tip's height, was very confusing too in my opinion. Fix this by disallowing the interaction of `use_index=false` and `hash_or_height` and add a RPC help example with `-named` because users might ask themselves how to use the `use_index` flag witout hitting an error. An alternative way would be to allow the interaction if the specified `hash_or_height` happens to correspond to the tip (which should then also be applied to the `HASH_SERIALIZED` check before). If reviewers would prefer that, please say so. ACKs for top commit: fjahr: utACK 27c8056 shaavan: ACK 27c8056 Tree-SHA512: 1d81c34eaa48c86134a2cf7193246d5de6bfd819d413c3b3fae9cb9290e0297a336111eeaecede2f0f020b0f9a181d240de0da4493e1b387fe63b8189154442b
Part of: #24303
Depends on: #24322
The
GetUTXOStatsfunction has 2 codepaths:CoinStatsIndexfor the UTXO hashFor
libbitcoinkernel, the only place where we callGetUTXOStatsis inPopulateAndValidateSnapshots, which uses theSHA256Dhash, and is therefore unable to use theCoinStatsIndexsince that only providesMuHashhashes. Not that I think indices necessarily belong inlibbitcoinkernelanyway.This PR separates these 2 aforementioned codepaths of
GetUTXOStats, uses the hashing codepath inPopulateAndValidateSnapshots, and removes the need to link inindex/coinstatsindex.cppandnode/coinstats.cpp.Logistically, this PR:
index_requestedandhash_typemembers ofCoinStats, which served as "in-params" toGetUTXOStatsembedded within theCoinStatsstruct. This allowsCoinStatsto only consist of "out-param" members, and be returned byGetUTXOStatswithout needing to be an "in-out" paramUTXOHashersclass, with 3 implementations:SHA256DHasher,MuHashHasher, andNullHasher. These replace the existing template-based polymorphism.GetUTXOStatsinto:CalculateUTXOStatsWithHasher(UTXOHasher&, ...), andLookupUTXOStatsWithIndex(CoinStatsIndex&, ...)CalculateUTXOStatsWithHasherdirectly where appropriate (src/validation.cppandsrc/fuzz)GetUTXOStatstorpc/blockchain, which is the only place that depends onGetUTXOStats's weird fallback behaviourLookupUTXOStatsWithIndextoindex/coinstatsindexCode organization:
src/kernel/→ only contains the hashing codepathcoinstats.cpp→ hashing codepath implementationscoinstats.h→ header forkernel/coinstats.cppindex/→ only contains the index codepathcoinstatsindex.cpp→ index codepath implementationscoinstatsindex.hvalidation.cpp→ only uses the hashing codepathrpc/blockchain.cpp→ uses both the hashing and index codepath, oldGetUTXOStatsfallback logic moved here as statictest/fuzz/coins_view.cpp→ only uses the hashing codepathTODOs:
Would love any feedback!