Skip to content

Conversation

@dongcarl
Copy link
Contributor

@dongcarl dongcarl commented Feb 21, 2022

Part of: #24303
Depends on: #24322

The GetUTXOStats function has 2 codepaths:

  • One which queries the CoinStatsIndex for the UTXO hash
  • One which actually performs the hashing

For libbitcoinkernel, the only place where we call GetUTXOStats is in PopulateAndValidateSnapshots, which uses the SHA256D hash, and is therefore unable to use the CoinStatsIndex since that only provides MuHash hashes. Not that I think indices necessarily belong in libbitcoinkernel anyway.

This PR separates these 2 aforementioned codepaths of GetUTXOStats, uses the hashing codepath in PopulateAndValidateSnapshots, and removes the need to link in index/coinstatsindex.cpp and node/coinstats.cpp.


Logistically, this PR:

  • Extracts out the index_requested and hash_type members of CoinStats, which served as "in-params" to GetUTXOStats embedded within the CoinStats struct. This allows CoinStats to only consist of "out-param" members, and be returned by GetUTXOStats without needing to be an "in-out" param
  • Introduce the purely virtual UTXOHashers class, with 3 implementations: SHA256DHasher, MuHashHasher, and NullHasher. These replace the existing template-based polymorphism.
  • Split GetUTXOStats into:
    • CalculateUTXOStatsWithHasher(UTXOHasher&, ...), and
    • LookupUTXOStatsWithIndex(CoinStatsIndex&, ...)
  • Use CalculateUTXOStatsWithHasher directly where appropriate (src/validation.cpp and src/fuzz)
  • Move GetUTXOStats to rpc/blockchain, which is the only place that depends on GetUTXOStats's weird fallback behaviour
  • Move LookupUTXOStatsWithIndex to index/coinstatsindex

Code organization:

  • src/
    • kernel/ → only contains the hashing codepath
      • coinstats.cpp → hashing codepath implementations
      • coinstats.h → header for kernel/coinstats.cpp
    • index/ → only contains the index codepath
      • coinstatsindex.cpp → index codepath implementations
      • coinstatsindex.h
    • validation.cpp → only uses the hashing codepath
    • rpc/blockchain.cpp → uses both the hashing and index codepath, old GetUTXOStats fallback logic moved here as static
    • test/fuzz/coins_view.cpp → only uses the hashing codepath

TODOs:

  • Commit messages could be fleshed out more

Would love any feedback!

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approach ACK (a7883ac7a497f3d1cb52f5f71e71de5e5e9301f1). Looked through various commits and I think the changes all make sense

@dongcarl dongcarl force-pushed the 2022-02-libbitcoinkernel-p2-coinstats-minimal branch from a7883ac to c46aafa Compare April 28, 2022 16:49
@dongcarl
Copy link
Contributor Author

Pushed a7883ac7a4 -> c46aafaf11

  • Rebased over master
  • Made the scripted-diff actually work

@maflcko
Copy link
Member

maflcko commented Apr 28, 2022

Would an alternative be to treat assumeutxo as not part of the kernel for now (and completely move it out?)

@dongcarl
Copy link
Contributor Author

@MarcoFalke

Would an alternative be to treat assumeutxo as not part of the kernel for now (and completely move it out?)

Sure, but I don't prefer that alternative. I think it would actually be more work to extract assumeutxo from the kernel, plus if you look at the GetUTXOStats function and friends prior to this PR, it needed quite some cleaning up, which is done in this PR.

@jonatack
Copy link
Member

Concept ACK

@dongcarl dongcarl force-pushed the 2022-02-libbitcoinkernel-p2-coinstats-minimal branch 2 times, most recently from c2800d8 to 2b6443e Compare April 28, 2022 20:36
@DrahtBot
Copy link
Contributor

DrahtBot commented Apr 29, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25065 ([kernel 2c/n] Extract only what we use of init/common.cpp by dongcarl)
  • #24952 (rpc: Add sqlite format option for dumptxoutset by dunxen)
  • #24897 ([Draft / POC] Silent Payments by w0xlt)
  • #24230 (indexes: Stop using node internal types and locking cs_main, improve sync logic by ryanofsky)
  • #19792 (rpc: Add dumpcoinstats by fjahr)
  • #13462 (Make SER_GETHASH implicit for CHashWriter and SerializeHash by Empact)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@willcl-ark

This comment was marked as resolved.

@dongcarl dongcarl force-pushed the 2022-02-libbitcoinkernel-p2-coinstats-minimal branch from 2b6443e to 756f1a7 Compare April 29, 2022 18:51
@dongcarl

This comment was marked as resolved.

@dongcarl dongcarl force-pushed the 2022-02-libbitcoinkernel-p2-coinstats-minimal branch from 756f1a7 to 30503c9 Compare April 29, 2022 19:11
@mzumsande

This comment was marked as resolved.

@willcl-ark

This comment was marked as resolved.

@willcl-ark

This comment was marked as resolved.

@dongcarl
Copy link
Contributor Author

@willcl-ark Okay for me to hide your comments so as to not confuse future reviewers?

@willcl-ark
Copy link
Member

@willcl-ark Okay for me to hide your comments so as to not confuse future reviewers?

Please go ahead!

@dongcarl
Copy link
Contributor Author

Pushed 756f1a755d138e36a9e50e7e4e46ae0bcd3975ca -> 30503c98a8bc97b582f072c34168b5279d65a492

  • Added commit "fuzz: Remove useless GetUTXOStats fuzz case", adjusted subsequent commits
  • Removed some extraneous whitespace in scripted-diff

Reviewers: Please let me know if the new "fuzz: Remove useless GetUTXOStats fuzz case" commit looks okay.

Ping @practicalswift

Copy link
Contributor

@mzumsande mzumsande left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

I was able to follow the commits step by step and the changes look sensible to me. One suggestion below about the coinstatsindex codepath.

@dongcarl
Copy link
Contributor Author

dongcarl commented May 21, 2022

Pushed e6e2b165fd377806e257fbb5378e22067a9a3957 -> e42583f928d5461827a946db3baa7b3da9487253

  • Removed over-eager assertion added in the previous e6e2b165fd377806e257fbb5378e22067a9a3957 push: GetUTXOStats is sometimes called in rpc/blockchain.cpp with a non-null pindex even when we aren't going to use the index. In all such cases the pindex represents view's best block.

Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK e42583f928d5461827a946db3baa7b3da9487253

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK e42583f928d5461827a946db3baa7b3da9487253. Looks good, I left new suggestions but they are not important. Changes since last review were all minor: adding comment, adding static, dropping unused parameter

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "coinstats: Return purely out-param CCoinsStats" (f735c92edc08047bdbe0f47ea0e7c319bcc62eeb)

Agree with Marco, I think. I don't understand why you wouldn't initialize success to false. If function is called with (CoinStatsHashType)123 compiler is correct to warn that it has undefined behavior. It seems like it would be best to assert false in this case, second best to return nullopt if asserting is inconvenient, and third best to have undefined behavior.

dongcarl added 9 commits May 23, 2022 14:50
In previous commits in this patchset, we removed all in-param members of
CCoinsStats. Now that that's done, we can modify GetUTXOStats to return
an optional CCoinsStats instead of a status bool. Callers are modified
accordingly.

In rpc/blockchain.cpp, we discover that GetUTXOStats' status bool when
getting UTXO stats for pprev was not checked for error. We fix this as
well.
Split out ComputeUTXOStats and LookupUTXOStatsWithIndex from
GetUTXOStats, since the hashing and indexing codepaths are quite
disparate in practice.

Also allow add a constructor to CCoinsStats for it to be constructed
from a a block height and hash. This is used in both codepaths.

Also add a note in GetUTXOStats documenting a behaviour quirk that
predates this patchset.

[META] This allows the hashing codepath to be moved to a separate file
       in a future commit, decoupling callers that only rely on the
       hashing codepath from the indexing one. This is key for
       libbitcoinkernel, which needs to have the hashing codepath for
       AssumeUTXO, but does not wish to be coupled with indexes.
The indexing codepath logic in node/coinstats.cpp is simple enough to be
moved into CoinStatsIndex::LookUpStats, avoiding an additional layer of
function calls. Callers are modified accordingly.

Also, add 2 missed BOOST_CHECKs to the coinstatsindex_initial_sync unit
test.
As mentioned in a previous commit, the hashing codepath can now be moved
to a separate file. This decouples callers that only rely on the hashing
codepath from the indexing one.

This is key for libbitcoinkernel, which needs to have the CoinsStats
hashing codepath for AssumeUTXO, but does not wish to be coupled with
indexes.

Note that only the .cpp file is split in this commit, the header files
will be split in a subsequent commit and the #includes to
node/coinstats.h will be adjusted to only #include the necessary
headers.
Most of this commit is pure-move.

After this change:

- kernel/coinstats.h
    -> Contains declarations for:
       - enum class CoinStatsHashType
       - struct CCoinsStats
       - GetBogoSize(...)
       - TxOutSer(...)
       - ComputeUTXOStats(...)
- node/coinstats.h
    -> Just GetUTXOStats, which will be removed as we change callers to
       directly use the hashing/indexing codepaths in future commits.
Removes a circular dependency, horray!
Introduces a new kernel:: namespace and move all of src/kernel/coinstats
under it.

In the verify script, lines like:

line="$(grep -n 'namespace node {' -- src/kernel/coinstats.h | tail -n1 | cut -d: -f1)"
sed -i -e "${line}s@namespace node {@namespace kernel {@" -- src/kernel/coinstats.h

Are intended to replace only the last instance of "namespace node" with
"namespace kernel", this is to avoid replacing forward declarations of
things inside the node:: namespace.

-BEGIN VERIFY SCRIPT-
sed -E -i 's@namespace node@namespace kernel@g' -- src/kernel/coinstats.cpp

line="$(grep -n 'namespace node {' -- src/kernel/coinstats.h | tail -n1 | cut -d: -f1)"
sed -i -e "${line}s@namespace node {@namespace kernel {@" -- src/kernel/coinstats.h

line="$(grep -n '// namespace node' -- src/kernel/coinstats.h | tail -n1 | cut -d: -f1)"
sed -i -e "${line}s@// namespace node@// namespace kernel@" -- src/kernel/coinstats.h

things='(CCoinsStats|CoinStatsHashType|GetBogoSize|TxOutSer|ComputeUTXOStats)'
git grep -lE 'node::'"$things" | xargs sed -E -i 's@node::'"$things"'@kernel::\1@g'
sed -E -i 's@'"$things"'@kernel::\1@g' -- src/node/coinstats.cpp src/node/coinstats.h
sed -E -i 's@BlockManager@node::\0@g' -- src/kernel/coinstats.cpp
-END VERIFY SCRIPT-
This is the "fruit of our labor" for this patchset.
ChainstateManager::PopulateAndValidateSnapshot can now directly call
ComputeUTXOStats(...).

Our consensus engine is now fully decoupled from all indices.

See the src/Makefile.am for some satisfying removals.
@dongcarl dongcarl force-pushed the 2022-02-libbitcoinkernel-p2-coinstats-minimal branch from e42583f to c7e8803 Compare May 23, 2022 19:06
@dongcarl
Copy link
Contributor Author

rpc/blockchain.cpp is now the only user of the vestigial
GetUTXOStats(...). And since GetUTXOStats(...)'s special fallback logic
was only really relevant/meant for rpc/blockchain.cpp, we can just move
it there.
@dongcarl dongcarl force-pushed the 2022-02-libbitcoinkernel-p2-coinstats-minimal branch from c7e8803 to 664a14b Compare May 23, 2022 19:19
@dongcarl
Copy link
Contributor Author

Pushed c7e8803e47eadc68e46e9384a7e2c4543404cd1b -> 664a14b

  • Use CHECK_NONFATAL instead of assert in GetUTXOStats after moving to rpc/blockchain.cpp

@laanwj
Copy link
Member

laanwj commented May 24, 2022

Code review ACK 664a14b

@laanwj laanwj merged commit 7008087 into bitcoin:master May 24, 2022
Copy link
Contributor

@theStack theStack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

post-merge re-ACK 664a14b

sidhujag pushed a commit to syscoin/syscoin that referenced this pull request May 28, 2022
maflcko pushed a commit to maflcko/bitcoin-core that referenced this pull request Jul 1, 2022
…ific block with `use_index=false`

27c8056 rpc: Disallow gettxoutsetinfo queries for a specific block with use_index=false (Martin Zumsande)

Pull request description:

  In the `gettxoutsetinfo` RPC, if we set `use_index` to false but specify `hash_or_height`, we currently hit a nonfatal error, e.g. `gettxoutsetinfo "muhash" "1" "false"` results in:
  ```
  Internal bug detected: "!pindex || pindex->GetBlockHash() == view->GetBestBlock()"
  rpc/blockchain.cpp:836 (GetUTXOStats)
  ```
  The failing check was added in [bitcoin#24410](bitcoin@664a14b), but the previous behavior, returning the specified height together with data corresponding to the tip's height, was very confusing too in my opinion.
  Fix this by disallowing the interaction  of `use_index=false` and `hash_or_height` and add a RPC help example with `-named` because users might ask themselves how to use the `use_index` flag witout hitting an error.

  An alternative way would be to allow the interaction if the specified `hash_or_height` happens to correspond to the tip (which should then also be applied to the `HASH_SERIALIZED` check before). If reviewers would prefer that, please say so.

ACKs for top commit:
  fjahr:
    utACK 27c8056
  shaavan:
    ACK 27c8056

Tree-SHA512: 1d81c34eaa48c86134a2cf7193246d5de6bfd819d413c3b3fae9cb9290e0297a336111eeaecede2f0f020b0f9a181d240de0da4493e1b387fe63b8189154442b
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jul 1, 2022
…ific block with `use_index=false`

27c8056 rpc: Disallow gettxoutsetinfo queries for a specific block with use_index=false (Martin Zumsande)

Pull request description:

  In the `gettxoutsetinfo` RPC, if we set `use_index` to false but specify `hash_or_height`, we currently hit a nonfatal error, e.g. `gettxoutsetinfo "muhash" "1" "false"` results in:
  ```
  Internal bug detected: "!pindex || pindex->GetBlockHash() == view->GetBestBlock()"
  rpc/blockchain.cpp:836 (GetUTXOStats)
  ```
  The failing check was added in [bitcoin#24410](bitcoin@664a14b), but the previous behavior, returning the specified height together with data corresponding to the tip's height, was very confusing too in my opinion.
  Fix this by disallowing the interaction  of `use_index=false` and `hash_or_height` and add a RPC help example with `-named` because users might ask themselves how to use the `use_index` flag witout hitting an error.

  An alternative way would be to allow the interaction if the specified `hash_or_height` happens to correspond to the tip (which should then also be applied to the `HASH_SERIALIZED` check before). If reviewers would prefer that, please say so.

ACKs for top commit:
  fjahr:
    utACK 27c8056
  shaavan:
    ACK 27c8056

Tree-SHA512: 1d81c34eaa48c86134a2cf7193246d5de6bfd819d413c3b3fae9cb9290e0297a336111eeaecede2f0f020b0f9a181d240de0da4493e1b387fe63b8189154442b
@bitcoin bitcoin deleted a comment from Miminlaili Oct 4, 2022
@bitcoin bitcoin deleted a comment from Miminlaili Oct 4, 2022
@bitcoin bitcoin deleted a comment from Miminlaili Oct 4, 2022
@bitcoin bitcoin locked and limited conversation to collaborators Oct 4, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

Status: Done or Closed or Rethinking

Development

Successfully merging this pull request may close these issues.