-
Notifications
You must be signed in to change notification settings - Fork 38.6k
coins: remove logic for spent-and-FRESH cache entries and writing non-DIRTY entries #30673
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30673. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
3f032cf to
b393b0d
Compare
|
🚧 At least one of the CI tasks failed. HintsMake sure to run all tests locally, according to the documentation. The failure may happen due to a number of reasons, for example:
Leave a comment here, if you need help tracking down a confusing failure. |
b393b0d to
feaea74
Compare
feaea74 to
1f2f8e5
Compare
l0rinc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code became a lot simpler now - have a few suggestion I'd like us to consider, but I understand if you think it's out of scope.
Once we get to a state that is ACK-worthy, I'll enable hard assertions everywhere and run an IBD and check if it crashes anywhere.
I think for a change like this, spending time fuzzing would be more valuable. I believe it already runs in debug mode for that so |
l0rinc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dug deeper and added more relevant questions - bear with me if they're outside the scope of this PR.
82b6263 to
7d2b499
Compare
l0rinc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The unreliability of many of our tests worries me, would you be open to fixing them in a PR before this one?
src/coins.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this still happen during a reorg? The comment here is really scary, do we really need so much context here? I'd prefer a test that fails when the code is wrong instead of a really long description...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it can happen during a reorg. But we know we can't have a spent coin that is not DIRTY, so the comments being updated here are making this not be redundant. When the coin is spent in DisconnectBlock, it is either erased if FRESH or set DIRTY. The only time we would get into this block if the coin is not DIRTY is if it was just inserted.
src/coins.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The freshness flag depend on inserted, possible_overwrite and it->second.coin.IsSpent(), assigned and validated in different places.
Could we obviate the freshness flag calculation instead with something like:
bool fresh_flag = inserted && !possible_overwrite ? CCoinsCacheEntry::FRESH : 0;Either here or in a follow-up we could simplify & modernize the method to something like:
void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possible_overwrite) {
assert(!coin.IsSpent());
if (coin.out.scriptPubKey.IsUnspendable()) return;
auto [it, inserted] = cacheCoins.try_emplace(outpoint);
if (!inserted) {
if (!possible_overwrite && !it->second.coin.IsSpent()) throw std::logic_error("Attempted to overwrite an unspent coin (when possible_overwrite is false)");
cachedCoinsUsage -= it->second.coin.DynamicMemoryUsage();
}
cachedCoinsUsage += coin.DynamicMemoryUsage();
it->second.coin = std::move(coin);
bool fresh_flag = inserted && !possible_overwrite ? CCoinsCacheEntry::FRESH : 0;
it->second.AddFlags(CCoinsCacheEntry::DIRTY | fresh_flag, *it, m_sentinel);
TRACE5(utxocache, add,
outpoint.hash.data(),
(uint32_t) outpoint.n,
(uint32_t) it->second.coin.nHeight,
(int64_t) it->second.coin.out.nValue,
(bool) it->second.coin.IsCoinBase());
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's leave this for a follow-up.
src/test/coins_tests.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unrelated: could coins_cache_simulation_testbe moved into a fuzz test - or would it be too difficult to track all those states?.
If you think any of these recommendations can be done in a parallel PR, let me know and I'll do them myself before this is merged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure. It might just be redundant to our current fuzz harness. It's also valuable to have these tests run on every CI.
7d2b499 to
34a101d
Compare
|
This is still important - @andrewtoth, how can I help? |
6956ee9 to
7f4a3bf
Compare
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
There are no code paths which add a non-DIRTY entry to the cursor in BatchWrite, so we can safely make this a logic error and test for it. There are no code paths which add a spent and FRESH coin to the cursor in BatchWrite, so we can safely make this a logic error and test for it.
It is no longer possible to get non-DIRTY entries in BatchWrite
It is not possible for an entry to be FRESH if not already DIRTY.
A spent coins must be DIRTY, so remove references to spent but not DIRTY coins. The only way a spent coin can be not DIRTY is when creating the CCoinsCacheEntry with an empty coin. This can be made more clear by setting fresh if inserted, instead of checking if an unspent coin is not DIRTY.
7f4a3bf to
53dbebd
Compare
|
@l0rinc rebased. Thank you for your reviews! I have tried to address all your comments. There are some older comments that I'm unsure are still relevant. Please let me know if there is still anything outstanding that needs to be addressed. |
It is not possible to have a FRESH CCoinsCacheEntry that is not also DIRTY. Test code uses the SetFresh method out of order to simulate entries that are FRESH but not DIRTY. By removing the method entirely we can simplify the test code and make the production code easier to understand.
5f4095f to
fd2338d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The src/coins.cpp prod changes are a bit scary - while it's a very important change, based on the lack of enthusiasm by others, we may want to separate the tests and small refactors into PRs that will get us closer to this.
One could remove the invalid test states - without removing the invalid production code states, just potentially adding TODOs there in case it's missing.
The SetFresh and flag removal could be next and if all of those are merged, we can continue with the scary FetchCoin, AddCoin, BatchWrite and Uncache.
Left a few new comments quickly, let me know what you think.
| std::map<COutPoint, Coin> m_data; | ||
|
|
||
| public: | ||
| std::optional<Coin> GetCoin(const COutPoint& outpoint) const final |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: the class is already final + other overrides specify that explicitly for clarity
| std::optional<Coin> GetCoin(const COutPoint& outpoint) const final | |
| std::optional<Coin> GetCoin(const COutPoint& outpoint) const override |
and
bool HaveCoin(const COutPoint& outpoint) const overrideAnd most other such methods in the file - but I understand if you'd prefer to do in a follow-up instead
| static void SetDirty(CoinsCachePair& pair, CoinsCachePair& sentinel, bool fresh = false) noexcept | ||
| { | ||
| AddFlags(fresh ? FRESH | DIRTY : DIRTY, pair, sentinel); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's only called here, we could inline and simplify it now:
| static void SetDirty(CoinsCachePair& pair, CoinsCachePair& sentinel, bool fresh = false) noexcept | |
| { | |
| AddFlags(fresh ? FRESH | DIRTY : DIRTY, pair, sentinel); | |
| } | |
| static void SetDirty(CoinsCachePair& pair, CoinsCachePair& sentinel, bool fresh = false) noexcept | |
| { | |
| if (!pair.second.m_flags) { | |
| Assume(!pair.second.m_prev && !pair.second.m_next); | |
| pair.second.m_prev = sentinel.second.m_prev; | |
| pair.second.m_next = &sentinel; | |
| sentinel.second.m_prev = &pair; | |
| pair.second.m_prev->second.m_next = &pair; | |
| } | |
| Assume(pair.second.m_prev && pair.second.m_next); | |
| pair.second.m_flags |= fresh ? FRESH | DIRTY : DIRTY; | |
| } |
Not sure, but do the flags still make sense after this or would a bool is_fresh also suffice? The dirtiness can be deduced from the m_next/m_prev pair so the only unknown is the freshness which doesn't necessitate a flag anymore - right?
| m_prev = m_next = nullptr; | ||
| } | ||
| bool IsDirty() const noexcept { return m_flags & DIRTY; } | ||
| bool IsFresh() const noexcept { return m_flags & FRESH; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't have Fresh anymore, only IsDirtyAndFresh, right?
| bool IsFresh() const noexcept { return m_flags & FRESH; } | |
| bool IsDirtyAndFresh() const noexcept | |
| { | |
| const bool is_fresh = m_flags & FRESH; | |
| Assume(IsDirty() || !is_fresh); | |
| return is_fresh; | |
| } |
| return it->second; // TODO spent coins shouldn't be returned | ||
| } | ||
| } | ||
| if (auto it{map_.find(outpoint)}; it != map_.end() && !it->second.IsSpent()) return it->second; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the lack of reviews, we may want to split out the test cleanups (removing invalid states) and refactors to a separate PR, where this will be a tracking PR for where we want to get to in safer/smaller steps
| CCoinsCacheEntry::SetFresh(n2, sentinel); | ||
| BOOST_CHECK(n2.second.IsFresh() && !n2.second.IsDirty()); | ||
| // Check that setting DIRTY and FRESH on new node inserts it after n1 | ||
| CCoinsCacheEntry::SetDirty(n2, sentinel, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| CCoinsCacheEntry::SetDirty(n2, sentinel, true); | |
| CCoinsCacheEntry::SetDirty(n2, sentinel, /*fresh=*/true); |
| bool HaveCoin(const COutPoint& outpoint) const final | ||
| { | ||
| return m_data.count(outpoint); | ||
| return GetCoin(outpoint).has_value(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should be able to do this safely in a prefactor PR regardless of the other changes
|
Closing in favor of #33018. |
0ac969c validation: don't reallocate cache for short-lived CCoinsViewCache (Lőrinc) c8f5e44 coins: reduce lookups in dbcache layer propagation (Lőrinc) Pull request description: This change is part of [[IBD] - Tracking PR for speeding up Initial Block Download](#32043) ### Summary Previously, when the parent coins cache had no entry and the child did, `BatchWrite` performed a find followed by `try_emplace`, which resulted in multiple `SipHash` computations and bucket traversals on the common insert path. On a different path, these caches were recreated needlessly for every block connection. ### Fix for double fetch This change uses a single leading `try_emplace` and branches on the returned `inserted` flag. In the `FRESH && SPENT` case (not used in production, only exercised by tests), we erase the just-inserted placeholder (which is constant time with no rehash anyway). Semantics are unchanged for all valid parent/child state combinations. This change is a minimal version of [bitcoin/bitcoin@`723c49b` (#32128)](723c49b) and draws simplification ideas [bitcoin/bitcoin@`ae76ec7` (#30673)](ae76ec7) and #30326. ### Fix for temporary cache recreation Related to parent cache propagation, the second commit makes it possible to avoid destructuring-recreating-destructuring of these short-live parent caches created for each new block. A few temporary `CCoinsViewCache`'s are destructed right after the `Flush()`, therefore it is not necessary to call `ReallocateCache` to recreate them right before they're killed anyway. This change was based on a subset of #28945, the original authors and relevant commenters were added as coauthors to this version. ----- Reindex-chainstate indicates ~1% speedup. <details> <summary>Details</summary> ```python COMMITS="647cdb4f7e8041affed887e2325ee03a91078bb1 0b0c3293ffd75afb27dadc0b28426b40132a8c6b"; \ STOP=909090; DBCACHE=4500; \ CC=gcc; CXX=g++; \ BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \ (echo ""; for c in $COMMITS; do git fetch -q origin $c && git log -1 --pretty='%h %s' $c || exit 1; done; echo "") && \ hyperfine \ --sort command \ --runs 2 \ --export-json "$BASE_DIR/rdx-$(sed -E 's/(\w{8})\w+ ?/\1-/g;s/-$//'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \ --parameter-list COMMIT ${COMMITS// /,} \ --prepare "killall bitcoind 2>/dev/null; rm -f $DATA_DIR/debug.log; git checkout {COMMIT}; git clean -fxd; git reset --hard && \ cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release -DENABLE_IPC=OFF && ninja -C build bitcoind && \ ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20" \ --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \ "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0" 647cdb4 Merge #33311: net: Quiet down logging when router doesn't support natpmp/pcp 0b0c3293ff validation: don't reallocate cache for short-lived CCoinsViewCache Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 647cdb4) Time (mean ± σ): 16233.508 s ± 9.501 s [User: 19064.578 s, System: 951.672 s] Range (min … max): 16226.790 s … 16240.226 s 2 runs Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 0b0c3293ffd75afb27dadc0b28426b40132a8c6b) Time (mean ± σ): 16039.626 s ± 17.284 s [User: 18870.130 s, System: 950.722 s] Range (min … max): 16027.405 s … 16051.848 s 2 runs Relative speed comparison 1.01 ± 0.00 COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 647cdb4) 1.00 COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=909090 -dbcache=4500 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 0b0c3293ffd75afb27dadc0b28426b40132a8c6b) ``` </details> ACKs for top commit: optout21: utACK 0ac969c achow101: ACK 0ac969c andrewtoth: utACK 0ac969c sedited: ACK 0ac969c Tree-SHA512: 9fcc3f1a8314368576a4fba96ca72665527eaa3a97964ab5b39491757f3527147d134f79a5c3456f76c1330c7ef862989d23f764236c5e2563be89a81c1cee47
Following up from #28280 (comment), which suggested a revival of #18746.
GetCoinwill never returntruefor a spent entry, so we can safely assume that any entry we fetch will not be spent. This lets us remove the only non-test code path which adds a FRESH-but-not-DIRTY entry to the flagged linked list. This in turn ensures all entries being sent toBatchWriteareDIRTYentries.A corollary is that all spent coins must be DIRTY. The only time a coin can be spent and not DIRTY is when the
CCoinsViewCacheEntryis created with an empty coin. The last commit makes this more clear by checking for freshness if inserted instead of if spent and not DIRTY.This is a pure refactor removing dead code which handles a non-existent corner case of a FRESH-but-not-DIRTY entry in the
CCoinsViewCacheor a spent entry inCCoinsViewDB. There is a lot of test code which tries to exercise this corner case, which is also updated in this PR to behave like non-test code.