-
Notifications
You must be signed in to change notification settings - Fork 38.7k
validation: do not wipe utxo cache for stats/scans/snapshots #33680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33680. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
0a769e3 to
0186bc4
Compare
395b567 to
b92ee75
Compare
cedwies
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approach ACK
Do ForceFlushToDisk(...)/FlushToDisk(...) require cs_main to be held?
Most call sites hold it during shutdown flushes, but the function itself doesn’t assert or document that (no AssertLockHeld(cs_main)). If this is correct, I think a short comment can help that a Lock is not strictly needed.
src/test/fuzz/utxo_snapshot.cpp
Outdated
| @@ -57,7 +57,7 @@ void sanity_check_snapshot() | |||
| // Connect the chain to the tmp chainman and sanity check the chainparams snapshot values. | |||
| LOCK(cs_main); | |||
| auto& cs{node.chainman->ActiveChainstate()}; | |||
| cs.ForceFlushStateToDisk(); | |||
| cs.ForceFlushStateToDisk(/*wipe_cache=*/true); | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we wiping the cache here? We only compute stats from CoinsDB(), not the in-memory cache, so a non-wiping flush (wipe_cache=false) should be enough for consistency (?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, changed to false, added you as coauthor, thanks.
|
I suggest splitting into two commits:
|
adfb4eb to
a59aace
Compare
| 637657 ALWAYS 122320 17124.80 kB False | ||
| 81349 ALWAYS 0 1383.49 kB False | ||
| Duration (µs) Mode Coins Count Memory Usage Flush for Prune | ||
| 2556340 IF_NEEDED 2899141 394844.34 kB False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 'Flush for Prune' has changed here to False, and now all lines have False value here. Can the False line be preserved, or is it OK in this sample output?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the actual output of running the script, looks like it wasn't updated in a while
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering how tracing is affected by this change. The trace includes the mode, and it should display it fine, so the trace should be OK.
It would be nice to check how the trace looks for a FORCE_SYNC case as well; I guess that can be triggered by an RPC call to the running node.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never used tracing before, my understanding is that @0xB10C uses these for monitoring mostly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the actual output of running the script, looks like it wasn't updated in a while
I think that the scripts only enables the trace and listens to it, but it does not trigger a particular use case.
If trace outputs are displayed with these changes, that should be sufficient check (i.e., tracing a FORCE_SYNC is not crucial).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I plan have a closer look.
I don't use the utxocache tracepoints at the moment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the actual output of running the script, looks like it wasn't updated in a while
Since the script logs UTXO cache events, running it against a different node will produce different output. Not sure if worth changing the example output of the script here. The example output is just as valid just as it was before.
If you retouch, feel free to revert, but I don't think it matters much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for checking!
The headers have changed and the column widths, so I had to regenerate the example to be accurate:
-Duration (µs) Mode Coins Count Memory Usage Prune
+Duration (µs) Mode Coins Count Memory Usage Flush for Prune|
reACK c6ca2b8 (prev: |
|
rebased after #30595 |
a59aace to
d442598
Compare
|
utACK d442598 |
|
Looked at the flush tracing changes a bit (d4425982ca4aff9b856a26f1d83d1419252f10a2). They seem fine to me. I can't really comment on the other changes, as I'm not too familiar with the utxo cache in general. from #33680 (comment):
Tested this against a signet node. $ python3 contrib/tracing/log_utxocache_flush.py 1234
Hooking into bitcoind with pid 1234
Logging utxocache flushes. Ctrl-C to end...
Duration (µs) Mode Coins Count Memory Usage Flush for Prune
3911 FORCE_SYNC 3 262.34 kB False
12439 FORCE_SYNC 3 262.34 kB False
4652 FORCE_SYNC 3 262.34 kB False
1331 FORCE_FLUSH 16 262.46 kB False
939 FORCE_FLUSH 0 262.24 kB False |
d442598 to
14abac1
Compare
|
Rebased with trivial conflicts, ready for re-review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reACK 14abac1
bd0f8c0 to
6fc49a1
Compare
6fc49a1 to
11fe88a
Compare
|
Non-trivial rebase after #30214 Changes:
|
|
checked the changes, ACK 11fe88a |
This prepares the addition of `FORCE_SYNC`. `empty_cache` in `FlushStateToDisk` was moved up to be reusable and `FlushStateMode::FORCE_FLUSH` was used as a placeholder before we properly split the two new states. `log_utxocache_flush.py` was regenerated and the alignment adjusted for the wider `FlushStateMode` values. Co-authored-by: Pieter Wuille <[email protected]> Co-authored-by: optout <[email protected]>
Since bitcoin#28280, the cost of a non-wiping sync of the UTXO cache is only proportional to the number of dirty entries, rather than proportional to the size of the entire cache. Because of that, there is no reason to perform a wiping flush in case the contents of the cache is still useful. Split the FlushStateMode::ALWAYS mode into a FORCE_SYNC (non-wiping) and a FORCE_FLUSH (wiping), and then use the former in scantxoutset, gettxoutsetinfo, snapshot creation. Co-authored-by: l0rinc <[email protected]> Co-authored-by: cedwies <[email protected]>
11fe88a to
c6ca2b8
Compare
|
Trivial rebase after #33866, you can check it with |
|
reACK c6ca2b8 (trivial) |
Revival of #30610 (comment) with the remaining comments applied on top
(slightly updated after #30214)