Change UpdateForDescendants to use Epochs #18191

JeremyRubin · 2020-02-22T04:19:12Z

This patch replaces #18120, fixing some regressions or issues with that PR. The overall change here should be easier to review as well.

Refactors UpdateForDescendants to use Epochs instead of sets to de-duplicate traversal, and replaces the cache entries with a vector instead of a set. This is a straightforward win. The algorithm is a bit clever so as not to require two separate vectors for txiters that need expanding and txiters that have already been processed, but I think it's relatively easy to reason about.

This is extracted from #18063 as a standalone because it turns out the DoS issues with this code perhaps merit more extensive review http://gnusha.org/bitcoin-core-dev/2020-02-11.log. That PR will stay open for review, but this PR is designed as an "obviously better" win that can be merged now to keep progress up on the mempool project.

sdaftuar can you re-run your bench from #18120?

DrahtBot · 2020-02-22T09:26:15Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

sdaftuar

Only looked at ec237cf8c640f9e85c3c7aad0d3eb243a346b1f9 but I think maybe not worth the added complexity, if I'm understanding correctly.

Also the shrink_to_fit of the update_cache vector could just be included in #18120?

src/txmempool.cpp

JeremyRubin · 2020-03-03T04:46:53Z

Fixed and updated comment.

JeremyRubin · 2020-03-03T05:55:29Z

Sorry for all the line noise; I realized I forgot to add a continue, and then decided that I could DRY up the code a bit. I'm pretty happy with the final version, I think it's a bit easier to view because we don't treat update_it special from other iterators we walk (other than it not going into the cache line).

I can squash this all down into one commit and replace #18120.

hebasto · 2020-03-29T17:43:19Z

Concept ACK.

src/txmempool.cpp

hebasto · 2020-03-30T08:41:31Z

IIUC, the first pass of the do ... while (true); loop just emplaces all of the direct descendants of update_it tx, i.e., its children, to the new_cache_line, and taints their m_epoch. All of the following loop passes process grand children, grand-grand children and so on. And two thing are confusing to me:

the following comment: https://github.com/bitcoin/bitcoin/blob/32985d34a0a0cfd4ff8c4d96ff461778eaf72a8a/src/txmempool.cpp#L79-L80

I'd suggest to remove it completely.

the following variable name: https://github.com/bitcoin/bitcoin/blob/32985d34a0a0cfd4ff8c4d96ff461778eaf72a8a/src/txmempool.cpp#L87

I'd suggest:

s/cached_grand_children/cached_descendants/
s/grand_child_it/descendant_it/
s/the grand child/the descendant/ in the comment

EDIT: I realized, that you mean "children" and "grandchildren" relative to an element of the new_cache_line. So, ignore my suggestion (2).

src/txmempool.cpp

hebasto

ACK 32985d34a0a0cfd4ff8c4d96ff461778eaf72a8a, tested on Linux Mint 19.3 with the functional test from #18485 (cfaa79b):

the average time (from 10 runs) reported by test decreased by 9% from 0.367s to 0.347s.

8098dea test: Add mempool_updatefromblock.py (Hennadii Stepanov) Pull request description: This PR adds a new test for mempool update of transaction descendants/ancestors information (count, size) when transactions have been re-added from a disconnected block to the mempool. It could be helpful for working on PRs like #17925, #18191. ACKs for top commit: ariard: ACK 8098dea Tree-SHA512: 7e808fa8df8d7d7a7dbdc3f79361049b49c7bce9b58fd5539b28c9636bedac747695537e500d7ed68dc8bdb80167ad3f1c01086f7551691405d2ba2e38ef1d06

8098dea test: Add mempool_updatefromblock.py (Hennadii Stepanov) Pull request description: This PR adds a new test for mempool update of transaction descendants/ancestors information (count, size) when transactions have been re-added from a disconnected block to the mempool. It could be helpful for working on PRs like bitcoin#17925, bitcoin#18191. ACKs for top commit: ariard: ACK 8098dea Tree-SHA512: 7e808fa8df8d7d7a7dbdc3f79361049b49c7bce9b58fd5539b28c9636bedac747695537e500d7ed68dc8bdb80167ad3f1c01086f7551691405d2ba2e38ef1d06

src/txmempool.cpp

jonatack · 2020-05-25T07:58:02Z

It seems that performance is the motivation for this PR. Could benchmarks, or a link to them, be added to the PR description?

hebasto · 2020-05-25T08:04:16Z

@jonatack

It seems that performance is the motivation for this PR. Could benchmarks, or a link to them, be added to the PR description?

Some benchmarks are done :)

jonatack · 2020-05-27T07:35:42Z

For what it's worth, 30 test runs on master versus this branch rebased on master seem to show no significant difference in run time (~0.5%). They do appear to show a significant reduction in run time variance with this PR. I ran four alternating cycles of 15: master, PR, master, PR; it took most of a day to for the 60 runs.

results with a 2012 MacBookPro running Mojave

master

(/ (+
 1.6158521175384521
 1.5466628074645996
 1.6130828857421875
 1.5822479724884033
 1.8187620639801025
 1.5548689365386963
 1.5336580276489258
 1.5182380676269531
 1.5340137481689453
 1.5922291278839111
 1.6903131008148193
 1.8066248893737793
 1.8288109302520752
 1.5529100894927979
 1.4984791278839111
 1.6140100955963135
 1.6495018005371094
 1.6218130588531494
 1.6670641899108887
 1.550534963607788
 1.540299654006958
 1.6754770278930664
 1.548105001449585
 1.5992188453674316
 1.5701148509979248
 1.5563862323760986
 1.5182199478149414
 1.5484421253204346
 1.578866958618164
 1.5370879173278809
) 30)
1.602063

(HEAD, origin/pr/18191) rebased on master

(/ (+
 1.5921480655670166
 1.5645711421966553
 1.5695910453796387
 1.5787348747253418
 1.6047391891479492
 1.563709020614624
 1.5736000537872314
 1.5630147457122803
 1.5653061866760254
 1.599463939666748
 1.5765950679779053
 1.692945957183838
 1.580806016921997
 1.5970890522003174
 1.5502479076385498
 1.7198941707611084
 1.5886380672454834
 1.5441138744354248
 1.601973056793213
 1.5830471515655518
 1.7337110042572021
 1.5389633178710938
 1.5398321151733398
 1.588163137435913
 1.744027853012085
 1.569767951965332
 1.6018469333648682
 1.560020923614502
 1.542902946472168
 1.5561339855194092
) 30)
1.5928533

(/ 1.5928533 1.602063)
0.9942514

jonatack · 2020-05-27T07:43:50Z

The build warnings I encountered building in Debian with Clang 6 were also present when building on MacOS with the default Apple clang version 11.

In both cases, these are the only compiler warnings I'm seeing while building Bitcoin Core. They can be removed by either making the txiters const reference or non-const.

txmempool.cpp:89:35: warning: loop variable 'grand_child_it' of type 'const CTxMemPool::txiter' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag>') creates a copy from type 'const CTxMemPool::txiter' [-Wrange-loop-analysis]
                for (const txiter grand_child_it : cached_grand_children->second) {
                                  ^
txmempool.cpp:89:22: note: use reference type 'const CTxMemPool::txiter &' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag> &') to prevent copying
                for (const txiter grand_child_it : cached_grand_children->second) {
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
txmempool.cpp:78:27: warning: loop variable 'child_it' of type 'const CTxMemPool::txiter' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag>') creates a copy from type 'const CTxMemPool::txiter' [-Wrange-loop-analysis]
        for (const txiter child_it : GetMemPoolChildren(next_it)) {
                          ^
txmempool.cpp:78:14: note: use reference type 'const CTxMemPool::txiter &' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag> &') to prevent copying
        for (const txiter child_it : GetMemPoolChildren(next_it)) {
             ^~~~~~~~~~~~~~~~~~~~~~~

2 warnings generated.

jonatack · 2020-05-27T07:48:12Z

Thanks all for running and developing these benches!

With this patch we allocate way less memory.

I think it would help the PR to feature relevant benchmarkings, with instructions to reproduce them, in the PR description.

maflcko · 2020-05-27T11:38:00Z

I don't think we have documentation on how to track memory usage. I remember I used massif at one point, but I forgot to write down notes. #17164 (comment)

JeremyRubin · 2020-05-28T04:21:26Z

I think it would help the PR to feature relevant benchmarkings, with instructions to reproduce them, in the PR description.

Good point. Generally this one is supposed to be an obviously better version that doesn't really require much benching. You can easily see from the code that std::sets go to vectors and a std::set of all conflicts is removed. But it's a good standard to hold to I suppose to want benches/testing for this stuff even if it seems obvious -- but as @MarcoFalke points out, benching memory is a bit tricky...

@sdaftuar do you think any of the benches you've run previously can be published in a reproducible way?

JeremyRubin · 2020-05-28T04:29:57Z

@jonatack I'll fix the iterator reference thing. Is there a way to make this warning trigger on all platforms that you know about?

adamjonas · 2020-06-15T19:19:27Z

I encountered identical build warnings with MacOS clang 10.

fjahr · 2020-06-17T15:50:09Z

Concept ACK, also encountered the build warnings though.

JeremyRubin · 2020-06-17T17:05:55Z

Will fix the build warnings soon; thanks all for the review (will tag some people when I try to fix as I can't trigger locally).

JeremyRubin · 2020-06-17T20:35:26Z

@jonatack can you confirm this fixes the build warning you encountered?

adamjonas · 2020-06-18T00:31:04Z

Confirming that 32985d34a0a0cfd4ff8c4d96ff461778eaf72a8a produced the warnings but 8e5aa08 does not.

hebasto

re-ACK 8e5aa08, tested on Linux Mint 19.3 (x86_64) with Clang 6.0.0:

32985d34a0a0cfd4ff8c4d96ff461778eaf72a8a

$ make > /dev/null
...
txmempool.cpp:89:35: warning: loop variable 'grand_child_it' of type 'const CTxMemPool::txiter' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag>') creates a copy from type 'const CTxMemPool::txiter' [-Wrange-loop-analysis]
                for (const txiter grand_child_it : cached_grand_children->second) {
                                  ^
txmempool.cpp:89:22: note: use reference type 'const CTxMemPool::txiter &' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag> &') to prevent copying
                for (const txiter grand_child_it : cached_grand_children->second) {
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
txmempool.cpp:78:27: warning: loop variable 'child_it' of type 'const CTxMemPool::txiter' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag>') creates a copy from type 'const CTxMemPool::txiter' [-Wrange-loop-analysis]
        for (const txiter child_it : GetMemPoolChildren(next_it)) {
                          ^
txmempool.cpp:78:14: note: use reference type 'const CTxMemPool::txiter &' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag> &') to prevent copying
        for (const txiter child_it : GetMemPoolChildren(next_it)) {
             ^~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.
...

8e5aa08 -- no such warnings

ariard

I think this PR should layout more optimization goal, i.e are these performance changes pursued to improve adversarial scenarios where a burdensome to evaluate reorg would unbalance block race ? Or it's still under the whole Epoch mempool goal to support more complex packages ?

You should mention performance improvements in commit message, namely adding an epoch on descendants traversal to avoid duplicate and swap-in std::vector against std::set to remove the O(log n) at element insertion and argue why that's an improvement. Wrt to std::vector, is this also an improvement memory-size ? You can also spread them on 2 different commits to ease review.

Overall, maybe you should come with a benchmark suite with package payloads of different complexity to avoid hitting hidden performance regressions ? I don't think mempool_updatefromblock has been designed for this.

ariard · 2020-07-09T17:30:34Z

src/txmempool.cpp

-                // but don't traverse again.
-                for (txiter cacheEntry : cacheIt->second) {
-                    setAllDescendants.insert(cacheEntry);
+    const auto epoch = GetFreshEpoch();


I think you should layout high-level algorithm behavior here, like "cacheMap maintain a key-elements mapping where each key represents a back-to-the-mempool transactions, elements represents its children sorted XXX. We merge cache line based on key intersecting with children of a latter cache entry evaluation"

You can keep per-line comments but it would be better to normalize style, IMO you blur code behavior description (e.g "Create something") and assumptions ("Elements are excluded, ...") which is a bit confusing.

ariard · 2020-07-09T17:44:36Z

src/txmempool.cpp

+    int64_t modify_count = 0;
+    // All entries in the new_cache_line have
+    // already been checked against excluded list
+    for (txiter child_it : new_cache_line) {


I wonder if previous accounting logic was correct to exclude already-processed vHashesToUpdate member from descendant state update. E.g if back-to-mempool transaction Y is processed first, as reverse_iterate enforces, its parents X, also to process as second, won't account it as part of modify_fee and others.

AFAICT, new code maintains behavior by not including Y as part of X's children cache line.

Correct me if I miss some boost:multi_index internals ?

Ok so suppose vhashestoupdate is {X, Y}.

First we submit X to mempool and Y to mempool.

Then we reverse iterate and update Y, then X.

Previously, we scanned across all of the direct children in the mempool of Y then of X and updated the parent/child relationships. So Y would do only things already in the mempool, and X would see Y in this list, but exclude it in setExcluded as it was set properly when Y was put in the mempool.

In the new patch, we scanned across all of the direct children in the mempool of Y then of X and updated the parent/child relationships. So Y would do only things already in the mempool, and X would see Y in this list, but exclude it in mapMemPoolDescendants as it was set properly when Y was put in the mempool, and mapMemPoolDescendants will get a cacheline entry for everything in vHashesToUpdate; and because of iteration order, children will always be visible to parents.

When we actually call into UpdateForDescendants, we only update against things that are in the cachelines, and not the keys to those cachelines. This is because earlier when Y was accepted to the mempool it should have already modified X's fees.

Gotcha, thanks for the laid-out explanation, I forgot that ancestor state X is well-modified at Y new mempool insertion. I was just confused by the sybillic comment "because any descendants in cache were added to the mempool after the transaction being updated and hence their state is already reflected in the parent state". In fact that's should point to AcceptToMemoryPool in validation's UpdateMempoolForReorg to enhance clarity.

DrahtBot · 2020-09-07T11:21:37Z

🐙 This pull request conflicts with the target branch and needs rebase.

_{Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".}

DrahtBot · 2022-02-22T12:18:03Z

There hasn't been much activity lately and the patch still needs rebase. What is the status here?

Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
Is it no longer relevant? ➡️ Please close.
Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

Crypt-iQ · 2022-04-19T14:49:10Z

src/txmempool.cpp

                // We can skip updating entries we've encountered before or that
                // are in the block (which are already accounted for).
-                if (!visited(childIter) && !setAlreadyIncluded.count(childHash)) {
+                if (!visited(childIter) && !mapMemPoolDescendantsToUpdate.count(childIter)) {


Not sure if this PR is still active, but I believe it may be more efficient to check:

+++ if (!visited(childIter) && !childIter->GetMemPoolParentsConst().count(it) {

since if the child is unvisited and is in vHashesToUpdate, it will have it as a parent, which isn't the case otherwise. I think this would be more efficient if the childiter's set of parents is < cache size, which I would expect to be true more often than not

glozow · 2022-04-22T21:58:31Z

this is a duplicate of #24158 and has needed rebase for 18 months, close?

fanquake · 2022-04-25T09:01:32Z

close?

I think so. Looking at the comments in #24158:

oh oops.
uh yeah TBH I forgot I had that other PR open.
They should be mostly the same.
The main difference is the earlier one also applies an optimization getting rid of setExclude and just using the cache line presence instead, which ends up being redundant with the setExclude.
We can add that optimization as a separate PR since it's a little bit less obvious why it works, I left it out when I rewrote this one.

8098dea test: Add mempool_updatefromblock.py (Hennadii Stepanov) Pull request description: This PR adds a new test for mempool update of transaction descendants/ancestors information (count, size) when transactions have been re-added from a disconnected block to the mempool. It could be helpful for working on PRs like bitcoin#17925, bitcoin#18191. ACKs for top commit: ariard: ACK 8098dea Tree-SHA512: 7e808fa8df8d7d7a7dbdc3f79361049b49c7bce9b58fd5539b28c9636bedac747695537e500d7ed68dc8bdb80167ad3f1c01086f7551691405d2ba2e38ef1d06

JeremyRubin requested review from ajtowns and sdaftuar February 22, 2020 04:19

fanquake added the Mempool label Feb 22, 2020

JeremyRubin force-pushed the epoch-mempool-cache-regression-plus-remove-excluded branch from cae5e32 to 182e676 Compare February 22, 2020 04:35

sdaftuar reviewed Feb 25, 2020

View reviewed changes

src/txmempool.cpp Outdated Show resolved Hide resolved

src/txmempool.cpp Outdated Show resolved Hide resolved

JeremyRubin force-pushed the epoch-mempool-cache-regression-plus-remove-excluded branch 2 times, most recently from 80a2488 to ae89491 Compare March 3, 2020 01:17

JeremyRubin force-pushed the epoch-mempool-cache-regression-plus-remove-excluded branch from 857c42d to b262e8c Compare March 3, 2020 05:51

JeremyRubin changed the title ~~Epoch mempool cache regression plus remove excluded set.~~ Change UpdateForDescendants to use Epochs Mar 6, 2020

JeremyRubin force-pushed the epoch-mempool-cache-regression-plus-remove-excluded branch from b262e8c to 32985d3 Compare March 6, 2020 19:45

hebasto reviewed Mar 29, 2020

View reviewed changes