Skip to content

Conversation

@JeremyRubin
Copy link
Contributor

@JeremyRubin JeremyRubin commented Feb 22, 2020

This patch replaces #18120, fixing some regressions or issues with that PR. The overall change here should be easier to review as well.

Refactors UpdateForDescendants to use Epochs instead of sets to de-duplicate traversal, and replaces the cache entries with a vector instead of a set. This is a straightforward win. The algorithm is a bit clever so as not to require two separate vectors for txiters that need expanding and txiters that have already been processed, but I think it's relatively easy to reason about.

This is extracted from #18063 as a standalone because it turns out the DoS issues with this code perhaps merit more extensive review http://gnusha.org/bitcoin-core-dev/2020-02-11.log. That PR will stay open for review, but this PR is designed as an "obviously better" win that can be merged now to keep progress up on the mempool project.

sdaftuar can you re-run your bench from #18120?

@JeremyRubin JeremyRubin force-pushed the epoch-mempool-cache-regression-plus-remove-excluded branch from cae5e32 to 182e676 Compare February 22, 2020 04:35
@DrahtBot
Copy link
Contributor

DrahtBot commented Feb 22, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

Copy link
Member

@sdaftuar sdaftuar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only looked at ec237cf8c640f9e85c3c7aad0d3eb243a346b1f9 but I think maybe not worth the added complexity, if I'm understanding correctly.

Also the shrink_to_fit of the update_cache vector could just be included in #18120?

@JeremyRubin JeremyRubin force-pushed the epoch-mempool-cache-regression-plus-remove-excluded branch 2 times, most recently from 80a2488 to ae89491 Compare March 3, 2020 01:17
@JeremyRubin
Copy link
Contributor Author

Fixed and updated comment.

@JeremyRubin JeremyRubin force-pushed the epoch-mempool-cache-regression-plus-remove-excluded branch from 857c42d to b262e8c Compare March 3, 2020 05:51
@JeremyRubin
Copy link
Contributor Author

Sorry for all the line noise; I realized I forgot to add a continue, and then decided that I could DRY up the code a bit. I'm pretty happy with the final version, I think it's a bit easier to view because we don't treat update_it special from other iterators we walk (other than it not going into the cache line).

I can squash this all down into one commit and replace #18120.

@JeremyRubin JeremyRubin changed the title Epoch mempool cache regression plus remove excluded set. Change UpdateForDescendants to use Epochs Mar 6, 2020
@JeremyRubin JeremyRubin force-pushed the epoch-mempool-cache-regression-plus-remove-excluded branch from b262e8c to 32985d3 Compare March 6, 2020 19:45
@hebasto
Copy link
Member

hebasto commented Mar 29, 2020

Concept ACK.

@hebasto
Copy link
Member

hebasto commented Mar 30, 2020

IIUC, the first pass of the do ... while (true); loop just emplaces all of the direct descendants of update_it tx, i.e., its children, to the new_cache_line, and taints their m_epoch. All of the following loop passes process grand children, grand-grand children and so on. And two thing are confusing to me:

  1. the following comment: https://github.com/bitcoin/bitcoin/blob/32985d34a0a0cfd4ff8c4d96ff461778eaf72a8a/src/txmempool.cpp#L79-L80

I'd suggest to remove it completely.

  1. the following variable name: https://github.com/bitcoin/bitcoin/blob/32985d34a0a0cfd4ff8c4d96ff461778eaf72a8a/src/txmempool.cpp#L87

I'd suggest:

  • s/cached_grand_children/cached_descendants/
  • s/grand_child_it/descendant_it/
  • s/the grand child/the descendant/ in the comment

EDIT: I realized, that you mean "children" and "grandchildren" relative to an element of the new_cache_line. So, ignore my suggestion (2).

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 32985d34a0a0cfd4ff8c4d96ff461778eaf72a8a, tested on Linux Mint 19.3 with the functional test from #18485 (cfaa79b):

the average time (from 10 runs) reported by test decreased by 9% from 0.367s to 0.347s.

maflcko pushed a commit that referenced this pull request Apr 29, 2020
8098dea test: Add mempool_updatefromblock.py (Hennadii Stepanov)

Pull request description:

  This PR adds a new test for mempool update of transaction descendants/ancestors information (count, size) when transactions have been re-added from a disconnected block to the mempool.

  It could be helpful for working on PRs like #17925, #18191.

ACKs for top commit:
  ariard:
    ACK 8098dea

Tree-SHA512: 7e808fa8df8d7d7a7dbdc3f79361049b49c7bce9b58fd5539b28c9636bedac747695537e500d7ed68dc8bdb80167ad3f1c01086f7551691405d2ba2e38ef1d06
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Apr 29, 2020
8098dea test: Add mempool_updatefromblock.py (Hennadii Stepanov)

Pull request description:

  This PR adds a new test for mempool update of transaction descendants/ancestors information (count, size) when transactions have been re-added from a disconnected block to the mempool.

  It could be helpful for working on PRs like bitcoin#17925, bitcoin#18191.

ACKs for top commit:
  ariard:
    ACK 8098dea

Tree-SHA512: 7e808fa8df8d7d7a7dbdc3f79361049b49c7bce9b58fd5539b28c9636bedac747695537e500d7ed68dc8bdb80167ad3f1c01086f7551691405d2ba2e38ef1d06
@jonatack
Copy link
Member

It seems that performance is the motivation for this PR. Could benchmarks, or a link to them, be added to the PR description?

@hebasto
Copy link
Member

hebasto commented May 25, 2020

@jonatack

It seems that performance is the motivation for this PR. Could benchmarks, or a link to them, be added to the PR description?

Some benchmarks are done :)

@jonatack
Copy link
Member

For what it's worth, 30 test runs on master versus this branch rebased on master seem to show no significant difference in run time (~0.5%). They do appear to show a significant reduction in run time variance with this PR. I ran four alternating cycles of 15: master, PR, master, PR; it took most of a day to for the 60 runs.

results with a 2012 MacBookPro running Mojave

master

(/ (+
 1.6158521175384521
 1.5466628074645996
 1.6130828857421875
 1.5822479724884033
 1.8187620639801025
 1.5548689365386963
 1.5336580276489258
 1.5182380676269531
 1.5340137481689453
 1.5922291278839111
 1.6903131008148193
 1.8066248893737793
 1.8288109302520752
 1.5529100894927979
 1.4984791278839111
 1.6140100955963135
 1.6495018005371094
 1.6218130588531494
 1.6670641899108887
 1.550534963607788
 1.540299654006958
 1.6754770278930664
 1.548105001449585
 1.5992188453674316
 1.5701148509979248
 1.5563862323760986
 1.5182199478149414
 1.5484421253204346
 1.578866958618164
 1.5370879173278809
) 30)
1.602063

(HEAD, origin/pr/18191) rebased on master

(/ (+
 1.5921480655670166
 1.5645711421966553
 1.5695910453796387
 1.5787348747253418
 1.6047391891479492
 1.563709020614624
 1.5736000537872314
 1.5630147457122803
 1.5653061866760254
 1.599463939666748
 1.5765950679779053
 1.692945957183838
 1.580806016921997
 1.5970890522003174
 1.5502479076385498
 1.7198941707611084
 1.5886380672454834
 1.5441138744354248
 1.601973056793213
 1.5830471515655518
 1.7337110042572021
 1.5389633178710938
 1.5398321151733398
 1.588163137435913
 1.744027853012085
 1.569767951965332
 1.6018469333648682
 1.560020923614502
 1.542902946472168
 1.5561339855194092
) 30)
1.5928533

(/ 1.5928533 1.602063)
0.9942514

@jonatack
Copy link
Member

The build warnings I encountered building in Debian with Clang 6 were also present when building on MacOS with the default Apple clang version 11.

In both cases, these are the only compiler warnings I'm seeing while building Bitcoin Core. They can be removed by either making the txiters const reference or non-const.

txmempool.cpp:89:35: warning: loop variable 'grand_child_it' of type 'const CTxMemPool::txiter' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag>') creates a copy from type 'const CTxMemPool::txiter' [-Wrange-loop-analysis]
                for (const txiter grand_child_it : cached_grand_children->second) {
                                  ^
txmempool.cpp:89:22: note: use reference type 'const CTxMemPool::txiter &' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag> &') to prevent copying
                for (const txiter grand_child_it : cached_grand_children->second) {
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
txmempool.cpp:78:27: warning: loop variable 'child_it' of type 'const CTxMemPool::txiter' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag>') creates a copy from type 'const CTxMemPool::txiter' [-Wrange-loop-analysis]
        for (const txiter child_it : GetMemPoolChildren(next_it)) {
                          ^
txmempool.cpp:78:14: note: use reference type 'const CTxMemPool::txiter &' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::__1::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag> &') to prevent copying
        for (const txiter child_it : GetMemPoolChildren(next_it)) {
             ^~~~~~~~~~~~~~~~~~~~~~~

2 warnings generated.

@jonatack
Copy link
Member

Thanks all for running and developing these benches!

With this patch we allocate way less memory.

I think it would help the PR to feature relevant benchmarkings, with instructions to reproduce them, in the PR description.

@maflcko
Copy link
Member

maflcko commented May 27, 2020

I don't think we have documentation on how to track memory usage. I remember I used massif at one point, but I forgot to write down notes. #17164 (comment)

@JeremyRubin
Copy link
Contributor Author

I think it would help the PR to feature relevant benchmarkings, with instructions to reproduce them, in the PR description.

Good point. Generally this one is supposed to be an obviously better version that doesn't really require much benching. You can easily see from the code that std::sets go to vectors and a std::set of all conflicts is removed. But it's a good standard to hold to I suppose to want benches/testing for this stuff even if it seems obvious -- but as @MarcoFalke points out, benching memory is a bit tricky...

@sdaftuar do you think any of the benches you've run previously can be published in a reproducible way?

@JeremyRubin
Copy link
Contributor Author

@jonatack I'll fix the iterator reference thing. Is there a way to make this warning trigger on all platforms that you know about?

@adamjonas
Copy link
Member

I encountered identical build warnings with MacOS clang 10.

@fjahr
Copy link
Contributor

fjahr commented Jun 17, 2020

Concept ACK, also encountered the build warnings though.

@JeremyRubin
Copy link
Contributor Author

Will fix the build warnings soon; thanks all for the review (will tag some people when I try to fix as I can't trigger locally).

@JeremyRubin JeremyRubin force-pushed the epoch-mempool-cache-regression-plus-remove-excluded branch from 32985d3 to 8e5aa08 Compare June 17, 2020 20:34
@JeremyRubin
Copy link
Contributor Author

@jonatack can you confirm this fixes the build warning you encountered?

@adamjonas
Copy link
Member

Confirming that 32985d34a0a0cfd4ff8c4d96ff461778eaf72a8a produced the warnings but 8e5aa08 does not.

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK 8e5aa08, tested on Linux Mint 19.3 (x86_64) with Clang 6.0.0:

  • 32985d34a0a0cfd4ff8c4d96ff461778eaf72a8a
$ make > /dev/null
...
txmempool.cpp:89:35: warning: loop variable 'grand_child_it' of type 'const CTxMemPool::txiter' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag>') creates a copy from type 'const CTxMemPool::txiter' [-Wrange-loop-analysis]
                for (const txiter grand_child_it : cached_grand_children->second) {
                                  ^
txmempool.cpp:89:22: note: use reference type 'const CTxMemPool::txiter &' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag> &') to prevent copying
                for (const txiter grand_child_it : cached_grand_children->second) {
                     ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
txmempool.cpp:78:27: warning: loop variable 'child_it' of type 'const CTxMemPool::txiter' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag>') creates a copy from type 'const CTxMemPool::txiter' [-Wrange-loop-analysis]
        for (const txiter child_it : GetMemPoolChildren(next_it)) {
                          ^
txmempool.cpp:78:14: note: use reference type 'const CTxMemPool::txiter &' (aka 'const hashed_index_iterator<hashed_index_node<ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, ordered_index_node<boost::multi_index::detail::null_augment_policy, index_node_base<CTxMemPoolEntry, std::allocator<CTxMemPoolEntry> > > > >, boost::multi_index::detail::hashed_unique_tag>, bucket_array<allocator<CTxMemPoolEntry> >, boost::multi_index::detail::hashed_index_global_iterator_tag> &') to prevent copying
        for (const txiter child_it : GetMemPoolChildren(next_it)) {
             ^~~~~~~~~~~~~~~~~~~~~~~
2 warnings generated.
...

Copy link

@ariard ariard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this PR should layout more optimization goal, i.e are these performance changes pursued to improve adversarial scenarios where a burdensome to evaluate reorg would unbalance block race ? Or it's still under the whole Epoch mempool goal to support more complex packages ?

You should mention performance improvements in commit message, namely adding an epoch on descendants traversal to avoid duplicate and swap-in std::vector against std::set to remove the O(log n) at element insertion and argue why that's an improvement. Wrt to std::vector, is this also an improvement memory-size ? You can also spread them on 2 different commits to ease review.

Overall, maybe you should come with a benchmark suite with package payloads of different complexity to avoid hitting hidden performance regressions ? I don't think mempool_updatefromblock has been designed for this.

// but don't traverse again.
for (txiter cacheEntry : cacheIt->second) {
setAllDescendants.insert(cacheEntry);
const auto epoch = GetFreshEpoch();
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should layout high-level algorithm behavior here, like "cacheMap maintain a key-elements mapping where each key represents a back-to-the-mempool transactions, elements represents its children sorted XXX. We merge cache line based on key intersecting with children of a latter cache entry evaluation"

You can keep per-line comments but it would be better to normalize style, IMO you blur code behavior description (e.g "Create something") and assumptions ("Elements are excluded, ...") which is a bit confusing.

int64_t modify_count = 0;
// All entries in the new_cache_line have
// already been checked against excluded list
for (txiter child_it : new_cache_line) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if previous accounting logic was correct to exclude already-processed vHashesToUpdate member from descendant state update. E.g if back-to-mempool transaction Y is processed first, as reverse_iterate enforces, its parents X, also to process as second, won't account it as part of modify_fee and others.

AFAICT, new code maintains behavior by not including Y as part of X's children cache line.

Correct me if I miss some boost:multi_index internals ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so suppose vhashestoupdate is {X, Y}.

First we submit X to mempool and Y to mempool.

Then we reverse iterate and update Y, then X.

Previously, we scanned across all of the direct children in the mempool of Y then of X and updated the parent/child relationships. So Y would do only things already in the mempool, and X would see Y in this list, but exclude it in setExcluded as it was set properly when Y was put in the mempool.

In the new patch, we scanned across all of the direct children in the mempool of Y then of X and updated the parent/child relationships. So Y would do only things already in the mempool, and X would see Y in this list, but exclude it in mapMemPoolDescendants as it was set properly when Y was put in the mempool, and mapMemPoolDescendants will get a cacheline entry for everything in vHashesToUpdate; and because of iteration order, children will always be visible to parents.

When we actually call into UpdateForDescendants, we only update against things that are in the cachelines, and not the keys to those cachelines. This is because earlier when Y was accepted to the mempool it should have already modified X's fees.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha, thanks for the laid-out explanation, I forgot that ancestor state X is well-modified at Y new mempool insertion. I was just confused by the sybillic comment "because any descendants in cache were added to the mempool after the transaction being updated and hence their state is already reflected in the parent state". In fact that's should point to AcceptToMemoryPool in validation's UpdateMempoolForReorg to enhance clarity.

@DrahtBot
Copy link
Contributor

DrahtBot commented Sep 7, 2020

🐙 This pull request conflicts with the target branch and needs rebase.

Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a "draft".

@DrahtBot
Copy link
Contributor

There hasn't been much activity lately and the patch still needs rebase. What is the status here?
  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

// We can skip updating entries we've encountered before or that
// are in the block (which are already accounted for).
if (!visited(childIter) && !setAlreadyIncluded.count(childHash)) {
if (!visited(childIter) && !mapMemPoolDescendantsToUpdate.count(childIter)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this PR is still active, but I believe it may be more efficient to check:

+++ if (!visited(childIter) && !childIter->GetMemPoolParentsConst().count(it) {

since if the child is unvisited and is in vHashesToUpdate, it will have it as a parent, which isn't the case otherwise. I think this would be more efficient if the childiter's set of parents is < cache size, which I would expect to be true more often than not

@glozow
Copy link
Member

glozow commented Apr 22, 2022

this is a duplicate of #24158 and has needed rebase for 18 months, close?

@fanquake
Copy link
Member

close?

I think so. Looking at the comments in #24158:

oh oops.
uh yeah TBH I forgot I had that other PR open.
They should be mostly the same.
The main difference is the earlier one also applies an optimization getting rid of setExclude and just using the cache line presence instead, which ends up being redundant with the setExclude.
We can add that optimization as a separate PR since it's a little bit less obvious why it works, I left it out when I rewrote this one.

@fanquake fanquake closed this Apr 25, 2022
PastaPastaPasta pushed a commit to vijaydasmp/dash that referenced this pull request Jan 12, 2023
8098dea test: Add mempool_updatefromblock.py (Hennadii Stepanov)

Pull request description:

  This PR adds a new test for mempool update of transaction descendants/ancestors information (count, size) when transactions have been re-added from a disconnected block to the mempool.

  It could be helpful for working on PRs like bitcoin#17925, bitcoin#18191.

ACKs for top commit:
  ariard:
    ACK 8098dea

Tree-SHA512: 7e808fa8df8d7d7a7dbdc3f79361049b49c7bce9b58fd5539b28c9636bedac747695537e500d7ed68dc8bdb80167ad3f1c01086f7551691405d2ba2e38ef1d06
@bitcoin bitcoin locked and limited conversation to collaborators Apr 25, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.