Skip to content

Conversation

@dongcarl
Copy link
Contributor

@dongcarl dongcarl commented May 25, 2022

This is part of the libbitcoinkernel project: #24303, https://github.com/bitcoin/bitcoin/projects/18

This PR reduces the number of call sites where we explicitly construct CTxMemPool. This is done in preparation for later PRs which decouple the mempool module from ArgsManager, eventually all of libbitcoinkernel will be decoupled from ArgsManager.

The changes in this PR:

  • Allows us to have less code churn as we modify CTxMemPool's constructor in later PRs
  • In many cases, we can make use of existing CTxMemPool instances, getting rid of extraneous constructions
  • In other cases, we construct a ChainTestingSetup and use the CTxMemPool there, so that we can rely on the logic in setup_common to set things up correctly

Notes for Reviewers

A note on using existing mempools

When evaluating whether or not it's appropriate to use an existing mempool in a *TestingSetup struct, the key is to make sure that the mempool has the same lifetime as the *TestingSetup struct.

Example 1: In src/fuzz/tx_pool.cpp, the TestingSetup is initialized in initialize_tx_pool and lives as a static global, while the CTxMemPool is in the tx_pool_standard fuzz target, meaning that each time the tx_pool_standard fuzz target gets run, a new CTxMemPool is created. If we were to use the static global TestingSetup's CTxMemPool we might run into problems since its CTxMemPool will carry state between subsequent runs. This is why we don't modify src/fuzz/tx_pool.cpp in this PR.

Example 2: In src/bench/mempool_eviction.cpp, we see that the TestingSetup is in the same scope as the constructed CTxMemPool, so it is safe to use its CTxMemPool.

A note on checking CTxMemPool ctor call sites

After the "tree-wide: clang-format CTxMemPool references" commit, you can find all CTxMemPool ctor call sites with the following command:

git grep -E -e 'make_unique<CTxMemPool>' \
            -e '\bCTxMemPool\s+[^({;]+[({]' \
            -e '\bCTxMemPool\s+[^;]+;' \
            -e '\bnew\s+CTxMemPool\b'

At the end of the PR, you will find that there are still quite a few call sites that we can seemingly get rid of:

$ git grep -E -e 'make_unique<CTxMemPool>' -e '\bCTxMemPool\s+[^({;]+[({]' -e '\bCTxMemPool\s+[^;]+;' -e '\bnew\s+CTxMemPool\b'
# rearranged for easier explication
src/init.cpp:        node.mempool = std::make_unique<CTxMemPool>(node.fee_estimator.get(), mempool_check_ratio);
src/test/util/setup_common.cpp:    m_node.mempool = std::make_unique<CTxMemPool>(m_node.fee_estimator.get(), 1);
src/rpc/mining.cpp:        CTxMemPool empty_mempool;
src/test/util/setup_common.cpp:    CTxMemPool empty_pool;
src/bench/mempool_stress.cpp:    CTxMemPool pool;
src/bench/mempool_stress.cpp:    CTxMemPool pool;
src/test/fuzz/rbf.cpp:    CTxMemPool pool;
src/test/fuzz/tx_pool.cpp:    CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1};
src/test/fuzz/tx_pool.cpp:    CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1};
src/test/fuzz/validation_load_mempool.cpp:    CTxMemPool pool{};
src/txmempool.h:    /** Create a new CTxMemPool.

Let's break them down one by one:

src/init.cpp:        node.mempool = std::make_unique<CTxMemPool>(node.fee_estimator.get(), mempool_check_ratio);
src/test/util/setup_common.cpp:    m_node.mempool = std::make_unique<CTxMemPool>(m_node.fee_estimator.get(), 1);

Necessary


src/rpc/mining.cpp:        CTxMemPool empty_mempool;
src/test/util/setup_common.cpp:    CTxMemPool empty_pool;

These are fixed in #25223 where we stop requiring the BlockAssembler to have a CTxMemPool if it's not going to consult it anyway (as is the case in these two call sites)


src/bench/mempool_stress.cpp:    CTxMemPool pool;
src/bench/mempool_stress.cpp:    CTxMemPool pool;

Fixed in #24927.


src/test/fuzz/rbf.cpp:    CTxMemPool pool;
src/test/fuzz/tx_pool.cpp:    CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1};
src/test/fuzz/tx_pool.cpp:    CTxMemPool tx_pool_{/*estimator=*/nullptr, /*check_ratio=*/1};
src/test/fuzz/validation_load_mempool.cpp:    CTxMemPool pool{};

These are all cases where we don't want the CTxMemPool state to persist between runs, see the previous section "A note on using existing mempools"


src/txmempool.h:    /** Create a new CTxMemPool.

It's a comment (someone link me to a grep that understands syntax plz thx)

[META] Do this so that we can more easily grep for all actual instances
       of CTxMemPool construction.
@DrahtBot
Copy link
Contributor

DrahtBot commented May 26, 2022

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #25290 ([kernel 3a/n] Decouple CTxMemPool from ArgsManager by dongcarl)
  • #25077 (Fix chain tip data race and corrupt rest response by MarcoFalke)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@maflcko
Copy link
Member

maflcko commented May 27, 2022

It looks like you are mostly modifying tests. If the goal it to avoid future code churn, why not add a test-only function that constructs a test-only mempool?

@dongcarl
Copy link
Contributor Author

dongcarl commented May 27, 2022

@MarcoFalke

It looks like you are mostly modifying tests. If the goal it to avoid future code churn, why not add a test-only function that constructs a test-only mempool?

I agree, though in my mind ChainTestingSetup::ChainTestingSetup is that "test-only function that constructs a test-only mempool", and I'm just making more tests use it! 😄

@maflcko
Copy link
Member

maflcko commented May 28, 2022

Well, it also sets up a full chain, so it is not possible to call it twice. I am mostly asking in the context of #25073 , which I did after your suggestion (see #19909 (comment) ).

@dongcarl
Copy link
Contributor Author

dongcarl commented May 31, 2022

@MarcoFalke

I am mostly asking in the context of #25073 , which I did after your suggestion (see #19909 (comment) ).

Ah I see! I consider the cases in #25073 as ones where we don't want the CTxMemPool state to persist for the lifetime of the TestingSetup, and therefore we wouldn't have changed it in this PR at all. See "A note on using existing mempools" in the original description. I'm also okay with adding these new invocations of the CTxMemPool constructor in #25073 since they are logically necessary and avoids clear().

I do now see the benefit to having a standalone function that constructs a test-only mempool... It would coalesce all of the cases not addressed by this PR and 2e (mostly the src/fuzz ones and the new miner_test ones in #25073) into just one call site... I'll give it some thought.

@dongcarl
Copy link
Contributor Author

@MarcoFalke

I'll give it some thought.

Given it some thought. I think "making a standalone function that constructs a test-only mempool" is not relevant to this PR and can be done in the future. In fact, it's likely easier to do with the CTxMemPool::Options introduced in #25290.

I don't think anything we're doing here or in #25290 is mutually exclusive with your #25073, since we're not outright eliminating all CTxMemPool callsites, just eliminating them where they're not necessary. So if you're okay with it, I think we can table this discussion for now.

fanquake added a commit that referenced this pull request Jun 15, 2022
…tructing temporary empty mempools

0f1a259 miner: Make mempool optional for BlockAssembler (Carl Dong)
cc5739b miner: Make UpdatePackagesForAdded static (Carl Dong)
f024578 miner: Absorb SkipMapTxEntry into addPackageTxs (Carl Dong)

Pull request description:

  This is part of the libbitcoinkernel project: #24303, https://github.com/bitcoin/bitcoin/projects/18

  This is **_NOT_** dependent on, but is a "companion-PR" to #25215.

  ### Abstract

  This PR removes the need to construct `BlockAssembler` with temporary, empty mempools in cases where we don't want to source transactions from the mempool (e.g. in `TestChain100Setup::CreateBlock` and `generateblock`). After this PR, `BlockAssembler` will accept a `CTxMemPool` pointer and handle the `nullptr` case instead of requiring a `CTxMemPool` reference.

  An overview of the changes is best seen in the changes in the header file:

  ```diff
  diff --git a/src/node/miner.h b/src/node/miner.h
  index 7cf8e3f..7e9f503602 100644
  --- a/src/node/miner.h
  +++ b/src/node/miner.h
  @@ -147,7 +147,7 @@ private:
       int64_t m_lock_time_cutoff;

       const CChainParams& chainparams;
  -    const CTxMemPool& m_mempool;
  +    const CTxMemPool* m_mempool;
       CChainState& m_chainstate;

   public:
  @@ -157,8 +157,8 @@ public:
           CFeeRate blockMinFeeRate;
       };

  -    explicit BlockAssembler(CChainState& chainstate, const CTxMemPool& mempool);
  -    explicit BlockAssembler(CChainState& chainstate, const CTxMemPool& mempool, const Options& options);
  +    explicit BlockAssembler(CChainState& chainstate, const CTxMemPool* mempool);
  +    explicit BlockAssembler(CChainState& chainstate, const CTxMemPool* mempool, const Options& options);

       /** Construct a new block template with coinbase to scriptPubKeyIn */
       std::unique_ptr<CBlockTemplate> CreateNewBlock(const CScript& scriptPubKeyIn);
  @@ -177,7 +177,7 @@ private:
       /** Add transactions based on feerate including unconfirmed ancestors
         * Increments nPackagesSelected / nDescendantsUpdated with corresponding
         * statistics from the package selection (for logging statistics). */
  -    void addPackageTxs(int& nPackagesSelected, int& nDescendantsUpdated) EXCLUSIVE_LOCKS_REQUIRED(m_mempool.cs);
  +    void addPackageTxs(const CTxMemPool& mempool, int& nPackagesSelected, int& nDescendantsUpdated) EXCLUSIVE_LOCKS_REQUIRED(mempool.cs);

       // helper functions for addPackageTxs()
       /** Remove confirmed (inBlock) entries from given set */
  @@ -189,15 +189,8 @@ private:
         * These checks should always succeed, and they're here
         * only as an extra check in case of suboptimal node configuration */
       bool TestPackageTransactions(const CTxMemPool::setEntries& package) const;
  -    /** Return true if given transaction from mapTx has already been evaluated,
  -      * or if the transaction's cached data in mapTx is incorrect. */
  -    bool SkipMapTxEntry(CTxMemPool::txiter it, indexed_modified_transaction_set& mapModifiedTx, CTxMemPool::setEntries& failedTx) EXCLUSIVE_LOCKS_REQUIRED(m_mempool.cs);
       /** Sort the package in an order that is valid to appear in a block */
       void SortForBlock(const CTxMemPool::setEntries& package, std::vector<CTxMemPool::txiter>& sortedEntries);
  -    /** Add descendants of given transactions to mapModifiedTx with ancestor
  -      * state updated assuming given transactions are inBlock. Returns number
  -      * of updated descendants. */
  -    int UpdatePackagesForAdded(const CTxMemPool::setEntries& alreadyAdded, indexed_modified_transaction_set& mapModifiedTx) EXCLUSIVE_LOCKS_REQUIRED(m_mempool.cs);
   };

   int64_t UpdateTime(CBlockHeader* pblock, const Consensus::Params& consensusParams, const CBlockIndex* pindexPrev);
  ```

  ### Alternatives

  Aside from approach in this current PR, we can also take the approach of moving the `CTxMemPool*` argument from the `BlockAssembler` constructor to `BlockAssembler::CreateNewBlock`, since that's where it's needed anyway. I did not push this approach because it requires quite a lot of call sites to be changed. However, I do have it coded up and can do that if people express a strong preference. This would look something like:

  ```
  BlockAssembler::BlockAssembler(CChainState& chainstate, const Options& options);
  BlockAssembler::CreateNewBlock(const CScript& scriptPubKeyIn, const CTxMemPool* maybe_mempool);
  ```

  ### Future work

  Although wholly out of scope for this PR, we could potentially refine the `BlockAssembler` interface further, so that we have:

  ```
  BlockAssembler::BlockAssembler(CChainState& chainstate, const Options& options);
  BlockAssembler::CreateNewBlock(const CScript& scriptPubKeyIn, std::vector<CTransaction>& txs);
  BlockAssembler::CreateNewBlock(const CScript& scriptPubKeyIn, const CTxMemPool& mempool);
  ```

  Whereby `TestChain100Setup::CreateBlock` and `generateblock` would call the `BlockAssembler::CreateNewBlock` that takes in `CTransaction`s and we can potentially remove `RegenerateCommitments` altogether. All other callers can use the `CTxMemPool` version.

ACKs for top commit:
  glozow:
    ACK 0f1a259
  laanwj:
    Code review ACK 0f1a259
  MarcoFalke:
    ACK 0f1a259 🐊

Tree-SHA512: 2b4b1dbb43d85719f241ad1f19ceb7fc50cf764721da425a3d1ff71bd16328c4f86acff22e565bc9abee770d3ac8827a6676b66daa93dbf42dd817ad929e9448
Copy link
Member

@theuni theuni left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick concept/code review ACK, looks like no functional change.

I have no thoughts on the test setup changes, so I'll defer to @laanwj's and @MarcoFalke's reviews on those.

In case it's helpful for future work to use clang-query to find instances rather than these nasty greps, the incantation for matching CTxMemPool constructors would be:
cxxConstructExpr(hasType(cxxRecordDecl(hasName("CTxMemPool"))))

dongcarl added 5 commits June 15, 2022 17:28
...just don't try to consult it at all when fCheckMemPool is false
After this commit, there should be no explicit instantiation of
CTxMemPool in src/test other than those in fuzz/ and setup_common

-BEGIN VERIFY SCRIPT-
find_regex="CTxMemPool\s+([^;({]+)(|\(\)|\{\});" \
    && git grep -l -E "$find_regex" -- src/test \
        | grep -v -e "^src/test/util/setup_common.cpp$" \
                  -e "^src/test/fuzz/" \
        | xargs sed -i -E "s@$find_regex@CTxMemPool\& \1 = *Assert(m_node.mempool);@g"
-END VERIFY SCRIPT-
This is correct because:

- The ChainTestingSetup is constructed before the call to bench.run(...)
- All the runs are performed on the same mempool
@dongcarl dongcarl force-pushed the 2022-03-libbitcoinkernel-reduce-mempool-ctor branch from bf65ab9 to d273e53 Compare June 15, 2022 21:29
@dongcarl
Copy link
Contributor Author

Pushed bf65ab9088df6f7e49030f3539b51d75023a6572 -> d273e53

@laanwj
Copy link
Member

laanwj commented Jun 16, 2022

Code review ACK d273e53

@laanwj laanwj merged commit 489b587 into bitcoin:master Jun 16, 2022
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jun 17, 2022
@bitcoin bitcoin locked and limited conversation to collaborators Jun 16, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

Status: Done or Closed or Rethinking

Development

Successfully merging this pull request may close these issues.

5 participants