Faster tests with topological mempool rpc sync 🚀 #15858
Conversation
Nice! This has been on my list for a long time. Thanks for doing it! 🚀 Concept ACK
This is an impressive improvement in speed! A test here:
promag left a comment:
Concept ACK.
```python
# Push this pool to all targets
for i_target, rpc_target in enumerate(rpc_connections):
    missing_txids = pool[i_remote].difference(pool[i_target])
```
Could skip when `i_remote == i_target`?
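A sketch of the suggested guard, for illustration (the reviewer's idea as I read it, not the PR's code):

```python
# A node never needs its own transactions pushed back to it, so the
# inner loop could skip the self-pair before computing the difference.
for i_target, rpc_target in enumerate(rpc_connections):
    if i_remote == i_target:
        continue
    missing_txids = pool[i_remote].difference(pool[i_target])
```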
I opted for code brevity. Calculating the empty set as the difference of two equal sets should take negligible time compared to an RPC call.
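For illustration, the self-difference is just the empty set and stays in-process, while every RPC call pays a full round trip:

```python
# Two hypothetical txid strings; a set's difference with itself is
# always empty and costs only an in-memory set operation.
pool_i = {"aa" * 32, "bb" * 32}
assert pool_i.difference(pool_i) == set()
```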
Concept ACK. Consider describing this argument's purpose and use in the sync_mempools function docstring, in example_test.py, and possibly in rpc_deprecated.py.
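One possible wording for the suggested docstring (my sketch, not text from the PR):

```python
def sync_mempools(rpc_connections, *, wait=1, timeout=60, use_rpc_sync=False, flush_scheduler=True):
    """Wait until everybody has the same transactions in their memory pools.

    If use_rpc_sync is set, fetch each node's raw mempool over RPC and push
    the missing transactions to the other nodes in topological order,
    instead of waiting for p2p relay.
    """
```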
```diff
 pool = [set(r.getrawmempool()) for r in rpc_connections]
-if pool.count(pool[0]) == len(rpc_connections):
+sync_done = pool.count(pool[0]) == len(rpc_connections)
+if use_rpc_sync and not sync_done:
```
It seems slightly odd to me that this code, which we only ever expect to run once, should be in a while loop. I think it'd be clearer to split it out into its own function. Something like:
```python
def sync_mempools(rpc_connections, *, wait=1, timeout=60, use_rpc_sync=False, flush_scheduler=True):
    """
    Wait until everybody has the same transactions in their memory
    pools. If use_rpc_sync is set, sync all transactions right away.
    """
    if use_rpc_sync:
        force_sync_mempools(rpc_connections)
    else:
        stop_time = time.time() + timeout
        while time.time() <= stop_time:
            pool = [set(r.getrawmempool()) for r in rpc_connections]
            if pool.count(pool[0]) == len(rpc_connections):
                break
            time.sleep(wait)
        else:
            raise AssertionError("Mempool sync timed out:{}".format("".join("\n {!r}".format(m) for m in pool)))
    if flush_scheduler:
        for r in rpc_connections:
            r.syncwithvalidationinterfacequeue()


def force_sync_mempools(rpc_connections):
    class TxInfo:
        def __init__(self, *, raw, ancestors):
            self.raw = raw
            self.ancestors = ancestors

    def topo_send(txs, rpc, pool_add):
        for i in txs:
            topo_send(txs[i].ancestors, rpc, pool_add)
            if i not in pool_add:
                try:
                    assert_equal(i, rpc.sendrawtransaction(txs[i].raw))
                    pool_add.add(i)
                    # Note that conflicted txs (due to RBF) are not removed
                    # from the pool
                except JSONRPCException as e:
                    # This transaction violates policy (e.g. RBF policy). The
                    # mempools should still converge when the high-fee
                    # replacement is synced in a later call
                    assert 'insufficient fee' in e.error['message']

    pool = [set(r.getrawmempool()) for r in rpc_connections]
    # Iterate over all nodes, get their raw mempool and send the
    # missing txs to all other nodes
    for i_remote, rpc_remote in enumerate(rpc_connections):
        pool_remote = {
            txid: TxInfo(raw=rpc_remote.getrawtransaction(txid), ancestors=info['depends'])
            for txid, info in rpc_remote.getrawmempool(verbose=True).items()
        }
        # Create "recursive pools" for ancestors
        for tx in pool_remote:
            pool_remote[tx].ancestors = {a: pool_remote[a] for a in pool_remote[tx].ancestors}
        # Push this pool to all targets
        for i_target, rpc_target in enumerate(rpc_connections):
            missing_txids = pool[i_remote].difference(pool[i_target])
            # Send missing txs
            topo_send(
                txs={txid: pool_remote[txid]
                     for txid in pool_remote if txid in missing_txids},
                rpc=rpc_target,
                pool_add=pool[i_target],
            )
    # If the sync fails there is a logic error in the sync or test code
    pool = [set(r.getrawmempool()) for r in rpc_connections]
    assert pool.count(pool[0]) == len(rpc_connections)
```
I'd rather not use a while-else, whose behaviour I have to look up every time I encounter it. I opted for a smaller diff and added an assert, so the diff should be easier to review, and the assert should make it clear that this code only runs once.
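A hedged sketch of the in-loop shape described here; `attempted_rpc_sync` and `push_missing_txs` are hypothetical names standing in for the PR's actual code:

```python
import time

attempted_rpc_sync = False
stop_time = time.time() + timeout
while time.time() <= stop_time:
    pool = [set(r.getrawmempool()) for r in rpc_connections]
    sync_done = pool.count(pool[0]) == len(rpc_connections)
    if use_rpc_sync and not sync_done:
        # Runs at most once: the RPC push must converge in a single
        # pass, and the assert documents (and enforces) that.
        assert not attempted_rpc_sync
        attempted_rpc_sync = True
        push_missing_txs(rpc_connections, pool)  # hypothetical helper
    if sync_done:
        break
    time.sleep(wait)
```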
Also, rewriting the function would invalidate previous review.
I'm surprised you think Python's while-else syntax is confusing. I don't think I've ever had a problem with it: https://stackoverflow.com/a/3295949. I certainly think it's clearer than the existing code, which has a return buried in a conditional in the while loop, but hidden under the flush_scheduler cleanup code.
I'm pretty sure you've been pushed into this weird construction (execute-once code within a while loop) because of the flush_scheduler cleanup code being inside the loop. Restructuring the function so the loop breaks and then does the cleanup allows the RPC sync to be separated from the while loop, which seems clearer to me.
Unlike if-else, where only one branch is executed, while-else (or for-else) generally executes both branches. To make it even more confusing, it won't execute the else branch when you break.
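A minimal illustration of the semantics being described:

```python
# The else branch runs only when the loop exits without hitting break.
for n in [1, 3, 5]:
    if n % 2 == 0:
        break
else:
    print("no even number found")  # printed: the loop never broke

while False:
    pass
else:
    print("printed too: the loop condition simply became false")
```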
Anyway, I am happy to review a follow-up pull that switches both sync_ helpers to the while-else syntax, but I'd rather not do it here.
Aesthetic disagreements aside, this is a great improvement. utACK faac5a8
```python
self.nodes = []
self.network_thread = None
self.rpc_timeout = 60  # Wait for up to 60 seconds for the RPC server to respond
self.use_rpc_sync = False
```
Curious: since the improvement is mostly gained by bypassing the Poisson send delay, why not disable it via a parameter, e.g. -enablepoisson=0, which would only be allowed on regtest? In that case, nodes behave closer to a real scenario, we get cleaner logs as txs are exchanged via P2P only (no race condition), it would work even if sync_mempools hasn't been called, and there would be no need to maintain a custom tx exchange function.
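For context, the delay being discussed comes from drawing the next tx-relay time from an exponential distribution, so that send events form a Poisson process. A rough Python sketch of the idea, with an illustrative average interval rather than bitcoind's actual constants:

```python
import random

AVG_TX_RELAY_INTERVAL = 5.0  # seconds; illustrative, not the real constant


def poisson_next_send(now, avg_interval=AVG_TX_RELAY_INTERVAL):
    # Exponentially distributed gaps make the send times a Poisson
    # process, which hides exact tx arrival times from peers.
    return now + random.expovariate(1.0 / avg_interval)
```

On regtest this privacy property is irrelevant, hence the suggestion to disable the delay there.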
@kostyantyn Good point! Done in net: Send txs without delay on regtest (#15881).
Going back to this approach, since modifying bitcoind is seen as too controversial.
Unless there are objections, this will be merged on Monday.
Could it make sense to skip this optimization in one of the Travis jobs?
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
Functional tests on master at 8da1aa4 with Linux Debian 4.19.28-2 (2019-03-15) x86/64:

ALL | ✓ Passed | 1824 s (accumulated)
ALL | ✓ Passed | 1867 s (accumulated)
ALL | ✓ Passed | 1856 s (accumulated)
ALL | ✓ Passed | 1774 s (accumulated)

Functional tests with this PR at faac5a8 rebased on master at 8da1aa4:

ALL | ✓ Passed | 1372 s (accumulated)
ALL | ✓ Passed | 1388 s (accumulated)
ALL | ✓ Passed | 1472 s (accumulated)
ALL | ✓ Passed | 1449 s (accumulated)
I prefer this approach as well. I think the only use for this, ever, is the tests, so I'd prefer not to change bitcoind.
Most of the tests spend time waiting for the Poisson tx send to time out. However, tests that don't test p2p behaviour don't need to.
Speed them up by syncing transactions over rpc.
Overall, the tests run three times as fast for me now:
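A usage sketch for a test that opts in, using the names introduced by this PR and assuming the framework forwards the flag to sync_mempools:

```python
from test_framework.test_framework import BitcoinTestFramework
from test_framework.util import sync_mempools


class ExampleTest(BitcoinTestFramework):
    def set_test_params(self):
        self.num_nodes = 2
        # This test doesn't exercise p2p tx relay, so sync over RPC.
        self.use_rpc_sync = True

    def run_test(self):
        txid = self.nodes[0].sendtoaddress(self.nodes[1].getnewaddress(), 1)
        # Pushes the missing tx to node 1 in topological order instead of
        # waiting out the Poisson relay delay.
        sync_mempools(self.nodes, use_rpc_sync=self.use_rpc_sync)
        assert txid in self.nodes[1].getrawmempool()
```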