-
Notifications
You must be signed in to change notification settings - Fork 38.7k
[tests] Fix feature_block flakiness #13048
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
feature_block.py occasionally fails on Travis. I believe this is due to a a race condition when reconnecting to bitcoind after a subtest that expects disconnection. If the test runs ahead and sends the INV for the subsequent test before we've received the initial sync getheaders, then we may end up sending two headers messages - one as a response to the initial sync getheaders and one in response to the INV getheaders. If both of those headers fail validation with a DoS score of 50 or higher, then we'll unexpectedly be disconnected. There is only one validation failure that has a DoS score bewteen 50 and 100, which is high-hash. That's why the test is failing immediately after the "Reject a block with invalid work" subtest. Fix is to wait for the initial getheaders from the peer before we start populating our blockstore. That way we won't have any invalid headers to respond to it with.
|
This test bug would disappear with #11639, since that changes the DoS disconnect/ban behaviour. However, this change is probably still useful to ensure that there's no cross-contamination between subtests. |
|
Longer term, can we just have proper unit tests for validation please 😁 |
ryanofsky
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No - this is specific to invalid block headers (and more specifically - only to block headers that fail PoW validation). |
|
utACK c1d7420 |
c1d7420 [tests] Fix feature_block flakiness (John Newbery) Pull request description: feature_block.py occasionally fails on Travis. I believe this is due to a a race condition when reconnecting to bitcoind after a subtest that expects disconnection. If the test runs ahead and sends the INV for the subsequent test before we've received the initial sync getheaders, then we may end up sending two headers messages - one as a response to the initial sync getheaders and one in response to the INV getheaders. If both of those headers fail validation with a DoS score of 50 or higher, then we'll unexpectedly be disconnected. There is only one validation failure that has a DoS score bewteen 50 and 100, which is high-hash. That's why the test is failing immediately after the "Reject a block with invalid work" subtest. Fix is to wait for the initial getheaders from the peer before we start populating our blockstore. That way we won't have any invalid headers to respond to it with. Tree-SHA512: dc17d795fcfaf0f8c0bf1e9732b5e11fbc8febbfafba4c231b7c13a5404a2c297dcd703a7a75bc7f353c893e12efc87f424f2201abd47ba5268af32d4d2e841f
c1d7420 [tests] Fix feature_block flakiness (John Newbery) Pull request description: feature_block.py occasionally fails on Travis. I believe this is due to a a race condition when reconnecting to bitcoind after a subtest that expects disconnection. If the test runs ahead and sends the INV for the subsequent test before we've received the initial sync getheaders, then we may end up sending two headers messages - one as a response to the initial sync getheaders and one in response to the INV getheaders. If both of those headers fail validation with a DoS score of 50 or higher, then we'll unexpectedly be disconnected. There is only one validation failure that has a DoS score bewteen 50 and 100, which is high-hash. That's why the test is failing immediately after the "Reject a block with invalid work" subtest. Fix is to wait for the initial getheaders from the peer before we start populating our blockstore. That way we won't have any invalid headers to respond to it with. Tree-SHA512: dc17d795fcfaf0f8c0bf1e9732b5e11fbc8febbfafba4c231b7c13a5404a2c297dcd703a7a75bc7f353c893e12efc87f424f2201abd47ba5268af32d4d2e841f
…nTestFramework (#3277) * [tests] Change feature_block.py to use BitcoinTestFramework * [tests] Fix flake8 warnings in feature_block.py * [tests] Tidy up feature_block.py - move all helper methods to the end - remove block, create_tx and create_and_sign_tx shortcuts - remove --runbarelyexpensive option, since it defaults to True and it's unlikely that anyone ever runs the test with this option set to false. * [tests] Add logging to feature_block.py * [tests] Improve assert message when wait_until() fails * Merge bitcoin#13048: [tests] Fix feature_block flakiness c1d7420 [tests] Fix feature_block flakiness (John Newbery) Pull request description: feature_block.py occasionally fails on Travis. I believe this is due to a a race condition when reconnecting to bitcoind after a subtest that expects disconnection. If the test runs ahead and sends the INV for the subsequent test before we've received the initial sync getheaders, then we may end up sending two headers messages - one as a response to the initial sync getheaders and one in response to the INV getheaders. If both of those headers fail validation with a DoS score of 50 or higher, then we'll unexpectedly be disconnected. There is only one validation failure that has a DoS score bewteen 50 and 100, which is high-hash. That's why the test is failing immediately after the "Reject a block with invalid work" subtest. Fix is to wait for the initial getheaders from the peer before we start populating our blockstore. That way we won't have any invalid headers to respond to it with. Tree-SHA512: dc17d795fcfaf0f8c0bf1e9732b5e11fbc8febbfafba4c231b7c13a5404a2c297dcd703a7a75bc7f353c893e12efc87f424f2201abd47ba5268af32d4d2e841f * Temporarely rename MAX_BLOCK_SIZE -> MAX_BLOCK_BASE_SIZE We'll undo this after the next commit. This avoids merge many conflicts and makes reviewing easier. * Rename MAX_BLOCK_BASE_SIZE back to MAX_BLOCK_SIZE * Use DoS score of 100 for bad-blk-sigops This was accidently changed to 10 while backporting bitcoin#7287 and causes test failures in p2p-fullblocktest.py * Use allowOptimisticSend=true when sending reject messages This fixes test failures in p2p-fullblocktest.py which expects reject messages to be sent/received before connections get closed. * Fix p2p-fullblocktest.py - CBlock and friends are still in test_framework.mininode - "-whitelist" causes connections to not be dropped, which in turn causes sync_blocks with reconnect=True to fail - "bad-cb-amount" does not cause a ban in Dash, so reconnect must be False - Dash already bans when a header is received which is a child of an invalid header, causing block requests to never happen * Backport missing changes from bitcoin#13003 bitcoin#13003 was backported out of order which causes missed changes. * Bump p2p-fullblocktest timeouts * Increase RPC timeout in p2p-fullblocktest.py Co-authored-by: John Newbery <[email protected]> Co-authored-by: MarcoFalke <[email protected]>
c1d7420 [tests] Fix feature_block flakiness (John Newbery) Pull request description: feature_block.py occasionally fails on Travis. I believe this is due to a a race condition when reconnecting to bitcoind after a subtest that expects disconnection. If the test runs ahead and sends the INV for the subsequent test before we've received the initial sync getheaders, then we may end up sending two headers messages - one as a response to the initial sync getheaders and one in response to the INV getheaders. If both of those headers fail validation with a DoS score of 50 or higher, then we'll unexpectedly be disconnected. There is only one validation failure that has a DoS score bewteen 50 and 100, which is high-hash. That's why the test is failing immediately after the "Reject a block with invalid work" subtest. Fix is to wait for the initial getheaders from the peer before we start populating our blockstore. That way we won't have any invalid headers to respond to it with. Tree-SHA512: dc17d795fcfaf0f8c0bf1e9732b5e11fbc8febbfafba4c231b7c13a5404a2c297dcd703a7a75bc7f353c893e12efc87f424f2201abd47ba5268af32d4d2e841f
c1d7420 [tests] Fix feature_block flakiness (John Newbery) Pull request description: feature_block.py occasionally fails on Travis. I believe this is due to a a race condition when reconnecting to bitcoind after a subtest that expects disconnection. If the test runs ahead and sends the INV for the subsequent test before we've received the initial sync getheaders, then we may end up sending two headers messages - one as a response to the initial sync getheaders and one in response to the INV getheaders. If both of those headers fail validation with a DoS score of 50 or higher, then we'll unexpectedly be disconnected. There is only one validation failure that has a DoS score bewteen 50 and 100, which is high-hash. That's why the test is failing immediately after the "Reject a block with invalid work" subtest. Fix is to wait for the initial getheaders from the peer before we start populating our blockstore. That way we won't have any invalid headers to respond to it with. Tree-SHA512: dc17d795fcfaf0f8c0bf1e9732b5e11fbc8febbfafba4c231b7c13a5404a2c297dcd703a7a75bc7f353c893e12efc87f424f2201abd47ba5268af32d4d2e841f
c1d7420 [tests] Fix feature_block flakiness (John Newbery) Pull request description: feature_block.py occasionally fails on Travis. I believe this is due to a a race condition when reconnecting to bitcoind after a subtest that expects disconnection. If the test runs ahead and sends the INV for the subsequent test before we've received the initial sync getheaders, then we may end up sending two headers messages - one as a response to the initial sync getheaders and one in response to the INV getheaders. If both of those headers fail validation with a DoS score of 50 or higher, then we'll unexpectedly be disconnected. There is only one validation failure that has a DoS score bewteen 50 and 100, which is high-hash. That's why the test is failing immediately after the "Reject a block with invalid work" subtest. Fix is to wait for the initial getheaders from the peer before we start populating our blockstore. That way we won't have any invalid headers to respond to it with. Tree-SHA512: dc17d795fcfaf0f8c0bf1e9732b5e11fbc8febbfafba4c231b7c13a5404a2c297dcd703a7a75bc7f353c893e12efc87f424f2201abd47ba5268af32d4d2e841f
…nTestFramework (dashpay#3277) * [tests] Change feature_block.py to use BitcoinTestFramework * [tests] Fix flake8 warnings in feature_block.py * [tests] Tidy up feature_block.py - move all helper methods to the end - remove block, create_tx and create_and_sign_tx shortcuts - remove --runbarelyexpensive option, since it defaults to True and it's unlikely that anyone ever runs the test with this option set to false. * [tests] Add logging to feature_block.py * [tests] Improve assert message when wait_until() fails * Merge bitcoin#13048: [tests] Fix feature_block flakiness c1d7420 [tests] Fix feature_block flakiness (John Newbery) Pull request description: feature_block.py occasionally fails on Travis. I believe this is due to a a race condition when reconnecting to bitcoind after a subtest that expects disconnection. If the test runs ahead and sends the INV for the subsequent test before we've received the initial sync getheaders, then we may end up sending two headers messages - one as a response to the initial sync getheaders and one in response to the INV getheaders. If both of those headers fail validation with a DoS score of 50 or higher, then we'll unexpectedly be disconnected. There is only one validation failure that has a DoS score bewteen 50 and 100, which is high-hash. That's why the test is failing immediately after the "Reject a block with invalid work" subtest. Fix is to wait for the initial getheaders from the peer before we start populating our blockstore. That way we won't have any invalid headers to respond to it with. Tree-SHA512: dc17d795fcfaf0f8c0bf1e9732b5e11fbc8febbfafba4c231b7c13a5404a2c297dcd703a7a75bc7f353c893e12efc87f424f2201abd47ba5268af32d4d2e841f * Temporarely rename MAX_BLOCK_SIZE -> MAX_BLOCK_BASE_SIZE We'll undo this after the next commit. This avoids merge many conflicts and makes reviewing easier. * Rename MAX_BLOCK_BASE_SIZE back to MAX_BLOCK_SIZE * Use DoS score of 100 for bad-blk-sigops This was accidently changed to 10 while backporting bitcoin#7287 and causes test failures in p2p-fullblocktest.py * Use allowOptimisticSend=true when sending reject messages This fixes test failures in p2p-fullblocktest.py which expects reject messages to be sent/received before connections get closed. * Fix p2p-fullblocktest.py - CBlock and friends are still in test_framework.mininode - "-whitelist" causes connections to not be dropped, which in turn causes sync_blocks with reconnect=True to fail - "bad-cb-amount" does not cause a ban in Dash, so reconnect must be False - Dash already bans when a header is received which is a child of an invalid header, causing block requests to never happen * Backport missing changes from bitcoin#13003 bitcoin#13003 was backported out of order which causes missed changes. * Bump p2p-fullblocktest timeouts * Increase RPC timeout in p2p-fullblocktest.py Co-authored-by: John Newbery <[email protected]> Co-authored-by: MarcoFalke <[email protected]>
feature_block.py occasionally fails on Travis. I believe this is due to
a a race condition when reconnecting to bitcoind after a subtest that
expects disconnection. If the test runs ahead and sends the INV for the
subsequent test before we've received the initial sync getheaders, then
we may end up sending two headers messages - one as a response to the
initial sync getheaders and one in response to the INV getheaders. If
both of those headers fail validation with a DoS score of 50 or higher,
then we'll unexpectedly be disconnected.
There is only one validation failure that has a DoS score bewteen 50 and
100, which is high-hash. That's why the test is failing immediately
after the "Reject a block with invalid work" subtest.
Fix is to wait for the initial getheaders from the peer before we
start populating our blockstore. That way we won't have any invalid
headers to respond to it with.