Skip to content

Conversation

@davereikher
Copy link
Contributor

This is an improvement to a test, inspired by #14343 - removing non determinism from a test.

The test descriptor_test is non-deterministic, as it relies on the MaybeUseHInsteadOfApostrophy function which randomly either swaps all apostrophes with 'h' or doesn't at all in a descriptor. This fix makes both cases always run, if an apostrophe is found in a test descriptor.
This does not reduce test coverage but removes the non-determinism.

Additionally, the MaybeUseHInsteadOfApostrophy function removed the checksum if found at the end of a descriptor when the apostrophes are swapped by 'h's, since after being swapped the checksum is no longer correct. I instead added re-calculation of the checksum using the DescriptorChecksum function, which adds coverage for the case of a descriptors having 'h's instead of apostrophes and a checksum. This was previously lacking.
To achieve this I had to move DescriptorChecksum and PolyMod out of the anonymous namespace in descriptor.cpp to make DescriptorChecksum accessible in descriptor_tests.cpp.

All tests complete successfully (functional as well as unit tests).

@davereikher
Copy link
Contributor Author

Woops, lint failed..
Is there an easy way to reproduce a Travis job locally? I'm a newb to Travis. I tried following https://stackoverflow.com/questions/21053657/how-to-run-travis-ci-locally/49019950 but I don't understand the instance field that is displayed in the Travis build logs (there is more than one instance there?)

@fanquake
Copy link
Member

fanquake commented Aug 8, 2019

@davereikher The linter is just complaining about trailing whitespace:

This diff appears to have added new lines with trailing whitespace.

The following changes were suspected:

diff --git a/src/test/descriptor_tests.cpp b/src/test/descriptor_tests.cpp

@@ -55,0 +63 @@ std::string MaybeUseHInsteadOfApostrophy(std::string ret)

+

@@ -61 +69,2 @@ const std::set<std::vector<uint32_t>> ONLY_EMPTY{{}};

+void DoCheck(const std::string& prv, const std::string& pub, int flags, const std::vector<std::vector<std::string>>& scripts, const std::set<std::vector<uint32_t>>& paths = ONLY_EMPTY,

^---- failure generated from test/lint/lint-whitespace.sh

You should be able to run the test/lint/lint-whitespace.sh script (as well as the other linters) locally.

@DrahtBot
Copy link
Contributor

DrahtBot commented Aug 8, 2019

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@davereikher
Copy link
Contributor Author

The test/lint/lint-whitespace.sh script finishes without errors locally. I would like to reproduce the Travis environment to avoid fixing and pushing multiple times. If not, I can just remove the trailing whitespace and try again.

Copy link
Member

@achow101 achow101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK. I've left a few comments.

I think it makes sense to test the parsing of both the ' and h key origin forms for all descriptors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment shouldn't be moved.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand. It's not quite clear in the diff, but the // Checksum comment was not moved. It remains after namespace {. The only comment that was moved is the one above PolyMod, describing the operation of that function, along with the function.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words: The function does not need to move, only the namespace {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed in commit 6819fbd741891081b2679e4b763bb88dc997ffe7. Is there a way to re-run AppVeyor? Looks like it timed out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, done in commit 0fc46f5b9262f016c42d2abe68d5125833f55416.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary whitespace change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed in commit d3e2d08a3be92cecb50c89c69871e61e0618e10b.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would make more sense to largely leave this function untouched and instead do the ' to h change in your new Check function below. The only change here would be to remove MaybeUseHInsteadOfApostrophy

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree and I did it that way initially but then, for example, this check failed:

std::string pub1 = parse_priv->ToString();
BOOST_CHECK(EqualDescriptor(pub, pub1));

since Descriptor::ToString returns a descriptor formatted with ', while pub has h, if I do it the way you suggest. I had to work around this by passing pub and prv without modification and introducing a boolean argument of whether to swap ' with h only when passing them to the Parse function.

@davereikher davereikher force-pushed the make_descriptor_tests_deterministic branch from 6819fbd to 0fc46f5 Compare August 9, 2019 14:24
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually backwards. When replace_apostrophe_with_h is false, we're swapping. I think it's supposed to be the other way around.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, you're absolutely right. Fixed in a8f49b7d53d33aeba8bd0965c21878795b36f99b.

Copy link
Contributor

@meshcollider meshcollider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

Copy link
Contributor

@promag promag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

PolyMod linkage changed, make it static?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, static function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's in an anonymous namespace, or am I missing something?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, sorry.

Copy link
Contributor

@promag promag Aug 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But make it std::string UseHInsteadOfApostrophe(const std::string& ret). Also could avoid ret argument, sounds like it's an output/return argument. (rename is unrelated here).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that this line would fail should ret be made a const reference:
https://github.com/bitcoin/bitcoin/blob/0b90954416767f6213094f1f282e708c3e8a710b/src/test/descriptor_tests.cpp#L50
I guess I could change the signature to what you suggest and then assign ret to a local variable. Then there would still be a single copy operation so it's not less efficient or anything, but that would introduce an extra line of code. Do you think I should do it?
I agree that ret is not an indicative name. I think desc is better?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it would be fine to declare in src/script/descriptor.h

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in commit 07c079a2fb49dca58a1d7e5e185a1cbc430381df.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, if (it == std::string::npos) break;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in commit 07c079a2fb49dca58a1d7e5e185a1cbc430381df.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment like /* replace_apostrophe_with_h = */ true instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in commit 07c079a.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use std::string::find(char, ...) like find('\'').

nit, { here.

nit, remove unnecessary ( ).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit } else {

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in commit 07c079a2fb49dca58a1d7e5e185a1cbc430381df.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit, { here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done in commit 07c079a2fb49dca58a1d7e5e185a1cbc430381df.

@davereikher
Copy link
Contributor Author

davereikher commented Aug 10, 2019

Concept ACK.

PolyMod linkage changed, make it static?

Thanks! Done in commit 07c079a.

@davereikher davereikher force-pushed the make_descriptor_tests_deterministic branch from 07c079a to 0b90954 Compare August 13, 2019 07:35
@davereikher
Copy link
Contributor Author

Squashed commits.

@meshcollider
Copy link
Contributor

I believe this will be a lot simpler change now if you use GetDescriptorChecksum from #15986

Concept ACK anyhow

@davereikher davereikher force-pushed the make_descriptor_tests_deterministic branch from 0b90954 to 762dff3 Compare August 18, 2019 17:03
@davereikher davereikher force-pushed the make_descriptor_tests_deterministic branch from 762dff3 to d0e54ff Compare August 19, 2019 04:39
@davereikher
Copy link
Contributor Author

I believe this will be a lot simpler change now if you use GetDescriptorChecksum from #15986

Concept ACK anyhow

Great! Now that GetDescriptorChecksum is available I moved DescriptorChecksum and PolyMod back into the anonymous namespace and instead used GetDescriptorChecksum to calculate the checksum. This also introduces an indirect test to GetDescriptorChecksum and the underlying functions.
My branch seems to have failed an unrelated test on Travis while passing all tests (unit and functional, including the test that fails on Travis) locally. Is sporadic failure a known issue perhaps? Is there a way to re-run Travis?

@fanquake
Copy link
Member

@davereikher Travis tests can occasionally fail sporadically. I've rebooted the two failing instances for you.

@fanquake fanquake changed the title Make descriptor tests deterministic tests: Make descriptor tests deterministic Aug 19, 2019
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: indentation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is actually a test case which used to be covered which is no longer ever covered - the use of h in the pub and ' in the prv and vice versa (because there was a 50% chance of each being changed independently)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your review. You are right, I completely missed this. I added code to cover all four test cases (no replacement, prv only, pub only and prv+pub) in b9ee63c.

@davereikher davereikher force-pushed the make_descriptor_tests_deterministic branch from d0e54ff to 35b07e9 Compare August 21, 2019 05:08
Changed MaybeUseHInsteadOfApostrophy to UseHInsteadOfApostrophe.
This function now always replaces apostrophes with 'h'.
The original Check function was renamed to DoCheck and it's
called with an additional parameter which tells it to either
leave the prv and pub arguments as is or replace the apostrophes
with 'h'. The test runs with apostrophes replaced in prv only,
pub only, prv and pub and without replacement at all. Replacement
of apostrophes in a descriptor and then running DoCheck is conditional
on whether apostrophes are found in that descriptor.

Additionally, instead of dropping the checksum recalculate it
after replacing apostrophes with 'h' in the function UseHInsteadOfApostrophe
using the GetDescriptorChecksum function. That way, this also
introduces an indirect unit test to GetDescriptoChecksum.
@davereikher davereikher force-pushed the make_descriptor_tests_deterministic branch from 35b07e9 to b9ee63c Compare August 21, 2019 05:23
@davereikher
Copy link
Contributor Author

@davereikher Travis tests can occasionally fail sporadically. I've rebooted the two failing instances for you.

@fanquake Travis failed again, could you please reboot the failing instance here?

Copy link
Contributor

@meshcollider meshcollider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@fanquake fanquake requested a review from achow101 August 21, 2019 23:52
Copy link
Member

@achow101 achow101 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review ACK b9ee63c

@fanquake
Copy link
Member

@davereikher Thanks for sticking with this, nice first time contribution.

fanquake added a commit that referenced this pull request Aug 23, 2019
b9ee63c Make descriptor test deterministic (David Reikher)

Pull request description:

  This is an improvement to a test, inspired by #14343 - removing non determinism from a test.

  The test `descriptor_test` is non-deterministic, as it relies on the `MaybeUseHInsteadOfApostrophy` function which randomly either swaps all apostrophes with 'h' or doesn't at all in a descriptor. This fix makes both cases always run, if an apostrophe is found in a test descriptor.
  This does not reduce test coverage but removes the non-determinism.

  Additionally, the `MaybeUseHInsteadOfApostrophy` function removed the checksum if found at the end of a descriptor when the apostrophes are swapped by 'h's, since after being swapped the checksum is no longer correct. I instead added re-calculation of the checksum using the `DescriptorChecksum` function, which adds coverage for the case of a descriptors having 'h's instead of apostrophes and a checksum. This was previously lacking.
  To achieve this I had to move `DescriptorChecksum` and `PolyMod` out of the anonymous namespace in descriptor.cpp to make `DescriptorChecksum` accessible in descriptor_tests.cpp.

  All tests complete successfully (functional as well as unit tests).

ACKs for top commit:
  achow101:
    Code Review ACK b9ee63c

Tree-SHA512: 992c73a6644a07bfe7c72301ee2666f3c4845a012aaedd7a099a05cea8bdac84fa8280b28e44a7856260c00c0be1a6f1b6768f5694c2a22edf4c489e53fec424
@fanquake fanquake merged commit b9ee63c into bitcoin:master Aug 23, 2019
@davereikher
Copy link
Contributor Author

Thanks a lot! I learned so much from just this small contribution and your reviews. Looking forward for more! :)

@practicalswift
Copy link
Contributor

@davereikher Very nice to have you onboard and thanks for this first contribution! Hope to see more contributions from you.

If you want to take on non-determinism in tests you might want to take a look at contrib/devtools/test_deterministic_coverage.sh.

The following unit tests are known to have non-deterministic line coverage:

NON_DETERMINISTIC_TESTS=(
"blockfilter_index_tests/blockfilter_index_initial_sync" # src/checkqueue.h: In CCheckQueue::Loop(): while (queue.empty()) { ... }
"coinselector_tests/knapsack_solver_test" # coinselector_tests.cpp: if (equal_sets(setCoinsRet, setCoinsRet2))
"denialofservice_tests/DoS_mapOrphans" # denialofservice_tests.cpp: it = mapOrphanTransactions.lower_bound(InsecureRand256());
"fs_tests/fsbridge_fstream" # deterministic test failure?
"miner_tests/CreateNewBlock_validity" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"scheduler_tests/manythreads" # scheduler.cpp: CScheduler::serviceQueue()
"scheduler_tests/singlethreadedscheduler_ordered" # scheduler.cpp: CScheduler::serviceQueue()
"tx_validationcache_tests/checkinputs_test" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"tx_validationcache_tests/tx_mempool_block_doublespend" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"txindex_tests/txindex_initial_sync" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"txvalidation_tests/tx_mempool_reject_coinbase" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"validation_block_tests/processnewblock_signals_ordering" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/coin_mark_dirty_immature_credit" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/dummy_input_size_test" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/importmulti_rescan" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/importwallet_rescan" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/ListCoins" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/scan_for_wallet_transactions" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/wallet_disableprivkeys" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
)

Getting them deterministic in terms of line coverage would be great! More specifically that would allow us to reason about how a proposed change alters unit tests line coverage. Currently line coverage is a stochastic process :-)

@davereikher
Copy link
Contributor Author

@davereikher Very nice to have you onboard and thanks for this first contribution! Hope to see more contributions from you.

If you want to take on non-determinism in tests you might want to take a look at contrib/devtools/test_deterministic_coverage.sh.

The following unit tests are known to have non-deterministic line coverage:

NON_DETERMINISTIC_TESTS=(
"blockfilter_index_tests/blockfilter_index_initial_sync" # src/checkqueue.h: In CCheckQueue::Loop(): while (queue.empty()) { ... }
"coinselector_tests/knapsack_solver_test" # coinselector_tests.cpp: if (equal_sets(setCoinsRet, setCoinsRet2))
"denialofservice_tests/DoS_mapOrphans" # denialofservice_tests.cpp: it = mapOrphanTransactions.lower_bound(InsecureRand256());
"fs_tests/fsbridge_fstream" # deterministic test failure?
"miner_tests/CreateNewBlock_validity" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"scheduler_tests/manythreads" # scheduler.cpp: CScheduler::serviceQueue()
"scheduler_tests/singlethreadedscheduler_ordered" # scheduler.cpp: CScheduler::serviceQueue()
"tx_validationcache_tests/checkinputs_test" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"tx_validationcache_tests/tx_mempool_block_doublespend" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"txindex_tests/txindex_initial_sync" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"txvalidation_tests/tx_mempool_reject_coinbase" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"validation_block_tests/processnewblock_signals_ordering" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/coin_mark_dirty_immature_credit" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/dummy_input_size_test" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/importmulti_rescan" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/importwallet_rescan" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/ListCoins" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/scan_for_wallet_transactions" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
"wallet_tests/wallet_disableprivkeys" # validation.cpp: if (GetMainSignals().CallbacksPending() > 10)
)

Getting them deterministic in terms of line coverage would be great! More specifically that would allow us to reason about how a proposed change alters unit tests line coverage. Currently line coverage is a stochastic process :-)

Thanks! Sounds great, I'll look into the non-determinism of those tests and I guess I will start with the low hanging fruit first :)

jasonbcox pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Aug 18, 2020
Summary:
```
Changed MaybeUseHInsteadOfApostrophy to UseHInsteadOfApostrophe.
This function now always replaces apostrophes with 'h'.
The original Check function was renamed to DoCheck and it's
called with an additional parameter which tells it to either
leave the prv and pub arguments as is or replace the apostrophes
with 'h'. The test runs with apostrophes replaced in prv only,
pub only, prv and pub and without replacement at all. Replacement
of apostrophes in a descriptor and then running DoCheck is conditional
on whether apostrophes are found in that descriptor.

Additionally, instead of dropping the checksum recalculate it
after replacing apostrophes with 'h' in the function
UseHInsteadOfApostrophe
using the GetDescriptorChecksum function. That way, this also
introduces an indirect unit test to GetDescriptoChecksum.
```

Backport of core [[bitcoin/bitcoin#16570 | PR16570]].

Test Plan:
  ninja check

Reviewers: #bitcoin_abc, deadalnix

Reviewed By: #bitcoin_abc, deadalnix

Differential Revision: https://reviews.bitcoinabc.org/D7208
kwvg added a commit to kwvg/dash that referenced this pull request Dec 5, 2021
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Dec 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants