[test] Apply maximal flags to tx_valid tests and minimal flags to tx_invalid tests #15045

jl2012 · 2018-12-27T19:29:06Z

The first commit inverts the meaning of verifyFlags for tx_valid tests: the listed flags are now the ones being excluded. All flags are applied by default, except those found in verifyFlags. This makes sure that a new or existing flag won't invalidate a tx by accident.
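As a rough illustration of the inverted semantics, here is a minimal sketch. The flag names, bit values, and the helpers `ParseFlagList`, `FlagsForValidTest`, and `FlagsForInvalidTest` are hypothetical stand-ins, not the actual code in transaction_tests.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Toy flag bits for illustration only; these are NOT Bitcoin Core's real values.
enum : uint32_t {
    FLAG_NONE = 0,
    FLAG_P2SH = 1U << 0,
    FLAG_DERSIG = 1U << 1,
    FLAG_CLEANSTACK = 1U << 2,
    FLAG_WITNESS = 1U << 3,
    FLAG_ALL = (1U << 4) - 1,
};

// Parse a comma-separated flag list such as "P2SH,WITNESS" into a bit mask.
inline uint32_t ParseFlagList(const std::string& list)
{
    static const std::map<std::string, uint32_t> names{
        {"NONE", FLAG_NONE}, {"P2SH", FLAG_P2SH}, {"DERSIG", FLAG_DERSIG},
        {"CLEANSTACK", FLAG_CLEANSTACK}, {"WITNESS", FLAG_WITNESS}};
    uint32_t mask = 0;
    std::istringstream in(list);
    std::string word;
    while (std::getline(in, word, ',')) {
        auto it = names.find(word);
        assert(it != names.end()); // an unknown flag name is a test-data error
        mask |= it->second;
    }
    return mask;
}

// tx_valid: verifyFlags lists the flags to EXCLUDE; everything else is applied.
inline uint32_t FlagsForValidTest(const std::string& verify_flags)
{
    return FLAG_ALL & ~ParseFlagList(verify_flags);
}

// tx_invalid: verifyFlags lists the only flags that are applied.
inline uint32_t FlagsForInvalidTest(const std::string& verify_flags)
{
    return ParseFlagList(verify_flags);
}
```

Under this sketch, a tx_valid entry with `"NONE"` runs with every flag set, so a newly added flag is exercised automatically unless a test opts out of it by name.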

The second commit reduces the validation flags used for tx_invalid tests to the minimal set required to fail each test. This makes sure that a tx fails because of the tested flags, not because of unexpected effects of some other flags. It also uses "BADTX" to mark tests that do not pass CheckTransaction(), as opposed to those that fail script execution.

(If a test is expected to fail due to multiple independent flags, multiple tests should be used.)

The third commit verifies that the flags excluded in tx_valid and included in tx_invalid are indeed the minimal set. In tx_valid, it adds back each excluded flag individually and expects the test to fail. In tx_invalid, it removes each included flag individually and expects the test to pass.
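The minimality check described above could be sketched as follows. This is only an assumption-laden outline: `Verifier` is a hypothetical callback standing in for "run the transaction's scripts under these flags", the helper names are invented here, and interdependencies between flags are ignored.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical callback: returns true if the tx passes under the given flags.
using Verifier = std::function<bool(uint32_t flags)>;

// tx_valid: the excluded set is minimal if the tx passes with it excluded,
// and re-adding any single excluded flag makes the tx fail.
inline bool ExcludedFlagsAreMinimal(const Verifier& passes, uint32_t all, uint32_t excluded)
{
    assert((excluded & ~all) == 0); // excluded flags must be known flags
    const uint32_t applied = all & ~excluded;
    if (!passes(applied)) return false;
    for (uint32_t bit = 1; bit != 0; bit <<= 1) {
        if ((excluded & bit) && passes(applied | bit)) return false;
    }
    return true;
}

// tx_invalid: the included set is minimal if the tx fails with it,
// and removing any single included flag makes the tx pass.
inline bool IncludedFlagsAreMinimal(const Verifier& passes, uint32_t included)
{
    if (passes(included)) return false;
    for (uint32_t bit = 1; bit != 0; bit <<= 1) {
        if ((included & bit) && !passes(included & ~bit)) return false;
    }
    return true;
}
```

A test that lists an unneeded flag is reported as non-minimal, which is how this kind of check can surface test data that passes or fails for an unintended reason.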

This process helped me to identify and fix some buggy tests:

- Removed the unnecessary OP_1 at the end of most OP_CLTV and OP_CSV tx_valid tests, so there is no need to exclude CLEANSTACK.
- An OP_CSV tx_valid test was missing an OP_ADD, which has been added back.
- 2 witness tests were found with an empty vout, so they failed due to CheckTransaction(), not script tests. Corrected by filling in a proper vout.

jl2012 · 2018-12-27T19:35:32Z

src/test/data/tx_invalid.json

I think there is something wrong with this test, but I am not sure about its intention

jl2012 · 2018-12-27T19:36:25Z

src/test/data/tx_invalid.json

Another problematic test with unclear intention

practicalswift · 2018-12-28T10:42:04Z

src/test/transaction_tests.cpp

Should the implicit conversion from negative to unsigned be made explicit here?

jl2012 · 2018-12-28

Done. Also fixed a bug: I had swapped SCRIPT_VERIFY_CLEANSTACK and SCRIPT_VERIFY_WITNESS here.

DrahtBot · 2018-12-28T11:56:07Z

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

- #16881 (consensus: Improve CScriptNum error reporting by fanquake)
- #14696 (qa: Add explicit references to related CVE's in p2p_invalid_block test. by lucash-dev)
- #13360 ([Policy] Reject SIGHASH_SINGLE with output out of bound by jl2012)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

jl2012 · 2018-12-28T19:53:11Z

Added a commit to verify all validation flags are backward compatible (softfork), like #10699
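The backward-compatibility (softfork) property says that adding a flag may only shrink the set of accepted scripts. A minimal sketch of such a check is shown below, assuming a hypothetical `Verifier` callback and a small toy flag space that can be searched exhaustively. The actual commit may be structured differently; the follow-up PR description elsewhere in this thread mentions adding and removing random flags instead.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

// Hypothetical callback: returns true if the script passes under the given flags.
using Verifier = std::function<bool(uint32_t flags)>;

// Returns true if, for every flag combination that accepts, removing any single
// flag still accepts. By induction this covers every subset of an accepting set.
inline bool IsSoftForkCompatible(const Verifier& passes, uint32_t all)
{
    assert(all < (1U << 16)); // exhaustive search; keep the toy flag space small
    // Enumerate every submask m of `all`, from `all` down to 0.
    for (uint32_t m = all;; m = (m - 1) & all) {
        if (passes(m)) {
            for (uint32_t bit = 1; bit != 0 && bit <= m; bit <<= 1) {
                if ((m & bit) && !passes(m & ~bit)) return false;
            }
        }
        if (m == 0) break;
    }
    return true;
}
```

A flag that makes a previously invalid script valid would violate this property, and the sketch reports it by returning false.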

jl2012 · 2018-12-30T19:23:10Z

Added a commit to do the same in script_tests

laanwj · 2019-01-02T14:29:21Z

Concept ACK

- For invalid script tests:
  - reduce the number of validation flags used, to a minimally required set to fail a test
  - verify that adding flags would not validate an invalid script
- For valid script tests:
  - apply all flags except those explicitly excluded
  - verify that removing flags would not invalidate a valid script

jl2012 · 2019-01-16T17:12:08Z

src/test/script_tests.cpp

```diff
-    int libconsensus_flags = flags & bitcoinconsensus_SCRIPT_FLAGS_VERIFY_ALL;
-    if (libconsensus_flags == flags) {
+    int libconsensus_flags = main_flags & bitcoinconsensus_SCRIPT_FLAGS_VERIFY_ALL;
+    if (expect || libconsensus_flags == (int)main_flags) {
```

Rebased and fixed a bug here. Without `expect ||`, all valid script tests were skipped due to the use of standardness flags. If a script is valid with standardness flags, removing these flags shouldn't make it invalid.

This also fixes an existing bug where many tests containing standardness flags (mostly STRICTENC) were unnecessarily skipped.
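The reasoning behind the `expect ||` condition can be sketched in isolation. The helper below is hypothetical (it is not part of script_tests.cpp or libconsensus) and uses `std::optional`, so it assumes C++17. It decides whether a verifier that supports only a subset of flags can be used for a given test.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Returns the flags to hand to the restricted verifier, or nullopt to skip it.
// - Expected-valid test: dropping unsupported flags is safe, because a script
//   valid under more flags must stay valid under fewer (softfork property).
// - Expected-invalid test: dropping a flag might make the script valid, so the
//   check only runs when no flag would be dropped.
inline std::optional<uint32_t> FlagsForRestrictedCheck(bool expect_valid, uint32_t main_flags, uint32_t supported)
{
    const uint32_t reduced = main_flags & supported;
    if (expect_valid || reduced == main_flags) return reduced;
    return std::nullopt;
}
```

With the old condition, which corresponds to dropping the `expect_valid ||` part, an expected-valid test carrying any unsupported flag would be skipped rather than checked with the reduced set.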

practicalswift · 2019-05-07T16:26:32Z

@jl2012 Could you rebase this PR on master and push to have Travis test the rebased version? I think there might be some UBSan runtime errors that need to be addressed.

DrahtBot · 2019-09-18T15:14:25Z

Needs rebase

maflcko · 2020-08-07T08:04:32Z

No activity for a year. Closing for now; let me know when you want to work on this again.

…ests and assert backwards compatibility

- 5786a81 Verify that all validation flags are backward compatible (gzhao408)
- b10ce9a [test] check verification flags are minimal/maximal (gzhao408)
- a260c22 [test] Check for invalid flag combinations (gzhao408)
- a7098a2 [refactor] use CheckTxScripts, TrimFlags, FillFlags (gzhao408)
- 7a77727 Apply minimal validation flags to tx_invalid tests (gzhao408)
- 9532591 [test] add BADTX setting for invalid txns that fail CheckTransaction (gzhao408)
- 4c06ebf [test] fix two witness tests in invalid tests with empty vout (gzhao408)
- 158a0b2 Apply maximal validation flags to tx_valid tests (gzhao408)
- 0a76a39 [test] fix CSV test missing OP_ADD (gzhao408)
- 19db590 [test] remove unnecessary OP_1s from CSV and CLTV tests (gzhao408)

Pull request description:

This uses the first 4 commits of #15045, rebased and added some comments. The diff is quite large already and I want to make it easy to review, so I'm splitting it into 2 PRs (transaction and script). Script one is WIP, I'll link it when I open it.

Interpretation of scripts is dependent on the script verification flags passed in. In tests, we should always apply **maximal** verification flags when checking that a transaction is **valid**; any additional flags should invalidate the transaction. A transaction should not be valid because we forgot to include a flag, and we should apply all flags by default.

We should apply **minimal** verification flags when asserting that a transaction is **invalid**; if verification flags are applied, removing any one of them should mean the transaction is valid.

New verify flags must be backwards compatible; tests should check backwards compatibility and apply the new flags by default. All `tx_invalid` tests should continue to be invalid with the exact same verify flags. All `tx_valid` tests that don't pass with new flags should _explicitly_ indicate that the flags need to be excluded, and fail otherwise.

1. Flip the meaning of `verifyFlags` in tx_valid.json to mean _excluded_ verification flags instead of included flags. Edit the test data accordingly.
2. Trim unneeded flags from tx_invalid.json.
3. Add check to verify that tx_valid tests have maximal flags and tx_invalid tests have minimal flags.
4. Add checks to verify that flags are soft forks (#10699) i.e. adding any flag should only decrease the number of acceptable scripts. Test by adding/removing random flags.

ACKs for top commit:

- achow101: ACK 5786a81
- laanwj: ACK 5786a81

Tree-SHA512: 19195d8cf3299e62f47dd3443ae4a95430c5c9d497993a18ab80de9e24b1869787af972774993bf05717784879bc4592fdabaae0fddebd437963d8f3c96d9a73


maflcko · 2021-02-23T16:20:47Z

(last commit hasn't been picked up yet)


jl2012 force-pushed the min_txtests_flags branch from c1b8110 to fc19bc1 Compare December 27, 2018 19:43

fanquake self-assigned this Dec 27, 2018

fanquake added the Tests label Dec 27, 2018

fanquake removed their assignment Dec 27, 2018

practicalswift reviewed Dec 28, 2018


jl2012 force-pushed the min_txtests_flags branch from fc19bc1 to 9e43ae4 Compare December 28, 2018 15:15

jl2012 force-pushed the min_txtests_flags branch from 457ec82 to 862a176 Compare December 30, 2018 19:27

DrahtBot added the Needs rebase label Jan 16, 2019

jl2012 added 5 commits January 16, 2019 22:15

Apply maximal validation flags to tx_valid tests

c6eca9c

- Apply all validation flags by default
- Invert the meaning of verifyFlags as flags being excluded
- Remove unnecessary OP_1 at the end of most OP_CLTV and OP_CSV tests
- Fix an OP_CSV test with an OP_ADD missing

Apply minimal validation flags to tx_invalid tests

438d353

- Reduce the number of validation flags used, to a minimally required set to fail a test
- Use "BADTX" to indicate tests not passing CheckTransaction()
- Fix 2 witness tests with empty vout

Verify that verifyFlags in tx_valid and tx_invalid tests are minimal

2565830

- in tx_valid, all flags are set except those in verifyFlags
- in tx_invalid, all flags are not set except those in verifyFlags

Verify that all validation flags are backward compatible

089d5c8

See bitcoin#10699

jl2012 force-pushed the min_txtests_flags branch from 862a176 to 447d292 Compare January 16, 2019 14:47

DrahtBot removed the Needs rebase label Jan 16, 2019


maflcko closed this May 7, 2019

maflcko reopened this May 7, 2019

DrahtBot added the Needs rebase label Sep 18, 2019

maflcko added the Up for grabs label Aug 7, 2020

maflcko closed this Aug 7, 2020

This was referenced Aug 10, 2020

- [wip] test: Apply maximal verification flags to tx_valid tests and minimal flags to tx_invalid tests glozow/bitcoin#3 (Closed)
- test: apply strict verification flags for transaction tests and assert backwards compatibility #19698 (Merged)

bitcoin locked as resolved and limited conversation to collaborators Aug 16, 2022
