-
Notifications
You must be signed in to change notification settings - Fork 38.7k
[test] Apply maximal flags to tx_valid tests and minimal flags to tx_invalid tests #15045
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
src/test/data/tx_invalid.json
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is something wrong with this test, but I am not sure about its intention
src/test/data/tx_invalid.json
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another problematic test with unclear intention
c1b8110 to
fc19bc1
Compare
src/test/transaction_tests.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the implicit conversion from negative to unsigned be made explicit here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Also fixed a bug as I swapped SCRIPT_VERIFY_CLEANSTACK and SCRIPT_VERIFY_WITNESS here
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
fc19bc1 to
9e43ae4
Compare
|
Added a commit to verify all validation flags are backward compatible (softfork), like #10699 |
|
Added a commit to do the same in script_tests |
457ec82 to
862a176
Compare
|
Concept ACK |
- Apply all validation flags by default - Invert the meaning of verifyFlags as flags being excluded - Remove unnecessary OP_1 at the end of most OP_CLTV and OP_CSV tests - Fix an OP_CSV test with an OP_ADD missing
- Reduce the number of validation flags used, to a minimally required set to fail a test - Use "BADTX" to indicate tests not passing CheckTransaction() - Fix 2 witness tests with empty vout
- in tx_valid, all flags are set except those in verifyFlags - in tx_invalid, all flags are not set except those in verifyFlags
- For invalid script tests: -- reduce the number of validation flags used, to a minimally required set to fail a test -- verify that adding flags would not validate an invalid script - For valid script tests: -- apply all flags except those explicitly excluded -- verify that removing flags would not invalidate a valid script
862a176 to
447d292
Compare
| int libconsensus_flags = flags & bitcoinconsensus_SCRIPT_FLAGS_VERIFY_ALL; | ||
| if (libconsensus_flags == flags) { | ||
| int libconsensus_flags = main_flags & bitcoinconsensus_SCRIPT_FLAGS_VERIFY_ALL; | ||
| if (expect || libconsensus_flags == (int)main_flags) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rebased and fixed a bug here. Without expect ||, all valid script tests were skipped due to the use of standardness flags. If a script is valid with standardness flags, removing these flags shouldn't make it invalid.
This also fixed an existing bug that many tests containing standardness flags (mostly STRICTENC) are unnecessarily skipped.
|
@jl2012 Could you rebase this PR on |
| Needs rebase |
|
No activity for a year. Closing for now, let me know when you want to work on this again |
…ests and assert backwards compatibility 5786a81 Verify that all validation flags are backward compatible (gzhao408) b10ce9a [test] check verification flags are minimal/maximal (gzhao408) a260c22 [test] Check for invalid flag combinations (gzhao408) a7098a2 [refactor] use CheckTxScripts, TrimFlags, FillFlags (gzhao408) 7a77727 Apply minimal validation flags to tx_invalid tests (gzhao408) 9532591 [test] add BADTX setting for invalid txns that fail CheckTransaction (gzhao408) 4c06ebf [test] fix two witness tests in invalid tests with empty vout (gzhao408) 158a0b2 Apply maximal validation flags to tx_valid tests (gzhao408) 0a76a39 [test] fix CSV test missing OP_ADD (gzhao408) 19db590 [test] remove unnecessary OP_1s from CSV and CLTV tests (gzhao408) Pull request description: This uses the first 4 commits of #15045, rebased and added some comments. The diff is quite large already and I want to make it easy to review, so I'm splitting it into 2 PRs (transaction and script). Script one is WIP, I'll link it when I open it. Interpretation of scripts is dependent on the script verification flags passed in. In tests, we should always apply **maximal** verification flags when checking that a transaction is **valid**; any additional flags should invalidate the transaction. A transaction should not be valid because we forgot to include a flag, and we should apply all flags by default. We should apply **minimal** verification flags when asserting that a transaction is **invalid**; if verification flags are applied, removing any one of them should mean the transaction is valid. New verify flags must be backwards compatible; tests should check backwards compatibility and apply the new flags by default. All `tx_invalid` tests should continue to be invalid with the exact same verify flags. All `tx_valid` tests that don't pass with new flags should _explicitly_ indicate that the flags need to be excluded, and fail otherwise. 1. Flip the meaning of `verifyFlags` in tx_valid.json to mean _excluded_ verification flags instead of included flags. Edit the test data accordingly. 2. Trim unneeded flags from tx_invalid.json. 3. Add check to verify that tx_valid tests have maximal flags and tx_invalid tests have minimal flags. 4. Add checks to verify that flags are soft forks (#10699) i.e. adding any flag should only decrease the number of acceptable scripts. Test by adding/removing random flags. ACKs for top commit: achow101: ACK 5786a81 laanwj: ACK 5786a81 Tree-SHA512: 19195d8cf3299e62f47dd3443ae4a95430c5c9d497993a18ab80de9e24b1869787af972774993bf05717784879bc4592fdabaae0fddebd437963d8f3c96d9a73
…ction tests and assert backwards compatibility 5786a81 Verify that all validation flags are backward compatible (gzhao408) b10ce9a [test] check verification flags are minimal/maximal (gzhao408) a260c22 [test] Check for invalid flag combinations (gzhao408) a7098a2 [refactor] use CheckTxScripts, TrimFlags, FillFlags (gzhao408) 7a77727 Apply minimal validation flags to tx_invalid tests (gzhao408) 9532591 [test] add BADTX setting for invalid txns that fail CheckTransaction (gzhao408) 4c06ebf [test] fix two witness tests in invalid tests with empty vout (gzhao408) 158a0b2 Apply maximal validation flags to tx_valid tests (gzhao408) 0a76a39 [test] fix CSV test missing OP_ADD (gzhao408) 19db590 [test] remove unnecessary OP_1s from CSV and CLTV tests (gzhao408) Pull request description: This uses the first 4 commits of bitcoin#15045, rebased and added some comments. The diff is quite large already and I want to make it easy to review, so I'm splitting it into 2 PRs (transaction and script). Script one is WIP, I'll link it when I open it. Interpretation of scripts is dependent on the script verification flags passed in. In tests, we should always apply **maximal** verification flags when checking that a transaction is **valid**; any additional flags should invalidate the transaction. A transaction should not be valid because we forgot to include a flag, and we should apply all flags by default. We should apply **minimal** verification flags when asserting that a transaction is **invalid**; if verification flags are applied, removing any one of them should mean the transaction is valid. New verify flags must be backwards compatible; tests should check backwards compatibility and apply the new flags by default. All `tx_invalid` tests should continue to be invalid with the exact same verify flags. All `tx_valid` tests that don't pass with new flags should _explicitly_ indicate that the flags need to be excluded, and fail otherwise. 1. Flip the meaning of `verifyFlags` in tx_valid.json to mean _excluded_ verification flags instead of included flags. Edit the test data accordingly. 2. Trim unneeded flags from tx_invalid.json. 3. Add check to verify that tx_valid tests have maximal flags and tx_invalid tests have minimal flags. 4. Add checks to verify that flags are soft forks (bitcoin#10699) i.e. adding any flag should only decrease the number of acceptable scripts. Test by adding/removing random flags. ACKs for top commit: achow101: ACK 5786a81 laanwj: ACK 5786a81 Tree-SHA512: 19195d8cf3299e62f47dd3443ae4a95430c5c9d497993a18ab80de9e24b1869787af972774993bf05717784879bc4592fdabaae0fddebd437963d8f3c96d9a73
|
(last commit hasn't been picked up yet) |
The first commit inverts the meaning of verifyFlags for tx_valid tests, as flags being excluded. All flags are applied by default, except those found in verifyFlags. This makes sure that a new or existing flag won't invalidate a tx by accident.
The second commit reduces the number of validation flags used for tx_invalid tests, to a minimally required set to fail a test. This makes sure that a tx failed due to the tested flags, not unexpected effects of some other flags. It also uses "BADTX" to indicate tests not passing CheckTransaction(), vs. those failing script execution.
(If a test is expected to fail due to multiple independent flags, multiple tests should be used)
The third commit verifies that the flags excluded in tx_valid and included in tx_invalid are indeed the minimal set. In tx_valid, it adds back the excluded flags individually and expects it to fail. In tx_invalid, it removes the included flags individually and expects it to pass.
This process helped me to identify and fix some buggy tests: