[test] Apply maximal flags to tx_valid tests and minimal flags to tx

jl2012 commented at 7:29 pm on December 27, 2018: contributor

The first commit inverts the meaning of verifyFlags for tx_valid tests, as flags being excluded. All flags are applied by default, except those found in verifyFlags. This makes sure that a new or existing flag won’t invalidate a tx by accident.

The second commit reduces the number of validation flags used for tx_invalid tests, to a minimally required set to fail a test. This makes sure that a tx failed due to the tested flags, not unexpected effects of some other flags. It also uses “BADTX” to indicate tests not passing CheckTransaction(), vs. those failing script execution.

(If a test is expected to fail due to multiple independent flags, multiple tests should be used)

The third commit verifies that the flags excluded in tx_valid and included in tx_invalid are indeed the minimal set. In tx_valid, it adds back the excluded flags individually and expects it to fail. In tx_invalid, it removes the included flags individually and expects it to pass.

This process helped me to identify and fix some buggy tests:

Remove unnecessary OP_1 at the end of most OP_CLTV and OP_CSV tx_valid tests, so there is no need to exclude CLEANSTACK
An OP_CSV tx_valid test missed an OP_ADD, and is added back
2 witness tests were found with empty vout, so they failed due to CheckTransaction(), not script tests. Corrected by filling in proper vout.

in src/test/data/tx_invalid.json:157 in c1b8110428 outdated

171 [[["0000000000000000000000000000000000000000000000000000000000000100", 0, "0 CHECKLOCKTIMEVERIFY 1"]],
172-"010000000100010000000000000000000000000000000000000000000000000000000000000000000000ffffffff0100000000000000000000000000", "P2SH,CHECKLOCKTIMEVERIFY"],
173+"010000000100010000000000000000000000000000000000000000000000000000000000000000000000ffffffff0100000000000000000000000000", "CHECKLOCKTIMEVERIFY"],
174 [[["0000000000000000000000000000000000000000000000000000000000000100", 0, "0"]],
175-"01000000010001000000000000000000000000000000000000000000000000000000000000000000000251b1ffffffff0100000000000000000002000000", "P2SH,CHECKLOCKTIMEVERIFY"],
176+"01000000010001000000000000000000000000000000000000000000000000000000000000000000000251b1ffffffff0100000000000000000002000000", "NONE"],

jl2012 commented at 7:35 pm on December 27, 2018:

I think there is something wrong with this test, but I am not sure about its intention

in src/test/data/tx_invalid.json:168 in c1b8110428 outdated

185 [[["0000000000000000000000000000000000000000000000000000000000000100", 0, "0 CHECKLOCKTIMEVERIFY 1"]],
186-"01000000010001000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000065cd1d", "P2SH,CHECKLOCKTIMEVERIFY"],
187+"01000000010001000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000065cd1d", "CHECKLOCKTIMEVERIFY"],
188 [[["0000000000000000000000000000000000000000000000000000000000000100", 0, "0"]],
189-"01000000010001000000000000000000000000000000000000000000000000000000000000000000000251b100000000010000000000000000000065cd1d", "P2SH,CHECKLOCKTIMEVERIFY"],
190+"01000000010001000000000000000000000000000000000000000000000000000000000000000000000251b100000000010000000000000000000065cd1d", "NONE"],

jl2012 commented at 7:36 pm on December 27, 2018:

Another problematic test with unclear intention

jl2012 force-pushed on Dec 27, 2018

fanquake assigned fanquake on Dec 27, 2018

fanquake added the label Tests on Dec 27, 2018

fanquake unassigned fanquake on Dec 27, 2018

in src/test/transaction_tests.cpp:282 in fc19bc13b6 outdated

284+            if (!CheckTx(tx, mapprevOutScriptPubKeys, mapprevOutValues, verify_flags[0], txdata, strTest, false))
285+                BOOST_ERROR("Tx unexpectedly passed: " << strTest);
286+            for (size_t f = 1; f < verify_flags.size(); f++) {
287+                unsigned int flags = verify_flags[f];
288+                if (!(flags & SCRIPT_VERIFY_CLEANSTACK)) {
289+                    flags &= ~SCRIPT_VERIFY_WITNESS;

practicalswift commented at 10:42 am on December 28, 2018:

Should the implicit conversion from negative to unsigned be made explicit here?

jl2012 commented at 3:16 pm on December 28, 2018:

Done. Also fixed a bug as I swapped SCRIPT_VERIFY_CLEANSTACK and SCRIPT_VERIFY_WITNESS here

DrahtBot commented at 11:56 am on December 28, 2018: member

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#16881 (consensus: Improve CScriptNum error reporting by fanquake)
#14696 (qa: Add explicit references to related CVE’s in p2p_invalid_block test. by lucash-dev)
#13360 ([Policy] Reject SIGHASH_SINGLE with output out of bound by jl2012)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

jl2012 force-pushed on Dec 28, 2018

jl2012 commented at 7:53 pm on December 28, 2018: contributor

Added a commit to verify all validation flags are backward compatible (softfork), like #10699

jl2012 commented at 7:23 pm on December 30, 2018: contributor

Added a commit to do the same in script_tests

jl2012 force-pushed on Dec 30, 2018

laanwj commented at 2:29 pm on January 2, 2019: member

Concept ACK

DrahtBot added the label Needs rebase on Jan 16, 2019

Apply maximal validation flags to tx_valid tests

- Apply all validation flags by default
- Invert the meaning of verifyFlags as flags being excluded
- Remove unnecessary OP_1 at the end of most OP_CLTV and OP_CSV tests
- Fix an OP_CSV test with an OP_ADD missing

c6eca9cfc8

Apply minimal validation flags to tx_invalid tests

- Reduce the number of validation flags used, to a minimally required set to fail a test
- Use "BADTX" to indicate tests not passing CheckTransaction()
- Fix 2 witness tests with empty vout

438d353d72

Verify that verifyFlags in tx_valid and tx_invalid tests are minimal

- in tx_valid, all flags are set except those in verifyFlags
- in tx_invalid, all flags are not set except those in verifyFlags

256583085b

Verify that all validation flags are backward compatible

See #10699

089d5c8119

Apply minimal/maximal validation flags to invalid/valid script tests

- For invalid script tests:
-- reduce the number of validation flags used, to a minimally required set to fail a test
-- verify that adding flags would not validate an invalid script
- For valid script tests:
-- apply all flags except those explicitly excluded
-- verify that removing flags would not invalidate a valid script

447d292c9b

jl2012 force-pushed on Jan 16, 2019

DrahtBot removed the label Needs rebase on Jan 16, 2019

in src/test/script_tests.cpp:212 in 447d292c9b

223     CDataStream stream(SER_NETWORK, PROTOCOL_VERSION);
224     stream << tx2;
225-    int libconsensus_flags = flags & bitcoinconsensus_SCRIPT_FLAGS_VERIFY_ALL;
226-    if (libconsensus_flags == flags) {
227+    int libconsensus_flags = main_flags & bitcoinconsensus_SCRIPT_FLAGS_VERIFY_ALL;
228+    if (expect || libconsensus_flags == (int)main_flags) {

jl2012 commented at 5:12 pm on January 16, 2019:

rebased and fixed a bug here. Without expect ||, all valid script tests were skipped due to the use of standardness flags. If a script is valid with standardness flags, removing these flags shouldn’t make it invalid.

This also fixed an existing bug that many tests containing standardness flags (mostly STRICTENC) are unnecessarily skipped.

practicalswift commented at 4:26 pm on May 7, 2019: contributor

@jl2012 Could you rebase this PR on master and push to have Travis test the rebased version? I think there might be some UBSan runtime errors that needs to be addressed.

MarcoFalke closed this on May 7, 2019

MarcoFalke reopened this on May 7, 2019

DrahtBot commented at 3:14 pm on September 18, 2019: member

DrahtBot added the label Needs rebase on Sep 18, 2019

MarcoFalke added the label Up for grabs on Aug 7, 2020

MarcoFalke commented at 8:04 am on August 7, 2020: member

No activity for a year. Closing for now, let me know when you want to work on this again

MarcoFalke closed this on Aug 7, 2020

laanwj referenced this in commit c263c3d7d2 on Feb 23, 2021

sidhujag referenced this in commit 7deae1779c on Feb 23, 2021

MarcoFalke commented at 4:20 pm on February 23, 2021: member

(last commit hasn’t been picked up yet)

DrahtBot locked this on Aug 16, 2022

[test] Apply maximal flags to tx_valid tests and minimal flags to tx_invalid tests #15045

Conflicts