The first commit inverts the meaning of verifyFlags in tx_valid tests: it now lists the flags to exclude. All flags are applied by default, except those listed in verifyFlags. This makes sure that a new or existing flag won’t invalidate a valid tx by accident.
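To illustrate the inverted meaning, here is a minimal Python sketch (not the actual test harness). The flag subset, the bit positions, the handling of "NONE", and the helper names `parse_flags`, `flags_for_valid_test` and `flags_for_invalid_test` are all assumptions made for this example.

```python
# Illustrative flag table: the names mirror flags used in the test files,
# but this subset and the bit positions are assumptions made for the sketch.
FLAG_BITS = {
    "P2SH": 1 << 0,
    "DERSIG": 1 << 1,
    "CLEANSTACK": 1 << 2,
    "CHECKLOCKTIMEVERIFY": 1 << 3,
    "CHECKSEQUENCEVERIFY": 1 << 4,
    "WITNESS": 1 << 5,
}

# Bitmask with every known flag set.
ALL_FLAGS = 0
for _bit in FLAG_BITS.values():
    ALL_FLAGS |= _bit


def parse_flags(verify_flags):
    """Turn a comma-separated flag string into a bitmask.

    An empty string or "NONE" is assumed to mean "no flags".
    """
    if verify_flags in ("", "NONE"):
        return 0
    mask = 0
    for name in verify_flags.split(","):
        mask |= FLAG_BITS[name]
    return mask


def flags_for_valid_test(verify_flags):
    """tx_valid: verifyFlags lists the flags to EXCLUDE from the full set."""
    return ALL_FLAGS & ~parse_flags(verify_flags)


def flags_for_invalid_test(verify_flags):
    """tx_invalid: verifyFlags lists the flags to INCLUDE, and nothing else."""
    return parse_flags(verify_flags)
```

With this reading, a tx_valid test whose verifyFlags is "CLEANSTACK" is checked under every flag except CLEANSTACK, while a tx_invalid test with the same string is checked under CLEANSTACK alone.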
The second commit reduces the validation flags used in tx_invalid tests to the minimal set required for each test to fail. This makes sure that a tx fails because of the flags under test, not because of unexpected effects of other flags. It also uses “BADTX” to mark tests that fail CheckTransaction(), as opposed to those that fail script execution.
(If a tx is expected to fail under multiple independent flags, it should be covered by multiple tests, one for each flag.)
The third commit verifies that the flags excluded in tx_valid and included in tx_invalid are indeed the minimal set. In tx_valid, it adds back each excluded flag individually and expects the test to fail. In tx_invalid, it removes each included flag individually and expects the test to pass.
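The minimality check can be sketched as follows. This is a simplified Python illustration, not the code in the commit: `verify` is a hypothetical callback that takes a flag bitmask and returns True when the tx passes under those flags, and both function names are made up for the example.

```python
def single_bits(mask):
    """Yield each set bit of mask as its own one-bit mask."""
    bit = 1
    while bit <= mask:
        if mask & bit:
            yield bit
        bit <<= 1


def excluded_flags_are_minimal(verify, all_flags, excluded):
    """tx_valid check: the tx passes with the excluded flags removed,
    and adding back any single excluded flag makes it fail."""
    base = all_flags & ~excluded
    if not verify(base):
        return False
    return all(not verify(base | bit) for bit in single_bits(excluded))


def included_flags_are_minimal(verify, included):
    """tx_invalid check: the tx fails with the included flags,
    and removing any single included flag makes it pass."""
    if verify(included):
        return False
    return all(verify(included & ~bit) for bit in single_bits(included))
```

For example, a toy tx that fails only when bit 4 is set is minimal with `excluded=4`, but not with `excluded=6`, because adding bit 2 back does not make it fail.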
This process helped me to identify and fix some buggy tests:
- Removed the unnecessary OP_1 at the end of most OP_CLTV and OP_CSV tx_valid tests, so CLEANSTACK no longer needs to be excluded
- An OP_CSV tx_valid test was missing an OP_ADD, which has been added back
- 2 witness tests had an empty vout, so they failed in CheckTransaction(), not in script verification. Corrected by filling in a proper vout.