test: apply strict verification flags for transaction tests and assert backwards compatibility #19698

glozow commented at 5:51 PM on August 11, 2020: member

This uses the first 4 commits of #15045, rebased and added some comments. The diff is quite large already and I want to make it easy to review, so I'm splitting it into 2 PRs (transaction and script). Script one is WIP, I'll link it when I open it.

Interpretation of scripts is dependent on the script verification flags passed in. In tests, we should always apply maximal verification flags when checking that a transaction is valid; any additional flags should invalidate the transaction. A transaction should not be valid because we forgot to include a flag, and we should apply all flags by default. We should apply minimal verification flags when asserting that a transaction is invalid; if verification flags are applied, removing any one of them should mean the transaction is valid. New verify flags must be backwards compatible; tests should check backwards compatibility and apply the new flags by default. All tx_invalid tests should continue to be invalid with the exact same verify flags. All tx_valid tests that don't pass with new flags should explicitly indicate that the flags need to be excluded, and fail otherwise.

Flip the meaning of verifyFlags in tx_valid.json to mean excluded verification flags instead of included flags. Edit the test data accordingly.
Trim unneeded flags from tx_invalid.json.
Add check to verify that tx_valid tests have maximal flags and tx_invalid tests have minimal flags.
Add checks to verify that flags are soft forks (#10699) i.e. adding any flag should only decrease the number of acceptable scripts. Test by adding/removing random flags.

DrahtBot added the label Tests on Aug 11, 2020

luke-jr commented at 10:10 PM on August 11, 2020: member

Not sure this is a good idea. Tests should ideally test one thing only, and failing due to other bugs would be annoying.

laanwj commented at 11:13 AM on August 12, 2020: member

Looks like one of the sanitizers finds an integer conversion/truncation problem in the changed code:

�[1;34;49mtest/transaction_tests.cpp(251): Leaving test case "tx_invalid"; testing time: 591590us
�[0;39;49m�[1;34;49mtest/transaction_tests.cpp(164): Entering test case "tx_valid"
test/transaction_tests.cpp:237:35: runtime error: implicit conversion from type 'unsigned long' of value 18446744073709526401 (64-bit, unsigned) to type 'unsigned int' changed the value to 4294942081 (32-bit, unsigned)
    [#0](/bitcoin-bitcoin/0/) 0x558b927932d2 in transaction_tests::tx_valid::test_method() /tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/transaction_tests.cpp:237:35
    [#1](/bitcoin-bitcoin/1/) 0x558b9278fb88 in transaction_tests::tx_valid_invoker() /tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/transaction_tests.cpp:164:1
    [#2](/bitcoin-bitcoin/2/) 0x558b91e8e608 in boost::detail::function::void_function_invoker0<void (*)(), void>::invoke(boost::detail::function::function_buffer&) /usr/include/boost/function/function_template.hpp:117:11
    [#3](/bitcoin-bitcoin/3/) 0x7f37eb6e23f1  (/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.71.0+0x353f1)
    [#4](/bitcoin-bitcoin/4/) 0x7f37eb6dfc74 in boost::execution_monitor::catch_signals(boost::function<int ()> const&) (/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.71.0+0x32c74)
    [#5](/bitcoin-bitcoin/5/) 0x7f37eb6dfcf7 in boost::execution_monitor::execute(boost::function<int ()> const&) (/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.71.0+0x32cf7)
    [#6](/bitcoin-bitcoin/6/) 0x7f37eb6dfdcd in boost::execution_monitor::vexecute(boost::function<void ()> const&) (/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.71.0+0x32dcd)
    [#7](/bitcoin-bitcoin/7/) 0x7f37eb70d134 in boost::unit_test::unit_test_monitor_t::execute_and_translate(boost::function<void ()> const&, unsigned long) (/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.71.0+0x60134)
    [#8](/bitcoin-bitcoin/8/) 0x7f37eb6f05a8  (/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.71.0+0x435a8)
    [#9](/bitcoin-bitcoin/9/) 0x7f37eb6f0b03  (/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.71.0+0x43b03)
    [#10](/bitcoin-bitcoin/10/) 0x7f37eb6f0b03  (/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.71.0+0x43b03)
    [#11](/bitcoin-bitcoin/11/) 0x7f37eb6e7939 in boost::unit_test::framework::run(unsigned long, bool) (/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.71.0+0x3a939)
    [#12](/bitcoin-bitcoin/12/) 0x7f37eb70bfea in boost::unit_test::unit_test_main(bool (*)(), int, char**) (/lib/x86_64-linux-gnu/libboost_unit_test_framework.so.1.71.0+0x5efea)
    [#13](/bitcoin-bitcoin/13/) 0x558b91dd63a6 in main /usr/include/boost/test/unit_test.hpp:63:12
    [#14](/bitcoin-bitcoin/14/) 0x7f37eae990b2 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x270b2)
    [#15](/bitcoin-bitcoin/15/) 0x558b91d2bbbd in _start (/tmp/cirrus-ci-build/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/test_bitcoin+0x21c5bbd)

SUMMARY: UndefinedBehaviorSanitizer: implicit-unsigned-integer-truncation test/transaction_tests.cpp:237:35 in

glozow commented at 4:54 PM on August 14, 2020: member

@laanwj I'm looking into the issue, is there a way to run this sanitizer locally? Edit: nvm I think I got it, from developer-notes...

glozow commented at 4:56 PM on August 14, 2020: member

Not sure this is a good idea. Tests should ideally test one thing only, and failing due to other bugs would be annoying. @luke-jr I would think of it more as... each test sanity checks itself to make sure it's testing exactly what we want it to test. Passing/failing because the test itself is incorrect would be more annoying imo. All the current tests pass; as more tests are added, I'd think it's a good idea to have this check.

glozow force-pushed on Aug 20, 2020

glozow commented at 3:00 PM on August 21, 2020: member

Last push fixed the sanitizer bug - just needed a cast. Ready for review :)

glozow commented at 12:10 AM on September 4, 2020: member

A little bit quiet here... @laanwj and @practicalswift you both left reviews on the original PR, if you have time I'd appreciate a look here as well :)

laanwj commented at 9:35 AM on October 27, 2020: member

Code review ACK, thanks for solving the sanitizer bug.

I would like @sipa @MarcoFalke or someone else close with the verification testing code to look at this and give concept ACK.

benthecarman commented at 4:21 AM on November 14, 2020: contributor

Concept ACK

in src/test/transaction_tests.cpp:321 in 110239f2ff outdated

 331 | -                                      witness, verify_flags, TransactionSignatureChecker(&tx, i, amount, txdata), &err);
 332 | +            // Not using FillFlags() in the main test, in order to detect invalid verifyFlags combination
 333 | +            if (!CheckTxScripts(tx, mapprevOutScriptPubKeys, mapprevOutValues, verify_flags[0], txdata, strTest, /* expect_valid */ false))
 334 | +                BOOST_ERROR("Tx unexpectedly passed: " << strTest);
 335 | +            // Check that flags are minimal: transaction should succeed if any set flags are unset.
 336 | +            for (size_t i = 0; i < mapFlagNames.size(); i++)

benthecarman commented at 4:23 AM on November 14, 2020:

nit: ++i

in src/test/transaction_tests.cpp:332 in 110239f2ff outdated

 342 | +                // Backwards compatibility: Adding some random flags should not validate an invalid transaction
 343 | +                flags = FillFlags(verify_flags[0] | (unsigned int)InsecureRandBits(mapFlagNames.size()));
 344 | +                if (!CheckTxScripts(tx, mapprevOutScriptPubKeys, mapprevOutValues, flags, txdata, strTest, /* expect_valid */ false))
 345 | +                    BOOST_ERROR("Tx unexpectedly passed with random flags " << ToString(flags) << ": " << strTest);
 346 | +            }
 347 | +            for (size_t f = 1; f < verify_flags.size(); f++) {

benthecarman commented at 4:23 AM on November 14, 2020:

nit: ++f

jnewbery commented at 5:53 PM on November 26, 2020:

or just use range-based for loop.

in src/test/transaction_tests.cpp:172 in 110239f2ff outdated

 168 | @@ -104,10 +169,9 @@ BOOST_AUTO_TEST_CASE(tx_valid)
 169 |      // or [[[prevout hash, prevout index, prevout scriptPubKey], [input 2], ...],"], serializedTransaction, verifyFlags
 170 |      // ... where all scripts are stringified scripts.
 171 |      //
 172 | -    // verifyFlags is a comma separated list of script verification flags to apply, or "NONE"
 173 | +    // verifyFlags is a comma separated list of MINIMAL script verification flags to exclude, or "NONE"

jnewbery commented at 5:38 PM on November 26, 2020:

I don't think the format of tx_valid.json should be documented here and in the file itself. It'd be better to consolidate this documentation to just the json file.

No need to capitalize MINIMAL.

glozow commented at 1:58 AM on December 7, 2020:

consolidated in the json since that's where any new tests would be added

in src/test/transaction_tests.cpp:145 in 110239f2ff outdated

 145 | +        flags &= ~(unsigned int)SCRIPT_VERIFY_CLEANSTACK;
 146 | +    }
 147 | +    return flags;
 148 | +}
 149 | +
 150 | +unsigned int FillFlags(unsigned int flags)

jnewbery commented at 5:40 PM on November 26, 2020:

Would a better interface be to take a reference and update it in place?

glozow commented at 11:52 PM on December 6, 2020:

maybe not, since we're using it like CheckTxScripts(... FillFlags(flags) ...) So doing it in place would mean we need an extra line there

in src/test/transaction_tests.cpp:152 in 110239f2ff outdated

 154 | +    }
 155 | +    else if (flags & SCRIPT_VERIFY_CLEANSTACK) {
 156 | +        flags |= SCRIPT_VERIFY_P2SH;
 157 | +        flags |= SCRIPT_VERIFY_WITNESS;
 158 | +    }
 159 | +    return flags;

jnewbery commented at 5:44 PM on November 26, 2020:

Perhaps clearer:

    // CLEANSTACK implies WITNESS
    if (flags & SCRIPT_VERIFY_CLEANSTACK) flags |= SCRIPT_VERIFY_WITNESS;
    
    // WITNESS implies P2SH (and transitively CLEANSTACK implies P2SH)
    if (flags & SCRIPT_VERIFY_WITNESS) flags |= SCRIPT_VERIFY_P2SH;

glozow commented at 1:58 AM on December 7, 2020:

indeed, cleaner :)

in src/test/transaction_tests.cpp:103 in 110239f2ff outdated

  99 | @@ -94,6 +100,65 @@ std::string FormatScriptFlags(unsigned int flags)
 100 |      return ret.substr(0, ret.size() - 1);
 101 |  }
 102 |  
 103 | +bool CheckTxScripts(const CTransaction& tx, const std::map<COutPoint, CScript>& mapprevOutScriptPubKeys, const std::map<COutPoint, int64_t>& mapprevOutValues, const unsigned int& flags, const PrecomputedTransactionData& txdata, const std::string& strTest, const bool expect_valid)

jnewbery commented at 5:44 PM on November 26, 2020:

Please split this line (and perhaps comment what this function is doing)

murchandamus commented at 3:43 PM on December 2, 2020:

Have line length guidelines ever been considered for this codebase? :innocent:

jonatack commented at 4:22 PM on December 2, 2020:

@glozow I find https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines.html#Rf-in useful for when to pass "in" parameters by value versus by reference to const

jonatack commented at 4:31 PM on December 2, 2020:

Have line length guidelines ever been considered for this codebase? innocent

For the C++ part, the longer the better? (After many years of 80 or 100 char line limits, this surprised me at first, but for the C++ code I now actually prefer the lines to be unbroken more often than not.)

For the Python code, ISTM the unofficial practice is not PEP8 (79 chars) but more like 120 or 160 chars.

glozow commented at 3:01 PM on December 7, 2020:

I eyeballed ~100 chars, hopefully that's ok Also thanks for the link @jonatack, switched the cheaper ints to pass by value

in src/test/transaction_tests.cpp:108 in 110239f2ff outdated

  99 | @@ -94,6 +100,65 @@ std::string FormatScriptFlags(unsigned int flags)
 100 |      return ret.substr(0, ret.size() - 1);
 101 |  }
 102 |  
 103 | +bool CheckTxScripts(const CTransaction& tx, const std::map<COutPoint, CScript>& mapprevOutScriptPubKeys, const std::map<COutPoint, int64_t>& mapprevOutValues, const unsigned int& flags, const PrecomputedTransactionData& txdata, const std::string& strTest, const bool expect_valid)
 104 | +{
 105 | +    bool tx_valid = true;
 106 | +    ScriptError err = expect_valid ? SCRIPT_ERR_UNKNOWN_ERROR : SCRIPT_ERR_OK;
 107 | +    for (unsigned int i = 0; i < tx.vin.size() && tx_valid; i++)
 108 | +    {

jnewbery commented at 5:44 PM on November 26, 2020:

join

jnewbery commented at 5:47 PM on November 26, 2020:

consider making an alias reference for tx.vin[i] here, since you'll use it a lot below.

in src/test/transaction_tests.cpp:111 in 110239f2ff outdated

 106 | +    ScriptError err = expect_valid ? SCRIPT_ERR_UNKNOWN_ERROR : SCRIPT_ERR_OK;
 107 | +    for (unsigned int i = 0; i < tx.vin.size() && tx_valid; i++)
 108 | +    {
 109 | +        const CAmount amount = mapprevOutValues.count(tx.vin[i].prevout) ? mapprevOutValues.at(tx.vin[i].prevout) : 0;
 110 | +        try
 111 | +        {

jnewbery commented at 5:45 PM on November 26, 2020:

join

in src/test/transaction_tests.cpp:115 in 110239f2ff outdated

 110 | +        try
 111 | +        {
 112 | +            tx_valid = VerifyScript(tx.vin[i].scriptSig, mapprevOutScriptPubKeys.at(tx.vin[i].prevout), &tx.vin[i].scriptWitness, flags, TransactionSignatureChecker(&tx, i, amount, txdata), &err);
 113 | +        }
 114 | +        catch (...)
 115 | +        {

jnewbery commented at 5:45 PM on November 26, 2020:

join

in src/test/transaction_tests.cpp:232 in 110239f2ff outdated

 240 | -                                                 witness, verify_flags, TransactionSignatureChecker(&tx, i, amount, txdata), &err),
 241 | -                                    strTest);
 242 | -                BOOST_CHECK_MESSAGE(err == SCRIPT_ERR_OK, ScriptErrorString(err));
 243 | +            // Do not use TrimFlags() in the main test in order to detect invalid verifyFlags combination in the test file.
 244 | +            if (!CheckTxScripts(tx, mapprevOutScriptPubKeys, mapprevOutValues, ~verify_flags[0], txdata, strTest, /* expect_valid */ true))
 245 | +                BOOST_ERROR("Tx unexpectedly failed: " << strTest);

jnewbery commented at 5:45 PM on November 26, 2020:

braces

in src/test/transaction_tests.cpp:231 in 110239f2ff outdated

 243 | +            // Do not use TrimFlags() in the main test in order to detect invalid verifyFlags combination in the test file.
 244 | +            if (!CheckTxScripts(tx, mapprevOutScriptPubKeys, mapprevOutValues, ~verify_flags[0], txdata, strTest, /* expect_valid */ true))
 245 | +                BOOST_ERROR("Tx unexpectedly failed: " << strTest);
 246 | +            // Check that flags are maximal: transaction should fail if any unset flags are set.
 247 | +            for (size_t i = 0; i < mapFlagNames.size(); i++)
 248 | +            {

jnewbery commented at 5:46 PM on November 26, 2020:

join

in src/test/transaction_tests.cpp:62 in 110239f2ff outdated

  58 | @@ -59,10 +59,12 @@ static std::map<std::string, unsigned int> mapFlagNames = {
  59 |      {std::string("CONST_SCRIPTCODE"), (unsigned int)SCRIPT_VERIFY_CONST_SCRIPTCODE},
  60 |  };
  61 |  
  62 | -unsigned int ParseScriptFlags(std::string strFlags)
  63 | +std::vector<unsigned int> ParseScriptFlags(std::string strFlags)

jnewbery commented at 5:49 PM on November 26, 2020:

This is a really weird function. It'll now return a vector of:

first item is the sum of all the flags
then a list of the sum of all except one flag for each flag.

I think the function should be left unchanged and the client do whatever it needs with the sum of all the flags. At the very least, the function should be documented.

in src/test/transaction_tests.cpp:240 in 110239f2ff outdated

 247 | +            for (size_t i = 0; i < mapFlagNames.size(); i++)
 248 | +            {
 249 | +                // Backwards compatibility: Removing a flag should not invalidate a valid transaction
 250 | +                unsigned int flags = TrimFlags(~(verify_flags[0] | (1U << i)));
 251 | +                if (!CheckTxScripts(tx, mapprevOutScriptPubKeys, mapprevOutValues, flags, txdata, strTest, /* expect_valid */ true))
 252 | +                    BOOST_ERROR("Tx unexpectedly failed with flag " << ToString(i) << " unset: " << strTest);

jnewbery commented at 5:51 PM on November 26, 2020:

braces

in src/test/transaction_tests.cpp:242 in 110239f2ff outdated

 254 | +                flags = TrimFlags(~(verify_flags[0] | (unsigned int)InsecureRandBits(mapFlagNames.size())));
 255 | +                if (!CheckTxScripts(tx, mapprevOutScriptPubKeys, mapprevOutValues, flags, txdata, strTest, /* expect_valid */ true))
 256 | +                    BOOST_ERROR("Tx unexpectedly failed with random flags " << ToString(flags) << ": " << strTest);
 257 | +            }
 258 | +            for (size_t f = 1; f < verify_flags.size(); f++)
 259 | +            {

jnewbery commented at 5:51 PM on November 26, 2020:

join

jonatack commented at 4:18 PM on December 2, 2020:

@glozow if you run clang-format on your changes, it can avoid the need for all the formatting feedback

glozow commented at 4:44 PM on December 2, 2020:

oops 😅 thanks for advice @jonatack

michaelfolkson commented at 4:58 PM on December 2, 2020:

https://github.com/bitcoin/bitcoin/blob/master/contrib/devtools/README.md#clang-format-diffpy

in src/test/transaction_tests.cpp:241 in 110239f2ff outdated

 253 | +                // Backwards compatibility: Removing some random flags should not invalidate a valid transaction
 254 | +                flags = TrimFlags(~(verify_flags[0] | (unsigned int)InsecureRandBits(mapFlagNames.size())));
 255 | +                if (!CheckTxScripts(tx, mapprevOutScriptPubKeys, mapprevOutValues, flags, txdata, strTest, /* expect_valid */ true))
 256 | +                    BOOST_ERROR("Tx unexpectedly failed with random flags " << ToString(flags) << ": " << strTest);
 257 | +            }
 258 | +            for (size_t f = 1; f < verify_flags.size(); f++)

jnewbery commented at 5:51 PM on November 26, 2020:

Use range-based loop?

jonatack commented at 4:17 PM on December 2, 2020:

And otherwise, prefer ++f to f++

in src/test/transaction_tests.cpp:309 in 110239f2ff outdated

 305 | @@ -240,27 +306,34 @@ BOOST_AUTO_TEST_CASE(tx_invalid)
 306 |  
 307 |              TxValidationState state;
 308 |              fValid = CheckTransaction(tx, state) && state.IsValid();
 309 | +            if (!fValid) {

jnewbery commented at 5:53 PM on November 26, 2020:

join these lines and don't use the local variable fValid (it means 'the test file was malformed' above and 'the transaction failed validation' here).

jnewbery commented at 5:56 PM on November 26, 2020: member

I've left lots of style suggestions, which should be pretty easy to resolve.

It's very difficult to review this PR, since it's mixing refactors and fixes into the same commits. For example, the first commit is changing the format of the tx_valid.json file and making changes/fixes to the tests. I think it'd be far easier to review those different changes if they were split out into individual commits. The same is true for the other commits in this branch - each one is doing too much, making review more difficult than it needs to ne.

in src/test/data/tx_valid.json:307 in 1bb48b8257 outdated

 412 |  ["The argument can be calculated rather than created directly by a PUSHDATA"],
 413 | -[[["0000000000000000000000000000000000000000000000000000000000000100", 0, "4194303 1ADD CHECKSEQUENCEVERIFY 1"]],
 414 | -"020000000100010000000000000000000000000000000000000000000000000000000000000000000000000040000100000000000000000000000000", "P2SH,CHECKSEQUENCEVERIFY"],
 415 | -[[["0000000000000000000000000000000000000000000000000000000000000100", 0, "4194304 1SUB CHECKSEQUENCEVERIFY 1"]],
 416 | -"020000000100010000000000000000000000000000000000000000000000000000000000000000000000ffff00000100000000000000000000000000", "P2SH,CHECKSEQUENCEVERIFY"],
 417 | +[[["0000000000000000000000000000000000000000000000000000000000000100", 0, "4194303 1ADD CHECKSEQUENCEVERIFY"]],

murchandamus commented at 3:16 PM on December 2, 2020:

If the "Remove unnecessary OP_1 at the end of most OP_CLTV and OP_CSV tests" change would allow tests to pass before the other changes in this commit, it would be preferable to pull it out into a separate preceding commit.

glozow commented at 1:59 AM on December 7, 2020:

pulled them into their own commits

in src/test/data/tx_valid.json:313 in 1bb48b8257 outdated

 422 |  ["An ADD producing a 5-byte result that sets CTxIn::SEQUENCE_LOCKTIME_DISABLE_FLAG"],
 423 | -[[["0000000000000000000000000000000000000000000000000000000000000100", 0, "2147483647 65536 CHECKSEQUENCEVERIFY 1"]],
 424 | -"020000000100010000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000", "P2SH,CHECKSEQUENCEVERIFY"],
 425 | -[[["0000000000000000000000000000000000000000000000000000000000000100", 0, "2147483647 4259840 ADD CHECKSEQUENCEVERIFY 1"]],
 426 | -"020000000100010000000000000000000000000000000000000000000000000000000000000000000000000040000100000000000000000000000000", "P2SH,CHECKSEQUENCEVERIFY"],
 427 | +[[["0000000000000000000000000000000000000000000000000000000000000100", 0, "2147483647 65536 ADD CHECKSEQUENCEVERIFY"]],

murchandamus commented at 3:20 PM on December 2, 2020:

Similarly, if the ADD was missing and tests would pass, I would pull it out to a separate commit.

My motivation would be that you are making a lot of changes, and if I'm looking for multiple different reasons why those changes are made, it's adding steps to my parsing of every line as a reviewer, while it would be easier to parse a commit that has only one concern (the exclusion of verify flags rather than the inclusion thereof).

in src/test/data/tx_invalid.json:277 in 274fb7368e outdated

 272 | @@ -273,7 +273,7 @@
 273 |  [[["0000000000000000000000000000000000000000000000000000000000000100", 0, "0x51", 1000],
 274 |  ["0000000000000000000000000000000000000000000000000000000000000100", 1, "0x00 0x14 0x4c9c3dfac4207d5d8cb89df5722cb3d712385e3f", 2000],
 275 |  ["0000000000000000000000000000000000000000000000000000000000000100", 2, "0x51", 3000]],
 276 | -"0100000000010300010000000000000000000000000000000000000000000000000000000000000000000000ffffffff000100000000000000000000000000000000000000000000000000000000000001000000000100000000010000000000000000000000000000000000000000000000000000000000000200000000ffffffff00000248304502210091b32274295c2a3fa02f5bce92fb2789e3fc6ea947fbe1a76e52ea3f4ef2381a022079ad72aefa3837a2e0c033a8652a59731da05fa4a813f4fc48e87c075037256b822103596d3451025c19dbbdeb932d6bf8bfb4ad499b95b6f88db8899efac102e5fc710000000000", "P2SH,WITNESS"],
 277 | +"0100000000010300010000000000000000000000000000000000000000000000000000000000000000000000ffffffff000100000000000000000000000000000000000000000000000000000000000001000000000100000000010000000000000000000000000000000000000000000000000000000000000200000000ffffffff03e8030000000000000151d0070000000000000151b80b0000000000000151000248304502210091b32274295c2a3fa02f5bce92fb2789e3fc6ea947fbe1a76e52ea3f4ef2381a022079ad72aefa3837a2e0c033a8652a59731da05fa4a813f4fc48e87c075037256b822103596d3451025c19dbbdeb932d6bf8bfb4ad499b95b6f88db8899efac102e5fc710000000000", "P2SH,WITNESS"],

murchandamus commented at 3:23 PM on December 2, 2020:

Similarly to my previous comment, I'd propose to pull out changes to the transactions into a prior commit.

glozow commented at 1:59 AM on December 7, 2020:

also 👍

MarcoFalke commented at 11:33 AM on February 23, 2021:

#19698 (review)

Looks like this hasn't been fixed and breaks git-bisect.

Also, I am wondering why exactly 3 outputs? Wouldn't one suffice?

glozow commented at 2:35 PM on February 23, 2021:

I had interpreted that I was supposed to put the 2 tx_invalid test fixes into a separate commit (they were initially grouped with the "apply minimal flags" commit) so I thought I had done this, oops! Do you have the test failing? Should I do something to fix this now?

Also, I am wondering why exactly 3 outputs? Wouldn't one suffice?

Yeah I think so. I just pulled what was on jl2012's branch 😅

MarcoFalke commented at 3:06 PM on February 23, 2021:

It is not possible to rewrite our git history, so the failing commit will need to be kept as is.

in src/test/data/tx_invalid.json:380 in 274fb7368e outdated

 384 | -  "01000000016f3dbe2ca96fa217e94b1017860be49f20820dea5c91bdcb103b0049d5eb566000000000fd1d0147304402203989ac8f9ad36b5d0919d97fa0a7f70c5272abee3b14477dc646288a8b976df5022027d19da84a066af9053ad3d1d7459d171b7e3a80bc6c4ef7a330677a6be548140147304402203989ac8f9ad36b5d0919d97fa0a7f70c5272abee3b14477dc646288a8b976df5022027d19da84a066af9053ad3d1d7459d171b7e3a80bc6c4ef7a330677a6be548140121038479a0fa998cd35259a2ef0a7a5c68662c1474f88ccb6d08a7677bbec7f22041ac47304402203757e937ba807e4a5da8534c17f9d121176056406a6465054bdd260457515c1a02200f02eccf1bec0f3a0d65df37889143c2e88ab7acec61a7b6f5aa264139141a2b0121038479a0fa998cd35259a2ef0a7a5c68662c1474f88ccb6d08a7677bbec7f22041ffffffff010000000000000000016a00000000", "P2SH,CONST_SCRIPTCODE"],
 385 | +  "01000000016f3dbe2ca96fa217e94b1017860be49f20820dea5c91bdcb103b0049d5eb566000000000fd1d0147304402203989ac8f9ad36b5d0919d97fa0a7f70c5272abee3b14477dc646288a8b976df5022027d19da84a066af9053ad3d1d7459d171b7e3a80bc6c4ef7a330677a6be548140147304402203989ac8f9ad36b5d0919d97fa0a7f70c5272abee3b14477dc646288a8b976df5022027d19da84a066af9053ad3d1d7459d171b7e3a80bc6c4ef7a330677a6be548140121038479a0fa998cd35259a2ef0a7a5c68662c1474f88ccb6d08a7677bbec7f22041ac47304402203757e937ba807e4a5da8534c17f9d121176056406a6465054bdd260457515c1a02200f02eccf1bec0f3a0d65df37889143c2e88ab7acec61a7b6f5aa264139141a2b0121038479a0fa998cd35259a2ef0a7a5c68662c1474f88ccb6d08a7677bbec7f22041ffffffff010000000000000000016a00000000", "CONST_SCRIPTCODE"],
 386 |  
 387 |  [[["5a6b0021a6042a686b6b94abc36b387bef9109847774e8b1e51eb8cc55c53921", 1, "DUP HASH160 0x14 0xee5a6aa40facefb2655ac23c0c28c57c65c41f9b EQUALVERIFY CHECKSIG"]],
 388 | -  "01000000012139c555ccb81ee5b1e87477840991ef7b386bc3ab946b6b682a04a621006b5a01000000fdb40148304502201723e692e5f409a7151db386291b63524c5eb2030df652b1f53022fd8207349f022100b90d9bbf2f3366ce176e5e780a00433da67d9e5c79312c6388312a296a5800390148304502201723e692e5f409a7151db386291b63524c5eb2030df652b1f53022fd8207349f022100b90d9bbf2f3366ce176e5e780a00433da67d9e5c79312c6388312a296a5800390121038479a0fa998cd35259a2ef0a7a5c68662c1474f88ccb6d08a7677bbec7f2204148304502201723e692e5f409a7151db386291b63524c5eb2030df652b1f53022fd8207349f022100b90d9bbf2f3366ce176e5e780a00433da67d9e5c79312c6388312a296a5800390175ac4830450220646b72c35beeec51f4d5bc1cbae01863825750d7f490864af354e6ea4f625e9c022100f04b98432df3a9641719dbced53393022e7249fb59db993af1118539830aab870148304502201723e692e5f409a7151db386291b63524c5eb2030df652b1f53022fd8207349f022100b90d9bbf2f3366ce176e5e780a00433da67d9e5c79312c6388312a296a580039017521038479a0fa998cd35259a2ef0a7a5c68662c1474f88ccb6d08a7677bbec7f22041ffffffff010000000000000000016a00000000", "P2SH,CONST_SCRIPTCODE"],
 389 | +  "01000000012139c555ccb81ee5b1e87477840991ef7b386bc3ab946b6b682a04a621006b5a01000000fdb40148304502201723e692e5f409a7151db386291b63524c5eb2030df652b1f53022fd8207349f022100b90d9bbf2f3366ce176e5e780a00433da67d9e5c79312c6388312a296a5800390148304502201723e692e5f409a7151db386291b63524c5eb2030df652b1f53022fd8207349f022100b90d9bbf2f3366ce176e5e780a00433da67d9e5c79312c6388312a296a5800390121038479a0fa998cd35259a2ef0a7a5c68662c1474f88ccb6d08a7677bbec7f2204148304502201723e692e5f409a7151db386291b63524c5eb2030df652b1f53022fd8207349f022100b90d9bbf2f3366ce176e5e780a00433da67d9e5c79312c6388312a296a5800390175ac4830450220646b72c35beeec51f4d5bc1cbae01863825750d7f490864af354e6ea4f625e9c022100f04b98432df3a9641719dbced53393022e7249fb59db993af1118539830aab870148304502201723e692e5f409a7151db386291b63524c5eb2030df652b1f53022fd8207349f022100b90d9bbf2f3366ce176e5e780a00433da67d9e5c79312c6388312a296a580039017521038479a0fa998cd35259a2ef0a7a5c68662c1474f88ccb6d08a7677bbec7f22041ffffffff010000000000000000016a00000000", "CONST_SCRIPTCODE"],

murchandamus commented at 3:24 PM on December 2, 2020:

See above.

in src/test/transaction_tests.cpp:192 in 274fb7368e outdated

 187 | @@ -188,7 +188,8 @@ BOOST_AUTO_TEST_CASE(tx_invalid)
 188 |      // or [[[prevout hash, prevout index, prevout scriptPubKey], [input 2], ...],"], serializedTransaction, verifyFlags
 189 |      // ... where all scripts are stringified scripts.
 190 |      //
 191 | -    // verifyFlags is a comma separated list of script verification flags to apply, or "NONE"
 192 | +    // verifyFlags is a comma separated list of MINIMAL script verification flags to apply,
 193 | +    // or "NONE" to not apply any, or "BADTX" if it is expected to fail CheckTransaction().

murchandamus commented at 3:33 PM on December 2, 2020:

Optional: "verifyFlags is the minimal comma separated list of script verification flags that are necessary for the invalid transaction to fail verification. Use "NONE" when no flags need to be applied, and use a solitary "BADTX" if the transaction is expected to fail CheckTransaction()."

in src/test/transaction_tests.cpp:107 in 1bb48b8257 outdated

 103 | @@ -104,7 +104,7 @@ BOOST_AUTO_TEST_CASE(tx_valid)
 104 |      // or [[[prevout hash, prevout index, prevout scriptPubKey], [input 2], ...],"], serializedTransaction, verifyFlags
 105 |      // ... where all scripts are stringified scripts.
 106 |      //
 107 | -    // verifyFlags is a comma separated list of script verification flags to apply, or "NONE"
 108 | +    // verifyFlags is a comma separated list of MINIMAL script verification flags to exclude, or "NONE"

murchandamus commented at 3:36 PM on December 2, 2020:

Optional: Consider renaming verifyFlags to disabledVerifyFlags or similar.

in src/test/transaction_tests.cpp:83 in 450872d670 outdated

  80 | +    ret.push_back(flags);
  81 | +    for (const std::string& word : words)
  82 | +    {
  83 | +        ret.push_back(flags & ~mapFlagNames[word]);
  84 | +    }
  85 | +    return ret;

murchandamus commented at 3:53 PM on December 2, 2020:

Maybe it would be better to have a separate function that returns the names of the flags from a given sum of all flags.

murchandamus commented at 4:04 PM on December 2, 2020: contributor

Concept ACK

TBH, I could use a bit more explanation in the fourth commit message "Verify that all validation flags are backward compatible". Could you go a bit more into detail what this is doing and why?

glozow force-pushed on Dec 7, 2020

glozow commented at 3:07 PM on December 7, 2020: member

Thanks for the review @laanwj @jnewbery @benthecarman @Xekyo @jonatack :) addressed your comments and split the PR up into more, dedicated commits. CI is green 🟢 ready for review again!

in src/test/transaction_tests.cpp:99 in ed0b99e2d2 outdated

  91 | @@ -95,14 +92,83 @@ std::string FormatScriptFlags(unsigned int flags)
  92 |      return ret.substr(0, ret.size() - 1);
  93 |  }
  94 |  
  95 | +/*
  96 | +* Check that the input scripts of a transaction are valid/invalid as expected.
  97 | +*/
  98 | +bool CheckTxScripts(const CTransaction& tx, const std::map<COutPoint, CScript>& mapprevOutScriptPubKeys,
  99 | +    const std::map<COutPoint, int64_t>& mapprevOutValues, unsigned int flags,

murchandamus commented at 6:15 PM on December 7, 2020:

Nit: map_p_rev. Also, I seem to remember that new variables should preferably use snake case.

jonatack commented at 6:43 PM on December 7, 2020:

That's true (doc/developer-notes.md)

jonatack commented at 6:47 PM on December 7, 2020:

flags and expect_valid now passed by value, thanks :+1:

glozow commented at 6:47 PM on December 7, 2020:

o tru

murchandamus commented at 6:22 PM on December 7, 2020: contributor

Looks good to me, just the one nit from me at the moment.

in src/test/transaction_tests.cpp:105 in 398151b81f outdated

 100 | +    const std::map<COutPoint, int64_t>& mapprevOutValues, unsigned int flags,
 101 | +    const PrecomputedTransactionData& txdata, const std::string& strTest, bool expect_valid)
 102 | +{
 103 | +    bool tx_valid = true;
 104 | +    ScriptError err = expect_valid ? SCRIPT_ERR_UNKNOWN_ERROR : SCRIPT_ERR_OK;
 105 | +    for (unsigned int i = 0; i < tx.vin.size() && tx_valid; i++) {

jonatack commented at 6:44 PM on December 7, 2020:

drive-by nit if you retouch: s/i++/++i/ (see developer-notes.md)

glozow force-pushed on Jan 26, 2021

glozow commented at 10:25 PM on January 26, 2021: member

Attempting to revive this again 😅 🙏 Rebased on master since it's been a while and applied the style suggestions from @jonatack and @Xekyo. I had also forgotten to add the original author to the intermediate commits when I split up the changes - my sincerest apologies - that's fixed now.

laanwj commented at 7:35 AM on January 27, 2021: member

To test the tests I made a small change to tx_valid.json (basically reverting the initial commits):

 ["An ADD producing a 5-byte result that sets CTxIn::SEQUENCE_LOCKTIME_DISABLE_FLAG"],
-[[["0000000000000000000000000000000000000000000000000000000000000100", 0, "2147483647 65536 ADD CHECKSEQUENCEVERIFY"]],
+[[["0000000000000000000000000000000000000000000000000000000000000100", 0, "2147483647 65536 CHECKSEQUENCEVERIFY"]],

The good thing is that it easily detects this. However, an avalanche of 63 failures for a single error might be a bit overkill :sweat_smile: It does report the JSON of the failed test, which is good!

The following change, however

 [[["0000000000000000000000000000000000000000000000000000000000000100", 0, "2147483647 65536 ADD CHECKSEQUENCEVERIFY"]],
-"020000000100010000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000", "NONE"],
+"020000000100010000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000", "P2SH,CHECKSEQUENCEVERIFY"],

raises an assertion

test_bitcoin: …/bitcoin/src/script/interpreter.cpp:2044: bool VerifyScript(const CScript &, const CScript &, const CScriptWitness *, unsigned int, const BaseSignatureChecker &, ScriptError *): Assertion `(flags & SCRIPT_VERIFY_P2SH) != 0' failed.

Then reports 7 failures in different places in the code. It does not report which specific test in the JSON failed.

Not sure if these are a problem, I mean, the tests fail when they're supposed to fail, but just thought I'd report it.

[test] remove unnecessary OP_1s from CSV and CLTV tests