script: return verification flag responsible for error upon validation failure #33012

pull darosior wants to merge 5 commits into bitcoin:master from darosior:2507_script_errors changing 19 files +335 −269
  1. darosior commented at 7:19 pm on July 18, 2025: member

    For unconfirmed transactions, we currently distinguish between Script validation failures due to standardness rules and those due to consensus rules. The distinction is used to decide whether to disconnect the peer which submitted this transaction.

    This is currently achieved by repeating Script validation with only the consensus verification flags, after it failed with the standardness flags. It is wasteful, and potentially significantly more so since we have Taproot. This PR proposes to replace this detection by having Script validation return the verification flag responsible for the failure instead.

    Note this is a slight behaviour change, as a consensus-related Script validation failure that happens after a standardness-related Script validation failure would not be treated as a consensus error anymore (and consequentially the peer not disconnected). I don’t think this is an issue because we already do not treat all consensus-invalid transactions we get submitted as consensus errors. For instance, a non-standard transaction (for any reason other than its Scripts) with a consensus-invalid Script will be treated as a standardness error. Same for a transaction with a consensus-invalid Script in a later input than another with a non-standard Script. In these conditions, a Script that is consensus-invalid after being standardness-invalid is an edge case that seems unnecessary to detect. It also seems to me that the fact this edge case is detected at all is more an artifact of the existing logic (introduced in 97e7901a3a74 13 years ago) than a design decision. For all these reasons, this slight behaviour change seems fine to me (and definitely worth it when considering the upside of not having to run Script validation twice).

    This change is quite verbose. If reviewers want i can try to make it less so, but i first wanted to open this as draft to get feedback on the broader approach of returning errors from Script validation directly instead of having to run it multiple times with different flags to distinguish the cause of an error.

  2. DrahtBot added the label Consensus on Jul 18, 2025
  3. DrahtBot commented at 7:19 pm on July 18, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33012.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32998 (Bump SCRIPT_VERIFY flags to 64 bit by ajtowns)
    • #32575 (refactor: Split multithreaded case out of CheckInputScripts by fjahr)
    • #32473 (Introduce per-txin sighash midstate cache for legacy/p2sh/segwitv0 scripts by sipa)
    • #32247 (BIP-348 (OP_CHECKSIGFROMSTACK) (regtest only) by jamesob)
    • #32143 (Fix 11-year-old mis-categorized error code in OP_IF evaluation by cculianu)
    • #31713 (miniscript refactor: Remove unique_ptr-indirection (#30866 follow-up) by hodlinator)
    • #31650 (refactor: Avoid copies by using const references or by move-construction by maflcko)
    • #31576 (test: Move script_assets_tests into its own suite by hebasto)
    • #29247 (CAT in Tapscript (BIP-347) by arminsabouri)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. darosior force-pushed on Jul 18, 2025
  5. DrahtBot added the label CI failed on Jul 18, 2025
  6. DrahtBot commented at 8:11 pm on July 18, 2025: contributor

    🚧 At least one of the CI tasks failed. Task TSan, depends, gui: https://github.com/bitcoin/bitcoin/runs/46284608490 LLM reason (✨ experimental): The CI failure is caused by a compilation error in verify_script.cpp due to an incorrect call to EvalScript with mismatched argument types.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  7. DrahtBot removed the label CI failed on Jul 18, 2025
  8. ajtowns commented at 5:44 am on July 20, 2025: contributor

    Note this is a slight behaviour change, as a consensus-related Script validation failure that happens after a standardness-related Script validation failure would not be treated as a consensus error anymore (and consequentially the peer not disconnected).

    I’m skeptical whether this behaviour is really worth preserving in a limited fashion? With this change, an attacker can waste your resources without being discouraged or risking having to pay tx fees by making a consensus invalid tx that fails standardness before the consensus failure is seen; and any innocent third parties are more likely to waste your resources by relaying txs under different standardness policies, and won’t be penalised at all.

    I think getting different errors for standardness vs consensus failures can be useful for debugging (and perhaps functional tests or network behaviour monitoring) so could see it being worthwhile to retain a (default off) config option to continue testing transactions twice.

  9. darosior commented at 1:35 pm on July 21, 2025: member

    The behaviour already exists in a limited fashion. Even without this change, an attacker can trivially create such a transaction (as discussed in OP). Here is for instance a p2p_invalid_tx.py case which passes both with or without this PR:

     0diff --git a/test/functional/data/invalid_txs.py b/test/functional/data/invalid_txs.py
     1index 36b274efc27..46bb2bba367 100644
     2--- a/test/functional/data/invalid_txs.py
     3+++ b/test/functional/data/invalid_txs.py
     4@@ -286,6 +286,22 @@ class NonStandardAndInvalid(BadTxTemplate):
     5             self.spend_tx, 0, script_sig=b'\x00' * 3 + b'\xab\x6a',
     6             amount=(self.spend_avail // 2))
     7 
     8+
     9+class FirstInNonStdSecondInInvalid(BadTxTemplate):
    10+    """A transaction with first a standardness error and then a consensus error."""
    11+    reject_reason = "non-mandatory-script-verify-flag (Using OP_CODESEPARATOR in non-witness script)"
    12+    # NOTE: feature_block would need to be tweaked for the spent tx to have two outputs.
    13+    block_reject_reason = "mandatory-script-verify-flag-failed (OP_RETURN was encountered)"
    14+    expect_disconnect = False
    15+    valid_in_block = False
    16+
    17+    def get_tx(self):
    18+        tx = create_tx_with_script(self.spend_tx, 0, script_sig=b"", amount=(self.spend_avail // 2))
    19+        tx.vin.append(CTxIn(COutPoint(self.spend_tx.txid_int, 1)))
    20+        tx.vin[0].scriptSig = b"\xab"  # A CODESEP in the first input (non-standard)
    21+        tx.vin[1].scriptSig = b"\x6a"  # An OP_RETURN in the second input (invalid)
    22+        return tx
    23+
    24 # Disabled opcode tx templates (CVE-2010-5137)
    25 DisabledOpcodeTemplates = [getDisabledOpcodeTemplate(opcode) for opcode in [
    26     OP_CAT,
    27diff --git a/test/functional/p2p_invalid_tx.py b/test/functional/p2p_invalid_tx.py
    28index ae9771e7cb4..33082337cc8 100755
    29--- a/test/functional/p2p_invalid_tx.py
    30+++ b/test/functional/p2p_invalid_tx.py
    31@@ -14,6 +14,7 @@ from test_framework.messages import (
    32     CTxOut,
    33 )
    34 from test_framework.p2p import P2PDataStore
    35+from test_framework.script import CScript, OP_TRUE
    36 from test_framework.test_framework import BitcoinTestFramework
    37 from test_framework.util import (
    38     assert_equal,
    39@@ -56,7 +57,7 @@ class InvalidTxRequestTest(BitcoinTestFramework):
    40 
    41         self.log.info("Create a new block with an anyone-can-spend coinbase.")
    42         height = 1
    43-        block = create_block(tip, create_coinbase(height), block_time)
    44+        block = create_block(tip, create_coinbase(height, extra_output_script=CScript([OP_TRUE])), block_time)
    45         block.solve()
    46         # Save the coinbase for later
    47         block1 = block
    

    I don’t think this logic is actually meant to prevent an attacker from wasting resources anyways. I think it’s rather to make sure we don’t split the network when introducing new Script-runtime standardness rules. And this change doesn’t affect this.

  10. script: turn ScriptError into a proper C++ enum
    typedef enum is a C idiom which is unnecessary in C++
    c9a906f777
  11. script: rename ScriptError to ScriptErrorType f7cdad3e15
  12. darosior force-pushed on Jul 21, 2025
  13. darosior commented at 2:35 pm on July 21, 2025: member
    Rebased to fix a silent merge conflict that was making the “test each commit” CI job fail.
  14. in src/script/interpreter.cpp:209 in b8c2777c07 outdated
    206     }
    207     if ((flags & (SCRIPT_VERIFY_DERSIG | SCRIPT_VERIFY_LOW_S | SCRIPT_VERIFY_STRICTENC)) != 0 && !IsValidSignatureEncoding(vchSig)) {
    208-        return set_error(serror, SCRIPT_ERR_SIG_DER);
    209+        // If any of the three flags are set, stronger ones take precedence in the error.
    210+        uint32_t err_flag{flags & SCRIPT_VERIFY_DERSIG};
    211+        if (err_flag == 0) err_flag &= SCRIPT_VERIFY_LOW_S;
    


    sipa commented at 3:02 pm on July 21, 2025:
    This has no effect. If err_flag is already zero, then err_flah &= ...; will not change that.

    darosior commented at 3:09 pm on July 21, 2025:
    Oops, thanks. Of course i meant |= here.

    darosior commented at 3:09 pm on July 21, 2025:
    There is a couple other occurrences where i must have made the same mistake.

    darosior commented at 4:40 pm on July 22, 2025:
    Well, apparently that was the only place. Fixed.
  15. script: return offending Script verification flag upon error cc615fa359
  16. qa: demonstrate existing distinction between consensus/standardness errors
    We distinguish between consensus and standardness errors in validating a transaction to decide
    whether to disconnect the peer. When a transaction is both consensus-invalid and non-standard, we
    sometimes treat it as a consensus failure and sometimes not. This commit adds additional cases to
    invalid_txs.py to clarify the distinction and highlight the behaviour change in the next commit.
    
    We treat a non-standard transaction with an invalid Script as non-standard (and not disconnect the
    peer), and always treat a standard transaction with an invalid Script (whether or not the Script
    itself is also non-standard) as consensus. (That is, as long as no non-standard Script is encountered
    before.)
    46ce07469e
  17. darosior force-pushed on Jul 22, 2025
  18. DrahtBot added the label CI failed on Jul 22, 2025
  19. validation: use Script validation error to distinguish standardness and standardness errors
    Change CScriptCheck() to return both the Script error type and the Script verification flag which
    caused the failure. This prevents having to run Script validation a second time upon failure.
    
    It does change behaviour slightly, in that a Script consensus failure after a Script standardness
    failure won't be detected anymore, and therefore won't lead to a disconnect. This is arguably by
    design though: this can only ever be detected by running Script validation without the standardness
    verification flags, which we are trying to avoid here.
    648e5e2d54
  20. darosior force-pushed on Jul 22, 2025
  21. ajtowns commented at 0:52 am on July 23, 2025: contributor

    The behaviour already exists in a limited fashion. Even without this change, an attacker can trivially create such a transaction (as discussed in OP). Here is for instance a p2p_invalid_tx.py case which passes both with or without this PR:

    Okay, but that just means the extra work isn’t providing much value today either?

    I don’t think this logic is actually meant to prevent an attacker from wasting resources anyways. I think it’s rather to make sure we don’t split the network when introducing new Script-runtime standardness rules. And this change doesn’t affect this.

    I’m suggesting we should not discourage for consensus invalid txs either, since we can’t do that both consistently and efficiently – ie, drop MaybePunishNodeForTx() entirely. That would also have the benefit of not having to worry about potential network splits due to txs violating new consensus rules that never actually make it into a block (cf #31269 and #26291). See https://github.com/ajtowns/bitcoin/commits/202507-nopunishtx/

  22. DrahtBot removed the label CI failed on Jul 23, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-07-23 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me