script: return verification flag responsible for error upon validation failure #33012

pull darosior wants to merge 5 commits into bitcoin:master from darosior:2507_script_errors changing 19 files +335 −269
  1. darosior commented at 7:19 pm on July 18, 2025: member

    For 30.0 please consider reviewing #33050 instead of this PR. This can still be considered independently later on, but #33050 is a smaller patch to get the desired effects which should be easier to get in before feature freeze.

    For unconfirmed transactions, we currently distinguish between Script validation failures due to standardness rules and those due to consensus rules. The distinction is used to decide whether to disconnect the peer which submitted this transaction.

    This is currently achieved by repeating Script validation with only the consensus verification flags, after it failed with the standardness flags. It is wasteful, and potentially significantly more so since we have Taproot. This PR proposes to replace this detection by having Script validation return the verification flag responsible for the failure instead.

    Note this is a slight behaviour change, as a consensus-related Script validation failure that happens after a standardness-related Script validation failure would not be treated as a consensus error anymore (and consequentially the peer not disconnected). I don’t think this is an issue because we already do not treat all consensus-invalid transactions we get submitted as consensus errors. For instance, a non-standard transaction (for any reason other than its Scripts) with a consensus-invalid Script will be treated as a standardness error. Same for a transaction with a consensus-invalid Script in a later input than another with a non-standard Script. In these conditions, a Script that is consensus-invalid after being standardness-invalid is an edge case that seems unnecessary to detect. It also seems to me that the fact this edge case is detected at all is more an artifact of the existing logic (introduced in 97e7901a3a74 13 years ago) than a design decision. For all these reasons, this slight behaviour change seems fine to me (and definitely worth it when considering the upside of not having to run Script validation twice).

    This change is quite verbose. If reviewers want i can try to make it less so, but i first wanted to open this as draft to get feedback on the broader approach of returning errors from Script validation directly instead of having to run it multiple times with different flags to distinguish the cause of an error.

  2. DrahtBot added the label Consensus on Jul 18, 2025
  3. DrahtBot commented at 7:19 pm on July 18, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33012.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #33050 (net, validation: don’t punish peers for consensus-invalid txs by ajtowns)
    • #32998 (Bump SCRIPT_VERIFY flags to 64 bit by ajtowns)
    • #32575 (refactor: Split multithreaded case out of CheckInputScripts by fjahr)
    • #32473 (Introduce per-txin sighash midstate cache for legacy/p2sh/segwitv0 scripts by sipa)
    • #32247 (BIP-348 (OP_CHECKSIGFROMSTACK) (regtest only) by jamesob)
    • #32143 (Fix 11-year-old mis-categorized error code in OP_IF evaluation by cculianu)
    • #31713 (miniscript refactor: Remove unique_ptr-indirection (#30866 follow-up) by hodlinator)
    • #31576 (test: Move script_assets_tests into its own suite by hebasto)
    • #29491 ([EXPERIMENTAL] Schnorr batch verification for blocks by fjahr)
    • #29247 (CAT in Tapscript (BIP-347) by arminsabouri)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. darosior force-pushed on Jul 18, 2025
  5. DrahtBot added the label CI failed on Jul 18, 2025
  6. DrahtBot commented at 8:11 pm on July 18, 2025: contributor

    🚧 At least one of the CI tasks failed. Task TSan, depends, gui: https://github.com/bitcoin/bitcoin/runs/46284608490 LLM reason (✨ experimental): The CI failure is caused by a compilation error in verify_script.cpp due to an incorrect call to EvalScript with mismatched argument types.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  7. DrahtBot removed the label CI failed on Jul 18, 2025
  8. ajtowns commented at 5:44 am on July 20, 2025: contributor

    Note this is a slight behaviour change, as a consensus-related Script validation failure that happens after a standardness-related Script validation failure would not be treated as a consensus error anymore (and consequentially the peer not disconnected).

    I’m skeptical whether this behaviour is really worth preserving in a limited fashion? With this change, an attacker can waste your resources without being discouraged or risking having to pay tx fees by making a consensus invalid tx that fails standardness before the consensus failure is seen; and any innocent third parties are more likely to waste your resources by relaying txs under different standardness policies, and won’t be penalised at all.

    I think getting different errors for standardness vs consensus failures can be useful for debugging (and perhaps functional tests or network behaviour monitoring) so could see it being worthwhile to retain a (default off) config option to continue testing transactions twice.

  9. darosior commented at 1:35 pm on July 21, 2025: member

    The behaviour already exists in a limited fashion. Even without this change, an attacker can trivially create such a transaction (as discussed in OP). Here is for instance a p2p_invalid_tx.py case which passes both with or without this PR:

     0diff --git a/test/functional/data/invalid_txs.py b/test/functional/data/invalid_txs.py
     1index 36b274efc27..46bb2bba367 100644
     2--- a/test/functional/data/invalid_txs.py
     3+++ b/test/functional/data/invalid_txs.py
     4@@ -286,6 +286,22 @@ class NonStandardAndInvalid(BadTxTemplate):
     5             self.spend_tx, 0, script_sig=b'\x00' * 3 + b'\xab\x6a',
     6             amount=(self.spend_avail // 2))
     7 
     8+
     9+class FirstInNonStdSecondInInvalid(BadTxTemplate):
    10+    """A transaction with first a standardness error and then a consensus error."""
    11+    reject_reason = "non-mandatory-script-verify-flag (Using OP_CODESEPARATOR in non-witness script)"
    12+    # NOTE: feature_block would need to be tweaked for the spent tx to have two outputs.
    13+    block_reject_reason = "mandatory-script-verify-flag-failed (OP_RETURN was encountered)"
    14+    expect_disconnect = False
    15+    valid_in_block = False
    16+
    17+    def get_tx(self):
    18+        tx = create_tx_with_script(self.spend_tx, 0, script_sig=b"", amount=(self.spend_avail // 2))
    19+        tx.vin.append(CTxIn(COutPoint(self.spend_tx.txid_int, 1)))
    20+        tx.vin[0].scriptSig = b"\xab"  # A CODESEP in the first input (non-standard)
    21+        tx.vin[1].scriptSig = b"\x6a"  # An OP_RETURN in the second input (invalid)
    22+        return tx
    23+
    24 # Disabled opcode tx templates (CVE-2010-5137)
    25 DisabledOpcodeTemplates = [getDisabledOpcodeTemplate(opcode) for opcode in [
    26     OP_CAT,
    27diff --git a/test/functional/p2p_invalid_tx.py b/test/functional/p2p_invalid_tx.py
    28index ae9771e7cb4..33082337cc8 100755
    29--- a/test/functional/p2p_invalid_tx.py
    30+++ b/test/functional/p2p_invalid_tx.py
    31@@ -14,6 +14,7 @@ from test_framework.messages import (
    32     CTxOut,
    33 )
    34 from test_framework.p2p import P2PDataStore
    35+from test_framework.script import CScript, OP_TRUE
    36 from test_framework.test_framework import BitcoinTestFramework
    37 from test_framework.util import (
    38     assert_equal,
    39@@ -56,7 +57,7 @@ class InvalidTxRequestTest(BitcoinTestFramework):
    40 
    41         self.log.info("Create a new block with an anyone-can-spend coinbase.")
    42         height = 1
    43-        block = create_block(tip, create_coinbase(height), block_time)
    44+        block = create_block(tip, create_coinbase(height, extra_output_script=CScript([OP_TRUE])), block_time)
    45         block.solve()
    46         # Save the coinbase for later
    47         block1 = block
    

    I don’t think this logic is actually meant to prevent an attacker from wasting resources anyways. I think it’s rather to make sure we don’t split the network when introducing new Script-runtime standardness rules. And this change doesn’t affect this.

  10. script: turn ScriptError into a proper C++ enum
    typedef enum is a C idiom which is unnecessary in C++
    c9a906f777
  11. script: rename ScriptError to ScriptErrorType f7cdad3e15
  12. darosior force-pushed on Jul 21, 2025
  13. darosior commented at 2:35 pm on July 21, 2025: member
    Rebased to fix a silent merge conflict that was making the “test each commit” CI job fail.
  14. in src/script/interpreter.cpp:209 in b8c2777c07 outdated
    206     }
    207     if ((flags & (SCRIPT_VERIFY_DERSIG | SCRIPT_VERIFY_LOW_S | SCRIPT_VERIFY_STRICTENC)) != 0 && !IsValidSignatureEncoding(vchSig)) {
    208-        return set_error(serror, SCRIPT_ERR_SIG_DER);
    209+        // If any of the three flags are set, stronger ones take precedence in the error.
    210+        uint32_t err_flag{flags & SCRIPT_VERIFY_DERSIG};
    211+        if (err_flag == 0) err_flag &= SCRIPT_VERIFY_LOW_S;
    


    sipa commented at 3:02 pm on July 21, 2025:
    This has no effect. If err_flag is already zero, then err_flah &= ...; will not change that.

    darosior commented at 3:09 pm on July 21, 2025:
    Oops, thanks. Of course i meant |= here.

    darosior commented at 3:09 pm on July 21, 2025:
    There is a couple other occurrences where i must have made the same mistake.

    darosior commented at 4:40 pm on July 22, 2025:
    Well, apparently that was the only place. Fixed.
  15. script: return offending Script verification flag upon error cc615fa359
  16. qa: demonstrate existing distinction between consensus/standardness errors
    We distinguish between consensus and standardness errors in validating a transaction to decide
    whether to disconnect the peer. When a transaction is both consensus-invalid and non-standard, we
    sometimes treat it as a consensus failure and sometimes not. This commit adds additional cases to
    invalid_txs.py to clarify the distinction and highlight the behaviour change in the next commit.
    
    We treat a non-standard transaction with an invalid Script as non-standard (and not disconnect the
    peer), and always treat a standard transaction with an invalid Script (whether or not the Script
    itself is also non-standard) as consensus. (That is, as long as no non-standard Script is encountered
    before.)
    46ce07469e
  17. darosior force-pushed on Jul 22, 2025
  18. DrahtBot added the label CI failed on Jul 22, 2025
  19. validation: use Script validation error to distinguish standardness and standardness errors
    Change CScriptCheck() to return both the Script error type and the Script verification flag which
    caused the failure. This prevents having to run Script validation a second time upon failure.
    
    It does change behaviour slightly, in that a Script consensus failure after a Script standardness
    failure won't be detected anymore, and therefore won't lead to a disconnect. This is arguably by
    design though: this can only ever be detected by running Script validation without the standardness
    verification flags, which we are trying to avoid here.
    648e5e2d54
  20. darosior force-pushed on Jul 22, 2025
  21. ajtowns commented at 0:52 am on July 23, 2025: contributor

    The behaviour already exists in a limited fashion. Even without this change, an attacker can trivially create such a transaction (as discussed in OP). Here is for instance a p2p_invalid_tx.py case which passes both with or without this PR:

    Okay, but that just means the extra work isn’t providing much value today either?

    I don’t think this logic is actually meant to prevent an attacker from wasting resources anyways. I think it’s rather to make sure we don’t split the network when introducing new Script-runtime standardness rules. And this change doesn’t affect this.

    I’m suggesting we should not discourage for consensus invalid txs either, since we can’t do that both consistently and efficiently – ie, drop MaybePunishNodeForTx() entirely. That would also have the benefit of not having to worry about potential network splits due to txs violating new consensus rules that never actually make it into a block (cf #31269 and #26291). See https://github.com/ajtowns/bitcoin/commits/202507-nopunishtx/

  22. DrahtBot removed the label CI failed on Jul 23, 2025
  23. darosior commented at 7:50 pm on July 23, 2025: member
    That doesn’t seem too unreasonable to stop disconnecting on receiving a transaction which fails Script validation due to a consensus verification flag. However it’s not clear that it’s necessarily better. It doesn’t help in adversarial situations, but may still avoid wasting resources on innocently misbehaving peers. With this PR, opportunistically detecting such peers and evicting them comes at virtually no cost (unlike the current situation). So i suppose the tradeoff is between “maintain the opportunistic behaviour” and “avoid the cost of reviewing and merging this PR”.
  24. darosior commented at 8:29 pm on July 23, 2025: member
    One example where the opportunistic disconnection could be useful is in the case of a hardfork coin with signature replay protection but still connecting to Bitcoin peers. It happened in the past but admittedly seems very unlikely nowadays.
  25. ajtowns commented at 1:14 am on July 24, 2025: contributor

    One example where the opportunistic disconnection could be useful is in the case of a hardfork coin

    In that case you’ll already get a network split when the hard fork side mines a block.

    It doesn’t help in adversarial situations, but may still avoid wasting resources on innocently misbehaving peers. With this PR, opportunistically detecting such peers and evicting them comes at virtually no cost (unlike the current situation).

    Disconnecting peers is only useful in adversarial situations – keeping slightly buggy but otherwise honest peers in sync with our view of the world is a desirable outcome.

    A particular example of where this makes the network more robust is when we take currently standard txs and make them consensus invalid: if we ban on consensus invalid txs, then old nodes (that consider the tx standard) and new nodes (that consider the tx invalid) need to be bridged by intermediate nodes (that consider it valid but non-standard). If we simply ignore invalid txs, making the network more fragile.

    So i suppose the tradeoff is between “maintain the opportunistic behaviour” and “avoid the cost of reviewing and merging this PR”.

    Someone recently made the point that we already have a lot of code to maintain in bitcoin core, and that perhaps it would be beneficial to focus our attention on the parts that are providing the most benefit…

  26. darosior commented at 1:07 pm on July 24, 2025: member

    Someone recently made the point that we already have a lot of code to maintain in bitcoin core

    A wise man! More seriously, i was merely pointing out the tradeoff here. Not saying this PR is the better alternative to #33050. I am out of rationalization for the existing behaviour anyways, so might as well take a red diff over a green one.

  27. DrahtBot added the label Needs rebase on Jul 26, 2025
  28. DrahtBot commented at 0:11 am on July 26, 2025: contributor
    🐙 This pull request conflicts with the target branch and needs rebase.
  29. darosior commented at 2:04 pm on July 29, 2025: member

    In that case you’ll already get a network split when the hard fork side mines a block.

    Discussed this with AJ (and others). For reference, in this case the disconnection would happen after a second block being submitted on top a first invalid one. We would not disconnect upon a first invalid block being submitted through compact blocks. It turns out that this logic was not tested at all, so i opened #33083 which exercises various cases in MaybePunishNodeForBlock.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-08-15 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me