Align legacy script policy with P2SH policy in AreInputsStandard

151henry151 commented at 7:03 pm on November 22, 2025: contributor

Legacy scripts currently require solvability, while P2SH scripts only check sigop count. This asymmetry causes some non-solvable legacy scripts to be rejected even when they have acceptable sigop counts, while equivalent P2SH scripts would be accepted.

This change updates AreInputsStandard to check sigop count for legacy scripts instead of requiring solvability, matching P2SH behavior. Non-solvable legacy scripts with sigop counts within the limit are now accepted, consistent with P2SH.

This implements the approach suggested by roconnor-blockstream in #33882 to address the policy asymmetry between legacy and P2SH script redemption.

The test suite covers boundary conditions (0, 14, 15, and 16 sigops), different sigop types (CHECKSIG and CHECKMULTISIG), multiple inputs, and regression cases to ensure existing behavior for standard and WITNESS_UNKNOWN scripts is preserved.

DrahtBot commented at 7:03 pm on November 22, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33926.

Reviews

See the guideline for information on the review process. A summary of reviews will appear here.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#33645 (refactor: optimize: avoid allocations in script & policy verification by Raimo33)
#29060 (Policy: Report debug message why inputs are non standard by ismaelsadeeq)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible places where named args may be used (e.g. func(x, /*named_arg=*/0) in C++, and func(x, named_arg=0) in Python):

AddCoins(coins, CTransaction(txFrom), 0) in src/test/script_p2sh_tests.cpp

2025-11-27

DrahtBot added the label CI failed on Nov 22, 2025

DrahtBot commented at 7:49 pm on November 22, 2025: contributor

🚧 At least one of the CI tasks failed. Task macOS native, fuzz: https://github.com/bitcoin/bitcoin/actions/runs/19599919918/job/56129931926 LLM reason (✨ experimental): Fuzz target crash: assertion failed (mempoolDuplicate.HaveCoin) in txmempool.cpp during process_messages, causing non-zero exit.

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

151henry151 commented at 9:33 pm on November 22, 2025: contributor

Our change allows NONSTANDARD legacy scripts with acceptable sigop counts to pass AreInputsStandard, which breaks tests that relied on those scripts being rejected.

p2p_segwit.py uses a NONSTANDARD script (0 sigops) as self.utxo[0] that now passes instead of being rejected, so transactions that should be rejected are getting accepted.

feature_taproot.py has some spenders with NONSTANDARD scripts that are now allowed, so the test’s standardness checks don’t match the new policy.

I’ll fix both by updating the tests to match the new policy.

151henry151 commented at 6:34 pm on November 23, 2025: contributor

I updated the tests to match the new policy. The NONSTANDARD script in p2p_segwit.py now passes AreInputsStandard since it has zero sigops. For test_segwit_versions, I switched the input from the NONSTANDARD UTXO to the standard UTXOs created earlier, so the transaction now passes all checks and gets accepted. For feature_taproot.py, I adjusted the is_standard flag for the compat/nocsa spender because bare NONSTANDARD scripts with low sigop counts now pass policy checks.

Please check this all out and let me know if this looks correct or needs any adjustment.

151henry151 force-pushed on Nov 24, 2025

151henry151 commented at 9:30 pm on November 24, 2025: contributor

The process_messages fuzz test fails with Assertion failed: (mempoolDuplicate.HaveCoin(txin.prevout)) at txmempool.cpp:727 after allowing NONSTANDARD scripts with ≤15 sigops to pass AreInputsStandard.

I added a coin validity check in AreInputsStandard (verifying coin.IsSpent() before GetSigOpCount), but the test still fails. Since this PR only aligns policy behavior and shouldn’t require mempool changes, perhaps the issue is how the policy change interacts with the consistency check.

I’m not seeing why transactions accepted by the policy fail the consistency check. Any help diagnosing this would be appreciated. Perhaps you might have some ideas, @roconnor-blockstream ?

in src/policy/policy.cpp:234 in 4d06b1e94c outdated

235             // WITNESS_UNKNOWN failures are typically also caught with a policy
236             // flag in the script interpreter, but it can be helpful to catch
237             // this type of NONSTANDARD transaction earlier in transaction
238             // validation.
239             return false;
240+        } else if (whichType == TxoutType::NONSTANDARD) {

roconnor commented at 5:09 pm on November 25, 2025:

Maybe it is better to add a new TxoutType::LEGACY and just match on that (putting NONSTADARD back on the line above) rather than redoing some of the work already done by Solver.

roconnor commented at 5:56 pm on November 25, 2025: none

My best guess is that making legacy script standard means that randomly generated ScriptPubKeys are now generally mempool acceptable whereas before only very few specifically shaped ScriptPubKeys could be accepted and weren’t being generated by the fuzzer. This is exposing some kind of error in the process_messages fuzz test harness.

One thing to explore is to see if reverting PR #32822 makes the failure disappear. If so then we maybe narrowed down the problem to the recent changes to this fuzz test.

Another angle we might take is that legacy script perhaps ought to require at least one checksig operation, otherwise it is unsecured and, theoretically, eligible as an upgrade mechanism similar to Segwit. This might sweep any potential existing process_messages fuzz test issues under the rug.

151henry151 force-pushed on Nov 26, 2025

151henry151 force-pushed on Nov 27, 2025

in COMPREHENSIVE_FIX_PLAN.md:8 in 3246bd68d0 outdated

0@@ -0,0 +1,135 @@
1+# Comprehensive Fix Plan for PR 33926 CI Failures
2+
3+## Current Status
4+- **5 failing CI jobs**: lint, No wallet, previous releases, ASan+LSan+UBSan+integer, i686 no IPC
5+- **Primary failure**: `test_v0_outputs_arent_spendable` in p2p_segwit.py
6+- **ScriptPubKey**: `[OP_DUP, OP_HASH160, pubkey_hash, OP_EQUALVERIFY, OP_CHECKSIG, OP_1]`
7+- **Classification**: LEGACY (26 bytes, ends with OP_1, not OP_CHECKSIG)
8+- **Sigop count**: 1 (OP_CHECKSIG)

maflcko commented at 7:50 am on November 27, 2025:

@151henry151 I don’t think it is a good use of anyone’s time to ask an LLM to fundamentally change the script policy in a tight agent loop that uses the public repo resources to run and get the CI result continuously.

151henry151 commented at 8:39 am on November 27, 2025:

I didn’t consider that I was burning through valuable resources, and I have certainly been using the LLM to help me troubleshoot and try to figure out why I can’t get the tests to pass. Sorry for the abuse of resources and I’ll use my own machine for tests until I get some good results.

maflcko commented at 7:52 am on November 27, 2025: member

Please note that contributors are required to fully understand their authored code themselves.

Asking others for review, to feed that into an LLM, does not count as understanding the code.

If they wanted to use an LLM, they could just do so themselves.

Please focus on creating high-quality, original content that demonstrates a clear understanding of the project’s requirements and goals. Also, see the contributing guidelines.

151henry151 force-pushed on Nov 27, 2025

policy: align legacy script policy with P2SH 7f33352601

151henry151 force-pushed on Nov 27, 2025

maflcko commented at 8:16 am on November 27, 2025: member

Closing, to break the bot loop

maflcko closed this on Nov 27, 2025

151henry151 commented at 8:36 am on November 27, 2025: contributor

Sorry about that – not trying to falsely represent anything here. I thought I had a pretty good start with my changes to policy.cpp, but then I’ve been struggling with getting the tests working. I know that overuse of LLMs is frowned upon and I realize that accidentally committing and pushing a bunch of MD files with notes generated by an LLM is not a good look, but I’m going to keep trying to figure this out locally and hope to make a valuable contribution here. I also realize that I should have got my code to a point where the tests are passing first before opening the PR so closing it makes sense, sorry. I’ve been applying myself to this enthusiastically but it’s proven to be more difficult than I anticipated, and I have been leveraging every tool available to me to try and get a good result that will be of some use to the project.

151henry151 commented at 10:12 pm on November 27, 2025: contributor

@maflcko

Please note that contributors are required to fully understand their authored code themselves.

I appreciate the feedback and want to make sure it’s clear that I do comprehend the code I’ve written.

The core change applies sigop limits to legacy scripts and fixes CLEANSTACK to only apply to P2SH as specified in BIP62. Most of the diff is test updates—the functional tests were using 0-sigop scripts that fail under the new policy, so they needed to be switched to 1-sigop scripts with proper signing.

I’ve been stuck on a p2p_segwit.py failure for quite a while now. The issue appears to be that test_non_witness_transaction creates a UTXO requiring a signature (scriptPubKey: [OP_TRUE, OP_DROP] * 14 + [OP_CHECKSIG]), but test_standardness_v0 later tries to spend it without signing. I’ve tried several approaches to fix this but haven’t been able to get it working, which is where I ended up making too many CI runs trying different things.

I apologize for the excessive CI usage. I’ve now set up local testing and will continue debugging there until I have a working solution.

151henry151 commented at 9:34 pm on November 29, 2025: contributor

@maflcko I think I’ve resolved the issue in https://github.com/151henry151/bitcoin/commit/1c67900568

Witness scripts (P2WSH, P2WPKH, etc.) were incorrectly falling into the legacy script path, so they’re now explicitly skipped since they’re handled by IsWitnessStandard. Anyone-can-spend scripts like CScript([OP_TRUE]) were being rejected as NONSTANDARD, so they’re now explicitly allowed. NONSTANDARD scripts also needed sigop limit checking applied to match P2SH behavior.

All tests now pass locally: p2p_segwit.py, feature_taproot.py, script_p2sh_tests, and related policy/script tests.

Could this PR be reopened so we can verify the fixes pass CI? The previous failures were due to missing edge case handling in the policy logic, which has now been addressed.

Align legacy script policy with P2SH policy in AreInputsStandard #33926

Code Coverage & Benchmarks

Reviews

Conflicts

LLM Linter (✨ experimental)