Short-circuit `GetLegacySigOpCount` for known scripts

l0rinc commented at 1:59 pm on May 16, 2025: contributor

It is an optimization PR for GetSigOpCount, covered heavily with new tests, benchmarks and small refactorings.

The changes were split out to #32729 (refactors only for now).

Context

Test coverage was thin for this code path that suddenly became popular for different reasons (https://github.com/bitcoin/bitcoin/pull/31624 and #32521 and #32533) highlighting the need for better code coverage.

The PR now splits out the common PUSHDATA length/bounds reads and error checks and covers its corner cases with unit tests and benchmarks.

Testing

added unit tests covering every standard script type (and a historical OP_RETURN … OP_CHECKSIG oddity);
added fuzz-style test that compares the refactored implementation against the exact legacy algorithm on random scripts - also ran it continuously for 25 hours without divergence;
reindexed to height 897,000 comparing the outputs of the old and the new implementation without incidents.

DrahtBot commented at 1:59 pm on May 16, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32532.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept NACK	darosior
Concept ACK	tapcrafter, ryanofsky

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#32673 (clang-tidy: Apply modernize-deprecated-headers by maflcko)
#32554 (bench: replace embedded raw block with configurable block generator by l0rinc)
#32521 (policy: make pathological transactions packed with legacy sigops non-standard by darosior)
#31989 (BIP-119 (OP_CHECKTEMPLATEVERIFY) (regtest only) by jamesob)
#31682 ([IBD] specialize CheckBlock’s input & coinbase checks by l0rinc)
#29060 (Policy: Report reason inputs are non standard by ismaelsadeeq)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

l0rinc force-pushed on May 16, 2025

in src/test/sigopcount_tests.cpp:192 in 3248d025cc outdated

29@@ -30,17 +30,16 @@ BOOST_FIXTURE_TEST_SUITE(sigopcount_tests, BasicTestingSetup)
30 
31 BOOST_AUTO_TEST_CASE(GetSigOpCount)
32 {
33-    // Test CScript::GetSigOpCount()
34     CScript s1;
35-    BOOST_CHECK_EQUAL(s1.GetSigOpCount(false), 0U);
36-    BOOST_CHECK_EQUAL(s1.GetSigOpCount(true), 0U);
37+    BOOST_CHECK_EQUAL(s1.GetLegacySigOpCount(/*fAccurate=*/false), 0U);

tapcrafter commented at 3:32 pm on May 18, 2025:

Should the test case (BOOST_AUTO_TEST_CASE(GetSigOpCount)) also be renamed?

l0rinc commented at 9:19 am on May 19, 2025:

It exercises both GetSigOpCount and GetLegacySigOpCount, so I kept it. But the new tests can indeed be renamed, thanks!

in test/functional/feature_taproot.py:1504 in 3248d025cc outdated

1500@@ -1501,7 +1501,7 @@ def test_spenders(self, node, spenders, input_counts):
1501                     tx.vout[-1].nValue = random.randint(DUST_LIMIT, in_value)
1502                 in_value -= tx.vout[-1].nValue
1503                 tx.vout[-1].scriptPubKey = random.choice(host_spks)
1504-                sigops_weight += CScript(tx.vout[-1].scriptPubKey).GetSigOpCount(False) * WITNESS_SCALE_FACTOR
1505+                sigops_weight += CScript(tx.vout[-1].scriptPubKey).GetLegacySigOpCount(False) * WITNESS_SCALE_FACTOR

tapcrafter commented at 3:39 pm on May 18, 2025:

Can we add fAccurate=False here?

l0rinc commented at 9:19 am on May 19, 2025:

Indeed, added, thanks.

in src/script/script.h:561 in fcc418de42 outdated

558+    bool IsPayToPubKeyHash() const noexcept
559+    {
560+        return size() == 25 &&
561+               front() == OP_DUP &&
562+               (*this)[1] == OP_HASH160 &&
563+               (*this)[2] == 0x14 &&

tapcrafter commented at 6:17 pm on May 18, 2025:

Not sure if we can include interpreter.h here, but if yes, we could use WITNESS_V0_KEYHASH_SIZE. And the other constants too, for extra clarity (WITNESS_V0_SCRIPTHASH_SIZE = 32, WITNESS_V1_TAPROOT_SIZE = 32).

l0rinc commented at 9:19 am on May 19, 2025:

I have tried including interpreter.h before, but it results in ugly weird compiler mess. But it seems we can move those constant over to script.h instead - done, added you as a co-author.

in src/script/script.h:612 in fcc418de42 outdated

592+        return size() == 34 &&
593+               front() == OP_0 &&
594+               (*this)[1] == 0x20;
595+    }
596+
597+    bool IsCompressedPayToPubKey() const noexcept

tapcrafter commented at 6:21 pm on May 18, 2025:

Looks like this check omits the && (script[1] == 0x02 || script[1] == 0x03) part that was in the original IsToPubKey function. Not sure if this is on purpose?

l0rinc commented at 9:19 am on May 19, 2025:

I took these definitions from solver.cpp - the point here is not necessarily to fully validate the internal structure of the script templates, but rather to check if it follows a general template’s layout (opcode, followed by a version, followed by a hash, etc). Validating the existing versions is done in other places, it’s not important for e.g. SigOp count. If you think I’m mistaken here, please let me know.

l0rinc commented at 5:21 pm on June 6, 2025:

Update: we do need those, but only for the compressor, not for the solver or other uses

in src/script/script.h:619 in fcc418de42 outdated

599+        return size() == CPubKey::COMPRESSED_SIZE + 2 &&
600+               front() == CPubKey::COMPRESSED_SIZE &&
601+               back() == OP_CHECKSIG;
602+    }
603+
604+    bool IsUncompressedPayToPubKey() const noexcept

tapcrafter commented at 6:22 pm on May 18, 2025:

Same here, the original had the additional condition && script[1] == 0x04.

l0rinc commented at 9:19 am on May 19, 2025:

This wasn’t checked in MatchPayToPubkey either.

l0rinc commented at 9:03 am on June 6, 2025:

Update: same, those checks need to be on use site, in compressor only

tapcrafter commented at 6:45 pm on May 18, 2025: none

Concept ACK 432ff7e736942c6dd6fc923300b5fa5f0f94c287.

Can’t fully speak to the correctness of the refactor itself, still brushing up on my C++. So can’t give a full ACK but left some questions instead.

Test protocol

New benchmark test on 2e8a9e17 (first commit of PR, before refactor):

0|           ns/sigops |            sigops/s |    err% |     total | benchmark
1|--------------------:|--------------------:|--------:|----------:|:----------
2|               50.76 |       19,700,534.44 |    0.5% |      0.01 | `SigOpsBlockBench`

On 432ff7e736942c6dd6fc923300b5fa5f0f94c287 (last commit):

0|           ns/sigops |            sigops/s |    err% |     total | benchmark
1|--------------------:|--------------------:|--------:|----------:|:----------
2|               18.87 |       53,002,534.31 |    0.4% |      0.01 | `SigOpsBlockBench`

I also ran the unit tests a couple of times to excercise the randomized tests in src/test/sigopcount_tests.cpp. No failures to report.

maflcko commented at 6:08 am on May 19, 2025: member

I can’t tell from the pull description, but is there a end-user visible performance difference? If yes, how much?

l0rinc force-pushed on May 19, 2025

l0rinc commented at 9:25 am on May 19, 2025: contributor

Thanks for the observations @tapcrafter, addressed your concerns in the latest push, adding you as a co-author in one of the commits. @maflcko, to clarify the effect on the overall behavior, I slimmed down the DeserializeAndCheckBlockTest similarly to #32457 and added the results to the PR’s description.

I ran

0cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release
1cmake --build build -j$(nproc)
2build/bin/bench_bitcoin -filter='CheckBlockBench' --min-time=10000

several times before and after the SigOp fast-path.

To isolate the remaining hot-spots I also:

split the former DeserializeAndCheckBlockTest into a pure‐serialization (optimized independetly) and a pure CheckBlock benchmark;
switched to block 784,588 (contains every modern script type) - see #32457;
included another related optimization (special-cases the coinbase & duplicate-input checks - #31682).

Before (ef73758d5bd0ef32749d2cefc475bb84e4e6d11e - bench: measure behavior of a more representative block):

0|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
1|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
2|        1,077,514.54 |              928.06 |    0.0% |    8,231,717.91 |    3,867,061.18 |  2.129 |   1,234,943.74 |    2.5% |     11.00 | `CheckBlockBench`
3|        1,077,039.60 |              928.47 |    0.1% |    8,231,717.76 |    3,865,853.67 |  2.129 |   1,234,943.80 |    2.5% |     11.01 | `CheckBlockBench`
4|        1,077,973.36 |              927.67 |    0.1% |    8,231,717.62 |    3,869,437.50 |  2.127 |   1,234,943.75 |    2.5% |     11.01 | `CheckBlockBench`

Before (9d719108bef902590aa55bf61f979dbcfa152020 - specialize CheckBlock’s input & coinbase checks):

0|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
1|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
2|          836,899.54 |            1,194.89 |    0.1% |    6,618,933.87 |    3,003,330.32 |  2.204 |     847,006.84 |    2.4% |     10.77 | `CheckBlockBench`
3|          838,135.14 |            1,193.13 |    0.0% |    6,618,933.87 |    3,008,503.94 |  2.200 |     847,006.84 |    2.4% |     10.78 | `CheckBlockBench`
4|          835,303.73 |            1,197.17 |    0.1% |    6,618,933.87 |    2,997,539.89 |  2.208 |     847,006.84 |    2.4% |     10.77 | `CheckBlockBench`

After (ddc82fdb6fae77ba299104ed03c33ea9dd0e2e05 - script: short-circuit GetLegacySigOpCount for known scripts):

0|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
1|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
2|          753,774.48 |            1,326.66 |    0.0% |    5,224,278.78 |    2,705,172.05 |  1.931 |     597,353.76 |    3.4% |     10.70 | `CheckBlockBench`
3|          753,337.59 |            1,327.43 |    0.0% |    5,224,278.78 |    2,704,108.71 |  1.932 |     597,353.76 |    3.4% |     10.70 | `CheckBlockBench`
4|          749,374.04 |            1,334.45 |    0.0% |    5,224,278.78 |    2,690,235.39 |  1.942 |     597,353.75 |    3.4% |     10.69 | `CheckBlockBench`

~11.4% additional validation speed-up thanks to short-circuit SigOp counting (the two changes together result in a ~43.7% validation speedup for the given block). Note that instruction count also dropped from 6,6M to 5.2M, branch executions from 0.8M to 0.6M.

maflcko commented at 9:41 am on May 19, 2025: member

Ah, ok. So from extrapolating the previous comment and #31682 (comment), the IBD speedup should be around 1%?

ryanofsky approved

ryanofsky commented at 2:30 pm on May 19, 2025: contributor

Concept ACK bde1fc1289443785e4538c757631d2fa19d5ecd8. Seems like this makes code clearer and faster without changing complexity, and adds tests.

l0rinc renamed this:
~~RFC: script: short-circuit `GetLegacySigOpCount` for known scripts~~
script: short-circuit `GetLegacySigOpCount` for known scripts
on May 19, 2025

DrahtBot added the label Consensus on May 19, 2025

l0rinc marked this as ready for review on May 19, 2025

darosior commented at 7:22 pm on May 19, 2025: member

This is a significant refactoring of tricky consensus critical routines that have essentially never been modified since Satoshi wrote them 15 years ago. The exploration is interesting but it seems ill-advised to completely overhaul this critical code for a marginal IBD speedup.

I feel bad for being stop energy here, but since i have been an advocate for us voicing our opinion instead of just ignoring PRs i’ll go and own it. Concept NACK: i don’t think the risk-benefit ratio of this change is worth it.

ryanofsky commented at 7:58 pm on May 19, 2025: contributor

Might also help to add all the benchmark and test changes and simpler improvements like inlining in a separate PR, so it’s a more clear how tricky the sensitive changes here are, and if the sensitive changes are really worth making.

l0rinc marked this as a draft on May 20, 2025

DrahtBot added the label Needs rebase on May 21, 2025

l0rinc force-pushed on May 21, 2025

l0rinc renamed this:
~~script: short-circuit `GetLegacySigOpCount` for known scripts~~
refactor: extract `DecodePushData` from `GetScriptOp` and `GetLegacySigOpCount`
on May 21, 2025

l0rinc requested review from darosior on May 21, 2025

l0rinc requested review from ryanofsky on May 21, 2025

l0rinc commented at 1:40 pm on May 21, 2025: contributor

@ryanofsky, @darosior, thank you for expressing your concerns. I have removed the optimization part, I understand why that can be considered risky. The new rebase only contains extraction of the common script parts to avoid some work duplication and to help with understanding this part of the codebase better. It’s also covered extensively with deterministic and random tests, comparing against fixed values, non-standard scripts, and completely random ones compared against the original implementation. We can reconsider the optimizations if this gets accepted.

Also note that the optimizations weren’t there for IBD necessarily, but as the checkblock bench states:

0// These are the two major time-sinks which happen after we have fully received
1// a block off the wire, but before we can relay the block on to peers using
2// compact block relay.

l0rinc marked this as ready for review on May 21, 2025

DrahtBot removed the label Needs rebase on May 21, 2025

in src/bench/checkblock.cpp:48 in f7b161ba72 outdated

56-        CBlock block; // Note that CBlock caches its checked state, so we need to recreate it here
57-        stream >> TX_WITH_WITNESS(block);
58-        bool rewound = stream.Rewind(benchmark::data::block413567.size());
59-        assert(rewound);
60-
61+        block.fChecked = block.m_checked_witness_commitment = block.m_checked_merkle_root = false; // Reset the cached state

ryanofsky commented at 4:50 pm on June 4, 2025:

In commit “bench: measure CheckBlock speed separately from serialization” (f7b161ba72ee4b219f9ccae50ee350eb1092cd07)

It seem a bit fragile to set these flags individually far away from the CBlock definition, because more flags could be added and the benchmark could show results that are not meaningful. Might be good to to add a ResetChecked or similar method to the class definition and call it here instead.

l0rinc commented at 3:07 pm on June 6, 2025:

Excellent, we can even split ResetChecked out of CBlock::SetNull

in src/bench/checkblock.cpp:72 in 1e614fb9da outdated

67+        }
68+        return nSigOps;
69+    }};
70+
71+    auto expected_sigops{0};
72+    for (const auto& tx : block.vtx) expected_sigops += GetLegacySigOpCount(*tx);

ryanofsky commented at 5:00 pm on June 4, 2025:

In commit “bench: measure SigOpsBlock speeds separately” (1e614fb9daa1baf1cef37d66ff30179a174ced99)

Not sure but would it maybe make sense for the GetLegacySigOpCount to take a block instead of transaction to deduplicate code running before and during the benchmark and make sure it is consistent?

Also it seems like this duplicating the tx_verify function. Might be good to explain in the commit message if this intentional.

l0rinc commented at 3:24 pm on June 6, 2025:

this duplicating the tx_verify function

At the moment it seemed simpler to copy it over, but you’re right that I can now just call it directly - thanks. We can also hard-code the expected_sigops to have more redundancy here, since it’s legacy code.

in src/script/script.h:572 in c87c7b0279 outdated

557@@ -552,11 +558,60 @@ class CScript : public CScriptBase
558      */
559     static bool IsPayToAnchor(int version, const std::vector<unsigned char>& program);
560 
561-    bool IsPayToScriptHash() const;
562-    bool IsPayToWitnessScriptHash() const;
563-    bool IsWitnessProgram(int& version, std::vector<unsigned char>& program) const;
564+    bool IsPayToPubKeyHash() const noexcept

ryanofsky commented at 5:38 pm on June 4, 2025:

In commit “refactor: add extra-fast checks for all standard script types” (c87c7b02796d83f5c4d8a7472a2cd13df30e6a05)

This change seems all right but I’d find it much easier to review if it could be split up to move just one CScript::Is* method per-commit. Otherwise it is easy to confuse the different method names and bodies because they look very similar, and verifying the c87c7b02796d83f5c4d8a7472a2cd13df30e6a05 diff requires jumping around and searching for one method at a time instead of being able to read sequentially.

l0rinc commented at 5:21 pm on June 6, 2025:

Indeed!

in src/test/sigopcount_tests.cpp:34 in 461779ae49 outdated

29@@ -28,6 +30,160 @@ Serialize(const CScript& s)
30 
31 BOOST_FIXTURE_TEST_SUITE(sigopcount_tests, BasicTestingSetup)
32 
33+BOOST_AUTO_TEST_CASE(GetLegacySigOpCountErrors)

ryanofsky commented at 6:15 pm on June 4, 2025:

In commit “test: add SigOpCount edge‑cases & known‑template coverage” (461779ae4903d8e86500fe5ef4d73009d80ee311)

Could consider expanding commit message a bit to say how the new tests compare to existing tests. I think:

GetLegacySigOpCountErrors is adding new coverage for malformed scripts that weren’t checked before?
GetLegacySigOpCountStandardScripts is adding more organized and complete coverage, but not really new coverage?

And I know you tend to push back against adding code comments, so ignore this suggestion if you don’t think it is a good idea, but I’d consider moving test descriptions from the commit message:

GetLegacySigOpCountErrors feeds malformed PUSHDATA sequences to confirm the parser never crashes and still counts sig‑ops that appear before the error.

GetLegacySigOpCountStandardScripts asserts the expected legacy/accurate sig‑op totals for all common known script templates (P2PKH, P2SH, P2WPKH/WSH, P2TR, compressed & uncompressed P2PK, OP_RETURN, multisig).

into the test code itself, because IMO is useful to have 1-2 line descriptions of tests like these to help navigate and understand them.

l0rinc commented at 5:28 pm on June 6, 2025:

is adding new coverage but not really new coverage

Not sure, many of these branches were never covered - based on https://maflcko.github.io/b-c-cov/total.coverage/src/script/script.cpp.gcov.html And based on https://corecheck.dev/bitcoin/bitcoin/pulls/32532 many more are covered now

Added comments to the tests, please let me know what I’m still missing.

in src/script/script.h:614 in c87c7b0279 outdated

601+               (*this)[1] == WITNESS_V0_SCRIPTHASH_SIZE;
602+    }
603+
604+    bool IsCompressedPayToPubKey() const noexcept
605+    {
606+        return size() == 35 &&

ryanofsky commented at 6:25 pm on June 4, 2025:

In commit “refactor: add extra-fast checks for all standard script types” (c87c7b02796d83f5c4d8a7472a2cd13df30e6a05)

Any particular reason for writing 35 instead of CPubKey::COMPRESSED_SIZE + 2 like previous code?

l0rinc commented at 9:02 am on June 6, 2025:

thought about it, it seems more redundant (which I think could be safer for legacy code) if we state the sizes explicitly, otherwise changing the CPubKey::COMPRESSED_SIZE value might not fail immediately. Every other helper here has fixed size - and I couldn’t find a reasonable value for all of them, decided to hard-code the sizes instead. Do you have an idea that we could apply to all of these?

in src/compressor.cpp:35 in c87c7b0279 outdated

48-static bool IsToPubKey(const CScript& script, CPubKey &pubkey)
49+static bool IsToPubKey(const CScript& script, CPubKey& pubkey)
50 {
51-    if (script.size() == 35 && script[0] == 33 && script[34] == OP_CHECKSIG
52-                            && (script[1] == 0x02 || script[1] == 0x03)) {
53+    if (script.IsCompressedPayToPubKey()) {

ryanofsky commented at 6:27 pm on June 4, 2025:

In commit “refactor: add extra-fast checks for all standard script types” (c87c7b02796d83f5c4d8a7472a2cd13df30e6a05)

This is dropping the script[1] == 0x02 || script[1] == 0x03)) check?

l0rinc commented at 9:01 am on June 6, 2025:

Some of these are rechecked later, but it looks like we need some unit tests for this area - thanks for the detailed review, I have incorrectly removed it during a rebase - since I got confused of the different ways of checking the same.

It was also noticed by #32532 (review) (seems I misunderstood his comment)

in src/compressor.cpp:39 in c87c7b0279 outdated

54         pubkey.Set(&script[1], &script[34]);
55         return true;
56     }
57-    if (script.size() == 67 && script[0] == 65 && script[66] == OP_CHECKSIG
58-                            && script[1] == 0x04) {
59+    if (script.IsUncompressedPayToPubKey()) {

ryanofsky commented at 6:28 pm on June 4, 2025:

In commit “refactor: add extra-fast checks for all standard script types” (c87c7b02796d83f5c4d8a7472a2cd13df30e6a05)

This is dropping the script[1] == 0x04 check?

l0rinc commented at 9:01 am on June 6, 2025:

Thank you, split these into smaller commits to make sure we don’t miss these anymore

ryanofsky commented at 6:32 pm on June 4, 2025: contributor

Concept ACK 5b11b8706f14e0545f7e07c7c029e24811dc6c9d. I haven’t finished reviewing but these changes seem very straightforward and the new test and benchmark coverage seems nice. Left some review comments below

l0rinc marked this as a draft on Jun 4, 2025

bench: measure `CheckBlock` speed separately from serialization

To make sure the blocks keep their original state, a ResetChecked() was introduced which will reenable all relevant calculations

4f75c5a94d

l0rinc force-pushed on Jun 6, 2025

bench: measure `SigOpsBlock` speeds separately af4071eb48

refactor: rename `GetSigOpCount` to `GetLegacySigOpCount`

Also added primitive argument name reminder to the call sites to highlight the meaning of the call.

05c0645213

refactor: moved the script-hash-size constants to `script.h`

This will be needed in upcoming commit, to be able to use previously defined constants in `script.h`.
I made sure to check that there were no other constants with the same names.

58004a8caa

refactor: pull `IsPayToScriptHash` to header

Moved the implementation to the header (implicitly inline + noexcept), removed redundant `this` and parenthesis, used descriptive `front()`/`back()` methods and changed the `0x14` magic constant to descriptive `WITNESS_V0_KEYHASH_SIZE`.
See: https://learnmeabitcoin.com/technical/script/p2sh/#scriptpubkey

35b19cc435

refactor: pull `IsPayToWitnessScriptHash` to header

Moved the implementation to the header (implicitly inline + noexcept), removed redundant `this` and parenthesis, used descriptive `front()` method and changed the `0x20` magic constant to descriptive `WITNESS_V0_SCRIPTHASH_SIZE`.
See: https://learnmeabitcoin.com/technical/script/p2wsh/#scriptpubkey

176c9d05ba

refactor: pull `IsPayToAnchor` to header

Moved the implementation to the header (implicitly inline + noexcept), removed redundant `this` and parenthesis, used descriptive `front()`/`back() methods.

625a09ed62

refactor: pull `IsPayToTaproot` to header

Moved the implementation to the header (implicitly inline + noexcept), removed redundant `this` and parenthesis, used descriptive `front()` method and changed the `0x20` magic constant to descriptive `WITNESS_V1_TAPROOT_SIZE`.

722170e0b4

refactor: add `IsPayToPubKeyHash` helper to script.h

The usages in `compressor.cpp` and `solver.cpp` were also updated to use the new methods.

f1f8da7329

refactor: add `IsCompressedPayToPubKey` helper to script.h

The usages in `compressor.cpp` and `solver.cpp` were also updated to use the new methods.
Note that compressor has additional checks as well, which are not properly exercised by the `compressed_p2pk` unit test.

4602eb743e

refactor: add `IsUncompressedPayToPubKey` helper to script.h

The usages in `compressor.cpp` and `solver.cpp` were also updated to use the new methods.
Note that compressor has additional checks as well, which are not properly exercised by the `compressed_p2pk` unit test.

30c698c912

refactor: add `IsPayToWitnessPubKeyHash` helper to script.h

The usages in `compressor.cpp` were also updated to use the new methods.

b3d028691e

test: add `SigOpCount` edge‑cases & known‑template coverage

* `GetLegacySigOpCountErrors` feeds malformed PUSHDATA sequences to confirm the parser never crashes and still counts sig‑ops that appear before the error.
* `GetLegacySigOpCountKnownTemplates` asserts the expected legacy/accurate sig‑op totals for all common known script templates (P2PKH, P2SH, P2WPKH/WSH, P2TR, compressed & uncompressed P2PK, OP_RETURN, multisig).

cbdd56151e

refactor: extract `DecodePushData` from `GetScriptOp` and use it in `GetLegacySigOpCount` as well

Introduce helper `DecodePushData` that parses the size prefixes, checks bounds, and returns size_bytes & data_bytes - or error on truncation.

To make absolutely sure it's just a refactor, a new unit test was added which asserts the refactored `GetLegacySigOpCount` exactly matches the previous algorithm.

0819a34565

l0rinc force-pushed on Jun 6, 2025

DrahtBot added the label CI failed on Jun 6, 2025

DrahtBot commented at 5:38 pm on June 6, 2025: contributor

🚧 At least one of the CI tasks failed. Task lint: https://github.com/bitcoin/bitcoin/runs/43633911782 LLM reason (✨ experimental): The CI failure is due to a lint check failure caused by improper include syntax in source files.

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

l0rinc commented at 5:39 pm on June 6, 2025: contributor

Thanks for the reviews, addressed more of the concerns in smaller commits, even more tests: https://github.com/bitcoin/bitcoin/compare/5b11b8706f14e0545f7e07c7c029e24811dc6c9d..0819a34565d01e241e5a3f884a508472861c96e1 @darosior, this change isn’t about IBD, it was mostly about block connect time optimization originally - now it’s rather meant as a cleanup and flooding it with tests and benchmarks to enable optimizations in follow-ups.

l0rinc commented at 5:39 pm on June 6, 2025: contributor

Thanks for the reviews, addressed more of the concerns in smaller commits, even more tests: https://github.com/bitcoin/bitcoin/compare/5b11b8706f14e0545f7e07c7c029e24811dc6c9d..0819a34565d01e241e5a3f884a508472861c96e1 @darosior, this change isn’t about IBD, it was mostly about block connect time optimization originally - now it’s rather meant as a cleanup and flooding it with tests and benchmarks to enable optimizations in follow-ups.

l0rinc renamed this:
~~refactor: extract `DecodePushData` from `GetScriptOp` and `GetLegacySigOpCount`~~
refactor,test: extract `DecodePushData` from `GetScriptOp` and `GetLegacySigOpCount`
on Jun 6, 2025

darosior commented at 6:10 pm on June 6, 2025: member

@darosior, this change isn’t about IBD, it was mostly about block connect time optimization originally - now it’s rather meant as a cleanup and flooding it with tests and benchmarks to enable optimizations in follow-ups.

This further strengthens my point. I don’t think we should make a ~~major~~ overhaul to tricky consensus-critical code that was never touched in 15 years just to “make it nicer” (and possibly enabling very marginal optimisations in the future).

DrahtBot removed the label CI failed on Jun 6, 2025

ryanofsky commented at 8:02 pm on June 6, 2025: contributor

I don’t think we should make a major overhaul to tricky consensus-critical code that was never touched in 15 years @darosior I almost wonder if we are looking at the same PR. The current set of changes seem pretty trivial and are certainly not a major overhaul. @l0rinc since the general arguments darosior’s making are worth taking seriously and apply to a wide variety of changes maybe including the changes made here, it could make sense to separate the concerns more by opening a different PR to only improve the benchmarks and tests and not refactor the script code.

IMO, having now looked at the script changes, I think they are trivial and not risky, and they could be merged with the normal review process and 3 acks. But a conversation about raising the standard for changes like this could be worth having here or somewhere else. I do think a more productive way of raising the standard would be to do something like increase the number of expected acks, instead of relying on people to shoot down changes individually with nacks.

In a project like this it is rightfully very hard to argue on the side of taking any risk, but we should look out for places where risks are conserved and trying to reduce them one place increases them other places. IMO, it is important for changes like this to be able to progress with sufficient review so we are not put in a position of maintaining code that we are afraid to modify and that no remaining developers will have familiarity with. They also help test our review process and build confidence that it works and catches real issues. And at least in the past, they’ve been useful for deploying security fixes before revealing them publicly.

A related question is how to decide what changes to spend time working on and reviewing given limited review and development bandwidth. I’m not sure there is an answer to this other than letting people individually decide for themselves, but I think if we want to shift attention different places, proactively asking for reviews and highlighting changes we want to encourage is probably also a better approach than trying to actively discourage other changes. “Don’t think of a pink elephant” is not a wildly effective strategy.

darosior commented at 8:40 pm on June 6, 2025: member

@ryanofsky Maybe “overhaul” isn’t the right term if it implies that a change is necessarily huge, and i should have used “refactoring” instead. Regardless, my point stands.

optimization: short-circuit `GetLegacySigOpCount` for known scripts

> build/bin/bench_bitcoin -filter='SigOpsBlockBench' --min-time=1000

|           ns/sigops |            sigops/s |    err% |      ins/sigops |      cyc/sigops |    IPC |     bra/sigops |   miss% |     total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
|               58.43 |       17,115,813.09 |    0.1% |          354.20 |          209.79 |  1.688 |          89.46 |    2.8% |     10.83 | `SigOpsBlockBench`

d76d7531df

l0rinc renamed this:
~~refactor,test: extract `DecodePushData` from `GetScriptOp` and `GetLegacySigOpCount`~~
[tracking PR] - short-circuit `GetLegacySigOpCount` for known scripts
on Jun 7, 2025

l0rinc commented at 2:50 pm on June 11, 2025: contributor

Thanks @ryanofsky, I have extracted the tests and benchmarks to #32729 - adding a few more test cases based on missing code and feature coverage I have observed since. The non-test codes changes are either move only with tiny changes (e.g. front() instead of (*this)[0] ), separated to different commits to ease the review as much as possible. @darosior, you’re also welcome to review or to ignore, up to you of course.

Edit: Now that the testing part was extracted I’m closing this to avoid confusion

l0rinc closed this on Jun 11, 2025

l0rinc renamed this:
~~[tracking PR] - short-circuit `GetLegacySigOpCount` for known scripts~~
Short-circuit `GetLegacySigOpCount` for known scripts
on Jun 29, 2025

Short-circuit GetLegacySigOpCount for known scripts #32532

Context

Testing

Code Coverage & Benchmarks

Reviews

Conflicts

Short-circuit `GetLegacySigOpCount` for known scripts #32532