refactor: extract DecodePushData from GetScriptOp and GetLegacySigOpCount #32532

pull l0rinc wants to merge 6 commits into bitcoin:master from l0rinc:l0rinc/short-circuit-known-script-types changing 16 files +439 −174
  1. l0rinc commented at 1:59 pm on May 16, 2025: contributor

    This was originally an optimization PR for GetSigOpCount - kept only the refactorings without any optimization. The goal of the PR now is to document the actual behavior better (more descriptive names, parameters, tests, benchmarks, split out sub-functionality).

    Context

    Test coverage was thin for this code path that suddenly became popular for different reasons (https://github.com/bitcoin/bitcoin/pull/31624 and #32521 and #32533) highlighting the need for better code coverage.

    The PR now splits out the common PUSHDATA length/bounds reads and error checks and covers its corner cases with unit tests and benchmarks.

    Testing

    • added unit tests covering every standard script type (and a historical OP_RETURN … OP_CHECKSIG oddity);
    • added fuzz-style test that compares the refactored implementation against the exact legacy algorithm on random scripts - also ran it continuously for 25 hours without divergence;
    • reindexed to height 897,000 comparing the outputs of the old and the new implementation without incidents.
  2. DrahtBot commented at 1:59 pm on May 16, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32532.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept NACK darosior
    Concept ACK tapcrafter, ryanofsky

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32554 (bench: replace embedded raw block with configurable block generator by l0rinc)
    • #32521 (policy: make pathological transactions packed with legacy sigops non-standard by darosior)
    • #31989 (BIP-119 (OP_CHECKTEMPLATEVERIFY) (regtest only) by jamesob)
    • #31682 ([IBD] specialize CheckBlock’s input & coinbase checks by l0rinc)
    • #29060 (Policy: Report reason inputs are non standard by ismaelsadeeq)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. l0rinc force-pushed on May 16, 2025
  4. in src/test/sigopcount_tests.cpp:190 in 3248d025cc outdated
    29@@ -30,17 +30,16 @@ BOOST_FIXTURE_TEST_SUITE(sigopcount_tests, BasicTestingSetup)
    30 
    31 BOOST_AUTO_TEST_CASE(GetSigOpCount)
    32 {
    33-    // Test CScript::GetSigOpCount()
    34     CScript s1;
    35-    BOOST_CHECK_EQUAL(s1.GetSigOpCount(false), 0U);
    36-    BOOST_CHECK_EQUAL(s1.GetSigOpCount(true), 0U);
    37+    BOOST_CHECK_EQUAL(s1.GetLegacySigOpCount(/*fAccurate=*/false), 0U);
    


    tapcrafter commented at 3:32 pm on May 18, 2025:
    Should the test case (BOOST_AUTO_TEST_CASE(GetSigOpCount)) also be renamed?

    l0rinc commented at 9:19 am on May 19, 2025:
    It exercises both GetSigOpCount and GetLegacySigOpCount, so I kept it. But the new tests can indeed be renamed, thanks!
  5. in test/functional/feature_taproot.py:1504 in 3248d025cc outdated
    1500@@ -1501,7 +1501,7 @@ def test_spenders(self, node, spenders, input_counts):
    1501                     tx.vout[-1].nValue = random.randint(DUST_LIMIT, in_value)
    1502                 in_value -= tx.vout[-1].nValue
    1503                 tx.vout[-1].scriptPubKey = random.choice(host_spks)
    1504-                sigops_weight += CScript(tx.vout[-1].scriptPubKey).GetSigOpCount(False) * WITNESS_SCALE_FACTOR
    1505+                sigops_weight += CScript(tx.vout[-1].scriptPubKey).GetLegacySigOpCount(False) * WITNESS_SCALE_FACTOR
    


    tapcrafter commented at 3:39 pm on May 18, 2025:
    Can we add fAccurate=False here?

    l0rinc commented at 9:19 am on May 19, 2025:
    Indeed, added, thanks.
  6. in src/script/script.h:561 in fcc418de42 outdated
    558+    bool IsPayToPubKeyHash() const noexcept
    559+    {
    560+        return size() == 25 &&
    561+               front() == OP_DUP &&
    562+               (*this)[1] == OP_HASH160 &&
    563+               (*this)[2] == 0x14 &&
    


    tapcrafter commented at 6:17 pm on May 18, 2025:
    Not sure if we can include interpreter.h here, but if yes, we could use WITNESS_V0_KEYHASH_SIZE. And the other constants too, for extra clarity (WITNESS_V0_SCRIPTHASH_SIZE = 32, WITNESS_V1_TAPROOT_SIZE = 32).

    l0rinc commented at 9:19 am on May 19, 2025:
    I have tried including interpreter.h before, but it results in ugly weird compiler mess. But it seems we can move those constant over to script.h instead - done, added you as a co-author.
  7. in src/script/script.h:606 in fcc418de42 outdated
    592+        return size() == 34 &&
    593+               front() == OP_0 &&
    594+               (*this)[1] == 0x20;
    595+    }
    596+
    597+    bool IsCompressedPayToPubKey() const noexcept
    


    tapcrafter commented at 6:21 pm on May 18, 2025:
    Looks like this check omits the && (script[1] == 0x02 || script[1] == 0x03) part that was in the original IsToPubKey function. Not sure if this is on purpose?

    l0rinc commented at 9:19 am on May 19, 2025:
    I took these definitions from solver.cpp - the point here is not necessarily to fully validate the internal structure of the script templates, but rather to check if it follows a general template’s layout (opcode, followed by a version, followed by a hash, etc). Validating the existing versions is done in other places, it’s not important for e.g. SigOp count. If you think I’m mistaken here, please let me know.
  8. in src/script/script.h:613 in fcc418de42 outdated
    599+        return size() == CPubKey::COMPRESSED_SIZE + 2 &&
    600+               front() == CPubKey::COMPRESSED_SIZE &&
    601+               back() == OP_CHECKSIG;
    602+    }
    603+
    604+    bool IsUncompressedPayToPubKey() const noexcept
    


    tapcrafter commented at 6:22 pm on May 18, 2025:
    Same here, the original had the additional condition && script[1] == 0x04.

    l0rinc commented at 9:19 am on May 19, 2025:
    This wasn’t checked in MatchPayToPubkey either.
  9. tapcrafter commented at 6:45 pm on May 18, 2025: none

    Concept ACK 432ff7e736942c6dd6fc923300b5fa5f0f94c287.

    Can’t fully speak to the correctness of the refactor itself, still brushing up on my C++. So can’t give a full ACK but left some questions instead.

    Test protocol

    New benchmark test on 2e8a9e17 (first commit of PR, before refactor):

    0|           ns/sigops |            sigops/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|               50.76 |       19,700,534.44 |    0.5% |      0.01 | `SigOpsBlockBench`
    

    On 432ff7e736942c6dd6fc923300b5fa5f0f94c287 (last commit):

    0|           ns/sigops |            sigops/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|               18.87 |       53,002,534.31 |    0.4% |      0.01 | `SigOpsBlockBench`
    

    I also ran the unit tests a couple of times to excercise the randomized tests in src/test/sigopcount_tests.cpp. No failures to report.

  10. maflcko commented at 6:08 am on May 19, 2025: member
    I can’t tell from the pull description, but is there a end-user visible performance difference? If yes, how much?
  11. l0rinc force-pushed on May 19, 2025
  12. l0rinc commented at 9:25 am on May 19, 2025: contributor

    Thanks for the observations @tapcrafter, addressed your concerns in the latest push, adding you as a co-author in one of the commits. @maflcko, to clarify the effect on the overall behavior, I slimmed down the DeserializeAndCheckBlockTest similarly to #32457 and added the results to the PR’s description.


    I ran

    0cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release
    1cmake --build build -j$(nproc)
    2build/bin/bench_bitcoin -filter='CheckBlockBench' --min-time=10000
    

    several times before and after the SigOp fast-path.

    To isolate the remaining hot-spots I also:

    • split the former DeserializeAndCheckBlockTest into a pure‐serialization (optimized independetly) and a pure CheckBlock benchmark;
    • switched to block 784,588 (contains every modern script type) - see #32457;
    • included another related optimization (special-cases the coinbase & duplicate-input checks - #31682).

    Before (ef73758d5bd0ef32749d2cefc475bb84e4e6d11e - bench: measure behavior of a more representative block):

    0|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    2|        1,077,514.54 |              928.06 |    0.0% |    8,231,717.91 |    3,867,061.18 |  2.129 |   1,234,943.74 |    2.5% |     11.00 | `CheckBlockBench`
    3|        1,077,039.60 |              928.47 |    0.1% |    8,231,717.76 |    3,865,853.67 |  2.129 |   1,234,943.80 |    2.5% |     11.01 | `CheckBlockBench`
    4|        1,077,973.36 |              927.67 |    0.1% |    8,231,717.62 |    3,869,437.50 |  2.127 |   1,234,943.75 |    2.5% |     11.01 | `CheckBlockBench`
    

    Before (9d719108bef902590aa55bf61f979dbcfa152020 - specialize CheckBlock’s input & coinbase checks):

    0|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    2|          836,899.54 |            1,194.89 |    0.1% |    6,618,933.87 |    3,003,330.32 |  2.204 |     847,006.84 |    2.4% |     10.77 | `CheckBlockBench`
    3|          838,135.14 |            1,193.13 |    0.0% |    6,618,933.87 |    3,008,503.94 |  2.200 |     847,006.84 |    2.4% |     10.78 | `CheckBlockBench`
    4|          835,303.73 |            1,197.17 |    0.1% |    6,618,933.87 |    2,997,539.89 |  2.208 |     847,006.84 |    2.4% |     10.77 | `CheckBlockBench`
    

    After (ddc82fdb6fae77ba299104ed03c33ea9dd0e2e05 - script: short-circuit GetLegacySigOpCount for known scripts):

    0|            ns/block |             block/s |    err% |       ins/block |       cyc/block |    IPC |      bra/block |   miss% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    2|          753,774.48 |            1,326.66 |    0.0% |    5,224,278.78 |    2,705,172.05 |  1.931 |     597,353.76 |    3.4% |     10.70 | `CheckBlockBench`
    3|          753,337.59 |            1,327.43 |    0.0% |    5,224,278.78 |    2,704,108.71 |  1.932 |     597,353.76 |    3.4% |     10.70 | `CheckBlockBench`
    4|          749,374.04 |            1,334.45 |    0.0% |    5,224,278.78 |    2,690,235.39 |  1.942 |     597,353.75 |    3.4% |     10.69 | `CheckBlockBench`
    

    ~11.4% additional validation speed-up thanks to short-circuit SigOp counting (the two changes together result in a ~43.7% validation speedup for the given block). Note that instruction count also dropped from 6,6M to 5.2M, branch executions from 0.8M to 0.6M.

  13. maflcko commented at 9:41 am on May 19, 2025: member
    Ah, ok. So from extrapolating the previous comment and #31682 (comment), the IBD speedup should be around 1%?
  14. ryanofsky approved
  15. ryanofsky commented at 2:30 pm on May 19, 2025: contributor
    Concept ACK bde1fc1289443785e4538c757631d2fa19d5ecd8. Seems like this makes code clearer and faster without changing complexity, and adds tests.
  16. l0rinc renamed this:
    RFC: script: short-circuit `GetLegacySigOpCount` for known scripts
    script: short-circuit `GetLegacySigOpCount` for known scripts
    on May 19, 2025
  17. DrahtBot added the label Consensus on May 19, 2025
  18. l0rinc marked this as ready for review on May 19, 2025
  19. darosior commented at 7:22 pm on May 19, 2025: member

    This is a significant refactoring of tricky consensus critical routines that have essentially never been modified since Satoshi wrote them 15 years ago. The exploration is interesting but it seems ill-advised to completely overhaul this critical code for a marginal IBD speedup.

    I feel bad for being stop energy here, but since i have been an advocate for us voicing our opinion instead of just ignoring PRs i’ll go and own it. Concept NACK: i don’t think the risk-benefit ratio of this change is worth it.

  20. ryanofsky commented at 7:58 pm on May 19, 2025: contributor
    Might also help to add all the benchmark and test changes and simpler improvements like inlining in a separate PR, so it’s a more clear how tricky the sensitive changes here are, and if the sensitive changes are really worth making.
  21. l0rinc marked this as a draft on May 20, 2025
  22. DrahtBot added the label Needs rebase on May 21, 2025
  23. l0rinc force-pushed on May 21, 2025
  24. bench: measure `CheckBlock` speed separately from serialization f7b161ba72
  25. bench: measure `SigOpsBlock` speeds separately 1e614fb9da
  26. refactor: rename `GetSigOpCount` to `GetLegacySigOpCount`
    Also added primitive argument name reminder to the call sites to highlight the meaning of the call.
    038104fd1b
  27. refactor: add extra-fast checks for all standard script types
    The usages in `compressor.cpp` and `solver.cpp` were also updated to use the new methods.
    To be able to use previously defined constants here as well, instead of adding a dependency on `interpreter.h`, I've moved the script-hash-size constants to `script.h`. There are no other constants with the same names.
    
    Co-authored-by: TapCrafter <tapcrafter@proton.me>
    c87c7b0279
  28. test: add `SigOpCount` edge‑cases & known‑template coverage
    * `GetLegacySigOpCountErrors` feeds malformed PUSHDATA sequences to confirm the parser never crashes and still counts sig‑ops that appear before the error.
    * `GetLegacySigOpCountStandardScripts` asserts the expected legacy/accurate sig‑op totals for all common known script templates (P2PKH, P2SH, P2WPKH/WSH, P2TR, compressed & uncompressed P2PK, OP_RETURN, multisig).
    461779ae49
  29. refactor: extract `DecodePushData` from `GetScriptOp` and use it in `GetLegacySigOpCount` as well
    Introduce helper `DecodePushData` that parses the size prefixes, checks bounds, and returns size_bytes & data_bytes - or error on truncation.
    
    To make absolutely sure it's just a refactor, a new unit test was added which asserts the refactored `GetLegacySigOpCount` exactly matches the previous algorithm.
    5b11b8706f
  30. l0rinc force-pushed on May 21, 2025
  31. l0rinc renamed this:
    script: short-circuit `GetLegacySigOpCount` for known scripts
    refactor: extract `DecodePushData` from `GetScriptOp` and `GetLegacySigOpCount`
    on May 21, 2025
  32. l0rinc requested review from darosior on May 21, 2025
  33. l0rinc requested review from ryanofsky on May 21, 2025
  34. l0rinc commented at 1:40 pm on May 21, 2025: contributor

    @ryanofsky, @darosior, thank you for expressing your concerns. I have removed the optimization part, I understand why that can be considered risky. The new rebase only contains extraction of the common script parts to avoid some work duplication and to help with understanding this part of the codebase better. It’s also covered extensively with deterministic and random tests, comparing against fixed values, non-standard scripts, and completely random ones compared against the original implementation. We can reconsider the optimizations if this gets accepted.

    Also note that the optimizations weren’t there for IBD necessarily, but as the checkblock bench states:

    0// These are the two major time-sinks which happen after we have fully received
    1// a block off the wire, but before we can relay the block on to peers using
    2// compact block relay.
    
  35. l0rinc marked this as ready for review on May 21, 2025
  36. DrahtBot removed the label Needs rebase on May 21, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-05-25 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me