refactor: Split multithreaded case out of CheckInputScripts #32575

pull fjahr wants to merge 4 commits into bitcoin:master from fjahr:2025-05-CheckInputScript-split changing 2 files +117 −77
  1. fjahr commented at 8:52 pm on May 20, 2025: contributor

    This topic has been motivated by my work on batch validation and a related conversation just happened here: #32467 (review)

    CheckInputScripts currently only does what its name implies if there is no multithreading usage for validation. If there are worker threads available the function instead only creates the checks and puts them into a vector. Aside from some shared pre-checks the multithreaded code path is much simpler compared to the rest. This dual use makes the function hard to grasp and its naming confusing. So this PR is splitting the single thread case from the multithread. This results in more LOC but the result is much clearer.

  2. DrahtBot commented at 8:52 pm on May 20, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32575.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK theuni, TheCharlatan, stickies-v

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32317 (kernel: Separate UTXO set access from validation functions by TheCharlatan)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. theuni commented at 8:57 pm on May 20, 2025: member

    Concept ACK. Anything to clean this up :)

    See also #32317 which moves some of this functionality around similarly.

  4. fjahr force-pushed on May 20, 2025
  5. fjahr marked this as ready for review on May 20, 2025
  6. fjahr commented at 9:49 pm on May 20, 2025: contributor
    Cleaned up the comments so this should be ready for more detailed feedback, but expect a rebase once #32467 is merged.
  7. in src/validation.cpp:2207 in 4262b75726 outdated
    2202+std::vector<CScriptCheck> PrepareInputScriptChecks(const CTransaction& tx,
    2203+                       const CCoinsViewCache& inputs, unsigned int flags, bool cacheSigStore,
    2204+                       bool cacheFullScriptStore, PrecomputedTransactionData& txdata,
    2205+                       ValidationCache& validation_cache) EXCLUSIVE_LOCKS_REQUIRED(cs_main)
    2206+{
    2207+    if (tx.IsCoinBase()) return {};
    


    fjahr commented at 9:54 pm on May 20, 2025:
    I have been contemplating kicking this out (here and in CheckInputScripts) or even turning this into an assert because the callers already handle this everywhere. On the other hand it conceptually makes sense to return early if we know there is nothing to check because we don’t expect any inputs. So I am still undecided on this one and happy to hear opinions.
  8. TheCharlatan commented at 10:12 pm on May 20, 2025: contributor
    Concept ACK
  9. fjahr renamed this:
    RFC: refactor: Split multithreaded case out of CheckInputScripts
    refactor: Split multithreaded case out of CheckInputScripts
    on May 20, 2025
  10. DrahtBot added the label Refactoring on May 20, 2025
  11. stickies-v commented at 10:28 am on May 21, 2025: contributor
    Concept ACK
  12. DrahtBot added the label Needs rebase on May 22, 2025
  13. theuni commented at 6:29 pm on May 22, 2025: member
    Ready for rebase.
  14. fjahr force-pushed on May 22, 2025
  15. fjahr commented at 8:21 pm on May 22, 2025: contributor

    Ready for rebase.

    Done, ready for review :)

  16. DrahtBot removed the label Needs rebase on May 22, 2025
  17. in src/validation.cpp:2733 in 5dfd10c725 outdated
    2741-            if (!tx_ok) {
    2742-                // Any transaction validation failure in ConnectBlock is a block consensus failure
    2743-                state.Invalid(BlockValidationResult::BLOCK_CONSENSUS,
    2744-                              tx_state.GetRejectReason(), tx_state.GetDebugMessage());
    2745-                break;
    2746+                if (!CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache)) {
    


    theuni commented at 9:26 pm on May 22, 2025:

    I think this should go a little further. At this point, CheckInputScripts is essentially meant to be a mempool function and PrepareInputScriptChecks is for block validation, except for the weird edge-case when we’re single-threaded.

    I think I’d rather just see this go all the way and make the split purposeful. Something like:

     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index 1c81620889f..d9a447b460f 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -2723,19 +2723,18 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
     5         {
     6             bool fCacheResults = fJustCheck; /* Don't cache results if we're actually connecting blocks (still consult the cache, though) */
     7             TxValidationState tx_state;
     8-            // If parallel script checking is possible (worker threads are available) the checks are appended to a
     9-            // vector without running them. The vector is then added to control which runs the checks asynchronously.
    10-            // Otherwise, CheckInputScripts runs the checks on a single thread before returning.
    11-            if (control) {
    12-                std::vector<CScriptCheck> vChecks = PrepareInputScriptChecks(tx, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache);
    13-                control->Add(std::move(vChecks));
    14-            } else {
    15-                if (!CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache)) {
    16-                    // Any transaction validation failure in ConnectBlock is a block consensus failure
    17-                    state.Invalid(BlockValidationResult::BLOCK_CONSENSUS,
    18-                                  tx_state.GetRejectReason(), tx_state.GetDebugMessage());
    19-                    break;
    20+            // If parallel script checking is possible (worker threads are available), they are added to control which
    21+            // which runs the checks asynchronously. Otherwise they are run here directly.
    22+            std::vector<CScriptCheck> vChecks = PrepareInputScriptChecks(tx, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache);
    23+            if (control) control->Add(std::move(vChecks));
    24+            else {
    25+                for (auto& check : vChecks) {
    26+                    if (const auto& check_result = check()) {
    27+                        state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, strprintf("mandatory-script-verify-flag-failed (%s)", ScriptErrorString(check_result->first)), check_result->second);
    28+                        break;
    29+                    }
    30                 }
    31+                if (!state.IsValid()) break;
    32             }
    33         }
    

    An alternative would be to make CCheckQueueControl work with 0 worker threads (maybe it already does?) and just always use it. That would completely unify the approach here.


    mzumsande commented at 10:54 pm on May 22, 2025:

    I like that idea.

    I think it would be a small change in behaviour, because if I read this line right it will add entries to the cache if 1)pvChecks is not set (single-threaded) 2) cacheFullScriptStore is set (i.e. if ConnectBlock() was called with fJustCheck=True, so currently only from TestBlockValidity). I wonder if the current behavior is intended - it seems very strange to me that mining rpcs would add entries to the script cache, but only in the -par=1 case.


    theuni commented at 6:01 pm on June 17, 2025:

    @mzumsande I think it’s very unlikely that that behavior is intended :)

    Edit: likely -> unlikely.


    theuni commented at 6:02 pm on June 17, 2025:
    Whooops, *Very unlikely.

    fjahr commented at 10:04 pm on June 19, 2025:
    Took the suggestion here.
  18. in src/validation.cpp:2216 in 5dfd10c725 outdated
    2211+        return {};
    2212+    }
    2213+
    2214+    EnsureTxData(tx, inputs, txdata);
    2215+
    2216+    std::vector<CScriptCheck> pvChecks{};
    


    mzumsande commented at 9:38 pm on May 22, 2025:
    Just a rough idea, but did you consider the alternative approach of doing the pvChecks population not here (but in ConnectBlock() or a separate function), so that CheckInputScripts could also call PrepareInputScriptChecks and there would be no duplication of the coinbase check, cache check and TxData init?

    fjahr commented at 10:03 pm on June 19, 2025:
    I have implemented this at the cost of duplicating the cache entry calculation. I think it should be fine but I will try to run some benchmarks.
  19. refactor: Extract cache check from CheckInputScripts dcd3c59fa4
  20. refactor: Extract TxData initialization from CheckInputScripts 43b734647f
  21. fjahr force-pushed on Jun 19, 2025
  22. fjahr commented at 10:00 pm on June 19, 2025: contributor

    An alternative would be to make CCheckQueueControl work with 0 worker threads (maybe it already does?) and just always use it. That would completely unify the approach here.

    Yeah, that does work basically out of the box. I tested it with this minimal change on top of the changes here: https://github.com/fjahr/bitcoin/commit/320b9a967f88ece5b98779948c47493ad7717da2 However I don’t favor that approach at the moment since there seems to be a lot of overhead when using checkqueue with a single thread and a lot of docs would need to be changed accordingly as well. Just logically you wouldn’t think about using a queue when you just have a single thread, so while it would definitely simplify the code in ConnectBlock, I am not sure it’s a bigger win than the other approaches discussed here, given that it may cause irritation in other places and a performance penalty. I am happy to be convinced of this approach though if I overlooked something.

    I have now pushed here an updated version that addresses both the suggestions from @theuni and @mzumsande .

  23. refactor: Extract PreCheckInputScripts 76ef1d26e8
  24. refactor: Remove multiprocess handling from CheckInputScripts 6667ea17a9
  25. in src/validation.cpp:2712 in edcc004c40 outdated
    2716-                tx_ok = CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache, &vChecks);
    2717-                if (tx_ok) control->Add(std::move(vChecks));
    2718-            } else {
    2719-                tx_ok = CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache);
    2720+            // If parallel script checking is possible (worker threads are available), they are added to control which
    2721+            // which runs the checks asynchronously. Otherwise they are run here directly.
    


    maflcko commented at 9:34 am on June 20, 2025:
    “this transactions’s input scripts” -> “this transaction’s input scripts” [incorrect possessive form]
    duplicate “which” in “control which which runs the checks asynchronously” -> remove one “which” [typo duplication]
    

    fjahr commented at 12:59 pm on June 20, 2025:
    both typos fixed
  26. fjahr force-pushed on Jun 20, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-06-23 09:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me