refactor: Split multithreaded case out of CheckInputScripts #32575

pull fjahr wants to merge 3 commits into bitcoin:master from fjahr:2025-05-CheckInputScript-split changing 2 files +117 −68
  1. fjahr commented at 8:52 pm on May 20, 2025: contributor

    This topic has been motivated by my work on batch validation and a related conversation just happened here: #32467 (review)

    CheckInputScripts currently only does what its name implies if there is no multithreading usage for validation. If there are worker threads available the function instead only creates the checks and puts them into a vector. Aside from some shared pre-checks the multithreaded code path is much simpler compared to the rest. This dual use makes the function hard to grasp and its naming confusing. So this PR is splitting the single thread case from the multithread. This results in more LOC but the result is much clearer.

  2. DrahtBot commented at 8:52 pm on May 20, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32575.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK theuni, TheCharlatan, stickies-v

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32317 (kernel: Separate UTXO set access from validation functions by TheCharlatan)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. theuni commented at 8:57 pm on May 20, 2025: member

    Concept ACK. Anything to clean this up :)

    See also #32317 which moves some of this functionality around similarly.

  4. fjahr force-pushed on May 20, 2025
  5. fjahr marked this as ready for review on May 20, 2025
  6. fjahr commented at 9:49 pm on May 20, 2025: contributor
    Cleaned up the comments so this should be ready for more detailed feedback, but expect a rebase once #32467 is merged.
  7. in src/validation.cpp:2207 in 4262b75726 outdated
    2202+std::vector<CScriptCheck> PrepareInputScriptChecks(const CTransaction& tx,
    2203+                       const CCoinsViewCache& inputs, unsigned int flags, bool cacheSigStore,
    2204+                       bool cacheFullScriptStore, PrecomputedTransactionData& txdata,
    2205+                       ValidationCache& validation_cache) EXCLUSIVE_LOCKS_REQUIRED(cs_main)
    2206+{
    2207+    if (tx.IsCoinBase()) return {};
    


    fjahr commented at 9:54 pm on May 20, 2025:
    I have been contemplating kicking this out (here and in CheckInputScripts) or even turning this into an assert because the callers already handle this everywhere. On the other hand it conceptually makes sense to return early if we know there is nothing to check because we don’t expect any inputs. So I am still undecided on this one and happy to hear opinions.
  8. TheCharlatan commented at 10:12 pm on May 20, 2025: contributor
    Concept ACK
  9. fjahr renamed this:
    RFC: refactor: Split multithreaded case out of CheckInputScripts
    refactor: Split multithreaded case out of CheckInputScripts
    on May 20, 2025
  10. DrahtBot added the label Refactoring on May 20, 2025
  11. stickies-v commented at 10:28 am on May 21, 2025: contributor
    Concept ACK
  12. DrahtBot added the label Needs rebase on May 22, 2025
  13. theuni commented at 6:29 pm on May 22, 2025: member
    Ready for rebase.
  14. refactor: Extract cache check from CheckInputScripts c0c85851b7
  15. refactor: Extract TxData initialization from CheckInputScripts bb17b4e7f9
  16. refactor: Split PrepareInputScriptChecks for multiprocess from CheckInputScripts 5dfd10c725
  17. fjahr force-pushed on May 22, 2025
  18. fjahr commented at 8:21 pm on May 22, 2025: contributor

    Ready for rebase.

    Done, ready for review :)

  19. DrahtBot removed the label Needs rebase on May 22, 2025
  20. in src/validation.cpp:2733 in 5dfd10c725
    2741-            if (!tx_ok) {
    2742-                // Any transaction validation failure in ConnectBlock is a block consensus failure
    2743-                state.Invalid(BlockValidationResult::BLOCK_CONSENSUS,
    2744-                              tx_state.GetRejectReason(), tx_state.GetDebugMessage());
    2745-                break;
    2746+                if (!CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache)) {
    


    theuni commented at 9:26 pm on May 22, 2025:

    I think this should go a little further. At this point, CheckInputScripts is essentially meant to be a mempool function and PrepareInputScriptChecks is for block validation, except for the weird edge-case when we’re single-threaded.

    I think I’d rather just see this go all the way and make the split purposeful. Something like:

     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index 1c81620889f..d9a447b460f 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -2723,19 +2723,18 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
     5         {
     6             bool fCacheResults = fJustCheck; /* Don't cache results if we're actually connecting blocks (still consult the cache, though) */
     7             TxValidationState tx_state;
     8-            // If parallel script checking is possible (worker threads are available) the checks are appended to a
     9-            // vector without running them. The vector is then added to control which runs the checks asynchronously.
    10-            // Otherwise, CheckInputScripts runs the checks on a single thread before returning.
    11-            if (control) {
    12-                std::vector<CScriptCheck> vChecks = PrepareInputScriptChecks(tx, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache);
    13-                control->Add(std::move(vChecks));
    14-            } else {
    15-                if (!CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache)) {
    16-                    // Any transaction validation failure in ConnectBlock is a block consensus failure
    17-                    state.Invalid(BlockValidationResult::BLOCK_CONSENSUS,
    18-                                  tx_state.GetRejectReason(), tx_state.GetDebugMessage());
    19-                    break;
    20+            // If parallel script checking is possible (worker threads are available), they are added to control which
    21+            // which runs the checks asynchronously. Otherwise they are run here directly.
    22+            std::vector<CScriptCheck> vChecks = PrepareInputScriptChecks(tx, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache);
    23+            if (control) control->Add(std::move(vChecks));
    24+            else {
    25+                for (auto& check : vChecks) {
    26+                    if (const auto& check_result = check()) {
    27+                        state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, strprintf("mandatory-script-verify-flag-failed (%s)", ScriptErrorString(check_result->first)), check_result->second);
    28+                        break;
    29+                    }
    30                 }
    31+                if (!state.IsValid()) break;
    32             }
    33         }
    

    An alternative would be to make CCheckQueueControl work with 0 worker threads (maybe it already does?) and just always use it. That would completely unify the approach here.


    mzumsande commented at 10:54 pm on May 22, 2025:

    I like that idea.

    I think it would be a small change in behaviour, because if I read this line right it will add entries to the cache if 1)pvChecks is not set (single-threaded) 2) cacheFullScriptStore is set (i.e. if ConnectBlock() was called with fJustCheck=True, so currently only from TestBlockValidity). I wonder if the current behavior is intended - it seems very strange to me that mining rpcs would add entries to the script cache, but only in the -par=1 case.

  21. in src/validation.cpp:2216 in 5dfd10c725
    2211+        return {};
    2212+    }
    2213+
    2214+    EnsureTxData(tx, inputs, txdata);
    2215+
    2216+    std::vector<CScriptCheck> pvChecks{};
    


    mzumsande commented at 9:38 pm on May 22, 2025:
    Just a rough idea, but did you consider the alternative approach of doing the pvChecks population not here (but in ConnectBlock() or a separate function), so that CheckInputScripts could also call PrepareInputScriptChecks and there would be no duplication of the coinbase check, cache check and TxData init?

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-05-25 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me