refactor: Split multithreaded case out of CheckInputScripts

fjahr commented at 8:52 pm on May 20, 2025: contributor

This topic has been motivated by my work on batch validation and a related conversation just happened here: #32467 (review)

CheckInputScripts currently only does what its name implies if there is no multithreading usage for validation. If there are worker threads available the function instead only creates the checks and puts them into a vector. Aside from some shared pre-checks the multithreaded code path is much simpler compared to the rest. This dual use makes the function hard to grasp and its naming confusing. So this PR is splitting the single thread case from the multithread. This results in more LOC but the result is much clearer.

DrahtBot commented at 8:52 pm on May 20, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32575.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	theuni, TheCharlatan, stickies-v

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#33050 (net, validation: don’t punish peers for consensus-invalid txs by ajtowns)
#32998 (Bump SCRIPT_VERIFY flags to 64 bit by ajtowns)
#32317 (kernel: Separate UTXO set access from validation functions by TheCharlatan)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

theuni commented at 8:57 pm on May 20, 2025: member

Concept ACK. Anything to clean this up :)

See also #32317 which moves some of this functionality around similarly.

fjahr force-pushed on May 20, 2025

fjahr marked this as ready for review on May 20, 2025

fjahr commented at 9:49 pm on May 20, 2025: contributor

Cleaned up the comments so this should be ready for more detailed feedback, ~~but expect a rebase once #32467 is merged.~~

in src/validation.cpp:2207 in 4262b75726 outdated

2202+std::vector<CScriptCheck> PrepareInputScriptChecks(const CTransaction& tx,
2203+                       const CCoinsViewCache& inputs, unsigned int flags, bool cacheSigStore,
2204+                       bool cacheFullScriptStore, PrecomputedTransactionData& txdata,
2205+                       ValidationCache& validation_cache) EXCLUSIVE_LOCKS_REQUIRED(cs_main)
2206+{
2207+    if (tx.IsCoinBase()) return {};

fjahr commented at 9:54 pm on May 20, 2025:

I have been contemplating kicking this out (here and in CheckInputScripts) or even turning this into an assert because the callers already handle this everywhere. On the other hand it conceptually makes sense to return early if we know there is nothing to check because we don’t expect any inputs. So I am still undecided on this one and happy to hear opinions.

TheCharlatan commented at 10:12 pm on May 20, 2025: contributor

Concept ACK

fjahr renamed this:
~~RFC: refactor: Split multithreaded case out of CheckInputScripts~~
refactor: Split multithreaded case out of CheckInputScripts
on May 20, 2025

DrahtBot added the label Refactoring on May 20, 2025

stickies-v commented at 10:28 am on May 21, 2025: contributor

Concept ACK

DrahtBot added the label Needs rebase on May 22, 2025

theuni commented at 6:29 pm on May 22, 2025: member

Ready for rebase.

fjahr force-pushed on May 22, 2025

fjahr commented at 8:21 pm on May 22, 2025: contributor

Ready for rebase.

Done, ready for review :)

DrahtBot removed the label Needs rebase on May 22, 2025

in src/validation.cpp:2733 in 5dfd10c725 outdated

2741-            if (!tx_ok) {
2742-                // Any transaction validation failure in ConnectBlock is a block consensus failure
2743-                state.Invalid(BlockValidationResult::BLOCK_CONSENSUS,
2744-                              tx_state.GetRejectReason(), tx_state.GetDebugMessage());
2745-                break;
2746+                if (!CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache)) {

theuni commented at 9:26 pm on May 22, 2025:

I think this should go a little further. At this point, CheckInputScripts is essentially meant to be a mempool function and PrepareInputScriptChecks is for block validation, except for the weird edge-case when we’re single-threaded.

I think I’d rather just see this go all the way and make the split purposeful. Something like:

 0diff --git a/src/validation.cpp b/src/validation.cpp
 1index 1c81620889f..d9a447b460f 100644
 2--- a/src/validation.cpp
 3+++ b/src/validation.cpp
 4@@ -2723,19 +2723,18 @@ bool Chainstate::ConnectBlock(const CBlock& block, BlockValidationState& state,
 5         {
 6             bool fCacheResults = fJustCheck; /* Don't cache results if we're actually connecting blocks (still consult the cache, though) */
 7             TxValidationState tx_state;
 8-            // If parallel script checking is possible (worker threads are available) the checks are appended to a
 9-            // vector without running them. The vector is then added to control which runs the checks asynchronously.
10-            // Otherwise, CheckInputScripts runs the checks on a single thread before returning.
11-            if (control) {
12-                std::vector<CScriptCheck> vChecks = PrepareInputScriptChecks(tx, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache);
13-                control->Add(std::move(vChecks));
14-            } else {
15-                if (!CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache)) {
16-                    // Any transaction validation failure in ConnectBlock is a block consensus failure
17-                    state.Invalid(BlockValidationResult::BLOCK_CONSENSUS,
18-                                  tx_state.GetRejectReason(), tx_state.GetDebugMessage());
19-                    break;
20+            // If parallel script checking is possible (worker threads are available), they are added to control which
21+            // which runs the checks asynchronously. Otherwise they are run here directly.
22+            std::vector<CScriptCheck> vChecks = PrepareInputScriptChecks(tx, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache);
23+            if (control) control->Add(std::move(vChecks));
24+            else {
25+                for (auto& check : vChecks) {
26+                    if (const auto& check_result = check()) {
27+                        state.Invalid(BlockValidationResult::BLOCK_CONSENSUS, strprintf("mandatory-script-verify-flag-failed (%s)", ScriptErrorString(check_result->first)), check_result->second);
28+                        break;
29+                    }
30                 }
31+                if (!state.IsValid()) break;
32             }
33         }

An alternative would be to make CCheckQueueControl work with 0 worker threads (maybe it already does?) and just always use it. That would completely unify the approach here.

mzumsande commented at 10:54 pm on May 22, 2025:

I like that idea.

I think it would be a small change in behaviour, because if I read this line right it will add entries to the cache if 1)pvChecks is not set (single-threaded) 2) cacheFullScriptStore is set (i.e. if ConnectBlock() was called with fJustCheck=True, so currently only from TestBlockValidity). I wonder if the current behavior is intended - it seems very strange to me that mining rpcs would add entries to the script cache, but only in the -par=1 case.

theuni commented at 6:01 pm on June 17, 2025:

@mzumsande I think it’s very unlikely that that behavior is intended :)

Edit: likely -> unlikely.

theuni commented at 6:02 pm on June 17, 2025:

Whooops, *Very unlikely.

fjahr commented at 10:04 pm on June 19, 2025:

Took the suggestion here.

in src/validation.cpp:2216 in 5dfd10c725 outdated

2211+        return {};
2212+    }
2213+
2214+    EnsureTxData(tx, inputs, txdata);
2215+
2216+    std::vector<CScriptCheck> pvChecks{};

mzumsande commented at 9:38 pm on May 22, 2025:

Just a rough idea, but did you consider the alternative approach of doing the pvChecks population not here (but in ConnectBlock() or a separate function), so that CheckInputScripts could also call PrepareInputScriptChecks and there would be no duplication of the coinbase check, cache check and TxData init?

fjahr commented at 10:03 pm on June 19, 2025:

I have implemented this at the cost of duplicating the cache entry calculation. I think it should be fine but I will try to run some benchmarks.

refactor: Extract cache check from CheckInputScripts dcd3c59fa4

refactor: Extract TxData initialization from CheckInputScripts 43b734647f

fjahr force-pushed on Jun 19, 2025

fjahr commented at 10:00 pm on June 19, 2025: contributor

An alternative would be to make CCheckQueueControl work with 0 worker threads (maybe it already does?) and just always use it. That would completely unify the approach here.

Yeah, that does work basically out of the box. I tested it with this minimal change on top of the changes here: https://github.com/fjahr/bitcoin/commit/320b9a967f88ece5b98779948c47493ad7717da2 However I don’t favor that approach at the moment since there seems to be a lot of overhead when using checkqueue with a single thread and a lot of docs would need to be changed accordingly as well. Just logically you wouldn’t think about using a queue when you just have a single thread, so while it would definitely simplify the code in ConnectBlock, I am not sure it’s a bigger win than the other approaches discussed here, given that it may cause irritation in other places and a performance penalty. I am happy to be convinced of this approach though if I overlooked something.

I have now pushed here an updated version that addresses both the suggestions from @theuni and @mzumsande .

refactor: Extract PreCheckInputScripts 76ef1d26e8

refactor: Remove multiprocess handling from CheckInputScripts 6667ea17a9

in src/validation.cpp:2712 in edcc004c40 outdated

2716-                tx_ok = CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache, &vChecks);
2717-                if (tx_ok) control->Add(std::move(vChecks));
2718-            } else {
2719-                tx_ok = CheckInputScripts(tx, tx_state, view, flags, fCacheResults, fCacheResults, txsdata[i], m_chainman.m_validation_cache);
2720+            // If parallel script checking is possible (worker threads are available), they are added to control which
2721+            // which runs the checks asynchronously. Otherwise they are run here directly.

maflcko commented at 9:34 am on June 20, 2025:

“this transactions’s input scripts” -> “this transaction’s input scripts” [incorrect possessive form]
duplicate “which” in “control which which runs the checks asynchronously” -> remove one “which” [typo duplication]

fjahr commented at 12:59 pm on June 20, 2025:

both typos fixed

fjahr force-pushed on Jun 20, 2025

refactor: Split multithreaded case out of CheckInputScripts #32575

Code Coverage & Benchmarks

Reviews

Conflicts