This PR is a continuation of #31132. All outstanding issues raised there have been resolved, but the volume of stale comments can make that change difficult to review.
Currently, when connecting a block, each input prevout is looked up one at a time. For every input we first check the in-memory coins cache, and on a miss we make a synchronous round-trip to the chainstate LevelDB to read the coin from disk. Because these lookups happen serially as the block is being validated, the disk read latency stacks up and dominates the time spent in ConnectBlock whenever many inputs are not already in the cache.
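To make the cost concrete, here is a minimal, self-contained sketch of that access pattern. The types are illustrative stand-ins, not the actual Bitcoin Core coins classes; the point is only that a cache miss turns into a blocking disk read on the validation thread.

```cpp
#include <cstdint>
#include <optional>
#include <unordered_map>
#include <vector>

// Illustrative stand-ins; not the actual Bitcoin Core types.
struct OutPoint {
    uint64_t txid_lo; uint32_t n;
    bool operator==(const OutPoint& o) const { return txid_lo == o.txid_lo && n == o.n; }
};
struct OutPointHasher {
    size_t operator()(const OutPoint& o) const { return o.txid_lo ^ o.n; }
};
struct Coin { int64_t value; };

// Stand-in for the chainstate LevelDB: in reality GetCoin() is a blocking
// round-trip to disk, which is the cost this PR moves off the hot path.
struct DiskCoinsView {
    std::unordered_map<OutPoint, Coin, OutPointHasher> disk;
    std::optional<Coin> GetCoin(const OutPoint& out) const {
        auto it = disk.find(out);
        if (it == disk.end()) return std::nullopt;
        return it->second;
    }
};

// In-memory coins cache layered on top of the disk-backed view.
struct CoinsCache {
    DiskCoinsView& base;
    std::unordered_map<OutPoint, Coin, OutPointHasher> cache;

    const Coin* Fetch(const OutPoint& out) {
        if (auto it = cache.find(out); it != cache.end()) return &it->second; // cache hit
        if (auto coin = base.GetCoin(out)) {                                   // miss: synchronous disk read
            return &cache.emplace(out, *coin).first->second;
        }
        return nullptr;
    }
};

// Current pattern during block connection: each miss blocks validation,
// so the per-input disk latencies are paid one after another.
void ConnectBlockInputs(CoinsCache& view, const std::vector<OutPoint>& prevouts)
{
    for (const auto& out : prevouts) {
        const Coin* coin = view.Fetch(out);
        (void)coin; // ...spend checks, script verification, etc...
    }
}
```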
This PR moves those disk reads onto a pool of worker threads that run in parallel with block connection. Before entering ConnectBlock, the block is handed to a CoinsViewOverlay, which kicks off the workers to fetch all of the block's prevouts from disk and warm the cache. The main validation thread does exactly the same work it does today, looking up each input in the cache in order; the only difference is that by the time it asks, the coin is much more likely to already be there. There are no changes to validation logic or consensus behavior. This is purely a parallelization of an existing read pattern.
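A hypothetical sketch of the parallelization, reusing the illustrative types from the example above. The PR's actual classes, scheduling, and locking differ, and in the PR the fetchers overlap with validation rather than completing up front, but the shape is the same: stripe the prevouts over worker threads, let each issue its own concurrent LevelDB read, and insert the results into the cache so validation mostly hits.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical prefetch sketch (not the PR's implementation). For simplicity
// the workers finish before validation starts; in the PR they run in parallel
// with ConnectBlock and validation simply finds the cache already warm.
void PrefetchAndConnect(CoinsCache& view, const std::vector<OutPoint>& prevouts,
                        unsigned n_threads)
{
    std::mutex cache_mutex;
    std::vector<std::thread> workers;
    for (unsigned t = 0; t < n_threads; ++t) {
        workers.emplace_back([&, t] {
            // Stripe the prevouts across workers; LevelDB reads are safe to
            // issue from multiple threads concurrently.
            for (size_t i = t; i < prevouts.size(); i += n_threads) {
                if (auto coin = view.base.GetCoin(prevouts[i])) {
                    std::lock_guard<std::mutex> lock(cache_mutex);
                    view.cache.emplace(prevouts[i], *coin);
                }
            }
        });
    }
    for (auto& w : workers) w.join();

    // Validation proceeds exactly as before; Fetch() now almost always hits.
    ConnectBlockInputs(view, prevouts);
}
```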
The number of fetcher threads is configurable via -inputfetchthreads=<n>, defaulting to 4 and capped at 16. Setting it to 0 disables input fetching entirely and reverts to the previous serial behavior.
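For reference, assuming the option is handled like other bitcoind startup options, it can also be set in the config file (same name, without the leading dash):

```
# bitcoin.conf — equivalent to passing -inputfetchthreads=8 on the command line
inputfetchthreads=8   # number of input fetcher threads (default 4, max 16, 0 disables)
```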
We have measured large performance gains for IBD and -reindex-chainstate, as well as for worst-case steady-state block connection at the tip. l0rinc ran many thorough benchmarking passes on the original PR across multiple machines, storage types, dbcache sizes[^1], operating systems[^2], and fetcher thread counts[^3]. Many other contributors also posted their benchmark results in the original PR. IBD speedups range from 1.18× to over 3×[^4]. Worst-case block connection time on network-attached storage was over 2× faster[^5]. Flamegraph comparisons before and after this change are available[^6].
On safety: ConnectBlock runs while holding cs_main, so nothing else in the node can mutate the chainstate while the fetchers are reading it.
On LevelDB: concurrent reads are fully supported and documented as such. We already rely on this in production today against our other LevelDB-backed databases. The txindex DB is read by multiple simultaneous HTTP RPC worker threads via the getrawtransaction RPC. The blockfilterindex DB is read concurrently both from the P2P cfilters / cfheaders / cfcheckpt message handlers on the msghand thread and from the getblockfilter RPC on the HTTP RPC worker threads. We have not yet been issuing concurrent reads against the chainstate DB, but there is no LevelDB-side reason we can't. In fact, the chainstate DB is already touched by more than one thread on master, because LevelDB schedules its own background compaction work.
[^2]: #31132 (comment)
[^3]: #31132 (comment)
[^4]: #31132 (comment)
[^5]: #31132 (comment)
[^6]: #31132 (comment)