validation: fetch block inputs in parallel during ConnectBlock #35295

pull andrewtoth wants to merge 12 commits into bitcoin:master from andrewtoth:threaded-inputs changing 13 files +500 −47
  1. andrewtoth commented at 12:53 AM on May 15, 2026: contributor

    This PR is a continuation of #31132. All outstanding issues raised there have been resolved, but the volume of stale comments can make that change difficult to review.

    Currently, when connecting a block, each input prevout is looked up one at a time. For every input we first check the in-memory coins cache, and on a miss we make a synchronous round-trip to the chainstate LevelDB to read the coin from disk. Because these lookups happen serially as the block is being validated, the disk read latency stacks up and dominates the time spent in ConnectBlock whenever many inputs are not already in the cache.

    This PR moves those disk reads onto a pool of worker threads that run in parallel with block connection. Before entering ConnectBlock the block is handed to a CoinsViewOverlay, which kicks off the workers to begin fetching all of the block's prevouts from disk and warming the cache. The main validation thread continues to do exactly the same work it does today, hitting the cache for each input in order. The only difference is that by the time it asks, the coin is much more likely to already be there. There are no validation logic or consensus behavior changes. This is purely a parallelization of an existing read pattern.

    The number of fetcher threads is configurable via -inputfetchthreads=<n>, defaulting to 4 and capped at 16. Setting it to 0 disables input fetching entirely and reverts to the previous serial behavior.
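    As a rough illustration of the option semantics described above (default 4, cap 16, 0 disables), a minimal sketch follows. The constant and function names are hypothetical and not taken from the PR, and mapping negative values to 0 is an assumption made only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical names; the values 4 and 16 come from the PR description.
constexpr int64_t DEFAULT_INPUT_FETCH_THREADS{4};
constexpr int64_t MAX_INPUT_FETCH_THREADS{16};

// Clamp a requested -inputfetchthreads value into [0, MAX_INPUT_FETCH_THREADS].
// A result of 0 means parallel input fetching is disabled.
// Assumption: negative requests are treated the same as 0.
int64_t ClampInputFetchThreads(int64_t requested)
{
    return std::clamp<int64_t>(requested, 0, MAX_INPUT_FETCH_THREADS);
}

// Convenience wrapper for this sketch only.
bool InputFetchingEnabled(int64_t requested)
{
    return ClampInputFetchThreads(requested) > 0;
}
```

    With this sketch, a request of 100 is reduced to 16, and a request of 0 leaves fetching disabled.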

    We have measured large performance gains for IBD and -reindex-chainstate, as well as worst-case steady-state block connection at the tip. l0rinc ran many thorough benchmarking passes on the original PR across multiple machines, storage types, dbcache sizes[^1], operating systems[^2], and fetcher thread counts[^3]. Many other contributors also posted their benchmark results in the original PR. IBD speedups range from 1.18× to over 3× faster[^4]. Worst-case block connection time for network-attached storage was over 2× faster[^5]. Flamegraph comparisons before and after this change are available[^6].

    On safety: ConnectBlock runs while holding cs_main, so nothing else in the node can mutate the chainstate while the fetchers are reading it.

    On LevelDB: concurrent reads are fully supported and documented as such. We already rely on this in production today against our other LevelDB-backed databases. The txindex DB is read by multiple simultaneous HTTP RPC worker threads via the getrawtransaction RPC. The blockfilterindex DB is called concurrently from both the P2P cfilters / cfheaders / cfcheckpt message handlers on the msghand thread, and from the getblockfilter RPC on the HTTP RPC worker threads. We have not yet been issuing concurrent reads against the chainstate DB, but there is no LevelDB-side reason we can't. In fact, the chainstate DB is already being touched by more than one thread on master, because LevelDB schedules its own background compaction work.

    [^2]: #31132 (comment)
    [^3]: #31132 (comment)
    [^4]: #31132 (comment)
    [^5]: #31132 (comment)
    [^6]: #31132 (comment)

  2. validation: collect block inputs in CoinsViewOverlay before ConnectBlock
    Introduce CoinsViewOverlay::StartFetching, which maps all input prevouts of a
    block to a new m_inputs vector of InputToFetch elements. It returns a ResetGuard
    that is lifetime-bound to the block; the InputToFetch elements are
    lifetime-bound to the block as well.
    
    Introduce StopFetching to clear the m_inputs vector.
    CCoinsViewCache::Reset is made virtual and is overridden in CoinsViewOverlay.
    StopFetching is called on Reset, so the InputToFetch objects will not
    exceed the lifetime of the block.
    
    Introduce ProcessInput to fetch the utxo of an individual input in m_inputs.
    Each caller fetches the input at m_input_head and increments it, so each call
    will fetch the next input in the queue.
    
    Fetch coins from the m_inputs vector in FetchCoinFromBase by scanning all inputs
    until we discover the input with the correct outpoint.
    
    This is designed deliberately so multiple threads can call ProcessInput independently.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
    a15470468b
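    The guard-based lifetime handling described in this commit can be sketched roughly as below. This is a simplified stand-in, not the PR's code: the Overlay class, the int prevouts, and PendingInputs are hypothetical, and only the names StartFetching, StopFetching, Reset, ResetGuard, and m_inputs come from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-in for CoinsViewOverlay; prevouts are modelled as ints.
class Overlay
{
    std::vector<int> m_inputs;

public:
    // RAII guard: resets the overlay when it goes out of scope, so the
    // queued inputs cannot outlive the block they were collected from.
    class ResetGuard
    {
        Overlay& m_overlay;

    public:
        explicit ResetGuard(Overlay& overlay) : m_overlay{overlay} {}
        ResetGuard(const ResetGuard&) = delete;
        ResetGuard& operator=(const ResetGuard&) = delete;
        ~ResetGuard() { m_overlay.Reset(); }
    };

    // Queue the block's prevouts and hand back the guard (C++17 copy elision).
    [[nodiscard]] ResetGuard StartFetching(const std::vector<int>& block_prevouts)
    {
        m_inputs = block_prevouts;
        return ResetGuard{*this};
    }

    void StopFetching() { m_inputs.clear(); }
    void Reset() { StopFetching(); }
    size_t PendingInputs() const { return m_inputs.size(); }
};

// Returns the number of queued inputs left after the guard is destroyed.
size_t PendingAfterScope()
{
    Overlay overlay;
    {
        auto guard = overlay.StartFetching({1, 2, 3});
        assert(overlay.PendingInputs() == 3);
    } // guard destroyed here: Reset() -> StopFetching()
    return overlay.PendingInputs();
}
```

    In this sketch the guard's destructor is the only place Reset is triggered; the commit additionally makes CCoinsViewCache::Reset virtual so the override runs through the base-class interface.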
  3. coins: filter same-block spends in StartFetching
    Inputs spending outputs of an earlier transaction in the same block won't
    be in the cache or the db. They also won't be requested by FetchCoinFromBase,
    so we can filter them out to not waste time trying to fetch them.
    
    Build an unordered set of seen txids while flattening m_inputs and skip
    any prevout whose hash is already in the set.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    ec77ed6146
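    The filtering step described in this commit might look roughly like the sketch below. The OutPoint and Tx structs and the function name are hypothetical simplifications (txids are plain strings), and treating index 0 as a coinbase with no inputs to fetch is an assumption of this model.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical, simplified stand-ins for COutPoint and CTransaction.
struct OutPoint {
    std::string txid;
    uint32_t n;
};
struct Tx {
    std::string txid;
    std::vector<OutPoint> inputs;
};

// Flatten the block's inputs into one list, skipping any prevout whose txid
// belongs to an earlier transaction in the same block: such a coin cannot be
// in the cache or the db yet, so fetching it would be wasted work.
std::vector<OutPoint> CollectInputsToFetch(const std::vector<Tx>& block_txs)
{
    std::vector<OutPoint> to_fetch;
    std::unordered_set<std::string> seen_txids;
    for (size_t i = 0; i < block_txs.size(); ++i) {
        const Tx& tx = block_txs[i];
        if (i > 0) { // assumption: index 0 is the coinbase, nothing to fetch
            for (const OutPoint& prevout : tx.inputs) {
                if (seen_txids.count(prevout.txid) == 0) to_fetch.push_back(prevout);
            }
        }
        seen_txids.insert(tx.txid);
    }
    return to_fetch;
}
```

    Because transactions are visited in block order, a single pass is enough: by the time an input is examined, every earlier txid is already in the set.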
  4. consensus: add MIN_TXIN_SERIALIZED_SIZE and MAX_INPUTS_PER_BLOCK
    Provides a worst-case upper bound on the number of inputs that can fit in
    a block, so callers (e.g. parallel input prefetching) can pre-allocate
    stable storage and rule out reallocation of per-input state.
    
    Cherry-picked from PR #9938 (Lock-Free CheckQueue), with MAX_TXINS_PER_BLOCK
    renamed to MAX_INPUTS_PER_BLOCK to match the call site.
    
    Co-authored-by: Jeremy Rubin <jeremy.l.rubin@gmail.com>
    a82c6186bb
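    One way such a bound could be derived is sketched below. The formula is an assumption for illustration and may differ from the definition in the commit; the integer types are simplified, and InputSlot and AllocateInputSlots are hypothetical. The 41-byte minimum follows from the serialized layout of an input with an empty scriptSig.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Existing consensus values (types simplified to int64_t for this sketch).
constexpr int64_t MAX_BLOCK_WEIGHT{4'000'000};
constexpr int64_t WITNESS_SCALE_FACTOR{4};

// 32-byte txid + 4-byte output index + 1-byte (empty) scriptSig length
// + 4-byte sequence number.
constexpr int64_t MIN_TXIN_SERIALIZED_SIZE{32 + 4 + 1 + 4};

// Assumed derivation: every non-witness byte costs WITNESS_SCALE_FACTOR
// weight units, so no block can hold more inputs than this.
constexpr int64_t MAX_INPUTS_PER_BLOCK{MAX_BLOCK_WEIGHT / (WITNESS_SCALE_FACTOR * MIN_TXIN_SERIALIZED_SIZE)};

static_assert(MIN_TXIN_SERIALIZED_SIZE == 41);
static_assert(MAX_INPUTS_PER_BLOCK == 24'390);

// Hypothetical per-input state.
struct InputSlot {
    uint32_t prevout_index{0};
    bool fetched{false};
};

// Reserve once up front so later push_back calls never reallocate and
// references to individual slots stay valid while a block is processed.
std::vector<InputSlot> AllocateInputSlots()
{
    std::vector<InputSlot> slots;
    slots.reserve(static_cast<size_t>(MAX_INPUTS_PER_BLOCK));
    assert(slots.capacity() >= static_cast<size_t>(MAX_INPUTS_PER_BLOCK));
    return slots;
}
```

    The bound is deliberately loose: it ignores the block header, transaction overhead, and outputs, which only makes the real maximum smaller.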
  5. coins: add ready flag to InputToFetch
    Prepares for ProcessInput to be called from multiple threads.
    
    This flag acts as a memory fence around InputToFetch::coin. There is no lock
    guarding reads and writes of the coin field.
    Instead we use the flag's release/acquire semantics to ensure that when the
    main thread reads the coin it will have happened after a worker thread has
    finished writing it.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    70ee8ba660
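    The release/acquire handoff described in this commit can be sketched as follows. This is a simplified model under stated assumptions, not the PR's implementation: coins and outpoints are ints, the "disk read" is a multiplication, the Fetcher struct, WaitForCoin, and FetchAllAndSum are hypothetical, and having the main thread help with fetching while it waits is an assumption of this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>
#include <thread>
#include <vector>

struct InputToFetch {
    int outpoint_id{0};             // stand-in for the outpoint
    std::optional<int> coin;        // stand-in for the coin; written by one thread only
    std::atomic<bool> ready{false}; // published with release, observed with acquire
};

struct Fetcher {
    std::vector<InputToFetch> inputs; // sized once; atomics make elements immovable
    std::atomic<size_t> head{0};

    explicit Fetcher(size_t n) : inputs(n)
    {
        for (size_t i = 0; i < n; ++i) inputs[i].outpoint_id = static_cast<int>(i);
    }

    // Claim the next queued input and fetch it. Returns false once the queue
    // is exhausted. fetch_add hands each index to exactly one caller.
    bool ProcessInput()
    {
        const size_t i = head.fetch_add(1, std::memory_order_relaxed);
        if (i >= inputs.size()) return false;
        inputs[i].coin = inputs[i].outpoint_id * 10; // pretend disk read
        inputs[i].ready.store(true, std::memory_order_release);
        return true;
    }

    // Called by the main thread. The acquire load pairs with the release
    // store above, so the write to coin is visible once ready reads true.
    int WaitForCoin(size_t i)
    {
        while (!inputs[i].ready.load(std::memory_order_acquire)) {
            if (!ProcessInput()) std::this_thread::yield();
        }
        return *inputs[i].coin;
    }
};

// Fetch n_inputs coins using n_workers threads and sum them on the caller.
int FetchAllAndSum(size_t n_inputs, size_t n_workers)
{
    Fetcher fetcher(n_inputs);
    std::vector<std::thread> workers;
    for (size_t w = 0; w < n_workers; ++w) {
        workers.emplace_back([&fetcher] { while (fetcher.ProcessInput()) {} });
    }
    int sum = 0;
    for (size_t i = 0; i < n_inputs; ++i) sum += fetcher.WaitForCoin(i);
    for (std::thread& t : workers) t.join();
    return sum;
}
```

    No lock guards coin here; correctness rests on each index being claimed by exactly one thread and on the main thread reading coin only after it has observed ready as true.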
  6. coins: stop fetching before mutating base
    Prepares for ProcessInput to be called from multiple threads.
    
    ProcessInput reads from base. For ProcessInput to be safe to call in parallel
    on separate threads, base must not be mutated while fetching is in progress.
    Flush, Sync, and SetBackend can modify base, so we override these and
    call StopFetching before calling the base class.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    fec8a2b732
  7. validation: add -inputfetchthreads configuration option
    Add a configuration option for the number of worker threads used for
    parallel UTXO input fetching during block connection.
    
    Default is 4 threads, max is 16, 0 disables parallel fetching.
    22c4c1d737
  8. coins: introduce thread pool in CoinsViewOverlay
    Prepares for ProcessInput to be called from multiple threads.
    
    Introduce a ThreadPool shared pointer to CoinsViewOverlay. A pool managed
    externally can be passed in the constructor.
    
    A global thread pool is used in fuzz harnesses since iterations can happen
    faster than the OS can create and tear down thread pools.
    Creating a new pool per iteration can therefore cause a memory leak when fuzzing.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    a3981aa8a0
  9. coins: fetch inputs in parallel
    Leverages the thread pool to fetch inputs on multiple threads, while the overlay
    serves inputs on the main thread.
    
    This is a performance improvement over blocking the main thread to fetch inputs.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    1f0c8957e1
  10. doc: update CoinsViewOverlay docstring to describe parallel fetching
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    6d915f4178
  11. test: add unit tests for CoinsViewOverlay::StartFetching
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    650416ec87
  12. fuzz: update harnesses to cover CoinsViewOverlay::StartFetching
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    Co-authored-by: sedited <seb.kung@gmail.com>
    db941367f0
  13. fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads d6946a12ab
  14. DrahtBot added the label Validation on May 15, 2026
  15. DrahtBot commented at 12:53 AM on May 15, 2026: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35295.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-15 15:12 UTC
