validation: fetch block input prevouts in parallel during ConnectBlock #35295

pull andrewtoth wants to merge 11 commits into bitcoin:master from andrewtoth:threaded-inputs changing 16 files +603 −40
  1. andrewtoth commented at 12:53 AM on May 15, 2026: contributor

    This PR is a continuation of #31132. All outstanding issues raised there have been resolved, but the volume of stale comments can make that change difficult to review.

    Currently, when connecting a block, each input prevout is looked up one at a time. For every input we first check the in-memory coins cache, and on a miss we make a synchronous round-trip to the chainstate LevelDB to read the coin from disk. Because these lookups happen serially as the block is being validated, the disk read latency stacks up and dominates the time spent in ConnectBlock whenever many inputs are not already in the cache.

    This PR moves those disk reads onto a pool of worker threads that run in parallel with block connection. Before entering ConnectBlock the block is handed to a CoinsViewOverlay, which kicks off the workers to begin fetching all of the block's prevouts from disk and warming the cache. The main validation thread continues to do exactly the same work it does today, hitting the cache for each input in order. The only difference is that by the time it asks, the coin is much more likely to already be there. There are no validation logic or consensus behavior changes. This is purely a parallelization of an existing read pattern.

    The number of fetcher threads is configurable via -prevoutfetchthreads=<n>, defaulting to 8 and capped at 16. Setting it to 0 disables input fetching entirely and reverts to the previous serial behavior.

    We have measured large performance gains for IBD and -reindex-chainstate, as well as worst-case steady-state block connection at the tip. l0rinc ran many thorough benchmarking passes on the original PR across multiple machines, storage types, dbcache sizes^1, operating systems[^2], and fetcher thread counts[^3]. Many other contributors also posted their benchmark results in the original PR. IBD speedups range from 1.18× to over 3× faster[^4]. Worst-case block connection time for network-attached storage was over 2× faster[^5]. Flamegraph comparisons before and after this change are available[^6].

    On safety: ConnectBlock runs while holding cs_main, so nothing else in the node can mutate the chainstate while the fetchers are reading it.

    On LevelDB: concurrent reads are fully supported and documented as such. We already rely on this in production today against our other LevelDB-backed databases. The txindex DB is read by multiple simultaneous HTTP RPC worker threads via the getrawtransaction RPC. The blockfilterindex DB is called concurrently from both the P2P cfilters / cfheaders / cfcheckpt message handlers on the msghand thread, and from the getblockfilter RPC on the HTTP RPC worker threads. We have not yet been issuing concurrent reads against the chainstate DB, but there is no LevelDB-side reason we can't. In fact, the chainstate DB is already being touched by more than one thread on master, because LevelDB schedules its own background compaction work.

    For reviewers:

    The main change is CoinsViewOverlay gets 1 new public and 2 new private methods.

    • StartFetching: public method called in lieu of CreateResetGuard before we enter ConnectBlock. It still returns a ResetGuard so the view is Reset before the block it is working on leaves scope. This kicks off worker threads who each just run while (ProcessInput()) {} and then return.
    • StopFetching: private method called on Reset whenever the guard leaves scope, or a method that mutates base is called (Flush/Sync/SetBackend). Stops all threads and clears multi threaded state.
    • ProcessInput: private method that fetches a single input prevout. Returns true if an input was fetched and false otherwise. This is the only method on CoinsViewOverlay that is called concurrently by multiple threads. Every other method on the overlay is still called synchronously on the main thread.

    The CoinsViewOverlay::FetchCoinFromBase method is also extended to lookup the coins fetched from ProcessInput first before falling back to base->PeekCoin.

    Mutating methods Reset, Flush, Sync and SetBackend are overridden in CoinsViewOverlay to call StopFetching first.

    [^2]: #31132 (comment) [^3]: #31132 (comment) [^4]: #31132 (comment) [^5]: #31132 (comment) [^6]: #31132 (comment)

  2. DrahtBot added the label Validation on May 15, 2026
  3. DrahtBot commented at 12:53 AM on May 15, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--006a51241073e994b41acfe9ec718e94-->

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35295.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK l0rinc, theStack, sedited, ryanofsky
    Concept ACK rkrux, w0xlt, jsarenik

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #35587 (Remove boost as a unit test runner by rustaceanrob)
    • #35569 (Encapsulation for CTransaction by purpleKarrot)
    • #34132 (coins: drop error catcher, centralize fatal read handling by l0rinc)
    • #32317 (kernel: Separate UTXO set access from validation functions by sedited)
    • #31260 (scripted-diff: Type-safe settings retrieval by ryanofsky)
    • #30342 (kernel, logging: Pass Logger instances to kernel objects by ryanofsky)
    • #29700 (kernel, refactor: return error status on all fatal errors by ryanofsky)
    • #28690 (build: Introduce internal kernel library by sedited)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

  4. in src/node/chainstatemanager_args.cpp:63 in 22c4c1d737
      59 | @@ -60,6 +60,13 @@ util::Result<void> ApplyArgsManOptions(const ArgsManager& args, ChainstateManage
      60 |      // Subtract 1 because the main thread counts towards the par threads.
      61 |      opts.worker_threads_num = script_threads - 1;
      62 |  
      63 | +    if (auto value{args.GetIntArg("-inputfetchthreads")}) {
    


    l0rinc commented at 2:02 PM on May 15, 2026:

    22c4c1d validation: add -inputfetchthreads configuration option:

    https://corecheck.dev/bitcoin/bitcoin/pulls/35295 indicates we're not testing this part of the code - could we add a unit test for this?

    diff --git a/src/test/validation_chainstatemanager_tests.cpp b/src/test/validation_chainstatemanager_tests.cpp
    --- a/src/test/validation_chainstatemanager_tests.cpp	(revision d6946a12ab306c5054b535b0b7fac0101fb306ae)
    +++ b/src/test/validation_chainstatemanager_tests.cpp	(revision 82be6bdbe682cf9b4d1fb46820302cbbfb5c34b6)
    @@ -991,6 +991,13 @@
     
         BOOST_CHECK(!get_opts({"-minimumchainwork=xyz"}));                                                               // invalid hex characters
         BOOST_CHECK(!get_opts({"-minimumchainwork=01234567890123456789012345678901234567890123456789012345678901234"})); // > 64 hex chars
    +
    +    // test -inputfetchthreads
    +    BOOST_CHECK_EQUAL(get_valid_opts({}).inputfetch_threads_num, DEFAULT_INPUTFETCH_THREADS);
    +    BOOST_CHECK_EQUAL(get_valid_opts({"-inputfetchthreads=0"}).inputfetch_threads_num, 0);
    +    BOOST_CHECK_EQUAL(get_valid_opts({"-inputfetchthreads=3"}).inputfetch_threads_num, 3);
    +    BOOST_CHECK_EQUAL(get_valid_opts({"-inputfetchthreads=100"}).inputfetch_threads_num, MAX_INPUTFETCH_THREADS);
    +    BOOST_CHECK(!get_opts({"-inputfetchthreads=-1"}));
     }
     
     BOOST_AUTO_TEST_SUITE_END()
    
  5. in src/validation.cpp:1859 in a3981aa8a0
    1856 |      AssertLockHeld(::cs_main);
    1857 |      m_cacheview = std::make_unique<CCoinsViewCache>(&m_catcherview);
    1858 | -    m_connect_block_view = std::make_unique<CoinsViewOverlay>(&*m_cacheview);
    1859 | +    auto thread_pool{std::make_shared<ThreadPool>("inputfetch")};
    1860 | +    if (inputfetch_threads > 0) {
    1861 | +        thread_pool->Start(inputfetch_threads);
    


    l0rinc commented at 2:22 PM on May 15, 2026:

    a3981aa coins: introduce thread pool in CoinsViewOverlay:

    nit: do we want to be able to init the cache with > MAX_INPUTFETCH_THREADS threads? That's not really possible currently as far as I can tell, but we may want to clamp here again to make it future-proof.

  6. in src/coins.h:591 in 70ee8ba660
     586 |          const COutPoint& outpoint;
     587 |          //! The coin that workers will fetch and main thread will insert into cache.
     588 |          mutable std::optional<Coin> coin{std::nullopt};
     589 |  
     590 | +        //! The move constructor will never be used, since m_inputs will never need to reallocate.
     591 | +        InputToFetch(InputToFetch&& other) noexcept : outpoint{other.outpoint} { Assert(false); }
    


    l0rinc commented at 9:51 AM on May 18, 2026:

    70ee8ba coins: add ready flag to InputToFetch:

    nit: https://corecheck.dev/bitcoin/bitcoin/pulls/35295 coverage indicates this is unused - maybe a simple unit test could cover it. Not sure it matters...

  7. in src/coins.h:709 in 1f0c8957e1 outdated
     712 | +                    while (ProcessInput()) {}
     713 | +                });
     714 | +                if (auto futures{m_thread_pool->Submit(std::move(tasks))}; futures.has_value()) {
     715 | +                    m_futures = std::move(*futures);
     716 | +                } else {
     717 | +                    m_inputs.clear();
    


    l0rinc commented at 9:52 AM on May 18, 2026:

    1f0c895 coins: fetch inputs in parallel:

    https://corecheck.dev/bitcoin/bitcoin/pulls/35295 indicates this path is never taken


    ryanofsky commented at 7:12 PM on May 27, 2026:

    In commit "coins: fetch inputs in parallel" (1f0c8957e1e483608148dc4bf7f396b82d2c6e44)

    Unclear why clearing the inputs is the right thing to do when the thread pool submit fails.

    Since thread pool is passed externally and not only controlled by this code, it would seem good to handle failures by doing something like raising an exception or logging an error. Of if clearing m_inputs is a good way to handle this, it would be worth saying what this does in a comment.


    andrewtoth commented at 9:18 PM on May 31, 2026:

    I added a comment explaining why it's the correct action to take, and I also added a warning log here.

  8. in src/coins.h:685 in fec8a2b732 outdated
     677 | @@ -678,6 +678,24 @@ class CoinsViewOverlay : public CCoinsViewCache
     678 |          while (ProcessInput()) {}
     679 |          return CreateResetGuard();
     680 |      }
     681 | +
     682 | +    void SetBackend(CCoinsView& view) override
     683 | +    {
     684 | +        StopFetching();
     685 | +        CCoinsViewCache::SetBackend(view);
    


    l0rinc commented at 9:56 AM on May 18, 2026:

    fec8a2b coins: stop fetching before mutating base:

    Same, https://corecheck.dev/bitcoin/bitcoin/pulls/35295 indicates this is never actually called. This is probably a consequence of the Coins interface bloat; maybe we could throw, log, or Assume instead?


    andrewtoth commented at 9:01 PM on May 31, 2026:

    It is called in the fuzz harness.


    l0rinc commented at 9:09 PM on May 31, 2026:

    If it's test-only code maybe we could throw, log, or Assume instead?

  9. in src/coins.h:694 in fec8a2b732 outdated
     689 | +    {
     690 | +        StopFetching();
     691 | +        CCoinsViewCache::Flush(reallocate_cache);
     692 | +    }
     693 | +
     694 | +    void Sync() override
    


    l0rinc commented at 9:56 AM on May 18, 2026:

    fec8a2b coins: stop fetching before mutating base:

    I'm a bit surprised this is never covered in https://corecheck.dev/bitcoin/bitcoin/pulls/35295 - can you please investigate?

    What is the expected behavior when we have a second fetch session attached to the same underlying base, e.g.

    BOOST_AUTO_TEST_CASE(fetch_state_is_cleared_by_mutating_operations)
    {
        const auto block{CreateBlock()};
        CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
        CCoinsViewCache main_cache{&db};
        CCoinsViewCache alternate_cache{&db};
        PopulateView(block, main_cache);
        PopulateView(block, alternate_cache);
    
        for (const bool flush : {false, true}) {
            CoinsViewOverlay view{&main_cache, StartedThreadPool()};
            const auto first_guard{view.StartFetching(block)};
            view.SetBestBlock(uint256::ONE);
            if (flush) {
                view.Flush();
            } else {
                view.Sync();
            }
    
            const auto second_guard{view.StartFetching(block)};
            CheckCache(block, view);
        }
    
        CoinsViewOverlay view{&main_cache, StartedThreadPool()};
        const auto first_guard{view.StartFetching(block)};
        view.SetBackend(alternate_cache);
    
        const auto second_guard{view.StartFetching(block)};
        CheckCache(block, view);
    }
    
  10. in src/coins.h:645 in a15470468b outdated
     640 | +            for (const auto& input : tx->vin) {
     641 | +                m_inputs.emplace_back(input.prevout);
     642 | +            }
     643 | +        }
     644 | +        while (ProcessInput()) {}
     645 | +        return CreateResetGuard();
    


    l0rinc commented at 10:07 AM on May 18, 2026:

    a154704 validation: collect block inputs in CoinsViewOverlay before ConnectBlock:

    It should be safe to start twice, we could cover that with a test case, maybe something like:

    BOOST_AUTO_TEST_CASE(fetch_state_is_reusable_after_reset)
    {
        const auto block{CreateBlock()};
        CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
        CCoinsViewCache main_cache{&db};
        PopulateView(block, main_cache);
        CoinsViewOverlay view{&main_cache, StartedThreadPool()};
    
        {
            const auto reset_guard{view.StartFetching(block)};
            CheckCache(block, view);
            BOOST_CHECK_GT(view.GetCacheSize(), 0U);
        }
    
        BOOST_CHECK_EQUAL(view.GetCacheSize(), 0U);
    
        const auto reset_guard{view.StartFetching(block)};
        CheckCache(block, view);
    }
    
  11. in src/coins.h:646 in a3981aa8a0 outdated
     641 | @@ -640,10 +642,13 @@ class CoinsViewOverlay : public CCoinsViewCache
     642 |              return std::move(input.coin);
     643 |          }
     644 |  
     645 | -        // We will only get here for BIP30 checks.
     646 | +        // We will only get here for BIP30 checks or when parallel fetching is disabled.
     647 |          return base->PeekCoin(outpoint);
    


    l0rinc commented at 10:09 AM on May 18, 2026:

    a3981aa coins: introduce thread pool in CoinsViewOverlay:

    Since the workers call PeekCoin() concurrently on the shared base cache, maybe we could force routing the lookup through std::as_const to make the concurrent-read contract explicit at the access site:

    std::optional<Coin> CCoinsViewCache::PeekCoin(const COutPoint& outpoint) const
    {
        const auto& coins{std::as_const(cacheCoins)};
        if (auto it{coins.find(outpoint)}; it != coins.end()) {
            return it->second.coin.IsSpent() ? std::nullopt : std::optional{it->second.coin};
        }
        return base->PeekCoin(outpoint);
    }
    

    andrewtoth commented at 9:03 PM on May 31, 2026:

    I did not take this. This seems out of scope for this change.

  12. in src/test/fuzz/coins_view.cpp:452 in d6946a12ab outdated
     447 | +        .path = "",
     448 | +        .cache_bytes = 1_MiB,
     449 | +        .memory_only = true,
     450 | +    };
     451 | +    CCoinsViewDB backend_base_coins_view{std::move(db_params), CoinsViewOptions{}};
     452 | +    CCoinsViewCache backend_cache{&backend_base_coins_view, /*deterministic=*/true};
    


    l0rinc commented at 10:23 AM on May 18, 2026:

    d6946a1 fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads:

    The generated block inputs are mostly arbitrary misses this way - could we help the fuzzer by allowing a fuzz-selected subset of the staged external prevouts into the DB before starting the overlay, so the same concurrent read path covers real backing-store hits as well as misses:

    void PopulateFetchInputs(CCoinsView& view, const CBlock& block, FuzzedDataProvider& fuzzed_data_provider)
    {
        CCoinsViewCache cache{&view, /*deterministic=*/true};
        cache.SetBestBlock(uint256::ONE);
        for (const auto& tx : block.vtx | std::views::drop(1)) {
            for (const auto& input : tx->vin) {
                if (!fuzzed_data_provider.ConsumeBool()) continue;
                auto coin{ConsumeDeserializable<Coin>(fuzzed_data_provider)};
                if (!coin || coin->IsSpent()) continue;
                cache.AddCoin(input.prevout, std::move(*coin), /*possible_overwrite=*/true);
            }
        }
        cache.Flush();
    }
    ...
    CCoinsViewDB backend_base_coins_view{std::move(db_params), CoinsViewOptions{}};
    CBlock block{BuildRandomBlock(fuzzed_data_provider)};
    PopulateFetchInputs(backend_base_coins_view, block, fuzzed_data_provider);
    CCoinsViewCache backend_cache{&backend_base_coins_view, /*deterministic=*/true};
    

    andrewtoth commented at 9:03 PM on May 31, 2026:

    I did something similar. I add the utxos when I create the block.

  13. in src/test/coinsviewoverlay_tests.cpp:196 in 650416ec87 outdated
     191 | @@ -175,5 +192,65 @@ BOOST_AUTO_TEST_CASE(fetch_no_inputs)
     192 |      BOOST_CHECK_EQUAL(view.GetCacheSize(), 0);
     193 |  }
     194 |  
     195 | +// Access coins that are not block inputs
     196 | +BOOST_AUTO_TEST_CASE(access_non_input_coins)
    


    l0rinc commented at 1:32 PM on May 18, 2026:

    650416e test: add unit tests for CoinsViewOverlay::StartFetching:

    Would it make sense to cover the thread pool interruption explicitly?

    BOOST_AUTO_TEST_CASE(fetch_interrupted_thread_pool_uses_normal_lookup)
    {
        const auto block{CreateBlock()};
        CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
        CCoinsViewCache main_cache{&db};
        PopulateView(block, main_cache);
    
        auto thread_pool{std::make_shared<ThreadPool>("fetch_interrupted")};
        thread_pool->Start(DEFAULT_INPUTFETCH_THREADS);
        thread_pool->Interrupt();
    
        {
            CoinsViewOverlay view{&main_cache, thread_pool};
            const auto reset_guard{view.StartFetching(block)};
            CheckCache(block, view);
        }
    
        thread_pool->Stop();
    }
    

    and maybe the theoretic-but-documented out-of-order flow:

    BOOST_AUTO_TEST_CASE(fetch_out_of_order_input_uses_normal_lookup)
    {
        const auto block{CreateBlock()};
        CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
        CCoinsViewCache main_cache{&db};
        PopulateView(block, main_cache);
    
        std::vector<COutPoint> fetched_inputs;
        std::unordered_set<Txid, SaltedTxidHasher> txids;
        txids.reserve(block.vtx.size() - 1);
        for (const auto& tx : block.vtx | std::views::drop(1)) {
            for (const auto& input : tx->vin) {
                if (!txids.contains(input.prevout.hash)) fetched_inputs.push_back(input.prevout);
            }
            txids.emplace(tx->GetHash());
        }
        BOOST_REQUIRE_GE(fetched_inputs.size(), 2U);
    
        CoinsViewOverlay view{&main_cache, StartedThreadPool()};
        const auto reset_guard{view.StartFetching(block)};
    
        const auto& out_of_order_input{fetched_inputs[1]};
        BOOST_CHECK(!view.AccessCoin(out_of_order_input).IsSpent());
        BOOST_CHECK(view.HaveCoinInCache(out_of_order_input));
    
        CheckCache(block, view);
    }
    
  14. in src/coins.h:631 in a15470468b
     626 | +        StopFetching();
     627 | +        CCoinsViewCache::Reset();
     628 | +    }
     629 | +
     630 |  public:
     631 |      using CCoinsViewCache::CCoinsViewCache;
    


    l0rinc commented at 1:35 PM on May 18, 2026:

    a154704 validation: collect block inputs in CoinsViewOverlay before ConnectBlock:

    nit: maybe we could harden destruction of the overlay (though callers must still keep the reset guard inside the overlay lifetime)

        ~CoinsViewOverlay() noexcept override { StopFetching(); }
    
  15. in src/test/coinsviewoverlay_tests.cpp:61 in d6946a12ab
      56 |      for (const auto& tx : block.vtx | std::views::drop(1)) {
      57 |          for (const auto& in : tx->vin) {
      58 | +            if (txids.contains(in.prevout.hash)) continue;
      59 |              Coin coin{};
      60 |              if (!spent) coin.out.nValue = 1;
      61 |              cache.EmplaceCoinInternalDANGER(COutPoint{in.prevout}, std::move(coin));
    


    l0rinc commented at 1:39 PM on May 18, 2026:

    Since the prod usually uses AddCoin for the unspent setup paths maybe we could exercise that as well:

                if (spent) {
                    cache.EmplaceCoinInternalDANGER(COutPoint{in.prevout}, std::move(coin));
                } else {
                    coin.out.nValue = 1;
                    cache.AddCoin(in.prevout, std::move(coin), /*possible_overwrite=*/false);
                }
    
  16. l0rinc approved
  17. l0rinc commented at 1:48 PM on May 18, 2026: contributor

    ACK d6946a12ab306c5054b535b0b7fac0101fb306ae

    My concerns from the original PR were addressed, left a few ideas on how to extend code coverage but the implementation finally seems complete to me, thanks for your persistence @andrewtoth \:D/

  18. rkrux commented at 2:34 PM on May 20, 2026: contributor

    I don't know when I will get around to reviewing this PR but I do want to leave a Concept ACK d6946a1 here at this stage.

    It's also very very good to see multiple use cases of the common thread pool coming up in the codebase: http/rpcserver, wallet chain scanning, and this.

  19. in src/coins.h:698 in d6946a12ab
     698 | +            // We advance the tail since the input is cached and not accessed through this method again.
     699 | +            auto& input{m_inputs[m_input_tail++]};
     700 | +            // Check if the coin is ready to be read. We need acquire so we match the worker thread's release.
     701 | +            while (!input.ready.test(std::memory_order_acquire)) {
     702 | +                // Work instead of waiting if the coin is not ready
     703 | +                if (!ProcessInput()) {
    


    ryanofsky commented at 2:09 PM on May 21, 2026:

    In commit "coins: add ready flag to InputToFetch" (70ee8ba66055aed16cfa9e0a8ebeea6db0aad1b9)

    Is there a reason to call ProcessInput() here instead of input.ready.wait()?

    The comment above says this lets the main thread make progress instead of waiting for a specific worker, but was this benchmarked, or is there a reason to expect it to help? Since the main thread is the only thread that can continue validation, it seems possible that it should prioritize waiting for the needed input instead of switching to fetching later inputs, which may involve blocking I/O and may not make the current input ready any sooner.

    (Sorry if this was already asked somewhere. Either way it would be good to update the "Otherwise the main thread calls ProcessInput()..." comment above with a more specific rationale.)


    andrewtoth commented at 9:07 PM on May 31, 2026:

    I think this was done with an older design where we might not have any threads in the thread pool, and therefore wouldn't make any progress in that case. With the current design where we clear m_inputs if we didn't successfully submit tasks to at least 1 worker thread, it makes sense to just wait here. It simplifies this code and continues validation immediately instead of having the main thread potentially wait for expensive IO. I also bumped the default worker threads from 4 to 8. I believe we saw a speedup here due to this, but I think that was because we would now be fetching on a 5th thread by default. So bumping the default and just waiting on the main thread should be even faster.

  20. ryanofsky commented at 2:15 PM on May 21, 2026: contributor

    Code review d6946a12ab306c5054b535b0b7fac0101fb306ae. I've been meaning to review this for a while and finally getting to it now. Just one question so far below

  21. theStack commented at 8:53 PM on May 26, 2026: contributor

    Concept ACK

    I haven't had a chance yet to look at the code in-depth, but noticed that starting with commit a3981aa8a01195e3d4442e5d258f8f3198b49400 ("coins: introduce thread pool in CoinsViewOverlay"), the seemingly unrelated unit test test_LockDirectory fails on my arm64 Linux machine:

    $ git rev-parse --short HEAD
    a3981aa8a0
    $ ./build/bin/test_bitcoin --log_level=test_suite --run_test=coinsviewoverlay_tests,util_tests
    ./test/util_tests.cpp(1056): Entering test case "test_LockDirectory"     
    unknown location(0): fatal error: in "util_tests/test_LockDirectory": memory access violation at address: 0xffffb44470f0: no mapping at fault address
    ./test/util_tests.cpp(1068): last checkpoint                                                
    Test is aborted                                                                             
    ./test/util_tests.cpp(1056): Leaving test case "test_LockDirectory"; testing time: 2099us
    Test is aborted                                                                                                                                                                         
    ./test/util_tests.cpp(71): Leaving test suite "util_tests"; testing time: 10658us
    Test is aborted                                                                                                                                                                         
    Leaving test module "Bitcoin Core Test Suite"; testing time: 1008us                       
                                                                                                                                                                                            
    *** 1 failure is detected in the test module "Bitcoin Core Test Suite"                                                                                                                  
    ./test/util_tests.cpp(1135): error: in "util_tests/test_LockDirectory": check processstatus == 0 has failed [51456 != 0]      
    

    I suspect that the problem is that the forked unit test process in test_LockDirectory gets shut down hard, while the thread pool from the coinsviewoverlay_tests (started by StartedThreadPool) is still running. Keeping only the following minimal unit test in coinsviewoverlay_tests.cpp still fails on my machine with the command from above:

    BOOST_AUTO_TEST_CASE(lockdirectory_test_fail_reproducer)
    {
        CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
        CCoinsViewCache main_cache{&db};
        CoinsViewOverlay view{&main_cache, StartedThreadPool()};
    }
    

    If I add StartedThreadPool()->Stop(); at the end, the LockDirectory unit test failure is gone. Happy to share more details and debug if there is a suspicion that this could point to a deeper issue, though ISTM that this is only a test problem and stopping the thread pool at the end of the coinsviewoverlay test suite (with a test fixture?) would solve the problem.

  22. in src/coins.h:637 in a15470468b
     632 | +
     633 | +    //! Start fetching inputs from block.
     634 | +    [[nodiscard]] ResetGuard StartFetching(const CBlock& block LIFETIMEBOUND) noexcept
     635 | +    {
     636 | +        Assume(m_inputs.empty());
     637 | +        Assume(m_input_head.load(std::memory_order_relaxed) == 0);
    


    ryanofsky commented at 9:20 PM on May 26, 2026:

    In commit "validation: collect block inputs in CoinsViewOverlay before ConnectBlock" (a15470468ba89f2f86184191423c2a4d5c2ffcc1)

    This is not a performance hotspot, and if these preconditions are wrong it would seem to indicate a serious problem with validation logic. Would suggest either replacing Assume with Assert or at least taking some action such logging a warning if the Assume returns false, not just ignoring the errors

  23. in src/coins.h:626 in a15470468b outdated
     621 |      }
     622 |  
     623 | +protected:
     624 | +    void Reset() noexcept override
     625 | +    {
     626 | +        StopFetching();
    


    ryanofsky commented at 9:52 PM on May 26, 2026:

    In commit "validation: collect block inputs in CoinsViewOverlay before ConnectBlock" (a15470468ba89f2f86184191423c2a4d5c2ffcc1)

    Since this is a virtual method it means StopFetching could be called (through CreateResetGuard) even if StartFetching is never called, which is surprising, even if safe. Seems worth mentioning StopFetching could be called without StartFetching if this is intentional.

    More ideally it would seem good to just drop the CCoinsViewCache::CreateResetGuard method now that the only non-test code calling it is StartFetching. None of the test code actually uses the reset guard, except as a kludge to immediately reset the cache, so it seems like the tests would be better off either recreating the cache or calling a test-only reset method.


    andrewtoth commented at 9:11 PM on May 31, 2026:

    I added a comment to note that StopFetching is idempotent, and can be safely called when not fetching.

    I'm not sure it's worth refactoring the change to drop CreateResetGuard in this PR, as there is already a lot happening. I opted to not do that (can be refactored in a follow-up).

  24. in src/coins.h:613 in a15470468b
     608 | +    }
     609 | +
     610 |      std::optional<Coin> FetchCoinFromBase(const COutPoint& outpoint) const override
     611 |      {
     612 | +        // TODO: linear scan; replaced with O(1) m_input_tail lookup in a follow-up commit.
     613 | +        for (const auto i : std::views::iota(0U, m_inputs.size())) {
    


    ryanofsky commented at 9:56 PM on May 26, 2026:

    In commit "validation: collect block inputs in CoinsViewOverlay before ConnectBlock" (a15470468ba89f2f86184191423c2a4d5c2ffcc1)

    i is replaced with m_inputs_head immediately in the next commit, so it seems like it would reduce churn in the PR and make it clearer by introducing both m_input_head and m_input_tail this commit, instead of adding m_input_head and i now and replacing i later.


    andrewtoth commented at 9:11 PM on May 31, 2026:

    Bundled m_input_head into the first commit.

  25. in src/coins.h:647 in ec77ed6146
     641 | @@ -635,11 +642,16 @@ class CoinsViewOverlay : public CCoinsViewCache
     642 |      {
     643 |          Assume(m_inputs.empty());
     644 |          Assume(m_input_head.load(std::memory_order_relaxed) == 0);
     645 | +        Assume(m_input_tail == 0);
     646 |          // Loop through the inputs of the block and set them in the queue.
     647 | +        // Filter txs that are spending inputs created earlier in the same block.
    


    ryanofsky commented at 10:03 PM on May 26, 2026:

    In commit "coins: filter same-block spends in StartFetching" (ec77ed61467a80112e172e27ebafa6e8011b7a5c)

    It's not obvious that these inputs "won't be requested by FetchCoinFromBase" as stated in the commit message. That fact seems worth mentioning in this comment, otherwise it looks like this will result in extra PeekCoin calls.

  26. in src/coins.h:659 in 70ee8ba660
     651 | @@ -635,7 +652,12 @@ class CoinsViewOverlay : public CCoinsViewCache
     652 |      }
     653 |  
     654 |  public:
     655 | -    using CCoinsViewCache::CCoinsViewCache;
     656 | +    explicit CoinsViewOverlay(CCoinsView* in_base, bool deterministic = false) noexcept
     657 | +        : CCoinsViewCache{in_base, deterministic}
     658 | +    {
     659 | +        // Reserve to maximum theoretical number so emplace_back in StartFetching never reallocates m_inputs.
     660 | +        m_inputs.reserve(MAX_INPUTS_PER_BLOCK);
    


    ryanofsky commented at 10:20 PM on May 26, 2026:

    In commit "coins: add ready flag to InputToFetch" (70ee8ba66055aed16cfa9e0a8ebeea6db0aad1b9)

    This seems like a fragile way to ensure vector elements will not be moved. It would seem safer to replace the vector with a std::array (and an m_num_inputs field) to guarantee they will never move.


    l0rinc commented at 3:28 PM on May 29, 2026:

    Isn't the stack limit 1MB on Windows? std::array would be inline in CoinsViewOverlay, so local instances would put 80 * 24390 = 1,951,200 bytes on the stack just for m_inputs. A heap-backed std::unique_ptr<std::array<...>> could work, but a direct std::array member seems risky. It would also need another state field to track the number of valid inputs. We might be able to get around that with a noop sentinel, but that seems more risky than keeping the current explicit size.


    andrewtoth commented at 5:22 PM on May 29, 2026:

    I was experimenting with this, and I don't think an array is ideal here. I think instead of asserting that we will never move, we can properly implement the move constructor here but assert in it that ready will be false and coin will be nullopt. We can get rid of the MAX_INPUTS_PER_BLOCK commit and just let it resize inside of StartFetching.


    l0rinc commented at 1:00 PM on May 31, 2026:

    andrewtoth commented at 2:00 PM on May 31, 2026:

    What changed since https://mirror.b10c.me/bitcoin-bitcoin/31132#discussion_r3166350935?

    The comment above stating that it seems fragile. The previous version in the referenced thread of #31132 was just partially moving the outpoint and ignoring the other fields. Instead, we can move all fields and assert that ready is false and coin is nullopt.


    andrewtoth commented at 9:12 PM on May 31, 2026:

    Updated to drop the MAX_INPUTS_PER_BLOCK and implement the move constructor but assert that ready is false and coin is nullopt.

  27. in src/coins.h:684 in fec8a2b732 outdated
     677 | @@ -678,6 +678,24 @@ class CoinsViewOverlay : public CCoinsViewCache
     678 |          while (ProcessInput()) {}
     679 |          return CreateResetGuard();
     680 |      }
     681 | +
     682 | +    void SetBackend(CCoinsView& view) override
     683 | +    {
     684 | +        StopFetching();
    


    ryanofsky commented at 6:43 PM on May 27, 2026:

    In commit "coins: stop fetching before mutating base" (fec8a2b732bda4c2287bbeab8cfb29af43e9e254)

    Commit description seems makes it clear why SetBackend / Flush / Sync would not be safe to call while inputs are being fetched, but it's unclear (1) what would be calling these methods while inputs are being fetched and (2) why stopping fetching is the right behavior in these cases. It would be good to say more about this in the commit message or code comments.

    Also, it seems fragile for CoinsViewOverlay to have to override these 3 methods individually, because if new methods or overloads are added to the parent class, CoinsViewOverlay could be silently broken. If CoinsViewOverlay needs to stop fetching when the base view is mutated it would seem better to have a single UpdatingBaseView method it could override, or something like that, instead of needing to override 3 methods.


    andrewtoth commented at 9:17 PM on May 31, 2026:

    I updated the commit message to explain why these changes were needed. In production code we don't need any of them, since Flush is the only method to be called on CoinsViewOverlay, and for a valid block all inputs would have already been fetched to get there. Nevertheless these methods are called by our fuzz harnesses, which exercise the full API of CCoinsViewCache. So, it allows us to substitute CoinsViewOverlay with CCoinsViewCache there. I also added unit tests to cover all these.

    The UpdatingBaseView idea would be an interesting refactor but seems out of scope here, so I opted not to take it.

  28. in src/init.cpp:520 in 22c4c1d737
     516 | @@ -517,6 +517,7 @@ void SetupServerArgs(ArgsManager& argsman, bool can_listen_ipc)
     517 |      argsman.AddArg("-minimumchainwork=<hex>", strprintf("Minimum work assumed to exist on a valid chain in hex (default: %s, testnet3: %s, testnet4: %s, signet: %s)", defaultChainParams->GetConsensus().nMinimumChainWork.GetHex(), testnetChainParams->GetConsensus().nMinimumChainWork.GetHex(), testnet4ChainParams->GetConsensus().nMinimumChainWork.GetHex(), signetChainParams->GetConsensus().nMinimumChainWork.GetHex()), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);
     518 |      argsman.AddArg("-par=<n>", strprintf("Set the number of script verification threads (0 = auto, up to %d, <0 = leave that many cores free, default: %d)",
     519 |          MAX_SCRIPTCHECK_THREADS, DEFAULT_SCRIPTCHECK_THREADS), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
     520 | +    argsman.AddArg("-inputfetchthreads=<n>", strprintf("Set the number of input fetch threads (0 disables, up to %d, default: %d). Negative values are rejected.", MAX_INPUTFETCH_THREADS, DEFAULT_INPUTFETCH_THREADS), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    


    ryanofsky commented at 6:51 PM on May 27, 2026:

    In commit "validation: add -inputfetchthreads configuration option" (22c4c1d737dc64d60dd22f9a635866b8ff5236a3)

    "input fetch threads" sounds pretty generic, and it would seem good to be more specific about what these threads are. LLM suggests the following name & description which I think would be better:

    argsman.AddArg("-prevoutfetchthreads=<n>", strprintf("Set the number of threads used to prefetch block input prevouts from the chainstate database (0 disables, up to %d, default: %d). Negative values are rejected.", MAX_INPUTFETCH_THREADS, DEFAULT_INPUTFETCH_THREADS), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    
  29. in src/node/chainstatemanager_args.cpp:65 in 22c4c1d737
      59 | @@ -60,6 +60,13 @@ util::Result<void> ApplyArgsManOptions(const ArgsManager& args, ChainstateManage
      60 |      // Subtract 1 because the main thread counts towards the par threads.
      61 |      opts.worker_threads_num = script_threads - 1;
      62 |  
      63 | +    if (auto value{args.GetIntArg("-inputfetchthreads")}) {
      64 | +        if (*value < 0) {
      65 | +            return util::Error{Untranslated(strprintf("-inputfetchthreads must be non-negative (got %d). Use 0 to disable input fetching.", *value))};
    


    ryanofsky commented at 6:52 PM on May 27, 2026:

    In commit "validation: add -inputfetchthreads configuration option" (22c4c1d737dc64d60dd22f9a635866b8ff5236a3)

    Should probably say "parallel input fetching" instead of "input fetching"

  30. in src/test/coinsviewoverlay_tests.cpp:61 in a3981aa8a0
      57 | @@ -55,6 +58,17 @@ void PopulateView(const CBlock& block, CCoinsView& view, bool spent = false)
      58 |      cache.Flush();
      59 |  }
      60 |  
      61 | +//! Returns a started thread pool shared across tests, mirroring how production reuses pools.
    


    ryanofsky commented at 7:02 PM on May 27, 2026:

    In commit "coins: introduce thread pool in CoinsViewOverlay" (a3981aa8a01195e3d4442e5d258f8f3198b49400)

    Unless there's a specific reason to share state across tests, it would seem better to make them self contained and not leak resources after they finish executing. Would suggest using a fixture instead of global like:

    <details><summary>diff</summary> <p>

    --- a/src/test/coinsviewoverlay_tests.cpp
    +++ b/src/test/coinsviewoverlay_tests.cpp
    @@ -20,7 +20,15 @@
     #include <memory>
     #include <ranges>
     
    -BOOST_AUTO_TEST_SUITE(coinsviewoverlay_tests)
    +struct CoinsViewOverlaySetup {
    +    std::shared_ptr<ThreadPool> m_thread_pool{[] {
    +        auto pool{std::make_shared<ThreadPool>("fetch_test")};
    +        pool->Start(DEFAULT_INPUTFETCH_THREADS);
    +        return pool;
    +    }()};
    +};
    +
    +BOOST_FIXTURE_TEST_SUITE(coinsviewoverlay_tests, CoinsViewOverlaySetup)
     
     namespace {
     
    @@ -58,17 +66,6 @@ void PopulateView(const CBlock& block, CCoinsView& view, bool spent = false)
         cache.Flush();
     }
     
    -//! Returns a started thread pool shared across tests, mirroring how production reuses pools.
    -std::shared_ptr<ThreadPool> StartedThreadPool()
    -{
    -    static const auto thread_pool{[] {
    -        auto pool{std::make_shared<ThreadPool>("fetch_test")};
    -        pool->Start(DEFAULT_INPUTFETCH_THREADS);
    -        return pool;
    -    }()};
    -    return thread_pool;
    -}
    -
     void CheckCache(const CBlock& block, const CCoinsViewCache& cache)
     {
         uint32_t counter{0};
    @@ -98,7 +95,7 @@ BOOST_AUTO_TEST_CASE(fetch_inputs_from_db)
         CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
         PopulateView(block, db);
         CCoinsViewCache main_cache{&db};
    -    CoinsViewOverlay view{&main_cache, StartedThreadPool()};
    +    CoinsViewOverlay view{&main_cache, m_thread_pool};
         const auto& outpoint{block.vtx[1]->vin[0].prevout};
     
         BOOST_CHECK(view.HaveCoin(outpoint));
    @@ -125,7 +122,7 @@ BOOST_AUTO_TEST_CASE(fetch_inputs_from_cache)
         CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
         CCoinsViewCache main_cache{&db};
         PopulateView(block, main_cache);
    -    CoinsViewOverlay view{&main_cache, StartedThreadPool()};
    +    CoinsViewOverlay view{&main_cache, m_thread_pool};
         CheckCache(block, view);
     
         const auto& outpoint{block.vtx[1]->vin[0].prevout};
    @@ -145,7 +142,7 @@ BOOST_AUTO_TEST_CASE(fetch_no_double_spend)
         CCoinsViewCache main_cache{&db};
         // Add all inputs as spent already in cache
         PopulateView(block, main_cache, /*spent=*/true);
    -    CoinsViewOverlay view{&main_cache, StartedThreadPool()};
    +    CoinsViewOverlay view{&main_cache, m_thread_pool};
         for (const auto& tx : block.vtx) {
             for (const auto& in : tx->vin) {
                 const auto& c{view.AccessCoin(in.prevout)};
    @@ -163,7 +160,7 @@ BOOST_AUTO_TEST_CASE(fetch_no_inputs)
         const auto block{CreateBlock()};
         CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
         CCoinsViewCache main_cache{&db};
    -    CoinsViewOverlay view{&main_cache, StartedThreadPool()};
    +    CoinsViewOverlay view{&main_cache, m_thread_pool};
         for (const auto& tx : block.vtx) {
             for (const auto& in : tx->vin) {
                 const auto& c{view.AccessCoin(in.prevout)};
    

    </p> </details>


    l0rinc commented at 3:56 PM on May 29, 2026:

    Yeah, I suggested deduplicating, this is likely the cause of @theStack's failure as well, your solution is more elegant

  31. in src/coins.h:625 in 1f0c8957e1
     621 |      void StopFetching() noexcept
     622 |      {
     623 | +        if (m_futures.empty()) {
     624 | +            Assume(m_inputs.empty());
     625 | +            Assume(m_input_head.load(std::memory_order_relaxed) == 0);
     626 | +            Assume(m_input_tail == 0);
    


    ryanofsky commented at 7:15 PM on May 27, 2026:

    In commit "coins: fetch inputs in parallel" (1f0c8957e1e483608148dc4bf7f396b82d2c6e44)

    Similar to other comments made previously it would seem good to replace these Assume checks with asserts or do something besides ignoring failures they return

  32. andrewtoth force-pushed on May 31, 2026
  33. andrewtoth renamed this:
    validation: fetch block inputs in parallel during ConnectBlock
    validation: fetch block input prevouts in parallel during ConnectBlock
    on May 31, 2026
  34. andrewtoth commented at 10:34 PM on May 31, 2026: contributor

    Thanks for your reviews @l0rinc @ryanofsky @theStack. I have taken most of your suggestions.

    • We no longer call ProcessInput on main thread, and just wait on the ready flag. This let's us drop const from ProcessInput.
    • Renamed all inputfetch|InputFetch to prevoutfetch|PrevoutFetch.
    • Bumped DEFAULT_PREVOUTFETCH_THREADS from 4 to 8. This was precipitated by no longer working on main thread, so we should have more workers.
    • Addressed various other suggestions.
    • The increase in diff size is mainly from adding unit tests for full code coverage.

    git diff d6946a12ab306c5054b535b0b7fac0101fb306ae..e71a395dfcc01d66ad0cafdc67fd87ba1de04a4b

  35. in src/coins.h:616 in d0aa46a717
     611 | @@ -602,6 +612,9 @@ class CoinsViewOverlay : public CCoinsViewCache
     612 |  
     613 |          auto& input{m_inputs[i]};
     614 |          input.coin = base->PeekCoin(input.outpoint);
     615 | +        // Use release so writing coin above happens before the main thread acquires.
     616 | +        input.ready.test_and_set(std::memory_order_release);
    


    l0rinc commented at 7:00 PM on June 1, 2026:

    d0aa46a coins: add ready flag to InputToFetch:

    Since each input slot should only be claimed once via m_input_head, could this assume the flag was not already set? It shouldn't fail if multiple threads are fetching the same (or the same, multiple times), but it would signal some kind of bug:

    Assume(!input.ready.test_and_set(std::memory_order_release));
    
  36. in src/coins.h:713 in e71a395dfc outdated
     713 | +
     714 | +        // We will only get here for BIP30 checks or when parallel fetching is disabled.
     715 |          return base->PeekCoin(outpoint);
     716 |      }
     717 |  
     718 | +    //! Non-null. May have zero workers when input fetching is disabled.
    


    l0rinc commented at 7:18 PM on June 1, 2026:

    If #34844 gets merged first, this is a good candidate

  37. in src/test/coinsviewoverlay_tests.cpp:64 in cfdc1b9aef
      57 | @@ -55,6 +58,17 @@ void PopulateView(const CBlock& block, CCoinsView& view, bool spent = false)
      58 |      cache.Flush();
      59 |  }
      60 |  
      61 | +//! Returns a started thread pool shared across tests, mirroring how production reuses pools.
      62 | +std::shared_ptr<ThreadPool> StartedThreadPool()
      63 | +{
      64 | +    static const auto thread_pool{[] {
    


    l0rinc commented at 7:27 PM on June 1, 2026:

    cfdc1b9 coins: introduce thread pool in CoinsViewOverlay:

    How come these are still static? It's moved in a later commit, can you please reduce churn and correct it here, too?

  38. in src/node/chainstatemanager_args.cpp:63 in 78ea7793f9
      59 | @@ -60,6 +60,13 @@ util::Result<void> ApplyArgsManOptions(const ArgsManager& args, ChainstateManage
      60 |      // Subtract 1 because the main thread counts towards the par threads.
      61 |      opts.worker_threads_num = script_threads - 1;
      62 |  
      63 | +    if (auto value{args.GetIntArg("-prevoutfetchthreads")}) {
    


    l0rinc commented at 11:48 AM on June 2, 2026:

    78ea779 validation: add -prevoutfetchthreads configuration option:

    Is it critical that we read this as a 64 bit value and cast later? It seems to me this would also work:

    diff --git a/src/node/chainstatemanager_args.cpp b/src/node/chainstatemanager_args.cpp
    --- a/src/node/chainstatemanager_args.cpp	(revision 371c525623e42424ead869114baff07ab60505b8)
    +++ b/src/node/chainstatemanager_args.cpp	(date 1780400814309)
    @@ -18,6 +18,8 @@
     #include <util/translation.h>
     #include <validation.h>
     
     #include <algorithm>
     #include <chrono>
     #include <cstdint>
     #include <string>
    @@ -60,11 +62,11 @@
         // Subtract 1 because the main thread counts towards the par threads.
         opts.worker_threads_num = script_threads - 1;
     
    -    if (auto value{args.GetIntArg("-prevoutfetchthreads")}) {
    +    if (auto value{args.GetArg<int32_t>("-prevoutfetchthreads")}) {
             if (*value < 0) {
                 return util::Error{Untranslated(strprintf("-prevoutfetchthreads must be non-negative (got %d). Use 0 to disable parallel input fetching.", *value))};
             }
    -        opts.prevoutfetch_threads_num = static_cast<int32_t>(std::min<int64_t>(*value, MAX_PREVOUTFETCH_THREADS));
    +        opts.prevoutfetch_threads_num = std::min(*value, MAX_PREVOUTFETCH_THREADS);
         }
     
         if (auto max_size = args.GetIntArg("-maxsigcachesize")) {
    
  39. in src/test/fuzz/coins_view.cpp:96 in e71a395dfc
      91 | +Mutex g_thread_pool_mutex;
      92 | +
      93 | +void StartPoolIfNeeded() EXCLUSIVE_LOCKS_REQUIRED(!g_thread_pool_mutex)
      94 | +{
      95 | +    LOCK(g_thread_pool_mutex);
      96 | +    if (g_thread_pool->WorkersCount() == DEFAULT_PREVOUTFETCH_THREADS) return;
    


    l0rinc commented at 3:29 PM on June 2, 2026:

    nit, we can either have an ThreadPool::IsStarted() method or use WorkersCount as a boolean:

        if (!g_thread_pool->WorkersCount()) g_thread_pool->Start(DEFAULT_PREVOUTFETCH_THREADS);
    

    cc: @theStack


    l0rinc commented at 10:53 AM on June 3, 2026:

    Applies to src/test/fuzz/coinscache_sim.cpp as well

  40. in src/coins.h:758 in e71a395dfc
     759 | +            // Only start threads if we have something to fetch.
     760 | +            if (!m_inputs.empty()) {
     761 | +                std::vector<std::function<void()>> tasks(workers_count, [this] {
     762 | +                    while (ProcessInput()) {}
     763 | +                });
     764 | +                if (auto futures{m_thread_pool->Submit(std::move(tasks))}; futures.has_value()) {
    


    l0rinc commented at 3:53 PM on June 2, 2026:

    nit:

                    if (auto futures{m_thread_pool->Submit(std::move(tasks))}) {
    
  41. in src/coins.h:691 in 0f70a37afe outdated
     695 | -        std::unordered_set<Txid, SaltedTxidHasher> txids;
     696 | -        txids.reserve(block.vtx.size());
     697 | -        for (const auto& tx : block.vtx | std::views::drop(1)) {
     698 | -            for (const auto& input : tx->vin) {
     699 | -                if (!txids.contains(input.prevout.hash)) m_inputs.emplace_back(input.prevout);
     700 | +        if (const auto workers_count{m_thread_pool->WorkersCount()}; workers_count > 0) {
    


    l0rinc commented at 3:57 PM on June 2, 2026:

    0f70a37 coins: fetch inputs in parallel:

    Up until the previous commit, the cache was also warmed on a single thread; here we're reverting to the original behavior when there are no threads (because the main thread isn't fetching anymore). It's not incorrect, but it changes behavior back and forth. It's also not obvious what the speed difference should be between 1 thread and 0 threads. My understanding is that with 0 threads there is no pre-warming (i.e., done during validation), while with 1 thread it is prefetched so the validation code always finds the value already in cache. And the fact that StartFetching doesn't always "start" anything complicates things further (and prefixing with Maybe* wouldn't help). I can accept that keeping the old behavior as a fallback is necessary, but the 0 vs 1 ambiguity adds complexity that could maybe be avoided (similarly to how -par=1 means 0 additional workers for scripts, which is also quite confusing). cc: @theStack


    andrewtoth commented at 12:54 AM on June 3, 2026:

    The parallel fetching is disabled with 0 threads. Current behavior is retained. StartFetching is just essentially CreateResetGuard. With 1 or more threads, the prevouts are fetched in the background, and the main thread will not fetch them from main cache or disk. With only 1 worker thread we can expect only slightly better performance, since the main thread can continue any validation while the background worker fetches from disk. This would be especially true with -par=1, since the main thread can do CPU intensive validation while the background thread waits on disk IO.


    l0rinc commented at 10:49 AM on June 3, 2026:

    So what's the advantage of keeping the old behavior?


    andrewtoth commented at 3:06 PM on June 3, 2026:

    I don't think there's an advantage. I would not configure it to 0 myself. It was requested as an option in #31132 (comment). BIP30 checks will still read concurrently with background reads with 1 thread.


    l0rinc commented at 8:36 AM on June 5, 2026:

    It was requested as an option in #31132 (comment).

    I don't see it, can you please quote that part? Especially since the fetches are still single threaded with -prevoutfetchthreads=1.

    I don't think there's an advantage. I would not configure it to 0 myself.

    I've compared -prevoutfetchthreads=0 with -prevoutfetchthreads=1, there's basically no difference. My preference would be to disallow -prevoutfetchthreads=0 and always pre-warm the cache from now on - it's just confusing to have this optionality. cc: @willcl-ark, @ryanofsky, @theStack

    0637a1cc6f fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads
    
    2026-06-04 | reindex-chainstate | 950059 blocks | dbcache 1000 | i7-hdd | x86_64 | Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz | 8 cores | 62Gi RAM | HDD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        30800.474 s               [User: 34040.905 s, System: 1333.384 s]
     
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=1 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec43
      Time (abs ≡):        30736.822 s               [User: 36206.873 s, System: 2918.100 s]
    

    andrewtoth commented at 1:33 PM on June 5, 2026:

    quote that part?

    It is this:

    this could also be an argument for adding a config option to disable parallel fetch, if we cannot assure ourselves enough?

    the fetches are still single threaded with -prevoutfetchthreads=1

    Sorry if I was unclear from my initial reply. For BIP30 checks they will fall back to reading from the chainstate db concurrently with the single background thread.

    One other advantage is less memory overhead from one less thread, if the user is very concerned about memory.


    andrewtoth commented at 1:30 PM on June 9, 2026:

    Another advantage of keeping prevoutfetchthreads=0 that came to mind is that it very nicely avoids all new code paths and reverts back to the previous fetching logic exactly. If we discover a bug with parallel fetching post-release, being able to recommend a configuration option workaround will let us resolve this without requiring users to upgrade.

  42. in src/coins.h:644 in cfdc1b9aef outdated
     640 | @@ -639,10 +641,13 @@ class CoinsViewOverlay : public CCoinsViewCache
     641 |              return std::move(input.coin);
     642 |          }
     643 |  
     644 | -        // We will only get here for BIP30 checks.
     645 | +        // We will only get here for BIP30 checks or when parallel fetching is disabled.
    


    l0rinc commented at 3:59 PM on June 2, 2026:

    cfdc1b9 coins: introduce thread pool in CoinsViewOverlay:

    This is actually only guarded in the next commit

  43. in src/coins.h:704 in 0f70a37afe
     708 | +                    if (!txids.contains(input.prevout.hash)) m_inputs.emplace_back(input.prevout);
     709 | +                }
     710 | +                txids.emplace(tx->GetHash());
     711 | +            }
     712 | +            // Only start threads if we have something to fetch.
     713 | +            if (!m_inputs.empty()) {
    


    l0rinc commented at 4:14 PM on June 2, 2026:

    0f70a37 coins: fetch inputs in parallel:

    nit: Personally I find the !empty pattern harder to read than if (m_inputs.size()) { or if (m_inputs.size() > 0) {.

  44. in src/test/coinsviewoverlay_tests.cpp:49 in 3e995afb46
      41 | @@ -32,10 +42,13 @@ CBlock CreateBlock() noexcept
      42 |      coinbase.vin.emplace_back();
      43 |      block.vtx.push_back(MakeTransactionRef(coinbase));
      44 |  
      45 | +    Txid prevhash{Txid::FromUint256(uint256{1})};
      46 | +
      47 |      for (const auto i : std::views::iota(1, NUM_TXS)) {
      48 |          CMutableTransaction tx;
      49 | -        Txid txid{Txid::FromUint256(uint256(i))};
      50 | +        const Txid txid{i % 2 == 0 ? Txid::FromUint256(uint256(i)) : prevhash};
    


    l0rinc commented at 4:40 PM on June 2, 2026:

    nit: we don't usually have 50% internap spends, I remember only having around 5% on average

  45. in src/test/coinsviewoverlay_tests.cpp:73 in 3e995afb46 outdated
      70 | -            cache.EmplaceCoinInternalDANGER(COutPoint{in.prevout}, std::move(coin));
      71 | +            if (spent) {
      72 | +                cache.EmplaceCoinInternalDANGER(COutPoint{in.prevout}, std::move(coin));
      73 | +            } else {
      74 | +                coin.out.nValue = 1;
      75 | +                cache.AddCoin(in.prevout, std::move(coin), /*possible_overwrite=*/false);
    


    l0rinc commented at 4:41 PM on June 2, 2026:

    3e995af test: add unit tests for CoinsViewOverlay::StartFetching:

    nit: the EmplaceCoinInternalDANGER vs AddCoin difference could likely be added in earlier commit.

  46. in src/test/coinsviewoverlay_tests.cpp:100 in 3e995afb46
      93 | @@ -82,9 +94,12 @@ void CheckCache(const CBlock& block, const CCoinsViewCache& cache)
      94 |                  const auto& first{cache.AccessCoin(outpoint)};
      95 |                  const auto& second{cache.AccessCoin(outpoint)};
      96 |                  BOOST_CHECK_EQUAL(&first, &second);
      97 | -                ++counter;
      98 | -                BOOST_CHECK(cache.HaveCoinInCache(outpoint));
      99 | +                const auto should_have{!txids.contains(outpoint.hash)};
     100 | +                if (should_have) ++counter;
     101 | +                const auto have{cache.HaveCoinInCache(outpoint)};
     102 | +                BOOST_CHECK_EQUAL(should_have, have);
    


    l0rinc commented at 4:47 PM on June 2, 2026:

    3e995af test: add unit tests for CoinsViewOverlay::StartFetching:

    What we're checking here is that cache presence and an internal spend are mutually exclusive. It might be simpler to use BOOST_CHECK_NE for that:

                    const auto have{cache.HaveCoinInCache(outpoint)};
                    BOOST_CHECK_NE(txids.contains(outpoint.hash), have);
                    counter += have;
    
  47. in src/test/coinsviewoverlay_tests.cpp:207 in 3e995afb46
     203 | +    CCoinsViewCache main_cache{&db};
     204 | +    Coin coin{};
     205 | +    coin.out.nValue = 1;
     206 | +    const COutPoint outpoint{Txid::FromUint256(uint256::ZERO), 0};
     207 | +    main_cache.EmplaceCoinInternalDANGER(COutPoint{outpoint}, std::move(coin));
     208 | +    const COutPoint missing_outpoint{Txid::FromUint256(uint256::ONE), 0};
    


    l0rinc commented at 4:52 PM on June 2, 2026:

    3e995af test: add unit tests for CoinsViewOverlay::StartFetching:

    We could move this closer to the first usage to make it obvious that this was unknown during construction:

    const COutPoint missing_outpoint{Txid::FromUint256(uint256::ONE), 0};
    BOOST_CHECK(view.AccessCoin(missing_outpoint).IsSpent());
    BOOST_CHECK(!view.HaveCoinInCache(missing_outpoint));
    
  48. in src/test/coinsviewoverlay_tests.cpp:200 in 3e995afb46 outdated
     193 | @@ -175,5 +194,140 @@ BOOST_AUTO_TEST_CASE(fetch_no_inputs)
     194 |      BOOST_CHECK_EQUAL(view.GetCacheSize(), 0);
     195 |  }
     196 |  
     197 | -BOOST_AUTO_TEST_SUITE_END()
     198 | +// Access coins that are not block inputs
     199 | +BOOST_AUTO_TEST_CASE(access_non_input_coins)
     200 | +{
     201 | +    const auto block{CreateBlock()};
    


    theStack commented at 4:57 PM on June 2, 2026:

    nit: could as well start out with an empty block here, to ensure that it's actually a non-input hit below?


    andrewtoth commented at 10:11 PM on June 3, 2026:

    Ah this actually uncovered that the test was passing for the wrong reason!

    Below we define the missing_outpoint as having txid 1. But, the block does have txid 1! It was still missing from the db, but it wasn't taking the fallback path.

  49. in src/test/coinsviewoverlay_tests.cpp:214 in 3e995afb46
     210 | +    CoinsViewOverlay view{&main_cache, m_thread_pool};
     211 | +    const auto reset_guard{view.StartFetching(block)};
     212 | +
     213 | +    // Non-input fallback hit.
     214 | +    const auto& accessed_coin{view.AccessCoin(outpoint)};
     215 | +    BOOST_CHECK(!accessed_coin.IsSpent());
    


    l0rinc commented at 4:57 PM on June 2, 2026:

    3e995af test: add unit tests for CoinsViewOverlay::StartFetching:

    nit: can likely be inlined for simplicity (like we do in fetch_out_of_order_input_uses_normal_lookup or use HaveCoin directly):

        BOOST_CHECK(!view.AccessCoin(outpoint).IsSpent());
    

    cc: @theStack

  50. in src/test/coinsviewoverlay_tests.cpp:229 in 3e995afb46
     225 | +{
     226 | +    const auto block{CreateBlock()};
     227 | +    CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
     228 | +    CCoinsViewCache main_cache{&db};
     229 | +    PopulateView(block, main_cache);
     230 | +    auto thread_pool{std::make_shared<ThreadPool>("fetch_none")};
    


    l0rinc commented at 5:02 PM on June 2, 2026:

    3e995af test: add unit tests for CoinsViewOverlay::StartFetching:

    It's a bit confusing that we have a started shared thread pool in the fixture that we're not actually using in a few of the tests. This one doesn't start the thread pool, hence the fallback, but the fixture has a started one. Maybe we could add these to a different fixture to make this distinction obvious. cc: @theStack

  51. in src/test/coinsviewoverlay_tests.cpp:223 in 3e995afb46
     219 | +    BOOST_CHECK(missing_coin.IsSpent());
     220 | +    BOOST_CHECK(!view.HaveCoinInCache(missing_outpoint));
     221 | +}
     222 | +
     223 | +// Test that disabled input fetching falls back to normal cache lookups via base->PeekCoin.
     224 | +BOOST_AUTO_TEST_CASE(fetch_unstarted_thread_pool)
    


    l0rinc commented at 5:06 PM on June 2, 2026:

    3e995af test: add unit tests for CoinsViewOverlay::StartFetching:

    Coverage proves this does indeed exercise the fallback 👍

    <img width="855" height="307" alt="Image" src="https://github.com/user-attachments/assets/77b92691-9dcd-4bd9-b986-3b4e704d88cc" />

  52. in src/test/coinsviewoverlay_tests.cpp:277 in 3e995afb46 outdated
     273 | +    CoinsViewOverlay view{&main_cache, m_thread_pool};
     274 | +    const auto reset_guard{view.StartFetching(block)};
     275 | +
     276 | +    const auto& out_of_order_input{fetched_inputs[1]};
     277 | +    BOOST_CHECK(!view.AccessCoin(out_of_order_input).IsSpent());
     278 | +    BOOST_CHECK(view.HaveCoinInCache(out_of_order_input));
    


    l0rinc commented at 5:25 PM on June 2, 2026:

    3e995af test: add unit tests for CoinsViewOverlay::StartFetching:

    We might want to add a comment to the test header about FetchCoinFromBase refusing to advance for out-of-order requests (causing the fallback) and maybe prove that AccessCoin makes HaveCoinInCache pass:

    const auto& out_of_order_input{fetched_inputs[1]};
    BOOST_CHECK(!view.HaveCoinInCache(out_of_order_input));
    BOOST_CHECK(!view.AccessCoin(out_of_order_input).IsSpent());
    BOOST_CHECK(view.HaveCoinInCache(out_of_order_input));
    

    cc: @theStack

  53. in src/test/coinsviewoverlay_tests.cpp:285 in 3e995afb46 outdated
     281 | +}
     282 | +
     283 | +// Mutating operations on the overlay (Flush/Sync/SetBackend) must stop in-flight
     284 | +// fetching and clear all per-block state, so a second StartFetching on the same
     285 | +// overlay starts from clean preconditions.
     286 | +BOOST_AUTO_TEST_CASE(fetch_state_is_cleared_by_mutating_operations)
    


    l0rinc commented at 5:32 PM on June 2, 2026:

    3e995af test: add unit tests for CoinsViewOverlay::StartFetching:

    Commenting out StopFetching for SetBackend/Flush/Sync separately all makes this test fail 👍

  54. in src/validation.h:92 in 78ea7793f9
      88 | @@ -89,6 +89,9 @@ static const uint64_t MIN_DISK_SPACE_FOR_BLOCK_FILES{550_MiB};
      89 |  /** Maximum number of dedicated script-checking threads allowed */
      90 |  static constexpr int MAX_SCRIPTCHECK_THREADS{15};
      91 |  
      92 | +/** Maximum number of dedicate threads allowed for prefetching block input prevouts */
    


    l0rinc commented at 5:54 PM on June 2, 2026:

    78ea779 validation: add -prevoutfetchthreads configuration option:

    /** Maximum number of dedicated threads allowed for prefetching block input prevouts */
    
  55. l0rinc changes_requested
  56. l0rinc commented at 9:31 PM on June 2, 2026: contributor

    The number of fetcher threads is configurable via -prevoutfetchthreads=<n>, defaulting to 4

    Could you please update the PR description? Btw, I'm measuring the effect of 8 new threads on minimum node memory usage, will report back when it finishes.

    My remaining concern after the recent changes is that I don't understand the difference between parallelism of 0 and 1 - see comments.

    I remeasured reindex-chainstate with 1GB and 50GB dbcache on SSD and HDD to see the extremes (rebased to get seek competition as well): they're looking really good:

    <img width="2340" height="2340" alt="Image" src="https://github.com/user-attachments/assets/d88714a8-d676-4b01-a40b-900314ea4fbd" />

    <details><summary>i7</summary>

    for DBCACHE in 1000 50000; do \
        COMMITS="fbe628756cc417dd4b6ccd9d3a709ca8e2f6023c 3e90c5e1ceecb8fcb6a4271b98e84ded171252a7"; \
        STOP=950059; CC=gcc; CXX=g++; \
        BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
        (echo ""; for c in $COMMITS; do git fetch -q origin "$c" 2>/dev/null || true; git log -1 --pretty='%h %s' $c || exit 1; done) && \
        (echo "" && echo "$(date -I) | reindex-chainstate | ${STOP} blocks | dbcache ${DBCACHE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") && \
        hyperfine \
        --sort command \
        --runs 1 \
        --export-json "$BASE_DIR/rdx-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
        --parameter-list COMMIT ${COMMITS// /,} \
        --prepare "killall -9 bitcoind 2>/dev/null; rm -f ./build/bin/bitcoind; done"; \he=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconso
    
    fbe628756c Merge bitcoin/bitcoin#35131: guix, refactor: Minor script cleanups and improvements
    3e90c5e1ce fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads
    
    2026-05-31 | reindex-chainstate | 950059 blocks | dbcache 1000 | i7-hdd | x86_64 | Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz | 8 cores | 62Gi RAM | HDD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = fbe628756cc417dd4b6ccd9d3a709ca8e2f6023c)
      Time (abs ≡):        30962.748 s               [User: 33946.611 s, System: 1343.201 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 3e90c5e1ceecb8fcb6a4271b98e84ded171252a7)
      Time (abs ≡):        26650.647 s               [User: 38790.705 s, System: 1775.691 s]
    
    Relative speed comparison
            1.16          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = fbe628756cc417dd4b6ccd9d3a709ca8e2f6023c)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 3e90c5e1ceecb8fcb6a4271b98e84ded171252a7)
    
    fbe628756c Merge bitcoin/bitcoin#35131: guix, refactor: Minor script cleanups and improvements
    3e90c5e1ce fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads
    
    2026-06-01 | reindex-chainstate | 950059 blocks | dbcache 50000 | i7-hdd | x86_64 | Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz | 8 cores | 62Gi RAM | HDD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=50000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = fbe628756cc417dd4b6ccd9d3a709ca8e2f6023c)
      Time (abs ≡):        24869.226 s               [User: 28333.773 s, System: 779.739 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=50000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 3e90c5e1ceecb8fcb6a4271b98e84ded171252a7)
      Time (abs ≡):        25187.463 s               [User: 30291.512 s, System: 885.903 s]
    
    Relative speed comparison
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=50000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = fbe628756cc417dd4b6ccd9d3a709ca8e2f6023c)
            1.01          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=50000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 3e90c5e1ceecb8fcb6a4271b98e84ded171252a7)
    

    </details>

    <details><summary>i9</summary>

     for DBCACHE in 1000 50000; do \
        COMMITS="fbe628756cc417dd4b6ccd9d3a709ca8e2f6023c 3e90c5e1ceecb8fcb6a4271b98e84ded171252a7"; \
        STOP=950059; CC=gcc; CXX=g++; \
        BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
        (echo ""; for c in $COMMITS; do git fetch -q origin "$c" 2>/dev/null || true; git log -1 --pretty='%h %s' $c || exit 1; done) && \
        (echo "" && echo "$(date -I) | reindex-chainstate | ${STOP} blocks | dbcache ${DBCACHE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") && \
        hyperfine \
        --sort command \
        --runs 1 \
        --export-json "$BASE_DIR/rdx-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
        --parameter-list COMMIT ${COMMITS// /,} \
        --prepare "killall -9 bitcoind 2>/dev/null; rm -f ./build/bin/bitcoind;done=0"; \e=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtocon
    
    fbe628756c Merge bitcoin/bitcoin#35131: guix, refactor: Minor script cleanups and improvements
    3e90c5e1ce fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads
    
    2026-05-31 | reindex-chainstate | 950059 blocks | dbcache 1000 | i9-ssd | x86_64 | Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz | 16 cores | 62Gi RAM | SSD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = fbe628756cc417dd4b6ccd9d3a709ca8e2f6023c)
      Time (abs ≡):        19558.763 s               [User: 31796.752 s, System: 1064.759 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 3e90c5e1ceecb8fcb6a4271b98e84ded171252a7)
      Time (abs ≡):        14958.313 s               [User: 35308.810 s, System: 1574.040 s]
    
    Relative speed comparison
            1.31          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = fbe628756cc417dd4b6ccd9d3a709ca8e2f6023c)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 3e90c5e1ceecb8fcb6a4271b98e84ded171252a7)
    
    fbe628756c Merge bitcoin/bitcoin#35131: guix, refactor: Minor script cleanups and improvements
    3e90c5e1ce fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads
    
    2026-06-01 | reindex-chainstate | 950059 blocks | dbcache 50000 | i9-ssd | x86_64 | Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz | 16 cores | 62Gi RAM | SSD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=50000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = fbe628756cc417dd4b6ccd9d3a709ca8e2f6023c)
      Time (abs ≡):        17399.201 s               [User: 25711.641 s, System: 684.138 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=50000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 3e90c5e1ceecb8fcb6a4271b98e84ded171252a7)
      Time (abs ≡):        16674.033 s               [User: 27487.513 s, System: 800.722 s]
    
    Relative speed comparison
            1.04          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=50000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = fbe628756cc417dd4b6ccd9d3a709ca8e2f6023c)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=50000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMM
    

    </details>

  57. andrewtoth force-pushed on Jun 3, 2026
  58. andrewtoth commented at 12:55 AM on June 3, 2026: contributor

    Thanks for the review @l0rinc, addressed all suggestions.

  59. DrahtBot added the label CI failed on Jun 3, 2026
  60. in src/coins.h:619 in d1756187fe
     619 | +        // This assumes ConnectBlock accesses all inputs in the same order as
     620 | +        // they are added to m_inputs in StartFetching.
     621 | +        if (m_input_tail < m_inputs.size() && m_inputs[m_input_tail].outpoint == outpoint) {
     622 |              // We advance the tail since the input is cached and not accessed through this method again.
     623 |              auto& input{m_inputs[m_input_tail++]};
     624 | -            if (input.outpoint != outpoint) continue;
    


    l0rinc commented at 10:52 AM on June 3, 2026:

    d175618 coins: filter same-block spends in StartFetching:

    Any reason for doing this differently in the previous commit?


    andrewtoth commented at 3:03 PM on June 3, 2026:

    We had to scan in the previous commit, because we were also fetching prevouts for inputs created in that block. The next commit we filter those out, so here we won't have to scan.

  61. l0rinc commented at 10:59 AM on June 3, 2026: contributor

    CI would probably pass after a rebase (which would also get us the seek compaction fixes)

  62. andrewtoth force-pushed on Jun 3, 2026
  63. andrewtoth commented at 3:10 PM on June 3, 2026: contributor

    Rebased on master to test compatibility with https://github.com/bitcoin-core/leveldb-subtree/pull/61.

    Also addressed #35295 (review) (thanks @l0rinc).

    git range-diff 9961229360..22066bfb4c 7c2718a4b8..9db6f241c7

  64. DrahtBot removed the label CI failed on Jun 3, 2026
  65. in src/coins.h:596 in a92444eedc
     591 | +
     592 | +        //! Move ctor is required for resizing m_inputs in StartFetching. Elements will never move once parallel tasks
     593 | +        //! are started, so we can assert that coin is nullopt and ready is false.
     594 | +        InputToFetch(InputToFetch&& other) noexcept : outpoint{other.outpoint}
     595 | +        {
     596 | +            Assert(!coin);
    


    theStack commented at 5:06 PM on June 3, 2026:

    in a92444eedc27537953bf797364bbd3196a8f7adc: was this meant to be

                Assert(!other.coin);
    

    ? That would make more sense to me, as the current assert is just "stating the obvious", i.e. that a freshly-default constructed std::optional member is std::nullopt.


    andrewtoth commented at 10:11 PM on June 3, 2026:

    Nice catch! Done.

  66. andrewtoth force-pushed on Jun 3, 2026
  67. andrewtoth commented at 10:12 PM on June 3, 2026: contributor

    Addressed comments #35295 (review) and #35295 (review) (thanks @theStack!).

    git diff 9db6f241c74709be7565219220b481a586b6cc6d..73df01964cdc3a8b48ddc45010c469a0af283106

  68. l0rinc approved
  69. in src/coins.h:650 in d3dac3c9dd
     644 | @@ -642,10 +645,15 @@ class CoinsViewOverlay : public CCoinsViewCache
     645 |          Assert(m_input_head.load(std::memory_order_relaxed) == 0);
     646 |          Assert(m_input_tail == 0);
     647 |          // Loop through the inputs of the block and set them in the queue.
     648 | +        // Filter txs that are spending inputs created earlier in the same block. These inputs will be created
     649 | +        // directly in the cache from the tx that creates them, so they will not be requested from a base view.
     650 | +        std::unordered_set<Txid, SaltedTxidHasher> txids;
    


    l0rinc commented at 11:07 AM on June 4, 2026:

    d3dac3c coins: filter same-block spends in StartFetching:

    In previous versions this collected all txids, but now it's the earlier txids, i.e. no more forward references possible: maybe we could rename to reflect that

            std::unordered_set<Txid, SaltedTxidHasher> earlier_txids;
    

    Note: technically it's "earlier, except coinbase", which means that an invalid block trying to spend its own coinbase would still go to disk (likely not something we should care about)

  70. in src/coins.h:582 in d7bb17e2f9
     582 | + * objects. StartFetching then submits worker tasks to a ThreadPool and keeps the returned futures alive until fetching
     583 | + * is stopped.
     584 | + *
     585 | + * ProcessInput() atomically fetches and increments m_input_head, so each thread can only access a single element of the
     586 | + * m_inputs vector at a time. Workers race to claim inputs, so they may fetch elements in any order. If the fetched
     587 | + * index is greater than the size of m_inputs, no more inputs can be fetched and false is returned.
    


    l0rinc commented at 11:09 AM on June 4, 2026:

    d7bb17e doc: update CoinsViewOverlay docstring to describe parallel fetching:

    nit: size is an invalid index:

     * index is greater than or equal to the size of m_inputs, no more inputs can be fetched and false is returned.
    
  71. in src/coins.h:568 in d7bb17e2f9
     564 | @@ -565,13 +565,63 @@ class CCoinsViewCache : public CCoinsViewBacked
     565 |  };
     566 |  
     567 |  /**
     568 | - * CCoinsViewCache overlay that avoids populating/mutating parent cache layers on cache misses.
     569 | + * CCoinsViewCache subclass that asynchronously fetches all block inputs in parallel during ConnectBlock without
    


    l0rinc commented at 11:21 AM on June 4, 2026:

    d7bb17e doc: update CoinsViewOverlay docstring to describe parallel fetching:

    nit: not "all" inputs can be prefeched

     * CCoinsViewCache subclass that asynchronously fetches most block input prevouts in parallel during ConnectBlock without
    
  72. in src/coins.h:594 in 2c9efd46f9
     589 | +
     590 | +    /**
     591 | +     * Claim and fetch the next input in the queue.
     592 | +     *
     593 | +     * @return true if there are more inputs in the queue to fetch
     594 | +     * @return false if there are no more inputs in the queue to fetch
    


    l0rinc commented at 11:30 AM on June 4, 2026:

    2c9efd4 validation: collect block inputs in CoinsViewOverlay before ConnectBlock:

    nit: wouldn't this return true for the last element?

         * [@return](/bitcoin-bitcoin/contributor/return/) true if an input prevout was fetched
         * [@return](/bitcoin-bitcoin/contributor/return/) false if there are no more input prevouts in the queue to fetch
    
  73. in src/coins.h:573 in d7bb17e2f9
     571 |   *
     572 | - * This is achieved by fetching coins from the base view using PeekCoin() instead of GetCoin(),
     573 | - * so intermediate CCoinsViewCache layers are not filled.
     574 | + * Only used in ConnectBlock to pass as an ephemeral view that can be reset if the block is invalid.
     575 | + * It provides the same interface as CCoinsViewCache. It overrides all methods that mutate base,
     576 | + * stopping threads before calling superclass.
    


    l0rinc commented at 11:42 AM on June 4, 2026:

    d7bb17e doc: update CoinsViewOverlay docstring to describe parallel fetching:

    nit: StopFetching does not stop or destroy the ThreadPool worker threads anymore:

     * stopping fetch tasks before calling superclass.
    
  74. in src/coins.h:1 in 2c9efd46f9 outdated


    l0rinc commented at 11:45 AM on June 4, 2026:

    2c9efd46f98e9c0b2e9698d7550e96a0082e14ea

    Fetch coins from the m_inputs vector in FetchCoinFromBase by scanning all inputs starting at m_input_tail until we discover the input with the correct outpoint.

    Is this still accurate?


    ryanofsky commented at 9:01 PM on June 10, 2026:

    In commit "coins: filter same-block spends in StartFetching" (2b0d87e8e0322119eb5f428d90045b71c3e19bf3)

    Can we add an assert below this comment to verify the loop above is working and the comment is actually true?

    Assert(std::none_of(m_inputs.begin(), m_inputs.end(), [&](const InputToFetch& i) { return i.outpoint == outpoint; }));
    

    I don't think adding this should have any performance impact because this code should never be hit for normal fetches, only for BIP30 checks in old blocks which are small. Also it should be ok later if -prevoutfetchthreads is 0 because m_inputs will be empty.


    andrewtoth commented at 10:19 PM on June 10, 2026:

    We could do this and it will be true for production code, but it will break in our unit and fuzz tests. The unit tests we could remove, but the fuzz tests would have to be modified. The fuzz tests are not like ConnectBlock, they request input prevouts in any order.


    andrewtoth commented at 10:29 PM on June 10, 2026:

    We could make the assertion check that none of the inputs exist after m_input_tail. That would be a little tighter, but it would miss inputs requested out of order (which is not possible with current fuzz harnesses). So with that we would keep the loop instead of changing it to the if statement here.


    andrewtoth commented at 10:51 PM on June 10, 2026:

    What about something like this:

    diff --git a/src/coins.h b/src/coins.h
    index 27c1a42cec..8b846d786f 100644
    --- a/src/coins.h
    +++ b/src/coins.h
    @@ -769,6 +769,8 @@ public:
             return CreateResetGuard();
         }
     
    +    bool AllInputsConsumed() const noexcept { return m_input_tail == m_inputs.size(); }
    +
         void SetBackend(CCoinsView& view) override
         {
             StopFetching();
    diff --git a/src/validation.cpp b/src/validation.cpp
    index c4e99c399f..2970fafe1f 100644
    --- a/src/validation.cpp
    +++ b/src/validation.cpp
    @@ -3046,6 +3046,7 @@ bool Chainstate::ConnectTip(
                 LogError("%s: ConnectBlock %s failed, %s\n", __func__, pindexNew->GetBlockHash().ToString(), state.ToString());
                 return false;
             }
    +        Assume(view.AllInputsConsumed());
             time_3 = SteadyClock::now();
             m_chainman.time_connect_total += time_3 - time_2;
             assert(m_chainman.num_blocks_total > 0);
    

    ryanofsky commented at 10:52 PM on June 10, 2026:

    re: #35295 (review)

    We could do this and it will be true for production code, but it will break in our unit and fuzz tests.

    Seem to indicate a problem with the tests if they are violating assumptions code needs work properly. Maybe a clean solution would be to leave m_inputs empty in these tests? Like add prevoutfetch_threads_num as the first commit, and only populate m_inputs if it is > 0. This would be nice anyway, because it would simplify the the later commit 93d101c1505bd8cafb8226ede860642ac7d15064 and further reduce code churn in the PR. For example, the for (const auto& tx : block.vtx | std::views::drop(1)) loop would no longer have to become conditional in that commit, it would be conditional from the beginning.

    I don't think there would be a downside to not populating m_inputs in tests that are not using it anyway.

    The other assertion you suggest seems better than nothing, but weaker than assumption this code is actually making. I also think it would be ok not to have an assert, since if the assumption is wrong the code will be slow and broken but still return valid results. I just think it would make things better to verify the code is working as expected.


    ryanofsky commented at 11:09 PM on June 10, 2026:

    re: #35295 (review)

    AllInputsConsumed

    I don't have as much context as you, but when I look at this code the thing that seems fragile about is the assumptions it makes about what order the calls to fetch happen. Because if the order changes even slightly, the entire optimization this PR is implementing could be negated without any sign other than slower performance. Checking whether all inputs are consumed could be good, too, but I don't see that as being related.


    andrewtoth commented at 3:24 PM on June 14, 2026:

    leave m_inputs empty in these tests

    Hmm I think we want to have the unit and especially fuzz tests be able to exercise the CoinsViewOverlay public API with m_inputs populated. Maybe I'm misunderstanding what you're suggesting here.

    add prevoutfetch_threads_num as the first commit, and only populate m_inputs if it is > 0

    I will experiment with this, to see if we can reduce code churn. But, I don't think we want to run unit tests without exercising m_inputs. Will think more on this.

    if the order changes even slightly, the entire optimization this PR is implementing could be negated without any sign other than slower performance

    It seems to me the invariant we want to check is that we indeed used the optimized path during ConnectBlock. So if the order of lookups in ConnectBlock is changed, we want to ensure we do not break the optimization. So, in the second commit 4216895f798ff8ae809634451ad4958f2efa19a4 I add an Assume(view.AllInputsConsumed()); check after ConnectBlock succeeds. This will test exactly what you are suggesting. If ConnectBlock is modified in such a way that we do not read through m_inputs in the optimized path, it will trip the Assume and break CI. An invalid block or fuzz tests will not necessarily pass this invariant, so we only do it on a successful ConnectBlock and not internal to CoinsViewOverlay, which has no holistic concept of the block being valid.


    andrewtoth commented at 4:13 PM on June 14, 2026:

    I reordered the commits to add the prevoutfetch_threads_num as the first commit, followed by introducing the threadpool into CoinsViewOverlay. The rest of the commits follow the same flow, but there is less code churn inside StartFetching because of this.


    l0rinc commented at 12:13 PM on June 16, 2026:

    nit: https://github.com/bitcoin/bitcoin/blob/93c6ea147422199777ddbc5f6d04134b82420b76/src/coins.h#L450-L452 throws unconditionally, doesn't make a lot of sense to me to fuzz it (I don't mind pushing this in a separate PR myself)


    ryanofsky commented at 3:47 PM on June 16, 2026:

    re: #35295 (review)

    Thanks for the clarification on AllInputsConsumed I was still thinking in terms of the original commit 41e90e14ddfc1e6fe4cf3accc47714f185fc8a60 where tail could be advanced without inputs actually being used. But with later code, AllInputsConsumed is a nice way of checking the optimization is applied and resolves my concern about assumptions the optimization relies on.

    More minor feedback:

    • I think the check should be done using assert not assume. Or else it should be using assume and checking the return value and logging a warning, or returning an error, or throwing an exception when it is false. I don't think unchecked assume expressions are appropriate outside of hot code paths where checking them could be bad for performance.

    • I still don't understand the point of populating m_inputs in tests if the inputs will not be used. It seems like it would be better to move the AllInputsConsumed check out of validation code into coins code to check that whenever m_inputs is populated it will always be used, and no caller is creating a bunch of threads doing work that is useless.


    andrewtoth commented at 3:10 AM on June 18, 2026:

    it would be better to move the AllInputsConsumed check out of validation code into coins code to check that whenever m_inputs is populated it will always be used, and no caller is creating a bunch of threads doing work that is useless.

    The issue here is that m_inputs is populated by a block that can be invalid. We don't know yet if it is valid or not because that will happen in ConnectBlock.

    Consider an invalid block which contains an input that does not exist in the utxo set. We will populate the m_inputs with this invalid input, ConnectBlock will access it in FetchCoinFromBase and receive a nullopt, and ConnectBlock will fail validation and return early. Now we will go the unhappy path and call StopFetching when the ResetGuard is destroyed, but all inputs after the missing one will never have been accessed. So there's no guarantee that can be had in coins code that all m_inputs will be used. If we asserted that all m_inputs must be accessed then we would crash on an invalid block.


    l0rinc commented at 1:18 PM on June 18, 2026:

    Removed in #35562, can be resolved


    andrewtoth commented at 7:04 PM on June 22, 2026:

    I updated the AllInputsConsumed to be Assert.


    andrewtoth commented at 4:47 AM on June 23, 2026:

    I think the check should be done using assert not assume. Or else it should be using assume and checking the return value and logging a warning, or returning an error, or throwing an exception when it is false. I don't think unchecked assume expressions are appropriate outside of hot code paths where checking them could be bad for performance.

    I think I will revert this and use Assume after all and log a warning. This is checking that an optimization was successful. When testing it would be good to break if it fails. In production, if a block is otherwise valid but fails to achieve a speed optimization it will crash the node. I don't think that's a good tradeoff if for some reason a change to our validation code is made that breaks the assumption in a subtle way and goes unnoticed in testing.


    l0rinc commented at 11:22 PM on June 23, 2026:

    803a8bcb3e84b50920da58f3fb158b14acd0433e

    (aside from BIP30 checks, an invalid block, or when the thread pool is not yet.


    ryanofsky commented at 6:46 PM on June 25, 2026:

    re: #35295 (review)

    If we asserted that all m_inputs must be accessed then we would crash on an invalid block.

    It would seem good to have a separate CoinsViewOverlay::CancelFetching method callers could use in this case, that would stop fetching new inputs and allow already fetched ones to be unused.

    An analogy would be with the std::thread destructor. To destroy the thread object you either have to join it or detach it or move from it, and if you don't do any of these things it's a bug and an error.

  75. in src/coins.h:703 in e92862f7e6
     707 | +                for (const auto& input : tx->vin) {
     708 | +                    if (!txids.contains(input.prevout.hash)) m_inputs.emplace_back(input.prevout);
     709 | +                }
     710 | +                txids.emplace(tx->GetHash());
     711 | +            }
     712 | +            // Only start threads if we have something to fetch.
    


    l0rinc commented at 11:50 AM on June 4, 2026:

    e92862f coins: fetch inputs in parallel:

    nit: this also seems to be a pre-threadpool framing:

                // Only submit tasks if we have something to fetch.
    
  76. in src/coins.h:648 in d3dac3c9dd
     644 | @@ -642,10 +645,15 @@ class CoinsViewOverlay : public CCoinsViewCache
     645 |          Assert(m_input_head.load(std::memory_order_relaxed) == 0);
     646 |          Assert(m_input_tail == 0);
     647 |          // Loop through the inputs of the block and set them in the queue.
     648 | +        // Filter txs that are spending inputs created earlier in the same block. These inputs will be created
    


    l0rinc commented at 11:51 AM on June 4, 2026:

    d3dac3c coins: filter same-block spends in StartFetching:

    nit: We're spending outputs

            // Loop through the block inputs and set their prevouts in the queue.
            // Filter inputs that spend outputs created earlier in the same block. These outputs will be created
    
  77. in src/test/fuzz/coins_view.cpp:395 in 4548f6468a
     399 |          }
     400 | +
     401 | +        // Stop async workers before accessing backend_coins_view to avoid data race
     402 | +        // (HaveCoin/GetCoin on CCoinsViewCache mutate cacheCoins via FetchCoin)
     403 | +        if (auto* async_cache{dynamic_cast<CoinsViewOverlay*>(&coins_view_cache)}) {
     404 | +            (void)async_cache->CreateResetGuard();
    


    l0rinc commented at 11:59 AM on June 4, 2026:

    4548f64 fuzz: update harnesses to cover CoinsViewOverlay::StartFetching:

    If I understand correctly, this stops fetching because the final backend consistency check calls HaveCoin/GetCoin which can mutate CCoinsViewCache.

    Maybe we can avoid that by using PeekCoin when mutation is prohibited, something like:

    std::optional<Coin> coin_in_backend;
    bool exists_using_have_coin_in_backend;
    if (dynamic_cast<CoinsViewOverlay*>(&coins_view_cache)) {
        // PeekCoin does not mutate cacheCoins, so async workers can keep running.
        coin_in_backend = backend_coins_view->PeekCoin(random_out_point);
        exists_using_have_coin_in_backend = coin_in_backend.has_value();
    } else {
        exists_using_have_coin_in_backend = backend_coins_view->HaveCoin(random_out_point);
        coin_in_backend = backend_coins_view->GetCoin(random_out_point);
    }
    

    which allows us to avoid those calls later:

    // If the backend has the coin, it must also be on the cache if the coin wasn't spent.
    if (!is_spent_using_access_coin && exists_using_have_coin_in_backend) {
        assert(exists_using_have_coin);
    }
    if (coin_in_backend) {
        assert(exists_using_have_coin_in_backend);
        // Note we can't assert that the backend coin matches the cache coin because the coin in
        // the cache may have been modified but not yet flushed.
    } else {
        assert(!exists_using_have_coin_in_backend);
    }
    
  78. in src/coins.h:683 in e0a95b547d outdated
     678 | +    {
     679 | +        StopFetching();
     680 | +        CCoinsViewCache::SetBackend(view);
     681 | +    }
     682 | +
     683 | +    void Flush(bool reallocate_cache = true) override
    


    l0rinc commented at 12:01 PM on June 4, 2026:

    e0a95b5 coins: stop fetching before mutating base:

    Should BatchWrite also stop fetching for API consistency? The fuzz harness can call the inherited BatchWrite while fetching is active (even though this cannot happen in production, as far as I can tell).

    void BatchWrite(CoinsViewCacheCursor& cursor, const uint256& block_hash) override
    {
        StopFetching();
        CCoinsViewCache::BatchWrite(cursor, block_hash);
    }
    

    andrewtoth commented at 9:45 PM on June 4, 2026:

    BatchWrite() only modifies this cacheCoins, it does not modify base in any way! So, it's safe to call. Flush() and Sync() call BatchWrite() on base.

  79. l0rinc approved
  80. l0rinc commented at 12:24 PM on June 4, 2026: contributor

    While I'm running IBD, pruned-IBD, reindex-chainstate, memory profiling, and -prevoutfetchthreads=0/1/4/8 comparisons I have added a few nits for the comments and docs. A few still reflect the pre-Threadpool state, we should probably fix it. But if the above benchmarks don't reveal anything new I'm fine with it as is, most of these can be follow-ups.

  81. andrewtoth force-pushed on Jun 6, 2026
  82. andrewtoth commented at 7:56 PM on June 6, 2026: contributor

    Thanks @l0rinc. Addressed all your suggestions.

    git diff 73df01964cdc3a8b48ddc45010c469a0af283106..14832b23025c3ee9af62d9c762bca59726b614c7

  83. l0rinc commented at 9:33 PM on June 7, 2026: contributor

    tested ACK 14832b23025c3ee9af62d9c762bca59726b614c7

    The measurements indicate that -prevoutfetchthreads=0 vs -prevoutfetchthreads=1 just complicates things; I'd personally be in favor of disallowing the former.

    They also still indicate that -prevoutfetchthreads=4 vs -prevoutfetchthreads=8 aren't very different, but the heavy memory measurements have been running for almost a week now and I still don't have a clear view on how much more memory this requires. But it's likely tolerable, so I'm fine with it ultimately.

    <img width="2683" height="814" alt="image" src="https://github.com/user-attachments/assets/5757512b-c585-44ed-8c87-3ff5b7ac4e69" />

    <details><summary>-prevoutfetchthreads=0,1,4,8 | 2026-06-04 | reindex-chainstate | 950059 blocks | dbcache 1000 | i7-hdd | x86_64 | Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz | 8 cores | 62Gi RAM | HDD</summary>

    for DBCACHE in 1000 20000; do     COMMITS="0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f";     STOP=950059; CC=gcc; CXX=g++;     BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs";     (echo ""; for c in $COMMITS; do git fetch -q origin "$c" 2>/dev/null || true; git log -1 --pretty='%h %s' $c || exit 1; done) &&     (echo "" && echo "$(date -I) | reindex-chainstate | ${STOP} blocks | dbcache ${DBCACHE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") &&     hyperfine     --sort command     --runs 1     --export-json "$BASE_DIR/rdx-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json"     --parameter-list COMMIT ${COMMITS// /,}     --prepare "killall -9 bitcoind 2>/dev/null; rm -f ./build/bin/bitcoind; git clean -fxd; git reset --hard {COMMIT} && \
          cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind -j1 && \
          ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20; rm -f $DATA_DIR/debug.log; rm -rfd $DATA_DIR/indexes;"     --conclude "killall bitcoind || true; sleep 5; grep -q 'height=0' $DATA_DIR/debug.log && grep -q 'Disabling script verification at block [#1](/bitcoin-bitcoin/1/)' $DATA_DIR/debug.log && grep -q 'height=$STOP' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
                    cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log"     "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=0"     "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=1"     "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=4"     "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=8"; done
    
    0637a1cc6f fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads
    
    2026-06-04 | reindex-chainstate | 950059 blocks | dbcache 1000 | i7-hdd | x86_64 | Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz | 8 cores | 62Gi RAM | HDD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        30800.474 s               [User: 34040.905 s, System: 1333.384 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=1 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec43
      Time (abs ≡):        30736.822 s               [User: 36206.873 s, System: 2918.100 s]
    
    Benchmark 3: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=4 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        27326.204 s               [User: 37305.479 s, System: 1641.282 s]
    
    Benchmark 4: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=8 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        26573.476 s               [User: 38786.031 s, System: 1750.080 s]
    
    Relative speed comparison
            1.16          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
            1.16          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=1 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
            1.03          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=4 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=8 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
    

    </details>

    <details><summary>-prevoutfetchthreads=0,1,4,8 | 2026-06-05 | reindex-chainstate | 950059 blocks | dbcache 20000 | i7-hdd | x86_64 | Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz | 8 cores | 62Gi RAM | HDD</summary>

    0637a1cc6f fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads
    
    2026-06-05 | reindex-chainstate | 950059 blocks | dbcache 20000 | i7-hdd | x86_64 | Intel(R) Core(TM) i7-7700 CPU @ 3.60GHz | 8 cores | 62Gi RAM | HDD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=20000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        27750.191 s               [User: 28975.022 s, System: 829.911 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=20000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=1 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        25323.842 s               [User: 29845.897 s, System: 899.840 s]
    
    Benchmark 3: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=20000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=4 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        25589.183 s               [User: 30244.886 s, System: 899.719 s]
    
    Benchmark 4: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=20000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=8 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        24650.900 s               [User: 30616.339 s, System: 913.909 s]
    
    Relative speed comparison
            1.13          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=20000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
            1.03          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=20000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=1 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
            1.04          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=20000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=4 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=20000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=8 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
    

    </details>

    <details><summary>-prevoutfetchthreads=0,1,4,8 | 2026-06-04 | reindex-chainstate | 950059 blocks | dbcache 1000 | rpi5-16-3 | aarch64 | Cortex-A76 | 4 cores | 15Gi RAM | SSD</summary>

    for DBCACHE in 1000; do     COMMITS="0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f";     STOP=950059; CC=gcc; CXX=g++;     BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs";     (echo ""; for c in $COMMITS; do git fetch -q origin "$c" 2>/dev/null || true; git log -1 --pretty='%h %s' $c || exit 1; done) &&     (echo "" && echo "$(date -I) | reindex-chainstate | ${STOP} blocks | dbcache ${DBCACHE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") &&     hyperfine     --sort command     --runs 1     --export-json "$BASE_DIR/rdx-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json"     --parameter-list COMMIT ${COMMITS// /,}     --prepare "killall -9 bitcoind 2>/dev/null; rm -f ./build/bin/bitcoind; git clean -fxd; git reset --hard {COMMIT} && \
          cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind -j1 && \
          ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20; rm -f $DATA_DIR/debug.log; rm -rfd $DATA_DIR/indexes;"     --conclude "killall bitcoind || true; sleep 5; grep -q 'height=0' $DATA_DIR/debug.log && grep -q 'Disabling script verification at block [#1](/bitcoin-bitcoin/1/)' $DATA_DIR/debug.log && grep -q 'height=$STOP' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
                    cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log"     "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=0"     "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=1"     "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=4"     "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=8"; done
    
    0637a1cc6f fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads
    
    2026-06-04 | reindex-chainstate | 950059 blocks | dbcache 1000 | rpi5-16-3 | aarch64 | Cortex-A76 | 4 cores | 15Gi RAM | SSD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        41493.792 s               [User: 55275.524 s, System: 3351.821 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=1 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        38404.656 s               [User: 58689.656 s, System: 9720.696 s]
    
    Benchmark 3: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=4 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        30625.547 s               [User: 60533.129 s, System: 4760.258 s]
    
    Benchmark 4: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=8 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        30194.755 s               [User: 60089.389 s, System: 4626.381 s]
    
    Relative speed comparison
            1.37          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
            1.27          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=1 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
            1.01          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=4 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 -prevoutfetchthreads=8 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
    

    </details>

    <details><summary>high total memory Rpi5 | 2026-06-03 | pruned IBD | 950059 blocks | dbcache 1000 | pruning 10000 | rpi5-16-2 | aarch64 | Cortex-A76 | 4 cores | 15Gi RAM | ext4 | SSD</summary>

    for DBCACHE in 1000; do \
      COMMITS="b28cf409a13b893787c4c95b1ba975ac5f5b6254 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f"; \
      STOP=950059; PRUNE=10000; \
      CC=gcc; CXX=g++; \
      BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/ShallowBitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
      (echo ""; for c in $COMMITS; do git fetch -q origin $c 2>/dev/null || true && git log -1 --pretty='%h %s' $c || exit 1; done) && \
      (echo "" && echo "$(date -I) | pruned IBD | ${STOP} blocks | dbcache ${DBCACHE} | pruning ${PRUNE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(df -T $BASE_DIR | awk 'NR==2{print $2}') | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD) | local seed"; echo "") && \
      hyperfine \
        --sort command \
        --runs 2 \
        --export-json "$BASE_DIR/ibd-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
        --parameter-list COMMIT ${COMMITS// /,} \
        --prepare "killall -9 bitcoind 2>/dev/null; rm -rf $DATA_DIR/*; git clean -fxd; git reset --hard {COMMIT} && \
          cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind -j1 && \
          ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -prune=$PRUNE -listen=0 -printtoconsole=0; sleep 20;" \
        --conclude "killall bitcoind || true; sleep 5; grep -q 'height=0' $DATA_DIR/debug.log && grep -q 'Disabling script verification at block [#1](/bitcoin-bitcoin/1/)' $DATA_DIR/debug.log && grep -q 'height=$STOP' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
                    cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
        "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -blocksonly -prune=$PRUNE -listen=0 -printtoconsole=0";
    done
    
    b28cf409a1 Merge bitcoin/bitcoin#34866: fuzz: target concurrent leveldb reads
    0637a1cc6f fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads
    
    2026-06-03 | pruned IBD | 950059 blocks | dbcache 1000 | pruning 10000 | rpi5-16-2 | aarch64 | Cortex-A76 | 4 cores | 15Gi RAM | ext4 | SSD | local seed
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/m
      Time (mean ± σ):     59173.331 s ± 69.637 s    [User: 73500.604 s, System: 6263.421 s]
      Range (min … max):   59124.090 s … 59222.572 s    2 runs
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/ShallowBitcoinData -stopatheight=950059 -dbcache=1000 -blocksonly -prune=10000 -listen=0 -printtoconsole=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (mean ± σ):     45710.353 s ± 133.695 s    [User: 78017.247 s, System: 7844.232 s]
      Range (min … max):   45615.816 s … 45804.890 s    2 runs
    
    Relative speed comparison
            1.29 ±  0.00  COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/ShallowBitcoinData -stopatheight=950059 -dbcache=1000 -blocksonly -prune=10000 -listen=0 -printtoconsole=0 (COMMIT = b28cf409a13b893787c4c95b1ba975ac5f5b6254)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/ShallowBitcoinData -stopatheight=950059 -dbcache=1000 -blocksonly -prune=10000 -listen=0 -printtoconsole=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
    

    </details>

    <details><summary>low total memory Rpi5 | 2026-06-03 | pruned IBD | 950059 blocks | dbcache 4000 | pruning 10000 | rpi5-8 | aarch64 | Cortex-A76 | 4 cores | 7.7Gi RAM | ext4 | SSD | local seed</summary>

    for DBCACHE in 4000; do   COMMITS="b28cf409a13b893787c4c95b1ba975ac5f5b6254 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f";   STOP=950059; PRUNE=10000;   CC=gcc; CXX=g++;   BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/ShallowBitcoinData"; LOG_DIR="$BASE_DIR/logs";   (echo ""; for c in $COMMITS; do git fetch -q origin $c 2>/dev/null || true && git log -1 --pretty='%h %s' $c || exit 1; done) &&   (echo "" && echo "$(date -I) | pruned IBD | ${STOP} blocks | dbcache ${DBCACHE} | pruning ${PRUNE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(df -T $BASE_DIR | awk 'NR==2{print $2}') | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD) | local seed"; echo "") &&   hyperfine     --sort command     --runs 1     --export-json "$BASE_DIR/ibd-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json"     --parameter-list COMMIT ${COMMITS// /,}     --prepare "killall -9 bitcoind 2>/dev/null; rm -rf $DATA_DIR/*; git clean -fxd; git reset --hard {COMMIT} && \
    cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind -j1 && \
    ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -prune=$PRUNE -listen=0 -printtoconsole=0; sleep 20;"     --conclude "killall bitcoind || true; sleep 5; grep -q 'height=0' $DATA_DIR/debug.log && grep -q 'Disabling script verification at block [#1](/bitcoin-bitcoin/1/)' $DATA_DIR/debug.log && grep -q 'height=$STOP' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
    cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log"     "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -blocksonly -prune=$PRUNE -listen=0 -printtoconsole=0"; done
    
    b28cf409a1 Merge bitcoin/bitcoin#34866: fuzz: target concurrent leveldb reads
    0637a1cc6f fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads
    
    2026-06-03 | pruned IBD | 950059 blocks | dbcache 4000 | pruning 10000 | rpi5-8 | aarch64 | Cortex-A76 | 4 cores | 7.7Gi RAM | ext4 | SSD | local seed
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/ShallowBitcoinData -stopatheight=950059 -dbcache=4000 -blocksonly -prune=10000 -listen=0 -printtoconsole=0 (COMMIT = b28cf409a13b893787c4c95b1ba975ac5f5b6254)
      Time (abs ≡):        180920.821 s               [User: 70234.385 s, System: 16776.498 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/ShallowBitcoinData -stopatheight=950059 -dbcache=4000 -blocksonly -prune=10000 -listen=0 -printtoconsole=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
      Time (abs ≡):        125393.476 s               [User: 73042.131 s, System: 20660.173 s]
    
    Relative speed comparison
            1.44          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/ShallowBitcoinData -stopatheight=950059 -dbcache=4000 -blocksonly -prune=10000 -listen=0 -printtoconsole=0 (COMMIT = b28cf409a13b893787c4c95b1ba975ac5f5b6254)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/ShallowBitcoinData -stopatheight=950059 -dbcache=4000 -blocksonly -prune=10000 -listen=0 -printtoconsole=0 (COMMIT = 0637a1cc6f2b2f28d4ab5d97aaf8a91ec436fc8f)
    

    </details>

  84. DrahtBot requested review from rkrux on Jun 7, 2026
  85. DrahtBot requested review from theStack on Jun 7, 2026
  86. in src/coins.h:616 in 9d6ae3c616
     611 | @@ -602,6 +612,9 @@ class CoinsViewOverlay : public CCoinsViewCache
     612 |  
     613 |          auto& input{m_inputs[i]};
     614 |          input.coin = base->PeekCoin(input.outpoint);
     615 | +        // Use release so writing coin above happens before the main thread acquires.
     616 | +        Assume(!input.ready.test_and_set(std::memory_order_release));
    


    ryanofsky commented at 2:57 PM on June 8, 2026:

    In commit "coins: filter same-block spends in StartFetching" (38b18bc5b33a626c108fa4ecc6beed24a97f0a34)

    Not sure Assume really makes sense here.

    If there is a performance reason for ignoring whether the flag was previously set in release builds. It would seem to make more sense to a plain store like input.ready.store(true, std::memory_order_release) instead of test_and_set and atomic<bool> over atomic_flag.

    Alternately if performance is not a factor here it would seem safest to replace Assume with Assert


    andrewtoth commented at 2:28 AM on June 9, 2026:

    Changed to Assert. The check shouldn't make a difference to performance, which is dominated by the prevout lookup.

  87. in src/coins.h:616 in aeb4bf72d9
     611 | +        m_input_tail = 0;
     612 | +    }
     613 | +
     614 |      std::optional<Coin> FetchCoinFromBase(const COutPoint& outpoint) const override
     615 |      {
     616 | +        while (m_input_tail < m_inputs.size()) {
    


    ryanofsky commented at 3:28 PM on June 8, 2026:

    In commit "validation: collect block inputs in CoinsViewOverlay before ConnectBlock" (aeb4bf72d9dc99f476446c68de7764d7f3d250d0)

    The comment added in the next commit seems like it would be helpful here.

            // This assumes ConnectBlock accesses all inputs in the same order as
            // they are added to m_inputs in StartFetching.
    
  88. in src/coins.h:619 in aeb4bf72d9
     614 |      std::optional<Coin> FetchCoinFromBase(const COutPoint& outpoint) const override
     615 |      {
     616 | +        while (m_input_tail < m_inputs.size()) {
     617 | +            // We advance the tail since the input is cached and not accessed through this method again.
     618 | +            auto& input{m_inputs[m_input_tail++]};
     619 | +            if (input.outpoint != outpoint) continue;
    


    ryanofsky commented at 3:32 PM on June 8, 2026:

    In commit "validation: collect block inputs in CoinsViewOverlay before ConnectBlock" (aeb4bf72d9dc99f476446c68de7764d7f3d250d0)

    Would suggest a comment for this continue that would explain why this is needed (and also why it is being removed next commit) like "// Keep advancing because m_inputs can contain same-block inputs that Fetch will never be called for"

  89. ryanofsky commented at 4:15 PM on June 8, 2026: contributor

    Code review 14832b23025c3ee9af62d9c762bca59726b614c7. I still need to review more but left a few comments. Thanks for considering previous suggestions too. Everything I've seen looks good and it's nice the PR is getting simpler.

  90. andrewtoth force-pushed on Jun 9, 2026
  91. andrewtoth commented at 2:29 AM on June 9, 2026: contributor

    Thanks @ryanofsky. Addressed all your suggestions.

    git range-diff 7c2718a4b8 91c62a8870 bed709c66f

  92. l0rinc commented at 12:07 PM on June 9, 2026: contributor

    ACK bed709c66f337191413d500846015d5bf444482b

  93. in src/coins.h:628 in 41e90e14dd
     623 | +            if (input.outpoint != outpoint) continue;
     624 | +            // We can move the coin since we won't access this input again.
     625 | +            return std::move(input.coin);
     626 | +        }
     627 | +
     628 | +        // We will only get here for BIP30 checks.
    


    ryanofsky commented at 8:45 PM on June 10, 2026:

    In commit "validation: collect block inputs in CoinsViewOverlay before ConnectBlock" (41e90e14ddfc1e6fe4cf3accc47714f185fc8a60)

    It looks like BIP30 checks are actually done before inputs are fetched so I don't think this comment is accurate in this commit (but would be ok in the next commit 2b0d87e8e0322119eb5f428d90045b71c3e19bf3).

    So for all blocks where BIP30 checks are done (old blocks from 2013 and earlier), the loop above will just increment m_input_tail to equal m_inputs.size() and m_inputs will be unused, and every call to FetchCoinFromBase will fall through to PeekCoin.

    If this is true, I feel like the "We will only get here for BIP30 checks" comment is misleading and should at least be fixed to describe the situation accurately.

    But also if this is true, I feel like it would be better to squash the next commit into this one 2b0d87e8e0322119eb5f428d90045b71c3e19bf3 to make review simpler and avoid needing to reason about this more confusing intermediate state.


    andrewtoth commented at 10:11 PM on June 10, 2026:

    Hmm yes this was a side-effect that I overlooked of adding m_input_tail to the first commit. Another option is to not advance the tail unless we find the outpoint we are looking for, like this:

    for (auto i{m_input_tail}; m_input_tail < m_inputs.size(); ++i) {
        auto& input{m_inputs[i]};
        // Outputs from earlier txs in the same block are created directly in the cache, so won't be fetched
        // from base. We skip those by scanning for the input prevout that matches the outpoint we are looking for.
        if (input.outpoint != outpoint) continue;
        // We advance the tail since the input is cached and not accessed through this method again.
        ++m_input_tail;
        // We can move the coin since we won't access this input again.
        return std::move(input.coin);
    }
    

    Not sure if you'd prefer that or to roll the first 2 commits into 1?


    ryanofsky commented at 10:42 PM on June 10, 2026:

    re: #35295 (review)

    Another option is to not advance the tail unless we find the outpoint we are looking for

    Nice, I think you'd need to use m_input_tail = i instead of ++m_input_tail, but otherwise that seems like a clean solution.

    Not sure if you'd prefer that or to roll the first 2 commits into 1?

    I would slightly prefer to see the two commits squashed, but I think you should follow your own preference or wait for another opinion, because both options seem reasonable and I'm probably unusually biased in favor of larger commits to reduce code churn.


    andrewtoth commented at 3:08 PM on June 14, 2026:

    I updated the first commit to use the above modification along with m_input_tail = i as you suggest.

    I decided to keep the first 2 commits separate. Other reviewers seem to have a preference for an initial commit that introduces the smallest working implementation of the inputs queue. I also think the same-block output filtering is not an entirely obvious concept to many reviewers, so it makes sense to split it into its own commit.

  94. w0xlt commented at 4:22 PM on June 11, 2026: contributor

    Concept ACK

  95. andrewtoth force-pushed on Jun 14, 2026
  96. andrewtoth force-pushed on Jun 14, 2026
  97. DrahtBot added the label CI failed on Jun 14, 2026
  98. DrahtBot commented at 2:37 PM on June 14, 2026: contributor

    <!--85328a0da195eb286784d51f73fa0af9-->

    🚧 At least one of the CI tasks failed. <sub>Task lint: https://github.com/bitcoin/bitcoin/actions/runs/27501773337/job/81285856466</sub> <sub>LLM reason (✨ experimental): CI failed due to lint tests detecting a Boost test suite naming violation in src/test/coinsviewoverlay_tests.cpp (coinsviewoverlay_no_workers_tests doesn’t follow the required *_tests/*_tests_bar convention).</sub>

    <details><summary>Hints</summary>

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

    </details>

  99. in src/validation.cpp:1859 in bc5729aea2
    1856 | +void CoinsViews::InitCache(int32_t prevoutfetch_threads)
    1857 |  {
    1858 |      AssertLockHeld(::cs_main);
    1859 |      m_cacheview = std::make_unique<CCoinsViewCache>(&m_catcherview);
    1860 | -    m_connect_block_view = std::make_unique<CoinsViewOverlay>(&*m_cacheview);
    1861 | +    auto thread_pool{std::make_shared<ThreadPool>("prevoutfetch")};
    


    l0rinc commented at 3:13 PM on June 14, 2026:

    Wouldn't these names be still too long on Linux - see #35173?

    ps -eL | grep b-
    1216097 1216098 ?        00:00:00 b-scheduler
    1216097 1216099 ?        00:00:00 b-http_pool_0
    1216097 1216100 ?        00:00:00 b-http_pool_1
    1216097 1216101 ?        00:00:00 b-http_pool_2
    1216097 1216102 ?        00:00:00 b-http_pool_3
    1216097 1216103 ?        00:00:00 b-http_pool_4
    1216097 1216104 ?        00:00:00 b-http_pool_5
    1216097 1216105 ?        00:00:00 b-http_pool_6
    1216097 1216106 ?        00:00:00 b-http_pool_7
    1216097 1216107 ?        00:00:00 b-http_pool_8
    1216097 1216108 ?        00:00:00 b-http_pool_9
    1216097 1216109 ?        00:00:00 b-http_pool_10
    1216097 1216110 ?        00:00:00 b-http_pool_11
    1216097 1216111 ?        00:00:00 b-http_pool_12
    1216097 1216112 ?        00:00:00 b-http_pool_13
    1216097 1216113 ?        00:00:00 b-http_pool_14
    1216097 1216114 ?        00:00:00 b-http_pool_15
    1216097 1216115 ?        00:00:00 b-http
    1216097 1216116 ?        00:00:05 b-scriptch.0
    1216097 1216117 ?        00:00:05 b-scriptch.1
    1216097 1216118 ?        00:00:05 b-scriptch.2
    1216097 1216119 ?        00:00:05 b-scriptch.3
    1216097 1216120 ?        00:00:05 b-scriptch.4
    1216097 1216121 ?        00:00:05 b-scriptch.5
    1216097 1216122 ?        00:00:05 b-scriptch.6
    1216097 1216125 ?        00:00:00 b-prevoutfetch_
    1216097 1216126 ?        00:00:00 b-prevoutfetch_
    1216097 1216127 ?        00:00:00 b-prevoutfetch_
    1216097 1216128 ?        00:00:00 b-prevoutfetch_
    1216097 1216129 ?        00:00:00 b-prevoutfetch_
    1216097 1216130 ?        00:00:00 b-prevoutfetch_
    1216097 1216131 ?        00:00:00 b-prevoutfetch_
    1216097 1216132 ?        00:00:00 b-prevoutfetch_
    1216097 1216134 ?        00:00:00 b-mapport
    1216097 1216135 ?        00:00:00 b-torcontrol
    1216097 1216136 ?        00:00:00 b-net
    1216097 1216138 ?        00:00:00 b-addcon
    1216097 1216139 ?        00:00:00 b-opencon
    1216097 1216140 ?        00:00:08 b-msghand
    

    Edit: note that the logs do show the full name on Linux as well:

    2026-06-14T12:29:56Z prevoutfetch_pool_4 thread exit
    2026-06-14T12:29:56Z prevoutfetch_pool_3 thread exit
    2026-06-14T12:29:56Z prevoutfetch_pool_7 thread exit
    2026-06-14T12:29:56Z prevoutfetch_pool_2 thread exit
    2026-06-14T12:29:56Z prevoutfetch_pool_0 thread exit
    2026-06-14T12:29:56Z prevoutfetch_pool_5 thread exit
    2026-06-14T12:29:56Z prevoutfetch_pool_1 thread exit
    2026-06-14T12:29:56Z prevoutfetch_pool_6 thread exit
    

    andrewtoth commented at 3:25 PM on June 14, 2026:

    Do you have a suggestion for the name?


    andrewtoth commented at 4:11 PM on June 14, 2026:

    Updated to fetch.


    l0rinc commented at 6:53 AM on June 16, 2026:

    after https://github.com/bitcoin/bitcoin/pull/35173/changes#diff-bf307d8d34f553323259334512522a03841fd65f81982c3dad83fae143127eb5L115 we don't have the _pool suffix anymore, so the name can probably be something like coinsfetch (i.e. b-coinsfetch_10) - to differentiate it from download threads.

  100. DrahtBot removed the label CI failed on Jun 14, 2026
  101. andrewtoth force-pushed on Jun 14, 2026
  102. andrewtoth commented at 4:15 PM on June 14, 2026: contributor

    Addressed review comments from @ryanofsky (https://github.com/bitcoin/bitcoin/pull/35295#discussion_r3391397001, #35295 (review)) and @l0rinc (https://github.com/bitcoin/bitcoin/pull/35295#discussion_r3409747690). Thanks!

    git range-diff 7c2718a4b82df709ee06380209747577514acbf0..bed709c66f337191413d500846015d5bf444482b 2cfb10b668f628596e38506c0b0243840a906ff1..17bc5ea7c54f5d9daacd5f5ba17f60cf1475ac01

  103. jsarenik commented at 8:18 AM on June 15, 2026: none

    Concept ACK

  104. in src/coins.h:685 in 3c9d5921df
     678 | @@ -663,9 +679,12 @@ class CoinsViewOverlay : public CCoinsViewCache
     679 |          Assert(m_thread_pool);
     680 |      }
     681 |  
     682 | -    //! Start fetching inputs from block.
     683 | +    ~CoinsViewOverlay() noexcept override { StopFetching(); }
     684 | +
     685 | +    //! Start fetching inputs from block in background.
     686 |      [[nodiscard]] ResetGuard StartFetching(const CBlock& block LIFETIMEBOUND) noexcept
    


    l0rinc commented at 2:04 PM on June 16, 2026:

    3c9d592 coins: fetch inputs in parallel:

    coins.h is included across a large part of the codebase - could we move the method's body to the impl and move the util/threadpool.h include there?


    andrewtoth commented at 6:58 PM on June 22, 2026:

    Moved to coins.cpp, and forward declared ThreadPool and CBlock in coins.h.

  105. in src/coins.h:705 in 3c9d5921df
     699 | @@ -681,9 +700,20 @@ class CoinsViewOverlay : public CCoinsViewCache
     700 |                  }
     701 |                  earlier_txids.emplace(tx->GetHash());
     702 |              }
     703 | -            // Only process inputs if we have something to fetch.
     704 | +            // Only submit tasks if we have something to fetch.
     705 |              if (m_inputs.size()) {
     706 | -                while (ProcessInput()) {}
     707 | +                std::vector<std::function<void()>> tasks(workers_count, [this] {
    


    l0rinc commented at 2:11 PM on June 16, 2026:

    3c9d592 coins: fetch inputs in parallel:

    nit: for an unlikely block with fewer inputs than workers we could cap the task to the input count instead (to avoid needlessly allocating and immediately terminating these futures).

                    std::vector<std::function<void()>> tasks(std::min(workers_count, m_inputs.size()), [this] {
    

    andrewtoth commented at 7:00 PM on June 22, 2026:

    I don't think I'll take it since it will make this line slightly more complicated for no real gain (unless benchmarks show this has a measurable gain?).

  106. in src/coins.h:620 in 3783d646a0
     615 | +
     616 |      std::optional<Coin> FetchCoinFromBase(const COutPoint& outpoint) const override
     617 |      {
     618 | +        // This assumes ConnectBlock accesses all inputs in the same order as
     619 | +        // they are added to m_inputs in StartFetching.
     620 | +        for (auto i{m_input_tail}; i < m_inputs.size(); ++i) {
    


    l0rinc commented at 2:24 PM on June 16, 2026:

    3783d64 validation: collect block inputs in CoinsViewOverlay before ConnectBlock:

    Doesn't BIP30 make this quadratic now?


    ryanofsky commented at 3:50 PM on June 16, 2026:

    re: #35295 (review)

    Doesn't BIP30 make this quadratic now?

    Good catch, although this is fixed in the next commit aed8687b2690594c65a953fb64e7c46a67828620. IMO this slightly bolsters the case for merging the two commits, but no strong opinion.


    andrewtoth commented at 7:01 PM on June 22, 2026:

    I ended up merging the two commits, so when we introduce StartFetching we also do the same block output filtering.

  107. l0rinc changes_requested
  108. in src/test/coinsviewoverlay_tests.cpp:63 in d59725b731
      60 | -            cache.EmplaceCoinInternalDANGER(COutPoint{in.prevout}, std::move(coin));
      61 | +            if (spent) {
      62 | +                cache.EmplaceCoinInternalDANGER(COutPoint{in.prevout}, std::move(coin));
      63 | +            } else {
      64 | +                coin.out.nValue = 1;
      65 | +                cache.AddCoin(in.prevout, std::move(coin), /*possible_overwrite=*/false);
    


    ryanofsky commented at 2:48 PM on June 16, 2026:

    In commit "coins: introduce thread pool in CoinsViewOverlay" (d59725b731e21894bd40a1d1159bdcf6552d0c3a)

    Is there a reason for switching from EmplaceCoinInternalDANGER to AddCoin in this commit? Might be good to explain in commit message why this needed now, or if this is just unrelated cleanup. IIUC switching to AddCoin adds the FRESH flag so this does seem better, but it's not clear why the change is here.


    andrewtoth commented at 7:02 PM on June 22, 2026:

    I removed touching this. I'm not exactly sure why we modified it, but I don't think it matters at all because we are both filtering same block outputs (so FRESH flag will not apply) and we flush this right after the loop.


    l0rinc commented at 10:26 PM on June 23, 2026:

    I'm not exactly sure why we modified it

    I requested exercising the version that's actually used in prod: #35295 (review). I'm fine with both.

  109. in src/validation.cpp:1861 in d59725b731
    1858 |      AssertLockHeld(::cs_main);
    1859 |      m_cacheview = std::make_unique<CCoinsViewCache>(&m_catcherview);
    1860 | -    m_connect_block_view = std::make_unique<CoinsViewOverlay>(&*m_cacheview);
    1861 | +    auto thread_pool{std::make_shared<ThreadPool>("fetch")};
    1862 | +    if (prevoutfetch_threads > 0) {
    1863 | +        thread_pool->Start(std::min(prevoutfetch_threads, MAX_PREVOUTFETCH_THREADS));
    


    ryanofsky commented at 2:57 PM on June 16, 2026:

    In commit "coins: introduce thread pool in CoinsViewOverlay" (d59725b731e21894bd40a1d1159bdcf6552d0c3a)

    Argument exceeding MAX_PREVOUTFETCH_THREADS can never actually happen right? If so might suggest simplifying this code and dropping std::min, or switching to an assert, or adding a comment clarifying that this is defensive, because it is confusing to apply a limit multiple places and not be able to easily see where it is actually being applied.


    l0rinc commented at 10:23 PM on June 23, 2026:

    Was requested in #35295 (review). I'm fine with both.

  110. in src/test/coinsviewoverlay_tests.cpp:60 in d59725b731
      55 | @@ -47,8 +56,12 @@ void PopulateView(const CBlock& block, CCoinsView& view, bool spent = false)
      56 |      for (const auto& tx : block.vtx | std::views::drop(1)) {
      57 |          for (const auto& in : tx->vin) {
      58 |              Coin coin{};
      59 | -            if (!spent) coin.out.nValue = 1;
      60 | -            cache.EmplaceCoinInternalDANGER(COutPoint{in.prevout}, std::move(coin));
      61 | +            if (spent) {
      62 | +                cache.EmplaceCoinInternalDANGER(COutPoint{in.prevout}, std::move(coin));
    


    ryanofsky commented at 3:05 PM on June 16, 2026:

    In commit "coins: introduce thread pool in CoinsViewOverlay" (d59725b731e21894bd40a1d1159bdcf6552d0c3a)

    It would seem more consistent here to call SpendCoin instead of EmplaceCoinInternalDANGER for consistency with AddCoin below. Would this work? Would seem good to have a code comment explaining reason EmplaceCoinInternalDANGER needs to be used instead if not.


    andrewtoth commented at 7:02 PM on June 22, 2026:

    I removed touching this, same as above.

  111. ryanofsky commented at 4:00 PM on June 16, 2026: contributor

    Code review 17bc5ea7c54f5d9daacd5f5ba17f60cf1475ac01. Left a few more comments since I'm still working on this. Thanks for being so responsive with my previous suggestions

  112. andrewtoth force-pushed on Jun 22, 2026
  113. andrewtoth force-pushed on Jun 22, 2026
  114. DrahtBot added the label CI failed on Jun 22, 2026
  115. DrahtBot removed the label CI failed on Jun 22, 2026
  116. andrewtoth commented at 7:07 PM on June 22, 2026: contributor

    Thanks @l0rinc and @ryanofsky for your reviews. I've taken most of your suggestions.

    Rebased for #35173 and #35182 so reviewers can test that there are no adverse interactions with that change as well.

    git range-diff 7c2718a4b8..bed709c66f 33e3c7524f..e7d2a0537e

  117. andrewtoth force-pushed on Jun 23, 2026
  118. in src/test/fuzz/coinscache_sim.cpp:194 in b1aba878d6 outdated
     189 | +Mutex g_thread_pool_mutex;
     190 | +
     191 | +void StartPoolIfNeeded() EXCLUSIVE_LOCKS_REQUIRED(!g_thread_pool_mutex)
     192 | +{
     193 | +    LOCK(g_thread_pool_mutex);
     194 | +    if (!g_thread_pool->WorkersCount()) g_thread_pool->Start(DEFAULT_PREVOUTFETCH_THREADS);
    


    theStack commented at 9:25 PM on June 23, 2026:

    in b1aba878d61473e1e9edabbb2b091fa77fc5721c: yocto-nit: this StartPoolIfNeeded function looks slightly different than the one in the coins_view.cpp fuzz test, could adapt either of them for consistency

  119. in test/functional/test_framework/util.py:569 in b1aba878d6 outdated
     565 | @@ -566,6 +566,7 @@ def write_config(config_path, *, n, chain, extra_config="", disable_autoconnect=
     566 |          #  nMaxConnections = available_fds - min_required_fds = 256 - 161 = 94;
     567 |          f.write("maxconnections=94\n")
     568 |          f.write("par=" + str(min(2, os.cpu_count())) + "\n")
     569 | +        f.write("prevoutfetchthreads=1\n")
    


    theStack commented at 9:28 PM on June 23, 2026:

    in b1aba878d61473e1e9edabbb2b091fa77fc5721c: I think it would be good if at least one node in functional tests is ran with the default setting in CI (and ideally also one with the maximum), in order to exercise the feature with multiple threads. Maybe one test could explicitly set it to a higher number, e.g. in feature_block.py? Could still be discussed and decided in a test-focused follow-up though.


    andrewtoth commented at 2:09 AM on June 25, 2026:

    Added 8 threads to feature_block.py.

  120. in src/coins.h:736 in f79559a8a0 outdated
     737 | +
     738 | +    //! Start fetching inputs from block.
     739 | +    [[nodiscard]] ResetGuard StartFetching(const CBlock& block LIFETIMEBOUND) noexcept;
     740 | +
     741 | +    //! Verify that all parallel fetched input prevouts have been consumed.
     742 | +    bool AllInputsConsumed() const noexcept { return m_input_tail == m_inputs.size(); }
    


    l0rinc commented at 10:08 PM on June 23, 2026:

    Nit: hopefully can't happen, but seems more accurate:

        bool AllInputsConsumed() const noexcept { return m_input_tail >= m_inputs.size(); }
    

    andrewtoth commented at 3:37 AM on June 24, 2026:

    Hmm yes this seems more accurate based on the name of the method, but it actually makes the invariant check weaker. If for some reason the tail passed the inputs, we would definitely want to break there and not swallow that error with a true result. Perhaps the name of the method could be changed to better describe this? Not sure I have one right now.

  121. in src/validation.cpp:3066 in f79559a8a0 outdated
    3062 | @@ -3058,6 +3063,9 @@ bool Chainstate::ConnectTip(
    3063 |              LogError("%s: ConnectBlock %s failed, %s\n", __func__, pindexNew->GetBlockHash().ToString(), state.ToString());
    3064 |              return false;
    3065 |          }
    3066 | +        if (!Assume(view.AllInputsConsumed())) {
    


    l0rinc commented at 10:13 PM on June 23, 2026:

    I ran a -reindex-chainstate until 400k blocks in debug (still running), no problems so far


    l0rinc commented at 3:16 AM on June 25, 2026:

    It finished successfully until 950k blocks in debug mode with the new Assumes

  122. in src/coins.cpp:399 in 794cf07ef0 outdated
     396 | +            } else {
     397 | +                // Submit can fail if a shared owner of the thread pool outside of this class calls Stop() or
     398 | +                // Interrupt() on a different thread after we call WorkersCount() above. In that case parallel
     399 | +                // fetching will not make progress, so we clear the inputs to fall back to single threaded fetching.
     400 | +                LogWarning("Failed to submit prevout fetch tasks; falling back to single-threaded fetching for this block.");
     401 | +                m_inputs.clear();
    


    theStack commented at 10:32 PM on June 23, 2026:

    in 794cf07ef0ba1f0163cc74a26db980efa42d74be: I was wondering whether it's necessary to set the m_input_{head,tail} variables back to zero as well, but IIUC they should still hold their initial zero values, as ProcessInput hasn't been called yet at this point (no workers submitted to the pool). Would it make sense to assert for head/tail being zero here (either manually or by calling StopFetching, which checks for that)?


    andrewtoth commented at 3:45 AM on June 24, 2026:

    Hmm well StartFetching is only called on the main thread, and as you say ProcessInput hasn't been called, so nothing could have incremented m_input_head. But I suppose we could assert here that it has not been incremented.

    I'm not sure about m_input_tail though, it is only incremented on main thread, so there's no way to reason it could have been changed from top of the method where it is asserted.


    andrewtoth commented at 2:09 AM on June 25, 2026:

    Added a StopFetching() call after we clear the inputs.

  123. theStack commented at 10:55 PM on June 23, 2026: contributor

    After following the previous iterations of this PR over the last few weeks, I think the current one is significantly easier to understanding and review. Left a few non-blocking comments, still want to look closer over tests again the next days. Some meta-nits:

    • For parallel script validation a log message "Script verification uses x additional threads" is printed, should we also have a similar one for input prevouts fetching?
    • I think this PR warrants a release note, given that a new config option is added.
  124. l0rinc approved
  125. l0rinc commented at 12:46 AM on June 24, 2026: contributor

    ACK f79559a8a08b494330162855e490173f896c7489

    I'm fine with the current version, the PR description needs a few updates and I don't mind re-reviewing of course after the other comments are applied:

    The main change is CoinsViewOverlay gets 1 new public and 2 new private methods

    Stops all threads and clears multi threaded state.

  126. DrahtBot requested review from theStack on Jun 24, 2026
  127. DrahtBot added the label Needs rebase on Jun 24, 2026
  128. validation: add -prevoutfetchthreads configuration option
    Add a configuration option for the number of worker threads used for
    parallel UTXO prevout prefetching during block connection.
    
    Default is 8 threads, max is 16, 0 disables parallel fetching.
    c706f0aad0
  129. coins: introduce thread pool in CoinsViewOverlay
    Introduce a ThreadPool shared pointer to CoinsViewOverlay. A pool managed
    externally can be passed in the constructor.
    
    A global thread pool is used in fuzz harnesses since iterations can happen
    faster than the OS can create and tear down thread pools.
    This can cause a memory leak when fuzzing.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    0569511161
  130. validation: collect block inputs in CoinsViewOverlay before ConnectBlock
    Introduce CoinsViewOverlay::StartFetching, which maps all input prevouts of a
    block to a new m_inputs vector of InputToFetch elements. Returns a ResetGuard
    which is lifetime bound to the block, while the InputToFetch elements are
    lifetime bound to the block as well.
    
    Inputs spending outputs of an earlier transaction in the same block won't
    be in the cache or the db. They also won't be requested by FetchCoinFromBase,
    so we filter them out while building m_inputs to not waste time trying to
    fetch them. Build an unordered set of seen txids while flattening m_inputs and
    skip any prevout whose hash is already in the set.
    
    Introduce StopFetching to clear the m_inputs vector.
    CCoinsViewCache::Reset is made virtual and is overridden in CoinsViewOverlay.
    StopFetching is called on Reset, so the InputToFetch objects will not
    exceed the lifetime of the block.
    
    Introduce ProcessInput to fetch the utxo of an individual input in m_inputs.
    Each caller fetches the input at m_input_head and increments it, so each call
    will fetch the next input in the queue.
    
    Fetch coins from the m_inputs vector in FetchCoinFromBase by comparing the
    requested outpoint against the single input at m_input_tail. ConnectBlock
    requests prevouts in the same order StartFetching queued them, and same-block
    spends are filtered out, so the coin to serve is always the one at m_input_tail
    (aside from BIP30 checks, an invalid block, or when the thread pool is not yet.
    These cases fall back to base->PeekCoin).
    
    This is designed deliberately so multiple threads can call ProcessInput independently.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
    df1589dc20
  131. coins: add ready flag to InputToFetch
    Prepares for ProcessInput to be called from multiple threads.
    
    This flag acts as a memory fence around InputToFetch::coin. There is no lock
    guarding reads and writes of the coin field.
    Instead we use the flag's release/acquire semantics to ensure that when the
    main thread reads the coin it will have happened after a worker thread has
    finished writing it.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    b56a959844
  132. coins: stop fetching before mutating base
    Prepares for ProcessInput to be called from multiple threads.
    
    ProcessInput reads from base via base->PeekCoin(). For it to be safe to call
    in parallel on separate threads, base must not be mutated while fetching is
    in progress.
    
    Flush, Sync, and SetBackend all mutate base, so override them on
    CoinsViewOverlay to StopFetching before delegating to the base class.
    
    In production these do not matter, since only Flush will be called on a
    valid block after all inputs have already been fetched and validated.
    Sync and SetBackend are not called on CoinsViewOverlay in production code.
    
    Our fuzz harnesses call Flush, Sync, and SetBackend at any time, trying to
    fully exercise the CCoinsViewOverlay API which CoinsViewOverlay inherits.
    By overriding these methods, we can use the same fuzz harnesses and
    substitute a CoinsViewOverlay for a CCoinsViewCache with minimal
    modification.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    03c83f66f5
  133. coins: fetch inputs in parallel
    Leverages the thread pool to fetch inputs on multiple threads, while the overlay
    serves inputs on the main thread.
    
    This is a performance improvement over blocking the main thread to fetch inputs.
    
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    285669ad0a
  134. doc: update CoinsViewOverlay docstring to describe parallel fetching
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    f5e123e5a3
  135. test: add unit tests for CoinsViewOverlay::StartFetching
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    347bbe379c
  136. fuzz: update harnesses to cover CoinsViewOverlay::StartFetching
    Co-authored-by: l0rinc <pap.lorinc@gmail.com>
    Co-authored-by: sedited <seb.kung@gmail.com>
    ee3eb7aa6a
  137. fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads 25c3c132f8
  138. doc: add release notes 0aa7ca77c8
  139. andrewtoth force-pushed on Jun 25, 2026
  140. andrewtoth commented at 2:12 AM on June 25, 2026: contributor

    Thanks @theStack and @l0rinc for your reviews! I addressed all your suggestions.

    Rebased due to #35594.

    git range-diff 33e3c7524f..f79559a8a0 633044f143..0aa7ca77c8

  141. DrahtBot removed the label Needs rebase on Jun 25, 2026
  142. in doc/release-notes-35295.md:3 in 0aa7ca77c8
       0 | @@ -0,0 +1,7 @@
       1 | +## New settings
       2 | +
       3 | +- A new `-prevoutfetchthreads=<n>` option sets the number of threads used to
    


    l0rinc commented at 3:20 AM on June 25, 2026:

    0aa7ca7 doc: add release notes:

    Seems a bit off to focus on the arg instead of the feature itself. I'd start with the performance improvement and then cover how to configure it if you really want to (not even sure I'd mention it).


    theStack commented at 3:54 PM on June 25, 2026:

    0aa7ca7 doc: add release notes:

    Seems a bit off to focus on the arg instead of the feature itself. I'd start with the performance improvement and then cover how to configure it if you really want to (not even sure I'd mention it).

    I think explicitly mentioning new configuration options in the release notes is quite usual. I agree, though, that starting with an explanation of the improvement might read a bit better. (No big deal, as it can still be refined in the wiki before the release, I guess.)


    l0rinc commented at 4:20 PM on June 25, 2026:

    We're not yet sure people should be fiddling with this option yet, we got mixed results. And seeing how easy it is to misuse dbcache, I think the focus should be on the feature instead of how to tune it.


    andrewtoth commented at 4:43 PM on June 25, 2026:

    I think it's perfectly fine for users to fiddle with it and experiment to find best outcomes for their own setup. It's definitely safe to do so.


    l0rinc commented at 7:17 PM on June 25, 2026:

    Sure, they should be able to fiddle with it, but my argument was that the release notes should still describe the feature before the knob:

    - Block validation can now prefetch input prevouts from the chainstate database
      in parallel while connecting blocks, speeding up validation when prevouts need
      to be read from disk. A new `-prevoutfetchthreads=<n>` option controls the
      number of prefetch worker threads. The default is 8 threads, up to a maximum
      of 16; set it to 0 to disable parallel prefetching. (#35295)
    
  143. l0rinc approved
  144. l0rinc commented at 3:22 AM on June 25, 2026: contributor

    ACK 0aa7ca77c866a255259296719f12bbd3ba7e56d5

    (the PR description needs some updating, as mentioned before)

  145. in src/test/coinsviewoverlay_tests.cpp:31 in 0aa7ca77c8
      27 | +    std::shared_ptr<ThreadPool> m_thread_pool{[] {
      28 | +        auto pool{std::make_shared<ThreadPool>("fetch_test")};
      29 | +        pool->Start(DEFAULT_PREVOUTFETCH_THREADS);
      30 | +        return pool;
      31 | +    }()};
      32 | +};
    


    sedited commented at 12:05 PM on June 25, 2026:

    Nit: Does this really need to be a fixture setup? How about making this a function that returns the pool and that can be called by each CoinsViewOverlay?


    l0rinc commented at 4:18 PM on June 25, 2026:

    It was recreated originally for every test, but production uses a single thread pool that is reused. It was changed later to a static pool which wasn't always destructed in time, so now it's a fixture-level instance to have both reuse (for some speed benefit, but mostly to make it more realistic) and proper cleanup.


    sedited commented at 4:29 PM on June 25, 2026:

    The way I read InitCache, the overlay now ends up owning the threadpool. Wouldn't it be more realistic to imitate this relationship in the tests?


    l0rinc commented at 7:15 PM on June 25, 2026:

    You mean that the pool is reused because the overlay is reused for later blocks, not because callers keep a separate owner and pass it into many overlays? Is this what you're suggesting instead?

    diff --git a/src/test/coinsviewoverlay_tests.cpp b/src/test/coinsviewoverlay_tests.cpp
    --- a/src/test/coinsviewoverlay_tests.cpp	(revision 70ca8b3083decab30eeced02a1ced948e3f7fef1)
    +++ b/src/test/coinsviewoverlay_tests.cpp	(revision 882e15222b9462861736809797042f45a75e2593)
    @@ -22,15 +22,14 @@
     #include <unordered_set>
     #include <vector>
     
    -struct CoinsViewOverlaySetup {
    -    std::shared_ptr<ThreadPool> m_thread_pool{[] {
    -        auto pool{std::make_shared<ThreadPool>("fetch_test")};
    -        pool->Start(DEFAULT_PREVOUTFETCH_THREADS);
    -        return pool;
    -    }()};
    -};
    -
    -namespace {
    +namespace {
    +
    +std::shared_ptr<ThreadPool> MakeStartedThreadPool()
    +{
    +    auto pool{std::make_shared<ThreadPool>("fetch_test")};
    +    pool->Start(DEFAULT_PREVOUTFETCH_THREADS);
    +    return pool;
    +}
     
     CBlock CreateBlock() noexcept
     {
    @@ -100,7 +99,7 @@
     
     } // namespace
     
    -BOOST_FIXTURE_TEST_SUITE(coinsviewoverlay_tests, CoinsViewOverlaySetup)
    +BOOST_AUTO_TEST_SUITE(coinsviewoverlay_tests)
     
     BOOST_AUTO_TEST_CASE(fetch_inputs_from_db)
     {
    @@ -108,7 +107,7 @@
         CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
         PopulateView(block, db);
         CCoinsViewCache main_cache{&db};
    -    CoinsViewOverlay view{&main_cache, m_thread_pool};
    +    CoinsViewOverlay view{&main_cache, MakeStartedThreadPool()};
         const auto reset_guard{view.StartFetching(block)};
         const auto& outpoint{block.vtx[1]->vin[0].prevout};
     
    @@ -137,7 +136,7 @@
         CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
         CCoinsViewCache main_cache{&db};
         PopulateView(block, main_cache);
    -    CoinsViewOverlay view{&main_cache, m_thread_pool};
    +    CoinsViewOverlay view{&main_cache, MakeStartedThreadPool()};
         const auto reset_guard{view.StartFetching(block)};
         CheckCache(block, view);
         BOOST_CHECK(view.AllInputsConsumed());
    @@ -159,7 +158,7 @@
         CCoinsViewCache main_cache{&db};
         // Add all inputs as spent already in cache
         PopulateView(block, main_cache, /*spent=*/true);
    -    CoinsViewOverlay view{&main_cache, m_thread_pool};
    +    CoinsViewOverlay view{&main_cache, MakeStartedThreadPool()};
         const auto reset_guard{view.StartFetching(block)};
         for (const auto& tx : block.vtx) {
             for (const auto& in : tx->vin) {
    @@ -178,7 +177,7 @@
         const auto block{CreateBlock()};
         CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
         CCoinsViewCache main_cache{&db};
    -    CoinsViewOverlay view{&main_cache, m_thread_pool};
    +    CoinsViewOverlay view{&main_cache, MakeStartedThreadPool()};
         const auto reset_guard{view.StartFetching(block)};
         for (const auto& tx : block.vtx) {
             for (const auto& in : tx->vin) {
    @@ -205,7 +204,7 @@
         const COutPoint outpoint{Txid::FromUint256(uint256::ZERO), 0};
         main_cache.EmplaceCoinInternalDANGER(COutPoint{outpoint}, std::move(coin));
     
    -    CoinsViewOverlay view{&main_cache, m_thread_pool};
    +    CoinsViewOverlay view{&main_cache, MakeStartedThreadPool()};
         const auto reset_guard{view.StartFetching(block)};
     
         // Non-input fallback hit.
    @@ -238,7 +237,7 @@
         }
         BOOST_REQUIRE_GE(fetched_inputs.size(), 2U);
     
    -    CoinsViewOverlay view{&main_cache, m_thread_pool};
    +    CoinsViewOverlay view{&main_cache, MakeStartedThreadPool()};
         const auto reset_guard{view.StartFetching(block)};
     
         const auto& out_of_order_input{fetched_inputs[1]};
    @@ -262,7 +261,7 @@
         PopulateView(block, alternate_cache);
     
         for (const bool flush : {false, true}) {
    -        CoinsViewOverlay view{&main_cache, m_thread_pool};
    +        CoinsViewOverlay view{&main_cache, MakeStartedThreadPool()};
             const auto first_guard{view.StartFetching(block)};
             view.SetBestBlock(uint256::ONE);
             flush ? view.Flush() : view.Sync();
    @@ -270,7 +269,7 @@
             CheckCache(block, view);
         }
     
    -    CoinsViewOverlay view{&main_cache, m_thread_pool};
    +    CoinsViewOverlay view{&main_cache, MakeStartedThreadPool()};
         const auto first_guard{view.StartFetching(block)};
         view.SetBackend(alternate_cache);
     
    @@ -286,7 +285,8 @@
         CCoinsViewDB db{{.path = "", .cache_bytes = 1_MiB, .memory_only = true}, {}};
         CCoinsViewCache main_cache{&db};
         PopulateView(block, main_cache);
    -    CoinsViewOverlay view{&main_cache, m_thread_pool};
    +    // Reusing the overlay also reuses its owned thread pool.
    +    CoinsViewOverlay view{&main_cache, MakeStartedThreadPool()};
     
         {
             const auto reset_guard{view.StartFetching(block)};
    

    sedited commented at 7:46 PM on June 25, 2026:

    Yes, that is my suggestion

  146. in src/test/fuzz/coins_view.cpp:97 in 0569511161
      92 | +
      93 | +void StartPoolIfNeeded() EXCLUSIVE_LOCKS_REQUIRED(!g_thread_pool_mutex)
      94 | +{
      95 | +    LOCK(g_thread_pool_mutex);
      96 | +    if (!g_thread_pool->WorkersCount()) g_thread_pool->Start(DEFAULT_PREVOUTFETCH_THREADS);
      97 | +}
    


    sedited commented at 12:16 PM on June 25, 2026:

    Nit: In commit 05695111611f0c3e9ffa6facaa80fc0e173151b7:

    I think this is introduced too early, I'm getting an unused function warning:

    /home/drgrid/bitcoin/src/test/fuzz/coins_view.cpp:93:6: warning: unused function 'StartPoolIfNeeded' [-Wunused-function]
       93 | void StartPoolIfNeeded() EXCLUSIVE_LOCKS_REQUIRED(!g_thread_pool_mutex)
    
  147. in src/test/fuzz/coins_view.cpp:90 in 0569511161
      85 | @@ -84,6 +86,16 @@ class MutationGuardCoinsViewCache final : public CCoinsViewCache
      86 |  
      87 |      using CCoinsViewCache::CCoinsViewCache;
      88 |  };
      89 | +
      90 | +std::shared_ptr<ThreadPool> g_thread_pool{std::make_shared<ThreadPool>("view_fuzz")};
    


    ryanofsky commented at 1:03 PM on June 25, 2026:

    In commit "coins: introduce thread pool in CoinsViewOverlay" (05695111611f0c3e9ffa6facaa80fc0e173151b7)

    Commit message says this causes a "memory leak when fuzzing". Wondering what is the memory leak? I also think it would be be good to mention the leak in a code comment instead of a commit message so future readers know about it.

  148. in test/functional/feature_proxy.py:138 in 0569511161
     134 | @@ -135,6 +135,7 @@ def setup_nodes(self):
     135 |          if self.have_unix_sockets:
     136 |              args[5] = ['-listen', f'-proxy=unix:{socket_path}']
     137 |              args[6] = ['-listen', f'-onion=unix:{socket_path}']
     138 | +        args = [a + ['-prevoutfetchthreads=0'] for a in args]
    


    ryanofsky commented at 1:06 PM on June 25, 2026:

    In commit "coins: introduce thread pool in CoinsViewOverlay" (05695111611f0c3e9ffa6facaa80fc0e173151b7)

    What is the rationale for these different settings in different tests? Code comments all 3 places would be helpful to make intent clear


    l0rinc commented at 7:12 PM on June 25, 2026:

    This test launches lots of nodes - there is no need to create so many threadpools.


    ryanofsky commented at 1:43 PM on June 26, 2026:

    re: #35295 (review)

    This test launches lots of nodes - there is no need to create so many threadpools.

    Thanks, this is one of 3 places the setting is overridden, and in each case having a short comment indicating intent would be helpful because when you see tests changing a random setting, it's not obvious if it's being done to make tests more predictable, faster, better for triggering bugs, or other possible reasons. Better if intent is clear and not something you have to guess about.

  149. theStack approved
  150. theStack commented at 3:55 PM on June 25, 2026: contributor

    ACK 0aa7ca77c866a255259296719f12bbd3ba7e56d5 🪙

    Thanks for addressing the nits.

  151. sedited approved
  152. sedited commented at 4:04 PM on June 25, 2026: contributor

    ACK 0aa7ca77c866a255259296719f12bbd3ba7e56d5

    In the coins view overlay fuzz test I get a leak detection log:

    [#8192](/bitcoin-bitcoin/8192/) pulse  cov: 3353 ft: 7192 corp: 393/22Kb exec/s: 186 rss: 339Mb
    INFO: libFuzzer disabled leak detection after every mutation
    

    I let it run for quite some time, which didn't indicate the memory growing rapidly, so not sure where this is coming from. Also put considerable CPU hours into the other targets.

    This is in so much better shape than the initial version. The layering and coinscache interdependence is still pretty confusing though. I will also have to re-think my approach in #32317, since we now get some performance gains by interleaving coins access with actual validation work. There are already a bunch of of refactoring pull requests waiting, and I think it is worth the risk to follow up with some more comprehensive refactors.

  153. in src/test/fuzz/coins_view.cpp:393 in df1589dc20
     389 | @@ -390,6 +390,7 @@ FUZZ_TARGET(coins_view_db, .init = initialize_coins_view)
     390 |  // called.
     391 |  FUZZ_TARGET(coins_view_overlay, .init = initialize_coins_view) EXCLUSIVE_LOCKS_REQUIRED(!g_thread_pool_mutex)
     392 |  {
     393 | +    SeedRandomStateForTest(SeedRand::ZEROS);
    


    ryanofsky commented at 4:15 PM on June 25, 2026:

    In commit "validation: collect block inputs in CoinsViewOverlay before ConnectBlock" (df1589dc2081becec63a3b5f52130234a0068e38)

    Would be good to explain this with // for SaltedTxidHasher comment. I initially thought this was added to the wrong commit.

  154. l0rinc approved
  155. in src/validation.cpp:3068 in df1589dc20
    3063 | @@ -3064,6 +3064,9 @@ bool Chainstate::ConnectTip(
    3064 |              LogError("%s: ConnectBlock %s failed, %s\n", __func__, pindexNew->GetBlockHash().ToString(), state.ToString());
    3065 |              return false;
    3066 |          }
    3067 | +        if (!Assume(view.AllInputsConsumed())) {
    3068 | +            LogWarning("Not all prefetched input prevouts were consumed");
    


    ryanofsky commented at 4:30 PM on June 25, 2026:

    In commit "validation: collect block inputs in CoinsViewOverlay before ConnectBlock" (df1589dc2081becec63a3b5f52130234a0068e38)

    Would suggest a more detailed message like:

    --- a/src/validation.cpp
    +++ b/src/validation.cpp
    @@ -3065,7 +3065,8 @@ bool Chainstate::ConnectTip(
                 return false;
             }
             if (!Assume(view.AllInputsConsumed())) {
    -            LogWarning("Not all prefetched input prevouts were consumed");
    +            LogWarning("Internal bug detected: block %s input prefetch queue was not fully consumed (%s %s). Please report this issue here: %s\n",
    +                pindexNew->GetBlockHash().ToString(), CLIENT_NAME, FormatFullVersion(), CLIENT_BUGREPORT);
             }
             time_3 = SteadyClock::now();
             m_chainman.time_connect_total += time_3 - time_2;
    
  156. in src/coins.h:671 in 03c83f66f5
     666 | @@ -667,6 +667,24 @@ class CoinsViewOverlay : public CCoinsViewCache
     667 |  
     668 |      //! Verify that all parallel fetched input prevouts have been consumed.
     669 |      bool AllInputsConsumed() const noexcept { return m_input_tail == m_inputs.size(); }
     670 | +
     671 | +    void SetBackend(CCoinsView& view) override
    


    ryanofsky commented at 5:43 PM on June 25, 2026:

    In commit "coins: stop fetching before mutating base" (03c83f66f597f6bea46105575e1bbc84722b8433)

    Seems unreliable to overrride public methods instead of using an internal hook. Scattering StopFetching calls different places with no comments is also confusing because it's not obvious how/why these methods should affect fetching. Using a hook should simplify the code and make the connections more clear. For example:

    <details><summary>diff</summary> <p>

    --- a/src/coins.cpp
    +++ b/src/coins.cpp
    @@ -264,6 +264,7 @@ void CCoinsViewCache::BatchWrite(CoinsViewCacheCursor& cursor, const uint256& in
     
     void CCoinsViewCache::Flush(bool reallocate_cache)
     {
    +    OnBaseMutate();
         auto cursor{CoinsViewCacheCursor(m_dirty_count, m_sentinel, cacheCoins, /*will_erase=*/true)};
         base->BatchWrite(cursor, m_block_hash);
         Assume(m_dirty_count == 0);
    @@ -276,6 +277,7 @@ void CCoinsViewCache::Flush(bool reallocate_cache)
     
     void CCoinsViewCache::Sync()
     {
    +    OnBaseMutate();
         auto cursor{CoinsViewCacheCursor(m_dirty_count, m_sentinel, cacheCoins, /*will_erase=*/false)};
         base->BatchWrite(cursor, m_block_hash);
         Assume(m_dirty_count == 0);
    @@ -287,6 +289,7 @@ void CCoinsViewCache::Sync()
     
     void CCoinsViewCache::Reset() noexcept
     {
    +    OnBaseMutate();
         cacheCoins.clear();
         cachedCoinsUsage = 0;
         m_dirty_count = 0;
    --- a/src/coins.h
    +++ b/src/coins.h
    @@ -382,11 +382,14 @@ class CCoinsViewBacked : public CCoinsView
     {
     protected:
         CCoinsView* base;
    +    //! Called before base view is mutated to allow stopping any threads that
    +    //! may be reading from it.
    +    virtual void OnBaseMutate() {}
     
     public:
         explicit CCoinsViewBacked(CCoinsView* in_view) : base{Assert(in_view)} {}
     
    -    virtual void SetBackend(CCoinsView& in_view) { base = &in_view; }
    +    void SetBackend(CCoinsView& in_view) { OnBaseMutate(); base = &in_view; }
     
         std::optional<Coin> GetCoin(const COutPoint& outpoint) const override { return base->GetCoin(outpoint); }
         std::optional<Coin> PeekCoin(const COutPoint& outpoint) const override { return base->PeekCoin(outpoint); }
    @@ -497,7 +500,7 @@ public:
          * If reallocate_cache is false, the cache will retain the same memory footprint
          * after flushing and should be destroyed to deallocate.
          */
    -    virtual void Flush(bool reallocate_cache = true);
    +    void Flush(bool reallocate_cache = true);
     
         /**
          * Push the modifications applied to this cache to its base while retaining
    @@ -505,7 +508,7 @@ public:
          * Failure to call this method or Flush() before destruction will cause the changes
          * to be forgotten.
          */
    -    virtual void Sync();
    +    void Sync();
     
         /**
          * Removes the UTXO with the given outpoint from the cache, if it is
    @@ -567,8 +570,7 @@ private:
      * mutating the base cache.
      *
      * Only used in ConnectBlock to pass as an ephemeral view that can be reset if the block is invalid.
    - * It provides the same interface as CCoinsViewCache. It overrides all methods that mutate base,
    - * stopping fetch tasks before calling superclass.
    + * It provides the same interface as CCoinsViewCache.
      * It adds an additional StartFetching method to provide the block.
      *
      * When a block is passed to StartFetching, the inputs of the block are flattened into a vector of InputToFetch
    @@ -593,7 +595,8 @@ private:
      *
      * StopFetching() is called before mutating operations (Flush/Sync/Reset/SetBackend). It stops fetching by moving
      * m_input_head to the end of m_inputs (so workers quickly exit), then waits for all futures to complete and clears
    - * the per-block state (m_inputs and the head/tail counters).
    + * the per-block state (m_inputs and the head/tail counters). Stopping fetching is ok because there is no realistic
    + * use-case for parallel fetching while updating the base view other than fuzz testing.
      *
      *       Workers advance m_input_head to fetch inputs. Main thread advances m_input_tail to consume.
      *
    @@ -713,11 +716,7 @@ private:
         std::vector<std::future<void>> m_futures{};
     
     protected:
    -    void Reset() noexcept override
    -    {
    -        StopFetching();
    -        CCoinsViewCache::Reset();
    -    }
    +    void OnBaseMutate() override { StopFetching(); }
     
     public:
         explicit CoinsViewOverlay(CCoinsView* in_base, std::shared_ptr<ThreadPool> thread_pool,
    @@ -734,24 +733,6 @@ public:
     
         //! Verify that all parallel fetched input prevouts have been consumed.
         bool AllInputsConsumed() const noexcept { return m_input_tail == m_inputs.size(); }
    -
    -    void SetBackend(CCoinsView& view) override
    -    {
    -        StopFetching();
    -        CCoinsViewCache::SetBackend(view);
    -    }
    -
    -    void Flush(bool reallocate_cache = true) override
    -    {
    -        StopFetching();
    -        CCoinsViewCache::Flush(reallocate_cache);
    -    }
    -
    -    void Sync() override
    -    {
    -        StopFetching();
    -        CCoinsViewCache::Sync();
    -    }
     };
     
     //! Utility function to add all of a transaction's outputs to a cache.
    

    </p> </details>


    andrewtoth commented at 10:23 PM on June 25, 2026:

    Yes, I considered doing this. But if we were going to add this hook, we would also want to add it to CCoinsViewCache::FetchCoinFromBase, which calls base->GetCoin(). However, that is a const method so we can't call that hook in there without making a whole bunch of state mutable. Without that the hook implementation is incomplete.


    ryanofsky commented at 3:27 PM on June 26, 2026:

    re: #35295 (review)

    Yes, I considered doing this. But if we were going to add this hook, we would also want to add it to CCoinsViewCache::FetchCoinFromBase, which calls base->GetCoin().

    I don't think this is true. (Also, even if it were true, I still think we'd be better off adding a few mutables or a strategic const_cast to reduce complexity of these overrides. We all agree this interface would be better if did not rely on mutables, but since it does use them, one of the known drawbacks of mutables is that they are viral and will spread.)

    In any case, the simplification I suggested above should be ok in its current form because this->OnBaseMutate() is called when this class is changing the base view (comment says "Called before base view is mutated "). If this class is just calling base->GetCoin() and the base class mutates itself, that's not something this class should assume or know about.

    But it would be good to clarify a little bit. Could rename "OnBaseMutate" to "OnMutateBase" and change "before base view is mutated" to "before this class updates the base or writes to it." I don't think it would be necessary, but it would also be reasonable to add a comment to CCoinsViewCache::FetchCoinFromBase saying something like "Depending on the base view, calling base->GetCoin() below could mutate its internal cache and interfere with concurrent readers. So it could be useful to call OnMutateBase() here, but it is not called because don't want to assume a get call is always a mutation, and because there are many other ways for callers to synchronize readers without this."

  157. sedited requested review from sipa on Jun 25, 2026
  158. ryanofsky approved
  159. ryanofsky commented at 7:08 PM on June 25, 2026: contributor

    Code review ACK 0aa7ca77c866a255259296719f12bbd3ba7e56d5 with caveats that so far I only lightly reviewed the tests, and I still have reservations about the fragility of the optimization this PR is implementing.

    I don't think the CoinsViewOverlay class should quietly allow inputs to be fetched that are never used. I also don't like that validation code needs to check whether inputs are used instead of CoinsViewOverlay handling this internally.

    It would seem better if CoinsViewOverlay had explicit Assume statements for the assumptions it makes, and had escape hatches for tests that violate those assumptions. I'm also skeptical about whether anything useful is being tested while violating the assumptions. For example it seems like coins_view_overlay test creates 8 fetch threads but stops fetching as soon as it enters the TestCoinsView while loop by mutating the cache, and even if it kept fetching, the fetch method would almost never find a matching input anyway. Unless i'm misunderstanding the test or the code (which is very possible), having this type of test doesn't seem like a good justification for extra fragility and complexity (03c83f66f597f6bea46105575e1bbc84722b8433) in production code.

    I don't want to get hung up on these points though, since overall this is a really nice PR, and even if there was a bug that cause the optimization to not work reliably, this should not be a very big deal. It just seems like something that should be more possible to catch.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-06-27 15:51 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me