fuzz: target concurrent leveldb reads #34866

pull andrewtoth wants to merge 2 commits into bitcoin:master from andrewtoth:fuzz_db_reads changing 1 files +145 −17
  1. andrewtoth commented at 1:41 PM on March 19, 2026: contributor

    Inspired by #31132 (comment).

    We currently do concurrent leveldb reads when accessing our indexes.

    1. txindex - we call FindTx() from multiple RPC threads.
    2. blockfilterindex - we call LookupFilter/Header() concurrently from msghand thread for p2p requests as well as RPC threads.
    3. coinstatsindex - we call LookUpStats() from multiple RPC threads.
    4. txospenderindex - we call FindSpender() from multiple RPC threads.

    We also read from our chainstate and blocks index while background compactions are writing.

    While OSS-Fuzz does cover leveldb (https://github.com/google/oss-fuzz/blob/master/projects/leveldb/fuzz_db.cc), it doesn't cover multi threaded access. Without a deterministic hypervisor this fuzz harness won't be deterministic, but we can at least run it with TSan to get a higher confidence that the synchronization code in leveldb is correct. Hopefully other reviewers find this useful.

    This harness creates a threadpool with 16 threads, and then creates an in-memory levelDB which it seeds with deterministically random values. It chooses a random set of keys to query. It first performs all queries on the db on a single thread to get a baseline, then synchronizes all threads on a latch so they hit the db at the same time. Each thread performs the same queries, and afterwards are all checked against the baseline.

    It uses a DeterministicEnv to capture background compaction work when seeding the db, which is also run immediately after the latch is released. This causes a race between compaction and reading, ensuring we exercise many thread synchronization code paths in leveldb.

    I ran both TSan and ASan/UBSan overnight with no issues.

  2. DrahtBot added the label Fuzzing on Mar 19, 2026
  3. DrahtBot commented at 1:41 PM on March 19, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--006a51241073e994b41acfe9ec718e94-->

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34866.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK fjahr, l0rinc
    Stale ACK sedited, furszy

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30342 (kernel, logging: Pass Logger instances to kernel objects by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

  4. maflcko commented at 1:52 PM on March 19, 2026: member

    While OSS-Fuzz does cover leveldb (https://github.com/google/oss-fuzz/blob/master/projects/leveldb/fuzz_db.cc), it ...

    I wouldn't put too much value in this. I think there is only a single fuzz target, which fails, according to https://issues.oss-fuzz.com/issues/447252244, so it may be unmaintained.

    <!-- > TSan Reminds me that we didn't have any multi-threaded test in the past, so there is no TSan fuzz CI config. Could make sense to consider adding one at some point?

  5. sedited requested review from marcofleon on Mar 23, 2026
  6. andrewtoth force-pushed on Mar 29, 2026
  7. andrewtoth commented at 6:51 PM on March 29, 2026: contributor

    Updated this to be a commit on top of #34887, so that we can reuse the DeterministicEnv to run compaction concurrently with reading.

  8. furszy commented at 11:59 PM on March 29, 2026: member

    Concept ACK. Useful to verify this behavior on our end.

  9. sedited commented at 9:32 PM on May 2, 2026: contributor

    Concept ACK

  10. andrewtoth marked this as a draft on May 7, 2026
  11. andrewtoth commented at 8:49 AM on May 7, 2026: contributor

    Converted to draft until #34887 is merged.

  12. fanquake referenced this in commit bd0942bbd9 on May 21, 2026
  13. DrahtBot added the label Needs rebase on May 21, 2026
  14. rustaceanrob referenced this in commit 2a63f69878 on May 22, 2026
  15. andrewtoth force-pushed on May 23, 2026
  16. andrewtoth marked this as ready for review on May 23, 2026
  17. andrewtoth commented at 9:17 PM on May 23, 2026: contributor

    Rebased after #34887 was merged.

    This is ready for review again.

  18. DrahtBot removed the label Needs rebase on May 23, 2026
  19. sedited approved
  20. sedited commented at 2:25 PM on May 25, 2026: contributor

    ACK 6c8e7c947be024eb965593cfad9c0c64710e5b86

  21. DrahtBot requested review from furszy on May 25, 2026
  22. sedited requested review from fjahr on May 25, 2026
  23. in src/test/fuzz/dbwrapper.cpp:176 in 6c8e7c947b
     171 | +
     172 | +void initialize_dbwrapper_concurrent_reads()
     173 | +{
     174 | +    static auto setup{MakeNoLogFileContext<>()};
     175 | +    // Will not work with fork. Different threads will cause deadlock via latch.
     176 | +    g_pool.Start(NUM_WORKERS);
    


    andrewtoth commented at 3:40 AM on May 26, 2026:

    @marcofleon since we use a latch for all threads of the pool, if more than one thread is executing an iteration at a time with the same pool it will deadlock. So, I opted to not use the similar pattern we have for the threadpool fuzz harness. Will this still work with your fuzzing infrastructure, or should I try a different approach with the latch?


    marcofleon commented at 9:53 PM on May 26, 2026:

    Maybe for this target it'd be better to do a local thread pool per iteration. It'll be slower for sure (~20% slower based on my initial TSan runs), but then you would avoid the fork issue. And it feels more straightforward for what this harness is trying to accomplish.

    Also, we can probably make up for that slowdown in other ways without losing much, e.g. decreasing the max number of entries and/or max number of reads?


    andrewtoth commented at 12:11 AM on June 1, 2026:

    Updated to use a local thread pool.

  24. in src/test/fuzz/dbwrapper.cpp:361 in 6c8e7c947b
     356 | +    const size_t num_threads{provider.ConsumeIntegralInRange<uint8_t>(2, NUM_WORKERS)};
     357 | +
     358 | +    const auto memenv{std::unique_ptr<leveldb::Env>{leveldb::NewMemEnv(leveldb::Env::Default())}};
     359 | +    DeterministicEnv det_env{memenv.get()};
     360 | +
     361 | +    CDBWrapper db{DBParams{
    


    fjahr commented at 8:26 PM on May 26, 2026:

    micro-nit: Could have possibly deduplicated this, looking at make_db above, but feel free to ignore

  25. in src/test/fuzz/dbwrapper.cpp:170 in 6c8e7c947b
     165 | @@ -152,6 +166,16 @@ void VerifyIterator(CDBWrapper& dbw, const Oracle& oracle,
     166 |      assert(oracle_it == oracle.end());
     167 |  }
     168 |  
     169 | +constexpr size_t NUM_WORKERS{16};
     170 | +ThreadPool g_pool{"dbfuzz"};
    


    fjahr commented at 8:34 PM on May 26, 2026:

    The comment here on why this is a global was valuable information for me. Maybe consider adding at least a short version here?


    andrewtoth commented at 12:12 AM on June 1, 2026:

    Replaced the global pool with local one.

  26. in src/test/fuzz/dbwrapper.cpp:384 in 6c8e7c947b
     379 | +    for (size_t start{0}; start < num_entries; start += SEED_BATCH_SIZE) {
     380 | +        CDBBatch batch{db};
     381 | +        const size_t end{std::min(start + SEED_BATCH_SIZE, num_entries)};
     382 | +        for (size_t i{start}; i < end; ++i) {
     383 | +            auto k{rng.rand256().ToString()};
     384 | +            const auto v{rng.rand256().ToString()};
    


    furszy commented at 8:39 PM on May 26, 2026:

    What about consuming random length strings directly from the provider?

    auto k{provider.ConsumeRandomLengthString(256)};
    const auto v{provider.ConsumeRandomLengthString(256)};
    

    As it right now, keys and values are deterministic per FastRandomContext(fDeterministic)=true, so the db is always populated in the same way. Could also vary the size of the key-value to store data even more dispersed.


    andrewtoth commented at 12:12 AM on June 1, 2026:

    Updated this to use the ConsumeKey and ConsumeValueSize helpers.

  27. furszy commented at 8:42 PM on May 26, 2026: member

    Code review ACK 6c8e7c947be024eb965593cfad9c0c64710e5b86

  28. in src/test/fuzz/dbwrapper.cpp:356 in 6c8e7c947b
     351 | +FUZZ_TARGET(dbwrapper_concurrent_reads, .init = initialize_dbwrapper_concurrent_reads)
     352 | +{
     353 | +    SeedRandomStateForTest(SeedRand::ZEROS);
     354 | +
     355 | +    FuzzedDataProvider provider{buffer.data(), buffer.size()};
     356 | +    const size_t num_threads{provider.ConsumeIntegralInRange<uint8_t>(2, NUM_WORKERS)};
    


    fjahr commented at 8:51 PM on May 26, 2026:

    Given what is being tested here, is there real value in restricting the number of workers lower than NUM_WORKERS?


    andrewtoth commented at 12:37 AM on June 1, 2026:

    Hmm yeah I don't see any reason to not have max threads each iteration. Updated to max each time.

  29. in src/test/fuzz/dbwrapper.cpp:363 in 6c8e7c947b
     358 | +    const auto memenv{std::unique_ptr<leveldb::Env>{leveldb::NewMemEnv(leveldb::Env::Default())}};
     359 | +    DeterministicEnv det_env{memenv.get()};
     360 | +
     361 | +    CDBWrapper db{DBParams{
     362 | +        .path = "dbwrapper_fuzz",
     363 | +        .cache_bytes = provider.ConsumeIntegralInRange<size_t>(64 << 10, 1_MiB),
    


    fjahr commented at 8:56 PM on May 26, 2026:

    Based on the write_buffer_size comment below, shouldn't this be pinned to a small size so there are reliably more flushes and compaction?


    andrewtoth commented at 12:40 AM on June 1, 2026:

    I think we want to explore the whole space of lots of flushes and compactions and none at all. We don't want to limit what the fuzzer can find.

  30. in src/test/fuzz/dbwrapper.cpp:404 in 6c8e7c947b outdated
     399 | +        } else {
     400 | +            queries.push_back(provider.ConsumeRandomLengthString(64));
     401 | +        }
     402 | +    }
     403 | +
     404 | +    // Baseline read on a single thread
    


    fjahr commented at 9:29 PM on May 26, 2026:

    The DB creation and this baseline stuff seems to add considerable overhead for each run so I thought about if this could be made reusable somehow since the core test isn't deterministic anyway. So there could be more rounds of the non-deterministic test per db and baseline. It seems to be tricky though, best I could come up with is to write more keys to the DB in order to make compaction necessary again but keep them in a separate namespace (via prefix or so) from the original set to not change the results of the baseline. So, basically doing everything that is done in the test now but then continuing by adding these out-of-baseline keys and rerunning the futures part a few more times.

    Not sure if this really makes sense and is worth the effort though.


    andrewtoth commented at 12:46 AM on June 1, 2026:

    I suppose how much we want to optimize this depends on the throughput we want to achieve. You are right that we can't just seed a global db once and read from it, since we wouldn't get any flushes or compactions that way.

  31. fjahr commented at 9:31 PM on May 26, 2026: contributor

    Concept ACK

    Seems valuable to test this. I couldn't find any serious issues with the code but I'm leaving some naive questions since I'm not an expert on fuzz tests :)

  32. andrewtoth force-pushed on Jun 1, 2026
  33. andrewtoth force-pushed on Jun 1, 2026
  34. DrahtBot added the label CI failed on Jun 1, 2026
  35. DrahtBot commented at 12:37 AM on June 1, 2026: contributor

    <!--85328a0da195eb286784d51f73fa0af9-->

    🚧 At least one of the CI tasks failed. <sub>Task FreeBSD Cross: https://github.com/bitcoin/bitcoin/actions/runs/26728496965/job/78767847723</sub> <sub>LLM reason (✨ experimental): CI failed due to a C++ build error in the fuzz target: dbwrapper.cpp calls keys.push_back(...) with the wrong type (unsigned short instead of std::string), so clang++ reports “no matching member function for call to push_back.”</sub>

    <details><summary>Hints</summary>

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

    </details>

  36. andrewtoth commented at 12:54 AM on June 1, 2026: contributor

    Thank you @sedited @furszy @marcofleon @fjahr for your reviews! I have taken most of your suggestions.

    • Use a local thread pool per iteration instead of a global.
    • Consume input for keys and values used to seed db.
    • Dedupe db creation logic.
    • Use a static number of threads per iteration (16) instead of consuming a range from input.

    git diff 6c8e7c947be024eb965593cfad9c0c64710e5b86..fb864c5dfabefb326290208f577dd84c0ecd7b7b

  37. DrahtBot removed the label CI failed on Jun 1, 2026
  38. in src/test/fuzz/dbwrapper.cpp:173 in fb864c5dfa outdated
     165 | @@ -152,6 +166,25 @@ void VerifyIterator(CDBWrapper& dbw, const Oracle& oracle,
     166 |      assert(oracle_it == oracle.end());
     167 |  }
     168 |  
     169 | +/** Maximum number of concurrent reader threads in dbwrapper_concurrent_reads. */
     170 | +constexpr size_t MAX_READ_WORKERS{16};
     171 | +
     172 | +/** Build randomized DBParams from the fuzz input, shared by all targets. */
     173 | +DBParams ConsumeDBParams(FuzzedDataProvider& provider, leveldb::Env* testing_env,
    


    l0rinc commented at 4:32 PM on June 1, 2026:

    fb864c5 fuzz: target concurrent leveldb reads:

    Could this helper extraction be a separate preparatory commit?

  39. in src/test/fuzz/dbwrapper.cpp:365 in fb864c5dfa
     360 | +    CDBWrapper db{ConsumeDBParams(provider, &det_env, /*obfuscate=*/provider.ConsumeBool())};
     361 | +
     362 | +    // Seed the DB. Drain work between each to prevent deadlock.
     363 | +    // With write_buffer_size=64KB, ~860 entries fill a memtable,
     364 | +    // 4 flushes trigger L0→L1 compaction.
     365 | +    FastRandomContext rng{/*fDeterministic=*/true};
    


    l0rinc commented at 4:38 PM on June 1, 2026:

    fb864c5 fuzz: target concurrent leveldb reads:

    This looks unused, along with the corresponding include.

  40. in src/test/fuzz/dbwrapper.cpp:369 in fb864c5dfa outdated
     364 | +    // 4 flushes trigger L0→L1 compaction.
     365 | +    FastRandomContext rng{/*fDeterministic=*/true};
     366 | +    const size_t num_entries{provider.ConsumeIntegralInRange<size_t>(100, 5'000)};
     367 | +    std::vector<uint16_t> keys;
     368 | +    keys.reserve(num_entries);
     369 | +    constexpr size_t SEED_BATCH_SIZE{400};
    


    l0rinc commented at 4:47 PM on June 1, 2026:

    fb864c5 fuzz: target concurrent leveldb reads:

    Is the fixed 400-entry batch size important here? If the bounded batches are needed so queued background work can be drained before a later write blocks, could the comment say that explicitly? Otherwise, a single write batch would be simpler.


    andrewtoth commented at 2:02 PM on June 2, 2026:

    Added a comment explaining we need to drain between writes. A big batch write without draining can deadlock if there's no background thread.

  41. in src/test/fuzz/dbwrapper.cpp:389 in fb864c5dfa
     384 | +
     385 | +    // Build query list from seeded and random keys.
     386 | +    std::vector<uint16_t> queries;
     387 | +    LIMITED_WHILE(provider.ConsumeBool(), 2000) {
     388 | +        if (provider.ConsumeBool()) {
     389 | +            queries.push_back(keys[provider.ConsumeIntegralInRange<size_t>(0, keys.size() - 1)]);
    


    l0rinc commented at 4:52 PM on June 1, 2026:

    fb864c5 fuzz: target concurrent leveldb reads:

    Maybe we could use num_entries here - though we're indexing the keys so maybe it makes sense. There's no way for the keys to change while we're generating these, right?

    And we could reduce some duplication by:

    queries.push_back(provider.ConsumeBool()
                      ? keys[provider.ConsumeIntegralInRange<size_t>(0, keys.size() - 1)]
                      : ConsumeKey(provider));
    
  42. in src/test/fuzz/dbwrapper.cpp:398 in fb864c5dfa
     393 | +    }
     394 | +
     395 | +    // Baseline read on a single thread
     396 | +    using Results = std::vector<std::optional<std::string>>;
     397 | +    const auto num_queries{queries.size()};
     398 | +    Results baseline(num_queries);
    


    l0rinc commented at 4:58 PM on June 1, 2026:

    fb864c5 fuzz: target concurrent leveldb reads:

    Since this isn't testing anything (seems like a truth oracle), maybe we could extract it to a separate method. Edit: This would likely pass if read is incorrect on a single thread as well - if it's simpler, consider using an std::unordered_map oracle here, too.


    andrewtoth commented at 2:05 PM on June 2, 2026:

    This would likely pass if read is incorrect on a single thread as well

    We need to have some baseline of correctness. We rely on single threaded reads being correct today, so we can use this as the oracle.

  43. in src/test/fuzz/dbwrapper.cpp:405 in fb864c5dfa outdated
     400 | +        std::string v;
     401 | +        if (db.Read(queries[i], v)) baseline[i].emplace(std::move(v));
     402 | +    }
     403 | +
     404 | +    ThreadPool pool{"dbfuzz"};
     405 | +    pool.Start(MAX_READ_WORKERS);
    


    l0rinc commented at 4:59 PM on June 1, 2026:

    fb864c5 fuzz: target concurrent leveldb reads:

    Instead of always starting the same "max" number of threads, could we fuzz this as well:

    diff --git a/src/test/fuzz/dbwrapper.cpp b/src/test/fuzz/dbwrapper.cpp
    --- a/src/test/fuzz/dbwrapper.cpp	(revision a09d02e14fcab84aef00658e5143ad0c2ab62a20)
    +++ b/src/test/fuzz/dbwrapper.cpp	(revision 7155ea1326aaf6a636968688506f48f4476d3cde)
    @@ -166,7 +166,7 @@
     }
     
     /** Maximum number of concurrent reader threads in dbwrapper_concurrent_reads. */
    -constexpr size_t MAX_READ_WORKERS{16};
    +constexpr int MAX_READ_WORKERS{16};
     
     /** Build randomized DBParams from the fuzz input, shared by all targets. */
     DBParams ConsumeDBParams(FuzzedDataProvider& provider, leveldb::Env* testing_env,
    @@ -455,12 +455,13 @@
         const ReadResults baseline{RunReadQueries(db, queries)};
     
         ThreadPool pool{"dbfuzz"};
    -    pool.Start(MAX_READ_WORKERS);
    +    const int read_workers{provider.ConsumeIntegralInRange<int>(1, MAX_READ_WORKERS)};
    +    pool.Start(read_workers);
     
         // Workers + main thread synchronize on the latch.
    -    std::latch start_latch{static_cast<ptrdiff_t>(MAX_READ_WORKERS + 1)};
    +    std::latch start_latch{ptrdiff_t(read_workers + 1)};
     
    -    std::vector<std::function<ReadResults()>> tasks(MAX_READ_WORKERS, [&]() -> ReadResults {
    +    std::vector<std::function<ReadResults()>> tasks(read_workers, [&]() -> ReadResults {
             start_latch.arrive_and_wait();
             return RunReadQueries(db, queries);
         });
    

    andrewtoth commented at 2:06 PM on June 2, 2026:

    See #34866 (review). I can't think of a compelling reason why we would ever want to have fewer threads than max.

  44. in src/test/fuzz/dbwrapper.cpp:414 in fb864c5dfa
     409 | +
     410 | +    std::vector<std::function<Results()>> tasks(MAX_READ_WORKERS, [&]() -> Results {
     411 | +        Results results(num_queries);
     412 | +        start_latch.arrive_and_wait();
     413 | +        for (const auto i : std::views::iota(size_t{0}, num_queries)) {
     414 | +            if (std::string v; db.Read(queries[i], v)) results[i].emplace(std::move(v));
    


    l0rinc commented at 5:14 PM on June 1, 2026:

    fb864c5 fuzz: target concurrent leveldb reads:

    Could we extract the repeated dbwrapper read loop into a helper used by both the single-threaded baseline and concurrent worker tasks, something like:

    diff --git a/src/test/fuzz/dbwrapper.cpp b/src/test/fuzz/dbwrapper.cpp
    --- a/src/test/fuzz/dbwrapper.cpp	(revision d79f92fe005fe3a03896a11badba575e2ef3179d)
    +++ b/src/test/fuzz/dbwrapper.cpp	(revision a09d02e14fcab84aef00658e5143ad0c2ab62a20)
    @@ -202,6 +202,8 @@
         friend bool operator==(const ReadResult&, const ReadResult&) = default;
     };
     
    +using ReadResults = std::vector<ReadResult>;
    +
     ReadResult RunReadQuery(CDBWrapper& db, const ReadQuery& query)
     {
         switch (query.type) {
    @@ -228,6 +230,16 @@
         return {};
     }
     
    +ReadResults RunReadQueries(CDBWrapper& db, const std::vector<ReadQuery>& queries)
    +{
    +    ReadResults results;
    +    results.reserve(queries.size());
    +    for (const auto& query : queries) {
    +        results.push_back(RunReadQuery(db, query));
    +    }
    +    return results;
    +}
    +
     template <typename DrainWorkFn, typename RunOneFn>
     void TestDbWrapper(FuzzedDataProvider& provider,
                           leveldb::Env* testing_env,
    @@ -440,12 +452,7 @@
             }
         }
     
    -    // Baseline read on a single thread
    -    using Results = std::vector<ReadResult>;
    -    Results baseline(num_queries);
    -    for (const auto i : std::views::iota(size_t{0}, num_queries)) {
    -        baseline[i] = RunReadQuery(db, queries[i]);
    -    }
    +    const ReadResults baseline{RunReadQueries(db, queries)};
     
         ThreadPool pool{"dbfuzz"};
         pool.Start(MAX_READ_WORKERS);
    @@ -453,13 +460,9 @@
         // Workers + main thread synchronize on the latch.
         std::latch start_latch{static_cast<ptrdiff_t>(MAX_READ_WORKERS + 1)};
     
    -    std::vector<std::function<Results()>> tasks(MAX_READ_WORKERS, [&]() -> Results {
    -        Results results(num_queries);
    +    std::vector<std::function<ReadResults()>> tasks(MAX_READ_WORKERS, [&]() -> ReadResults {
             start_latch.arrive_and_wait();
    -        for (const auto i : std::views::iota(size_t{0}, num_queries)) {
    -            results[i] = RunReadQuery(db, queries[i]);
    -        }
    -        return results;
    +        return RunReadQueries(db, queries);
         });
     
         auto futures{*Assert(pool.Submit(std::move(tasks)))};
    
  45. in src/test/fuzz/dbwrapper.cpp:421 in fb864c5dfa
     416 | +        return results;
     417 | +    });
     418 | +
     419 | +    auto futures{*Assert(pool.Submit(std::move(tasks)))};
     420 | +
     421 | +    // Release the latch and immediately run compaction on the main
    


    l0rinc commented at 5:15 PM on June 1, 2026:

    fb864c5 fuzz: target concurrent leveldb reads:

    Where exactly does compaction run on the main thread here? Also, why do we drain queued work both before and after the concurrent reads? I remember this was explained somewhere but I don't fully understand the reason.


    andrewtoth commented at 2:12 PM on June 2, 2026:

    Where exactly does compaction run on the main thread here?

    Added a comment to explain.

    Also, why do we drain queued work both before and after the concurrent reads?

    Seek compactions during the reads could cause a compaction to be scheduled, so we drain again. Shouldn't be required if rebased on master. Can be removed in a follow-up when merged.

  46. in src/test/fuzz/dbwrapper.cpp:387 in fb864c5dfa
     382 | +
     383 | +    while (provider.ConsumeBool() && det_env.RunOne()) {}
     384 | +
     385 | +    // Build query list from seeded and random keys.
     386 | +    std::vector<uint16_t> queries;
     387 | +    LIMITED_WHILE(provider.ConsumeBool(), 2000) {
    


    l0rinc commented at 5:35 PM on June 1, 2026:

    fb864c5 fuzz: target concurrent leveldb reads:

    I understand this lets the fuzzer decide at every step whether it needs more values, but it also allows an input to release all workers and then perform zero read queries. That execution exercises the latch and compaction path, but not the concurrent-reader invariant. Please consider this alternative, which I find more predictable:

    diff --git a/src/test/fuzz/dbwrapper.cpp b/src/test/fuzz/dbwrapper.cpp
    --- a/src/test/fuzz/dbwrapper.cpp	(revision fb864c5dfabefb326290208f577dd84c0ecd7b7b)
    +++ b/src/test/fuzz/dbwrapper.cpp	(revision f8dfe2ec9047d66f1d0b822050d044944b950db1)
    @@ -383,8 +383,10 @@
         while (provider.ConsumeBool() && det_env.RunOne()) {}
     
         // Build query list from seeded and random keys.
    +    const size_t num_queries{provider.ConsumeIntegralInRange<size_t>(1, 2'000)};
         std::vector<uint16_t> queries;
    -    LIMITED_WHILE(provider.ConsumeBool(), 2000) {
    +    queries.reserve(num_queries);
    +    for (size_t i{0}; i < num_queries; ++i) {
             if (provider.ConsumeBool()) {
                 queries.push_back(keys[provider.ConsumeIntegralInRange<size_t>(0, keys.size() - 1)]);
             } else {
    @@ -394,7 +396,6 @@
     
         // Baseline read on a single thread
         using Results = std::vector<std::optional<std::string>>;
    -    const auto num_queries{queries.size()};
         Results baseline(num_queries);
         for (const auto i : std::views::iota(size_t{0}, num_queries)) {
             std::string v;
    
  47. in src/test/fuzz/dbwrapper.cpp:401 in fb864c5dfa
     396 | +    using Results = std::vector<std::optional<std::string>>;
     397 | +    const auto num_queries{queries.size()};
     398 | +    Results baseline(num_queries);
     399 | +    for (const auto i : std::views::iota(size_t{0}, num_queries)) {
     400 | +        std::string v;
     401 | +        if (db.Read(queries[i], v)) baseline[i].emplace(std::move(v));
    


    l0rinc commented at 5:37 PM on June 1, 2026:

    fb864c5 fuzz: target concurrent leveldb reads:

    Read() is not the only read-only operation exposed here. The sibling dbwrapper target checks Read(), Exists(), and iterator lookups in a single-threaded schedule. Could we exercise those paths here as well?

  48. in src/test/fuzz/dbwrapper.cpp:419 in fb864c5dfa
     414 | +            if (std::string v; db.Read(queries[i], v)) results[i].emplace(std::move(v));
     415 | +        }
     416 | +        return results;
     417 | +    });
     418 | +
     419 | +    auto futures{*Assert(pool.Submit(std::move(tasks)))};
    


    furszy commented at 5:52 PM on June 1, 2026:

    probably could expand coverage by submitting tasks that read the db through iterators in parallel too (CDBWrapper:NewIterator)

  49. furszy commented at 5:53 PM on June 1, 2026: member

    ACK fb864c5dfabefb326290208f577dd84c0ecd7b7b

  50. DrahtBot requested review from fjahr on Jun 1, 2026
  51. DrahtBot requested review from sedited on Jun 1, 2026
  52. l0rinc changes_requested
  53. l0rinc commented at 5:59 PM on June 1, 2026: contributor

    Concept ACK

  54. in src/test/fuzz/dbwrapper.cpp:394 in fb864c5dfa outdated
     389 | +            queries.push_back(keys[provider.ConsumeIntegralInRange<size_t>(0, keys.size() - 1)]);
     390 | +        } else {
     391 | +            queries.push_back(ConsumeKey(provider));
     392 | +        }
     393 | +    }
     394 | +
    


    l0rinc commented at 11:08 AM on June 2, 2026:

    Q: Would it make sense to shuffle the queries so that they're not accessing things in a predictable way? Just resolve if you don't think it's worth it.

  55. in src/test/fuzz/dbwrapper.cpp:412 in fb864c5dfa
     407 | +    // Workers + main thread synchronize on the latch.
     408 | +    std::latch start_latch{static_cast<ptrdiff_t>(MAX_READ_WORKERS + 1)};
     409 | +
     410 | +    std::vector<std::function<Results()>> tasks(MAX_READ_WORKERS, [&]() -> Results {
     411 | +        Results results(num_queries);
     412 | +        start_latch.arrive_and_wait();
    


    l0rinc commented at 11:10 AM on June 2, 2026:

    Was also wondering if we're introducing some bias by always releasing all threads at the exact same time. Maybe we could optionally wait a tiny bit and release them out of order to introduce more variability? Just resolve if you disagree, I don't have a strong opinion.


    andrewtoth commented at 2:08 PM on June 2, 2026:

    Ultimately we have the thread scheduler which we can't control that will decide when the threads begin executing :shrug:

  56. fuzz: extract ConsumeDBParams helper
    Pull the inline DBParams construction out of TestDbWrapper into a shared
    ConsumeDBParams() helper. This is a pure refactor with no behavior change,
    preparing for an additional harness that needs to build the same params.
    6609088fe6
  57. fuzz: target concurrent leveldb reads 8cb8653a22
  58. andrewtoth force-pushed on Jun 2, 2026
  59. andrewtoth commented at 2:10 PM on June 2, 2026: contributor

    Thanks @furszy and @l0rinc for the reviews. I've taken most of your suggestions.

    • Perform iterator seek and exists reads as well as regular reads.
    • Randomize read order per thread.
    • Addressed other code style and comment suggestions.

    git diff fb864c5dfabefb326290208f577dd84c0ecd7b7b..8cb8653a2236ee4e2e683216b31919c83df43920

  60. fjahr commented at 7:04 PM on June 2, 2026: contributor

    Code review ACK 8cb8653a2236ee4e2e683216b31919c83df43920

    Changes since my last review all look good to me.

  61. DrahtBot requested review from l0rinc on Jun 2, 2026
  62. DrahtBot requested review from furszy on Jun 2, 2026
  63. in src/test/fuzz/dbwrapper.cpp:420 in 8cb8653a22
     415 | +    // Build query list from seeded and random keys.
     416 | +    const size_t num_queries{provider.ConsumeIntegralInRange<size_t>(1, 2'000)};
     417 | +    std::vector<ReadQuery> queries;
     418 | +    queries.reserve(num_queries);
     419 | +    for (size_t i{0}; i < num_queries; ++i) {
     420 | +        const auto op{static_cast<ReadOp>(provider.ConsumeIntegralInRange<int>(0, 2))};
    


    l0rinc commented at 8:27 PM on June 2, 2026:

    nit: we can probably use PickValueInArray here

  64. l0rinc approved
  65. l0rinc commented at 8:28 PM on June 2, 2026: contributor

    ACK 8cb8653a2236ee4e2e683216b31919c83df43920

  66. fanquake merged this on Jun 3, 2026
  67. fanquake closed this on Jun 3, 2026

  68. sedited commented at 9:04 AM on June 3, 2026: contributor

    When I now run the fuzz test with sanitizers on, the job exceeds 2Gb of memory, and the slowest unit takes about 100s. A prior version of this PR did not have this issue.

    cc @dergoegge @marcofleon is this acceptable for our infrastructure and integration tests?

  69. marcofleon commented at 3:59 PM on June 3, 2026: contributor

    I ran the target for a while with TSan and got an oom input:

    //9F//85JuP/EBAQEBD/////////9PT09PT0/////////////////////////////////////////////////////////////////////////5v/Qy9DQ0Mp//////////////T09PRkZGRkZGRk9PT/////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////////Cv8AAP+p/////////5ub//////b///8pAAAC///2/////////////////////6n/////m5ubm5ubm5ubm5ubm/9DJi9DQ8Mp//////////////T09PT09P//////////////////////////////////////////////////////////////////////////////////////////////////////////KQD//////////////////////////////////////////wAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAr//6n/////////m5v//////////////////////////////////////////////////////////////////////////7////////////////////8QEBAQEBAQ4/8QEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEAAAAAAAABAQEA4QEAoQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEBAQEAAAAAAAABAQEA==
    

    It reproduces on my Debian machine showing peak_rss_mb of 2597mb, higher than LibFuzzer's 2048 limit. On macOS this particular input shows a peak_rss_mb of 1673, but pretty sure that it would produce an oom input eventually, as rss keeps increasing the longer the test runs.

    I have a commit that fixes the oom for this input, bringing it down to 300mb. Instead of every worker returning the full Results vector, they just check against the baseline one at a time and only return a bool. The test's behavior should be the same.

    Separately, I don't think switching to a local threadpool was the right move (my bad). Under TSan it's just too slow the longer I run it and I am seeing rss still increase gradually, which is probably from all the overhead of creating and destroying a 16-worker threadpool every iteration. We should go back to a persistent pool.

    We should be able to use the same pattern that is in the threadpool fuzz harness. If we start the global pool on the first iteration then libFuzzer/LibAFL fork would work and each child gets its own pool. Waiting on the latch wouldn't be an issue, as each child is its own process.

    I've been running the patch above with a global threadpool and rss ~now stays constant~ increases at a much slower rate (just from larger and larger inputs). Execution speed is still fairly slow. I think the most impactful way to speed it up and deal with potential ooms would be decreasing MAX_READ_WORKERS, maybe to 8 if acceptable. Less impactful would be decreasing the upper range of num_entries and/or num_queries.

  70. andrewtoth deleted the branch on Jun 3, 2026

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-06-11 10:51 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me