fuzz: target CDBWrapper #34887

andrewtoth commented at 1:12 AM on March 21, 2026: contributor

We currently don't have a dedicated harness targeting CDBWrapper. OSS-Fuzz has a rudimentary harness for levelDB which fails, so doesn't appear maintained.

This PR adds a harness targeting CDBWrapper against an in-memory oracle to verify correctness.

A DeterministicEnv wraps levelDB's memenv to eliminate non-determinism by capturing background compaction and running it at fuzzer-chosen points. The fuzzer also controls the cache_bytes and max_file_size sizes so that small values trigger memtable flushes and compaction.

DrahtBot added the label Fuzzing on Mar 21, 2026

DrahtBot commented at 1:12 AM on March 21, 2026: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34887.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	l0rinc, marcofleon, dergoegge, sedited

If your review is incorrectly listed, please copy-paste <code></code> into the comment that the bot should ignore.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#34866 (fuzz: target concurrent leveldb reads by andrewtoth)
#30342 (kernel, logging: Pass Logger instances to kernel objects by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot added the label CI failed on Mar 21, 2026

andrewtoth force-pushed on Mar 21, 2026

DrahtBot removed the label CI failed on Mar 21, 2026

andrewtoth force-pushed on Mar 21, 2026

DrahtBot added the label CI failed on Mar 22, 2026

andrewtoth force-pushed on Mar 22, 2026

DrahtBot removed the label CI failed on Mar 22, 2026

andrewtoth force-pushed on Mar 29, 2026

marcofleon commented at 3:22 PM on April 2, 2026: contributor

Concept ACK

Nice differential fuzz test. I'm running it now and I'll do a more thorough review soon.

andrewtoth force-pushed on Apr 11, 2026

andrewtoth commented at 3:10 PM on April 11, 2026: contributor

Improved the assertions around iterators. This now exhausts the DB iterator completely instead of using LIMITED_WHILE, and asserts that the oracle iterator is also exhausted, ensuring no entries are missing.

Also added coverage for failing to deserialize a value by reading with a type that always throws on deserialization.

git diff 430f357f7e304336d2b61f72d2bc5439dea557d8..b967e255e7620b8366ec1aeb8932138455be8701

in src/dbwrapper.cpp:226 in 9e444624bb outdated

 222 | @@ -224,6 +223,7 @@ CDBWrapper::CDBWrapper(const DBParams& params)
 223 |      DBContext().syncoptions.sync = true;
 224 |      DBContext().options = GetOptions(params.cache_bytes);
 225 |      DBContext().options.create_if_missing = true;
 226 | +    DBContext().options.max_file_size = params.max_file_size;

l0rinc commented at 9:11 AM on April 14, 2026:

9e44462 dbwrapper: make max_file_size a configurable DBParams field:

Is this useful? Changing this back and forth is not something we do and will likely make running this very slow (based on the chosen values)

andrewtoth commented at 10:34 PM on April 15, 2026:

Yes, this is very useful. We want to exercise file compactions at different sizes. At 32 MB default there is no way for input to trigger compactions without very large input lengths. That would be very slow. How do you mean this will be slow otherwise?

l0rinc commented at 9:52 AM on April 16, 2026:

.max_file_size = provider.ConsumeIntegralInRange<size_t>(1_MiB, 4_MiB),

I’m fine with exposing the knob, but I still think the actually used current value should be exercised explicitly. Right now the harness samples only 1–4 MiB and never hits DBWRAPPER_MAX_FILE_SIZE

andrewtoth commented at 7:30 AM on May 7, 2026:

We now optionally hit DBWRAPPER_MAX_FILE_SIZE as well.

in src/dbwrapper.cpp:229 in e903203536 outdated

 223 | @@ -224,7 +224,9 @@ CDBWrapper::CDBWrapper(const DBParams& params)
 224 |      DBContext().options = GetOptions(params.cache_bytes);
 225 |      DBContext().options.create_if_missing = true;
 226 |      DBContext().options.max_file_size = params.max_file_size;
 227 | -    if (params.memory_only) {
 228 | +    if (params.testing_env) {
 229 | +        DBContext().options.env = params.testing_env;
 230 | +    } else if (params.memory_only) {

l0rinc commented at 9:12 AM on April 14, 2026:

e903203 dbwrapper: accept optional testing leveldb::Env in DBParams:

Are these two configs mutually exclusive now? If so, we should add an assert to document that we can't use in-memory leveldb in a testing env.

l0rinc commented at 3:50 PM on May 9, 2026:

I meant here that reusing the same Env and path with wipe_data = true will reopen stale data now, even though DBParams documents wipe_data as removing all existing data.

in test/sanitizer_suppressions/ubsan:22 in b967e255e7 outdated

  18 | @@ -19,6 +19,7 @@ implicit-integer-sign-change:*/include/c++/
  19 |  implicit-integer-sign-change:*/new_allocator.h
  20 |  implicit-integer-sign-change:*/qarraydata.h
  21 |  implicit-integer-sign-change:crc32c/
  22 | +implicit-integer-sign-change:leveldb/

l0rinc commented at 9:14 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

Can we document in the commit message why this is needed and add fixes to https://github.com/bitcoin-core/leveldb-subtree?

andrewtoth commented at 2:04 AM on April 16, 2026:

Documented in commit message, but changing leveldb is out of scope here. There are already existing suppressions for leveldb.

l0rinc commented at 9:50 AM on April 16, 2026:

Adds implicit-integer-sign-change:leveldb/ to the test UBSAN suppressions.

The code already shows that the suppression was added, I don't think this is useful. What is still unclear is why a broad implicit-integer-sign-change:leveldb/ suppression is justified here, and whether it could be hiding a real issue in the harness or bundled LevelDB code.

andrewtoth commented at 7:30 AM on May 7, 2026:

Added more fine-grained suppression, with more descriptive documentation in the commit message.

in src/test/fuzz/dbwrapper.cpp:108 in b967e255e7

 103 | +/** Comparator matching LevelDB's lexicographic order on little-endian
 104 | + *  serialized uint16_t keys (low byte first, then high byte). */
 105 | +struct SerializedU16Cmp {
 106 | +    bool operator()(uint16_t a, uint16_t b) const
 107 | +    {
 108 | +        const auto swap{[](uint16_t v) -> uint16_t { return static_cast<uint16_t>((v << 8) | (v >> 8)); }};

l0rinc commented at 9:17 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

Can we reuse existing helpers and mention the LevelDB comparator we're cloning here?

/** Equivalent to `leveldb::BytewiseComparator()` on 2-byte little-endian
 *  serialized uint16_t keys, while keeping the oracle keyed by uint16_t. */
struct LevelDBBytewiseU16Cmp {
    bool operator()(uint16_t a, uint16_t b) const { return internal_bswap_16(a) < internal_bswap_16(b); }
};

in src/test/fuzz/dbwrapper.cpp:113 in b967e255e7

 108 | +        const auto swap{[](uint16_t v) -> uint16_t { return static_cast<uint16_t>((v << 8) | (v >> 8)); }};
 109 | +        return swap(a) < swap(b);
 110 | +    }
 111 | +};
 112 | +
 113 | +struct FailDeserialize {

l0rinc commented at 9:18 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

nit: I don't like the name, but we're calling Unserialize, not Deserialize

in src/test/fuzz/dbwrapper.cpp:133 in b967e255e7 outdated

 128 | +    FuzzedDataProvider provider{buffer.data(), buffer.size()};
 129 | +
 130 | +    const auto memenv{std::unique_ptr<leveldb::Env>{leveldb::NewMemEnv(leveldb::Env::Default())}};
 131 | +    DeterministicEnv det_env{memenv.get()};
 132 | +
 133 | +    const bool obfuscate{provider.ConsumeBool()};

l0rinc commented at 9:18 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

Non-obfuscated run should probably have a lot lower probability

    const bool obfuscate{provider.ConsumeIntegralInRange<uint8_t>(0, 10) != 0};

andrewtoth commented at 10:23 PM on April 15, 2026:

Fuzzing is coverage-guided, not random-sampling. There isn't really "probability" in the way you're thinking about it. The fuzzer discovers and maintains inputs that reach each unique path. ConsumeBool() gives a single-byte toggle that makes both paths trivially reachable. Changing to 1/11 or 1/8 just makes one path harder to reach without any benefit to coverage exploration. The fuzzer naturally allocates corpus entries based on which paths produce unique coverage.

l0rinc commented at 9:56 AM on April 16, 2026:

Changing to 1/11 or 1/8 just makes one path harder to reach without any benefit to coverage exploration

Was just a tuning suggestion to incentivize the fuzzer to spend more time with realistic scenarios (starting from more realistic state). Not a big deal either way, please resolve.

andrewtoth commented at 1:50 PM on April 16, 2026:

to incentivize the fuzzer to spend more time

The fuzzer is incentivized to spend more time based on new coverage. If the first byte does obfuscation and the second byte does non-obfuscation, it will explore both paths. If just 0 of the first 8 bytes does non-obfuscation, and any other value of 8 bytes does obfuscation, then the fuzzer will still explore both paths equally. It's just taking more time for the fuzzer to understand that any other value than 0 does not add new coverage.

l0rinc commented at 1:53 PM on April 16, 2026:

It's just taking more time for the fuzzer to understand that any other value than 0 does not add new coverage.

Exactly.

in src/test/fuzz/dbwrapper.cpp:135 in b967e255e7

 130 | +    const auto memenv{std::unique_ptr<leveldb::Env>{leveldb::NewMemEnv(leveldb::Env::Default())}};
 131 | +    DeterministicEnv det_env{memenv.get()};
 132 | +
 133 | +    const bool obfuscate{provider.ConsumeBool()};
 134 | +    const size_t cache_bytes{provider.ConsumeIntegralInRange<size_t>(64 << 10, 1_MiB)};
 135 | +    const size_t max_file_size{provider.ConsumeIntegralInRange<size_t>(1_MiB, 4_MiB)};

l0rinc commented at 9:20 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

How come we're not even exercising the actual value that we're using internally? The previous value was 2_MiB, the current one is DBWRAPPER_MAX_FILE_SIZE = 32_MiB - if you insist on changing this can we simply choose between these two values only?

    const size_t max_file_size{provider.ConsumeBool() ? 2_MiB : DBWRAPPER_MAX_FILE_SIZE};

or

    const size_t max_file_size{provider.PickValueInArray<std::size_t>({2_MiB, DBWRAPPER_MAX_FILE_SIZE})};

andrewtoth commented at 10:36 PM on April 15, 2026:

#34887 (review)

More than 4 MB will not be very useful, since we won't have large enough inputs to do any kind of compaction. Choosing between 2 values is artificially limiting the fuzzer.

l0rinc commented at 9:58 AM on April 16, 2026:

Choosing between 2 values is artificially limiting the fuzzer.

Yes, that's the point - we want it to spend more time on valid cases instead of made-up ones. Making it a range will just result in the actual values used (2_MiB and 32_MiB) in the code being chosen less often.

sedited commented at 9:52 AM on May 3, 2026:

Could also do both? Choose between 2, 32, and a ranged value.

andrewtoth commented at 7:31 AM on May 7, 2026:

We now do both - either a ranged value that contains 2, or 32.

in src/test/fuzz/dbwrapper.cpp:1 in b967e255e7 outdated

l0rinc commented at 9:24 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

This last commit is huge, maybe we could split it into a few self-contained commits that each add one layer of coverage and still build on their own? For example:

basic deterministic differential harness with the core CDBWrapper methods
deserialization-failure coverage (FailUnserialize / PickExistingKey)
key-ordering and range checks (LevelDBBytewiseU16Cmp / EstimateSize)
iterator and restart/reopen coverage.

That would make it much easier to review the basic harness separately from the trickier ordering/iterator/lifecycle pieces.

l0rinc commented at 9:59 AM on April 16, 2026:

This still stands. The issue is not correctness, it is reviewability: the basic deterministic harness is much easier to reason about separately from the ordering / iterator / reopen logic.

in src/dbwrapper.h:53 in e903203536 outdated

  47 | @@ -44,6 +48,9 @@ struct DBParams {
  48 |      bool obfuscate = false;
  49 |      //! Passed-through options.
  50 |      DBOptions options{};
  51 | +    //! If non-null, use this as the leveldb::Env instead of the default.
  52 | +    //! Caller retains ownership.
  53 | +    leveldb::Env* testing_env = nullptr;

l0rinc commented at 9:30 AM on April 14, 2026:

e903203 dbwrapper: accept optional testing leveldb::Env in DBParams:

Can we cover this with tests in dbwrapper_tests.cpp to explain the expected behavior, e.g. where we reuse an injected memenv across reopens and verify that wipe_data still works (which I think we're silently bypassing here).

andrewtoth commented at 2:06 AM on April 16, 2026:

This is only used for this fuzz harness, which covers it. I think having a unit test for a test only construct where it's already tested is unnecessary.

l0rinc commented at 10:07 AM on April 16, 2026:

This is only used for this fuzz harness, which covers it

The fuzzer depends on this behavior, but it is not a substitute for a small API-level test of the constructor semantics here, especially the testing_env precedence and wipe_data behavior.

l0rinc commented at 3:52 PM on May 9, 2026:

nit: I would still like to test this behavior explicitly since we're relying on it in the fuzzer. "testing it in the fuzzer" seems circular to me.

in src/test/fuzz/dbwrapper.cpp:148 in b967e255e7

 143 | +    }};
 144 | +
 145 | +    // Oracle: key → value length. Content is reconstructed via MakeValue().
 146 | +    std::map<uint16_t, uint16_t, SerializedU16Cmp> oracle;
 147 | +
 148 | +    LIMITED_WHILE(provider.ConsumeBool(), 10000)

l0rinc commented at 9:33 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

nit:

    LIMITED_WHILE(provider.ConsumeBool(), 10'000)

in src/test/fuzz/dbwrapper.cpp:156 in b967e255e7

 151 | +            provider,
 152 | +            // --- Mutations ---
 153 | +            [&] {
 154 | +                const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
 155 | +                const uint16_t len{provider.ConsumeIntegralInRange<uint16_t>(0, MAX_VALUE_SIZE)};
 156 | +                const bool sync{provider.ConsumeBool()};

l0rinc commented at 9:35 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

We rarely call fSync (the default is false for a reason), can we also make this more rare (to make sure we detect problems when it's not called and to make execution faster)? We could extract to a helper:

bool ConsumeSync(FuzzedDataProvider& provider)
{
    return provider.ConsumeIntegralInRange<uint8_t>(0, 7) == 0;
}

and use it as

dbw->Write(key, MakeValue(key, len), ConsumeSync(provider));

andrewtoth commented at 10:24 PM on April 15, 2026:

#34887 (review)

l0rinc commented at 10:15 AM on April 16, 2026:

Can be resolved

in src/test/fuzz/dbwrapper.cpp:191 in b967e255e7

 186 | +                        }
 187 | +                    }
 188 | +                }};
 189 | +                fill();
 190 | +                if (provider.ConsumeBool()) {
 191 | +                    (void)batch.ApproximateSize();

l0rinc commented at 9:37 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

nit: we could assert something here, e.g. that the value is always >= 12 (kHeader)

constexpr size_t WRITE_BATCH_HEADER{12}; // See `kHeader` in `db/write_batch.cc`
...
assert(batch.ApproximateSize() >= WRITE_BATCH_HEADER);
batch.Clear();
assert(batch.ApproximateSize() == WRITE_BATCH_HEADER);

l0rinc commented at 10:06 AM on April 16, 2026:

Please resolve.

in src/test/fuzz/dbwrapper.cpp:210 in b967e255e7

 205 | +                const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
 206 | +                std::vector<uint8_t> value;
 207 | +                const bool found{dbw.Read(key, value)};
 208 | +                const auto it{oracle.find(key)};
 209 | +                assert(found == (it != oracle.end()));
 210 | +                if (found) assert(value == MakeValue(key, it->second));

l0rinc commented at 9:46 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

Alternatively to make the found/not-found cases more symmetric, we could split based on whether we expect to find the value (like we did with obfiscation), something like:

if (const auto it{oracle.find(key)}; it != oracle.end()) {
    assert(found && value == MakeValue(key, it->second));
} else {
    assert(!found);
}

l0rinc commented at 10:06 AM on April 16, 2026:

Please resolve.

in src/test/fuzz/dbwrapper.cpp:244 in b967e255e7 outdated

 239 | +                        ++oracle_it;
 240 | +                    } else {
 241 | +                        assert(obfuscate);
 242 | +                        std::string key_str;
 243 | +                        assert(it->GetKey(key_str));
 244 | +                        assert(key_str == std::string("\000obfuscate_key", 14));

l0rinc commented at 9:49 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

could we use the constant here (we could expose it via dbwrapper_private)?

Thinking a bit more about this, this obfuscation is a bit intertwined with dbwrapper and it complicates the fuzzer since we can't assume the keys are always numbers and the code adds values to the database that aren't under our control, which hinders what we can assert. I don't think we can safely disable it (without introducing a heavy bias), but we can probably separate it from the rest of the iteration at least:

std::optional<uint16_t> GetIteratorUserKey(CDBIterator& it, bool obfuscate)
{
    struct RawKeyBytes {
        std::vector<std::byte> key;

        void Unserialize(DataStream& s)
        {
            key.resize(s.size());
            if (!key.empty()) s.read(std::span{key});
        }
    };

    // The harness only writes uint16_t keys, but an obfuscated DB also stores
    // the internal obfuscation metadata entry under a different key type.
    RawKeyBytes raw_key;
    assert(it.GetKey(raw_key));
    if (raw_key.key.size() != sizeof(uint16_t)) {
        assert(obfuscate);
        std::string key_str;
        assert(it.GetKey(key_str));
        assert(key_str == dbwrapper_private::GetObfuscationKey());
        return std::nullopt;
    }
    return ReadLE16(raw_key.key.data());
}

Which could help with the VerifyIterator to avoid the iteration falling back only when it disagrees with the oracle.

andrewtoth commented at 2:08 AM on April 16, 2026:

I don't think this is worth the extra code complexity, just to pull out this constant. This PR touches very minimal production code, and I'd prefer to keep it like that.

l0rinc commented at 10:06 AM on April 16, 2026:

This PR touches very minimal production code, and I'd prefer to keep it like that.

Sure, does that contradict my suggestion above? The point wasn’t necessarily just to reuse the constant. It was to separate the internal obfuscation-key path from normal user-key iteration so the iterator logic does not only recognize the metadata entry when it disagrees with the oracle.

andrewtoth commented at 7:39 AM on May 7, 2026:

I've updated to use encapsulate the iterator logic in a VerifyIterator helper function. I would need to update dbwrapper_private to expose the key though, so I opted to keep the constant in this file.

in src/test/fuzz/dbwrapper.cpp:261 in b967e255e7

 256 | +                }
 257 | +            },
 258 | +            [&] {
 259 | +                uint16_t k1{provider.ConsumeIntegral<uint16_t>()};
 260 | +                uint16_t k2{provider.ConsumeIntegral<uint16_t>()};
 261 | +                if (k1 > k2) std::swap(k1, k2);

l0rinc commented at 9:53 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

Shouldn't we use the comparator here? And use std::minmax to avoid the swap, something like:

const auto [k1, k2]{std::minmax({provider.ConsumeIntegral<uint16_t>(), provider.ConsumeIntegral<uint16_t>()}, LevelDBBytewiseU16Cmp{})};
const size_t estimate_size{dbw.EstimateSize(k1, k2)};
assert((estimate_size == 0) == (k1 == k2));

but that seems to fail sometimes for me locally, not sure how EstimateSize can be 0 otherwise, but maybe if (k1 == k2) assert(estimate_size == 0); is a better assertion: https://github.com/bitcoin/bitcoin/blob/20a6babfa9a66f5432ef19c6c433b4357560f853/src/leveldb/include/leveldb/db.h#L133

in src/test/fuzz/dbwrapper.cpp:265 in b967e255e7

 260 | +                uint16_t k2{provider.ConsumeIntegral<uint16_t>()};
 261 | +                if (k1 > k2) std::swap(k1, k2);
 262 | +                (void)dbw.EstimateSize(k1, k2);
 263 | +            },
 264 | +            [&] {
 265 | +                (void)dbw.DynamicMemoryUsage();

l0rinc commented at 9:53 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

this returns 0 when it fails - can we guard against that? I don't have a concrete suggestion here - except modifying DynamicMemoryUsage to return an std::optional but that would be outside the scope of this change.

l0rinc commented at 10:09 AM on April 16, 2026:

Can be resolved.

in src/test/fuzz/dbwrapper.cpp:138 in b967e255e7

 133 | +    const bool obfuscate{provider.ConsumeBool()};
 134 | +    const size_t cache_bytes{provider.ConsumeIntegralInRange<size_t>(64 << 10, 1_MiB)};
 135 | +    const size_t max_file_size{provider.ConsumeIntegralInRange<size_t>(1_MiB, 4_MiB)};
 136 | +
 137 | +    CDBWrapper dbw{{
 138 | +        .path = "",

l0rinc commented at 10:21 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

        .path = "dbwrapper_fuzz",

in src/test/fuzz/dbwrapper.cpp:115 in b967e255e7

 110 | +    }
 111 | +};
 112 | +
 113 | +struct FailDeserialize {
 114 | +    template <typename Stream>
 115 | +    void Unserialize(Stream&) { throw std::runtime_error{"intentional"}; }

l0rinc commented at 10:30 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

Alternatively, since ios_base::failure would be the natural failure mode for streams:

    void Unserialize(Stream&) { throw std::ios_base::failure{"always fail"}; }

in src/test/fuzz/dbwrapper.cpp:120 in b967e255e7

 115 | +    void Unserialize(Stream&) { throw std::runtime_error{"intentional"}; }
 116 | +};
 117 | +
 118 | +void initialize_dbwrapper()
 119 | +{
 120 | +    LogInstance().DisableLogging();

l0rinc commented at 10:31 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

When are we using this and when static const auto testing_setup{MakeNoLogFileContext<>()};?

Seems the context is more general and we can probably inline it:

FUZZ_TARGET(dbwrapper, .init = [] { static auto setup{MakeNoLogFileContext<>()}; })

in src/test/fuzz/dbwrapper.cpp:127 in b967e255e7 outdated

 122 | +
 123 | +} // namespace
 124 | +
 125 | +FUZZ_TARGET(dbwrapper, .init = initialize_dbwrapper)
 126 | +{
 127 | +    SeedRandomStateForTest(SeedRand::ZEROS);

l0rinc commented at 10:35 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

This seems to be needed even though the default is already https://github.com/bitcoin/bitcoin/blob/261d229455e30fee874a5215a618adf6210cbe3a/src/test/fuzz/fuzz.cpp#L100 - whatever...

l0rinc commented at 10:09 AM on April 16, 2026:

Resolve please.

in src/test/fuzz/dbwrapper.cpp:222 in b967e255e7

 217 | +                const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
 218 | +                FailDeserialize wrong_type;
 219 | +                assert(!dbw.Read(key, wrong_type));
 220 | +            },
 221 | +            [&] {
 222 | +                const std::unique_ptr<CDBIterator> it{dbw.NewIterator()};

l0rinc commented at 10:56 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

This seems like a very important invariant that we could extract to a helper and use it more often, e.g.:

            [&] {
                const auto seek_key{provider.ConsumeBool() ? std::nullopt : std::optional{provider.ConsumeIntegral<uint16_t>()}};
                VerifyIterator(dbw, oracle, obfuscate, seek_key);
            },

and at the very end:

    // The fuzzer may end without choosing another iterator pass after the last
    // deferred compaction step, so drain and verify one final time here.
    det_env.DrainWork();
    VerifyIterator(dbw, oracle, obfuscate);

andrewtoth commented at 10:27 PM on April 15, 2026:

#34887 (review)

The fuzzer may or may not choose another iterator pass based on coverage. We don't need to add this code at the end if there is already a way for the fuzzer to reach it somewhere else.

l0rinc commented at 10:10 AM on April 16, 2026:

Yes, the fuzzer may reach that path elsewhere, but making it a final invariant is still useful: the state should be valid at the end of every run regardless of which branches were taken.

andrewtoth commented at 1:04 PM on April 16, 2026:

making it a final invariant is still useful

Not really. If there is deterministic input that causes the state to be invalid at the end of the run, then the fuzzer will be able to reach that invalid state again and trivially add the branch to reach the iterator check above.

l0rinc commented at 4:20 PM on April 16, 2026:

Ok, seems we're talking past each other here, will unsubscribe and leave it to others to review.

andrewtoth commented at 7:41 AM on May 7, 2026:

Took this suggestion.

in src/test/fuzz/dbwrapper.cpp:134 in b967e255e7

 129 | +
 130 | +    const auto memenv{std::unique_ptr<leveldb::Env>{leveldb::NewMemEnv(leveldb::Env::Default())}};
 131 | +    DeterministicEnv det_env{memenv.get()};
 132 | +
 133 | +    const bool obfuscate{provider.ConsumeBool()};
 134 | +    const size_t cache_bytes{provider.ConsumeIntegralInRange<size_t>(64 << 10, 1_MiB)};

l0rinc commented at 11:09 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

These are often bigger in reality, can we add a few meaningful values here instead?

const size_t cache_bytes{provider.PickValueInArray<std::size_t>({64_KiB, 1_MiB, 2_MiB})};

or if we want to check arbitrary values to avoid introducing any bias:

 const size_t cache_bytes{provider.ConsumeIntegralInRange<size_t>(64_KiB, 4_MiB)};

andrewtoth commented at 10:38 PM on April 15, 2026:

We want to trigger memtable flushes. Larger values will not allow us to do that with our input size, so are not interesting. Similar to #34887 (review).

l0rinc commented at 10:11 AM on April 16, 2026:

so are not interesting

I get the goal of keeping values small enough to trigger flushes cheaply, but I still think production-relevant values should be sampled some of the time, especially the real 2_MiB block-tree setting.

in src/test/fuzz/dbwrapper.cpp:98 in b967e255e7

  93 | + *  controls only the key and a 16-bit length; the actual content is
  94 | + *  derived from a fixed pattern and doubled so that fuzz bytes aren't
  95 | + *  wasted on bulk data. */
  96 | +std::vector<uint8_t> MakeValue(uint16_t key, uint16_t len)
  97 | +{
  98 | +    std::vector<uint8_t> v(static_cast<size_t>(len) * 2);

l0rinc commented at 11:29 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

Not sure I understand why MakeValue is always even - and if we're multiplying MAX_VALUE_SIZE by 2 it's not the actual max size, right?

andrewtoth commented at 2:09 AM on April 16, 2026:

Made it 3 instead of 2 so it can be odd. Also updated the comment to clarify MAX_VALUE_SIZE.

l0rinc commented at 10:03 AM on April 16, 2026:

Smart, this fixes the "always even" concern, but it replaces it with a stronger size bias (e.g. no prime numbers, powers of 2, etc). What's the reason for multiplying at all? Why not just

    std::vector<uint8_t> v(len);

? Please unresolve.

l0rinc commented at 4:02 PM on May 9, 2026:

This still stands: multiplying len by 3 means the harness only exercises value sizes that are its multiples.

size is tripled so that fuzz bytes aren't wasted on bulk data

I'm not sure what this means.

Also, we're seeding iota with the lower bits only, maybe we could fold the higher bits into it like:

std::vector<uint8_t> MakeValue(uint16_t key, uint16_t len)
{
    std::vector<uint8_t> v(len);
    std::iota(v.begin(), v.end(), uint8_t(key ^ (key >> 8)));
    return v;
}

(note, std::ranges::iota seems to be a C++23 feature only, so we have to keep the verbose version)

andrewtoth commented at 3:22 PM on May 20, 2026:

We do both now - we choose a multiplier and a length. Also, took the fold higher bits change.

in src/test/fuzz/dbwrapper.cpp:219 in b967e255e7

 214 | +                assert(dbw.Exists(key) == oracle.contains(key));
 215 | +            },
 216 | +            [&] {
 217 | +                const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
 218 | +                FailDeserialize wrong_type;
 219 | +                assert(!dbw.Read(key, wrong_type));

l0rinc commented at 11:40 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

This would already be false without the failure since the key is very unlikely to be present, maybe we could extend it to:

const uint16_t key{!oracle.empty() && provider.ConsumeBool() ? PickExistingKey(provider, oracle) : provider.ConsumeIntegral<uint16_t>()};

andrewtoth commented at 10:28 PM on April 15, 2026:

#34887 (review)

l0rinc commented at 10:13 AM on April 16, 2026:

The point here is to exercise the deserialization-failure path, not the much more common key-not-found path, so I still think this branch should bias toward existing keys when the oracle is non-empty, see:

andrewtoth commented at 1:01 PM on April 16, 2026:

Using words like "unlikely" and "bias" don't make sense when discussing a fuzz harness. The fuzzer will choose keys that are present or not, based on coverage data. It does not choose inputs randomly. We don't want to introduce bias, we want to make it easy for the fuzzer to find all paths.

l0rinc commented at 1:06 PM on April 16, 2026:

Using words like "unlikely" and "bias" don't make sense when discussing a fuzz harness.

We don't want to introduce bias, we want to make it easy for the fuzzer to find all paths.

We're always introducing bias; I'd rather bias it towards the more realistic uses instead of uniformity. But this comment is not about that, it's about this branch being incorrect since it exercises the not-found path, not the deserialization-error path.

andrewtoth commented at 1:14 PM on April 16, 2026:

This is about bias. You are suggesting that it is "unlikely" the value will be present in this codepath. That is not how fuzzers work. The fuzzer will choose a key that is present, and then exercise the deserialization-error path.

l0rinc commented at 4:14 PM on April 16, 2026:

You are suggesting that it is "unlikely" the value will be present in this codepath

I instrumented the FailUnserialize branch to distinguish Read() returning false because the key is absent vs because deserialization throws. Running the dbwrapper fuzzer for 766,281 executions showed 46,127 missing-key hits and 3,041 actual deserialization-failure hits (and 4144181 runs show dbwrapper FailUnserialize stats: missing_key=573326 deserialize_failure=45901). So while libFuzzer does discover the intended path, about 94% of the observed hits to this branch are still the missing-key case rather than the deserialization-failure case. This is what I meant by unlikely.

in src/test/fuzz/dbwrapper.cpp:195 in b967e255e7 outdated

 190 | +                if (provider.ConsumeBool()) {
 191 | +                    (void)batch.ApproximateSize();
 192 | +                    batch.Clear();
 193 | +                    batch_writes.clear();
 194 | +                    batch_erases.clear();
 195 | +                    fill();

l0rinc commented at 11:46 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

after batch.Clear(), the harness could avoid always refilling the batch:

                    if (provider.ConsumeBool()) fill();

andrewtoth commented at 10:28 PM on April 15, 2026:

This is covered already by the fill() looping zero times.

l0rinc commented at 10:02 AM on April 16, 2026:

Good point!

in src/test/fuzz/dbwrapper.cpp:137 in b967e255e7

 132 | +
 133 | +    const bool obfuscate{provider.ConsumeBool()};
 134 | +    const size_t cache_bytes{provider.ConsumeIntegralInRange<size_t>(64 << 10, 1_MiB)};
 135 | +    const size_t max_file_size{provider.ConsumeIntegralInRange<size_t>(1_MiB, 4_MiB)};
 136 | +
 137 | +    CDBWrapper dbw{{

l0rinc commented at 11:53 AM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

I would expect some bugs occurring when we randomly close&reopen this - could we add such a branch?

    const auto make_db{[&] {
        return std::make_unique<CDBWrapper>(DBParams{
            .path = "dbwrapper_fuzz",
            .cache_bytes = cache_bytes,
            .obfuscate = obfuscate,
            .testing_env = &det_env,
            .max_file_size = max_file_size,
        });
    }};
    std::unique_ptr<CDBWrapper> dbw{make_db()};

            [&] {
                if (provider.ConsumeBool()) det_env.DrainWork();
                dbw.reset();
                dbw = make_db({.force_compact = provider.ConsumeBool()});
                VerifyIterator(*dbw, oracle, obfuscate);
            },

(we could even exercise force_compact this way)

andrewtoth commented at 2:10 AM on April 16, 2026:

Added reopening the db, but we can't exercise force_compact since it needs a background compaction. It deadlocks on DeterminiticEnv. If we introduced a background thread it wouldn't really be Deterministic anymore.

l0rinc commented at 10:28 AM on April 16, 2026:

we can't exercise force_compact since it needs a background compaction

Sounds like a something we should investigate, without a background compaction we're missing out on a fundamental behavior difference that we're not checking at all. Reopen with force_compact is one of the more interesting remaining constructor paths. If deterministic compaction makes it impossible here, I think that should at least be called out explicitly.

in src/test/fuzz/dbwrapper.cpp:229 in b967e255e7 outdated

 224 | +                if (provider.ConsumeBool()) {
 225 | +                    const uint16_t seek_key{provider.ConsumeIntegral<uint16_t>()};
 226 | +                    it->Seek(seek_key);
 227 | +                    oracle_it = oracle.lower_bound(seek_key);
 228 | +                } else {
 229 | +                    it->SeekToFirst();

l0rinc commented at 12:24 PM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

We know the lowest key, we could unify the two branches to:

    it->Seek(seek_key.value_or(uint16_t{0}));

(IsEmpty already covers SeekToFirst)

andrewtoth commented at 10:29 PM on April 15, 2026:

This is specifically to cover SeekToFirst.

l0rinc commented at 6:38 AM on April 16, 2026:

As mentioned, IsEmpty() already covers SeekToFirst(), so I still think unifying this to Seek(seek_key.value_or(uint16_t{0})) would be fine.

in src/test/fuzz/dbwrapper.cpp:154 in b967e255e7

 149 | +    {
 150 | +        CallOneOf(
 151 | +            provider,
 152 | +            // --- Mutations ---
 153 | +            [&] {
 154 | +                const uint16_t key{provider.ConsumeIntegral<uint16_t>()};

l0rinc commented at 12:39 PM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

This is repeated very often - we could add a dedicated helper to hide the details and make this inlineable:

uint16_t ConsumeKey(FuzzedDataProvider& provider) { return provider.ConsumeIntegral<uint16_t>(); }
uint16_t ConsumeValueLength(FuzzedDataProvider& provider) { return provider.ConsumeIntegralInRange<uint16_t>(0, MAX_VALUE_SIZE); }
bool ConsumeSync(FuzzedDataProvider& provider) { return provider.ConsumeIntegralInRange<uint8_t>(0, 7) == 0; }
uint16_t PickExistingKey(FuzzedDataProvider& provider, const std::map<uint16_t, uint16_t, LevelDBBytewiseU16Cmp>& oracle)
{
    assert(!oracle.empty());
    auto it{oracle.begin()};
    std::advance(it, provider.ConsumeIntegralInRange<size_t>(0, oracle.size() - 1));
    return it->first;
}

in src/test/fuzz/dbwrapper.cpp:176 in b967e255e7

 171 | +                std::set<uint16_t> batch_erases;
 172 | +                const auto fill{[&] {
 173 | +                    LIMITED_WHILE(provider.ConsumeBool(), 20)
 174 | +                    {
 175 | +                        if (provider.ConsumeBool()) {
 176 | +                            const uint16_t key{provider.ConsumeIntegral<uint16_t>()};

l0rinc commented at 12:45 PM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

key can be deduplicated in this branch

in src/test/fuzz/dbwrapper.cpp:256 in b967e255e7

 251 | +            [&] {
 252 | +                if (!obfuscate) {
 253 | +                    assert(dbw.IsEmpty() == oracle.empty());
 254 | +                } else {
 255 | +                    assert(!dbw.IsEmpty());
 256 | +                }

l0rinc commented at 12:50 PM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

We could simplify this to a single assert:

                assert(dbw->IsEmpty() == (oracle.empty() && !obfuscate));

in src/test/fuzz/dbwrapper.cpp:49 in b967e255e7

  44 | +    struct Work {
  45 | +        void (*function)(void*);
  46 | +        void* arg;
  47 | +    };
  48 | +
  49 | +    mutable Mutex m_mutex;

l0rinc commented at 12:53 PM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

We could avoid the mutable here if we make bool HasPending() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex) non-const.

in src/test/fuzz/dbwrapper.cpp:45 in b967e255e7

  40 | + * all pending work before every write to avoid a single-threaded deadlock.
  41 | + */
  42 | +class DeterministicEnv final : public leveldb::EnvWrapper
  43 | +{
  44 | +    struct Work {
  45 | +        void (*function)(void*);

l0rinc commented at 1:12 PM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

We could dedup and name this type:

using WorkFunction = void (*)(void*);
..
struct Work {
    WorkFunction function;
    void* arg;
};
...
void Schedule(WorkFunction function, void* arg) override EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)

in src/test/fuzz/dbwrapper.cpp:65 in b967e255e7

  60 | +
  61 | +    bool HasPending() const EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
  62 | +    {
  63 | +        LOCK(m_mutex);
  64 | +        return !m_queue.empty();
  65 | +    }

l0rinc commented at 1:13 PM on April 14, 2026:

b967e25 test: add fuzz harness for CDBWrapper:

WITH_LOCK would make this a one-liner:

    bool HasPending() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex) { return WITH_LOCK(m_mutex, return !m_queue.empty()); }

but if we adjust RunOne to return a bool, we can even get rid of it:

/** Execute one pending background task. The task may schedule a
 *  successor which is left pending for a later call. */
bool RunOne() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
{
    Work work{nullptr, nullptr};
    {
        LOCK(m_mutex);
        if (m_queue.empty()) return false;
        work = m_queue.front();
        m_queue.pop_front();
    }
    work.function(work.arg);
    return true;
}

/** Execute pending background tasks until none remain. */
void DrainWork() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex) { while (RunOne()) {} }

l0rinc changes_requested

l0rinc commented at 1:34 PM on April 14, 2026: contributor

Concept ACK

Thanks for tackling this, a dedicated CDBWrapper fuzzer helps with increasing our confidence as we iterate towards #31132.

I have spent more time with this review than anticipated (again), the main themes of my suggestions are:

make the wrapper/env behavior more explicit and covered with focused unit tests, especially injected testing_env precedence and wipe_data
keep the fuzz inputs closer to meaningful real values and cheaper execution paths where possible
split the large fuzz commit into a few smaller buildable steps so the basic harness can be reviewed separately from the ordering / iterator / reopen logic
simplify and reuse helpers in the fuzzer where we can, and make the iterator / obfuscation handling more explicit and robust

<details><summary>Local changes during review</summary>

diff --git a/src/dbwrapper.cpp b/src/dbwrapper.cpp
index f1884c6cf4..3ca8fce9d7 100644
--- a/src/dbwrapper.cpp
+++ b/src/dbwrapper.cpp
@@ -229,13 +229,16 @@ CDBWrapper::CDBWrapper(const DBParams& params)
     } else if (params.memory_only) {
         DBContext().penv = leveldb::NewMemEnv(leveldb::Env::Default());
         DBContext().options.env = DBContext().penv;
-    } else {
+    }
+    if (!params.memory_only) {
         if (params.wipe_data) {
             LogInfo("Wiping LevelDB in %s", fs::PathToString(params.path));
             leveldb::Status result = leveldb::DestroyDB(fs::PathToString(params.path), DBContext().options);
             HandleError(result);
         }
-        TryCreateDirectories(params.path);
+        if (!params.testing_env) {
+            TryCreateDirectories(params.path);
+        }
         LogInfo("Opening LevelDB in %s", fs::PathToString(params.path));
     }
     // PathToString() return value is safe to pass to leveldb open function,
@@ -392,4 +395,9 @@ const Obfuscation& GetObfuscation(const CDBWrapper& w)
     return w.m_obfuscation;
 }
 
+const std::string& GetObfuscationKey()
+{
+    return CDBWrapper::OBFUSCATION_KEY;
+}
+
 } // namespace dbwrapper_private
diff --git a/src/dbwrapper.h b/src/dbwrapper.h
index 3e81a27e37..4b24fefb94 100644
--- a/src/dbwrapper.h
+++ b/src/dbwrapper.h
@@ -73,6 +73,7 @@ namespace dbwrapper_private {
  * specific database.
  */
 const Obfuscation& GetObfuscation(const CDBWrapper&);
+const std::string& GetObfuscationKey();
 }; // namespace dbwrapper_private
 
 bool DestroyDB(const std::string& path_str);
@@ -188,6 +189,7 @@ struct LevelDBContext;
 class CDBWrapper
 {
     friend const Obfuscation& dbwrapper_private::GetObfuscation(const CDBWrapper&);
+    friend const std::string& dbwrapper_private::GetObfuscationKey();
 private:
     //! holds all leveldb-specific fields of this class
     std::unique_ptr<LevelDBContext> m_db_context;
diff --git a/src/test/CMakeLists.txt b/src/test/CMakeLists.txt
index 1365d6c147..5e61b40ae6 100644
--- a/src/test/CMakeLists.txt
+++ b/src/test/CMakeLists.txt
@@ -170,6 +170,7 @@ target_link_libraries(test_bitcoin
   libevent::extra
   $<TARGET_NAME_IF_EXISTS:USDT::headers>
 )
+target_include_directories(test_bitcoin PRIVATE ${PROJECT_SOURCE_DIR}/src/leveldb/include)
 
 if(ENABLE_WALLET)
   add_subdirectory(${PROJECT_SOURCE_DIR}/src/wallet/test wallet)
diff --git a/src/test/dbwrapper_tests.cpp b/src/test/dbwrapper_tests.cpp
index 3896ea64da..dfd1f73c78 100644
--- a/src/test/dbwrapper_tests.cpp
+++ b/src/test/dbwrapper_tests.cpp
@@ -3,6 +3,8 @@
 // file COPYING or http://www.opensource.org/licenses/mit-license.php.
 
 #include <dbwrapper.h>
+#include <leveldb/env.h>
+#include <leveldb/helpers/memenv/memenv.h>
 #include <test/util/common.h>
 #include <test/util/random.h>
 #include <test/util/setup_common.h>
@@ -267,6 +269,32 @@ BOOST_AUTO_TEST_CASE(existing_data_no_obfuscate)
     BOOST_CHECK_EQUAL(res3.ToString(), in2.ToString());
 }
 
+BOOST_AUTO_TEST_CASE(injected_environment)
+{
+    std::unique_ptr<leveldb::Env> env{leveldb::NewMemEnv(leveldb::Env::Default())};
+    const fs::path ph{m_args.GetDataDirBase() / "dbwrapper_injected_environment"};
+    const uint8_t key{'k'};
+    const uint256 value{m_rng.rand256()};
+    uint256 read_value;
+
+    {
+        CDBWrapper dbw({.path = ph, .cache_bytes = 1_MiB, .wipe_data = true, .testing_env = env.get()});
+        dbw.Write(key, value);
+    }
+
+    {
+        CDBWrapper dbw({.path = ph, .cache_bytes = 1_MiB, .memory_only = true, .testing_env = env.get()});
+        BOOST_CHECK(dbw.Read(key, read_value));
+        BOOST_CHECK_EQUAL(read_value, value);
+    }
+
+    {
+        CDBWrapper dbw({.path = ph, .cache_bytes = 1_MiB, .wipe_data = true, .testing_env = env.get()});
+        BOOST_CHECK(dbw.IsEmpty());
+        BOOST_CHECK(!dbw.Read(key, read_value));
+    }
+}
+
 // Ensure that we start obfuscating during a reindex.
 BOOST_AUTO_TEST_CASE(existing_data_reindex)
 {
diff --git a/src/test/fuzz/dbwrapper.cpp b/src/test/fuzz/dbwrapper.cpp
index 433ada799c..5fe2ad69c9 100644
--- a/src/test/fuzz/dbwrapper.cpp
+++ b/src/test/fuzz/dbwrapper.cpp
@@ -3,12 +3,13 @@
 // file COPYING or http://www.opensource.org/licenses/mit-license.php.
 
 #include <dbwrapper.h>
-#include <logging.h>
+#include <compat/byteswap.h>
+#include <crypto/common.h>
 #include <sync.h>
 #include <test/fuzz/FuzzedDataProvider.h>
 #include <test/fuzz/fuzz.h>
 #include <test/fuzz/util.h>
-#include <test/util/random.h>
+#include <test/util/setup_common.h>
 #include <util/byte_units.h>
 
 #include <leveldb/env.h>
@@ -21,7 +22,9 @@
 #include <map>
 #include <memory>
 #include <numeric>
+#include <optional>
 #include <set>
+#include <string>
 #include <vector>
 
 namespace {
@@ -41,53 +44,46 @@ namespace {
  */
 class DeterministicEnv final : public leveldb::EnvWrapper
 {
+    using WorkFunction = void (*)(void*);
+
     struct Work {
-        void (*function)(void*);
+        WorkFunction function;
         void* arg;
     };
 
-    mutable Mutex m_mutex;
+    Mutex m_mutex;
     std::deque<Work> m_queue GUARDED_BY(m_mutex);
 
 public:
     explicit DeterministicEnv(leveldb::Env* base) : EnvWrapper(base) {}
 
-    void Schedule(void (*function)(void* arg), void* arg) override EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
+    void Schedule(WorkFunction function, void* arg) override EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
     {
         LOCK(m_mutex);
         m_queue.push_back({function, arg});
     }
 
-    bool HasPending() const EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
-    {
-        LOCK(m_mutex);
-        return !m_queue.empty();
-    }
-
     /** Execute one pending background task. The task may schedule a
      *  successor which is left pending for a later call. */
-    void RunOne() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
+    bool RunOne() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
     {
         Work work{nullptr, nullptr};
         {
             LOCK(m_mutex);
-            if (m_queue.empty()) return;
+            if (m_queue.empty()) return false;
             work = m_queue.front();
             m_queue.pop_front();
         }
         work.function(work.arg);
+        return true;
     }
 
     /** Execute pending background tasks until none remain. */
-    void DrainWork() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
-    {
-        while (HasPending()) {
-            RunOne();
-        }
-    }
+    void DrainWork() EXCLUSIVE_LOCKS_REQUIRED(!m_mutex) { while (RunOne()) {} }
 };
 
 constexpr size_t MAX_VALUE_SIZE{4096};
+constexpr size_t WRITE_BATCH_HEADER{12}; // See `kHeader` in `db/write_batch.cc`
 
 /** Generate a deterministic value from key and length. The fuzz input
  *  controls only the key and a 16-bit length; the actual content is
@@ -100,86 +96,140 @@ std::vector<uint8_t> MakeValue(uint16_t key, uint16_t len)
     return v;
 }
 
-/** Comparator matching LevelDB's lexicographic order on little-endian
- *  serialized uint16_t keys (low byte first, then high byte). */
-struct SerializedU16Cmp {
-    bool operator()(uint16_t a, uint16_t b) const
-    {
-        const auto swap{[](uint16_t v) -> uint16_t { return static_cast<uint16_t>((v << 8) | (v >> 8)); }};
-        return swap(a) < swap(b);
-    }
+/** Equivalent to `leveldb::BytewiseComparator()` on 2-byte little-endian
+ *  serialized uint16_t keys, while keeping the oracle keyed by uint16_t. */
+struct LevelDBBytewiseU16Cmp {
+    bool operator()(uint16_t a, uint16_t b) const { return internal_bswap_16(a) < internal_bswap_16(b); }
 };
 
-struct FailDeserialize {
+uint16_t ConsumeKey(FuzzedDataProvider& provider) { return provider.ConsumeIntegral<uint16_t>(); }
+uint16_t ConsumeValueLength(FuzzedDataProvider& provider) { return provider.ConsumeIntegralInRange<uint16_t>(0, MAX_VALUE_SIZE); }
+bool ConsumeSync(FuzzedDataProvider& provider) { return provider.ConsumeIntegralInRange<uint8_t>(0, 7) == 0; }
+uint16_t PickExistingKey(FuzzedDataProvider& provider, const std::map<uint16_t, uint16_t, LevelDBBytewiseU16Cmp>& oracle)
+{
+    assert(!oracle.empty());
+    auto it{oracle.begin()};
+    std::advance(it, provider.ConsumeIntegralInRange<size_t>(0, oracle.size() - 1));
+    return it->first;
+}
+
+struct FailUnserialize {
     template <typename Stream>
-    void Unserialize(Stream&) { throw std::runtime_error{"intentional"}; }
+    void Unserialize(Stream&) { throw std::ios_base::failure{"always fail"}; }
 };
 
-void initialize_dbwrapper()
+std::optional<uint16_t> GetIteratorUserKey(CDBIterator& it, bool obfuscate)
+{
+    struct RawKeyBytes {
+        std::vector<std::byte> key;
+
+        void Unserialize(DataStream& s)
+        {
+            key.resize(s.size());
+            if (!key.empty()) s.read(std::span{key});
+        }
+    };
+
+    // The harness only writes uint16_t keys, but an obfuscated DB also stores
+    // the internal obfuscation metadata entry under a different key type.
+    RawKeyBytes raw_key;
+    assert(it.GetKey(raw_key));
+    if (raw_key.key.size() != sizeof(uint16_t)) {
+        assert(obfuscate);
+        std::string key_str;
+        assert(it.GetKey(key_str));
+        assert(key_str == dbwrapper_private::GetObfuscationKey());
+        return std::nullopt;
+    }
+    return ReadLE16(raw_key.key.data());
+}
+
+/** Compare the DB iterator against the oracle, allowing only the obfuscation
+ *  metadata entry as an extra DB record when obfuscation is enabled. */
+void VerifyIterator(CDBWrapper& dbw, const std::map<uint16_t, uint16_t, LevelDBBytewiseU16Cmp>& oracle, bool obfuscate, std::optional<uint16_t> seek_key = std::nullopt)
 {
-    LogInstance().DisableLogging();
+    const std::unique_ptr<CDBIterator> it{dbw.NewIterator()};
+    auto oracle_it{seek_key ? oracle.lower_bound(*seek_key) : oracle.begin()};
+    size_t entry_count{0};
+    it->Seek(seek_key.value_or(uint16_t{0}));
+    for (; it->Valid(); it->Next()) {
+        if (const auto db_key{GetIteratorUserKey(*it, obfuscate)}) {
+            assert(oracle_it != oracle.end());
+            assert(*db_key == oracle_it->first);
+
+            std::vector<uint8_t> db_value;
+            assert(it->GetValue(db_value));
+            assert(db_value == MakeValue(*db_key, oracle_it->second));
+            ++oracle_it;
+        }
+        ++entry_count;
+    }
+    assert(oracle_it == oracle.end());
+    if (!seek_key) assert(entry_count == oracle.size() + obfuscate);
 }
 
 } // namespace
 
-FUZZ_TARGET(dbwrapper, .init = initialize_dbwrapper)
+FUZZ_TARGET(dbwrapper, .init = [] { static auto setup{MakeNoLogFileContext<>()}; })
 {
     SeedRandomStateForTest(SeedRand::ZEROS);
     FuzzedDataProvider provider{buffer.data(), buffer.size()};
 
     const auto memenv{std::unique_ptr<leveldb::Env>{leveldb::NewMemEnv(leveldb::Env::Default())}};
     DeterministicEnv det_env{memenv.get()};
+    leveldb::EnvWrapper* const testing_env{&det_env};
 
-    const bool obfuscate{provider.ConsumeBool()};
-    const size_t cache_bytes{provider.ConsumeIntegralInRange<size_t>(64 << 10, 1_MiB)};
-    const size_t max_file_size{provider.ConsumeIntegralInRange<size_t>(1_MiB, 4_MiB)};
+    const bool obfuscate{provider.ConsumeIntegralInRange<uint8_t>(0, 10) != 0};
+    const size_t cache_bytes{provider.ConsumeIntegralInRange<size_t>(64 << 10, 2_MiB)};
+    const size_t max_file_size{provider.PickValueInArray<std::size_t>({2_MiB, DBWRAPPER_MAX_FILE_SIZE})};
 
-    CDBWrapper dbw{{
-        .path = "",
-        .cache_bytes = cache_bytes,
-        .obfuscate = obfuscate,
-        .testing_env = &det_env,
-        .max_file_size = max_file_size,
+    const auto make_db{[&](DBOptions options = {}) {
+        return std::make_unique<CDBWrapper>(DBParams{
+            .path = "dbwrapper_fuzz",
+            .cache_bytes = cache_bytes,
+            .obfuscate = obfuscate,
+            .options = options,
+            .testing_env = testing_env,
+            .max_file_size = max_file_size,
+        });
     }};
+    std::unique_ptr<CDBWrapper> dbw{make_db()};
 
     // Oracle: key → value length. Content is reconstructed via MakeValue().
-    std::map<uint16_t, uint16_t, SerializedU16Cmp> oracle;
+    std::map<uint16_t, uint16_t, LevelDBBytewiseU16Cmp> oracle;
 
-    LIMITED_WHILE(provider.ConsumeBool(), 10000)
+    LIMITED_WHILE(provider.ConsumeBool(), 10'000)
     {
         CallOneOf(
             provider,
             // --- Mutations ---
             [&] {
-                const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
-                const uint16_t len{provider.ConsumeIntegralInRange<uint16_t>(0, MAX_VALUE_SIZE)};
-                const bool sync{provider.ConsumeBool()};
+                const auto key{ConsumeKey(provider)};
+                const auto len{ConsumeValueLength(provider)};
                 det_env.DrainWork();
-                dbw.Write(key, MakeValue(key, len), sync);
+                dbw->Write(key, MakeValue(key, len), ConsumeSync(provider));
                 oracle[key] = len;
             },
             [&] {
-                const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
-                const bool sync{provider.ConsumeBool()};
+                const auto key{ConsumeKey(provider)};
                 det_env.DrainWork();
-                dbw.Erase(key, sync);
+                dbw->Erase(key, ConsumeSync(provider));
                 oracle.erase(key);
             },
             [&] {
-                CDBBatch batch{dbw};
+                CDBBatch batch{*dbw};
                 std::map<uint16_t, uint16_t> batch_writes;
                 std::set<uint16_t> batch_erases;
                 const auto fill{[&] {
                     LIMITED_WHILE(provider.ConsumeBool(), 20)
                     {
+                        const auto key{ConsumeKey(provider)};
                         if (provider.ConsumeBool()) {
-                            const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
-                            const uint16_t len{provider.ConsumeIntegralInRange<uint16_t>(0, MAX_VALUE_SIZE)};
+                            const auto len{ConsumeValueLength(provider)};
                             batch.Write(key, MakeValue(key, len));
                             batch_writes[key] = len;
                             batch_erases.erase(key);
                         } else {
-                            const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
                             batch.Erase(key);
                             batch_erases.insert(key);
                             batch_writes.erase(key);
@@ -188,81 +238,59 @@ FUZZ_TARGET(dbwrapper, .init = initialize_dbwrapper)
                 }};
                 fill();
                 if (provider.ConsumeBool()) {
-                    (void)batch.ApproximateSize();
+                    assert(batch.ApproximateSize() >= WRITE_BATCH_HEADER);
                     batch.Clear();
+                    assert(batch.ApproximateSize() == WRITE_BATCH_HEADER);
                     batch_writes.clear();
                     batch_erases.clear();
-                    fill();
+                    if (provider.ConsumeBool()) fill();
                 }
-                const bool sync{provider.ConsumeBool()};
                 det_env.DrainWork();
-                dbw.WriteBatch(batch, sync);
+                dbw->WriteBatch(batch, ConsumeSync(provider));
                 for (const auto& [k, v] : batch_writes) oracle[k] = v;
                 for (const auto& k : batch_erases) oracle.erase(k);
             },
+            [&] {
+                det_env.DrainWork();
+                dbw.reset();
+                dbw = make_db({.force_compact = provider.ConsumeBool()});
+                VerifyIterator(*dbw, oracle, obfuscate);
+            },
             // --- Reads ---
             [&] {
-                const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
+                const auto key{ConsumeKey(provider)};
                 std::vector<uint8_t> value;
-                const bool found{dbw.Read(key, value)};
-                const auto it{oracle.find(key)};
-                assert(found == (it != oracle.end()));
-                if (found) assert(value == MakeValue(key, it->second));
+                const bool found{dbw->Read(key, value)};
+                if (const auto it{oracle.find(key)}; it != oracle.end()) {
+                    assert(found && value == MakeValue(key, it->second));
+                } else {
+                    assert(!found);
+                }
             },
             [&] {
-                const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
-                assert(dbw.Exists(key) == oracle.contains(key));
+                const auto key{ConsumeKey(provider)};
+                assert(dbw->Exists(key) == oracle.contains(key));
             },
             [&] {
-                const uint16_t key{provider.ConsumeIntegral<uint16_t>()};
-                FailDeserialize wrong_type;
-                assert(!dbw.Read(key, wrong_type));
+                const auto key{!oracle.empty() && provider.ConsumeBool() ? PickExistingKey(provider, oracle) : ConsumeKey(provider)};
+                FailUnserialize wrong_type;
+                assert(!dbw->Read(key, wrong_type));
             },
             [&] {
-                const std::unique_ptr<CDBIterator> it{dbw.NewIterator()};
-                auto oracle_it{oracle.begin()};
-                if (provider.ConsumeBool()) {
-                    const uint16_t seek_key{provider.ConsumeIntegral<uint16_t>()};
-                    it->Seek(seek_key);
-                    oracle_it = oracle.lower_bound(seek_key);
-                } else {
-                    it->SeekToFirst();
-                }
-                while (it->Valid()) {
-                    uint16_t db_key;
-                    if (!it->GetKey(db_key)) break;
-
-                    if (oracle_it != oracle.end() && db_key == oracle_it->first) {
-                        std::vector<uint8_t> db_value;
-                        assert(it->GetValue(db_value));
-                        assert(db_value == MakeValue(db_key, oracle_it->second));
-                        ++oracle_it;
-                    } else {
-                        assert(obfuscate);
-                        std::string key_str;
-                        assert(it->GetKey(key_str));
-                        assert(key_str == std::string("\000obfuscate_key", 14));
-                    }
-                    it->Next();
-                }
-                assert(oracle_it == oracle.end());
+                const auto seek_key{provider.ConsumeBool() ? std::nullopt : std::optional{ConsumeKey(provider)}};
+                VerifyIterator(*dbw, oracle, obfuscate, seek_key);
             },
             // --- Stats ---
             [&] {
-                if (!obfuscate) {
-                    assert(dbw.IsEmpty() == oracle.empty());
-                } else {
-                    assert(!dbw.IsEmpty());
-                }
+                assert(dbw->IsEmpty() == (oracle.empty() && !obfuscate));
             },
             [&] {
-                uint16_t k1{provider.ConsumeIntegral<uint16_t>()};
-                uint16_t k2{provider.ConsumeIntegral<uint16_t>()};
-                if (k1 > k2) std::swap(k1, k2);
-                (void)dbw.EstimateSize(k1, k2);
+                const auto [k1, k2]{std::minmax({ConsumeKey(provider), ConsumeKey(provider)}, LevelDBBytewiseU16Cmp{})};
+                const size_t estimate_size{dbw->EstimateSize(k1, k2)};
+                if (k1 == k2) assert(estimate_size == 0);
             },
             [&] {
-                (void)dbw.DynamicMemoryUsage();
+                (void)dbw->DynamicMemoryUsage();
             },
             // --- Compaction control ---
             [&] {
@@ -270,5 +298,8 @@ FUZZ_TARGET(dbwrapper, .init = initialize_dbwrapper)
             });
     }
 
+    // The fuzzer may end without choosing another iterator pass after the last
+    // deferred compaction step, so drain and verify one final time here.
     det_env.DrainWork();
+    VerifyIterator(*dbw, oracle, obfuscate);
 }

</details>

andrewtoth force-pushed on Apr 16, 2026

andrewtoth commented at 2:22 AM on April 16, 2026: contributor

Thank you for your detailed review @l0rinc! I have taken many of your suggestions.

a dedicated CDBWrapper fuzzer helps with increasing our confidence as we iterate towards #31132.

I'm not sure I see how this is related to #31132, or what you mean by "iterate towards". If you have tangible suggestions for that PR, please share them there!

make the wrapper/env behavior more explicit and covered with focused unit tests, especially injected testing_env precedence and wipe_data

I'm not sure unit tests will add much value for this test only construct that is well covered by this fuzz harness. I opted not to take this.

keep the fuzz inputs closer to meaningful real values and cheaper execution paths where possible

I didn't take most of the suggestions regarding this theme. Real values will not allow us to execute many interesting code paths cheaply, in fact the opposite! Please see inline comments and let me know if you still disagree.

split the large fuzz commit into a few smaller buildable steps so the basic harness can be reviewed separately from the ordering / iterator / reopen logic

I did not take this suggestion. The first 2 commits touch production code. They are small and atomic so easy to review for correctness. The third large commit is adding the fuzz harness, which touches no other code. It is mostly one large CallOneOf loop that can be reviewed top to bottom. I don't see a benefit in splitting this up.

simplify and reuse helpers in the fuzzer where we can, and make the iterator / obfuscation handling more explicit and robust

I took many of these, others I have made comments inline.

git diff b967e255e7620b8366ec1aeb8932138455be8701..b11a6cd998864defcc4e1e6ae9fb5eae7c6a6e3e

l0rinc changes_requested

sedited requested review from l0rinc on Apr 23, 2026

sedited requested review from marcofleon on Apr 23, 2026

in src/test/fuzz/dbwrapper.cpp:120 in b11a6cd998

 115 | +{
 116 | +    SeedRandomStateForTest(SeedRand::ZEROS);
 117 | +    FuzzedDataProvider provider{buffer.data(), buffer.size()};
 118 | +
 119 | +    const auto memenv{std::unique_ptr<leveldb::Env>{leveldb::NewMemEnv(leveldb::Env::Default())}};
 120 | +    DeterministicEnv det_env{memenv.get()};

sedited commented at 10:11 AM on May 3, 2026:

Could there be value in not using this unconditionally?

andrewtoth commented at 9:39 AM on May 6, 2026:

If we didn't use this, we would remove determinism due to thread scheduling. There could be value in a separate target that uses the PosixEnv and running with TSan.

andrewtoth commented at 7:37 AM on May 7, 2026:

Added a second target that uses a NewMemEnv so we can exercise the multi threaded paths.

in src/test/fuzz/dbwrapper.cpp:53 in b11a6cd998

  48 | +    struct Work {
  49 | +        WorkFunction function;
  50 | +        void* arg;
  51 | +    };
  52 | +
  53 | +    Mutex m_mutex;

sedited commented at 10:17 AM on May 3, 2026:

I'm confused why this mutex is required. If it is for follow-up work for introducing a concurrent access fuzz test, I would introduce then.

andrewtoth commented at 7:38 AM on May 7, 2026:

Removed mutex for now. Will introduce in the follow-up PR to add concurrent read coverage.

in src/test/fuzz/dbwrapper.cpp:130 in b11a6cd998

 125 | +        return std::make_unique<CDBWrapper>(DBParams{
 126 | +            .path = "dbwrapper_fuzz",
 127 | +            .cache_bytes = provider.ConsumeIntegralInRange<size_t>(64 << 10, 1_MiB),
 128 | +            .obfuscate = obfuscate,
 129 | +            .testing_env = &det_env,
 130 | +            .max_file_size = provider.ConsumeIntegralInRange<size_t>(1_MiB, 4_MiB),

sedited commented at 10:35 AM on May 3, 2026:

How likely are we to hit the max file size with the current fuzz loop? Assuming a 4kb input provider string, is tripling the data size in MakeValue enough to hit it?

andrewtoth commented at 10:26 AM on May 6, 2026:

Length is a uint16_t, 2 bytes. Max is 65,535 * 3 = 196,605. It's another 2 bytes for the key, so each entry is 4 bytes of input. Max file size can be hit with 22 entries (4,194,000 % 196,605), or 88 bytes of input.

sedited commented at 10:38 AM on May 3, 2026: contributor

Concept ACK

Interesting approach with the deterministic environment. Just left some questions for now, but it already looks pretty promising. Will also be interesting to see what coverage we actually achieve here.

andrewtoth force-pushed on May 6, 2026

DrahtBot added the label CI failed on May 6, 2026

DrahtBot commented at 6:24 PM on May 6, 2026: contributor

🚧 At least one of the CI tasks failed. <sub>Task fuzzer,address,undefined,integer: https://github.com/bitcoin/bitcoin/actions/runs/25449230634/job/74661658639</sub> <sub>LLM reason (✨ experimental): CI failed because the fuzz test (libFuzzer) crashed with an assertion failure (oracle_it != oracle.end()) in test/fuzz/dbwrapper.cpp (VerifyIterator).</sub>

<details><summary>Hints</summary>

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

</details>

andrewtoth force-pushed on May 6, 2026

DrahtBot removed the label CI failed on May 6, 2026

marcofleon commented at 11:00 AM on May 8, 2026: contributor

For the single-threaded harness, drain_work() needs to be called again after VerifyIterator(), yes? To clear whatever compaction may be scheduled there.

solution-timeout.txt

<details> <summary>stack trace</summary>

FUZZ=dbwrapper ./fuzzbuild/bin/fuzz ../solution-timeout.txt 
INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 4251447260
INFO: Loaded 1 modules   (403748 inline 8-bit counters): 403748 [0x55e292f07010, 0x55e292f69934), 
INFO: Loaded 1 PC tables (403748 PCs): 403748 [0x55e292f69938,0x55e293592b78), 
./fuzzbuild/bin/fuzz: Running 1 inputs 1 time(s) each.
Running: ../solution-timeout.txt
ALARM: working on the last Unit for 1202 seconds
       and the timeout value is 1200 (use -timeout=N to change)
==1503306== ERROR: libFuzzer: timeout after 1202 seconds
    [#0](/bitcoin-bitcoin/0/) 0x55e291ad2864 in __sanitizer_print_stack_trace (/root/bitcoin/fuzzbuild/bin/fuzz+0xa8c864) (BuildId: a5a62bb4e15a4e47532da73aed608bdbf4cfa82a)
    [#1](/bitcoin-bitcoin/1/) 0x55e291aa8798 in fuzzer::PrintStackTrace() (/root/bitcoin/fuzzbuild/bin/fuzz+0xa62798) (BuildId: a5a62bb4e15a4e47532da73aed608bdbf4cfa82a)
    [#2](/bitcoin-bitcoin/2/) 0x55e291a8eb27 in fuzzer::Fuzzer::AlarmCallback() (/root/bitcoin/fuzzbuild/bin/fuzz+0xa48b27) (BuildId: a5a62bb4e15a4e47532da73aed608bdbf4cfa82a)
    [#3](/bitcoin-bitcoin/3/) 0x7fa7e661d04f  (/lib/x86_64-linux-gnu/libc.so.6+0x3c04f) (BuildId: 58254ca972028402bc40624f81388d85ec95f70d)
    [#4](/bitcoin-bitcoin/4/) 0x7fa7e6666e95  (/lib/x86_64-linux-gnu/libc.so.6+0x85e95) (BuildId: 58254ca972028402bc40624f81388d85ec95f70d)
    [#5](/bitcoin-bitcoin/5/) 0x7fa7e6669557 in pthread_cond_wait (/lib/x86_64-linux-gnu/libc.so.6+0x88557) (BuildId: 58254ca972028402bc40624f81388d85ec95f70d)
    [#6](/bitcoin-bitcoin/6/) 0x55e292a5db29 in leveldb::port::CondVar::Wait() /root/bitcoin/fuzzbuild/src/../../src/leveldb/port/port_stdcxx.h:75:9
    [#7](/bitcoin-bitcoin/7/) 0x55e292a5db29 in leveldb::DBImpl::~DBImpl() /root/bitcoin/fuzzbuild/src/../../src/leveldb/db/db_impl.cc:158:38
    [#8](/bitcoin-bitcoin/8/) 0x55e292a5e088 in leveldb::DBImpl::~DBImpl() /root/bitcoin/fuzzbuild/src/../../src/leveldb/db/db_impl.cc:153:19
    [#9](/bitcoin-bitcoin/9/) 0x55e2924ad135 in CDBWrapper::~CDBWrapper() /root/bitcoin/fuzzbuild/src/../../src/dbwrapper.cpp:269:5
    [#10](/bitcoin-bitcoin/10/) 0x55e291bd6a8a in std::default_delete<CDBWrapper>::operator()(CDBWrapper*) const /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/unique_ptr.h:95:2
    [#11](/bitcoin-bitcoin/11/) 0x55e291bd6a8a in std::unique_ptr<CDBWrapper, std::default_delete<CDBWrapper>>::~unique_ptr() /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/unique_ptr.h:396:4
    [#12](/bitcoin-bitcoin/12/) 0x55e291bd6a8a in void (anonymous namespace)::TestDbWrapper<dbwrapper_fuzz_target(std::span<unsigned char const, 18446744073709551615ul>)::$_0, dbwrapper_fuzz_target(std::span<unsigned char const, 18446744073709551615ul>)::$_1>(FuzzedDataProvider&, leveldb::Env*, dbwrapper_fuzz_target(std::span<unsigned char const, 18446744073709551615ul>)::$_0, dbwrapper_fuzz_target(std::span<unsigned char const, 18446744073709551615ul>)::$_1, bool) /root/bitcoin/fuzzbuild/src/test/fuzz/../../../../src/test/fuzz/dbwrapper.cpp:288:1
    [#13](/bitcoin-bitcoin/13/) 0x55e291bd6a8a in dbwrapper_fuzz_target(std::span<unsigned char const, 18446744073709551615ul>) /root/bitcoin/fuzzbuild/src/test/fuzz/../../../../src/test/fuzz/dbwrapper.cpp:298:5
    [#14](/bitcoin-bitcoin/14/) 0x55e291e93804 in std::function<void (std::span<unsigned char const, 18446744073709551615ul>)>::operator()(std::span<unsigned char const, 18446744073709551615ul>) const /usr/bin/../lib/gcc/x86_64-linux-gnu/12/../../../../include/c++/12/bits/std_function.h:591:9
    [#15](/bitcoin-bitcoin/15/) 0x55e291e93804 in test_one_input(std::span<unsigned char const, 18446744073709551615ul>) /root/bitcoin/fuzzbuild/src/test/fuzz/util/../../../../../src/test/fuzz/fuzz.cpp:88:5
    [#16](/bitcoin-bitcoin/16/) 0x55e291e93804 in LLVMFuzzerTestOneInput /root/bitcoin/fuzzbuild/src/test/fuzz/util/../../../../../src/test/fuzz/fuzz.cpp:216:5
    [#17](/bitcoin-bitcoin/17/) 0x55e291a8ffc0 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/root/bitcoin/fuzzbuild/bin/fuzz+0xa49fc0) (BuildId: a5a62bb4e15a4e47532da73aed608bdbf4cfa82a)
    [#18](/bitcoin-bitcoin/18/) 0x55e291a7a2b2 in fuzzer::RunOneTest(fuzzer::Fuzzer*, char const*, unsigned long) (/root/bitcoin/fuzzbuild/bin/fuzz+0xa342b2) (BuildId: a5a62bb4e15a4e47532da73aed608bdbf4cfa82a)
    [#19](/bitcoin-bitcoin/19/) 0x55e291a7fb7a in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/root/bitcoin/fuzzbuild/bin/fuzz+0xa39b7a) (BuildId: a5a62bb4e15a4e47532da73aed608bdbf4cfa82a)
    [#20](/bitcoin-bitcoin/20/) 0x55e291aa9062 in main (/root/bitcoin/fuzzbuild/bin/fuzz+0xa63062) (BuildId: a5a62bb4e15a4e47532da73aed608bdbf4cfa82a)
    [#21](/bitcoin-bitcoin/21/) 0x7fa7e6608249  (/lib/x86_64-linux-gnu/libc.so.6+0x27249) (BuildId: 58254ca972028402bc40624f81388d85ec95f70d)
    [#22](/bitcoin-bitcoin/22/) 0x7fa7e6608304 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x27304) (BuildId: 58254ca972028402bc40624f81388d85ec95f70d)
    [#23](/bitcoin-bitcoin/23/) 0x55e291a750a0 in _start (/root/bitcoin/fuzzbuild/bin/fuzz+0xa2f0a0) (BuildId: a5a62bb4e15a4e47532da73aed608bdbf4cfa82a)

SUMMARY: libFuzzer: timeout

</details>

andrewtoth force-pushed on May 8, 2026

andrewtoth commented at 5:30 PM on May 8, 2026: contributor

@marcofleon yes, nice catch. VerifyIterator will schedule a compaction, and then on db destruction if that compaction was not yet cleared on the single thread it will deadlock. Moved the drain_work call after VerifyIterator and the input passes.

git diff 0778cc98d18339d2a30d60901ba357e89d501873..3b413a3b48460825aef8e7a97d12ed678c4ee097

in src/test/fuzz/dbwrapper.cpp:128 in 3b413a3b48 outdated

 123 | +    } else {
 124 | +        it->SeekToFirst();
 125 | +    }
 126 | +    while (it->Valid()) {
 127 | +        uint16_t db_key;
 128 | +        assert(it->GetKey(db_key));

l0rinc commented at 3:56 PM on May 9, 2026:

3b413a3 test: add fuzz harness for CDBWrapper:

While the harness only writes uint16_t keys, the obfuscation key re-read later has a different shape: Could we maybe read it as raw bytes before deserializing it into either? We could add a custom deserializer that would abstract the read of this dual-type system:

struct IteratorKey {
    std::optional<uint16_t> user_key;
    std::string metadata_key;

    template <typename Stream>
    void Unserialize(Stream& s)
    {
        if (s.size() == sizeof(uint16_t)) {
            s >> user_key.emplace();
        } else {
            s >> metadata_key;
        }
    }
};

in src/test/fuzz/dbwrapper.cpp:139 in 3b413a3b48

 134 | +            ++oracle_it;
 135 | +        } else {
 136 | +            assert(obfuscate);
 137 | +            std::string key_str;
 138 | +            assert(it->GetKey(key_str));
 139 | +            assert(key_str == std::string("\000obfuscate_key", 14));

l0rinc commented at 3:57 PM on May 9, 2026:

3b413a3 test: add fuzz harness for CDBWrapper:

nit: I understand if we don't want to deduplicate it here, but could we maybe document what this key means? It's fine if you think it's a trivial grep, just resolve the comment in that case.

in src/test/fuzz/dbwrapper.cpp:233 in 3b413a3b48 outdated

 228 | +                dbw.reset();
 229 | +                DBOptions options{};
 230 | +                if (allow_force_compact && provider.ConsumeBool()) {
 231 | +                    options.force_compact = true;
 232 | +                }
 233 | +                dbw = make_db(options);

l0rinc commented at 3:58 PM on May 9, 2026:

3b413a3 test: add fuzz harness for CDBWrapper:

Would it make sense to call VerifyIterator() immediately after reopening the database to so that reopen exercises LevelDB recovery from existing data?

in src/test/fuzz/dbwrapper.cpp:228 in 3b413a3b48 outdated

 223 | +                for (const auto& [k, v] : batch_writes) oracle[k] = v;
 224 | +                for (const auto& k : batch_erases) oracle.erase(k);
 225 | +            },
 226 | +            [&] {
 227 | +                drain_work();
 228 | +                dbw.reset();

l0rinc commented at 4:04 PM on May 9, 2026:

3b413a3 test: add fuzz harness for CDBWrapper:

Maybe we could document that DeterministicEnv users must drain before DB destruction, see #34887 (review)

in src/test/fuzz/dbwrapper.cpp:183 in 3b413a3b48

 178 | +            // --- Mutations ---
 179 | +            [&] {
 180 | +                const auto key{ConsumeKey(provider)};
 181 | +                const auto len{ConsumeValueLength(provider)};
 182 | +                drain_work();
 183 | +                dbw->Write(key, MakeValue(key, len), provider.ConsumeBool());

l0rinc commented at 4:06 PM on May 9, 2026:

3b413a3 test: add fuzz harness for CDBWrapper:

We could either document the parameter name or use a named helper:

                dbw->Write(key, MakeValue(key, len), /*fSync=*/provider.ConsumeBool());

in src/test/fuzz/dbwrapper.cpp:126 in 3b413a3b48

 121 | +    if (seek_key) {
 122 | +        it->Seek(*seek_key);
 123 | +    } else {
 124 | +        it->SeekToFirst();
 125 | +    }
 126 | +    while (it->Valid()) {

l0rinc commented at 4:39 PM on May 9, 2026:

3b413a3 test: add fuzz harness for CDBWrapper:

Nit: maybe we could document how this iteration advances by:

for (; it->Valid(); it->Next()) {

l0rinc approved

l0rinc commented at 4:48 PM on May 9, 2026: contributor

tested ACK 3b413a3b48460825aef8e7a97d12ed678c4ee097

I have a few remaining suggestions but I'm also fine with it as is. Happy to re-review if my comments are applied. Please also see a few of my previous comments that weren't addressed (or seen). Ran the fuzzer for a day, it didn't crash.

DrahtBot requested review from sedited on May 9, 2026

in test/sanitizer_suppressions/ubsan:22 in 3b413a3b48

  18 | @@ -19,6 +19,7 @@ implicit-integer-sign-change:*/include/c++/
  19 |  implicit-integer-sign-change:*/new_allocator.h
  20 |  implicit-integer-sign-change:*/qarraydata.h
  21 |  implicit-integer-sign-change:crc32c/
  22 | +implicit-integer-sign-change:leveldb/util/comparator.cc

marcofleon commented at 5:36 PM on May 18, 2026:

nit: The commit message for this is saying the violation is in Compare but it's in FindShortSuccessor afaict. I also would prefer if the suppression were as specific as it can be: implicit-integer-sign-change:BytewiseComparatorImpl::FindShortSuccessor, as opposed to the entire file.

andrewtoth commented at 3:23 PM on May 20, 2026:

Done.

in src/test/fuzz/dbwrapper.cpp:111 in 3b413a3b48 outdated

 106 | +struct FailUnserialize {
 107 | +    template <typename Stream>
 108 | +    void Unserialize(Stream&) { throw std::ios_base::failure{"always fail"}; }
 109 | +};
 110 | +
 111 | +uint16_t ConsumeKey(FuzzedDataProvider& provider) { return provider.ConsumeIntegral<uint16_t>(); }

marcofleon commented at 6:58 PM on May 18, 2026:

I was wondering why we don't try to vary the key more? I get that the oracle is simpler just using uint16_t, but it feels like more could be tested with larger and more diverse keys (at least in LevelDB). Is it so that we overwrite often?

andrewtoth commented at 3:25 PM on May 20, 2026:

It was so the input could be used for many diverse keys, overlapping or not, while not wasting a lot of fuzz input. I suppose we could do something similar with extending keys that we do with values. But, this would touch a lot of the harness at this point. If it's ok, I would like to leave it as is for now.

marcofleon commented at 6:59 PM on May 18, 2026: contributor

crACK 3b413a3b48460825aef8e7a97d12ed678c4ee097

coverage: dbwrapper and dbwrapper_threaded

Coverage looks good (not seeing any major gaps) and the differences between the two look as expected, with dbwrapper_threaded covering a bit more in LevelDB by hitting the compaction methods in db_impl.cc.

Overall, this is a useful differential fuzz test. Just left non-blocking question and nit.

edit: Also meant to add that stability on AFL++ is very high (99.42%) for dbwrapper after a few hours, as expected with the deterministic env.

in src/test/fuzz/dbwrapper.cpp:174 in 3b413a3b48

 169 | +    std::unique_ptr<CDBWrapper> dbw{make_db()};
 170 | +
 171 | +    // Oracle: key → value length. Content is reconstructed via MakeValue().
 172 | +    Oracle oracle;
 173 | +
 174 | +    LIMITED_WHILE(provider.ConsumeBool(), 10'000)

marcofleon commented at 9:24 PM on May 18, 2026:

Late realization here, but the 10,000 iteration max will likely end up causing a time out in our fuzzing infra eventually. I'd recommend lowering it to 1000.

sedited commented at 7:23 AM on May 20, 2026: contributor

@andrewtoth can you address @marcofleon's remaining comments? Will re-ACK this quickly thereafter.

dbwrapper: make max_file_size a configurable DBParams field

Useful for fuzzing different values.

8d390c93fc

dbwrapper: accept optional testing leveldb::Env in DBParams

Allow callers to inject a custom leveldb::Env via DBParams::testing_env,
which takes priority over the memory_only in-memory environment. This
enables fuzz harnesses to supply a deterministic environment.

32169c3855

test: add fuzz harness for CDBWrapper

Introduces a libFuzzer harness that exercises CDBWrapper operations
against a std::map oracle, with a DeterministicEnv that captures LevelDB
background compaction for single-threaded determinism.

A sibling dbwrapper_threaded target uses a bare memenv so LevelDB's real
background thread runs, exercising force_compact and threaded compaction
paths that the deterministic variant cannot reach.

Adds an implicit-integer-sign-change suppression for
BytewiseComparatorImpl::FindShortSuccessor (leveldb/util/comparator.cc:58)
to the test ubsan suppressions list. LevelDB's bytewise comparator
implicitly converts a signed `char` byte to `uint8_t` there. The path
is only reached when compaction picks an SST boundary key, so it
requires a small enough max_file_size for compaction to fire during
the fuzz run.

Co-authored-by: l0rinc <pap.lorinc@gmail.com>

b63ef20d54

andrewtoth force-pushed on May 20, 2026

andrewtoth commented at 3:25 PM on May 20, 2026: contributor

@l0rinc @marcofleon thanks, took most of both of your suggestions.

git diff 3b413a3b48460825aef8e7a97d12ed678c4ee097..b63ef20d545f739f9def814155c0eb95c85c0b85

l0rinc commented at 6:03 PM on May 20, 2026: contributor

code review ACK b63ef20d545f739f9def814155c0eb95c85c0b85

Since last ack the new push contains a few leftover nits, covering fewer iterations, variable multiplier, separating testing_env from memory_only, name hints, etc.

DrahtBot requested review from marcofleon on May 20, 2026

marcofleon commented at 10:59 AM on May 21, 2026: contributor

ACK b63ef20d545f739f9def814155c0eb95c85c0b85

Reviewed the changes and ran the two targets overnight to get a fresh corpus. Everything looks good.

fanquake requested review from dergoegge on May 21, 2026

dergoegge commented at 1:54 PM on May 21, 2026: member

utACK b63ef20d545f739f9def814155c0eb95c85c0b85

Not going to review this in-depth as it already has reviews but looks solid to me at a glance.

sedited approved

sedited commented at 1:58 PM on May 21, 2026: contributor

ACK b63ef20d545f739f9def814155c0eb95c85c0b85

Ran both targets for a while, and they do seem to achieve decent coverage.

fanquake merged this on May 21, 2026

fanquake closed this on May 21, 2026

andrewtoth deleted the branch on May 21, 2026

rustaceanrob referenced this in commit 2a63f69878 on May 22, 2026

yuvicc referenced this in commit 9df13f29ba on May 26, 2026

in src/test/fuzz/dbwrapper.cpp:270 in b63ef20d54

 265 | +                    key = it->first;
 266 | +                } else {
 267 | +                    key = ConsumeKey(provider);
 268 | +                }
 269 | +                FailUnserialize wrong_type;
 270 | +                assert(!dbw->Read(key, wrong_type));

l0rinc commented at 4:41 PM on June 18, 2026:

This crashes after https://github.com/bitcoin/bitcoin/pull/34132/changes/63bd0b5de0817b00ae4d306bc448a7befdb24703 - please let me know if you agree with the adjustment

andrewtoth commented at 5:25 PM on June 18, 2026:

@l0rinc where is that commit from?

l0rinc commented at 5:33 PM on June 18, 2026:

GitHub distorts the URLs - you can Quote reply in these cases to see the source: last commit in https://github.com/bitcoin/bitcoin/pull/34132