fuzz: assert the reused mempool stays empty in process_message(s) #35482

pull HowHsu wants to merge 1 commits into bitcoin:master from HowHsu:fuzz-assert-mempool-empty changing 4 files +25 −4
  1. HowHsu commented at 8:57 AM on June 8, 2026: contributor

    While investigating @maflcko's global-state-detector patch (https://github.com/bitcoin/bitcoin/issues/29018#issuecomment-4422112607), I noticed that a global variable may be, or may contain, a pointer to heap-allocated data. Even if the global variable itself remains unchanged, the referenced heap data can still be modified across fuzz inputs, allowing persistent state to escape detection.

    The process_message and process_messages targets reuse a single mempool across iterations, but ResetChainman() (invoked only when the block index has grown) rebuilds the chainman without resetting the mempool. A transaction accepted into the mempool by one input would therefore leak into the next iteration.

    This is a defensive guard, not a fix for an observed leak. A random payload is extremely unlikely to deserialize into a transaction that also passes AcceptToMemoryPool (its inputs must spend real UTXOs and its scripts must verify), so the mempool is expected to stay empty, and replaying the existing corpus confirms it never grows. Still, asserting the invariant is cheap and would catch any future change, such as a new path that adds to the mempool without full validation, or crafted transaction inputs, that would silently make one iteration depend on another. Having the guard is better than not having it.

  2. DrahtBot added the label Fuzzing on Jun 8, 2026
  3. DrahtBot commented at 8:57 AM on June 8, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--006a51241073e994b41acfe9ec718e94-->

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35482.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

  4. maflcko commented at 9:09 AM on June 10, 2026: member

    Hmm, what makes the fuzz target produce valid blocks, but not valid transactions? Is it that the prevhash can be solved by a fuzz engine for some reason, but the prevout can not? I guess that could be, because one may be a linear memory compare, and the other a map lookup?

    Not sure what to do here, but an alternative could be to add the mempool reset logic and hope that fuzz engines in the future will be smart enough to solve a map lookup?

  5. sedited commented at 9:33 AM on June 10, 2026: contributor

    I think I would also prefer if the mempools are just reset instead. Having an empty mempool at all times shouldn't be an invariant of the test.

  6. fuzz: reset the reused mempool in process_message(s)
    The mempool is reused across iterations of process_message and process_messages,
    but it was never reset, so a transaction accepted into the mempool by one input
    would leak into the next iteration. A random payload is extremely unlikely to
    produce a transaction that passes AcceptToMemoryPool (its inputs must spend real
    UTXOs and its scripts must verify), so the mempool stays empty in practice today.
    But rather than assert that as a test invariant, reset it.
    
    Add a ResetMempool() helper to test/util/txmempool (next to MemPoolOptionsForTest)
    and call it together with ResetChainman() when either the block index grew or the
    mempool is non-empty. The mempool is bound to the chainman at chainman construction
    (ChainstateLoadOptions::mempool), so the two must be reset together.
    357d0767e3
  7. HowHsu force-pushed on Jun 11, 2026
  8. maflcko commented at 8:52 AM on June 11, 2026: member

    I don't think this change makes sense. Can you please answer my previous questions why this code is not hit?

    I know the answer, but the burden to explain pull requests is on the author.

    Hint: You'll have to look at the runtime coverage.

    Absent that, I don't think this change makes sense, because it is blindly modifying dead/unreachable code without a context or goal.

    <!-- 019eb5c9-2420-7d01-88dd-bb944b831d13 # in dir /tmp/tmp.a9CcqXNjti/git_dir# fuzz process_message tx. 101 history 102 git diff 103 git diff -U1 104 history root@21d4968c4719:/tmp/tmp.a9CcqXNjti/git_dir# git diff -U1 diff --git a/src/test/fuzz/process_message.cpp b/src/test/fuzz/process_message.cpp index f5d5989..6bfcb7a 100644 --- a/src/test/fuzz/process_message.cpp +++ b/src/test/fuzz/process_message.cpp @@ -6,3 +6,5 @@ #include <banman.h> +#include <coins.h> #include <consensus/consensus.h> +#include <core_io.h> #include <kernel/chainparams.h> @@ -45,2 +47,3 @@ TestingSetup* g_setup; std::string_view LIMIT_TO_MESSAGE_TYPE{}; +std::vector<COutPoint> g_mature_coinbase_outpoints; @@ -52,5 +55,10 @@ void ResetChainman(TestingSetup& setup) setup.LoadVerifyActivateChainstate(); + g_mature_coinbase_outpoints.clear(); for (int i = 0; i < 2 * COINBASE_MATURITY; i++) { node::BlockCreateOptions options; - MineBlock(setup.m_node, options); + options.coinbase_output_script = P2WSH_OP_TRUE; + const COutPoint coinbase_outpoint{MineBlock(setup.m_node, options)}; + if (i < COINBASE_MATURITY) { + g_mature_coinbase_outpoints.push_back(coinbase_outpoint); + } } @@ -89,2 +97,24 @@ FUZZ_TARGET(process_message, .init = initialize_process_message) + Assert(!g_mature_coinbase_outpoints.empty()); + + const COutPoint prevout{PickValue(fuzzed_data_provider, g_mature_coinbase_outpoints)}; + CAmount coin_value; + { + LOCK(cs_main); + coin_value = node.chainman->ActiveChainstate().CoinsTip().AccessCoin(prevout).out.nValue; + } + Assert(coin_value > 1); + CMutableTransaction mtx; + mtx.vin.resize(1); + mtx.vin[0].prevout = prevout; + mtx.vin[0].scriptWitness.stack = {WITNESS_STACK_ELEM_OP_TRUE}; + mtx.vout.emplace_back(coin_value - 1, P2WSH_OP_TRUE); + const CTransactionRef tx{MakeTransactionRef(mtx)}; + + LOCK(cs_main); + const auto tx_result{node.chainman->ProcessTransaction(tx)}; + Assert(tx_result.m_result_type == MempoolAcceptResult::ResultType::VALID); + std::cout << EncodeHexTx(*tx) << std::endl; + Assert(node.mempool->size() == 0); + // Reset, so that dangling pointers can be detected by sanitizers.

  9. HowHsu commented at 9:10 AM on June 11, 2026: contributor

    Hmm, what makes the fuzz target produce valid blocks, but not valid transactions? Is it that the prevhash can be solved by a fuzz engine for some reason, but the prevout can not? I guess that could be, because one may be a linear memory compare, and the other a map lookup?

    Not sure what to do here, but an alternative could be to add the mempool reset logic and hope that fuzz engines in the future will be smart enough to solve a map lookup?

    I measured this on the process_message corpus: out of 2386 inputs, 90 grow the block index (so the fuzz engine does solve a valid hashPrevBlock), but 0 make the mempool non-empty. It's probably not prevhash-vs-prevout: both are unordered_map lookups (BlockMap and the UTXO set), about equally hard, and those 90 hits show the engine can solve that kind of key. The real difference is signature verification. A block reaches the block index with no script checks at all — CheckBlock/AcceptBlock only verify (simplified) PoW + merkle + prevhash, and CheckInputScripts only runs later in ConnectBlock (where coinbase is skipped) — so an empty block with just a coinbase needs no signatures. A transaction, on the other hand, must pass CheckInputScripts to enter the mempool, and a fuzz engine can't brute-force a valid signature.

  10. maflcko commented at 9:15 AM on June 11, 2026: member

    No, that is the wrong explanation and ai slop. A signature isn't needed for OP_TRUE.

    Please don't use llm bots if you don't understand their output. You are neither helping yourself, nor anyone else.

  11. HowHsu commented at 9:16 AM on June 11, 2026: contributor

    No, that is the wrong explanation and ai slop. A signature isn't needed for OP_TRUE.

    Please don't use llm bots if you don't understand their output. You are neither helping yourself, nor anyone else.

    Need sometime to investigate it.

  12. maflcko commented at 9:20 AM on June 11, 2026: member

    Need sometime to investigate it.

    Sure, take all the time you need. I am happy to answer any questions you may have.

    Also, for future reference, copy-pasting LLM output without marking it as such is against the guidelines: See https://github.com/bitcoin/bitcoin/pull/35386/changes#diff-0db3c9d8a653e398b629930edc9e7dcde74514ef45c26f02a7bd51157d52dcee


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-06-11 10:51 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me