mempool: asynchronous mempool fee rate diagram updates via validation interface #34803

pull ismaelsadeeq wants to merge 7 commits into bitcoin:master from ismaelsadeeq:03-2026-notify-mempool-update changing 14 files +825 −84
  1. ismaelsadeeq commented at 1:39 PM on March 11, 2026: member

    Motivation

    Now that we have the full linearization of mempool clusters, whenever a cluster in the mempool is updated (due to replacement, eviction for various reasons, or block connection), both the previous linearization of the affected cluster and the new linearization of the resulting clusters are available.

    This brings two advantages — the ability for outside observers to make decisions based on fee rate diagram updates without direct access to the mempool, and reduced lock contention.

    It can be used primarily for:

    1. Making CBlockPolicyEstimator package aware
    2. Preventing redundant block template builds in the proposed block template manager (see the BlockTemplateManager proposal).

    These use cases were discussed during the development of cluster mempool. Note: this is not the motivation behind cluster mempool, just a part of it.

    See discussion in Package aware fee estimator and Determining BlockTemplate Fee Increase Using Fee Rate Diagram.

    Currently the fee estimator is asynchronous but not package-aware. After this PR it is possible to make it package-aware — a follow-up branch on top of this does exactly that; see commit.

    For the mining interface, the node rebuilds a block template every second, regardless of whether any significant mempool change affected the top block.

    Even after cluster mempool, building a block template is not free since we have to pause the node (holding cs_main).

    With this notification, a proposed block template manager that observes mempool updates will only rebuild when an inflow into the mempool's top block warrants it. Redundant builds, where no significant fee rate change occurred, are eliminated.

    This becomes even more useful with proposed improvements like mempool-based fee rate estimation, which also builds a block template. Right now it uses a constant 7-second interval and rebuilds regardless of whether an inflow affects the top block template. This will fix that.

    A proposed sendtemplate in the p2p layer also builds a block template. I propose that it also use the proposed block template manager, which should be built on top of this, so that redundant block template builds are eliminated.

    There is a different approach to determining updates in the top block template, but that still requires some direct interaction with the mempool. Since the fee estimator needs this notification anyway and we already have it, the block template manager can use it too and not interact with the mempool at all — it becomes a purely passive observer.

    Changes

    • TxGraph::Chunk — a new struct that pairs the transactions' TxGraph::Refs with a FeeFrac fee rate, so the before/after diagram carries enough information to compute chunk hashes.

    • MemPoolChunk (kernel/mempool_entry.h) — new struct representing a mempool chunk: a FeeFrac fee rate and a uint256 chunk hash identifying the set of witness transactions in the chunk, along with the vector of witnesses.

    • MemPoolChunksUpdate (txmempool.h) — new struct carrying:

      • old_chunks — chunks before the update.
      • new_chunks — chunks after the update.
      • reason — a MemPoolRemovalReason identifying the update type.
      • block_height — set only when reason == BLOCK; carries the connecting block's height; defaults to std::nullopt.
    • MempoolUpdated — new ValidationInterface callback receiving a MemPoolChunksUpdate.

    • GetHashFromWitnesses (src/util/witnesses_hash.{h,cpp}) — sorts a vector of Wtxids lexicographically and hashes them deterministically to produce a stable chunk hash. GetPackageHash in src/policy/packages.cpp is refactored to use this.

    • Caches the fee rate diagram computed in GetAndSaveMainStagingDiagram / GetFeeRateDiagramChunks so it is immediately available when the notification is emitted, without recomputing it.

    • Extracts the dependency-addition subroutine in UpdateTransactionsFromBlock into a new private method, addDependenciesFromBlock.
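    As a rough illustration of the GetHashFromWitnesses idea mentioned above — sort the wtxids lexicographically, then hash the sorted sequence, so the same set of transactions always yields the same identifier regardless of collection order — here is a simplified, self-contained sketch. Note this is not the real implementation: the actual code hashes Wtxid values into a uint256 with Bitcoin Core's hasher; std::hash over a concatenated string is used here purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for GetHashFromWitnesses: sorting first makes the
// result independent of the order the wtxids were collected in.
std::size_t ChunkIdSketch(std::vector<std::string> wtxids)
{
    std::sort(wtxids.begin(), wtxids.end());
    std::string concat;
    for (const auto& id : wtxids) {
        concat += id;
        concat += '|'; // separator avoids ambiguity between id boundaries
    }
    return std::hash<std::string>{}(concat);
}
```

    With this shape, `ChunkIdSketch({"aa", "bb"})` and `ChunkIdSketch({"bb", "aa"})` produce the same value, which is the stability property the chunk hash needs.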

    MempoolUpdated is fired after every mempool mutation. Specifically:

    • Transaction addition and RBF replacement: fired inside CTxMemPool::ChangeSet::Apply with reason REPLACED.
    • Block connection: fired inside removeForBlock with reason BLOCK. Confirmed transactions are fully removed before conflicts are evicted. The connecting block's height is carried in block_height.
    • Reorg: fired inside removeForReorg with reason REORG.
    • Recursive removal: fired inside removeRecursive with a caller-supplied reason, e.g. CONFLICT, EXPIRY, or REORG.
    • Expiry: fired inside Expire with reason EXPIRY.
    • Size limit eviction: fired inside TrimToSize with reason SIZELIMIT.
    • Post-reorg dependency update: fired inside UpdateTransactionsFromBlock with reason SIZELIMIT.

    In all cases the before/after chunks are obtained from changeSet->GetFeeRateDiagramChunks() after committing the staging graph.

    • Note on UpdateTransactionsFromBlock: a naive single-pass approach calling Trim() directly on the main graph fails because oversized clusters cannot be linearized, so GetMainStagingDiagram cannot produce a valid before/after diff. The fix is a two-stage process:

    • Phase 1 (optimistic): Create a txgraph staging, call addDependenciesFromBlock to register all new parent-child relationships, then call Trim().

    • If Trim() returns nothing — no cluster exceeded the limit — commit staging, compute the diagram, emit MempoolUpdated(SIZELIMIT), done.

    • If Trim() returns evicted transactions, the optimistic path failed. Abort the staging (rolling back to main), open a fresh staging, call RemoveTransaction for each evicted tx to remove it from staging, call addDependenciesFromBlock again (evicted txs are silently skipped since they are no longer in the graph), commit, compute the before/after diagram, emit MempoolUpdated(SIZELIMIT), then call RemoveStaged to fully remove the evicted transactions from the mempool.

    • Reorgs are rare, and reorgs that burst the cluster size limit are rarer still, so in the common optimistic case we apply dependencies only once.
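    A hedged sketch of this two-phase flow, with toy stand-ins for the staging API (ToyStaging, the "big"-prefix eviction rule, and the function names are all illustrative, not the real txgraph interface):

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy staging session: Trim() evicts any tx whose id starts with "big",
// standing in for "this tx belongs to a cluster exceeding the limit".
struct ToyStaging {
    std::set<std::string> txs;
    std::vector<std::string> Trim()
    {
        std::vector<std::string> evicted;
        for (auto it = txs.begin(); it != txs.end();) {
            if (it->rfind("big", 0) == 0) {
                evicted.push_back(*it);
                it = txs.erase(it);
            } else {
                ++it;
            }
        }
        return evicted;
    }
};

// Returns the evicted txs (empty in the optimistic case).
std::vector<std::string> ApplyBlockDepsSketch(const std::vector<std::string>& deps)
{
    // Phase 1 (optimistic): stage every dependency, then Trim().
    ToyStaging staging{std::set<std::string>(deps.begin(), deps.end())};
    std::vector<std::string> evicted = staging.Trim();
    if (evicted.empty()) {
        // Commit staging, compute diagram, emit MempoolUpdated(SIZELIMIT).
        return {};
    }
    // Phase 2: abort, reopen a fresh staging, and re-add only the deps
    // that were not evicted (the real code re-adds all dependencies and
    // the graph silently skips txs that are no longer present).
    ToyStaging retry;
    for (const auto& tx : deps) {
        bool was_evicted = false;
        for (const auto& e : evicted) {
            if (e == tx) { was_evicted = true; break; }
        }
        if (!was_evicted) retry.txs.insert(tx);
    }
    // Commit retry, emit MempoolUpdated(SIZELIMIT), then RemoveStaged(evicted).
    return evicted;
}
```

    In the common case the eviction list is empty and the second pass never runs, matching the "apply dependencies only once" observation above.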

    • Added a new unit test suite, mempool_update_tests, covering all MempoolUpdated emission paths.

    • We compute the chunk hash lazily in the scheduler thread because the mempool update is a hot path; see thread #34803 (review).

  2. DrahtBot added the label Mempool on Mar 11, 2026
  3. DrahtBot commented at 1:40 PM on March 11, 2026: contributor


    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.


    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK sedited

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.


    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #34075 (fees: Introduce Mempool Based Fee Estimation to reduce overestimation by ismaelsadeeq)
    • #33922 (mining: add getMemoryLoad() and track template non-mempool memory footprint by Sjors)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.


    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • it's -> its [possessive form in Datatype that represent a chunk along with it's feerate...]

    <sup>2026-05-01 10:38:52</sup>

  4. ismaelsadeeq commented at 1:43 PM on March 11, 2026: member

    This PR is targeted for v32.0.

    The main changes are not significant; the test in the last commit constitutes the bulk of the diff, at 493 added lines.

  5. sedited added this to the milestone 32.0 on Mar 14, 2026
  6. sedited commented at 8:26 AM on March 14, 2026: contributor

    Concept ACK

    Put this on the milestone. I am a bit surprised at the amount of changes this requires, but I guess this is what it takes to wire up the new cluster mempool data structures.

  7. DrahtBot added the label Needs rebase on Mar 19, 2026
  8. ismaelsadeeq force-pushed on Mar 19, 2026
  9. DrahtBot removed the label Needs rebase on Mar 19, 2026
  10. DrahtBot added the label Needs rebase on Apr 3, 2026
  11. ismaelsadeeq force-pushed on Apr 14, 2026
  12. ismaelsadeeq force-pushed on Apr 14, 2026
  13. DrahtBot removed the label Needs rebase on Apr 14, 2026
  14. DrahtBot added the label CI failed on Apr 14, 2026
  15. DrahtBot removed the label CI failed on Apr 15, 2026
  16. in src/txmempool.cpp:1017 in 8c38cd73cb outdated
    1013 | +    return GetAndSaveMainStagingDiagram();
    1014 | +}
    1015 |  
    1016 | +std::pair<std::vector<FeeFrac>, std::vector<FeeFrac>> CTxMemPool::ChangeSet::GetAndSaveMainStagingDiagram()
    1017 | +{
    1018 | +    LOCK(m_pool->cs);
    


    sedited commented at 9:31 PM on April 24, 2026:

    Are there any call sites where this lock is actually needed? Or is it just here as a safeguard, because we cannot correctly annotate this method as taking the mempool lock?

  17. in src/util/hasher.h:121 in 899fa5cdd3
     116 | @@ -116,4 +117,7 @@ class SaltedSipHasher
     117 |      size_t operator()(const std::span<const unsigned char>& script) const;
     118 |  };
     119 |  
     120 | +/** Sorts and hashes a set of wtxids into a single hash*/
     121 | +uint256 GetHashFromWitnesses(std::vector<Wtxid> wtxids);
    


    sedited commented at 10:11 AM on April 25, 2026:

    It is not immediately clear to me what this hash will be used for. Is it just a unique identifier for the chunk, that needn't even be reproducible, or is the expectation that consumers will re-generate it? Could any of our other hash functions be a reasonable alternative?


    ismaelsadeeq commented at 6:17 PM on April 26, 2026:

    Is it just a unique identifier for the chunk, that needn't even be reproducible?

    Yes.

    Is the expectation that consumers will re-generate it?

    No, this is not defined anywhere, so there is no guarantee; it is just a unique identifier for now.

    Could any of our other hash functions be a reasonable alternative?

    I did not look at other hash functions; I used the same one we use to compute the package hash for logging during package validation, which is also a hot path. So I am not sure whether there is a better alternative.


    sedited commented at 7:29 PM on April 26, 2026:

    Thanks for linking the example again below, I just read over that line in your description, sorry! Could the hash be replaced with a simple counter then? The package hash is a bit different, since it is only calculated unconditionally when debug log is on.

    EDIT: Looking at it a bit more closely, we do calculate the package hash in non-logging contexts, but I'm not sure if that is really comparable. Not sure if a counter is really feasible either; it seems we would have to track more stuff then.


    ismaelsadeeq commented at 10:06 AM on April 27, 2026:

    I don't think a counter is better; it's worse and adds more accounting. How about moving this responsibility to clients? The trade-off is that we may perform duplicate calculations of the chunk hash.

    Another alternative is to compute the chunk hashes in the validation interface method before passing them to listeners.


    sedited commented at 10:13 AM on April 27, 2026:

    Mmh, doing this in the scheduler thread does sound interesting to me. We don't really have precedent for that though. Anyway, I don't think this is necessarily a blocker. There are probably some trade offs to be made here.


    ismaelsadeeq commented at 10:22 AM on April 27, 2026:

    Yes, we do not do any computation there apart from queuing the jobs. This can be the precedent :), unless there is an objection. IMO that is the right place to do this; the mempool does not need any of this computation. I think it is worth pursuing that direction even if it is not a blocker: the hash is computed on every mempool update and every block connection/disconnection, so keeping that computation out of the mempool is a win.


    ismaelsadeeq commented at 10:47 AM on April 29, 2026:

    Fixed.

  18. sedited commented at 10:17 AM on April 25, 2026: contributor

    Just a few questions so far. Do you have an example that already wires this up to a client? From a first glance, it seems a bit unfortunate that we're rebuilding the various vectors for different clients of the diagram. I wonder if some more work could be saved there by adapting some of the current users a bit.

  19. ismaelsadeeq commented at 6:17 PM on April 26, 2026: member

    Do you have an example that already wires this up to a client?

    Yeah, I linked an example in the PR description

    https://github.com/ismaelsadeeq/bitcoin/commit/fd88b167529aca80a4f65ed6aa77ec9372a41bce

    From a first glance, it seems a bit unfortunate that we're rebuilding the various vectors for different clients of the diagram. I wonder if some more work could be saved there by adapting some of the current users a bit.

    txgraph is agnostic of the chunk information, so we cannot just wire that up; we have to add that information: the chunk's cumulative fee and the chunk identifier. I think this is a good design and should be kept as is. To do this, we iterate through the cluster's chunk vector, then through each chunk's vector of refs, fetch the actual transaction fees and wtxids, and compute the chunk hash and cumulative fees. We then move the result into the mempool-updated notification, where the notification dispatch passes a reference, not a copy. So we do build a 2D vector of witness IDs and a 1D vector of the actual chunks for both the old and the new diagram. I am happy to incorporate suggestions to simplify this.
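    To make the shape of that loop concrete, here is an illustrative, self-contained sketch with toy stand-ins (ToyTx, ToyGraphChunk, and ToyMemPoolChunk are hypothetical types, not the real ones): per chunk we accumulate fee and size from the refs and collect the wtxids that would feed the chunk hash.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct ToyTx { int64_t fee; int32_t vsize; std::string wtxid; };
struct ToyGraphChunk { std::vector<const ToyTx*> refs; };
struct ToyMemPoolChunk {
    int64_t fee{0};
    int32_t vsize{0};
    std::vector<std::string> wtxids;
};

// Walk each txgraph chunk once, building the richer mempool-side chunk.
std::vector<ToyMemPoolChunk> BuildChunks(const std::vector<ToyGraphChunk>& chunks)
{
    std::vector<ToyMemPoolChunk> out;
    out.reserve(chunks.size());
    for (const auto& chunk : chunks) {
        ToyMemPoolChunk entry;
        for (const auto* tx : chunk.refs) {
            entry.fee += tx->fee;              // cumulative chunk fee
            entry.vsize += tx->vsize;          // cumulative chunk size
            entry.wtxids.push_back(tx->wtxid); // inputs to the chunk hash
        }
        out.push_back(std::move(entry));
    }
    return out;
}
```

    The resulting vector can then be handed to the notification by reference, so subscribers never touch the mempool or txgraph directly.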

  20. ismaelsadeeq force-pushed on Apr 29, 2026
  21. DrahtBot added the label CI failed on Apr 29, 2026
  22. DrahtBot commented at 11:34 AM on April 29, 2026: contributor


    🚧 At least one of the CI tasks failed. <sub>Task macOS native: https://github.com/bitcoin/bitcoin/actions/runs/25104403535/job/73561784519</sub> <sub>LLM reason (✨ experimental): CI failed due to a failing ctest run: mempool_update_tests failed (1 test failure in Bitcoin Core Test Suite).</sub>

    <details><summary>Hints</summary>

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

    </details>

  23. ismaelsadeeq force-pushed on Apr 29, 2026
  24. DrahtBot removed the label CI failed on Apr 29, 2026
  25. DrahtBot added the label Needs rebase on Apr 30, 2026
  26. txgraph: populate chunk refs in `GetMainStagingDiagrams` result
    GetMainStagingDiagrams currently returns a pair of vector<FeeFrac>
    containing only chunk feerates.
    Some callers might need to know which transactions belong to
    each chunk, e.g. in order to compute a chunk hash.
    
    This commit introduces TxGraph::Chunk, which pairs a chunk feerate with
    the refs of all transactions in that chunk. Update GetMainStagingDiagrams
    to return a pair of vector<TxGraph::Chunk> and rename AppendChunkFeerates
    to AppendChunks to reflect the richer return type.
    
    This commit also adds SanityCheck coverage for the new behaviour.
    6574e5adfb
  27. refactor: move-only: split GetPackageHash 08a86a7a86
  28. mempool: cache fee rate diagrams as chunks in ChangeSet
    CalculateChunksForRBF extracts only the FeeFrac from each TxGraph::Chunk
    and discards the rest.
    
    Cache the fee rate diagram chunks in ChangeSet.
    
    The cached chunks will be used in a subsequent commit.
    417a8f4f4b
  29. mempool: add GetAndSaveMainStagingDiagram to ChangeSet
    GetAndSaveMainStagingDiagram is extracted from CalculateChunksForRBF so
    the diagram snapshot can be taken on removal paths that do not go through
    the RBF codepath.
    
    The const fix to RemoveStaged is required because GetRemovals()
    returns a const reference.
    3b83374825
  30. refactor: move-only: extract dependency addition into a separate method
    The newly created method addDependenciesFromBlock is going to be
    used multiple times in a subsequent commit.
    9834798e47
  31. mempool: add and fire MempoolUpdated signal on each mempool update path
    Add `MemPoolChunksUpdate` to `kernel/mempool_entry.h`.
    
    Each removal path now captures the before/after fee rate diagram chunks
    and fires MempoolUpdated so subscribers can observe every mempool change.
    
    This signal allows subscribers to observe mempool fee rate diagram
    changes in an asynchronous way without holding the mempool lock.
    
    This changes the order of operations in removeForBlock: all block
    transactions in the mempool are now removed before conflict
    removals.
    c5b49c0749
  32. test: add unit test to mempool update validation events 0412515d4f
  33. ismaelsadeeq force-pushed on May 1, 2026
  34. DrahtBot removed the label Needs rebase on May 1, 2026
  35. in src/txgraph.h:70 in 6574e5adfb
      65 | @@ -66,6 +66,16 @@ class TxGraph
      66 |          MAIN //!< Always refers to the main graph, whether staging is present or not.
      67 |      };
      68 |  
      69 | +    /* Datatype that represent a chunk along with it's feerate and all the txs refs in the chunk. */
      70 | +    struct Chunk : public FeeFrac {
    


    sipa commented at 12:34 PM on May 1, 2026:

    I think a plain struct is cleaner here than deriving from FeeFrac. Semantically, a chunk is not a subtype of a feerate.

    Also, if GetMainStagingDiagrams isn't returning a pure std::vector<FeeFrac>, the justification of using that over FeePerWeight so CompareChunks can be used directly disappears, so I would suggest using

    struct Chunk
    {
        FeePerWeight feerate;
        std::vector<Ref*> refs;
    };
    

    An alternative is returning it as two vectors, std::pair<std::vector<FeeFrac>, std::vector<std::vector<Ref*>>>, where the first pair element is exactly the return type used in master now.

    Also, this has potentially some non-trivial cost to always allocate and fill the full Refs. I haven't thought through whether that performance impact matters for any of the call sites, but if it does, perhaps adding a separate GetMainStagingChunks() (in addition to GetMainStagingDiagram()) is worth it?


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-02 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me