mempool: asynchronous mempool fee rate diagram updates via validation interface #34803

pull ismaelsadeeq wants to merge 7 commits into bitcoin:master from ismaelsadeeq:03-2026-notify-mempool-update changing 14 files +825 −84
  1. ismaelsadeeq commented at 1:39 PM on March 11, 2026: member

    Motivation

    Now that we have the full linearization of mempool clusters, whenever a cluster in the mempool is updated (due to replacement, eviction for various reasons, or block connection), both the previous linearization of the affected cluster and the new linearization of the resulting clusters are available.

    This brings two advantages — the ability for outside observers to make decisions based on fee rate diagram updates without direct access to the mempool, and reduced lock contention.

    It can be used primarily for:

    1. Making CBlockPolicyEstimator package aware
    2. Preventing redundant block template builds in the proposed block template manager (see the BlockTemplateManager proposal).

    These use cases were discussed during the development of cluster mempool. Note: this is not the motivation behind cluster mempool, just a part of it.

    See discussion in Package aware fee estimator and Determining BlockTemplate Fee Increase Using Fee Rate Diagram.

    Currently the fee estimator is asynchronous but not package-aware. After this PR it is possible to make it package-aware — a follow-up branch on top of this does exactly that; see commit.

    For the mining interface, the node rebuilds a block template every second, regardless of whether any significant mempool change affected the top block.

    Even after cluster mempool, building a block template is not free since we have to pause the node (holding cs_main).

    With this notification, a proposed block template manager that observes mempool updates will only rebuild when an inflow into the mempool's top block warrants it. Redundant builds, where no significant fee rate change occurred, are eliminated.

    This becomes even more useful with proposed improvements like mempool-based fee rate estimation, which also builds a block template. Right now it uses a constant 7-second interval and rebuilds regardless of whether an inflow affects the top block template. This will fix that.

    A proposed sendtemplate in the p2p layer also builds a block template. I propose that it also use the proposed block template manager, which should be built on top of this, so that redundant block template builds are eliminated.

    There is a different approach to determining updates in the top block template, but that still requires some direct interaction with the mempool. Since the fee estimator needs this notification anyway and we already have it, the block template manager can use it too and not interact with the mempool at all — it becomes a purely passive observer.

    Changes

    • TxGraph::Chunk — a new struct that pairs the transactions' TxGraph::Refs with a FeeFrac fee rate, so the before/after diagram carries enough information to compute chunk hashes.

    • MemPoolChunk (kernel/mempool_entry.h) — new struct representing a mempool chunk: a FeeFrac fee rate and a uint256 chunk hash identifying the set of witness transactions in the chunk, along with the vector of witnesses.

    • MemPoolChunksUpdate (txmempool.h) — new struct carrying:

      • old_chunks — chunks before the update.
      • new_chunks — chunks after the update.
      • reason — a MemPoolRemovalReason identifying the update type.
      • block_height — set only when reason == BLOCK; carries the connecting block's height; defaults to std::nullopt.
    • MempoolUpdated — new ValidationInterface callback receiving a MemPoolChunksUpdate.

    • GetHashFromWitnesses (src/util/witnesses_hash.{h,cpp}) — sorts a vector of Wtxids lexicographically and hashes them deterministically to produce a stable chunk hash. GetPackageHash in src/policy/packages.cpp is refactored to use this.

    • Caches the fee rate diagram computed in GetAndSaveMainStagingDiagram / GetFeeRateDiagramChunks so it is immediately available when the notification is emitted, without recomputing it.

    • Extracts the dependency-addition subroutine in UpdateTransactionsFromBlock into a new private method, addDependenciesFromBlock.
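    As a rough illustration of the GetHashFromWitnesses idea mentioned above — sort the wtxids lexicographically, then hash the sorted sequence, so the same set of transactions always yields the same identifier regardless of collection order — here is a simplified, self-contained sketch. Note this is not the real implementation: the actual code hashes Wtxid values into a uint256 with Bitcoin Core's hasher; std::hash over a concatenated string is used here purely for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Toy stand-in for GetHashFromWitnesses: sorting first makes the
// result independent of the order the wtxids were collected in.
std::size_t ChunkIdSketch(std::vector<std::string> wtxids)
{
    std::sort(wtxids.begin(), wtxids.end());
    std::string concat;
    for (const auto& id : wtxids) {
        concat += id;
        concat += '|'; // separator avoids ambiguity between id boundaries
    }
    return std::hash<std::string>{}(concat);
}
```

    With this shape, `ChunkIdSketch({"aa", "bb"})` and `ChunkIdSketch({"bb", "aa"})` produce the same value, which is the stability property the chunk hash needs.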

    MempoolUpdated is fired after every mempool mutation. Specifically:

    • Transaction addition and RBF replacement: fired inside CTxMemPool::ChangeSet::Apply with reason REPLACED.
    • Block connection: fired inside removeForBlock with reason BLOCK. Confirmed transactions are fully removed before conflicts are evicted. The connecting block's height is carried in block_height.
    • Reorg: fired inside removeForReorg with reason REORG.
    • Recursive removal: fired inside removeRecursive with a caller-supplied reason, e.g. CONFLICT, EXPIRY, or REORG.
    • Expiry: fired inside Expire with reason EXPIRY.
    • Size limit eviction: fired inside TrimToSize with reason SIZELIMIT.
    • Post-reorg dependency update: fired inside UpdateTransactionsFromBlock with reason SIZELIMIT.

    In all cases the before/after chunks are obtained from changeSet->GetFeeRateDiagramChunks() after committing the staging graph.

    • Note on UpdateTransactionsFromBlock: a naive single-pass approach calling Trim() directly on the main graph fails because oversized clusters cannot be linearized, so GetMainStagingDiagram cannot produce a valid before/after diff. The fix is a two-stage process:

    • Phase 1 (optimistic): Create a txgraph staging, call addDependenciesFromBlock to register all new parent-child relationships, then call Trim().

    • If Trim() returns nothing — no cluster exceeded the limit — commit staging, compute the diagram, emit MempoolUpdated(SIZELIMIT), done.

    • If Trim() returns evicted transactions, the optimistic path failed. Abort the staging (rolling back to main), open a fresh staging, call RemoveTransaction for each evicted tx to remove it from staging, call addDependenciesFromBlock again (evicted txs are silently skipped since they are no longer in the graph), commit, compute the before/after diagram, emit MempoolUpdated(SIZELIMIT), then call RemoveStaged to fully remove the evicted transactions from the mempool.

    • Reorgs are rare, and reorgs that burst the cluster size limit are rarer still, so in the common optimistic case we apply dependencies only once.
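    A hedged sketch of this two-phase flow, with toy stand-ins for the staging API (ToyStaging, the "big"-prefix eviction rule, and the function names are all illustrative, not the real txgraph interface):

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy staging session: Trim() evicts any tx whose id starts with "big",
// standing in for "this tx belongs to a cluster exceeding the limit".
struct ToyStaging {
    std::set<std::string> txs;
    std::vector<std::string> Trim()
    {
        std::vector<std::string> evicted;
        for (auto it = txs.begin(); it != txs.end();) {
            if (it->rfind("big", 0) == 0) {
                evicted.push_back(*it);
                it = txs.erase(it);
            } else {
                ++it;
            }
        }
        return evicted;
    }
};

// Returns the evicted txs (empty in the optimistic case).
std::vector<std::string> ApplyBlockDepsSketch(const std::vector<std::string>& deps)
{
    // Phase 1 (optimistic): stage every dependency, then Trim().
    ToyStaging staging{std::set<std::string>(deps.begin(), deps.end())};
    std::vector<std::string> evicted = staging.Trim();
    if (evicted.empty()) {
        // Commit staging, compute diagram, emit MempoolUpdated(SIZELIMIT).
        return {};
    }
    // Phase 2: abort, reopen a fresh staging, and re-add only the deps
    // that were not evicted (the real code re-adds all dependencies and
    // the graph silently skips txs that are no longer present).
    ToyStaging retry;
    for (const auto& tx : deps) {
        bool was_evicted = false;
        for (const auto& e : evicted) {
            if (e == tx) { was_evicted = true; break; }
        }
        if (!was_evicted) retry.txs.insert(tx);
    }
    // Commit retry, emit MempoolUpdated(SIZELIMIT), then RemoveStaged(evicted).
    return evicted;
}
```

    In the common case the eviction list is empty and the second pass never runs, matching the "apply dependencies only once" observation above.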

    • Added a new unit test suite, mempool_update_tests, covering all MempoolUpdated emission paths.

    • We compute the chunk hash lazily in the scheduler thread because the mempool update is a hot path; see thread #34803 (review).

  2. DrahtBot added the label Mempool on Mar 11, 2026
  3. DrahtBot commented at 1:40 PM on March 11, 2026: contributor


    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.


    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK sedited

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.


    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #34075 (fees: Introduce Mempool Based Fee Estimation to reduce overestimation by ismaelsadeeq)
    • #33922 (mining: add getMemoryLoad() and track template non-mempool memory footprint by Sjors)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.


    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • it's -> its [possessive form in Datatype that represent a chunk along with it's feerate...]

    <sup>2026-05-01 10:38:52</sup>

  4. ismaelsadeeq commented at 1:43 PM on March 11, 2026: member

    This PR is targeted for v32.0.

    The main changes are not significant; the test in the last commit constitutes the bulk of the diff, at 493 added lines.

  5. sedited added this to the milestone 32.0 on Mar 14, 2026
  6. sedited commented at 8:26 AM on March 14, 2026: contributor

    Concept ACK

    Put this on the milestone. I am a bit surprised at the amount of changes this requires, but I guess this is what it takes to wire up the new cluster mempool data structures.

  7. DrahtBot added the label Needs rebase on Mar 19, 2026
  8. ismaelsadeeq force-pushed on Mar 19, 2026
  9. DrahtBot removed the label Needs rebase on Mar 19, 2026
  10. DrahtBot added the label Needs rebase on Apr 3, 2026
  11. ismaelsadeeq force-pushed on Apr 14, 2026
  12. ismaelsadeeq force-pushed on Apr 14, 2026
  13. DrahtBot removed the label Needs rebase on Apr 14, 2026
  14. DrahtBot added the label CI failed on Apr 14, 2026
  15. DrahtBot removed the label CI failed on Apr 15, 2026
  16. in src/txmempool.cpp:1017 in 8c38cd73cb outdated
    1013 | +    return GetAndSaveMainStagingDiagram();
    1014 | +}
    1015 |  
    1016 | +std::pair<std::vector<FeeFrac>, std::vector<FeeFrac>> CTxMemPool::ChangeSet::GetAndSaveMainStagingDiagram()
    1017 | +{
    1018 | +    LOCK(m_pool->cs);
    


    sedited commented at 9:31 PM on April 24, 2026:

    Are there any call sites where this lock is actually needed? Or is it just here as a safeguard, because we cannot correctly annotate this method as taking the mempool lock?

  17. in src/util/hasher.h:121 in 899fa5cdd3
     116 | @@ -116,4 +117,7 @@ class SaltedSipHasher
     117 |      size_t operator()(const std::span<const unsigned char>& script) const;
     118 |  };
     119 |  
     120 | +/** Sorts and hashes a set of wtxids into a single hash*/
     121 | +uint256 GetHashFromWitnesses(std::vector<Wtxid> wtxids);
    


    sedited commented at 10:11 AM on April 25, 2026:

    It is not immediately clear to me what this hash will be used for. Is it just a unique identifier for the chunk, that needn't even be reproducible, or is the expectation that consumers will re-generate it? Could any of our other hash functions be a reasonable alternative?


    ismaelsadeeq commented at 6:17 PM on April 26, 2026:

    Is it just a unique identifier for the chunk, that needn't even be reproducible?

    Yes.

    Is the expectation that consumers will re-generate it?

    No, this is not defined anywhere, so there is no guarantee; it is just a unique identifier for now.

    Could any of our other hash functions be a reasonable alternative?

    I did not look at other hash functions; I used the same one we use to compute the package hash for logging during package validation, which is also a hot path. So I am not sure whether there is a better alternative.


    sedited commented at 7:29 PM on April 26, 2026:

    Thanks for linking the example again below, I just read over that line in your description, sorry! Could the hash be replaced with a simple counter then? The package hash is a bit different, since it is only calculated unconditionally when debug log is on.

    EDIT: Looking at it a bit more closely, we do calculate the package hash in non-logging contexts, but I'm not sure if that is really comparable. Not sure if a counter is really feasible either; it seems we would have to track more stuff then.


    ismaelsadeeq commented at 10:06 AM on April 27, 2026:

    I don't think a counter is better; it's worse and adds more accounting. How about moving this responsibility to clients? The trade-off is that we may perform duplicate calculations of the chunk hash.

    Another alternative is to compute the chunk hashes in the validation interface method before passing them to listeners.


    sedited commented at 10:13 AM on April 27, 2026:

    Mmh, doing this in the scheduler thread does sound interesting to me. We don't really have precedent for that though. Anyway, I don't think this is necessarily a blocker. There are probably some trade offs to be made here.


    ismaelsadeeq commented at 10:22 AM on April 27, 2026:

    Yes, we do not do any computation there apart from queuing the jobs. This can be the precedent :), unless there is an objection. IMO that is the right place to do this; the mempool does not need any of this computation. I think it is worth pursuing that direction even if it is not a blocker: the hash is computed on every mempool update and every block connection/disconnection, so keeping that computation out of the mempool is a win.


    ismaelsadeeq commented at 10:47 AM on April 29, 2026:

    Fixed.

  18. sedited commented at 10:17 AM on April 25, 2026: contributor

    Just a few questions so far. Do you have an example that already wires this up to a client? From a first glance, it seems a bit unfortunate that we're rebuilding the various vectors for different clients of the diagram. I wonder if some more work could be saved there by adapting some of the current users a bit.

  19. ismaelsadeeq commented at 6:17 PM on April 26, 2026: member

    Do you have an example that already wires this up to a client?

    Yeah, I linked an example in the PR description

    https://github.com/ismaelsadeeq/bitcoin/commit/fd88b167529aca80a4f65ed6aa77ec9372a41bce

    From a first glance, it seems a bit unfortunate that we're rebuilding the various vectors for different clients of the diagram. I wonder if some more work could be saved there by adapting some of the current users a bit.

    txgraph is agnostic of the chunk information, so we cannot just wire that up; we have to add that information: the chunk's cumulative fee and the chunk identifier. I think this is a good design and should be kept as is. To do this, we iterate through the cluster's chunk vector, then through each chunk's vector of refs, fetch the actual transaction fees and wtxids, and compute the chunk hash and cumulative fees. We then move the result into the mempool-updated notification, where the notification dispatch passes a reference, not a copy. So we do build a 2D vector of witness IDs and a 1D vector of the actual chunks for both the old and the new diagram. I am happy to incorporate suggestions to simplify this.
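    To make the shape of that loop concrete, here is an illustrative, self-contained sketch with toy stand-ins (ToyTx, ToyGraphChunk, and ToyMemPoolChunk are hypothetical types, not the real ones): per chunk we accumulate fee and size from the refs and collect the wtxids that would feed the chunk hash.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

struct ToyTx { int64_t fee; int32_t vsize; std::string wtxid; };
struct ToyGraphChunk { std::vector<const ToyTx*> refs; };
struct ToyMemPoolChunk {
    int64_t fee{0};
    int32_t vsize{0};
    std::vector<std::string> wtxids;
};

// Walk each txgraph chunk once, building the richer mempool-side chunk.
std::vector<ToyMemPoolChunk> BuildChunks(const std::vector<ToyGraphChunk>& chunks)
{
    std::vector<ToyMemPoolChunk> out;
    out.reserve(chunks.size());
    for (const auto& chunk : chunks) {
        ToyMemPoolChunk entry;
        for (const auto* tx : chunk.refs) {
            entry.fee += tx->fee;              // cumulative chunk fee
            entry.vsize += tx->vsize;          // cumulative chunk size
            entry.wtxids.push_back(tx->wtxid); // inputs to the chunk hash
        }
        out.push_back(std::move(entry));
    }
    return out;
}
```

    The resulting vector can then be handed to the notification by reference, so subscribers never touch the mempool or txgraph directly.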

  20. ismaelsadeeq force-pushed on Apr 29, 2026
  21. DrahtBot added the label CI failed on Apr 29, 2026
  22. DrahtBot commented at 11:34 AM on April 29, 2026: contributor


    🚧 At least one of the CI tasks failed. <sub>Task macOS native: https://github.com/bitcoin/bitcoin/actions/runs/25104403535/job/73561784519</sub> <sub>LLM reason (✨ experimental): CI failed due to a failing ctest run: mempool_update_tests failed (1 test failure in Bitcoin Core Test Suite).</sub>

    <details><summary>Hints</summary>

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

    </details>

  23. ismaelsadeeq force-pushed on Apr 29, 2026
  24. DrahtBot removed the label CI failed on Apr 29, 2026
  25. DrahtBot added the label Needs rebase on Apr 30, 2026
  26. txgraph: populate chunk refs in `GetMainStagingDiagrams` result
    GetMainStagingDiagrams currently returns a pair of vector<FeeFrac>
    containing only chunk feerates.
    Some callers might need to know which transactions belong to
    each chunk, e.g. in order to compute a chunk hash.
    
    This commit introduces TxGraph::Chunk, which pairs a chunk feerate with
    the refs of all transactions in that chunk. Update GetMainStagingDiagrams
    to return a pair of vector<TxGraph::Chunk> and rename AppendChunkFeerates
    to AppendChunks to reflect the richer return type.
    
    This commit also adds SanityCheck coverage for the new behaviour.
    6574e5adfb
  27. refactor: move-only: split GetPackageHash 08a86a7a86
  28. mempool: cache fee rate diagrams as chunks in ChangeSet
    CalculateChunksForRBF extracts only the FeeFrac from each TxGraph::Chunk
    and discards the rest.
    
    Cache the fee rate diagram chunks in ChangeSet.
    
    The cached chunks will be used in a subsequent commit.
    417a8f4f4b
  29. mempool: add GetAndSaveMainStagingDiagram to ChangeSet
    GetAndSaveMainStagingDiagram is extracted from CalculateChunksForRBF so
    the diagram snapshot can be taken on removal paths that do not go through
    the RBF codepath.
    
    The const fix to RemoveStaged is required because GetRemovals()
    returns a const reference.
    3b83374825
  30. refactor: move-only: extract dependency addition into a separate method
    The newly created method addDependenciesFromBlock is going to be
    used multiple times in a subsequent commit.
    9834798e47
  31. mempool: add and fire MempoolUpdated signal on each mempool update path
    Add `MemPoolChunksUpdate` to `kernel/mempool_entry.h`.
    
    Each removal path now captures the before/after fee rate diagram chunks
    and fires MempoolUpdated so subscribers can observe every mempool change.
    
    This signal allows subscribers to observe mempool fee rate diagram
    changes in an asynchronous way without holding the mempool lock.
    
    This changes the order of operations in removeForBlock: all block
    transactions in the mempool are now removed before conflict
    removals.
    c5b49c0749
  32. test: add unit test to mempool update validation events 0412515d4f
  33. ismaelsadeeq force-pushed on May 1, 2026
  34. DrahtBot removed the label Needs rebase on May 1, 2026
  35. in src/txgraph.h:70 in 6574e5adfb
      65 | @@ -66,6 +66,16 @@ class TxGraph
      66 |          MAIN //!< Always refers to the main graph, whether staging is present or not.
      67 |      };
      68 |  
      69 | +    /* Datatype that represent a chunk along with it's feerate and all the txs refs in the chunk. */
      70 | +    struct Chunk : public FeeFrac {
    


    sipa commented at 12:34 PM on May 1, 2026:

    I think a plain struct is cleaner here than deriving from FeeFrac. Semantically, a chunk is not a subtype of a feerate.

    Also, if GetMainStagingDiagrams isn't returning a pure std::vector<FeeFrac>, the justification of using that over FeePerWeight so CompareChunks can be used directly disappears, so I would suggest using

    struct Chunk
    {
        FeePerWeight feerate;
        std::vector<Ref*> refs;
    };
    

    An alternative is returning it as two vectors, std::pair<std::vector<FeeFrac>, std::vector<std::vector<Ref*>>>, where the first pair element is exactly the return type used in master now.

    Also, this has potentially some non-trivial cost to always allocate and fill the full Refs. I haven't thought through whether that performance impact matters for any of the call sites, but if it does, perhaps adding a separate GetMainStagingChunks() (in addition to GetMainStagingDiagram()) is worth it?


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-02 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me