TxGraph’s fundamental responsibility is deciding the order of transactions in the mempool. It relies on the cluster_linearize.h code to optimize that order, but there can and often will be many different orderings that are essentially equivalent from a quality perspective, so we have to pick one. At a high level, the solution will involve one or more of:
- Deciding based on internal identifiers (Cluster::m_sequence, DepGraphIndex). This is very simple, but risks leaking information about transaction receive order.
- Deciding randomly, which is private, but may interfere with relay expectations, block propagation, and the ability to monitor network behavior.
- Deciding based on txid, which is private and deterministic, but risks incentivizing grinding to get an edge (though we haven’t really seen such behavior).
- Deciding based on size (e.g. preferring smaller transactions), which is somewhat related to quality, but not unconditionally: depending on mempool layout, the ideal ordering might call for smaller transactions first, last, or anywhere in between. It’s also not a strong ordering, as there can be many identically-sized transactions. However, if it were to encourage grinding behavior, incentivizing smaller transactions is probably not a bad thing.
As of #32545, the current behavior is primarily picking randomly, though inconsistently, as some code paths also use internal identifiers and size. #33335 sought to make it consistently random (preferring size in a few places), with the downsides listed above.
This PR is an alternative to that: it changes the order to tie-break based on size everywhere possible, with lowest-txid-first as the final fallback. This is fully deterministic: for any given set of mempool transactions, if all clusters are linearized optimally, the transaction order exposed by TxGraph is deterministic.
The order of transactions within a chunk is tie-broken using:
1. The output of PostLinearize (which improves sub-chunk order), using an initial linearization created using rules 2-4 below
2. Individual transaction feerate (high to low)
3. Individual transaction weight (small to large)
4. Txid (low to high)
The order of chunks within a cluster is tie-broken using:
1. Chunk feerate (high to low)
2. Chunk weight (small to large)
3. Max-txid (chunk with lowest maximum-txid first)
The order of chunks across clusters is tie-broken using:
1. Feerate (high to low)
2. Equal-feerate-chunk-prefix weight (small to large)
3. Max-txid (chunk with lowest maximum-txid first)
The equal-feerate-chunk-prefix weight of a chunk C is defined as the sum of the weights of all chunks in the same cluster as C, with the same feerate as C, up to and including C itself, in linearization order (but excluding such chunks that appear after C). This is a well-defined approximation of sorting chunks from small to large across clusters, while remaining consistent with intra-cluster linearization order.
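To make this concrete: if a cluster’s linearization contains three consecutive chunks with equal feerate and weights 100, 300, and 200, their equal-feerate-chunk-prefix weights are 100, 400, and 600 respectively. A minimal sketch of how this metric could be computed for one cluster (hypothetical names, not the PR’s actual code; it relies on the fact that chunk feerates are non-increasing along a linearization, so equal-feerate chunks are consecutive):

```cpp
#include <cstdint>
#include <vector>

struct ChunkInfo {
    int64_t fee;     // total fee of the chunk
    int32_t weight;  // total weight of the chunk
};

// For each chunk (in linearization order), return the sum of the weights of
// the preceding chunks with the same feerate, including the chunk itself.
// Assumes fee*weight products fit in int64_t.
std::vector<int64_t> EqualFeeratePrefixWeights(const std::vector<ChunkInfo>& chunks)
{
    std::vector<int64_t> out;
    out.reserve(chunks.size());
    int64_t running = 0;
    for (size_t i = 0; i < chunks.size(); ++i) {
        // Feerates are equal iff fee_a * weight_b == fee_b * weight_a.
        bool same_as_prev = i > 0 &&
            chunks[i].fee * chunks[i - 1].weight == chunks[i - 1].fee * chunks[i].weight;
        running = same_as_prev ? running + chunks[i].weight : int64_t{chunks[i].weight};
        out.push_back(running);
    }
    return out;
}
```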
DrahtBot (contributor) commented at 11:56 pm on January 11, 2026:
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
See the guideline for information on the review process.
A summary of reviews will appear here.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#34138 (Cluster mempool: more accurate cost model for SFL by sipa)
#34023 (Optimized SFL cluster linearization by sipa)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
DrahtBot added the label CI failed on Jan 12, 2026
DrahtBot (contributor) commented at 1:29 am on January 12, 2026:
🚧 At least one of the CI tasks failed.
Task fuzzer,address,undefined,integer: https://github.com/bitcoin/bitcoin/actions/runs/20904012857/job/60054281786
LLM reason (✨ experimental): UndefinedBehaviorSanitizer detected an unsigned integer overflow in the fuzz target cluster_linearize.cpp, causing the CI failure.
Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:
- Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
- A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
- An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
sipa force-pushed on Jan 12, 2026
sipa force-pushed on Jan 12, 2026
DrahtBot removed the label CI failed on Jan 12, 2026
sipa force-pushed on Jan 12, 2026
sipa force-pushed on Jan 13, 2026
sipa force-pushed on Jan 13, 2026
DrahtBot added the label CI failed on Jan 13, 2026
DrahtBot removed the label CI failed on Jan 13, 2026
sipa force-pushed on Jan 14, 2026
sipa (member) commented at 11:00 pm on January 14, 2026:
I have pushed a few changes to comments and tests. Most importantly, the optimal linearization unit tests now compare with an exact known-correct optimal linearization. To generate those, I have updated https://github.com/sipa/pyclusterlin to construct the same deterministic linearization order (if optimal).
DrahtBot added the label Needs rebase on Jan 15, 2026
sipa force-pushed on Jan 15, 2026
sipa force-pushed on Jan 15, 2026
txgraph: initialize Ref in AddTransaction (preparation)
Instead of returning a TxGraph::Ref from TxGraph::AddTransaction(),
pass in a TxGraph::Ref& which is updated to refer to the new transaction
in that graph.
This cleans up the usage somewhat, avoiding the need for dummy Refs in
CTxMemPoolEntry constructor calls, but the motivation is that a future
commit will allow a callback to passed to MakeTxGraph to define a
fallback order on the transaction objects. This does not when a Ref is
created separately from the CTxMemPoolEntry it ends up living in, as
passing the newly-created Ref to the callback would be UB before it's
emplaced in its final CTxMemPoolEntry.
225d420424
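As an illustration of this interface change, a self-contained sketch with stand-in types (the real AddTransaction also takes the transaction's feerate, and Ref is opaque):

```cpp
// Stand-ins for TxGraph::Ref and CTxMemPoolEntry (which derives from Ref).
struct Ref {};
struct Entry : Ref { /* mempool entry data */ };

struct Graph {
    // Before: `Ref AddTransaction(...)` returned the Ref, forcing callers to
    // move-assign it into its final home (e.g. a dummy Ref in a
    // CTxMemPoolEntry constructor call).
    // After: the caller passes in the Ref, and it is initialized in place.
    void AddTransaction(Ref& ref) { /* register ref with the graph */ }
};

void Example(Graph& graph, Entry& entry)
{
    // The Ref subobject lives inside its final owner from the start, so a
    // future fallback-order callback can safely be handed this Ref.
    graph.AddTransaction(entry);
}
```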
txgraph: update chunk index on Compact (preparation)
This makes TxGraphImpl::Compact() invoke Cluster::Updated() on all
affected clusters, in case they have internal GraphIndex values stored
that may have become outdated with the renumbering of GraphIndex values
that Compact() caused.
This is necessary as preparation for a future commit in which chunk data
will gain a reference to the maximal-fallback transaction within the
chunk.
4ff87517a3
txgraph: clear cluster's chunk index in ~Ref (preparation)
Whenever a TxGraph::Ref is destroyed, if it by then still appears inside
main-level clusters, wipe the chunk index entries for those clusters, to
prevent having lingering indexes for transactions without Ref.
This is preparation for enabling a callback being passed to MakeTxGraph
to define a fallback order on objects. Once the Ref for a transaction is
gone, it is not possible to invoke the callback anymore. To prevent the
index becoming inconsistent, we need to immediately get rid of the index
entries when the Ref disappears.
This is not a problem, because such destructions necessarily will
trigger a relinearization of the cluster (assuming there are
transactions in it left) before becoming acceptable again, and the chunk
ordering is not observable (through CompareMainOrder, or through the
BlockBuilder interface) until that point. However, the index itself
needs to remain consistent in the meantime, even if not meaningful.
1d2e267820
clusterlin: sort tx in chunk by feerate and size (feature)
This changes the order of transactions within a chunk to be:
1. Topology (parents before children)
2. Individual transaction feerate (high to low)
3. Individual transaction weight (small to large)
4. Random tiebreak (will be changed in a future commit)
To do so, use a heap of topology-ready transactions within
GetLinearization(), sorted by (2), (3), and (4).
The order of chunks within a cluster is unmodified, but very similar;
they remain ordered by:
1. Topology (chunks after the chunks they depend on)
2. Chunk feerate (high to low)
3. Chunk weight (small to large)
4. Random tiebreak (will be changed in a future commit)
0824365a01
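A sketch of what such a heap's comparator could look like (invented names and simplified types; the PR's actual code in cluster_linearize.h differs in detail):

```cpp
#include <cstdint>
#include <queue>
#include <vector>

struct TxEntry {
    int64_t fee;        // transaction fee
    int32_t weight;     // transaction weight
    uint32_t tiebreak;  // random tiebreak key (fallback order in later commits)
};

// Comparator for a max-heap of topology-ready transactions: the popped
// element has the highest feerate, then the smallest weight, then the
// lowest tiebreak key. Assumes fee*weight products fit in int64_t.
struct ReadyTxCompare {
    bool operator()(const TxEntry& a, const TxEntry& b) const
    {
        // a.fee/a.weight < b.fee/b.weight iff a.fee*b.weight < b.fee*a.weight.
        if (int64_t cmp = a.fee * b.weight - b.fee * a.weight; cmp != 0) return cmp < 0;
        if (a.weight != b.weight) return a.weight > b.weight; // smaller weight wins
        return a.tiebreak > b.tiebreak;                       // lower key wins
    }
};

using ReadyHeap = std::priority_queue<TxEntry, std::vector<TxEntry>, ReadyTxCompare>;
```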
txgraph: sort chunks by equal-feerate-prefix size (feature)
This makes TxGraph track the equal-feerate-prefix size of all chunks in
all clusters in the main graph, and uses it to sort chunks coming from
distinct clusters.
The order of chunks across clusters becomes:
1. Feerate (high to low)
2. Equal-feerate-chunk-prefix (small to large)
3. Cluster sequence number (old to new); this will be changed later.
The equal-feerate-chunk-prefix size of a chunk C is defined as the sum
of the weights of all chunks in the same cluster as C, with the same
feerate as C, up to and including C itself, in linearization order (but
excluding such chunks that appear after C).
This is an approximation of sorting chunks from small to large across
clusters, while remaining consistent with intra-cluster linearization
order.
89b627d3cf
clusterlin: make optimal linearizations deterministic (feature)
This allows passing in a fallback order comparator to Linearize(), which
is used as final tiebreak when deciding the order of chunks and
transactions within a chunk, rather than a random tiebreak.
The order of transactions within a chunk becomes:
1. Topology (parents before children)
2. Individual transaction feerate (high to low)
3. Weight (small to large)
4. Fallback (low to high fallback order)
The order of chunks within a cluster becomes:
1. Topology (chunks after their dependencies)
2. Feerate (high to low)
3. Weight (small to large)
4. Max-fallback (chunk with lowest maximum-fallback-tx first)
For now, txgraph passes a naive comparator to Linearize(), which makes
the cluster order deterministic when treating the input transactions as
identified by the DepGraphIndex. However, since DepGraphIndexes are the
result of possibly-randomized operations inside txgraph, this doesn't
actually make txgraph's per-cluster ordering deterministic. That will be
changed in a later commit, by using a txid-based fallback instead.
59cf919602
txgraph test: subclass TxGraph::Ref like mempool does (preparation)
This is a small change to the txgraph fuzz test to make it use objects
derived from TxGraph::Ref (SimTxObject) rather than TxGraph::Ref
directly. This matches how the mempool uses CTxMemPoolEntry, which
derives from TxGraph::Ref.
This is preparation for a future commit which will introduce simulated
txids to the transactions in this fuzz test, to be used as falllback
order.
43b323fc29
txgraph: pass fallback_order to TxGraph (preparation)
This adds an std::function<strong_ordering(Ref&,Ref&)> argument to the
MakeTxGraph function, which can be used by the caller (e.g., mempool
code) to provide a fallback order to TxGraph.
This is just preparation; TxGraph does not yet use this fallback order
for anything.
cf54d9c3bb
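A self-contained sketch of the shape of such a callback (Ref, Entry, and the txid representation here are simplified stand-ins for the real TxGraph::Ref and CTxMemPoolEntry):

```cpp
#include <array>
#include <compare>
#include <cstdint>
#include <functional>

struct Ref {};                        // stand-in for TxGraph::Ref
struct Entry : Ref {                  // stand-in for CTxMemPoolEntry
    std::array<uint8_t, 32> txid{};   // transaction id bytes
};

using FallbackOrder = std::function<std::strong_ordering(Ref&, Ref&)>;

// Lowest txid sorts first; the caller guarantees every Ref it hands to the
// graph is really an Entry, so the downcast is safe here.
FallbackOrder txid_fallback = [](Ref& a, Ref& b) {
    return static_cast<Entry&>(a).txid <=> static_cast<Entry&>(b).txid;
};
```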
txgraph: use fallback order when linearizing (feature)
Add glue to make TxGraph use the fallback order provided to it, in the
fallback comparator it provides to the cluster linearization code.
The order of chunks within a cluster becomes:
1. Topology (chunks after their dependencies)
2. Feerate (high to low)
3. Weight (small to large)
4. Max-txid (chunk with lowest maximum-txid first)
The order of transactions within a chunk becomes:
1. Topology (parents before children)
2. Individual transaction feerate (high to low)
3. Weight (small to large)
4. Txid (low to high txid)
This makes optimal cluster linearization, both the order of chunks
within a cluster and the order of transactions within those chunks,
completely deterministic.
dff43a6c7a
txgraph: use fallback order to sort chunks (feature)
This makes TxGraph also use the fallback order to decide the order of
chunks from distinct clusters.
The order of chunks across clusters becomes:
1. Feerate (high to low)
2. Equal-feerate-chunk-prefix (small to large)
3. Max-txid (chunk with lowest maximum-txid first)
This makes the full TxGraph ordering fully deterministic as long as all
clusters in it are optimally linearized.
b1aaedc92d
sipa force-pushed on Jan 15, 2026
DrahtBot added the label CI failed on Jan 15, 2026
DrahtBot removed the label Needs rebase on Jan 15, 2026
DrahtBot removed the label CI failed on Jan 15, 2026
sipa (member) commented at 3:07 am on January 16, 2026:
Both @darosior (here) and @ajtowns (here) suggested preferring larger size over smaller size instead. I agree that this is also fine from an incentives perspective, because nobody is going to deliberately construct larger transactions at the same feerate - that’s just paying more. If that is acceptable to you, you’re better off just paying a higher feerate.
For ordering chunks across clusters, when combined with a greedy block building algorithm, larger first may have a slight advantage, because skipping a larger chunk still allows the smaller chunk to be considered. This does not work for chunks within a cluster, or transactions within a chunk. I don’t think it matters much, but this could be simulated if we care.
Practically however, I don’t really see how to achieve this. For ordering chunks across clusters, we need some kind of function that maps chunks to a sorting key, with the property that it must map equal-feerate chunks within a cluster to increasing keys. The cross-cluster chunk ordering then boils down to sorting by increasing key. The PR here currently uses (equal-feerate-chunk-prefix size, maximum txid within cluster), which will approximately prioritize smaller chunks (but it’s a little murky when there are multiple equal-feerate chunks from the same cluster). I guess it’s possible to use a key like sum of inverses of the sizes of all equal-feerate chunks in the same cluster, but that seems extremely ugly and only very approximate anyway.
For ordering chunks within a cluster, or (if we ever consider a tx-granularity based block building algorithm rather than chunk-granularity) ordering transactions within a chunk, it occurs to me that we really want uniformity - getting the chunk/tx boundary density as uniform as possible, because realistically, every byte position has equal probability of ending up being the block limit, and we want to minimize the gap. This actually calls for something (pseudo)random, or maybe alternating between smallest and largest.
The only (very weak) argument in favor of smaller first that I know of, is that in a regime where mempools are reliably full, and transactions regularly get dropped/replaced, it may be better to have higher chunk/tx boundary density near the beginning, because the end has a non-negligible probability of never being mined at all, so it may pay off to deprioritize it.
Overall, it is unlikely to matter I think, and having something is better than nothing. Remember that this is all just tie-breaking, which can’t really guarantee any objective quality - that comes through chunking and linearizations that maximize feerates.
Re @ajtowns’s comment there about grinding: I think it really boils down to me having a (perhaps irrational) dislike of using txid or any other not-objectively-relevant criterion in the sort order. It feels arbitrary, and possibly brittle if there ever is an incentive for something else equally arbitrary. It’s setting a convention that at least in theory will have winners and losers, unlike feerate which is directly incentive-related, and even unlike size which at least intimately affects block building even if so in a rather chaotic way.
So I like having a size-based criterion in there first, because it reduces the impact of txid on the order to places where we’re talking about equal-fee and equal-size transactions/chunks. But maybe it doesn’t really matter…
Thoughts?
gmaxwell (contributor) commented at 8:21 am on January 16, 2026:
I also liked having neutral or positive criteria ahead of the arbitrary one - also, making your txn smaller (at equal feerate) already increases your odds of being included because you’re more likely to successfully pack, and if doing so as a tie breaker causes people to grind or whatnot to further reduce their transaction sizes, then that seems good to me. Those were the points that came to mind when I considered the alternative ‘wouldn’t larger be better’ - though mostly in the sense of “prefer the transaction that pays the greater absolute fees, among transactions at the same feerate”, which amounts to a prefer-larger policy.
I think a prior comment of mine pointed out that you could stick other criteria in ahead of the arbitrary one - total bitcoin being spent? The old priority metric? Net change in UTXO set size? I don’t have strong opinions. These metrics, unlike grinding, are kind of naturally self-limiting: one only has so much bitcoin to spend, and so on.
sipa (member) commented at 4:24 pm on January 16, 2026:
The following approach could be used to get an approximate big-to-small chunk ordering.
Rather than summing the sizes of all equal-feerate chunks in a given cluster, up to the one under consideration, sum all of them starting from the one under consideration. Then sort by that metric decreasing.
It also doesn’t need consistency with intra-cluster chunk order, or intra-chunk tx order. Those could still be small-to-large, though the effect of such a suffix-size based inter-chunk order is probably larger (if it is anything at all) if it’s consistent with intra-cluster order.
Adding additional criteria after size (whether that’s small first or large first) is actually not hard with the approach used in this PR, because the fallback tiebreak after feerate and size is abstracted behind a “fallback compare two transactions” function provided by the mempool. Currently, that function compares txids, but it could be changed to compare other things like input amounts, input count, output count, age of inputs, …
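A sketch of the suffix-sum variant described above, mirroring the prefix computation (hypothetical names; again assumes equal-feerate chunks are consecutive in the linearization):

```cpp
#include <cstdint>
#include <vector>

struct ChunkInfo {
    int64_t fee;     // total fee of the chunk
    int32_t weight;  // total weight of the chunk
};

// For each chunk (in linearization order), sum the weights of the same-feerate
// chunks from that chunk to the end of its equal-feerate run. Sorting by this
// key decreasing approximates a large-to-small cross-cluster chunk order.
std::vector<int64_t> EqualFeerateSuffixWeights(const std::vector<ChunkInfo>& chunks)
{
    std::vector<int64_t> out(chunks.size());
    int64_t running = 0;
    for (size_t i = chunks.size(); i-- > 0;) {
        bool same_as_next = i + 1 < chunks.size() &&
            chunks[i].fee * chunks[i + 1].weight == chunks[i + 1].fee * chunks[i].weight;
        running = same_as_next ? running + chunks[i].weight : int64_t{chunks[i].weight};
        out[i] = running;
    }
    return out;
}
```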