@sipa and I gave a presentation to some other developers recently about some work we’ve been doing to redesign how the mempool works, and I wanted to share a writeup outlining our thoughts with a broader audience to further the discussion. I’ve also attached a PDF of slides that go along with this: Reinventing the Mempool.pdf.
Summary
The current mempool is primarily designed around maintaining two different sort orders, where for each transaction we keep track of its position in an ancestor-feerate-sort and in a descendant-feerate-sort. We then use those two cached sort orders in order to implement eviction and mining.
Those algorithms are deficient in several ways. In particular, the eviction algorithm (which removes the transaction with lowest descendant feerate, along with its descendants) does not guarantee that the first transaction evicted would be the last transaction mined, and there exist situations where we could evict a transaction that would otherwise actually be mined in the next block.
Furthermore, the mining algorithm is more complex than the ancestor-feerate sort we maintain: merely looking up two transactions in our ancestor feerate index is not enough to determine which would be mined from the mempool first. This lack of a total ordering on mempool transactions makes RBF calculations difficult to align with miner incentives, and creates distortions where we can sometimes replace transactions via RBF that would actually be better for a miner than their replacement.
To solve all these problems, we propose maintaining a total ordering on mempool transactions, which we use to implement mining and eviction algorithms that are symmetric opposites of each other, and to provide better RBF rules for determining when to allow a replacement to occur.
Background
Eviction and mining are not opposite
Sometimes our eviction criteria might remove from our mempool something that would be mined in the next block. Consider the following transaction graph (with (ancestor, descendant) feerates in sat/byte in parentheses):
```mermaid
graph TD;
    A["Tx A: size 250, fee 250 (1, 2)"]-->B;
    A-->C;
    A-->E["Tx E: size 250, fee 49750 (100, 199)"];
    B["Tx B: size 50000, fee 50000 (1, 2.02)"]-->D["Tx D: size 500, fee 52000 (1.51, 104)"];
    C["Tx C: size 50000, fee 50000 (1, 2.02)"]-->D;
```
Note that the ancestor feerate of E is 100 sat/byte, while the descendant feerate of A is 2 sat/byte. In particular, the descendant feerate of A is lowest among all transactions in this diagram, so if it were also the lowest descendant feerate package in the mempool, then we might evict A, and therefore E, even though (A, E) might otherwise be selected for the next block.
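To make the arithmetic concrete, here is a small Python sketch (the names and structure are purely illustrative) that recomputes the (ancestor, descendant) feerates shown in the diagram:

```python
# Recompute the (ancestor, descendant) feerates from the diagram above.
# Fees in satoshis, sizes in vbytes; names are illustrative only.
fees  = {"A": 250, "B": 50000, "C": 50000, "D": 52000, "E": 49750}
sizes = {"A": 250, "B": 50000, "C": 50000, "D": 500,   "E": 250}
parents = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["A"]}
children = {tx: [c for c, ps in parents.items() if tx in ps] for tx in fees}

def closure(tx, rel):
    """tx plus everything reachable through rel (its ancestors or descendants)."""
    out, stack = set(), [tx]
    while stack:
        cur = stack.pop()
        if cur not in out:
            out.add(cur)
            stack.extend(rel[cur])
    return out

for tx in sorted(fees):
    anc, desc = closure(tx, parents), closure(tx, children)
    anc_rate = sum(fees[a] for a in anc) / sum(sizes[a] for a in anc)
    desc_rate = sum(fees[d] for d in desc) / sum(sizes[d] for d in desc)
    print(f"{tx}: ancestor {anc_rate:.2f}, descendant {desc_rate:.2f} sat/byte")
```

Running it confirms that A's descendant feerate (2) is the lowest in the diagram, while E's ancestor feerate (100) is the highest.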
RBF is not incentive compatible
There are many forms of incentive incompatibility baked into the BIP 125 replacement rules:
- The “no-new-parents” rule doesn’t consider the replacement transaction’s desirability to a miner – it might be better than all the to-be-replaced transactions, despite the new unconfirmed parent.
- The rule requiring that a new transaction pays a higher feerate than its direct conflicts is insufficient to ensure that the new transaction is more desirable than what is being evicted. One of the direct conflicts could have a smaller set of unconfirmed parents than the new transaction, resulting in a higher ancestor feerate. Moreover, because we don’t compare feerates of any indirect conflicts – that is, descendants of direct conflicts – with the feerate of the new transaction, it’s possible that we’re evicting a child transaction that would otherwise be mined in the next block, in favor of something that will not be mined for a long time.
See #26451 for more discussion of the second point.
Proposed solution
If we’re able to maintain a total ordering on the mempool, then we can design efficient algorithms that will solve these problems.
How to maintain a total ordering
With the current algorithm and rules, it is infeasible to continuously maintain a fully-ordered mempool at all times. Ordering an $n$-transaction mempool from scratch scales with $O(d n \log n)$, where $d$ is the descendant limit (default 25), which quickly becomes too much when the mempool grows beyond a trivial size. And unfortunately, doing this incrementally (just re-ordering affected transactions whenever updates happen) does not help, because even a single added transaction can change the ordering of all remaining ones.
[Edited to use a better example] As an example of this effect, consider the following transaction graph (all transactions are same size, with ancestor feerates of children in parentheses):
```mermaid
graph TD;
    A[Tx A: fee 16]-->D["Tx D: fee 17 (16.5)"];
    A-->E["Tx E: fee 19 (16)"];
    B[Tx B: fee 13]-->E;
    B-->F["Tx F: fee 23 (14.67)"];
    C[Tx C: fee 8]-->F;
```
With these ancestor feerates, Tx D should be included first, along with its parent Tx A. Once Tx A is included, Tx E will have the next highest ancestor feerate and is included next with its remaining parent Tx B, and so on, producing the ordering: [A, D, B, E, C, F].
Now consider a single transaction being added, Tx G:
```mermaid
graph TD;
    A[Tx A: fee 16]-->D["Tx D: fee 17 (16.5)"];
    A-->E["Tx E: fee 19 (16)"];
    B[Tx B: fee 13]-->E;
    B-->F["Tx F: fee 23 (14.67)"];
    C[Tx C: fee 8]-->F;
    C-->G["Tx G: fee 29 (18.5)"];
```
Tx G now has the highest ancestor feerate, and so it would be chosen first in our ancestor-feerate-based ordering. Once Tx G and its parent Tx C are included, Tx F will see its ancestor feerate increase to 18 (as its only remaining ancestor is Tx B), making it the next best to include (along with Tx B). Once Tx B is in, Tx E has the next highest ancestor feerate, and we can see that the final ordering of transactions is: [C, G, B, F, A, E, D]. Note that the order in which the child transactions appear has been reversed, and moreover, this graph could be extended with arbitrarily many transactions while retaining this behavior.
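To make this effect easy to reproduce, here is a simplified Python sketch of ancestor-feerate-based selection (illustrative only, not Bitcoin Core's actual mining code, and ignoring tie-breaking); running it prints both orderings from the example:

```python
def ancestor_feerate_linearize(fees, sizes, parents):
    """Repeatedly pick the tx with the best ancestor-set feerate and include
    it along with its not-yet-included ancestors (parents before children)."""
    remaining, order = set(fees), []

    def ancestor_set(tx):
        out, stack = set(), [tx]
        while stack:
            cur = stack.pop()
            if cur not in out:
                out.add(cur)
                stack.extend(p for p in parents.get(cur, ()) if p in remaining)
        return out

    def ancestor_feerate(tx):
        anc = ancestor_set(tx)
        return sum(fees[a] for a in anc) / sum(sizes[a] for a in anc)

    while remaining:
        anc = ancestor_set(max(remaining, key=ancestor_feerate))
        while anc:
            # Emit any member whose parents (within the set) are already emitted.
            for tx in [t for t in anc
                       if not any(p in anc for p in parents.get(t, ()))]:
                order.append(tx)
                anc.remove(tx)
                remaining.remove(tx)
    return order

fees = {"A": 16, "B": 13, "C": 8, "D": 17, "E": 19, "F": 23}
sizes = {tx: 1 for tx in fees}  # all transactions the same size
parents = {"D": ["A"], "E": ["A", "B"], "F": ["B", "C"]}
print(ancestor_feerate_linearize(fees, sizes, parents))
# ['A', 'D', 'B', 'E', 'C', 'F']

fees["G"], sizes["G"], parents["G"] = 29, 1, ["C"]
print(ancestor_feerate_linearize(fees, sizes, parents))
# ['C', 'G', 'B', 'F', 'A', 'E', 'D']
```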
Fundamentally this means that we need to bound the number of transactions whose ordering can be affected by individual changes to the mempool. Our main insight is that this number is exactly the size of the largest “cluster” of transactions that we might have in the mempool, where a “cluster” is a maximal set of transactions that are connected via ancestor/descendant relationships in the transaction graph.
Our proposal is thus to introduce a limit on the size of clusters in the mempool. It then becomes feasible to maintain an ordering for all the individual clusters at all times, and perform a quick merge sort over all of them whenever a total ordering of the entire mempool is needed.
Definitions
Clusters
We associate the transactions in the mempool with a graph, where each transaction corresponds to a vertex and two transactions are adjacent (ie have an edge between them) if one spends any outputs of the other. We say that two transactions are in the same cluster if there exists a path of edges between them. The overall graph of transactions is partitioned into disjoint clusters, where each transaction is in exactly one cluster.
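As an illustration, clusters can be computed as connected components using a standard union-find over the spending relationships; this is just a sketch with made-up names, not a proposal for the actual data structure:

```python
def compute_clusters(txids, spends):
    """Group transactions into clusters, i.e. connected components of the
    graph whose edges are (parent, child) spending relationships."""
    rep = {tx: tx for tx in txids}  # each tx starts as its own representative

    def find(tx):
        while rep[tx] != tx:
            rep[tx] = rep[rep[tx]]  # path compression
            tx = rep[tx]
        return tx

    for parent, child in spends:
        rep[find(parent)] = find(child)  # union the two components

    clusters = {}
    for tx in txids:
        clusters.setdefault(find(tx), []).append(tx)
    return list(clusters.values())

# Example: B and C spend A; D is unrelated -> clusters {A, B, C} and {D}.
print(compute_clusters(["A", "B", "C", "D"], [("A", "B"), ("A", "C")]))
```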
Linearizations
We introduce the term “linearization” to refer to any topologically valid ordering of transactions, that is, any ordering of transactions that would be valid in a block under Bitcoin’s consensus rules. This could be an ancestor-feerate-based ordering of transactions (like we use in the mining code today), or it could be the optimal ordering (one where we repeatedly choose, from among all topologically valid subsets of the remaining transactions, the subset that maximizes feerate, until all transactions are selected). Or it could be something in between; for the purposes of discussion we abstract away the details of the cluster ordering algorithm, or “linearization”, that we might settle on.
Chunks
Our intuition for cluster linearization algorithms is that we will often choose transactions with lower feerate before transactions with higher feerate, because of topology constraints. This is a consequence of behaviors like “child-pays-for-parent”. So when thinking about how we might compare transactions from two different clusters to determine which one is “better” (eg for mining) or “worse” (eg for eviction), we don’t just want to look at the first or last transaction in the cluster. Instead, our intuition is that we want to look at the same package of transactions that our ordering algorithm would have used in coming up with the ordering in the first place.
This concept, that transactions should be looked at in some aggregate with other transactions, can be generalized in a way that is indifferent to the underlying linearization algorithm that is used. For a given linearized cluster $[tx_1, \ldots, tx_n]$, we can partition the cluster into a series of disjoint “chunks” $[c_1, c_2, \ldots, c_j]$ where $c_1 = [tx_1, \ldots, tx_{i_1}]$, $c_2 = [tx_{i_1+1}, \ldots, tx_{i_2}]$, …, $c_j = [tx_{i_{j-1}+1}, \ldots, tx_n]$, such that the feerate of each chunk is no greater than the feerate of the chunk that precedes it.
Another way of putting it: chunks are the partitioning of a cluster that respects the linearization in a way that ensures that feerates from one chunk to the next are monotonically decreasing.
(Note that due to ties there can be more than one chunking that is compatible with the given definition; generally we’ll prefer smaller chunks and therefore not merge together chunks with the same feerate.)
Given any cluster linearization, it turns out that it’s possible to construct the chunks for the cluster in linear time, and this operation is independent of the linearization algorithm that may have been used.
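For example, one way to do this is a single stack-based pass over the linearization; the sketch below (made-up names, Python for brevity) merges a transaction backwards whenever it beats the chunk before it, which takes amortized linear time:

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    txs: list   # txids, in linearization order
    fee: int    # total fee in satoshis
    size: int   # total size in vbytes

def higher_feerate(a, b):
    # a.fee / a.size > b.fee / b.size, compared without floating point
    return a.fee * b.size > b.fee * a.size

def chunk_linearization(linearized):
    """Split a linearized cluster, given as (txid, fee, size) tuples in
    linearization order, into chunks with non-increasing feerates.
    Amortized linear time: each transaction is merged at most once."""
    chunks = []
    for txid, fee, size in linearized:
        chunks.append(Chunk([txid], fee, size))
        # While the newest chunk has a strictly higher feerate than the one
        # before it, merge it into the earlier chunk.  Strict comparison keeps
        # equal-feerate chunks separate ("prefer smaller chunks", as above).
        while len(chunks) >= 2 and higher_feerate(chunks[-1], chunks[-2]):
            top = chunks.pop()
            chunks[-1].txs += top.txs
            chunks[-1].fee += top.fee
            chunks[-1].size += top.size
    return chunks

# A low-fee parent followed by a high-fee child collapses into one chunk:
print(chunk_linearization([("parent", 100, 100), ("child", 1000, 100)]))
# [Chunk(txs=['parent', 'child'], fee=1100, size=200)]
```

The mining and eviction sketches further below reuse this Chunk type.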
Implementing fundamental mempool operations
For the sake of discussion below, let’s now imagine that our mempool is grouped into its different clusters, and each cluster is linearized using some algorithm (perhaps the ancestor-feerate-based mining algorithm in use today, or perhaps an optimal ordering), and those linearized clusters are each chunked into their monotonically decreasing feerate order. Given such a mempool, we can design algorithms to implement the fundamental mempool operations.
Mining
Our goal with mining is to maximize the fees in a block template, and we do so by applying a greedy algorithm.
- Make a highest-feerate-sorted heap consisting of the first chunk from each cluster. Note that these will be each cluster’s highest feerate chunk.
- While the candidate block is not yet full:
  1. Remove the top of the heap and add it to the block candidate.
  2. If the cluster from which the selection was made has more chunks left, insert the next best chunk from that cluster into the heap.
Note that when we approach the end of a block, we run into a knapsack problem where chunks may not fit. How we might optimize the end of a block is deferred for later discussion.
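Here's a sketch of that loop in Python, reusing the Chunk type from the chunking sketch above; the function name, the float feerate keys, and the simple size accounting are all illustrative:

```python
import heapq

def build_block_template(clusters, max_block_vbytes):
    """Greedily fill a block from per-cluster chunk lists, each already in
    non-increasing feerate order (see the chunking sketch above)."""
    heap = []  # entries: (-feerate, cluster index, chunk index); max-heap on feerate
    for i, cluster in enumerate(clusters):
        if cluster:
            heapq.heappush(heap, (-cluster[0].fee / cluster[0].size, i, 0))

    selected, used = [], 0
    while heap and used < max_block_vbytes:
        _, i, j = heapq.heappop(heap)
        chunk = clusters[i][j]
        if used + chunk.size > max_block_vbytes:
            # Later chunks of this cluster may depend on this one, so stop
            # taking from this cluster entirely.  (Smarter end-of-block
            # "knapsack" handling is deferred, as noted above.)
            continue
        selected.extend(chunk.txs)
        used += chunk.size
        if j + 1 < len(clusters[i]):
            nxt = clusters[i][j + 1]
            heapq.heappush(heap, (-nxt.fee / nxt.size, i, j + 1))
    return selected
```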
Eviction
Our goal with eviction is to remove the transactions that would be the last ones selected by our mining algorithm. With a fully ordered mempool, this calculation is straightforward and mirrors the mining algorithm.
- Make a lowest-feerate-sorted heap consisting of the last chunk from each cluster. Note that these will be each cluster’s lowest feerate chunk.
- While the mempool is over its size limit:
  1. Remove the top of the heap and evict it from the mempool.
  2. If the cluster from which the selection was made has more chunks left, insert the next lowest feerate chunk from that cluster into the heap.
As with our current eviction algorithm, we will also bump up the mempool’s minimum fee to be an increment above the highest feerate chunk that was evicted.
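The eviction loop mirrors the mining sketch almost exactly; again this is only an illustrative sketch using the same structures (real code would also track the last evicted chunk's feerate to update the mempool minimum fee as described above):

```python
import heapq

def evict_to_size(clusters, usage, max_usage):
    """Evict worst chunks first until memory usage is back under the limit.
    `clusters` holds per-cluster chunk lists in non-increasing feerate order."""
    heap = []  # entries: (feerate, cluster index, chunk index); min-heap on feerate
    for i, cluster in enumerate(clusters):
        if cluster:
            j = len(cluster) - 1
            heapq.heappush(heap, (cluster[j].fee / cluster[j].size, i, j))

    evicted = []
    while heap and usage > max_usage:
        _, i, j = heapq.heappop(heap)
        chunk = clusters[i][j]
        evicted.extend(chunk.txs)
        usage -= chunk.size
        if j > 0:
            prev = clusters[i][j - 1]
            heapq.heappush(heap, (prev.fee / prev.size, i, j - 1))
    return evicted  # the last-evicted chunk's feerate drives the new minimum fee
```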
RBF
Our goal with RBF is to allow a “better” transaction (something a miner would prefer to have) to evict some existing, conflicting transactions in a way that is DoS-resistant but is otherwise as permissive as practical.
Given a mempool where all clusters are ordered and chunked, we can assign each transaction a score: the feerate of the chunk to which it belongs. Then our main RBF incentive compatibility rule just becomes:
- A replacement transaction must have a chunk feerate greater than that of all transactions which it would evict.
This would eliminate the BIP 125 rule against replacements introducing new unconfirmed parents, and would replace the BIP 125 feerate rule requiring that a replacement transaction have a higher individual feerate than all of its direct conflicts.
However, at least for now, we would retain some of the additional anti-DoS rules:
- A replacement transaction must pay more total fees than the sum of fees paid by all evicted transactions.
- A replacement transaction must not evict more than (100?) existing transactions.
The first of those rules prevents free-relay attacks, while some version of the second rule is necessary to prevent us from needing to re-cluster/re-order too much of the mempool while processing a single transaction.
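Putting the three checks together, a hypothetical acceptance function might look like the sketch below; the names, the 100-transaction bound, and the assumption that the replacement's would-be chunk feerate has already been computed (by re-linearizing the affected clusters) are all illustrative:

```python
from collections import namedtuple

# Each to-be-evicted mempool transaction, with the feerate of the chunk it
# currently belongs to (both fields hypothetical, for illustration).
Conflict = namedtuple("Conflict", ["fee", "chunk_feerate"])

def replacement_allowed(new_fee, new_chunk_feerate, conflicts, max_evictions=100):
    """conflicts: all direct conflicts plus their descendants."""
    if len(conflicts) > max_evictions:
        return False  # bound how much of the mempool one tx can make us rework
    if any(new_chunk_feerate <= c.chunk_feerate for c in conflicts):
        return False  # must be more attractive to mine than everything evicted
    if new_fee <= sum(c.fee for c in conflicts):
        return False  # must pay more total fees than what it evicts (anti-free-relay)
    return True
```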
Some open questions
- Philosophically, is it problematic for RBF rules to be even more of a “black box” than the BIP 125 algorithm?
- Are there usage patterns for wallets or other protocols that would be fundamentally harmed by cluster size limits? Do we know what values would be acceptable or what values would be insufficient?
- What can we do if a high feerate transaction arrives that would violate the cluster limit? Can we design some sort of “cluster-sibling eviction” algorithm that is DoS-resistant and would help reduce the downsides of cluster size limits?
- Is it problematic if linearization algorithms used on the network are non-deterministic?
  - We could imagine using a variety of linearization algorithms at any given moment – perhaps for small clusters we use an exponential run-time optimal ordering algorithm, while for medium sized clusters we do something different, like an ancestor-feerate-based ordering, or an optimal ordering with some bounded computation limits.
  - Are there situations where different nodes having different orders causes meaningful user-facing issues with transactions not relaying effectively, eg because of the use of chunk feerates in the RBF algorithm?
- Chunk sizes should be bounded in vbytes, so that we don’t exacerbate the knapsack problem during mining. However, what is the best way to do that?
  - We could just bound the vbytes allowed in a single cluster, which would guarantee that chunks are not too big.
  - Or we could design linearization algorithms and adapt our chunking logic to never produce chunks bigger than a certain size – but this would come at the expense of chunks within a cluster no longer having monotonically decreasing feerates. This may be possible to do, but is the need for large clusters (measured in vbytes) great enough to warrant the complexity that this would introduce in our algorithms?
Current status and next steps
- First draft implementation of this logic is done (link to be added).
- Analyzing the effects of this logic on historical transaction data is in progress, to try to answer questions like:
  - How big are the clusters we’ve historically seen / what is the cluster size distribution on the network today?
  - How would the network have been affected if we’d had cluster sizes bounded at various different values in the past?
- Performance analysis of all the mempool operations (particularly as a function of cluster size), to ensure that these operations are well optimized with this approach.
- The information gathered will (eventually) inform a concrete proposal for what cluster size limits we might consider (along with chunk vbyte and/or cluster vbyte limits).