sdaftuar
commented at 6:59 pm on October 18, 2023:
member
This is a draft implementation of the cluster mempool design described in #27677. I’m opening this as a draft PR now to share the branch I’m working on with others, so that we can start to think about in-progress projects (like package relay, package validation, and package rbf) in the context of this design. Also, I can use some help from others for parts of this work, including the interaction between the mempool and the wallet, and also reworking some of our existing test cases to fit a cluster-mempool world.
Note that the design of this implementation is subject to change as I continue to iterate on the code (to make the code more hygienic and robust, in particular). At this point though I think the performance is pretty reasonable and I’m not currently aware of any bugs. There are some microbenchmarks added here, and some improved fuzz tests; it would be great if others ran both of those on their own hardware as well and reported back on any findings.
This branch implements the following observable behavior changes:
Maintains a partitioning of the mempool into connected clusters
Each cluster is sorted (“linearized”) either using an optimal sort, or an ancestor-feerate-based one, depending on the size of the cluster (thanks to @sipa for this logic)
Transaction selection for mining is updated to use the cluster linearizations
Mempool eviction is updated to use the cluster linearizations
The RBF rules are updated to drop the requirement that no new inputs are introduced, and to change the feerate requirement to instead check that the mining score of a replacement transaction exceed the mining score of the conflicted transactions
The CPFP carveout rule is eliminated (it doesn’t make sense in a cluster-limited mempool)
The ancestor and descendant limits are no longer enforced.
New cluster count/cluster vsize limits are now enforced instead.
Some less observable behavior changes:
The cached ancestor and descendant data are dropped from the mempool, along with the multi_index indices that were maintained to sort the mempool by ancestor and descendant feerates. For compatibility (eg with wallet behavior or RPCs exposing this), this information is now calculated dynamically instead.
The ancestor and descendant walking algorithms are now implemented using epochs (resulting in a significant performance improvement, according to the benchmarks I’ve looked at)
Still to do:
More comparisons between this branch and master on historical data to compare validation speed (accepting loose transactions, processing RBF transactions, validating a block/postprocessing, updating the mempool for a reorg).
More historical data analysis to try to evaluate the likely impact of setting the cluster size limits to varying values (to motivate what values we should ultimately pick). [DONE, see this post]
Updating wallet code to be cluster-aware (including mini_miner and coin selection)
Rework many of our functional tests to be cluster-aware
Figure out what package validation and package RBF rules should be in this design
Rework the partially_downloaded_block fuzz target to not add duplicate transactions to the mempool (#29990).
Update RBF logic to ensure that replacements always strictly improve the mempool.
Figure out how we want to document our RBF policy (preserve historical references to BIP 125 or previous Bitcoin Core behaviors vs clean slate documentation?)
For discussion/feedback:
How significant is it to be dropping the CPFP carveout rule? Does that affect how we will ultimately want to stage new mempool deployment?
How well do the proposed RBF rules meet everyone’s use cases?
What design improvements can we make to the cluster tracking implementation?
The ZMQ callbacks that occur when a block is found will happen in a slightly different order, because we now will fully remove all transactions occurring in a block from the mempool before removing any conflicts. Is this a problem?
DrahtBot
commented at 6:59 pm on October 18, 2023:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
#28132 (policy: Enable full-rbf by default by petertodd)
#28121 (include verbose “debug-message” field in testmempoolaccept response by pinheadmz)
#26593 (tracing: Only prepare tracepoint arguments when actually tracing by 0xB10C)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
in doc/policy/mempool-replacements.md:29 in b0f771bb55 (outdated)
23- currently-unconfirmed transaction.
24-
25- *Rationale*: When RBF was originally implemented, the mempool did not keep track of
26- ancestor feerates yet. This rule was suggested as a temporary restriction.
27-
28-3. The replacement transaction pays an absolute fee of at least the sum paid by the original
instagibbs
commented at 7:11 pm on October 18, 2023:
might be easier to just “strike out” this rule and mark it deprecated instead of changing numbers since that’s become engineer lingo
sdaftuar
commented at 4:52 pm on December 7, 2023:
I’m not sure the best way to document things, but I added a to-do item to the OP so that we don’t lose track of figuring out how we want the docs to look. (Will have to conform all the code comments as well.)
DrahtBot added the label
CI failed
on Oct 18, 2023
instagibbs
commented at 3:13 pm on October 19, 2023:
shouldn’t this be 0?
in src/policy/rbf.cpp:65 in 3a42ba255a (outdated)
88- std::set<uint256> parents_of_conflicts;
89- for (const auto& mi : iters_conflicting) {
90- for (const CTxIn& txin : mi->GetTx().vin) {
91- parents_of_conflicts.insert(txin.prevout.hash);
92+ // Exit early if we're going to fail (see below)
93+ if (all_conflicts.size() > MAX_REPLACEMENT_CANDIDATES) {
instagibbs
commented at 3:40 pm on October 19, 2023:
Note: this is sticking with the same rule #5 instead of the number of affected clusters. It would be more ideal if it were the number of clusters, to allow for better usage of adversarial-ish batched CPFPs
sdaftuar
commented at 12:48 pm on October 25, 2023:
There is room to relax this rule some, so if this is important we can do so. I think the requirement is a bound on the number of clusters that would have to be re-sorted in order to accept the new transaction. We can approximate that as the number of clusters that would be non-empty as a result of removing all the conflicting transactions from the mempool, and only process replacements for which that is below some target.
That would be a more complex logic though, so before implementing it I wanted to have some sense of whether we need to. Has the historical 100-transaction-conflict limit been problematic for use cases in the past? Note also that in the new code, we are calculating the number of conflicts exactly (the old code used an approximation, which could be gamed by an adversary).
instagibbs
commented at 1:52 pm on October 25, 2023:
Ah! I wrote a huge response to this, then looked up our previous discussions, and realized I didn’t actually read the code: #27677 (comment)
IIUC now, this is only counting direct conflicts, and not the descendants that are booted.
I think that’s fine.
Actually no, the existing code comments were just misleading, looks like the issue still exists, see: #27677 (comment)
sdaftuar
commented at 1:59 pm on October 25, 2023:
Ah yes, in the new code I’m only counting direct conflicts right now, because every descendant of a direct conflict must be in the same cluster as that conflict. So this is already a relaxation of the existing rule.
instagibbs
commented at 2:01 pm on October 25, 2023:
I think the requirement is a bound on the number of clusters that would have to be re-sorted in order to accept the new transaction.
As an alternative, we drop the replacement limit to like, 10 or something, and then only count the direct conflicts, not the direct conflicts and all the descendants?
sdaftuar
commented at 5:00 pm on December 7, 2023:
I believe I (finally) actually fixed this behavior to count the number of direct conflicts.
This is still needed for the same reason as before: a transaction that is above minrelay, but would be in a chunk below minrelay. We could, for example, immediately evict below-minrelay chunks post-re-linearization, which would allow 0-fee parents; then maybe relax this.
in src/bench/mempool_stress.cpp:99 in c5d3c42da4 (outdated)
95+ }
96+ const auto testing_setup = MakeNoLogFileContext<const TestingSetup>(ChainType::MAIN);
97+ CTxMemPool& pool = *testing_setup.get()->m_node.mempool;
98+
99+ std::vector<CTransactionRef> transactions;
100+ // Create 1000 clusters of 100 transactions each
instagibbs
commented at 9:01 pm on October 19, 2023:
numbers in comments are off
shower thought: Should we/can we bound the number of clusters in addition to total memory in TrimToSize? I can’t think of a good way to do that that doesn’t complicate things quite a bit, and perhaps practical mempool sizes make this moot. Just something to consider in case I missed something obvious.
sdaftuar
commented at 7:24 pm on October 24, 2023:
The immediate downside to a cap on number of clusters is that singleton, high-feerate transactions would not be accepted. And I don’t think we need to – the only places where having more clusters makes us slower is in eviction and mining, and for both of those use cases we could improve performance (if we need to) by maintaining the relevant heap data structures (or something equivalent) as chunks are modified, rather than all at once.
For now in this branch I’ve created these from scratch each time, but if it turns out that performance is meaningfully impacted when the mempool is busy, then I can optimize this further by just using a bit more memory.
sdaftuar force-pushed
on Oct 19, 2023
glozow added the label
Mempool
on Oct 20, 2023
in src/txmempool.cpp:1481 in 26e831d5f2 (outdated)
1477+ // TODO: branch on size of fee to do this as 32-bit calculation
1478+ // instead? etc
1479+ return a.first->fee*b.first->size > b.first->fee*a.first->size;
1480+ };
1481+
1482+ std::make_heap(heap_chunks.begin(), heap_chunks.end(), cmp);
instagibbs
commented at 7:39 pm on October 20, 2023:
You probably didn’t mean to call make_heap in the loop for 3N work each time. fwiw I don’t see any performance difference between push_heaping all the elements on vs one make_heap.
once this is changed, add+trimming seems to be faster than in master regardless of topology tested
sdaftuar
commented at 5:03 pm on December 7, 2023:
Should be fixed now.
in src/node/miner.cpp:248 in 26e831d5f2 (outdated)
381+ // TODO: branch on size of fee to do this as 32-bit calculation
382+ // instead? etc
383+ return a.first->fee*b.first->size < b.first->fee*a.first->size;
384+ };
385+ // TODO: replace the heap with a priority queue
386+ std::make_heap(heap_chunks.begin(), heap_chunks.end(), cmp);
instagibbs
commented at 7:46 pm on October 20, 2023:
don’t ask why, but I’m getting a significant performance improvement (>10%) just push_heaping everything from scratch, and similarly with priority_queue
DrahtBot added the label
Needs rebase
on Oct 30, 2023
in src/bench/mempool_stress.cpp:261 in dd6684ab66 (outdated)
dd6684ab665bc8ae76b76fdd2e578fc77d562a52: While you’re touching this, can you rename MempoolCheck to MemPoolCheck, MempoolEviction to MemPoolEviction and ComplexMemPool to MemPoolComplex? That makes -filter=MemPool.* work
As a workaround, -filter=.*Mem.* does work.
Sjors
commented at 10:29 am on November 14, 2023:
member
It would be useful to add a mempool_backwards_compatibility.py test to illustrate how the new rules interact with older nodes. It could have two modern nodes and one v25 (or v26) node. Some of the tests you deleted in this branch could be moved there. E.g. the test could demonstrate how RBF rule 2 is not enforced when relaying to the new node, but it is when relaying to the v25 node.
Benchmarks on a 2019 MacBook Pro (2,3 GHz 8-Core Intel Core i9), plugged in:
573@@ -574,6 +574,8 @@ void SetupServerArgs(ArgsManager& argsman)
574 argsman.AddArg("-limitancestorsize=<n>", strprintf("Do not accept transactions whose size with all in-mempool ancestors exceeds <n> kilobytes (default: %u)", DEFAULT_ANCESTOR_SIZE_LIMIT_KVB), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
575 argsman.AddArg("-limitdescendantcount=<n>", strprintf("Do not accept transactions if any ancestor would have <n> or more in-mempool descendants (default: %u)", DEFAULT_DESCENDANT_LIMIT), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
576 argsman.AddArg("-limitdescendantsize=<n>", strprintf("Do not accept transactions if any ancestor would have more than <n> kilobytes of in-mempool descendants (default: %u).", DEFAULT_DESCENDANT_SIZE_LIMIT_KVB), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
577+ argsman.AddArg("-limitclustercount=<n>", strprintf("Do not accept transactions connected to <n> or more existing in-mempool transactions (default: %u)", DEFAULT_CLUSTER_LIMIT), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
I (naively) replaced ancestor and descendent limits in coin selection with the new cluster limit. At least the tests pass *.
When we drop these settings anyone who uses them will get an error when starting the node. That’s probably a good thing, since they should read about this change.
* = well, wallet_basic.py fails with:
Internal bug detected: Shared UTXOs among selection results
wallet/coinselection.h:340 (InsertInputs)
in src/txmempool.cpp:1857 in 54f39ca8f1 (outdated)
1549+
1550+void Cluster::RemoveTransaction(const CTxMemPoolEntry& entry)
1551+{
1552+ m_chunks[entry.m_loc.first].txs.erase(entry.m_loc.second);
1553+
1554+ // Chunk (or cluster) may now be empty, but this will get cleaned up
54f39ca8f101483f5f82707689ca49431d4091e5: what if the deleted transaction makes it so there are now two clusters? Is this also safe to ignore?
sdaftuar
commented at 1:33 am on December 10, 2023:
I wouldn’t say it’s safe to ignore, but the idea is that we often want to be able to batch deletions, and then clean things up in one pass. So the call sites should all be dealing with this issue and ensuring that we always clean up at some point.
(This is definitely an area where I expect that we’ll be re-engineering all this logic and trying to come up with a better abstraction layer so that this is more robust and easier to think about!)
54f39ca8f101483f5f82707689ca49431d4091e5: So you’re creating a chunk for each new transaction and then erasing it if the fee rate goes down. Why not the other way around?
in src/txmempool.cpp:1877 in 54f39ca8f1 (outdated)
1569+ m_chunks.emplace_back(txentry.get().GetModifiedFee(), txentry.get().GetTxSize());
1570+ m_chunks.back().txs.emplace_back(txentry);
1571+ while (m_chunks.size() >= 2) {
1572+ auto cur_iter = std::prev(m_chunks.end());
1573+ auto prev_iter = std::prev(cur_iter);
1574+ double feerate_prev = prev_iter->fee*cur_iter->size;
Sjors
commented at 2:27 pm on November 14, 2023:
member
Couple more comments / questions. To be continued…
in src/txmempool.h:440 in 065e18fff3 (outdated)
633@@ -632,6 +634,22 @@ class CTxMemPool
634 */
635 void UpdateTransactionsFromBlock(const std::vector<uint256>& vHashesToUpdate) EXCLUSIVE_LOCKS_REQUIRED(cs, cs_main) LOCKS_EXCLUDED(m_epoch);
636
637+ /**
638+ * Calculate whether cluster size limits would be exceeded if a new tx were
639+ * added to the mempool (assuming no conflicts).
065e18fff30a91c94c47d5c2fc65b13ddc38aa47: do you plan to relax this assumption so that a transaction in (a) full cluster(s) can be replaced?
Update: AcceptSingleTransaction skips ClusterSizeChecks if there’s a replacement, in which case ReplacementChecks checks the new cluster size. So this is not an issue.
0b284b5fd29c06d46c1ec60ea7e1bcd07f36feb1: can you order chunks by mining score?
sdaftuar
commented at 4:55 pm on December 7, 2023:
Chunks for a given cluster are already sorted in descending feerate (ie mining score) order, is that what you’re asking about or is there another issue I’m overlooking?
It would be good to explain the rationale for the miner score somewhere.
Why a * b and b * a?
And why is the fee rate not a_fee / a_size? (the inverse score?)
sipa
commented at 1:53 pm on November 20, 2023:
member
@Sjors I believe most if not all of this PR will be rewritten, and split up into several components. The goal here is just to give an idea of the high-level interactions with other changes (wallet behavior, package validation/relay/RBF, …). I don’t think a detailed line-by-line code review at this stage is a good use of your time.
0b284b5fd29c06d46c1ec60ea7e1bcd07f36feb1 : Could this use CompareMiningScore or does it have a different goal?
in doc/policy/mempool-replacements.md:49 in bf467f8286 (outdated)
45@@ -53,12 +46,10 @@ other consensus and policy rules, each of the following conditions are met:
46 significant portions of the node's mempool using replacements with multiple directly conflicting
47 transactions, each with large descendant sets.
48
49-6. The replacement transaction's feerate is greater than the feerates of all directly conflicting
50+5. The replacement transaction's mining score is greater than the mining score of all directly conflicting
bf467f8286425b692a1736ab6d417d0ba6074658 Maybe make this rule 7. That seems a bit more clear than saying “previously this rule referred to fee rate”
in doc/policy/mempool-replacements.md:27 in bf467f8286 (outdated)
27-
28-3. The replacement transaction pays an absolute fee of at least the sum paid by the original
29+2. The replacement transaction pays an absolute fee of at least the sum paid by the original
30 transactions.
31
32 *Rationale*: Only requiring the replacement transaction to have a higher feerate could allow an
bf467f8286425b692a1736ab6d417d0ba6074658: “a higher feerate or (using clusters) mining score”
Is the “Additionally” reasoning still valid?
in doc/policy/mempool-replacements.md:53 in bf467f8286 (outdated)
54- preferable for block-inclusion, compared to what would be removed from the mempool. This rule
55- predates ancestor feerate-based transaction selection.
56+ *Rationale*: Ensure that the new transaction is more appealing to mine than those being evicted.
57
58 This set of rules is similar but distinct from BIP125.
59
bf467f8286425b692a1736ab6d417d0ba6074658: history can be expanded:
* Cluster mempool introduced, dropping rule 2 and introducing rule 7.
  As of **v27.0** ([PR 28676](https://github.com/bitcoin/bitcoin/pull/28676)).
in src/policy/rbf.cpp:116 in bf467f8286 (outdated)
116+ // directly replaced, because descendant transactions which pay for the
117+ // parent will be reflected in the parent's chunk feerate.
118+ Cluster::Chunk &chunk = mi->m_cluster->m_chunks[mi->m_loc.first];
119+ CFeeRate original_feerate(chunk.fee, chunk.size);
120 if (replacement_feerate <= original_feerate) {
121 return strprintf("rejecting replacement %s; new feerate %s <= old feerate %s",
bf467f8286425b692a1736ab6d417d0ba6074658: ; new chunk feerate %s <= old chunk feerate
Sjors
commented at 3:00 pm on November 20, 2023:
member
I was wondering if we can drop RBF rule 5 (in a followup), but I’m guessing not really. My initial thinking was the cluster limit could be used instead. But CalculateMiningScoreOfReplacementTx only checks that the new cluster doesn’t get too big.
But does the new cluster system make it less of a burden to have large numbers of transactions appear and disappear from the mempool?
—
I’m also trying to understand if dropping the CPFP carveout is fine. The original scenario described on the mailistlist:
TX_important
 * output Alice <- child_a_1 <- child_a_2 <- … <- child_a_25
   (^ intentionally low fees so it doesn't confirm before timeout)
 * output Bob: child_b_1
   ^ high fee
The carveout rule allows Bob’s child_b_1, despite Alice having used up the 25 ancestor limit for TX_important. And Alice can’t use the carveout to just add child_a_26, because child_a_26 has more than one unconfirmed ancestor.
So what happens in the cluster world?
For simplicity, let’s set -limitclustercount to 26 (the default is 100 in this PR). Without the CPFP carveout, can Bob still insert child_b_1?
My understanding is that he can’t, because ClusterSizeChecks will fail.
Instead of failing, can we check if it’s possible to evict the lowest-value chunk? Though it introduces an implicit RBF-like mechanism… If it were possible, then child_a_25 would be evicted from the cluster as long as it has a lower fee rate than child_b_1.
Alice might make sure that the fee rates of child_a_1 … child_a_25 increase such that there’s only one chunk. That just requires Bob to use a higher fee rate.
Can you clarify what txs, orig_txs and cluster are for, and what the general strategy is here?
Sjors
commented at 5:16 pm on November 20, 2023:
member
I don’t think a detailed line-by-line code review at this stage is a good use of your time.
That’s not what I’m trying to do. I strategically / haphazardly picked a dozen lines to ask questions about to get a better understanding of the whole thing. Based on IRC chat today, I’ll wait now for the upcoming update.
DrahtBot removed the label
Needs rebase
on Dec 7, 2023
sdaftuar force-pushed
on Dec 8, 2023
sdaftuar force-pushed
on Dec 10, 2023
sdaftuar force-pushed
on Dec 10, 2023
sdaftuar force-pushed
on Dec 10, 2023
DrahtBot removed the label
CI failed
on Dec 10, 2023
DrahtBot added the label
Needs rebase
on Dec 18, 2023
achow101 referenced this in commit
7143d43884
on Feb 10, 2024
ariard
commented at 3:45 am on February 19, 2024:
member
Do you have a simulation environment or script, like you did for #25717, to observe the computational complexity diff compared to today’s mempool against different chain-of-transactions performance payloads?
It would be interesting to know the lowest-performance Linux host assumed for decentralization of the tx-relay network.
Assume always-on 24/7 internet and a vCPU instance, not bare metal.
Sjors
commented at 3:05 pm on February 19, 2024:
member
In the context of Stratum v2 #29432 I’m looking for a more efficient way to decide when to generate a new block template.
The current implementation looks at the mempool’s GetTransactionsUpdated() every -sv2interval seconds. If there was any update, it calls CreateNewBlock() on BlockAssembler. It then checks if fees increased by at least -sv2feedelta since the last template and if so, pushes it to connected clients.
GetTransactionsUpdated() changes all the time, especially with a big mempool, so not that useful of a filter. I’d like something like GetFeeEstimateAtTop(size vbytes=4000000). Would it be easy to keep track of that (approximate) number every time something is added or removed from the mempool?
Ideally I’d like to change the meaning of -sv2interval from “check every n seconds and push update if needed” to “push better templates immediately, but wait least n seconds between them”.
sipa
commented at 3:10 pm on February 19, 2024:
member
Would it be easy to keep track of that (approximate) number every time something is added or removed from the mempool?
After clustermempool, yes. Today doing that inherently requires running the mining algorithm to figure out what transactions to put there.
Sjors
commented at 3:12 pm on February 19, 2024:
member
Indeed, I meant on top of this PR (or a later incarnation).
ariard
commented at 5:56 pm on February 22, 2024:
member
It would be interesting to know the lowest-performance Linux host assumed for decentralization of the tx-relay network.
Assume always-on 24/7 internet and a vCPU instance, not bare metal.
Running this branch on mainnet on a 2 vCPU instance, with the following performance characteristics:
It would be very interesting to feed a cluster-supporting test node with all kinds of chains of transactions, and compare performance against a non-cluster test node, to check there is no substantial performance regression.
sdaftuar force-pushed
on Mar 7, 2024
sdaftuar force-pushed
on Mar 7, 2024
DrahtBot removed the label
Needs rebase
on Mar 7, 2024
in src/txmempool.h:839 in 96df5acec3 (outdated)
833@@ -764,6 +834,9 @@ class CTxMemPool
834 */
835 void UpdateForDescendants(txiter updateIt, cacheMap& cachedDescendants,
836 const std::set<uint256>& setExclude, std::set<uint256>& descendants_to_remove) EXCLUSIVE_LOCKS_REQUIRED(cs);
837+ // During reorg we add transactions back to mempool, must reconnect
838+ // clusters with in-mempool descendants.
839+ void UpdateClusterForDescendants(txiter updateIt) EXCLUSIVE_LOCKS_REQUIRED(cs);
I think the introduction of Cluster alters the worst-case computational complexity during a re-org, when we reconnect the disconnectpool to the current mempool. Before, we were strictly bounded by DEFAULT_ANCESTOR_LIMIT / DEFAULT_DESCENDANT_LIMIT. Now we have a DEFAULT_CLUSTER_LIMIT, where the worst-case graph traversal algorithm might have to visit more elements than the previous max of 25.
in src/txmempool.cpp:122 in 96df5acec3 (outdated)
117+ }
118+ if (clusters_to_merge.size() > 1) {
119+ // Merge the other clusters into this one, but keep this cluster as
120+ // first so that it's topologically sound.
121+ clusters_to_merge[0]->Merge(clusters_to_merge.begin()+1, clusters_to_merge.end(), true);
122+ // TODO: limit the size of the cluster, in case it got too big.
I think this introduces a transaction censorship vector for time-sensitive second layers. If you can re-org one block and construct a cluster such that a target descendant is not reconnected with other clusters, the non-cluster-connected target descendant might have a stale feerate. This descendant could be pinned at the bottom of the local mempool.
in src/txmempool.cpp:559 in 96df5acec3 (outdated)
554+ // Only one parent cluster: add to it.
555+ clusters_to_merge[0]->AddTransaction(*newit, true);
556+ cachedInnerUsage += clusters_to_merge[0]->GetMemoryUsage();
557+ } else {
558+ cachedInnerUsage -= clusters_to_merge[0]->GetMemoryUsage();
559+ clusters_to_merge[0]->Merge(clusters_to_merge.begin()+1, clusters_to_merge.end(), false);
I think a counter should be introduced here to avoid iterating over more than DEFAULT_CLUSTER_LIMIT entries in this work_queue. It’s in RemoveStaged, itself called in Finalize; I think you might temporarily have unbounded numbers of conflicting descendant transactions.
in src/txmempool.cpp:1545 in 96df5acec3 (outdated)
1540+void Cluster::RemoveTransaction(const CTxMemPoolEntry& entry)
1541+{
1542+ m_chunks[entry.m_loc.first].txs.erase(entry.m_loc.second);
1543+
1544+ // Chunk (or cluster) may now be empty, but this will get cleaned up
1545+ // when the cluster is re-sorted (or when the cluster is deleted) Note:
I think you will have computational threshold effects with whatever limits you pick for max cluster size and the chunk-ordering parameters. Namely: what is the minimal chunk modification that can provoke the maximum re-ordering of all in-cluster chunks?
in src/txmempool.cpp:1569 in 96df5acec3 (outdated)
1564+ double feerate_prev = prev_iter->fee*cur_iter->size;
1565+ double feerate_cur = cur_iter->fee*prev_iter->size;
1566+ // We only combine chunks if the feerate would go up; if two
1567+ // chunks have equal feerate, we prefer to keep the smaller
1568+ // chunksize (which is generally better for both mining and
1569+ // eviction).
I think this is a broken assumption. You’re using virtual bytes (GetTxSize()); however, an equivalent-feerate cluster can have a smaller total weight in witness units, and as such a higher feerate w.r.t. the consensus rules (MAX_BLOCK_WEIGHT), so it should be selected for a mining block template.
in src/txmempool.cpp:1605 in 96df5acec3 (outdated)
1600+ txs.push_back(chunk_tx.get());
1601+ }
1602+ }
1603+ // Sorting by ancestor count is equivalent to topological sort.
1604+ std::sort(txs.begin(), txs.end(), [](const CTxMemPoolEntry::CTxMemPoolEntryRef& a, const CTxMemPoolEntry::CTxMemPoolEntryRef& b) {
1605+ return a.get().GetCountWithAncestors() < b.get().GetCountWithAncestors();
Note that the topological sort is not well-defined. At the very least, you can have a and b each being the first child of a single ancestor while still being the 2nd and 3rd child of a common ancestor. A strict inequality here cannot be said to be equivalent.
in src/rpc/mempool.cpp:259 in ce41413592 (outdated)
246@@ -247,34 +247,81 @@ static RPCHelpMan testmempoolaccept()
247 };
248 }
249
250+static std::vector<RPCResult> ClusterDescription()
251+{
252+ return {
253+ RPCResult{RPCResult::Type::NUM, "vsize", "virtual transaction size as defined in BIP 141. This is different from actual serialized size for witness transactions as witness data is discounted."},
254+ RPCResult{RPCResult::Type::NUM, "txcount", "number of transactions (including this one)"},
255+ RPCResult{RPCResult::Type::NUM, "clusterid", "id of the cluster containing this tx"},
About the clusterid (m_next_cluster_id): I don’t think it’s a good idea to use it as a tie-breaker in CompareMiningScore, as it’s just a monotonic counter; it’s not informative for a mining score. You could at least use weight, or reception timestamp (e.g. the oldest tx-relay reception is the most well-propagated on the network, already in every miner’s mempool).
in src/rpc/mempool.cpp:426 in ce41413592 (outdated)
419@@ -370,6 +420,76 @@ UniValue MempoolToJSON(const CTxMemPool& pool, bool verbose, bool include_mempoo
420 }
421 }
422423+static RPCHelpMan getmempoolfeesize()
424+{
425+ return RPCHelpMan{"getmempoolfeesize",
426+ "Returns fee/size data for the whole mempool.",
11+/** Data structure storing a fee and size, ordered by increasing fee/size.
12+ *
13+ * The size of a FeeFrac cannot be zero unless the fee is also zero.
14+ *
15+ * FeeFracs have a total ordering, first by increasing feerate (ratio of fee over size), and then
16+ * by decreasing size. The empty FeeFrac (fee and size both 0) sorts last. So for example, the
This is not perfect, as you could use weight units, and even have to consider what the minimal satisfying witness is under given consensus rules for a given transaction, especially if you have multiple candidates for a spend given a witnessScript. You already have such transaction types, like second-stage LN transactions (either the revocation path or the preimage/timeout path).
in src/util/feefrac.h:36 in bb1bc54a7c (outdated)
31+ * The >> and << operators only compare feerate and treat equal feerate but different size as
32+ * equivalent. The empty FeeFrac is neither lower or higher in feerate than any other.
33+ *
34+ * These comparisons are only guaranteed to be correct when the product of the highest fee and
35+ * highest size does not exceed 2^64-1. If the fee is a number in sats, and size in bytes, then
36+ * this allows up to 46116.86 BTC at size 4M, and 1844674.4 BTC at size 100k).
I think this should be MAX_MONEY, as ReplacementChecks() in PreChecks() is done before PolicyScriptChecks() and ConsensusScriptChecks(), so an adversary does not have to own the solution to the scriptPubKey to fake 46116.86 or 1844674.4 BTC of value in your local mempool, exceed 2^64-1, and mess up the linearization.
ariard
commented at 3:55 am on March 8, 2024:
member
I’ll spend time reviewing the current CTxMemPool code paths w.r.t. DoS resistance before pursuing review further. It is very unclear whether the introduction of TxGraph improves on this front, as the design focuses on mining incentive-compatibility improvements first.
ariard
commented at 4:01 am on March 8, 2024:
member
Running this branch on mainnet on a 2 vCPU instance, with the following performance characteristics:
Got a disk space issue at block 501691 while syncing from genesis.
2024-02-23T10:53:41Z UpdateTip: new best=000000000000000000753b6b9821750d271e1730d7403a0658c507c88092bdf0 height=501691 version=0x20000000 log2_work=87.760876 tx=287312998 date='2017-12-30T07:48:59Z' progress=0.295839 cache=16.3MiB(151170txo)
2024-02-23T10:53:41Z New outbound-full-relay v1 peer connected: version: 70016, blocks=831673, peer=196
2024-02-23T10:53:47Z *** Disk space is too low!
2024-02-23T10:53:47Z Error: Disk space is too low!
Was configured as a prune node with default pruning settings.
2024-02-22T17:49:33Z Prune configured to target 550 MiB on disk for block and undo files.
DrahtBot added the label
Needs rebase
on Mar 9, 2024
DrahtBot added the label
CI failed
on Apr 3, 2024
DrahtBot
commented at 7:04 pm on April 3, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot removed the label
Needs rebase
on Apr 26, 2024
sdaftuar force-pushed
on Apr 26, 2024
sdaftuar force-pushed
on Apr 26, 2024
sdaftuar force-pushed
on Apr 26, 2024
sdaftuar force-pushed
on Apr 26, 2024
sdaftuar force-pushed
on Apr 26, 2024
sdaftuar force-pushed
on Apr 27, 2024
Dianagram009 approved
sdaftuar force-pushed
on Apr 28, 2024
DrahtBot removed the label
CI failed
on Apr 28, 2024
DrahtBot added the label
Needs rebase
on Apr 30, 2024
in
src/kernel/txgraph.cpp:508
in
c49e0444e5 (outdated)
502+ cluster->Clear();
503+
504+ // The first transaction gets to stay in the existing cluster.
505+ bool first = true;
506+ for (auto& txentry : txs) {
507+ if (txentry.get().m_cluster == nullptr) {
Note also that I expect this implementation of txgraph to be rewritten entirely, so the most relevant thing to review right now relating to this code is the header file, as the interface is the most important part (and the rest of the PR and the mempool code builds on top of that interface).
in
src/rpc/mempool.cpp:656
in
c49e0444e5 (outdated)
644@@ -547,6 +645,40 @@ static RPCHelpMan getmempooldescendants()
645 };
646 }
647
648+static RPCHelpMan getmempoolcluster()
649+{
650+ return RPCHelpMan{"getmempoolcluster",
651+ "\nReturns mempool data for given cluster\n",
652+ {
653+ {"id", RPCArg::Type::NUM, RPCArg::Optional::NO, "The cluster id (must be in mempool)"},
Sjors suggested here #28676 (review) that I add a way to do it by transaction hash, which would be easy. Let me know if you have other suggestions…
sdaftuar force-pushed
on May 8, 2024
sdaftuar force-pushed
on Jun 5, 2024
DrahtBot removed the label
Needs rebase
on Jun 5, 2024
sdaftuar force-pushed
on Jun 5, 2024
DrahtBot added the label
CI failed
on Jun 5, 2024
sdaftuar force-pushed
on Jun 5, 2024
DrahtBot removed the label
CI failed
on Jun 5, 2024
sdaftuar force-pushed
on Jun 5, 2024
DrahtBot added the label
Needs rebase
on Jun 7, 2024
sdaftuar force-pushed
on Jun 10, 2024
DrahtBot removed the label
Needs rebase
on Jun 10, 2024
DrahtBot added the label
CI failed
on Jun 10, 2024
DrahtBot added the label
Needs rebase
on Jun 11, 2024
sdaftuar force-pushed
on Jun 12, 2024
sdaftuar force-pushed
on Jun 12, 2024
sdaftuar force-pushed
on Jun 14, 2024
DrahtBot removed the label
Needs rebase
on Jun 14, 2024
DrahtBot removed the label
CI failed
on Jun 14, 2024
DrahtBot added the label
Needs rebase
on Jun 17, 2024
sdaftuar force-pushed
on Jun 18, 2024
DrahtBot removed the label
Needs rebase
on Jun 18, 2024
DrahtBot added the label
CI failed
on Jun 18, 2024
DrahtBot
commented at 8:03 pm on June 18, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot added the label
Needs rebase
on Jun 19, 2024
sdaftuar force-pushed
on Jun 26, 2024
DrahtBot removed the label
Needs rebase
on Jun 26, 2024
clusterlin: introduce cluster_linearize.h with Cluster and DepGraph types
This primarily adds the DepGraph class, which encapsulates precomputed
ancestor/descendant information for a given transaction cluster, with a
number of utility features (inspectors for set feerates, computing
reduced parents/children, adding transactions, adding dependencies) that
will be needed in future commits.
926da79d11
tests: Fuzzing framework for DepGraph class
This introduces a bespoke fuzzing-focused serialization format for DepGraphs,
and then tests that this format can represent any graph, roundtrips, and then
uses that to test the correctness of DepGraph itself.
This forms the basis for future fuzz tests that need to work with interesting
graphs.
0efd17c52e
clusterlin: add AncestorCandidateFinder class
This is a class that encapsulates precomputed ancestor set feerates and
presents an interface for getting the best remaining ancestor set.
89de978ff6
clusterlin: add SearchCandidateFinder class
Similar to AncestorCandidateFinder, this encapsulates the state needed for
finding good candidate sets using a search algorithm.
6da4d8ec58
clusterlin: add Linearize function
This adds a first version of the overall linearization interface, which given
a DepGraph constructs a good linearization, by incrementally including good
candidate sets (found using AncestorCandidateFinder and SearchCandidateFinder).
3f2af80683
bench: Candidate finding and linearization benchmarks
Add benchmarks for known bad graphs for the purpose of search (as
an upper bound on work per search iteration) and ancestor sorting
(as an upper bound on linearization work with no search iterations).
959b37c487
clusterlin: use bounded BFS exploration (optimization)
Switch to BFS exploration of the search tree in SearchCandidateFinder
instead of DFS exploration. This appears to behave better for real
world clusters.
As BFS has the downside of needing far larger search queues, switch
back to DFS temporarily when the queue grows too large.
7496980f6f
clusterlin: randomize the SearchCandidateFinder search order
To make search non-deterministic, change the BFS logic from always picking
the first queue item to randomly picking the first or second queue item.
87862a9f72
clusterlin: permit passing in existing linearization to Linearize
This implements the LIMO algorithm for linearizing by improving an existing
linearization. See
https://delvingbitcoin.org/t/limo-combining-the-best-parts-of-linearization-search-and-merging
for details.
443f7c31c3
clusterlin: add algorithms for connectedness/connected components
Add utility functions to DepGraph for finding connected components.
clusterlin: add MergeLinearizations function + fuzz test + benchmark
6e2b004b62
Add is_visited helper to epochguard
057808e355
Add txgraph module
67c09bdad3
add fuzz test for txgraph
c01dd0c8c2
Make CTxMemPoolEntry derive from TxEntry
64df69c425
Track clusters in mempool with TxGraph
ee84336fb2
Limit mempool size based on chunk feerate
Rather than evicting the transactions with the lowest descendant feerate,
instead evict transactions that have the lowest chunk feerate.
Once mining is implemented based on choosing transactions with highest chunk
feerate (see next commit), mining and eviction will be opposites, so that we
will evict the transactions that would be mined last.
7883064f04
Select transactions for blocks based on chunk feerate
862a8fd5b2
Add new (unused) limits for cluster size/count
8759f835e2
Do not allow mempool clusters to exceed configured limits
1ae45cfb64
policy: Remove CPFP carveout rule
The addition of a cluster size limit makes the CPFP carveout rule useless,
because carveout cannot be used to bypass the cluster size limit. Remove this
policy rule and update tests to no longer rely on the behavior.
5b2c4d342d
Implement new RBF logic for cluster mempool
With a total ordering on mempool transactions, we are now able to calculate a
transaction's mining score at all times. Use this to improve the RBF logic:
- we no longer enforce a "no new unconfirmed parents" rule
- we now require that the mempool's feerate diagram must improve in order
to accept a replacement
TODO: update functional test feature_rbf.py to cover all our new scenarios.
deadfd6b86
==== END CLUSTER IMPLEMENTATION ====
4e6440ae02
==== BEGIN MEMPOOL CLEANUP ====
7ddbd12ee6
Remove the ancestor and descendant indices from the mempool
9b52bf30bc
Use cluster linearization for transaction relay sort order
Previously, transaction batches were first sorted by ancestor count and then
feerate, to ensure transactions are announced in a topologically valid order,
while prioritizing higher feerate transactions. Ancestor count is a crude
topological sort criterion, so replace this with linearization order so that the
highest feerate transactions (as would be observed by the mining algorithm) are
relayed before lower feerate ones, in a topologically valid way.
This also fixes a test that only worked due to the ancestor-count-based sort
order.
c312e94dfb
Remove CTxMemPool::GetSortedDepthAndScore
The mempool clusters and linearization permit sorting the mempool topologically
without making use of ancestor counts.
ac15997109
Reimplement GetTransactionAncestry() to not rely on cached data
In preparation for removing ancestor data from CTxMemPoolEntry, recalculate the
ancestor statistics on demand wherever needed.
6b9b57141e
rpc: Calculate ancestor data from scratch for mempool rpc calls
5ed154663c
Remove dependency on cached ancestor data in mini-miner
8793cffb90
Stop enforcing ancestor size/count limits
The cluster limits should be sufficient.
03a49e50b6
Add test case for cluster size limits to v3 logic
0becdc70d4
Use mempool/txgraph to determine if a tx has descendants
Remove a reference to GetCountWithDescendants() in preparation for removing
this function and the associated cached state from the mempool.
bbcb8048ef
Calculate descendant information for mempool RPC output on-the-fly
This is in preparation for removing the cached descendant state from the
mempool.
2f5426ae9e
test: fix rbf carveout test in mempool_limit.py
Minimal fix to the test that the RBF carveout doesn't apply in certain package
validation cases. Now that RBF carveout doesn't exist, we can just test that
the cluster count limit is respected (in preparation for removing the
descendant limit altogether).
551d41d27a
Stop enforcing descendant size/count limits
Cluster size limits should be enough.
96d470fd72
Eliminate RBF workaround for CPFP carveout transactions
The new cluster mempool RBF rules take into account clusters sizes exactly, so
with the removal of descendant count enforcement this idea is obsolete.
c6e91be89d
wallet: Replace max descendantsize with clustersize
With the descendant size limits removed, replace the concept of "max number of
descendants of any ancestor of a given tx" with the cluster count of the cluster
that the transaction belongs to.
c4821e81e9
mempool: Remove unused function CalculateDescendantMaximum
bfb8714a8e
Eliminate use of cached ancestor data in miniminer_tests and v3_policy
0a9927d647
mempool: eliminate accessors to mempool entry ancestor/descendant cached state
329fd176c6
Remove unused members from CTxMemPoolEntry
56c33b907b
Remove mempool logic designed to maintain ancestor/descendant state
4000084160
mempool: addUnchecked no longer needs ancestors
589976a22a
Remove unused limits from CalculateMemPoolAncestors
b022093804
Make removeConflicts private
6da99e095b
==== END MEMPOOL CLEANUP ====
9e70ac28a1
==== BEGIN OPTIMIZATIONS ====
923852eebe
Rework removeForBlock so that clusters are only touched once
Also remove extra linearization that was happening and some logging
Update interface_zmq.py for new block connection behavior
857bb27f15
Simplify ancestor calculation functions
Now that ancestor calculation never fails (due to ancestor/descendant limits
being eliminated), we can eliminate the error handling from
CalculateMemPoolAncestors.
interface_zmq test is broken
ee9c0b6caf
Use txgraph to calculate ancestors
c544191fc7
Use txgraph to calculate descendants
cf609e6989
Move GetNumChildren() to be a mempool function
382ddb619b
Make getting parents/children a function of the mempool, not a mempool entry
99d086c133
Rewrite GatherClusters to use the txgraph implementation
d6ee17f417
Switch to using txgraph parents/children in public accessors
c604867de1
Stop tracking parents/children outside of txgraph
11be533b37
Remove unused argument to RemoveStaged
c2c232655a
Switch to using the faster CalculateDescendants
The only place we still use the older interface is in policy/rbf.cpp, where
it's helpful to incrementally calculate descendants to avoid calculating too
many at once (or cluttering the CalculateDescendants interface with a
calculation limit).
09f3ca251a
Rewrite removeRecursive to use vectors instead of sets
ff71fa68fe
Rewrite Expire to only invoke CalculateDescendants once
645669492f
Rewrite removeForReorg to use a vector instead of a set
88bb9d8214
Drop unnecessary set in TrimToSize
e2eb8046f7
RemoveStaged takes a vector, not a set
972ec6009a
Only use CalculateDescendants() with vectors, not sets
49cb7a70c7
Use faster CalculateMemPoolAncestors in rbf
e4ee25faf7
Use faster CMPA in rpc/mempool
3e4966e8a6
Eliminate need for ancestors in SingleV3Checks
43043c991b
Eliminate need for ancestors in PackageV3Checks
TO DO: Rewrite unit tests for PV3C to not lie about mempool parents, so that we
can push down the parent calculation into v3_policy from validation.
5c045fcaa3
Don't calculate ancestors except for RBF transactions
9479f87830
==== END OPTIMIZATIONS ====
bcafdf1556
==== BEGIN TESTS ====
b575eadcb7
bench: add more mempool benchmarks
Add benchmarks for:
- mempool update time when blocks are found
- adding a transaction
- performing the mempool's RBF calculation
- calculating mempool ancestors/descendants
95b4a5a4f9
fuzz: try to add more code coverage for mempool fuzzing
Including test coverage for mempool eviction and expiry
a0fb9cbfba
Pass through cluster size limits to TxGraph::check()
3153247470
Expose cluster information via rpc
57ed7264c6
doc: Update mempool_replacements.md to reflect feerate diagram checks
79a8b3dfce
test: add functional test for new cluster mempool RPCs
6532117bf4
fuzz: remove comparison between mini_miner block construction and miner
This is in preparation for eliminating the block template building happening in
mini_miner, in favor of directly using the linearizations done in the mempool.
e9a67a31c8
try to make ci happy with zmq cppflags
e5ba9df591
sdaftuar force-pushed
on Jun 28, 2024
fixup! Implement new RBF logic for cluster mempool
1f9de856b6
DrahtBot removed the label
CI failed
on Jun 28, 2024
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2024-06-29 10:13 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me