This replaces the cluster linearization algorithm introduced in #30126 and #30286 (a combination of LIMO with candidate-set search), with a completely different algorithm: spanning-forest linearization, which appears to have much better performance for hard clusters. See this post for a comparison between various linearization algorithms, and this post for benchmarks comparing them. Replaying historical mempool data on it shows that it can effectively linearize every observed cluster up to 64 transactions optimally within tens of microseconds, though pathological examples can be created which take longer.
The algorithm is effectively a very specialized version of the simplex algorithm to the problem of finding high-feerate topological subsets of clusters, but modified to find all consecutive such subsets concurrently rather than just the first one. See the post above for how it is related.
It represents the cluster as partitioned into a set of chunks, each with a spanning tree of its internal dependencies connecting the transactions. Randomized improvements are made by selecting dependencies to add and remove to these spanning trees, merging and splitting chunks, until no more improvements are possible, or a computation budget is reached. Like simplex, it does not necessarily make progress in every step, and thus has no upper bound on its runtime to find optimal, but randomization makes long runtimes very unlikely, and additionally makes it hard to adversarially construct clusters in which the algorithm reliably makes bad choices.
DrahtBot
commented at 11:24 pm on May 17, 2025:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
No conflicts as of last run.
sipa force-pushed
on May 17, 2025
DrahtBot added the label
CI failed
on May 17, 2025
DrahtBot
commented at 11:40 pm on May 17, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task ARM, unit tests, no functional tests: https://github.com/bitcoin/bitcoin/runs/42417371062
LLM reason (✨ experimental): The CI failure is due to a build error during the compilation of txgraph.cpp.o.
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
sipa force-pushed
on May 18, 2025
sipa force-pushed
on May 18, 2025
DrahtBot removed the label
CI failed
on May 18, 2025
sipa force-pushed
on May 18, 2025
sipa force-pushed
on May 19, 2025
DrahtBot added the label
CI failed
on May 19, 2025
DrahtBot
commented at 3:39 am on May 19, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task ARM, unit tests, no functional tests: https://github.com/bitcoin/bitcoin/runs/42447412610
LLM reason (✨ experimental): The CI failure is due to a failed CTest test: cluster_linearize_tests.
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
sipa force-pushed
on May 20, 2025
sipa force-pushed
on May 20, 2025
DrahtBot removed the label
CI failed
on May 20, 2025
sipa added the label
Mempool
on May 20, 2025
in
src/cluster_linearize.h:462
in
23072f2b0eoutdated
554@@ -539,492 +555,651 @@ class LinearizationChunking
555 }
556 };
557558-/** Class encapsulating the state needed to find the best remaining ancestor set.
559+/** Class to represent the internal state of the spanning-forest linearization algorithm.
Appreciate the excellent doxygen documentation here.
jonatack
commented at 4:09 pm on May 22, 2025:
member
Concept ACK
sipa force-pushed
on May 23, 2025
sipa force-pushed
on May 23, 2025
sipa force-pushed
on May 24, 2025
sipa force-pushed
on May 25, 2025
sipa force-pushed
on May 25, 2025
DrahtBot added the label
CI failed
on May 25, 2025
DrahtBot
commented at 4:44 pm on May 25, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task CentOS, depends, gui: https://github.com/bitcoin/bitcoin/runs/42858330826
LLM reason (✨ experimental): The CI failure is due to assertion failures within the cluster_linearize_tests and bench_sanity_check tests.
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot removed the label
CI failed
on May 25, 2025
I’m interested in seeing benchmarks of this on various systems, especially high-end/modern ones (as their performance is likely most predictive of what future hardware will be like).
ismaelsadeeq
commented at 1:51 pm on May 29, 2025:
member
I’m interested in seeing benchmarks of this on various systems, especially high-end/modern ones (as their performance is likely most predictive of what future hardware will be like).
Below are benchmarks on MacBook M2 PRO 2023.
Compiled using clang with only build bench cmake option.
@ismaelsadeeq Not an apples-to-apples comparison I’m afraid, because the benchmark cases are changed in this PR too. Can you run the LinearizeBoundedExample? benchmarks too?
ismaelsadeeq
commented at 2:35 pm on May 29, 2025:
member
@ismaelsadeeq Not an apples-to-apples comparison I’m afraid, because the benchmark cases are changed in this PR too
Oops sorry haven’t take a look, I was hoping I can compare locally myself. I saw a link to the bench results comparing them in the description 👍🏾
Can you run the LinearizeBoundedExample? benchmarks too?
Yep I did and updated the bench results. previous one was not pointing to your latest push.
instagibbs
commented at 4:28 pm on May 29, 2025:
member
Ran apples to apples out of curiosity. One example ran marginally slower on this PR (but still very fast), many are many multiples faster and some see 500x+ improvements.
gmaxwell
commented at 6:47 pm on May 29, 2025:
contributor
@instagibbs I believe the only cases that should be slower with SFL are either very small examples (where their time is irrelevant because its very low per tx compared to bigger clusters) and ones with huge numbers of dependencies. Beyond huge dependencies being generally unrepresentative they’re also expensive to generate because every dependency needs a vin, they also tend to be so slow with both that I guess neither will run to completion in practice.
A better way to compare across implementations might be some kind of optimization time vs amount of fee required to create a cluster with that dependency geometry (just assuming a 1s/vb feerate or something, not adding up the actual costs in the cluster because another cluster could exist with the same topology but different fees/size). like fee = 16*txn + 41*inputs_consumed + 9*outputs_consumed_within_cluster – so I’d hope to find that SFL always beats the current exponential algorithm on this kind of metric (except for very small inputs).
The exponential algorithm gets faster with dependency rich clusters because the linearization essentially becomes topologically forced, so the current algorithm doesn’t have many options to consider. If it massive high dependency clusters became common on the network in the future it might make sense to bring back the exponential algorithm for them. :P but I doubt that would happen.
fjahr
commented at 8:26 pm on May 30, 2025:
contributor
I’m interested in seeing benchmarks of this on various systems
@instagibbs@ismaelsadeeq@gmaxwell FWIW, I didn’t include apples-to-apples comparison because the large-scale comparison benchmarks I did (especially, this one) were just overwhelmingly in favor of SFL over the existing one. The benchmarks were mostly done to decide between SFL and another replacement algorithm, GGT, but see the reasons for choosing SFL here.
A fair apples-to-apples comparison I think needs to use a dataset large enough to encompass both good and bad cases for each of the algorithms, and not be specifically biased for one of them, which is hard in a benchmark I can include in the PR directly. If people are interested, I can try to clean up the branch I used to create the graphs above, however.
l0rinc
commented at 10:21 am on June 4, 2025:
contributor
I’ve run the command on a Raspberry Pi 4 Model B - Cortex-A72 4-Core ARM with SanDisk SSD PLUS 1TB (USB 3.0) with to have data about lower-end hardware as well.
l0rinc
commented at 1:22 pm on June 4, 2025:
contributor
Also, is this with 32-bit or 64-bit userspace?
64-bit userspace (AArch64)
0$ file bitcoind
1bitcoind: ELF 64-bit LSB pie executable, ARM aarch64, version 1(GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=7e059ec01f7460042910ca4ed15270382269c9d5, for GNU/Linux 3.7.0, with debug_info, not stripped
DrahtBot added the label
Needs rebase
on Jul 29, 2025
sipa force-pushed
on Aug 9, 2025
DrahtBot added the label
CI failed
on Aug 9, 2025
DrahtBot
commented at 4:45 am on August 9, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task no wallet, libbitcoinkernel: https://github.com/bitcoin/bitcoin/runs/47723469045
LLM reason (✨ experimental): The build failed due to compilation errors in cluster_linearize.cpp caused by incorrect member access of a tuple, leading to a build error.
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot removed the label
Needs rebase
on Aug 9, 2025
sipa force-pushed
on Aug 9, 2025
sipa force-pushed
on Aug 9, 2025
DrahtBot removed the label
CI failed
on Aug 9, 2025
sipa marked this as a draft
on Sep 11, 2025
sipa
commented at 3:32 pm on September 11, 2025:
member
I’m working on some improvements and reorganization of the commit here, so marking it as draft for now.
sipa force-pushed
on Sep 11, 2025
sipa force-pushed
on Oct 16, 2025
sipa marked this as ready for review
on Oct 16, 2025
sipa force-pushed
on Oct 16, 2025
sipa force-pushed
on Oct 16, 2025
sipa
commented at 7:01 pm on October 16, 2025:
member
Rebased, and made a significant change to the SFL algorithm itself:
Switched to a different technique for making the initial state topological. The big advantage is that this approach also works when one already has a linearization which is not entirely topological already.
Also the following changes to the cluster_linearize.h code in general:
Using SFL gaining the ability to fix existing linearizations, replaced FixLinearization with just making this feature part of Linearize.
With the LIMO-based Linearize, and MergeLinearizations (see below), gone, there is no more need for the LinearizationChunking class.
And as a result some changes in TxGraph are possible too:
Removed the need for FixLinearization by introducing QualityLevel states to represent “may be non-topological”, which makes the next call to Linearize() do the fixing instead.
Removed MergeLinearizations, as it was never used.
As SFL, upon being fed an existing linearization, runs through an initialization procedure that is pretty much equivalent to PostLinearize, drop all calls to PostLinearize()prior to Linearize() calls.
Because the new algorithm randomizes output linearization order (within equal-chunk-feerate parts), undo some of the non-determinism that introduces by always calling PostLinearize()afterLinearize() calls (before it would only do this if non-optimal).
sipa
commented at 7:10 pm on October 16, 2025:
member
This PR has grown quite a bit in scope (despite still being a net negative in LoC!), so I don’t think it’s unreasonable to split it up into an SFL-specific one, and one with follow-up changes to txgraph, or even further. I’ll wait for reviewer comments.
sipa requested review from Copilot
on Oct 20, 2025
in
src/cluster_linearize.h:1496
in
52b983bf90outdated
The new is_topological parameter defaults to true, but when old_linearization is empty, this parameter is ignored. This could be confusing for API consumers. Consider documenting this behavior more explicitly in the function comment, or validate that when old_linearization is empty, is_topological is not set to false.
Copilot
commented at 0:30 am on October 20, 2025:
none
Pull Request Overview
This PR replaces the existing cluster linearization algorithm (LIMO with candidate-set search) with a new spanning-forest linearization (SFL) algorithm that offers significantly better performance for complex clusters. The new algorithm uses a fundamentally different approach based on maintaining spanning trees within chunks and performing split/merge operations.
Key Changes:
Replaces search-based linearization with spanning-forest based approach
Simplifies quality level management by introducing NEEDS_FIX and NEEDS_SPLIT_FIX states
Updates the ExtractTransactions interface to use two separate visitor callbacks
Reviewed Changes
Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.
File
Description
src/txgraph.cpp
Implements quality level changes, updates transaction extraction to use two-pass visitors, modifies merging/splitting logic to work with new algorithm
src/test/util/cluster_linearize.h
Adjusts test utilities for changed type parameters and updated iteration bounds
src/test/fuzz/txgraph.cpp
Updates fuzz tests to use new chunking API
src/test/fuzz/cluster_linearize.cpp
Removes tests for deprecated algorithm components, adds tests for new chunking functions
src/cluster_linearize.h
Replaces entire linearization algorithm implementation with spanning-forest approach, adds ChunkLinearizationInfo function
src/bench/cluster_linearize.cpp
Updates benchmarks to test new algorithm with historical and random cluster examples
src/test/fuzz/cluster_linearize.cpp:1
[nitpick] The variable name done changed from TestBitSet to SetType. While this makes the code more generic, verify that this change aligns with the intended test coverage, as the original code may have been explicitly testing with TestBitSet for specific reasons.
0// Copyright (c) The Bitcoin Core developers
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
[nitpick] The function signature changed from accepting const std::array<uint8_t, N>& to const std::vector<uint8_t>&. This removes compile-time size checking. While this makes the function more flexible, it also removes a safety guarantee. Consider if the added flexibility is worth the loss of compile-time size verification.
sipa
commented at 8:35 am on October 27, 2025:
member
I have added an additional fuzz test (clusterlin_sfl) which verifies properties of the underlying data structure (SpanningForestState) plus sanity checks on it (as opposed to clusterlin_linearize which only tests the higher-level Linearize() function implemented in function of it).
It’s no longer a net reduction in LoC now :(
clusterlin: add class implementing SFL state and operations (preparation)
This adds a data structure representing the optimization state for the spanning-forest
linearization algorithm (SFL), plus a fuzz test for its correctness.
This is preparation for switching over Linearize() to use this algorithm.
See https://delvingbitcoin.org/t/spanning-forest-cluster-linearization/1419 for
a description of the algorithm.
7cf0b1035a
clusterlin: replace cluster linearization with SFL implementation (feature)
This replaces the existing LIMO linearization algorithm (which internally uses
ancestor set finding and candidate set finding) with the much more performant
spanning-forest linearization algorithm.
069e417cc4
clusterlin: support fixing linearizations in Linearize()
This also updates FixLinearization to just be a thin wrapper around Linearize.
In a future commit, FixLinearization will be removed entirely.
36c9158cd2
clusterlin: keep track of active parents per transaction (optimization)
This avoids the need for a loop over all parents of a transaction while walking
a chunk, and removes the need to store the set of parent dependencies explicitly.
2954b20a96
clusterlin: split chunks based on maximum gain (optimization)
This introduces the notion of gain to the SFL algorithm. Given a chunk c,
an active dependency d in it, and the chunks (t, b) that c would split into
if d were deactivated, the gain is defined as either (they are equivalent):
(feerate(t) - feerate(b)) * size(t) * size(b)
fee(t) * size(b) - fee(b) * size(t)
It happens to also be equal to these:
(feerate(t) - feerate(c)) * size(t) * size(c)
fee(t) * size(c) - fee(c) * size(t)
Its relevance is that this metric is proportional to a lower bound on the area
under the fee-size diagram which would be gained IF a deactivation of d does not
result in a self-merge of t and b again.
This commit adds logic to find, within each chunk, the dependency with the highest
gain. In benchmarks, this appears to be a very good heuristic for deciding which
splits are worth making.
2ca422a309
clusterlin: use pre-allocated array for dependencies (optimization)
This reduces the number of allocations required inside the SFL algorithm, and works
because the number of dependencies per transaction is at most n-1.
To minimize the memory usage from this pre-allocation (which might impact memory
locality), change the data type of DepIdx from uint32_t to uint8_t or uint16_t when
possible.
dc3976a491
clusterlin: keep active child dependencies separate (optimization)
Within the per-transaction child dependency list, keep the active ones before all
inactive ones. This improves the complexity over iterating over active dependencies
from O(m) to O(n), as at most n-1 dependencies can be active within any given chunk
at any given time.
251d128d25
clusterlin: keep FIFO queue of improvable chunks (optimization)
This distributes the work over the various chunks fairly, and simultaneously
avoids retrying chunks over and over again which are already known to be optimal.
58bf4a3736
clusterlin: keep track of whether chunks are already in suboptimal lists
This avoids adding them a second time when they happen to already be there.
cb5b856719
clusterlin: randomize merges/splits in SFL (feature)865531c0b0
clusterlin: use random split in SFL on every 3rd attempt (feature)
Out of an abundance of caution that adversarially-constructed clusters might
reliably result in bad chunk split decisions with the maximum-gain strategy,
make every third consecutive attempt to split the same chunk use a random
strategy instead.
296ecab91d
clusterlin: make dependency being active implicit (optimization)
We do not need to actually keep track of whether a dependency is active
or not; it is implied by whether or not it appears within the active
prefix of its parent's child_deps, and its child's parent_deps.
Just remove setting and checking it.
d303224911
clusterlin: only consider 1 direction initially in MakeTopological (optimization)6d2085f3ec
clusterlin: minimize chunks (feature)
After the normal optimization process finishes, and finds an optimal
spanning forest, run a second process (while computation budget remains)
to split chunks into minimal equal-feerate chunks.
56e19de003
clusterlin: add more accurate cost modeling (feature)
his adds a rough estimate of algorithm runtime, so it can be interrupted if no
solution is found in time. Due to inherent differences between platforms, this
will not be extremely accurate, but it is preferable over directly measuring
time for consistency.
5f45b6a6f8
clusterlin: randomize equal-feerate parts of linearization (privacy)
This places equal-feerate chunks (with no dependencies between them) in random
order in the linearization output, hiding information about DepGraph insertion
order from the output. Likewise, it randomizes the order of transactions within
chunks for the same reason.
2218d0de99
txgraph: drop NEEDS_SPLIT_ACCEPTABLE (simplification)
With the SFL algorithm, we will practically be capable of keeping
most if not all clusters optimal. With that, it seems less valuable
to avoid doing work after splitting an acceptable cluster, because by
doing some work we may get it to OPTIMAL.
This reduces the complexity of the code a bit as well.
a184fab11f
txgraph: use PostLinearize less prior to linearizing
With the new SFL algorithm, the process of loading an existing linearization into the
SFL state is very similar to what PostLinearize does. This means there is little benefit
to performing an explicit PostLinearize step before linearizing inside txgraph. Instead,
it seems better to use our allotted CPU time to perform more SFL optimization steps.
842f3304e3
txgraph: always PostLinearize after linearization
With the new linearization algorithm which includes randomization (in order to
not leak information about transaction insertion order through the linearization
output), reduce some of that non-determinism by postlinearizing. Note that this
does not reintroduce the leak, because postlinearizing does not care about the
input DepGraphIndexes.
878f93e89a
txgraph: make topological inside Linearize rather than FixLinearization9067efa84b
clusterlin: drop support for improvable chunking (simplification)
With MergeLinearizations() gone and the LIMO-based Linearize() replaced by SFL, we do not
need a class (LinearizationChunking) that can maintain an incrementally-improving chunk
set anymore.
Replace it with a function (ChunkLinearizationInfo) that just computes the chunks as
SetInfos once, and returns them as a vector. This simplifies several call sites too.
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-10-30 21:13 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me