So far, the TxGraph::DoWork() function took no parameters, and just made all clusters reach the “acceptable” internal quality level by performing a minimum number of improvement iterations on it, but:
Did not attempt to go beyond that.
Was broken, as the QualityLevel of optimal clusters that merge together was not being reset.
Fix this by adding an argument to DoWork() to control how much work it is allowed to do right now, which will first be used to get all clusters to the acceptable level, and if more budget remains, use it to try to get some or all clusters optimal. The function will now return true if all clusters are known to be optimal (and thus no further work remains). This is verified in the tests, by remembering whether the graph is optimal, and if it is at the end of the simulation run, verify that the overall linearization cannot be improved further.
DrahtBot
commented at 9:29 pm on April 13, 2025:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#32545 (Replace cluster linearization algorithm with SFL by sipa)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
LLM Linter (✨ experimental)
Possible typos and grammar issues:
“nothing to do left” → “nothing left to do” [correct word order for clarity]
drahtbot_id_4_m
sipa marked this as a draft
on Apr 13, 2025
DrahtBot added the label
CI failed
on Apr 13, 2025
DrahtBot
commented at 10:42 pm on April 13, 2025:
contributor
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
sipa force-pushed
on Apr 14, 2025
DrahtBot removed the label
CI failed
on Apr 14, 2025
in
src/test/fuzz/txgraph.cpp:321
in
81a88311b5outdated
268@@ -269,9 +269,11 @@ FUZZ_TARGET(txgraph)
269 auto max_count = provider.ConsumeIntegralInRange<DepGraphIndex>(1, MAX_CLUSTER_COUNT_LIMIT);
270 // And the maximum combined size of transactions per cluster.
271 auto max_size = provider.ConsumeIntegralInRange<uint64_t>(1, 0x3fffff * MAX_CLUSTER_COUNT_LIMIT);
272+ // And the number of iterations to consider a cluster acceptably linearized.
273+ auto acceptable_iters = provider.ConsumeIntegralInRange<uint64_t>(0, 10000);
instagibbs
commented at 5:27 pm on April 14, 2025:
nice, this should allow some non-optimals to finally be created :+1:
DrahtBot added the label
Needs rebase
on Apr 17, 2025
sipa force-pushed
on Apr 17, 2025
DrahtBot removed the label
Needs rebase
on Apr 17, 2025
sipa force-pushed
on Apr 22, 2025
DrahtBot added the label
CI failed
on Apr 28, 2025
sipa force-pushed
on Apr 30, 2025
DrahtBot removed the label
CI failed
on Apr 30, 2025
sipa force-pushed
on May 12, 2025
sipa force-pushed
on May 13, 2025
sipa added the label
Mempool
on May 20, 2025
sipa force-pushed
on May 21, 2025
DrahtBot added the label
CI failed
on May 22, 2025
sipa force-pushed
on May 22, 2025
DrahtBot removed the label
CI failed
on May 22, 2025
sipa force-pushed
on May 22, 2025
sipa force-pushed
on May 28, 2025
sipa force-pushed
on May 28, 2025
DrahtBot added the label
CI failed
on May 28, 2025
DrahtBot
commented at 3:32 am on May 28, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task macOS-cross, gui, no tests: https://github.com/bitcoin/bitcoin/runs/43009701810
LLM reason (✨ experimental): The CI failure is due to a compilation error caused by an incompatible call to std::max with mixed types in txgraph.cpp.
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot removed the label
CI failed
on May 28, 2025
sipa force-pushed
on May 28, 2025
sipa force-pushed
on Jun 17, 2025
sipa force-pushed
on Jun 17, 2025
DrahtBot added the label
CI failed
on Jun 17, 2025
DrahtBot
commented at 10:53 pm on June 17, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task multiprocess, i686, DEBUG: https://github.com/bitcoin/bitcoin/runs/44292522153
LLM reason (✨ experimental): Build failure due to compilation errors in txgraph_tests.cpp caused by incorrect function usage.
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
sipa force-pushed
on Jun 18, 2025
sipa force-pushed
on Jun 18, 2025
DrahtBot removed the label
CI failed
on Jun 18, 2025
sipa force-pushed
on Jun 18, 2025
sipa force-pushed
on Jul 2, 2025
sipa force-pushed
on Jul 2, 2025
sipa force-pushed
on Jul 2, 2025
DrahtBot added the label
CI failed
on Jul 2, 2025
DrahtBot
commented at 8:12 pm on July 2, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task tidy: https://github.com/bitcoin/bitcoin/runs/45240690589
LLM reason (✨ experimental): The CI failure is caused by errors from clang-tidy due to inclusion of deprecated headers in txgraph.cpp.
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot removed the label
CI failed
on Jul 2, 2025
sipa force-pushed
on Jul 7, 2025
sipa marked this as ready for review
on Jul 7, 2025
Nice, indeed. With PostLinearize I think even 3 may work, but changed to 2 now.
in
src/bench/txgraph.cpp:51
in
46c49fbdd0outdated
45@@ -46,6 +46,9 @@ void BenchTxGraphTrim(benchmark::Bench& bench)
46 static constexpr int NUM_DEPS_PER_BOTTOM_TX = 100;
47 /** Set a very large cluster size limit so that only the count limit is triggered. */
48 static constexpr int32_t MAX_CLUSTER_SIZE = 100'000 * 100;
49+ /** Set a very high number for acceptable iterations, so that we certainly benchmark optimal
50+ * linearization. */
51+ static constexpr uint64_t NUM_ACCEPTABLE_ITERS = 100'000'000;
Right, they only have one valid linearization anyway, so any number of iterations suffices to find optimal (though I guess it could require some iterations to figure out that it is actually optimal).
in
src/txgraph.cpp:258
in
46c49fbdd0outdated
254@@ -255,6 +255,8 @@ class TxGraphImpl final : public TxGraph
255 const DepGraphIndex m_max_cluster_count;
256 /** This TxGraphImpl's maximum cluster size limit. */
257 const uint64_t m_max_cluster_size;
258+ /** The number of linearization improvements steps needed for it to be considered acceptable. */
246@@ -247,7 +247,8 @@ class TxGraph
247248 /** Construct a new TxGraph with the specified limit on the number of transactions within a cluster,
249 * and on the sum of transaction sizes within a cluster. max_cluster_count cannot exceed
250- * MAX_CLUSTER_COUNT_LIMIT. */
251-std::unique_ptr<TxGraph> MakeTxGraph(unsigned max_cluster_count, uint64_t max_cluster_size) noexcept;
252+ * MAX_CLUSTER_COUNT_LIMIT. acceptable_iters controls how many linearization optimization
253+ * steps will be performed before it is considered to be of acceptable quality. */
2508+ auto& queue = clusterset.m_clusters[int(QualityLevel::ACCEPTABLE)];
2509+ while (!queue.empty()) {
2510+ // Randomize the order in which we process, so that if the first cluster somehow needs
2511+ // more work than what iters allows, we don't keep spending it on the same one.
2512+ auto pos = m_rng.randrange<size_t>(queue.size());
2513+ auto [cost, optimal] = queue[pos].get()->Relinearize(*this, iters - iters_done);
Done. I’ve factored out the logic to count the number of iterations required for optimal linearization, and then used that in the txgraph simulation when DoWork() returns false.
It turned out that the implemented logic so far made this actually very hard to test for. If the remaining iterations budget in DoWork dropped below m_acceptable_iterations, but there were NEEDS_RELINEARIZE clusters left, the function would return false immediately. This would happen even when the remaining budget was actually sufficient to make everything optimal.
Thus, the DoWork overhaul commit has been changed fairly substantially (and, I think, simplified a bit).
The new logic means that you may not have enough to make it considered ACCEPTABLE BUT you may end up with OPTIMAL anyways and are able to continue until you hit a difficult cluster. In practice this may end up with a lot more clusters known as optimal when their topology is simple.
Might be worth a comment on the “if (!improved) return false;” line
instagibbs
commented at 8:28 pm on July 10, 2025:
member
review through 18e40b6cd5a11e23b7a0654bdd12f08e610ce980
no big concerns on first pass
in
src/txgraph.cpp:1119
in
3e49d0a908outdated
1107@@ -1108,6 +1108,9 @@ void Cluster::ApplyDependencies(TxGraphImpl& graph, std::span<std::pair<GraphInd
1108 // linearization, and post-linearize it to fix up the worst problems with it.
1109 FixLinearization(m_depgraph, m_linearization);
1110 PostLinearize(m_depgraph, m_linearization);
1111+ if (IsAcceptable()) {
DrahtBot added the label
CI failed
on Jul 13, 2025
DrahtBot
commented at 6:43 pm on July 13, 2025:
contributor
🚧 At least one of the CI tasks failed.
Task previous releases, depends DEBUG: https://github.com/bitcoin/bitcoin/runs/45882550988
LLM reason (✨ experimental): The CI failure is caused by a compilation error due to warnings being treated as errors, specifically an unused function that is being flagged and preventing compilation from succeeding.
Try to run the tests locally, according to the documentation. However, a CI failure may still
happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
sipa force-pushed
on Jul 13, 2025
DrahtBot removed the label
CI failed
on Jul 13, 2025
in
src/txgraph.cpp:193
in
04778c5b25outdated
188@@ -189,8 +189,9 @@ class Cluster
189 void Merge(TxGraphImpl& graph, Cluster& cluster) noexcept;
190 /** Given a span of (parent, child) pairs that all belong to this Cluster, apply them. */
191 void ApplyDependencies(TxGraphImpl& graph, std::span<std::pair<GraphIndex, GraphIndex>> to_apply) noexcept;
192- /** Improve the linearization of this Cluster. Returns how much work was performed. */
193- uint64_t Relinearize(TxGraphImpl& graph, uint64_t max_iters) noexcept;
194+ /** Improve the linearization of this Cluster. Returns how much work was performed and whether
195+ * the Cluster has changed quality now. */
instagibbs
commented at 12:54 pm on July 14, 2025:
93@@ -94,9 +94,10 @@ class TxGraph
94 virtual void SetTransactionFee(const Ref& arg, int64_t fee) noexcept = 0;
95 96 /** TxGraph is internally lazy, and will not compute many things until they are needed.
97- * Calling DoWork will compute everything now, so that future operations are fast. This can be
98- * invoked while oversized. */
99- virtual void DoWork() noexcept = 0;
100+ * Calling DoWork will perform some work now (controlled by iters) so that future operations
101+ * are fast, if there is any. Returns whether all work is done. This can be invoked while
instagibbs
commented at 1:24 pm on July 14, 2025:
member
reviewed via git range-diff master 18e40b6cd5a11e23b7a0654bdd12f08e610ce980 4788c87f7c119f35c551da05e7c43bd5be51420c
suggested changes were included, including making sure DoWork is reaching optimality when expected
txgraph: reset quality when merging clusters (bugfix)fad0eb091e
txgraph: 1-or-2-tx split-off clusters are optimal (optimization)6ba316eaa0
txgraph: track amount of work done in linearization (preparation)cfe9958852
txgraph: make number of acceptable iterations configurable (feature)e96b00d99e
txgraph: add work limit to DoWork(), try optimal (feature)
This adds an `iters` parameter to DoWork(), which controls how much work it is
allowed to do right now.
Additionally, DoWork() won't stop at just getting everything ACCEPTABLE, but if
there is work budget left, will also attempt to get every cluster linearized
optimally.
f3c2fc867f
txgraph: check that DoWork finds optimal if given high budget (tests)62ed1f92ef
sipa force-pushed
on Jul 14, 2025
sipa
commented at 2:46 pm on July 14, 2025:
member
Addressed comments, and also made some further simplifications (dropped the returning of amount of work performed by MakeAcceptable and MakeAllAcceptable in the preparation commit, as the new DoWork() doesn’t use them anymore but invokes Cluster::Relinearize directly).
instagibbs approved
instagibbs
commented at 4:35 pm on July 14, 2025:
member
ACK62ed1f92efff42bc79c50935e6dbd9da4e072020
ismaelsadeeq
commented at 10:11 pm on July 14, 2025:
member
Concept ACK. The test added in f3c2fc867fc4332dfed0a3766997433e1676dbe3 fails without the bugfix in the first commit.
I notice DoWork is unused in #28676, so this PR is not a blocker for the project?
in
src/txgraph.h:100
in
f3c2fc867foutdated
98- * invoked while oversized. */
99- virtual void DoWork() noexcept = 0;
100+ * Calling DoWork will perform some work now (controlled by iters) so that future operations
101+ * are fast, if there is any. Returns whether all currently-available work is done. This can
102+ * be invoked while oversized, but oversized graphs will be skipped by this call. */
103+ virtual bool DoWork(uint64_t iters) noexcept = 0;
ismaelsadeeq
commented at 10:17 pm on July 14, 2025:
nit: maximum_iterations would be clearer than iters?
This is nice the mempool can decide what to pass to DoWork by converting the time it can spend into iterations.
ismaelsadeeq
commented at 4:08 pm on July 15, 2025:
yeah will make internal comments of DoWork easier to follow as well.
in
src/txgraph.cpp:981
in
6ba316eaa0outdated
977@@ -978,10 +978,11 @@ bool Cluster::Split(TxGraphImpl& graph) noexcept
978 // Iterate over the connected components of this Cluster's m_depgraph.
979 while (todo.Any()) {
980 auto component = m_depgraph.FindConnectedComponent(todo);
981+ auto split_quality = component.Count() <= 2 ? QualityLevel::OPTIMAL : new_quality;
ismaelsadeeq
commented at 3:00 pm on July 15, 2025:
Why size two?
Isn’t this supposed to be one? That is, individual splits are already optimal, but any component with size greater than one needs to be linearized, no?
I was confused by why we use max_iters instead of cost to gauge whether we did enough work to make the cluster acceptable (intuitively, we care about actual work done rather than the upper bound given). But when optimal is not true that means we ~used up the iters available (or determined it wasn’t enough).
Alternatively I thought maybe max_iters == graph.m_acceptable_iters is used as a way to tell that we’re still in the “make things acceptable” stage and haven’t yet run out of budget.
in
src/txgraph.cpp:2523
in
f3c2fc867foutdated
2523+ iters_done += cost;
2524+ // If no improvement was made to the Cluster, it means we've essentially run out of
2525+ // budget. Even though it may be the case that iters_done < iters still, the
2526+ // linearizer decided there wasn't enough budget left to attempt anything with.
2527+ // To avoid an infinite loop that keeps trying clusters with minuscule budgets,
2528+ // stop here too.
We have improved = true when either (1) we took the cluster from any non-optimal state -> optimal regardless of iters usage or (2) we are working on a non-acceptable cluster and had m_acceptable_iters to spend.
That means improved = false if we didn’t have enough iters to do m_acceptable_iters on a non-acceptable cluster (ran out of iters), or we have moved on to acceptable clusters but didn’t have enough to make it optimal.
It’s possible that we have some iters left and it would be enough to linearize e.g. a smaller cluster that remains, but we exit as soon as we hit a cluster for which our budget is insufficient.
glozow
commented at 3:49 pm on July 18, 2025:
member
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-08-02 15:13 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me