Replace cluster linearization algorithm with SFL #32545

pull sipa wants to merge 9 commits into bitcoin:master from sipa:202505_sfl changing 9 files +1341 −1422
  1. sipa commented at 11:24 pm on May 17, 2025: member

    Part of cluster mempool: #30289.

    This replaces the cluster linearization algorithm introduced in #30126 and #30286 (a combination of LIMO with candidate-set search), with a completely different algorithm: spanning-forest linearization, which appears to have much better performance for hard clusters. See this post for a comparison between various linearization algorithms, and this post for benchmarks comparing them. Replaying historical mempool data on it shows that it can effectively linearize every observed cluster up to 64 transactions optimally within tens of microseconds, though pathological examples can be created which take longer.

    The algorithm is effectively a very specialized version of the simplex algorithm to the problem of finding high-feerate topological subsets of clusters, but modified to find all consecutive such subsets concurrently rather than just the first one. See the post above for how it is related.

    It represents the cluster as partitioned into a set of chunks, each with a spanning tree of its internal dependencies connecting the transactions. Randomized improvements are made by selecting dependencies to add and remove to these spanning trees, merging and splitting chunks, until no more improvements are possible, or a computation budget is reached. Like simplex, it does not necessarily make progress in every step, and thus has no upper bound on its runtime to find optimal, but randomization makes long runtimes very unlikely, and additionally makes it hard to adversarially construct clusters in which the algorithm reliably makes bad choices.

  2. DrahtBot commented at 11:24 pm on May 17, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32545.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK jonatack

    If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

    Conflicts

    No conflicts as of last run.

  3. sipa force-pushed on May 17, 2025
  4. DrahtBot added the label CI failed on May 17, 2025
  5. DrahtBot commented at 11:40 pm on May 17, 2025: contributor

    🚧 At least one of the CI tasks failed. Task ARM, unit tests, no functional tests: https://github.com/bitcoin/bitcoin/runs/42417371062 LLM reason (✨ experimental): The CI failure is due to a build error during the compilation of txgraph.cpp.o.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  6. sipa force-pushed on May 18, 2025
  7. sipa force-pushed on May 18, 2025
  8. DrahtBot removed the label CI failed on May 18, 2025
  9. sipa force-pushed on May 18, 2025
  10. sipa force-pushed on May 19, 2025
  11. DrahtBot added the label CI failed on May 19, 2025
  12. DrahtBot commented at 3:39 am on May 19, 2025: contributor

    🚧 At least one of the CI tasks failed. Task ARM, unit tests, no functional tests: https://github.com/bitcoin/bitcoin/runs/42447412610 LLM reason (✨ experimental): The CI failure is due to a failed CTest test: cluster_linearize_tests.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  13. sipa force-pushed on May 20, 2025
  14. sipa force-pushed on May 20, 2025
  15. DrahtBot removed the label CI failed on May 20, 2025
  16. sipa added the label Mempool on May 20, 2025
  17. in src/cluster_linearize.h:464 in 23072f2b0e outdated
    554@@ -539,492 +555,651 @@ class LinearizationChunking
    555     }
    556 };
    557 
    558-/** Class encapsulating the state needed to find the best remaining ancestor set.
    559+/** Class to represent the internal state of the spanning-forest linearization algorithm.
    


    jonatack commented at 4:09 pm on May 22, 2025:
    Appreciate the excellent doxygen documentation here.
  18. jonatack commented at 4:09 pm on May 22, 2025: member
    Concept ACK
  19. sipa force-pushed on May 23, 2025
  20. sipa force-pushed on May 23, 2025
  21. sipa force-pushed on May 24, 2025
  22. sipa force-pushed on May 25, 2025
  23. sipa force-pushed on May 25, 2025
  24. DrahtBot added the label CI failed on May 25, 2025
  25. DrahtBot commented at 4:44 pm on May 25, 2025: contributor

    🚧 At least one of the CI tasks failed. Task CentOS, depends, gui: https://github.com/bitcoin/bitcoin/runs/42858330826 LLM reason (✨ experimental): The CI failure is due to assertion failures within the cluster_linearize_tests and bench_sanity_check tests.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  26. DrahtBot removed the label CI failed on May 25, 2025
  27. sipa force-pushed on May 25, 2025
  28. sipa force-pushed on May 25, 2025
  29. sipa force-pushed on May 28, 2025
  30. sipa commented at 2:57 pm on May 28, 2025: member

    I’m interested in seeing benchmarks of this on various systems, especially high-end/modern ones (as their performance is likely most predictive of what future hardware will be like).

    0./build_dev_mode/bin/bench_bitcoin --filter="Linearize.*Example.*" -min-time=10000
    

    Here are some I have gathered:

    ns/cost cost/s err% total benchmark
    1.68 595,052,476.10 0.4% 11.01 LinearizeBoundedExample0
    1.92 521,264,873.78 0.1% 11.01 LinearizeBoundedExample1
    1.79 557,860,504.16 0.3% 10.58 LinearizeBoundedExample2
    1.89 528,599,597.23 0.1% 10.97 LinearizeBoundedExample3
    1.81 552,507,201.82 0.4% 10.81 LinearizeBoundedExample4
    1.97 508,721,677.18 1.0% 11.05 LinearizeBoundedExample5
    1.89 528,912,231.85 0.4% 11.06 LinearizeBoundedExample6
    1.79 557,221,394.89 0.8% 11.01 LinearizeBoundedExample7
    1.79 557,162,554.42 0.3% 11.05 LinearizeBoundedExample8
    1.84 543,573,897.10 0.3% 10.99 LinearizeBoundedExample9
    ns/op op/s err% total benchmark
    1,576.22 634,428.95 0.4% 10.95 LinearizeOptimallyExample00
    3,010.14 332,209.92 0.3% 10.99 LinearizeOptimallyExample01
    5,871.01 170,328.44 0.4% 11.09 LinearizeOptimallyExample02
    6,321.90 158,180.35 0.6% 10.55 LinearizeOptimallyExample03
    15,081.30 66,307.30 0.4% 10.99 LinearizeOptimallyExample04
    12,227.15 81,785.18 0.2% 11.01 LinearizeOptimallyExample05
    12,703.52 78,718.34 0.4% 11.02 LinearizeOptimallyExample06
    11,519.71 86,807.71 0.3% 11.01 LinearizeOptimallyExample07
    14,768.84 67,710.11 0.5% 10.93 LinearizeOptimallyExample08
    19,655.67 50,875.90 0.3% 11.03 LinearizeOptimallyExample09
    11,776.61 84,914.06 0.1% 10.97 LinearizeOptimallyExample10
    17,396.59 57,482.51 0.5% 10.97 LinearizeOptimallyExample11
    25,232.95 39,630.72 0.3% 11.04 LinearizeOptimallyExample12
    28,941.27 34,552.73 0.2% 10.98 LinearizeOptimallyExample13
    36,964.92 27,052.68 0.7% 11.00 LinearizeOptimallyExample14
    37,103.54 26,951.61 0.3% 11.02 LinearizeOptimallyExample15
    43,747.38 22,858.51 0.3% 10.98 LinearizeOptimallyExample16
    59,132.17 16,911.27 0.8% 11.04 LinearizeOptimallyExample17
    61,824.04 16,174.94 0.2% 10.54 LinearizeOptimallyExample18
    73,274.33 13,647.34 0.6% 10.65 LinearizeOptimallyExample19
    ns/cost cost/s err% total benchmark
    1.31 763,277,399.95 0.0% 10.99 LinearizeBoundedExample0
    1.53 652,463,095.02 0.1% 11.00 LinearizeBoundedExample1
    1.49 672,057,837.39 0.0% 11.00 LinearizeBoundedExample2
    1.66 602,362,799.74 0.0% 10.69 LinearizeBoundedExample3
    1.52 656,146,353.18 0.1% 10.71 LinearizeBoundedExample4
    1.65 607,795,039.45 0.0% 10.76 LinearizeBoundedExample5
    1.59 628,108,657.72 0.0% 10.82 LinearizeBoundedExample6
    1.50 668,384,115.59 0.1% 11.00 LinearizeBoundedExample7
    1.51 663,161,838.88 0.0% 11.01 LinearizeBoundedExample8
    1.53 654,682,223.87 0.1% 10.99 LinearizeBoundedExample9
    ns/op op/s err% total benchmark
    1,209.24 826,967.78 0.3% 10.98 LinearizeOptimallyExample00
    2,393.97 417,716.84 0.0% 11.01 LinearizeOptimallyExample01
    4,764.63 209,879.68 0.0% 11.01 LinearizeOptimallyExample02
    5,067.95 197,318.41 0.1% 11.01 LinearizeOptimallyExample03
    12,716.36 78,638.87 0.0% 11.01 LinearizeOptimallyExample04
    10,231.29 97,739.43 0.0% 11.01 LinearizeOptimallyExample05
    10,649.64 93,899.84 0.0% 11.01 LinearizeOptimallyExample06
    9,564.17 104,556.95 0.1% 11.01 LinearizeOptimallyExample07
    11,884.47 84,143.41 0.3% 11.00 LinearizeOptimallyExample08
    16,436.12 60,841.62 0.1% 11.00 LinearizeOptimallyExample09
    9,937.05 100,633.51 0.1% 11.00 LinearizeOptimallyExample10
    14,575.73 68,607.20 0.2% 11.02 LinearizeOptimallyExample11
    21,355.98 46,825.28 0.1% 11.01 LinearizeOptimallyExample12
    24,439.13 40,917.99 0.0% 11.00 LinearizeOptimallyExample13
    31,407.36 31,839.67 0.1% 11.00 LinearizeOptimallyExample14
    31,366.41 31,881.24 0.1% 11.01 LinearizeOptimallyExample15
    37,182.88 26,894.10 0.1% 11.00 LinearizeOptimallyExample16
    50,528.12 19,790.96 0.3% 11.02 LinearizeOptimallyExample17
    52,714.16 18,970.24 0.1% 11.00 LinearizeOptimallyExample18
    62,357.87 16,036.47 0.1% 10.57 LinearizeOptimallyExample19
    ns/cost cost/s err% total benchmark
    1.53 655,436,526.57 0.7% 1.10 LinearizeBoundedExample0
    1.81 552,194,379.23 0.2% 1.10 LinearizeBoundedExample1
    1.52 656,148,852.84 0.4% 1.09 LinearizeBoundedExample2
    1.75 572,294,606.97 0.7% 1.06 LinearizeBoundedExample3
    1.62 615,593,105.29 0.3% 1.07 LinearizeBoundedExample4
    1.78 561,989,024.83 1.2% 1.08 LinearizeBoundedExample5
    1.69 592,612,447.36 0.2% 1.10 LinearizeBoundedExample6
    1.58 633,450,827.45 0.3% 1.10 LinearizeBoundedExample7
    1.56 640,690,050.49 0.1% 1.10 LinearizeBoundedExample8
    1.59 628,148,004.87 0.2% 1.09 LinearizeBoundedExample9
    ns/op op/s err% total benchmark
    1,499.94 666,693.94 0.4% 1.09 LinearizeOptimallyExample00
    2,884.51 346,679.16 0.2% 1.10 LinearizeOptimallyExample01
    5,815.27 171,961.12 0.1% 1.10 LinearizeOptimallyExample02
    6,097.37 164,005.14 0.2% 1.05 LinearizeOptimallyExample03
    13,502.59 74,059.89 0.1% 1.10 LinearizeOptimallyExample04
    11,447.25 87,357.24 0.1% 1.10 LinearizeOptimallyExample05
    11,639.59 85,913.71 0.1% 1.10 LinearizeOptimallyExample06
    9,557.65 104,628.26 0.1% 1.10 LinearizeOptimallyExample07
    12,139.84 82,373.43 0.1% 1.10 LinearizeOptimallyExample08
    18,237.62 54,831.73 0.1% 1.10 LinearizeOptimallyExample09
    11,806.18 84,701.39 0.1% 1.10 LinearizeOptimallyExample10
    17,519.95 57,077.80 0.2% 1.09 LinearizeOptimallyExample11
    24,397.16 40,988.37 0.2% 1.10 LinearizeOptimallyExample12
    28,264.97 35,379.48 0.1% 1.10 LinearizeOptimallyExample13
    36,230.91 27,600.74 0.5% 1.10 LinearizeOptimallyExample14
    36,251.16 27,585.33 0.3% 1.10 LinearizeOptimallyExample15
    42,480.07 23,540.45 0.3% 1.10 LinearizeOptimallyExample16
    59,497.62 16,807.39 0.4% 1.10 LinearizeOptimallyExample17
    62,192.70 16,079.06 0.2% 1.06 LinearizeOptimallyExample18
    73,604.11 13,586.20 0.7% 1.06 LinearizeOptimallyExample19
    ns/cost cost/s err% total benchmark
    4.39 227,695,874.58 0.4% 10.99 LinearizeBoundedExample0
    5.32 187,925,655.80 2.3% 11.09 LinearizeBoundedExample1
    4.96 201,470,069.83 1.6% 11.13 LinearizeBoundedExample2
    5.47 182,878,034.13 1.9% 10.75 LinearizeBoundedExample3
    5.02 199,278,377.58 2.6% 10.73 LinearizeBoundedExample4
    5.69 175,678,783.21 2.7% 11.08 LinearizeBoundedExample5
    5.53 180,751,956.64 4.1% 11.00 LinearizeBoundedExample6
    5.80 172,338,240.84 2.0% 10.81 LinearizeBoundedExample7
    5.24 190,865,678.33 3.1% 10.63 LinearizeBoundedExample8
    5.51 181,564,438.26 2.9% 11.38 LinearizeBoundedExample9
    ns/op op/s err% total benchmark
    4,399.38 227,304.78 1.9% 11.18 LinearizeOptimallyExample00
    8,789.91 113,766.77 0.9% 10.84 LinearizeOptimallyExample01
    15,890.75 62,929.71 1.6% 11.07 LinearizeOptimallyExample02
    19,082.82 52,403.17 2.2% 11.00 LinearizeOptimallyExample03
    44,627.25 22,407.83 1.8% 11.20 LinearizeOptimallyExample04
    33,718.11 29,657.66 1.1% 11.03 LinearizeOptimallyExample05
    35,473.62 28,189.96 1.1% 11.05 LinearizeOptimallyExample06
    33,162.95 30,154.13 1.5% 10.88 LinearizeOptimallyExample07
    45,473.12 21,991.01 0.9% 11.12 LinearizeOptimallyExample08
    57,919.95 17,265.21 1.1% 11.04 LinearizeOptimallyExample09
    32,640.19 30,637.08 2.1% 11.16 LinearizeOptimallyExample10
    46,559.58 21,477.86 1.5% 10.98 LinearizeOptimallyExample11
    66,614.50 15,011.75 1.4% 10.55 LinearizeOptimallyExample12
    76,783.09 13,023.70 0.5% 10.74 LinearizeOptimallyExample13
    97,635.10 10,242.22 1.7% 11.00 LinearizeOptimallyExample14
    99,854.80 10,014.54 0.8% 11.00 LinearizeOptimallyExample15
    119,444.66 8,372.08 1.2% 11.11 LinearizeOptimallyExample16
    156,160.25 6,403.68 0.9% 10.98 LinearizeOptimallyExample17
    163,364.57 6,121.28 0.8% 10.90 LinearizeOptimallyExample18
    193,524.10 5,167.32 1.2% 11.14 LinearizeOptimallyExample19
    ns/cost cost/s err% total benchmark
    1.75 571,671,353.08 0.1% 10.99 LinearizeBoundedExample0
    2.05 487,547,934.48 0.1% 11.00 LinearizeBoundedExample1
    1.97 506,752,217.57 0.1% 10.64 LinearizeBoundedExample2
    2.24 447,162,026.45 0.1% 11.00 LinearizeBoundedExample3
    2.08 480,557,495.14 0.1% 11.01 LinearizeBoundedExample4
    2.30 434,505,839.40 0.1% 11.12 LinearizeBoundedExample5
    2.21 451,557,580.61 0.0% 11.00 LinearizeBoundedExample6
    2.25 444,374,372.72 0.0% 11.00 LinearizeBoundedExample7
    2.12 471,017,980.18 0.1% 10.93 LinearizeBoundedExample8
    2.22 450,256,469.71 1.1% 11.07 LinearizeBoundedExample9
    ns/op op/s err% total benchmark
    1,586.71 630,233.45 0.1% 10.98 LinearizeOptimallyExample00
    3,268.23 305,976.43 0.0% 11.00 LinearizeOptimallyExample01
    6,858.84 145,797.34 0.2% 10.54 LinearizeOptimallyExample02
    7,873.51 127,008.11 0.1% 10.73 LinearizeOptimallyExample03
    17,277.23 57,879.64 0.1% 11.00 LinearizeOptimallyExample04
    14,178.66 70,528.55 0.1% 11.00 LinearizeOptimallyExample05
    16,009.26 62,463.84 0.0% 10.99 LinearizeOptimallyExample06
    12,586.44 79,450.57 1.6% 10.87 LinearizeOptimallyExample07
    15,396.54 64,949.64 0.0% 11.00 LinearizeOptimallyExample08
    24,369.54 41,034.82 0.0% 11.00 LinearizeOptimallyExample09
    13,610.19 73,474.38 0.1% 10.99 LinearizeOptimallyExample10
    20,073.90 49,815.92 0.1% 11.01 LinearizeOptimallyExample11
    30,462.32 32,827.44 0.3% 11.12 LinearizeOptimallyExample12
    34,713.56 28,807.18 0.3% 10.93 LinearizeOptimallyExample13
    44,015.70 22,719.17 0.1% 11.00 LinearizeOptimallyExample14
    43,492.08 22,992.69 0.1% 11.00 LinearizeOptimallyExample15
    51,596.62 19,381.12 0.1% 11.01 LinearizeOptimallyExample16
    71,384.39 14,008.67 0.2% 10.69 LinearizeOptimallyExample17
    73,714.84 13,565.79 0.1% 10.69 LinearizeOptimallyExample18
    87,910.27 11,375.24 0.0% 10.82 LinearizeOptimallyExample19
  31. davidgumberg commented at 3:16 am on May 29, 2025: contributor

    (built with Clang)

    AMD Ryzen 9 9950X 16-Core Processor

    0./build/bin/bench_bitcoin --filter="Linearize.*Example.*" -min-time=10000
    
    ns/cost cost/s err% ins/cost cyc/cost IPC bra/cost miss% total benchmark
    1.19 836,855,166.72 0.4% 20.42 5.13 3.984 2.29 2.4% 11.05 LinearizeBoundedExample0
    1.44 695,776,928.37 0.1% 24.33 6.17 3.945 2.74 2.4% 10.99 LinearizeBoundedExample1
    1.17 852,566,861.36 0.2% 18.78 5.03 3.733 2.33 3.2% 11.04 LinearizeBoundedExample2
    1.36 735,968,415.25 0.1% 19.54 5.83 3.349 2.36 4.6% 10.57 LinearizeBoundedExample3
    1.24 805,653,703.91 0.0% 18.58 5.33 3.487 2.27 4.2% 10.56 LinearizeBoundedExample4
    1.34 745,442,570.61 0.1% 20.46 5.76 3.553 2.50 4.1% 10.62 LinearizeBoundedExample5
    1.27 784,639,270.59 0.0% 19.10 5.47 3.490 2.35 4.3% 10.65 LinearizeBoundedExample6
    1.18 847,347,367.90 0.2% 18.45 5.07 3.642 2.23 3.6% 10.70 LinearizeBoundedExample7
    1.19 841,752,012.32 0.2% 18.65 5.10 3.657 2.29 3.5% 10.70 LinearizeBoundedExample8
    1.20 833,486,862.05 0.1% 19.07 5.15 3.703 2.31 3.2% 10.79 LinearizeBoundedExample9
    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    1,062.27 941,380.47 0.2% 23,154.64 4,561.62 5.076 2,732.15 1.1% 10.99 LinearizeOptimallyExample00
    2,199.32 454,686.52 0.1% 44,955.91 9,443.49 4.761 5,125.75 1.2% 11.01 LinearizeOptimallyExample01
    4,655.43 214,803.15 0.1% 76,020.42 19,989.80 3.803 8,376.75 2.7% 11.01 LinearizeOptimallyExample02
    4,687.57 213,330.01 0.2% 79,606.35 20,125.95 3.955 8,838.43 2.7% 10.99 LinearizeOptimallyExample03
    11,198.28 89,299.41 0.1% 180,121.05 48,080.45 3.746 21,333.25 2.9% 11.01 LinearizeOptimallyExample04
    8,806.12 113,557.33 0.0% 148,315.36 37,809.30 3.923 17,597.94 2.9% 10.83 LinearizeOptimallyExample05
    8,881.81 112,589.68 0.0% 129,414.56 38,127.56 3.394 15,799.78 4.3% 10.83 LinearizeOptimallyExample06
    7,396.23 135,204.04 0.0% 139,880.41 31,756.73 4.405 16,345.08 2.7% 10.67 LinearizeOptimallyExample07
    9,691.58 103,182.40 0.0% 201,859.77 41,612.38 4.851 23,957.84 1.1% 11.01 LinearizeOptimallyExample08
    13,912.45 71,878.08 0.1% 205,829.89 59,733.52 3.446 25,060.12 4.2% 11.00 LinearizeOptimallyExample09
    9,799.39 102,047.14 0.1% 131,704.66 42,074.45 3.130 13,959.94 5.7% 11.00 LinearizeOptimallyExample10
    14,178.88 70,527.42 0.1% 181,891.98 60,879.32 2.988 18,797.61 5.9% 11.02 LinearizeOptimallyExample11
    19,417.52 51,499.89 0.1% 231,655.76 83,370.63 2.779 26,582.04 6.6% 11.01 LinearizeOptimallyExample12
    21,875.47 45,713.31 0.1% 262,258.87 93,911.73 2.793 30,218.20 6.7% 11.00 LinearizeOptimallyExample13
    28,875.31 34,631.67 0.1% 336,971.73 123,981.98 2.718 38,520.06 6.5% 11.01 LinearizeOptimallyExample14
    27,860.30 35,893.37 0.1% 334,160.29 119,619.93 2.794 37,727.10 6.5% 11.00 LinearizeOptimallyExample15
    33,495.46 29,854.79 0.1% 395,850.86 143,808.78 2.753 45,419.38 6.8% 11.00 LinearizeOptimallyExample16
    46,820.27 21,358.27 0.2% 527,868.16 201,024.15 2.626 60,683.66 7.4% 11.01 LinearizeOptimallyExample17
    47,420.53 21,087.91 0.1% 549,255.39 203,589.67 2.698 62,108.84 7.0% 11.01 LinearizeOptimallyExample18
    56,870.07 17,583.94 0.2% 665,564.77 244,167.60 2.726 75,210.75 6.7% 10.98 LinearizeOptimallyExample19

    AMD Ryzen 9 7900X 12-Core Processor

    0./build/bin/bench_bitcoin --filter="Linearize.*Example.*" -min-time=10000
    
    ns/cost cost/s err% ins/cost cyc/cost IPC bra/cost miss% total benchmark
    1.36 737,410,406.03 0.2% 20.58 6.34 3.247 2.34 2.5% 11.02 LinearizeBoundedExample0
    1.65 606,854,042.04 0.1% 24.57 7.70 3.190 2.76 2.5% 10.99 LinearizeBoundedExample1
    1.31 760,547,140.52 0.1% 18.87 6.14 3.071 2.36 3.4% 11.02 LinearizeBoundedExample2
    1.52 657,532,700.47 0.0% 19.61 7.11 2.758 2.38 4.9% 10.63 LinearizeBoundedExample3
    1.39 718,444,894.17 0.1% 18.63 6.51 2.864 2.30 4.3% 10.63 LinearizeBoundedExample4
    1.52 656,965,976.43 0.1% 20.51 7.11 2.882 2.53 4.1% 10.71 LinearizeBoundedExample5
    1.44 694,175,863.25 0.1% 19.12 6.73 2.839 2.39 4.5% 10.72 LinearizeBoundedExample6
    1.34 745,455,006.56 0.0% 18.49 6.27 2.948 2.27 3.8% 10.81 LinearizeBoundedExample7
    1.35 740,869,763.95 0.2% 18.70 6.31 2.965 2.32 3.7% 10.83 LinearizeBoundedExample8
    1.36 733,610,430.06 0.0% 19.17 6.37 3.009 2.34 3.4% 11.00 LinearizeBoundedExample9
    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    1,231.66 811,912.24 0.0% 23,357.46 5,757.88 4.057 2,788.34 1.0% 11.01 LinearizeOptimallyExample00
    2,608.32 383,388.20 0.1% 44,890.34 12,192.83 3.682 5,176.76 1.2% 11.02 LinearizeOptimallyExample01
    5,129.41 194,954.12 0.1% 76,112.07 23,977.96 3.174 8,574.18 2.6% 10.99 LinearizeOptimallyExample02
    5,283.32 189,274.88 0.1% 79,355.44 24,692.88 3.214 8,960.39 2.7% 11.01 LinearizeOptimallyExample03
    12,510.48 79,933.00 0.1% 178,410.73 58,475.18 3.051 21,948.73 3.0% 11.00 LinearizeOptimallyExample04
    9,699.21 103,101.15 0.1% 147,263.70 45,334.12 3.248 18,517.16 2.7% 10.99 LinearizeOptimallyExample05
    9,890.03 101,111.94 0.1% 128,779.01 46,228.98 2.786 16,175.51 4.5% 11.00 LinearizeOptimallyExample06
    8,853.35 112,951.56 0.1% 138,404.58 41,381.10 3.345 16,972.04 2.6% 10.83 LinearizeOptimallyExample07
    11,429.49 87,493.00 0.0% 199,933.47 53,426.08 3.742 24,782.93 1.0% 11.00 LinearizeOptimallyExample08
    15,278.31 65,452.27 0.1% 203,844.43 71,404.74 2.855 25,811.52 4.2% 10.98 LinearizeOptimallyExample09
    10,430.03 95,877.04 0.1% 130,835.87 48,749.71 2.684 14,322.22 5.7% 11.01 LinearizeOptimallyExample10
    14,955.33 66,865.78 0.1% 181,388.73 69,905.18 2.595 19,826.07 5.7% 11.01 LinearizeOptimallyExample11
    20,192.09 49,524.34 0.1% 229,811.63 94,374.34 2.435 27,200.27 6.6% 11.01 LinearizeOptimallyExample12
    23,265.71 42,981.71 0.3% 260,056.43 108,734.94 2.392 30,956.74 6.7% 10.99 LinearizeOptimallyExample13
    29,690.95 33,680.30 0.1% 334,363.41 138,778.90 2.409 39,510.12 6.4% 10.99 LinearizeOptimallyExample14
    29,188.83 34,259.68 0.1% 331,541.94 136,412.59 2.430 38,603.88 6.3% 10.99 LinearizeOptimallyExample15
    35,084.30 28,502.78 0.0% 391,658.97 163,986.36 2.388 46,505.21 6.6% 11.00 LinearizeOptimallyExample16
    47,904.99 20,874.65 0.1% 521,828.04 223,869.99 2.331 62,246.01 7.0% 11.01 LinearizeOptimallyExample17
    49,498.85 20,202.49 0.1% 543,459.92 231,331.22 2.349 63,630.07 6.7% 11.01 LinearizeOptimallyExample18
    59,173.83 16,899.36 0.1% 658,965.47 276,557.48 2.383 77,128.66 6.3% 11.00 LinearizeOptimallyExample19
  32. ismaelsadeeq commented at 1:51 pm on May 29, 2025: member

    I’m interested in seeing benchmarks of this on various systems, especially high-end/modern ones (as their performance is likely most predictive of what future hardware will be like).

    Below are benchmarks on MacBook M2 PRO 2023. Compiled using clang with only build bench cmake option.

    0cmake -B build -DBUILD_BENCH=ON && cmake --build build
    
    0abubakarismail@Abubakars-MacBook-Pro ~/D/W/b/bitcoin ((47bdf8fa))> ./build/bin/bench_bitcoin --filter="Linearize.*Example.*" -min-ti
    1me=1000
    
    ns/cost cost/s err% total benchmark
    1.85 539,881,102.87 6.1% 10.64 :wavy_dash: LinearizeBoundedExample0 (Unstable with ~171,993.8 iters. Increase minEpochIterations to e.g. 1719938)
    2.16 463,765,560.64 1.9% 11.52 LinearizeBoundedExample1
    1.89 529,370,444.85 3.4% 11.88 LinearizeBoundedExample2
    2.09 478,266,850.10 2.1% 10.69 LinearizeBoundedExample3
    2.00 500,459,378.92 3.4% 10.99 LinearizeBoundedExample4
    2.23 448,896,053.69 4.1% 12.62 LinearizeBoundedExample5
    2.21 453,201,053.94 6.2% 9.74 :wavy_dash: LinearizeBoundedExample6 (Unstable with ~70,645.3 iters. Increase minEpochIterations to e.g. 706453)
    1.88 531,282,530.06 1.7% 11.59 LinearizeBoundedExample7
    1.82 550,292,679.97 1.5% 11.85 LinearizeBoundedExample8
    1.81 553,506,180.26 1.1% 10.68 LinearizeBoundedExample9
    ns/op op/s err% total benchmark
    1,927.91 518,696.06 1.2% 11.39 LinearizeOptimallyExample00
    3,227.99 309,790.35 1.0% 10.99 LinearizeOptimallyExample01
    6,619.50 151,068.72 2.3% 10.17 LinearizeOptimallyExample02
    6,648.74 150,404.54 1.1% 10.67 LinearizeOptimallyExample03
    15,989.38 62,541.52 1.7% 11.11 LinearizeOptimallyExample04
    13,491.31 74,121.78 4.1% 12.15 LinearizeOptimallyExample05
    12,958.96 77,166.66 1.0% 10.63 LinearizeOptimallyExample06
    11,525.25 86,766.02 3.7% 11.77 LinearizeOptimallyExample07
    14,095.38 70,945.21 0.6% 10.80 LinearizeOptimallyExample08
    21,410.11 46,706.90 1.1% 11.46 LinearizeOptimallyExample09
    13,372.29 74,781.50 1.5% 11.06 LinearizeOptimallyExample10
    19,803.65 50,495.75 3.2% 10.79 LinearizeOptimallyExample11
    26,699.86 37,453.38 1.1% 11.15 LinearizeOptimallyExample12
    36,106.95 27,695.50 8.0% 11.03 :wavy_dash: LinearizeOptimallyExample13 (Unstable with ~24,184.5 iters. Increase minEpochIterations to e.g. 241845)
    41,346.30 24,185.96 3.3% 10.49 LinearizeOptimallyExample14
    41,673.85 23,995.86 1.6% 9.89 LinearizeOptimallyExample15
    47,425.14 21,085.87 1.1% 11.07 LinearizeOptimallyExample16
    66,540.78 15,028.38 2.4% 11.26 LinearizeOptimallyExample17
    66,934.73 14,939.93 1.4% 10.54 LinearizeOptimallyExample18
    81,828.99 12,220.61 1.8% 11.44 LinearizeOptimallyExample19

    Almost 10x speedup edit: bench cases changed, so not comparable with master.

  33. sipa commented at 1:54 pm on May 29, 2025: member
    @ismaelsadeeq Not an apples-to-apples comparison I’m afraid, because the benchmark cases are changed in this PR too. Can you run the LinearizeBoundedExample? benchmarks too?
  34. ismaelsadeeq commented at 2:35 pm on May 29, 2025: member

    @ismaelsadeeq Not an apples-to-apples comparison I’m afraid, because the benchmark cases are changed in this PR too

    Oops sorry haven’t take a look, I was hoping I can compare locally myself. I saw a link to the bench results comparing them in the description 👍🏾

    Can you run the LinearizeBoundedExample? benchmarks too?

    Yep I did and updated the bench results. previous one was not pointing to your latest push.

  35. instagibbs commented at 4:28 pm on May 29, 2025: member
    Ran apples to apples out of curiosity. One example ran marginally slower on this PR (but still very fast), many are many multiples faster and some see 500x+ improvements.
  36. gmaxwell commented at 6:47 pm on May 29, 2025: contributor

    @instagibbs I believe the only cases that should be slower with SFL are either very small examples (where their time is irrelevant because its very low per tx compared to bigger clusters) and ones with huge numbers of dependencies. Beyond huge dependencies being generally unrepresentative they’re also expensive to generate because every dependency needs a vin, they also tend to be so slow with both that I guess neither will run to completion in practice.

    A better way to compare across implementations might be some kind of optimization time vs amount of fee required to create a cluster with that dependency geometry (just assuming a 1s/vb feerate or something, not adding up the actual costs in the cluster because another cluster could exist with the same topology but different fees/size). like fee = 16*txn + 41*inputs_consumed + 9*outputs_consumed_within_cluster – so I’d hope to find that SFL always beats the current exponential algorithm on this kind of metric (except for very small inputs).

    The exponential algorithm gets faster with dependency rich clusters because the linearization essentially becomes topologically forced, so the current algorithm doesn’t have many options to consider. If it massive high dependency clusters became common on the network in the future it might make sense to bring back the exponential algorithm for them. :P but I doubt that would happen.

  37. fjahr commented at 8:26 pm on May 30, 2025: contributor

    I’m interested in seeing benchmarks of this on various systems

    $ build/bin/bench_bitcoin –filter=“Linearize.Example.” -min-time=10000

    ns/cost cost/s err% total benchmark
    1.12 893,382,447.85 0.5% 11.06 LinearizeBoundedExample0
    1.34 743,601,759.92 0.7% 11.07 LinearizeBoundedExample1
    1.14 880,692,144.98 0.3% 10.99 LinearizeBoundedExample2
    1.33 749,663,261.00 0.3% 10.98 LinearizeBoundedExample3
    1.24 807,545,779.18 0.5% 10.57 LinearizeBoundedExample4
    1.37 730,046,090.82 0.4% 10.64 LinearizeBoundedExample5
    1.28 779,862,831.19 0.4% 10.66 LinearizeBoundedExample6
    1.17 856,937,081.55 0.4% 10.71 LinearizeBoundedExample7
    1.19 840,624,085.43 0.2% 10.74 LinearizeBoundedExample8
    1.19 840,938,970.51 0.4% 10.74 LinearizeBoundedExample9
    ns/op op/s err% total benchmark
    1,155.37 865,524.35 0.2% 11.00 LinearizeOptimallyExample00
    2,170.00 460,828.59 0.2% 11.04 LinearizeOptimallyExample01
    4,507.21 221,866.90 0.2% 11.01 LinearizeOptimallyExample02
    4,435.85 225,435.87 0.3% 10.96 LinearizeOptimallyExample03
    10,189.93 98,136.12 0.2% 11.02 LinearizeOptimallyExample04
    8,854.57 112,936.09 0.4% 10.79 LinearizeOptimallyExample05
    8,685.08 115,140.05 0.1% 10.82 LinearizeOptimallyExample06
    7,162.17 139,622.57 0.0% 10.64 LinearizeOptimallyExample07
    9,120.88 109,638.49 0.0% 11.00 LinearizeOptimallyExample08
    12,780.73 78,242.79 0.1% 11.00 LinearizeOptimallyExample09
    9,242.70 108,193.53 0.1% 10.98 LinearizeOptimallyExample10
    13,653.81 73,239.62 0.3% 11.04 LinearizeOptimallyExample11
    19,211.80 52,051.34 0.2% 11.00 LinearizeOptimallyExample12
    22,243.75 44,956.45 0.1% 11.00 LinearizeOptimallyExample13
    28,409.81 35,199.11 0.2% 10.99 LinearizeOptimallyExample14
    28,265.23 35,379.16 0.1% 10.99 LinearizeOptimallyExample15
    33,767.58 29,614.20 0.3% 11.02 LinearizeOptimallyExample16
    46,851.80 21,343.90 0.4% 11.03 LinearizeOptimallyExample17
    48,470.35 20,631.17 0.1% 10.98 LinearizeOptimallyExample18
    57,668.81 17,340.40 0.4% 10.98 LinearizeOptimallyExample19
  38. sipa force-pushed on May 31, 2025
  39. sipa commented at 1:24 pm on May 31, 2025: member

    @instagibbs @ismaelsadeeq @gmaxwell FWIW, I didn’t include apples-to-apples comparison because the large-scale comparison benchmarks I did (especially, this one) were just overwhelmingly in favor of SFL over the existing one. The benchmarks were mostly done to decide between SFL and another replacement algorithm, GGT, but see the reasons for choosing SFL here.

    A fair apples-to-apples comparison I think needs to use a dataset large enough to encompass both good and bad cases for each of the algorithms, and not be specifically biased for one of them, which is hard in a benchmark I can include in the PR directly. If people are interested, I can try to clean up the branch I used to create the graphs above, however.

  40. sipa commented at 9:25 pm on May 31, 2025: member

    Thanks for all the benchmarks, here are some graphs created from them to show their relative performance:

    (code & data to generate them is here)

  41. sipa commented at 1:43 am on June 2, 2025: member

    If people are interested in a much larger scale comparison between the old and the new algorithm, checkout https://github.com/sipa/bitcoin/commits/bench_sfl_css, build its bin/bench_bitcoin, and:

    0wget https://bitcoin.sipa.be/clusters/clusters.tgz
    1tar -xzf clusters.tgz
    2for F in clusters_sim2023_large clusters_medium clusters_spanning clusters_bipartite; do time ./build/bin/bench_bitcoin --filter="LinearizeDataSet" <$F >$F.out; done
    3zstd -19 clusters_*.out
    

    And then PR the resulting *.out.zst files to https://github.com/sipa/lin-benches/tree/main/data.

  42. l0rinc commented at 10:21 am on June 4, 2025: contributor

    I’ve run the command on a Raspberry Pi 4 Model B - Cortex-A72 4-Core ARM with SanDisk SSD PLUS 1TB (USB 3.0) with to have data about lower-end hardware as well.

    0git remote add sipa https://github.com/sipa/bitcoin.git || true && git fetch sipa bench_sfl_css && git switch -C bench_sfl_css sipa/bench_sfl_css && \
    1git clean -fxd && git reset --hard && \
    2cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake --build build -j"$(nproc)" && \
    3wget -q https://bitcoin.sipa.be/clusters/clusters.tgz && tar -xzf clusters.tgz && \
    4for F in clusters_sim2023_large clusters_medium clusters_spanning clusters_bipartite; do \
    5    time build/bin/bench_bitcoin --filter="LinearizeDataSet" <"$F" >"$F.out"; \
    6done && \
    7tar -czf clusters.out.tgz clusters_*.out
    

    It took considerably longer than expected (~13h):

    • clusters_sim2023_large: 11h 39m
    • clusters_medium: 26m 33s
    • clusters_spanning: 19m 37s
    • clusters_bipartite: 37m 17s

    Let’s see if I can attach it here: clusters.out.tgz

  43. sipa commented at 12:44 pm on June 4, 2025: member

    I’ve created a repo for benchmark data, and added my results plus @l0rinc’s: https://github.com/sipa/lin-benches/tree/main/data.

    Also, is this with 32-bit or 64-bit userspace?

  44. l0rinc commented at 1:22 pm on June 4, 2025: contributor

    Also, is this with 32-bit or 64-bit userspace?

    64-bit userspace (AArch64)

    0$ file bitcoind
    1bitcoind: ELF 64-bit LSB pie executable, ARM aarch64, version 1 (GNU/Linux), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, BuildID[sha1]=7e059ec01f7460042910ca4ed15270382269c9d5, for GNU/Linux 3.7.0, with debug_info, not stripped
    
    0$ getconf LONG_BIT
    164
    2$ dpkg --print-architecture
    3arm64
    4$ uname -m
    5aarch64
    
  45. sipa force-pushed on Jun 12, 2025
  46. sipa commented at 5:28 pm on June 12, 2025: member
    Rebased on top of #30605.
  47. sipa force-pushed on Jun 12, 2025
  48. sipa force-pushed on Jun 14, 2025
  49. sipa force-pushed on Jul 11, 2025
  50. DrahtBot added the label Needs rebase on Jul 29, 2025
  51. sipa force-pushed on Aug 9, 2025
  52. DrahtBot added the label CI failed on Aug 9, 2025
  53. DrahtBot commented at 4:45 am on August 9, 2025: contributor

    🚧 At least one of the CI tasks failed. Task no wallet, libbitcoinkernel: https://github.com/bitcoin/bitcoin/runs/47723469045 LLM reason (✨ experimental): The build failed due to compilation errors in cluster_linearize.cpp caused by incorrect member access of a tuple, leading to a build error.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  54. DrahtBot removed the label Needs rebase on Aug 9, 2025
  55. sipa force-pushed on Aug 9, 2025
  56. sipa force-pushed on Aug 9, 2025
  57. DrahtBot removed the label CI failed on Aug 9, 2025
  58. sipa marked this as a draft on Sep 11, 2025
  59. sipa commented at 3:32 pm on September 11, 2025: member
    I’m working on some improvements and reorganization of the commit here, so marking it as draft for now.
  60. sipa force-pushed on Sep 11, 2025
  61. sipa force-pushed on Oct 16, 2025
  62. sipa marked this as ready for review on Oct 16, 2025
  63. sipa force-pushed on Oct 16, 2025
  64. sipa force-pushed on Oct 16, 2025
  65. sipa commented at 7:01 pm on October 16, 2025: member

    Rebased, and made a significant change to the SFL algorithm itself:

    • Switched to a different technique for making the initial state topological. The big advantage is that this approach also works when one already has a linearization which is not entirely topological already.

    Also the following changes to the cluster_linearize.h code in general:

    • Using SFL gaining the ability to fix existing linearizations, replaced FixLinearization with just making this feature part of Linearize.
    • With the LIMO-based Linearize, and MergeLinearizations (see below), gone, there is no more need for the LinearizationChunking class.

    And as a result some changes in TxGraph are possible too:

    • Removed the need for FixLinearization by introducing QualityLevel states to represent “may be non-topological”, which makes the next call to Linearize() do the fixing instead.
    • Removed MergeLinearizations, as it was never used.
    • As SFL, upon being fed an existing linearization, runs through an initialization procedure that is pretty much equivalent to PostLinearize, drop all calls to PostLinearize() prior to Linearize() calls.
    • Because the new algorithm randomizes output linearization order (within equal-chunk-feerate parts), undo some of the non-determinism that introduces by always calling PostLinearize() after Linearize() calls (before it would only do this if non-optimal).
  66. sipa commented at 7:10 pm on October 16, 2025: member
    This PR has grown quite a bit in scope (despite still being a net negative in LoC!), so I don’t think it’s unreasonable to split it up into an SFL-specific one, and one with follow-up changes to txgraph, or even further. I’ll wait for reviewer comments.
  67. sipa requested review from Copilot on Oct 20, 2025
  68. in src/cluster_linearize.h:1496 in 52b983bf90 outdated
    1896- *
    1897- * Complexity: possibly O(N * min(max_iterations + N, sqrt(2^N))) where N=depgraph.TxCount().
    1898  */
    1899 template<typename SetType>
    1900-std::tuple<std::vector<DepGraphIndex>, bool, uint64_t> Linearize(const DepGraph<SetType>& depgraph, uint64_t max_iterations, uint64_t rng_seed, std::span<const DepGraphIndex> old_linearization = {}) noexcept
    1901+std::tuple<std::vector<DepGraphIndex>, bool, uint64_t> Linearize(const DepGraph<SetType>& depgraph, uint64_t max_iterations, uint64_t rng_seed, std::span<const DepGraphIndex> old_linearization = {}, bool is_topological = true) noexcept
    


    Copilot commented at 0:30 am on October 20, 2025:
    The new is_topological parameter defaults to true, but when old_linearization is empty, this parameter is ignored. This could be confusing for API consumers. Consider documenting this behavior more explicitly in the function comment, or validate that when old_linearization is empty, is_topological is not set to false.
  69. Copilot commented at 0:30 am on October 20, 2025: none

    Pull Request Overview

    This PR replaces the existing cluster linearization algorithm (LIMO with candidate-set search) with a new spanning-forest linearization (SFL) algorithm that offers significantly better performance for complex clusters. The new algorithm uses a fundamentally different approach based on maintaining spanning trees within chunks and performing split/merge operations.

    Key Changes:

    • Replaces search-based linearization with spanning-forest based approach
    • Simplifies quality level management by introducing NEEDS_FIX and NEEDS_SPLIT_FIX states
    • Updates the ExtractTransactions interface to use two separate visitor callbacks

    Reviewed Changes

    Copilot reviewed 6 out of 7 changed files in this pull request and generated 2 comments.

    File Description
    src/txgraph.cpp Implements quality level changes, updates transaction extraction to use two-pass visitors, modifies merging/splitting logic to work with new algorithm
    src/test/util/cluster_linearize.h Adjusts test utilities for changed type parameters and updated iteration bounds
    src/test/fuzz/txgraph.cpp Updates fuzz tests to use new chunking API
    src/test/fuzz/cluster_linearize.cpp Removes tests for deprecated algorithm components, adds tests for new chunking functions
    src/cluster_linearize.h Replaces entire linearization algorithm implementation with spanning-forest approach, adds ChunkLinearizationInfo function
    src/bench/cluster_linearize.cpp Updates benchmarks to test new algorithm with historical and random cluster examples

    src/test/fuzz/cluster_linearize.cpp:1

    • [nitpick] The variable name done changed from TestBitSet to SetType. While this makes the code more generic, verify that this change aligns with the intended test coverage, as the original code may have been explicitly testing with TestBitSet for specific reasons.
    0// Copyright (c) The Bitcoin Core developers
    

    Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

  70. in src/bench/cluster_linearize.cpp:47 in 52b983bf90 outdated
    65-    });
    66-}
    67-
    68-template<size_t N>
    69-void BenchLinearizeOptimally(benchmark::Bench& bench, const std::array<uint8_t, N>& serialized)
    70+void BenchLinearizeOptimally(benchmark::Bench& bench, const std::vector<uint8_t>& serialized)
    


    Copilot commented at 0:30 am on October 20, 2025:

    [nitpick] The function signature changed from accepting const std::array<uint8_t, N>& to const std::vector<uint8_t>&. This removes compile-time size checking. While this makes the function more flexible, it also removes a safety guarantee. Consider if the added flexibility is worth the loss of compile-time size verification.

    0template <size_t N>
    1void BenchLinearizeOptimally(benchmark::Bench& bench, const std::array<uint8_t, N>& serialized)
    
  71. sipa force-pushed on Oct 26, 2025
  72. sipa force-pushed on Oct 27, 2025
  73. sipa commented at 8:35 am on October 27, 2025: member

    I have added an additional fuzz test (clusterlin_sfl) which verifies properties of the underlying data structure (SpanningForestState) plus sanity checks on it (as opposed to clusterlin_linearize which only tests the higher-level Linearize() function implemented in function of it).

    It’s no longer a net reduction in LoC now :(

  74. sipa force-pushed on Oct 27, 2025
  75. sipa force-pushed on Nov 3, 2025
  76. sipa force-pushed on Nov 4, 2025
  77. sipa force-pushed on Nov 23, 2025
  78. sipa commented at 3:42 pm on November 23, 2025: member
    Rebased on top of #33629.
  79. DrahtBot added the label Needs rebase on Nov 25, 2025
  80. sipa force-pushed on Nov 25, 2025
  81. sipa commented at 12:39 pm on November 25, 2025: member
    Rebased after merge of #33629.
  82. DrahtBot removed the label Needs rebase on Nov 25, 2025
  83. fanquake added this to the milestone 31.0 on Nov 27, 2025
  84. fanquake requested review from instagibbs on Dec 2, 2025
  85. fanquake requested review from sdaftuar on Dec 2, 2025
  86. fanquake requested review from glozow on Dec 2, 2025
  87. in src/cluster_linearize.h:399 in 060e1856b1
    394+        feerate -= other.feerate;
    395+        return *this;
    396+    }
    397+
    398+    /** Compute the difference between this and other SetInfo (which must be a subset). */
    399+    SetInfo operator-(const SetInfo& other) noexcept
    


    marcofleon commented at 6:36 pm on December 3, 2025:
    0    SetInfo operator-(const SetInfo& other) const noexcept
    

    sipa commented at 8:16 pm on December 6, 2025:
    Done.
  88. in src/cluster_linearize.h:850 in 060e1856b1
    946+            auto& tx_data = m_tx_data[tx];
    947+            auto unreached = (DownWard ? tx_data.children : tx_data.parents) - explored;
    948+            while (unreached.Any()) {
    949+                m_cost += 3;
    950+                auto chunk_rep = m_tx_data[unreached.First()].chunk_rep;
    951+                auto& reached = m_tx_data[m_tx_data[unreached.First()].chunk_rep].chunk_setinfo;
    


    marcofleon commented at 6:50 pm on December 3, 2025:

    nit: Is there a reason we recalculate here vs just using chunk_rep from the line above?

    Also additional nit: The local chunk_rep is named the same as the function parameter and both are TxIdx I believe. Could be worth having different names?


    sipa commented at 8:16 pm on December 6, 2025:
    Nice catch, done both.
  89. marcofleon commented at 7:10 pm on December 3, 2025: contributor

    Did a first pass of the changes in cluster_linearize.h, left a couple small comments. I’m running a few of the fuzz targets, including the new one, and I’ll leave those going for a while.

    Still need to look at the tests thoroughly. Let me know if there’s anything specific you think reviewers can do that would be useful.

  90. sipa commented at 10:35 pm on December 3, 2025: member
    FWIW, I’m writing a Python version of SFL with simpler data structures and less optimizations. It needs some polishing, but I’ll post it here. It could be used as a form of documentation, or live in the functional test framework.
  91. instagibbs commented at 2:32 pm on December 4, 2025: member

    glancing through the PR now

    how bad would it be to merge all the way through “clusterlin: replace cluster linearization with SFL implementation (feature)”, maybe some of the commits that drops unused things (ala “clusterlin: remove unused MergeLinearizations (cleanup)”) and defer the other optimisations for next PR?

  92. in src/cluster_linearize.h:474 in f81f72a11e outdated
    677+ * dependencies cannot all simultaneously be active.
    678+ *
    679+ * The sets of transactions that are internally connected by active dependencies are called chunks.
    680+ * Each chunk of N transactions contains exactly N-1 active dependencies (an additional one would
    681+ * necessarily form a cycle), and thus those active dependencies form a spanning tree for the chunk.
    682+ * The collection of all spanning trees for the entire cluster form a spanning forest. Each
    


    instagibbs commented at 7:31 pm on December 4, 2025:
    0 * The collection of all spanning trees for the entire cluster form a spanning forest. In the extreme each
    

    sipa commented at 2:22 pm on December 9, 2025:
    Done.
  93. in src/cluster_linearize.h:481 in f81f72a11e outdated
    683+ * transaction may be in its own chunk (and thus 0 dependencies are active), or all transactions
    684+ * may form a single chunk (and thus N-1 dependencies are active).
    685+ *
    686+ * Each chunk has a feerate: the total fee of all transactions in it divided by the total size of
    687+ * all transactions in it. We say the spanning forest is topological whenever no inactive
    688+ * dependencies exist from one chunk to another chunk with lower or equal feerate. The algorithm
    


    instagibbs commented at 7:35 pm on December 4, 2025:

    or equal

    Need to understand why, delving seems to suggest the more “natural” choice https://delvingbitcoin.org/t/spanning-forest-cluster-linearization/1419 “If there is an inactive dependency, where the parent and child are in distinct chunks, and the child chunk has higher feerate than the parent chunk, make the dependency active, merging the two chunks.”


    sipa commented at 7:38 pm on December 8, 2025:

    If you don’t merge along equal-feerate dependencies, you can end up in a situation with two equal-feerate “chunks” that depend on each other.

    E.g. the AC and BD “chunks” in:

    0graph BT;
    1A["A: 1/1"];
    2B["B: 1/1"];
    3C["C: 2/1"];
    4D["D: 2/1"];
    5C --> A;
    6C -.-> B;
    7D -.-> A;
    8D --> B;
    

    The solution is an initial phase where you don’t care about splitting equal-feerate parts from one another (this PR), and then a secondary phase where you split up chunks in their minimal components (added in #34023).


    instagibbs commented at 2:43 pm on December 9, 2025:
    makes sense, I am just noting it’s not clear at all from the writeup

    sipa commented at 2:48 pm on December 9, 2025:

    Do you mean the write-up in the code comment added here, on the one on Delving?

    Do you think it’s worth elaborating on here?


    instagibbs commented at 2:57 pm on December 9, 2025:
    sorry, I meant the one on delving. I think it’s worth mentioning in the code itself actually.
  94. in src/cluster_linearize.h:525 in f81f72a11e outdated
    727+ * In addition, this allows refining the algorithm flow into:
    728+ *
    729+ * - Construct an initial topological spanning forest for the graph:
    730+ *   - Start with graph with all dependencies inactive (i.e., each transaction is a singleton
    731+ *     chunk).
    732+ *   - Make the graph topological by randomly picking chunks, and merging them (with their
    


    instagibbs commented at 8:07 pm on December 4, 2025:
    do you mean pick a chunk randomly, then observer the dependency/dependee list, and pick the corresponding lowers/highest feerate one for activation?

    sipa commented at 7:50 pm on December 8, 2025:

    Yeah, exactly.

    • Pick a random chunk (from the list you maintain of chunks that need to be considered still)
    • See if it can be merged upwards or downwards, and:
      • If so, merge with its max-feerate-difference dependency/depender, and add the resulting chunk to list.
      • If not, remove the chunk from the list.
  95. in src/test/fuzz/cluster_linearize.cpp:930 in f81f72a11e outdated
    1208+    //
    1209+    if (is_topological) {
    1210+        auto lin = sfl.GetLinearization();
    1211+        // Which must be valid.
    1212+        SanityCheck(depgraph, lin);
    1213+        // If we started from a topological input, the resulting feerate diagram cannot be worse.
    


    instagibbs commented at 8:40 pm on December 4, 2025:
    0        // If we started from a topological input, the resulting feerate diagram cannot be worse or incomparable.
    

    sipa commented at 2:22 pm on December 9, 2025:
    Done.
  96. in src/test/fuzz/cluster_linearize.cpp:745 in 85cff48b78 outdated
    738@@ -799,14 +739,11 @@ FUZZ_TARGET(clusterlin_simple_finder)
    739     // AncestorCandidateFinder it is being tested against.
    


    instagibbs commented at 8:44 pm on December 4, 2025:

    85cff48b782538ed5a1c980015dd01cb32986b66

    straggling AncestorCandidateFinder reference


    sipa commented at 2:22 pm on December 9, 2025:
    Fixed.
  97. in src/cluster_linearize.h:1124 in f81f72a11e outdated
    1119+                    const auto& dep_data = m_dep_data[dep_idx];
    1120+                    if (!dep_data.active) continue;
    1121+                    // Skip if this dependency is ineligible (the top chunk that would be created
    1122+                    // does not have higher feerate than the chunk it is currently part of).
    1123+                    if (!(dep_data.top_setinfo.feerate >> chunk_data.chunk_setinfo.feerate)) continue;
    1124+                    // Activate it otherwise.
    


    instagibbs commented at 6:55 pm on December 5, 2025:
    0                    // De-activate it otherwise.
    

    sipa commented at 2:22 pm on December 9, 2025:
    Done.
  98. sipa force-pushed on Dec 6, 2025
  99. sipa commented at 8:18 pm on December 6, 2025: member

    I have reorganized this PR, dropping most optimizations in favor of a future follow-up.

    Addressed comments, added more explanations, and made a few code simplifications.

    A python version of the algorithm, lacking most optimizations and using simpler data structures than the C++ version, can be found in https://gist.github.com/sipa/822f60db6608a26bb4aae75fd31bcb12 (in sfl.py).

  100. sipa force-pushed on Dec 6, 2025
  101. DrahtBot added the label CI failed on Dec 6, 2025
  102. DrahtBot commented at 8:52 pm on December 6, 2025: contributor

    🚧 At least one of the CI tasks failed. Task Windows native, fuzz, VS 2022: https://github.com/bitcoin/bitcoin/actions/runs/19993713242/job/57338160248 LLM reason (✨ experimental): Fuzz txgraph failed due to an assertion failure in cluster_linearize (transactions overlapping in SetInfo|=), causing exit code 1.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  103. sipa force-pushed on Dec 6, 2025
  104. sipa force-pushed on Dec 7, 2025
  105. sipa force-pushed on Dec 7, 2025
  106. sipa force-pushed on Dec 7, 2025
  107. DrahtBot removed the label CI failed on Dec 7, 2025
  108. sipa commented at 10:52 pm on December 7, 2025: member
    Now a proper repository for the Python version: https://github.com/sipa/pyclusterlin
  109. clusterlin: add known-correct optimal linearization tests (tests) 81a95d340c
  110. clusterlin: replace benchmarks with SFL-hard ones (bench)
    This also adds a per-cost variant of each.
    84fa6ff310
  111. clusterlin: add class implementing SFL state (preparation)
    This adds a data structure representing the optimization state for the spanning-forest
    linearization algorithm (SFL), plus a fuzz test for its correctness.
    
    This is preparation for switching over Linearize() to use this algorithm.
    
    See https://delvingbitcoin.org/t/spanning-forest-cluster-linearization/1419 for
    a description of the algorithm.
    a21a2451c6
  112. clusterlin: replace cluster linearization with SFL (feature)
    This replaces the existing LIMO linearization algorithm (which internally uses
    ancestor set finding and candidate set finding) with the much more performant
    spanning-forest linearization algorithm.
    
    This removes the old candidate-set search algorithm, and several of its tests,
    benchmarks, and needed utility code.
    8f0828ea51
  113. clusterlin: keep FIFO queue of improvable chunks (preparation)
    This introduces a queue of chunks that still need processing, in both
    MakeTopological() and OptimizationStep(). This is simultaneously:
    * A preparation for introducing randomization, by allowing permuting the
      queue.
    * An improvement to the fairness of suboptimal solutions, by distributing
      the work more fairly over chunks.
    * An optimization, by avoiding retrying chunks over and over again which
      are already known to be optimal.
    7e299020e2
  114. clusterlin: randomize various decisions in SFL (feature)
    This introduces a local RNG inside the SFL state, which is used to randomize
    various decisions inside the algorithm, in order to make it hard to create
    pathological clusters which predictably have bad performance.
    
    The decisions being randomized are:
    * When deciding what chunk to attempt to split, the queue order is
      randomized.
    * When deciding which dependency to split on, a uniformly random one is
      chosen among those with higher top feerate than bottom feerate within
      the chosen chunk.
    * When deciding which chunks to merge, a uniformly random one among those
      with the higher feerate difference is picked.
    * When merging two chunks, a uniformly random dependency between them is
      now activated.
    * When making the state topological, the queue of chunks to process is
      randomized.
    490a296640
  115. clusterlin: randomize equal-feerate parts of linearization (privacy)
    This places equal-feerate chunks (with no dependencies between them) in random
    order in the linearization output, hiding information about DepGraph insertion
    order from the output. Likewise, it randomizes the order of transactions within
    chunks for the same reason.
    7d56ebb135
  116. clusterlin: remove unused MergeLinearizations (cleanup)
    This ended up never being used in txgraph.
    bfe7c83cc8
  117. clusterlin: drop support for improvable chunking (simplification)
    With MergeLinearizations() gone and the LIMO-based Linearize() replaced by SFL, we do not
    need a class (LinearizationChunking) that can maintain an incrementally-improving chunk
    set anymore.
    
    Replace it with a function (ChunkLinearizationInfo) that just computes the chunks as
    SetInfos once, and returns them as a vector. This simplifies several call sites too.
    d56ddf821d
  118. sipa force-pushed on Dec 9, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-12-10 03:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me