cluster mempool: cluster linearization algorithm #30126

pull sipa wants to merge 12 commits into bitcoin:master from sipa:202405_clusterlin changing 9 files +2142 −0
  1. sipa commented at 8:29 pm on May 16, 2024: member

    Part of cluster mempool: #30289

    This introduces low-level cluster linearization code, including tests and some benchmarks. It is currently not hooked up to anything.

    Ultimately, what this PR adds is a function Linearize which operates on instances of DepGraph (instances of which represent pre-processed transaction clusters) to produce and/or improve linearizations for that cluster.

    To provide assurance, the code heavily relies on fuzz tests. A novel approach is used here, where the fuzz input is parsed using the serialization.h framework rather than FuzzedDataProvider, with a custom serializer/deserializer for DepGraph objects. By including serialization, it’s possible to ascertain that the format can represent every relevant cluster, as well as potentially permitting the construction of ad-hoc fuzz inputs from clusters (not included in this PR, but used during development).


    The Linearize(depgraph, iteration_limit, rng_seed, old_linearization) function is an implementation of the (single) LIMO algorithm, with the $S$ in every iteration found as the best out of (a) the best remaining ancestor set and (b) randomized computationally-bounded search. It incrementally builds up a linearization by finding good topologically-valid subsets to move to the front, in such a way that the resulting linearization has a diagram that is at least as good as the old_linearization passed in (if any).

    • Despite using both best ancestor set and search, this is not Double LIMO, as no intersections between these are involved; just the best of the two.
    • The iteration_limit and rng_seed only control the (b) randomized search. Even with 0 iterations, the result will be as good as the old linearization, and the included sets at every point will have a feerate at least as high as the best remaining ancestor set at that point.

    The search algorithm used in the (b) step is very basic, and largely matches Section 2.1 of How to Linearize your Cluster.. See #30286 for optimizations to make it more efficient.

    For background and references, see Introduction to cluster linearization.

  2. DrahtBot commented at 8:29 pm on May 16, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK instagibbs, glozow, sdaftuar

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30286 (cluster mempool: optimized candidate search by sipa)
    • #30285 (cluster mempool: merging & postprocessing of linearizations by sipa)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label CI failed on May 16, 2024
  4. DrahtBot commented at 10:28 pm on May 16, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/25072594213

  5. sipa force-pushed on May 17, 2024
  6. sipa force-pushed on May 17, 2024
  7. sipa force-pushed on May 17, 2024
  8. sipa force-pushed on May 17, 2024
  9. sipa force-pushed on May 17, 2024
  10. DrahtBot removed the label CI failed on May 17, 2024
  11. laanwj added the label Mempool on May 17, 2024
  12. sipa force-pushed on May 20, 2024
  13. sipa force-pushed on May 20, 2024
  14. DrahtBot added the label CI failed on May 20, 2024
  15. DrahtBot removed the label CI failed on May 20, 2024
  16. sipa force-pushed on May 20, 2024
  17. sipa commented at 6:53 pm on May 20, 2024: member

    Benchmarks on my Ryzen 5950X system:

    ns/op op/s err% total benchmark
    2,373.94 421,240.11 0.1% 1.10 LinearizeNoIters16TxWorstCase
    7,530.22 132,798.26 0.0% 1.07 LinearizeNoIters32TxWorstCase
    16,585.34 60,294.20 0.1% 1.10 LinearizeNoIters48TxWorstCase
    28,591.70 34,975.18 0.1% 1.10 LinearizeNoIters64TxWorstCase
    53,918.56 18,546.49 0.0% 1.10 LinearizeNoIters75TxWorstCase
    93,589.21 10,684.99 0.1% 1.10 LinearizeNoIters99TxWorstCase
    ns/iters iters/s err% total benchmark
    45.36 22,045,550.98 0.5% 1.10 LinearizePerIter16TxWorstCase
    35.57 28,111,376.58 0.1% 1.10 LinearizePerIter32TxWorstCase
    33.04 30,262,951.89 0.0% 1.10 LinearizePerIter48TxWorstCase
    33.21 30,107,745.17 0.1% 1.10 LinearizePerIter64TxWorstCase
    75.98 13,161,530.63 0.4% 1.07 LinearizePerIter75TxWorstCase
    76.62 13,051,066.77 0.5% 1.08 LinearizePerIter99TxWorstCase
    ns/op op/s err% total benchmark
    332.97 3,003,274.74 0.0% 1.10 PostLinearize16TxWorstCase
    1,121.92 891,330.77 0.0% 1.10 PostLinearize32TxWorstCase
    3,358.33 297,767.01 0.3% 1.13 PostLinearize48TxWorstCase
    5,826.72 171,623.05 0.5% 1.11 PostLinearize64TxWorstCase
    7,453.31 134,168.55 0.1% 1.07 PostLinearize75TxWorstCase
    12,476.44 80,151.09 0.1% 1.10 PostLinearize99TxWorstCase

    This means that for a 64-transaction cluster, it should be possible to linearize (28.59 µs) with 100 candidate search iterations (3.32 µs) plus postlinearize (5.83 µs), within a total of 37.74 µs, on my system.

  18. sipa force-pushed on May 20, 2024
  19. sipa added the label DrahtBot Guix build requested on May 21, 2024
  20. in src/util/bitset.h:15 in 316e2044aa outdated
    10+#include <cstdint>
    11+#include <limits>
    12+#include <type_traits>
    13+
    14+#ifdef _MSC_VER
    15+#  include <intrin.h>
    


    theuni commented at 2:55 pm on May 23, 2024:
    Why is this needed?

    sipa commented at 3:16 pm on May 23, 2024:
    Remnant of a long-lost pre-C++20 past. Gone.
  21. sipa force-pushed on May 23, 2024
  22. sipa commented at 3:18 pm on May 23, 2024: member
    I’ve dropped the dependency on #29625, and switched to using FastRandomContext instead; there is a measurable slowdown from using the (ChaCha20-based) FastRandomContext over the (xoroshiro128++-based) InsecureRandomContext introduced there, but it’s no more than 1-2%. I can switch back to that approach if 29625 were to make it in.
  23. sipa force-pushed on May 23, 2024
  24. sipa removed the label DrahtBot Guix build requested on May 23, 2024
  25. sipa force-pushed on May 23, 2024
  26. DrahtBot commented at 6:33 pm on May 23, 2024: contributor

    Guix builds (on x86_64) [untrusted test-only build, possibly unsafe, not for production use]

    File commit 83ae1bac9d3e01de994734b4bc7002deaf19bc67(master) commit e5cbc2372efc917784b6cae4175b8653a20517c4(master and this pull)
    SHA256SUMS.part 24fd016e03e8c7da... 15fae3483445e33b...
    *-aarch64-linux-gnu-debug.tar.gz 94942cf7dedf3604... 23eeccf77ee5799d...
    *-aarch64-linux-gnu.tar.gz 4b30ca93b6788f48... ed8e5024d960f53e...
    *-arm-linux-gnueabihf-debug.tar.gz a0f57c45e5f02bb1... f22f89c1eba49dda...
    *-arm-linux-gnueabihf.tar.gz 9f0376baaf54b988... 17da8a968635c492...
    *-arm64-apple-darwin-unsigned.tar.gz 9b952b32db70d099... 16d805ab4bcf8d54...
    *-arm64-apple-darwin-unsigned.zip d49361bbbc5529fc... e225d79a24b058a5...
    *-arm64-apple-darwin.tar.gz 34e9cf4b79cbc190... 29b28e6d57761201...
    *-powerpc64-linux-gnu-debug.tar.gz 5f322a7b213e244e... cb5f37b036b5c52c...
    *-powerpc64-linux-gnu.tar.gz bb57b46482c5b1e6... 57adf954458a27d5...
    *-riscv64-linux-gnu-debug.tar.gz d1a3a405c5b45fff... 237eb467f8547d22...
    *-riscv64-linux-gnu.tar.gz 68d7e6671e2dba30... 29d9f1e9052e96d3...
    *-x86_64-apple-darwin-unsigned.tar.gz 6fb22000e8c14c40... 67e5bd5b86483c8a...
    *-x86_64-apple-darwin-unsigned.zip 1c5f2a216e87cbf5... abdbca97fafc146f...
    *-x86_64-apple-darwin.tar.gz 66f17a574163ecaf... f002830b4b8da330...
    *-x86_64-linux-gnu-debug.tar.gz a5044f956a824228... 0791685e39e80672...
    *-x86_64-linux-gnu.tar.gz 23af1dc6cb921b37... ff9625165c3f19c2...
    *.tar.gz caac4a182deb1e04... ba8abeef4165dafb...
    guix_build.log c7cc0190f7085f04... 100da60c2f0e6686...
    guix_build.log.diff 7c460aa3b1aafc32...
  27. in src/cluster_linearize.h:825 in f6deff9f4c outdated
    510+        auto best = anc_finder.FindCandidateSet();
    511+
    512+        // Invoke bounded search to update best, with up to half of our remaining iterations as
    513+        // limit.
    514+        uint64_t iterations = (iteration_limit + 1) / 2;
    515+        iteration_limit -= iterations;
    


    instagibbs commented at 2:59 pm on May 24, 2024:
    in/out args for iterations is kinda spooky and made these few lines really hard to read, could FindCandidateSet just return a std::pair or similar?

    sipa commented at 9:16 pm on June 12, 2024:
    Done. I’ve encapsulated the std::pair<SetType, FeeFrac> for representing sets with their feerates into a new SetInfo<SetType> type, and changed the return type of candidate finders which count iterations to be std::pair<SetInfo<SetType>, uint64_t>, representing the found candidate and the number of performed iterations. I’ve also changed the Linearization functions to just return a bool optimal, rather than an iteration count (as it wasn’t used anywhere anyway).
  28. sipa force-pushed on May 24, 2024
  29. sr-gi commented at 4:49 pm on May 24, 2024: member

    Some more benchmarks.

    MacBook M3 Max:

    ns/op op/s err% total benchmark
    1,678.30 595,841.56 0.4% 1.11 LinearizeNoIters16TxWorstCase
    4,975.16 200,998.49 0.1% 1.10 LinearizeNoIters32TxWorstCase
    10,624.85 94,118.94 0.1% 1.10 LinearizeNoIters48TxWorstCase
    18,570.63 53,848.46 0.1% 1.10 LinearizeNoIters64TxWorstCase
    38,962.09 25,665.97 0.1% 1.10 LinearizeNoIters75TxWorstCase
    65,933.96 15,166.69 0.1% 1.06 LinearizeNoIters99TxWorstCase
    ns/iters iters/s err% total benchmark
    37.31 26,803,657.99 0.1% 1.10 LinearizePerIter16TxWorstCase
    26.62 37,561,209.12 0.2% 1.10 LinearizePerIter32TxWorstCase
    25.79 38,768,485.22 0.2% 1.10 LinearizePerIter48TxWorstCase
    26.05 38,392,847.23 0.1% 1.10 LinearizePerIter64TxWorstCase
    37.99 26,322,699.89 0.2% 1.09 LinearizePerIter75TxWorstCase
    38.21 26,169,082.76 0.1% 1.10 LinearizePerIter99TxWorstCase
    ns/op op/s err% total benchmark
    375.76 2,661,273.20 0.3% 1.11 PostLinearize16TxWorstCase
    1,295.00 772,200.32 0.1% 1.10 PostLinearize32TxWorstCase
    2,850.98 350,756.38 0.0% 1.10 PostLinearize48TxWorstCase
    5,155.30 193,975.14 0.0% 1.10 PostLinearize64TxWorstCase
    6,804.84 146,954.16 0.1% 1.06 PostLinearize75TxWorstCase
    11,696.22 85,497.68 0.0% 1.10 PostLinearize99TxWorstCase

    Raspberry Pi 4 (arm32):

    ns/op op/s err% ins/op cyc/op IPC total benchmark
    24,979.12 40,033.43 0.1% 62,303.01 37,432.31 1.664 1.10 LinearizeNoIters16TxWorstCase
    79,796.77 12,531.84 0.0% 199,042.69 119,579.10 1.665 1.08 LinearizeNoIters32TxWorstCase
    206,049.75 4,853.20 0.0% 514,152.74 308,732.11 1.665 1.10 LinearizeNoIters48TxWorstCase
    339,314.40 2,947.12 0.0% 866,686.12 508,414.12 1.705 1.10 LinearizeNoIters64TxWorstCase
    574,288.94 1,741.28 0.0% 1,309,819.21 860,366.98 1.522 1.10 LinearizeNoIters75TxWorstCase
    983,808.56 1,016.46 0.0% 2,278,204.68 1,474,149.12 1.545 1.10 LinearizeNoIters99TxWorstCase
    ns/iters iters/s err% ins/iters cyc/iters IPC total benchmark
    362.94 2,755,255.49 0.1% 862.03 543.89 1.585 1.10 LinearizePerIter16TxWorstCase
    290.11 3,446,974.76 0.2% 692.02 434.73 1.592 1.10 LinearizePerIter32TxWorstCase
    529.35 1,889,094.80 0.1% 1,319.02 793.26 1.663 1.11 LinearizePerIter48TxWorstCase
    530.00 1,886,788.68 0.2% 1,324.44 794.26 1.668 1.11 LinearizePerIter64TxWorstCase
    782.35 1,278,197.39 0.5% 2,079.81 1,172.37 1.774 1.06 LinearizePerIter75TxWorstCase
    737.95 1,355,112.46 0.5% 1,684.40 1,105.59 1.524 1.07 LinearizePerIter99TxWorstCase
    ns/op op/s err% ins/op cyc/op IPC total benchmark
    5,010.36 199,586.29 0.8% 10,606.00 7,506.09 1.413 1.10 PostLinearize16TxWorstCase
    14,365.91 69,609.26 0.0% 34,330.00 21,530.90 1.594 1.09 PostLinearize32TxWorstCase
    32,751.91 30,532.57 0.0% 76,671.01 49,085.98 1.562 1.10 PostLinearize48TxWorstCase
    55,794.83 17,922.81 0.0% 131,423.02 83,621.11 1.572 1.10 PostLinearize64TxWorstCase
    81,534.00 12,264.82 1.6% 210,784.02 120,341.56 1.752 1.09 PostLinearize75TxWorstCase
    144,665.53 6,912.50 0.0% 399,573.04 216,805.74 1.843 1.10 PostLinearize99TxWorstCase

    Raspberry Pi 4 (aarch64):

    ns/op op/s err% ins/op cyc/op IPC total benchmark
    13,765.93 72,643.09 0.1% 37,001.00 20,625.64 1.794 11.01 LinearizeNoIters16TxWorstCase
    41,091.94 24,335.67 0.2% 107,447.00 61,567.66 1.745 10.99 LinearizeNoIters32TxWorstCase
    73,005.68 13,697.56 0.1% 210,986.00 109,380.97 1.929 10.67 LinearizeNoIters48TxWorstCase
    118,418.69 8,444.61 0.0% 348,914.00 177,423.52 1.967 11.01 LinearizeNoIters64TxWorstCase
    229,345.84 4,360.23 0.0% 618,749.67 343,589.10 1.801 11.01 LinearizeNoIters75TxWorstCase
    377,809.11 2,646.84 0.0% 1,021,644.01 565,996.91 1.805 11.00 LinearizeNoIters99TxWorstCase
    ns/iters iters/s err% ins/iters cyc/iters IPC total benchmark
    231.99 4,310,566.43 0.0% 643.14 347.59 1.850 11.00 LinearizePerIter16TxWorstCase
    190.47 5,250,207.80 0.1% 525.81 285.35 1.843 11.03 LinearizePerIter32TxWorstCase
    175.35 5,702,969.82 0.1% 492.10 262.75 1.873 11.01 LinearizePerIter48TxWorstCase
    175.70 5,691,481.30 0.1% 494.56 263.25 1.879 11.00 LinearizePerIter64TxWorstCase
    293.22 3,410,384.42 0.2% 763.38 439.30 1.738 11.01 LinearizePerIter75TxWorstCase
    293.55 3,406,519.94 0.1% 766.88 439.77 1.744 11.01 LinearizePerIter99TxWorstCase
    ns/op op/s err% ins/op cyc/op IPC total benchmark
    2,915.23 343,026.59 0.1% 5,325.00 4,368.47 1.219 11.01 PostLinearize16TxWorstCase
    9,830.26 101,726.71 0.0% 17,247.00 14,729.95 1.171 11.00 PostLinearize32TxWorstCase
    20,550.14 48,661.46 0.0% 35,742.00 30,794.09 1.161 11.00 PostLinearize48TxWorstCase
    35,211.73 28,399.62 0.0% 61,199.00 52,763.93 1.160 11.00 PostLinearize64TxWorstCase
    31,801.73 31,444.83 0.0% 95,378.00 47,653.77 2.001 11.00 PostLinearize75TxWorstCase
    52,955.80 18,883.67 0.0% 162,443.00 79,349.66 2.047 11.00 PostLinearize99TxWorstCase
  30. sipa force-pushed on May 24, 2024
  31. sipa force-pushed on May 24, 2024
  32. sipa force-pushed on May 24, 2024
  33. DrahtBot added the label CI failed on May 24, 2024
  34. DrahtBot commented at 5:37 pm on May 24, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/25390396918

  35. instagibbs commented at 6:26 pm on May 24, 2024: member

    Ancestor-set based presplitting inside the search is replaced with using the best ancestor set as one of the LIMO set choices.

    Can this be elaborated a bit? I’ve taken a few minutes looking at the commits and it wasn’t immediately clear how this all maps to existing public discussions and the comments:

    0        // This is an implementation of the (single) LIMO algorithm:
    1        // https://delvingbitcoin.org/t/limo-combining-the-best-parts-of-linearization-search-and-merging/825
    2        // where S is instantiated to be the result of a bounded search, which itself is seeded
    3        // with the best prefix of what remains of the input linearization, or the best ancestor set.
    

    thanks!

  36. sipa commented at 7:05 pm on May 24, 2024: member
    @instagibbs I’ve expanded the explanation in the PR description. Happy to elaborate more if things are unclear.
  37. DrahtBot removed the label CI failed on May 24, 2024
  38. sipa force-pushed on May 25, 2024
  39. sipa force-pushed on May 25, 2024
  40. sipa force-pushed on May 26, 2024
  41. sipa force-pushed on May 26, 2024
  42. sipa force-pushed on May 28, 2024
  43. sipa force-pushed on May 29, 2024
  44. sipa commented at 8:14 pm on May 29, 2024: member
    I’ve added support for merging linearizations to this PR (MergeLinearizations() function), plus benchmarks and tests.
  45. DrahtBot added the label CI failed on May 30, 2024
  46. DrahtBot commented at 5:55 am on May 30, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/25568469819

  47. sipa force-pushed on May 30, 2024
  48. DrahtBot removed the label CI failed on May 30, 2024
  49. in src/cluster_linearize.h:48 in 8170a81e71 outdated
    43+        S ancestors;
    44+        /** All descendants of the transaction (including itself). */
    45+        S descendants;
    46+
    47+        friend bool operator==(const Entry&, const Entry&) noexcept = default;
    48+        friend auto operator<=>(const Entry&, const Entry&) noexcept = default;
    


    instagibbs commented at 2:31 pm on May 31, 2024:
    For this operator are S ancestors and S descendants ordered? Is it just compared via m_val for BitSets? Is ordering by these elements meaningful?

    sipa commented at 9:41 pm on June 7, 2024:
    I think any attempt to invoke this comparison operator would just have failed, as BitSet doesn’t have an operator<=> (it did at some point, but I had dropped it). I’ve just removed the operator here too.
  50. in src/cluster_linearize.h:78 in 8170a81e71 outdated
    70+     *
    71+     * Complexity: O(N) where N=ntx.
    72+     **/
    73+    explicit DepGraph(ClusterIndex ntx) noexcept
    74+    {
    75+        entries.resize(ntx);
    


    instagibbs commented at 3:04 pm on May 31, 2024:

    it crashes once it hits S::Singleton if violated but maybe an earlier crash is more immediately informative

    0        Assume(ntx <= S::Size());
    1        entries.resize(ntx);
    

    sipa commented at 9:17 pm on June 12, 2024:
    Done.
  51. in src/cluster_linearize.h:126 in 8170a81e71 outdated
    118+     *
    119+     * Complexity: O(N^2) where N=depgraph.TxCount().
    120+     */
    121+    DepGraph(const DepGraph<S>& depgraph, Span<const ClusterIndex> mapping) noexcept : entries(depgraph.TxCount())
    122+    {
    123+        // Fill in fee, size, ancestors.
    


    instagibbs commented at 3:13 pm on May 31, 2024:

    to avoid an OOB access regression

    0        Assert(mapping.size() == depgraph.TxCount());
    1        // Fill in fee, size, ancestors.
    

    sipa commented at 9:17 pm on June 12, 2024:
    Done.
  52. in src/cluster_linearize.h:34 in 8170a81e71 outdated
    29+/** Data type to represent transaction indices in clusters. */
    30+using ClusterIndex = uint32_t;
    31+
    32+/** Data structure that holds a transaction graph's preprocessed data (fee, size, ancestors,
    33+ *  descendants). */
    34+template<typename S>
    


    instagibbs commented at 4:16 pm on May 31, 2024:

    not super important but can we use a more descriptive typename since our use-case is pretty narrow? I’m mentally substituting S everywhere with bitset which is a fair mental leap and it’s relatively difficult to grep for being so short.

    S_bitset? I don’t know why the conventions are the way they are :sweat_smile:


    sipa commented at 6:54 pm on June 7, 2024:
    I’ve renamed it to SetType everywhere. While it’s expected to be instantiated with BitSet, that’s not a necessity, and I expect that for certain temporarily-very-large clusters in TxGraph we’ll need some algorithms instantiated by another type (something using an std::set or a sorted vector, for example).
  53. in src/cluster_linearize.h:581 in 8170a81e71 outdated
    553+                // split on.
    554+                if (und.None()) return;
    555+            }
    556+
    557+            // Actually construct new work item on the queue.
    558+            Assume(queue.size() < queue.capacity());
    


    instagibbs commented at 5:29 pm on June 4, 2024:
    is this for ensuring no allocations are going to take place? How do we know this?

    sipa commented at 9:17 pm on June 12, 2024:
    Indeed. I’ve added a comment. LMK whether it’s clearer now.

    instagibbs commented at 6:43 pm on June 24, 2024:
    yes, took me a while to convince myself, but the claim is clear
  54. instagibbs commented at 2:13 pm on June 6, 2024: member
    glanced in the context of checking what of VecDeque/BitSet actually required
  55. sipa force-pushed on Jun 6, 2024
  56. glozow referenced this in commit feab35189b on Jun 7, 2024
  57. DrahtBot added the label Needs rebase on Jun 7, 2024
  58. sipa force-pushed on Jun 7, 2024
  59. sipa force-pushed on Jun 7, 2024
  60. DrahtBot removed the label Needs rebase on Jun 7, 2024
  61. sipa force-pushed on Jun 7, 2024
  62. sipa force-pushed on Jun 10, 2024
  63. sipa force-pushed on Jun 11, 2024
  64. DrahtBot added the label CI failed on Jun 11, 2024
  65. DrahtBot commented at 5:11 am on June 11, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/26052313359

  66. sipa force-pushed on Jun 11, 2024
  67. DrahtBot removed the label CI failed on Jun 11, 2024
  68. sipa force-pushed on Jun 11, 2024
  69. achow101 referenced this in commit 91e0beede2 on Jun 11, 2024
  70. DrahtBot added the label Needs rebase on Jun 11, 2024
  71. sipa force-pushed on Jun 11, 2024
  72. DrahtBot removed the label Needs rebase on Jun 12, 2024
  73. sipa force-pushed on Jun 12, 2024
  74. sipa force-pushed on Jun 12, 2024
  75. DrahtBot added the label CI failed on Jun 13, 2024
  76. DrahtBot commented at 8:57 am on June 13, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/26153096652

  77. sipa force-pushed on Jun 13, 2024
  78. sipa force-pushed on Jun 13, 2024
  79. in src/test/fuzz/cluster_linearize.cpp:525 in 7dfe49ea5d outdated
    261+    // Construct a cluster of a certain length, with no dependencies.
    262+    Cluster<TestBitSet> cluster;
    263+    FuzzedDataProvider provider(buffer.data(), buffer.size());
    264+    auto num_tx = provider.ConsumeIntegralInRange<ClusterIndex>(2, 32);
    265+    cluster.resize(num_tx);
    266+    for (auto& item : cluster) item.first.size = 1;
    


    instagibbs commented at 3:43 pm on June 13, 2024:
    Cluster<TestBitSet> cluster(num_tx, std::make_pair(FeeFrac{0, 1}, TestBitSet())); should work?

    sipa commented at 9:57 pm on June 27, 2024:
    Done.
  80. in src/test/fuzz/cluster_linearize.cpp:529 in 7dfe49ea5d outdated
    265+    cluster.resize(num_tx);
    266+    for (auto& item : cluster) item.first.size = 1;
    267+    // Construct the corresponding DepGraph object (also no dependencies).
    268+    DepGraph depgraph(cluster);
    269+    SanityCheck(depgraph);
    270+    // Read (parent, child) pairs, and add them to the cluster and txgraph.
    


    instagibbs commented at 3:51 pm on June 13, 2024:
    0    // Read (parent, child) pairs, and add them to the cluster and depgraph.
    

    sipa commented at 9:57 pm on June 27, 2024:
    Done.
  81. in src/cluster_linearize.h:151 in 06c600099a outdated
    136+     *
    137+     * Complexity: O(N) where N=TxCount().
    138+     **/
    139+    void AddDependency(ClusterIndex parent, ClusterIndex child) noexcept
    140+    {
    141+        // To each ancestor of the parent, add as descendants the descendants of the child.
    


    instagibbs commented at 4:00 pm on June 13, 2024:
    Should we Assume()/short-circuit if parent == child

    sipa commented at 9:48 pm on June 27, 2024:
    I don’t think there is a need for that. If parent == child then this is a no-op (every transaction is already an ancestor and descendant of itself).

    ismaelsadeeq commented at 11:01 am on July 15, 2024:
    maybe just return early in that case and avoid the looping?

    sipa commented at 11:14 pm on July 16, 2024:
  82. in src/cluster_linearize.h:131 in 06c600099a outdated
    121+    const SetType& Descendants(ClusterIndex i) const noexcept { return entries[i].descendants; }
    122+
    123+    /** Add a new unconnected transaction to this transaction graph (at the end), and return its
    124+     *  ClusterIndex.
    125+     *
    126+     * Complexity: Amortized O(1).
    


    instagibbs commented at 4:03 pm on June 13, 2024:
    Where does the amortization come in?

    sipa commented at 9:57 pm on June 27, 2024:
    Added a comment.
  83. in src/test/fuzz/cluster_linearize.cpp:530 in 7dfe49ea5d outdated
    266+    for (auto& item : cluster) item.first.size = 1;
    267+    // Construct the corresponding DepGraph object (also no dependencies).
    268+    DepGraph depgraph(cluster);
    269+    SanityCheck(depgraph);
    270+    // Read (parent, child) pairs, and add them to the cluster and txgraph.
    271+    LIMITED_WHILE(provider.remaining_bytes() > 0, 1024) {
    


    instagibbs commented at 4:15 pm on June 13, 2024:
    0    LIMITED_WHILE(provider.remaining_bytes() > 0, TestBitSet::Size() * TestBitSet::Size()) {
    

    sipa commented at 9:57 pm on June 27, 2024:
    Done.
  84. sipa force-pushed on Jun 13, 2024
  85. DrahtBot removed the label CI failed on Jun 13, 2024
  86. sipa force-pushed on Jun 13, 2024
  87. sipa force-pushed on Jun 14, 2024
  88. sipa force-pushed on Jun 14, 2024
  89. in src/cluster_linearize.h:339 in 626b247e21 outdated
    237+                m_ancestor_set_feerates[j] -= feerate;
    238+            }
    239+        }
    240+    }
    241+
    242+    /** Find the best remaining ancestor set. Unlinearized transactions must remain.
    


    glozow commented at 1:06 pm on June 14, 2024:

    In the docs, maybe instead of “best remaining ancestor set”, say “highest feerate ancestor set” ?

    Side note - I was thinking that this was a replica of the ancestor set algo in BlockAssembler but this isn’t the minimum of individual and ancestor feerate.


    sipa commented at 8:40 pm on July 8, 2024:

    I’ve changed it to “Find the best (highest-feerate, smallest among those in case of a tie) ancestor set among the remaining transactions.”, because “best” is a bit more specific than highest-feerate.

    Regarding the side note: that rule in BlockAssembler is an optimization, as it avoids looking at child transactions with higher ancestor feerate, because the ancestors will have been picked earlier anyway. The same optimization could be made in AncestorCandidateFinder, but because it’s an $\mathcal{O}(n)$ algorithm that iterates over all transactions once anyway, there is nothing to be gained from doing so.

  90. sipa renamed this:
    Low-level cluster linearization code
    cluster mempool: cluster linearization algorithm
    on Jun 14, 2024
  91. sipa commented at 1:17 pm on June 14, 2024: member
    I have split off the optimizations for candidate search to PR #30286, and the merging & postprocessing algorithms to PR #30285, and renamed the PR. It is now focused on just adding the Linearize() function, with its eventual interface (including passing in an old linearization, and a randomization seed), but without optimizations beyond that.
  92. in src/test/fuzz/cluster_linearize.cpp:197 in e5ba6ece9b outdated
    149+        depgraph = {};
    150+        while (true) {
    151+            // Read size. Size 0 signifies the end of the DepGraph.
    152+            int32_t size;
    153+            s >> VARINT_MODE(size, VarIntMode::NONNEGATIVE_SIGNED);
    154+            size &= 0x3FFFFF; // Enough for size up to 4M.
    


    instagibbs commented at 6:00 pm on June 14, 2024:
    could leave an assert to this effect right after this

    sipa commented at 9:58 pm on June 27, 2024:
    Done with a static_assert.
  93. in src/test/fuzz/cluster_linearize.cpp:202 in e5ba6ece9b outdated
    154+            size &= 0x3FFFFF; // Enough for size up to 4M.
    155+            if (size == 0 || depgraph.TxCount() == SetType::Size()) break;
    156+            // Read fee, encoded as a signed varint (odd means negative, even means non-negative).
    157+            uint64_t coded_fee;
    158+            s >> VARINT(coded_fee);
    159+            coded_fee &= 0xFFFFFFFFFFFFF; // Enough for fee between -21M...21M BTC.
    


    instagibbs commented at 6:00 pm on June 14, 2024:
    could leave an assert to this effect right after this

    sipa commented at 9:57 pm on June 27, 2024:
    Done with a static_assert.
  94. in src/cluster_linearize.h:449 in 86b341f549 outdated
    346+        /** Local copy of the iteration limit. */
    347+        uint64_t iterations_left = max_iterations;
    348+
    349+        /** Internal function to add a work item.
    350+         *
    351+         * - inc: the "inc" value for the new work item
    


    glozow commented at 1:17 pm on June 18, 2024:
    maybe mention that caller must ensure inc includes Ancestors(inc)

    sipa commented at 8:41 pm on July 8, 2024:
    I’ve added a comment that inc needs to be topological (a term which is defined in the WorkItem definition above).
  95. in src/cluster_linearize.h:406 in 626b247e21 outdated
    210+            FeeFrac anc_feerate;
    211+            // Reuse accumulated feerate from first ancestor, if usable.
    212+            Assume(anc_to_add.Any());
    213+            ClusterIndex first = anc_to_add.First();
    214+            if (first < i) {
    215+                anc_feerate = m_ancestor_set_feerates[first];
    


    instagibbs commented at 6:04 pm on June 18, 2024:
    0                Assume(!m_ancestor_set_feerates[first].IsEmpty());
    1                anc_feerate = m_ancestor_set_feerates[first];
    

    sipa commented at 9:57 pm on June 27, 2024:
    Done.
  96. in src/cluster_linearize.h:453 in 626b247e21 outdated
    245+     */
    246+    SetInfo<SetType> FindCandidateSet() const noexcept
    247+    {
    248+        std::optional<ClusterIndex> best;
    249+        for (auto i : m_todo) {
    250+            if (best.has_value()) {
    


    instagibbs commented at 6:46 pm on June 18, 2024:

    belt and suspenders nit prior to a comparison

    0            if (best.has_value()) {
    1                Assume(!m_ancestor_set_feerates[i].IsEmpty());
    

    sipa commented at 7:21 pm on June 28, 2024:
    Done.
  97. in src/test/fuzz/cluster_linearize.cpp:382 in 626b247e21 outdated
    405+        // Find a topologically valid subset of transactions to remove from the graph.
    406+        auto del_set = ReadTopologicalSet(depgraph, todo, reader);
    407+        // If we did not find anything, use best_anc itself, because we should remove something.
    408+        if (del_set.None()) del_set = best_anc.transactions;
    409+        todo -= del_set;
    410+        anc_finder.MarkDone(del_set);
    


    instagibbs commented at 7:26 pm on June 18, 2024:
    it’d be worth it to have del_set be sometimes overcomplete, including random subsets of ~todo which should be handled internally by being dropped. Alternatively it could be disallowed via Assume()?

    sipa commented at 7:21 pm on June 28, 2024:
    I’ve outlawed MarkDone() with select not a subset of todo.
  98. in src/test/fuzz/cluster_linearize.cpp:64 in 86b341f549 outdated
    232+        }
    233+        return best;
    234+    }
    235+};
    236+
    237+/** A simple finder class for candidate sets. */
    


    instagibbs commented at 7:33 pm on June 18, 2024:
    A top-level description of what this is useful for would be appreciated.

    sipa commented at 7:22 pm on June 28, 2024:
    Added, please have a look if this is what you meant.
  99. in src/test/fuzz/cluster_linearize.cpp:459 in 86b341f549 outdated
    582+            }
    583+        }
    584+
    585+        // Find a topologically valid subset of transactions to remove from the graph.
    586+        auto del_set = ReadTopologicalSet(depgraph, todo, reader);
    587+        // If we did not find anything, use found_set itself, because we should remove something.
    


    instagibbs commented at 7:45 pm on June 18, 2024:
    found_set doesn’t exist

    sipa commented at 7:22 pm on June 28, 2024:
    Fixed.
  100. in src/test/fuzz/cluster_linearize.cpp:74 in 86b341f549 outdated
    242+    const DepGraph<SetType>& m_depgraph;
    243+    /** Which transaction are left to include. */
    244+    SetType m_todo;
    245+
    246+public:
    247+    /** Construct an SimpleOptimalCandidateFinder for a given graph. */
    


    instagibbs commented at 7:48 pm on June 18, 2024:
    SimpleCandidateFinder

    sipa commented at 7:22 pm on June 28, 2024:
    Fixed.
  101. in src/test/fuzz/cluster_linearize.cpp:33 in 86b341f549 outdated
    201+    const DepGraph<SetType>& m_depgraph;
    202+    /** Which transaction are left to include. */
    203+    SetType m_todo;
    204+
    205+public:
    206+    /** Construct an SimpleOptimalCandidateFinder for a given graph. */
    


    instagibbs commented at 7:48 pm on June 18, 2024:
    ExhaustiveCandidateFinder

    sipa commented at 7:22 pm on June 28, 2024:
    Fixed.
  102. in src/test/fuzz/cluster_linearize.cpp:438 in 86b341f549 outdated
    561+        assert(iterations_done <= ((uint64_t{1} << todo.Count()) - 1));
    562+
    563+        // Perform quality checks only if SearchCandidateFinder claims an optimal result.
    564+        if (iterations_done < max_iterations) {
    565+            // Compare with SimpleCandidateFinder.
    566+            auto [simple, simple_iters] = smp_finder.FindCandidateSet(0x3ffff);
    


    instagibbs commented at 8:00 pm on June 18, 2024:
    please consti-fy 0x3ffff since it’s use 3x

    sipa commented at 7:22 pm on June 28, 2024:
    Done, named it MAX_SIMPLE_ITERATIONS (and used more readable number 300000).
  103. sipa force-pushed on Jun 20, 2024
  104. sipa commented at 11:09 pm on June 20, 2024: member
    Added a big description about the fuzzer serialization format for DepGraph objects.
  105. in src/test/fuzz/cluster_linearize.cpp:74 in 4fbc4687d9 outdated
    69+bool CanAddDependency(const DepGraph<SetType>& depgraph, ClusterIndex parent, ClusterIndex child) noexcept
    70+{
    71+    // If child is already a descendant of parent, the dependency would be redundant.
    72+    if (depgraph.Descendants(parent)[child]) return false;
    73+    // If child is already an ancestor of parent, the dependency would cause a cycle.
    74+    if (depgraph.Ancestors(parent)[child]) return false;
    


    instagibbs commented at 3:59 pm on June 21, 2024:
    useful note in code would be this is the only critical check to ensure fuzzer doesn’t generate a cyclical graph, and the other checks are optimizations?

    sipa commented at 7:22 pm on June 28, 2024:
    Done.
  106. in src/test/fuzz/cluster_linearize.cpp:75 in 4fbc4687d9 outdated
    70+{
    71+    // If child is already a descendant of parent, the dependency would be redundant.
    72+    if (depgraph.Descendants(parent)[child]) return false;
    73+    // If child is already an ancestor of parent, the dependency would cause a cycle.
    74+    if (depgraph.Ancestors(parent)[child]) return false;
    75+    // If there is an ancestor of parent which is a direct parent of a descendant of child,
    


    instagibbs commented at 4:46 pm on June 21, 2024:

    I can’t make an example that causes this clause to fail which hinders my understanding of what it’s doing.

    This is where having static unit test for this case firing would be helpful to step through.


    sipa commented at 7:23 pm on June 28, 2024:
    See new unit test.

    sipa commented at 2:09 pm on July 8, 2024:
    Actually, all this code is gone now.
  107. in src/test/fuzz/cluster_linearize.cpp:446 in 68a6c5daa3 outdated
    611+
    612+            // Compare with AncestorCandidateFinder;
    613+            auto anc = anc_finder.FindCandidateSet();
    614+            assert(found.feerate >= anc.feerate);
    615+
    616+            // If todo isn't too big, compare with ExhaustiveCandidateFinder.
    


    instagibbs commented at 5:10 pm on June 21, 2024:
    If we’re checking against exhaustive we can avoid using the other finders?

    sipa commented at 7:23 pm on June 28, 2024:

    I’d rather not. The point of ExhaustiveCandidateFinder is really testing the correctness of SimpleCandidateFinder (whose correctness may not be obvious to reviewers), so that SimpleCandidateFinder on its turn can be used to test SearchCandidateFinder. I added comments to explain that.

    Both of these are done within the same fuzz test, which is perhaps confusing? I could split up the tests (an exhaustive-vs-simple test, and a simple-vs-search test).

  108. in src/test/fuzz/cluster_linearize.cpp:88 in 68a6c5daa3 outdated
    329+                    if (new_inc.feerate > best.feerate) best = new_inc;
    330+                    break;
    331+                }
    332+            }
    333+        }
    334+        return {std::move(best), max_iterations - iterations_left};
    


    instagibbs commented at 5:36 pm on June 21, 2024:
    double-checking: In this and other candidate set finders it seems like max_iterations - iterations_left could be zero even if it was optimal on the very last step?

    sipa commented at 7:23 pm on June 28, 2024:
    Indeed. I don’t think 1 iteration matters.

    glozow commented at 2:15 pm on July 8, 2024:
    And I suppose it’s not problematic if it is optimal but didn’t claim to be so?

    instagibbs commented at 5:15 pm on July 8, 2024:
    I’m assuming that it’s only used for stronger invariants checks in the fuzz/test harness

    sipa commented at 8:46 pm on July 8, 2024:
    @glozow Indeed. Even if this function was changed to return an explicit optimal, it would remain quite possible that it returns the optimal without knowing it’s optimal (because it’s possible that the best passed in for example was already optimal, but it requires a ton of iterations to exhaust the search space to prove nothing better exists). @instagibbs Well I think we may want to cache in the mempool clusters whether or not the linearization for them is known to be optimal too, so that a hypothetical background cluster-improver thread could know to skip it.
  109. in src/cluster_linearize.h:398 in 68a6c5daa3 outdated
    314+     * Complexity: O(N * min(max_iterations, 2^N)) where N=depgraph.TxCount().
    315+     */
    316+    std::pair<SetInfo<SetType>, uint64_t> FindCandidateSet(uint64_t max_iterations, SetInfo<SetType> best) noexcept
    317+    {
    318+        // Bail out quickly if we're given a (remaining) cluster that is empty.
    319+        if (m_todo.None()) return {};
    


    instagibbs commented at 6:08 pm on June 21, 2024:
    I think this needs coverage or should be disallowed entirely

    sipa commented at 7:23 pm on June 28, 2024:
    Disallowed.
  110. in src/test/fuzz/cluster_linearize.cpp:104 in 68a6c5daa3 outdated
    314+            auto [inc, und] = queue.back();
    315+            queue.pop_back();
    316+            // Look for a transaction to consider adding/removing.
    317+            bool inc_none = inc.None();
    318+            for (auto pivot : und) {
    319+                // If inc is empty, consider any pivot. Otherwise only consider transactions
    


    instagibbs commented at 6:22 pm on June 21, 2024:
    can we call this split rather than pivot since it only occurs here in the codebase?

    sipa commented at 7:23 pm on June 28, 2024:
    Done.
  111. in src/cluster_linearize.h:573 in 68a6c5daa3 outdated
    355+            if (!inc.feerate.IsEmpty()) {
    356+                // If inc's feerate is better than best's, remember it as our new best.
    357+                if (inc.feerate > best.feerate) {
    358+                    best = inc;
    359+                }
    360+            }
    


    instagibbs commented at 6:32 pm on June 21, 2024:
    0            } else {
    1                Assume(inc.transactions.None())
    2            }
    

    sipa commented at 7:24 pm on June 28, 2024:
    Done.
  112. in src/test/fuzz/cluster_linearize.cpp:541 in a34467001c outdated
    716+    }
    717+
    718+    // If Linearize claims optimal result, run quality tests.
    719+    if (optimal) {
    720+        // It must be as good as SimpleLinearize.
    721+        auto [simple_linearization, simple_optimal] = SimpleLinearize(depgraph, 0x3ffff);
    


    instagibbs commented at 7:53 pm on June 21, 2024:
    0x3ffff should be explained

    sipa commented at 0:56 am on July 1, 2024:
    Done, named as MAX_SIMPLE_ITERATIONS.
  113. in src/bench/clusterlin.cpp:98 in 8733d2dbcd outdated
    89+void BenchLinearizeNoItersWorstCase(ClusterIndex ntx, benchmark::Bench& bench)
    90+{
    91+    const auto depgraph = MakeLinearGraph<SetType>(ntx);
    92+    bench.run([&] {
    93+        // Do 10 iterations just to make sure some of that logic is executed, but this is
    94+        // effectively negligible.
    


    instagibbs commented at 8:22 pm on June 21, 2024:
    a little confused what this is trying to say

    sipa commented at 7:24 pm on June 28, 2024:
    I was trying to explain why using 10 iterations instead of 0 made sense. I have instead just changed it to 0 iterations.
  114. in src/bench/clusterlin.cpp:106 in 8733d2dbcd outdated
     97+    });
     98+}
     99+
    100+} // namespace
    101+
    102+static void LinearizePerIter16TxWorstCase(benchmark::Bench& bench) { BenchLinearizePerIterWorstCase<BitSet<16>>(16, bench); }
    


    instagibbs commented at 8:25 pm on June 21, 2024:
    if each benchmark matches the BitSet::Size(), no need for ntx args?

    sipa commented at 7:20 pm on June 28, 2024:
    BitSet<N>::Size() may differ from N.
  115. in src/bench/clusterlin.cpp:90 in 8733d2dbcd outdated
    84+    });
    85+}
    86+
    87+/** Benchmark for linearization of a trivial linear graph using just ancestor sort. */
    88+template<typename SetType>
    89+void BenchLinearizeNoItersWorstCase(ClusterIndex ntx, benchmark::Bench& bench)
    


    instagibbs commented at 8:27 pm on June 21, 2024:
    having trouble understanding what PerIter and NoIter mean

    sipa commented at 0:57 am on July 1, 2024:
    I’ve added comments.
  116. in src/bench/clusterlin.cpp:27 in 8733d2dbcd outdated
    22+        if (i > 0) depgraph.AddDependency(i - 1, i);
    23+    }
    24+    return depgraph;
    25+}
    26+
    27+// Construct a difficult graph. These need at least sqrt(2^(n-1)) iterations in the best
    


    instagibbs commented at 3:09 pm on June 24, 2024:
    Can you add a description of how it’s inherently difficult and what these two types of topologies are achieving?

    sipa commented at 0:57 am on July 1, 2024:
    I have no idea! Added a comment that it was found empirically.
  117. in src/cluster_linearize.h:548 in aacb67d914 outdated
    351@@ -351,7 +352,8 @@ class SearchCandidateFinder
    352         };
    353 
    354         /** The queue of work items. */
    355-        std::vector<WorkItem> queue;
    356+        VecDeque<WorkItem> queue;
    357+        queue.reserve(std::max<size_t>(256, 2 * m_todo.Count()));
    


    instagibbs commented at 4:18 pm on June 24, 2024:

    I don’t think 256 is ever exceeded in bench(99)/fuzz(32) cases?

    IIUC If the other value were ever exercised, it has to at least be m_todo.Count() + 1


    sipa commented at 8:21 pm on July 2, 2024:
    The way the bound was computed was inaccurate; I have fixed that (256 does get reached (but not exceeded) now in the bench(99) benchmark if you increase the iteration count to 100000). I’ve also rewritten the comments and computation a bit.

    instagibbs commented at 7:57 pm on July 15, 2024:
    I’m unsure how this was addressed

    sipa commented at 11:12 pm on July 16, 2024:
    It’s not clear to me what you’re asking for. Are you asking what I changed to address this, or saying that it’s still an issue, or saying that more explanatory comments are needed?

    instagibbs commented at 2:35 pm on July 17, 2024:

    The way the bound was computed was inaccurate; I have fixed that

    I’m unsure where the mis-estimate was or how it was fixed.


    sipa commented at 3:31 pm on July 17, 2024:

    The current condition (for using DFS) is:

    0while (queue.size() - 1 + queue.front().und.Count() > queue.capacity()) {
    

    In the original version of this PR it was:

    0const auto queuesize_for_front = queue.capacity() - queue.front().und.Count();
    1...
    2while (queue.size() > queuesize_for_front) {
    

    There is an offset 1 difference between the two; it didn’t take into account that choosing BFS would first delete an entry from the queue before adding new ones.

  118. in src/cluster_linearize.h:524 in aacb67d914 outdated
    428+        // The approach here combines the two: use BFS until the queue grows too large, at which
    429+        // point we temporarily switch to DFS until the size shrinks again.
    430         while (!queue.empty()) {
    431+            // See if processing the first queue item (BFS) is possible without exceeding the queue
    432+            // capacity(), assuming we process the last queue items (DFS) after that.
    433+            const auto queuesize_for_front = queue.capacity() - queue.front().und.Count();
    


    instagibbs commented at 6:40 pm on June 24, 2024:
    I believe this is all correct and understand reasoning for flow, but I don’t really understand what queuesize_for_front name is meant to convey. It took me a while to convince myself this section was correct fwiw

    sipa commented at 7:29 pm on June 28, 2024:
    It means “what the queue size needs to be before we can process an element from the front”.

    sipa commented at 8:21 pm on July 2, 2024:
    I have rewritten this.
  119. in src/bench/clusterlin.cpp:77 in 2228113dbb outdated
    76@@ -77,8 +77,9 @@ void BenchLinearizePerIterWorstCase(ClusterIndex ntx, benchmark::Bench& bench)
    77 {
    


    instagibbs commented at 6:53 pm on June 24, 2024:

    2228113dbbdd356aaaad385fd3e46a71308392aa commit message nit:

    s/item,/item to/


    sipa commented at 0:57 am on July 1, 2024:
    Done.
  120. in src/cluster_linearize.h:498 in 2228113dbb outdated
    494@@ -467,6 +495,7 @@ class SearchCandidateFinder
    495  *
    496  * @param[in] depgraph        Dependency graph of the the cluster to be linearized.
    497  * @param[in] max_iterations  Upper bound on the number of optimization steps that will be done.
    498+ * @param[in] rng_seed        A random number seed to control search order.
    


    instagibbs commented at 6:55 pm on June 24, 2024:
    I presume the non-determinism is helpful to reduce the influence other nodes have on your linearization? Or maybe the idea that you could run linearizatio algo intermittently and make improvements? Other motivations?

    sipa commented at 0:42 am on July 1, 2024:
    Yeah, it is to reduce the ability for peers to construct cases that just happen to be worst case for the exact search order they know we’re going to try.

    sipa commented at 0:57 am on July 1, 2024:
    Added a comment to Linearize to reflect this.
  121. in src/cluster_linearize.h:691 in be82e86931 outdated
    522+std::pair<std::vector<ClusterIndex>, uint64_t> Linearize(const DepGraph<SetType>& depgraph, uint64_t max_iterations, uint64_t rng_seed, Span<const ClusterIndex> old_linearization = {}) noexcept
    523 {
    524     uint64_t iterations_left = max_iterations;
    525     auto todo = SetType::Fill(depgraph.TxCount());
    526     std::vector<ClusterIndex> linearization;
    527 
    


    instagibbs commented at 2:08 pm on June 25, 2024:
    0
    1    Assume(old_linearization.empty() || old_linearization.size() == depgraph.TxCount());
    

    sipa commented at 0:57 am on July 1, 2024:
    Done.
  122. in src/cluster_linearize.h:617 in be82e86931 outdated
    560+                }
    561+            }
    562+        }
    563+
    564+        // Then initialize best to be either the best remaining ancestor set, or the first chunk.
    565+        auto best_anc = anc_finder.FindCandidateSet();
    


    instagibbs commented at 2:28 pm on June 25, 2024:
    best_anc seems unneeded since it’s only read once next line?

    sipa commented at 8:21 pm on July 2, 2024:
    Gone.
  123. in src/cluster_linearize.h:687 in be82e86931 outdated
    562@@ -527,6 +563,24 @@ std::pair<std::vector<ClusterIndex>, uint64_t> Linearize(const DepGraph<SetType>
    563 
    564         if (iterations_done_now == max_iterations_now) {
    565             optimal = false;
    566+            // If the search result is not (guaranteed to be) optimal, run intersections to make
    567+            // sure we don't pick something that makes us unable to reach further diagram points
    568+            // of the old linearization.
    569+            if (best.transactions != best_prefix.transactions) {
    


    instagibbs commented at 2:36 pm on June 25, 2024:
    I think this section bears further elaboration, or direct citation to a good explanation of what this is necessary for correctness.

    sipa commented at 8:22 pm on July 2, 2024:
    I think my in-progress delving post will cover this. I will postpone addressing this until that’s out.

  124. in src/test/fuzz/cluster_linearize.cpp:215 in be82e86931 outdated
    468+template<typename BS>
    469+std::vector<ClusterIndex> ReadLinearization(const DepGraph<BS>& depgraph, SpanReader& reader)
    470+{
    471+    std::vector<ClusterIndex> linearization;
    472+    TestBitSet todo = TestBitSet::Fill(depgraph.TxCount());
    473+    for (ClusterIndex i = 0; i < depgraph.TxCount(); ++i) {
    


    instagibbs commented at 4:17 pm on June 25, 2024:

    i is never used, which was confusing for me, perhaps this?

    0    while (todo.Any()) {
    

    along with asserting that todo.Count() is going down for each iteration of the loop to avoid infinite loops ala assert(todo.Count() < todo_count);?


    sipa commented at 8:23 pm on July 2, 2024:
    Done. I’ve added an assert(todo[j]); before todo.Reset(j); to make sure infinite iteration is not possible.
  125. in src/cluster_linearize.h:169 in 9633b40722 outdated
    165@@ -166,6 +166,45 @@ class DepGraph
    166         return ret;
    167     }
    168 
    169+    /** Find some connected component within the subset "left" of this graph.
    


    instagibbs commented at 4:39 pm on June 25, 2024:
    I kept reading “Left” and is left vs right, since it was taking the First bit. “todo” probably matches better.

    sipa commented at 8:23 pm on July 2, 2024:
    Done.
  126. in src/test/fuzz/cluster_linearize.cpp:172 in 9633b40722 outdated
    433+{
    434+    auto todo = BS::Fill(depgraph.TxCount());
    435+    auto comp = depgraph.FindConnectedComponent(todo);
    436+    todo -= comp;
    437+    while (todo.Any()) {
    438+        auto nextcomp = depgraph.FindConnectedComponent(todo);
    


    instagibbs commented at 4:57 pm on June 25, 2024:
    0        Assume(!depgraph.IsConnected());
    1        auto nextcomp = depgraph.FindConnectedComponent(todo);
    

    sipa commented at 8:23 pm on July 2, 2024:
    Done.
  127. in src/test/fuzz/cluster_linearize.cpp:279 in 4fbc4687d9 outdated
    348+
    349+    // Construct dependency graph. The sanity check here includes a round-trip check.
    350+    DepGraph depgraph(cluster);
    351+    SanityCheck(depgraph);
    352+
    353+    // Verify that ancestry is computed correctly.
    


    instagibbs commented at 3:01 pm on June 27, 2024:
    for assurance shouldn’t we also be checking the descendants are being computed as expected? Or is that covered somewhere else?

    sipa commented at 8:19 pm on July 2, 2024:
    That’s covered in SanityCheck.

    sdaftuar commented at 7:49 pm on July 3, 2024:
    I think SanityCheck is already checking that the ancestors and descendants are consistent with each other, so it’s sufficient here to check that the ancestors are correct.
  128. instagibbs commented at 3:02 pm on June 27, 2024: member

    couldn’t find any logic issues in the parts I complete understood, I’ve run the fuzzer on each commit and reviewed through https://github.com/bitcoin/bitcoin/pull/30126/commits/9633b40722fd9295b93baaf9914b31b9dec96f45

    a couple general comments:

    1. The bespoke DepGraphFormatter is pretty artisanal and as a reviewer I’m relying heavily on the cluster->depgraph->check depgraph matches -> serialization round-trip for asserting correctness. Unit tests-as-documentation is one thing I’m kind of missing, especially with the more involved parts.
    2. Matching the actual LIMO commit with the LIMO literature/theory available is difficult for me(e.g., what’s sufficient vs necessary). I know there’s a new writeup coming, so I’ll wait for that before investigating further.
  129. sipa force-pushed on Jun 27, 2024
  130. sipa commented at 9:59 pm on June 27, 2024: member
    Addressed some of @instagibbs’ comments, which involved moving some of the src/test/fuzz/cluster_linearize.cpp code to a common src/test/util/cluster_linearize.h, where it is also available to a newly-added (but pretty barebones) src/test/cluster_linearize_tests.cpp.
  131. DrahtBot commented at 11:27 pm on June 27, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/26783773668

  132. DrahtBot added the label CI failed on Jun 27, 2024
  133. sipa force-pushed on Jun 28, 2024
  134. sipa force-pushed on Jun 28, 2024
  135. DrahtBot removed the label CI failed on Jun 28, 2024
  136. sipa force-pushed on Jun 28, 2024
  137. sipa force-pushed on Jul 1, 2024
  138. in src/Makefile.am:132 in 2ab345e52c outdated
    131@@ -132,6 +132,7 @@ BITCOIN_CORE_H = \
    132   chainparamsseeds.h \
    


    sdaftuar commented at 3:27 pm on July 1, 2024:

    Nit: noticed a couple typos in the commit message, if you end up touching this again:

    0This primarily adds the DepGraph class, which encapsulated precomputed
    

    encapsulates

    0number of a utility features (inspectors for set feerates, computing
    

    a


    sipa commented at 4:36 am on July 8, 2024:
    Fixed.
  139. in src/cluster_linearize.h:49 in 2ab345e52c outdated
    38+        SetType ancestors;
    39+        /** All descendants of the transaction (including itself). */
    40+        SetType descendants;
    41+
    42+        /** Equality operator. */
    43+        friend bool operator==(const Entry&, const Entry&) noexcept = default;
    


    sdaftuar commented at 3:41 pm on July 1, 2024:

    Is this only used for verifying whether two DepGraph objects are identical, eg in the serialization/deserialization tests in the next commit?

    It occurred to me that there might be some contexts where an equality operator for two Entry objects might want to be defined differently – eg because you could have two Entry’s in the same DepGraph with identical feerate and ancestor/descendant state, but still treat them as different objects – but I’m guessing that is not the case for any of the uses currently contemplated?


    ismaelsadeeq commented at 11:42 am on July 5, 2024:
    Entry struct does not have any other distinguishing information, are you suggesting adding txid to Entry?

    sipa commented at 4:38 am on July 8, 2024:

    From the perspective of a DepGraph, transactions are identified by their position in the entry vector (which matches their position in the Cluster they came from). DepGraph’s equality tests whether every transaction in every position matches; there is nothing else it knows about.

    I’ve added a comment to DepGraph::operator== to indicate it’s primarily for testing.

  140. DrahtBot added the label CI failed on Jul 2, 2024
  141. DrahtBot removed the label CI failed on Jul 2, 2024
  142. in src/cluster_linearize.h:138 in 2ab345e52c outdated
    125+     *
    126+     * Complexity: O(1) (amortized, due to resizing of backing vector).
    127+     */
    128+    ClusterIndex AddTransaction(const FeeFrac& feefrac) noexcept
    129+    {
    130+        ClusterIndex new_idx = TxCount();
    


    sdaftuar commented at 3:25 pm on July 2, 2024:
    Should this function be checking that we haven’t exceeded the capacity of the SetType?

    sipa commented at 4:38 am on July 8, 2024:
    Added an Assume for that.
  143. in src/test/util/cluster_linearize.h:79 in ecfbe2ce7a outdated
    74+ *   - The fee: VARINT(SignedToUnsigned(tx[t].fee)), see below for SignedToUnsigned.
    75+ *   - The dependencies: for each minimized parent and minimized child of t among tx[0..t-1]:
    76+ *     - VARINT(delta), which cannot be 0.
    77+ *       To determine these values, consider the list of all potential parents and children tx t
    78+ *       has among tx[0..t-1]. First the parents, in order from t-1 back to 0, and then the
    79+ *       children in the same order. For these, we only consider ones that satisyfy
    


    sdaftuar commented at 4:01 pm on July 2, 2024:
    nit: “satisyfy” –> “satisfy”

    sipa commented at 4:38 am on July 8, 2024:
    Fixed.
  144. in src/test/util/cluster_linearize.h:176 in ecfbe2ce7a outdated
    171+                // In loop 0 store parents among tx 0..idx-1; in loop 1 store children among those.
    172+                SetType towrite = loop ? GetReducedChildren(depgraph, idx) : GetReducedParents(depgraph, idx);
    173+                for (ClusterIndex i = 0; i < idx; ++i) {
    174+                    ClusterIndex parent = loop ? idx : idx - 1 - i;
    175+                    ClusterIndex child = loop ? idx - 1 - i : idx;
    176+                    if (CanAddDependency(rebuild, parent, child)) {
    


    sdaftuar commented at 4:20 pm on July 2, 2024:

    I was really confused about why we need to use CanAddDependency here – as far as I can understand, it’s basically about compression, so that we get shorter serializations? And conceptually, the reason compression is helpful is in the context of serializations that are produced by a fuzzer being as meaningful as possible?

    I guess compression doesn’t hurt here, but just to verify my understanding: would all this logic work just fine if we dropped the use of CanAddDependency in the serializer, but left it in place in the deserializer? Or are there cases where that difference would result in the round trip of serializing/deserializing to not result in the same graph? (And if so, could you provide such an example so I can try to better understand this logic?)


    sipa commented at 9:19 pm on July 3, 2024:

    Yes, it is needed, for the simple reason that the serializer needs to predict what the deserializer state will be when reading this, or they will go out of sync. If they disagree about which dependencies are possible, the delta-encoded differences between them will be off, resulting in possibly widely different DepGraphs on decoding. An example where this matters is in the 4th unit test (specifically constructed for that, suggested by @instagibbs).

    I am considering a simplification for the encoding format however, but I’m not sure yet if it’ll be an improvement. I’ll try pushing a commit that switches things over in the next few days. Perhaps it’s worth holding off reviewing the encoding format and associated logic until then.


    sipa commented at 4:39 am on July 8, 2024:
    I’ve pushed this simplification; it works a lot different now, but CanAddDependency, GetReducedParents, and GetReducedChildren are gone.
  145. in src/test/util/cluster_linearize.h:160 in ecfbe2ce7a outdated
    155+    }
    156+
    157+    template <typename Stream, typename SetType>
    158+    static void Ser(Stream& s, const DepGraph<SetType>& depgraph)
    159+    {
    160+        /** The graph corresponding to what the deserializer already knows. */
    


    sdaftuar commented at 4:22 pm on July 2, 2024:
    Should this say “serializer” instead of “deserializer”?

    sipa commented at 10:41 pm on July 7, 2024:
    No, the serializer is serializing the this object; the rebuild object represents what the deserializer already knows when it has deserialized up to whatever has been serialized already.

    sipa commented at 7:38 pm on July 8, 2024:
    This code is gone.
  146. in src/test/util/cluster_linearize.h:102 in ecfbe2ce7a outdated
     97+ *   absolute value. This way we can encode negative fees in graphs, but still let small negative
     98+ *   numbers have small encodings.
     99+ * - Why are the parents/children emitted in order from t-1 back to 0? This means that if E is the
    100+ *   encoding of a subgraph with no outside dependencies, copies of E in the serialization (in the
    101+ *   right places) will result in copies of that subgraph.
    102+ * - Why use CanAddDependency in the serialization definition? This makes sure that every variation
    


    sdaftuar commented at 4:29 pm on July 2, 2024:
    Left another comment down below regarding CanAddDependency() in the serializer – it makes sense to me to use it in the deserializer in the context of having a fuzzer produce random bytes which we try to deserialize into a graph, but I don’t follow why it’s necessary in the serializer.

    sipa commented at 4:39 am on July 8, 2024:
    It’s gone.
  147. in src/test/cluster_linearize_tests.cpp:64 in ecfbe2ce7a outdated
    59+    TestDepGraphSerialization<TestBitSet>({{{0, 1}, {}}}, "010000");
    60+
    61+    // Transactions: A(fee=42,size=11), B(fee=-13,size=7), B depends on A.
    62+    TestDepGraphSerialization<TestBitSet>(
    63+        {{{42, 11}, {}}, {{-13, 7}, {0}}},
    64+        "0b5407190100");
    


    sdaftuar commented at 4:46 pm on July 2, 2024:
    Is the best way to review this code to manually check that the encoding is as described in the serialization code?

    sipa commented at 4:40 am on July 8, 2024:
    With the format change, I’ve documented every byte of the encoding in the tests here.
  148. in src/test/cluster_linearize_tests.cpp:71 in ecfbe2ce7a outdated
    66+    // Transactions: A(64,128), B(128,256), C(1,1), C depends on A and B.
    67+    TestDepGraphSerialization<TestBitSet>(
    68+        {{{64, 128}, {}}, {{128, 256}, {}}, {{1, 1}, {0, 1}}},
    69+        "8000800081008100000102010100");
    70+
    71+    // Transactions: A(-58,113), B(36,114), C(-59,115), D(37,116). Deps: B->A, C->A, D->C, in order
    


    sdaftuar commented at 6:12 pm on July 2, 2024:
    Nit: The fee/size values in this comment don’t seem to match the fee/size values in the test below.

    sipa commented at 4:40 am on July 8, 2024:
    Good catch, fixed.
  149. sipa force-pushed on Jul 2, 2024
  150. in src/cluster_linearize.h:5 in ca6f49dd0d outdated
    4@@ -5,6 +5,7 @@
    5 #ifndef BITCOIN_CLUSTER_LINEARIZE_H
    


    sdaftuar commented at 8:03 pm on July 3, 2024:

    nit: if you touch up this commit, I think the commit message has a couple typos:

    This is a class that encapsulated precomputes ancestor set feerates, and

    should read “encapsulates precomputed ancestor set feerates”


    sipa commented at 4:40 am on July 8, 2024:
    Fixed.
  151. sdaftuar commented at 8:23 pm on July 3, 2024: member
    Just reviewed through the first 3 commits so far.
  152. in src/cluster_linearize.h:51 in e277976004 outdated
    46+        Entry() noexcept = default;
    47+        /** Construct an entry with a given feerate, ancestor set, descendant set. */
    48+        Entry(const FeeFrac& f, const SetType& a, const SetType& d) noexcept : feerate(f), ancestors(a), descendants(d) {}
    49+    };
    50+
    51+    /** Data for each transaction, in order. */
    


    ismaelsadeeq commented at 11:42 am on July 5, 2024:
    which order?

    sipa commented at 4:40 am on July 8, 2024:
    Added a comment.
  153. sipa force-pushed on Jul 8, 2024
  154. in src/test/fuzz/cluster_linearize.cpp:48 in 4b1e978b2b outdated
    43+    assert(DepGraph(cluster) == depgraph);
    44+}
    45+
    46+FUZZ_TARGET(clusterlin_cluster_serialization)
    47+{
    48+    // Verify that any graph of transaction has its ancestry correctly computed by DepGraph, and if
    


    glozow commented at 12:21 pm on July 8, 2024:

    4b1e978b2bafd9da564aa52d2ce64a723cf64036 nit:

    0    // Verify that any graph of transactions has its ancestry correctly computed by DepGraph, and if
    

    sipa commented at 8:41 pm on July 8, 2024:
    Fixed.
  155. in src/cluster_linearize.h:427 in 304d3cb23b outdated
    237+     *
    238+     * Complexity: O(N*M) where N=depgraph.TxCount(), M=select.Count().
    239+     */
    240+    void MarkDone(SetType select) noexcept
    241+    {
    242+        Assume(select.IsSubsetOf(m_todo));
    


    glozow commented at 12:33 pm on July 8, 2024:

    304d3cb23ba9f084b98f9f29b47e3dfbb61ca334 may be helpful context

    0        // Never MarkDone the same transaction more than once as this function
    1        // blindly subtracts the transaction's feerate from m_ancestor_set_feerates
    2        Assume(select.IsSubsetOf(m_todo));
    

    sipa commented at 8:41 pm on July 8, 2024:
    I’ve added a short comment to the docstring of the function instead.
  156. in src/cluster_linearize.h:383 in 304d3cb23b outdated
    197+class AncestorCandidateFinder
    198+{
    199+    /** Internal dependency graph. */
    200+    const DepGraph<SetType>& m_depgraph;
    201+    /** Which transaction are left to include. */
    202+    SetType m_todo;
    


    glozow commented at 1:44 pm on July 8, 2024:
    For each of the CandidateFinders, would it be helpful to expose a AllDone() function? Seems slightly inconvenient for the caller to separately keep track of todo for a while(todo.Any()) loop. FindCandidateSet() Assumes that todo isn’t empty, so we can’t use that to query.

    sipa commented at 8:42 pm on July 8, 2024:
    Done, good idea. Avoiding a separate copy m_todo in Linearize forced me to rewrite some of the code, which turned out to simplify the LIMO logic slightly.
  157. in src/cluster_linearize.h:519 in 5661fcdb15 outdated
    331+     *                              be <= max_iterations. If strictly < max_iterations, the
    332+     *                              returned subset is optimal.
    333+     *
    334+     * Complexity: O(N * min(max_iterations, 2^N)) where N=depgraph.TxCount().
    335+     */
    336+    std::pair<SetInfo<SetType>, uint64_t> FindCandidateSet(uint64_t max_iterations, SetInfo<SetType> best) noexcept
    


    glozow commented at 2:11 pm on July 8, 2024:
    Note to reviewers: pseudocode for the function when first introduced in 5661fcdb15244bc6d602379c294276c7ec686505 can be found here
  158. glozow commented at 3:34 pm on July 8, 2024: member
    Looked at the first 4 commits so far, just doing code review and trying the algos out in MiniMiner.
  159. sipa force-pushed on Jul 8, 2024
  160. sipa force-pushed on Jul 9, 2024
  161. in src/cluster_linearize.h:473 in 4e6e416fed outdated
    467@@ -452,6 +468,54 @@ class SearchCandidateFinder
    468     }
    469 };
    470 
    471+/** Find a linearization for a cluster.
    472+ *
    473+ * @param[in] depgraph        Dependency graph of the the cluster to be linearized.
    


    sdaftuar commented at 7:42 pm on July 9, 2024:
    nit: s/the the/the/

    sipa commented at 8:28 pm on July 24, 2024:
    Fixed.
  162. in src/cluster_linearize.h:684 in 750d52fa8d outdated
    555  *
    556  * Complexity: O(N * min(max_iterations + N, 2^N)) where N=depgraph.TxCount().
    557  */
    558 template<typename SetType>
    559-std::pair<std::vector<ClusterIndex>, bool> Linearize(const DepGraph<SetType>& depgraph, uint64_t max_iterations, uint64_t rng_seed) noexcept
    560+std::pair<std::vector<ClusterIndex>, bool> Linearize(const DepGraph<SetType>& depgraph, uint64_t max_iterations, uint64_t rng_seed, Span<const ClusterIndex> old_linearization = {}) noexcept
    


    sdaftuar commented at 8:29 pm on July 9, 2024:
    Perhaps worth documenting that if an existing linearization is passed in, it must be topologically valid, or else the output of Linearize may not be topologically valid.

    sipa commented at 8:28 pm on July 24, 2024:
    Done.
  163. sipa force-pushed on Jul 10, 2024
  164. DrahtBot added the label CI failed on Jul 10, 2024
  165. sipa force-pushed on Jul 10, 2024
  166. in src/test/fuzz/cluster_linearize.cpp:223 in 41e31ce727 outdated
    218+            reader >> VARINT(idx);
    219+        } catch (const std::ios_base::failure&) {}
    220+        idx %= potential_next.Count();
    221+        // Find out which transaction that corresponds to.
    222+        for (auto j : potential_next) {
    223+            if (idx-- == 0) {
    


    DrahtBot commented at 6:12 pm on July 10, 2024:
     0Run clusterlin_chunking with args ['/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz', '-max_total_time=60']INFO: Running with entropic power schedule (0xFF, 100).
     1INFO: Seed: 2800051321
     2INFO: Loaded 1 modules   (625955 inline 8-bit counters): 625955 [0x556ead4c6938, 0x556ead55f65b), 
     3INFO: Loaded 1 PC tables (625955 PCs): 625955 [0x556ead55f660,0x556eadeec890), 
     4INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
     5INFO: A corpus is not provided, starting from an empty corpus
     6[#2](/bitcoin-bitcoin/2/)	INITED cov: 101 ft: 102 corp: 1/1b exec/s: 0 rss: 112Mb
     7	NEW_FUNC[1/5]: 0x556eaa4369e0 in cluster_linearize::DepGraph<bitset_detail::IntBitSet<unsigned int>>::AddTransaction(FeeFrac const&) src/./cluster_linearize.h:134
     8	NEW_FUNC[2/5]: 0x556eaa437100 in cluster_linearize::DepGraph<bitset_detail::IntBitSet<unsigned int>>::Entry& std::vector<cluster_linearize::DepGraph<bitset_detail::IntBitSet<unsigned int>>::Entry, std::allocator<cluster_linearize::DepGraph<bitset_detail::IntBitSet<unsigned int>>::Entry>>::emplace_back<FeeFrac const&, bitset_detail::IntBitSet<unsigned int>, bitset_detail::IntBitSet<unsigned int>>(FeeFrac const&, bitset_detail::IntBitSet<unsigned int>&&, bitset_detail::IntBitSet<unsigned int>&&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/vector.tcc:113
     9[#7](/bitcoin-bitcoin/7/)	NEW    cov: 130 ft: 137 corp: 2/3b lim: 4 exec/s: 0 rss: 112Mb L: 2/2 MS: 5 ChangeByte-CopyPart-ChangeBinInt-CrossOver-InsertByte-
    10[#8](/bitcoin-bitcoin/8/)	NEW    cov: 131 ft: 138 corp: 3/5b lim: 4 exec/s: 0 rss: 112Mb L: 2/2 MS: 1 InsertByte-
    11[#9](/bitcoin-bitcoin/9/)	NEW    cov: 132 ft: 139 corp: 4/7b lim: 4 exec/s: 0 rss: 112Mb L: 2/2 MS: 1 ChangeASCIIInt-
    12[#22](/bitcoin-bitcoin/22/)	NEW    cov: 132 ft: 143 corp: 5/10b lim: 4 exec/s: 0 rss: 112Mb L: 3/3 MS: 3 ChangeByte-ShuffleBytes-InsertByte-
    13	NEW_FUNC[1/3]: 0x556eaa1fb770 in void std::vector<unsigned int, std::allocator<unsigned int>>::_M_realloc_insert<unsigned int const&>(__gnu_cxx::__normal_iterator<unsigned int*, std::vector<unsigned int, std::allocator<unsigned int>>>, unsigned int const&) /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/vector.tcc:453
    14	NEW_FUNC[2/3]: 0x556eaa1fbed0 in std::vector<unsigned int, std::allocator<unsigned int>>::_M_check_len(unsigned long, char const*) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/stl_vector.h:1897
    15[#27](/bitcoin-bitcoin/27/)	NEW    cov: 161 ft: 174 corp: 6/13b lim: 4 exec/s: 0 rss: 112Mb L: 3/3 MS: 5 ChangeBit-ChangeByte-CopyPart-CopyPart-ChangeByte-
    16[#36](/bitcoin-bitcoin/36/)	NEW    cov: 162 ft: 175 corp: 7/15b lim: 4 exec/s: 0 rss: 112Mb L: 2/3 MS: 4 ChangeBit-CrossOver-ChangeBit-ChangeBinInt-
    17[#44](/bitcoin-bitcoin/44/)	NEW    cov: 162 ft: 179 corp: 8/19b lim: 4 exec/s: 0 rss: 112Mb L: 4/4 MS: 3 ChangeBit-ChangeBit-CopyPart-
    18[#49](/bitcoin-bitcoin/49/)	NEW    cov: 162 ft: 180 corp: 9/23b lim: 4 exec/s: 0 rss: 112Mb L: 4/4 MS: 5 ShuffleBytes-CrossOver-ShuffleBytes-ChangeByte-ShuffleBytes-
    19[#52](/bitcoin-bitcoin/52/)	NEW    cov: 162 ft: 182 corp: 10/27b lim: 4 exec/s: 0 rss: 112Mb L: 4/4 MS: 3 InsertByte-CrossOver-CrossOver-
    20[#55](/bitcoin-bitcoin/55/)	NEW    cov: 173 ft: 196 corp: 11/31b lim: 4 exec/s: 0 rss: 112Mb L: 4/4 MS: 3 ChangeByte-CrossOver-CMP- DE: "\000\000\000\000"-
    21test/fuzz/cluster_linearize.cpp:223:20: runtime error: unsigned integer overflow: 0 - 1 cannot be represented in type 'uint64_t' (aka 'unsigned long')
    22    [#0](/bitcoin-bitcoin/0/) 0x556eaa4203ed in std::vector<unsigned int, std::allocator<unsigned int>> (anonymous namespace)::ReadLinearization<bitset_detail::IntBitSet<unsigned int>>(cluster_linearize::DepGraph<bitset_detail::IntBitSet<unsigned int>> const&, SpanReader&) src/test/fuzz/cluster_linearize.cpp:223:20
    23    [#1](/bitcoin-bitcoin/1/) 0x556eaa41e849 in clusterlin_chunking_fuzz_target(std::span<unsigned char const, 18446744073709551615ul>) src/test/fuzz/cluster_linearize.cpp:356:26
    24    [#2](/bitcoin-bitcoin/2/) 0x556eaa96888d in std::function<void (std::span<unsigned char const, 18446744073709551615ul>)>::operator()(std::span<unsigned char const, 18446744073709551615ul>) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
    25    [#3](/bitcoin-bitcoin/3/) 0x556eaa96888d in LLVMFuzzerTestOneInput src/test/fuzz/fuzz.cpp:209:5
    26    [#4](/bitcoin-bitcoin/4/) 0x556eaa033534 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1c02534) (BuildId: 4f2fa2987d9f90b866b2491be1f80d23e1cce505)
    27    [#5](/bitcoin-bitcoin/5/) 0x556eaa032c29 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1c01c29) (BuildId: 4f2fa2987d9f90b866b2491be1f80d23e1cce505)
    28    [#6](/bitcoin-bitcoin/6/) 0x556eaa034415 in fuzzer::Fuzzer::MutateAndTestOne() (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1c03415) (BuildId: 4f2fa2987d9f90b866b2491be1f80d23e1cce505)
    29    [#7](/bitcoin-bitcoin/7/) 0x556eaa034f75 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1c03f75) (BuildId: 4f2fa2987d9f90b866b2491be1f80d23e1cce505)
    30    [#8](/bitcoin-bitcoin/8/) 0x556eaa02224f in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1bf124f) (BuildId: 4f2fa2987d9f90b866b2491be1f80d23e1cce505)
    31    [#9](/bitcoin-bitcoin/9/) 0x556eaa04c8d6 in main (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1c1b8d6) (BuildId: 4f2fa2987d9f90b866b2491be1f80d23e1cce505)
    32    [#10](/bitcoin-bitcoin/10/) 0x7fae2899a1c9  (/lib/x86_64-linux-gnu/libc.so.6+0x2a1c9) (BuildId: 08134323d00289185684a4cd177d202f39c2a5f3)
    33    [#11](/bitcoin-bitcoin/11/) 0x7fae2899a28a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 08134323d00289185684a4cd177d202f39c2a5f3)
    34    [#12](/bitcoin-bitcoin/12/) 0x556eaa017234 in _start (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1be6234) (BuildId: 4f2fa2987d9f90b866b2491be1f80d23e1cce505)
    35
    36SUMMARY: UndefinedBehaviorSanitizer: unsigned-integer-overflow test/fuzz/cluster_linearize.cpp:223:20 
    37MS: 4 ChangeByte-EraseBytes-CMP-CopyPart- DE: "\003\000"-; base unit: e5dc0d414ce7dc334d19fc3ca66d8e4d4272b8cd
    380x1,0x3,0x0,0x0,
    39\001\003\000\000
    40artifact_prefix='./'; Test unit written to ./crash-275df7aaf0e8504be8fb5e3fbcd9b1be65e65326
    41Base64: AQMAAA==
    
  167. sipa force-pushed on Jul 10, 2024
  168. sipa force-pushed on Jul 10, 2024
  169. sipa force-pushed on Jul 11, 2024
  170. sipa force-pushed on Jul 11, 2024
  171. sipa force-pushed on Jul 11, 2024
  172. DrahtBot removed the label CI failed on Jul 11, 2024
  173. sipa commented at 2:49 pm on July 11, 2024: member

    I’ve made a few changes the past few days, and the code should be ready for more review now:

    • The fuzzer serialization format for DepGraph was changed to something simpler.
    • The linearization -> chunk feerates function ChunkLinearization was moved from the fuzz tests to the main cluster_linearization.h file, in a new separate commit, which also includes a fuzz test for just that function.
    • The LIMO code now runs intersections with the chunk prefixes of the current chunk boundaries of the remainder of the old linearization, rather than prefixes of the remainders of the original chunk boundaries of the old linearization:
      • This should be a bit faster for what I believe to be the worst case.
      • It can only improve the quality of the linearizations that come out (though probably not substantially).
      • It means a significant part of the LIMO complexity can now be shared with the Merge algorithm from #30285. Because of that, this part (keeping track of the chunking of the remainder of a linearization, and computing intersections with its prefixes) has been abstracted out into a separate class LinearizationChunking, which is introduced in a separate commit, with its own new fuzz test.

    Further, I have posted https://delvingbitcoin.org/t/introduction-to-cluster-linearization/1032 which I’d encourage people to look at if they want to understand the usefulness of these algorithms.

  174. sipa force-pushed on Jul 11, 2024
  175. sipa commented at 3:06 pm on July 11, 2024: member
    Sorry, one small push to get rid of an extra function that I introduced in the mean time, but that ended up being unnecessary.
  176. sipa force-pushed on Jul 11, 2024
  177. in src/test/cluster_linearize_tests.cpp:34 in 1492194c5f outdated
    37+
    38+    // Test that the serialization of depgraph matches hexenc.
    39+    std::vector<unsigned char> encoding;
    40+    VectorWriter writer(encoding, 0);
    41+    writer << Using<DepGraphFormatter>(depgraph);
    42+    BOOST_CHECK_EQUAL(HexStr(encoding), hexenc);
    


    sdaftuar commented at 3:20 pm on July 12, 2024:
    Given that there’s more than one valid serialization that will deserialize to the same graph, perhaps it’s worth commenting something to that effect here? (I guess we could also consider dropping this check and instead just checking that if we deserialize hexenc we get the same depgraph, but perhaps it’s beneficial to check that what we think of as a canonical encoding is correct?)

    sipa commented at 8:29 pm on July 24, 2024:
    I’ve added comments explaining this, and also moved the “test correspondence between Cluster and DepGraph” to a new VerifyDepGraphFromCluster function that’s invoked in both the unit and fuzz tests.
  178. in src/cluster_linearize.h:42 in 503af62cba outdated
    37+        /** All ancestors of the transaction (including itself). */
    38+        SetType ancestors;
    39+        /** All descendants of the transaction (including itself). */
    40+        SetType descendants;
    41+
    42+        /** Equality operator. */
    


    ismaelsadeeq commented at 10:07 am on July 15, 2024:

    nit: I think this is obvious


    sipa commented at 11:08 pm on July 16, 2024:
    Fair enough, but style-wise I still like to have a comment on each function. I’ve expanded the comment a bit here to point out it’s for testing.
  179. in src/cluster_linearize.h:98 in 503af62cba outdated
    86+        for (ClusterIndex i = 0; i < cluster.size(); ++i) {
    87+            entries[i].feerate = cluster[i].first;
    88+            entries[i].ancestors = cluster[i].second;
    89+            // Make sure transactions are ancestors of themselves.
    90+            entries[i].ancestors.Set(i);
    91+        }
    


    ismaelsadeeq commented at 10:19 am on July 15, 2024:

    nit: be explicit we are adding direct parents

    0        for (ClusterIndex i = 0; i < cluster.size(); ++i) {
    1            // Fill in fee, size
    2            entries[i].feerate = cluster[i].first;
    3            // Fill in direct parents as ancestors
    4            entries[i].ancestors = cluster[i].second;
    5            // Make sure transactions are ancestors of themselves.
    6            entries[i].ancestors.Set(i);
    7        }
    

    sipa commented at 11:07 pm on July 16, 2024:
    Done.
  180. in src/cluster_linearize.h:151 in 503af62cba outdated
    137+     *
    138+     * Complexity: O(N) where N=TxCount().
    139+     **/
    140+    void AddDependency(ClusterIndex parent, ClusterIndex child) noexcept
    141+    {
    142+        // To each ancestor of the parent, add as descendants the descendants of the child.
    


    ismaelsadeeq commented at 11:03 am on July 15, 2024:
    Should we return early if dependency already exist ?

    sipa commented at 11:07 pm on July 16, 2024:
    The function is just for testing, so it doesn’t matter much, but sure, added.
  181. in src/test/util/cluster_linearize.h:107 in 1492194c5f outdated
    102+            if (anc_a != anc_b) return anc_a < anc_b;
    103+            return a < b;
    104+        });
    105+
    106+        /** Which transactions the deserializer already knows when it has deserialized what has
    107+         *  has been serialized here so far, and in what order. */
    


    instagibbs commented at 6:10 pm on July 15, 2024:
    0         * been serialized here so far, and in what order. */
    

    sipa commented at 11:04 pm on July 16, 2024:
    Done.
  182. in src/test/util/cluster_linearize.h:176 in 1492194c5f outdated
    141+                // Loop to find how far from the end in rebuilt_order to insert.
    142+                if (idx > *(rebuilt_order.end() - 1 - insert_distance)) break;
    143+                ++insert_distance;
    144+            }
    145+            rebuilt_order.insert(rebuilt_order.end() - insert_distance, idx);
    146+            s << VARINT(diff + insert_distance);
    


    instagibbs commented at 6:54 pm on July 15, 2024:
    This is the remaining part of logic that is hard to intuit now that the rest is a lot simpler to follow. I believe it’s correct due to the harness itself but…

    sipa commented at 3:04 am on July 17, 2024:
    Does looking at the documented serialization in the unit tests help? If not, I will add a worked-out example in more detail to the DepGraphFormatter documentation.

    instagibbs commented at 2:16 pm on July 17, 2024:
    Don’t go out of your way for now, this may just be a mental block for myself only. I worked out some examples myself and it works, I just can’t mentally model the skipping somehow.

    sipa commented at 2:39 pm on July 17, 2024:

    Let me try to explain here. If it helps you, that may be a reason to incorporate it as a comment.

    Overall the serialization format consists of:

    • For each transaction:
      • VARINT(size)
      • VARINT(fee)
      • List of VARINTs that are the number of skipped options (see below), encoding dependencies and position.
    • VARINT(0)

    Let’s say you have a 7-transaction cluster, consisting of transactions F,A,C,B,G,E,D, but serialized in order A,B,C,D,E,F,G, because that’s a topological ordering. By the time G gets serialized, what has been serialized already represents the cluster F,A,C,B,E,D (in that order). Let’s say G has B and E as parents, and that E depends on C.

    Think of the skip-VARINTs as encoding the following:

    0[G->F] [G->E] [G->D] [G->B] [G->A] [...,D,G] [...,E,G,D] [...,B,G,E,...] [...,C,G,B,...] ...
    1        ####          ####                                #############
    2<-1-->        <-1-->        <-------------3------------>
    

    Thus the encoded “skip” VARINTs are, in order: 1, 1, 3. No further VARINT is needed after the 3, because there can only be one insertion position for G. Also note that [G->C] is not included in the list, as it is implied by the included G->E and E->C that came before it


    instagibbs commented at 3:32 pm on July 17, 2024:
    yes this was very insightful, thank you

    sipa commented at 8:38 pm on July 17, 2024:
    I have incorporated something like this into the DepGraphFormatter comment.
  183. in src/test/util/cluster_linearize.h:245 in 1492194c5f outdated
    203+        for (ClusterIndex idx = 0; idx < reordering.size(); ++idx) {
    204+            // Add transactions to depgraph in the original cluster order.
    205+            ClusterIndex topo_idx = reordering[idx];
    206+            depgraph.AddTransaction(topo_depgraph.FeeRate(topo_idx));
    207+        }
    208+        for (ClusterIndex idx = 0; idx < reordering.size(); ++idx) {
    


    instagibbs commented at 7:09 pm on July 15, 2024:
    0        for (const auto topo_idx : reordering) {
    

    sipa commented at 11:04 pm on July 16, 2024:
    The loop uses idx below, unfortunately.

    instagibbs commented at 2:28 pm on July 17, 2024:
    oh whoops, wrong one. I meant for the loop above!

    sipa commented at 8:40 pm on July 17, 2024:
    Done!
  184. in src/cluster_linearize.h:427 in 3b632d4177 outdated
    240+     *
    241+     * Complexity: O(N*M) where N=depgraph.TxCount(), M=select.Count().
    242+     */
    243+    void MarkDone(SetType select) noexcept
    244+    {
    245+        Assume(select.IsSubsetOf(m_todo));
    


    instagibbs commented at 7:30 pm on July 15, 2024:
    mind if we also Assume(select.Any())?

    sipa commented at 11:04 pm on July 16, 2024:
    Done.
  185. in src/cluster_linearize.h:658 in 86ed02aeed outdated
    437+
    438+    /** Remove a subset of transactions from the cluster being linearized.
    439+     *
    440+     * Complexity: O(N) where N=done.Count().
    441+     */
    442+    void MarkDone(const SetType& done) noexcept
    



    sipa commented at 11:05 pm on July 16, 2024:
    Done.
  186. in src/bench/cluster_linearize.cpp:45 in ad7aacebe4 outdated
    25+    }
    26+    return depgraph;
    27+}
    28+
    29+// Construct a difficult graph. These need at least sqrt(2^(n-1)) iterations in the best
    30+// implemented algorithms (purely empircally determined).
    


    instagibbs commented at 7:43 pm on July 15, 2024:
    empircally

    sipa commented at 3:04 am on July 17, 2024:
    Fixed.
  187. in src/cluster_linearize.h:317 in e70ddafeb5 outdated
    267+    std::vector<SetInfo<SetType>> m_chunks;
    268+
    269+    /** Which transactions remain in the linearization. */
    270+    SetType m_todo;
    271+
    272+    /** Fill the m_chunks variable, updating lin_done in the process. */
    


    instagibbs commented at 8:20 pm on July 15, 2024:
    lin_done doesn’t exist

    sipa commented at 3:04 am on July 17, 2024:
    Fixed.
  188. in src/cluster_linearize.h:391 in e70ddafeb5 outdated
    323+        m_chunks.clear();
    324+        BuildChunks();
    325+    }
    326+
    327+    /** Find the shortest intersection between subset and the prefixes of remaining chunks
    328+     *  of the linearization that has a feerate not below best's.
    


    instagibbs commented at 8:31 pm on July 15, 2024:
    there’s no “best” in this context

    sipa commented at 3:04 am on July 17, 2024:
    Fixed.
  189. achow101 commented at 9:19 pm on July 15, 2024: member
    c192b30156ae41638291010b40b874479ea1943c “clusterlin: add algorithms for connectedness/connected components” seems to be only relevant for the optimizations in #30286, so could be dropped from here?
  190. in src/test/util/cluster_linearize.h:41 in 1492194c5f outdated
    36+
    37+/** A formatter for a bespoke serialization for acyclic DepGraph objects.
    38+ *
    39+ * The serialization format outputs information about transaction in a topological order (parents
    40+ * before children), together with position information so transactions can be moved back to their
    41+ * their correct position on deserialization.
    


    ismaelsadeeq commented at 11:48 am on July 16, 2024:

    nit: typo

    0 * before children), together with position information so transactions can be moved back to their
    1 * correct position on deserialization.
    

    sipa commented at 8:39 pm on July 17, 2024:
    Fixed.
  191. in src/cluster_linearize.h:395 in e70ddafeb5 outdated
    327+    /** Find the shortest intersection between subset and the prefixes of remaining chunks
    328+     *  of the linearization that has a feerate not below best's.
    329+     *
    330+     * This is a crucial operation in guaranteeing improvements to linearizations. The set
    331+     * returned by this function, when moved to the front of (what remains of) the linearization,
    332+     * is guaranteed not to make the linearization worse.
    


    instagibbs commented at 6:19 pm on July 16, 2024:
    0     * is guaranteed not to make the linearization worse at any point.
    

    sipa commented at 3:05 am on July 17, 2024:
    Done.
  192. in src/cluster_linearize.h:308 in e70ddafeb5 outdated
    303+        m_chunks.reserve(depgraph.TxCount());
    304+        BuildChunks();
    305+    }
    306+
    307+    /** Determine how many chunks remain in the linearization. */
    308+    ClusterIndex ChunksLeft() const noexcept { return m_chunks.size(); }
    


    instagibbs commented at 6:21 pm on July 16, 2024:

    nitty renaming

    0    ClusterIndex NumChunksLeft() const noexcept { return m_chunks.size(); }
    

    sipa commented at 3:05 am on July 17, 2024:
    Done.
  193. in src/cluster_linearize.h:414 in e70ddafeb5 outdated
    346+                // shorter intersection with higher/equal feerate exists.
    347+                accumulator.transactions |= to_add;
    348+                if (accumulator.transactions == subset.transactions) break;
    349+                // Otherwise update the accumulator feerate.
    350+                accumulator.feerate += m_depgraph.FeeRate(to_add);
    351+                // If that does result in something better, return that.
    


    instagibbs commented at 6:31 pm on July 16, 2024:
    If there specific motivation for returning the accumulated value if it equals subset’s vs exceeds? IIUC either would maintain correctness of the move.

    sipa commented at 2:35 am on July 17, 2024:

    If accumulator is equal in feerate to subset, but smaller in size, it would be correct to either return accumulator (as done in the current code) or to continue the loop. Returning subset in this case would not be correct, as it’s possible that a bigger intersection still has a higher feerate.

    In general, I prefer returning the first (smallest) acceptable intersection because that is the least amount of iteration work, and it does not hurt quality (if a bigger intersection of higher feerate exists, we’ll discover it in the next loop anyway).


    instagibbs commented at 2:13 pm on July 17, 2024:

    returning subset in this case would not be correct, as it’s possible that a bigger intersection still has a higher feerate.

    Ah I had forgotten this fact, makes total sense because it doesn’t hurt to just stop here. In fact I think it makes the logic simpler since you don’t have to remember this match for later if you fail to find anything “better”.


    sipa commented at 8:39 pm on July 17, 2024:
    I have added a comment to explain this briefly.
  194. in src/test/fuzz/cluster_linearize.cpp:582 in e70ddafeb5 outdated
    537+            assert(chunking.GetChunk(i).feerate == chunking_left[i]);
    538+        }
    539+
    540+        // Check consistency of chunking.
    541+        TestBitSet combined;
    542+        for (ClusterIndex i = 0; i < chunking.ChunksLeft(); ++i) {
    


    instagibbs commented at 6:40 pm on July 16, 2024:
    could also check that chunk feerate is dropping monotonically

    sipa commented at 3:06 am on July 17, 2024:
    Done, and added more consistency checks on top (chunks must be non-empty, match the highest-feerate remaining prefix, and be topological).
  195. in src/cluster_linearize.h:734 in f2193f70f8 outdated
    713@@ -697,6 +714,9 @@ std::pair<std::vector<ClusterIndex>, bool> Linearize(const DepGraph<SetType>& de
    714         anc_finder.MarkDone(best.transactions);
    715         if (anc_finder.AllDone()) break;
    716         src_finder.MarkDone(best.transactions);
    717+        if (old_chunking.ChunksLeft() > 0) {
    718+            old_chunking.MarkDone(best.transactions);
    



    sipa commented at 2:52 am on July 17, 2024:
    No, we need to invoke MarkDone even when the iterations_done_now == max_iterations_now condition is not satisfied.
  196. in src/test/fuzz/cluster_linearize.cpp:354 in c192b30156 outdated
    350@@ -320,6 +351,7 @@ FUZZ_TARGET(clusterlin_chunking)
    351     try {
    352         reader >> Using<DepGraphFormatter>(depgraph);
    353     } catch (const std::ios_base::failure&) {}
    354+    MakeConnected(depgraph);
    


    instagibbs commented at 7:18 pm on July 16, 2024:
    Do we always want to make the graphs connected? Seems like a possible reduction in coverage?

    sipa commented at 3:08 am on July 17, 2024:

    Good question.

    Clusters, as intended to be used by the mempool/txgraph code, are always connected (if not, they’d be separate clusters), but nothing actually relies on this property, so all cluster_linearization tests should pass even with non-connecting clusters. I believe that earlier it was helpful to forcefully make all clusters for most fuzz tests connected, as this helped the fuzzer find extreme cases, though I have not tested this recently.

    This is also the reason why this commit is in the PR to begin with. If it’s controversial, I’m happy to just move it to #30286.


    instagibbs commented at 2:32 pm on July 17, 2024:
    I’m not particularly bothered by it for the reasons you mentioned. I’ll let others weigh in.

    sipa commented at 8:39 pm on July 17, 2024:
    I have moved the connected-component logic, together with its (now optional) usage to #30286.
  197. instagibbs commented at 7:36 pm on July 16, 2024: member

    another pass, through https://github.com/bitcoin/bitcoin/pull/30126/commits/23496cb4d43327c3d27401a531586bcfc985b6ad

    changes since last look make sense, I would like to spend more time validating the optimizations

  198. sipa force-pushed on Jul 16, 2024
  199. ismaelsadeeq commented at 9:58 pm on July 16, 2024: member

    Reviewed:

    • 4409a282d7fe7a0ebee67c8fb9fb4ef157ed5883
    • f5fa49f589f477caa5a4ca41d8331acdca6d7298
    • 48f086eb624a5371b846e908b4702b28726465fa
    • ccde33054e9573bc9c263acd9746b8003578cb29
    • 6fdb3d07d4350b0d93bda5765ea1053c29866c0a
    • 448b8b977920b8d2f90008fa1939d3d3bba63668
    • 896b52337be5765cb3a7d3f0458475f2f8e2cb3a

    I spent time understanding the cluster linearization theory. The serialization and de-serialization code in f5fa49f589f477caa5a4ca41d8331acdca6d7298 are well documented 💯

  200. sipa force-pushed on Jul 17, 2024
  201. DrahtBot added the label CI failed on Jul 17, 2024
  202. DrahtBot removed the label CI failed on Jul 17, 2024
  203. instagibbs commented at 2:35 pm on July 17, 2024: member

    reviewed through https://github.com/bitcoin/bitcoin/pull/30126/commits/2003bb8a279c8891e55bab190ca36f0c6c8697ea

    via git range-diff master 23496cb 2003bb8a279c8891e55bab190ca36f0c6c8697ea

  204. in src/bench/cluster_linearize.cpp:130 in 2003bb8a27 outdated
    126@@ -112,6 +127,23 @@ void BenchLinearizeNoItersWorstCase(ClusterIndex ntx, benchmark::Bench& bench)
    127     });
    128 }
    129 
    130+/** Benchmark for linearization of a trivial linear graph using just ancestor sort/LIMO.
    


    instagibbs commented at 8:17 pm on July 17, 2024:
    this isn’t making the trivial linear graph

    sipa commented at 8:39 pm on July 17, 2024:
    Fixed.
  205. in src/bench/cluster_linearize.cpp:161 in 2003bb8a27 outdated
    131+ *
    132+ * Its goal is measuring how much time improving a linearization may take without any search
    133+ * iterations, similar to the previous function.
    134+ */
    135+template<typename SetType>
    136+void BenchLinearizeNoItersWorstCaseLIMO(ClusterIndex ntx, benchmark::Bench& bench)
    


    instagibbs commented at 8:18 pm on July 17, 2024:
    Why is this benchmark “LIMO” but not BenchLinearizeNoItersWorstCaseAnc?

    sipa commented at 8:40 pm on July 17, 2024:
    I don’t understand this comment.

    instagibbs commented at 3:13 pm on July 18, 2024:
    Both benchmarks are doing LIMO with ancestor sort, right? The only difference is in cluster shape.

    sipa commented at 3:14 pm on July 18, 2024:
    The “ANC” one is testing a scenario that is a worst-case for ancestor sort (but is trivial for the LIMO aspect), the “LIMO” one is testing a scenario that is a worst-case for LIMO (but is trivial for the ancestor sorting aspect).

    instagibbs commented at 3:16 pm on July 18, 2024:
    ah, parsing the name wrong basically, thanks

    sipa commented at 5:35 pm on July 19, 2024:
    I’ve added a comment to clarify.
  206. in src/cluster_linearize.h:334 in faf8a3bdd3 outdated
    330@@ -323,9 +331,21 @@ class LinearizationChunking
    331         Assume(subset.Any());
    332         Assume(subset.IsSubsetOf(m_todo));
    333         m_todo -= subset;
    334-        // Rechunk what remains of m_linearization.
    335-        m_chunks.clear();
    336-        BuildChunks();
    337+        if (GetChunk(0).transactions == subset) {
    


    instagibbs commented at 8:20 pm on July 17, 2024:

    re: clusterlin: optimize rechunking in LinearizationChunking

    benchmark doesn’t seem to show much improvement for this change. I understood this as purely an efficiency improvement which should be reflected in bench results?


    sipa commented at 8:40 pm on July 17, 2024:
    I have moved this optimization (it is indeed supposed to be one) to #30286.
  207. sipa force-pushed on Jul 17, 2024
  208. sipa commented at 8:45 pm on July 17, 2024: member

    I have made a number of changes:

    • Addressed a number of typos
    • Moved connected-component logic to #30286 (where MakeConnected is now optional, and FindConnectedComponent has its own fuzz test).
    • Moved the chunk-caching logic in LinearizationChunking to #30286.
    • Added more explanatory comments, especially in the DepGraphFormatter format description.
    • Made the DepGraphFormatter deserializer resilient to EOF (previously it would return an empty graph in this case), plus added a test for it in SanityCheck(const DepGraph&).
  209. instagibbs commented at 3:23 pm on July 18, 2024: member

    ACK https://github.com/bitcoin/bitcoin/pull/30126/commits/6160ccf9b4327649e9bb0293fba630a10b3befc3

    reviewed via git range-diff master 2003bb8a279c8891e55bab190ca36f0c6c8697ea 6160ccf9b4327649e9bb0293fba630a10b3befc3

    I didn’t validate that the BFS optimisation for search improved results, but it at least doesn’t seem to hurt benchmark performance.

  210. in src/cluster_linearize.h:451 in 48f086eb62 outdated
    267+     *
    268+     * Complexity: O(N) where N=depgraph.TxCount();
    269+     */
    270+    SetInfo<SetType> FindCandidateSet() const noexcept
    271+    {
    272+        std::optional<ClusterIndex> best;
    


    ismaelsadeeq commented at 11:08 am on July 19, 2024:
    0        Assume(!AllDone());
    1        std::optional<ClusterIndex> best;
    

    sipa commented at 5:34 pm on July 19, 2024:
    Done.
  211. in src/test/fuzz/cluster_linearize.cpp:444 in 1949b516cb outdated
    349+
    350+        // At most 2^N-1 iterations can be required: the number of non-empty subsets a graph with N
    351+        // transactions has.
    352+        assert(iterations_done <= ((uint64_t{1} << todo.Count()) - 1));
    353+
    354+        // Perform quality checks only if SearchCandidateFinder claims an optimal result.
    


    ismaelsadeeq commented at 11:41 am on July 19, 2024:

    nit: typo

    0        // Perform equality checks only if SearchCandidateFinder claims an optimal result.
    1	
    2Comment
    

    sipa commented at 3:56 pm on July 19, 2024:
    Not a typo. There can be multiple distinct optimal linearizations; we just check for optimality, not that it is actually identical to the exhaustive solutions.
  212. in src/test/fuzz/cluster_linearize.cpp:459 in 1949b516cb outdated
    363+            // Compare with AncestorCandidateFinder;
    364+            auto anc = anc_finder.FindCandidateSet();
    365+            assert(found.feerate >= anc.feerate);
    366+
    367+            // If todo isn't too big, also compare with ExhaustiveCandidateFinder.
    368+            if (todo.Count() <= 12) {
    


    ismaelsadeeq commented at 11:43 am on July 19, 2024:
    why is 12 selected here, does this implies that the optimal search done with todo.Count() > 12 using SearchCandidateFinder and SimpleCandidateFinder are not optimal we can do better with exhaustive search?

    sipa commented at 5:35 pm on July 19, 2024:
    We don’t invoke ExhaustiveCandidateFinder for larger clusters because it just gets too computationally expensive to do so. I’ve added a comment to clarify this.
  213. in src/cluster_linearize.h:203 in 6fdb3d07d4 outdated
    183@@ -184,10 +184,23 @@ struct SetInfo
    184     /** Construct a SetInfo for a specified set and feerate. */
    


    ismaelsadeeq commented at 11:52 am on July 19, 2024:

    Add span include here and the tests

    0#include <span.h>
    

    sipa commented at 5:35 pm on July 19, 2024:
    Done.
  214. in src/test/util/cluster_linearize.h:39 in a53b7eb82c outdated
    34+    return true;
    35+}
    36+
    37+/** A formatter for a bespoke serialization for acyclic DepGraph objects.
    38+ *
    39+ * The serialization format outputs information about transaction in a topological order (parents
    


    glozow commented at 2:43 pm on July 19, 2024:

    a53b7eb82c3: nitty nit

    0 * The serialization format outputs information about transactions in a topological order (parents
    

    sipa commented at 5:35 pm on July 19, 2024:
    Fixed.
  215. in src/test/util/cluster_linearize.h:65 in a53b7eb82c outdated
    60+ * Let's say you have a 7-transaction cluster, consisting of transactions F,A,C,B,G,E,D, but
    61+ * serialized in order A,B,C,D,E,F,G, because that happens to be a topological ordering. By the
    62+ * time G gets serialized, what has been serialized already represents the cluster F,A,C,B,E,D (in
    63+ * that order). G has B and E as direct parents, and E depends on C.
    64+ *
    65+ * In this case, the possitibilities are, in order:
    


    glozow commented at 2:45 pm on July 19, 2024:
    0 * In this case, the possibilities are, in order:
    

    sipa commented at 5:35 pm on July 19, 2024:
    Fixed.
  216. in src/cluster_linearize.h:557 in 6160ccf9b4 outdated
    552+        queue.emplace_back(SetInfo<SetType>{}, SetType{m_todo});
    553+
    554+        /** Local copy of the iteration limit. */
    555+        uint64_t iterations_left = max_iterations;
    556+
    557+        /** Internal function to add a work item.
    


    glozow commented at 4:58 pm on July 19, 2024:
    nit: maybe something like “explore and add a work item if there are any transactions left to split on” ?

    sipa commented at 5:35 pm on July 19, 2024:
    Did something like that.
  217. glozow commented at 5:02 pm on July 19, 2024: member
    code review ACK, planning to spend more time reviewing the tests
  218. sipa force-pushed on Jul 19, 2024
  219. instagibbs commented at 1:37 pm on July 22, 2024: member

    reACK https://github.com/bitcoin/bitcoin/pull/30126/commits/cad318fa843f411e52c6761a4882bfaf0ad21812

    via git range-diff master 6160ccf9b4327649e9bb0293fba630a10b3befc3 cad318fa843f411e52c6761a4882bfaf0ad21812

  220. DrahtBot requested review from glozow on Jul 22, 2024
  221. in src/test/fuzz/cluster_linearize.cpp:529 in 2597f096af outdated
    524+    auto linearization = ReadLinearization(depgraph, reader);
    525+
    526+    // Construct a LinearizationChunking object, initially for the whole linearization.
    527+    LinearizationChunking chunking(depgraph, linearization);
    528+
    529+    // Remove piece by piece of the linearization in the chunking object, and check various
    


    glozow commented at 3:52 pm on July 22, 2024:

    2597f096afde07d9c7f6466c0970a73947562be3

    nit: found this comment hard to parse. “Remove transactions from the chunking object, and check various…” ?


    sipa commented at 5:54 pm on July 23, 2024:
    Will do if I retouch.

    sipa commented at 8:27 pm on July 24, 2024:
    Done.
  222. in src/bench/cluster_linearize.cpp:47 in 896b52337b outdated
    27+}
    28+
    29+// Construct a difficult graph. These need at least sqrt(2^(n-1)) iterations in the best
    30+// known algorithms (purely empirically determined).
    31+template<typename SetType>
    32+DepGraph<SetType> MakeHardGraph(ClusterIndex ntx)
    


    glozow commented at 10:54 am on July 23, 2024:

    Question: I whiteboarded what this graph + the decision tree looks like to try to understand what makes this a hard graph. Do I understand correctly that

    • Within this PR, the feerates of transactions in the graph have no impact on how many iterations SearchCandidateFinder might run, since all work items are considered as long as they are topological.
    • The feerates of the parent+child transactions {4..n-1} become important after #30286’s optimizations that rule out transactions/branches that can’t lead to better feerates.

    (I’m not suggesting that you change anything here, just hoping to clarify my understanding as I haven’t looked very deeply at #30286 yet.)


    sipa commented at 5:28 pm on July 23, 2024:

    Yes, the current SearchCandidateFinder in this PR effectively searches over every topologically valid subset of the cluster (if the max_iterations limit is sufficiently high), which means that among other things the feerates don’t actually matter. It has a few other weirdnesses:

    • The randomization of the search order in this PR is mostly irrelevant (but still included so that the interface of Linearize is stable after this PR).
    • The SimpleCandidateFinder in the fuzz tests is actually more efficient (for now) than SearchCandidateFinder, because it only considers connected topologically valid subsets. This isn’t done in SearchCandidateFinder because a much more important optimization follows in #30286 that conflicts with it, but one might wonder why it’s not done here.
    • There are probably topologies that are for now worse for SearchCandidateFinder than the implemented one in terms of number of iterations required, but the benchmark is about cost per iteration rather than number of iterations, and I prefer being able to keep the benchmark stable throughout the optimizations.

    sipa commented at 8:14 pm on July 23, 2024:

    FWIW, this is what the clusters should look like:

    Odd (mentally swap T4 and T6):

    0graph BT
    1T0["T0: 1/2"];T1["T1: 14/2"];T2["T2: 6/1"];T3["T3: 5/1"];T4["T4: 7/1"];
    2T5["T5: 5/1"];T6["T6: 7/1"];T7["T7: 5/1"];T8["T8: 7/1"];T9["T9: 5/1"];
    3T10["T10: 7/1"];
    4T1-->T0;T1-->T2;T3-->T2;T4-->T3;T4-->T5;T6-->T5;T4-->T7;T8-->T7;T4-->T9;T10-->T9;
    

    Even (mentally swap T3 and T5):

    0graph BT
    1T0["T0: 1"];T1["T1: 3"];T2["T2: 1"];T3["T3: 4"];T4["T4: 0"];T5["T5: 4"];
    2T6["T6: 0"]; T7["T7: 4"];T8["T8: 0"];T9["T9: 4"];
    3T1-->T0;T2-->T0;T3-->T2;T3-->T4;T5-->T4;T3-->T6;T7-->T6;T3-->T8;T9-->T8;
    

    glozow commented at 11:55 am on July 24, 2024:
    Matches my diagrams. Thanks for the additional context! :+1:

    sipa commented at 8:27 pm on July 24, 2024:
    Added the mermaid code as comments to the benchmark.

    glozow commented at 9:39 am on July 25, 2024:
    Ah I didn’t notice this yesterday - I think you may have swapped odd and even? Same in the commented mermaid diagram code just added.

    sipa commented at 10:43 am on July 25, 2024:
    @glozow I don’t think so.

    glozow commented at 11:21 am on July 25, 2024:

    Hm, I was confused because when I was looking at the code, I expected the branch with depgraph.AddDependency(i, 4) to have lots of T4–>Tx in its diagram. And the branch with depgraph.AddDependency(i, 3) to have lots of T3–>Tx.

    {T0…T9} is 10 transactions, and {T0…T10} is 11 transactions. Am I missing something?


    sipa commented at 11:32 am on July 25, 2024:

    Errr, yes, you’re right of course!

    The odd and even are correct, but the mermaid code for the two are swapped.


    glozow commented at 11:39 am on July 25, 2024:
    Whew, so not wholly unmeritorious.

    sipa commented at 11:43 am on July 25, 2024:

    Yeah, not totally without merit.

    Actually, I don’t understand at all what is going on. The “// Odd cluster size.” code constructs a graph with an odd number of transactions, and the “// Even cluster size.” constructs one with an even number. Further, the mermaid code’s fees/sizes matches the corresponding code, but does not match the transaction counts?!


    sipa commented at 12:13 pm on July 25, 2024:

    Sigh! I believe this was the problem: it was constructing the intended-to-be-even-sized graph with an odd number of transactions and vice verse:

     0diff --git a/src/bench/cluster_linearize.cpp b/src/bench/cluster_linearize.cpp
     1index ddd7c6a7a61..3ebcc383bb4 100644
     2--- a/src/bench/cluster_linearize.cpp
     3+++ b/src/bench/cluster_linearize.cpp
     4@@ -33,10 +33,10 @@ DepGraph<SetType> MakeHardGraph(ClusterIndex ntx)
     5 {
     6     DepGraph<SetType> depgraph;
     7     for (ClusterIndex i = 0; i < ntx; ++i) {
     8-        if (ntx & 1) {
     9-            // Odd cluster size.
    10+        if ((ntx & 1) == 0) {
    11+            // Even cluster size.
    12             //
    13-            // Mermaid diagram code for the resulting cluster for 9 transactions:
    14+            // Mermaid diagram code for the resulting cluster for 10 transactions:
    15             // ```mermaid
    16             // graph BT
    17             // T0["T0: 1/2"];T1["T1: 14/2"];T2["T2: 6/1"];T3["T3: 5/1"];T4["T4: 7/1"];
    18@@ -62,9 +62,9 @@ DepGraph<SetType> MakeHardGraph(ClusterIndex ntx)
    19                 depgraph.AddDependency(i, 4);
    20             }
    21         } else {
    22-            // Even cluster size.
    23+            // Odd cluster size.
    24             //
    25-            // Mermaid diagram code for the resulting cluster for 10 transactions:
    26+            // Mermaid diagram code for the resulting cluster for 11 transactions:
    27             // ```mermaid
    28             // graph BT
    29             // T0["T0: 1"];T1["T1: 3"];T2["T2: 1"];T3["T3: 4"];T4["T4: 1"];T5["T5: 4"];
    

    sipa commented at 1:10 pm on July 25, 2024:
    Sorry, no, the original code was right, and there is more mixed up in the diagrams. Sorry…

    glozow commented at 1:14 pm on July 25, 2024:
    Are we meant to end with a child, not a parent? And the pairs are supposed to be cpfps?

    sipa commented at 1:27 pm on July 25, 2024:
    I’ve updated the diagrams higher up in the thread to match the actual code (both had several errors…). Let me verify whether they also actually have the complexity properties I claim they have, and then I’ll do a push to fix it. Sorry for the mess.

    sipa commented at 2:17 pm on July 25, 2024:
    Ok, pushed.
  223. in src/test/fuzz/cluster_linearize.cpp:599 in 2597f096af outdated
    614+            // the first transaction in todo if done is empty.
    615+            done = depgraph.Ancestors(todo.First()) & todo;
    616+        }
    617+        todo -= done;
    618+        chunking.MarkDone(done);
    619+        subset = SetInfo(depgraph, subset.transactions - done);
    


    glozow commented at 2:08 pm on July 23, 2024:
    There wouldn’t be any problem with resetting subset = SetInfo(depgraph, ReadTopologicalSet(depgraph, todo, reader)); here, right?

    sipa commented at 5:47 pm on July 23, 2024:
    It should work, but wouldn’t be testing the exact same thing (because the remaining part of a LinearizationChunking isn’t required to be topological, and in practice it’ll generally be the opposite: the full cluster with a topological subset removed).
  224. in src/test/fuzz/cluster_linearize.cpp:588 in 2597f096af outdated
    603+            prefix |= chunking.GetChunk(i).transactions;
    604+            auto reintersect = SetInfo(depgraph, prefix & intersect.transactions);
    605+            if (!reintersect.feerate.IsEmpty()) {
    606+                assert(reintersect.feerate <= intersect.feerate);
    607+            }
    608+        }
    


    glozow commented at 2:11 pm on July 23, 2024:
    I suppose this is the same as chunking.Intersect(intersect) except without exiting early?

    sipa commented at 5:45 pm on July 23, 2024:

    It’s similar, but not actually the same, because it isn’t testing the exact result of LinearizationChunking::Intersect, but testing the properties it has we care about for LIMO/Merging to work (feerate of output is as high as feerate of input, and no intersection with a chunk prefix is better). Specifically, if you replace LinearizationChunking::Intersect with something that returns the highest-feerate intersection (rather than the first intersection) which has as high feerate as the input, or if you make it consider every linearization prefix (rather than just chunk prefixes), this test still works fine.

    Another difference is that the test here recomputes the full intersection feerate every time rather than incrementally computing it.

  225. glozow commented at 4:38 pm on July 23, 2024: member

    ACK cad318fa843

    I did code review to convince myself that the code does what it’s supposed to be doing / what’s described in the delving post. Everything is a bit abstract so I manually tested by writing some naive versions of BlockAssembler and MiniMiner using cluster_linearize. I also code reviewed the tests and introduced bugs to check their coverage. I didn’t really dive into DepGraphFormatter but it seems to work.

  226. in src/test/util/cluster_linearize.h:346 in a26618fbb3 outdated
    302+    // Check completeness.
    303+    assert(linearization.size() == depgraph.TxCount());
    304+    TestBitSet done;
    305+    for (auto i : linearization) {
    306+        // Check topology and lack of duplicates.
    307+        assert((depgraph.Ancestors(i) - done) == TestBitSet::Singleton(i));
    


    sdaftuar commented at 6:56 pm on July 23, 2024:
    We could consider checking that the entries in linearization are all in the expected range [0, depGraph.TxCount()), to avoid a crash here if a garbage linearization were passed in?

    sipa commented at 8:29 pm on July 24, 2024:
    Done.
  227. in src/test/fuzz/cluster_linearize.cpp:563 in 2597f096af outdated
    558+            assert(chunk_info.transactions.IsSubsetOf(todo));
    559+            // Chunks' claimed feerates must match their transactions' aggregate feerate.
    560+            assert(depgraph.FeeRate(chunk_info.transactions) == chunk_info.feerate);
    561+            // Chunks must be the highest-feerate remaining prefix.
    562+            SetInfo<TestBitSet> accumulator, best;
    563+            for (auto i : linearization) {
    


    sdaftuar commented at 4:49 pm on July 24, 2024:
    Maybe a different variable here would be better, since i is already used in the outer loop?

    sipa commented at 8:29 pm on July 24, 2024:
    Done.
  228. sdaftuar commented at 5:20 pm on July 24, 2024: member

    Code review ACK cad318fa843f411e52c6761a4882bfaf0ad21812. Left some non-blocking nits/suggestions.

    I’ve run the fuzz tests in earlier versions of this PR for a while, but I haven’t done so with the latest branch – will do that and also post some benchmarks in a bit.

  229. sdaftuar commented at 5:27 pm on July 24, 2024: member

    Benchmark results on my ryzen 7995wx:

    ns/op op/s err% total benchmark
    1,474.01 678,422.17 0.3% 0.01 LinearizeNoIters16TxWorstCaseAnc
    1,473.48 678,667.50 0.2% 0.01 LinearizeNoIters16TxWorstCaseLIMO
    4,763.02 209,951.04 0.1% 0.01 LinearizeNoIters32TxWorstCaseAnc
    5,138.13 194,623.26 0.0% 0.01 LinearizeNoIters32TxWorstCaseLIMO
    10,296.19 97,123.33 0.0% 0.01 LinearizeNoIters48TxWorstCaseAnc
    11,148.56 89,697.71 0.0% 0.01 LinearizeNoIters48TxWorstCaseLIMO
    17,321.56 57,731.51 0.1% 0.01 LinearizeNoIters64TxWorstCaseAnc
    19,224.32 52,017.44 0.0% 0.01 LinearizeNoIters64TxWorstCaseLIMO
    25,346.32 39,453.45 0.1% 0.01 LinearizeNoIters75TxWorstCaseAnc
    28,625.57 34,933.81 0.1% 0.01 LinearizeNoIters75TxWorstCaseLIMO
    43,628.14 22,920.98 0.1% 0.01 LinearizeNoIters99TxWorstCaseAnc
    49,273.81 20,294.76 0.1% 0.01 LinearizeNoIters99TxWorstCaseLIMO
    ns/iters iters/s err% total benchmark
    13.81 72,390,978.12 0.1% 0.01 LinearizePerIter16TxWorstCase
    9.62 103,944,701.42 0.2% 0.01 LinearizePerIter32TxWorstCase
    9.29 107,658,429.17 0.2% 0.01 LinearizePerIter48TxWorstCase
    9.30 107,518,263.44 0.2% 0.01 LinearizePerIter64TxWorstCase
    10.21 97,957,584.37 0.1% 0.01 LinearizePerIter75TxWorstCase
    10.07 99,325,875.28 0.2% 0.01 LinearizePerIter99TxWorstCase
  230. sipa force-pushed on Jul 24, 2024
  231. glozow commented at 11:35 am on July 25, 2024: member
    reACK df749c06e9d via range-diff
  232. DrahtBot requested review from instagibbs on Jul 25, 2024
  233. DrahtBot requested review from sdaftuar on Jul 25, 2024
  234. instagibbs commented at 12:14 pm on July 25, 2024: member

    reACK https://github.com/bitcoin/bitcoin/pull/30126/commits/df749c06e9d45ba92a4699e1d3e58522f93ad6bb

    via range-diff master cad318fa843f411e52c6761a4882bfaf0ad21812 df749c06e9d45ba92a4699e1d3e58522f93ad6bb

    Didn’t verify the mermaid diagrams in the comments.

    At this point could we maybe not rebase on master unless necessary? :)

  235. sipa force-pushed on Jul 25, 2024
  236. sipa commented at 12:15 pm on July 25, 2024: member

    At this point could we maybe not rebase on master unless necessary? :)

    I shall resist.

  237. instagibbs commented at 12:19 pm on July 25, 2024: member

    reACK https://github.com/bitcoin/bitcoin/pull/30126/commits/448374f2c901c48824904357b4e1297105e97315

    via git range-diff master df749c06e9d45ba92a4699e1d3e58522f93ad6bb 448374f2c901c48824904357b4e1297105e97315

    only changes to MakeHardGraph to reflect the intended topology

  238. DrahtBot requested review from glozow on Jul 25, 2024
  239. glozow commented at 1:08 pm on July 25, 2024: member
    reACK 448374f2c90 via range-diff
  240. clusterlin: introduce cluster_linearize.h with Cluster and DepGraph types
    This primarily adds the DepGraph class, which encapsulates precomputed
    ancestor/descendant information for a given transaction cluster, with a
    number of utility features (inspectors for set feerates, computing
    reduced parents/children, adding transactions, adding dependencies), which
    will become needed in future commits.
    a6e07e769a
  241. tests: framework for testing DepGraph class
    This introduces a bespoke fuzzing-focused serialization format for DepGraphs,
    and then tests that this format can represent any graph, roundtrips, and then
    uses that to test the correctness of DepGraph itself.
    
    This forms the basis for future fuzz tests that need to work with interesting
    graphs.
    58f7e01db4
  242. clusterlin: add AncestorCandidateFinder class
    This is a class that encapsulates precomputed ancestor set feerates, and
    presents an interface for getting the best remaining ancestor set.
    4828079db3
  243. clusterlin: add SearchCandidateFinder class
    Similar to AncestorCandidateFinder, this encapsulates the state needed for
    finding good candidate sets using a search algorithm.
    2a41f151af
  244. clusterlin: add chunking algorithm
    A fuzz test is added which verifies various of its expected properties, including
    correctness
    ee0ddfe4f6
  245. clusterlin: add Linearize function
    This adds a first version of the overall linearization interface, which given
    a DepGraph constructs a good linearization, by incrementally including good
    candidate sets (found using AncestorCandidateFinder and SearchCandidateFinder).
    46aad9b099
  246. bench: Candidate finding and linearization benchmarks
    Add benchmarks for known bad graphs for the purpose of search (as
    an upper bound on work per search iterations) and ancestor sorting
    (as an upper bound on linearization work with no search iterations).
    d9b235e7d2
  247. clusterlin: use bounded BFS exploration (optimization)
    Switch to BFS exploration of the search tree in SearchCandidateFinder
    instead of DFS exploration. This appears to behave better for real
    world clusters.
    
    As BFS has the downside of needing far larger search queues, switch
    back to DFS temporarily when the queue grows too large.
    991ff9a9a4
  248. clusterlin: randomize the SearchCandidateFinder search order
    To make search non-deterministic, change the BFS logic from always picking
    the first queue item to randomly picking the first or second queue item.
    d5918dc3c6
  249. clusterlin: add LinearizationChunking class
    It encapsulates a given linearization in chunked form, permitting arbitrary
    subsets of transactions to be removed from the linearization. Its purpose
    is adding the Intersect function, which is a crucial operation that will
    be used in a further commit to make Linearize improve existing linearizations.
    97d98718b0
  250. clusterlin: permit passing in existing linearization to Linearize
    This implements the LIMO algorithm for linearizing by improving an existing
    linearization. See
    https://delvingbitcoin.org/t/limo-combining-the-best-parts-of-linearization-search-and-merging
    for details.
    28549791b3
  251. bench: add cluster linearization improvement benchmark 647fa37cdb
  252. sipa force-pushed on Jul 25, 2024
  253. sipa commented at 2:19 pm on July 25, 2024: member

    Ok, the mermaid diagrams should be fixed now, match the code, and I’ve verified (post- #30286) that the claims about difficulty of linearization hold for the code.

    I shall resist.

    I failed.

  254. instagibbs commented at 3:04 pm on July 25, 2024: member

    reACK 647fa37cdbadbeebba147ca6b24e138559cffaaf

    via git range-diff master 448374f2c901c48824904357b4e1297105e97315 647fa37cdbadbeebba147ca6b24e138559cffaaf

    I traced through the generated diagrams + code, seems to match. Didn’t verify difficulty.

  255. glozow commented at 3:05 pm on July 25, 2024: member

    reACK 647fa37cdba, both code and mermaid diagram look correct to me

    The git diff is always greener When you rebase by mistake We dream about cluster mempool Don care how many reACKs it take Just look at the code around you The bench was correct before! Under the sea, Under the sea…

  256. sdaftuar commented at 8:41 pm on July 25, 2024: member

    ACK 647fa37cdbadbeebba147ca6b24e138559cffaaf

    In addition to some manual testing, I ran all the fuzz targets on my machine for a while as well.

  257. glozow merged this on Jul 26, 2024
  258. glozow closed this on Jul 26, 2024

  259. hebasto added the label Needs CMake port on Jul 31, 2024
  260. hebasto commented at 9:55 am on July 31, 2024: member
    Ported to the CMake-based build system in https://github.com/hebasto/bitcoin/pull/290.
  261. hebasto removed the label Needs CMake port on Jul 31, 2024
  262. glozow referenced this in commit bba01ba18d on Aug 5, 2024
  263. ismaelsadeeq commented at 4:57 pm on August 26, 2024: member
    post merge ACK 647fa37cdbadbeebba147ca6b24e138559cffaaf

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-28 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me