cluster mempool: merging & postprocessing of linearizations #30285

pull sipa wants to merge 5 commits into bitcoin:master from sipa:202406_clusterlin_meta changing 3 files +653 −16
  1. sipa commented at 2:12 AM on June 14, 2024: member

    Part of cluster mempool: #30289

    Depends on #30126, and was split off from it. #28676 depends on this.

    This adds the algorithms for merging & postprocessing linearizations.

    The PostLinearize(depgraph, linearization) function performs an in-place improvement of linearization, using two iterations of the Linearization post-processing algorithm. The first running from back to front, the second from front to back.

    The MergeLinearizations(depgraph, linearization1, linearization2) function computes a new linearization for the provided cluster, given two existing linearizations for that cluster, which is at least as good as both inputs. The algorithm is described at a high level in merging incomparable linearizations.

    For background and references, see Introduction to cluster linearization.

  2. DrahtBot commented at 2:12 AM on June 14, 2024: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--006a51241073e994b41acfe9ec718e94-->

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK glozow, instagibbs, sdaftuar

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28676 ([WIP] Cluster mempool implementation by sdaftuar)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. sipa renamed this:
    Low-level cluster linearization code: merging & postprocessing
    cluster mempool: merging and postprocessing for linearizations
    on Jun 14, 2024
  4. sipa renamed this:
    cluster mempool: merging and postprocessing for linearizations
    cluster mempool: merging & postprocessing of linearizations
    on Jun 14, 2024
  5. sipa added the label Mempool on Jun 14, 2024
  6. sipa force-pushed on Jul 2, 2024
  7. sipa force-pushed on Jul 10, 2024
  8. sipa force-pushed on Jul 10, 2024
  9. DrahtBot added the label CI failed on Jul 10, 2024
  10. DrahtBot commented at 10:20 PM on July 10, 2024: contributor

    <!--85328a0da195eb286784d51f73fa0af9-->

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    <sub>Debug: https://github.com/bitcoin/bitcoin/runs/27291659212</sub>

  11. sipa force-pushed on Jul 11, 2024
  12. sipa force-pushed on Jul 11, 2024
  13. DrahtBot removed the label CI failed on Jul 11, 2024
  14. sipa force-pushed on Jul 19, 2024
  15. sipa force-pushed on Jul 19, 2024
  16. DrahtBot added the label Needs rebase on Jul 26, 2024
  17. sipa force-pushed on Jul 26, 2024
  18. DrahtBot removed the label Needs rebase on Jul 26, 2024
  19. in src/cluster_linearize.h:807 in 8eaca20d60 outdated
     802 | +    //
     803 | +    // During an odd pass, the high-level operation is:
     804 | +    // - Start with an empty list of groups L=[].
     805 | +    // - For every transaction i in the old linearization, from front to back:
     806 | +    //   - Append a new group C=[i], containing just i, to the back of L.
     807 | +    //   - While there is a group P in L immediately preceding C, which has lower feerate than C:
    


    instagibbs commented at 1:44 PM on July 30, 2024:
        //   - While there is a group P in L immediately preceding C which has lower feerate than C:
    

    sipa commented at 6:11 PM on August 1, 2024:

    This was rewritten.

  20. in src/cluster_linearize.h:815 in 8eaca20d60 outdated
     785 | + * @param[in,out] linearization  On input, an existing linearization for depgraph. On output, a
     786 | + *                               potentially better linearization for the same graph.
     787 | + *
     788 | + * Postlinearization guarantees:
     789 | + * - The resulting chunks are connected.
     790 | + * - If the input has a tree shape (either all transactions have at most one child, or all
    


    instagibbs commented at 2:10 PM on July 30, 2024:

    intuitively, and from attempting to fuzz this, each pass corresponds to one type of tree shape. Might be good to note this, and if state if there are other motivations as well to do two passes and under what topologies.


    sipa commented at 8:13 PM on August 1, 2024:

    I've explained this in more detail (in the comment below, inside the function body).

  21. in src/cluster_linearize.h:834 in 8eaca20d60 outdated
     829 | +    {
     830 | +        /** The index of the previous transaction in this group; 0 if this is the first entry of
     831 | +         *  a group. */
     832 | +        ClusterIndex prev_tx;
     833 | +
     834 | +        // Fields that are only used for transactions that are the last one in a group:
    


    instagibbs commented at 2:23 PM on July 30, 2024:

    next field is first_tx, not last


    sipa commented at 6:40 PM on July 30, 2024:

    The first_tx field (just like the prev_group, group, deps, feerate fields) are only used for transactions which are the last one in a group.


    sipa commented at 1:27 PM on July 31, 2024:

    I have tried to make the comments here a bit clearer.

  22. in src/test/fuzz/cluster_linearize.cpp:789 in 8eaca20d60 outdated
     784 | +    SanityCheck(depgraph, linearization);
     785 | +
     786 | +    // Produce a post-processed version.
     787 | +    auto post_linearization = linearization;
     788 | +    PostLinearize(depgraph, post_linearization);
     789 | +    SanityCheck(depgraph, linearization);
    


    instagibbs commented at 2:27 PM on July 30, 2024:

    sanity checking wrong linearization?


    sipa commented at 10:10 PM on July 30, 2024:

    Indeed, fixed (here and below).

  23. in src/test/fuzz/cluster_linearize.cpp:804 in 8eaca20d60 outdated
     791 | +    // Compare diagrams: post-linearization cannot worsen anywhere.
     792 | +    auto old_chunking = ChunkLinearization(depgraph, linearization);
     793 | +    auto new_chunking = ChunkLinearization(depgraph, post_linearization);
     794 | +    auto cmp = CompareChunks(new_chunking, old_chunking);
     795 | +    assert(cmp >= 0);
     796 | +
    


    instagibbs commented at 2:31 PM on July 30, 2024:
    
        // Run again, things can keep improving (and never get worse)
        PostLinearize(depgraph, post_linearization);
        SanityCheck(depgraph, post_linearization);
        auto newest_chunking = ChunkLinearization(depgraph, post_linearization);
        cmp = CompareChunks(newest_chunking, new_chunking);
        assert(cmp >= 0);
        
    

    sipa commented at 10:11 PM on July 30, 2024:

    Added.

  24. in src/test/fuzz/cluster_linearize.cpp:871 in 8eaca20d60 outdated
     858 | +
     859 | +    // Compare diagrams.
     860 | +    auto old_chunking = ChunkLinearization(depgraph_tree, linearization);
     861 | +    auto new_chunking = ChunkLinearization(depgraph_tree, post_linearization);
     862 | +    auto cmp = CompareChunks(new_chunking, old_chunking);
     863 | +    assert(cmp >= 0);
    


    instagibbs commented at 2:34 PM on July 30, 2024:

    Could test that running PostLinearization again doesn't do anything


    sipa commented at 10:11 PM on July 30, 2024:

    Done.

  25. in src/test/fuzz/cluster_linearize.cpp:892 in 8eaca20d60 outdated
     870 | +    assert(cmp_opt == 0);
     871 | +}
     872 | +
     873 | +FUZZ_TARGET(clusterlin_postlinearize_moved_leaf)
     874 | +{
     875 | +    // Verify that taking an existing linearization, and moving a leaf to the back, potentially
    


    instagibbs commented at 2:34 PM on July 30, 2024:

    is this specifically for prioritisetransaction or something?


    sipa commented at 10:12 PM on July 30, 2024:

    No, it means that a particular approach for RBF will never worsen single-transaction leaf replacements that do not change the size of a transaction. I've added a comment to clarify.


    instagibbs commented at 10:16 PM on July 30, 2024:

    That was my initial thought but the size isn't changing so I'm unsure how it would map to an RBF?


    sipa commented at 10:19 PM on July 30, 2024:

    "laboratory conditions" I guess. It's just a nice property that holds; even if the conditions for it don't exactly hold often in reality, it probably means they're not far off. I'll look into whether I can generalize it a bit to support changing size.


    sdaftuar commented at 3:11 PM on July 31, 2024:

    To add on to that -- at one point in an earlier draft of #28676, we were seeing examples of RBFs where a single transaction was being replaced with one that had identical size, identical parents, but higher fee, yet due to quirks in how linearization worked the replacement was being rejected because the diagram wasn't improving.

    We've since generalized this concern around accidentally discovering a better linearization for a cluster while processing a potential replacement, and I think we will be addressing this in a more robust way, but it's nice that PostLinearization solves the specific problem we observed in the wild.

  26. in src/test/fuzz/cluster_linearize.cpp:857 in 8eaca20d60 outdated
     852 | +    SanityCheck(depgraph_tree, linearization);
     853 | +
     854 | +    // Produce a post-processed version.
     855 | +    auto post_linearization = linearization;
     856 | +    PostLinearize(depgraph_tree, post_linearization);
     857 | +    SanityCheck(depgraph_tree, linearization);
    


    instagibbs commented at 2:35 PM on July 30, 2024:

    sanity checking wrong linearization?


    sipa commented at 10:11 PM on July 30, 2024:

    Indeed, fixed.

  27. in src/cluster_linearize.h:992 in c17480e144 outdated
     987 | +        // prefixes of remaining chunks of the other linearization.
     988 | +        SetInfo<SetType> best;
     989 | +        const auto& lin1_firstchunk = chunking1.GetChunk(0);
     990 | +        const auto& lin2_firstchunk = chunking2.GetChunk(0);
     991 | +        if (lin2_firstchunk.feerate >> lin1_firstchunk.feerate) {
     992 | +            best = chunking1.Intersect(lin2_firstchunk);
    


    instagibbs commented at 3:06 PM on July 30, 2024:

    hm, Intersect had me quite confused in this context, consider renaming this as something like PrefixIntersect?


    sipa commented at 10:10 PM on July 30, 2024:

    Added a commit that renames it.

  28. in src/cluster_linearize.h:1042 in c17480e144 outdated
     994 | +            best = chunking2.Intersect(lin1_firstchunk);
     995 | +        }
     996 | +        // Append the result to the output and mark it as done.
     997 | +        depgraph.AppendTopo(ret, best.transactions);
     998 | +        chunking1.MarkDone(best.transactions);
     999 | +        if (chunking1.NumChunksLeft() == 0) break;
    


    instagibbs commented at 3:12 PM on July 30, 2024:

    breaking here allows you to skip one last MarkDone on completion vs putting the count in the while loop, yes?


    sipa commented at 10:03 PM on July 30, 2024:

    Exactly.

  29. instagibbs commented at 3:21 PM on July 30, 2024: member

    f4a183cd17c7c770fd6b9d32b9d6c6f791bceb4e

    did not yet review implementation of PostLinearize, didn't verify benchmarks with optimizations

  30. sipa force-pushed on Jul 30, 2024
  31. DrahtBot added the label CI failed on Jul 30, 2024
  32. DrahtBot removed the label CI failed on Jul 31, 2024
  33. in src/cluster_linearize.h:842 in 2df78c9cd6 outdated
     811 | +    // - Start with an empty list of groups L=[].
     812 | +    // - For every transaction i in the old linearization, from front to back:
     813 | +    //   - Append a new group C=[i], containing just i, to the back of L.
     814 | +    //   - While L has at least one group before C, and the group immediately before C has feerate
     815 | +    //     lower than C:
     816 | +    //       - Merge P into C, making C the concatenation of P+C, continuing with the combined C.
    


    sdaftuar commented at 2:01 PM on July 31, 2024:

    I think in your update to this comment, you dropped the condition that C depends on P. (But thanks for redrafting -- this is easier for me to read.)


    sipa commented at 8:14 PM on August 1, 2024:

    Indeed, fixed.

  34. in src/test/fuzz/cluster_linearize.cpp:855 in 2df78c9cd6 outdated
     850 | +                parents -= depgraph_gen.Ancestors(j);
     851 | +                parents.Set(j);
     852 | +            }
     853 | +            if (parents.Any()) depgraph_tree.AddDependency(parents.First(), i);
     854 | +        }
     855 | +    }
    


    sdaftuar commented at 3:06 PM on July 31, 2024:

    Just to test my understanding: at this point, the graph we've constructed may not be connected, right?


    sipa commented at 7:08 PM on August 1, 2024:

    Certainly, for two reasons:

    • depgraph_gen may not be connected to begin with (there is no MakeConnected(depgraph_gen); perhaps there should be?)
    • Even if depgraph_gen is connected (imagine it being a trellis for example), removing all but the first parent, or all but the first child, may split it up.
  35. in src/test/fuzz/cluster_linearize.cpp:896 in 2df78c9cd6 outdated
     891 | +{
     892 | +    // Verify that taking an existing linearization, and moving a leaf to the back, potentially
     893 | +    // increasing its fee, and then post-linearizing, results in something as good as the
     894 | +    // original. This guarantees that in an RBF that replaces a transaction with one of the same
     895 | +    // size but higher fee, applying the "remove conflicts, append new transaction, postlinearize"
     896 | +    // process to an optimalized linearization will retain optimality.
    


    sdaftuar commented at 3:13 PM on July 31, 2024:

    optimalized --> optimal


    sipa commented at 8:14 PM on August 1, 2024:

    Rewritten (the claim wasn't just grammatically bad, it was also just wrong).

  36. sdaftuar commented at 4:21 PM on July 31, 2024: member

    code review ACK 157464f3a9444ac9b2337227a2fee2f3adbd0918, will fuzz

  37. in src/cluster_linearize.h:183 in 157464f3a9 outdated
     178 | +     * Specifically, this finds the connected component which contains the first transaction of
     179 | +     * todo (if any).
     180 | +     *
     181 | +     * Two transactions are considered connected if there is a path from one to the other inside
     182 | +     * todo, where each is an ancestor or descendant of the next one in the entire graph (not just
     183 | +     * within todo). Thus, if if todo contains a transaction and a grandparent, but misses the
    


    instagibbs commented at 4:37 PM on July 31, 2024:
         * within todo). Thus, if todo contains a transaction and a grandparent, but misses the
    

    sipa commented at 8:14 PM on August 1, 2024:

    Fixed.

  38. in src/test/fuzz/cluster_linearize.cpp:806 in 157464f3a9 outdated
     801 | +    auto post_post_chunking = ChunkLinearization(depgraph, post_post_linearization);
     802 | +    cmp = CompareChunks(post_post_chunking, post_chunking);
     803 | +    assert(cmp >= 0);
     804 | +
     805 | +    // The chunks that come out of postlinearizing are always connected.
     806 | +    LinearizationChunking linchunking(depgraph, post_linearization);
    


    instagibbs commented at 4:41 PM on July 31, 2024:

    sounds like a tasty cantonese restaurant

  39. in src/cluster_linearize.h:181 in 157464f3a9 outdated
     172 | @@ -171,6 +173,50 @@ class DepGraph
     173 |          return ret;
     174 |      }
     175 |  
     176 | +    /** Find some connected component within the subset "todo" of this graph.
     177 | +     *
     178 | +     * Specifically, this finds the connected component which contains the first transaction of
     179 | +     * todo (if any).
     180 | +     *
     181 | +     * Two transactions are considered connected if there is a path from one to the other inside
    


    instagibbs commented at 4:51 PM on July 31, 2024:
         * Two transactions are considered connected if there is a path from one to the other and both are inside
    

    ?

    in general this description and rewrite was very helpful


    sipa commented at 7:33 PM on August 1, 2024:

    Not just the two transactions, but all transactions in the path need to be in todo.


    sipa commented at 8:15 PM on August 1, 2024:

    I have rewritten this. Please have a look if it's better.


    instagibbs commented at 8:22 PM on August 1, 2024:

    definitely clearer on what the "path" is :+1: (and matches my understanding of the code now)

  40. in src/cluster_linearize.h:861 in 157464f3a9 outdated
     856 | +    {
     857 | +        /** The index of the previous transaction in this group; 0 if this is the first entry of
     858 | +         *  a group. */
     859 | +        ClusterIndex prev_tx;
     860 | +
     861 | +        // The fields below are only used for transactions that are the last one in a group:
    


    instagibbs commented at 5:27 PM on July 31, 2024:
            // The fields below are only used for transactions that are the last one in a group, also known as a tail transaction:
    

    sipa commented at 8:15 PM on August 1, 2024:

    Done-ish.

  41. in src/cluster_linearize.h:920 in 157464f3a9 outdated
     915 | +            ClusterIndex cur_group = idx + 1;
     916 | +            entries[cur_group].group = SetType::Singleton(idx);
     917 | +            entries[cur_group].deps = rev ? depgraph.Descendants(idx): depgraph.Ancestors(idx);
     918 | +            entries[cur_group].feerate = depgraph.FeeRate(idx);
     919 | +            if (rev) entries[cur_group].feerate.fee = -entries[cur_group].feerate.fee;
     920 | +            entries[cur_group].prev_tx = 0; // No previous transaction in group.
    


    instagibbs commented at 8:15 PM on July 31, 2024:

    except for here, I think all uses of 0 are sentinel references, could add a constant to denote that meaning?


    sipa commented at 8:15 PM on August 1, 2024:

    Done. I have introduced two constants SENTINEL and NO_PREV_TX (both with value 0) and used them instead of the literal 0s here.

  42. in src/cluster_linearize.h:958 in 157464f3a9 outdated
     938 | +                Assume(cur_group != 0);
     939 | +                Assume(prev_group != 0);
     940 | +                if (entries[cur_group].deps.Overlaps(entries[prev_group].group)) {
     941 | +                    // There is a dependency between cur_group and prev_group; merge prev_group
     942 | +                    // into cur_group.
     943 | +                    entries[cur_group].group |= entries[prev_group].group;
    


    instagibbs commented at 8:18 PM on July 31, 2024:

    could note that after these lines, entries[prev_group]'s group + deps + feerate fields are still set but no longer used


    sipa commented at 8:15 PM on August 1, 2024:

    Indeed, added a comment.

  43. sdaftuar commented at 8:24 PM on July 31, 2024: member

    ACK 157464f3a9444ac9b2337227a2fee2f3adbd0918

  44. in src/test/fuzz/cluster_linearize.cpp:930 in 157464f3a9 outdated
     925 | +    // Compare diagrams (applying the fee delta after computing the old one).
     926 | +    auto old_chunking = ChunkLinearization(depgraph, lin);
     927 | +    depgraph.FeeRate(lin_leaf.back()).fee += fee_inc;
     928 | +    auto new_chunking = ChunkLinearization(depgraph, lin_moved);
     929 | +    auto cmp = CompareChunks(new_chunking, old_chunking);
     930 | +    assert(cmp >= 0);
    


    instagibbs commented at 8:32 PM on July 31, 2024:

    nice to have a non-tree example of things strictly improving, also interesting to see that improvements still happen even if no fees are added edit: hm, still not quite showing strict improvement since fees are after the fact; it's a different "cluster". Oh well.

        if (fee_inc > 0) {
            // It's more fees; should be superior
            assert(cmp > 0);
        } else {
            assert(cmp >= 0);
        }
    
  45. instagibbs approved
  46. instagibbs commented at 8:37 PM on July 31, 2024: member

    ACK 157464f3a9444ac9b2337227a2fee2f3adbd0918

    I didn't verify benchmark timings but opts seemed ok

  47. in src/cluster_linearize.h:1045 in 17b29eae7c outdated
    1005 | +        depgraph.AppendTopo(ret, best.transactions);
    1006 | +        chunking1.MarkDone(best.transactions);
    1007 | +        if (chunking1.NumChunksLeft() == 0) break;
    1008 | +        chunking2.MarkDone(best.transactions);
    1009 | +    }
    1010 | +
    


    glozow commented at 11:24 AM on August 1, 2024:

    could do an Assume(ret.size() == depgraph.TxCount()); before returning here, or in the fuzz test


    sipa commented at 8:16 PM on August 1, 2024:

    Done.

  48. sdaftuar commented at 3:37 PM on August 1, 2024: member

    Here are my linearization benchmarks (ryzen 7995wx) after this PR (compare with #30126 (comment)):

    | ns/op | op/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 973.43 | 1,027,297.78 | 0.5% | 0.01 | LinearizeNoIters16TxWorstCaseAnc | 1,464.85 | 682,664.84 | 0.2% | 0.01 | LinearizeNoIters16TxWorstCaseLIMO | 2,503.93 | 399,372.47 | 0.3% | 0.01 | LinearizeNoIters32TxWorstCaseAnc | 5,091.86 | 196,392.05 | 0.2% | 0.01 | LinearizeNoIters32TxWorstCaseLIMO | 4,881.34 | 204,861.99 | 0.1% | 0.01 | LinearizeNoIters48TxWorstCaseAnc | 10,880.48 | 91,907.74 | 0.0% | 0.01 | LinearizeNoIters48TxWorstCaseLIMO | 7,843.67 | 127,491.30 | 0.0% | 0.01 | LinearizeNoIters64TxWorstCaseAnc | 18,803.07 | 53,182.80 | 0.1% | 0.01 | LinearizeNoIters64TxWorstCaseLIMO | 11,747.48 | 85,124.62 | 0.1% | 0.01 | LinearizeNoIters75TxWorstCaseAnc | 28,902.79 | 34,598.74 | 0.0% | 0.01 | LinearizeNoIters75TxWorstCaseLIMO | 19,030.02 | 52,548.55 | 0.1% | 0.01 | LinearizeNoIters99TxWorstCaseAnc | 49,663.43 | 20,135.54 | 0.1% | 0.01 | LinearizeNoIters99TxWorstCaseLIMO

    | ns/iters | iters/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 13.66 | 73,231,349.71 | 0.1% | 0.01 | LinearizePerIter16TxWorstCase | 9.63 | 103,869,988.01 | 0.3% | 0.01 | LinearizePerIter32TxWorstCase | 9.38 | 106,626,518.23 | 0.3% | 0.01 | LinearizePerIter48TxWorstCase | 9.44 | 105,920,646.37 | 0.3% | 0.01 | LinearizePerIter64TxWorstCase | 10.38 | 96,365,108.12 | 0.3% | 0.01 | LinearizePerIter75TxWorstCase | 10.38 | 96,374,023.73 | 0.2% | 0.01 | LinearizePerIter99TxWorstCase

    | ns/op | op/s | err% | total | benchmark |--------------------:|--------------------:|--------:|----------:|:---------- | 690.49 | 1,448,239.70 | 0.4% | 0.01 | MergeLinearizations16TxWorstCase | 2,703.90 | 369,835.96 | 0.1% | 0.01 | MergeLinearizations32TxWorstCase | 6,143.66 | 162,769.31 | 0.0% | 0.01 | MergeLinearizations48TxWorstCase | 11,300.76 | 88,489.66 | 0.1% | 0.01 | MergeLinearizations64TxWorstCase | 17,576.48 | 56,894.21 | 0.0% | 0.01 | MergeLinearizations75TxWorstCase | 31,100.42 | 32,153.91 | 0.0% | 0.01 | MergeLinearizations99TxWorstCase | 278.14 | 3,595,258.27 | 0.0% | 0.01 | PostLinearize16TxWorstCase | 863.81 | 1,157,659.26 | 0.0% | 0.01 | PostLinearize32TxWorstCase | 2,657.66 | 376,271.23 | 0.2% | 0.01 | PostLinearize48TxWorstCase | 4,652.11 | 214,956.22 | 0.0% | 0.01 | PostLinearize64TxWorstCase | 5,392.31 | 185,449.25 | 0.2% | 0.01 | PostLinearize75TxWorstCase | 9,431.09 | 106,032.27 | 0.1% | 0.01 | PostLinearize99TxWorstCase

  49. in src/test/fuzz/cluster_linearize.cpp:927 in 2df78c9cd6 outdated
     922 | +    PostLinearize(depgraph, lin_moved);
     923 | +    SanityCheck(depgraph, lin_moved);
     924 | +
     925 | +    // Compare diagrams (applying the fee delta after computing the old one).
     926 | +    auto old_chunking = ChunkLinearization(depgraph, lin);
     927 | +    depgraph.FeeRate(lin_leaf.back()).fee += fee_inc;
    


    glozow commented at 4:45 PM on August 1, 2024:

    I'm a bit confused trying to understand the "remove conflicts, append new tx to the end, postlinearize" RBF idea. I kind of expected this fee increment to happen before PostLinearize() is called, if it's similar to appending the replacement with higher fees to the end.


    sipa commented at 8:08 PM on August 1, 2024:

    So imagine your have an existing cluster C with an existing linearization L. It contains a leaf transaction R (which has no descendants) somewhere, but not at the very end of L.

    A new transaction A comes in which has the same size as R, same ancestors as R, and the same or higher fee as R.

    Imagine that in such a scenario the RBF code starts by taking L, removing all conflicts (in this case just R) from it, and appending the new transaction A at the end, and runs PostLinearize(), before invoking LIMO and whatnot.

    The test here guarantees that the result of that PostLinearize() will already be as good as the old linearization was, despite having "forgotten" where inside L R was placed. In other words, PostLinearize is - in this very limited scenario - able to "restore" the quality the linearization had.


    glozow commented at 10:49 AM on August 2, 2024:

    Ah okay my bad, I thought it was simulating the RBF strategy but it's actually asserting a more general property, that PostLinearize can bring us "back" to something at least as good as before the leaf was moved to the back.

  50. clusterlin: rename Intersect -> IntersectPrefixes
    This makes it clearer what the function does.
    0e52728a2d
  51. clusterlin: add algorithms for connectedness/connected components
    Add utility functions to DepGraph for finding connected components.
    0e2812d293
  52. clusterlin: add PostLinearize + benchmarks + fuzz tests 4f8958d756
  53. clusterlin: add MergeLinearizations function + fuzz test + benchmark 04d7a04ea4
  54. clusterlin: improve rechunking in LinearizationChunking (optimization)
    When the transactions being marked done exactly match the first chunk of
    what remains of the linearization, we can just remember to skip that
    chunk instead of computing a full rechunking.
    
    Further, chop off prefixes of the input linearization that are already done,
    so they don't need to be reconsidered for further rechunkings.
    bbcee5a0d6
  55. sipa force-pushed on Aug 1, 2024
  56. in src/cluster_linearize.h:833 in bbcee5a0d6
     828 | +    // than the one started from.
     829 | +    // - One pass in either direction guarantees that the resulting chunks are connected.
     830 | +    // - Each direction corresponds to one shape of tree being linearized optimally (forward passes
     831 | +    //   guarantee this for graphs where each transaction has at most one child; backward passes
     832 | +    //   guarantee this for graphs where each transaction has at most one parent).
     833 | +    // - Starting with a backward pass guarantees the moved-tree property.
    


    instagibbs commented at 8:24 PM on August 1, 2024:

    moved-leaf? (or point to another usage of moved tree)

  57. in src/cluster_linearize.h:914 in bbcee5a0d6
     909 | +    //                                         \-F-/
     910 | +    //
     911 | +    // During an even pass, the diagram above would correspond to linearization [2,3,0,1], with
     912 | +    // groups [2] and [3,0,1].
     913 | +
     914 | +    std::vector<TxEntry> entries(linearization.size() + 1);
    


    glozow commented at 10:27 AM on August 2, 2024:

    Are we persisting entries between passes just so we don't need to reallocate this vector? I don't see that we need to keep any of the information from a previous pass. To check, added a entries.clear() which didn't seem to break anything for me.


    sipa commented at 11:31 AM on August 2, 2024:

    Indeed, the only reuse is the vector allocation.

  58. in src/cluster_linearize.h:815 in bbcee5a0d6
     810 | + * @param[in,out] linearization  On input, an existing linearization for depgraph. On output, a
     811 | + *                               potentially better linearization for the same graph.
     812 | + *
     813 | + * Postlinearization guarantees:
     814 | + * - The resulting chunks are connected.
     815 | + * - If the input has a tree shape (either all transactions have at most one child, or all
    


    glozow commented at 10:37 AM on August 2, 2024:

    This is a subset of the "there exists a maximum of 1 path from any node to another" definition of a tree, right? This definition seems to be like a... botanical tree?


    sipa commented at 12:10 PM on August 2, 2024:

    According to https://en.wikipedia.org/wiki/Tree_(graph_theory), a transaction graph where each transaction has at most one child would be an arborescence or out-tree, and the opposite an anti-arborescence or in-tree.

  59. glozow commented at 11:14 AM on August 2, 2024: member

    code review ACK bbcee5a0d67

  60. DrahtBot requested review from sdaftuar on Aug 2, 2024
  61. DrahtBot requested review from instagibbs on Aug 2, 2024
  62. instagibbs approved
  63. sdaftuar commented at 6:40 PM on August 2, 2024: member

    ACK bbcee5a0d67db46526ba29a1a4a7c590d303de03

  64. glozow merged this on Aug 5, 2024
  65. glozow closed this on Aug 5, 2024

  66. bitcoin locked this on Aug 5, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-14 21:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me