cluster mempool: add TxGraph reorg functionality #31553

pull sipa wants to merge 6 commits into bitcoin:master from sipa:202412_txgraph_trim changing 3 files +644 −60
  1. sipa commented at 7:06 pm on December 22, 2024: member

    Part of cluster mempool (#30289).

    During reorganisations, it is possible that dependencies get added which would result in clusters that violate policy limits (cluster count, cluster weight), when linking the new from-block transactions to the old from-mempool transactions. Unlike RBF scenarios, we cannot simply reject the changes when they are due to received blocks. To accommodate this, add a TxGraph::Trim(), which removes some subset of transactions (including descendants) in order to make all resulting clusters satisfy the limits.

    Conceptually, the way this is done is by defining a rudimentary linearization for the entire would-be too-large cluster, iterating it from beginning to end, and reasoning about the counts and weights of the clusters that would be reached using transactions up to that point. If a transaction is encountered whose addition would violate the limit, it is removed, together with all its descendants.

    This rudimentary linearization is like a merge sort of the chunks of the clusters being combined, but respecting topology. More specifically, it is continuously picking the highest-chunk-feerate remaining transaction among those which have no unmet dependencies left. For efficiency, this rudimentary linearization is computed lazily, by putting all viable transactions in a heap, sorted by chunk feerate, and adding new transactions to it as they become viable.

    The Trim() function is rather unusual compared to the TxGraph functionality added in previous PRs, in that Trim() makes it own decisions about what the resulting graph contents will be, without good specification of how it makes that decision - it is just a best-effort attempt (which is improved in the last commit). All other TxGraph mutators are simply to inform the graph about changes the calling mempool code decided on; this one lets the decision be made by txgraph.

    As part of this, the “oversized” property is expanded to also encompass a configurable cluster weight limit (in addition to cluster count limit).

  2. DrahtBot commented at 7:06 pm on December 22, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31553.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK instagibbs, ismaelsadeeq

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. glozow added the label Mempool on Jan 2, 2025
  4. sipa force-pushed on Jan 8, 2025
  5. DrahtBot added the label CI failed on Jan 9, 2025
  6. DrahtBot commented at 0:52 am on January 9, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/35343429418

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  7. sipa force-pushed on Jan 9, 2025
  8. DrahtBot removed the label CI failed on Jan 9, 2025
  9. sipa force-pushed on Jan 9, 2025
  10. in src/txgraph.cpp:1708 in 0c8dc2323e outdated
    948+    Ref ret;
    949+    // Construct a new Entry, and link it with the Ref.
    950+    auto idx = m_entries.size();
    951+    m_entries.emplace_back();
    952+    auto& entry = m_entries.back();
    953+    entry.m_ref = &ret;
    


    theuni commented at 6:43 pm on January 9, 2025:
    I’m not sure how this is intended to be used, but storing a stack address seems like a problem? RVO may help but that seems brittle. I imagine the caller should be passing in their own Ref instead?

    sipa commented at 8:19 pm on January 9, 2025:

    I believe it is safe, both with NRVO and without.

    With NRVO, ret is constructed directly in the caller’s target destination, so this isn’t a pointer to local stack space.

    Without NRVO, the Ref(Ref&&) move constructor is invoked by return ret;, which will update the pointer to the caller’s destination.


    theuni commented at 10:15 pm on January 9, 2025:
    Ah, right, I missed that the move ctor would handle the update. Thanks for explaining.
  11. in src/txgraph.cpp:1172 in 0c8dc2323e outdated
    1167+    }
    1168+}
    1169+
    1170+TxGraph::Ref& TxGraph::Ref::operator=(Ref&& other) noexcept
    1171+{
    1172+    // Inform both TxGraphs about the Refs being swapped.
    


    theuni commented at 6:48 pm on January 9, 2025:
    Why is this doing an effective swap? I would expect this to call UnlinkRef on the moved-from value and reset its m_graph and m_index. Otherwise it wouldn’t be unlinked until the moved-from variable goes out of scope, no?

    sipa commented at 8:38 pm on January 9, 2025:

    Why is this doing an effective swap?

    I think this is quite common, that move-construction is effectively performing a swap.

    I would expect this to call UnlinkRef on the moved-from value and reset its m_graph and m_index

    That’s possible too, and slightly more efficient I guess.

    Otherwise it wouldn’t be unlinked until the moved-from variable goes out of scope, no?

    Indeed. I don’t think that’s a problem.


    sipa commented at 9:34 pm on January 9, 2025:
    Anyway, done!

    theuni commented at 9:56 pm on January 9, 2025:

    Why is this doing an effective swap?

    I think this is quite common, that move-construction is effectively performing a swap.

    I would expect this to call UnlinkRef on the moved-from value and reset its m_graph and m_index

    That’s possible too, and slightly more efficient I guess.

    Otherwise it wouldn’t be unlinked until the moved-from variable goes out of scope, no?

    Indeed. I don’t think that’s a problem.

    Afaik the move/swap idiom is only safe if the swapped-to value’s dtor doesn’t have any interesting ordering requirements or side-effects.

    As a contrived example, a user may do something like:

    0std::vector<TxGraph::Ref> vec;
    1
    2vec.push_back(txgraph->AddTransaction(fee));
    3auto ref = txgraph->AddTransaction(fee2);
    4...
    5ref = std::move(vec.back());
    

    The vector now holds the old ref and UnlinkRef will not be called until that element is removed. I realize it’s allowed to be a “valid but unspecified state”, but I wouldn’t expect a ref to be hanging around.

  12. in src/txgraph.cpp:188 in 0c8dc2323e outdated
    183+    /** A class of objects held internally in TxGraphImpl, with information about a single
    184+     *  transaction. */
    185+    struct Entry
    186+    {
    187+        /** Pointer to the corresponding Ref object, if any. */
    188+        Ref* m_ref;
    


    theuni commented at 6:51 pm on January 9, 2025:
    m_ref{nullptr};

    sipa commented at 9:34 pm on January 9, 2025:
    Done.
  13. in src/txgraph.cpp:297 in 0c8dc2323e outdated
    292+    LinearizationIndex lin_idx{0};
    293+    // Iterate over the chunks.
    294+    for (unsigned chunk_idx = 0; chunk_idx < chunking.NumChunksLeft(); ++chunk_idx) {
    295+        auto chunk = chunking.GetChunk(chunk_idx);
    296+        // Iterate over the transactions in the linearization, which must match those in chunk.
    297+        while (true) {
    


    theuni commented at 7:02 pm on January 9, 2025:

    Trying to convince myself this is guaranteed to terminate…

    do{} while (!chunk.transactions.None()) rather than the break for readability? Or just while() if we need to guard against an empty linearization (presumably not?)


    sipa commented at 9:39 pm on January 9, 2025:

    It terminates because:

    • Every chunk contains at least one element (added an Assume for that)
    • In the inner loop, one element from that chunk is Reset() (added an Assume that it indeed resets a bit that was previously set).

    I’ve changed it to a do {} while(chunk.transactions.Any()); loop in the first commits, though it reverts back to a while (true) { ... } loop later, when the loop becomes a bit more complex.

  14. in src/txgraph.cpp:836 in 0c8dc2323e outdated
    306+}
    307+
    308+void Cluster::ApplyRemovals(TxGraphImpl& graph, std::span<GraphIndex>& to_remove) noexcept
    309+{
    310+    // Iterate over the prefix of to_remove that applies to this cluster.
    311+    SetType todo;
    


    theuni commented at 7:17 pm on January 9, 2025:
    Assume !to_remove.empty() or early return if it’s allowed?

    sipa commented at 9:40 pm on January 9, 2025:
    Done. I’ve also added a comment to the Cluster::ApplyRemovals() function definition stating that at least one element from the front of to_remove must belong to this Cluster (which is really why that requirement exists).
  15. in src/txgraph.cpp:1662 in 0c8dc2323e outdated
    1002+    // Make sure the transaction isn't scheduled for removal.
    1003+    ApplyRemovals();
    1004+    return m_entries[GetRefIndex(arg)].m_locator.IsPresent();
    1005+}
    1006+
    1007+std::vector<TxGraph::Ref*> Cluster::GetAncestorRefs(const TxGraphImpl& graph, ClusterIndex idx) noexcept
    


    theuni commented at 7:42 pm on January 9, 2025:
    Looks like these 3 functions could reserve() for their ret vectors.

    sipa commented at 9:40 pm on January 9, 2025:
    Done. The third one disappears in a later commit, though.
  16. theuni commented at 8:03 pm on January 9, 2025: member
    Very quick and shallow pass through the initial impl commit. This PR is a lot to get through :)
  17. sipa force-pushed on Jan 9, 2025
  18. DrahtBot added the label CI failed on Jan 9, 2025
  19. DrahtBot commented at 11:22 pm on January 9, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/35398283653

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  20. sipa force-pushed on Jan 10, 2025
  21. DrahtBot removed the label CI failed on Jan 10, 2025
  22. sipa force-pushed on Jan 10, 2025
  23. sipa force-pushed on Jan 16, 2025
  24. sipa force-pushed on Jan 22, 2025
  25. in src/txgraph.cpp:1548 in 6cb99b067c outdated
    1464@@ -1449,6 +1465,7 @@ Cluster::Cluster(TxGraphImpl& graph, const FeeFrac& feerate, GraphIndex graph_in
    1465 TxGraph::Ref TxGraphImpl::AddTransaction(const FeeFrac& feerate) noexcept
    1466 {
    1467     Assume(m_chunkindex_observers == 0 || m_clustersets.size() > 1);
    1468+    Assume(feerate.size > 0 && uint64_t(feerate.size) <= m_max_cluster_size);
    


    sdaftuar commented at 2:10 pm on January 24, 2025:
    FYI – in my rebase of #28676, I’m seeing tx_pool fuzz test failures due to this line. Not clear to me whether we should require the caller to enforce the policy requirement that a single tx be below the cluster size limit, or just let the caller discover a changeset is oversized and then reject?

    sipa commented at 2:34 pm on January 24, 2025:

    Right. That rule exists because the alternative requires existing clusters to be oversized as AddTransaction constructs a singleton cluster instantly. All other forms of oversizedness happen as a result of applying dependencies, which are done lazily.

    I’ll think about relaxing this.


    sipa commented at 8:45 pm on January 26, 2025:
    Done, it is now allowed to have individually oversized transactions.
  26. sipa force-pushed on Jan 24, 2025
  27. sipa commented at 10:14 pm on January 24, 2025: member

    Some changes:

    • As a result of dropping Cleanup in the base PR, Trim now reports which transactions it removed, as it becomes the caller’s responsibility of destroying Refs.
  28. DrahtBot added the label CI failed on Jan 24, 2025
  29. DrahtBot commented at 11:23 pm on January 24, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/36148923463

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  30. sipa force-pushed on Jan 26, 2025
  31. sipa commented at 4:42 am on January 26, 2025: member
    • Add support for calling AddTransaction with a feerate whose size already violates the cluster size limit.
  32. DrahtBot removed the label CI failed on Jan 26, 2025
  33. sipa force-pushed on Jan 26, 2025
  34. sipa force-pushed on Jan 30, 2025
  35. DrahtBot commented at 1:01 am on January 31, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/36451226624

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  36. DrahtBot added the label CI failed on Jan 31, 2025
  37. sipa force-pushed on Jan 31, 2025
  38. DrahtBot removed the label CI failed on Jan 31, 2025
  39. sipa force-pushed on Jan 31, 2025
  40. sipa force-pushed on Feb 1, 2025
  41. sipa force-pushed on Feb 4, 2025
  42. DrahtBot added the label CI failed on Feb 4, 2025
  43. DrahtBot removed the label CI failed on Feb 4, 2025
  44. sipa force-pushed on Feb 6, 2025
  45. sipa force-pushed on Feb 11, 2025
  46. sipa force-pushed on Feb 12, 2025
  47. DrahtBot commented at 11:49 pm on February 12, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/37128455883

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  48. DrahtBot added the label CI failed on Feb 12, 2025
  49. sipa force-pushed on Feb 13, 2025
  50. DrahtBot removed the label CI failed on Feb 13, 2025
  51. sipa force-pushed on Feb 14, 2025
  52. sipa force-pushed on Feb 20, 2025
  53. sipa force-pushed on Feb 21, 2025
  54. in src/cluster_linearize.h:1381 in e43f6ca3b8 outdated
    1364+        // Figure out which elements need to be moved before elem.
    1365+        SetType place_before = done & depgraph.Ancestors(elem);
    1366+        // Find which position to place elem in (updating j), continuously moving the elements
    1367+        // in between forward.
    1368+        while (place_before.Any()) {
    1369+            // j cannot be 0 here; if it was, then there was necessarily nothing earlier which
    


    yancyribbens commented at 10:20 pm on February 24, 2025:
    0            // j cannot be 0 or less; if it was, then there was necessarily nothing earlier which
    

    yancyribbens commented at 10:23 pm on February 24, 2025:
    0if it was, then there was necessarily nothing earlier which elem needs to be place before anymore, and place_before would be empty.
    

    This comment seems jumbled and hard to understand. Is it possible to word this better?

  55. in src/txgraph.cpp:678 in e43f6ca3b8 outdated
    615+
    616+void TxGraphImpl::CreateChunkData(GraphIndex idx, LinearizationIndex chunk_count) noexcept
    617+{
    618+    auto& entry = m_entries[idx];
    619+    if (!m_chunkindex_discarded.empty()) {
    620+        // Reuse an discarded node handle.
    


    yancyribbens commented at 10:48 pm on February 24, 2025:
    0        // Reuse a discarded node handle.
    
  56. sipa force-pushed on Mar 3, 2025
  57. sipa force-pushed on Mar 19, 2025
  58. sipa force-pushed on Mar 19, 2025
  59. DrahtBot added the label CI failed on Mar 19, 2025
  60. DrahtBot commented at 9:06 pm on March 19, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/39067440822

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  61. sipa force-pushed on Mar 19, 2025
  62. DrahtBot removed the label CI failed on Mar 20, 2025
  63. DrahtBot added the label Needs rebase on Mar 20, 2025
  64. sipa force-pushed on Mar 20, 2025
  65. DrahtBot added the label CI failed on Mar 20, 2025
  66. DrahtBot commented at 1:42 pm on March 20, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/39111492257

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  67. sipa force-pushed on Mar 20, 2025
  68. DrahtBot removed the label Needs rebase on Mar 20, 2025
  69. DrahtBot removed the label CI failed on Mar 20, 2025
  70. sipa force-pushed on Mar 22, 2025
  71. DrahtBot added the label CI failed on Mar 22, 2025
  72. DrahtBot commented at 1:23 am on March 22, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/39212785621

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  73. sipa force-pushed on Mar 22, 2025
  74. DrahtBot removed the label CI failed on Mar 22, 2025
  75. sipa force-pushed on Mar 23, 2025
  76. sipa force-pushed on Mar 24, 2025
  77. glozow referenced this in commit f1d129d963 on Mar 26, 2025
  78. sipa force-pushed on Mar 27, 2025
  79. sipa commented at 3:49 am on March 27, 2025: member
    Rebased after merge of #31363, and on top of #32151.
  80. sipa force-pushed on Mar 27, 2025
  81. sipa force-pushed on Mar 27, 2025
  82. sipa force-pushed on Mar 28, 2025
  83. sipa force-pushed on Mar 28, 2025
  84. sipa force-pushed on Mar 28, 2025
  85. sipa force-pushed on Apr 4, 2025
  86. sipa force-pushed on Apr 4, 2025
  87. sipa force-pushed on Apr 7, 2025
  88. sipa force-pushed on Apr 7, 2025
  89. DrahtBot added the label Needs rebase on Apr 17, 2025
  90. sipa force-pushed on Apr 17, 2025
  91. DrahtBot removed the label Needs rebase on Apr 17, 2025
  92. sipa force-pushed on Apr 22, 2025
  93. sipa force-pushed on Apr 30, 2025
  94. sipa force-pushed on May 12, 2025
  95. sipa force-pushed on May 13, 2025
  96. sipa commented at 8:33 pm on May 13, 2025: member
    Now ready for review, with #31444 merged.
  97. in src/test/fuzz/txgraph.cpp:272 in 462429bc15 outdated
    267@@ -262,12 +268,16 @@ FUZZ_TARGET(txgraph)
    268 
    269     // Decide the maximum number of transactions per cluster we will use in this simulation.
    270     auto max_count = provider.ConsumeIntegralInRange<DepGraphIndex>(1, MAX_CLUSTER_COUNT_LIMIT);
    271+    // And the maximum combined size of transactions per cluster.
    272+    auto max_size = provider.ConsumeIntegralInRange<uint64_t>(1, 0x3fffff * MAX_CLUSTER_COUNT_LIMIT);
    


    instagibbs commented at 2:50 pm on May 14, 2025:

    462429bc159b1b5ddb5443c484c9bdeae985ad26

    nitty: max_agg_size


    sipa commented at 7:15 pm on May 21, 2025:
    I have changed the variable names to max_cluster_count and max_cluster_size everywhere.
  98. in src/test/fuzz/txgraph.cpp:409 in 462429bc15 outdated
    406                 } else {
    407                     // Otherwise, use smaller range which consume fewer fuzz input bytes, as just
    408                     // these are likely sufficient to trigger all interesting code paths already.
    409                     fee = provider.ConsumeIntegral<uint8_t>();
    410-                    size = provider.ConsumeIntegral<uint8_t>() + 1;
    411+                    size = provider.ConsumeIntegralInRange<uint32_t>(1, std::min<uint32_t>(0xff, max_tx_size));
    


    instagibbs commented at 2:56 pm on May 14, 2025:

    462429bc159b1b5ddb5443c484c9bdeae985ad26

    misinterpreted the commit message that the aggregate size should not be hit when adding transaction, rather than each individual txn shouldn’t exceed this limit. Maybe make this crystal clear.


    sipa commented at 7:16 pm on May 21, 2025:
    Do you mean in the TxGraph::AddTransaction definition in txgraph.h? I have expanded the comments in the fuzz test around this a bit.

    sipa commented at 9:25 pm on May 21, 2025:
    Does :+1: mean “That is what I meant” or “It’s fixed now”?

    instagibbs commented at 4:59 pm on May 23, 2025:
    comments appear clear enough now
  99. in src/txgraph.cpp:113 in 462429bc15 outdated
    108@@ -109,6 +109,8 @@ class Cluster
    109     }
    110     /** Get the number of transactions in this Cluster. */
    111     LinearizationIndex GetTxCount() const noexcept { return m_linearization.size(); }
    112+    /** Get the total size of the transactions in this Cluster. */
    113+    uint64_t GetTxSize() const noexcept;
    


    instagibbs commented at 3:02 pm on May 14, 2025:

    462429bc159b1b5ddb5443c484c9bdeae985ad26

    nitty: GetTotalTxSize? GetAggTxSize? GetTxsSize?


    sipa commented at 7:16 pm on May 21, 2025:
    GetTotalTxSize is a nice color.
  100. in src/txgraph.cpp:1473 in 462429bc15 outdated
    1468@@ -1453,7 +1469,7 @@ void TxGraphImpl::GroupClusters(int level) noexcept
    1469             ++new_entry.m_deps_count;
    1470         }
    1471         // Detect oversizedness.
    1472-        if (total_count > m_max_cluster_count) {
    1473+        if (total_count > m_max_cluster_count || total_size > m_max_cluster_size) {
    1474             clusterset.m_group_data->m_group_oversized = true;
    


    instagibbs commented at 3:04 pm on May 14, 2025:

    462429bc159b1b5ddb5443c484c9bdeae985ad26

    nit: docstring for m_group_oversized should be updated


    sipa commented at 7:17 pm on May 21, 2025:
    Done, but also removed in a later commit.
  101. in src/txgraph.h:67 in 462429bc15 outdated
    62@@ -63,10 +63,10 @@ class TxGraph
    63     /** Virtual destructor, so inheriting is safe. */
    64     virtual ~TxGraph() = default;
    65     /** Construct a new transaction with the specified feerate, and return a Ref to it.
    66-     *  If a staging graph exists, the new transaction is only created there. In all
    67-     *  further calls, only Refs created by AddTransaction() are allowed to be passed to this
    68-     *  TxGraph object (or empty Ref objects). Ref objects may outlive the TxGraph they were
    69-     *  created for. */
    70+     *  If a staging graph exists, the new transaction is only created there. feerate.size cannot
    71+     *  exceed the graph's max cluster size. In all further calls, only Refs created by
    


    instagibbs commented at 3:09 pm on May 14, 2025:
    and can’t also be 0? Didn’t double check this was already the case, but an assertion is added in 462429bc159b1b5ddb5443c484c9bdeae985ad26

    sipa commented at 7:18 pm on May 21, 2025:
    Added as comment.
  102. in src/txgraph.cpp:2145 in 462429bc15 outdated
    2140@@ -2124,6 +2141,8 @@ void Cluster::SanityCheck(const TxGraphImpl& graph, int level) const
    2141     assert(m_linearization.size() <= graph.m_max_cluster_count);
    2142     // The level must match the level the Cluster occurs in.
    2143     assert(m_level == level);
    2144+    // The sum of their sizes cannot exceed m_max_cluster_size.
    2145+    assert(GetTxSize() <= graph.m_max_cluster_size);
    


    instagibbs commented at 3:14 pm on May 14, 2025:

    462429bc159b1b5ddb5443c484c9bdeae985ad26

    Was a bit confusing at first, but this must be true because clusters are never merged (via dependency application) when to-be-merged-clusters would be oversized.


    sipa commented at 7:19 pm on May 21, 2025:
    I’ve added a comment to highlight that.
  103. in src/test/fuzz/txgraph.cpp:147 in 439ccf713b outdated
    133@@ -134,6 +134,8 @@ struct SimTxGraph
    134         simmap[simpos] = std::make_shared<TxGraph::Ref>();
    135         auto ptr = simmap[simpos].get();
    136         simrevmap[ptr] = simpos;
    137+        // This may invalidate our cached oversized value.
    


    instagibbs commented at 3:24 pm on May 14, 2025:

    439ccf713b8f9de1dff1d02678bb68d0f29ea175

    commit message maybe could get beefed up a bit with motivation since at first blush it seems odd to add singletons we wouldn’t have relayed in the first place


    sipa commented at 7:15 pm on May 21, 2025:
    I have expanded the commit message of both Trim commits.
  104. in src/txgraph.cpp:40 in 439ccf713b outdated
    34@@ -35,6 +35,9 @@ using ClusterSetIndex = uint32_t;
    35 /** Quality levels for cached cluster linearizations. */
    36 enum class QualityLevel
    37 {
    38+    /** This is a singleton cluster consisting of a transaction that individually exceeds the
    39+     *  cluster size limit. It cannot be merged with anything. */
    40+    OVERSIZED,
    


    instagibbs commented at 3:25 pm on May 14, 2025:

    439ccf713b8f9de1dff1d02678bb68d0f29ea175

    nit: SINGLETON_OVERSIZED to reduce later mental gymnastics


    sipa commented at 7:19 pm on May 21, 2025:
    Good idea, I went with OVERSIZED_SINGLETON.
  105. in src/txgraph.cpp:500 in 439ccf713b outdated
    455@@ -447,7 +456,7 @@ class TxGraphImpl final : public TxGraph
    456     ClusterSet& GetClusterSet(int level) noexcept;
    457     const ClusterSet& GetClusterSet(int level) const noexcept;
    458     /** Make a transaction not exist at a specified level. It must currently exist there. */
    459-    void ClearLocator(int level, GraphIndex index) noexcept;
    460+    void ClearLocator(int level, GraphIndex index, bool oversized_tx) noexcept;
    


    instagibbs commented at 3:27 pm on May 14, 2025:

    439ccf713b8f9de1dff1d02678bb68d0f29ea175

    please add documentation on what oversized_tx should be


    sipa commented at 7:19 pm on May 21, 2025:
    Done.
  106. in src/txgraph.cpp:830 in 439ccf713b outdated
    826@@ -815,9 +827,14 @@ void Cluster::ApplyRemovals(TxGraphImpl& graph, std::span<GraphIndex>& to_remove
    827         todo.Reset(m_linearization.back());
    828         m_linearization.pop_back();
    829     }
    830-    if (todo.None()) {
    831+    if (m_linearization.empty()) {
    


    instagibbs commented at 3:43 pm on May 14, 2025:

    439ccf713b8f9de1dff1d02678bb68d0f29ea175

    is this a future optimization? these types of clusters were already being marked as NEED_SPLIT_*


    sipa commented at 7:20 pm on May 21, 2025:
    I have simplified this to just a m_linearization.empty() || in the if conditional below.
  107. in src/txgraph.cpp:861 in 439ccf713b outdated
    854@@ -838,7 +855,10 @@ void Cluster::ApplyRemovals(TxGraphImpl& graph, std::span<GraphIndex>& to_remove
    855 void Cluster::Clear(TxGraphImpl& graph) noexcept
    856 {
    857     for (auto i : m_linearization) {
    858-        graph.ClearLocator(m_level, m_mapping[i]);
    859+        // We do not care about setting oversized_tx accurately here, because this function is only
    860+        // applied to main-graph Clusters in CommitStaging, which will overwrite main's
    861+        // m_txcount_oversized anyway with the staging graph's value.
    862+        graph.ClearLocator(m_level, m_mapping[i], /*oversized_tx=*/false);
    


    instagibbs commented at 3:48 pm on May 14, 2025:
    0        Assume(m_level == 0);
    1        graph.ClearLocator(m_level, m_mapping[i], /*oversized_tx=*/false);
    

    sipa commented at 7:21 pm on May 21, 2025:
    Instead of this big unintuitive comment, I have replaced it with /*oversized_tx=*/m_quality == QualityLevel::OVERSIZED_SINGLETON. With that, I don’t think the Assume is neccesary?

    instagibbs commented at 2:33 pm on May 27, 2025:
    much better thanks
  108. in src/txgraph.cpp:2680 in acb2b6a90a outdated
    2686+        // Replace arg1 and arg2 by their representatives.
    2687+        auto rep1 = find_fn(arg1);
    2688+        auto rep2 = find_fn(arg2);
    2689+        // Bail out if both representatives are the same, because that means arg1 and arg2 are in
    2690+        // the same partition already.
    2691+        if (rep1 == rep2) return rep1;
    


    instagibbs commented at 9:15 pm on May 14, 2025:

    acb2b6a90a47567773268a024bcf2f1ddfcc60df

    FYI this condition doesn’t appear to be hit in the fuzz harness


    sipa commented at 7:21 pm on May 21, 2025:
    Both assert(rep1 == rep2) and assert(rep1 != rep2) instantly trigger failures when fuzzing for me.

    instagibbs commented at 5:00 pm on May 23, 2025:
    curious. Coverage report may be off.
  109. in src/txgraph.cpp:858 in 439ccf713b outdated
    854@@ -838,7 +855,10 @@ void Cluster::ApplyRemovals(TxGraphImpl& graph, std::span<GraphIndex>& to_remove
    855 void Cluster::Clear(TxGraphImpl& graph) noexcept
    856 {
    857     for (auto i : m_linearization) {
    858-        graph.ClearLocator(m_level, m_mapping[i]);
    859+        // We do not care about setting oversized_tx accurately here, because this function is only
    


    instagibbs commented at 2:23 pm on May 19, 2025:

    439ccf713b8f9de1dff1d02678bb68d0f29ea175

    Should this function have clearer naming/documentation to avoid future regressions in use? Seems brittle to me.


    sipa commented at 7:22 pm on May 21, 2025:
    See above, I have dropped the constraints on ClearLocator instead.

    instagibbs commented at 2:33 pm on May 27, 2025:
    better thanks
  110. in src/txgraph.cpp:260 in 439ccf713b outdated
    253@@ -248,8 +254,11 @@ class TxGraphImpl final : public TxGraph
    254         /** Total number of transactions in this graph (sum of all transaction counts in all
    255          *  Clusters, and for staging also those inherited from the main ClusterSet). */
    256         GraphIndex m_txcount{0};
    257+        /** Total number of individually oversized transactions in the graph. */
    258+        GraphIndex m_txcount_oversized{0};
    259         /** Whether this graph is oversized (if known). This roughly matches
    260-         *  m_group_data->m_group_oversized, but may be known even if m_group_data is not. */
    261+         *  m_group_data->m_group_oversized || (m_txcount_oversized > 0), but may be known even if
    


    instagibbs commented at 2:40 pm on May 19, 2025:

    439ccf713b8f9de1dff1d02678bb68d0f29ea175

    Do we need m_group_oversized? Seems like a variable we have to keep in sync for no reason:

     0diff --git a/src/txgraph.cpp b/src/txgraph.cpp
     1index 01076b763b..853f8530a0 100644
     2--- a/src/txgraph.cpp
     3+++ b/src/txgraph.cpp
     4@@ -231,7 +231,4 @@ private:
     5         /** Which clusters are to be merged. GroupEntry::m_cluster_offset indexes into this. */
     6         std::vector<Cluster*> m_group_clusters;
     7-        /** Whether at least one of the groups cannot be applied because it would result in a
     8-         *  Cluster that violates the cluster count limit. */
     9-        bool m_group_oversized;
    10     };
    11 
    12@@ -257,7 +254,5 @@ private:
    13         /** Total number of individually oversized transactions in the graph. */
    14         GraphIndex m_txcount_oversized{0};
    15-        /** Whether this graph is oversized (if known). This roughly matches
    16-         *  m_group_data->m_group_oversized || (m_txcount_oversized > 0), but may be known even if
    17-         *  m_group_data is not. */
    18+        /** Whether this graph is oversized (if known). */
    19         std::optional<bool> m_oversized{false};
    20 
    21@@ -1470,7 +1465,7 @@ void TxGraphImpl::GroupClusters(int level) noexcept
    22     clusterset.m_group_data = GroupData{};
    23     clusterset.m_group_data->m_group_clusters.reserve(an_clusters.size());
    24-    clusterset.m_group_data->m_group_oversized = false;
    25     clusterset.m_deps_to_add.clear();
    26     clusterset.m_deps_to_add.reserve(an_deps.size());
    27+    clusterset.m_oversized = false;
    28     auto an_deps_it = an_deps.begin();
    29     auto an_clusters_it = an_clusters.begin();
    30@@ -1502,10 +1497,9 @@ void TxGraphImpl::GroupClusters(int level) noexcept
    31         // Detect oversizedness.
    32         if (total_count > m_max_cluster_count || total_size > m_max_cluster_size) {
    33-            clusterset.m_group_data->m_group_oversized = true;
    34+            clusterset.m_oversized = true;
    35         }
    36     }
    37     Assume(an_deps_it == an_deps.end());
    38     Assume(an_clusters_it == an_clusters.end());
    39-    clusterset.m_oversized = clusterset.m_group_data->m_group_oversized;
    40     Compact();
    41 }
    42@@ -2371,11 +2365,4 @@ void TxGraphImpl::SanityCheck() const
    43         if (!clusterset.m_removed.empty()) compact_possible = false;
    44 
    45-        // If m_group_data exists, and no outstanding removals remain, m_group_oversized must match
    46-        // m_group_oversized || (m_txcount_oversized > 0).
    47-        if (clusterset.m_group_data.has_value() && clusterset.m_to_remove.empty()) {
    48-            assert(clusterset.m_oversized ==
    49-                   (clusterset.m_group_data->m_group_oversized || (clusterset.m_txcount_oversized > 0)));
    50-        }
    51-
    52         // For non-top levels, m_oversized must be known (as it cannot change until the level
    53         // on top is gone).
    

    sipa commented at 7:23 pm on May 21, 2025:
    Nice catch, after the changes in “Permit transactions that exceed cluster size limit”, m_group_oversized is never read anymore. Included this patch as a separate simplification commit.
  111. in src/txgraph.cpp:70 in acb2b6a90a outdated
    61@@ -62,12 +62,29 @@ struct TrimTxData
    62     TxGraph::GraphIndex m_index;
    63     /** Number of unmet dependencies this transaction has. -1 if the transaction is included. */
    64     uint32_t m_deps_left;
    65+    /** Number of dependencies that apply to this transaction as child. */
    


    instagibbs commented at 3:09 pm on May 19, 2025:

    acb2b6a90a47567773268a024bcf2f1ddfcc60df

    Some of this commit message should probably go in ce825731e8bea294e71b339015eccdb087e760c7 ? First paragraph at least.


    sipa commented at 7:25 pm on May 21, 2025:
    I have expanded the commit messages.
  112. in src/test/fuzz/txgraph.cpp:828 in ce825731e8 outdated
    787@@ -788,6 +788,27 @@ FUZZ_TARGET(txgraph)
    788                     assert(sum == worst_chunk_feerate);
    789                 }
    790                 break;
    791+            } else if ((block_builders.empty() || sims.size() > 1) && command-- == 0) {
    


    instagibbs commented at 8:23 pm on May 19, 2025:

    Here are a couple suggestions to tighten up the checks on what we’re trimming. Attempted to state that:

    1. non-oversized clusters remain untouched
    2. transactions with no ancestors that are undersized should never be trimmed (this seems natural but the interface doesn’t guarantee it)

    Mostly trying to avoid degenerate behavior like entire mempool being wiped, or any oversized cluster being completely wiped needlessly.

     0diff --git a/src/test/fuzz/txgraph.cpp b/src/test/fuzz/txgraph.cpp
     1index 47ad5792bf..bdf759f485 100644
     2--- a/src/test/fuzz/txgraph.cpp
     3+++ b/src/test/fuzz/txgraph.cpp
     4@@ -91,4 +91,28 @@ struct SimTxGraph
     5     }
     6 
     7+    /** Returns true if all positions in the given set are in oversized clusters, false otherwise. */
     8+    bool AllInOversizedClusters(SetType set)
     9+    {
    10+        if (set.Any() && !IsOversized()) return false;
    11+
    12+        auto todo = graph.Positions();
    13+        if (!set.IsSubsetOf(todo)) return false;
    14+
    15+        // Walk all clusters, and make sure all of `set` doesn't come from non-oversized clusters
    16+        while (todo.Any()) {
    17+            auto component = graph.FindConnectedComponent(todo);
    18+            bool is_oversized{component.Count() > max_cluster_count};
    19+            uint64_t component_size{0};
    20+            for (auto i : component) component_size += graph.FeeRate(i).size;
    21+            is_oversized |= component_size > max_cluster_size;
    22+            // Some element existed in a non-oversized cluster
    23+            if (!is_oversized && set.Overlaps(component)) {
    24+                return false;
    25+            }
    26+            todo -= component;
    27+        }
    28+        return true;
    29+    }
    30+
    31     void MakeModified(DepGraphIndex index)
    32     {
    33@@ -798,8 +822,16 @@ FUZZ_TARGET(txgraph)
    34                 }
    35                 auto removed_set = top_sim.MakeSet(removed);
    36-                // The removed set must contain all its own descendants.
    37                 for (auto simpos : removed_set) {
    38+                    // The removed set must contain all its own descendants.
    39                     assert(top_sim.graph.Descendants(simpos).IsSubsetOf(removed_set));
    40+
    41+                    // Nothing removed should have no ancestors and be undersized itself
    42+                    assert(!(top_sim.GetAncDesc(top_sim.GetRef(simpos), /*desc=*/false).Count() == 0 &&
    43+                        (uint64_t) top_sim.graph.FeeRate(simpos).size <= top_sim.max_cluster_size));
    44                 }
    45+
    46+                // Nothing from not-oversized-clusters should have been removed
    47+                assert(top_sim.AllInOversizedClusters(removed_set));
    48+
    49                 // Apply all removals to the simulation, and verify the result is no longer
    50                 // oversized. Don't query the real graph for oversizedness; it is compared
    

    sipa commented at 7:24 pm on May 21, 2025:

    Taken, except for:

    0+                  assert(!(top_sim.GetAncDesc(top_sim.GetRef(simpos), /*desc=*/false).Count() == 0 &&
    1+                        (uint64_t) top_sim.graph.FeeRate(simpos).size <= top_sim.max_cluster_size));
    

    because Count() == 0 will never be true (ancestors always includes the transaction itself), and the test fails when I change it to Count() == 1.


    instagibbs commented at 8:34 pm on May 21, 2025:

    Ah, I see it now. Both variants of the Trim() algorithm can definitely remove these types of transactions.

    f.e. if cluster limit is 2 and you have: TxA <- TxB <- TxC TxA <- TxE TxD <- TxE

    You may select TxA+TxB only (after stopping at TxC), evicting TxC+TxD+TxE, if TxD’s cfr is low.

    If TxD’s cfr is higher than TxB or TxC, then you could retain A+B+D

    After a bunch of debugging, I was able to figure out a stronger check, if we constrain the scenario from a non-oversized to oversized transition with pending dependencies applied: https://github.com/instagibbs/bitcoin/tree/2025-05-applytrim


    sipa commented at 3:38 am on May 28, 2025:
    Added a commit with a test like this, with some changes. LMK what you think.
  113. instagibbs commented at 8:26 pm on May 19, 2025: member

    acb2b6a90a47567773268a024bcf2f1ddfcc60df

    focused on motivation/structure/approach and test coverage for first pass. Haven’t validated the actual trimming logic yet.

  114. in src/txgraph.cpp:919 in ce825731e8 outdated
    907@@ -886,6 +908,37 @@ void Cluster::AppendChunkFeerates(std::vector<FeeFrac>& ret) const noexcept
    908     ret.insert(ret.end(), chunk_feerates.begin(), chunk_feerates.end());
    909 }
    910 
    911+uint64_t Cluster::AppendTrimData(std::vector<TrimTxData>& ret, std::vector<std::pair<GraphIndex, GraphIndex>>& deps) const noexcept
    912+{
    913+    LinearizationChunking linchunking(m_depgraph, m_linearization);
    


    instagibbs commented at 2:20 pm on May 20, 2025:
    0    const LinearizationChunking linchunking(m_depgraph, m_linearization);
    

    sipa commented at 3:33 am on May 28, 2025:
    Done.
  115. in src/txgraph.cpp:189 in ce825731e8 outdated
    167@@ -151,6 +168,10 @@ class Cluster
    168     void Relinearize(TxGraphImpl& graph, uint64_t max_iters) noexcept;
    169     /** For every chunk in the cluster, append its FeeFrac to ret. */
    170     void AppendChunkFeerates(std::vector<FeeFrac>& ret) const noexcept;
    171+    /** Add a TrimTxData entry for every transaction in the Cluster to ret. Implicit dependencies
    


    instagibbs commented at 2:51 pm on May 20, 2025:
    nit: 3 out of the N fields are being set here, could we be specific which ones for clarity?

    sipa commented at 3:38 am on May 28, 2025:
    Done. Also organized the fields of struct TrimTxData a bit more logically, into “Populated by Cluster::AppendTxData” and “Only used internally in TxGraphImpl::Trim” sections.
  116. in src/txgraph.cpp:2669 in ce825731e8 outdated
    2664+        // Gather trim data from all involved Clusters.
    2665+        auto cluster_span = std::span{clusterset.m_group_data->m_group_clusters}
    2666+                                .subspan(group_data.m_cluster_offset, group_data.m_cluster_count);
    2667+        uint64_t size{0};
    2668+        for (Cluster* cluster : cluster_span) {
    2669+            size += cluster->AppendTrimData(trim_data, deps);
    


    instagibbs commented at 3:12 pm on May 20, 2025:
    nit: leave a comment that deps is also being added to here, briefly forgot

    sipa commented at 3:37 am on May 28, 2025:
    Done.
  117. in src/txgraph.cpp:2677 in ce825731e8 outdated
    2672+        if (trim_data.size() <= m_max_cluster_count && size <= m_max_cluster_size) continue;
    2673+        // Sort the trim data by GraphIndex. In what follows, we will treat this sorted vector as
    2674+        // a map from GraphIndex to TrimTxData, and its ordering will not change anymore.
    2675+        std::sort(trim_data.begin(), trim_data.end(), [](auto& a, auto& b) noexcept { return a.m_index < b.m_index; });
    2676+
    2677+        // Construct deps, and sort it by child.
    


    instagibbs commented at 3:12 pm on May 20, 2025:
    0        // Append explicit deps, and sort it by child.
    

    sipa commented at 3:37 am on May 28, 2025:
    Done.
  118. in src/txgraph.cpp:2682 in ce825731e8 outdated
    2677+        // Construct deps, and sort it by child.
    2678+        deps.insert(deps.end(),
    2679+                    clusterset.m_deps_to_add.begin() + group_data.m_deps_offset,
    2680+                    clusterset.m_deps_to_add.begin() + group_data.m_deps_offset + group_data.m_deps_count);
    2681+        std::sort(deps.begin(), deps.end(), [](auto& a, auto& b) noexcept { return a.second < b.second; });
    2682+        // Fill m_deps_left in trim_data. Because of the sort above, all
    


    instagibbs commented at 3:17 pm on May 20, 2025:
    0        // Fill m_deps_left in trim_data and initially populate trim_heap. Because of the sort above, all
    

    sipa commented at 3:37 am on May 28, 2025:
    Done.
  119. in src/txgraph.cpp:2700 in ce825731e8 outdated
    2695+                trim_heap.push_back(trim_it);
    2696+            }
    2697+        }
    2698+        Assume(deps_it == deps.end());
    2699+
    2700+        // Sort deps by parent.
    


    instagibbs commented at 3:19 pm on May 20, 2025:
    0        // Sort deps by parent by graph index. The order is unchanged from now on.
    

    sipa commented at 3:36 am on May 28, 2025:
    Done.
  120. in src/txgraph.cpp:2713 in ce825731e8 outdated
    2669+            size += cluster->AppendTrimData(trim_data, deps);
    2670+        }
    2671+        // If this group of Clusters does not violate any limits, continue to the next group.
    2672+        if (trim_data.size() <= m_max_cluster_count && size <= m_max_cluster_size) continue;
    2673+        // Sort the trim data by GraphIndex. In what follows, we will treat this sorted vector as
    2674+        // a map from GraphIndex to TrimTxData, and its ordering will not change anymore.
    


    instagibbs commented at 3:23 pm on May 20, 2025:
    0        // a map from GraphIndex to TrimTxData via locate_fn, and its ordering will not change anymore.
    

    sipa commented at 3:37 am on May 28, 2025:
    Done.
  121. in src/txgraph.cpp:2849 in ce825731e8 outdated
    2767+            }
    2768+        }
    2769+    }
    2770+    clusterset.m_group_data.reset();
    2771+    clusterset.m_oversized = false;
    2772+    return ret;
    


    instagibbs commented at 3:43 pm on May 20, 2025:
    0    Assume(!ret.empty());
    1    return ret;
    

    sipa commented at 3:35 am on May 28, 2025:
    Done.
  122. sipa force-pushed on May 21, 2025
  123. in src/txgraph.cpp:149 in bd95322455 outdated
    103@@ -101,6 +104,9 @@ class Cluster
    104     {
    105         return m_quality == QualityLevel::OPTIMAL;
    106     }
    107+    /** Whether this cluster is oversized (just due to the size of its transaction(s), not due to
    108+     *  dependencies that are yet to be added. */
    109+    bool IsOversized() const noexcept { return m_quality == QualityLevel::OVERSIZED_SINGLETON; }
    


    ismaelsadeeq commented at 9:18 am on May 22, 2025:

    This method’s name aligns with its functionality.

    However with the addition of this commit I think there is ambiguity currently between the size oversize, the count oversize and both. This makes it a bit confusing to follow.

    maybe separate the two.

    oversize should be for size.

    overlimit or something like it for both


    sipa commented at 7:54 pm on May 22, 2025:
    To be clear, Cluster::IsOversized() returns whether the Cluster is oversized in general. It’s just the case that changes to clusters which bring it over the size/count limit are never applied, so the only way an actually materialized Cluster can be oversized is if it’s a singleton transaction which is oversized on its own.

    ismaelsadeeq commented at 8:42 pm on May 22, 2025:

    To be clear, Cluster::IsOversized() returns whether the Cluster is oversized in general.

    huh, I missed thanks for the clarity. I think you can add this comment as well or anything that would make this explicit and clearer will be appreciated.


    sipa commented at 11:50 pm on May 22, 2025:
    Expanded the comment to say that.
  124. in src/txgraph.cpp:834 in bd95322455 outdated
    830@@ -818,7 +831,7 @@ void Cluster::ApplyRemovals(TxGraphImpl& graph, std::span<GraphIndex>& to_remove
    831     if (todo.None()) {
    832         // If no further removals remain, and thus all removals were at the end, we may be able
    833         // to leave the cluster at a better quality level.
    834-        if (IsAcceptable(/*after_split=*/true)) {
    835+        if (m_linearization.empty() || IsAcceptable(/*after_split=*/true)) {
    


    ismaelsadeeq commented at 11:00 am on May 22, 2025:
    if there is nothing in the cluster? why should it be marked as needs_split_acceptable? Also I see no crashes after removing the new or condition.

    sipa commented at 7:57 pm on May 22, 2025:
    You’re right, this doesn’t matter at all. The Cluster will just be deleted by Split anyway. Removed it.
  125. in src/test/fuzz/txgraph.cpp:835 in 412a9e0e69 outdated
    820+                auto removed = real->Trim();
    821+                if (!was_oversized) {
    822+                    assert(removed.empty());
    823+                    break;
    824+                }
    825+                auto removed_set = top_sim.MakeSet(removed);
    


    ismaelsadeeq commented at 12:38 pm on May 22, 2025:

    assert that we trim when oversized ?

    0                assert(!removed.empty());
    1                auto removed_set = top_sim.MakeSet(removed);
    

    sipa commented at 7:51 pm on May 22, 2025:
    Done.
  126. in src/test/fuzz/txgraph.cpp:270 in 412a9e0e69 outdated
    265+            for (auto i : component) component_size += graph.FeeRate(i).size;
    266+            is_oversized |= component_size > max_cluster_size;
    267+            // Some element existed in a non-oversized cluster
    268+            if (!is_oversized && set.Overlaps(component)) {
    269+                return false;
    270+            }
    


    ismaelsadeeq commented at 12:56 pm on May 22, 2025:

    check both ways

    0            }
    1            // No element exist in set and cluster is oversized 
    2            if (is_oversized && !set.Overlaps(component)){
    3               return false;
    4            }
    

    sipa commented at 7:56 pm on May 22, 2025:

    Interesting, done! I have renamed the function to MatchesOversizedClusters and added a comment to clarify what it does.

    I’ve also simplified it to just if (is_oversized != set.Overlaps(component)) return false;.

  127. in src/txgraph.h:176 in 412a9e0e69 outdated
    168@@ -169,6 +169,11 @@ class TxGraph
    169      *  that appear identically in both. Use FeeFrac rather than FeePerWeight so CompareChunks is
    170      *  usable without type-conversion. */
    171     virtual std::pair<std::vector<FeeFrac>, std::vector<FeeFrac>> GetMainStagingDiagrams() noexcept = 0;
    172+    /** Remove transactions (including their own descendants) according to a fast but best-effort
    173+     *  strategy such that the TxGraph's cluster and size limits are respected. Applies to staging
    174+     *  if it exists, and to main otherwise. Returns the list of all removed transactions in
    175+     *  unspecified order. This has no effect unless the relevant graph is oversized. */
    176+    virtual std::vector<Ref*> Trim() noexcept = 0;
    


    ismaelsadeeq commented at 1:30 pm on May 22, 2025:
    Indicate that it should not be called when their is an instance of block builder.

    sipa commented at 7:55 pm on May 22, 2025:
    The GetBlockBuilder function already states “While the returned object exists, no mutators on the main graph are allowed.”. I’m open to putting a comment to that effect on each of the mutators where it matters (incl. Trim), but I feel like it should be done consistently.
  128. ismaelsadeeq commented at 2:30 pm on May 22, 2025: member

    Code review up to 412a9e0e69ccf461554660b7c622b768cfb8ccb3

    Also did some fuzzing and mutate the new code added and tests, craches occur as expected.

    If a transaction is encountered whose addition would violate the limit, it is removed, together with all its descendants.

    In real-world scenarios it is unlikely we encounter those transactions that will exceed the cluster count limit, however when it occur I assume that the newly connected block will have similar transaction with the reorged block. I think maintaining an in-memory data structure for these transactions removed due to the limit seperate from mempool will reduce bandwidth from re-questing the tx again and also avoid non-contextual checks?

    We can get rid of them after the reorg is complete and they arent included in any block.

  129. sipa force-pushed on May 22, 2025
  130. sipa commented at 8:02 pm on May 22, 2025: member

    In real-world scenarios it is unlikely we encounter those transactions that will exceed the cluster count limit, however when it occur I assume that the newly connected block will have similar transaction with the reorged block. I think maintaining an in-memory data structure for these transactions removed due to the limit seperate from mempool will reduce bandwidth from re-questing the tx again and also avoid non-contextual checks?

    In theory, almost the entire mempool could be trimmed away (in a pathological case where the re-added block transactions link all mempool transaction together), so any approach for doing this would need to be able to handle picking just an acceptable-sized subset. It may make sense to just throw them into the compact block reconstruction extrapool though, up to a certain limit.

  131. sipa force-pushed on May 22, 2025
  132. in src/txgraph.h:250 in 0e59c73288 outdated
    239@@ -240,8 +240,9 @@ class TxGraph
    240     };
    241 };
    242 
    243-/** Construct a new TxGraph with the specified limit on transactions within a cluster. That
    244- *  number cannot exceed MAX_CLUSTER_COUNT_LIMIT. */
    245-std::unique_ptr<TxGraph> MakeTxGraph(unsigned max_cluster_count) noexcept;
    246+/** Construct a new TxGraph with the specified limit on the number of transactions within a cluster,
    247+ *  and on the sum of transaction sizes within a cluster. max_cluster_count cannot exceed
    248+ *  MAX_CLUSTER_COUNT_LIMIT. */
    


    instagibbs commented at 5:07 pm on May 23, 2025:
    nano-nit while we’re driving by: max_cluster_count should be > 0

    sipa commented at 3:36 am on May 28, 2025:
    Forgot to take this. Will leave open for now.
  133. ismaelsadeeq commented at 3:18 pm on May 26, 2025: member
    light code review ACK 538e9ff804f8dfba6f6a808e83572fdeab145ab8
  134. in src/txgraph.cpp:2776 in 538e9ff804 outdated
    2761+        Assume(deps_it == deps_by_parent.end());
    2762+
    2763+        // Build a heap of all transactions with 0 unmet dependencies.
    2764+        std::make_heap(trim_heap.begin(), trim_heap.end(), cmp_fn);
    2765+
    2766+        // Iterate over to-be-included transactions. It is possible that the heap empties without
    


    instagibbs commented at 2:39 pm on May 27, 2025:

    https://github.com/instagibbs/bitcoin/tree/2025-05-applytrim

    Worked on a test attempts to cover the claim here. It ended up being more complicated than that (claim a bit wrong?) due to the lazy application of dependencies. For this property to hold, I believe you have to touch the TxGraph in any way that applies dependencies when undersized, then transition to oversized graph for the circular dependencies to not occur in the implied graph.

    Without the dependencies applied, the linearization order can be “nonsensical”, which then creates a contradictory dependency, iow cycle. This cycle ends up dropping sometimes the entire cluster ala

    EhBGLQYGhDOENwYAr6+vr682AAYAAHoAAAAAAEr/EIQGBi0GBoQ0BjYQLUYGBoQzAEAzhAYGhDcG ADIABgAAMwAAMQCEBgY=

    This can be tested by commenting out the CountDistinctClusters invocation in the new test.

    The good news is this usage pattern is “natural”, so in practice each new transaction (package) will have a TxGraph with all dependencies applied except for the new package. It’s just weird and I’m unsure how to do better. Perhaps Trim could pre-emptively apply dependencies + FixLinearization for each grouped cluster.

    In this graph, T2 is the implied parent of T1 thanks to to-be-applied dependencies not being applied yet, causing the trim heap to never get an entry: image


    sipa commented at 3:36 am on May 28, 2025:
    I’ve written a test based on this, which randomly adds dependencies but in a “directed” sense that never introduces cycles (rather than just has 2 specific scenarios).
  135. instagibbs commented at 2:46 pm on May 27, 2025: member

    Reviewed through 538e9ff804f8dfba6f6a808e83572fdeab145ab8

    Last big Q is about the cycles being generated in the implied graph, and what if anything we should do differently about it.

  136. txgraph: Add ability to configure maximum cluster size/weight (feature)
    This is integrated with the oversized property: the graph is oversized when
    any connected component within it contains more than the cluster count limit
    many transactions, or when their combined size/weight exceeds the cluster size
    limit.
    
    It becomes disallowed to call AddTransaction with a size larger than this limit,
    though this limit will be lifted in the next commit.
    
    In addition, SetTransactionFeeRate becomes SetTransactionFee, so that we do not
    need to deal with the case that a call to this function might affect the
    oversizedness.
    416a8d16d0
  137. txgraph: Permit transactions that exceed cluster size limit (feature)
    This removes the restriction added in the previous commit that individual
    transactions do not exceed the max cluster size limit.
    
    With this change, the responsibility for enforcing cluster size limits can
    be localized purely in TxGraph, without callers (and in particular, tests)
    needing to duplicate the enforcement for individual transactions.
    221de30171
  138. txgraph: remove unnecessary m_group_oversized (simplification) f06edc620b
  139. txgraph: Add ability to trim oversized clusters (feature)
    During reorganisations, it is possible that dependencies get add which
    result in clusters that violate limits (count, size), when linking the
    new from-block transactions to the old from-mempool transactions.
    
    Unlike RBF scenarios, we cannot simply reject these policy violations
    when they are due to received blocks. To accomodate this, add a Trim()
    function to TxGraph, which removes transactions (including descendants)
    in order to make all resulting clusters satisfy the limits.
    
    In the initial version of the function added here, the following approach
    is used:
    - Lazily compute a naive linearization for the to-be-merged cluster (using
      an O(n log n) algorithm, optimized for far larger groups of transactions
      than the normal linearization code).
    - Initialize a set of accepted transactions to {}
    - Iterate over the transactions in this cluster one by one:
      - If adding the transaction to the set makes it exceed the max cluster size
        or count limit, stop.
      - Add the transaction to the set.
    - Remove all transactions from the cluster that were not included in the set
      (note that this necessarily includes all descendants too, because they
      appear later in the naive linearization).
    
    Co-authored-by: Greg Sanders <gsanders87@gmail.com>
    1664ec567f
  140. txgraph: Track multiple potential would-be clusters in Trim (improvement)
    In the existing Trim function, as soon as the set of accepted transactions
    would exceed the max cluster size or count limit, the acceptance loop is
    stopped, removing all later transactions. However, it is possible that by
    excluding some of those transactions the would-be cluster splits apart into
    multiple would-clusters. And those clusters may well permit far more
    transactions before their limits are reached.
    
    Take this into account by using a union-find structure inside TrimTxData to
    keep track of the count/size of all would-be clusters that would be formed
    at any point, and only reject transactions which would cause these resulting
    partitions to exceed their limits.
    
    This is not an optimization in terms of CPU usage or memory; it just
    improves the quality of the transactions removed by Trim().
    45e3661d25
  141. sipa force-pushed on May 28, 2025
  142. sipa force-pushed on May 28, 2025
  143. DrahtBot added the label CI failed on May 28, 2025
  144. DrahtBot commented at 3:32 am on May 28, 2025: contributor

    🚧 At least one of the CI tasks failed. Task ARM, unit tests, no functional tests: https://github.com/bitcoin/bitcoin/runs/43009654762 LLM reason (✨ experimental): The failure is caused by a type mismatch in the call to std::max due to incompatible argument types.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  145. DrahtBot removed the label CI failed on May 28, 2025
  146. txgraph: add fuzz test scenario that avoids cycles inside Trim()
    Trim internally builds an approximate dependency graph of the merged cluster,
    replacing all existing dependencies within existing clusters with a simple
    linear chain of dependencies. This helps keep the complexity of the merging
    operation down, but may result in cycles to appear in the general case, even
    though in real scenarios (where Trim is called for stitching re-added mempool
    transactions after a reorg back to the existing mempool transactions) such
    cycles are not possible.
    
    Add a test that specifically targets Trim() but in scenarios where it is
    guaranteed not to have any cycles. It is a special case, is much more a
    whitebox test than a blackbox test, and relies on randomness rather than
    fuzz input. The upside is that somewhat stronger properties can be tested.
    
    Co-authored-by: Greg Sanders <gsanders87@gmail.com>
    e1f501cab2
  147. sipa force-pushed on May 28, 2025
  148. instagibbs commented at 2:13 pm on May 28, 2025: member

    ACK https://github.com/bitcoin/bitcoin/pull/31553/commits/e1f501cab224608004e3a1eee7a48eec3cea25fc

    I appreciate the additional coverage added; I know it’s not particularly blackbox but should catch the most basic regressions going forward.

  149. DrahtBot requested review from ismaelsadeeq on May 28, 2025
  150. ismaelsadeeq commented at 2:05 pm on May 29, 2025: member

    reACK e1f501cab224608004e3a1eee7a48eec3cea25fc deab145ab8..e1f501cab2

    Following the last light ACK, review comments by @instagibbs were addressed:

    • Increased fuzz test coverage
    • Clearer and corrected comments
    • An Assume statement was added

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-06-08 21:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me