cluster mempool: control/optimize TxGraph memory usage #33157

pull sipa wants to merge 11 commits into bitcoin:master from sipa:202508_txgraph_memusage changing 5 files +804 −263
  1. sipa commented at 2:21 pm on August 8, 2025: member

    Part of #30289.

    This adds a few optimizations to reduce TxGraph’s memory usage, and makes sure that dynamic memory it uses doesn’t linger after shrinking clusters. Finally, it exposes a function GetMainMemoryUsage() to compute TxGraph’s approximate memory usage.

    It makes the Cluster type abstract, with two instances (SingletonClusterImpl for 1-transaction clusters, and GenericClusterImpl for others).

    On my 64-bit system, I obtain the following numbers:

    • SingletonClusterImpl: 48 bytes, plus 16 bytes malloc overhead in its unique_ptr, plus 8-byte pointer in m_clusters
    • GenericClusterImpl: 104 bytes, plus 16 bytes malloc overhead in its unique_ptr, plus 8-byte pointer in m_clusters, plus 72 bytes malloc overhead inside its vectors and DepGraph, plus 40 bytes per transaction in those.
    • TxGraphImpl::Entry: 72 bytes per transaction
    • TxGraphImpl::ChunkData: 8 bytes, plus 56 bytes in std::set overhead + malloc overhead, all per chunk.
    • TxGraph::Ref: 16 bytes per transaction

    This overall amounts to 200 bytes per cluster, plus 64 bytes per chunk, plus 128 bytes per transaction, but only 224 bytes overall per singleton cluster.

  2. DrahtBot commented at 2:21 pm on August 8, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33157.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK ismaelsadeeq, instagibbs

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28676 (Cluster mempool implementation by sdaftuar)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • “Cannot move or copy (would invalidate Cluster* in Locator and ClusterSet). /” -> “Cannot move or copy (would invalidate Cluster in Locator and ClusterSet).” [Trailing “*/” appears to be an accidental comment terminator and is confusing/incomprehensible in the added comment line.]

    drahtbot_id_5_m

  3. sipa force-pushed on Aug 8, 2025
  4. DrahtBot added the label CI failed on Aug 8, 2025
  5. DrahtBot commented at 2:31 pm on August 8, 2025: contributor

    🚧 At least one of the CI tasks failed. Task macOS-cross, gui, no tests: https://github.com/bitcoin/bitcoin/runs/47684523262 LLM reason (✨ experimental): The CI failure is caused by a compilation error due to ambiguous call to ‘DynamicUsage’ in txgraph.cpp.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  6. DrahtBot removed the label CI failed on Aug 8, 2025
  7. ismaelsadeeq commented at 8:14 pm on September 2, 2025: member
    Concept ACK to have control over memory usage, also not suprised by the numbers because of the added complexity of cluster mempool v master.
  8. sipa commented at 5:43 pm on September 4, 2025: member
    I had missed a few sources of memory usage in the numbers posted in the PR description before. I’ve redone the derivation, and explained it in more detail. Thanks @glozow for trying to make sense of my incorrect numbers.
  9. in src/txgraph.cpp:488 in 15277c05a1 outdated
    484@@ -484,12 +485,13 @@ class TxGraphImpl final : public TxGraph
    485     /** Swap the Entry referred to by a and the one referred to by b. */
    486     void SwapIndexes(GraphIndex a, GraphIndex b) noexcept;
    487     /** If idx exists in the specified level ClusterSet (explicitly, or in the level below and not
    488-    *   removed), return the Cluster it is in. Otherwise, return nullptr. */
    489-    Cluster* FindCluster(GraphIndex idx, int level) const noexcept;
    490+    *   removed), return the Cluster it is in and the level it has. Otherwise, return
    


    instagibbs commented at 7:09 pm on September 4, 2025:
    0    *   removed), return the Cluster it is in and the level the Cluster has. Otherwise, return
    

    sipa commented at 8:35 pm on September 5, 2025:
    Done.
  10. in src/txgraph.cpp:1006 in c638faec3a outdated
    1000@@ -1001,8 +1001,9 @@ bool Cluster::Split(TxGraphImpl& graph, int level) noexcept
    1001     while (todo.Any()) {
    1002         auto component = m_depgraph.FindConnectedComponent(todo);
    1003         auto split_quality = component.Count() <= 2 ? QualityLevel::OPTIMAL : new_quality;
    1004-        if (first && component == todo) {
    1005-            // The existing Cluster is an entire component. Leave it be, but update its quality.
    1006+        if (first && component == todo && SetType::Fill(component.Count()) == component) {
    1007+            // The existing Cluster is an entire component, without holes. Leave it be, but update
    1008+            // its quality.
    


    instagibbs commented at 7:50 pm on September 4, 2025:
    probably should be explicit in comments that this is a memory optimization, letting hole-having clusters be reconstructed without holes

    sipa commented at 8:35 pm on September 5, 2025:
    Added a comment to explain that.
  11. in src/test/fuzz/cluster_linearize.cpp:510 in 614eefb7df outdated
    506@@ -507,6 +507,11 @@ FUZZ_TARGET(clusterlin_depgraph_sim)
    507             }
    508             continue;
    509         }
    510+        if (command <= 3) {
    


    instagibbs commented at 7:56 pm on September 4, 2025:
    this in unconditionally called I think

    sipa commented at 8:35 pm on September 5, 2025:
    I’ve rewritten this using the “loop of comand– approach”.
  12. in src/test/fuzz/cluster_linearize.cpp:457 in 614eefb7df outdated
    452@@ -453,8 +453,8 @@ FUZZ_TARGET(clusterlin_depgraph_sim)
    453     };
    454 
    455     LIMITED_WHILE(provider.remaining_bytes() > 0, 1000) {
    456-        uint8_t command = provider.ConsumeIntegral<uint8_t>();
    457-        if (num_tx_sim == 0 || ((command % 3) <= 0 && num_tx_sim < TestBitSet::Size())) {
    458+        uint8_t command = provider.ConsumeIntegral<uint8_t>() % 4;
    459+        if ((command <= 2 && num_tx_sim == 0) || (command <= 0 && num_tx_sim < TestBitSet::Size())) {
    


    instagibbs commented at 8:03 pm on September 4, 2025:

    not sure I track why the command values are being mapped to specific series of actions. Writing it out in case it ends up helpful:

    0: a. AddTransaction if txns below set limit b. AddDependency if any txns exist c. RemoveTransactions if any txns exist d. Compact 1: a. AddTransaction if no transaction exists b. AddDependency (transaction will exist) c. RemoveTransactions (transaction will exist) d. Compact

    2: a. AddTransaction if no transaction exists b. RemoveTransactions (transaction will exist) c. Compact

    3. a. Compact


    sipa commented at 8:35 pm on September 5, 2025:
    I have rewritten this.
  13. instagibbs commented at 8:20 pm on September 4, 2025: member
    concept ACK, I refuse to do math
  14. sipa force-pushed on Sep 5, 2025
  15. sipa force-pushed on Sep 5, 2025
  16. sipa commented at 8:39 pm on September 5, 2025: member

    I have implemented the “abstract Cluster” approach anyway, as special-casing singletons outside of the Cluster logic was too cumbersome.

    Old numbers:

    • 384 bytes per singleton cluster
    • 512 or 576 bytes per pair cluster

    New numbers:

    • 224 bytes per singleton cluster
    • 520 or 584 bytes per pair cluster
  17. sipa force-pushed on Sep 5, 2025
  18. sipa commented at 1:02 pm on September 6, 2025: member

    And with the addition of SingletonClusterImpl, overall mempool usage (determined by loading a mempool.dat from a long-running node into both (1) master and (2) #28676 + this PR), the mempool memusage/vsize ratio seems to have dropped ~4.5%, meaning cluster mempool will effectively increase mempool size by that much for a constant -maxmempool setting.

    Master (6.09 membyte/vbyte):

    0  "size": 58610,
    1  "bytes": 28502050,
    2  "usage": 173585008,
    3  "total_fee": 0.05257629,
    4  "maxmempool": 300000000,
    

    Cluster mempool + this PR (5.83 membyte/vbyte):

    0  "size": 58348,
    1  "bytes": 28461800,
    2  "usage": 165896648,
    3  "total_fee": 0.05210373,
    4  "maxmempool": 300000000,
    
  19. sipa force-pushed on Sep 7, 2025
  20. sipa force-pushed on Sep 8, 2025
  21. sipa force-pushed on Sep 9, 2025
  22. DrahtBot added the label Needs rebase on Sep 11, 2025
  23. sipa force-pushed on Sep 11, 2025
  24. sipa commented at 5:07 pm on September 11, 2025: member
    Rebased after #33354 merge.
  25. DrahtBot removed the label Needs rebase on Sep 11, 2025
  26. in src/txgraph.cpp:111 in 22f5fe833f outdated
    112@@ -113,8 +113,6 @@ class Cluster
    113     QualityLevel m_quality{QualityLevel::NONE};
    


    ismaelsadeeq commented at 12:31 pm on September 16, 2025:

    In “txgraph: Make level of Cluster implicit (optimization)” 22f5fe833fcf3da6a4180d181ff68d6da34f40db

    Should also add coverage for GetLevel

     0diff --git a/src/txgraph.cpp b/src/txgraph.cpp
     1index ef7b6568107..0e7e5d22947 100644
     2--- a/src/txgraph.cpp
     3+++ b/src/txgraph.cpp
     4@@ -2421,6 +2421,8 @@ void TxGraphImpl::SanityCheck() const
     5                 if (cluster.GetTxCount() != 0) {
     6                     actual_clusters.insert(&cluster);
     7                 }
     8+
     9+                assert(cluster.GetLevel(*this) == level);
    10                 // Sanity check the cluster, according to the Cluster's internal rules.
    11                 cluster.SanityCheck(*this, level);
    12                 // Check that the cluster's quality and setindex matches its position in the quality list
    

    sipa commented at 2:57 pm on September 17, 2025:
    Done. I’ve also refactored these asserts a bit, moving some of them from *ClusterImpl::SanityCheck to TxGraphImpl::SanityCheck, so duplication across the two implementations can be avoided.
  27. in src/txgraph.cpp:146 in 7974ed311c outdated
    141+               // Dynamic memory usage inside m_depgraph.
    142+               m_depgraph.DynamicMemoryUsage() +
    143+               // Memory usage of the allocated Cluster itself.
    144+               memusage::MallocUsage(sizeof(Cluster)) +
    145+               // Memory usage of the ClusterSet::m_clusters entry.
    146+               sizeof(std::unique_ptr<Cluster>);
    


    ismaelsadeeq commented at 1:22 pm on September 16, 2025:

    In “txgraph: keep track of Cluster memory usage (preparation)” 7974ed311c1843d548d98e9eae8fed71a42a1026

    Should also count the sequence number of the cluster?

    0               // Memory usage of the cluster m_sequence.
    1               + sizeof(uint64_t);
    

    sipa commented at 11:42 am on September 17, 2025:
    No, that’s included in sizeof(Cluster) above.
  28. in src/txgraph.cpp:168 in 22f5fe833f outdated
    164+    int GetLevel(const TxGraphImpl& graph) const noexcept;
    165     /** Only called by Graph::SwapIndexes. */
    166     void UpdateMapping(DepGraphIndex cluster_idx, GraphIndex graph_idx) noexcept { m_mapping[cluster_idx] = graph_idx; }
    167     /** Push changes to Cluster and its linearization to the TxGraphImpl Entry objects. */
    168-    void Updated(TxGraphImpl& graph) noexcept;
    169+    void Updated(TxGraphImpl& graph, int level) noexcept;
    


    ismaelsadeeq commented at 1:58 pm on September 16, 2025:

    In “txgraph: Make level of Cluster implicit (optimization)” https://github.com/bitcoin/bitcoin/commit/22f5fe833fcf3da6a4180d181ff68d6da34f40db

    Instead of using an int for level here, should we relax the Level enum from a scoped enum to an unscoped one to allow implicit conversion and unified input? This would also apply in cases where we want to pass a specific enum value, e.g. in e.g CopyToStaging


    sipa commented at 11:45 am on September 17, 2025:
    I’d rather keep the external selector (Level level) and internal identifier (int level) separate, as it may allow adding more external ones later if need be (e.g. a Level::SINGLE which requires there to be only the main level and asserts otherwise, or a Level::STAGING which requires a staging level to exist). Also, if we’d add a third level (which may be needed to implement testmempoolaccept with package RBF) there may be even more level selectors to add.

    ismaelsadeeq commented at 6:51 pm on September 17, 2025:
    Thanks for the context. For anyone that wants to dig in, the testmempoolaccept third level was discussed here https://github.com/bitcoin/bitcoin/issues/32160
  29. txgraph: Make level of Cluster implicit (optimization)
    This reduces per-Cluster memory usage by making Clusters not aware of their
    own level. Instead, track it either in calling code, or infer it based on
    the transactions in them.
    fdcb5f17bd
  30. txgraph: move some sanity checks from Cluster to TxGraphImpl (refactor) 4281438bc3
  31. txgraph: avoid holes in DepGraph positions (mem optimization) b6cf300096
  32. depgraph: add memory usage control (feature) f8761d4dcd
  33. txgraph: keep data structures compact (mem optimization) 64e5c6979c
  34. txgraph: keep track of Cluster memory usage (preparation) 9c84cc34b9
  35. txgraph: expose memory usage estimate function (feature) a940e46bb0
  36. txgraph: make Cluster an abstract class (refactor) 752ce6b1ae
  37. txgraph: add Cluster functions to avoid downcasts in Merge/Split (refactor)
    This adds 4 functions to Cluster to help implement Merge() and Split() without
    explicit downcasts using static_cast (removing the assumption that the Clusters
    being merged/split are also GenericClusterImpls).
    89b7997e74
  38. txgraph: abstract out creation of empty Clusters (refactor) a2899bcd93
  39. txgraph: add SingletonClusterImpl (mem optimization)
    This adds a specialized Cluster implementation for singleton clusters, saving
    a significant amount of memory by avoiding the need for m_depgraph, m_mapping,
    and m_linearization, and their overheads.
    66401148dd
  40. sipa force-pushed on Sep 17, 2025
  41. in src/txgraph.cpp:1246 in fdcb5f17bd outdated
    1245     // Copy the Cluster from main to staging, if it's not already there.
    1246     if (level == 0) {
    1247         // Make the Cluster Acceptable before copying. This isn't strictly necessary, but doing it
    1248         // now avoids doing double work later.
    1249-        MakeAcceptable(*cluster);
    1250+        MakeAcceptable(*cluster, 0);
    


    instagibbs commented at 1:38 pm on September 18, 2025:

    fdcb5f17bd100eb2c1765c13bbb8367d03fc7ea5

    micro-nit for readability

    0        MakeAcceptable(*cluster, level);
    
  42. ismaelsadeeq commented at 1:54 pm on September 18, 2025: member

    Code Review ACK 66401148ddcba04c27751e39e53fc41e6deb5d14

    I shutdown a long-running node at height 915249.

    0.rw------- 128M ismaelsadeeq 18 Sep 14:46 mempool.dat
    

    Restarted the node on master 1444ed855f4 with:

    0build/bin/bitcoind -connect=0 -debug
    

    After loading the mempool, getmempoolinfo returned:

    0{
    1  "loaded": true,
    2  "size": 104944,
    3  "bytes": 43254588,
    4  "usage": 260130384,
    5  "total_fee": 0.10523099,
    6  "maxmempool": 300000000
    7}
    

    I restarted the node on full cluster mempool implementation 57f929d with the same config. getmempoolinfo returned:

    0{
    1  "loaded": true,
    2  "size": 104944,
    3  "bytes": 43254588,
    4  "usage": 244614928,
    5  "total_fee": 0.10523099,
    6  "maxmempool": 300000000
    7}
    

    The mempool size and fee data are identical across runs, but usage differ with cluster mempool having slightly less usage at 6%.

  43. DrahtBot requested review from instagibbs on Sep 18, 2025
  44. in src/txgraph.cpp:2290 in fdcb5f17bd outdated
    2287@@ -2275,7 +2288,7 @@ void Cluster::SanityCheck(const TxGraphImpl& graph, int level) const
    2288     // The number of transactions in a Cluster cannot exceed m_max_cluster_count.
    2289     assert(m_linearization.size() <= graph.m_max_cluster_count);
    2290     // The level must match the level the Cluster occurs in.
    


    instagibbs commented at 2:01 pm on September 18, 2025:
    0    // The level must match the level the Cluster occurs in (call only valid with non-empty cluster).
    
  45. in src/txgraph.cpp:2981 in a940e46bb0 outdated
    2977@@ -2976,6 +2978,22 @@ std::vector<TxGraph::Ref*> TxGraphImpl::Trim() noexcept
    2978     return ret;
    2979 }
    2980 
    2981+size_t TxGraphImpl::GetMainMemoryUsage() noexcept
    


    instagibbs commented at 2:32 pm on September 18, 2025:

    a940e46bb06cf40f65772516d67f78cbd87c6b52

    noting we aren’t tracking BlockBuilderImpl memory usage which only has growth in m_excluded_clusters

    (N.B. I don’t think we need ordered set for m_excluded_clusters)

  46. in src/txgraph.cpp:794 in 89b7997e74 outdated
    790@@ -779,6 +791,30 @@ uint64_t GenericClusterImpl::GetTotalTxSize() const noexcept
    791     return ret;
    792 }
    793 
    794+DepGraphIndex GenericClusterImpl::AppendTransaction(GraphIndex graph_idx, FeePerWeight feerate) noexcept
    


    instagibbs commented at 4:28 pm on September 18, 2025:

    89b7997e7414074da3b0a35a17d86a2dfe24f378

    For these functions(and the later singleton ones) can we Assume(graph_idx >= 0)

  47. in src/txgraph.cpp:1996 in 66401148dd
    1992@@ -1745,18 +1993,33 @@ void TxGraphImpl::Merge(std::span<Cluster*> to_merge, int in_level) noexcept
    1993     // moves.
    1994     size_t max_size_pos{0};
    1995     DepGraphIndex max_size = to_merge[max_size_pos]->GetTxCount();
    1996+    DepGraphIndex tot_size = max_size;
    


    instagibbs commented at 4:40 pm on September 18, 2025:

    66401148ddcba04c27751e39e53fc41e6deb5d14

    micro-nit: couldn’t figure out what tot meant for a second… “to t…. size?”

    0    DepGraphIndex total_size = max_size;
    
  48. instagibbs approved
  49. instagibbs commented at 5:00 pm on September 18, 2025: member

    ACK 66401148ddcba04c27751e39e53fc41e6deb5d14

    Did not validate the numbers, non-blocking comments only


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-09-18 18:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me