cluster mempool: introduce TxGraph #31363

pull sipa wants to merge 25 commits into bitcoin:master from sipa:202411_txgraph changing 11 files +3360 −150
  1. sipa commented at 3:59 pm on November 24, 2024: member

    Part of cluster mempool: #30289.

    1. Overview

    This introduces the TxGraph class, which encapsulates knowledge about the (effective) fees, sizes, and dependencies between all mempool transactions, but nothing else. In particular, it lacks knowledge about CTransaction, inputs, outputs, txids, wtxids, prioritization, validatity, policy rules, and a lot more. Being restricted to just those aspects of the mempool makes the behavior very easy to fully specify (ignoring the actual linearizations produced), and write simulation-based tests for (which are included in this PR).

    2. Interface

    The interface can be largely categorized into:

    • Mutation functions:
      • AddTransaction (add a new transaction with specified feerate, and get a Ref object back to identify it).
      • RemoveTransaction (given a Ref object, remove the transaction).
      • AddDependency (given two Ref objects, add a dependency between them).
      • SetTransactionFee (modify the fee associated with a Ref object).
    • Inspector functions:
      • GetAncestors (get the ancestor set in the form of Ref* pointers)
      • GetAncestorsUnion (like above, but for the union of ancestors of multiple Ref* pointers)
      • GetDescendants (get the descendant set in the form of Ref* pointers)
      • GetDescendantsUnion (like above, but for the union of ancestors of multiple Ref* pointers)
      • GetCluster (get the connected component set in the form of Ref* pointers, in the order they would be mined).
      • GetIndividualFeerate (get the feerate of a transaction)
      • GetChunkFeerate (get the mining score of a transaction)
      • CountDistinctClusters (count the number of distinct clusters a list of Refs belong to)
    • Staging functions:
      • StartStaging (make all future mutations operate on a proposed transaction graph)
      • CommitStaging (apply all the changes that are staged)
      • AbortStaging (discard all the changes that are staged)
    • Miscellaneous functions:
      • DoWork (do queued-up computations now, so that future operations are fast)

    This TxGraph::Ref type used as a “handle” on transactions in the graph can be inherited from, and the idea is that in the full cluster mempool implementation (#28676, after it is rebased on this), CTxMempoolEntry will inherit from it, and all actually used Ref objects will be CTxMempoolEntrys. With that, the mempool code can just cast any Ref* returned by txgraph to CTxMempoolEntry*.

    3. Implementation

    Internally the graph data is kept in clustered form (partitioned into connected components), for which linearizations are maintained and updated as needed using the cluster_linearize.h algorithms under the hood, but this is hidden from the users of this class. Implementation-wise, mutations are generally applied lazily, appending to queues of to-be-removed transactions and to-be-added dependencies, so they can be batched for higher performance. Inspectors will generally only evaluate as much as is needed to answer queries, with roughly 5 levels of processing to go to fully instantiated and acceptable cluster linearizations, in order:

    1. ApplyRemovals (take batches of to-be-removed transactions and translate them to “holes” in the corresponding Clusters/DepGraphs).
    2. SplitAll (creating holes in Clusters may cause them to break apart into smaller connected components, so make turn them into separate Clusters/linearizations).
    3. GroupClusters (figure out which Clusters will need to be combined in order to add requested to-be-added dependencies, as these may span clusters).
    4. ApplyDependencies (actually merge Clusters as precomputed by GroupClusters, and add the dependencies between them).
    5. MakeAcceptable (perform the LIMO linearization algorithm on Clusters to make sure their linearizations are acceptable).

    4. Future work

    This is only an initial version of TxGraph, and some functionality is missing before #28676 can be rebased on top of it:

    • The ability to get comparative feerate diagrams before/after for the set of staged changes (to evaluate RBF incentive-compatibility).
    • Mining interface (ability to iterate transactions quickly in mining score order) (see #31444).
    • Eviction interface (reverse of mining order, plus memory usage accounting) (see #31444).
    • Ability to fix oversizedness of clusters (before or after committing) - this is needed for reorgs where aborting/rejecting the change just is not an option (see #31553).
    • Interface for controlling how much effort is spent on LIMO. In this PR it is hardcoded.

    Then there are further improvements possible which would not block other work:

    • Making Cluster a virtual class with different implementations based on transaction count (which could dramatically reduce memory usage, as most Clusters are just a single transaction, for which the current implementation is overkill).
    • The ability to have background thread(s) for improving cluster linearizations.
  2. DrahtBot commented at 3:59 pm on November 24, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31363.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK instagibbs, ajtowns, ismaelsadeeq, glozow
    Concept ACK jonatack

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30605 (Cluster linearization: separate tests from tests-of-tests by sipa)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. sipa force-pushed on Nov 24, 2024
  4. DrahtBot commented at 4:20 pm on November 24, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/33443336095

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  5. DrahtBot added the label CI failed on Nov 24, 2024
  6. sipa force-pushed on Nov 24, 2024
  7. sipa force-pushed on Nov 24, 2024
  8. DrahtBot removed the label CI failed on Nov 24, 2024
  9. sipa force-pushed on Nov 25, 2024
  10. sipa force-pushed on Nov 25, 2024
  11. sipa force-pushed on Nov 25, 2024
  12. glozow added the label Mempool on Nov 28, 2024
  13. sipa force-pushed on Dec 2, 2024
  14. sipa force-pushed on Dec 4, 2024
  15. jonatack commented at 2:22 pm on December 5, 2024: member
    Concept ACK
  16. in src/txgraph.cpp:258 in 18184f4e73 outdated
    225+        Ref* m_ref;
    226+        /** Which Cluster and position therein this Entry appears in. ([0] = main, [1] = staged). */
    227+        Locator m_locator[MAX_LEVELS];
    228+        /** The chunk feerate of this transaction in main (if present in m_locator[0]) */
    229+        FeeFrac m_main_chunk_feerate;
    230+        /** The position this transaction in the main linearization (if present). /*/
    


    instagibbs commented at 3:25 pm on December 18, 2024:
    0        /** The position this transaction in the main linearization (if present). **/
    

    sipa commented at 10:31 pm on February 12, 2025:
    Done.
  17. sipa force-pushed on Dec 20, 2024
  18. sipa commented at 4:33 pm on December 20, 2024: member

    I’m aware this is a big chunk of code, but perhaps this can aid review a bit.

    The biggest commits is txgraph: (feature) add initial version. To get an idea of what is going on, I think it’s useful to look (in order) at:

    • The src/txgraph.h file added in that commit, as it defines the public interface.
    • The src/test/fuzz/txgraph.cpp test added in the next commit (txgraph: (tests) add simulation fuzz test), as it does to an extent show how the interface is used, and what is expected of it.
    • The data structures (TxGraphImpl and Cluster) in src/txgraph.cpp, and the internal functions corresponding to the 5 processing steps that happen:
      • Batch removal of queued transaction removals: TxGraphImpl::ApplyRemovals and Cluster::ApplyRemovals.
      • Splitting of clusters into components after removing transactions: TxGraphImpl::SplitAll, TxGraphImpl::Split, and Cluster::Split
      • Computing which clusters will need to be combined due to queued dependencies being added between them: TxGraphImpl::GroupClusters.
      • Merging of clusters and applying queued dependencies to them: TxGraphImpl::ApplyDependencies, TxGraphImpl::Merge, Cluster::Merge, Cluster::ApplyDependencies.
      • Making the linearization of clusters acceptable: TxGraphImpl::MakeAcceptable and Cluster::Relinearize.
    • The interface implementation functions that are built on top in TxGraphImpl: AddTransaction, RemoveTransaction, AddDependency, SetTransactionFee, Cleanup, GetAncestors, GetDescendants, GetCluster, GetChunkFeerate, etc.
  19. sipa force-pushed on Dec 22, 2024
  20. sipa commented at 5:23 pm on December 22, 2024: member
    I have pushed a substantial change to the TxGraphImpl::GroupClusters function (in commit “txgraph: (feature) add initial version”). In a benchmark with 64000 transactions being combined into 1000 clusters, this makes it go from ~160ms to ~16ms. I will add the benchmark to this PR when I clean it up.
  21. ismaelsadeeq commented at 7:33 pm on December 24, 2024: member

    Concept ACK

    GetChunkFeerate (get the mining score of a transaction)

    From reading the description and skimming through the code, IIUC TxGraph encapsulates knowledge about fees and sizes but lacks knowledge about prioritization. This implies that the fee it uses is the effective fee, not the modified fee that includes prioritization.

    How are we accounting for the real mining score of a Ref without that prioritization knowledge? The current mining algorithm uses the modified ancestor fee to calculate the mining score.

    Could you clarify how prioritization will be handled post-cluster mempool?

  22. sipa commented at 7:41 pm on December 24, 2024: member

    @ismaelsadeeq TxGraph (and DepGraph, and FeeFrac) just treat “fee” and “size” as numbers whose ratio is to be maximized. These classes don’t care what meaning they correspond to.

    In practice, I expect that the mempool code will provide the modified fee as “fee”, and vsize or weight as “size”.

  23. ismaelsadeeq commented at 8:37 pm on December 24, 2024: member
    Thank you for clarifying. Reading ’lacks knowledge about prioritization’ led me to infer that. But I see now you are talking about mapDeltas
  24. sipa force-pushed on Jan 8, 2025
  25. in src/txgraph.h:124 in 972df15af2 outdated
    134-    virtual bool IsOversized() noexcept = 0;
    135-    /** Get the feerate of the chunk which transaction arg is in. Returns the empty FeeFrac if arg
    136-     *  does not exist. The graph must not be oversized. */
    137-    virtual FeeFrac GetChunkFeerate(const Ref& arg) noexcept = 0;
    138+    virtual bool IsOversized(bool main_only = false) noexcept = 0;
    139+    /** Get the feerate of the chunk which transaction arg is in in the main graph. Returns the
    


    ismaelsadeeq commented at 6:28 pm on January 9, 2025:

    nit when retouching

    In " txgraph: (feature) add staging support" 972df15af28ba6787f096162f776b2973342e674

    0    /** Get the feerate of the chunk which transaction arg is in the main graph. Returns the
    

    sipa commented at 9:49 pm on January 9, 2025:
    Done.
  26. in src/txgraph.h:142 in 0c8dc2323e outdated
    104+    virtual FeeFrac GetChunkFeerate(const Ref& arg) noexcept = 0;
    105+    /** Get the individual transaction feerate of transaction arg. Returns the empty FeeFrac if
    106+     *  arg does not exist. */
    107+    virtual FeeFrac GetIndividualFeerate(const Ref& arg) noexcept = 0;
    108+    /** Get pointers to all transactions in the connected component ("cluster") which arg is in.
    109+     *  The transactions will be returned in a topologically-valid order of acceptable quality.
    


    ismaelsadeeq commented at 6:29 pm on January 9, 2025:

    optional nit when their is need to retouch

    In “txgraph: (feature) add initial version” 0c8dc2323eb1ec34357a807f0860cf0a08a63a75 you mean QualityLevel::ACCEPTABLE right we could just reference the enum like you were doing txgraph.cpp?

    0     *  The transactions will be returned in a topologically-valid order of acceptable quality (QualityLevel::ACCEPTABLE).
    

    sipa commented at 8:07 pm on January 9, 2025:

    I consider the QualityLevel enum to be an implementation detail of the txgraph module that should not leak into its interface.

    However, I guess it would be helpful to specify more clearly what acceptable mean. I’ll think about that; it’s hard to define.


    instagibbs commented at 4:23 pm on February 5, 2025:
    I’m unsure the “Acceptable” word helps much here. It’s just whatever quality it is, at least topologically valid.

    instagibbs commented at 6:59 pm on March 4, 2025:
    Would it go too far to assure the caller that the ordering reflects the underlying stored linearization i.e. reflects the underlying ordering values like chunk feerates are coming from?

    ajtowns commented at 6:02 am on March 14, 2025:
    Doesn’t “topologically-valid order” already cover what “acceptable quality” means here, as far as the interface is concerned?

    ismaelsadeeq commented at 9:53 am on March 20, 2025:
    “acceptable quality” is more than just a topologically valid order. It refers to a linearization that may not be optimal but is the best we can achieve for that cluster within resource limitations.

    sipa commented at 3:37 am on March 22, 2025:
    This has been rewritten.
  27. in src/cluster_linearize.h:1372 in 1c5f1c4601 outdated
    1353+        // Figure out which elements elem needs to be placed before.
    1354+        SetType place_before = done & depgraph.Ancestors(elem);
    1355+        // Find which position to place elem in (updating j), continuously moving the elements
    1356+        // in between forward.
    1357+        while (place_before.Any()) {
    1358+            auto to_swap = linearization[len - 1 - (j - 1)];
    


    ismaelsadeeq commented at 6:35 pm on January 9, 2025:
    In “clusterlin: add FixLinearization function + fuzz test” 1c5f1c4601e0a895258cfe073125361f7a6ea012 Is it safe to get the ClusterIndex when j is 0, we will subtract 1 from uint32_t which will result to a large positive number?

    sipa commented at 8:13 pm on January 9, 2025:

    j cannot be 0. If it were, there is necessarily nothing elem needs to be moved before anymore, and place_before would be empty, and the loop would have terminated.

    Will add an Assume(j > 0); on the next push.


    sipa commented at 2:52 pm on January 23, 2025:
    Done.

    yancyribbens commented at 0:48 am on February 26, 2025:
    The comment // j cannot be 0 here; implies this should be j != 0 instead of j > 0. Since it’s unsigned j != 0 is also correct.
  28. sipa force-pushed on Jan 9, 2025
  29. sipa commented at 9:49 pm on January 9, 2025: member

    A few changes:

    • Added a TxGraph::DoWork() functions which performs queued-up computations now, so that future operations are fast.
    • Make Ref& arguments to TxGraph::AddDependency, TxGraph::RemoveTransaction, and TxGraph::SetTransactionFee const. They do not modify the Ref itself; they only modify what the Ref points to.
    • Address review comments by @ismaelsadeeq here, and @theuni on #31553.
  30. sipa force-pushed on Jan 16, 2025
  31. sipa commented at 9:14 pm on January 16, 2025: member

    Added an additional inspector function:

    • CountDistinctClusters, which efficiently counts how many distinct clusters a list of specified Refs belong to, for helping with enforcing RBF cluster policy limits.
  32. sipa force-pushed on Jan 22, 2025
  33. sipa commented at 7:59 pm on January 22, 2025: member
    I split out two optimizations to TxGraphImpl::GroupClusters() from the initial commit into their own commits.
  34. Pumpkin1234567812 approved
  35. sipa commented at 5:28 pm on January 23, 2025: member

    A suggestion for simplifying the removed/destroyed/unlinked/cleanup interface.

    • Unlinking only happens when the calling code destroys a Ref. Currently, unlinking can happen either through Ref destruction, or through TxGraph::Cleanup() (which unlinks anything not appearing in the graph anymore).
    • The TxGraphImpl::m_entries vector compaction happens transparently, whenever applicable; there is no TxGraph::Cleanup() any more. This can only happen after the corresponding Ref was destroyed (because that’s now the only way to unlink).
    • To assist the calling code with cleaning up Refs, a new TxGraph::GetRemoved() function is added, which is a pure inspector, and reports which Refs do not appear in either main or staging (if present) anymore.
  36. instagibbs commented at 5:35 pm on January 23, 2025: member

    Unlinking only happens when the calling code destroys a Ref. Currently, unlinking can happen either through Ref destruction, or through TxGraph::Cleanup() (which unlinks anything not appearing in the graph anymore).

    this would result in:

    1. exists
    2. removed
    3. unlinked + destroyed

    as the three possible states?

  37. sipa commented at 5:38 pm on January 23, 2025: member

    @instagibbs Right. Though “removed/exists” is really a per-graph (main vs staging) thing, while destroyed+unlinked are TxGraph-wide.

    Maybe it should not be called GetRemoved(), as that term is ambiguous. How about GetUnusedRefs or so?

  38. sipa commented at 9:42 pm on January 23, 2025: member
    Actually, having this GetUnusedRefs() is pretty annoying (read: would require extra per-transaction memory) to be efficiently implementable, and talking to @sdaftuar it does not seem important if Trim() can report what it is deleting (because that’s really the only place where TxGraphImpl makes a decision about which transactions belong to the graph).
  39. sipa force-pushed on Jan 24, 2025
  40. sipa commented at 10:14 pm on January 24, 2025: member

    Some changes:

    • Cleanup is gone. Compaction now happens automatically and transparently, but the caller is responsible for destroying Refs when they are no longer needed.
    • Avoid re-Group-ing when trying to determine oversizedness while it is already known.
  41. DrahtBot commented at 11:23 pm on January 24, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/36148884388

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  42. DrahtBot added the label CI failed on Jan 24, 2025
  43. sipa force-pushed on Jan 26, 2025
  44. sipa commented at 4:41 am on January 26, 2025: member
    • Split out a memory optimization in GroupClusters to its own commit.
    • Moved caching of oversizedness to its own commit.
  45. sipa force-pushed on Jan 26, 2025
  46. sipa commented at 8:00 pm on January 26, 2025: member
    • Remove the TxGraph::Ref::operator bool(); it has little use now that cleanup is implicit.
    • Add assertion that cluster count limit is at least 1.
    • Modify fuzz test to use TxGraph::Ref pointers everywhere rather than references; I realized the existing (test) code was not entirely kosher (could sometimes use a reference to an object that was already destroyed).
  47. DrahtBot removed the label CI failed on Jan 26, 2025
  48. in src/test/fuzz/cluster_linearize.cpp:1148 in 83d61884b4 outdated
    1143+        uint64_t val{0};
    1144+        try {
    1145+            reader >> VARINT(val);
    1146+        } catch (const std::ios_base::failure&) {}
    1147+        val %= todo.Count();
    1148+        // Find which element in todo that corresponds to.
    


    ismaelsadeeq commented at 6:59 pm on January 27, 2025:

    In 83d61884b41b2a2be620fbe35736b433c93d4c76

    I find it a bit difficult to follow this comment

    /** How long the prefix of the constructed linearization is which is topological. */
    

    ??

        // Find which element in todo that corresponds to.
    

    ??


    sipa commented at 10:04 pm on January 31, 2025:
    I have largely rewritten this, and expanded comments. Please have a look and let me know if it is clearer now.

    ismaelsadeeq commented at 10:08 pm on February 10, 2025:
    Yes this is much better now! Thanks
  49. in src/txgraph.cpp:32 in 6741590e2a outdated
    23+class TxGraphImpl;
    24+
    25+/** Position of a ClusterIndex within a Cluster::m_linearization. */
    26+using LinearizationIndex = uint32_t;
    27+/** Position of a Cluster within Graph::m_clusters. */
    28+using ClusterSetIndex = uint32_t;
    


    ismaelsadeeq commented at 7:58 pm on January 27, 2025:

    In “txgraph: (feature) add initial version” 6741590e2a4632c4f9b4655679df9694223beac4

    In cluster_linearize, we have ClusterIndex, which is defined as the “Data type to represent transaction indices in clusters.” (DepGraph implicitly is a cluster without the linearization and the TxGraph context that Cluster class introduced in this commit has.)

    Since we now have a separate Cluster class, would it be clearer to rename ClusterIndex to DepGraphIndex? The type alias names and class names are overlapping


    sipa commented at 10:04 pm on January 31, 2025:
    Done.
  50. in src/txgraph.cpp:34 in 6741590e2a outdated
    25+/** Position of a ClusterIndex within a Cluster::m_linearization. */
    26+using LinearizationIndex = uint32_t;
    27+/** Position of a Cluster within Graph::m_clusters. */
    28+using ClusterSetIndex = uint32_t;
    29+
    30+/** Quality levels for cached linearizations. */
    


    ismaelsadeeq commented at 8:04 pm on January 27, 2025:

    In “txgraph: (feature) add initial version” 6741590e2a4632c4f9b4655679df9694223beac4

    0/** Quality levels for cached cluster linearizations. */
    

    sipa commented at 10:02 pm on January 31, 2025:
    Done.
  51. in src/txgraph.h:15 in 15da6fdd19 outdated
    11@@ -12,8 +12,7 @@
    12 #ifndef BITCOIN_TXGRAPH_H
    13 #define BITCOIN_TXGRAPH_H
    14 
    15-/** No connected component within TxGraph is allowed to exceed this number of transactions. */
    16-static constexpr unsigned CLUSTER_COUNT_LIMIT{64};
    17+static constexpr unsigned MAX_CLUSTER_COUNT_LIMIT{64};
    


    ismaelsadeeq commented at 9:21 pm on January 27, 2025:

    In “txgraph: (feature) make max cluster count configurable and “oversize” state” 15da6fdd19813a08bf9571147f97e22b5390ecf9

    Maybe comment why 64 selected and how having limit above this will hurt the node/miner? fwiw we have seen clusters with size more than 64 in the past https://bitcoincore.reviews/25038, but I think it’s not a regular occurrence.


    sipa commented at 10:04 pm on January 31, 2025:
    This code is unused, the current value just affects testing, but it can literally be set to any positive integer, and the code will work (though at degraded performance for higher values). The real question is what to set the max_cluster_count value to in MakeTxGraph(), but I think that discussion belongs in #28676.
  52. sipa force-pushed on Jan 30, 2025
  53. sipa commented at 11:47 pm on January 30, 2025: member
    • Introduce vsize and weight tagged versions of FeeFrac, to avoid accidental type confusion.
    • Convert all of txgraph to work using FeePerWeight, rather than bare FeeFrac.
  54. DrahtBot added the label CI failed on Jan 31, 2025
  55. sipa force-pushed on Jan 31, 2025
  56. DrahtBot removed the label CI failed on Jan 31, 2025
  57. sipa force-pushed on Jan 31, 2025
  58. sipa force-pushed on Feb 1, 2025
  59. sipa force-pushed on Feb 4, 2025
  60. DrahtBot added the label CI failed on Feb 4, 2025
  61. in src/txgraph.cpp:315 in 68435c7a4f outdated
    308+        auto& entry = m_entries[idx];
    309+        Assume(entry.m_ref != nullptr);
    310+        entry.m_ref = &new_location;
    311+    }
    312+
    313+    /** Only called by Ref::~Ref to unlink Refs. */
    


    ismaelsadeeq commented at 9:27 pm on February 4, 2025:

    In “txgraph: (feature) add initial version” c89d147209c91bb0464321f5bc733a4eeab0dea0

    It is also called by the move assignment operator of Ref


    sipa commented at 8:33 pm on February 11, 2025:
    Fixed.
  62. DrahtBot removed the label CI failed on Feb 4, 2025
  63. in src/txgraph.cpp:547 in c89d147209 outdated
    542+
    543+void TxGraphImpl::ApplyRemovals() noexcept
    544+{
    545+    auto& to_remove = m_to_remove;
    546+    // Skip if there is nothing to remove.
    547+    if (to_remove.empty()) return;
    


    ismaelsadeeq commented at 3:12 pm on February 5, 2025:

    In “txgraph: (feature) add initial version” c89d147209c91bb0464321f5bc733a4eeab0dea0

    We can check whether m_to_remove is empty first so we don’t have to copy the reference when it is, the copied reference is also only used in this if statement, so we can just use the m_to_remove value


    sipa commented at 8:35 pm on February 11, 2025:
    Creating a reference doesn’t cost anything (in simple cases); it’s just introducing a new name for an existing object. It’s true that it’s kind of pointless at this stage, but it’ll be used in “group per-graph data in ClusterSet”, and in “add staging support”.
  64. in src/cluster_linearize.h:1355 in 1fb067066e outdated
    1335@@ -1336,6 +1336,38 @@ std::vector<ClusterIndex> MergeLinearizations(const DepGraph<SetType>& depgraph,
    1336     return ret;
    1337 }
    1338 
    1339+/** Make linearization topological, retaining its ordering where possible. */
    1340+template<typename SetType>
    1341+void FixLinearization(const DepGraph<SetType>& depgraph, Span<ClusterIndex> linearization) noexcept
    1342+{
    1343+    // This algorithm can be summarized as moving every element in the linearization backwards
    1344+    // until it is placed after all this ancestors.
    


    instagibbs commented at 3:20 pm on February 5, 2025:
    0    // until it is placed after all its ancestors.
    

    sipa commented at 10:31 pm on February 12, 2025:
    Done.
  65. in src/cluster_linearize.h:1364 in 1fb067066e outdated
    1348+    for (ClusterIndex i = 0; i < len; ++i) {
    1349+        /** The element at that position. */
    1350+        ClusterIndex elem = linearization[len - 1 - i];
    1351+        /** j represents how far from the back of the linearization elem should be placed. */
    1352+        ClusterIndex j = i;
    1353+        // Figure out which elements elem needs to be placed before.
    


    instagibbs commented at 3:29 pm on February 5, 2025:
    0        // Figure out which elements elem needs to be placed after.
    

    we are putting the other elements “before”


    sipa commented at 10:31 pm on February 12, 2025:
    Done.
  66. in src/cluster_linearize.h:1371 in 1fb067066e outdated
    1355+        // Find which position to place elem in (updating j), continuously moving the elements
    1356+        // in between forward.
    1357+        while (place_before.Any()) {
    1358+            // j cannot be 0 here; if it was, then there was necessarily nothing earlier which
    1359+            // elem needs to be place before anymore, and place_before would be empty.
    1360+            Assume(j > 0);
    


    instagibbs commented at 3:40 pm on February 5, 2025:
    in release builds this will result in j-- wrapping around to max value of uint32_t, causing an OOB access? We should probably just stop trying to fix things if it happens?

    sipa commented at 7:09 pm on February 12, 2025:
    :man_shrugging: I hope the code is clear enough that reviewers are convinced this is impossible. The Assume hopefully helps with that convincing. I hope an assertion isn’t needed to convince you even more.
  67. in src/test/fuzz/cluster_linearize.cpp:410 in a921ae75c2 outdated
    406@@ -407,7 +407,7 @@ FUZZ_TARGET(clusterlin_depgraph_serialization)
    407     SanityCheck(depgraph);
    408 
    409     // Verify the graph is a DAG.
    410-    assert(IsAcyclic(depgraph));
    


    instagibbs commented at 3:56 pm on February 5, 2025:

    a921ae75c2f7162b3b617ddeccb84e3e60728cf8

    Should we add a test that completes a cycle then asserts it’s not acyclic to add coverage that way?


    sipa commented at 10:32 pm on February 12, 2025:
    Done.
  68. in src/CMakeLists.txt:304 in c89d147209 outdated
    279@@ -280,6 +280,7 @@ add_library(bitcoin_node STATIC EXCLUDE_FROM_ALL
    280   signet.cpp
    281   torcontrol.cpp
    282   txdb.cpp
    283+  txgraph.cpp
    


    instagibbs commented at 4:02 pm on February 5, 2025:
    c89d147209c91bb0464321f5bc733a4eeab0dea0 in commit message: “… and so for.” and so on?

    sipa commented at 10:32 pm on February 12, 2025:
    Done.

    ajtowns commented at 4:54 am on March 14, 2025:
    (“and so forth” alternatively)
  69. in src/txgraph.h:45 in c89d147209 outdated
    40+        GraphIndex m_index = GraphIndex(-1);
    41+    public:
    42+        /** Construct an empty Ref. Non-empty Refs can only be created using
    43+         *  TxGraph::AddTransaction. */
    44+        Ref() noexcept = default;
    45+        /** Destroy this Ref. This is only allowed when it is empty, or the transaction it refers
    


    instagibbs commented at 4:07 pm on February 5, 2025:

    This is only allowed when it is empty

    Question: how is this enforced?


    sipa commented at 6:59 pm on February 12, 2025:
    You are eaten by a grue (specifically, by His Unpredictableness, the Grue of Undefined Behavior).

    ajtowns commented at 5:20 am on March 14, 2025:
    Why does it say earlier “Destroying this TxGraph::Ref removes the corresponding transaction” but here “[destroying this ref] is only allowed when .. the transaction it refers to has been removed from the graph” ? I think the destructor “unlinks” but does not “remove” the tx, so failure to remove before destructing will result in an Assume(!entry.m_locator.IsPresent()) failure in Compact().

    sipa commented at 9:09 pm on March 19, 2025:
    @ajtowns That comment was put in the wrong commit. I’ve fixed it.
  70. in src/txgraph.h:85 in c89d147209 outdated
    79+     * its ancestors, are removed as well (which is what always happens in realistic scenarios),
    80+     * this reordering will not affect the behavior of TxGraph.
    81+     *
    82+     * As an example, imagine 3 transactions A,B,C where B depends on A. If a dependency of C on B
    83+     * is added, and then B is deleted, C will still depend on A. If the deletion of B is reordered
    84+     * before the C->B dependency is added, it has no effect instead. If, together with the
    


    instagibbs commented at 4:14 pm on February 5, 2025:
    comment clarity: “it has no effect” meaning the dependency addition?

    sipa commented at 10:32 pm on February 12, 2025:
    I’ve reworded this a bit.
  71. in src/txgraph.cpp:159 in c89d147209 outdated
    154+    /** Which dependencies are to be added ((parent,child) pairs). GroupData::m_deps_offset indexes
    155+     *  into this. */
    156+    std::vector<std::pair<GraphIndex, GraphIndex>> m_deps_to_add;
    157+    /** Information about the merges to be performed, if known. */
    158+    std::optional<std::vector<GroupEntry>> m_group_data = std::vector<GroupEntry>{};
    159+    /** Total number of transactions in this ClusterSet (explicit + implicit). */
    


    instagibbs commented at 4:59 pm on February 5, 2025:

    c89d147209c91bb0464321f5bc733a4eeab0dea0

    what does implicit mean here?


    sipa commented at 10:33 pm on February 12, 2025:
    It’s something that really only has meaning after staging support is added. I have dropped it here, expanded the comment a bit, and expanded on it a lot more in the staging commit.
  72. in src/txgraph.cpp:252 in c89d147209 outdated
    177+
    178+    /** A class of objects held internally in TxGraphImpl, with information about a single
    179+     *  transaction. */
    180+    struct Entry
    181+    {
    182+        /** Pointer to the corresponding Ref object, if any. */
    


    instagibbs commented at 5:12 pm on February 5, 2025:

    c89d147209c91bb0464321f5bc733a4eeab0dea0

    could mention nullptr -> “unlinked”


    sipa commented at 10:33 pm on February 12, 2025:
    Done.
  73. sipa force-pushed on Feb 6, 2025
  74. sipa commented at 4:30 am on February 6, 2025: member

    Changed:

    • Added GetAncestorsUnion and GetDescendantsUnion, which act like their non-union version, but take in a span of multiple Ref*, and return the Ref* for the union of their ancestors/descendants

    (requested by @sdaftuar while rebasing #28676)

  75. in src/txgraph.cpp:110 in 781c15bfca outdated
    82+    /** Push changes to Cluster and its linearization to the TxGraphImpl Entry objects. */
    83+    void Updated(TxGraphImpl& graph) noexcept;
    84+
    85+    // Functions that implement the Cluster-specific side of internal TxGraphImpl mutations.
    86+
    87+    /** Apply any number of removals from the front of to_remove, popping them off. At least one
    


    instagibbs commented at 5:25 pm on February 6, 2025:

    781c15bfca1ebaffe7b634196e19144f5ab10a50

    Wasn’t immediately clear what the expected behavior is based on the comment. It will apply and pop off the entries where the prefix of to_remove is all owned by the cluster itself?


    sipa commented at 10:33 pm on February 12, 2025:
    I have rewritten this.
  76. in src/txgraph.cpp:596 in 781c15bfca outdated
    301+}
    302+
    303+void Cluster::ApplyRemovals(TxGraphImpl& graph, std::span<GraphIndex>& to_remove) noexcept
    304+{
    305+    // Iterate over the prefix of to_remove that applies to this cluster.
    306+    Assume(!to_remove.empty());
    


    instagibbs commented at 5:27 pm on February 6, 2025:
    UB if this Assume() is wrong, should we Assert() or more gracefully handle it?

    sipa commented at 8:47 pm on February 12, 2025:
    I don’t feel that’s necessary. The calling code is wrong if it’s called with an empty to_remove. It’s hopefully sufficient to document this, so reviewers can be convinced this isn’t being violated.

    ajtowns commented at 7:20 am on March 14, 2025:

    What’s the advantage of using Assume() over Assert()? The condition is still checked in production builds, it’s just that the result is discarded and execution continues, rather than triggering an abort, no? My understanding was:

    • CHECK_NONFATAL(cond) – to throw when there’s an internal error that prevents continuing in RPC code, but is recoverable overall
    • Assume(x) – when we expect x to be true-ish, but the code will still operate correctly if it is not
    • Assert(x) or assert(x) when we require x to be true-ish and the code won’t operate correctly if it is not

    You could conceivably have ref.m_graph be a debug-only field, and write Assume(ref.CheckGraph(txgraph)) with bool Ref::CheckGraph(const TxGraph* g) { if constexpr (DEBUG) { return g == m_graph; } else return true; } perhaps; but otherwise I don’t think you’re getting any benefit from Assume vs Assert.


    sipa commented at 8:54 pm on March 19, 2025:
    See my comment here: #31363 (comment) about how Assume() gives you compiler-guaranteed side-effect-equivalence, largely without runtime cost.
  77. in src/txgraph.cpp:599 in 781c15bfca outdated
    304+{
    305+    // Iterate over the prefix of to_remove that applies to this cluster.
    306+    Assume(!to_remove.empty());
    307+    SetType todo;
    308+    do {
    309+        GraphIndex idx = to_remove.front();
    


    instagibbs commented at 5:28 pm on February 6, 2025:
    0        GraphIndex idx = to_remove.front();
    1        if (!Assume(graph.m_entries.size() > idx)) break;
    

    sipa commented at 10:34 pm on February 12, 2025:
    I have added an Assume, but not the break. It just shouldn’t happen, and this is tested extensively in fuzzing (where the Assumes actually have teeth).
  78. in src/txgraph.cpp:62 in 781c15bfca outdated
    49+    friend class TxGraphImpl;
    50+    using GraphIndex = TxGraph::GraphIndex;
    51+    using SetType = BitSet<CLUSTER_COUNT_LIMIT>;
    52+    /** The DepGraph for this cluster, holding all feerates, and ancestors/descendants. */
    53+    DepGraph<SetType> m_depgraph;
    54+    /** m_mapping[i] gives the GraphIndex for the position i transaction in m_depgraph. */
    


    instagibbs commented at 5:33 pm on February 6, 2025:
    should probably say what value -1 means

    sipa commented at 10:34 pm on February 12, 2025:
    Done.
  79. in src/txgraph.cpp:64 in 781c15bfca outdated
    51+    using SetType = BitSet<CLUSTER_COUNT_LIMIT>;
    52+    /** The DepGraph for this cluster, holding all feerates, and ancestors/descendants. */
    53+    DepGraph<SetType> m_depgraph;
    54+    /** m_mapping[i] gives the GraphIndex for the position i transaction in m_depgraph. */
    55+    std::vector<GraphIndex> m_mapping;
    56+    /** The current linearization of the cluster. Size equals m_mapping.TxCount().
    


    instagibbs commented at 5:39 pm on February 6, 2025:
    m_mapping is a vector, I presume this is m_depgraph instead?

    sipa commented at 10:34 pm on February 12, 2025:
    Indeed. Expanded/corrected the comment.
  80. in src/txgraph.cpp:702 in 781c15bfca outdated
    349+    std::vector<Cluster*> new_clusters;
    350+    bool first{true};
    351+    // Iterate over the connected components of this Cluster's m_depgraph.
    352+    while (todo.Any()) {
    353+        auto component = m_depgraph.FindConnectedComponent(todo);
    354+        if (first && component == todo) {
    


    instagibbs commented at 8:23 pm on February 6, 2025:

    might help reader instantly recognize what todo means in this case

    0        if (first && component == todo) {
    1            Assume(todo == m_depgraph.Positions());
    

    sipa commented at 10:34 pm on February 12, 2025:
    Done.
  81. in src/txgraph.cpp:637 in 781c15bfca outdated
    360+        }
    361+        first = false;
    362+        // Construct a new Cluster to hold the found component.
    363+        auto new_cluster = std::make_unique<Cluster>();
    364+        new_clusters.push_back(new_cluster.get());
    365+        // Remember that all the component's transaction go to this new Cluster. The positions
    


    instagibbs commented at 8:54 pm on February 6, 2025:
    0        // Remember that all the component's transactions go to this new Cluster. The positions
    

    sipa commented at 10:34 pm on February 12, 2025:
    Done.
  82. in src/txgraph.cpp:776 in 781c15bfca outdated
    419+        }
    420+        m_linearization.push_back(new_pos);
    421+        // Copy the transaction's dependencies, translating them using remap.
    422+        SetType parents;
    423+        for (auto par : other.m_depgraph.GetReducedParents(pos)) {
    424+            parents.Set(remap[par]);
    


    instagibbs commented at 9:19 pm on February 6, 2025:
    note: remap here is already filled out from a topological pov since we’re iterating over other.m_linearization which is assumed to be topologically valid

    sipa commented at 10:35 pm on February 12, 2025:
    Added a comment to this effect.
  83. in src/txgraph.cpp:798 in 781c15bfca outdated
    441+    // between which dependencies are added, which simply concatenates their linearizations. Invoke
    442+    // PostLinearize, which has the effect that the linearization becomes a merge-sort of the
    443+    // constituent linearizations. Do this here rather than in Cluster::Merge, because this
    444+    // function is only invoked once per merged Cluster, rather than once per constituent one.
    445+    // This concatenation + post-linearization could be replaced with an explicit merge-sort.
    446+    PostLinearize(m_depgraph, m_linearization);
    


    instagibbs commented at 9:33 pm on February 6, 2025:

    781c15bfca1ebaffe7b634196e19144f5ab10a50

    note: I’m not sure this obvious that PostLinearize obtains this property.


    sipa commented at 9:16 pm on February 12, 2025:

    It follows from the property that PostLinearize results in chunks that do not consist of multiple disconnected components: if a higher-feerate chunk is appended to a lower-feerate one, it must mean they get separated.

    I’m happy to elaborate briefly in this comment, but perhaps this is also not the place to provide deeper more theoretical reasons why the properties hold.


    instagibbs commented at 9:45 pm on February 12, 2025:

    I am perhaps taking the comment too literally? I just don’t see how it’s a merge-sort of chunks when a PostLinearize may very well end up changing chunk boundaries. Or maybe it doesn’t? I buy that the documented properties otherwise hold.

    Either way, is there a presumed standout benefit to calling this once here, then after dependencies have been applied and linearization fixed?


    sipa commented at 8:55 pm on March 19, 2025:
    Late update (discussed elsewhere): this property actually did not hold. Added comments, and partially rewritten.
  84. in src/txgraph.cpp:118 in 781c15bfca outdated
    90+    /** Split this cluster (must have a NEEDS_SPLIT* quality). Returns whether to delete this
    91+     *  Cluster afterwards. */
    92+    [[nodiscard]] bool Split(TxGraphImpl& graph) noexcept;
    93+    /** Move all transactions from cluster to *this (as separate components). */
    94+    void Merge(TxGraphImpl& graph, Cluster& cluster) noexcept;
    95+    /** Given a span of (parent, child) pairs that all belong to this Cluster (or be removed),
    


    instagibbs commented at 9:40 pm on February 6, 2025:

    Cluster (or be removed)

    not sure I get this comment, be removed what?


    sipa commented at 10:35 pm on February 12, 2025:
    This comment was no longer relevant, it dates from an earlier iteration. Gone.
  85. in src/txgraph.cpp:726 in 781c15bfca outdated
    449+    std::sort(to_apply.begin(), to_apply.end(), [](auto& a, auto& b) { return a.second < b.second; });
    450+    // Iterate over groups of to-be-added dependencies with the same child.
    451+    auto it = to_apply.begin();
    452+    while (it != to_apply.end()) {
    453+        auto& first_child = graph.m_entries[it->second].m_locator;
    454+        DepGraphIndex child_idx = first_child.index;
    


    instagibbs commented at 9:42 pm on February 6, 2025:
    const this?

    sipa commented at 10:35 pm on February 12, 2025:
    Done.
  86. in src/txgraph.cpp:772 in 781c15bfca outdated
    492+
    493+    // Clean up space in quality_cluster.
    494+    auto max_setindex = quality_clusters.size() - 1;
    495+    if (setindex != max_setindex) {
    496+        // If the cluster was not the last element of quality_clusters, move that to take its place.
    497+        quality_clusters.back()->m_quality = quality;
    


    instagibbs commented at 9:50 pm on February 6, 2025:

    781c15bfca1ebaffe7b634196e19144f5ab10a50

    the cluster is from the same quality level, so this is kind of a no-op?


    sipa commented at 10:35 pm on February 12, 2025:
    Gone.
  87. in src/txgraph.cpp:561 in 781c15bfca outdated
    556+        if (cluster != nullptr) {
    557+            // If the first to_remove_span entry's Cluster exists, hand to_remove_span to it, so it
    558+            // can pop off whatever applies to it.
    559+            cluster->ApplyRemovals(*this, to_remove_span);
    560+        } else {
    561+            // Otherwise, skip this already-removed entry.
    


    instagibbs commented at 10:18 pm on February 6, 2025:
    when does this happen?

    sipa commented at 10:35 pm on February 12, 2025:
    When the same Ref is RemoveTransaction()ed twice. Added a comment to clarify.
  88. in src/txgraph.cpp:708 in 781c15bfca outdated
    353+        auto component = m_depgraph.FindConnectedComponent(todo);
    354+        if (first && component == todo) {
    355+            // The existing Cluster is an entire component. Leave it be, but update its quality.
    356+            graph.SetClusterQuality(m_quality, m_setindex, QualityLevel::NEEDS_RELINEARIZE);
    357+            // We need to recompute and cache its chunking.
    358+            Updated(graph);
    


    ismaelsadeeq commented at 8:23 pm on February 10, 2025:

    IIUC we call ApplyRemoval on each cluster before splitting, which calls updated with the graph at the end. so it seems to me that this call to updated before returning here is redundant and repeats work that has already been done?

    I also think even if the above assumption is incorrect and an update is necessary, then since the cluster quality is now NEEDS_RELINEARIZE, we only need to update the locator.


    sipa commented at 8:40 pm on February 11, 2025:
    You’re right, but that’s exactly what the “delay chunking while sub-acceptable” commit is about. In the initial commit, the invariant is that the chunk feerates are maintained at all times (and the sanity check will fail if not). In this “delay chunking while sub-acceptable” commit, some updates are avoided while they are unnecessary. The commit after it (“special-case removal of tail of cluster”) reintroduces some, because with that, performing a Split() can in fact immediately introduce optimal clusters directly.

    ismaelsadeeq commented at 10:07 pm on February 13, 2025:
    I see this now 👍🏾
  89. in src/txgraph.h:83 in 781c15bfca outdated
    82+     * As an example, imagine 3 transactions A,B,C where B depends on A. If a dependency of C on B
    83+     * is added, and then B is deleted, C will still depend on A. If the deletion of B is reordered
    84+     * before the C->B dependency is added, it has no effect instead. If, together with the
    85+     * deletion of B also either A or C is deleted, there is no distinction.
    86+     */
    87+    virtual void RemoveTransaction(const Ref& arg) noexcept = 0;
    


    ismaelsadeeq commented at 8:50 pm on February 10, 2025:
    IIUC you are recommending that callers should remove transactions topologically starting from bottom descendant to the top ancestor.

    sipa commented at 8:42 pm on February 11, 2025:
    No, the order of removals is irrelevant. The comment is just about pointing out that while TxGraph is allowed to reorder dependency additions and transaction removals w.r.t. each other, this doesn’t matter in practice. As long as every transaction removal is combined (doesn’t matter when) with also removal of all its ancestors, or with all its descendants, this reordering does not affect its observable behavior.
  90. in src/txgraph.cpp:746 in 781c15bfca outdated
    391+        for (auto par : m_depgraph.GetReducedParents(i)) new_parents.Set(remap[par].second);
    392+        new_cluster->m_depgraph.AddDependencies(new_parents, remap[i].second);
    393+    }
    394+    // Update all the Locators of moved transactions.
    395+    for (Cluster* new_cluster : new_clusters) {
    396+        new_cluster->Updated(graph);
    


    ismaelsadeeq commented at 8:54 pm on February 10, 2025:
    Don’t we just need to update the locator here, because all NEEDS_RELINEARIZE clusters has to be made acceptable before we can get their chunk feerate?

    sipa commented at 8:43 pm on February 11, 2025:
    Same comment as before: in the initial commit, the invariant is that the chunk feerates are maintained at all times. It is relaxed later on.
  91. ismaelsadeeq commented at 10:02 pm on February 10, 2025: member
    Code review to 741a6a8c4d851cb10ecee810a09187bcbfa5af4c
  92. in src/txgraph.cpp:928 in 781c15bfca outdated
    594+    if (!m_deps_to_add.empty()) return;
    595+    if (!m_to_remove.empty()) return;
    596+
    597+    // Sort the GraphIndexes that need to be cleaned up. They are sorted in reverse, so the last
    598+    // ones get processed first. This means earlier-processed GraphIndexes will not cause moving of
    599+    // later-processed ones during the "swap with end of m_entries" step (which might invalidate
    


    instagibbs commented at 2:59 pm on February 11, 2025:

    781c15bfca1ebaffe7b634196e19144f5ab10a50

    0    // later-processed ones during the "swap with end of m_entries" step below (which might invalidate
    

    sipa commented at 10:36 pm on February 12, 2025:
    Done.
  93. in src/txgraph.cpp:998 in 781c15bfca outdated
    654+    if (m_group_data.has_value()) return;
    655+
    656+    /** Annotated clusters: an entry for each Cluster, together with the representative for the
    657+     *  partition it is in if known, or with nullptr if not yet known. */
    658+    std::vector<std::pair<Cluster*, Cluster*>> an_clusters;
    659+    /** Annotated dependencies: an entry for each m_deps_to_apply entry (excluding ones that apply
    


    instagibbs commented at 3:37 pm on February 11, 2025:
    m_deps_to_add?

    sipa commented at 10:36 pm on February 12, 2025:
    Fixed.
  94. in src/txgraph.cpp:1451 in 781c15bfca outdated
    1005+std::vector<TxGraph::Ref*> TxGraphImpl::GetAncestors(const Ref& arg) noexcept
    1006+{
    1007+    // Return the empty vector if the Ref is empty.
    1008+    if (GetRefGraph(arg) == nullptr) return {};
    1009+    Assume(GetRefGraph(arg) == this);
    1010+    // Apply all dependencies, as the result might be incorrect otherwise.
    


    instagibbs commented at 3:50 pm on February 11, 2025:
    Probably worth noting in these comment and elsewhere that it also applies removals if queued.

    sipa commented at 10:36 pm on February 12, 2025:
    Done.
  95. in src/txgraph.h:93 in 781c15bfca outdated
    88+    /** Add a dependency between two specified transactions. Parent may not be a descendant of
    89+     *  child already (but may be an ancestor of it already, in which case this is a no-op). If
    90+     *  either transaction is already removed, this is a no-op. */
    91+    virtual void AddDependency(const Ref& parent, const Ref& child) noexcept = 0;
    92+    /** Modify the fee of the specified transaction. If the transaction does not exist (or was
    93+     *  removed), this has no effect. */
    


    instagibbs commented at 3:58 pm on February 11, 2025:
    should say the Ref if non-null should be connected to this TxGraph object

    sipa commented at 10:36 pm on February 12, 2025:
    I’ve added a comment to AddTransaction to clarify one can only pass Refs from a given txgraph object to functions of the same object.
  96. in src/txgraph.cpp:1835 in 781c15bfca outdated
    1099+
    1100+void TxGraphImpl::SetTransactionFee(const Ref& ref, int64_t fee) noexcept
    1101+{
    1102+    // Don't do anything if the passed Ref is empty.
    1103+    if (GetRefGraph(ref) == nullptr) return;
    1104+    Assume(GetRefGraph(ref) == this);
    


    instagibbs commented at 3:59 pm on February 11, 2025:
    0    if (!Assume(GetRefGraph(ref) == this)) return;
    

    sipa commented at 9:57 pm on February 12, 2025:
    If we’re going to expend run-time cost to perform the check, we might as well drop the requirement that Ref refers to this graph entirely. Given that we’re not expecting to ever have multiple TxGraph objects in production, this feels like overkill.
  97. in src/txgraph.cpp:1702 in 781c15bfca outdated
    1067+
    1068+FeePerWeight TxGraphImpl::GetChunkFeerate(const Ref& arg) noexcept
    1069+{
    1070+    // Return the empty FeePerWeight if the passed Ref is empty.
    1071+    if (GetRefGraph(arg) == nullptr) return {};
    1072+    Assume(GetRefGraph(arg) == this);
    


    instagibbs commented at 4:04 pm on February 11, 2025:
    0    if (!Assume(GetRefGraph(arg) == this)) return {};
    

    and so on for the public functions?

  98. in src/txgraph.h:47 in 781c15bfca outdated
    17+
    18+/** Data structure to encapsulate fees, sizes, and dependencies for a set of transactions. */
    19+class TxGraph
    20+{
    21+public:
    22+    /** Internal identifier for a transaction within a TxGraph. */
    


    instagibbs commented at 4:26 pm on February 11, 2025:
    0    /** Internal identifier for a transaction within a TxGraph corresponding to m_entries order. */
    

    sipa commented at 9:59 pm on February 12, 2025:
    I’d rather not refer to implementation details in the public interface.
  99. in src/txgraph.cpp:846 in 781c15bfca outdated
    841+    Assume(m_group_data.has_value());
    842+    // Nothing to do if there are no dependencies to be added.
    843+    if (m_deps_to_add.empty()) return;
    844+
    845+    // For each group of to-be-merged Clusters.
    846+    Assume(m_group_data.has_value());
    


    instagibbs commented at 4:35 pm on February 11, 2025:

    this was just assumed a few lines above

    edit: in ee4d16ff3a32a0b12a7c1b7837ef5195d9deedf0 the other one is gone, but we’re accessing it two lines above


    sipa commented at 10:37 pm on February 12, 2025:
    Fixed.
  100. in src/txgraph.cpp:146 in 781c15bfca outdated
    141+    /** Information about one group of Clusters to be merged. */
    142+    struct GroupEntry
    143+    {
    144+        /** Which clusters are to be merged. */
    145+        std::vector<Cluster*> m_clusters;
    146+        /** Which dependencies are to be applied to those merged clusters. */
    


    instagibbs commented at 4:40 pm on February 11, 2025:
    0        /** Which dependencies are to be applied to those merged clusters from parent to child. */
    

    sipa commented at 10:37 pm on February 12, 2025:
    Added a comment to this effect (the variable is gone in a future commit, though).
  101. in src/txgraph.cpp:1243 in 781c15bfca outdated
    845+    // For each group of to-be-merged Clusters.
    846+    Assume(m_group_data.has_value());
    847+    for (auto& group_data : *m_group_data) {
    848+        // Invoke Merge() to merge them into a single Cluster.
    849+        Merge(group_data.m_clusters);
    850+        // Actually apply all to-be-added dependencies (for each, parent and child belong to the
    


    instagibbs commented at 4:42 pm on February 11, 2025:
    0        // Actually apply all to-be-added dependencies (all parents and children from this grouping belong to the
    

    sipa commented at 10:37 pm on February 12, 2025:
    Done.
  102. in src/txgraph.cpp:1097 in 781c15bfca outdated
    663+
    664+    // Construct a an_clusters entry for every parent and child in the to-be-applied dependencies.
    665+    for (const auto& [par, chl] : m_deps_to_add) {
    666+        auto par_cluster = m_entries[par].m_locator.cluster;
    667+        auto chl_cluster = m_entries[chl].m_locator.cluster;
    668+        // Skip dependencies for which the parent or child transaction is removed.
    


    instagibbs commented at 5:07 pm on February 11, 2025:
    this would be the case where a removal and a dependency addition was queued for the same ref?

    sipa commented at 10:18 pm on February 12, 2025:
    Indeed.
  103. sipa force-pushed on Feb 11, 2025
  104. in src/txgraph.cpp:1133 in 0eacbd61cb outdated
    690+             *  a representative for the entire tree, and can be found by walking upwards from any
    691+             *  element. */
    692+            PartitionData* parent;
    693+            /** (only if this is a root, so when parent == this) An upper bound on the height of
    694+             *  tree for this partition. */
    695+            unsigned rank;
    


    instagibbs commented at 9:24 pm on February 11, 2025:
    we doing just unsigned for a reason? I don’t see this in the codebase often

    sipa commented at 10:21 pm on February 12, 2025:
    It’s a tiny number, as it’s bounded by $\mathcal{O}(\log n)$, so it didn’t feel right to use the same DepGraphIndex or GraphIndex type.
  105. in src/txgraph.cpp:710 in 0eacbd61cb outdated
    705+            Assume(it->cluster == arg);
    706+            return &*it;
    707+        };
    708+
    709+        /** Given a PartitionData, find the root of the tree it is in (its representative). */
    710+        static constexpr auto find_uf = [](PartitionData* data) noexcept -> PartitionData* {
    


    instagibbs commented at 9:28 pm on February 11, 2025:

    s/find_uf/find_root_uf/ ?

    find_uf and locate_uf read identical


    sipa commented at 10:37 pm on February 12, 2025:
    Renamed to find_root_fn, locate_fn, union_fn.
  106. in src/txgraph.cpp:1217 in 0eacbd61cb outdated
    759+        for (size_t i = 0; i < partition_data.size(); ++i) {
    760+            auto& data = partition_data[i];
    761+            // Find the representative of the partition Cluster i is in, and store it with the
    762+            // Cluster.
    763+            auto rep = find_uf(&data)->cluster;
    764+            an_clusters[i].second = rep;
    


    instagibbs commented at 9:43 pm on February 11, 2025:
    0            Assume(an_clusters[i].second == nullptr);
    1            an_clusters[i].second = rep;
    

    sipa commented at 10:38 pm on February 12, 2025:
    Done.
  107. instagibbs commented at 4:22 pm on February 12, 2025: member

    reviewed up to “txgraph: (feature) add initial version” 781c15bfca1ebaffe7b634196e19144f5ab10a50 , basically all nits at this point, logic all makes sense

    One question I had coming up is use of Assume(), there is a number of places that Assume() is used, then subsequently may result in UB or weird behavior. In a few spots I gave suggestions but was wondering if there is thinking behind doing more (or Assert’ing) or not to handle that in release builds.

  108. in src/test/fuzz/txgraph.cpp:172 in 1f06bc1e4a outdated
    167+
    168+    // Construct a real and a simulated graph.
    169+    auto real = MakeTxGraph();
    170+    SimTxGraph sim;
    171+
    172+    /** Function to pick any Ref (in sim real, sim.removed, or empty). */
    


    ismaelsadeeq commented at 4:34 pm on February 12, 2025:

    In “txgraph: (tests) add simulation fuzz test” 1f06bc1e4a8108f1430bcd20fc391c9f663a2e4b

    using real in the comments here is a bit confusing as you represent the actual tx graph as real


    instagibbs commented at 4:51 pm on February 12, 2025:

    sim real

    feel like we’re missing a punctuation or I’m unclear what it’s saying


    sipa commented at 4:35 am on February 13, 2025:
    Agreed, fixed.

    ismaelsadeeq commented at 7:20 pm on February 13, 2025:
    nit: There is still one remaining in line 180.

    sipa commented at 10:11 pm on February 21, 2025:
    This has been rewritten already.
  109. in src/txgraph.cpp:605 in 1f06bc1e4a outdated
    310@@ -311,7 +311,7 @@ void Cluster::ApplyRemovals(TxGraphImpl& graph, std::span<GraphIndex>& to_remove
    311         auto& locator = entry.m_locator;
    312         // Stop once we hit an entry that applies to another Cluster.
    313         if (locator.cluster != this) break;
    314-        // - Remember it in a set of to-remove ClusterIndexes.
    315+        // - Remember it in a set of to-remove DepGraphIndexes.
    


    instagibbs commented at 4:43 pm on February 12, 2025:
    micro-nit: seems this should have been renamed in a different commit 1f06bc1e4a8108f1430bcd20fc391c9f663a2e4b

    sipa commented at 10:11 pm on February 21, 2025:
    Done.
  110. in src/test/fuzz/txgraph.cpp:142 in 1f06bc1e4a outdated
    100+        auto pos = Find(ref);
    101+        if (pos == MISSING) return;
    102+        graph.RemoveTransactions(SetType::Singleton(pos));
    103+        simrevmap.erase(simmap[pos].get());
    104+        // Retain the TxGraph::Ref corresponding to this position, until explicitly destroyed.
    105+        // to see it when calling Cleanup().
    


    instagibbs commented at 5:14 pm on February 12, 2025:

    1f06bc1e4a8108f1430bcd20fc391c9f663a2e4b

    Cleanup

    probably an old reference?


    sipa commented at 10:11 pm on February 21, 2025:
    Removed it.
  111. in src/test/fuzz/txgraph.cpp:306 in 1f06bc1e4a outdated
    210+        while (true) {
    211+            if (sim.GetTransactionCount() < SimTxGraph::MAX_TRANSACTIONS && command-- == 0) {
    212+                // AddTransaction.
    213+                int64_t fee;
    214+                int32_t size;
    215+                if (alt) {
    


    instagibbs commented at 5:19 pm on February 12, 2025:
    to be clear this is to allow the search space to be split in a bimodal way?

    sipa commented at 10:11 pm on February 21, 2025:
    It’s mostly to avoid consuming 12 bytes from the fuzz data for every transaction; added a comment.
  112. in src/test/fuzz/txgraph.cpp:255 in 1f06bc1e4a outdated
    250+                    if (oversize) break;
    251+                }
    252+                sim.AddDependency(par, chl);
    253+                real->AddDependency(*par, *chl);
    254+                break;
    255+            } else if (sim.removed.size() < 100 && command-- == 0) {
    


    instagibbs commented at 5:29 pm on February 12, 2025:

    sim.removed.size() < 100

    Is this because the loop does 200 commands, and we logically speaking cannot remove more than 100 after 100 additions?


    sipa commented at 10:12 pm on February 21, 2025:
    This dates from a time when the fuzz test allowed up to 1000 steps, and it felt unnecessary to support cases with unboundedly growing numbers of removed transactions. I’ve just dropped it.
  113. in src/test/fuzz/txgraph.cpp:271 in 1f06bc1e4a outdated
    266+                    real->RemoveTransaction(*ptr);
    267+                    sim.RemoveTransaction(ptr);
    268+                }
    269+                break;
    270+            } else if (sim.removed.size() > 0 && command-- == 0) {
    271+                // ~Ref
    


    instagibbs commented at 5:35 pm on February 12, 2025:

    1f06bc1e4a8108f1430bcd20fc391c9f663a2e4b

    this is the odd command out; what precisely is this simulating, or is this just making sure destruction of the Ref is being exercised in general?


    sipa commented at 10:13 pm on February 21, 2025:
    I’ve added a comment. ~Ref triggers observable behavior in TxGraph, so the simulation controls when it gets called (and its ordering w.r.t. other operations).
  114. in src/test/fuzz/txgraph.cpp:279 in 1f06bc1e4a outdated
    274+                if (removed_pos != sim.removed.size() - 1) {
    275+                    std::swap(sim.removed[removed_pos], sim.removed.back());
    276+                }
    277+                sim.removed.pop_back();
    278+                break;
    279+            } else if (sim.GetTransactionCount() > 0 && command-- == 0) {
    


    instagibbs commented at 5:36 pm on February 12, 2025:
    do we really not want coverage when the graph is empty?

    sipa commented at 10:13 pm on February 21, 2025:
    Fixed.
  115. in src/test/fuzz/txgraph.cpp:256 in 1f06bc1e4a outdated
    173+    auto pick_fn = [&]() noexcept -> TxGraph::Ref* {
    174+        auto tx_count = sim.GetTransactionCount();
    175+        /** The number of possible choices. */
    176+        size_t choices = tx_count + sim.removed.size() + 1;
    177+        /** Pick one of them. */
    178+        auto choice = provider.ConsumeIntegralInRange<size_t>(0, choices - 1);
    


    ismaelsadeeq commented at 7:01 pm on February 12, 2025:

    In “txgraph: (tests) add simulation fuzz test” 1f06bc1e4a8108f1430bcd20fc391c9f663a2e4b

    why add 1 and then subtract again when using the choices value?


    sipa commented at 4:26 am on February 13, 2025:

    Well choices counts the actual number of choices, consisting of:

    • All transactions in sim.simmap (tx_count)
    • All transactions in sim.removed (sim.removed.size())
    • The empty Ref (1).

    When picking one of them, we want one in the range from 0 up to and including choices - 1.


    ismaelsadeeq commented at 7:19 pm on February 13, 2025:
    Makes sense thanks.
  116. in src/test/fuzz/txgraph.cpp:117 in 1f06bc1e4a outdated
    77+    }
    78+
    79+    /** Add a dependency between two positions in this graph. */
    80+    void AddDependency(TxGraph::Ref* parent, TxGraph::Ref* child)
    81+    {
    82+        auto par_pos = Find(parent);
    


    ismaelsadeeq commented at 7:23 pm on February 12, 2025:

    In “txgraph: (tests) add simulation fuzz test” 1f06bc1e4a8108f1430bcd20fc391c9f663a2e4b

    Should we return early when parent ref is the same as ref child? This will prevent a possibility of having a cycle. As such, we don’t have to check before adding dependency in the fuzz test.


    sipa commented at 4:27 am on February 13, 2025:
    It doesn’t matter; TxGraph ignores self-dependencies anyway (because DepGraph does), so having it doesn’t hurt, and adds testing for that case.
  117. in src/test/fuzz/txgraph.cpp:282 in 1f06bc1e4a outdated
    195+        // Return empty.
    196+        assert(choice == 0);
    197+        return &empty_ref;
    198+    };
    199+
    200+    LIMITED_WHILE(provider.remaining_bytes() > 0, 200) {
    


    ismaelsadeeq commented at 7:51 pm on February 12, 2025:

    In “txgraph: (tests) add simulation fuzz test” 1f06bc1e4a8108f1430bcd20fc391c9f663a2e4b

    This loop is quite big, has series of steps that are depending if behavior of the if else branches.

    Will be nice to add an overview of how the test works, just like it was done in cluster_linearize fuzz test


    sipa commented at 4:29 am on February 13, 2025:
    It’s not clear to me what you’re asking for. I would describe the test as “big simulation test, which performs a number of operations on a real TxGraph, and on a simpler reimplementation, and in the end compares the two”. Something like that? Which cluster_linearize fuzz test are you referring to?

    ismaelsadeeq commented at 7:20 pm on February 13, 2025:

    Yeah something like that that

    In most of the fuzz test in src/test/fuzz/cluster_linearize.cpp you provided a really nice description , which is just an overview of what the test is going to do before jumping to the instructions.

    Here we just jump into the instructions without that context.

    But this is nitty, you can ignore this comment and resolve.


    sipa commented at 9:30 pm on February 13, 2025:
    Will address on the next push.

    sipa commented at 9:11 pm on March 19, 2025:

    I added the following comment:

    0    // This is a big simulation test for TxGraph, which performs a fuzz-derived sequence of valid
    1    // operations on a TxGraph instance, as well as on a simpler (mostly) reimplementation (see
    2    // SimTxGraph above), comparing the outcome of functions that return a result, and finally
    3    // performing a full comparison between the two.
    
  118. in src/txgraph.cpp:1201 in 741a6a8c4d outdated
    1196+        // ... for all clusters in them ...
    1197+        for (ClusterSetIndex setindex = 0; setindex < quality_clusters.size(); ++setindex) {
    1198+            const auto& cluster = *quality_clusters[setindex];
    1199+            // Remember we saw this Cluster (only if it is non-empty; empty Clusters aren't
    1200+            // expected to be referenced by the Entry vector).
    1201+            if (cluster.GetTxCount() != 0) {
    


    instagibbs commented at 8:05 pm on February 12, 2025:
    Empty clusters can happen anytime ApplyRemovals() is invoked, but not ApplyDependencies since that invokes SplitAll(), correct?

    sipa commented at 8:55 pm on February 21, 2025:
    Indeed. Empty clusters can only exist in after an ApplyRemovals but before a SplitAll(). After oversizedness caching, ApplyDependencies may end up not calling SplitAll(), though, if the result is known to be oversized already.
  119. in src/txgraph.cpp:1895 in 741a6a8c4d outdated
    1158+    assert(m_done == m_depgraph.Positions());
    1159+}
    1160+
    1161+void TxGraphImpl::SanityCheck() const
    1162+{
    1163+    /** Which GraphIndexes ought to occur in m_wiped, based on m_entries. */
    


    ismaelsadeeq commented at 8:53 pm on February 12, 2025:

    In “txgraph: (tests) add internal sanity check function” 741a6a8c4d851cb10ecee810a09187bcbfa5af4c

    What is m_wiped


    sipa commented at 4:35 am on February 13, 2025:
    That was outdated. Fixed.
  120. in src/txgraph.cpp:1149 in 741a6a8c4d outdated
    1144+        // Check that the Entry has a locator pointing back to this Cluster & position within it.
    1145+        assert(entry.m_locator.cluster == this);
    1146+        assert(entry.m_locator.index == lin_pos);
    1147+        // Check linearization position and chunk feerate.
    1148+        if (!linchunking.GetChunk(0).transactions[lin_pos]) {
    1149+            linchunking.MarkDone(linchunking.GetChunk(0).transactions);
    


    ismaelsadeeq commented at 9:01 pm on February 12, 2025:

    In “txgraph: (tests) add internal sanity check function” 741a6a8c4d851cb10ecee810a09187bcbfa5af4c

    Should we also compare that the list of transactions in chunk 0 matches m_linearization subset from from previous hit of the conditional statement to lin_pos?


    sipa commented at 4:32 am on February 13, 2025:
    I’m not sure what you mean here.

    ismaelsadeeq commented at 7:21 pm on February 13, 2025:

    Yeah I was thinking something like

     0diff --git a/src/txgraph.cpp b/src/txgraph.cpp
     1index afaa46e6326..a9b3e446d26 100644
     2--- a/src/txgraph.cpp
     3+++ b/src/txgraph.cpp
     4@@ -1143,6 +1143,7 @@ void Cluster::SanityCheck(const TxGraphImpl& graph) const
     5 
     6     // Verify m_linearization.
     7     SetType m_done;
     8+    SetType last_chunk;
     9     assert(m_depgraph.IsAcyclic());
    10     for (auto lin_pos : m_linearization) {
    11         assert(lin_pos < m_mapping.size());
    12@@ -1155,9 +1156,12 @@ void Cluster::SanityCheck(const TxGraphImpl& graph) const
    13         assert(entry.m_locator.index == lin_pos);
    14         // Check linearization position and chunk feerate.
    15         if (!linchunking.GetChunk(0).transactions[lin_pos]) {
    16+            assert(linchunking.GetChunk(0).transactions == last_chunk);
    17+            last_chunk = SetType();
    18             linchunking.MarkDone(linchunking.GetChunk(0).transactions);
    19         }
    20         assert(entry.m_chunk_feerate == linchunking.GetChunk(0).feerate);
    21+        last_chunk.Set(lin_pos);
    22         // If this Cluster has an acceptable quality level, its chunks must be connected.
    23         if (m_quality == QualityLevel::ACCEPTABLE || m_quality == QualityLevel::OPTIMAL) {
    24             assert(m_depgraph.IsConnected(linchunking.GetChunk(0).transactions));
    

    sipa commented at 9:29 pm on February 13, 2025:
    I don’t think this is the place for it, as it amounts to testing of LinearizationChunking, not TxGraph. The point of the code here is just to get the chunks defined by m_linearization out, so we can test them. If LinearizationChunking works correctly (see clusterlin_linearization_chunking for that), then the chunks that come out will correspond to consecutive transactions from the linearization.
  121. sipa force-pushed on Feb 12, 2025
  122. DrahtBot added the label CI failed on Feb 13, 2025
  123. DrahtBot commented at 0:01 am on February 13, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/37128519486

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  124. sipa force-pushed on Feb 13, 2025
  125. DrahtBot removed the label CI failed on Feb 13, 2025
  126. in src/txgraph.cpp:1189 in 56cbdd7889 outdated
    833+            while (deps_it != m_deps_to_add.end()) {
    834+                auto [par, chl] = *deps_it;
    835+                auto chl_cluster = m_entries[chl].m_locator.cluster;
    836+                // Skip dependencies that apply to earlier Clusters (those necessary are for
    837+                // deleted transactions, as otherwise we'd have processed them already).
    838+                if (!std::less{}(chl_cluster, data.cluster)) {
    


    instagibbs commented at 6:02 pm on February 13, 2025:

    56cbdd7889e957c29d76681c46c4a7a6983c9be6

    I rewrote it so my smooth brain could better decipher it (I believe it’s the same) :

     0diff --git a/src/txgraph.cpp b/src/txgraph.cpp
     1index b2cecffebb..df93eb2a13 100644
     2--- a/src/txgraph.cpp
     3+++ b/src/txgraph.cpp
     4@@ -821,12 +821,13 @@ void TxGraphImpl::GroupClusters() noexcept
     5             while (deps_it != m_deps_to_add.end()) {
     6                 auto [par, chl] = *deps_it;
     7                 auto chl_cluster = m_entries[chl].m_locator.cluster;
     8                 // Skip dependencies that apply to earlier Clusters (those necessary are for
     9                 // deleted transactions, as otherwise we'd have processed them already).
    10-                if (!std::less{}(chl_cluster, data.cluster)) {
    11-                    if (chl_cluster != data.cluster) break;
    12+                if (std::greater{}(chl_cluster, data.cluster)) {
    13+                    break;
    14+                } else if (chl_cluster == data.cluster) {
    15                     auto par_cluster = m_entries[par].m_locator.cluster;
    16                     // Also filter out dependencies applying to a removed parent.
    17                     if (par_cluster != nullptr) an_deps.emplace_back(*deps_it, rep);
    18                 }
    19                 ++deps_it;
    

    sipa commented at 10:15 pm on February 21, 2025:
    Done, largely.
  127. in src/txgraph.cpp:662 in 693c3df67d outdated
    418+           m_quality == QualityLevel::NEEDS_SPLIT_ACCEPTABLE);
    419+    // Determine the new quality the split-off Clusters will have.
    420+    QualityLevel new_quality = m_quality == QualityLevel::NEEDS_SPLIT ? QualityLevel::NEEDS_RELINEARIZE :
    421+                               m_quality == QualityLevel::NEEDS_SPLIT_OPTIMAL ? QualityLevel::OPTIMAL :
    422+                               QualityLevel::ACCEPTABLE;
    423+    // If the cluster was NEEDS_SPLIT_OPTIMAL, and we're thus going to produce OPTIMAL clusters, we
    


    instagibbs commented at 6:29 pm on February 13, 2025:

    693c3df67dc972dd2fcfbeb8179bcd7833c77bd1

    Working on convincing myself that PostLinearization is suffcient to result in optimal clusters. I think it’s straight forward that the chunk prefixes before any removed tail transactions are untouched, and once we remove tail transactions, the last touched-but-not-fully-removed chunk may be in a substandard ordering.

    So PostLinearization guarantees that the final sub-chunk is reordered to being connected, and that means that sub-chunk is now optimal? I’m going to need help here.

    Something I was too lazy to do: A fuzz harness could be constructed via taking verified-optimal depgraphs (from exhaustive solver), chopping the tail off, and verifying all connected components PostLinearize-ation are just as good as whatever the exhaustive linearizer gives for each component-now-cluster?


    instagibbs commented at 7:01 pm on February 27, 2025:

    Update from offline conversation: This was a confusion about what we’re calling “optimal”. The splits clusters are already optimal in the sense of the diagram check, but not minimal or fully connected. PostLinearization makes sure that all chunks are connected/minimal.

    Perhaps a rewording of this section to make it abundantly clear?


    instagibbs commented at 3:54 pm on March 3, 2025:
    Also, as discussed offline, this seems like an easy scenario to write a fuzz test for as described above ^

    sipa commented at 9:15 pm on March 19, 2025:
    NEEDS_SPLIT_OPTIMAL is gone now after further discussion. Marking resolved.
  128. in src/txgraph.cpp:1021 in 370c419c33 outdated
    1017@@ -997,6 +1018,7 @@ void TxGraphImpl::AddDependency(const Ref& parent, const Ref& child) noexcept
    1018     m_group_data.reset();
    1019     // Remember that this dependency is to be applied.
    1020     m_deps_to_add.emplace_back(GetRefIndex(parent), GetRefIndex(child));
    1021+    m_group_data.reset();
    


    ismaelsadeeq commented at 8:09 pm on February 13, 2025:

    In “txgraph: (feature) make max cluster count configurable and oversize state” 370c419c330ca73d9ea44a7d48506a5c203e9c17

    This is a duplicate I think


    sipa commented at 10:28 pm on February 14, 2025:
    Indeed, fixed.
  129. in src/txgraph.cpp:324 in 30d7c8ce7f outdated
    336-        } while(chunk.transactions.Any());
    337+    // If the Cluster's quality is ACCEPTABLE or OPTIMAL, compute its chunking and store its
    338+    // information in the Entry's m_chunk_feerate. These fields are only accessed after making
    339+    // the entire graph ACCEPTABLE, so it is pointless to compute these if we haven't reached that
    340+    // quality level yet.
    341+    if (m_quality == QualityLevel::OPTIMAL || m_quality == QualityLevel::ACCEPTABLE) {
    


    ismaelsadeeq commented at 9:35 pm on February 13, 2025:

    In “txgraph: (optimization) delay chunking while sub-acceptable” 30d7c8ce7f34e977238d1454dd032196cbfd936b

    nit: we can have truthy/false method for this so that we can just call it. we have this conditional statement in quite a few places


    sipa commented at 10:29 pm on February 14, 2025:
    Done. I have actually added a bunch of functions (IsAcceptable(), IsOptimal(), NeedsSplitting(), and further on also IsOversized()) and mostly rewritten the long conditions involving m_quality to use these functions instead.
  130. in src/txgraph.cpp:263 in 1b207547a9 outdated
    259@@ -260,6 +260,8 @@ class TxGraphImpl final : public TxGraph
    260     ClusterSetIndex InsertCluster(std::unique_ptr<Cluster>&& cluster, QualityLevel quality) noexcept;
    261     /** Change the QualityLevel of a Cluster (identified by old_quality and old_index). */
    262     void SetClusterQuality(QualityLevel old_quality, ClusterSetIndex old_index, QualityLevel new_quality) noexcept;
    263+    /** Make a transaction not exist. */
    


    instagibbs commented at 3:16 pm on February 14, 2025:

    nit

    0    /** Make a transaction not exist. Transaction must currently exist. */
    

    sipa commented at 10:15 pm on February 21, 2025:
    Done.
  131. in src/txgraph.cpp:308 in e19bbc3282 outdated
    304@@ -252,16 +305,21 @@ class TxGraphImpl final : public TxGraph
    305 
    306     /** Swap the Entrys referred to by a and b. */
    307     void SwapIndexes(GraphIndex a, GraphIndex b) noexcept;
    308-    /** Extract a Cluster. */
    309-    std::unique_ptr<Cluster> ExtractCluster(QualityLevel quality, ClusterSetIndex setindex) noexcept;
    310+    /** If idx exists in the specified level ClusterSet (explicitly or implicitly), return the
    


    ismaelsadeeq commented at 3:40 pm on February 14, 2025:
    What do you mean by implicitly

    sipa commented at 10:29 pm on February 14, 2025:
    I have expanded on this. LMK if it’s clearer now.
  132. in src/txgraph.cpp:120 in e19bbc3282 outdated
     94@@ -90,6 +95,17 @@ class Cluster
     95     void UpdateMapping(DepGraphIndex cluster_idx, GraphIndex graph_idx) noexcept { m_mapping[cluster_idx] = graph_idx; }
     96     /** Push changes to Cluster and its linearization to the TxGraphImpl Entry objects. */
     97     void Updated(TxGraphImpl& graph) noexcept;
     98+    /** Create a copy of this Cluster, returning a pointer to it (used by PullIn). */
     99+    Cluster* CopyTo(TxGraphImpl& graph, int to_level) const noexcept;
    100+    /** Get the list of Clusters that conflict with this one (at level-1). */
    


    instagibbs commented at 3:47 pm on February 14, 2025:
    nit: “level-1” is a bit vague, but understandable once you dive into the code itself. “top level - 1”?

    sipa commented at 10:18 pm on February 21, 2025:
    That’s not quite what this does. It looks for conflicts one level below the one where the *this Cluster exists (in practice, that’ll always be top_level-1, though). Improved the comment.
  133. in src/txgraph.cpp:229 in e19bbc3282 outdated
    208@@ -189,28 +209,59 @@ class TxGraphImpl final : public TxGraph
    209         std::vector<std::pair<GraphIndex, GraphIndex>> m_deps_to_add;
    210         /** Information about the merges to be performed, if known. */
    211         std::optional<GroupData> m_group_data = GroupData{};
    212+        /** Which entries were removed in this ClusterSet (so they can be wiped on abort). */
    


    instagibbs commented at 3:49 pm on February 14, 2025:
    IIUC this field should become a subset of m_to_remove’s value which ended up being IsPresent? Is there any better way to explain the relationship of these two variables?

    sipa commented at 9:20 pm on February 21, 2025:
    I think the most accurate description is “All entries which have an (R) removed locator at this level, plus any transactions in m_unlinked”. Would that help?

    instagibbs commented at 11:38 pm on February 21, 2025:
    yes, thanks

    sipa commented at 9:08 pm on March 19, 2025:
    Done.
  134. in src/txgraph.cpp:232 in e19bbc3282 outdated
    232+     *  - (P)resent: actually occurs in a Cluster at that level.
    233+     *
    234+     *  - (M)issing: not present in a Cluster at that level. For main, this means the transaction
    235+     *               does not exist. For staging this means it its existence is inherited from
    236+     *               main. If it doesn't exist in main, it doesn't exist in staging either. If it
    237+     *               does existing in main, the cluster it is in is unmodified in staging.
    


    ismaelsadeeq commented at 3:49 pm on February 14, 2025:
    0     *               does exist in main, the cluster it is in is unmodified in staging.
    

    sipa commented at 10:29 pm on February 14, 2025:
    Done.
  135. in src/txgraph.cpp:122 in e19bbc3282 outdated
    102+    /** Mark all the Entry objects belonging to this Cluster as missing. The Cluster must be
    103+     *  deleted immediately after. */
    104+    void MakeTransactionsMissing(TxGraphImpl& graph) noexcept;
    105+    /** Remove all transactions in a Cluster. */
    106+    void Clear(TxGraphImpl& graph) noexcept;
    107+    /** Change a Cluster's level from level to level-1. */
    


    ismaelsadeeq commented at 4:05 pm on February 14, 2025:
    Should just refer to main and staging only instead of level-1, main, staging and top level? It will be consistent this way.

    sipa commented at 9:14 pm on March 19, 2025:
    Together with a lot of changes (getting rid of the ClusterSet vector), done.
  136. in src/txgraph.cpp:273 in e19bbc3282 outdated
    243+     * - (M,M): the transaction doesn't exist in either graph.
    244+     * - (P,M): the transaction exists in both, but only exists explicitly in a Cluster object in
    245+     *          main. Its existence in staging is inherited from main.
    246+     * - (P,P): the transaction exists in both, and is materialized in both. Thus, the clusters
    247+     *          and/or their linearizations may be different in main and staging.
    248+     * - (M,P): the transaction is added in staging, and does not exist in main.
    


    ismaelsadeeq commented at 4:15 pm on February 14, 2025:

    hmm, above you mentioned “If it doesn’t exist in main, it doesn’t exist in staging either.”.

    And now there is a state where the transaction exits in staging but not in main


    sipa commented at 10:30 pm on February 14, 2025:

    I see this was a bit confusing, I have expanded the explanation.

    But what I meant to convey was that (M) in staging means that if the transaction does not exist in main, it doesn’t exist in staging either. A (P) in staging just means it exists there. LMK if the new explanation is clearer.

  137. in src/txgraph.cpp:525 in e19bbc3282 outdated
    449+    }
    450+}
    451+
    452+std::vector<Cluster*> TxGraphImpl::GetConflicts() const noexcept
    453+{
    454+    int level = m_clustersets.size() - 1;
    


    ismaelsadeeq commented at 4:46 pm on February 14, 2025:

    Should we just do this here and other places?

    0    int level = MAX_LEVELS - 1;
    

    sipa commented at 9:14 pm on March 19, 2025:
    This code is gone now, resolving.
  138. in src/txgraph.cpp:518 in e19bbc3282 outdated
    442+    for (auto i : m_linearization) {
    443+        auto& entry = graph.m_entries[m_mapping[i]];
    444+        // For every transaction Entry in this Cluster, if it also exists in a lower-level Cluster,
    445+        // then that Cluster conflicts.
    446+        if (entry.m_locator[m_level - 1].IsPresent()) {
    447+            out.push_back(entry.m_locator[m_level - 1].cluster);
    


    ismaelsadeeq commented at 5:11 pm on February 14, 2025:
    Should be set to avoid duplicates, will prevent de duplicating later on?

    sipa commented at 8:03 pm on February 18, 2025:

    Both options are O(n log n), but a vector + sorting + deduplication has much better constant factors. An std::set requires a separate allocation for each element, while a vector uses a single allocator for everything. Having everything in continuous memory also exploits the CPU cache better.

    In theory, using an std::unordered_map could asymptotically faster for large numbers as it’s just O(n), but I suspect not for the numbers we care about. I haven’t tried or benchmarked it, though.


    ismaelsadeeq commented at 8:06 pm on February 18, 2025:

    Agreed, I calculated recently. This is better.

    The constant factors for GetDistinctClusters implementation using vector and then sorting is also better 👍🏾 Please resolve.

  139. ismaelsadeeq commented at 5:36 pm on February 14, 2025: member

    Code Review e19bbc328236f64716034277857951184309cd14

    It seems to me that this is designed with the flexibility to accommodate more levels, which explains the use of std::vector for ClusterSet

    If yes then current approach is perfect else would it be better for TxGraph to maintain a std::array of two ClusterSet? mainandstaging and store their indexes in TxGraph as enums? This way, Cluster would only need to maintain that enum.

    In ClusterSet, we could introduce a boolean to indicate whether a set is active or not.

    The main set could always remain active, while staging would be flexible and toggleable.

    Thus, when we start staging, we can simply mark the back array as active, and once done, clear the back array and mark it as inactive.

    I believe this approach would reduce Assumes since we explicitly know we can only have two ClusterSet’s.

  140. in src/txgraph.cpp:532 in e19bbc3282 outdated
    451+
    452+std::vector<Cluster*> TxGraphImpl::GetConflicts() const noexcept
    453+{
    454+    int level = m_clustersets.size() - 1;
    455+    std::vector<Cluster*> ret;
    456+    // All Clusters at level-1 containing transactions in m_removed are conflicts.
    


    instagibbs commented at 7:18 pm on February 14, 2025:
    i.e., any (P, R)? If so, putting in comment would be nice

    sipa commented at 10:18 pm on February 21, 2025:
    Indeed, done.
  141. in src/txgraph.cpp:539 in e19bbc3282 outdated
    457+    for (auto i : m_clustersets[level].m_removed) {
    458+        auto& entry = m_entries[i];
    459+        Assume(entry.m_locator[level - 1].IsPresent());
    460+        ret.push_back(entry.m_locator[level - 1].cluster);
    461+    }
    462+    // Then go over all Clusters at this level, and find their conflicts.
    


    instagibbs commented at 7:19 pm on February 14, 2025:
    i.e., any (P, P)? If so, putting in comment would be nice

    sipa commented at 10:19 pm on February 21, 2025:
    Indeed, done.
  142. in src/txgraph.cpp:563 in e19bbc3282 outdated
    481+    ptr->m_depgraph = m_depgraph;
    482+    ptr->m_mapping = m_mapping;
    483+    ptr->m_linearization = m_linearization;
    484+    // Insert the new Cluster into the graph.
    485+    graph.InsertCluster(to_level, std::move(ret), m_quality);
    486+    // Update its Locators (and possibly linearization data in its Entrys).
    


    instagibbs commented at 7:31 pm on February 14, 2025:

    and possibly linearization data

    e.g. when to_level is 0 and is otherwise acceptable?


    sipa commented at 10:19 pm on February 21, 2025:
    Yes, but no, because to_level is never 0. I have just dropped it.
  143. in src/txgraph.cpp:125 in e19bbc3282 outdated
    100+    /** Get the list of Clusters that conflict with this one (at level-1). */
    101+    void GetConflicts(const TxGraphImpl& graph, std::vector<Cluster*>& out) const noexcept;
    102+    /** Mark all the Entry objects belonging to this Cluster as missing. The Cluster must be
    103+     *  deleted immediately after. */
    104+    void MakeTransactionsMissing(TxGraphImpl& graph) noexcept;
    105+    /** Remove all transactions in a Cluster. */
    


    instagibbs commented at 7:36 pm on February 14, 2025:

    I find this slightly clearer since txns can still exist somewhere

    0    /** Remove all transactions from a Cluster. */
    

    sipa commented at 10:19 pm on February 21, 2025:
    Done.
  144. in src/txgraph.cpp:905 in e19bbc3282 outdated
    826+    int level = cluster->m_level;
    827+    Assume(level <= to_level);
    828+    // Copy the Cluster from the level it was found at to higher levels, if any.
    829+    while (level < to_level) {
    830+        // Make the Cluster Acceptable before copying. This isn't strictly necessary, but doing it
    831+        // now avoids doing doable work later.
    


    instagibbs commented at 7:59 pm on February 14, 2025:
    micro-nit: immediately doable

    sipa commented at 10:20 pm on February 21, 2025:
    Actually I meant “double”, not “doable”, I think. Fixed.

    instagibbs commented at 11:57 pm on February 21, 2025:
    hah, was my first guess that I doubted :)
  145. in src/txgraph.cpp:842 in e19bbc3282 outdated
    844+    auto& clusterset = m_clustersets[level];
    845     auto& to_remove = clusterset.m_to_remove;
    846     // Skip if there is nothing to remove.
    847     if (to_remove.empty()) return;
    848+    // Pull in all Clusters that are not in the top ClusterSet.
    849+    for (GraphIndex index : clusterset.m_to_remove) {
    


    instagibbs commented at 8:06 pm on February 14, 2025:
    0    for (GraphIndex index : to_remove) {
    

    and elsewhere so we’re only looking at two different to_remove* things vs three?


    sipa commented at 10:20 pm on February 21, 2025:
    Done.
  146. sipa force-pushed on Feb 14, 2025
  147. in src/txgraph.h:96 in 88454752cd outdated
     98@@ -99,6 +99,11 @@ class TxGraph
     99      *  effect. */
    100     virtual void SetTransactionFee(const Ref& arg, int64_t fee) noexcept = 0;
    101 
    102+    /** TxGraph is internally lazy, and will not compute many things until they are needed.
    103+     *  Calling DoWork will compute everything now, so that future operations are fast. This can be
    104+     *  invoked while oversized. */
    


    ismaelsadeeq commented at 3:46 pm on February 18, 2025:
    It can be called when oversized but for staging DoWork it will be no-op when it is oversized.

    sipa commented at 8:45 pm on February 18, 2025:
    It might still do something; for example removals may be applied still, and grouping may be calculated. I feel it’s unnecessary to elaborate that much about implementation details in the interface, though.
  148. sipa commented at 8:30 pm on February 18, 2025: member

    @ismaelsadeeq

    It seems to me that this is designed with the flexibility to accommodate more levels, which explains the use of std::vector for ClusterSet

    It’s not really written with the intent of accomodating more levels; it’s more that I found it more elegant to write it generically, because most operations don’t care what level they are operating on. I can see that the increased abstraction may be confusing though; I’d like to hear more reviewer comments on this.

    If yes then current approach is perfect else would it be better for TxGraph to maintain a std::array of two ClusterSet? mainandstaging and store their indexes in TxGraph as enums? This way, Cluster would only need to maintain that enum.

    In ClusterSet, we could introduce a boolean to indicate whether a set is active or not.

    I think it’s more natural to use a vector, which implicitly encodes how many there are, and it means one can always use m_clustersets.back() to refer to the top set, regardless of whether main/staging exist; the alternative would involve checking whether m_clustersets[1].m_active is true or not. Performance isn’t a concern here either; they’re big data structures that aren’t created/destroyed inside tight loops.

    The main set could always remain active, while staging would be flexible and toggleable.

    Thus, when we start staging, we can simply mark the back array as active, and once done, clear the back array and mark it as inactive.

    I believe this approach would reduce Assumes since we explicitly know we can only have two ClusterSet’s.

    True, but it would also add a consistency requirement that m_clustersets[0].m_active is always true, for example. And of course, Assumes are never “necessary”; they’re there to help review (by convincing people that if the condition didn’t hold, presumably a test could hit it). If you find an Assume that’s more cluttering than helpful, it can always just be dropped too.

  149. sipa force-pushed on Feb 20, 2025
  150. sipa commented at 5:24 pm on February 20, 2025: member

    I have modified (in “add staging support” and further) the TxGraph functions relating to the normalization steps to all take a level argument as input (ApplyRemovals, SplitAll, GroupClusters, ApplyDependencies, and MakeAllAcceptable). This meant a number of repeated sequences like:

    0    SplitAll(0);
    1    if (m_clustersets.size() == 1) ApplyDependencies();
    

    could be changed into just

    0    ApplyDependencies(0);
    
  151. in src/txgraph.cpp:828 in 913e14e6fd outdated
    830+{
    831+    Assume(level >= 0 && size_t(level) < m_clustersets.size());
    832+    auto& entry = m_entries[idx];
    833+    // Search the entry's locators from top to bottom.
    834+    for (int l = level; l >= 0; --l) {
    835+        // If the locator is missing, dig deeper; it may exist at a lower level.
    


    instagibbs commented at 7:20 pm on February 20, 2025:

    micro-nit: documentation was pretty clear in declaration, but just in case…

    0        // If the locator is missing, dig deeper; it may exist at a lower level and therefore be implicitly available at this level.
    

    sipa commented at 10:20 pm on February 21, 2025:
    Done.
  152. in src/txgraph.cpp:1548 in 913e14e6fd outdated
    1575+void TxGraphImpl::AbortStaging() noexcept
    1576+{
    1577+    Assume(m_clustersets.size() > 1);
    1578+    int stage_level = m_clustersets.size() - 1;
    1579+    auto& stage = m_clustersets[stage_level];
    1580+    // Mark are removed transactions as Missing (so the stage_level locator for these transactions
    


    instagibbs commented at 7:48 pm on February 20, 2025:
    s/Mark are/Mark all/

    sipa commented at 10:20 pm on February 21, 2025:
    Done.
  153. in src/txgraph.cpp:1717 in 913e14e6fd outdated
    1578+    int stage_level = m_clustersets.size() - 1;
    1579+    auto& stage = m_clustersets[stage_level];
    1580+    // Mark are removed transactions as Missing (so the stage_level locator for these transactions
    1581+    // can be reused if another staging is created).
    1582+    for (auto idx : stage.m_removed) {
    1583+        m_entries[idx].m_locator[stage_level].SetMissing();
    


    instagibbs commented at 7:50 pm on February 20, 2025:
    0        Assume(m_entries[idx].m_locator[stage_level].IsRemoved());
    1        m_entries[idx].m_locator[stage_level].SetMissing();
    

    sipa commented at 10:22 pm on February 21, 2025:
    This seems to work in fuzzing, but I’m not entirely convinced this is clearly true. Entries can exist in m_to_remove which are already removed (if they’re unlinked), so I have not taken this.

    instagibbs commented at 0:01 am on February 22, 2025:
    saw the updated comment/code in SanityCheck, makes sense
  154. in src/txgraph.cpp:1760 in 913e14e6fd outdated
    1576+{
    1577+    Assume(m_clustersets.size() > 1);
    1578+    int stage_level = m_clustersets.size() - 1;
    1579+    auto& stage = m_clustersets[stage_level];
    1580+    // Mark are removed transactions as Missing (so the stage_level locator for these transactions
    1581+    // can be reused if another staging is created).
    


    instagibbs commented at 7:54 pm on February 20, 2025:
    it’s an array of Locators, so what else could have been done here? Might be reading too much into this comment.

    sipa commented at 10:24 pm on February 21, 2025:
    The goal is just that when staging is aborted/committed, and then you start a new staging, the m_locator[1] values will all be cleared already.
  155. in src/txgraph.cpp:2057 in 07632d6ce0 outdated
    1773@@ -1755,6 +1774,11 @@ void TxGraphImpl::SanityCheck() const
    1774         if (!clusterset.m_deps_to_add.empty()) compact_possible = false;
    1775         if (!clusterset.m_to_remove.empty()) compact_possible = false;
    1776         if (!clusterset.m_removed.empty()) compact_possible = false;
    1777+
    1778+        // If m_group_data exists, its m_group_oversized must match m_oversized.
    1779+        if (clusterset.m_group_data.has_value()) {
    1780+            assert(clusterset.m_oversized == clusterset.m_group_data->m_group_oversized);
    


    instagibbs commented at 8:12 pm on February 20, 2025:
    should check that clusterset.m_oversized.has_value()?

    sipa commented at 10:24 pm on February 21, 2025:
    No, because it’s comparing an std::optional<bool> with a bool, which can only be true if the first isn’t std::nullopt.
  156. in src/txgraph.cpp:308 in fdd68be0ac outdated
    302@@ -303,6 +303,8 @@ class TxGraphImpl final : public TxGraph
    303         Locator m_locator[MAX_LEVELS];
    304         /** The chunk feerate of this transaction in main (if present in m_locator[0]). */
    305         FeePerWeight m_main_chunk_feerate;
    306+        /** The position this transaction has in the main linearization (if present). */
    


    instagibbs commented at 8:48 pm on February 20, 2025:
    This value is set to -1, sometimes, but never acted on with that value. Noting in case you have thoughts on that.

    sipa commented at 10:25 pm on February 21, 2025:
    Not really. The -1 has no meaning, but hopefully would result in a nice detectable error if it were ever used.
  157. bitcoin deleted a comment on Feb 20, 2025
  158. in src/txgraph.h:149 in 11135464c5 outdated
    148+    virtual std::vector<Ref*> GetAncestorsUnion(std::span<const Ref* const> args, bool main_only = false) noexcept = 0;
    149     /** Get pointers to all descendants of the specified transaction. If main_only is false and a
    150      *  staging graph exists, it is queried; otherwise the main graph is queried. The queried
    151      *  graph must not be oversized. Returns {} if arg does not exist in the queried graph. */
    152     virtual std::vector<Ref*> GetDescendants(const Ref& arg, bool main_only = false) noexcept = 0;
    153+    /** Like GetDescendants, but return the Refs for all transactions in the union of the provided
    


    instagibbs commented at 4:32 pm on February 21, 2025:
    nit: might be worthwhile to mention in this and elsewhere that results will be unique

    sipa commented at 10:10 pm on February 21, 2025:
    It says “(each transaction is only reported once)”, is that not what you mean?

    instagibbs commented at 11:35 pm on February 21, 2025:
    major reading comprehension issues it seems
  159. instagibbs commented at 5:07 pm on February 21, 2025: member

    Looking good 11135464c5c3aef1ce8d2823120467a522ce2c87

    I just have conceptual issues with what PostLinerize is guaranteeing us beyond what is explicitly stated in the header file.

    Pile of nits/questions otherwise

  160. sipa force-pushed on Feb 21, 2025
  161. sipa commented at 10:25 pm on February 21, 2025: member
    Addressed most comments. I will address the PostLinearize related ones later.
  162. instagibbs commented at 0:02 am on February 22, 2025: member

    verified changes, just waiting on further PostLinearization commentary

    via git range-diff master 11135464c5c3aef1ce8d2823120467a522ce2c87 d5abb86439e79d4adfbfbd46f833268bbca0bf6e

  163. sipa commented at 8:15 pm on March 3, 2025: member

    @instagibbs So… it seems that the “optimal linearization, truncate, postlinearize, split” does not actually result in minimal chunks (but it does result in feerate-diagram-optimal ones, and connected ones):

    0graph BT;T18["A: 4"];T19["B: 5"];T20["C: 7"];T20-->T18;T20-->T13;T9["D: 7"];T9-->T18;T9-->T19;T13["E: 6"];T13-->T19;
    

    The original (optimal) linearization is [ABEDC]. It’s just a single chunk, so really any ordering is equally good.

    Truncation removes C. Postlinearization turns it into [BADE]. However, a 2-chunk alternative exists: [BE,AD].

  164. sipa force-pushed on Mar 3, 2025
  165. sipa commented at 9:54 pm on March 3, 2025: member
    I have dropped the QualityLevel::NEEDS_SPLIT_OPTIMAL value, and made the “post-linearize after remove, before split” apply to NEEDS_SPLIT_ACCEPTABLE instead, with updated comments about it.
  166. instagibbs commented at 5:09 pm on March 4, 2025: member

    ACK 72a97c0a07ea6e5a95ab37c8d95e1ea02cff8e92

    With the weaker claims in the code re:PostLinearization, it works for me, and obviates the need for an additional harness test.

    git range-diff master d5abb86439e79d4adfbfbd46f833268bbca0bf6e 72a97c0a07ea6e5a95ab37c8d95e1ea02cff8e92

  167. DrahtBot requested review from jonatack on Mar 4, 2025
  168. DrahtBot requested review from ismaelsadeeq on Mar 4, 2025
  169. ismaelsadeeq approved
  170. ismaelsadeeq commented at 8:29 pm on March 6, 2025: member

    Code review ACK 72a97c0a07ea6e5a95ab37c8d95e1ea02cff8e92 🛸

    I’ve reviewed each commit and also looked at how the interface is being used by CTxMemPool in the big PR

  171. in src/cluster_linearize.h:23 in 8872339583 outdated
    18@@ -19,8 +19,8 @@
    19 
    20 namespace cluster_linearize {
    21 
    22-/** Data type to represent transaction indices in clusters. */
    23-using ClusterIndex = uint32_t;
    24+/** Data type to represent transaction indices in DepGraphs and the clusters they represent. */
    25+using DepGraphIndex = uint32_t;
    


    ajtowns commented at 4:52 am on March 14, 2025:

    This commit could be a scripted diff.

    0-BEGIN VERIFY SCRIPT-
    1sed -i 's/Data type to represent transaction indices in clusters./Data type to represent transaction indices in DepGraphs and the clusters they represent./' $(git grep -l 'using ClusterIndex')
    2sed -i 's|\<ClusterIndex\>|DepGraphIndex|g' $(git grep -l 'ClusterIndex')
    3-END VERIFY SCRIPT-
    

    sipa commented at 9:15 pm on March 19, 2025:
    Done.
  172. in src/txgraph.h:77 in e22a0b21f8 outdated
    72+    virtual ~TxGraph() = default;
    73+    /** Construct a new transaction with the specified feerate, and return a Ref to it. In all
    74+     *  further calls, only Refs created by AddTransaction() are allowed to be passed to this
    75+     *  TxGraph object (or empty Ref objects). */
    76+    [[nodiscard]] virtual Ref AddTransaction(const FeePerWeight& feerate) noexcept = 0;
    77+    /** Remove the specified transaction. This is a no-op if the transaction was already removed.
    


    ajtowns commented at 5:31 am on March 14, 2025:

    The invalidation behaviour for Refs seems unclear to me? I think the idea is that if you hold onto a Ref you can continue to uniquely refer to a removed transaction, up until the Ref is destructed, at which point the TxGraph is compacted and memory might be reused at that point. This delay then also allows for reordering to occur which can result in optimisations.

    I think the intended behaviour could be made a bit clearer in the header file though? It’s not immediately obvious to me under what circumstances the ApplyRemovals() and Compact() are/should be called.


    sipa commented at 9:19 pm on March 19, 2025:

    As far as the interface is concerned: you get a Ref object from AddTransaction, and as long as the object exists, using it remains valid (which, due to the unlinking in ~TxGraphImpl you’ve suggested I add, may now even outlive the TxGraph). In the initial commit a Ref object may only be destroyed after the transaction has been removed, but this is relaxed in “txgraph: (feature) destroying Ref means removing transaction” (currently 9a04cc18e19aa03d215ce1fb030809d7e6eb98a6).

    I’d like to keep the implementation details (like when ApplyRemovals() and Compact() are called) out of the interface definition, but other than that, happy for any suggestions that would clarify things (I’ve adjusted comments a bit, and moved them to the correct commits already).


    ajtowns commented at 0:34 am on March 20, 2025:

    Maybe it would be helpful to explain in the header that AddTransaction() creates a Ref, which you’re then expected to assign/move into your own container, which may be a subclass containing even more info about the tx, and then that serves as a permanent handle, even as the txgraph gets rearranged internally and that it’s deleted from the graph either with RemoveTransaction explicitly or implicitly by destructing the Ref? Essentially expanding the “Data type used to reference transactions within a TxGraph” comment that’s already there to be a bit more ELI5 maybe?

    If you create an empty Ref (ie, construct it directly, not via AddTransaction) is there any way to associate that Ref with a TxGraph or is it forever useless as a Ref? Would it make any sense to have an AddTransaction variant that takes a Ref& argument rather than returning a new Ref? I guess equivalently, is an empty Ref useful for anything other than tests? Maybe “Non-empty Refs can only be created using AddTx” should be more explicit and say “You want to use AddTransaction() rather than constructing a Ref directly”.


    sipa commented at 6:10 pm on March 23, 2025:

    This works:

    0TxGraph::Ref ref;
    1...
    2ref = txgraph.AddTransaction(fee, size):
    

    So think of an empty Ref as an equivalent to an std::unique_ptr to nullptr or so. It’s potentially useful just so you can have a variable declared as a member in a class or so without immediately initializing it.

    It even goes further in that this works even through subclasses:

    0class TxMempoolEntry : public TxGraph::Ref { ... }
    1
    2TxMemoolEntry entry;
    3...
    4entry = txgraph.AddTransaction(fee, size);
    
  173. in src/txgraph.cpp:168 in e22a0b21f8 outdated
    163+         *  pairs. */
    164+        std::vector<std::pair<GraphIndex, GraphIndex>> m_deps;
    165+    };
    166+
    167+    /** The vectors of clusters, one vector per quality level. ClusterSetIndex indexes into each. */
    168+    std::vector<std::unique_ptr<Cluster>> m_clusters[int(QualityLevel::NONE)];
    


    ajtowns commented at 6:11 am on March 14, 2025:
    std::array<std::vector<std::unique_ptr<Cluster>>,int{QualityLevel::NONE}> m_clusters; might read clearer?

    sipa commented at 9:19 pm on March 19, 2025:
    Done.
  174. in src/txgraph.cpp:197 in e22a0b21f8 outdated
    192+        /** Check if this Locator is present (in some Cluster). */
    193+        bool IsPresent() const noexcept { return cluster != nullptr; }
    194+    };
    195+
    196+    /** A class of objects held internally in TxGraphImpl, with information about a single
    197+     *  transaction. */
    


    ajtowns commented at 6:29 am on March 14, 2025:
    “A class of objects” doesn’t seem like a very useful description? “Internal information about each transaction in a TxGraphImpl” ?

    sipa commented at 9:20 pm on March 19, 2025:
    Done.
  175. in src/txgraph.cpp:303 in e22a0b21f8 outdated
    196+    /** A class of objects held internally in TxGraphImpl, with information about a single
    197+     *  transaction. */
    198+    struct Entry
    199+    {
    200+        /** Pointer to the corresponding Ref object if any, or nullptr if unlinked. */
    201+        Ref* m_ref{nullptr};
    


    ajtowns commented at 7:06 am on March 14, 2025:

    Having each Ref point to the TxGraph and the TxGraph point to (and potentially modify via SwapIndexes) every Ref seems very self-referential, which is probably a bit confusing and not very cache optimal and so forth…

    I think you could reduce this to a const Ref* (so that Refs wouldn’t get modified via entry.m_ref) by replacing the use of SwapIndexes on Compact() with a free list/heap of no-longer used entries. Probably not worthwhile though.


    sipa commented at 8:53 pm on March 19, 2025:

    So this is roughly the thinking that led to this design of the split between Ref and Entry.

    The most direct way of accomplishing this is just moving all data currently in Entry into Ref instead, but that requires exposing TxGraphImpl implementation details in the header. It also lacks a way to refer to entries in a stable manner internally. For example, what if a transaction has been marked for removal or dependency adding, and the Ref object is moved by the mempool code… we’d need a way to iterate over all places where a particular Ref is marked, which (I believe) needs several more cyclic references between components.

    Another extreme is to have a map from GraphIndex values (which are chosen incrementally and never reused) to Entry objects, making them truly globally and permanently unique identifiers for Refs. No rewriting is ever necessary, as there are no compactions needed, but it implies an additional allocation per transaction, plus probably (even) worse memory locality than the current design.

    I think using a freelist-like design can be seen as a variant of this, by storing the map in a support/allocators/pool.h pooled resource, and using map iterators as the graph indexes? I feel that’s probably overkill.

    The current design avoids the Entry->Ref map by allocating them continuously, but compacting at stable points, which necessitates a backreference, but that’s still cheaper than an additional map.

  176. in src/txgraph.h:18 in e22a0b21f8 outdated
    13+#define BITCOIN_TXGRAPH_H
    14+
    15+/** No connected component within TxGraph is allowed to exceed this number of transactions. */
    16+static constexpr unsigned CLUSTER_COUNT_LIMIT{64};
    17+
    18+/** Data structure to encapsulate fees, sizes, and dependencies for a set of transactions. */
    


    ajtowns commented at 10:07 am on March 15, 2025:
    Might be worth pointing out that the api is designed towards allowing for an implementation that stores the transitive closure of dependencies – when querying, if B spends C, then you can’t differentiate between A spends B and A spends B and C.

    sipa commented at 9:20 pm on March 19, 2025:
    Done.
  177. in src/txgraph.cpp:1120 in e22a0b21f8 outdated
    701+    std::sort(an_clusters.begin(), an_clusters.end());
    702+    an_clusters.erase(std::unique(an_clusters.begin(), an_clusters.end()), an_clusters.end());
    703+
    704+    // Run the union-find algorithm to to find partitions of the input Clusters which need to be
    705+    // grouped together. See https://en.wikipedia.org/wiki/Disjoint-set_data_structure.
    706+    {
    


    ajtowns commented at 11:47 am on March 16, 2025:
    Slightly surprised the “union-find algorithm” section isn’t broken out into a top level class or similar, rather than lambda functions.

    sipa commented at 8:59 pm on March 19, 2025:
    Yeah, fair point. I will think about this. It’s a bit unfortunate that there is a second union-find implementation added in #31553, but it’s so different (including different per-partition meta-data being kept) that trying to have a single implementation would be too abstract/template-heavy.

    sipa commented at 6:16 pm on March 23, 2025:

    I tried to encapsulate it in its own class, but it’s hard to give it the same performance (which is relevant in the context of huge reorgs) without making the interface rather involved (needs a “representative” type that contains a pointer to the PartitionData object, separate from the element data type), and even then, the union-find version in Trim() in 31553 would need its own separate implementation still.

    Going to leave it for now, we can think about cleanups later.

  178. in src/txgraph.cpp:2089 in e22a0b21f8 outdated
    1141+
    1142+TxGraph::Ref::~Ref()
    1143+{
    1144+    if (m_graph) {
    1145+        // Inform the TxGraph about the Ref being destroyed.
    1146+        m_graph->UnlinkRef(m_index);
    


    ajtowns commented at 4:35 pm on March 17, 2025:
    I think this gives you a segfault when aborting the program if the TxGraph is destructed prior to some of its outstanding Ref’s being destructed. It might be worthwhile having the TxGraph destructor iterate through any txs that haven’t already been unlinked and clear the m_graph pointers back to itself. Alternatively, the TxGraph destructor could issue an assertion failure if there are any linked Refs remaining.

    sipa commented at 9:00 pm on March 19, 2025:
    Done. I’ve made ~TxGraphImpl clean up all Ref::m_graph pointers.
  179. ajtowns commented at 5:30 pm on March 17, 2025: contributor

    This seems fairly difficult to review – there’s a lot of mostly implicit background in what TxGraph does and why it does it that way, as evidenced by both the description in this PR and in the review club.

    When I started typing this comment, I was wondering if maybe it would make sense to make the commits here into more of a gentle introduction to TxGraph, rather than one commit that does everything, then optimisations. But I don’t really think that approach would survive subsequent modifications to the code very well. Likewise for just adding more comments to the code.

    Alternatively, I wonder if it would be helpful both for new devs and debugging to expose the TxGraph API as a python module, possibly with additional debugging handles so that you can examine internal state? I could imagine an interactive python notebook that allows you to interact with a real TxGraph being a useful experimental tool for learning/writing mempool code and also being something that could perhaps be reasonably maintained. Here’s a rough take at what that might look like. It’s using boost-python, which seems reasonable, but doesn’t seem super friendly to pure-virtual classes (hence the tedious wrapper class) or move-only objects (hence AddTransaction being wrapped via a constructor). (EDIT: add link for “here’s”; EDIT2: converting the vector<Ref*> stuff to python made the move-only Ref stuff irrelevant; the wrapper class isn’t quite so tedious any more either really)

    Comments below are mostly a bunch of nits/observations while familiarising myself with the code, feel free to ignore.

  180. in src/txgraph.cpp:112 in 72a97c0a07 outdated
    107+    /** Given a DepGraphIndex into this Cluster, find the corresponding GraphIndex. */
    108+    GraphIndex GetClusterEntry(DepGraphIndex index) const noexcept { return m_mapping[index]; }
    109+    /** Only called by Graph::SwapIndexes. */
    110+    void UpdateMapping(DepGraphIndex cluster_idx, GraphIndex graph_idx) noexcept { m_mapping[cluster_idx] = graph_idx; }
    111+    /** Push changes to Cluster and its linearization to the TxGraphImpl Entry objects. */
    112+    void Updated(TxGraphImpl& graph) noexcept;
    


    ajtowns commented at 2:01 pm on March 18, 2025:
    Most of the Cluster methods require a TxGraphImpl to keep the quality/staging/mapping things consistent; feels a bit like it might be simpler just to have almost all this code stay in TxGraphImpl?

    sipa commented at 8:32 pm on March 19, 2025:
    See my comments below: #31363 (comment) about how the division between the two follows whether or not the function needs access to Cluster’s internals (which may become dependent on the subclass).
  181. in src/txgraph.cpp:27 in 72a97c0a07 outdated
    22+
    23+/** The maximum number of levels a TxGraph can have (0 = main, 1 = staging). */
    24+static constexpr int MAX_LEVELS{2};
    25+
    26+// Forward declare the TxGraph implementation class.
    27+class TxGraphImpl;
    


    ajtowns commented at 3:22 pm on March 18, 2025:

    I think this is a close to accurate diagram of the data structures here:

    txgraph excalidraw

    To me, the cycles in that graph seem like a code smell. The GraphIndex backrefs make sense (or are unavoidable), as some stable way to refer to each tx is needed. But, I think it would probably be better for a Cluster not to be aware of its quality level or staging level (removing the m_setindex/m_quality/m_level back links) and moving the code that would query/change those items into TxGraph instead.

    I think you could think of GraphIndex as the unique global reference to a tx in a TxGraph, in which case Ref could be viewed as an RAII view of a GraphIndex (which enforces the need for a reference to TxGraph to allow for the resource to be freed on destruction), and which also copes with GraphIndex values being updated.


    sipa commented at 8:31 pm on March 19, 2025:
    See my comment below: #31363 (comment) regarding Cluster becoming an abstract base class.

    ajtowns commented at 0:41 am on March 20, 2025:
    Sounds like this is something that can be reconsidered when we get a PR that introduces variant Clusters then. Mark as resolved for now?

    sipa commented at 11:57 pm on March 21, 2025:
    Marking resolved.
  182. in src/txgraph.cpp:240 in 72a97c0a07 outdated
    235+         *  m_group_data->m_group_oversized, but may be known even if m_group_data is not. */
    236+        std::optional<bool> m_oversized{false};
    237+    };
    238+
    239+    /** The ClusterSets in this TxGraphImpl. Has exactly 1 (main) or exactly 2 elements (main and staged). */
    240+    std::vector<ClusterSet> m_clustersets;
    


    ajtowns commented at 4:14 pm on March 18, 2025:

    ClusterSet m_main_clusterset; std::optional<ClusterSet> m_staging_clusterset; might be clearer – the vector stuff seems more confusing than helpful?

    Does it ever make sense to have multiple nested staging levels? Parallel staging levels could make sense (I want to compare three RBFs in parallel on multiple cores, having checked that they’re all independent and don’t interact with each others’ clusters) but seems probably more complicated than would be worthwhile.


    instagibbs commented at 6:23 pm on March 18, 2025:
    I ran into a somewhat contrived use-case, but I have a fairly strong YNGNI feeling of this aspect of the code. That said, I don’t think it hurts understanding too much as is.

    ismaelsadeeq commented at 6:28 pm on March 18, 2025:
    Curious to know this contrived use case, I also think it’s a bit of an overkill in a previous comment #31363#pullrequestreview-2618143580, but @sipa made a compelling argument for it #31363 (comment)

    ajtowns commented at 9:31 am on March 19, 2025:

    I think StartStaging() gets a bit simpler if you know that it can only be called with m_clustersets.size()==1:

    0 void TxGraphImpl::StartStaging() noexcept
    1 {
    2+    Assume(m_clustersets.size() == 1);  // vs < MAX_LEVELS
    3+    SplitAll(0); // vs m_clustersets.size() - 1
    4+    ApplyDependencies(0); // ditto
    5     m_clustersets.emplace_back();
    6+    auto& stage = m_clustersets[1]; // vs m_clustersets.back()
    7+    auto& main = m_clustersets[0]; // vs *(m_clustersets.rbegin() + 1);
    8     ...
    

    That’s the only use of m_clustersets.rbegin(), the remaining API of m_clustersets seems to be:

     0    class MainStageClusterSets
     1    {
     2    private:
     3        std::vector<ClusterSet> cs;
     4
     5    public:
     6        MainStageClusterSets()
     7        {
     8            cs.reserve(MAX_LEVELS);
     9            cs.emplace_back();
    10        }
    11
    12        bool UnfinishedTodo() const {
    13            for (auto& clusterset : cs) {
    14                if (!clusterset.m_deps_to_add.empty()) return true;
    15                if (!clusterset.m_to_remove.empty()) return true;
    16                if (!clusterset.m_removed.empty()) return true;
    17            }
    18            return false;
    19        }
    20
    21        const ClusterSet& operator[](size_t i) const { return cs[i]; }
    22        ClusterSet& operator[](size_t i) { return cs[i]; }
    23        ClusterSet& back() { return cs.back(); }
    24        size_t size() const { return cs.size(); }
    25
    26        void emplace_back() { cs.emplace_back(); }
    27        void pop_back() { cs.pop_back(); }
    28    };
    29    MainStageClusterSets m_clustersets;
    

    which seem pretty straightforward to switch to a main/optional-stage model. Doing some renaming would make sense if so (emplace_back/pop_back to add/remove_staging; back to work maybe; size to has_staging maybe; maybe a templated ForEach() accepting a lambda taking a ClusterSet for iteration).


    sipa commented at 9:22 pm on March 19, 2025:
    I’ve done something like this; std::vector<ClusterSet> is gone, and the code reduced by a whopping 23 lines as a result.
  183. in src/txgraph.cpp:1887 in 72a97c0a07 outdated
    1882+        // Check that the Entry has a locator pointing back to this Cluster & position within it.
    1883+        assert(entry.m_locator[level].cluster == this);
    1884+        assert(entry.m_locator[level].index == lin_pos);
    1885+        // For top-level entries, check linearization position and chunk feerate.
    1886+        if (level == 0 && IsAcceptable()) {
    1887+            assert(entry.m_main_lin_index == linindex++);
    


    ajtowns commented at 4:56 pm on March 18, 2025:
    Avoid side-effects inside asserts if possible?

    sipa commented at 9:22 pm on March 19, 2025:
    Done.
  184. ajtowns commented at 6:03 pm on March 18, 2025: contributor

    Preliminary ACK 72a97c0a07ea6e5a95ab37c8d95e1ea02cff8e92

    I think it’s okay to merge this code:

    • as of this PR, it isn’t actually used, so should not introduce bugs immediately
    • the fuzz testing is pretty good, so there should not be any catastrophic bugs even if it were used
    • while maybe the design could be improved, that can be done incrementally after it’s merged
    • I’m reasonably confident I understand what’s going on and can help find/fix bugs/problems if they occur
    • I think the design and implementation are a good step forward for implementing cluster mempool

    On the other hand, calling this “preliminary” as:

    • I’m not yet confident that I could maintain this code if its authors all suddenly vanished
    • I’m a bit concerned that to contribute to mempool code you might need to be a triple expert – C++, bitcoin, and industrial optimisation theory; not a show stopper, but as further PRs progress I’d like to gain confidence either that the optimisation parts can be safely treated as a very stable black box, or (ideally and) that we can reasonably onboard people to understand how the internals of the black box work.
  185. sipa commented at 7:12 pm on March 18, 2025: member

    @ajtowns Responding to some of the more conceptual comments

    • I’m indeed not entirely happy with the cycles between TxGraphImpl and Cluster, though there is a design concern that isn’t really materialized in the current code yet: Cluster may become a virtual class with multiple implementations for the purpose of memory usage (e.g., there could be a specialized SingletonCluster which just stores a single transaction, and no DepGraph, linearization, …, avoiding several vectors and their allocation overhead). The current code is designed to accommodate that, by putting code in Cluster member functions that need to have access to the internals, while having the generic code in TxGraphImpl. It’s still not trivial to introduce, because cluster merging/splitting needs to somehow work for all combination of cluster types, but to the extent I can say without actually having implemented it, I think it’s fairly close to that. I’m happy to hear about alternative approaches.

    • The above relates somewhat to the m_setindex/m_quality/m_level not feeling like they deserve to be dealt with in Cluster: indeed, they’re per-Cluster data, but not part of the “cluster implementation specific” code, so I was imagining these (the variables, and functions relating to them) to be part of the abstract base class for Cluster, and not in the Cluster derived classes. It would be possible to move the related functions to TxGraphImpl instead, but if this virtual Cluster design appears, maybe having it in the Cluster base class is actually more clear anyway?

    • Regarding Assume / Assert / assert. It’s true that semantically speaking, Assume() is always evaluated (so it’s fine to have side effects inside). However, it’s also the case that if the compiler can prove a statement is side-effect free, it can optimize the call away in non-debug builds. So I’m using them as a form of cheap assertion that can be used extensively throughout the codebase, most of which without runtime impact in production, compiler-proven to have same side-effects in debug and release builds. Combined with (fuzz) tests with high branch coverage, I believe this gives you the best of both worlds for non-critical assertions that still help reviewers reason about the code (as a “ah, an Assume(), I can be reasonably assured that in fuzz-test-reachable-scenarios, this condition is actually always true”).

    • Regarding the destruction of Ref objects and removing/unlinking. Note that the behavior changes: in the initial commit you can only destroy Ref objects whose transactions have already been RemoveTransaction()d, so that there is never a need for an inspector function to refer back to it anymore. In a later commit, Ref destroying is changed to automatically trigger a removal in the linked TxGraph if any (in both main and staging, which RemoveTransaction can’t do, as it only works on the top level).

      The current design tries very hard to make sure that under all circumstances, TxGraph is internally consistent (including when destroying non-removed Ref objects, which necessitates the m_graph pointer inside), but the responsibility to make sure the graph is externally consistent with the mempool lies outside. However, it doesn’t actually achieve this design entirely when considering locking: one could accidentally destroy a Ref without holding whatever lock protects the TxGraph object, for example, and if we need to restrict to only supporting “sane” behavior, maybe an alternative design is possible, where instead of TxGraph::Ref::~Ref, we have a TxGraph::WipeTransaction(Ref&) function (which does removing/unlinking/compacting as needed), and the actual Ref destructor then just asserts(false); if not wiped first? This would also remove the need for having an m_graph inside a Ref (or making it debug-only).

    • Regarding having a ClusterSet m_main_clusterset; std::optional<ClusterSet> m_staging_clusterset instead of a vector of ClusterSets (also tagging @instagibbs and @ismaelsadeeq who have commented about this). There is absolutely no intention of trying to accommodate multiple nested (or parallel) levels, and the design isn’t aiming for that. I just found it more elegant to largely avoid treating main and staging as separate concepts, and instead see them as two copies of the same thing. I haven’t tried the alternative, and there is probably a bunch of “interact with level below” that can become a bit more simple/specialized, but maybe also more logic for picking between m_main_clusterset and m_staging_clusterset (which aren’t even the same type). If you all think it’s worth trying, I’m happy to look into it.

  186. ajtowns commented at 12:10 pm on March 19, 2025: contributor
    • However, it’s also the case that if the compiler can prove a statement is side-effect free, it can optimize the call away in non-debug builds.

    That makes sense. Might be a good idea to update doc/developer-notes.md pointing out that use case? It currently says:

    • Assume should be used to document assumptions when program execution can safely continue even if the assumption is violated.
  187. sipa force-pushed on Mar 19, 2025
  188. sipa force-pushed on Mar 19, 2025
  189. DrahtBot added the label CI failed on Mar 19, 2025
  190. DrahtBot commented at 9:08 pm on March 19, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/39067477614

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  191. sipa force-pushed on Mar 19, 2025
  192. sipa commented at 9:34 pm on March 19, 2025: member

    Big change: std::vector<ClusterSet> is gone, replaced by m_main_clusterset and m_staging_clusterset.

    • New helper functions GetTopLevel(), GetSpecifiedLevel(), GetClusterSet() made this fairly easy to do.
    • Simplifications resulted in ClearLocator, GetConflicts, PullIn, MakeTransactionsMissing, StartStaging, CommitStaging, ExtractCluster.
    • CopyTo was renamed to CopyToStaging.
    • LevelDown was renamed to MoveToMain.
    • MakeTransactionsMissing was renamed to MakeStagingTransactionsMissing.

    Other changes:

    • Various comments
    • Ref objects are now allowed to outlive TxGraph.
    • GetChunkFeerate (and its testing, plus Entry::m_chunk_feerate maintenance) was moved to a separate commit.
  193. DrahtBot removed the label CI failed on Mar 20, 2025
  194. in src/txgraph.h:117 in 63b8f96e67 outdated
    128+     *  configured maximum cluster count). If main_only is false and a staging graph exists, it is
    129+     *  queried; otherwise the main graph is queried. Some of the functions below are not available
    130+     *  for oversized graphs. The mutators above are always available. Removing a transaction by
    131+     *  destroying its Ref while staging exists will not clear main's oversizedness until staging
    132+     *  is aborted or committed. */
    133+    virtual bool IsOversized(bool main_only = false) noexcept = 0;
    


    ajtowns commented at 6:50 am on March 20, 2025:

    If I’m understanding right, this seem like it makes IsOversized() largely unfixable by the client – so the only reasonable strategy I can see is to do:

    0    StartStaging();
    1    // change transactions
    2    if (IsOversized() || otherwise_undesirable()) {
    3        AbortStaging();
    4    } else {
    5        CommitStaging();`
    6    }
    

    That’s a very reasonable strategy of course, so this isn’t a complaint! Just that if so, it seems like it should be documented a bit more clearly? I wonder a bit if CommitStaging() should have an Assume(G_FUZZING || !m_oversized) or something…

    (In particular, if you ever end up with the main graph being oversized, I don’t see how you can even discover which txs are contributing to the oversized-ness in order to remove them – GetCluster(ref) will fail the m_deps_to_add.empty() check I think, so you can’t use that, and if you do you’ll just get a subset of the actual cluster that is within limits; same with GetDescendants)


    ajtowns commented at 7:08 am on March 20, 2025:

    see also the TODO item

    • Ability to inspect the would-be-constructed clusters (so they can be “fixed” if they violate policy rules, as opposed to just accepted/rejected, before applying - this is needed for reorgs which are not optional).

    sipa commented at 7:42 am on March 20, 2025:

    See #31553, which adds a TxGraph::Trim function that removes some subset of transactions + their descendants, such that the result is no longer oversized, in O(n log n) time.

    When I started this PR, the idea was to expose ways to inspect would-be-oversized clusters, so they could be fixed by the user, before committing.

    It turned out that really nothing beats an internal “figure it out, and fix it” method in terms of efficiency and convenience, so that’s what 31553 does. It’s a bit of a design break compared to the rest of TxGraph, which lets the user make all decisions about the content of the graph, but it’s very practical.

    I’ve fixed the TODO here to reflect this.


    ajtowns commented at 8:38 am on March 20, 2025:
    :+1: I think it makes sense for TxGraph to take care of all the topology checks and feerate comparison decisions, which includes finding the best txs (block building), finding the worst txs (eviction) and finding good subsets when limits would otherwise be exceeded (reorgs, “rbf”/“carve out” cases maybe). So this doesn’t feel like a design break at all to me, but rather pretty much exactly parallel to GetWorstMainChunk() – “here’s a bunch of txs that you want to remove”.

    instagibbs commented at 1:40 pm on March 20, 2025:

    so the only reasonable strategy I can see is to do:

    The other reason to not follow this strategy, aside from reorgs, is “kindred eviction”, where the caller is in charge of gathering sufficient information try to get back to an un-oversized state in the staging area, using non-oversized main stage information as a guiding tool.


    sipa commented at 0:02 am on March 22, 2025:

    @ajtowns That’s fair; I liked to think of TxGraph as something with well-specified behavior, but there are several instances of unspecified/best-effort behavior already:

    • The cluster linearizations used (as no optimality is guaranteed)
    • Reordering of tx removal / dep adding due to lazy batching
    • Tie-breaking of order of same-chunk-feerate transactions from different clusters
    • The exact logic Trim uses

    Going to mark this resolved.

  195. ajtowns commented at 7:10 am on March 20, 2025: contributor
    (conflicts with just merged #31519, Span to std::span)
  196. in src/txgraph.cpp:1015 in 63b8f96e67 outdated
    1010+    if (!m_main_clusterset.m_to_remove.empty()) return;
    1011+    if (!m_main_clusterset.m_removed.empty()) return;
    1012+    if (m_staging_clusterset.has_value()) {
    1013+        if (!m_staging_clusterset->m_deps_to_add.empty()) return;
    1014+        if (!m_staging_clusterset->m_to_remove.empty()) return;
    1015+        if (!m_staging_clusterset->m_removed.empty()) return;
    


    ajtowns commented at 8:46 am on March 20, 2025:

    Could tidy this up:

    0    if (!m_main_clusterset.empty()) return;
    1    if (m_staging_clusterset.has_value() && !m_staging_clusterset->empty()) return;
    2
    3bool ClusterSet::empty() {
    4    return m_deps_to_add.empty() && m_to_remove.empty() && m_removed.empty();
    5}
    

    sipa commented at 11:58 pm on March 21, 2025:
    Good idea. Done through a ClusterSet::HoldsGraphIndexes() function.
  197. ajtowns commented at 9:01 am on March 20, 2025: contributor

    ReACK 63b8f96e67f9ad649070a231532d48fb6c3573e4 . Needs a rebase anyway though.

    The main/staging changes read better to me fwiw. The level stuff in the locators could be ungeneralised as well I suppose, but that doesn’t seem likely to be much of an improvement.

  198. DrahtBot requested review from instagibbs on Mar 20, 2025
  199. DrahtBot requested review from ismaelsadeeq on Mar 20, 2025
  200. DrahtBot added the label Needs rebase on Mar 20, 2025
  201. in src/txgraph.cpp:199 in 101a8ee328 outdated
    171-        std::vector<std::pair<GraphIndex, GraphIndex>> m_deps;
    172+        /** Where the clusters to be merged start in m_group_clusters. */
    173+        uint32_t m_cluster_offset;
    174+        /** How many clusters to merge. */
    175+        uint32_t m_cluster_count;
    176+        /** Where the dependencies for this cluster group in m_deps_to_add start. */
    


    ismaelsadeeq commented at 10:27 am on March 20, 2025:

    In 101a8ee3280e50c3272a80939b46a67faca838e4 “txgraph: (optimization) avoid per-group vectors for clusters & dependencies”

    0        /** Where the dependencies for this cluster group in ClusterSet::m_deps_to_add start. */
    

    sipa commented at 11:58 pm on March 21, 2025:
    Done.
  202. DrahtBot requested review from ismaelsadeeq on Mar 20, 2025
  203. ismaelsadeeq commented at 11:23 am on March 20, 2025: member

    That makes sense. Might be a good idea to update doc/developer-notes.md pointing out that use case? It currently says:

    Done in #32100

  204. sipa force-pushed on Mar 20, 2025
  205. sipa commented at 1:39 pm on March 20, 2025: member
    Rebased after the merge of #31519.
  206. DrahtBot added the label CI failed on Mar 20, 2025
  207. DrahtBot commented at 1:41 pm on March 20, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/39111521595

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  208. sipa force-pushed on Mar 20, 2025
  209. DrahtBot removed the label Needs rebase on Mar 20, 2025
  210. instagibbs commented at 2:55 pm on March 20, 2025: member

    reviewed via git range-diff master 72a97c0a07ea6e5a95ab37c8d95e1ea02cff8e92 1601906941fa559ebbee7898453fa77f4606ad38

    Ref objects are now allowed to outlive TxGraph

    Can this get surfaced in the fuzz coverage with something that would previously fail?

  211. DrahtBot removed the label CI failed on Mar 20, 2025
  212. ajtowns commented at 7:56 am on March 21, 2025: contributor

    Can this get surfaced in the fuzz coverage with something that would previously fail?

    As long as you actually run the fuzz binary with your changes, and not the fuzz binary in the old location that you compiled a few days ago (cough), the following should work:

    0--- a/src/test/fuzz/txgraph.cpp
    1+++ b/src/test/fuzz/txgraph.cpp
    2@@ -723,4 +723,6 @@ FUZZ_TARGET(txgraph)
    3
    4     // Sanity check again (because invoking inspectors may modify internal unobservable state).
    5     real->SanityCheck();
    6+    real.reset(); // kill the txgraph
    7+    sims.clear(); // clear the sim container and any Refs
    8 }
    

    Dropping the TxGraphImpl destructor body then causes segfaults (rather than a nice assertion failure).

  213. ajtowns commented at 4:02 pm on March 21, 2025: contributor
    ReACK 1601906941fa559ebbee7898453fa77f4606ad38
  214. instagibbs commented at 4:30 pm on March 21, 2025: member

    reACK 1601906941fa559ebbee7898453fa77f4606ad38

    Dropping the TxGraphImpl destructor body then causes segfaults (rather than a nice assertion failure).

    Right, I got myself turned around on this thinking it was harder to achieve. I’d suggest adding coverage here or in a quick follow-up.

  215. sipa force-pushed on Mar 22, 2025
  216. sipa commented at 0:06 am on March 22, 2025: member

    Changes:

    • Did a big pass through the interface documentation in txgraph.h
    • Moved the support for Refs outliving TxGraph to a separate commit, and added the test suggested in #31363 (comment).
    • Some smaller things listed below:
  217. DrahtBot added the label CI failed on Mar 22, 2025
  218. DrahtBot commented at 1:22 am on March 22, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/39212798483

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  219. sipa force-pushed on Mar 22, 2025
  220. DrahtBot removed the label CI failed on Mar 22, 2025
  221. sipa force-pushed on Mar 23, 2025
  222. in src/txgraph.cpp:2076 in 4d1e491c88 outdated
    1956@@ -1942,6 +1957,13 @@ void TxGraphImpl::SanityCheck() const
    1957     }
    1958 }
    1959 
    1960+void TxGraphImpl::DoWork() noexcept
    


    ismaelsadeeq commented at 11:14 am on March 24, 2025:

    As it is right now, it’s unused in the main PR.

    I’ve been thinking when DoWork can be called in the mempool lifecycle and whether that would make a difference because everything is done lazily and we only apply staged changes when we want to get the current state. (specifically a scenario where after some lazy mutations it’s cheap to just call DoWork and not wait until we want to get a state.)

    Maybe after trimming due to a reorg? or after multiple additions of transactions e.g loading new mempool.


    sipa commented at 2:08 pm on March 24, 2025:
    I believe the plan is to basically run it at every “stable” point in time, so after any transaction or block processing.
  223. ismaelsadeeq commented at 11:54 am on March 24, 2025: member

    re-ACK 41b4434fed169570ce0976c6e58db0d4a9614aaa after #31363#pullrequestreview-2707582068

    The new doc is better, and it’s explicit that ref’s can outlive their txgraph.

  224. DrahtBot requested review from ajtowns on Mar 24, 2025
  225. in src/txgraph.h:85 in bf8f04b11d outdated
    80+    /** Determine whether arg exists in this graph (i.e., was not removed). */
    81+    virtual bool Exists(const Ref& arg) noexcept = 0;
    82+    /** Get the individual transaction feerate of transaction arg. Returns the empty FeePerWeight
    83+     *  if arg does not exist. */
    84+    virtual FeePerWeight GetIndividualFeerate(const Ref& arg) noexcept = 0;
    85+    /** Get pointers to all transactions in the which arg is in. The transactions will be returned
    


    ajtowns commented at 12:14 pm on March 24, 2025:
    Missing the word “cluster” (fixed in a later commit)

    sipa commented at 2:06 pm on March 24, 2025:
    Fixed.
  226. in src/txgraph.h:92 in bf8f04b11d outdated
    87+    virtual std::vector<Ref*> GetCluster(const Ref& arg) noexcept = 0;
    88+    /** Get pointers to all ancestors of the specified transaction, in unspecified order. Returns
    89+     *  {} if arg does not exist in the graph. */
    90+    virtual std::vector<Ref*> GetAncestors(const Ref& arg) noexcept = 0;
    91+    /** Get pointers to all descendants of the specified transaction, in unspecified order. Returns
    92+     *  {} if arg does not exist in the graph. */
    


    ajtowns commented at 12:17 pm on March 24, 2025:
    Maybe clarify that each tx is included in its own list of ancestors/descendants?

    sipa commented at 2:06 pm on March 24, 2025:
    Done.
  227. in src/txgraph.h:25 in bf8f04b11d outdated
    20+ * The connected components within the transaction graph are called clusters: whenever one
    21+ * transaction is reachable from another, through any sequence of is-parent-of or is-child-of
    22+ * relations, they belong to the same cluster (so clusters include parents, children, but also
    23+ * grandparents, siblings, cousins twice removed, ...).
    24+ *
    25+ * TxGraph implicitly maintains an associated total ordering on its transactions (its linearization)
    


    ajtowns commented at 12:52 pm on March 24, 2025:
    I don’t think this is strictly true until the chunk indexes are added in #31444? Prior to that, it’s possible to obtain such an ordering, but only by requesting all the clusters, breaking them into chunks manually (via chunk feerate calls on each tx in a cluster), then sorting the chunks.

    sipa commented at 2:07 pm on March 24, 2025:
    I’ve replaced “maintains” with “defines”. As far as the interface is concerned, what the actual implementation does in this regard is irrelevant.
  228. in src/txgraph.h:121 in 41b4434fed outdated
    116+     *  is aborted or committed. */
    117+    virtual bool IsOversized(bool main_only = false) noexcept = 0;
    118+    /** Get the feerate of the chunk which transaction arg is in the main graph. Returns the empty
    119+     *  FeePerWeight if arg does not exist in the main graph. The main graph must not be
    120+     *  oversized. */
    121+    virtual bool Exists(const Ref& arg, bool main_only = false) noexcept = 0;
    


    ajtowns commented at 1:02 pm on March 24, 2025:
    Looks like comments for Exists/GetMainChunkFeerate/GetIndividualFeerate have been mixed up (in the “Add staging support” commit, I think)

    sipa commented at 2:07 pm on March 24, 2025:
    Fixed.
  229. ajtowns changes_requested
  230. ajtowns commented at 1:07 pm on March 24, 2025: contributor

    range-diff review of 41b4434fed169570ce0976c6e58db0d4a9614aaa

    Good to see it’s now explicit which calls you can make on an oversized txgraph.

  231. DrahtBot requested review from ajtowns on Mar 24, 2025
  232. clusterlin: Add FixLinearization function + fuzz test
    This function takes an existing ordering for transactions in a DepGraph, and
    makes it a valid linearization for it (i.e., topological). Any topological
    prefix of the input remains untouched.
    0aa874a357
  233. clusterlin: Make IsAcyclic() a DepGraph member function
    ... instead of being a separate test-only function.
    
    Also add a fuzz test for it returning false.
    bfeb69f6e0
  234. scripted-diff: (refactor) ClusterIndex -> DepGraphIndex
    Since cluster_linearize.h does not actually have a Cluster type anymore, it is more
    appropriate to rename the index type to DepGraphIndex.
    
    -BEGIN VERIFY SCRIPT-
    sed -i 's/Data type to represent transaction indices in clusters./Data type to represent transaction indices in DepGraphs and the clusters they represent./' $(git grep -l 'using ClusterIndex')
    sed -i 's|\<ClusterIndex\>|DepGraphIndex|g' $(git grep -l 'ClusterIndex')
    -END VERIFY SCRIPT-
    d449773899
  235. feefrac: Introduce tagged wrappers to distinguish vsize/WU rates 6eab3b2d73
  236. txgraph: Add initial version (feature)
    This adds an initial version of the txgraph module, with the TxGraph class.
    It encapsulates knowledge about the fees, sizes, and dependencies between all
    mempool transactions, but nothing else.
    
    In particular, it lacks knowledge about txids, inputs, outputs, CTransactions,
    ... and so forth. Instead, it exposes a generic TxGraph::Ref type to reference
    nodes in the TxGraph, which can be passed around and stored by layers on top.
    8ad3ed2681
  237. txgraph: Add simulation fuzz test (tests)
    This adds a simulation fuzz test for txgraph, by comparing with a naive
    reimplementation that models the entire graph as a single DepGraph, and
    clusters in TxGraph as connected components within that DepGraph.
    05abf336f9
  238. txgraph: Add internal sanity check function (tests)
    To make testing more powerful, expose a function to perform an internal sanity
    check on the state of a TxGraph. This is especially important as TxGraphImpl
    contains many redundantly represented pieces of information:
    
    * graph contains clusters, which refer to entries, but the entries refer back
    * graph maintains pointers to Ref objects, which point back to the graph.
    
    This lets us make sure they are always in sync.
    ee57e93099
  239. txgraph: Avoid per-group vectors for clusters & dependencies (optimization)
    Instead construct a single vector with the list of all clusters in all groups,
    and then store per-group offset/range in that list.
    
    For dependencies, reuse m_deps_to_add, and store offset/range into that.
    c80aecc24d
  240. txgraph: Add GetChunkFeerate function (feature)
    This adds a function to query the chunk feerate of a transaction, by caching it
    inside the Entry objects.
    1d27b74c8e
  241. txgraph: Make max cluster count configurable and "oversize" state (feature)
    Instead of leaving the responsibility on higher layers to guarantee that
    no connected component within TxGraph (a barely exposed concept, except through
    GetCluster()) exceeds the cluster count limit, move this responsibility to
    TxGraph itself:
    * TxGraph retains a cluster count limit, but it becomes configurable at construction
      time (this primarily helps with testing that it is properly enforced).
    * It is always allowed to perform mutators on TxGraph, even if they would cause the
      cluster count limit to be exceeded. Instead, TxGraph exposes an IsOversized()
      function, which queries whether it is in a special "oversize" state.
    * During oversize state, many inspectors are unavailable, but mutators remain valid,
      so the higher layer can "fix" the oversize state before continuing.
    64f69ec8c3
  242. txgraph: Avoid representative lookup for each dependency (optimization)
    The m_deps_to_add vector is sorted by child Cluster*, which matches the
    order of an_clusters. This means we can walk through m_deps_to_add while
    doing the representative lookups for an_clusters, and reuse them.
    1171953ac6
  243. txgraph: Avoid looking up the same child cluster repeatedly (optimization)
    Since m_deps_to_add has been sorted by child Cluster* already, all dependencies
    with the same child will be processed consecutively. Take advantage of this by
    remember the last partition merged with, and reusing that if applicable.
    57f5499882
  244. txgraph: Delay chunking while sub-acceptable (optimization)
    Chunk-based information (primarily, chunk feerates) are never accessed without
    first bringing the relevant Clusters to an "acceptable" quality level. Thus,
    while operations are ongoing and Clusters are not acceptable, we can omit
    computing the chunkings and chunk feerates for Clusters.
    5801e0fb2b
  245. txgraph: Special-case removal of tail of cluster (Optimization)
    When transactions are removed from the tail of a cluster, we know the existing
    linearization remains acceptable (if it already was), but may just need splitting
    and postlinearization, so special case these into separate quality levels.
    36dd5edca5
  246. txgraph: Group per-graph data in ClusterSet (refactor)
    This is a preparation for a next commit where a TxGraph will start representing
    potentially two distinct graphs (a main one, and a staging one with proposed
    changes).
    34aa3da5ad
  247. txgraph: Abstract out ClearLocator (refactor)
    Move a number of related modifications to TxGraphImpl into a separate
    function for removal of transactions. This is preparation for a later
    commit where this will be useful in more than one place.
    c99c7300b4
  248. txgraph: Add staging support (feature)
    In order to make it easy to evaluate proposed changes to a TxGraph, introduce a
    "staging" mode, where mutators (AddTransaction, AddDependency, RemoveTransaction)
    do not modify the actual graph, but just a staging version of it. That staging
    graph can then be commited (replacing the main one with it), or aborted (discarding
    the staging).
    8c70688965
  249. txgraph: Cache oversizedness of graphs (optimization) 6b037ceddf
  250. txgraph: Destroying Ref means removing transaction (feature)
    Before this commit, if a TxGraph::Ref object is destroyed, it becomes impossible
    to refer to, but the actual corresponding transaction node in the TxGraph remains,
    and remains indefinitely as there is no way to remove it.
    
    Fix this by making the destruction of TxGraph::Ref trigger immediate removal of
    the corresponding transaction in TxGraph, both in main and staging if it exists.
    82fa3573e1
  251. txgraph: Allow Refs to outlive the TxGraph (feature) 22c68cd153
  252. txgraph: Expose ability to compare transactions (feature)
    In order to make it possible for higher layers to compare transaction quality
    (ordering within the implicit total ordering on the mempool), expose a comparison
    function and test it.
    295a1ca8bb
  253. txgraph: Add DoWork function (feature)
    This can be called when the caller has time to spend now, and wants future operations
    to be fast.
    b685d322c9
  254. txgraph: Add CountDistinctClusters function (feature) aded047019
  255. txgraph: Multiple inputs to Get{Ancestors,Descendant}Refs (preparation)
    This is a preparation for the next commit, which adds a feature to request
    the Refs to multiple ancestors/descendants at once.
    54bceddd3a
  256. txgraph: Add Get{Ancestors,Descendants}Union functions (feature)
    Like GetAncestors and GetDescendants, but for the union of multiple inputs.
    b2ea365648
  257. sipa force-pushed on Mar 24, 2025
  258. instagibbs commented at 2:59 pm on March 24, 2025: member

    reACK b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f

    nice doc improvements

    git range-diff master 1601906941fa559ebbee7898453fa77f4606ad38 b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f

  259. DrahtBot requested review from ismaelsadeeq on Mar 24, 2025
  260. in src/txgraph.cpp:882 in c80aecc24d outdated
    878@@ -857,22 +879,27 @@ void TxGraphImpl::ApplyDependencies() noexcept
    879     if (m_deps_to_add.empty()) return;
    880 
    881     // For each group of to-be-merged Clusters.
    882-    for (auto& group_data : *m_group_data) {
    883+    for (const auto& group_data : m_group_data->m_groups) {
    


    glozow commented at 6:36 pm on March 25, 2025:

    nit c80aecc24ddd878c62be9753a2746e36860e3a97:

    now that this variable has type GroupEntry instead of GroupData, it would be more clear to name it something like group or group_entry


    sipa commented at 3:04 am on March 27, 2025:
    Done in #32151
  261. in src/txgraph.cpp:71 in 8c70688965 outdated
    67@@ -65,14 +68,16 @@ class Cluster
    68     QualityLevel m_quality{QualityLevel::NONE};
    69     /** Which position this Cluster has in Graph::ClusterSet::m_clusters[m_quality]. */
    70     ClusterSetIndex m_setindex{ClusterSetIndex(-1)};
    71+    /** Which level this Cluster is at in the graph (-1=not inserted, 0=main, 1=staging). */
    


    glozow commented at 7:13 pm on March 25, 2025:

    nit in 8c70688965bc4038f28f41e4490180e40a88b5ee

    It could be worth assigning constants MAIN_LEVEL{0} and STAGING_LEVEL{1} to replace the 0s and 1s in the file?


    sipa commented at 3:04 am on March 27, 2025:
    Done in #32151
  262. sipa commented at 7:24 pm on March 25, 2025: member

    It occurs to me that it may be possible to get rid of TxGraphImpl::Entry::m_main_lin_index, by storing a transaction linearization position rather than its DepGraphIndex inside the TxGraphImpl::Locator. So instead of using the DepGraphIndex as primary way to identify a transaction within a cluster, make the linearization index the primary, and look up the other one through m_linearization when needed. It’s not clear to me if it’s better to leave m_mapping be indexed by DepGraphIndex or linearization index, but I think it can be done.

    Anyway, after looking into it, I don’t think it’s worth changing at this stage in the PR, but maybe it can be done as a follow-up.

  263. in src/txgraph.cpp:341 in 8c70688965 outdated
    343-    void SetClusterQuality(QualityLevel old_quality, ClusterSetIndex old_index, QualityLevel new_quality) noexcept;
    344-    /** Make a transaction not exist. It must currently exist. */
    345-    void ClearLocator(GraphIndex index) noexcept;
    346+    void SetClusterQuality(int level, QualityLevel old_quality, ClusterSetIndex old_index, QualityLevel new_quality) noexcept;
    347+    /** Get the index of the top level ClusterSet (staging if it exists, main otherwise). */
    348+    int GetTopLevel() const noexcept { return m_staging_clusterset.has_value(); }
    


    glozow commented at 7:51 pm on March 25, 2025:
    Some parts of the file seem to suggest a max_level-agnostic implementation (e.g. looping on level from 0 to GetTopLevel()), but this casting from a bool to an int suggests it has to be 2. I think it would be useful to have more levels (e.g. for package test acceptance) in the future, but perhaps a static_assert(MAX_LEVELS == 2) might be good documentation here?

    sipa commented at 3:04 am on March 27, 2025:
    Done in #32151
  264. ajtowns commented at 6:25 pm on March 26, 2025: contributor
    reACK b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f
  265. ismaelsadeeq commented at 8:11 pm on March 26, 2025: member
    reACK b2ea3656481b4196acaf6a1b5f3949a9ba4cf48f 🚀
  266. in src/txgraph.cpp:1368 in b2ea365648
    1363+    // that the chunks of the resulting linearization are all connected.
    1364+    if (!optimal) PostLinearize(m_depgraph, linearization);
    1365+    // Update the linearization.
    1366+    m_linearization = std::move(linearization);
    1367+    // Update the Cluster's quality.
    1368+    auto new_quality = optimal ? QualityLevel::OPTIMAL : QualityLevel::ACCEPTABLE;
    


    glozow commented at 8:34 pm on March 26, 2025:
    fwiw, I don’t think there is test coverage for this (i.e. fine if this is always set to optimal), though I expect any problems would become more obvious when mempool things are hooked up to txgraph

    sipa commented at 3:05 am on March 27, 2025:

    Right. The difference between OPTIMAL and ACCEPTABLE is not observable as far as the interface is concerned, because it only promises a best effort.

    There is an operational difference, though: DoWork() will not bother putting more effort into OPTIMAL clusters.

  267. glozow commented at 9:06 pm on March 26, 2025: member

    ACK b2ea3656481

    Reviewed the structure to convince myself this approach makes sense to support the features required, and reviewed the SimTxGraph + differential fuzzer in closer detail. I haven’t read through all the TxGraphImpl functions but did some manual mutation testing.

  268. in src/test/fuzz/txgraph.cpp:583 in b2ea365648
    578+                        for (auto i : component) {
    579+                            component |= sel_sim.graph.Ancestors(i);
    580+                            component |= sel_sim.graph.Descendants(i);
    581+                        }
    582+                        if (component == old_component) break;
    583+                    }
    


    glozow commented at 9:08 pm on March 26, 2025:
    Why can’t FindConnectedComponent be used instead?

    sipa commented at 3:06 am on March 27, 2025:
    That finds a connected component, but not the one that this particular transaction is in. I guess it could be refactored to support that too.

    sipa commented at 3:22 pm on March 27, 2025:
    Done in #32151.
  269. in src/test/fuzz/txgraph.cpp:561 in b2ea365648
    556+                refs.resize(count);
    557+                for (size_t i = 0; i < count; ++i) {
    558+                    refs[i] = pick_fn();
    559+                }
    560+                // Their order should not matter, shuffle them.
    561+                std::shuffle(refs.begin(), refs.end(), rng);
    


    glozow commented at 9:09 pm on March 26, 2025:
    Is this necessary given that the fuzzer already did pick_fn randomly?

    sipa commented at 3:08 am on March 27, 2025:
    Well, pick_fn picks based on fuzz input, which is arbitrary, but definitely not random. There is a point to be made that we should leave it up to the fuzzer entirely here, but I felt like adding actual random ordering might more easily cover strange orderings. Probably doesn’t add much, but also does not hurt much.
  270. in src/cluster_linearize.h:1359 in 0aa874a357 outdated
    1354+        SetType place_before = done & depgraph.Ancestors(elem);
    1355+        // Find which position to place elem in (updating j), continuously moving the elements
    1356+        // in between forward.
    1357+        while (place_before.Any()) {
    1358+            // j cannot be 0 here; if it was, then there was necessarily nothing earlier which
    1359+            // elem needs to be place before anymore, and place_before would be empty.
    


    glozow commented at 9:10 pm on March 26, 2025:

    nit 0aa874a357865dd4768091f26dff238e66fb8d83

    0            // elem needs to be placed before anymore, and place_before would be empty.
    

    sipa commented at 3:08 am on March 27, 2025:
    Fixed in #32151
  271. in src/test/fuzz/txgraph.cpp:338 in b2ea365648
    333+                    if (top_sim.graph.Ancestors(pos_par)[pos_chl]) break;
    334+                }
    335+                top_sim.AddDependency(par, chl);
    336+                real->AddDependency(*par, *chl);
    337+                break;
    338+            } else if (top_sim.removed.size() < 100 && command-- == 0) {
    


    glozow commented at 9:11 pm on March 26, 2025:

    in 05abf336f997f477c6f48412809ab540fccf1cb0

    I suppose this stops us from making removed larger than MAX_TRANSACTIONS + 100?


    sipa commented at 3:10 am on March 27, 2025:
    Where does the MAX_TRANSACTIONS come from? It should limit the size of removed to 100. This was added when the fuzzer ran for much longer (up to 1000 iterations, I think), and I wanted to avoid it just creating and removing transactions. It currently doesn’t do much anymore with the lower iteration count.

    glozow commented at 2:31 pm on March 27, 2025:
    I was thinking that if all the transactions are descended from this tx, you’d add all of them to to_remove here. So if you started out with 99, you can end up with a lot more than 100 afterward. You wouldn’t do more removals afterward, but you’d have more than 100.

    sipa commented at 2:33 pm on March 27, 2025:
    Ah, yes, indeed!
  272. glozow merged this on Mar 26, 2025
  273. glozow closed this on Mar 26, 2025

  274. fanquake referenced this in commit a54baa8698 on Mar 27, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-03-31 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me