Remove CTxMempool::mapLinks data structure member #19478

pull JeremyRubin wants to merge 2 commits into bitcoin:master from JeremyRubin:epoch-mempool-clean-split-3 changing 5 files +120 −118
  1. JeremyRubin commented at 8:54 pm on July 9, 2020: contributor

    Currently we have a peculiar data structure in the mempool called maplinks. Maplinks job is to track the in-pool children and parents of each transaction. This PR can be primarily understood and reviewed as a simple refactoring to remove this extra data structure, although it comes with a nice memory and performance improvement for free.

    Maplinks is particularly peculiar because removing it is not as simple as just moving it’s inner structure to the owning CTxMempoolEntry. Because TxLinks (the class storing the setEntries for parents and children) store txiters to each entry in the mempool corresponding to the parent or child, it means that the TxLinks type is “aware” of the boost multiindex (mapTx) it’s coming from, which is in turn, aware of the entry type stored in mapTx. Thus we used maplinks to store this entry associated data we in an entirely separate data structure just to avoid a circular type reference caused by storing a txiter inside a CTxMempoolEntry.

    It turns out, we can kill this circular reference by making use of iterator_to multiindex function and std::reference_wrapper. This allows us to get rid of the maplinks data structure and move the ownership of the parents/child sets to the entries themselves.

    The benefit of this good all around, for any of the reasons given below the change would be acceptable, and it doesn’t make the code harder to reason about or worse in any respect (as far as I can tell, there’s no tradeoff).

    Simpler ownership model

    No longer having to consistency check that mapLinks did have records for our CTxMempoolEntry, impossible to have a mapLinks entry outlive or incorrectly die before a CTxMempoolEntry.

    Memory Usage

    We get rid of a O(Transactions) sized map in the mempool, which is a long lived data structure.

    Performance

    If you have a CTxMemPoolEntry, you immediately know the address of it’s children/parents, rather than having to do a O(log(Transactions)) lookup via maplinks (which we do very often). We do it in so many places that a true benchmark has to look at a full running node, but it is easy enough to show an improvement in this case.

    The ComplexMemPool shows a good coherence check that we see the expected result of it being 12.5% faster / 1.14x faster.

    0Before:
    1# Benchmark, evals, iterations, total, min, max, median
    2ComplexMemPool, 5, 1, 1.40462, 0.277222, 0.285339, 0.279793
    3
    4After:
    5# Benchmark, evals, iterations, total, min, max, median
    6ComplexMemPool, 5, 1, 1.22586, 0.243831, 0.247076, 0.244596
    

    The ComplexMemPool benchmark only checks doing addUnchecked and TrimToSize for 800 transactions. While this bench does a good job of hammering the relevant types of function, it doesn’t test everything.

    Subbing in 5000 transactions shows a that the advantage isn’t completely wiped out by other asymptotic factors (this isn’t the only bottleneck in growing the mempool), but it’s only a bit proportionally slower (10.8%, 1.12x), which adds evidence that this will be a good change for performance minded users.

    0# Benchmark, evals, iterations, total, min, max, median
    1ComplexMemPool, 5, 1, 59.1321, 11.5919, 12.235, 11.7068
    2
    3# Benchmark, evals, iterations, total, min, max, median
    4ComplexMemPool, 5, 1, 52.1307, 10.2641, 10.5206, 10.4306
    

    I don’t think it’s possible to come up with an example of where a maplinks based design would have better performance, but it’s something for reviewers to consider.

    Discussion

    I spoke with the author of mapLinks (sdaftuar) a while back, and my recollection from our conversation was that it was implemented because he did not know how to resolve the circular dependency at the time, and there was no other reason for making it a separate map.

    Is iterator_to weird?

    iterator_to is expressly for this purpose, see https://www.boost.org/doc/libs/1_51_0/libs/multi_index/doc/tutorial/indices.html#iterator_to

    iterator_to provides a way to retrieve an iterator to an element from a pointer to the element, thus making iterators and pointers interchangeable for the purposes of element pointing (not so for traversal) in many situations. This notwithstanding, it is not the aim of iterator_to to promote the usage of pointers as substitutes for real iterators: the latter are specifically designed for handling the elements of a container, and not only benefit from the iterator orientation of container interfaces, but are also capable of exposing many more programming bugs than raw pointers, both at compile and run time. iterator_to is thus meant to be used in scenarios where access via iterators is not suitable or desireable:

    - Interoperability with preexisting APIs based on pointers or references.
    - Publication of pointer-based interfaces (for instance, when designing a C-compatible library).
    - The exposure of pointers in place of iterators can act as a type erasure barrier effectively decoupling the user of the code from the implementation detail of which particular container is being used. Similar techniques, like the famous Pimpl idiom, are used in large projects to reduce dependencies and build times.
    - Self-referencing contexts where an element acts upon its owner container and no iterator to itself is available.
    

    In other words, iterator_to is the perfect tool for the job by the last reason given. Under the hood it should just be a simple pointer cast and have no major runtime overhead (depending on if the function call is inlined).

    Edit by laanwj: removed at sign from the description

  2. JeremyRubin added the label Refactoring on Jul 9, 2020
  3. JeremyRubin added the label Mempool on Jul 9, 2020
  4. JeremyRubin added the label Resource usage on Jul 9, 2020
  5. JeremyRubin requested review from hebasto on Jul 9, 2020
  6. DrahtBot commented at 2:21 am on July 10, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #19872 (Avoid locking CTxMemPool::cs recursively in some cases by hebasto)
    • #18191 (Change UpdateForDescendants to use Epochs by JeremyRubin)
    • #18017 (txmempool: split epoch logic into class by ajtowns)
    • #17786 (refactor: Nuke policy/fees->mempool circular dependencies by hebasto)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. laanwj commented at 12:24 pm on July 15, 2020: member
    This is cool! How much memory would be saved by this change on average, do you think?
  8. JeremyRubin commented at 7:03 pm on July 15, 2020: contributor

    std::sets are awful for memory. I think you can get like a max of 220k txns in the mempool… let’s assume that average is 50k? I don’t know if that’s a good assumption though. the overhead for a map entry is very high. I would guess that it’s somewhere around 64 bytes per entry? Can double check, but based on:

    0 struct stl_tree_node
    1 {
    2 private:
    3     int color;
    4     void* parent;
    5     void* left;
    6     void* right;
    7     std::pair<txiter, txlinks>;
    8 };
    

    That’s 4 bytes + 4 bytes alignment + 8 * 3 bytes pointers + 8 bytes txiter (ignoring txlinks itself since we still store those in the CTxMempoolEntry, but adding another 8 bytes for alignments and padding) = 6 * 8 = 48 byes of raw overhead. Then with mallocusage, this gives us 64 bytes. So it may not use 64 bytes per element, but we certainly count that many! Depending on how big txlinks is itself, we may have some extra overhead (implementation dependent).

    Anyways, let’s assume 64 bytes.

    64*50,000 bytes = 3.2 MB, 64 * 200,000 bytes = 12.8 MB.

    For a 300MB pool, 1-3% memory savings. Note that this doesn’t account for the fact that there are other downsides to the memory fragmentation of doing so many allocations, but a pooled allocator could also help with this issue.

    (A further change which is maybe worth looking at is the Children themselves – because there is so much memory overhead, storing a std::vector and std::vector might be more efficient for many txs with small numbers of vins/vouts, especially since you can super optimize checking if a H is in a vector v.s. log N random memory compares)

  9. JeremyRubin commented at 8:35 pm on July 15, 2020: contributor
    0  template <size_t s1, size_t s2> struct static_check {
    1       static_check() {
    2       static_assert(s1 == s2);
    3       }
    4   };
    5       auto s = static_check<48, sizeof(setEntries)>();
    6       auto s2 = static_check<48*2, sizeof(TxLinks)>();
    7       auto s3 = static_check<136, sizeof(memusage::stl_tree_node<std::pair<const txiter, TxLinks>>)>();
    8       auto s4 = static_check<160, ((31 + sizeof(memusage::stl_tree_node<std::pair<const txiter, TxLinks>>)) >> 4)<<4 >();
    

    This confirms that a setEntries uses 48 bytes for the container itself, and the overhead is 64 bytes per node for each TxLinks in the mapLinks data structure compared to inlined TxLinks in CTxMempoolEntry. Even though the nodes themselves are only 136 bytes, the MallocUsage accounting pads out to 160 bytes.

  10. in src/txmempool.cpp:69 in dc0d3fb626 outdated
    70         setAllDescendants.insert(cit);
    71         stageEntries.erase(cit);
    72-        const setEntries &setChildren = GetMemPoolChildren(cit);
    73-        for (txiter childEntry : setChildren) {
    74-            cacheMap::iterator cacheIt = cachedDescendants.find(childEntry);
    75+        const CTxMemPoolEntry::Children &setChildren = cit.GetMemPoolChildrenConst();
    


    jonatack commented at 5:57 am on July 20, 2020:

    da53f15e

    0        const CTxMemPoolEntry::Children& setChildren = cit.GetMemPoolChildrenConst();
    
  11. in src/txmempool.cpp:199 in dc0d3fb626 outdated
    195@@ -195,11 +196,13 @@ bool CTxMemPool::CalculateMemPoolAncestors(const CTxMemPoolEntry &entry, setEntr
    196             return false;
    197         }
    198 
    199-        const setEntries & setMemPoolParents = GetMemPoolParents(stageit);
    200-        for (txiter phash : setMemPoolParents) {
    201+        const CTxMemPoolEntry::Parents & setMemPoolParents = stageit->GetMemPoolParentsConst();
    


    jonatack commented at 5:58 am on July 20, 2020:

    da53f15e

    0        const CTxMemPoolEntry::Parents& setMemPoolParents = stageit->GetMemPoolParentsConst();
    
  12. in src/txmempool.cpp:205 in dc0d3fb626 outdated
    203+            txiter phash = mapTx.iterator_to(hash);
    204+
    205             // If this is a new ancestor, add it.
    206             if (setAncestors.count(phash) == 0) {
    207-                parentHashes.insert(phash);
    208+                parentHashes.insert(hash);
    


    jonatack commented at 5:59 am on July 20, 2020:
    da53f15e be sure phash and hash are used in the right places here

    JeremyRubin commented at 4:59 pm on July 20, 2020:

    Yep! CTxMemPool::Parents stores reference_wrappers (hash) whereas phash is a txiter as stored in setAncestors.

    In some places we may eventually want to replace things like setAncestors with set<reference_wrappers> to get rid of the casting, but that’s a more invasive change with less clear benefits so it is not done for now.

  13. in src/txmempool.cpp:248 in dc0d3fb626 outdated
    244@@ -242,9 +245,9 @@ void CTxMemPool::UpdateEntryForAncestors(txiter it, const setEntries &setAncesto
    245 
    246 void CTxMemPool::UpdateChildrenForRemoval(txiter it)
    247 {
    248-    const setEntries &setMemPoolChildren = GetMemPoolChildren(it);
    249-    for (txiter updateIt : setMemPoolChildren) {
    250-        UpdateParent(updateIt, it, false);
    251+    const CTxMemPoolEntry::Children &setMemPoolChildren = it->GetMemPoolChildrenConst();
    


    jonatack commented at 6:00 am on July 20, 2020:

    da53f15e

    0    const CTxMemPoolEntry::Children& setMemPoolChildren = it->GetMemPoolChildrenConst();
    
  14. in src/txmempool.cpp:463 in dc0d3fb626 outdated
    459@@ -457,8 +460,9 @@ void CTxMemPool::CalculateDescendants(txiter entryit, setEntries& setDescendants
    460         setDescendants.insert(it);
    461         stage.erase(it);
    462 
    463-        const setEntries &setChildren = GetMemPoolChildren(it);
    464-        for (txiter childiter : setChildren) {
    465+        const CTxMemPoolEntry::Children &setChildren = it->GetMemPoolChildrenConst();
    


    jonatack commented at 6:00 am on July 20, 2020:

    da53f15e

    0        const CTxMemPoolEntry::Children& setChildren = it->GetMemPoolChildrenConst();
    
  15. jonatack commented at 6:53 am on July 20, 2020: member

    These changes look good. I’m unsure what our plans are with respect to Boost features.

    ACK dc0d3fb62666fd4d5dad25455e0c12aa9395e7d7 code review and tested rebased on master. The benchmarks show a 20% improvement for me, built with --enable-bench CXXFLAGS="-O2".

    Current master 476436b

    0# Benchmark, evals, iterations, total, min, max, median
    1ComplexMemPool, 50, 1, 22.3, 0.371758, 0.665461, 0.426088
    

    After, rebased on master

    0# Benchmark, evals, iterations, total, min, max, median
    1ComplexMemPool, 50, 1, 18.2, 0.337176, 0.493058, 0.354405
    

    With 5000 transactions

    0src/bench/mempool_stress.cpp
    1-    for (auto x = 0; x < 800 && !available_coins.empty(); ++x) {
    2+    for (auto x = 0; x < 5000 && !available_coins.empty(); ++x) {
    

    Current master 476436b

    0# Benchmark, evals, iterations, total, min, max, median
    1ComplexMemPool, 5, 1, 97.3971, 18.5328, 20.4063, 19.4885
    

    After, rebased on master

    0# Benchmark, evals, iterations, total, min, max, median
    1ComplexMemPool, 5, 1, 80.3426, 15.6254, 16.5093, 16.118
    
  16. jonatack commented at 7:01 am on July 20, 2020: member
    nit: maybe a less violent PR title :axe: :slightly_smiling_face: :+1: (joking, but s/kill/remove/ works) and add the component prefix
  17. sipa commented at 7:06 am on July 20, 2020: member
    @jonatack You must have hated #9260.
  18. jonatack commented at 7:14 am on July 20, 2020: member
    Not bad :) The description even required translation. #9260 (comment)
  19. jonatack commented at 8:18 am on July 20, 2020: member

    One run each of all benchmarks (though I suspect there is a fair amount of variation between runs).

    current master

     0(master)$ ./src/bench/bench_bitcoin
     1# Benchmark, evals, iterations, total, min, max, median
     2AddrManAdd, 5, 5, 1.72099, 0.063045, 0.0772824, 0.0669075
     3AddrManGetAddr, 5, 500, 5.81175, 0.0022402, 0.00251569, 0.0022842
     4AddrManGood, 5, 2, 6.0441, 0.584628, 0.624264, 0.596058
     5AddrManSelect, 5, 1000000, 2.69665, 5.28107e-07, 5.52293e-07, 5.3419e-07
     6AssembleBlock, 5, 700, 1.62831, 0.000441994, 0.000487594, 0.000469525
     7Base58CheckEncode, 5, 320000, 4.00335, 2.45396e-06, 2.58267e-06, 2.48388e-06
     8Base58Decode, 5, 800000, 4.13891, 9.41483e-07, 1.20126e-06, 9.73678e-07
     9Base58Encode, 5, 470000, 3.47461, 1.45507e-06, 1.49106e-06, 1.48153e-06
    10Bech32Decode, 5, 800000, 2.02791, 4.81211e-07, 5.65706e-07, 4.93396e-07
    11Bech32Encode, 5, 800000, 2.97694, 6.88539e-07, 8.73764e-07, 7.08336e-07
    12BenchLockedPool, 5, 1300, 6.36001, 0.000945641, 0.00101048, 0.000980838
    13BenchTimeDeprecated, 5, 100000000, 3.06631, 5.61812e-09, 7.04958e-09, 6.02113e-09
    14BenchTimeMillis, 5, 6000000, 2.88408, 9.29844e-08, 1.00058e-07, 9.63667e-08
    15BenchTimeMillisSys, 5, 6000000, 2.92817, 9.08119e-08, 1.17317e-07, 9.40587e-08
    16BenchTimeMock, 5, 300000000, 3.31172, 2.1875e-09, 2.21985e-09, 2.21037e-09
    17BlockToJsonVerbose, 5, 10, 4.90092, 0.0940476, 0.100517, 0.0993346
    18BnBExhaustion, 5, 650, 3.83522, 0.00111645, 0.00123052, 0.00119955
    19CCheckQueueSpeedPrevectorJob, 5, 1400, 7.23464, 0.000990455, 0.00107279, 0.00103406
    20CCoinsCaching, 5, 170000, 0.356379, 3.90721e-07, 4.65786e-07, 4.13522e-07
    21CHACHA20_1MB, 5, 340, 3.97313, 0.00228695, 0.00236048, 0.00235003
    22CHACHA20_256BYTES, 5, 250000, 0.744309, 5.75571e-07, 6.18917e-07, 5.97898e-07
    23CHACHA20_64BYTES, 5, 500000, 0.39087, 1.54202e-07, 1.60676e-07, 1.55381e-07
    24CHACHA20_POLY1305_AEAD_1MB_ENCRYPT_DECRYPT, 5, 340, 11.3753, 0.0064394, 0.00718509, 0.00649879
    25CHACHA20_POLY1305_AEAD_1MB_ONLY_ENCRYPT, 5, 340, 5.56076, 0.00322281, 0.00332228, 0.00327013
    26CHACHA20_POLY1305_AEAD_256BYTES_ENCRYPT_DECRYPT, 5, 250000, 2.77848, 2.16111e-06, 2.28038e-06, 2.22017e-06
    27CHACHA20_POLY1305_AEAD_256BYTES_ONLY_ENCRYPT, 5, 250000, 1.40301, 1.09434e-06, 1.14021e-06, 1.13141e-06
    28CHACHA20_POLY1305_AEAD_64BYTES_ENCRYPT_DECRYPT, 5, 500000, 2.62965, 1.01675e-06, 1.09266e-06, 1.05452e-06
    29CHACHA20_POLY1305_AEAD_64BYTES_ONLY_ENCRYPT, 5, 500000, 1.3327, 5.12329e-07, 5.47454e-07, 5.30224e-07
    30CoinSelection, 5, 650, 0.755947, 0.000220572, 0.000257613, 0.000226304
    31ComplexMemPool, 5, 1, 1.96453, 0.377724, 0.41796, 0.389692
    32ConstructGCSFilter, 5, 1000, 10.173, 0.00198849, 0.00209198, 0.00204057
    33DeserializeAndCheckBlockTest, 5, 160, 6.35808, 0.0076843, 0.00885282, 0.007737
    34DeserializeBlockTest, 5, 130, 4.11579, 0.00605016, 0.00697307, 0.00626367
    35DuplicateInputs, 5, 10, 0.643048, 0.0107885, 0.014601, 0.0130247
    36FastRandom_1bit, 5, 440000000, 5.20149, 2.31664e-09, 2.42676e-09, 2.33602e-09
    37FastRandom_32bit, 5, 110000000, 6.10483, 1.08269e-08, 1.13123e-08, 1.11124e-08
    38HASH_1MB, 5, 340, 6.63799, 0.00383218, 0.00397389, 0.00389316
    39HASH_256BYTES, 5, 250000, 1.97735, 1.49095e-06, 1.78286e-06, 1.54661e-06
    40HASH_64BYTES, 5, 500000, 2.0604, 7.66024e-07, 9.86615e-07, 7.89013e-07
    41MatchGCSFilter, 5, 50000, 7.91817, 2.90019e-05, 3.72023e-05, 3.08161e-05
    42MempoolEviction, 5, 41000, 9.86147, 4.14122e-05, 5.192e-05, 4.86635e-05
    43MerkleRoot, 5, 800, 6.25447, 0.00153309, 0.00158666, 0.00156357
    44POLY1305_1MB, 5, 340, 1.58711, 0.000894605, 0.00101961, 0.000919736
    45POLY1305_256BYTES, 5, 250000, 0.299447, 2.30486e-07, 2.54892e-07, 2.37903e-07
    46POLY1305_64BYTES, 5, 500000, 0.189159, 7.41961e-08, 7.89804e-08, 7.44001e-08
    47PrePadded, 5, 10000, 0.0126671, 2.47568e-07, 2.73122e-07, 2.48886e-07
    48PrevectorClearNontrivial, 5, 28300, 9.39727, 6.36911e-05, 6.8178e-05, 6.71331e-05
    49PrevectorClearTrivial, 5, 88600, 23.0503, 5.17078e-05, 5.25987e-05, 5.19458e-05
    50PrevectorDeserializeNontrivial, 5, 6800, 5.32902, 0.00013906, 0.000195636, 0.000142078
    51PrevectorDeserializeTrivial, 5, 52000, 3.72396, 1.33016e-05, 1.64363e-05, 1.39542e-05
    52PrevectorDestructorNontrivial, 5, 28800, 8.92926, 5.91341e-05, 6.46597e-05, 6.28898e-05
    53PrevectorDestructorTrivial, 5, 88900, 7.11175, 1.53554e-05, 1.63233e-05, 1.61115e-05
    54PrevectorResizeNontrivial, 5, 28900, 5.94407, 3.94754e-05, 4.21574e-05, 4.12122e-05
    55PrevectorResizeTrivial, 5, 90300, 5.85012, 1.26935e-05, 1.35717e-05, 1.28682e-05
    56RIPEMD160, 5, 440, 6.42822, 0.00282782, 0.00305228, 0.00292482
    57RegularPadded, 5, 10000, 0.0279986, 4.90113e-07, 6.5017e-07, 5.60703e-07
    58RollingBloom, 5, 1500000, 5.31282, 6.39848e-07, 7.7822e-07, 6.86684e-07
    59RollingBloomReset, 5, 20000, 7.80648, 7.52528e-05, 8.27285e-05, 7.69287e-05
    60RpcMempool, 5, 40, 2.98067, 0.0144245, 0.0157567, 0.0149161
    61SHA1, 5, 570, 6.74604, 0.00219204, 0.00275459, 0.00231692
    62SHA256, 5, 340, 6.32523, 0.00362398, 0.00384323, 0.00371196
    63SHA256D64_1024, 5, 7400, 5.96159, 0.000154711, 0.000171021, 0.000159504
    64SHA256_32b, 5, 4700000, 6.35779, 2.57318e-07, 2.83581e-07, 2.7218e-07
    65SHA512, 5, 330, 6.03323, 0.00348758, 0.00388368, 0.00367868
    66SipHash_32b, 5, 40000000, 6.60139, 3.16447e-08, 3.39715e-08, 3.30938e-08
    67Sleep100ms, 5, 10, 5.00574, 0.100082, 0.100135, 0.100122
    68Trig, 5, 12000000, 1.0619, 1.55502e-08, 2.09141e-08, 1.66462e-08
    69VerifyNestedIfScript, 5, 100, 0.0642893, 0.000114644, 0.000145037, 0.000127313
    70VerifyScriptBench, 5, 6300, 5.78638, 0.00017174, 0.000196193, 0.000180358
    71WalletBalanceClean, 5, 8000, 1.91247, 4.34781e-05, 5.36606e-05, 4.7943e-05
    72WalletBalanceDirty, 5, 2500, 3.90809, 0.000291278, 0.000342813, 0.00031049
    73WalletBalanceMine, 5, 16000, 1.77235, 2.108e-05, 2.35293e-05, 2.18813e-05
    74WalletBalanceWatch, 5, 8000, 1.88894, 4.62994e-05, 4.81851e-05, 4.72744e-05
    

    this branch, rebased to master

     0(pr/19478)$ ./src/bench/bench_bitcoin
     1# Benchmark, evals, iterations, total, min, max, median
     2AddrManAdd, 5, 5, 1.64512, 0.0651763, 0.0661761, 0.0659298
     3AddrManGetAddr, 5, 500, 6.3247, 0.00239997, 0.00279001, 0.0024759
     4AddrManGood, 5, 2, 6.62272, 0.629379, 0.739089, 0.634495
     5AddrManSelect, 5, 1000000, 3.32086, 5.37746e-07, 9.32109e-07, 5.66366e-07
     6AssembleBlock, 5, 700, 1.5542, 0.000426625, 0.000465905, 0.000444475
     7Base58CheckEncode, 5, 320000, 4.17478, 2.46738e-06, 2.91027e-06, 2.561e-06
     8Base58Decode, 5, 800000, 3.85293, 9.4571e-07, 9.81843e-07, 9.65822e-07
     9Base58Encode, 5, 470000, 3.51497, 1.43368e-06, 1.5361e-06, 1.52626e-06
    10Bech32Decode, 5, 800000, 2.25873, 5.30087e-07, 6.44574e-07, 5.52377e-07
    11Bech32Encode, 5, 800000, 3.19484, 7.32648e-07, 8.20117e-07, 8.15208e-07
    12BenchLockedPool, 5, 1300, 7.35172, 0.000980906, 0.00140676, 0.00106936
    13BenchTimeDeprecated, 5, 100000000, 3.30681, 6.27607e-09, 7.04851e-09, 6.62835e-09
    14BenchTimeMillis, 5, 6000000, 2.84261, 9.23429e-08, 9.91789e-08, 9.44398e-08
    15BenchTimeMillisSys, 5, 6000000, 2.97928, 9.24101e-08, 1.07893e-07, 9.93668e-08
    16BenchTimeMock, 5, 300000000, 3.33207, 2.18275e-09, 2.28349e-09, 2.21065e-09
    17BlockToJsonVerbose, 5, 10, 5.00496, 0.0960341, 0.105335, 0.100521
    18BnBExhaustion, 5, 650, 3.97426, 0.00111834, 0.00137522, 0.00117855
    19CCheckQueueSpeedPrevectorJob, 5, 1400, 7.85864, 0.00100836, 0.00120258, 0.00111514
    20CCoinsCaching, 5, 170000, 0.416761, 3.99718e-07, 6.26237e-07, 4.6736e-07
    21CHACHA20_1MB, 5, 340, 4.15272, 0.00232247, 0.00256342, 0.00241795
    22CHACHA20_256BYTES, 5, 250000, 0.821719, 6.04351e-07, 7.16052e-07, 6.47608e-07
    23CHACHA20_64BYTES, 5, 500000, 0.392367, 1.55135e-07, 1.60704e-07, 1.55682e-07
    24CHACHA20_POLY1305_AEAD_1MB_ENCRYPT_DECRYPT, 5, 340, 11.2433, 0.0064071, 0.00696151, 0.00653254
    25CHACHA20_POLY1305_AEAD_1MB_ONLY_ENCRYPT, 5, 340, 5.67251, 0.00320464, 0.00347243, 0.00329092
    26CHACHA20_POLY1305_AEAD_256BYTES_ENCRYPT_DECRYPT, 5, 250000, 2.72123, 2.14751e-06, 2.23938e-06, 2.16475e-06
    27CHACHA20_POLY1305_AEAD_256BYTES_ONLY_ENCRYPT, 5, 250000, 1.42395, 1.07704e-06, 1.25263e-06, 1.10776e-06
    28CHACHA20_POLY1305_AEAD_64BYTES_ENCRYPT_DECRYPT, 5, 500000, 2.51465, 9.83994e-07, 1.0357e-06, 9.93211e-07
    29CHACHA20_POLY1305_AEAD_64BYTES_ONLY_ENCRYPT, 5, 500000, 1.28575, 5.03617e-07, 5.34522e-07, 5.09937e-07
    30CoinSelection, 5, 650, 0.754373, 0.00021323, 0.000260981, 0.000223156
    31ComplexMemPool, 5, 1, 1.72752, 0.329097, 0.376804, 0.341924
    32ConstructGCSFilter, 5, 1000, 10.5092, 0.00204965, 0.00214565, 0.00210984
    33DeserializeAndCheckBlockTest, 5, 160, 6.12895, 0.00726007, 0.00793475, 0.0076974
    34DeserializeBlockTest, 5, 130, 4.25455, 0.00636748, 0.00672154, 0.00647489
    35DuplicateInputs, 5, 10, 0.632548, 0.0107695, 0.0151279, 0.0130934
    36FastRandom_1bit, 5, 440000000, 5.37446, 2.20134e-09, 2.76454e-09, 2.4326e-09
    37FastRandom_32bit, 5, 110000000, 6.259, 1.07203e-08, 1.26064e-08, 1.11061e-08
    38HASH_1MB, 5, 340, 6.68846, 0.0037417, 0.00407083, 0.00404221
    39HASH_256BYTES, 5, 250000, 1.93397, 1.43203e-06, 1.61792e-06, 1.55341e-06
    40HASH_64BYTES, 5, 500000, 1.9273, 7.53427e-07, 7.92554e-07, 7.69106e-07
    41MatchGCSFilter, 5, 50000, 7.41827, 2.9138e-05, 3.08851e-05, 2.91944e-05
    42MempoolEviction, 5, 41000, 8.62135, 4.0131e-05, 4.33651e-05, 4.2242e-05
    43MerkleRoot, 5, 800, 6.1202, 0.00146477, 0.00158029, 0.00154912
    44POLY1305_1MB, 5, 340, 1.60954, 0.00093638, 0.000971838, 0.000939872
    45POLY1305_256BYTES, 5, 250000, 0.299708, 2.3101e-07, 2.53748e-07, 2.39605e-07
    46POLY1305_64BYTES, 5, 500000, 0.199672, 7.49288e-08, 8.58498e-08, 7.96427e-08
    47PrePadded, 5, 10000, 0.0145195, 2.46289e-07, 3.66109e-07, 2.7668e-07
    48PrevectorClearNontrivial, 5, 28300, 9.12451, 6.1713e-05, 6.81542e-05, 6.43679e-05
    49PrevectorClearTrivial, 5, 88600, 23.7796, 5.16061e-05, 5.61612e-05, 5.26222e-05
    50PrevectorDeserializeNontrivial, 5, 6800, 4.70353, 0.000125809, 0.000155278, 0.0001387
    51PrevectorDeserializeTrivial, 5, 52000, 4.00423, 1.4031e-05, 1.73909e-05, 1.49766e-05
    52PrevectorDestructorNontrivial, 5, 28800, 10.0167, 6.71979e-05, 7.25358e-05, 6.8967e-05
    53PrevectorDestructorTrivial, 5, 88900, 6.70445, 1.45854e-05, 1.55963e-05, 1.51922e-05
    54PrevectorResizeNontrivial, 5, 28900, 5.86289, 3.87845e-05, 4.19374e-05, 4.1117e-05
    55PrevectorResizeTrivial, 5, 90300, 6.17383, 1.30828e-05, 1.48376e-05, 1.35303e-05
    56RIPEMD160, 5, 440, 6.65392, 0.00289608, 0.00322838, 0.00296501
    57RegularPadded, 5, 10000, 0.024881, 4.61499e-07, 5.36864e-07, 4.99317e-07
    58RollingBloom, 5, 1500000, 5.26776, 6.34446e-07, 7.91226e-07, 6.99945e-07
    59RollingBloomReset, 5, 20000, 8.06222, 7.65343e-05, 8.62762e-05, 7.88462e-05
    60RpcMempool, 5, 40, 2.85708, 0.0141566, 0.0145827, 0.0142359
    61SHA1, 5, 570, 6.69593, 0.00221977, 0.00245258, 0.00232601
    62SHA256, 5, 340, 6.37176, 0.00357123, 0.00393608, 0.0037714
    63SHA256D64_1024, 5, 7400, 5.96427, 0.000156924, 0.000165802, 0.000161775
    64SHA256_32b, 5, 4700000, 6.34025, 2.61224e-07, 2.77951e-07, 2.69953e-07
    65SHA512, 5, 330, 5.99635, 0.0035573, 0.00368859, 0.0036532
    66SipHash_32b, 5, 40000000, 6.50056, 3.13885e-08, 3.40995e-08, 3.22946e-08
    67Sleep100ms, 5, 10, 5.00548, 0.100081, 0.100137, 0.100105
    68Trig, 5, 12000000, 1.15828, 1.48329e-08, 2.88076e-08, 1.67105e-08
    69VerifyNestedIfScript, 5, 100, 0.0523492, 8.93512e-05, 0.000123463, 0.000103065
    70VerifyScriptBench, 5, 6300, 5.64757, 0.000166801, 0.00019268, 0.000176785
    71WalletBalanceClean, 5, 8000, 1.95422, 4.57885e-05, 5.51424e-05, 4.67078e-05
    72WalletBalanceDirty, 5, 2500, 3.97355, 0.000295304, 0.000335195, 0.000318887
    73WalletBalanceMine, 5, 16000, 1.9643, 2.28496e-05, 2.7418e-05, 2.38308e-05
    74WalletBalanceWatch, 5, 8000, 2.03689, 4.80869e-05, 5.49796e-05, 4.97744e-05
    
  20. JeremyRubin renamed this:
    Kill mapLinks data structure in Mempool
    Remove CTxMempool::mapLinks data structure member
    on Jul 20, 2020
  21. in src/txmempool.cpp:173 in 2a47af4874 outdated
    171@@ -172,16 +172,17 @@ bool CTxMemPool::CalculateMemPoolAncestors(const CTxMemPoolEntry &entry, setEntr
    172         // If we're not searching for parents, we require this to be an
    173         // entry in the mempool already.
    


    JeremyRubin commented at 5:40 pm on July 20, 2020:
    When we say “we require” in this comment, previously we had an assertion that this was true, this patch removes this assertion. This should be safe to do, as it would have previously resulted in crashes. But if we want the assertion checked, we should add back a lookup to mapTX.

    JeremyRubin commented at 5:59 pm on July 20, 2020:

    This is because previously we had to consistency check constantly to make sure we weren’t looking at a dead mempool entry https://github.com/bitcoin/bitcoin/pull/19478/files#diff-ca74c4b28865382863b8fe7633a85cd6L987

    Maybe one way to get around this would be to make CTxMemPoolEntry contain bool “dead” if the entry has been evicted but a valid reference is dangling somewhere? This is unconditionally a bug, but then we could add asserts back to CTxMempoolEntry::GetMempool{Children,Parents}{,Const}.

    I think this can be handled in follow up work though, as the major need for this consistency check was a fraught ownership model in mapLinks.

  22. JeremyRubin commented at 6:11 pm on July 20, 2020: contributor

    @jonatack thanks for the review! Cool to see it making an even bigger impact on other people’s systems – can you share your hardware specs? Mine were run on AMD Ryzen 7 1800X Eight-Core, 3.6ghz, 32 gb ram 2133 MT/s. It generally makes sense that if you’re on lesser hardware (more total time on your benches) there may be some other factors contributing to a bigger speedup.

    W.r.t. boost features, iterator_to is a member function of multiindex, so it’s not a new feature it’s an existing API. In any case, it has been in use in the mempool since 2015 on line 174 60de0d5826f1b848a43ec989ff712f002eddc3dc, so it should be fine.

    All nits have been addressed.

  23. JeremyRubin added this to the "Epoch MemPool" column in a project

  24. jonatack commented at 8:15 am on July 21, 2020: member

    Intel i7-6500U 4 core @ 2.5GHz, 32 gb RAM DDR4 2666, SSD nvme Samsung 960 Pro 1T, running Debian bullseye/sid with Coreboot (I need to upgrade to a faster CPU but it’s not likely to happen soon).

    Code review re-ACK 2a47af4 per git diff dc0d3fb 2a47af4

  25. in src/txmempool.h:53 in c977ecc917 outdated
    48@@ -49,6 +49,17 @@ struct LockPoints
    49     LockPoints() : height(0), time(0), maxInputBlock(nullptr) { }
    50 };
    51 
    52+struct CompareIteratorByHashGeneric {
    53+    // SFINAE for a is a pointer or a reference
    


    hebasto commented at 5:30 am on July 22, 2020:
    c977ecc917ecf40c554173a52700973ea7d42f07 I’d remove this comment as needless and confusing. Which type from std::reference_wrapper<T>& and T& is a pointer to T?

    JeremyRubin commented at 4:54 pm on July 22, 2020:
    T is either a reference wrapper (which I believe a normal reference can convert to) or a txiter, which is a pointer to a mempool entry (e.g., you have to use a-> to access members). Maybe more accurate for comment to be about T than a.

    JeremyRubin commented at 5:45 pm on July 22, 2020:
    comment improved.
  26. in src/txmempool.h:550 in c977ecc917 outdated
    551-            return a->GetTx().GetHash() < b->GetTx().GetHash();
    552-        }
    553-    };
    554-    typedef std::set<txiter, CompareIteratorByHash> setEntries;
    555+    typedef std::set<txiter, CompareIteratorByHashGeneric> setEntries;
    556+    typedef std::vector<txiter> vecEntries;
    


    hebasto commented at 5:31 am on July 22, 2020:
    c977ecc917ecf40c554173a52700973ea7d42f07 This typedef is not used, and should be dropped.

    JeremyRubin commented at 5:24 pm on July 22, 2020:
    ah sorry it exists in #18191 which is how it made it over here as this work was supposed to follow that work. I can drop it.
  27. in src/txmempool.cpp:968 in c977ecc917 outdated
    966-    if (add && mapLinks[entry].children.insert(child).second) {
    967+    CTxMemPoolEntry::Children s;
    968+    if (add && entry->GetMemPoolChildren().insert(child).second) {
    969         cachedInnerUsage += memusage::IncrementalDynamicUsage(s);
    970-    } else if (!add && mapLinks[entry].children.erase(child)) {
    971+    } else if (!add && entry->GetMemPoolChildren().erase(child)) {
    


    hebasto commented at 6:02 am on July 22, 2020:

    c977ecc917ecf40c554173a52700973ea7d42f07 Since add has bool type, could be

    0    } else if (entry->GetMemPoolChildren().erase(child)) {
    

    JeremyRubin commented at 5:02 pm on July 22, 2020:

    I prefer to not make changes like this unless there’s a motivating reason. The mapLinks refactor is about removing mapLinks, code style changes should be proposed separately.

    In particular, your recommended change introduces a bug, which is another reason to avoid changes like this.

    If the preceding if fails, it is either because add was not set OR because the item was already in the Children set. When we then check the else if, we should only erase if we set add to false, otherwise we will spuriously delete children.


    hebasto commented at 5:04 pm on July 22, 2020:
    My bad (( Sorry for noise.
  28. in src/txmempool.cpp:978 in c977ecc917 outdated
    979-    if (add && mapLinks[entry].parents.insert(parent).second) {
    980+    CTxMemPoolEntry::Parents s;
    981+    if (add && entry->GetMemPoolParents().insert(parent).second) {
    982         cachedInnerUsage += memusage::IncrementalDynamicUsage(s);
    983-    } else if (!add && mapLinks[entry].parents.erase(parent)) {
    984+    } else if (!add && entry->GetMemPoolParents().erase(parent)) {
    


    hebasto commented at 6:03 am on July 22, 2020:

    c977ecc917ecf40c554173a52700973ea7d42f07 Since add has bool type, could be

    0    } else if (entry->GetMemPoolParents().erase(parent)) {
    

    JeremyRubin commented at 5:03 pm on July 22, 2020:
    see #19478 (review) for an explanation of the bug the proposed change would introduce.
  29. hebasto changes_requested
  30. hebasto commented at 6:24 am on July 22, 2020: member

    Concept ACK. Tested 2a47af487401d61e6f4418859f063d8f18f75caf on Linux Mint 20 (x86_64).

    1. The first commit c977ecc917ecf40c554173a52700973ea7d42f07 “Part 1/2…” is not compiled. I think we should avoid such situations in the git history.

    2. I strictly prefer that new code follows the coding style, therefore mind applying clang-format-diff.py to it?

  31. JeremyRubin force-pushed on Jul 22, 2020
  32. JeremyRubin force-pushed on Jul 22, 2020
  33. JeremyRubin force-pushed on Jul 22, 2020
  34. JeremyRubin commented at 5:55 pm on July 22, 2020: contributor

    Ok I’ve gone ahead and squashed all the nits and intermittent commits so that bisecting should work cleanly.

    I think the resultant history is tad harder to review than with the two parts, as there was a bigger delineation from the architectural refactoring, the callsite changes, and just style changes to touched code, but it is what it is. @jonatack you’ll notice some formatting changes when comparing to the prior commit.

    git diff 2a47af4 74a780a3aa07ce0c6f766877b0535d07c26dba0e

     0diff --git a/src/txmempool.h b/src/txmempool.h
     1index 8050eaefa..a3a0197d9 100644
     2--- a/src/txmempool.h
     3+++ b/src/txmempool.h
     4@@ -50,13 +50,16 @@ struct LockPoints
     5 };
     6 
     7 struct CompareIteratorByHashGeneric {
     8-    // SFINAE for a is a pointer or a reference
     9-    template<typename T>
    10-    bool operator()(const std::reference_wrapper<T> &a, const std::reference_wrapper<T> &b) const {
    11+    // SFINAE for T where T is either a pointer type (e.g., a txiter) or a reference_wrapper<T>
    12+    // (e.g. a wrapped CTxMemPoolEntry&)
    13+    template <typename T>
    14+    bool operator()(const std::reference_wrapper<T>& a, const std::reference_wrapper<T>& b) const
    15+    {
    16         return a.get().GetTx().GetHash() < b.get().GetTx().GetHash();
    17     }
    18-    template<typename T>
    19-    bool operator()(const T &a, const T &b) const {
    20+    template <typename T>
    21+    bool operator()(const T& a, const T& b) const
    22+    {
    23         return a->GetTx().GetHash() < b->GetTx().GetHash();
    24     }
    25 };
    26@@ -79,6 +82,7 @@ public:
    27     // two aliases, should the types ever diverge
    28     typedef std::set<CTxMemPoolEntryRef, CompareIteratorByHashGeneric> Parents;
    29     typedef std::set<CTxMemPoolEntryRef, CompareIteratorByHashGeneric> Children;
    30+
    31 private:
    32     const CTransactionRef tx;
    33     mutable Parents m_parents;
    34@@ -145,10 +149,10 @@ public:
    35     CAmount GetModFeesWithAncestors() const { return nModFeesWithAncestors; }
    36     int64_t GetSigOpCostWithAncestors() const { return nSigOpCostWithAncestors; }
    37 
    38-    const Parents& GetMemPoolParentsConst() const {return m_parents;}
    39-    const Children& GetMemPoolChildrenConst() const {return m_children;}
    40-    Parents& GetMemPoolParents() const {return m_parents;}
    41-    Children& GetMemPoolChildren() const {return m_children;}
    42+    const Parents& GetMemPoolParentsConst() const { return m_parents; }
    43+    const Children& GetMemPoolChildrenConst() const { return m_children; }
    44+    Parents& GetMemPoolParents() const { return m_parents; }
    45+    Children& GetMemPoolChildren() const { return m_children; }
    46 
    47     mutable size_t vTxHashesIdx; //!< Index in mempool's vTxHashes
    48     mutable uint64_t m_epoch; //!< epoch when last touched, useful for graph algorithms
    49@@ -547,7 +551,6 @@ public:
    50     std::vector<std::pair<uint256, txiter>> vTxHashes GUARDED_BY(cs); //!< All tx witness hashes/entries in mapTx, in random order
    51 
    52     typedef std::set<txiter, CompareIteratorByHashGeneric> setEntries;
    53-    typedef std::vector<txiter> vecEntries;
    54 
    55     uint64_t CalculateDescendantMaximum(txiter entry) const EXCLUSIVE_LOCKS_REQUIRED(cs);
    56 private:
    
  35. hebasto approved
  36. hebasto commented at 6:15 pm on July 22, 2020: member

    ACK 74a780a3aa07ce0c6f766877b0535d07c26dba0e, tested on Linux Mint 20 (x86_64).

    Benchmarking results:

    • master (2aaff4813cc340764c99846513d58fc3553fcb6a)
    0$ ./src/bench/bench_bitcoin -filter="ComplexMemPool"
    1# Benchmark, evals, iterations, total, min, max, median
    2ComplexMemPool, 5, 1, 1.42105, 0.283276, 0.286269, 0.283499
    
    • this PR (74a780a3aa07ce0c6f766877b0535d07c26dba0e)
    0$ ./src/bench/bench_bitcoin -filter="ComplexMemPool"
    1# Benchmark, evals, iterations, total, min, max, median
    2ComplexMemPool, 5, 1, 1.21549, 0.240644, 0.246652, 0.243024
    
  37. DrahtBot added the label Needs rebase on Jul 22, 2020
  38. JeremyRubin force-pushed on Jul 22, 2020
  39. JeremyRubin commented at 8:15 pm on July 22, 2020: contributor
    rebased.
  40. DrahtBot removed the label Needs rebase on Jul 22, 2020
  41. in src/txmempool.cpp:926 in cd39f1a3be outdated
    921@@ -917,8 +922,8 @@ bool CCoinsViewMemPool::GetCoin(const COutPoint &outpoint, Coin &coin) const {
    922 
    923 size_t CTxMemPool::DynamicMemoryUsage() const {
    924     LOCK(cs);
    925-    // Estimate the overhead of mapTx to be 15 pointers + an allocation, as no exact formula for boost::multi_index_contained is implemented.
    926-    return memusage::MallocUsage(sizeof(CTxMemPoolEntry) + 15 * sizeof(void*)) * mapTx.size() + memusage::DynamicUsage(mapNextTx) + memusage::DynamicUsage(mapDeltas) + memusage::DynamicUsage(mapLinks) + memusage::DynamicUsage(vTxHashes) + cachedInnerUsage;
    927+    // Estimate the overhead of mapTx to be 12 pointers + an allocation, as no exact formula for boost::multi_index_contained is implemented.
    928+    return memusage::MallocUsage(sizeof(CTxMemPoolEntry) + 12 * sizeof(void*)) * mapTx.size() + memusage::DynamicUsage(mapNextTx) + memusage::DynamicUsage(mapDeltas) + memusage::DynamicUsage(vTxHashes) + cachedInnerUsage;
    


    hebasto commented at 5:40 am on July 23, 2020:

    Mis-rebased?

    0    // Estimate the overhead of mapTx to be 15 pointers + an allocation, as no exact formula for boost::multi_index_contained is implemented.
    1    return memusage::MallocUsage(sizeof(CTxMemPoolEntry) + 15 * sizeof(void*)) * mapTx.size() + memusage::DynamicUsage(mapNextTx) + memusage::DynamicUsage(mapDeltas) + memusage::DynamicUsage(vTxHashes) + cachedInnerUsage;
    

    See 2b4b90aa8f0440deacefb5997d7bd1f9f5c591b3 from #18044


    JeremyRubin commented at 10:24 am on July 23, 2020:
    Absolutely right; I’ll tell ya I spent like 5 minutes trying to spot the difference on that conflict and couldn’t see it :p
  42. JeremyRubin force-pushed on Jul 23, 2020
  43. in src/txmempool.cpp:70 in 794e803a15 outdated
    74-            cacheMap::iterator cacheIt = cachedDescendants.find(childEntry);
    75+        const CTxMemPoolEntry& descendant = *stageEntries.begin();
    76+        descendants.insert(descendant);
    77+        stageEntries.erase(descendant);
    78+        const CTxMemPoolEntry::Children& children = descendant.GetMemPoolChildrenConst();
    79+        for (const CTxMemPoolEntry& childEntry : children) {
    


    jonatack commented at 1:50 pm on July 23, 2020:
    nit: if you have to retouch again, can update childEntry to current naming style child or child_entry in this range-for block, since the lines are being touched anyway

    JeremyRubin commented at 7:07 pm on September 3, 2020:
    skipped touching this for aforementioned reasons (other work will shortly overwrite this)
  44. jonatack commented at 2:44 pm on July 23, 2020: member
    ACK 794e803 per git range-diff 9d4b3d8 2a47af4 794e803 + re-read commit diffs, debug-built and ran tests at both commits, ran a node at 794e803 for an hour. The naming and formatting edits introduce noisier changes but the result seems better, as well as the asserts in CTxMemPool::check and other improvements.
  45. JeremyRubin commented at 3:54 pm on July 23, 2020: contributor
    To be honest most of the style changes on early prs in this series don’t really matter that much as all that code is going to be nuked by follow up work anyways.
  46. laanwj added this to the "Blockers" column in a project

  47. hebasto approved
  48. hebasto commented at 7:14 am on July 24, 2020: member

    re-ACK 794e803a1539516908a432c3482e6e6c5f1fe422, only rebased since my previous ACK, verified with

     0$ git range-diff master 74a780a3aa07ce0c6f766877b0535d07c26dba0e 794e803a1539516908a432c3482e6e6c5f1fe422
     11:  dd2401bf4 ! 1:  85753be75 Remove mapLinks in favor of entry inlined structs with iterator type erasure
     2    @@ src/txmempool.cpp: void CTxMemPool::check(const CCoinsViewCache *pcoins) const
     3     @@ src/txmempool.cpp: bool CCoinsViewMemPool::GetCoin(const COutPoint &outpoint, Coin &coin) const {
     4      size_t CTxMemPool::DynamicMemoryUsage() const {
     5          LOCK(cs);
     6    -     // Estimate the overhead of mapTx to be 12 pointers + an allocation, as no exact formula for boost::multi_index_contained is implemented.
     7    --    return memusage::MallocUsage(sizeof(CTxMemPoolEntry) + 12 * sizeof(void*)) * mapTx.size() + memusage::DynamicUsage(mapNextTx) + memusage::DynamicUsage(mapDeltas) + memusage::DynamicUsage(mapLinks) + memusage::DynamicUsage(vTxHashes) + cachedInnerUsage;
     8    -+    return memusage::MallocUsage(sizeof(CTxMemPoolEntry) + 12 * sizeof(void*)) * mapTx.size() + memusage::DynamicUsage(mapNextTx) + memusage::DynamicUsage(mapDeltas) + memusage::DynamicUsage(vTxHashes) + cachedInnerUsage;
     9    +     // Estimate the overhead of mapTx to be 15 pointers + an allocation, as no exact formula for boost::multi_index_contained is implemented.
    10    +-    return memusage::MallocUsage(sizeof(CTxMemPoolEntry) + 15 * sizeof(void*)) * mapTx.size() + memusage::DynamicUsage(mapNextTx) + memusage::DynamicUsage(mapDeltas) + memusage::DynamicUsage(mapLinks) + memusage::DynamicUsage(vTxHashes) + cachedInnerUsage;
    11    ++    return memusage::MallocUsage(sizeof(CTxMemPoolEntry) + 15 * sizeof(void*)) * mapTx.size() + memusage::DynamicUsage(mapNextTx) + memusage::DynamicUsage(mapDeltas) + memusage::DynamicUsage(vTxHashes) + cachedInnerUsage;
    12      }
    13      
    14      void CTxMemPool::RemoveUnbroadcastTx(const uint256& txid, const bool unchecked) {
    152:  74a780a3a = 2:  794e803a1 Get rid of unused functions CTxMemPool::GetMemPoolChildren, CTxMemPool::GetMemPoolParents
    
  49. jonatack commented at 8:00 am on July 24, 2020: member

    ran a node at 794e803 for an hour

    Ran the node overnight. As Space-X would say, node operation appears to be nominal.

  50. promag commented at 6:42 pm on August 2, 2020: member
    Concept ACK.
  51. in src/txmempool.cpp:245 in 85753be752 outdated
    244@@ -242,9 +245,9 @@ void CTxMemPool::UpdateEntryForAncestors(txiter it, const setEntries &setAncesto
    245 
    246 void CTxMemPool::UpdateChildrenForRemoval(txiter it)
    247 {
    248-    const setEntries &setMemPoolChildren = GetMemPoolChildren(it);
    


    ariard commented at 1:16 pm on August 3, 2020:
    You have a dangling comment reference to setMemPoolChildren in CTxMemPool class comment.

    JeremyRubin commented at 1:16 am on August 4, 2020:

    yeah in general the mempool documentation is not consistent with the current variable names for things (e.g. setMemPoolChildren is referred to in many places as belonging to a entry but no such member exists).

    Making the documentation consistent can happen as follow up work (there are other changes in the pipe that will make more substantive changes requiring documentation updates).

  52. in src/txmempool.cpp:264 in 85753be752 outdated
    259@@ -257,9 +260,9 @@ void CTxMemPool::UpdateForRemoveFromMempool(const setEntries &entriesToRemove, b
    260         // updateDescendants should be true whenever we're not recursively
    261         // removing a tx and all its descendants, eg when a transaction is
    262         // confirmed in a block.
    263-        // Here we only update statistics and not data in mapLinks (which
    264-        // we need to preserve until we're finished with all operations that
    265-        // need to traverse the mempool).
    266+        // Here we only update statistics and not data in CTxMemPool::Parents
    267+        // and CTxMemPoolEntry::Children (which we need to preserve until we're
    


    ariard commented at 0:03 am on August 4, 2020:

    I think what is meaned by “data” is a bit confusing here. It can be either a pair of iterator/sets in mapLinks or the sets in themselves or even statistics of each set element ? Of course, what we care is integrity of links to keep our traversal consistent until we’re done updating. But referring now to Parents/Children doesn’t make it clearer.

    As a better suggestion, comment may explain that our processing 1) update accounting for all interdependent entries of target 2) remove target and pointers/iterators to interdependencies and thus we MUST conserve last ones while doing 1) as otherwise it would fraught our accounting. Where interdependency is defined as a ancestors/descendants relation.


    JeremyRubin commented at 1:08 am on August 4, 2020:
    Maybe we can clean up the documentation later? I agree this is confusing but it’s not made worse by changing it here.
  53. in src/txmempool.cpp:288 in 85753be752 outdated
    289-        // ancestors whose packages include this transaction, because when we
    290-        // add a new transaction to the mempool in addUnchecked(), we assume it
    291-        // has no children, and in the case of a reorg where that assumption is
    292-        // false, the in-mempool children aren't linked to the in-block tx's
    293-        // until UpdateTransactionsFromBlock() is called.
    294+        // of ancestors reachable via GetMemPoolParents()/GetMemPoolChildren()
    


    ariard commented at 0:50 am on August 4, 2020:
    I’m not sure if it makes sense to refer to GetMemPoolChildren. Again I understand this comment it’s saying that during a reorg mapTx entries won’t have their parent set correctly setup until UpdateTransactionsFromBlock thus relying on mapLinks to fetch the correct parent set, as this one is setup at addUnchecked.

    JeremyRubin commented at 1:10 am on August 4, 2020:
    I’m not really sure what you’re suggesting but I think the documentation can be cleaned up in the future.
  54. ariard commented at 0:59 am on August 4, 2020: member
    Concept ACK
  55. ariard commented at 1:10 pm on August 4, 2020: member
    @JeremyRubin Sure documentation improvements can happen in follow-ups, comments are just a way to track suggestions. I was mostly pointing places where switching names confused a bit further IMO but we can overhaul the whole documentation latter. Will review new ownership model soon.
  56. in src/txmempool.h:84 in 85753be752 outdated
    76@@ -63,8 +77,16 @@ struct LockPoints
    77 
    78 class CTxMemPoolEntry
    79 {
    80+public:
    81+    typedef std::reference_wrapper<const CTxMemPoolEntry> CTxMemPoolEntryRef;
    82+    // two aliases, should the types ever diverge
    83+    typedef std::set<CTxMemPoolEntryRef, CompareIteratorByHashGeneric> Parents;
    84+    typedef std::set<CTxMemPoolEntryRef, CompareIteratorByHashGeneric> Children;
    


    Empact commented at 5:07 pm on September 2, 2020:
    nit: By YAGNI, I would weigh against preemptive definitions

    JeremyRubin commented at 6:52 pm on September 2, 2020:

    I was doing some experiments where they diverge :)

    Parents are fundamentally fixed by the TX (e.g., named by TXID) so there are some opportunities for eliminating the std::set based on that, whereas children kind of have to remain a std::set, but can potentially be turned into a pointer into MapNextTx.

  57. in src/txmempool.h:155 in 85753be752 outdated
    148@@ -127,6 +149,11 @@ class CTxMemPoolEntry
    149     CAmount GetModFeesWithAncestors() const { return nModFeesWithAncestors; }
    150     int64_t GetSigOpCostWithAncestors() const { return nSigOpCostWithAncestors; }
    151 
    152+    const Parents& GetMemPoolParentsConst() const { return m_parents; }
    153+    const Children& GetMemPoolChildrenConst() const { return m_children; }
    154+    Parents& GetMemPoolParents() const { return m_parents; }
    155+    Children& GetMemPoolChildren() const { return m_children; }
    


    Empact commented at 5:09 pm on September 2, 2020:
    nit: It seems redundant to offer two definitions for these getters - constness can be enforced by the target variable, no?

    JeremyRubin commented at 6:44 pm on September 2, 2020:

    nope it can’t really – e.g., if I change the type of the local binding it will automatically pick the correct const/non const version, but I want to call enforcing that it can’t be used non-const.

    see begin/cbegin https://en.cppreference.com/w/cpp/container/vector/begin which exists for the same purpose.


    Empact commented at 11:37 pm on September 2, 2020:
    Ok I took a crack at making an example of this but found it conflicted with the indexed_transaction_set constraints, which force const on the stored value. So this is fine by me.

    MarcoFalke commented at 9:14 am on September 6, 2020:

    nit in commit 46d955d196043cc297834baeebce31ff778dff80: For the “mutating” getters, I’d prefer to suffix them with Mut (or similar) and remove the suffix from the “const” getters. The mutating ones are only called twice, whereas the read-only getters are called more often, so this would reduce verbosity. Also, safe languages (like rust) treat the default const and otherwise require mutable to be specified.

    0    Children& GetMemPoolChildrenMut() const { return m_children; }
    

    JeremyRubin commented at 3:46 pm on September 6, 2020:

    It’s C++ convention that the const one has a const-y name (e.g., begin -> cbegin). I opted to suffix with const rather than CGet… because we use the hungarian notation everywhere so it would look like a class.

    Future work can also get rid of the mutating getters use entirely by moving UpdateChild into the CTXMemPoolEntry itself. This isn’t done here as it is a bigger refactor (requires changing the interfaces) and has no functional benefit.


    MarcoFalke commented at 8:54 am on September 7, 2020:
    Makes sense. Closing discussion
  58. in src/txmempool.h:65 in 85753be752 outdated
    60+    template <typename T>
    61+    bool operator()(const T& a, const T& b) const
    62+    {
    63+        return a->GetTx().GetHash() < b->GetTx().GetHash();
    64+    }
    65+};
    


    Empact commented at 5:10 pm on September 2, 2020:
    nit: Could just call this CompareIteratorByHash given there’s no alternative definition.

    JeremyRubin commented at 6:55 pm on September 2, 2020:

    Fair – I think when i originally wrote the generic class, I didn’t delete the prior class definition, but realized I could with the SFINAE at some point.

    If I have to make other changes, can adjust the name.


    JeremyRubin commented at 6:48 pm on September 3, 2020:
    adjusted.
  59. laanwj commented at 10:58 am on September 3, 2020: member

    I get the following error while building (on top of master), I think it must be a silent merge conflict with another recent change:

    0//bitcoin/src/net_processing.cpp:1778:69: error: no member named 'GetMemPoolParents' in 'CTxMemPool'
    1                    const CTxMemPool::setEntries& parents = mempool.GetMemPoolParents(*txiter);
    2                                                            ~~~~~~~ ^
    31 error generated.
    4make[2]: *** [Makefile:11736: libbitcoin_server_a-net_processing.o] Error 1
    
  60. JeremyRubin commented at 4:49 pm on September 3, 2020: contributor

    ugh will fix; this should become txiter->GetMemPoolParentsConst();

    hopefully the only place…

  61. JeremyRubin force-pushed on Sep 3, 2020
  62. JeremyRubin commented at 8:38 pm on September 3, 2020: contributor

    Fixed merge issue!

    Test failure is unrelated (one failing build on an unrelated wallet test). Leaving the failed test up to permit others to inspect, may be related to #19853 according to @achow101.

  63. jonatack commented at 9:54 pm on September 3, 2020: member
    re-ACK 3ba1665a95ae8e7140aedc54f8cebacb88ce63e1
  64. jonatack commented at 9:58 pm on September 3, 2020: member
    The wallet_basic.py --descriptors Travis test failure is unrelated. The test passes for me locally on this branch and I’ve seen the same failure, in one Travis job only, in other PRs reviewed today.
  65. hebasto approved
  66. hebasto commented at 6:50 am on September 4, 2020: member
    re-ACK 3ba1665a95ae8e7140aedc54f8cebacb88ce63e1, only rebased and renamed s/CompareIteratorByHashGeneric/CompareIteratorByHash/ since my previous review.
  67. DrahtBot added the label Needs rebase on Sep 4, 2020
  68. Remove mapLinks in favor of entry inlined structs with iterator type erasure 46d955d196
  69. Get rid of unused functions CTxMemPool::GetMemPoolChildren, CTxMemPool::GetMemPoolParents 296be8f58e
  70. JeremyRubin force-pushed on Sep 4, 2020
  71. JeremyRubin commented at 4:51 pm on September 4, 2020: contributor
    Since #19854 is merged, this conflicts and has been rebased. Apologies to @jonatack and @hebasto, but mind re-slapping an ACK on this when tests pass? diff is just adding AssertLockHeld(cs) to UpdateChild/UpdateParent, which would not cleanly merge with #19854.
  72. hebasto approved
  73. hebasto commented at 6:01 pm on September 4, 2020: member
    re-ACK 296be8f58e02b39a58f017c52294aceed22c3ffd, only rebased since my previous review (verified with git range-diff).
  74. DrahtBot removed the label Needs rebase on Sep 4, 2020
  75. jonatack commented at 7:38 pm on September 4, 2020: member
    re-ACK 296be8f per git range-diff ab338a19 3ba1665 296be8f, sanity check gcc 10.2 debug build is clean.
  76. in src/txmempool.cpp:201 in 46d955d196 outdated
    195@@ -195,13 +196,15 @@ bool CTxMemPool::CalculateMemPoolAncestors(const CTxMemPoolEntry &entry, setEntr
    196             return false;
    197         }
    198 
    199-        const setEntries & setMemPoolParents = GetMemPoolParents(stageit);
    200-        for (txiter phash : setMemPoolParents) {
    201+        const CTxMemPoolEntry::Parents& parents = stageit->GetMemPoolParentsConst();
    202+        for (const CTxMemPoolEntry& parent : parents) {
    203+            txiter parent_it = mapTx.iterator_to(parent);
    


    MarcoFalke commented at 9:24 am on September 6, 2020:
    style-nit: Any reason to add an alias that is only used once?

    JeremyRubin commented at 3:52 pm on September 6, 2020:
    I believe I consistently use this style as I find it is more readable to not use the result of a function call on the right hand side of a range.
  77. in src/txmempool.cpp:90 in 46d955d196 outdated
    101-    for (txiter cit : setAllDescendants) {
    102-        if (!setExclude.count(cit->GetTx().GetHash())) {
    103-            modifySize += cit->GetTxSize();
    104-            modifyFee += cit->GetModifiedFee();
    105+    for (const CTxMemPoolEntry& descendant : descendants) {
    106+        if (!setExclude.count(descendant.GetTx().GetHash())) {
    


    MarcoFalke commented at 9:27 am on September 6, 2020:

    nit: in this case the diff would be smaller if an alias was used.

    0        txiter cit = mapTx.iterator_to(descendant);
    1        if (!setExclude.count(cit->GetTx().GetHash())) {
    

    JeremyRubin commented at 4:00 pm on September 6, 2020:
    this code is imminently going to be rewritten, so it doesn’t matter that much. I’m not sure if optimizing for small diff matters that much as the name “cit” kinda sucks anyways.
  78. in src/txmempool.cpp:181 in 46d955d196 outdated
    179     size_t totalSizeWithAncestors = entry.GetTxSize();
    180 
    181-    while (!parentHashes.empty()) {
    182-        txiter stageit = *parentHashes.begin();
    183+    while (!staged_ancestors.empty()) {
    184+        const CTxMemPoolEntry& stage = staged_ancestors.begin()->get();
    


    MarcoFalke commented at 9:29 am on September 6, 2020:

    nit: Should be able to decay by itself?

    0        const CTxMemPoolEntry& stage = *staged_ancestors.begin();
    
  79. in src/txmempool.cpp:58 in 46d955d196 outdated
    54@@ -55,45 +55,45 @@ size_t CTxMemPoolEntry::GetTxSize() const
    55 }
    56 
    57 // Update the given tx for any in-mempool descendants.
    58-// Assumes that setMemPoolChildren is correct for the given tx and all
    59+// Assumes that CTxMemPool::m_children is correct for the given tx and all
    


    MarcoFalke commented at 9:30 am on September 6, 2020:
    0// Assumes that CTxMemPoolEntry::m_children is correct for the given tx and all
    

    JeremyRubin commented at 5:51 pm on September 6, 2020:
    Ah good catch on the m_children.
  80. in src/txmempool.cpp:122 in 46d955d196 outdated
    118@@ -119,7 +119,7 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashes
    119     // Iterate in reverse, so that whenever we are looking at a transaction
    120     // we are sure that all in-mempool descendants have already been processed.
    121     // This maximizes the benefit of the descendant cache and guarantees that
    122-    // setMemPoolChildren will be updated, an assumption made in
    123+    // CTxMemPool::m_children will be updated, an assumption made in
    


    MarcoFalke commented at 9:32 am on September 6, 2020:
    same
  81. in src/txmempool.cpp:131 in 46d955d196 outdated
    127@@ -128,8 +128,8 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashes
    128             continue;
    129         }
    130         auto iter = mapNextTx.lower_bound(COutPoint(hash, 0));
    131-        // First calculate the children, and update setMemPoolChildren to
    132-        // include them, and update their setMemPoolParents to include this tx.
    133+        // First calculate the children, and update CTxMemPool::m_children to
    


    MarcoFalke commented at 9:32 am on September 6, 2020:
    same
  82. in src/txmempool.cpp:263 in 46d955d196 outdated
    259@@ -257,9 +260,9 @@ void CTxMemPool::UpdateForRemoveFromMempool(const setEntries &entriesToRemove, b
    260         // updateDescendants should be true whenever we're not recursively
    261         // removing a tx and all its descendants, eg when a transaction is
    262         // confirmed in a block.
    263-        // Here we only update statistics and not data in mapLinks (which
    264-        // we need to preserve until we're finished with all operations that
    265-        // need to traverse the mempool).
    266+        // Here we only update statistics and not data in CTxMemPool::Parents
    


    MarcoFalke commented at 9:32 am on September 6, 2020:
    same
  83. in src/txmempool.h:84 in 46d955d196 outdated
    76@@ -63,8 +77,16 @@ struct LockPoints
    77 
    78 class CTxMemPoolEntry
    79 {
    80+public:
    81+    typedef std::reference_wrapper<const CTxMemPoolEntry> CTxMemPoolEntryRef;
    82+    // two aliases, should the types ever diverge
    83+    typedef std::set<CTxMemPoolEntryRef, CompareIteratorByHash> Parents;
    84+    typedef std::set<CTxMemPoolEntryRef, CompareIteratorByHash> Children;
    


    MarcoFalke commented at 10:18 am on September 6, 2020:

    nit: Can use C++11 :sweat_smile:

    0    using Children = std::set<CTxMemPoolEntryRef, CompareIteratorByHash>;
    

    jonatack commented at 10:47 am on September 6, 2020:
    Good review comments above. I’ll point out, though, that “typedef” seems much better for git grepping the codebase than “using” (yields more useful results).

    MarcoFalke commented at 10:55 am on September 6, 2020:
    git grep can also do regex, so a git grep typedef would be git grep --extended-regexp 'using \w+ =', and we are already using using quite extensively ;)

    JeremyRubin commented at 5:52 pm on September 6, 2020:
    i don’t have any preference.
  84. MarcoFalke approved
  85. MarcoFalke commented at 10:27 am on September 6, 2020: member

    Any reason the commits aren’t squashed? I left some style nits, but feel free to ignore the stupid ones. Though, it would be good to fixup the documentation ones. Also those:

    0git grep --extended-regexp '(setMemPool|mapLinks)'
    

    Also checked that the dynamic size doesn’t increase:

     0diff --git a/src/bench/mempool_stress.cpp b/src/bench/mempool_stress.cpp
     1index 89233e390c..7b07f7723e 100644
     2--- a/src/bench/mempool_stress.cpp
     3+++ b/src/bench/mempool_stress.cpp
     4@@ -86,6 +86,8 @@ static void ComplexMemPool(benchmark::Bench& bench)
     5         for (auto& tx : ordered_coins) {
     6             AddTx(tx, pool);
     7         }
     8+        std::cout << pool.DynamicMemoryUsage() << std::endl;
     9+        assert(false);
    10         pool.TrimToSize(pool.DynamicMemoryUsage() * 3 / 4);
    11         pool.TrimToSize(GetVirtualTransactionSize(*ordered_coins.front()));
    12     });
    
    0Before: 3499728
    1After:  3442128
    

    a76734e4471002f0643bef86fbbea9a2da187f4e

  86. JeremyRubin commented at 1:17 am on September 7, 2020: contributor

    @MarcoFalke do you have a preference for how some of these are handled? I think because it’s all non-functional style related or documentary concerns it would be OK to do post-merge as there’s a lot of other pending work that will touch/redo these areas eventually in any case.

    as to why it is 2 commits and not one, it used to be more commits, and then I slimmed it down. It previously made more sense as the prior patches were something like: 1) Introduce new API 2) Use New API 3) Delete old API, but 1 + 2 got squashed by request since there were a couple issues with the API changing the underlying return type making patch 1+2 fail to compile (even though readable). I don’t care strongly, but have a mild preference for not burning out the extant reviewers.

  87. MarcoFalke commented at 6:22 am on September 7, 2020: member
    No preference on how those are handled, but re-ACKing some doc fixups should cost as much (or as little) review effort as creating a new commit/pull and getting that reviewed, no?
  88. JeremyRubin commented at 9:59 am on September 7, 2020: contributor

    No because there’s already outstanding work (I have like something in the range of 40-70 other commits staged since last year for the mempool project) that is going to touch these areas and cause me to have to rewrite some of the documentation anyways before they are PR ready. I’m not making a separate docs only fixup.

    So for the most part the style changes and docs changes are already going to have to be re-reviewed when those follow on changes land.

    Originally I intended not to update much documentation along the way (except where critical) because the mempool documentation is already very stale, but to honor some requests to improve it incrementally I added more stuff here. This points to why “just fixing up the docs” isn’t really an option, they’re so stale that any effort to improve them is inevitably going to bikeshed more because they’re unclear and people want something easier to read.

  89. MarcoFalke merged this on Sep 7, 2020
  90. MarcoFalke closed this on Sep 7, 2020

  91. MarcoFalke commented at 10:11 am on September 7, 2020: member
    I wouldn’t call that bikeshedding. Referring to CTxMemPoolEntry::m_children with CTxMemPool::m_children is objectively wrong and someone will create pull requests to fix it. But if the documentation/code goes away later then it might not be worth it to spend much time on it. Though, for reference it would be nice if you could link to pull requests that rework/remove the docs.
  92. JeremyRubin commented at 4:14 pm on September 7, 2020: contributor

    Thanks for the merge – if someone wants to clean up the docs that’s fine.

    Yes, the docs were objectively wrong, I probably fucked up the autocomplete when changing it, and if someone wants to fix it before I get around to it that’s dandy. Just noting that the prior name – setMemPoolChildren – was wrong (no such member existed) for like 5 years and no one fixed it (and even more wrong – there was no field anywhere), so I sort of doubt anyone is actually reading the docs to learn how the code works anyways.

    It needs a decent amount of rebasing and extracting specific chunks but I’m picking out patches from #17268 and https://github.com/JeremyRubin/bitcoin/pull/5. As I do that, a lot of the function signatures will change which is a good time to update the docs to be more coherent.

  93. MarcoFalke commented at 4:25 pm on September 7, 2020: member

    For my own reference, the broken docs are

    0git grep --extended-regexp '(setMemPool|mapLinks|CTxMemPool::(Parents|m_children))'
    
  94. sidhujag referenced this in commit ff51cbf631 on Sep 9, 2020
  95. meshcollider removed this from the "Blockers" column in a project

  96. kiminuo referenced this in commit c6d25db7ee on Mar 14, 2021
  97. kiminuo referenced this in commit a1f734847a on Mar 14, 2021
  98. kiminuo referenced this in commit d8a942cd19 on Mar 14, 2021
  99. kiminuo referenced this in commit 09adbcc01a on Mar 14, 2021
  100. kiminuo referenced this in commit 106f9bb0ab on Mar 23, 2021
  101. kiminuo referenced this in commit e54a411c32 on Sep 2, 2021
  102. Fabcien referenced this in commit 957ddb12b3 on Sep 27, 2021
  103. DrahtBot locked this on Feb 15, 2022
  104. hebasto commented at 7:59 pm on October 8, 2022: member

    I wouldn’t call that bikeshedding. Referring to CTxMemPoolEntry::m_children with CTxMemPool::m_children is objectively wrong and someone will create pull requests to fix it. But if the documentation/code goes away later then it might not be worth it to spend much time on it. Though, for reference it would be nice if you could link to pull requests that rework/remove the docs.

    Fixed in #26281.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-20 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me