[refactor] rewrite DisconnectedBlockTransactions to not use boost #28385

pull glozow wants to merge 6 commits into bitcoin:master from glozow:2023-09-unboost-disconnectedtxns-listonly changing 9 files +319 −129
  1. glozow commented at 3:52 pm on September 1, 2023: member

    Motivation

    • I think it’s preferable to use stdlib data structures instead of depending on boost if we can achieve the same thing.
    • Also see #28335 for further context/motivation. This PR simplifies that one.

    Things done in this PR:

    • Add a bench for DisconnectedBlockTransactions where we reorg and the new chain has {100%, 90%, 10%} of the same transactions. AFAIU in practice, it’s usually close to 100%.
    • Rewrite DisconnectedBlockTransactions as a std::list + unordered_map instead of a boost multi index container.
      • On my machine, the bench suggests the performance is very similar.
    • Move DisconnectedBlockTransactions from txmempool.h to its own kernel/disconnected_transactions.h. This struct isn’t used by txmempool and doesn’t have much to do with txmempool. My guess is that it’s been living there for convenience since the boost includes are there.
  2. glozow added the label Refactoring on Sep 1, 2023
  3. DrahtBot commented at 3:52 pm on September 1, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK stickies-v, TheCharlatan, ismaelsadeeq
    Concept ACK fanquake, instagibbs, theuni, hebasto

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  4. glozow force-pushed on Sep 1, 2023
  5. DrahtBot added the label CI failed on Sep 1, 2023
  6. glozow force-pushed on Sep 1, 2023
  7. DrahtBot removed the label CI failed on Sep 1, 2023
  8. in src/validation.cpp:300 in 0c356aaafa outdated
    300@@ -301,8 +301,8 @@ void Chainstate::MaybeUpdateMempoolForReorg(
    301     // Iterate disconnectpool in reverse, so that we add transactions
    


    TheCharlatan commented at 7:00 am on September 2, 2023:
    The comment here about insertion_order needs updating.

    glozow commented at 10:23 am on September 4, 2023:
    Done, thanks
  9. DrahtBot added the label CI failed on Sep 3, 2023
  10. glozow force-pushed on Sep 4, 2023
  11. glozow renamed this:
    [refactor] rewrite DisconnectedblockTransactions to not use boost, remove some external mapTx accesses
    [refactor] rewrite DisconnectedblockTransactions to not use boost
    on Sep 4, 2023
  12. glozow renamed this:
    [refactor] rewrite DisconnectedblockTransactions to not use boost
    [refactor] rewrite DisconnectedBlockTransactions to not use boost
    on Sep 4, 2023
  13. glozow commented at 10:23 am on September 4, 2023: member
    • Dropped last 2 commits since they’re now in #28391
    • Switched to the list + map implementation. I thought about it more over the weekend and it feels more similar to the current impl + would be more performant in the average case.
    • Addressed #28385 (review)
    • Improved the bench a little, reduced some duplicate code
  14. in src/validation.h:250 in 17c2f91fca outdated
    245+ * (and therefore better to not do in the middle of reorg-processing).
    246+ * Instead, store the disconnected transactions (in order!) as we go, remove any
    247+ * that are included in blocks in the new chain, and then process the remaining
    248+ * still-unconfirmed transactions at the end.
    249+ */
    250+struct DisconnectedBlockTransactions {
    


    maflcko commented at 12:56 pm on September 4, 2023:
    This is only used in validation.cpp, and one unit test, so I wonder if it makes sense to move it to a separate header?

    glozow commented at 12:59 pm on September 4, 2023:
    makes sense to me. kernel/disconnected_transactions.h?

    glozow commented at 1:13 pm on September 5, 2023:
    Done
  15. in src/validation.h:283 in 17c2f91fca outdated
    278+        // Short-circuit in the common case of a block being added to the tip
    279+        if (queuedTx.empty()) {
    280+            return;
    281+        }
    282+        // Create a set of all block txids.
    283+        std::unordered_set<uint256, SaltedTxidHasher> txids;
    


    stickies-v commented at 4:16 pm on September 4, 2023:

    nit: would suggest reserving to avoid rehashing the set

    0        std::unordered_set<uint256, SaltedTxidHasher> txids;
    1        txids.reserve(vtx.size());
    

    glozow commented at 1:11 pm on September 5, 2023:
    This line went away with the list + map approach so marking as resolved
  16. in src/validation.h:293 in 17c2f91fca outdated
    288+            auto it_next = std::next(it);
    289+            if (txids.count((*it)->GetHash()) > 0) {
    290+                cachedInnerUsage -= RecursiveDynamicUsage(*it);
    291+                queuedTx.erase(it);
    292+            }
    293+            it = it_next;
    


    stickies-v commented at 4:27 pm on September 4, 2023:

    nit: could avoid creating it_next

    0            if (txids.count((*it)->GetHash()) > 0) {
    1                cachedInnerUsage -= RecursiveDynamicUsage(*it);
    2                it = queuedTx.erase(it);
    3            } else {
    4                ++it;
    5            }
    

    stickies-v commented at 4:46 pm on September 4, 2023:

    Or actually, could use std::list::remove_if. Quickly conveys the intent of the code imo, and looks like it speeds up the bench a tiny bit, going from

    ns/op op/s err% total benchmark
    7,204,642.33 138.80 0.2% 0.87 AddAndRemoveDisconnectedBlockTransactions10
    5,293,310.64 188.92 0.4% 0.64 AddAndRemoveDisconnectedBlockTransactions100
    6,931,825.73 144.26 0.9% 0.84 AddAndRemoveDisconnectedBlockTransactions90

    to

    ns/op op/s err% total benchmark
    6,971,191.70 143.45 0.8% 0.84 AddAndRemoveDisconnectedBlockTransactions10
    4,991,280.27 200.35 0.7% 0.60 AddAndRemoveDisconnectedBlockTransactions100
    6,632,837.50 150.77 0.6% 0.80 AddAndRemoveDisconnectedBlockTransactions90
     0diff --git a/src/validation.h b/src/validation.h
     1index 3e1a024d09..c81331f950 100644
     2--- a/src/validation.h
     3+++ b/src/validation.h
     4@@ -283,15 +283,13 @@ struct DisconnectedBlockTransactions {
     5         std::unordered_set<uint256, SaltedTxidHasher> txids;
     6         std::transform(vtx.cbegin(), vtx.cend(), std::inserter(txids, txids.end()), [](const auto& tx) { return tx->GetHash(); });
     7         // Iterate through entire list once, removing any transactions in the block.
     8-        auto it = queuedTx.begin();
     9-        while (it != queuedTx.end()) {
    10-            auto it_next = std::next(it);
    11-            if (txids.count((*it)->GetHash()) > 0) {
    12-                cachedInnerUsage -= RecursiveDynamicUsage(*it);
    13-                queuedTx.erase(it);
    14+        queuedTx.remove_if([&](const CTransactionRef& tx) {
    15+            if (txids.count(tx->GetHash()) > 0) {
    16+                cachedInnerUsage -= RecursiveDynamicUsage(tx);
    17+                return true;
    18             }
    19-            it = it_next;
    20-        }
    21+            return false;
    22+        });
    23     }
    24 
    25     // Remove the earliest-inserted transaction.
    

    maflcko commented at 10:27 am on September 5, 2023:
    remove_if doesn’t change the size of the container, so your patch is incomplete, no?

    stickies-v commented at 10:42 am on September 5, 2023:
    queuedTx is not a vector but a list, so I think just removing it from the list is sufficient?

    maflcko commented at 10:49 am on September 5, 2023:

    Why would the container make a difference? If I print std::list::size(), it remains unchanged.

     0#include <algorithm>
     1#include <iostream>
     2#include <list>
     3  
     4int main()
     5{
     6    std::list<char> str1 {'T', ' ', 't'};
     7
     8 
     9    auto noSpaceEnd = std::remove(str1.begin(), str1.end(), ' ');
    10 
    11         std::cout  << " size: " << str1.size() << '\n';
    12
    13    str1.erase(noSpaceEnd, str1.end());
    14 
    15     std::cout  << " size: " << str1.size() << '\n';
    16 
    17  
    18}
    
    0 size: 3
    1
    2 size: 2
    

    maflcko commented at 11:02 am on September 5, 2023:
    Ah, sorry, you are using std::list::remove_if, not std::remove_if.

    stickies-v commented at 11:12 am on September 5, 2023:

    Yeah, I didn’t realize they behaved differently but with std::list::remove_if the items are indeed removed from the container, as opposed to just shifted:

     0#include <algorithm>
     1#include <iostream>
     2#include <list>
     3  
     4int main()
     5{
     6    std::list<char> str1 {'T', ' ', 't'};
     7
     8    str1.remove(' ');
     9         std::cout  << " size: " << str1.size() << '\n';
    10  
    11}
    
    0size: 2
    

    glozow commented at 1:11 pm on September 5, 2023:
    This line went away with the list + map approach so marking as resolved
  17. in src/validation.cpp:2726 in 17c2f91fca outdated
    2725@@ -2726,9 +2726,9 @@ bool Chainstate::DisconnectTip(BlockValidationState& state, DisconnectedBlockTra
    2726         }
    2727         while (disconnectpool->DynamicMemoryUsage() > MAX_DISCONNECTED_TX_POOL_SIZE * 1000) {
    


    stickies-v commented at 4:54 pm on September 4, 2023:
    Since the dynamic memory usage of queuedTx now changed, I think this is behaviour change as we’ll be dropping entries less frequently?

    glozow commented at 12:23 pm on September 5, 2023:
    Perhaps (I think it’s negligible compared to the transactions themselves), and would probably be an improvement.
  18. in src/validation.cpp:2730 in 17c2f91fca outdated
    2725@@ -2726,9 +2726,9 @@ bool Chainstate::DisconnectTip(BlockValidationState& state, DisconnectedBlockTra
    2726         }
    2727         while (disconnectpool->DynamicMemoryUsage() > MAX_DISCONNECTED_TX_POOL_SIZE * 1000) {
    2728             // Drop the earliest entry, and remove its children from the mempool.
    2729-            auto it = disconnectpool->queuedTx.get<insertion_order>().begin();
    2730-            m_mempool->removeRecursive(**it, MemPoolRemovalReason::REORG);
    2731-            disconnectpool->removeEntry(it);
    2732+            auto ptx = disconnectpool->queuedTx.front();
    


    stickies-v commented at 4:59 pm on September 4, 2023:
    Perhaps worth adding an assert(!disconnectpool->queuedTx.empty()) here? I know we already assume that’s the case because disconnectpool is non-nullptr and because of the dynamic memory usage check, but i think adding an assertion may make this more robust, since for an empty vector we have UB here and two lines down.

    glozow commented at 1:13 pm on September 5, 2023:
    Added in the while loop condition, feels a bit more robust than crashing
  19. in src/validation.cpp:305 in 17c2f91fca outdated
    300@@ -301,8 +301,8 @@ void Chainstate::MaybeUpdateMempoolForReorg(
    301     // Iterate disconnectpool in reverse, so that we add transactions
    302     // back to the mempool starting with the earliest transaction that had
    303     // been previously seen in a block.
    304-    auto it = disconnectpool.queuedTx.get<insertion_order>().rbegin();
    305-    while (it != disconnectpool.queuedTx.get<insertion_order>().rend()) {
    306+    auto it = disconnectpool.queuedTx.rbegin();
    


    stickies-v commented at 5:10 pm on September 4, 2023:
    this comment still needs updating

    glozow commented at 12:33 pm on September 5, 2023:
    Wow, I just realized I was pushing changes to the wrong branch 🤦 I have changed this now
  20. stickies-v commented at 5:24 pm on September 4, 2023: contributor
    Approach ACK, this approach with std::list seems elegant and reduces dependency on boost.
  21. fanquake commented at 10:18 am on September 5, 2023: member

    Concept ACK - great stuff.

    I think it’s preferable to use stdlib data structures instead of depending on boost if we can achieve the same thing.

    +1

  22. glozow force-pushed on Sep 5, 2023
  23. glozow force-pushed on Sep 5, 2023
  24. in src/validation.h:21 in 0409dbcc9d outdated
    17@@ -18,6 +18,7 @@
    18 #include <kernel/chainparams.h>
    19 #include <kernel/chainstatemanager_opts.h>
    20 #include <kernel/cs_main.h> // IWYU pragma: export
    21+#include <kernel/disconnected_transactions.h>
    


    TheCharlatan commented at 2:26 pm on September 5, 2023:
    I think this should be included in validation.cpp instead.

    glozow commented at 3:03 pm on September 5, 2023:
    moved
  25. in src/bench/disconnected_transactions.cpp:43 in 5a1233c098 outdated
    38+    for (uint32_t i = 0; i < num_txns; ++i) {
    39+        CMutableTransaction tx;
    40+        tx.vin.resize(1);
    41+        // We should get a different prevout hash every time. But just to be sure, change the index
    42+        // as well to ensure every tx has a different txid.
    43+        tx.vin.emplace_back(CTxIn{COutPoint{det_rand.rand256(), uint32_t(i + BLOCK_VTX_COUNT * unique_set_num)}});
    


    furszy commented at 2:27 pm on September 5, 2023:

    In 5a1233c0:

    Small idea: what about having an always increasing lock_time number?. It will ensure different tx ids and let you remove the unique_set_num as well as the fast RNG for the prevout hash which doesn’t seems to be used anywhere.

    E.g.

     0static BlockTxns CreateRandomTransactions(size_t num_txns)
     1{
     2    static uint32_t next_locktime{0};
     3
     4    BlockTxns txns;
     5    txns.reserve(num_txns);
     6    // Simplest spk for every tx
     7    CScript spk = CScript() << OP_TRUE;
     8    for (uint32_t i = 0; i < num_txns; ++i) {
     9        CMutableTransaction tx;
    10        tx.vin.resize(1);
    11        tx.vout.resize(1);
    12        tx.vout.emplace_back(CTxOut{CENT, spk});
    13        tx.nLockTime = next_locktime++;
    14        txns.emplace_back(MakeTransactionRef(tx));
    15    }
    16    return txns;
    17}
    

    glozow commented at 3:02 pm on September 5, 2023:
    Good point about using a static var. I’ve simplified this but instead of using locktime or prevout n, I’m just having each tx spend the previous one. Should be sufficient to ensure we don’t get any duplicates, so that gets rid of unique_set_num and the rng.
  26. in src/bench/disconnected_transactions.cpp:54 in 5a1233c098 outdated
    49+}
    50+
    51+/** Creates 2 blocks with BLOCK_VTX_COUNT transactions each. There will be num_not_shared
    52+ * transactions that are different, all other transactions the exact same. This is to simulate a
    53+ * reorg in which all but num_not_shared transactions are confirmed in the new chain. */
    54+static ReorgTxns CreateBlocks(TestChain100Setup& testing_setup, size_t num_not_shared)
    


    furszy commented at 2:28 pm on September 5, 2023:

    In In 5a1233c0:

    Unused testing_setup arg


    glozow commented at 3:03 pm on September 5, 2023:
    deleted
  27. TheCharlatan commented at 2:52 pm on September 5, 2023: contributor
    Approach ACK 0409dbcc9d5394d310da50e0ca326fb132b6d1e2
  28. glozow force-pushed on Sep 5, 2023
  29. fanquake commented at 3:14 pm on September 5, 2023: member
    @theuni you’ll probably want to take a look here.
  30. in src/txmempool.h:857 in 565247aec4 outdated
    872-            >
    873-        >
    874-    > indexed_disconnected_transactions;
    875+    uint64_t cachedInnerUsage = 0;
    876+    std::list<CTransactionRef> queuedTx;
    877+    using Queue = decltype(queuedTx);
    


    theuni commented at 3:33 pm on September 5, 2023:
    I see how this happened here, but calling this list a Queue is super confusing :)

    glozow commented at 1:38 pm on September 6, 2023:
    :sweat_smile: renamed to List
  31. in src/txmempool.h:872 in 565247aec4 outdated
    873-    // Estimate the overhead of queuedTx to be 6 pointers + an allocation, as
    874-    // no exact formula for boost::multi_index_contained is implemented.
    875     size_t DynamicMemoryUsage() const {
    876-        return memusage::MallocUsage(sizeof(CTransactionRef) + 6 * sizeof(void*)) * queuedTx.size() + cachedInnerUsage;
    877+        // std::list has 3 pointers per entry
    878+        return cachedInnerUsage + memusage::DynamicUsage(iters_by_txid) + memusage::MallocUsage(3 * sizeof(void*)) * queuedTx.size();
    


    theuni commented at 3:37 pm on September 5, 2023:
    Any reason not to add std::list to memusage.h instead?
  32. in src/txmempool.h:891 in 565247aec4 outdated
    899-            auto it = queuedTx.find(tx->GetHash());
    900-            if (it != queuedTx.end()) {
    901-                cachedInnerUsage -= RecursiveDynamicUsage(*it);
    902-                queuedTx.erase(it);
    903+        for (const auto& tx : vtx) {
    904+            auto iter = iters_by_txid.find(tx->GetHash());
    


    theuni commented at 4:05 pm on September 5, 2023:

    ~This could look a little cleaner using extract()~.

    0for (const auto& tx : vtx) {
    1    if (auto node = iters_by_txid.extract(tx->GetHash())) {
    2        auto& list_iter = node.mapped();
    3        cachedInnerUsage -= RecursiveDynamicUsage(*list_iter);
    4        queuedTx.erase(list_iter);
    5    }
    6}
    

    Edit: Ignore this. I benched and performance is worse.

  33. DrahtBot removed the label CI failed on Sep 5, 2023
  34. instagibbs commented at 5:07 pm on September 5, 2023: member

    approach ACK

    iiuc we’re looking at constant time access to oldest element, or any via txid, with constant time deletion/insertion for arbitrary elements. This accounts for deleting the oldest when busting allotted memory for the pool, or deleting transactions found in new blocks

  35. in src/txmempool.h:875 in 565247aec4 outdated
    876-        return memusage::MallocUsage(sizeof(CTransactionRef) + 6 * sizeof(void*)) * queuedTx.size() + cachedInnerUsage;
    877+        // std::list has 3 pointers per entry
    878+        return cachedInnerUsage + memusage::DynamicUsage(iters_by_txid) + memusage::MallocUsage(3 * sizeof(void*)) * queuedTx.size();
    879     }
    880 
    881     void addTransaction(const CTransactionRef& tx)
    


    theuni commented at 5:34 pm on September 5, 2023:

    Hmm, why no addTransactions(const std::vector<CTransactionRef>& vtx) to match removeForBlock() ?

    Is it because the tx order needs to be reversed?

    That would allow a iters_by_txid.reserve(vtx.size()); which gets me a ~10% speedup in the benches.

    Edit: removed snark after finding the comment I needed.

    Edit2: I suppose this is a potential optimization unrelated to this PR, but an addTransactionsForBlock(const CBlock&) would be unambiguous as to tx order if this is interesting.


    glozow commented at 1:41 pm on September 6, 2023:

    Hmm, why no addTransactions(const std::vector& vtx) to match removeForBlock() ?

    I had been trying to keep the diff as small as possible

    That would allow a iters_by_txid.reserve(vtx.size()); which gets me a ~10% speedup in the benches.

    This is worth it though, so I’ve done this now - thanks!

    Edit: removed snark after finding the comment I needed.

    Added a comment as I agree it wasn’t super obvious why we reverse vtx

  36. theuni commented at 5:47 pm on September 5, 2023: member

    Concept ACK, though I’m concerned about the public queuedTx staying in sync with the private iters_by_txid.

    For example: here queuedTx is cleared without clearing iters_by_txid. That looks like a bug to me?

  37. theuni commented at 5:53 pm on September 5, 2023: member
    Really appreciate the added bench for this btw. It makes experimenting with this much easier.
  38. theuni commented at 7:02 pm on September 5, 2023: member

    Concept ACK, though I’m concerned about the public queuedTx staying in sync with the private iters_by_txid.

    For example: here queuedTx is cleared without clearing iters_by_txid. That looks like a bug to me?

    Quick POC commit which adds full encapsulation here: https://github.com/theuni/bitcoin/commit/98c0b86d7cb8455e3252c264ec7133d1ebc2305a (only validation fixed up, not tests/benches).

  39. stickies-v commented at 11:22 am on September 6, 2023: contributor

    Quick POC commit which adds full encapsulation here: theuni@98c0b86 (only validation fixed up, not tests/benches).

    Oh yes, strong approach ACK for this (left a few comments on the branch), nice!

  40. theuni referenced this in commit 98c0b86d7c on Sep 6, 2023
  41. glozow force-pushed on Sep 6, 2023
  42. glozow force-pushed on Sep 6, 2023
  43. DrahtBot added the label CI failed on Sep 6, 2023
  44. glozow force-pushed on Sep 6, 2023
  45. theuni commented at 2:38 pm on September 6, 2023: member

    Quick POC commit which adds full encapsulation here: theuni@98c0b86 (only validation fixed up, not tests/benches).

    Oh yes, strong approach ACK for this (left a few comments on the branch), nice!

    After looking at that some more, I realized that the memory management was also not encapsulated. I took a quick run at a completely encapsulated DisconnectedBlockTransactions on that branch. One nice advantage there is that the max memory size is no longer hard-coded, so it should be possible to test/bench/fuzz much more effectively. (I switched to using CBlock for tx batches which was probably a mistake though, so just ignore that :). @glozow I know you wanted to keep the size of this diff down, so I suspect you won’t want to take a cleaned up version of that here. But IMO we should do something like it soon after this.

  46. glozow commented at 2:51 pm on September 6, 2023: member

    For example: here queuedTx is cleared without clearing iters_by_txid. That looks like a bug to me?

    Quick POC commit which adds full encapsulation here: https://github.com/theuni/bitcoin/commit/98c0b86d7cb8455e3252c264ec7133d1ebc2305a (only validation fixed up, not tests/benches).

    Definitely prefer this, thanks! It looks like we’ve been using the wrong clear() for a while :scream: https://github.com/bitcoin/bitcoin/pull/9208/files#diff-97c3a52bc5fad452d82670a7fd291800bae20c7bc35bb82686c2c0a4ea7b5b98R578

    After looking at that some more, I realized that the memory management was also not encapsulated. I took a quick run at a completely encapsulated DisconnectedBlockTransactions on that branch. One nice advantage there is that the max memory size is no longer hard-coded, so it should be possible to test/bench/fuzz much more effectively.

    @glozow I know you wanted to keep the size of this diff down, so I suspect you won’t want to take a cleaned up version of that here.

    I mean, this is pretty easy to add and I like it. I really don’t mind if people are willing to review :P I only tried to keep this small as I wasn’t sure if people would be interested in it.

  47. theuni commented at 2:54 pm on September 6, 2023: member

    I mean, this is pretty easy to add and I like it. I really don’t mind if people are willing to review :P I only tried to keep this small as I wasn’t sure if people would be interested in it.

    In that case, feel free to clean it up as you see fit and take it :)

    Don’t worry about trying to keep the commits, I was just poking around.

  48. DrahtBot removed the label CI failed on Sep 6, 2023
  49. in src/txmempool.h:894 in fc83729478 outdated
    899         // transactions (if any) are at the beginning. If this data structure grows too large, we
    900-        // will trim transactions from the front. Transactions with fewer dependencies should be
    901-        // removed first.
    902+        // will trim transactions from the front, and should avoid removing transactions that other
    903+        // ones depend on.
    904+        iters_by_txid.reserve(iters_by_txid.size() + vtx.size());
    


    theuni commented at 4:54 pm on September 6, 2023:
    Thanks for catching my problem here!
  50. in src/validation.cpp:319 in fc83729478 outdated
    316@@ -316,7 +317,6 @@ void Chainstate::MaybeUpdateMempoolForReorg(
    317         }
    318         ++it;
    319     }
    320-    disconnectpool.queuedTx.clear();
    


    theuni commented at 4:58 pm on September 6, 2023:

    I realize this came from my patch but..

    I think we still want to disconnectpool.clear() here just in case. Otherwise it’s potentially a change of behavior as we’re keeping these tx’s alive longer (until disconnectpool goes out of scope) than before.


    glozow commented at 6:17 pm on September 6, 2023:
    disconnectpool is cleared as part of take() though?

    theuni commented at 6:22 pm on September 6, 2023:
    Yes, sorry! I meant queuedTx.clear().

    glozow commented at 5:51 pm on September 7, 2023:
    Added a local scope for queuedTx to go out of
  51. in src/test/validation_chainstatemanager_tests.cpp:544 in fc83729478 outdated
    540@@ -541,7 +541,7 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_snapshot_init, SnapshotTestSetup)
    541     {
    542         LOCK2(::cs_main, bg_chainstate.MempoolMutex());
    543         BOOST_CHECK(bg_chainstate.DisconnectTip(unused_state, &unused_pool));
    544-        unused_pool.clear();  // to avoid queuedTx assertion errors on teardown
    545+        unused_pool.take();  // to avoid queuedTx assertion errors on teardown
    


    theuni commented at 5:09 pm on September 6, 2023:
    Why prefer take() here? Seems like that’s relying on the side-effect, as opposed to clear() which does the thing we want.

    glozow commented at 10:56 am on September 7, 2023:
    It just seemed like slightly more realistic usage but that doesn’t seem to be a concern here, so changed to clear() now.
  52. in src/txmempool.h:854 in fc83729478 outdated
    850@@ -851,28 +851,20 @@ class CCoinsViewMemPool : public CCoinsViewBacked
    851  * that are included in blocks in the new chain, and then process the remaining
    852  * still-unconfirmed transactions at the end.
    853  */
    854-
    855-// multi_index tag names
    856-struct txid_index {};
    857-struct insertion_order {};
    858-
    859 struct DisconnectedBlockTransactions {
    


    theuni commented at 5:11 pm on September 6, 2023:
    Nit: With private members and a ctor, I’d call this a class now. But that’s arbitrary and I don’t think we have any guidelines for it.

    glozow commented at 10:56 am on September 7, 2023:
    True, changed
  53. in src/kernel/disconnected_transactions.h:82 in a1f378adbe outdated
    79@@ -73,6 +80,15 @@ struct DisconnectedBlockTransactions {
    80             iters_by_txid.emplace((*block_it)->GetHash(), it);
    81             cachedInnerUsage += RecursiveDynamicUsage(*block_it);
    82         }
    83+
    84+        // Trim the earliest-added entries until we are within memory bounds.
    85+        while (!queuedTx.empty() && DynamicMemoryUsage() > m_max_mem_usage) {
    86+            evicted.emplace_back(queuedTx.front());
    


    theuni commented at 5:31 pm on September 6, 2023:
    This could potentially be done more efficiently by iterating to find the number to evict, then doing a single splice() rather than multiple emplace_back(). I doubt it’s worth the code complexity though.

    glozow commented at 5:48 pm on September 7, 2023:
    Didn’t end up doing this as it requires knowing how much the memory usage will change and I’m not sure if I’d implement it 100% accurately…
  54. theuni commented at 5:42 pm on September 6, 2023: member

    Looks good to me. My benchmarks show we’re a little slower in the 10% case, and a little faster in the others.

    My only hesitation is that as @stickies-v points out, this is likely to change mempool behavior somewhat because of the space difference. It seems like it could only be better, but the memusage trigger behavior has not tested as far as I can see. It’d be nice to have (read: bonus) some dead-simple tests for DisconnectedBlockTransactions::DynamicMemoryUsage()against hard-coded sizes before/after the boost swap, just as a sanity check that we’re calculating what we think we are.

  55. in src/kernel/disconnected_transactions.h:38 in a1f378adbe outdated
    33+private:
    34+    uint64_t cachedInnerUsage = 0;
    35+    const size_t m_max_mem_usage;
    36+    std::list<CTransactionRef> queuedTx;
    37+    using List = decltype(queuedTx);
    38+    std::unordered_map<uint256, List::iterator, SaltedTxidHasher> iters_by_txid;
    


    stickies-v commented at 9:38 am on September 7, 2023:

    nit: TxList seems like a more appropriate name. But I’d be equally happy just inlining it, just as readable imo.

    0    std::unordered_map<uint256, decltype(queuedTx)::iterator, SaltedTxidHasher> iters_by_txid;
    

    glozow commented at 5:43 pm on September 7, 2023:
    called it TxList
  56. glozow force-pushed on Sep 7, 2023
  57. glozow force-pushed on Sep 7, 2023
  58. DrahtBot added the label CI failed on Sep 7, 2023
  59. in src/kernel/disconnected_transactions.h:68 in 88dcf22abf outdated
    63+    // removeForBlock due to queuedTx containing an item without a corresponding entry in iters_by_txid.
    64+    // Returns vector of transactions that were evicted for size-limiting.
    65+    [[nodiscard]] std::vector<CTransactionRef> AddTransactionsFromBlock(const std::vector<CTransactionRef>& vtx)
    66+    {
    67+        std::vector<CTransactionRef> evicted;
    68+        // Blocks are disconnected in descending order by block height. Within each block, add
    


    stickies-v commented at 11:19 am on September 7, 2023:

    “Blocks are disconnected in descending order by block height.” I think this should be in the function docstring as it’s not an implementation detail but an important part of its interface? We expect the caller to:

    • call this function for the outermost/highest block first
    • keep vtx in the same order as they appear in the block (the reversion is an implementation detail and not relevant for this function’s docstring imo, although we do want to document take’s behaviour as per my other comment)
    • ensure all tx ids provided are unique, even across multiple calls

    (ideally using structured “pre” doxygen tags?)


    glozow commented at 5:44 pm on September 7, 2023:
    Added notes about order of queuedTx in the class description and stated that the callers should go descending order by block height in the comment for this function.
  60. in src/test/disconnected_transactions.cpp:83 in 88dcf22abf outdated
    78+        BOOST_CHECK_EQUAL(disconnectpool.DynamicMemoryUsage(), EXPECTED_USAGE_100);
    79+        disconnectpool.clear();
    80+    }
    81+
    82+    // Test DisconnectedBlockTransactions::DynamicMemoryUsage() for 1 transaction
    83+    const size_t EXPECTED_USAGE_1{800};
    


    glozow commented at 11:35 am on September 7, 2023:

    Note: DynamicMemoryUsage in the boost implementation was returning a fixed estimate (+80B) for usage per entry, so the “before” size here would be 720 for 1 tx, 72000 for 100 (I’m not sure how accurate this is but those are the numbers).

    This means the number has increased so we’d drop more stuff now than before (in the rare case of a multi-block reorg).

  61. glozow commented at 11:37 am on September 7, 2023: member

    It seems like it could only be better, but the memusage trigger behavior has not tested as far as I can see. It’d be nice to have (read: bonus) some dead-simple tests for DisconnectedBlockTransactions::DynamicMemoryUsage()against hard-coded sizes before/after the boost swap, just as a sanity check that we’re calculating what we think we are.

    Added some basic unit tests after the unboosting changes.

  62. in src/validation.cpp:302 in 88dcf22abf outdated
    299+    // disconnectpool sorts the entries from
    300+    // oldest to newest. The oldest entry will be the last tx from the
    301     // latest mined block that was disconnected.
    302     // Iterate disconnectpool in reverse, so that we add transactions
    303     // back to the mempool starting with the earliest transaction that had
    304     // been previously seen in a block.
    


    stickies-v commented at 1:15 pm on September 7, 2023:

    Maybe I’m misunderstanding things, but I find this language quite unintuitive and if my understanding is correct - misleading. Assume we have a chaintip at height 100, and we reorged the last 2 blocks, i.e. 100 and 99 (in that order) are disconnected.

    • “from oldest to newest”: I would call a tx in block 99 older than a tx in block 100, but disconnectpool.take() returns txs from block 100 before block 99
    • “the latest mined block that was disconnected” could mean (and this was my first intuition) “the latest block that was disconnected”, i.e. block 99, or “the highest of the block(s) that were disconnected”, i.e. the previous chaintip, which I think is what is meant here.

    Suggested rephrasing, which (imo) removes that ambiguity and also highlights the intent a bit more:

    0    // disconnectpool.take() returns entries first sorted by descending blockheight (i.e. txs from
    1    // the previous chaintip appear first) and then by reversed in-block order.
    2    // For example, if blocks 100 and 99 are disconnected, disconnectpool.take() returns 
    3    // [100:n, 100:n-1, ..., 100:0, 99:m, 99:m-1, ..., 99:0]
    4    // Iterate disconnectpool.take(( in reverse, so that we ensure first adding parent transactions
    5    // before children are added (if any).
    

    Actually, I think moving the first lines that describe the .take() return value to the DisconnectedBlockTransactions::take docstring makes probably more sense, and then just reference that here?


    glozow commented at 5:50 pm on September 7, 2023:
    Er, I found this a bit hard to parse too, so I’ve reworded the comment but a bit differently.
  63. in src/validation.cpp:304 in 88dcf22abf outdated
    317-        }
    318-        ++it;
    319-    }
    320-    disconnectpool.queuedTx.clear();
    321+    {
    322+        std::list<CTransactionRef> queuedTx = disconnectpool.take();
    


    stickies-v commented at 1:16 pm on September 7, 2023:

    nit: can be const

    0        const auto queuedTx = disconnectpool.take();
    

    glozow commented at 5:51 pm on September 7, 2023:
    done
  64. in src/kernel/disconnected_transactions.h:87 in 88dcf22abf outdated
    82+            evicted.emplace_back(queuedTx.front());
    83+            cachedInnerUsage -= RecursiveDynamicUsage(queuedTx.front());
    84+            iters_by_txid.erase(queuedTx.front()->GetHash());
    85+            queuedTx.pop_front();
    86+        }
    87+        return evicted;
    


    stickies-v commented at 1:34 pm on September 7, 2023:

    For readability, would you consider moving this to its own private function?

     0diff --git a/src/kernel/disconnected_transactions.h b/src/kernel/disconnected_transactions.h
     1index 6e12059874..d1a8b5e989 100644
     2--- a/src/kernel/disconnected_transactions.h
     3+++ b/src/kernel/disconnected_transactions.h
     4@@ -37,6 +37,20 @@ private:
     5     using List = decltype(queuedTx);
     6     std::unordered_map<uint256, List::iterator, SaltedTxidHasher> iters_by_txid;
     7 
     8+    // Trim and return the earliest-added entries until we are within memory bounds.
     9+    std::vector<CTransactionRef> EvictTransactionsIfNeeded()
    10+    {
    11+        std::vector<CTransactionRef> evicted;
    12+        while (!queuedTx.empty() && DynamicMemoryUsage() > m_max_mem_usage) {
    13+            evicted.emplace_back(queuedTx.front());
    14+            cachedInnerUsage -= RecursiveDynamicUsage(queuedTx.front());
    15+            iters_by_txid.erase(queuedTx.front()->GetHash());
    16+            queuedTx.pop_front();
    17+        }
    18+
    19+        return evicted;
    20+    }
    21+
    22 public:
    23     DisconnectedBlockTransactions(size_t max_mem_usage) : m_max_mem_usage{max_mem_usage} {}
    24 
    25@@ -64,7 +78,6 @@ public:
    26     // Returns vector of transactions that were evicted for size-limiting.
    27     [[nodiscard]] std::vector<CTransactionRef> AddTransactionsFromBlock(const std::vector<CTransactionRef>& vtx)
    28     {
    29-        std::vector<CTransactionRef> evicted;
    30         // Blocks are disconnected in descending order by block height. Within each block, add
    31         // transactions in reverse order so that transactions with dependencies on other
    32         // transactions (if any) are at the beginning. If this data structure grows too large, we
    33@@ -77,13 +90,7 @@ public:
    34             cachedInnerUsage += RecursiveDynamicUsage(*block_it);
    35         }
    36 
    37-        // Trim the earliest-added entries until we are within memory bounds.
    38-        while (!queuedTx.empty() && DynamicMemoryUsage() > m_max_mem_usage) {
    39-            evicted.emplace_back(queuedTx.front());
    40-            cachedInnerUsage -= RecursiveDynamicUsage(queuedTx.front());
    41-            iters_by_txid.erase(queuedTx.front()->GetHash());
    42-            queuedTx.pop_front();
    43-        }
    44+        const auto evicted{EvictTransactionsIfNeeded()};
    45         return evicted;
    46     }
    47 
    

    glozow commented at 5:48 pm on September 7, 2023:
    Sure, moved
  65. in src/kernel/disconnected_transactions.h:17 in 88dcf22abf outdated
    12+
    13+#include <list>
    14+#include <unordered_map>
    15+
    16+/** Maximum kilobytes for transactions to store for processing during reorg */
    17+static const unsigned int MAX_DISCONNECTED_TX_POOL_SIZE = 20000;
    


    stickies-v commented at 1:37 pm on September 7, 2023:

    nit

    0static const unsigned int MAX_DISCONNECTED_TX_POOL_SIZE = 20'000;
    

    glozow commented at 5:43 pm on September 7, 2023:
    Done
  66. in src/test/disconnected_transactions.cpp:21 in 88dcf22abf outdated
    16+    // transactions would realistically be in a block together, they just need distinct txids for
    17+    // this test to work.
    18+    std::vector<CTransactionRef> block_vtx(m_coinbase_txns);
    19+
    20+    size_t usage_full{0};
    21+    // DisconnectedBlockTransactions with a comfortable maximum memory usage s.t. nothing is evicted.
    


    stickies-v commented at 1:38 pm on September 7, 2023:

    nit: less abbreviations if not necessary

    0    // DisconnectedBlockTransactions with a comfortable maximum memory usage so that nothing is evicted.
    

    glozow commented at 5:49 pm on September 7, 2023:
    done
  67. stickies-v commented at 1:45 pm on September 7, 2023: contributor

    Approach re-ACK. Big fan of this new interface.

    Haven’t looked at the tests and memusage stuff properly yet but everything else LGTM. Will do full review once CI is green.

    Worth adding a kernel/disconnected_transactions.cpp? We’ve got a lot of implementations in the header now.

  68. [refactor] batch-add transactions to DisconnectedBlockTransactions
    No behavior change.
    In a future commit, we can optimize by reserving vtx.size().
    925bb723ca
  69. glozow force-pushed on Sep 7, 2023
  70. in src/validation.cpp:297 in a9e682194a outdated
    315-            vHashUpdate.push_back((*it)->GetHash());
    316-        }
    317-        ++it;
    318-    }
    319-    disconnectpool.queuedTx.clear();
    320+    {
    


    theuni commented at 6:14 pm on September 7, 2023:
    Yes!
  71. in src/txmempool.h:915 in a9e682194a outdated
    925+            auto iter = iters_by_txid.find(tx->GetHash());
    926+            if (iter != iters_by_txid.end()) {
    927+                auto list_iter = iter->second;
    928+                iters_by_txid.erase(iter);
    929+                cachedInnerUsage -= RecursiveDynamicUsage(*list_iter);
    930+                queuedTx.erase(list_iter);
    


    darosior commented at 2:21 pm on September 9, 2023:

    Here list_iter can never be deduced as a reference to iter->second which would then be erased right?

    An alternative to avoid having to ask this question could be to simply erase from iters_by_txid last:

    0cachedInnerUsage -= RecursiveDynamicUsage(*iter->second);
    1queuedTx.erase(iter->second);
    2iters_by_txid.erase(iter);
    

    glozow commented at 10:50 am on September 13, 2023:
    Doesn’t this invalidate iter, as it would point to something that’s since been erased from queuedTx?
  72. glozow force-pushed on Sep 11, 2023
  73. DrahtBot removed the label CI failed on Sep 11, 2023
  74. in src/memusage.h:158 in 9dcef47e4d outdated
    153+struct list_node : private X
    154+{
    155+private:
    156+    void* ptr_next;
    157+    void* ptr_prev;
    158+    void* ptr_val;
    


    stickies-v commented at 11:19 am on September 12, 2023:

    Currently, by inheriting from X, we’re accounting for the size of X (as this will be included in sizeof(list_node<X>). I think that makes sense - X is owned by the list.

    I’m not familiar with std::list implementations, but I’m not sure if accounting for both X as well as void* ptr_val makes sense. I think the node owning X is plenty, I don’t see the need for an additional pointer?

    Finally, by inheriting from private X we cannot use this on non-class/struct types, e.g. it wouldn’t work for std::list<int>. A suggestion to address both problems (and it seems the same approach was taken for e.g. stl_tree_node:

    0struct list_node
    1{
    2private:
    3    void* ptr_next;
    4    void* ptr_prev;
    5    X val;
    

    theuni commented at 5:27 pm on September 12, 2023:

    I was looking into this yesterday and arrived at both of the same conclusions.

    Poked at libstdc++’s impl with gdb:

    (gdb) print -pretty off -raw-values -- queuedTx ... _M_node = {<std::__detail::_List_node_base> = {_M_next = 0x55555556aeb0, _M_prev = 0x55555556aed0}, _M_size = 2}}} ...

    Which agrees with (on 64bit):

    0(gdb) print sizeof(queuedTx)
    1$7 = 24
    

    So it appears to me that in general a list will be: A structure with pointer to its head and tail nodes, plus an additional int for size. Plus for each node: a pointer to prev, next, and the data itself.

    I’m still working on confirming those numbers fully with an implementation though.


    glozow commented at 10:38 am on September 13, 2023:
    Thanks! I think this is correct (for future reference I’m looking at https://github.com/gcc-mirror/gcc/blob/releases/gcc-13.2.0/libstdc%2B%2B-v3/include/bits/stl_list.h#L232-L245), changed.
  75. in src/memusage.h:164 in 9dcef47e4d outdated
    159+};
    160+
    161+template<typename X>
    162+static inline size_t DynamicUsage(const std::list<X>& l)
    163+{
    164+    return MallocUsage(sizeof(list_node<X>)) * l.size();
    


    stickies-v commented at 11:22 am on September 12, 2023:

    Do we not also need to account for sizeof(l) for the list overhead?

    0    return MallocUsage(sizeof(l)) + MallocUsage(sizeof(list_node<X>)) * l.size();
    

    glozow commented at 8:26 am on September 13, 2023:
    I don’t think that part is dynamically allocated…?

    stickies-v commented at 8:38 am on September 13, 2023:
    Ah yes the hint is in the function name, woops 🙃
  76. in src/kernel/disconnected_transactions.h:53 in 9dcef47e4d outdated
    48+    {
    49+        std::vector<CTransactionRef> evicted;
    50+
    51+        while (!queuedTx.empty() && DynamicMemoryUsage() > m_max_mem_usage) {
    52+            evicted.emplace_back(queuedTx.front());
    53+            cachedInnerUsage -= RecursiveDynamicUsage(queuedTx.front());
    


    stickies-v commented at 12:15 pm on September 12, 2023:

    I think cachedInnerUsage shouldn’t account for the memory usage of the shared_ptr<CTransaction>, only for the CTransaction. We already account for the shared_ptr in DisconnectedBlockTransactions::DynamicMemoryUsage (as I think it should be).

     0diff --git a/src/kernel/disconnected_transactions.h b/src/kernel/disconnected_transactions.h
     1index 0c170e7a4b..a3bc77b6fc 100644
     2--- a/src/kernel/disconnected_transactions.h
     3+++ b/src/kernel/disconnected_transactions.h
     4@@ -50,7 +50,7 @@ private:
     5 
     6         while (!queuedTx.empty() && DynamicMemoryUsage() > m_max_mem_usage) {
     7             evicted.emplace_back(queuedTx.front());
     8-            cachedInnerUsage -= RecursiveDynamicUsage(queuedTx.front());
     9+            cachedInnerUsage -= RecursiveDynamicUsage(*queuedTx.front());
    10             iters_by_txid.erase(queuedTx.front()->GetHash());
    11             queuedTx.pop_front();
    12         }
    13@@ -91,7 +91,7 @@ public:
    14         for (auto block_it = vtx.rbegin(); block_it != vtx.rend(); ++block_it) {
    15             auto it = queuedTx.insert(queuedTx.end(), *block_it);
    16             iters_by_txid.emplace((*block_it)->GetHash(), it);
    17-            cachedInnerUsage += RecursiveDynamicUsage(*block_it);
    18+            cachedInnerUsage += RecursiveDynamicUsage(**block_it);
    19         }
    20         return LimitMemoryUsage();
    21     }
    22@@ -108,7 +108,7 @@ public:
    23             if (iter != iters_by_txid.end()) {
    24                 auto list_iter = iter->second;
    25                 iters_by_txid.erase(iter);
    26-                cachedInnerUsage -= RecursiveDynamicUsage(*list_iter);
    27+                cachedInnerUsage -= RecursiveDynamicUsage(**list_iter);
    28                 queuedTx.erase(list_iter);
    29             }
    30         }
    

    glozow commented at 11:04 am on September 13, 2023:
    I agree, changed (but in the rewrite commit instead of here)

    stickies-v commented at 3:15 pm on October 17, 2023:

    Revisiting this again, I think my suggestion was wrong and we’re now underestimating memory usage.

    In DisconnectedBlockTransactions::DynamicMemoryUsage, with memusage::DynamicUsage(queuedTx) we account only for the dynamic allocation of queuedTx, which is (2 ptrs + sizeof(CTransactionRef)) * size. This corresponds to 2+2 pointers per item, and does not include the managed CTransaction object or the control block.

    In AddTransactionsFromBlock (and other places where we modify cachedInnerUsage), when we calculate RecursiveDynamicUsage(**block_it) (i.e. on a CTransaction) as I suggested here, we only account for the vins and vouts of the transaction, and not for the CTransaction object nor the control block.

    So effectively, we do need to account for the CTransactionRef in both places, because in DynamicMemoryUsage we account for the 2 pointers, and in AddTransactionRef we account for the managed object and the control block. In the current implementation, we are not accounting for the CTransaction or the control block, even though everything is allocated on the heap.


    To illustrate, for a coinbase transaction, the difference in AddTransactionsFromBlock is:

    0(lldb) p (size_t) RecursiveDynamicUsage(*block_it)
    1(size_t) 640
    2(lldb) p (size_t) RecursiveDynamicUsage(**block_it)
    3(size_t) 448
    

    cc @ismaelsadeeq perhaps something to include in #28530?


    stickies-v commented at 10:47 am on October 18, 2023:
    Fixed in https://github.com/stickies-v/bitcoin/commit/87f3f28b3acfc329eaed0b0cce6ff589271bfefd - happy to open a separate pull if you prefer though.

    ismaelsadeeq commented at 12:12 pm on October 18, 2023:

    Thanks @stickies-v happy to add it to #28530, I f I understand you correctly, you mean that we are not accounting for the shared_ptr and the CTransaction object before only the vins and vouts.

    In DisconnectedBlockTransactions::DynamicMemoryUsage, with memusage::DynamicUsage(queuedTx) we account only for the dynamic allocation of queuedTx, which is (2 ptrs + sizeof(CTransactionRef)) * size. This corresponds to 2+2 pointers per item.

    I think this is correct. We only account for the vins and vouts, hence the CTransaction Object itself and the shared_ptr that makes up CTransactionRef are not accounted.

    With https://github.com/stickies-v/bitcoin/commit/87f3f28b3acfc329eaed0b0cce6ff589271bfefd we are now accounting for the shared_ptr CTransaction Object, its inputs and outputs and the dynamic allocation of queuedTx in DynamicMemoryUsage.

    because we pass shared_ptr which executes the below version of RecursiveDynamicUsage

    0template<typename X>
    1static inline size_t RecursiveDynamicUsage(const std::shared_ptr<X>& p) {
    2    return p ? memusage::DynamicUsage(p) + RecursiveDynamicUsage(*p) : 0;
    3}
    

    memusage::DynamicUsage(p) accounts for the size of p which is CTransaction and then accounts for the shared pointer size.

    It is unclear to me what

    does not include the managed CTransaction object or the control block.

    mean

    Do you mean the shared_ptr and CTransaction Object ?


    stickies-v commented at 12:31 pm on October 18, 2023:

    Do you mean the shared_ptr and CTransaction Object ?

    No. When instantiating a shared_ptr<CTransaction>, typically we would allocate:

    1. 2 raw pointers on the stack
    2. an instance of CTransaction and a control block on the heap

    This is why memusage::DynamicUsage(const std::shared_ptr<X>& p) does not include 2 raw pointers in its calculation. However, since in our case the shared_ptrs are managed by a vector, the 2 raw pointers (from 1.) are also allocated on the heap, and we account for this in DisconnectedBlockTransactions::DynamicMemoryUsage when calling memusage::DynamicUsage(queuedTx).


    ismaelsadeeq commented at 1:30 pm on October 18, 2023:

    Thanks for the explanation.

    So In memusage::DynamicUsage(const std::shared_ptr<X>& p),

    • memusage::DynamicUsage(p) accounts for the CTransaction Class and the control block.
    • RecursiveDynamicUsage(*p) accounts for the vins and vouts,
    • whereas the two raw pointers are being accounted in cachedInnerUsage.

    Will add https://github.com/stickies-v/bitcoin/commit/87f3f28b3acfc329eaed0b0cce6ff589271bfefd in next push


    stickies-v commented at 1:38 pm on October 18, 2023:

    RecursiveDynamicUsage(*p) accounts for the vins and vouts,

    We don’t want to call RecursiveDynamicUsage(*p) anymore, only RecursiveDynamicUsage(p) as per my suggested fix

    whereas the two raw pointers are being accounted in cachedInnerUsage.

    Not quite: the raw pointers are accounted for in DisconnectedBlockTransactions::DynamicMemoryUsage when calling memusage::DynamicUsage(queuedTx).


    ismaelsadeeq commented at 1:42 pm on October 18, 2023:

    Yeah But calling RecursiveDynamicUsage(p) means calling

    0template<typename X>
    1static inline size_t RecursiveDynamicUsage(const std::shared_ptr<X>& p) {
    2    return p ? memusage::DynamicUsage(p) + RecursiveDynamicUsage(*p) : 0;
    3}
    

    which brings to my point that

    • memusage::DynamicUsage(p) accounts for the CTransaction Class and the control block.
    • RecursiveDynamicUsage(*p) accounts for the vins and vouts,

    Not quite: the raw pointers are accounted for in DisconnectedBlockTransactions::DynamicMemoryUsage when calling memusage::DynamicUsage(queuedTx).

    Yes.

  77. in src/kernel/disconnected_transactions.h:43 in 9dcef47e4d outdated
    35+ * of the list. After trimming, transactions can be re-added to the mempool from the back of the
    36+ * list to the front without running into missing inputs.
    37+ */
    38+class DisconnectedBlockTransactions {
    39+private:
    40+    uint64_t cachedInnerUsage = 0;
    


    stickies-v commented at 12:24 pm on September 12, 2023:

    nit: A docstring would be helpful I think:

    0    //! Memory usage of the CTransaction objects to which queuedTx holds references
    1    uint64_t cachedInnerUsage = 0;
    

    glozow commented at 11:03 am on September 13, 2023:
    Added
  78. stickies-v commented at 12:36 pm on September 12, 2023: contributor
    Review on 9dcef47e4d52b15ffe88be7ddf5ea626121e5f0d. Mostly focused on the memory usage stuff in this review round, because I found it interesting and some things stood out to me. I know it’s not core to this PR and that it’s not critical for the calculations to be 100% accurate, so I’m okay dealing with this in a follow-up too if this is slowing down progress - I haven’t seen any showstoppers.
  79. in src/bench/disconnected_transactions.cpp:39 in 9dcef47e4d outdated
    34+    txns.reserve(num_txns);
    35+    // Simplest spk for every tx
    36+    CScript spk = CScript() << OP_TRUE;
    37+    for (uint32_t i = 0; i < num_txns; ++i) {
    38+        CMutableTransaction tx;
    39+        tx.vin.resize(1);
    


    stickies-v commented at 1:34 pm on September 12, 2023:
    What’s the purpose of these resizes? Do we want two vins and two vouts?

    glozow commented at 10:29 am on September 13, 2023:
    Removed
  80. in src/bench/disconnected_transactions.cpp:52 in 9dcef47e4d outdated
    47+    return txns;
    48+}
    49+
    50+/** Creates 2 blocks with BLOCK_VTX_COUNT transactions each. There will be num_not_shared
    51+ * transactions that are different, all other transactions the exact same. This is to simulate a
    52+ * reorg in which all but num_not_shared transactions are confirmed in the new chain. */
    


    stickies-v commented at 1:44 pm on September 12, 2023:

    nit

    0/** Creates 1 disconnected block and 2 connected blocks, each with BLOCK_VTX_COUNT transactions. Between the disconnected and the first connected block, there will be num_not_shared
    1 * transactions that are different, all other transactions the exact same. This is to simulate a
    2 * reorg in which all but num_not_shared transactions are confirmed in the new chain. */
    

    glozow commented at 10:32 am on September 13, 2023:
    (mostly) taken, some of the documentation lives on the Reorg definition.
  81. in src/bench/disconnected_transactions.cpp:92 in 9dcef47e4d outdated
    87+    assert(disconnectpool.size() == BLOCK_VTX_COUNT - reorg.num_shared);
    88+
    89+    disconnectpool.clear();
    90+}
    91+
    92+/** Add transactions from DisconnectedBlockTransactions, remove all of them, and then pop from the front until empty. */
    


    stickies-v commented at 1:59 pm on September 12, 2023:

    nit: docstrings seems to be from a previous version of this PR. Would suggest updating to e.g.:

    0/** Reorg scenario where all of the transactions in the newly connected block existed in the disconnected block. */
    

    glozow commented at 10:36 am on September 13, 2023:
    It’s the other way around actually, but added.

    stickies-v commented at 1:18 pm on September 13, 2023:

    What do you mean with the other way around?

    Still nit, but, I think the first part of the new description (i.e. “Add transactions from DisconnectedBlockTransactions, remove all but one (the disconnected block’s coinbase transaction) of them, and then pop from the front until empty.”) is too much of an implementation detail for this docstring. And the other 2 benches would probably benefit from a consistent docstring. So new suggestion:

    “Test DisconnectedBlockTransactions in a reorg in which {*all*, *90%*, *10%*} of the non-coinbase transactions in the disconnected chain also exist in the new chain.”


    glozow commented at 2:25 pm on September 13, 2023:

    What do you mean with the other way around?

    It’s not that all of the transactions in the newly connected block existed in the disconnected block; it’s that all of the transactions in the disconnected block are in the new connected chain. I assumed the suggestion was to add a “real world scenario” explanation for what’s being tested?


    stickies-v commented at 2:33 pm on September 13, 2023:

    Ah yes, subset and superset and all that. I see what you mean now, your phrasing is better indeed, mine assumes both blocks to be the same size (which they are, but still).

    I assumed the suggestion was to add a “real world scenario” explanation for what’s being tested?

    Yes, the “popping from the front” etc seems not useful to me here, describing what we’re testing much more so.

  82. in src/kernel/disconnected_transactions.h:130 in 9dcef47e4d outdated
    122+        iters_by_txid.clear();
    123+        queuedTx.clear();
    124+    }
    125+
    126+    /** Clear all data structures and return the list of transactions. */
    127+    std::list<CTransactionRef> take() {
    


    stickies-v commented at 2:11 pm on September 12, 2023:

    tidy nit

    0    std::list<CTransactionRef> take()
    1    {
    

    glozow commented at 12:04 pm on September 13, 2023:
    Done
  83. in src/txmempool.h:921 in a9e682194a outdated
    933     }
    934 
    935-    // Remove an entry by insertion_order index, and update memory usage.
    936-    void removeEntry(indexed_disconnected_transactions::index<insertion_order>::type::iterator entry)
    937+    /** Remove the first entry and update memory usage. */
    938+    CTransactionRef take_first()
    


    stickies-v commented at 2:33 pm on September 12, 2023:
    nit: I think adding take_first() and then removing it in the next non-move commit is not ideal and seems like it could be avoided by moving the memory management commit to come before the boost removal commit, but (I didn’t check) it’s probably not worth the rebase hell.

    glozow commented at 11:02 am on September 13, 2023:
    I agree it’s not worth it as it’s still an interface change + implementation change on separate commits, so resolving this.
  84. in src/validation.cpp:2728 in 9dcef47e4d outdated
    2732-            disconnectpool->removeEntry(it);
    2733+        // Save transactions to re-add to mempool at end of reorg. If any entries are dropped,
    2734+        // remove their descendants from the mempool.
    2735+        for (auto&& ptx : disconnectpool->AddTransactionsFromBlock(block.vtx)) {
    2736+            m_mempool->removeRecursive(*ptx, MemPoolRemovalReason::REORG);
    2737         }
    


    stickies-v commented at 2:37 pm on September 12, 2023:

    nit

    0        // Save transactions to re-add to mempool at end of reorg. If any entries are evicted for exceeding memory limits,
    1        // remove them and their descendants from the mempool.
    2        for (auto&& evicted_tx : disconnectpool->AddTransactionsFromBlock(block.vtx)) {
    3            m_mempool->removeRecursive(*evicted_tx, MemPoolRemovalReason::REORG);
    4        }
    

    glozow commented at 12:04 pm on September 13, 2023:
    Done
  85. in src/kernel/disconnected_transactions.h:50 in 9dcef47e4d outdated
    42+    std::list<CTransactionRef> queuedTx;
    43+    using TxList = decltype(queuedTx);
    44+    std::unordered_map<uint256, TxList::iterator, SaltedTxidHasher> iters_by_txid;
    45+
    46+    /** Trim the earliest-added entries until we are within memory bounds. */
    47+    std::vector<CTransactionRef> LimitMemoryUsage()
    


    stickies-v commented at 2:41 pm on September 12, 2023:
    nit: #include <vector>

    glozow commented at 11:04 am on September 13, 2023:
    added
  86. in src/test/disconnected_transactions.cpp:33 in 9dcef47e4d outdated
    28+    // Overhead for the hashmap depends on number of buckets
    29+    std::unordered_map<uint256, CTransaction*, SaltedTxidHasher> temp_map;
    30+    temp_map.reserve(2);
    31+    const size_t MAP_2{memusage::DynamicUsage(temp_map)};
    32+    temp_map.reserve(100);
    33+    const size_t MAP_100{memusage::DynamicUsage(temp_map)};
    


    stickies-v commented at 3:32 pm on September 12, 2023:
    I don’t think this is correct. reserve makes sure the buckets are allocated, but the map’s size remains unchanged. So MAP_100 significantly underestimates the dynamic usage I think?

    glozow commented at 11:07 am on September 13, 2023:
    This estimation was meant to be for overhead of just the hashmap (see comment above) rather than the full container. I was just trying to make the estimate more accurate as I couldn’t get there just by multiplying an entry usage x number of elements.
  87. in src/test/disconnected_transactions.cpp:48 in 9dcef47e4d outdated
    43+        // Just 2 transactions because the allocation depends on the number of txns passed in.
    44+        std::vector<CTransactionRef> two_txns({block_vtx.at(0), block_vtx.at(1)});
    45+        auto evicted_txns = disconnectpool.AddTransactionsFromBlock(two_txns);
    46+        BOOST_CHECK(disconnectpool.DynamicMemoryUsage() <= MAP_2 + ENTRY_USAGE_OVERESTIMATE);
    47+
    48+        // Only 1 transaction can be kept
    


    stickies-v commented at 3:34 pm on September 12, 2023:
    It’s not immediately obvious to me why this is expected behaviour, when we instantiated disconnectpool to the memory usage of a map of 2 + overhead. At the moment, I think this is the case because (as per my previous comment) the MAP_x vars don’t include the map nodes, but that seems unintuitive/unexpected?
  88. in src/test/disconnected_transactions.cpp:29 in 9dcef47e4d outdated
    24+
    25+    // Roughly estimate sizes to sanity check that DisconnectedBlockTransactions::DynamicMemoryUsage
    26+    // is within an expected range.
    27+
    28+    // Overhead for the hashmap depends on number of buckets
    29+    std::unordered_map<uint256, CTransaction*, SaltedTxidHasher> temp_map;
    


    stickies-v commented at 3:46 pm on September 12, 2023:

    Why not use CTransactionRef here so we can better estimate the shared_ptr overhead?

    0    std::unordered_map<uint256, CTransactionRef, SaltedTxidHasher> temp_map;
    

    glozow commented at 11:06 am on September 13, 2023:
    I felt that a raw pointer would be more accurate as the map is from txids to iterators

    stickies-v commented at 1:04 pm on September 13, 2023:
    Ah yes, you’re right.
  89. stickies-v commented at 3:51 pm on September 12, 2023: contributor

    Finished review at 9dcef47e4d52b15ffe88be7ddf5ea626121e5f0d.

    I’ll review the unit test again after the other underlying comments are addressed because I think it needs some changing, but other than that and outstanding comments I think I won’t have too many more comments coming up.

  90. [bench] DisconnectedBlockTransactions 59a35a7398
  91. add std::list to memusage 79ce9f0aa4
  92. glozow force-pushed on Sep 13, 2023
  93. glozow commented at 11:58 am on September 13, 2023: member

    Addressed comments and

    • Dropped the unit test commit as I can see how the estimations are too rough / unclear. Maybe we can follow up with more precise assertions on the expected size in a separate PR.
    • Changed the AddAndRemoveDisconnectedBlockTransactionsAll to have all but 1 (the coinbases) to be more realistic. This didn’t change the bench results for me - still slightly faster on *90 and *All, slightly slower on *10.
  94. rewrite DisconnectedBlockTransactions as a list + map
    And encapsulate underlying data structures to avoid misuse.
    It's better to use stdlib instead of boost when we can achieve the same thing.
    
    Behavior change: the number returned by DynamicMemoryUsage for the same
    transactions is higher (due to change in usage or more accurate
    accounting), which effectively decreases the maximum amount of
    transactions kept for resubmission in a reorg.
    
    Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>
    2765d6f343
  95. MOVEONLY: DisconnectedBlockTransactions to its own file
    This struct is only used in validation + tests and has very little to do
    with txmempool.
    cf5f1faa03
  96. make DisconnectedBlockTransactions responsible for its own memory management
    Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>
    4313c77400
  97. glozow force-pushed on Sep 13, 2023
  98. hebasto commented at 2:41 pm on September 13, 2023: member
    Concept ACK.
  99. stickies-v approved
  100. stickies-v commented at 5:09 pm on September 15, 2023: contributor

    ACK 4313c77400eb8eaa8586db39a7e29a861772ea80

    Very nice removal of boost dependency while improving the interface through encapsulation.

    My final (re-)consideration is that there’s a lot of new implementation code in header files (mostly kernel/disconnected_transactions.h and txmempool.h), which I think could be improved but also is not a blocker?

    (and also to be supernitty: a fair amount of new code not using list initialization, I’ve made the diff of things I could see here but not worth going through rebase adventures either, so feel very free to skip/pick from this)

     0diff --git a/src/bench/disconnected_transactions.cpp b/src/bench/disconnected_transactions.cpp
     1index d6f1590950..26d97ce30b 100644
     2--- a/src/bench/disconnected_transactions.cpp
     3+++ b/src/bench/disconnected_transactions.cpp
     4@@ -33,8 +33,8 @@ static BlockTxns CreateRandomTransactions(size_t num_txns)
     5     BlockTxns txns;
     6     txns.reserve(num_txns);
     7     // Simplest spk for every tx
     8-    CScript spk = CScript() << OP_TRUE;
     9-    for (uint32_t i = 0; i < num_txns; ++i) {
    10+    CScript spk{CScript() << OP_TRUE};
    11+    for (uint32_t i{0}; i < num_txns; ++i) {
    12         CMutableTransaction tx;
    13         tx.vin.emplace_back(CTxIn{COutPoint{prevout_hash, 0}});
    14         tx.vout.emplace_back(CTxOut{CENT, spk});
    15@@ -75,7 +75,7 @@ static void Reorg(const ReorgTxns& reorg)
    16 {
    17     DisconnectedBlockTransactions disconnectpool{MAX_DISCONNECTED_TX_POOL_SIZE * 1000};
    18     // Disconnect block
    19-    const auto evicted = disconnectpool.AddTransactionsFromBlock(reorg.disconnected_txns);
    20+    const auto evicted{disconnectpool.AddTransactionsFromBlock(reorg.disconnected_txns)};
    21     assert(evicted.empty());
    22 
    23     // Connect first block
    24diff --git a/src/kernel/disconnected_transactions.h b/src/kernel/disconnected_transactions.h
    25index 7db39ba5ca..3effae8333 100644
    26--- a/src/kernel/disconnected_transactions.h
    27+++ b/src/kernel/disconnected_transactions.h
    28@@ -15,7 +15,7 @@
    29 #include <vector>
    30 
    31 /** Maximum kilobytes for transactions to store for processing during reorg */
    32-static const unsigned int MAX_DISCONNECTED_TX_POOL_SIZE = 20'000;
    33+static const unsigned int MAX_DISCONNECTED_TX_POOL_SIZE{20'000};
    34 /**
    35  * DisconnectedBlockTransactions
    36 
    37@@ -40,7 +40,7 @@ class DisconnectedBlockTransactions {
    38 private:
    39     /** Cached dynamic memory usage for the CTransactions (memory for the shared pointers is
    40      * included in the container calculations). */
    41-    uint64_t cachedInnerUsage = 0;
    42+    uint64_t cachedInnerUsage{0};
    43     const size_t m_max_mem_usage;
    44     std::list<CTransactionRef> queuedTx;
    45     using TxList = decltype(queuedTx);
    46@@ -91,8 +91,8 @@ public:
    47     [[nodiscard]] std::vector<CTransactionRef> AddTransactionsFromBlock(const std::vector<CTransactionRef>& vtx)
    48     {
    49         iters_by_txid.reserve(iters_by_txid.size() + vtx.size());
    50-        for (auto block_it = vtx.rbegin(); block_it != vtx.rend(); ++block_it) {
    51-            auto it = queuedTx.insert(queuedTx.end(), *block_it);
    52+        for (auto block_it{vtx.rbegin()}; block_it != vtx.rend(); ++block_it) {
    53+            auto it{queuedTx.insert(queuedTx.end(), *block_it)};
    54             iters_by_txid.emplace((*block_it)->GetHash(), it);
    55             cachedInnerUsage += RecursiveDynamicUsage(**block_it);
    56         }
    57@@ -107,9 +107,9 @@ public:
    58             return;
    59         }
    60         for (const auto& tx : vtx) {
    61-            auto iter = iters_by_txid.find(tx->GetHash());
    62+            auto iter{iters_by_txid.find(tx->GetHash())};
    63             if (iter != iters_by_txid.end()) {
    64-                auto list_iter = iter->second;
    65+                auto list_iter{iter->second};
    66                 iters_by_txid.erase(iter);
    67                 cachedInnerUsage -= RecursiveDynamicUsage(**list_iter);
    68                 queuedTx.erase(list_iter);
    69@@ -129,7 +129,7 @@ public:
    70     /** Clear all data structures and return the list of transactions. */
    71     std::list<CTransactionRef> take()
    72     {
    73-        std::list<CTransactionRef> ret = std::move(queuedTx);
    74+        std::list<CTransactionRef> ret{std::move(queuedTx)};
    75         clear();
    76         return ret;
    77     }
    78diff --git a/src/validation.cpp b/src/validation.cpp
    79index a9a0912d2a..6be0d48f14 100644
    80--- a/src/validation.cpp
    81+++ b/src/validation.cpp
    82@@ -300,8 +300,8 @@ void Chainstate::MaybeUpdateMempoolForReorg(
    83         // Iterate disconnectpool in reverse, so that we add transactions
    84         // back to the mempool starting with the earliest transaction that had
    85         // been previously seen in a block.
    86-        const auto queuedTx = disconnectpool.take();
    87-        auto it = queuedTx.rbegin();
    88+        const auto queuedTx{disconnectpool.take()};
    89+        auto it{queuedTx.rbegin()};
    90         while (it != queuedTx.rend()) {
    91             // ignore validation errors in resurrected transactions
    92             if (!fAddToMempool || (*it)->IsCoinBase() ||
    
  101. in src/memusage.h:152 in 79ce9f0aa4 outdated
    148@@ -148,6 +149,21 @@ static inline size_t DynamicUsage(const std::shared_ptr<X>& p)
    149     return p ? MallocUsage(sizeof(X)) + MallocUsage(sizeof(stl_shared_counter)) : 0;
    150 }
    151 
    152+template<typename X>
    


    TheCharlatan commented at 7:12 pm on September 15, 2023:
    Nit (clang-format) in commit 79ce9f0aa46de8ff742be83fd6f68eab40e073ec: s/template</template </ - here and just below.
  102. in src/txmempool.h:860 in 2765d6f343 outdated
    855+ * The front of the list should be the most recently-confirmed transactions (transactions at the
    856+ * end of vtx of blocks closer to the tip). If memory usage grows too large, we trim from the front
    857+ * of the list. After trimming, transactions can be re-added to the mempool from the back of the
    858+ * list to the front without running into missing inputs.
    859  */
    860+class DisconnectedBlockTransactions {
    


    TheCharlatan commented at 7:15 pm on September 15, 2023:
    Nit (clang-format) in commit 2765d6f3434c101fe2d46e9313e540aa680fbd77, open brace on newline. Here and for the methods.
  103. in src/txmempool.h:895 in 925bb723ca outdated
    891@@ -892,10 +892,18 @@ struct DisconnectedBlockTransactions {
    892         return memusage::MallocUsage(sizeof(CTransactionRef) + 6 * sizeof(void*)) * queuedTx.size() + cachedInnerUsage;
    893     }
    894 
    895-    void addTransaction(const CTransactionRef& tx)
    896+	/** Add transactions from the block, iterating through vtx in reverse order. Callers should call
    


    TheCharlatan commented at 7:24 pm on September 15, 2023:

    Nit in commit 925bb723ca71aa76380b769d8926c7c2ad9bbb7b: This renders with a weird indent on my terminal:

    0	    /** Add transactions from the block, iterating through vtx in reverse order. Callers should call
    1     * this function for blocks in descending order by block height.
    

    It is fixed a few commits later, but that line should not need to be re-touched.

  104. in src/kernel/disconnected_transactions.h:18 in 4313c77400
    11@@ -12,7 +12,10 @@
    12 
    13 #include <list>
    14 #include <unordered_map>
    15+#include <vector>
    16 
    17+/** Maximum kilobytes for transactions to store for processing during reorg */
    18+static const unsigned int MAX_DISCONNECTED_TX_POOL_SIZE = 20'000;
    


    TheCharlatan commented at 7:42 am on September 16, 2023:
    Nit: It is just moved here, but everywhere this is used we multiply by 1000, so why define it in kilobytes in the first place?

    stickies-v commented at 8:16 am on September 16, 2023:
    Also, if you do this, how about replacing _SIZE with _BYTES ?

    glozow commented at 4:28 pm on September 22, 2023:
    Agree that the fact that it’s kb should be documented / baked in. Going to say out of scope for this PR. Should be easy to do in a followup.
  105. TheCharlatan approved
  106. TheCharlatan commented at 7:47 am on September 16, 2023: contributor

    ACK 4313c77400eb8eaa8586db39a7e29a861772ea80

    Nice improvements all around, no need to retouch on the nits.

  107. fanquake requested review from theuni on Sep 20, 2023
  108. ismaelsadeeq commented at 3:52 pm on September 21, 2023: member

    This is a nice improvement, works fine on my machine.

    Result of benchmark using boost multi index to store DisconnectedBlockTransactions.

    0|               ns/op |                op/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|          866,197.92 |            1,154.47 |    0.1% |      0.10 | `AddAndRemoveDisconnectedBlockTransactions10`
    3|          787,781.25 |            1,269.39 |    0.8% |      0.10 | `AddAndRemoveDisconnectedBlockTransactions90`
    4|          767,416.73 |            1,303.07 |    1.2% |      0.09 | `AddAndRemoveDisconnectedBlockTransactionsAll`
    

    After switching to using c++ stdlib list and unordered_map to store DisconnectedBlockTransactions head 4313c77400eb8eaa8586db39a7e29a861772ea80

    0|               ns/op |                op/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|          856,820.80 |            1,167.11 |    0.2% |      0.10 | `AddAndRemoveDisconnectedBlockTransactions10`
    3|          774,020.80 |            1,291.95 |    0.2% |      0.09 | `AddAndRemoveDisconnectedBlockTransactions90`
    4|          727,310.64 |            1,374.93 |    0.6% |      0.09 | `AddAndRemoveDisconnectedBlockTransactionsAll`
    

    Clearly there is improvement and this reduces dependency on boost. Code looks good to me 🍝 Tested ACK 4313c77400eb8eaa8586db39a7e29a861772ea80

  109. in src/kernel/disconnected_transactions.h:64 in 4313c77400
    59+        }
    60+        return evicted;
    61+    }
    62+
    63 public:
    64+    DisconnectedBlockTransactions(size_t max_mem_usage) : m_max_mem_usage{max_mem_usage} {}
    


    ismaelsadeeq commented at 4:10 pm on September 21, 2023:

    nit: We are always passing the same MAX_DISCONNECTED_TX_POOL_SIZE * 1000 as the constructor argument. We can just declare a default value for it

    0    DisconnectedBlockTransactions(const size_t max_mem_usage = MAX_DISCONNECTED_TX_POOL_SIZE * 1000) : m_max_mem_usage{max_mem_usage} {}
    

    glozow commented at 4:27 pm on September 22, 2023:
    I’m not a fan of default arguments as it makes it easier to misuse functions. I think it’s useful to keep this configurable and would prefer that a caller be aware of / explicit about bounding the memusage.
  110. theuni commented at 5:13 pm on September 22, 2023: member
    I’m on holiday for the next ~10 days or so. I would like to review this, but please don’t let my absence hold up merge. I’ll review when I’m back whether it’s been merged or not.
  111. fanquake merged this on Sep 23, 2023
  112. fanquake closed this on Sep 23, 2023

  113. ismaelsadeeq commented at 3:55 pm on September 25, 2023: member

    @stickies-v @glozow Opened up a PR that moved implementation code to disconnected_transactions.cpp and fix nits #28385#pullrequestreview-1629400522 and #28385 (review)

    #28530

  114. Frank-GER referenced this in commit 6585dc7350 on Sep 25, 2023
  115. glozow deleted the branch on Sep 26, 2023
  116. sidhujag referenced this in commit d3966680b1 on Sep 26, 2023
  117. in src/bench/disconnected_transactions.cpp:101 in 4313c77400
     96+static void AddAndRemoveDisconnectedBlockTransactionsAll(benchmark::Bench& bench)
     97+{
     98+    const auto chains{CreateBlocks(/*num_not_shared=*/1)};
     99+    assert(chains.num_shared == BLOCK_VTX_COUNT - 1);
    100+
    101+    bench.minEpochIterations(10).run([&]() NO_THREAD_SAFETY_ANALYSIS {
    


    maflcko commented at 11:18 am on September 28, 2023:
    nit: I think all of the NO_THREAD_SAFETY_ANALYSIS can just be removed from this file, no?

    fanquake commented at 12:17 pm on October 2, 2023:
    See #28557.
  118. glozow referenced this in commit fd8ab08558 on Oct 2, 2023
  119. in src/validation.cpp:2730 in 2765d6f343 outdated
    2725@@ -2724,9 +2726,8 @@ bool Chainstate::DisconnectTip(BlockValidationState& state, DisconnectedBlockTra
    2726         disconnectpool->AddTransactionsFromBlock(block.vtx);
    2727         while (disconnectpool->DynamicMemoryUsage() > MAX_DISCONNECTED_TX_POOL_SIZE * 1000) {
    2728             // Drop the earliest entry, and remove its children from the mempool.
    2729-            auto it = disconnectpool->queuedTx.get<insertion_order>().begin();
    2730-            m_mempool->removeRecursive(**it, MemPoolRemovalReason::REORG);
    2731-            disconnectpool->removeEntry(it);
    2732+            auto ptx = disconnectpool->take_first();
    2733+            m_mempool->removeRecursive(*ptx, MemPoolRemovalReason::REORG);
    


    theuni commented at 8:29 pm on October 4, 2023:
    Nit: This looks like a potential null-pointer dereference, but it’s safe to assume that disconnectpool is non-empty while we’re in the loop. assert(ptx) would make it clear that it’s not an oversight.

    theuni commented at 9:26 pm on October 4, 2023:
    Nevermind, this is removed in a later commit.
  120. in src/txmempool.h:899 in 2765d6f343 outdated
    904+        iters_by_txid.reserve(iters_by_txid.size() + vtx.size());
    905         for (auto block_it = vtx.rbegin(); block_it != vtx.rend(); ++block_it) {
    906-            queuedTx.insert(*block_it);
    907-            cachedInnerUsage += RecursiveDynamicUsage(*block_it);
    908+            auto it = queuedTx.insert(queuedTx.end(), *block_it);
    909+            iters_by_txid.emplace((*block_it)->GetHash(), it);
    


    theuni commented at 9:41 pm on October 4, 2023:

    According to the comment: We assume that callers never pass multiple transactions with the same txid.

    IMO we should either commit to that and enforce it:

    0auto emplace_result = iters_by_txid.emplace((*block_it)->GetHash(), it);
    1assert(emplace_result.second);
    

    or remove the restriction:

    0// Ignore duplicates
    1iters_by_txid.try_emplace((*block_it)->GetHash(), it);
    

    glozow commented at 9:47 am on October 5, 2023:
    Agreed, I like the first option. Maybe @ismaelsadeeq might be interested in adding it to the followup?

    ismaelsadeeq commented at 11:02 am on October 5, 2023:
    Added to #28530
  121. theuni commented at 9:41 pm on October 4, 2023: member
    ACK 4313c77400eb8eaa8586db39a7e29a861772ea80.
  122. Frank-GER referenced this in commit cd084525e4 on Oct 5, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-28 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me