Mempool Update Cut-Through Optimization #21464

pull JeremyRubin wants to merge 2 commits into bitcoin:master from JeremyRubin:epoch-mempool-cut-through-optimized changing 4 files +73 −33
  1. JeremyRubin commented at 9:13 PM on March 17, 2021: contributor

    Often when we're updating mempool entries we update entries that we ultimately end up removing the updated entries shortly thereafter. This patch makes it so that we filter for such entries a bit earlier in processing, which yields a mild improvement for these cases, and is negligible overhead otherwise.

    There's potential for a better -- but more sophisticated -- algorithm that can be used taking advantage of epochs, but I figured it is better to do something that is simple and works first and upgrade it later as the other epoch mempool work proceeds as it makes the patches for the epoch algorithm simpler to understand, so you can consider this as preparatory work. It could either go in now if it is not controversial, or we could wait until the other patch is ready to go.

  2. JeremyRubin added this to the "Epoch MemPool" column in a project

  3. JeremyRubin force-pushed on Mar 17, 2021
  4. DrahtBot added the label Mempool on Mar 17, 2021
  5. DrahtBot added the label Validation on Mar 17, 2021
  6. jonatack commented at 7:26 PM on March 18, 2021: member

    Interesting on first look, will review.

  7. JeremyRubin force-pushed on Apr 6, 2021
  8. JeremyRubin commented at 12:20 AM on April 7, 2021: contributor

    pushed an update to simplify setRemove to hold txids instead of tx reference_wrappers, which eliminated the confusing ownership thing. should make review simpler.

  9. laanwj added this to the "Blockers" column in a project

  10. in src/txmempool.cpp:163 in 590fb3ef43 outdated
     160 | +    for (auto& txid : setRemove) {
     161 | +        // this guard is safe because we know that all to_remove *were* in the mempool before we
     162 | +        // began iterating here, so this only serves to de-duplicate calls to removeRecursive.
     163 | +        // Otherwise, removeRecursive should be called whether or not the hash is in the mempool.
     164 | +        const std::optional<txiter> txiter_opt = GetIter(txid);
     165 | +        if (txiter_opt)
    


    instagibbs commented at 12:03 AM on April 9, 2021:

    brackets

  11. in src/txmempool.cpp:125 in 590fb3ef43 outdated
     121 | @@ -117,6 +122,8 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashes
     122 |      // accounted for in the state of their ancestors)
     123 |      std::set<uint256> setAlreadyIncluded(vHashesToUpdate.begin(), vHashesToUpdate.end());
     124 |  
     125 | +    std::set<uint256> setRemove;
    


    instagibbs commented at 12:03 AM on April 9, 2021:

    nit: snake_case for new vars


    JeremyRubin commented at 1:58 AM on April 12, 2021:

    renamed

  12. in src/txmempool.cpp:102 in 590fb3ef43 outdated
      95 | @@ -95,6 +96,10 @@ void CTxMemPool::UpdateForDescendants(txiter updateIt, cacheMap &cachedDescendan
      96 |              cachedDescendants[updateIt].insert(mapTx.iterator_to(descendant));
      97 |              // Update ancestor state for each descendant
      98 |              mapTx.modify(mapTx.iterator_to(descendant), update_ancestor_state(updateIt->GetTxSize(), updateIt->GetModifiedFee(), 1, updateIt->GetSigOpCost()));
      99 | +
     100 | +            if (descendant.GetCountWithAncestors() > ancestor_limit || descendant.GetSizeWithAncestors() > ancestor_size_limit) {
    


    instagibbs commented at 12:28 AM on April 9, 2021:

    This probably warrants a comment since it's an opportunistic check that is not about updating the descendant map


    instagibbs commented at 1:31 AM on April 12, 2021:

    overall I wish we could be much more compartmentalized about these limit checks.

    no good suggestion; just a complaint


    JeremyRubin commented at 1:58 AM on April 12, 2021:

    will clarify why doing it here makes sense; good point.

  13. in src/txmempool.cpp:160 in 590fb3ef43 outdated
     152 | @@ -146,7 +153,16 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashes
     153 |                  }
     154 |              }
     155 |          } // release epoch guard for UpdateForDescendants
     156 | -        UpdateForDescendants(it, mapMemPoolDescendantsToUpdate, setAlreadyIncluded);
     157 | +        UpdateForDescendants(it, mapMemPoolDescendantsToUpdate, setAlreadyIncluded, setRemove, ancestor_size_limit, ancestor_limit);
     158 | +    }
     159 | +
     160 | +    for (auto& txid : setRemove) {
     161 | +        // this guard is safe because we know that all to_remove *were* in the mempool before we
    


    instagibbs commented at 12:28 AM on April 9, 2021:

    what's to_remove?


    instagibbs commented at 12:32 AM on April 9, 2021:

    Also I don't really understand the comment block in general


    JeremyRubin commented at 1:36 AM on April 12, 2021:

    right; stale comment on the prior version. Before i was using txiters + CTransactionRefs and had to use a second vec before removing (otherwise you'd have use after free on the txiters).

    Fixed the comment.


    JeremyRubin commented at 1:36 AM on April 12, 2021:

    ibid

  14. instagibbs commented at 1:32 AM on April 12, 2021: member

    some comments, logic looks right from manual inspection of other limits code

    will test

  15. JeremyRubin force-pushed on Apr 12, 2021
  16. JeremyRubin force-pushed on Apr 12, 2021
  17. instagibbs commented at 6:58 AM on April 12, 2021: member
  18. in src/txmempool.h:671 in e8dafeb700 outdated
     666 | @@ -667,7 +667,8 @@ class CTxMemPool
     667 |       *  for).  Note: vHashesToUpdate should be the set of transactions from the
     668 |       *  disconnected block that have been accepted back into the mempool.
     669 |       */
     670 | -    void UpdateTransactionsFromBlock(const std::vector<uint256>& vHashesToUpdate) EXCLUSIVE_LOCKS_REQUIRED(cs, cs_main) LOCKS_EXCLUDED(m_epoch);
     671 | +    void UpdateTransactionsFromBlock(const std::vector<uint256>& vHashesToUpdate,
     672 | +            uint64_t ancestor_size_limit, uint64_t ancestor_limit) EXCLUSIVE_LOCKS_REQUIRED(cs, cs_main) LOCKS_EXCLUDED(m_epoch);
    


    glozow commented at 8:25 PM on April 14, 2021:

    Should update the doxygen comments, something like

    * [@param](/bitcoin-bitcoin/contributor/param/)[out]    set_should_remove          Populated with the txids of entries that exceed ancestor limits. It's the responsibility of the caller to removeRecursive them.
    

    JeremyRubin commented at 1:42 AM on April 15, 2021:

    will clean it up if I touch.


    jnewbery commented at 10:00 AM on April 28, 2021:

    Definitely agree that changes to signature functions should be documented in the doxygen comments.


    jnewbery commented at 12:53 PM on April 28, 2021:

    This function only ever gets called in one place, with ancestor_size_limit set to -limitancestorsize and ancestor_limit set to -limitancestorcount.

    What do you think about parameterizing the CTxMemPool constructor with those values? They don't change during runtime and it'd mean you'd be making no changes to validation.

    See the last two commits in https://github.com/jnewbery/bitcoin/tree/pr21464.1.


    JeremyRubin commented at 7:11 PM on April 28, 2021:

    mmmm maybe. In general in the mempool I think the functions accept parameters with limits as opposed to being static in the mempool, and I think it would be confusing to be inconsistent. I like that passing in limits makes it explicit what the code will do, and forces any caller to consider what limit they should use.

    Where these limits get enforced later they are set by reading configuration parameter at the time of transaction acceptance rather than on startup, so altering this behavior could lead to an inconsistency down the line if someone at runtime increased the setting.


    JeremyRubin commented at 7:12 PM on April 28, 2021:

    Saying as this function doesn't have doxygen comments is this a request to doxygen comment it?

    If I set default values for these params would that be better so as not to disrupt the fsig?


    jnewbery commented at 7:12 AM on April 29, 2021:

    In general in the mempool I think the functions accept parameters with limits

    I think that's more a historical artifact that until #19556, the mempool was a global that was constructed before main(). If we were building this from scratch, I think we'd make any mempool parameters that can't be changed at runtime const and set in the ctor, rather than using globals.

    lead to an inconsistency down the line if someone at runtime increased the setting.

    Changing these limits during runtime seems like a terrible idea! There would be all kinds of edge-cases where it'd be difficult to ensure that invariants aren't being invalidated.

    Anyway, your choice. This is just a suggestion and could be done after this PR. Whenever I see function arguments where the same value is always passed (eg UpdateTransactionsFromBlock() in this PR), then it suggests to me that the interface can be simplified.


    MarcoFalke commented at 7:16 AM on April 29, 2021:

    Changing these limits during runtime seems like a terrible idea!

    Agree. Though, the tx_pool fuzz targets do this and they haven't (yet) crashed :grimacing:

  19. in src/txmempool.h:810 in e8dafeb700 outdated
     806 | @@ -806,7 +807,7 @@ class CTxMemPool
     807 |       */
     808 |      void UpdateForDescendants(txiter updateIt,
     809 |              cacheMap &cachedDescendants,
     810 | -            const std::set<uint256> &setExclude) EXCLUSIVE_LOCKS_REQUIRED(cs);
     811 | +            const std::set<uint256> &setExclude, std::set<uint256>& set_should_remove, uint64_t ancestor_size_limit, uint64_t ancestor_limit) EXCLUSIVE_LOCKS_REQUIRED(cs);
    


    glozow commented at 8:29 PM on April 14, 2021:

    Small naming suggestion; "count" removes any ambiguity about which limit this is.

                const std::set<uint256> &setExclude, std::set<uint256>& set_should_remove, uint64_t ancestor_size_limit, uint64_t ancestor_count_limit) EXCLUSIVE_LOCKS_REQUIRED(cs);
    

    JeremyRubin commented at 1:42 AM on April 15, 2021:

    will clean this up if I touch it.


    jnewbery commented at 12:56 PM on April 28, 2021:

    This style seems like the worst of all worlds. Including linebreaks, but making the last line too long for side-by-side diff tools. How about:

        void UpdateForDescendants(txiter updateIt, cacheMap &cachedDescendants, const std::set<uint256> &setExclude,
                                  std::set<uint256>& set_should_remove, uint64_t ancestor_size_limit,
                                  uint64_t ancestor_limit) EXCLUSIVE_LOCKS_REQUIRED(cs);
    
  20. in src/txmempool.cpp:220 in e8dafeb700 outdated
     154 | @@ -146,7 +155,16 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashes
     155 |                  }
     156 |              }
     157 |          } // release epoch guard for UpdateForDescendants
     158 | -        UpdateForDescendants(it, mapMemPoolDescendantsToUpdate, setAlreadyIncluded);
    


    glozow commented at 8:46 PM on April 14, 2021:

    Question: Would it be correct/better to also add children to mapMemPoolDescendantsToUpdate in the code block above?


    JeremyRubin commented at 1:44 AM on April 15, 2021:

    Nah; mapMemPoolDescendantsToUpdate is a cache for UpdateForDescendants where each line in the cache is all the desc, not just children, so if you added it prematurely it would uncache incorrectly in UpdateForDescednants

  21. glozow commented at 9:48 PM on April 14, 2021: member

    ACK e8dafeb700 Code looks correct to me. I'm mostly just suggesting some docs if you retouch.

    I wonder if (in addition to this) it would be worth changing ATMP to take a bool check_for_children param where, if true, looks for descendants in mempool and includes them when applying ancestor/descendant limits. Technically, ATMP is currently not enforcing these rules properly for reorgs by assuming no children in the mempool. That way we can probably remove them in UpdateMempoolForReorg() before even calling UpdateTransactionsFromBlock().

  22. rebroad commented at 11:06 AM on April 16, 2021: contributor

    @JeremyRubin could you explain what this does in more layman's terms please? Is this pull anything to do with helping to fill the mempool in a way that reduces filling it with TXs that will get deleted later (due to mempool becoming full)?

  23. instagibbs commented at 11:09 AM on April 16, 2021: member

    This is an optimization for the case where a reorg puts an ancestor transaction(package) back into the mempool, thus temporarily busting the transaction chain limits of the mempool.

  24. in src/txmempool.cpp:127 in e8dafeb700 outdated
     123 | @@ -117,6 +124,8 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashes
     124 |      // accounted for in the state of their ancestors)
     125 |      std::set<uint256> setAlreadyIncluded(vHashesToUpdate.begin(), vHashesToUpdate.end());
     126 |  
     127 | +    std::set<uint256> set_should_remove;
    


    jnewbery commented at 2:16 PM on April 28, 2021:

    current style is to not include the type in the variable name. Consider:

        std::set<uint256> txs_to_remove;
    

    ariard commented at 5:53 AM on May 29, 2021:

    Even more speaking descendants_to_remove

  25. in src/txmempool.cpp:166 in e8dafeb700 outdated
     162 | +    for (auto& txid : set_should_remove) {
     163 | +        // This txid may have been removed already in a prior call to removeRecursive.
     164 | +        // Therefore we ensure it is not yet removed already.
     165 | +        const std::optional<txiter> txiter_opt = GetIter(txid);
     166 | +        if (txiter_opt) {
     167 | +            removeRecursive((*txiter_opt)->GetTx(), MemPoolRemovalReason::SIZELIMIT);
    


    jnewbery commented at 2:26 PM on April 28, 2021:

    Consider using c++17 if statement initializer:

            if (const auto txiter_opt = GetIter(txid)) {
                removeRecursive((*txiter_opt)->GetTx(), MemPoolRemovalReason::SIZELIMIT);
    

    of if you want to be a bit more explicit:

            if (const auto txiter_opt = GetIter(txid); txiter_opt) {
                removeRecursive((*txiter_opt)->GetTx(), MemPoolRemovalReason::SIZELIMIT);
    
    

    also, perhaps remove opt from the variable name - project style is to not include types in variable names.


    ariard commented at 5:55 AM on May 29, 2021:

    You could even introduce a new MemPoolRemovalReason::ANCESTORLIMIT. This check is different than the one applied by TrimToSize on absolute mempool size.

    It would be a clearer message for listeners of validation interface.


    JeremyRubin commented at 9:09 PM on July 10, 2021:

    done


    jnewbery commented at 11:44 AM on July 12, 2021:

    I don't think this change is necessary for this PR. Unless there's a validation interface client that uses the new MemPoolRemovalReason::ANCESTORLIMIT, then it's just broadening that interface for no benefit. If we find that a client needs to use this in future, it'll be easy to add it at that point.


    JeremyRubin commented at 10:56 PM on July 15, 2021:

    I will drop this commit and open a new PR with it for @ariard.

  26. in src/txmempool.cpp:99 in e8dafeb700 outdated
      95 | @@ -95,6 +96,12 @@ void CTxMemPool::UpdateForDescendants(txiter updateIt, cacheMap &cachedDescendan
      96 |              cachedDescendants[updateIt].insert(mapTx.iterator_to(descendant));
      97 |              // Update ancestor state for each descendant
      98 |              mapTx.modify(mapTx.iterator_to(descendant), update_ancestor_state(updateIt->GetTxSize(), updateIt->GetModifiedFee(), 1, updateIt->GetSigOpCost()));
      99 | +            // don't directly remove the transaction here -- doing so would invalidate iterators in
    


    jnewbery commented at 2:29 PM on April 28, 2021:
                // Don't directly remove the transaction here -- doing so would invalidate iterators in
    
  27. in src/txmempool.cpp:101 in e8dafeb700 outdated
      95 | @@ -95,6 +96,12 @@ void CTxMemPool::UpdateForDescendants(txiter updateIt, cacheMap &cachedDescendan
      96 |              cachedDescendants[updateIt].insert(mapTx.iterator_to(descendant));
      97 |              // Update ancestor state for each descendant
      98 |              mapTx.modify(mapTx.iterator_to(descendant), update_ancestor_state(updateIt->GetTxSize(), updateIt->GetModifiedFee(), 1, updateIt->GetSigOpCost()));
      99 | +            // don't directly remove the transaction here -- doing so would invalidate iterators in
     100 | +            // cachedDescendants. Mark it for removal by inserting into set_should_remove here as an
     101 | +            // optimization so we don't have to scan for its removal later.
    


    jnewbery commented at 2:30 PM on April 28, 2021:
                // cachedDescendants. Mark it for removal by inserting into set_should_remove.
    
  28. in src/txmempool.cpp:62 in e8dafeb700 outdated
      57 | @@ -58,7 +58,8 @@ size_t CTxMemPoolEntry::GetTxSize() const
      58 |  // Update the given tx for any in-mempool descendants.
      59 |  // Assumes that CTxMemPool::m_children is correct for the given tx and all
      60 |  // descendants.
      61 | -void CTxMemPool::UpdateForDescendants(txiter updateIt, cacheMap &cachedDescendants, const std::set<uint256> &setExclude)
      62 | +void CTxMemPool::UpdateForDescendants(txiter updateIt, cacheMap &cachedDescendants, const std::set<uint256> &setExclude,
      63 | +        std::set<uint256> &set_should_remove, uint64_t ancestor_size_limit, uint64_t ancestor_limit)
    


    jnewbery commented at 2:30 PM on April 28, 2021:
            std::set<uint256>& set_should_remove, uint64_t ancestor_size_limit, uint64_t ancestor_limit)
    
  29. jnewbery commented at 4:46 PM on April 28, 2021: member

    Looks good. A few style comments inline, including a suggested change to minimize the diff in the CTxMemPool interface and in the calling code.

    I think the two commits can be squashed.

  30. DrahtBot commented at 9:33 AM on May 3, 2021: member

    <!--4a62be1de6b64f3ed646cdc7932c8cf5-->

    🕵️ @hebasto @sipa have been requested to review this pull request as specified in the REVIEWERS file.

  31. in src/validation.cpp:369 in e8dafeb700 outdated
     413 | @@ -414,7 +414,9 @@ static void UpdateMempoolForReorg(CChainState& active_chainstate, CTxMemPool& me
     414 |      // previously-confirmed transactions back to the mempool.
     415 |      // UpdateTransactionsFromBlock finds descendants of any transactions in
     416 |      // the disconnectpool that were added back and cleans up the mempool state.
     417 | -    mempool.UpdateTransactionsFromBlock(vHashUpdate);
     418 | +    uint64_t ancestor_limit = gArgs.GetArg("-limitancestorcount", DEFAULT_ANCESTOR_LIMIT);
     419 | +    uint64_t ancestor_limit_size = gArgs.GetArg("-limitancestorsize", DEFAULT_ANCESTOR_SIZE_LIMIT) * 1000;
    


    ariard commented at 5:59 AM on May 29, 2021:

    What do you think about hardcoded ancestor_limit/ancestor_limit_size, instead of relying on node operators settings ?

    As it's a performance optimization it might be easier to get predictable improvements with invariants.


    JeremyRubin commented at 2:34 PM on June 8, 2021:

    I think that would lead to miners to custom compiling to configure rather than running releases -- maybe we need a survey of what settings mining node operators are using?

  32. ariard commented at 6:05 AM on May 29, 2021: member

    Concept ACK

    Looking to test this a bit more.

    There's potential for a better -- but more sophisticated -- algorithm that can be used taking advantage of epochs, but I figured it is better to do something that is simple and works first and upgrade it later as the other epoch mempool work proceeds as it makes the patches for the epoch algorithm simpler to understand, so you can consider this as preparatory work. It could either go in now if it is not controversial, or we could wait until the other patch is ready to go.

    I don't have an opinion on patch ordering but I would be glad if we can land the Epoch changes (#17628) at some point. Either as it is, or after a bit of txmempool internals cleaning/encapsulation :)

  33. DrahtBot commented at 2:49 PM on July 10, 2021: member

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #23962 (Use int32_t type for transaction size/weight consistently by hebasto)
    • #21436 (doc: Fix several references in txmempool comments by kiminuo)
    • #17786 (refactor: Nuke policy/fees->mempool circular dependencies by hebasto)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  34. JeremyRubin force-pushed on Jul 10, 2021
  35. JeremyRubin commented at 9:09 PM on July 10, 2021: contributor

    Addressed style/doc stuff in a separate commit to preserve ACKs (squashable)

    I also introcuded a new removal reason in a separate commit, this should not be squashed as it is ortho to the main change. @laanwj anything else you'd like to see here?

  36. jnewbery commented at 11:47 AM on July 12, 2021: member

    Addressed style/doc stuff in a separate commit to preserve ACKs (squashable)

    I don't understand this. The head commit is what needs ACKs, and that's changed now that there are new commits. I'm very happy to re-review once you've squashed commits and rebased on master.

  37. DrahtBot added the label Needs rebase on Jul 15, 2021
  38. JeremyRubin force-pushed on Jul 15, 2021
  39. JeremyRubin commented at 11:02 PM on July 15, 2021: contributor

    Ok, made changes requested by @jnewbery and squashed. Will open a followup PR with the validation interface after as requested by @ariard. Thanks.

  40. DrahtBot removed the label Needs rebase on Jul 16, 2021
  41. in src/txmempool.h:829 in 2820b04797 outdated
     825 | +     *     exceed ancestor limits. It's the responsibility of the caller to
     826 | +     *     removeRecursive them.
     827 | +     * @param[in] ancestor_size_limit the max number of ancestral bytes allowed
     828 | +     *     for any descendant
     829 | +     * @param[in] ancestor_count_limit the max number of ancestor transactions
     830 | +     *     allowed for any descendant
    


    glozow commented at 2:37 PM on July 16, 2021:

    Limits are inclusive and apply to all entries

         * [@param](/bitcoin-bitcoin/contributor/param/)[in] ancestor_size_limit      The maximum allowed size in virtual bytes of an entry and its ancestors
         * [@param](/bitcoin-bitcoin/contributor/param/)[in] ancestor_count_limit     The maximum allowed number of transactions including the entry and its ancestors.
    
    

    jnewbery commented at 10:17 AM on July 26, 2021:

    (also much prefer this formatting style with a visual gap between the parameter name and the documentation).


    JeremyRubin commented at 6:21 PM on September 12, 2021:

    fwiw if you use a text editor which understands doxygen you get something which looks fine without a visual gap. <img width="588" alt="image" src="https://user-images.githubusercontent.com/886523/132998517-30b6dee2-137e-46c6-8a26-307a5cdea93c.png">

  42. darosior approved
  43. darosior commented at 8:24 AM on July 19, 2021: member

    ACK 2820b04797

  44. in src/txmempool.h:669 in 2820b04797 outdated
     663 | @@ -664,10 +664,18 @@ class CTxMemPool
     664 |       *  UpdateTransactionsFromBlock() will find child transactions and update the
     665 |       *  descendant state for each transaction in vHashesToUpdate (excluding any
     666 |       *  child transactions present in vHashesToUpdate, which are already accounted
     667 | -     *  for).  Note: vHashesToUpdate should be the set of transactions from the
     668 | -     *  disconnected block that have been accepted back into the mempool.
     669 | +     *  for).
     670 | +     *
     671 | +     * @param[in] vHashesToUpdate should be the set of transactions IDs from
    


    jnewbery commented at 10:19 AM on July 26, 2021:

    "should be" implies that this function could be called with a different set of transactions. In fact, vHashesToUpdate is always the set of transactions from the disconnected block that have been added back to the mempool:

         * [@param](/bitcoin-bitcoin/contributor/param/)[in] vHashesToUpdate   the set of txids from
    
  45. in src/txmempool.h:807 in 2820b04797 outdated
     810 | +     *  chain reorg.
     811 |       *
     812 | -     *  cachedDescendants will be updated with the descendants of the transaction
     813 | -     *  being updated, so that future invocations don't need to walk the
     814 | -     *  same transaction again, if encountered in another transaction chain.
     815 | +     * @param[in] updateIt the entry to update for it's descendants
    


    jnewbery commented at 10:20 AM on July 26, 2021:
         * [@param](/bitcoin-bitcoin/contributor/param/)[in] updateIt the entry to update for its descendants
    
  46. in src/txmempool.cpp:179 in 2820b04797 outdated
     113 | @@ -106,7 +114,7 @@ void CTxMemPool::UpdateForDescendants(txiter updateIt, cacheMap &cachedDescendan
     114 |  // for each entry, look for descendants that are outside vHashesToUpdate, and
     115 |  // add fee/size information for such descendants to the parent.
     116 |  // for each such descendant, also update the ancestor state to include the parent.
     117 | -void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashesToUpdate)
     118 | +void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashesToUpdate, uint64_t ancestor_size_limit, uint64_t ancestor_count_limit)
    


    jnewbery commented at 10:22 AM on July 26, 2021:

    Perhaps remove the comment above this function definition (it's redundant with the doxygen comment above the declaration)

  47. in src/txmempool.h:808 in 2820b04797 outdated
     811 |       *
     812 | -     *  cachedDescendants will be updated with the descendants of the transaction
     813 | -     *  being updated, so that future invocations don't need to walk the
     814 | -     *  same transaction again, if encountered in another transaction chain.
     815 | +     * @param[in] updateIt the entry to update for it's descendants
     816 | +     * @param[in] cachedDescendants a cache where each line corresponds to all
    


    jnewbery commented at 10:25 AM on July 26, 2021:
         * [@param](/bitcoin-bitcoin/contributor/param/)[in/out] cachedDescendants a cache where each line corresponds to all
    

    JeremyRubin commented at 6:39 PM on September 12, 2021:

    Is proper syntax is comma not slash?


    jonatack commented at 6:52 PM on September 12, 2021:

    Yes, per https://www.doxygen.nl/manual/commands.html#cmdparam

    "Possible values are "[in]", "[in,out]", and "[out]", note the [square] brackets in this description. When a parameter is both input and output, [in,out] is used as attribute."

  48. jnewbery commented at 11:31 AM on July 26, 2021: member

    I think it'd be better to squash the test commit into the first commit, so the changes are atomic.

  49. in test/functional/mempool_updatefromblock.py:20 in 2820b04797 outdated
      16 | @@ -17,7 +17,7 @@
      17 |  class MempoolUpdateFromBlockTest(BitcoinTestFramework):
      18 |      def set_test_params(self):
      19 |          self.num_nodes = 1
      20 | -        self.extra_args = [['-limitdescendantsize=1000', '-limitancestorsize=1000']]
      21 | +        self.extra_args = [['-limitdescendantsize=1000', '-limitancestorsize=1000', '-limitancestorcount=100']]
    


    reemuru commented at 9:50 PM on July 27, 2021:

    Is this change necessary? I'm currently updating the functional test for MiniWallet compatibility


    JeremyRubin commented at 10:41 PM on July 31, 2021:

    yes


    mjdietzx commented at 5:11 PM on November 15, 2021:

    I see that you are increasing the node's allowed limitancestorcount here, but I don't think it's being utilized? Bc the test params are hard-coded at L119 self.transaction_graph_test(size=100, n_tx_to_mine=[25, 50, 75]), and don't make use of the increased limit here. So you'd need to update that as well, right?


    instagibbs commented at 1:52 AM on January 4, 2022:

    Did this get addressed?


    JeremyRubin commented at 5:15 AM on January 4, 2022:

    Yes, without this param the old limit wasn't being fully tested since it defaults to 25, with the new code we test the limit up until 75 -- the 100 was just a nice pow ten number.

  50. in src/validation.cpp:368 in 2820b04797 outdated
     364 | @@ -365,7 +365,9 @@ void CChainState::MaybeUpdateMempoolForReorg(
     365 |      // previously-confirmed transactions back to the mempool.
     366 |      // UpdateTransactionsFromBlock finds descendants of any transactions in
     367 |      // the disconnectpool that were added back and cleans up the mempool state.
     368 | -    m_mempool->UpdateTransactionsFromBlock(vHashUpdate);
     369 | +    uint64_t ancestor_limit = gArgs.GetArg("-limitancestorcount", DEFAULT_ANCESTOR_LIMIT);
    


    Talkless commented at 12:04 PM on August 17, 2021:

    Maybe worth renaming ancestor_limit into ancestor_limit_count, to be more symmetrical with ancestor_limit_size/limitancestorsize pair?

    Also, UpdateTransactionsFromBlock and UpdateForDescendants has these parameters phrased a bit differently with ancestor_size_limit and ancestor_count_limit ("limit" goes last). Maybe worth unifying all these new names?


    Talkless commented at 12:05 PM on August 17, 2021:

    Also, ancestor_limit and ancestor_limit_size could be const.

  51. in src/txmempool.cpp:163 in 2820b04797 outdated
     156 | @@ -147,7 +157,15 @@ void CTxMemPool::UpdateTransactionsFromBlock(const std::vector<uint256> &vHashes
     157 |                  }
     158 |              }
     159 |          } // release epoch guard for UpdateForDescendants
     160 | -        UpdateForDescendants(it, mapMemPoolDescendantsToUpdate, setAlreadyIncluded);
     161 | +        UpdateForDescendants(it, mapMemPoolDescendantsToUpdate, setAlreadyIncluded, descendants_to_remove, ancestor_size_limit, ancestor_count_limit);
     162 | +    }
     163 | +
     164 | +    for (auto& txid : descendants_to_remove) {
    


    Talkless commented at 12:06 PM on August 17, 2021:

    could be const auto& txid.

  52. JeremyRubin force-pushed on Sep 12, 2021
  53. JeremyRubin commented at 8:44 PM on September 12, 2021: contributor

    Addressed all feedback; keeping tests separate from changes for easier A/B comparison. However, I have reordered them so Tests come first which is more practical.

  54. ryihan approved
  55. Talkless commented at 8:39 PM on November 6, 2021: none

    Weak ACK 81efedb60e1e88f806adb010c926b4edd04393b8. Weak because of my limited mempool knowledge, though code looks OK, and I successfully ran unit and extended functional tests.

  56. mjdietzx commented at 5:15 PM on November 15, 2021: contributor

    When I read the commit:

    Often when we're updating mempool entries we update entries that we ultimately end up removing the updated entries shortly thereafter. This patch makes it so that we filter for such entries a bit earlier in processing

    based on the wording, I assumed that we were "filtering" these entries out before adding them to the mempool. However, after reviewing the PR it seems that these entries are still added to the mempool, state is still calculated/updated and cached, it's just that we are now (pre-emptively) removing descendants_to_remove right away at the end of UpdateTransactionsFromBlock

    Is my understanding correct? If so, it makes me wonder where the performance improvement is coming from. To me it seems all the same work is done before/after this PR, just earlier.


    Re: my review comment 92647669428bdaca44813b958e33513958644227, mempool_updatefromblock.py (which I agree is a great benchmark test for this optimization) is running in the ~same time before/after this PR. So I am curious if we are actually seeing any speedup from this optimization

  57. laanwj commented at 3:39 PM on December 2, 2021: member

    Code review ACK 81efedb60e1e88f806adb010c926b4edd04393b8 Thanks for clarifying doc comments to UpdateTransactionsFromBlock and UpdateForDescendants.

  58. laanwj commented at 3:43 PM on December 2, 2021: member

    Looks like there has been a silent merge conflict though. Local compilation results in:

    /…/bitcoin/src/validation.cpp:351:79: error: reference to type 'const std::string' (aka 'const basic_string<char>') could not bind to an lvalue of type 'const unsigned int'
        const uint64_t ancestor_count_limit = gArgs.GetArg("-limitancestorcount", DEFAULT_ANCESTOR_LIMIT);
                                                                                  ^~~~~~~~~~~~~~~~~~~~~~
    /…/bitcoin/src/util/system.h:327:70: note: passing argument to parameter 'strDefault' here
        std::string GetArg(const std::string& strArg, const std::string& strDefault) const;
                                                                         ^
    /…/bitcoin/src/validation.cpp:351:20: error: no viable conversion from 'std::string' (aka 'basic_string<char>') to 'const uint64_t' (aka 'const unsigned long')
        const uint64_t ancestor_count_limit = gArgs.GetArg("-limitancestorcount", DEFAULT_ANCESTOR_LIMIT);
                       ^                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/basic_string.h:816:7: note: candidate function
          operator __sv_type() const noexcept
          ^
    /…/bitcoin/src/validation.cpp:352:77: error: reference to type 'const std::string' (aka 'const basic_string<char>') could not bind to an lvalue of type 'const unsigned int'
        const uint64_t ancestor_size_limit = gArgs.GetArg("-limitancestorsize", DEFAULT_ANCESTOR_SIZE_LIMIT) * 1000;
                                                                                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
    /…/bitcoin/src/util/system.h:327:70: note: passing argument to parameter 'strDefault' here
        std::string GetArg(const std::string& strArg, const std::string& strDefault) const;
                                                                         ^
    /…/bitcoin/src/validation.cpp:352:106: error: invalid operands to binary expression ('std::string' (aka 'basic_string<char>') and 'int')
        const uint64_t ancestor_size_limit = gArgs.GetArg("-limitancestorsize", DEFAULT_ANCESTOR_SIZE_LIMIT) * 1000;
                                             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~
    /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/complex:391:5: note: candidate template ignored: could not match 'complex' against 'basic_string'
        operator*(const complex<_Tp>& __x, const complex<_Tp>& __y)
        ^
    /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/complex:400:5: note: candidate template ignored: could not match 'complex' against 'basic_string'
        operator*(const complex<_Tp>& __x, const _Tp& __y)
        ^
    /usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/complex:409:5: note: candidate template ignored: could not match 'complex<type-parameter-0-0>' against 'int'
        operator*(const _Tp& __x, const complex<_Tp>& __y)
        ^
    

    I think it's simply a matter of changing GetArg to GetIntArg for these.

  59. JeremyRubin force-pushed on Dec 7, 2021
  60. [TESTS] Increase limitancestorcount in tournament RPC test to showcase improved algorithm c49daf9885
  61. Mempool Update Cut-Through Optimization
    Often when we're updating mempool entries we update entries that we
    ultimately end up removing the updated entries shortly thereafter. This
    patch makes it so that we filter for such entries a bit earlier in
    processing, which yields a mild improvement for these cases, and is
    negligible overhead otherwise.
    c5b36b1c1b
  62. JeremyRubin force-pushed on Dec 9, 2021
  63. JeremyRubin commented at 11:41 PM on December 10, 2021: contributor

    rebased, test failures are unrelated it looks like (this PR doesn't touch secp?)

  64. MarcoFalke commented at 4:00 PM on January 3, 2022: member

    reran ci (was unrelated failure)

  65. JeremyRubin commented at 7:04 PM on January 3, 2022: contributor

    requesting re-acks from: @instagibbs @glozow @ariard @laanwj @Talkless

  66. in src/validation.cpp:351 in c5b36b1c1b
     347 | @@ -348,7 +348,9 @@ void CChainState::MaybeUpdateMempoolForReorg(
     348 |      // previously-confirmed transactions back to the mempool.
     349 |      // UpdateTransactionsFromBlock finds descendants of any transactions in
     350 |      // the disconnectpool that were added back and cleans up the mempool state.
     351 | -    m_mempool->UpdateTransactionsFromBlock(vHashUpdate);
     352 | +    const uint64_t ancestor_count_limit = gArgs.GetIntArg("-limitancestorcount", DEFAULT_ANCESTOR_LIMIT);
    


    instagibbs commented at 2:46 AM on January 4, 2022:

    any reason we can't access these fields via m_mempool->m_limit_ancestors and m_mempool->m_limit_ancestor_size?


    JeremyRubin commented at 5:12 AM on January 4, 2022:

    i think the main reason is that there is no such field on m_mempool?


    JeremyRubin commented at 5:13 AM on January 4, 2022:

    (it's a field on MemPoolAccept not on the mempool iteself)


    instagibbs commented at 5:17 AM on January 4, 2022:

    ok that would do it 🤦

  67. instagibbs commented at 5:18 AM on January 4, 2022: member

    reACK c5b36b1

  68. sipa commented at 8:23 PM on January 7, 2022: member

    utACK c5b36b1c1b11f04e5da7fb44183f61d09a14e40d

  69. mzumsande commented at 11:18 PM on January 24, 2022: member

    Code Review ACK c5b36b1c1b11f04e5da7fb44183f61d09a14e40d The change looks correct to me, the doc rework for the touched functions is an improvement.

  70. instagibbs commented at 11:30 PM on January 24, 2022: member

    RFM?

  71. fanquake merged this on Jan 25, 2022
  72. fanquake closed this on Jan 25, 2022

  73. fanquake removed this from the "Blockers" column in a project

  74. sidhujag referenced this in commit 53664162ec on Jan 28, 2022
  75. Fabcien referenced this in commit 921fe3f989 on Nov 24, 2022
  76. DrahtBot locked this on Jan 25, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-29 06:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me