Improve performance of p2p inv to send queues #27610

pull ajtowns wants to merge 2 commits into bitcoin:master from ajtowns:202305-invtosend changing 2 files +11 −4
  1. ajtowns commented at 0:11 am on May 10, 2023: contributor

    A couple of performance improvements when draining the inventory-to-send queue:

    • drop txs that have already been evicted from the mempool (or included in a block) immediately, rather than at the end of processing
    • marginally increase outgoing trickle rate during spikes in tx volume
  2. DrahtBot commented at 0:11 am on May 10, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK darosior, willcl-ark, instagibbs, glozow, dergoegge
    Concept ACK 0xB10C, pinheadmz

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. txmempool: have CompareDepthAndScore sort missing txs first
    We use CompareDepthAndScore to choose an order of txs to inv. Rather
    than sorting txs that have been evicted from the mempool at the end
    of the list, sort them at the beginning so they are removed from
    the queue immediately.
    228e9201ef
  4. net_processing: Boost inv trickle rate
    If transactions are being added to the mempool at a rate faster than 7tx/s
    (INVENTORY_BROADCAST_PER_SECOND) then peers' inventory_to_send queue can
    become relatively large. If this happens, increase the number of txids
    we include in an INV message (normally capped at 35) by 5 for each 1000
    txids in the queue.
    
    This will tend to clear a temporary excess out reasonably quickly; an
    excess of 4000 invs to send will be cleared down to 1000 in about 30
    minutes, while an excess of 20000 invs would be cleared down to 1000 in
    about 60 minutes.
    5b3406094f
  5. ajtowns force-pushed on May 10, 2023
  6. darosior commented at 7:20 am on May 10, 2023: member
    utACK 5b3406094f2679dfb3763de4414257268565b943
  7. 0xB10C commented at 1:04 pm on May 10, 2023: contributor
    Concept ACK
  8. willcl-ark commented at 1:05 pm on May 10, 2023: member

    ACK 5b34060

    Currently running this cherry-picked on top of v24.0.1 on mainnet and seeing reduced resource usage.

  9. instagibbs commented at 1:19 pm on May 10, 2023: member

    ACK https://github.com/bitcoin/bitcoin/pull/27610/commits/5b3406094f2679dfb3763de4414257268565b943

    Significant reduction in CPU usage when the influx of transactions is high and sustained. Allows an additional INV entry to trickle per additional 200 entries of backlog, capped at 1000 per message.

  10. 0xB10C commented at 2:48 pm on May 10, 2023: contributor

    Compared where time is spent in the b-msghand thread on a mainnet master node and a mainnet 5b3406094f2679dfb3763de4414257268565b943 node that had both been running for a while. Followed eklitzke’s flamegraph.md to create the flamegraphs below. This does indeed seem to be a nice performance improvement; the second flamegraph looks healthier.

    master:

    flamegraph-master

    5b3406094f2679dfb3763de4414257268565b943:

    flamegraph-202305-invtosend

  11. in src/net_processing.cpp:5670 in 5b3406094f
    @@ -5666,7 +5666,9 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
                         // especially since we have many peers and some will draw much shorter delays.
                         unsigned int nRelayedTransactions = 0;
                         LOCK(tx_relay->m_bloom_filter_mutex);
    -                    while (!vInvTx.empty() && nRelayedTransactions < INVENTORY_BROADCAST_MAX) {
    +                    size_t broadcast_max{INVENTORY_BROADCAST_MAX + (tx_relay->m_tx_inventory_to_send.size()/1000)*5};
    +                    broadcast_max = std::min<size_t>(1000, broadcast_max);
    


    pinheadmz commented at 3:05 pm on May 10, 2023:

    Why stop at 1000? If it’s the same rationale could you use this constant?

    https://github.com/bitcoin/bitcoin/blob/104eed116614996d9724f82d47e373ef420cd372/src/net_processing.cpp#L109-L110


    ajtowns commented at 2:28 am on May 11, 2023:

    It’s more related to MAX_PEER_TX_ANNOUNCEMENTS=5000 (we don’t want to send more than we would be willing to receive) and INVENTORY_MAX_RECENT_RELAY=3500 (we don’t want to advertise txids and then refuse to relay them because we’ve forgotten that we advertised them).

    (Could make sense to limit it to 90 for inbounds and 45 for outbounds as (if sustained for two minutes) those values would hit roughly a 1-in-100 chance of overflowing INVENTORY_MAX_RECENT_RELAY)


    dergoegge commented at 11:20 am on May 11, 2023:
    Explaining that in a comment would be nice

    instagibbs commented at 1:16 pm on May 11, 2023:
    static asserts or something would be even better
  12. in src/net_processing.cpp:5669 in 5b3406094f
    @@ -5666,7 +5666,9 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
                         // especially since we have many peers and some will draw much shorter delays.
                         unsigned int nRelayedTransactions = 0;
                         LOCK(tx_relay->m_bloom_filter_mutex);
    -                    while (!vInvTx.empty() && nRelayedTransactions < INVENTORY_BROADCAST_MAX) {
    +                    size_t broadcast_max{INVENTORY_BROADCAST_MAX + (tx_relay->m_tx_inventory_to_send.size()/1000)*5};
    


    glozow commented at 3:28 pm on May 10, 2023:

    Could this increase the number of transactions we’re announcing beyond INVENTORY_MAX_RECENT_RELAY = 3500 so things fall out of m_recently_announced_invs (essentially what you said in #27602 (review))?

    Maybe we want to increase INVENTORY_MAX_RECENT_RELAY, e.g. to equal MAX_PEER_TX_ANNOUNCEMENTS = 5000 (the maximum number of requests coming from the receiver’s end if they’re Bitcoin Core; imo it could make sense for these to be symmetrical)?


    ajtowns commented at 1:02 am on May 11, 2023:
    The 3500 figure is calculated based on outbound peers, which already have 2.5x the INV rate of inbound peers. An inbound peer would need to be at 65 or 70 txids per INV message to have the same 1-in-a-million chance of a problem, implying a tx rate of 13 or 14 tx/s. To get to a 1-in-100 chance of a problem, you’d need to hit 90-to-95 txids per INV (and sustain that for ~40 INVs, and hit the 1-in-100 chance of sending those 40 INVs in a 2 minute period). Those seem okay to me.

    dergoegge commented at 11:20 am on May 11, 2023:
    • INVENTORY_BROADCAST_MAX should be renamed or have its comment amended
    • clang-format?
    • Maybe add the commit description as a comment here

    dergoegge commented at 12:54 pm on May 11, 2023:
                        size_t broadcast_max{INVENTORY_BROADCAST_MAX + (tx_relay->m_tx_inventory_to_send.size() / 1000) * count_seconds(INBOUND_INVENTORY_BROADCAST_INTERVAL)};
    

    Sjors commented at 4:39 pm on May 15, 2023:
    @dergoegge INVENTORY_BROADCAST_MAX has been poorly named for a while. It applies to inbound peers, but we send outbound peers 5/2× as many.

    Sjors commented at 5:21 pm on May 15, 2023:

    Note to others: we may send outbound peers 7 * 2.5 * 60 * 2 ~= 2100 transaction ids per 2 minutes. Depending on random variation it can be more. UNCONDITIONAL_RELAY_DELAY = 2min is the period of time that we have to remember what we sent to a specific peer. For privacy reasons we don’t want to reveal these transactions to other (spy) peers yet. We do this tracking with a bloom filter, m_recently_announced_invs, that has a 1-in-a-million chance of messing up. I forgot if it’s a false positive (we give the spy a transaction we shouldn’t have) or a false negative (we deny them a transaction we just announced).

    Now my question: this change increases the max rate for outbound peers too, so I’m still confused why we don’t have to increase the 3500. This PR increases the maximum rate to outbound peers from 7 * 2.5 = 17.5 tx/s to 1000 / 2 = 500 tx/s (in theory, presumably not in practice), so a 4-second burst would fill it up.

    From below:

    (Could make sense to limit it to 90 for inbounds and 45 for outbounds as (if sustained for two minutes) those values would hit roughly a 1-in-100 chance of overflowing INVENTORY_MAX_RECENT_RELAY)

    That would make more sense to me as well, plus:

    static asserts or something would be even better

  13. in src/txmempool.cpp:758 in 5b3406094f
    @@ -755,11 +755,16 @@ void CTxMemPool::check(const CCoinsViewCache& active_coins_tip, int64_t spendhei
    
     bool CTxMemPool::CompareDepthAndScore(const uint256& hasha, const uint256& hashb, bool wtxid)
     {
    +    /* Return `true` if hasha should be considered sooner than hashb. Namely when:
    


    pinheadmz commented at 3:28 pm on May 10, 2023:

    Is there a reason why we can’t ignore not-in-mempool txs before we even get to this point?

    i.e. here:

    https://github.com/bitcoin/bitcoin/blob/104eed116614996d9724f82d47e373ef420cd372/src/net_processing.cpp#L5671-L5673


    ajtowns commented at 1:07 am on May 11, 2023:
    Just seems like more code to do it that way, since that is just dealing with txids and CompareDepthAndScore already has access to the mempool info.

    Sjors commented at 9:04 am on May 12, 2023:

    For a follow-up it would be good to clarify what “should be considered” means, i.e. considered for what. It’s somewhat confusing to reason about because the call site reverses a and b.

    The first case in the comment is “a is not in the mempool, but b is”, but the line of code checks if b is not in the mempool.


    maflcko commented at 9:15 am on May 12, 2023:

    The first case in the comment is “a is not in the mempool, but b is”, but the line of code checks if b is not in the mempool.

    If b is not in the mempool, the function must return false, see #27610 (review)

    If b is in the mempool, the function will continue over the early return false, and checks to see if a is missing, in which case it should be considered sooner.


    Sjors commented at 1:39 pm on May 12, 2023:

    “should be considered” for what

    For processing, which begins by dropping a transaction from m_tx_inventory_to_send before deciding whether to send it. Since we stop processing after having sent broadcast_max transactions, it makes sense to do the cleanup first.

    A return value of true means hasha is lower, i.e. would be at the bottom of the (max)heap and processed last. But because we swap a and b when calling this, it works. That was introduced in the second commit of #7840.

    We could prune non-mempool transactions even before making a heap, but that would double the mempool lookups, and the flame graph suggests that make_heap is already plenty fast.

  14. pinheadmz commented at 3:29 pm on May 10, 2023: member

    Concept ACK

    A few questions below.

    I have a non-debug mainnet node with a 100% b-msghandler thread. I’m going to deploy this branch today and monitor

  15. in src/txmempool.cpp:766 in 228e9201ef outdated
    -    indexed_transaction_set::const_iterator i = wtxid ? get_iter_from_wtxid(hasha) : mapTx.find(hasha);
    -    if (i == mapTx.end()) return false;
         indexed_transaction_set::const_iterator j = wtxid ? get_iter_from_wtxid(hashb) : mapTx.find(hashb);
    -    if (j == mapTx.end()) return true;
    +    if (j == mapTx.end()) return false;
    +    indexed_transaction_set::const_iterator i = wtxid ? get_iter_from_wtxid(hasha) : mapTx.find(hasha);
    


    maflcko commented at 10:03 am on May 11, 2023:
    question (feel free to ignore): I tried to figure out if there was a reason why this was re-ordered, but couldn’t find one. An equivalent patch of simply changing the literals false to true and true to false in this function should have achieved the same, right?

    ajtowns commented at 2:13 pm on May 11, 2023:
    The CompareTxMemPoolEntryByScore comparator returns false on equality (ie, it’s “less than” not “less than or equal”), so this preserves that behaviour.

    maflcko commented at 3:39 pm on May 11, 2023:
    Thanks, I should have just read the documentation: https://en.cppreference.com/w/cpp/named_req/Compare , which would have answered this.
  16. glozow commented at 12:49 pm on May 11, 2023: member
    code review ACK 5b3406094f2679dfb3763de4414257268565b943
  17. dergoegge approved
  18. dergoegge commented at 12:55 pm on May 11, 2023: member

    utACK 5b3406094f2679dfb3763de4414257268565b943

    The code and approach looks good to me.

    (my comments can be addressed in a follow-up)

  19. fanquake merged this on May 11, 2023
  20. fanquake closed this on May 11, 2023

  21. brunoerg approved
  22. brunoerg commented at 1:21 pm on May 11, 2023: contributor
    post-merge crACK 5b3406094f2679dfb3763de4414257268565b943
  23. in src/txmempool.cpp:759 in 5b3406094f
    @@ -755,11 +755,16 @@ void CTxMemPool::check(const CCoinsViewCache& active_coins_tip, int64_t spendhei
    
     bool CTxMemPool::CompareDepthAndScore(const uint256& hasha, const uint256& hashb, bool wtxid)
     {
    +    /* Return `true` if hasha should be considered sooner than hashb. Namely when:
    +     *   a is not in the mempool, but b is
    


    brunoerg commented at 1:23 pm on May 11, 2023:
    Could we have these cases covered by a unit test?

    glozow commented at 1:57 pm on May 11, 2023:

    brunoerg commented at 5:50 pm on May 11, 2023:
    Cool, @glozow.
  24. fanquake referenced this in commit a9a861af2b on May 11, 2023
  25. fanquake referenced this in commit 128da6e41f on May 11, 2023
  26. fanquake referenced this in commit 1adbcd302f on May 11, 2023
  27. fanquake referenced this in commit 7ef71e30c9 on May 11, 2023
  28. maflcko commented at 1:59 pm on May 11, 2023: member
    first commit lgtm. Left one question. Didn’t look too long at the second commit.
  29. achow101 referenced this in commit ac5b9f37de on May 11, 2023
  30. mzumsande commented at 3:57 pm on May 11, 2023: contributor

    Post-Merge ACK.

    I’ve been running this patch (plus added reporting in getpeerinfo, https://github.com/mzumsande/bitcoin/commit/6932997b1b41d2197094a0a9c71601df72630501) for a few days, and have seen neither extremely high backlogs (> 5000) in m_tx_inventory_to_send nor elevated CPU use.

    In order to better monitor the efficacy of this, it might make sense to expose the size of m_tx_inventory_to_send via RPC permanently?

  31. fanquake referenced this in commit d0a2c87214 on May 11, 2023
  32. fanquake referenced this in commit 06731d19bc on May 11, 2023
  33. fanquake referenced this in commit 8996da626d on May 11, 2023
  34. sidhujag referenced this in commit 5557e20a78 on May 11, 2023
  35. fanquake referenced this in commit 2e9fc2e353 on May 12, 2023
  36. Sjors commented at 1:21 pm on May 12, 2023: member
    Partial utACK, for 228e9201efb5574b1b96bb924de1d2e8dd1317f3 (i.e. the changes in CompareDepthAndScore). Took me a while to wrap my head around it.
  37. pinheadmz commented at 2:10 pm on May 17, 2023: member

    I’ve been running this branch for 7 days on a VPS and noticed this morning the CPU on b-msghand is back up to 100%. It was about that high running v24 release but dropped to 30% or so when I first switched to this branch and restarted.

  38. willcl-ark commented at 3:12 pm on May 17, 2023: member

    @pinheadmz I am also still running this patch, but I still see pretty stable utilisation in the range of ~2-12%, currently with 88 inbound peers.

    image

  39. 0xB10C commented at 5:19 pm on May 18, 2023: contributor

    I’ve been running this branch for 7 days on a VPS and noticed this morning the CPU on b-msghand is back up to 100%. It was about that high running v24 release but dropped to 30% or so when I first switched to this branch and restarted. @pinheadmz I’ve been running the patch on multiple nodes for a week now and haven’t seen 100% CPU usage in the b-msghand thread again. If you haven’t restarted or if it happens again, it would be helpful to see which functions are slow. perf top -p $(pidof bitcoind) should do the trick.

  40. maflcko commented at 5:30 pm on May 18, 2023: member
    Could also make sense to double check the debug.log to ensure you restarted bitcoind after compiling? :sweat_smile:
  41. pinheadmz commented at 5:35 pm on May 18, 2023: member

    Could also make sense to double check the debug.log to ensure you restarted bitcoind after compiling? 😅

    Phew! That would’ve been embarrassing

    2023-05-10T15:39:40Z Bitcoin Core version v25.99.0-5b3406094f26 (debug build)
    

    perf top -p $(pidof bitcoind) should do the trick.

    Not familiar with this tool but it looks cool! This is at the top. Every other line in the output is < 0.10%

    99.56%  bitcoind             [.] boost::multi_index::detail::safe_iterator_base::detach
    
  42. maflcko commented at 5:47 pm on May 18, 2023: member

    (debug build)

    See #27700 (comment) ?

  43. pinheadmz commented at 5:50 pm on May 18, 2023: member
    ah thanks I forgot I configured that way, will follow the other threads
  44. losh11 referenced this in commit 2170c245ee on Mar 3, 2024
  45. losh11 referenced this in commit 3270c6ca30 on Mar 6, 2024
  46. losh11 referenced this in commit b0f96bc53a on Mar 6, 2024
  47. arnout referenced this in commit a3a88ff1c8 on Mar 11, 2024
  48. arnout referenced this in commit d4169b9dbf on Mar 21, 2024
  49. arnout referenced this in commit e7529976cc on Mar 21, 2024
  50. arnout referenced this in commit a7affb39bc on Mar 21, 2024
  51. digirayc referenced this in commit cce321a8b1 on Apr 1, 2024
  52. bitcoin locked this on May 17, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me