Improve performance of p2p inv to send queues

ajtowns commented at 0:11 am on May 10, 2023: contributor

Couple of performance improvements when draining the inventory-to-send queue:

drop txs that have already been evicted from the mempool (or included in a block) immediately, rather than at the end of processing
marginally increase outgoing trickle rate during spikes in tx volume

DrahtBot commented at 0:11 am on May 10, 2023: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	darosior, willcl-ark, instagibbs, glozow, dergoegge
Concept ACK	0xB10C, pinheadmz

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

txmempool: have CompareDepthAndScore sort missing txs first

We use CompareDepthAndScore to choose an order of txs to inv. Rather
than sorting txs that have been evicted from the mempool at the end
of the list, sort them at the beginning so they are removed from
the queue immediately.

228e9201ef

net_processing: Boost inv trickle rate

If transactions are being added to the mempool at a rate faster than 7tx/s
(INVENTORY_BROADCAST_PER_SECOND) then peers' inventory_to_send queue can
become relatively large. If this happens, increase the number of txids
we include in an INV message (normally capped at 35) by 5 for each 1000
txids in the queue.

This will tend to clear a temporary excess out reasonably quickly; an
excess of 4000 invs to send will be cleared down to 1000 in about 30
minutes, while an excess of 20000 invs would be cleared down to 1000 in
about 60 minutes.

5b3406094f

ajtowns force-pushed on May 10, 2023

darosior commented at 7:20 am on May 10, 2023: member

utACK 5b3406094f2679dfb3763de4414257268565b943

0xB10C commented at 1:04 pm on May 10, 2023: contributor

Concept ACK

willcl-ark commented at 1:05 pm on May 10, 2023: member

ACK 5b34060

Currently running this cherry-picked on top of v24.0.1 on mainnet and seeing reduced resource usage.

instagibbs commented at 1:19 pm on May 10, 2023: member

ACK https://github.com/bitcoin/bitcoin/pull/27610/commits/5b3406094f2679dfb3763de4414257268565b943

Significant reduction in CPU usage when influx of transactions is high and sustained. Allows an additional INV to trickle per additional 200 INV backlog, capped at 1k.

0xB10C commented at 2:48 pm on May 10, 2023: contributor

Compared where the time is spent in the b-msghand thread on a mainnet master and a mainnet 5b3406094f2679dfb3763de4414257268565b943 node that were both running for a while. Followed eklitzke’s flamegraph.md to create the flamegraphs below. This seems to be indeed a nice performance improvement. The second flamegraph looks healthier.

master:

flamegraph-master

5b3406094f2679dfb3763de4414257268565b943:

flamegraph-202305-invtosend

in src/net_processing.cpp:5670 in 5b3406094f

5665@@ -5666,7 +5666,9 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
5666                     // especially since we have many peers and some will draw much shorter delays.
5667                     unsigned int nRelayedTransactions = 0;
5668                     LOCK(tx_relay->m_bloom_filter_mutex);
5669-                    while (!vInvTx.empty() && nRelayedTransactions < INVENTORY_BROADCAST_MAX) {
5670+                    size_t broadcast_max{INVENTORY_BROADCAST_MAX + (tx_relay->m_tx_inventory_to_send.size()/1000)*5};
5671+                    broadcast_max = std::min<size_t>(1000, broadcast_max);

pinheadmz commented at 3:05 pm on May 10, 2023:

Why stop at 1000? If it’s the same rationale could you use this constant?

https://github.com/bitcoin/bitcoin/blob/104eed116614996d9724f82d47e373ef420cd372/src/net_processing.cpp#L109-L110

ajtowns commented at 2:28 am on May 11, 2023:

It’s more related to MAX_PEER_TX_ANNOUNCEMENTS=5000 (we don’t want to send more than we would be willing to receive) and INVENTORY_MAX_RECENT_RELAY=3500 (we don’t want to advertise txids and then refuse to relay them because we’ve forgotten that we advertised them).

(Could make sense to limit it to 90 for inbounds and 45 for outbounds as (if sustained for two minutes) those values would hit roughly a 1-in-100 chance of overflowing INVENTORY_MAX_RECENT_RELAY)

dergoegge commented at 11:20 am on May 11, 2023:

Explaining that in a comment would be nice

instagibbs commented at 1:16 pm on May 11, 2023:

static asserts or something would be even better

in src/net_processing.cpp:5669 in 5b3406094f

5665@@ -5666,7 +5666,9 @@ bool PeerManagerImpl::SendMessages(CNode* pto)
5666                     // especially since we have many peers and some will draw much shorter delays.
5667                     unsigned int nRelayedTransactions = 0;
5668                     LOCK(tx_relay->m_bloom_filter_mutex);
5669-                    while (!vInvTx.empty() && nRelayedTransactions < INVENTORY_BROADCAST_MAX) {
5670+                    size_t broadcast_max{INVENTORY_BROADCAST_MAX + (tx_relay->m_tx_inventory_to_send.size()/1000)*5};

glozow commented at 3:28 pm on May 10, 2023:

Could this increase the number of transactions we’re announcing beyond INVENTORY_MAX_RECENT_RELAY = 3500 so things fall out of m_recently_announced_invs (essentially what you said in #27602 (review))?

Maybe we want to increase INVENTORY_MAX_RECENT_RELAY, e.g. equal to MAX_PEER_TX_ANNOUNCEMENTS = 5000 (the max requesting coming from the receiver end if they’re Bitcoin Core - imo it could make sense for these to be symmetrical)?

ajtowns commented at 1:02 am on May 11, 2023:

The 3500 figure is calculated based on outbound peers, which already have 2.5x the INV rate of inbound peers. An inbound peer would need to be at 65 or 70 txids per INV message to have the same 1-in-a-million chance of a problem, implying a tx rate of 13 or 14 tx/s. To get to a 1-in-100 chance of a problem, you’d need to hit 90-to-95 txids per INV (and sustain that for ~40 INVs, and hit the 1-in-100 chance of sending those 40 INVs in a 2 minute period). Those seem okay to me.

dergoegge commented at 11:20 am on May 11, 2023:

INVENTORY_BROADCAST_MAX should be renamed or have its comment amended
clang-format?
Maybe add the commit description as a comment here

dergoegge commented at 12:54 pm on May 11, 2023:

0                    size_t broadcast_max{INVENTORY_BROADCAST_MAX + (tx_relay->m_tx_inventory_to_send.size() / 1000) * count_seconds(INBOUND_INVENTORY_BROADCAST_INTERVAL)};

Sjors commented at 4:39 pm on May 15, 2023:

@dergoegge INVENTORY_BROADCAST_MAX has been poorly named for a while. It applies to inbound peers, but to outbounds we send 5 / 2 more.

Sjors commented at 5:21 pm on May 15, 2023:

Note to others: we may send outbound peers 7 * 2.5 * 60 * 2 ~= 2100 transaction ids per 2 minutes. Depending on random variation it can be more. UNCONDITIONAL_RELAY_DELAY = 2min; is the period of time that we have to remember what we sent to a specific peer. For privacy reason we don’t want to reveal these transactions to other (spy) peers yet. We do this tracking with a bloom filter m_recently_announced_invs that has a 1 in a million chance of messing up. I forgot if it’s false positive (we give the spy a transaction we shouldn’t have) or negative (we deny them a transaction we just announced).

Now my question: This change increases the max rate for outbound rate too, so I’m still confused why we don’t have to increase the 3500. This PR increases the maximum rate to outbound peers from 7 * 2.5 = 17.5 tx/s to 1000 / 2 = 500 tx/s (in theory, presumably not in practice), so a 4 second burst would fill it up.

From below:

(Could make sense to limit it to 90 for inbounds and 45 for outbounds as (if sustained for two minutes) those values would hit roughly a 1-in-100 chance of overflowing INVENTORY_MAX_RECENT_RELAY)

That would make more sense to me as well, plus:

static asserts or something would be even better

in src/txmempool.cpp:758 in 5b3406094f

754@@ -755,11 +755,16 @@ void CTxMemPool::check(const CCoinsViewCache& active_coins_tip, int64_t spendhei
755 
756 bool CTxMemPool::CompareDepthAndScore(const uint256& hasha, const uint256& hashb, bool wtxid)
757 {
758+    /* Return `true` if hasha should be considered sooner than hashb. Namely when:

pinheadmz commented at 3:28 pm on May 10, 2023:

Is there a reason why we can’t ignore not-in-mempool txs before we even get to this point?

i.e. here:

https://github.com/bitcoin/bitcoin/blob/104eed116614996d9724f82d47e373ef420cd372/src/net_processing.cpp#L5671-L5673

ajtowns commented at 1:07 am on May 11, 2023:

Just seems like more code to do it that way, since that is just dealing with txids and CompareDepthAndScore already has access to the mempool info.

Sjors commented at 9:04 am on May 12, 2023:

For a followup it would be could to clarify “should be considered” for what. It’s somewhat confusing to reason about because the call site reverses a and b.

The first case in the comment is “a is not in the mempool, but b is”, but the line of code checks if b is not in the mempool.

maflcko commented at 9:15 am on May 12, 2023:

The first case in the comment is “a is not in the mempool, but b is”, but the line of code checks if b is not in the mempool.

If b is not in the mempool, the function must return false, see #27610 (review)

If b is in the mempool, the function will continue over the early return false, and checks to see if a is missing, in which case it should be considered sooner.

Sjors commented at 1:39 pm on May 12, 2023:

“should be considered” for what

For processing, which begins by dropping a transaction from m_tx_inventory_to_send before deciding whether to send it. Since we stop processing after having sent broadcast_max transactions, it makes sense to do the cleaning first.

A return value of true means hasha is lower, i.e. would be at the bottom of the (max)heap and processed last. But because we swap a and b when calling this, it works. That was introduced in the second commit of #7840.

We could prune non-mempool transactions even before making a heap, but that would double the mempool lookups, and the flame graph suggests that make_heap is plenty fast already now.

pinheadmz commented at 3:29 pm on May 10, 2023: member

Concept ACK

A few questions below.

I have a non-debug mainnet node with a 100% b-msghandler thread. I’m going to deploy this branch today and monitor

in src/txmempool.cpp:766 in 228e9201ef outdated

764-    indexed_transaction_set::const_iterator i = wtxid ? get_iter_from_wtxid(hasha) : mapTx.find(hasha);
765-    if (i == mapTx.end()) return false;
766     indexed_transaction_set::const_iterator j = wtxid ? get_iter_from_wtxid(hashb) : mapTx.find(hashb);
767-    if (j == mapTx.end()) return true;
768+    if (j == mapTx.end()) return false;
769+    indexed_transaction_set::const_iterator i = wtxid ? get_iter_from_wtxid(hasha) : mapTx.find(hasha);

maflcko commented at 10:03 am on May 11, 2023:

question (feel free to ignore): I tried to figure out if there was a reason why this was re-ordered, but couldn’t find one. An equivalent patch of simply changing the literals false to true and true to false in this function should have achieved the same, right?

ajtowns commented at 2:13 pm on May 11, 2023:

The CompareTxMemPoolEntryByScore comparator returns false on equality (ie, it’s “less than” not “less than or equal”), so this preserves that behaviour.

maflcko commented at 3:39 pm on May 11, 2023:

Thanks, I should have just read the documentation: https://en.cppreference.com/w/cpp/named_req/Compare , which would have answered this.

glozow commented at 12:49 pm on May 11, 2023: member

code review ACK 5b3406094f2679dfb3763de4414257268565b943

dergoegge approved

dergoegge commented at 12:55 pm on May 11, 2023: member

utACK 5b3406094f2679dfb3763de4414257268565b943

The code and approach looks good to me.

(my comments can be addressed in a follow-up)

fanquake merged this on May 11, 2023

fanquake closed this on May 11, 2023

brunoerg approved

brunoerg commented at 1:21 pm on May 11, 2023: contributor

post-merge crACK 5b3406094f2679dfb3763de4414257268565b943

in src/txmempool.cpp:759 in 5b3406094f

754@@ -755,11 +755,16 @@ void CTxMemPool::check(const CCoinsViewCache& active_coins_tip, int64_t spendhei
755 
756 bool CTxMemPool::CompareDepthAndScore(const uint256& hasha, const uint256& hashb, bool wtxid)
757 {
758+    /* Return `true` if hasha should be considered sooner than hashb. Namely when:
759+     *   a is not in the mempool, but b is

brunoerg commented at 1:23 pm on May 11, 2023:

Could we have these cases covered by a unit test?

glozow commented at 1:57 pm on May 11, 2023:

I’ve written a unit test here: https://github.com/glozow/bitcoin/commits/2023-05-comparedepthandscore-test

brunoerg commented at 5:50 pm on May 11, 2023:

Cool, @glozow.

fanquake referenced this in commit a9a861af2b on May 11, 2023

fanquake referenced this in commit 128da6e41f on May 11, 2023

fanquake referenced this in commit 1adbcd302f on May 11, 2023

fanquake referenced this in commit 7ef71e30c9 on May 11, 2023

maflcko commented at 1:59 pm on May 11, 2023: member

first commit lgtm. Left one question. Didn’t look too long at the second commit.

achow101 referenced this in commit ac5b9f37de on May 11, 2023

mzumsande commented at 3:57 pm on May 11, 2023: contributor

Post-Merge ACK.

I’ve been running this patch (plus added reporting in getpeerinfo, https://github.com/mzumsande/bitcoin/commit/6932997b1b41d2197094a0a9c71601df72630501) for a few days, and have neither seen extremely high backlogs > 5000 in m_tx_inventory_to_send nor enhanced CPU use.

In order to better monitor the efficacy of this, it might make sense to expose the size of m_tx_inventory_to_send via RPC permanently?

fanquake referenced this in commit d0a2c87214 on May 11, 2023

fanquake referenced this in commit 06731d19bc on May 11, 2023

fanquake referenced this in commit 8996da626d on May 11, 2023

sidhujag referenced this in commit 5557e20a78 on May 11, 2023

fanquake referenced this in commit 2e9fc2e353 on May 12, 2023

Sjors commented at 1:21 pm on May 12, 2023: member

Partial utACK, for 228e9201efb5574b1b96bb924de1d2e8dd1317f3 (i.e. the changes in CompareDepthAndScore). Took me a while to wrap my head around it.

pinheadmz commented at 2:10 pm on May 17, 2023: member

I’ve been running this branch for 7 days on a VPS and noticed this morning the CPU on b-msghand is back up to 100%. It was about that high running v24 release but dropped to 30% or so when I first switched to this branch and restarted.

willcl-ark commented at 3:12 pm on May 17, 2023: member

@pinheadmz I am also still running this patch, but I still see pretty stable utilisation in the range of ~2-12%, currently with 88 inbound peers.

0xB10C commented at 5:19 pm on May 18, 2023: contributor

I’ve been running this branch for 7 days on a VPS and noticed this morning the CPU on b-msghand is back up to 100%. It was about that high running v24 release but dropped to 30% or so when I first switched to this branch and restarted. @pinheadmz I’ve been running the patch on multiple nodes for a week now and haven’t seen 100% CPU usage in the b-msghand thread again. If you haven’t restarted or if it happens again, it would be helpful to see which functions are slow. perf top -p $(pidof bitcoind) should do the trick.

maflcko commented at 5:30 pm on May 18, 2023: member

Could also make sense to double check the debug.log to ensure you restarted bitcoind after compiling? :sweat_smile:

pinheadmz commented at 5:35 pm on May 18, 2023: member

Could also make sense to double check the debug.log to ensure you restarted bitcoind after compiling? 😅

Phew! That would’ve been embarassing

02023-05-10T15:39:40Z Bitcoin Core version v25.99.0-5b3406094f26 (debug build)

perf top -p $(pidof bitcoind) should do the trick.

Not familiar with this tool but it looks cool! This is at the top. Every other line in the output is < 0.10%

0  99.56%  bitcoind             [.] boost::multi_index::detail::safe_iterator_base::detach

maflcko commented at 5:47 pm on May 18, 2023: member

(debug build)

See #27700 (comment) ?

pinheadmz commented at 5:50 pm on May 18, 2023: member

ah thanks I forgot I configured that way, will follow the other threads

losh11 referenced this in commit 2170c245ee on Mar 3, 2024

losh11 referenced this in commit 3270c6ca30 on Mar 6, 2024

losh11 referenced this in commit b0f96bc53a on Mar 6, 2024

arnout referenced this in commit a3a88ff1c8 on Mar 11, 2024

arnout referenced this in commit d4169b9dbf on Mar 21, 2024

arnout referenced this in commit e7529976cc on Mar 21, 2024

arnout referenced this in commit a7affb39bc on Mar 21, 2024

digirayc referenced this in commit cce321a8b1 on Apr 1, 2024

bitcoin locked this on May 17, 2024

Improve performance of p2p inv to send queues #27610

Reviews