p2p: Increase tx relay rate

ajtowns commented at 1:59 am on October 5, 2023: contributor

In the presence of smaller transactions on the network, blocks can sustain a higher relay rate than 7tx/second. In this event, the per-peer inventory queues can grow too large.

This commit bumps the rate up to 14 tx/s (for inbound peers), increasing the safety margin by a factor of 2.

Outbound peers continue to receive relayed transactions at 2.5x the rate of inbound peers, for a rate of 35tx/second.

DrahtBot commented at 1:59 am on October 5, 2023: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/28592.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Stale ACK	naumenkogs

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

DrahtBot added the label P2P on Oct 5, 2023

ajtowns commented at 2:02 am on October 5, 2023: contributor

Reopens #27630 with a reduced rate (14tx/s vs 22tx/s). Here’s a graph showing the rise in the tx rate over the past year or so, generated from getchaintxstats:

Spreadsheet with the detailed numbers (and command line for reproducing) at: https://docs.google.com/spreadsheets/d/1RSWOmT_fWOdDERuI7yM_v_57C6DkCBojebyILT4eVZw/edit?usp=sharing

Note that these numbers will be an underestimate as it ignores RBFed transactions, and transaction bursts.

(144 blocks is about 24 hours, 36 blocks is about 6 hours)

ajtowns commented at 2:10 am on October 5, 2023: contributor

Perhaps worth explicitly noting: this is simplified compared to #27630 since #27675 removed INVENTORY_MAX_RECENT_RELAY (and m_recently_announced_invs for which that constant was needed). If it’s backported without that PR, then that constant will also need to be updated, which will increase per-peer memory usage. Backporting shouldn’t be necessary though, as #27610 (see also #27613, #27614) should be sufficient.

Sjors commented at 2:38 pm on October 5, 2023: member

Seems reasonable. I’ll run with 80489ba6e8430177db5c1d0c9148e83e410bd704 for a while.

ariard commented at 7:30 pm on October 5, 2023: none

If this is same than #27630 at least one benefit for second-layer and Bitcoin applications can be too document tx relay rate, see #27630 (comment)

naumenkogs commented at 8:11 am on October 9, 2023: member

ACK 80489ba6e8430177db5c1d0c9148e83e410bd704

The justification above makes sense. With this change, going through m_tx_inventory_to_send will be more predictable and reactive, making the tx relay smoother.

I also checked that the catch up code introduced in 5b3406094f2679dfb3763de4414257268565b943 still makes sense. (I think the commit message from there better be in the code because the division by 1000 is not trivial). I think catching up will become slower in the case when m_tx_inventory_to_send is around 1,000,000; but I think it’s fine.

I also want to note that it’s not exactly the relay rate, it’s the rate of going through the queue. Because some transactions won’t be relayed e.g. due to the fee filter.

luke-jr referenced this in commit 87cffd6789 on Oct 19, 2023

ajtowns commented at 5:28 pm on November 28, 2023: contributor

@Sjors did you want to comment on any results re #28592 (comment) ?

Sjors commented at 6:10 pm on November 28, 2023: member

@ajtowns it did not crash my node. That’s about the extend of monitoring I did on it, not sure what to measure.

ajtowns commented at 3:35 pm on November 30, 2023: contributor

<instagibbs> _aj_ might be helpful to know what the potential downsides are to the PR
             (I have not thought about this stuff deeply)
<_aj_> instagibbs: ie, why there's a limit at all?

The code comment explanation of this is:

0/** Maximum rate of inventory items to send per second.
1 *  Limits the impact of low-fee transaction floods. */
2static constexpr unsigned int INVENTORY_BROADCAST_PER_SECOND = 7;

My understanding of this limit is that our tx relay works like this:

we have perhaps 100 peers, who send INV messages to us at whatever rate they like
our first limit applies if they send us huge bursts of INV messages that we/they can’t keep up to with GETDATA/TX: in that case AddTxAnnouncement will hit the MAX_PEER_TX_ANNOUNCEMENTS limit (5000, overridden by “relay” perm), and drop INV messages until we catch up (due to sending a GETDATA for the corresponding tx and getting a reply from any node; or a NOTFOUND or timeout by this node)
we then send GETDATA for the new txs: the second limit applies here - if we already have more than MAX_PEER_TX_REQUEST_IN_FLIGHT GETDATA requests outstanding with this peer (100), we’ll wait an extra OVERLOADED_PEER_DELAY seconds (2) in the hopes we can get the tx from another peer first
then we get the TX response, and add it to our mempool and relay it to everyone else
but the final rate limit applies here: we limit our outbound INV announcemnets to only happen every INBOUND_INVENTORY_BROADCAST_INTERVAL (5s; 2s for outbound connections) on average, and aim to only include INVENTORY_BROADCAST_TARGET (7*5 = 35) txs in each announcement; which is bumped up by an additional 5 txs for every 1000 txs of backlog we have (see #27610).

The first two rate limits have no effect if our peers are high-bandwidth well-behaved peers: they can send us 500 tx/s every second indefinitely, and we’ll accept them all into our mempool provided they’re valid/obey rbf rules/etc. That’s a huge waste of bandwidth though: we can only put txs into blocks at much lower rates, so perhaps 450 or more of those 500 txs are going to either be replaced before being mined or just expire. We can’t really stop our peers from wasting bandwidth by sending tx data to us, but we can be kind to our peers, and only relay tx data that’s likely to be mined: so we pick the top set of txs (CompareInvMempoolOrder) trim the txs that the peer has announced to us, and then limit that to a rate somewhat around what can reasonably be confirmed.

(Note that the we’ll send txs faster to our outbounds by a factor of 2.5x (35tx/2s instead of 35tx/5s), and that we don’t announce txs that have been announced to us also has a multiplying effect)

The drawback of having this number be too low is thus (a) that txs may not propagate well, and this may affect txs that should go in the next blocks if feerates are rising, and (b) that our internal queues of what to send (m_tx_inventory_to_send) may grow large. The drawback of having this number be too high is that we will waste our peers’ bandwidth and cpu by giving them more txs than can possibly go into blocks, which they’ll then have to validate and may in turn relay onwards.

ajtowns commented at 3:39 pm on November 30, 2023: contributor

@ajtowns it did not crash my node. That’s about the extend of monitoring I did on it, not sure what to measure.

I have a couple of patches at https://github.com/ajtowns/bitcoin/tree/202305-reportnetdata which can be used to track if the inv_to_send queues grow large and how quickly they empty out. With those patches, this script can be interesting:

0bitcoin-cli getmempoolinfo | jq -j '"loaded:", .loaded, " ", .size, " txs ", (.bytes/1000000|floor), "MvB ", (.usage/.maxmempool*1000|floor)/10, "% used ", (.mempoolminfee*1e6|floor)/10, "s/vB minfee\n"'
1echo
2bitcoin-cli getnetworkinfo | jq -j '(.internal | to_entries | .[] | (.value, " ", .key, "\n"))' | fmt
3echo
4bitcoin-cli getpeerinfo | jq -j '.[] | (.inv_to_send, " ", .connection_type, " ", .id, " ", .relaytxes, " ", .synced_blocks, " ", (.network|sub("not_publicly_routable";"onion")), " inv-rx:", ((.bytesrecv_per_msg.inv//0)/1024|floor), "kB inv-tx:", ((.bytessent_per_msg.inv//0)/1024|floor), "kB gd-rx:", ((.bytesrecv_per_msg.getdata//0)/1024|floor), "kB nf-tx:", ((.bytessent_per_msg.notfound//0)/1024|floor), "kB minfee:", (.minfeefilter*1e6|floor)/10, "s/vb ", (.subver|gsub(" |(^$)"; "_")), " ", .addr, "\n")' | sort -rn | column -t

amitiuttarwar commented at 9:25 pm on December 14, 2023: contributor

posted a topic to delving bitcoin to brainstorm using warnet to evaluate the impacts of this change

DrahtBot added the label CI failed on Jan 17, 2024

ajtowns force-pushed on Feb 13, 2024

Americason1 approved

DrahtBot removed the label CI failed on Feb 13, 2024

sdaftuar commented at 11:46 am on February 13, 2024: member

The drawback of having this number be too high is that we will waste our peers’ bandwidth and cpu by giving them more txs than can possibly go into blocks, which they’ll then have to validate and may in turn relay onwards.

Just to expand on this a bit more, a benefit of this number not being too high is that we reserve some of the network’s relay capacity for higher feerate transactions that might show up after, say, a flood of low feerate transactions arrives.

If we had no limit at all, then you could think of relay as being much more like a big FIFO queue, where the first transactions announced are the ones that end up being relayed to everyone. By buffering transactions, we allow higher feerate ones that are announced later to jump the queue.

See also #7805, where gmaxwell originally proposed this behavior (before it was adopted in #7840).

Retropex referenced this in commit c247609a9d on Mar 28, 2024

achow101 requested review from glozow on Apr 9, 2024

DrahtBot added the label CI failed on May 12, 2024

DrahtBot removed the label CI failed on May 13, 2024

DrahtBot added the label CI failed on Sep 12, 2024

DrahtBot removed the label CI failed on Sep 15, 2024

Sjors commented at 8:08 am on September 18, 2024: member

I’ve been running this patch for almost a year now. Didn’t do very extensive monitoring, but at least it didn’t crash the node or noticeably upset the machine.

DrahtBot added the label CI failed on Oct 24, 2024

DrahtBot commented at 10:00 am on October 25, 2024: contributor

Could rebase for fresh CI after 8 months?

DrahtBot removed the label CI failed on Oct 31, 2024

DrahtBot added the label CI failed on Nov 24, 2024

p2p: Increase tx relay rate

In the presence of smaller transactions on the network, blocks can sustain a
higher relay rate than 7tx/second. In this event, the per-peer inventory queues
can grow too large.

This commit bumps the rate up to 14 tx/s (for inbound peers), increasing the
safety margin by a factor of 2.

Outbound peers continue to receive relayed transactions at 2.5x the rate of
inbound peers, for a rate of 35tx/second.

Co-Authored-By: Suhas Daftuar <sdaftuar@gmail.com>

b81f37031c

ajtowns force-pushed on Dec 3, 2024

ajtowns commented at 8:44 am on December 3, 2024: contributor

Rebased for fresh CI

DrahtBot removed the label CI failed on Dec 3, 2024

luke-jr referenced this in commit 9e4dad615d on Feb 22, 2025

DrahtBot commented at 1:37 am on June 1, 2025: contributor

There hasn’t been much activity lately. What is the status here?

Finding reviewers may take time. However, if the patch is no longer relevant, please close this pull request. If the author lost interest or time to work on this, please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.

p2p: Increase tx relay rate #28592

Code Coverage & Benchmarks

Reviews