p2p: Replace per-peer transaction rate-limiting with global rate limits #34628

pull ajtowns wants to merge 10 commits into bitcoin:master from ajtowns:202602-mempool-invtosend changing 13 files +624 −106
  1. ajtowns commented at 3:09 AM on February 20, 2026: contributor

    Per-peer m_tx_inventory_to_send queues have CPU and memory costs that scale with both queue size and peer count. Under high transaction volume, this has previously caused severe issues (May 2023 disclosure) and still can cause measurable delays (Feb 2026 Runestone surge, with the msghand thread observed hitting 100% CPU and queue memory reaching ~95MB).

    This PR replaces the per-peer rate limiting with a global queue using dual token buckets (limiting transaction by both count and serialized size). Transactions that arrive within the bucket capacity still relay nearly immediately, but excess transactions queue in a global backlog and drain as the token buckets refill.

    Key parameters:

    • Count bucket: 14 tx/s, 420 capacity (30s buffer)
    • Size bucket: 20 kB/s (~12 MB/600s), 50 MB capacity
    • Outbound peers refill faster by a factor of 2.5

    Per-peer queues are retained solely for privacy batching and are always fully emptied, removing the old INVENTORY_BROADCAST_MAX cap.

    This reduces the memory and CPU burden during transaction spikes when the queuing logic is engaged from O(queue * peers) to O(queue), as the queued transactions no longer need to be retained per-peer or re-sorted per-peer.

    Design discussion: https://gist.github.com/ajtowns/d61bea974a07190fa6c6c8eaef3638b9

  2. DrahtBot added the label P2P on Feb 20, 2026
  3. DrahtBot commented at 3:10 AM on February 20, 2026: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK 0xB10C, polespinasa

    If your review is incorrectly listed, please copy-paste <code>&lt;!--meta-tag:bot-skip--&gt;</code> into the comment that the bot should ignore.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #35054 (p2p: UTXO set sharing by fjahr)
    • #34824 (net: refactor: replace Peer::TxRelay RecursiveMutex instances with Mutex by w0xlt)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    <!--5faf32d7da4f0f540f40219e4f7537a3-->

  4. DrahtBot added the label CI failed on Feb 20, 2026
  5. ajtowns force-pushed on Feb 20, 2026
  6. ajtowns commented at 12:43 PM on February 20, 2026: contributor

    CI failure is presumably either #34631 or #34387

  7. DrahtBot removed the label CI failed on Feb 24, 2026
  8. in src/util/tokenbucket.h:57 in 869a1ae012 outdated
      52 | +    }
      53 | +
      54 | +    /** Consume n tokens. Returns false if the balance dropped below m_max_debt. */
      55 | +    bool decrement(double n = 1.0)
      56 | +    {
      57 | +        m_value -= n;
    


    chriszeng1010 commented at 5:33 PM on March 2, 2026:

    Decrement can still go below m_max_debt before checking is complete.


    ajtowns commented at 12:57 PM on March 4, 2026:

    decrement() can always go below m_max_debt, it only reports when it has done so -- it leaves it up to the caller to not go further into debt.

  9. DrahtBot added the label Needs rebase on Mar 11, 2026
  10. ajtowns force-pushed on Mar 12, 2026
  11. DrahtBot removed the label Needs rebase on Mar 12, 2026
  12. 0xB10C commented at 9:45 AM on March 12, 2026: contributor

    Concept ACK!

    I've been running this for a few days now and written down a few observations on a small mass-broadcast event that happend a few hours ago: https://bnoc.xyz/t/increased-b-msghand-thread-utilization-due-to-runestone-transactions-on-2026-02-17/81/11

    The node with this patch was significantly less affected than the others running a recent master.

    I haven't set up any monitoring for the newly added getnetworkinfo fields yet.

  13. DrahtBot added the label CI failed on Mar 12, 2026
  14. DrahtBot closed this on Mar 12, 2026

  15. DrahtBot reopened this on Mar 12, 2026

  16. ajtowns added this to the milestone 32.0 on Mar 12, 2026
  17. DrahtBot removed the label CI failed on Mar 12, 2026
  18. in src/txmempool.cpp:541 in 344de4b8dd outdated
     537 | @@ -538,6 +538,55 @@ void CTxMemPool::check(const CCoinsViewCache& active_coins_tip, int64_t spendhei
     538 |          for (const auto& input: tx.vin) mempoolDuplicate.SpendCoin(input.prevout);
     539 |          AddCoins(mempoolDuplicate, tx, std::numeric_limits<int>::max());
     540 |      }
     541 | +
    


    sipa commented at 6:22 PM on March 20, 2026:

    In commit "txmempool: Add SortMiningScoreWithTopology"

    This feels more like something for a fuzz or unit test. CTxMemPool::check is for internal consistency checks in the CTxMemPool representation, I feel.


    ajtowns commented at 1:20 AM on March 21, 2026:

    .... I might have spent too much time vibecoding and caught hallucinations? I could have sworn I was replacing existing code here. EDIT: Dropped this code.

  19. in src/txmempool.cpp:605 in 344de4b8dd outdated
     601 | @@ -553,6 +602,27 @@ void CTxMemPool::check(const CCoinsViewCache& active_coins_tip, int64_t spendhei
     602 |      assert(innerUsage == cachedInnerUsage);
     603 |  }
     604 |  
     605 | +std::vector<CTxMemPool::txiter> CTxMemPool::SortMiningScoreWithTopology(std::span<const Wtxid> wtxids, size_t n) const
    


    sipa commented at 6:46 PM on March 20, 2026:

    It looks like both eventual production call sites of this function (BumpInvVecForProcessing and PeerManagerImpl::SendMessages) do a deduplication pass on the results.

    Would it make sense to do this on the fly inside this function? It can't use std::partial_sort anymore, but it can use std::make_heap and friends to implement partial sorting, with a dynamic end point until n distinct elements have been found? Something like

    std::vector<CTxMemPool::txiter> CTxMemPool::SortMiningScoreWithTopology(std::span<const Wtxid> wtxids, size_t n) const
    {
        auto cmp = [&](const auto& a, const auto& b) EXCLUSIVE_LOCKS_REQUIRED(cs) noexcept { return m_txgraph->CompareMainOrder(*a, *b) > 0; };
    
        std::vector<txiter> res;
    
        n = std::min(wtxids.size(), n);
        if (n > 0) {
            // Construct a heap with txiters for all wtxids that exist in the mempool.
            std::vector<txiter> heap;
            heap.reserve(wtxids.size());
            for (auto& wtxid : wtxids) {
                if (auto i{GetIter(wtxid)}; i.has_value()) {
                    heap.push_back(i.value());
                }
            }
            std::ranges::make_heap(heap, cmp);
    
            // Pop transactions until n distinct ones in res have been found.
            res.reserve(heap.size());
            while (res.size() < n && !heap.empty()) {
                std::ranges::pop_heap(heap, cmp);
                if (res.empty() || heap.back() != res.back()) {
                    res.push_back(heap.back());
                }
                heap.pop_back();
            }
    
            // Copy the remainder over, without sorting or deduplication.
            res.insert(res.end(), heap.begin(), heap.end());
        }
    
        return res;
    }
    

    With even more low-level code the duplicate vector can be avoided, I think. Tests don't pass with this, I haven't investigated why.


    ajtowns commented at 1:14 AM on March 21, 2026:

    Without having looked, is the comparison backwards?

    My understanding is partial_sort has two benefits:

    • it only makes a heap out of the target size, so iterates through the source array once and then does log(k) work for each element, with better locality
    • when updating the k elements in the heap with a new element from the source, it does the sift-down algorithm which is more efficient than heap_push()/heap_pop(), but isn't exposed via the STL so would mean writing your own heap implementation

    Deduping the fairly small output list as you pass through it, when duplicates are rare anyway, seemed fine to me?


    sipa commented at 2:20 PM on March 21, 2026:

    Without having looked, is the comparison backwards?

    I don't think so. It's a max heap, but I want to pop the "lowest" elements off first, so I needed to swap the comparator I think.

    My understanding is partial_sort has two benefits:

    Interesting. So it has complexity O(n log m) (with n = number of elements, m = sorted prefix size), while the approach I have in mind is O(n + (m log n)) (O(n) to construct the heap of all elements, and then m operations of each O(log n) to extract the best m elements. Complexity-wise, my approach seems better, since n > m, except it has worse memory locality. This makes me wonder if I'm missing something, since the cppreference.cpp documentation seems to imply std::partial_sort is intended for low m values.

    Deduping the fairly small output list as you pass through it, when duplicates are rare anyway, seemed fine to me?

    Yeah, it probably doesn't matter that much. It just looked like the deduplication is something that CTxMemPool::SortMiningScoreWithTopology could do internally since both call sites need it anyway. And then it seemed possible to have the count be dynamic, but only when using a different approach than what std::partial_sort seems to enable.


    ajtowns commented at 12:58 AM on March 22, 2026:

    Complexity-wise, my approach seems better, since n > m, except it has worse memory locality.

    Yeah. I think the ratio between the number of swaps each approach performs in the worst case is log2(m) : 2 -- if you give the input in exactly the wrong order, each element goes to the top of the heap with log(m) steps, whereas for the full heap, it adds up to 2. So for m~=100, that's 3.3x more swaps, but the swaps are contained to a set of 100 elements, which might get you more than a 3.3x speedup per-swap due to locality? (In the average case, for most elements you'll just compare to the top of heap element, find it's worse and do 0 swaps, and overall it reduces from O(n log(m) + m log(m)) to O(n + m log(m)) which is better than the O(n + m log(n)) from the full heap.

    Oh, hmm; in the per-peer thread we're always taking everything, so I think there we should be explicitly using std::sort then and not any of this partial business anyway (pushed). That should ensure that the m sizes we're using in practice are always about 70 (-txsendrate times inbound broadcast interval).

  20. in src/txmempool.cpp:483 in ea647debfd
     480 | @@ -481,10 +481,10 @@ void CTxMemPool::check(const CCoinsViewCache& active_coins_tip, int64_t spendhei
     481 |          const CTransaction& tx = it->GetTx();
     482 |  
     483 |          // CompareMiningScoreWithTopology should agree with GetSortedScoreWithTopology()
    


    sipa commented at 7:46 PM on March 20, 2026:

    In commit "txmempool: Drop CompareMiningScoreWithTopology"

    Comment is outdated now.

  21. in src/util/tokenbucket.h:16 in 859e8eb020
      11 | +
      12 | +/** A token bucket rate limiter.
      13 | + *
      14 | + * Tokens are added at a steady rate (m_rate per second) up to a capacity
      15 | + * cap (m_cap). Tokens are removed by calling decrement(). The balance
      16 | + * may go negative down to m_max_debt; decrement() returns false when
    


    sipa commented at 7:54 PM on March 20, 2026:

    In commit "util/tokenbucket.h: Provide a generic TokenBucket class"

    Is it useful to support debt? I believe it can be avoided by a transformation that raises both m_value and m_cap by -m_max_debt.


    ajtowns commented at 4:04 AM on March 21, 2026:

    The distinction is in InvToSendBucket::avail() which says "you can start doing stuff as long as the size_bucket's value is >=0", which would have to get incremented as well to be equivalent.

    The main effect is that when the size bucket is under pressure, you get a chance to relay at least 50kB each iteration, rather than the avail test passing as soon as you can relay 1B, then the loop ending immediately after you relay the first tx.

    I think it can be simplified a bit by moving the max_debt value to just being a parameter of decrement() though. Will update.

  22. ajtowns force-pushed on Mar 21, 2026
  23. ajtowns force-pushed on Mar 22, 2026
  24. ajtowns force-pushed on Mar 22, 2026
  25. DrahtBot added the label CI failed on Mar 22, 2026
  26. DrahtBot removed the label CI failed on Mar 22, 2026
  27. in src/net_processing.cpp:6190 in ba3a81d036 outdated
    6187 | @@ -6028,63 +6188,56 @@ bool PeerManagerImpl::SendMessages(CNode& node)
    6188 |                  // Determine transactions to relay
    6189 |                  if (fSendTrickle) {
    6190 |                      // Produce a vector with all candidates for sending
    


    xyzconstant commented at 3:04 AM on April 15, 2026:

    In commit: "net_processing: Change m_tx_inventory_to_send from set to vector" (2998d73f692059f149fa2d5a4108b172b21c8cac)

    nit: Forgot to remove this comment as well?


    ajtowns commented at 4:11 AM on April 15, 2026:

    No? inv_tx is a vector with all candidates for sending in the new code.


    xyzconstant commented at 4:33 AM on April 15, 2026:

    Yeah you're right, but now filterrate definition sits in between

        // Produce a vector with all candidates for sending
        const CFeeRate filterrate{tx_relay->m_fee_filter_received.load()};
    
        // Topologically and fee-rate sort the inventory we send for privacy and priority reasons.
        // sorted from lowest priority to highest, skipping low fee
        auto inv_tx = [&]() EXCLUSIVE_LOCKS_REQUIRED(tx_relay->m_tx_inventory_mutex) {
    
  28. DrahtBot added the label Needs rebase on Apr 23, 2026
  29. net_processing: Pass full tx to InitiateTxBroadcastToAll()
    All callers already have the full transaction available, so just pass that
    through. This matches the signature for InitiateTxBroadcastPrivate()
    and prepares for a later commit that needs the full transaction to
    compute its serialized size for rate limiting.
    cafa97202e
  30. net_processing: Remove per-peer rate-limiting
    Per-peer rate limiting introduces storage and compute costs proportional
    to the number of peers. This has caused severe bugs in the past, and
    continues to be a risk in the event of periods of extremely high rates
    of transaction submission. Avoid these problems by always completely
    emptying the m_tx_inventory_to_send queue when processing it.
    
    Note that this increases the potential size of INV messages we send
    for normal tx relay from ~1000 (limited by INVENTORY_BROADCAST_MAX)
    to potentially 50000 (limited by MAX_INV_SZ).
    466dccc89c
  31. txmempool: Add SortMiningScoreWithTopology
    Add a method for sorting a batch of transactions (specified as a vector
    of wtxids) per mempool order, designed for transaction relay.
    798db790a8
  32. net_processing: Change m_tx_inventory_to_send from set to vector
    Change the per-peer tx relay queue from std::set to std::vector. This
    reduces the memory usage and improves locality, at the cost of not
    automatically deduping entries.
    94d20279e6
  33. txmempool: Drop CompareMiningScoreWithTopology
    Now unused; replaced by SortMiningScoreWithTopology.
    70751abca9
  34. util/tokenbucket.h: Provide a generic TokenBucket class
    This is a simple token bucket parameterized on clock type, used in the
    following commit.
    d5cdee68cb
  35. net_processing: add a global delay queue for sending txs
    Without the per-peer rate limiting, nodes can act as an amplifier for
    transaction spam -- receiving many transactions from one node, but
    relaying each of them to over 100 other nodes. Limit the impact of this
    by providing a global rate limit.
    
    This is implemented using dual token buckets, one that consumes a
    token for every transaction, and one that consumes a token for every
    serialized byte. This rate limits both per-tx resource usage (eg INV
    messages) and overall relay bandwidth.
    
    Main bucket parameters:
     * Count: 14tx/s rate, 420tx (30s) capacity
     * Size: 12MB/600s rate (4-6 blocks per target block interval), 50MB capacity
    
    The size bucket is expected to be large enough to almost never have an
    impact in normal usage, even during transaction storms, and is primarily
    intended to mitigate attack-like scenarios.
    
    Outbound connections get a separate pair of buckets, with rates boosted
    by a 2.5x multiplier.
    
    This avoids the excessive memory and CPU usage due to the 100x multiplier
    from the queues being per-peer.
    
    Note that this also reduces the size of INV messages we send for general
    tx relay back to a more reasonable level of under 600 txs in 99.999%
    of cases.
    0491df8e78
  36. init: add -txsendrate configuration parameter
    Adds a debug-only configuration option to set the target
    transaction/second rate for relay to inbound connections. This is mostly
    intended to be set to artificially low values to aid in testing behaviour
    when a backlog occurs, but is also available in case the default 14tx/s
    target is somehow too low in practice.
    272d37725d
  37. rpc: report -txsendrate and bucket info via getnetworkinfo
    Add `tx_send_rate` and `inv_buckets` fields to getnetworkinfo. The latter
    has `inbound` and `outbound` entries, reporting reports backlog count,
    count tokens, and size tokens. Useful for monitoring relay behavior.
    8373c95f35
  38. tests: basic functional test for tx rate limiting 686615470f
  39. ajtowns force-pushed on Apr 23, 2026
  40. DrahtBot removed the label Needs rebase on Apr 23, 2026
  41. in src/node/transaction.cpp:64 in cafa97202e
      59 | @@ -60,15 +60,15 @@ TransactionError BroadcastTransaction(NodeContext& node,
      60 |              if (!existingCoin.IsSpent()) return TransactionError::ALREADY_IN_UTXO_SET;
      61 |          }
      62 |  
      63 | -        if (auto mempool_tx = node.mempool->get(txid); mempool_tx) {
      64 | +        mempool_tx = node.mempool->get(txid);
      65 | +        if (mempool_tx) {
    


    polespinasa commented at 11:38 AM on April 24, 2026:

    in cafa97202e95df278153369ccda9c1bb61880a5e I don't think we should have an if statement without code logic inside. Why not just if(!mempool_tx) and do what is inside the else code block

  42. in src/net_processing.cpp:174 in 466dccc89c
     175 | -static constexpr unsigned int INVENTORY_BROADCAST_TARGET = INVENTORY_BROADCAST_PER_SECOND * count_seconds(INBOUND_INVENTORY_BROADCAST_INTERVAL);
     176 | -/** Maximum number of inventory items to send per transmission. */
     177 | -static constexpr unsigned int INVENTORY_BROADCAST_MAX = 1000;
     178 | -static_assert(INVENTORY_BROADCAST_MAX >= INVENTORY_BROADCAST_TARGET, "INVENTORY_BROADCAST_MAX too low");
     179 | -static_assert(INVENTORY_BROADCAST_MAX <= node::MAX_PEER_TX_ANNOUNCEMENTS, "INVENTORY_BROADCAST_MAX too high");
     180 | +// static constexpr unsigned int INVENTORY_BROADCAST_TARGET = INVENTORY_BROADCAST_PER_SECOND * count_seconds(INBOUND_INVENTORY_BROADCAST_INTERVAL);
    


    polespinasa commented at 3:49 PM on April 24, 2026:

    In 466dccc89c5748c7f4ccfb10b2079c2eb54589c5 you can delete this two lines that are commented and then deleted in 0491df8e7876b9b2b86e25190823f3ff632311c6.

  43. in src/txmempool.cpp:541 in 798db790a8
     537 | @@ -538,6 +538,7 @@ void CTxMemPool::check(const CCoinsViewCache& active_coins_tip, int64_t spendhei
     538 |          for (const auto& input: tx.vin) mempoolDuplicate.SpendCoin(input.prevout);
     539 |          AddCoins(mempoolDuplicate, tx, std::numeric_limits<int>::max());
     540 |      }
     541 | +
    


    polespinasa commented at 3:57 PM on April 24, 2026:

    in 798db790a8a5b9c6b11b3e8599cb88f9d6015d87 nit: random empty line added here

  44. in src/txmempool.cpp:567 in 798db790a8
     562 | +
     563 | +    n = std::min(wtxids.size(), n);
     564 | +    if (n > 0) {
     565 | +        res.reserve(wtxids.size());
     566 | +        for (auto& wtxid : wtxids) {
     567 | +            if (auto i{GetIter(wtxid)}; i.has_value()) {
    


    polespinasa commented at 4:07 PM on April 24, 2026:

    in 798db790a8a5b9c6b11b3e8599cb88f9d6015d87

    nit: I think this can be simplified: if (auto i = GetIter(wtxid)) res.push_back(*i);

  45. in src/txmempool.cpp:572 in 798db790a8
     567 | +            if (auto i{GetIter(wtxid)}; i.has_value()) {
     568 | +                res.push_back(i.value());
     569 | +            }
     570 | +        }
     571 | +
     572 | +        if (n >= res.size()) {
    


    polespinasa commented at 4:47 PM on April 24, 2026:

    in 798db790a8a5b9c6b11b3e8599cb88f9d6015d87

    I think std::partial_sort gives the same result as std::sort if the mid iterator reaches the end. So changing the whole if ... else... block for:

    std::partial_sort(res.rbegin(),
                      res.rbegin() + std::min(n, res.size()),
                      res.rend(),
                      cmp);
    

    I think it should work.

  46. polespinasa commented at 4:51 PM on April 24, 2026: member

    Concept ACK

    Code reviewed till 798db790a8a5b9c6b11b3e8599cb88f9d6015d87 will continue soon :)

    Left small comments and suggestions but nothing important, feel free to ignore.

  47. DrahtBot added the label Needs rebase on Apr 24, 2026
  48. DrahtBot commented at 11:35 PM on April 24, 2026: contributor

    <!--cf906140f33d8803c4a75a2196329ecb-->

    🐙 This pull request conflicts with the target branch and needs rebase.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-25 00:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me