p2p: Stop relaying non-mempool txs #27625

pull maflcko wants to merge 2 commits into bitcoin:master from maflcko:2305-remove-mapRelay- changing 2 files +52 −38
  1. maflcko commented at 1:53 pm on May 11, 2023: member

    mapRelay (used to relay announced transactions that are no longer in the mempool) has issues:

    • It doesn’t have an absolute memory limit, only an implicit one based on the rate of transaction announcements
    • It doesn’t have a use-case EDIT: see below

    Fix all issues by removing mapRelay.

    For more context, on why a transaction may have been removed from the mempool, see https://github.com/bitcoin/bitcoin/blob/c2f2abd0a4f4bd18bfca41b632d21d803479f3f4/src/txmempool.h#L228-L238

    For my rationale on why it is fine to not relay them:

    Reason Rationale
    EXPIRY Expired from mempool Mempool expiry is by default 2 weeks and can not be less than 1 hour, so a transaction can not be in mapRelay while expiring, unless a re-broadcast happened. This should be fine, because the transaction will be re-added to the mempool and potentially announced/relayed on the next re-broadcast.
    SIZELIMIT Removed in size limiting A low fee transaction, which will be relayed by a different peer after GETDATA_TX_INTERVAL or after we sent a notfound message. Assuming it ever made it to another peer, otherwise it will happen on re-broadcast (same as with EXPIRY above).
    REORG Removed for reorganization Block races are rare, so reorgs should be rarer. Also, the transaction is likely to be re-accepted via the disconnectpool later on. If not, it seems fine to let the originating wallet deal with rebroadcast in this case.
    BLOCK Removed for block EDIT: Needed for compact block relay, see #27625 (comment)
    CONFLICT Removed for conflict with in-block transaction The peer won’t be able to add the tx to the mempool anyway, unless it is on a different block, in which case it seems fine to let the originating wallet take care of the rebroadcast (if needed).
    REPLACED Removed for replacement EDIT: Also needed for compact block relay, see #27625 (comment) ?
  2. DrahtBot commented at 1:53 pm on May 11, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK sdaftuar, ajtowns, glozow

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #27675 (net_processing: Drop m_recently_announced_invs bloom filter by ajtowns)
    • #27602 (net processing: avoid serving non-announced txs as a result of a MEMPOOL message by sr-gi)
    • #26969 (net, refactor: net_processing, add ProcessCompactBlockTxns by brunoerg)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label P2P on May 11, 2023
  4. sdaftuar commented at 2:04 pm on May 11, 2023: member
    I think the removed-for-block case is the main one we care about – in a situation where a peer has requested a transaction where we have received a block containing the transaction in the time between announcement and relay, it seems likely that the peer is about to receive the block as well, and with the compact block protocol we may be helping our peer quite a bit (potentially saving a round trip) if we deliver the transaction before the cmpctblock arrives to them.
  5. maflcko marked this as a draft on May 11, 2023
  6. DrahtBot added the label Needs rebase on May 11, 2023
  7. maflcko commented at 3:10 pm on May 11, 2023: member
    Ok, good point. I was about to suggest that in that case it seems sufficient to only put BLOCK (maybe also CONFLICTED/REORG) transactions into mapRelay (keeping their mempool timestamp, to avoid adding ones that are older than 15 minutes). However, I think your point also applies to REPLACED transactions, because we’d still want to relay them, so that they are rejected by the peer and end up in their AddToCompactExtraTransactions? I wonder if it makes sense to drop a line or two of documentation here. (Edit: Or maybe even a test, if we want to get extra fancy)
  8. ajtowns commented at 4:08 pm on May 11, 2023: contributor
    FWIW, I’ve thought about moving mapRelay into txmempool so that its size can be counted as part of the 300MB (or whatever), and, if it grows too large, that we could trim mapRelay before their scheduled expiry time has actually been reached. Then you could have a multi_index amongst txid, wtxid, expiry to an CTransactionRef, along with a counter of how many entries in mapRelay aren’t also in the mempool for handling memory usage?
  9. maflcko added the label Brainstorming on May 11, 2023
  10. bitcoin deleted a comment on May 11, 2023
  11. maflcko commented at 7:15 pm on May 11, 2023: member

    we could trim mapRelay before their scheduled expiry time has actually been reached.

    Sounds like this would either degrade compact block relay when the mempool is full due to trimming mapRelay, or turn one DoS (OOM) into another DoS (lost fee income) due to trimming valid mempool txs? And if you use separate and dedicated size limits for mapRelay and the mempool, it might be easier to keep them separate data structures?

  12. glozow commented at 7:45 pm on May 11, 2023: member
    What is the benefit of mapRelay overlapping with mempool? If we moved it into txmempool, could we just add things to mapRelay from mapTx whenever evicting something for a block-related reason + whose timestamp is recent? If the main reason is transactions that could be in blocks, it seems reasonable to cap it at just a few thousand entries?
  13. ajtowns commented at 0:53 am on May 12, 2023: contributor

    we could trim mapRelay before their scheduled expiry time has actually been reached.

    Sounds like this would either degrade compact block relay when the mempool is full due to trimming mapRelay, or turn one DoS (OOM) into another DoS (lost fee income) due to trimming valid mempool txs? And if you use separate and dedicated size limits for mapRelay and the mempool, it might be easier to keep them separate data structures?

    The memory usage of mapRelay depends on whether the transactions in it have been removed from the mempool – if you’ve got 15k txs in mapRelay but all of them are in the mempool, then that’s likely using up ~1MB; but if 50% of them were removed from the mempool and each CTransactionRef is using up 1kB, then that’s more an extra ~15MB. You maybe get the latter scenario if you’re relaying a lot of RBF txs. So I don’t think you can sensibly put a figure on how much memory mapRelay is using if you don’t track the relationship.

    Beyond that, the advantage of linking the two together for limits is that if you start trimming the mempool because mapRelay is too large, then you’re increasing the mempool minfee, which reduces the RBF headroom – instead of being able to RBF 50 times from 10 sat/vb up to 60 sat/vb, you might only see and relay 40 RBFs, from 20 sat/vb up to 60 sat/vb, which then reduces pressure on mapRelay.

    What is the benefit of mapRelay overlapping with mempool?

    (a) We want things to expire from mapRelay based on when we first announced them, not when we first received them or when we evicted them, so that’s extra data to store.

    (b) The number of txs we announce in 15 minutes is (currently) relatively tightly constrained (35tx every 2s for outbounds over 15min is ~16k; chance of >20k in 15min is about 1-in-5M).

    If the main reason is transactions that could be in blocks, it seems reasonable to cap it at just a few thousand entries?

    With RBF, you could have many different variants of a single tx, each relatively likely to make it into a block depending on when the miner picked the block template, so I think the time-based cap makes sense.

    That said, I don’t think there’s a strong reason for the expiry duration to be 15 minutes rather than something shorter – that was just left unchanged in #16851 when mapRelay went from being the only way to find a tx to just being the way to get at txs that are new to the mempool.

  14. sdaftuar commented at 1:48 pm on May 15, 2023: member

    I’m a concept ACK on getting rid of mapRelay. It’s been a while since I thought about this and I haven’t fully dug through the code again to refresh my memory, but my recollection is that one of the main reasons we had to keep mapRelay in the past was that historically, our software wouldn’t deal with NOTFOUND responses to a getdata very well, and so if we announced a transaction and then dropped it (for whatever reason) we would be mildly DoS’ing our peers (I think we used to wait 2 minutes before trying the next peer who had announced it to us).

    Now that we handle NOTFOUND’s better, I think the last reason to hang on to mapRelay is the reason I gave above – if the transaction was just removed in a block, we should provide it. However, we can do that by just looking at the last block we received, which I believe we store in memory for fast compact block replies (m_most_recent_block). So if we just built an index into that last block, then I think we would be clear to drop mapRelay altogether.

  15. maflcko commented at 2:53 pm on May 15, 2023: member

    I think the last reason to hang on to mapRelay is the reason I gave above

    My understanding is that REPLACED would also need to be provided to peers: #27625 (comment). Otherwise there may be another round trip if the miner mines a previous version of a recently replaced transaction, which is in mapRelay and vExtraTxnForCompact, but not the mempool (and not yet in a block, because it may be in-flight)?

    However, that may be solved, but just replacing the mapRelay lookup by a vExtraTxnForCompact lookup?

  16. sdaftuar commented at 8:09 pm on May 15, 2023: member

    My understanding is that REPLACED would also need to be provided to peers: #27625 (comment). Otherwise there may be another round trip if the miner mines a previous version of a recently replaced transaction, which is in mapRelay and vExtraTxnForCompact, but not the mempool (and not yet in a block, because it may be in-flight)?

    I think it’s better that we not relay transactions that are REPLACED – note that we already won’t send a given peer the INV for a replaced transaction if the replacement occurs prior to the original INV going out (because we filter out transactions that are dropped from the mempool when we INV).

    Moreover I think it’s better if, when we are aware of conflicting transactions, that we only relay the version that we think ought to confirm. At the margin, it makes it more likely that the version we think is better gets mined, and from the perspective of a single node, we’re presumably picking the one that we think is best to keep in our own mempools, so I don’t think we should be spending system resources (and those of our peer) relaying the wrong version.

  17. ajtowns commented at 5:39 am on May 18, 2023: contributor

    Concept ACK

    The original use of mapRelay was that if a tx wasn’t in mapRelay it wouldn’t be relayed at all. That resulted in problems with child transactions – if we relay a recent child to a peer that doesn’t have the parent, the parent may have expired from mapRelay and thus the peer wouldn’t be able to accept the child at all.

    We fixed that with #16851 so that we’d relay txs from the mempool if they’d been in the mempool long enough to expire from mapRelay (ie, 15 minutes). That got further tweaked in #18861 so that we’d skip txs that were in the to-be-announced queue, and then got tweaked further in #19109 so that the mempool would be used unconditionally for txs that had been announced recently, or that were older than 2 minutes, and mapRelay would only be used for txs that had left the mempool, had been announced recently (with the last 3500-5250 txs), and had entered the mempool within the last 15 minutes.

    This resurrects #17303, and it’s probably worth reviewing the comments there; most of the issues they raise have already been addressed, I think.

    I think I agree with Suhas’s comment that we should ensure we serve any txs in the most recent block before dropping mapRelay though, so only a concept ack here. Here’s a test commit that adds that functionality https://github.com/bitcoin/bitcoin/pull/27675/commits/f26a85059fb4eaf29f9c1652b2b5e14060d79a36 I’m now running that with logging for both relayed-from-most-recent-block and relayed-from-mapRelay-only to see how relevant those cases are in practice.

    I think it would also be good to continue serving txs from vExtraTxnForCompact (and to use them for resolving orphans?) to help address the privacy issue for rbf relay, but I don’t think that needs to be a prerequisite for dropping mapRelay.

  18. maflcko commented at 7:43 am on May 18, 2023: member

    I think it would also be good to continue serving txs from vExtraTxnForCompact

    I wasn’t sure how to do this, given the planned removal of m_recently_announced_invs in #27675 . Otherwise we may relay invalid/non-standard transactions that were injected into us by a spy node.

    Maybe a flag on each transaction in vExtraTxnForCompact could be set to indicate if it ever was in the mempool. Though that would still leave the fingerprint vector if the spy node sent tx and its replacement immediately after (before we’d announce the original tx on any link).

    address the #17303 (comment), but I don’t think that needs to be a prerequisite for dropping mapRelay

    I agree. One could argue that the issue has little to do with mapRelay, because mapRelay only delays the attack by 15 minutes. Instead of 30 seconds, a spy node could wait 15 minutes + 30 seconds to clear mapRelay for its spy purposes and then proceed. See also #27602 (review)

    So if there is a fix for this it can (and probably should) be suggested and discussed independent of this pull.

    Here’s a test commit that adds that functionality https://github.com/bitcoin/bitcoin/commit/f26a85059fb4eaf29f9c1652b2b5e14060d79a36 I’m now running that with logging for both relayed-from-most-recent-block and relayed-from-mapRelay-only to see how relevant those cases are in practice.

    How would you judge “relevant”? I presume you are looking for rbf’d transactions that were requested from mapRelay and ended up in a block, but the block was not yet available to you when the tx was requested?

    This resurrects #17303, and it’s probably worth reviewing the comments there;

    I don’t think the discussion contains anything relevant anymore, since the tx relay logic has been completely reworked in the meantime. (As summarized by you). And the only relevant stuff (that mapRelay is needed for compact block relay) wasn’t even mentioned in that thread.

  19. ajtowns commented at 8:27 am on May 18, 2023: contributor

    How would you judge “relevant”? I presume you are looking for rbf’d transactions that were requested from mapRelay and ended up in a block, but the block was not yet available to you?

    I’m still tweaking the logging, but so far what I’m seeing seems like roughly:

    • 95% relay from the mempool
    • of the 5% that don’t:
      • 87% are recent txs included in the most recent block (so would also be available via mapRelay)
      • 12% are recent txs not in the most recent block, so only via mapRelay (combination of rbf’ed txs and things from the 2nd most recent block by the looks; EDIT: also things that were at the bottom of the mempool as minfee gets raised)
      • 1% txs in the most recent block but not in mapRelay

    When things gets broadcast and then quickly rbf’ed, our behaviour looks a little weird, eg:

     02023-05-18T07:32:49Z [net] got inv: wtx 57232af955cc9a788118508f50bc4f4bd8f41769c9ffbc218c6fe964c431e6c7  new peer=40
     12023-05-18T07:32:51Z [net] Requesting wtx 57232af955cc9a788118508f50bc4f4bd8f41769c9ffbc218c6fe964c431e6c7 peer=40
     22023-05-18T07:33:13Z [mempool] replacing tx 8d291ba6d69303d2aec6147df7768726a44108bbbdfc4064dfdbbb2f7ce9f280 with {06a3c28af14f9ad196f4aa209bc32fe4c79c79c8e176dc63428d7db96418cb17} for 0.00003867 additional fees, 0 delta bytes
     32023-05-18T07:33:13Z [mempool] AcceptToMemoryPool: peer=40: accepted {06a3c28af14f9ad196f4aa209bc32fe4c79c79c8e176dc63428d7db96418cb17} (poolsz 19793 txn, 100120 kB) feerate=158.23
     42023-05-18T07:33:54Z [net] got inv: wtx 57232af955cc9a788118508f50bc4f4bd8f41769c9ffbc218c6fe964c431e6c7  have peer=30
     52023-05-18T07:33:56Z [net] got inv: wtx 57232af955cc9a788118508f50bc4f4bd8f41769c9ffbc218c6fe964c431e6c7  have peer=48
     62023-05-18T07:34:01Z [mempool] replacing tx {06a3c28af14f9ad196f4aa209bc32fe4c79c79c8e176dc63428d7db96418cb17} with ab6b9c2863bfbf2fd013f76ae010d52c378df7785a16dcb15d73077ab2406139 for 0.00010398 additional fees, 0 delta bytes
     72023-05-18T07:34:04Z [net] got inv: wtx 57232af955cc9a788118508f50bc4f4bd8f41769c9ffbc218c6fe964c431e6c7  new peer=46
     82023-05-18T07:34:04Z [net] Requesting wtx 57232af955cc9a788118508f50bc4f4bd8f41769c9ffbc218c6fe964c431e6c7 peer=46
     92023-05-18T07:34:05Z [mempoolrej] {06a3c28af14f9ad196f4aa209bc32fe4c79c79c8e176dc63428d7db96418cb17} from peer=46 was not accepted: insufficient fee, rejecting replacement {06a3c28af14f9ad196f4aa209bc32fe4c79c79c8e176dc63428d7db96418cb17}; new feerate 0.00158227 BTC/kvB <= old feerate 0.00287881 BTC/kvB
    102023-05-18T07:34:05Z [net]  XXX - +mr hey, relaying something from mapRelay {06a3c28af14f9ad196f4aa209bc32fe4c79c79c8e176dc63428d7db96418cb17} 57232af955cc9a788118508f50bc4f4bd8f41769c9ffbc218c6fe964c431e6c7
    112023-05-18T07:34:06Z [net] got inv: wtx 57232af955cc9a788118508f50bc4f4bd8f41769c9ffbc218c6fe964c431e6c7  have peer=8
    122023-05-18T07:34:09Z [net] got inv: wtx 57232af955cc9a788118508f50bc4f4bd8f41769c9ffbc218c6fe964c431e6c7  have peer=43
    

    (ie, tx B arrives and replaces tx A; then we replace it with tx C; then B is announced by someone else so we request it again and run it through ATMP then we reject it; then someone else requests it and we relay it from mapRelay)

  20. maflcko commented at 10:07 am on May 18, 2023: member

    I think I agree with #27625 (comment) that we should ensure we serve any txs in the most recent block

    Added a test for this in #27695, which should fail on this pull as is. I think the test only checks for sendcmpct 0 (low bandwidth) behavior. And only for the case where the tx-getdata was already sent on the wire, because otherwise Bitcoin Core on current master would queue it behind the cmpctblock-getdata, which would also delay the reply so much that it will be useless. For sendcmpct 1 (high bandwidth) the cmpctblock is sent directly to the peer without consideration of any outstanding getdata.

    Maybe transactions that are in the block, and recent, and presumed to be not know by the peer can be pushed on the send buffer before the compact block message, even absent a corresponding inv or getdata for those transactions?

  21. ajtowns commented at 11:31 am on May 18, 2023: contributor

    I think it would also be good to continue serving txs from vExtraTxnForCompact

    I wasn’t sure how to do this, given the planned removal of m_recently_announced_invs in #27675 . Otherwise we may relay invalid/non-standard transactions that were injected into us by a spy node.

    I guess a straightforward approach would be to have the mempool separately keep track of the last ~700 txs removed, as well as their entry time and removal time. Then they could be relayed to a node (under #27675’s proposed logic) provided entry <= last_inv < removal. At 14tx/s, 700 txs is 50s, and the liklihood of not sending an inv to inbounds in that time seems like it’s around once every 12 days (lowering to 40s would already reduce it to once a day).

  22. maflcko commented at 11:42 am on May 18, 2023: member

    I guess a straightforward approach would be to have the mempool separately keep track of the last ~700 txs removed, as well as their entry time and removal time. Then they could be relayed to a node (under #27675’s proposed logic)

    That would still leave the fingerprint vector if the spy node sent a transaction and its replacement immediately after. That is, before we’d have a chance to announce or send the original to anyone. Leaving us with a unique fingerprint that can be used to find out if we are reachable on two networks.

  23. sdaftuar commented at 3:22 pm on May 18, 2023: member

    I think it would also be good to continue serving txs from vExtraTxnForCompact (and to use them for resolving orphans?) to help address the #17303 (comment), but I don’t think that needs to be a prerequisite for dropping mapRelay.

    I don’t think the privacy issues raised by not relaying replaced transactions are significant at all. There are many ways to use transaction relay behavior to infer the network topology, and I think there are downsides (perhaps mild) to both a node and the network if we do relay such transactions. It seems like we all agree that this shouldn’t be a requirement to dropping mapRelay but I just want to be clear that I think there is no benefit to doing so either.

    When things gets broadcast and then quickly rbf’ed, our behaviour looks a little weird, eg:

    One idea may be to add replaced transactions to a reject filter.

    Maybe transactions that are in the block, and recent, and presumed to be not know by the peer can be pushed on the send buffer before the compact block message, even absent a corresponding inv or getdata for those transactions?

    Note that compact blocks can be sent with prefilled transactions for precisely this use case. Some research should be done to come up with heuristics for determining when to include prefilled transactions (as there is a bandwidth, and therefore latency cost to including transactions that are unnecessary). Anyway, I think that is work for another PR.

    My suggestion would be to move forward with a PR that uses the m_recently_announced_invs filter for now to determine when to relay from the mempool, or the most recent block, which I think is a very straightforward code change. The PR that seeks to remove the recently announced filter should be easy to update to support what we’re doing here.

  24. maflcko commented at 10:19 am on May 19, 2023: member

    Here’s a test commit that adds that functionality https://github.com/bitcoin/bitcoin/commit/f26a85059fb4eaf29f9c1652b2b5e14060d79a36

    Thanks. Did you want me to push it here (removing the logging), or did you want to open a fresh pull yourself?

    Also, I think we dropped support for non-witness compact block relay, so I guess you can drop txid from your map and only use wtxid?

  25. sdaftuar commented at 12:36 pm on May 19, 2023: member

    Also, I think we dropped support for non-witness compact block relay, so I guess you can drop txid from your map and only use wtxid?

    I believe that there should be some older versions of our software that support witness-compact-block-relay but don’t support wtxidrelay, so I think @ajtowns patch is correct to index on both.

  26. ajtowns commented at 2:13 pm on May 19, 2023: contributor

    Thanks. Did you want me to push it here (removing the logging), or did you want to open a fresh pull yourself?

    Go ahead and clean it up and push it here :)

  27. glozow commented at 5:50 pm on May 26, 2023: member
    concept ACK
  28. ajtowns commented at 7:26 am on June 2, 2023: contributor
    I’ve cleaned up and rebased this as https://github.com/ajtowns/bitcoin/commits/202306-dropmaprelay if that’s any help.
  29. maflcko force-pushed on Jun 7, 2023
  30. maflcko marked this as ready for review on Jun 7, 2023
  31. DrahtBot removed the label Needs rebase on Jun 7, 2023
  32. DrahtBot added the label CI failed on Jun 7, 2023
  33. maflcko commented at 12:44 pm on June 7, 2023: member
    Thanks, I pushed your branch. Rebased for green CI.
  34. net_processing: relay txs from m_most_recent_block fccecd75fe
  35. Remove mapRelay faa2976a56
  36. maflcko force-pushed on Jun 8, 2023
  37. DrahtBot removed the label CI failed on Jun 8, 2023
  38. sdaftuar commented at 1:52 am on June 12, 2023: member
    ACK faa2976a56ea7cdfd77ce2580a89ce493b57b5d4
  39. ajtowns commented at 2:45 am on June 12, 2023: contributor
    ACK faa2976a56ea7cdfd77ce2580a89ce493b57b5d4
  40. maflcko removed the label Brainstorming on Jun 12, 2023
  41. glozow commented at 8:50 am on June 12, 2023: member
    code review ACK faa2976a56ea7cdfd77ce2580a89ce493b57b5d4
  42. fanquake merged this on Jun 12, 2023
  43. fanquake closed this on Jun 12, 2023

  44. ismaelsadeeq commented at 2:06 pm on June 12, 2023: member

    Post Merge code review ACK faa2976a56ea7cdfd77ce2580a89ce493b57b5d4

    This PR removes mapRelay and introduces m_most_recent_block_txs. m_most_recent_block_txs retains the transactions of the most recent mined block. When a new block is mined, all the block transactions in our mempool are dropped, because of BLOCK MemPoolRemovalReason.

    • Storing the most recently mined block transactions in m_most_recent_block_txs allows us to relay the dropped transactions to peers upon request, which we might have previously relayed to them.

    • Once another block is mined, m_most_recent_block_txs is updated to store the transactions from the new block.

    • If a requested transaction was dropped from the mempool and is two blocks old, FindTxForGetData will return an empty result {}.

    In my opinion, this approach is preferable to using mapRelay because the dropped transactions (except for the BLOCK reason) are not needed to be relayed anymore, (all transactions dropped for the BLOCK reason) are covered by m_most_recent_block_txs for the period before receiving any new block.

  45. sidhujag referenced this in commit 4519b9f7eb on Jun 12, 2023
  46. maflcko deleted the branch on Jun 15, 2023
  47. bitcoin locked this on Jun 14, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-24 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me