p2p: Stop relaying non-mempool txs, improve tx privacy slightly #17303

pull MarcoFalke wants to merge 2 commits into bitcoin:master from MarcoFalke:1910-p2pNoRemovedTxs changing 3 files +19 −28
  1. MarcoFalke commented at 6:49 pm on October 29, 2019: member

    There is no reason to relay transactions that have been, but are no longer, in the mempool at the time of the getdata request for the transaction.

    Possible reasons for mempool removal, and my thinking why it is safe to not relay

    Reason Rationale
    EXPIRY Expired from mempool Mempool expiry is 2 weeks, so does not interfere with relay at all.
    SIZELIMIT Removed in size limiting A low fee transaction, which will be relayed by a different peer after GETDATA_TX_INTERVAL. Assuming it ever made it to another peer.
    REORG Removed for reorganization Block races are rare, so reorgs should be rarer. Also the tx will be reaccepted from the disconnectpool later on and then relayed (TODO), so this should never be observable in practise.
    BLOCK Removed for block The other peer will eventually receive the (filtered)block with the tx in it.
    CONFLICT Removed for conflict with in-block transaction The peer won’t be able to add the tx to the mempool anyway. No need to waste bandwidth.
    REPLACED Removed for replacement Same as above, send them the higher fee tx instead.

    This should also improve transaction relay privacy slightly. Previously any peer (even inbounds) could ask for transactions in mapRelay, even if they were never announced to them via an inv message. After my changes, the peer can only request transactions that were added to the mempool before the last inv message was sent to that peer.

  2. fanquake added the label P2P on Oct 29, 2019
  3. fanquake requested review from sdaftuar on Oct 29, 2019
  4. fanquake requested review from ajtowns on Oct 29, 2019
  5. sdaftuar commented at 7:23 pm on October 29, 2019: member

    I think this is the eventual right direction for us to go (because mapRelay doesn’t have great memory bounds), but without something like #15505 this turns nodes into accidental DoSers whenever they evict a transaction after announcement and before a getdata comes in. If you’re a node running with a small mempool, then transactions can be evicted even if they’re not “low feerate”.

    I tend to think we should not be turning nodes with small mempools into DoSers; so if we solve that problem first (eg by making the peer receiving a NOTFOUND be able to find the transaction elsewhere in a reasonable amount of time), then I think we could go ahead and remove mapRelay and only serve things from our mempool. We should go ahead and fix the privacy problem (mentioned in #14220) at the same time though.

  6. MarcoFalke force-pushed on Oct 29, 2019
  7. MarcoFalke commented at 8:12 pm on October 29, 2019: member

    We should go ahead and fix the privacy problem (mentioned in #14220) at the same time though.

    It is trivial to fix that now, but I refuse to write code that does time comparisons that don’t use std::chrono. Will push the commit after #17243 .

  8. DrahtBot commented at 10:52 pm on October 29, 2019: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #18861 (Do not answer GETDATA for to-be-announced tx by sipa)
    • #18808 ([net processing] Drop unknown types in getdata by jnewbery)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  9. MarcoFalke added the label Waiting for author on Oct 30, 2019
  10. naumenkogs commented at 7:06 pm on November 4, 2019: member
    This sounds reasonable to me, but I think we should be careful here to not introduce vulns as a side-effect. So I’d want to give a more thorough review once all the pre-reqs mentioned above are satisfied.
  11. MarcoFalke removed the label Waiting for author on Nov 6, 2019
  12. MarcoFalke force-pushed on Nov 6, 2019
  13. MarcoFalke force-pushed on Nov 6, 2019
  14. MarcoFalke renamed this:
    p2p: Stop relaying non-mempool txs
    p2p: Stop relaying non-mempool txs, improve tx privacy
    on Nov 6, 2019
  15. MarcoFalke commented at 6:38 pm on November 6, 2019: member
    Minimized the diff. Also fixed the privacy issue as requested by @sdaftuar
  16. MarcoFalke renamed this:
    p2p: Stop relaying non-mempool txs, improve tx privacy
    p2p: Stop relaying non-mempool txs, improve tx privacy slightly
    on Nov 6, 2019
  17. naumenkogs commented at 7:19 pm on November 6, 2019: member

    Summary

    I think the privacy problem mentioned above is fixed here in a limited way. There are 2 privacy threats related to the discussed behavior right now: a) inferring an origin node of the particular transaction b) inferring tx relay network topology

    This proposal fixes a), but does not fix b), see below. I also show that ideas of fixing b) after this change would require bringing back mapRelay in another form.

    New attack

    We will not respond if something wasn’t INVed, right. But imagine flooding the network with tx_A, so that everybody announces is to an attacker and an attacker now can query for it. Then in 30 seconds, when everyone does have tx_A, we selectively announce tx_B (which is an RBF to tx_A and will evict tx_A from the mempool) to node_victim. And start querying for tx_A persistently. Whoever responds with NOTFOUND received tx_B, which means that node was closer to node_victim.

    Stuff like diffusion, previous PR addressing the problem and this proposal would make it difficult to infer tx_A or tx_B relay paths directly, but they can easily be deduced by querying for tx_A.

    So, the cost of an attack is increased from 1 tx (or even 0 if using someone's txs) to 1 tx + 1 RBF, which does not really address the goal.

    Ideas

    Crazy expensive but fundamental

    Keep responding about tx_A even when it’s replaced with tx_B. This would require storing tx_A while tx_B is still in the mempool. Afaik this is very opposite to the current intuition we implement in the codebase.

    Less expensive but limited

    Allow at most 1 (or 2 just in case) queries per transaction. So that tx_A can be requested by every node at most once. In this case, an attacker would have to run more nodes or spend more double-spends. Not ideal, but at least something. It would also require a data structure like a bloom filter to track our past responses.

    Maybe this should be implemented as a follow-up PR, I might do that sometime.

  18. MarcoFalke force-pushed on Nov 11, 2019
  19. in src/net_processing.cpp:1632 in fa9aa380a1 outdated
    1553@@ -1554,20 +1554,17 @@ void static ProcessGetData(CNode* pfrom, const CChainParams& chainparams, CConnm
    1554             const CInv &inv = *it;
    1555             it++;
    1556 
    1557-            // Send stream from relay memory
    1558             bool push = false;
    


    jkczyz commented at 8:47 pm on November 12, 2019:
    You can now remove push and then update vNotFound in an else branch opposite the PushMessage below.

    MarcoFalke commented at 0:52 am on November 13, 2019:
    Happy to remove in a new commit if I get at least one ACK and others agree to do this.

    MarcoFalke commented at 12:22 pm on March 25, 2020:
    push is used in #18038, so to avoid a merge conflict I will leave it as is for now. Whichever pull gets merged last can address this nit.
  20. in src/net_processing.cpp:1634 in fa9aa380a1 outdated
    1560             int nSendFlags = (inv.type == MSG_TX ? SERIALIZE_TRANSACTION_NO_WITNESS : 0);
    1561-            if (mi != mapRelay.end()) {
    1562-                connman->PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::TX, *mi->second));
    1563-                push = true;
    1564-            } else {
    1565+            {
    


    jkczyz commented at 8:47 pm on November 12, 2019:
    Is this scope needed?

    MarcoFalke commented at 0:52 am on November 13, 2019:
    no, but it keeps the scope that came previously after the else to minimize the diff. Happy to remove in a new commit if I get at least one ACK and others agree to do this.
  21. jkczyz commented at 6:59 pm on November 13, 2019: contributor
    The PR title states that this change “improve tx privacy slightly”. Could you elaborate in the PR description how? (i.e., why was it less private before?)
  22. jnewbery commented at 9:27 pm on November 13, 2019: member

    Concept ACK @sdaftuar You had two concept comments on this PR:

    1. If we make this change, then it should address the privacy issue identified in #14220. I haven’t fully reviewed the second commit yet, but it seems like a reasonable fix (keep a per-peer timestamp of the last tx INV we sent, don’t respond to GETDATAs for txs that entered our mempool after that timestamp).
    2. This PR might exacerbate the problem in #15505 and we could stall relay for txs that are removed from our mempool for SIZELIMIT. That seems to me to be a very minor edge condition for transactions that are at the bottom of the mempool. Presumably we expect almost all nodes’ mempool limits to be at least an order of magnitude greater than the block weight limit, which means that any tx that falls into this category has at least ten blocks worth of better transactions than it in the mempool, and we wouldn’t expect it to be mined for 100 minutes. That’s plenty of time for the peer to request the tx from another peer. @naumenkogs I don’t think the topology probing technique you describe in #17303 (comment) is made any worse by this PR. We can always probe whether a tx is in a peer’s mempool using the technique here: #16851 (comment).
  23. in src/net_processing.cpp:1641 in fa5f3dae9c outdated
    1555+                // and couldn't yet have been INVed to that peer otherwise.
    1556                 if (txinfo.tx && (
    1557                      (mempool_req.count() && txinfo.m_time <= mempool_req)
    1558-                      || (mapRelay.find(inv.hash) != mapRelay.end())
    1559-                      || (txinfo.m_time <= longlived_mempool_time)))
    1560+                      || (txinfo.m_time <= pfrom->m_tx_relay->m_last_inv_sent)))
    


    ariard commented at 10:43 pm on November 13, 2019:

    Given new m_last_inv_sent update just before the mempool one in SendMessages, is checking mempool one still useful ?

    We take cs_main in beginning of SendMessages which implicitly means new tx can’t get inserted in the mempool. So we are going to send only mempool txn with a m_time before updated m_last_inv_sent. There can’t a tx INVed by us between two time measures in reply to a MEMPOOL request. We may receive one in between but if then peer GETDATA it, it wasn’t thanks to us. Thoughts ?


    MarcoFalke commented at 1:12 pm on March 25, 2020:
    @ariard Good point, added commit to remove mempool_req as well
  24. ariard commented at 5:16 pm on November 14, 2019: member

    Concept ACK on the privacy leak fix, i.e before this PR peers could have sent us unsolicited GETDATAs to identify tx in our mempool but not previously INVed. The strict requirement of tx m_time being earlier than m_last_inv_sent to be broadcast guarantee we don’t send TX(tx) without INV them before.

    On the potential DoS vector,

    That’s plenty of time for the peer to request the tx from another peer. @jnewbery but I think Suhas point is requester is going to keep sending GETDATA as tx isn’t erased from m_tx_process_time at NOTFOUND processing?

  25. jnewbery commented at 6:25 pm on November 14, 2019: member

    I think Suhas point is requester is going to keep sending GETDATA as tx isn’t erased from m_tx_process_time at NOTFOUND processing?

    I think the concern is that a peer will send us an INV for a tx that then gets removed from its mempool because of SIZELIMIT. It’ll reply with a NOTFOUND which we don’t do anything with.

    I think that’s ok because one minute later we’ll request from another peer that INV’ed the transaction to us: https://github.com/bitcoin/bitcoin/blob/cd6cb9745e13a62e130b11f78a13bcc1d424b05e/src/net_processing.cpp#L4043 and if the tx was removed from the first peer’s mempool for SIZELIMIT, then one minute delay shouldn’t be a problem (since we don’t expect the tx to be mined any time soon).

  26. ariard commented at 8:09 pm on November 14, 2019: member

    I think that’s ok because one minute later we’ll request from another peer that INV’ed the transaction to us

    Ok, I was making the assumption than the NOTFOUND peer is the origin one so you don’t have an INV from other peers. Isn’t our current logic going to loop on the sending GETDATA, receiving NOTFOUND sequence ? Or is this too an edge case?

  27. sdaftuar commented at 8:12 pm on November 14, 2019: member

    So the benefits of this PR are presumably to eliminate DoS attack surface on mapRelay (though I’m not sure anyone has articulated this clearly) and to solve a privacy issue. I think both those things could also be solved without introducing relay problems from announcing transactions and then not responding (and I’ve proposed one approach that I think would be satisfactory; I’m sure someone could probably propose a solution to both these things that involves keeping mapRelay around as well).

    Anyway I don’t understand why we would be sloppy here; it’s not like relay degradation is some kind of fundamental trade-off with what we’re trying to accomplish. (And the DoS and privacy issues are not new, they have been around for a long time, so it doesn’t really seem like they are urgent to address either?)

    (BTW if anyone wants to pick up #15505 and try to get that merged so that this gets unblocked, please go for it — I closed it earlier this year because there didn’t seem to be much review interest, and then reopened it in light of this PR to remind others of one approach to solving the issue that I thought needed to be fixed first, before we went in the direction proposed here.)

  28. naumenkogs commented at 0:55 am on November 15, 2019: member

    @jnewbery

    I don’t think the topology probing technique you describe in #17303 (comment) is made any worse by this PR. We can always probe whether a tx is in a peer’s mempool using the technique here: #16851 (comment).

    I have never said it’s made any worse. What I don’t like is that in future when we’ll want to fix privacy issue we might return mapRelay in a slightly different form, so I thought maybe we should do it right away here. This is not a deal-breaker for me, but I want others to understand this while making a decision.

  29. ajtowns commented at 6:16 am on February 3, 2020: member

    @naumenkogs

    I have never said it’s made any worse. What I don’t like is that in future when we’ll want to fix privacy issue we might return mapRelay in a slightly different form, so I thought maybe we should do it right away here.

    I don’t think we have a clear enough idea on how to fix privacy issues in non-token ways to do anything “right away” here? And if we’ll want a different thing than mapRelay anyway, pulling mapRelay out now doesn’t sound like it’ll make things much more difficult?

  30. ajtowns commented at 11:18 am on February 3, 2020: member

    I don’t think this PR really makes reacting to notfounds a more urgent issue for all the reasons @jnewbery lists. But I had a look at #15505 again anyway, and think that logic like “if we see a notfound for an inflight tx, look through our other peers to reset the process_time for that tx that was soonest to right now” should be fine – we don’t need to take any new locks, and we’re just iterating over and looking up some maps which should be plenty fast. Example patch at https://github.com/ajtowns/bitcoin/commits/202002-bump-notfound

    I don’t think the p2p_leak_tx test is quite right – if I take away the “break” after a GETDATA fails, it continues for a while then fails on the txid in inbound_peer.last_message['inv'] assertion. Not sure why; maybe there’s a race condition if the getdata comes in between the inv getting queued and actually send or similar?

  31. ajtowns removed review request from ajtowns on Feb 3, 2020
  32. MarcoFalke force-pushed on Mar 25, 2020
  33. MarcoFalke force-pushed on Mar 25, 2020
  34. MarcoFalke force-pushed on Mar 25, 2020
  35. MarcoFalke commented at 2:17 pm on March 25, 2020: member

    I don’t think the p2p_leak_tx test is quite right – if I take away the “break” after a GETDATA fails, it continues for a while then fails on the txid in inbound_peer.last_message[‘inv’] assertion. Not sure why; maybe there’s a race condition if the getdata comes in between the inv getting queued and actually send or similar?

    The test is “right”. It is only single-use, as correctly assumed by you: In the next run after receiving the notfound, it will receive a bunch of invs of previous transactions (but not yet the inv of the current transaction), so the assertion will fail.

  36. MarcoFalke force-pushed on Apr 29, 2020
  37. MarcoFalke force-pushed on Apr 29, 2020
  38. DrahtBot commented at 3:04 am on May 12, 2020: member

    🐙 This pull request conflicts with the target branch and needs rebase.

  39. DrahtBot added the label Needs rebase on May 12, 2020
  40. MarcoFalke force-pushed on May 23, 2020
  41. MarcoFalke force-pushed on May 23, 2020
  42. p2p: Stop relaying non-mempool txs 11c09018f2
  43. p2p: Stop relaying txs to nodes that could have never received an inv for it 4dcefd7b77
  44. MarcoFalke force-pushed on May 23, 2020
  45. adamjonas commented at 6:21 pm on June 12, 2020: member
    Looks like there are still a couple of references to mapRelay in net_processing.cpp left to remove.
  46. MarcoFalke commented at 7:04 pm on June 12, 2020: member
    Please don’t review yet. This has not yet properly been rebased.
  47. MarcoFalke marked this as a draft on Jun 12, 2020
  48. MarcoFalke commented at 12:00 pm on June 28, 2020: member
    Closing for now. There have been some other changes, so if this still makes sense, it is probably best to start a new pull request
  49. MarcoFalke closed this on Jun 28, 2020

  50. MarcoFalke deleted the branch on Jun 28, 2020
  51. fanquake referenced this in commit 5550fa5983 on Jul 14, 2020
  52. sidhujag referenced this in commit 0090059f60 on Jul 14, 2020
  53. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-05 19:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me