p2p: 1 IP per-connection policy #21687

am33r commented at 4:46 AM on April 15, 2021: none

Currently, users can open multiple connections to a reachable node using the same IP address and a different port and request blockchain data at each connection. This behavior has the following impacts. (1) Overwhelm the reachable node and consume its bandwidth by requesting data and discarding it (see video demonstrations here: https://www.dropbox.com/s/pvasua7lnepdgfq/Occupy_Connections.mp4?dl=0 and https://www.dropbox.com/s/45ld1h7cxi3xzj6/RequestingBlockchain.mp4?dl=0)

(2) Undermine the transaction relaying anonymity since multiple connections to each node increase the probability of mapping a transaction to an IP address (currently, the random delay factor in the trickling effect implicitly assumes the attacker has one connection to each node). By establishing multiple connections, the attacker undermines the anonymity property.

To address the two concerns above, we enforce 1 incoming connection per IP address. Please note that it will not significantly affect the nodes behind NAT since there are close to 10K reachable nodes and NATed nodes only need 8-10 outbound connections. Assuming 10K reachable nodes, each with the default 117 incoming connections, there are 10K × 117 = 1,170,000 incoming slots available, of which 10K × 8 = 80,000 are supposedly occupied by the reachable nodes. As a result, the network has 1,090,000 slots for unreachable nodes. In the worst case (i.e., the unreachable nodes exceed that count), a natural equilibrium is self-enforced where some of the unreachable nodes must become reachable and support the rest of the network (altruism in the P2P network).
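The slot arithmetic in this paragraph can be checked with a few lines. This is only a back-of-the-envelope sketch: the node count, the 117 inbound slots, and the 8 outbound connections are the figures assumed in the description, not measurements, and the variable names are illustrative.

```python
# Back-of-the-envelope slot budget using the figures quoted in the description.
# All inputs are assumptions taken from the text, not measured values.
REACHABLE_NODES = 10_000        # "close to 10K reachable nodes"
INBOUND_SLOTS_PER_NODE = 117    # default inbound slots per reachable node
OUTBOUND_PER_NODE = 8           # outbound connections each node makes

total_inbound_slots = REACHABLE_NODES * INBOUND_SLOTS_PER_NODE
used_by_reachable = REACHABLE_NODES * OUTBOUND_PER_NODE
free_for_unreachable = total_inbound_slots - used_by_reachable

# How many unreachable nodes the free slots could serve at 8 outbound each.
supported_unreachable = free_for_unreachable // OUTBOUND_PER_NODE

print(total_inbound_slots, used_by_reachable, free_for_unreachable, supported_unreachable)
```

Under these assumptions the free slots would serve roughly 136K unreachable nodes making 8 outbound connections each.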

We were motivated to apply 1IP per-connection policy since we observed malicious behavior on our node whereby two IP addresses occupied multiple incoming connection slots which led to the bandwidth and privacy concerns.

1IP per-connection policy 26af082294

fanquake added the label P2P on Apr 15, 2021

jarolrod commented at 5:29 AM on April 15, 2021: member

This would need tests written. Also, there are consistent failures on CI and the linter.

in src/net.cpp:1163 in 26af082294

1158 | +	//     (see video demonstrations here: https://www.dropbox.com/s/pvasua7lnepdgfq/Occupy_Connections.mp4?dl=0
1159 | +	//     and https://www.dropbox.com/s/45ld1h7cxi3xzj6/RequestingBlockchain.mp4?dl=0)
1160 | +	//   (2) Undermine the transaction relaying anonymity since multiple connections to each node increase the
1161 | +	//     probability of mapping a transaction to an IP address (currently, the random delay factor in the
1162 | +	//     trickling effect implicitly assumes the attacker has one connection to each node). By establishing
1163 | +	//     multiple connections, the attacker undermines the anonymity property.

MarcoFalke commented at 10:22 AM on April 15, 2021:

This is untrue. All inbound connections have the same relay time assigned. See https://github.com/bitcoin/bitcoin/blob/c6b30ccb2eee5f80f844f79766591f0a1326ce43/src/net.cpp#L3014

Having multiple connections should be no different than a single one.

beingmsaad commented at 8:03 PM on April 15, 2021:

Feasibly occupying multiple incoming connections using the same IP address will undermine the transaction anonymity property provisioned by the "diffusion" process. In the current deployment, a node announces a transaction to each connection after a certain delay. The assumption is that the adversary will not be able to infer the source of the transaction as it might receive the INV from another connection rather than the source (ref diffusion and exponential delay).

If an adversary occupies multiple incoming connections, the adversary can easily infer the source of the transaction, despite that delay (higher probability of receiving INV from the source rather than any other connection of that source). Allowing multiple incoming connections from the same IP address makes it a lot easier.

sipa commented at 8:05 PM on April 15, 2021:

That's not correct. All incoming connections share the same Poisson timer for diffusion, so an attacker that creates multiple incoming connections should not be able to observe (significantly) more than with a single connection.
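The shared-timer argument can be illustrated with a small simulation. This is a minimal sketch, not Bitcoin Core's code: `first_seen_delay`, `average_delay`, and the 5-second mean delay are illustrative assumptions. It compares a hypothetical design in which every connection draws its own exponential delay with one in which all inbound connections share a single draw.

```python
import random


def first_seen_delay(k, shared_timer, mean_delay=5.0, rng=random):
    """Delay until an attacker holding k inbound connections first sees an INV."""
    if shared_timer:
        # One exponential draw covers every inbound connection,
        # so extra connections do not shorten the wait.
        return rng.expovariate(1.0 / mean_delay)
    # Hypothetical alternative: each connection draws its own delay,
    # and the attacker benefits from the earliest one.
    return min(rng.expovariate(1.0 / mean_delay) for _ in range(k))


def average_delay(k, shared_timer, trials=20_000, seed=1):
    """Average first-seen delay over many simulated announcements."""
    rng = random.Random(seed)
    total = sum(first_seen_delay(k, shared_timer, rng=rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    for k in (1, 30):
        print(k, average_delay(k, shared_timer=True), average_delay(k, shared_timer=False))
```

Under these assumptions, 30 connections with independent timers would see the announcement after roughly 1/30 of the mean delay, while with a shared timer the average is the same as for a single connection. The model ignores the processing latency discussed in the thread.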

beingmsaad commented at 9:05 PM on April 15, 2021:

Thanks Pieter. A test case would be to (1) set up a reachable node (say A), (2) find its outgoing connections (getpeerinfo), (3) establish one connection to each of A's outgoing connections, (4) establish multiple connections to A (say 30), and (5) generate a transaction from A and see how much can be inferred. I can do this experiment on my nodes and get back to you with the results.

You have pointed out that the attacker will not be able to observe (significantly) more than with a single connection. Does "significantly" mean a probability in k, where k is the total number of incoming connections?

sipa commented at 9:09 PM on April 15, 2021:

@beingmsaad That would be interesting to get data on!

The only gain you should get from having multiple incoming connections vs. just a single one is avoiding more internal latency/processing delay in the message processing loop. I.e., if CPU speed was infinite, there should be no gain at all.

I mention "significantly" because it's not necessarily true in reality that it's literally no data, but at best I expect that you'd quickly converge towards a small constant factor better accuracy than a single one (maybe a few percent; perhaps a bit more on nodes that are relatively loaded). Of course, if this intuition is wrong I'd very much like to know.

beingmsaad commented at 9:17 PM on April 15, 2021:

@sipa I will get some data on it and share it with you (possibly over the weekend). Will generate a few transactions in the testnet and see how much can be inferred by making 2-30 incoming connections.

in src/net.cpp:1159 in 26af082294

1151 | @@ -1136,6 +1152,31 @@ void CConnman::CreateNodeFromAcceptedSocket(SOCKET hSocket,
1152 |          }
1153 |      }
1154 |  
1155 | +	// Currently, users can open multiple connections to a reachable node using the same IP address and a 
1156 | +	// different port and request blockchain data at each connection. This behavior has the following impacts:
1157 | +	//   (1) Overwhelm the reachable node and consume its bandwidth by requesting data and discarding it
1158 | +	//     (see video demonstrations here: https://www.dropbox.com/s/pvasua7lnepdgfq/Occupy_Connections.mp4?dl=0
1159 | +	//     and https://www.dropbox.com/s/45ld1h7cxi3xzj6/RequestingBlockchain.mp4?dl=0)

MarcoFalke commented at 10:25 AM on April 15, 2021:

I didn't visit the website and I won't open this mp4, but a single inbound connection should be able to ask for the same data that multiple connections ask for.

beingmsaad commented at 8:15 PM on April 15, 2021:

You are right. Any inbound connection should be able to request the same data that any other connection asks for. In that case, any effect on the bandwidth is perfectly fine. However, having multiple inbound connections from the same IP address makes it a lot easier for the adversary to overwhelm the bandwidth of the victim node. And this can be done easily through simple scripts without actually running a blockchain node. I have observed 33 incoming connections to my reachable node from the same IP address, which raised a flag for me. Again, the problem circles back to allowing multiple incoming connections from the same IP address, which needs to be addressed.

MarcoFalke commented at 10:30 AM on April 15, 2021: member

I don't see how this improves anything. If you disagree, a test or otherwise steps to reproduce would be helpful.

in src/net.cpp:1158 in 26af082294

1151 | @@ -1136,6 +1152,31 @@ void CConnman::CreateNodeFromAcceptedSocket(SOCKET hSocket,
1152 |          }
1153 |      }
1154 |  
1155 | +	// Currently, users can open multiple connections to a reachable node using the same IP address and a 
1156 | +	// different port and request blockchain data at each connection. This behavior has the following impacts:
1157 | +	//   (1) Overwhelm the reachable node and consume its bandwidth by requesting data and discarding it
1158 | +	//     (see video demonstrations here: https://www.dropbox.com/s/pvasua7lnepdgfq/Occupy_Connections.mp4?dl=0

practicalswift commented at 3:53 AM on April 16, 2021:

If these videos contain any relevant rationale for this PR, consider sharing that information as text instead (and then remove the links).

For security reasons I don't think many regular contributors to this project (who tend to be highly security/OPSEC conscious) would be willing to download and open an MP4 file which is bit-by-bit controlled by an untrusted uploader. FWIW, these lists are pretty long/scary: https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=vlc and https://cve.mitre.org/cgi-bin/cvekey.cgi?keyword=ffmpeg :)

The above is a general observation: I'm not claiming that there is anything fishy with these specific MP4 files.

beingmsaad commented at 4:39 AM on April 16, 2021:

@practicalswift Sure. Actually, we have a research paper in submission. The submission is double-blind and usually, reviewers do not have time to go through data. Hence, I just did a screen recording and uploaded it.

Here is what happens in the two videos. (1) A lightweight script occupies all the incoming connections of our reachable node using the same IP address and multiple ports (there are 65K ports on a commodity computer, so an adversary can potentially make 64K outgoing connections using a single machine). (2) The second video shows each of those incoming connections continuously requesting block headers from our reachable node and discarding them. As a result, the node's bandwidth is exhausted while the adversary does not have to store anything. The adversary just keeps requesting block headers through each connection and discards them.

Also important to mention is that the idea came to mind since I do blockchain synchronization analysis (i.e., execute getpeerinfo to see who has the up-to-date chain). During that analysis, I found 33 incoming connections from the same IP address, and they remained connected for 21 hours. That's when I started looking into this.

beingmsaad commented at 4:46 AM on April 16, 2021:

@practicalswift Additional caveat (alongside bandwidth exhaustion) is the risk of transaction deanonymization. I am currently conducting experiments on that.

33 incoming connections from one IP address for 21 hours looked fishy. So I reached out to @am33r who deployed 1 connection per IP policy to stop that.

fanquake renamed this:
~~1IP per-connection policy~~
p2p: 1 IP per-connection policy
on Apr 16, 2021

sipa commented at 5:01 AM on April 16, 2021: member

So there could be a number of explanations why someone would want to do this.

One is that perhaps they simply don't know that multiple connections don't really add accuracy to transaction announcements. This synchronization of inbound Poisson timers was only added a few releases ago, specifically in response to observing deanonymizers making multiple connections (from the same IP, or from multiple).

Another is to increase the strength of bandwidth or other resource exhaustion attacks. I believe that this indeed helps the attacker, but I'm not sure how much. Our protection against volumetric DoS attacks against individual nodes is low, and it should be almost as easy to perform them with a single connection.

As for disallowing multiple connections from the same IP:

When incoming connection slots run out, we already prioritize peers from varied IP ranges, and so multiple connections from the same IP would already be preferred for disconnection. It works per IP range, because it's believed that in general it's cheap for attackers to get access to multiple IPs in the same range already.
If we're actually observing attacks in the wild (especially effective DoS attacks), we should obviously do something against that; your solution is one, but it's fairly aggressive.
A slightly less aggressive solution would be to add to the inbound prioritization logic a feature that prefers disconnecting multiple connections from the same IP (even more than multiple connections from the same IP range).
If we'd actually want to refuse like you propose here, it would need a bunch of safeguards (e.g. localhost would need to be excluded, because many local services may connect directly locally; also it should be possible to bypass this policy using net permissions like NOBAN).
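The "less aggressive" idea above, together with the safeguards mentioned, can be sketched as follows. This is an illustrative model only and not Bitcoin Core's eviction code: the `Peer` type, its `noban` flag, and `select_eviction_candidate` are hypothetical names, and real eviction logic applies further protections that are omitted here.

```python
from collections import defaultdict
from dataclasses import dataclass
import ipaddress


@dataclass(frozen=True)
class Peer:
    peer_id: int
    ip: str
    connected_at: float   # connection time in seconds
    noban: bool = False   # stands in for a NOBAN-style net permission


def select_eviction_candidate(peers):
    """Pick the youngest unprotected peer from the most crowded IP, or None."""
    by_ip = defaultdict(list)
    for peer in peers:
        # Safeguards: never evict localhost or explicitly permitted peers.
        if peer.noban or ipaddress.ip_address(peer.ip).is_loopback:
            continue
        by_ip[peer.ip].append(peer)
    if not by_ip:
        return None
    # Prefer the IP holding the most inbound connections.
    crowded = max(by_ip.values(), key=len)
    # Within that IP, drop the most recently connected peer.
    return max(crowded, key=lambda p: p.connected_at)
```

With three localhost peers, two peers from one public IP, and one peer from another, the sketch selects the younger of the two peers sharing the public IP and never touches the localhost connections.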

beingmsaad commented at 5:28 AM on April 16, 2021: none

@sipa I completely agree with the policy being aggressive. My bigger concern however was that no two unreachable nodes behind NAT will be able to connect to the same reachable node. Now there are two approaches to find an optimal solution.

(1) The current connection eviction policy. I am currently reviewing it to see if there is a chance of it being a double-edged sword. If we evict adversary's connections (assuming adversary exploits 1 IP 64K ports weakness), the adversary can also evict benign connections among honest nodes by switching IP addresses. There are great safeguards in place right now (i.e., protecting a subset of peers that relay blocks to us). But again, if an adversary arbitrarily disconnects a connection between two honest nodes, we wouldn't want that. I am stress testing the connection eviction logic on my node right now and will update you if I come across a problem.

(2) The second approach is to find a middle ground. Can we allow x number of incoming connections from the same IP address such that the policy is not really aggressive and we also support NATed nodes? For that purpose, I have been connecting to reachable nodes and monitoring IP addresses in the ADDR response. The last I checked (November 2020), the ratio between reachable and unreachable nodes was 1:5. We can test that approach (i.e., by setting x=5, we do not risk deanonymization, support the unreachable nodes behind NAT, and keep localhost support for onioncat).
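A per-IP cap of this kind can be sketched in a few lines. This is a hypothetical illustration of the x=5 proposal and not anything present in Bitcoin Core: `InboundLimiter`, `try_accept`, and `release` are made-up names, and the localhost exemption is an assumption based on the safeguards requested earlier in the thread.

```python
from collections import Counter
import ipaddress


class InboundLimiter:
    """Accept at most max_per_ip concurrent inbound connections per IP address."""

    def __init__(self, max_per_ip=5):
        self.max_per_ip = max_per_ip
        self.counts = Counter()

    def try_accept(self, ip):
        """Return True if a new inbound connection from ip may be accepted."""
        if ipaddress.ip_address(ip).is_loopback:
            return True  # local services are exempt and not counted
        if self.counts[ip] >= self.max_per_ip:
            return False
        self.counts[ip] += 1
        return True

    def release(self, ip):
        """Record that one connection from ip has closed."""
        if self.counts[ip] > 0:
            self.counts[ip] -= 1
```

With `max_per_ip=5`, the sixth simultaneous connection from one address is refused until an earlier one is released. The sketch counts exact addresses only and does not consider IP ranges.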

A minor concern that I have with the eviction logic is that it only comes into play when all incoming connections are occupied. If a node has 10 incoming connections right now, the adversary can occupy (117-10) and stay connected. The eviction logic will only slowly get rid of the adversary.

beingmsaad commented at 5:34 AM on April 16, 2021: none

@sipa Also, "A slightly less aggressive solution would be to add to the inbound prioritization logic a feature that prefers disconnecting multiple connections from the same IP (even more than multiple connections from the same IP range)" looks like a great approach indeed.

naumenkogs commented at 9:41 AM on April 16, 2021: member

I pretty much agree with @MarcoFalke and @sipa above, but won't add extra noise by rephrasing their arguments. I would love to see improvements to bandwidth DoS and transactional privacy, but this PR in the current shape I think is not a good solution to either problem.

gmaxwell commented at 12:24 AM on April 18, 2021: contributor

The decision to not do this was very much intentional. There are some cases where whole countries (e.g. Qatar) access the internet from behind one or just a few IPs-- likewise for many institutions. Hard prohibiting would allow attackers that can make connections behind these to easily totally deny everyone else behind their firewall from connecting-- just mass connect with a single connection to every node and then with the help of this policy anyone else will be unable to connect to anyone.

No matter what is done with IP filtering attackers can still make multiple connections. There are even commercial services that specialize in providing IPs in different net blocks for attacking P2P networks, and of course any large botnet could proxy connections through hundreds of thousands of IPs -- people sometimes use these casually just to be nuisances on the Bitcoin IRC channels.

There are at least two known networks that attack the Bitcoin network using multiple entire /24s today -- and even one is enough to use every connection slot. Any attack that would be prevented by this PR would immediately again be possible for an attacker after they got access to 125 IPs, which even random scriptkiddies can do with moderate effort, and pretty much every IPv6 user can do trivially.

So, in general, the system must be robust against attackers that make multiple connections, and still must be just as robust against it no matter what per IP limits there are. AFAIK it largely is robust, and parts of it that aren't need to be fixed rather than patched around with a known-false assumption that attackers can't just get a /25 to attack with (super duper false for IPv6 :) ).

As a minor concession to deal with idiots wasting resources, non-protected peers on the same hosts are prioritized for disconnection in the eviction logic.

I believe the above comment about repeating doesn't hold because after protected peers are removed it's the youngest peer that is eliminated.

> (2) Undermine the transaction relaying anonymity since multiple connections to each node increase the probability of mapping a transaction to an IP address (currently, the random delay factor in the trickling effect implicitly assumes the attacker has one connection to each node).

This is false by the system's design. All incoming peers share the same delay pool because it is known that attackers can and do use multiple connections. If there is some sidechannel that bypasses this, it needs to be fixed.

beingmsaad commented at 9:11 PM on April 18, 2021: none

I did some experiments today and found that incoming connections indeed share the same delay pool (modulo the minor variations incorporated by network delays). Here is what I did. Correct me if there are lapses in the approach.

(1) I set up a reachable node (A) with 7 outbound connections (B1...B7). (2) Using a script on another machine, I established 25 outbound connections (C1...C25 --> same IP address and different ports) to A and 1 outbound connection to each of B1...B7. (3) I conducted two standalone experiments in which I generated 10 transactions and 9 transactions from node A, respectively. Then, I applied the following heuristic to analyze the degree of deanonymization (if any) achieved in both experiments.

Heuristic: If A is the source of transaction, then any node in C1...C25 will receive that transaction from A before receiving it from any node in B1...B7. Conversely, for all transactions not generated by A, but relayed by A and B1...B7, all nodes in C1..C25 will receive them from any node in (B1...B7) before receiving it from A.

So far, the results show that among the 10+9 = 19 transactions generated by A, 7+6 = 13 matched the heuristic. Among the other 15 transactions not generated by A but relayed by A and B1...B7, C1...C25 received 14 of 15 from B1...B7 before receiving them from A. (I could have done a better job explaining the heuristic and results mathematically using LaTeX, but I hope the meaning comes across clearly.) I will double-check the results for accuracy just in case something is missing. Two connections in B1...B7 crashed in the middle of the experiment, so that needs to be corrected as well.

Share your comments/thoughts on the aforementioned results. If the approach is meaningful, then is 13/19 (68%) accuracy good? Please also suggest ways in which to further broaden the scope of the experiment (I am planning to connect a random set of nodes D1..Dx) to which A is not connected and also do some timestamping as a control.
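To put the 13/19 figure in context, one option is to attach a confidence interval to it. The sketch below uses the Wilson score interval, which is a standard textbook method chosen here for illustration and not something used in the experiment itself. `wilson_interval` is an illustrative helper name.

```python
import math


def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half_width = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half_width, center + half_width


if __name__ == "__main__":
    low, high = wilson_interval(13, 19)
    print(f"13/19 = {13 / 19:.3f}, approx. 95% interval: [{low:.3f}, {high:.3f}]")
```

With only 19 samples the interval is wide, which is one reason a larger experiment with a control group would be more informative.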

The overarching idea is to analyze how much an adversary can infer about the source of a transaction by establishing multiple incoming connections to a node (made more feasible by 1IP + 65K ports).

Re Connection Patterns from Qatar: There is a natural caveat of Nakamoto consensus desynchronization (global vs local) if all nodes behind NAT connect to the same set of reachable nodes. I will elaborate on that later.

Re Sidechannels: Good suggestion. One way to explore side-channel attacks would be to look at the variance in transaction timestamps when transactions are (1) relayed by the source and (2) not relayed by the source, given that the delay pool for inbound connections is the same. I will look into that.

am33r commented at 9:25 PM on April 18, 2021: none

Thanks, everyone, for pitching in. The current PR was aimed at highlighting a few caveats in 1IP and 65K connections. Based on the feedback, there are a few things that clearly stand out. (1) The policy can be made less aggressive by allowing a few connections from the same IP address or the same netgroup, as @sipa pointed out. (2) @beingmsaad also highlighted the gap between reachable and unreachable nodes (IP addresses received in Addr response) to find an optimistic bound on the policy. I will explore both avenues.

It is clear that an adversary can use proxies, and that strong adversaries like ASes and ISPs can exhaust connection limits. However, the current design makes it a lot easier given an adversary needs just one IP and all the ports available on the adversary's machine.

I am also curious to know one more thing that could be a DoS attempt and if current Bitcoin Core design takes care of that (@beingmsaad and @sipa).

If an attacker makes 15 connections from the same address and requests the blockchain and discards whatever it receives. Assume the attacker has received the entire chain and then repeats all the requests. Is there a mechanism in place to prevent that? Since we have already relayed those blocks to that connection, requesting them multiple times (and discarding them on reception) is a bandwidth-exhausting attempt on a victim node.

sipa commented at 9:33 PM on April 18, 2021: member

> If an attacker makes 15 connections from the same address and requests the blockchain and discards whatever it receives. Assume the attacker has received the entire chain and then repeats all the requests. Is there a mechanism in place to prevent that?

No, there isn't; our protection against DoS attacks of this nature (ones where the attacker and victim have similar resource costs, against a single victim at a time) is very low. But my belief is that doing the same attack with just one connection will cause just as much harm as doing it with 15; in both cases whatever the bottleneck is (disk I/O, network bandwidth, CPU usage, ...) will be maxed out, and the rest won't be.

Where attackers having many connections matters is in their ability to prevent a node from using connection slots for honest peers instead. I think Bitcoin Core has fairly good defenses against this effect already, as such peers would be prioritized for disconnection.

Longer term I believe the solution against volumetric DoS attacks like these is keeping track of how many resources are spent on each peer, and slowing down processing of the worst ones whenever those resources run out. I don't believe the longer term solution is playing cat-and-mouse games; specializing multiple connections from a single IP is just going to result in attackers using ranges of IP addresses instead as @gmaxwell mentions (especially with increasing IPv6 availability, this is trivial).
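The per-peer accounting idea can be sketched as follows. This is a toy model under stated assumptions, not an existing Bitcoin Core mechanism: `PeerResourceTracker`, the single byte budget, and the "throttle the heaviest peers first" rule are all hypothetical choices made for illustration.

```python
from collections import defaultdict


class PeerResourceTracker:
    """Track bytes served per peer and pick the heaviest peers to slow down."""

    def __init__(self, budget_bytes):
        self.budget_bytes = budget_bytes   # total bytes we are willing to serve
        self.served = defaultdict(int)     # peer_id -> bytes served so far

    def record(self, peer_id, nbytes):
        """Account nbytes of served data to peer_id."""
        self.served[peer_id] += nbytes

    def peers_to_throttle(self):
        """Return the heaviest peers whose usage accounts for the overshoot."""
        excess = sum(self.served.values()) - self.budget_bytes
        throttled = []
        ranked = sorted(self.served.items(), key=lambda kv: kv[1], reverse=True)
        for peer_id, used in ranked:
            if excess <= 0:
                break
            throttled.append(peer_id)
            excess -= used
        return throttled
```

Because the accounting is keyed by peer and not by address, the same rule applies whether the load arrives over one connection, many connections from one IP, or a range of IPs. A fuller design would presumably track several resources (disk I/O, bandwidth, CPU) and decay the counters over time.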

beingmsaad commented at 10:04 PM on April 18, 2021: none

> Specializing multiple connections from a single IP is just going to result in attackers using ranges of IP addresses instead as @gmaxwell mentions (especially with increasing IPv6 availability, this is trivial).

I agree with @gmaxwell's observation. However, wouldn't it be more desirable to force the adversary to acquire a range of IP addresses in the given threat model rather than achieving the same goal with just 1 IP address?

sipa commented at 11:50 PM on April 18, 2021: member

> However, wouldn't it be more desirable to force the adversary to acquire a range of IP addresses in the given threat model rather than achieving the same goal with just 1 IP address?

If there are concrete attacks right now that make use of multiple connections per IP (in a way where the attacker actually gains an advantage by using multiple connections), yes.

But I do expect that in the longer term, if there is better generic DoS protection, there won't be an advantage anymore for attackers to do so.

gmaxwell commented at 1:25 AM on April 21, 2021: contributor

> If there are concrete attacks right now that make use of multiple connections per IP (in a way where the attacker actually gains an advantage by using multiple connections), yes.

Even if there are, it has to be weighed against the cost of amplifying other attacks by making it easier for an attacker to deny connectivity to the network for other parties sharing his NAT. My thought is that even reasoning about that isn't worth the benefit; better to fix the attacks directly, since they need to be fixed regardless as it's really easy to get a hundred IPs.

> If the approach is meaningful, then is 13/19 (68%) accuracy good?

I don't see how this test is useful. I think a useful test would have a network of nodes and two equal attackers. The attackers don't relay transactions. One attacker makes 25 connections to each target, the other attacker makes 1 connection to each target. Attackers attribute transactions to the first node they heard the transaction from. Then you want to ask if the attacker is identifying the actual origin more often and if the difference is statistically significant.

Instead, if I've understood your description you've just implemented an attacker with many connections and asked how often the true origin showed up. This doesn't tell you anything about the relative advantage of one or many connections.
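The comparison described above (one attacker with many connections, one with a single connection, then a significance check) could be evaluated with a one-sided Fisher exact test. This is one possible choice of test, offered as a sketch; the function name and any counts passed to it are hypothetical and not results from this thread.

```python
from math import comb


def fisher_one_sided(hits_many, n_many, hits_single, n_single):
    """One-sided Fisher exact p-value.

    Probability, if both attackers were equally accurate, that the
    many-connection attacker gets at least hits_many correct attributions
    given the combined number of correct attributions.
    """
    total_hits = hits_many + hits_single
    total = n_many + n_single
    denom = comb(total, total_hits)
    upper = min(n_many, total_hits)
    tail = sum(
        comb(n_many, k) * comb(n_single, total_hits - k)
        for k in range(hits_many, upper + 1)
    )
    return tail / denom
```

For example, `fisher_one_sided(13, 19, 12, 19)` would compare 13 of 19 correct attributions for the 25-connection attacker against a made-up 12 of 19 for the single-connection attacker. A small p-value would suggest that extra connections help, while a large one would be consistent with no advantage.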

The fact that txn will come slightly more often from the actual source rather than some graph of other peers is expected but doesn't have anything to do with one or many connections. The privacy provided is much stronger for single and infrequent txn than it is for some transaction firehose. :)

> connect to the same set of reachable nodes.

The concern wasn't so much that they be able to connect to the same set of nodes in ordinary usage, but rather that an attacker can aggressively connect to all reachable nodes and thereby completely deny service to its same network siblings (with the proposed change) or allow the attacker to only connect to a small number of nodes they control. The attack is cheap for the attacker and not very obvious -- even to the victim except for the fact that most places they try to connect to disconnect them.

It's better if they connect to different nodes, but the sheer number of nodes means that honest participants will mostly do so in any case. It's attackers you need to worry about.

MarcoFalke commented at 5:15 AM on April 21, 2021: member

I am going to close this for now, because this connection policy wouldn't make sense for us. Though, feel free to continue discussion here.

Improvements to the eviction logic or improving the scheduler to prioritize "good" peers (and thus de-prioritize "bad" peers) are welcome.

MarcoFalke closed this on Apr 21, 2021

bitcoin locked this on Aug 18, 2022