gmaxwell commented at 11:53 pm on August 25, 2016: contributor

We should learn about new peers via address messages.

An inbound peer connecting to us tells us nothing about its ability to accept incoming connections from us, so we shouldn’t assume that we can connect to it based on this.

The vast majority of nodes on the network do not accept incoming connections, adding them will only slow down the process of making a successful connection in the future.

Nodes which have configured themselves to not announce would prefer we not violate their privacy by announcing them in GETADDR responses.

Do not add random inbound peers to addrman.

We should learn about new peers via address messages.

An inbound peer connecting to us tells us nothing about
 its ability to accept incoming connections from us, so
 we shouldn't assume that we can connect to it based on
 this.

The vast majority of nodes on the network do not accept
 incoming connections, adding them will only slow down
 the process of making a successful connection in the
 future.

Nodes which have configured themselves to not announce would prefer we
 not violate their privacy by announcing them in GETADDR responses.

eb3596f7c2

dcousens commented at 11:55 pm on August 25, 2016: contributor

@gmaxwell how does this impact on the connectivity of those [non-incoming] nodes? (that is, does it hinder them at all?)

pstratem commented at 1:03 am on August 26, 2016: contributor

utACK 53f8f226bd1d627c4a6dec5862a1d4ea5a933e45

travis is borked

rebroad commented at 4:06 am on August 26, 2016: contributor

utACK

sipa commented at 7:12 am on August 26, 2016: member

@EthanHeilman any comments here?

jonasschnelli added the label P2P on Aug 26, 2016

laanwj commented at 9:19 am on August 26, 2016: member

utACK, this has bothered me before

sipa commented at 1:01 pm on August 26, 2016: member

utACK

sipa commented at 1:06 pm on August 26, 2016: member

Can we backport this to 0.13?

EthanHeilman commented at 3:20 pm on August 26, 2016: contributor

@sipa @gmaxwell Reading main.cpp:5033, my understanding is that incoming connections are only added to the tried table if the source IP address of the TCP connection is the same as the IP address supplied in the version message. The assumption here is that if the peer’s IP in the version message is the same as the IP in TCP connect, the peer can accept incoming connections. This assumption can be violated under two conditions:

Nothing prevents a malicious peer behind a NAT from altering the IP address in the version message to make it look like it is not behind a NAT. While this is less than ideal an attacker can only do this once per IP address they control. This is certainly better than the protections provided by address messages where a malicious peer controlling a single IP address can inject up to ~2048 IP addresses into the new table via address messages (this is an approximate back of the envelope calculation).
A peer might not accept incoming connections but still may not be behind a NAT. It is unclear how common this is, but my expectation from reading bitcoind log files over the years would be that it would be that it is uncommon. We can answer this question with data.

From a security standpoint IP addresses from address messages are more vulnerable to manipulation than IP addresses from incoming connections.

The vast majority of nodes on the network do not accept incoming connections, adding them will only slow down the process of making a successful connection in the future.

I am operating under the assumption that most nodes that do not accept incoming connections do so because they are behind NATs and therefore would not be added to the tried table.

For this and other reasons I would be against this change. If we want to improve the quality of the tried table perhaps confirming that an incoming connection can be connected to via a feeler connection might be a solution. Test-before-evict which I’m currently working on (proof of concept can be found at #6355) does provide some functionality like this and my recently accepted pull request #8282 did, in tests, boost the reachable nodes in the tried table from 55 IPs to 590 IPs. I suspect the real problem with reachable IPs in the tried table is nodes switching IP addresses.

laanwj commented at 3:29 pm on August 26, 2016: member

Reading main.cpp:5033, my understanding is that incoming connections are only added to the tried table if the source IP address of the TCP connection is the same as the IP address supplied in the version message.

Do we leak internal LAN IPs here, or is that prevented some way? Nodes should probably not advertise their address at all on outgoing connections, certainly not if they don’t want / can’t be connected to.

EthanHeilman commented at 4:18 pm on August 26, 2016: contributor

Do we leak internal LAN IPs here.

When sending a version message you include your local IP address (see net.cpp:503) but the logic which determines this is fairly complicated (see net.cpp:13).

I’ve checked a bunch of older bitcoind logs and it appears that when the host is behind a NAT it sets the IP address in the version message to either 0.0.0.0 or 127.0.0.1. Not sure if that is always true. Can someone who is behind a NAT confirm this behavior in more recent code?

Nodes should probably not advertise their address at all on outgoing connections, certainly not if they don’t want / can’t be connected to.

If a node sets its IP address in the version message to 0.0.0.0 or 127.0.0.1 when making an outgoing connection they will not be connected to.

sipa commented at 4:47 pm on August 26, 2016: member

@EthanHeilman It seems dangerous to me to let an attacker directly enter their IP addresses into our tried table. By murmuring addresses they’re certainly able to get into the known table, but the tried table should be reserved for IPs which we selected ourselves. Feel free to contradict my intuition here if you have numbers that show otherwise, though.

EthanHeilman commented at 5:28 pm on August 26, 2016: contributor

@sipa Its an interesting question and I’m not sure what the right answer is.

It seems dangerous to me to let an attacker basically enter their IP addresses into our tried table [..] the tried table should be reserved for IPs which we selected ourselves.

If tried addresses are only added when we select them I agree it would make it harder for an attacker to control tried but it would also have two downsides:

An attacker who eclipsed a node would not need to worry about incoming connections telling the eclipsed node about non-attacker IP addresses except via address messages. An attacker could flush the new table regularly. I’m not sure how this composes with the new connection exhaustion countermeasures.
The tried table will be smaller. It is hard to judge exactly how small since feeler connections will increase the number of short lived incoming connections boosting the tried table, but feeler connections will also increase the number of selected outgoing connections so adding incoming connections to the tried table might not be as valuable.

Determining how the pros balance against the cons would require more research and experiments which I plan on doing but not in the next few months. That said, doing this for security reasons seems more justifiable than doing it to decrease the time it takes to make an outgoing connection. @gmaxwell What do you think about putting incoming nodes into the new table?

0            if (((CNetAddr)pfrom->addr) == (CNetAddr)addrFrom)      
1             {      
2                 addrman.Add(addrFrom, addrFrom);       
3             }

sipa commented at 5:30 pm on August 26, 2016: member

I would be fine with adding them to the new table… that is identical to what would happen (and likely happens anyway) when they send us an addr message after connecting.

gmaxwell commented at 8:12 pm on August 26, 2016: contributor

@EthanHeilman hosts already send an addr message on connect (see the above stanza), which will add it to new. So I think also doing it there doesn’t add anything new except making it more likely that they’ll advertise themselves accidentally.

The fact that it was ever added to tried at all was a mistake that violated the design of the system, tried table was supposed to be protected against self-selection by actual trying :) – I think there could be some legitimate concern that actual useful hosts will be promoted to tried somewhat slower, but I believe that is addressed by the feeler connections.

I would not be surprised if, after this change is widespread, if tried has more reachable hosts in it– because nodes will pollute their tried table with hosts that aren’t reachable less often.

I’ve checked a bunch of older bitcoind logs and it appears that when the host is behind a NAT it sets the IP address in the version message to either 0.0.0.0 or 127.0.0.1. Not sure if that is always true.

No, it sets it to its current discovered address. If you are behind NAT and don’t have discovery (e.g. via UPNP or configuration) then you won’t have a discovered address, and so you’ll get 0.0.0.0. Because of security concerns with the upnp library we were using we’ve currently got it off by default, but I hope to have discovery come back in the future.

I am operating under the assumption that most nodes that do not accept incoming connections do so because they are behind NATs and therefore would not be added to the tried table

Eliminating inbound connections is a good way to mitigate a number of DOS vulnerabilities. A standard recommended commercial configuration is to have production nodes which you care about connect only to other somewhat trusted nodes.

From a security standpoint IP addresses from address messages are more vulnerable to manipulation than IP addresses from incoming connections.

If we want to privilege address announcements that also happen to come from the host telling us about them (e.g. using a lower last seen time for them), I think that would be completely reasonable. But there is no need to infer an addr announcement from the version message.

In the future we’ll likely want to carry additional metadata with announcements, which might not be possible to just infer from version handshakes in any case. (Already we can’t get port numbers that way).

luke-jr commented at 3:26 am on August 27, 2016: member

I’ve checked a bunch of older bitcoind logs and it appears that when the host is behind a NAT it sets the IP address in the version message to either 0.0.0.0 or 127.0.0.1. Not sure if that is always true. Can someone who is behind a NAT confirm this behavior in more recent code?

Wireshark tells me I am sending my local-scope IPv6 address, when connecting over IPv4 (behind NAT)… :/

sipa commented at 5:51 pm on August 27, 2016: member

@luke-jr Can you open a separate issue for that? We should never send local addresses out.

gmaxwell commented at 7:28 am on September 1, 2016: contributor

Where are we on this?

sipa commented at 10:41 am on September 1, 2016: member

@EthanHeilman Since we are announcing our own address as soon as we’re past IBD anyway, and there is nothing to prevent other nodes from doing the same, what value does this suggested call to addrMan.Add(addr, addr) have?

laanwj commented at 1:08 pm on September 1, 2016: member

Where are we on this?

I still think we should continue with this change. I think it would be good eventually to deprecate “addrFrom” in version messages completely: to just send a dummy value and ignore what is sent there. It’s a nest of potential privacy and spoofing issues (akin to the X-Forwarded-For header the for HTTP proxies). Also from a protocol point of view advertisement should be separate from version.

That said, @EthanHeilman does have a point that the if (((CNetAddr)pfrom->addr) == (CNetAddr)addrFrom) check at least makes sure that the advertiser can use the IP address advertised, so is a little bit more trustworthy than just a random addr.

On the other hand this same heuristic (if desired) could be used for an address advertised in addr messages. There’s no need to couple this to version messages.

jonasschnelli added the label Needs backport on Sep 1, 2016

gmaxwell commented at 9:31 am on September 3, 2016: contributor

@laanwj Yes, I agree, I think we should do something when we receive address advertisements to up-weight self ones– probably by giving the source a time penalty of zero for it’s own address in CAddrMan::Add. But I think that makes sense as a separate PR. (I’ll go do that now).

gmaxwell commented at 0:52 am on September 6, 2016: contributor

@laanwj PR #8661 now applies that heuristic for self announcements in addr messages.

EthanHeilman commented at 9:09 pm on September 6, 2016: contributor

@gmaxwell After thinking about this PR more and performing some back of the envelope calculations I am no longer opposed to this change. To understand the trade-offs I performed a short analysis.

Eclipse attackers have two resources:

time the attack requires,
and attacker controlled IPs in different IP groups (\16 prefixes).

This PR helps attackers by reducing the number of IP addresses an attacker needs to eclipse a node by shrinking the tried table (by how much I don’t know) but, as shown below, increases by months the time an attacker needs to spend murmur the attacker addresses into tried.

Disclaimer: I didn’t model a more effective hybrid strategy where an attacker strategically places attacker controlled IP addresses in both new and tried as I don’t know what the optimal hybrid new and tried strategy should be. The success probability shown below measures only attacker IPs in tried.

A Murmur Model:

I built a very simple model of the tried and new table to determine the length of time it would take feelers to move via a feeler connection running at a 2 minute interval the attacker controlled IP addresses from the new table into the tried table.

The honest IPs are the number of honest IPs in the tried table at the start of an attack.
The attacker IPs are the attacker control IP addresses in different “groups” (\16 prefixes). Some of these will never make it into the new or tried table due to collisions.
The success probability is the probability that the node makes all 8 outgoing connections to the attacker IP addresses.
For simplicity sake we only measure addresses selected from tried. As noted in the disclaimer this is a not a fair assumption in an actual attack but allows us to look at murmured addresses without having to worry about the new table.

murmerattack

sipa commented at 11:15 am on September 7, 2016: member

@EthanHeilman Thanks a lot for the analysis.

sipa merged this on Sep 7, 2016

sipa closed this on Sep 7, 2016

sipa referenced this in commit 5b2ea29cf4 on Sep 7, 2016

MarcoFalke added this to the milestone 0.13.1 on Sep 7, 2016

MarcoFalke removed the label Needs backport on Sep 9, 2016

laanwj referenced this in commit d9c99c3058 on Sep 15, 2016

luke-jr referenced this in commit 42ea51a65f on Sep 21, 2016

OlegGirko referenced this in commit e666114da5 on Aug 31, 2017

UdjinM6 referenced this in commit 589d22f2ca on Sep 1, 2017

pyritepirate referenced this in commit 3a80e10e8c on Jan 14, 2019

random-zebra referenced this in commit 8bbc0650e6 on Jul 1, 2020

Fuzzbawls referenced this in commit c79ac2ada9 on Aug 19, 2020

Fuzzbawls referenced this in commit fe12ff96ab on Aug 25, 2020

MarcoFalke locked this on Sep 8, 2021

Do not add random inbound peers to addrman. #8594

A Murmur Model: