p2p: Remove fGetAddr flag

mzumsande commented at 9:27 pm on August 24, 2020: contributor

fGetAddr is an old flag that indicates an ongoing GETADDR/ADDR interaction. Introduced in https://github.com/bitcoin/bitcoin/commit/dd519206a684c772a4a06ceecc87c665ad09d8be, it prevents further relay of received ADDR answers to other peers via the RelayAddress() gossip mechanism while set - probably to prevent ADDR traffic from being dominated by large GETADDR answers.

It also used to be possible to receive a large GETADDR answer spread over multiple ADDR messages (each bounded to max 1000 addresses), so the condition if (vAddr.size() < 1000) pfrom.fGetAddr = false; was meant to ensure that fGetAddr would only get cleared once the last ADDR of the package was received (typically with < 1000 addresses).

However, fGetAddr currently does not work as originally intended and is no longer useful:

typically, the first ADDR we receive from a peer after sending the GETADDR is not the GETADDR response but the self-announcement of our peer. If this is the case, fGetAddr acts on it (preventing its relay) and is already inactivated once the GETADDR response is received.
https://github.com/bitcoin/bitcoin/commit/5cbf75324d1509a1262b65c5073314a4da3f6d77 introduced the condition vAddr.size() <= 10, so messages with more than 10 elements won’t be relayed further regardless.
since #5419, we do not get more than one ADDR message in response to GETADDR because vAddrToSend cannot have more than 1000 elements.

Removing the flag will cause initial self-announcement from outbound peers to be gossip-relayed.

Thanks to jnewbery for help with reconstructing the history of fGetAddr!

[Outdated, but left in for context] Removing the flag will change behaviour in two special cases: 1.) If the GETADDR response is exactly of size 1000, fGetAddr will not not be cleared on current master, so that the next “regular” ADDR message by this peer won’t be relayed - this does not seem intentional, changing it arguably fixes a bug. 2.) If the response to GETADDR is small (<=10 addresses), it will be relayed to peers just as any other ADDR message of that size.

jnewbery commented at 9:54 pm on August 24, 2020: contributor

concept ACK

DrahtBot added the label P2P on Aug 24, 2020

DrahtBot commented at 7:20 am on August 25, 2020: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#17785 (p2p: Unify Send and Receive protocol versions by hebasto)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

in src/net_processing.cpp:2496 in 32bd4cbd7e outdated

2445@@ -2446,7 +2446,6 @@ void PeerLogicValidation::ProcessMessage(CNode& pfrom, const std::string& msg_ty
2446 
2447             // Get recent addresses
2448             m_connman.PushMessage(&pfrom, CNetMsgMaker(nSendVersion).Make(NetMsgType::GETADDR));
2449-            pfrom.fGetAddr = true;

jnewbery commented at 10:21 am on August 25, 2020:

This is unrelated to this PR, but could you replace this with a blank line instead of deleting the line. The MarkAddressGood() call below is unrelated to getting recent addresses, so should be visually separated from the line above.

in src/net_processing.cpp:2589 in 32bd4cbd7e outdated

2585@@ -2587,7 +2586,7 @@ void PeerLogicValidation::ProcessMessage(CNode& pfrom, const std::string& msg_ty
2586                 continue;
2587             }
2588             bool fReachable = IsReachable(addr);
2589-            if (addr.nTime > nSince && !pfrom.fGetAddr && vAddr.size() <= 10 && addr.IsRoutable())
2590+            if (addr.nTime > nSince && vAddr.size() <= 10 && addr.IsRoutable())

jnewbery commented at 10:21 am on August 25, 2020:

since you’re touching this line, join it with the line below to have the opening brace on the same line.

jnewbery commented at 10:22 am on August 25, 2020: contributor

Looks good. Just two minor style comments inline.

You should update the commit log to include your excellent description of why it’s ok to remove this.

mzumsande force-pushed on Aug 25, 2020

mzumsande commented at 8:32 pm on August 25, 2020: contributor

Thanks for the review! I pushed an update, implementing both style suggestions and copying the PR description into the commit message.

naumenkogs commented at 11:56 am on August 26, 2020: member

ACK ef9855580098654cdfc5844f8922518e6602878f.

It’s good to get rid of the useless code. Even though it’s only a couple lines, the burden of this variable is massive: I spent 10 minutes figuring what does it do, even after reading the PR description.

I tried to make sure we don’t miss any corner cases, and I think this PR works correctly.

Regarding your commit message:

2.) If the response to GETADDR is small (<=10 addresses), it will be relayed to peers just as any other ADDR message of that size.

I’m not sure it’s a bug in master. I always thought of <=10 check as something to limit the forwarding of unsolicited ADDRs. Now it may also forward a solicited ADDR. May it leak privacy somehow? I guess not, because we don’t send them out immediately anyway.

As a side note, is this possible?

Victim node V starts and connects to their peers
An attacker spams V with a lot of unsolicited ADDR messages (of size <10 but from different Sybils) to make sure all ADDRs V sends to their peers is of size 11 at least.
V’s self-announcements (and other unsolicited ADDRs from V) are never forwarded by their peers, so only immediate peers ever learn about V.

This PR doesn’t address this issue as a whole, but it removes one out of two mechanisms making this possible I believe :) I think we should look further and see if we can fix it in full, perhaps in another PR.

mzumsande commented at 10:22 pm on August 26, 2020: contributor

I’m not sure it’s a bug in master. I always thought of <=10 check as something to limit the forwarding of unsolicited ADDRs. Now it may also forward a solicited ADDR. May it leak privacy somehow? I guess not, because we don’t send them out immediately anyway.

I just wanted to note that it is a small change in behaviour, not sure if it is a bug. In any case, I don’t think that this will occur often, because with MAX_PCT_ADDR_TO_SEND = 23 it would mean that an honest peer would need to have a really small addrman with less than 50 addresses to send such a small response - usually, GETADDR responses should be considerably larger.

Regarding your scenario: I think this is a very interesting scenario and probably worth addressing (V’s address could still be part of a GETADDR response though, even if it not forwarded). I don’t see though how this PR helps with this even a little. On current master, fGetAddr is set in the VERSION processing with an outbound peer, and immediately unset as soon as we get a single ADDR with size < 1000 from the peer (and can never be set again for the duration of the connection) - while your scenario entails repeated messages.

laanwj commented at 11:36 am on August 27, 2020: member

Though I had an idea what it was for, I don’t think I’ve ever thoroughly understood this flag. Thanks for the explanation and ACK on removing it. ACK ef9855580098654cdfc5844f8922518e6602878f

jnewbery commented at 8:47 am on September 1, 2020: contributor

ACK the code changes in ef9855580098654cdfc5844f8922518e6602878f

If you touch this again, could you line-wrap the body of your commit log at 80 chars? (also no need to credit me in the commit log - this is your work!)

DrahtBot added the label Needs rebase on Sep 3, 2020

amitiuttarwar commented at 3:21 pm on September 3, 2020: contributor

strong concept ACK! thank you! that variable is a distraction.

mzumsande force-pushed on Sep 3, 2020

mzumsande commented at 6:51 pm on September 3, 2020: contributor

Rebased and reformatted the commit message as suggested by @jnewbery .

DrahtBot removed the label Needs rebase on Sep 3, 2020

naumenkogs commented at 6:55 am on September 4, 2020: member

ACK d3eaaf40541d1ef7b64bdc354c5ac23c3ad9c9e2

Another behavior change, or rather an elaboration over “If the response to GETADDR is small (<=10 addresses), it will be relayed to peers just as any other ADDR message of that size.”

We just connected to the peer and send them a GETADDR
Before getting GETADDR, they schedule their self-announcement for us by adding their addr to vAddrToSend
Their ADDR sending is triggered (see pto->m_next_addr_send < current_time)
We receive their self-announcement containing only 1 addr and ~~keep fGetAddr=true~~ set fGetAddr=false
They finally receive our GETADDR and schedule their response (can be flipped with (4))

I think this order of events is unlikely (need much luck for (3) to happen before (5)) but possible.

In that case,

before this PR, we would not forward their self-announcement, because fGetAddr=true at the time we receive it (we update it right at the end of processing it)
after this PR, we would forward their self-announcement

I think this is actually an improvement. I don’t see any reason to withhold an initial announcement of the peer instead of forwarding it to other nodes.

Even if this had any privacy benefits as a side-effect (e.g., somehow less revealing of new connections), it had a very low chance of working. Maybe this should be a real feature tho.

jnewbery commented at 7:39 am on September 4, 2020: contributor

ACK d3eaaf40541d1ef7b64bdc354c5ac23c3ad9c9e2

amitiuttarwar commented at 11:23 pm on September 4, 2020: contributor

ACK d3eaaf4054

CI failures look unrelated.

sdaftuar commented at 11:49 pm on September 4, 2020: member

since #5419, we do not get more than one ADDR message in response to GETADDR because vAddrToSend cannot have more than 1000 elements.

If I’m understanding you correctly, you’re saying that we can’t get more than one ADDR message in response because our own software wouldn’t do it. But what about other software on the network?

At any rate, I think the reason we have this variable is to track whether an ADDR message is likely in response to a getaddr message we send, in which case we don’t want to relay the learned addresses to other peers, or if an addr message is an unsolicited relay from our peer, in which case we also want to try to relay. (The reason to not relay messages with more than 10 entries is to rate limit the addr relay, so that a spammer can’t use up all the network’s bandwidth by blasting out addr messages.)

I think one downside to relaying getaddr responses is a privacy leak - if you started up with a small addrman, and learned only a small set of nodes to connect to from one peer (and relayed that whole thing to another peer), that other peer might have a pretty good idea of who you’re going to be connecting to.

Anyway I’d be hesitant to remove this bool without talking through how we view this scenario, as it seems odd to me to relay addresses that are in response to a getaddr message, since we expect those to be of a very different nature than unsolicited addresses being relayed to us, which you expect to be originating with some peer announcing itself. (I wish the response to a getaddr was a different message type than an unsolicited address being relayed to us!)

amitiuttarwar commented at 0:28 am on September 5, 2020: contributor

@sdaftuar hmm interesting, I’m trying to evaluate this concern, do I have the scenario right:

Node A starts up, connects to Node B
When Node A receives the VERSION message, it sends a GETADDR message to Node B
Node B has an addrman with < 43 addresses, fulfills the request, sends back 9 addresses randomly selected from its addrman (based on 23 percent calculation)
Node A receives an ADDR message with 9 addresses, passes misc checks in ProcessMessage, invokes RelayAddress, which then sends it out to 1 or 2 nodes? (if I’m understanding the role of nRelayNodes correctly), including Node C

and then your concern is at this point, that Node C is able to deduce private information? the connection between Node A and Node B? or more generally what Node A’s options for connections are?

mzumsande commented at 7:02 am on September 5, 2020: contributor

I just realized that fGetAddr seems to be completely broken/ineffective after setting up a node with extra logging for fGetAddr on mainnet with current master:

The problem is that the first ADDR we receive in the lifetime of a typical connection is the initial self-announcement of our peer! After receiving this ADDR message of size 1, fGetAddr will be cleared - it wasn’t active a single time by the time I received the actual GETADDR response from a peer. This is exactly what @naumenkogs described in his earlier comment (https://github.com/bitcoin/bitcoin/pull/19794#issuecomment-686953704), however it is not rare because m_next_addr_send is 0 initially.

Therefore, what this flag currently seems to do is prevent initial self-announcements of outbound peers from being relayed to other peers.

I’ll update the PR description with above info later this weekend and will also address @sdaftuar’s comment - marking the PR as Draft until then.

mzumsande marked this as a draft on Sep 5, 2020

jnewbery commented at 9:38 am on September 5, 2020: contributor

I just realized that fGetAddr seems to be completely broken/ineffective after setting up a node with extra logging for fGetAddr on mainnet with current master…

Wow, good observation. So, this flag doesn’t even do what it was introduced for. Strong ACK on improving the comments/logic here, but I also still think removing it entirely is fine. Our “don’t relay getaddr responses” logic is enforced by the vAddr.size() <= 10 clause.

mzumsande commented at 10:02 pm on September 6, 2020: contributor

To elaborate on my last comment: The typical behavior seems to be that in the first SendMessages() loop after being successfully connected, our peer will self-advertise (because m_next_local_addr_send == 0 initially) and send us an ADDR without further delay (because m_next_addr_send == 0 initially), sending the GETADDR response later. It probably depends on the timing of when the VERACK vs. GETADDR messages are received by our peer, but from my node logs, this is the order of events in almost all cases.

So I think that there are now two separate issues: 1.) Considering that this flag mainly prevents initial self-announcements from being gossip-relayed and seems to have little to no interaction with GETADDR: Should we embrace this unintended change in behavior and rename the flag to fDontRelayInitialSelfAnnouncements or similar? As @naumenkogs above, I don’t really see a good reason to do this.

2.) Even though fGetAddr doesn’t seem to work as intended, @sdaftuar’s privacy concerns still poses the question if there should be some mechanism suppressing relay of GETADDR answers beyond the vAddr.size() <= 10 limit. I think that it should be an extremely rare event for our outbound peer to have an addrman with < 43 addresses. Even starting with an empty addrman and querying some DNS seeds should get us over that limit, which should be no issue after having completed the first GETADDR interaction with a typical outbound peer. So I am not sure how realistic the scenario is.

I think it might be more problematic for privacy sending GETADDR responses < 1000, because we communicate the size of our addrman (via 23% factor) and signal that we might be susceptive to an eclipse attack if our addrman is small.

I’ll update the commit message and the PR description (will leave the originally assumed changes in behavior in for now, because they were quoted in this discussion).

p2p: Remove fGetAddr flag

`fGetAddr` is an old flag that indicates an ongoing GETADDR/ADDR interaction.
Introduced in dd519206a684c772a4a06ceecc87c665ad09d8be, it prevents further
relay of received ADDR answers to other peers via the `RelayAddress()` gossip
mechanism while set - probably to prevent ADDR traffic from being dominated by
large GETADDR answers. 

It also used to be possible to receive a large GETADDR answer spread over
multiple ADDR messages (each bounded to max 1000 addresses), so the condition
`if (vAddr.size() < 1000) pfrom.fGetAddr = false;` was meant to ensure that
`fGetAddr` would only get cleared once the last ADDR of the package was
received (typically with < 1000 addresses).

However, `fGetAddr` currently does not work as originally intended and is no
longer useful:
- typically, the first ADDR we receive from a peer after sending the GETADDR
is not the GETADDR response but the self-announcement of our peer. If this is
the case, `fGetAddr` acts on it (preventing its relay) and is already
inactivated once the GETADDR response is received.
- 5cbf75324d1509a1262b65c5073314a4da3f6d77 introduced the condition
`vAddr.size() <= 10`, so messages with more than 10 elements won't be relayed
further regardless.
- since #5419, we do not get more than one ADDR message in response to GETADDR
because `vAddrToSend` cannot have more than 1000 elements.

Removing the flag will cause initial self-announcements from outbound peers to
be gossip-relayed.

628812a34e

mzumsande force-pushed on Sep 6, 2020

mzumsande marked this as ready for review on Sep 6, 2020

jnewbery commented at 9:07 am on September 7, 2020: contributor

ACK 628812a34ea0d6ac6ed17abaa06fb8e4758c4c5f

naumenkogs commented at 7:53 am on September 8, 2020: member

So there are 2 issues pointed out by sdaftuar, I’d consider them separately:

Privacy of a Bitcoin Core node with few AddrMan records making a new connection to a Bitcoin Core node. I agree with the concern, but according to mzumsande experiments above, the fGetAddr flag is ineffective here, so we already have this problem. The problem probably should be addressed in a different PR. If we can’t come up with something better, we always can expand ADDR message to say it’s unsolicited (not GETADDR response), I don’t see any (privacy) issues with it.
Some other software on the network. The right thing here would be to consider all more or less reasonable scenarios of the communication between our Bitcoin Core node and other software. For example, if that software don’t self-advertise at all, it will actually get the privacy benefit described in (1), because fGetAddr flag won’t be cleared on initial self-ad. Now, maybe that software relied on this privacy benefit? (I doubt though).

I’m leaning towards reACK, but before that I’d rather us make some efforts to make sure (2) is not a concern (maybe an ML post is sufficient).

naumenkogs commented at 9:53 am on September 8, 2020: member

Actually, fGetAddr as currently working (set to false after initial peer’s self-ad) are protecting the initial self-ad from the privacy leak I’ve pointed out here. But I’d repeat that this feature should be made explicit in a follow-up rather than a random side-effect.

jnewbery commented at 10:37 am on September 8, 2020: contributor

@naumenkogs , I don’t understand this:

We receive their self-announcement containing only 1 addr and keep fGetAddr=true

Why would fGetAddr stay as true? I think it’s flipped to false in this scenario.

naumenkogs commented at 10:50 am on September 8, 2020: member

@jnewbery this was a mistake (see I updated it), but the example remains valid.

sdaftuar commented at 11:20 pm on September 10, 2020: member

Rather than remove the fGetAddr flag, what do you think about improving our behavior instead on the other end, eg by my suggestion here to delay the initial addr message we send out to inbound peers in order to mitigate the ambiguity between the response to an initial getaddr and the advertising of the peer’s own address?

naumenkogs commented at 1:57 pm on September 11, 2020: member

@sdaftuar

Rather than remove the fGetAddr flag, what do you think about improving our behavior instead on the other end, eg by my suggestion here to delay the initial addr message we send out to inbound peers in order to mitigate the ambiguity between the response to an initial getaddr and the advertising of the peer’s own address?

I’m not 100% persuaded that fGetAddr better stays (the only valid justification I see at the moment is potentially breaking other software?), but I think this idea would make the flag work as intended.

sdaftuar commented at 4:10 pm on September 11, 2020: member

I’m not 100% persuaded that fGetAddr better stays (the only valid justification I see at the moment is potentially breaking other software?), but I think this idea would make the flag work as intended.

It seems like it should be a reasonably good guess that if you send someone a getaddr, the next addr you get is probably in response. However, since we send out a getaddr to outbound peers on connection, and because we send out an addr for ourselves to inbound peers, also usually on connection, we’ve created a natural race condition between nodes running our same software.

Fixing that race condition (usually) for our own software is pretty easy; and that then reduces the question posed by this PR to “is it important to keep track of when an addr message is in response to a getaddr, or not?” And I tend to think that the answer is probably yes; although addr relay is not something specified in any BIPs, it makes sense to me that responses to a getaddr are of a different quality than general addr-relay, so we might as well try to detect those situations when reasonably possible and relay differently.

As for what other software might exist: in addition to other full-node implementations out there, I think the dns seednodes can be used as addr-fetch peers by Bitcoin Core, in some situations? I haven’t tried to exercise it carefully, but my understanding is that the seednodes are custom software, which potentially differ from each other and from Bitcoin Core.

mzumsande commented at 5:32 pm on September 11, 2020: contributor

Rather than remove the fGetAddr flag, what do you think about improving our behavior instead on the other end.

In this case, I think that we should change the if (vAddr.size() < 1000) pfrom.fGetAddr = false; condition which, in the typical case of receiving a single GETADDR answer of size 1000, prevents us from relaying the next ADDR message of our peer, even though this one would not be related to the GETADDR interaction at all.

I think of fGetAddr as an heuristic machanism meant to cover the typical case (obviously it does not help at all with adversarial situations), so I think it is justified to adapt it, if the typical case has changed a long time ago - especially if we have a second mechanism (size limit <=10) preventing GETADDR answer relay, which, even though not direct, seems less fragile in many respects compared to a timing-based one.

sdaftuar commented at 5:47 pm on September 11, 2020: member

In this case, I think that we should change the if (vAddr.size() < 1000) pfrom.fGetAddr = false; condition which, in the typical case of receiving a single GETADDR answer of size 1000, prevents us from relaying the next ADDR message of our peer, even though this one would not be related to the GETADDR interaction at all.

Yes, I agree that makes sense and better matches what we typically are seeing on the network.

mzumsande commented at 7:02 pm on September 15, 2020: contributor

Ok - I will open a PR in the next days for that. Will close this one and link it from the new one, because it would become a bit confusing with another change in the PR description and discussion referring to old versions.

mzumsande closed this on Sep 15, 2020

sipa commented at 7:04 pm on September 15, 2020: member

There is some more discussion here, BTW: https://github.com/bitcoin/bips/pull/907

bitcoin locked this on Feb 15, 2022

mzumsande deleted the branch on Oct 13, 2023

p2p: Remove fGetAddr flag #19794

Conflicts