BIP155: include changes from followup discussions #907

pull vasild wants to merge 2 commits into bitcoin:master from vasild:bip155_clarifications changing 1 files +14 −12
  1. vasild commented at 7:52 am on April 10, 2020: contributor
    • Increase the maximum length of an address from 32 to 512 bytes and clarify that the entire message should be rejected if it contains a longer address. (from #766 (comment))

    • Remove a contradiction about unknown address types - “MUST ignore” VS “MAY store and gossip”.

    • Recommend gossiping addresses for unknown network ID and for networks to which the node is not currently connected to. (from #766 (comment))

    • Clarify that the entire message should be rejected if it contains an address with unexpected size (e.g. IPv4 address with length 5).

  2. vasild commented at 7:55 am on April 10, 2020: contributor

    Was it decided how to signal support for the new addrv2 type of messages? 3 options have been discussed:

    1. Bump the protocol version (to 70016). pros: simple to implement and the info will be available right after the handshake, without changing the handshake itself. cons: the protocol version is not a bitmap, so if an implementation wants to support addrv2 then it also has to support all features prior to 70016, see also #766 (review)

    2. Use a new network service bit (NODE_ADDRV2 = (1 << 4)). pros: same as 1., see also #766 (review) cons: hmm?

    3. Use a new message sendaddrv2 during the handshake. pros: can send wider addresses during handshake that would not otherwise fit in addrMe, but addrMe is deprecated. See #766 (comment) cons: complicates and requires changes to the handshake and requires handling if the new message sendaddrv2 arrives after the handshake.

  3. MarcoFalke commented at 11:52 am on April 10, 2020: member
    cc @dongcarl (not sure if you have been working on this as well in the past)
  4. dongcarl commented at 4:41 pm on April 10, 2020: none
    Right, so the reason why there hasn’t been a PR updating the BIP is because we were still working out all the details, and was using the #766 thread as the place for discussion. Some offline discussion has taken place as well, so I should update that thread.
  5. in bip-0155.mediawiki:81 in 4e7714b8a8 outdated
    77@@ -78,8 +78,8 @@ This means that the message contains a serialized <code>std::vector</code> of th
    78 
    79 One message can contain up to 1,000 addresses. Clients SHOULD reject messages with more addresses.
    80 
    81-Field <code>addr</code> has a variable length, with a maximum of 32 bytes (256 bits). Clients SHOULD reject
    82-longer addresses.
    83+Field <code>addr</code> has a variable length, with a maximum of 64 bytes (512 bits).
    


    dongcarl commented at 4:46 pm on April 10, 2020:
    I think that Wladimir actually meant 512 bytes, it may seems large but it’s a DoS limit.

    vasild commented at 7:03 pm on April 10, 2020:
    Oh, yes, that comment clearly says 512 bytes. Updated.
  6. in bip-0155.mediawiki:82 in 4e7714b8a8 outdated
    77@@ -78,8 +78,8 @@ This means that the message contains a serialized <code>std::vector</code> of th
    78 
    79 One message can contain up to 1,000 addresses. Clients SHOULD reject messages with more addresses.
    80 
    81-Field <code>addr</code> has a variable length, with a maximum of 32 bytes (256 bits). Clients SHOULD reject
    82-longer addresses.
    83+Field <code>addr</code> has a variable length, with a maximum of 64 bytes (512 bits).
    84+Clients SHOULD reject messages with longer addresses, even if they are for an unknown network ID (old client seeing a message from a newer client that supports a new network ID).
    


    dongcarl commented at 4:58 pm on April 10, 2020:

    Nit: Could perhaps be reworded, may be confused if the parenthetical “(old client seeing a message from a newer client that supports a new network ID)” refers to “unknown network ID” or “messages with longer addresses”

    Perhaps: Clients SHOULD reject messages with longer addresses, irrespective of the network ID.


    vasild commented at 7:04 pm on April 10, 2020:
    Updated to your proposal, much shorter and clear.
  7. vasild force-pushed on Apr 10, 2020
  8. dongcarl cross-referenced this on Apr 10, 2020 from issue BIP 155: addrv2 BIP proposal by laanwj
  9. dongcarl commented at 9:39 pm on April 10, 2020: none
  10. vasild commented at 1:30 pm on April 16, 2020: contributor

    So, this PR updates some bits that have settled (up to my understanding).

    How to signal for BIP155 is not one of them and there is still debate on how to best achieve it. It’s not in this PR. I think it will deserve a separate PR once it settles.

  11. naumenkogs commented at 10:11 pm on April 16, 2020: member

    I’m not sure it’s a good idea to coordinate all of the changes/ideas via small commits.

    The issue being that since BIPs are big conceptual ideas, it’s usually better when there’s someone (or several people) championing changes. Otherwise, it’s harder too coordinate and get enough review for the change. I’m not even sure people watch these BIP repo PRs much…

    It’s obviously fine to submit minor improvements to already existing BIPs, but if they are Final, not when there’s a lot of uncertainty :)

    Said that, I think everyone will be happy if @vasild chooses to move progress on BIP155 forward, I’d just rather see progress in bigger chunks, along with a substantial discussion (mailing list, weekly meeting, etc).

  12. MarcoFalke commented at 11:08 pm on April 16, 2020: member
    @dongcarl or @naumenkogs Would you be willing to be a co-champion? @laanwj Would you be willing to assign a co-champion?
  13. naumenkogs commented at 11:54 pm on April 17, 2020: member
    @MarcoFalke I’ll be able to timely answer all the questions to the best of my knowledge, talk about different options and arguments, proofread. I don’t have the energy right now to push this proposal forward. That may change in a month or two.
  14. luke-jr requested review from laanwj on Apr 30, 2020
  15. luke-jr assigned laanwj on Apr 30, 2020
  16. luke-jr added the label Proposed BIP modification on Apr 30, 2020
  17. vasild cross-referenced this on May 5, 2020 from issue Tor v3 support by vasild
  18. laanwj commented at 12:55 pm on June 5, 2020: member
    ACK 181a224b47391be1fe6e7e60345a27aef3e5b179 (on the current changes, I agree we should keep this PR open for now, there’s no reason to merge too many things incrementally)
  19. vasild commented at 9:09 am on June 9, 2020: contributor
    Appended a commit to change the signaling from “protocol version” to a “dedicated message”, render preview.
  20. in bip-0155.mediawiki:136 in 934c2dec8a outdated
    139-For older peers keep sending the legacy <code>addr</code> message, ignoring addresses with the newly introduced address types.
    140+==Signaling support and compatibility==
    141+
    142+Introduce a new message type <code>sendaddrv2</code>. Sending such a message indicates that a node can understand and prefers to receive <code>addrv2</code> messages instead of <code>addr</code> messages. I.e. "Send me addrv2".
    143+
    144+<code>sendaddrv2</code> SHOULD be sent immediately after the <code>version</code> message.
    


    laanwj commented at 12:10 pm on June 10, 2020:
    Even before receiving versionack from the peer? (or doesn’t this matter?)

    vasild commented at 12:56 pm on June 10, 2020:

    I think it does not matter, but the current code in https://github.com/bitcoin/bitcoin/pull/19031 does it after receiving the verack message, similarly to sendheaders and sendcmpct.

    I updated this PR to clarify that.

  21. vasild cross-referenced this on Jun 10, 2020 from issue Implement ADDRv2 support (part of BIP155) by vasild
  22. vasild force-pushed on Jun 10, 2020
  23. vasild commented at 12:58 pm on June 10, 2020: contributor
    Clarify when to send the sendaddrv2 message, render preview
  24. luke-jr commented at 5:09 am on June 25, 2020: member
    Needs re-ACK I guess
  25. naumenkogs commented at 8:10 am on June 25, 2020: member
    ACK 9286b5254317d9e73fb25c5f0acd2b2d9937843e
  26. gmaxwell commented at 7:43 pm on July 10, 2020: contributor

    Increase the maximum length of an address from 32 to 512 bytes and clarify that the entire message should be rejected if it contains a longer address.

    I agree that this is potentially a worthwhile change because there isn’t otherwise any negotiation about what kind of things are permitted inside address messages. Without it a whole v3 address message type would be needed just to support some larger addresses down the line.

    But makes the maximum size of an addr message 512kb, that is a pretty big slug of latency. Should this increase be couple with a maximum size of the total addr message?

    Remove a contradiction about unknown address types - “MUST ignore” VS “MAY store and gossip”. Recommend gossiping addresses for unknown network ID and for networks to which the node is not currently connected to. (from #766 (comment))

    I think this is a bad change and it isn’t justified by the linked comment. Gleb’s rational is related to networks that are not unknown to you but which you aren’t connected to– such as HS peers for a non HS node.

    I agree that nodes probably should store and forward popular network types that they aren’t connected to, subject to the limitation that since they can’t test these peers, they can’t benefit from the nodes own livelyness detection and might just be garbage– so it would be entirely understandable if they were deprioritized.

    I don’t see any rational given in the link for storing or relaying entirely unknown address types. Particularly since you’ll have no way to filter or sanity check these values.

    I think the combination of these two is particularly bad, effectively constructing a moderately high bandwidth free relay mechanism without a current productive use.

    Instead, if a new type is put into use– what is the harm in older nodes just ignoring it? Eventually they’ll upgrade and carry it too.

    Clarify that the entire message should be rejected if it contains an address with unexpected size (e.g. IPv4 address with length 5).

    In combination with the prior point, this creates a transitive DOS attack in the presence of future or incomplete implementations. E.g. IETF smokes a great big bong and creates an IPv8 protocol that has 136 bit addresses. Bitcoin software is updated with a IPv8 type with a limit of 17 byte addresses. Now I start announcing v8-type addresses with 18 bytes in them, and non-upgraded peers now suddenly start getting all their addr messages ignored by newer peers because they’re relaying address types that they don’t understand which violate rules they don’t know about.

    Can be resolved by strictly filtering things you know about and not relaying anything else.

  27. vasild commented at 1:39 pm on July 13, 2020: contributor

    FWIW the implementation at https://github.com/bitcoin/bitcoin/pull/19031 ignores addresses from unknown networks.

    The text in this PR reads:

    To allow for future extensibility, clients MAY store and are RECOMMENDED to gossip addresses with unknown network ID and addresses for networks to which the given client is not currently connected to. Gossiping all addresses, regardless of network ID, would be useful for multi-homed nodes and would make it more difficult for an attacker to assess whether a node is multi-homed or not.

    What about changing it to:

    Clients are RECOMMENDED to gossip addresses from all known networks even if they are currently not connected to some of them. That could help multi-homed nodes and make it more difficult for an observer to tell which networks a node is connected to.

    Clients are NOT RECOMMENDED to gossip addresses from unknown networks because they have no means to validate those addresses and so can be tricked to gossip invalid addresses resulting in a possible ban from other honest nodes. @naumenkogs ?

  28. gmaxwell commented at 11:08 pm on July 13, 2020: contributor
    I would go stronger and say SHOULD NOT because if a participant relays an address they don’t understand it may violate the validity rules and get their messages ignored.
  29. naumenkogs commented at 3:20 pm on July 15, 2020: member

    I agree with the following suggestion from Greg:

    I would go stronger and say SHOULD NOT because if a participant relays an address they don’t understand it may violate the validity rules and get their messages ignored.

    I would drop the following part from your suggestion, because it’s too specific while talking about the behavior that doesn’t exist:

    resulting in a possible ban from other honest nodes

    I would make it more general, although I don’t have a strong opinion here.

  30. vasild commented at 9:07 am on July 16, 2020: contributor

    Clients are RECOMMENDED to gossip addresses from all known networks even if they are currently not connected to some of them. That could help multi-homed nodes and make it more difficult for an observer to tell which networks a node is connected to.

    Clients SHOULD NOT gossip addresses from unknown networks because they have no means to validate those addresses and so can be tricked to gossip invalid addresses.

    Good?

  31. gmaxwell commented at 6:08 pm on July 16, 2020: contributor
    sounds fine to me
  32. BIP155: include changes from followup discussions
    * Increase the maximum length of an address from 32 to 512 bytes and
      clarify that the entire message should be rejected if it contains
      a longer address.
      (from https://github.com/bitcoin/bips/pull/766#issuecomment-519248699)
    
    * Remove a contradiction about unknown address types - "MUST ignore" VS
      "MAY store and gossip".
    
    * Recommend gossiping addresses for networks to which the node is not
      currently connected to.
      (from https://github.com/bitcoin/bips/pull/766#issuecomment-545067608)
    
    * Clarify that the entire message should be rejected if it contains an
      address with unexpected size (e.g. IPv4 address with length 5).
    42ee3f5c15
  33. BIP155: use a dedicated message for signalling
    Change signaling of support for the new `addrv2` messages to be done via
    a dedicated message `sendaddrv2` instead of protocol bump.
    
    The drawback of using a protocol bump is that the protocol version is
    not a bitmask and if a node wants to announce support for `addrv2` this
    would imply support for all prior features included in that protocol
    version.
    350189ad48
  34. vasild force-pushed on Jul 16, 2020
  35. vasild commented at 6:41 pm on July 16, 2020: contributor
    Updated as in #907 (comment), render preview.
  36. vasild commented at 9:00 am on August 13, 2020: contributor
    @naumenkogs, @gmaxwell dare to ACK this one? Last activity here is that I reworded it as per your suggestion.
  37. in bip-0155.mediawiki:135 in 350189ad48
    134 See the appendices for the address encodings to be used for the various networks.
    135 
    136-==Compatibility==
    137+==Signaling support and compatibility==
    138+
    139+Introduce a new message type <code>sendaddrv2</code>. Sending such a message indicates that a node can understand and prefers to receive <code>addrv2</code> messages instead of <code>addr</code> messages. I.e. "Send me addrv2".
    


    jonatack commented at 9:34 am on August 13, 2020:
    s/. I.e./, e.g./

    troygiorshev commented at 5:57 pm on August 25, 2020:

    I’m not sure the plain language phrase is needed, the explanation is clear enough without it.

    If it’s kept, then I think “i.e.” is correct. It’s a rewording of the previous sentence, not giving an example of a larger class of things.


    jonatack commented at 7:22 pm on August 25, 2020:

    If it’s kept, then I think “i.e.” is correct. It’s a rewording of the previous sentence, not giving an example of a larger class of things.

    I agree.

    Still prefer s/. I.e./, i.e./


    troygiorshev commented at 7:30 pm on August 25, 2020:

    Yep, agree i.e. is preferred over I.e.

    I guess ideally (in the case of it being kept) it would be something like: … messages, i.e. “Send me addrv2”. … messages - i.e. “Send me addrv2”.

  38. in bip-0155.mediawiki:139 in 350189ad48
    144-<source lang="c++">
    145-//! gossiping using `addrv2` messages starts with this version
    146-static const int GOSSIP_ADDRV2_VERSION = 70016;
    147-</source>
    148-For older peers keep sending the legacy <code>addr</code> message, ignoring addresses with the newly introduced address types.
    149+For older peers, that did not emit <code>sendaddrv2</code>, keep sending the legacy <code>addr</code> message, ignoring addresses with the newly introduced address types.
    


    jonatack commented at 9:35 am on August 13, 2020:

    s/peers, /peers /

    style: s/keep/continue/

  39. jonatack commented at 9:35 am on August 13, 2020: contributor

    ACK 350189a

    Can ignore the comments unless you need to retouch.

  40. vasild commented at 9:59 am on August 13, 2020: contributor

    @jonatack thanks, I will amend those if I retouch.

    Since this PR is settled and I believe can/should be merged, I opened another one for further changes at #967. They do not conflict or depend on each other semantically or textually.

  41. naumenkogs commented at 11:40 am on August 23, 2020: member
    ACK 350189a
  42. in bip-0155.mediawiki:137 in 350189ad48
    136-==Compatibility==
    137+==Signaling support and compatibility==
    138+
    139+Introduce a new message type <code>sendaddrv2</code>. Sending such a message indicates that a node can understand and prefers to receive <code>addrv2</code> messages instead of <code>addr</code> messages. I.e. "Send me addrv2".
    140+
    141+<code>sendaddrv2</code> SHOULD be sent after receiving the <code>verack</code> message from the peer.
    


    troygiorshev commented at 6:53 pm on August 25, 2020:

    I don’t think this is completely clear.

    If it is supposed to mean “… as opposed to before receiving the verack message from the peer” (like almost every other message) then: s/SHOULD/MUST/

    If it is supposed to mean “… and before anything else” (i.e. soon after receiving the verack) then maybe it should be clarified to say that, but I’m not sure what precisely can be said. Looking at the implementation it seems like we could technically send sendaddrv2 whenever after verack, and honestly I don’t see a huge problem with that.

  43. in bip-0155.mediawiki:129 in 350189ad48
    127-Clients SHOULD reject addresses that have a different length than specified in this table for a specific address ID, as these are meaningless.
    128+Clients SHOULD NOT gossip addresses from unknown networks because they have no means to validate those addresses and so can be tricked to gossip invalid addresses.
    129+
    130+Further network ID numbers MUST be reserved in a new BIP document.
    131+
    132+Clients SHOULD reject messages that contain addresses that have a different length than specified in this table for a specific network ID, as these are meaningless.
    


    troygiorshev commented at 6:53 pm on August 25, 2020:

    Comparing L123 and L129 here.

    RECOMMENDED and SHOULD are synonyms in RFC 2119. If we are intending these to mean two different things, maybe the first (RECOMMENDED) should be changed to MAY?

  44. troygiorshev commented at 7:03 pm on August 25, 2020: none

    ACK 350189a

    Reviewed the discussion around gossiping unknown networks most closely.

    All of the following are RFC 2119 nits and can probably be ignored.

    The blockquote regarding RFC 2119 at the beginning is wrong (though it’s such a popular error that it probably doesn’t need to be fixed). From RFC 2119’s Errata, “NOT RECOMMENDED” is missing, it should say

    The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “NOT RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.

  45. laanwj commented at 9:41 pm on September 2, 2020: member

    There is some demand for peers to be able to communicate not being interested in address broadcasts at all. @sdaftuar @sipa @naumenkogs

    E.g. light clients might not want unsolicited addr messaages and would not partake in addr relays. The only time they’d want to receive address messages is in direct response to getaddr.

    This is not possible at this point (the current code assumes every peer takes part in this). But we could change this as part of addrv2. A possibility would be to make this a parameter of the sendaddrv2 signalling message. E.g.: a boolean that is true to receive addr relays, but false otherwise.

  46. gmaxwell commented at 9:52 pm on September 2, 2020: contributor
    Are there distinct cases of “I don’t care what addr you send me, but I’m not going to relay any” and “don’t send me unsolicited stuff”? Bitcoin Core should care about the former– as it doesn’t want to select non-relayers for low fanout relay. The latter sounds fine too but I dunno if anyone even wants it.
  47. sipa commented at 10:05 pm on September 2, 2020: member

    @gmaxwell If you’re suggesting to have 3 levels:

    1. Don’t send me unsolliciated addr
    2. Send me unsollicited addr, but I won’t relay further
    3. Send me unsollicited addr, and I will relay further

    I’m not sure that’s desirable. An implementation would want to pick 1-2 fanout peers among the level (3) ones, but relay to some extent to the level (2) ones in addition to that.

    If that’s the case, an attacker would be able to get a higher than proportional number of addr messages revealed to them by making many connections to a peer at level (2).

  48. gmaxwell commented at 11:30 pm on September 2, 2020: contributor
    No, actually the other way around. I’m suggesting that “I will relay stuff” and “I won’t relay stuff” is the only interesting criteria. I don’t think anyone cares if peers send them addr messages they don’t need.
  49. naumenkogs commented at 7:48 am on September 3, 2020: member

    E.g. light clients might not want unsolicited addr messages and would not partake in addr relays.

    I previously thought that this will help with better control of unsolicited addr forwarding, but we’ll be fine either way considering this observation by @sipa: An implementation would want to pick 1-2 fanout peers among the level (3) ones, but relay to some extent to the level (2) ones in addition to that.

    I guess they might have the bandwidth constraints and that’s the only real reason. And even that is probably no big deal. It’s hard for me to predict what will happen to addr relay in the future, but assuming no huge structural changes, the addr traffic should remain low. Especially for non-relaying clients, considering the same observation above.

  50. sipa commented at 8:23 am on September 3, 2020: member

    @gmaxwell I think my point is that whether we treat it as “i will/won’t relay” or as “i want/don’t want unsollicited addresses” doesn’t matter, because how we treat them should be identical: if someone doesn’t want addresses at all, we can use that to bias relay away from them, but we can’t bias way from them but still relay to them.

    I think it’s easiest to call it a “give me unsolicited addresses or not” flag and bitcoin core will then relay uniformly over all that want addresses.

  51. sdaftuar commented at 10:59 pm on September 3, 2020: member

    Seems like it’s just a difference in how we describe the boolean, and not what we expect the intended behavior to be – but I think it would make sense to add a boolean that we could use to determine whether the peer is participating in addr relay or not (which in turn would determine whether we send them unrequested addr messages or not).

    The only distinction I can think of between those two ideas is if a node might want to announce its own address to all its peers, even the ones that don’t participate in addr relay? If we would, then I think characterizing the boolean as “participates in addr relay” would be the right descriptor.

  52. naumenkogs commented at 7:11 am on September 4, 2020: member

    “participates in addr relay” would be the right descriptor.

    Then setting it to false might be (mistakenly I assume?) interpreted as “a node doesn’t respond to GETADDR request”. In that case probably be more explicit and say “a node doesn’t forward unsolicited addr messages”

  53. sdaftuar commented at 1:41 pm on September 14, 2020: member

    What do people think about separating message types so that the response to a getaddr is differentiated from unsolicited addr messages?

    Right now we have no ability to differentiate the two; in Bitcoin Core we make a guess based on whether we’ve sent a getaddr but race conditions can make this difficult.

    I’m not sure if this is too large a change to make this late in the process – curious what others think.

  54. naumenkogs commented at 9:33 am on September 15, 2020: member

    What do people think about separating message types so that the response to a getaddr is differentiated from unsolicited addr messages?

    The discussion in #19794 demonstrated that it can be useful to clarify the ADDR handling logic, and I don’t see any negative implications of this change. It may be even more useful when we start sending more than one GETADDR per peer.

    I see 2 alternatives:

    • differentiate based on the addr message size and hope that no edge cases occur (like very little AddrMan of the sender)
    • differentiate on the receiving time and rely on delays

    Right now we sort of rely on the combination of these two it seems. #19794 demonstrated that these alternatives are hard to reason about.

    I think an explicit flag in the ADDR message is probably the best choice.

  55. vasild commented at 10:32 am on September 15, 2020: contributor

    If we differentiate the two, then what shall we do if we get an unsolicited “response to getaddr”?

    Why is it that we don’t want to gossip responses to getaddr?

    https://github.com/bitcoin/bitcoin/blob/fa7e407b504bc60c77341f02636ed9d6a4b53d79/src/net_processing.cpp#L2623

    Why not always gossip up to 10 randomly picked addresses, regardless of whether we sent getaddr or not?

    The current code seems a bit strange - if we receive an addr message with 10 addresses it will gossip all of them, but if we receive a message with 11 addresses it will not gossip any of them. Wouldn’t it be better, if we don’t want to gossip too much, to gossip up to 10 addresses from bigger messages instead of zero?

  56. sdaftuar commented at 12:19 pm on September 15, 2020: member

    If we differentiate the two, then what shall we do if we get an unsolicited “response to getaddr”?

    I think we can just specify in the BIP how errors are handled; my inclination is that disconnecting peers who violate the spec is probably the best way to ensure that software is coded up correctly when implemented, but this could be discussed if others feel differently.

    Why is it that we don’t want to gossip responses to getaddr?

    Though relevant, I don’t think this is the central question – I think the central question is, could there ever be situations where differentiating between the response to a getaddr and unsolicited addr relay is desirable? And I think based on the history of code in Bitcoin Core, at least, the answer is yes: we try to differentiate between the two for both determining addr relay policy, and for correct implementation of addr_fetch peers (see https://github.com/bitcoin/bitcoin/issues/19694).

    I would add that one of the lessons I learned after implementing BIP 130 (sendheaders) is that reusing headers messages as a response to a getheaders message, and for unsolicited headers-relay, was a mistake, which has resulted in additional complexity in our headers handling, and has made some problems very hard to solve. It’s long been on my list to revamp how we do headers relay, and in part, disambiguate the two, so that we can handle the different cases more easily.

    I think that same basic lesson applies here: it’s basically free to disambiguate between the two, implementations that don’t care can just ignore the disambiguation and treat both types of messages the same, while implementations that want to do something different would have the ability to do so. In my view the main question would be whether it’s worth wrapping such a change into this proposal (which has been open for some time and making progress towards deployment), or deferring this proposal to the future?

  57. sipa commented at 6:28 pm on September 15, 2020: member

    @sdaftuar I think there may be another angle to this, which makes addr handling very different from block header handling: headers are verifiable data structures, which permit detecting/disconnecting/punishing peers that send invalid ones. Addresses are just unauthenticated data, so if semantics differ between “addresses just gossipped” and “addresses in response to getaddr”, we should be mindful that an attacker could at any time choose between them without any real consequences.

    It does seem simpler for protocol handling to distinguish them, but wouldn’t this work as well?

    • After a getaddr, the first 1000 responses - regardless of how they’re split over addr messages - are treated as the response to it. This only really matters to ADDR_FETCH nodes.
    • All addr messages, regardless of size, are treated as candidates for relay. There is already an existing criterion for relay which will make it mostly false for (honest) responses to getaddr, namely the nTime field which must have a timestamp at most 10 minutes in past - as getaddr responses are historical ones, they will generally not qualify. Beyond that we could add a leaky bucket counter or so, that rate limits how many incoming addrs we treat as relayable (to prevent malicious load).
    • To prevent spamming/sybilling addrman, use a separate rate limiter to determine when/how many addresses to insert into it from any given peer.

    I’m also happy to be convinced that there aren’t any useful attacks that could be used if the two types of responses are distinguished. But given that addr’s timestamps are already kind of a proxy for “currently relyable”, maybe there is no need?

  58. sipa cross-referenced this on Sep 15, 2020 from issue p2p: Remove fGetAddr flag by mzumsande
  59. naumenkogs commented at 7:23 am on September 16, 2020: member

    It does seem simpler for protocol handling to distinguish them, but wouldn’t this work as well?

    I think this works in honest cases, but when considering an adversary, it feels like the mechanism you described is more prone just due to complexity. Remember that we sometimes do unsolicited relay on behalf of other nodes (if addr < 10) and such. “Malicious load handler” and rate limiters may be tricked by this too.

    “Explicit ADDR flag” is, on the other hand, as robust as current ADDR handling, at least the implementation I envision in context of #19794 it can be.

    This is just my intuition for now, I don’t have an exact way of breaking your mechanism, especially since it’s only an idea. Maybe once I see the code, I change my mind.

  60. Sjors commented at 3:43 pm on September 16, 2020: member

    Do we want to require nodes to perform some minimum sanity checks on address types they gossip? Currently the BIP only requires that the length field is enforced. The currently proposed Bitcoin Core implementation https://github.com/bitcoin/bitcoin/pull/19845 performs an additional check, namely the TORv3 checksum, but just ignores the item if there’s a problem. The BIP lists additional properties that a node could check, such as the public key being a valid ed25519 pubkey. The scenario I’m worried about is an attacker gossiping a whole bunch of nonsense that’s expensive for nodes to verify, but with no way to punish them because we can’t tighten this rule later.

    I’m also a bit worried about nodes receiving - even if they don’t relay - huge amounts of nonsense in the form of fake “future addresses”. But afaik it doesn’t matter what the maximum size is per address, so it’s a bit orthogonal to the changes in this PR.

  61. sipa commented at 5:59 pm on September 16, 2020: member
    LGTM. I think it would be good to have these changes (which are now being discussed as part of BIP155 anyway) reflected in the BIP. There are a few follow-up discussions still happening here, but I think it would be better if the BIP is updated to at least the things in this PR.
  62. sipa commented at 6:04 pm on September 16, 2020: member

    @Sjors There are no checksums in BIP155 TORv3 handling, only in the conversion from/to human readable “.onion” notation.

    I don’t think any further checks should be performed. Sending bogus invalid addresses is not more harmful to the network than sending bogus valid addresses, so no need to waste time on it. Invalid ones will be caught by Tor, which is in a much better position to judge validity.

  63. sdaftuar commented at 6:51 pm on September 16, 2020: member

    Addresses are just unauthenticated data, so if semantics differ between “addresses just gossipped” and “addresses in response to getaddr”, we should be mindful that an attacker could at any time choose between them without any real consequences.

    I don’t think this is exactly true, at at least it shouldn’t be – “addresses in response to getaddr” would only be allowed if a getaddr were actually sent (presumably). I suppose we need semantics around how many response messages a peer could provide; if sending a getaddr once allowed the peer to send “responses” to that throughout the lifetime of the connection, interspersed with other addr messages, that would be odd.

    However if we put some reasonable bounds on the response I think this concern should go away (one simple rule would be that addr-response messages must come sequentially, so that as soon as a network message arrives that isn’t an addr-response, the response is over – we could additionally bound the time from sending a getaddr to receiving the responses).

    With something like that, we could view addr traffic as two separate protocols:

    • a protocol for unsolicited addr messages
    • a protocol for an “addr_request” and “addr_response” pair of messages

    So these protocols can only overlap after an addr_request is sent – outside of such a window, these two protocols are non-overlapping.

    If we had either one and not the other, it’s hard to imagine that the system would be definitely attackable by an adversary, and it’s hard for me to imagine that if we had one already, that adding the other would definitely break anything, since mostly these don’t overlap. That is not to say that any given implementation couldn’t be broken – but to argue against having a way to distinguish, I think you’d have to argue that all possible (useful) implementations where the two kinds of data are distinguished are definitely attackable?

    I suspect that we have insufficient study/research at this point to be able to make that claim, so given that we have at times in the past (and at present in the Bitcoin Core code) tried to distinguish these cases, I’d say we should go ahead and do so at the protocol level if we have a good opportunity for doing so. My basic assumption here is that honest nodes participating in this protocol would make some things more efficient, while it shouldn’t necessarily be more vulnerable to attack by an adversary than today’s protocol.

  64. sipa commented at 7:53 am on September 17, 2020: member
    @sdaftuar Right - if you think of it as two different protocols that seems clearly fine. I have a feeling it’s possible to merge them, but that’s probably harder than simply adding a new message here while the rest of address relay is being touched anyway.
  65. laanwj commented at 8:07 pm on September 17, 2020: member
    ACK 350189ad489c99a426915b123502a8339ef8d1ba I think this is fine as it is. We can leave adding a “relays addresses” flag to a future PR, but it seems that needs some more discussion anyway.
  66. luke-jr merged this on Sep 23, 2020
  67. luke-jr closed this on Sep 23, 2020

  68. mzumsande cross-referenced this on Apr 21, 2021 from issue test: Extend functional tests for addr relay by mzumsande
  69. vasild deleted the branch on Jun 16, 2021
  70. vasild cross-referenced this on Jun 16, 2021 from issue [p2p] Stop sending SENDADDRV2 message to block-relay-only peers by amitiuttarwar

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bips. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-30 01:10 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me