Make AddrMan support multiple ports per IP #23306

pull sipa wants to merge 1 commits into bitcoin:master from sipa:202110_addrmanmultiport changing 5 files +53 −62
  1. sipa commented at 4:14 am on October 19, 2021: member

    For a long part of Bitcoin’s history, this codebase has aggressively avoided making automatic connections to anything but nodes running on port 8333. I’d like to propose changing that, and this is a first PR necessary for that.

    The folklore justification (eventually actually added as a comment to the codebase in #20668) is that this is to prevent the Bitcoin P2P network from being leveraged to perform a DoS attack on other services, if their IP/port would get rumoured. It appears, at least the current network scale - and probably significantly larger - that the impact is very low at best (see calculations by vasild in #5150 (comment) e.g.). Another possible justification would be a risk that treating different IP:port combinations separately would help perform Eclipse attacks (by an attacker rumouring their own IP with many ports). This concern is (a) no different than what is possible with IPv6 (where large ranges of IP addresses are very cheaply available), and (b) already hopefully sufficiently addressed by addrman’s design (which limits access through based selected based on network groups).

    And this policy has downsides too; in particular, a fixed port is easy to detect, and a very obvious sign a Bitcoin node is running there.

    One obstacle in moving away from a default port that is the fact that addrman is currently restricted to a single entry per IP address. If ports are no longer expected to be generally always the default one, we need to deal with the case where conflicting information is relayed. It turns out there is a very natural solution to this: treat (IP,port) combination exactly as we’re treating IPs now; this automatically means that the same IP may appear with multiple ports, simply because those would be distinct entries. Given that indexing into addrman’s bucket already uses the port number, the only change required is making all addrman lookup be (IP,port) (aka CService) based, rather than IP (aka CNetAddr) based.

    This PR doesn’t include any change to the actual outbound connection preference logic, as perhaps that’s something that we want to phase in more gradually.

  2. fanquake added the label P2P on Oct 19, 2021
  3. DrahtBot commented at 8:26 am on October 19, 2021: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #23077 (Full CJDNS support by vasild)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. ghost commented at 9:15 am on October 19, 2021: none

    For a long part of Bitcoin’s history, this codebase has aggressively avoided making automatic connections to anything but nodes running on port 8333. I’d like to propose changing that, and this is a first PR necessary for that.

    Concept ACK for allowing automatic connections to nodes which are not using default ports

    This concern is (a) no different than what is possible with IPv6 (where large ranges of IP addresses are very cheaply available), and (b) already hopefully sufficiently addressed by addrman’s design (which limits access through based selected based on network groups).

    Agree

    And this policy has downsides too; in particular, a fixed port is easy to detect, and a very obvious sign a Bitcoin node is running there.

    I found UA string as the easiest way to look for Bitcoin nodes while using shodan in which I could also see few nodes using non-default ports:

    image

    image

    0    "addr": "193.32.127.161:58477",
    1    "addrbind": "192.168.1.4:60877",
    2    "network": "ipv4",
    

    Ignore if its not related: Few users use Tor hidden services and different ports for RPC as explained in Learning-Bitcoin-from-the-Command-Line/Verifying_Your_Tor_Setup.md

  5. dhruv commented at 2:32 pm on October 19, 2021: member

    Concept ACK

    For a long part of Bitcoin’s history, this codebase has aggressively avoided making automatic connections to anything but nodes running on port 8333. I’d like to propose changing that, and this is a first PR necessary for that.

    Port 8333 is the cheapest fingerprint for a Bitcoin node. The possibility of port randomization is quite exciting as we will be able to use the raised floor of fingerprinting cost from BIP324.

  6. jamesob commented at 6:04 pm on October 19, 2021: member
    Concept ACK. Changes look pretty straightforward.
  7. lsilva01 approved
  8. lsilva01 commented at 1:56 am on October 20, 2021: contributor

    Tested ACK 5697d60 on Ubuntu 20.04.

    CNetAddr has been replaced by CService in mapAddr in this change. CService is a subclass of CNetAddr that has the port attribute. Started node with an existing peers.dat file and with no file. It had no impact on serialization / unserialization as this is done using AddrInfo.

  9. in src/test/fuzz/addrman.cpp:158 in 5697d6054a outdated
    160+            return (size_t)hasher.Finalize();
    161         };
    162 
    163         AddrInfoEq addrinfo_eq = [](const AddrInfo& lhs, const AddrInfo& rhs) {
    164-            return static_cast<CNetAddr>(lhs) == static_cast<CNetAddr>(rhs) &&
    165+            return static_cast<const CService&>(lhs) == static_cast<const CService&>(rhs) &&
    


    vasild commented at 10:01 am on October 20, 2021:
    Why cast to reference? Is that not the same as static_cast<const CService>(...)?

    sipa commented at 12:32 pm on October 21, 2021:
    static_cast<const CService> constructs a new CService object which is a copy of the CService inside the AddrInfo. static_cast<const CService&> just constructs a reference to that CService object.

    jnewbery commented at 9:47 am on October 22, 2021:

    Maybe clearer (or maybe not):

    0            return std::tie(static_cast<const CService&>(lhs), lhs.source, lhs.nLastSuccess, lhs.nAttempts, lhs.nRefCount, lhs.fInTried) ==
    1                   std::tie(static_cast<const CService&>(rhs), rhs.source, rhs.nLastSuccess, rhs.nAttempts, rhs.nRefCount, rhs.fInTried);
    

    sipa commented at 4:09 pm on October 22, 2021:
    @jnewbery I like that; done.
  10. practicalswift commented at 10:01 am on October 20, 2021: contributor

    The folklore justification (eventually actually added as a comment to the codebase in #20668) is that this is to prevent the Bitcoin P2P network from being leveraged to perform a DoS attack on other services, if their IP/port would get rumoured. […]  Another possible justification would be a risk that treating different IP:port combinations separately would help perform Eclipse attacks (by an attacker rumouring their own IP with many ports).

    A third concern (which I think can be addressed: see below) could be that Bitcoin Core users would risk having to deal with bogus complaints along the lines of “why am I seeing a failed SSH-login to my server on port 22 from your IP-address w.x.y.z?” if we allowed for automatic outgoing connections on all ports by default.

    When opening up for additional port numbers we might want to look at how browsers are tackling “bad ports”:

    Avoiding “bad ports” (or a subset of these ports) by default is probably a cheap way to reduce the likelihood of Bitcoin Core users having to deal with these types of bogus complaints.

    An alternative route would be to explicitly whitelist (by default) a set of “good ports” which a.) have low likelihood of causing said type of complaints, and b.) are unlikely to be firewalled.

  11. in src/test/fuzz/addrman.cpp:155 in 5697d6054a outdated
    156+            hasher.Write(addr_key.size());
    157+            hasher.Write(source_key.size());
    158+            hasher.Write(addr_key.data(), addr_key.size());
    159+            hasher.Write(source_key.data(), source_key.size());
    160+            return (size_t)hasher.Finalize();
    161         };
    


    vasild commented at 10:08 am on October 20, 2021:
    This does not take into account CNetAddr::m_net. So, two addresses from different networks with the same address bytes would hash the same. I think this is fine coz IPv4 and IPv6 have different length of the address. It could happen that a Tor address hashes the same as I2P address if they have identical 32 bytes :eyes:. I think this is ok for tests.

    sipa commented at 4:09 pm on October 22, 2021:
    Fixed; I added a GetNetwork call to both the address and the source.
  12. in src/test/fuzz/addrman.cpp:150 in 5697d6054a outdated
    151+            auto source_key = a.source.GetAddrBytes();
    152+            hasher.Write(a.nLastSuccess);
    153+            hasher.Write(a.nAttempts);
    154+            hasher.Write(a.nRefCount);
    155+            hasher.Write(a.fInTried);
    156+            hasher.Write(addr_key.size());
    


    vasild commented at 10:08 am on October 20, 2021:
    Why write the size? Is it not unnecessary given that we also write the data?

    sipa commented at 4:11 pm on October 22, 2021:
    Just a general principle: when serializing data, always do it in a way that would be deserializable (even if it is being fed to a hasher where that doesn’t matter). This avoids the situation where you have multiple variant-length fields, and you get collisions between (shorter+longer) and (longer+shorter). One way of accomplishing that is also encoding the length of every variable-length piece of data.
  13. in src/addrman_impl.h:182 in 5697d6054a outdated
    179@@ -180,7 +180,7 @@ class AddrManImpl
    180     std::unordered_map<int, AddrInfo> mapInfo GUARDED_BY(cs);
    181 
    182     //! find an nId based on its network address
    


    vasild commented at 12:10 pm on October 20, 2021:
    0    //! find an nId based on its network address and port
    

    sipa commented at 4:11 pm on October 22, 2021:
    Fixed.
  14. vasild approved
  15. vasild commented at 12:31 pm on October 20, 2021: member

    ACK 5697d6054a1aaec0f89a8c8ce792d5e9f5c02314

    TODO: once this PR and #22839 are merged, ensure that all logging of addresses from within addrman include the port (i.e. avoid ToStringIP()).

  16. in src/test/addrman_tests.cpp:230 in 5697d6054a outdated
    228+    BOOST_CHECK_EQUAL(addrman.size(), 2U);
    229     auto addr_ret2 = addrman.Select().first;
    230-    BOOST_CHECK_EQUAL(addr_ret2.ToString(), "250.1.1.1:8333");
    231+    BOOST_CHECK(addr_ret2.ToString() == "250.1.1.1:8333" || addr_ret2.ToString() == "250.1.1.1:8334");
    232 
    233     // Test: Add same IP but diff port to tried table, it doesn't get added.
    


    mzumsande commented at 10:16 pm on October 20, 2021:
    comment is outdated, it does get added because the IP will now make it to new, and then to tried

    sipa commented at 4:11 pm on October 22, 2021:
    Fixed.
  17. naumenkogs commented at 12:13 pm on October 21, 2021: member

    Concept ACK. Interested in addressing the point by practicalswift (either dismissing or accepting, but with rationale).

    Otherwise, I agree this change is beneficial and I don’t see any real harm.

  18. mzumsande commented at 6:01 pm on October 21, 2021: member

    Concept ACK

    In four different places in Addrman, there is the pattern

     0    AddrInfo* pinfo = Find(addr);
     1
     2    // if not found, bail out
     3    if (!pinfo)
     4        return;
     5
     6    AddrInfo& info = *pinfo;
     7
     8    // check whether we are talking about the exact same CService (including same port)
     9    if (info != addr)
    10        return;
    

    I think the final check for info != addr could be removed now, since with this PR, Find() only returns an entry if there is a match including the port (and the first return would be hit if not).

  19. sipa commented at 8:49 pm on October 21, 2021: member
    @practicalswift That’s a reasonable concern. I think as a first step, the policy could e.g. still maintain the deprioritization of connections to ports numbered < 1024.
  20. vasild commented at 7:58 am on October 22, 2021: member

    I like the idea of “bad ports”, especially if popular softwares like Firefox and Chromium agree on what is a “bad port”. Trying to open http://bitcoin.org:22/ in Firefox gives me “This address is restricted” error.

    This discussion belongs to the next PR, which removes the 8333 preference:

    https://github.com/bitcoin/bitcoin/blob/9469ffcb17f0018b61f85f835cc8b49370d26d1f/src/net.cpp#L2070

    Maybe when removing that, in the same PR, we should introduce the concept for “bad ports”. I wonder if it would be as easy as adding CService::IsValid() which calls the parent’s CNetAddr::IsValid() and does an extra check like:

    0switch (port) {
    1case 1: return false; // tcpmux
    2case 7: return false; // echo
    3case 9: return false; // discard
    4...
    5}
    6return true;
    
  21. in src/test/addrman_tests.cpp:375 in 5697d6054a outdated
    368@@ -369,18 +369,18 @@ BOOST_AUTO_TEST_CASE(addrman_find)
    369     CNetAddr source2 = ResolveIP("250.1.2.2");
    370 
    371     BOOST_CHECK(addrman.Add({addr1}, source1));
    372-    BOOST_CHECK(!addrman.Add({addr2}, source2));
    373+    BOOST_CHECK(addrman.Add({addr2}, source2));
    374     BOOST_CHECK(addrman.Add({addr3}, source1));
    375 
    376     // Test: ensure Find returns an IP matching what we searched on.
    


    jnewbery commented at 9:40 am on October 22, 2021:
    0    // Test: ensure Find returns an IP/port matching what we searched on.
    

    sipa commented at 4:11 pm on October 22, 2021:
    Fixed.
  22. in src/test/addrman_tests.cpp:380 in 5697d6054a outdated
    377     AddrInfo* info1 = addrman.Find(addr1);
    378     BOOST_REQUIRE(info1);
    379     BOOST_CHECK_EQUAL(info1->ToString(), "250.1.2.1:8333");
    380 
    381-    // Test 18; Find does not discriminate by port number.
    382+    // Test 18; Find does discriminate by port number.
    


    jnewbery commented at 9:41 am on October 22, 2021:
    0    // Test: Find discriminates by port number.
    

    sipa commented at 4:12 pm on October 22, 2021:
    Done.
  23. jnewbery commented at 9:49 am on October 22, 2021: member

    Concept ACK. Couple of minor comments inline.

    I agree with @mzumsande about removing the // check whether we are talking about the exact same CService (including same port) logic from _Good(), _Attempt(), _Connected() and _SetServices().

  24. laanwj commented at 12:55 pm on October 22, 2021: member

    Concept ACK, I think this makes sense. More flexibility with regard to ports is better, allows people to accept incoming connections on their node even if they can’t bind to specific port 8333 (e.g. on a shared router). Also, it allows to route around the dumbest kind of ISP firewall blocking.

    @practicalswift That’s a reasonable concern. I think as a first step, the policy could e.g. still maintain the deprioritization of connections to ports numbered < 1024.

    I like this idea. Labeling specific “bad ports” could work too, but I don’t see any problems with de-prioritizing the entire <1024 range, after all, to use them one would have to run bitcoind as root on many operating systems, which is inadvisable.

  25. practicalswift commented at 3:20 pm on October 22, 2021: contributor

    @sipa

    A third concern (which I think can be addressed: see below) could be that Bitcoin Core users would risk having to deal with bogus complaints along the lines of “why am I seeing a failed SSH-login to my server on port 22 from your IP-address w.x.y.z?” if we allowed for automatic outgoing connections on all ports by default.

    @practicalswift That’s a reasonable concern. I think as a first step, the policy could e.g. still maintain the deprioritization of connections to ports numbered < 1024.

    Sounds resonable. That should resolve this concern. Perhaps port 80 and 443 could be allowed too (in addition to >= 1024) in order to accommodate users behind very strict firewalls.

    FWIW, IETF Network Working Group’s “Implications of running Internet over ports 80 and 443”: “Users are often connected to Internet with very few outgoing ports available, such as only port 80 and 443 over TCP. This situation has many implications on designing, deploying and using IETF protocols, such as encaspulating protocols within HTTP, difficulty to do traffic engineering, quality of service, peer-to-peer, multi-channel protocols or deploying new transport protocols. This document describes the situation and its implications.”

    From the same paper:

    • “A trend started many years ago has been to provide Internet access to end-users with limited outgoing ports. The most constraint but common case is to only have outgoing TCP port 80 and port 443 opened.”
    • “A consequence of this trend is that Internet statistics show [Labovitz] that a majority of the Internet traffic is over port 80 and 443. And the concentration on these ports are further increasing every year.”
  26. vasild commented at 3:30 pm on October 22, 2021: member
    There are good ports <1024 (e.g. 80, 443) and bad ports >1024 (e.g. 6000). I think that if we are going to do if (port < 1024 && port != 80 && port != 443) (poor man bad ports approximation) we might as well do the proper if (port == 1 || port == 7 || ...) (or the switch from above).
  27. Make AddrMan support multiple ports per IP 92617b7a75
  28. sipa force-pushed on Oct 22, 2021
  29. jnewbery commented at 5:22 pm on October 22, 2021: member
    Code review ACK 92617b7a758c0425330fba4b886296730567927c
  30. sipa commented at 6:49 pm on October 22, 2021: member
    There is plenty to discuss still on exactly how the relax the peer port preference policy practically, which is probably best left for an actual PR that implements that.
  31. naumenkogs commented at 7:21 am on October 25, 2021: member
    ACK 92617b7a758c0425330fba4b886296730567927c
  32. vasild approved
  33. vasild commented at 7:52 am on October 25, 2021: member
    ACK 92617b7a758c0425330fba4b886296730567927c
  34. ajtowns commented at 2:41 pm on October 25, 2021: member

    ACK 92617b7a758c0425330fba4b886296730567927c

    If I’m reading it right, it seems like ~5% of addresses in my addrman use non-default ports (and none of them seem to be bad ports like ssh or ircd), so moving towards more egalitarian treatment of those nodes seems like a good thing to me. I agree we should have some check for bad ports as part of the next PR that moves towards enabling that.

  35. sipa commented at 2:44 pm on October 25, 2021: member
    Mental note: investigate what happens when downgrading, and you have a peers.dat with multiple entries for the same IP?
  36. MarcoFalke merged this on Oct 25, 2021
  37. MarcoFalke closed this on Oct 25, 2021

  38. MarcoFalke commented at 3:05 pm on October 25, 2021: member

    Looks like a downgrade will fail with:

    Error: Invalid or corrupt peers.dat (Corrupt data. Consistency check failed with code -5: iostream error). If you believe this is a bug, please report it to https://github.com/bitcoin/bitcoin/issues. As a workaround, you can move the file ("/tmp/regtest/peers.dat") out of the way (rename, move, or delete) to have a new one created on the next start.

  39. sidhujag referenced this in commit c1956e4522 on Oct 25, 2021
  40. fanquake deleted a comment on Oct 28, 2021
  41. vasild referenced this in commit 1a8a382787 on Oct 28, 2021
  42. vasild commented at 2:46 pm on October 28, 2021: member

    Looks like a downgrade will fail with:

    Error: Invalid or corrupt peers.dat (Corrupt data. Consistency check failed with code -5: iostream error). If you believe this is a bug, please report it to https://github.com/bitcoin/bitcoin/issues. As a workaround, you can move the file ("/tmp/regtest/peers.dat") out of the way (rename, move, or delete) to have a new one created on the next start.

    Changed in https://github.com/bitcoin/bitcoin/pull/23382

  43. MarcoFalke referenced this in commit 5574881ce3 on Oct 29, 2021
  44. sidhujag referenced this in commit 827e54c8e7 on Oct 29, 2021
  45. mzumsande referenced this in commit 78bcf4c324 on Oct 31, 2021
  46. mzumsande referenced this in commit bd0a52fccb on Oct 31, 2021
  47. vasild referenced this in commit 38a7cf975c on Nov 18, 2021
  48. vasild commented at 10:17 am on November 18, 2021: member
    A followup to this, that removes the strong preference for 8333 is in #23542.
  49. vasild referenced this in commit a47716aa5a on Nov 18, 2021
  50. vasild referenced this in commit cb3887ee36 on Nov 19, 2021
  51. vasild referenced this in commit 634f2838ee on Nov 19, 2021
  52. rebroad referenced this in commit 1c344370a2 on Nov 24, 2021
  53. rebroad referenced this in commit e76dac6336 on Nov 25, 2021
  54. rebroad referenced this in commit be2fe623aa on Nov 25, 2021
  55. rebroad referenced this in commit c79bb0ac3c on Dec 2, 2021
  56. vasild referenced this in commit 2d9224ad4d on Dec 10, 2021
  57. rebroad referenced this in commit d717949657 on Dec 12, 2021
  58. vasild referenced this in commit c66edde667 on Jan 6, 2022
  59. rebroad referenced this in commit 87baa6e478 on Feb 2, 2022
  60. rebroad referenced this in commit 7bf884235b on Feb 2, 2022
  61. vasild referenced this in commit 10a08ce771 on Feb 3, 2022
  62. vasild referenced this in commit 0eb87dc80d on Feb 11, 2022
  63. vasild referenced this in commit 520d1a86b9 on Feb 11, 2022
  64. vasild referenced this in commit 97208634b9 on Feb 11, 2022
  65. rebroad referenced this in commit 49d2e136fc on Feb 15, 2022
  66. laanwj referenced this in commit ba11eb354b on Mar 2, 2022
  67. rebroad referenced this in commit 7094ea7994 on Mar 4, 2022
  68. janus referenced this in commit 623ce6ff91 on Jul 24, 2022
  69. Fabcien referenced this in commit 0db06b504d on Oct 21, 2022
  70. DrahtBot locked this on Nov 18, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-06 16:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me