net: save the network type explicitly in CNetAddr #19534

pull vasild wants to merge 2 commits into bitcoin:master from vasild:explicit_network_type_in_CNetAddr changing 3 files +113 −40
  1. vasild commented at 10:12 am on July 16, 2020: member

    (chopped off from #19031 to ease review)

    Before this change, we would analyze the contents of CNetAddr::ip[16] in order to tell which type is an address. Change this by introducing a new member CNetAddr::m_net that explicitly tells the type of the address.

    This is necessary because in BIP155 we will not be able to tell the address type by just looking at its raw representation (e.g. both TORv3 and I2P are “seemingly random” 32 bytes).

    As a side effect of this change we no longer need to store IPv4 addresses encoded as IPv6 addresses - we can store them in proper 4 bytes (will be done in a separate commit). Also the code gets somewhat simplified - instead of memcmp(ip, pchIPv4, sizeof(pchIPv4)) == 0 we can use m_net == NET_IPV4.

  2. fanquake added the label P2P on Jul 16, 2020
  3. vasild force-pushed on Jul 16, 2020
  4. DrahtBot commented at 4:12 pm on July 16, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    No conflicts as of last run.

  5. vasild force-pushed on Jul 16, 2020
  6. vasild commented at 6:12 pm on July 16, 2020: member
    Rebased to wake up Travis
  7. troygiorshev commented at 3:43 am on July 17, 2020: contributor

    ACK eb22eeda5b37afea0649688690e04c00d5a6dbfd

    If you make any changes, consider running clang-format. No worries if you don’t, we can always do it later in the roadmap. Might be nice to clean up the formatting of some of the other methods here as well, now that we’re touching so much of this.

  8. Saibato commented at 7:08 am on July 17, 2020: contributor

    So if i read this correct, if this is merged we narrow before even having benefit and v3 , and say an address is not routable if one flag in CNetAddr m_net say’s something else than m_net == NET_IPV6 and skip that the Bitcoin imperative ‘data decides’

    So you want to change even the old ADDRv1 logic while BIP155 say’s don’t. hmm… The way bitcoin is used in the wild is often strange and has hidden paths, i can only urge you to not touch in any way the old ADDR logic and the way that net works.

    And btw if ip is filled with « instead of memcpy and data don’t decide, how sophisticated is the checksum plan for bit or byte errors in the new ADDRv1/v2 stream?

    btw Since the BIP155 ADDv2 is to send ip2 and v3 just as 32 byte without checksum, i would also recommend to concat least sig byte of hash(m net | 32 byte ) before send, but that;s prob a PR for BIP155.
    edit@saibato: Not that to provide piggy pack support for tor onions in ipv6 packets when ipv6 in #1021 was merged is a holy move, but time passed by and the value of bitcoin is not only the timestamp consensus, but AFAICS also the network itself.

  9. michaelfolkson commented at 10:49 am on July 18, 2020: contributor

    Concept ACK, Approach ACK.

    The BIP refers to I2P and CJDNS overlay networks and gives them their own network IDs. An I2P address is a Tor v3 address and a CJDNS address is a IPV6 address (or at least they are the same size) right? I’m just trying to understand why they warrant their own reserved network IDs in the BIP but don’t seem to be implemented here?

  10. in src/netaddress.cpp:325 in eb22eeda5b outdated
    329@@ -322,13 +330,7 @@ enum Network CNetAddr::GetNetwork() const
    330     if (!IsRoutable())
    331         return NET_UNROUTABLE;
    332 
    333-    if (IsIPv4())
    


    jonatack commented at 2:56 pm on July 18, 2020:

    Is this code still needed in GetNetwork()?

    0    if (IsInternal())
    1        return NET_INTERNAL;
    

    vasild commented at 8:56 am on July 20, 2020:
    Good question - no. But it was not necessary before this PR either. Some good simplification can be done here even on master. I will see whether it is better to add a new commit to this PR or file it as a separate PR.

    vasild commented at 1:54 pm on July 20, 2020:

    Correction - it is needed, otherwise this test would fail:

    https://github.com/bitcoin/bitcoin/blob/476436b/src/test/netbase_tests.cpp#L45

    Anyway, I made some simplifications on top of this PR that remove the “extending” of enum Network with NET_UNKNOWN and NET_TEREDO and also removed GetExtNetwork(). To avoid distractions, I will not append those commits to this PR but would rather open a new PR against master, once this PR gets merged.


    jonatack commented at 2:10 pm on July 20, 2020:
    Thanks @vasild, looking forward to reviewing the simplifications.
  11. jonatack commented at 3:25 pm on July 18, 2020: member
    Light ACK eb22eeda5b37afea0649688690e04c00d5a6dbfd though viewed in isolation this change seems to be making the code more complex, not simpler.
  12. in src/netaddress.h:46 in eb22eeda5b outdated
    38+
    39+    /// TORv2
    40     NET_ONION,
    41+
    42+    /// A set of dummy addresses that map a name to an IPv6 address.
    43     NET_INTERNAL,
    


    Saibato commented at 5:49 pm on July 18, 2020:

    Suggested change:

    0/// A set of special crafted FC00/7 IPv6 addresses we use to map a string or FQDN to an IPv6 address.
    1/// We uses these important fake addresses also in CAddrMan to keep track of which DNS seeds were used 
    2/// and how we manage buckets and peers.
    

    With that in mind when dev we can take special care in transit to ADDRv2 to not mess with this address realm, because we need them later to get our new long addresses in and out from peers,dat ?


    vasild commented at 2:16 pm on July 20, 2020:

    I extended the comment, thanks!

    In #19031 we drop the 6-byte fd6b:88c0:8724 prefix from these addresses and store them in memory as m_net=NET_INTERNAL, m_addr=...10 bytes.... Serializing that to addrv1 remains unchanged (we add the fd6b:88c0:8724 prefix) and they are never serialized as addrv2.


    Saibato commented at 2:56 pm on July 20, 2020:

    Not that related here and just to your info, the follow up construction on this run in an assert failure when i made the v3 patch,

    The structures need also some failure tolerance and flexibility.
    Don;t mind if i am harsh and sometimes sarcastic, but you do here heart surgery and better not look good but beat the drum

    Also to get in an out of peers and DNS seeds is somewhat unrelated to the data structure on the ADDR net. We could easy use intern more data and flags on our address than we use on the wire. Those two structures need not to be the same. edit@saiboto what let me tend to ADDRv2 is 2038. tyi.

  13. Saibato changes_requested
  14. vasild commented at 9:46 am on July 20, 2020: member

    The BIP refers to I2P and CJDNS overlay networks and gives them their own network IDs. An I2P address is a Tor v3 address and a CJDNS address is a IPV6 address (or at least they are the same size) right? I’m just trying to understand why they warrant their own reserved network IDs in the BIP but don’t seem to be implemented here? @michaelfolkson, the sizes of I2P and Torv3 addresses are equal (32 bytes) and also IPv6 and CJDNS addresses are both 16 bytes, but their interpretation is different - ie given some chunk of 32 bytes we need to know whether that is an I2P address or Torv3 address. This is why they have a separate network ids. Also there could be an overlap - e.g. some 16 bytes could both be valid IPv6 address and a valid CJDNS address.

    The code in this PR and its parent PR (https://github.com/bitcoin/bitcoin/pull/19031) aims to implement ADDRv2 address format and signaling with the networks that are currently supported. Once that is done, on top of it, we will implement new networks like Torv3, I2P and others.

  15. net: document `enum Network` 100c64a95b
  16. vasild force-pushed on Jul 20, 2020
  17. vasild commented at 2:19 pm on July 20, 2020: member
    Extended the NET_INTERNAL comment as per @Saibato’s suggestion.
  18. in src/netaddress.cpp:252 in 662bb25efa outdated
    242@@ -235,7 +243,7 @@ bool CNetAddr::IsLocal() const
    243 
    244     // IPv6 loopback (::1/128)
    245     static const unsigned char pchLocal[16] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1};
    246-    if (memcmp(ip, pchLocal, 16) == 0)
    247+    if (IsIPv6() && memcmp(ip, pchLocal, 16) == 0)
    248         return true;
    


    Saibato commented at 2:34 pm on July 20, 2020:

    In general some of this might be optimized out by the compiler,
    And i would prefer for the transition period something like this that we can refactor out when all works fine to an compiler and bytecode friendly version.

    I would prefer constructions for all those ipv6 overrides in the rest of the code to look like this suggested change:

    0if  (memcmp(ip, pchLocal, 16) == 0) && IsIPv6();
    

    And some logging here and in the other overrides i.e.

    0LogPrint(BCLog::ADDV2, "old isXXXX() says so, ADDRv2 m_net did override with  value = xxx .....   
    
  19. Saibato changes_requested
  20. jonatack commented at 2:35 pm on July 20, 2020: member
    re-ACK 662bb25 per git diff eb22eed 662bb25 only change is improved documentation.
  21. troygiorshev commented at 3:50 pm on July 20, 2020: contributor

    reACK 662bb25

    Only change is in a comment.

  22. Saibato commented at 8:19 am on July 21, 2020: contributor
    tyi, I will not ACK or NACK any of ADDRv2 roadmap, Since Tor v2 will be deprecated sooner or later. I will try to start a side project to morph some old legacy tunnel code code to a special general ‘bitcoin-peer-address’ tor or whatever proxy wrapper, for all old nodes to have full v3 and after 2038 support, without need to softfork This will from day one, mitigate any chainsplit or censorship.
  23. vasild commented at 1:20 pm on July 21, 2020: member
    Just to clarify - ADDRv2/BIP155 does not involve a softfork, chainsplit or censorship. Old nodes (e.g. 0.19.1) can’t have full support for TORv3 because they don’t contain the code for it.
  24. Saibato commented at 1:56 pm on July 21, 2020: contributor

    I full support ADDRv2 and i will comment and provide review. Also imo ADDRv2 its inevitable for master.

    My approach aims to let old tor nodes not behind and be forced to softfork when tor v2 fades out.

    can’t have full support for TORv3 because they don’t contain the code for it

    I love this cant’s in software dev, because i can, There is a great misconception in how MITM or say jmp esp works when its clear that you don;t have the space,bytes or source code on the wire, that always astound me; why not use a bit hacking voodoo for some useful things,when we could tunnel because we have solid keys

    That wrapper is on track and orthogonal to this here and when ADDRv2 is there, it will be even more easy to support this.

    Maybe it will make its path to contrib to assist all old node software in support of protocols aside from internet protocol.

    edit@saibato Btw:It also astound me in many defi or blockchain software projects how they think the real internet or tor works, there are multiple ways to do the wrapper, and one is of such nature. that there is a minor chance that you could hear some jaw bones touch the floor.

    The status quo, so far is that even now every tor v2 proxy capable node, can even now connect over tor v3 9050 socks5 proxy to any node that provides a v3 inbound hidden service over torrc. What missing is a bit virgin seeding that could be done by a signed script file and banman and conman bitcoin logic and then address gossip about that and to store this, for node reboot, if you don;t want to run every time a long bash cli addnode <v3.onion> add .script or a long cmdline. All can be solved without touching the old software. And since we have a great FC00/7 space for grabs in all old nodes, that could i.e. bind tun or taps to them to proxy out to even stranger protocols. What is the misconception, is that we need to know the full 256 address or greater pubkey space of other protocols, when in fact to work for bitcoin we only will ever have a short directory or index list we need to gossip, that fits easy in the enormous 80bit FC00/7 index space, we have in every node even now. edit@saibato; When this and the followup PR’s gain track and we will have ADDRv2 nodes in the wild, users can run anyway stripped down master versions, they connect to local or remote with there old node to advertise there v3 address, if they wish so, but honestly the nature of hidden net, is that its hidden, so to advertise inbound is anyway not so sexy. I guess, if we would have ̶a̶ ̶b̶y̶ ̶s̶a̶t̶o̶s̶h̶i̶ ̶k̶e̶y̶s̶ signed v3/ip2 whatever seeds list, that will be enough to bootstrap and there would be almost none old node, to not use such a list?

  25. vasild commented at 5:14 pm on July 21, 2020: member
    Coverage report of modified code: https://people.freebsd.org/~vd/pr19534_662bb25ef_coverage/ (all modified code is covered by tests)
  26. Saibato commented at 9:00 am on July 22, 2020: contributor

    Just to clarify - ADDRv2/BIP155 does not involve a softfork, chainsplit or censorship. Old nodes (e.g. 0.19.1) can’t have full support for TORv3 because they don’t contain the code for it.

    wow, that was really funny took,me a while to figure out the zero. is a morse placeholder. This kind of humor makes really fun and that i like so much when programing senator brains. I almost tend to blind ACK

  27. in src/netaddress.h:42 in 662bb25efa outdated
    37     NET_IPV6,
    38+
    39+    /// TORv2
    40     NET_ONION,
    41+
    42+    /// A set of dummy addresses that map a name to an IPv6 address. These
    


    laanwj commented at 1:58 pm on July 25, 2020:
    Thanks for adding documentation here, it was not immediately clear to me what NET_INTERNAL was.

    vasild commented at 12:06 pm on July 27, 2020:
    It was unclear to me too :)
  28. in src/netaddress.h:61 in 662bb25efa outdated
    57 {
    58     protected:
    59+        /**
    60+         * Network to which this address belongs.
    61+         */
    62+        Network m_net;
    


    laanwj commented at 1:59 pm on July 25, 2020:

    Please initialize the default here instead of in the constructor

    0    Network m_net{NET_IPV6};
    

  29. in src/netaddress.cpp:45 in 662bb25efa outdated
    36@@ -36,10 +37,20 @@ void CNetAddr::SetRaw(Network network, const uint8_t *ip_in)
    37     switch(network)
    38     {
    39         case NET_IPV4:
    40+            m_net = NET_IPV4;
    41             memcpy(ip, pchIPv4, 12);
    42             memcpy(ip+12, ip_in, 4);
    43             break;
    44         case NET_IPV6:
    45+            if (memcmp(ip_in, pchIPv4, sizeof(pchIPv4)) == 0) {
    


    laanwj commented at 2:01 pm on July 25, 2020:
    I think it would make sense to add this detection out to a separate function, e.g. SetLegacyIPV6 then call it here and in the deserialization logic, highlighting that it does auto-detection.

  30. vasild commented at 12:42 pm on July 27, 2020: member

    Made 2 changes as per review suggestions:

    • Initialize CNetAddr::m_net in its definition instead of in the constructor
    • Move the code that detects non-IPv6 addresses embedded in IPv6 to a separate method
  31. vasild force-pushed on Jul 27, 2020
  32. in src/netaddress.h:158 in 64897c5776 outdated
    154+        void Unserialize(Stream& s)
    155+        {
    156+            unsigned char ip_temp[sizeof(ip)];
    157+            s >> ip_temp;
    158+            // Use SetRaw() so that m_net is set correctly. For example
    159+            // ::FFFF:0102:0304 should be set as m_net=NET_IPV4 (1.2.3.4).
    


    troygiorshev commented at 1:04 pm on July 27, 2020:

    Comment no longer reflects the code.

    0            // Use SetLegacyIPv6() so that m_net is set correctly. For example
    1            // ::FFFF:0102:0304 should be set as m_net=NET_IPV4 (1.2.3.4).
    

    Or, maybe reword to avoid writing the name of a method in a comment.


    vasild commented at 1:13 pm on July 27, 2020:
    Right! Updated.
  33. troygiorshev commented at 1:06 pm on July 27, 2020: contributor

    reACK 64897c5776f024662000528415315995e2a83d3c via git range-diff master 662bb25 64897c5

    Just one comment

  34. net: save the network type explicitly in CNetAddr
    Before this change, we would analyze the contents of `CNetAddr::ip[16]`
    in order to tell which type is an address. Change this by introducing a
    new member `CNetAddr::m_net` that explicitly tells the type of the
    address.
    
    This is necessary because in BIP155 we will not be able to tell the
    address type by just looking at its raw representation (e.g. both TORv3
    and I2P are "seemingly random" 32 bytes).
    
    As a side effect of this change we no longer need to store IPv4
    addresses encoded as IPv6 addresses - we can store them in proper 4
    bytes (will be done in a separate commit). Also the code gets
    somewhat simplified - instead of
    `memcmp(ip, pchIPv4, sizeof(pchIPv4)) == 0` we can use
    `m_net == NET_IPV4`.
    
    Co-authored-by: Carl Dong <contact@carldong.me>
    bcfebb6d55
  35. vasild force-pushed on Jul 27, 2020
  36. troygiorshev commented at 1:15 pm on July 27, 2020: contributor
    reACK bcfebb6d5511ad4c156868bc799831ace628a225 via git range-diff master 64897c5 bcfebb6
  37. vasild commented at 1:16 pm on July 27, 2020: member
    Updated a comment to reflect that we now call SetLegacyIPv6() instead of SetRaw(), as per suggestion.
  38. jonatack commented at 2:37 pm on July 27, 2020: member
    re-ACK bcfebb6 per git diff 662bb25 bcfebb6, code review, debug build/tests clean, ran bitcoind.
  39. promag commented at 7:44 pm on July 28, 2020: member
    Concept ACK.
  40. laanwj commented at 11:31 am on July 29, 2020: member
    Code review ACK bcfebb6d5511ad4c156868bc799831ace628a225
  41. laanwj merged this on Jul 29, 2020
  42. laanwj closed this on Jul 29, 2020

  43. vasild deleted the branch on Jul 29, 2020
  44. in src/netaddress.cpp:35 in bcfebb6d55
    27@@ -28,19 +28,35 @@ CNetAddr::CNetAddr()
    28 
    29 void CNetAddr::SetIP(const CNetAddr& ipIn)
    30 {
    31+    m_net = ipIn.m_net;
    32     memcpy(ip, ipIn.ip, sizeof(ip));
    33 }
    34 
    35+void CNetAddr::SetLegacyIPv6(const uint8_t ipv6[16])
    


    MarcoFalke commented at 7:33 am on July 30, 2020:
    Any reason to pass around fixed-length data with a raw data pointer (of arbitrary length) when type safe alternatives like span or std::array (probably not applicable in this case) are available?

    laanwj commented at 8:13 am on July 30, 2020:
    The input has to be 16 bytes, in this case. Is there a way to enforce this in the type, for spans? (sure, the [16] doesn’t do anything here either except as a signal to the programmer)

    MarcoFalke commented at 9:03 am on July 30, 2020:

    Not sure, but I presumed spans could be of non-dynamic extent as well. Does this not work?

    0const std::span<uint8_t, 16>& ipv6
    

    vasild commented at 12:35 pm on July 30, 2020:

    Yes, in this case the [16] is just a “signal to the programmer”. In SetRaw() we have a bare pointer const uint8_t* when calling this function.

    I have removed SetRaw() in the subsequent commit and will revisit this code.

    Some experiments:

     0int f1(const std::span<int, 3>& p)
     1{
     2    return p[5]; // no warning
     3}
     4
     5int f2(const int (&p)[3])
     6{
     7    return p[5]; // warning: array index 5 is past the end of the array (which contains 3 elements) [-Warray-bounds]
     8}
     9
    10int f3(const std::array<int, 3>& p)
    11{
    12    return p[5]; // no warning
    13}
    14
    15int main(int, char**) {
    16    int a[3];
    17    int b[2];
    18
    19    f1(a);
    20    f1(b); // error: no known conversion from 'int [2]' to 'const std::span<int, 3>'
    21
    22    f2(a);
    23    f2(b); // error: no known conversion from 'int [2]' to 'const int [3]'
    24
    25    f3(std::array<int, 3>{a[0], a[1], a[2]});
    26
    27    return 0;
    28}
    

    vasild commented at 3:52 pm on July 30, 2020:

    So, in #19628 I changed the argument to const uint8_t (&ipv6)[ADDR_IPv6_SIZE] mostly to make the interface clean. From 3 callers 2 need an abusive typecasts because they don’t have an array (one has bare pointer (the fuzz tests) and the other has struct in6_addr).

    Feel free to comment there: https://github.com/bitcoin/bitcoin/pull/19628/files#diff-b64569708508232923e5fe3059396334R28

  45. vasild commented at 3:47 pm on July 30, 2020: member

    Thanks to everybody involved!

    I opened the next PR at #19628. It is the biggest one from the ADDRv2 roadmap (https://github.com/bitcoin/bitcoin/pull/19031) and is mostly (but not entirely) mechanical.

    After it follow more “interesting” commits :)

  46. sidhujag referenced this in commit a040eaeadc on Jul 31, 2020
  47. PastaPastaPasta referenced this in commit 0e083669ff on Jan 16, 2021
  48. UdjinM6 referenced this in commit 31ff054e1b on Jan 18, 2021
  49. PastaPastaPasta referenced this in commit 888f36abe7 on Jan 18, 2021
  50. deadalnix referenced this in commit b4c94da5b5 on Feb 5, 2021
  51. random-zebra referenced this in commit b4751e10ce on Aug 11, 2021
  52. DeckerSU referenced this in commit 3753a69999 on Feb 8, 2022
  53. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me