net: change CNetAddr::ip to have flexible size

vasild commented at 3:33 pm on July 30, 2020: member

(chopped off from #19031 to ease review)

Before this change CNetAddr::ip was a fixed-size array of 16 bytes, not being able to store larger addresses (e.g. TORv3) and encoded smaller ones as 16-byte IPv6 addresses.

Change its type to prevector, so that it can hold larger addresses and do not disguise non-IPv6 addresses as IPv6. So the IPv4 address 1.2.3.4 is now encoded as 01020304 instead of 00000000000000000000FFFF01020304.

Rename CNetAddr::ip to CNetAddr::m_addr because it is not an “IP” or “IP address” (TOR addresses are not IP addresses).

In order to preserve backward compatibility with serialization (where e.g. 1.2.3.4 is serialized as 00000000000000000000FFFF01020304) introduce CNetAddr dedicated legacy serialize/unserialize methods.

Adjust CSubNet accordingly. Still use CSubNet::netmask[] of fixed 16 bytes, but use the first 4 for IPv4 (not the last 4). Do not accept invalid netmasks that have 0-bits followed by 1-bits and only allow subnetting for IPv4 and IPv6.

Co-authored-by: Carl Dong contact@carldong.me

jonatack commented at 3:43 pm on July 30, 2020: member

Concept ACK

vasild commented at 4:01 pm on July 30, 2020: member

Filtered code coverage report (files not modified by this PR are omitted and not modified lines in files that are otherwise modified are dimmed).

List of modified and not covered lines.

laanwj added the label P2P on Jul 30, 2020

DrahtBot commented at 8:25 pm on July 30, 2020: member

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#19797 (net: Remove old check for 3-byte shifted IP addresses from pre-0.2.9 node … by i-rme)
#19687 (refactor: make EncodeBase{32,64} consume Spans by theStack)
#19415 (net: Make DNS lookup mockable, add fuzzing harness by practicalswift)
#18722 (addrman: improve performance by using more suitable containers by vasild)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

in src/netaddress.h:91 in daa1d8b114 outdated

86     protected:
87+        /**
88+         * Raw representation of the network address.
89+         * In network byte order (big endian) for IPv4 and IPv6.
90+         */
91+        prevector<ADDR_IPv6_SIZE, uint8_t> m_addr{ADDR_IPv6_SIZE, 0x0};

practicalswift commented at 10:36 am on August 1, 2020:

Are you sure the use of prevector is warranted here? Measurements would be nice.

Generally I think we should try stick with to the standard containers unless we have clear quantitative evidence showing that they are unsuitable for the scenario being considered. Reviewers, static analysers and compilers tend to be better at analysing use of the standard containers :)

FWIW:

0$ git grep -l prevector -- "*.cpp" "*.h" ":(exclude)src/bench/" ":(exclude)src/test/"
1src/hash.h
2src/memusage.h
3src/prevector.h
4src/script/script.h
5src/serialize.h

sipa commented at 5:52 pm on August 1, 2020:

I suggested this, so let me back it up.

Currently an address in CAddrMan consumes 256 bytes (cost in mapInfo and mapAddr together, plus allocation overhead).

This PR as-is raises that to 288 bytes, for addresses that take up to 16 bytes (which is all of them, for now).

Using an std::vector<uint8_t> instead would change that to 352 bytes per address, or an extra 64 on top. It would also increase memory fragmentation and (slightly) increase CPU usage because of two malloc’s per address.

FTR, my node has 69691 addresses in CAddrMan.

practicalswift commented at 8:01 pm on August 2, 2020:

@sipa Thanks for quantifying! That answers my question :)

vasild commented at 7:58 am on August 5, 2020:

Thanks for the question and the elaborate explanation! Indeed smaller size and avoiding an extra heap allocation is better. FWIW I moved m_addr definition before m_net to save 8 bytes from the size of CNetAddr (32 vs 40 bytes on x64).

in src/netaddress.cpp:116 in daa1d8b114 outdated

132-    for (int i = 0; i < cmplen; ++i) {
133-        if (GetByte(i)) return false;
134+    if (!IsIPv4() && !IsIPv6()) {
135+        return false;
136+    }
137+    for (uint8_t b : m_addr) {

sipa commented at 1:42 am on August 3, 2020:

Slightly more modern:

0return std::all_of(m_addr.begin(), m_addr.end(), [](uint8_t b) { return b == 0; });

vasild commented at 11:30 am on August 5, 2020:

Changed.

in src/netaddress.cpp:284 in daa1d8b114 outdated

293-        // 0
294-        ipNone = 0;
295-        if (memcmp(ip+12, &ipNone, 4) == 0)
296-            return false;
297+    if (IsIPv4()) {
298+        for (uint32_t a : {(uint32_t)INADDR_ANY, (uint32_t)INADDR_NONE}) {

sipa commented at 1:50 am on August 3, 2020:

Not something you introduced, but this only works because those addresses have the same byte representation in every byte order.

vasild commented at 8:39 am on August 5, 2020:

Right! Even though the values of those constants are never going to change (I hope!), does the following look better?

0for (uint32_t a : {htonl(INADDR_ANY), htonl(INADDR_NONE)}) {

laanwj commented at 9:46 am on August 7, 2020:

That seems the best option, I think. It’s too bad that there’s no way to get these constants as byte spans that can be directly compared against m_addr so it wouldn’t need the cast and memcmp at all.

vasild commented at 8:40 pm on August 7, 2020:

Changed to use htonl().

in src/netaddress.cpp:368 in daa1d8b114 outdated

365 }
366 
367 bool operator<(const CNetAddr& a, const CNetAddr& b)
368 {
369-    return a.m_net < b.m_net || (a.m_net == b.m_net && memcmp(a.ip, b.ip, 16) < 0);
370+    return a.m_net < b.m_net || (a.m_net == b.m_net && a.m_addr < b.m_addr);

sipa commented at 1:51 am on August 3, 2020:

More modern:

0return std::tie(a.m_net, a.m_addr) < std::tie(b.m_net, b.m_addr);

vasild commented at 11:32 am on August 5, 2020:

Modernized :bulb:

in src/netaddress.h:55 in daa1d8b114 outdated

50@@ -49,18 +51,50 @@ enum Network
51     NET_MAX,
52 };
53 
54+/// If an IPv6 address begins with this, then we treat the rest of it as IPv4 address.
55+static constexpr uint8_t IPv4_IN_IPv6_PREFIX[12] = {

sipa commented at 1:56 am on August 3, 2020:

Nit: uppercase v in global constants (here and further).

vasild commented at 11:32 am on August 5, 2020:

Changed to uppercase.

DrahtBot added the label Needs rebase on Aug 3, 2020

vasild force-pushed on Aug 5, 2020

vasild commented at 11:04 am on August 5, 2020: member

Rebased to resolve a conflict.

vasild force-pushed on Aug 5, 2020

vasild commented at 11:33 am on August 5, 2020: member

Applied review suggestions.

DrahtBot removed the label Needs rebase on Aug 5, 2020

in src/test/fuzz/asmap.cpp:51 in 11632c7380 outdated

47+    const uint8_t* addr_data = buffer.data() + 1 + asmap_size;
48     CNetAddr net_addr;
49-    net_addr.SetRaw(ipv6 ? NET_IPV6 : NET_IPV4, buffer.data() + 1 + asmap_size);
50+    if (ipv6) {
51+        assert(addr_size == ADDR_IPV6_SIZE);
52+        net_addr.SetLegacyIPv6(*reinterpret_cast<const uint8_t (*)[ADDR_IPV6_SIZE]>(addr_data));

laanwj commented at 9:53 am on August 7, 2020:

This is pretty scary cast expression. Wouldn’t a cast to just const uint8_t* work just as well?

vasild commented at 8:55 pm on August 7, 2020:

Passing const uint8_t* will not work as long as the argument of the method is a reference to an array: CNetAddr::SetLegacyIPv6(const uint8_t (&ipv6)[ADDR_IPV6_SIZE]) which is best of type safety but unfortunately two of the callers don’t have an array to pass.

I think either one of the following two is better:

 0diff --git i/src/netaddress.cpp w/src/netaddress.cpp
 1index b421e1b11..bb4d2b50b 100644
 2--- i/src/netaddress.cpp
 3+++ w/src/netaddress.cpp
 4@@ -26,32 +26,33 @@ CNetAddr::CNetAddr() {}
 5 void CNetAddr::SetIP(const CNetAddr& ipIn)
 6 {
 7     m_net = ipIn.m_net;
 8     m_addr = ipIn.m_addr;
 9 }
10 
11-void CNetAddr::SetLegacyIPv6(const uint8_t (&ipv6)[ADDR_IPV6_SIZE])
12+void CNetAddr::SetLegacyIPv6(const void* ipv6)
13 {
14-    const uint8_t* ipv6_end = ipv6 + ADDR_IPV6_SIZE;
15+    const uint8_t* ipv6_begin = static_cast<const uint8_t*>(ipv6);
16+    const uint8_t* ipv6_end = ipv6_begin + ADDR_IPV6_SIZE;
17 
18-    if (memcmp(ipv6, IPV4_IN_IPV6_PREFIX, sizeof(IPV4_IN_IPV6_PREFIX)) == 0) {
19+    if (memcmp(ipv6_begin, IPV4_IN_IPV6_PREFIX, sizeof(IPV4_IN_IPV6_PREFIX)) == 0) {
20         // IPv4-in-IPv6
21         m_net = NET_IPV4;
22-        m_addr.assign(ipv6 + sizeof(IPV4_IN_IPV6_PREFIX), ipv6_end);
23-    } else if (memcmp(ipv6, TORV2_IN_IPV6_PREFIX, sizeof(TORV2_IN_IPV6_PREFIX)) == 0) {
24+        m_addr.assign(ipv6_begin + sizeof(IPV4_IN_IPV6_PREFIX), ipv6_end);
25+    } else if (memcmp(ipv6_begin, TORV2_IN_IPV6_PREFIX, sizeof(TORV2_IN_IPV6_PREFIX)) == 0) {
26         // TORv2-in-IPv6
27         m_net = NET_ONION;
28-        m_addr.assign(ipv6 + sizeof(TORV2_IN_IPV6_PREFIX), ipv6_end);
29-    } else if (memcmp(ipv6, INTERNAL_IN_IPV6_PREFIX, sizeof(INTERNAL_IN_IPV6_PREFIX)) == 0) {
30+        m_addr.assign(ipv6_begin + sizeof(TORV2_IN_IPV6_PREFIX), ipv6_end);
31+    } else if (memcmp(ipv6_begin, INTERNAL_IN_IPV6_PREFIX, sizeof(INTERNAL_IN_IPV6_PREFIX)) == 0) {
32         // Internal-in-IPv6
33         m_net = NET_INTERNAL;
34-        m_addr.assign(ipv6 + sizeof(INTERNAL_IN_IPV6_PREFIX), ipv6_end);
35+        m_addr.assign(ipv6_begin + sizeof(INTERNAL_IN_IPV6_PREFIX), ipv6_end);
36     } else {
37         // IPv6
38         m_net = NET_IPV6;
39-        m_addr.assign(ipv6, ipv6_end);
40+        m_addr.assign(ipv6_begin, ipv6_end);
41     }
42 }
43 
44 /**
45  * Try to make this a dummy address that maps the specified name into IPv6 like
46  * so: (0xFD + %sha256("bitcoin")[0:5]) + %sha256(name)[0:10]. Such dummy
47@@ -105,13 +106,13 @@ CNetAddr::CNetAddr(const struct in_addr& ipv4Addr)
48     m_addr.assign(ptr, ptr + ADDR_IPV4_SIZE);
49 }
50 
51 CNetAddr::CNetAddr(const struct in6_addr& ipv6Addr, const uint32_t scope)
52 {
53     assert(sizeof(ipv6Addr) == ADDR_IPV6_SIZE);
54-    SetLegacyIPv6(*reinterpret_cast<const uint8_t (*)[ADDR_IPV6_SIZE]>(&ipv6Addr));
55+    SetLegacyIPv6(&ipv6Addr);
56     scopeId = scope;
57 }
58 
59 bool CNetAddr::IsBindAny() const
60 {
61     if (!IsIPv4() && !IsIPv6()) {
62diff --git i/src/netaddress.h w/src/netaddress.h
63index 8924ed0b4..ca048fbf5 100644
64--- i/src/netaddress.h
65+++ w/src/netaddress.h
66@@ -105,13 +105,13 @@ class CNetAddr
67         /**
68          * Set from a legacy IPv6 address.
69          * Legacy IPv6 address may be a normal IPv6 address, or another address
70          * (e.g. IPv4) disguised as IPv6. This encoding is used in the legacy
71          * `addr` encoding.
72          */
73-        void SetLegacyIPv6(const uint8_t (&ipv6)[ADDR_IPV6_SIZE]);
74+        void SetLegacyIPv6(const void* ipv6);
75 
76         bool SetInternal(const std::string& name);
77 
78         bool SetSpecial(const std::string &strName); // for Tor addresses
79         bool IsBindAny() const; // INADDR_ANY equivalent
80         bool IsIPv4() const;    // IPv4 mapped address (::FFFF:0:0/96, 0.0.0.0/0)
81diff --git i/src/test/fuzz/asmap.cpp w/src/test/fuzz/asmap.cpp
82index 84902ee2d..98ac82453 100644
83--- i/src/test/fuzz/asmap.cpp
84+++ w/src/test/fuzz/asmap.cpp
85@@ -45,13 +45,13 @@ void test_one_input(const std::vector<uint8_t>& buffer)
86     if (!SanityCheckASMap(asmap)) return;
87 
88     const uint8_t* addr_data = buffer.data() + 1 + asmap_size;
89     CNetAddr net_addr;
90     if (ipv6) {
91         assert(addr_size == ADDR_IPV6_SIZE);
92-        net_addr.SetLegacyIPv6(*reinterpret_cast<const uint8_t (*)[ADDR_IPV6_SIZE]>(addr_data));
93+        net_addr.SetLegacyIPv6(addr_data);
94     } else {
95         assert(addr_size == ADDR_IPV4_SIZE);
96         in_addr ipv4;
97         memcpy(&ipv4, addr_data, addr_size);
98         net_addr.SetIP(CNetAddr{ipv4});
99     }

 0diff --git i/src/netaddress.cpp w/src/netaddress.cpp
 1index b421e1b11..121b92650 100644
 2--- i/src/netaddress.cpp
 3+++ w/src/netaddress.cpp
 4@@ -26,32 +26,32 @@ CNetAddr::CNetAddr() {}
 5 void CNetAddr::SetIP(const CNetAddr& ipIn)
 6 {
 7     m_net = ipIn.m_net;
 8     m_addr = ipIn.m_addr;
 9 }
10 
11-void CNetAddr::SetLegacyIPv6(const uint8_t (&ipv6)[ADDR_IPV6_SIZE])
12+void CNetAddr::SetLegacyIPv6(Span<const uint8_t> ipv6)
13 {
14-    const uint8_t* ipv6_end = ipv6 + ADDR_IPV6_SIZE;
15+    assert(ipv6.size() == ADDR_IPV6_SIZE);
16 
17-    if (memcmp(ipv6, IPV4_IN_IPV6_PREFIX, sizeof(IPV4_IN_IPV6_PREFIX)) == 0) {
18+    if (memcmp(ipv6.begin(), IPV4_IN_IPV6_PREFIX, sizeof(IPV4_IN_IPV6_PREFIX)) == 0) {
19         // IPv4-in-IPv6
20         m_net = NET_IPV4;
21-        m_addr.assign(ipv6 + sizeof(IPV4_IN_IPV6_PREFIX), ipv6_end);
22-    } else if (memcmp(ipv6, TORV2_IN_IPV6_PREFIX, sizeof(TORV2_IN_IPV6_PREFIX)) == 0) {
23+        m_addr.assign(ipv6.begin() + sizeof(IPV4_IN_IPV6_PREFIX), ipv6.end());
24+    } else if (memcmp(ipv6.begin(), TORV2_IN_IPV6_PREFIX, sizeof(TORV2_IN_IPV6_PREFIX)) == 0) {
25         // TORv2-in-IPv6
26         m_net = NET_ONION;
27-        m_addr.assign(ipv6 + sizeof(TORV2_IN_IPV6_PREFIX), ipv6_end);
28-    } else if (memcmp(ipv6, INTERNAL_IN_IPV6_PREFIX, sizeof(INTERNAL_IN_IPV6_PREFIX)) == 0) {
29+        m_addr.assign(ipv6.begin() + sizeof(TORV2_IN_IPV6_PREFIX), ipv6.end());
30+    } else if (memcmp(ipv6.begin(), INTERNAL_IN_IPV6_PREFIX, sizeof(INTERNAL_IN_IPV6_PREFIX)) == 0) {
31         // Internal-in-IPv6
32         m_net = NET_INTERNAL;
33-        m_addr.assign(ipv6 + sizeof(INTERNAL_IN_IPV6_PREFIX), ipv6_end);
34+        m_addr.assign(ipv6.begin() + sizeof(INTERNAL_IN_IPV6_PREFIX), ipv6.end());
35     } else {
36         // IPv6
37         m_net = NET_IPV6;
38-        m_addr.assign(ipv6, ipv6_end);
39+        m_addr.assign(ipv6.begin(), ipv6.end());
40     }
41 }
42 
43 /**
44  * Try to make this a dummy address that maps the specified name into IPv6 like
45  * so: (0xFD + %sha256("bitcoin")[0:5]) + %sha256(name)[0:10]. Such dummy
46@@ -104,14 +104,13 @@ CNetAddr::CNetAddr(const struct in_addr& ipv4Addr)
47     const uint8_t* ptr = reinterpret_cast<const uint8_t*>(&ipv4Addr);
48     m_addr.assign(ptr, ptr + ADDR_IPV4_SIZE);
49 }
50 
51 CNetAddr::CNetAddr(const struct in6_addr& ipv6Addr, const uint32_t scope)
52 {
53-    assert(sizeof(ipv6Addr) == ADDR_IPV6_SIZE);
54-    SetLegacyIPv6(*reinterpret_cast<const uint8_t (*)[ADDR_IPV6_SIZE]>(&ipv6Addr));
55+    SetLegacyIPv6(Span<const uint8_t>(reinterpret_cast<const uint8_t*>(&ipv6Addr), sizeof(ipv6Addr)));
56     scopeId = scope;
57 }
58 
59 bool CNetAddr::IsBindAny() const
60 {
61     if (!IsIPv4() && !IsIPv6()) {
62diff --git i/src/netaddress.h w/src/netaddress.h
63index 8924ed0b4..a76b5e9ce 100644
64--- i/src/netaddress.h
65+++ w/src/netaddress.h
66@@ -105,13 +105,13 @@ class CNetAddr
67         /**
68          * Set from a legacy IPv6 address.
69          * Legacy IPv6 address may be a normal IPv6 address, or another address
70          * (e.g. IPv4) disguised as IPv6. This encoding is used in the legacy
71          * `addr` encoding.
72          */
73-        void SetLegacyIPv6(const uint8_t (&ipv6)[ADDR_IPV6_SIZE]);
74+        void SetLegacyIPv6(Span<const uint8_t> ipv6);
75 
76         bool SetInternal(const std::string& name);
77 
78         bool SetSpecial(const std::string &strName); // for Tor addresses
79         bool IsBindAny() const; // INADDR_ANY equivalent
80         bool IsIPv4() const;    // IPv4 mapped address (::FFFF:0:0/96, 0.0.0.0/0)
81diff --git i/src/test/fuzz/asmap.cpp w/src/test/fuzz/asmap.cpp
82index 84902ee2d..e3aefa18a 100644
83--- i/src/test/fuzz/asmap.cpp
84+++ w/src/test/fuzz/asmap.cpp
85@@ -45,13 +45,13 @@ void test_one_input(const std::vector<uint8_t>& buffer)
86     if (!SanityCheckASMap(asmap)) return;
87 
88     const uint8_t* addr_data = buffer.data() + 1 + asmap_size;
89     CNetAddr net_addr;
90     if (ipv6) {
91         assert(addr_size == ADDR_IPV6_SIZE);
92-        net_addr.SetLegacyIPv6(*reinterpret_cast<const uint8_t (*)[ADDR_IPV6_SIZE]>(addr_data));
93+        net_addr.SetLegacyIPv6(Span<const uint8_t>(addr_data, addr_size));
94     } else {
95         assert(addr_size == ADDR_IPV4_SIZE);
96         in_addr ipv4;
97         memcpy(&ipv4, addr_data, addr_size);
98         net_addr.SetIP(CNetAddr{ipv4});
99     }

What do you think?

sipa commented at 9:02 pm on August 7, 2020:

Span span span…

span

vasild commented at 1:11 pm on August 10, 2020:

Done

in src/netaddress.cpp:346 in 11632c7380 outdated

328@@ -341,9 +329,9 @@ enum Network CNetAddr::GetNetwork() const
329 std::string CNetAddr::ToStringIP() const
330 {
331     if (IsTor())
332-        return EncodeBase32(&ip[6], 10) + ".onion";
333+        return EncodeBase32(m_addr.data(), m_addr.size()) + ".onion";
334     if (IsInternal())
335-        return EncodeBase32(ip + sizeof(g_internal_prefix), sizeof(ip) - sizeof(g_internal_prefix)) + ".internal";
336+        return EncodeBase32(m_addr.data(), m_addr.size()) + ".internal";

laanwj commented at 10:03 am on August 7, 2020:

Heh we need to make EncodeBase32 take a span too (not in this PR ofcourse).

vasild commented at 12:49 pm on August 10, 2020:

Done in https://github.com/bitcoin/bitcoin/pull/19687

in src/netaddress.cpp:874 in 11632c7380 outdated

923+        }
924+        cidr += NetmaskBits(netmask[i]);
925     }
926 
927-    return network.ToString() + "/" + strNetmask;
928+    return network.ToString() + "/" + strprintf("%hhu", cidr);

laanwj commented at 10:12 am on August 7, 2020:

You do not need the “hh” here, or any type specifiers ever, strprintf is type safe. Also maybe include the / to avoid an extra string concatenation

0network.ToString() + strprintf("/%u", cidr);

vasild commented at 8:42 pm on August 7, 2020:

Changed.

vasild force-pushed on Aug 7, 2020

vasild force-pushed on Aug 10, 2020

vasild commented at 1:10 pm on August 10, 2020: member

Fixed Windows build and changed SetLegacyIPv6() to take a ~~Spam~~Span argument.

vasild force-pushed on Aug 10, 2020

vasild commented at 8:19 pm on August 10, 2020: member

s/IPv4_IN_IPv6_PREFIX/IPV4_IN_IPV6_PREFIX/ also in comment, was missed during the rename.

vasild force-pushed on Aug 11, 2020

vasild commented at 12:05 pm on August 11, 2020: member

Rebased just to restart Travis.

in src/netaddress.h:215 in bfe7d1b565 outdated

212+                assert(m_addr.size() == sizeof(arr));
213+                memcpy(arr, m_addr.data(), m_addr.size());
214+                break;
215+            case NET_ONION:
216+                prefix_size = sizeof(TORV2_IN_IPV6_PREFIX);
217+                assert(m_addr.size() + prefix_size == sizeof(arr));

jonatack commented at 8:40 am on August 12, 2020:

perhaps place the NET_IPV6 case first to group the similar cases together

suggestion here and line 221 to be the same as the first case line 205:

0                assert(prefix_size + m_addr.size() == sizeof(arr));

vasild commented at 11:08 am on August 12, 2020:

Done.

in src/netaddress.h:229 in bfe7d1b565 outdated

226+                break;
227+            case NET_UNROUTABLE:
228+            case NET_MAX:
229+                memset(arr, 0x0, sizeof(arr));
230+                break;
231+            }

jonatack commented at 8:43 am on August 12, 2020:

per developer-notes.md:

0    } // no default case, so the compiler can warn about missing cases
1    assert(false);

vasild commented at 11:08 am on August 12, 2020:

Done.

jonatack commented at 2:42 pm on August 12, 2020:

Add the assert as well as the comment?

vasild commented at 3:17 pm on August 12, 2020:

Hmm, I almost added the assert(false); but then realized that the code flow just continues after the switch (it contains break;, not return; as the example in the developer-notes).

Apply this?

 0diff --git i/src/netaddress.h w/src/netaddress.h
 1index b75eff1b9..8cd2401e2 100644
 2--- i/src/netaddress.h
 3+++ w/src/netaddress.h
 4@@ -200,36 +200,38 @@ class CNetAddr
 5             size_t prefix_size;
 6 
 7             switch (m_net) {
 8             case NET_IPV6:
 9                 assert(m_addr.size() == sizeof(arr));
10                 memcpy(arr, m_addr.data(), m_addr.size());
11-                break;
12+                return;
13             case NET_IPV4:
14                 prefix_size = sizeof(IPV4_IN_IPV6_PREFIX);
15                 assert(prefix_size + m_addr.size() == sizeof(arr));
16                 memcpy(arr, IPV4_IN_IPV6_PREFIX, prefix_size);
17                 memcpy(arr + prefix_size, m_addr.data(), m_addr.size());
18-                break;
19+                return;
20             case NET_ONION:
21                 prefix_size = sizeof(TORV2_IN_IPV6_PREFIX);
22                 assert(prefix_size + m_addr.size() == sizeof(arr));
23                 memcpy(arr, TORV2_IN_IPV6_PREFIX, prefix_size);
24                 memcpy(arr + prefix_size, m_addr.data(), m_addr.size());
25-                break;
26+                return;
27             case NET_INTERNAL:
28                 prefix_size = sizeof(INTERNAL_IN_IPV6_PREFIX);
29                 assert(prefix_size + m_addr.size() == sizeof(arr));
30                 memcpy(arr, INTERNAL_IN_IPV6_PREFIX, prefix_size);
31                 memcpy(arr + prefix_size, m_addr.data(), m_addr.size());
32-                break;
33+                return;
34             case NET_UNROUTABLE:
35             case NET_MAX:
36                 memset(arr, 0x0, sizeof(arr));
37-                break;
38+                return;
39             } // no default case, so the compiler can warn about missing cases
40+
41+            assert(false);
42         }
43 
44         /**
45          * Serialize in pre-ADDRv2/BIP155 format to a stream.
46          * Some addresses (e.g. TORv3) cannot be serialized in pre-BIP155 format.
47          */

jonatack commented at 3:47 pm on August 12, 2020:

I should have mentioned that. Returning LGTM, nothing else is being done in the function.

vasild commented at 7:55 am on August 13, 2020:

Changed break; to return; and added assert(false); after the switch.

in src/netaddress.cpp:506 in bfe7d1b565 outdated

502@@ -507,16 +503,14 @@ std::vector<unsigned char> CNetAddr::GetGroup(const std::vector<bool> &asmap) co
503     }
504 
505     vchRet.push_back(net_class);
506-    int nStartByte = 0;
507-    int nBits = 16;
508+    int nBits;

jonatack commented at 9:22 am on August 12, 2020:

nBits is currently given a value a few lines down but seems a good habit to always provide a default.

0    int nBits{0};

vasild commented at 11:08 am on August 12, 2020:

Default assigning in the case where we are supposed to always assign the variable later will suppress valgrind and compiler warnings if there is a bug in the code and it does not actually initialize the variable in some cases.

jonatack commented at 11:40 am on August 12, 2020:

(there’s nuance to be sure; if helpful:)

https://github.com/isocpp/CppCoreGuidelines/blob/master/CppCoreGuidelines.md#es20-always-initialize-an-object

Some history and reading on uninitialized reads in Bitcoin Core: https://bitcoincore.reviews/17639

See also this comment and thread #17627 (comment)

vasild commented at 1:43 pm on August 12, 2020:

Initialized to 0. Note that the Cpp Core Guidelines contain an example that resembles our case and admits that “This cannot trivially be rewritten to initialize i and j with initializers”.

We have:

0int nBits;
1... code that is supposed to set nBits ... // [1]
2read nBits // [2]

So if there is a bug in [1] and nBits is not set (or if new code is added that reads nBits before [1]), then we would get undefined behavior in [2], but we would also get compiler, msan and valgrind warnings about it, so not likely to remain unnoticed.

OTOH if we initialize nBits with some dummy_value and the same bugs happen, then we would not get undefined behavior, but we would get unexpected results due to [2] reading dummy_value. And the downside is that none of the compiler, msan or valgrind will warn about the bug.

jonatack commented at 2:13 pm on August 12, 2020:

Yes. Attempted at the time to summarize the trade-offs in the review club notes I linked to (LMK if you see any errors, I wanted it to be a resource).

Edit: Thanks @vasild for the offline suggestions on the review club notes.

MarcoFalke commented at 3:38 pm on August 12, 2020:

Maybe it is time to add an init macro that default initializes a variable, but disables the initialization when --enable-debug or so?

in src/netaddress.cpp:539 in bfe7d1b565 outdated

536     {
537-        vchRet.push_back(GetByte(15 - nStartByte));
538-        nStartByte++;
539+        assert(i < m_addr.size());
540+        vchRet.push_back(m_addr[i]);
541+        i++;

jonatack commented at 9:27 am on August 12, 2020:

consider using the pre-increment (++i) operator

vasild commented at 11:09 am on August 12, 2020:

Done.

jonatack commented at 10:08 am on August 12, 2020: member

Glad to see progress on BIP155/addrv2 with the imminent Tor v2 deprecation in a month with planned obsolescence on July 15, 2021.

ACK bfe7d1b56500b08bb a couple hours of code review – tried to distinguish the many straightforward-looking and good mechanical changes from the critical ones as they are all smooshed together, however I will continue looking at this code and it’s good to have it merged not late in the release cycle–, clean debug builds on gcc Debian 10.1.0, running a node since a day now with the change.

A few comments below that can be ignored or left for a follow-up unless you decide to retouch or take some of them now.

vasild force-pushed on Aug 12, 2020

vasild commented at 11:09 am on August 12, 2020: member

Addressed review suggestions.

vasild force-pushed on Aug 12, 2020

vasild commented at 1:43 pm on August 12, 2020: member

Addressed review suggestion.

vasild force-pushed on Aug 13, 2020

vasild commented at 7:55 am on August 13, 2020: member

Addressed review suggestion.

jonatack commented at 8:15 am on August 13, 2020: member

re-ACK 8b7f2c1 per git diff bfe7d1b 8b7f2c1

Ran bitcoind at bfe7d1b with both onion and clearnet ipv4/ipv6 conns open for a couple of days, debug log and -netinfo connections dashboard have been showing nominal operation. Restarting bitcoind now at 8b7f2c1.

laanwj added this to the "Blockers" column in a project

ShawkiS approved

prusnak commented at 3:34 pm on August 19, 2020: contributor

I assume TORv3 addresses will be addressed in the future PRs, right?

laanwj commented at 3:36 pm on August 19, 2020: member

I assume TORv3 addresses will be addressed in the future PRs, right?

Yes. See BIP155 and #19031. This is just preparatory work required for that,

in src/netaddress.cpp:41 in 8b7f2c13b6 outdated

48+void CNetAddr::SetLegacyIPv6(Span<const uint8_t> ipv6)
49 {
50-    if (memcmp(ipv6, pchIPv4, sizeof(pchIPv4)) == 0) {
51+    assert(ipv6.size() == ADDR_IPV6_SIZE);
52+
53+    if (memcmp(ipv6.begin(), IPV4_IN_IPV6_PREFIX, sizeof(IPV4_IN_IPV6_PREFIX)) == 0) {

elichai commented at 7:49 pm on August 19, 2020:

nit, you could reformat this to more “modern” C++ (though I’m not sure if it is less or more readable)

0if(std::equal(std::cbegin(IPV4_IN_IPV6_PREFIX), std::cend(IPV4_IN_IPV6_PREFIX), ipv6.cbegin()))

(you could replace std::cbegin(IPV4_IN_IPV6_PREFIX) with IPV4_IN_IPV6_PREFIX.cbegin() if you make it a std::array and not a C style array)

sipa commented at 2:05 am on August 20, 2020:

Agree with @elichai’s suggestion, std::equal is nicer.

Given how many of these tests are being done (including the IsRFC… stuff below), perhaps you want to go even further (this is absolutely not a blocker for this PR, however):

0template<typename T, int LEN>
1bool HasPrefix(const T1& obj, const uint8_t (&prefix)[LEN])
2{
3    return obj.size() >= LEN && std::equal(std::cbegin(prefix), std::cend(prefix), std::cbegin(obj));
4}
5
6...
7
8if (HasPrefix(ipv6, IPV4_IN_IPV6_PREFIX)) ...

If you turn the prefixes into std::array it could be

0template<typename T1, typename T2>
1bool HasPrefix(const T1& obj, const T2& prefix)
2{
3    return obj.size() >= prefix.size() && std::equal(std::cbegin(prefix), std::cend(prefix), std::cbegin(obj));
4}

instead.

elichai commented at 8:52 am on August 20, 2020:

In C++17 HasPrefix will be possible even with C arrays :D (by replacing .size() with std::size(..))

vasild commented at 4:53 pm on August 20, 2020:

Done - diff together with other suggestions.

in src/netaddress.cpp:355 in 8b7f2c13b6 outdated

345@@ -353,13 +346,13 @@ std::string CNetAddr::ToStringIP() const
346             return std::string(name);
347     }
348     if (IsIPv4())

sipa commented at 0:27 am on August 20, 2020:

Nit: { } around indented blocks, if you’re touching this code anyway.

vasild commented at 4:59 pm on August 20, 2020:

I added { } to all places where the line containing if/for/while is modified for another reason (this one is not).

in src/netaddress.cpp:131 in 8b7f2c13b6 outdated

145-    const int cmplen = IsIPv4() ? 4 : 16;
146-    for (int i = 0; i < cmplen; ++i) {
147-        if (GetByte(i)) return false;
148+    if (m_net == NET_IPV4) {
149+        assert(m_addr.size() == ADDR_IPV4_SIZE);
150+        return true;

kallewoof commented at 0:40 am on August 20, 2020:

Placing this assertion here seems weird to me. Shouldn’t this check be done when(ever) m_net is modified? Same with IsIPv6() below.

vasild commented at 4:54 pm on August 20, 2020:

Done - diff together with other suggestions.

in src/netaddress.cpp:300 in 8b7f2c13b6 outdated

299+    if (IsIPv4()) {
300+        for (uint32_t a : {htonl(INADDR_ANY), htonl(INADDR_NONE)}) {
301+            if (memcmp(m_addr.data(), &a, sizeof(a)) == 0) {
302+                return false;
303+            }
304+        }

kallewoof commented at 0:51 am on August 20, 2020:

The opposite seems more straightforward:

0if (IsIPv4()) {
1    uint32_t a;
2    memcpy(&a, m_addr.data(), sizeof(a));
3    if (a == htonl(INADDR_ANY) || a == htonl(INADDR_NONE)) return false;
4}

@sipa might yell at me but it is also possible to do uint32_t a = *(uint32_t*)m_addr.data(); to skip the memcpy line.

sipa commented at 1:46 am on August 20, 2020:

It seems I’ve caused a lot of confusion here. Everything works here of course due to INADDR_ANY and INADDR_NONE being the same in LE and BE, but if they weren’t, the code here would actually be wrong.

These constants (and the s_addr field of struct in_addr) are uint32_t in name only; they’re really 4 bytes in big endian notation storing an IP address. So the IP address 1.2.3.4 would be stored as the bytes {0x01,0x02,0x03,0x04}, which on LE systems would be the integer 0x04030201 while on BE systems it would be 0x01020304.

htonl is converting them from network order (BE) to native order, so it would result in the integer 0x01020304 on both systems. That’s surprisingly not what we want, we want it in network order, because that’s the order we store in m_addr (but as bytes, rather than a fake-uint32_t).

This would work:

0static constexpr uint32_t ANY = INADDR_ANY;
1static constexpr uint32_t ALL = INADDR_ALL;
2if (memcmp((const uint8_t*)&ANY, m_addr.data(), sizeof(ANY) ||
3    memcmp((const uint8_t*)&ALL, m_addr.data(), sizeof(ALL)) ...

this would also work:

0uint32_t a;
1memcpy((uint8_t*)&a, m_addr.data(), sizeof(a));
2if (a == INADDR_ANY || a == INADDR_ALL) ...

@kallewoof Your suggestion is UB, I’m afraid. m_addr is an array of bytes, you can’t access it through a pointer cast to another type (except char* or unsigned char*, which can be used to access the in-memory representation of any object).

vasild commented at 4:55 pm on August 20, 2020:

Those constants are defined like this in the system headers:

0typedef uint32_t in_addr_t;
1...
2#define	INADDR_ANY		((in_addr_t) 0x00000000)
3#define	INADDR_NONE		((in_addr_t) 0xffffffff)
4...
5#define INADDR_ALLHOSTS_GROUP	((in_addr_t) 0xe0000001) /* 224.0.0.1 */

Of course any discussion about the byte order of INADDR_ANY is meaningless, but nevertheless - we can assume that all INADDR_* are in the same byte order. From the above we see that 224.0.0.1 is represented as the integer 3758096385 or 0xe0000001 which on LE systems is stored in memory as 01 00 00 e0 and on BE systems as e0 00 00 01.

This means that whenever we want to compare those constants to something we know is in BE byte order (like m_addr) we have to apply htonl() to them. This is how it is done in the Linux and FreeBSD system headers.

So, leaving this as is at this iteration.

sipa commented at 6:17 pm on August 20, 2020:

You’re right.

I was confused by the fact that struct in_addr’s s_addr field stores IP addresses in BE order, but the INADDR_* constants are in native order (and you are supposed to use htonl & friends to convert between them; see the section on sin_addr in https://man7.org/linux/man-pages/man7/ip.7.html).

Arguably you should be using ntohl instead of htonl (as these constants are in native order, and we want them in BE order) - but on every realistic platform these two are equivalent.

If you want to avoid the winsock/inet header dependency you can also use WriteBE32 to convert them to bytes, or:

0uint32_t a = ReadBE32(m_data.data());
1if (a == INADDR_ANY || a == INADDR_ALL) ...

elichai commented at 6:43 pm on August 20, 2020:

as @sipa said

0uint32_t a = ReadBE32(m_data.data());
1if (a == INADDR_ANY || a == INADDR_ALL) ...

is so much more readable :)

ryanofsky commented at 8:18 pm on August 20, 2020:

re: #19628 (review)

@kallewoof Your suggestion is UB, I’m afraid

Just in case anybody else reading this gets confused like me, this is /not/ saying @kallewoof memcpy suggestion is UB. Only the second suggestion without memcpy is UB

sipa commented at 10:10 pm on August 20, 2020:

Oops, yes, I guess this confused @kallewoof too (on IRC). Specifially, this is UB:

0uint32_t a = *(uint32_t*)m_addr.data();

as it’s accessing a byte array through an incompatible type.

vasild commented at 11:30 am on August 21, 2020:

Alright, we have a clear winner with

0uint32_t a = ReadBE32(m_data.data());
1if (a == INADDR_ANY || a == INADDR_ALL) ...

Changed.

in src/netaddress.cpp:537 in 8b7f2c13b6 outdated

534     {
535-        vchRet.push_back(GetByte(15 - nStartByte));
536-        nStartByte++;
537+        assert(i < m_addr.size());
538+        vchRet.push_back(m_addr[i]);
539+        ++i;

kallewoof commented at 1:00 am on August 20, 2020:

It looks like you can do this as a copy operation now that the ordering is the same.

0size_t i = nBits >> 3;
1vchRet.insert(vchRet.end(), m_addr.begin(), m_addr.begin() + i);
2nBits &= 0x7;

sipa commented at 2:14 am on August 20, 2020:

I believe that’s correct.

vasild commented at 4:56 pm on August 20, 2020:

Done - diff together with other suggestions.

in src/netaddress.cpp:820 in 8b7f2c13b6 outdated

837+{
838+    valid = (addr.IsIPv4() || addr.IsIPv6()) && addr.m_net == mask.m_net;
839+    if (!valid) {
840+        return;
841+    }
842+    // Check if `mask` contains 1-bits after 0-bits (which is an invalid netmask).

sipa commented at 2:19 am on August 20, 2020:

I don’t think this is correct, for two reasons:

We technically support non-CIDR subnets (which are allowed to contain an arbitrary mix of 0 and 1 bits).
This test isn’t even implementing whether the mask is CIDR, as it’s only checking whether no 1-bits-after-0-bits occur within one byte of the mask, rather than the entire mask.

I wouldn’t be opposed to dropping non-CIDR netmasks, but if we do, it should be a separate PR, I think.

vasild commented at 5:07 pm on August 20, 2020:

Uh, oh, you found a bug (the second item above)! Fixed - diff together with other suggestions.

A summary about the netmasks: #19628 (comment)

in src/netaddress.cpp:859 in 8b7f2c13b6 outdated

853@@ -822,68 +854,27 @@ bool CSubNet::Match(const CNetAddr &addr) const
854 {
855     if (!valid || !addr.IsValid() || network.m_net != addr.m_net)
856         return false;
857-    for(int x=0; x<16; ++x)
858-        if ((addr.ip[x] & netmask[x]) != network.ip[x])
859+    assert(network.m_addr.size() == addr.m_addr.size());
860+    for (size_t x = 0; x < addr.m_addr.size(); ++x)
861+        if ((addr.m_addr[x] & netmask[x]) != network.m_addr[x])

sipa commented at 2:24 am on August 20, 2020:

Braces, please.

(for most rules I don’t care that strongly whether you stick to the existing/surrounding style vs. follow the new style, but a braceless if has actually been part of the cause of a nontrivial security bug, see https://www.imperialviolet.org/2014/02/22/applebug.html)

vasild commented at 5:01 pm on August 20, 2020:

Done - diff together with other suggestions.

in src/netaddress.cpp:889 in 8b7f2c13b6 outdated

926+        }
927+        cidr += NetmaskBits(netmask[i]);
928     }
929 
930-    return network.ToString() + "/" + strNetmask;
931+    return network.ToString() + strprintf("/%u", cidr);

sipa commented at 2:26 am on August 20, 2020:

Here too it seems you’re dropping support for non-CIDR masks. Is that intentional?

vasild commented at 5:05 pm on August 20, 2020:

Yes, see #19628 (comment) - it is mentioned in the commit message and also some tests are adjusted.

vasild force-pushed on Aug 20, 2020

vasild commented at 4:52 pm on August 20, 2020: member

Addressed suggestions:

1. Fixed a bug in netmask validation and extended a test to cover that case. 2. Introduced HasPrefix() and used it where appropriate. 3. Moved an assert to where m_net is set. 4. Copy part of our address using std::vector::insert(). 5. Added braces around single statement for and if.

vasild commented at 5:04 pm on August 20, 2020: member

A note on invalid netmasks and sub-netting non IP networks:

A netmask that contains 1-bits after 0-bits (the 1-bits are not contiguous on the left side) is invalid. IMO as invalid as “255.255.390.0” [1] [2].

The code before this PR used to parse and accept such non-left-contiguous netmasks. That has been changed to not accept them because the new code is more robust that way and it is nonsensical anyway. An IRC discussion around non-left-contiguous netmasks.

This PR also restricts subnetting to just IPv4 and IPv6 (subnetting of TOR or NET_INTERNAL is meaningless).

Those changes are mentioned in the commit message and in the PR description:

Do not accept invalid netmasks that have 0-bits followed by 1-bits and only allow subnetting for IPv4 and IPv6

Since some comments above [3] [4] I looked at this carefully and managed to restore the original behavior (iuck!):

  0diff --git i/src/netaddress.cpp w/src/netaddress.cpp
  1index 8d73d4a58..1d4f43fcd 100644
  2--- i/src/netaddress.cpp
  3+++ w/src/netaddress.cpp
  4@@ -810,28 +810,16 @@ static inline int NetmaskBits(uint8_t x)
  5     default: return -1;
  6     }
  7 }
  8 
  9 CSubNet::CSubNet(const CNetAddr& addr, const CNetAddr& mask) : CSubNet()
 10 {
 11-    valid = (addr.IsIPv4() || addr.IsIPv6()) && addr.m_net == mask.m_net;
 12+    valid = addr.m_net == mask.m_net;
 13     if (!valid) {
 14         return;
 15     }
 16-    // Check if `mask` contains 1-bits after 0-bits (which is an invalid netmask).
 17-    bool zeros_found = false;
 18-    for (auto b : mask.m_addr) {
 19-        const int num_bits = NetmaskBits(b);
 20-        if (num_bits == -1 || (zeros_found && num_bits != 0)) {
 21-            valid = false;
 22-            return;
 23-        }
 24-        if (num_bits < 8) {
 25-            zeros_found = true;
 26-        }
 27-    }
 28 
 29     assert(mask.m_addr.size() <= sizeof(netmask));
 30 
 31     memcpy(netmask, mask.m_addr.data(), mask.m_addr.size());
 32 
 33     network = addr;
 34@@ -841,16 +829,13 @@ CSubNet::CSubNet(const CNetAddr& addr, const CNetAddr& mask) : CSubNet()
 35         network.m_addr[x] &= netmask[x];
 36     }
 37 }
 38 
 39 CSubNet::CSubNet(const CNetAddr& addr) : CSubNet()
 40 {
 41-    valid = addr.IsIPv4() || addr.IsIPv6();
 42-    if (!valid) {
 43-        return;
 44-    }
 45+    valid = true;
 46 
 47     assert(addr.m_addr.size() <= sizeof(netmask));
 48 
 49     memset(netmask, 0xFF, addr.m_addr.size());
 50 
 51     network = addr;
 52@@ -872,24 +857,54 @@ bool CSubNet::Match(const CNetAddr &addr) const
 53     }
 54     return true;
 55 }
 56 
 57 std::string CSubNet::ToString() const
 58 {
 59-    assert(network.m_addr.size() <= sizeof(netmask));
 60-
 61-    uint8_t cidr = 0;
 62+    /* Parse binary 1{n}0{N-n} to see if mask can be represented as /n */
 63+    int cidr = 0;
 64+    bool valid_cidr = true;
 65+    int size = network.m_addr.size();
 66+    int i;
 67+    for (i = 0; i < size && netmask[i] == 0xff; ++i) {
 68+        cidr += 8;
 69+    }
 70+    if (i < size) {
 71+        int bits = NetmaskBits(netmask[i]);
 72+        if (bits < 0) {
 73+            valid_cidr = false;
 74+        } else {
 75+            cidr += bits;
 76+        }
 77+        ++i;
 78+    }
 79+    for (; i < size && valid_cidr; ++i) {
 80+        if (netmask[i] != 0x00) {
 81+            valid_cidr = false;
 82+        }
 83+    }
 84 
 85-    for (size_t i = 0; i < network.m_addr.size(); ++i) {
 86-        if (netmask[i] == 0x00) {
 87-            break;
 88+    /* Format output */
 89+    std::string strNetmask;
 90+    if (valid_cidr) {
 91+        strNetmask = strprintf("%u", cidr);
 92+    } else {
 93+        if (network.IsIPv4()) {
 94+            strNetmask = strprintf("%u.%u.%u.%u", netmask[0], netmask[1], netmask[2], netmask[3]);
 95+        } else {
 96+            // Notice - this is not a valid representation of an IPv6 netmask.
 97+            // They are only defined to use CIDR notation.
 98+            strNetmask = strprintf("%x:%x:%x:%x:%x:%x:%x:%x",
 99+                             netmask[0] << 8 | netmask[1], netmask[2] << 8 | netmask[3],
100+                             netmask[4] << 8 | netmask[5], netmask[6] << 8 | netmask[7],
101+                             netmask[8] << 8 | netmask[9], netmask[10] << 8 | netmask[11],
102+                             netmask[12] << 8 | netmask[13], netmask[14] << 8 | netmask[15]);
103         }
104-        cidr += NetmaskBits(netmask[i]);
105     }
106 
107-    return network.ToString() + strprintf("/%u", cidr);
108+    return network.ToString() + "/" + strNetmask;
109 }
110 
111 bool CSubNet::IsValid() const
112 {
113     return valid;
114 }
115diff --git i/src/test/netbase_tests.cpp w/src/test/netbase_tests.cpp
116index 6681c92bb..47e9f854e 100644
117--- i/src/test/netbase_tests.cpp
118+++ w/src/test/netbase_tests.cpp
119@@ -295,17 +295,15 @@ BOOST_AUTO_TEST_CASE(subnet_test)
120     subnet = ResolveSubNet("1:2:3:4:5:6:7:8/ffff:0000:0000:0000:0000:0000:0000:0000");
121     BOOST_CHECK_EQUAL(subnet.ToString(), "1::/16");
122     subnet = ResolveSubNet("1:2:3:4:5:6:7:8/0000:0000:0000:0000:0000:0000:0000:0000");
123     BOOST_CHECK_EQUAL(subnet.ToString(), "::/0");
124     // Invalid netmasks (with 1-bits after 0-bits)
125     subnet = ResolveSubNet("1.2.3.4/255.255.232.0");
126-    BOOST_CHECK(!subnet.IsValid());
127-    subnet = ResolveSubNet("1.2.3.4/255.0.255.255");
128-    BOOST_CHECK(!subnet.IsValid());
129+    BOOST_CHECK_EQUAL(subnet.ToString(), "1.2.0.0/255.255.232.0");
130     subnet = ResolveSubNet("1:2:3:4:5:6:7:8/ffff:ffff:ffff:fffe:ffff:ffff:ffff:ff0f");
131-    BOOST_CHECK(!subnet.IsValid());
132+    BOOST_CHECK_EQUAL(subnet.ToString(), "1:2:3:4:5:6:7:8/ffff:ffff:ffff:fffe:ffff:ffff:ffff:ff0f");
133 }
134 
135 BOOST_AUTO_TEST_CASE(netbase_getgroup)
136 {
137     std::vector<bool> asmap; // use /16
138     BOOST_CHECK(ResolveIP("127.0.0.1").GetGroup(asmap) == std::vector<unsigned char>({0})); // Local -> !Routable()
139@@ -433,14 +431,13 @@ BOOST_AUTO_TEST_CASE(netbase_dont_resolve_strings_with_embedded_nul_characters)
140     BOOST_CHECK(!LookupHost(std::string("127.0.0.1\0example.com\0", 22), addr, false));
141     CSubNet ret;
142     BOOST_CHECK(LookupSubNet(std::string("1.2.3.0/24", 10), ret));
143     BOOST_CHECK(!LookupSubNet(std::string("1.2.3.0/24\0", 11), ret));
144     BOOST_CHECK(!LookupSubNet(std::string("1.2.3.0/24\0example.com", 22), ret));
145     BOOST_CHECK(!LookupSubNet(std::string("1.2.3.0/24\0example.com\0", 23), ret));
146-    // We only do subnetting for IPv4 and IPv6
147-    BOOST_CHECK(!LookupSubNet(std::string("5wyqrzbvrdsumnok.onion", 22), ret));
148+    BOOST_CHECK(LookupSubNet(std::string("5wyqrzbvrdsumnok.onion", 22), ret));
149     BOOST_CHECK(!LookupSubNet(std::string("5wyqrzbvrdsumnok.onion\0", 23), ret));
150     BOOST_CHECK(!LookupSubNet(std::string("5wyqrzbvrdsumnok.onion\0example.com", 34), ret));
151     BOOST_CHECK(!LookupSubNet(std::string("5wyqrzbvrdsumnok.onion\0example.com\0", 35), ret));
152 }
153 
154 BOOST_AUTO_TEST_SUITE_END()

There are a few options:

Leave this PR as is, or
Amend the above patch into the sole commit of this PR and
- Reverse-apply the patch as a separate (second) commit in this PR, or
- Reverse-apply the patch as a separate PR after this one gets merged

What do you think?

in src/netaddress.h:191 in 86b2397fae outdated

188+
189+    private:
190+        /**
191+         * Size of CNetAddr when serialized as ADDRv1 (pre-BIP155) (in bytes).
192+         */
193+        static constexpr size_t V1_SERIALIZATION_SIZE = 16;

ryanofsky commented at 6:28 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

Maybe s/16/ADDR_IPV6_SIZE/ since code would be broken using any other size.

vasild commented at 11:31 am on August 21, 2020:

Done.

in src/netaddress.h:226 in 86b2397fae outdated

223+                memcpy(arr, INTERNAL_IN_IPV6_PREFIX, prefix_size);
224+                memcpy(arr + prefix_size, m_addr.data(), m_addr.size());
225+                return;
226+            case NET_UNROUTABLE:
227+            case NET_MAX:
228+                memset(arr, 0x0, sizeof(arr));

ryanofsky commented at 6:32 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

Can you add a comment to explain why this is serializing 0’s rather than asserting false in both of these cases? Maybe say when this code could be called to serialize unroutable addresses and when it could be called to serialize invalid NET_MAX addresses.

vasild commented at 11:33 am on August 21, 2020:

There is no reason to serialize as 0s. NET_UNROUTABLE and NET_MAX are never set to m_net and must never be set. Changed this to assert(false).

in src/netaddress.cpp:58 in 86b2397fae outdated

64+    m_addr = ipIn.m_addr;
65 }
66 
67-void CNetAddr::SetLegacyIPv6(const uint8_t ipv6[16])
68+template <typename T, size_t PREFIX_LEN>
69+inline bool HasPrefix(const T& obj, const uint8_t (&prefix)[PREFIX_LEN])

elichai commented at 6:40 pm on August 20, 2020:

I had to google this syntax :/ (for a second I thought it was a C trick to make arrays work, but it’s actually C++) could we please use const std::array<uint8_t, N>& instead? :)

elichai commented at 6:41 pm on August 20, 2020:

or just make this generic over all containers

MarcoFalke commented at 5:47 am on August 21, 2020:

generic over all containers

Span !?

vasild commented at 7:54 am on August 21, 2020:

I can’t figure out how to use std::array or Span and keep the usage to just HasPrefix(m_addr, {192, 168}) (which is sweet!), so I am keeping the reference to array.

MarcoFalke commented at 4:36 pm on August 21, 2020:

The current one doesn’t compile either on gcc 4.8:

 0netaddress.cpp: In member function ‘bool CNetAddr::IsRFC3927() const’:
 1
 2netaddress.cpp:163:52: error: no matching function for call to ‘HasPrefix(const prevector<16u, unsigned char>&, <brace-enclosed initializer list>)’
 3
 4     return IsIPv4() && HasPrefix(m_addr, {169, 254});
 5
 6                                                    ^
 7
 8netaddress.cpp:163:52: note: candidate is:
 9
10netaddress.cpp:52:13: note: template<class T1, class T2, unsigned int PREFIX_LEN> bool HasPrefix(const T1&, const T2 (&)[PREFIX_LEN])
11
12 inline bool HasPrefix(const T1& obj, const T2 (&prefix)[PREFIX_LEN])
13
14             ^
15
16netaddress.cpp:52:13: note:   template argument deduction/substitution failed:
17
18netaddress.cpp:163:52: note:   couldn't deduce template parameter ‘T2’
19
20     return IsIPv4() && HasPrefix(m_addr, {169, 254});
21
22                                                    ^

sipa commented at 4:39 pm on August 21, 2020:

@vasild Nice, that indeed works, and I don’t see how that could work with std::array (the on-the-fly constructed array’s length template parameter can’t be figured out from the list of elements). It certainly can’t work for Span as there needs to be some object that actually holds the elements. @elichai I guess you have an answer to the question you asked on IRC yesterday “why would someone use a C-style array over std::array?”.

sipa commented at 6:59 pm on August 21, 2020:

@vasild @MarcoFalke I think that can be fixed by removing the T2 template parameter, and just hardcoding it to an array of uint8_t (but templated size).

elichai commented at 11:35 am on August 23, 2020:

@sipa Yep, I guess C arrays are still treated quite differently than anything else (more like initilaizer_list etc) :( (interesting: https://stackoverflow.com/a/61390991/1621927)

vasild commented at 11:57 am on August 24, 2020:

I think that can be fixed by removing the T2 template parameter, and just hardcoding it to an array of uint8_t

I think that can be fixed by removing the T2 template parameter, and just hardcoding it to an array of uint8_t (but templated size).

It does not compile either because then gcc 4.8 complained it cannot deduce the length from the initialiser list, so I switched to std::array :-/

vasild commented at 10:37 am on August 25, 2020:

This ended up with const std::array<uint8_t, N>& anyway, due to GCC 4.8. Will change it back to reference to an array once the support for GCC 4.8 is removed.

in src/netbase.cpp:844 in 86b2397fae outdated

838@@ -838,10 +839,13 @@ bool LookupSubNet(const std::string& strSubnet, CSubNet& ret)
839         if (slash != strSubnet.npos)
840         {
841             std::string strNetmask = strSubnet.substr(slash + 1);
842-            int32_t n;
843-            if (ParseInt32(strNetmask, &n)) {
844+            uint32_t n;
845+            if (ParseUInt32(strNetmask, &n)) {
846+                if (n > std::numeric_limits<uint8_t>::max()) {

ryanofsky commented at 6:40 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

Having a ParseUInt8() function would seem more appealing than layering uint8 range checking on top of uint32 range checking done by ParseUInt32

vasild commented at 11:35 am on August 21, 2020:

Right, added ParseUint8().

in src/netaddress.cpp:27 in 86b2397fae outdated

36-{
37-    memset(ip, 0, sizeof(ip));
38-}
39+CNetAddr::CNetAddr() {}
40 
41 void CNetAddr::SetIP(const CNetAddr& ipIn)

ryanofsky commented at 7:07 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

Precedes this PR, so maybe not worth changing here, but this method seems more awkward to use for assignment than plain operator=, and the SetIP method name is misleading since it works for all addresses, not just IP addresses. I think it would better to just use plain = for address assignment, and have a separate AssertValid() method if it’s helpful to have runtime asserts in places like this.

vasild commented at 8:49 am on August 21, 2020:

I agree. I considered swapping this for operator=, but cut it off, trying to keep the patch small. There are so many things that can be improved around this code!

in src/netaddress.cpp:97 in 86b2397fae outdated

117 
118 /**
119- * Try to make this a dummy address that maps the specified onion address into
120- * IPv6 using OnionCat's range and encoding. Such dummy addresses have a prefix
121- * of fd87:d87e:eb43::/48 and are guaranteed to not be publicly routable as they
122- * fall under RFC4193's fc00::/7 subnet allocated to unique-local addresses.

ryanofsky commented at 7:19 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

It would be nice not to drop this explanation. I think it could move to the SerializeV1Array function or be incorporated in the TORV2_IN_IPV6_PREFIX comment.

vasild commented at 11:36 am on August 21, 2020:

Moved the comment to TORV2_IN_IPV6_PREFIX and similar with SetInternal() and INTERNAL_IN_IPV6_PREFIX.

in src/netaddress.cpp:133 in 86b2397fae outdated

154 }
155 
156 CNetAddr::CNetAddr(const struct in6_addr& ipv6Addr, const uint32_t scope)
157 {
158-    SetRaw(NET_IPV6, (const uint8_t*)&ipv6Addr);
159+    SetLegacyIPv6(Span<const uint8_t>(reinterpret_cast<const uint8_t*>(&ipv6Addr), sizeof(ipv6Addr)));

ryanofsky commented at 7:29 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

Not sure if it’s portable, but it seems like you can avoid the casting and span declaration on this line with just SetLegacyIPv6(ipv6Addr.s6_addr);

vasild commented at 10:01 am on August 21, 2020:

Looks like it is not: https://docs.microsoft.com/en-us/windows/win32/api/in6addr/ns-in6addr-in6_addr :-(

in src/netaddress.cpp:127 in 86b2397fae outdated

147 
148 CNetAddr::CNetAddr(const struct in_addr& ipv4Addr)
149 {
150-    SetRaw(NET_IPV4, (const uint8_t*)&ipv4Addr);
151+    m_net = NET_IPV4;
152+    const uint8_t* ptr = reinterpret_cast<const uint8_t*>(&ipv4Addr);

ryanofsky commented at 7:41 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

Maybe would be good to static assert that in_addr size is the same as ADDR_IPV4_SIZE

vasild commented at 10:07 am on August 21, 2020:

I think that’s not necessary. It was also the case before this PR that struct in_addr would be assumed to be 4 bytes. I don’t see this ever changing.

in src/netaddress.cpp:395 in 86b2397fae outdated

397@@ -391,7 +398,7 @@ bool CNetAddr::GetInAddr(struct in_addr* pipv4Addr) const
398 {
399     if (!IsIPv4())
400         return false;
401-    memcpy(pipv4Addr, ip+12, 4);
402+    memcpy(pipv4Addr, m_addr.data(), m_addr.size());

ryanofsky commented at 8:54 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

Now that this is calling memcpy with dynamic size rather than fixed size, I think it’d be good to add an assert to check that it doesn’t overflow the in_addr struct. Same suggestion for in6_addr struct below.

vasild commented at 11:38 am on August 21, 2020:

Done

in src/netaddress.cpp:435 in 86b2397fae outdated

433-        return ReadBE32(ip + 12);
434+    if (IsIPv4()) {
435+        return ReadBE32(m_addr.data());
436+    } else if (IsRFC6052() || IsRFC6145()) {
437+        // mapped IPv4, SIIT translated IPv4: the IPv4 address is the last 4 bytes of the address
438+        return ReadBE32(m_addr.data() + m_addr.size() - ADDR_IPV4_SIZE);

ryanofsky commented at 9:02 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

I think this could be a little safer with bounds checking, and more readable as ReadBE32(MakeSpan(m_addr).last(ADDR_IPV4_SIZE).data()). Similarly for RFC4380 case below.

vasild commented at 11:39 am on August 21, 2020:

Done

in src/netaddress.cpp:546 in 86b2397fae outdated

548-        vchRet.push_back(GetByte(15 - nStartByte));
549-        nStartByte++;
550-        nBits -= 8;
551-    }
552+    // Push our address onto vchRet.
553+    const size_t i = nBits >> 3;

ryanofsky commented at 9:18 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

Probably my fault, but I did spend a minute looking at this diff being baffled by nBits >> 3 and nBits &= 0x7. Would suggest changing to size_t bytes = nBits / 8 and nBits = nBits % 8 to give a few more hints for comprehension.

vasild commented at 11:40 am on August 21, 2020:

Done

in src/netaddress.cpp:551 in 86b2397fae outdated

555+    nBits &= 0x7;
556     // ...for the last byte, push nBits and for the rest of the byte push 1's
557-    if (nBits > 0)
558-        vchRet.push_back(GetByte(15 - nStartByte) | ((1 << (8 - nBits)) - 1));
559+    if (nBits > 0) {
560+        assert(i < m_addr.size());

ryanofsky commented at 9:24 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

It seems like this assert would make more sense moved before the vchRet.insert statement above, and not conditioned on nBits divisibility.

vasild commented at 10:36 am on August 21, 2020:

It is here because here we access m_addr[i]. I would normally use m_addr.at(i) if there is a chance it may get out of bounds and not write an explicit separate assert, but prevector does not have an at() method :/

ryanofsky commented at 11:28 pm on August 24, 2020:

re: #19628 (review)

It is here because here we access m_addr[i]. I would normally use m_addr.at(i) if there is a chance it may get out of bounds and not write an explicit separate assert, but prevector does not have an at() method :/

Right, but if you assume the assert is false and num_bytes >= m_addr.size(), then the expression a few lines earlier m_addr.begin() + num_bytes will read from uninitialized memory. So I’m saying it might make sense to move the assert earlier there. Not very important, though.

in src/netaddress.cpp:562 in 86b2397fae outdated

572+}
573+
574 uint64_t CNetAddr::GetHash() const
575 {
576-    uint256 hash = Hash(ip);
577+    uint256 hash = Hash(m_addr);

ryanofsky commented at 9:31 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

Might be worth mentioning in commit message CNetAddr now has returns different hash values. It seems like GetHash is only called one place in net_processing. Maybe this is the only change in the PR that’s not just refactoring and actually affects externally observable behavior.

vasild commented at 11:03 am on August 21, 2020:

Hmm, that hash is only used when relaying addresses and is not saved on disk or sent over the network. The current time is also thrown to the hasher, together with the result of the changed CNetAddr::GetHash() and also with the relay destination. So the final result from the hasher is likely to change even after just a bitcoind restart (because it will connect to other nodes) or after 24h or less. I think it is not necessary to mention this in the commit message.

sipa commented at 4:31 pm on August 21, 2020:

The hashing (in RelayAddress) is actually salted by a value that’s randomly generated at startup (through CConnman::GetDeterministicRandomizer), so it will in fact always be different after restart.

You could add a comment in the commit description, but I don’t think it needs more exposure than that.

in src/netaddress.cpp:893 in 86b2397fae outdated

900@@ -893,12 +901,18 @@ bool CSubNet::IsValid() const
901 
902 bool operator==(const CSubNet& a, const CSubNet& b)
903 {
904-    return a.valid == b.valid && a.network == b.network && !memcmp(a.netmask, b.netmask, 16);
905+    const size_t size = a.network.IsIPv4() ? ADDR_IPV4_SIZE : ADDR_IPV6_SIZE;

ryanofsky commented at 9:44 pm on August 20, 2020:

In commit “net: change CNetAddr::ip to have flexible size” (86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3)

Value of size variable used here seems fragile. Seems like it would be better to not assume any particular network type and use network.m_addr_size() like code above seems to do.

Same suggestion for operator < below.

vasild commented at 11:42 am on August 21, 2020:

a.network.m_addr is protected here and can’t be accessed. Anyway, since we only do subnetting of IPv4 and IPv6 I added an assert to ensure that this is the case.

ryanofsky approved

ryanofsky commented at 10:05 pm on August 20, 2020: member

Code review ACK 86b2397faedcd9b57649ddb90e98c8cf6a4a6ad3. Left some suggestions below which are minor and can be ignored. I reviewed this pretty thoroughly and didn’t find any actual problems.

I can’t weigh in on decision to drop support for discontiguous netmasks. Obviously that blocks current PR if it can’t be done here, but otherwise this change looks very good in its current form.

vasild force-pushed on Aug 21, 2020

vasild commented at 11:26 am on August 21, 2020: member

@ryanofsky, thanks for the valuable input! Most of the suggestions addressed.

Changes since the last push:

1. Fixed compilation with GCC 8 (HasPrefix()) 2. Simplified the check for INADDR_ANY and INADDR_NONE 3. Set V1_SERIALIZATION_SIZE to ADDR_IPV6_SIZE 4. assert(false) if we try to serialize NET_UNROUTABLE or NET_MAX 5. Moved the comments about the dummy IPv6 addresses being unroutable 6. Added two more asserts 7. Used Span::last() instead of pointer arithmetics 8. Used / and % instead of bitwise >> and & to improve readability

sipsorcery commented at 1:40 pm on August 21, 2020: member

Concept ACK.

I’ve dones some tests with bitcoind -testnet between Windows and Linux and IPv4 and IPv6 nodes are successfully communicating.

vasild force-pushed on Aug 21, 2020

vasild commented at 2:06 pm on August 21, 2020: member

Revert any changes to bool operator==(const CSubNet& a, const CSubNet& b) and operator< - they are not necessary and a recent tweak upset the fuzzer.

It is ok to compare the entire netmask[16] even for IPv4 because we memset it to 0s in the constructors, so we will compare the trailing 12 zero bytes unnecessary. That is closer to the original behavior.

sipa commented at 8:48 pm on August 21, 2020: member

Regarding the change to remove support arbitrary bitmasks as mask, if it’s easier to make this change while dropping them, that’s what we should do. They’re undocumented, unexpected, useless as far as I can see, and I would be very surprised if anyone was using it.

My preference would be to do it as a separate commit, as it’s semantically a very different change than goal of the PR. I would expect that restricting netmasks to CIDR first, and then changing the ip field to have flexible size would be easiest, but I haven’t tried.

I do think it’s worth adding release notes for, so that it’s at least documented that that change was made, on the off chance anyone is using it.

jonatack commented at 6:02 pm on August 23, 2020: member

Checkpoint WIP code review and tested re-ACK f37cde9, caught up with the discussion/feedback, reviewed changes since my last review per git diff 8b7f2c1 f37cde9 and verified a few aspects adding/removing/godbolting things, re-reviewed the code, and running a node for a few hours now.

Nit, with the latest push this code is no longer being changed but could perhaps document the 16 value in some way or other.

 0--- a/src/netaddress.cpp
 1+++ b/src/netaddress.cpp
 2@@ -888,12 +888,12 @@ bool CSubNet::IsValid() const
 3 
 4 bool operator==(const CSubNet& a, const CSubNet& b)
 5 {
 6-    return a.valid == b.valid && a.network == b.network && !memcmp(a.netmask, b.netmask, 16);
 7+    return a.valid == b.valid && a.network == b.network && !memcmp(a.netmask, b.netmask, ADDR_IPV6_SIZE);
 8 }
 9 
10 bool operator<(const CSubNet& a, const CSubNet& b)
11 {
12-    return (a.network < b.network || (a.network == b.network && memcmp(a.netmask, b.netmask, 16) < 0));
13+    return (a.network < b.network || (a.network == b.network && memcmp(a.netmask, b.netmask, ADDR_IPV6_SIZE) < 0));
14 }

vasild force-pushed on Aug 24, 2020

vasild commented at 6:42 pm on August 24, 2020: member

Changes in the last push - fix compilation with GCC 4.8.

net: don't accept non-left-contiguous netmasks

A netmask that contains 1-bits after 0-bits (the 1-bits are not
contiguous on the left side) is invalid [1] [2].

The code before this PR used to parse and accept such
non-left-contiguous netmasks. However, a coming change that will alter
`CNetAddr::ip` to have flexible size would make juggling with such
netmasks more difficult, thus drop support for those.

[1] https://en.wikipedia.org/wiki/Classless_Inter-Domain_Routing#Subnet_masks
[2] https://tools.ietf.org/html/rfc4632#section-5.1

1ea57ad674

net: change CNetAddr::ip to have flexible size

Before this change `CNetAddr::ip` was a fixed-size array of 16 bytes,
not being able to store larger addresses (e.g. TORv3) and encoded
smaller ones as 16-byte IPv6 addresses.

Change its type to `prevector`, so that it can hold larger addresses and
do not disguise non-IPv6 addresses as IPv6. So the IPv4 address
`1.2.3.4` is now encoded as `01020304` instead of
`00000000000000000000FFFF01020304`.

Rename `CNetAddr::ip` to `CNetAddr::m_addr` because it is not an "IP" or
"IP address" (TOR addresses are not IP addresses).

In order to preserve backward compatibility with serialization (where
e.g. `1.2.3.4` is serialized as `00000000000000000000FFFF01020304`)
introduce `CNetAddr` dedicated legacy serialize/unserialize methods.

Adjust `CSubNet` accordingly. Still use `CSubNet::netmask[]` of fixed 16
bytes, but use the first 4 for IPv4 (not the last 4). Only allow
subnetting for IPv4 and IPv6.

Co-authored-by: Carl Dong <contact@carldong.me>

102867c587

vasild force-pushed on Aug 24, 2020

vasild commented at 8:00 pm on August 24, 2020: member

Regarding the change to remove support arbitrary bitmasks as mask … My preference would be to do it as a separate commit

Done in the last push: this PR is now split in 2 commits with the cumulative diff equal to the single commit from before (modulo the added release notes).

A further improvement would be to change uint8_t CSubNet::netmask[16] to uint8_t CSubNet::m_cidr (as a separate PR). @jonatack It was hard for me to stop myself from s/16/ADDR_IPV6_SIZE in operator== but I resisted because that would also drag a similar change in the definition of CSubNet::netmask[16] and invite other improvements to CSubNet which are not necessary for this PR. A followup PR is warranted.

in doc/release-notes.md:138 in 1ea57ad674 outdated

134@@ -135,6 +135,10 @@ Updated settings
135   in future releases. Refer to the help of the affected settings `-whitebind`
136   and `-whitelist` for more details. (#19191)
137 
138+- Netmasks that contain 1-bits after 0-bits (the 1-bits are not contiguous on

sipa commented at 9:11 pm on August 24, 2020:

Perhaps point out where these are used. I think the -rpcallowip parameter, and the setban RPC are the only ones?

vasild commented at 10:40 am on August 25, 2020:

Will do so in a followup.

in src/netaddress.cpp:278 in 102867c587

274@@ -272,13 +275,16 @@ bool CNetAddr::IsValid() const
275     // header20 vectorlen3 addr26 addr26 addr26 header20 vectorlen3 addr26 addr26 addr26...
276     // so if the first length field is garbled, it reads the second batch
277     // of addr misaligned by 3 bytes.
278-    if (IsIPv6() && memcmp(ip, pchIPv4+3, sizeof(pchIPv4)-3) == 0)
279+    if (IsIPv6() && memcmp(m_addr.data(), IPV4_IN_IPV6_PREFIX.data() + 3,

sipa commented at 9:34 pm on August 24, 2020:

Not a comment on this PR (it makes sense to not change any further behavior), but 0.2.8 is over 10 years old now, and can’t even communicate with current software due to the mandatory checksum change in 2012. It’s probably fine to drop this code at some point.

sipa commented at 11:00 pm on August 24, 2020: member

utACK 102867c587f5f7954232fb8ed8e85cda78bb4d32

ryanofsky approved

ryanofsky commented at 11:32 pm on August 24, 2020: member

Code review ACK 102867c587f5f7954232fb8ed8e85cda78bb4d32. Just many suggested updates since last review. Thanks for following up on everything!

It’s a shame this had to switch to std::array everywhere. Makes the code more verbose and fragile, but I guess it was the easiest way to work around a gcc 4.8 #19628 (review) compiler issue

sipa commented at 11:36 pm on August 24, 2020: member

FWIW, if we just want to syntactic advantage of being able to write HasPrefix(m_addr, {1,2,3,4}), that could be done by making the argument std::vector<uint8_t>. I don’t think it’s worth changing that, especially as we’ll probably be dropping support for GCC 4.8 soon anyway.

in doc/release-notes.md:140 in 102867c587

134@@ -135,6 +135,10 @@ Updated settings
135   in future releases. Refer to the help of the affected settings `-whitebind`
136   and `-whitelist` for more details. (#19191)
137 
138+- Netmasks that contain 1-bits after 0-bits (the 1-bits are not contiguous on
139+  the left side, e.g. 255.0.255.255) are no longer accepted. They are invalid
140+  according to RFC 4632.

jonatack commented at 3:05 am on August 25, 2020:

Append " (#19628)"

vasild commented at 10:42 am on August 25, 2020:

Will do so in a followup, to keep the current ACKs.

jonatack commented at 3:24 am on August 25, 2020: member

re-ACK 102867c587f5f7954232fb8ed8e85cda78bb4d32 diff review, code review, build/tests/running bitcoind with ipv4/ipv6/onion peers

in src/netaddress.cpp:53 in 102867c587

59+    m_addr = ipIn.m_addr;
60 }
61 
62-void CNetAddr::SetLegacyIPv6(const uint8_t ipv6[16])
63+template <typename T1, size_t PREFIX_LEN>
64+inline bool HasPrefix(const T1& obj, const std::array<uint8_t, PREFIX_LEN>& prefix)

kallewoof commented at 3:35 am on August 25, 2020:

μ-nit: not sure why this is T1 and not simply T

vasild commented at 10:48 am on August 25, 2020:

This was T initially, then was changed to T1 and T2, then I dropped the second one but forgot to rename T1 to T. Will take are about it when touching this after GCC 4.8 is no longer supported.

in src/netaddress.cpp:253 in 102867c587

250+    }
251 
252     // IPv6 loopback (::1/128)
253     static const unsigned char pchLocal[16] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1};
254-    if (IsIPv6() && memcmp(ip, pchLocal, 16) == 0)
255+    if (IsIPv6() && memcmp(m_addr.data(), pchLocal, sizeof(pchLocal)) == 0) {

kallewoof commented at 4:08 am on August 25, 2020:

If we for a brief moment pretended that prevector.h:258 had this line added to it:

0    static prevector from(std::initializer_list<T> il) { return prevector(il.begin(), il.end()); }

then it would have been worth it to do:

0static auto ipv6_loopback = prevector<16, uint8_t>::from({0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1});
1if (IsIPv6() && m_addr == ipv6_loopback) {

Alas, that prevector line would need to be added. Not sure it’s worth it, or if it should be its own PR perhaps.

Edit: I initially suggested doing an initializer list constructor, but that interferes e.g. with the 2 element constructor messing stuff up and being generally unpredictable. The above may be too messy to be warranted, but I’ll leave it up anyway.

vasild commented at 10:52 am on August 25, 2020:

I think it is also important to not deviate from std::vector interface so that it is not too much hassle should one decide to swap prevector for std::vector or the other way around.

Btw, std::vector has an initializer list constructor.

Anyway - leaving this as is.

kallewoof commented at 1:32 am on August 26, 2020:

Yeah, it may be that my lines below (the ipNode6) were in fact invalid, and were the cause for the errors I saw when I did this as a prevector constructor, where I didn’t initially cast 16 to size_t in ipNone6{16, 0x0}. I much prefer the constructor variant, which is also closer to std::vector. FTR, that was

0prevector(std::initializer_list<T> il) : prevector(il.begin(), il.end()) {}

and the line here beomes

0static prevector<ADDR_IPV6_SIZE, uint8_t> ipv6_loopback{0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1};

vasild commented at 8:07 am on August 26, 2020:

I wonder if we can avoid defining a local variable for IPv6 loopback address and use IN6ADDR_LOOPBACK_INIT or in6addr_loopback…

in src/netaddress.cpp:285 in 102867c587

282+    }
283 
284     // unspecified IPv6 address (::/128)
285     unsigned char ipNone6[16] = {};
286-    if (IsIPv6() && memcmp(ip, ipNone6, 16) == 0)
287+    if (IsIPv6() && memcmp(m_addr.data(), ipNone6, sizeof(ipNone6)) == 0) {

kallewoof commented at 4:32 am on August 25, 2020:

This could be

0    static prevector<ADDR_IPV6_SIZE, uint8_t> ipNone6{ADDR_IPV6_SIZE, 0x0};
1    if (IsIPv6() && m_addr == ipNone6) {

vasild commented at 10:58 am on August 25, 2020:

Right! I will apply it if I modify this PR.

ADDR_IPV6_SIZE is of type size_t, the typecast should not be needed?

kallewoof commented at 1:28 am on August 26, 2020:

Yeah, I had originally put 16 there but realized ADDR_IPV6_SIZE was better.

kallewoof commented at 4:37 am on August 25, 2020: member

Partial re-review ACK

Some nits, most of which increase the diff, so I definitely understand if you choose to ignore them.

kallewoof commented at 8:04 am on August 25, 2020: member

ACK 102867c587f5f7954232fb8ed8e85cda78bb4d32

Reviewed changes, ran node, did IBD to ~200k, added peers etc (did not test onion routes).

vasild commented at 11:07 am on August 25, 2020: member

This, I think, is settled and gathered some ACKs.

Further changes to be done as a followup:

Change uint8_t CSubNet::netmask[16] to uint8_t CSubNet::m_cidr and other improvements around CSubNet that are highly desirable, but not required for this PR, thus not included in it.
Elaborate on the release notes (open this diff as new PR). Done: #19802
Use the cleaner HasPrefix(m_addr, {1, 2, 3}) once support for GCC 4.8 is dropped. Also s/T1/T/ in the template.

MarcoFalke commented at 4:09 pm on August 25, 2020: member

Concept ACK 102867c587f5f7954232fb8ed8e85cda78bb4d32

MarcoFalke commented at 4:10 pm on August 25, 2020: member

(edited OP to remove html comments)

MarcoFalke merged this on Aug 25, 2020

MarcoFalke closed this on Aug 25, 2020

vasild deleted the branch on Aug 25, 2020

sidhujag referenced this in commit 348398bb82 on Aug 25, 2020

laanwj removed this from the "Blockers" column in a project

MarcoFalke referenced this in commit 5edef20a65 on Aug 28, 2020

vasild commented at 1:31 pm on September 1, 2020: member

Next PR on the road to TORv3: https://github.com/bitcoin/bitcoin/pull/19845

laanwj referenced this in commit 72affcb16c on Sep 30, 2020

sidhujag referenced this in commit 079ccad86e on Sep 30, 2020

laanwj referenced this in commit e3b474c548 on Oct 15, 2020

sidhujag referenced this in commit 16a7f0f84f on Oct 16, 2020

Fabcien referenced this in commit 8bd41b771c on Feb 5, 2021

Fabcien referenced this in commit cbe0b91f3e on Feb 5, 2021

kittywhiskers referenced this in commit 3fe52df179 on May 20, 2021

kittywhiskers referenced this in commit 47906dc3ea on May 20, 2021

kittywhiskers referenced this in commit 8752621d85 on May 20, 2021

kittywhiskers referenced this in commit 1245d45604 on May 20, 2021

kittywhiskers referenced this in commit e84a603abc on May 21, 2021

kittywhiskers referenced this in commit 5aadf43da7 on May 21, 2021

kittywhiskers referenced this in commit 69b6d6a550 on May 22, 2021

kittywhiskers referenced this in commit 337c15ae25 on May 23, 2021

PastaPastaPasta referenced this in commit 41a8349d4d on May 23, 2021

PastaPastaPasta referenced this in commit f67aba4862 on Jun 27, 2021

PastaPastaPasta referenced this in commit 69e2115b46 on Jun 28, 2021

PastaPastaPasta referenced this in commit ae0ae43c01 on Jun 29, 2021

PastaPastaPasta referenced this in commit 10c3d63f2c on Jul 1, 2021

PastaPastaPasta referenced this in commit 1988592ea7 on Jul 1, 2021

PastaPastaPasta referenced this in commit 1ff7090066 on Jul 15, 2021

PastaPastaPasta referenced this in commit 5dc39741e9 on Jul 16, 2021

random-zebra referenced this in commit b4751e10ce on Aug 11, 2021

PastaPastaPasta referenced this in commit 05a4222f19 on Sep 17, 2021

PastaPastaPasta referenced this in commit 18e0bb20dd on Sep 19, 2021

PastaPastaPasta referenced this in commit 45b30c3b3a on Sep 24, 2021

kittywhiskers referenced this in commit 12afb98a99 on Oct 12, 2021

DeckerSU referenced this in commit 38edf5f033 on Feb 10, 2022

DrahtBot locked this on Feb 15, 2022

net: change CNetAddr::ip to have flexible size #19628

Conflicts