Here’s a reduced version:
0 return (std::to_integer<uint8_t>(bytes[(bytes.size() * 8 - bitpos) / 8]) >> ((7 + bitpos) % 8)) & 1;
I’m surprised by the way the shift amount changes for different bit positions. Would have expected something closer to this for big endian…
0 return (std::to_integer<uint8_t>(bytes[(bytes.size() * 8 - bitpos) / 8]) >> (7 - (bitpos % 8))) & 1;
….but it fails the unit tests.
Godbolt experiment https://godbolt.org/z/6rjEzxhPM outputs:
0PR bit shift expression (adfd8006...): (7 - ((bytes.size() * 8 - bitpos) % 8))
1PR bit shift reduced: (7 + bitpos) % 8
2Expected BE bit shift: 7 - (bitpos % 8)
3
4bitpos: 0, PR: 7, reduced: 7, expected: 7
5bitpos: 1, PR: 0, reduced: 0, expected: 6
6bitpos: 2, PR: 1, reduced: 1, expected: 5
7bitpos: 3, PR: 2, reduced: 2, expected: 4
8bitpos: 4, PR: 3, reduced: 3, expected: 3
9bitpos: 5, PR: 4, reduced: 4, expected: 2
10bitpos: 6, PR: 5, reduced: 5, expected: 1
11bitpos: 7, PR: 6, reduced: 6, expected: 0
12bitpos: 8, PR: 7, reduced: 7, expected: 7
13bitpos: 9, PR: 0, reduced: 0, expected: 6
14bitpos: 10, PR: 1, reduced: 1, expected: 5
15bitpos: 11, PR: 2, reduced: 2, expected: 4
16bitpos: 12, PR: 3, reduced: 3, expected: 3
17bitpos: 13, PR: 4, reduced: 4, expected: 2
18bitpos: 14, PR: 5, reduced: 5, expected: 1
19bitpos: 15, PR: 6, reduced: 6, expected: 0
However, when looking at the usage in Interpret(), we see if we would encounter JUMP as the first instruction, our bitpos is sort of out of bounds by one.
https://github.com/bitcoin/bitcoin/blob/b41f5a29f7d56da8fd157770ad29b5776c3684c6/src/util/asmap.cpp#L184
https://github.com/bitcoin/bitcoin/blob/b41f5a29f7d56da8fd157770ad29b5776c3684c6/src/util/asmap.cpp#L201
If we instead at the callsite make sure to send in in-bounds values - GetBitBE(ip, bits - 1), it turns out we can simplify GetBitBE quite a bit.
0--- a/src/util/asmap.cpp
1+++ b/src/util/asmap.cpp
2@@ -48,8 +48,8 @@ namespace {
3 constexpr uint32_t INVALID = 0xFFFFFFFF;
4
5 /**
6- * Extract a single bit from byte array using little-endian bit ordering.
7- * Used for ASMap data where bits are numbered right-to-left within each byte (LSB first).
8+ * Extract a single bit (LSB first) from byte array using little-endian byte ordering.
9+ * Used for ASMap data.
10 */
11 inline bool GetBitLE(std::span<const std::byte> bytes, uint32_t bitpos) noexcept
12 {
13@@ -57,12 +57,12 @@ inline bool GetBitLE(std::span<const std::byte> bytes, uint32_t bitpos) noexcept
14 }
15
16 /**
17- * Extract a single bit from byte array using big-endian bit ordering.
18+ * Extract a single bit (LSB first) from byte array using big-endian byte ordering.
19 * Used for IP addresses to match network byte order conventions.
20 */
21 inline bool GetBitBE(std::span<const std::byte> bytes, uint32_t bitpos) noexcept
22 {
23- return (std::to_integer<uint8_t>(bytes[(bytes.size() * 8 - bitpos) / 8]) >> (7 - ((bytes.size() * 8 - bitpos) % 8))) & 1;
24+ return (std::to_integer<uint8_t>(bytes[bytes.size() - ((bitpos + 8) / 8)]) >> (bitpos % 8)) & 1;
25 }
26
27 /**
28@@ -198,7 +198,7 @@ uint32_t Interpret(const std::span<const std::byte> asmap, const std::span<const
29 if (jump == INVALID) break; // Jump offset straddles EOF
30 if (bits == 0) break; // No input bits left
31 if (int64_t{jump} >= static_cast<int64_t>(endpos - pos)) break; // Jumping past EOF
32- if (GetBitBE(ip, bits)) { // Check next IP bit (big-endian)
33+ if (GetBitBE(ip, bits - 1)) { // Check next IP bit (big-endian)
34 pos += jump; // Bit = 1: skip to right subtree
35 }
36 // Bit = 0: fall through to left subtree
37@@ -213,7 +213,7 @@ uint32_t Interpret(const std::span<const std::byte> asmap, const std::span<const
38 matchlen = std::bit_width(match) - 1; // An n-bit value matches n-1 input bits
39 if (bits < matchlen) break; // Not enough input bits
40 for (uint32_t bit = 0; bit < matchlen; bit++) {
41- if (GetBitBE(ip, bits) != ((match >> (matchlen - 1 - bit)) & 1)) {
42+ if (GetBitBE(ip, bits - 1) != ((match >> (matchlen - 1 - bit)) & 1)) {
43 return default_asn; // Pattern mismatch - use default
44 }
45 bits--;
The similarity we get between the shifting in this new GetBitBE and GetBitLE show that we have the same bit ordering, just different byte-ordering.
https://github.com/bitcoin/bitcoin/blob/b41f5a29f7d56da8fd157770ad29b5776c3684c6/src/util/asmap.cpp#L54-L56
That’s why I also updated the comments in the diff above.