Here's a reduced version:
return (std::to_integer<uint8_t>(bytes[(bytes.size() * 8 - bitpos) / 8]) >> ((7 + bitpos) % 8)) & 1;
I'm surprised by the way the shift amount changes for different bit positions. Would have expected something closer to this for big endian...
return (std::to_integer<uint8_t>(bytes[(bytes.size() * 8 - bitpos) / 8]) >> (7 - (bitpos % 8))) & 1;
....but it fails the unit tests.
Godbolt experiment https://godbolt.org/z/6rjEzxhPM outputs:
PR bit shift expression (adfd8006...): (7 - ((bytes.size() * 8 - bitpos) % 8))
PR bit shift reduced: (7 + bitpos) % 8
Expected BE bit shift: 7 - (bitpos % 8)
bitpos: 0, PR: 7, reduced: 7, expected: 7
bitpos: 1, PR: 0, reduced: 0, expected: 6
bitpos: 2, PR: 1, reduced: 1, expected: 5
bitpos: 3, PR: 2, reduced: 2, expected: 4
bitpos: 4, PR: 3, reduced: 3, expected: 3
bitpos: 5, PR: 4, reduced: 4, expected: 2
bitpos: 6, PR: 5, reduced: 5, expected: 1
bitpos: 7, PR: 6, reduced: 6, expected: 0
bitpos: 8, PR: 7, reduced: 7, expected: 7
bitpos: 9, PR: 0, reduced: 0, expected: 6
bitpos: 10, PR: 1, reduced: 1, expected: 5
bitpos: 11, PR: 2, reduced: 2, expected: 4
bitpos: 12, PR: 3, reduced: 3, expected: 3
bitpos: 13, PR: 4, reduced: 4, expected: 2
bitpos: 14, PR: 5, reduced: 5, expected: 1
bitpos: 15, PR: 6, reduced: 6, expected: 0
However, when looking at the usage in Interpret(), we see if we would encounter JUMP as the first instruction, our bitpos is sort of out of bounds by one.
https://github.com/bitcoin/bitcoin/blob/b41f5a29f7d56da8fd157770ad29b5776c3684c6/src/util/asmap.cpp#L184
https://github.com/bitcoin/bitcoin/blob/b41f5a29f7d56da8fd157770ad29b5776c3684c6/src/util/asmap.cpp#L201
If we instead at the callsite make sure to send in in-bounds values - GetBitBE(ip, bits - 1), it turns out we can simplify GetBitBE quite a bit.
--- a/src/util/asmap.cpp
+++ b/src/util/asmap.cpp
@@ -48,8 +48,8 @@ namespace {
constexpr uint32_t INVALID = 0xFFFFFFFF;
/**
- * Extract a single bit from byte array using little-endian bit ordering.
- * Used for ASMap data where bits are numbered right-to-left within each byte (LSB first).
+ * Extract a single bit (LSB first) from byte array using little-endian byte ordering.
+ * Used for ASMap data.
*/
inline bool GetBitLE(std::span<const std::byte> bytes, uint32_t bitpos) noexcept
{
@@ -57,12 +57,12 @@ inline bool GetBitLE(std::span<const std::byte> bytes, uint32_t bitpos) noexcept
}
/**
- * Extract a single bit from byte array using big-endian bit ordering.
+ * Extract a single bit (LSB first) from byte array using big-endian byte ordering.
* Used for IP addresses to match network byte order conventions.
*/
inline bool GetBitBE(std::span<const std::byte> bytes, uint32_t bitpos) noexcept
{
- return (std::to_integer<uint8_t>(bytes[(bytes.size() * 8 - bitpos) / 8]) >> (7 - ((bytes.size() * 8 - bitpos) % 8))) & 1;
+ return (std::to_integer<uint8_t>(bytes[bytes.size() - ((bitpos + 8) / 8)]) >> (bitpos % 8)) & 1;
}
/**
@@ -198,7 +198,7 @@ uint32_t Interpret(const std::span<const std::byte> asmap, const std::span<const
if (jump == INVALID) break; // Jump offset straddles EOF
if (bits == 0) break; // No input bits left
if (int64_t{jump} >= static_cast<int64_t>(endpos - pos)) break; // Jumping past EOF
- if (GetBitBE(ip, bits)) { // Check next IP bit (big-endian)
+ if (GetBitBE(ip, bits - 1)) { // Check next IP bit (big-endian)
pos += jump; // Bit = 1: skip to right subtree
}
// Bit = 0: fall through to left subtree
@@ -213,7 +213,7 @@ uint32_t Interpret(const std::span<const std::byte> asmap, const std::span<const
matchlen = std::bit_width(match) - 1; // An n-bit value matches n-1 input bits
if (bits < matchlen) break; // Not enough input bits
for (uint32_t bit = 0; bit < matchlen; bit++) {
- if (GetBitBE(ip, bits) != ((match >> (matchlen - 1 - bit)) & 1)) {
+ if (GetBitBE(ip, bits - 1) != ((match >> (matchlen - 1 - bit)) & 1)) {
return default_asn; // Pattern mismatch - use default
}
bits--;
The similarity we get between the shifting in this new GetBitBE and GetBitLE show that we have the same bit ordering, just different byte-ordering.
https://github.com/bitcoin/bitcoin/blob/b41f5a29f7d56da8fd157770ad29b5776c3684c6/src/util/asmap.cpp#L54-L56
That's why I also updated the comments in the diff above.