SipHash is decent (I tried several other candidates and most were slower), but not particularly fast (even a general-purpose SHA256 with SHA-NI acceleration is in the same ballpark). RapidHash is much simpler and, on short inputs, dramatically faster:
| ns/byte | byte/s | err% | total | benchmark |
|--------:|-------:|-----:|------:|:----------|
| 0.06 | 16,529,344,089.83 | 0.2% | 11.00 | RapidHash_32b |
| 0.96 | 1,043,368,577.00 | 0.1% | 11.01 | Sha256Shani_32b using the 'arm_shani(1way,2way)' SHA256 implementation |
| 0.39 | 2,547,686,515.00 | 0.2% | 10.98 | SipHash_32b |
```diff
diff --git a/src/bench/crypto_hash.cpp b/src/bench/crypto_hash.cpp
index 2f1ff56438..cab34169a5 100644
--- a/src/bench/crypto_hash.cpp
+++ b/src/bench/crypto_hash.cpp
@@ -195,11 +195,107 @@ static void SipHash_32b(benchmark::Bench& bench)
     FastRandomContext rng{/*fDeterministic=*/true};
     auto k0{rng.rand64()}, k1{rng.rand64()};
     auto val{rng.rand256()};
+    auto extra{rng.rand32()};
     auto i{0U};
-    bench.run([&] {
-        ankerl::nanobench::doNotOptimizeAway(SipHashUint256(k0, k1, val));
+    bench.batch(uint256::size() + sizeof(uint32_t) / 8).unit("byte").run([&] {
+        ankerl::nanobench::doNotOptimizeAway(SipHashUint256Extra(k0, k1, val, extra));
         ++k0;
         ++k1;
+        ++extra;
+        ++i;
+        val.data()[i % uint256::size()] ^= i & 0xFF;
+    });
+}
+
+static void Sha256Shani_32b(benchmark::Bench& bench)
+{
+    FastRandomContext rng{/*fDeterministic=*/true};
+    auto k0{rng.rand64()}, k1{rng.rand64()};
+    auto val{rng.rand256()};
+    auto extra{rng.rand32()};
+    auto i{0U};
+    uint8_t hash[CSHA256::OUTPUT_SIZE];
+
+    bench.name(strprintf("%s using the '%s' SHA256 implementation", __func__, SHA256AutoDetect(sha256_implementation::USE_SSE4_AND_SHANI)));
+    bench.batch(uint256::size() + sizeof(uint32_t) / 8).unit("byte").run([&] {
+        CSHA256()
+            .Write(reinterpret_cast<const uint8_t*>(&k0), sizeof(k0))
+            .Write(reinterpret_cast<const uint8_t*>(&k1), sizeof(k1))
+            .Write(val.data(), val.size())
+            .Write(reinterpret_cast<const uint8_t*>(&extra), sizeof(extra))
+            .Finalize(hash);
+
+        ankerl::nanobench::doNotOptimizeAway(hash[0]);
+        ++k0;
+        ++k1;
+        ++extra;
+        ++i;
+        val.data()[i % uint256::size()] ^= i & 0xFF;
+    });
+    SHA256AutoDetect();
+}
+
+uint64_t ModifiedRapidHash(uint64_t k0, uint64_t k1, const uint256& val, uint32_t extra)
+{
+    auto const rapid_mum = [](uint64_t* a, uint64_t* b) {
+#if defined(__SIZEOF_INT128__)
+        __uint128_t r = *a;
+        r *= *b;
+        *a = (uint64_t)r;
+        *b = (uint64_t)(r >> 64);
+#elif defined(_MSC_VER) && (defined(_WIN64) || defined(_M_HYBRID_CHPE_ARM64))
+#if defined(_M_X64)
+        *a = _umul128(*a, *b, b);
+#else
+        uint64_t c = __umulh(*a, *b);
+        *a = *a * *b;
+        *b = c;
+#endif
+#else
+        uint64_t ha = *a >> 32, hb = *b >> 32, la = (uint32_t)*a, lb = (uint32_t)*b, hi, lo;
+        uint64_t rh = ha * hb, rm0 = ha * lb, rm1 = hb * la, rl = la * lb, t = rl + (rm0 << 32), c = t < rl;
+        lo = t + (rm1 << 32);
+        c += lo < t;
+        hi = rh + (rm0 >> 32) + (rm1 >> 32) + c;
+        *a = lo;
+        *b = hi;
+#endif
+    };
+
+    auto const rapid_mix = [&rapid_mum](uint64_t a, uint64_t b) -> uint64_t {
+        rapid_mum(&a, &b);
+        return a ^ b;
+    };
+
+    // This effectively behaves like rapidhash with the input:
+    // seed: k0, data: [val, k1, extra]
+    // So it hashes 32+8+4 = 44 bytes, and additionally uses the 8-byte k0 as the seed.
+
+    // Default secret parameters.
+    static constexpr uint64_t secret[3] = {0x2d358dccaa6c78a5ull, 0x8bb84b93962eacc9ull, 0x4b33a62ed433d4a3ull};
+
+    // No need to mix the seed itself, as it is purely random.
+    uint64_t seed = k0;
+    seed = rapid_mix(val.GetUint64(0) ^ secret[2], val.GetUint64(1) ^ seed ^ secret[1]);
+    seed = rapid_mix(val.GetUint64(2) ^ secret[2], val.GetUint64(3) ^ seed);
+    uint64_t a = k1 ^ secret[1];
+    uint64_t b = extra ^ seed;
+    rapid_mum(&a, &b);
+    return rapid_mix(a ^ secret[0], b ^ secret[1]);
+}
+
+static void RapidHash_32b(benchmark::Bench& bench)
+{
+    FastRandomContext rng{/*fDeterministic=*/true};
+    auto k0{rng.rand64()}, k1{rng.rand64()};
+    auto val{rng.rand256()};
+    auto extra{rng.rand32()};
+    auto i{0U};
+    bench.batch(uint256::size() + sizeof(uint32_t) / 8).unit("byte").run([&] {
+        ankerl::nanobench::doNotOptimizeAway(ModifiedRapidHash(k0, k1, val, extra));
+        ++k0;
+        ++k1;
+        ++extra;
         ++i;
         val.data()[i % uint256::size()] ^= i & 0xFF;
     });
@@ -276,6 +372,8 @@ BENCHMARK(SHA256_32b_SSE4, benchmark::PriorityLevel::HIGH);
 BENCHMARK(SHA256_32b_AVX2, benchmark::PriorityLevel::HIGH);
 BENCHMARK(SHA256_32b_SHANI, benchmark::PriorityLevel::HIGH);
 BENCHMARK(SipHash_32b, benchmark::PriorityLevel::HIGH);
+BENCHMARK(Sha256Shani_32b, benchmark::PriorityLevel::HIGH);
+BENCHMARK(RapidHash_32b, benchmark::PriorityLevel::HIGH);
 BENCHMARK(SHA256D64_1024_STANDARD, benchmark::PriorityLevel::HIGH);
 BENCHMARK(SHA256D64_1024_SSE4, benchmark::PriorityLevel::HIGH);
 BENCHMARK(SHA256D64_1024_AVX2, benchmark::PriorityLevel::HIGH);
@@ -286,3 +384,4 @@ BENCHMARK(MuHashMul, benchmark::PriorityLevel::HIGH);
 BENCHMARK(MuHashDiv, benchmark::PriorityLevel::HIGH);
 BENCHMARK(MuHashPrecompute, benchmark::PriorityLevel::HIGH);
 BENCHMARK(MuHashFinalize, benchmark::PriorityLevel::HIGH);
+
```
RapidHash v3 was released only two days ago, so numbers may improve further.
> If the hash isn’t of cryptographic quality, I don’t know how I’d reason about an attacker’s ability to control the buckets.

How do we reason about that currently, given that some sources describe SipHash as not being of cryptographic quality either:

> Although the SipHash algorithm is considered to be generally strong, it is not intended for cryptographic purposes. As such, all cryptographic uses of this implementation are strongly discouraged.
The RapidHash benchmarks and studies also indicate that it has an ideal avalanche score - is there any other test we could run that would increase our confidence? Would a more seasoned hash be more welcome, such as HighwayHash, or others with no known attacks and a SIMD-friendly design, e.g. the ones listed at https://www.haikel-fazzani.eu.org/blog/post/siphash-alternatives-hash-algorithms with “Good” security?
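To get a bit more confidence beyond the published avalanche scores, a quick self-contained check could be run against the exact construction above. The sketch below is my own illustration (not SMHasher, and assuming `ModifiedRapidHash` from the diff plus Bitcoin Core's `uint256`/`FastRandomContext`): it flips each of the 416 input bits in turn and reports the worst per-output-bit bias from the ideal 0.5 flip probability.

```cpp
#include <random.h>   // FastRandomContext
#include <uint256.h>

#include <algorithm>
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <vector>

// Flip every input bit of (k0, k1, val, extra) and count how often each of the
// 64 output bits flips; for a well-mixed hash every probability should be ~0.5.
void AvalancheCheck(int samples = 10000)
{
    FastRandomContext rng{/*fDeterministic=*/true};
    constexpr int INPUT_BITS{64 + 64 + 256 + 32};
    std::vector<uint64_t> flips(INPUT_BITS * 64, 0);

    for (int s = 0; s < samples; ++s) {
        uint64_t k0{rng.rand64()}, k1{rng.rand64()};
        const uint256 val{rng.rand256()};
        uint32_t extra{rng.rand32()};
        const uint64_t base{ModifiedRapidHash(k0, k1, val, extra)};

        for (int bit = 0; bit < INPUT_BITS; ++bit) {
            uint64_t fk0{k0}, fk1{k1};
            uint256 fval = val;
            uint32_t fextra{extra};
            if (bit < 64) fk0 ^= uint64_t{1} << bit;
            else if (bit < 128) fk1 ^= uint64_t{1} << (bit - 64);
            else if (bit < 384) fval.data()[(bit - 128) / 8] ^= uint8_t(1u << ((bit - 128) % 8));
            else fextra ^= uint32_t{1} << (bit - 384);

            const uint64_t diff{base ^ ModifiedRapidHash(fk0, fk1, fval, fextra)};
            for (int out = 0; out < 64; ++out) flips[bit * 64 + out] += (diff >> out) & 1;
        }
    }

    double worst_bias{0.0};
    for (const uint64_t count : flips) {
        worst_bias = std::max(worst_bias, std::abs(double(count) / samples - 0.5));
    }
    std::printf("worst avalanche bias: %.4f (ideal: 0)\n", worst_bias);
}
```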
I would like to understand the exact threat model here. Is my understanding correct that:
- we fear collisions because lookups iterate linearly over a bucket’s entries, so an attacker who can brute-force many keys into the same bucket degrades lookups? If this is just a worst-case scenario (not a likely outcome), we could probably choose a map which stores colliding entries in a balanced tree instead of a linked list (see the sketch after this list);
- we expect natural collisions to be rare, and deliberate ones to require significant effort;
- as long as attackers cannot pre-compute which coins will end up in the same bucket, we would be fine;
- we don’t store unconfirmed transactions in the coins cache, only outputs that are already mined;
- the inputs are really difficult to tweak, given that one of them is already a dSHA256 txid, not a value the attacker can cheaply set to an arbitrary value;
- which entries collide will also depend on the coins-cache size - randomizing the default slightly might help;
- the most likely (worst-case?) outcome is a local slowdown;
- once an attack is detected, we will likely have time to mitigate it in the next version.
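To make the first point concrete, here is a tiny standalone illustration (plain `std::unordered_map` with a deliberately degenerate hasher, nothing to do with `SaltedOutpointHasher`) of what linear bucket iteration costs once every key collides:

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <unordered_map>

// Degenerate hasher: every key hashes to the same value, so all entries land
// in one bucket and each lookup degrades to a linear scan of that bucket.
struct ConstantHash {
    size_t operator()(uint64_t) const noexcept { return 0; }
};

template <typename Map>
static double LookupMs(Map& map, int n)
{
    for (int i = 0; i < n; ++i) map.emplace(i, i);
    const auto start{std::chrono::steady_clock::now()};
    uint64_t found{0};
    for (int i = 0; i < n; ++i) found += map.count(i);
    const auto stop{std::chrono::steady_clock::now()};
    if (found != static_cast<uint64_t>(n)) std::printf("lookup mismatch\n");  // keep `found` observable
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

int main()
{
    constexpr int N{20'000};
    std::unordered_map<uint64_t, uint64_t> good;                      // default std::hash
    std::unordered_map<uint64_t, uint64_t, ConstantHash> degenerate;  // all keys collide
    std::printf("spread over buckets: %.2f ms\n", LookupMs(good, N));
    std::printf("single bucket:       %.2f ms\n", LookupMs(degenerate, N));
}
```

The salted hash exists precisely so that an attacker cannot steer keys into the degenerate case above.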
But zooming out, isn’t the main problem here that the dbcache is resized too often, triggering many unnecessary rehashes?
Wouldn’t SipHash suffice here if we just pre-sized the coins cache during IBD more precisely and only recreated and shrank it after we’ve reached the tip (to reclaim memory)?
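On the pre-sizing idea, a minimal sketch with a plain `std::unordered_map` as a stand-in for the coins map (`kExpectedCoins` is a made-up target; in practice it would come from the configured dbcache size): a single `reserve()` up front sizes the bucket array once, so no growth rehash ever fires while filling it.

```cpp
#include <cstdint>
#include <cstdio>
#include <unordered_map>

int main()
{
    constexpr size_t kExpectedCoins{1'000'000};  // hypothetical IBD estimate

    std::unordered_map<uint64_t, uint64_t> cache;
    // Reserving up front allocates enough buckets that inserts up to the
    // expected count never trigger a rehash (which would re-hash every key).
    cache.reserve(kExpectedCoins);

    const size_t buckets_before{cache.bucket_count()};
    for (uint64_t i = 0; i < kExpectedCoins; ++i) cache.emplace(i, i);
    const size_t buckets_after{cache.bucket_count()};

    std::printf("bucket count before/after inserts: %zu / %zu (rehashed: %s)\n",
                buckets_before, buckets_after,
                buckets_before == buckets_after ? "no" : "yes");
}
```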
Also, marking SaltedOutpointHasher noexcept seems to have made these rehashes considerably more expensive (with a noexcept hasher, libstdc++ no longer caches the hash code in each node, so every rehash recomputes SipHash for every entry) - I have tried reverting it, and all my measurements show that we’d be faster without that change - see my preliminary measurements: https://github.com/bitcoin-dev-tools/benchcoin/pull/163
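That mechanism can be seen in isolation with a standalone, libstdc++-specific sketch using a deliberately slow toy hasher (not the real SaltedOutpointHasher or SipHash); the only difference between the two variants is the noexcept specifier:

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <unordered_map>

// Toy hasher doing extra work to stand in for SipHash. On libstdc++, whether
// operator() is noexcept decides whether the hash code is cached in each node.
struct SlowHashNoexcept {
    size_t operator()(uint64_t x) const noexcept
    {
        for (int i = 0; i < 64; ++i) x = x * 0x9E3779B97F4A7C15ull + i;
        return x;
    }
};
struct SlowHashThrowing {
    size_t operator()(uint64_t x) const  // not noexcept => libstdc++ caches the hash
    {
        for (int i = 0; i < 64; ++i) x = x * 0x9E3779B97F4A7C15ull + i;
        return x;
    }
};

template <typename Hasher>
static double GrowMs(size_t n)
{
    std::unordered_map<uint64_t, uint64_t, Hasher> map;  // no reserve: many growth rehashes
    const auto start{std::chrono::steady_clock::now()};
    for (uint64_t i = 0; i < n; ++i) map.emplace(i, i);
    const auto stop{std::chrono::steady_clock::now()};
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

int main()
{
    constexpr size_t N{2'000'000};
    std::printf("noexcept hasher (hash recomputed on rehash): %.1f ms\n", GrowMs<SlowHashNoexcept>(N));
    std::printf("throwing hasher (hash cached per node):      %.1f ms\n", GrowMs<SlowHashThrowing>(N));
}
```

On libstdc++ the noexcept variant recomputes the hash for every element on each growth rehash, while the non-noexcept variant reuses the hash code cached in each node at the cost of extra memory per entry - which, as I understand it, is the trade-off that noexcept change bought.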