Previously, only k0
and k1
were stored, causing the constant xor operations to be recomputed in every call to SipHashUint256Extra
.
This commit adds a dedicated Uint256ExtraSipHasher
class that caches the initial state (v0-v3) and to perform the SipHash
computation on a uint256
(with an extra parameter), hiding the constant computation details from higher-level code and improving efficiency.
This basically brings the precalculations in the CSipHasher
constructor to the uint256
specialized SipHash implementation.
cmake -B build -DBUILD_BENCH=ON -DCMAKE_BUILD_TYPE=Release && cmake –build build -j$(nproc) && build/src/bench/bench_bitcoin -filter=‘SaltedOutpointHasherBench’ -min-time=10000
C++ compiler …………………….. AppleClang 16.0.0.16000026
Before:
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
58.60 | 17,065,922.04 | 0.3% | 11.02 | SaltedOutpointHasherBench_create_set |
11.97 | 83,576,684.83 | 0.1% | 11.01 | SaltedOutpointHasherBench_hash |
14.50 | 68,985,850.12 | 0.3% | 10.96 | SaltedOutpointHasherBench_match |
13.90 | 71,942,033.47 | 0.4% | 11.03 | SaltedOutpointHasherBench_mismatch |
After:
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
57.27 | 17,462,299.19 | 0.1% | 11.02 | SaltedOutpointHasherBench_create_set |
11.24 | 88,997,888.48 | 0.3% | 11.04 | SaltedOutpointHasherBench_hash |
13.91 | 71,902,014.20 | 0.2% | 11.01 | SaltedOutpointHasherBench_match |
13.29 | 75,230,390.31 | 0.1% | 11.00 | SaltedOutpointHasherBench_mismatch |
compared to master:
0create_set - 17,462,299.19 / 17,065,922.04 - 2.3% faster
1hash - 88,997,888.48 / 83,576,684.83 - 6.4% faster
2match - 71,902,014.20 / 68,985,850.12 - 4.2% faster
3mismatch - 75,230,390.31 / 71,942,033.47 - 4.5% faster
C++ compiler …………………….. GNU 13.3.0
Before:
ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
136.76 | 7,312,133.16 | 0.0% | 1,086.67 | 491.12 | 2.213 | 119.54 | 1.1% | 11.01 | SaltedOutpointHasherBench_create_set |
23.82 | 41,978,882.62 | 0.0% | 252.01 | 85.57 | 2.945 | 4.00 | 0.0% | 11.00 | SaltedOutpointHasherBench_hash |
60.42 | 16,549,695.42 | 0.1% | 460.51 | 217.04 | 2.122 | 21.00 | 1.4% | 10.99 | SaltedOutpointHasherBench_match |
78.66 | 12,713,595.35 | 0.1% | 555.59 | 282.52 | 1.967 | 20.19 | 2.2% | 10.74 | SaltedOutpointHasherBench_mismatch |
After:
ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
135.38 | 7,386,349.49 | 0.0% | 1,078.19 | 486.16 | 2.218 | 119.56 | 1.1% | 11.00 | SaltedOutpointHasherBench_create_set |
23.67 | 42,254,558.08 | 0.0% | 247.01 | 85.01 | 2.906 | 4.00 | 0.0% | 11.00 | SaltedOutpointHasherBench_hash |
58.95 | 16,962,220.14 | 0.1% | 446.55 | 211.74 | 2.109 | 20.86 | 1.4% | 11.01 | SaltedOutpointHasherBench_match |
76.98 | 12,991,047.69 | 0.1% | 548.93 | 276.50 | 1.985 | 20.25 | 2.3% | 10.72 | SaltedOutpointHasherBench_mismatch |
0compared to master:
1create_set - 7,386,349.49 / 7,312,133.16 - 1.0% faster
2hash - 42,254,558.08 / 41,978,882.62 - 0.6% faster
3match - 16,962,220.14 / 16,549,695.42 - 2.4% faster
4mismatch - 12,991,047.69 / 12,713,595.35 - 2.1% faster