coins: use jumboblock SipHash-1-3 for hashing CCoinsMap keys #35215

pull l0rinc wants to merge 4 commits into bitcoin:master from l0rinc:l0rinc/siphash-jumbo changing 5 files +120 −15
  1. l0rinc commented at 2:02 PM on May 5, 2026: contributor

    Problem

    Several internal unordered maps/sets use COutPoint keys (most notably CCoinsMap, which stores the in-memory dbcache) and repeatedly hash a 32-byte txid plus a 32-bit output index using SipHash-2-4. That path is conservative but expensive for this fixed 36-byte shape: it processes the four txid limbs and the final output-index/length word in 14 SipRounds (5 message words × 2 compression rounds + 4 finalization rounds).

    SipHash-1-3 & Jumbo blocks

    This implementation follows Pieter Wuille's jumboblock suggestion based on the SipHash analysis paper. The main input is already a 256-bit hash, so the hasher processes the four txid limbs together as one block instead of feeding them as four independent 64-bit SipHash message blocks.

    <details> <summary>Pieter Wuille's jumboblock and SipRound sketch</summary>

    we can actually process a full 256-bit hash at once, as one block, rather than having a block per 64-bit.

    SH24=SipHash-2-4, SH13=SipHash-1-3, JB=jumboblock (this idea), UP=unpadded (dropping the last block, which IMO doesn't help given that our inputs are constant length).

    In the UTXO set cache setting, we have:

    • SH24: 16 SipRounds
    • SH24+UP: 14 SipRounds
    • SH24+JB: 10 SipRounds
    • SH24+JB+UP: 8 SipRounds
    • SH13: 9 SipRounds
    • SH13+UP: 8 SipRounds
    • SH13+JB: 6 SipRounds
    • SH13+JB+UP: 5 SipRounds

    Specifically, this is pseudocode that I asked him about:

    (m0-m3 are the 64-bit limbs of the hash input, m4-m5 are other inputs including padding)
    
    # initialization
    v0 = c0 ^ k0
    v1 = c1 ^ k1
    v2 = c2 ^ k0
    v3 = c3 ^ k1
    
    # process m0-m3
    v0 ^= m0
    v1 ^= m1
    v2 ^= m2
    v3 ^= m3
    SIPROUND * c
    v0 ^= m3
    v1 ^= m0
    v2 ^= m1
    v3 ^= m2
    
    # process m4
    v3 ^= m4
    SIPROUND * c
    v0 ^= m4
    
    # process m5
    v3 ^= m5
    SIPROUND * c
    v0 ^= m5
    
    # finalize
    v2 ^= 0xff
    SIPROUND * d
    return v0 ^ v1 ^ v2 ^ v3
    

    </details>
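As a rough, runnable transcription of the sketch above, the following Python models the jumboblock flow with integers masked to 64 bits. The names `jumbo_siphash` and `tail` are illustrative assumptions, the SipRound body and constants are those of standard SipHash, and the output is not claimed to match the test vector added in this PR.

```python
MASK64 = (1 << 64) - 1

# Standard SipHash initialization constants ("somepseudorandomlygeneratedbytes").
C0 = 0x736F6D6570736575
C1 = 0x646F72616E646F6D
C2 = 0x6C7967656E657261
C3 = 0x7465646279746573


def rotl(x, b):
    """Rotate a 64-bit value left by b bits."""
    return ((x << b) | (x >> (64 - b))) & MASK64


def sipround(v0, v1, v2, v3):
    """One standard SipRound over the four 64-bit state words."""
    v0 = (v0 + v1) & MASK64; v1 = rotl(v1, 13); v1 ^= v0; v0 = rotl(v0, 32)
    v2 = (v2 + v3) & MASK64; v3 = rotl(v3, 16); v3 ^= v2
    v0 = (v0 + v3) & MASK64; v3 = rotl(v3, 21); v3 ^= v0
    v2 = (v2 + v1) & MASK64; v1 = rotl(v1, 17); v1 ^= v2; v2 = rotl(v2, 32)
    return v0, v1, v2, v3


def jumbo_siphash(k0, k1, limbs, tail, c=1, d=3):
    """Hash four 64-bit limbs as one jumboblock, then any tail words (m4, m5, ...)."""
    m0, m1, m2, m3 = limbs
    v0, v1, v2, v3 = C0 ^ k0, C1 ^ k1, C2 ^ k0, C3 ^ k1

    # Jumboblock: absorb all four limbs, run c rounds, absorb them again rotated.
    v0 ^= m0; v1 ^= m1; v2 ^= m2; v3 ^= m3
    for _ in range(c):
        v0, v1, v2, v3 = sipround(v0, v1, v2, v3)
    v0 ^= m3; v1 ^= m0; v2 ^= m1; v3 ^= m2

    # Remaining words are absorbed as in regular SipHash.
    for m in tail:
        v3 ^= m
        for _ in range(c):
            v0, v1, v2, v3 = sipround(v0, v1, v2, v3)
        v0 ^= m

    # Finalization with d rounds.
    v2 ^= 0xFF
    for _ in range(d):
        v0, v1, v2, v3 = sipround(v0, v1, v2, v3)
    return v0 ^ v1 ^ v2 ^ v3
```

With `c=1`, `d=3` and a single tail word, this corresponds to the SH13+JB+UP row: 1 + 1 + 3 = 5 SipRounds.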

    Design

    For the COutPoint shape, this processes the txid limbs as a jumboblock and keeps Bitcoin Core's existing combined index/length word (m4), omitting m5 from Pieter's generic sketch. For symmetry the length byte is still included (even though this hasher doesn't need to be compatible with a variable-length SipHash implementation). The path drops from 14 SipRounds (SH24+UP) to 5 (SH13+JB+UP).
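To make the input shape concrete, here is a small sketch of how a 36-byte outpoint could be split into the words the hasher consumes. It assumes little-endian limb reads and the input length (36) in the top byte of the last word, mirroring how standard SipHash encodes the message length. The helper name `outpoint_words` is hypothetical and not part of this PR.

```python
import struct

OUTPOINT_LEN = 36  # 32-byte txid + 4-byte output index


def outpoint_words(txid: bytes, index: int):
    """Return ((m0, m1, m2, m3), m4) for a txid and a 32-bit output index."""
    if len(txid) != 32:
        raise ValueError("txid must be exactly 32 bytes")
    if not 0 <= index < 2**32:
        raise ValueError("index must fit in 32 bits")
    # Four little-endian 64-bit limbs of the txid (assumed byte order).
    limbs = struct.unpack("<4Q", txid)
    # Combined word: length byte in bits 56..63, index in the low 32 bits.
    m4 = (OUTPOINT_LEN << 56) | index
    return limbs, m4
```

Because the index occupies only the low 32 bits and the length only the top byte, both fit in one 64-bit word, which is why no separate `m5` block is needed for this shape.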

    Pieter also ran the jumboblock idea by Jean-Philippe Aumasson, one of the SipHash authors; based on a preliminary analysis, Aumasson did not think this made collisions easier to construct. Aumasson also said SipHash-1-3 is fine for this hashmap use case and offered to comment on or review the PR.

    <img width="1600" height="1180" alt="hashing_path_comparison_previous_vs_jumbo" src="https://github.com/user-attachments/assets/08bf462c-13e0-45d6-91ae-76ede8ad9508" />

    Old: 4 separate 64-bit txid compressions + 1 index/length compression + 4 finalization rounds = 14 SipRounds. New: 1 combined 256-bit txid compression + 1 index/length compression + 3 finalization rounds = 5 SipRounds.
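The round counts quoted above follow from simple arithmetic: `c` rounds per absorbed block plus `d` finalization rounds. The sketch below reproduces the eight variants from Pieter's list; the function name and its parameters are illustrative assumptions.

```python
def sip_rounds(c, d, jumbo=False, unpadded=False, limbs=4, tail_words=2):
    """Count SipRounds for a 256-bit hash followed by tail words (m4, m5)."""
    hash_blocks = 1 if jumbo else limbs               # JB: one block instead of four
    tail_blocks = tail_words - (1 if unpadded else 0)  # UP: drop the last block
    return c * (hash_blocks + tail_blocks) + d


VARIANTS = {
    "SH24": dict(c=2, d=4),
    "SH24+UP": dict(c=2, d=4, unpadded=True),
    "SH24+JB": dict(c=2, d=4, jumbo=True),
    "SH24+JB+UP": dict(c=2, d=4, jumbo=True, unpadded=True),
    "SH13": dict(c=1, d=3),
    "SH13+UP": dict(c=1, d=3, unpadded=True),
    "SH13+JB": dict(c=1, d=3, jumbo=True),
    "SH13+JB+UP": dict(c=1, d=3, jumbo=True, unpadded=True),
}

if __name__ == "__main__":
    for name, kwargs in VARIANTS.items():
        print(f"{name}: {sip_rounds(**kwargs)} SipRounds")
```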

    Fix

    Add PresaltedSipHasher13Jumbo, a narrow SipHash-1-3 jumboblock specialization for hashing an existing uint256 hash plus a uint32_t index. Then switch the existing SaltedOutpointHasher wrapper to use it, so existing COutPoint unordered maps and sets keep their public hasher type while getting the faster implementation.

    This covers the coins cache and other in-memory outpoint tables through the existing abstraction, without spreading a variant-specific type name through call sites. The regular PresaltedSipHasher path remains for txid/wtxid/uint256 hashers, compact-block short IDs, and persisted/index key derivation.

    The salted hash codes are only local in-memory table indexes for the current process; they already vary across normal restarts and are not serialized, persisted, sent over the network, or used for consensus.
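The wrapper pattern can be illustrated with a toy model: a stable outer hasher type that draws a fresh salt per instance and delegates to a swappable presalted inner hasher. This is only a sketch. The inner hash here is keyed BLAKE2b used as a stand-in (not SipHash), and the Python class names merely echo the C++ ones without reflecting their real interfaces.

```python
import hashlib
import secrets
import struct


class PresaltedStandInHasher:
    """Stand-in keyed hash (BLAKE2b, NOT SipHash) with the salt fixed up front."""

    def __init__(self, k0: int, k1: int):
        self._key = struct.pack("<QQ", k0, k1)

    def __call__(self, txid: bytes, index: int) -> int:
        data = txid + struct.pack("<I", index)
        digest = hashlib.blake2b(data, key=self._key, digest_size=8).digest()
        return int.from_bytes(digest, "little")


class SaltedOutpointHasher:
    """Outer type that callers name; the inner implementation can be swapped."""

    def __init__(self, inner_factory=PresaltedStandInHasher):
        # Fresh random salt per instance, so hash codes are process-local.
        self._inner = inner_factory(secrets.randbits(64), secrets.randbits(64))

    def __call__(self, outpoint) -> int:
        txid, index = outpoint
        return self._inner(txid, index)
```

Two hasher instances give different codes for the same outpoint, which is harmless as long as the codes are never persisted or compared across tables.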

    Reproducer

    A test vector documents the non-standard jumboblock output, and the benchmarks now include both the existing 36-byte SipHash-2-4 path and the new jumboblock path.

    Counting the dbcache buckets indicates the new hasher satisfies the uniformity criteria we relied on before: <img width="1200" height="750" alt="ccoinsmap-collisions" src="https://github.com/user-attachments/assets/eeedec81-acdc-4adf-a9c8-bfce089700da" />
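One simple way to sanity-check bucket uniformity is a chi-square statistic over bucket occupancy. The sketch below is not the method used to produce the plot above: it uses synthetic txids, keyed BLAKE2b as a stand-in hash, and hypothetical helper names, purely to show the shape of such a check.

```python
import hashlib
import struct
from collections import Counter


def bucket_counts(num_keys, num_buckets, key=b"\x01" * 16):
    """Hash synthetic outpoints with a stand-in keyed hash and count per bucket."""
    counts = Counter()
    for i in range(num_keys):
        txid = hashlib.sha256(struct.pack("<Q", i)).digest()  # synthetic txid
        data = txid + struct.pack("<I", i % 4)                # small output index
        digest = hashlib.blake2b(data, key=key, digest_size=8).digest()
        counts[int.from_bytes(digest, "little") % num_buckets] += 1
    return [counts.get(b, 0) for b in range(num_buckets)]


def chi_square(counts):
    """Pearson chi-square statistic against a uniform expectation."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)


if __name__ == "__main__":
    counts = bucket_counts(25_600, 256)
    # For a uniform hash the statistic should be close to (buckets - 1).
    print(f"chi-square over {len(counts)} buckets: {chi_square(counts):.1f}")
```

A statistic far above the number of buckets would suggest clustering; swapping the stand-in for the hasher under test is the intended use.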

    Fresh isolated aarch64 microbenchmarks on a Raspberry Pi 5 show the new 36-byte path about 2x faster with GCC and Clang.

    <details> <summary>Linux reproducer command</summary>

    The command below rebuilds with GCC and Clang and prints only the benchmark output after the build.

    for COMPILER in gcc clang; do \
      if [ "$COMPILER" = gcc ]; then CC=gcc; CXX=g++; else CC=clang; CXX=clang++; fi; \
      cmake -B "build-bench-$COMPILER" -DCMAKE_BUILD_TYPE=Release -DBUILD_BENCH=ON -DBUILD_TESTS=OFF -DBUILD_GUI=OFF -DENABLE_WALLET=OFF -DCMAKE_C_COMPILER="$CC" -DCMAKE_CXX_COMPILER="$CXX" >/dev/null 2>&1 && \
      cmake --build "build-bench-$COMPILER" --target bench_bitcoin -j"$(nproc)" >/dev/null 2>&1 && \
      echo "" && echo "$(date -I) | SipHash 36-byte microbench | $("$CC" --version | head -1) | $("$CXX" --version | head -1) | $(hostname) | $(uname -m) | $(lscpu | awk -F: '/Model name/{print $2; exit}' | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM" && \
      "build-bench-$COMPILER/bin/bench_bitcoin" -filter='SipHash.*36b' -min-time=10000; \
    done
    

    </details>

    <details> <summary>aarch64 SipHash 36-byte microbenchmarks: ~2x faster with GCC and Clang</summary>

    2026-05-03 | SipHash 36-byte microbench | gcc (Ubuntu 14.2.0-19ubuntu2) 14.2.0 | g++ (Ubuntu 14.2.0-19ubuntu2) 14.2.0 | rpi5-16-2 | aarch64 | Cortex-A76 | 4 cores | 15Gi RAM
    
    |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
    |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    |               16.61 |       60,192,540.58 |    0.0% |           80.00 |           39.78 |  2.011 |           0.00 |   50.0% |     11.00 | `SipHash13Jumbo_36b`
    |               33.92 |       29,483,465.40 |    0.0% |          171.00 |           81.23 |  2.105 |           0.00 |   50.0% |     11.00 | `SipHash24_36b`
    
    2026-05-03 | SipHash 36-byte microbench | Ubuntu clang version 22.0.0 (++20250923084147+c890a9050e88-1~exp1~20250923084331.324) | Ubuntu clang version 22.0.0 (++20250923084147+c890a9050e88-1~exp1~20250923084331.324) | rpi5-16-2 | aarch64 | Cortex-A76 | 4 cores | 15Gi RAM
    
    |               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
    |--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    |               17.02 |       58,748,229.11 |    0.0% |           81.00 |           40.75 |  1.988 |           0.00 |   50.0% |     11.00 | `SipHash13Jumbo_36b`
    |               33.21 |       30,115,556.16 |    0.0% |          172.00 |           79.51 |  2.163 |           0.00 |   50.0% |     11.00 | `SipHash24_36b`
    

    </details>
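The "~2x" figure can be recomputed from the tables above. This short sketch derives the speedup and cross-checks the reported IPC from `ins/op` and `cyc/op`; the dictionary layout and function names are just illustrative.

```python
# (ns/op, ins/op, cyc/op) copied from the aarch64 tables above.
ROWS = {
    "gcc": {
        "SipHash13Jumbo_36b": (16.61, 80.00, 39.78),
        "SipHash24_36b": (33.92, 171.00, 81.23),
    },
    "clang": {
        "SipHash13Jumbo_36b": (17.02, 81.00, 40.75),
        "SipHash24_36b": (33.21, 172.00, 79.51),
    },
}


def speedup(compiler):
    """How many times faster the jumboblock path is, by ns/op."""
    rows = ROWS[compiler]
    return rows["SipHash24_36b"][0] / rows["SipHash13Jumbo_36b"][0]


def ipc(compiler, bench):
    """Instructions per cycle derived from ins/op and cyc/op."""
    _, ins, cyc = ROWS[compiler][bench]
    return ins / cyc


if __name__ == "__main__":
    for compiler in ROWS:
        print(f"{compiler}: {speedup(compiler):.2f}x faster")
```

The instruction count drops by roughly the same factor as the runtime (171 to 80 with GCC), while IPC stays near 2, so the gain comes from doing less work rather than from better pipelining.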

    Benchmarks

    The -reindex-chainstate runs below were collected before this final shared-wrapper shape, while the branch still applied the same jumboblock hasher only to CCoinsMap.

    <details><summary>5% faster | reindex-chainstate | 946649 blocks | dbcache 1000 | i9-ssd | x86_64 | Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz | 16 cores | 62Gi RAM | SSD</summary>

    for DBCACHE in 1000; do \
        COMMITS="976985eccd546a95e38973b854ccc6589e8afc74 b16188a906302b7d9e06adf2bc57e2b4f88b942f"; \
        STOP=946649; CC=gcc; CXX=g++; \
        BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
        (echo ""; for c in $COMMITS; do git fetch -q origin "$c" 2>/dev/null || true; git log -1 --pretty='%h %s' $c || exit 1; done) && \
        (echo "" && echo "$(date -I) | reindex-chainstate | ${STOP} blocks | dbcache ${DBCACHE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") && \
        hyperfine \
        --sort command \
        --runs 1 \
        --export-json "$BASE_DIR/rdx-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
        --parameter-list COMMIT ${COMMITS// /,} \
        --prepare "killall -9 bitcoind 2>/dev/null; rm -f ./build/bin/bitcoind; git clean -fxd; git reset --hard {COMMIT} && \
          cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind -j1 && \
          ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20; rm -f $DATA_DIR/debug.log; rm -rfd $DATA_DIR/indexes;" \
        --conclude "killall bitcoind || true; sleep 5; grep -q 'height=0' $DATA_DIR/debug.log && grep -q 'Disabling script verification at block #1' $DATA_DIR/debug.log && grep -q 'height=$STOP' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
                    cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
        "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0";
    done
    
    976985eccd Merge bitcoin/bitcoin#34124: validation: make `CCoinsView` a pure virtual interface
    b16188a906 crypto: inline jumboblock SipHash
    
    2026-05-04 | reindex-chainstate | 946649 blocks | dbcache 1000 | i9-ssd | x86_64 | Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz | 16 cores | 62Gi RAM | SSD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 976985eccd546a95e38973b854ccc6589e8afc74)
      Time (abs ≡):        19247.091 s               [User: 33228.894 s, System: 1908.564 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = b16188a906302b7d9e06adf2bc57e2b4f88b942f)
      Time (abs ≡):        18357.707 s               [User: 32399.997 s, System: 1935.467 s]
    
    Relative speed comparison
            1.05          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 976985eccd546a95e38973b854ccc6589e8afc74)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = b16188a906302b7d9e06adf2bc57e2b4f88b942f)
    

    </details>

    <details><summary>5% faster | reindex-chainstate | 946649 blocks | dbcache 30000 | i9-ssd | x86_64 | Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz | 16 cores | 62Gi RAM | SSD</summary>

    for DBCACHE in 30000; do \
        COMMITS="976985eccd546a95e38973b854ccc6589e8afc74 b16188a906302b7d9e06adf2bc57e2b4f88b942f"; \
        STOP=946649; CC=gcc; CXX=g++; \
        BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
        (echo ""; for c in $COMMITS; do git fetch -q origin "$c" 2>/dev/null || true; git log -1 --pretty='%h %s' $c || exit 1; done) && \
        (echo "" && echo "$(date -I) | reindex-chainstate | ${STOP} blocks | dbcache ${DBCACHE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") && \
        hyperfine \
        --sort command \
        --runs 1 \
        --export-json "$BASE_DIR/rdx-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
        --parameter-list COMMIT ${COMMITS// /,} \
        --prepare "killall -9 bitcoind 2>/dev/null; rm -f ./build/bin/bitcoind; git clean -fxd; git reset --hard {COMMIT} && \
          cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind -j1 && \
          ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20; rm -f $DATA_DIR/debug.log; rm -rfd $DATA_DIR/indexes;" \
        --conclude "killall bitcoind || true; sleep 5; grep -q 'height=0' $DATA_DIR/debug.log && grep -q 'Disabling script verification at block #1' $DATA_DIR/debug.log && grep -q 'height=$STOP' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
                    cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
        "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0";
    done
    
    ffa1b71e87 bench: add SipHash-2-4 36-byte benchmark
    42410eda26 coins: use jumboblock SipHash for `CCoinsMap`
    
    2026-05-03 | reindex-chainstate | 946649 blocks | dbcache 30000 | i9-ssd | x86_64 | Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz | 16 cores | 62Gi RAM | SSD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=30000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = ffa1b71e870755d2a355fe3c8ed8c084544fabb2)
      Time (abs ≡):        17664.762 s               [User: 24440.658 s, System: 773.072 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=30000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 42410eda260ed
      Time (abs ≡):        16835.349 s               [User: 23591.362 s, System: 759.491 s]
    
    Relative speed comparison
            1.05          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=30000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = ffa1b71e870755d2a355fe3c8ed8c084544fabb2)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=30000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 42410eda260ed60bc2af1e8857e96f348bafdcde)
    

    </details>

    <details><summary>2% faster | reindex-chainstate | 946649 blocks | dbcache 1000 | rpi5-16-3 | aarch64 | Cortex-A76 | 4 cores | 15Gi RAM | SSD</summary>

    for DBCACHE in 1000; do \
        COMMITS="71728b0c83d6f372406f34549ce8b988fa7e3a1e b16188a906302b7d9e06adf2bc57e2b4f88b942f"; \
        STOP=946649; CC=gcc; CXX=g++; \
        BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
        (echo ""; for c in $COMMITS; do git fetch -q origin "$c" 2>/dev/null || true; git log -1 --pretty='%h %s' $c || exit 1; done) && \
        (echo "" && echo "$(date -I) | reindex-chainstate | ${STOP} blocks | dbcache ${DBCACHE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") && \
        hyperfine \
        --sort command \
        --runs 1 \
        --export-json "$BASE_DIR/rdx-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
        --parameter-list COMMIT ${COMMITS// /,} \
        --prepare "killall -9 bitcoind 2>/dev/null; rm -f ./build/bin/bitcoind; git clean -fxd; git reset --hard {COMMIT} && \
          cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind -j1 && \
          ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20; rm -f $DATA_DIR/debug.log; rm -rfd $DATA_DIR/indexes;" \
        --conclude "killall bitcoind || true; sleep 5; grep -q 'height=0' $DATA_DIR/debug.log && grep -q 'Disabling script verification at block #1' $DATA_DIR/debug.log && grep -q 'height=$STOP' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
                    cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
        "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0";
    done
    
    71728b0c83 bench: add SipHash-2-4 36-byte benchmark
    b16188a906 crypto: inline jumboblock SipHash
    
    2026-05-03 | reindex-chainstate | 946649 blocks | dbcache 1000 | rpi5-16-3 | aarch64 | Cortex-A76 | 4 cores | 15Gi RAM | SSD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 71728b0c83d6f372406f34549ce8b988fa7e3a1e)
      Time (abs ≡):        37976.474 s               [User: 55537.835 s, System: 4759.529 s]
     
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = b16188a906302b7d9e06adf2bc57e2b4f88b942f)
      Time (abs ≡):        37076.745 s               [User: 54518.158 s, System: 4766.083 s]
     
    Relative speed comparison
            1.02          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 71728b0c83d6f372406f34549ce8b988fa7e3a1e)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = b16188a906302b7d9e06adf2bc57e2b4f88b942f)
    

    </details>

    <details><summary>2% faster | reindex-chainstate | 946649 blocks | dbcache 1000 | umbrel | x86_64 | Intel(R) N150 | 4 cores | 15Gi RAM | SSD</summary>

    for DBCACHE in 1000; do \
        COMMITS="71728b0c83d6f372406f34549ce8b988fa7e3a1e b16188a906302b7d9e06adf2bc57e2b4f88b942f"; \
        STOP=946649; CC=gcc; CXX=g++; \
        BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
        (echo ""; for c in $COMMITS; do git fetch -q origin "$c" 2>/dev/null || true; git log -1 --pretty='%h %s' $c || exit 1; done) && \
        (echo "" && echo "$(date -I) | reindex-chainstate | ${STOP} blocks | dbcache ${DBCACHE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") && \
        hyperfine \
        --sort command \
        --runs 1 \
        --export-json "$BASE_DIR/rdx-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
        --parameter-list COMMIT ${COMMITS// /,} \
        --prepare "killall -9 bitcoind 2>/dev/null; rm -f ./build/bin/bitcoind; git clean -fxd; git reset --hard {COMMIT} && \
          cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind -j1 && \
          ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20; rm -f $DATA_DIR/debug.log; rm -rfd $DATA_DIR/indexes;" \
        --conclude "killall bitcoind || true; sleep 5; grep -q 'height=0' $DATA_DIR/debug.log && grep -q 'Disabling script verification at block #1' $DATA_DIR/debug.log && grep -q 'height=$STOP' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
                    cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
        "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0";
    done
    
    71728b0c83 bench: add SipHash-2-4 36-byte benchmark
    b16188a906 crypto: inline jumboblock SipHash
    
    2026-05-03 | reindex-chainstate | 946649 blocks | dbcache 1000 | umbrel | x86_64 | Intel(R) N150 | 4 cores | 15Gi RAM | SSD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 71728b0c83d6f372406f34549ce8b988fa7e3a1e)
      Time (abs ≡):        27194.936 s               [User: 36353.676 s, System: 3494.054 s]
     
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = b16188a906302b7d9e06adf2bc57e2b4f88b942f)
      Time (abs ≡):        26559.372 s               [User: 35396.175 s, System: 3521.104 s]
     
    Relative speed comparison
            1.02          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 71728b0c83d6f372406f34549ce8b988fa7e3a1e)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = b16188a906302b7d9e06adf2bc57e2b4f88b942f)
    
    

    </details>
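The percentages in the summaries above are the ratio of the two wall-clock times that hyperfine prints under "Relative speed comparison". A minimal sketch that recomputes them from the logged times (the run labels and helper names are illustrative):

```python
# (baseline seconds, jumboblock seconds) copied from the hyperfine output above.
RUNS = {
    "i9 dbcache=1000": (19247.091, 18357.707),
    "i9 dbcache=30000": (17664.762, 16835.349),
    "rpi5 dbcache=1000": (37976.474, 37076.745),
    "N150 dbcache=1000": (27194.936, 26559.372),
}


def relative_speed(baseline_s, candidate_s):
    """Baseline time divided by candidate time, as hyperfine reports it."""
    return baseline_s / candidate_s


def minutes_saved(baseline_s, candidate_s):
    """Absolute wall-clock saving in minutes."""
    return (baseline_s - candidate_s) / 60.0


if __name__ == "__main__":
    for label, (base, cand) in RUNS.items():
        print(f"{label}: {relative_speed(base, cand):.2f}x, "
              f"{minutes_saved(base, cand):.1f} min saved")
```

Each configuration was measured with `--runs 1`, so these ratios carry no variance estimate and small differences between machines should be read with that in mind.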

    Future Work

    • Add a general SipHash-1-3 implementation, shorter-input specializations, or other variants from Pieter's sketch if benchmarks justify them.
    • Evaluate related ideas for compact-block short IDs (BIP152); this is a protocol surface and would require separate design, BIP discussion, and negotiation.
  2. bench: add SipHash-2-4 36-byte benchmark
    Rename the existing 32-byte benchmark to `SipHash24_32b`.
    Add a 36-byte variant for `uint256` plus a 32-bit outpoint index.
    This records the current baseline shape before adding outpoint-specific hashers.
    42e088fada
  3. DrahtBot added the label UTXO Db and Indexes on May 5, 2026
  4. DrahtBot commented at 2:02 PM on May 5, 2026: contributor


    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.


    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35215.


    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.


  5. l0rinc force-pushed on May 5, 2026
  6. DrahtBot added the label CI failed on May 5, 2026
  7. DrahtBot commented at 7:27 PM on May 5, 2026: contributor


    🚧 At least one of the CI tasks failed. <sub>Task 32 bit ARM: https://github.com/bitcoin/bitcoin/actions/runs/25393628377/job/74474475011</sub> <sub>LLM reason (✨ experimental): CI failed due to a C++ build error: hash_tests.cpp couldn’t compile because uint256’s consteval hex parsing wasn’t a constant expression.</sub>

    <details><summary>Hints</summary>

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

    </details>

  8. crypto: use jumboblock SipHash for outpoints
    Add `PresaltedSipHasher13Jumbo` for hashing a `uint256` plus a `uint32_t` outpoint index.
    For this fixed 36-byte input, the implementation processes the four hash limbs as one jumboblock, keeps the existing index/length word, omits `m5`, and runs 3 finalization SipRounds.
    This is the `SH13+JB+UP` case from Pieter Wuille's sketch.
    
    Switch the existing `SaltedOutpointHasher` wrapper to the new presalted hasher, so all existing `COutPoint` unordered containers keep their public hasher type while using the faster implementation.
    This is a non-standard table-hashing specialization, meant for internal hash tables whose keys already contain a uniformly distributed hash.
    
    Add a fixed test vector for the non-standard path.
    
    Co-authored-by: Pieter Wuille <pieter@wuille.net>
    Co-authored-by: Jean-Philippe Aumasson <jeanphilippe.aumasson@gmail.com>
    8e2fca3c55
  9. bench: add SipHash13Jumbo_36b benchmark
    Add a focused benchmark for `PresaltedSipHasher13Jumbo` on the same 36-byte shape as `SipHash24_36b`.
    This lets reviewers compare the existing SipHash-2-4 36-byte benchmark with the new SipHash-1-3 jumboblock specialization.
    425df7c7cd
  10. crypto: inline jumboblock SipHash
    Move `siphash_detail::SipRound` and `PresaltedSipHasher13Jumbo::operator()` into `siphash.h`.
    This lets the benchmark and `COutPoint` hash-table call sites inline the short specialized hash body through `SaltedOutpointHasher`.
    Inlining made this benchmark up to about 16% faster locally, while the existing SipHash implementations did not show improvement from the same treatment.
    7ca4e5d694
  11. l0rinc force-pushed on May 5, 2026
  12. DrahtBot removed the label CI failed on May 5, 2026
  13. veorq commented at 5:20 AM on May 6, 2026: none

    FTR I confirm my statements quoted by OP

    > Pieter also ran the jumboblock idea by Jean-Philippe Aumasson, one of the SipHash authors; based on a preliminary analysis, Aumasson did not think this made collisions easier to construct. Aumasson also said SipHash-1-3 is fine for this hashmap use case and offered to comment on or review the PR.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-06 21:12 UTC
