coins: cache UTXO outpoint hash codes #35195

pull l0rinc wants to merge 2 commits into bitcoin:master from l0rinc:l0rinc/noexcept-false changing 2 files +21 −7
  1. l0rinc commented at 4:22 PM on May 2, 2026: contributor

    Problem

    CCoinsMap uses SaltedOutpointHasher for UTXO cache lookups. Since #16957 marked the hasher noexcept, libstdc++ no longer stores cached hash codes in unordered-map nodes.

    #16957 optimized for memory: the original benchmark reported about 9.4% lower peak RSS, but also about 1.6% slower runtime from recomputing hashes. This PR takes the opposite side of that same tradeoff for the UTXO cache: allow libstdc++ to store one cached hash code per unordered-map node, avoiding repeated SipHash work in rehashing and some table operations.
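
    The runtime side of this tradeoff can be observed with a small counting experiment. The sketch below is not Bitcoin Core code: `CountingHasherNoexcept`, `CountingHasherMayThrow` and `CountHashCalls` are hypothetical names, and `std::hash<int>` stands in for the salted SipHash. On libstdc++ the `noexcept(false)` variant is expected to invoke the hasher less often, because rehashing can reuse the cached codes; other standard libraries may show no difference.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>

// Hypothetical stand-ins for SaltedOutpointHasher: identical behaviour,
// differing only in the exception specification of operator().
struct CountingHasherNoexcept {
    static inline std::size_t calls = 0;
    std::size_t operator()(int key) const noexcept
    {
        ++calls;
        return std::hash<int>{}(key);
    }
};

struct CountingHasherMayThrow {
    static inline std::size_t calls = 0;
    // Never throws in practice; the specification is only a hint to the container.
    std::size_t operator()(int key) const noexcept(false)
    {
        ++calls;
        return std::hash<int>{}(key);
    }
};

// Insert `n` keys, force a rehash, and report how often the hasher ran.
template <typename Hasher>
std::size_t CountHashCalls(int n)
{
    assert(n >= 0);
    Hasher::calls = 0;
    std::unordered_map<int, int, Hasher> map;
    for (int i = 0; i < n; ++i) map.emplace(i, i);
    map.rehash(map.bucket_count() * 4);
    return Hasher::calls;
}
```

    Comparing `CountHashCalls<CountingHasherNoexcept>(1000)` with `CountHashCalls<CountingHasherMayThrow>(1000)` shows how many hash computations the cached codes save on a given standard library. The exact counts are implementation details and are deliberately not asserted here.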

    The supporting measurements repeatedly point in the same direction:

    • Full IBD/reindex-chainstate runs were about 2-3% faster at both low and high dbcache.
    • AssumeUTXO loading showed the strongest signal, up to about 11% faster. This is the cleanest dbcache-exercising benchmark here because it avoids block download and full validation work, so the coins-cache effect is less diluted.
    • Compatibility measurements on top of the input-fetcher parallelization and SipHash optimization work still showed cached hash codes helping: about 3% faster on an Intel N150 after the SipHash change, and still about 1% faster on an M4 Max with -dbcache=10000.
    • Theoretically, memory can increase on libstdc++ by up to one cached size_t per node. In practice, recent Massif runs were roughly neutral at moderate dbcache values, while a very large dbcache run showed the expected higher peak.

    For CCoinsMap, no allocator sizing change is needed since its PoolAllocator block size already reserves implementation-defined node overhead and explicitly accounts for implementations where "the hash value is stored as well". That assumption remained in the code after the #16957 noexcept change disabled cached hash codes for this hasher, so there is no memory calculation to restore in this PR.
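
    The sizing argument can be illustrated with a simplified node model. Everything in the sketch below is an assumption made for illustration: `ExampleKey` and `ExampleValue` are hypothetical stand-ins for the real key and cache entry, and the two node structs only approximate how a standard library might lay out a node. The one element taken from the source is the `sizeof(void*) * 4` overhead quoted from the PoolAllocator comment.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical stand-ins; the real types are COutPoint and CCoinsCacheEntry.
struct ExampleKey { unsigned char hash[32]; unsigned int n; };
struct ExampleValue { void* prev; void* next; unsigned long long amount; unsigned char flags; };
using ExamplePair = std::pair<const ExampleKey, ExampleValue>;

// Simplified node layouts (an assumption; real library nodes differ in detail).
struct NodeWithoutHash { void* next; ExamplePair value; };
struct NodeWithHash { void* next; ExamplePair value; std::size_t hash_code; };

// Per-node budget: the value plus the four-pointer overhead quoted in the comment.
constexpr std::size_t NODE_BUDGET = sizeof(ExamplePair) + sizeof(void*) * 4;

// Both layouts fit, so caching the hash code needs no change to the budget.
static_assert(sizeof(NodeWithoutHash) <= NODE_BUDGET);
static_assert(sizeof(NodeWithHash) <= NODE_BUDGET);

// Extra bytes for `node_count` nodes when each node also stores a hash code.
inline std::size_t ExtraBytesForNodes(std::size_t node_count)
{
    assert(sizeof(NodeWithHash) >= sizeof(NodeWithoutHash));
    return node_count * (sizeof(NodeWithHash) - sizeof(NodeWithoutHash));
}
```

    In this model the cached code typically costs one `size_t` per node, which is the memory increase discussed above, while the node still fits inside the budget that was already reserved.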

    This is also consistent with concerns raised in the original #16957 discussion: reviewers noted the noexcept change was slower, that it imposed a penalty when memory was not limiting, and that "it might be possible to regain the performance loss by caching the hash ourselves." This PR lets libstdc++ do that caching for us.

    Fix

    Revert #16957’s SaltedOutpointHasher noexcept change, then document the now-intentional noexcept(false) contract and add a focused unit test.

    The implementation still does not throw; only the exception specification used by libstdc++’s unordered-map node policy changes. For a fast hash functor that may throw, libstdc++ caches hash codes in nodes.

    The test checks both inputs to that policy:

    • SaltedOutpointHasher is not nothrow-invocable
    • on libstdc++, it remains classified as a fast hash

    This mirrors the type-level contract style used in libstdc++’s own hash tests.
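
    A minimal sketch of such a type-level check, assuming a hypothetical `ExampleKey` and `ExampleHasher` in place of the real outpoint and hasher types (the PR's actual test is not reproduced here). `std::__is_fast_hash` is a libstdc++ internal trait, so that assertion is guarded by `__GLIBCXX__` and could change between library versions.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <type_traits>
#include <unordered_map>

// Hypothetical key and hasher standing in for COutPoint / SaltedOutpointHasher.
struct ExampleKey {
    int n;
    bool operator==(const ExampleKey& other) const { return n == other.n; }
};

struct ExampleHasher {
    // Never throws in practice; noexcept(false) is only a signal to the container.
    std::size_t operator()(const ExampleKey& key) const noexcept(false)
    {
        return std::hash<int>{}(key.n);
    }
};

// Input 1: the hasher must not be nothrow-invocable.
constexpr bool HasherMayThrow()
{
    return !std::is_nothrow_invocable_v<const ExampleHasher&, const ExampleKey&>;
}
static_assert(HasherMayThrow());

#ifdef __GLIBCXX__
// Input 2 (libstdc++ only, internal trait): the hasher is still classified as fast.
static_assert(std::__is_fast_hash<ExampleHasher>::value);
#endif

// The map behaves the same either way; only its internal node layout may differ.
inline int LookupExample()
{
    std::unordered_map<ExampleKey, int, ExampleHasher> map;
    map.emplace(ExampleKey{7}, 42);
    assert(map.size() == 1);
    return map.at(ExampleKey{7});
}
```

    Because both checks are `static_assert`s, accidentally re-adding `noexcept` to the hasher would fail at compile time rather than silently changing the container's node layout.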

    Reproducer

    <details> <summary>Fresh reindex-chainstate run</summary>

    for DBCACHE in 1000 30000; do \
        COMMITS="ef499680c8d426ac04f3a5d4ad67cbb3426e0e21 1016fba624f840fc893b8de61df4a149ac9551a8"; \
        STOP=946649; CC=gcc; CXX=g++; \
        BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
        (echo ""; for c in $COMMITS; do git fetch -q origin "$c" 2>/dev/null || true; git log -1 --pretty='%h %s' $c || exit 1; done) && \
        (echo "" && echo "$(date -I) | reindex-chainstate | ${STOP} blocks | dbcache ${DBCACHE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") && \
        hyperfine \
        --sort command \
        --runs 1 \
        --export-json "$BASE_DIR/rdx-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
        --parameter-list COMMIT ${COMMITS// /,} \
        --prepare "killall -9 bitcoind 2>/dev/null; rm -f ./build/bin/bitcoind; git clean -fxd; git reset --hard {COMMIT} && \
          cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind -j1 && \
          ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20; rm -f $DATA_DIR/debug.log; rm -rfd $DATA_DIR/indexes;" \
        --conclude "killall bitcoind || true; sleep 5; grep -q 'height=0' $DATA_DIR/debug.log && grep -q 'Disabling script verification at block #1' $DATA_DIR/debug.log && grep -q 'height=$STOP' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
                    cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
        "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0";
    done
    
    ef499680c8 Merge bitcoin/bitcoin#34176: wallet: crash fix, handle non-writable db directories
    1016fba624 util: allow caching outpoint hash codes
    
    2026-04-30 | reindex-chainstate | 946649 blocks | dbcache 1000 | i9-ssd | x86_64 | Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz | 16 cores | 62Gi RAM | SSD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = ef499680c8d426ac04f3a5d4ad67cbb3426e0e21)
      Time (abs ≡):        19288.792 s               [User: 33333.953 s, System: 1949.252 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 1016fba624f840fc893b8de61df4a149ac9551a8)
      Time (abs ≡):        18968.171 s               [User: 33375.739 s, System: 2030.692 s]
    
    Relative speed comparison
            1.02          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = ef499680c8d426ac04f3a5d4ad67cbb3426e0e21)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=1000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 1016fba624f840fc893b8de61df4a149ac9551a8)
    
    ef499680c8 Merge bitcoin/bitcoin#34176: wallet: crash fix, handle non-writable db directories
    1016fba624 util: allow caching outpoint hash codes
    
    2026-04-30 | reindex-chainstate | 946649 blocks | dbcache 30000 | i9-ssd | x86_64 | Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz | 16 cores | 62Gi RAM | SSD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=30000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = ef499680c8d426ac04f3a5d4ad67cbb3426e0e21)
      Time (abs ≡):        17615.001 s               [User: 24395.224 s, System: 771.559 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=30000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 1016fba624f840fc893b8de61df4a149ac9551a8)
      Time (abs ≡):        17209.581 s               [User: 24079.479 s, System: 785.327 s]
    
    Relative speed comparison
            1.02          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=30000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = ef499680c8d426ac04f3a5d4ad67cbb3426e0e21)
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=946649 -dbcache=30000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 1016fba624f840fc893b8de61df4a149ac9551a8)
    

    </details>

    <details> <summary>Earlier supporting measurements</summary>

    Original #16957 result:

    master                                4:13:59   7696728 KiB
    2019-09-SaltedOutpointHasher-noexcept 4:18:11   6971412 KiB
    change                                +1.65%    -9.42%
    

    #16957 also showed that increasing dbcache on the noexcept branch could recover performance while staying below old memory use:

    master -dbcache=5000     4:13:59   7696728 KiB
    noexcept -dbcache=5471   4:01:16   7282044 KiB
    

    Earlier cached-hash measurements:

    IBD to height 909090, -dbcache=5000
    baseline:     27046.236 s ± 631.042 s
    cached hash:  26226.707 s ± 373.829 s
    change:       ~3.0% faster
    
    reindex-chainstate to height 909090, -dbcache=5000
    baseline:     27728.594 s ± 96.850 s
    cached hash:  26847.390 s ± 105.590 s
    change:       ~3.2% faster
    
    loadtxoutset, -dbcache=500
    baseline:     451.699 s ± 1.126 s
    cached hash:  422.857 s ± 1.488 s
    change:       ~6.4% faster
    
    loadtxoutset, -dbcache=3000
    baseline:     426.088 s ± 1.824 s
    cached hash:  403.923 s ± 1.043 s
    change:       ~5.2% faster
    
    loadtxoutset, -dbcache=4000
    baseline:     440.708 s ± 2.242 s
    cached hash:  423.934 s ± 4.145 s
    change:       ~3.8% faster
    

    Later noexcept removal measurements:

    reindex-chainstate, height 916000, -dbcache=15000
    baseline:     23856.269 s
    no noexcept:  22734.025 s
    change:       ~4.7% faster
    
    reindex-chainstate, height 917000, -dbcache=30000
    baseline:     12703.654 s
    no noexcept:  11957.649 s
    change:       ~5.9% faster
    
    assumeutxo on i9 SSD, -dbcache=450
    baseline:     347.241 s ± 3.282 s
    no noexcept:  309.558 s ± 1.951 s
    change:       ~10.9% faster
    

    Memory observations:

    Original #16957:
    master:      7696728 KiB
    noexcept:    6971412 KiB
    change:      -9.42%
    
    Recent Massif, assumeutxo -dbcache=450:
    baseline:     744.5 MB
    cached hash:  737.7 MB
    
    Recent Massif, assumeutxo -dbcache=4500:
    baseline:     4.640 GB
    cached hash:  4.641 GB
    
    Very large dbcache:
    baseline:     24.11 GB
    cached hash:  25.49 GB
    

    </details>

    <details> <summary>Compatibility with input-fetcher and `SipHash` work</summary>

    This change has been in my queue for more than a year, so some supporting measurements are older and were collected while adjacent input-fetcher and SipHash work was still evolving. The relevant compatibility measurements are documented in this [#31132](/bitcoin-bitcoin/31132/) review comment.

    After input fetching, SipHash inlining, and noexcept removal on an Intel N150:

    reindex-chainstate, height 943349, -dbcache=1000, Intel N150
    
    input fetcher baseline:     20014.228 s
    SipHash optimized:          18933.341 s
    no noexcept:                18388.590 s
    
    change from SipHash optimized to no noexcept: ~2.9% faster
    change from input fetcher baseline to no noexcept: ~8.1% faster
    

    After input fetching, SipHash inlining, and noexcept removal on an M4 Max:

    reindex-chainstate, height 943349, -dbcache=10000, Apple M4 Max
    
    input fetcher baseline:     4836.390 s
    SipHash optimized:          4543.009 s
    no noexcept:                4493.407 s
    
    change from SipHash optimized to no noexcept: ~1.1% faster
    change from input fetcher baseline to no noexcept: ~7.1% faster
    

    These measurements suggest that caching hash codes in the map nodes remains useful even when the hasher itself gets faster and block input fetching is parallelized.

    </details>
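
    For readers who want to re-derive the "change" lines in the measurements above, the arithmetic is the usual relative difference of wall-clock times. The helper names below are hypothetical and the snippet is only a convenience sketch; `RelativeSpeed` corresponds to the ratio hyperfine prints in its "Relative speed comparison".

```cpp
#include <cassert>
#include <cmath>

// Percentage by which `candidate_s` is faster than `baseline_s` (both in seconds).
inline double PercentFaster(double baseline_s, double candidate_s)
{
    assert(baseline_s > 0.0);
    return (baseline_s - candidate_s) / baseline_s * 100.0;
}

// Ratio of the slower run to the faster run.
inline double RelativeSpeed(double slower_s, double faster_s)
{
    assert(faster_s > 0.0);
    return slower_s / faster_s;
}
```

    For example, `PercentFaster(347.241, 309.558)` reproduces the roughly 10.9% assumeutxo result, and `RelativeSpeed(19288.792, 18968.171)` gives the 1.02 shown for the -dbcache=1000 reindex-chainstate run.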

  2. Revert "make SaltedOutpointHasher noexcept"
    This reverts commit 67d99900b0d770038c9c5708553143137b124a6c.
    fb0c207567
  3. util: allow caching outpoint hash codes
    `SaltedOutpointHasher` has been `noexcept` since #16957, which makes libstdc++ omit cached hash codes from `std::unordered_map` nodes.
    
    That saves one `size_t` per node, but it also makes `CCoinsMap` recompute the outpoint hash in table operations that could otherwise reuse the cached code.
    
    Declare the operator `noexcept(false)` to restore libstdc++ hash-code caching for this fast hash functor.
    The implementation still does not throw; only the exception specification used by the container policy changes.
    
    This restores a value that #16957 deliberately removed, but the pool-backed `CCoinsMap` allocation budget still accounts for implementations storing hash values.
    The existing `PoolAllocator` comment says that "in some cases the hash value is stored as well" and that "sizeof(void*)*4" overhead "should thus be sufficient so that all implementations can allocate the nodes from the PoolAllocator."
    
    Add a unit test for the exception-specification contract and the libstdc++ fast-hash classification.
    16e77fdf13
  4. DrahtBot added the label UTXO Db and Indexes on May 2, 2026
  5. DrahtBot commented at 4:22 PM on May 2, 2026: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-03 06:12 UTC
