This change is part of [IBD] - Tracking PR for speeding up Initial Block Download
Summary
The final UTXO set is written to disk in batches to avoid a gigantic memory spike at flush time. There is already a -dbbatchsize config option to change the batch size; this PR only adjusts the default. By increasing the default batch size, we can reduce overhead from repeated compaction cycles, minimize the constant overhead per batch, and achieve more sequential writes.
Context
The UTXO set has grown significantly since 2017, when the original 16 MiB batch size was chosen. Flushing it from memory to LevelDB often takes 20-30 minutes on commodity hardware after a successful IBD with large dbcache values. A bigger batch allows more LevelDB optimizations before flushing while only minimally increasing the memory footprint (the UTXO set is more than 1000 times bigger than the batch size). This is especially true now that we've bumped the LevelDB max file size.
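To put the per-batch overhead in perspective, here is a rough sketch of how many write batches a flush needs at the old and new defaults. It assumes the 26 GiB in-memory figure reported in the logs further down approximates the bytes written, which likely overstates the real count since serialized entries are more compact. The helper name `batch_count` is made up for this illustration.

```shell
# Rough number of write batches needed to flush a UTXO set of a given size.
# Assumption: the in-memory size (GiB) stands in for the bytes written.
batch_count() {
  utxo_gib=$1
  batch_mib=$2
  # Ceiling division of (GiB expressed in MiB) by the batch size in MiB.
  echo $(( (utxo_gib * 1024 + batch_mib - 1) / batch_mib ))
}

for mib in 16 64; do
  echo "$mib MiB batches: $(batch_count 26 "$mib")"
done
# prints:
# 16 MiB batches: 1664
# 64 MiB batches: 416
```

Under that assumption, the larger default cuts the number of batches, and with it the fixed cost paid per batch, by a factor of four.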
Considerations
As noted by @sipa, this will temporarily increase the required memory exactly when our memory needs are the greatest. This will be less problematic after "validation: write chainstate to disk every hour", where we can lose at most the last hour of work. While 128 and 256 MiB batches were often faster, 64 MiB was chosen since it seems to achieve a reasonable speedup at a small memory cost.
Measurements
Experiments with different batch sizes (loaded via AssumeUTXO 840k and the new 880k, then measuring final flush time) on different operating systems show that 64 MiB batches significantly reduce flush time without notably increasing memory usage (both smaller and bigger ones are usually slower).
| dbbatchsize | flush_sum (ms) |
|---|---|
| 8 << 20 | 236993.73 |
| 8 << 20 | 239557.79 |
| 8 << 20 | 244149.25 |
| 8 << 20 | 246116.93 |
| 8 << 20 | 243496.98 |
| 16 << 20 | 209673.01 |
| 16 << 20 | 225029.97 |
| 16 << 20 | 230826.61 |
| 16 << 20 | 230312.84 |
| 16 << 20 | 235912.83 |
| 32 << 20 | 201898.77 |
| 32 << 20 | 196676.18 |
| 32 << 20 | 198958.81 |
| 32 << 20 | 196230.08 |
| 32 << 20 | 199105.84 |
| 64 << 20 | 150691.51 |
| 64 << 20 | 151072.18 |
| 64 << 20 | 151465.16 |
| 64 << 20 | 150403.59 |
| 64 << 20 | 150342.34 |
| 128 << 20 | 155917.81 |
| 128 << 20 | 156121.83 |
| 128 << 20 | 156514.6 |
| 128 << 20 | 155616.36 |
| 128 << 20 | 156398.24 |
| 256 << 20 | 166843.39 |
| 256 << 20 | 166226.37 |
| 256 << 20 | 166351.75 |
| 256 << 20 | 166197.15 |
| 256 << 20 | 166755.22 |
| 512 << 20 | 186020.24 |
| 512 << 20 | 186689.18 |
| 512 << 20 | 186895.21 |
| 512 << 20 | 185427.1 |
| 512 << 20 | 186105.48 |
| 1 << 30 | 185488.98 |
| 1 << 30 | 185963.51 |
| 1 << 30 | 185754.25 |
| 1 << 30 | 186993.17 |
| 1 << 30 | 186145.73 |
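
The five runs per batch size are easier to compare as averages. The following is a minimal sketch that computes the mean flush time per batch size from the values in the table; the function names are invented for this example, and `1 << 30` is written as 1024 MiB.

```shell
# One row per batch size: size in MiB, then the five flush_sum values (ms)
# copied from the table above.
flush_data() {
  cat <<'EOF'
8 236993.73 239557.79 244149.25 246116.93 243496.98
16 209673.01 225029.97 230826.61 230312.84 235912.83
32 201898.77 196676.18 198958.81 196230.08 199105.84
64 150691.51 151072.18 151465.16 150403.59 150342.34
128 155917.81 156121.83 156514.6 155616.36 156398.24
256 166843.39 166226.37 166351.75 166197.15 166755.22
512 186020.24 186689.18 186895.21 185427.1 186105.48
1024 185488.98 185963.51 185754.25 186993.17 186145.73
EOF
}

# Print "<MiB> <mean flush time in seconds>" for each row.
mean_flush_seconds() {
  flush_data | LC_ALL=C awk '{
    sum = 0
    for (i = 2; i <= NF; i++) sum += $i
    printf "%d %.1f\n", $1, sum / (NF - 1) / 1000
  }'
}

# Print the batch size (MiB) with the lowest mean flush time.
fastest_batch_mib() {
  mean_flush_seconds | LC_ALL=C awk '
    NR == 1 || $2 + 0 < best { best = $2 + 0; size = $1 }
    END { print size }'
}

mean_flush_seconds
echo "fastest: $(fastest_batch_mib) MiB"
```

On this data set the 64 MiB row has the lowest mean, with 128 MiB close behind.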
Checking the impact of a -reindex-chainstate with -stopatheight=878000 and -dbcache=30000 gives:
On SSD:
2025-01-12T07:31:05Z [warning] Flushing large (26 GiB) UTXO set to disk, it may take several minutes
2025-01-12T07:53:51Z Shutdown: done
Flush time before: 22 minutes and 46 seconds
2025-01-12T18:30:00Z [warning] Flushing large (26 GiB) UTXO set to disk, it may take several minutes
2025-01-12T18:44:43Z Shutdown: done
Flush time after: 14 minutes and 43 seconds
On HDD:
2025-01-12T04:31:40Z [warning] Flushing large (26 GiB) UTXO set to disk, it may take several minutes
2025-01-12T05:02:39Z Shutdown: done
Flush time before: 30 minutes and 59 seconds
2025-01-12T20:22:24Z [warning] Flushing large (26 GiB) UTXO set to disk, it may take several minutes
2025-01-12T20:42:57Z Shutdown: done
Flush time after: 20 minutes and 33 seconds
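The flush times quoted above are the gap between the "Flushing large" warning and "Shutdown: done". A small sketch of that subtraction follows. It assumes both timestamps fall on the same UTC day, which holds for all four pairs here, and the helper names are hypothetical.

```shell
# Seconds since midnight for a timestamp like 2025-01-12T07:31:05Z.
secs_of_day() {
  echo "$1" | awk -F'[T:Z]' '{ print $2 * 3600 + $3 * 60 + $4 }'
}

# Elapsed seconds between two timestamps on the same UTC day.
flush_seconds() {
  echo $(( $(secs_of_day "$2") - $(secs_of_day "$1") ))
}

# Format a number of seconds as "<minutes>m<seconds>s".
fmt_duration() {
  printf '%dm%02ds\n' $(( $1 / 60 )) $(( $1 % 60 ))
}

ssd_before=$(flush_seconds 2025-01-12T07:31:05Z 2025-01-12T07:53:51Z)
ssd_after=$(flush_seconds 2025-01-12T18:30:00Z 2025-01-12T18:44:43Z)
hdd_before=$(flush_seconds 2025-01-12T04:31:40Z 2025-01-12T05:02:39Z)
hdd_after=$(flush_seconds 2025-01-12T20:22:24Z 2025-01-12T20:42:57Z)

echo "SSD: $(fmt_duration "$ssd_before") -> $(fmt_duration "$ssd_after")"
echo "HDD: $(fmt_duration "$hdd_before") -> $(fmt_duration "$hdd_after")"
# prints:
# SSD: 22m46s -> 14m43s
# HDD: 30m59s -> 20m33s
```

In both cases the flush takes roughly a third less time with the larger batch size.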
Reproducer:
You can either do a full IBD or a reindex(-chainstate) and check the final flush in the logs, or load the UTXO set from an AssumeUTXO snapshot (840k or 880k) and use that for the measurements.
# Build Bitcoin Core
cmake -B build -DCMAKE_BUILD_TYPE=Release && cmake --build build -j$(nproc)

# Set up a clean demo environment
mkdir -p demo && rm -rfd demo/chainstate demo/chainstate_snapshot demo/debug.log

# Start bitcoind with minimal settings without mempool and internet connection
build/bin/bitcoind -datadir=demo -stopatheight=1
build/bin/bitcoind -datadir=demo -daemon -blocksonly=1 -connect=0 -dbcache=30000

# Load the AssumeUTXO snapshot, making sure the path is correct
# Expected output includes `"coins_loaded": 184821030`
build/bin/bitcoin-cli -datadir=demo -rpcclienttimeout=0 loadtxoutset ~/utxo-880000.dat

# Stop the daemon and verify snapshot flushes in the logs
build/bin/bitcoin-cli -datadir=demo stop
grep "FlushSnapshotToDisk: completed" demo/debug.log
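
To compare several batch sizes, the daemon start command can be repeated with a different -dbbatchsize each time. The sketch below only prints the commands as a dry run instead of executing them. It assumes -dbbatchsize takes a byte count, consistent with the `N << 20` values in the measurements table, and `print_sweep` is a name invented for this example. Each real run would still need the clean-up, `loadtxoutset`, and `stop` steps from the reproducer.

```shell
# Print (without running) one bitcoind invocation per candidate batch size.
# MiB values are shifted left by 20 to get the byte count for -dbbatchsize.
print_sweep() {
  for mib in 8 16 32 64 128 256 512 1024; do
    echo "build/bin/bitcoind -datadir=demo -daemon -blocksonly=1 -connect=0 -dbcache=30000 -dbbatchsize=$(( mib << 20 ))"
  done
}

print_sweep
```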