This change is part of [IBD] - Tracking PR for speeding up Initial Block Download
Summary
Blocks and undo data can be serialized to any stream that implements the appropriate read/write methods.
AutoFile is one of these, writing the results “directly” to disk (through the OS file cache). Batching the data in memory first and then reading/writing it to disk in bulk is measurably faster (likely because of fewer native fread calls or less locking, as observed by Martinus in a similar change).
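To illustrate the batching idea, here is a minimal, self-contained sketch. It is not the code from this PR: `CountingFile`, `SketchBufferedWriter` and `CountFwriteCalls` are hypothetical names, and the 64 KiB buffer size is an arbitrary assumption. It only shows how many native `fwrite` calls a series of small field writes needs with and without an in-memory buffer.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical file wrapper (a stand-in, not Bitcoin Core's AutoFile):
// every write() is exactly one native fwrite call, which we count.
struct CountingFile {
    std::FILE* file;
    std::size_t fwrite_calls{0};

    void write(const std::byte* data, std::size_t size)
    {
        ++fwrite_calls;
        if (std::fwrite(data, 1, size, file) != size) throw std::runtime_error("fwrite failed");
    }
};

// Hypothetical buffered wrapper: small writes are copied into memory and only
// forwarded to the file when the buffer fills up or flush() is called.
class SketchBufferedWriter
{
    CountingFile& m_dst;
    std::vector<std::byte> m_buf;
    std::size_t m_pos{0};

public:
    SketchBufferedWriter(CountingFile& dst, std::size_t capacity) : m_dst{dst}, m_buf(capacity)
    {
        if (capacity == 0) throw std::invalid_argument("capacity must be non-zero");
    }

    void write(const std::byte* data, std::size_t size)
    {
        while (size > 0) {
            const std::size_t n{std::min(size, m_buf.size() - m_pos)};
            std::memcpy(m_buf.data() + m_pos, data, n);
            m_pos += n;
            data += n;
            size -= n;
            if (m_pos == m_buf.size()) flush();
        }
    }

    // Must be called before the file is closed; nothing is flushed implicitly.
    void flush()
    {
        if (m_pos == 0) return;
        m_dst.write(m_buf.data(), m_pos);
        m_pos = 0;
    }
};

// Writes `fields` 4-byte values to a temporary file, either directly or through
// the buffer, and returns how many native fwrite calls that took.
std::size_t CountFwriteCalls(bool buffered, std::size_t fields)
{
    std::FILE* f{std::tmpfile()};
    if (!f) throw std::runtime_error("tmpfile failed");
    CountingFile file{f};
    SketchBufferedWriter writer{file, 1 << 16}; // 64 KiB, an arbitrary choice
    for (std::uint32_t i{0}; i < fields; ++i) {
        const auto* p{reinterpret_cast<const std::byte*>(&i)};
        if (buffered) {
            writer.write(p, sizeof(i));
        } else {
            file.write(p, sizeof(i));
        }
    }
    writer.flush(); // no-op in the unbuffered case
    const long bytes_written{std::ftell(f)};
    std::fclose(f);
    assert(bytes_written == static_cast<long>(fields * sizeof(std::uint32_t)));
    return file.fwrite_calls;
}
```

With 1000 fields, `CountFwriteCalls(false, 1000)` issues 1000 native calls, while `CountFwriteCalls(true, 1000)` issues a single one, since the 4000 bytes fit into one buffer. Both variants put the same bytes on disk.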
Unlocking new optimization opportunities
Buffered writes will also enable batched obfuscation calculations (implemented in #31144), especially since we currently need to copy the write input’s std::span before obfuscating it, whereas batching allows the operation to run on the internal buffer directly.
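The difference between per-write and batched obfuscation can be sketched as follows. This is an illustration under assumptions, not the implementation from #31144: the 8-byte rolling XOR key, `XorInPlace`, `ObfuscatePerWrite`, `ObfuscateBatched` and `BatchedMatchesPerWrite` are all hypothetical names chosen for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

using Key = std::array<std::byte, 8>; // assumed 8-byte rolling XOR key
using Bytes = std::vector<std::byte>;

// XORs `size` bytes in place; `key_offset` is the stream position of data[0],
// so that the key stays aligned across consecutive writes.
void XorInPlace(std::byte* data, std::size_t size, const Key& key, std::size_t key_offset)
{
    for (std::size_t i{0}; i < size; ++i) {
        data[i] ^= key[(key_offset + i) % key.size()];
    }
}

// Per-write variant: the caller's input is read-only, so every chunk has to be
// copied before it can be XORed and appended to the output.
Bytes ObfuscatePerWrite(const std::vector<Bytes>& chunks, const Key& key)
{
    Bytes out;
    for (const auto& chunk : chunks) {
        Bytes copy{chunk}; // extra allocation and copy for each write
        XorInPlace(copy.data(), copy.size(), key, out.size());
        out.insert(out.end(), copy.begin(), copy.end());
    }
    return out;
}

// Batched variant: chunks are collected in a buffer we own, which is then
// XORed once, in place, right before it would be written out.
Bytes ObfuscateBatched(const std::vector<Bytes>& chunks, const Key& key)
{
    Bytes buf;
    for (const auto& chunk : chunks) buf.insert(buf.end(), chunk.begin(), chunk.end());
    XorInPlace(buf.data(), buf.size(), key, 0);
    return buf;
}

// Checks on made-up sample data that both variants produce identical bytes and
// that applying the XOR a second time restores the plain input.
bool BatchedMatchesPerWrite()
{
    const Key key{std::byte{0x01}, std::byte{0x23}, std::byte{0x45}, std::byte{0x67},
                  std::byte{0x89}, std::byte{0xAB}, std::byte{0xCD}, std::byte{0xEF}};
    const std::vector<Bytes> chunks{
        Bytes{std::byte{0xAA}, std::byte{0xBB}, std::byte{0xCC}},
        Bytes{std::byte{0x01}},
        Bytes(20, std::byte{0x5A}),
    };
    Bytes plain;
    for (const auto& chunk : chunks) plain.insert(plain.end(), chunk.begin(), chunk.end());

    Bytes batched{ObfuscateBatched(chunks, key)};
    const bool same{batched == ObfuscatePerWrite(chunks, key)};
    XorInPlace(batched.data(), batched.size(), key, 0);
    assert(batched == plain);
    return same;
}
```

Both variants yield the same output; the batched one avoids the temporary copy per write and runs the XOR loop once over a contiguous buffer.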
Measurements (micro benchmarks, full IBDs and reindexes)
Microbenchmarks for [Read|Write]BlockBench show a ~30%/69% speedup with macOS/Clang, and ~19%/24% with Linux/GCC (the follow-up XOR batching improves these further):
Before (macOS/Clang):
| ns/op | op/s | err% | total | benchmark | 
|---|---|---|---|---|
| 2,271,441.67 | 440.25 | 0.1% | 11.00 | ReadBlockBench | 
| 5,149,564.31 | 194.19 | 0.8% | 10.95 | WriteBlockBench | 
After (macOS/Clang):
| ns/op | op/s | err% | total | benchmark | 
|---|---|---|---|---|
| 1,738,683.04 | 575.15 | 0.2% | 11.04 | ReadBlockBench | 
| 3,052,658.88 | 327.58 | 1.0% | 10.91 | WriteBlockBench | 
Before (Linux/GCC):
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark | 
|---|---|---|---|---|---|---|---|---|---|
| 6,895,987.11 | 145.01 | 0.0% | 71,055,269.86 | 23,977,374.37 | 2.963 | 5,074,828.78 | 0.4% | 22.00 | ReadBlockBench | 
| 5,152,973.58 | 194.06 | 2.2% | 19,350,886.41 | 8,784,539.75 | 2.203 | 3,079,335.21 | 0.4% | 23.18 | WriteBlockBench | 
After (Linux/GCC):
| ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark | 
|---|---|---|---|---|---|---|---|---|---|
| 5,771,882.71 | 173.25 | 0.0% | 65,741,889.82 | 20,453,232.33 | 3.214 | 3,971,321.75 | 0.3% | 22.01 | ReadBlockBench | 
| 4,145,681.13 | 241.21 | 4.0% | 15,337,596.85 | 5,732,186.47 | 2.676 | 2,239,662.64 | 0.1% | 23.94 | WriteBlockBench | 
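As a quick check, the speedups can be recomputed from the ns/op columns of the four tables above, taking the speedup as $t_\text{before}/t_\text{after} - 1$ (the first pair of tables is the one without hardware counters, the second pair the one with them):

```latex
\begin{aligned}
\text{ReadBlockBench, first pair:}   \quad & \frac{2{,}271{,}441.67}{1{,}738{,}683.04} - 1 \approx 0.306 \\
\text{WriteBlockBench, first pair:}  \quad & \frac{5{,}149{,}564.31}{3{,}052{,}658.88} - 1 \approx 0.687 \\
\text{ReadBlockBench, second pair:}  \quad & \frac{6{,}895{,}987.11}{5{,}771{,}882.71} - 1 \approx 0.195 \\
\text{WriteBlockBench, second pair:} \quad & \frac{5{,}152{,}973.58}{4{,}145{,}681.13} - 1 \approx 0.243
\end{aligned}
```

Put differently, the first-pair write benchmark runs at about 1.69 times its previous throughput.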
Two full IBD runs against master (compiled with GCC, where the gains seem more modest) up to block 888888 (seeded from real nodes) indicate a ~7% total speedup.
```bash
COMMITS="d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a 652b4e3de5c5e09fb812abe265f4a8946fa96b54"; \
STOP_HEIGHT=888888; DBCACHE=1000; \
C_COMPILER=gcc; CXX_COMPILER=g++; \
BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs"; \
(for c in $COMMITS; do git fetch origin $c -q && git log -1 --pretty=format:'%h %s' $c || exit 1; done) && \
hyperfine \
  --sort 'command' \
  --runs 2 \
  --export-json "$BASE_DIR/ibd-${COMMITS// /-}-$STOP_HEIGHT-$DBCACHE-$C_COMPILER.json" \
  --parameter-list COMMIT ${COMMITS// /,} \
  --prepare "killall bitcoind; rm -rf $DATA_DIR/*; git checkout {COMMIT}; git clean -fxd; git reset --hard; \
    cmake -B build -DCMAKE_BUILD_TYPE=Release -DENABLE_WALLET=OFF -DCMAKE_C_COMPILER=$C_COMPILER -DCMAKE_CXX_COMPILER=$CXX_COMPILER && \
    cmake --build build -j$(nproc) --target bitcoind && \
    ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=1 -printtoconsole=0; sleep 100" \
  --cleanup "cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
  "COMPILER=$C_COMPILER COMMIT=${COMMIT:0:10} ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP_HEIGHT -dbcache=$DBCACHE -blocksonly -printtoconsole=0"
d2b72b1369 refactor: rename leftover WriteBlockBench
652b4e3de5 optimization: Bulk serialization writes in `WriteBlockUndo` and `WriteBlock`
Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a)
  Time (mean ± σ):     41528.104 s ± 354.003 s    [User: 44324.407 s, System: 3074.829 s]
  Range (min … max):   41277.786 s … 41778.421 s    2 runs

Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5c5e09fb812abe265f4a8946fa96b54)
  Time (mean ± σ):     38771.457 s ± 441.941 s    [User: 41930.651 s, System: 3222.664 s]
  Range (min … max):   38458.957 s … 39083.957 s    2 runs

Relative speed comparison
        1.07 ±  0.02  COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = d2b72b13699cf460ffbcb1028bcf5f3b07d3b73a)
        1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=888888 -dbcache=1000 -blocksonly -printtoconsole=0 (COMMIT = 652b4e3de5c5e09fb812abe265f4a8946fa96b54)
```