Draft for now - comments are welcome
- Should we repurpose the existing `-blocksxor` arg instead?
- Would multithreading help enough to justify the extra complexity, given the current IO-bound profile?
- Should we split `ObfuscateBlocks` out of init? I have split it into many local lambdas, but we may want to find a better home for those methods…
- Which additional scenarios should we cover in testing (resume mid-run, explicit 16-hex key path, large datadirs, pruned)?
- Can we safely assume the extra memory and disk space is available?
- Release notes, translations?
- Given user concern about what they store on disk, should we consider including this in v30?
Context
Recent discussions highlighted that many nodes which synced before Bitcoin Core v28 have their block and undo files stored effectively in the clear (zero XOR key). This patch adds a simple, resumable maintenance tool to obfuscate previously raw block files, rotate an existing key to a fresh random one, or de-obfuscate (set key to zero) if consciously chosen, all without requiring resync. The operation can be cancelled and restarted safely.
Implementation
The new startup option `-reobfuscate-blocks[=VALUE]` accepts either 16 hex characters as an exact 8-byte XOR key (little-endian in-memory layout) or a boolean to generate a random 64-bit key. For example, `-reobfuscate-blocks=0000000000000000` sets the key to zero, effectively removing obfuscation.
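The two accepted value forms can be illustrated with a minimal Python sketch. It is not the patch's C++ code: the function name `parse_reobfuscate_arg`, the treatment of an empty value or `1` as "true" and of `0` as "disabled", and the mapping of hex digits to key bytes in written order are all assumptions made for illustration.

```python
import secrets
import string
from typing import Optional

KEY_SIZE = 8  # the XOR key is 8 bytes (64 bits)


def parse_reobfuscate_arg(value: str) -> Optional[bytes]:
    """Return the target key, or None when the option is disabled.

    Hypothetical helper: the real option parsing may differ.
    """
    # Exactly 16 hex characters: use them as the explicit 8-byte key.
    # Assumption: hex digits map to key bytes in the order written.
    if len(value) == 2 * KEY_SIZE and all(c in string.hexdigits for c in value):
        return bytes.fromhex(value)
    # Boolean true (assumed spellings): generate a fresh random 64-bit key.
    if value in ("", "1"):
        return secrets.token_bytes(KEY_SIZE)
    # Boolean false (assumed spelling): do nothing.
    if value == "0":
        return None
    raise ValueError(f"expected 16 hex characters or a boolean, got {value!r}")
```

Under these assumptions, `parse_reobfuscate_arg("0000000000000000")` yields eight zero bytes, which corresponds to the de-obfuscation case described above.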
If we detect unobfuscated blocks at startup, we suggest this new option in a warning.
At startup, we iterate over all `(blk|rev)*.dat` files, read them with the old XOR key and write them back with the new key (as `<name>.reobfuscated`). After a successful write, we immediately delete the old file. Once all files are staged, we rename them back, atomically swap `xor.dat.reobfuscated` → `xor.dat`, and continue operation.
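The stage-then-rename flow can be sketched in Python as below. This is an illustration under stated assumptions, not the actual implementation: it assumes `xor.dat` exists and holds the 8 raw key bytes, that the key repeats from file offset 0, and that the new key is staged as `xor.dat.reobfuscated` before any blockfile is touched. The names `xor_with_key` and `reobfuscate` are hypothetical, and resume handling and fsync calls are left out.

```python
import os
import re
from pathlib import Path

KEY_SIZE = 8
SUFFIX = ".reobfuscated"
BLOCKFILE_RE = re.compile(r"^(blk|rev)\d+\.dat$")


def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key, key byte 0 aligned to file offset 0."""
    return bytes(b ^ key[i % KEY_SIZE] for i, b in enumerate(data))


def reobfuscate(blocks_dir: Path, new_key: bytes) -> None:
    old_key = (blocks_dir / "xor.dat").read_bytes()
    # Undoing old_key and applying new_key is a single XOR with old ^ new.
    delta = bytes(a ^ b for a, b in zip(old_key, new_key))
    # Assumption: the staged key file doubles as the "run in progress" marker.
    (blocks_dir / ("xor.dat" + SUFFIX)).write_bytes(new_key)

    originals = sorted(p for p in blocks_dir.iterdir() if BLOCKFILE_RE.match(p.name))
    for path in originals:
        staged = path.with_name(path.name + SUFFIX)
        # Whole file is held in memory, one file at a time.
        staged.write_bytes(xor_with_key(path.read_bytes(), delta))
        path.unlink()  # delete the old file right after a successful write

    # All files are staged: rename them back, then swap the key file last.
    for path in originals:
        os.replace(path.with_name(path.name + SUFFIX), path)
    os.replace(blocks_dir / ("xor.dat" + SUFFIX), blocks_dir / "xor.dat")
```

Because XOR is its own inverse, passing an all-zero `new_key` leaves the files in the clear, and running the function twice with the keys swapped restores the original bytes.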
We log the old and new keys and print progress roughly per-percent as files complete (i.e. max 100 progress logs).
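One way to get at most 100 progress logs is to emit a line only when the integer percentage of completed files increases. The Python sketch below shows that idea; the generator name `progress_messages` and the message wording are hypothetical and not taken from the patch.

```python
from typing import Iterator


def progress_messages(total_files: int) -> Iterator[str]:
    """Yield one message each time the whole-number percentage increases."""
    last_pct = 0
    for done in range(1, total_files + 1):
        pct = done * 100 // total_files
        if pct > last_pct:
            last_pct = pct
            # Hypothetical wording, for illustration only.
            yield f"Re-obfuscating blocks: {pct}% ({done}/{total_files} files)"
```

With fewer than 100 files, this produces one message per file; with more, exactly 100.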
Constraints
- Re-obfuscation resumes automatically (detected via `xor.dat.reobfuscated`) even without the flag. In the worst case, a crash should only force us to redo previous work.
- We need to load the whole blockfile in memory (~160MB peak memory usage) and write an extra blockfile.
- Single-threaded, processing one file at a time to keep code simple and avoid complexity of interleaving renames and key swaps across threads.
- Fast in practice with sequential read/modify/write per blockfile - after recent obfuscation vectorization, this path is very quick.
Performance
| CPU | HDD/SSD | block count | size | files | time (min) | blocks/min |
|---|---|---|---|---|---|---|
| Apple M4 Max laptop | SSD | ~909k | ~707 GB | 9,982 | 6.2 | 146,612 |
| Intel Core i9 | SSD | ~909k | ~725 GB | 10,238 | 23.1 | 39,351 |
| Intel Core i7 | HDD | ~909k | ~720 GB | 10,156 | 208.7 | 4,356 |
| Raspberry Pi 4B | HDD | ~738k | ~601 GB | 8,511 | 1237 | 596 |
Similar work: