RFC: blocks: add -reobfuscate-blocks arg to xor existing blk/rev on startup #33324

pull l0rinc wants to merge 4 commits into bitcoin:master from l0rinc:l0rinc/reobfuscate-blocks changing 10 files +253 −12
  1. l0rinc commented at 11:51 pm on September 5, 2025: contributor

    Draft for now - comments are welcome

    • Should we repurpose the existing -blocksxor arg instead?
    • Would multithreading help enough to justify the extra complexity, given the current IO-bound profile?
    • Should we split ObfuscateBlocks out of init? I have split it into many local lambdas, but we may want to find a better home for those methods…
    • Which additional scenarios should we cover in testing (resume mid-run, explicit 16-hex key path, large datadirs, pruned)?
    • Can we safely assume the extra memory and disk space is available?
    • Release notes, translations?
    • Given user concern about what they store on disk, should we consider including this in v30?

    Context

    Recent discussions highlighted that many nodes which synced before Bitcoin Core v28 have their block and undo files stored effectively in the clear (zero XOR key). This patch adds a simple, resumable maintenance tool to obfuscate previously raw block files, rotate an existing key to a fresh random one, or de-obfuscate (set key to zero) if consciously chosen, all without requiring resync. The operation can be cancelled and restarted safely.

    Implementation

    The new startup option -reobfuscate-blocks[=VALUE] accepts either 16 hex characters as an exact 8-byte XOR key (little-endian in-memory layout) or a boolean to generate a random 64-bit key. For example, -reobfuscate-blocks=0000000000000000 sets the key to zero, effectively removing obfuscation.
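The explicit-key form of the option can be illustrated with a minimal sketch. `ParseHexKey` and `KeyBytes` are hypothetical names, not taken from the patch, and the sketch assumes the 16 hex characters are decoded pairwise, in string order, into the 8 key bytes; the patch's actual parsing and byte order may differ.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string_view>

using KeyBytes = std::array<std::uint8_t, 8>;

// Hypothetical helper (not from the patch): decodes exactly 16 hex characters
// into 8 key bytes, in the order they appear in the string.
// Returns std::nullopt for any other length or for non-hex characters.
std::optional<KeyBytes> ParseHexKey(std::string_view hex)
{
    if (hex.size() != 16) return std::nullopt;
    const auto nibble = [](char c) -> int {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1; // not a hex digit
    };
    KeyBytes key{};
    for (std::size_t i = 0; i < key.size(); ++i) {
        const int hi = nibble(hex[2 * i]);
        const int lo = nibble(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) return std::nullopt;
        key[i] = static_cast<std::uint8_t>((hi << 4) | lo);
    }
    return key;
}
```

Under these assumptions, an all-zero string yields the all-zero key, which corresponds to the "no obfuscation" case described above.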

    If we detect unobfuscated blocks at start time we suggest this new option in a warning.

    At startup, we iterate over all (blk|rev)*.dat files, read them with the old XOR key and write them back with the new key (<name>.reobfuscated). After successful write, we immediately delete the old file. Once all files are staged, we rename them back and atomically swap xor.dat.reobfuscated → xor.dat and continue operation.
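The stage-then-rename flow described above can be sketched roughly as follows. This is an illustration under assumptions, not the patch's code: every function name is hypothetical, the key file swap is left out, and no fsync or error handling is shown. Because an original is deleted only after its staged copy has been written, rerunning `StageFiles` after an interruption simply processes the originals that still exist.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

namespace fs = std::filesystem;
using XorKey = std::array<std::uint8_t, 8>;

std::vector<std::uint8_t> ReadFile(const fs::path& path)
{
    std::ifstream in{path, std::ios::binary};
    std::istreambuf_iterator<char> first{in}, last;
    return std::vector<std::uint8_t>(first, last);
}

void WriteFile(const fs::path& path, const std::vector<std::uint8_t>& data)
{
    std::ofstream out{path, std::ios::binary | std::ios::trunc};
    out.write(reinterpret_cast<const char*>(data.data()),
              static_cast<std::streamsize>(data.size()));
}

// True for names such as blk00000.dat or rev00000.dat.
bool IsBlockOrUndoFile(const fs::path& path)
{
    const std::string name = path.filename().string();
    return path.extension() == ".dat" &&
           (name.rfind("blk", 0) == 0 || name.rfind("rev", 0) == 0);
}

// Phase 1 (hypothetical): write <name>.reobfuscated, then delete the original.
void StageFiles(const fs::path& dir, const XorKey& old_key, const XorKey& new_key)
{
    std::vector<fs::path> originals; // collect first, then modify the directory
    for (const auto& entry : fs::directory_iterator{dir}) {
        if (IsBlockOrUndoFile(entry.path())) originals.push_back(entry.path());
    }
    for (const fs::path& original : originals) {
        std::vector<std::uint8_t> data = ReadFile(original);
        for (std::size_t i = 0; i < data.size(); ++i) {
            // Undo the old key and apply the new key for this byte position.
            data[i] ^= static_cast<std::uint8_t>(old_key[i % 8] ^ new_key[i % 8]);
        }
        fs::path staged = original;
        staged += ".reobfuscated";
        WriteFile(staged, data); // overwrites a partial file left by a crash
        fs::remove(original);    // only after the staged copy is written
    }
}

// Phase 2 (hypothetical): rename every staged block/undo file back.
void RenameStagedFiles(const fs::path& dir)
{
    std::vector<fs::path> staged;
    for (const auto& entry : fs::directory_iterator{dir}) {
        const fs::path& p = entry.path();
        if (p.extension() == ".reobfuscated" && IsBlockOrUndoFile(p.stem())) {
            staged.push_back(p);
        }
    }
    for (const fs::path& from : staged) {
        fs::path to = from;
        to.replace_extension(); // drops the trailing ".reobfuscated"
        fs::rename(from, to);
    }
}
```

The sketch holds one whole file in memory at a time, which mirrors the memory constraint listed under Constraints.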

    We log the old and new keys and print progress roughly per-percent as files complete (i.e. max 100 progress logs).
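One way to get "at most 100 progress logs" is to emit a line only when the integer percentage of completed files increases. The sketch below is an assumption about how such throttling could work, not the patch's logic; `ProgressMilestones` is a hypothetical name, and it returns the percentages instead of logging so the behaviour is easy to inspect.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: returns the percent values at which a progress line
// would be emitted while processing `total_files` files one by one.
std::vector<int> ProgressMilestones(std::size_t total_files)
{
    std::vector<int> logged;
    int last_percent = 0;
    for (std::size_t done = 1; done <= total_files; ++done) {
        const int percent = static_cast<int>(done * 100 / total_files);
        if (percent > last_percent) { // log only when the whole percent changes
            logged.push_back(percent);
            last_percent = percent;
        }
    }
    return logged;
}
```

With fewer than 100 files, each completed file produces one line; with more, several files share a percentage and only the first of them is reported.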

    Constraints

    • Re-obfuscation resumes automatically (detected via xor.dat.reobfuscated) even without the flag. In the worst case, a crash should only force us to redo previous work.
    • We need to load the whole blockfile in memory (~160MB peak memory usage) and write an extra blockfile.
    • Single-threaded, processing one file at a time to keep code simple and avoid complexity of interleaving renames and key swaps across threads.
    • Fast in practice with sequential read/modify/write per blockfile - after recent obfuscation vectorization, this path is very quick.

    Performance

    | CPU | Storage | Block count | Size | Files | Time (min) | Blocks/min |
    |---|---|---|---|---|---|---|
    | Apple M4 Max laptop | SSD | ~909k | ~707 GB | 9,982 | 6.2 | 146,612 |
    | Intel Core i9 | SSD | ~909k | ~725 GB | 10,238 | 23.1 | 39,351 |
    | Intel Core i7 | HDD | ~909k | ~720 GB | 10,156 | 208.7 | 4,356 |
    | Raspberry Pi 4B | HDD | ~738k | ~601 GB | 8,511 | 1237 | 596 |

    Similar work:

  2. DrahtBot commented at 11:51 pm on September 5, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33324.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK TheCharlatan, stickies-v

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #33231 (net: Prevent node from binding to the same CService by w0xlt)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • blocksdir -> blocks directory [more natural and clear phrasing; “blocksdir” is nonstandard and may confuse readers]


  3. refactor: inline constant `f_obfuscate = false` parameter 4e3d61b84c
  4. l0rinc force-pushed on Sep 6, 2025
  5. refactor: add path + string and file removal helpers baad60d064
  6. l0rinc force-pushed on Sep 7, 2025
  7. init: add `-reobfuscate-blocks` argument 09a083ca9e
  8. l0rinc force-pushed on Sep 8, 2025
  9. blocks: add `-reobfuscate-blocks` to xor existing blk/rev on startup
    ### Context
    
    Recent discussions highlighted that many nodes which synced before Bitcoin Core v28 have their block and undo files stored effectively in the clear (zero XOR key). This patch adds a simple, resumable maintenance tool to obfuscate previously raw block files, rotate an existing key to a fresh random one, or deobfuscate (set key to zero) if consciously chosen, all without requiring resync. The operation can be cancelled and restarted safely.
    
    ### Implementation
    
    The new startup option `-reobfuscate-blocks[=VALUE]` accepts either 16 hex characters as an exact 8-byte XOR key (little-endian in-memory layout) or a boolean to generate a random 64-bit key. For example, `-reobfuscate-blocks=0000000000000000` sets the key to zero, effectively removing obfuscation.
    
    If we detect unobfuscated blocks at start time we suggest this new option in a warning.
    
    At startup, we iterate over all `(blk|rev)*.dat` files, read them with the old XOR key and write them back with the new key (`<name>.reobfuscated`).
    The implementation actually combines the two keys and reads directly into the new obfuscated version to only do a single iteration over the data. This works if the original blocks aren't obfuscated or if the new blocks aren't or if both are.
    After successful write, we immediately delete the old file. Once all files are staged, we rename them back and atomically swap `xor.dat.reobfuscated` → `xor.dat` and continue operation.
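The single-pass trick relies on XOR being associative and self-inverse: undoing the old key and then applying the new key is the same as applying `old ^ new` once. A minimal sketch, with hypothetical names (`CombineKeys`, `ApplyKey`) that are not taken from the patch, and assuming the key repeats every 8 bytes from the start of the buffer:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

using Key = std::array<std::uint8_t, 8>;

// Hypothetical helper: byte-wise XOR of the two keys.
Key CombineKeys(const Key& old_key, const Key& new_key)
{
    Key combined{};
    for (std::size_t i = 0; i < combined.size(); ++i) {
        combined[i] = static_cast<std::uint8_t>(old_key[i] ^ new_key[i]);
    }
    return combined;
}

// XORs the buffer in place with the repeating 8-byte key.
void ApplyKey(std::vector<std::uint8_t>& data, const Key& key)
{
    for (std::size_t i = 0; i < data.size(); ++i) {
        data[i] ^= key[i % key.size()];
    }
}
```

If either key is all zeros, the combined key equals the other one, which is why the same code path covers obfuscating raw files, rotating keys, and removing obfuscation.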
    
    We log the old and new keys and print progress roughly per-percent as files complete (i.e. max 100 progress logs).
    
    ### Constraints
    
    * Re-obfuscation resumes automatically (detected via `xor.dat.reobfuscated`) even without the flag. In the worst case, a crash should only force us to redo previous work.
    * We need to load the whole blockfile in memory (<130MB) and write an extra blockfile.
    * Single-threaded, processing one file at a time to keep code simple and avoid complexity of interleaving renames and key swaps across threads.
    * Fast in practice with sequential read/modify/write per blockfile - after recent obfuscation vectorization, this path is very quick.
    
    ### Performance
    
    > M4 Max laptop with SSD
    
    * ~900k blocks (~707GB) - `[obfuscate] finished migrating 9982 file(s) in 372s`
    
    > i9 with SSD
    
    * ~900k blocks (~725GB) - `[obfuscate] finished migrating 10238 file(s) in 1386s`
    
    -----
    
    Similar attempts: #32451 and andrewtoth/blocks-xor
    
    Co-authored-by: Andrew Toth <andrewstoth@gmail.com>
    Co-authored-by: Murch <murch@murch.one>
    d3962f65fa
  10. l0rinc force-pushed on Sep 8, 2025
  11. TheCharlatan commented at 8:21 am on September 8, 2025: contributor
    Concept ACK
  12. stickies-v commented at 10:06 pm on September 8, 2025: contributor

    Concept ACK, this seems like useful functionality to expose.

    > Should we split ObfuscateBlocks out of init? I have split it into many local lambdas, but we may want to find a better home for those methods…

    I don’t like using startup options for one-time operations (I feel the same about e.g. -reindex). Without having thought it through too much yet, maybe we can bundle this e.g. as part of bitcoin-util or a separate bitcoin-xor-blocks utility?

    > Should we repurpose the existing -blocksxor arg instead?

    With this PR, IIUC we’d have -blocksxor, -reobfuscate-blocks, and the existence of the xor.dat file that all have some redundancy and thus potential for conflict (e.g. blocksxor=0, reobfuscate-blocks=1, and a non-zero xor.dat file). Reducing that complexity seems like it would be useful.

  13. DrahtBot added the label Needs rebase on Sep 9, 2025
  14. DrahtBot commented at 11:39 pm on September 9, 2025: contributor
    🐙 This pull request conflicts with the target branch and needs rebase.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-09-10 03:13 UTC