Continuing the work of @rdponticelli, this is an implementation of the autoprune functionality (see #4701). Thanks to @rdponticelli for doing a lot of prior work on this topic; we used his work and the discussion of it as our starting point (though we found it easier to base this pull off of master). @mrbandrews, @morcos, @ajweiss, and @sdaftuar collaborated on this pull.
To summarize autoprune: this adds a -prune=N option to bitcoind, which if set to N>0 will enable autoprune. When autopruning is enabled, block and undo files will be deleted to try to keep the total space used by those files below the prune target (N, in MB) specified by the user, subject to some constraints:
- The last 288 blocks on the main chain are always kept (MIN_BLOCKS_TO_KEEP),
- N must be at least 550MB (chosen as a value for the target that could reasonably be met, with some assumptions about block sizes, orphan rates, etc; see comment in main.h),
- No blocks are pruned until chainActive is at least 100,000 blocks long (on mainnet; defined separately for mainnet, testnet, and regtest in chainparams as nAutopruneAfterHeight).
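
As a rough illustration of how these three constraints interact, here is a minimal standalone sketch. It is not the code in this pull: `MIN_PRUNE_TARGET_MB`, `IsValidPruneTarget`, `LastPrunableHeight`, and `ShouldPrune` are hypothetical names, the sketch assumes 1 MB = 1024*1024 bytes, and it treats "chain length" as the tip height for simplicity.

```cpp
#include <cassert>
#include <cstdint>

// Values quoted in the description above.
static const int MIN_BLOCKS_TO_KEEP = 288;
// Hypothetical name for the 550MB floor on the prune target.
static const std::uint64_t MIN_PRUNE_TARGET_MB = 550;

// -prune=0 means pruning is disabled; any other value must meet the floor.
bool IsValidPruneTarget(std::uint64_t targetMB) {
    return targetMB == 0 || targetMB >= MIN_PRUNE_TARGET_MB;
}

// Highest block height whose data may be deleted, or -1 if nothing may be
// pruned yet. pruneAfterHeight stands in for nAutopruneAfterHeight.
int LastPrunableHeight(int chainHeight, int pruneAfterHeight) {
    if (chainHeight < pruneAfterHeight) return -1;
    int last = chainHeight - MIN_BLOCKS_TO_KEEP;  // keep the last 288 blocks
    return last < 0 ? -1 : last;
}

// True if a pruning pass should be attempted at all.
bool ShouldPrune(std::uint64_t targetMB, std::uint64_t usageBytes,
                 int chainHeight, int pruneAfterHeight) {
    if (targetMB == 0) return false;        // pruning disabled
    assert(IsValidPruneTarget(targetMB));   // smaller targets rejected earlier
    if (LastPrunableHeight(chainHeight, pruneAfterHeight) < 0) return false;
    return usageBytes > targetMB * 1024 * 1024;  // assumes MB = 1024*1024 bytes
}
```

With a tip at height 100,000 and the mainnet threshold, `LastPrunableHeight(100000, 100000)` gives 99,712, so only files whose blocks all sit at or below that height would be candidates for deletion.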
We’ve attempted to address the comments raised in the prior conversations; here are a few things worth noting:
- Handling reorgs deeper than the block data stored on disk is not yet implemented. Currently, bitcoind will exit with a message indicating that it failed to read the block from disk – suggestions for a better error message are welcome.
- We’ve disabled block relay (with a check to see if we’re NODE_NETWORK or not). In the future when we have worked out the service bit to use for an autopruned node and the behavior for block requests, we can re-enable this appropriately. (This differs from the behavior of #4701.)
- PruneOneBlockFile currently iterates mapBlockIndex each time to determine which entries to update; we can optimize this by maintaining a reverse lookup table so that the blocks in each blockfile are known.
- One open question is what to do if we’ve pruned before, and then are running a -reindex: what do we do with the leftover files? In this case, the earliest block files will have been deleted by the pruning code and so won’t be processed in the reindex loop, but later we may try overwriting the remaining files in place. Would it be better to, say, delete the files preemptively to prevent potential file corruption?
- RPC/REST interfaces currently do not distinguish between unknown blocks/tx and pruned blocks/tx. It’s unclear whether a specific error should be returned, or whether the block/tx query interfaces should perhaps be disabled for pruned nodes.
- Tests and a test plan will be forthcoming.
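
For the PruneOneBlockFile optimization mentioned above, a reverse lookup table might look roughly like the sketch below. This is only an illustration with simplified stand-in types: `BlockEntry`, `BlockFileIndex`, `hashesByFile`, `Add`, and `PruneFile` are hypothetical names, and string keys stand in for block hashes.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Simplified stand-in for a block index entry.
struct BlockEntry {
    int nFile;      // number of the block file that holds this block
    bool haveData;  // whether the block data is still on disk
};

struct BlockFileIndex {
    std::map<std::string, BlockEntry> byHash;      // stand-in for mapBlockIndex
    std::multimap<int, std::string> hashesByFile;  // hypothetical reverse lookup

    // Records a block and remembers which file it lives in.
    void Add(const std::string& hash, int nFile) {
        byHash[hash] = BlockEntry{nFile, true};
        hashesByFile.insert(std::make_pair(nFile, hash));
    }

    // Marks every block stored in file nFile as pruned and returns how many
    // entries were updated. Only entries for that file are visited, instead
    // of scanning the whole index.
    std::size_t PruneFile(int nFile) {
        auto range = hashesByFile.equal_range(nFile);
        std::size_t updated = 0;
        for (auto it = range.first; it != range.second; ++it) {
            auto found = byHash.find(it->second);
            assert(found != byHash.end());  // reverse table must match the index
            found->second.haveData = false;
            ++updated;
        }
        hashesByFile.erase(range.first, range.second);
        return updated;
    }
};
```

The trade-off is extra memory and one more structure to keep consistent with the index whenever a block is written, in exchange for pruning cost proportional to the blocks in one file rather than to the size of the whole index.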
(Though this pull still needs more work to address some of the above items, we thought we’d open it now to keep review momentum on this feature moving…)