RFC: Repair block directory

mzumsande commented at 2:39 pm on November 1, 2019: member

I’d like to discuss a raw idea for a new feature to repair the block directory, motivated by my first IBD experience some time ago:

In case of data corruption of the chainstate, we can perform a --reindex-chainstate. In case of data corruption of the block index, we can perform a --reindex.

However, if one of the block files is corrupted, both reindex and reindex-chainstate will stop at some point, and we are back to IBD, which can be painful in case of a slow connection or limited traffic. For example, in case of a bad sector in blk00002.dat, we would download almost the entire blockchain again, even if we are actually just missing a single block on our disk.

So my idea is to create a repair-blockdir feature that would scan the block directory (similar to reindex) and check all files for integrity and completeness using the existing block index. In case of mismatches, only the faulty blk?????.dat files would be rebuilt, requesting specifically the missing blocks from the p2p network.
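To make the scan step concrete, here is a minimal sketch in Python. It is not Bitcoin Core code. It assumes the on-disk layout of blk?????.dat files (4-byte network magic, 4-byte little-endian length, then the serialized block, whose first 80 bytes are the header), and the function names `scan_blk_bytes` and `missing_blocks` are made up for illustration. It only hashes headers, so it would not notice damage inside a block's transaction data; a real check would also have to verify the merkle root.

```python
import hashlib
import struct

# Mainnet network magic that precedes every block record in a blk file.
MAINNET_MAGIC = bytes.fromhex("f9beb4d9")


def block_hash(header: bytes) -> str:
    """Double SHA-256 of the 80-byte header, in the usual byte-reversed hex form."""
    digest = hashlib.sha256(hashlib.sha256(header).digest()).digest()
    return digest[::-1].hex()


def scan_blk_bytes(data: bytes, magic: bytes = MAINNET_MAGIC) -> dict:
    """Return {block_hash: record_offset} for every readable record in `data`.

    Hypothetical helper: unreadable regions are skipped by searching for the
    next occurrence of the magic bytes.
    """
    found = {}
    pos = 0
    while pos + 8 <= len(data):
        if data[pos:pos + 4] != magic:
            # Not at a record boundary (padding or damage): resynchronize.
            pos = data.find(magic, pos)
            if pos == -1:
                break
            continue
        (size,) = struct.unpack_from("<I", data, pos + 4)
        start = pos + 8
        if size < 80 or start + size > len(data):
            # Implausible length or truncated record: keep searching.
            pos += 1
            continue
        found[block_hash(data[start:start + 80])] = pos
        pos = start + size
    return found


def missing_blocks(expected_hashes, found) -> list:
    """Hashes the block index expects in this file but the scan did not find."""
    return [h for h in expected_hashes if h not in found]
```

A caller would read one file with `open(path, "rb").read()`, pass the bytes to `scan_blk_bytes`, and compare the result against the hashes the block index records for that file.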

Before putting any serious work into that: Would such a feature make sense to anyone? Has something similar been attempted before?

mzumsande added the label Feature on Nov 1, 2019

laanwj commented at 3:26 pm on November 1, 2019: member

Generally: I don’t think it makes sense to put work in accommodating hardware with “bad sectors”. Bitcoin’s requirements on hardware resources are heavy compared to other software, and even small bit errors could be actively dangerous (say, in signing, or wallet keys). All in all, it’s better to fail on such hardware.

On the other hand, the ability to cope with non-complete sets of blocks and selectively download, would also be necessary for ‘smarter’ pruning that selectively (non-contiguous, based on a seed) keeps ranges of blocks.

So maybe the infrastructure for this could be useful. It wouldn’t even need a new flag, it could simply be the reindex that does this when it notices some blocks in the middle are missing.
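As a rough illustration of "noticing that some blocks in the middle are missing", the sketch below groups the missing heights into contiguous ranges that could then be requested from peers. The function `missing_ranges` and its inputs (a tip height and a set of heights found on disk) are assumptions for the example, not part of any existing reindex code.

```python
def missing_ranges(tip_height: int, have: set) -> list:
    """Return inclusive (first, last) height ranges in 0..tip_height not in `have`."""
    ranges = []
    start = None  # first height of the gap currently being tracked, if any
    for height in range(tip_height + 1):
        if height not in have:
            if start is None:
                start = height
        elif start is not None:
            ranges.append((start, height - 1))
            start = None
    if start is not None:
        # The gap runs up to the tip.
        ranges.append((start, tip_height))
    return ranges
```

The same gap list would describe a node that deliberately keeps only selected ranges, which is the pruning case mentioned above.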

laanwj added the label Data corruption on Nov 1, 2019

jonasschnelli commented at 7:24 pm on November 4, 2019: contributor

I agree with @laanwj.

> For example, in case of a bad sector in blk00002.dat, we would download almost the entire blockchain again, even if we are actually just missing a single block on our disk.

If you run on a system/disk/storage that produces bad sectors, you eventually run the risk of corrupting more important data, for example your wallet's private keys.

What eventually makes sense is to back up the UTXO set (#8037). I usually follow the hardlink approach to recover in case of a sudden shutdown (where the cache write-down gets interrupted).
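The comment does not spell out the hardlink procedure, so the following is only one possible reading: with the node stopped, create a second directory tree whose entries are hard links to the existing files. The function name `hardlink_snapshot` and the directory arguments are hypothetical. A hard link shares its data with the original, so this protects against files being deleted or replaced later, not against a file being modified in place, and it only works within a single filesystem.

```python
import os


def hardlink_snapshot(src_dir: str, dst_dir: str) -> int:
    """Mirror src_dir into dst_dir using hard links; return the number of files linked."""
    count = 0
    for root, _dirs, files in os.walk(src_dir):
        # Recreate the same relative directory layout under dst_dir.
        rel = os.path.relpath(root, src_dir)
        target = os.path.join(dst_dir, rel)
        os.makedirs(target, exist_ok=True)
        for name in files:
            # os.link creates a new directory entry for the same file data.
            os.link(os.path.join(root, name), os.path.join(target, name))
            count += 1
    return count
```

Because no file data is copied, such a snapshot is fast and takes almost no extra disk space until the originals are deleted.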

mzumsande commented at 0:50 am on November 5, 2019: member

Thanks for the advice! In my situation, I had split away just the blocks dir to an external disk, which failed, so something like this would have come in handy. But the point not to accommodate faulty hardware makes sense to me, so I’ll close this issue.

mzumsande closed this on Nov 5, 2019

kollokollo commented at 9:04 am on November 23, 2019: none

I would like to have that feature. Having run into this kind of trouble several times in the last few years, every time on new hardware, I am giving up running a full node this time. Obviously I am not able to run good and stable hardware. I know that this will make the bitcoin network a tiny bit smaller and leave the work and control more and more only to professional IT departments, thus compromising the decentralization and so forth. You know what that means in the long run. So: Yes, I would support any feature able to cope with normal consumer hardware limitations, to enable everyone with less-professional equipment to still participate in the network in the future.

DrahtBot locked this on Dec 16, 2021

RFC: Repair block directory #17341