Feature request: Backup of datadir state #31324

issue lcharles123 opened this issue on November 19, 2024
  1. lcharles123 commented at 5:13 pm on November 19, 2024: none

    Please describe the feature you’d like to see added.

    It would be good to have an option in bitcoind to periodically back up the current state of the datadir and to restore it.

    This is not a software problem but an environmental one. Many events can corrupt data inside the datadir, to name a few: power outages, running out of resources (RAM), network failures (in the case of network storage). Once the data is corrupted, it is costly to re-sync from scratch. Instead of syncing, the node could just restore the backed-up datadir to some point in time.

    Describe the solution you’d like

    Add a folder called backup to the datadir. It can hold four files: the last two backups, named with a timestamp and compressed as tar.gz (or another, better format), plus one text file per backup with the information needed to restore it. Add two options to bitcoind, -restorebackup=<0|1> and -backup=<n>, where n can be:

    • 0, to disable backups;
    • 1, to back up only once after the node is synced, and not update this backup if one is already present in the backup folder;
    • greater than 1, indicating the period in days after which the backup is taken again. Two backups are always kept, to cover the case in which a failure occurs during the backup process.
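    The proposed -backup=<n> semantics can be sketched as a small decision function. This is only an illustration of the rules listed above, not Bitcoin Core code: the function name `backup_due` and its parameters are hypothetical, and requiring a synced node for n > 1 as well is an assumption (the proposal states it only for n = 1).

```python
from datetime import datetime, timedelta
from typing import Optional


def backup_due(n: int, synced: bool, last_backup: Optional[datetime], now: datetime) -> bool:
    """Decide whether a backup should run under the proposed -backup=<n> rules.

    n == 0  -> backups disabled
    n == 1  -> back up once after sync, never overwrite an existing backup
    n  > 1  -> back up again once n days have passed since the last backup
    """
    if n <= 0 or not synced:
        # 0 disables the feature; assumption: never back up a node still syncing.
        return False
    if last_backup is None:
        # No backup exists yet, so take the first one.
        return True
    if n == 1:
        # One-shot mode: an existing backup is left untouched.
        return False
    return now - last_backup >= timedelta(days=n)
```

    For example, with n = 7 a backup taken six days ago is left alone, while one taken seven or more days ago is refreshed.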

    Describe any alternatives you’ve considered

    The data that need to be backed up to reconstruct the state at a given point in time are (I think):

    • chainstate folder;
    • Last blocks/*.blk and .rev files that are not completely full;
    • blocks/index folder
    • Any other files that are used with a specific configuration, such as bloom filters (or, when restoring the backup, just reconstruct them to avoid conflicts)

    Currently I do this manually: I keep an HDD with the datadir as of May 2024, and when restoring I just copy chainstate, blocks/index and the last blk/rev files from the backup, and delete the blk/rev files that are not present in the backup. Since my node uses an HDD and has 2 GB of RAM, it takes one month to sync from scratch. I have restored this backup several times and saved a lot of time.

    The creation of this backup can also be automated with scripts in bash, Python, etc. The script needs to stop bitcoind, then copy chainstate, blocks/index and the last few .blk and .rev files. The restoration can be done manually: just replace chainstate and blocks/index, and remove the .blk/.rev files newer than those in the backup.
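    A minimal Python sketch of such a script follows. It assumes bitcoind has already been stopped cleanly (for example with `bitcoin-cli stop`) before anything is copied, and that the block and undo files use Bitcoin Core's blkNNNNN.dat / revNNNNN.dat naming. The function names and the `keep_last` parameter are hypothetical, chosen only for illustration.

```python
import shutil
from pathlib import Path
from typing import List


def select_backup_paths(datadir: Path, keep_last: int = 2) -> List[Path]:
    """Return chainstate, blocks/index and the newest blk/rev files."""
    if keep_last < 1:
        raise ValueError("keep_last must be at least 1")
    blocks = datadir / "blocks"
    paths = [datadir / "chainstate", blocks / "index"]
    for prefix in ("blk", "rev"):
        # Zero-padded file names sort correctly in lexicographic order.
        files = sorted(blocks.glob(f"{prefix}*.dat"))
        paths.extend(files[-keep_last:])
    return paths


def copy_backup(datadir: Path, dest: Path, keep_last: int = 2) -> None:
    """Copy the selected paths to dest, preserving the datadir layout.

    Assumption: bitcoind is not running while this executes.
    """
    for src in select_backup_paths(datadir, keep_last):
        target = dest / src.relative_to(datadir)
        target.parent.mkdir(parents=True, exist_ok=True)
        if src.is_dir():
            shutil.copytree(src, target, dirs_exist_ok=True)
        else:
            shutil.copy2(src, target)
```

    Calling `copy_backup(Path("/path/to/datadir"), Path("/mnt/backup"))` (both paths are placeholders) would leave a partial copy of the datadir that can be laid over a damaged one as described above.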

    Please leave any additional context

    No response

  2. lcharles123 added the label Feature on Nov 19, 2024
  3. pinheadmz commented at 5:33 pm on November 19, 2024: member
    I think backups should be the user’s job, not bitcoind’s. Especially putting backups in the datadir – wouldn’t it make more sense for them to be on a completely different drive?
  4. adamandrews1 commented at 9:59 pm on November 19, 2024: none
    If this goes forward, I think the backup folder path should be configurable and should default to a path outside the datadir. The feature would be disabled by default. I don’t see the need to keep two backup files. The backup operation should be atomic and replace the old backup file. If it fails, the old backup file should remain undisturbed. I think the API should be in terms of block height: -autobackup=100 would attempt to rewrite the backup on new block discovery if the new block is more than 100 blocks above the latest block in the backup folder. 0 would disable it.
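    Both ideas in this comment, the height-based trigger and the atomic replacement, can be sketched in Python. This is an illustration under stated assumptions, not an existing bitcoind feature: -autobackup is only proposed here, the function names are hypothetical, and atomicity relies on the temporary file living on the same filesystem as the backup file so that `os.replace` is a single rename.

```python
import os
import tempfile
from pathlib import Path


def should_rewrite_backup(tip_height: int, backup_height: int, interval: int) -> bool:
    """True if the new tip is more than `interval` blocks above the backup; 0 disables."""
    if interval <= 0:
        return False
    return tip_height - backup_height > interval


def replace_backup_atomically(data: bytes, backup_file: Path) -> None:
    """Write data to a temporary file, then rename it over the old backup.

    If writing fails, the old backup file is left undisturbed.
    """
    backup_file.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=backup_file.parent, prefix=backup_file.name + ".")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the bytes are on disk before the rename
        os.replace(tmp_name, backup_file)  # atomic on the same filesystem
    except BaseException:
        # Clean up the partial temporary file; the old backup is still in place.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

    With an interval of 100, a backup taken at height 870,950 would be rewritten once the tip reaches height 871,051.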
  5. maflcko commented at 7:38 am on November 20, 2024: member

    This is not a software problem but an environmental one. Many events can corrupt data inside the datadir, to name a few: power outages, running out of resources (RAM), network failures (in the case of network storage).

    From a Bitcoin Core perspective, none of these should lead to corruption, as the state is flushed atomically. If a flush were to fail, it would be redone on the next startup (which could then take a long time).

    If you are observing issues, they are likely hardware issues, or software issues somewhere else. Trying to work around them with backup scripts is not ideal. It would be better to fix the issues themselves.

    Bitcoin Core makes heavy use of CPU, RAM and storage IO. Hardware defects might only become visible when running Bitcoin Core. You might want to check your hardware for defects.

    • Use software such as memtest86 to check your RAM.
    • Use software such as linpack, or Prime95 to check the CPU behaviour under load.
    • Use software such as smartctl, fsck, badblocks, or CrystalDiskInfo to check your storage device.

    Source: https://bitcoin.stackexchange.com/a/12206

  6. maflcko commented at 7:45 am on November 20, 2024: member

    The creation of this backup can also be automated with scripts in bash, Python, etc. The script needs to stop bitcoind, then copy chainstate, blocks/index and the last few .blk and .rev files. The restoration can be done manually: just replace chainstate and blocks/index, and remove the .blk/.rev files newer than those in the backup.

    I am sure most people using Bitcoin Core in production are doing backups already. However, I am also sure that everyone uses their preferred method of backing up. Attempting to promote one way of backing up over the others by adding an opinionated example script to this repo is likely to be controversial and unlikely to be useful, because people will stick to their normal/preferred backup method anyway. (For example, compression will break the benefits of CoW filesystems, not to mention that the benefits of a backup on the same filesystem are limited either way.)

  7. fanquake commented at 9:45 am on November 20, 2024: member
    Going to close this. It’s unlikely we are going to introduce additional complexity to our software to work around issues in your hardware / environment.
  8. fanquake closed this on Nov 20, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-03 18:12 UTC
