Add reindex=auto flag to automatically reindex corrupt data #22072
pull AaronDewes wants to merge 1 commits into bitcoin:master from AaronDewes:auto-recovery changing 1 files +16 −6-
AaronDewes commented at 7:37 am on May 26, 2021: noneThis PR allows the reindex flag to be set to auto, which automatically starts a reindex if the chain state or block index are corrupt. This can be especially useful for Raspberry Pi based full nodes, which often experience power outages or similar issues which can corrupt data. It allows full node operators to make Bitcoin Core reindex automatically, without having to worry about removing the reindex flag again. (Which isn’t much effort, but can be annoying to forget)
-
laanwj added the label UTXO Db and Indexes on May 26, 2021
-
laanwj added the label Data corruption on May 26, 2021
-
laanwj commented at 9:08 am on May 26, 2021: member
Thanks for your first contribution!
I’m a bit divided on whether to add yet another configuration option for this. Either something like this makes sense as default behavior, or you might just as well follow the manual steps.
-
AaronDewes commented at 9:57 am on May 26, 2021: none
Either something like this makes sense as default behavior, or you might just as well follow the manual steps.
In my opinion, it would make sense to have this as the default behavior, because the user can’t do anything except reindexing anyway (Maybe still ask the user on the GUI though).
-
in src/init.cpp:1523 in ed68820fb4 outdated
1520@@ -1520,11 +1521,18 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info) 1521 1522 if (!fLoaded && !ShutdownRequested()) { 1523 // first suggest a reindex
amadeuszpawlik commented at 4:15 pm on May 26, 2021:Couldn’t this comment be (re)moved?
AaronDewes commented at 4:36 pm on May 26, 2021:Yes, I’ll move it.sipa commented at 4:39 pm on May 26, 2021: memberI’m not sure about this. If the chainstate gets corrupted the user should be made aware, so they can fix their hardware or setup.
If you reindex automatically without changing anything in the environment, there is no reason why the situation would be any different, and it’ll just end up reindexing again?
AaronDewes commented at 4:56 pm on May 26, 2021: noneI’m not sure about this. If the chainstate gets corrupted the user should be made aware, so they can fix their hardware or setup.
Yes, so maybe leave it as an option, of course, users should fix their setup in such a case, but a sudden power outage can happen, and this would allow to try to automatically fix this.
If you reindex automatically without changing anything in the environment, there is no reason why the situation would be any different, and it’ll just end up reindexing again?
If I understand the init logic correctly, the code doesn’t restart from scratch if a reindex is triggered at that point, and because Bitcoin Core doesn’t automatically rebuild a corrupt block database, only with this argument, I don’t think such a loop is possible. Also, I’m pretty sure the part of the code suggesting a reindex is only reached when a reindex is not already running, and a reindex would most likely work.
sipa commented at 4:59 pm on May 26, 2021: memberIf a power outage causes corruptions, you probably have bigger problems with your setup.
My point is: if you automatically reindex, the user isn’t made aware of the corruption, can’t do anything to resolve whatever is causing it, and the next time it gets corrupted you’ll just start all over again.
AaronDewes commented at 5:07 pm on May 26, 2021: noneIf a power outage causes corruptions, you probably have bigger problems with your setup.
Solutions like RapiBlitz, Umbrel or MyNode make setting up a node really easy. These are cheap and easy to build nodes, but Raspberry Pis don’t deal with power outages well, and that can often lead to data corruption. A power outage causing problems is in my opinion okay for personal nodes, many users just want to get their node set up without having to worry about (and pay for) an UPS or something similar.
My point is: if you automatically reindex, the user isn’t made aware of the corruption, can’t do anything to resolve whatever is causing it, and the next time it gets corrupted you’ll just start all over again.
Should I add a warning to the logs about it then? A sudden power outage is the most likely cause, and if something can be fixed, why not fix it automatically? A delay will be noticed by the user (and a warning), like when Bitcoin is “Rolling forward”, which is also triggered by an unclean shutdown.
sipa commented at 5:15 pm on May 26, 2021: member@AaronDewes Is it so easy to corrupt such hardware? That sounds terrible…
I’ll let others chime in about this, I’m -0 on it. I think it’s always better to make sure the user is aware that corruptions happens so they can investigate - and just putting it in the log will mostly be overlooked. That said I see the convenience for setups where the user isn’t directly using the node themselves.
jamesob commented at 5:21 pm on May 26, 2021: memberI think provided
(i) this is not the default behavior, (ii) the code change is fairly limited (as it appears to be), (iii) automatic reindexes are clearly logged, and (iv) it actually does resolve the issue of a non-UPS’d setup recovering from a power-loss based corruption,
then this is a pretty compelling feature to add.
I am not sure how frequent corruption that is solely based upon power loss (vs. bad hardware) is, though.
AaronDewes commented at 5:27 pm on May 26, 2021: noneI am not sure how frequent corruption that is solely based upon power loss (vs. bad hardware) is, though.
I’ve been helping a lot of Umbrel users with such issues, and if it was a bad drive, either
(a) The drive wouldn’t mount (b) There would be different errors (that I don’t remember right now)
Every time this message got printed, a reindex actually helped.
and just putting it in the log will mostly be overlooked.
For this, it maybe could help to provide information that the node is reindexing over the RPC api, becauce I can’t find such an endpoint. This way, custom UIs build on top of that API could also prompt the user with a warning, so it is more likely to be seen. I’m not sure if that should be part of this or another PR though.
(iii) automatic reindexes are clearly logged
I’ll do that.
(iv) it actually does resolve the issue of a non-UPS’d setup recovering from a power-loss based corruption,
A manual reindex did it, so it should be the same for this.
ryanofsky commented at 6:25 pm on May 26, 2021: contributorThis seems like a good feature. Just to bikeshed, maybe this could be a new reindex value,-reindex=auto
complementing current-reindex=0
and-reindex=1
so it is less vague than-attemptrecovery
and maybe more discoverable.attemptrecovery
sounds more like a one time emergency flag to me than something I would configure permanently on my raspberry pi.maflcko commented at 4:45 am on May 27, 2021: memberhaving to worry about removing the reindex flag again. (Which isn’t much effort, but can be annoying to forget)
An alternative solution to that would be to only store
fReindex
on-reindex
and then stop immediately with instructions to restart. Also it will refuse to start with-reindex
enabled again.AaronDewes force-pushed on May 27, 2021AaronDewes commented at 6:33 am on May 27, 2021: noneI implemented @ryanofsky’s idea now, with reindex defaulting to 0, but I can change the default to auto if that makes more sense for you.
An alternative solution to that would be to only store fReindex on -reindex and then stop immediately with instructions to restart.
This is true, but the user would still have to manually add this flag in case of a power outage.
This can be especially useful for Raspberry Pi based full nodes, which often experience power outages or similar issues which can corrupt data.
And what do you think about what I wrote above?
and just putting it in the log will mostly be overlooked.
For this, it maybe could help to provide information that the node is reindexing over the RPC api
AaronDewes renamed this:
Add flag to automatically reindex corrupt data
Add reindex=auto flag to automatically reindex corrupt data
on May 27, 2021Add reindex=auto flag to automatically reindex corrupt data
This PR allows the reindex flag to be set to auto, which automatically starts a reindex if the chain state or block index are corrupt. This can be especially useful for Raspberry Pi based full nodes, which often experience power outages or similar issues which can corrupt data. It allows full node operators to make Bitcoin Core reindex automatically, without having to worry about removing the reindex flag again. (Which isn't much effort, but can be annoying to forget)
AaronDewes force-pushed on May 31, 2021luke-jr referenced this in commit 291ac87866 on Jun 27, 2021DrahtBot commented at 8:40 pm on April 7, 2022: contributorThe following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #24789 (init, index: disallow indexes when running reindex-chainstate by mzumsande)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
DrahtBot added the label Needs rebase on Apr 26, 2022DrahtBot commented at 12:14 pm on April 26, 2022: contributor🐙 This pull request conflicts with the target branch and needs rebase.
Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.
achow101 commented at 6:30 pm on October 12, 2022: memberClosing this as it has not had any activity in a while. If you are interested in continuing work on this, please leave a comment so that it can be reopened.achow101 closed this on Oct 12, 2022
luke-jr referenced this in commit 069ccfcbc4 on Jun 27, 2023bitcoin locked this on Oct 12, 2023
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 09:12 UTC
More mirrored repositories can be found on mirror.b10c.me