Fix for `Corrupt block found indicating potential hardware failure; shutting down`

helloscoopa commented at 3:43 pm on December 5, 2024: none

Is there an existing issue for this?

I have searched the existing issues

Current behaviour

The bitcoind process exits with the given log message. If I try to run it again, the same message appears. To continue, I need to run bitcoind with -reindex (-reindex-chainstate doesn’t work because the block index is corrupted), which takes a lot of time; I had to reindex about 8 times before I was able to sync about 445,000 blocks. This only slows down my full sync: as I download more blocks, each reindex takes longer. It doesn’t make sense for me to sync this way. I did not have any unexpected power failures, unexpected bitcoind exits, or anything like that. It just happens randomly.

Expected behaviour

Since this happens during the initial sync, it is presumably caused by corruption in memory, or the block received from the peer is itself corrupted.

Wouldn’t it be possible to just discard that block, or maybe disconnect from the peer that sent the corrupted block, and continue syncing? Why does the process have to exit, leaving a corrupted index on disk that has to be rebuilt from scratch before any use? Is there anything fundamental that I’m missing here?

Steps to reproduce

Clone the repo
Check out the v28x tag
Run ./autogen.sh && ./configure && make && make install

Relevant log output

ERROR: AcceptBlock: bad-txnmrklroot, hashMerkleRoot mismatch
*** Corrupt block found indicating potential hardware failure; shutting down
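For readers unfamiliar with the check behind `bad-txnmrklroot`: a block header commits to a Merkle root built from the block's transaction IDs, and a node recomputes that root from the transactions it actually holds. If even one bit of the transaction data changes in memory or on disk, the recomputed root no longer matches the header. The following is a minimal Python sketch of that computation, not Bitcoin Core's C++ code; the function names `dsha256` and `merkle_root` are illustrative.

```python
import hashlib


def dsha256(data: bytes) -> bytes:
    """Double SHA-256, the hash Bitcoin uses for txids and Merkle nodes."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(txids_hex: list[str]) -> str:
    """Compute a Bitcoin-style Merkle root from txids in display (big-endian) hex.

    Illustrative helper: txids are byte-reversed to internal order, hashed
    pairwise level by level, and the last node of an odd-sized level is
    paired with itself.
    """
    if not txids_hex:
        raise ValueError("a block has at least one transaction")
    level = [bytes.fromhex(t)[::-1] for t in txids_hex]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [dsha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0][::-1].hex()  # back to display order
```

With a single transaction, the root equals that transaction's ID, which is the case for the genesis block.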

How did you obtain Bitcoin Core

Compiled from source

What version of Bitcoin Core are you using?

v28.1.0rc1

Operating system and version

macOS Sequoia Version 15.1.1 (24B91)

Machine specifications

Apple M1 Pro, 16GB memory. Using an external Transcend 1TB SSD as datadir.

maflcko commented at 3:55 pm on December 5, 2024: member

Using an external Transcend 1TB SSD as datadir.

What filesystem? Did you check the connection and the drive for defects?

maflcko added the label Data corruption on Dec 5, 2024

maflcko added the label Questions and Help on Dec 5, 2024

maflcko added the label macOS on Dec 5, 2024

helloscoopa commented at 4:21 pm on December 5, 2024: none

What filesystem? Did you check the connection and the drive for defects?

FS is exFAT.
I’ve tried with 2 different ISPs as well, so I can be almost sure that the issue is not with the connection.
The drive is a brand-new Transcend SSD I bought just for this. When I tried syncing a new node for rune stuff a while back, I used a different disk (Sandisk SSD) and still faced this same issue. The only thing that remained the same is my MacBook (I also tried a different MacBook of the same model and the issue still exists, so maybe it has to do with this MacBook model? idk.)

maflcko commented at 4:25 pm on December 5, 2024: member

exFAT

https://github.com/bitcoin/bitcoin/issues/28552

willcl-ark commented at 10:32 pm on December 5, 2024: member

Should we warn (or bail) when exFAT is used for [data|blocks]dir on MacOS?

I wrote a patch to test detection on MacOS and it seems to work in my limited testing (with a single exFAT drive).

It feels a bit pointless to (implicitly) support it being used, when the reliability seems to be so poor…
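For anyone who wants to check their own setup before syncing, here is a rough Python sketch of one way to detect an exFAT datadir on macOS. It is not the patch mentioned above (which is not shown in this thread); it assumes the usual macOS `mount` output format of `<device> on <mount point> (<fstype>, <flags>...)`, and the helper names are made up for illustration.

```python
import os
import subprocess


def parse_mount_output(text):
    """Return (mount_point, fs_type) pairs parsed from macOS-style `mount` output."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        # Assumed format: "<device> on <mount point> (<fstype>, <flag>, ...)"
        if " on " not in line or not line.endswith(")"):
            continue
        _, rest = line.split(" on ", 1)
        mount_point, sep, opts = rest.rpartition(" (")
        if not sep or not mount_point:
            continue
        fs_type = opts.rstrip(")").split(",")[0].strip()
        entries.append((mount_point, fs_type))
    return entries


def fs_type_for_path(path, entries):
    """Pick the filesystem type of the longest mount point containing `path`."""
    path = os.path.normpath(path)
    best = None
    for mount_point, fs_type in entries:
        mp = os.path.normpath(mount_point)
        if path == mp or path.startswith(mp.rstrip("/") + "/"):
            if best is None or len(mp) > len(best[0]):
                best = (mp, fs_type)
    return best[1] if best else None


def datadir_is_exfat(datadir):
    """macOS only: run `mount` and report whether `datadir` sits on exFAT."""
    out = subprocess.run(["mount"], capture_output=True, text=True, check=True).stdout
    return fs_type_for_path(os.path.abspath(datadir), parse_mount_output(out)) == "exfat"
```

Calling `datadir_is_exfat("/Volumes/BTC/bitcoin")` (a hypothetical path) on macOS would return `True` for a drive formatted as exFAT. Parsing `mount` output is a convenience for a quick manual check; a check inside the node itself would more likely query the filesystem type through a system call.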

helloscoopa commented at 3:20 am on December 6, 2024: none

#28552

Yup, tried APFS and it works great. Thanks!

helloscoopa commented at 3:24 am on December 6, 2024: none

Should we warn (or bail) when exFAT is used for [data|blocks]dir on MacOS?

I wrote a patch to test detection on MacOS and it seems to work in my limited testing (with a single exFAT drive).

It feels a bit pointless to (implicitly) support it being used, when the reliability seems to be so poor…

I think this would be really nice, especially as it’s not easy to figure out the actual issue from the existing error messages, which are rather generic. I’m not sure we should refuse to start when exFAT is detected on MacOS; a warning would be enough.

maflcko removed the label Questions and Help on Dec 6, 2024

maflcko added the label Upstream on Dec 6, 2024

RandyMcMillan commented at 7:02 pm on December 6, 2024: contributor

@willcl-ark - I say log it and bail.

an additional alert (link to issue?) in the gui may be useful as well?

mark the issue - no plans to fix?

willcl-ark commented at 10:55 am on December 10, 2024: member

Thanks for reporting this @helloscoopa

I’ve opened a PR #31453 with a suggested change to warn users when we detect exFAT on MacOS. This doesn’t “fix” this issue, but will at least help future users potentially self-diagnose what’s gone wrong.

I’m going to close this issue and track the general problem in the more generic tracking issue I opened: #31454 which includes a little bit more debugging output for anyone wanting to try and fix this more thoroughly.

Let me know if you want this re-opened for any reason though, and we can do that.

willcl-ark closed this on Dec 10, 2024

Fix for Corrupt block found indicating potential hardware failure; shutting down #31430