Fix for Corrupt block found indicating potential hardware failure; shutting down #31430

helloscoopa opened this issue on December 5, 2024
  1. helloscoopa commented at 3:43 pm on December 5, 2024: none

    Is there an existing issue for this?

    • I have searched the existing issues

    Current behaviour

    The bitcoind process exits with the log message below. If I start it again, the same message appears. To continue I have to run bitcoind with -reindex (-reindex-chainstate doesn’t work because the block index is corrupted), which takes a lot of time. I had to reindex about 8 times before I was able to sync about 445,000 blocks. This slows down my full sync: as I download more blocks, each reindex takes longer. It doesn’t make sense for me to sync this way. I did not have any unexpected power failures or unexpected bitcoind exits. It just happens randomly.

    Expected behaviour

    Since this happens during the initial sync, the cause should be either corruption in memory or a block that was already corrupted when it was received from the peer.

    Wouldn’t it be possible to just discard that block, or disconnect from the peer that sent the corrupted block, and continue syncing? Why does the process have to exit and leave a corrupted index on disk that must be reindexed from scratch before any use? Is there anything fundamental that I’m missing here?

    Steps to reproduce

    • Clone the repo
    • Check out the v28x tag
    • Run ./autogen.sh && ./configure && make && make install

    Relevant log output

    ERROR: AcceptBlock: bad-txnmrklroot, hashMerkleRoot mismatch
    *** Corrupt block found indicating potential hardware failure; shutting down

    How did you obtain Bitcoin Core

    Compiled from source

    What version of Bitcoin Core are you using?

    v28.1.0rc1

    Operating system and version

    macOS Sequoia Version 15.1.1 (24B91)

    Machine specifications

    Apple M1 Pro, 16GB memory. Using an external Transcend 1TB SSD as datadir.

  2. maflcko commented at 3:55 pm on December 5, 2024: member

    Using an external Transcend 1TB SSD as datadir.

    What filesystem? Did you check the connection and the drive for defects?

  3. maflcko added the label Data corruption on Dec 5, 2024
  4. maflcko added the label Questions and Help on Dec 5, 2024
  5. maflcko added the label macOS on Dec 5, 2024
  6. helloscoopa commented at 4:21 pm on December 5, 2024: none

    What filesystem? Did you check the connection and the drive for defects?

    • FS is exFAT.
    • I’ve also tried with 2 different ISPs, so I can be almost sure that the issue is not with the connection.
    • The drive is a brand new Transcend SSD I bought just for this. I also tried syncing a new node for rune stuff a while back with a different disk (Sandisk SSD) and faced the same issue. The only thing that stayed the same is my MacBook (I also tried a different MacBook of the same model and the issue still exists, so maybe it has to do with this MacBook model? idk.)
  7. maflcko commented at 4:25 pm on December 5, 2024: member
  8. willcl-ark commented at 10:32 pm on December 5, 2024: member

    Should we warn (or bail) when exFAT is used for [data|blocks]dir on MacOS?

    I wrote a patch to test detection on MacOS and it seems to work in my limited testing (with a single exFAT drive).

    It feels a bit pointless to (implicitly) support it being used, when the reliability seems to be so poor…
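
As an illustration of the kind of detection discussed here, the following is a minimal sketch and not the patch mentioned above. It assumes that macOS reports the filesystem type name through `statfs()` in the `f_fstypename` field and that exFAT volumes report the name "exfat". The helper names `IsExFatTypeName`, `GetFilesystemTypeName` and `PathIsOnExFat` are hypothetical. On other platforms the sketch returns an empty type name, so the check is simply skipped.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

#if defined(__APPLE__)
#include <sys/param.h>
#include <sys/mount.h>
#endif

// Hypothetical helper: case-insensitive comparison of a filesystem type
// name against "exfat" (the name macOS is assumed to report).
bool IsExFatTypeName(const std::string& fs_type_name)
{
    std::string lower{fs_type_name};
    std::transform(lower.begin(), lower.end(), lower.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return lower == "exfat";
}

// Hypothetical helper: returns the filesystem type name for `path`, or an
// empty string if the lookup fails or the platform is not macOS.
std::string GetFilesystemTypeName(const std::string& path)
{
#if defined(__APPLE__)
    struct statfs info;
    if (statfs(path.c_str(), &info) == 0) {
        return std::string{info.f_fstypename};
    }
#else
    (void)path; // Detection is only sketched for macOS.
#endif
    return {};
}

// Hypothetical helper: true only when the type name could be read and
// matches exFAT; unknown filesystems are treated as "not exFAT".
bool PathIsOnExFat(const std::string& path)
{
    return IsExFatTypeName(GetFilesystemTypeName(path));
}
```

A caller could run `PathIsOnExFat` on the datadir and blocksdir at startup and log a warning when it returns true. Whether to warn or to refuse to start is the policy question raised in this thread, and the sketch does not decide it.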

  9. helloscoopa commented at 3:20 am on December 6, 2024: none

    #28552

    Yup, Tried APFS and works great. Thanks!

  10. helloscoopa commented at 3:24 am on December 6, 2024: none

    Should we warn (or bail) when exFAT is used for [data|blocks]dir on MacOS?

    I wrote a patch to test detection on MacOS and it seems to work in my limited testing (with a single exFAT drive).

    It feels a bit pointless to (implicitly) support it being used, when the reliability seems to be so poor…

    I think this would be really nice, especially as it’s not easy to figure out the actual issue from the existing error messages, which are more generic. I’m not sure we should stop when exFAT is detected on MacOS; a warning would be enough.

  11. maflcko removed the label Questions and Help on Dec 6, 2024
  12. maflcko added the label Upstream on Dec 6, 2024
  13. RandyMcMillan commented at 7:02 pm on December 6, 2024: contributor

    @willcl-ark - I say log it and bail.

    an additional alert (link to issue?) in the gui may be useful as well?

    mark the issue - no plans to fix?

  14. willcl-ark commented at 10:55 am on December 10, 2024: member

    Thanks for reporting this @helloscoopa

    I’ve opened a PR #31453 with a suggested change to warn users when we detect exFAT on MacOS. This doesn’t “fix” this issue, but will at least help future users potentially self-diagnose what’s gone wrong.

    I’m going to close this issue and track the general problem in the more generic tracking issue I opened: #31454 which includes a little bit more debugging output for anyone wanting to try and fix this more thoroughly.

    Let me know if you want this re-opened for any reason though, and we can do that.

  15. willcl-ark closed this on Dec 10, 2024



This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC
