Data corruption on MacOS when using exFAT datadir or blocksdir #31454

issue willcl-ark openend this issue on December 10, 2024
  1. willcl-ark commented at 10:48 am on December 10, 2024: member

    As reported in #28552, #28613 and various other places online there are intermittent issues when using an exFAT-formatted drive on MacOS.

    I was easily able to reproduce the issue, using an extrenal Samsung T7 SSD formatted to exFAT and connected via USB-C to an M4 macbook. Failures were seen at seemingly random points: blocks ~80_000, 170_000, 270_000.

    The failures report in multiple ways, depending on where things have gone wrong. For example:

    02024-12-09T10:35:17Z [error] ReadBlockFromDisk: Deserialize or I/O error - ReadCompactSize(): size too large: unspecified iostream_category error at FlatFilePos(nFile=102, nPos=84929524)
    12024-12-09T10:35:17Z [error] A fatal internal error occurred, see debug.log for details: Failed to read block.
    2Error: A fatal internal error occurred, see debug.log for details: Failed to read block.
    

    or:

    02024-12-09T21:58:44Z [error] A fatal internal error occurred, see debug.log for details: Corrupt block found indicating potential hardware failure.
    12024-12-09T21:58:44Z [error] ConnectTip: ConnectBlock 0000000000000010bca0ce4ce37c165cd4e820356194bf5991a7b9af792a83d5 failed, bad-txnmrklroot, hashMerkleRoot mismatch
    22024-12-09T21:58:44Z tor: Thread interrupt
    32024-12-09T21:58:44Z [error] ProcessNewBlock: ActivateBestChain failed (bad-txnmrklroot, hashMerkleRoot mismatch)
    

    In debugging both, I found that the first was simply garbage at that offset in the block file. No magic bytes were present nearby. The second was able to deserialise as the header and first bytes were OK, but the block later contained corrupt bytes, causing the merkle check to fail.

    I added debugging to ensure that it wasn’t an issue with our read/write code, which showed it was not:

    coloured link: https://paste.256k1.dev/erjellines.log

     0/Volumes/data/bitcoin-macos-debug
     1$ rg 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c debug.log -C5
     21647168-2024-12-09T16:40:38Z ReadBlockFromDisk: Starting read for block at FlatFilePos(nFile=0, nPos=69397708) (header at FlatFilePos(nFile=0, nPos=69397700))
     31647169-2024-12-09T16:40:38Z ReadBlockFromDisk: Read header magic=f9beb4d9 size=2538 at FlatFilePos(nFile=0, nPos=69397700)
     41647170-2024-12-09T16:40:38Z ReadBlockFromDisk: Reading block data at file offset 69397708 size=2538
     51647171-2024-12-09T16:40:38Z ReadBlockFromDisk: Read complete - block hash=000000000001612689260e1db03677c70bb96e5707d55fe90a1e1f0c5dcac318 size=2538 nTx=9
     61647172-2024-12-09T16:40:38Z UpdateTip: new best=000000000001612689260e1db03677c70bb96e5707d55fe90a1e1f0c5dcac318 height=104762 version=0x00000001 log2_work=59.466998 tx=245649 date='2011-01-26T23:23:07Z' progress=0.000218 cache=22.9MiB(134951txo)
     71647173:2024-12-09T16:40:38Z AcceptBlock: Beginning write for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c height=104770 size=33103
     81647174:2024-12-09T16:40:38Z AcceptBlock: Finding new position for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c
     91647175:2024-12-09T16:40:38Z SaveBlockToDisk: Finding position for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c size=33111 height=104770
    101647176:2024-12-09T16:40:38Z SaveBlockToDisk: Got position file=0 pos=69408910 for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c
    111647177:2024-12-09T16:40:38Z WriteBlockToDisk: Opening file for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c at FlatFilePos(nFile=0, nPos=69408910)
    121647178:2024-12-09T16:40:38Z WriteBlockToDisk: Writing header for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c: magic=f9beb4d9 size=33103
    131647179:2024-12-09T16:40:38Z WriteBlockToDisk: Writing block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c data at file offset 69408918
    141647180:2024-12-09T16:40:38Z WriteBlockToDisk: Successfully wrote block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c to disk
    151647181:2024-12-09T16:40:38Z SaveBlockToDisk: Successfully wrote block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c to FlatFilePos(nFile=0, nPos=69408918)
    161647182:2024-12-09T16:40:38Z AcceptBlock: Wrote block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c to file=0 pos=69408918
    171647183:2024-12-09T16:40:38Z AcceptBlock: Successfully recorded block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c transactions
    181647184-2024-12-09T16:40:38Z AcceptBlock: Beginning write for block 0000000000000fe919539878073d169245f55a86cab68e382b8a33502c656ac3 height=104772 size=215
    191647185-2024-12-09T16:40:38Z AcceptBlock: Finding new position for block 0000000000000fe919539878073d169245f55a86cab68e382b8a33502c656ac3
    201647186-2024-12-09T16:40:38Z SaveBlockToDisk: Finding position for block 0000000000000fe919539878073d169245f55a86cab68e382b8a33502c656ac3 size=223 height=104772
    211647187-2024-12-09T16:40:38Z SaveBlockToDisk: Got position file=0 pos=69442021 for block 0000000000000fe919539878073d169245f55a86cab68e382b8a33502c656ac3
    221647188-2024-12-09T16:40:38Z WriteBlockToDisk: Opening file for block 0000000000000fe919539878073d169245f55a86cab68e382b8a33502c656ac3 at FlatFilePos(nFile=0, nPos=69442021)
    23--
    241647316-2024-12-09T16:40:38Z ReadBlockFromDisk: Read complete - block hash=00000000000053dd7549b03afd260f695e5bfcecca55055d4f3e980cccdfb389 size=215 nTx=1
    251647317-2024-12-09T16:40:38Z UpdateTip: new best=00000000000053dd7549b03afd260f695e5bfcecca55055d4f3e980cccdfb389 height=104769 version=0x00000001 log2_work=59.468001 tx=245698 date='2011-01-27T00:27:29Z' progress=0.000218 cache=22.9MiB(134960txo)
    261647318-2024-12-09T16:40:38Z ReadBlockFromDisk: Starting read for block at FlatFilePos(nFile=0, nPos=69408918) (header at FlatFilePos(nFile=0, nPos=69408910))
    271647319-2024-12-09T16:40:38Z ReadBlockFromDisk: Read header magic=f9beb4d9 size=33103 at FlatFilePos(nFile=0, nPos=69408910)
    281647320-2024-12-09T16:40:38Z ReadBlockFromDisk: Reading block data at file offset 69408918 size=33103
    291647321:2024-12-09T16:40:38Z ReadBlockFromDisk: Read complete - block hash=0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c size=33103 nTx=3
    301647322-2024-12-09T16:40:38Z [error] A fatal internal error occurred, see debug.log for details: Corrupt block found indicating potential hardware failure.
    311647323:2024-12-09T16:40:38Z [error] ConnectTip: ConnectBlock 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c failed, bad-txnmrklroot, hashMerkleRoot mismatch
    321647324-2024-12-09T16:40:38Z [error] ProcessNewBlock: ActivateBestChain failed (bad-txnmrklroot, hashMerkleRoot mismatch)
    331647325-2024-12-09T16:40:38Z tor: Thread interrupt
    341647326-2024-12-09T16:40:38Z Shutdown: In progress...
    351647327-2024-12-09T16:40:38Z opencon thread exit
    361647328-2024-12-09T16:40:38Z torcontrol thread exit
    

    We attempt to read the correct section, but find incorrect data.

    I did try adding even more logging than this, but after a point the corruption became less frequent; indicating to me that slowing down IBD perhaps gives these syncs enough time to process properly.

    As mentioned in #31453, I suspect the root cause here is the F_FULLFSYNC fcntl is improperly implemented for exFAT on MacOS.

    I am unable to think of any other ways we could fix this in our codebase, without resorting to extreme measures like opening the block files without caching (and therefore forcing direct I/O).

  2. willcl-ark commented at 10:51 am on December 10, 2024: member
    See also #31430
  3. willcl-ark added the label macOS on Dec 10, 2024
  4. willcl-ark added the label Block storage on Dec 10, 2024
  5. willcl-ark added the label Upstream on Dec 10, 2024
  6. willcl-ark added the label Data corruption on Dec 10, 2024
  7. willcl-ark referenced this in commit 7325cf4e3f on Dec 10, 2024
  8. willcl-ark referenced this in commit ee05e73046 on Dec 10, 2024
  9. davidgumberg commented at 0:25 am on January 9, 2025: contributor

    #31453 (comment) (by me):

    To add a little data pointing in the direction of this bug only existing on macOS, I have synced nodes with datadirs on exFAT drives over both SATA and USB on Fedora 40 and have a long-running bitcoind node that has been running off of an exFAT SATA SSD on Fedora 40 for the past ~6 months and have never experienced any of the symptoms described in #28552 or #31454.

    I think it is suspicious that all of these corruption incidents are related to block files, if F_FULLFSYNC was severely unreliable on exFAT on macOS, I would expect similar corruption to happen with leveldb chainstate with a similar frequency (and rarely sqlite wallet files) since both rely on F_FULLSYNC on macOS.

    I also could not find reports of corruptions happening in other projects that use leveldb or sqlite when paired with exFAT disks. Other projects do fall back to fsync when fcntl(fileno, F_FULLFSYNC) fails loudly and returns -1 (which we should also probably use to avoid e.g. #26455 ) but I couldn’t find any complaints about F_FULLFSYNC failing silently, which isn’t to say that there aren’t such complaints or that it isn’t failing silently, but I think it suggests that if there is a bug in macOS, something in particular about the way we store blocks is triggering it.

    One guess I have is that the bug in macOS’s exFAT driver only appears when you have overlapping F_FULLFSYNC calls on the same file like when saving blocks in rapid succession, but I haven’t investigated this.

    Some links that might be of interest I found while researching F_FULLFSYNC:


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me