As reported in #28552, #28613 and various other places online there are intermittent issues when using an exFAT-formatted drive on MacOS.
I was easily able to reproduce the issue, using an extrenal Samsung T7 SSD formatted to exFAT and connected via USB-C to an M4 macbook. Failures were seen at seemingly random points: blocks ~80_000, 170_000, 270_000.
The failures report in multiple ways, depending on where things have gone wrong. For example:
02024-12-09T10:35:17Z [error] ReadBlockFromDisk: Deserialize or I/O error - ReadCompactSize(): size too large: unspecified iostream_category error at FlatFilePos(nFile=102, nPos=84929524)
12024-12-09T10:35:17Z [error] A fatal internal error occurred, see debug.log for details: Failed to read block.
2Error: A fatal internal error occurred, see debug.log for details: Failed to read block.
or:
02024-12-09T21:58:44Z [error] A fatal internal error occurred, see debug.log for details: Corrupt block found indicating potential hardware failure.
12024-12-09T21:58:44Z [error] ConnectTip: ConnectBlock 0000000000000010bca0ce4ce37c165cd4e820356194bf5991a7b9af792a83d5 failed, bad-txnmrklroot, hashMerkleRoot mismatch
22024-12-09T21:58:44Z tor: Thread interrupt
32024-12-09T21:58:44Z [error] ProcessNewBlock: ActivateBestChain failed (bad-txnmrklroot, hashMerkleRoot mismatch)
In debugging both, I found that the first was simply garbage at that offset in the block file. No magic bytes were present nearby. The second was able to deserialise as the header and first bytes were OK, but the block later contained corrupt bytes, causing the merkle check to fail.
I added debugging to ensure that it wasn’t an issue with our read/write code, which showed it was not:
coloured link: https://paste.256k1.dev/erjellines.log
0/Volumes/data/bitcoin-macos-debug
1$ rg 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c debug.log -C5
21647168-2024-12-09T16:40:38Z ReadBlockFromDisk: Starting read for block at FlatFilePos(nFile=0, nPos=69397708) (header at FlatFilePos(nFile=0, nPos=69397700))
31647169-2024-12-09T16:40:38Z ReadBlockFromDisk: Read header magic=f9beb4d9 size=2538 at FlatFilePos(nFile=0, nPos=69397700)
41647170-2024-12-09T16:40:38Z ReadBlockFromDisk: Reading block data at file offset 69397708 size=2538
51647171-2024-12-09T16:40:38Z ReadBlockFromDisk: Read complete - block hash=000000000001612689260e1db03677c70bb96e5707d55fe90a1e1f0c5dcac318 size=2538 nTx=9
61647172-2024-12-09T16:40:38Z UpdateTip: new best=000000000001612689260e1db03677c70bb96e5707d55fe90a1e1f0c5dcac318 height=104762 version=0x00000001 log2_work=59.466998 tx=245649 date='2011-01-26T23:23:07Z' progress=0.000218 cache=22.9MiB(134951txo)
71647173:2024-12-09T16:40:38Z AcceptBlock: Beginning write for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c height=104770 size=33103
81647174:2024-12-09T16:40:38Z AcceptBlock: Finding new position for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c
91647175:2024-12-09T16:40:38Z SaveBlockToDisk: Finding position for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c size=33111 height=104770
101647176:2024-12-09T16:40:38Z SaveBlockToDisk: Got position file=0 pos=69408910 for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c
111647177:2024-12-09T16:40:38Z WriteBlockToDisk: Opening file for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c at FlatFilePos(nFile=0, nPos=69408910)
121647178:2024-12-09T16:40:38Z WriteBlockToDisk: Writing header for block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c: magic=f9beb4d9 size=33103
131647179:2024-12-09T16:40:38Z WriteBlockToDisk: Writing block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c data at file offset 69408918
141647180:2024-12-09T16:40:38Z WriteBlockToDisk: Successfully wrote block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c to disk
151647181:2024-12-09T16:40:38Z SaveBlockToDisk: Successfully wrote block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c to FlatFilePos(nFile=0, nPos=69408918)
161647182:2024-12-09T16:40:38Z AcceptBlock: Wrote block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c to file=0 pos=69408918
171647183:2024-12-09T16:40:38Z AcceptBlock: Successfully recorded block 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c transactions
181647184-2024-12-09T16:40:38Z AcceptBlock: Beginning write for block 0000000000000fe919539878073d169245f55a86cab68e382b8a33502c656ac3 height=104772 size=215
191647185-2024-12-09T16:40:38Z AcceptBlock: Finding new position for block 0000000000000fe919539878073d169245f55a86cab68e382b8a33502c656ac3
201647186-2024-12-09T16:40:38Z SaveBlockToDisk: Finding position for block 0000000000000fe919539878073d169245f55a86cab68e382b8a33502c656ac3 size=223 height=104772
211647187-2024-12-09T16:40:38Z SaveBlockToDisk: Got position file=0 pos=69442021 for block 0000000000000fe919539878073d169245f55a86cab68e382b8a33502c656ac3
221647188-2024-12-09T16:40:38Z WriteBlockToDisk: Opening file for block 0000000000000fe919539878073d169245f55a86cab68e382b8a33502c656ac3 at FlatFilePos(nFile=0, nPos=69442021)
23--
241647316-2024-12-09T16:40:38Z ReadBlockFromDisk: Read complete - block hash=00000000000053dd7549b03afd260f695e5bfcecca55055d4f3e980cccdfb389 size=215 nTx=1
251647317-2024-12-09T16:40:38Z UpdateTip: new best=00000000000053dd7549b03afd260f695e5bfcecca55055d4f3e980cccdfb389 height=104769 version=0x00000001 log2_work=59.468001 tx=245698 date='2011-01-27T00:27:29Z' progress=0.000218 cache=22.9MiB(134960txo)
261647318-2024-12-09T16:40:38Z ReadBlockFromDisk: Starting read for block at FlatFilePos(nFile=0, nPos=69408918) (header at FlatFilePos(nFile=0, nPos=69408910))
271647319-2024-12-09T16:40:38Z ReadBlockFromDisk: Read header magic=f9beb4d9 size=33103 at FlatFilePos(nFile=0, nPos=69408910)
281647320-2024-12-09T16:40:38Z ReadBlockFromDisk: Reading block data at file offset 69408918 size=33103
291647321:2024-12-09T16:40:38Z ReadBlockFromDisk: Read complete - block hash=0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c size=33103 nTx=3
301647322-2024-12-09T16:40:38Z [error] A fatal internal error occurred, see debug.log for details: Corrupt block found indicating potential hardware failure.
311647323:2024-12-09T16:40:38Z [error] ConnectTip: ConnectBlock 0000000000022043f9ea2c298f3381c6ddd86b968271117dd55cb48ac297954c failed, bad-txnmrklroot, hashMerkleRoot mismatch
321647324-2024-12-09T16:40:38Z [error] ProcessNewBlock: ActivateBestChain failed (bad-txnmrklroot, hashMerkleRoot mismatch)
331647325-2024-12-09T16:40:38Z tor: Thread interrupt
341647326-2024-12-09T16:40:38Z Shutdown: In progress...
351647327-2024-12-09T16:40:38Z opencon thread exit
361647328-2024-12-09T16:40:38Z torcontrol thread exit
We attempt to read the correct section, but find incorrect data.
I did try adding even more logging than this, but after a point the corruption became less frequent; indicating to me that slowing down IBD perhaps gives these syncs enough time to process properly.
As mentioned in #31453, I suspect the root cause here is the F_FULLFSYNC fcntl
is improperly implemented for exFAT on MacOS.
I am unable to think of any other ways we could fix this in our codebase, without resorting to extreme measures like opening the block files without caching (and therefore forcing direct I/O).