util: detect and warn when using exFAT on MacOS #31453

pull willcl-ark wants to merge 2 commits into bitcoin:master from willcl-ark:macos-exfat changing 4 files +74 −0
  1. willcl-ark commented at 10:29 am on December 10, 2024: member

    exFAT is known to cause intermittent corruption on MacOS.

    Therefore we should warn when using this fs format for either the blocks or data directories.

    See #28552 for more context.

  2. DrahtBot commented at 10:30 am on December 10, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31453.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK marcofleon
    Concept ACK l0rinc, laanwj, sipa, helloscoopa, hebasto, tdb3

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28710 (Remove the legacy wallet and BDB dependency by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Utils/log/libs on Dec 10, 2024
  4. willcl-ark commented at 10:32 am on December 10, 2024: member

    I did toy with having this trigger a full on InitError, but the behaviour is intermittent and seems worst during IBD. Exiting would also likely break things for a fair few users, due to exFAT being used in cross-platform storage devices. Therefore a warning seems more apt.

    The root cause seems to be that some calls to F_FULLFSYNC fcntl silently failing on exFAT. This appears to stem from the fact that, whilst supported, F_FULLFSYNC may not be fully implemented for exFAT:

    F_FULLFSYNC Does the same thing as fsync(2) then asks the drive to flush all buffered data to the permanent storage device (arg is ignored). This is currently imple-mented implemented mented on HFS, MS-DOS (FAT), and Universal Disk Format (UDF) file systems. The operation may take quite a while to complete. Certain FireWire drives have also been known to ignore the request to flush their buffered data.

    link: https://newosxbook.com/src.jl?tree=xnu&file=/bsd/man/man2/fcntl.2 and: https://developer.apple.com/library/archive/documentation/System/Conceptual/ManPages_iPhoneOS/man2/fcntl.2.html#//apple_ref/doc/man/2/fcntl

    From debugging this seems to manifest from fast processing speed; if I add enough LogPrintf’s then it eventually goes away, presumably due to the reduced IBD speed. This may also indicate that users on slower (CPU) machines may suffer from this less frequently.

    We do read and write to the block files from the correct position/offsets at failure but the wrong data is found there. So this appears to be an issue at the kernel level.

    I will open a new issue with some outputs from my debugging on this.

  5. in src/common/init.cpp:81 in a7d52d023e outdated
    76+        }};
    77+        for (const auto& check : paths_to_check) {
    78+            FSType fs_type = GetFilesystemType(check.path);
    79+            switch(fs_type) {
    80+                case FSType::EXFAT:
    81+                    LogWarning("Specified %s \"%s\" is exFAT which is known to have intermittent corruption problems on MacOS.", check.description, fs::PathToString(check.path));
    


    maflcko commented at 10:48 am on December 10, 2024:
    Shouldn’t this be an InitWarning? Otherwise gui users (basically everyone on macos?) will never see this on their screen, unless they dig up the debug log at the right position?

    willcl-ark commented at 11:43 am on December 10, 2024:

    Thanks for catching marco. It was indeed a mistake to change this from an InitError in my first draft to LogWarning.

    Will fixup shortly


    willcl-ark commented at 10:46 am on December 11, 2024:
    Done in 83839a35f07055dd03924b5bdab46bf31df33b35
  6. in src/common/init.cpp:69 in a7d52d023e outdated
    62@@ -62,6 +63,32 @@ std::optional<ConfigError> InitConfig(ArgsManager& args, SettingsAbortFn setting
    63             fs::create_directories(net_path / "wallets");
    64         }
    65 
    66+        // Warn if we are trying to put the datadir on an exFAT fs on MacOS
    67+        // This is an upstream issue known to cause bugs, see #28552
    68+#ifdef __APPLE__
    


    l0rinc commented at 11:15 am on December 10, 2024:
    Q: is this related to a macs or just generally using flash drives as storage?

    willcl-ark commented at 11:19 am on December 10, 2024:

    I haven’t encountered the same error using this drive to sync when formatted with APFS, so I in my opinion, no.

    I have come across comments online purporting that certain brands of nvme drives don’t play well with apple’s (internal) nvme controllers, which can also cause similar issues. But here I feel pretty certain this is an exFAT x MacOS issue, and we may well see the same issues with an internal exFAT drive.

    If anyone has a mac with an internal sata/nvme drive which they could format to exFAT and test IBD with, that would be great! (I haven’t)


    willcl-ark commented at 11:20 am on December 10, 2024:
    Oh I should add here, (and probably in #31454) that I tested this same drive using exFAT on Linux, and did not see any corruption to block 450k. By which point every MacOS test had corrupted some data and failed.
  7. marcofleon approved
  8. marcofleon commented at 11:35 am on December 10, 2024: contributor

    utACK a7d52d023eee6559d04e2e191f17176f2fe92c00. This happened to me earlier this year during IBD using a mac M3 and an external SSD formatted by default as exFAT. Took me a bit to figure it out, so a warning would’ve helped.

    Reformatting to APFS solved the issue for me. Would that be worth adding to the warning as a suggestion for a solution?

  9. willcl-ark commented at 11:38 am on December 10, 2024: member

    Reformatting to APFS solved the issue for me. Would that be worth adding to the warning as a suggestion for a solution?

    Yes that could be nice. Another idea that I had suggested was linking to a help page. Perhaps I could dd a section to https://github.com/bitcoin/bitcoin/blob/master/doc/files.md describing the problematic FS/OS combination and add the url to the warning to make things super-easy for users to fix themselves?

    cc @fanquake

  10. laanwj commented at 12:13 pm on December 10, 2024: member
    Concept ACK, i think this makes sense in general. exFAT is meant for storing photos, not as a mature filesystem for database workloads, so subpar reliability and performance are to be expected.
  11. l0rinc commented at 2:10 pm on December 10, 2024: contributor
    Concept ACK
  12. util: detect and warn when using exFAT on MacOS
    exFAT is known to cause corruption on MacOS. See #28552.
    
    Therefore we should warn when using this fs format for either the blocks
    or data directories.
    83839a35f0
  13. willcl-ark force-pushed on Dec 10, 2024
  14. sipa commented at 4:19 pm on December 10, 2024: member
    Concept ACK
  15. util: document recommended filesystems
    In particular, recommend against using exFAT on MacOS as this is known
    to cause breakages.
    
    See #31454 for more context.
    ee05e73046
  16. willcl-ark force-pushed on Dec 10, 2024
  17. DrahtBot added the label CI failed on Dec 10, 2024
  18. DrahtBot commented at 4:47 pm on December 10, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/34204028171

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  19. DrahtBot removed the label CI failed on Dec 10, 2024
  20. marcofleon commented at 11:39 am on December 11, 2024: contributor

    Nice, the warning and new addition to the docs lgtm.

    ACK ee05e730466f250e131eb92165bcdb8bab68c864

  21. DrahtBot requested review from laanwj on Dec 11, 2024
  22. DrahtBot requested review from l0rinc on Dec 11, 2024
  23. DrahtBot requested review from sipa on Dec 11, 2024
  24. helloscoopa commented at 12:09 pm on December 11, 2024: none
    Concept ACK
  25. hebasto commented at 10:36 am on December 12, 2024: member
    Concept ACK.
  26. in src/common/init.cpp:77 in ee05e73046
    72+            std::string_view description;
    73+        };
    74+        std::array<PathCheck, 2> paths_to_check{{
    75+            {args.GetDataDirNet(), "data directory"},
    76+            {args.GetBlocksDirPath(), "blocks directory"}
    77+        }};
    


    tdb3 commented at 2:27 am on December 15, 2024:
    Makes sense to cover blocksdir and datadir. Since the user could also set -walletdir=<dir> when starting bitcoind, do you think it would be worth warning for it as well? Much lighter disk usage than blocksdir and datadir, but if we’re doubtful of macOS’s exFAT implementation, then maybe we would want to warn the user when storing wallets with it?
  27. tdb3 commented at 2:29 am on December 15, 2024: contributor
    Concept ACK Great find!

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me