util: explicitly close all AutoFiles that have been written #29307

pull vasild wants to merge 1 commits into bitcoin:master from vasild:AutoFile_error_check changing 18 files +145 −46
  1. vasild commented at 2:35 pm on January 24, 2024: contributor

    fclose(3) may fail to flush the previously written data to disk, thus a failing fclose(3) is as serious as a failing fwrite(3).

    Previously the code ignored fclose(3) failures. This PR improves that by changing all users of AutoFile that use it to write data to explicitly close the file and handle a possible error.


    Other alternatives are:

    1. fflush(3) after each write to the file (and throw if it fails from the AutoFile::write() method) and hope that fclose(3) will then always succeed. Assert that it succeeds from the destructor :roll_eyes:. Will hurt performance.
    2. Throw nevertheless from the destructor. Exception within the exception in C++ I think results in terminating the program without a useful message.
    3. (this is implemented in the latest incarnation of this PR) Redesign AutoFile so that its destructor cannot fail. Adjust all its users :sob:. For example, if the file has been written to, then require the callers to explicitly call the AutoFile::fclose() method before the object goes out of scope. In the destructor, as a sanity check, assume/assert that this is indeed the case. Defeats the purpose of a RAII wrapper for FILE* which automatically closes the file when it goes out of scope and there are a lot of users of AutoFile.
  2. DrahtBot commented at 2:35 pm on January 24, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/29307.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK ryanofsky, hodlinator, l0rinc
    Stale ACK maflcko

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32043 ([IBD] Tracking PR for speeding up Initial Block Download by l0rinc)
    • #31144 ([IBD] multi-byte block obfuscation by l0rinc)
    • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)
    • #28792 (Embed default ASMap as binary dump header file by fjahr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter

    file contents is likely incomplete ⇒ file contents are likely incomplete

  3. DrahtBot added the label Utils/log/libs on Jan 24, 2024
  4. vasild force-pushed on Jan 24, 2024
  5. DrahtBot added the label CI failed on Jan 24, 2024
  6. DrahtBot commented at 2:58 pm on January 24, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/20819752683

  7. in src/streams.cpp:18 in 7c58f6ea4f outdated
    14     if (!m_file) throw std::ios_base::failure("AutoFile::read: file handle is nullptr");
    15     if (m_xor.empty()) {
    16-        return std::fread(dst.data(), 1, dst.size(), m_file);
    17+        const auto n = std::fread(dst.data(), 1, dst.size(), m_file);
    18+        if (ferror(m_file)) {
    19+            throw std::ios_base::failure(strprintf("AutoFile::detail_fread: %s", SysErrorString(errno)));
    


    maflcko commented at 3:59 pm on January 24, 2024:

    not sure. detail_fread is an implementation detail and thin wrapper, with the goal to simply return the return value of std::fread. It should be left for the caller to decide about the error.

    In any case, if you want to do this, you’d have to apply it to both branches of the if. The error message shouldn’t differ just because a file is xor’d or not.

    Finally, it would be good to explain whether this is a bugfix or a refactor/belt-and-suspenders. I don’t think an error/eof could have occurred if the full dst.size() was read. So I don’t think this is needed, but I could be wrong. Ideally, there is a unit test showing why the change is needed.


    vasild commented at 8:40 am on January 25, 2024:

    Alright, moved to the caller.

    IMO this is a bugfix, but I am not 100% sure either. Might as well be viewed as “belt-and-suspenders”. What is certain is that it can’t hurt even if there is never an error on any platform if detail_fread(dst) == dst.size(). A unit test would require simulating an IO error under the fread(3) call and identically behaving fread(3) on all platforms.

  8. in src/test/fuzz/autofile.cpp:50 in 7c58f6ea4f outdated
    46@@ -47,7 +47,7 @@ FUZZ_TARGET(autofile)
    47                 }
    48             },
    49             [&] {
    50-                auto_file.fclose();
    51+                assert(auto_file.fclose() == 0);
    


    vasild commented at 5:18 pm on January 24, 2024:

    This assert actually failed in CI (wow!):

    https://cirrus-ci.com/task/5586893237125120

     0Run autofile with args ['/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz', '-runs=1', PosixPath('/ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/autofile')]INFO: Running with entropic power schedule (0xFF, 100).
     1INFO: Seed: 3765617163
     2INFO: Loaded 1 modules   (547059 inline 8-bit counters): 547059 [0x55cd54dd29b8, 0x55cd54e582ab), 
     3INFO: Loaded 1 PC tables (547059 PCs): 547059 [0x55cd54e582b0,0x55cd556b11e0), 
     4INFO:      535 files found in /ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/autofile
     5INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1041296 bytes
     6INFO: seed corpus: files: 535 min: 1b max: 1041296b total: 17489928b rss: 106Mb
     7fuzz: test/fuzz/autofile.cpp:50: auto autofile_fuzz_target(FuzzBufferType)::(anonymous class)::operator()() const: Assertion `auto_file.fclose() == 0' failed.
     8==18378== ERROR: libFuzzer: deadly signal
     9    [#0](/bitcoin-bitcoin/0/) 0x55cd5193ab55 in __sanitizer_print_stack_trace (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a5fb55) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    10    [#1](/bitcoin-bitcoin/1/) 0x55cd518922fc in fuzzer::PrintStackTrace() (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x19b72fc) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    11    [#2](/bitcoin-bitcoin/2/) 0x55cd51878367 in fuzzer::Fuzzer::CrashCallback() (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x199d367) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    12    [#3](/bitcoin-bitcoin/3/) 0x7fb162f1d8ff  (/lib/x86_64-linux-gnu/libc.so.6+0x428ff) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    13    [#4](/bitcoin-bitcoin/4/) 0x7fb162f7498a in pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x9998a) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    14    [#5](/bitcoin-bitcoin/5/) 0x7fb162f1d855 in gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42855) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    15    [#6](/bitcoin-bitcoin/6/) 0x7fb162f018b6 in abort (/lib/x86_64-linux-gnu/libc.so.6+0x268b6) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    16    [#7](/bitcoin-bitcoin/7/) 0x7fb162f017da  (/lib/x86_64-linux-gnu/libc.so.6+0x267da) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    17    [#8](/bitcoin-bitcoin/8/) 0x7fb162f14185 in __assert_fail (/lib/x86_64-linux-gnu/libc.so.6+0x39185) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    18    [#9](/bitcoin-bitcoin/9/) 0x55cd51accacc in autofile_fuzz_target(Span<unsigned char const>)::$_3::operator()() const src/test/fuzz/autofile.cpp:50:17
    19    [#10](/bitcoin-bitcoin/10/) 0x55cd51accacc in unsigned long CallOneOf<autofile_fuzz_target(Span<unsigned char const>)::$_0, autofile_fuzz_target(Span<unsigned char const>)::$_1, autofile_fuzz_target(Span<unsigned char const>)::$_2, autofile_fuzz_target(Span<unsigned char const>)::$_3, autofile_fuzz_target(Span<unsigned char const>)::$_4, autofile_fuzz_target(Span<unsigned char const>)::$_5>(FuzzedDataProvider&, autofile_fuzz_target(Span<unsigned char const>)::$_0, autofile_fuzz_target(Span<unsigned char const>)::$_1, autofile_fuzz_target(Span<unsigned char const>)::$_2, autofile_fuzz_target(Span<unsigned char const>)::$_3, autofile_fuzz_target(Span<unsigned char const>)::$_4, autofile_fuzz_target(Span<unsigned char const>)::$_5) src/./test/fuzz/util.h:42:27
    20    [#11](/bitcoin-bitcoin/11/) 0x55cd51accacc in autofile_fuzz_target(Span<unsigned char const>) src/test/fuzz/autofile.cpp:27:9
    21    [#12](/bitcoin-bitcoin/12/) 0x55cd5204a52c in std::function<void (Span<unsigned char const>)>::operator()(Span<unsigned char const>) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
    22    [#13](/bitcoin-bitcoin/13/) 0x55cd5204a52c in LLVMFuzzerTestOneInput src/test/fuzz/fuzz.cpp:178:5
    23    [#14](/bitcoin-bitcoin/14/) 0x55cd51879814 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x199e814) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    24    [#15](/bitcoin-bitcoin/15/) 0x55cd51878f09 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x199df09) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    25    [#16](/bitcoin-bitcoin/16/) 0x55cd5187aaf6 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x199faf6) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    26    [#17](/bitcoin-bitcoin/17/) 0x55cd5187b017 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x19a0017) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    27    [#18](/bitcoin-bitcoin/18/) 0x55cd5186875b in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x198d75b) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    28    [#19](/bitcoin-bitcoin/19/) 0x55cd51892ce6 in main (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x19b7ce6) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    29    [#20](/bitcoin-bitcoin/20/) 0x7fb162f030cf  (/lib/x86_64-linux-gnu/libc.so.6+0x280cf) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    30    [#21](/bitcoin-bitcoin/21/) 0x7fb162f03188 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x28188) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    31    [#22](/bitcoin-bitcoin/22/) 0x55cd5185d624 in _start (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1982624) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    32NOTE: libFuzzer has rudimentary signal handlers.
    33      Combine libFuzzer with AddressSanitizer or similar for better crash reports.
    34SUMMARY: libFuzzer: deadly signal
    35MS: 0 ; base unit: 0000000000000000000000000000000000000000
    36artifact_prefix='./'; Test unit written to ./crash-276f4f9d3e6ba98f897bd1082aec7214ca8e4f5d
    37INFO: Running with entropic power schedule (0xFF, 100).
    38INFO: Seed: 3765617163
    39INFO: Loaded 1 modules   (547059 inline 8-bit counters): 547059 [0x55cd54dd29b8, 0x55cd54e582ab), 
    40INFO: Loaded 1 PC tables (547059 PCs): 547059 [0x55cd54e582b0,0x55cd556b11e0), 
    41INFO:      535 files found in /ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/autofile
    42INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1041296 bytes
    43INFO: seed corpus: files: 535 min: 1b max: 1041296b total: 17489928b rss: 106Mb
    44fuzz: test/fuzz/autofile.cpp:50: auto autofile_fuzz_target(FuzzBufferType)::(anonymous class)::operator()() const: Assertion `auto_file.fclose() == 0' failed.
    45==18378== ERROR: libFuzzer: deadly signal
    46    [#0](/bitcoin-bitcoin/0/) 0x55cd5193ab55 in __sanitizer_print_stack_trace (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1a5fb55) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    47    [#1](/bitcoin-bitcoin/1/) 0x55cd518922fc in fuzzer::PrintStackTrace() (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x19b72fc) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    48    [#2](/bitcoin-bitcoin/2/) 0x55cd51878367 in fuzzer::Fuzzer::CrashCallback() (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x199d367) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    49    [#3](/bitcoin-bitcoin/3/) 0x7fb162f1d8ff  (/lib/x86_64-linux-gnu/libc.so.6+0x428ff) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    50    [#4](/bitcoin-bitcoin/4/) 0x7fb162f7498a in pthread_kill (/lib/x86_64-linux-gnu/libc.so.6+0x9998a) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    51    [#5](/bitcoin-bitcoin/5/) 0x7fb162f1d855 in gsignal (/lib/x86_64-linux-gnu/libc.so.6+0x42855) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    52    [#6](/bitcoin-bitcoin/6/) 0x7fb162f018b6 in abort (/lib/x86_64-linux-gnu/libc.so.6+0x268b6) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    53    [#7](/bitcoin-bitcoin/7/) 0x7fb162f017da  (/lib/x86_64-linux-gnu/libc.so.6+0x267da) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    54    [#8](/bitcoin-bitcoin/8/) 0x7fb162f14185 in __assert_fail (/lib/x86_64-linux-gnu/libc.so.6+0x39185) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    55    [#9](/bitcoin-bitcoin/9/) 0x55cd51accacc in autofile_fuzz_target(Span<unsigned char const>)::$_3::operator()() const src/test/fuzz/autofile.cpp:50:17
    56    [#10](/bitcoin-bitcoin/10/) 0x55cd51accacc in unsigned long CallOneOf<autofile_fuzz_target(Span<unsigned char const>)::$_0, autofile_fuzz_target(Span<unsigned char const>)::$_1, autofile_fuzz_target(Span<unsigned char const>)::$_2, autofile_fuzz_target(Span<unsigned char const>)::$_3, autofile_fuzz_target(Span<unsigned char const>)::$_4, autofile_fuzz_target(Span<unsigned char const>)::$_5>(FuzzedDataProvider&, autofile_fuzz_target(Span<unsigned char const>)::$_0, autofile_fuzz_target(Span<unsigned char const>)::$_1, autofile_fuzz_target(Span<unsigned char const>)::$_2, autofile_fuzz_target(Span<unsigned char const>)::$_3, autofile_fuzz_target(Span<unsigned char const>)::$_4, autofile_fuzz_target(Span<unsigned char const>)::$_5) src/./test/fuzz/util.h:42:27
    57    [#11](/bitcoin-bitcoin/11/) 0x55cd51accacc in autofile_fuzz_target(Span<unsigned char const>) src/test/fuzz/autofile.cpp:27:9
    58    [#12](/bitcoin-bitcoin/12/) 0x55cd5204a52c in std::function<void (Span<unsigned char const>)>::operator()(Span<unsigned char const>) const /usr/bin/../lib/gcc/x86_64-linux-gnu/13/../../../../include/c++/13/bits/std_function.h:591:9
    59    [#13](/bitcoin-bitcoin/13/) 0x55cd5204a52c in LLVMFuzzerTestOneInput src/test/fuzz/fuzz.cpp:178:5
    60    [#14](/bitcoin-bitcoin/14/) 0x55cd51879814 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x199e814) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    61    [#15](/bitcoin-bitcoin/15/) 0x55cd51878f09 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool, bool*) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x199df09) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    62    [#16](/bitcoin-bitcoin/16/) 0x55cd5187aaf6 in fuzzer::Fuzzer::ReadAndExecuteSeedCorpora(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x199faf6) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    63    [#17](/bitcoin-bitcoin/17/) 0x55cd5187b017 in fuzzer::Fuzzer::Loop(std::vector<fuzzer::SizedFile, std::allocator<fuzzer::SizedFile>>&) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x19a0017) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    64    [#18](/bitcoin-bitcoin/18/) 0x55cd5186875b in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x198d75b) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    65    [#19](/bitcoin-bitcoin/19/) 0x55cd51892ce6 in main (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x19b7ce6) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    66    [#20](/bitcoin-bitcoin/20/) 0x7fb162f030cf  (/lib/x86_64-linux-gnu/libc.so.6+0x280cf) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    67    [#21](/bitcoin-bitcoin/21/) 0x7fb162f03188 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x28188) (BuildId: f0b834daa3d05a80967e9ec2f990a1ea71c958fa)
    68    [#22](/bitcoin-bitcoin/22/) 0x55cd5185d624 in _start (/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz+0x1982624) (BuildId: 34c8e9fbac2706ae466ade7d359adebec5d82bcf)
    69NOTE: libFuzzer has rudimentary signal handlers.
    70      Combine libFuzzer with AddressSanitizer or similar for better crash reports.
    71SUMMARY: libFuzzer: deadly signal
    72MS: 0 ; base unit: 0000000000000000000000000000000000000000
    73artifact_prefix='./'; Test unit written to ./crash-276f4f9d3e6ba98f897bd1082aec7214ca8e4f5d
    74Target ['/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/fuzz', '-runs=1', PosixPath('/ci_container_base/ci/scratch/qa-assets/fuzz_seed_corpus/autofile')] failed with exit code 77
    

    No base64 of the unit in the output, I guess I cannot access ./crash-276f4f9d3e6ba98f897bd1082aec7214ca8e4f5d and I cannot reproduce by running FUZZ=autofile ./src/test/fuzz/fuzz -runs=1000000 .../qa-assets/fuzz_seed_corpus/autofile locally. Maybe the CI machine ran out of disk space?


    vasild commented at 8:22 am on January 25, 2024:

    I realized that auto_file used in this fuzz test is not associated with a regular file on the filesystem, but with a “cookie” returned by fuzzed_file_provider.open() returned by fopencookie(3) which uses a custom close function FuzzedFileProvider::close() which can arbitrarily return -1:

    https://github.com/bitcoin/bitcoin/blob/207220ce8b767d8efdb5abf042ecf23d846ded73/src/test/fuzz/util.cpp#L352

    changed this to ignore the return value of auto_file.fclose().

  9. vasild force-pushed on Jan 25, 2024
  10. DrahtBot removed the label CI failed on Jan 25, 2024
  11. maflcko commented at 11:32 am on January 29, 2024: member

    I don’t think any care close-check needs to be done when reading from a file.

    So what about removing the write method from AutoFile, and introduce a new derived class to add it back. This class could Assume that the file was flushed/closed before the destructor is called?

  12. vasild commented at 10:48 am on February 4, 2024: contributor

    I don’t think any care close-check needs to be done when reading from a file.

    That is my understanding too.

    So what about removing the write method from AutoFile, and introduce a new derived class to add it back. This class could Assume that the file was flushed/closed before the destructor is called?

    Assume is more for code correctness, not for external errors (like IO error). If it does not fail during testing and on CI, that does not mean IO errors are absent and will not occur on the users’ machines. Given that Assume() is removed from production builds, it still means that file corruption could remain unnoticed on live environments, like now on master.

  13. maflcko commented at 9:45 am on February 5, 2024: member

    This class could Assume that the file was flushed/closed before the destructor is called?

    Assume is more for code correctness, not for external errors (like IO error).

    The assumption would be that the close was called before the destructor is called. This is a code correctness question and seems like the perfect fit for Assume. That is, if the Assume fails, the code was missing a close, and the code must be changed to fix the internal bug. The Assume does not in any way check for IO errors.

  14. vasild commented at 3:17 pm on February 5, 2024: contributor

    Then I misunderstood this:

    This class could Assume that the file was flushed/closed before the destructor is called

    How?

  15. maflcko commented at 3:34 pm on February 5, 2024: member
    AutoFile holds a file, which will be nullptr when it is closed. So if the file is nullptr, then it can be assumed to be flushed.
  16. vasild commented at 4:40 pm on February 5, 2024: contributor
    How/when would the file be closed?
  17. maflcko commented at 8:00 am on February 6, 2024: member

    How/when would the file be closed?

    An AutoFile can be closed by calling the fclose() method in the code.

  18. vasild commented at 9:45 am on February 6, 2024: contributor
    Ok, the users of the class then have to explicitly close. That is 1.4. from the OP, I elaborated it with more words.
  19. DrahtBot commented at 3:42 pm on March 12, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/20850328376

  20. DrahtBot added the label CI failed on Mar 12, 2024
  21. Retropex referenced this in commit f54c172ffe on Mar 28, 2024
  22. Retropex referenced this in commit 70403c5bb7 on Mar 28, 2024
  23. achow101 commented at 3:43 pm on April 9, 2024: member
    Seems like there is some discussion in the description that is a better fit for a brainstorming issue.
  24. DrahtBot marked this as a draft on Apr 23, 2024
  25. DrahtBot commented at 6:50 am on April 23, 2024: contributor
    Converted to draft due to failing CI and inactivity
  26. vasild force-pushed on May 10, 2024
  27. vasild commented at 10:28 am on May 10, 2024: contributor

    5543990321...1fa61d9edc: rebase and log a message from ~AutoFile() if closing fails and terminate the program.

    Ready for review. I think this would be a strict improvement over what we have now in master - a silent ignoring of a possible file corruption.

  28. vasild marked this as ready for review on May 10, 2024
  29. vasild force-pushed on May 10, 2024
  30. Sjors commented at 4:59 pm on May 10, 2024: member
    @vasild CI seems unhappy.
  31. vasild force-pushed on May 13, 2024
  32. vasild marked this as a draft on May 13, 2024
  33. vasild force-pushed on May 13, 2024
  34. DrahtBot removed the label CI failed on May 13, 2024
  35. vasild marked this as ready for review on May 14, 2024
  36. vasild commented at 9:51 am on May 14, 2024: contributor
    Fixed the CI!
  37. in src/streams.cpp:28 in de23848eed outdated
    24@@ -23,8 +25,9 @@ std::size_t AutoFile::detail_fread(Span<std::byte> dst)
    25 
    26 void AutoFile::read(Span<std::byte> dst)
    27 {
    28-    if (detail_fread(dst) != dst.size()) {
    29-        throw std::ios_base::failure(feof() ? "AutoFile::read: end of file" : "AutoFile::read: fread failed");
    30+    if (detail_fread(dst) != dst.size() || ferror(m_file)) {
    


    maflcko commented at 10:57 am on May 14, 2024:

    de23848eed5437f5e4dc6ccdc38b40cc58738ead: I still don’t understand this. Why would fread succeed to read the requested number of bytes, but then mark the stream with an error? If the goal is to somehow detect errors after the read succeeded, I am not sure this should be a goal. Otherwise, it would be good to explain what real user-facing bug this fixes or introduces.

    Either this is a refactor, in which case it seems pointless, or it is a behavior change, in which case it needs to explain why it beneficial and correct.


    vasild commented at 12:39 pm on May 14, 2024:

    This was somewhat answered in #29307 (review) above but let me elaborate - the thing that prompted me to do this change is this text from the fread(3) manual page:

    The function fread() does not distinguish between end-of-file and error, and callers must use feof(3) and ferror(3) to determine which occurred.

    I think checking for ferror(3) after fread(3) is a kind of harmless “belt-and-suspenders”. If you are sure that if detail_fread(dst) == dst.size() then ferror(m_file) will always be false or feel strongly against this then I could drop the commit de23848eed util: check for error after reading from a file.


    maflcko commented at 5:32 pm on May 23, 2024:

    If you are sure that if detail_fread(dst) == dst.size() then ferror(m_file) will always be false or feel strongly against this then I could drop the commit https://github.com/bitcoin/bitcoin/commit/de23848eed5437f5e4dc6ccdc38b40cc58738ead util: check for error after reading from a file.

    I don’t feel strongly in one way or another. I am just saying that with the currently available information, it is not possible to confidently review this and know it is a good change. At a minimum, it should be clear whether this is a refactor or a behavior change, and what user-facing bug this could possibly prevent or fix. (I can’t think of one, but maybe I am missing something?)


    vasild commented at 7:00 am on June 20, 2024:

    Well, it is pretty much obvious that checking for an error after read is either:

    • a noop, if the read could have never failed because it managed to read as many bytes as requested or
    • a bug fix where a read error was ignored before the fix

    so this is a safe change (it does not break anything), possibly fixing a bug. Given that you keep pushing against this I will drop this commit in the hopes to move this PR forward.


    maflcko commented at 7:08 am on June 20, 2024:
    No, I don’t understand why a “read error was ignored before the fix” and why “this is a safe change (it does not break anything)”. For example, if this was hit on a filesystem that stored a block file, which was fully and successfully read, but then return an error (for an unknown reason), this would mean that for a user where the program previously worked fine without any issues, is now unusable (for an unknown reason).

    vasild commented at 8:14 am on June 20, 2024:

    I just found a similar code to this proposed change:

    https://github.com/bitcoin/bitcoin/blob/2d21060af831fa8f8f1791b4464961d5f28a88cb/src/node/utxo_snapshot.cpp#L74-L80

    After reading from a file and getting everything that is needed from it, the code checks if there is an extra data and if that read fails it reports an error.


    maflcko commented at 8:30 am on June 20, 2024:
    The code you linked to is primarily checking for unexpected trailing data and only printing a warning. It is not making the program possibly unusable.

    vasild commented at 10:05 am on June 20, 2024:

    I think I see your point, you prefer 1. over 2., assuming we got all the bytes we asked for from a read, then:

    1. Ignore possible IO error by not checking for it
    2. Check for an error and behave as if the read failed

    I dropped this change.

  38. vasild force-pushed on May 14, 2024
  39. in src/kernel/mempool_persist.cpp:221 in f9cf6b516b outdated
    217@@ -213,6 +218,7 @@ bool DumpMempool(const CTxMemPool& pool, const fs::path& dump_path, FopenFn mock
    218                   fs::file_size(dump_path));
    219     } catch (const std::exception& e) {
    220         LogInfo("Failed to dump mempool: %s. Continuing anyway.\n", e.what());
    221+        (void)file.fclose();
    


    maflcko commented at 12:54 pm on May 15, 2024:

    Not sure I understand this change either. This seems to be mixing the third alternative in the pull request description with the approach you wanted to take?

    Defeats the purpose of a RAII wrapper for FILE* which automatically closes the file when it goes out of scope and there are a lot of users of AutoFile.


    vasild commented at 1:11 pm on May 23, 2024:

    It is best if all users of AutoFile explicitly call the fclose() method and check that it succeeds (option 3. from PR description). This is not feasible because there are many users of AutoFile. However, some of those users already do call the fclose() method explicitly. For those users it is easy and I added checks whether fclose() succeeds. DumpMempool() is such a user and the change is:

    0             throw std::runtime_error("FileCommit failed");
    1-        file.fclose();           
    2+        if (file.fclose() != 0) {    
    3+            throw std::runtime_error(
    4+                strprintf("Error closing %s: %s", fs::PathToString(file_fspath), SysErrorString(errno)));
    5+        }
    6         if (!RenameOver(dump_path + ".new", dump_path)) {
    

    For consistency in DumpMempool() I changed all code paths in that function to explicitly call fclose(). There are two codepaths and this added (void)file.fclose(); is the other one. I ignore the return value because this is already failure handling and the function is going to return false. I didn’t want to do something like “while handling error1, error2 occurred”.


    maflcko commented at 5:25 pm on May 23, 2024:

    Oh, now I see this is required, because otherwise the node would crash on this error instead of continuing with the current code, if this was missing?

    Not sure this is desirable, see also https://github.com/bitcoin/bitcoin/pull/29307/files#r1612065514


    vasild commented at 10:09 am on June 20, 2024:
    Resolving this discussion. Changed the semantic of AutoFile to require all write users to call fclose().
  40. DrahtBot added the label Needs rebase on May 23, 2024
  41. in src/streams.h:413 in de23848eed outdated
    410-    ~AutoFile() { fclose(); }
    411+    ~AutoFile()
    412+    {
    413+        if (fclose() != 0 && m_was_written) {
    414+            LogPrintLevel(BCLog::ALL, BCLog::Level::Error, "Error closing file: %s\n", SysErrorString(errno));
    415+            std::abort();
    


    maflcko commented at 5:22 pm on May 23, 2024:

    not sure about unconditionally crashing the node on any failure here. This may or may not be an issue, so the call site will have to decide about this.

    For example, If an RPC user dumps the mempool to an external (corrupt) USB thumb drive, a failure shouldn’t crash the node, but notify the user over the RPC.

    Also, I am not sure about involving logging in this low-level code. Logging may not be set up, so this may never be printed before the crash.

    Also, I am not sure about the log message. It just says that there was an error, but not whose fault it was or what should be done about it.

    A better solution overall may be to just call if (m_was_written) Assume(IsNull());. This should highlight all violating call sites during CI, because the test should be deterministic. (If there is a code path that is not deterministically tested, all hopes are lost anyway, and I don’t think the solution should be to crash the users node completely and without considering the context?)


    vasild commented at 10:07 am on June 20, 2024:

    if (m_was_written) Assume(IsNull());

    Done.

  42. luke-jr referenced this in commit ec8ef5e51c on Jun 13, 2024
  43. luke-jr referenced this in commit 17f1f24993 on Jun 13, 2024
  44. vasild force-pushed on Jun 20, 2024
  45. vasild renamed this:
    util: check for errors after close and read in AutoFile
    util: explicitly close all AutoFiles that have been written
    on Jun 20, 2024
  46. DrahtBot removed the label Needs rebase on Jun 20, 2024
  47. maflcko commented at 12:04 pm on June 20, 2024: member

    lgtm ACK 11be9f4103ea219f801a1f0fe1385f66ca70ad22

    Will do some testing later

  48. luke-jr referenced this in commit d2ea6ab60b on Jun 22, 2024
  49. DrahtBot added the label Needs rebase on Jul 2, 2024
  50. vasild force-pushed on Jul 2, 2024
  51. vasild commented at 12:41 pm on July 2, 2024: contributor
    11be9f4103...ea89a6e368: rebase due to conflicts
  52. DrahtBot removed the label Needs rebase on Jul 2, 2024
  53. DrahtBot added the label Needs rebase on Jul 9, 2024
  54. vasild force-pushed on Jul 22, 2024
  55. vasild commented at 10:33 am on July 22, 2024: contributor
    ea89a6e368...4533445fd7: rebase due to conflicts
  56. DrahtBot added the label CI failed on Jul 22, 2024
  57. DrahtBot commented at 12:13 pm on July 22, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/27742621741

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  58. DrahtBot removed the label Needs rebase on Jul 22, 2024
  59. vasild commented at 12:56 pm on July 22, 2024: contributor
    CI failure seems unrelated.
  60. DrahtBot removed the label CI failed on Jul 22, 2024
  61. ryanofsky commented at 9:26 pm on July 22, 2024: contributor

    Concept ACK, approach ~0

    The PR description is a bit vague about how it is changing the AutoFile API.

    Previously, the autofile API allowed but did not require AutoFile callers to check for failing fclose calls after writing. Now it requires that callers call the fclose method manually and explicitly use or discard its return value.

    Checking for failing fclose calls is important because if a call fails, it probably indicates that there is a serious problem that should trigger a log message or an alert. Specifically if the file was used for writing it could mean that data could not successfully be saved to disk. So it would be bad for fclose errors to be discarded without being handled at all or reported somewhere.

    The PR makes AutoFile require callers to handle failing fclose calls, and it enforces this in two ways:

    1. It adds [[nodiscard]] annotation to the fclose method so its return value cannot be ignored. The return value is 0 if successful or EOF (-1) if unsuccessful.
    2. It tries to enforce that callers actually invoke fclose by adding an Assume statement to the destructor that terminates the program with a fatal error if the file was written to and fclose was not called and the binary was built in a debug mode.

    It seems like this approach works, but it is maybe a little fragile because it doesn’t actually enforce that caller handles the fclose result in every code path. So for example if an early return statement is added after a caller opens an autofile object but before it calls the fclose method, and the early return code path is not normally executed, it can introduce a latent bug that occasionally will crash debug builds, while providing no indication that anything is wrong in release builds.

    I think if the goal is to require AutoFile callers to handle failing fclose calls, a more reliable way to do that would be to change the constructor signature to

    0explicit AutoFile(std::FILE* file, std::function<void()> on_error);
    

    And require the caller to provide an on_error callback that does error handing like

    0[&] {
    1        remove(pathTmp);
    2        LogError("%s: Failed to close file %s: %s\n", __func__, fs::PathToString(pathTmp), SysErrorString(errno));
    3}
    

    I think the current approach is ok too, but it seems more fragile because it doesn’t detect all programming errors, and when it does it detects them with a runtime crash instead of a compile time error.

    I am coming to this late, so if you are really attached to the current approach I can just review and ack it, but I wanted to suggest an alternative.

  62. vasild commented at 1:13 pm on July 23, 2024: contributor

    @ryanofsky, I am not so attached to the current approach :) if it can be improved, then lets improve it.

    For the on_error callback - that would be called on fclose() error from destructor only, right? Not on any error? E.g. the write() method throws an exception on error.

    Not all callers have the file name when creating AutoFile which means we would have to log some generic message without a file name. Or go an extra mile and make sure we have the file name, but some callers don’t use files at all, IIRC the fuzzer uses some cookie “files” that are not on the filesystem.

    Then the callback can be omitted and the message can be logged from the destructor:

     0     ~AutoFile()
     1     {   
     2         if (m_was_written) {
     3             // Callers that wrote to the file must have closed it explicitly
     4             // with the fclose() method and checked that the close succeeded.
     5             // This is because here from the destructor we have no way to signal
     6             // error due to close which, after write, could mean the file is
     7             // corrupted and must be handled properly at the call site.
     8             Assume(IsNull());
     9         }
    10                        
    11-        (void)fclose();
    12+        if (fclose() != 0) {
    13+            LogError("Failed to close file: %s\n", SysErrorString(errno));
    14+        }
    15     }
    
  63. DrahtBot added the label CI failed on Aug 7, 2024
  64. DrahtBot added the label Needs rebase on Aug 21, 2024
  65. vasild force-pushed on Aug 22, 2024
  66. vasild commented at 9:27 am on August 22, 2024: contributor
    4533445fd7...1dcd626fe1: rebase due to conflicts
  67. DrahtBot removed the label Needs rebase on Aug 22, 2024
  68. vasild force-pushed on Aug 23, 2024
  69. vasild commented at 9:25 am on August 23, 2024: contributor
    1dcd626fe1...587d996f07: rebase, get the extra printout from #29307 (comment) and resolve a silent merge conflict with https://github.com/bitcoin/bitcoin/pull/28052
  70. DrahtBot removed the label CI failed on Aug 23, 2024
  71. vasild force-pushed on Sep 2, 2024
  72. vasild commented at 2:54 pm on September 2, 2024: contributor
    587d996f07...489c2477e7: rebase to pick CMake
  73. DrahtBot added the label CI failed on Sep 5, 2024
  74. in src/addrdb.cpp:84 in 489c2477e7 outdated
    83         return false;
    84     }
    85-    fileout.fclose();
    86+    if (fileout.fclose() != 0) {
    87+        remove(pathTmp);
    88+        LogError("%s: Failed to close file %s: %s\n", __func__, fs::PathToString(pathTmp), SysErrorString(errno));
    


    hodlinator commented at 9:07 pm on September 7, 2024:

    Less error-prone to use the error value returned from fclose() rather than relying on global errno which might have been mutated by remove() etc.

    0    if (int err{fileout.fclose()}; err != 0) {
    1        remove(pathTmp);
    2        LogError("Failed to close file %s: %s", fs::PathToString(pathTmp), SysErrorString(err));
    

    (Also dropped pre--logsourcelocations __func__ and removed superfluous newline).


    vasild commented at 12:27 pm on April 14, 2025:
    Done. Note however that fclose() returns either 0 or EOF (-1). It does not return the error code.
  75. in src/index/blockfilterindex.cpp:234 in 489c2477e7 outdated
    228@@ -229,6 +229,12 @@ size_t BlockFilterIndex::WriteFilterToDisk(FlatFilePos& pos, const BlockFilter&
    229     }
    230 
    231     fileout << filter.GetBlockHash() << filter.GetEncodedFilter();
    232+
    233+    if (fileout.fclose() != 0) {
    234+        LogPrintf("%s: Failed to close filter file %d\n", __func__, pos.nFile);
    


    hodlinator commented at 9:31 pm on September 7, 2024:

    Could use SystemErr(err) here?

    0    if (int err{fileout.fclose()}; err != 0) {
    1        LogPrintf("Failed to close filter file %d: %s", pos.nFile, SysErrorString(err));
    

    Could be done in other files too (net.cpp, blockstorage.cpp). (I can understand not wanting to do it further above in this specific file though).


    vasild commented at 12:28 pm on April 14, 2025:
    Done, modulo fclose() does not return the error code which is to be found in errno.
  76. in src/bench/streams_findbyte.cpp:31 in 489c2477e7 outdated
    27@@ -28,7 +28,7 @@ static void FindByte(benchmark::Bench& bench)
    28     });
    29 
    30     // Cleanup
    31-    file.fclose();
    32+    (void)file.fclose();
    


    hodlinator commented at 9:35 pm on September 7, 2024:

    Rather than silence errors, it might be good to at least assert against errors here and elsewhere in benchmarks/tests as we want to ensure we are benchmarking/testing properly? (Not for some fuzz-cases though as you already found out).

    0    assert(file.fclose() != 0);
    

    That’s already how you handled src/wallet/test/fuzz/wallet_bdb_parser.cpp.


    vasild commented at 12:28 pm on April 14, 2025:
    Done, but it should be ==, not !=.
  77. DrahtBot removed the label CI failed on Sep 15, 2024
  78. DrahtBot added the label Needs rebase on Sep 17, 2024
  79. vasild force-pushed on Sep 28, 2024
  80. vasild commented at 5:20 am on September 28, 2024: contributor
    489c2477e7...dba7835386: rebase due to conflicts
  81. DrahtBot removed the label Needs rebase on Sep 28, 2024
  82. in src/policy/fees.cpp:957 in dba7835386 outdated
    949@@ -950,7 +950,7 @@ void CBlockPolicyEstimator::Flush() {
    950 void CBlockPolicyEstimator::FlushFeeEstimates()
    951 {
    952     AutoFile est_file{fsbridge::fopen(m_estimation_filepath, "wb")};
    953-    if (est_file.IsNull() || !Write(est_file)) {
    954+    if (est_file.IsNull() || !Write(est_file) || est_file.fclose() != 0) {
    955         LogPrintf("Failed to write fee estimates to %s. Continue anyway.\n", fs::PathToString(m_estimation_filepath));
    


    hodlinator commented at 7:28 pm on January 21, 2025:

    nit: Might take the opportunity to improve grammar and modernize log:

    0        LogWarning("Failed to write fee estimates to %s. Continuing anyway.", fs::PathToString(m_estimation_filepath));
    

    vasild commented at 12:28 pm on April 14, 2025:
    Done.
  83. in src/rpc/blockchain.cpp:53 in dba7835386 outdated
    46@@ -47,6 +47,7 @@
    47 #include <util/check.h>
    48 #include <util/fs.h>
    49 #include <util/strencodings.h>
    50+#include <util/syserror.h>
    51 #include <util/translation.h>
    52 #include <validation.h>
    53 #include <validationinterface.h>
    


    hodlinator commented at 8:27 pm on January 21, 2025:

    vasild commented at 12:24 pm on April 14, 2025:
    It is being closed inside WriteUTXOSnapshot().
  84. hodlinator commented at 10:56 am on January 22, 2025: contributor

    Concept ACK

    Appreciate the intent of being rigorous in reporting issues with flushing file content to disk.

    Alternative implementation

    The current PR which produces a run-time failure in the AutoFile destructor upon forgetting to explicitly call fclose(). I attempted an alternative implementation (branch) inspired by ryanofsky’s comment that takes a lambda in the constructor which is called upon fclose()-failure in the destructor. This enforces the context-aware handling of the issue at compile-time. The change was somewhat invasive but I’m curious to hear what others make of it. (It also splits AutoFile into FileReader and FileWriter (naming inspired by other types in streams.h)).

  85. DrahtBot added the label Needs rebase on Jan 22, 2025
  86. luke-jr referenced this in commit 06105d8da8 on Feb 26, 2025
  87. vasild force-pushed on Apr 14, 2025
  88. DrahtBot removed the label Needs rebase on Apr 14, 2025
  89. vasild commented at 12:25 pm on April 14, 2025: contributor
    dba7835...9c16900: rebase due to conflicts
  90. vasild force-pushed on Apr 14, 2025
  91. vasild commented at 12:25 pm on April 14, 2025: contributor
    9c16900b9b...8939947449: address suggestions
  92. vasild commented at 12:34 pm on April 14, 2025: contributor

    lambda in the constructor which is called upon fclose()-failure in the destructor … I’m curious to hear what others make of it. (It also splits AutoFile into FileReader and FileWriter (naming inspired by other types in streams.h)).

    Concept ACK, but I am worried that the lack of review interest here would mean that an even more invasive patch would never make it.

  93. l0rinc commented at 1:30 pm on April 14, 2025: contributor
    Concept ACK
  94. DrahtBot added the label Needs rebase on Apr 16, 2025
  95. vasild force-pushed on Apr 17, 2025
  96. vasild commented at 4:40 am on April 17, 2025: contributor
    8939947449...aa88c6dfb6: rebase due to conflicts
  97. DrahtBot added the label CI failed on Apr 17, 2025
  98. DrahtBot commented at 5:45 am on April 17, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/40701339509

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  99. DrahtBot removed the label Needs rebase on Apr 17, 2025
  100. vasild force-pushed on Apr 17, 2025
  101. vasild commented at 7:32 am on April 17, 2025: contributor
    aa88c6dfb6...0e97a68e13: fix CI
  102. in src/node/blockstorage.cpp:1122 in 0e97a68e13 outdated
    1118+        LogPrintLevel(BCLog::BLOCKSTORAGE,
    1119+                      BCLog::Level::Error,
    1120+                      "Failed to close block file %s: %s",
    1121+                      pos.ToString(),
    1122+                      SysErrorString(errno));
    1123+        m_opts.notifications.fatalError(_("Failed close file when writing block."));
    


    DrahtBot commented at 7:46 am on April 17, 2025:

    (✨ LLM generated, experimental)

    “Failed close file when writing block.” → “Failed to close file when writing block.”
    
  103. vasild force-pushed on Apr 17, 2025
  104. vasild commented at 8:17 am on April 17, 2025: contributor
    0e97a68e13...1f71111e21: fix AI and CI (even more)
  105. in src/net.cpp:4018 in 1f71111e21 outdated
    4011@@ -4012,6 +4012,11 @@ static void CaptureMessageToFile(const CAddress& addr,
    4012     uint32_t size = data.size();
    4013     ser_writedata32(f, size);
    4014     f << data;
    4015+
    4016+    if (f.fclose() != 0) {
    4017+        throw std::ios_base::failure(
    4018+            strprintf("Error closing %s after write, file contents is likely incomplete", fs::PathToString(path)));
    


    DrahtBot commented at 8:22 am on April 17, 2025:

    (✨ LLM generated, experimental)

    file contents is likely incomplete ⇒ file contents are likely incomplete

  106. DrahtBot removed the label CI failed on Apr 17, 2025
  107. util: explicitly close all AutoFiles that have been written
    There is no way to report a close error from `AutoFile` destructor.
    Such an error could be serious if the file has been written to because
    it may mean the file is now corrupted (same as if write fails).
    
    So, change all users of `AutoFile` that use it to write data to
    explicitly close the file and handle a possible error.
    c52b673bf8
  108. in src/index/blockfilterindex.cpp:154 in 1f71111e21 outdated
    150@@ -151,7 +151,7 @@ bool BlockFilterIndex::CustomCommit(CDBBatch& batch)
    151         LogError("%s: Failed to open filter file %d\n", __func__, pos.nFile);
    152         return false;
    153     }
    154-    if (!file.Commit()) {
    155+    if (!file.Commit() || file.fclose() != 0) {
    


    l0rinc commented at 11:09 am on April 17, 2025:

    If we can’t fully commit to disk, we won’t call fclose directly - we’ll rely on the destructor’s close. We’d get a commitment failure message (though that could have been successful) - and maybe another message from the destructor.

    Could we separate the two errors?

    0if (!file.Commit()) {
    1    LogError("%s: Failed to commit filter file %d", __func__, pos.nFile);
    2    return false;
    3}
    4if (file.fclose()) {
    5    LogError("%s: Failed to close filter file %d", __func__, pos.nFile);
    6    return false;
    7}
    

    or unify them to a single error like we’re doing in https://github.com/bitcoin/bitcoin/pull/29307/files?w=1#diff-7194b705e3c48e34fbd6e648fbbee39c6e0b49ee9658658b930678a9a2d31dd6R957

    0if (file.IsNull() || !file.Commit() || file.fclose()) {
    1    LogError("%s: Failed to flush filter file %d", __func__, pos.nFile);
    2    return false;
    3}
    
  109. in src/index/blockfilterindex.cpp:236 in 1f71111e21 outdated
    228@@ -229,6 +229,12 @@ size_t BlockFilterIndex::WriteFilterToDisk(FlatFilePos& pos, const BlockFilter&
    229     }
    230 
    231     fileout << filter.GetBlockHash() << filter.GetEncodedFilter();
    232+
    233+    if (fileout.fclose() != 0) {
    234+        LogPrintf("Failed to close filter file %d: %s", pos.nFile, SysErrorString(errno));
    235+        return 0;
    236+    }
    


    l0rinc commented at 11:23 am on April 17, 2025:

    Why is this a LogInfo, shouldn’t it be a LogError?

    0    if (fileout.fclose()) {
    1        LogError("%s: Failed to close filter file %d: %s", __func__, pos.nFile, SysErrorString(errno));
    2        return 0;
    3    }
    
  110. in src/node/blockstorage.cpp:965 in 1f71111e21 outdated
    971+                    fileout << blockundo << hasher.GetHash();
    972+                }
    973             }
    974 
    975-            fileout.flush(); // Make sure `AutoFile`/`BufferedWriter` go out of scope before we call `FlushUndoFile`
    976+            if (file.fclose() != 0) {
    


    l0rinc commented at 11:28 am on April 17, 2025:
    nit: now that we’re closing explicitly we probably don’t need the scope, too (wouldn’t we double-log in that case anyway?) Or if we keep it, could make sense to add the closing inside - unless it’s deliberate, in which case we might want to explain it in the commit message
  111. in src/node/blockstorage.cpp:973 in 1f71111e21 outdated
    979+                              "Failed to close block undo file %s: %s",
    980+                              pos.ToString(),
    981+                              SysErrorString(errno));
    982+                return FatalError(m_opts.notifications, state, _("Failed to close block undo file."));
    983+            }
    984+            // Make sure that the file is closed before we call `FlushUndoFile`.
    


    l0rinc commented at 11:28 am on April 17, 2025:

    I’d either delete this now or put it before the actual closing. And do we really need to make this unlikely scenario occupy so much space? And in the other case we simply used LogError instead:

    0// Make sure that the file is closed before we call `FlushUndoFile`.
    1if (file.fclose()) {
    2    LogError("Failed to close block undo file %s: %s", pos.ToString(), SysErrorString(errno));
    3    return FatalError(m_opts.notifications, state, _("Failed to close block undo file."));
    4}
    
  112. in src/node/blockstorage.cpp:1169 in 1f71111e21 outdated
    1164@@ -1142,6 +1165,11 @@ static auto InitBlocksdirXorKey(const BlockManager::Options& opts)
    1165 #endif
    1166         )};
    1167         xor_key_file << xor_key;
    1168+        if (xor_key_file.fclose() != 0) {
    1169+            throw std::runtime_error{strprintf("Error closing XOR key file %s: %s\n",
    


    l0rinc commented at 11:34 am on April 17, 2025:

    nit:

    0            throw std::runtime_error{strprintf("Error closing XOR key file %s: %s",
    
  113. vasild force-pushed on Apr 17, 2025
  114. vasild commented at 11:36 am on April 17, 2025: contributor
    1f71111e21...c52b673bf8: #29307 (review)
  115. in src/test/fuzz/autofile.cpp:68 in c52b673bf8
    61@@ -62,5 +62,10 @@ FUZZ_TARGET(autofile)
    62         if (f != nullptr) {
    63             fclose(f);
    64         }
    65+    } else {
    66+        // FuzzedFileProvider::close() is expected to fail sometimes. Don't let
    67+        // the destructor of AutoFile be upset by a failing fclose(). Close it
    68+        // explicitly (and ignore any errors) so that the destructor is a noop.
    


    l0rinc commented at 11:40 am on April 17, 2025:
    why do we want the destructor to be a noop?
  116. in src/test/streams_tests.cpp:573 in c52b673bf8
    567@@ -566,7 +568,9 @@ BOOST_AUTO_TEST_CASE(buffered_reader_matches_autofile_random_content)
    568 
    569     // Write out the file with random content
    570     {
    571-        AutoFile{test_file.Open(pos, /*read_only=*/false), obfuscation}.write(m_rng.randbytes<std::byte>(file_size));
    572+        AutoFile f{test_file.Open(pos, /*read_only=*/false), obfuscation};
    573+        f.write(m_rng.randbytes<std::byte>(file_size));
    574+        BOOST_REQUIRE_EQUAL(f.fclose(), 0);
    


    l0rinc commented at 11:41 am on April 17, 2025:
    nit: this likely isn’t strictly necessary, as long as it contains all the data (checked below), it should work correctly - but it’s fine either way
  117. in src/streams.h:406 in c52b673bf8
    402 {
    403 protected:
    404     std::FILE* m_file;
    405     std::vector<std::byte> m_xor;
    406     std::optional<int64_t> m_position;
    407+    bool m_was_written{false};
    


    l0rinc commented at 11:51 am on April 17, 2025:
    Can you separate the AutoFile changes to a new commit, explaining these changes separately from the changes that are calling close explicitly?
  118. in src/streams.h:423 in c52b673bf8
    420+            // corrupted and must be handled properly at the call site.
    421+            Assume(IsNull());
    422+        }
    423+
    424+        if (fclose() != 0) {
    425+            LogPrintLevel(BCLog::ALL, BCLog::Level::Error, "Failed to close file: %s", SysErrorString(errno));
    


    l0rinc commented at 11:53 am on April 17, 2025:
    0            LogError("Failed to close file: %s", SysErrorString(errno));
    
  119. in src/streams.h:433 in c52b673bf8
    431     AutoFile& operator=(const AutoFile&) = delete;
    432 
    433     bool feof() const { return std::feof(m_file); }
    434 
    435-    int fclose()
    436+    [[nodiscard]] int fclose()
    


    l0rinc commented at 11:54 am on April 17, 2025:
    👍 for [[nodiscard]], but this seems like the perfect opportunity to unify it with how e.g. feof works and change fclose to return a boolean instead - since we don’t seem to check other values than 0.
  120. in src/streams.h:417 in c52b673bf8
    414+    {
    415+        if (m_was_written) {
    416+            // Callers that wrote to the file must have closed it explicitly
    417+            // with the fclose() method and checked that the close succeeded.
    418+            // This is because here from the destructor we have no way to signal
    419+            // error due to close which, after write, could mean the file is
    


    l0rinc commented at 11:55 am on April 17, 2025:
    not sure what these lines mean
  121. in src/test/flatfile_tests.cpp:84 in c52b673bf8
    80@@ -79,13 +81,15 @@ BOOST_AUTO_TEST_CASE(flatfile_open)
    81 
    82         file >> LIMITED_STRING(text, 256);
    83         BOOST_CHECK_EQUAL(text, line2);
    84+        BOOST_REQUIRE_EQUAL(file.fclose(), 0);
    


    l0rinc commented at 11:56 am on April 17, 2025:
    ‘require’ seems a bit harsh, but shouldn’t happen often
  122. in src/streams.cpp:98 in c52b673bf8
    94@@ -95,6 +95,7 @@ void AutoFile::write(std::span<const std::byte> src)
    95             src = src.subspan(buf_now.size());
    96         }
    97     }
    98+    m_was_written = true;
    


    l0rinc commented at 11:58 am on April 17, 2025:
    should this be added to AutoFile::write_buffer as well?
  123. l0rinc changes_requested
  124. l0rinc commented at 12:14 pm on April 17, 2025: contributor

    Concept ACK, we definitely want to know if this happens!

    • It seems we’re using the return value as a boolean - can make make it return one?
    • The error messages seem all over the place, can we unify them to LogError?
    • Not sure if scope-ing the AutoFile + closing it explicitly is the best solution - would probably prefer @ryanofsky’s std::function<void()> on_error or @maflcko’s recommendation to let AutoFile guarantee proper closing.
    • Sometimes we’re explicitly showing a closing related failure, other times we’re reusing previous error messages - it’s not a very important or likely scenario, we could likely add it to other erroneous cases.
    • We have discernible changes here, all in a single commit - would prefer separating the concerns a bit more with better commit message explanations

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-04-29 00:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me