Follow-up after AutoFile position caching: remove unused code #30927

pull sipa wants to merge 2 commits into bitcoin:master from sipa:202409_tellcache_followup changing 4 files +9 −19
  1. sipa commented at 11:30 am on September 19, 2024: member

    This is a follow-up to #30884.

    Remove a number of dead code paths, and improve the code organization and documentation, in AutoFile.

  2. DrahtBot commented at 11:30 am on September 19, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, theStack, l0rinc, tdb3

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. streams: remove unused code 67a3d59076
  4. sipa force-pushed on Sep 19, 2024
  5. in src/streams.h:457 in 67a3d59076 outdated
    454@@ -455,7 +455,6 @@ class AutoFile
    455     }
    456 
    457     bool Commit();
    


    maflcko commented at 11:44 am on September 19, 2024:
    unrelated style-nit (feel free to ignore): I think this “paragraph” isn’t part of the “Stream subset”. It would be good to either remove the “Stream subset” comment above, or move the “paragraph” (Commit and Truncate) a few lines up.

    sipa commented at 11:58 am on September 19, 2024:
    Done, in a separate commit. I have also added comments to all the affected functions.
  6. maflcko approved
  7. maflcko commented at 11:45 am on September 19, 2024: member
    lgtm ACK 67a3d590768301fb46a93fdb0a5c66c0c2de1082
  8. streams: reorder/document functions caac06f784
  9. maflcko commented at 11:58 am on September 19, 2024: member

    re-ACK caac06f784c5d94c6a5f7d0b586f9ddbbe55c369

    Only change is fixup up the doxygen dev docs

  10. in src/node/blockstorage.cpp:687 in caac06f784
    683@@ -684,10 +684,6 @@ bool BlockManager::UndoWriteToDisk(const CBlockUndo& blockundo, FlatFilePos& pos
    684 
    685     // Write undo data
    686     long fileOutPos = fileout.tell();
    687-    if (fileOutPos < 0) {
    688-        LogError("%s: ftell failed\n", __func__);
    689-        return false;
    690-    }
    691     pos.nPos = (unsigned int)fileOutPos;
    


    l0rinc commented at 1:04 pm on September 19, 2024:

    Does it still make sense for AutoFile::tell() to return a int64_t instead of a uint64_t or unsigned int?

    Now that we’ve removed the invalid usages, maybe it’s a good opportunity to transition AutoFile (and maybe FlatFilePos) to std::optional<uint64_t> m_position;


    sipa commented at 1:51 pm on September 19, 2024:

    I think this is a bit of a philosophical question.

    Both int64_t and uint64_t have sufficient range to represent the data here (because ftell returns a long, which matches int64_t on all our supported platforms, plus we know the result cannot be negative), so both are perfectly functional.

    The advantage of uint64_t (and presumably the reason you’re suggesting this) is because it signals that the number cannot be negative.

    On the other hand, signed types have the advantage that they actually model integers (because signed overflow is UB, the compiler - and humans - may treat signed types as being mathematical integers without wraparound behavior), while unsigned types explicitly model modular arithmetic. Bjarne Stroustroup’s reasoning for wanting std::span to have a signed size despite not being allowed to be negative is worth a read, I think.

    I don’t think either of these arguments are very strong, but I think it’s enough to not bother changing the return type.


    l0rinc commented at 2:02 pm on September 19, 2024:

    The advantage of uint64_t (and presumably the reason you’re suggesting this) is because it signals that the number cannot be negative.

    Yes, to make it stricter and closer aligned to the problem domain.

    signed types have the advantage that they actually model integers

    But here we’re modelling a size where potential negative values are confusing (especially since we’re storing it as unsigned int going forward). The negative values were already handled, it seems to me that they lost their usefulness after this change (checking for <0 after that change would warn).

    signed overflow is UB

    We’re already potentially down-casting the size to <64 bits - is overflow something we care about here?


    sipa commented at 2:11 pm on September 19, 2024:

    @l0rinc All the same arguments apply to the size of spans, which cannot be negative, and yet people strongly (and unsuccessfully, but only barely) argued that it should have a signed size type. I don’t believe the concern is about specific worries around overflow and its behavior; it is about the fact that sizes shouldn’t be thought of as modular arithmetic, and that C++ has no type to model natural numbers - the best approximation is signed types which model integers.

    And again, I also see the advantages of the other side. But given that both types suffice, and there are advantages in both ways, I don’t think it’s worth arguing over.


    l0rinc commented at 2:22 pm on September 19, 2024:

    Mixing signed and unsigned numbers is a common source of confusion and bugs. Most coding guidelines recommend against mixing them.

    But we are mixing them, that’s my main concern

    0pos.nPos = (unsigned int)fileOutPos
    

    I’d be fine with using a signed position in both cases, but if one is signed and the other is not, most of us will assume that we have to check the signed value for negatives (as the existence of this PR demonstrates). But changing the usages is outside the scope of this PR, thanks for explaining your reasoning (and for the interesting article), please resolve the comment.


    laanwj commented at 7:37 pm on September 19, 2024:
    Would be nice to add an assertion / check here to make that fileOutPos is within range of unsigned int before casting it, to avoid silent truncation. But yes, out of scope of this PR as the conversion isn’t introduced here.
  11. in src/streams.h:442 in caac06f784
    437     int64_t tell();
    438 
    439+    /** Wrapper around FileCommit(). */
    440+    bool Commit();
    441+
    442+    /** Wrapper around TruncateFile(). */
    


    l0rinc commented at 1:14 pm on September 19, 2024:
    isn’t this obvious from the implementation?

    sipa commented at 1:52 pm on September 19, 2024:
    Someone shouldn’t need to look at the implementation to understand what a function does.

    l0rinc commented at 2:11 pm on September 19, 2024:
    I’m not exactly sure in which cases this comment would help us avoid having to check the sources - but I don’t have strong opinions about it, please resolve this comment.
  12. in src/streams.h:437 in caac06f784
    429@@ -430,9 +430,18 @@ class AutoFile
    430     /** Implementation detail, only used internally. */
    431     std::size_t detail_fread(Span<std::byte> dst);
    432 
    433+    /** Wrapper around fseek(). Will throw if seeking is not possible. */
    434     void seek(int64_t offset, int origin);
    435+
    436+    /** Find position within the file. Will throw if unknown. */
    437     int64_t tell();
    


    l0rinc commented at 1:29 pm on September 19, 2024:

    Slightly unrelated, but since we’re modifying its usage here: streams_tests doesn’t seem to cover this method at all.

    (+ Commit, Truncate and many std::ios_base::failure throws seem also uncovered)

  13. l0rinc changes_requested
  14. l0rinc commented at 1:31 pm on September 19, 2024: contributor

    Concept ACK

    Could be the right time to do some minor cleanup (return unsigned) and maybe to cover missing method and exceptions with tests.

  15. theStack approved
  16. theStack commented at 2:21 pm on September 19, 2024: contributor
    Code-review ACK caac06f784c5d94c6a5f7d0b586f9ddbbe55c369
  17. DrahtBot requested review from l0rinc on Sep 19, 2024
  18. l0rinc approved
  19. l0rinc commented at 2:23 pm on September 19, 2024: contributor

    ACK caac06f784c5d94c6a5f7d0b586f9ddbbe55c369

    My only remaining concern is lack of related test coverage, but it’s not a blocker

  20. tdb3 approved
  21. tdb3 commented at 3:04 pm on September 19, 2024: contributor
    CR ACK caac06f784c5d94c6a5f7d0b586f9ddbbe55c369
  22. fanquake merged this on Sep 19, 2024
  23. fanquake closed this on Sep 19, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 04:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me