Follow-up after AutoFile position caching: remove unused code

sipa commented at 11:30 am on September 19, 2024: member

This is a follow-up to #30884.

Remove a number of dead code paths, and improve the code organization and documentation, in AutoFile.

DrahtBot commented at 11:30 am on September 19, 2024: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	maflcko, theStack, l0rinc, tdb3

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

streams: remove unused code 67a3d59076

sipa force-pushed on Sep 19, 2024

in src/streams.h:457 in 67a3d59076 outdated

454@@ -455,7 +455,6 @@ class AutoFile
455     }
456 
457     bool Commit();

maflcko commented at 11:44 am on September 19, 2024:

unrelated style-nit (feel free to ignore): I think this “paragraph” isn’t part of the “Stream subset”. It would be good to either remove the “Stream subset” comment above, or move the “paragraph” (Commit and Truncate) a few lines up.

sipa commented at 11:58 am on September 19, 2024:

Done, in a separate commit. I have also added comments to all the affected functions.

maflcko approved

maflcko commented at 11:45 am on September 19, 2024: member

lgtm ACK 67a3d590768301fb46a93fdb0a5c66c0c2de1082

streams: reorder/document functions caac06f784

maflcko commented at 11:58 am on September 19, 2024: member

re-ACK caac06f784c5d94c6a5f7d0b586f9ddbbe55c369

Only change is fixup up the doxygen dev docs

in src/node/blockstorage.cpp:687 in caac06f784

683@@ -684,10 +684,6 @@ bool BlockManager::UndoWriteToDisk(const CBlockUndo& blockundo, FlatFilePos& pos
684 
685     // Write undo data
686     long fileOutPos = fileout.tell();
687-    if (fileOutPos < 0) {
688-        LogError("%s: ftell failed\n", __func__);
689-        return false;
690-    }
691     pos.nPos = (unsigned int)fileOutPos;

l0rinc commented at 1:04 pm on September 19, 2024:

Does it still make sense for AutoFile::tell() to return a int64_t instead of a uint64_t or unsigned int?

Now that we’ve removed the invalid usages, maybe it’s a good opportunity to transition AutoFile (and maybe FlatFilePos) to std::optional<uint64_t> m_position;

sipa commented at 1:51 pm on September 19, 2024:

I think this is a bit of a philosophical question.

Both int64_t and uint64_t have sufficient range to represent the data here (because ftell returns a long, which matches int64_t on all our supported platforms, plus we know the result cannot be negative), so both are perfectly functional.

The advantage of uint64_t (and presumably the reason you’re suggesting this) is because it signals that the number cannot be negative.

On the other hand, signed types have the advantage that they actually model integers (because signed overflow is UB, the compiler - and humans - may treat signed types as being mathematical integers without wraparound behavior), while unsigned types explicitly model modular arithmetic. Bjarne Stroustroup’s reasoning for wanting std::span to have a signed size despite not being allowed to be negative is worth a read, I think.

I don’t think either of these arguments are very strong, but I think it’s enough to not bother changing the return type.

l0rinc commented at 2:02 pm on September 19, 2024:

The advantage of uint64_t (and presumably the reason you’re suggesting this) is because it signals that the number cannot be negative.

Yes, to make it stricter and closer aligned to the problem domain.

signed types have the advantage that they actually model integers

But here we’re modelling a size where potential negative values are confusing (especially since we’re storing it as unsigned int going forward). The negative values were already handled, it seems to me that they lost their usefulness after this change (checking for <0 after that change would warn).

signed overflow is UB

We’re already potentially down-casting the size to <64 bits - is overflow something we care about here?

sipa commented at 2:11 pm on September 19, 2024:

@l0rinc All the same arguments apply to the size of spans, which cannot be negative, and yet people strongly (and unsuccessfully, but only barely) argued that it should have a signed size type. I don’t believe the concern is about specific worries around overflow and its behavior; it is about the fact that sizes shouldn’t be thought of as modular arithmetic, and that C++ has no type to model natural numbers - the best approximation is signed types which model integers.

And again, I also see the advantages of the other side. But given that both types suffice, and there are advantages in both ways, I don’t think it’s worth arguing over.

l0rinc commented at 2:22 pm on September 19, 2024:

Mixing signed and unsigned numbers is a common source of confusion and bugs. Most coding guidelines recommend against mixing them.

But we are mixing them, that’s my main concern

0pos.nPos = (unsigned int)fileOutPos

I’d be fine with using a signed position in both cases, but if one is signed and the other is not, most of us will assume that we have to check the signed value for negatives (as the existence of this PR demonstrates). But changing the usages is outside the scope of this PR, thanks for explaining your reasoning (and for the interesting article), please resolve the comment.

laanwj commented at 7:37 pm on September 19, 2024:

Would be nice to add an assertion / check here to make that fileOutPos is within range of unsigned int before casting it, to avoid silent truncation. But yes, out of scope of this PR as the conversion isn’t introduced here.

in src/streams.h:442 in caac06f784

437     int64_t tell();
438 
439+    /** Wrapper around FileCommit(). */
440+    bool Commit();
441+
442+    /** Wrapper around TruncateFile(). */

l0rinc commented at 1:14 pm on September 19, 2024:

isn’t this obvious from the implementation?

sipa commented at 1:52 pm on September 19, 2024:

Someone shouldn’t need to look at the implementation to understand what a function does.

l0rinc commented at 2:11 pm on September 19, 2024:

I’m not exactly sure in which cases this comment would help us avoid having to check the sources - but I don’t have strong opinions about it, please resolve this comment.

in src/streams.h:437 in caac06f784

429@@ -430,9 +430,18 @@ class AutoFile
430     /** Implementation detail, only used internally. */
431     std::size_t detail_fread(Span<std::byte> dst);
432 
433+    /** Wrapper around fseek(). Will throw if seeking is not possible. */
434     void seek(int64_t offset, int origin);
435+
436+    /** Find position within the file. Will throw if unknown. */
437     int64_t tell();

l0rinc commented at 1:29 pm on September 19, 2024:

Slightly unrelated, but since we’re modifying its usage here: streams_tests doesn’t seem to cover this method at all.

(+ Commit, Truncate and many std::ios_base::failure throws seem also uncovered)

l0rinc changes_requested

l0rinc commented at 1:31 pm on September 19, 2024: contributor

Concept ACK

Could be the right time to do some minor cleanup (return unsigned) and maybe to cover missing method and exceptions with tests.

theStack approved

theStack commented at 2:21 pm on September 19, 2024: contributor

Code-review ACK caac06f784c5d94c6a5f7d0b586f9ddbbe55c369

DrahtBot requested review from l0rinc on Sep 19, 2024

l0rinc approved

l0rinc commented at 2:23 pm on September 19, 2024: contributor

ACK caac06f784c5d94c6a5f7d0b586f9ddbbe55c369

My only remaining concern is lack of related test coverage, but it’s not a blocker

tdb3 approved

tdb3 commented at 3:04 pm on September 19, 2024: contributor

CR ACK caac06f784c5d94c6a5f7d0b586f9ddbbe55c369

fanquake merged this on Sep 19, 2024

fanquake closed this on Sep 19, 2024

Follow-up after AutoFile position caching: remove unused code #30927

Code Coverage

Reviews

Conflicts