refactor: Allow std::byte in (Read/Write)(LE/BE) #31524

pull maflcko wants to merge 1 commits into bitcoin:master from maflcko:2412-more-byte changing 2 files +39 −22
  1. maflcko commented at 10:09 am on December 18, 2024: member

    Starting with C++17, std::byte is often (not always) a better choice over uint8_t for new code.

    However, the existing codebase discourages the use of std::byte, when helpers such as ReadLE32 are used. This is because calling code will be cluttered with byte-casts.

    Fix it by allowing std::byte pointers in ReadLE32 (and friends).

  2. refactor: Allow std::byte in Read(LE/BE) fa83bec78e
  3. DrahtBot commented at 10:09 am on December 18, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31524.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK sipa, theuni, fjahr, l0rinc

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  4. DrahtBot added the label Refactoring on Dec 18, 2024
  5. sipa commented at 3:51 pm on December 18, 2024: member
    utACK fa83bec78ef3e86445e522afa396c97b58eb1902
  6. theuni approved
  7. theuni commented at 4:33 pm on December 18, 2024: member

    Nice use of concepts.

    utACK fa83bec78ef3e86445e522afa396c97b58eb1902

  8. fjahr commented at 4:55 pm on December 31, 2024: contributor
    Code review ACK fa83bec78ef3e86445e522afa396c97b58eb1902
  9. in src/crypto/common.h:22 in fa83bec78e
    18+
    19+template <ByteType B>
    20+inline uint16_t ReadLE16(const B* ptr)
    21 {
    22     uint16_t x;
    23     memcpy(&x, ptr, 2);
    


    l0rinc commented at 6:54 pm on December 31, 2024:

    nit: given the simplicity of this PR, if you edit again, consider if it would be simpler to use sizeof expressions instead of the constants:

    0    memcpy(&x, ptr, sizeof(uint16_t));
    

    maflcko commented at 12:16 pm on January 2, 2025:
    I am not touching this line of code, and don’t want to, so maybe a separate issue or pull request seems more appropriate?

    l0rinc commented at 12:44 pm on January 2, 2025:
    sure, I don’t have a strong preference either way (while this reflects the method-name-to-size connection better, I admit it’s more verbose), please resolve the comment
  10. in src/crypto/common.h:19 in fa83bec78e
    15-uint16_t static inline ReadLE16(const unsigned char* ptr)
    16+template <typename B>
    17+concept ByteType = std::same_as<B, unsigned char> || std::same_as<B, std::byte>;
    18+
    19+template <ByteType B>
    20+inline uint16_t ReadLE16(const B* ptr)
    


    l0rinc commented at 6:55 pm on December 31, 2024:

    as far as I can tell inline templates are redundant here (constexpr also compiles, but not sure it’s valid):

    0template <ByteType B>
    1uint16_t ReadLE16(const B* ptr)
    

    maflcko commented at 12:16 pm on January 2, 2025:
    No, std::memcpy is not constexpr, so adding it serves no purpose.

    l0rinc commented at 12:44 pm on January 2, 2025:
    That was my understanding as well, what I suggested was just to remove the redundant inline

    maflcko commented at 12:55 pm on January 2, 2025:

    I don’t have a strong feeling, see #30933 (review).

    My preference would be to have a clang-tidy (plugin) rule for this, instead of using human review resources.


    ryanofsky commented at 1:20 pm on January 2, 2025:

    In commit “refactor: Allow std::byte in Read(LE/BE)” (fa83bec78ef3e86445e522afa396c97b58eb1902)

    There is a shorter syntax for this that I think would make code easier to read (thoughout this file):

    0-template <ByteType B>
    1-inline uint16_t ReadLE16(const B* ptr)
    2+inline uint16_t ReadLE16(const ByteType auto* ptr)
    

    maflcko commented at 1:31 pm on January 2, 2025:
    No strong opinion from my side. I’ll leave as-is for now, but other reviewers are encouraged to chime in with up/down votes.

    l0rinc commented at 1:47 pm on January 2, 2025:
    Personally, I find the ByteType auto construct slightly more confusing than the more verbose template.

    maflcko commented at 4:03 pm on January 2, 2025:

    Personally, I find the ByteType auto construct slightly more confusing than the more verbose template.

    Ok, resolving for now.

    If there is a preference about the style, my preference would be to have a clang-tidy (plugin) rule for this, instead of using human review resources.


    ryanofsky commented at 11:53 pm on January 2, 2025:

    I don’t think it’s necessary to change this, but the principle here is to avoid having unnecessary template functions and unnecessary template parameters when it is possible to use concepts directly. In this case there’s no point in defining a type B to stand in for ByteType and then using B one time when you can just use ByteType directly.

    I don’t think it would be possible to write a clang-tidy plugin to check for these cases because the check would need to scan the codebase and ensure there were no calls to these functions that were passed explicit template arguments. Clang-tidy can’t know if if a function is intended to be used as template function, or if it is just intended to be a generic function that references a few concepts. The point of writing more direct code is to reflect this intent.


    sipa commented at 0:09 am on January 3, 2025:

    @ryanofsky I don’t think that’s right; these are all just different notations for the same thing:

    • void func(ByteType auto b)
    • template<ByteType T> void func(T b)
    • template<typename T> requires ByteType<T> void func(T b)

    So even the shortest version can be invoked with explicitly template arguments (func<unsigned char>(5) for example).

    No opinion on whether the usage of Concept auto arguments should be encouraged, and/or the long version of it outlawed. But I do believe it’s possible to write a tidy rule for it in theory.


    maflcko commented at 11:14 am on January 3, 2025:

    I don’t think it would be possible to write a clang-tidy plugin to check for these cases

    I haven’t tried it, so I am not sure if it is possible. However, looking at the AST, it should be trivial to detect those functions. Matching any FunctionTemplateDecl whose FunctionDecl does not have any TemplateTypeParm in the AST should be sufficient to find those that can be written without template<...>.

    Example: https://godbolt.org/z/4rre4rPT6


    ryanofsky commented at 2:57 pm on January 3, 2025:

    Thanks sipa and maflcko. I didn’t realize you were allowed to pass explicit template arguments to functions declared without template parameters, but it makes sense given the equivalence shown. It is also interesting to see ASTs of these functions in godbolt, even though I did understand that clang-tidy could see when template parameters were referenced.

    Upshot seems to be that it would be possible to write a clang-tidy linter that would not force you to write broken code (because template arguments can be passed to functions not written with template parameters). But this linter could still force you to write code that does not reflect your intent, if you are are intending for a function to accept template arguments and the linter forces you to drop the parameters.

    I have to say, I don’t actually understand why 3 experienced c++ developers (maflcko, l0rinc, sipa) would be on the fence in this case about whether functions not intended to accept template parameters should be written as function templates, but vive la difference.


    sipa commented at 3:32 pm on January 3, 2025:
    @ryanofsky I was just commenting on (lack of) semantic difference between the different syntaxes. Personally I have a slight preference for the shorter notations, but no opinion on whether the project should adopt/encourage/enforce that.

    l0rinc commented at 5:06 pm on January 3, 2025:
    I just haven’t really seen this being used in the project, but I don’t mind if we do.

    l0rinc commented at 6:25 pm on January 3, 2025:
    The more I looked at it, the more I actually liked it. I’ve pushed a follow-up to #31601 - let’s discuss it there
  11. in src/crypto/common.h:16 in fa83bec78e
    12 #include <cstdint>
    13 #include <cstring>
    14 
    15-uint16_t static inline ReadLE16(const unsigned char* ptr)
    16+template <typename B>
    17+concept ByteType = std::same_as<B, unsigned char> || std::same_as<B, std::byte>;
    


    l0rinc commented at 6:57 pm on December 31, 2024:

    Byte type means that size is 1, right? If you edit again, consider simplifying to something like:

    0template <typename B>
    1concept ByteType = (sizeof(B) == 1);
    

    Or if you don’t like it, to the mentioned uint8_t

    0template <typename B>
    1concept ByteType = std::same_as<B, uint8_t> || std::same_as<B, std::byte>;
    

    maflcko commented at 12:16 pm on January 2, 2025:

    No, signed chars are explicitly disallowed. See also: git grep delete ./src/serialize.h

    0src/serialize.h:template <typename Stream, CharNotInt8 V> void Serialize(Stream&, V) = delete; // char serialization forbidden. Use uint8_t or int8_t
    1src/serialize.h:template <typename Stream, CharNotInt8 V> void Unserialize(Stream&, V) = delete; // char serialization forbidden. Use uint8_t or int8_t
    

    l0rinc commented at 12:46 pm on January 2, 2025:
    Yes, I saw, but is this the level where it should be disallowed (since this would technically work with signed char, disallowing it seems like a different abstraction level)? But if it is, please resolve the comment

    maflcko commented at 12:51 pm on January 2, 2025:
    If there is a use-case for this, it can be added, but I think limiting to “unsigned” bytes seems fine and preferable for now.
  12. in src/crypto/common.h:47 in fa83bec78e
    46+template <ByteType B>
    47+inline void WriteLE16(B* ptr, uint16_t x)
    48 {
    49     uint16_t v = htole16_internal(x);
    50     memcpy(ptr, &v, 2);
    51 }
    


    l0rinc commented at 6:59 pm on December 31, 2024:

    We could const the parameter and locals to obviate which one we’re writing:

    0template <ByteType B>
    1void WriteLE16(B* ptr, const uint16_t x)
    2{
    3    const uint16_t v{htole16_internal(x)};
    4    memcpy(ptr, &v, 2);
    5}
    

    maflcko commented at 12:16 pm on January 2, 2025:
    I am not touching this line of code, and don’t want to, so maybe a separate issue or pull request seems more appropriate?
  13. in src/crypto/chacha20.cpp:35 in fa83bec78e
    38+    input[2] = ReadLE32(key.data() + 8);
    39+    input[3] = ReadLE32(key.data() + 12);
    40+    input[4] = ReadLE32(key.data() + 16);
    41+    input[5] = ReadLE32(key.data() + 20);
    42+    input[6] = ReadLE32(key.data() + 24);
    43+    input[7] = ReadLE32(key.data() + 28);
    


    l0rinc commented at 7:10 pm on December 31, 2024:

    maybe we can roll back the loop now to something like:

    0void ChaCha20Aligned::SetKey(Span<const std::byte> key) noexcept
    1{
    2    assert(key.size() == KEYLEN);
    3    constexpr int words{KEYLEN / sizeof(uint32_t)};
    4    for (int i{0}; i < words; ++i) input[i] = ReadLE32(key.data() + i * sizeof(uint32_t));
    5    std::memset(&input[words], 0, (std::size(input) - words) * sizeof(uint32_t));
    6}
    

    maflcko commented at 12:16 pm on January 2, 2025:
    No, I don’t want to affect performance in one way or another on some compiler (optimization level)s
  14. l0rinc approved
  15. l0rinc commented at 7:14 pm on December 31, 2024: contributor

    ACK fa83bec78ef3e86445e522afa396c97b58eb1902

    Please see if any of the simplification suggestions apply, but I’m fine with it as is

  16. in src/crypto/common.h:49 in fa83bec78e
    49     uint16_t v = htole16_internal(x);
    50     memcpy(ptr, &v, 2);
    51 }
    52 
    53-void static inline WriteLE32(unsigned char* ptr, uint32_t x)
    54+template <ByteType B>
    


    l0rinc commented at 10:19 am on January 2, 2025:

    The title claims that only reads were changed, but since we have also adapted the writes, would it make sense to modify them in the example ChaCha as well to demonstrate their correctness?

     0diff --git a/src/crypto/chacha20.cpp b/src/crypto/chacha20.cpp
     1--- a/src/crypto/chacha20.cpp	(revision fa83bec78ef3e86445e522afa396c97b58eb1902)
     2+++ b/src/crypto/chacha20.cpp	(date 1735812552447)
     3@@ -59,7 +59,7 @@
     4 
     5 inline void ChaCha20Aligned::Keystream(Span<std::byte> output) noexcept
     6 {
     7-    unsigned char* c = UCharCast(output.data());
     8+    std::byte* c = output.data();
     9     size_t blocks = output.size() / BLOCKLEN;
    10     assert(blocks * BLOCKLEN == output.size());
    11 
    12@@ -161,8 +161,8 @@
    13 inline void ChaCha20Aligned::Crypt(Span<const std::byte> in_bytes, Span<std::byte> out_bytes) noexcept
    14 {
    15     assert(in_bytes.size() == out_bytes.size());
    16-    const unsigned char* m = UCharCast(in_bytes.data());
    17-    unsigned char* c = UCharCast(out_bytes.data());
    18+    const std::byte* m = in_bytes.data();
    19+    std::byte* c = out_bytes.data();
    20     size_t blocks = out_bytes.size() / BLOCKLEN;
    21     assert(blocks * BLOCKLEN == out_bytes.size());
    

    maflcko commented at 12:16 pm on January 2, 2025:
    thx, may do if I have to re-touch, or in a follow-up
  17. maflcko renamed this:
    refactor: Allow std::byte in Read(LE/BE)
    refactor: Allow std::byte in (Read/Write)(LE/BE)
    on Jan 3, 2025
  18. DrahtBot renamed this:
    refactor: Allow std::byte in (Read/Write)(LE/BE)
    refactor: Allow std::byte in (Read/Write)(LE/BE)
    on Jan 3, 2025
  19. ryanofsky merged this on Jan 3, 2025
  20. ryanofsky closed this on Jan 3, 2025

  21. maflcko deleted the branch on Jan 3, 2025
  22. laanwj commented at 12:24 pm on January 4, 2025: member
    Post-merge ACK fa83bec78ef3e86445e522afa396c97b58eb1902

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me