refactor: use Span for SipHash::Write #28085

pull theStack wants to merge 1 commits into bitcoin:master from theStack:202307-refactor-use_span_for_siphash changing 12 files +25 −23
  1. theStack commented at 1:44 AM on July 17, 2023: contributor

    This simple refactoring PR changes the interface for the SipHash arbitrary-data Write method to take a Span<unsigned char> instead of having to pass data and length. (Span<std::byte> seems to be more modern, but vectors of unsigned char are still used prety much everywhere where SipHash is called, and I didn't find it very appealing having to clutter the code with Make(Writable)ByteSpan helpers).

  2. DrahtBot commented at 1:44 AM on July 17, 2023: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK sipa, MarcoFalke, achow101

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Refactoring on Jul 17, 2023
  4. in src/crypto/siphash.h:30 in f4b0b0bd5b outdated
      26 | @@ -26,7 +27,7 @@ class CSipHasher
      27 |       */
      28 |      CSipHasher& Write(uint64_t data);
      29 |      /** Hash arbitrary bytes. */
      30 | -    CSipHasher& Write(const unsigned char* data, size_t size);
      31 | +    CSipHasher& Write(Span<const unsigned char> data);
    


    maflcko commented at 7:50 AM on July 17, 2023:
        CSipHasher& Write(Span<const B> data);
    

    I wonder if this should be made a template on B and instantiated for std::byte and unsigned char in the cpp file? Otherwise it will need to be touched again in the future if someone wants to hash a std::byte span.


    theStack commented at 2:28 PM on July 17, 2023:

    Sounds like a good idea! Couldn't get that to compile yet unfortunately 🙁 (which is maybe due to my lack of knowledge about templates). I tried with the following patch:

    diff --git a/src/crypto/siphash.cpp b/src/crypto/siphash.cpp
    index 2c328d0840..7cbafd4d98 100644
    --- a/src/crypto/siphash.cpp
    +++ b/src/crypto/siphash.cpp
    @@ -45,7 +45,8 @@ CSipHasher& CSipHasher::Write(uint64_t data)
         return *this;
     }
    
    -CSipHasher& CSipHasher::Write(Span<const unsigned char> data)
    +template <typename B>
    +CSipHasher& CSipHasher::Write(Span<const B> data)
     {
         uint64_t v0 = v[0], v1 = v[1], v2 = v[2], v3 = v[3];
         uint64_t t = tmp;
    @@ -73,6 +74,8 @@ CSipHasher& CSipHasher::Write(Span<const unsigned char> data)
    
         return *this;
     }
    +template CSipHasher& CSipHasher::Write(Span<const unsigned char>);
    +template CSipHasher& CSipHasher::Write(Span<const std::byte>);
    
     uint64_t CSipHasher::Finalize() const
     {
    diff --git a/src/crypto/siphash.h b/src/crypto/siphash.h
    index 4fb3dc2f25..748f6c1a39 100644
    --- a/src/crypto/siphash.h
    +++ b/src/crypto/siphash.h
    @@ -27,7 +27,8 @@ public:
          */
         CSipHasher& Write(uint64_t data);
         /** Hash arbitrary bytes. */
    -    CSipHasher& Write(Span<const unsigned char> data);
    +    template <typename B = unsigned char>
    +    CSipHasher& Write(Span<const B> data);
         /** Compute the 64-bit SipHash-2-4 of the data written so far. The object remains untouched. */
         uint64_t Finalize() const;
     };
    
    

    which led to the complaint (using clang 13.0.0)

    util/bytevectorhash.cpp:19:35: error: no matching member function for call to 'Write'
        return CSipHasher(m_k0, m_k1).Write(input).Finalize();
               ~~~~~~~~~~~~~~~~~~~~~~~^~~~~
    ./crypto/siphash.h:28:17: note: candidate function not viable: no known conversion from 'const std::vector<unsigned char>' to 'uint64_t' (aka 'unsigned long long') for 1st argument
        CSipHasher& Write(uint64_t data);
                    ^
    ./crypto/siphash.h:31:17: note: candidate template ignored: could not match 'Span' against 'vector'
        CSipHasher& Write(Span<const B> data);
                    ^
    1 error generated.
    

    Not sure why the match 'Span' against 'vector' works fine without template, but doesn't with template? 🤔


    maflcko commented at 2:36 PM on July 17, 2023:

    Ah, sorry, the const prevents a match. You can just drop it. Or alternatively put in the header two overloads of the form:

    CSipHasher& Write(Span<const unsigned char> data);
    CSipHasher& Write(Span<const std::byte> data);
    

    And the make one call the other in the cpp file.


    sipa commented at 2:53 PM on July 17, 2023:

    While I don't obect to having this particular function templatized, I don't think it's a scalable approach to introduce dual char/byte functions all over the codebase, just because not everything is using bytes.

    IMO, the goal should be to only have a byte interface here; The options for that are:

    • Just ~byte~ bite the bullet now, and introduce MakeByteSpan in all call sites.
    • Stick with a Span<const unsigned char> interface for now, and revisit at some point in the future when more of the codebase is using bytes.

    I think both of these options are preferable to duplicating/templatizing the function.


    theStack commented at 4:11 PM on July 17, 2023:

    Ah, sorry, the const prevents a match. You can just drop it.

    FWIW, removing the const from the Span template type didn't change anything, the error message "could not match 'Span' against 'vector'" still appears. (Or, did I misinterpret your second sentence and with "drop it" you meant to drop the template idea in general?). @sipa: Fair enough. There are some places already with templates for byte types (e.g. FastRandomContext::randbytes<B>, ParseHex<B> etc. to name two that I found quickly), so I wasn't aware that this is controversial and would have been fine adding a std::byte variant, but you have a valid point. I'll follow the path of least resistance/effort then and keep it as it is, i.e. stay with Span<const unsigned char> for now.


    sipa commented at 4:24 PM on July 17, 2023:

    I wouldn't say it's controversial - I have no problem with adding this.

    I just want to question if duplicating lots of things (if here, I suspect there will be several others too) is the right way to go.


    maflcko commented at 6:53 AM on July 18, 2023:

    the error message "could not match 'Span' against 'vector'" still appears. (Or, did I misinterpret your second sentence and with "drop it" you meant to drop the template idea in general?).

    Nah, clearly I don't understand C++. The overload should work, though, similar to the HexStr overloads:

    /**
     * Convert a span of bytes to a lower-case hexadecimal string.
     */
    std::string HexStr(const Span<const uint8_t> s);
    inline std::string HexStr(const Span<const char> s) { return HexStr(MakeUCharSpan(s)); }
    inline std::string HexStr(const Span<const std::byte> s) { return HexStr(MakeUCharSpan(s)); }
    

    Though, looks like this is no longer relevant as of the current state of the pull.

    There are some places already with templates for byte types (e.g. FastRandomContext::randbytes<B>

    I think that one is different, since it is not trivial to cast to the right type at the call site IIUC. If you only need a (mutable) reference, a reinterpret_cast<std::vector<std::byte>&>(mut) may be sufficient (obviously only call this on a byte-like vector). But to own the memory in a std::vector<std::byte> at least calling the move constructor is required in addition, which would prevent copy elision.

  5. in src/crypto/siphash.cpp:55 in f4b0b0bd5b outdated
      53 |      uint8_t c = count;
      54 |  
      55 | -    while (size--) {
      56 | -        t |= ((uint64_t)(*(data++))) << (8 * (c % 8));
      57 | +    while (data.size() > 0) {
      58 | +        t |= ((uint64_t)data.front()) << (8 * (c % 8));
    


    maflcko commented at 7:51 AM on July 17, 2023:
            t |= uint64_t{data.front()} << (8 * (c % 8));
    

    style-nit, if it compiles?


    theStack commented at 4:11 PM on July 17, 2023:

    Done.

  6. refactor: use Span for SipHash::Write
    Co-authored-by: MarcoFalke <*~=`'#}+{/-|&$^_@721217.xyz>
    7d92b1430a
  7. theStack force-pushed on Jul 17, 2023
  8. sipa commented at 4:56 PM on July 17, 2023: member

    utACK 7d92b1430a6fd42c4438810640576830d0ff8d13

  9. maflcko commented at 8:21 AM on July 18, 2023: member

    lgtm ACK 7d92b1430a6fd42c4438810640576830d0ff8d13

  10. achow101 commented at 8:19 PM on July 19, 2023: member

    ACK 7d92b1430a6fd42c4438810640576830d0ff8d13

  11. achow101 merged this on Jul 19, 2023
  12. achow101 closed this on Jul 19, 2023

  13. sidhujag referenced this in commit 9d63da50a5 on Jul 21, 2023
  14. bitcoin locked this on Jul 18, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-14 21:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me