Implement Keccak and SHA3_256 #19841

pull sipa wants to merge 3 commits into bitcoin:master from sipa:202008_sha3 changing 5 files +322 −0
  1. sipa commented at 8:00 pm on August 30, 2020: member

    Add a simple (and initially unoptimized) Keccak/SHA3 implementation based on https://github.com/mjosaarinen/tiny_sha3/blob/master/sha3.c, as one will be needed for TORv3 support (the conversion from BIP155 encoding to .onion notation uses a SHA3-based checksum). In follow-up commits, a benchmark is added, and the Keccakf function is unrolled for a (for me) 4.9x speedup.

    Test vectors are taken from https://csrc.nist.gov/projects/cryptographic-algorithm-validation-program/secure-hashing#sha3vsha3vss.

  2. sipa commented at 8:02 pm on August 30, 2020: member
    Ping @vasild.
  3. jonatack commented at 8:15 pm on August 30, 2020: member
    Nice!
  4. DrahtBot added the label Build system on Aug 30, 2020
  5. DrahtBot added the label Utils/log/libs on Aug 30, 2020
  6. sipa force-pushed on Aug 30, 2020
  7. DrahtBot commented at 1:17 am on August 31, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #19145 (Add hash_type MUHASH for gettxoutsetinfo by fjahr)
    • #19055 (Add MuHash3072 implementation by fjahr)
    • #18014 (lib: Optimizing siphash implementation by elichai)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  8. fanquake commented at 2:25 am on August 31, 2020: member
    Concept ACK on this approach to adding a Keccak/SHA3 implementation.
  9. MarcoFalke removed the label Build system on Aug 31, 2020
  10. naumenkogs commented at 7:22 am on August 31, 2020: member
    Concept ACK as a requirement for Torv3.
  11. practicalswift commented at 9:46 am on August 31, 2020: contributor

    Concept ACK

    Nice simple implementation: will fuzz it to see if I’m able to reject that it is as robust as it looks :)

  12. laanwj commented at 1:31 pm on August 31, 2020: member
    This is a nice compact implementation of KeccaK/SHA3 (even after the unrolling). Code review ACK 2263ab190e625f81d154ce6581f37adcdb31ef8b
  13. vasild approved
  14. vasild commented at 3:31 pm on August 31, 2020: member

    ACK 2263ab190

    I tested this with #19845, changing the code to use this SHA3_256 implementation and it works (produces the same checksum in the TORv3 address):

    0// CHECKSUM = H(".onion checksum" | PUBKEY | VERSION)[:2]
    1
    2hasher.Write(Span<const uint8_t>{reinterpret_cast<const uint8_t*>(".onion checksum"), 15});
    3hasher.Write(m_addr);
    4hasher.Write(Span<const uint8_t>{reinterpret_cast<const uint8_t*>("\x03"), 1});
    

    I also tested this implementation against the same vectors as TestSHA256, just like in #19845 and the results are the same (as the SHA3_256 from that PR, not as the SHA256!).

    The last two tests:

    0    TestSHA3_256(std::string(1000000, 'a'),
    1                 "5c8875ae474a3634ba4fd55ec85bffd661f32aca75c6d699d0cdcb6c115891c1");
    2    TestSHA3_256(test1, "7d06279100b45d54a479e973aa01ea0cceafb656c45238557813c9c39ceb6403");
    

    take 2.8s VS 0.4s for SHA256 (in non optimized build).

    Verdict: likely it works correctly and there may be room for performance improvement, but this is not relevant for the TORv3 code.

    Ideally this would get merged first and then #19845 adapted to use it (and ditch the implementation imported in that PR).

  15. sipa commented at 5:26 pm on August 31, 2020: member

    @vasild FWIW, you should be able to use hasher.Write(MakeUCharSpan(".onion checksum")).

    Regarding speed: the SHA256 implementations we have take advantage of SSE4/AVX2/SHA-NI instructions when available. I didn’t use any of those for SHA3 as it seems unnecessary for its current use case. In an optimized build, but with HW accelerations for SHA256 disabled, the SHA3 code here is very close to SHA256:

    ns/byte byte/s err% total benchmark
    1.69 590,913,173.58 0.1% 0.02 SHA1
    4.12 242,749,791.96 0.0% 0.05 SHA256
    4.11 243,277,749.23 0.9% 0.04 SHA3_256_1M
    2.90 344,486,900.54 0.1% 0.03 SHA512

    On the same system (which has SHA-NI), with hardware accelerations enabled it is of course:

    ns/byte byte/s err% total benchmark
    0.63 1,586,480,646.52 0.0% 0.01 SHA256
  16. vasild commented at 7:37 pm on August 31, 2020: member
    Right, all is good. And SHA3-256 is supposed to be slower than SHA256.
  17. laanwj commented at 9:10 am on September 1, 2020: member

    Verdict: likely it works correctly and there may be room for performance improvement, but this is not relevant for the TORv3 code.

    As it is not relevant to our use case, I don’t think this is worth it. I wouldn’t like to see the added complication of assembly implementations and special intrinsics just for computing an address checksum (it would be different if it was for packet checksums). Brevity (and of course correctness) of the implementation is the important thing here.

  18. theStack commented at 9:02 am on September 2, 2020: member
    Concept ACK It seems that currently the header <array> is unnecessarily included both in the implementation and the unit test source files. // EDIT: Same for <stdint.h> and <stdlib.h> in sha3.h.
  19. laanwj commented at 10:14 am on September 3, 2020: member
    @theStack Agree about array, it doesn’t seem to be used at all. Wouldnt’ stdint.h be necessary for sized integer types such as uint32_t? (yes, it’s probably included indirectly somehow, but we like to be explicit about dependencies)
  20. theStack commented at 10:37 am on September 3, 2020: member

    @theStack Agree about array, it doesn’t seem to be used at all. Wouldnt’ stdint.h be necessary for sized integer types such as uint32_t? (yes, it’s probably included indirectly somehow, but we like to be explicit about dependencies) @laanwj: Oh indeed, you are right. For some reason I forgot that sized integer types and size_t are not integral part of the language but defined in headers… 🤦‍♂️ so both <stdint.h> are <stdlib.h> are needed indeed (though one could probably use the C++ counterparts <cstdint> and <cstdlib> instead, but that’d be more of a general discussion not related only to this PR).

  21. sipa commented at 11:10 pm on September 3, 2020: member
    I added <array> to get access to std::begin and std::end.
  22. laanwj commented at 10:09 am on September 4, 2020: member

    I forgot that sized integer types and size_t are not integral part of the language but defined in headers… man_facepalming

    Haah. I try to forget that sometimes …

    I added to get access to std::begin and std::end.

    Oh, of course!

  23. theStack commented at 9:53 pm on September 5, 2020: member

    I added <array> to get access to std::begin and std::end.

    Ah, I wasn’t aware those are declared in <array>. Sorry for the noise.

    For reviewing this PR, I thought it would be nice to validate this implementation against a standard one with random input data (in addition to only the test vectors). I decided for python’s hashlib which has sha3_256 available since Version 3.6. Thanks to pybind11 it’s super-easy to create Python bindings for C++ code. With less than 20 LOC binding code, SHA3_256 can be used in Python in a natural way:

    0import sha3_sipa
    1
    2
    3def sha3_256_sipa(input):
    4    return sha3_sipa.SHA3_256().Write(input).Finalize()
    

    The whole mini-project with Makefile and the comparison test with random input data is on my branch https://github.com/theStack/bitcoin/tree/test-sha3_256-against-python3-implementation If anyone wants to try out, the following instructions should work on any recent Ubuntu with Python3.6+:

    0$ sudo apt-get install python3-dev
    1$ sudo pip3 install pybind11
    2$ cd sha3_test
    3$ make
    4$ ./sha3_test.py
    

    Running this already for a few hours, with millions of random inputs (random length up to 1 MB) and as expected, all of them passed. Will as next step code-review the PR tomorrow.

  24. elichai commented at 7:46 pm on September 6, 2020: contributor
    I like how compact this is :)
  25. Implement keccak-f[1600] and SHA3-256 2ac8bf9583
  26. Add SHA3 benchmark 3f01ddb01b
  27. Unroll Keccak-f implementation ab654c7d58
  28. sipa force-pushed on Sep 7, 2020
  29. sipa commented at 1:42 am on September 7, 2020: member
    Improved the tests and comments a bit. @theStack Nice, that’s pretty neat! @elichai It was much more compact before unrolling, but I couldn’t resist a 4.9x speedup with such a simple change (even though it really doesn’t matter here…). @practicalswift You’re of course welcome to do any testing you like, but I don’t think fuzzing for robustness is all that useful here. The KeccakF function has completely fixed code path and memory accesses, so it either fails for everything or for nothing. The only input-dependent behavior is in SHA3_256::Write, but even there the only thing that matters is the length of the inputs, which is well-covered by the included tests. It’d be slightly more useful to test that hashing the same data, chopped up in differently sized chunks, gives the same result.
  30. vasild approved
  31. vasild commented at 7:49 am on September 7, 2020: member

    ACK ab654c7

    It’d be slightly more useful to test that hashing the same data, chopped up in differently sized chunks, gives the same result.

    I tested that as I ran the new code through the same tests as the existent SHA256 which uses TestVector.

  32. elichai commented at 10:29 am on September 7, 2020: contributor

    @elichai It was much more compact before unrolling, but I couldn’t resist a 4.9x speedup with such a simple change (even though it really doesn’t matter here…).

    haha, you could instead add #pragma unroll 5/24/25 :P

  33. sipa commented at 7:01 pm on September 7, 2020: member

    haha, you could instead add #pragma unroll 5/24/25 :P

    I tried that. Numbers I get:

    ns/byte byte/s err% total benchmark
    28.88 34,630,178.09 1.0% 0.33 3f01ddb01bfffd49dfa131898d1c674ac5d0ac99 (before unrolling)
    5.88 170,120,654.67 0.8% 0.07 ab654c7d587b33d62230394663020439f80cee28 (after unrolling)
    5.60 178,667,398.49 0.4% 0.06 3f01ddb01bfffd49dfa131898d1c674ac5d0ac99, but with pragmas to unroll everything in one round

    So that means it’s (on GCC 9.3.0, on x86_64 Linux) possible to get slightly better performance with forced unrolling, but also a lot more fragile (pragmas may work different across compilers and without the pragmas the code is a lot worse).

    I prefer keeping the code as-is. It’s still very readable (I think), is platform neutral, and has close to ideal performance.

  34. elichai commented at 8:42 am on September 8, 2020: contributor

    I prefer keeping the code as-is. It’s still very readable (I think), is platform neutral, and has close to ideal performance.

    Yeah it probably won’t work on MSVC

  35. practicalswift commented at 5:43 am on September 10, 2020: contributor
    ACK ab654c7d587b33d62230394663020439f80cee28 – patch looks correct and no sanitizer complaints when doing some basic fuzz testing of the added code (remember: don’t trust: fuzz!) :)
  36. laanwj commented at 1:46 pm on September 10, 2020: member

    Thanks @theStack. I have run your cross-verification for a while here as well, no errors came up.

    I would prefer to keep the code as-is, with an explicit unroll. Even though a pragma would make the code shorter, it’s non standard C++ functionality.

    re-ACK ab654c7d587b33d62230394663020439f80cee28

  37. laanwj merged this on Sep 10, 2020
  38. laanwj closed this on Sep 10, 2020

  39. sidhujag referenced this in commit 1c08f95d3f on Sep 11, 2020
  40. deadalnix referenced this in commit fc167a7f4f on Feb 9, 2021
  41. kittywhiskers referenced this in commit b694bb2ce4 on May 18, 2021
  42. kittywhiskers referenced this in commit c440ddc290 on May 19, 2021
  43. kittywhiskers referenced this in commit 196d7645a5 on May 20, 2021
  44. kittywhiskers referenced this in commit 1c727a3159 on May 20, 2021
  45. kittywhiskers referenced this in commit f651595217 on May 20, 2021
  46. kittywhiskers referenced this in commit 41eaee235e on May 20, 2021
  47. PastaPastaPasta referenced this in commit b76e7fec1f on May 21, 2021
  48. random-zebra referenced this in commit b4751e10ce on Aug 11, 2021
  49. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 01:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me