sipa
commented at 8:00 pm on August 30, 2020:
member
Add a simple (and initially unoptimized) Keccak/SHA3 implementation based on https://github.com/mjosaarinen/tiny_sha3/blob/master/sha3.c, as one will be needed for TORv3 support (the conversion from BIP155 encoding to .onion notation uses a SHA3-based checksum). In follow-up commits, a benchmark is added, and the Keccakf function is unrolled for a (for me) 4.9x speedup.
#18014 (lib: Optimizing siphash implementation by elichai)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
fanquake
commented at 2:25 am on August 31, 2020:
member
Concept ACK on this approach to adding a Keccak/SHA3 implementation.
MarcoFalke removed the label
Build system
on Aug 31, 2020
naumenkogs
commented at 7:22 am on August 31, 2020:
member
Concept ACK as a requirement for Torv3.
practicalswift
commented at 9:46 am on August 31, 2020:
contributor
Concept ACK
Nice simple implementation: will fuzz it to see if I’m able to reject that it is as robust as it looks :)
laanwj
commented at 1:31 pm on August 31, 2020:
member
This is a nice compact implementation of KeccaK/SHA3 (even after the unrolling).
Code review ACK2263ab190e625f81d154ce6581f37adcdb31ef8b
vasild approved
vasild
commented at 3:31 pm on August 31, 2020:
member
ACK2263ab190
I tested this with #19845, changing the code to use this SHA3_256 implementation and it works (produces the same checksum in the TORv3 address):
I also tested this implementation against the same vectors as TestSHA256, just like in #19845 and the results are the same (as the SHA3_256 from that PR, not as the SHA256!).
take 2.8s VS 0.4s for SHA256 (in non optimized build).
Verdict: likely it works correctly and there may be room for performance improvement, but this is not relevant for the TORv3 code.
Ideally this would get merged first and then #19845 adapted to use it (and ditch the implementation imported in that PR).
sipa
commented at 5:26 pm on August 31, 2020:
member
@vasild FWIW, you should be able to use hasher.Write(MakeUCharSpan(".onion checksum")).
Regarding speed: the SHA256 implementations we have take advantage of SSE4/AVX2/SHA-NI instructions when available. I didn’t use any of those for SHA3 as it seems unnecessary for its current use case. In an optimized build, but with HW accelerations for SHA256 disabled, the SHA3 code here is very close to SHA256:
ns/byte
byte/s
err%
total
benchmark
1.69
590,913,173.58
0.1%
0.02
SHA1
4.12
242,749,791.96
0.0%
0.05
SHA256
4.11
243,277,749.23
0.9%
0.04
SHA3_256_1M
2.90
344,486,900.54
0.1%
0.03
SHA512
On the same system (which has SHA-NI), with hardware accelerations enabled it is of course:
ns/byte
byte/s
err%
total
benchmark
0.63
1,586,480,646.52
0.0%
0.01
SHA256
vasild
commented at 7:37 pm on August 31, 2020:
member
laanwj
commented at 9:10 am on September 1, 2020:
member
Verdict: likely it works correctly and there may be room for performance improvement, but this is not relevant for the TORv3 code.
As it is not relevant to our use case, I don’t think this is worth it. I wouldn’t like to see the added complication of assembly implementations and special intrinsics just for computing an address checksum (it would be different if it was for packet checksums). Brevity (and of course correctness) of the implementation is the important thing here.
theStack
commented at 9:02 am on September 2, 2020:
member
Concept ACK
It seems that currently the header <array> is unnecessarily included both in the implementation and the unit test source files.
// EDIT: Same for <stdint.h> and <stdlib.h> in sha3.h.
laanwj
commented at 10:14 am on September 3, 2020:
member
@theStack Agree about array, it doesn’t seem to be used at all. Wouldnt’ stdint.h be necessary for sized integer types such as uint32_t? (yes, it’s probably included indirectly somehow, but we like to be explicit about dependencies)
theStack
commented at 10:37 am on September 3, 2020:
member
@theStack Agree about array, it doesn’t seem to be used at all. Wouldnt’ stdint.h be necessary for sized integer types such as uint32_t? (yes, it’s probably included indirectly somehow, but we like to be explicit about dependencies) @laanwj: Oh indeed, you are right. For some reason I forgot that sized integer types and size_t are not integral part of the language but defined in headers… 🤦♂️ so both <stdint.h> are <stdlib.h> are needed indeed (though one could probably use the C++ counterparts <cstdint> and <cstdlib> instead, but that’d be more of a general discussion not related only to this PR).
sipa
commented at 11:10 pm on September 3, 2020:
member
I added <array> to get access to std::begin and std::end.
laanwj
commented at 10:09 am on September 4, 2020:
member
I forgot that sized integer types and size_t are not integral part of the language but defined in headers… man_facepalming
Haah. I try to forget that sometimes …
I added to get access to std::begin and std::end.
Oh, of course!
theStack
commented at 9:53 pm on September 5, 2020:
member
I added <array> to get access to std::begin and std::end.
Ah, I wasn’t aware those are declared in <array>. Sorry for the noise.
For reviewing this PR, I thought it would be nice to validate this implementation against a standard one with random input data (in addition to only the test vectors). I decided for python’s hashlib which has sha3_256 available since Version 3.6. Thanks to pybind11 it’s super-easy to create Python bindings for C++ code. With less than 20 LOC binding code, SHA3_256 can be used in Python in a natural way:
0$ sudo apt-get install python3-dev
1$ sudo pip3 install pybind11
2$ cd sha3_test
3$ make
4$ ./sha3_test.py
Running this already for a few hours, with millions of random inputs (random length up to 1 MB) and as expected, all of them passed. Will as next step code-review the PR tomorrow.
elichai
commented at 7:46 pm on September 6, 2020:
contributor
I like how compact this is :)
Implement keccak-f[1600] and SHA3-2562ac8bf9583
Add SHA3 benchmark3f01ddb01b
Unroll Keccak-f implementationab654c7d58
sipa force-pushed
on Sep 7, 2020
sipa
commented at 1:42 am on September 7, 2020:
member
Improved the tests and comments a bit.
@theStack Nice, that’s pretty neat!
@elichai It was much more compact before unrolling, but I couldn’t resist a 4.9x speedup with such a simple change (even though it really doesn’t matter here…).
@practicalswift You’re of course welcome to do any testing you like, but I don’t think fuzzing for robustness is all that useful here. The KeccakF function has completely fixed code path and memory accesses, so it either fails for everything or for nothing. The only input-dependent behavior is in SHA3_256::Write, but even there the only thing that matters is the length of the inputs, which is well-covered by the included tests. It’d be slightly more useful to test that hashing the same data, chopped up in differently sized chunks, gives the same result.
vasild approved
vasild
commented at 7:49 am on September 7, 2020:
member
ACKab654c7
It’d be slightly more useful to test that hashing the same data, chopped up in differently sized chunks, gives the same result.
I tested that as I ran the new code through the same tests as the existent SHA256 which uses TestVector.
elichai
commented at 10:29 am on September 7, 2020:
contributor
@elichai It was much more compact before unrolling, but I couldn’t resist a 4.9x speedup with such a simple change (even though it really doesn’t matter here…).
haha, you could instead add #pragma unroll 5/24/25 :P
sipa
commented at 7:01 pm on September 7, 2020:
member
haha, you could instead add #pragma unroll 5/24/25 :P
3f01ddb01bfffd49dfa131898d1c674ac5d0ac99, but with pragmas to unroll everything in one round
So that means it’s (on GCC 9.3.0, on x86_64 Linux) possible to get slightly better performance with forced unrolling, but also a lot more fragile (pragmas may work different across compilers and without the pragmas the code is a lot worse).
I prefer keeping the code as-is. It’s still very readable (I think), is platform neutral, and has close to ideal performance.
elichai
commented at 8:42 am on September 8, 2020:
contributor
I prefer keeping the code as-is. It’s still very readable (I think), is platform neutral, and has close to ideal performance.
Yeah it probably won’t work on MSVC
practicalswift
commented at 5:43 am on September 10, 2020:
contributor
ACKab654c7d587b33d62230394663020439f80cee28 – patch looks correct and no sanitizer complaints when doing some basic fuzz testing of the added code (remember: don’t trust: fuzz!) :)
laanwj
commented at 1:46 pm on September 10, 2020:
member
Thanks @theStack. I have run your cross-verification for a while here as well, no errors came up.
I would prefer to keep the code as-is, with an explicit unroll. Even though a pragma would make the code shorter, it’s non standard C++ functionality.
re-ACKab654c7d587b33d62230394663020439f80cee28
laanwj merged this
on Sep 10, 2020
laanwj closed this
on Sep 10, 2020
sidhujag referenced this in commit
1c08f95d3f
on Sep 11, 2020
deadalnix referenced this in commit
fc167a7f4f
on Feb 9, 2021
kittywhiskers referenced this in commit
b694bb2ce4
on May 18, 2021
kittywhiskers referenced this in commit
c440ddc290
on May 19, 2021
kittywhiskers referenced this in commit
196d7645a5
on May 20, 2021
kittywhiskers referenced this in commit
1c727a3159
on May 20, 2021
kittywhiskers referenced this in commit
f651595217
on May 20, 2021
kittywhiskers referenced this in commit
41eaee235e
on May 20, 2021
PastaPastaPasta referenced this in commit
b76e7fec1f
on May 21, 2021
random-zebra referenced this in commit
b4751e10ce
on Aug 11, 2021
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2024-12-19 03:12 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me