lib: Optimizing siphash implementation #18014

pull elichai wants to merge 2 commits into bitcoin:master from elichai:2020-01-siphash changing 3 files +73 −18
  1. elichai commented at 5:39 pm on January 28, 2020: contributor

    Hi, This builds on #18013

    Before anything I want to point out that we have 3 SipHash implementations CSipHasher, SipHashUint256, SipHashUint256Extra. this PR touches only the first one(not used in any hashmap AFAIK).

    I re-implemented the CSipHasher with performance up to 3X times faster for big strings (BUFFER_SIZE = 1000*1000) and 5%-19% faster for small strings (3 bytes, because a minute of syncing showed me that 3 bytes siphash is something that happens quite often)

    Benchmarks against other siphash implementations can be found here: https://gist.github.com/elichai/abdebeeaee7e581bc74c75cb9487b3af (code: https://github.com/elichai/siphash-bench)

    My implementation was inspired by the one in Rust’s stdlib (https://github.com/rust-lang/rust/blob/master/src/libcore/hash/sip.rs) which rust-bitcoin use in https://github.com/rust-bitcoin/bitcoin_hashes.

    Before:

     0$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
     1#Benchmark                                      evals       iterations  total       min         max         median      
     2SipHash                                         5           700         4.20809     0.0011912   0.00122256  0.00120163  
     3SipHash_32b                                     5           40000000    4.1793      2.08632e-08 2.0948e-08  2.08949e-08 
     4SipHash_3b                                      5           40000000    3.18892     1.56861e-08 1.64617e-08 1.5749e-08  
     5$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
     6#Benchmark                                      evals       iterations  total       min         max         median      
     7SipHash                                         5           700         4.24318     0.00120808  0.00121676  0.00121336  
     8SipHash_32b                                     5           40000000    4.23684     2.06753e-08 2.16015e-08 2.14555e-08 
     9SipHash_3b                                      5           40000000    3.15998     1.54582e-08 1.61558e-08 1.58555e-08 
    10$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
    11#Benchmark                                      evals       iterations  total       min         max         median      
    12SipHash                                         5           700         4.2472      0.0012113   0.00121558  0.00121324  
    13SipHash_32b                                     5           40000000    4.20925     2.09789e-08 2.11288e-08 2.10327e-08 
    14SipHash_3b                                      5           40000000    3.10727     1.54352e-08 1.55982e-08 1.55463e-08 
    15$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
    16#Benchmark                                      evals       iterations  total       min         max         median      
    17SipHash                                         5           700         4.37224     0.00124528  0.00125769  0.0012473   
    18SipHash_32b                                     5           40000000    4.26011     2.1214e-08  2.134e-08   2.13171e-08 
    19SipHash_3b                                      5           40000000    3.18842     1.59033e-08 1.59832e-08 1.59432e-08 
    

    After:

     0$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
     1#Benchmark                                      evals       iterations  total       min         max         median      
     2SipHash                                         5           700         1.36254     0.000386656 0.000392219 0.000388635 
     3SipHash_32b                                     5           40000000    4.31286     2.13773e-08 2.17857e-08 2.16181e-08 
     4SipHash_3b                                      5           40000000    2.91375     1.44794e-08 1.46495e-08 1.45848e-08 
     5$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
     6#Benchmark                                      evals       iterations  total       min         max         median      
     7SipHash                                         5           700         1.32683     0.000372232 0.000386258 0.000376842 
     8SipHash_32b                                     5           40000000    4.15533     2.069e-08   2.08661e-08 2.07693e-08 
     9SipHash_3b                                      5           40000000    2.77612     1.38154e-08 1.3988e-08  1.38665e-08 
    10$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
    11#Benchmark                                      evals       iterations  total       min         max         median      
    12SipHash                                         5           700         1.36596     0.00038727  0.000392932 0.000391074 
    13SipHash_32b                                     5           40000000    4.27694     2.13219e-08 2.14471e-08 2.13672e-08 
    14SipHash_3b                                      5           40000000    2.75763     1.37529e-08 1.38244e-08 1.37862e-08 
    15$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
    16#Benchmark                                      evals       iterations  total       min         max         median      
    17SipHash                                         5           700         1.34316     0.000376846 0.000386059 0.000385079 
    18SipHash_32b                                     5           40000000    4.23368     2.1066e-08  2.14124e-08 2.11283e-08 
    19SipHash_3b                                      5           40000000    2.81931     1.40299e-08 1.42123e-08 1.40787e-08 
    

    ~Also made the benchmarks print a more readable output(https://gist.github.com/elichai/812c8866a69959404b480d968e080475), this is limited by up to 47 chars of benchmark name, so as long as we don’t add more names like CHACHA20_POLY1305_AEAD_256BYTES_ENCRYPT_DECRYPT and longer then it will be fine. (it can probably be adjustable but that will require iterating over all the tests before running them to determine the longest cell and I thought the 47 limit is more than reasonable)~

  2. elichai renamed this:
    Optimizing siphash implementation
    lib: Optimizing siphash implementation
    on Jan 28, 2020
  3. DrahtBot added the label Utils/log/libs on Jan 28, 2020
  4. elichai commented at 8:52 pm on January 28, 2020: contributor
    (the Travis failure isn’t related, it’s a bug in the s390x machines)
  5. maflcko commented at 0:41 am on January 29, 2020: member
    @elichai The issue is upstream, see #18016
  6. DrahtBot commented at 0:54 am on January 29, 2020: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #26158 (bench: add “priority level” to the benchmark framework by furszy)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. laanwj commented at 5:19 pm on January 29, 2020: member
    That’s a very nice speed improvement!
  8. sipa commented at 6:00 pm on January 29, 2020: member
    Where do we use variable-length SipHash?
  9. emilengler commented at 6:08 pm on January 29, 2020: contributor
    I just tested it and the speed improvement is good. Also the formatting is much prettier now. Will review the code now
  10. in src/bench/bench.cpp:24 in cd5f26c260 outdated
    19@@ -20,7 +20,15 @@ const std::function<void(const std::string&)> G_TEST_LOG_FUN{};
    20 
    21 void benchmark::ConsolePrinter::header()
    22 {
    23-    std::cout << "# Benchmark, evals, iterations, total, min, max, median" << std::endl;
    24+    std::cout << "#"
    25+              << std::left << std::left << std::setw(47) << "Benchmark"
    


    emilengler commented at 6:14 pm on January 29, 2020:
    I played a bit around with it and I find it better to change this to 16. This will be much more friendly on smaller terminals and/or display

    elichai commented at 8:47 pm on January 29, 2020:
    Well that doesn’t work with tests like CHACHA20_POLY1305_AEAD_256BYTES_ENCRYPT_DECRYPT. see the last paragraph in my post

    emilengler commented at 2:35 pm on January 30, 2020:
    Couldn’t this be dynamically calculated then?

    elichai commented at 2:36 pm on January 30, 2020:

    this is limited by up to 47 chars of benchmark name, so as long as we don’t add more names like CHACHA20_POLY1305_AEAD_256BYTES_ENCRYPT_DECRYPT and longer then it will be fine. (it can probably be adjustable but that will require iterating over all the tests before running them to determine the longest cell and I thought the 47 limit is more than reasonable)

  11. in src/bench/bench.cpp:57 in cd5f26c260 outdated
    52@@ -45,8 +53,14 @@ void benchmark::ConsolePrinter::result(const State& state)
    53         }
    54     }
    55 
    56-    std::cout << std::setprecision(6);
    57-    std::cout << state.m_name << ", " << state.m_num_evals << ", " << state.m_num_iters << ", " << total << ", " << front << ", " << back << ", " << median << std::endl;
    58+    std::cout << std::setprecision(6)
    59+              << std::left << std::setw(48) << state.m_name
    


    emilengler commented at 6:15 pm on January 29, 2020:
    Set this to 17 then. Maybe a const for this would be a good idea
  12. in src/bench/bench.cpp:58 in cd5f26c260 outdated
    52@@ -45,8 +53,14 @@ void benchmark::ConsolePrinter::result(const State& state)
    53         }
    54     }
    55 
    56-    std::cout << std::setprecision(6);
    57-    std::cout << state.m_name << ", " << state.m_num_evals << ", " << state.m_num_iters << ", " << total << ", " << front << ", " << back << ", " << median << std::endl;
    58+    std::cout << std::setprecision(6)
    59+              << std::left << std::setw(48) << state.m_name
    60+              << std::left << std::setw(12) << state.m_num_evals
    


    emilengler commented at 6:15 pm on January 29, 2020:
    The const idea also applies to all occurrences of 12
  13. in src/bench/crypto_hash.cpp:89 in cd5f26c260 outdated
    84+        hash = CSipHasher(hash, ++k2).Write(in.data(), in.size()).Finalize();
    85+    }
    86+}
    87+
    88+
    89+static void SipHash(benchmark::State& state)
    


    emilengler commented at 6:19 pm on January 29, 2020:
    nit: I’m not that much into crypto but SipHash and SipHash_32b look very similar. Maybe you could include them into one function or make one general function which is called by the both. The only difference between them is the vector constructor

    elichai commented at 8:47 pm on January 29, 2020:
    SipHash_32b is calling a different SipHash implementation that is somewhat optimized to 32 bytes (256bit)
  14. in src/bench/bench.cpp:129 in cd5f26c260 outdated
    139@@ -126,6 +140,7 @@ void benchmark::BenchRunner::RunAll(Printer& printer, uint64_t num_evals, double
    140         }
    141 
    142         if (!std::regex_match(p.first, baseMatch, reFilter)) {
    143+             g_testing_setup = nullptr;
    


    emilengler commented at 6:22 pm on January 29, 2020:
    nit: Useless space

    elichai commented at 8:46 pm on January 29, 2020:

    elichai commented at 4:46 pm on January 30, 2020:
    Rebased it out
  15. emilengler commented at 6:22 pm on January 29, 2020: contributor
    This review is more related to the interface rather than the crypto as I don’t have much experience to review that.
  16. elichai commented at 8:44 pm on January 29, 2020: contributor

    Where do we use variable-length SipHash?

    A quick search shows:

    1. GCSFilter::HashToRange BIP158 Compact block filters (blockfilters.h)
    2. RelayAddress I think as a PRNG (net_processing.h).
    3. ByteVectorHash used as the hash for the std::unordered_set<Element, ByteVectorHash> ElementSet; in blockfilter.h.
    4. CConnman::GetDeterministicRandomizer again some randomizer thing (net.cpp).

    EDIT: We could also replace this Write+Finalize with a single invocation and gain a few more percentages (by not storing and checking tail, and improving inlining) but that’s kinda lose future usability.

  17. Empact commented at 3:02 am on January 30, 2020: member

    nit: You have a few whitespace irregularities - running git-clang-format generates this diff

     0diff --git a/src/bench/crypto_hash.cpp b/src/bench/crypto_hash.cpp
     1index 9eeb8da16..037260939 100644
     2--- a/src/bench/crypto_hash.cpp
     3+++ b/src/bench/crypto_hash.cpp
     4@@ -90,7 +90,7 @@ static void SipHash(benchmark::State& state)
     5 {
     6     uint64_t hash = 0;
     7     uint64_t k2 = 0;
     8-    std::vector<uint8_t> in(BUFFER_SIZE,0);
     9+    std::vector<uint8_t> in(BUFFER_SIZE, 0);
    10     while (state.KeepRunning())
    11         hash = CSipHasher(hash, ++k2).Write(in.data(), in.size()).Finalize();
    12 }
    13diff --git a/src/crypto/siphash.cpp b/src/crypto/siphash.cpp
    14index 0b62de998..cfc04c194 100644
    15--- a/src/crypto/siphash.cpp
    16+++ b/src/crypto/siphash.cpp
    17@@ -2,8 +2,8 @@
    18 // Distributed under the MIT software license, see the accompanying
    19 // file COPYING or http://www.opensource.org/licenses/mit-license.php.
    20 
    21-#include <crypto/siphash.h>
    22 #include <crypto/common.h>
    23+#include <crypto/siphash.h>
    24 
    25 #include <algorithm>
    26 
    27@@ -50,7 +50,8 @@ CSipHasher& CSipHasher::Write(uint64_t data)
    28 
    29 
    30 /// Load a uint64_t from 0 to 7 bytes.
    31-inline uint64_t ReadU64ByLenLE(const unsigned char* data, size_t len) {
    32+inline uint64_t ReadU64ByLenLE(const unsigned char* data, size_t len)
    33+{
    34     assert(len < 8);
    35     uint64_t out = 0;
    36     for (size_t i = 0; i < len; ++i) {
    37@@ -85,7 +86,7 @@ CSipHasher& CSipHasher::Write(const unsigned char* data, size_t size)
    38 
    39     auto i = needed;
    40     while (i < len - left) {
    41-        uint64_t mi = ReadLE64(data+i);
    42+        uint64_t mi = ReadLE64(data + i);
    43         v3 ^= mi;
    44         SIPROUND;
    45         SIPROUND;
    46diff --git a/src/crypto/siphash.h b/src/crypto/siphash.h
    47index 698f9c310..ddf192e8a 100644
    48--- a/src/crypto/siphash.h
    49+++ b/src/crypto/siphash.h
    50@@ -14,7 +14,7 @@ class CSipHasher
    51 {
    52 private:
    53     uint64_t v[4];
    54-    size_t count; // total amount of bytes inputted.
    55+    size_t count;  // total amount of bytes inputted.
    56     uint64_t tail; // bytes that weren't processed yet.
    57 
    58 public:
    
  18. in src/bench/crypto_hash.cpp:94 in cd5f26c260 outdated
    89+static void SipHash(benchmark::State& state)
    90+{
    91+    uint64_t hash = 0;
    92+    uint64_t k2 = 0;
    93+    std::vector<uint8_t> in(BUFFER_SIZE,0);
    94+    while (state.KeepRunning())
    


    Empact commented at 3:03 am on January 30, 2020:
    nit: as new code, should have brackets unless one line https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md#coding-style-c

    elichai commented at 4:46 pm on January 30, 2020:
    Done
  19. elichai force-pushed on Jan 30, 2020
  20. elichai requested review from sipa on Jan 30, 2020
  21. m2709 changes_requested
  22. m2709 commented at 2:32 am on January 31, 2020: none
    .
  23. bitcoin deleted a comment on Jan 31, 2020
  24. luke-jr referenced this in commit 5cb3bb68cb on Feb 9, 2020
  25. in src/crypto/siphash.h:17 in 81e0f14421 outdated
    13@@ -14,8 +14,8 @@ class CSipHasher
    14 {
    15 private:
    16     uint64_t v[4];
    17-    uint64_t tmp;
    18-    int count;
    19+    size_t count;  // total amount of bytes inputted.
    20+    uint64_t tail; // bytes that weren't processed yet.
    


    jonatack commented at 4:50 pm on February 21, 2020:
    perhaps sort these entries L::16-18

    PastaPastaPasta commented at 12:48 pm on February 1, 2022:
    Why would you want to sort them? This order seems to make sense to me?

    elichai commented at 4:02 pm on October 12, 2022:
    Yeah, do you remember the reason you wanted a different order here?
  26. elichai commented at 2:21 pm on February 23, 2020: contributor
    I decided to drop commit 2b32471b5b113c0a34e06896842a450dc7168454 because it’s more controversial than I thought (there might be software that relies on the current formatting and there’s #18011 coming up)
  27. elichai force-pushed on Feb 23, 2020
  28. jonatack commented at 2:26 pm on February 23, 2020: contributor

    Concept ACK

    before change of formatting

    0$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
    1
    2# Benchmark, evals, iterations, total, min, max, median
    3SipHash_32b, 5, 40000000, 6.28913, 3.10429e-08, 3.27799e-08, 3.11495e-08
    4
    5# Benchmark, evals, iterations, total, min, max, median
    6SipHash_32b, 5, 40000000, 6.33962, 3.10606e-08, 3.30034e-08, 3.12351e-08
    7
    8# Benchmark, evals, iterations, total, min, max, median
    9SipHash_32b, 5, 40000000, 6.55364, 3.12087e-08, 3.75397e-08, 3.16322e-08
    

    after change of formatting and added benchmarks

     0((HEAD detached at 52aa1f380b))$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
     1
     2#Benchmark                                      evals       iterations  total       min         max         median      
     3SipHash                                         5           700         7.58726     0.00214442  0.00219727  0.00217387  
     4SipHash_32b                                     5           40000000    6.32856     3.11782e-08 3.25053e-08 3.1465e-08  
     5SipHash_3b                                      5           40000000    4.3802      2.16468e-08 2.21054e-08 2.19273e-08 
     6
     7#Benchmark                                      evals       iterations  total       min         max         median      
     8SipHash                                         5           700         7.82075     0.0021651   0.0024746   0.00218246  
     9SipHash_32b                                     5           40000000    6.39722     3.12934e-08 3.3359e-08  3.16498e-08 
    10SipHash_3b                                      5           40000000    4.6185      2.18849e-08 2.52748e-08 2.30488e-08 
    11
    12#Benchmark                                      evals       iterations  total       min         max         median      
    13SipHash                                         5           700         8.12191     0.00215675  0.00248114  0.00232176  
    14SipHash_32b                                     5           40000000    6.33754     3.12737e-08 3.24442e-08 3.16561e-08 
    15SipHash_3b                                      5           40000000    4.3563      2.13873e-08 2.21485e-08 2.17682e-08
    

    after change of algorithm implementation

     0(pr/18014)$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
     1
     2#Benchmark                                      evals       iterations  total       min         max         median      
     3SipHash                                         5           700         2.0267      0.000571765 0.000590692 0.000579299 
     4SipHash_32b                                     5           40000000    6.27794     3.10675e-08 3.209e-08   3.13121e-08 
     5SipHash_3b                                      5           40000000    4.07212     2.01261e-08 2.06954e-08 2.02564e-08 
     6
     7#Benchmark                                      evals       iterations  total       min         max         median      
     8SipHash                                         5           700         2.13875     0.000582848 0.000627614 0.000614234 
     9SipHash_32b                                     5           40000000    6.27427     3.10577e-08 3.16413e-08 3.13393e-08 
    10SipHash_3b                                      5           40000000    4.02383     1.9983e-08  2.05697e-08 2.00265e-08 
    11
    12#Benchmark                                      evals       iterations  total       min         max         median      
    13SipHash                                         5           700         2.05042     0.000577004 0.00060258  0.000584264 
    14SipHash_32b                                     5           40000000    6.32065     3.10846e-08 3.25604e-08 3.14578e-08 
    15SipHash_3b                                      5           40000000    4.03686     1.96887e-08 2.08056e-08 2.02039e-08 
    

    Will look at the algorithm.

  29. jonatack commented at 2:27 pm on February 23, 2020: contributor

    I decided to drop commit 2b32471 because it’s more controversial than I thought (there might be software that relies on the current formatting and there’s #18011 coming up)

    Right, will be easier to review as more focused; it was 2 PRs in one before.

  30. bitcoin deleted a comment on Feb 24, 2020
  31. jonatack commented at 1:51 pm on July 6, 2020: contributor
    Sorry for the delay @elichai – I still plan to review this.
  32. DrahtBot added the label Needs rebase on Jul 30, 2020
  33. elichai force-pushed on Aug 2, 2020
  34. elichai commented at 12:52 pm on August 2, 2020: contributor

    Updated benchmarks with the new benchmarking library:

    Before:

    0|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    2|        2,982,785.00 |              335.26 |    0.2% |   23,000,202.00 |    7,146,700.00 |  3.218 |   2,125,014.00 |    0.0% |      0.03 | `SipHash`
    3|               39.53 |       25,296,689.57 |    0.1% |          237.03 |           94.79 |  2.501 |           3.00 |    0.1% |      0.00 | `SipHash_32b`
    4|               27.87 |       35,879,242.25 |    0.1% |          239.02 |           66.89 |  3.573 |          17.00 |    0.0% |      0.00 | `SipHash_3b`
    

    After:

    0$ sudo pyperf system tune
    1$ ./src/bench/bench_bitcoin -filter="SipHash|SipHash_3b|SipHash_32b"
    2|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
    3|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    4|          676,413.00 |            1,478.39 |    0.1% |    4,250,201.00 |    1,623,300.00 |  2.618 |     125,015.00 |    0.0% |      0.01 | `SipHash`
    5|               39.49 |       25,324,719.95 |    0.1% |          237.03 |           94.73 |  2.502 |           3.00 |    0.1% |      0.00 | `SipHash_32b`
    6|               23.75 |       42,108,786.61 |    0.2% |          193.02 |           57.00 |  3.387 |          16.00 |    0.0% |      0.00 | `SipHash_3b`
    
  35. in src/crypto/siphash.cpp:57 in 9ed348ddea outdated
    52+/// Load a uint64_t from 0 to 7 bytes.
    53+inline uint64_t ReadU64ByLenLE(const unsigned char* data, size_t len)
    54+{
    55+    assert(len < 8);
    56+    uint64_t out = 0;
    57+    for (size_t i = 0; i < len; ++i) {
    


    sipa commented at 4:22 pm on August 2, 2020:

    On LE systems I can imagine that this is even faster:

    0uint64_t val = 0;
    1memcpy((unsigned char*)&val, data, len);
    2return val;
    

    On BE it needs return (val >> 8 * (8 - len)) instead.

    Feel like benchmarking that?


    elichai commented at 4:40 pm on August 2, 2020:

    Thanks I’ll benchmark this, but I’m not too optimistic about this being faster, because a the length isn’t statically known so it must actually call memcpy and can’t inline and optimize.

    I also want to add fuzz harnesses against reference implementations for the hash functions, which should increase confidence in refactoring the delicate areas(tough not sure how much reviewers will like the extra code)


    elichai commented at 4:49 pm on August 2, 2020:

    Benchmarked now, they are too close to actually know which is faster

    diff:

     0diff --git a/src/crypto/siphash.cpp b/src/crypto/siphash.cpp
     1index cfc04c194e..616bab40f1 100644
     2--- a/src/crypto/siphash.cpp
     3+++ b/src/crypto/siphash.cpp
     4@@ -54,9 +54,7 @@ inline uint64_t ReadU64ByLenLE(const unsigned char* data, size_t len)
     5 {
     6     assert(len < 8);
     7     uint64_t out = 0;
     8-    for (size_t i = 0; i < len; ++i) {
     9-        out |= (uint64_t)data[i] << (i * 8);
    10-    }
    11+    memcpy(&out, data, len);
    12     return out;
    13 }
    

    Before:

     0$ ./src/bench/bench_bitcoin -filter="SipHash"
     1
     2|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
     3|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
     4|          676,729.00 |            1,477.70 |    0.0% |    4,250,201.00 |    1,624,300.00 |  2.617 |     125,015.00 |    0.0% |      0.01 | `SipHash`
     5
     6$ ./src/bench/bench_bitcoin -filter="SipHash"
     7
     8|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
     9|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    10|          676,510.00 |            1,478.17 |    0.1% |    4,250,201.00 |    1,623,700.00 |  2.618 |     125,015.00 |    0.0% |      0.01 | `SipHash`
    11
    12$ ./src/bench/bench_bitcoin -filter="SipHash"
    13
    14|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
    15|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    16|          676,427.00 |            1,478.36 |    0.0% |    4,250,201.00 |    1,623,600.00 |  2.618 |     125,015.00 |    0.0% |      0.01 | `SipHash`
    

    After:

     0$ ./src/bench/bench_bitcoin -filter="SipHash"
     1
     2|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
     3|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
     4|          676,646.00 |            1,477.88 |    0.0% |    4,250,232.00 |    1,624,200.00 |  2.617 |     125,023.00 |    0.0% |      0.01 | `SipHash`
     5
     6$ ./src/bench/bench_bitcoin -filter="SipHash"
     7
     8|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
     9|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    10|          676,239.00 |            1,478.77 |    0.0% |    4,250,232.00 |    1,623,300.00 |  2.618 |     125,023.00 |    0.0% |      0.01 | `SipHash`
    11
    12$ ./src/bench/bench_bitcoin -filter="SipHash"
    13
    14|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
    15|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    16|          676,756.00 |            1,477.64 |    0.1% |    4,250,232.00 |    1,623,700.00 |  2.618 |     125,023.00 |    0.0% |      0.01 | `SipHash`
    

    sipa commented at 4:53 pm on August 2, 2020:
    Cool, thanks for testing. No reason to prefer my suggestion in that case.
  36. DrahtBot removed the label Needs rebase on Aug 3, 2020
  37. DrahtBot added the label Needs rebase on Sep 10, 2020
  38. luke-jr referenced this in commit 2880f45202 on Nov 25, 2020
  39. jonatack commented at 9:26 am on January 15, 2021: contributor
    Needs rebase.
  40. elichai force-pushed on Mar 19, 2021
  41. elichai force-pushed on Mar 19, 2021
  42. elichai commented at 2:26 pm on March 19, 2021: contributor

    With the benchmarks adjusted to bytes Before:

    0|             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    2|                1.79 |      559,246,583.00 |    0.2% |           17.00 |            4.29 |  3.967 |           2.13 |    0.0% |      0.02 | `SipHash`
    3|                1.24 |      806,638,658.38 |    0.1% |            7.44 |            2.98 |  2.497 |           0.09 |    0.2% |      0.00 | `SipHash_32b`
    4|               12.02 |       83,226,929.77 |    0.3% |           74.01 |           28.84 |  2.566 |           5.67 |    0.0% |      0.00 | `SipHash_3b`
    

    After:

    0|             ns/byte |              byte/s |    err% |        ins/byte |        cyc/byte |    IPC |       bra/byte |   miss% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    2|                0.69 |    1,452,369,540.91 |    0.4% |            4.25 |            1.63 |  2.610 |           0.13 |    0.0% |      0.01 | `SipHash`
    3|                1.24 |      804,669,432.11 |    0.1% |            7.44 |            2.98 |  2.493 |           0.09 |    0.1% |      0.00 | `SipHash_32b`
    4|                9.93 |      100,677,637.95 |    0.1% |           65.67 |           23.86 |  2.752 |           5.33 |    0.0% |      0.00 | `SipHash_3b`
    
  43. DrahtBot removed the label Needs rebase on Mar 19, 2021
  44. sipa commented at 2:48 am on March 20, 2021: member
    utACK 19e28a41168d02dc85356714c889b43d96d834d6
  45. in src/crypto/siphash.cpp:58 in 19e28a4116 outdated
    53+inline uint64_t ReadU64ByLenLE(const unsigned char* data, size_t len)
    54+{
    55+    assert(len < 8);
    56+    uint64_t out = 0;
    57+    for (size_t i = 0; i < len; ++i) {
    58+        out |= (uint64_t)data[i] << (i * 8);
    


    PastaPastaPasta commented at 12:36 pm on February 1, 2022:
    please convert this to a c++11 functional cast

    elichai commented at 4:01 pm on October 12, 2022:
    These are integer casts, so I’m not sure if it’s worth doing because it decreases the readability, and not worth asking for re-ACKs
  46. in src/crypto/siphash.cpp:111 in 19e28a4116 outdated
    106@@ -77,12 +107,14 @@ uint64_t CSipHasher::Finalize() const
    107 {
    108     uint64_t v0 = v[0], v1 = v[1], v2 = v[2], v3 = v[3];
    109 
    110-    uint64_t t = tmp | (((uint64_t)count) << 56);
    111+
    112+    uint64_t t = tail | (((uint64_t)count) << 56);
    


    PastaPastaPasta commented at 12:36 pm on February 1, 2022:
    please use a c++11 functional-cast

    elichai commented at 4:02 pm on October 12, 2022:
    same as above
  47. PastaPastaPasta commented at 12:37 pm on February 1, 2022: contributor
    this PR seems quite dead, but if it gets revived, please do the following
  48. maflcko commented at 12:41 pm on February 1, 2022: member

    this PR seems quite dead

    I wouldn’t say it is dead. It compiles and (unit) tests fine on itself and on current master. All it needs is review.

  49. in src/crypto/siphash.cpp:96 in 19e28a4116 outdated
    101+        SIPROUND;
    102+        SIPROUND;
    103+        v0 ^= mi;
    104+        i += 8;
    105+    }
    106+
    


    PastaPastaPasta commented at 12:45 pm on February 1, 2022:
    This should really be a for loop

    elichai commented at 3:59 pm on October 12, 2022:
    i is used later, so this will require for(; i < len - left; i+= 8) which I’m not sure is that better because it can be confusing,but I can change it if you think it improves readability
  50. PastaPastaPasta commented at 12:49 pm on February 1, 2022: contributor

    My comment on aliveness was just that it’s been about 11 months since it got a review, not to say that there is anything problematic with the pr. Maybe since I commented some others will be notified and will do a review :)

    For what it’s worth, I don’t see any major issues with the PR, although I did add a few more comments of things that likely should be changed

  51. in src/crypto/siphash.cpp:28 in 19e28a4116 outdated
    24@@ -22,14 +25,14 @@ CSipHasher::CSipHasher(uint64_t k0, uint64_t k1)
    25     v[2] = 0x6c7967656e657261ULL ^ k0;
    26     v[3] = 0x7465646279746573ULL ^ k1;
    27     count = 0;
    28-    tmp = 0;
    29+    tail = 0;
    


    aureleoules commented at 4:46 pm on October 12, 2022:

    nit: these should be initialized in the class

     0diff --git a/src/crypto/siphash.cpp b/src/crypto/siphash.cpp
     1index f78771868..dcf40d309 100644
     2--- a/src/crypto/siphash.cpp
     3+++ b/src/crypto/siphash.cpp
     4@@ -24,8 +24,6 @@ CSipHasher::CSipHasher(uint64_t k0, uint64_t k1)
     5     v[1] = 0x646f72616e646f6dULL ^ k1;
     6     v[2] = 0x6c7967656e657261ULL ^ k0;
     7     v[3] = 0x7465646279746573ULL ^ k1;
     8-    count = 0;
     9-    tail = 0;
    10 }
    11 
    12 CSipHasher& CSipHasher::Write(uint64_t data)
    13diff --git a/src/crypto/siphash.h b/src/crypto/siphash.h
    14index 4b630f229..1b62e2e6c 100644
    15--- a/src/crypto/siphash.h
    16+++ b/src/crypto/siphash.h
    17@@ -14,8 +14,8 @@ class CSipHasher
    18 {
    19 private:
    20     uint64_t v[4];
    21-    uint64_t tail; // bytes that weren't processed yet.
    22-    uint8_t count;  // total amount of bytes inputted.
    23+    uint64_t tail{0}; // bytes that weren't processed yet.
    24+    uint8_t count{0};  // total amount of bytes inputted.
    25 
    26 public:
    27     /** Construct a SipHash calculator initialized with 128-bit key (k0, k1) */
    
  52. aureleoules approved
  53. aureleoules commented at 5:48 pm on October 12, 2022: member
    ACK 19e28a41168d02dc85356714c889b43d96d834d6 I verified that the implementation of CSipHasher::Write matches the one from rust-bitcoin/bitcoin_hashes (https://github.com/rust-bitcoin/bitcoin_hashes/blob/ec356e4933f6ee972dd0a1f836a3297b2e9f3407/src/siphash24.rs#L171-L208)
  54. aureleoules commented at 5:52 pm on October 12, 2022: member
    Maybe this needs a rebase for the CI to pass on this PR?
  55. Add more benchmarks to siphash 169a439b39
  56. Optimized siphash implementation 409c2e3452
  57. elichai commented at 9:08 pm on October 12, 2022: contributor
    Couldn’t reproduce the linter complaints locally, so rebased, hopefully that’ll fix it
  58. elichai force-pushed on Oct 12, 2022
  59. aureleoules approved
  60. aureleoules commented at 9:22 pm on October 12, 2022: member
    reACK 409c2e345225716a29c856b24e1c232a643a52ef - rebased since last review (no changes)
  61. achow101 commented at 2:38 pm on October 13, 2022: member
    @sipa re-review?
  62. DrahtBot added the label Needs rebase on Oct 20, 2022
  63. DrahtBot commented at 5:11 pm on October 20, 2022: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

    Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

  64. luke-jr commented at 8:16 pm on November 10, 2022: member
    Semi-ACK: Didn’t fully review the code, but this has been part of Knots for over 2 years (0.19.1) and no issues seem to have surfaced.
  65. DrahtBot commented at 0:18 am on February 8, 2023: contributor

    There hasn’t been much activity lately and the patch still needs rebase. What is the status here?

    • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
    • Is it no longer relevant? ➡️ Please close.
    • Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
  66. achow101 added the label Up for grabs on Apr 25, 2023
  67. achow101 commented at 3:28 pm on April 25, 2023: member
    Closing as up for grabs due to lack of activity.
  68. achow101 closed this on Apr 25, 2023

  69. luke-jr referenced this in commit 30e00b92cf on Jan 29, 2024
  70. luke-jr referenced this in commit 7145d63cdf on Jan 30, 2024
  71. luke-jr referenced this in commit 3094221bd9 on Jan 30, 2024
  72. bitcoin locked this on Apr 24, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me