Auto-detect SHA256 implementation in benchmarks #19214

pull sipa wants to merge 1 commits into bitcoin:master from sipa:202006_hw_sha_bench changing 1 files +2 −0
  1. sipa commented at 6:25 pm on June 8, 2020: member
    It seems SHA256AutoDetect() was not being called in benchmarks, making the numbers only reflect the naive implementation. Fix this by calling it in bench_bitcoin’s setup.
  2. Call SHA256AutoDetect in benchmark setup addf18da95
  3. DrahtBot added the label Tests on Jun 8, 2020
  4. pstratem commented at 7:36 pm on June 8, 2020: contributor

    comparing this to current master this seems to actually be slower (if only very slightly), running debian 10.4 on an i7-8550U

    bench_master.txt:SHA256, 5, 340, 4.45805, 0.0026092, 0.00263955, 0.00262045 bench_master.txt:SHA256D64_1024, 5, 7400, 4.45677, 0.000120143, 0.000120585, 0.000120541 bench_master.txt:SHA256_32b, 5, 4700000, 4.68355, 1.98788e-07, 1.99912e-07, 1.99391e-07 bench_19214.txt:SHA256, 5, 340, 4.57454, 0.00268855, 0.00269248, 0.00269149 bench_19214.txt:SHA256D64_1024, 5, 7400, 4.55315, 0.000122395, 0.000123309, 0.000123203 bench_19214.txt:SHA256_32b, 5, 4700000, 4.97084, 2.09111e-07, 2.13025e-07, 2.11958e-07

  5. fjahr commented at 7:54 pm on June 8, 2020: member

    tested ACK addf18da951439f696dba163ae1c73458d43ea03

    For me, the SHA256 tests are speeding up significantly after this.

  6. pstratem commented at 8:20 pm on June 8, 2020: contributor

    I must have gotten something wrong, doing the benchmarks again after git clean shows this pr being about 6x faster

    ACK addf18da951439f696dba163ae1c73458d43ea03

  7. MarcoFalke commented at 0:50 am on June 9, 2020: member
    Can hashing be made to fail when SHA256AutoDetect hasn’t been called?
  8. Sjors commented at 9:59 am on June 9, 2020: member

    On a 2019 Macbook Pro:

    0src/bench/bench_bitcoin -filter=SHA256.*
    

    Before:

    0# Benchmark, evals, iterations, total, min, max, median
    1SHA256, 5, 340, 6.08992, 0.00350198, 0.00370616, 0.0035939
    2SHA256D64_1024, 5, 7400, 22.9059, 0.000614125, 0.00062785, 0.000618134
    3SHA256_32b, 5, 4700000, 6.0593, 2.55725e-07, 2.59171e-07, 2.58255e-07
    

    After (addf18da951439f696dba163ae1c73458d43ea03):

    0# Benchmark, evals, iterations, total, min, max, median
    1SHA256, 5, 340, 4.12459, 0.00240666, 0.00244616, 0.00242406
    2SHA256D64_1024, 5, 7400, 3.56757, 9.53814e-05, 9.75219e-05, 9.61168e-05
    3SHA256_32b, 5, 4700000, 4.29699, 1.76951e-07, 1.91434e-07, 1.80197e-07
    
  9. laanwj commented at 12:19 pm on June 9, 2020: member
    0# Benchmark, evals, iterations, total, min, max, median
    1SHA256, …
    2SHA256D64_1024, …
    3SHA256_32b, …
    

    Maybe it would be useful to specify here what SHA256 implementation is benchmarked. This makes comparisons slightly more meaningful.

  10. MarcoFalke commented at 2:41 pm on June 10, 2020: member

    From IRC:

    0[16:51] <phantomcircuit> sipa, oh do any of the other benchmarks maybe end up calling something that would call the auto detect?
    

    If another benchmark spins up a testing setup, that testing setup will call auto detect. So this explains the confusing results where the naive implementation is faster than avx2.

  11. luke-jr commented at 4:25 am on June 11, 2020: member

    There should probably be a way to force a specific implementation?

    (I think always defaulting to the generic implementation makes sense…)

  12. MarcoFalke commented at 3:04 pm on June 11, 2020: member

    In the functional tests we use

    0if self.is_foobar_compiled():
    1  self.test_foobar()
    

    Something along those lines could also be used to bench the different hash impls. here.

  13. laanwj commented at 5:22 pm on June 11, 2020: member

    Yes, benchmarking all the different SHA356 implementations could be useful as well. I think this was the case in one of the initial PRs that introduced more of them.

    That said that still leaves open what to do for other benchmarks that might depend on the SHA256 implementation. We don’t want to re-run all the benchmarks for all the supported implementations ofc.

  14. MarcoFalke commented at 5:50 pm on June 11, 2020: member

    That said that still leaves open what to do for other benchmarks that might depend on the SHA256 implementation. We don’t want to re-run all the benchmarks for all the supported implementations ofc.

    I generally don’t like using globals to magically change control flow, especially in tests. There have been enough cases in the past where global state in tests has lead to confusing results. (Including this very benchmark: #19214 (comment))

    Which is why I suggested to force a decision before hashing is used: #19214 (comment) The silent fallback shouldn’t be needed, or am I missing something obvious?

  15. laanwj commented at 3:13 pm on June 16, 2020: member

    The silent fallback shouldn’t be needed, or am I missing something obvious?

    I think there’s something of an initialization order issue here. Some of the objects initialized before main() might make (light, non-performance-critical) use of SHA256 to do initialization. We don’t want to move the processor detection that soon due to logging / potential failure modes.

  16. laanwj commented at 1:15 pm on July 15, 2020: member
    ACK addf18da951439f696dba163ae1c73458d43ea03 I’m going to merge this, It has enough ACKs and I think this is a clear improvement to before. Additional suggestions can be done in later PRs.
  17. laanwj merged this on Jul 15, 2020
  18. laanwj closed this on Jul 15, 2020

  19. sidhujag referenced this in commit 58b6f0d9c0 on Jul 16, 2020
  20. deadalnix referenced this in commit 4d0d32567a on Sep 1, 2021
  21. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-22 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me