Add 1-way SSE4 SHA256 implementation using intrinsics for MSVC builds #28526

pull hebasto wants to merge 7 commits into bitcoin:master from hebasto:230924-sse4 changing 8 files +255 −3
  1. hebasto commented at 2:18 pm on September 24, 2023: member

    This PR reintroduces the 1-way SSE4 SHA256 implementation using intrinsics, as suggested in #13442. It targets MSVC builds specifically, where it yields a throughput gain of roughly 40-50% in the benchmarks below.

    Benchmarks on my machine (Intel Core i5-8350U CPU, no sha_ni flag, Windows 11 Pro 22H2):

    • before this PR (8a9e37fb95cbb0bf7f6e06fa05d8381db04d61e2):
     >.\src\bench_bitcoin.exe -filter=SHA256_.*

     |             ns/byte |              byte/s |    err% |     total | benchmark
     |--------------------:|--------------------:|--------:|----------:|:----------
     |                9.92 |      100,826,852.23 |    0.1% |      0.01 | SHA256_32b_AVX2 using the 'standard,sse41(4way),avx2(8way)' SHA256 implementation
     |                9.90 |      101,038,141.67 |    0.3% |      0.01 | SHA256_32b_SHANI using the 'standard,sse41(4way)' SHA256 implementation
     |               10.02 |       99,788,852.31 |    0.9% |      0.01 | SHA256_32b_SSE4 using the 'standard,sse41(4way)' SHA256 implementation
     |               10.01 |       99,883,509.98 |    0.8% |      0.01 | SHA256_32b_STANDARD using the 'standard' SHA256 implementation
     |                4.48 |      223,348,893.31 |    1.1% |      0.05 | SHA256_AVX2 using the 'standard,sse41(4way),avx2(8way)' SHA256 implementation
     |                4.47 |      223,668,612.58 |    1.2% |      0.05 | SHA256_SHANI using the 'standard,sse41(4way)' SHA256 implementation
     |                4.45 |      224,638,332.29 |    0.7% |      0.05 | SHA256_SSE4 using the 'standard,sse41(4way)' SHA256 implementation
     |                4.45 |      224,542,494.67 |    0.6% |      0.05 | SHA256_STANDARD using the 'standard' SHA256 implementation
    
    • with this PR:
     >.\src\bench_bitcoin.exe -filter=SHA256_.*

     |             ns/byte |              byte/s |    err% |     total | benchmark
     |--------------------:|--------------------:|--------:|----------:|:----------
     |                7.04 |      142,024,691.36 |    0.2% |      0.01 | SHA256_32b_AVX2 using the 'sse41(1way),sse41(4way),avx2(8way)' SHA256 implementation
     |                7.03 |      142,222,222.22 |    0.2% |      0.01 | SHA256_32b_SHANI using the 'sse41(1way),sse41(4way)' SHA256 implementation
     |                7.08 |      141,231,323.51 |    0.8% |      0.01 | SHA256_32b_SSE4 using the 'sse41(1way),sse41(4way)' SHA256 implementation
     |                9.88 |      101,196,866.84 |    0.4% |      0.01 | SHA256_32b_STANDARD using the 'standard' SHA256 implementation
     |                3.01 |      332,270,069.11 |    1.3% |      0.03 | SHA256_AVX2 using the 'sse41(1way),sse41(4way),avx2(8way)' SHA256 implementation
     |                3.00 |      332,989,244.45 |    0.3% |      0.03 | SHA256_SHANI using the 'sse41(1way),sse41(4way)' SHA256 implementation
     |                3.04 |      328,612,270.38 |    2.0% |      0.03 | SHA256_SSE4 using the 'sse41(1way),sse41(4way)' SHA256 implementation
     |                4.45 |      224,678,709.45 |    0.4% |      0.05 | SHA256_STANDARD using the 'standard' SHA256 implementation
    

    Based on #24773.

  2. DrahtBot commented at 2:18 pm on September 24, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29774 (build: Enable fuzz binary in MSVC by hebasto)
    • #29625 (Several randomness improvements by sipa)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. hebasto marked this as a draft on Sep 24, 2023
  4. hebasto force-pushed on Oct 4, 2023
  5. hebasto commented at 3:19 pm on October 4, 2023: member
    Rebased on top of the merged #27598.
  6. DrahtBot added the label CI failed on Jan 17, 2024
  7. DrahtBot added the label Needs rebase on Jan 27, 2024
  8. Add MSVC implementation of GetCPUID() cf96097de3
  9. Add MSVC implementation of AVXEnabled() 1dc8641dad
  10. msvc: Enable AVX2 implementation of SHA256 2df3619a97
  11. msvc: Enable SSE4.1 implementation of SHA256 422d813081
  12. msvc: Enable x86 SHA-NI implementation of SHA256 6e2513f03e
  13. Add 1-way SSE4 SHA256 implementation using intrinsics 95bafb0007
  14. msvc: Use 1-way SSE4 SHA256 intrinsics-based implementation 0e9c80507b
  15. hebasto force-pushed on Feb 12, 2024
  16. DrahtBot removed the label Needs rebase on Feb 12, 2024
  17. DrahtBot removed the label CI failed on Feb 12, 2024
  18. DrahtBot added the label Needs rebase on Apr 28, 2024
  19. DrahtBot commented at 3:29 am on April 28, 2024: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

  20. hebasto commented at 5:25 am on April 28, 2024: member

    Based on #24773.

    Deferring this until after the switch to CMake.

  21. hebasto closed this on Apr 28, 2024
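
The benchmark tables in the PR description report time per byte, so the throughput gain can be checked directly from the `ns/byte` column. The following is a minimal sketch of that arithmetic; the helper names `Speedup` and `GainPercent` are hypothetical and not part of the PR.

```cpp
#include <cassert>

// Hypothetical helpers (not from the PR): derive the throughput gain
// from two ns/byte measurements taken before and after a change.

// Speedup factor: time per byte before divided by time per byte after.
double Speedup(double ns_per_byte_before, double ns_per_byte_after)
{
    assert(ns_per_byte_before > 0.0 && ns_per_byte_after > 0.0);
    return ns_per_byte_before / ns_per_byte_after;
}

// The same ratio expressed as a percentage gain in bytes per second.
double GainPercent(double ns_per_byte_before, double ns_per_byte_after)
{
    return (Speedup(ns_per_byte_before, ns_per_byte_after) - 1.0) * 100.0;
}
```

Applied to the AVX2 rows above, `GainPercent(9.92, 7.04)` comes to about 41% for the 32-byte benchmark and `GainPercent(4.48, 3.01)` to about 49% for the larger-input benchmark, while the STANDARD rows are essentially unchanged.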

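The commits "Add MSVC implementation of GetCPUID()" and "Add MSVC implementation of AVXEnabled()" are listed by title only, so their code is not shown in this thread. As a rough sketch of what such helpers can look like, the example below queries CPUID through the MSVC intrinsic `__cpuidex` (with a GCC/Clang fallback) and reads XCR0 with `_xgetbv`. The `GetCPUID` signature and the names `CpuFeatures`, `DecodeFeatures`, `ReadXCR0` and `DetectFeatures` are assumptions made for illustration, not the PR's actual code.

```cpp
#include <cassert>
#include <cstdint>

#if defined(_MSC_VER) && (defined(_M_X64) || defined(_M_IX86))
#include <intrin.h>    // __cpuidex
#include <immintrin.h> // _xgetbv
#define HAVE_X86_CPUID 1
#elif defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>     // __cpuid_count (GCC/Clang)
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0
#endif

// Illustrative feature set; the field names are hypothetical.
struct CpuFeatures { bool sse41; bool avx_enabled; bool avx2; bool shani; };

// Query one CPUID leaf/subleaf. The signature is an assumption.
void GetCPUID(uint32_t leaf, uint32_t subleaf, uint32_t& a, uint32_t& b, uint32_t& c, uint32_t& d)
{
#if defined(_MSC_VER) && HAVE_X86_CPUID
    int regs[4];
    __cpuidex(regs, static_cast<int>(leaf), static_cast<int>(subleaf));
    a = static_cast<uint32_t>(regs[0]); b = static_cast<uint32_t>(regs[1]);
    c = static_cast<uint32_t>(regs[2]); d = static_cast<uint32_t>(regs[3]);
#elif HAVE_X86_CPUID
    __cpuid_count(leaf, subleaf, a, b, c, d);
#else
    (void)leaf; (void)subleaf;
    a = b = c = d = 0; // non-x86 target: report no features
#endif
}

// Read XCR0. Only call this when CPUID reports OSXSAVE.
uint64_t ReadXCR0()
{
#if defined(_MSC_VER) && HAVE_X86_CPUID
    return _xgetbv(0);
#elif HAVE_X86_CPUID
    uint32_t a, d;
    __asm__("xgetbv" : "=a"(a), "=d"(d) : "c"(0));
    return (static_cast<uint64_t>(d) << 32) | a;
#else
    return 0;
#endif
}

// Pure decoding of the relevant CPUID bits, kept separate so it can be tested.
CpuFeatures DecodeFeatures(uint32_t leaf1_ecx, uint32_t leaf7_ebx, uint64_t xcr0)
{
    CpuFeatures f{};
    f.sse41 = (leaf1_ecx >> 19) & 1;                  // CPUID.1:ECX.SSE4.1
    const bool osxsave = (leaf1_ecx >> 27) & 1;       // CPUID.1:ECX.OSXSAVE
    const bool avx = (leaf1_ecx >> 28) & 1;           // CPUID.1:ECX.AVX
    f.avx_enabled = osxsave && avx && (xcr0 & 6) == 6; // OS saves XMM and YMM state
    f.avx2 = f.avx_enabled && ((leaf7_ebx >> 5) & 1); // CPUID.7.0:EBX.AVX2
    f.shani = (leaf7_ebx >> 29) & 1;                  // CPUID.7.0:EBX.SHA
    return f;
}

// Detect features on the running machine.
CpuFeatures DetectFeatures()
{
    uint32_t a = 0, b = 0, c = 0, d = 0;
    GetCPUID(0, 0, a, b, c, d);
    const uint32_t max_leaf = a;
    uint32_t leaf1_ecx = 0, leaf7_ebx = 0;
    if (max_leaf >= 1) { GetCPUID(1, 0, a, b, c, d); leaf1_ecx = c; }
    if (max_leaf >= 7) { GetCPUID(7, 0, a, b, c, d); leaf7_ebx = b; }
    const bool osxsave = (leaf1_ecx >> 27) & 1;
    return DecodeFeatures(leaf1_ecx, leaf7_ebx, osxsave ? ReadXCR0() : 0);
}
```

Separating the bit decoding from the hardware queries is a design choice of this sketch: it lets the decoding be checked with fixed register values on any machine. On the i5-8350U used for the benchmarks above (no sha_ni flag), `DetectFeatures().shani` would be expected to be false.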

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-06-29 10:13 UTC