This is an optimization described in the Intel SHA-256 whitepaper (page 10). It speeds up the SHA256D64_1024 bench (SSE4 path, no AVX2) for me by ~6%.
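The diff itself is not reproduced on this page, so the following is only a sketch of what "reduce register copies" may refer to. It assumes the optimization is the nested-rotation rewrite of the SHA-256 Σ functions: because rotation distributes over XOR, three independent rotations of the same input can be folded into a chain that reuses one working value. The function names are hypothetical and the code is scalar for readability; it is not the SSE4 code from the PR.

```cpp
#include <cstdint>

// Rotate a 32-bit word right by n bits (valid for 0 < n < 32).
constexpr uint32_t Rotr(uint32_t x, int n) { return (x >> n) | (x << (32 - n)); }

// Textbook forms (FIPS 180-4): three independent rotations of the same input,
// which on two-operand instruction sets means copying the input first.
constexpr uint32_t Sigma0Direct(uint32_t a) { return Rotr(a, 2) ^ Rotr(a, 13) ^ Rotr(a, 22); }
constexpr uint32_t Sigma1Direct(uint32_t e) { return Rotr(e, 6) ^ Rotr(e, 11) ^ Rotr(e, 25); }

// Nested forms (assumed to be the rewrite in question): rotation is linear
// over XOR, so Rotr(Rotr(Rotr(a, 9) ^ a, 11) ^ a, 2) expands to
// Rotr(a, 22) ^ Rotr(a, 13) ^ Rotr(a, 2).
constexpr uint32_t Sigma0Nested(uint32_t a) { return Rotr(Rotr(Rotr(a, 9) ^ a, 11) ^ a, 2); }
constexpr uint32_t Sigma1Nested(uint32_t e) { return Rotr(Rotr(Rotr(e, 14) ^ e, 5) ^ e, 6); }

// Compile-time spot checks on two SHA-256 initial state words.
static_assert(Sigma0Nested(0x6a09e667u) == Sigma0Direct(0x6a09e667u), "Sigma0 forms differ");
static_assert(Sigma1Nested(0x510e527fu) == Sigma1Direct(0x510e527fu), "Sigma1 forms differ");
```

Both forms produce identical results for every input. The nested form only changes how many temporaries are live at once, which is where a speedup on a path without three-operand instructions would come from.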
sha256: small speedup for sse4 path. #13400
theuni wants to merge 2 commits into bitcoin:master from theuni:sha2-avx1 changing 1 file +5 −4
theuni commented at 7:06 PM on June 5, 2018: member
- crypto: split out Rotations a2724af487
- crypto: sha256 optim: reduce register copies ea3ed0cbd2
sipa commented at 7:07 PM on June 5, 2018: member
ACK, benchmarked to be around 5% faster for SSE4 (when disabling the AVX2 code on my i7-7820HQ).
- MarcoFalke added the label Refactoring on Jun 5, 2018
- theuni closed this on Jun 5, 2018
- theuni referenced this in commit 4ed6f4fc90 on Jun 12, 2018
- theuni referenced this in commit 6da1fe9d69 on Jun 12, 2018
- theuni referenced this in commit 4ee6fbb8b7 on Jun 12, 2018
- DrahtBot locked this on Sep 8, 2021