optimization: precalculate SipHash constant XOR with k0 and k1 in SaltedOutpointHasher #30442

pull l0rinc wants to merge 2 commits into bitcoin:master from l0rinc:paplorinc/siphash changing 7 files +132 −28
  1. l0rinc commented at 3:01 pm on July 12, 2024: contributor

    Continuing the IBD-related micro-optimizations (started in #30326), here I’m precalculating the SipHash constants XOR with k0 and k1 for the map hash calculations .

    before:

    ns/op op/s err% total benchmark
    60.68 16,481,085.68 0.3% 11.05 SaltedOutpointHasherBenchmark_create_set
    20.64 48,451,463.67 0.0% 11.00 SaltedOutpointHasherBenchmark_hash
    26.73 37,408,850.57 0.3% 11.02 SaltedOutpointHasherBenchmark_match
    24.25 41,243,487.98 0.6% 10.93 SaltedOutpointHasherBenchmark_mismatch

    after:

    ns/op op/s err% total benchmark
    58.81 17,004,413.32 0.4% 11.00 SaltedOutpointHasherBenchmark_create_set
    19.25 51,947,532.75 0.0% 11.01 SaltedOutpointHasherBenchmark_hash
    24.58 40,682,615.34 0.2% 11.01 SaltedOutpointHasherBenchmark_match
    23.92 41,812,410.34 0.5% 10.93 SaltedOutpointHasherBenchmark_mismatch

    i.e.:

    0SaltedOutpointHasherBenchmark_create_set - 17,004,413.32/16,481,085.68 - 3% faster
    1SaltedOutpointHasherBenchmark_hash - 51,947,532.75/48,451,463.67 - 7.2% faster
    2SaltedOutpointHasherBenchmark_match - 40,682,615.34/37,408,850.57 - 8.7% faster
    3SaltedOutpointHasherBenchmark_mismatch - 41,812,410.34/41,243,487.98 - 1.3% faster
    

    The tiny speed gains are probably not measurable by an actual IBD.

  2. DrahtBot commented at 3:01 pm on July 12, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    No conflicts as of last run.

  3. l0rinc force-pushed on Jul 12, 2024
  4. l0rinc renamed this:
    Precalculate SipHash constant XOR with k0 and k1 in SaltedOutpointHasher
    optimization: Precalculate SipHash constant XOR with k0 and k1 in SaltedOutpointHasher
    on Jul 12, 2024
  5. l0rinc marked this as ready for review on Jul 13, 2024
  6. l0rinc commented at 12:40 pm on July 14, 2024: contributor
    @andrewtoth, this is another tiny addition to the coincache speedup, your review would be welcome.
  7. andrewtoth commented at 3:36 pm on July 14, 2024: contributor

    I did not see any improvement in the benchmark with this change. Running ./src/bench/bench_bitcoin -filter=.*Out[pP]oint.*

    commit 3b7b82b4b0c39a38538aae2cba10bec3907c5cbf

    0|               ns/op |                op/s |    err% |          ins/op |          cyc/op |         bra/op |   miss% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------------:|----------------:|---------------:|--------:|----------:|:----------
    2|                1.31 |      766,241,272.84 |    0.2% |            0.00 |            0.00 |           0.00 |    0.0% |      0.01 | `COutPoint_equality_match`
    3|                0.64 |    1,561,309,649.35 |    0.0% |            0.00 |            0.00 |           0.00 |    0.0% |      0.01 | `COutPoint_equality_mismatch`
    4|              107.62 |        9,292,298.54 |    1.7% |            0.00 |            0.00 |           0.00 |    0.0% |      0.10 | `SaltedOutpointHasherBenchmark_create_set`
    5|               23.36 |       42,799,791.14 |    0.0% |            0.00 |            0.00 |           0.00 |    0.0% |      0.01 | `SaltedOutpointHasherBenchmark_hash`
    6|               48.78 |       20,502,210.98 |    0.1% |            0.00 |            0.00 |           0.00 |    0.0% |      0.01 | `SaltedOutpointHasherBenchmark_match`
    7|               69.31 |       14,426,956.77 |    0.3% |            0.00 |            0.00 |           0.00 |    0.0% |      0.01 | `SaltedOutpointHasherBenchmark_mismatch`
    

    commit 7ac4e3aaf345a06fdde463100c45c4295d730820

    0|               ns/op |                op/s |    err% |          ins/op |          cyc/op |         bra/op |   miss% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------------:|----------------:|---------------:|--------:|----------:|:----------
    2|                1.30 |      766,440,414.89 |    0.0% |            0.00 |            0.00 |           0.00 |    0.0% |      0.01 | `COutPoint_equality_match`
    3|                1.02 |      982,018,502.07 |    0.2% |            0.00 |            0.00 |           0.00 |    0.0% |      0.01 | `COutPoint_equality_mismatch`
    4|              108.92 |        9,180,628.87 |    0.9% |            0.00 |            0.00 |           0.00 |    0.0% |      0.10 | `SaltedOutpointHasherBenchmark_create_set`
    5|               23.69 |       42,213,365.83 |    0.1% |            0.00 |            0.00 |           0.00 |    0.0% |      0.01 | `SaltedOutpointHasherBenchmark_hash`
    6|               47.90 |       20,876,369.10 |    0.1% |            0.00 |            0.00 |           0.00 |    0.0% |      0.01 | `SaltedOutpointHasherBenchmark_match`
    7|               70.10 |       14,265,655.03 |    0.1% |            0.00 |            0.00 |           0.00 |    0.0% |      0.01 | `SaltedOutpointHasherBenchmark_mismatch`
    
  8. l0rinc commented at 5:58 pm on July 14, 2024: contributor
    That’s disappointing, thanks for checking, something’s still off with my compiler, it seems. After I finish benching #28280 (comment), I’ll try this again on the dedicates server instead. I’m also a bit surprised that the ratios are completely different compared to my measurements, e.g. SaltedOutpointHasherBenchmark_match and _mismatch are 40% off, while in my case they’re basically equal. Are compilers this different? Could you please send me the exact command and compiler versions you’ve used? Thanks!
  9. l0rinc marked this as a draft on Jul 15, 2024
  10. andrewtoth commented at 3:55 pm on July 15, 2024: contributor

    Could you please send me the exact command and compiler versions you’ve used?

    gcc version 11.4.0 (Ubuntu 11.4.0-1ubuntu1~22.04)

    I ran ./configure --enable-bench make ./src/bench/bench_bitcoin -filter=.*Out[pP]oint.*

  11. l0rinc commented at 10:15 am on July 16, 2024: contributor

    Now that the other benchmark finished after a few days, I ran this on a gcc (Debian 12.2.0-14) 12.2.0.

    After the first run I got a warning of:

    Warning, results might be unstable:

    • CPU frequency scaling enabled: CPU 0 between 800.0 and 4,200.0 MHz
    • CPU governor is ‘powersave’ but should be ‘performance’
    • Turbo is enabled, CPU frequency will fluctuate

    which I’ve fixed and also ran pyperf system tune. Additionally I set a --min-time=10000, otherwise the results weren’t consistent between runs. I ran each change twice to make sure they’re stable.

    before:

    0./configure --enable-bench
    1git checkout 7ac4e3aaf345a06fdde463100c45c4295d730820 && git reset --hard
    2make -j$(nproc) && ./src/bench/bench_bitcoin -filter='.*Out[pP]oint.*' --min-time=10000
    
    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    135.01 7,406,978.51 0.1% 1,060.42 485.49 2.184 120.72 1.1% 11.05 SaltedOutpointHasherBenchmark_create_set
    23.89 41,860,205.94 0.0% 252.01 85.94 2.932 4.00 0.0% 11.00 SaltedOutpointHasherBenchmark_hash
    59.81 16,719,701.99 0.1% 451.14 215.16 2.097 20.83 1.5% 11.00 SaltedOutpointHasherBenchmark_match
    79.21 12,624,726.37 0.0% 566.84 284.97 1.989 20.50 2.3% 10.74 SaltedOutpointHasherBenchmark_mismatch
    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    133.71 7,479,023.41 0.0% 1,060.38 480.97 2.205 120.72 1.1% 11.03 SaltedOutpointHasherBenchmark_create_set
    23.87 41,894,947.89 0.0% 252.01 85.88 2.935 4.00 0.0% 11.00 SaltedOutpointHasherBenchmark_hash
    60.05 16,652,312.90 0.1% 452.20 216.05 2.093 20.86 1.3% 11.00 SaltedOutpointHasherBenchmark_match
    79.90 12,515,491.22 0.1% 566.85 287.46 1.972 20.50 2.3% 10.74

    after:

    0git checkout 3b7b82b4b0c39a38538aae2cba10bec3907c5cbf && git reset --hard
    1make -j$(nproc) && ./src/bench/bench_bitcoin -filter='.*Out[pP]oint.*' --min-time=10000
    
    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    132.60 7,541,596.87 0.0% 1,045.84 476.97 2.193 120.72 1.1% 11.03 SaltedOutpointHasherBenchmark_create_set
    23.43 42,674,820.33 0.0% 246.01 84.29 2.918 4.00 0.0% 11.00 SaltedOutpointHasherBenchmark_hash
    58.90 16,977,444.58 0.0% 444.45 211.91 2.097 20.92 1.3% 11.00 SaltedOutpointHasherBenchmark_match
    78.29 12,773,764.99 0.0% 544.01 281.65 1.932 20.29 2.6% 10.73 SaltedOutpointHasherBenchmark_mismatch
    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    132.75 7,533,102.65 0.0% 1,045.87 477.53 2.190 120.72 1.1% 11.02 SaltedOutpointHasherBenchmark_create_set
    23.44 42,660,977.02 0.0% 246.01 84.33 2.917 4.00 0.0% 11.00 SaltedOutpointHasherBenchmark_hash
    59.05 16,934,798.45 0.0% 444.96 212.44 2.094 20.93 1.3% 11.01 SaltedOutpointHasherBenchmark_match
    76.32 13,103,542.96 0.1% 531.55 274.56 1.936 19.99 2.4% 10.71 SaltedOutpointHasherBenchmark_mismatch

    Resulting in:

    SaltedOutpointHasherBenchmark_create_set: Before Avg = 7443000.96, After Avg = 7537349.76, Increase = 1.27%

    SaltedOutpointHasherBenchmark_hash: Before Avg = 41877576.91, After Avg = 42667898.67, Increase = 1.89%

    SaltedOutpointHasherBenchmark_match: Before Avg = 16686007.45, After Avg = 16956121.52, Increase = 1.62%

    SaltedOutpointHasherBenchmark_mismatch: Before Avg = 12570108.79, After Avg = 12938653.98, Increase = 2.93% @andrewtoth, what do you think, can you try reproducing these results?

  12. l0rinc marked this as ready for review on Jul 18, 2024
  13. andrewtoth commented at 3:24 am on July 25, 2024: contributor
    Based on your benchmark results, I don’t think this can be called an optimization. It seems to be worse for one benchmark, better in another, and roughly the same for the rest. The description also notes that this will not be noticeable by users. In light of that, does the motivation for this PR still hold?
  14. l0rinc closed this on Jul 25, 2024

  15. l0rinc commented at 6:11 pm on July 26, 2024: contributor
    I’ve removed the COutPoint_equality changes and benchmarks (updated the comments to avoid confusion), the SaltedOutpointHasher speed gains remain. I’ll let you decide if it’s worth doing the change or not.
  16. l0rinc reopened this on Jul 26, 2024

  17. l0rinc force-pushed on Jul 26, 2024
  18. l0rinc force-pushed on Jul 26, 2024
  19. DrahtBot commented at 4:18 pm on August 30, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/27977216903

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  20. DrahtBot added the label CI failed on Aug 30, 2024
  21. l0rinc force-pushed on Aug 31, 2024
  22. l0rinc renamed this:
    optimization: Precalculate SipHash constant XOR with k0 and k1 in SaltedOutpointHasher
    optimization: precalculate SipHash constant XOR with k0 and k1 in SaltedOutpointHasher
    on Aug 31, 2024
  23. DrahtBot removed the label CI failed on Aug 31, 2024
  24. DrahtBot added the label CI failed on Oct 13, 2024
  25. l0rinc force-pushed on Oct 15, 2024
  26. achow101 requested review from josibake on Oct 15, 2024
  27. bench: Add COutPoint and SaltedOutpointHasher benchmarks
    This commit introduces new benchmarks to measure the performance of various operations using
    SaltedOutpointHasher, including hash computation, set operations, and set creation.
    
    These benchmarks are intended to provide insights about coin caching performance (e.g. during IBD).
    
    > make -j10 && ./src/bench/bench_bitcoin -filter='.*Out[pP]oint.*' --min-time=10000
    
    initial measurements:
    
    |               ns/op |                op/s |    err% |     total | benchmark
    |--------------------:|--------------------:|--------:|----------:|:----------
    |               60.68 |       16,481,085.68 |    0.3% |     11.05 | `SaltedOutpointHasherBenchmark_create_set`
    |               20.64 |       48,451,463.67 |    0.0% |     11.00 | `SaltedOutpointHasherBenchmark_hash`
    |               26.73 |       37,408,850.57 |    0.3% |     11.02 | `SaltedOutpointHasherBenchmark_match`
    |               24.25 |       41,243,487.98 |    0.6% |     10.93 | `SaltedOutpointHasherBenchmark_mismatch`
    6ce13a33c6
  28. Precalculate SipHash constants XOR with k0 and k1
    Previously, only k0 and k1 were stored, requiring redundant calculations in `SipHashUint256Extra(k0, k1, id.hash, id.n)`.
    Now, the precomputed values (`v0`, `v1`, `v2`, `v3`) are stored and used, improving efficiency.
    
    |               ns/op |                op/s |    err% |     total | benchmark
    |--------------------:|--------------------:|--------:|----------:|:----------
    |               58.46 |       17,104,708.69 |    0.4% |     11.05 | `SaltedOutpointHasherBenchmark_create_set`
    |               19.26 |       51,919,263.84 |    0.1% |     11.00 | `SaltedOutpointHasherBenchmark_hash`
    |               25.51 |       39,207,035.36 |    0.2% |     11.00 | `SaltedOutpointHasherBenchmark_match`
    |               24.20 |       41,328,321.90 |    1.5% |     10.97 | `SaltedOutpointHasherBenchmark_mismatch`
    
    compared to master:
    SaltedOutpointHasherBenchmark_create_set - 17,104,708.69/16,481,085.68 - 3.7% faster
    SaltedOutpointHasherBenchmark_hash - 51,919,263.84/48,451,463.67 - 7% faster
    SaltedOutpointHasherBenchmark_match - 39,207,035.36/37,408,850.57 - 4.8% faster
    SaltedOutpointHasherBenchmark_mismatch - 41,328,321.90/41,243,487.98 - same
    bc959f5ba9
  29. l0rinc force-pushed on Oct 16, 2024
  30. l0rinc commented at 1:15 pm on October 16, 2024: contributor
    @laanwj This is the optimization that relies on #30349, would really appreciate you input on it.
  31. DrahtBot removed the label CI failed on Oct 16, 2024
  32. laanwj requested review from laanwj on Oct 17, 2024
  33. laanwj added the label Utils/log/libs on Oct 17, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-26 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me