Performance regression in scalar_mul with Clang and x86_64 assembly enabled #1682

hebasto opened this issue on June 5, 2025
  1. hebasto commented at 2:18 pm on June 5, 2025: member

    On Ubuntu 25.04:

    $ ./build_clang17_0_6_asm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0319    ,     0.0345    ,     0.0569
    field_mul                     ,     0.0173    ,     0.0173    ,     0.0175
    $ ./build_clang17_0_6_noasm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0266    ,     0.0268    ,     0.0275
    field_mul                     ,     0.0173    ,     0.0174    ,     0.0177
    
    $ ./build_clang18_1_8_asm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0336    ,     0.0345    ,     0.0357
    field_mul                     ,     0.0170    ,     0.0171    ,     0.0172
    $ ./build_clang18_1_8_noasm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0266    ,     0.0267    ,     0.0271
    field_mul                     ,     0.0170    ,     0.0171    ,     0.0174
    
    $ ./build_clang19_1_7_asm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0329    ,     0.0330    ,     0.0331
    field_mul                     ,     0.0166    ,     0.0167    ,     0.0171
    $ ./build_clang19_1_7_noasm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0270    ,     0.0270    ,     0.0271
    field_mul                     ,     0.0167    ,     0.0167    ,     0.0169
    
    $ ./build_clang20_1_2_asm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0330    ,     0.0343    ,     0.0447
    field_mul                     ,     0.0164    ,     0.0170    ,     0.0214
    $ ./build_clang20_1_2_noasm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0269    ,     0.0270    ,     0.0271
    field_mul                     ,     0.0165    ,     0.0165    ,     0.0166
    
  2. sipa commented at 2:19 pm on June 5, 2025: contributor
    Can you post benchmarks with GCC on the same machine?
  3. hebasto commented at 2:27 pm on June 5, 2025: member

    Can you post benchmarks with GCC on the same machine?

    Sure!

    $ ./build_gcc11_5_0_asm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0289    ,     0.0291    ,     0.0301
    field_mul                     ,     0.0151    ,     0.0152    ,     0.0154
    $ ./build_gcc11_5_0_noasm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0313    ,     0.0314    ,     0.0316
    field_mul                     ,     0.0151    ,     0.0151    ,     0.0153
    
    $ ./build_gcc15_0_1_asm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0292    ,     0.0293    ,     0.0295
    field_mul                     ,     0.0170    ,     0.0171    ,     0.0172
    $ ./build_gcc15_0_1_noasm/bin/bench_internal mul
    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

    scalar_mul                    ,     0.0327    ,     0.0328    ,     0.0338
    field_mul                     ,     0.0171    ,     0.0171    ,     0.0172
    
  4. real-or-random added the label performance on Jun 5, 2025
  5. real-or-random commented at 8:26 pm on June 5, 2025: contributor
    FWIW, I’ve hacked together a Compiler Explorer instance where you can compare Clang’s output (inline asm disabled) across versions. I don’t see a big change; it’s probably just a lot of incremental improvements that add up.
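
    For readers who want to compare the posted numbers directly, here is a minimal sketch that computes how much slower (positive) or faster (negative) the asm build is than the noasm build for scalar_mul. The Avg(us) values are copied from the benchmark output in this thread and the keys are the build directory names; the helper names (`asm_overhead_percent`, `summarize`) are hypothetical and not part of secp256k1.

```python
# scalar_mul Avg(us) values, copied from the bench_internal output in this thread.
# Keys are the build directory prefixes used in the posted commands.
AVG_US = {
    "clang17_0_6": {"asm": 0.0345, "noasm": 0.0268},
    "clang18_1_8": {"asm": 0.0345, "noasm": 0.0267},
    "clang19_1_7": {"asm": 0.0330, "noasm": 0.0270},
    "clang20_1_2": {"asm": 0.0343, "noasm": 0.0270},
    "gcc11_5_0": {"asm": 0.0291, "noasm": 0.0314},
    "gcc15_0_1": {"asm": 0.0293, "noasm": 0.0328},
}


def asm_overhead_percent(asm_us, noasm_us):
    """Relative cost of the asm build versus the noasm build, in percent.

    Positive means the asm build is slower, negative means it is faster.
    """
    return (asm_us / noasm_us - 1.0) * 100.0


def summarize(table):
    """Map each build to its asm-vs-noasm overhead, rounded to one decimal."""
    return {
        build: round(asm_overhead_percent(t["asm"], t["noasm"]), 1)
        for build, t in table.items()
    }


if __name__ == "__main__":
    for build, pct in summarize(AVG_US).items():
        print(f"{build:12s} scalar_mul asm vs noasm: {pct:+.1f}%")
```

    With these averages, every Clang build comes out slower with assembly enabled, while both GCC builds come out faster. This is a single run on one machine, so the percentages are only indicative.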

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-06-08 17:15 UTC