build: detect arm32 assembly by default #1752

pull Raimo33 wants to merge 3 commits into bitcoin-core:master from Raimo33:detect-arm32-asm changing 5 files +14 −23
  1. Raimo33 commented at 5:29 pm on September 19, 2025: none

    This PR addresses issue #1751 by adding a call to check_arm32_assembly() by default, matching the current behavior with check_x86_64_assembly().

    This would result in speedup on field_10x26_impl.h on default builds. For example, Bitcoin Core currently compiles libsecp256k1 with default options, leading to suboptimal builds.

    This change could help address https://github.com/bitcoin/bitcoin/issues/32832 partially, considering the flamegraph shows that ecdsa_verify takes 90% of IBD time.

    Benchmark          ,  Avg(us) OFF  ,  Avg(us) arm32  ,  Improvement (%)
    ecdsa_verify       ,  379.0        ,  322.0          ,  15.0
    ecdsa_sign         ,  184.0        ,  170.0          ,   7.6
    ec_keygen          ,  160.0        ,  145.0          ,   9.4
    ecdh               ,  382.0        ,  332.0          ,  13.1
    schnorrsig_sign    ,  162.0        ,  148.0          ,   8.6
    schnorrsig_verify  ,  380.0        ,  323.0          ,  15.0
    ellswift_encode    ,  109.0        ,   95.1          ,  12.7
    ellswift_decode    ,   60.2        ,   50.8          ,  15.6
    ellswift_keygen    ,  268.0        ,  240.0          ,  10.4
    ellswift_ecdh      ,  395.0        ,  343.0          ,  13.2
  2. Raimo33 force-pushed on Sep 19, 2025
  3. Raimo33 force-pushed on Sep 19, 2025
  4. hebasto commented at 8:28 pm on September 19, 2025: member

    > This would result in speedup on field_10x26_impl.h on default builds.

    Please provide benchmarks to support this statement.

  5. Raimo33 force-pushed on Sep 20, 2025
  6. Raimo33 force-pushed on Sep 20, 2025
  7. Raimo33 commented at 2:01 pm on September 22, 2025: none

    > Please provide benchmarks to support this statement.

    Will do. I’m buying a Raspberry Pi right now. I reckon that if the benchmarks don’t show improvements, we should delete field_10x26_arm.s entirely.

  8. hebasto commented at 1:15 pm on September 25, 2025: member
    Perhaps convert this to a draft while the CI is red?
  9. Raimo33 marked this as a draft on Sep 25, 2025
  10. Raimo33 commented at 4:00 pm on October 28, 2025: none

    > Please provide benchmarks to support this statement.

    I’ve run benchmarks on my Raspberry Pi 4. Here are the results.

    Optional features: assembly ………………………. OFF

    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)    

    ecdsa_verify                  ,   378.0       ,   379.0       ,   379.0    
    ecdsa_sign                    ,   184.0       ,   184.0       ,   185.0    
    ec_keygen                     ,   160.0       ,   160.0       ,   160.0    
    ecdh                          ,   382.0       ,   382.0       ,   383.0    
    schnorrsig_sign               ,   162.0       ,   162.0       ,   162.0    
    schnorrsig_verify             ,   380.0       ,   380.0       ,   381.0    
    ellswift_encode               ,   109.0       ,   109.0       ,   109.0    
    ellswift_decode               ,    60.1       ,    60.2       ,    60.3    
    ellswift_keygen               ,   268.0       ,   268.0       ,   268.0    
    ellswift_ecdh                 ,   395.0       ,   395.0       ,   395.0
    

    Optional features: assembly ………………………. arm32

    Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)    

    ecdsa_verify                  ,   322.0       ,   322.0       ,   322.0    
    ecdsa_sign                    ,   170.0       ,   170.0       ,   170.0    
    ec_keygen                     ,   145.0       ,   145.0       ,   145.0    
    ecdh                          ,   332.0       ,   332.0       ,   333.0    
    schnorrsig_sign               ,   148.0       ,   148.0       ,   149.0    
    schnorrsig_verify             ,   323.0       ,   323.0       ,   324.0    
    ellswift_encode               ,    94.9       ,    95.1       ,    95.3    
    ellswift_decode               ,    50.6       ,    50.8       ,    50.9    
    ellswift_keygen               ,   239.0       ,   240.0       ,   240.0    
    ellswift_ecdh                 ,   343.0       ,   343.0       ,   343.0
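
The improvement percentages in the PR description can be reproduced from the two benchmark runs above. The following is a minimal sketch, not part of the PR: the helper names `parse_bench` and `improvement` are hypothetical, and the embedded text is a two-row excerpt of the posted output. It assumes the comma-separated layout shown above, with the average in the third column.

```python
import csv
import io


def parse_bench(text):
    """Map each benchmark name to its average time in microseconds."""
    averages = {}
    for row in csv.reader(io.StringIO(text)):
        cells = [cell.strip() for cell in row]
        # Skip blank lines and the header row.
        if len(cells) < 4 or cells[0] == "Benchmark":
            continue
        averages[cells[0]] = float(cells[2])  # third column is Avg(us)
    return averages


def improvement(before_us, after_us):
    """Relative speedup in percent, rounded to one decimal place."""
    return round(100.0 * (before_us - after_us) / before_us, 1)


# Excerpt of the output posted above (assembly OFF vs. arm32).
ASM_OFF = """\
Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

ecdsa_verify                  ,   378.0       ,   379.0       ,   379.0
ecdsa_sign                    ,   184.0       ,   184.0       ,   185.0
"""

ASM_ARM32 = """\
Benchmark                     ,    Min(us)    ,    Avg(us)    ,    Max(us)

ecdsa_verify                  ,   322.0       ,   322.0       ,   322.0
ecdsa_sign                    ,   170.0       ,   170.0       ,   170.0
"""

off = parse_bench(ASM_OFF)
arm32 = parse_bench(ASM_ARM32)
# Percentage improvement per benchmark present in both runs.
gains = {name: improvement(off[name], arm32[name]) for name in off if name in arm32}
```

Here `gains` maps each benchmark name to its percentage improvement, computed as (OFF − arm32) / OFF on the average times.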
    
  11. hebasto commented at 4:18 pm on October 28, 2025: member

    > I’ve run benchmarks on my Raspberry Pi 4. Here are the results.

    Which compiler did you use?

  12. Raimo33 commented at 4:25 pm on October 28, 2025: none

    > Which compiler did you use?

    I used the option -DCMAKE_TOOLCHAIN_FILE=./cmake/arm-linux-gnueabihf.toolchain.cmake

    C compiler ............................ GNU 13.3.0, /usr/bin/arm-linux-gnueabihf-gcc
    
  13. hebasto commented at 4:56 pm on October 28, 2025: member

    > > Which compiler did you use?
    >
    > I used the option -DCMAKE_TOOLCHAIN_FILE=./cmake/arm-linux-gnueabihf.toolchain.cmake
    >
    > C compiler ............................ GNU 13.3.0, /usr/bin/arm-linux-gnueabihf-gcc
    

    This workflow is for cross-compiling. Did you try to build natively on your RPi?

  14. build: remove assembly detection when explicitly disabled 18ea93ebdb
  15. build: detect arm32 assembly by default 7d52166fd5
  16. build: remove experimental warning for arm32 assembly 557bad20da
  17. Raimo33 force-pushed on Oct 28, 2025
  18. Raimo33 commented at 5:56 pm on October 28, 2025: none

    > Did you try to build natively on your RPi?

    I get:

    /home/pi/secp256k1/src/asm/field_10x26_arm.s:875: Error: selected processor does not support `ubfx r2,r3,#0,#22' in ARM mode
    /home/pi/secp256k1/src/asm/field_10x26_arm.s:880: Error: selected processor does not support `movw r14,field_R1<<4' in ARM mode
    

    when building natively on my RPi. The compiler is /usr/libexec/gcc/arm-linux-gnueabihf/14/. Apparently that’s because RPis build for armv6 architectures, not armv7…

    and our field_10x26_arm.s is only compatible with armv7
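
One way to see which architecture level a given compiler targets by default is to dump its predefined macros and read `__ARM_ARCH`, which GCC and Clang define on ARM targets. The sketch below is an illustration and not part of the PR. The helper names are hypothetical, and the `>= 7` threshold simply follows the armv7 statement above (to my knowledge `ubfx` and `movw` were introduced with ARMv6T2, so a plain armv6 target rejects them).

```python
import re
import subprocess


def compiler_macros(cc="gcc"):
    """Return the predefined macro dump of `cc` (GCC/Clang style: -dM -E)."""
    result = subprocess.run(
        [cc, "-dM", "-E", "-x", "c", "-"],
        input="",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def arm_arch_from_macros(macro_dump):
    """Extract the numeric value of __ARM_ARCH, or None if it is not defined."""
    match = re.search(r"^#define __ARM_ARCH (\d+)\s*$", macro_dump, re.MULTILINE)
    return int(match.group(1)) if match else None


def targets_arm32_v7_or_later(macro_dump):
    """True if the dump describes a 32-bit ARM target at architecture level 7+."""
    # __arm__ is defined on 32-bit ARM targets only (not on AArch64).
    is_arm32 = re.search(r"^#define __arm__ 1\s*$", macro_dump, re.MULTILINE) is not None
    arch = arm_arch_from_macros(macro_dump)
    return is_arm32 and arch is not None and arch >= 7
```

Calling `targets_arm32_v7_or_later(compiler_macros("arm-linux-gnueabihf-gcc"))` on the cross toolchain and `targets_arm32_v7_or_later(compiler_macros("gcc"))` natively on the RPi would be expected to differ if the native compiler defaults to armv6, which would match the assembler errors above.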




github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-11-05 17:15 UTC
