Use SIMD? #1700

issue Raimo33 opened this issue on July 13, 2025
  1. Raimo33 commented at 11:29 pm on July 13, 2025: none

    Ever thought about using SIMD intrinsics to speed up some functions?

    https://github.com/sipa/secp256k1/blob/master/src%2Ffield_10x26_impl.h

    This code, for example, is full of cases where SIMD would offer great benefit.

  2. Raimo33 commented at 11:29 pm on July 13, 2025: none
    I’m open to implement it myself if it gets decided
  3. real-or-random commented at 6:51 am on July 14, 2025: contributor

    Ever thought about using SIMD intrinsics to speed up some functions?

    Does this issue answer your question? #1110

    sipa/secp256k1@master/src%2Ffield_10x26_impl.h

    By the way, this link points to a 10-year-old version of the library code (because it points to the wrong repo).

    I’m open to implement it myself if it gets decided

    I would be happy to see experimentation with SIMD, and I think we’re in general open to the idea, but be aware that we have very high coding and reviewing standards, and not a lot of bandwidth. Reviewing such code will take a long time, and no one can give you a “decision” right now.

  4. real-or-random added the label performance on Jul 14, 2025
  5. Raimo33 commented at 7:20 am on July 14, 2025: none
    Ok, I will experiment then. Do you think I should make separate files, or use #ifndef blocks and embed the SSE2, AVX2, and AVX512 versions directly alongside the existing functions?
  6. real-or-random commented at 7:26 am on July 14, 2025: contributor
    I’d start with #ifdef blocks for experimentation. This gets you started quicker if some functions use intrinsics and some don’t because you won’t need to care about organizing files so that you’ll have all the right functions included.
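As a rough illustration of the #ifdef approach suggested above, here is a minimal sketch in which one function carries both an intrinsics path and a portable fallback. The helper name `example_limbs4_add` is hypothetical and not part of libsecp256k1; the sketch also assumes the compiler defines `__AVX2__` when AVX2 code generation is enabled (as GCC and Clang do with `-mavx2`).

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__AVX2__)
#include <immintrin.h>
#endif

/* Hypothetical helper (not from libsecp256k1): r[i] = a[i] + b[i] for four
 * 64-bit limbs, with wrap-around on overflow in both code paths. */
static void example_limbs4_add(uint64_t r[4], const uint64_t a[4], const uint64_t b[4]) {
#if defined(__AVX2__)
    /* AVX2 path: a single 256-bit add covers all four 64-bit lanes. */
    __m256i va = _mm256_loadu_si256((const __m256i *)a);
    __m256i vb = _mm256_loadu_si256((const __m256i *)b);
    _mm256_storeu_si256((__m256i *)r, _mm256_add_epi64(va, vb));
#else
    /* Portable fallback: same result, one limb at a time. */
    size_t i;
    for (i = 0; i < 4; i++) {
        r[i] = a[i] + b[i];
    }
#endif
}
```

Because both paths live in the same function body, callers do not change and no extra files need to be organized while experimenting; the trade-off is that the function becomes harder to read as more instruction sets are added.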
  7. Raimo33 commented at 10:33 am on July 14, 2025: none

    Hey, quick question: are the VERIFY blocks only for debugging? In other words, should I optimize them? For example:

    #ifdef VERIFY
    static void secp256k1_fe_impl_verify(const secp256k1_fe *a) {
        const uint64_t *d = a->n;
        int m = a->normalized ? 1 : 2 * a->magnitude;
        /* secp256k1 'p' value defined in "Standards for Efficient Cryptography" (SEC2) 2.7.1. */
        VERIFY_CHECK(d[0] <= 0xFFFFFFFFFFFFFULL * m);
        VERIFY_CHECK(d[1] <= 0xFFFFFFFFFFFFFULL * m);
        VERIFY_CHECK(d[2] <= 0xFFFFFFFFFFFFFULL * m);
        VERIFY_CHECK(d[3] <= 0xFFFFFFFFFFFFFULL * m);
        VERIFY_CHECK(d[4] <= 0x0FFFFFFFFFFFFULL * m);
        if (a->normalized) {
            if ((d[4] == 0x0FFFFFFFFFFFFULL) && ((d[3] & d[2] & d[1]) == 0xFFFFFFFFFFFFFULL)) {
                VERIFY_CHECK(d[0] < 0xFFFFEFFFFFC2FULL);
            }
        }
    }
    #endif
    
  8. real-or-random commented at 1:09 pm on July 14, 2025: contributor
    Yes, essentially. The VERIFY blocks and the VERIFY_CHECK macros are for assertions enabled only in the tests. No need to add SIMD there.
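To illustrate why such checks need no optimization, here is a minimal sketch of a test-only assertion macro. `EXAMPLE_VERIFY_CHECK` and `example_low52` are hypothetical names; this is not the library's actual definition of VERIFY_CHECK, only an assumption about the general pattern of assertions that exist only when `VERIFY` is defined.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical macro, modeled on the idea of test-only assertions. */
#ifdef VERIFY
#define EXAMPLE_VERIFY_CHECK(cond) do { \
    if (!(cond)) { \
        fprintf(stderr, "%s:%d: check failed: %s\n", __FILE__, __LINE__, #cond); \
        abort(); \
    } \
} while (0)
#else
/* Without VERIFY the condition is never evaluated; sizeof only keeps it
 * type-checked, so the check costs nothing at run time. */
#define EXAMPLE_VERIFY_CHECK(cond) do { (void)sizeof(cond); } while (0)
#endif

/* Hypothetical helper: keep the low 52 bits of x, as in a 5x52 limb. */
static uint64_t example_low52(uint64_t x) {
    uint64_t r = x & 0xFFFFFFFFFFFFFULL;
    EXAMPLE_VERIFY_CHECK(r <= 0xFFFFFFFFFFFFFULL);
    return r;
}
```

In a build without `VERIFY`, the check inside `example_low52` disappears entirely, which is why the speed of the checks themselves only matters for test runs.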
  9. Raimo33 commented at 4:29 pm on July 14, 2025: none

    I’ve added SIMD to field_5x52_impl.h. Please share feedback and let me know if I should continue with the other files.

    I ran the benchmarks (both with AVX2 enabled, to see the difference between auto-generated SIMD and manual SIMD), and the most significant improvement seems to be in the ecmult-related functions. I don’t know exactly what field_5x52_impl.h impacts. Take a look at the attached benchmarks.

    https://github.com/Raimo33/secp256k1/blob/simd/src/field_5x52_impl.h bench.zip


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-07-14 23:15 UTC
