ARM assembly implementation of field_10x26 inner #173

pull laanwj wants to merge 1 commit into bitcoin-core:master from laanwj:2014_12_field10x26_arm_asm changing 4 files +962 −8
  1. laanwj commented at 4:15 PM on December 26, 2014: member

    --with-field=32bit --with-scalar=32bit --with-asm=no --enable-benchmark --host=arm-linux-gnueabihf

    bench_verify:
    min 2661.978us / avg 2662.845us / max 2665.311us
    

    --with-field=32bit --with-scalar=32bit --with-asm=arm --enable-benchmark --host=arm-linux-gnueabihf

    bench_verify:
    min 1732.304us / avg 1732.896us / max 1733.773us
    

    For a 35% speed-up in total.

    Measured on HummingBoard-i1 w. i.MX6 Solo (Cortex A9).
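
    The 35% figure follows from the two average bench_verify timings above: it is the reduction in time per verification, which corresponds to roughly 1.54 times the throughput. A minimal sketch of that arithmetic, with variable names chosen here purely for illustration:

```python
# Average bench_verify timings quoted above, in microseconds.
avg_c_us = 2662.845    # --with-asm=no
avg_asm_us = 1732.896  # --with-asm=arm

# Fraction of time saved per verification.
time_saved = 1.0 - avg_asm_us / avg_c_us
# Equivalent throughput ratio (verifications per second, asm vs. C).
throughput_ratio = avg_c_us / avg_asm_us

print(f"time saved: {time_saved:.1%}")
print(f"throughput ratio: {throughput_ratio:.2f}x")
```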

  2. gmaxwell commented at 6:09 PM on December 26, 2014: contributor

    On a8 with endomorphism:

    From: min 3744.119us / avg 3744.770us / max 3745.406us to: min 1886.788us / avg 1887.016us / max 1887.225us

    Quite impressive.

  3. peterdettman commented at 5:22 AM on December 27, 2014: contributor

    Excellent! I gather this is without any NEON instructions so far?

    There's an alternative way to write the multiplication that can save ~40 limb muls for 10x26 (not Karatsuba, but using the same underlying principle - see http://eprint.iacr.org/2014/852.pdf to get the general idea, though our field is not as suitable). I have sample C code for 5x52 and could whip up a 10x26 version. It's not faster in C because it adds a lot of additions, but they are well-organised for vectorization, so with SIMD instruction sets it may be viable. @laanwj Do you think you'd be able to give it a shot with NEON (I'll supply the C)?
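
    For reference, the baseline being compared against is a schoolbook product of ten 26-bit limbs, which costs 100 limb multiplications. The sketch below only illustrates that baseline and its count. It is not the alternative method mentioned above and not the assembly in this pull request, and it reduces with Python big integers rather than limb by limb. The helper names are hypothetical.

```python
# secp256k1 field prime.
P = 2**256 - 2**32 - 977
LIMBS, BITS = 10, 26
MASK = (1 << BITS) - 1

def to_limbs(x):
    """Split an integer below 2**260 into ten 26-bit limbs, least significant first."""
    return [(x >> (BITS * i)) & MASK for i in range(LIMBS)]

def from_limbs(limbs):
    """Recombine limbs (of any size) into one integer."""
    return sum(l << (BITS * i) for i, l in enumerate(limbs))

def schoolbook_mul(a, b):
    """Return the 19 unreduced product columns and the number of limb multiplications."""
    cols = [0] * (2 * LIMBS - 1)
    muls = 0
    for i in range(LIMBS):
        for j in range(LIMBS):
            cols[i + j] += a[i] * b[j]
            muls += 1
    return cols, muls

x = 0x1234567890ABCDEF1234567890ABCDEF1234567890ABCDEF1234567890ABCDEF
y = P - 12345
cols, muls = schoolbook_mul(to_limbs(x), to_limbs(y))
print(muls)                                   # 100
print(from_limbs(cols) % P == (x * y) % P)    # True
```

    With fully reduced 26-bit limbs every column stays below 2**64, which is why each column can be accumulated in a 64-bit register pair.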

  4. peterdettman commented at 5:51 AM on December 27, 2014: contributor

    Also, I'm curious how much cache is on these boards you guys are testing with. Have you tried reducing WINDOW_G (ecmult_impl.h)?

  5. laanwj commented at 9:53 AM on December 27, 2014: member

    @peterdettman No NEON, I just wanted to see if I could beat the compiler with plain ARM assembly, as not all SoCs support NEON and this gives a baseline.

    I'm certainly interested in that more efficient C version, and I still intend to try NEON. NEON can do two 32×32→64 adds, multiplies, or multiply-accumulates at the same time, and has 32 64-bit (2×32) registers, so it would be interesting to see what different methods can be used there.

    Re: cache, the i.MX6 info states 32 KB instruction and data L1 caches and 256 KB to 1 MB of L2 cache. This is the bottom-of-the-line model so likely only 256 KB. I have not tried changing WINDOW_G. @gmaxwell That's indeed impressive, thanks for benchmarking. Somehow ASM optimizations work better for that board :) Maybe it is because of a difference in memory speed: the manual implementation removes a lot of loads from and stores to the stack compared to what gcc generates.

  6. sipa commented at 4:51 PM on December 29, 2014: contributor

    @gmaxwell You feel you can review this?

  7. gmaxwell commented at 9:47 AM on December 30, 2014: contributor

    Yep. Just backlogged a bit. (also figure it should spend some time cooking on the tests before merging in any case.)

  8. laanwj commented at 9:40 AM on January 21, 2015: member

    As this code has no loops, and only basic arithmetic and bit operations, would it be viable to use symbolic execution to check the computed result is equivalent to e.g. gcc's output? There's some work in s2e to do symbolic execution of ARM binary code. Conceptually it sounds simple but as usual I may be forgetting about a state explosion or two.

  9. laanwj commented at 3:38 PM on January 26, 2015: member

    Using miasm I generated IR expressions from the secp256k1_fe_*_inner assembly

    https://gist.github.com/laanwj/1b7730796aa94f5bfa87

    Next step would be to find out how to symbolically execute it, and whether it is possible to make something useful from the result.

  10. laanwj commented at 8:22 AM on January 30, 2015: member

    Made some progress on this. Using symbolic execution I verified that for both sqr and mul:

    • The code writes only to memory in a sequential area on the stack init_SP-xxx .. init_SP-4, and the output init_R0+0 .. init_R0+36
    • The code only inputs from memory for the input arguments, e.g. init_R1+0 .. init_R1+36 and init_R2+0 .. init_R2+36

    There is no dependence of the result on initial state of registers besides the memory addressed by R0,R1,R2. E.g. no other information leaks into the expression.

    I also generated a few images

    [image: sqr_a] sqr_inner from my assembly code

    [image: sqr_b] sqr_inner as generated by gcc

    The outputs (all 10 of them) are on the left, inputs (green) on the right. Legend:

    • Red: Multiply
    • Yellow: Add
    • Cyan: Other operations such as bitshifts, and, or
    • Blue: Bit slices and composes
    • Green: Memory reads

    Yes, the graph layout kind of sucks, too many intersections. I'm thinking of using simulated annealing to clean it up. Although this naive DAG approach still manages to show the structure better than anything I could get Gephi to produce. A bit like the Eiffel tower on its side.

    I don't think comparing them is going to be particularly easy, at the least it's going to take a lot of expression rewriting.

  11. gmaxwell commented at 2:06 AM on January 31, 2015: contributor

    @laanwj So a validation strategy is to first prove via range analysis that the calculation can never overflow. Having done that, it should be possible to convert the asm to an algebraic statement (e.g. first convert it to an SSA form, then just substitute in regular operations). Then the algebraic statement could be simplified with a CAS and compared to the ideal representation of the function (or a conversion from the asm generated by GCC).

  12. laanwj commented at 2:13 PM on January 31, 2015: member

    For the leaf multiplications and additions it'd be quite easy to prove that no overflow happens. But I expect the least to be learned there, as the assembly code is a straightforward implementation of the C operations with umlal/umull.

    However the most annoying are the carry computations (for ADC+ADDS 64 bit addition) higher up. Miasm's evaluator creates an expression based on cf = (((op1 ^ op2) ^ res) ^ ((op1 ^ res) & (~(op1 ^ op2)))).msb to compute it, which can be simplified somewhat by using a 33-bit addition then taking the upper bit. Maybe it'd be possible to recognize 64-bit additions and substitute them back, getting rid of the carry logic completely.
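
    The carry expression quoted here can be checked against the simpler "wide addition, take the upper bit" form. The following is a minimal sketch in plain Python, not using Miasm; the function names are made up for illustration and the 32-bit masking is explicit because Python integers are unbounded.

```python
import random

MASK32 = 0xFFFFFFFF

def carry_from_xor_formula(op1, op2):
    """Carry-out of a 32-bit add, using the XOR/AND expression quoted above."""
    res = (op1 + op2) & MASK32
    expr = ((op1 ^ op2) ^ res) ^ ((op1 ^ res) & (~(op1 ^ op2) & MASK32))
    return (expr >> 31) & 1  # .msb

def carry_from_wide_add(op1, op2):
    """Carry-out taken as bit 32 of the untruncated sum."""
    return ((op1 + op2) >> 32) & 1

def add64(a_lo, a_hi, b_lo, b_hi):
    """ADDS/ADC-style 64-bit addition on 32-bit halves."""
    lo = (a_lo + b_lo) & MASK32
    hi = (a_hi + b_hi + carry_from_xor_formula(a_lo, b_lo)) & MASK32
    return lo, hi

random.seed(0)
samples = [(0, 0), (MASK32, 1), (MASK32, MASK32), (0x80000000, 0x80000000)]
samples += [(random.getrandbits(32), random.getrandbits(32)) for _ in range(10000)]
agree = all(carry_from_xor_formula(a, b) == carry_from_wide_add(a, b)
            for a, b in samples)
print(agree)                          # True
print(add64(0xFFFFFFFF, 1, 1, 0))     # (0, 2)
```

    Since both forms agree, a rewriting pass could in principle replace the flag expression with a single wider addition, which is the substitution suggested above.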

    (Another is the 64-bit shift, where y = x>>S gets assembled to y.h = x.h >> S, y.l = (x.l >> S) | (x.h << (32-S)). By rewriting all shifts to bit compose/splicing and recognizing bitwise OR of disjoint composes, this could be reassembled into one operation. Maybe this isn't needed though, I'm not sure how much the expression needs to be simplified at all, just to match it...)
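
    The split 64-bit right shift can be sketched on 32-bit halves as follows, with both output halves computed from the halves of the input x. This assumes 0 < S < 32 and uses hypothetical helper names; it shows that the two half-word operations recombine into one plain 64-bit shift.

```python
MASK32 = 0xFFFFFFFF

def shr64_split(x_lo, x_hi, s):
    """Right-shift the 64-bit value (x_hi:x_lo) by s using 32-bit operations only."""
    assert 0 < s < 32
    y_hi = x_hi >> s
    # Low half: its own bits shifted down, OR'd with the bits that fall out of the high half.
    y_lo = (x_lo >> s) | ((x_hi << (32 - s)) & MASK32)
    return y_lo, y_hi

def shr64_reference(x, s):
    """Same shift done as a single 64-bit operation, split afterwards."""
    y = x >> s
    return y & MASK32, y >> 32

x = 0x0123456789ABCDEF
same = all(
    shr64_split(x & MASK32, x >> 32, s) == shr64_reference(x, s)
    for s in range(1, 32)
)
print(same)  # True
```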

  13. gmaxwell commented at 2:33 PM on January 31, 2015: contributor

    ::nods:: The purpose of proving no overflow is not that it's very useful in and of itself, but so that the rest of the proving can be done by replacing everything with plain integer operations (instead of finite machine words) and using plain algebra, since if nothing overflows the operations are the same.
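
    As a rough illustration of what such a range argument looks like, the sketch below bounds the largest column of a 10-limb product for a few assumed limb widths. The widths 26, 30 and 31 are illustrative assumptions, not bounds stated in this thread, and a real analysis would also have to track the reduction and carry terms that the actual code folds in.

```python
def max_column_sum(limb_bits, terms=10):
    """Largest possible sum of `terms` products of two limbs below 2**limb_bits."""
    max_limb = (1 << limb_bits) - 1
    return terms * max_limb * max_limb

# Check whether the worst-case column fits in an unsigned 64-bit accumulator.
for bits in (26, 30, 31):
    fits = max_column_sum(bits) < 2**64
    print(bits, fits)
# 26 True
# 30 True
# 31 False
```

    Once every intermediate value is shown to stay inside its word size like this, the machine operations can be treated as ordinary integer arithmetic.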

  14. ARM assembly implementation of field_10x26 inner 1a619fefc9
  15. laanwj force-pushed on Mar 28, 2015
  16. gmaxwell commented at 1:26 AM on April 18, 2015: contributor

    FWIW, this doesn't autodetect for me on my Novena; it needs a manual flag. Is it supposed to?

  17. laanwj commented at 10:57 AM on April 18, 2015: member

    It's not supposed to autodetect. This is experimental, after all.

  18. sipa commented at 12:16 PM on April 27, 2015: contributor

    I do feel that holding this up is a bit unfair - I'm probably demanding a level of review here that wasn't demanded for the x86_64 assembly, especially as I'd really like to see this in. Still, I'd like to see someone confirm they have reviewed it...

  19. luke-jr commented at 8:36 AM on July 29, 2015: member

    Benchmarking on USB Armory:

    With c33307495b3a6658e602e14067dd594136d4690a
    configure: Using assembly optimizations: no
    configure: Using field implementation: 32bit
    configure: Using bignum implementation: no
    configure: Using scalar implementation: 32bit
    configure: Using endomorphism optimizations: no
    
    $ bench_internal
    scalar_add: min 0.268us / avg 0.268us / max 0.269us
    scalar_negate: min 0.160us / avg 0.160us / max 0.161us
    scalar_sqr: min 2.04us / avg 2.04us / max 2.04us
    scalar_mul: min 1.93us / avg 1.93us / max 1.93us
    scalar_inverse: min 606us / avg 606us / max 606us
    scalar_inverse_var: min 606us / avg 606us / max 606us
    field_normalize: min 0.0801us / avg 0.0802us / max 0.0807us
    field_normalize_weak: min 0.0488us / avg 0.0489us / max 0.0494us
    field_sqr: min 1.25us / avg 1.26us / max 1.26us
    field_mul: min 1.88us / avg 1.88us / max 1.89us
    field_inverse: min 349us / avg 349us / max 350us
    field_inverse_var: min 349us / avg 349us / max 350us
    field_sqrt_var: min 344us / avg 345us / max 345us
    group_double_var: min 11.0us / avg 11.0us / max 11.0us
    group_add_var: min 28.0us / avg 28.0us / max 28.0us
    group_add_affine: min 20.9us / avg 20.9us / max 20.9us
    group_add_affine_var: min 19.4us / avg 19.4us / max 19.4us
    ecmult_wnaf: min 4.96us / avg 4.97us / max 5.02us
    hash_sha256: min 2.53us / avg 2.54us / max 2.56us
    hash_hmac_sha256: min 10.2us / avg 10.2us / max 10.2us
    hash_rfc6979_hmac_sha256: min 55.9us / avg 56.0us / max 56.0us
    
    $ bench_recover
    ecdsa_recover: min 5522us / avg 5522us / max 5522us
    
    $ bench_sign
    ecdsa_sign: min 2521us / avg 2521us / max 2522us
    
    $ bench_verify
    ecdsa_verify: min 5167us / avg 5167us / max 5167us
    
    With c33307495b3a6658e602e14067dd594136d4690a+1a619fefc90e29d04c9f740af8e86142a40e1d5a:
    configure: Using assembly optimizations: arm
    configure: Using field implementation: 32bit
    configure: Using bignum implementation: no
    configure: Using scalar implementation: 32bit
    configure: Using endomorphism optimizations: no
    
    $ bench_internal
    scalar_add: min 0.268us / avg 0.268us / max 0.269us
    scalar_negate: min 0.158us / avg 0.158us / max 0.159us
    scalar_sqr: min 2.04us / avg 2.04us / max 2.04us
    scalar_mul: min 1.93us / avg 1.93us / max 1.93us
    scalar_inverse: min 606us / avg 606us / max 606us
    scalar_inverse_var: min 606us / avg 606us / max 607us
    field_normalize: min 0.0801us / avg 0.0802us / max 0.0807us
    field_normalize_weak: min 0.0488us / avg 0.0489us / max 0.0494us
    field_sqr: min 0.597us / avg 0.598us / max 0.603us
    field_mul: min 0.810us / avg 0.811us / max 0.816us
    field_inverse: min 165us / avg 165us / max 165us
    field_inverse_var: min 165us / avg 165us / max 165us
    field_sqrt_var: min 163us / avg 163us / max 163us
    group_double_var: min 5.22us / avg 5.22us / max 5.22us
    group_add_var: min 12.5us / avg 12.5us / max 12.5us
    group_add_affine: min 10.1us / avg 10.1us / max 10.1us
    group_add_affine_var: min 8.80us / avg 8.80us / max 8.80us
    ecmult_wnaf: min 4.95us / avg 4.96us / max 5.01us
    hash_sha256: min 2.54us / avg 2.54us / max 2.55us
    hash_hmac_sha256: min 10.2us / avg 10.2us / max 10.2us
    hash_rfc6979_hmac_sha256: min 55.6us / avg 55.6us / max 55.7us
    
    $ bench_recover
    ecdsa_recover: min 2932us / avg 2932us / max 2932us
    
    $ bench_sign
    ecdsa_sign: min 1650us / avg 1650us / max 1652us
    
    $ bench_verify
    ecdsa_verify: min 2769us / avg 2769us / max 2770us
    
  20. luke-jr commented at 6:49 AM on September 20, 2015: member

    Benchmarking on Nokia N900:

    With 85e3a2cc087993973a2195849c652005b0be7ddd
    CFLAGS='-mcpu=cortex-a8 -mfpu=vfpv3 -mfloat-abi=hard -O2'
    configure: Using assembly optimizations: no
    configure: Using field implementation: 32bit
    configure: Using bignum implementation: gmp
    configure: Using scalar implementation: 32bit
    configure: Using endomorphism optimizations: yes
    configure: Building ECDH module: yes
    configure: Building Schnorr signatures module: yes
    configure: Building ECDSA pubkey recovery module: yes
    
    $ tests
    test count = 64
    random seed = f3f2447cb0df6420fe4a7ac0af8307ee
    random run = ca0a22a61ef8f80fc89fece8e28c6378
    
    $ bench_internal
    scalar_add: min 0.294us / avg 0.300us / max 0.320us
    scalar_negate: min 0.125us / avg 0.126us / max 0.128us
    scalar_sqr: min 2.52us / avg 2.56us / max 2.70us
    scalar_mul: min 2.31us / avg 2.36us / max 2.66us
    scalar_split: min 10.1us / avg 10.1us / max 10.3us
    scalar_inverse: min 746us / avg 753us / max 786us
    scalar_inverse_var: min 30.1us / avg 30.4us / max 31.8us
    field_normalize: min 0.125us / avg 0.126us / max 0.128us
    field_normalize_weak: min 0.0685us / avg 0.0689us / max 0.0705us
    field_sqr: min 0.792us / avg 0.798us / max 0.814us
    field_mul: min 1.08us / avg 1.09us / max 1.12us
    field_inverse: min 220us / avg 222us / max 227us
    field_inverse_var: min 39.7us / avg 40.2us / max 42.5us
    field_sqrt_var: min 217us / avg 218us / max 224us
    group_double_var: min 6.76us / avg 6.84us / max 7.15us
    group_add_var: min 16.5us / avg 16.6us / max 17.0us
    group_add_affine: min 13.2us / avg 13.3us / max 13.4us
    group_add_affine_var: min 11.6us / avg 11.6us / max 11.9us
    wnaf_const: min 3.15us / avg 3.18us / max 3.33us
    ecmult_wnaf: min 6.58us / avg 6.65us / max 7.15us
    hash_sha256: min 2.55us / avg 2.58us / max 2.70us
    hash_hmac_sha256: min 10.6us / avg 10.8us / max 11.1us
    hash_rfc6979_hmac_sha256: min 58.4us / avg 59.1us / max 61.1us
    context_verify: min 327948us / avg 330529us / max 333565us
    context_sign: min 1160us / avg 1169us / max 1211us
    
    $ bench_recover
    ecdsa_recover: min 2097us / avg 2101us / max 2109us
    
    $ bench_sign
    ecdsa_sign: min 2128us / avg 2131us / max 2137us
    
    $ bench_verify
    ecdsa_verify: min 2052us / avg 2058us / max 2064us
    
    $ bench_ecdh
    ecdh: min 2506us / avg 2519us / max 2579us
    
    $ bench_schnorr_verify
    schnorr_verify: min 2036us / avg 2039us / max 2042us
    
    With 85e3a2cc087993973a2195849c652005b0be7ddd+1a619fefc90e29d04c9f740af8e86142a40e1d5a
    CFLAGS='-mcpu=cortex-a8 -mfpu=vfpv3 -mfloat-abi=hard -O2'
    configure: Using assembly optimizations: arm
    configure: Using field implementation: 32bit
    configure: Using bignum implementation: gmp
    configure: Using scalar implementation: 32bit
    configure: Using endomorphism optimizations: yes
    configure: Building ECDH module: yes
    configure: Building Schnorr signatures module: yes
    configure: Building ECDSA pubkey recovery module: yes
    
    $ tests
    test count = 64
    random seed = dd63c4e6e5d1b385ba297417ab7db622
    random run = dc334c84b1b820836d957e83d1580f89
    
    $ bench_internal
    scalar_add: min 0.294us / avg 0.298us / max 0.312us
    scalar_negate: min 0.125us / avg 0.126us / max 0.130us
    scalar_sqr: min 2.52us / avg 2.56us / max 2.73us
    scalar_mul: min 2.31us / avg 2.33us / max 2.36us
    scalar_split: min 10.1us / avg 10.1us / max 10.5us
    scalar_inverse: min 746us / avg 752us / max 773us
    scalar_inverse_var: min 30.1us / avg 30.6us / max 33.1us
    field_normalize: min 0.125us / avg 0.126us / max 0.130us
    field_normalize_weak: min 0.0685us / avg 0.0693us / max 0.0725us
    field_sqr: min 0.792us / avg 0.818us / max 1.02us
    field_mul: min 1.08us / avg 1.08us / max 1.13us
    field_inverse: min 220us / avg 221us / max 223us
    field_inverse_var: min 39.7us / avg 40.2us / max 42.0us
    field_sqrt_var: min 217us / avg 218us / max 221us
    group_double_var: min 6.75us / avg 6.82us / max 7.03us
    group_add_var: min 16.5us / avg 16.6us / max 16.8us
    group_add_affine: min 13.2us / avg 13.3us / max 13.5us
    group_add_affine_var: min 11.6us / avg 11.6us / max 11.8us
    wnaf_const: min 3.15us / avg 3.20us / max 3.42us
    ecmult_wnaf: min 6.58us / avg 6.70us / max 7.31us
    hash_sha256: min 2.55us / avg 2.60us / max 2.84us
    hash_hmac_sha256: min 10.7us / avg 10.8us / max 11.4us
    hash_rfc6979_hmac_sha256: min 58.4us / avg 59.2us / max 61.4us
    context_verify: min 327219us / avg 328688us / max 331987us
    context_sign: min 1158us / avg 1168us / max 1208us
    
    $ bench_recover
    ecdsa_recover: min 2097us / avg 2101us / max 2106us
    
    $ bench_sign
    ecdsa_sign: min 2120us / avg 2123us / max 2126us
    
    $ bench_verify
    ecdsa_verify: min 2052us / avg 2055us / max 2059us
    
    $ bench_ecdh
    ecdh: min 2508us / avg 2557us / max 2769us
    
    $ bench_schnorr_verify
    schnorr_verify: min 2035us / avg 2043us / max 2050us
    
    With 85e3a2cc087993973a2195849c652005b0be7ddd
    CFLAGS='-mthumb -mno-thumb-interwork -mcpu=cortex-a8 -mfpu=vfpv3 -mfloat-abi=hard -O2 -Wa,-mthumb'
    configure: Using assembly optimizations: no
    configure: Using field implementation: 32bit
    configure: Using bignum implementation: gmp
    configure: Using scalar implementation: 32bit
    configure: Using endomorphism optimizations: yes
    configure: Building ECDH module: yes
    configure: Building Schnorr signatures module: yes
    configure: Building ECDSA pubkey recovery module: yes
    
    $ tests
    test count = 64
    random seed = 61edcfb05a0a4846402d3471b26c00c2
    random run = 590a9e113ba71a79db92ca985e503664
    
    $ bench_internal
    scalar_add: min 0.312us / avg 0.316us / max 0.331us
    scalar_negate: min 0.139us / avg 0.139us / max 0.144us
    scalar_sqr: min 2.57us / avg 2.59us / max 2.65us
    scalar_mul: min 2.44us / avg 2.45us / max 2.49us
    scalar_split: min 10.8us / avg 10.9us / max 11.5us
    scalar_inverse: min 764us / avg 770us / max 792us
    scalar_inverse_var: min 29.6us / avg 30.4us / max 33.1us
    field_normalize: min 0.134us / avg 0.134us / max 0.138us
    field_normalize_weak: min 0.0618us / avg 0.0627us / max 0.0659us
    field_sqr: min 1.68us / avg 1.69us / max 1.72us
    field_mul: min 2.06us / avg 2.11us / max 2.43us
    field_inverse: min 421us / avg 423us / max 425us
    field_inverse_var: min 40.7us / avg 41.0us / max 41.6us
    field_sqrt_var: min 415us / avg 417us / max 419us
    group_double_var: min 12.8us / avg 12.9us / max 13.0us
    group_add_var: min 31.3us / avg 31.4us / max 31.6us
    group_add_affine: min 23.8us / avg 24.0us / max 24.2us
    group_add_affine_var: min 21.7us / avg 21.8us / max 22.0us
    wnaf_const: min 3.71us / avg 3.76us / max 4.02us
    ecmult_wnaf: min 6.65us / avg 6.74us / max 7.22us
    hash_sha256: min 2.75us / avg 2.82us / max 3.04us
    hash_hmac_sha256: min 11.4us / avg 11.5us / max 11.8us
    hash_rfc6979_hmac_sha256: min 62.6us / avg 63.3us / max 65.3us
    context_verify: min 573358us / avg 576329us / max 578496us
    context_sign: min 1852us / avg 1866us / max 1906us
    
    $ bench_recover
    ecdsa_recover: min 3816us / avg 3820us / max 3825us
    
    $ bench_sign
    ecdsa_sign: min 3021us / avg 3025us / max 3028us
    
    $ bench_verify
    ecdsa_verify: min 3771us / avg 3775us / max 3778us
    
    $ bench_ecdh
    ecdh: min 4536us / avg 4540us / max 4548us
    
    $ bench_schnorr_verify
    schnorr_verify: min 3735us / avg 3737us / max 3741us
    
    With 85e3a2cc087993973a2195849c652005b0be7ddd+1a619fefc90e29d04c9f740af8e86142a40e1d5a
    CFLAGS='-mthumb -mno-thumb-interwork -mcpu=cortex-a8 -mfpu=vfpv3 -mfloat-abi=hard -O2 -Wa,-mthumb'
    configure: Using assembly optimizations: arm
    configure: Using field implementation: 32bit
    configure: Using bignum implementation: gmp
    configure: Using scalar implementation: 32bit
    configure: Using endomorphism optimizations: yes
    configure: Building ECDH module: yes
    configure: Building Schnorr signatures module: yes
    configure: Building ECDSA pubkey recovery module: yes
    
    $ tests
    test count = 64
    random seed = f4d64b19fa6b8ed3112e13f9917e0bfb
    random run = 0fdb669452614944fa8a5fc42d522ff1
    
    $ bench_internal
    scalar_add: min 0.312us / avg 0.323us / max 0.404us
    scalar_negate: min 0.138us / avg 0.139us / max 0.144us
    scalar_sqr: min 2.57us / avg 2.61us / max 2.80us
    scalar_mul: min 2.44us / avg 2.46us / max 2.49us
    scalar_split: min 10.8us / avg 10.9us / max 11.2us
    scalar_inverse: min 764us / avg 773us / max 795us
    scalar_inverse_var: min 29.8us / avg 30.7us / max 32.6us
    field_normalize: min 0.134us / avg 0.135us / max 0.145us
    field_normalize_weak: min 0.0618us / avg 0.0626us / max 0.0659us
    field_sqr: min 0.793us / avg 0.829us / max 0.924us
    field_mul: min 1.08us / avg 1.08us / max 1.12us
    field_inverse: min 220us / avg 221us / max 226us
    field_inverse_var: min 39.2us / avg 39.5us / max 39.9us
    field_sqrt_var: min 217us / avg 218us / max 219us
    group_double_var: min 6.78us / avg 6.83us / max 7.03us
    group_add_var: min 16.6us / avg 16.7us / max 17.0us
    group_add_affine: min 13.3us / avg 13.4us / max 13.6us
    group_add_affine_var: min 11.6us / avg 11.7us / max 11.9us
    wnaf_const: min 3.71us / avg 3.75us / max 3.97us
    ecmult_wnaf: min 6.71us / avg 6.79us / max 7.13us
    hash_sha256: min 2.76us / avg 2.81us / max 3.07us
    hash_hmac_sha256: min 11.4us / avg 11.5us / max 11.8us
    hash_rfc6979_hmac_sha256: min 62.7us / avg 63.3us / max 64.9us
    context_verify: min 328046us / avg 329295us / max 330910us
    context_sign: min 1164us / avg 1173us / max 1218us
    
    $ bench_recover
    ecdsa_recover: min 2098us / avg 2101us / max 2105us
    
    $ bench_sign
    ecdsa_sign: min 2136us / avg 2140us / max 2156us
    
    $ bench_verify
    ecdsa_verify: min 2054us / avg 2057us / max 2060us
    
    $ bench_ecdh
    ecdh: min 2508us / avg 2515us / max 2528us
    
    $ bench_schnorr_verify
    schnorr_verify: min 2036us / avg 2038us / max 2041us
    
  21. sipa commented at 5:36 PM on September 22, 2015: contributor

    Needs rebase. I'd still like to see this in, but someone reviewing it would be nice...

  22. sipa cross-referenced this on Dec 12, 2015 from issue ARM assembly implementation of field_10x26 inner (rebase of #173) by sipa
  23. laanwj cross-referenced this on Feb 8, 2016 from issue "Activating best chain" can take very long by laanwj
  24. mruddy cross-referenced this on Feb 28, 2016 from issue More recent checkpoints? by achow101
  25. laanwj cross-referenced this on Mar 25, 2016 from issue Support for armv7,armv7s and arm64 by sandeepmalode
  26. sipa referenced this in commit 7b0fb18b75 on May 25, 2016
  27. laanwj commented at 3:12 PM on October 11, 2016: member

    This should be closed as it was merged.

  28. laanwj closed this on Oct 11, 2016

  29. laanwj cross-referenced this on Dec 2, 2016 from issue Need info for NEON implementation of field multiplication by laanwj

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-14 14:15 UTC