secp256k1_der_parse_integer is slow #577

issue jb55 opened this issue on December 14, 2018
  1. jb55 commented at 4:58 AM on December 14, 2018: none

    Ran a quick gprof of bitcoin for a short period and I noticed this:

      %   cumulative   self              self     total           
     time   seconds   seconds    calls   s/call   s/call  name    
     37.21    573.47   573.47                             secp256k1_fe_mul_inner
     27.43    996.29   422.82                             secp256k1_fe_sqr_inner
      7.13   1106.24   109.95                             secp256k1_scalar_reduce_512
      4.89   1181.67    75.43                             secp256k1_der_parse_integer
      4.82   1255.90    74.23                             secp256k1_ge_set_xo_var
    

    I figured the low hanging fruit on the top 3 might be exhausted but secp256k1_der_parse_integer looks promising. I'm guessing because it has a bunch of branching? Thought I'd bring this up here if anyone has any ideas on optimizing this.

  2. real-or-random commented at 9:34 AM on December 14, 2018: contributor

    Which implementation of Bitcoin did you test? Bitcoin Core does not call secp256k1_der_parse_integer AFAIK because it used the code for lax DER parsing in contrib.

    Apart from that, if you think it's due to all the branching, then -fprofile-arcs and -fprofile-use are the first things that come to my mind.

  3. jb55 commented at 4:31 PM on December 14, 2018: none

    Are you sure? I'm not running any special build, just Bitcoin core master with gprof enabled. It looks like it's called from secp256k1_ecdsa_sig_parse ?

  4. real-or-random commented at 7:00 PM on December 14, 2018: contributor

    Yeah but disregarding tests and bechnmarks, secp256k1_ecdsa_sig_parse is called only from secp256k1_ecdsa_signature_parse_der, which is not called from Bitcoin Core? If you look at the code you see that that Core uses lax parsing and then its own function IsValidSignatureEncoding to check the encoding.

  5. jb55 commented at 8:15 PM on December 14, 2018: none

    I'll try to get a call graph from gprof to see what's going on

  6. jb55 commented at 3:46 PM on December 15, 2018: none

    Yeah this is weird, all I see is

                                                     <spontaneous>
    [12]     4.9   75.43    0.00                 secp256k1_der_parse_integer [12]
    

    which might make sense if secp256k1 isn't compiled with -pg ?

    Right now I'm going to assume the gprof output is somehow wrong.

    Here are the full gprof logs:

    https://jb55.com/s/bitcoin-weird-gprof.txt https://jb55.com/s/bitcoin-weird-gprof-graph.txt

  7. jb55 closed this on Dec 16, 2018

  8. jonasnick commented at 5:05 PM on December 18, 2018: contributor

    What was actually the issue?

  9. jb55 commented at 6:25 PM on December 20, 2018: none

    @jonasnick I never got around to figuring it out. For some reason gprof seemed to think it was sampling secp256k1_der_parse_integer while running bitcoin even though no code paths reach there. So it's either wrong or there is a memory corruption issue somewhere? The former is perhaps more likely?

  10. mrx23dot commented at 12:20 PM on April 10, 2022: none

    Maybe using the bad practice of putting functions in header files got gprof confused :D Here is my result for generating public keys from private ones. Does anyone have a suggestion on what configuration combination is the fastest?

      %   cumulative   self              self     total           
     time   seconds   seconds    calls  ms/call  ms/call  name    
     46.14      4.79     4.79 235910621     0.00     0.00  secp256k1_fe_mul_inner
     12.01      6.03     1.25 88835821     0.00     0.00  secp256k1_fe_sqr_inner
      9.06      6.97     0.94     1152     0.82     5.45  secp256k1_gej_add_ge_var
    
    
  11. real-or-random commented at 12:39 PM on April 10, 2022: contributor

    Maybe using the bad practice of putting functions in header files got gprof confused :D

    A C compiler doesn't have a notion of a header file, so I doubt that's the issue.

    Does anyone have a suggestion on what configuration combination is the fastest?

    It will depend on the compiler (flags) and your machine. I suggest running ./bench_ecmult. Check the first line ecmult_gen. Maybe we should add a benchmark for key generation.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-22 22:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me