variable sized precomputed table for signing #337

pull douglasbakkum wants to merge 1 commits into bitcoin-core:master from douglasbakkum:master changing 8 files +91 −39
  1. douglasbakkum commented at 10:04 am on October 18, 2015: none

    This pull request gives an option to reduce the precomputed table size for the signing context (ctx) by setting #define ECMULT_GEN_PREC_BITS [N_BITS].

    Motivation: Per #251 and #254, the static table can be reduced to 64kB. However, this is still too big for some of my embedded applications. Setting #define ECMULT_GEN_PREC_BITS 2 produces a 32kB table at a tradeoff of about 75% of the signing speed. Not defining this value will default to the existing implementation of 4 bits. Statistics:

     0ECMULT_GEN_PREC_BITS = 1
     1Precomputed table size: 32kB
     2./bench_sign 
     3ecdsa_sign: min 195us / avg 200us / max 212us
     4
     5ECMULT_GEN_PREC_BITS = 2
     6Precomputed table size: 32kB
     7./bench_sign 
     8ecdsa_sign: min 119us / avg 126us / max 134us
     9
    10ECMULT_GEN_PREC_BITS = 4 (default)
    11Precomputed table size: 64kB
    12./bench_sign
    13ecdsa_sign: min 83.5us / avg 89.6us / max 95.3us
    14
    15ECMULT_GEN_PREC_BITS = 8
    16Precomputed table size: 512kB
    17./bench_sign 
    18ecdsa_sign: min 96.4us / avg 99.4us / max 104us
    

    Only values of 2 and 4 make sense. 8 bits causes a larger table size with no increase in speed. 1 bit runs, actually, but does not reduce table size and is slower than 2 bits.

  2. douglasbakkum cross-referenced this on Oct 18, 2015 from issue Low-footprint mode by gmaxwell
  3. douglasbakkum commented at 10:35 am on October 18, 2015: none
    See also comments about alternatively using wNAF: https://github.com/bitcoin/secp256k1/issues/189#issuecomment-149000292
  4. gmaxwell commented at 4:27 pm on October 18, 2015: contributor

    Reasonable enough.

    Can you help me understand your memory trade-off requirements? There is a way to halve the table size without largely changing the performance though it requires getting rid of the nums blinding. I’d been under the impression that with the table completely static most of the reasonable target embedded devices (e.g. arm m3s with 192k ram) would end up with the table in their relatively copious flash and there was no real need to reduce below 64k on them.

  5. douglasbakkum commented at 5:33 pm on October 18, 2015: none
    Sure. The current device has 256kB of flash and 64kB of RAM on a Cortex M4. The secp256k1 library (including the 64kB table) takes up about 140kB in total.
  6. jonasschnelli commented at 6:03 pm on October 18, 2015: contributor
    Concept ACK. I think hardware wallets and other use-cases on embedded devices don’t really seek for super-performance. But they like to use a reliable and well-testes library that fits on a chip with tiny ram and tiny flash. Removing precomputed tables sounds after a logical tradeoff in that territory.
  7. douglasbakkum cross-referenced this on Mar 22, 2016 from issue use secp ecc lib by default, with smaller precomputed table for signing by douglasbakkum
  8. ddustin commented at 7:44 pm on October 23, 2018: none
    Is there a reason this was never merged in?
  9. gmaxwell commented at 0:31 am on May 23, 2019: contributor
    @real-or-random This complements the WINDOW_G work, would you mind reviewing it?
  10. real-or-random commented at 8:57 am on May 23, 2019: contributor
    I’ll have a look. :)
  11. in src/ecmult_gen_impl.h:60 in abcaa73da5 outdated
    48@@ -49,39 +49,39 @@ static void secp256k1_ecmult_gen_context_build(secp256k1_ecmult_gen_context *ctx
    49 
    50     /* compute prec. */
    51     {
    52-        secp256k1_gej precj[1024]; /* Jacobian versions of prec. */
    53+        secp256k1_gej precj[ECMULT_GEN_PREC_N * ECMULT_GEN_PREC_G]; /* Jacobian versions of prec. */
    54         secp256k1_gej gbase;
    55         secp256k1_gej numsbase;
    56         gbase = gj; /* 16^j * G */
    


    real-or-random commented at 7:57 am on May 24, 2019:
    there are some comments that should be changed here and up to line 67

    douglasbakkum commented at 10:37 am on May 24, 2019:
    I updated the comments, assuming you meant to change, for example, 16 to PREC_G. Let me know if there is something in addition.
  12. real-or-random commented at 8:04 am on May 24, 2019: contributor
    @douglasbakkum I know this is very old. Are you still interested in this PR? If yes, it needs rebase. If not, we could adopt it.
  13. douglasbakkum commented at 9:29 am on May 24, 2019: none

    Awesome. Yes definitely interested. We have been using this in production BitBox’s since the PR was submitted.

    I will rebase.

  14. douglasbakkum closed this on May 24, 2019

  15. douglasbakkum force-pushed on May 24, 2019
  16. douglasbakkum reopened this on May 24, 2019

  17. douglasbakkum commented at 10:22 am on May 24, 2019: none
    @real-or-random The PR is now rebased. Let me know if you would like further changes.
  18. douglasbakkum force-pushed on May 24, 2019
  19. real-or-random commented at 11:52 am on May 24, 2019: contributor

    Thanks, looks good to me!

    What we need to do is make this and #596 consistent (for naming of constants, ./configure, etc.) I think the easiest way forward is to get #596 merged and then adapt this PR here. (I know this is also the easiest way for me as the author of #596 but I’m of course open to other suggestions too. :P) @gmaxwell What do you think? I’m not convinced that “window size” would be an appropriate term here, so maybe PREC_BITS makes more sense in general?

  20. gmaxwell commented at 10:12 am on May 25, 2019: contributor
    Sounds fine to me (both on the sequence and the naming)
  21. gmaxwell commented at 12:08 pm on May 29, 2019: contributor
    #596 is now merged and I don’t believe there is anything at risk of imminent merger which will cause difficult conflicts with this PR.
  22. douglasbakkum force-pushed on May 29, 2019
  23. gmaxwell commented at 8:47 pm on May 29, 2019: contributor
  24. real-or-random commented at 11:40 am on May 30, 2019: contributor
    Great, so if ECMULT_GEN_PREC_BITS is fine, then this constant should be ./configure-able as in #596. See this commit for reference: https://github.com/bitcoin-core/secp256k1/pull/596/commits/2842dc523e88ff8f01c16486a1601ee49f4e60ff @douglasbakkum Do you want to give it a try?
  25. douglasbakkum commented at 2:05 pm on May 31, 2019: none
    @real-or-random Yea, agree. If it is quick for you, would you mind doing so?
  26. vhnatyk referenced this in commit dbc9d46c7d on Jun 18, 2019
  27. vhnatyk cross-referenced this on Jun 18, 2019 from issue Implementing pre allocation context creation by elichai
  28. vhnatyk cross-referenced this on Jun 18, 2019 from issue Dev/vhnat/new master by vhnatyk
  29. vhnatyk commented at 12:19 pm on June 19, 2019: none
    Hi! any progress with this?
  30. douglasbakkum commented at 1:36 pm on June 21, 2019: none
    Sorry for the delays. My colleague @benma made a commit to make the value configurable.
  31. douglasbakkum force-pushed on Jun 21, 2019
  32. benma commented at 7:56 am on June 29, 2019: contributor
    @real-or-random did you have a chance to test the configuration?
  33. real-or-random commented at 9:33 am on June 29, 2019: contributor
    No, sorry, not yet. It’s on my long “GitHub saved notifications” list, I’ll have a look next week. :)
  34. in src/gen_context.c:56 in df3224d89a outdated
    45@@ -45,23 +46,23 @@ int main(int argc, char **argv) {
    46     fprintf(fp, "#define _SECP256K1_ECMULT_STATIC_CONTEXT_\n");
    47     fprintf(fp, "#include \"src/group.h\"\n");
    48     fprintf(fp, "#define SC SECP256K1_GE_STORAGE_CONST\n");
    49-    fprintf(fp, "static const secp256k1_ge_storage secp256k1_ecmult_static_context[64][16] = {\n");
    50+    fprintf(fp, "static const secp256k1_ge_storage secp256k1_ecmult_static_context[ECMULT_GEN_PREC_N][ECMULT_GEN_PREC_G] = {\n");
    


    real-or-random commented at 3:10 pm on July 3, 2019:

    This is somewhat weird if you change the config in between but for some reason gen_context is not run again.

    So I think the generated ecmult_static_context.h should have the gen_context.c compile-time values of ECMULT_GEN_PREC_N and ECMULT_GEN_PREC_G hardcoded and contain a precompiler assertion (#error) that verifies they’re identical to the ecmult_static_context.h-compile-time values. This should make sure that we never compile with a ecmult_static_context.h generated with a different configuration. (Alternatively, we could make the file name of ecmult_static_context.h make dependent on the configured values, e.g., ecmult_static_context_64_16.h or something.)

    But maybe I make a mistake here, it’s easy to get these things wrong. So correct me if you think I’m wrong.


    benma commented at 8:58 am on July 4, 2019:

    That’s a great catch. Normally gen_context is run with make if the configuration changed, but it might not be absolutely failsafe (e.g. by manually overwriting the file with a different version etc.).

    Change incoming.


    real-or-random commented at 8:26 am on July 15, 2019:
    Thanks, looks good! Can you add a sentence like “Try deleting ecmult_static_context.h before the build.” or similar?
  35. vhnatyk commented at 5:42 am on July 12, 2019: none
    Seems we are so close 😄, sorry for spamming - what else needs to be done?
  36. real-or-random commented at 12:45 pm on July 12, 2019: contributor
    Ah I haven’t seen the changes. I’ll have a look next week!
  37. in configure.ac:171 in 7abe39b298 outdated
    164@@ -165,6 +165,14 @@ AC_ARG_WITH([ecmult-window], [AS_HELP_STRING([--with-ecmult-window=SIZE|auto],
    165 )],
    166 [req_ecmult_window=$withval], [req_ecmult_window=auto])
    167 
    168+AC_ARG_WITH([ecmult-gen-precision], [AS_HELP_STRING([--with-ecmult-gen-precision=2|4|8|auto],
    169+[Precision bits to tune the precomputed table size for signing.]
    170+[The size of the table is 32kB for 2 bits, 64kB for 4 bits, 512kB for 8 bits of precision.]
    171+[A smaller table size usually results in slower signing.]
    


    real-or-random commented at 8:23 am on July 15, 2019:

    nit: maybe phrase it the around way around: “A larger table size usually results in possible faster signing.”

    (to be consistent with “Larger values result in possibly better performance” for the other config entry)


    benma commented at 10:14 am on July 15, 2019:

    With ./configure --disable-openssl-tests --with-ecmult-gen-precision=8 (but not 2 and 4) and valgrind ./exhaustive_tests I get millions of warnings like

    There is no issue if you call it like this:

    valgrind --max-stackframe=2097912 ./exhaustive_tests

    Reference:

    -max-stackframe= [default: 2000000] The maximum size of a stack frame. If the stack pointer moves by more than this amount then Valgrind will assume that the program is switching to a different stack. You may need to use this option if your program has large stack-allocated arrays.

    Seems like the big option just needs more stack space. It’s a valgrind heuristic only and the tests run fine otherwise.

    Will add a note to the README.


    real-or-random commented at 10:39 am on July 15, 2019:
    Oh indeed, thanks!
  38. in Makefile.am:175 in 7abe39b298 outdated
    171@@ -172,7 +172,7 @@ src/ecmult_static_context.h: $(gen_context_BIN)
    172 CLEANFILES = $(gen_context_BIN) src/ecmult_static_context.h $(JAVAROOT)/$(JAVAORG)/*.class .stamp-java
    173 endif
    174 
    175-EXTRA_DIST = autogen.sh src/gen_context.c src/basic-config.h $(JAVA_FILES)
    176+EXTRA_DIST = autogen.sh src/gen_context.c src/libsecp256k1-config.h src/basic-config.h $(JAVA_FILES)
    


    real-or-random commented at 8:47 am on July 15, 2019:
    I don’t think we want to distribute src/libsecp256k1-config.h. It will be generated by configure.

    benma commented at 10:23 am on July 15, 2019:

    Can you add a sentence like “Try deleting ecmult_static_context.h before the build.” or similar?

    Done.

    I don’t think we want to distribute src/libsecp256k1-config.h. It will be generated by configure.

    That was a misguided attempt at having gen_context rebuild the static context upon configuration change. It seems to work now anyway, so I removed it again.


    real-or-random commented at 11:49 am on July 15, 2019:

    Oh hm, apparently make distcheck fails here: https://travis-ci.org/bitcoin-core/secp256k1/jobs/558864214#L847

    On a second thought, I’m not sure that including src/libsecp256k1-config.h is a good idea. There are downstream users of this library (e.g., https://github.com/rust-bitcoin/rust-secp256k1) that don’t use the autotools. But I’m not sure what the best thing is. Maybe including it only if ECMULT_GEN_PREC_BITS is not defined. Sounds pragmatic to me.


    benma commented at 3:24 pm on July 15, 2019:
    fixed
  39. real-or-random commented at 9:41 am on July 15, 2019: contributor

    With ./configure --disable-openssl-tests --with-ecmult-gen-precision=8 (but not 2 and 4) and valgrind ./exhaustive_tests I get millions of warnings like

     0==6831== Invalid read of size 16
     1==6831==    at 0x1111AB: secp256k1_fe_add (field_5x52_impl.h:405)
     2==6831==    by 0x1111AB: secp256k1_gej_double_var.part.0.constprop.0 (group_impl.h:358)
     3==6831==    by 0x1154ED: secp256k1_gej_double_var (group_impl.h:326)
     4==6831==    by 0x1154ED: secp256k1_ecmult_gen_context_build (ecmult_gen_impl.h:83)
     5==6831==    by 0x11A55E: secp256k1_context_preallocated_create (secp256k1.c:130)
     6==6831==    by 0x11A55E: secp256k1_context_create (secp256k1.c:142)
     7==6831==    by 0x109130: main (tests_exhaustive.c:460)
     8==6831==  Address 0x1ffedfe740 is on thread 1's stack
     9==6831==  in frame [#1](/bitcoin-core-secp256k1/1/), created by secp256k1_ecmult_gen_context_build (group_impl.h:29)
    10==6831==
    11==6831== Invalid read of size 16
    12==6831==    at 0x1111B0: secp256k1_fe_add (field_5x52_impl.h:405)
    13==6831==    by 0x1111B0: secp256k1_gej_double_var.part.0.constprop.0 (group_impl.h:358)
    14==6831==    by 0x1154ED: secp256k1_gej_double_var (group_impl.h:326)
    15==6831==    by 0x1154ED: secp256k1_ecmult_gen_context_build (ecmult_gen_impl.h:83)
    16==6831==    by 0x11A55E: secp256k1_context_preallocated_create (secp256k1.c:130)
    17==6831==    by 0x11A55E: secp256k1_context_create (secp256k1.c:142)
    18==6831==    by 0x109130: main (tests_exhaustive.c:460)
    19==6831==  Address 0x1ffedfe730 is on thread 1's stack
    20==6831==  in frame [#1](/bitcoin-core-secp256k1/1/), created by secp256k1_ecmult_gen_context_build (group_impl.h:29)
    21==6831==
    22==6831== Invalid write of size 4
    23==6831==    at 0x1111B5: secp256k1_fe_add (field_5x52_impl.h:412)
    24==6831==    by 0x1111B5: secp256k1_gej_double_var.part.0.constprop.0 (group_impl.h:358)
    25==6831==    by 0x1154ED: secp256k1_gej_double_var (group_impl.h:326)
    26==6831==    by 0x1154ED: secp256k1_ecmult_gen_context_build (ecmult_gen_impl.h:83)
    27==6831==    by 0x11A55E: secp256k1_context_preallocated_create (secp256k1.c:130)
    28==6831==    by 0x11A55E: secp256k1_context_create (secp256k1.c:142)
    29==6831==    by 0x109130: main (tests_exhaustive.c:460)
    30==6831==  Address 0x1ffedfe75c is on thread 1's stack
    31==6831==  in frame [#1](/bitcoin-core-secp256k1/1/), created by secp256k1_ecmult_gen_context_build (group_impl.h:29)
    32==6831==
    33==6831== Invalid write of size 8
    34==6831==    at 0x1111C8: secp256k1_fe_add (field_5x52_impl.h:405)
    35==6831==    by 0x1111C8: secp256k1_gej_double_var.part.0.constprop.0 (group_impl.h:358)
    36==6831==    by 0x1154ED: secp256k1_gej_double_var (group_impl.h:326)
    37==6831==    by 0x1154ED: secp256k1_ecmult_gen_context_build (ecmult_gen_impl.h:83)
    38==6831==    by 0x11A55E: secp256k1_context_preallocated_create (secp256k1.c:130)
    39==6831==    by 0x11A55E: secp256k1_context_create (secp256k1.c:142)
    40==6831==    by 0x109130: main (tests_exhaustive.c:460)
    41==6831==  Address 0x1ffedfe740 is on thread 1's stack
    42==6831==  in frame [#1](/bitcoin-core-secp256k1/1/), created by secp256k1_ecmult_gen_context_build (group_impl.h:29)
    

    I haven’t looked at this further.

  40. in README.md:70 in deec07d51e outdated
    65+
    66+    $ ./exhaustive_tests
    67+
    68+With valgrind, you might need to increase the max stack size:
    69+
    70+    $ valgrind --max-stackframe=2097912 ./exhaustive_tests
    


    real-or-random commented at 11:34 am on July 15, 2019:
    I need slightly larger values…. Maybe recommend 2500000?

    benma commented at 2:05 pm on July 15, 2019:
    Done
  41. in src/gen_context.c:7 in deec07d51e outdated
    3@@ -4,9 +4,10 @@
    4  * file COPYING or http://www.opensource.org/licenses/mit-license.php.*
    5  **********************************************************************/
    6 
    7+#include "libsecp256k1-config.h" // for ECMULT_GEN_PREC_BITS
    


    real-or-random commented at 11:36 am on July 15, 2019:

    It’s not crucial but we should make sure that this warning disappears: /configure --disable-openssl-tests --with-ecmult-gen-precision=8 --enable-experimental --enable-module-recovery --enable-module-ecdh --with-ecmult-window=22 CC=clang

    0In file included from src/gen_context.c:9:
    1src/basic-config.h:33: warning: "ECMULT_WINDOW_SIZE" redefined
    2   33 | #define ECMULT_WINDOW_SIZE 15
    3      |
    4In file included from src/gen_context.c:7:
    5src/libsecp256k1-config.h:18: note: this is the location of the previous definition
    6   18 | #define ECMULT_WINDOW_SIZE 22
    

    I guess other compilers warn, too.

    In the end it does not matter what ECMULT_WINDOW_SIZE is for use in gen_context.c. I think sticking to 15 is reasonable (but the fact that we ignore the config setting here should be more prominent in the source).


    real-or-random commented at 12:07 pm on July 15, 2019:

    In the end it does not matter what ECMULT_WINDOW_SIZE is for use in gen_context.c. I think sticking to 15 is reasonable (but the fact that we ignore the config setting here should be more prominent in the source).

    Actually it does not matter too much because gen_context does not use ECMULT_WINDOW_SIZE at all.


    benma commented at 2:05 pm on July 15, 2019:
    Fixed via #undef ECMULT_WINDOW_SIZE in basic-config.h, like with the other definitions.

    vhnatyk commented at 10:48 pm on July 22, 2019:
    @real-or-random sorry to bug, seems ok now? 😊

    benma commented at 10:52 am on September 4, 2019:
    Ping @real-or-random - seems we are really close now :P
  42. real-or-random commented at 11:40 am on July 15, 2019: contributor
    Sorry for bringing up more and more stuff. We should be really close now. :)
  43. real-or-random commented at 5:31 pm on September 4, 2019: contributor

    Yes, we’re close. :)

    ACK 1d26b27ac90092306bfbc9cdd5123e8a5035202a (mod squashing) I read the changes, and tested them both with and without static precomputation, with different precision values and also with valgrind.

    Can you squash some of the commits?

  44. real-or-random requested review from sipa on Sep 4, 2019
  45. real-or-random commented at 5:44 pm on September 4, 2019: contributor
    @sipa Can you have look at this? I think you know the algorithm best.
  46. sipa commented at 6:29 pm on September 4, 2019: contributor

    Code review ACK 1d26b27ac90092306bfbc9cdd5123e8a5035202a. Looks great; all comments I wanted to make were addressed in later commits.

    Please squash the fixup commits.

  47. sipa approved
  48. elichai commented at 6:46 pm on September 4, 2019: contributor
    Maybe better docs to differentiate the different between ecmult_static_precomputation,ecmult-window,ecmult-gen-precision?
  49. benma commented at 9:36 pm on September 4, 2019: contributor
    @elichai okay to do this in a different PR? Not strictly related.
  50. real-or-random commented at 5:34 am on September 5, 2019: contributor
    I think this can happen in a different PR.
  51. elichai commented at 5:37 am on September 5, 2019: contributor
    Sure, just getting to a lot of tables that it’s getting confusing which table will affect what.
  52. variable signing precompute table
    make ECMULT_GEN_PREC_BITS configurable
    
    ecmult_static_context.h: add compile time config assertion (#3) - Prevents accidentally using a file which was generated with a
    different configuration.
    
    README: mention valgrind issue
    
    With --with-ecmult-gen-precision=8, valgrind needs a max stack size
    adjustment to not run into a stack switching heuristic:
    
    http://valgrind.org/docs/manual/manual-core.html
    
    > -max-stackframe= [default: 2000000]
    > The maximum size of a stack frame. If the stack pointer moves by more than this amount then Valgrind will assume that the program is switching to a different stack.
    You may need to use this option if your program has large stack-allocated arrays.
    
    basic-config: undef ECMULT_WINDOW_SIZE before (re-)defining it
    dcb2e3b3ff
  53. douglasbakkum force-pushed on Sep 5, 2019
  54. douglasbakkum commented at 7:24 am on September 5, 2019: none
    Thanks all. The commits are now squashed and rebased.
  55. real-or-random approved
  56. real-or-random commented at 8:45 am on September 5, 2019: contributor
    ACK dcb2e3b3fff0b287d576842aabe5c79f2fe4df30 verified that all changes to the previous ACKed 1d26b27ac90092306bfbc9cdd5123e8a5035202a were due to the rebase
  57. jonasnick approved
  58. jonasnick commented at 12:20 pm on September 5, 2019: contributor
    ACK dcb2e3b3fff0b287d576842aabe5c79f2fe4df30 read the code and tested various configurations with valgrind
  59. real-or-random referenced this in commit 96cd94e385 on Sep 5, 2019
  60. real-or-random merged this on Sep 5, 2019
  61. real-or-random closed this on Sep 5, 2019

  62. vhnatyk commented at 7:30 pm on September 5, 2019: none
    So happy this got merged finally!) Hopefully I can start getting rid of black magik in my code now))
  63. benma cross-referenced this on Oct 8, 2019 from issue rebase on bitcoin-core/secp256k1 by benma
  64. vhnatyk cross-referenced this on Oct 19, 2019 from issue Clarify RAM footprint on MCUs by vhnatyk
  65. gmaxwell cross-referenced this on Nov 9, 2019 from issue Don't put an absurd amount of data onto the stack in some configs by gmaxwell
  66. sipa cross-referenced this on Jun 9, 2020 from issue Update libsecp256k1 subtree by sipa
  67. fanquake referenced this in commit 8c97780db8 on Jun 13, 2020
  68. sidhujag referenced this in commit 8a3a072968 on Jun 13, 2020
  69. ComputerCraftr referenced this in commit b98f1c6e6c on Jun 16, 2020
  70. UdjinM6 referenced this in commit 9d36ba6570 on Aug 10, 2021
  71. 5tefan referenced this in commit 8ded2caa74 on Aug 12, 2021
  72. gades referenced this in commit d855cc511d on May 8, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-24 22:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me