Use a static constant table for small ecmult WINDOW_G sizes. #614

pull gmaxwell wants to merge 1 commits into bitcoin-core:master from gmaxwell:201905-staticg changing 2 files +98 −0
  1. gmaxwell commented at 1:03 am on May 10, 2019: contributor

    For discussion for now: this need to be reworked some to play nicely with #600.

    When WINDOW_G is small there is no reason to not pre-compute it: on almost all embedded devices flash is much ‘cheaper’ than ram, pre-computation can be burdensomely slow, and the code to do the precomputation might actually be larger than the table (I haven’t checked). We wouldn’t want to put a 1.3 MB table in the source/library but a 1024 byte one hardly seems like a concern.

    This is relevant to #603 @laanwj

  2. gmaxwell commented at 1:03 am on May 10, 2019: contributor
    I only tested this very minimally, so please don’t immediately dump it on an embedded device and waste 4 hours troubleshooting. Test on a desktop first. :P
  3. in src/ecmult_impl.h:389 in 5ef8822012 outdated
    384+    }
    385+    ctx->pre_g = (secp256k1_ge_storage (*)[ECMULT_TABLE_SIZE(WINDOW_G)])secp256k1_pre_g_static_context;
    386+#ifdef USE_ENDOMORPISM
    387+    ctx->pre_g_128 = (secp256k1_ge_storage (*)[ECMULT_TABLE_SIZE(WINDOW_G)])secp256k1_pre_g_128_static_context;
    388+#endif
    389+    return;
    


    laanwj commented at 5:56 am on May 10, 2019:
    This explicit return seems unnecessary as the rest of the #ifdef already covers the entire function.

    gmaxwell commented at 11:42 am on May 25, 2019:
    fixed
  4. in src/ecmult_impl.h:382 in 5ef8822012 outdated
    376@@ -302,6 +377,17 @@ static void secp256k1_ecmult_context_init(secp256k1_ecmult_context *ctx) {
    377 }
    378 
    379 static void secp256k1_ecmult_context_build(secp256k1_ecmult_context *ctx, const secp256k1_callback *cb) {
    380+#if ECMULT_WINDOW_SIZE <= ECMULT_STATIC_TABLE_MAX
    


    laanwj commented at 5:59 am on May 10, 2019:
    It’d be nice to have a convenience #define STATIC_WINDOW_G that can be tested instead of repeating this expression every time.

    gmaxwell commented at 11:42 am on May 25, 2019:
    Will do.
  5. laanwj commented at 6:02 am on May 10, 2019: member

    Concept ACK

    Tested that configure with out of range ecmult window value: 2, 25 fails, and the below values pass.

    Benchmarks (SiFive Unleashed RV64GC, no endomorphism, default 1Ghz clock speed, gcc 9.0.1, --with-bignum=gmp default):

     0--with-ecmult-window=3    ecdsa_verify: min 1649us / avg 1649us / max 1649us
     1--with-ecmult-window=4    ecdsa_verify: min 1570us / avg 1570us / max 1570us
     2--with-ecmult-window=5    ecdsa_verify: min 1517us / avg 1517us / max 1517us
     3--with-ecmult-window=6    ecdsa_verify: min 1479us / avg 1479us / max 1479us
     4--with-ecmult-window=7    ecdsa_verify: min 1449us / avg 1449us / max 1449us
     5--with-ecmult-window=8    ecdsa_verify: min 1427us / avg 1427us / max 1427us
     6--with-ecmult-window=9    ecdsa_verify: min 1410us / avg 1410us / max 1410us
     7--with-ecmult-window=10   ecdsa_verify: min 1395us / avg 1395us / max 1395us
     8--with-ecmult-window=11   ecdsa_verify: min 1382us / avg 1382us / max 1382us
     9--with-ecmult-window=12   ecdsa_verify: min 1373us / avg 1373us / max 1373us
    10--with-ecmult-window=13   ecdsa_verify: min 1364us / avg 1364us / max 1364us
    11--with-ecmult-window=14   ecdsa_verify: min 1357us / avg 1357us / max 1357us
    12--with-ecmult-window=15   ecdsa_verify: min 1351us / avg 1351us / max 1351us
    

    So from 3 to 15 is at most a 18% speed-up, where going to 6 already accounts for about half of that. Even with the lowest setting I’d say it’s fast enough for boot-loader use of verifying one or a few signatures.

    Another run with --with-bignum=no, which is more likely in embedded context to not pull in libgmp as a depenency:

     0--with-ecmult-window=3   ecdsa_verify: min 1780us / avg 1780us / max 1780us
     1--with-ecmult-window=4   ecdsa_verify: min 1699us / avg 1699us / max 1699us
     2--with-ecmult-window=5   ecdsa_verify: min 1645us / avg 1645us / max 1645us
     3--with-ecmult-window=6   ecdsa_verify: min 1609us / avg 1609us / max 1609us
     4--with-ecmult-window=7   ecdsa_verify: min 1585us / avg 1585us / max 1585us
     5--with-ecmult-window=8   ecdsa_verify: min 1564us / avg 1564us / max 1564us
     6--with-ecmult-window=9   ecdsa_verify: min 1545us / avg 1545us / max 1545us
     7--with-ecmult-window=10  ecdsa_verify: min 1531us / avg 1531us / max 1531us
     8--with-ecmult-window=11  ecdsa_verify: min 1519us / avg 1519us / max 1519us
     9--with-ecmult-window=12  ecdsa_verify: min 1510us / avg 1510us / max 1510us
    10--with-ecmult-window=13  ecdsa_verify: min 1500us / avg 1500us / max 1500us
    11--with-ecmult-window=14  ecdsa_verify: min 1494us / avg 1494us / max 1494us
    12--with-ecmult-window=15  ecdsa_verify: min 1487us / avg 1487us / max 1487us
    

    the code to do the precomputation might actually be larger than the table (I haven’t checked)

    This is just about true: the size of secp256k1_context_create (verify-only) goes from 496 to 124 with this patch, so this makes place for the 256 byte table for window size 4 (note: haven’t checked if any other functions go unused). It won’t be true for higher window sizes :smile:

  6. gmaxwell commented at 10:54 pm on May 11, 2019: contributor
    @apoelstra Thoughts on how this should interact with the single allocation change?
  7. real-or-random commented at 8:55 am on May 13, 2019: contributor

    Concept ACK

    @apoelstra Thoughts on how this should interact with the single allocation change?

    How is this related to scratch spaces?

  8. gmaxwell commented at 11:41 pm on May 13, 2019: contributor
    @real-or-random oops should have really directed at #566 just had ‘single allocation’ on the brain. When we do single allocation / no-allocation API’s we’ll need to leave the memory for this table out from the allocation. (which is also why I want to do this PR after merging those, as I don’t want to disrupt them.)
  9. real-or-random commented at 7:32 am on May 24, 2019: contributor
  10. gmaxwell commented at 11:40 am on May 25, 2019: contributor

    Rebased, fixed endomorphism support, ifdefed out secp256k1_ecmult_odd_multiples_table_storage_var.

    TODO: Update configure output, include a comment explaining how the constants are generated, and decide on how big to support. 6 is sounding pretty good based on llanwj’s data. Plus some assorted ifdef cleanup. @laanwj Any interest in getting the actual binary size without this PR, with it and size 2 and size 6? on RV?

  11. real-or-random commented at 12:11 pm on May 25, 2019: contributor

    include a comment explaining how the constants are generated

    It does not hurt but this style of just hardcoding the constants is somewhat inconsistent with the one by gen_context. I like the approach here more because it’s much simpler; gen_context creates a lot of complexity in the build system. Maybe we should also add a pre-generated file with gen_context constants to the repo in the future.

  12. gmaxwell commented at 10:24 pm on May 25, 2019: contributor

    TODO: Update configure output, include a comment explaining how the constants are generated, and decide on how big to support. Better handling for exhaustive tests?

    I’m unsure about gen_context the big reason that we don’t ship a static table is because it’s enormous (as in comparable to the size of the codebase enormous). This issue doesn’t really arise with window_g. We should probably change signing to use #546 and then we could have equivalent performance with a much smaller table, and then it might be less of a consideration.

  13. real-or-random commented at 8:15 pm on May 26, 2019: contributor

    I’m unsure about gen_context the big reason that we don’t ship a static table is because it’s enormous (as in comparable to the size of the codebase enormous).

    Yep okay, I didn’t see that point.

  14. Use a static constant table for small ecmult WINDOW_G sizes. 462a3351a1
  15. laanwj cross-referenced this on Jun 10, 2019 from issue missing symbols for `no_std` target by laanwj
  16. gmaxwell closed this on Jun 11, 2019

  17. real-or-random cross-referenced this on Jul 30, 2019 from issue Suggestions on adding precomputes for verification by xxuejie
  18. laanwj cross-referenced this on Aug 7, 2019 from issue build.rs: Add feature 'lowmemory' to reduce memory usage by laanwj
  19. gmaxwell cross-referenced this on Nov 14, 2019 from issue Signed-digit multi-comb for ecmult_gen (by peterdettman) by sipa
  20. gmaxwell cross-referenced this on Aug 26, 2020 from issue Tests on embedded platforms by real-or-random

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-24 21:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me