Function to compute optimal ecmult_multi scratch size for a number of points #638

pull jonasnick wants to merge 6 commits into bitcoin-core:master from jonasnick:ecmult-scratch changing 5 files +333 −77
  1. jonasnick commented at 9:34 pm on June 12, 2019: contributor

    @DavidBurkett requested to allow computing the optimal scratch size for Schnorr batch verification (https://github.com/ElementsProject/secp256k1-zkp/issues/69). This PR is a prerequisite but also contains a bunch of other fixups.

    Other than adding the new function this PR refactors scratch space handling in ecmult_impl to improve code quality, tests and documentation.

    The biggest part of this PR is to make computing the scratch size and its inverse more precise by not assuming maximum padding when aligning, but rather using the actual padding. This is not strictly necessary but removes a leaky abstraction and makes testing easier.

  2. jonasnick force-pushed on Jun 13, 2019
  3. in src/tests.c:3156 in dd63ca35cc outdated
    3093@@ -3094,7 +3094,7 @@ void test_ecmult_multi_batching(void) {
    3094     secp256k1_scratch_destroy(&ctx->error_callback, scratch);
    3095 
    3096     for(i = 1; i <= n_points; i++) {
    3097-        if (i > ECMULT_PIPPENGER_THRESHOLD) {
    3098+        if (i >= ECMULT_PIPPENGER_THRESHOLD) {
    


    real-or-random commented at 12:44 pm on June 13, 2019:
    ACK
  4. in src/ecmult_impl.h:1027 in 6596bf5ce9 outdated
    1022@@ -1023,8 +1023,8 @@ static int secp256k1_ecmult_pippenger_batch(const secp256k1_callback* error_call
    1023     }
    1024 
    1025     state_space->ps = (struct secp256k1_pippenger_point_state *) secp256k1_scratch_alloc(error_callback, scratch, entries * sizeof(*state_space->ps));
    1026-    state_space->wnaf_na = (int *) secp256k1_scratch_alloc(error_callback, scratch, entries*(WNAF_SIZE(bucket_window+1)) * sizeof(int));
    1027-    buckets = (secp256k1_gej *) secp256k1_scratch_alloc(error_callback, scratch, (1<<bucket_window) * sizeof(*buckets));
    1028+    state_space->wnaf_na = (int *) secp256k1_scratch_alloc(error_callback, scratch, entries * WNAF_SIZE(bucket_window+1) * sizeof(int));
    1029+    buckets = (secp256k1_gej *) secp256k1_scratch_alloc(error_callback, scratch, sizeof(*buckets) << bucket_window);
    


    real-or-random commented at 12:44 pm on June 13, 2019:
    ACK
  5. in src/util.h:143 in ba5fd7a9db outdated
    92@@ -93,7 +93,7 @@ static SECP256K1_INLINE void *checked_realloc(const secp256k1_callback* cb, void
    93 #define ALIGNMENT 16
    94 #endif
    95 
    96-#define ROUND_TO_ALIGN(size) (((size + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)
    97+#define ROUND_TO_ALIGN(size) ((((size) + ALIGNMENT - 1) / ALIGNMENT) * ALIGNMENT)
    


    real-or-random commented at 12:44 pm on June 13, 2019:
    ACK
  6. in src/ecmult_impl.h:1002 in e774a6eea8 outdated
    1278+    if (n_points >= ECMULT_PIPPENGER_THRESHOLD) {
    1279+        int bucket_window = secp256k1_pippenger_bucket_window(n_points);
    1280+        return secp256k1_pippenger_scratch_size(n_points, bucket_window);
    1281+    } else {
    1282+        return secp256k1_strauss_scratch_size(n_points);
    1283+    }
    


    real-or-random commented at 12:49 pm on June 13, 2019:
    Approach ACK

    jonasnick commented at 1:17 pm on June 13, 2019:
    What’s an approach ACK?

    jonasnick commented at 1:24 pm on June 13, 2019:
    Oh, I guess it’s a concept ack. I was confused by other meanings of approach :D

    real-or-random commented at 1:39 pm on June 13, 2019:

    I’m actively testing https://github.com/bitcoin/bitcoin/pull/16149 here

    edit: except that I just write “ACK”. All my “ACK"s in this review mean “ACK thorough code inspection”


    robot-dreams commented at 0:12 am on November 30, 2021:

    (No action needed) Responding to this 2 year old comment 😅

    In Core, it looks like “Approach ACK” means “Concept ACK, and I agree with the approach of this change (but I haven’t reviewed the code in detail)”:

    https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#conceptual-review

  7. in src/ecmult_impl.h:1097 in 83f5f3d15d outdated
    1006+    }
    1007+    (*state_space)->ps = (struct secp256k1_pippenger_point_state *) secp256k1_scratch_alloc(error_callback, scratch, entries * sizeof(struct secp256k1_pippenger_point_state));
    1008+    (*state_space)->wnaf_na = (int *) secp256k1_scratch_alloc(error_callback, scratch, entries * WNAF_SIZE(bucket_window+1) * sizeof(int));
    1009+    *buckets = (secp256k1_gej *) secp256k1_scratch_alloc(error_callback, scratch, sizeof(secp256k1_gej) << bucket_window);
    1010+    if ((*state_space)->ps == NULL || (*state_space)->wnaf_na == NULL || *buckets == NULL) {
    1011+        secp256k1_scratch_apply_checkpoint(error_callback, scratch, scratch_checkpoint);
    


    real-or-random commented at 12:55 pm on June 13, 2019:
    Approach ACK I think it’s cleaner to apply the checkpoint outside this function.

    jonasnick commented at 7:55 pm on June 16, 2019:
    fixed
  8. in src/ecmult_impl.h:654 in f05bce8e6e outdated
    652-    static const size_t point_size = (sizeof(secp256k1_ge) + sizeof(secp256k1_gej) + sizeof(secp256k1_fe)) * ECMULT_TABLE_SIZE(WINDOW_A) + sizeof(struct secp256k1_strauss_point_state) + sizeof(secp256k1_gej) + sizeof(secp256k1_scalar);
    653+    size += ROUND_TO_ALIGN(n_points * ECMULT_TABLE_SIZE(WINDOW_A) * sizeof(secp256k1_ge));
    654 #endif
    655-    return n_points*point_size;
    656+    size += ROUND_TO_ALIGN(n_points * sizeof(struct secp256k1_strauss_point_state));
    657+    return size;
    


    real-or-random commented at 1:03 pm on June 13, 2019:
    Hm, I’m somewhat unsure about this. It seems like a layer violation to care about the alignment here.

    jonasnick commented at 1:41 pm on June 13, 2019:
    If we want to have a function that returns the required scratch space given a number of points (and we should) then we need to add the padding somewhere. While we can assume the worst case padding somewhere else I would prefer to have the *_scratch_space function return exact results. This makes it much easier to think about and also helps testing because now we can just check that what is allocated actually matches what we computed with *_scratch_space (see https://github.com/bitcoin-core/secp256k1/pull/638/commits/24553bfa1c34bcbe4820a783d66649b0a18affb7#diff-4655d106bf03045a3a50beefc800db21R2996). Or do you have an alternative in mind?

    jonasnick commented at 7:50 pm on June 16, 2019:
    Added function alloc_size to scratch space
  9. in src/ecmult_impl.h:707 in f05bce8e6e outdated
    701@@ -696,7 +702,9 @@ static int secp256k1_ecmult_strauss_batch_single(const secp256k1_callback* error
    702 }
    703 
    704 static size_t secp256k1_strauss_max_points(const secp256k1_callback* error_callback, secp256k1_scratch *scratch) {
    705-    return secp256k1_scratch_max_allocation(error_callback, scratch, STRAUSS_SCRATCH_OBJECTS) / secp256k1_strauss_scratch_size(1);
    706+    /* Call max_allocation with 0 objects because we've already accounted for
    707+     * alignment in strauss_scratch_size. */
    708+    return secp256k1_scratch_max_allocation(error_callback, scratch, 0) / secp256k1_strauss_scratch_size(1);
    


    real-or-random commented at 1:09 pm on June 13, 2019:

    … and wasn’t the previous version (without changes in strauss_scratch_size) more precise? We need to round up to the alignment only once per array (e.g., once for the scalars array).

    In the proposed revision, I think we overestimate the required padding a lot because we still call strauss_scratch_size(1) here but this has padding now.


    jonasnick commented at 1:35 pm on June 13, 2019:
    This is correct - sorry I overlooked this. Will add fix and test.

    jonasnick commented at 7:49 pm on June 16, 2019:
    fixed
  10. real-or-random commented at 1:10 pm on June 13, 2019: contributor
    (My review is best viewed commit by commit.)
  11. in src/ecmult_impl.h:876 in e774a6eea8 outdated
    1161+              && space_for_points < secp256k1_pippenger_scratch_size_points(n_points, bucket_window, 1)) {
    1162+            /* If there's not enough space after alignment is taken into
    1163+             * account it suffices to decrease n_points by one. This is because
    1164+             * the maximum padding required is less than an entry. */
    1165+            n_points -= 1;
    1166+            VERIFY_CHECK(space_for_points >= secp256k1_pippenger_scratch_size_points(n_points, bucket_window, 1));
    


    real-or-random commented at 4:25 pm on June 13, 2019:
    After some discussion with @jonasnick, one of the things I’m not sure about is the added complexity in this function. On the one hand, this function is accurate now and users of the function can rely on that. On the other hand, if we just call secp256k1_scratch_max_allocation with PIPPENGER_SCRATCH_OBJECTS instead of 0, we may return a value that is one too small. That’s not terrible for performance but it potentially makes the function a little bit harder to use and test because you may need to remember that it is not accurate.

    jonasnick commented at 9:29 pm on June 15, 2019:

    It seems like both hands are arguments in favor of the change (i.e. calling secp256k1_scratch_max_allocation with 0).

    We need to do that for strauss anyway because otherwise

    0n_points == strauss_max_points(..., scratch_create(strauss_scratch_size(n_points)))
    

    wouldn’t hold.

  12. jonasnick force-pushed on Jun 16, 2019
  13. jonasnick force-pushed on Jun 16, 2019
  14. jonasnick commented at 7:49 pm on June 16, 2019: contributor

    I made a couple of changes and in order to avoid adding code that is deleted in later commits I force pushed, sorry. Summary of the changes:

    • added function alloc_size to scratch space to compute actual size allocated for a given number of objects
    • fixed bug in strauss_max_points (thanks @real-or-random) that vastly underestimated the number of points actually fitting into the scratch space. Also added test which would have caught this issue.
    • added a verify check to ensure that the space required for a single point/entry is smaller than the worst case padding
  15. jonasnick force-pushed on Jun 16, 2019
  16. in src/ecmult_impl.h:857 in c7d8c9ce78 outdated
    1181         }
    1182-        space_for_points = max_alloc - space_overhead;
    1183+        space_for_points = max_alloc - space_constant;
    1184 
    1185-        n_points = space_for_points/entry_size;
    1186+        n_points = (space_for_points - entry_size)/entry_size;
    


    sipa commented at 7:48 pm on July 23, 2019:
    Is this right? It’s equivalent to space_for_points / entry_size - 1.

    jonasnick commented at 9:19 am on July 25, 2019:
    It is right. Simplified the line according to your suggestion.

    robot-dreams commented at 2:39 am on November 29, 2021:
    Is this assignment to n_points (along with the comment) redundant with above?
  17. sipa commented at 7:55 pm on July 23, 2019: contributor
    Concept ACK, I still need to go over the logic changes.
  18. real-or-random commented at 12:18 pm on April 7, 2021: contributor
    needs rebase
  19. jonasnick cross-referenced this on Oct 25, 2021 from issue Replace MuSig(1) module with MuSig2 by jonasnick
  20. jonasnick force-pushed on Nov 6, 2021
  21. jonasnick commented at 8:48 pm on November 6, 2021: contributor

    Rebased and polished quite a bit. Also added fix for bug in master that we noticed before iirc. So to make sure it gets in I opened #1004.

    Still, I didn’t fully try to understand how this PR works. Also, it seems like ecmult_multi_scratch_size doesn’t give the exact optimal result. That’s because a scratch space of size pippenger_scratch_size(n_points, bucket_window) it may happen that strauss_max_points(error_callback, scratch), n) (the actual batch size) is smaller than n_points.

  22. jonasnick force-pushed on Nov 6, 2021
  23. in src/scratch_impl.h:104 in f162912502 outdated
     95@@ -96,4 +96,14 @@ static void *secp256k1_scratch_alloc(const secp256k1_callback* error_callback, s
     96     return ret;
     97 }
     98 
     99+static size_t secp256k1_scratch_alloc_size(size_t *sizes, size_t n_sizes) {
    100+    size_t i;
    101+    size_t sum = 0;
    102+
    103+    for (i = 0; i < n_sizes; i++) {
    104+        sum += ROUND_TO_ALIGN(sizes[i]);
    


    robot-dreams commented at 1:40 am on November 29, 2021:

    Nit: The existing secp256k1_scratch_max_allocation seems very careful about checking for overflow. For consistency, is it necessary to do the same here? For example:

    0// Check for overflow
    1if (sum + ROUND_TO_ALIGN(sizes[i]) < sum) {
    2    return 0;
    3}
    

    jonasnick commented at 5:11 pm on January 30, 2022:
    I think I fixed this. Also added test.
  24. in src/ecmult_impl.h:698 in 63811c4680 outdated
    693+ * the sizes of the individual parts. Otherwise it will return the size that is
    694+ * actually allocated with secp256k1_scratch_alloc which is greater or equal.
    695+ */
    696+static size_t secp256k1_pippenger_scratch_size_points(size_t n_points, int bucket_window, int as_allocated) {
    697+    size_t entries = secp256k1_pippenger_entries(n_points);
    698+    /* 4 objects are accounted for in pippenger_scratch_size_constant */
    


    robot-dreams commented at 2:00 am on November 29, 2021:
    Nit: 2 objects?

    jonasnick commented at 5:11 pm on January 30, 2022:
    fixed
  25. in src/ecmult_impl.h:838 in 1e81895753 outdated
    819@@ -801,13 +820,17 @@ static int secp256k1_ecmult_pippenger_batch_single(const secp256k1_callback* err
    820     return secp256k1_ecmult_pippenger_batch(error_callback, scratch, r, inp_g_sc, cb, cbdata, n, 0);
    821 }
    822 
    823-/**
    824- * Returns the maximum number of points in addition to G that can be used with
    825- * a given scratch space. The function ensures that fewer points may also be
    826- * used.
    827+/* Returns the (near) maximum number of points in addition to G that can be
    


    robot-dreams commented at 2:43 am on November 29, 2021:

    Do you already know how it might fail to be a maximum? (No worries if not, I still want to revisit these details carefully.)

    Edit: Could this fail to be a maximum because the constant space used by the buckets decreases when you jump to the next bucket window size?

  26. robot-dreams commented at 2:53 am on November 29, 2021: contributor
    Concept ACK
  27. in src/ecmult_impl.h:777 in 63811c4680 outdated
    742     const size_t scratch_checkpoint = secp256k1_scratch_checkpoint(error_callback, scratch);
    743     /* Use 2(n+1) with the endomorphism, when calculating batch
    744      * sizes. The reason for +1 is that we add the G scalar to the list of
    745      * other scalars. */
    746-    size_t entries = 2*n_points + 2;
    747+    size_t entries = secp256k1_pippenger_entries(n_points);
    


    robot-dreams commented at 11:12 pm on November 29, 2021:
    Nit: Should the comment go above the definition of the function secp256k1_pippenger_entries instead?
  28. in src/ecmult_impl.h:861 in 63811c4680 outdated
    823 
    824-        entry_size = 2*entry_size;
    825-        space_overhead = (sizeof(secp256k1_gej) << bucket_window) + entry_size + sizeof(struct secp256k1_pippenger_state);
    826-        if (space_overhead > max_alloc) {
    827+        space_constant = secp256k1_pippenger_scratch_size_constant(bucket_window);
    828+        if (space_constant + entry_size > max_alloc) {
    


    robot-dreams commented at 11:20 pm on November 29, 2021:
    Style nit (feel free to ignore): Was this check previously here for avoiding underflow (rather than short-circuiting)? If so would it make sense to keep the check as space_constant > max_alloc to make the intent clear?
  29. in src/tests.c:4345 in 1e81895753 outdated
    4098@@ -4098,6 +4099,20 @@ void test_ecmult_multi_strauss_scratch_size(void) {
    4099     }
    4100 }
    4101 
    4102+/* Spot check that any scratch space is large enough to fit
    4103+ * `strauss_max_points(scratch)` many points. */
    4104+void test_ecmult_multi_strauss_max_points(void) {
    4105+    size_t scratch_size = secp256k1_strauss_scratch_size_raw(1, 0);
    4106+    size_t max_scratch_size = secp256k1_strauss_scratch_size_raw(1, 1) + 1;
    4107+    for (; scratch_size < max_scratch_size; scratch_size++) {
    


    robot-dreams commented at 0:05 am on November 30, 2021:
    Would it make sense to check bigger scratch_size (so that n_points is bigger too), but increase the amount scratch_size is incremented on each iteration?
  30. in src/tests.c:4349 in 1e81895753 outdated
    4106+    size_t max_scratch_size = secp256k1_strauss_scratch_size_raw(1, 1) + 1;
    4107+    for (; scratch_size < max_scratch_size; scratch_size++) {
    4108+        secp256k1_scratch *scratch = secp256k1_scratch_create(&ctx->error_callback, scratch_size);
    4109+        size_t n_points = secp256k1_strauss_max_points(&ctx->error_callback, scratch);
    4110+        CHECK(secp256k1_scratch_max_allocation(&ctx->error_callback, scratch, 0) == scratch_size);
    4111+        CHECK(scratch_size >= secp256k1_strauss_scratch_size(n_points));
    


    robot-dreams commented at 0:06 am on November 30, 2021:

    Would it make sense to also check that the result is exact, e.g. by adding a check like this:

    0CHECK(scratch_size < secp256k1_strauss_scratch_size(n_points + 1));
    
  31. in src/ecmult_impl.h:858 in e54c1aff71 outdated
    845         size_t space_for_points;
    846-        size_t space_overhead;
    847-        size_t entry_size = sizeof(secp256k1_ge) + sizeof(secp256k1_scalar) + sizeof(struct secp256k1_pippenger_point_state) + (WNAF_SIZE(bucket_window+1)+1)*sizeof(int);
    848+        size_t space_constant;
    849+        /* Compute entry size without taking alignment into account */
    850+        size_t entry_size = secp256k1_pippenger_scratch_size_points(0, bucket_window, 0);
    


    robot-dreams commented at 1:18 am on November 30, 2021:
    Style nit (feel free to ignore): Should this be called point_size instead to avoid confusion, since in other places you get 2(n+1) entries from the endomorphism?
  32. robot-dreams commented at 1:24 am on November 30, 2021: contributor

    Looks good overall.

    It’s unfortunate that:

    • Padding / alignment adds so much complexity to these calculations
    • Pippenger sizes have this weird non-monotonic behavior

    But I still think your change makes sense.

    My only general feedback is that updating the scratch space usage would involve keeping a lot of different things in sync, similar to what @real-or-random mentioned at #1004 (comment). Is there a way to refactor or add comments to make the task easier in the future (e.g. by sharing code between scratch_size_raw and batch_allocate)?

  33. ecmult: fix off-by-one in ecmult_multi test 8040c1f033
  34. ecmult: compute allocated size for pippenger buckets consistently 4c5ac49e5e
  35. scratch: add alloc_size function to scratch_space 8a29272a18
  36. ecmult: make strauss_ and pippenger_scratch_size more precise
    Take actual alignment into account instead of assuming worst case.
    This improves test because it can be checked that *_scratch_size matches
    what is actually allocated.
    4d78e9bf7c
  37. ecmult: make strauss_ and pippenger_max_points more precise
    Take actual alignment into account instead of assuming the worst case. This
    allows more precise tests for strauss, because if a scratch space has exactly
    strauss_scratch_size(n_points) left, then secp256k1_strauss_max_points(cb,
    scratch) = n_points.
    2d7a903d79
  38. ecmult: add function to compute optimal scratch space size 0d574b4366
  39. jonasnick force-pushed on Jan 30, 2022
  40. jonasnick commented at 5:11 pm on January 30, 2022: contributor
    I rebased this to see how master affects this PR. Will still need to address review comments and add better explanations to the commits.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-23 19:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me