Add Elligator Square module #982

pull sipa wants to merge 4 commits into bitcoin-core:master from sipa:202109_ellsq changing 21 files +1133 −24
  1. sipa commented at 7:04 pm on September 28, 2021: contributor

    Based on #979.

    This adds a module with an implementation of the Elligator Squared algorithm for encoding/decoding public keys in uniformly random byte arrays.

  2. sipa force-pushed on Sep 28, 2021
  3. sipa force-pushed on Oct 2, 2021
  4. sipa force-pushed on Nov 9, 2021
  5. sipa force-pushed on Nov 9, 2021
  6. sipa force-pushed on Nov 9, 2021
  7. in src/modules/ellsq/tests_impl.h:99 in 7b1e6260e2 outdated
    93+    for (i = 0; i < 1000*count; i++) {
    94+        secp256k1_fe t;
    95+        secp256k1_ge g;
    96+        random_field_element_test(&t);
    97+        secp256k1_ellsq_fe_to_ge_var(&g, &t);
    98+        CHECK(secp256k1_ge_is_valid_var(&g));
    


    robot-dreams commented at 7:00 pm on November 12, 2021:

    Is it worth checking that t appears exactly once among preimages of g?

    0        {
    1            /* t should appear exactly once in preimages */
    2            int j, matches = 0;
    3            secp256k1_fe f;
    4            for (j = 0; j < 4; j++) {
    5                if (secp256k1_ellsq_ge_to_fe_var(&f, &g, j) && check_fe_equal(&f, &t)) matches++;
    6            }
    7            CHECK(matches == 1);
    8        }
    

    sipa commented at 9:06 pm on November 16, 2021:
    Good idea, done.
  8. in src/modules/ellsq/tests_impl.h:107 in 7b1e6260e2 outdated
     96+        random_field_element_test(&t);
     97+        secp256k1_ellsq_fe_to_ge_var(&g, &t);
     98+        CHECK(secp256k1_ge_is_valid_var(&g));
     99+    }
    100+    /* Verify that secp256k1_ellsq_ge_to_fe_var + fe_to_ge_var roundtrips for random inputs. */
    101+    for (i = 0; i < 1000*count; i++) {
    


    robot-dreams commented at 7:02 pm on November 12, 2021:

    Is it worth checking that all preimages are distinct?

     0        int j, valid[4];
     1        secp256k1_fe f[4];
     2        secp256k1_ge g;
     3        random_group_element_test(&g);
     4        for (j = 0; j < 4; j++) {
     5            valid[j] = 0;
     6            if (secp256k1_ellsq_ge_to_fe_var(&f[j], &g, j)) {
     7                secp256k1_ge g2;
     8                secp256k1_ellsq_fe_to_ge_var(&g2, &f[j]);
     9                ge_equals_ge(&g, &g2);
    10                valid[j] = 1;
    11            }
    12        }
    13        /* Preimages should all be distinct */
    14        for (j = 0; j < 4; j++) {
    15            int k;
    16            if (!valid[j]) continue;
    17            for (k = j + 1; k < 4; k++) {
    18                if (!valid[k]) continue;
    19                CHECK(!check_fe_equal(&f[j], &f[k]));
    20            }
    21        }
    

    sipa commented at 9:06 pm on November 16, 2021:
    Good idea, added (slightly differently).
  9. robot-dreams commented at 8:07 pm on November 12, 2021: contributor

    tACK 7b1e6260e215174561a2d1fdd7f500ba6eaf1489

    I’ll give a full ack once I carefully review:

    • Optimizations in section 3 of the writeup
    • Low-level implementation
    • Argument that the encoding is statistically indistinguishable from 64 random bytes, even with the minor changes (e.g. point at infinity, omitting some duplicate checks)

    Basic question, what would go wrong if we use naive approaches such as:

    (1) Given a random curve point (x1, y1), increment the x coordinate until you get another curve point (x2, y2). Use a random field element in [x1, x2) as the encoding. Decode by decrementing until you get an x corresponding to a curve point.

    (2) Just generate and use a random field element as the encoding; both parties decrement x until they get a valid curve point.

    I’m guessing the reason we don’t do something like that is (1) will produce field elements that don’t look random, and (2) will generate keys in a biased way?

  10. sipa commented at 8:11 pm on November 12, 2021: contributor

    @robot-dreams:

    (1) Indeed, this will result in biased encodings, because the distance between field elements which are valid X coordinates varies, and attacker-known, so they can detect a bias: encodings for which the previous and next valid X coordinate are close together will occur more frequently.

    (2) Biased private key isn’t really a problem as long as (a) the bias is small and (b) the bias isn’t recognizable from the encodings/public keys, which is the case here. There is another problem however: the encoder wouldn’t know the corresponding private key.

  11. robot-dreams commented at 8:16 pm on November 12, 2021: contributor
    Oops, good point about (2), I am silly. 🤣
  12. sipa commented at 8:20 pm on November 12, 2021: contributor
    But about (1), if there were a known upper bound on the distance between valid X coordinates (AFAIK there isn’t one, but more than ~128 is probably negligibly rare), you could actually use f(x) = (first point with largest X coordinate not larger than x) as mapping function in the scheme. I suspect that’s significantly slower than what we have now though, as you’d need avg 128 iterations, and counting the distance between valid X coordinates around each…
  13. in src/modules/ellsq/main_impl.h:254 in 7b1e6260e2 outdated
    249+            secp256k1_ge_neg(&q, &q);
    250+            secp256k1_gej_set_ge(&qj, &q);
    251+            secp256k1_gej_add_ge_var(&qj, &qj, &p, NULL);
    252+            if (!secp256k1_gej_is_infinity(&qj)) {
    253+                 secp256k1_ge_set_gej_var(&q, &qj);
    254+            } else {
    


    robot-dreams commented at 8:34 pm on November 12, 2021:

    Quick question, is there an easy way to hit this branch in a test, that doesn’t involve something convoluted like mocking out the hash function?

    No worries if not but just thought I’d double check, since I added a VERIFY_CHECK(0) here and it doesn’t get hit by either the test or benchmark.


    sipa commented at 8:35 pm on November 12, 2021:

    See the comment below:

    Note that if u1 is generated using a secure hash function, the probability of reaching this branch is negligible.

    So I think it’s indeed impossible to construct a testcase for it without stubbing out the hash function.


    robot-dreams commented at 3:12 pm on November 15, 2021:

    I ended up just checking this case manually with these steps:

    • Add code to always set u1 directly to a preimage of p (if one exists)
    • Add VERIFY_CHECK(0) in the else case and make sure it gets hit
    • Remove the VERIFY_CHECK(0) and make sure the tests (especially roundtrips) pass

    Looks good to me!

  14. robot-dreams commented at 10:46 pm on November 13, 2021: contributor

    Code Review ACK 7b1e6260e215174561a2d1fdd7f500ba6eaf1489, modulo verifying the proof that the encoding is actually indistinguishable from 64 random bytes

    I opened the different implementations of f, r side by side and checked that:

    • Section 2.6 is equivalent to Section 3
    • Section 3 is equivalent to the broken-down Sage implementation
    • The C and Sage implementations are equivalent

    Basic question:

    • The use of SHA256 for generating branch values seems reasonable, but just wondering, is there a justification that this use of SHA256 gives you a secure PRG (taking rnd32 as the key)? (Is this a well-known construction?)
  15. in src/modules/ellsq/main_impl.h:277 in 7b1e6260e2 outdated
    272+        }
    273+        memset(hashdata, 0, sizeof(hashdata));
    274+        return 1;
    275+    }
    276+    /* Only returned in case the provided pubkey is invalid. */
    277+    return 0;
    


    robot-dreams commented at 3:19 pm on November 15, 2021:

    Could this happen? The coverage report shows that this line doesn’t get hit, and secp256k1_pubkey_load never seems to return 0.

    If not, would it make sense to either:

    • Add an explicit check that the pubkey is valid
    • Replace the if (secp256k1_pubkey_load(ctx, &p, pubkey)) { with a VERIFY_CHECK and de-indent the body

    sipa commented at 10:13 pm on November 15, 2021:

    Hmm, good point. I will address this.

    Perhaps we should also make secp256k1_load_pubkey return void (not in this PR, though).


    sipa commented at 9:22 pm on November 16, 2021:
    secp256k1_pubkey_load can actually return 0, through its ARG_CHECK macro.

    robot-dreams commented at 10:25 pm on November 16, 2021:

    Oh nice, I missed that.

    Do we want more strict checks here than ARG_CHECK(!secp256k1_fe_is_zero(&ge->x))? For example the following passes, but I wouldn’t expect 1000 randomly generate 64-byte strings to all produce valid pubkeys:

    0        int N = 1000;
    1        int j = 0, success = 0;
    2        secp256k1_ge g;
    3        secp256k1_pubkey pubkey;
    4        secp256k1_testrand256(pubkey.data);
    5        secp256k1_testrand256(pubkey.data + 32);
    6        for (j = 0; j < N; j++) {
    7            if (secp256k1_pubkey_load(ctx, &g, &pubkey)) success++;
    8        }
    9        CHECK(success == N);
    

    sipa commented at 10:28 pm on November 16, 2021:
    I don’t think that’s needed. That check is a last-resort sanity check (several functions will set pubkeys on output to all-zeros when they return failure). Triggering that situation implies you’re already using the API incorrectly; a secp256k1_pubkey object must contain a valid public key, in normal usage.
  16. in src/modules/ellsq/main_impl.h:273 in 7b1e6260e2 outdated
    268+                secp256k1_fe_normalize_var(&u2);
    269+                secp256k1_fe_get_b32(ell64 + 32, &u2);
    270+                break;
    271+            }
    272+        }
    273+        memset(hashdata, 0, sizeof(hashdata));
    


    robot-dreams commented at 3:47 pm on November 15, 2021:

    Do you know if this memset gets optimized out? e.g. here’s how the disassembly for the tests binary looked on my machine:

     0callq   _secp256k1_ellsq_ge_to_fe_var
     1testl   %eax, %eax
     2movq    -0xd0(%rbp), %r15
     3movq    -0xc0(%rbp), %rbx
     4leaq    -0x2c0(%rbp), %r14
     5leaq    -0x90(%rbp), %r12
     6leaq    -0x320(%rbp), %r13
     7jne     0x1000097f1
     8
     9...
    10
    110x1000097f1:
    12leaq    -0x2f0(%rbp), %r14
    13movq    %r14, %rdi
    14callq   _secp256k1_fe_normalize_var
    15addq    $0x20, %r15
    16movq    %r15, %rdi
    17movq    %r14, %rsi
    18callq   _secp256k1_fe_get_b32
    19movl    $0x1, %eax
    20movq    0x1867e5(%rip), %rcx    ## literal pool symbol address: ___stack_chk_guard
    21movq    (%rcx), %rcx
    22cmpq    -0x30(%rbp), %rcx
    23jne     0x10000988d
    24addq    $0x2f8, %rsp            ## imm = 0x2F8
    25popq    %rbx
    26popq    %r12
    27popq    %r13
    28popq    %r14
    29popq    %r15
    30popq    %rbp
    31retq
    

    I see calls corresponding to the if statement on line 266 above, as well as the return 1, but I don’t see anything corresponding to the memset.


    real-or-random commented at 8:24 am on November 16, 2021:
    Yes, it is optimized out but currently that’s true for all of these “safety” memsets in the codebase. This is something we should really fix. I have an PR here but it’s old and needs to be rebased and finished… #636
  17. robot-dreams commented at 4:54 pm on November 15, 2021: contributor

    ACK 7b1e6260e215174561a2d1fdd7f500ba6eaf1489, I don’t think the rest of my comments are blocking.

    • I’m happy with the argument that the sampling algorithm produces a uniformly random preimage of a curve point
    • Although I wasn’t yet able to calculate the exact statistical distance between the distribution implemented here and the distribution described in the paper (handling infinity being the only difference), I suspect that it’s O(1/p), i.e. negligible unless I made a big mistake in my estimation.
    • I’ve convinced myself (in part with the help of the section starting with “Back to Delphia” in this article) that the use of SHA256 to generate random branches / field elements is secure under a random oracle model 😅 sipa’s favorite! Here, an attacker would have no choice but to guess rnd32.

    The only thing I haven’t verified is that the function f is indeed “well-distributed”, but I don’t want to block this PR until I’ve learned the background material needed to check that—I’m happy to accept that without verifying the proof for now.

    Finally, would it make sense to either:

    1. Elaborate on the statement not terribly non-uniform in https://github.com/sipa/writeups/tree/main/elligator-square-for-bn#25-dealing-with-infinity
    2. Make the encoder NOT target the f(u) = -f(v) case, so that it’s really easy to show that the statistical distance between the implemented distribution and the one in the paper is neglible (the decoder can still handle that case, so that every 64-byte string decodes to a valid public key)
  18. sipa commented at 10:11 pm on November 15, 2021: contributor

    Elaborate on the statement not terribly non-uniform in https://github.com/sipa/writeups/tree/main/elligator-square-for-bn#25-dealing-with-infinity

    I should drop that, as it’s confusing. Given the immensely low probability of hitting infinity in the first place, how it is handled is totally irrelevant.

    This comment aims to indicate that it’s better than e.g. mapping infinity to the generator or so (which it is, as it’d mean the generator has ~2n ellsq preimages, while other points only have ~n), but I don’t think that’s a useful justification given it’s only changing an negligibly frequent event anyway. The real justification is: it’s probably the easiest way to implement this edge case.

    Make the encoder NOT target the f(u) = -f(v) case, so that it’s really easy to show that the statistical distance between the implemented distribution and the one in the paper is neglible (the decoder can still handle that case, so that every 64-byte string decodes to a valid public key)

    I don’t think it should be hard to show the difference is negligible, given the fact that even triggering (much less observing) a case in which the outcome is different is negligible (it requires picking a uniformly random u for which f(u)=-P).

  19. robot-dreams commented at 6:09 am on November 16, 2021: contributor
    OK, I’m fully on board with this now! If you’re curious, see https://gist.github.com/robot-dreams/86302bfca28bf9dc83ced365cd428f64 for way too much detail that nobody needed or asked for.
  20. sipa force-pushed on Nov 16, 2021
  21. sipa force-pushed on Nov 16, 2021
  22. sipa force-pushed on Nov 16, 2021
  23. robot-dreams commented at 10:31 pm on November 16, 2021: contributor
    ACK 07a4ef5a08cea9916cf93603670d8ee4d5340540
  24. stratospher referenced this in commit 94767af462 on Jan 7, 2022
  25. stratospher referenced this in commit b3c5aab7eb on Jan 7, 2022
  26. stratospher referenced this in commit 4a00483129 on Jan 7, 2022
  27. stratospher referenced this in commit 0cfafbda3e on Jan 7, 2022
  28. stratospher referenced this in commit f01f335844 on Jan 7, 2022
  29. stratospher referenced this in commit 8eb51d7e80 on Jan 7, 2022
  30. stratospher referenced this in commit 915a1c9abb on Jan 7, 2022
  31. stratospher referenced this in commit 0410b05d16 on Jan 8, 2022
  32. stratospher referenced this in commit a77605a047 on Jan 8, 2022
  33. stratospher referenced this in commit 8cdf579e86 on Jan 8, 2022
  34. stratospher referenced this in commit 2e831ff811 on Jan 8, 2022
  35. stratospher referenced this in commit 1c2ef4c40f on Jan 8, 2022
  36. stratospher referenced this in commit e755867f79 on Jan 8, 2022
  37. stratospher referenced this in commit 33827163a0 on Jan 8, 2022
  38. stratospher referenced this in commit 4a63d306e4 on Jan 8, 2022
  39. stratospher referenced this in commit 90d044f51c on Jan 8, 2022
  40. stratospher referenced this in commit 2b621b5f92 on Jan 8, 2022
  41. stratospher referenced this in commit 692ef047f3 on Jan 8, 2022
  42. stratospher referenced this in commit ff16f37018 on Jan 8, 2022
  43. sipa force-pushed on Jan 25, 2022
  44. sipa commented at 7:59 pm on January 25, 2022: contributor
    Rebased.
  45. robot-dreams commented at 8:12 pm on January 25, 2022: contributor

    ACK f997dadf592131054d1cfca6175730e53dee52a1 (aside from most recent build / CI changes) based on:

    • #982 (comment)
    • git range-diff 0a40a4861ae654b3685d3d3670d45876f7cc9c48 07a4ef5a08cea9916cf93603670d8ee4d5340540 f997dadf592131054d1cfca6175730e53dee52a1
  46. sipa commented at 9:03 pm on January 31, 2022: contributor
    Mental note: simplify this after #1033.
  47. stratospher referenced this in commit 858c4072db on Feb 3, 2022
  48. stratospher referenced this in commit 2b49380816 on Feb 3, 2022
  49. stratospher referenced this in commit f8e8a8d141 on Feb 3, 2022
  50. Native jacobi symbol algorithm
    This introduces variants of the divsteps-based GCD algorithm used for
    modular inverses to compute Jacobi symbols. Changes compared to
    the normal vartime divsteps:
    * Only positive matrices are used, guaranteeing that f and g remain
      positive.
    * An additional jac variable is updated to track sign changes during
      matrix computation.
    * There is (so far) no proof that this algorithm terminates within
      reasonable amount of time for every input, but experimentally it
      appears to almost always need less than 900 iterations. To account
      for that, only a bounded number of iterations is performed (1500),
      after which failure is returned. The field logic then falls back to
      using square roots to determining the result.
    * The algorithm converges to f=g=gcd(f0,g0) rather than g=0. To keep
      this test simple, the end condition is f=1, which won't be reached
      if started with g=0. That case is dealt with specially.
    e5ae1e0c4d
  51. doc: Describe Jacobi calculation in safegcd_implementation.md 7e1bbef364
  52. Elligator Squared module
    This adds a module with an implementation of the Elligator Squared
    algorithm for encoding/decoding public keys in uniformly random
    byte arrays.
    fcd4abfeed
  53. Add ellsq testing to CI d4cbedc023
  54. sipa force-pushed on Jun 28, 2022
  55. sipa commented at 2:01 pm on July 7, 2022: contributor
    Closing, as Elligator Square seems strictly inferior in performance and complexity to ElligatorSwift.
  56. sipa closed this on Jul 7, 2022


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 09:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me