Based on #979.
This adds a module with an implementation of the Elligator Squared algorithm for encoding/decoding public keys in uniformly random byte arrays.
```
 93+    for (i = 0; i < 1000*count; i++) {
 94+        secp256k1_fe t;
 95+        secp256k1_ge g;
 96+        random_field_element_test(&t);
 97+        secp256k1_ellsq_fe_to_ge_var(&g, &t);
 98+        CHECK(secp256k1_ge_is_valid_var(&g));
```
Is it worth checking that `t` appears exactly once among the preimages of `g`?
```c
{
    /* t should appear exactly once in preimages */
    int j, matches = 0;
    secp256k1_fe f;
    for (j = 0; j < 4; j++) {
        if (secp256k1_ellsq_ge_to_fe_var(&f, &g, j) && check_fe_equal(&f, &t)) matches++;
    }
    CHECK(matches == 1);
}
```
```
 96+        random_field_element_test(&t);
 97+        secp256k1_ellsq_fe_to_ge_var(&g, &t);
 98+        CHECK(secp256k1_ge_is_valid_var(&g));
 99+    }
100+    /* Verify that secp256k1_ellsq_ge_to_fe_var + fe_to_ge_var roundtrips for random inputs. */
101+    for (i = 0; i < 1000*count; i++) {
```
Is it worth checking that all preimages are distinct?
```c
int j, valid[4];
secp256k1_fe f[4];
secp256k1_ge g;
random_group_element_test(&g);
for (j = 0; j < 4; j++) {
    valid[j] = 0;
    if (secp256k1_ellsq_ge_to_fe_var(&f[j], &g, j)) {
        secp256k1_ge g2;
        secp256k1_ellsq_fe_to_ge_var(&g2, &f[j]);
        ge_equals_ge(&g, &g2);
        valid[j] = 1;
    }
}
/* Preimages should all be distinct */
for (j = 0; j < 4; j++) {
    int k;
    if (!valid[j]) continue;
    for (k = j + 1; k < 4; k++) {
        if (!valid[k]) continue;
        CHECK(!check_fe_equal(&f[j], &f[k]));
    }
}
```
tACK 7b1e6260e215174561a2d1fdd7f500ba6eaf1489
I’ll give a full ack once I carefully review:
Basic question, what would go wrong if we use naive approaches such as:
(1) Given a random curve point (x1, y1), increment the x coordinate until you get another curve point (x2, y2). Use a random field element in [x1, x2) as the encoding. Decode by decrementing until you get an x corresponding to a curve point.
(2) Just generate and use a random field element as the encoding; both parties decrement x until they get a valid curve point.
I’m guessing the reason we don’t do something like that is (1) will produce field elements that don’t look random, and (2) will generate keys in a biased way?
(1) Indeed, this will result in biased encodings, because the distance between field elements which are valid X coordinates varies and is known to the attacker, so they can detect a bias: encodings for which the previous and next valid X coordinate are close together will occur more frequently.
(2) Biased private key isn’t really a problem as long as (a) the bias is small and (b) the bias isn’t recognizable from the encodings/public keys, which is the case here. There is another problem however: the encoder wouldn’t know the corresponding private key.
```
249+    secp256k1_ge_neg(&q, &q);
250+    secp256k1_gej_set_ge(&qj, &q);
251+    secp256k1_gej_add_ge_var(&qj, &qj, &p, NULL);
252+    if (!secp256k1_gej_is_infinity(&qj)) {
253+        secp256k1_ge_set_gej_var(&q, &qj);
254+    } else {
```
Quick question, is there an easy way to hit this branch in a test, that doesn’t involve something convoluted like mocking out the hash function?
No worries if not, but just thought I'd double check, since I added a `VERIFY_CHECK(0)` here and it doesn't get hit by either the test or benchmark.
See the comment below:
> Note that if `u1` is generated using a secure hash function, the probability of reaching this branch is negligible.
So I think it’s indeed impossible to construct a testcase for it without stubbing out the hash function.
I ended up just checking this case manually with these steps:
- set `u1` directly to a preimage of `p` (if one exists)
- add a `VERIFY_CHECK(0)` in the `else` case and make sure it gets hit
- remove the `VERIFY_CHECK(0)` and make sure the tests (especially roundtrips) pass

Looks good to me!
Code Review ACK 7b1e6260e215174561a2d1fdd7f500ba6eaf1489, modulo verifying the proof that the encoding is actually indistinguishable from 64 random bytes
I opened the different implementations of `f`, `r` side by side and checked that:

Basic question: `rnd32` as the key)? (Is this a well-known construction?)

```
272+    }
273+    memset(hashdata, 0, sizeof(hashdata));
274+    return 1;
275+ }
276+ /* Only returned in case the provided pubkey is invalid. */
277+ return 0;
```
Could this happen? The coverage report shows that this line doesn't get hit, and `secp256k1_pubkey_load` never seems to return 0. If not, would it make sense to either:
- replace `if (secp256k1_pubkey_load(ctx, &p, pubkey)) {` with a `VERIFY_CHECK` and de-indent the body

Hmm, good point. I will address this.
Perhaps we should also make `secp256k1_pubkey_load` return `void` (not in this PR, though).
`secp256k1_pubkey_load` can actually return 0, through its `ARG_CHECK` macro.

Oh nice, I missed that.
Do we want more strict checks here than `ARG_CHECK(!secp256k1_fe_is_zero(&ge->x))`? For example the following passes, but I wouldn't expect 1000 randomly generated 64-byte strings to all produce valid pubkeys:
```c
int N = 1000;
int j, success = 0;
secp256k1_ge g;
secp256k1_pubkey pubkey;
for (j = 0; j < N; j++) {
    /* a fresh random 64-byte string each iteration */
    secp256k1_testrand256(pubkey.data);
    secp256k1_testrand256(pubkey.data + 32);
    if (secp256k1_pubkey_load(ctx, &g, &pubkey)) success++;
}
CHECK(success == N);
```
A `secp256k1_pubkey` object must contain a valid public key, in normal usage.
```
268+        secp256k1_fe_normalize_var(&u2);
269+        secp256k1_fe_get_b32(ell64 + 32, &u2);
270+        break;
271+    }
272+ }
273+ memset(hashdata, 0, sizeof(hashdata));
```
Do you know if this `memset` gets optimized out? e.g. here's how the disassembly for the `tests` binary looked on my machine:
```asm
callq   _secp256k1_ellsq_ge_to_fe_var
testl   %eax, %eax
movq    -0xd0(%rbp), %r15
movq    -0xc0(%rbp), %rbx
leaq    -0x2c0(%rbp), %r14
leaq    -0x90(%rbp), %r12
leaq    -0x320(%rbp), %r13
jne     0x1000097f1

...

0x1000097f1:
leaq    -0x2f0(%rbp), %r14
movq    %r14, %rdi
callq   _secp256k1_fe_normalize_var
addq    $0x20, %r15
movq    %r15, %rdi
movq    %r14, %rsi
callq   _secp256k1_fe_get_b32
movl    $0x1, %eax
movq    0x1867e5(%rip), %rcx ## literal pool symbol address: ___stack_chk_guard
movq    (%rcx), %rcx
cmpq    -0x30(%rbp), %rcx
jne     0x10000988d
addq    $0x2f8, %rsp ## imm = 0x2F8
popq    %rbx
popq    %r12
popq    %r13
popq    %r14
popq    %r15
popq    %rbp
retq
```
I see calls corresponding to the `if` statement on line 266 above, as well as the `return 1`, but I don't see anything corresponding to the `memset`.
ACK 7b1e6260e215174561a2d1fdd7f500ba6eaf1489, I don’t think the rest of my comments are blocking.
`rnd32`.

The only thing I haven't verified is that the function `f` is indeed "well-distributed", but I don't want to block this PR until I've learned the background material needed to check that; I'm happy to accept that without verifying the proof for now.
Finally, would it make sense to either:
- elaborate on the statement "not terribly non-uniform" in https://github.com/sipa/writeups/tree/main/elligator-square-for-bn#25-dealing-with-infinity
- make the encoder NOT target the f(u) = -f(v) case, so that it's really easy to show that the statistical distance between the implemented distribution and the one in the paper is negligible (the decoder can still handle that case, so that every 64-byte string decodes to a valid public key)

> Elaborate on the statement "not terribly non-uniform" in https://github.com/sipa/writeups/tree/main/elligator-square-for-bn#25-dealing-with-infinity

I should drop that, as it's confusing. Given the immensely low probability of hitting infinity in the first place, how it is handled is totally irrelevant.

This comment aims to indicate that it's better than e.g. mapping infinity to the generator or so (which it is, as it'd mean the generator has ~2n ellsq preimages, while other points only have ~n), but I don't think that's a useful justification given it's only changing a negligibly frequent event anyway. The real justification is: it's probably the easiest way to implement this edge case.

> Make the encoder NOT target the f(u) = -f(v) case, so that it's really easy to show that the statistical distance between the implemented distribution and the one in the paper is negligible (the decoder can still handle that case, so that every 64-byte string decodes to a valid public key)

I don't think it should be hard to show the difference is negligible, given the fact that even triggering (much less observing) a case in which the outcome is different is negligible (it requires picking a uniformly random u for which f(u) = -P).
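A rough back-of-the-envelope bound (my estimate, not from the PR): since each point has at most four preimages under f (the four branches the tests above iterate over), a uniformly random field element u hits the bad event with probability at most

$$\Pr_u\!\left[f(u) = -P\right] \;\le\; \frac{4}{p} \;\approx\; 2^{-254},$$

so even across many encodings the deviation from the paper's distribution stays negligible.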
ACK f997dadf592131054d1cfca6175730e53dee52a1 (aside from most recent build / CI changes) based on:
This introduces variants of the divsteps-based GCD algorithm used for
modular inverses to compute Jacobi symbols. Changes compared to
the normal vartime divsteps:
* Only positive matrices are used, guaranteeing that f and g remain
positive.
* An additional jac variable is updated to track sign changes during
matrix computation.
* There is (so far) no proof that this algorithm terminates within a
  reasonable amount of time for every input, but experimentally it
  appears to almost always need fewer than 900 iterations. To account
  for that, only a bounded number of iterations is performed (1500),
  after which failure is returned. The field logic then falls back to
  using square roots to determine the result.
* The algorithm converges to f=g=gcd(f0,g0) rather than g=0. To keep
this test simple, the end condition is f=1, which won't be reached
if started with g=0. That case is dealt with specially.
This adds a module with an implementation of the Elligator Squared
algorithm for encoding/decoding public keys in uniformly random
byte arrays.