As suggested by @gmaxwell. This seems to give a 3% speedup for verification.
Builds on top of #119.
Ah this needs a comment to explain what we're doing, something like:
/_To finish the verification we must check that the r provided matches the X from our multi-exponentation. To do this the obvious way requires converting the point in jacobian coordinate back to affine, or at least it's X value, which requires a slow modular inverse. To avoid the inverse instead of computing r == (Fp(x)/Fp(z)^2) % n, we test (r_z^2 == x) || ((r < p-n) && ((r+n)*z^2 == x)). */
ACK. Tested and reviewed the implementation. Perhaps since this is my hairbrained scheme, someone additional should review the idea too.
Rebased.
Rebased.
@apoelstra Care to review?
Suggested by Greg Maxwell.
133 | + unsigned char c[32]; 134 | + secp256k1_scalar_get_b32(c, &sig->r); 135 | + secp256k1_fe_t xr; 136 | + secp256k1_fe_set_b32(&xr, c); 137 | + 138 | + // We now have the recomputed R point in pr, and its claimed x coordinate (modulo the order)
Can you change order to group order (n)? Intuitively x should be mod the field order, but ofc ECDSA uses it as an exponent, so it's better to be explicit.
137 | + 138 | + // We now have the recomputed R point in pr, and its claimed x coordinate (modulo the order) 139 | + // in xr. Naively, we would extract the x coordinate from pr (requiring a inversion modulo p), 140 | + // compute the remainder modulo the order, and compare it to xr. However: 141 | + // 142 | + // x(R) mod n == xr
Here you say n but every time afterward you say order. IMHO they should all be n.
138 | + // We now have the recomputed R point in pr, and its claimed x coordinate (modulo the order) 139 | + // in xr. Naively, we would extract the x coordinate from pr (requiring a inversion modulo p), 140 | + // compute the remainder modulo the order, and compare it to xr. However: 141 | + // 142 | + // x(R) mod n == xr 143 | + // <=> exists h. (x(R) + h * order < p && x(R) == xr + h * order)
Do you mean && xr == x(R) + h * order) (and then to change the rest of the lines)? Otherwise I don't see this -- it seems like you're checking that x(R) won't overflow but then adding to xr rather than x(R).
Rewrote the comments a bit; indeed, the bounds checks were wrong.
utACK. Thanks for waiting on me.