- Builds on top of https://github.com/bitcoin/secp256k1/pull/41 (Co-Z arithmetic for precomputation).
- Instead of inverting, scales all precomp A points to a common Z (usually != 1).
- Stores the precomp A points as affine by discarding the z coordinate (whilst noting the single shared Z value for later), which is equivalent to treating them as affine points on an isogenous curve.
- In secp256k1_ecmult, we treat the accumulator (“r”) as operating on this isogenous curve.
- Doubling formula needs no changes, adding of precomp A points is a simple _gej_add_ge_var.
- G precomp remains the same, but each addition of a G precomp point requires an extra 1M to bring the accumulator temporarily back to the “default isogeny”. For simplicity here I’ve modified _gej_add_ge_var to support this.
- The z value of the final result also needs to be scaled by the shared Z (1M).
- Optimal at WINDOW_A=5.
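
The steps in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in this PR: the helper names (`build_table`, `jac_add_affine`, `zscale`, and so on) are made up for the example, and the shared Z is obtained here by multiplying each point through by the other points' z values rather than by the Co-Z chain. It uses the secp256k1 field and generator, is not constant-time, and ignores infinity and other exceptional cases.

```python
# Toy model on secp256k1 (y^2 = x^3 + 7 over F_P). Illustration only.
P = 2**256 - 2**32 - 977
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)

def affine_add(p, q):
    """Textbook affine addition with inversions (reference only, p != -q)."""
    if p == q:
        lam = 3 * p[0] * p[0] * pow(2 * p[1] % P, -1, P) % P
    else:
        lam = (q[1] - p[1]) * pow((q[0] - p[0]) % P, -1, P) % P
    x = (lam * lam - p[0] - q[0]) % P
    return (x, (lam * (p[0] - x) - p[1]) % P)

def affine_mul(k, p):
    """Reference k*p by repeated addition (small k >= 1)."""
    r = p
    for _ in range(k - 1):
        r = affine_add(r, p)
    return r

def jac_double(pt):
    """Jacobian doubling for a == 0; note that b never appears."""
    X, Y, Z = pt
    S = 4 * X * Y * Y % P
    M = 3 * X * X % P
    X3 = (M * M - 2 * S) % P
    Y3 = (M * (S - X3) - 8 * pow(Y, 4, P)) % P
    return (X3, Y3, 2 * Y * Z % P)

def jac_add_affine(pt, q, zscale=1):
    """Mixed addition pt + q. With zscale != 1, q is an affine point on the
    original curve and pt lives on the rescaled curve: the effective z
    (one extra multiplication) is used for U2/S2, but Z3 uses the plain Z1."""
    X1, Y1, Z1 = pt
    Ze = Z1 * zscale % P
    U2 = q[0] * Ze * Ze % P
    S2 = q[1] * pow(Ze, 3, P) % P
    H, R = (U2 - X1) % P, (S2 - Y1) % P
    H2 = H * H % P
    H3 = H * H2 % P
    X3 = (R * R - H3 - 2 * X1 * H2) % P
    Y3 = (R * (X1 * H2 - X3) - Y1 * H3) % P
    return (X3, Y3, Z1 * H % P)

def to_affine(pt):
    X, Y, Z = pt
    zi = pow(Z, -1, P)
    return (X * zi * zi % P, Y * pow(zi, 3, P) % P)

def jac_small_multiple(k, a):
    """k*a in Jacobian coordinates via double-and-add (odd k, no inversion)."""
    r = (a[0], a[1], 1)
    for bit in bin(k)[3:]:
        r = jac_double(r)
        if bit == "1":
            r = jac_add_affine(r, a)
    return r

def build_table(a, odd=(1, 3, 5, 7)):
    """Scale every k*a to one common Z, then drop z and keep (x, y)."""
    pts = [jac_small_multiple(k, a) for k in odd]
    zc = 1
    for _, _, z in pts:
        zc = zc * z % P
    table = {}
    for k, (X, Y, Z) in zip(odd, pts):
        lam = 1
        for k2, (_, _, z2) in zip(odd, pts):
            if k2 != k:
                lam = lam * z2 % P      # lam * Z == zc, so no inversion needed
        table[k] = (X * lam * lam % P, Y * pow(lam, 3, P) % P)
    return table, zc

def demo():
    """Compute 23*A + G with the accumulator on the rescaled curve."""
    A = affine_mul(7, G)
    table, zc = build_table(A)
    r = (table[3][0], table[3][1], 1)     # 3A, z taken as 1
    r = jac_double(r)                     # 6A
    r = jac_add_affine(r, table[5])       # 11A
    r = jac_double(r)                     # 22A
    r = jac_add_affine(r, G, zscale=zc)   # 22A + G (the extra 1M)
    r = jac_add_affine(r, table[1])       # 23A + G
    X, Y, Z = r
    result = to_affine((X, Y, Z * zc % P))  # final 1M: scale z by the shared Z
    expected = affine_add(affine_mul(23, A), G)
    return result, expected
```

Calling `demo()` returns the point computed through the rescaled table alongside a reference value from plain affine arithmetic; the two should agree. The stored table entries satisfy y^2 = x^3 + 7*Z^6 rather than the original curve equation, which is the "other curve" the accumulator operates on.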
Anyway, the upshot of this is an ~5% performance improvement for bench_verify (64bit, endo=yes), over and above the Co-Z precomputation itself, so >8% total vs master. Rough math suggests it’s saving ~116M vs the Co-Z arithmetic PR, which appears to make any inversion approach obsolete.
Questions welcome, as I’m not sure how to explain this in a straightforward way (as far as I know this is a novel idea). I guess it’s important to understand why the isogeny works out so neatly: it’s a rather nice property of secp256k1 that stems from it having a==0 in the curve equation. Otherwise, operating on an isogenous curve would require changes to the doubling formula. I recommend e.g. http://joye.site88.net/papers/TJ10coordblind.pdf, where this is discussed in the context of blinding (which reminds me I was meaning to PR a demo of that), albeit the a==0 case is not explicitly called out. It may be easier just to think of it as playing games with the z values…
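
For anyone who wants the a==0 argument written out, here is a short derivation. It uses only standard facts about short Weierstrass curves and is not taken from the patch; λ stands in for the shared Z, and the names φ_λ and E_λ are just notation for this note.

$$
\phi_\lambda : (x, y) \mapsto (\lambda^2 x,\ \lambda^3 y), \qquad \lambda \neq 0
$$

$$
E : y^2 = x^3 + a x + b
\quad\longrightarrow\quad
E_\lambda : y^2 = x^3 + a\lambda^4 x + b\lambda^6
$$

$$
(\lambda^3 y)^2 = \lambda^6 (x^3 + a x + b) = (\lambda^2 x)^3 + a\lambda^4 (\lambda^2 x) + b\lambda^6
$$

$$
a = 0 \ \Longrightarrow\ E_\lambda : y^2 = x^3 + b\lambda^6
$$

The Jacobian doubling and addition formulas involve a but never b, so with a = 0 they are identical on E and on E_λ. A Jacobian point (X, Y, Z) on E_λ corresponds to (X, Y, λZ) on E, which is why a single multiplication of the final z is enough to get back.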
EDIT: I’ll leave it as I wrote it, but I think the above should be talking about isomorphisms rather than isogenies.