This introduces variants of the vartime divsteps-based GCD algorithm used for modular inverses to compute Jacobi symbols. Changes compared to the normal vartime divsteps:
- Only positive matrices are used, guaranteeing that f and g remain positive.
- An additional jac variable is updated to track sign changes during matrix computation.
- There is (so far) no proof that this algorithm terminates within reasonable amount of time for every input, but experimentally it appears to almost always need less than 900 iterations. To account for that, only a bounded number of iterations is performed (1500), after which failure is returned. The field logic then falls back to using square roots to determining the result.
- The algorithm converges to f=g=gcd(f0,g0) rather than g=0. To keep this test simple, the end condition is f=1, which won’t be reached if started with g=0. That case is dealt with specially.
This code is currently unused, except for tests. I don’t aim for it to be merged until there is a need for it, but this demonstrates its feasibility.
In terms of performance:
0field_inverse: min 1.76us / avg 1.76us / max 1.78us
1field_inverse_var: min 0.991us / avg 0.993us / max 0.996us
2field_jacobi_var: min 1.31us / avg 1.31us / max 1.31us
3field_sqrt: min 4.36us / avg 4.37us / max 4.40us
while with the older (f24e122d13db7061b1086ddfd21d3a1c5294213b) libgmp based Jacobi code on the same system:
0num_jacobi: min 1.53us / avg 1.54us / max 1.55us