x-only ECDH without sqrt

peterdettman commented at 1:53 am on June 26, 2015: contributor

Builds on #252.

This is a demo of another isomorphism trick that I originally described here: http://www.ietf.org/mail-archive/web/cfrg/current/msg05770.html .

This PR adds secp256k1_xo_multiply(), with test and benchmark (bench_ecdh_xo), plus a new group method secp256k1_ge_set_xo_iso_var() to support the central trick of getting a valid point from an x-coord without sqrt.

There are two main caveats:

A Jacobi symbol test is required to validate the input. GMP has an appropriate method~~, but the call site is just a TODO in this current code. I would appreciate help in wiring that up if anyone wants to volunteer~~.
The output is only an x-coord. The y-coord could be recovered… at the cost of a sqrt! AFAIK there’s no cheap way to get even the sign bit. Still, I think needing only the x-coord is a fairly common use case, though I don’t know if/how bitcoin uses ECDH.

NOTE: The input is currently also just a field element, but it could perhaps be a point, with this optimization only applied if it is in compressed format.

I measure an 8.5% performance improvement for bench_ecdh_xo vs bench_ecdh. The final numbers will depend mostly on how expensive the Jacobi symbol test is, but it should be considerably cheaper than the sqrt.

peterdettman commented at 4:07 am on June 26, 2015: contributor

Just to emphasise: the improvement only applies when the input is compressed, which is the case in bench_ecdh currently.

peterdettman force-pushed on Jun 27, 2015

tests: add a couple tests

  - Add zero/one sanity check tests for ecmult

  - Add unit test for secp256k1_scalar_split_lambda_var

  - Typo fix in `ge_equals_ge`; was comparing b->y to itself, should
    have been comparing a->y to b->y

  - Normalize y-coordinate in `random_group_element_test`; this is
    needed to pass random group elements as the first argument to
    `ge_equals_ge`, which I will do in a future commit.

a828df5c34

Add constant-time `secp256k1_point_multiply` for ECDH

Designed with clear separation of the wNAF conversion, precomputation
and exponentiation (since the precomp at least we will probably want
to separate in the API for users who reuse points a lot.

Future work:
  - actually separate precomp in the API
  - do multiexp rather than single exponentiation

955774cef3

Expose API for constant time point multiplication 1bcca3510a

Add benchmark for ECDH multiplication 6b43041f90

Make `secp256k1_scalar_add_bit` conditional; make `secp256k1_scalar_split_lambda_var` constant time

This has the effect of making `secp256k1_scalar_mul_shift_var` constant
time in both input scalars. Keep the _var name because it is NOT constant
time in the shift amount.

As used in `secp256k1_scalar_split_lambda_var`, the shift is always
the constant 272, so this function becomes constant time, and it
loses the `_var` suffix.

59c57209c2

Implement endomorphism optimization for secp256k1_ecdh_point_multiply 48dc8e3001

peterdettman force-pushed on Jul 1, 2015

peterdettman commented at 1:26 am on July 2, 2015: contributor

Updated and rebased for latest #252 changes, with the hash over just the x-coordinate.

Expose API for constant time point multiplication 171470e840

Add benchmark for ECDH multiplication c641e06098

Make `secp256k1_scalar_add_bit` conditional; make `secp256k1_scalar_split_lambda_var` constant time

This has the effect of making `secp256k1_scalar_mul_shift_var` constant
time in both input scalars. Keep the _var name because it is NOT constant
time in the shift amount.

As used in `secp256k1_scalar_split_lambda_var`, the shift is always
the constant 272, so this function becomes constant time, and it
loses the `_var` suffix.

a35d91b13d

Implement endomorphism optimization for secp256k1_ecdh_point_multiply 0436dfcf62

Merge branch 'ecdh-mult' of https://github.com/apoelstra/secp256k1 into ecdh-mult 67d307c2f2

Demo code for x-only ECDH 2a5914ac1a

peterdettman force-pushed on Jul 3, 2015

Add Jacobi symbol test via GMP d23d8c5592

peterdettman commented at 12:32 pm on July 3, 2015: contributor

Jacobi symbol test is now implemented (using GMP method mpz_jacobi wrapped by the num_ API); the performance improvement is reduced to just over 6%.

EDIT: 6% is for 64bit field with endomorphisms enabled.

Fallback to _set_xo_var when USE_NUM_NONE defined a5eaa21565

sipa commented at 9:25 pm on July 8, 2015: contributor

So, there are a bunch of related questions here:

Do we want a specific ECDH implementation, or do we want EC point-scalar multiplication?
- For ECDH, we could have “hash of full serialized point”, “hash of x coordinate” or “just x coordinate”. I wouldn’t want more than one, though.
- EC point-scalar multiplication is much more general, and I wouldn’t mind having an X-only version as well as an XY version, if there is a significant gain.

I also don’t like relying on GMP for an extra operation. At least for the mpz_mod_inverse we can (and should!) do a verification whether the inverse is correct using our implementation. Is something similar possible with mpz_jacobi?

Alpha is using the hash(full point) mechanism currently, though the X-only approach for ECDH does feel a bit cleaner, so I’m fine with only having that approach in the main code if it’s considered better.

gmaxwell commented at 10:06 pm on July 8, 2015: contributor

apolestra was working on a legendre symbol implementation for the library.

I’d be in favor of adoption a construction as standard that used X only and the signs of the two points in the hash, so that they didn’t have the malleability, but also didn’t require computing the Y.

sipa commented at 10:09 pm on July 8, 2015: contributor

@gmaxwell Well a function that only takes an X coordinate and a scalar as input, and gives an X coordinate as output also satisfies that (doesn’t need Y computing, and is not malleable).

apoelstra commented at 10:39 pm on July 8, 2015: contributor

@sipa I am working on a Jacobi symbol function. It’s slightly nontrivial (in particular I need a div_mod operator first). I lost all day today (and most of yesterday) to various meetings at school but I should have a PR by Friday morning.

While “X only” to “X only” is nonmalleable, “EC point” to “X only” is not … and to compute a shared secret you need to start with an EC point (because you need to know its discrete log) so I don’t think this gives us any safety.

I’ve been thinking a bit about learning the sign or parity of y without computing any square roots, but I haven’t gotten anywhere.

sipa commented at 0:23 am on July 9, 2015: contributor

@apoelstra the ECDH function could accept NULL as point input, in which case it defaults to G, allowing the “pubkey creation” for ECDH with the same API.

peterdettman commented at 1:58 am on July 9, 2015: contributor

Regarding the hashing, I’m still a bit skeptical of building a particular derivation method into the agreement primitive, though I understand the motivation. Are use-cases outside bitcoin off the radar?

peterdettman commented at 3:02 am on July 9, 2015: contributor

Regarding malleability, I think any scheme where sign-flipping the input point is not detected outside of the ECDH primitive is already broken, since one could flip the sign for both sides and have them agree on a different value. @gmaxwell Hashing the input signs creates an ordering dilemma, where the primitive is otherwise symmetric (I guess they could be sorted by x-coordinate). Perhaps a parity bit XOR(s1,s2) instead, but then it’s back to allowing both signs to be flipped together.

gmaxwell commented at 7:54 am on July 9, 2015: contributor

@peterdettman one could sort the pubkeys but alas, that doesn’t always work. I don’t know ultimately but we should try to think it through.

In general we’ve tried to avoid offering just “raw primitives” instead of complete cryptography protocols for a couple reasons–

One is that the API becomes something you’re committed to maintaining, e.g. say we offered a raw ECDH but then found a nice fast x-only, then end up maintaining having to maintain code for both because people were using the old API because they expected the Y. Or we found some performance optimization, e.g. imagine if our ECDH verify returned the recovered r value and expected the caller to to do the comparison.

We’ve seen OpenSSL end up having to carry around a lot of buggy additional code because it didn’t really have an external interface but exposed most of its internals to the outside world then had to maintain them.

Another is that we know from experience that more low level exposure is more opportunity to misuse. Example: our old ECDSA directly took the k value, one user of the library decided to set k=privkey xor message (and then argued that it wasn’t vulnerable; andytoshi had fun coming up with attacks for that construct). Another example for ECDH which is not an issue for secp256k1 but would be for some other curves is that a dumb protocol which spat out the ephemeral point directly could be used to leak data from subgroup confinement. So basically, if there is a possible way to easily misuse an API a naive user to misuse an API we ought to have a good reason why we offered it anyways…. Or at least if there is something simply and safe that most people should want we should offer than in addition to the low level thing, just so they’re not required to play with fire.

Another is performance relayed, if there is a generally interesting cryptosystem someone wants to implement with this curve, it should be done using the high speed internal functions not kludged together using the external API.

So these are just some considerations… perhaps returning the X value is the thing we really should do here– I’m not sure– but I do think that its easier to misuse– as I said, I think I can easily construct contrived protocols where the malleability causes problems, and so I think it’s worth giving some thought, even if our ultimate conclusion is that we can’t improve things without making unfortunate tradeoffs.

sipa commented at 1:29 pm on July 10, 2015: contributor

One thing I like about using x-only is that it means all externally visible data structures are 32 bytes. Simply using (p, Q.x) -> (pQ).x however leads to a malleability, where p can be negated. One solution would be (p, Q) -> H(pG + Q || (pQ).x), but that requires an extra point multiplication, and introduces a hash, and needs a 33-byte input.

sipa commented at 1:35 pm on July 10, 2015: contributor

I really see no way to avoid the malleability with x-only, without undoing the performance benefit. Am I missing something?

sipa commented at 1:42 pm on July 10, 2015: contributor

@peterdettman Use cases outside of Bitcoin are definitely not off the radar, though at some point we may need an extension mechanism, as those who want the library for signing-only in memory-constrained setups may not appreciate code in there to implement Alpha’s zero-knowledge range proofs.

I do think that we want an API that exposes “end-to-end” use cases, which are hard to use in an insecure way, and not just low-level primitives. For example, what if we had an API that implemented simple point addition between two compressed points, but someone has to add 1 million points together. The naive way of doing so would be calling the 2-ary point addition 999999 times, but doing so would involve 999999 conversions to affine coordinates, killing performance. If there is a need for doing an addition of 1000000 points, there should be a high-level API that does so, and implements it in the most safe way possible.

DavidEGrayson commented at 4:58 pm on July 10, 2015: none

In general we’ve tried to avoid offering just “raw primitives” instead of complete cryptography protocols for a couple reasons

This is starting to look like a discussion of the overall aim and spirit of the library. As someone who has written an ECDSA implementation I thought I would chime in and give my two cents.

I mainly think of secp256k1 as a mathematical object that has certain low-level operations: it has a generator point and an infinity point, you can add points, multiply points by numbers, negate points. Each point has two coordinates, except for the infinity point. ECDSA is just a handy thing that was built on top of this interesting elliptic curve.

So when I see a library called “libsecp256k1” with the tag line “Optimized C library for EC operations on curve secp256k1.”, I just expect it to have those primitive operations in there.

My ECDSA implementation offers these primitive operations, and that is what allowed someone to build a ring signature implementation, using my work as a dependency.

If I didn’t offer those operations, they would have to write code that depends on poorly documented internal functions in my library. If I pull their work into my library, then I would have to maintain that code myself as my project progresses, which is more work than just maintaining the API they used. If I don’t pull it, then they would have the burden of maintaining a fork of my library. Either way, one of us would be in a position of maintaining code that they didn’t write and aren’t too interested in.

To address the safety issue, you could just document to the user which operations are safe for novices working on production systems, and which operations should be left to advanced users or users who are just playing around. The advanced ones could even be in a separate header file. Or perhaps you would break the project up into separate libraries that offer different classes of operations.

To address the particular example of adding up 1000000 points, you could offer an API that uses opaque pointers to optimized representations of points, instead of actually passing around binary blobs of data.

sipa commented at 5:07 pm on July 10, 2015: contributor

@DavidEGrayson I see your point, but I don’t believe it is realistic. As optimizations evolve, internal data types will change, and the way they need to interact will change too. Exposing those for the purpose of simplifying work on top would incur a large maintenance cost, especially as those internal types are platform dependent, and properly hiding their implementation adds performance cost.

Furthermore, I think it is ill-advised to encourage people to write constructs using low-level optimized unsafe but exposed functions. This is cryptographic code, and if you need access to low-level operations to implement something more efficiently, I think you should do the effort of understanding the internals anyway. The alternative probably has many risks for edge cases.

ajtowns commented at 9:24 am on October 5, 2015: none

The ECDH implementation in openssl only reports “x” back to the user (and this seems to be what’s recommended in NIST SP800-56A / 5.7.1.2 http://nvlpubs.nist.gov/nistpubs/SpecialPublications/NIST.SP.800-56Ar2.pdf and http://www.secg.org/sec1-v2.pdf ). That makes it hard to generate the same secret libsecp256k1 will use – if you already have some data you can test both odd/even possibilities, but if you’re sending and libsecp256k1 is receiving, you need to know in advance.

So I think an “x”-only implementation would be a win for compatability reasons (and if that could be the only option, that seems like it’d be even better)…

apoelstra commented at 4:08 pm on October 5, 2015: contributor

Very interesting.

The recommendation in NIST SP800-56A / 5.7.1.2 seems dangerous, not only because it will give the same shared secret for ±scalar and ±point, but also because its output has a bunch of algebraic structure that has nothing to do with being a shared secret. Ditto for SECG 3.3.1 which says the same thing.

For what it’s worth, I don’t consider compatibility with OpenSSL to be a goal at all.

gmaxwell commented at 4:35 pm on October 5, 2015: contributor

@ajtowns “x-only” is mostly orthogonal with your ask there– x-only in the sense of this pull is performing the computation from a ‘compressed’ point without the sqrt needed to recover y… normal ECDH could, of course, discard y in the results; as well x-only could recover the y at the end (though losing its performance gains).

We’ve talked about changing the ECDH api to take a hasher callback similar to how we handle the nonce, in order to preserve some flexibility without giving an API that encourages insecure operation.

Likewise with apoelstra, I don’t believe our goal is to replicate other things. In particular, secp256k1 is not a terribly popular curve and will be inherently incompatible with most things on that basis alone. Many defacto (and, in some cases, formalized) standards involve behavior we think may be (and/or have observed to be) gratuitously risky. For example, I’m sure we violate various DSA standards by using derandomized signing, not using SHA1, and providing signatures that pass a strong verifier.

ajtowns commented at 4:25 am on October 7, 2015: none

Is there a reason why discarding y in ECDH is risky? Hashing the ECDH-generated secret makes perfect sense to me (and it’s something the openssl API directly supports, so seems like its widely accepted as a sensible thing to do); but I’m not seeing how being able to generate an alternate public key that results in the same secret is a problem if you have to know the original secret key and original secret anyway?

Ack on compatability not being a goal; it’s just frustrating that this lib’s ECDH seems to be different enough from every other ECDH implementation that you have to go back to multiplying points from scratch to be compatible. :( But yeah, first-world-crypto problems, I guess.

apoelstra commented at 12:57 pm on October 7, 2015: contributor

@ajtowns We don’t have a concrete reason why discarding y is dangerous (and indeed maybe it’s perfectly safe for all real applications). But the consequence “there are two public keys you can give to someone such that they’ll compute the same shared secret” seems like the kind of subtle unexpected thing that people will forget in higher level cryptosystems.

Like @gmaxwell suggested somewhere that maybe somebody would make a system that used an ECDH secret to seed an RNG which checked that it was never seeded with the same key twice (by just blacklisting every public key it had already received). Maybe for like a gambling site users could get predictable outcomes this way.

This is kinda contrived, because like why would the site blacklist the incoming public points instead of the computed secrets, but it illustrates my point that we don’t want surprises. You could say of ECDSA “why does it matter that (r, s) and (r, -s) are both valid signatures when you need to know the actual secret to compute one of them?” and we’ve seen in Bitcoin that this is a malleability vector that prevents chaining together even fairly simple zero-conf transactions.

sipa commented at 1:02 pm on October 7, 2015: contributor

Antony: I think the most important question is: do you have an existing standard to support that uses ECDH over secp256k1, where SHA256(full pubkey) is not sufficient?

ajtowns commented at 2:09 pm on October 7, 2015: none

apoelstra: yeah, that was pretty much what I came up with. which would have convinced me if no one else was implementing ECDH, but since it’s only one bit of additional protection and other implementations and standards just use x alone, compatibility seems better to me.

sipa: sorry, context is that I was reimplementing the onion routing protocol rusty’s proposing for lightning. rusty’s version is in C using this library; I was writing in Python using some OpenSSL-based modules. (Why bother? Multiple independent implementations of new protocols seems to me like a good way to ensure they make sense, YMMV) Getting a compatible result to this library’s ECDH was a pain due to the slightly abnormal inclusion of a bit from y. Changing the protocol to just use SHA256(x) would’ve made things easy in python w/ openssl, but a pain in the C version with this library (unless you guys decide to patch it of course! :). Having a python wrapper for this module would make the code easier, but would mean the implementations weren’t particularly as independent as they could be…

List thread is at http://lists.linuxfoundation.org/pipermail/lightning-dev/2015-October/000248.html fwiw.

apoelstra commented at 3:50 pm on October 7, 2015: contributor

@ajtowns It’s not just “one bit of protection”, which would indeed be irrelevant. It’s the difference between the map point -> shared secret being bijective or not, which is a qualitative change in the nature of the protocol.

Having said that, I think having a custom hasher like with the nonce generation would make everybody happy. I agree that compatibility ought to be possible (even if you have to write your own hash function that throws away y) even if we don’t target that by default.

gmaxwell commented at 7:01 pm on October 7, 2015: contributor

The issue is that it creates a malleable result, where two different points result in the same shared secret. This general shape of issue is known to produce security vulnerabilities in other contexts (e.g. the inherent ecdsa malleability).

While this won’t produce an issue in most contexts that we’re aware of, this is a general primitive and we do not know how it will be used. I can construct a contrived example where this surprising property results in a vulnerability– so I consider it a concern, if not a blocker, and we should be careful and thoughtful.

Contrived example: say you are building a smart contract bonded transaction processing system. Participants are required by the protocol to never replay a specific message type (except if the whole input is identical). The messages are encrypted and look like [ephemeral pubkey][ECIES(message, digital-signature-of-message)]. If a duplicate is created, someone can show the smart contract the two non-identical messages and the destination private key, allowing it to check that they are duplicates and if so releases the bond. Someone sends you a message, you swap the pubkey for the equivalent one, and send two– now technically distinct– messages to the contract and the private key. Of course, this kind of thing can be avoided– e.g. by requiring the keys to have a specified sign; but only by someone who understands and thinks of the issue. So– a risk, if only a small one.

Kagami commented at 12:19 pm on November 5, 2015: none

@sipa

Antony: I think the most important question is: do you have an existing standard to support that uses ECDH over secp256k1, where SHA256(full pubkey) is not sufficient?

See this comment. (Just to notify other participants of this PR.)

sipa commented at 2:57 pm on February 8, 2022: contributor

Vague idea: does using H(pG || qG || (pqG).x) as shared secret not resolve issues around multiple key pubkeys resulting in the same shared secret? That seems like a good idea in any case.

real-or-random commented at 3:13 pm on February 8, 2022: contributor

Oh indeed. Actually it should suffice to hash the “signs” of the y-coordinates of pG and qG (or their XOR even) instead of pG || qG.

sipa commented at 6:14 pm on July 3, 2022: contributor

A new and generalized implementation of this technique: #1118 (internal only, not exposed in API yet).

real-or-random referenced this in commit 145078c418 on Apr 10, 2023

real-or-random added the label performance on May 11, 2023

real-or-random commented at 4:23 pm on May 11, 2023: contributor

Note: The improvement suggested here is compatible with our current ECDH scheme (so it’s strictly a performance improvement), whereas #994 and #1198 add new schemes.

x-only ECDH without sqrt #262