Hi y'all,

In case you weren't already tired of all the recent dev list chatter re post quantum cryptography, here's another!

When the topic of Bitcoin transitioning to a post quantum world is brought up, the discussion typically focuses on the consensus layer re swapping out vulnerable signature schemes. However, the consensus layer isn't the only area of Bitcoin that relies on cryptography that would be broken in the face of a powerful quantum computer!

That's right, I'm talking about BIP 324, the peer to peer encryption BIP for Bitcoin. Like everything else on the Internet today, BIP 324 uses ECDH to allow two connecting peers to derive a shared secret known only to them, which is then used to encrypt all traffic between them. As ECDH relies on Elliptic Curve cryptography, an adversary with a future quantum computer could record a p2p handshake transcript, then derive the private keys behind the ephemeral ECDH public keys, permitting them to decrypt all traffic.

It's actually worse than that, as today adversaries can collect all encrypted p2p Bitcoin traffic, with the hope of being able to decrypt it all at a future date. This is commonly referred to as the "harvest now, decrypt later" (HNDL) strategy [11].

Compared to a consensus change, which requires widespread market agreement and coordination to achieve, upgrading BIP 324 to be post quantum resistant is much lower hanging fruit, worthy of pursuing immediately.

Last week I started thinking a bit about this topic, brushing up on the latest literature/techniques, and stumbled onto a few key design questions. The goal of this post isn't to propose a new concrete p2p encryption BIP, instead I want to start discussion on the various design tradeoffs that came up as I was researching this p2p encryption transition.

## PQ BIP 324 Design Questions

1. Do we want to pursue a hybrid KEM (key encapsulation mechanism), or go with a pure PQ KEM?

2. Is it still a key requirement that the initial handshake be indistinguishable from a random byte string?

   2a. If yes to the above, then should we go with classical-then-pq-upgrade, or a one shot hybrid oblivious KEM?

## A Brief Intro to KEMs + ML-KEM

First, let's introduce the new primitive we have to work with: ML-KEM (Module-Lattice-Based Key-Encapsulation Mechanism) [1][2]. As it says on the tin, ML-KEM is a lattice based Key-Encapsulation Mechanism. The phrase KEM might sound unfamiliar to those comfortable with ECDH, but ECDH is actually a KEM itself.

A KEM has 3 algorithms:

* KeyGen() -> {priv, pub}
  * Generates a public/private key pair.
* Encaps(pub) -> {secret, capsule}
  * Generates a new secret value, and a "capsule", which only the holder of the private key for pub can use to obtain the secret value.
* Decaps(priv, capsule) -> secret
  * Uses the private key to extract the secret from the capsule.

If you squint a bit, then you'll see that ECDH is a KEM, and a rather elegant one at that:

* KeyGen() -> {k, k*G}
  * Normal EC key generation.
* Encaps(pub) -> {secret = x*pub, capsule = x*G}
  * The core ECDH routine. The ephemeral public key is actually the "capsule". The resulting secret is the ECDH output of the remote party's KEM public key and the local ephemeral secret x.
* Decaps(priv, capsule) -> secret = priv*capsule
  * The receiver completes the key exchange using the ephemeral public key and their own private key.

ECIES is another flavor of EC based KEM.

One thing worth noting is that AFAICT, so far in the NIST PQC world [4], there is no known non-interactive key exchange protocol like we enjoy today with ECDH. IIUC, the reason is that lattice based schemes are derived from the LWE [3] problem, whose security is predicated on using "noise" to hide a secret value. For these cryptosystems, usually a type of "hint" is sent to make everything work out nicely like in ECDH. However, in the stricter non-interactive setting (no messages sent), this doesn't map cleanly.
As a result, ML-KEM looks more like a hybrid encryption protocol (Alice encrypts a shared secret to Bob using asymmetric lattice crypto).

## To Hybrid KEM, Or Not to Hybrid KEM

This brings us to our first design question.... Should we use a hybrid KEM or a pure post quantum one?

A hybrid KEM would keep the existing ECDH, _also_ do ML-KEM, then securely combine the results (there's some subtlety there, see [6][7]) into a final secret value for encryption.

A hybrid KEM is attractive as an encryption channel derived from such a KEM is secure if _any_ of the combined schemes is secure. This permits schemes to hedge a bit, as hey, maybe the PQ stuff is actually broken in the future but ECDH isn't. If it's the other way around, then your encryption scheme is still secure.

### Pure ML-KEM P2P Encrypted Handshake

If we opt to not use a hybrid scheme, then the Elligator layer can be dropped altogether. Instead, the 1184-byte (ML-KEM-768) encapsulation key is sent, keeping the trailing garbage+terminator intact.

The initial handshake would look something like:

* Alice -> Bob: alice_encaps || initiator_garbage
  * Alice generates an encapsulation key, and sends it to Bob.
* Bob -> Alice: ml_kem_capsule || responder_garbage || responder_garbage_terminator || first_encrypted_packet
  * Bob uses Alice's encapsulation key to encapsulate a random secret, and sends the capsule over to Alice. He can also encrypt the first message at this point.
* Alice -> Bob: initiator_garbage_terminator || first_encrypted_packet
  * Alice decapsulates the shared secret, and can now also start to encrypt messages.
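
To make the roles in the flow above concrete, here's a rough Python sketch of who runs which KEM algorithm in each flight. Python's standard library has no ML-KEM, so a toy finite-field Diffie-Hellman KEM stands in for ML-KEM-768. The `ToyDHKem` class, the `P`/`G` parameters, and `run_handshake` are all made up for illustration, and none of this is secure. Garbage, terminators, and the encrypted packets are left out.

```python
import secrets

# Toy parameters: INSECURE, illustration only. A real v3 transport would
# use ML-KEM-768 here; this stand-in just has the same three-call shape.
P = 2**255 - 19  # a well-known prime, used as a plain DH modulus
G = 2


class ToyDHKem:
    """Hypothetical stand-in KEM exposing KeyGen / Encaps / Decaps."""

    @staticmethod
    def keygen():
        priv = secrets.randbelow(P - 2) + 1
        return priv, pow(G, priv, P)

    @staticmethod
    def encaps(pub):
        # A fresh ephemeral secret x: the capsule is G^x, the secret is pub^x.
        x = secrets.randbelow(P - 2) + 1
        return pow(pub, x, P), pow(G, x, P)

    @staticmethod
    def decaps(priv, capsule):
        return pow(capsule, priv, P)


def run_handshake(kem=ToyDHKem):
    # Flight 1, Alice -> Bob: alice_encaps (initiator garbage omitted).
    alice_priv, alice_encaps = kem.keygen()

    # Flight 2, Bob -> Alice: the capsule (garbage + terminator omitted).
    bob_secret, capsule = kem.encaps(alice_encaps)

    # Flight 3: Alice decapsulates, after which both sides hold the secret.
    alice_secret = kem.decaps(alice_priv, capsule)
    return alice_secret, bob_secret


alice_secret, bob_secret = run_handshake()
print("secrets match:", alice_secret == bob_secret)
```

Note that only Alice ever generates a key pair here, while Bob only encapsulates. That asymmetry is the main difference from the symmetric "both sides send a pubkey" shape of plain ECDH.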

We'd then replace `v2_ecdh` with something like a `v3_mlkem` that derives the final shared secret based on the sent/received transcript up until that point:

* `sha256_tagged("bip324_ml_kem", ml_kem_secret, alice_encaps, ml_kem_capsule)`

### Hybrid ML-KEM P2P Encrypted Handshake

If we want to use a hybrid combiner, then alongside the normal ellswift keys, the ML-KEM-768 encapsulation key is also sent:

* Alice -> Bob: ellswift_alice || alice_encaps || initiator_garbage
* Bob -> Alice: ellswift_bob || ml_kem_capsule || responder_garbage || responder_garbage_terminator || first_encrypted_packet
* Alice -> Bob: initiator_garbage_terminator || first_encrypted_packet

Then, following the guidelines of [7], we'd replace `v2_ecdh` with something like `v3_hybrid_shared_secret`:

* `sha256_tagged("bip324_ellswift_xonly_ecdh_mlkem_768", ml_kem_ss, ecdh_point_x32, alice_encaps, ml_kem_capsule, ellswift_alice, ellswift_bob)`

## PQ/Hybrid Obfuscated KEMs

At this point, those that are familiar with BIP 324 will recognize that both the pure PQ and hybrid versions render the ElligatorSwift usage pretty much useless. ElligatorSwift encodes a 32-byte public key as a 64-byte value which is indistinguishable from a uniformly distributed bitstream. In isolation, this means that the initial BIP 324 handshake just looks like random bytes to a 3rd party observer.

However, with the introduction of ML-KEM, the ML-KEM encapsulation key is sent in plaintext over the wire. An ML-KEM key has identifiable structure, as it's a giant vector of polynomial coefficients mod 3329, which is easily recognizable over the wire.

Luckily, there's an ML-KEM analogue to ElligatorSwift, called Kemeleon [8][9][10]! In a similar fashion to ElligatorSwift, it takes an ML-KEM public key, then encodes it as one giant integer, utilizing rejection sampling. Kemeleon applies this mapping both to the encapsulation keys, and also the capsule ciphertext that encrypts the shared secrets.
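
As a rough illustration of the "one giant integer plus rejection sampling" idea, here's a simplified Python sketch. It isn't the Kemeleon encoding from [8]: it ignores the seed portion of the encapsulation key and the ciphertext handling entirely, it feeds in random coefficients rather than a real ML-KEM key, and all function names are made up. It only shows the accumulate-then-reject step under those assumptions.

```python
import secrets

Q = 3329        # ML-KEM modulus
N_COEFFS = 768  # ML-KEM-768: 3 polynomials of 256 coefficients each

# Accepted values are those below 2**OUT_BITS, i.e. with the top bit clear.
OUT_BITS = (Q**N_COEFFS - 1).bit_length() - 1
OUT_BYTES = (OUT_BITS + 7) // 8


def encode(coeffs):
    """Pack coefficients into one big base-Q integer; None means 'reject'."""
    if len(coeffs) != N_COEFFS or not all(0 <= c < Q for c in coeffs):
        raise ValueError("expected N_COEFFS coefficients in [0, Q)")
    acc = 0
    for c in reversed(coeffs):
        acc = acc * Q + c
    if acc >> OUT_BITS:  # top bit set: caller must retry with fresh input
        return None
    # Simplification: if OUT_BITS isn't a multiple of 8, the leading bits of
    # the first byte are always zero. A real wire format has to handle that.
    return acc.to_bytes(OUT_BYTES, "big")


def decode(blob):
    """Inverse of encode: peel the coefficients back off in base Q."""
    acc = int.from_bytes(blob, "big")
    coeffs = []
    for _ in range(N_COEFFS):
        acc, c = divmod(acc, Q)
        coeffs.append(c)
    return coeffs


def encode_fresh():
    """Mimic 'regenerate until accepted', using random stand-in coefficients."""
    while True:
        coeffs = [secrets.randbelow(Q) for _ in range(N_COEFFS)]
        blob = encode(coeffs)
        if blob is not None:
            return coeffs, blob
```

The retry loop in `encode_fresh` is where the extra key generation cost comes from: any key whose packed integer lands in the rejected range has to be thrown away and regenerated.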
The ML-KEM keys end up being a bit smaller, while the ciphertexts map to a larger value. Another tradeoff is that the Kemeleon key generation is ~3x slower than normal ML-KEM generation.

One thing to note here is that Kemeleon's "looks random" property isn't quite on the same footing as ElligatorSwift's. ElligatorSwift is statistically indistinguishable from random, since every 512-bit string is a valid encoding. Kemeleon's indistinguishability is computational, resting on a Module-LWE style assumption. So if you naively concatenate an ElligatorSwift key and a Kemeleon key, the pair is only as obfuscated as the weakest visible half. This asymmetry is what motivates the OEINC construction discussed below.

This brings us to our second design question.... Do we still want to ensure that the BIP 324 handshake is indistinguishable from a pseudorandom bytestream from the very first message?

Assuming yes, then AFAICT, we have three classes of options here:

1. Retain the existing BIP 324 outer ElligatorSwift handshake, but use ML-KEM within that initial encrypted transport to upgrade to a PQ shared secret.

2. Use the Outer Encrypts Inner Nested Combiner (OEINC - "OINK") from [8].

3. Attempt to adapt Drivel from [8] into the Bitcoin p2p setting.

### Classical Encrypted Channel Upgrades to PQ

With the first option, we simply use one KEM right after the other. So BIP 324 v2 would be mostly unchanged, then we _upgrade_ to BIP 324 v3 within v2.

A sketch of this would be something like:

* Phase 0: normal BIP 324 handshake
* Phase 1: negotiation of PQ KEM scheme over the encrypted channel
  * Can be optional, if we just pick a set PQ KEM scheme.
  * Before the upgrade completes, no Bitcoin p2p message should be sent, as the channel isn't PQC protected yet.
* Phase 2: do normal ML-KEM within the ElligatorSwift derived encrypted transport
  1. Alice sends the encapsulation key
  2. Bob generates a secret and encapsulates it using the encapsulation key
  3. Both sides then derive a PQ shared secret, ss_PQ
* Phase 3: both sides use a hybrid combiner like the one sketched out above to derive a new set of transport keys
* Phase 4: both sides rekey, switching over to the new transport keys

The upside of this option is that the outer part of BIP 324 remains unchanged, then with another round trip, we're able to upgrade the encryption keys to PQ hybrid security. The downside is that the very first messages sent aren't PQ from the start, but a quantum adversary wouldn't be able to decrypt the actual Bitcoin p2p messages (as we wait to send those until the upgrade). The handshake still looks like just random bytes.

### Outer Encrypts Inner Nested Combiner

For the second option, [8] (with talk video [9] and slides [10]) describes the OEINC scheme, where the outer KEM encrypts the inner KEM: the ciphertext of the inner KEM is encrypted using a key derived from the outer KEM's shared secret. The two KEM ciphertexts and the two derived keys are then used alongside a hybrid combiner to derive a final shared secret.

Unlike the classical-then-pq-upgrade, which establishes a classical channel and then uses that to upgrade to a PQ channel, OEINC is a special hybrid combiner that achieves a similar output in one shot. It defines a special KEM, which can then be used as the KEM in the very first handshake I sketched out.

A sketch of this KEM looks something like:

* Setup:
  * The outer KEM is BIP 324's ElligatorSwift-encoded secp256k1 DHKEM.
    * It serves as the outer KEM because its on-wire encoding is statistically indistinguishable from random.
  * The inner KEM is ML-Kemeleon.
* KeyGen():
  * (kem_secret_outer, kem_pubkey_outer) = outKEM.Gen()
  * (kem_secret_inner, kem_pubkey_inner) = inKEM.Gen()
  * combined_pubkey = (kem_pubkey_outer, kem_pubkey_inner)
  * combined_secret = (kem_secret_outer, kem_secret_inner)
* Encaps(combined_pubkey):
  * (shared_secret_outer, capsule_outer) = outKEM.Encaps(kem_pubkey_outer)
  * (encrypt_key_1, encrypt_key_2) = KDF(shared_secret_outer)
  * (shared_secret_inner, capsule_inner) = inKEM.Encaps(kem_pubkey_inner)
  * encrypted_capsule_inner = encrypt(encrypt_key_1, capsule_inner)
  * combined_capsule = capsule_outer || encrypted_capsule_inner
  * combined_shared_secret = combine(encrypt_key_2, shared_secret_inner, combined_capsule)
* Decaps(combined_secret, combined_capsule):
  * (capsule_outer, encrypted_capsule_inner) = combined_capsule
  * shared_secret_outer = outKEM.Decaps(kem_secret_outer, capsule_outer)
  * (encrypt_key_1, encrypt_key_2) = KDF(shared_secret_outer)
  * capsule_inner = decrypt(encrypt_key_1, encrypted_capsule_inner)
  * shared_secret_inner = inKEM.Decaps(kem_secret_inner, capsule_inner)
  * combined_shared_secret = combine(encrypt_key_2, shared_secret_inner, combined_capsule)

This is done, rather than just sending the two encapsulated secrets plainly as I outlined above, in order to achieve a stronger security notion. The issue with this is that although ciphertext uniformity (the encapsulated secrets) is achieved, the two public keys sent are random looking, but not uniformly so. In practice, this might not really matter much AFAICT (a theoretical adversary would be able to distinguish the Elligator half from the Kemeleon half).

### Drivel: PQ-Obfuscated Authentication

The biggest issue with Drivel as a fit for BIP 324 is that it expects the initiator to already know a long term static public key for the responder. In the case of BIP 324, only ephemeral keys are exchanged, so there are no long term public keys known to either side.
To get around this, we could extend BIP 155 (or more likely make a new one, given size limits) to include a signed OKEM key. However, that would introduce authentication into the combined set, which explicitly wasn't a design goal of BIP 324.

With that caveat in mind, here's the construction itself. Drivel [8] combines the OEINC scheme with another layer that out of the box assumes an asymmetric protocol between a fixed client and server. The client uses an existing OEINC KEM public key published by the server to encrypt a fresh ephemeral KEM key.

-----

So there we have it. Before drafting a concrete v3 transport, we need to decide if we want a hybrid KEM, or are fine with a pure PQ KEM. Then we need to decide if we want to attempt to maintain the current property where the p2p handshake transcript is indistinguishable from random. If yes, then that forces another series of decisions re how to construct/compose an oblivious KEM from available primitives.

At a glance, the route of classical-then-pq-upgrade seems to be the simplest. BIP 324 stays as is, then we run ML-KEM within that. The ML-KEM keys are encrypted, so there's no need to sprinkle in the layer of Kemeleon.

If we want a nice combined protocol, then we should investigate the OEINC route. It's more data to send as part of the initial handshake, but we still keep ElligatorSwift and use that as the outer KEM.

If for some reason we're concerned with a future adversary gaining a distinguisher for Kemeleon, then maybe we need to bite the bullet and also roll out a full blown PQ authentication protocol alongside everything.

One thing worth flagging for any of the byte-0 designs (where PQ material is sent in the clear on the very first flight, like the hybrid and OEINC sketches above): ML-KEM-768 makes the responder do real work before it can decide if a connection is even legit. Today, the responder only needs the first 64 bytes of an ElligatorSwift share before it can derive the shared secret.
With ML-KEM-768, the responder has to read and validate a 1184 byte encapsulation key before running Encaps, and FIPS 203 mandates input checks on every Encaps and Decaps. In a permissionless P2P network, that's a meaningful change in inbound DoS surface, and probably calls for stricter handshake byte limits, tighter timeouts, and possibly some form of stateless cookie/puzzle if handshake floods become a real problem. The classical-then-pq-upgrade path sidesteps most of this, since the PQ material only shows up after the v2 channel is up.

With all that said, after the above design decisions are addressed, there aren't too many concrete blockers here w.r.t. rolling this out. Of course the development (e.g. selecting/creating a library for ML-KEM and maybe ML-Kemeleon) and upgrade will take some time. But unlike the consensus layer, p2p encryption doesn't require the widespread market agreement that an actual soft fork does.

BIP 324 is a much shorter walk to PQ than the consensus layer, and serves as a sort of PQ warm up before the bigger soft fork is tackled.

-- Laolu

[1]: https://en.wikipedia.org/wiki/ML-KEM
[2]: https://csrc.nist.gov/pubs/fips/203/final
[3]: https://en.wikipedia.org/wiki/Learning_with_errors
[4]: This statement ignores Isogeny based crypto, and also SWOOSH [5] as it requires 200 KB pubkeys
[5]: https://eprint.iacr.org/2023/271
[6]: https://eprint.iacr.org/2018/024
[7]: https://eprint.iacr.org/2020/1364
[8]: https://eprint.iacr.org/2024/1086
[9]: https://www.youtube.com/watch?v=CvFCYUq5rGg
[10]: https://csrc.nist.gov/csrc/media/Presentations/2025/kemeleon/images-media/kemeleon.pdf
[11]: https://en.wikipedia.org/wiki/Harvest_now,_decrypt_later