This issue is for tracking and brainstorming the implementation of BIP 77 Async Payjoin in the wallet.
- HPKE #32617
- The rest :)
Async Payjoin uses an external Directory server to allow the Sender and Receiver to communicate with each other without needing to operate any infrastructure. The servers are reached using RFC 9458 Oblivious HTTP (OHTTP) relays to avoid revealing the IPs of the Sender and Receiver to the Directory. All communication between the Sender or Receiver and the OHTTP Relay is encrypted, as described in the OHTTP RFC. Additionally, the payload data itself is encrypted with a method similar to OHTTP's to prevent the Directory from being able to read the transaction data.
The cryptography uses things that we already have in the project. Specifically, all keys used are EC keys on secp256k1, keys are exchanged using DH, and encryption is done with ChaCha20Poly1305. Pubkeys included directly in the payload (ephemeral keys used to do additional key exchanges for responses) are encoded with ElligatorSwift. All of these algorithms are already included in the project to support BIP 324, so it should not be difficult to reuse them for Async Payjoin.
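To make the key exchange part concrete, here is a minimal sketch of DH on secp256k1 in pure Python. It is illustrative only: the arithmetic is not constant time, the helper names (`generate_keypair`, `shared_secret`) are made up for this example, and hashing the x-coordinate with SHA-256 is an assumption rather than the derivation BIP 77 or BIP 324 actually specify. In the wallet this would go through the existing libsecp256k1 code instead.

```python
import hashlib
import secrets

# secp256k1 domain parameters (public constants of the curve y^2 = x^3 + 7).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (
    0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
    0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8,
)


def point_add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None  # a == -b
    if a == b:
        slope = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        slope = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (slope * slope - x1 - x2) % P
    y3 = (slope * (x1 - x3) - y1) % P
    return (x3, y3)


def scalar_mult(k, point):
    """Double-and-add scalar multiplication (NOT constant time)."""
    result = None
    addend = point
    while k:
        if k & 1:
            result = point_add(result, addend)
        addend = point_add(addend, addend)
        k >>= 1
    return result


def generate_keypair():
    """Return (secret scalar, public point) for a fresh ephemeral key."""
    secret = secrets.randbelow(N - 1) + 1
    return secret, scalar_mult(secret, G)


def shared_secret(my_secret, their_pubkey):
    """Assumed derivation: SHA-256 of the x-coordinate of the DH point."""
    x, _ = scalar_mult(my_secret, their_pubkey)
    return hashlib.sha256(x.to_bytes(32, "big")).digest()


# Both sides arrive at the same 32-byte secret from their own private key
# and the other party's public key.
sender_secret, sender_pub = generate_keypair()
receiver_secret, receiver_pub = generate_keypair()
assert shared_secret(sender_secret, receiver_pub) == shared_secret(receiver_secret, sender_pub)
```

The resulting 32 bytes would then key ChaCha20Poly1305. That step is left out here because it is not in the Python standard library.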
OHTTP itself is encrypted using the aforementioned cryptography. It uses plain HTTP, not HTTPS, so there is no need to include anything related to TLS.
The general operation of Async Payjoin is:
- The Receiver produces a Bitcoin URI containing a parameter which includes the URL of a mailbox endpoint on a Directory server (i.e. some URL for the Sender to POST to). The URL additionally contains the public key of the Directory itself, and an ephemeral(ish) public key of the Receiver.
- The Receiver provides the URI to the Sender out of band.
- The Sender generates an ephemeral key and performs a key exchange with the Receiver’s ephemeral key to compute a shared secret. This shared secret is used with ChaCha20-Poly1305 to encrypt:
  - A new ephemeral pubkey for the key exchange for the reply (Reply Key)
  - The Original PSBT which pays the Receiver
 
- The Sender connects to the specified Directory server via an OHTTP Relay, performing a DH key exchange with the Directory public key, and encrypting the ElligatorSwift encoding of their Reply Key and the encrypted payload produced in the previous step. The payload is POSTed to the specified mailbox endpoint.
- The Receiver polls the Directory server at the mailbox endpoint until it receives the Sender’s payload.
- The Receiver performs DHKE with the Sender’s ephemeral key, then decrypts the encrypted payload.
- The Receiver produces a Proposal PSBT which is the Sender’s Original PSBT modified with the Receiver’s inputs and outputs, as well as any necessary signatures and witnesses.
- The Receiver generates a new ephemeral key to perform DHKE with the Sender’s Reply Key, and uses the shared secret to encrypt their Proposal PSBT.
- The Receiver connects to the previously specified Directory via an OHTTP Relay, and provides an encapsulated payload of the ElligatorSwift encoding of their ephemeral key. The payload is POSTed to a new mailbox endpoint derived from the Sender’s Reply Key.
- The Sender polls the Directory server at the mailbox endpoint derived from their Reply Key until it receives the Receiver’s payload.
- The Sender performs DHKE with the Receiver’s new ephemeral key, decrypts the Proposal PSBT, validates and signs it. The Sender broadcasts the final transaction.
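The mailbox routing in the steps above can be sketched with an in-memory stand-in for the Directory. Everything here is hypothetical: `InMemoryDirectory`, `mailbox_id`, and the helper functions are names invented for this example, the mailbox name derivation (truncated SHA-256 of the key) is an assumption, and encryption and OHTTP encapsulation are left out so that payloads are just opaque bytes.

```python
import hashlib
from collections import defaultdict, deque

KEY_LEN = 64  # ElligatorSwift encodings are 64 bytes


class InMemoryDirectory:
    """Hypothetical stand-in for a Directory: opaque payloads per mailbox."""

    def __init__(self):
        self._mailboxes = defaultdict(deque)

    def post(self, mailbox, payload):
        self._mailboxes[mailbox].append(payload)

    def poll(self, mailbox):
        """Return the oldest payload in the mailbox, or None if it is empty."""
        queue = self._mailboxes[mailbox]
        return queue.popleft() if queue else None


def mailbox_id(pubkey):
    # Assumption for illustration: name the mailbox by hashing the key.
    return hashlib.sha256(pubkey).hexdigest()[:16]


def sender_send(directory, receiver_key, reply_key, original_psbt):
    """Sender posts the Reply Key and Original PSBT to the Receiver's mailbox."""
    directory.post(mailbox_id(receiver_key), reply_key + original_psbt)


def receiver_respond(directory, receiver_key, make_proposal):
    """Receiver polls once; if a request is waiting, reply to the Reply Key mailbox."""
    message = directory.poll(mailbox_id(receiver_key))
    if message is None:
        return False  # nothing yet; the caller should poll again later
    reply_key, original_psbt = message[:KEY_LEN], message[KEY_LEN:]
    directory.post(mailbox_id(reply_key), make_proposal(original_psbt))
    return True


def sender_collect(directory, reply_key):
    """Sender polls the mailbox derived from its own Reply Key."""
    return directory.poll(mailbox_id(reply_key))


# Placeholder keys and PSBT bytes, not real encodings.
directory = InMemoryDirectory()
receiver_key = b"\x01" * KEY_LEN
reply_key = b"\x02" * KEY_LEN

sender_send(directory, receiver_key, reply_key, b"original-psbt")
receiver_respond(directory, receiver_key, lambda psbt: psbt + b"+receiver-inputs")
proposal = sender_collect(directory, reply_key)
```

The point of the sketch is that the Directory only ever sees mailbox names and opaque blobs, and that the reply mailbox is determined by a key only the Sender and Receiver know.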
Additionally, the original Bitcoin URI contains an expiration time. Both the Sender and Receiver use it as a timeout for polling the Directory. The Receiver can also broadcast the Original PSBT, and the Sender should stop polling if it sees that transaction broadcast.
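A rough sketch of what polling with that expiration could look like. The function name `poll_until`, the 5 second default interval, and the `cancelled` hook (e.g. "the Original PSBT was seen broadcast") are all assumptions for illustration. The interval in particular is one of the open questions in this issue. The clock and sleep functions are injectable so the loop can be exercised without waiting.

```python
import time


def poll_until(fetch, expires_at, interval=5.0, cancelled=lambda: False,
               clock=time.time, sleep=time.sleep):
    """Call fetch() until it returns a payload, the URI expires, or we are cancelled.

    fetch      -- callable returning the payload bytes, or None if the mailbox is empty
    expires_at -- expiration time from the URI, in the same units as clock()
    cancelled  -- callable returning True when polling should stop early
    """
    while clock() < expires_at and not cancelled():
        payload = fetch()
        if payload is not None:
            return payload
        remaining = expires_at - clock()
        if remaining <= 0:
            break
        # Never sleep past the expiration time.
        sleep(min(interval, remaining))
    return None


# Example with a fake clock: the payload shows up once 10 "seconds" have passed.
now = [0.0]


def fake_sleep(seconds):
    now[0] += seconds


def fake_fetch():
    return b"proposal" if now[0] >= 10 else None


result = poll_until(fake_fetch, expires_at=60, clock=lambda: now[0], sleep=fake_sleep)
```

A `None` return means the session expired or was cancelled, which is where the GUI or CLI would need to tell the user.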
There are a few questions that need to be answered for implementing Async Payjoin:
- If we are the Receiver, how do we get the pubkey of the Directory? Some suggestions are:
  - The user gets it out of band somehow.
  - OHTTP specifies a way to do key discovery by querying the server directly without OHTTP. However, this both requires TLS and reveals our IP unless some other IP hiding method is used.
  - A set of keys is hardcoded into the software. (ew)
 
- How do we choose which OHTTP relay to use?
- What is the polling interval?
- What shows in the GUI when we are waiting for and polling the Directory, for both Receivers and Senders?
- We should not be automatically signing transactions for our users:
  - How will Senders using the GUI be notified and prompted when a reply is received so that the PSBT can be signed?
  - How will Senders using the CLI know when a reply was received and that a PSBT needs to be signed?