Output Descriptor encoding

Sjors commented at 4:24 pm on January 31, 2020: member

@chris-belcher brought this up on the bitcoin-dev mailing list recently.

I’ve recently been playing around with descriptors, and they are very nice to work with. They should become the standard for master public keys IMO.

One downside is that users can’t easily copy-paste them to and fro to make a watch-only wallet. The descriptors contain parentheses and commas, which stop double-click highlighting from selecting the whole string. Also, the syntax might look scary to newbies. @achow101 wrote in that thread:

The main reasons this was proposed in the first place is because of concerns that users will be unwilling to use or be confused by descriptors. There is a concern that users will not understand the commas, parentheses, brackets, etc. syntax of descriptors and thus only copy part of it. There is also the concern that users will see this code-like syntax and be intimidated by it so they will not want to handle them.

So my (offhanded) suggestion was to encode it in some way to just make it look like some magic string that they need to handle as one unit.

Note, I’m selectively quoting from the thread, so maybe read it yourself :-)

Perhaps we can serialize output descriptors with a key-value map similar to PSBT. For example:

Text: sh(sortedmulti(2,03acd484e2f0c7f65309ad178a9f559abde09796974c57e714c35f110dfc27ccbe,022f01e5e15cca351daff3843fb70f3c2f0a1bdd05e5af888a67784ef3e10a2a01))

JSON:

{
    "sh": {
        "sortedmulti": {
            "threshold": 2,
            "pubkeys": [
                "03acd484e2f0c7f65309ad178a9f559abde09796974c57e714c35f110dfc27ccbe",
                "022f01e5e15cca351daff3843fb70f3c2f0a1bdd05e5af888a67784ef3e10a2a01"
            ]
        }
    }
}
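To illustrate the mapping from the text form to the nested form above, here is a minimal sketch of a parser. It is not how Bitcoin Core parses descriptors; `parse_descriptor` and `_parse_expr` are hypothetical names, and the sketch only handles plain `FUNC(ARG,...)` nesting (no `#checksum` suffix, no validation of keys). The `threshold` and `pubkeys` field names are taken from the JSON above.

```python
def parse_descriptor(text):
    """Parse FUNC(ARG,...) nesting into dicts shaped like the JSON example."""
    node, pos = _parse_expr(text, 0)
    if pos != len(text):
        raise ValueError("trailing characters after descriptor")
    return node


def _parse_expr(text, pos):
    # Read a token up to the next structural character.
    start = pos
    while pos < len(text) and text[pos] not in "(),":
        pos += 1
    token = text[start:pos]

    # No opening parenthesis: this is a leaf (a key or a number).
    if pos == len(text) or text[pos] != "(":
        return token, pos

    pos += 1  # skip "("
    args = []
    while True:
        arg, pos = _parse_expr(text, pos)
        args.append(arg)
        if pos >= len(text):
            raise ValueError("unbalanced parentheses")
        if text[pos] == ",":
            pos += 1
        elif text[pos] == ")":
            pos += 1
            break
        else:
            raise ValueError(f"unexpected character at offset {pos}")

    if token in ("multi", "sortedmulti"):
        # First argument is the threshold, the rest are public keys.
        body = {"threshold": int(args[0]), "pubkeys": args[1:]}
    elif len(args) == 1:
        body = args[0]
    else:
        body = args
    return {token: body}, pos
```

Calling `parse_descriptor` on the text descriptor above should give a dict with the same shape as the JSON, which could then be serialized into whatever binary key-value format is chosen.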

Any xpub would be serialized in its binary form, not in base58.
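As a rough illustration of what "binary form" buys: an xpub is a 78-byte BIP32 payload that Base58Check stretches to about 111 characters. The sketch below decodes the string back to those 78 bytes. The helper names (`b58check_encode`, `b58check_decode`) are my own, and the round trip uses a dummy all-zero key rather than a real xpub.

```python
import hashlib

B58 = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def _checksum(payload: bytes) -> bytes:
    # Base58Check uses the first 4 bytes of double SHA-256.
    return hashlib.sha256(hashlib.sha256(payload).digest()).digest()[:4]


def b58check_encode(payload: bytes) -> str:
    raw = payload + _checksum(payload)
    n = int.from_bytes(raw, "big")
    out = ""
    while n:
        n, rem = divmod(n, 58)
        out = B58[rem] + out
    # Each leading zero byte is written as a literal "1".
    pad = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * pad + out


def b58check_decode(text: str) -> bytes:
    n = 0
    for ch in text:
        n = n * 58 + B58.index(ch)
    pad = len(text) - len(text.lstrip("1"))
    raw = b"\x00" * pad + n.to_bytes((n.bit_length() + 7) // 8, "big")
    payload, check = raw[:-4], raw[-4:]
    if _checksum(payload) != check:
        raise ValueError("bad Base58Check checksum")
    return payload


# Dummy 78-byte payload: version, depth, parent fingerprint, child number,
# chain code, compressed public key. Only the version bytes are real.
dummy = bytes.fromhex("0488b21e") + bytes(9) + bytes(32) + b"\x02" + bytes(32)
encoded = b58check_encode(dummy)
decoded = b58check_decode(encoded)
```

The decoded 78 bytes are what would go into the serialized descriptor; the Base58 checksum becomes redundant once the whole blob carries its own checksum.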

I’m tempted to allow multiple descriptors, and jam slightly more information to this serialized form. In particular wallet age and whether a descriptor is intended as change. That way a wallet backup can consist of a seed phrase plus a single QR code.

Regarding checksums, @achow101 wrote:

Descriptors already have their own BCH code for descriptor checksums optimized for their length and character set. This can be repurposed to be used with whatever encoding scheme is chosen so long as the encoding’s character set is covered by the descriptor checksum character set. The checksum’s character set is fairly large and covers all(?) characters on a standard keyboard so that descriptors could be expanded with other features in the future. Thus it should cover any encoding scheme that is suggested.

We could re-use the checksum for the text representation, but that only makes sense if we encode the literal descriptor string, rather than its “meaning”, as I’m suggesting above.

We can instead use the standard bech32 checksum. BIP173 says that “the chosen code performs reasonably well up to 1023 characters”. Although it adds that “other designs are preferable for lengths above 89 characters”, that seems unnecessary if copy-paste is the main use case for descriptors.
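A minimal sketch of what wrapping a serialized descriptor in bech32 could look like. The polymod, generator constants and character set are the ones from BIP173; the `desc` human-readable part and the function names are hypothetical, and the sketch deliberately ignores BIP173's 90-character length limit, in line with the argument above.

```python
CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"
GENERATOR = [0x3b6a57b2, 0x26508e6d, 0x1ea119fa, 0x3d4233dd, 0x2a1462b3]


def polymod(values):
    # BCH checksum computation as specified in BIP173.
    chk = 1
    for v in values:
        top = chk >> 25
        chk = ((chk & 0x1ffffff) << 5) ^ v
        for i in range(5):
            if (top >> i) & 1:
                chk ^= GENERATOR[i]
    return chk


def hrp_expand(hrp):
    return [ord(c) >> 5 for c in hrp] + [0] + [ord(c) & 31 for c in hrp]


def create_checksum(hrp, data):
    pm = polymod(hrp_expand(hrp) + data + [0] * 6) ^ 1
    return [(pm >> 5 * (5 - i)) & 31 for i in range(6)]


def to_5bit(payload: bytes):
    # Regroup 8-bit bytes into 5-bit values, zero-padding the last group.
    acc = bits = 0
    out = []
    for b in payload:
        acc = (acc << 8) | b
        bits += 8
        while bits >= 5:
            bits -= 5
            out.append((acc >> bits) & 31)
    if bits:
        out.append((acc << (5 - bits)) & 31)
    return out


def encode(hrp: str, payload: bytes) -> str:
    data = to_5bit(payload)
    return hrp + "1" + "".join(CHARSET[d] for d in data + create_checksum(hrp, data))


def verify(text: str) -> bool:
    pos = text.rfind("1")  # "1" is the separator and never a data character
    if pos < 1:
        return False
    hrp, body = text[:pos], text[pos + 1:]
    data = [CHARSET.find(c) for c in body]
    if len(data) < 6 or -1 in data:
        return False
    return polymod(hrp_expand(hrp) + data) == 1


# "desc" is a made-up prefix; the payload stands in for a serialized descriptor.
blob = encode("desc", bytes.fromhex("000102"))
```

The result is a single lowercase alphanumeric string, so double-click selection picks up the whole thing and `verify` rejects a copy that lost or changed a character.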

Sjors added the label Feature on Jan 31, 2020

chris-belcher commented at 6:00 pm on January 31, 2020: contributor

whether a descriptor is intended as change

This field should be an integer rather than a boolean, because that would allow other general address types beyond just receive and change, for example locktime addresses used for creating fidelity bonds.
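A small sketch of the difference, assuming a per-descriptor integer field. Every name and number here (`Role`, `DescriptorEntry`, the values 0, 1 and 2) is made up for illustration; nothing in this thread assigns actual values.

```python
from dataclasses import dataclass
from enum import IntEnum


class Role(IntEnum):
    # Hypothetical assignments, not part of any specification.
    RECEIVE = 0
    CHANGE = 1
    TIMELOCKED = 2  # e.g. locktime addresses for fidelity bonds


@dataclass
class DescriptorEntry:
    descriptor: str
    role: int  # kept as a plain int so unknown values survive a round trip

    def role_name(self) -> str:
        try:
            return Role(self.role).name.lower()
        except ValueError:
            return f"unknown({self.role})"


entries = [
    DescriptorEntry("wpkh(KEY/0/*)", Role.RECEIVE),
    DescriptorEntry("wpkh(KEY/1/*)", Role.CHANGE),
    DescriptorEntry("wsh(SCRIPT)", 7),  # a type this wallet does not know
]
names = [e.role_name() for e in entries]
```

With a boolean, the third entry could not be represented at all; with an integer, a wallet that does not understand the value can still keep it and pass it along.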

Sjors commented at 12:33 pm on January 6, 2021: member

Closing this for now, as everyone seems happy with strings…

Sjors closed this on Jan 6, 2021

DrahtBot locked this on Aug 18, 2022

Output Descriptor encoding #18043