Feature discussion: partial descriptors/miniscript

sipa commented at 4:39 pm on January 20, 2022: member

So far, we effectively have two levels of “solvability” in descriptors:

addr() and raw() which can encode scriptPubKey with effectively no solvability information at all
Everything else, which encodes scriptPubKeys that are fully solvable.

I’ve long held that there shouldn’t be any level in between, in the sense that you shouldn’t be using a descriptor if you don’t know everything about its construction. I’m starting to think that’s not a tenable position:

Already, key origin information - which may be no less critical than other parts of descriptors for certain roles - is optional. There is no way to force that information to be included, because it may not exist (for non-BIP32-derived keys, for example), but it may also simply be omitted.
#23480 proposes the introduction of a descriptor like tr(), except it encodes the post-tweak key rather than the inner key. This too may be because the information doesn’t exist, but it cannot prevent situations where it is simply omitted.
It has been suggested before (e.g. #21365 (comment)) to permit tr() descriptors to contain hashes for omitted script subtrees. That too could represent non-existent information (Merkle path entries that are chosen rather than actual hashes), or omitted information. And it is useful - there are certainly scenarios where one might want to participate in, and sign for, taproot outputs for which certain subtrees are not known.
If we’re going to accept that the previous 3 examples will occur, we might as well bite the bullet and also add a way to encode “pkh with known keyhash, but not known key”.

If all these mechanisms become permitted in descriptors (and miniscript), we get the nice logistical benefit of being able to represent all information extracted from PSBT/SPKM/scripts/… into a descriptor regardless of what is present or missing. I believe this would logistically simplify things later on too: Miniscript would not need separate instantiations for its descriptor-decoding use case and its signing/solving use case. Perhaps even further out, this can mean that many fields in SignatureData are replaced with just a descriptor object (everything except signatures/preimages), and the signing logic becomes a descriptor method.

Straw man proposal:

Do #23480 (adding rawtr(KEY)), for P2TR outputs with specified tweaked key, but no specified internal key or script tree.
Add a rawnode(HEX) fragment, only usable inside P2TR script tree expressions, indicating a tree node with specified hash, but no specified subtree.
Add a rawpkh(HEX) fragment, usable wherever pkh(KEY) is usable, indicating a PKH script (DUP HASH160 <hex> EQUALVERIFY CHECKSIG), without specified public key. Post-miniscript this would also add a rawpk_h(HEX) fragment, corresponding to pk_h(KEY).
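
To make the rawpkh(HEX) item concrete, here is a minimal Python sketch of the script such a fragment would encode. The helper name rawpkh_script is hypothetical and not part of any existing implementation; only the opcode layout (DUP HASH160 <hex> EQUALVERIFY CHECKSIG) is taken from the proposal above.

```python
# Standard Bitcoin Script opcode values.
OP_DUP = 0x76
OP_HASH160 = 0xA9
OP_EQUALVERIFY = 0x88
OP_CHECKSIG = 0xAC


def rawpkh_script(keyhash_hex: str) -> bytes:
    """Build the PKH script for a known 20-byte key hash (hypothetical helper)."""
    keyhash = bytes.fromhex(keyhash_hex)
    if len(keyhash) != 20:
        raise ValueError("a HASH160 output must be exactly 20 bytes")
    # For pushes of 1..75 bytes, the push opcode is simply the data length.
    push = bytes([len(keyhash)])
    return (
        bytes([OP_DUP, OP_HASH160])
        + push
        + keyhash
        + bytes([OP_EQUALVERIFY, OP_CHECKSIG])
    )


# Placeholder hash (all zero bytes), used only for illustration.
script = rawpkh_script("00" * 20)
print(script.hex())
```

The script is fully determined by the hash alone, which is why the fragment can exist without the public key; the key is only needed later, at signing time.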

sipa added the label Feature on Jan 20, 2022

sipa commented at 4:39 pm on January 20, 2022: member

Ping @sanket1729 @apoelstra @darosior @achow101, others.

darosior commented at 5:09 pm on January 20, 2022: member

I think this makes sense. Already mentioned elsewhere but to be noted here: the Rust implementation of Miniscript/Descriptors allows hashes inside pkh()/pk_h().

sipa commented at 5:13 pm on January 20, 2022: member

@darosior It’s my understanding that this is only the case if you instantiate it with a key type that does so, and not generally true for the descriptor language implementation. This is (not exactly the same, but similarly) true for the C++ code as well - e.g. the signing instantiation uses a key type where all keys are actually key hashes.

So to be clear, this isn’t about the implementation(s), which already have some affordances for permitting things like this internally; only about the descriptor language (and miniscript-as-used-in-descriptors) specification.

darosior commented at 5:25 pm on January 20, 2022: member

It’s my understanding that this is only the case if you instantiate it with a key type that does so, and not generally true for descriptors

It’s more complicated. It is the default to parse pk_h() descriptors for “raw” public keys using their hash. However it’s the default to parse pk_h() descriptors for xpubs using the xpub itself. EDIT: the default in the xpub context is also to parse pkh() containing raw keys by their key, not its hash. I think how convoluted the implicit parsing of keyhash inside pkh() is makes a point for having explicit rawpk() descriptors.

apoelstra commented at 6:03 pm on January 20, 2022: contributor

concept ACK. @darosior after some IRC discussion, I think what this might look like is to replace our current PkH(Pk::Hash) with something like PkH(Pk::Hash, Option<Pk>). That is, make it explicit that pubkeys may be available, or not, and that this may be different for different pk_hs in the same descriptor.

Similarly taptrees would be (TapTreeHash, Option<TapTree>) etc
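
A rough Python analogue of the shape described here may help readers who do not follow the Rust code. This is only a sketch: the class and field names below are hypothetical, and the real rust-miniscript types may look quite different.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PkH:
    """Hypothetical analogue of PkH(Pk::Hash, Option<Pk>)."""
    key_hash: bytes                 # always present: the hash committed to in the script
    pubkey: Optional[bytes] = None  # present only when the unhashed key is known


@dataclass(frozen=True)
class TapNode:
    """Hypothetical analogue of (TapTreeHash, Option<TapTree>)."""
    node_hash: bytes                # always present: the Merkle node hash
    subtree: Optional[str] = None   # known subtree expression, if any


def unknown_keys(fragments: List[PkH]) -> List[bytes]:
    """Return the key hashes whose public key is not available."""
    return [f.key_hash for f in fragments if f.pubkey is None]


# Placeholder values for illustration only.
a = PkH(key_hash=bytes.fromhex("aa" * 20), pubkey=b"\x02" + b"\x11" * 32)
b = PkH(key_hash=bytes.fromhex("bb" * 20))
print([h.hex() for h in unknown_keys([a, b])])
```

Making the optional part explicit per fragment is what lets one descriptor mix known and unknown keys.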

sanket1729 commented at 1:54 am on January 21, 2022: contributor

we get the nice logistical benefit of being able to represent all information extracted from PSBT/SPKM/scripts/… into a descriptor regardless of what is present or missing. …and the signing logic becomes a descriptor method

+1. While implementing taproot miniscript descriptors, I found this wanting. Inferring descriptors from PSBTs is a very useful property to simplify signing/finalizing implementations.

Add a rawnode(HEX) fragment, only usable inside P2TR script tree expressions, indicating a tree node with specified hash, but no specified subtree.

I believe we discussed hidden(HEX), but I like that rawnode is more in the spirit of raw. I like the overall idea that raw* means some additional special information might be required at signing.

JeremyRubin commented at 10:48 pm on February 2, 2022: contributor

It has been suggested before (though I can’t find a reference now)

c/f N2KB taproot paths #21365 (comment)

sipa commented at 10:50 pm on February 2, 2022: member

@JeremyRubin Thanks, included in description.

sipa commented at 5:46 pm on February 13, 2022: member

One more possibility: a rawleaf(HEXSCRIPT[,LEAFVERSION]) fragment inside tr() to represent leaves of unknown version/script.

sanket1729 commented at 9:28 pm on February 13, 2022: contributor

I don’t see any reason for not having this, but what additional information would rawleaf(HEXSCRIPT[,LEAFVERSION]) provide over rawnode(HEX)?

In a PSBT workflow, parties can provide a mapping control_block -> (leaf_script, leaf_version) to convey the same information.

sipa commented at 9:31 pm on February 13, 2022: member

@sanket1729 Well, exactly! It can be conveyed using PSBT, why shouldn’t it be representable using descriptors? rawleaf contains information that rawnode cannot represent: the preimage of the node hash, in case it is a leaf.
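
The relationship between the two fragments can be sketched in Python using the tagged hashes defined in BIP340/BIP341. The function names are illustrative, and the opaque sibling hash is a placeholder; the point is that a (script, leaf version) pair can always be reduced to a node hash, while the reverse is not possible.

```python
import hashlib


def tagged_hash(tag: str, msg: bytes) -> bytes:
    """BIP340 tagged hash: SHA256(SHA256(tag) || SHA256(tag) || msg)."""
    tag_digest = hashlib.sha256(tag.encode()).digest()
    return hashlib.sha256(tag_digest + tag_digest + msg).digest()


def compact_size(n: int) -> bytes:
    """Bitcoin CompactSize encoding, limited here to values below 2**16."""
    if n < 0xFD:
        return bytes([n])
    if n <= 0xFFFF:
        return b"\xfd" + n.to_bytes(2, "little")
    raise ValueError("script too long for this sketch")


def leaf_hash(script: bytes, leaf_version: int = 0xC0) -> bytes:
    """Node hash of a leaf: what rawleaf(HEXSCRIPT[,LEAFVERSION]) would commit to."""
    return tagged_hash(
        "TapLeaf", bytes([leaf_version]) + compact_size(len(script)) + script
    )


def branch_hash(a: bytes, b: bytes) -> bytes:
    """Node hash of an inner node; the two children are sorted before hashing."""
    left, right = sorted((a, b))
    return tagged_hash("TapBranch", left + right)


known = leaf_hash(bytes.fromhex("51"))  # a one-byte script (OP_1), for illustration
opaque = bytes.fromhex("ab" * 32)       # placeholder for a rawnode(HEX) hash
root = branch_hash(known, opaque)
print(root.hex())
```

A tree containing one known leaf and one opaque node still yields a well-defined Merkle root, so the output can be recognized and the known branch spent.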

sipa commented at 1:46 pm on February 14, 2022: member

Hmm, taking this reasoning to its logical conclusion, I think we also should have rawsh() (for P2SH with specified scripthash), rawwsh() (for P2WSH with specified scripthash) and rawwpkh() (for P2WPKH with specified pubkey hash).

sipa commented at 2:28 pm on February 14, 2022: member

Actually, rawsh, rawwsh, and rawwpkh don’t add anything over straight up addr when used at the top level. For rawwsh and rawsh inside sh, we could instead just permit raw inside sh and wsh (I think permitting addr inside sh and wsh would be misleading).
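
As a small illustration of what permitting raw inside wsh would express, the Python sketch below derives the P2WSH scriptPubKey from an explicit witness script. The helper name is hypothetical, and wsh(raw(HEX)) is the syntax being floated in this thread, not something descriptors accept today.

```python
import hashlib


def p2wsh_script_pubkey(witness_script: bytes) -> bytes:
    """scriptPubKey for P2WSH: OP_0 followed by a 32-byte push of SHA256(script)."""
    return b"\x00\x20" + hashlib.sha256(witness_script).digest()


# Placeholder witness script (a single OP_1), for illustration only.
witness_script = bytes.fromhex("51")
spk = p2wsh_script_pubkey(witness_script)
print(spk.hex())
```

An addr() descriptor could carry the same scriptPubKey, but not the witness script that produced it; that extra piece is what the nested form would add.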

ryanofsky commented at 4:40 pm on February 24, 2022: member

I’ve long held that there shouldn’t be any level in between, in the sense that you shouldn’t be using a descriptor if you don’t know everything about its construction. I’m starting to think that’s not a tenable position

Curious if there was a specific rationale or concern behind the previous position, or if it just seemed simpler at the time

sanket1729 commented at 6:16 pm on April 22, 2022: contributor

One more use-case for rawtr descriptors is to allow people to test new things with a valid key spend that might not be clearly representable in the current descriptor spec. Things like #24897 are a good example.

we could instead just permit raw inside sh and wsh

Agreed. This seems to be in the spirit of rawleaf() inside taproot nodes. Given all of this, it makes the descriptor language complete, meaning that we can represent the explicit spending script regardless of whether it is inside P2SH, P2WSH, or a leaf. It was already possible to use raw at the top level, but that lacked the power to express the underlying explicit spending script.

sipa commented at 6:07 pm on May 16, 2022: member

@ryanofsky Slow response but:

Curious if there was a specific rationale or concern behind the previous position, or if it just seemed simpler at the time

My thinking used to be that the goal of descriptors was to encapsulate full knowledge of “how to spend an output, excluding private keys”, with addr/raw as exceptions added specifically for the purpose of allowing scantxoutset to work with raw scripts/addresses. But there didn’t seem to be a use case for anything in between “know everything” (normal descriptors) and “know nothing except the scriptPubKey” (raw/addr), because such an in-between thing would be useless for everything except scanning, and thus all the information it would have beyond what raw/addr provide would be redundant.

I’ve come around because I’m convinced now there are use cases for these in-between situations. At least internally, signing is definitely possible, and useful, even with partial knowledge. Imagine a 2-of-2 script that’s (abusing notation here) and(pkh(A),pkh(B)) (so something where the script has pubkey hashes of two keys). If an output to such a script exists, and you have the private key to A, but don’t have even the unhashed public key to B, you do want to participate in signing. With the introduction of taproot, this notion is even extended, where you may be able to individually know enough to spend according to one branch fully, but not know anything about other branches. The current miniscript final PR bypasses descriptors for its signing logic, and effectively uses a different layer (the Key type template in miniscript) to represent this partial knowledge without needing descriptors to represent it. That’s another sign that there may be (even if only “internal”) use cases where it would greatly help to permit just everything in descriptors.
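
The and(pkh(A),pkh(B)) scenario can be sketched as a toy Python model. Everything here is hypothetical (the PkhFragment class, the wallet mapping from key hash to a label, the placeholder hashes) and no real signing or hashing takes place; it only shows that deciding whether to participate needs the key hashes in the script, not the other party’s public key.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PkhFragment:
    key_hash: bytes                 # hash that appears in the script
    pubkey: Optional[bytes] = None  # unhashed key, if we happen to know it


def signable(fragments: List[PkhFragment], wallet: Dict[bytes, str]) -> List[str]:
    """Return labels of wallet keys whose hash appears in the script.

    `wallet` maps a key hash to a label for a private key we hold.
    """
    return [wallet[f.key_hash] for f in fragments if f.key_hash in wallet]


# Placeholder hashes and key bytes, for illustration only.
hash_a = bytes.fromhex("aa" * 20)
hash_b = bytes.fromhex("bb" * 20)
script = [
    PkhFragment(hash_a, pubkey=b"\x02" + b"\x11" * 32),  # our key A
    PkhFragment(hash_b),                                 # B: only the hash is known
]
wallet = {hash_a: "key A"}

print(signable(script, wallet))  # ['key A']
```

We can contribute a signature for A even though B’s fragment carries no public key; the remaining piece would have to come from whoever holds B.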

And if we’re going to need to add more “exceptions” anyway because there are use cases demanding them, that’s perhaps a sign the philosophy is wrong to begin with, and we should just embrace the fact that there can always be a spectrum of available knowledge, rather than think about every missing piece as an exception.

Feature discussion: partial descriptors/miniscript #24114