descriptors: inference process, do not return unparsable multisig descriptors #31404

pull furszy wants to merge 2 commits into bitcoin:master from furszy:2024_descriptors_infer_multisig changing 9 files +51 −15
  1. furszy commented at 4:04 pm on December 2, 2024: member

    The internal script-to-descriptor inference procedure shouldn’t return invalid descriptors or ones that fail the string-to-descriptor parsing process. These two procedures should always support seamless roundtrips.

    This inlines InferScript/InferDescriptor to the actual descriptors ParseScript logic for the multisig path, ensuring only valid descriptors are produced.

    Examples where this presents an issue:

    1. Because the legacy wallet allowed the import of invalid scripts, the process could end up inferring and storing an unparsable descriptor, which crashes the software during any subsequent wallet load.

      Note: This issue will no longer exist after #31378, because we will discard multisig scripts that are consensus-invalid or descriptors-incompatible prior to calling the descriptors inference procedure but fixing this also at the descriptors level prevents future misuses.

    2. The decodescript() RPC command currently return unusable multisig descriptors. For example, providing a P2SH multisig redeem script with more than 520 bytes results in a descriptor when it shouldn’t due to its large size.

  2. DrahtBot commented at 4:04 pm on December 2, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31404.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #31378 (wallet: fix crash during migration due to invalid multisig descriptors by furszy)
    • #28710 (Remove the legacy wallet and BDB dependency by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. in test/functional/rpc_decodescript.py:101 in 34f42d34ce outdated
     95@@ -96,6 +96,15 @@ def decodescript_script_pub_key(self):
     96         assert_equal('witness_v0_scripthash', rpc_result['segwit']['type'])
     97         assert_equal('0 ' + multisig_script_hash, rpc_result['segwit']['asm'])
     98 
     99+        # Check decoding a multisig redeem script doesn't return an invalid descriptor
    100+        self.log.info("- multisig invalid")
    101+        # Decode a +512 bytes a consensus-invalid p2sh multisig - provided script translates to multi(20, keys)
    


    sipa commented at 4:07 pm on December 2, 2024:
    The limit is 520, not 512.

    furszy commented at 4:19 pm on December 2, 2024:
    upps, fixed. Thanks
  4. in src/script/descriptor.cpp:2244 in 34f42d34ce outdated
    2238@@ -2239,6 +2239,17 @@ std::unique_ptr<DescriptorImpl> InferScript(const CScript& script, ParseScriptCo
    2239                 break;
    2240             }
    2241         }
    2242+
    2243+        // Bare multisig pubkeys are limited to 3
    2244+        if (ctx == ParseScriptContext::TOP && providers.size() > 3) {
    


    sipa commented at 4:07 pm on December 2, 2024:
    Can you introduce a constant for this number 3 (if it doesn’t exist already)?

    furszy commented at 4:20 pm on December 2, 2024:
    Sure. Introduced it in the first commit.
  5. refactor: replace hardcoded number, introduce 'MAX_BARE_MULTISIG_PUBKEYS_NUM' 088bbb81e3
  6. furszy force-pushed on Dec 2, 2024
  7. furszy force-pushed on Dec 2, 2024
  8. furszy renamed this:
    [RFC] descriptors: inference process, do not return unparsable multisig descriptors
    descriptors: inference process, do not return unparsable multisig descriptors
    on Dec 4, 2024
  9. in test/functional/rpc_decodescript.py:105 in 50ebd73aee outdated
    100+        self.log.info("- multisig invalid")
    101+        # Decode a +520 bytes consensus-invalid p2sh multisig - provided script translates to multi(20, keys)
    102+        large_multisig_script = '0114' + push_public_key * 20 + '0114' + 'ae'
    103+        rpc_result = self.nodes[0].decodescript(large_multisig_script)
    104+        # Verify it is valid only under segwit rules, not for bare multisig nor p2sh.
    105+        assert 'desc' not in rpc_result
    


    rkrux commented at 11:54 am on December 24, 2024:
    0assert_equal('multisig', rpc_result['type'])
    

    Nit: This assertion can also be added. I modified the test case to have 21 keys, and the type changed to nonstandard, which is correct. p2sh and desc values are present in that output.

    Do you think it’s worth adding the test case with 21 pubkeys as well to cover the nonstandard scenario?


    furszy commented at 8:46 pm on December 24, 2024:
    0assert_equal('multisig', rpc_result['type'])
    

    Nit: This assertion can also be added. I modified the test case to have 21 keys, and the type changed to nonstandard, which is correct. p2sh and desc values are present in that output.

    Do you think it’s worth adding the test case with 21 pubkeys as well to cover the nonstandard scenario?

    Hmm, I find the type return field misleading. Based on the code, it relates to the output script type, which in this case can only be represented as p2wsh and not as a bare multisig. This is because (1) bare multisig scripts with +3 pubkeys are non-standard, and (2) bare multisig scripts with +15 pubkeys are consensus-invalid.

    So, I’m not entirely convinced about the type assertion at this point. It seems we should improve the type field RPC description, correct the RPC to actually return “non-standard” for bare multisigs with +3 pubkeys, and perhaps introduce the “consensus-invalid” (or simply “invalid”) type too.

  10. rkrux commented at 11:54 am on December 24, 2024: none
    make, functional tests successful.
  11. in src/rpc/rawtransaction.cpp:496 in 50ebd73aee outdated
    492@@ -493,7 +493,7 @@ static RPCHelpMan decodescript()
    493             RPCResult::Type::OBJ, "", "",
    494             {
    495                 {RPCResult::Type::STR, "asm", "Disassembly of the script"},
    496-                {RPCResult::Type::STR, "desc", "Inferred descriptor for the script"},
    497+                {RPCResult::Type::STR, "desc", /*optional=*/true, "Inferred top level descriptor for the script (if any)"},
    


    rkrux commented at 11:55 am on December 24, 2024:
    +1 for this doc change.
  12. in src/policy/policy.cpp:89 in 088bbb81e3 outdated
    86@@ -87,7 +87,7 @@ bool IsStandard(const CScript& scriptPubKey, const std::optional<unsigned>& max_
    87         unsigned char m = vSolutions.front()[0];
    88         unsigned char n = vSolutions.back()[0];
    89         // Support up to x-of-3 multisig txns as standard
    


    Eunovo commented at 11:21 am on December 28, 2024:
    https://github.com/bitcoin/bitcoin/pull/31404/commits/088bbb81e34dfbe56fdeafaba2ddf756017e86e1: Should probably replace // Support up to x-of-3 multisig txns as standard with // Support up to x-of-MAX_BARE_MULTISIG_PUBKEYS_NUM multisig txns as standard
  13. in src/script/descriptor.cpp:1856 in 088bbb81e3 outdated
    1851@@ -1852,8 +1852,8 @@ std::vector<std::unique_ptr<DescriptorImpl>> ParseScript(uint32_t& key_exp_index
    1852             return {};
    1853         }
    1854         if (ctx == ParseScriptContext::TOP) {
    1855-            if (providers.size() > 3) {
    1856-                error = strprintf("Cannot have %u pubkeys in bare multisig; only at most 3 pubkeys", providers.size());
    1857+            if (providers.size() > MAX_BARE_MULTISIG_PUBKEYS_NUM) {
    1858+                error = strprintf("Cannot have %u pubkeys in bare multisig; only at most %d pubkeys", providers.size(), MAX_BARE_MULTISIG_PUBKEYS_NUM);
    


    Eunovo commented at 11:30 am on December 28, 2024:
    https://github.com/bitcoin/bitcoin/pull/31404/commits/088bbb81e34dfbe56fdeafaba2ddf756017e86e1: MAX_BARE_MULTISIG_PUBKEYS_NUM is an unsigned int so you can use %u format specifier instead of %d

    furszy commented at 2:47 pm on December 28, 2024:
    sure. Applied.
  14. in src/wallet/rpc/addresses.cpp:318 in 50ebd73aee outdated
    314@@ -315,6 +315,7 @@ RPCHelpMan addmultisigaddress()
    315 
    316     // Make the descriptor
    317     std::unique_ptr<Descriptor> descriptor = InferDescriptor(GetScriptForDestination(dest), spk_man);
    318+    if (!descriptor) throw JSONRPCError(RPC_INTERNAL_ERROR, "Invalid descriptor generated");
    


    Eunovo commented at 12:38 pm on December 28, 2024:

    https://github.com/bitcoin/bitcoin/pull/31404/commits/50ebd73aeeb853328d1f4eadba77f2ca80a27a8c: AFAICT At this point, the addmultisigaddress operation is already complete (the scripts have been imported into the wallet and the destination is in the address book), throwing an error here may mislead the user into thinking the operation failed.

    If you want the operation to fail if an invalid descriptor is inferred then this code needs to be moved up before the scripts are imported into the wallet, OR, you can just make the descriptor field optional


    furszy commented at 2:45 pm on December 28, 2024:
    Sure, good catch. Still, this shouldn’t happen because we are creating the destination internally. It is purely a sanity check.

    Eunovo commented at 8:22 am on December 29, 2024:
    It looks like you have resolved this in https://github.com/bitcoin/bitcoin/pull/31404/commits/e1d6179b61a2d4766c5eb6183a839fcaa6e566fa. Mark this as resolved?
  15. in src/core_write.cpp:158 in 50ebd73aee outdated
    152@@ -153,7 +153,9 @@ void ScriptToUniv(const CScript& script, UniValue& out, bool include_hex, bool i
    153 
    154     out.pushKV("asm", ScriptToAsmStr(script));
    155     if (include_address) {
    156-        out.pushKV("desc", InferDescriptor(script, provider ? *provider : DUMMY_SIGNING_PROVIDER)->ToString());
    157+        auto desc = InferDescriptor(script, provider ? *provider : DUMMY_SIGNING_PROVIDER);
    158+        // future: make 'InferDescriptor' return the error cause
    159+        if (desc) out.pushKV("desc", desc->ToString());
    


    Eunovo commented at 1:29 pm on December 28, 2024:

    https://github.com/bitcoin/bitcoin/pull/31404/commits/50ebd73aeeb853328d1f4eadba77f2ca80a27a8c: The followig callers of ScriptToUniv need to be updated.

     0diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
     1index 4cbe4a6062..5fcb9ff0d2 100644
     2--- a/src/rpc/blockchain.cpp
     3+++ b/src/rpc/blockchain.cpp
     4@@ -1105,7 +1105,7 @@ static RPCHelpMan gettxout()
     5                 {RPCResult::Type::STR_AMOUNT, "value", "The transaction value in " + CURRENCY_UNIT},
     6                 {RPCResult::Type::OBJ, "scriptPubKey", "", {
     7                     {RPCResult::Type::STR, "asm", "Disassembly of the output script"},
     8-                    {RPCResult::Type::STR, "desc", "Inferred descriptor for the output"},
     9+                    {RPCResult::Type::STR, "desc", /*optional=*/true, "Inferred top level descriptor for the script (if any)"},
    10                     {RPCResult::Type::STR_HEX, "hex", "The raw output script bytes, hex-encoded"},
    11                     {RPCResult::Type::STR, "type", "The type, eg pubkeyhash"},
    12                     {RPCResult::Type::STR, "address", /*optional=*/true, "The Bitcoin address (only if a well-defined address exists)"},
    13diff --git a/src/rpc/rawtransaction.cpp b/src/rpc/rawtransaction.cpp
    14index 14963190f4..df9a8c2f87 100644
    15--- a/src/rpc/rawtransaction.cpp
    16+++ b/src/rpc/rawtransaction.cpp
    17@@ -505,7 +505,7 @@ static RPCHelpMan decodescript()
    18                      {RPCResult::Type::STR_HEX, "hex", "The raw output script bytes, hex-encoded"},
    19                      {RPCResult::Type::STR, "type", "The type of the output script (e.g. witness_v0_keyhash or witness_v0_scripthash)"},
    20                      {RPCResult::Type::STR, "address", /*optional=*/true, "The Bitcoin address (only if a well-defined address exists)"},
    21-                     {RPCResult::Type::STR, "desc", "Inferred descriptor for the script"},
    22+                     {RPCResult::Type::STR, "desc", /*optional=*/true, "Inferred top level descriptor for the script (if any)"},
    23                      {RPCResult::Type::STR, "p2sh-segwit", "address of the P2SH script wrapping this witness redeem script"},
    24                  }},
    25             },
    26@@ -827,7 +827,7 @@ const RPCResult decodepsbt_inputs{
    27                 {RPCResult::Type::OBJ, "scriptPubKey", "",
    28                 {
    29                     {RPCResult::Type::STR, "asm", "Disassembly of the output script"},
    30-                    {RPCResult::Type::STR, "desc", "Inferred descriptor for the output"},
    31+                    {RPCResult::Type::STR, "desc", /*optional=*/true, "Inferred top level descriptor for the script (if any)"},
    32                     {RPCResult::Type::STR_HEX, "hex", "The raw output script bytes, hex-encoded"},
    33                     {RPCResult::Type::STR, "type", "The type, eg 'pubkeyhash'"},
    34                     {RPCResult::Type::STR, "address", /*optional=*/true, "The Bitcoin address (only if a well-defined a
    35ddress exists)"},
    36diff --git a/src/rpc/util.cpp b/src/rpc/util.cpp
    37index b1fbc25641..ff23225548 100644
    38--- a/src/rpc/util.cpp
    39+++ b/src/rpc/util.cpp
    40@@ -1412,7 +1412,7 @@ std::vector<RPCResult> ScriptPubKeyDoc() {
    41     return
    42          {
    43              {RPCResult::Type::STR, "asm", "Disassembly of the output script"},
    44-             {RPCResult::Type::STR, "desc", "Inferred descriptor for the output"},
    45+             {RPCResult::Type::STR, "desc", /*optional=*/true, "Inferred top level descriptor for the script (if any)"}
    46,
    47              {RPCResult::Type::STR_HEX, "hex", "The raw output script bytes, hex-encoded"},
    48              {RPCResult::Type::STR, "address", /*optional=*/true, "The Bitcoin address (only if a well-defined address 
    49exists)"},
    50              {RPCResult::Type::STR, "type", "The type (one of: " + GetAllOutputTypes() + ")"},
    

    furszy commented at 2:50 pm on December 28, 2024:
    Sure. Done as suggested. Thanks.
  16. descriptors: inference process, do not return unparsable descriptors
    The internal script-to-descriptor inference procedure shouldn't return invalid
    descriptors or ones that fail the string-to-descriptor parsing process. These
    two procedures should always support seamless roundtrips.
    
    This inlines `InferScript`/`InferDescriptor` to the actual descriptors
    `ParseScript` logic for the multisig path, ensuring only valid descriptors are
    produced.
    
    Examples where this presents an issue:
    1) Because the legacy wallet allowed the import of invalid scripts, the process
       could end up inferring and storing an unparsable descriptor, which crashes
       the software during any subsequent wallet load
    
    2) The `decodescript()` RPC command currently return unusable multisig descriptors.
       For example, providing a P2SH multisig redeem script with more than 520 bytes
       results in a descriptor when it shouldn't due to its large size.
    e1d6179b61
  17. furszy force-pushed on Dec 28, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me