test: verify spend from 999-of-999 taproot multisig wallet #28212

pull eriknylund wants to merge 1 commits into bitcoin:master from eriknylund:musig-test-improvements changing 1 files +28 −3
  1. eriknylund commented at 4:26 pm on August 3, 2023: contributor

    Verify that a 999-of-999 taproot multisig wallet is possible and can spend from it, and that neither a 1-of-0 nor 1-of-1000 is allowed.

    The tests will require some time to run. On my Mac M1 the new tests run in about 40 seconds.

    Had to bump fee rate to resolve assertions in sendtoaddress and psbt methods:

    • code -5: assert rpc_online.gettransaction(txid)[“confirmations”] > 0
    • code -26: min relay fee not met

    Fixes #28179

  2. DrahtBot commented at 4:26 pm on August 3, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    No conflicts as of last run.

  3. DrahtBot added the label Tests on Aug 3, 2023
  4. eriknylund force-pushed on Aug 3, 2023
  5. DrahtBot added the label CI failed on Aug 3, 2023
  6. in test/functional/wallet_taproot.py:517 in 1bc3a89465 outdated
    512+        # Test 999-of-999
    513+        self.do_test_k_of_n(999, 999)
    514+        # Test 1-of-1000, should not succeed
    515+        try:
    516+            self.do_test_k_of_n(1, 1000)
    517+        except JSONRPCException as e:
    


    paplorinc commented at 4:49 pm on August 3, 2023:

    won’t this pass if no exception is raised? Shouldn’t we add something like either:

    0try:
    1    self.do_test_k_of_n(1, 1000)
    2    assert False, "Expected exception was not raised for 1-of-1000 test"
    3except JSONRPCException as e:
    

    or via unittest:

    0with self.assertRaises(JSONRPCException) as cm:
    1    self.do_test_k_of_n(1, 1000)
    2self.assertEqual(cm.exception.error["code"], -5)
    3self.assertIn("Cannot have 1000 keys in multi_a; must have between 1 and 999 keys, inclusive", cm.exception.error["message"])
    

    ?


    eriknylund commented at 4:58 pm on August 3, 2023:
    Thanks for pointing this out 🙏 I agree it shouldn’t be possible for this test to succeed so an exception should be raised.

    instagibbs commented at 6:44 pm on August 3, 2023:
    assert_raises_rpc_error is probably what you want

    eriknylund commented at 6:54 pm on August 3, 2023:
    @paplorinc I like the second approach, it seems a bit cleaner and clearer to me. However, it doesn’t seem to be used in other functional tests so maybe the first approach is better to be more consistent?

    eriknylund commented at 6:56 pm on August 3, 2023:
    @instagibbs Thanks! I was looking at that option too, however I couldn’t see a way to use it in this context without having to change a lot of the code, but maybe I’m missing something obvious? 🤔

    instagibbs commented at 8:13 pm on August 3, 2023:
    if something should raise an error and a specific code/message, you use assert_raises_rpc_error. Just take a look on how it’s used elsewhere?

    paplorinc commented at 9:31 pm on August 3, 2023:
    0# Test that 1-of-1000 and 999-of-1000 raise exception
    1for k in [1, 999]:
    2    assert_raises_rpc_error(-5, "Cannot have 1000 keys in multi_a; must have between 1 and 999 keys, inclusive", self.do_test_k_of_n, k, 1000)
    

    or maybe even:

    0for k in [0, 1, 999, 1000]:
    

    if that makes sense.


    eriknylund commented at 5:35 am on August 4, 2023:
    I much appreciate the input @paplorinc, this was the syntax I was looking for 🙏 I saw the first version of your message and immediately thought about doing the other part ([0, 1, 999, 1000]). You read my mind, this makes very much sense! Also, doing [1, 999] would be quite time consuming because each run lasts about a second so in total it would need 1000 seconds to complete. Nvm, I was thinking about testing the whole range when in fact it’s just two tests. I’ll update the PR to support the version with four tests.
  7. fanquake requested review from Sjors on Aug 3, 2023
  8. DrahtBot removed the label CI failed on Aug 3, 2023
  9. in test/functional/wallet_taproot.py:393 in 1bc3a89465 outdated
    387@@ -387,6 +388,14 @@ def do_test_psbt(self, comment, pattern, privmap, treefn, keys_pay, keys_change)
    388         psbt_online.unloadwallet()
    389         psbt_offline.unloadwallet()
    390 
    391+    def do_test_k_of_n(self, k, n):
    


    eriknylund commented at 5:45 am on August 4, 2023:
    @Sjors Given my basic understanding of output descriptors, I’d much appreciate your thoughts on this block of code, to make sure it meets the issue criterias and if this is a good way to test it.

    Sjors commented at 9:40 am on August 4, 2023:
    I understand descriptors, but Python less well :-) What is this function trying to do?

    eriknylund commented at 10:23 am on August 4, 2023:

    It’s a function that allows setup and testing of any k-of-n wallet. It builds on the idea of the existing test that @darosior pointed to, which starts on L495 https://github.com/bitcoin/bitcoin/blob/64440bb733896a7a2caf902825e0406cb993e666/test/functional/wallet_taproot.py#L495 where a random number of n is used.

    The part that differs is how the [KEY_1,...,KEY_N] array inside the multi_a is built - where the previous function uses ([H_POINT] * rnd_pos) up until k1 and then ([H_POINT] * (MAX_PUBKEYS_PER_MULTI_A - 1 - rnd_pos) after, the new function uses k1 for the whole array of.

    However, because this previous test is not documented, I’m not entirely sure about the H_POINT parts do and if I need them in my test somehow, or if the way I’ve written it is enough to verify the 999-of-999 and 1-of-1000 cases.

  10. in test/functional/wallet_taproot.py:519 in aa2a1aa671 outdated
    524-            self.do_test_k_of_n(999, 1000)
    525-        except JSONRPCException as e:
    526-            assert e.error["code"] == -5 and "Cannot have 1000 keys in multi_a; must have between 1 and 999 keys, inclusive" in e.error["message"]
    527+        # Test that all of the combinations 0-of-1000, 1-of-1000 and 999-of-1000 and 1000-of-1000 raise exception
    528+        for k in [0, 1, 999, 1000]:
    529+            assert_raises_rpc_error(-5, "Cannot have 1000 keys in multi_a; must have between 1 and 999 keys, inclusive", self.do_test_k_of_n, k, 1000)
    


    paplorinc commented at 7:12 am on August 4, 2023:

    isn’t it weird that the error message states “Cannot have 1000 keys” and k is ignored?

    https://github.com/bitcoin/bitcoin/blob/7edce77ff3725f8650bb1ae6b45b1cee100db280/src/script/descriptor.cpp#L1381

    Based on the error the k doesn’t even matter (it fails for 500 as well), so is it even important to test multiple k values here?


    Sjors commented at 9:31 am on August 4, 2023:

    Maybe keep it simple and just test 1 of 1000.

    In that case I also don’t think you need the k_of_n helper function.


    eriknylund commented at 10:07 am on August 4, 2023:
    I guess that would do the trick, however I’m unsure if by doing so we modify the test to test the assertion, which seems a bit incomplete/unintuitive to me, rather than testing the edge cases in k-of-N wallets? 🤔 If you understand what I mean 😄 If you don’t think it’s an issue, I’m all for keeping it simple and just test 1-of-1000.

    Sjors commented at 10:19 am on August 4, 2023:
    Well the “edge” here is n = 999, but this test is varying k, not n.

    eriknylund commented at 10:24 am on August 4, 2023:
    Ah you’re right. I’ll simplify the test to only do 1-of-1000.

    eriknylund commented at 11:44 am on August 4, 2023:
    Would it make sense to test the other edge, 1-of-0?
  11. maflcko commented at 7:18 am on August 4, 2023: member
    The linter is failing. Also, please squash your commits according to https://github.com/bitcoin/bitcoin/blob/master/CONTRIBUTING.md#squashing-commits
  12. DrahtBot added the label CI failed on Aug 4, 2023
  13. in test/functional/wallet_taproot.py:306 in aa2a1aa671 outdated
    303@@ -300,7 +304,7 @@ def do_test_sendtoaddress(self, comment, pattern, privmap, treefn, keys_pay, key
    304             test_balance = int(rpc_online.getbalance() * 100000000)
    305             ret_amnt = random.randrange(100000, test_balance)
    306             # Increase fee_rate to compensate for the wallet's inability to estimate fees for script path spends.
    307-            res = rpc_online.sendtoaddress(address=self.boring.getnewaddress(), amount=Decimal(ret_amnt) / 100000000, subtractfeefromamount=True, fee_rate=200)
    308+            res = rpc_online.sendtoaddress(address=self.boring.getnewaddress(), amount=Decimal(ret_amnt) / 100000000, subtractfeefromamount=True, fee_rate=500)
    


    Sjors commented at 9:33 am on August 4, 2023:
    Why are you changing the fee rate?

    eriknylund commented at 10:01 am on August 4, 2023:

    If I didn’t bump the fee rate I would run into the two error codes mentioned in the PR message:

    0Had to bump fee rate to resolve assertions in sendtoaddress and psbt methods:
    1
    2    code -5: assert rpc_online.gettransaction(txid)["confirmations"] > 0
    3    code -26: min relay fee not met
    

    The previous fee_rate = 200 limit seems to be sufficient up until n ~ 500. I assume this is because the script size increases with higher numbers of n.


    Sjors commented at 10:16 am on August 4, 2023:
    Fee rate is the fee per vbyte, so the fee goes up as the script size goes up. But this may be related to #26573
  14. eriknylund force-pushed on Aug 4, 2023
  15. in test/functional/wallet_taproot.py:396 in 7b75803e0a outdated
    389@@ -387,6 +390,14 @@ def do_test_psbt(self, comment, pattern, privmap, treefn, keys_pay, keys_change)
    390         psbt_online.unloadwallet()
    391         psbt_offline.unloadwallet()
    392 
    393+    def do_test_k_of_n(self, k, n):
    394+        self.do_test(
    395+            f"tr(XPUB,multi_a({k},XPRV_1,...,XPRV_N)",
    396+            f"tr($2/*,multi_a({k}" + (",$1/*" * n) + "))",
    


    paplorinc commented at 12:42 pm on August 4, 2023:
    Slightly unrelated: are there any tests that check multisig with different keys rather than repeating the same one, to make it more representative of a real-world scenario? I’m just getting familiar with the codebase, I’m not sure whether it’s important or not.

    eriknylund commented at 12:52 pm on August 4, 2023:
    I think it’s very related and also my concern, is this test as good as one with K different keys. I’m hoping @Sjors can provide some insight here. My understanding of the other tests in this file is limited because they are not documented, but they seem to be doing something similar to this. On the other hand I could of course do what the previous test did, to use the “H_POINT” approach and just throw one XPRV in there somewhere. If that makes the test more real-world like or improves the test in some other way I’m all for it.

    Sjors commented at 11:23 am on December 14, 2023:
    Testing different keys makes sense, but let’s keep this PR simple and leave it to a followup.
  16. DrahtBot removed the label CI failed on Aug 4, 2023
  17. eriknylund force-pushed on Aug 7, 2023
  18. SambhavsCreation approved
  19. SambhavsCreation commented at 6:43 pm on August 10, 2023: none
    I think it looks good. I think the concerns about testing with different keys are not of critical importance right now. Something could be done about it in the future.
  20. eriknylund force-pushed on Aug 16, 2023
  21. DrahtBot added the label CI failed on Aug 16, 2023
  22. in test/functional/wallet_taproot.py:16 in 2dcfce16aa outdated
    10@@ -11,7 +11,10 @@
    11 from test_framework.address import output_key_to_p2tr
    12 from test_framework.key import H_POINT
    13 from test_framework.test_framework import BitcoinTestFramework
    14-from test_framework.util import assert_equal
    15+from test_framework.util import (
    16+    assert_equal,
    17+    assert_raises_rpc_error
    


    maflcko commented at 9:23 am on August 17, 2023:
    0    assert_raises_rpc_error,
    

    style-nit: Add the missing comma here? Otherwise someone will have to touch this line again when appending something.

    A (force) push should also make CI less red.


    eriknylund commented at 10:06 am on August 17, 2023:
    Thanks for the comments. 🙏 I added the comma, rebased onto master and force pushed to trigger a new build to get the mac CI through.
  23. eriknylund force-pushed on Aug 17, 2023
  24. DrahtBot added the label Needs rebase on Aug 17, 2023
  25. eriknylund force-pushed on Aug 17, 2023
  26. DrahtBot removed the label Needs rebase on Aug 17, 2023
  27. DrahtBot removed the label CI failed on Aug 18, 2023
  28. maflcko commented at 6:26 am on September 15, 2023: member
    Merge commits are not allowed in pull requests, please reset or rebase your changes.
  29. eriknylund force-pushed on Sep 15, 2023
  30. eriknylund force-pushed on Nov 14, 2023
  31. DrahtBot added the label CI failed on Nov 14, 2023
  32. in test/functional/wallet_taproot.py:397 in 030ee8a449 outdated
    389@@ -387,6 +390,14 @@ def do_test_psbt(self, comment, pattern, privmap, treefn, keys_pay, keys_change)
    390         psbt_online.unloadwallet()
    391         psbt_offline.unloadwallet()
    392 
    393+    def do_test_k_of_n(self, k, n):
    394+        self.do_test(
    395+            f"tr(XPUB,multi_a({k},XPRV_1,...,XPRV_N)",
    396+            f"tr($2/*,multi_a({k}" + (",$1/*" * n) + "))",
    397+            [True, False],
    


    Sjors commented at 11:38 am on December 14, 2023:
    Note to other reviewers: this privmap argument determines for which keys an xpriv is inserted into the descriptor.
  33. in test/functional/wallet_taproot.py:515 in 030ee8a449 outdated
    509@@ -499,6 +510,20 @@ def run_test(self):
    510             [True, False],
    511             lambda k1, k2: (key(k2), [multi_a(1, ([H_POINT] * rnd_pos) + [k1] + ([H_POINT] * (MAX_PUBKEYS_PER_MULTI_A - 1 - rnd_pos)))])
    512         )
    513+
    514+        # Test 999-of-999
    515+        self.do_test_k_of_n(999, 999)
    


    Sjors commented at 11:47 am on December 14, 2023:

    Unfortunately this leads to extreme CPU usage and a sendtoaddress RPC timeout even on my 2019 MacBook Pro.

    I suggest sticking to a lower maximum for now, so we at least get this useful helper function, and leaving a note like:

    0// TODO: fix or work around performance bottleneck in `sendtoaddress`
    1// in order to test the maximum Taproot allowed 999-of-999.
    

    My Macbook can easily handle 1-of-100. The 30 second timeout is hit somewhere between 1-of-750 and 1-of-1000. The CI machines may be slower.

    In term of fixing the bottleneck, you could try the more modern send RPC though I doubt that makes a difference. Maybe @achow101 has an idea?


    eriknylund commented at 11:07 am on December 15, 2023:
    I’ll lower it to 100 unless @achow101 thinks the send RPC approach would work efficiently here.

    eriknylund commented at 11:19 am on December 15, 2023:

    @Sjors just to make sure, did you mean 100-of-100? Or should I leave this out completely and just keep the 1-of-0 and 1-of-1000 tests further down that utilises the same new helper method?

    0        # Test 100-of-100 
    1        self.do_test_k_of_n(100, 100)
    

    eriknylund commented at 12:18 pm on December 15, 2023:

    The CI machines seem to make it through without timeout: https://github.com/bitcoin/bitcoin/actions/runs/7221218374/job/19675746505?pr=28212#step:7:3594 https://github.com/bitcoin/bitcoin/actions/runs/7221218374/job/19675746796?pr=28212#step:27:330

    However, the test runs for 15 minutes on the Win64 job and for 20 minutes on the macOS job, making it the longest running functional test by far.


    Sjors commented at 12:49 pm on December 15, 2023:

    If it still takes that long you probably want to pick a smaller K and/or N.

    did you mean 100-of-100? Or should I leave this out completely and just keep the 1-of-0 and 1-of-1000 tests further down that utilises the same new helper method?

    The other two tests only check the failure case. A 1-of-100 would check the success path. 100-of-100 might be fine too, but maybe slower?


    eriknylund commented at 1:17 pm on December 15, 2023:

    100-of-100 seems to run okay, less than 16 sec to complete the whole test. However, after the rebase I see the same behavior on my Mac M1 as you’re seeing on your 2019 MacBook Pro. Even the 1-of-999 test doesn’t pass without an RPC timeout, which it did before! So perhaps there’s been some change while the PR has been open that now causes the test to perform slower? 🤔

    1-of-999 gives me: 'sendall' RPC took longer than 30.000000 seconds. Consider using larger timeout for calls that take longer to return. (-344)

    999-of-999 gives me 'sendtoaddress' RPC took longer than 30.000000 seconds. Consider using larger timeout for calls that take longer to return. (-344)


    Sjors commented at 2:12 pm on December 15, 2023:
    Interesting. Try cleaning your working directory (e.g. git clean -dfx) and building again. To try on an older version of master, you can you check out any old commit and then do a git cherry-pick ....

    eriknylund commented at 9:05 pm on December 15, 2023:

    Interesting. Try cleaning your working directory (e.g. git clean -dfx) and building again. To try on an older version of master, you can you check out any old commit and then do a git cherry-pick ....

    Git bisecting this I found that the tests start timing out with commit 4f473ea515bc77b9138323dab8a741c063d32e8f

    Are you able to verify this on your setup? @darosior would you be able to check as well? 🙏

    This is how I did it:

    0git checkout 4f473ea515bc77b9138323dab8a741c063d32e8f
    1git cherry-pick 387c12e0813a863bc9728777719acbafe7b12a34
    2make
    3python3 test/functional/wallet_taproot.py
    4...
    5test_framework.authproxy.JSONRPCException: 'sendtoaddress' RPC took longer than 30.000000 seconds. Consider using larger timeout for calls that take longer to return. (-344)
    

    Test previous commit:

    0git checkout HEAD~1
    1git cherry-pick 387c12e0813a863bc9728777719acbafe7b12a34
    2make
    3python3 test/functional/wallet_taproot.py
    4...
    52023-12-15T20:46:16.689000Z TestFramework (INFO): Tests successful
    

    Sjors commented at 8:29 am on December 16, 2023:
    Great find! Let’s track this in a separate issue: #29098
  34. in test/functional/wallet_taproot.py:395 in 030ee8a449 outdated
    389@@ -387,6 +390,14 @@ def do_test_psbt(self, comment, pattern, privmap, treefn, keys_pay, keys_change)
    390         psbt_online.unloadwallet()
    391         psbt_offline.unloadwallet()
    392 
    393+    def do_test_k_of_n(self, k, n):
    394+        self.do_test(
    395+            f"tr(XPUB,multi_a({k},XPRV_1,...,XPRV_N)",
    


    Sjors commented at 11:47 am on December 14, 2023:
    ...,XPRIV_{n}
  35. Sjors commented at 11:58 am on December 14, 2023: member
    Thanks for the rebase. At least one of the CI failures is due to a timeout, see my inline comment.
  36. test: verify that a 999-of-999 taproot multisig wallet is possible and can spend from it, and that neither a 1-of-0 nor 1-of-1000 is allowed 387c12e081
  37. eriknylund force-pushed on Dec 15, 2023
  38. darosior commented at 3:04 pm on April 9, 2024: member
    This has not seen any activity for a while. It’s depending on #29098 (comment), which i unfortunately do not plan on implementing right now. If you do i’m happy to review, but in the meantime i think this PR can be closed and potentially re-opened once the performance of the satisfier is reasonable.
  39. achow101 closed this on Apr 9, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-04 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me