wallet: ensure the wallet is unlocked when needed for rescanning #26347

pull ishaanam wants to merge 3 commits into bitcoin:master from ishaanam:rescanblockchain_unlock_wallet changing 7 files +106 −13
  1. ishaanam commented at 1:07 am on October 20, 2022: contributor

    Wallet passphrases are needed to top up the keypool of encrypted wallets during a rescan. The following RPCs need the passphrase when rescanning: - importdescriptors - rescanblockchain

    The following RPCs use the information about whether or not the passphrase is being used to ensure that full rescans are able to take place (meaning the following RPCs should not be able to run if a rescan requiring the wallet to be unlocked is taking place): - walletlock - encryptwallet - walletpassphrasechange

    m_relock_mutex is also introduced so that the passphrase is not deleted from memory when the timeout provided in walletpassphrase is up and the wallet is still rescanning. Fixes #25702, #11249

    Thanks to achow101 for coming up with the idea of using a new mutex to solve this issue and for answering related questions.

  2. fanquake added the label Wallet on Oct 20, 2022
  3. aureleoules commented at 9:07 am on October 21, 2022: member
    Changes look good, is there a reason it’s in draft? Also, is there a way to add a test for this?
  4. ishaanam commented at 1:29 pm on October 21, 2022: contributor

    Changes look good, is there a reason it’s in draft?

    Also, is there a way to add a test for this?

    I made it a draft because I haven’t written a test yet. I’ll mark it as ready for review after I add some tests.

  5. ishaanam force-pushed on Oct 23, 2022
  6. ishaanam force-pushed on Oct 23, 2022
  7. ishaanam marked this as ready for review on Oct 23, 2022
  8. DrahtBot commented at 10:57 pm on October 23, 2022: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK achow101, hernanmarino, furszy
    Concept ACK pablomartin4btc, theStack

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #27068 (wallet: SecureString to allow null characters by john-moffett)
    • #22838 (descriptors: Be able to specify change and receiving in a single descriptor string by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  9. aureleoules commented at 9:31 am on October 24, 2022: member
    The test seems to create intermittent failures. I also cherry-picked the test on master and the results seem to be the same. seq 100 | xargs -i -P 5 python test/functional/wallet_importdescriptors.py --descriptors will execute the test in parallel batches of 5.
  10. ishaanam force-pushed on Oct 24, 2022
  11. ishaanam commented at 6:34 pm on October 24, 2022: contributor

    The test seems to create intermittent failures. I also cherry-picked the test on master and the results seem to be the same. seq 100 | xargs -i -P 5 python test/functional/wallet_importdescriptors.py --descriptors will execute the test in parallel batches of 5.

    Thanks for taking a look at this, that should be fixed now. There is still a problem of the passphrase timing out on certain systems before importdescriptors is even run, but I’m not sure how to fix that.

  12. in src/wallet/rpc/encrypt.cpp:93 in b8952bd7d3 outdated
    89@@ -90,6 +90,7 @@ RPCHelpMan walletpassphrase()
    90     std::weak_ptr<CWallet> weak_wallet = wallet;
    91     pwallet->chain().rpcRunLater(strprintf("lockwallet(%s)", pwallet->GetName()), [weak_wallet, relock_time] {
    92         if (auto shared_wallet = weak_wallet.lock()) {
    93+            LOCK(shared_wallet->m_relock_mutex);
    


    achow101 commented at 7:12 pm on October 24, 2022:

    In b8952bd7d3b3f745109506223f829c5dd91e638f “wallet: ensure that the passphrase is not deleted from memory when being used to rescan”

    I think this is the wrong place to grab this mutex. I think it would be better to put it CWallet::Lock (in addition to the LOCK(cs_wallet) already there) as that is the underlying locking function that is called in a variety of places and so would offer stronger protection.


    hernanmarino commented at 7:31 pm on October 25, 2022:
    I have some doubts about the proper location for grabbing the mutex, but I am not an expert in wallet code, some take my ideas with care. @achow101 ’s suggestion seems like a nice “natural” way to lock mutexes, and initially I agreed with it. But since this pr creates a new mutex, I see no conflict on grabbing it elsewhere, like here. Also I’m a little bit concerned that other calls to CWallet::Lock might unnecessarily lock this unrelated mutex, blocking execution.

    ishaanam commented at 2:48 am on October 27, 2022:
    The new mutex is now grabbed in CWallet::Lock and also in locations where cs_wallet would have been grabbed first in order to prevent a deadlock.
  13. in src/wallet/rpc/transactions.cpp:883 in b8952bd7d3 outdated
    874@@ -875,6 +875,7 @@ RPCHelpMan rescanblockchain()
    875     uint256 start_block;
    876 
    877     EnsureWalletIsUnlocked(*pwallet);
    878+    LOCK(pwallet->m_relock_mutex);
    


    achow101 commented at 7:14 pm on October 24, 2022:

    In b8952bd7d3b3f745109506223f829c5dd91e638f “wallet: ensure that the passphrase is not deleted from memory when being used to rescan”

    The mutex should be grabbed before ensuring the wallet is unlocked. There could be a race condition where the wallet is locked by the scheduler in between checking that the wallet is unlocked, and grabbing m_relock_mutex. By grabbing the mutex first, we can be sure that the scheduler will not be able to lock the wallet before we check that the wallet is unlocked.

    Also, use LOCK2 to grab 2 locks at the same time.


    ishaanam commented at 2:32 am on October 25, 2022:
    How would using LOCK2 work for grabbing cs_wallet and m_relock_mutex at the same time if the whole point of having m_relock_mutex is that we don’t want to be holding cs_wallet during the time of the rescan since it can be time consuming?

    achow101 commented at 7:21 pm on October 25, 2022:
    LOCK2 only needs to be used in the places when both locks are being acquired, e.g. in CWallet::Lock.

    ishaanam commented at 2:39 am on October 27, 2022:

    I have made it so that m_relock_mutex is grabbed before ensuring the wallet is unlocked.

    LOCK2 only needs to be used in the places when both locks are being acquired, e.g. in CWallet::Lock.

    Right, I thought you meant this location specifically.

  14. in test/functional/wallet_importdescriptors.py:675 in 25b8736431 outdated
    670+
    671+        descriptor = {"desc": descsum_create("pkh(" + xpriv + "/1h/*h)"),
    672+                              "timestamp": "now",
    673+                              "active": True,
    674+                              "range": [0,10000],
    675+                              "next_index": 10000}
    


    achow101 commented at 7:40 pm on October 24, 2022:

    In ad8d139d6237ffa556e90d2d87ec43c209c89dc9 “test: test rescanning encrypted wallets”

    10000 is running into the rpc timeout for me for the importdescriptors calls. I think this could be reduced to 4000. That seems to not hit the rpc timeout, but still consistently fails this test on master.


    hernanmarino commented at 7:19 pm on October 25, 2022:
    10000 runs fine for me, but I agree with lowering it to 4000.

    ishaanam commented at 2:35 am on October 27, 2022:
    Done
  15. achow101 commented at 7:41 pm on October 24, 2022: member
    This could have a similar test for rescanblockchain as well.
  16. pablomartin4btc commented at 7:13 pm on October 25, 2022: member
    Concept ACK, I see the point from @achow101 moving the wallet lock/ mutex to CWallet::Lock(), it looks like a better place but I’d need more knowledge to give a proper advice.
  17. ishaanam force-pushed on Oct 27, 2022
  18. ishaanam force-pushed on Oct 27, 2022
  19. ishaanam commented at 2:50 am on October 27, 2022: contributor

    Thanks for the review @achow101, I have implemented your suggestions. I also made some slight modifications to the error messages in 8c16fdb73dd323245642d2710b5f81776db40396.

    This could have a similar test for rescanblockchain as well.

    Yes, I will add one shortly.

  20. ishaanam force-pushed on Oct 30, 2022
  21. ishaanam force-pushed on Oct 30, 2022
  22. in src/wallet/wallet.h:492 in 4724f0b82c outdated
    486@@ -487,6 +487,9 @@ class CWallet final : public WalletStorage, public interfaces::Chain::Notificati
    487 
    488     // Used to prevent concurrent calls to walletpassphrase RPC.
    489     Mutex m_unlock_mutex;
    490+    // Used to prevent deleting the passphrase from memory when it is still in use.
    491+    RecursiveMutex m_relock_mutex;
    


    w0xlt commented at 7:55 pm on October 30, 2022:
    Using Mutex is preferable to RecursiveMutex.

    ishaanam commented at 9:08 pm on October 30, 2022:

    Mutex isn’t used here because sometimes m_relock_mutex has to be locked multiple time within a scope to prevent deadlocks. See: #26347 (review)

    The new mutex is now grabbed in CWallet::Lock and also in locations where cs_wallet would have been grabbed first in order to prevent a deadlock.

  23. ishaanam force-pushed on Oct 30, 2022
  24. ishaanam force-pushed on Oct 30, 2022
  25. ishaanam commented at 0:09 am on October 31, 2022: contributor
    Added a test for rescanblockchain
  26. theStack commented at 1:11 pm on January 3, 2023: contributor
    Concept ACK
  27. in src/wallet/rpc/transactions.cpp:886 in 8c16fdb73d outdated
    875     std::optional<int> stop_height;
    876     uint256 start_block;
    877+
    878     {
    879         LOCK(pwallet->cs_wallet);
    880+        EnsureWalletIsUnlocked(*pwallet);
    


    furszy commented at 3:38 pm on January 20, 2023:
    if the wallet is a watch-only wallet, do we need to ensure this?

    ishaanam commented at 4:09 pm on January 25, 2023:
    No, I will make it so that here and in the other RPCs EnsureWalletIsUnlocked is only called if the wallet is not watch-only. Additionally, in that case m_relock_mutex doesn’t need to be held, as long as it is not grabbed later in the RPC.

    ishaanam commented at 1:05 am on January 29, 2023:
    Actually, the wallet will always be unlocked if private keys are disabled, because wallet encryption is not allowed in a watch-only wallet. See: https://github.com/bitcoin/bitcoin/blob/master/src/wallet/rpc/encrypt.cpp#L224
  28. pablomartin4btc commented at 4:40 pm on January 25, 2023: member
    re-ack b2e2829014c5c9a3dc02eab7fcd77dd907228c78. I’ll check again once you make the changes you commented to @furszy.
  29. hernanmarino commented at 4:43 pm on January 25, 2023: contributor
    Re ACK b2e2829014c5c9a3dc02eab7fcd77dd907228c78
  30. in src/wallet/wallet.cpp:3410 in 4724f0b82c outdated
    3284@@ -3284,7 +3285,7 @@ bool CWallet::Lock()
    3285         return false;
    3286 
    3287     {
    3288-        LOCK(cs_wallet);
    3289+        LOCK2(m_relock_mutex, cs_wallet);
    


    furszy commented at 2:27 pm on February 4, 2023:

    Was surprised to see that this wasn’t causing any lock ordering deadlock. Because, for example, ChangeWalletPassphrase locks cs_wallet and then call Lock() (which locks m_relock_mutex first, which is the opposite lock ordering direction).

    So, got deeper over it and was able to trigger the deadlock detection on the GUI “change wallet passphrase” workflow. Can be replicated by enabling the debug mode (–enable-debug at configure time) and try to change the wallet passphrase there (soft will abort right away).

    This does not happen on the RPC command because this two recursive locks are acquired at the ChangeWalletPassphrase caller, which locks them in the proper order (unlike the GUI’s changeWalletPassphrase interface method).


    ishaanam commented at 3:08 pm on February 6, 2023:
    Good catch, I was able to replicate this locally. Should be fixed now.
  31. ishaanam force-pushed on Feb 6, 2023
  32. pablomartin4btc commented at 6:17 pm on February 6, 2023: member
    re-ack fcaaa5fd905a21298bd69343c9756e32a70870e6 - fix lock ordering deadlock.
  33. in test/functional/wallet_importdescriptors.py:689 in fcaaa5fd90 outdated
    684+        encrypted_wallet = self.nodes[0].get_wallet_rpc("encrypted_wallet")
    685+        encrypted_wallet.walletpassphrase("passphrase", 1)
    686+
    687+        descriptor["timestamp"] = 0
    688+        descriptor["next_index"] = 0
    689+        encrypted_wallet.importdescriptors([descriptor])
    


    achow101 commented at 10:14 pm on February 6, 2023:

    In fcaaa5fd905a21298bd69343c9756e32a70870e6 “test: test rescanning encrypted wallets”

    I suspect that this test is going to be the source of many intermittent failures in the future. We already observe a failure in CI for this test. Since this is testing a race condition, the speed of the machine executing the test becomes a significant factor, and having sanitizers being run can cause the test to fail.

    I think we might be able to workaround this by using a batched RPC request as IIRC those are run in a single thread sequentially.


    ishaanam commented at 4:55 am on February 7, 2023:
    I agree that this test would have been problematic. I have changed the importdescriptors and rescanblockchain tests added in this PR to use a batched RPC request.
  34. ishaanam force-pushed on Feb 7, 2023
  35. DrahtBot added the label Needs rebase on Feb 13, 2023
  36. wallet: keep track of when the passphrase is needed when rescanning
    Wallet passphrases are needed to top up the keypool during a
    rescan. The following RPCs need the passphrase when rescanning:
        - `importdescriptors`
        - `rescanblockchain`
    
    The following RPCs use the information about whether or not the
    passphrase is being used to ensure that full rescans are able to
    take place:
        - `walletlock`
        - `encryptwallet`
        - `walletpassphrasechange`
    66a86ebabb
  37. wallet: ensure that the passphrase is not deleted from memory when being used to rescan
    `m_relock_mutex` is introduced so that the passphrase is not
    deleted from memory when the timeout provided in
    `walletpassphrase` is up, but the wallet is still rescanning.
    493b813e17
  38. test: test rescanning encrypted wallets 6a5b348f2e
  39. ishaanam force-pushed on Feb 15, 2023
  40. DrahtBot removed the label Needs rebase on Feb 15, 2023
  41. ishaanam commented at 1:55 pm on February 15, 2023: contributor
    Rebased, I also increased the walletpassphrase timeout in both of the tests.
  42. achow101 commented at 6:10 pm on February 15, 2023: member
    ACK 6a5b348f2e526f048d0b448b01f6c4ab608569af
  43. DrahtBot requested review from hernanmarino on Feb 15, 2023
  44. hernanmarino commented at 1:04 pm on February 16, 2023: contributor
    ACK 6a5b348f2e526f048d0b448b01f6c4ab608569af
  45. furszy approved
  46. furszy commented at 3:26 pm on February 21, 2023: member

    Tested ACK 6a5b348f

    No longer crashes due the deadlock.

  47. DrahtBot requested review from furszy on Feb 21, 2023
  48. achow101 merged this on Feb 21, 2023
  49. achow101 closed this on Feb 21, 2023

  50. sidhujag referenced this in commit 052eccd80c on Feb 25, 2023
  51. in test/functional/wallet_transactiontime_rescan.py:199 in 6a5b348f2e
    194+            self.generatetoaddress(usernode, COINBASE_MATURITY + 1, temp_wallet.getnewaddress())
    195+
    196+            minernode.createwallet("encrypted_wallet", blank=True, passphrase="passphrase", descriptors=False)
    197+            encrypted_wallet = minernode.get_wallet_rpc("encrypted_wallet")
    198+
    199+            encrypted_wallet.walletpassphrase("passphrase", 1)
    


    maflcko commented at 4:03 pm on March 2, 2023:

    This fails for me when running in valgrind:

     0 node0 2023-02-28T19:50:00.090808Z (mocktime: 2023-03-30T19:35:09Z) [httpworker.1] [rpc/request.cpp:179] [parse] [rpc] ThreadRPCServer method=walletpassphrase user=__cookie__ 
     1 node0 2023-02-28T19:50:08.966047Z (mocktime: 2023-03-30T19:35:09Z) [httpworker.1] [rpc/server.cpp:560] [RPCRunLater] [rpc] queue run of timer lockwallet(encrypted_wallet) in 1 seconds (using HTTP) 
     2 node0 2023-02-28T19:50:08.995457Z (mocktime: 2023-03-30T19:35:09Z) [http] [httpserver.cpp:239] [http_request_cb] [http] Received a POST request for /wallet/encrypted_wallet from 127.0.0.1:42774 
     3 node0 2023-02-28T19:50:08.998881Z (mocktime: 2023-03-30T19:35:09Z) [httpworker.2] [rpc/request.cpp:179] [parse] [rpc] ThreadRPCServer method=sethdseed user=__cookie__ 
     4 node0 2023-02-28T19:51:12.893784Z (mocktime: 2023-03-30T19:35:09Z) [httpworker.2] [wallet/scriptpubkeyman.h:251] [WalletLogPrintf] [encrypted_wallet] keypool added 800 keys (400 internal), size=800 (400 internal) 
     5 node0 2023-02-28T19:51:12.953236Z (mocktime: 2023-03-30T19:35:09Z) [httpworker.2] [wallet/scriptpubkeyman.h:251] [WalletLogPrintf] [encrypted_wallet] LegacyScriptPubKeyMan::NewKeyPool rewrote keypool 
     6 node0 2023-02-28T19:51:12.964195Z (mocktime: 2023-03-30T19:35:09Z) [http] [httpserver.cpp:239] [http_request_cb] [http] Received a POST request for /wallet/encrypted_wallet from 127.0.0.1:53062 
     7 node0 2023-02-28T19:51:12.966738Z (mocktime: 2023-03-30T19:35:09Z) [httpworker.3] [rpc/request.cpp:179] [parse] [rpc] ThreadRPCServer method=sethdseed user=__cookie__ 
     8 test  2023-02-28T19:51:12.976000Z TestFramework (ERROR): JSONRPC error 
     9                                   Traceback (most recent call last):
    10                                     File "/root/bitcoin-core/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 134, in main
    11                                       self.run_test()
    12                                     File "/root/bitcoin-core/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/wallet_transactiontime_rescan.py", line 200, in run_test
    13                                       encrypted_wallet.sethdseed(seed=hd_seed)
    14                                     File "/root/bitcoin-core/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/coverage.py", line 49, in __call__
    15                                       return_val = self.auth_service_proxy_instance.__call__(*args, **kwargs)
    16                                     File "/root/bitcoin-core/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/authproxy.py", line 149, in __call__
    17                                       raise JSONRPCException(response['error'], status)
    18                                   test_framework.authproxy.JSONRPCException: Error: Please enter the wallet passphrase with walletpassphrase first. (-13)
    

    Maybe the timeout can be increased to 999999?

    Also I wonder why you picked a timeout of 3 below, or why walletpassphrase needs to be called at all twice in a row?

    If you want to confirm that the wallet can’t be relocked while the rescan is in progress, wouldn’t it be better to call walletlock rpc right after the rescan rpc?


    ishaanam commented at 7:22 pm on March 2, 2023:

    Maybe the timeout can be increased to 999999?

    Yes, the timeout before sethdseed can be increased to 999999, followed by running walletlock. I can open a follow-up PR for this.

    Also I wonder why you picked a timeout of 3 below, or why walletpassphrase needs to be called at all twice in a row?

    3 was used because it needed to be a number that was large enough so that the scan would begin, but small enough so that the timeout would occur while the rescan is in progress. However, this number can be too large or too small depending on the speed of the machine.

    If you want to confirm that the wallet can’t be relocked while the rescan is in progress, wouldn’t it be better to call walletlock rpc right after the rescan rpc?

    I don’t completely understand this suggestion. Do you mean calling walletlock to ensure that the wallet is still unlocked? If so, I think there are two problems with that:

    • walletlock doesn’t throw an error if the wallet is already locked
    • The wallet is supposed to re-lock immediately after the rescan, so the wallet might already be locked by the time walletlock is called, even if it did not lock during the rescan. However, using a batched RPC request might help with this.

    maflcko commented at 9:45 am on March 3, 2023:

    Do you mean calling walletlock to ensure that the wallet is still unlocked?

    No, I mean calling walletlock to lock the wallet. So the execution would look like:

    0-> walletpassphrase 99999 (start)
    1<- walletpassphrase (end)
    2-> rescanblockchain (start)
    3-> walletlock (start)
    4<- rescanblockchain (finish, release mutex)
    5<- walletlock (take mutex, end)
    

    Without the fix in this pull, walletlock might lock immediately and then likely cause rescanblockchain to fail?


    ishaanam commented at 10:33 pm on March 4, 2023:
    Ok, I think I understand what you mean now. I have implemented this using multi-threading in #27199. However, for me this no longer fails without the fix.

    maflcko commented at 12:25 pm on March 13, 2023:

    However, for me this no longer fails without the fix.

    For me, nothing fails. Unless I specifically add a manual sleep on the code before the fix:

     0diff --git a/src/wallet/wallet.cpp b/src/wallet/wallet.cpp
     1index eed40f9462..d5b5af0d4c 100644
     2--- a/src/wallet/wallet.cpp
     3+++ b/src/wallet/wallet.cpp
     4@@ -1851,6 +1851,8 @@ CWallet::ScanResult CWallet::ScanForWalletTransactions(const uint256& start_bloc
     5     WalletLogPrintf("Rescan started from block %s... (%s)\n", start_block.ToString(),
     6                     fast_rescan_filter ? "fast variant using block filters" : "slow variant inspecting all blocks");
     7 
     8+    UninterruptibleSleep(4s);
     9+
    10     fAbortRescan = false;
    11     ShowProgress(strprintf("%s " + _("Rescanning…").translated, GetDisplayName()), 0); // show rescan progress in GUI as dialog or on splashscreen, if rescan required on startup (e.g. due to corruption)
    12     uint256 tip_hash = WITH_LOCK(cs_wallet, return GetLastBlockHash());
    
  52. achow101 referenced this in commit db03248070 on Mar 16, 2023
  53. sidhujag referenced this in commit 5b7255486c on Mar 17, 2023
  54. sidhujag referenced this in commit 4923d40056 on Mar 17, 2023
  55. Ronnie6464 approved
  56. Ronnie6464 changes_requested
  57. Ronnie6464 commented at 1:55 am on May 28, 2023: none
    No
  58. Ronnie6464 approved
  59. Ronnie6464 commented at 2:47 am on May 28, 2023: none
  60. Ronnie6464 approved
  61. Ronnie6464 commented at 2:50 am on May 28, 2023: none
    test/functional/wallet_transactiontime_rescan.py
  62. bitcoin deleted a comment on May 28, 2023
  63. bitcoin deleted a comment on May 28, 2023
  64. bitcoin deleted a comment on May 28, 2023
  65. bitcoin locked this on May 28, 2023
  66. bitcoin unlocked this on May 28, 2023
  67. bitcoin locked this on May 28, 2023
  68. bitcoin deleted a comment on May 28, 2023
  69. bitcoin deleted a comment on May 28, 2023
  70. ishaanam deleted the branch on Nov 30, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-24 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me