wallet: Reorder locks in dumpwallet to avoid lock order assertion #22492

pull achow101 wants to merge 2 commits into bitcoin:master from achow101:dumpwallet-lock-order changing 3 files +53 −35
  1. achow101 commented at 3:12 am on July 19, 2021: member

    When a wallet is loaded which has an unconfirmed transaction in the mempool, it will end up establishing the lock order of cs_wallet -> cs_main -> cs_KeyStore. If dumpwallet is used on this wallet, then a lock order of cs_wallet -> cs_KeyStore -> cs_main will be used, which causes a lock order assertion. This PR fixes this by reordering dumpwallet and GetKeyBirthTimes (only used by dumpwallet). Specifically, in both functions, the function calls which lock cs_main are done prior to locking cs_KeyStore. This avoids the lock order issue.

    Additionally, I have added a test case to wallet_dump.py. Of course testing this requires --enable-debug.

    Fixes #22489

  2. DrahtBot added the label RPC/REST/ZMQ on Jul 19, 2021
  3. DrahtBot added the label Wallet on Jul 19, 2021
  4. MarcoFalke added this to the milestone 22.0 on Jul 19, 2021
  5. in test/functional/wallet_dump.py:216 in 9454cbfbd6 outdated
    208@@ -209,6 +209,15 @@ def run_test(self):
    209         with self.nodes[0].assert_debug_log(['Flushing wallet.dat'], timeout=20):
    210             self.nodes[0].getnewaddress()
    211 
    212+        # Make sure that dumpwallet doesn't have a lock order issue when there is an unconfirmed tx and it is reloaded
    213+        # See https://github.com/bitcoin/bitcoin/issues/22489
    214+        self.nodes[0].createwallet("w3")
    215+        w3 = self.nodes[0].get_wallet_rpc("w3")
    216+        w3.generatetoaddress(101, w3.getnewaddress())
    


    MarcoFalke commented at 8:56 am on July 19, 2021:

    nit: Wouldn’t import be faster than generate?

    0w3.importprivkey(privkey=self.nodes[0].get_deterministic_priv_key().key, label='coinbase_import')
    

    achow101 commented at 4:35 pm on July 19, 2021:
    For some reason I thought this was a clean chain. Changed as suggested.
  6. MarcoFalke commented at 9:01 am on July 19, 2021: member
    Potentially introduced in pull #16426 ?
  7. ryanofsky commented at 2:39 pm on July 19, 2021: member

    What code path is waiting to acquire cs_keystore while cs_main is acquired? (EDIT: It is this requestMempoolTransactions call).

    The goal of #16426 was to not acquire wallet locks with cs_main acquired, and in general to avoid doing any slow stuff with cs_main acquired, so a slow wallet would not slow down the node.

    It seems like the fix should be the opposite of this and whichever code path is locking cs_keystore while holding up cs_main should be changed not to do that, instead of changing these code paths.

  8. ryanofsky commented at 3:51 pm on July 19, 2021: member
    #15719 should fix this by no longer trying to acquire wallet locks while cs_main is held. The call order there is CWallet::AttachChain calling handleNotifications calling mempool_fn on the validationinterface queue thread, which calls transactionAddedToMempool without any locks held
  9. MarcoFalke commented at 4:02 pm on July 19, 2021: member
    #15719 looks a bit large to get into 22.0. Though, this doesn’t seem like a regression (it might exist in 21.x), so maybe we just ignore it for 22.0?
  10. ryanofsky commented at 4:20 pm on July 19, 2021: member

    #15719 looks a bit large to get into 22.0. Though, this doesn’t seem like a regression (it might exist in 21.x), so maybe we just ignore it for 22.0?

    Yes would not want to use #15719 to fix this issue if it needs to be fixed this release. I didn’t verify but would guess the issue was probably introduced with #16426 like you suggested and not a recent regression.

    Also, I think probably (but could be wring) there may not even be a deadlock here in release builds and this could be a debug_lockerorder false-positive. It seems like using one lock order before the wallet is loaded and a slightly different lock order after it is loaded wouldn’t be a problem without the lock order debug assertions.

  11. Reorder dumpwallet so that cs_main functions go first
    DEBUG_LOCKORDER expects cs_wallet, cs_main, and cs_KeyStore to be
    acquired in that order. However dumpwallet would take these in the order
    cs_wallet, cs_KeyStore, cs_main. So when configured with
    `--enable-debug`, it is possible to hit the lock order assertion when
    using dumpwallet.
    
    To fix this, cs_wallet and cs_KeyStore are no longer locked at the same
    time. Instead cs_wallet will be locked first. Then the functions which
    lock cs_main will be run. Lastly cs_KeyStore will be locked afterwards.
    This avoids the lock order issue.
    
    Furthermore, since GetKeyBirthTimes (only used by dumpwallet) also uses
    a function that locks cs_main, and itself also locks cs_KeyStore, the
    same reordering is done here.
    25d99e6511
  12. tests: Test for dumpwallet lock order issue
    Adds a test for the condition which can trigger a lock order assertion.
    Specifically, there must be an unconfirmed transaction in the mempool
    which belongs to the wallet being loaded. This will establish the order
    of cs_wallet -> cs_main -> cs_KeyStore. Then dumpwallet is called on
    that wallet. Previously, this would have used a lock order of cs_wallet
    -> cs_KeyStore -> cs_main, but this should be fixed now. The test
    ensures that.
    9b85a5e2f7
  13. MarcoFalke commented at 4:26 pm on July 19, 2021: member

    It seems like using one lock order before the wallet is loaded and a slightly different lock order after it is loaded wouldn’t be a problem without the lock order debug assertions.

    If there are more than one rpc threads loading/using wallets, it could deadlock?

  14. achow101 force-pushed on Jul 19, 2021
  15. achow101 commented at 4:35 pm on July 19, 2021: member

    LegacyScriptPubKeyMan::cs_KeyStore can still be acquired before cs_main:

    Fixed by putting cs_KeyStore within a scope.

    Also, I think probably (but could be wring) there may not even be a deadlock here in release builds and this could be a debug_lockerorder false-positive. It seems like using one lock order before the wallet is loaded and a slightly different lock order after it is loaded wouldn’t be a problem without the lock order debug assertions.

    I also think this is just a false positive. The locks will be released once the wallet is available to be used by the RPC.

    If there are more than one rpc threads loading/using wallets, it could deadlock?

    I don’t think so since those locks are only acquired during wallet loading. While the wallet is still loading, it can’t be accessed by other RPC threads.

  16. ryanofsky commented at 5:03 pm on July 19, 2021: member

    If there are more than one rpc threads loading/using wallets, it could deadlock?

    I don’t think so either for the same reason achow said. Each wallet has it’s own wallet and keystore locks, and only one thread can load a wallet at time. So whatever lock order is used by that thread for those locks during initialization does not seem relevant. I could imagine fixing this issue by adding an IGNORE_INIT_LOCKORDER(mutex) macro and putting something like

    0IGNORE_INIT_LOCKORDER(walletInstance->cs_wallet);
    1IGNORE_INIT_LOCKORDER(walletInstance->cs_keystore);
    

    around the loading code.

    But this PR’s approach of just reducing lock scopes is also pretty nice, so it does seem like a good way to go.

  17. in src/wallet/rpcdump.cpp:770 in 25d99e6511 outdated
    766     wallet.GetKeyBirthTimes(mapKeyBirth);
    767 
    768+    int64_t block_time = 0;
    769+    CHECK_NONFATAL(wallet.chain().findBlock(wallet.GetLastBlockHash(), FoundBlock().time(block_time)));
    770+
    771+    // Note: To avoid a lock order issue, access to cs_main must be locked before cs_KeyStore.
    


    ryanofsky commented at 5:23 pm on July 19, 2021:

    In commit “Reorder dumpwallet so that cs_main functions go first” (25d99e6511d8c43b2025a89bcd8295de755346a7)

    Note: this comment is good to keep here, but I just want to remind myself for later that this part will no longer be true and could be dropped with #15719, which avoids nested locking on the init thread

  18. ryanofsky approved
  19. ryanofsky commented at 5:28 pm on July 19, 2021: member
    Code review ACK 9b85a5e2f7e003ca8621feaac9bdd304d19081b4. Nice to reduce lock scope, and good test!
  20. ghost commented at 6:28 pm on July 19, 2021: none
    1. Without --enable-debug: No issues #22489 (comment)
    2. With --enable-debug: (a) mempool size > 0: Error as mentioned in Issue description #22489#issue-947130115 ❌ (b) mempool size = 0: No issues ✅
    1. Without --enable-debug: No issues ✅
    2. With --enable-debug: No issues ✅

    Steps followed to test:

    1. If compiling with --enable-debug use:
      0./configure BDB_LIBS="-L${BDB_PREFIX}/lib -ldb_cxx-4.8" BDB_CFLAGS="-I${BDB_PREFIX}/include" --enable-debug
      
      Else use:
      0./configure BDB_LIBS="-L${BDB_PREFIX}/lib -ldb_cxx-4.8" BDB_CFLAGS="-I${BDB_PREFIX}/include"
      
    2. Create a wallet: bitcoin-cli createwallet "W1"
    3. Check mempool: bitcoin-cli getmempoolinfo
    4. Dump wallet: bitcoin-cli dumpwallet "W1_DUMP"
  21. unknown approved
  22. DrahtBot commented at 7:11 pm on July 19, 2021: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  23. lsilva01 approved
  24. lsilva01 commented at 8:50 pm on July 19, 2021: contributor
    Tested ACK https://github.com/bitcoin/bitcoin/pull/22492/commits/9b85a5e2f7e003ca8621feaac9bdd304d19081b4 under the same conditions reported in issue #22489 and the dumpwallet command completed successfully.
  25. MarcoFalke commented at 1:03 pm on July 20, 2021: member

    review ACK 9b85a5e2f7e003ca8621feaac9bdd304d19081b4 🎰

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3review ACK 9b85a5e2f7e003ca8621feaac9bdd304d19081b4 🎰
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUgBLQv+PmySuwM9ak0Ck/qctmtnTO9CqXBcld9qmnp1zJJ7Wom+7qZ2DXPDAMk0
     8LBpscaaSFGQQtZIo7hzTmRPvbJwHOerAxfK1eH24PmfbVj+ojzhupJLaRGBPHfRm
     9Zm0RtB+7kJybgVqEFa3x4r5EoO6Vr76haQrF3pnZMPVFmlHu9H8EEkHu5BzJn4cM
    10A+weXKOCHFsHQByVpMC5yN3DDFKxyxhrhpS6RrTYz+7UGmj7ccJYT30nqPS+TXIm
    11vtJ5++AP3oAhw11SRJQ3DPgRWkM0dRR25cLiY/CmBgrj1uXCamPqkvJ+ATmWp1z2
    12LcicfIih6+foIBGn3sgmrGPK+xC7e6ecGRbB+1YJ98Vt36Fx5tOOGnT4lI0qsQDZ
    13crhSzdGdDc5/L0R1bzCrT5F1gPJvP52tyTfviLJ4SBZUJFAvos6LQ2sEmJ0piUq8
    14AAnr6jPT2/X1my18j+JHWQ8LWYsKqKRM/q2NCjv1pWsornjWEuQ0zJ/5ZMO9YOPk
    15TuVq9rGv
    16=obrS
    17-----END PGP SIGNATURE-----
    

    Timestamp of file with hash ba92f66df23beaffeb1a2b71aca1f1e08095c35afb64b1352fde0b4ef3af3b6b -

  26. MarcoFalke merged this on Jul 20, 2021
  27. MarcoFalke closed this on Jul 20, 2021

  28. sidhujag referenced this in commit f21adc34d3 on Jul 23, 2021
  29. gwillen referenced this in commit 71045e76ea on Jun 1, 2022
  30. DrahtBot locked this on Aug 16, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me