Fix crash when closing wallet #835

pull furszy wants to merge 1 commits into bitcoin-core:master from furszy:2024_gui_fix_wallet_close_crash changing 2 files +13 โˆ’8
  1. furszy commented at 10:01 pm on September 12, 2024: member

    The crash occurs because WalletController::removeAndDeleteWallet is called twice for the same wallet model: first in the GUI’s button connected function WalletController::closeWallet, and then again when the backend emits the WalletModel::unload signal.

    This causes the issue because removeAndDeleteWallet inlines an erase(std::remove()). So, if std::remove returns an iterator to the end (indicating the element wasn’t found because it was already erased), the subsequent call to erase leads to an undefined behavior.

    Test Notes: Try closing any wallet using the toolbar button in the GUI. It will crash in master, but not here.

    Fixes https://github.com/bitcoin/bitcoin/issues/30887.

  2. DrahtBot commented at 10:01 pm on September 12, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK pablomartin4btc, jarolrod, hebasto

    If your review is incorrectly listed, please react with ๐Ÿ‘Ž to this comment and the bot will ignore it on the next update.

  3. maflcko added this to the milestone 28.0 on Sep 12, 2024
  4. jarolrod added the label Bug on Sep 12, 2024
  5. gui: fix crash when closing wallet
    The crash occurs because 'WalletController::removeAndDeleteWallet' is called
    twice for the same wallet model: first in the GUI's button connected function
    'WalletController::closeWallet', and then again when the backend emits the
    'WalletModel::unload' signal.
    
    This causes the issue because 'removeAndDeleteWallet' inlines an
    erase(std::remove()). So, if 'std::remove' returns an iterator to the end
    (indicating the element wasn't found because it was already erased), the
    subsequent call to 'erase' leads to an undefined behavior.
    a965f2bc07
  6. furszy force-pushed on Sep 12, 2024
  7. pablomartin4btc commented at 11:36 pm on September 12, 2024: contributor

    ACK a965f2bc07a3588f8c2b8d6a542961562e3f5d0e

    Went thru the code and as commented in the new WalletController::removeWallet function that’s now been called instead of removeAndDeleteWallet, it will end up calling later removeAndDeleteWallet (within the connection to unload signal in WalletController::getOrCreateWallet - unload triggered by wallet().remove()) which will emmit walletRemoved connected to BitcoinGUI::removeWallet (this for example will remove the wallet from the combo box and so on).

    I’ll test this in a few hours.

  8. jarolrod approved
  9. jarolrod commented at 5:11 am on September 13, 2024: member

    ACK a965f2bc07a3588f8c2b8d6a542961562e3f5d0e

    This accurately diagnoses and fixes the referenced issue. I’ve thoroughly tested this on Linux, macOS, and Windows; and both confirm the existence of the bug and that this fixes it. I’ve also sanity tested other related actions in the GUI in different scenarios, and also sanity tested other non-wallet actions in the GUI for crashes.

    Thanks for the quick fix :)

  10. hebasto added the label Needs backport (28.x) on Sep 13, 2024
  11. hebasto renamed this:
    gui: fix crash when closing wallet
    Fix crash when closing wallet
    on Sep 13, 2024
  12. hebasto added the label Wallet on Sep 13, 2024
  13. pablomartin4btc approved
  14. pablomartin4btc commented at 9:02 am on September 13, 2024: contributor

    tACK a965f2bc07a3588f8c2b8d6a542961562e3f5d0e

    Tested on Linux, I’ve reproduced the segfault on both “Close Walllet” and “Close All Wallets” actions from the File menu items and this PR fixes it and related features work as expected (eg closing a wallet will remove it from the top right combo when multi wallets are loaded).

  15. hebasto commented at 10:02 am on September 13, 2024: member

    Concept ACK.

    git bisect points to 5d15485aafefdc759ba97e039bb1b9ccac267358 from https://github.com/bitcoin/bitcoin/pull/30659 as the root of the issue, which aligns with the other comment.

    It would be reasonable to add a dedicated test to test_bitcoin-qt in a follow-up.

  16. hebasto approved
  17. hebasto commented at 10:20 am on September 13, 2024: member

    ACK a965f2bc07a3588f8c2b8d6a542961562e3f5d0e.

    Now, the BitcoinGUI::removeWallet signal is emitted once per wallet for both the “Closet Wallet…” and “Close All Wallets…” menu actions.

  18. hebasto merged this on Sep 13, 2024
  19. hebasto closed this on Sep 13, 2024

  20. furszy deleted the branch on Sep 13, 2024
  21. achow101 referenced this in commit 674dded875 on Sep 13, 2024
  22. fanquake removed the label Needs backport (28.x) on Sep 13, 2024
  23. ryanofsky commented at 4:49 pm on September 13, 2024: contributor

    Code review ACK a965f2bc07a3588f8c2b8d6a542961562e3f5d0e

    This seems like a workable fix, but I think it it is not ideal. The underlying problem here is that https://github.com/bitcoin/bitcoin/pull/30659 moved NotifyUnload from FlushAndDeleteWallet to RemoveWallet, so now instead of NotifyUnload guaranteed being called once, it can now be called multiple times, but the GUI code assumes it is only called once and double deletes the wallet model and crashes. I think a better for this issue after reverting current fix would simply be:

     0--- a/src/qt/walletcontroller.cpp
     1+++ b/src/qt/walletcontroller.cpp
     2@@ -149,7 +149,7 @@ WalletModel* WalletController::getOrCreateWallet(std::unique_ptr<interfaces::Wal
     3     const bool called = QMetaObject::invokeMethod(wallet_model, "startPollBalance");
     4     assert(called);
     5 
     6-    connect(wallet_model, &WalletModel::unload, this, [this, wallet_model] {
     7+    connect(wallet_model, &WalletModel::unload, wallet_model, [this, wallet_model] {
     8         // Defer removeAndDeleteWallet when no modal widget is actively waiting for an action.
     9         // TODO: remove this workaround by removing usage of QDialog::exec.
    10         QWidget* active_dialog = QApplication::activeModalWidget();
    

    This works by simply attaching the signal that deletes the wallet model to the wallet model object, so it only deleted once.


    I think current fix a965f2bc07a3588f8c2b8d6a542961562e3f5d0e is not ideal and would suggest reverting it for a few reasons:

    • Fix is not complete. After bitcoin/bitcoin#30659 there’s no guarantee that NotifyUnload won’t be sent multiple times because it is called from RemoveWallet, and RemoveWallet is called from many places: from RPCs and the GUI and from shutdown code. The fix only addresses one case where the notification is sent twice.

    • Fix makes GUI wallet close sequence potentially slower and less reliable. Before this change, closing wallet in the GUI would remove the GUI immediately, but now it waits for a signal from the wallet before destroying the GUI. It’s not a big deal, but this approach is more fragile and potentially slower if wallet code is running in a separate process.

    • Fix add a new removeWallet method wrapping wallet().remove() call, which is ok, but I think there’s a benefit to getting rid of of similar sounding methods that are easy to confuse when possible.


    I can open a separate PR if makes sense to follow up on this, also happy to review if someone else wants to

  24. pablomartin4btc commented at 4:54 pm on September 13, 2024: contributor

    now instead of NotifyUnload guaranteed being called once, it can now be called multiple times, but the GUI code assumes it is only called once and double deletes the wallet model and crashes.

    How can I repro the crash?

  25. ryanofsky commented at 4:58 pm on September 13, 2024: contributor

    How can I repro the crash?

    I don’t know of an easy way to reproduce the crash after the current fix, but it is possible to revert the current fix and test the alternate suggested fix by using the same steps in the PR description, choosing close wallet from the GUI menu.

    It might also be possible to make the GUI crash after the current fix by enabling RPC and calling unloadwallet RPC simultaneously from multiple threads, but not sure about that.

  26. furszy commented at 8:32 pm on September 13, 2024: member

    This seems like a workable fix, but I think it it is not ideal. The underlying problem here is that https://github.com/bitcoin/bitcoin/pull/30659 moved NotifyUnload from FlushAndDeleteWallet to RemoveWallet, so now instead of NotifyUnload guaranteed being called once, it can now be called multiple times, but the GUI code assumes it is only called once and double deletes the wallet model and crashes.

    As RemoveWallet erases the wallet from the context’s wallets vector (locking the context’s mutex) prior to calling NotifyUnload, the only way it could be called twice is if we re-add the wallet in-between the two RemoveWallet calls. Which shouldn’t be possible anywhere.

    Fix makes GUI wallet close sequence potentially slower and less reliable. Before this change, closing wallet in the GUI would remove the GUI immediately, but now it waits for a signal from the wallet before destroying the GUI. It’s not a big deal, but this approach is more fragile and potentially slower if wallet code is running in a separate process.

    I’m not so agree here. I think the GUI should call the close method and wait asynchronously until the wallet’s backend emit the closure event prior to disconnecting the views (ideally with a progress dialog saying something like “Closing wallet..” and disallowing the specific wallet usage). This is because the closing procedure could also fail and leave the wallet loaded without GUI access. And for me, this even make more sense in a multiprocess world where every interface call would be better to handle it asynchronously due to the potentially longer wait time (we don’t want to freeze the main GUI thread) – not all with extra backend events as in this case, but waiting async for the response –.

  27. ryanofsky commented at 9:12 pm on September 13, 2024: contributor

    As RemoveWallet erases the wallet from the context’s wallets vector (locking the context’s mutex) prior to calling NotifyUnload, the only way it could be called twice is if we re-add the wallet in-between the two RemoveWallet calls. Which shouldn’t be possible anywhere.

    Oh, I didn’t notice there is a return false in the middle of RemoveWallet so it does look like RemoveWallet code path is safe and won’t (currently) notify more than once.

    Nevertheless it doesn’t make sense for the Qt signal handler that deletes the WalletModel to outlive the wallet_model, so the code I change I suggested is more correct than current code, would have prevented this bug, and could prevent future bugs, because as you say, wallet unloading is asynchronous, and there is a gap between time wallet unload is requested and the time it is actually deleted. So when the GUI wants to close the wallet it should update itself as soon as possible without putting itself in a vulnerable state and being sensitive to implementation details of RemoveWallet,

    I’m not so agree here. I think the GUI should call the close method and wait asynchronously until the wallet’s backend emit the closure event prior to disconnecting the views (ideally with a progress dialog saying something like “Closing wallet..” and disallowing the specific wallet usage). This is because the closing procedure could also fail and leave the wallet loaded without GUI access. And for me, this even make more sense in a multiprocess world where every interface call would be better to handle it asynchronously due to the potentially longer wait time.

    That’s a reasonable approach but the change in a965f2bc07a3588f8c2b8d6a542961562e3f5d0e makes that harder not easier to implement correctly because it keeps wallet model and GUI alive and potentially refreshing itself while unloading is happening. The correct way to implement that change would be to make wallet unload function take a std::function and return a interface::Handler object, and not confuse a signal meant to notifying the GUI when wallet is about to be unloaded with a signal notifying the GUI that a wallet has been unloaded, which are not the same thing.

  28. ryanofsky commented at 11:32 am on September 14, 2024: contributor

    One way to test difference between the current fix and suggested fix is to delay the notification:

     0--- a/src/wallet/wallet.cpp
     1+++ b/src/wallet/wallet.cpp
     2@@ -174,7 +174,10 @@ bool RemoveWallet(WalletContext& context, const std::shared_ptr<CWallet>& wallet
     3         context.wallets.erase(i);
     4     }
     5     // Notify unload so that upper layers release the shared pointer.
     6-    wallet->NotifyUnload();
     7+    std::thread([wallet]() {
     8+        std::this_thread::sleep_for(std::chrono::seconds(5));
     9+        wallet->NotifyUnload();
    10+    }).detach();
    11 
    12     // Write the wallet setting
    13     UpdateWalletSetting(chain, name, load_on_start, warnings);
    

    Testing this with the current fix, the wallet GUI stays visible for 5 seconds after closing the wallet, because it is waiting for the delayed notification. By contrast with original gui code and suggested fix:

     0--- a/src/qt/walletcontroller.cpp
     1+++ b/src/qt/walletcontroller.cpp
     2@@ -149,7 +149,7 @@ WalletModel* WalletController::getOrCreateWallet(std::unique_ptr<interfaces::Wal
     3     const bool called = QMetaObject::invokeMethod(wallet_model, "startPollBalance");
     4     assert(called);
     5 
     6-    connect(wallet_model, &WalletModel::unload, this, [this, wallet_model] {
     7+    connect(wallet_model, &WalletModel::unload, wallet_model, [this, wallet_model] {
     8         // Defer removeAndDeleteWallet when no modal widget is actively waiting for an action.
     9         // TODO: remove this workaround by removing usage of QDialog::exec.
    10         QWidget* active_dialog = QApplication::activeModalWidget();
    

    closing the wallet in the GUI doesn’t cause a 5 second wait, because when the wallet is closed from the GUI it just closes without waiting for a notification. The notification is only needed when wallet is unloaded externally by an RPC, not when the GUI itself unloads the wallet.

    It is true as suggested #835 (comment) that it might be useful for the wallet to send a different notification when wallet unloading completes and wallet is fully closed, but this would actually need to be a different notification. It might be useful to rename the current notification from “unload” to “remove” to clarify that it is called from the RemoveWallet function and sent when wallet is removed from the list of opened wallets, not when it is unloaded, which could be at some later time.

  29. TheCharlatan referenced this in commit 69282950aa on Sep 16, 2024
  30. TheCharlatan referenced this in commit dfe0cd4ec5 on Sep 16, 2024
  31. furszy commented at 9:18 pm on September 16, 2024: member

    A bit delayed but here. Sorry.

    About your fix, I’m not against it at all. Fully agree that is better than mine. Connecting the signal handler lifetime to the object that it deletes it is what we should be doing.

    I think removing the wallet from the GUI early in the process is effective, but it doesn’t handle failures well, as we’d need to reload all wallet views from scratch. That said, disabling all wallet actions in the GUI might be a bit tricky to implement with the current code. Still, the wallet.close() method should at least ensure that all listeners are disconnected, so no ‘refresh status’ signals should be sent to the GUI while the wallet is being unloaded (side note: weโ€™re currently not checking the RemoveWallet return value in the GUI -> this is not an issue now but it could be an issue if we ever add more stuff there).


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/gui. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-22 22:20 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me