rpc: Relock wallet only if most recent callback #18814

pull promag wants to merge 2 commits into bitcoin:master from promag:2020-04-rpc-wallet-relock changing 3 files +14 −4
  1. promag commented at 9:48 am on April 29, 2020: member

    This PR fixes an early relocking race condition from #18811 where old relock callback runs after new wallet unlock code and nRelockTime update but before rpcRunLater call, causing early relock and incorrect nRelockTime time

    Issue introduced in #18487. Fixes #18811.

  2. fanquake added the label RPC/REST/ZMQ on Apr 29, 2020
  3. rpc: Add mutex to guard deadlineTimers a2e6db5c4f
  4. promag force-pushed on Apr 29, 2020
  5. DrahtBot commented at 10:56 am on April 29, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #18452 (qt: Fix shutdown when waitfor* cmds are called from RPC console by hebasto)
    • #18354 (Protect wallet by using shared pointers by bvbfan)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  6. in src/wallet/rpcwallet.cpp:1968 in 5a0486cd15 outdated
    1958@@ -1957,9 +1959,11 @@ static UniValue walletpassphrase(const JSONRPCRequest& request)
    1959     // wallet before the following callback is called. If a valid shared pointer
    1960     // is acquired in the callback then the wallet is still loaded.
    1961     std::weak_ptr<CWallet> weak_wallet = wallet;
    1962-    pwallet->chain().rpcRunLater(strprintf("lockwallet(%s)", pwallet->GetName()), [weak_wallet] {
    1963+    pwallet->chain().rpcRunLater(strprintf("lockwallet(%s)", pwallet->GetName()), [weak_wallet, relock_time] {
    1964         if (auto shared_wallet = weak_wallet.lock()) {
    1965             LOCK(shared_wallet->cs_wallet);
    1966+            // Skip if this is not the most recent rpcRunLater callback.
    1967+            if (shared_wallet->nRelockTime != relock_time) return;
    


    ryanofsky commented at 12:04 pm on April 29, 2020:

    I think this could introduce a bug where wallet never relocks.

    In case where walletpassphrase is called from two threads, first thread sets pwallet->nRelockTime and is interrupted after constructing lambda but before invoking rpcRunLater. Second thread sets pwallet->nRelockTime to a different value and calls rpcRunLater, and then first thread wakes up and calls rpcRunLater.

    Lambda from second thread never runs because lambda from first thread replaced it. But lambda from first thread has the wrong relock_time value so never relocks the wallet.

    Simplest fix might be adding relock time to timer names “lockwallet(%s,%i)” above to make them unique


    promag commented at 12:29 pm on April 29, 2020:
    That would make timers list grow - they are never removed, just replaced.

    promag commented at 9:09 pm on April 29, 2020:
    Added a mutex to prevent concurrent calls with the same wallet. The lock is held during wallet Unlock and RPCRunLater so the case you describe can’t happen.

    promag commented at 10:30 pm on May 3, 2020:
    Nice catch BTW.
  7. MarcoFalke commented at 12:22 pm on April 29, 2020: member
    Would be nice to write a test for this
  8. promag commented at 12:38 pm on April 29, 2020: member
    Thanks for reviewing @ryanofsky. I’ll address that case. @MarcoFalke probably a unit test no?
  9. MarcoFalke commented at 12:56 pm on April 29, 2020: member
    Yes, I haven’t looked at the code changes, but I assume some kind of unit test makes sense.
  10. MarcoFalke commented at 1:17 pm on April 29, 2020: member
    Also, this needs backport, right?
  11. MarcoFalke added this to the milestone 0.20.0 on Apr 29, 2020
  12. MarcoFalke added the label Needs backport (0.20) on Apr 29, 2020
  13. promag force-pushed on Apr 29, 2020
  14. ryanofsky approved
  15. ryanofsky commented at 9:52 pm on April 29, 2020: member

    Code review ACK b0f1e19f82d725dbefb4b0ac69e4d74f2deb1d44. I think all the problems we’ve seen so far are avoided:

    1. Original deadlock from #18487 where rpcRunLater blocks waiting for old relock callback in timer thread which is deadlocked waiting for cs_wallet
    2. Early relocking race condition from #18811 where old relock callback runs after new wallet unlock code and nRelockTime update but before rpcRunLater call, causing early relock and incorrect nRelockTime time
    3. Failure to relock race condition in 5a0486cd15c26e5e587f56f331dec3a7756407b2 where two relock callbacks could each cancel the other out without either of them relocking

    Everything would be simpler if rpcRunLater would just activate the new callback after the old callback if the old callback happened to be running, instead of blocking waiting for the old callback to finish. Then walletpassphrase could go back to using cs_wallet and multiple mutexes wouldn’t be required. But I think the approach in this PR should work too and is probably a smaller change on top of existing code.

    Only additional suggestion would be to write a PR description that summarizes what this is fixing without having to read the other thread. A unit test could be a good idea, but might be difficult to write, so I don’t think it would be crazy to merge this without a new test if it’s difficult to come up with one. Especially since the bug this is fixing was caught by a python test.

  16. promag commented at 9:58 pm on April 29, 2020: member

    A unit test could be a good idea, but might be difficult to write

    FWIW I’m still trying this.

  17. promag commented at 10:11 pm on April 29, 2020: member

    instead of blocking waiting for the old callback to finish.

    BTW If we bump minimum libevent then we could use this https://github.com/libevent/libevent/blob/29cc8386a2f7911eaa9336692a2c5544d8b4734f/whatsnew-2.1.txt#L248-L265

  18. MarcoFalke commented at 10:30 pm on April 29, 2020: member
    Bumping libevent should be discussed separately. See also #18676
  19. ryanofsky commented at 10:37 pm on April 29, 2020: member
    Also, I haven’t looked but I doubt there’s a need for a special libevent feature to improve rpcRunLater. We should just need an extra bit of code at the end of the existing libevent callback to register a new callback if there’s one pending
  20. promag commented at 10:34 pm on May 3, 2020: member

    FWIW I’m still trying this.

    And I quit. Tried with unit test but it required to start HTTP/RPC server because of libevent and timer stuff. Then tried funcional test but can’t figure how to hit the problem. @ryanofsky I’ll borrow your comment above to improve PR description.

  21. ryanofsky commented at 0:38 am on May 7, 2020: member

    This still looks good but it might be simpler to squash second and third commits to avoid having a bug in the second commit which is fixed by the third commit.

    This would simplify the PR description so you could drop the paragraph “Failure to relock race condition in 59a50e7” about the temporary bug.

  22. rpc: Relock wallet only if most recent callback 9f59dde974
  23. promag force-pushed on May 7, 2020
  24. promag commented at 0:45 am on May 7, 2020: member
    Done.
  25. ryanofsky commented at 1:04 am on May 7, 2020: member

    Code review ACK 9f59dde9740d065118bdddde75ef9f4e4603a7b1. No changes since last review except squashing commits.

    Two other thoughts:

    • It might be nice to add a comment for RPCRunLater that mentions the risk of deadlocks:

       0/** 
       1 * Run func nSeconds from now. 
       2 * Overrides previous timer <name> (if any).
       3 *
       4 * If previous timer has started executing, this will block waiting for the call
       5 * to return. Therefore, to avoid deadlocks, RPCRunLater can't be called while
       6 * holding locks that a previous timer might be waiting for.
       7 *
       8 * [@todo](/bitcoin-bitcoin/contributor/todo/) Make implementation nonblocking to avoid risk of deadlocks and simplify
       9 * usage.
      10 */ 
      11void RPCRunLater(const std::string& name, std::function<void()> func, int64_t nSeconds);
      
    • This is an idea for a new PR, but maybe there should be an RPC function to return a timer’s scheduled time so we could drop CWallet::nRelockTime
  26. MarcoFalke commented at 2:45 pm on May 11, 2020: member
    ACK 9f59dde9740d065118bdddde75ef9f4e4603a7b1
  27. jonatack commented at 2:32 pm on May 12, 2020: member

    ACK 9f59dde9740d065118bdddde75ef

    I like @ryanofsky’s suggestions in #18814 (comment).

  28. fanquake merged this on May 13, 2020
  29. fanquake closed this on May 13, 2020

  30. promag deleted the branch on May 13, 2020
  31. promag commented at 10:01 am on May 13, 2020: member
    @ryanofsky honestly I’d suggest to avoid using it. Instead of timeout what I think would be best is to exclusively “bind” the unlock to the current RPC connection. In the GUI we could have a timer. I don’t see other uses of the timer in the server.
  32. sidhujag referenced this in commit c7204ec263 on May 14, 2020
  33. fanquake referenced this in commit 0342f4ad67 on May 14, 2020
  34. fanquake referenced this in commit 2e31113f8c on May 14, 2020
  35. fanquake removed the label Needs backport (0.20) on May 14, 2020
  36. fanquake referenced this in commit ca4dac48c5 on May 15, 2020
  37. fanquake referenced this in commit 251e321ad7 on May 15, 2020
  38. MarcoFalke referenced this in commit 17bdf2afae on May 15, 2020
  39. Fabcien referenced this in commit afc4263288 on Jan 6, 2021
  40. backpacker69 referenced this in commit 6adf27d814 on Mar 28, 2021
  41. backpacker69 referenced this in commit f75b8d27c0 on Mar 28, 2021
  42. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-17 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me