wallet: Handle concurrent wallet loading #19300

pull promag wants to merge 2 commits into bitcoin:master from promag:2020-06-loadwallet changing 2 files +47 −2
  1. promag commented at 10:03 am on June 17, 2020: member

    This PR handles concurrent wallet loading.

    This can be tested by running in parallel the following script a couple of times:

    0for i in {1..10}
    1do
    2  src/bitcoin-cli -regtest loadwallet foo
    3  src/bitcoin-cli -regtest unloadwallet foo
    4done
    

    Eventually the error occurs:

    0error code: -4
    1error message:
    2Wallet already being loading.
    

    For reference, loading and already loaded wallet gives:

    0error code: -4
    1error message:
    2Wallet file verification failed. Error loading wallet w1. Duplicate -wallet filename specified.
    

    Fixes #19232.

  2. MarcoFalke commented at 10:11 am on June 17, 2020: member
    Does the crash also happen in a python functional test? If yes, mind to add one?
  3. promag commented at 10:14 am on June 17, 2020: member
    @MarcoFalke haven’t tried but I’d say it should - but it will be one of those loop-until-it-happens test. I’ll give it a try.
  4. MarcoFalke added this to the milestone 0.20.1 on Jun 17, 2020
  5. MarcoFalke added the label Needs backport (0.19) on Jun 17, 2020
  6. MarcoFalke added the label Needs backport (0.20) on Jun 17, 2020
  7. MarcoFalke added the label Wallet on Jun 17, 2020
  8. MarcoFalke commented at 10:16 am on June 17, 2020: member
    does this need backport to 0.19?
  9. promag commented at 10:18 am on June 17, 2020: member
    @MarcoFalke yes, and probably is a clean backport.
  10. fscemama commented at 11:56 am on June 17, 2020: none

    Hi, just tested the running of several background processes of the following script :

    0# Bitcoin Core RPC client version v0.20.0.0-ga62f0ed64f8bbbdfe6467ac5ce92ef5b5222d1bd
    1wallet=xxxxxx
    2for i in {1..1000}
    3do
    4  echo $i
    5  bitcoin-cli loadwallet $wallet
    6  bitcoin-cli unloadwallet $wallet
    7done
    

    Crash is immediate. Here’s valgrind output:

     0valgrind /usr/local/bin/bitcoind -daemon -walletrbf -conf=/root/.bitcoin/bitcoin.conf -fallbackfee=0.00001 -rpcworkqueue=128
     1==30964== Memcheck, a memory error detector
     2==30964== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
     3==30964== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
     4==30964== Command: /usr/local/bin/bitcoind -daemon -walletrbf -conf=/root/.bitcoin/bitcoin.conf -fallbackfee=0.00001 -rpcworkqueue=128
     5==30964==
     6Bitcoin Core starting
     7==30964==
     8==30964== HEAP SUMMARY:
     9==30964==     in use at exit: 2,232,114 bytes in 2,271 blocks
    10==30964==   total heap usage: 3,234 allocs, 963 frees, 2,462,760 bytes allocated
    11==30964==
    12==30964== LEAK SUMMARY:
    13==30964==    definitely lost: 0 bytes in 0 blocks
    14==30964==    indirectly lost: 0 bytes in 0 blocks
    15==30964==      possibly lost: 0 bytes in 0 blocks
    16==30964==    still reachable: 2,232,114 bytes in 2,271 blocks
    17==30964==         suppressed: 0 bytes in 0 blocks
    18==30964== Reachable blocks (those to which a pointer was found) are not shown.
    19==30964== To see them, rerun with: --leak-check=full --show-leak-kinds=all
    20==30964==
    21==30964== For lists of detected and suppressed errors, rerun with: -s
    22==30964== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
    23root@backdev2:/home/psb/bin# ==30973== Syscall param pwrite64(buf) points to uninitialised byte(s)
    24==30973==    at 0x507B983: ??? (syscall-template.S:84)
    25==30973==    by 0x53CF188: __os_io (in /usr/lib/libdb_cxx-4.8.so)
    26==30973==    by 0x53BD04C: __memp_pgwrite.part.0 (in /usr/lib/libdb_cxx-4.8.so)
    27==30973==    by 0x53BD408: __memp_bhwrite (in /usr/lib/libdb_cxx-4.8.so)
    28==30973==    by 0x53BC31A: __memp_alloc (in /usr/lib/libdb_cxx-4.8.so)
    29==30973==    by 0x53BED27: __memp_fget (in /usr/lib/libdb_cxx-4.8.so)
    30==30973==    by 0x5348408: __db_verify (in /usr/lib/libdb_cxx-4.8.so)
    31==30973==    by 0x534A846: __db_verify_internal (in /usr/lib/libdb_cxx-4.8.so)
    32==30973==    by 0x52B9026: Db::verify(char const*, char const*, std::ostream*, unsigned int) (in /usr/lib/libdb_cxx-4.8.so)
    33==30973==    by 0x3DF7E5: BerkeleyEnvironment::Verify(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool (*)(boost::filesystem::path const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&), std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&) (db.cpp:277)
    34==30973==    by 0x3DFB4A: BerkeleyBatch::VerifyDatabaseFile(boost::filesystem::path const&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, bool (*)(boost::filesystem::path const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&)) (db.cpp:422)
    35==30973==    by 0x472D48: CWallet::Verify(interfaces::Chain&, WalletLocation const&, bool, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) (wallet.cpp:3778)
    36==30973==  Address 0xd63ee08 is 120 bytes inside a block of size 1,112 alloc'd
    37==30973==    at 0x402E7BD: malloc (vg_replace_malloc.c:309)
    38==30973==    by 0x53CC964: __os_malloc (in /usr/lib/libdb_cxx-4.8.so)
    39==30973==    by 0x539B4F3: __env_alloc (in /usr/lib/libdb_cxx-4.8.so)
    40==30973==    by 0x53BBBC0: __memp_alloc (in /usr/lib/libdb_cxx-4.8.so)
    41==30973==    by 0x53BED27: __memp_fget (in /usr/lib/libdb_cxx-4.8.so)
    42==30973==    by 0x52D4063: __bam_new_file (in /usr/lib/libdb_cxx-4.8.so)
    43==30973==    by 0x5386387: __db_new_file (in /usr/lib/libdb_cxx-4.8.so)
    44==30973==    by 0x5386906: __db_open (in /usr/lib/libdb_cxx-4.8.so)
    45==30973==    by 0x534B2B8: __db_vrfy_pgset (in /usr/lib/libdb_cxx-4.8.so)
    46==30973==    by 0x534B4BE: __db_vrfy_dbinfo_create (in /usr/lib/libdb_cxx-4.8.so)
    47==30973==    by 0x5348056: __db_verify (in /usr/lib/libdb_cxx-4.8.so)
    48==30973==    by 0x534A846: __db_verify_internal (in /usr/lib/libdb_cxx-4.8.so)
    49==30973==
    50==30973==
    51==30973== Process terminating with default action of signal 6 (SIGABRT)
    52==30973==    at 0x6350FFF: raise (raise.c:51)
    53==30973==    by 0x6352429: abort (abort.c:89)
    54==30973==    by 0x6349E66: __assert_fail_base (assert.c:92)
    55==30973==    by 0x6349F11: __assert_fail (assert.c:101)
    56==30973==    by 0x49085A: BerkeleyDatabase (db.h:125)
    57==30973==    by 0x49085A: std::unique_ptr<BerkeleyDatabase, std::default_delete<BerkeleyDatabase> > MakeUnique<BerkeleyDatabase, std::shared_ptr<BerkeleyEnvironment>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > >(std::shared_ptr<BerkeleyEnvironment>&&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&&) (memory.h:16)
    58==30973==    by 0x49091F: BerkeleyDatabase::Create(boost::filesystem::path const&) (db.h:139)
    59==30973==    by 0x47E2C1: CWallet::CreateWalletFromFile(interfaces::Chain&, WalletLocation const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&, unsigned long) (wallet.cpp:3805)
    60==30973==    by 0x4823E6: LoadWallet(interfaces::Chain&, WalletLocation const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >&, std::vector<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, std::allocator<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > > >&) (wallet.cpp:157)
    61==30973==    by 0x40A7FD: loadwallet(JSONRPCRequest const&) (rpcwallet.cpp:2626)
    62==30973==    by 0x22BFEB: operator() (server.h:104)
    63==30973==    by 0x22BFEB: std::_Function_handler<bool (JSONRPCRequest const&, UniValue&, bool), CRPCCommand::CRPCCommand(char const*, char const*, UniValue (*)(JSONRPCRequest const&), std::initializer_list<char const*>)::{lambda(JSONRPCRequest const&, UniValue&, bool)#1}>::_M_invoke(std::_Any_data const&, JSONRPCRequest const&, UniValue&, bool&&) (functional:1717)
    64==30973==    by 0x14A95E: operator() (functional:2127)
    65==30973==    by 0x14A95E: operator() (chain.cpp:208)
    66==30973==    by 0x14A95E: std::_Function_handler<bool (JSONRPCRequest const&, UniValue&, bool), interfaces::(anonymous namespace)::RpcHandlerImpl::RpcHandlerImpl(CRPCCommand const&)::{lambda(JSONRPCRequest const&, UniValue&, bool)#1}>::_M_invoke(std::_Any_data const&, JSONRPCRequest const&, UniValue&, bool&&) (functional:1717)
    67==30973==    by 0x2B32DB: operator() (functional:2127)
    68==30973==    by 0x2B32DB: ExecuteCommand (server.cpp:453)
    69==30973==    by 0x2B32DB: CRPCTable::execute(JSONRPCRequest const&) const (server.cpp:436)
    70==30973==
    71==30973== HEAP SUMMARY:
    72==30973==     in use at exit: 475,082,336 bytes in 3,760,553 blocks
    73==30973==   total heap usage: 9,888,764 allocs, 6,128,211 frees, 1,385,505,603 bytes allocated
    74==30973==
    
  11. promag commented at 1:01 pm on June 17, 2020: member
    @MarcoFalke pushed 50b7e8216564d1e897f2f05b6d91abb5afd90f0c, I welcome any python ninja to make it more cool.
  12. in test/functional/wallet_multiwallet.py:242 in 50b7e82165 outdated
    237+            t.start()
    238+            threads.append(t)
    239+        for t in threads:
    240+            t.join()
    241+        global got_loading_error
    242+        assert_equal(got_loading_error, True)
    


    MarcoFalke commented at 1:16 pm on June 17, 2020:
    I don’t think we can assert that a race happened. Races are only intermittent.

    promag commented at 1:50 pm on June 17, 2020:
    I think this one is pretty much guaranteed, loading takes a bit.

    ryanofsky commented at 0:36 am on June 23, 2020:

    In commit “qa: Test concurrent wallet loading” (9b009fae6e2eb0ab2ee7ce7882c3556a9ac363a7)

    I don’t think we can assert that a race happened. Races are only intermittent.

    I would agree with Marco that in general tests like this are fragile and bad and slow and more trouble than they are worth, but also agree with promag that in practice this test is unlikely to fail.

    To avoid potential problems from this, I’d suggest adding a comment here like “got_loading_error condition is not guaranteed to be true but very unlikely to be false based on how long it takes to load a wallet”, so in case this does fail, future developers don’t waste time trying to figure it out and can ignore or remove the test

    In general it’s easier to write tests for race conditions by making them invasive and implementing them in c++


    MarcoFalke commented at 10:16 am on August 22, 2020:

    promag commented at 12:30 pm on August 22, 2020:
    So it was slower to spawn those threads than to load the wallet. Not sure about c++ test because it would need to force block inside loading. Maybe we could improve this first by waiting for all threads to be ready to load?
  13. promag force-pushed on Jun 17, 2020
  14. promag commented at 1:51 pm on June 17, 2020: member
    @fscemama if you can please repeat the test but using this branch.
  15. fscemama commented at 2:00 pm on June 17, 2020: none
    Sure. Can you confirm I must compile this and re-test ?
  16. fscemama commented at 2:32 pm on June 17, 2020: none

    Successfull. Will send these kind of normal messages:

    0error code: -4
    1error message:
    2Wallet already being loading.
    3error code: -18
    4error message:
    5Requested wallet does not exist or is not loaded
    

    No crash. Hourra!

  17. DrahtBot commented at 11:28 pm on June 17, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #19101 (refactor: remove ::vpwallets and related global variables by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  18. in src/wallet/wallet.cpp:180 in ca10f4e0ec outdated
    177+        result = g_loading_wallet_set.insert(location.GetName());
    178+        if (!result.second) {
    179+            error = Untranslated("Wallet already being loading.");
    180+            return nullptr;
    181+        }
    182+    }
    


    hebasto commented at 2:11 pm on June 18, 2020:
    0    auto result = WITH_LOCK(g_loading_wallet_mutex, return g_loading_wallet_set.insert(location.GetName()));
    1    if (!result.second) {
    2        error = Untranslated("Wallet already being loading.");
    3        return nullptr;
    4    }
    

    promag commented at 3:15 pm on June 18, 2020:
    Thanks, will apply this.

    promag commented at 0:02 am on June 19, 2020:
    Done.
  19. in src/wallet/wallet.cpp:150 in ca10f4e0ec outdated
    146@@ -145,7 +147,7 @@ void UnloadWallet(std::shared_ptr<CWallet>&& wallet)
    147     }
    148 }
    149 
    150-std::shared_ptr<CWallet> LoadWallet(interfaces::Chain& chain, const WalletLocation& location, bilingual_str& error, std::vector<bilingual_str>& warnings)
    151+static std::shared_ptr<CWallet> LoadWalletInternal(interfaces::Chain& chain, const WalletLocation& location, bilingual_str& error, std::vector<bilingual_str>& warnings)
    



    promag commented at 3:14 pm on June 18, 2020:
    I think this is better for a follow up - being a bugfix we should make it minimal so it’s easily backported. Unless we allow an anonymous namespace between functions?

    hebasto commented at 3:21 pm on June 18, 2020:

    I think this is better for a follow up - being a bugfix we should make it minimal so it’s easily backported.

    It’s up to you :)

    Unless we allow an anonymous namespace between functions?

    Why not?


    promag commented at 0:03 am on June 19, 2020:
    Placed in anonymous namespace.
  20. hebasto commented at 2:32 pm on June 18, 2020: member

    Concept ACK.

    btw, couldn’t get an error with the testing bash script on master (343c0bfbf1e56dd4b2919bc98215367922926f3c) without code patching.

    After #19249 the LoadWallet() function could be annotated with EXCLUSIVE_LOCKS_REQUIRED(!g_loading_wallet_mutex):

     0diff --git a/src/wallet/wallet.cpp b/src/wallet/wallet.cpp
     1index c93c79116..7e73a2bac 100644
     2--- a/src/wallet/wallet.cpp
     3+++ b/src/wallet/wallet.cpp
     4@@ -99,7 +99,7 @@ std::unique_ptr<interfaces::Handler> HandleLoadWallet(LoadWalletFn load_wallet)
     5     return interfaces::MakeHandler([it] { LOCK(cs_wallets); g_load_wallet_fns.erase(it); });
     6 }
     7 
     8-static Mutex g_loading_wallet_mutex;
     9+Mutex g_loading_wallet_mutex;
    10 static Mutex g_wallet_release_mutex;
    11 static std::condition_variable g_wallet_release_cv;
    12 static std::set<std::string> g_loading_wallet_set GUARDED_BY(g_loading_wallet_mutex);
    13diff --git a/src/wallet/wallet.h b/src/wallet/wallet.h
    14index cf000b0b7..3013fb360 100644
    15--- a/src/wallet/wallet.h
    16+++ b/src/wallet/wallet.h
    17@@ -49,11 +49,13 @@ struct bilingual_str;
    18 //! by the shared pointer deleter.
    19 void UnloadWallet(std::shared_ptr<CWallet>&& wallet);
    20 
    21+extern Mutex g_loading_wallet_mutex;
    22+
    23 bool AddWallet(const std::shared_ptr<CWallet>& wallet);
    24 bool RemoveWallet(const std::shared_ptr<CWallet>& wallet);
    25 std::vector<std::shared_ptr<CWallet>> GetWallets();
    26 std::shared_ptr<CWallet> GetWallet(const std::string& name);
    27-std::shared_ptr<CWallet> LoadWallet(interfaces::Chain& chain, const WalletLocation& location, bilingual_str& error, std::vector<bilingual_str>& warnings);
    28+std::shared_ptr<CWallet> LoadWallet(interfaces::Chain& chain, const WalletLocation& location, bilingual_str& error, std::vector<bilingual_str>& warnings) EXCLUSIVE_LOCKS_REQUIRED(!g_loading_wallet_mutex);
    29 std::unique_ptr<interfaces::Handler> HandleLoadWallet(LoadWalletFn load_wallet);
    30 
    31 enum class WalletCreationStatus {
    
  21. promag commented at 2:50 pm on June 18, 2020: member

    btw, couldn’t get an error with the testing bash script on master @hebasto see #19232 (comment)

  22. in test/functional/wallet_multiwallet.py:233 in ca10f4e0ec outdated
    229@@ -212,6 +230,18 @@ def wallet_file(name):
    230         w2 = node.get_wallet_rpc(wallet_names[1])
    231         w2.getwalletinfo()
    232 
    233+        if not self.options.usecli:
    


    MarcoFalke commented at 2:51 pm on June 18, 2020:
    Any reason to not test the cli? IIRC the reported bug was about the cli

    promag commented at 2:53 pm on June 18, 2020:
    I just need to get the cli equivalent to get_rpc_proxy.

    promag commented at 0:02 am on June 19, 2020:
    Fixed.
  23. hebasto commented at 2:59 pm on June 18, 2020: member

    btw, couldn’t get an error with the testing bash script on master

    @hebasto see #19232 (comment)

    Indeed :)

    On master:

    02020-06-18T14:57:15Z [httpworker.0] Using BerkeleyDB version Berkeley DB 5.3.28: (September  9, 2013)
    12020-06-18T14:57:15Z [httpworker.0] Using wallet /home/hebasto/.bitcoin/regtest/wallets/foo
    22020-06-18T14:57:15Z [httpworker.0] BerkeleyEnvironment::Open: LogDir=/home/hebasto/.bitcoin/regtest/wallets/foo/database ErrorFile=/home/hebasto/.bitcoin/regtest/wallets/foo/db.log
    32020-06-18T14:57:15Z [httpworker.0] init message: Loading wallet...
    4bitcoind: ./wallet/bdb.h:113: BerkeleyDatabase::BerkeleyDatabase(std::shared_ptr<BerkeleyEnvironment>, std::string): Assertion `inserted.second' failed.
    52020-06-18T14:57:15Z [httpworker.1] Using BerkeleyDB version Berkeley DB 5.3.28: (September  9, 2013)
    62020-06-18T14:57:15Z [httpworker.1] Using wallet /home/hebasto/.bitcoin/regtest/wallets/foo
    7Aborted (core dumped)
    
  24. hebasto commented at 3:09 pm on June 18, 2020: member

    Testing this PR with two parallel testing scripts and got different error messages:

     0$ ~/test.sh & ~/test.sh 
     1[1] 354
     2error code: -4
     3error message:
     4Wallet already being loading.
     5error code: -18
     6error message:
     7Requested wallet does not exist or is not loaded
     8error code: -4
     9error message:
    10Wallet already being loading.
    11error code: -18
    12error message:
    13Requested wallet does not exist or is not loaded
    14error code: -4
    15error message:
    16Wallet already being loading.
    17error code: -18
    18error message:
    19Requested wallet does not exist or is not loaded
    20error code: -4
    21error message:
    22Wallet already being loading.
    23error code: -18
    24error message:
    25Requested wallet does not exist or is not loaded
    26error code: -4
    27error message:
    28Wallet already being loading.
    29error code: -18
    30error message:
    31Requested wallet does not exist or is not loaded
    32{
    33  "name": "foo",
    34  "warning": ""
    35}
    36{
    37  "name": "foo",
    38  "warning": ""
    39}
    40{
    41  "name": "foo",
    42  "warning": ""
    43}
    44{
    45  "name": "foo",
    46  "warning": ""
    47}
    48{
    49  "name": "foo",
    50  "warning": ""
    51}
    52[1]+  Exit 18                 ~/test.sh
    
  25. promag commented at 3:12 pm on June 18, 2020: member
    @hebasto yes that’s expected - one is from concurrent load and the other from concurrent unload.
  26. wallet: Handle concurrent wallet loading b9971ae585
  27. promag force-pushed on Jun 19, 2020
  28. qa: Test concurrent wallet loading 9b009fae6e
  29. promag force-pushed on Jun 19, 2020
  30. hebasto approved
  31. hebasto commented at 10:08 am on June 22, 2020: member

    ACK 9b009fae6e2eb0ab2ee7ce7882c3556a9ac363a7, tested on Linux Mint 20 (x86_64):

     0$ cat ~/test.sh 
     1for i in {1..2}
     2do
     3  bash -c "echo $1; src/bitcoin-cli -regtest loadwallet foo"
     4  bash -c "echo $1; src/bitcoin-cli -regtest unloadwallet foo"
     5done
     6sleep 1s
     7
     8$ ~/test.sh "A:" & ~/test.sh "B:"
     9[1] 46177
    10A:
    11B:
    12error code: -4
    13error message:
    14Wallet already being loading.
    15B:
    16error code: -18
    17error message:
    18Requested wallet does not exist or is not loaded
    19B:
    20error code: -4
    21error message:
    22Wallet already being loading.
    23B:
    24error code: -18
    25error message:
    26Requested wallet does not exist or is not loaded
    27{
    28  "name": "foo",
    29  "warning": ""
    30}
    31A:
    32A:
    33{
    34  "name": "foo",
    35  "warning": ""
    36}
    37A:
    

    Also verified:

    0$ ./test/functional/wallet_multiwallet.py
    1$ ./test/functional/wallet_multiwallet.py --usecli
    
  32. ryanofsky approved
  33. ryanofsky commented at 1:16 am on June 23, 2020: member

    Code review good-but-not-ideal ACK 9b009fae6e2eb0ab2ee7ce7882c3556a9ac363a7

    This is an improvement but I don’t think it’s the ideal fix. A friendlier API would just require callers to write something like:

    0status = loadwallet("mywallet")
    1if status not in (success, wallet_already_loaded): raise Error
    2listtransactions("mywallet")
    

    instead of:

    0while (status = loadwallet("mywallet") == wallet_already_loading:
    1   sleep()
    2if status not in (success, wallet_already_loaded): raise Error
    3listtransactions("mywallet")
    

    This could be implemented by making the second loadwallet call block and wait for the first loadwallet call to finish, and then return the existing “duplicate wallet” error, instead of introducing a new error string and requiring the caller to poll before using the wallet. This change would also make simultaneous wallet load calls work more similar to the the way simultaneous wallet unload calls already work. I don’t think it would be too much more complex: you’d have to add a condition variable wait and notify but could remove the new error handling.

    But anyway, current fix is reasonable, fine to merge, and good to see implemented

  34. promag commented at 8:10 am on June 23, 2020: member

    This could be implemented by making the second loadwallet call block and wait for the first loadwallet call to finish, and then return the existing “duplicate wallet” error

    If the 2nd waits then why return an error?

    Edit: actually, I wonder if loading an already loaded wallet should be an error in the first place.

    This change would also make simultaneous wallet load calls work more similar to the the way simultaneous wallet unload calls already work.

    The 2nd concurrent unload doesn’t wait.

    I’m happy to make changes but I think current API is not ideal for concurrent clients.

    I had the idea of counting loads (incr load and decr unload) and then effective unload would happen when count is zero. This would simplify concurrent clients using the same wallet - each client must load->use->unload. But this is breaking change and so I think a new pair of RPC methods would be better like acquirewallet/releasewallet. However this would bring same problems as unspent locks..

  35. ryanofsky commented at 11:14 pm on June 23, 2020: member

    This could be implemented by making the second loadwallet call block and wait for the first loadwallet call to finish, and then return the existing “duplicate wallet” error

    If the 2nd waits then why return an error?

    Because I think the point of changing this is to fix a race condition, not add a new one. If you load the same wallet twice one load should succeed, and one load should return error duplicate wallet. There shouldn’t be an unpredictable length of time where both calls mysteriously succeed. Also, I presume code would be simpler not having a special case to suppress the error.

    Edit: actually, I wonder if loading an already loaded wallet should be an error in the first place.

    This would be a reasonable design, but I don’t see a reason not to return the existing error. At this point all it would be doing is hiding information we know and breaking backwards compatibility.

    This change would also make simultaneous wallet load calls work more similar to the the way simultaneous wallet unload calls already work.

    The 2nd concurrent unload doesn’t wait.

    Oh, I figured it did, but didn’t look closely. Again I think it’d be better if it waited, to avoid any need to poll when you need a wallet unloaded. And I think it’d simplify both server & client implementations to drop the extra error condition and just return “succeeded”, “failed”, or “already not loaded”, never “currently unloading”.

    I’m happy to make changes but I think current API is not ideal for concurrent clients.

    I had the idea of counting loads (incr load and decr unload) and then effective unload would happen when count is zero. This would simplify concurrent clients using the same wallet - each client must load->use->unload. But this is breaking change and so I think a new pair of RPC methods would be better like acquirewallet/releasewallet. However this would bring same problems as unspent locks..

    Yes, I wouldn’t suggest this. I think the API I’m suggesting is pretty ideal for concurrent clients because it would let them unload & unload wallets reliably and never poll. The only place where reference counting would be useful would be for concurrent clients that have no way of communicating with each other. But even in this case reference counting would be unreliable and leaky if something ever went wrong like a connection getting dropped or a client process being killed.

    Overall, I think the current fix in 9b009fae6e2eb0ab2ee7ce7882c3556a9ac363a7 is good, and definitely better than having the bug. I think it would be simpler on both server and client sides if load and unload RPCs never returned “currently loading” or “currently unloading” and just waited to return definitive statuses. But this isn’t necessary and I wouldn’t want it to hold things up. I appreciate you fixing this and considering my suggestions!

  36. ryanofsky commented at 2:16 pm on June 29, 2020: member
    This PR might be close to being mergeable. It has acks from me and hebasto and some review comments from Marco. Suggestions I made to improve this aren’t blocking and could be followed up later.
  37. promag commented at 2:24 pm on June 29, 2020: member

    I think the API I’m suggesting is pretty ideal for concurrent clients because it would let them unload & unload wallets reliably and never poll.

    I disagree, current API (with or without your suggestion) doesn’t make it easy to concurrently use a wallet (load+use+unload) - client A unloads and messes client B. And in the case some coordination exists on the client side, those improvements aren’t really necessary.

    reference counting would be unreliable and leaky if something ever went wrong like a connection getting dropped or a client process being killed.

    Right, For this I thought about a new RPC endpoint where the wallet counting would be tied to the connection and load/unload would happen indirectly.

  38. ryanofsky commented at 3:06 pm on June 29, 2020: member

    I disagree, current API (with or without your suggestion) doesn’t make it easy to concurrently use a wallet (load+use+unload) - client A unloads and messes client B.

    I don’t think we disagree about anything here. This is irrelevant. Neither your fix, nor the one I’m suggesting tries to do this, because it would require storing client state in the server, which would be an awkward and unreliable way to design things.

    And in the case some coordination exists on the client side, those improvements aren’t really necessary.

    Here is the use case: I want to write little shell script that unloads a wallet whether or not is it already loaded, maybe so I can back it up. Or I want to write a little shell script that loads a wallet, whether or not it is already loaded, maybe so I can check my balance or perform a transaction.

    With the current API writing either of these scripts requires checking 4 for different error states (success/failure/in-progress/already-loaded-or-unloaded) and writing a loop to poll the server. With the API I am proposing, no polling is required and there are only 3 states to check instead of 4.

    Also, the server-side implementation should be simpler. I am happy to implement it as a followup if the current PR is merged.

  39. MarcoFalke commented at 3:13 pm on June 29, 2020: member
    Concept ACK 9b009fae6e2eb0ab2ee7ce7882c3556a9ac363a7 I have not reviewed the code
  40. promag commented at 3:22 pm on June 29, 2020: member
    @ryanofsky Agree in making those improvements for future releases? IMO this is enough as a fix for latest version and to backport.
  41. ryanofsky commented at 3:23 pm on June 29, 2020: member

    @ryanofsky Agree in making those improvements for future releases? IMO this is enough as a fix for latest version and to backport.

    Yes I acked this PR twice and commented repeatedly it would be fine to improve this later.

  42. MarcoFalke merged this on Jun 29, 2020
  43. MarcoFalke closed this on Jun 29, 2020

  44. promag deleted the branch on Jun 29, 2020
  45. fanquake removed the label Needs backport (0.20) on Jul 3, 2020
  46. fanquake referenced this in commit c9b49d2856 on Jul 3, 2020
  47. fanquake referenced this in commit eb6b82a558 on Jul 3, 2020
  48. fanquake referenced this in commit 01c563708f on Jul 10, 2020
  49. MarcoFalke removed this from the milestone 0.20.1 on Aug 22, 2020
  50. MarcoFalke added this to the milestone 0.19.2 on Aug 22, 2020
  51. backpacker69 referenced this in commit 997d7448ce on Sep 8, 2020
  52. backpacker69 referenced this in commit 5d867e3bf5 on Sep 8, 2020
  53. Bushstar referenced this in commit 372e6f7423 on Oct 21, 2020
  54. Bushstar referenced this in commit 435ee1e5c5 on Oct 21, 2020
  55. Platinumwrist referenced this in commit 05e2740d92 on Oct 25, 2020
  56. MarkLTZ referenced this in commit 08546c9207 on Nov 8, 2020
  57. MarcoFalke referenced this in commit 9dbcd37105 on Dec 16, 2020
  58. MarcoFalke removed the label Needs backport (0.19) on Dec 16, 2020
  59. sidhujag referenced this in commit fa317ee7d0 on Dec 17, 2020
  60. Fabcien referenced this in commit ef5ce4bddb on Aug 28, 2021
  61. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-05 19:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me