Fixes #11888, needs a 0.16 milestone
Also adds a test that the lock works.
#11687 will probably rework this to a per-wallet lock instead of just the walletdir, but this fixes the current issue
Concept ACK
Suggest moving LockWalletDirectory to wallet/db.cpp and calling it from CDBEnv::Open instead of VerifyWallets. That way this will be compatible with #11687. Also would suggest renaming LockWalletDirectory to LockEnvDirectory to be a little more consistent with terminology used in that file (even though the file isn't totally consistent right now).
@ryanofsky something like this? I had to probe the lock before calling open because of the weird database log backup thing when open fails, perhaps there is a cleaner way to do it e.g. returning different errors from open rather than just true/false. But I think this is ok
67 | + return false; 68 | + } 69 | + if (probe_only) { 70 | + lock.unlock(); 71 | + } 72 | + } catch(const boost::interprocess::interprocess_exception& e) {
Nit, missing space after catch.
57 | @@ -58,7 +58,12 @@ def run_test(self): 58 | w5 = self.nodes[0].get_wallet_rpc("w5") 59 | w5_info = w5.getwalletinfo() 60 | assert_equal(w5_info['immature_balance'], 50) 61 | + self.stop_node(0) 62 | 63 | + competing_wallet_dir = os.path.join(self.options.tmpdir, 'competing_walletdir') 64 | + os.mkdir(competing_wallet_dir) 65 | + self.start_node(0, ['-walletdir='+competing_wallet_dir])
Use restart_node, and remove stop_node above?
Concept ACK.
Is it me or there can be a race between probing and the actual locking?
Removed the probing with a bit of a rework
67 | @@ -68,7 +68,10 @@ class CDBEnv 68 | typedef std::pair<std::vector<unsigned char>, std::vector<unsigned char> > KeyValPair; 69 | bool Salvage(const std::string& strFile, bool fAggressive, std::vector<KeyValPair>& vResult); 70 | 71 | - bool Open(const fs::path& path); 72 | + enum OpenResult { OPEN_OK, 73 | + ENV_LOCKED, 74 | + OPEN_FAIL }; 75 | + OpenResult Open(const fs::path& path);
How about bool Open(const fs::path& path, bool* is_locked = 0)?
Could do, but IMO doesn't simplify the code much, if we're returning something why not use return :)
Because you only care of is_locked when Open fails. For me ENV_LOCKED is the reason Open failed.
Also, you can avoid those != OPEN_OK conditions.
But sure, consider my suggestion a nit.
Ah that's true, ok :+1:
https://github.com/bitcoin/bitcoin/pull/11904/files#r157363498
Seems fine to use either an enum or a bool, whichever you prefer. The real problem is that VerifyEnvironment is the wrong place for the rename/retry code to live. It creates a bizarre API because you can't reliably open an environment without making two different calls and pointlessly closing and reopening. It's also a bad division of code because the verify function has to make assumptions about what the open function is doing internally.
A better way to organize the code would be to have a single bool CDBEnv::Open(fs::path, bool retry=false) function, where setting the retry parameter to true renames the log directory and reopens if the initial open fails. Verifying would then just call Open(path, true) followed by a Close. Anyway this would be nice to clean up, but no need to do it here as part of fixing the lock issue.
utACK 8020abc808cbf8e0b8db24f92eafbf3262ea9163
171 | + // try opening it again one more time 172 | + if (!Open(walletDir, false)) { 173 | + // if it still fails, it probably means we can't even create the database env 174 | + return false; 175 | + } 176 | + return false;
Something wrong here, missing }?
Yep oops, bad code move. Fixed.
67 | @@ -68,7 +68,7 @@ class CDBEnv 68 | typedef std::pair<std::vector<unsigned char>, std::vector<unsigned char> > KeyValPair; 69 | bool Salvage(const std::string& strFile, bool fAggressive, std::vector<KeyValPair>& vResult); 70 | 71 | - bool Open(const fs::path& path); 72 | + bool Open(const fs::path& path, bool retry = 0);
= false?
BTW, how about:
private:
bool Open(const fs::path& path, bool retry);
public:
bool Open(const fs::path& path) { return Open(path, true); }
this way the public API remains, for instance, https://github.com/bitcoin/bitcoin/pull/11904/files#diff-5cd083d2c34da5f0ac380011f7b725d2R312 goes away.
I'd prefer to leave it explicit I think, is there any other benefit to hiding the retry parameter?
It's an internal detail, you should not be able to call Open(path, false), it's only there for the 2nd attempt.
Its also called without retry here: https://github.com/MeshCollider/bitcoin/blob/f70ce89806a7f0ce1755e7d074f6e07753a85694/src/wallet/db.cpp#L436
Ok, missed that.
Tested ACK. Needs squash.
60 | + fs::path lock_file_path = env_path / ".lock"; 61 | + FILE* file = fsbridge::fopen(lock_file_path, "a"); // empty lock file; created if it doesn't exist. 62 | + if (file) fclose(file); 63 | + 64 | + try { 65 | + static boost::interprocess::file_lock lock(lock_file_path.string().c_str());
Didn't notice this before, and no need for a change now, but just in case #11687 is somehow merged before this PR, static variable here should move to CDBEnv instance variable. (A single static variable can't guard different database directories if different database directories may be opened during the lifetime of the process.)
326 | - // if it still fails, it probably means we can't even create the database env 327 | - errorStr = strprintf(_("Error initializing wallet database environment %s!"), walletDir); 328 | - return false; 329 | - } 330 | + if (!bitdb.Open(walletDir, true)) { 331 | + errorStr = strprintf(_("Cannot obtain a lock on wallet directory %s. Another instance of bitcoin may be using it."), walletDir);
Might want to keep the previous generic error here ("Error initializing wallet database environment") since this can happen in other conditions than failing to obtain the lock.
utACK f70ce89806a7f0ce1755e7d074f6e07753a85694. Only change since last review is CDBEnv::Open retry refactor.
Squashed and added a commit to modify the error message as suggested.
Note travis failure on second commit is just because the last three tests were cancelled due to third commit being pushed before they started
52 | @@ -52,6 +53,24 @@ void CheckUniqueFileid(const CDBEnv& env, const std::string& filename, Db& db) 53 | } 54 | } 55 | } 56 | + 57 | +bool LockEnvDirectory(const fs::path& env_path) 58 | +{ 59 | + // Make sure only a single Bitcoin process is using the wallet directory. 60 | + fs::path lock_file_path = env_path / ".lock";
What if the walletdir is the same as the datadir? Won't there be a problem if both use the same lockfile? I would consider renaming this to ".walletlock" perhaps (which is also clearer to people who see the file).
Reading more, I think this is indeed advisable:
- It's unspecified if a file_lock synchronizes two threads from the same process.
- It's unspecified if a process can use two file_lock objects pointing to the same file.
Good point, fixed!
52 | @@ -52,6 +53,24 @@ void CheckUniqueFileid(const CDBEnv& env, const std::string& filename, Db& db) 53 | } 54 | } 55 | } 56 | + 57 | +bool LockEnvDirectory(const fs::path& env_path)
A lot in this function seems to be a duplication from LockDataDirectory in init.cpp. What do you think about abstracting them out to a utility function in util.cpp instead?
utACK 005d4cdef7a7a9f98cf202db7542a7fd01e048b8
Thanks for the feedback @sipa
381 | + fs::path pathLockFile = directory / lockfile_name; 382 | + FILE* file = fsbridge::fopen(pathLockFile, "a"); // empty lock file; created if it doesn't exist. 383 | + if (file) fclose(file); 384 | + 385 | + try { 386 | + static boost::interprocess::file_lock lock(pathLockFile.string().c_str());
From https://botbot.me/freenode/bitcoin-core-dev/msg/95065183/
I've been trying to work out why the last commit in #11904 is breaking the locking function, it works fine before that commit is added, but I can't work out why it doesn't work with that change. with a brand new walletdir, it creates the .walletlock file, but lock.try_lock() fails and the node crashes on startup, does anyone with more boost familiarity than me know why lock.try_lock fails when I move it
I haven't looked closely at changes here yet, but as I was trying to say in #11904 (review), I think the static declaration on this line becomes broken as soon as more than one directory is locked, because this will only create one single file_lock instance no matter how many different directories need to be locked. I think what you really need here is a map or list of locks, not one single lock. Perhaps something along the lines of (untested):
static std::map<std::string, boost::interprocess::file_lock> locks;
boost::interprocess::file_lock& lock = locks.emplace(pathLockFile.string(), pathLockFile.string().c_str()).first->second;
Ah, of course. Fixed. Thank you!
375 | @@ -375,6 +376,27 @@ int LogPrintStr(const std::string &str) 376 | return ret; 377 | } 378 | 379 | +bool LockDirectory(const fs::path& directory, const std::string lockfile_name, bool probe_only)
Maybe combine directory and std::string lockfile_name arguments into a single path argument. Having 2 arguments seems more verbose and less flexible.
Indeed, was just for the purpose of printing out the directory in the log message below rather than the filename too. Will modify if/when there's another change too
I kind of like these separated out - after all, the function is called LockDirectory and not CreateLockFile. If you merge the directory and lockfile name, I'd say you'd also have to rename the function because it's not longer specifically about locking directories.
Nit: const std::string &lockfile_name
390 | + } 391 | + if (probe_only) { 392 | + lock.unlock(); 393 | + } 394 | + } catch (const boost::interprocess::interprocess_exception& e) { 395 | + return error("Error while attempting to lock directory %s: %s", directory.string(), e.what());
Would be good to include filename so you could see whether wallet or data lock failed.
Could do, but in both cases another more specific error message is printed out when LockDirectory returns, so it's always on the next log line anyway
utACK abcfc2713d4e2696b5178c15975291bfd79a68b7. Main change since previous review was adding the more general LockDirectory util function.
Would also recommend just squashing everything down into a single commit to avoid this being a garden-path pr.
utACK, needs rebase
Rebased
Tested ACK 2f3bd47
2018-01-16 10:09:49 Cannot obtain a lock on wallet directory /tmp/wallet1. Another instance of bitcoin may be using it.
Milestone
0.16.0