ListWalletDir returns all available wallets in the current wallet directory.
Based on MeshCollider work in pull #11485.
After #11485 (comment) decided to push this.
Consider the following:
$ find wallets
wallets
wallets/wallet.dat
wallets/db.log
wallets/x1
wallets/x1/wallet.dat
wallets/x1/db.log
wallets/x1/.walletlock
wallets/x2
wallets/x2/wallet.dat
wallets/x2/db.log
wallets/x2/.walletlock
wallets/.walletlock
wallets/w
wallets/w/x3
wallets/w/x3/wallet.dat
wallets/w/x3/db.log
wallets/w/x3/.walletlock
ListWalletDir() gives
wallet.dat
x1
x2
w/x3
<!--e57a25ab6845829454e8d69fc972939a-->Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
228 | + std::vector<std::string> paths; 229 | + for (auto& path : ListWalletDir()) { 230 | + paths.push_back(path.string()); 231 | + } 232 | + return paths; 233 | + }
How about moving these functions to interfaces/wallet.cpp. I think they don't belong here.
How about moving these functions to interfaces/wallet.cpp. I think they don't belong here.
This is a reasonable place. These methods fit in with the existing getWallets and handleLoadWallet methods. The Node interface is somewhat monolithic and we could consider breaking it up in various ways. But the Wallet interface would be the wrong place to add these methods. Wallet methods operate on individual wallets, not collections of wallets. And these methods need to be available before any Wallet object is constructed.
24 | @@ -25,3 +25,32 @@ fs::path GetWalletDir() 25 | 26 | return path; 27 | } 28 | + 29 | +std::vector<fs::path> ListWalletDir() 30 | +{ 31 | + const unsigned char bdb_magic[4] = {0x62, 0x31, 0x05, 0x00}; // Berkeley DB Btree magic bytes
Should cite the source of this information (https://github.com/file/file/blob/5824af38469ec1ca9ac3ffd251e7afe9dc11e227/magic/Magdir/database#L74-L75) to give credit upstream and make this code easier to understand / debug / update in the future.
Since this is declared as bytes, the memcmp below will only look for little-endian btree files, even on big endian platforms. It seems like it would make more sense to look for little endian files on little-endian platforms, and big endian files on big endian platforms. Changing this to uint32_t bdb_magic = 0x00053162 and passing &bdb_magic to memcmp should accomplish this.
47 | + if (memcmp(file_bytes, bdb_magic, sizeof(bdb_magic)) != 0) continue; 48 | + 49 | + // avoid duplicates due to the recursive iteration 50 | + if (!dbs.insert(db).second) continue; 51 | + 52 | + paths.push_back(fs::relative(path, wallet_dir));
There are a number of problems with this implementation:
<wallet>/wallet.dat instead of <wallet> paths, which CWallet::Verify will reject.listwallets and getwalletinfo rpcs in the default bitcoin configuration.wallet.dat in subdirectories, it will read logs and other random files which bitcoin would never open as wallets. It also reads wallet.dat files in subdirectories twice and then uses a std::set to filter out the duplicates after they are read, which is unnecessary.Following alternate implementation is untested, but I think would avoid these problems:
std::vector<fs::path> wallets;
for (auto it = fs::recursive_directory_iterator(wallet_dir); it != end(it); ++it) {
if (it->status().type() == fs::directory_file && IsBerkeleyBtree(it->path() / "wallet.dat")) {
// Found a directory which contains wallet.dat btree file, add it as a wallet.
wallets.emplace_back(it->path());
} else if (it.level() == 0 && it->symlink_status().type() == fs::regular_file && IsBerkeleyBtree(it->path())) {
if (it->path() == "wallet.dat") {
// Found top-level wallet.dat btree file, add top level directory ""
// as a wallet.
wallets.emplace_back();
} else {
// Found top-level btree file not called wallet.dat. Current bitcoin
// software will never create these files but will allow them to be
// opened in a shared database environment for backwards compatibility.
// Add it to the list of available wallets.
wallets.emplace_back(it->path());
}
}
}
return wallets;
Thanks @ryanofsky, will update accordingly.
10 | @@ -11,4 +11,7 @@ 11 | //! Get the path of the wallet directory. 12 | fs::path GetWalletDir(); 13 | 14 | +//! Get wallets in wallet directory. 15 | +std::vector<fs::path> ListWalletDir(); 16 | +
Nit: Add some trivial test for ListWalletDir to make it technically in use.
There is listwaletdir RPC, which has some tests.
@ryanofsky updated to used your suggested code with some minor details fixed. The result is still relative paths to -walletdir.
@practicalswift will add test.
33 | @@ -34,6 +34,16 @@ void DummyWalletInit::AddWalletOptions() const 34 | 35 | const WalletInitInterface& g_wallet_init_interface = DummyWalletInit(); 36 | 37 | +fs::path GetWalletDir() 38 | +{ 39 | + throw std::logic_error("Wallet function called in non-wallet build.");
Nit: throwNotWalletBuild()?
172 | @@ -173,6 +173,12 @@ class Node 173 | //! Get unspent outputs associated with a transaction. 174 | virtual bool getUnspentOutput(const COutPoint& output, Coin& coin) = 0; 175 | 176 | + //! Return default wallet directory. 177 | + virtual std::string getWalletDir() = 0; 178 | + 179 | + //! Return available wallets in wallet directory. 180 | + virtual std::vector<std::string> listWalletDir() = 0;
I find this function name confusing. How about listWallets()? A future version of this function could take a directory argument and the name would still make sense.
I like the pairing between getWalletDir, listWalletDir and -walletdir, but I don't have a strong preference. Let's see what others say.
I think I agree with promag. listWalletDir describes more specifically what the function does than listWallets and it fits in with the other names cited. I don't imagine us adding a gui feature that would call for the gui asking the node to list wallets in a directory other than -walletdir, but even if we did and this method gained an optional argument, listWalletDir would still seem like a sensible name.
2449 | @@ -2450,6 +2450,32 @@ static UniValue getwalletinfo(const JSONRPCRequest& request) 2450 | return obj; 2451 | } 2452 | 2453 | +static UniValue listwalletdir(const JSONRPCRequest& request)
Ditto as above, but unfortunately it's too late to rename listwallets to listloadedwallets. Overloading listwallets doesn't seem like a good idea.
What about lswallets, listwalletfiles or findwallets?
I don't see a problem with listwalletdir. It describes the thing this RPC does, which is to list the contents of the -walletdir directory. (I do think it would have been clearer if that directory would have been called -walletsdir instead of -walletdir, but surprising to me as a native english speaker, other people seemed to think -walletsdir was clumsy and not grammatical in #12221).
228 | + { 229 | + std::vector<std::string> paths; 230 | + for (auto& path : ListWalletDir()) { 231 | + paths.push_back(path.string()); 232 | + } 233 | + return paths;
Sort alphabetically?
I think that's something for the UI to handle. It can sort by other attribute, like last modified, transaction count, balance, etc.
That's fine, as long as the RPC interface does it, that's also a User Interface :-)
Should we sort listwallets result too?
That could be a nice improvement. Not necessarily in this PR, but perhaps there could be a common function to render the list of wallets in RPC, to make sure they're consistent.
Should it ignore case? Should have an option to support numeric/date prefixes?
From the cli point of view, you can do stuff like bitcoin-cli -regtest listwalletdir | jq -r '.wallets | map(.name) | join("\n")' | sort ....
Don't most alphabetical sort functions ignore case (unless it's a tie) and put numbers before letters? I'm a fan of jq, but seems overkill for a simple list. It doesn't seem terribly complicated in C++11 even with UTF-8 (1st answer, 3rd example):
std::wstring a[] = {L"Шла", L"Саша", L"по", L"шоссе", L"и", L"сосала", L"сушку"};
std::sort(std::begin(a), std::end(a), std::locale("en_US.utf8"));
std::wstring_convert<std::codecvt_utf8<wchar_t>> cvt;
for(auto& s: a) std::cout << cvt.to_bytes(s) << ' ';
If you don't mind I prefer to follow up that discussion and possible implementation in both RPC. In this PR I'm just following listwallets ordering and for now having the functionality is enough.
Concept ACK. Tested 20db275 on macOS (needs testing on Windows).
My wallet.dat ended up as "" in the list.
Relying on BDB magic bytes to figure out if something is a wallet seems potentially brittle. Can you add BDB4 and BDB5 wallet payloads to the test suite? In addition the test node could dynamically create a wallet, and then try to find it. That way at least some developer will see that test fail on their machine if they compile with a future BDB version that's somehow different.
46 | +{ 47 | + const fs::path wallet_dir = GetWalletDir(); 48 | + std::vector<fs::path> paths; 49 | + 50 | + for (auto it = fs::recursive_directory_iterator(wallet_dir); it != end(it); ++it) { 51 | + std::cout << it->path().string() << '-' << it.level() << std::endl;
In commit "wallet: Add ListWalletDir utility" (90adb2cd6149af1869e624a2060ec8c4a543cf08)
Probably delete this line, it looks leftover from debugging.
Ops
2456 | + throw std::runtime_error( 2457 | + "listwalletdir\n" 2458 | + "Returns a list of wallets in the wallet directory.\n" 2459 | + "\nResult:\n" 2460 | + "[ (json array of strings)\n" 2461 | + " \"walletname\" (string) the wallet name\n"
In commit "rpc: Add listwalletdir RPC" (963eb27e8598e2746966ee4056f177ed332f0279)
I would consider avoiding a mistake we've made previously with other api's (listwallets, listaccounts) where we limited extensibility by returning lists of strings instead of list of json objects. Specifically, I think it'd be better to return
[ {"wallet": ""}, {"wallet": "foo"}, {"wallet": "bar"} ]
instead of:
[ "", "foo", "bar" ]
Because there's more metadata we could be returning about these wallets even without opening them, such as last modified times from the filesystem, status strings indicating whether wallets are presently loaded and locked, and path types indicating whether wallet paths refer to directories, symlinks, or legacy btree files.
Absolutely, my initial response was like:
{
"wallets": [
{ "name": "" }
]
}
which permits adding, for instance, walletdir, count, offset..
utACK 20db275b250189cfd2200df8e5535aa92afa594a
38 | @@ -39,7 +39,7 @@ static bool IsBerkeleyBtree(const fs::path& path) 39 | uint32_t data = 0; 40 | file.read((char*) &data, sizeof(data)); // Read 4 bytes of file to compare against magic 41 | 42 | - return memcmp(&data, &bdb_magic, sizeof(bdb_magic)) == 0;
@ryanofsky does this looks good to you?
Squashed and added release notes.
the test node could dynamically create a wallet, and then try to find it. That way at least some developer will see that test fail on their machine if they compile with a future BDB version that's somehow different. @Sjors does the test change looks good to you?
<!--32850dd3fdea838b4049e64f46995ea2-->
| Coverage | Change (pull 14291) | Reference (master) |
|---|---|---|
| Lines | -0.0043 % | 87.0361 % |
| Functions | -0.0186 % | 84.1130 % |
| Branches | -0.0052 % | 51.5451 % |
@promag it covers the second part of my suggestion yes: the wallets are generated by the test framework (using whatever version of BDB is on the dev machine), and then listed.
Can you add BDB4 and BDB5 wallet payloads to the test suite?
This isn't explicitly tested, but because some/most Travis machines use bdb4, it's covered regardless. We're not officially supporting BDB>4 wallets, so that's not important.
tACK fc40216e1b0a3e0462b4119f6e4380ac368ed828 for macOS, modulo wallet.dat being shown as "" (though that can be fixed later).
Try adding a second wallet to bitcoin.conf, then it shows up as "wallet.dat". But in that case you can unload with unloadwallet "wallet.dat". In case you don't have wallets in bitcoin.conf, there's never a need to load wallet.dat, so then it doesn't matter that it's called "".
35 | + std::ifstream file(path.string(), std::ios::binary); 36 | + if (!file.is_open()) return false; 37 | + 38 | + file.seekg(12, std::ios::beg); // Magic bytes start at offset 12 39 | + uint32_t data = 0; 40 | + file.read((char*) &data, sizeof(data)); // Read 4 bytes of file to compare against magic
Any concerns about the endianness of data as compared to bdb_magic here?
Any concerns about the endianness of data as compared to bdb_magic here?
Could add a comment here, but this checks for databases in the native format. On big endian systems, the magic bytes are 00 05 31 62, and on little endian systems they are 62 31 05 00.
agree with regard to adding comment; at least I too had no idea that BerkeleyDB files generated on big-endian systems are incompatible!
edit: another theory on IRC is that BerkeleyDB creates the file in native endian but accepts both (does byteswaps if needed)! if this is true, you need to check for both orderings here
here's Oracle's magic file for id'ing BDB files; https://docs.oracle.com/cd/E17275_01/html/programmer_reference/magic.txt tl;dr; check both ways around
ACK fc40216e1b0a3e0462b4119f6e4380ac368ed828
utACK fc40216e1b0a3e0462b4119f6e4380ac368ed828
utACK fc40216e1b0a3e0462b4119f6e4380ac368ed828. Changes since last review are adding release notes, replacing memcmp, and returning objects instead of strings
49 | + 50 | + for (auto it = fs::recursive_directory_iterator(wallet_dir); it != end(it); ++it) { 51 | + if (it->status().type() == fs::directory_file && IsBerkeleyBtree(it->path() / "wallet.dat")) { 52 | + // Found a directory which contains wallet.dat btree file, add it as a wallet. 53 | + paths.emplace_back(fs::relative(it->path(), wallet_dir)); 54 | + } else if (it.level() == 0 && it->symlink_status().type() == fs::regular_file && IsBerkeleyBtree(it->path())) {
so if i get this right: in the legacy case where walletdir == datadir, this will scan every file in the data directory recursively, and check it for BDB version magic? this includes LevelDB database files, block files, debug log files, authcookie files, lock files … can see tons of ways in which this can go wrong or this is somehow avoided?
if not, I'd strongly suggest it should just fail in that case—if the user wants to list non-default wallets, let them create a wallets directory
No, it only recursively scans directories to try to open wallet.dat inside and then check magic bytes.
It checks magic bytes on all files in the walletdir (datadir in the legacy case) — not recursively.
It still might be good to prevent this from opening too many files if there happen to be a lot of stray files in the root directory, or a lot of directories in the tree. Maybe add && (++count) before && IsBerkeleyBtree checks, and if (count > 1000) throw std::runtime_error("Can't list wallets, too many files in walletdir") and the end of the loop as a sanity check.
@ryanofsky an alternative solution is to scan only if walletdir != datadir, WDYT?
@ryanofsky an alternative solution is to scan only if walletdir != datadir, WDYT?
That seems like a worse solution. It doesn't provide protection if there's a looping directory structure or a lot of junk files in the wallet directory. It's less user friendly if users with old datadirs have to manually move database files around in order for this one rpc method to work. It also seems like it would be more complicated to test and explain.
I think my suggestion is better because it would just work in the all the normal cases, and provide a straightforward error message in the unusual case laanwj is concerned about.
By default, recursive_directory_iterator does not follow directory symlinks
I know, I was thinking about more about FUSE mounts and such. Anyway, if you think it complicates code too much to deal with unexpected filesystem stuff and provide a nice error message, I don't personally think you need to add any error handling here. I was just trying to suggest a way to deal with laanwj's concern.
But since you asked WDYT, I think special casing datadir=walletdir is a bad user experience and confusing and ugly and hard to explain.
31 | + // Berkeley DB Btree magic bytes, from: 32 | + // https://github.com/file/file/blob/5824af38469ec1ca9ac3ffd251e7afe9dc11e227/magic/Magdir/database#L74-L75 33 | + const uint32_t bdb_magic = 0x00053162; 34 | + 35 | + if (!fs::exists(path)) return false; 36 | + if (fs::file_size(path) < 1024) return false;
@ryanofsky @laanwj added these 2 checks to avoid opening unnecessary files.
I'm not sure what purpose these checks serve. Are they supposed to be an optimization? Is the second check supposed to guard against opening lock files? I would suggest adding a comment here or dropping these.
The first check actually seems like the opposite of an optimization, since file_size/ifstream calls below inherently have to check that the file exists, so it seems like this just checks the same condition twice.
The first check fs::exists(path) was there to prevent exceptions from the 2nd check fs::file_size(path). Fixed and added comment.
utACK 47a229e920187dd90af7bba52cc0bfe4baab735c. Only change since last review is adding more IsBerkeleyBtree checks.
utACK 743a3a9b20768519a11f54834d27fd7585a0670a. Just some cleanup around the file size check since the last review.
Thanks, rebased to remove fixup.
35 | + // A Berkeley DB Btree file has at least 4K. 36 | + // This check also prevents opening lock files. 37 | + boost::system::error_code ec; 38 | + if (fs::file_size(path, ec) < 4096) return false; 39 | + 40 | + std::ifstream file(path.string(), std::ios::binary);
switch to fs::ifstream post-#13878
utACK after nits
44 | + 45 | + // Berkeley DB Btree magic bytes, from: 46 | + // https://github.com/file/file/blob/5824af38469ec1ca9ac3ffd251e7afe9dc11e227/magic/Magdir/database#L74-L75 47 | + // - big endian systems - 00 05 31 62 48 | + // - little endian systems - 62 31 05 00 49 | + return data == 0x00053162 || data == 0x62310500;
LGTM now
utACK b791ef12c3605c185432e391e1571853b07a7441. Changes since last review were adding minimum file size check, switching to fs::ifstream, and no longer omitting big-endian btree files. It seems that berkeley supports loading databases with any byte order, with native byte order just being slightly more efficient: https://docs.oracle.com/cd/E17276_01/html/programmer_reference/general_am_conf.html#am_conf_byteorder
re-utACK after squash
ListWalletDir returns all available wallets in the current wallet directory.
Based on MeshCollider work in pull #11485.
66 | + } else { 67 | + // Found top-level btree file not called wallet.dat. Current bitcoin 68 | + // software will never create these files but will allow them to be 69 | + // opened in a shared database environment for backwards compatibility. 70 | + // Add it to the list of available wallets. 71 | + paths.emplace_back(fs::relative(it->path(), wallet_dir));
wallet/walletutil.cpp:71:36: error: ‘relative’ is not a member of ‘fs’
paths.emplace_back(fs::relative(it->path(), wallet_dir));
^
See: #14528 (travis: Compile once on xenial)
69 | @@ -68,6 +70,8 @@ def wallet_file(name): 70 | wallet_names = ['w1', 'w2', 'w3', 'w', 'sub/w5', os.path.join(self.options.tmpdir, 'extern/w6'), 'w7_symlink', 'w8', ''] 71 | extra_args = ['-wallet={}'.format(n) for n in wallet_names] 72 | self.start_node(0, extra_args) 73 | + assert_equal(set(map(lambda w: w['name'], self.nodes[0].listwalletdir()['wallets'])), set(['', 'w3', 'w2', 'sub/w5', 'w7', 'w7', 'w1', 'w8', 'w']))
This is wrong, fs::relative resolves symlinks but we allow symlinks to be specified as wallet names. I guess listwalletdir should list both w7 and w7_symlink but in runtime only one could be loaded (actually it's the same wallet but with different names). @ryanofsky @laanwj should I fix this in #14531 or fix in a separate PR?
On Debian Jessie PowerPC (Big endian) system (endianness probably unrelated):
wallet/walletutil.cpp: In function ‘std::vector<boost::filesystem::path> ListWalletDir()’:
wallet/walletutil.cpp:57:78: error: ‘end’ was not declared in this scope
for (auto it = fs::recursive_directory_iterator(wallet_dir); it != end(it); ++it) {
^
wallet/walletutil.cpp:57:78: note: suggested alternative:
In file included from /usr/include/c++/4.9/bits/basic_string.h:42:0,
from /usr/include/c++/4.9/string:52,
from ./fs.h:9,
from ./wallet/walletutil.h:8,
from wallet/walletutil.cpp:5:
/usr/include/c++/4.9/initializer_list:99:5: note: ‘std::end’
end(initializer_list<_Tp> __ils) noexcept
^
wallet/walletutil.cpp:60:32: error: ‘relative’ is not a member of ‘fs’
paths.emplace_back(fs::relative(it->path(), wallet_dir));
^
wallet/walletutil.cpp:71:36: error: ‘relative’ is not a member of ‘fs’
paths.emplace_back(fs::relative(it->path(), wallet_dir));
^
Makefile:6919: recipe for target 'wallet/libbitcoin_wallet_a-walletutil.o' failed
make[2]: *** [wallet/libbitcoin_wallet_a-walletutil.o] Error 1
make[2]: Leaving directory '/root/workspace/bitcoin/src'
Makefile:10219: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/root/workspace/bitcoin/src'
Makefile:697: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1