Backports:
Note that this backport is unclean and several changes have to be made to most commits to accommodate BDB and the differences in migration cleanup behavior.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34222.
See the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.
416+ assert not (self.nodes[0].wallets_path / "wallet.dat").exists()
417+
418+ self.log.info('Checking createfromdump with an unnamed wallet')
419+ self.do_tool_createfromdump("", "wallet.dump")
420+ if not self.options.descriptors:
421+ os.rename(self.nodes[0].wallets_path / "default.wallet.dat", self.nodes[0].wallets_path / "wallet.dat")
os.rename isn’t throwing an exception on Windows? the wallet.dat should exist in the top-level dir after the do_tool_createfromdump
Github-Pull: bitcoin/bitcoin#31423
Rebased-From: d04f6a97ba9a55aa9455e1a805feeed4d630f59a
@davidgumberg discovered that a form of the deletion bug is also reachable if the user migrates a wallet at a relative path, resulting the entire directory at that relative path being deleted on a migration failure. This is problematic when that relative path contains more than just a wallet file. For example, with a wallet.dat in the datadir, migrating the wallet named “../” would result in deleting the entire datadir.
To resolve this, I’ve backported the entirety of #34156 and #34226 which adds tests for this case.
These backports are largely unclean as well.
4630@@ -4570,6 +4631,7 @@ util::Result<MigrationResult> MigrateLegacyToDescriptor(std::shared_ptr<CWallet>
4631 // Before deleting the wallet's directory, copy the backup file to the top-level wallets dir
4632 fs::path temp_backup_location = fsbridge::AbsPathJoin(GetWalletDir(), backup_filename);
4633 fs::copy_file(backup_path, temp_backup_location, fs::copy_options::none);
4634+ wallet_files_to_remove.insert(backup_path);
If the wallet originally was in the wallet dir directly, we’ll copy it over itself and then delete it?
Why not just move the file here (and never delete the backup)?
Actually, it looks like copy_file is throwing an exception: “filesystem error: cannot copy file: File exists”
So it’s not deleting the backup, but neither is it doing the intended cleanup at all. Probably want to extend the test to verify the cleanup happens, and then fix this.
(to be clear, for test “Test failure during migration of wallet named: unnamed (default)”)
The tests already check that this error is being hit and that the cleanup does not occur.
However this was needed for the other tests where cleanup is occurring. It is not universally true that the cleanup does not occur. It should only be for the unnamed wallet and for wallets by absolute path.
627+
628+ # Verify the /wallets/ path exists.
629+ assert self.master_node.wallets_path.exists()
630+
631+ # Verify both wallet paths exist.
632+ assert Path(old_path / "wallet.dat").exists()
137+ std::vector<fs::path> files;
138+ files.emplace_back(env->Directory() / m_filename);
139+ files.emplace_back(env->Directory() / "db.log");
140+ files.emplace_back(env->Directory() / ".walletlock");
141+ files.emplace_back(env->Directory() / "database" / "log.0000000001");
142+ files.emplace_back(env->Directory() / "database");
Good point. Added a check that env->m_databases.size() == 1 to return these.
It’s only the unnamed wallet and direct files in the walletsdir that can possibly run into this issue. For wallets in a subdirectory, it will be one wallet.dat per environment.
m_databases only includes open wallets. Other wallets dependent on these files might not be open.
Wallets are compacted on shutdown so they should not have any of these files.
This also matches the behavior of BerkeleyDatabase::Flush.
Files() implementation for BDB exists just for completeness, it is not being called anywhere during migration right now.
it is not being called anywhere during migration right now.
Oh right, only SQLiteDatabase::Files() is called by migration. BerkeleyDatabase::Files() is only called by createfromdump’s cleanup if the user creates into a BDB file.
Wallets are compacted on shutdown so they should not have any of these files.
Only on clean shutdown. It’s possible the node crashed or power failed, etc.
This also matches the behavior of BerkeleyDatabase::Flush.
Yes, this is an existing bug: https://github.com/bitcoinknots/bitcoin/pull/242 (happy to PR here if there’s interest)
createfromdump cleanup is really minimal in comparison. createfromdump is a niche external wallet utility intended for development mostly.
Yes, this is an existing bug: bitcoinknots#242 (happy to PR here if there’s interest)
Briefly glancing through that pr, I don’t think it solves any of the issues discussed in this thread. According to BDB’s docs, the only error that lsn_reset, txn_checkpoint, and log_archive can return is EINVAL, so I don’t think adding error handling there is meaningfully useful. It looks like that behavior has also existed since the initial commit of this project.
As best as I can tell, there is no way to prevent BDB from possibly becoming corrupted when the log files are removed. Even with log_archive(DB_ARCH_REMOVE), it still keeps the most recent log file, and there does not seem to be any way to determine whether that log file is actually still useful. The only way to avoid corruption there is to never delete the log files. However, the purpose of deleting the log files is to allow people to move between BDB versions (or more specifically, between a self compiled binary against a different BDB version and the release binaries) without the log file incompatibility issue, and that is probably still useful behavior. This behavior has also existed for nearly 13 years now.
I think this corruption is extremely unlikely to occur anyways. I’m finding it hard to even trigger these corruptions even with knowing the conditions they should occur in. Given that BDB was removed from the project several months ago, I doubt there is any appetite to review fixes for these issues, especially as the versions they affect go EOL.
131@@ -132,6 +132,19 @@ class BerkeleyDatabase : public WalletDatabase
132 /** Return path to main database filename */
133 std::string Filename() override { return fs::PathToString(env->Directory() / m_filename); }
134
135+ std::vector<fs::path> Files() override
createfromdump?
It is now used by the changes to migration.
It’s used by RestoreWallet for cleanup as well.
Github-Pull: bitcoin/bitcoin#31423
Rebased-From: 1de423e0a08bbc63eed36c8772e9ef8b48e80fb8
Github-Pull: bitcoin/bitcoin#34215
Rebased-From: f78f6f1dc8e16d5a8a23749e77bc3bf17c91ae42
Track what RestoreWallet creates so only those files and directories
are removed during a failure and nothing else. Preexisting paths
must be left untouched.
Note:
Using fs::remove_all() instead of fs::remove() in RestoreWallet does
not cause any problems currently, but the change is necessary for the
next commit which extends RestoreWallet to work with existing directories,
which may contain files that must not be deleted.
Github-Pull: bitcoin/bitcoin#34156
Rebased-From: 4ed0693a3f2a427ef9e7ad016930ec29fa244995
1461+ migrated = self.master_node.get_wallet_rpc(wallet_name)
1462+ info = migrated.getwalletinfo()
1463+ assert_equal(info["descriptors"], False)
1464+ assert_equal(info["format"], "bdb")
1465+ backup_path = self.master_node.wallets_path / wallet_name / f"{wallet_name}_{mocked_time}.legacy.bak"
1466+ assert backup_path.exists()
How is this test passing?
We first attempt to migrate the wallet and fail, then call loadwallet with the same bdb wallet when v29 has them disabled by default.
The master_node is not restarted with -deprecatedrpc=create_bdb so master_node.loadwallet should be throwing a “Build does not support Berkeley DB database format”.
-deprecatedrpc=create_bdb is only necessary for createwallet, not loadwallet.
631
632 mocked_time = int(time.time())
633 self.master_node.setmocktime(mocked_time)
634- assert_raises_rpc_error(-1, "filesystem error: cannot copy file: File exists", self.migrate_and_get_rpc, "")
635+ if os.path.isabs(wallet_name) or is_default:
636+ assert_raises_rpc_error(-1, "filesystem error: cannot copy file: File exists", self.master_node.migratewallet, wallet_name)
This test should also be failing, the generic exception is
0 assert_raises_rpc_error(-1, "filesystem error: in copy_file: File exists", self.master_node.migratewallet, wallet_name)
1430@@ -1431,6 +1431,41 @@ def test_solvable_no_privs(self):
1431 assert_equal(addr_info["solvable"], True)
1432 assert "hex" in addr_info
1433
1434+ def unsynced_wallet_on_pruned_node_fails(self):
1435+ self.log.info("Test migration of an unsynced wallet on a pruned node fails gracefully")
1436+ wallet_name = "pruned"
In 4831bfa64bef74900808e1915c3b2dab6578bd65: Why not the unnamed wallet? The error was there, not in a named one.
update: please see #34222 (comment)
Created a branch with all fixes in separate commits. https://github.com/furszy/bitcoin-core/tree/pr34222 Brief summary so you can squash them on their respective commits:
https://github.com/furszy/bitcoin-core/commit/4d7c5dbd5e0d7c03ed7dfff9516003d5fec4b721 replaces copy_file for rename. This is not only a code improvement, so we don’t need to keep track of the file, but also fixes an issue during the recovery logic when prune node fail to migrate
https://github.com/furszy/bitcoin-core/commit/0ef0a8a02453b4f02fb430b37e649d3f43244aaf builds on top of the previous one, and is related tor the failure test, squash it on #34222 (review) please.
https://github.com/furszy/bitcoin-core/commit/ec1023bca418ff6005838068261a3b96735b3d93 builds on top of the rest, and is related to the prune node test, squash it on https://github.com/bitcoin/bitcoin/commit/4831bfa64bef74900808e1915c3b2dab6578bd65 please.
When migrating any legacy unnamed wallet, a failed migration would
cause the cleanup logic to remove its parent directory. Since this
type of legacy wallet lives directly in the main '/wallets/' folder,
this resulted in unintentionally erasing all wallets, including the
backup file.
To be fully safe, we will no longer call `fs::remove_all`. Instead,
we only erase the individual db files we have created, leaving
everything else intact. The created wallets parent directories are
erased only if they are empty.
As part of this last change, `RestoreWallet` was modified to allow
an existing directory as the destination, since we no longer remove
the original wallet directory (we only remove the files we created
inside it). This also fixes the restore of top-level default wallets
during failures, which were failing due to the directory existence
check that always returns true for the /wallets/ directory.
This bug started after:
https://github.com/bitcoin/bitcoin/commit/f6ee59b6e2995a3916fb4f0d4cbe15ece2054494
Previously, the `fs::copy_file` call was failing for top-level wallets,
which prevented the `fs::remove_all` call from being reached.
Github-Pull: bitcoin/bitcoin#34156
Rebased-From: f4c7e28e80bf9af50b03a770b641fd309a801589
4622 if (!success) {
4623 // Migration failed, cleanup
4624 // Before deleting the wallet's directory, copy the backup file to the top-level wallets dir
4625 fs::path temp_backup_location = fsbridge::AbsPathJoin(GetWalletDir(), backup_filename);
4626- fs::copy_file(backup_path, temp_backup_location, fs::copy_options::none);
4627+ fs::rename(backup_path, temp_backup_location);
Verifies that a failed migration of the unnamed (default) wallet
does not erase the main /wallets/ directory, and also that the
backup file exists.
Github-Pull: bitcoin/bitcoin#34156
Rebased-From: 36093bde63286e19821a9e62cdff1712b6245dc7
The first test verifies that restoring into an existing empty directory
or a directory with no .dat db files succeeds, while restoring into a
dir with a .dat file fails.
The second test covers restoring into the default unnamed wallet
(wallet.dat), which also implicitly exercises the recovery path used
after a failed migration.
The third test covers failure during restore on a prune node. When
the wallet last sync was beyond the pruning height.
Github-Pull: bitcoin/bitcoin#34156
Rebased-From: f011e0f0680a8c39988ae57dae57eb86e92dd449
Right now, after migration the last message users see is "migration completed",
but the migration isn't actually finished yet. We still need to load the new wallets
to ensure consistency, and if that fails, the migration will be rolled back. This
can be confusing for users.
This change logs the post-migration loading step and if a wallet fails to load and
the migration will be rolled back.
Github-Pull: bitcoin/bitcoin#34156
Rebased-From: d70b159c42008ac3b63d1c43d99d4f1316d2f1ef
Because the default wallet has no name, the watch-only and solvables
wallets created during migration end up having no name either.
This fixes it by applying the same prefix name we use for the backup
file for an unnamed default wallet.
Before: watch-only wallet named "_watchonly"
After: watch-only wallet named "default_wallet_watchonly"
Github-Pull: bitcoin/bitcoin#34156
Rebased-From: 82caa8193a3e36f248dcc949e0cd41def191efac
Github-Pull: bitcoin/bitcoin#34156
Rebased-From: b7c34d08dd9549a95cffc6ec1ffa4bb4f81e35eb
561+ os.mkdir(watch_only_dir)
562+ shutil.copyfile(self.old_node.wallets_path / "wallet.dat", watch_only_dir / "wallet.dat")
563+
564+ mocked_time = int(time.time())
565+ self.master_node.setmocktime(mocked_time)
566+ assert_raises_rpc_error(-1, "filesystem error: cannot copy file: File exists", self.migrate_and_get_rpc, "")
In c95b1121809e50322b4bf7fd7868e8d0f2a96199: this should be:
0assert_raises_rpc_error(-4, "Failed to create database", self.migrate_and_get_rpc, "")
Also, should pull https://github.com/bitcoin/bitcoin/pull/34221/commits/cbf0bd35bbf312f3b13d92d281d7112e4b43b9c3 from #34221.
Remember that this is setting the mock time twice, one outside migrate_and_get_rpc and another time inside it.
In c95b112: this should be:
0assert_raises_rpc_error(-4, "Failed to create database", self.migrate_and_get_rpc, "")
Already fixing that
Also, should pull cbf0bd3 from #34221.
Remember that this is setting the mock time twice, one outside
migrate_and_get_rpcand another time inside it.
migrate_and_get_rpc in 29.x does not do setmocktime.
migrate_and_get_rpc in 29.x does not do setmocktime.
👍🏼 .
Refactor a common way to perform the failed migration test that exists
for default wallets, and add relative-path wallets and absolute-path
wallets.
Github-Pull: 34226
Rebased-From: eeaf28dbe0e09819ab0e95bb7762b29536bdeef6
light code review ACK 76cdeb7b06232050c7d20ffa1395697cc4e53295 + backported the functional tests without the fixes and all of them failed accordingly.
0bruno@bruno-MS-Challenger-B850M-PLUS:~/projects/bitcoin((HEAD detached at v29.2))$ ./build/test/functional/tool_wallet.py
1...
22026-01-12T18:20:20.949000Z TestFramework (ERROR): Unexpected exception
3Traceback (most recent call last):
4 File "/home/bruno/projects/bitcoin/test/functional/test_framework/test_framework.py", line 135, in main
5 self.run_test()
6 File "/home/bruno/projects/bitcoin/./build/test/functional/tool_wallet.py", line 853, in run_test
7 self.test_dump_createfromdump()
8 File "/home/bruno/projects/bitcoin/./build/test/functional/tool_wallet.py", line 601, in test_dump_createfromdump
9 assert self.nodes[0].wallets_path.exists()
10 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
11AssertionError
0bruno@bruno-MS-Challenger-B850M-PLUS:~/projects/bitcoin((HEAD detached at v29.2))$ ./build/test/functional/wallet_backup.py
1...
22026-01-12T18:21:24.258000Z TestFramework (ERROR): Unexpected exception
3Traceback (most recent call last):
4 File "/home/bruno/projects/bitcoin/test/functional/test_framework/util.py", line 160, in try_rpc
5 fun(*args, **kwds)
6 File "/home/bruno/projects/bitcoin/test/functional/test_framework/coverage.py", line 50, in __call__
7 return_val = self.auth_service_proxy_instance.__call__(*args, **kwargs)
8 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
9 File "/home/bruno/projects/bitcoin/test/functional/test_framework/authproxy.py", line 151, in __call__
10 raise JSONRPCException(response['error'], status)
11test_framework.authproxy.JSONRPCException: Failed to create database path '/tmp/bitcoin_func_test_h2_o1g4s/node3/regtest/wallets/res0'. Database already exists. (-36)
12
13During handling of the above exception, another exception occurred:
14
15Traceback (most recent call last):
16 File "/home/bruno/projects/bitcoin/test/functional/test_framework/test_framework.py", line 135, in main
17 self.run_test()
18 File "/home/bruno/projects/bitcoin/./build/test/functional/wallet_backup.py", line 359, in run_test
19 self.restore_wallet_existent_name()
20 File "/home/bruno/projects/bitcoin/./build/test/functional/wallet_backup.py", line 160, in restore_wallet_existent_name
21 assert_raises_rpc_error(
22 File "/home/bruno/projects/bitcoin/test/functional/test_framework/util.py", line 151, in assert_raises_rpc_error
23 assert try_rpc(code, message, fun, *args, **kwds), "No exception raised"
24 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
25 File "/home/bruno/projects/bitcoin/test/functional/test_framework/util.py", line 166, in try_rpc
26 raise AssertionError(
27AssertionError: Expected substring not found in error message:
28substring: 'Failed to restore wallet. Database file exists in '/tmp/bitcoin_func_test_h2_o1g4s/node3/regtest/wallets/res0/wallet.dat'.'
29error message: 'Failed to create database path '/tmp/bitcoin_func_test_h2_o1g4s/node3/regtest/wallets/res0'. Database already exists.'.
0bruno@bruno-MS-Challenger-B850M-PLUS:~/projects/bitcoin((HEAD detached at v29.2))$ ./build/test/functional/wallet_migration.py
1...
22026-01-12T18:23:08.626000Z TestFramework (ERROR): Unexpected exception
3Traceback (most recent call last):
4 File "/home/bruno/projects/bitcoin/test/functional/test_framework/test_framework.py", line 135, in main
5 self.run_test()
6 File "/home/bruno/projects/bitcoin/./build/test/functional/wallet_migration.py", line 1873, in run_test
7 self.test_migration_failure(wallet_name=wallet_name)
8 File "/home/bruno/projects/bitcoin/./build/test/functional/wallet_migration.py", line 773, in test_migration_failure
9 assert_raises_rpc_error(
10 File "/home/bruno/projects/bitcoin/test/functional/test_framework/util.py", line 151, in assert_raises_rpc_error
11 assert try_rpc(code, message, fun, *args, **kwds), "No exception raised"
12 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
13AssertionError: No exception raised
light review ACK 76cdeb7b06232050c7d20ffa1395697cc4e53295.
Did a manual cherry-pick, diff is substantial but nothing seemed strange.
Update the PR description please?
Done