wallet: optimize migration process, batch db transactions #28574

pull furszy wants to merge 12 commits into bitcoin:master from furszy:2023_wallet_batch_migration changing 14 files +259 −117
  1. furszy commented at 1:54 pm on October 3, 2023: member

    Last step in a chain of PRs (#26836, #28894, #28987, #29403).

    Detailed Description:

    The current wallet migration process performs only individual db writes. Accessing disk to delete all legacy records, clone and clean each address book entry for every created wallet, create each new descriptor (with their corresponding master key, caches and key pool), and also clone and delete each transaction that requires to be transferred to a different wallet.

    This work consolidates all individual disk writes into two batch operations. One for the descriptors creation from the legacy data and a second one for the execution of the migration process itself. Efficiently dumping all the information to disk at once atomically at the end of each process.

    This represent a speed up and also a consistency improvement. During migration, we either want to succeed or fail. No other outcomes should be accepted. We should never leave a partially migrated wallet on disk and request the user to manually restore the previous wallet from a backup (at least not if we can avoid it).

    Since the speedup depends on the storage device, benchmark results can vary significantly. Locally, I have seen a 15% speedup on a USB 3.2 pendrive.

    Note for Testers:

    The first commit introduces a benchmark for the migration process. This one can be cherry-picked on top of master to compare results pre and post changes.

    Please note that the benchmark setup may take some time (~70 seconds here) due to the absence of a batching mechanism for the address generation process (GetNewDestination() calls).

  2. DrahtBot commented at 1:54 pm on October 3, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK achow101, theStack, pablomartin4btc
    Stale ACK josibake

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30221 (wallet: Ensure best block matches wallet scan state by achow101)
    • #29124 (wallet: Automatically repair corrupted metadata with doubled derivation path by achow101)
    • #28710 (Remove the legacy wallet and BDB dependency by achow101)
    • #28333 (wallet: Construct ScriptPubKeyMans with all data rather than loaded progressively by achow101)
    • #27865 (wallet: Track no-longer-spendable TXOs separately by achow101)
    • #27286 (wallet: Keep track of the wallet’s own transaction outputs in memory by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Wallet on Oct 3, 2023
  4. furszy force-pushed on Oct 4, 2023
  5. DrahtBot added the label CI failed on Oct 4, 2023
  6. furszy force-pushed on Oct 5, 2023
  7. furszy force-pushed on Oct 5, 2023
  8. DrahtBot removed the label CI failed on Oct 5, 2023
  9. pablomartin4btc commented at 2:26 am on October 11, 2023: member

    Concept ACK.

    I’ll also perform the benchmark as instructed clearly in the description and will review the code later.

  10. theStack commented at 11:57 am on November 9, 2023: contributor
    Concept ACK
  11. fanquake referenced this in commit a7f4f1a09c on Dec 8, 2023
  12. DrahtBot added the label Needs rebase on Dec 8, 2023
  13. furszy force-pushed on Dec 8, 2023
  14. furszy commented at 8:13 pm on December 8, 2023: member
    Rebased after #28894 merge.
  15. DrahtBot removed the label Needs rebase on Dec 8, 2023
  16. furszy force-pushed on Dec 8, 2023
  17. DrahtBot added the label CI failed on Dec 8, 2023
  18. DrahtBot added the label Needs rebase on Jan 2, 2024
  19. furszy force-pushed on Jan 3, 2024
  20. furszy commented at 2:28 pm on January 3, 2024: member
    Rebased, conflicts solved. Next PR in the line is #28987.
  21. DrahtBot removed the label Needs rebase on Jan 3, 2024
  22. DrahtBot added the label Needs rebase on Jan 4, 2024
  23. furszy force-pushed on Jan 4, 2024
  24. DrahtBot removed the label Needs rebase on Jan 4, 2024
  25. DrahtBot added the label Needs rebase on Jan 8, 2024
  26. furszy commented at 4:14 pm on January 21, 2024: member
    Decoupled the address book cloning related commits to #26836. I am marking this PR as a draft until the preceding PRs are merged.
  27. furszy marked this as a draft on Jan 21, 2024
  28. achow101 referenced this in commit 835948d44b on Feb 8, 2024
  29. achow101 referenced this in commit 6ff0aa089c on Feb 12, 2024
  30. josibake commented at 12:14 pm on February 13, 2024: member

    Concept ACK

    Starting to review, if you get a chance can you rebase this so that it’s only the relevant outstanding commits?

  31. furszy force-pushed on Feb 13, 2024
  32. furszy marked this as ready for review on Feb 13, 2024
  33. furszy commented at 7:49 pm on February 13, 2024: member

    Starting to review, if you get a chance can you rebase this so that it’s only the relevant outstanding commits?

    Done. Rebased and reworked part of it. I still need to accommodate some code and improve the commit descriptions. But, at least the changes are reviewable now.

  34. furszy force-pushed on Feb 13, 2024
  35. DrahtBot removed the label Needs rebase on Feb 13, 2024
  36. in src/bench/wallet_migration.cpp:73 in ac643893c6 outdated
    68+        TestUnloadWallet(std::move(wallet));
    69+    }
    70+
    71+    int wallet_num = 0;
    72+    bench.epochs(NUM_WALLETS).run([&] {
    73+        std::string filename = strprintf("legacy_%d", wallet_num);
    


    maflcko commented at 12:04 pm on February 14, 2024:
    0        auto filename{PathFromString(strprintf("legacy_%d", wallet_num));
    

    I don’t like the c_str below. Also, it doesn’t compile on Windows.

    Using PathFromString should be fine to “convert” ascii to a path segment, IIUC

    Same above and #28894 (review)


    furszy commented at 12:42 pm on February 14, 2024:

    I don’t like the c_str below. Also, it doesn’t compile on Windows.

    yeah, me neither.. I just did this 5 months ago. Fixed, thanks.

  37. furszy force-pushed on Feb 14, 2024
  38. furszy force-pushed on Feb 15, 2024
  39. DrahtBot removed the label CI failed on Feb 15, 2024
  40. furszy force-pushed on Feb 16, 2024
  41. DrahtBot added the label Needs rebase on Feb 20, 2024
  42. furszy force-pushed on Feb 20, 2024
  43. DrahtBot removed the label Needs rebase on Feb 20, 2024
  44. DrahtBot added the label CI failed on Feb 26, 2024
  45. DrahtBot commented at 7:47 pm on February 26, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/21767740453

  46. furszy force-pushed on Feb 26, 2024
  47. DrahtBot removed the label CI failed on Feb 26, 2024
  48. in src/bench/wallet_migration.cpp:40 in a2866d6ce0 outdated
    35+    int NUM_WALLETS = 1;
    36+    // Number of imported watch only addresses
    37+    int NUM_WATCH_ONLY_ADDR = 20;
    38+
    39+    // Setup legacy wallets
    40+    for (int i = 0; i < NUM_WALLETS; ++i) {
    


    josibake commented at 10:44 am on February 27, 2024:

    in “bench: add coverage for wallet migration process” (https://github.com/bitcoin/bitcoin/pull/28574/commits/a2866d6ce0a4bf8b249ca13c19747d6a209dadab):

    Why the loop setup if we are only going to do one wallet? Seems better to either run multiple epochs or simplify the benchmark and take out the loop.


    furszy commented at 1:22 pm on February 27, 2024:

    Why the loop setup if we are only going to do one wallet? Seems better to either run multiple epochs or simplify the benchmark and take out the loop.

    The idea was to run it multiple times to decrease the noise by taking the median. But the time this test was taking originally (10-15 min) was too much. Thus why I left everything in-place for when the migration time decreases.

  49. in src/wallet/scriptpubkeyman.cpp:1787 in 5cc89e5eba outdated
    1781@@ -1782,6 +1782,12 @@ std::optional<MigrationData> LegacyScriptPubKeyMan::MigrateToDescriptor()
    1782         keyid_it++;
    1783     }
    1784 
    1785+    WalletBatch batch(m_storage.GetDatabase());
    1786+    if (!batch.TxnBegin()) {
    1787+        LogPrintf("Error during descriptors migration, cannot initialize db transaction\n");
    


    josibake commented at 10:49 am on February 27, 2024:

    in “wallet: batch MigrateToDescriptor() db transactions” (https://github.com/bitcoin/bitcoin/pull/28574/commits/5cc89e5eba7330ea3ab64ca66e364170dcd0b4f7):

    What about “Error during legacy wallet migration,” instead?


    furszy commented at 1:29 pm on February 27, 2024:

    What about “Error during legacy wallet migration,” instead?

    The idea of the error message is to point out at which part of the process failed. Saying “Error during legacy wallet migration” is too broad (the same message could arise anywhere along migration). This part of the process generates descriptors from the legacy SKPM keys and scripts.


    josibake commented at 4:37 pm on February 27, 2024:
    I see, maybe “Error generating descriptors for migration, cannot initialize db transaction”? Also not a big deal, its just he current message is confusing to me in that we aren’t migrating descriptors, but taking existing keys, creating descriptors and then saving them in a new wallet.

    furszy commented at 4:45 pm on February 27, 2024:
    Sure. Done as suggested.
  50. in src/wallet/scriptpubkeyman.cpp:1846 in 5cc89e5eba outdated
    1816@@ -1811,8 +1817,8 @@ std::optional<MigrationData> LegacyScriptPubKeyMan::MigrateToDescriptor()
    1817 
    1818         // Make the DescriptorScriptPubKeyMan and get the scriptPubKeys
    1819         auto desc_spk_man = std::unique_ptr<DescriptorScriptPubKeyMan>(new DescriptorScriptPubKeyMan(m_storage, w_desc, m_keypool_size));
    1820-        desc_spk_man->AddDescriptorKey(key, key.GetPubKey());
    1821-        desc_spk_man->TopUp();
    1822+        WITH_LOCK(desc_spk_man->cs_desc_man, desc_spk_man->AddDescriptorKeyWithDB(batch, key, key.GetPubKey()));
    


    josibake commented at 10:58 am on February 27, 2024:

    in “wallet: batch MigrateToDescriptor() db transactions” (https://github.com/bitcoin/bitcoin/pull/28574/commits/5cc89e5eba7330ea3ab64ca66e364170dcd0b4f7):

    key.GetPubKey() is a pretty expensive operation and we are calling it above:

    https://github.com/bitcoin/bitcoin/pull/28574/commits/5cc89e5eba7330ea3ab64ca66e364170dcd0b4f7#diff-55ce403d9c3440b4a38a261b7d452f84dd9a1d09b2a5ef1b4d3ac0c3567b9106R1812

    and then again here. Not a big deal compared to some of the other stuff, but I ran the bench locally with and without the suggested change and its ~1.5% speedup (assuming I’m reading the bench results correctly :sweat_smile: ):

    0# before
    1|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
    2|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    3|    6,936,309,493.00 |                0.14 |    0.0% |32,790,301,157.00 |11,790,224,744.00 |  2.781 |3,044,271,114.00 |    0.2% |      6.94 | `WalletMigration`
    4
    5
    6# after
    7|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
    8|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    9|    6,836,475,065.00 |                0.15 |    0.0% |32,795,704,156.00 |11,732,850,832.00 |  2.795 |3,044,413,213.00 |    0.2% |      6.84 | `WalletMigration`
    

    josibake commented at 11:39 am on February 27, 2024:

    Also, in mapCryptedKey, the CPubKey is already there, so you could pull it out and avoid recalculating it: https://github.com/bitcoin/bitcoin/pull/28574/commits/5cc89e5eba7330ea3ab64ca66e364170dcd0b4f7#diff-55ce403d9c3440b4a38a261b7d452f84dd9a1d09b2a5ef1b4d3ac0c3567b9106R1762

    But I’m not sure that’s worth the extra effort considering it would only make a difference if a majority of the keys come from mapCryptedKey and it seems that this isn’t making much of an improvement anyways.


    furszy commented at 1:45 pm on February 27, 2024:
    Sounds good. We could go further and remove the GetKey() from the keyids for-loop too (GetKey() access mapKeys for unencrypted wallets and mapCryptedKeys for encrypted ones –> which we already do at the beginning of the process). Still, maybe better to leave this and your suggestion for a follow-up. Just so this PR can be scoped to the db txn batching changes.

    pablomartin4btc commented at 9:11 pm on October 9, 2024:
    I ran a bench as well calling GetPubKey() only once instead of twice as is currently and didn’t gain much out of it, code-wise it could be update it for clarity and practicality anyways but I understand this is not part of the code change and moreover if this is going to be entirely replaced by a refactoring on a follow-up.
  51. in src/wallet/scriptpubkeyman.cpp:1959 in 5cc89e5eba outdated
    1928@@ -1923,9 +1929,9 @@ std::optional<MigrationData> LegacyScriptPubKeyMan::MigrateToDescriptor()
    1929                 if (!GetKey(keyid, key)) {
    1930                     continue;
    1931                 }
    1932-                desc_spk_man->AddDescriptorKey(key, key.GetPubKey());
    1933+                WITH_LOCK(desc_spk_man->cs_desc_man, desc_spk_man->AddDescriptorKeyWithDB(batch, key, key.GetPubKey()));
    


    josibake commented at 11:51 am on February 27, 2024:

    in “wallet: batch MigrateToDescriptor() db transactions” (https://github.com/bitcoin/bitcoin/pull/28574/commits/5cc89e5eba7330ea3ab64ca66e364170dcd0b4f7):

    Could get rid of the key.GetPubKey() call here and completely get rid of the privkeyid set above:

    https://github.com/bitcoin/bitcoin/pull/28574/commits/5cc89e5eba7330ea3ab64ca66e364170dcd0b4f7#diff-55ce403d9c3440b4a38a261b7d452f84dd9a1d09b2a5ef1b4d3ac0c3567b9106R1907-R1910

    since keys.origins is a pair of the keyid and CPubKey. Running the bench with with this suggestion seemed to help:

    0# after
    1|               ns/op |                op/s |    err% |          ins/op |          cyc/op |    IPC |         bra/op |   miss% |     total | benchmark
    2|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
    3|    6,395,421,573.00 |                0.16 |    0.0% |32,790,285,573.00 |11,895,327,005.00 |  2.757 |3,044,102,935.00 |    0.2% |      6.40 | `WalletMigration`
    

    6.94 -> 6.84 -> 6.4


    furszy commented at 1:57 pm on February 27, 2024:
    Sounds good but I would rather leave the improvement for a follow-up. Mainly to scope the PR changes to the db txn batching changes.

    josibake commented at 4:39 pm on February 27, 2024:
    Makes sense! I misread the title as meaning this PR is both optimizing the process and batching db transactions, but its clear from the PR description that this is focused on just the batching logic.
  52. josibake commented at 12:00 pm on February 27, 2024: member
    Looking good! I noticed a lot of key.GetPubKey() that could be removed, left some suggestions and verified locally that they improved the bench. I guess one conceptual question is if we are doing the GetPubKey() calls to regenerate the pubkeys in case we don’t trust the data in the legacy wallet?
  53. furszy commented at 1:58 pm on February 27, 2024: member
    Thanks for the review josi!
  54. furszy force-pushed on Feb 27, 2024
  55. furszy force-pushed on Feb 27, 2024
  56. josibake commented at 4:51 pm on February 27, 2024: member

    ACK https://github.com/bitcoin/bitcoin/pull/28574/commits/868e2f64a9d7de3a980e64fd7acd7323214d3765

    Code reviewed and verified the bench results. Thanks for taking the suggestions!

  57. DrahtBot requested review from theStack on Feb 27, 2024
  58. DrahtBot requested review from pablomartin4btc on Feb 27, 2024
  59. in src/wallet/scriptpubkeyman.cpp:2043 in cd69403317 outdated
    2008@@ -2009,9 +2009,15 @@ std::optional<MigrationData> LegacyScriptPubKeyMan::MigrateToDescriptor()
    2009 }
    2010 
    2011 bool LegacyScriptPubKeyMan::DeleteRecords()
    2012+{
    2013+    return RunWithinTxn(m_storage.GetDatabase(), /*process_desc=*/"delete legacy records", [&](WalletBatch& batch){
    2014+        return DeleteRecords(batch);
    2015+    });
    2016+}
    


    achow101 commented at 9:03 pm on March 22, 2024:

    In cd694033173ec7de2a16187432a3d7aa33ff62ca “wallet: introduce ‘DeleteRecords(batch)’ to be used within an external db txn”

    After this PR, this DeleteRecords() overload is only used by the tests, so I think it can just be removed (and perhaps never introduced).


    furszy commented at 3:53 am on March 24, 2024:

    After this PR, this DeleteRecords() overload is only used by the tests, so I think it can just be removed (and perhaps never introduced).

    I didn’t do it because, aside from the test changes, it also requires creating a new batch object and beginning + committing the db txn inside ApplyMigrationData on https://github.com/bitcoin/bitcoin/commit/cd694033173ec7de2a16187432a3d7aa33ff62ca. And these changes will need to be removed in the last commit.

  60. in src/wallet/wallet.cpp:4275 in 868e2f64a9 outdated
    4275+        if (auto res_migration = wallet.ApplyMigrationData(batch, *data); !res_migration) {
    4276+            error = util::ErrorString(res_migration);
    4277+            return false;
    4278+        }
    4279+        // Process succeed, connect the SPKM signals
    4280+        wallet.ConnectScriptPubKeyManNotifiers();
    


    achow101 commented at 9:06 pm on March 22, 2024:

    In 868e2f64a9d7de3a980e64fd7acd7323214d3765 “wallet: ApplyMigrationData, group all db txs into a single atomic operation”

    I don’t think it’s necessary to connect the signals anymore since the local wallet exists outside of any contexts. Although removing this should be its own commit.

    If you choose to not remove this, for completeness, this should also do NotifyCanGetAddressesChanged to match the previous behavior.


    furszy commented at 4:13 am on March 24, 2024:
    Sure. The wallet is also unloaded few lines after this signal connection. Will remove the lines.
  61. in src/wallet/wallet.cpp:2349 in 4b3ec698a2 outdated
    2346+    // Now that txs have been deleted from disk, erase them from memory.
    2347+    ClearTxs(*res);
    2348+    return {};
    2349+}
    2350+
    2351+util::Result<std::vector<CWallet::MapWalletIt>> CWallet::RemoveTxs(WalletBatch& batch, std::vector<uint256>& txs_to_remove) {
    


    achow101 commented at 9:15 pm on March 22, 2024:

    In 4b3ec698a228fc531cca946f9a572e1dbdc4b0d5 “wallet: prepare RemoveTxs to be used within a broader db txn procedure”

    I don’t really like this refactor. This means that any callers of this particular overload will have to remember to do ClearTxs, whereas callers of the other one don’t. This seems like something that could result in programming mistakes in the future. Generally our overloads should do basically the same things, just with the option of providing different arguments. Not clearing the txs from mapWallet in one overload is a pretty big difference that may not be easily noticed.

    I think it would actually be a bit cleaner and better to have RemoveTxs detect if a database transaction has been started already and just choose to not begin a transaction if that’s the case.


    furszy commented at 4:16 am on March 24, 2024:

    I think it would actually be a bit cleaner and better to have RemoveTxs detect if a database transaction has been started already and just choose to not begin a transaction if that’s the case.

    The rationale behind the decoupling wasn’t related to the existing (or not) db txn. It was to prevent another programming mistake: The provided db transaction could be aborted by the calling function (deliberately or undeliberately), causing the wallet to enter an inconsistent state where the data in memory differs from the data on disk. This would happen because, in the all-in-one version of this function, the transactions are cleared from the in-memory map.

    I understand that for the migration process, this isn’t really an issue because the wallet will be unloaded anyway. We could go back to the all-in-one version of this function, but for other processes, if we ever decide to abort the db txn, the wallet will need to be reloaded from disk to recover the previous state.

    I’m ok with merging back these functions. We just need to be careful not to abort the db txn and continue as if nothing happened after calling this method.


    furszy commented at 2:13 pm on March 26, 2024:

    I understand that for the migration process, this isn’t really an issue because the wallet will be unloaded anyway. We could go back to the all-in-one version of this function, but for other processes, if we ever decide to abort the db txn, the wallet will need to be reloaded from disk to recover the previous state.

    I’m ok with merging back these functions. We just need to be careful not to abort the db txn and continue as if nothing happened after calling this method.

    Ended up introducing a mechanism to listen for the db txn outcomes. This way, internal procedures can register callbacks to update the in-memory state only when the data is successfully committed to disk.

  62. furszy force-pushed on Mar 26, 2024
  63. furszy commented at 2:15 pm on March 26, 2024: member
    Updated per feedback. Thanks achow.
  64. furszy force-pushed on Mar 26, 2024
  65. DrahtBot added the label Needs rebase on Mar 29, 2024
  66. furszy force-pushed on Mar 29, 2024
  67. furszy commented at 1:59 pm on March 29, 2024: member
    rebased after #29130 merge.
  68. DrahtBot added the label CI failed on Mar 29, 2024
  69. DrahtBot removed the label Needs rebase on Mar 29, 2024
  70. DrahtBot removed the label CI failed on Apr 4, 2024
  71. DrahtBot added the label CI failed on May 1, 2024
  72. DrahtBot commented at 9:59 pm on May 1, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/23239600985

  73. DrahtBot marked this as a draft on May 13, 2024
  74. DrahtBot commented at 8:05 am on May 13, 2024: contributor
    Turned into draft for now, due to failing CI. If this is no longer a WIP, you can move it out of draft.
  75. furszy force-pushed on May 15, 2024
  76. furszy marked this as ready for review on May 15, 2024
  77. DrahtBot removed the label CI failed on May 15, 2024
  78. DrahtBot commented at 2:30 am on June 3, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/25006227127

  79. DrahtBot added the label CI failed on Jun 3, 2024
  80. furszy force-pushed on Jun 3, 2024
  81. furszy commented at 1:37 pm on June 3, 2024: member
    Rebased due a hidden conflict with #26606.
  82. DrahtBot removed the label CI failed on Jun 3, 2024
  83. in src/wallet/wallet.cpp:3750 in 4e72821e19 outdated
    3695@@ -3696,25 +3696,17 @@ DescriptorScriptPubKeyMan& CWallet::SetupDescriptorScriptPubKeyMan(WalletBatch&
    3696     return *out;
    3697 }
    3698 
    3699-void CWallet::SetupDescriptorScriptPubKeyMans(const CExtKey& master_key)
    3700+void CWallet::SetupDescriptorScriptPubKeyMans(WalletBatch& batch, const CExtKey& master_key)
    


    achow101 commented at 7:34 pm on July 1, 2024:

    In 4e72821e1961b58559f49a4746bc8d15d116f6b7 “wallet: ApplyMigrationData, group all db txs into a single atomic operation”

    This could be its own commit.


    furszy commented at 9:55 pm on July 3, 2024:
    Done
  84. achow101 commented at 7:54 pm on July 1, 2024: member

    Concept ACK

    I’m not a huge fan of all the boilerplate and wrappers that this ends up adding, but I also don’t see a better way to do this.

    In ApplyMigrationData, it looks like we’re creating batchs for the watchonly and solvable wallets multiple times. Those could be consolidated in this PR as well.

  85. furszy force-pushed on Jul 3, 2024
  86. furszy commented at 9:55 pm on July 3, 2024: member

    In ApplyMigrationData, it looks like we’re creating batchs for the watchonly and solvable wallets multiple times. Those could be consolidated in this PR as well.

    Done. Thanks.

  87. achow101 commented at 7:48 pm on July 9, 2024: member
    ACK ad1e3620af88f73b869b4a2dbb74e03f5cd93517
  88. DrahtBot requested review from josibake on Jul 9, 2024
  89. DrahtBot added the label Needs rebase on Jul 11, 2024
  90. furszy force-pushed on Jul 12, 2024
  91. DrahtBot removed the label Needs rebase on Jul 12, 2024
  92. furszy force-pushed on Jul 14, 2024
  93. in src/bench/wallet_migration.cpp:10 in 2eb9314186 outdated
     5+#include <config/bitcoin-config.h> // IWYU pragma: keep
     6+
     7+#include <bench/bench.h>
     8+#include <interfaces/chain.h>
     9+#include <node/context.h>
    10+#include <test/util/mining.h>
    


    HerWayBit commented at 6:16 pm on July 14, 2024:
    .
  94. HerWayBit approved
  95. hebasto added the label Needs CMake port on Aug 16, 2024
  96. furszy force-pushed on Aug 29, 2024
  97. in src/Makefile.bench.include:98 in 5e1afd6dbc outdated
    94@@ -95,6 +95,7 @@ bench_bench_bitcoin_SOURCES += bench/wallet_create.cpp
    95 bench_bench_bitcoin_SOURCES += bench/wallet_loading.cpp
    96 bench_bench_bitcoin_SOURCES += bench/wallet_create_tx.cpp
    97 bench_bench_bitcoin_SOURCES += bench/wallet_ismine.cpp
    98+bench_bench_bitcoin_SOURCES += bench/wallet_migration.cpp
    


    hebasto commented at 3:25 pm on August 29, 2024:
    Changes to the Autotools-based build system might be dropped altogether. Otherwise, a conflict is unavoidable once #30664 is merged, which is expected to happen tomorrow.

    furszy commented at 2:17 am on September 5, 2024:
    pushed.
  98. hebasto removed the label Needs CMake port on Aug 29, 2024
  99. DrahtBot added the label Needs rebase on Sep 2, 2024
  100. furszy force-pushed on Sep 5, 2024
  101. DrahtBot removed the label Needs rebase on Sep 5, 2024
  102. DrahtBot added the label CI failed on Sep 12, 2024
  103. DrahtBot removed the label CI failed on Sep 15, 2024
  104. in src/bench/wallet_migration.cpp:20 in 761e7c7719 outdated
    15+#include <wallet/wallet.h>
    16+
    17+#include <optional>
    18+
    19+#ifdef USE_BDB // only enable benchmark when bdb and sqlite are enabled
    20+#ifdef USE_SQLITE
    


    theStack commented at 10:10 pm on September 24, 2024:

    preprocessor nit, as one-liner:

    0#if defined(USE_BDB) && defined(USE_SQLITE)
    

    furszy commented at 10:25 pm on September 26, 2024:
    done
  105. in src/bench/wallet_migration.cpp:49 in 761e7c7719 outdated
    44+
    45+    // Add watch-only addresses
    46+    std::vector<CScript> scripts_watch_only;
    47+    for (int w = 0; w < NUM_WATCH_ONLY_ADDR; ++w) {
    48+        CKey key;
    49+        key.MakeNewKey(true);
    


    theStack commented at 10:11 pm on September 24, 2024:

    nit

    0        CKey key = GenerateRandomKey();
    

    furszy commented at 10:26 pm on September 26, 2024:
    done
  106. in src/wallet/wallet.cpp:1709 in 2779398c05 outdated
    1700@@ -1701,10 +1701,16 @@ bool CWallet::CanGetAddresses(bool internal) const
    1701 }
    1702 
    1703 void CWallet::SetWalletFlag(uint64_t flags)
    1704+{
    1705+    WalletBatch batch(GetDatabase());
    1706+    return SetWalletFlagWithDB(batch, flags);
    1707+}
    1708+
    1709+void CWallet::SetWalletFlagWithDB(WalletBatch& batch, uint64_t flags)
    


    theStack commented at 11:00 pm on September 24, 2024:
    naming nit: any reason why sometimes the WithDB postfix is added for the new methods taking a WalletBatch, while on other cases in later commits (e.g. DeleteRecords) it’s overloaded with the already-existing name? Personally I’m not a huge fan of overloading, but happy to ACK either variant.

    furszy commented at 10:35 pm on September 26, 2024:
    I think this was part of the first changes I did, I do like more the overloaded function approach as is less verbose without loosing readability. But later on I saw that we are using the WithDB suffix and changed the rest of the commits. Will change DeleteRecords to use it. Thanks.
  107. theStack commented at 11:01 pm on September 24, 2024: contributor

    I’m not seeing any performance improvements locally with a non-debug build running on an SSD (benchmark shows ~13s both with and without the batching), but still think this PR is important for consistency reasons, i.e. to avoid the new wallet be left in a “half-migrated” state in case anything goes wrong. Changes look good to me at first glance (left only some nits), will do a deeper review round tomorrow.

    Might be worth to update the PR description with benchmark results from release builds, as only those are relevant for end users and the speed-up could vary significantly?

  108. in src/wallet/wallet.cpp:3763 in 766982846f outdated
    3745@@ -3746,21 +3746,27 @@ void CWallet::SetupDescriptorScriptPubKeyMans(const CExtKey& master_key)
    3746     if (!batch.TxnCommit()) throw std::runtime_error("Error: cannot commit db transaction for descriptors setup");
    3747 }
    3748 
    3749+void CWallet::SetupOwnDescriptorScriptPubKeyMans()
    3750+{
    3751+    assert(!IsWalletFlagSet(WALLET_FLAG_EXTERNAL_SIGNER));
    


    theStack commented at 9:38 pm on September 25, 2024:

    could add a lock assertion here (since the method has the EXCLUSIVE_LOCKS_REQUIRED(cs_wallet); annotation)

    0void CWallet::SetupOwnDescriptorScriptPubKeyMans()
    1{
    2    AssertLockHeld(cs_wallet);
    3
    4    assert(!IsWalletFlagSet(WALLET_FLAG_EXTERNAL_SIGNER));
    

    furszy commented at 10:39 pm on September 26, 2024:
    sure. done
  109. in src/wallet/wallet.cpp:4116 in c436b51645 outdated
    4105-            SetupDescriptorScriptPubKeyMans(data.master_key);
    4106+            SetupDescriptorScriptPubKeyMans(local_wallet_batch, data.master_key);
    4107         } else {
    4108             // Setup with a new seed if we don't.
    4109-            SetupOwnDescriptorScriptPubKeyMans();
    4110+            SetupOwnDescriptorScriptPubKeyMans(local_wallet_batch);
    


    theStack commented at 9:40 pm on September 25, 2024:
    in c436b516: shouldn’t those calls also happen as a batch transaction, i.e. wrapped around a TxnBegin/TxnCommit pair (at least in the context of this commit)?

    furszy commented at 10:42 pm on September 26, 2024:

    in c436b51: shouldn’t those calls also happen as a batch transaction, i.e. wrapped around a TxnBegin/TxnCommit pair (at least in the context of this commit)?

    yeah, they should, but that would make this commit and the last one more verbose without much gain. Maybe it’s better to leave it as is for simplicity?

  110. furszy force-pushed on Sep 26, 2024
  111. furszy commented at 10:55 pm on September 26, 2024: member

    Updated per feedback, thanks @theStack!

    I’m not seeing any performance improvements locally with a non-debug build running on an SSD (benchmark shows ~13s both with and without the batching), but still think this PR is important for consistency reasons, i.e. to avoid the new wallet be left in a “half-migrated” state in case anything goes wrong. Changes look good to me at first glance (left only some nits), will do a deeper review round tomorrow.

    Might be worth to update the PR description with benchmark results from release builds, as only those are relevant for end users and the speed-up could vary significantly?

    Yeah, now that all the other PRs got merged, things changed a bit. Aside from the consistency reasons, this PR is important for people running on HDDs. See #28037 (comment), which shared a pretty nice time improvement. Going from a 30 minutes migration to one that took only 80 seconds. Still, that was a long time ago. Will give it a new run on an HDD and update the PR description.

  112. theStack approved
  113. theStack commented at 2:18 pm on September 27, 2024: contributor

    Code-review ACK a4e1ba722be5dbd71a3d21e5da82aee5409d22bd

    Thanks for following up!

    Yeah, now that all the other PRs got merged, things changed a bit. Aside from the consistency reasons, this PR is important for people running on HDDs. See #28037 (comment), which shared a pretty nice time improvement. Going from a 30 minutes migration to one that took only 80 seconds. Still, that was a long time ago. Will give it a new run on an HDD and update the PR description.

    That sounds pretty impressive (>20x?). Would be very interesting indeed to re-benchmark with this branch again (cc @willcl-ark), though as said above the changes also make sense to me if the performance improvement is not that significant.

  114. DrahtBot requested review from achow101 on Sep 27, 2024
  115. furszy commented at 9:03 pm on September 30, 2024: member
    PR description updated. Created #31000 so we can all run the benchmark on an external HDD in a more friendly manner. Have seen a 15% speedup locally on a USB 3.2 pendrive.
  116. pablomartin4btc commented at 11:06 pm on October 9, 2024: member

    ACK a4e1ba722be5dbd71a3d21e5da82aee5409d22bd

    • This PR:
    0./build/src/bench/bench_bitcoin -filter=WalletMigration
    1
    2|               ns/op |                op/s |    err% |     total | benchmark
    3|--------------------:|--------------------:|--------:|----------:|:----------
    4|    5,322,062,123.00 |                0.19 |    0.0% |      5.32 | `WalletMigration`
    
    • master
    0./build/src/bench/bench_bitcoin -filter=WalletMigration
    1
    2|               ns/op |                op/s |    err% |     total | benchmark
    3|--------------------:|--------------------:|--------:|----------:|:----------
    4|   11,362,482,640.00 |                0.09 |    0.0% |     11.36 | `WalletMigration`
    
  117. furszy commented at 11:13 pm on October 10, 2024: member
    @josibake @achow101 friendly ping. It seems we are quite close here.
  118. DrahtBot added the label CI failed on Oct 20, 2024
  119. DrahtBot commented at 8:46 pm on October 20, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/30733818111

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  120. bench: add coverage for wallet migration process 66c9936455
  121. wallet: batch MigrateToDescriptor() db transactions
    Grouping all db writes into a single atomic write operation.
    Speeding up the flow and preventing inconsistent states.
    f2541d09e1
  122. wallet: decouple default descriptors creation from external signer setup
    This will be useful in the following-up commit to batch the entire
    wallet migration process.
    6052c7891d
  123. wallet: introduce 'SetWalletFlagWithDB' 122d103ca2
  124. wallet: provide WalletBatch to 'DeleteRecords'
    So it can be used within an external db txn context.
    055c0532fc
  125. wallet: remove post-migration signals connection
    The wallet is isolated during migration and reloaded at the end
    of the process. There is no benefit on connecting the signals
    few lines before unloading the wallet.
    91e065ec17
  126. wallet: introduce active db txn listeners
    Useful to ensure that the in-memory state is updated only
    after successfully committing the data to disk.
    57249ff669
  127. wallet: provide WalletBatch to 'RemoveTxs'
    Preparing it to be used within a broader db txn procedure.
    aacaaaa0d3
  128. wallet: refactor ApplyMigrationData to return util::Result<void> 34bf0795fc
  129. wallet: provide WalletBatch to 'SetupDescriptorScriptPubKeyMans'
    So it can be used within an external db txn context.
    9ef20e86d7
  130. wallet: migration, consolidate main wallet db writes
    Perform a single db write operation for the entire
    migration procedure.
    7c9076a2d2
  131. wallet: migration, consolidate external wallets db writes
    Perform a single db write operation for each external wallet
    (watch-only and solvables) for the entire migration procedure.
    c98fc36d09
  132. furszy force-pushed on Oct 21, 2024
  133. furszy commented at 1:26 pm on October 21, 2024: member
    Sad rebase due to tiny conflict with #30937. Ready to go.
  134. DrahtBot removed the label CI failed on Oct 21, 2024
  135. achow101 commented at 7:21 pm on October 23, 2024: member
    ACK c98fc36d094a08d44f3c95431db2c5f34a96cc73
  136. DrahtBot requested review from theStack on Oct 23, 2024
  137. DrahtBot requested review from pablomartin4btc on Oct 23, 2024
  138. theStack approved
  139. theStack commented at 7:49 pm on October 23, 2024: contributor
    re-ACK c98fc36d094a08d44f3c95431db2c5f34a96cc73 (as per $ git range-diff a4e1ba7...c98fc36)
  140. pablomartin4btc approved
  141. pablomartin4btc commented at 3:32 pm on October 24, 2024: member

    re-ACK c98fc36d094a08d44f3c95431db2c5f34a96cc73

    (no diff since my previous review, just rebased)

  142. achow101 merged this on Oct 24, 2024
  143. achow101 closed this on Oct 24, 2024

  144. furszy deleted the branch on Oct 24, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-23 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me