wallet: Ensure best block matches wallet scan state

achow101 commented at 9:58 pm on June 3, 2024: member

Implements the idea discussed in #29652 (comment)

Currently, m_last_block_processed and m_last_block_processed_height are not guaranteed to match the block locator stored in the wallet, nor do either of those fields actually represent the last block that the wallet is synced up to. This is confusing and unintuitive.

This PR changes those last block fields to be updated whenever the wallet makes a change to the db for new transaction state found in new blocks. Whenever a block is received that contains a transaction relevant to the wallet, the last block locator will now be written to disk. Furthermore, every block disconnection will now write an updated locator.

To ensure that the locator is relatively recent and loading rescans are fairly quick in the event of unplanned shutdown, it is also now written every 144 blocks (~1 day). Additionally it is now written when the wallet is unloaded so that it is accurate when the wallet is loaded again.

Lastly, the chainstateFlushed notification in the wallet is changed to be a no-op. The best block locator record is no longer written when chainstateFlushed is received from the node since it should already be mostly up to date.

DrahtBot commented at 9:58 pm on June 3, 2024: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30221.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	rkrux, mzumsande, ryanofsky
Concept ACK	fjahr

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#bitcoin-core/gui/872 (gui: Menu action to export a watchonly wallet by achow101)
#32523 (wallet: Remove watchonly behavior and isminetypes by achow101)
#32489 (wallet: Add exportwatchonlywallet RPC to export a watchonly version of a wallet by achow101)
#29680 (wallet: fix unrelated parent conflict doesn’t cause child tx to be marked as conflict by Eunovo)
#28333 (wallet: Construct ScriptPubKeyMans with all data rather than loaded progressively by achow101)
#27865 (wallet: Track no-longer-spendable TXOs separately by achow101)
#27286 (wallet: Keep track of the wallet’s own transaction outputs in memory by achow101)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

LLM Linter (✨ experimental)

Possible typos and grammar issues:

completeley -> completely [typo]
in case the case -> in case of a [redundant words]

DrahtBot added the label Wallet on Jun 3, 2024

Thompson1985 approved

ryanofsky commented at 3:34 pm on June 4, 2024: contributor

I’m not sure, but it seems like a potentially good thing to me that last_block_processed and BESTBLOCK are two distinct concepts.

The last block processed is the block that CWallet in-memory state has been synced to (particularly CWalletTx::m_state which includes mempool / abandoned information not serialized to disk).

The BESTBLOCK record is the last block whose data has been flushed to disk, that the wallet should begin syncing from next time it is reloaded.

It seems like the first commit ae3b8280a7dbb6cc87814ed97dbea5a122cf8215 could lead to a performance regression if it is writing the flushing the BESTBLOCK record on every single blockConnected event from the node. For example during reindexing it could write and sync the database each time a block is connected, even if the block is before the wallets birthday, or doesn’t have any relevant transactions?

I think it probably makes sense to write BESTBLOCK in a smarter way, and probably write it more frequently, but it seems unnecessary to have to write it every block, and it might be bad for performance now or make optimizations harder in the future.

I’m also curious about the idea of saving height with the best block record. I could imagine this being useful, since IIRC parts of wallet loading are complicated by not knowing heights of transactions before the chain is attached, but it doesn’t really seem like the height is used for much yet. Was this intended to be used by something else later?

achow101 commented at 3:48 pm on June 4, 2024: member

The last block processed is the block that CWallet in-memory state has been synced to (particularly CWalletTx::m_state which includes mempool / abandoned information not serialized to disk).

But none of that state is relevant to the last block processed. Any state related to blocks (confirmed or conflicted) is written to disk.

The BESTBLOCK record is the last block whose data has been flushed to disk, that the wallet should begin syncing from next time it is reloaded.

The point is that this discrepancy can result in skipping blocks. It doesn’t make sense to me that we wouldn’t store the block the wallet’s state has been synced to, and it doesn’t make sense that we should rely on a field that means something else to determine what point to start rescanning from on the next load.

If BESTBLOCK doesn’t match the chainstate, that’s fine because it’s a locator and we will just find the fork and sync from there. I don’t think it’s necessary to record which block we think the chainstate is synced to.

ryanofsky commented at 3:56 pm on June 4, 2024: contributor

It doesn’t make sense to me that we wouldn’t store the block the wallet’s state has been synced to

You may be right this is the better approach. But I think the previous approach does make some sense, too. You might not want to write to the wallet database each time a block is connected if the block doesn’t contain any relevant transactions, especially during reindexing.

achow101 commented at 9:03 pm on June 4, 2024: member

You might not want to write to the wallet database each time a block is connected if the block doesn’t contain any relevant transactions

I think you would since not writing it would result in possibly rescanning that block at the next loading which takes a little bit of time, regardless of whether any relevant transactions are in the block.

especially during reindexing.

Perhaps an easy solution to this is to just write the best block every 1000 blocks (or some other interval) when we are in IBD?

I’m also curious about the idea of saving height with the best block record. I could imagine this being useful, since IIRC parts of wallet loading are complicated by not knowing heights of transactions before the chain is attached, but it doesn’t really seem like the height is used for much yet. Was this intended to be used by something else later?

This was mainly done to avoid looking up the height every time we load since that was being problematic and causing a tests to fail. But it definitely could be more useful in the future.

ryanofsky commented at 3:16 pm on June 5, 2024: contributor

The idea of saving heights is interesting to me because wallet code assumes it know block heights many places, but it doesn’t actually store heights anywhere. So for example, if the wallet stored a mapping of block hashes to heights (or other block information) it might be useful for allowing the wallet to work in an offline mode or letting the wallet CLI tool have more capabilities. This is all pretty abstract though, so I’m not suggesting something specific.

Perhaps an easy solution to this is to just write the best block every 1000 blocks (or some other interval) when we are in IBD?

Yes but I think that just adds back the complexity you were trying to get rid of, in a form that seems worse than the status quo. If you implement that, the wallet will be back to tracking the last block processed separately from the best block written to the database. And now, instead of the node determining when data should be flushed to disk and having all wallets flush that data simultaneously, each wallet will have internal logic deciding when to flush. This would be more complicated, and could be worse for power consumption and performance if there are multiple wallets and flushes are happening more frequently at different times.

I think overall this PR is doing some good things, but the goals seem either not clear, or not clearly good, so I wonder if maybe you could take these changes and make more targeted PRs for the things you want to achieve?

Or, if this PR is definitely the direction you want to go, I’m happy to review it. I don’t like some things it is doing, but overall I think it is a reasonable change.

DrahtBot added the label Needs rebase on Jul 22, 2024

achow101 force-pushed on Jul 22, 2024

DrahtBot removed the label Needs rebase on Jul 22, 2024

DrahtBot added the label Needs rebase on Aug 12, 2024

in test/functional/wallet_assumeutxo.py:108 in ebf2e2863c outdated

84@@ -84,8 +85,6 @@ def run_test(self):
85             assert_equal(n.getblockchaininfo()[
86                          "headers"], SNAPSHOT_BASE_HEIGHT)
87 
88-        w.backupwallet("backup_w.dat")

fjahr commented at 11:18 am on August 20, 2024:

I’m thinking that it might make sense to keep this backup where it is and add second backup created at 199. Then this could test that both cases work as expected, i.e. 199 errors below and 299 doesn’t. 299 is an interesting edge case in general (snapshot height == backup height).

fjahr commented at 11:28 am on August 20, 2024: contributor

Concept ACK

I can confirm that best block locator and last_processed_block being different is confusing, see also #30455 (review)

Currently, this needs a rebase and I’m curious if @achow101 plans to make further changes based on @ryanofsky ’s comments.

ryanofsky commented at 2:28 pm on August 20, 2024: contributor

I can confirm that best block locator and last_processed_block being different is confusing

I wonder if something else could be done to resolve this confusion other than writing to every wallet file every time a new block is connected, even if the block doesn’t contain any relevant transactions. To me, “last block processed” and “last block flushed” seem like different concepts, and we could force them to be the same but only if we:

Give up flexibility of not needing to flush each wallet each time a block is connected (which seems inefficient during sync)
Give up ability to do coordinated flushes across the chainstate database, index databases and wallet databases time (which seems like it could waste resources and hurt system performance)

fjahr commented at 2:35 pm on August 20, 2024: contributor

I wonder if something else could be done to resolve this confusion

Ok, I would say it’s a problem when users see behavior that is inconsistent because of this difference. If the users don’t notice it I guess these don’t have to be the same at all times. There could be a more “pull-based” approach where they are aligned when the user interacts with the wallet, similar to what I tried to do for the backup case in #30678. But I am not clear yet on which approach is the better one all factors considered.

ryanofsky commented at 2:48 pm on August 20, 2024: contributor

It would be great if we could open an issue describing the user behavior that’s confusing. I read the comment you linked to but I didn’t really understand the context of what was going on with those backups. If there can be a github issue that simply describes steps to reproduce, expected behavior, and actual behavior, we can probably figure out if expected behavior is reasonable, and implement a fix if so.

fjahr commented at 7:43 pm on August 20, 2024: contributor

It would be great if we could open an issue describing the user behavior that’s confusing.

I have created an issue here: #30686

achow101 force-pushed on Aug 29, 2024

Thompson1985 approved

Thompson1985 commented at 8:08 pm on August 29, 2024: none

Thank you 👍

DrahtBot removed the label Needs rebase on Aug 29, 2024

DrahtBot added the label Needs rebase on Sep 23, 2024

achow101 force-pushed on Oct 4, 2024

DrahtBot removed the label Needs rebase on Oct 4, 2024

DrahtBot commented at 9:52 pm on October 4, 2024: contributor

🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/31102970538

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

DrahtBot added the label CI failed on Oct 4, 2024

maflcko commented at 3:35 pm on October 16, 2024: member

Maybe mark as draft while CI is red?

DrahtBot added the label Needs rebase on Oct 24, 2024

achow101 force-pushed on Oct 24, 2024

DrahtBot removed the label Needs rebase on Oct 24, 2024

DrahtBot added the label Needs rebase on Oct 25, 2024

achow101 force-pushed on Oct 25, 2024

DrahtBot removed the label Needs rebase on Oct 25, 2024

DrahtBot removed the label CI failed on Oct 25, 2024

DrahtBot added the label Needs rebase on Nov 15, 2024

achow101 force-pushed on Nov 19, 2024

DrahtBot removed the label Needs rebase on Nov 19, 2024

DrahtBot added the label Needs rebase on Jan 15, 2025

achow101 force-pushed on Jan 20, 2025

DrahtBot removed the label Needs rebase on Jan 20, 2025

DrahtBot added the label Needs rebase on Jan 31, 2025

achow101 force-pushed on Feb 1, 2025

DrahtBot removed the label Needs rebase on Feb 1, 2025

DrahtBot added the label CI failed on Feb 1, 2025

in src/wallet/wallet.cpp:1537 in ddc8095208 outdated

1530@@ -1517,8 +1531,11 @@ void CWallet::blockConnected(ChainstateRole role, const interfaces::BlockInfo& b
1531     assert(block.data);
1532     LOCK(cs_wallet);
1533 
1534-    m_last_block_processed_height = block.height;
1535-    m_last_block_processed = block.hash;
1536+    // Update the best block first. This will set the best block's height, which is
1537+    // needed by MarkConflicted.
1538+    // Although this also writes the best block to disk, this is okay even if there is an unclean
1539+    // shutdown since reloading the wallet will still rescan this block.

mzumsande commented at 10:25 pm on February 6, 2025:

this is okay even if there is an unclean shutdown since reloading the wallet will still rescan this block

I don’t think that this is the case. We will be temporarily in a state in which the wallet thinks it has scanned the block (locator is written to disk) but haven’t actually done the SyncTransaction work yet. This is fragile, because in case of an unclean shutdown a future rescan may not include this block if it thinks, based on the locator, that the block has already been scanned.

Therefore, it seems better to me to first do the SyncTransaction work and only then update the best block locator on disk - if there is an unclean shutdown with this order, we’ll be sure to sync the transactions again.

Although the best solution would be to it all in one batch.

in src/wallet/wallet.cpp:1548 in ddc8095208 outdated

1585@@ -1573,6 +1586,9 @@ void CWallet::blockDisconnected(const interfaces::BlockInfo& block)
1586             }
1587         }
1588     }
1589+
1590+    // Update the best block
1591+    SetLastBlockProcessed(block.height - 1, *Assert(block.prev_hash));

mzumsande commented at 8:58 pm on February 7, 2025:

To deal with unclean shutdowns, during disconnect the reverse order compared to blockConnected would be better: If the locator is updated first, an unclean shutdown will lead to a rescan of the disconnected block (which might do nothing but correct the locator). But if txns are set to Inactive first and have an unclean shutdown after that, the transactions will be missing from the wallet / balance. As mentioned above, doing it all in one batch would resolve this in a more elegant way.

ryanofsky commented at 3:41 pm on May 13, 2025:

In commit “wallet: Update best block record after block dis/connect” (f1f254f6c9d3cead617300367442c1bbb449af7c)

re: #30221 (review)

To deal with unclean shutdowns, during disconnect the reverse order compared to blockConnected would be better: If the locator is updated first, an unclean shutdown will lead to a rescan of the disconnected block (which might do nothing but correct the locator). But if txns are set to Inactive first and have an unclean shutdown after that, the transactions will be missing from the wallet / balance. As mentioned above, doing it all in one batch would resolve this in a more elegant way.

I might not totally understand this comment, but I think it is suggesting calling WriteBestBlock() at the top and SetLastBlockProcessed() at the bottom of blockDisconnected() , which is the opposite of what currently happens in blockConnected() where the last block is updated at the top and best block is updated at the bottom.

This could be reasonable. I also think it could be reasonable to do the opposite and just update the last block and best block at the bottom of both methods.

I think if the concern here is dealing with “unclean shutdowns” there is probably nothing special about the block disconnected method or reason to be particularly concerned about shutdowns here (unless I’m missing something), and it should be if the ok best block is a little out of date in that case.

achow101 commented at 6:40 pm on May 13, 2025:

I think I will leave this as is.

mzumsande commented at 11:29 pm on February 7, 2025: contributor

Concept ACK

I verified that this would solve the issues from #31824, only looked at the first couple commits in detail so far.

With respect to the above discussion, I think that the concept of delayed flushing only makes sense if it applies to all relevant data for a given block connection /disconnection (which is the case for the indexes, for example). But when other changes to wallet txns are synced to disk immediately, it seems wrong to delay flushing the locator - this will lead to a mixed state where some relevant data is flushed and other data isn’t, which will lead to issues such as the one above.

As for the point of IBD performance due to updating the locator at every block: Maybe this could be prevented by keeping track during block (dis)connection, if we did any other changes to the wallet db for this block. If nothing else needed changing ( wallet not affected by the block, only the locator would be updated) it should be safe to delay / skip the update.

DrahtBot added the label Needs rebase on Feb 14, 2025

achow101 force-pushed on Mar 6, 2025

achow101 commented at 1:50 am on March 6, 2025: member

As discussed at CoreDev, I’ve updated this PR to sync the best block periodically or when a connected block contains a relevant transaction.

I’ve also had to make a couple of changes to AttachChain following #31629

achow101 force-pushed on Mar 6, 2025

DrahtBot removed the label Needs rebase on Mar 6, 2025

achow101 force-pushed on Mar 6, 2025

achow101 commented at 7:54 pm on March 6, 2025: member

I added a change to make sure that the most recent best block is written to disk on wallet unload. However, it seems like this uncovered some weirdness in the wallet unloading on shutdown. In particular, StopWallets() was being called before UnloadWallets() which I think is incorrect. I added a commit which flips this order, and it seems to work, although there may be side effects I have not discovered yet. This primarily affects BDB wallets as the shutdown for BDB wallets is much more involved and confusing due to the shared database environment. This is not an issue for SQLite.

achow101 force-pushed on Mar 7, 2025

DrahtBot removed the label CI failed on Mar 7, 2025

DrahtBot added the label Needs rebase on Mar 13, 2025

achow101 force-pushed on Mar 14, 2025

DrahtBot removed the label Needs rebase on Mar 14, 2025

DrahtBot added the label CI failed on Mar 14, 2025

in src/wallet/wallet.cpp:1505 in 8b8cb9f4e9 outdated

1560         transactionRemovedFromMempool(block.data->vtx[index], MemPoolRemovalReason::BLOCK);
1561     }
1562+
1563+    // Update on disk if this block resulted in us updating a tx, or periodically every 144 blocks (~1 day)
1564+    if (wallet_updated || block.height % 144 == 0) {
1565+        SetLastBlockProcessed(block.height, block.hash);

mzumsande commented at 6:39 pm on March 25, 2025:

8b8cb9f4e9055a3283805ed6ff9397236842bb34 : need to update wallet_reorgsrestore.py after #31757 introduced a test for the bug this PR fixes.

achow101 commented at 10:03 pm on April 11, 2025:

Done

in src/wallet/wallet.cpp:655 in eecaca12f6 outdated

648@@ -649,15 +649,6 @@ bool CWallet::ChangeWalletPassphrase(const SecureString& strOldWalletPassphrase,
649     return false;
650 }
651 
652-void CWallet::chainStateFlushed(ChainstateRole role, const CBlockLocator& loc)

mzumsande commented at 9:28 pm on March 25, 2025:

eecaca12f6979aefed51c7b19f886c9d7d4e9be4: Two calls to chainStateFlushed are left, in MigrateLegacyToDescriptor and bench/wallet_migration.cpp

achow101 commented at 10:03 pm on April 11, 2025:

Removed

mzumsande commented at 9:46 pm on March 25, 2025: contributor

I don’t completely understand what the second part of the PR (BESTBLOCK_NOMERKLE serialization) achieves with the current approach. With 748063a9820cb740c9b68f80b9e8c97d2a21e83d, chainStateFlushed has been made a no-op and could be removed. On the other hand, the expressed goal of having m_last_block_processed and m_last_block_processed_height always in sync with the block locator is no longer current, given that we only update every 144 blocks if there is nothing relevant for the wallet in these blocks. In that case, why not remove chainstateFlushed after 748063a9820cb740c9b68f80b9e8c97d2a21e83d (plus update the locator on unload) and call it a day, avoiding any possible compatibility issues and the complications of a walletdb format change?

achow101 force-pushed on Apr 11, 2025

achow101 commented at 10:05 pm on April 11, 2025: member

I’ve removed the BestBlock struct refactor and walletdb changes so that this only changes how m_last_block_processed is updated. Those changes will be put into a separate PR at a later time.

I also noticed that the concurrent writes test in wallet_descriptor.py no longer tests the issue since it relied on chainStateFlushed() writing to the wallet. I’m pretty sure that there are no other cases of concurrent writes, so I think it is fine to remove that test now.

in test/functional/wallet_descriptor.py:1 in 229a159390

maflcko commented at 7:59 am on April 12, 2025:

 0[22:04:03.113] test/functional/wallet_descriptor.py:12:8: F401 [*] `concurrent.futures` imported but unused
 1[22:04:03.113]    |
 2[22:04:03.113] 10 |     pass
 3[22:04:03.113] 11 | 
 4[22:04:03.113] 12 | import concurrent.futures
 5[22:04:03.113]    |        ^^^^^^^^^^^^^^^^^^ F401
 6[22:04:03.113] 13 | 
 7[22:04:03.113] 14 | from test_framework.blocktools import COINBASE_MATURITY
 8[22:04:03.113]    |
 9[22:04:03.113]    = help: Remove unused import: `concurrent.futures`
10[22:04:03.113] 
11[22:04:03.113] test/functional/wallet_descriptor.py:15:40: F401 [*] `test_framework.descriptors.descsum_create` imported but unused
12[22:04:03.113]    |
13[22:04:03.113] 14 | from test_framework.blocktools import COINBASE_MATURITY
14[22:04:03.113] 15 | from test_framework.descriptors import descsum_create
15[22:04:03.113]    |                                        ^^^^^^^^^^^^^^ F401
16[22:04:03.113] 16 | from test_framework.test_framework import BitcoinTestFramework
17[22:04:03.113] 17 | from test_framework.util import (
18[22:04:03.113]    |
19[22:04:03.113]    = help: Remove unused import: `test_framework.descriptors.descsum_create`
20[22:04:03.113] 
21[22:04:03.113] Found 2 errors.
22[22:04:03.113] [*] 2 fixable with the `--fix` option.
23[22:04:03.115] ^^^
24[22:04:03.116] `ruff` found errors!
25[22:04:03.116] ^---- ⚠️ Failure generated from lint check 'py_lint' (Lint Python code)!
26[22:04:03.116] 
27[22:04:03.116]

achow101 force-pushed on Apr 12, 2025

DrahtBot removed the label CI failed on Apr 12, 2025

in src/wallet/load.cpp:178 in fa7d90ff02 outdated

172@@ -173,7 +173,7 @@ void FlushWallets(WalletContext& context)
173     }
174 }
175 
176-void StopWallets(WalletContext& context)
177+void CloseWallets(WalletContext& context)
178 {
179     for (const std::shared_ptr<CWallet>& pwallet : GetWallets(context)) {

ryanofsky commented at 6:23 pm on April 14, 2025:

In commit “wallet: Swap order of Unload and Close on shutdown” (fa7d90ff025c13d33342e2f36be6b2b60ddca558)

I’m confused about this change. If UnloadWallets, which deletes CWallet objects, is being called before CloseWallets (formerly StopWallets) then calling CloseWallets will have no effect because there are no wallets to call CWallet::Close on.

It seems like the thing this commit is really doing is avoiding calling CWallet::Close so that a new best block write added in RemoveWallet in following commit can work (commit “wallet: Write best block record on unload” (b48be4b17929e540b6c093fa868b8aca6e01266e).

If that’s the goal I think I’d suggest just completely dropping the CloseWallets function and the ~WalletLoaderImpl() destructor. If calling CWallet::Close call explicitly is needed it could be done in RemoveWallet, but if this change is working, it seems like that was never actually necessary and could be skipped.

achow101 commented at 8:02 pm on April 14, 2025:

Dropped it as suggested.

I believe CloseWallets did actually have an effect in that it is the step that deletes the /databases directory for legacy wallets. However, this step is skipped by unloadwallet, and afaict, is mostly unnecessary. So I’ve gone ahead and removed it. Since BDB is going to be removed, I see no reason to try to preserve this behavior especially when the two methods of unloading differ.

For sqlite wallets, this makes no difference.

ryanofsky approved

ryanofsky commented at 7:11 pm on April 14, 2025: contributor

Code review ACK 4882e71403111f11651f97a6b1cf7d861e619d8f. This change seems ok now that it is not writing to disk on every block, and there are some nice cleanups here.

I think PR description should be rewritten to say what the benefits of this change are now that statements like “This PR changes the those last block fields to actually be in sync with the record stored on disk.” are no longer true.

I guess my thoughts are:

I like all the small cleanups here making operations more explicit.
I like that blockConnected notification now calls WriteBestBlock to update the wallet locator whenever the block contains any relevant transactions. That seems like a clear win.
I think it’s probably good that blockDisconnected notification calls WriteBestBlock every block. Might be a little cleaner to only do it when the wallet is actually updated like blockConnected, but doubt that matters.
Calling WriteBestBlock every 144 blocks seems like it might write too frequently during initial sync and not write frequently enough after that. Seems like it might be smarter to update the best block based on clock time like the rescan implementation, which seems to write it every 60 seconds.
I don’t like not updating the best block when the chainStateFlushed notification is received. It seems nice to be able to tell a daemon to flush all its state to disk at once, and have some high level control over flushing for efficiency so each component is not just doing its own thing. But this is a theoretical objection because I don’t think chainStateFlushed was a practically useful notification before, even though it could have been.

Overall LGTM.

DrahtBot requested review from mzumsande on Apr 14, 2025

DrahtBot requested review from fjahr on Apr 14, 2025

achow101 force-pushed on Apr 14, 2025

achow101 commented at 8:03 pm on April 14, 2025: member

Updated the PR description.

ryanofsky approved

ryanofsky commented at 6:27 pm on April 16, 2025: contributor

Code review ACK 480186d7093117932bb1f672932813a8936b6c42. Only change since last review is dropping extra wallet shutdown step, which seems like a nice simplification.

PR description has also been updated and looks good. I did find last two paragraphs a little confusing because when it mentions “last block” fields and records it sounds might be referring to m_last_block_processed and m_last_block_processed_height some places where it’s actually referring to the best block locator. Would maybe suggest:

This PR changes those last block fields to be updated whenever the wallet makes a change to the db for new transaction state found in new blocks. Whenever a block is received that contains a transaction relevant to the wallet, the last block locator will now be written to disk. Furthermore, every block disconnection will now write an updated locator.

To ensure that the locator is relatively recent and loading rescans are fairly quick in the event of unplanned shutdown, it is also now written every 144 blocks (~1 day). Additionally it is now written when the wallet is unloaded so that it is accurate when the wallet is loaded again.

Could also add “The wallet locator is no longer written when a chainstateFlushed notification is received from the node, since it should already be mostly up to date”

achow101 commented at 6:54 pm on April 16, 2025: member

Updated description mostly as suggested.

in src/wallet/load.h:35 in 00fe040eca outdated

30@@ -31,10 +31,10 @@ void StartWallets(WalletContext& context);
31 //! Flush all wallets in preparation for shutdown.
32 void FlushWallets(WalletContext& context);
33 
34-//! Stop all wallets. Wallets will be flushed first.
35-void StopWallets(WalletContext& context);
36+//! Close all wallet databases.
37+void CloseWallets(WalletContext& context);

mzumsande commented at 6:30 pm on April 23, 2025:

This function is declared here, but not implemented or used anywhere. Is this meant to be called somewhere in UnloadWallets? Or does UnloadWallets already close the db somewhere (where?) and this can be deleted?

achow101 commented at 8:20 pm on April 23, 2025:

Just forgot to remove it, done now.

The db is closed when a CWallet is destroyed.

mzumsande commented at 8:08 pm on April 23, 2025: contributor

Looks good, except for an issue with commit 00fe040eca4b77d05ec5652b4d494b0ded183653

DrahtBot requested review from mzumsande on Apr 23, 2025

achow101 force-pushed on Apr 23, 2025

mzumsande commented at 9:14 pm on April 23, 2025: contributor

Looks like WalletMigration bench fails on this branch (which is now run in the CI).

DrahtBot added the label CI failed on Apr 23, 2025

DrahtBot commented at 10:06 pm on April 23, 2025: contributor

🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/41041933804

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

achow101 commented at 10:08 pm on April 23, 2025: member

Fixed CI

achow101 force-pushed on Apr 23, 2025

DrahtBot removed the label CI failed on Apr 23, 2025

achow101 force-pushed on Apr 24, 2025

in src/wallet/wallet.cpp:2936 in 8b180358fb outdated

3119@@ -3120,7 +3120,8 @@ std::shared_ptr<CWallet> CWallet::Create(WalletContext& context, const std::stri
3120         }
3121 
3122         if (chain) {
3123-            walletInstance->chainStateFlushed(ChainstateRole::NORMAL, chain->getTipLocator());
3124+            WalletBatch batch(walletInstance->GetDatabase());
3125+            batch.WriteBestBlock(chain->getTipLocator());

ryanofsky commented at 5:08 pm on May 5, 2025:

In commit “wallet: Replace chainStateFlushed in loading with WriteBestBlock” (8b180358fb4c0585f2d9b8c4239d99a98c1fd663)

Not important, but if a goal of this PR is to try to keep m_last_block_processed and bestblock closer in sync, seems like it would be good to call SetLastBlockProcessed here instead of WriteBestBlock.

Same suggestion also applies to AttachChain below, but that WriteBestBlock to SetLastBlockProcessed replacement is already made in a later commit 08668272030bfcffc78efd77b642d6e50563d5dc. I think I’d also suggest squashing that commit into this one to reduce internal churn in the PR.

achow101 commented at 7:36 pm on May 7, 2025:

Done and squashed the commits.

in src/wallet/wallet.cpp:167 in eceb848494 outdated

163@@ -164,6 +164,7 @@ bool RemoveWallet(WalletContext& context, const std::shared_ptr<CWallet>& wallet
164 
165     interfaces::Chain& chain = wallet->chain();
166     std::string name = wallet->GetName();
167+    wallet->WriteBestBlock();

ryanofsky commented at 5:22 pm on May 5, 2025:

In commit “wallet: Write best block record on unload” (eceb848494ea24d3686a9ff9a2d25b657cce9d6e)

Current change should be ok because it’s ok if locator is a little behind, but this doesn’t seem like most ideal place to write the best block because notifications could still be incoming at the point. A better place would seem like the FlushAndDeleteWallet function before the Flush call. Even better might be to move both the WriteBestBlock call from here and the Flush call from FlushAndDeleteWallet into the CWallet destructor.

achow101 commented at 7:36 pm on May 7, 2025:

Moved to the destructor.

achow101 commented at 0:02 am on May 8, 2025:

https://github.com/bitcoin/bitcoin/actions/runs/14892979963/job/41829335187?pr=30221 seems like the same possible issue that you’ve described where a notification comes in during the unload.

ryanofsky commented at 5:41 pm on May 8, 2025:

re: #30221 (review)

Just to be clear I was suggesting moving both WriteBestBlock call and the Flush call into the destructor, because if you only move the WriteBestBlock call, it will happen after flushing, which seems incorrectly logically, though maybe it is ok if sqlite Flush is a no-op. I didn’t look into the CI failures though, just mentioning this in case it’s related.

achow101 commented at 5:56 pm on May 8, 2025:

Flush() is deleted post #28710 as sqlite’s Flush() is a no-op.

ryanofsky commented at 6:03 pm on May 8, 2025:

Flush() is deleted post #28710 as sqlite’s Flush() is a no-op.

Thanks, makes sense. I had been looking to see if the Flush call was moved in the diff, but of course it’s not because it doesn’t exist anymore.

achow101 commented at 11:00 pm on May 8, 2025:

I could not figure out the CI failure, so I’ve reverted this change.

ryanofsky commented at 4:39 pm on May 13, 2025:

In commit “wallet: Write best block record on unload” (8ecc7653556a1977eae5a902d51a2ececde94913)

Just to be sure, current state is that if WriteBestBlock is called here, rather than being called in the destructor or being called in FlushAndDeleteWallet, the best block locator could be a little out of date and this is maybe not ideal but fine?

re: #30221 (comment)

It can’t be static because the FUZZ_TARGET below needs to access it. However, the wallet could be created in there instead of in the setup, as the other wallet fuzz tests do.

Yes, suggestion would be to follow the fuzz test setup pattern and create a static variable and global pointer. But no need to change this if debugging the CI failure is hard and only consequence is the best block could be a little behind.

achow101 commented at 6:31 pm on May 13, 2025:

the best block locator could be a little out of date and this is maybe not ideal but fine?

I’m not sure that actually could happen, but I think that’s the worse that can happen, and I think that’s fine. It should only be a couple of blocks which would be fast to rescan.

in src/wallet/wallet.h:787 in 2d72a8894c outdated

789@@ -791,7 +790,6 @@ class CWallet final : public WalletStorage, public interfaces::Chain::Notificati
790     /** should probably be renamed to IsRelevantToMe */
791     bool IsFromMe(const CTransaction& tx) const;
792     CAmount GetDebit(const CTransaction& tx, const isminefilter& filter) const;
793-    void chainStateFlushed(ChainstateRole role, const CBlockLocator& loc) override;

ryanofsky commented at 5:28 pm on May 5, 2025:

In commit “wallet: Remove chainStateFlushed” (2d72a8894cfb609ee616b098e165c722183ba325)

Would it be easily possible to squash this commit and the earlier commit “wallet: Remove WriteBestBlock from chainStateFlushed” (35b636469983fdcd5f8e62448f4733effe596e41) together?

I feel like that wouild make the PR easier to understand because the earlier commit doesn’t provide any rationale while the commit message here is a little misleading because the commit at this point is only removing dead code.

Another alternative to squashing the two commits could be move to the rationale in this commit message to the earlier commit to describe reason for the earlier change.

achow101 commented at 7:36 pm on May 7, 2025:

Squashed.

ryanofsky approved

ryanofsky commented at 5:30 pm on May 5, 2025: contributor

Code review ACK de7b5eda7ef04e875850fe594de2a89992b7bd46. No significant changes since the last review other than a rebase and a CI fix.

Left some suggestions that I think could make the PR a little easier to understand and review, but they are not important so feel free to ignore.

DrahtBot added the label Needs rebase on May 7, 2025

in test/functional/wallet_reorgsrestore.py:122 in de7b5eda7e outdated

121@@ -122,11 +122,11 @@ def test_reorg_handling_during_unclean_shutdown(self):
122         self.start_node(0)
123         assert_equal(node.getbestblockhash(), tip)
124 
125-        # Due to an existing bug, the wallet incorrectly keeps the transaction in an abandoned state, even though that's
126-        # no longer the case (after the unclean shutdown, the node's chain returned to the pre-invalidation tip).
127-        # This issue blocks any future spending and results in an incorrect balance display.
128+        # After disconnecting the block, the wallet should recorded the new best block.

maflcko commented at 2:39 pm on May 7, 2025:

nit: “should recorded” looks like a typo

should recorded → should record

achow101 commented at 7:36 pm on May 7, 2025:

Done

achow101 force-pushed on May 7, 2025

DrahtBot removed the label Needs rebase on May 7, 2025

achow101 force-pushed on May 7, 2025

DrahtBot added the label CI failed on May 8, 2025

DrahtBot commented at 1:21 am on May 8, 2025: contributor

🚧 At least one of the CI tasks failed. Task fuzzer,address,undefined,integer, no depends: https://github.com/bitcoin/bitcoin/runs/41829354924 LLM reason (✨ experimental): The CI failure is caused by a dynamic-type-mismatch detected by UndefinedBehaviorSanitizer during fuzz testing.

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

ryanofsky commented at 6:11 pm on May 8, 2025: contributor

re: #30221 (comment)

🚧 At least one of the CI tasks failed. Task fuzzer,address,undefined,integer, no depends: https://github.com/bitcoin/bitcoin/runs/41829354924 LLM reason (✨ experimental): The CI failure is caused by a dynamic-type-mismatch detected by UndefinedBehaviorSanitizer during fuzz testing.

This seems unrelated to the other CI failures, but it looks like this failure is caused by a bug in the fuzz test:

https://github.com/bitcoin/bitcoin/blob/7eed9f17aec3e3ba93ec180a072b1c2f08dcbb39/src/wallet/test/fuzz/fees.cpp#L17-L25

Where the CWallet object is destroyed after the TestingSetup object so the interfaces::Chain object is already destroyed and this error happens.

Simplest fix would probably be to make the CWallet object a static variable in the function like the TestingSetup variable is, instead of a global variable.

achow101 force-pushed on May 8, 2025

achow101 commented at 11:01 pm on May 8, 2025: member

Simplest fix would probably be to make the CWallet object a static variable in the function like the TestingSetup variable is, instead of a global variable.

It can’t be static because the FUZZ_TARGET below needs to access it. However, the wallet could be created in there instead of in the setup, as the other wallet fuzz tests do.

But, with moving WriteBestBlock back to RemoveWallet, this is not necessary either.

DrahtBot removed the label CI failed on May 9, 2025

DrahtBot added the label Needs rebase on May 9, 2025

achow101 force-pushed on May 9, 2025

DrahtBot removed the label Needs rebase on May 9, 2025

in src/wallet/wallet.cpp:4464 in 8ecc765355 outdated

4471@@ -4471,4 +4472,17 @@ std::optional<CKey> CWallet::GetKey(const CKeyID& keyid) const
4472     }
4473     return std::nullopt;
4474 }
4475+
4476+void CWallet::WriteBestBlock() const

mzumsande commented at 4:32 pm on May 12, 2025:

nit: could save a little bit of duplicated code by having SetLastBlockProcessed() call WriteBestBlock() (which means that the locking would have to happen in RemoveWallet().

achow101 commented at 6:44 pm on May 13, 2025:

Done

in src/wallet/wallet.cpp:1515 in f1f254f6c9 outdated

1513         transactionRemovedFromMempool(block.data->vtx[index], MemPoolRemovalReason::BLOCK);
1514     }
1515+
1516+    // Update on disk if this block resulted in us updating a tx, or periodically every 144 blocks (~1 day)
1517+    if (wallet_updated || block.height % 144 == 0) {
1518+        SetLastBlockProcessed(block.height, block.hash);

mzumsande commented at 4:33 pm on May 12, 2025:

nit: could also call just introduce and use WriteBestBlock() here since SetLastBlockProcessedInMem is already called above.

ryanofsky commented at 3:29 pm on May 13, 2025:

re: #30221 (review)

In commit “wallet: Update best block record after block dis/connect” (f1f254f6c9d3cead617300367442c1bbb449af7c)

nit: could also call just introduce and use WriteBestBlock() here since SetLastBlockProcessedInMem is already called above.

This does seem like a nice suggestion. I also wonder if we can consolidate the seperate updates in blockConnected, deleting SetLastBlockProcessedInMem at the top, and moving it to the bottom here like:

0SetLastBlockProcessedInMem(block.height, block.hash);
1if (wallet_updated || block.height % 144 == 0) WriteBestBlock();

This could be easier to understand and it would make blockConnected implementation more similar to blockDisconnected.

achow101 commented at 6:45 pm on May 13, 2025:

Changed to WriteBestBlock().

I also wonder if we can consolidate the seperate updates in blockConnected, deleting SetLastBlockProcessedInMem at the top, and moving it to the bottom here like:

It’s at the top because there’s some stuff in AddToWallet which need the last block processed in memory to be updated to the current block.

I have them split so that if there is an unclean shutdown, the block will be rescanned on next load which is always safe.

mzumsande commented at 5:03 pm on May 12, 2025: contributor

Code Review ACK e2e9c30 (nits are not important)

in src/wallet/wallet.cpp:684 in f1f254f6c9 outdated

679+
680+    CBlockLocator loc;
681+    chain().findBlock(m_last_block_processed, FoundBlock().locator(loc));
682+
683+    WalletBatch batch(GetDatabase());
684+    batch.WriteBestBlock(loc);

ryanofsky commented at 3:26 pm on May 13, 2025:

In commit “wallet: Update best block record after block dis/connect” (f1f254f6c9d3cead617300367442c1bbb449af7c)

EDIT: Martin left a similar suggestion #30221 (review). Would be nice to follow up on any of these ideas.

It seems like this code is redundant and could be simplified:

 0--- a/src/wallet/wallet.cpp
 1+++ b/src/wallet/wallet.cpp
 2@@ -666,12 +666,7 @@ void CWallet::SetLastBlockProcessed(int block_height, uint256 block_hash)
 3     AssertLockHeld(cs_wallet);
 4 
 5     SetLastBlockProcessedInMem(block_height, block_hash);
 6-
 7-    CBlockLocator loc;
 8-    chain().findBlock(m_last_block_processed, FoundBlock().locator(loc));
 9-
10-    WalletBatch batch(GetDatabase());
11-    batch.WriteBestBlock(loc);
12+    WriteBestBlock();
13 }
14 
15 void CWallet::SetMinVersion(enum WalletFeature nVersion, WalletBatch* batch_in)

But this also makes me think it might be better to just delete the SetLastBlockProcessed method, and only keep SetLastBlockProcessedInMem and WriteBestBlock methods to avoid confusing the two concepts. (I do think keeping InMem suffix is nice in either case and clarifies things).

achow101 commented at 6:46 pm on May 13, 2025:

Done as suggested.

in src/wallet/wallet.cpp:3244 in 357a65fd42 outdated

3240                 return false;
3241             }
3242+            walletInstance->m_attaching_chain = false;
3243+            // Set and update the best block record
3244+            // Although ScanForWalletTransactions will have stopped at the best block that was set prior to the rescan,
3245+            // we still need to make sure that the best block on disk is set correctly as rescanning may overwrite it.

ryanofsky commented at 4:17 pm on May 13, 2025:

In commit “wallet: Replace chainStateFlushed in loading with SetLastBlockProcessed” (357a65fd426f2122f390803d572fd89d4b0177c4)

I think everything in the comment is accurate but also vague, and it raises questions about when the last scanned block would not match the last/best blocks. I think it might be better to be more concrete like:

// Set last block scanned as the last block processed, because it may be different in the case of a reorg. Also save the best block locator because rescanning only updates it intermittently.

achow101 commented at 6:47 pm on May 13, 2025:

Done as suggested.

ryanofsky approved

ryanofsky commented at 4:46 pm on May 13, 2025: contributor

Code review ACK e2e9c30a4d04bd8173c3386c5b97dbac227e810e just moving WriteBestBlock call from destructor back to RemoveWallet() since last review.

I left a bunch of suggestions below and repeated some of Martin’s but they are all nitpicking at this point so feel free to ignore

DrahtBot requested review from mzumsande on May 13, 2025

achow101 force-pushed on May 13, 2025

in src/wallet/wallet.cpp:4475 in 74450726b4 outdated

4487+        chain().findBlock(m_last_block_processed, FoundBlock().locator(loc));
4488+
4489+        WalletBatch batch(GetDatabase());
4490+        batch.WriteBestBlock(loc);
4491+    }
4492+}

rkrux commented at 7:25 pm on May 13, 2025:

Now that we have this WriteBestBlock function, something similar happens within BackupWallet as well: https://github.com/bitcoin/bitcoin/blob/8309a9747a8df96517970841b3648937d05939a3/src/wallet/wallet.cpp#L3267-L3274

Do you think we can leverage WriteBestBlock over there? I mention because the pattern is quite similar and I would prefer to have the same function being called.

ryanofsky commented at 4:53 pm on May 14, 2025:

Could also be good to check !loc.IsNull() in this function like that code is doing. (Could assert it or just not write to the database if it is null.)

achow101 commented at 6:05 pm on May 14, 2025:

Indeed, done.

in src/wallet/wallet.cpp:3224 in 9a7792acec outdated

3236             }
3237+            walletInstance->m_attaching_chain = false;
3238+            // Set and update the best block record
3239+            // Set last block scanned as the last block processed as it may be different in case the case of a reorg.
3240+            // Also save the best block locator because rescanning only updates it intermittently.
3241+            walletInstance->SetLastBlockProcessed(*scan_res.last_scanned_height, scan_res.last_scanned_block);

rkrux commented at 7:28 pm on May 13, 2025:

This comment is specific to the concept of scanning for wallet transactions and then updating the last block processed both in memory and disk:

Additionally, after rescanning on wallet loading, instead of writing the locator for the current chain tip, write the locator for the last block that the rescan had scanned.

There are couple other usages of ScanForWalletTransactions in rescanblockchain RPC and conditionally in importdescriptors. I guess we don’t want to update the block locator in disk in these cases because we also sync the block record to disk every 144 blocks now & doing it in these 2 RPCs doesn’t seem necessary?

achow101 commented at 6:05 pm on May 14, 2025:

Those usages of ScanForWalletTransactions specifically set the option to not write the last block scanned to disk. This is because they are expected to be used during normal operation when the wallet will still be handling incoming blocks and transactions, so having that record possibly be out sync is actually quite bad.

mzumsande commented at 6:28 pm on May 14, 2025:

Also just mentioning that rescanblockchain can be used with ranges (start_height / stop_height). If stop_height is set to some past block, setting the best block back to that old block would be incorrect and lead to wrong balances etc.

rkrux commented at 11:18 am on May 15, 2025:

Thanks, this makes sense to me.

Limiting the writing of best block to disk mainly to cases¹ that are actively involved in processing the new blocks seems logical. Since these cases will be updating the best block periodically & conditionally, it’s prudent to not interfere by doing the same via other user driven RPCs that work on different criterion as highlighted in the above comments.

This is because they are expected to be used during normal operation when the wallet will still be handling incoming blocks and transactions

I find this corroborated here in the scan function: https://github.com/bitcoin/bitcoin/blob/c779ee3a4044df3a263394bbb8b104aeeaa7c727/src/wallet/wallet.cpp#L1893-L1896

blockConnected, blockDisconnected notifications ↩︎

rkrux commented at 11:44 am on May 15, 2025:

I verified that ScanForWalletTransactions does update the best block locator on disk intermittently: https://github.com/bitcoin/bitcoin/blob/c779ee3a4044df3a263394bbb8b104aeeaa7c727/src/wallet/wallet.cpp#L1875-L1883

But I don’t see it updating m_last_block_processed, instead it reads it. Even if m_last_block_processed comes out to be same as the best block locator at the end in the AttachChain case, I find updating m_last_block_processed and best block locator here after rescan better both in terms of data consistency and readability - gives more confidence.

rkrux commented at 7:31 pm on May 13, 2025: contributor

This partial review is for the previous diff at e2e9c30a4d04bd8173c3386c5b97dbac227e810e. I started looking into it today and need to spend some more time in internalising the consequences of this diff but at the moment I’m inclining towards a concept ack.

Left few questions below.

fanquake referenced this in commit 8a65f03894 on May 14, 2025

wallet: Update best block record after block dis/connect

When a block is connected, if the new block had anything relevant to the
wallet, update the best block record on disk. If not, also sync the best
block record to disk every 144 blocks.

Also reuse the new WriteBestBlock method in BackupWallet.

7bacabb204

wallet: Replace chainStateFlushed in loading with SetLastBlockProcessed

The only reason to call chainStateFlushed during wallet loading is to
ensure that the best block is written. Do these writes explicitly to
prepare for removing chainStateFlushed, while also ensuring that the
wallet's in memory state tracking is written to disk.

Additionally, after rescanning on wallet loading, instead of writing
the locator for the current chain tip, write the locator for the last
block that the rescan had scanned. This ensures that the stored best
block record matches the wallet's current state.

Any blocks dis/connected during the rescan are processed after the
rescan and the last block processed will be updated accordingly.

6d3a8b195a

wallet, bench: Write a bestblock record in WalletMigration

Migrating a wallet requires having a bestblock record. This is always
the case in normal operation as such a record is always written on
wallet loading if it didn't already exist. However, within the unit
tests and benchmarks, this is not guaranteed. Since migration requires
the record, WalletMigration needs to also add this record before the
benchmark.

7fd3e1cf0c

wallet: Remove chainStateFlushed

chainStateFlushed is no longer needed since the best block is updated
after a block is scanned. Since the chainstate being flushed does not
necessarily coincide with the wallet having processed said block, it
does not entirely make sense for the wallet to be recording that block
as its best block, and this can cause race conditions where some blocks
are not processed. Thus, remove this notification.

98a1a5275c

wallet: Remove unnecessary database Close step on shutdown

StopWallets, which was being called prior to UnloadWallets, performs an
unnecessary database closing step. This causes issues in UnloadWallets
which does additional database cleanups. Since the database closing step
is unnecessary, StopWallets is removed, and UnloadWallets is now called
from WalletLoaderImpl::stop.

876a2585a8

wallet: Write best block record on unload b44b7c03fe

test, wallet: Remove concurrent writes test

Since CWallet::chainStateFlushed is now no-op, this test no longer tests
the concurrent writes scenario. There are no other cases where multiple
DatabaseBatches are open at the same time.

30a94b1ab9

achow101 force-pushed on May 14, 2025

in src/wallet/wallet.cpp:3224 in 6d3a8b195a outdated

3223@@ -3220,13 +3224,21 @@ bool CWallet::AttachChain(const std::shared_ptr<CWallet>& walletInstance, interf
3224

rkrux commented at 11:06 am on May 15, 2025:

Within AttachChain itself, there is another opportunity to use setLastBlockProcessedInMem here: https://github.com/bitcoin/bitcoin/blob/c779ee3a4044df3a263394bbb8b104aeeaa7c727/src/wallet/wallet.cpp#L3137-L3143

The cs_wallet is held at the start of this function.

achow101 commented at 7:57 pm on May 15, 2025:

Will do if I need to retouch.

in src/wallet/wallet.cpp:3243 in 98a1a5275c outdated

3222             // Set last block scanned as the last block processed as it may be different in case the case of a reorg.
3223             // Also save the best block locator because rescanning only updates it intermittently.
3224             walletInstance->SetLastBlockProcessed(*scan_res.last_scanned_height, scan_res.last_scanned_block);
3225         }
3226     }
3227-    walletInstance->m_attaching_chain = false;

rkrux commented at 1:22 pm on May 15, 2025:

Though these m_attaching_chain updates were done in AttachChain function, it looks quite cleaner now to get rid of this low level boolean from wallet instance, makes it easier to wrap head around AttachChain.

in src/wallet/wallet.cpp:3238 in 6d3a8b195a outdated

3234                 error = _("Failed to rescan the wallet during initialization");
3235                 return false;
3236             }
3237+            walletInstance->m_attaching_chain = false;
3238+            // Set and update the best block record
3239+            // Set last block scanned as the last block processed as it may be different in case the case of a reorg.

rkrux commented at 1:50 pm on May 15, 2025:

as it may be different in case the case of a reorg.

Redundancy if retouched.

in test/functional/wallet_reorgsrestore.py:123 in 7bacabb204 outdated

118@@ -119,11 +119,11 @@ def test_reorg_handling_during_unclean_shutdown(self):
119         self.start_node(0)
120         assert_equal(node.getbestblockhash(), tip)
121 
122-        # Due to an existing bug, the wallet incorrectly keeps the transaction in an abandoned state, even though that's
123-        # no longer the case (after the unclean shutdown, the node's chain returned to the pre-invalidation tip).
124-        # This issue blocks any future spending and results in an incorrect balance display.
125+        # After disconnecting the block, the wallet should record the new best block.
126+        # Upon reload after the crash, since the chainstate was not flushed, the tip contains the previously abandoned

rkrux commented at 2:20 pm on May 15, 2025:

since the chainstate was not flushed

I now realise that this refers to the node specific chainstate. Having seeing the removal of chain state flushed function in the wallet, I initially thought it referred to that (which doesn’t exist anymore).

in test/functional/wallet_reorgsrestore.py:122 in 7bacabb204 outdated

118@@ -119,11 +119,11 @@ def test_reorg_handling_during_unclean_shutdown(self):
119         self.start_node(0)
120         assert_equal(node.getbestblockhash(), tip)
121 
122-        # Due to an existing bug, the wallet incorrectly keeps the transaction in an abandoned state, even though that's
123-        # no longer the case (after the unclean shutdown, the node's chain returned to the pre-invalidation tip).
124-        # This issue blocks any future spending and results in an incorrect balance display.
125+        # After disconnecting the block, the wallet should record the new best block.

rkrux commented at 3:01 pm on May 15, 2025:

the wallet should record the new best block.

I assume there is no way for us to verify this before the crash and after tip invalidation. The wallet info returns value based on the one that’s in memory: m_last_block_processed

wallet.getwalletinfo()[’lastprocessedblock’][‘hash’]

achow101 commented at 8:01 pm on May 15, 2025:

Not sure what would be verified? After invalidateblock and before the crash, we do verify that the coinbase transaction is marked abandoned. This check is to verify that starting after the crash, the wallet has the correct before-invalidate state.

rkrux commented at 3:15 pm on May 16, 2025:

Not sure what would be verified? After invalidateblock and before the crash, we do verify that the coinbase transaction is marked abandoned.

I wanted to check that the best block locator is indeed persisted to disk after invalidation but the getwalletinfo RPC reads lastprocessedblock from memory, which can’t be used for verifying this PR’s change. The coinbase transaction verification was done before this PR as well, so that also not verifies.

Upon reload after the crash, since the chainstate was not flushed, the tip contains the previously abandoned coinbase. This should be rescanned and now un-abandoned.

While this comment is helpful, but I’d prefer to assert the “rescanning” portion in the test. Previously, I had in mind some functionality that reads from disk while returning the response in RPC but that might be going a bit too far just for tests. The other easier way I found out is to verify using the debug log from AttachChain that confirms the rescan indeed does happen - which tells me that the best block locator was persisted to disk after block invalidation and before node crash. https://github.com/bitcoin/bitcoin/blob/3f83c744ac28b700090e15b5dda2260724a56f49/src/wallet/wallet.cpp#L3192

Functional test diff that worked in my system:

 0@@ -104,6 +104,7 @@ class ReorgsRestoreTest(BitcoinTestFramework):
 1 
 2         # Disconnect tip and sync wallet state
 3         tip = wallet.getbestblockhash()
 4+        tip_height = wallet.getblockstats(tip)["height"]
 5         wallet.invalidateblock(tip)
 6         wallet.syncwithvalidationinterfacequeue()
 7 
 8@@ -116,7 +117,8 @@ class ReorgsRestoreTest(BitcoinTestFramework):
 9         node.kill_process()
10 
11         # Restart the node and confirm that it has not persisted the last chain state changes to disk
12-        self.start_node(0)
13+        with self.nodes[0].assert_debug_log(expected_msgs=[f"Rescanning last 1 blocks (from block {tip_height - 1})...\n"]):
14+            self.start_node(0)
15         assert_equal(node.getbestblockhash(), tip)
16 
17         # After disconnecting the block, the wallet should record the new best block.

achow101 commented at 6:22 pm on May 16, 2025:

Ah, I see. I think that’s unnecessary and the current test is sufficient to verify the behavior.

in test/functional/wallet_reorgsrestore.py:126 in 7bacabb204 outdated

124-        # This issue blocks any future spending and results in an incorrect balance display.
125+        # After disconnecting the block, the wallet should record the new best block.
126+        # Upon reload after the crash, since the chainstate was not flushed, the tip contains the previously abandoned
127+        # coinbase. This should be rescanned and now un-abandoned.
128         wallet = node.get_wallet_rpc("reorg_crash")
129-        assert_equal(wallet.getwalletinfo()['immature_balance'], 0) # FIXME: #31824.

rkrux commented at 3:08 pm on May 15, 2025:

0- assert_equal(wallet.getwalletinfo()['immature_balance'], 0) # FIXME: [#31824](/bitcoin-bitcoin/31824/).
1+ assert_greater_than(wallet.getwalletinfo()['immature_balance'], 0)

Nit: The correct balance is there, can be asserted.

in test/functional/wallet_reorgsrestore.py:126 in 7bacabb204 outdated

125+        # After disconnecting the block, the wallet should record the new best block.
126+        # Upon reload after the crash, since the chainstate was not flushed, the tip contains the previously abandoned
127+        # coinbase. This should be rescanned and now un-abandoned.
128         wallet = node.get_wallet_rpc("reorg_crash")
129-        assert_equal(wallet.getwalletinfo()['immature_balance'], 0) # FIXME: #31824.
130+        assert_equal(wallet.gettransaction(coinbase_tx_id)['details'][0]['abandoned'], False)

rkrux commented at 3:09 pm on May 15, 2025:

self.log.info(“Test that wallet doesn’t crash due to a duplicate block disconnection event after an unclean shutdown”)

IMO the test log should be updated to mention this improvement also. That the wallet transactions are not missed anymore in case of temporarily invalidated blocks with unclean node shutdown.

achow101 commented at 8:02 pm on May 15, 2025:

Will do if I need to retouch.

rkrux commented at 3:10 pm on May 15, 2025: contributor

ACK 30a94b1ab9ae850d55cb9eb606a06890437bc75e

I like how this PR brings the best/last block processed updates both in memory & disk in unison, making it easier to reason about the wallet code ultimately. The detailed commit messages aids review.

I have verified that the location of such changes makes it possible for the following wallet RPCs to correctly handle best block related memory & disk updates:

0createwallet
1loadwallet
2restorewallet
3unloadwallet
4backupwallet

I feel that updating the locator when wallet transactions are found in the latest block & periodically every 144 blocks in blockConnected notifications, and also going back to the previous best block hash in the corresponding blockConnected notification makes it more intuitive & easier to reason about. There’s a possibility that the best block locator will be updated frequently if the wallet has transactions in contiguous blocks but I don’t think that’s a big deal as it’s relatively less likely to occur.

Even though I had not read the chainStateFlushed code prior to this PR, I found consistent usage of that terminology within wallet instance slightly distracting & thus support its removal.

I prefer the SetLastBlockProcessedInMem naming as it clearly mentions the storage that’s being updated. Also in favour of SetLastBlockProcessed function updating in memory & disk in one go.

I realised that there’s an opportunity to put m_last_block_processed & m_last_block_processed_height together in a struct as they are more often than not used together, I see a related refactor has been alluded to in this comment: #30221 (comment)

DrahtBot requested review from ryanofsky on May 15, 2025

mzumsande commented at 5:49 pm on May 15, 2025: contributor

Code Review ACK 30a94b1ab9ae850d55cb9eb606a06890437bc75e

in src/wallet/wallet.cpp:4480 in 7bacabb204 outdated

4475+{
4476+    AssertLockHeld(cs_wallet);
4477+
4478+    if (!m_last_block_processed.IsNull()) {
4479+        CBlockLocator loc;
4480+        chain().findBlock(m_last_block_processed, FoundBlock().locator(loc));

ryanofsky commented at 6:27 pm on May 19, 2025:

In commit “wallet: Update best block record after block dis/connect” (7bacabb204b6c34f9545f0b37e2c66296ad2c0de)

IMO would be good to assert !loc.IsNull() as a sanity check to ensure that doesn’t happen. (It shouldn’t, but node or wallet code could change in the future and invalidate current assumptions.)

ryanofsky approved

ryanofsky commented at 6:36 pm on May 19, 2025: contributor

Code review ACK 30a94b1ab9ae850d55cb9eb606a06890437bc75e. Only changes since last review are using WriteBestBlock method more places and updating comments.

It looks like there are some small review comments above that could be followed up, but with 3 acks I think it would be good to merge this shortly and let smaller improvements can be made later. I plan to merge this soon after testing locally.

ryanofsky assigned ryanofsky on May 19, 2025

ryanofsky merged this on May 19, 2025

ryanofsky closed this on May 19, 2025

rkrux referenced this in commit 3112fb5c5b on May 21, 2025

rkrux commented at 5:41 pm on May 21, 2025: contributor

It looks like there are some small review comments above that could be followed up, but with 3 acks I think it would be good to merge this shortly and let smaller improvements can be made later. I plan to merge this soon after testing locally.

I did the follow-ups in #32580.

TheCharlatan referenced this in commit cb3cd9b8b9 on May 29, 2025

Dorex45 commented at 10:24 pm on June 20, 2025: none

Thanks for working on syncing wallet state! Concept ACK – wrapping block and height in a BestBlock struct makes sense. Just A few thoughts: • Could we add short comments explaining BestBlock members for clarity? • In DeserializeBestBlock, should we warn/log if height is missing for backward compatibility? • What about adding a test for a scenario involving a reorg, to ensure BestBlock updates in both blockConnected and blockDisconnected paths? • Optional: Would a ToString() or debug helper for BestBlock aid future debugging? Overall this looks solid ,thanks for the cleanup!

wallet: Ensure best block matches wallet scan state #30221

Code Coverage & Benchmarks

Reviews

Conflicts

LLM Linter (✨ experimental)