wallet: guard and alert about a wallet invalid state during chain sync

furszy commented at 2:06 pm on June 3, 2022: member

Follow-up work to my comment in #25239.

Guarding and alerting the user about a wallet invalid state during chain synchronization.

Explanation

if the AddToWallet tx write fails, the method returns a wtx nullptr without removing the recently added transaction from the wallet’s map.

Which makes that AddToWalletIfInvolvingMe return false (even when the tx is on the wallet’s map already), –> which makes SyncTransaction skip the MarkInputsDirty call –> which leads to a wallet invalid state where the inputs of this new transaction are not marked dirty, while the transaction that spends them still exist on the in-memory wallet tx map.

Plus, as we only store the arriving transaction inside AddToWalletIfInvolvingMe when we synchronize/scan block/s from the chain and nowhere else, it makes sense to treat the transaction db write error as a runtime error to notify the user about the problem. Otherwise, the user will lose all the not stored transactions after a wallet shutdown (without be able to recover them automatically on the next startup because the chain sync would be above the block where the txs arrived).

Note: On purpose, the first commit adds test coverage for it. Showing how the wallet can end up in an invalid state. The second commit corrects it with the proposed solution.

DrahtBot added the label Wallet on Jun 3, 2022

furszy force-pushed on Jun 3, 2022

furszy commented at 3:53 pm on June 3, 2022: member

Update: Added test coverage for it in the first commit. Showing how the wallet can end up in the invalid state. The second commit corrects it with the proposed solution.

furszy force-pushed on Jun 4, 2022

DrahtBot commented at 3:58 pm on June 4, 2022: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#25297 (wallet: speedup transactions sync, rescan and load by grouping all independent db writes on a single batched db transaction by furszy)
#24897 ([Draft / POC] Silent Payments by w0xlt)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

jonatack commented at 3:11 pm on June 6, 2022: contributor

Concept ACK

in src/wallet/test/wallet_tests.cpp:957 in d15e9f7fc0 outdated

952+
953+    // Now the bad case:
954+    // 1) Make db always fail
955+    // 2) Try to add a transaction that spends the previously created transaction and
956+    //    verify that we are not moving forward if the wallet cannot store it
957+    dynamic_cast<FailDatabase&>(wallet.GetDatabase()).m_pass = false;

ryanofsky commented at 4:23 pm on June 6, 2022:

In commit “test: add case for wallet invalid state where the inputs (prev-txs) of a new arriving transaction are not marked dirty, while the transaction that spends them exist inside the in-memory wallet tx map (not stored on db due a db write failure).” (742cd0b4898b06798d9cbac0f272d7aed0fb2d3f)

Think you just want a static_cast here if you already know the type. (dynamic_cast is for when you are unsure what the type will be at runtime)

furszy commented at 7:07 pm on June 6, 2022:

yeah 👍🏼 .

in src/wallet/test/wallet_tests.cpp:889 in d15e9f7fc0 outdated

881+
882+/** A dummy WalletDatabase that does nothing, only fails if needed.**/
883+class FailDatabase : public WalletDatabase
884+{
885+public:
886+    bool m_pass{true}; // false when this db should fail

ryanofsky commented at 5:19 pm on June 6, 2022:

In commit “test: add case for wallet invalid state where the inputs (prev-txs) of a new arriving transaction are not marked dirty, while the transaction that spends them exist inside the in-memory wallet tx map (not stored on db due a db write failure).” (742cd0b4898b06798d9cbac0f272d7aed0fb2d3f)

Default is true here, but false in the other class. Would suggest sticking choosing one default just to avoid confusion and having to remember which class has which default.

(Also maybe use a shorter commit subject, https://cbea.ms/git-commit/ has nice guidelines)

furszy commented at 7:11 pm on June 6, 2022:

Default is true here, but false in the other class. Would suggest sticking choosing one default just to avoid confusion and having to remember which class has which default.

yeah, good catch. I wrote the test differently first, this is a remnant of the first approach.

(Also maybe use a shorter commit subject, https://cbea.ms/git-commit/ has nice guidelines)

awesome reading :), thanks!

ryanofsky approved

ryanofsky commented at 5:33 pm on June 6, 2022: contributor

Code review ACK d15e9f7fc0ca3b8ab775ca218e146b3238578484. Nice find and test, and clear description of the problem. This change will raise an explicit error in a failure case that was previously ignored, so in theory it could lead to worse outcomes in some cases. But status quo results in inconsistent in-memory state, and the error seems pretty unlikely to happen to begin with so this is probably the right move.

furszy force-pushed on Jun 6, 2022

furszy commented at 7:31 pm on June 6, 2022: member

Thanks for the feedback!

Applied the following changes:

Moved dynamic_cast to static_cast.
Set same default value for “m_pass” field in both classes FailDatabase and FailBatch.
Shorted the commit subject.

in src/wallet/test/wallet_tests.cpp:960 in 4bd0d0140e outdated

958-    //    output is spent, should at least, be marked dirty)
959-
960-    dynamic_cast<FailDatabase&>(wallet.GetDatabase()).m_pass = false;
961+    // 2) Try to add a transaction that spends the previously created transaction and
962+    //    verify that we are not moving forward if the wallet cannot store it
963+    static_cast<FailDatabase&>(wallet.GetDatabase()).m_pass = false;

achow101 commented at 9:01 pm on June 13, 2022:

In 4bd0d0140e2a27e604ca0febcd485f468705c0ed “wallet: guard and alert about a wallet invalid state during chain sync”

static_cast change should be in the previous commit.

achow101 commented at 9:03 pm on June 13, 2022: member

It’s preferable that each commit passes the test suite. Currently the first commit fails as the code currently fails the test that you have added. So it would be preferable if that test were added after the fix is implemented.

furszy commented at 12:36 pm on June 14, 2022: member

It’s preferable that each commit passes the test suite. Currently the first commit fails as the code currently fails the test that you have added. So it would be preferable if that test were added after the fix is implemented.

Ok. I actually did that on purpose so it was easier for reviewers to replicate/understand the problem (I wrote it in the PR description); just compiling the first commit, see/verify the failure, then compile the fix and see it passing (Test Driven Approach).

At least for me, it make more sense (and feels more natural) than have tests always passing when we actually have problems in the sources.

But.. that is probably not a discussion for this PR, will just re-order the commits :).

furszy force-pushed on Jun 14, 2022

furszy commented at 12:56 pm on June 14, 2022: member

Updated per achow101 feedback:

Re-ordered the commits, so test suite always passes. Note about this change: now we only have the final version of the test. The one that fails due a runtime_error and not the intermediary one that was showing the wallet invalid state error (which, for reviewers, can still be found at https://github.com/furszy/bitcoin-core/commit/c9c9466dd0ab77bcce52ebe4b4d5f89787bd6542).

ryanofsky approved

ryanofsky commented at 3:03 pm on June 15, 2022: contributor

Code review ACK 7027e94801686064eef806b10df141ebca3ea2b8. Only changes since last review are internal changes to commits, and making suggested static_cast and m_pass default changes

in src/wallet/wallet.cpp:1137 in 7027e94801 outdated

1138@@ -1139,7 +1139,13 @@ bool CWallet::AddToWalletIfInvolvingMe(const CTransactionRef& ptx, const SyncTxS
1139             // Block disconnection override an abandoned tx as unconfirmed
1140             // which means user may have to call abandontransaction again
1141             TxState tx_state = std::visit([](auto&& s) -> TxState { return s; }, state);
1142-            return AddToWallet(MakeTransactionRef(tx), tx_state, /*update_wtx=*/nullptr, /*fFlushOnClose=*/false, rescanning_old_block);
1143+            CWalletTx* wtx = AddToWallet(MakeTransactionRef(tx), tx_state, /*update_wtx=*/nullptr, /*fFlushOnClose=*/false, rescanning_old_block);
1144+            if (!wtx) {
1145+                // Can only be nullptr if there was a db write error (missing db, read-only db or a db engine internal writing error).
1146+                // As we only store arriving transaction in this process, and we don't want an inconsistent state, let's throw an error.
1147+                throw std::runtime_error("DB error adding transaction to wallet, write failed");

luke-jr commented at 7:25 pm on June 23, 2022:

Where is the exception handler?

furszy commented at 4:54 pm on June 24, 2022:

This runs on the scheduler thread, it’s caught by the try/catch inside serviceQueue().

It’s a direct node/wallet abortion cause, because the wallet cannot recover alone from this invalid state. If it continues running is only for worst.

achow101 commented at 7:27 pm on June 27, 2022: member

ACK 7027e94801686064eef806b10df141ebca3ea2b8

wallet: guard and alert about a wallet invalid state during chain sync

-Context:
If `AddToWallet` db write fails, the method returns a wtx nullptr without
removing the recently added transaction from the wallet's map.

-Problem:
When a db write error occurs, `AddToWalletIfInvolvingMe` return false even
when the tx is on the wallet's map already --> which makes `SyncTransaction`
skip the `MarkInputsDirty` call --> which leads to a wallet invalid state
where the inputs of this new transaction are not marked dirty, while the
transaction that spends them still exist on the in-memory wallet tx map.

Plus, as we only store arriving transaction inside `AddToWalletIfInvolvingMe`
when we synchronize/scan blocks from the chain and nowhere else, it makes sense
to treat the tx db write error as a runtime error to notify the user about the
problem. Otherwise, the user will lose all the not stored transactions after a
wallet shutdown (without be able to recover them automatically on the next
startup because the chain sync would be above the block where the txs arrived).

77de5c693f

test: add coverage for wallet inconsistent state during sync

When a transaction arrives, the wallet mark its inputs (prev-txs) as dirty.
Clearing the wallet transaction cache, triggering a balance recalculation.

If this does not happen due a db write error during `AddToWallet`, the wallet
will be in an invalid state: The transaction that spends certain wallet UTXO will
exist inside the in-memory wallet tx map, having the credit/debit calculated,
while its inputs will still have the old cached data (like if them were never
spent).

9e04cfaa76

furszy force-pushed on Jul 18, 2022

furszy commented at 3:10 pm on July 18, 2022: member

Sadly had to rebase it on master, got a silent merge conflict with the GetNewDestination changes.

Ready to go again.

achow101 commented at 8:21 pm on July 18, 2022: member

re-ACK 9e04cfaa76cf9dda27f10359dd43e78dd3268e09

achow101 requested review from ryanofsky on Jul 18, 2022

achow101 requested review from jonatack on Jul 18, 2022

in src/wallet/wallet.cpp:1137 in 77de5c693f outdated

1129@@ -1130,7 +1130,13 @@ bool CWallet::AddToWalletIfInvolvingMe(const CTransactionRef& ptx, const SyncTxS
1130             // Block disconnection override an abandoned tx as unconfirmed
1131             // which means user may have to call abandontransaction again
1132             TxState tx_state = std::visit([](auto&& s) -> TxState { return s; }, state);
1133-            return AddToWallet(MakeTransactionRef(tx), tx_state, /*update_wtx=*/nullptr, /*fFlushOnClose=*/false, rescanning_old_block);
1134+            CWalletTx* wtx = AddToWallet(MakeTransactionRef(tx), tx_state, /*update_wtx=*/nullptr, /*fFlushOnClose=*/false, rescanning_old_block);
1135+            if (!wtx) {
1136+                // Can only be nullptr if there was a db write error (missing db, read-only db or a db engine internal writing error).
1137+                // As we only store arriving transaction in this process, and we don't want an inconsistent state, let's throw an error.
1138+                throw std::runtime_error("DB error adding transaction to wallet, write failed");

jonatack commented at 11:31 am on July 19, 2022:

Would it make sense to follow this existing error message format in src/wallet/wallet.cpp:2157, e.g. provide the function name?

0throw std::runtime_error(std::string(__func__) + ": Wallet db error, transaction commit failed");

(modulo s/db/DB/)

jonatack commented at 11:33 am on July 19, 2022: contributor

Quick review of the patch only LGTM with one question. Did not build, nor review/verify the passing/failing test yet, will do.

jonatack commented at 6:30 pm on July 20, 2022: contributor

ACK 9e04cfaa76cf9dda27f10359dd43e78dd3268e09

modulo #25272 (review) regarding the error message

furszy commented at 3:23 pm on July 25, 2022: member

Hey, thanks for the review.

about your point @jonatack:

Would it make sense to follow this existing error message format in src/wallet/wallet.cpp:2157, e.g. provide the function name?

Probably would make more sense to go into the opposite direction and change line 2157 to describe the context where the runtime error occurred properly. Something like "DB error committing transaction to wallet, write failed".

So if this edge case happens, and the runtime error string gets presented to the user before shutdown, the text is more descriptive than having the function name on it (which, looking it at the user’s perspective, provides zero information).

Still, we could probably merge this PR as is now and tackle it in a short follow-up PR.

achow101 merged this on Aug 2, 2022

achow101 closed this on Aug 2, 2022

sidhujag referenced this in commit 4d6eeb0387 on Aug 2, 2022

w0xlt commented at 10:29 pm on August 2, 2022: contributor

Post-merge ACK https://github.com/bitcoin/bitcoin/pull/25272/commits/9e04cfaa76cf9dda27f10359dd43e78dd3268e09

furszy deleted the branch on May 27, 2023

bitcoin locked this on May 26, 2024

wallet: guard and alert about a wallet invalid state during chain sync #25272

Explanation

Conflicts