wallet: guard and alert about a wallet invalid state during chain sync #25272

pull furszy wants to merge 2 commits into bitcoin:master from furszy:2022_wallet_invalid_state changing 3 files +117 −1
  1. furszy commented at 2:06 pm on June 3, 2022: member

    Follow-up work to my comment in #25239.

    Guarding and alerting the user about a wallet invalid state during chain synchronization.

    Explanation

    if the AddToWallet tx write fails, the method returns a wtx nullptr without removing the recently added transaction from the wallet’s map.

    Which makes that AddToWalletIfInvolvingMe return false (even when the tx is on the wallet’s map already), –> which makes SyncTransaction skip the MarkInputsDirty call –> which leads to a wallet invalid state where the inputs of this new transaction are not marked dirty, while the transaction that spends them still exist on the in-memory wallet tx map.

    Plus, as we only store the arriving transaction inside AddToWalletIfInvolvingMe when we synchronize/scan block/s from the chain and nowhere else, it makes sense to treat the transaction db write error as a runtime error to notify the user about the problem. Otherwise, the user will lose all the not stored transactions after a wallet shutdown (without be able to recover them automatically on the next startup because the chain sync would be above the block where the txs arrived).

    Note: On purpose, the first commit adds test coverage for it. Showing how the wallet can end up in an invalid state. The second commit corrects it with the proposed solution.

  2. DrahtBot added the label Wallet on Jun 3, 2022
  3. furszy force-pushed on Jun 3, 2022
  4. furszy commented at 3:53 pm on June 3, 2022: member
    Update: Added test coverage for it in the first commit. Showing how the wallet can end up in the invalid state. The second commit corrects it with the proposed solution.
  5. furszy force-pushed on Jun 4, 2022
  6. DrahtBot commented at 3:58 pm on June 4, 2022: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #25297 (wallet: speedup transactions sync, rescan and load by grouping all independent db writes on a single batched db transaction by furszy)
    • #24897 ([Draft / POC] Silent Payments by w0xlt)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. jonatack commented at 3:11 pm on June 6, 2022: contributor
    Concept ACK
  8. in src/wallet/test/wallet_tests.cpp:957 in d15e9f7fc0 outdated
    952+
    953+    // Now the bad case:
    954+    // 1) Make db always fail
    955+    // 2) Try to add a transaction that spends the previously created transaction and
    956+    //    verify that we are not moving forward if the wallet cannot store it
    957+    dynamic_cast<FailDatabase&>(wallet.GetDatabase()).m_pass = false;
    


    ryanofsky commented at 4:23 pm on June 6, 2022:

    In commit “test: add case for wallet invalid state where the inputs (prev-txs) of a new arriving transaction are not marked dirty, while the transaction that spends them exist inside the in-memory wallet tx map (not stored on db due a db write failure).” (742cd0b4898b06798d9cbac0f272d7aed0fb2d3f)

    Think you just want a static_cast here if you already know the type. (dynamic_cast is for when you are unsure what the type will be at runtime)


    furszy commented at 7:07 pm on June 6, 2022:
    yeah 👍🏼 .
  9. in src/wallet/test/wallet_tests.cpp:889 in d15e9f7fc0 outdated
    881+
    882+/** A dummy WalletDatabase that does nothing, only fails if needed.**/
    883+class FailDatabase : public WalletDatabase
    884+{
    885+public:
    886+    bool m_pass{true}; // false when this db should fail
    


    ryanofsky commented at 5:19 pm on June 6, 2022:

    In commit “test: add case for wallet invalid state where the inputs (prev-txs) of a new arriving transaction are not marked dirty, while the transaction that spends them exist inside the in-memory wallet tx map (not stored on db due a db write failure).” (742cd0b4898b06798d9cbac0f272d7aed0fb2d3f)

    Default is true here, but false in the other class. Would suggest sticking choosing one default just to avoid confusion and having to remember which class has which default.

    (Also maybe use a shorter commit subject, https://cbea.ms/git-commit/ has nice guidelines)


    furszy commented at 7:11 pm on June 6, 2022:

    Default is true here, but false in the other class. Would suggest sticking choosing one default just to avoid confusion and having to remember which class has which default.

    yeah, good catch. I wrote the test differently first, this is a remnant of the first approach.

    (Also maybe use a shorter commit subject, https://cbea.ms/git-commit/ has nice guidelines)

    awesome reading :), thanks!

  10. ryanofsky approved
  11. ryanofsky commented at 5:33 pm on June 6, 2022: contributor
    Code review ACK d15e9f7fc0ca3b8ab775ca218e146b3238578484. Nice find and test, and clear description of the problem. This change will raise an explicit error in a failure case that was previously ignored, so in theory it could lead to worse outcomes in some cases. But status quo results in inconsistent in-memory state, and the error seems pretty unlikely to happen to begin with so this is probably the right move.
  12. furszy force-pushed on Jun 6, 2022
  13. furszy force-pushed on Jun 6, 2022
  14. furszy commented at 7:31 pm on June 6, 2022: member

    Thanks for the feedback!

    Applied the following changes:

    1. Moved dynamic_cast to static_cast.
    2. Set same default value for “m_pass” field in both classes FailDatabase and FailBatch.
    3. Shorted the commit subject.
  15. in src/wallet/test/wallet_tests.cpp:960 in 4bd0d0140e outdated
    958-    //    output is spent, should at least, be marked dirty)
    959-
    960-    dynamic_cast<FailDatabase&>(wallet.GetDatabase()).m_pass = false;
    961+    // 2) Try to add a transaction that spends the previously created transaction and
    962+    //    verify that we are not moving forward if the wallet cannot store it
    963+    static_cast<FailDatabase&>(wallet.GetDatabase()).m_pass = false;
    


    achow101 commented at 9:01 pm on June 13, 2022:

    In 4bd0d0140e2a27e604ca0febcd485f468705c0ed “wallet: guard and alert about a wallet invalid state during chain sync”

    static_cast change should be in the previous commit.

  16. achow101 commented at 9:03 pm on June 13, 2022: member
    It’s preferable that each commit passes the test suite. Currently the first commit fails as the code currently fails the test that you have added. So it would be preferable if that test were added after the fix is implemented.
  17. furszy commented at 12:36 pm on June 14, 2022: member

    It’s preferable that each commit passes the test suite. Currently the first commit fails as the code currently fails the test that you have added. So it would be preferable if that test were added after the fix is implemented.

    Ok. I actually did that on purpose so it was easier for reviewers to replicate/understand the problem (I wrote it in the PR description); just compiling the first commit, see/verify the failure, then compile the fix and see it passing (Test Driven Approach).

    At least for me, it make more sense (and feels more natural) than have tests always passing when we actually have problems in the sources.

    But.. that is probably not a discussion for this PR, will just re-order the commits :).

  18. furszy force-pushed on Jun 14, 2022
  19. furszy commented at 12:56 pm on June 14, 2022: member

    Updated per achow101 feedback:

  20. ryanofsky approved
  21. ryanofsky commented at 3:03 pm on June 15, 2022: contributor
    Code review ACK 7027e94801686064eef806b10df141ebca3ea2b8. Only changes since last review are internal changes to commits, and making suggested static_cast and m_pass default changes
  22. in src/wallet/wallet.cpp:1137 in 7027e94801 outdated
    1138@@ -1139,7 +1139,13 @@ bool CWallet::AddToWalletIfInvolvingMe(const CTransactionRef& ptx, const SyncTxS
    1139             // Block disconnection override an abandoned tx as unconfirmed
    1140             // which means user may have to call abandontransaction again
    1141             TxState tx_state = std::visit([](auto&& s) -> TxState { return s; }, state);
    1142-            return AddToWallet(MakeTransactionRef(tx), tx_state, /*update_wtx=*/nullptr, /*fFlushOnClose=*/false, rescanning_old_block);
    1143+            CWalletTx* wtx = AddToWallet(MakeTransactionRef(tx), tx_state, /*update_wtx=*/nullptr, /*fFlushOnClose=*/false, rescanning_old_block);
    1144+            if (!wtx) {
    1145+                // Can only be nullptr if there was a db write error (missing db, read-only db or a db engine internal writing error).
    1146+                // As we only store arriving transaction in this process, and we don't want an inconsistent state, let's throw an error.
    1147+                throw std::runtime_error("DB error adding transaction to wallet, write failed");
    


    luke-jr commented at 7:25 pm on June 23, 2022:
    Where is the exception handler?

    furszy commented at 4:54 pm on June 24, 2022:

    This runs on the scheduler thread, it’s caught by the try/catch inside serviceQueue().

    It’s a direct node/wallet abortion cause, because the wallet cannot recover alone from this invalid state. If it continues running is only for worst.

  23. achow101 commented at 7:27 pm on June 27, 2022: member
    ACK 7027e94801686064eef806b10df141ebca3ea2b8
  24. wallet: guard and alert about a wallet invalid state during chain sync
    -Context:
    If `AddToWallet` db write fails, the method returns a wtx nullptr without
    removing the recently added transaction from the wallet's map.
    
    -Problem:
    When a db write error occurs, `AddToWalletIfInvolvingMe` return false even
    when the tx is on the wallet's map already --> which makes `SyncTransaction`
    skip the `MarkInputsDirty` call --> which leads to a wallet invalid state
    where the inputs of this new transaction are not marked dirty, while the
    transaction that spends them still exist on the in-memory wallet tx map.
    
    Plus, as we only store arriving transaction inside `AddToWalletIfInvolvingMe`
    when we synchronize/scan blocks from the chain and nowhere else, it makes sense
    to treat the tx db write error as a runtime error to notify the user about the
    problem. Otherwise, the user will lose all the not stored transactions after a
    wallet shutdown (without be able to recover them automatically on the next
    startup because the chain sync would be above the block where the txs arrived).
    77de5c693f
  25. test: add coverage for wallet inconsistent state during sync
    When a transaction arrives, the wallet mark its inputs (prev-txs) as dirty.
    Clearing the wallet transaction cache, triggering a balance recalculation.
    
    If this does not happen due a db write error during `AddToWallet`, the wallet
    will be in an invalid state: The transaction that spends certain wallet UTXO will
    exist inside the in-memory wallet tx map, having the credit/debit calculated,
    while its inputs will still have the old cached data (like if them were never
    spent).
    9e04cfaa76
  26. furszy force-pushed on Jul 18, 2022
  27. furszy commented at 3:10 pm on July 18, 2022: member

    Sadly had to rebase it on master, got a silent merge conflict with the GetNewDestination changes.

    Ready to go again.

  28. achow101 commented at 8:21 pm on July 18, 2022: member
    re-ACK 9e04cfaa76cf9dda27f10359dd43e78dd3268e09
  29. achow101 requested review from ryanofsky on Jul 18, 2022
  30. achow101 requested review from jonatack on Jul 18, 2022
  31. in src/wallet/wallet.cpp:1137 in 77de5c693f outdated
    1129@@ -1130,7 +1130,13 @@ bool CWallet::AddToWalletIfInvolvingMe(const CTransactionRef& ptx, const SyncTxS
    1130             // Block disconnection override an abandoned tx as unconfirmed
    1131             // which means user may have to call abandontransaction again
    1132             TxState tx_state = std::visit([](auto&& s) -> TxState { return s; }, state);
    1133-            return AddToWallet(MakeTransactionRef(tx), tx_state, /*update_wtx=*/nullptr, /*fFlushOnClose=*/false, rescanning_old_block);
    1134+            CWalletTx* wtx = AddToWallet(MakeTransactionRef(tx), tx_state, /*update_wtx=*/nullptr, /*fFlushOnClose=*/false, rescanning_old_block);
    1135+            if (!wtx) {
    1136+                // Can only be nullptr if there was a db write error (missing db, read-only db or a db engine internal writing error).
    1137+                // As we only store arriving transaction in this process, and we don't want an inconsistent state, let's throw an error.
    1138+                throw std::runtime_error("DB error adding transaction to wallet, write failed");
    


    jonatack commented at 11:31 am on July 19, 2022:

    Would it make sense to follow this existing error message format in src/wallet/wallet.cpp:2157, e.g. provide the function name?

    0throw std::runtime_error(std::string(__func__) + ": Wallet db error, transaction commit failed");
    

    (modulo s/db/DB/)

  32. jonatack commented at 11:33 am on July 19, 2022: contributor
    Quick review of the patch only LGTM with one question. Did not build, nor review/verify the passing/failing test yet, will do.
  33. jonatack commented at 6:30 pm on July 20, 2022: contributor

    ACK 9e04cfaa76cf9dda27f10359dd43e78dd3268e09

    modulo #25272 (review) regarding the error message

  34. furszy commented at 3:23 pm on July 25, 2022: member

    Hey, thanks for the review.

    about your point @jonatack:

    Would it make sense to follow this existing error message format in src/wallet/wallet.cpp:2157, e.g. provide the function name?

    Probably would make more sense to go into the opposite direction and change line 2157 to describe the context where the runtime error occurred properly. Something like "DB error committing transaction to wallet, write failed".

    So if this edge case happens, and the runtime error string gets presented to the user before shutdown, the text is more descriptive than having the function name on it (which, looking it at the user’s perspective, provides zero information).

    Still, we could probably merge this PR as is now and tackle it in a short follow-up PR.

  35. achow101 merged this on Aug 2, 2022
  36. achow101 closed this on Aug 2, 2022

  37. sidhujag referenced this in commit 4d6eeb0387 on Aug 2, 2022
  38. w0xlt commented at 10:29 pm on August 2, 2022: contributor
  39. furszy deleted the branch on May 27, 2023
  40. bitcoin locked this on May 26, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-05 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me