wallet: bugfix, always use apostrophe for spkm descriptor ID #27920

pull furszy wants to merge 4 commits into bitcoin:master from furszy:2023_wallet_fix_spkm_id_corruption changing 6 files +136 −73
  1. furszy commented at 3:51 pm on June 20, 2023: member

    Aiming to fix #27915.

    As we re-write the descriptor’s db record every time that the wallet is loaded (at TopUp time), if the spkm ID differs from the one in db, the wallet will enter in an unrecoverable corruption state (due to the storage of a descriptor with an ID that is not linked to any other descriptor record in DB), and no soft version will be able to open it anymore.

    Because we cannot change the past, to stay compatible between releases, we need to always use the apostrophe version for the spkm IDs.

  2. DrahtBot commented at 3:51 pm on June 20, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK achow101, Sjors
    Concept ACK ErikDeSmedt

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #27255 (MiniTapscript: port Miniscript to Tapscript by darosior)
    • #26567 (Wallet: estimate the size of signed inputs using descriptors by darosior)
    • #25907 (wallet: rpc to add automatically generated descriptors by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Wallet on Jun 20, 2023
  4. furszy force-pushed on Jun 20, 2023
  5. mzumsande commented at 4:46 pm on June 20, 2023: contributor
    Can confirm that this fixes the problem from #27915, I’m not familiar with the wallet though, so I can’t comment on the code changes.
  6. furszy force-pushed on Jun 20, 2023
  7. achow101 commented at 5:26 pm on June 20, 2023: member

    Having the id be dependent on ToString being a round trip seems kind of bleh. I think it would be better to change the id calculation to just use the string written to/read from the db.

    Edit: Actually, how the id itself is generated shouldn’t matter. It might as well be random. Once we assign the id, we should just store it and remember what it is rather than regenerating it whenever we want it. But that doesn’t really solve this downgrading problem.

  8. in src/wallet/walletdb.cpp:671 in 080ecd855b outdated
    663@@ -664,6 +664,12 @@ ReadKeyValue(CWallet* pwallet, DataStream& ssKey, CDataStream& ssValue,
    664                 wss.m_descriptor_caches[id] = DescriptorCache();
    665             }
    666             pwallet->LoadDescriptorScriptPubKeyMan(id, desc);
    667+
    668+            // Prior to doing anything with this spkm, verify ID calculation is always the same.
    669+            if (id != pwallet->GetDescriptorScriptPubKeyMan(desc)->GetID()) {
    670+                wss.descriptor_unknown = true;
    671+                return error("The descriptor ID calculated by the wallet differs from the one in DB");
    


    achow101 commented at 8:04 pm on June 20, 2023:

    In 080ecd855bc4194396c58e71677a3e50c7b0abff “wallet: do not allow loading descriptor with an invalid ID”

    The pattern we use for errors in ReadKeyValue is to assign the strErr rather than return error("...").

    0                strErr = "The descriptor ID calculated by the wallet differs from the one in DB");
    1                return false
    

    furszy commented at 8:36 pm on June 20, 2023:

    Yeah, I did it on purpose, so the error is actually logged. The descriptor_unknown field overwrites strErr with another message.

    But, what could do is actually change the descriptor_unknown logging so it appends the error at the end (it will also log the currently not logged std::ios_base::failure error that is above).


    achow101 commented at 8:41 pm on June 20, 2023:

    I’m not sure that descriptor_unknown needs to be set here. The descriptor isn’t really unknown, just our computed id doesn’t match the stored one. This is more on the level of a critical corruption error, in the same way that invalid keys are.

    Error handling here should be a lot easier with #24914.


    furszy commented at 8:53 pm on June 20, 2023:

    I’m not sure that descriptor_unknown needs to be set here. The descriptor isn’t really unknown, just our computed id doesn’t match the stored one. This is more on the level of a critical corruption error, in the same way that invalid keys are.

    Yeah, but we don’t have a way to set a corruption error right now. If the function only return false with the strErr, it will be tagged as a non-critical error. Thus why re-used the descriptor_unknown flag.

    Error handling here should be a lot easier with #24914.

    yessss. Thus why I didn’t care much on adding more code to it. While it works, should be fine.


    achow101 commented at 8:59 pm on June 20, 2023:

    Yeah, but we don’t have a way to set a corruption error right now. If the function only return false with the strErr, it will be tagged as a non-critical error. Thus why re-used the descriptor_unknown flag.

    Could add DBKeys::WALLETDESCRIPTOR to https://github.com/bitcoin/bitcoin/blob/d1ae96755a0f9d7e12c3f6741c030d8ea6d0416f/src/wallet/walletdb.cpp#L867

  9. in src/script/descriptor.h:112 in 494fa79a42 outdated
    107@@ -108,6 +108,9 @@ struct Descriptor {
    108     /** Convert the descriptor back to a string, undoing parsing. */
    109     virtual std::string ToString() const = 0;
    110 
    111+    /* Constant descriptor ID. */
    112+    virtual std::string GetID() const = 0;
    


    achow101 commented at 8:07 pm on June 20, 2023:

    In 494fa79a421c7384fdecacc7b42ae0dbe66668f2 “wallet: bugfix, always use apostrophe for spkm descriptor ID”

    This name is misleading as it doesn’t actually return an id, just the string we hash for it. I would actually prefer just adding a parameter to ToString() rather than a whole new function.


    furszy commented at 8:59 pm on June 20, 2023:

    I did that first, but didn’t like to add a force_apostrophe argument to it. The apostrophe existence is internal to the bip32 PubKeyProvider only and not really a concept that the Descriptor class should know about. Even other pub key providers shouldn’t be aware of the apostrophe format neither.

    The idea behind the GeID() method was to have a function that we treat differently and think twice before change anything inside it.

    Said that, I’m not that strong over it neither. Could use the same ID concept for the ToString parameter too.

  10. furszy force-pushed on Jun 20, 2023
  11. in src/wallet/walletdb.cpp:875 in 7455e83601 outdated
    871+                    return DBErrors::UNKNOWN_DESCRIPTOR;
    872                 }
    873                 // losing keys is considered a catastrophic error, anything else
    874                 // we assume the user can live with:
    875-                if (IsKeyType(strType) || strType == DBKeys::DEFAULTKEY) {
    876+                else if (IsKeyType(strType) || strType == DBKeys::DEFAULTKEY) {
    


    Sjors commented at 1:36 pm on June 22, 2023:
    7455e836018d30574f290c92c1df631ffafc8105 nit: move comment above to below } else if {
  12. Sjors commented at 1:46 pm on June 22, 2023: member
    One way to reduce the odds of this happening again with future descriptor changes is to hardcode the COMPAT identifier for all the test vectors in descriptor_tests.cpp.
  13. Sjors commented at 3:05 pm on June 22, 2023: member

    I wrote a commit on top of this branch that moves GetID() into the Descriptor class and then adds test vectors for it. The function documentation clarifies that this ID is not part of BIP 380 and shouldn’t be exposed to the user.

    https://github.com/Sjors/bitcoin/tree/2023/06/desc_id_vectors

  14. furszy commented at 7:47 pm on June 22, 2023: member

    I wrote a commit on top of this branch that moves GetID() into the Descriptor class and then adds test vectors for it. The function documentation clarifies that this ID is not part of BIP 380 and shouldn’t be exposed to the user.

    https://github.com/Sjors/bitcoin/tree/2023/06/desc_id_vectors

    You know, I initially wrote this with the GetID() method as well, check #27920 (review).

    Also, left you a comment on your commit. Not sure why added the GetID() method to the pub key provider, they are always returning an empty uint256 and they don’t seems to be used anywhere (can remove them, compile the sources, run the descriptor_tests and it passes there).

    Will go deeper over your commit. Thanks!

    One way to reduce the odds of this happening again with future descriptor changes is to hardcode the COMPAT identifier for all the test vectors in descriptor_tests.cpp.

    Good call 👌🏼.

  15. furszy force-pushed on Jun 23, 2023
  16. furszy force-pushed on Jun 23, 2023
  17. furszy force-pushed on Jun 23, 2023
  18. furszy force-pushed on Jun 23, 2023
  19. furszy commented at 8:26 pm on June 23, 2023: member

    Updated per feedback, thanks @Sjors.

    Pulled your test commit and made some changes. While reviewing it, discovered that descriptors containing a key with an origin are actually incorrect. As the origin consists of a root xpub fingerprint and the key derivation path, this former needs to be formatted with an apostrophe as well. So, have updated the code and tests to address it. And also expanded the coverage for few other descriptors that were previously not included.

    Then, to facilitate the verification of the test vectors, have created two isolated commits, ab3b700 and 9e1c3b4. These can be applied on top of the v25 branch. The cherry-pick process is not entirely clean, but once it’s done, you can easily verify the correctness of the descriptors’ IDs. The tests must pass in v25 with these commits, just as they pass here with the bug fix.

    To simplify the process of verifying these test vectors, have pushed a v25 branch that includes only those two commits: https://github.com/furszy/bitcoin-core/tree/25.x_verify_spkm_ids.

    Also, have separated the descriptor ID function as a standalone function to avoid mixing it with the descriptor’s interface.

  20. Sjors commented at 2:43 pm on June 26, 2023: member

    While reviewing it, discovered that descriptors containing a key with an origin are actually incorrect.

    Glad the tests were useful :-)

    The tests must pass in v25 with these commits.

    Thanks, I was thinking of testing that as well, but couldn’t figure out how to do so trivially with my own commit.

  21. DrahtBot added the label Needs rebase on Jun 28, 2023
  22. wallet: do not allow loading descriptor with an invalid ID
    If the computed descriptor's ID doesn't match the wallet's
    DB spkm ID, return early from the loading process to prevent
    DB data from being modified in any post-loading procedure
    (e.g 'TopUp' updates the descriptor's data).
    1d207e3931
  23. refactor: extract descriptor ID calculation from spkm GetID()
    This allows us to verify the descriptor ID on the descriptors
    unit tests in different software versions without requiring to
    use the entire DescriptorScriptPubKeyMan machinery.
    
    Note:
    The unit test changes are introduced after the bugfix commit
    but this commit + the unit test commit can be cherry-picked
    on top of the v25 branch to verify IDs correctness. IDs must
    be the same for v25 and after the bugfix commit.
    97a965d98f
  24. wallet: bugfix, always use apostrophe for spkm descriptor ID
    As we update the descriptor's db record every time that
    the wallet is loaded (at `TopUp` time), if the spkm ID differs
    from the one in db, the wallet will enter in an unrecoverable
    corruption state, and no soft version will be able to open
    it anymore.
    
    Because we cannot change the past, to stay compatible between
    releases, we need to always use the apostrophe version for the
    spkm IDs.
    6a9510d2da
  25. test: add coverage for descriptor ID
    Tests vectors were calculated by running the same tests on
    v25. Which was the last release prior to introducing the
    diff in the descriptor's string representation ('h' format).
    
    Co-authored-by: Sjors Provoost <sjors@sprovoost.nl>
    5df988b534
  26. furszy force-pushed on Jun 28, 2023
  27. Sjors commented at 1:34 pm on June 28, 2023: member
    What happened to fa2329285c?
  28. DrahtBot removed the label Needs rebase on Jun 28, 2023
  29. furszy commented at 2:08 pm on June 28, 2023: member

    What happened to fa23292?

    Vanished after #24914 merge :).

    We no longer iterate over all the database records without ordering. Now the wallet loads each specific group of records separately (e.g. legacy records, descriptors records, txs, etc).

    The reason behind fa232928 was that the upper level method LoadWallet was only treating few specific ReadKeyValue errors as corruption ones (we were adding each of them manually). And now, each specific section of the wallet loading process can return a corruption error alone (check 405b4d91).

  30. ErikDeSmedt commented at 5:12 pm on June 28, 2023: none

    ACK: I’ve tested the changes and the bug is resolved.

    I’ve been struggling with back-porting the test-code to v25.0. You might need to change some descriptors to use h instead of ' and I manually removed some checksums. @Sjors : Thanks for helping me out with this

  31. achow101 commented at 6:57 pm on June 28, 2023: member
    ACK 5df988b534df842ddb658ad2a7ff0f12996c8478
  32. furszy requested review from Sjors on Jul 3, 2023
  33. Sjors commented at 7:30 pm on July 3, 2023: member

    tACK 5df988b534df842ddb658ad2a7ff0f12996c8478

    Also tested backporting the tests. Trick is to disable the norm_pub check (or manually replacing ' with h, changing the checksum where needed).

  34. DrahtBot removed review request from Sjors on Jul 3, 2023
  35. achow101 merged this on Jul 4, 2023
  36. achow101 closed this on Jul 4, 2023

  37. furszy deleted the branch on Jul 4, 2023
  38. in src/wallet/walletdb.cpp:809 in 5df988b534
    802@@ -803,6 +803,12 @@ static DBErrors LoadDescriptorWalletRecords(CWallet* pwallet, DatabaseBatch& bat
    803         }
    804         pwallet->LoadDescriptorScriptPubKeyMan(id, desc);
    805 
    806+        // Prior to doing anything with this spkm, verify ID compatibility
    807+        if (id != pwallet->GetDescriptorScriptPubKeyMan(desc)->GetID()) {
    808+            strErr = "The descriptor ID calculated by the wallet differs from the one in DB";
    809+            return DBErrors::CORRUPT;
    


    luke-jr commented at 4:41 pm on July 4, 2023:
    Why not UNKNOWN_DESCRIPTOR instead?

    achow101 commented at 4:52 pm on July 4, 2023:
    The descriptor isn’t unknown. It’s a corruption error as the id written in the database doesn’t match the id that we calculate.
  39. sidhujag referenced this in commit dd59d6980f on Jul 4, 2023
  40. luke-jr commented at 4:45 pm on July 4, 2023: member
    Tag for backport?
  41. fanquake commented at 4:48 pm on July 4, 2023: member
    Not if this has (as I understand it) only been broken on master?
  42. achow101 commented at 4:52 pm on July 4, 2023: member

    Tag for backport?

    No need, only affects master.

  43. Sjors commented at 6:04 pm on July 4, 2023: member
    If anyone on master is affected by this, we could perhaps add a command to wallet-tool that resets the descriptor ID and cache.
  44. denavila commented at 0:57 am on September 3, 2023: none
    Some of my wallets are no longer opening after pulling master, and I bisected it to this commit. For example when starting in -sigtest or -regtest mode I get the error “The wallet is corrupt” and Bitcoin Core exits. This doesn’t happen for -testnet or main net though. If I understand correctly, this PR is fixing a bug for people who got their wallets corrupted, but in my case it’s the opposite - I can still open the wallets with versions before this commit. Is this an expected behavior?
  45. sipa commented at 4:20 am on September 3, 2023: member
    @denavila My understanding is that if you used non-release code from before this PR, then this PR will be expected to break things for you. However if you only ever run released version, this PR prevents incompatibilities.
  46. denavila commented at 4:20 pm on September 3, 2023: none
    Yes, I was using non-release versions for these wallets. I tried to restore the signet wallet from a backup, but that also hits the same “corruption” error. Is there any other way to migrate the signet wallet to the latest code? It’s signet, so it’s not the end of the world if the coins are lost, but it would still be better if I could recover them.
  47. furszy commented at 4:31 pm on September 3, 2023: member

    Yes, I was using non-release versions for these wallets. I tried to restore the signet wallet from a backup, but that also hits the same “corruption” error. Is there any other way to migrate the signet wallet to the latest code? It’s signet, so it’s not the end of the world if the coins are lost, but it would still be better if I could recover them.

    Probably the easiest workaround would be to revert this PR on your local repository. Compile the sources and start the app. Once you have the wallet loaded, export the descriptors by calling listdescriptors true (showing the private keys). Then, create a new wallet on top of latest master and import the exported descriptors there.

  48. denavila commented at 5:17 pm on September 3, 2023: none
    @furszy, thanks for the workaround! I got my signet wallet restored :-)
  49. Sjors commented at 2:38 pm on September 4, 2023: member

    Compile the sources and start the app.

    ~Or even just run an earlier release and export the descriptors.~ Glad that worked!

  50. furszy commented at 2:49 pm on September 4, 2023: member

    Compile the sources and start the app.

    Or even just run an earlier release and export the descriptors. Glad that worked!

    That wouldn’t work. Remember that this fixed an unrecoverable corruption state that was arising when the ‘h’ created wallet was open by an earlier release.

    The previous release would overwrite the spkm ID with the non-‘h’ version of it. Making all the db records tied to it fail to load on the next startup.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 13:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me