[WIP] wallet: standardize change output detection process #25979

pull furszy wants to merge 13 commits into bitcoin:master from furszy:2022_wallet_change_detection changing 25 files +510 −70
  1. furszy commented at 8:40 pm on September 1, 2022: member

    Depends on #27601, please go there first.

    This work aims to define, and implement a base standard mechanism to detect individual change outputs.

    Context

    Currently, the wallet detects whether an output is change or not based on data stored in the address book.

    There is no notion of “change outputs”, the wallet detects change scripts.

    Connoting that any address book record modification has implications on all the historical outputs related to that particular destination. Meaning that all those outputs can either be change or not. There is no middle-ground granular distinction.

    How Change Detection Currently Works?

    The wallet walks-through the transaction outputs, extracts the script destination and verify the following two points:

    1. If the destination doesn’t exist in the address book, then the script is a “change address”.

    2. If the destination exists in the address book, but it doesn’t have a label, then the script is a “change address”.

    Motivation

    There are a good number of problems in the current approach:

    • We make the wallet dependent on an external structure, with separate storage. Which has to be updated and maintained along with the wallet state.

    • It cannot be maintained nor recovered across different wallet instances. Cannot re-create the, possibly custom, address book data only by importing the wallet descriptor string.

    • As the address book is an structure that the user can freely modify, the change detection process might differ through different wallets.

    • The current rudimentary assumptions of “no address book entry” or “no label set for the address book entry” to denote that certain script destination is change or not can easily be broken: E.g. derive an address from one of the wallet’s external paths manually. Then send coins to it. As the receive destination wasn’t created inside the wallet, the wallet has no associated address book entry. So, the reception is invalidly detected as change (added a test case for it).

    • The wallet can’t detect change outputs on more complex scripts such as multi-sig change outputs.

    • The wallet is not able to detect change outputs going to an internal address if the internal address has a label. (E.g. the user can manually set a label for the internal address and, doing that, make that all the change outputs, in the wallet history, that were sent to the destination are no longer detected as change).

    • There isn’t a way to distinguish the external reception of coins into an internal address. Coins reception on any internal address are always detected as change.

    New Change Detection Mechanism Goals

    Aiming to:

    • Define a base mechanism to align different wallet implementations. Preventing each piece of software from diverging on the basic change outputs distinction.

    • Detect change outputs on-demand without requiring to maintain an external data structure synced with the latest wallet state.

    • Independently, and accurately, detect change outputs regardless data stored in structures that the user can freely modify.

    • Granular distinction between change vs non-change outputs that were sent to the same internal address. E.g. the reception of coins, from an external source, on internal addresses will not longer be detected as change anymore.

    • Expand the change detection to more complex scripts such as a multi-sig protected addresses. (While they are added into the wallet on an internal spkm)

    Change Output Detection Rules

    A transaction output is change if it fulfills the following points:

    1. At least one of the parent transaction inputs is from the wallet. (If none of them are, then the wallet is receiving coins on an internal address).

    2. The script extracted destination is from the wallet and is located in one of the internal script pub key manager. (e.g. derived from an internal derivation path)

    What about legacy wallets?

    If the legacy wallet is HD post-split, we have an internal derivation path, so we can follow the same process as descriptors wallet. Unless the destination is on the pre-split key pool, in which case, we fallback to the follow-up case.

    If the legacy wallet is pre-split, we continue using the address book as we either have an HD wallet with keys derived only on the external path, or we are using raw keypool.

    ———————————————————————

    Extra Note

    This PR, in about 85% at least, is about expanding the current test coverage for the change output detection area.

    TO DO (still WIP):

    • Save “internal” flag on non-active descriptors so the wallet can use them on the change detection process. (which will fix the currently failing test cases).
    • Re-organize commits so tests always pass.
    • Verify backwards compatibility.
  2. DrahtBot added the label Wallet on Sep 1, 2022
  3. furszy renamed this:
    wallet: standardize change output detection process
    [WIP] wallet: standardize change output detection process
    on Sep 1, 2022
  4. DrahtBot commented at 9:23 pm on September 1, 2022: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK ghost

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29936 (fuzz: wallet: add target for CreateTransaction by brunoerg)
    • #29136 (wallet: addhdkey RPC to add just keys to wallets via new void(KEY) descriptor by achow101)
    • #28333 (wallet: Construct ScriptPubKeyMans with all data rather than loaded progressively by achow101)
    • #26596 (wallet: Migrate legacy wallets to descriptor wallets without requiring BDB by achow101)
    • #22838 (descriptors: Be able to specify change and receiving in a single descriptor string by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  5. ghost commented at 2:31 am on September 2, 2022: none

    Interesting pull request although I haven’t tested it yet.

    Does this also fix #20935 and #20795 ?

    How Change Detection Currently Works? The wallet walks-through the transaction outputs, extracts the script destination and verify the following two points:

    If the destination doesn’t exist in the address book, then the script is a “change address”.

    If the destination exists in the address book, but it doesn’t have a label, then the script is a “change address”.

    I was assuming change address had different derivation path. Is that false?

  6. furszy commented at 4:09 pm on September 2, 2022: member

    Does this also fix #20935 and #20795 ?

    Yes for the change outputs detection part. It will allow us to properly detect each individual output as change or not, independently on what is stored in the address book entry (making the process deterministic across instances, and not dependent on a data structure that the user can freely modify).

    As an edge case example; with this, users could even receive coins on an internal address and the new mechanism will properly detect it as an external reception, not a change output (the transaction inputs aren’t from the wallet, so no coins are returning to it). Still, this behavior is obviously not encouraged, nor should be easy to do in the wallet, as it breaks the whole idea of internal/external derivation paths.

    Plus, would recommend you to check #25685 as well. The first commit there fixes a misleading wallet behavior where even if you set add_inputs=false (telling the wallet to disallow automatic coin selection), the wallet will still fetch coins internally and use them in the Coin Selection process (if you don’t pre-set inputs manually).

    How Change Detection Currently Works? The wallet walks-through the transaction outputs, extracts the script destination and verify the following two points:

    If the destination doesn’t exist in the address book, then the script is a “change address”.

    If the destination exists in the address book, but it doesn’t have a label, then the script is a “change address”.

    I was assuming change address had different derivation path. Is that false?

    Not entirely, the assumption is most of the time correct. It breaks on pre-split legacy wallets, where we don’t have an internal derivation path. In those wallet versions we (1) only use an external path to derive public and change addresses, or if we go further back in time, (2) we use a raw pool of keys with no derivation paths at all. (Thus why the initial change detection process was implemented using the address book).

  7. ghost commented at 5:41 am on September 3, 2022: none
    Concept ACK
  8. achow101 commented at 7:53 pm on October 31, 2022: member

    While I like the idea this is going for, I don’t think it is sufficient. It relies on m_internal_spk_managers which only contains the currently active SPKMs. If a user were to remove a SPKM from being active, all of the addresses that were produced by that would no longer be detected as change. This is especially problematic with #25907 which rotates all of the currently active descriptors.

    In general, I’m not sure that we can determine which output is change without storing additional metadata. What has been suggested before is to explicitly store an “IsChange” value in address book entries, with a “smart” fallback like what this PR does.

  9. DrahtBot added the label Needs rebase on Dec 6, 2022
  10. furszy force-pushed on Jan 10, 2023
  11. DrahtBot removed the label Needs rebase on Jan 10, 2023
  12. furszy commented at 7:07 pm on January 11, 2023: member

    finally here. Thanks for the input achow101

    While I like the idea this is going for, I don’t think it is sufficient. It relies on m_internal_spk_managers which only contains the currently active SPKMs. If a user were to remove a SPKM from being active, all of the addresses that were produced by that would no longer be detected as change. This is especially problematic with #25907 which rotates all of the currently active descriptors.

    I’m probably missing some context but.. wouldn’t that be solvable by adding an internal field to the WalletDescriptor class? (32a990a). So we can keep track of non-active internal descriptors too. It should work fine while we don’t have any descriptor removal functionality (which would also mean to remove txs from the wallet etc) and require to keep historical records.

    In general, I’m not sure that we can determine which output is change without storing additional metadata. What has been suggested before is to explicitly store an “IsChange” value in address book entries, with a “smart” fallback like what this PR does.

    I think that we should have a more granular distinction and move from “change addresses” to “change outputs”. Storing any extra metadata, if needed, inside the wallet transaction class.

    A good use case for this is the reception of coins from an external source on internal addresses, which shouldn’t be detected as change as it could be a simple reception or part of a dust attack. if the tx has no inputs belonging to the wallet, then no output should be labeled as change.

  13. DrahtBot added the label Needs rebase on Jan 18, 2023
  14. furszy force-pushed on Jan 19, 2023
  15. DrahtBot removed the label Needs rebase on Jan 19, 2023
  16. DrahtBot added the label Needs rebase on Feb 16, 2023
  17. furszy force-pushed on Feb 20, 2023
  18. DrahtBot removed the label Needs rebase on Feb 20, 2023
  19. DrahtBot added the label Needs rebase on Apr 11, 2023
  20. furszy force-pushed on Apr 12, 2023
  21. DrahtBot removed the label Needs rebase on Apr 12, 2023
  22. DrahtBot added the label Needs rebase on Apr 15, 2023
  23. furszy force-pushed on Apr 20, 2023
  24. DrahtBot removed the label Needs rebase on Apr 21, 2023
  25. DrahtBot added the label CI failed on Apr 21, 2023
  26. DrahtBot removed the label CI failed on Apr 23, 2023
  27. DrahtBot added the label CI failed on May 3, 2023
  28. furszy renamed this:
    [WIP] wallet: standardize change output detection process
    wallet: standardize change output detection process
    on May 19, 2023
  29. furszy force-pushed on May 19, 2023
  30. furszy renamed this:
    wallet: standardize change output detection process
    [WIP] wallet: standardize change output detection process
    on May 19, 2023
  31. furszy commented at 1:42 pm on May 19, 2023: member

    Thanks for the ping marko, rebased. Yeah, this is still relevant. Change detection is still an issue.

    The base work is done, just need some extra love. Will keep it rebased, so anyone can review/test it meanwhile.

  32. furszy marked this as a draft on May 19, 2023
  33. furszy commented at 1:52 pm on May 19, 2023: member
    This now depends on #27601. Moved to draft until that one gets merged.
  34. DrahtBot added the label Needs rebase on Jul 20, 2023
  35. Sadiq272 changes_requested
  36. Sadiq272 commented at 1:39 am on January 16, 2024: none
    Oba
  37. bitcoin deleted a comment on Jan 16, 2024
  38. bitcoin deleted a comment on Jan 16, 2024
  39. furszy commented at 3:00 am on January 16, 2024: member
    This is on a long-working path. Will try to rebase it in the coming days or close it until we proceed with the preceding PRs if it is too behind.
  40. furszy force-pushed on Jan 21, 2024
  41. furszy force-pushed on Jan 21, 2024
  42. DrahtBot removed the label Needs rebase on Jan 21, 2024
  43. DrahtBot removed the label CI failed on Jan 21, 2024
  44. DrahtBot added the label CI failed on Feb 23, 2024
  45. DrahtBot commented at 10:34 am on February 23, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/20703001941

  46. DrahtBot added the label Needs rebase on Mar 29, 2024
  47. wallet: add test coverage for change output detection
    General concept:
     Create transactions using the local wallet to different destinations provided by an external wallet.
     Verifying that the local wallet can detect the change output prior and post the tx is added to the wallet.
    
    The following cases are covered for a descriptor wallet:
     * 1) Create tx that sends to a legacy p2pkh addr and verify change detection.
     * 2) Create tx that sends to a p2wpkh addr and verify change detection.
     * 3) Create tx that sends to a wrapped p2wpkh addr and verify change detection.
     * 4) Create tx that sends to a taproot addr and verify change detection.
    
    And the following ones for a legacy-only wallet:
     * 1) Create tx that sends to a legacy p2pkh addr and verify change detection.
     * 2) Create tx that sends to a p2wpkh addr and verify change detection.
     * 3) Create tx that sends to a wrapped p2wpkh addr and verify change detection.
    b85011a2dc
  48. wallet: change output detection coverage for tx created outside the wallet
    The current change detection process assumption can easily be broken by creating an address manually,
    from an external derivation path, and send coins to it. As the address was not created by the
    wallet, it will not be in the addressbook, there by will be treated as change when it's clearly not.
    
    The wallet will properly detect it once the transaction gets added to the wallet and the address
    (and all the previous unused address) are derived and stored in the address book.
    606a979dd7
  49. wallet: test change detection negative paths
    1) The change output goes to an address in one of the wallet external path.
    2) The change goes back to the source.
       As the source is an external destination, and we are currently detecting
       change through it output script, the change will be marked as external
       (not change).
    3) The user setting an address book label to a destination created from an
       internal key.
    19f617fa26
  50. wallet: standardize change output detection process
    Currently, the wallet detects whether an output is change or not based
    on data stored in the address book.
    
    There is no notion of “change outputs”, the wallet detects change scripts.
    
    Meaning that any address book data modification has implications
    on all the historical outputs related to a particular destination as all of
    them can either be change or not. There is no middle-ground.
    
    The wallet walks-through the transaction outputs, extracts the script
    destination and verify the following two points:
    
    1) If the destination doesn't exist in the address book, then the script
        is a "change address".
    
    2) If the destination exists in the address book, but it doesn't have a
         label, then the script is a "change address".
    
    There are a good number of problems in the current approach:
    
    - We make the wallet dependent on an external structure, with separate storage.
       Which has to be updated and maintained along with the wallet state.
    
    - It cannot be maintained nor recovered across different wallet instances.
       Cannot re-create the address book data only by importing the wallet descriptor string.
    
    - As the address book is an structure that the user can freely modify, the change
       detection process might differ through different wallets.
    
    - The current rudimentary assumptions of "no address book entry" or "no label set for the address book entry"
      to denote that certain script destination is change or not can easily be broken:
      E.g. derive an address from one of the wallet’s external paths manually. Then send coins to it.
      As the receive destination wasn't created inside the wallet, the wallet has no associated address book entry.
      So, the reception is invalidly detected as change (added a test case for it).
    
    - The wallet can't detect change outputs on more complex scripts such as multi-sig change outputs.
    
    - The wallet is not able to detect any change output going to an internal address if the internal address has a label.
      (E.g. the user can manually set a label for the internal address and, doing that, make that all the change outputs,
       in the wallet history, that were sent to the destination are no longer detected as change).
    
    - There isn’t a way to distinguish the external reception of coins into an internal address. Coins reception on any
       internal address are always detected as change.
    
    Aiming to:
    
    * Define a base mechanism to align different wallet implementations. Preventing each piece of software
      from diverging on the basic change outputs distinction.
    
    * Detect change outputs on-demand without requiring to maintain an external data structure synced with the
       latest wallet state.
    
    * Independently, and accurately, detect change outputs regardless data stored in structures that the user
       can freely modify.
    
    * Granular distinction between change vs non-change outputs that were sent to the same internal address.
      E.g. the reception of coins, from an external source, on internal addresses will not longer be detected
      as change anymore.
    
    * Expand the change detection to more complex scripts such as a multi-sig protected addresses. (While they
      are added into the wallet on an internal spkm)
    
    A transaction output is change if it fulfills the following points:
    
    1) At least one of the parent transaction inputs is from the wallet. (If none of them are, then the wallet is receiving coins
        on an internal address).
    
    2) The script extracted destination is from the wallet and is located in one of the internal script pub key manager.
        (e.g. derived from an internal derivation path)
    
    If the legacy wallet is HD post-split, we have an internal derivation path, so we can follow the same process as
    descriptors wallet. Unless the destination is on the pre-split key pool, in which case, we fallback to the follow-up
    case.
    
    If the legacy wallet is pre-split, we continue using the address book as we either have an HD wallet with keys
    derived only on the external path, or we are using raw keypool.
    9d8fee4a49
  51. wallet: remove old ScriptIsChange function 2f2278e720
  52. wallet: coverage for coins reception into an internal address
    As the wallet is receiving those coins from outside, them should
    not be detected as change.
    
    E.g. the user manually obtained and shared one of the internal addresses
    and received coins there.
    455f74d9fc
  53. RPC: add 'include_change' arg to 'listtransactions' and 'gettransaction' 6661e95d6c
  54. wallet: granular distinction between change vs non-change outputs sent to the same internal address
    A transaction output is change if it fulfills the following rules:
    
    1) At least one of the transaction inputs is from the wallet.
       (If none of them are, then the wallet is receiving coins
        on an internal address).
    
    2) The script extracted destination is from the wallet and is
       located in one of the internal script pub key manager.
       (e.g. derived from an internal derivation path)
    2ff9d97da6
  55. [fixup] tests, internal address with addrbook label detected as change 916ffd3d04
  56. wallet: move OutputIsChange(txout) to IsOutputChange(tx, pos)
    Keep behavior consistent
    5ff81d70e0
  57. wallet: add 'internal' field to WalletDescriptor
    To detect whether active and non-active descriptors are internal or not.
    838d0ea534
  58. wallet: GetAllScriptPubKeyMans, return non-active internal descriptors 1eafcd24d2
  59. test: coverage for change detection on inactive internal descriptors 95b2d8d8dc
  60. furszy force-pushed on Mar 29, 2024
  61. DrahtBot removed the label Needs rebase on Mar 29, 2024
  62. DrahtBot added the label Needs rebase on May 23, 2024
  63. DrahtBot commented at 7:04 pm on May 23, 2024: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-05 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me