wallet: Strictly match tx change type to improve privacy

MarcoFalke commented at 8:16 pm on December 15, 2021: member

Currently the change type will only match a destination by accident, making it easier to determine the change.

Fix that by strictly matching one of the destinations.

MarcoFalke added the label Wallet on Dec 15, 2021

MarcoFalke added the label Privacy on Dec 15, 2021

MarcoFalke commented at 8:17 pm on December 15, 2021: member

Obviously this will fail tests and need docs fixed up. I’ll do that unless there are NACKs.

MarcoFalke renamed this:
~~wallet: Strictly match in TransactionChangeType to improve privacy [skip ci] [ci skip]~~
wallet: Strictly match tx change type to improve privacy [skip ci] [ci skip]
on Dec 15, 2021

MarcoFalke force-pushed on Dec 15, 2021

kristapsk commented at 8:28 pm on December 15, 2021: contributor

Concept ACK

in src/wallet/wallet.cpp:1983 in 26682d5bf4 outdated

1962-                return m_default_address_type;
1963-            }
1964-        }
1965+        std::vector<std::vector<uint8_t>> dummy;
1966+        const TxoutType type{Solver(recipient.scriptPubKey, dummy)};
1967+        if (type == TxoutType::WITNESS_V1_TAPROOT) {

achow101 commented at 8:50 pm on December 15, 2021:

Unknown witness versions should be included here too.

MarcoFalke commented at 8:56 pm on December 15, 2021:

I’d say no. Otherwise, it will be trivial for chain-analysis to determine the change type if the outputs are {v2, v1, v0} and the input is {v1}

achow101 commented at 9:09 pm on December 15, 2021:

Yes, but the intention is to use OutputType::BECH32M for future witness versions too. v2 outputs will have bech32m addresses. If you had a v2 descriptor as active, it would take the slot that tr descriptors would currently occupy so a tx with a taproot output would have a v2 as change under this logic.

MarcoFalke commented at 9:14 pm on December 15, 2021:

This is impossible to do unless the specification of v2 is known.

If I use UNKNOWN here and V2 is specified, then UNKNOWN will not match V2. Before V2 is defined, it is not possible to match it. As soon as V2 is defined, it can be matched. So this is something for a follow-up when the time is right.

achow101 commented at 9:29 pm on December 15, 2021:

Sure, but I think we should handle UNKNOWN here too as one of the explicitly stated goals of bech32(m) is to allow sending to future segwit versions without needing to know about them. When a new version’s TxoutType is added, it needs to be added here as well, but we should always handle UNKNOWN regardless.

MarcoFalke commented at 6:59 am on December 16, 2021:

UNKNOWN is handled. Obviously it is impossible to match, but it is treated like any other type that can’t be matched (wsh, pk, raw multisig, raw data, …).

Happy to add a comment, but the algorithm is:

find any types that can theoretically be matched
try to match them if there is a spkman
fallback to bech32(m) (Note: This will happen if all outputs are unmatchable. For example all are v2)

MarcoFalke commented at 7:02 am on December 16, 2021:

Not sure what to do here, but I think you’ll have to send me a sketch of a diff to clarify what you mean.

achow101 commented at 6:07 pm on December 17, 2021:

I was thinking something like

0        if (type == TxoutType::WITNESS_V1_TAPROOT || type == TxoutType::WITNESS_UNKNOWN) {

But looking at this further, I suppose the current implementation is fine. If the transaction contains any known address types, then we will match to those and privacy is still preserved. If they are all unknown, then we will try to use bech32(m) anyways, unless unavailable, in which case this wouldn’t matter.

MarcoFalke commented at 8:26 am on December 18, 2021:

Right, on mismatch it’d try bech32(m)

ghost commented at 10:33 pm on December 15, 2021: none

Concept ACK

naumenkogs commented at 9:10 am on December 16, 2021: member

Concept ACK

MarcoFalke renamed this:
~~wallet: Strictly match tx change type to improve privacy [skip ci] [ci skip]~~
wallet: Strictly match tx change type to improve privacy
on Dec 16, 2021

MarcoFalke force-pushed on Dec 16, 2021

in test/functional/wallet_address_types.py:351 in faf495815e outdated

347@@ -348,9 +348,9 @@ def run_test(self):
348         if self.options.descriptors:
349             self.log.info("Nodes with addresstype=p2sh-segwit only use a bech32m change output if any destination address is bech32:")
350             self.test_change_output_type(1, [to_address_p2sh], 'p2sh-segwit')
351-            self.test_change_output_type(1, [to_address_bech32_1], 'bech32m')
352-            self.test_change_output_type(1, [to_address_p2sh, to_address_bech32_1], 'bech32m')
353-            self.test_change_output_type(1, [to_address_bech32_1, to_address_bech32_2], 'bech32m')
354+            self.test_change_output_type(1, [to_address_bech32_1], 'bech32')

rajarshimaitra commented at 5:56 am on December 17, 2021:

Should the log line above also change to bech32?

MarcoFalke commented at 7:00 am on December 17, 2021:

thx, done

in test/functional/wallet_address_types.py:372 in faf495815e outdated

370+            self.test_change_output_type(3, [to_address_p2sh], 'p2sh-segwit')
371         else:
372             self.log.info("Nodes with addresstype=bech32 always use a P2WPKH change output (unless changetype is set otherwise):")
373             self.test_change_output_type(3, [to_address_bech32_1], 'bech32')
374-            self.test_change_output_type(3, [to_address_p2sh], 'bech32')
375+            self.test_change_output_type(3, [to_address_p2sh], 'p2sh-segwit')

rajarshimaitra commented at 6:16 am on December 17, 2021:

The log comment above this lines aren’t valid anymore.

MarcoFalke commented at 7:01 am on December 17, 2021:

thx, done

in test/functional/wallet_address_types.py:348 in faf495815e outdated

347@@ -348,9 +348,9 @@ def run_test(self):
348         if self.options.descriptors:

rajarshimaitra commented at 6:25 am on December 17, 2021:

It seems now the change type behavior for both type of wallets is same. So this can be removed?

MarcoFalke commented at 7:01 am on December 17, 2021:

I think the whole test needs to be re-written from scratch, but I want to keep the changes minimal here.

rajarshimaitra commented at 6:30 am on December 17, 2021: contributor

Concept + tACK https://github.com/bitcoin/bitcoin/pull/23789/commits/faf495815e6cfca8969a3c3f0671ce6ec2351472

This mixes change with destination better than previous logic. Below are just some small nits.

wallet: Strictly match tx change type to improve privacy fada6c65d2

MarcoFalke force-pushed on Dec 17, 2021

MarcoFalke commented at 7:02 am on December 17, 2021: member

Force pushed to fix test logs

S3RK commented at 7:23 am on December 17, 2021: member

Concept & Approach ACK fada6c6. Also did light code review .

ghost commented at 10:12 am on December 17, 2021: none

Tested with some transactions on testnet by using different types of address:

P2PKH: dfaa147f5940c27e7a1cc1486bd612e49eb15b42d63b7045e5008bb8c7af2c5c ✅
P2WPKH: df09bb5f59e4a74b5d6627f1ab191aa105e204dc1f37a977c9a603550ab82fdc ✅
P2SH: 68b216c5884e0b0107831768b263afdc947e2bf733bf54dd80ec1725f0d8de34 ✅
P2WSH: 4740d4ee560e42b8537338147a30f9b7492b32e79667ce64a428989f06125488 ❌ (Not Supported)
P2TR: ~~f7fbf80856f3996cd2eba7efa3c2ce266c5f18c3115f2036106223afc7d0404a ❌~~ #23789 (comment)

MarcoFalke commented at 10:24 am on December 17, 2021: member

Excellent. Thank you for testing. If you want to match P2TR, you need a descriptor wallet with a tr descriptor (created by default for new descriptor wallets).

ghost commented at 11:11 am on December 17, 2021: none

Yes that works for P2TR: 64d300442ad4ccb726146b292658876040a1e230af0fd0236e5e83a5aaa33c8d

MarcoFalke commented at 11:44 am on December 17, 2021: member

and for completeness: P2WSH is impossible to match, since the wallet doesn’t support it

unknown approved

unknown commented at 12:12 pm on December 17, 2021: none

tACK https://github.com/bitcoin/bitcoin/pull/23789/commits/fada6c65d23f7bd4682fac281026612b835c4f8b

Thanks for improving privacy. Will have to add this PR and author in my blog.

kristapsk commented at 2:33 pm on December 17, 2021: contributor

and for completeness: P2WSH is impossible to match, since the wallet doesn’t support it

And it’s useless even to try to match as it will likely differ on spend.

MarcoFalke commented at 2:58 pm on December 17, 2021: member

WSH is a corner case anyway (less than 5%) according to https://transactionfee.info/charts/output-type-distribution-count/

ghost commented at 3:49 pm on December 17, 2021: none

I was assuming its used for multisig because of this note mentioned in caravan:

achow101 commented at 6:09 pm on December 17, 2021: member

and for completeness: P2WSH is impossible to match, since the wallet doesn’t support it

Not necessarily. A descriptor wallet can have an active wsh(...) descriptor, and it would be assigned to OutputType::BECH32 (because P2WSH is bech32).

achow101 commented at 6:31 pm on December 17, 2021: member

ACK fada6c65d23f7bd4682fac281026612b835c4f8b

w0xlt approved

w0xlt commented at 7:49 am on December 18, 2021: contributor

tACK fada6c6

Tested on signet with legacy, p2sh-segwit, bech32, and bech32m address types.

MarcoFalke commented at 8:25 am on December 18, 2021: member

Not necessarily. A descriptor wallet can have an active wsh(…) descriptor, and it would be assigned to OutputType::BECH32 (because P2WSH is bech32).

Interesting. Could the wallet also spend the created wsh change output?

MarcoFalke commented at 9:34 am on December 20, 2021: member

@achow101 Ready for merge?

achow101 merged this on Dec 20, 2021

achow101 closed this on Dec 20, 2021

JeremyRubin commented at 8:48 pm on December 20, 2021: contributor

This feels like an incredibly fast merge for a privacy related feature that requires careful consideration.

I need to review the code specifically, but if I understand it correctly this will majorly slow the adoption of Taproot because you’ll have transitive slowness to upgrade (e.g., I made a segwit change to be compatible with a old segwit destination I was paying to, now my change is segwit, now future users will make segwit change to pay me, etc).

Please revert and make a mailing list post.

JeremyRubin commented at 8:53 pm on December 20, 2021: contributor

I gues maybe part of what I’m missing is that only outputs are analyzed, not inputs, so it’s not as bad as I thought it might be in terms of transitivity.

I still think this could slow Tr adoption quite a bit, and I think that’s worth getting a better idea of before deploying this.

achow101 commented at 9:33 pm on December 20, 2021: member

now my change is segwit, now future users will make segwit change to pay me, etc).

No, inputs are not considered.

Please revert and make a mailing list post.

This is a local wallet policy, it does not need to be discussed on the mailing list.

I still think this could slow Tr adoption quite a bit, and I think that’s worth getting a better idea of before deploying this.

I see that it could slow adoption a bit, but it is overall an improvement to how we determine change addresses IMO. We have plenty of time until 23.0 to change it, and changing the order of preferences would not be particularly difficult.

ghost commented at 9:53 pm on December 20, 2021: none

I still think this could slow Tr adoption quite a bit, and I think that’s worth getting a better idea of before deploying this.

Privacy > TR adoption
IIUC change will be P2TR if sending to P2TR: #23789 (comment) so I don’t see issues with adoption if more people want to use P2TR they will eventually because of usecases.

Making it difficult to identify change in a transaction is a great improvement.

JeremyRubin commented at 10:04 pm on December 20, 2021: contributor

Well it still does worsen privacy since then my wallet will be spending a mix out output types rather than just all taproot.

And the observability of this pattern (change matching) is also… pretty observable, especially if bitcoin core (and only upgraded wallets) are the ones doing it.

This becomes worse the slower taproot uptake is.

Further, when you do a subsequent spend, unless our coin selection algorithm is primed to only spend like-kind togehter (not sure it is?) we end up re-correlating the legacy output with the new format, linking the idea that that output was likely one from our wallet as it’s with the upgraded type. In order to safely get taproot outputs you’d have to double-tap your old style outputs into a all taproot self-pay payout txn, or find an opportunity to do so nautrally as a payment. If you were to take your old outputs and pay to a new txn and make a Tr change, you also get screwed because then it’s known that your wallet did really have Tr enabled and was faking. I think it’s really hard to avoid these cases.

The main benefit here is decorrelation of change / payment at time of flight of txn. But privacy is a property that we also have to weigh the costs and benefits for future behaviors as well, and I think it’s worth tightly considering this PR under that lens.

I agree that privacy > all else, but this change should have IMO a pretty negative effect on the herd benefit. And privacy only can be created within the walls of an anonymity set. The only thing worse than no privacy is a false sense of one.

I think the best path to build the anonymity set is to just go all-in taproot change.

sidhujag referenced this in commit d7f67b2b36 on Dec 20, 2021

achow101 commented at 10:17 pm on December 20, 2021: member

we end up re-correlating the legacy output with the new format, linking the idea that that output was likely one from our wallet as it’s with the upgraded type.

I don’t see how this is any worse than a normal spend which is likely to correlate change to your wallet anyways. The purpose is to make it harder to determine which is change while the UTXOs are unspent. Hiding which outputs are change is already tenuous at best.

In order to safely get taproot outputs you’d have to double-tap your old style outputs into a all taproot self-pay payout txn, or find an opportunity to do so nautrally as a payment.

Or just set -addresstype=bech32m and -changetype=bech32m? Those startup options, and any change address option given to transaction creation, will always take precedence over this automatic behavior.

The behavior implemented here also is fairly similar to what we did for segwit. The default address type was p2sh-segwit, and we would only use bech32 if a p2wpkh or p2wsh recipient was observed.

JeremyRubin commented at 10:23 pm on December 20, 2021: contributor

we end up re-correlating the legacy output with the new format, linking the idea that that output was likely one from our wallet as it’s with the upgraded type.

I don’t see how this is any worse than a normal spend which is likely to correlate change to your wallet anyways. The purpose is to make it harder to determine which is change while the UTXOs are unspent. Hiding which outputs are change is already tenuous at best.

It’s worse because the subsequent spend mixing inputs shows that our wallet did in fact support (other type) and is therefore likely to be the change address because it’s unlikely we requested to be paid in (old type).

Or just set -addresstype=bech32m and -changetype=bech32m? Those startup options, and any change address option given to transaction creation will always take precedence over this automatic behavior.

I’m analyzing the impact for a user who uses the default configuration, not what someone can do. If the default configuration isn’t a good idea, it shouldn’t be default.

The behavior implemented here also is fairly similar to what we did for segwit. The default address type was p2sh-segwit, and we would only use bech32 if a p2wpkh or p2wsh recipient was observed.

I don’t think that results from segwit v0 apply here since there was an entirely different mood around adoption which necessitated the segwit have 2 address types for p2sh wrapping and there was real risk of refusing to integrate.

Noting also the negative scaling impact this has on a CISA future, for example, if a lot of the change outputs created are to be long lived.

ghost commented at 10:31 pm on December 20, 2021: none

And the observability of this pattern (change matching) is also… pretty observable, especially if bitcoin core (and only upgraded wallets) are the ones doing it.

There are few other wallets that do the same thing.

Further, when you do a subsequent spend, unless our coin selection algorithm is primed to only spend like-kind togehter (not sure it is?) we end up re-correlating the legacy output with the new format, linking the idea that that output was likely one from our wallet as it’s with the upgraded type.

Good point. Will have to test this. I think this can be improved but it would be out of scope for this PR.

JeremyRubin commented at 10:36 pm on December 20, 2021: contributor

the best case we could do would probably be to draw output types from some distribution matching the chain data over the last 2 weeks and encode a small bias towards the newer type. The herd would move in unison towards new types, and you would actually not be able to learn more information than the bias amount.

note: this would have to apply to address creation too

achow101 commented at 10:54 pm on December 20, 2021: member

It’s worse because the subsequent spend mixing inputs shows that our wallet did in fact support (other type) and is therefore likely to be the change address because it’s unlikely we requested to be paid in (old type).

Not true. The default output type is still bech32 - users using the current defaults will still give out bech32 addresses, and so they will be requesting payments to bech32 addresses. However, their wallet may have tr descriptors, in which case they support Taproot. So they would be giving out p2wpkh addresses, and potentially receiving change as p2tr if the recipient gives them a p2tr address.

Additionally, this is all about address type and not script type. Future witness versions (are expected to) use bech32m as well, so in the future, unless the recipient was asked to be paid in (really old type), the change is likely to be (future witness version) even if the recipient asked for (older witness version).

I’m analyzing the impact for a user who uses the default configuration, not what someone can do. If the default configuration isn’t a good idea, it shouldn’t be default.

You gave a scenario where the user is trying to use all taproot. If a user knows enough about address types and wants to only use a specific type, I think they can figure out how to set the option. We can also add a GUI setting for it too.

MarcoFalke deleted the branch on Dec 21, 2021

MarcoFalke commented at 8:54 am on December 21, 2021: member

@JeremyRubin For reference I implemented what you suggest first in #23731, but then decided to abandon it.

re: process. I think that a pull request is ready to be merged if it was reviewed. Surely, a behaviour changing pull request should be open for a few days to let more eyes take a look at it. I think 11 days is more than enough for that, since it will never be possible to get everyone’s eye on it. And this is “just” the master branch. If for some reason all reviewers messed up and did the wrong thing, there is still plenty of time to revisit before a release.

re: content: Surely, using bech32m unconditionally for change is going to speed up tr adoption on chain. However, it also makes it trivial to spot the change output immediately (at broadcast time of the tx). Especially if the “real” tr usage is low.

With this patch, there should still be a bias toward “promoting” tr, since any tr output will lead to tr change.

Also, I am not sure if it makes sense to consider input-bundling here. If the analysis firm uses input-bundling as a heuristic, then it can determine the wallet inputs regardless of their type already. If they don’t, since it might be a coinjoin, then it shouldn’t matter.

MarcoFalke commented at 10:14 am on December 22, 2021: member

The -changetype doc wasn’t updated. Fixed in https://github.com/bitcoin/bitcoin/pull/23840

achow101 referenced this in commit 1abbae65eb on Jul 28, 2022

DrahtBot locked this on Dec 22, 2022

wallet: Strictly match tx change type to improve privacy #23789