test: coinselection edge cases

S3RK commented at 8:39 am on March 16, 2022: member

This PR adds tests for some edge cases in coin selection.

There are 4 sections by the type of utxo pool:

a few small inputs of equal value
one big input and many small inputs
just one input
one big and one small input

S3RK force-pushed on Mar 16, 2022

fanquake added the label Tests on Mar 16, 2022

fanquake requested review from Xekyo on Mar 16, 2022

in test/functional/wallet_coinselection.py:44 in 1607529ea5 outdated

36+        tx1 = w0.sendtoaddress(w1.getnewaddress("", "bech32"), 1)
37+        tx2 = w0.sendtoaddress(w1.getnewaddress("", "bech32"), 1)
38+        tx3 = w0.sendtoaddress(w1.getnewaddress("", "legacy"), 1)
39+        self.generate(self.nodes[0], 1)
40+
41+        # TODO: set long_term_fee_rate explicitly

Xekyo commented at 9:58 pm on March 16, 2022:

I noticed there is a todo left here. Is that an open question?

S3RK commented at 8:07 am on March 22, 2022:

fixed that

in test/functional/wallet_coinselection.py:146 in 1607529ea5 outdated

141+            w0.sendtoaddress(w1.getnewaddress("", "legacy"), 0.05)
142+
143+        # We will target a solution with 20 coins that will add up exactly to 1btc.
144+        # Add a coin that will cover all the fees and exactly match the target.
145+        # TODO: fix that knapsack targets change even if there is no change needed
146+        # (148*20+42+31)*350+5000000 = 6061550

Xekyo commented at 8:51 pm on March 17, 2022:

What’s the 42 here? Is that the header plus a P2SH output?

S3RK commented at 8:52 am on March 21, 2022:

42 is header (11) + p2pwkh (31). I’ll updated the comment

in test/functional/wallet_coinselection.py:88 in 1607529ea5 outdated

83+        # TODO: implement optimization and uncomment the assertions
84+        # assert_equal(len(tx['vin']), 1)
85+        # assert_equal(len(tx['vout']), 1)
86+        # assert(tx['vin'][0]['txid'] in [tx1, tx2])
87+
88+        # fees for 1 legacy input tx = (148+42)*20 = 3800sat

Xekyo commented at 9:03 pm on March 17, 2022:

I’m wondering what the 42 is here. Is that the output plus header? My first intuition was that it was just the output, but then I realized that it probably is the whole transaction weight that’s being calculated. Does that mean that we’re expecting a wrapped segwit output here? Perhaps you could split up the calculation a bit more for clarity, e.g. (10+148+32), if I interpret that correctly.

S3RK commented at 8:53 am on March 21, 2022:

Yes, 42 is output + header. I’ll update the comment

in test/functional/wallet_coinselection.py:61 in 1607529ea5 outdated

56+
57+        def select_coins_for(target):
58+            result = w1.fundrawtransaction(w1.createrawtransaction([], {dummyaddress: target}), {"fee_rate": fee_rate})
59+            return w1.decoderawtransaction(result['hex'])
60+
61+        # 1-input-1-output tx is 110vbytes which results in 2200sat fee at 20sat/vbyte

Xekyo commented at 9:09 pm on March 17, 2022:

0        # 1-input-1-output tx is 110vbytes (42/4+68+31) which results in 2200sat fee at 20sat/vbyte

S3RK commented at 8:54 am on March 21, 2022:

I’ll update the comment

in test/functional/wallet_coinselection.py:111 in 1607529ea5 outdated

106+        assert_equal(len(tx['vin']), 1)
107+        assert_equal(len(tx['vout']), 1)
108+        assert_equal(tx['vin'][0]['txid'], tx3)
109+
110+        # a solution with two bech32 inputs and a change is the most efficient
111+        # but it could only be found by SRD

Xekyo commented at 9:27 pm on March 17, 2022:

I was wondering for a bit why the solution could not be found by Knapsack, but then I realized that it probably would, but then a solution with one bech32 input and one legacy input would be preferred because it overshoots the target less by spending more in fees. Given the 1000 iterations, Knapsack would definitely find that solution.

However, then if SRD also found the solution with two bech32 inputs, it would get preferred per the waste metric. Was that your reasoning as well?

S3RK commented at 8:59 am on March 21, 2022:

Exactly. Knapsack prefers a solution with legacy+bech32 as it’s closer to the target + MIN_CHANGE due to higher fees. I’ll update the comment

in test/functional/wallet_coinselection.py:127 in 1607529ea5 outdated

118+        # This is more than cost of change and BnB won't find a solution with single input
119+        # We either get
120+        # 1) 2x bech32 solution (SRD only)
121+        # 2) legacy + bech32 solution
122+        tx = select_coins_for(0.99994899)
123+        # legacy + dropping more for fees is still better in terms of waste than legacy + bech32

Xekyo commented at 9:34 pm on March 17, 2022:

What do you think would be the outcome if we changed knapsack to keep the solution with the best waste instead of the solution with the smallest change?

S3RK commented at 7:31 am on March 22, 2022:

To keep the solution with best waste out of a 1000 trials? Hm.. that would fix this test case, but it’d make Knapsack behave exactly like running SRD in a loop. I guess we can then just drop knapsack altogether.

in test/functional/wallet_coinselection.py:144 in 1607529ea5 outdated

135+        w0 = self.nodes[0].get_wallet_rpc(self.default_wallet_name)
136+        self.nodes[0].createwallet(wallet_name="test_one_big_and_many_small_coins")
137+        w1 = self.nodes[0].get_wallet_rpc("test_one_big_and_many_small_coins")
138+
139+        # Create a bunch of small coins
140+        for i in range(100):

Xekyo commented at 9:37 pm on March 17, 2022:

I was just looking into this user asking about a transaction with 856 inputs a few days ago. I’d be curious to see if we could still reproduce behavior like that on a wallet with a UTXO pool of only about 1,000. I think SRD would hopefully make that better, but I would say we could even go higher than 100 here. :)

S3RK commented at 7:46 am on March 22, 2022:

Actually I was trying to keep the number of small coins as a low as possible while still demonstrating some inefficiencies of coin selection.

The more coins we have -> the more time the test takes.
The more coins we have -> the harder it’s to debug and to reason about.

What is the test scenario that would be possible with 1000 coins, but not possible with 100 coins?

P.S. I think we can definitely reproduce this scenario, I see no reasons why not? If utxo pool is equivalent to this test cases then even with SRD it will find better solution in ~856/1000 = 85%, which leaves 15% to a really bad knapsack alternative.

Xekyo commented at 5:08 pm on March 22, 2022:

Yeah, I think it would be easy to produce the scenario:

• 100 small coins, 1 big coin • no BnB solution • a large set of the small coins produces a smaller change output than the 1 big coin, so knapsack prefers the former • SRD only finds the 1 big coin by itself as a solution in ~1/101 cases

⇒ usually SRD finds the best, but it’s a worse waste than best solution, but better than knapsack

One could probably make it even less likely that the best solution is found by SRD by having two big ones that are both necessary

S3RK commented at 8:18 am on March 23, 2022:

Isn’t this exactly test case 3) in test_one_big_and_many_small_coins?

in test/functional/wallet_coinselection.py:149 in 1607529ea5 outdated

140+        for i in range(100):
141+            w0.sendtoaddress(w1.getnewaddress("", "legacy"), 0.05)
142+
143+        # We will target a solution with 20 coins that will add up exactly to 1btc.
144+        # Add a coin that will cover all the fees and exactly match the target.
145+        # TODO: fix that knapsack targets change even if there is no change needed

Xekyo commented at 9:40 pm on March 17, 2022:

I’m not sure we should be improving Knapsack. I’d rather get rid of it, but only after we have better coin selection than Knapsack. @achow101 and I were looking into why Knapsack still produces such a high percentage of the selection results in simulations and realized that after all those 1000 random walks, it’ll still prefer lowest_larger if the latter produces a smaller change (that’s still bigger than min_change).

Xekyo commented at 9:43 pm on March 17, 2022:

Also, even when Knapsack can’t find changeless transactions, wouldn’t the expectation be that BnB finds it?

S3RK commented at 7:28 am on March 22, 2022:

I’m not really arguing for improving it. I guess maybe one way that fix that would be to just remove Knapsack that targets no change and leave only knapsack that targets a MIN_CHANGE.

P.S. BnB won’t produce same solutions even in the situations when a changeless solution is possible, because BnB optimizes for waste. But I think this is okay.

in test/functional/wallet_coinselection.py:169 in 1607529ea5 outdated

164+            # lock big coin
165+            w1.lockunspent(False, [tx for tx in w1.listunspent() if tx['txid'] == txid])
166+            return fee_total, max_fee
167+
168+        # in all the scenarios below the best solution would be to take one big coin and create a change output
169+        best_fee = (42 + 68 + 31) * fee_rate / COIN

Xekyo commented at 9:51 pm on March 17, 2022:

42 kinda throws me off a bit as before, but still assuming it’s header + output

S3RK commented at 9:04 am on March 21, 2022:

yes, it’s. I’ll split off the calculation

in test/functional/wallet_coinselection.py:184 in 1607529ea5 outdated

175+        assert_approx(fee_total, best_fee * 100)
176+
177+        # 1) knapsack latches to the exact match solution and can't find the best solution
178+        # SRD can find a solution better than knapsack if draws the big coin in the first 21 tries
179+        # chances for SRD to find it are roughly 2*(21/102) = ±40%.
180+        # Chances are doubled due to coinselection run twice with and without APS enabled

Xekyo commented at 10:05 pm on March 17, 2022:

What’s “APS”?

S3RK commented at 9:04 am on March 21, 2022:

avoid partial spends. We run transaction building two times (with and without aps)

jonatack commented at 8:55 am on March 22, 2022:

What’s “APS”?

In more detail, see the configuration flag -maxapsfee added by #14582 in 0.21, which sets the max allowed avoid partial spends (APS) fee. It defaults to 0 (fee is the same with and without APS), and setting it to -1 will disable APS, unless -avoidpartialspends is set.

in test/functional/wallet_coinselection.py:183 in 1607529ea5 outdated

174+        assert_approx(max_fee, best_fee)
175+        assert_approx(fee_total, best_fee * 100)
176+
177+        # 1) knapsack latches to the exact match solution and can't find the best solution
178+        # SRD can find a solution better than knapsack if draws the big coin in the first 21 tries
179+        # chances for SRD to find it are roughly 2*(21/102) = ±40%.

Xekyo commented at 10:08 pm on March 17, 2022:

I think the chance is (1-(101/102)×(100/101)×…×(81/82))=(1-81/102)… oh, I see, that’s (21/102). So, to get a better SRD solution that would need to occur at least once, which is 1-(81/102)² = 36.93%… okay, I gotcha.

S3RK commented at 9:07 am on March 21, 2022:

Yes. that is correct. Not sure if I can improve the comment here somehow

in test/functional/wallet_coinselection.py:177 in 1607529ea5 outdated

172+        # knapsack always finds the best solution
173+        fee_total, max_fee = select_coins(big_coin=1.01051, target=1.00000001, iterations=100)
174+        assert_approx(max_fee, best_fee)
175+        assert_approx(fee_total, best_fee * 100)
176+
177+        # 1) knapsack latches to the exact match solution and can't find the best solution

Xekyo commented at 10:12 pm on March 17, 2022:

We’ve previously used “exact match” for any changeless solution, so it wasn’t immediately obvious to me that it was used in the literal sense here. Perhaps you should clarify that you mean that it’s literally the same to the sat.

S3RK commented at 9:07 am on March 21, 2022:

Will update the comment

S3RK commented at 8:35 am on March 22, 2022:

Updated, but I think we should stop using “exact match” in place of changeless solution. Changeless solution is a solution in the range around exact match constrained by the cost of change/discard fee rate.

in test/functional/wallet_coinselection.py:188 in 1607529ea5 outdated

183+        assert_approx(max_fee, knapsack_fee)
184+        # total fee is lower due to lucky SRD solutions
185+        assert_greater_than(knapsack_fee * 100, fee_total)
186+        assert_greater_than(fee_total, best_fee * 100)
187+
188+        # 2) no exact match, effective value of the big coin is smaller than target + MIN_CHANGE

Xekyo commented at 10:12 pm on March 17, 2022:

I realized above use of “exact match” here. :)

S3RK commented at 9:08 am on March 21, 2022:

Thanks. Will retouch the comment

in test/functional/wallet_coinselection.py:189 in 1607529ea5 outdated

184+        # total fee is lower due to lucky SRD solutions
185+        assert_greater_than(knapsack_fee * 100, fee_total)
186+        assert_greater_than(fee_total, best_fee * 100)
187+
188+        # 2) no exact match, effective value of the big coin is smaller than target + MIN_CHANGE
189+        # BnB can find the same solution from previous test but dropping the change output to fees

Xekyo commented at 10:14 pm on March 17, 2022:

0        # BnB will find the same solution from previous test but dropping the change output to fees

BnB is deterministic, so if it “can” I would expect that it does? Does “change output” mean “remainder” here?—Is this the BnB without an upper bound or the variant we currently use?

S3RK commented at 9:08 am on March 21, 2022:

Any BnB, since we’re within the cost of change bound. I’ll retouch the comment

in test/functional/wallet_coinselection.py:218 in 1607529ea5 outdated

208+        assert_greater_than(srd_fee * 100, fee_total)
209+        assert_greater_than(fee_total, best_fee * 100)
210+
211+        # 4) knapsack latches to exact match + MIN_CHANGE
212+        # SRD still occasionally find better solution
213+        fee_total, max_fee = select_coins(big_coin=1.00051, target=0.99, iterations=100)

Xekyo commented at 10:18 pm on March 17, 2022:

It’s a bit hard for me to follow how the exact values were picked, since both the big_coin and target are changing. I haven’t tried to step by step reproduce each calculation in this first review, and I do see that many of your test cases do explain some of the calculation, but I just noticed that this one for example does not.

Since it’s not trivial for me to follow, I kinda expect that other reviewers might also need a bit help. :grin:

S3RK commented at 7:23 am on March 22, 2022:

This is basically the same as test case “1)” in this section, but all the values decreased by MIN_CHANGE. The idea is to demonstrate that Knapsack latches not only to down-to-satoshi matches. I’ll improve the comment

in test/functional/wallet_coinselection.py:261 in 1607529ea5 outdated

249+        # if change is > MIN_CHANGE than we just spend big coin and create a change
250+        tx = select_coins_for(49.9)
251+        assert_equal(len(tx['vin']), 1)
252+        assert_equal(len(tx['vout']), 2)
253+
254+        # if small coins > MIN_CHANGE than we just add one small coin to the big one and create change

Xekyo commented at 10:24 pm on March 17, 2022:

This comment confuses me. After paying the fees, the big coin doesn’t have enough funds to pay fees and create the output here, so we would always need both coins.

S3RK commented at 8:44 am on March 22, 2022:

Oops. The idea was that you have more than 1 small coin. If small coins > MIN_CHANGE that the tests in test_one_big_and_many_small_coins doesn’t work. Any idea how to improve that? For now, added more than one small coin in this test case

Xekyo commented at 10:24 pm on March 17, 2022: member

Cool! Concept ACK. Did a light review, haven’t reproduced all the numbers.

test: coinselection edge cases e378883892

S3RK force-pushed on Mar 22, 2022

S3RK commented at 8:51 am on March 22, 2022: member

Retouched comments to address Xekyo’s feedback.

Xekyo commented at 8:38 pm on March 23, 2022: member

Looks good. What is the next step you plan for this PR? How can one help?

S3RK commented at 8:50 am on March 24, 2022: member

I thought this PR can be merged roughly as it is (addressing all the concerns of course), as I believe the test is valuable in itself.

As a follow up I’d like to fix some of the inefficiencies of coin selection by increasing BnB upper limit. But for that we need to improve how we drop change for fees. I tried one approach in #23367 but it doesn’t work that well. So I’m going splitting #23367 into parts that could be merged independently (the first one -> #24649).

Let’s talk on the next wallet meeting.

S3RK commented at 6:27 am on March 29, 2022: member

There is probably a conflict with #24494.

achow101 commented at 4:38 pm on April 14, 2022: member

While I appreciate more tests, these tests seem to be extremely specific to the way that coin selection currently works and look like they would start failing as soon as we change something, such as removing knapsack. So I’m not sure if we actually want to have these tests if they are going to be removed and/or require a ton of recalculation for future coin selection changes.

If we do want these tests, then I don’t think it should use legacy addresses so much as hopefully we won’t make those by default soon.

S3RK commented at 6:59 am on April 19, 2022: member

While I appreciate more tests, these tests seem to be extremely specific to the way that coin selection currently works and look like they would start failing as soon as we change something, such as removing knapsack. So I’m not sure if we actually want to have these tests if they are going to be removed and/or require a ton of recalculation for future coin selection changes.

Agree, these are not your normal tests, primarily because it’s hard to define the “right” or “expected” outcome. I created these tests to expose some shortcomings of current coin selection and justify increasing BnB limit (#24752). It’d be nice to have them around in some form to avoid regressions, maybe we can incorporate them somehow into simulations. wdyt?

If we do want these tests, then I don’t think it should use legacy addresses so much as hopefully we won’t make those by default soon.

instagibbs commented at 7:27 pm on May 20, 2022: member

To throw my relatively unproductive opinion into the mix, this is a staggeringly large number of magic numbers. Ideally all numbers in tests are “obvious” or programmatic. It’s a nightmare to maintain otherwise.

Didn’t read or think enough about overall thrust of the tests.

S3RK commented at 6:06 am on May 23, 2022: member

Thanks for the feedback. Maybe a better way to verify coin selection would to integrate some simulation/benchmark into this repo.

S3RK closed this on May 23, 2022

DrahtBot locked this on May 23, 2023

test: coinselection edge cases #24580