test: coinselection edge cases #24580

pull S3RK wants to merge 1 commits into bitcoin:master from S3RK:coinselection_tests changing 2 files +269 −0
  1. S3RK commented at 8:39 am on March 16, 2022: member

    This PR adds tests for some edge cases in coin selection.

    There are 4 sections by the type of utxo pool:

    1. a few small inputs of equal value
    2. one big input and many small inputs
    3. just one input
    4. one big and one small input
  2. S3RK force-pushed on Mar 16, 2022
  3. fanquake added the label Tests on Mar 16, 2022
  4. fanquake requested review from Xekyo on Mar 16, 2022
  5. in test/functional/wallet_coinselection.py:44 in 1607529ea5 outdated
    36+        tx1 = w0.sendtoaddress(w1.getnewaddress("", "bech32"), 1)
    37+        tx2 = w0.sendtoaddress(w1.getnewaddress("", "bech32"), 1)
    38+        tx3 = w0.sendtoaddress(w1.getnewaddress("", "legacy"), 1)
    39+        self.generate(self.nodes[0], 1)
    40+
    41+        # TODO: set long_term_fee_rate explicitly
    


    Xekyo commented at 9:58 pm on March 16, 2022:
    I noticed there is a todo left here. Is that an open question?

    S3RK commented at 8:07 am on March 22, 2022:
    fixed that
  6. in test/functional/wallet_coinselection.py:146 in 1607529ea5 outdated
    141+            w0.sendtoaddress(w1.getnewaddress("", "legacy"), 0.05)
    142+
    143+        # We will target a solution with 20 coins that will add up exactly to 1btc.
    144+        # Add a coin that will cover all the fees and exactly match the target.
    145+        # TODO: fix that knapsack targets change even if there is no change needed
    146+        # (148*20+42+31)*350+5000000 = 6061550
    


    Xekyo commented at 8:51 pm on March 17, 2022:
    What’s the 42 here? Is that the header plus a P2SH output?

    S3RK commented at 8:52 am on March 21, 2022:
    42 is header (11) + p2pwkh (31). I’ll updated the comment
  7. in test/functional/wallet_coinselection.py:88 in 1607529ea5 outdated
    83+        # TODO: implement optimization and uncomment the assertions
    84+        # assert_equal(len(tx['vin']), 1)
    85+        # assert_equal(len(tx['vout']), 1)
    86+        # assert(tx['vin'][0]['txid'] in [tx1, tx2])
    87+
    88+        # fees for 1 legacy input tx = (148+42)*20 = 3800sat
    


    Xekyo commented at 9:03 pm on March 17, 2022:
    I’m wondering what the 42 is here. Is that the output plus header? My first intuition was that it was just the output, but then I realized that it probably is the whole transaction weight that’s being calculated. Does that mean that we’re expecting a wrapped segwit output here? Perhaps you could split up the calculation a bit more for clarity, e.g. (10+148+32), if I interpret that correctly.

    S3RK commented at 8:53 am on March 21, 2022:
    Yes, 42 is output + header. I’ll update the comment
  8. in test/functional/wallet_coinselection.py:61 in 1607529ea5 outdated
    56+
    57+        def select_coins_for(target):
    58+            result = w1.fundrawtransaction(w1.createrawtransaction([], {dummyaddress: target}), {"fee_rate": fee_rate})
    59+            return w1.decoderawtransaction(result['hex'])
    60+
    61+        # 1-input-1-output tx is 110vbytes which results in 2200sat fee at 20sat/vbyte
    


    Xekyo commented at 9:09 pm on March 17, 2022:
    0        # 1-input-1-output tx is 110vbytes (42/4+68+31) which results in 2200sat fee at 20sat/vbyte
    

    S3RK commented at 8:54 am on March 21, 2022:
    I’ll update the comment
  9. in test/functional/wallet_coinselection.py:111 in 1607529ea5 outdated
    106+        assert_equal(len(tx['vin']), 1)
    107+        assert_equal(len(tx['vout']), 1)
    108+        assert_equal(tx['vin'][0]['txid'], tx3)
    109+
    110+        # a solution with two bech32 inputs and a change is the most efficient
    111+        # but it could only be found by SRD
    


    Xekyo commented at 9:27 pm on March 17, 2022:

    I was wondering for a bit why the solution could not be found by Knapsack, but then I realized that it probably would, but then a solution with one bech32 input and one legacy input would be preferred because it overshoots the target less by spending more in fees. Given the 1000 iterations, Knapsack would definitely find that solution.

    However, then if SRD also found the solution with two bech32 inputs, it would get preferred per the waste metric. Was that your reasoning as well?


    S3RK commented at 8:59 am on March 21, 2022:
    Exactly. Knapsack prefers a solution with legacy+bech32 as it’s closer to the target + MIN_CHANGE due to higher fees. I’ll update the comment
  10. in test/functional/wallet_coinselection.py:127 in 1607529ea5 outdated
    118+        # This is more than cost of change and BnB won't find a solution with single input
    119+        # We either get
    120+        # 1) 2x bech32 solution (SRD only)
    121+        # 2) legacy + bech32 solution
    122+        tx = select_coins_for(0.99994899)
    123+        # legacy + dropping more for fees is still better in terms of waste than legacy + bech32
    


    Xekyo commented at 9:34 pm on March 17, 2022:
    What do you think would be the outcome if we changed knapsack to keep the solution with the best waste instead of the solution with the smallest change?

    S3RK commented at 7:31 am on March 22, 2022:
    To keep the solution with best waste out of a 1000 trials? Hm.. that would fix this test case, but it’d make Knapsack behave exactly like running SRD in a loop. I guess we can then just drop knapsack altogether.
  11. in test/functional/wallet_coinselection.py:144 in 1607529ea5 outdated
    135+        w0 = self.nodes[0].get_wallet_rpc(self.default_wallet_name)
    136+        self.nodes[0].createwallet(wallet_name="test_one_big_and_many_small_coins")
    137+        w1 = self.nodes[0].get_wallet_rpc("test_one_big_and_many_small_coins")
    138+
    139+        # Create a bunch of small coins
    140+        for i in range(100):
    


    Xekyo commented at 9:37 pm on March 17, 2022:
    I was just looking into this user asking about a transaction with 856 inputs a few days ago. I’d be curious to see if we could still reproduce behavior like that on a wallet with a UTXO pool of only about 1,000. I think SRD would hopefully make that better, but I would say we could even go higher than 100 here. :)

    S3RK commented at 7:46 am on March 22, 2022:

    Actually I was trying to keep the number of small coins as a low as possible while still demonstrating some inefficiencies of coin selection.

    1. The more coins we have -> the more time the test takes.
    2. The more coins we have -> the harder it’s to debug and to reason about.

    What is the test scenario that would be possible with 1000 coins, but not possible with 100 coins?

    P.S. I think we can definitely reproduce this scenario, I see no reasons why not? If utxo pool is equivalent to this test cases then even with SRD it will find better solution in ~856/1000 = 85%, which leaves 15% to a really bad knapsack alternative.


    Xekyo commented at 5:08 pm on March 22, 2022:

    Yeah, I think it would be easy to produce the scenario:

    • 100 small coins, 1 big coin • no BnB solution • a large set of the small coins produces a smaller change output than the 1 big coin, so knapsack prefers the former • SRD only finds the 1 big coin by itself as a solution in ~1/101 cases

    ⇒ usually SRD finds the best, but it’s a worse waste than best solution, but better than knapsack

    One could probably make it even less likely that the best solution is found by SRD by having two big ones that are both necessary


    S3RK commented at 8:18 am on March 23, 2022:
    Isn’t this exactly test case 3) in test_one_big_and_many_small_coins?
  12. in test/functional/wallet_coinselection.py:149 in 1607529ea5 outdated
    140+        for i in range(100):
    141+            w0.sendtoaddress(w1.getnewaddress("", "legacy"), 0.05)
    142+
    143+        # We will target a solution with 20 coins that will add up exactly to 1btc.
    144+        # Add a coin that will cover all the fees and exactly match the target.
    145+        # TODO: fix that knapsack targets change even if there is no change needed
    


    Xekyo commented at 9:40 pm on March 17, 2022:
    I’m not sure we should be improving Knapsack. I’d rather get rid of it, but only after we have better coin selection than Knapsack. @achow101 and I were looking into why Knapsack still produces such a high percentage of the selection results in simulations and realized that after all those 1000 random walks, it’ll still prefer lowest_larger if the latter produces a smaller change (that’s still bigger than min_change).

    Xekyo commented at 9:43 pm on March 17, 2022:
    Also, even when Knapsack can’t find changeless transactions, wouldn’t the expectation be that BnB finds it?

    S3RK commented at 7:28 am on March 22, 2022:

    I’m not really arguing for improving it. I guess maybe one way that fix that would be to just remove Knapsack that targets no change and leave only knapsack that targets a MIN_CHANGE.

    P.S. BnB won’t produce same solutions even in the situations when a changeless solution is possible, because BnB optimizes for waste. But I think this is okay.

  13. in test/functional/wallet_coinselection.py:169 in 1607529ea5 outdated
    164+            # lock big coin
    165+            w1.lockunspent(False, [tx for tx in w1.listunspent() if tx['txid'] == txid])
    166+            return fee_total, max_fee
    167+
    168+        # in all the scenarios below the best solution would be to take one big coin and create a change output
    169+        best_fee = (42 + 68 + 31) * fee_rate / COIN
    


    Xekyo commented at 9:51 pm on March 17, 2022:
    42 kinda throws me off a bit as before, but still assuming it’s header + output

    S3RK commented at 9:04 am on March 21, 2022:
    yes, it’s. I’ll split off the calculation
  14. in test/functional/wallet_coinselection.py:184 in 1607529ea5 outdated
    175+        assert_approx(fee_total, best_fee * 100)
    176+
    177+        # 1) knapsack latches to the exact match solution and can't find the best solution
    178+        # SRD can find a solution better than knapsack if draws the big coin in the first 21 tries
    179+        # chances for SRD to find it are roughly 2*(21/102) = ±40%.
    180+        # Chances are doubled due to coinselection run twice with and without APS enabled
    


    Xekyo commented at 10:05 pm on March 17, 2022:
    What’s “APS”?

    S3RK commented at 9:04 am on March 21, 2022:
    avoid partial spends. We run transaction building two times (with and without aps)

    jonatack commented at 8:55 am on March 22, 2022:

    What’s “APS”?

    In more detail, see the configuration flag -maxapsfee added by #14582 in 0.21, which sets the max allowed avoid partial spends (APS) fee. It defaults to 0 (fee is the same with and without APS), and setting it to -1 will disable APS, unless -avoidpartialspends is set.

  15. in test/functional/wallet_coinselection.py:183 in 1607529ea5 outdated
    174+        assert_approx(max_fee, best_fee)
    175+        assert_approx(fee_total, best_fee * 100)
    176+
    177+        # 1) knapsack latches to the exact match solution and can't find the best solution
    178+        # SRD can find a solution better than knapsack if draws the big coin in the first 21 tries
    179+        # chances for SRD to find it are roughly 2*(21/102) = ±40%.
    


    Xekyo commented at 10:08 pm on March 17, 2022:
    I think the chance is (1-(101/102)×(100/101)×…×(81/82))=(1-81/102)… oh, I see, that’s (21/102). So, to get a better SRD solution that would need to occur at least once, which is 1-(81/102)² = 36.93%… okay, I gotcha.

    S3RK commented at 9:07 am on March 21, 2022:
    Yes. that is correct. Not sure if I can improve the comment here somehow
  16. in test/functional/wallet_coinselection.py:177 in 1607529ea5 outdated
    172+        # knapsack always finds the best solution
    173+        fee_total, max_fee = select_coins(big_coin=1.01051, target=1.00000001, iterations=100)
    174+        assert_approx(max_fee, best_fee)
    175+        assert_approx(fee_total, best_fee * 100)
    176+
    177+        # 1) knapsack latches to the exact match solution and can't find the best solution
    


    Xekyo commented at 10:12 pm on March 17, 2022:
    We’ve previously used “exact match” for any changeless solution, so it wasn’t immediately obvious to me that it was used in the literal sense here. Perhaps you should clarify that you mean that it’s literally the same to the sat.

    S3RK commented at 9:07 am on March 21, 2022:
    Will update the comment

    S3RK commented at 8:35 am on March 22, 2022:
    Updated, but I think we should stop using “exact match” in place of changeless solution. Changeless solution is a solution in the range around exact match constrained by the cost of change/discard fee rate.
  17. in test/functional/wallet_coinselection.py:188 in 1607529ea5 outdated
    183+        assert_approx(max_fee, knapsack_fee)
    184+        # total fee is lower due to lucky SRD solutions
    185+        assert_greater_than(knapsack_fee * 100, fee_total)
    186+        assert_greater_than(fee_total, best_fee * 100)
    187+
    188+        # 2) no exact match, effective value of the big coin is smaller than target + MIN_CHANGE
    


    Xekyo commented at 10:12 pm on March 17, 2022:
    I realized above use of “exact match” here. :)

    S3RK commented at 9:08 am on March 21, 2022:
    Thanks. Will retouch the comment
  18. in test/functional/wallet_coinselection.py:189 in 1607529ea5 outdated
    184+        # total fee is lower due to lucky SRD solutions
    185+        assert_greater_than(knapsack_fee * 100, fee_total)
    186+        assert_greater_than(fee_total, best_fee * 100)
    187+
    188+        # 2) no exact match, effective value of the big coin is smaller than target + MIN_CHANGE
    189+        # BnB can find the same solution from previous test but dropping the change output to fees
    


    Xekyo commented at 10:14 pm on March 17, 2022:
    0        # BnB will find the same solution from previous test but dropping the change output to fees
    

    BnB is deterministic, so if it “can” I would expect that it does? Does “change output” mean “remainder” here?—Is this the BnB without an upper bound or the variant we currently use?


    S3RK commented at 9:08 am on March 21, 2022:
    Any BnB, since we’re within the cost of change bound. I’ll retouch the comment
  19. in test/functional/wallet_coinselection.py:218 in 1607529ea5 outdated
    208+        assert_greater_than(srd_fee * 100, fee_total)
    209+        assert_greater_than(fee_total, best_fee * 100)
    210+
    211+        # 4) knapsack latches to exact match + MIN_CHANGE
    212+        # SRD still occasionally find better solution
    213+        fee_total, max_fee = select_coins(big_coin=1.00051, target=0.99, iterations=100)
    


    Xekyo commented at 10:18 pm on March 17, 2022:

    It’s a bit hard for me to follow how the exact values were picked, since both the big_coin and target are changing. I haven’t tried to step by step reproduce each calculation in this first review, and I do see that many of your test cases do explain some of the calculation, but I just noticed that this one for example does not.

    Since it’s not trivial for me to follow, I kinda expect that other reviewers might also need a bit help. :grin:


    S3RK commented at 7:23 am on March 22, 2022:
    This is basically the same as test case “1)” in this section, but all the values decreased by MIN_CHANGE. The idea is to demonstrate that Knapsack latches not only to down-to-satoshi matches. I’ll improve the comment
  20. in test/functional/wallet_coinselection.py:261 in 1607529ea5 outdated
    249+        # if change is > MIN_CHANGE than we just spend big coin and create a change
    250+        tx = select_coins_for(49.9)
    251+        assert_equal(len(tx['vin']), 1)
    252+        assert_equal(len(tx['vout']), 2)
    253+
    254+        # if small coins > MIN_CHANGE than we just add one small coin to the big one and create change
    


    Xekyo commented at 10:24 pm on March 17, 2022:
    This comment confuses me. After paying the fees, the big coin doesn’t have enough funds to pay fees and create the output here, so we would always need both coins.

    S3RK commented at 8:44 am on March 22, 2022:
    Oops. The idea was that you have more than 1 small coin. If small coins > MIN_CHANGE that the tests in test_one_big_and_many_small_coins doesn’t work. Any idea how to improve that? For now, added more than one small coin in this test case
  21. Xekyo commented at 10:24 pm on March 17, 2022: member
    Cool! Concept ACK. Did a light review, haven’t reproduced all the numbers.
  22. test: coinselection edge cases e378883892
  23. S3RK force-pushed on Mar 22, 2022
  24. S3RK commented at 8:51 am on March 22, 2022: member
    Retouched comments to address Xekyo’s feedback.
  25. Xekyo commented at 8:38 pm on March 23, 2022: member
    Looks good. What is the next step you plan for this PR? How can one help?
  26. S3RK commented at 8:50 am on March 24, 2022: member

    I thought this PR can be merged roughly as it is (addressing all the concerns of course), as I believe the test is valuable in itself.

    As a follow up I’d like to fix some of the inefficiencies of coin selection by increasing BnB upper limit. But for that we need to improve how we drop change for fees. I tried one approach in #23367 but it doesn’t work that well. So I’m going splitting #23367 into parts that could be merged independently (the first one -> #24649).

    Let’s talk on the next wallet meeting.

  27. S3RK commented at 6:27 am on March 29, 2022: member
    There is probably a conflict with #24494.
  28. achow101 commented at 4:38 pm on April 14, 2022: member

    While I appreciate more tests, these tests seem to be extremely specific to the way that coin selection currently works and look like they would start failing as soon as we change something, such as removing knapsack. So I’m not sure if we actually want to have these tests if they are going to be removed and/or require a ton of recalculation for future coin selection changes.

    If we do want these tests, then I don’t think it should use legacy addresses so much as hopefully we won’t make those by default soon.

  29. S3RK commented at 6:59 am on April 19, 2022: member

    While I appreciate more tests, these tests seem to be extremely specific to the way that coin selection currently works and look like they would start failing as soon as we change something, such as removing knapsack. So I’m not sure if we actually want to have these tests if they are going to be removed and/or require a ton of recalculation for future coin selection changes.

    Agree, these are not your normal tests, primarily because it’s hard to define the “right” or “expected” outcome. I created these tests to expose some shortcomings of current coin selection and justify increasing BnB limit (#24752). It’d be nice to have them around in some form to avoid regressions, maybe we can incorporate them somehow into simulations. wdyt?

    If we do want these tests, then I don’t think it should use legacy addresses so much as hopefully we won’t make those by default soon.

  30. instagibbs commented at 7:27 pm on May 20, 2022: member

    To throw my relatively unproductive opinion into the mix, this is a staggeringly large number of magic numbers. Ideally all numbers in tests are “obvious” or programmatic. It’s a nightmare to maintain otherwise.

    Didn’t read or think enough about overall thrust of the tests.

  31. S3RK commented at 6:06 am on May 23, 2022: member
    Thanks for the feedback. Maybe a better way to verify coin selection would to integrate some simulation/benchmark into this repo.
  32. S3RK closed this on May 23, 2022

  33. DrahtBot locked this on May 23, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-19 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me