BnB untested/unused condition in UTXO exclusion optimization #32047

issue yancyribbens openend this issue on March 12, 2025
  1. yancyribbens commented at 8:30 pm on March 12, 2025: contributor

    If the following condition is remove, no failure occurs: utxo.fee != utxo_pool.at(utxo_pool_index - 1).fee)

    https://github.com/bitcoin/bitcoin/blob/eb9730ab65887279309af338ad0201924f945ac5/src/wallet/coinselection.cpp#L177

    Furthermore, I believe that this condition isn’t actually needed and can be removed, which would slightly boost performance. The section is checking if a UTXO can be omitted from evaluation if it has the same effective_value as the previous UTXO and the previous was already omitted. The condition in question is comparing if the fee is different between the previous and the current UTXO. The only way for the fees to be different is if the effective_value is different, however the previous condition: utxo.GetSelectionAmount() != utxo_pool.at(utxo_pool_index - 1).GetSelectionAmount() already compared the effective_values.

    Put another way, if one of the conditions is true, the UTXO shortcut can not be applied. If the effective_values where not the same, then don’t do the shortcut except if the fees are the same. But the only way fees can be different is if the effective_values are different which creates a contradiction.

    Lastly, I’m struggling to understand what the motivation is to have this condition in the first place. The condition translates to: If the fee is different between the current UTXO and the previous UTXO, then exclude the current one from the selection shortcut (which as I pointed out would never happen because of the aforementioned contradiction where the effective_value and fee are conditionally dependent).

  2. yancyribbens commented at 8:39 pm on March 12, 2025: contributor
  3. yancyribbens commented at 8:57 pm on March 12, 2025: contributor
    I also just wanted to add that, even if there was the extremely unlikely event where two utxos had the same effective_value and different fee values, the optimization should only compare effective_values as I understand it.
  4. Nouridin commented at 10:32 pm on March 12, 2025: none
    In summary, while the fee check may be redundant given the current relationships between effective value and fee, it was likely originally added to ensure that the shortcut is only applied when both the effective value and the spending fee are identical. Since testing shows that removing it does not lead to failures, it appears safe to remove it, but any change in this well‐tuned coin selection code must be approached with caution given the subtle and sometimes unexpected interactions in coin selection. (at least that what I believe)
  5. fanquake added the label Wallet on Mar 13, 2025
  6. yancyribbens commented at 11:56 am on March 13, 2025: contributor
    I believe it’s possible for effective_value to not be equal between two successive UTXO combinations while two fees are equal. This would have the consequence of not applying the optimization when in principle the optimization should be applied. In fact, I think it’s very likely for two fees to be equal since it’s likely the fee_rates are equal and the weight is equal of two UTXOs. Therefore, leaving this condition in has the disadvantage of nullifying the effect of the UTXO optimization performance increase. Earlier I believe I misspoke when I said it was unlikely for effective_value and fee to contradict each other per UTXO, but I believe adding this parameter is unneeded and is likely to worsen performance.
  7. Nouridin commented at 12:07 pm on March 13, 2025: none

    Your reasoning is sound. If you consider that many UTXOs will have equal fees—because they share the same fee rate and weight—then requiring both the effective value and the fee to match before applying the shortcut can indeed be overly strict. In scenarios where the fees are identical but the effective values differ slightly (perhaps due to minor rounding or variation in the UTXO values), the condition utxo.fee != utxo_pool.at(utxo_pool_index - 1).fee

    will cause the optimization to be bypassed. This means that even if the fee component is effectively the same across UTXOs, any difference in effective value prevents the shortcut from being used, thereby nullifying the potential performance improvement.

    Removing the fee comparison would allow the optimization to trigger based solely on matching effective values. This is particularly attractive if, in practice, the fee rates and weights are uniform enough that fees are almost always equal. Of course, such a change would need thorough testing to ensure that no edge cases are missed where the fee difference might indicate a meaningful divergence between UTXOs.

    In summary, if the vast majority of your UTXOs have equal fees due to identical fee rates and weights, then the fee check can indeed be seen as an unnecessary constraint that prevents the coin selection optimization from being applied, potentially worsening performance.

  8. yancyribbens commented at 3:04 pm on March 17, 2025: contributor

    I did some testing on this issue and I found that if the parameter is removed, one of tests finishes in 3 fewer iterations, from 38 iterations to 35. The reason the test is more efficient without the added parameter is that there are two UTXOs with the same effective value of 5000000 sats but different weights and fees:

    0UTXO A
    1value: 5000000
    2fee: 204
    3weight: 272
    4
    5UTXO B
    6value: 5000000
    7fee: 1200000
    8weight: 1600000
    

    During the selection, if UTXO A is found to not contribute towards a solution, the algorithm should dismiss UTXO B as also not contributing toward the solution and skip without an iteration consumed. However, since the parameter will only skip if the fees are equivalent, the UTXO is instead evaluated and _not_skipped.

    A few other observations about this test is that the fee of UTXO B is not a practical value because the weight is extreme (1600000). However, even if the fee and weight are reasonable and different, the same fault would occur during evaluation of the UTXO skip optimization. Furthermore, this code could also be improved by skipping evaluation of the UTXO if the weight is greater than max_selection_weight which it is by far.

    Lastly, I tried to track down when the parameter was added in hopes of understanding why. Unfortunately the param was present in the origin PR here by @achow101 without an explanation from what I can tell. So I’m at a loss to understand why this parameter exists which appears to degrade performance without any added benefit that I can find.

    The recommendations I have to improve the state of the algorithm:

    • Remove the parameter which is degrading performance in some occasions where two effective values are the same but have different fees
    • Add an iteration count to the API so it can be clearly seen when performance improves or degrades in terms of iteration count
    • Skip evaluating UTXOs that are greater than max_selection_weight
  9. murchandamus commented at 0:09 am on March 19, 2025: contributor

    The only way for the fees to be different is if the effective_value is different, however the previous condition: utxo.GetSelectionAmount() != utxo_pool.at(utxo_pool_index - 1).GetSelectionAmount() already compared the effective_values.

    This is not accurate. You could have two UTXOs of different output types that have the same effective value but differing fees. Counterexample:

    Feerate: 10 sat/vB

    UTXO_1: type: P2WPKH, amount: 10,680 sats input weight: 68 vB fee: 680 sats eff_value: 10,000

    UTXO_2: type: P2TR amount: 10,575 sats input_weight: 57.5 vB fee: 575 sats eff_value: 10,000

    However, since CoinGrinder got merged UTXO sorting tiebreaks by preferring lower waste on equal effective value, so yes, if the first UTXO is skipped combining with a later UTXO of same effective value will always have worse waste, so I think skipping a subsequent UTXO even when the fee differs is now correct.

    I.e. at low feerates, P2WPKH would sort before P2TR when effective value ties, and at high feerates P2TR would sort before P2WPKH when effective value ties.

  10. bitcoin deleted a comment on Mar 19, 2025
  11. yancyribbens commented at 6:33 pm on March 19, 2025: contributor

    This is not accurate. You could have two UTXOs of different output types that have the same effective value but differing fees. Counterexample:

    Yes, I redacted that comment later in the thread as my understand of the problem improved. As I posted last, there’s even a test here that tests two UTXOs with the same effective_value yet different fees.

    However, since CoinGrinder got merged UTXO sorting tiebreaks by preferring lower waste on equal effective value, so yes, if the first UTXO is skipped combining with a later UTXO of same effective value will always have worse waste, so I think skipping a subsequent UTXO even when the fee differs is now correct

    Ah that’s interesting. I see now why the param was originally added. Before coin_grinder was added which imposed a sort order on waste and effective value, it was possible for two UTXOs to have the same effective_value yet different waste scores. That is, two UTXOs can have the same effective_value but a different future effective_value as calculated with long_term_fee_rate. I agree though that now the parameter is superfluous where as before the sort order was improved, it did make a difference.

  12. yancyribbens commented at 6:41 pm on March 19, 2025: contributor
    However, doesn’t the waste score account for long_term_fee_rate as well as fee_rate? In other-words, just comparing fee amounts and not long_term_fee_rate would not be enough to know if comparing two UTXOs have different waste scores..
  13. murchandamus commented at 5:46 pm on March 20, 2025: contributor
    The long_term_fee_rate and fee_rate are the same for all UTXOs in a single coin selection attempt, so if two UTXOs have the same fee, their inputs have the same weight, if they also have the same effective value, they are interchangeable. If they only have the same effective value but different fees, it means that they have differing weights. Similarly, waste simply scales with the weight across UTXOs, as it is weight times long_term_fee_rate - fee_rate. So, if two UTXOs have the same effective value, but different waste scores, it is congruent to them having different fees.
  14. yancyribbens commented at 3:33 pm on March 21, 2025: contributor

    The long_term_fee_rate and fee_rate are the same for all UTXOs in a single coin selection attempt

    Agree. I think that’s a little confusing the way the C++ algorithm is structured and treats fee_rate and long_term_fee_rate per UTXO instead of just passing in the fee_rate params as a per sort argument.

    so if two UTXOs have the same fee, their inputs have the same weight

    Right, since fee is derived from fee_rate and weight, and fee_rate is guaranteed to be the same, the only true variable is weight.

    if they also have the same effective value, they are interchangeable.

    Yes, so they are interchangeable if they have the same weight and the same value since effective_value is derived from fee_rate (not variable between UTXOs), value and weight.

    If they only have the same effective value but different fees, it means that they have differing weights.

    Yes, it follows given that the only variable of consequence have been shown to be value and weight.

    Similarly, waste simply scales with the weight across UTXOs, as it is weight times long_term_fee_rate - fee_rate

    Alright, so that means waste is derivative of Weight since long_term_fee_rate and fee_rate are both non-variable between UTXOs.

    So, if two UTXOs have the same effective value, but different waste scores, it is congruent to them having different fees.

    And different Weights.

    In summery, two UTXOs are equivalent if both value and weight are the same, and if value is the same but weight is different, prefer the UTXO with lower weight since waste score is a derivative of weight. Therefore, if UTXOs are sorted by value with a tie breaker of weight, then if UTXO A was a skip during selection and a successive UTXO B has the same value, it can also be skipped regardless of fee given it’s sort order. A sort by only effective_value would not be enough to make such a guarantee since effective_values between two successive UTXOs could have the same effective_value but different weights and therefore different waste scores.

    Also, sorting UTXOs by the ratio of value/weight would have the same benefit as sorting by value with tie breaker of weight but that’s besides the point.

  15. murchandamus commented at 6:36 pm on March 21, 2025: contributor

    In summery, two UTXOs are equivalent if both value and weight are the same, and if value is the same but weight is different, prefer the UTXO with lower weight since waste score is a derivative of weight.

    Close, but there is a subtlety here. If we were sorting by effective_value and tie-breaking fee or weight, we would always be preferring lower weight, but instead, we tie-break on waste. That means that we will prefer lower weight if fee_rate is higher than long_term_fee_rate, but we will prefer higher weight if fee_rate is lower than long_term_fee_rate. This dynamic ensures that we prefer spending old, less blockspace-efficient UTXOs at low feerates, but prefer more blockspace-efficient UTXOs at high feerates.

    Therefore, if UTXOs are sorted by value with a tie breaker of weight, then if UTXO A was a skip during selection and a successive UTXO B has the same value, it can also be skipped regardless of fee given it’s sort order. A sort by only effective_value would not be enough to make such a guarantee since effective_values between two successive UTXOs could have the same effective_value but different weights and therefore different waste scores.

    Correct, and this still works with our waste-based sort order as intended: if we are in a low-feerate situation, the heaviest UTXOs are sorted first, and we skip same-weight or lighter UTXOs of the same effective value. If we are in a high-feerate situation, we skip same-weight and heavier UTXOs of the same effective value, because in both situation the subsequent UTXOs would lead to a similar selection with a worse waste score.

    Also, sorting UTXOs by the ratio of value/weight would have the same benefit as sorting by value with tie breaker of weight but that’s besides the point.

    Unfortunately not. We explored sorting by descending value density while developing CoinGrinder, and the problem with that approach is that two subsequent UTXO no longer make similar progress toward the selection_target and may not be good replacements for each other (e.g., a P2PKH UTXO with 148,000 sats and a P2TR UTXO with 57,500 sats have the same value density). It lead to CoinGrinder sometimes finding the best solution much faster, but sometimes finding it slower. It just traversed the combination space in a radically different order. I think that it would not be compatible with most of the optimizations we found for CoinGrinder, although there might be other similar ones. We had wondered whether it might generally be stronger for CoinGrinder, but discarded the approach after determining that it was just different—vying instead to keep it similar to BnB and easier to reason about. I think that sorting by value density would not work at all for BnB, as I would expect it to significantly reduce the opportunities to bound the search space if we cannot skip many subtrees in the search graph after overshooting the target.

  16. yancyribbens commented at 1:06 pm on March 22, 2025: contributor

    Close, but there is a subtlety here..

    Thanks, that’s a good point I might have overlooked, that in a high fee environment, the tie break on weight is different than in a low fee environment.

    Unfortunately not. We explored sorting by descending value density while developing CoinGrinder..

    Hmm that make sense now that you say it. sorting by density isn’t strictly equivalent to sorting by value with weight as a tie break and as such would change the behavior of the search.

    So, it sounds like we are in agreement that the fee comparison in BnB can be removed. I’m tempted to explore adding an iteration count to BnB like coin-grinder since that would make the change nicer. IE if we had an iteration count, then removing the parameter wouldn’t require any test changes because it would be apparent that in test the iteration count was reduced with the removal of the condition. Adding an iteration count would also strengthen the tests now and against future changes. However, the easiest change for this would be to add a test where 5k or so inputs have the same effective_value and different fee values and a solution can’t be found because the condition causes the UTXO shortcut to not be evaluated. And maybe a test to show it behaves different in low-fee vs high-fee environment.

  17. murchandamus commented at 6:59 pm on March 22, 2025: contributor

    So, it sounds like we are in agreement that the fee comparison in BnB can be removed.

    Yes, it looks to me like it should actually be an optimization to remove it.

    I’m tempted to explore adding an iteration count to BnB like coin-grinder since that would make the change nicer. IE if we had an iteration count, then removing the parameter wouldn’t require any test changes because it would be apparent that in test the iteration count was reduced with the removal of the condition. Adding an iteration count would also strengthen the tests now and against future changes.

    Yeah, I have had in my backlog that I would like to refactor BnB into the style of CoinGrinder since I wrote CoinGrinder, because I think it would be easier to understand, the iteration count would make more sense (currently it counts and evaluates the backtracking steps as well, but they can never yield a new solution!), and maybe easier to test. Given that my new job is explicitly more focused on working on the wallet, I was hoping to get to that once we are in the new office in April, but it might not be the highest priority.

    However, the easiest change for this would be to add a test where 5k or so inputs have the same effective_value and different fee values and a solution can’t be found because the condition causes the UTXO shortcut to not be evaluated. And maybe a test to show it behaves different in low-fee vs high-fee environment.

    I could have sworn we had such a test already, but if we don’t that would be a great addition to the bnb_feerate_sensitivity_test. You or I could add it there.

  18. yancyribbens commented at 4:01 pm on March 23, 2025: contributor

    Yeah, I have had in my backlog that I would like to refactor BnB into the style of CoinGrinder since I wrote CoinGrinder, because I think it would be easier to understand, the iteration count would make more sense (currently it counts and evaluates the backtracking steps as well, but they can never yield a new solution!), and maybe easier to test. Given that my new job is explicitly more focused on working on the wallet, I was hoping to get to that once we are in the new office in April, but it might not be the highest priority.

    That would be great to update BnB to be more like coin-grinder. Adding an iteration count would be a good first step towards that.

    I could have sworn we had such a test already, but if we don’t that would be a great addition to the bnb_feerate_sensitivity_test. You or I could add it there.

    That sounds good. What is preventing this PR from being merged? Are you waiting on more Acks?

  19. murchandamus commented at 8:38 pm on March 24, 2025: contributor

    That would be great to update BnB to be more like coin-grinder. Adding an iteration count would be a good first step towards that.

    I don’t think it makes sense to add just the iteration count, if it should be refactored completely anyway.—Better to do it all in one.

    I could have sworn we had such a test already, but if we don’t that would be a great addition to the bnb_feerate_sensitivity_test. You or I could add it there.

    That sounds good. What is preventing this PR from being merged? Are you waiting on more Acks?

    I extended the feerate sensitivity test to show that it prefers two heavy inputs at low feerate over two light inputs and vice versa at high feerate: https://github.com/bitcoin/bitcoin/pull/29532/commits/afd4b807ff1300e4f74ceab6a683f3ff1376369d#diff-36088c93368e137d955348aba223985bd4f198f2aaecd626c830f4612ca884c8R139-R146

    But we are getting a bit off-topic here. Do you want to open a PR with the discussed optimization?

  20. yancyribbens commented at 5:52 pm on March 25, 2025: contributor

    I don’t think it makes sense to add just the iteration count, if it should be refactored completely anyway.—Better to do it all in one.

    That would be great if you manage to find the time to do it.

    I extended the feerate sensitivity test to show that it prefers two heavy inputs at low feerate over two light inputs and vice versa at high feerate: https://github.com/bitcoin/bitcoin/commit/afd4b807ff1300e4f74ceab6a683f3ff1376369d#diff-36088c93368e137d955348aba223985bd4f198f2aaecd626c830f4612ca884c8R139-R146

    Thanks, I noticed that. Although I think it’s somewhat orthogonal to the discussed optimization.

    But we are getting a bit off-topic here. Do you want to open a PR with the discussed optimization?

    Sure, I can. I think along with removing the condition, a test should go with the commit which shows that if two UTXOs with the same effective_value and different weights are selection candidates, the heavier more expensive of the two is selected when fees are low, and the cheaper lighter of the two is selected when fees are high. Similar to the test you made above but with equal effective_values which tests the skip optimization performs correctly when the fee comparison is removed .

  21. murchandamus commented at 2:34 am on March 27, 2025: contributor
    It was easier than expected: https://github.com/bitcoin/bitcoin/pull/32150

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-03-28 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me