refactor: Simplify backtrack logic #25932

yancyribbens commented at 7:50 PM on August 25, 2022: contributor

This is a small nit, however I think it's more understandable to write:

utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee

vs

(utxo_pool.at(0).fee - utxo_pool.at(0).long_term_fee) > 0

glozow requested review from murchandamus on Aug 25, 2022

yancyribbens force-pushed on Aug 25, 2022

DrahtBot added the label Refactoring on Aug 25, 2022

yancyribbens force-pushed on Aug 26, 2022

AryanJ-NYC commented at 1:22 AM on August 27, 2022: none

Concept ACK, Approach ACK

DrahtBot cross-referenced this on Sep 12, 2022 from issue wallet: Remove redundant waste calculation by yancyribbens

DrahtBot commented at 3:43 PM on September 13, 2022: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	Xekyo, aureleoules, achow101
Approach ACK	AryanJ-NYC

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

No conflicts as of last run.

in src/wallet/coinselection.cpp:96 in 1a2d94ac58 outdated

  92 | @@ -93,7 +93,7 @@ std::optional<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_poo
  93 |          bool backtrack = false;
  94 |          if (curr_value + curr_available_value < selection_target || // Cannot possibly reach target with the amount remaining in the curr_available_value.
  95 |              curr_value > selection_target + cost_of_change || // Selected value is out of range, go back and try other branch
  96 | -            (curr_waste > best_waste && (utxo_pool.at(0).fee - utxo_pool.at(0).long_term_fee) > 0)) { // Don't select things which we know will be more wasteful if the waste is increasing
  97 | +            (curr_waste > best_waste && (utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee))) { // Don't select things which we know will be more wasteful if the waste is increasing

murchandamus commented at 8:01 PM on September 15, 2022:

ACK 1a2d94ac586e2ecb55994205fceec14322a949fd

Since the feerate is loop invariant, that boolean will always evaluate the same across one coin selection attempt, and it would be even better to just pull it out of the loop:

    [before loop]
    bool is_high_feerate = utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee;
    […]
            (is_high_feerate && curr_waste > best_waste)) { // Don't select things which we know will be more wasteful if the waste is increasing

yancyribbens commented at 10:02 AM on September 16, 2022:

I considered this as well. To my surprise moving the comparison outside of the loop didn't make any benchmark difference. My guess is the compiler must be smart enough to do some optimization, or the benchmark tests aren't sensitive enough to detect a change.

That bieng said, I do think it's more readable outside of the for loop (and doesn't need to be compared every iteration). @Xekyo does this look better https://github.com/bitcoin/bitcoin/pull/25932/commits/8ada8bb313eb95d55e9e1ac121bc6143a569e896?

yancyribbens commented at 12:26 PM on September 16, 2022:

Maybe this is better as two separate commits. https://github.com/bitcoin/bitcoin/pull/25932/commits/365aca40453995163bbd17231251512f9f9a103b and https://github.com/bitcoin/bitcoin/pull/25932/commits/953ce061d5e8392d9f4c5bcedd27a2eeb08296db

murchandamus cross-referenced this on Sep 15, 2022 from issue coin selection: Check whether feerate is higher than long_term_feerate only once by murchandamus

yancyribbens force-pushed on Sep 16, 2022

refactor: Simplify feerate comparison statement 365aca4045

in src/wallet/coinselection.cpp:92 in 953ce061d5 outdated

  86 | @@ -87,13 +87,18 @@ std::optional<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_poo
  87 |      std::vector<size_t> best_selection;
  88 |      CAmount best_waste = MAX_MONEY;
  89 |  
  90 | +    // Check if the current feerate is expensive.  If it is expensive, we minimize the size of
  91 | +    // the result set and thus minimizing the transaction fee.  If the fees are not expensive,
  92 | +    // we maximize the result set to reduce the amount of UTXOs (fragmentation).

murchandamus commented at 2:40 PM on September 16, 2022:

This is not a good description of what the algorithm does. BnB deterministically searches the combination tree of the UTXO pool and retains the input set with the minimal waste score among the input sets it traverses. It doesn't guarantee to find the minimal waste score, and it doesn't guarantee to minimize the input set at high feerates. The result may end up being the minimal input set at high fees but might not be. E.g. if the feerate were 10.001 sat/vB and the LTFRE 10 sat/vB, the waste score could prefer a signficantly larger input set that results in a smaller excess over a smaller one with large excess, even while this bool classify the coin selection situation as is_feerate_expensive = True.

Meanwhile at low feerates BnB does not guarantee to maximimize the input set—it also keeps the input set with lowest waste score. Since the depth-first search starts with the most valuable UTXOs, input sets with a smaller count are traversed before ones with larger count are visited. Even then, the UTXO types may cause a smaller count to be preferred because the total weight of older output types could be bigger than a smaller count of modern outputs. Depending on the UTXO pool size, it may very well not find the maximal input set within the 100,000 tries: it doesn't take many UTXOs for the 100,000 tries to get exhausted before all relevant combinations are visited. Depending on the variance of the present values some 400+ UTXOs can suffice already, but wallets may have thousands or even tens of thousands of UTXOs.

If you insist on adding a comment perhaps use something along the lines of:

    // Check if the current feerate is higher than the long-term feerate estimate. 
    // Since BnB prefers the input set with the lowest waste score, this results in a preference 
    // for smaller input sets at high feerates and bigger input sets at lower feerates.

You might want to consider adding this to the comment on the function though, as that point is not well-covered there.

yancyribbens commented at 7:48 AM on September 17, 2022:

My comment was misleading it looks like. It's true that a true maximum or true min may not be found before the iteration limit. Also, I appreciate you pointing out that how many or how few UTXO's are targeted can depend on how much the longterm and short term fees differ. Perhaps we can just do without a comment here since the statement is self explanatory. I agree that further comment on how the selection process works is better to place on the function.

in src/wallet/coinselection.cpp:93 in 953ce061d5 outdated

  86 | @@ -87,13 +87,18 @@ std::optional<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_poo
  87 |      std::vector<size_t> best_selection;
  88 |      CAmount best_waste = MAX_MONEY;
  89 |  
  90 | +    // Check if the current feerate is expensive.  If it is expensive, we minimize the size of
  91 | +    // the result set and thus minimizing the transaction fee.  If the fees are not expensive,
  92 | +    // we maximize the result set to reduce the amount of UTXOs (fragmentation).
  93 | +    bool is_feerate_expensive = utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee;

murchandamus commented at 3:03 PM on September 16, 2022:

I don't think "expensive" is the best term to refer to a feerate. Creating a transaction at a high feerate is expensive, but the feerate itself doesn't cost anything. Perhaps:

    bool is_feerate_high = utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee;

yancyribbens commented at 7:55 AM on September 17, 2022:

My thought process was simply that we are checking if the current fee rate is expensive compared to the longterm fee rate. The reason I like expensive is it's a monetary term and "high" can pertain to many things. Since you feel strongly about using "high" I'll change it to your suggestion.

murchandamus commented at 7:06 PM on September 20, 2022:

A non-representative survey among my colleagues resulted in three out of three (including one native English speaker with a finance background) agreeing with me: a transaction is expensive when its feerate is high.

I therefore maintain that feerates are “high” or “low”, not “expensive” or “cheap”.

yancyribbens commented at 4:59 AM on September 21, 2022:

It occurred to me that in the study of computation, expensive is sometimes used to describe a computation that takes a lot of resource, therefore it's not purely a finical term. Also, since we are terming the transaction as expensive, it does make sense to term the feerate as high.

murchandamus commented at 4:18 PM on September 16, 2022: contributor

Yeah, looks better outside of the loop, but I think the comment would be better as an amendment of the big description in the function, but I also disagree with how the algorithm is characterized.

yancyribbens force-pushed on Sep 17, 2022

yancyribbens commented at 7:59 AM on September 17, 2022: contributor

@Xekyo does this look better? https://github.com/bitcoin/bitcoin/pull/25932/commits/81d4a2b14ff65fe07085ef2a967a466015370ce3

yancyribbens force-pushed on Sep 17, 2022

refactor: Move feerate comparison invariant outside of the loop 81d4a2b14f

murchandamus commented at 7:08 PM on September 20, 2022: contributor

ACK 81d4a2b14ff65fe07085ef2a967a466015370ce3

yancyribbens commented at 6:52 PM on December 23, 2022: contributor

@achow101 friendly request to take a look. It's a small change but I don't see a reason not to include it.

aureleoules approved

aureleoules commented at 10:14 AM on December 29, 2022: member

ACK 81d4a2b14ff65fe07085ef2a967a466015370ce3

fanquake assigned achow101 on Dec 29, 2022

achow101 commented at 4:58 PM on January 3, 2023: member

ACK 81d4a2b14ff65fe07085ef2a967a466015370ce3

achow101 referenced this in commit 7bb07bf8bd on Jan 3, 2023

achow101 commented at 5:47 PM on January 3, 2023: member

This was merged.

achow101 closed this on Jan 3, 2023

sidhujag referenced this in commit 21d2672e73 on Jan 4, 2023

bitcoin locked this on Jan 3, 2024