This is a small nit, however I think it's more understandable to write:
utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee
vs
(utxo_pool.at(0).fee - utxo_pool.at(0).long_term_fee) > 0
This is a small nit, however I think it's more understandable to write:
utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee
vs
(utxo_pool.at(0).fee - utxo_pool.at(0).long_term_fee) > 0
Concept ACK, Approach ACK
<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
<!--021abf342d371248e50ceaed478a90ca-->
See the guideline for information on the review process.
| Type | Reviewers |
|---|---|
| ACK | Xekyo, aureleoules, achow101 |
| Approach ACK | AryanJ-NYC |
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
<!--174a7506f384e20aa4161008e828411d-->
No conflicts as of last run.
92 | @@ -93,7 +93,7 @@ std::optional<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_poo 93 | bool backtrack = false; 94 | if (curr_value + curr_available_value < selection_target || // Cannot possibly reach target with the amount remaining in the curr_available_value. 95 | curr_value > selection_target + cost_of_change || // Selected value is out of range, go back and try other branch 96 | - (curr_waste > best_waste && (utxo_pool.at(0).fee - utxo_pool.at(0).long_term_fee) > 0)) { // Don't select things which we know will be more wasteful if the waste is increasing 97 | + (curr_waste > best_waste && (utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee))) { // Don't select things which we know will be more wasteful if the waste is increasing
ACK 1a2d94ac586e2ecb55994205fceec14322a949fd
Since the feerate is loop invariant, that boolean will always evaluate the same across one coin selection attempt, and it would be even better to just pull it out of the loop:
[before loop]
bool is_high_feerate = utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee;
[…]
(is_high_feerate && curr_waste > best_waste)) { // Don't select things which we know will be more wasteful if the waste is increasing
I considered this as well. To my surprise moving the comparison outside of the loop didn't make any benchmark difference. My guess is the compiler must be smart enough to do some optimization, or the benchmark tests aren't sensitive enough to detect a change.
That bieng said, I do think it's more readable outside of the for loop (and doesn't need to be compared every iteration). @Xekyo does this look better https://github.com/bitcoin/bitcoin/pull/25932/commits/8ada8bb313eb95d55e9e1ac121bc6143a569e896?
Maybe this is better as two separate commits. https://github.com/bitcoin/bitcoin/pull/25932/commits/365aca40453995163bbd17231251512f9f9a103b and https://github.com/bitcoin/bitcoin/pull/25932/commits/953ce061d5e8392d9f4c5bcedd27a2eeb08296db
86 | @@ -87,13 +87,18 @@ std::optional<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_poo 87 | std::vector<size_t> best_selection; 88 | CAmount best_waste = MAX_MONEY; 89 | 90 | + // Check if the current feerate is expensive. If it is expensive, we minimize the size of 91 | + // the result set and thus minimizing the transaction fee. If the fees are not expensive, 92 | + // we maximize the result set to reduce the amount of UTXOs (fragmentation).
This is not a good description of what the algorithm does. BnB deterministically searches the combination tree of the UTXO pool and retains the input set with the minimal waste score among the input sets it traverses. It doesn't guarantee to find the minimal waste score, and it doesn't guarantee to minimize the input set at high feerates. The result may end up being the minimal input set at high fees but might not be. E.g. if the feerate were 10.001 sat/vB and the LTFRE 10 sat/vB, the waste score could prefer a signficantly larger input set that results in a smaller excess over a smaller one with large excess, even while this bool classify the coin selection situation as is_feerate_expensive = True.
Meanwhile at low feerates BnB does not guarantee to maximimize the input set—it also keeps the input set with lowest waste score. Since the depth-first search starts with the most valuable UTXOs, input sets with a smaller count are traversed before ones with larger count are visited. Even then, the UTXO types may cause a smaller count to be preferred because the total weight of older output types could be bigger than a smaller count of modern outputs. Depending on the UTXO pool size, it may very well not find the maximal input set within the 100,000 tries: it doesn't take many UTXOs for the 100,000 tries to get exhausted before all relevant combinations are visited. Depending on the variance of the present values some 400+ UTXOs can suffice already, but wallets may have thousands or even tens of thousands of UTXOs.
If you insist on adding a comment perhaps use something along the lines of:
// Check if the current feerate is higher than the long-term feerate estimate.
// Since BnB prefers the input set with the lowest waste score, this results in a preference
// for smaller input sets at high feerates and bigger input sets at lower feerates.
You might want to consider adding this to the comment on the function though, as that point is not well-covered there.
My comment was misleading it looks like. It's true that a true maximum or true min may not be found before the iteration limit. Also, I appreciate you pointing out that how many or how few UTXO's are targeted can depend on how much the longterm and short term fees differ. Perhaps we can just do without a comment here since the statement is self explanatory. I agree that further comment on how the selection process works is better to place on the function.
86 | @@ -87,13 +87,18 @@ std::optional<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_poo 87 | std::vector<size_t> best_selection; 88 | CAmount best_waste = MAX_MONEY; 89 | 90 | + // Check if the current feerate is expensive. If it is expensive, we minimize the size of 91 | + // the result set and thus minimizing the transaction fee. If the fees are not expensive, 92 | + // we maximize the result set to reduce the amount of UTXOs (fragmentation). 93 | + bool is_feerate_expensive = utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee;
I don't think "expensive" is the best term to refer to a feerate. Creating a transaction at a high feerate is expensive, but the feerate itself doesn't cost anything. Perhaps:
bool is_feerate_high = utxo_pool.at(0).fee > utxo_pool.at(0).long_term_fee;
My thought process was simply that we are checking if the current fee rate is expensive compared to the longterm fee rate. The reason I like expensive is it's a monetary term and "high" can pertain to many things. Since you feel strongly about using "high" I'll change it to your suggestion.
A non-representative survey among my colleagues resulted in three out of three (including one native English speaker with a finance background) agreeing with me: a transaction is expensive when its feerate is high.
I therefore maintain that feerates are “high” or “low”, not “expensive” or “cheap”.
It occurred to me that in the study of computation, expensive is sometimes used to describe a computation that takes a lot of resource, therefore it's not purely a finical term. Also, since we are terming the transaction as expensive, it does make sense to term the feerate as high.
Yeah, looks better outside of the loop, but I think the comment would be better as an amendment of the big description in the function, but I also disagree with how the algorithm is characterized.
ACK 81d4a2b14ff65fe07085ef2a967a466015370ce3
@achow101 friendly request to take a look. It's a small change but I don't see a reason not to include it.
ACK 81d4a2b14ff65fe07085ef2a967a466015370ce3
ACK 81d4a2b14ff65fe07085ef2a967a466015370ce3
This was merged.