Successor of #23367
Increasing the upper bound helps find more changeless solutions. This improves coin selection in the cases described in #24580.
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
```diff
@@ -62,9 +62,9 @@

 static const size_t TOTAL_TRIES = 100000;

-std::optional<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_pool, const CAmount& selection_target, const CAmount& cost_of_change)
+std::optional<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_pool, const CAmount& selection_target, const CAmount& upper_bound)
 {
-    SelectionResult result(selection_target);
+    SelectionResult result(selection_target, /* force_no_change=*/ true);
```
```cpp
SelectionResult result(selection_target, /*force_no_change=*/true);
```
Anywhere you are adding/touching named args, please use the style from the developer docs.
Sure, will fix that
fixed
```diff
@@ -760,7 +760,8 @@ static bool CreateTransactionInternal(

     // vouts to the payees
     if (!coin_selection_params.m_subtract_fee_outputs) {
-        coin_selection_params.tx_noinputs_size = 11; // Static vsize overhead + outputs vsize. 4 nVersion, 4 nLocktime, 1 input count, 1 output count, 1 witness overhead (dummy, flag, stack size)
+        coin_selection_params.tx_noinputs_size = 10; // Static vsize overhead + outputs vsize. 4 nVersion, 4 nLocktime, 1 input count, 1 witness overhead (dummy, flag, stack size)
```
Note for reviewers/myself: it should be 10 vB for non-segwit transactions and 42 WU for segwit transactions (the witness adds the two fields, marker and flag, that don't scale with the count of inputs). So this is correct for non-segwit transactions, but would fall short for a segwit transaction.
For clarity: the size required to store the output counter is added below. So in the case of <253 outputs, `tx_noinputs_size` will be 11 vB for both segwit and non-segwit txs. This overshoots the target a bit for non-segwit, but that is much better than not selecting enough coins.
```diff
@@ -27,25 +27,25 @@ using interfaces::FoundBlock;
 namespace wallet {
 static constexpr size_t OUTPUT_GROUP_MAX_ENTRIES{100};

-int GetTxSpendSize(const CWallet& wallet, const CWalletTx& wtx, unsigned int out, bool use_max_sig)
+int GetTxSpendSize(const CWallet& wallet, const CWalletTx& wtx, unsigned int out, const CCoinControl* coin_control)
```
Note to other reviewers: it wasn't obvious to me at first why this replacement is possible, but I found that use_max_sig is directly calculated from CCoinControl values. Perhaps this could be mentioned in the commit message.
```cpp
./src/wallet/wallet.cpp: const bool use_max_sig = coin_control && (coin_control->fAllowWatchOnly || coin_control->IsExternalSelected(txin.prevout));
```
Will do
```diff
@@ -398,11 +398,8 @@ std::optional<SelectionResult> AttemptSelection(const CWallet& wallet, const CAm
         results.push_back(*knapsack_result);
     }

-    // Include change for SRD as we want to avoid making really small change if the selection just
-    // barely meets the target. Just use the lower bound change target instead of the randomly
-    // generated one, since SRD will result in a random change amount anyway; avoid making the
-    // target needlessly large.
-    const CAmount srd_target = nTargetValue + coin_selection_params.m_change_fee + CHANGE_LOWER;
```
I see how this is just a refactor, and doesn't change the result, but it's not clear to me why it's being done. Could you add a sentence of motivation to the commit message?
Will do. The reason is to have the correct value (without `CHANGE_LOWER`) for `SelectionResult::m_target`.
```diff
-                                           coin_selection_params.m_min_change_target, coin_selection_params.rng_fast)}) {
+    // So we need to include that for KnapsackSolver and SRD as well, as we are expecting to create a change output.
+    if (!coin_selection_params.m_subtract_fee_outputs) {
+        target_with_change += coin_selection_params.m_change_fee;
+    }
+    if (auto knapsack_result{KnapsackSolver(all_groups, target_with_change, coin_selection_params.m_min_change_target, coin_selection_params.rng_fast)}) {
```
Nice catch!
```diff
@@ -258,15 +258,17 @@ struct SelectionResult
     bool m_use_effective{false};
     /** The computed waste */
     std::optional<CAmount> m_waste;
+    /** Force no change even if more than enough inputs are selected */
+    bool m_force_no_change{false};
```
`force_no_change = false` feels like a potential brain twister. Perhaps it would be better to call this `allow_change` and invert the boolean values.
```cpp
/** Do not allow change even if more than enough inputs are selected */
bool m_allow_change{true};
```
meh.. I'm ambivalent about this. What do others think?
```diff
@@ -64,7 +64,7 @@ static const size_t TOTAL_TRIES = 100000;

 std::optional<SelectionResult> SelectCoinsBnB(std::vector<OutputGroup>& utxo_pool, const CAmount& selection_target, const CAmount& cost_of_change)
 {
-    SelectionResult result(selection_target, SelectionAlgorithm::BNB);
+    SelectionResult result(selection_target, SelectionAlgorithm::BNB, /*force_no_change=*/true);
```
See below
```cpp
SelectionResult result(selection_target, SelectionAlgorithm::BNB, /*allow_change=*/false);
```
```diff
     const SelectionAlgorithm m_algo;

-    explicit SelectionResult(const CAmount target, SelectionAlgorithm algo)
-        : m_target(target), m_algo(algo) {}
+    explicit SelectionResult(const CAmount target, SelectionAlgorithm algo, const bool force_no_change = false)
+        : m_target(target), m_algo(algo), m_force_no_change(force_no_change) {}
```
see above
```cpp
explicit SelectionResult(const CAmount target, SelectionAlgorithm algo, const bool allow_change = true)
    : m_target(target), m_algo(algo), m_allow_change(allow_change) {}
```
```diff
@@ -439,7 +450,18 @@ void SelectionResult::Clear()
 void SelectionResult::AddInput(const OutputGroup& group)
 {
     util::insert(m_selected_inputs, group.m_outputs);
-    m_use_effective = !group.m_subtract_fee_outputs;
+    m_use_effective = m_use_effective || !group.m_subtract_fee_outputs;
```
I don't understand what's going on here. I'm actually now wondering why we need m_use_effective in the first place if it's just the negation of m_subtract_fee_outputs on the group.
```diff
+    m_force_no_change |= other.m_force_no_change;
+    if (m_algo == SelectionAlgorithm::MANUAL) {
+        m_algo = other.m_algo;
+    }
+    util::insert(m_selected_inputs, other.m_selected_inputs);
 }
```
This feels like it should be mentioned in the commit message and should perhaps be a separate commit than "Return change from SelectionResult". I see that this might be useful to construct input sets with a specific UTXO included, but I have a hard time following how it fits into the bigger picture.
I'll work on splitting this into a separate commit
```diff
+    if (change <= min_change) { // TODO: less or less_or_equal
+        change = 0;
+    }
+
+    return change;
+}
```
Nice! :heart:
I was also thinking about us returning the change value in the SelectionResult in the context of automatic CPFP for unconfirmed inputs when the parent transaction has a lower feerate and we need to add fees for that. This is great!
```diff
@@ -254,6 +254,10 @@ struct SelectionResult
 private:
     /** Set of inputs selected by the algorithm to use in the transaction */
     std::set<COutput> m_selected_inputs;
+    /** The target the algorithm selected for. Note that this may not be equal to the recipient amount as it can include non-input fees */
+    CAmount m_target;
```
I wish we had canonical terms to consistently disambiguate the selection target and the recipient output total, especially because it additionally sometimes includes cost of change and a minimum change.
What do you all think about something along the lines of:
- `recipients_total`: sum of recipient output values (excluding change)
- `selection_target`: the value the coin selection needs to raise to pay for the payload, including change for algorithms that allow change

Do we need another term for `selection_target` without change?
+1 to the terms. I don't think we need another one for `selection_target` without change. Currently we know in most cases whether we need change or not: BnB always produces no change; Knapsack and SRD always produce change, unless SFFO.
Good, although Knapsack can actually produce results without change; it takes care of the change amount internally.
```diff
@@ -254,6 +254,10 @@ struct SelectionResult
 private:
     /** Set of inputs selected by the algorithm to use in the transaction */
     std::set<COutput> m_selected_inputs;
+    /** The target the algorithm selected for. Note that this may not be equal to the recipient amount as it can include non-input fees */
+    CAmount m_target;
+    /** The algorithm used to produce this result */
+    SelectionAlgorithm m_algo;
```
Yes, I just made it non-const to allow merging two `SelectionResult`s.
```diff
@@ -434,9 +434,17 @@ std::optional<SelectionResult> SelectCoins(const CWallet& wallet, const std::vec
          */
         preset_inputs.Insert(out, /*ancestors=*/ 0, /*descendants=*/ 0, /*positive_only=*/ false);
     }
+    if (preset_inputs.GetSelectionAmount() < nTargetValue) return std::nullopt;
```
Wasn't this what we introduced SelectionResult.Merge(…) above for?
Not sure what you meant by this. This is the case where the selected inputs are not enough and we just return a null value. There's no merging going on.
Right, but if the preset inputs are less than the target, wouldn't we be supposed to select additional inputs and build a transaction? I thought that's what you introduced the merge for.
```diff
@@ -434,9 +434,17 @@ std::optional<SelectionResult> SelectCoins(const CWallet& wallet, const std::vec
          */
         preset_inputs.Insert(out, /*ancestors=*/ 0, /*descendants=*/ 0, /*positive_only=*/ false);
     }
+    if (preset_inputs.GetSelectionAmount() < nTargetValue) return std::nullopt;
+
+    if (preset_inputs.GetSelectionAmount() > nTargetValue + coin_selection_params.m_cost_of_change) {
```
Would this perhaps still need to check whether the remainder is greater than the dust threshold?
I don't think so, because m_cost_of_change is higher than dust
Ah right, because it includes the cost of future spending.
```diff
@@ -572,8 +585,8 @@ std::optional<SelectionResult> SelectCoins(const CWallet& wallet, const std::vec
     if (!res) return std::nullopt;

     // Add preset inputs to result
-    res->AddInput(preset_inputs);
-    if (res->m_algo == SelectionAlgorithm::MANUAL) {
+    res->Merge(preselected);
```
I have the impression that this commit makes changes to how coin selection works with preselected inputs and in addition introduces that SelectionResult returns change. While both changes look worthwhile to me, it would be easier to follow if they were separate commits if not separate PRs. Am I missing some special reason why doing all of this at once is advantageous?
I intended this not to change the behavior. The reason we need `Merge` is to correctly track `SelectionResult::m_target`. Right now, in the case with preselected inputs, `m_target` is reduced by the amount of the preselected inputs, which leads to an incorrect result in `SelectionResult::GetChange`.
I'll split the commits and try to explain better why this is needed
```diff
@@ -382,9 +382,10 @@ std::optional<SelectionResult> AttemptSelection(const CWallet& wallet, const CAm
     // Vector of results. We will choose the best one based on waste.
     std::vector<SelectionResult> results;

+    const auto cost_of_extra_input = coin_selection_params.m_effective_feerate.GetFee(DUMMY_NESTED_P2WPKH_INPUT_SIZE);
     // Note that unlike KnapsackSolver, we do not include the fee for creating a change output as BnB will not create a change output.
     std::vector<OutputGroup> positive_groups = GroupOutputs(wallet, coins, coin_selection_params, eligibility_filter, true /* positive_only */);
-    if (auto bnb_result{SelectCoinsBnB(positive_groups, nTargetValue, coin_selection_params.m_cost_of_change)}) {
+    if (auto bnb_result{SelectCoinsBnB(positive_groups, nTargetValue, coin_selection_params.m_cost_of_change + cost_of_extra_input)}) {
```
That's an interesting upper bound. Much better than just completely unbounded, but definitely gives some room for bigger excess amounts than before. Is this still a WIP-value to give you more wiggle room or is this what you finally identified as the appropriate upper bound? Curious about your thought process here.
To fix the inefficiencies identified in #24580, we need to increase the upper limit by the opportunity cost of spending one extra input. We don't know what type that extra input would be.
The range of possible values goes from the cost of a P2TR input (57.5 vB) to the cost of a P2PKH input (148 vB). The chosen value is in the middle of that range.
An alternative way to pick the value would be to examine available UTXOs and take an average input size or something like this. This is more complicated and I'm not sure if it makes a meaningful difference.
I should probably add a comment to the code once we agree on the value
Perhaps it should just use the default output type, that way it will automatically upgrade as wallets progress to P2TR eventually. The current default seems like a reasonable choice at this time.
:thinking: Having the bound definitely feels a lot better than unbounded. A whole input's fee still feels like a lot of additional fees. It's obviously only the worst case, but I'm curious to see how that fares in simulations. Do you need to make more logic changes or do you think I could take it for a spin on a few different scenarios?
Very exciting changes in this PR. It feels like there are still some moving parts, see my comments on the various sections.
🐙 This pull request conflicts with the target branch and needs rebase.
Update: I did some simulations for this branch. The results were that
- the cost increased when the BnB window was relaxed
- the number of changeless solutions was reduced
FWIU, @S3RK had a similar result with his simulations so far. There was a bug discovered in how the waste is calculated for Knapsack in some instances, but I don't expect this bug to impact the overall trend.
More experiments to follow, and it may be interesting to explore the opposite, i.e. see whether reducing the window size in which BnB produces solutions improves results instead.
> The results were that
> - the cost increased when the BnB window was relaxed
> - the number of changeless solutions was reduced
I don't think this is the correct interpretation of the data, e.g. for `bustabit-tiny-hot-wallet_2019-2020-per-day`.
I see a similar picture for `bustabit-tiny-hot-wallet_peak-and-tail` and in my own simulations.
Overall, it's not clear whether increasing the upper limit is beneficial. I'm closing this now and will send the refactoring PRs separately.