Make ApproximateBestSubset optimize for amount of inputs #10100

RHavar commented at 9:22 PM on March 27, 2017: contributor

… rather than minimizing change amount.

This is more of a discussion issue, to see if this is appropriate. I haven't tested it (and indeed, it's going to break all coin selection tests).

The current coin selection code is a bit weird and already aggressively spends small inputs, as if you are sending N, it first tries to only use inputs less than N. (Actually, I've commented it all out in the version of code I run, as I've found it to be rather harmful). But anyway ApproximateBestSubset is only used when the it can't source N from inputs <= N -- so I think it makes sense to use a totally different algorithm.

And instead of trying to minimize the change (which is a wet dream for wallet clustering services), I think it makes sense to instead try optimize for the amount of inputs you need to use.

--

And in a more radical note:

I think it's really important to have a bunch of different coin selection algorithms to pick from. One-size fits all is pretty painful. When I'm sending transactions at 200satoshi/byte, I really, really don't want to consolidating all my unspents and don't care if I'm grinding dust outputs. However, when when I'm sending an ultra-low priority transaction (e.g. sending money from my hot wallet to cold storage) and only paying 5 sat/byte, the more utxo consolidation the better!

Make ApproximateBestSubset optimize for amount of inputs, rather than minimizing change 476a1bef8b

RHavar force-pushed on Mar 27, 2017

fanquake added the label Wallet on Mar 27, 2017

instagibbs commented at 2:43 AM on March 30, 2017: member

@Xekyo has done some work on coin selection

jonasschnelli commented at 10:22 AM on April 19, 2017: contributor

Needs rebase. Ping @Xekyo

Xekyo commented at 1:04 PM on April 25, 2017: member

If you want to minimize the number of inputs, you could just use Largest First selection as many have discovered before you. However, you will then create a large number of UTXOs that you'll need to spend in the future (when fees perhaps are even higher). In my simulations FIFO and random selection performed reasonably well among naive approaches.

One of the coin selection strategies that I've evaluated for my Master's thesis was aiming to create a change output of the same size as the amount to send. This did not perform well as a singular strategy in my tests, however, you seem to have different criteria to evaluate fitness.

RHavar commented at 1:19 PM on April 25, 2017: contributor

@Xekyo Keep in mind this isn't purely a "optimize for amount of inputs", as it only changes ApproximateBestSubset which is a fallback after the other stuff fails (which already heavily tries to consolidate utxo's).

Also one assumption you make, which doesn't necessarily hold up for "power users" is that sending transactions is the only way to consolidate the utxo set. This isn't really true for services anymore that are starting to do consolidation with ultra low-fee transactions for off peak times, and something that could even be exposed to the wallet directly.

So for those reasons, I very much like greedy strategies that optimize for minimizing the inputs (and leaving big inputs if possible) -- especially if your simulations show they perform well anyway.

--

Although I kind of don't expect any make any progress to happen on coin selection as it's a lot of bike shedding, and everyone has different use cases that dictate different coin selection strategies. I think the best way would be to just code up a few simple strategies, and let users pick between them.

Or another idea would be use a "score" based system. Where users can configure the score for different properties (e.g. optimizing for size, privacy, utxo consolidation, etc.) and then randomly generates 1000 (?) valid transactions, and picks the one with the best score

Xekyo commented at 1:35 PM on May 2, 2017: member

@RHavar: Sorry for the late reply, I was moving to another country in the past days. At least when I analyzed Bitcoin Core's selection last year, approximate subset was used almost every run: it's only not used if you have a single UTXO that matches the target or when all UTXOs smaller than the target as a sum match the target. These two cases only occurred in less than 1.5% of the transactions in my simulation.

My simulation showed that Largest First selection minimizes short term fees, but maximizes UTXO creation. As I was assuming constant fees, I cannot offer figures for its performance in combination with low-fee consolidation transactions. I agree though, that creating a change output and spending it later for e.g. a fifth of the fee would be cheaper than adding a second input to avoid creating an output.

I'm hoping to be doing some more research on coin selection in the next months with larger real world datasets, I'll be looking into extending my simulation framework to cover varying fee levels over the week, so I hope to be able to give more reliable information.

Meanwhile I think that your approach would serve power users well, but I don't think that we should put Largest First selection as the default strategy for Bitcoin Core.

RHavar commented at 4:05 PM on May 3, 2017: contributor

These two cases only occurred in less than 1.5% of the transactions in my simulation.

I'm not sure this is as meaningful as it first seems. It gets hit very infrequently directly because of the behavior of ApproximateBestSubset. If ApproximateBestSubset was changed to something to minimize (short term) fees, then it'll get hit more often.

I'm hoping to be doing some more research on coin selection in the next months with larger real world datasets, I'll be looking into extending my simulation framework to cover varying fee levels over the week, so I hope to be able to give more reliable information.

I kind of feel like coin-selection is more of an engineering problem than a research one. There are so many different use-cases and trade offs that it'd say it's impossible to have a universally decent coin selection.

Like even ideally you'd take feerate into consideration when doing coin selection. If the feerate is atypically high (e.g. spam attack) you'd want to automatically switch to something that creates as small transactions as possible. If the feerate was atypically low (e.g. weekends) you'd try throw in some extra utxo consolidation.

But not only that, you have a lot of other trade offs that are hard to quantify. Like privacy.

but I don't think that we should put Largest First selection as the default strategy for Bitcoin Core.

Fair enough, I'll close the issue.

RHavar closed this on May 3, 2017

RHavar deleted the branch on Oct 2, 2018

MarcoFalke locked this on Sep 8, 2021