FUZZ: Test that BnB finds best solution

murchandamus commented at 5:34 pm on July 7, 2025: contributor

BnB’s solution is the input set with the lowest waste score, excluding any supersets of other solution candidates. This fuzz test compares a brute force search with the BnB result to ensure that BnB succeeds.

DrahtBot added the label Tests on Jul 7, 2025

DrahtBot commented at 5:34 pm on July 7, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32894.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	yancyribbens

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

in src/wallet/test/fuzz/coinselection.cpp:275 in ce517d32d5 outdated

270+    // Brute force optimal solution (lowest waste, but cannot be superset of another solution)
271+    std::vector<uint32_t> solutions;
272+    int best_waste{std::numeric_limits<int>::max()};
273+    int best_weight{std::numeric_limits<int>::max()};
274+    for (uint32_t pattern = 1; (pattern >> num_groups) == 0; ++pattern) {
275+        // BnB does not add permit adding more inputs to a solution, i.e. a superset of a solution cannot ever be a solution.

maflcko commented at 8:44 am on July 8, 2025:

“BnB does not add permit adding more inputs to a solution…” -> “BnB does not permit adding more inputs to a solution…” [remove the extraneous “add” to correct the verb phrase]

murchandamus force-pushed on Jul 8, 2025

murchandamus force-pushed on Jul 10, 2025

murchandamus commented at 6:58 pm on July 10, 2025: contributor

Thanks @maflcko, I fixed the sentence.

yancyribbens commented at 11:38 pm on July 10, 2025: contributor

I’ll take a closer look at this, although some quick feedback since I’ve implemented something similar.

This fuzz test compares a brute force search with the BnB result to ensure that BnB succeeds.

I recently removed a similar automated test that used brute force. The issue with using a brute solution I had was a) brute force is slow b) only works on a limited pool size of say 20 UTXOs (see a). Instead of using brute force, I adopted a strategy where I generate two random UTXO sets. Then, build a target from one of the two sets. Finally merge the two sets and randomize. Therefore there is for sure a solution (unless iteration limit is hit) and the assertion asserts that BnB will find the solution (or that it doesn’t due to some error condition like iteration limit reached). I’m curious what you think of the pros and cons of your brute force strategy vs seeding a random solution and how it works with larger UTXO sets.

in src/wallet/test/fuzz/coinselection.cpp:229 in a383386b9d outdated

224+    FastRandomContext fast_random_context{ConsumeUInt256(fuzzed_data_provider)};
225+    CoinSelectionParams coin_params{fast_random_context};
226+    coin_params.m_subtract_fee_outputs = false;
227+    // Set effective feerate up to 10'000'000 sats per kvB (10'000 sat/vB).
228+    coin_params.m_effective_feerate = CFeeRate{ConsumeMoney(fuzzed_data_provider, 10'000'000), 1'000};
229+    coin_params.m_long_term_feerate = CFeeRate{ConsumeMoney(fuzzed_data_provider, 10'000'000), 1'000};

yancyribbens commented at 7:06 pm on July 11, 2025:

In a similar test, I run SelectCoinsBnB twice and swap fee_rate and long_term_fee_rate, then assert that when long_term_fee_rate is greater less outputs (or equal) are found between the two runs.

murchandamus commented at 7:55 pm on July 11, 2025:

The invariant you describe does not generally hold true. It would be possible that running BnB on the same UTXO set with the same target at two different feerates would result in an input set with fewer inputs at the lower feerate than the result at the higher feerate due the difference in available combinations after factoring in fees (unless we are talking about two different UTXO sets that match in effective values at the different feerates).

We test for switch between consolidatory and thrifty behavior in bnb_feerate_sensitivity_test and it is not a goal of this fuzz harness to test feerate sensitivity, although it indirectly is tested by checking that the waste score matches between the BnB result and the brute force result when BnB finished running and was not interrupted by the iteration limit.

yancyribbens commented at 2:09 am on July 12, 2025:

I did not mean to use two UTXO sets. If you have a single UTXO set and one swaps the long_term_fee_rate and the short term fee_rate, that ought to produce either the same results or different results depending on if the current fees are expensive or if the current fee_rates are cheap. I think the test you linked to shows that since the same set is used {2, 3, 5, 10} which returns different results depending on the difference in fee_rate and long_term_fee_rate. The difference between expensive and cheap fee_rates can be toggled by swapping fee_rate for long_term_fee_rate and running BnB again, unless I’m missing something.

Anyway, I understand not wanting to add that complexity to the test.

murchandamus commented at 7:06 pm on July 21, 2025:

As mentioned above, I think we are crossing wires here. I was talking about running coin selection with the same UTXO set at two different feerates. You appear to be talking about running a test with two different UTXO sets generated from the same set of effective values at different feerates. The latter is done already in the test I linked above. For the former the invariant you describe does not hold.

yancyribbens commented at 12:03 pm on July 22, 2025:

You appear to be talking about running a test with two different UTXO sets generated from the same set of effective values at different feerates. The latter is done already in the test I linked above. For the former the invariant you describe does not hold.

Yes, that is what I’m talking about said a different way. Yes, the test you linked above tests for that but only for the same static values.

murchandamus commented at 2:29 pm on July 22, 2025:

Isn’t that sufficient to demonstrate that it is doing that?

yancyribbens commented at 6:36 pm on July 22, 2025:

Isn’t it a stronger assertion to show the invariant holds for all cases? Otherwise what’s the point of any of the fuzz assertions?

in src/wallet/test/fuzz/coinselection.cpp:236 in a383386b9d outdated

232+
233+    coin_params.change_output_size = fuzzed_data_provider.ConsumeIntegralInRange(1, MAX_SCRIPT_SIZE);
234+    coin_params.m_change_fee = coin_params.m_effective_feerate.GetFee(coin_params.change_output_size);
235+    coin_params.change_spend_size = fuzzed_data_provider.ConsumeIntegralInRange<int>(41, 1000);
236+    const auto change_spend_fee{coin_params.m_discard_feerate.GetFee(coin_params.change_spend_size)};
237+    coin_params.m_cost_of_change = coin_params.m_change_fee + change_spend_fee;

yancyribbens commented at 7:12 pm on July 11, 2025:

could not cost_of_change be randomly selected from a range (0, MAX)?

yancyribbens commented at 7:16 pm on July 11, 2025:

I suppose MAX could be MAX_MONEY here?

murchandamus commented at 8:09 pm on July 11, 2025:

change_fee and cost_of_change are used in different calculations relating to the solution window of BnB and the waste calculation and in reality relate to the current feerate. Therefore it was instead chosen to fuzz the input and output size used to calculate them, but base them on the same feerate used for other calculations in the context.

yancyribbens commented at 3:34 pm on July 12, 2025:

I see. So cost_of_change is variable based on the change output size which is range bound (0, MAX_SCRIPT_SIZE) since the cost of creating the change output depends on the size of the change output created. I didn’t know there was a max script size for an output!?

I think this type of thing might be worth adding in more detail to the commit message to help reviewers.

sipa commented at 8:16 pm on July 12, 2025:

scriptPubKeys of size above 10000 are consensus-unspendable.

in src/wallet/test/fuzz/coinselection.cpp:244 in a383386b9d outdated

240+    coin_params.min_viable_change = std::max(change_spend_fee + 1, dust);
241+
242+    // Create some coins
243+    CAmount max_spendable{0};
244+    int next_locktime{0};
245+    static constexpr unsigned max_output_groups{16};

yancyribbens commented at 7:15 pm on July 11, 2025:

It looks like a downside to this brute force approach is it only allows pool sizes up to 16. Curious if you consider just pre-generating a solution which would allow any theoretical size.

murchandamus commented at 8:48 pm on July 11, 2025:

We already have the coinselection_bnb fuzz target which runs BnB on UTXO pools of up to 10,000 UTXOs and validates that if BnB finds a solution that it falls in the solution window.

In contrast to the existing fuzz test, this fuzz target generates inputs that have solutions and don’t have solution. It then asserts that when the brute force search indicates that a solution exists, BnB also finds a solution, and if BnB finished searching the entire combination space, BnB’s result matches the optimal solution per BnB’s fitness function.

Above 17 UTXOs a fuzz input could be generated whose scenario exhausts BnB’s iteration limit before BnB finds any solution. The invariant that BnB will always find a solution when the brute force algorithm finds a solution no longer holds above 17 UTXOs.

I guess I could increase it to 17 instead of 16, though.

yancyribbens commented at 10:42 pm on July 11, 2025:

We already have the coinselection_bnb fuzz target which runs BnB on UTXO pools of up to 10,000 UTXOs and validates that if BnB finds a solution that it falls in the solution window.

I see. That is also valuable. However, this does not test the case where a solution does exist (say with 20 UTXOS) and BnB does not find the result, only that any found solution matches the parameters. The brute force technique would also not test such a case. It’s possible BnB could find solutions of up to 100,000 UTXOs if every UTXO was in the explore branch I believe.

I admit, it is problematic to try to test every case.

murchandamus commented at 11:14 pm on July 11, 2025:

Let’s say you create a fuzz target that generates two sets of UTXOs and sums up the effective value of one set to set that as its target. For UTXO counts exceeding 17, BnB might return a solution and then we could check exactly the properties that coinselection_bnb checks and additionally, that the solution is at least as good as the input set we originally used to generate the target.

However, while #32150 introduces the use of m_algo_completed and m_selections_evaluated to BnB, they are properties of SelectionResult; but when a coin selection algorithm fails to find a solution, it does not return a SelectionResult object, but just an error: either returns ErrorMaxWeightExceeded or util::Error which do not surface these two properties. We currently wouldn’t even be able to tell whether it failed because it hit the iteration limit, or whether it didn’t find a solution we know to exist, so I don’t think we would learn anything in addition to the two fuzz targets we have from the fuzz target you describe.

murchandamus commented at 11:16 pm on July 11, 2025:

It may however be interesting to introduce another Error class to coin selection which surfaces when the iteration limit was hit before a solution was found. Then we could distinguish between not finding a solution due to exhausting iterations or due to overlooking a solution that we should have found, and fuzzing BnB with bigger UTXO pools that are known to contain a solution would become interesting.

yancyribbens commented at 1:53 am on July 12, 2025:

It may however be interesting to introduce another Error class to coin selection which surfaces when the iteration limit was hit before a solution was found

Yes exactly. That’s how the test works in the rust implementation I wrote. A solution is created from one of two pools and then the pools are combined and shuffled. I have an error that is returned that distinguish iteration limit reached from no solution, so to test BnB is correct, we know we should never see no solution, only iteration limit reached or a solution (or overflow is another possibility).

However, I think the only way to test that BnB finds no solution when no solution exists is to take a target and look through every subset in brute force to validate that no solution exists. So ideally, there would be those two tests. One that tests for a known solution and one that tests for no solution.

yancyribbens commented at 1:25 pm on July 12, 2025:

For reference, this is the rust test I recently wrote: https://github.com/p2pderivatives/rust-bitcoin-coin-selection/blob/3ae36852f4263270e794c444e9fc09172997fc67/src/branch_and_bound.rs#L866

yancyribbens commented at 2:01 pm on July 12, 2025:

I just noticed a subtle bug in the test that I fixed. Let’s see if anyone else notices :nerd_face:

yancyribbens commented at 2:10 am on July 13, 2025:

In contrast to the existing fuzz test, this fuzz target generates inputs that have solutions and don’t have solution. It then asserts that when the brute force search indicates that a solution exists

On second thought, I don’t see the value in a test that tests for results that “don’t have solutions”, The valuable test is testing the quality of any produced solution. Therefore, I’m not sure what the value is to using a brute force method to show that no solution is produced.

murchandamus commented at 7:18 pm on July 21, 2025:

The test you propose is also interesting, and someone might want to take a stab at adding it in another pull request. It is out of scope for this pull request however.

Fuzzing explores pseudorandomly all code paths that can be reached by exercising the fuzz harness. By comparing BnB with a brute force search, we can demonstrate that BnB always finds the best solution that is present, but only when they are present. As only inputs that cause new paths to be explored in the source code are retained, fuzzing space efficiently surfaces edge cases.

yancyribbens commented at 12:08 pm on July 22, 2025:

The test you propose is also interesting, and someone might want to take a stab at adding it in another pull request.

That would be great.

brunoerg commented at 7:05 pm on October 10, 2025:

Above 17 UTXOs a fuzz input could be generated whose scenario exhausts BnB’s iteration limit before BnB finds any solution. The invariant that BnB will always find a solution when the brute force algorithm finds a solution no longer holds above 17 UTXOs.

Worth adding a short comment on max_output_groups about this? It would be great to understand this rationale when reading the harness.

in src/wallet/test/fuzz/coinselection.cpp:252 in a383386b9d outdated

248+    {
249+        // With maximum m_effective_feerate 10'000 s/vB and n_input_bytes = 20'000 B, input_fee <= MAX_MONEY.
250+        const int n_input_bytes{fuzzed_data_provider.ConsumeIntegralInRange<int>(1, 20'000)};
251+        const CAmount input_fee = coin_params.m_effective_feerate.GetFee(n_input_bytes);
252+        // Only make UTXOs with positive effective value
253+        const CAmount eff_value{fuzzed_data_provider.ConsumeIntegralInRange<CAmount>(1, MAX_MONEY + group_pos.size() - max_spendable - max_output_groups)};

yancyribbens commented at 7:22 pm on July 11, 2025:

doesn’t this create pools that could sum to greater than MAX_MONEY?

murchandamus commented at 8:57 pm on July 11, 2025:

I don’t think so.

max_spendable starts at 0 and then sums the effective values of the generated UTXO (see line 258). As max_spendable grows, it limits the amount available to the remaining UTXOs as they are generated. + group_pos.size() - max_output_groups ensures that there is always at least 1 sat left to be assigned to the current UTXO.

yancyribbens commented at 8:01 pm on July 12, 2025:

I see. Thanks. This will be a slightly non-uniform pattern since the window is largest for the first selection (1, MAX_MONEY - 16) and then the next selection will have a smaller range maybe (1, MAX_MONEY / 2). So the pool will tend to have the largest UTXO’s first.

murchandamus commented at 6:50 pm on July 21, 2025:

UTXOs are ordered in descending effective value by BnB, so I don’t think the UTXO with largest value being likely to be generated first is a problem.

yancyribbens commented at 12:10 pm on July 22, 2025:

UTXOs are ordered in descending effective value by BnB, so I don’t think the UTXO with largest value being likely to be generated first is a problem.

I was thinking of testing this more as you would a black box.

in src/wallet/test/fuzz/coinselection.cpp:266 in a383386b9d outdated

262+        group_pos.push_back(output_group);
263+    }
264+    size_t num_groups = group_pos.size();
265+    assert(num_groups <= max_output_groups);
266+
267+    // Only choose targets below max_spendable

yancyribbens commented at 7:27 pm on July 11, 2025:

I think there should be something that prevents max_spendable from being greater than MAX_MONEY?

murchandamus commented at 9:08 pm on July 11, 2025:

max_spendable is the sum of effective values of the generated UTXOs and they are being generated by being picking values from the gap that is left between max_spendable and MAX_MONEY.

Example:

MAX_MONEY is 100.
max_output_groups is 3.

The effective value for the first UTXO A is generated randomly between (1, MAX_MONEY + group_pos.size() - max_spendable - max_output_groups), i.e. (1, 100 + 0 - 0 - 3) = (1, 97). Let’s say it’s 43. max_spendable becomes 43.
The effective value of the second UTXO B is now generated randomly between (1, 100 + 1 - 43 -3) = (1, 55). Let’s say it’s 51. max_spendable is now 84.
The effective value for the third UTXO C is now generated randomly between (1, 100 + 2 - 84 - 3) = (1, 15). Let’s say it’s the maximum of 15. max_spendable is now 99 and less than MAX_MONEY.

yancyribbens commented at 8:03 pm on July 12, 2025:

Yes, that follows from my previous comment. I note that this works however the distribution will not be truly random since the chance of selecting a large UTXO is greatest the first time and so on.

yancyribbens commented at 8:05 pm on July 12, 2025:

Maybe randomize the pool once it’s created so there is equal likely-hood of having a large UTXO at the back of the list.

murchandamus commented at 6:51 pm on July 21, 2025:

BnB traverses the UTXOs after sorting, so I don’t see why this would be of concern.

in src/wallet/test/fuzz/coinselection.cpp:231 in a383386b9d outdated

226+    coin_params.m_subtract_fee_outputs = false;
227+    // Set effective feerate up to 10'000'000 sats per kvB (10'000 sat/vB).
228+    coin_params.m_effective_feerate = CFeeRate{ConsumeMoney(fuzzed_data_provider, 10'000'000), 1'000};
229+    coin_params.m_long_term_feerate = CFeeRate{ConsumeMoney(fuzzed_data_provider, 10'000'000), 1'000};
230+    coin_params.m_discard_feerate = CFeeRate{ConsumeMoney(fuzzed_data_provider, 10'000'000), 1'000};
231+    coin_params.m_cost_of_change = ConsumeMoney(fuzzed_data_provider);

murchandamus commented at 8:06 pm on July 11, 2025:

Remove duplicate definition of m_cost_of_change.

yancyribbens commented at 3:35 pm on July 12, 2025:

Good call, I just noticed that as well.

yancyribbens commented at 8:07 pm on July 11, 2025: contributor

Concept ACK adding a test that shows BnB finds the best solution (min waste). The downside to this approach however is it limits solutions to 16 and theoretically BnB could find solutions of larger sets without hitting the maximum iteration limit. Code looks good though otherwise.

I think this should be prioritized to merge first over #32150 so that it can add more test safety to that refactor.

FUZZ: Test that BnB finds best solution

BnB’s solution is the input set with the lowest waste score, excluding
any supersets of other solution candidates.
This fuzz test compares a brute force search with the BnB result to
ensure that BnB succeeds.

96e1c82552

murchandamus force-pushed on Jul 21, 2025

murchandamus commented at 7:21 pm on July 21, 2025: contributor

Removed the duplicate definition of coin_params.m_cost_of_change.

achow101 requested review from achow101 on Oct 22, 2025

achow101 requested review from marcofleon on Oct 22, 2025

fanquake added the label Fuzzing on Oct 30, 2025

FUZZ: Test that BnB finds best solution #32894

Code Coverage & Benchmarks

Reviews