fuzz: provide more realistic values to the base58(check) decoders

l0rinc commented at 4:23 pm on February 20, 2025: contributor

This is a follow-up to #30746, expanding coverage by:

restricting every input for the base58 conversions, capping max sizes to 100 instead of 1000 or all available input (suggested by marcofleon in #30746 (review)) since most actual usage has lengths of e.g. 21, 34, 78.
providing more valid values to the decoder (suggested by maflcko in #30746 (review)) by randomly providing a random input or a valid encoded one; this also enables unifying the roundtrip tests to a single roundtrip per fuzz.

DrahtBot commented at 4:23 pm on February 20, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31917.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	mzumsande, maflcko
Concept ACK	arejula27

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

DrahtBot added the label Tests on Feb 20, 2025

in src/test/fuzz/base_encode_decode.cpp:43 in 21b8b8464b outdated

50 {
51-    FuzzedDataProvider provider(buffer.data(), buffer.size());
52-    const std::string random_string{provider.ConsumeRandomLengthString(1000)};
53-    const int max_ret_len{provider.ConsumeIntegralInRange<int>(-1, 1000)};
54+    // Encode/Decode roundtrip
55+    const auto encoded{EncodeBase58Check(buffer)};

marcofleon commented at 4:30 pm on February 20, 2025:

The problem actually turned out to be this. The input that times out is about 1MB so passing in the whole buffer causes the timeout.

I don’t think the change from 1000 to 100 for both random_string and max_ret_len makes a difference.

brunoerg commented at 5:33 pm on February 20, 2025:

I think we use this function on the RPC target and we limit the length to 64.

l0rinc commented at 5:55 pm on February 20, 2025:

Thanks for the help, I pushed a new version where every input is restricted (used 100 to be slightly different from the RPC one you’ve mentioned) and <1000, even though it seems that wasn’t the actual bottleneck.

l0rinc force-pushed on Feb 20, 2025

l0rinc force-pushed on Feb 21, 2025

arejula27 commented at 1:15 pm on February 23, 2025: none

Concept ACK a63c5c1a8a012b694

I’ve reviewed the changes and have one main observation:

1. Fuzz Test Limitations

Limiting fuzz to the most common sizes makes sense for efficiency. However, if long strings are still expected to be supported, there should be explicit test coverage for them.

I checked the unit tests for Base58 in src/test/base58_tests.cpp, which reads test cases from test/data/base58_encode_decode.json. The longest existing test case appears on line 18, with a hex-encoded string of size 512. Given that this PR removes the long string test from the fuzz tests, it may be beneficial to add a new test case in base58_encode_decode.json to verify behaviour at 1000 characters.

2. Fuzz Test Execution

After static review, I ran the fuzz tests using:

0$ cmake --preset=libfuzzer
1$ cmake --build build_fuzz -j "$(($(nproc)+1))"
2
3$ FUZZ=base58_encode_decode build_fuzz/src/test/fuzz/fuzz -max_total_time=30 -
4$ FUZZ=base58check_encode_decode build_fuzz/src/test/fuzz/fuzz -max_total_time=30
5$ FUZZ=base32_encode_decode build_fuzz/src/test/fuzz/fuzz -max_total_time=30 
6$ FUZZ=base64_encode_decode build_fuzz/src/test/fuzz/fuzz -max_total_time=30 
7$ FUZZ=psbt_base64_decode build_fuzz/src/test/fuzz/fuzz -max_total_time=30

All tests passed successfully within the 30-second run.

maflcko added this to the milestone 29.0 on Feb 26, 2025

in src/test/fuzz/base_encode_decode.cpp:62 in 23fc349959 outdated

76 
77 FUZZ_TARGET(base32_encode_decode)
78 {
79-    const std::string random_string{buffer.begin(), buffer.end()};
80+    FuzzedDataProvider provider{buffer.data(), buffer.size()};
81+    const auto random_string{provider.ConsumeRandomLengthString(100)};

maflcko commented at 8:27 am on February 28, 2025:

Why are you restricting this to 100? It wasn’t restricted before

l0rinc commented at 2:21 pm on February 28, 2025:

to not use all the buffer, where we had problems before. Do you suggest we use the whole buffer here, like before, or to give it a bigger max value?

maflcko commented at 2:36 pm on February 28, 2025:

to not use all the buffer, where we had problems before.

What problems? A problem was introduced in the base58 fuzz target in the previous change, however you are changing the base32/64 here and below.

Generally, if there is no reason to change the code, I think it shouldn’t be changed. Otherwise, it will just be confusing, hard to review, and may even introduce problems.

l0rinc commented at 2:54 pm on February 28, 2025:

What problems? A problem was introduced in the base58 fuzz target in the previous change, however you are changing the base32/64 here and below.

I’ve learned that using the whole buffer can be problematic for long running tasks, and since base conversion is mostly scale dependent anyway, we’d get better coverage of special values (e.g. all zeros, powers of the base, etc), by using a value that isn’t too large.

there is no reason to change the code

This PR is meant to unify the related base fuzz tests. I’d argue it’s easier to review them if they all follow the same steps. This is a follow-up to a previous PR, we’re not modifying legacy code here.

l0rinc commented at 5:26 pm on February 28, 2025:

Kept the base58 fuzz changes only, removed the base[32/64] changes

in src/test/fuzz/base_encode_decode.cpp:78 in 23fc349959 outdated

 95 
 96 FUZZ_TARGET(base64_encode_decode)
 97 {
 98-    const std::string random_string{buffer.begin(), buffer.end()};
 99+    FuzzedDataProvider provider{buffer.data(), buffer.size()};
100+    const auto random_string{provider.ConsumeRandomLengthString(100)};

maflcko commented at 8:27 am on February 28, 2025:

same

maflcko removed this from the milestone 29.0 on Feb 28, 2025

l0rinc force-pushed on Feb 28, 2025

in src/test/fuzz/base_encode_decode.cpp:69 in ac4b10f702 outdated

65@@ -67,9 +66,9 @@ FUZZ_TARGET(base32_encode_decode)
66         assert(encoded_string == ToLower(TrimStringView(random_string)));
67     }
68     // Encode/Decode roundtrip
69-    const auto encoded{EncodeBase32(buffer)};
70+    const auto encoded{EncodeBase32(random_string)};

mzumsande commented at 8:26 pm on March 5, 2025:

this is changed back and fourth between the two commits, probably a rebase error.

l0rinc commented at 9:32 pm on March 5, 2025:

Thanks for the review. It was a rebase error indeed, thanks for catching it.

in src/test/fuzz/base_encode_decode.cpp:31 in ac4b10f702 outdated

32-    if (DecodeBase58(random_string, decoded, max_ret_len)) {
33+    if (std::vector<unsigned char> decoded; DecodeBase58(random_string, decoded, max_ret_len)) {
34         const auto encoded_string{EncodeBase58(decoded)};
35         assert(encoded_string == TrimStringView(random_string));
36-        assert(encoded_string.empty() || !DecodeBase58(encoded_string, decoded, provider.ConsumeIntegralInRange<int>(0, decoded.size() - 1)));
37+        assert(decoded.empty() || !DecodeBase58(encoded_string, decoded, provider.ConsumeIntegralInRange<int>(0, decoded.size() - 1)));

mzumsande commented at 8:39 pm on March 5, 2025:

why is the check changed from encoded_string.empty() to decoded.empty() in the first commit - especially if this function is being rewritten in the next commit anyway?

l0rinc commented at 9:32 pm on March 5, 2025:

it’s the decoded size that determines if the (0, decoded.size() - 1) range makes sense or not - I’m correcting it in the first commit since this was a bug that could cause a fuzz failure (added explanation to commit message). I’m reordering in the second commit so that we have single roundtrip logic instead of 2 separate ones.

mzumsande commented at 6:53 pm on March 13, 2025:

ok, makes sense, but I’d challenge that this is “a bug that could cause a fuzz failure” (otherwise I would’ve expected this failure to be found by now): My understanding is that in the relevant special case, both decoded an encoded_string are empty. However, it should be impossible for encoded_string to be empty and decoded non-empty (and vice versa) because of how base58 works.

fuzz: Always restrict base conversion input lengths

They seem to cause timeouts:
> Issue 397734700: bitcoin-core:base58check_encode_decode: Timeout in base58check_encode_decode

The `encoded_string.empty()` check was corrected here to `decoded.empty()` to make sure the `(0, decoded.size() - 1)` range is always valid.

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Co-authored-by: marcofleon <marleo23@proton.me>
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>

bad1433ef2

fuzz: make sure DecodeBase58(Check) is called with valid values more often

In Base58 fuzz the two roundtrips are merged now, the new `decode_input` switches between a completely random input and a valid encoded one, to make sure the decoding passes more often.
The `max_ret_len` can also exceed the original length now and is being validated more thoroughly.

Co-authored-by: maflcko <6399679+maflcko@users.noreply.github.com>
Co-authored-by: marcofleon <marleo23@proton.me>

d5537c18a9

l0rinc force-pushed on Mar 5, 2025

mzumsande commented at 6:54 pm on March 13, 2025: contributor

Code Review / lightly tested ACK d5537c18a9034647ba4c9ed4008abd7fee33989e

maflcko commented at 12:47 pm on March 15, 2025: member

review ACK d5537c18a9034647ba4c9ed4008abd7fee33989e 🚛

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: review ACK d5537c18a9034647ba4c9ed4008abd7fee33989e 🚛
3lZRbdgnghqtziDGBOrJTlPhuDIsZdigxD/VS2S8f5+dWrvPvXpUWSzcRe/nWCWjUTD2bz/MArCoV2/DrCVmxDA==

maflcko added this to the milestone 29.0 on Mar 15, 2025

maflcko commented at 12:50 pm on March 15, 2025: member

(In theory this could be backported to 29.0 to avoid a fuzz timeout there, but it isn’t too important, up to the backport maintainer)

fanquake merged this on Mar 16, 2025

fanquake closed this on Mar 16, 2025

fanquake added the label Needs backport (29.x) on Mar 17, 2025

glozow referenced this in commit 15ecae31a8 on Mar 17, 2025

glozow referenced this in commit 458655bca8 on Mar 17, 2025

glozow commented at 2:14 am on March 17, 2025: member

Backported in #32062

glozow removed the label Needs backport (29.x) on Mar 17, 2025

fuzz: provide more realistic values to the base58(check) decoders #31917

Code Coverage & Benchmarks

Reviews

1. Fuzz Test Limitations

2. Fuzz Test Execution