fuzz: add ConstructPubKeyBytes util function

josibake commented at 8:12 am on August 29, 2023: member

In #28246 and #28122 , we add a PubKeyDestination and a V0SilentPaymentsDestination. Both of these PRs update fuzz/util.cpp and need a way to create well-formed pubkeys. Currently in fuzz/util.cpp, we have some logic for creating pubkeys in the multisig data provider. This logic is duplicated in #28246 and duplicated again in #28122. Seems much better to have a ConstructPubKeyBytes function that both PRs (and any future work) can reuse.

This PR introduces a function to do this and has the existing code use it. While the purpose is to introduce a utility function, the previous multisig code used ConsumeIntegralInRange(4, 7) which would have created some uncompressed pubkeys with the prefix 0x05, which is incorrect (see https://bitcoin.stackexchange.com/questions/57855/c-secp256k1-what-do-prefixes-0x06-and-0x07-in-an-uncompressed-public-key-signif)

tldr; using PickValueFromArray is more correct as it limits to the set of defined prefixes for compressed and uncompressed pubkeys.

DrahtBot commented at 8:12 am on August 29, 2023: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	Sjors

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#28241 (Silent payment index (for light wallets and consistency check) by Sjors)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Sjors commented at 9:45 am on August 29, 2023: member

utACK bc5b39375a36f5b76ebd1d23faab0eea077f08ef

Maybe add 0x07 in a separate commit.

There’s another behavior change here, probably a good one: while generating N keys, they’re now completely different. Previously we would only change the last byte to make them unique.

in src/test/fuzz/util.cpp:27 in bc5b39375a outdated

22+    } else {
23+        pk_type = fuzzed_data_provider.PickValueInArray({0x04, 0x06, 0x07});
24+    }
25+    std::vector<uint8_t> pk_data = ConsumeFixedLengthByteVector(fuzzed_data_provider, pk_type >= 0x04 ? 65 : 33);
26+    pk_data[0] = pk_type;
27+    CPubKey pk{pk_data.begin(), pk_data.end()};

Sjors commented at 9:47 am on August 29, 2023:

Maybe add a comment to this helper function that the resulting key is not necessarily valid. Or add a const bool valid argument which loops until IsValid() passes.

brunoerg commented at 12:19 pm on August 29, 2023:

which loops until IsValid() passes

I don’t think it would be good in fuzzing.

Perhaps something like:

 0diff --git a/src/test/fuzz/util.cpp b/src/test/fuzz/util.cpp
 1index 1eb3583669..454e34f10f 100644
 2--- a/src/test/fuzz/util.cpp
 3+++ b/src/test/fuzz/util.cpp
 4@@ -119,6 +119,7 @@ CScript ConsumeScript(FuzzedDataProvider& fuzzed_data_provider, const bool maybe
 5                     int num_data{fuzzed_data_provider.ConsumeIntegralInRange(1, 22)};
 6                     while (num_data--) {
 7                         const auto pubkey = ConsumePubKey(fuzzed_data_provider, fuzzed_data_provider.ConsumeBool());
 8+                        if (!pubkey.IsValid()) return;
 9                         r_script << std::vector<uint8_t>(pubkey.data(), pubkey.data() + pubkey.size());
10                     }
11                     r_script << fuzzed_data_provider.ConsumeIntegralInRange<int64_t>(0, 22);

josibake commented at 6:01 pm on August 29, 2023:

I don’t think we need / want strictly valid pubkeys in a fuzzing context. The point of this function is to return “well-formed” but not necessarily valid pubkeys. By well-formed, I mean they have some sort of parity byte, indicating even or oddness, which also indicates the type of pubkey: compressed, uncompressed, hybrid.

maflcko commented at 9:51 am on August 29, 2023: member

There’s another behavior change here, probably a good one: while generating N keys, they’re now completely different. Previously we would only change the last byte to make them unique.

No, they are not guaranteed to. Now they will consume bytes for their whole length, which may exhaust the available bytes faster and thus result in all-zeros and also bloat the fuzz inputs git folder. Previously they consumed no additional bytes.

Not sure why this is called a refactor, when everything is changed.

Sjors commented at 9:52 am on August 29, 2023: member

Now they will consume bytes for their whole length, which may exhaust the available bytes faster and thus result in all-zeros and also bloat the fuzz inputs git folder

Ah, that’s not good.

in src/test/fuzz/util.cpp:121 in bc5b39375a outdated

125-                        auto& pubkey{fuzzed_data_provider.ConsumeBool() ? pubkey_uncomp : pubkey_comp};
126-                        if (fuzzed_data_provider.ConsumeBool()) {
127-                            pubkey.back() = num_data; // Make each pubkey different
128-                        }
129-                        r_script << pubkey;
130+                        const auto pubkey = ConsumePubKey(fuzzed_data_provider, fuzzed_data_provider.ConsumeBool());

Sjors commented at 9:57 am on August 29, 2023:

Given @MarcoFalke’s point about having to be frugal with consuming input bytes, maybe move this out of the loop and then have the loop tweak the public key. Like before it would start with one compressed and one uncompressed key. We’ll still get diversity in {0x04, 0x06, 0x07} and {0x02, 0x03} combinations between multiple runs.

josibake commented at 5:59 pm on August 29, 2023:

I’ve updated it so that the function now returns a byte vector (which can be mutated if need be) and made it so the function can read from the buffer (as in the multisig example), or be passed some random byte data by calling the Consume{Random,Fixed}LengthByteVector functions

josibake force-pushed on Aug 29, 2023

josibake renamed this:
~~fuzz, refactor: add ConsumePubKey util function~~
fuzz, refactor: add ConstructPubKeyBytes util function
on Aug 29, 2023

josibake commented at 5:56 pm on August 29, 2023: member

Not sure why this is called a refactor, when everything is changed.

That was an oversite on my part. I’ve updated the function to use the buffer as before, so the first commit should be just a refactor now. I also split out adding the missing hybrid pubkey byte into it’s own commit.

josibake commented at 6:01 pm on August 29, 2023: member

Maybe add 0x07 in a separate commit.

There’s another behavior change here, probably a good one: while generating N keys, they’re now completely different. > Previously we would only change the last byte to make them unique.

Done! The first commit should now be strictly a refactor.

in src/test/fuzz/util.cpp:17 in f4a51ac2c1 outdated

13@@ -14,6 +14,19 @@
14 
15 #include <memory>
16 
17+std::vector<uint8_t> ConstructPubKeyBytes(FuzzedDataProvider& fuzzed_data_provider, const std::vector<uint8_t> byte_data, const bool compressed) noexcept

maflcko commented at 6:21 pm on August 29, 2023:

May be better to use a const reference, or a span<const> (read-only view), instead of creating a full copy and marking it const

josibake force-pushed on Aug 30, 2023

josibake commented at 8:02 am on August 30, 2023: member

CI failure seems unrelated to the changes here unless I’m missing something. @hebasto ?

Sjors commented at 9:03 am on August 30, 2023: member

ACK ab94167acd19ff2198de662a8c4ab5d73292bf1c

From libsecp256k1:

0/** Prefix byte used to tag various encoded curvepoints for specific purposes */
1#define SECP256K1_TAG_PUBKEY_EVEN 0x02
2#define SECP256K1_TAG_PUBKEY_ODD 0x03
3#define SECP256K1_TAG_PUBKEY_UNCOMPRESSED 0x04
4#define SECP256K1_TAG_PUBKEY_HYBRID_EVEN 0x06
5#define SECP256K1_TAG_PUBKEY_HYBRID_ODD 0x07

We now cover all of these.

feature_fee_estimation.py failed in the Win64 CI, which should be unrelated to the fuzzer change.

hebasto commented at 9:05 am on August 30, 2023: member

CI failure seems unrelated to the changes here unless I’m missing something. @hebasto ?

Restarted.

josibake commented at 11:15 am on August 30, 2023: member

CI failure seems unrelated to the changes here unless I’m missing something. @hebasto ?

Restarted.

Still failing but with a different functional test failure now, which seems to be related to timeouts.

Sjors commented at 2:16 pm on August 30, 2023: member

It’s green now.

in src/test/fuzz/util.cpp:17 in ab94167acd outdated

13@@ -14,6 +14,19 @@
14 
15 #include <memory>
16 
17+std::vector<uint8_t> ConstructPubKeyBytes(FuzzedDataProvider& fuzzed_data_provider, const std::vector<uint8_t>& byte_data, const bool compressed) noexcept

maflcko commented at 2:29 pm on August 30, 2023:

style-nit: Would still prefer Span<const uint8_t> :sweat_smile:

josibake commented at 3:20 pm on August 30, 2023:

Updated! Sorry, misread your first comment as having a preference for const bla& over Span<const bla>. I agree span makes more sense here

maflcko approved

maflcko commented at 2:30 pm on August 30, 2023: member

lgtm, all good.

Though, refactor is still wrong, because the fuzz input format is changed.

Generally I don’t think the refactor label makes any sense for (unit) tests, because end users in production are never affected by test-only changes.

josibake force-pushed on Aug 30, 2023

josibake renamed this:
~~fuzz, refactor: add ConstructPubKeyBytes util function~~
fuzz: add ConstructPubKeyBytes util function
on Aug 30, 2023

DrahtBot added the label Tests on Aug 30, 2023

josibake commented at 3:23 pm on August 30, 2023: member

Though, refactor is still wrong, because the fuzz input format is changed.

I removed it from the title, but feel that it’s still useful on the commit message. Using the refactor label, to me, is about communicating intent to the reviewer. The first commit in this PR was not meant to change behavior, and calling it a refactor made it more obvious to reviewers that some behavior had in fact changed.

Adding 0x6 is definitely not a refactor (tho still a useful change, imo), which is why I’ve removed refactor from the title.

in src/test/fuzz/util.cpp:23 in c5865603b9 outdated

18+{
19+    uint8_t pk_type;
20+    if (compressed) {
21+        pk_type = fuzzed_data_provider.PickValueInArray({0x02, 0x03});
22+    } else {
23+        pk_type = fuzzed_data_provider.PickValueInArray({0x04, 0x07});

maflcko commented at 3:25 pm on August 30, 2023:

refactor is still wrong in this commit, because you are removing support for 6 and then adding it back in the next commit, why? Would be easier to just have one commit and not change it back and forth.

josibake commented at 3:38 pm on August 30, 2023:

yeah, reading again more carefully, the code before was doing ConsumeIntegralInRange(4, 7), which would have included 5 and 6, where 5 is nonsense in this context.

So probably fine to just squash the two commits and mention that it’s not strictly a refactor because we are removing a nonsense value from the input.

Sjors commented at 3:42 pm on August 30, 2023:

Removing nonsense in the first commit is fine. But I still prefer the keep the second commit seperate.

maflcko commented at 3:52 pm on August 30, 2023:

In any case, #28361#issue-1871131955 would also need to be adjusted

josibake commented at 4:01 pm on August 30, 2023:

Updated the comment!

in src/test/fuzz/util.cpp:17 in c5865603b9 outdated

13@@ -14,6 +14,19 @@
14 
15 #include <memory>
16 
17+std::vector<uint8_t> ConstructPubKeyBytes(FuzzedDataProvider& fuzzed_data_provider, Span<const uint8_t> byte_data, const bool compressed) noexcept

maflcko commented at 3:25 pm on August 30, 2023:

nit: Could name it pk_data over byte_data?

josibake commented at 3:42 pm on August 30, 2023:

I prefer byte_data because we are reading in an array of bytes and turning it into pk_data by adding a prefix byte to indicate compressed or uncompressed. At the moment we read it in, it is not pk_data. We turn it into pk_data (i.e. a byte array with the correct prefix), and then that is returned and can be serialized or turned into a CPubKey object.

maflcko approved

maflcko commented at 3:26 pm on August 30, 2023: member

Adding 0x6 is definitely not a refactor (tho still a useful change, imo), which is why I’ve removed refactor from the title.

6 was already handled before, no?

fuzz: add ConstructPubKeyBytes function

Today, this code only has one spot where it needs well-formed pubkeys,
but future PRs will want to reuse this code.

Add a function which creates a well-formed byte array that can be turned
into a pubkey. It is not required that the pubkey is valid, just that it
can be recognized as a compressed or uncompressed pubkey.

Note: while the main intent of this commit is to wrap the existing
logic into a function, it also switches to `PickValueFromArray` so that
we are only choosing one of 0x04, 0x06, or 0x07. The previous code,
`ConsumeIntegralInRange` would have also picked 0x05, which is not
definied in the context of compressed vs uncompressed keys.

See https://bitcoin.stackexchange.com/questions/57855/c-secp256k1-what-do-prefixes-0x06-and-0x07-in-an-uncompressed-public-key-signif
for more details.

1580e3be83

josibake force-pushed on Aug 30, 2023

josibake commented at 3:57 pm on August 30, 2023: member

@MarcoFalke squashed and removed the refactor label since we will no longer be creating pubkeys with an 0x05 prefix after this change.

maflcko approved

maflcko commented at 4:07 pm on August 30, 2023: member

Seems fine to remove 5. The reason to have it was to also create “invalid” data and see if it causes a crash. But no strong opinion, as it may be covered by other fuzz tests already.

josibake commented at 4:23 pm on August 30, 2023: member

The reason to have it was to also create “invalid” data and see if it causes a crash

That doesn’t match the comment, // Set first byte for GetLen() to pass, so if that was the intention for this specific test case, we should probably add a step that randomly mutates the bytes by adding an invalid prefix or just reads from the byte buffer without using the function.

Going forward, I think it’s better to have a function just for creating well-formed pubkey data (i.e. correct prefixes) and then use random or invalid data for cases where we actually want to test for that.

Sjors commented at 8:11 am on August 31, 2023: member

Going forward, I think it’s better to have a function just for creating well-formed pubkey data (i.e. correct prefixes) and then use random or invalid data for cases where we actually want to test for that.

That makes sense, but since we previously covered 0x05 (if only by accident), the ConsumeScript function should occasionally set 0x05 (or some other invalid value) as the prefix.

maflcko commented at 10:42 am on August 31, 2023: member

In theory malformed data can also be covered by the outer wrapper, see

0                [&] {
1                    // Insert byte vector directly to allow malformed or unparsable scripts
2                    r_script.insert(r_script.end(), buffer.begin(), buffer.begin() + fuzzed_data_provider.ConsumeIntegralInRange(0U, MAX_BUFFER_SZ));
3                },

Absent data, I think anything is fine here. I just wanted to clarify that what you are changing is intentional.

josibake commented at 11:07 am on August 31, 2023: member

That makes sense, but since we previously covered 0x05 (if only by accident), the ConsumeScript function should occasionally set 0x05 (or some other invalid value) as the prefix.

I’m inclined to leave this pull as is, and if we find there is a real need for mutating with an invalid prefix inside the “push multisig” case, we address it in a follow-up. The way things are written now seems more clear to me: we have a case for sending random bytes to r_script and we have a case for creating a well-formed but not necessarily valid multisig script. This seems to better match the intent of the ConsumeScript function, based on the comments and surrounding code.

josibake commented at 12:50 pm on August 31, 2023: member

@MarcoFalke @Sjors can you let me know if this looks good as-is (where some things can be addressed in follow-ups), or if you’d like to see more changes in this PR before merging?

Sjors commented at 1:07 pm on August 31, 2023: member

Nothing blocking, but haven’t re-reviewed the code yet.

DrahtBot added the label CI failed on Sep 3, 2023

DrahtBot removed the label CI failed on Sep 5, 2023

josibake commented at 7:39 am on September 7, 2023: member

ping @Sjors @MarcoFalke

Sjors commented at 2:33 pm on September 7, 2023: member

ACK 1580e3be83bd03985b2f288ed70de510903068d9

fanquake merged this on Sep 7, 2023

fanquake closed this on Sep 7, 2023

Frank-GER referenced this in commit 086142fb64 on Sep 8, 2023

josibake deleted the branch on Jan 26, 2024

bitcoin locked this on Jan 25, 2025

fuzz: add ConstructPubKeyBytes util function #28361

Reviews

Conflicts