fuzz: add ConstructPubKeyBytes util function #28361

pull josibake wants to merge 1 commits into bitcoin:master from josibake:refactor-fuzzer-util changing 1 files +16 −7
  1. josibake commented at 8:12 am on August 29, 2023: member

    In #28246 and #28122 , we add a PubKeyDestination and a V0SilentPaymentsDestination. Both of these PRs update fuzz/util.cpp and need a way to create well-formed pubkeys. Currently in fuzz/util.cpp, we have some logic for creating pubkeys in the multisig data provider. This logic is duplicated in #28246 and duplicated again in #28122. Seems much better to have a ConstructPubKeyBytes function that both PRs (and any future work) can reuse.

    This PR introduces a function to do this and has the existing code use it. While the purpose is to introduce a utility function, the previous multisig code used ConsumeIntegralInRange(4, 7) which would have created some uncompressed pubkeys with the prefix 0x05, which is incorrect (see https://bitcoin.stackexchange.com/questions/57855/c-secp256k1-what-do-prefixes-0x06-and-0x07-in-an-uncompressed-public-key-signif)

    tldr; using PickValueFromArray is more correct as it limits to the set of defined prefixes for compressed and uncompressed pubkeys.

  2. DrahtBot commented at 8:12 am on August 29, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK Sjors

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28241 (Silent payment index (for light wallets and consistency check) by Sjors)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. Sjors commented at 9:45 am on August 29, 2023: member

    ~utACK bc5b39375a36f5b76ebd1d23faab0eea077f08ef~

    Maybe add 0x07 in a separate commit.

    There’s another behavior change here, probably a good one: while generating N keys, they’re now completely different. Previously we would only change the last byte to make them unique.

  4. in src/test/fuzz/util.cpp:27 in bc5b39375a outdated
    22+    } else {
    23+        pk_type = fuzzed_data_provider.PickValueInArray({0x04, 0x06, 0x07});
    24+    }
    25+    std::vector<uint8_t> pk_data = ConsumeFixedLengthByteVector(fuzzed_data_provider, pk_type >= 0x04 ? 65 : 33);
    26+    pk_data[0] = pk_type;
    27+    CPubKey pk{pk_data.begin(), pk_data.end()};
    


    Sjors commented at 9:47 am on August 29, 2023:
    Maybe add a comment to this helper function that the resulting key is not necessarily valid. Or add a const bool valid argument which loops until IsValid() passes.

    brunoerg commented at 12:19 pm on August 29, 2023:

    which loops until IsValid() passes

    I don’t think it would be good in fuzzing.

    Perhaps something like:

     0diff --git a/src/test/fuzz/util.cpp b/src/test/fuzz/util.cpp
     1index 1eb3583669..454e34f10f 100644
     2--- a/src/test/fuzz/util.cpp
     3+++ b/src/test/fuzz/util.cpp
     4@@ -119,6 +119,7 @@ CScript ConsumeScript(FuzzedDataProvider& fuzzed_data_provider, const bool maybe
     5                     int num_data{fuzzed_data_provider.ConsumeIntegralInRange(1, 22)};
     6                     while (num_data--) {
     7                         const auto pubkey = ConsumePubKey(fuzzed_data_provider, fuzzed_data_provider.ConsumeBool());
     8+                        if (!pubkey.IsValid()) return;
     9                         r_script << std::vector<uint8_t>(pubkey.data(), pubkey.data() + pubkey.size());
    10                     }
    11                     r_script << fuzzed_data_provider.ConsumeIntegralInRange<int64_t>(0, 22);
    

    josibake commented at 6:01 pm on August 29, 2023:
    I don’t think we need / want strictly valid pubkeys in a fuzzing context. The point of this function is to return “well-formed” but not necessarily valid pubkeys. By well-formed, I mean they have some sort of parity byte, indicating even or oddness, which also indicates the type of pubkey: compressed, uncompressed, hybrid.
  5. maflcko commented at 9:51 am on August 29, 2023: member

    There’s another behavior change here, probably a good one: while generating N keys, they’re now completely different. Previously we would only change the last byte to make them unique.

    No, they are not guaranteed to. Now they will consume bytes for their whole length, which may exhaust the available bytes faster and thus result in all-zeros and also bloat the fuzz inputs git folder. Previously they consumed no additional bytes.

    Not sure why this is called a refactor, when everything is changed.

  6. Sjors commented at 9:52 am on August 29, 2023: member

    Now they will consume bytes for their whole length, which may exhaust the available bytes faster and thus result in all-zeros and also bloat the fuzz inputs git folder

    Ah, that’s not good.

  7. in src/test/fuzz/util.cpp:121 in bc5b39375a outdated
    125-                        auto& pubkey{fuzzed_data_provider.ConsumeBool() ? pubkey_uncomp : pubkey_comp};
    126-                        if (fuzzed_data_provider.ConsumeBool()) {
    127-                            pubkey.back() = num_data; // Make each pubkey different
    128-                        }
    129-                        r_script << pubkey;
    130+                        const auto pubkey = ConsumePubKey(fuzzed_data_provider, fuzzed_data_provider.ConsumeBool());
    


    Sjors commented at 9:57 am on August 29, 2023:
    Given @MarcoFalke’s point about having to be frugal with consuming input bytes, maybe move this out of the loop and then have the loop tweak the public key. Like before it would start with one compressed and one uncompressed key. We’ll still get diversity in {0x04, 0x06, 0x07} and {0x02, 0x03} combinations between multiple runs.

    josibake commented at 5:59 pm on August 29, 2023:
    I’ve updated it so that the function now returns a byte vector (which can be mutated if need be) and made it so the function can read from the buffer (as in the multisig example), or be passed some random byte data by calling the Consume{Random,Fixed}LengthByteVector functions
  8. josibake force-pushed on Aug 29, 2023
  9. josibake renamed this:
    fuzz, refactor: add ConsumePubKey util function
    fuzz, refactor: add ConstructPubKeyBytes util function
    on Aug 29, 2023
  10. josibake commented at 5:56 pm on August 29, 2023: member

    Not sure why this is called a refactor, when everything is changed.

    That was an oversite on my part. I’ve updated the function to use the buffer as before, so the first commit should be just a refactor now. I also split out adding the missing hybrid pubkey byte into it’s own commit.

  11. josibake commented at 6:01 pm on August 29, 2023: member

    Maybe add 0x07 in a separate commit.

    There’s another behavior change here, probably a good one: while generating N keys, they’re now completely different. > Previously we would only change the last byte to make them unique.

    Done! The first commit should now be strictly a refactor.

  12. in src/test/fuzz/util.cpp:17 in f4a51ac2c1 outdated
    13@@ -14,6 +14,19 @@
    14 
    15 #include <memory>
    16 
    17+std::vector<uint8_t> ConstructPubKeyBytes(FuzzedDataProvider& fuzzed_data_provider, const std::vector<uint8_t> byte_data, const bool compressed) noexcept
    


    maflcko commented at 6:21 pm on August 29, 2023:
    May be better to use a const reference, or a span<const> (read-only view), instead of creating a full copy and marking it const
  13. josibake force-pushed on Aug 30, 2023
  14. josibake commented at 8:02 am on August 30, 2023: member
    CI failure seems unrelated to the changes here unless I’m missing something. @hebasto ?
  15. Sjors commented at 9:03 am on August 30, 2023: member

    ACK ab94167acd19ff2198de662a8c4ab5d73292bf1c

    From libsecp256k1:

    0/** Prefix byte used to tag various encoded curvepoints for specific purposes */
    1#define SECP256K1_TAG_PUBKEY_EVEN 0x02
    2#define SECP256K1_TAG_PUBKEY_ODD 0x03
    3#define SECP256K1_TAG_PUBKEY_UNCOMPRESSED 0x04
    4#define SECP256K1_TAG_PUBKEY_HYBRID_EVEN 0x06
    5#define SECP256K1_TAG_PUBKEY_HYBRID_ODD 0x07
    

    We now cover all of these.

    feature_fee_estimation.py failed in the Win64 CI, which should be unrelated to the fuzzer change.

  16. hebasto commented at 9:05 am on August 30, 2023: member

    CI failure seems unrelated to the changes here unless I’m missing something. @hebasto ?

    Restarted.

  17. josibake commented at 11:15 am on August 30, 2023: member

    CI failure seems unrelated to the changes here unless I’m missing something. @hebasto ?

    Restarted.

    Still failing but with a different functional test failure now, which seems to be related to timeouts.

  18. Sjors commented at 2:16 pm on August 30, 2023: member
    It’s green now.
  19. in src/test/fuzz/util.cpp:17 in ab94167acd outdated
    13@@ -14,6 +14,19 @@
    14 
    15 #include <memory>
    16 
    17+std::vector<uint8_t> ConstructPubKeyBytes(FuzzedDataProvider& fuzzed_data_provider, const std::vector<uint8_t>& byte_data, const bool compressed) noexcept
    


    maflcko commented at 2:29 pm on August 30, 2023:
    style-nit: Would still prefer Span<const uint8_t> :sweat_smile:

    josibake commented at 3:20 pm on August 30, 2023:
    Updated! Sorry, misread your first comment as having a preference for const bla& over Span<const bla>. I agree span makes more sense here
  20. maflcko approved
  21. maflcko commented at 2:30 pm on August 30, 2023: member

    lgtm, all good.

    Though, refactor is still wrong, because the fuzz input format is changed.

    Generally I don’t think the refactor label makes any sense for (unit) tests, because end users in production are never affected by test-only changes.

  22. josibake force-pushed on Aug 30, 2023
  23. josibake renamed this:
    fuzz, refactor: add ConstructPubKeyBytes util function
    fuzz: add ConstructPubKeyBytes util function
    on Aug 30, 2023
  24. DrahtBot added the label Tests on Aug 30, 2023
  25. josibake commented at 3:23 pm on August 30, 2023: member

    Though, refactor is still wrong, because the fuzz input format is changed.

    I removed it from the title, but feel that it’s still useful on the commit message. Using the refactor label, to me, is about communicating intent to the reviewer. The first commit in this PR was not meant to change behavior, and calling it a refactor made it more obvious to reviewers that some behavior had in fact changed.

    Adding 0x6 is definitely not a refactor (tho still a useful change, imo), which is why I’ve removed refactor from the title.

  26. in src/test/fuzz/util.cpp:23 in c5865603b9 outdated
    18+{
    19+    uint8_t pk_type;
    20+    if (compressed) {
    21+        pk_type = fuzzed_data_provider.PickValueInArray({0x02, 0x03});
    22+    } else {
    23+        pk_type = fuzzed_data_provider.PickValueInArray({0x04, 0x07});
    


    maflcko commented at 3:25 pm on August 30, 2023:
    refactor is still wrong in this commit, because you are removing support for 6 and then adding it back in the next commit, why? Would be easier to just have one commit and not change it back and forth.

    josibake commented at 3:38 pm on August 30, 2023:

    yeah, reading again more carefully, the code before was doing ConsumeIntegralInRange(4, 7), which would have included 5 and 6, where 5 is nonsense in this context.

    So probably fine to just squash the two commits and mention that it’s not strictly a refactor because we are removing a nonsense value from the input.


    Sjors commented at 3:42 pm on August 30, 2023:
    Removing nonsense in the first commit is fine. But I still prefer the keep the second commit seperate.

    maflcko commented at 3:52 pm on August 30, 2023:
    In any case, #28361#issue-1871131955 would also need to be adjusted

    josibake commented at 4:01 pm on August 30, 2023:
    Updated the comment!
  27. in src/test/fuzz/util.cpp:17 in c5865603b9 outdated
    13@@ -14,6 +14,19 @@
    14 
    15 #include <memory>
    16 
    17+std::vector<uint8_t> ConstructPubKeyBytes(FuzzedDataProvider& fuzzed_data_provider, Span<const uint8_t> byte_data, const bool compressed) noexcept
    


    maflcko commented at 3:25 pm on August 30, 2023:
    nit: Could name it pk_data over byte_data?

    josibake commented at 3:42 pm on August 30, 2023:
    I prefer byte_data because we are reading in an array of bytes and turning it into pk_data by adding a prefix byte to indicate compressed or uncompressed. At the moment we read it in, it is not pk_data. We turn it into pk_data (i.e. a byte array with the correct prefix), and then that is returned and can be serialized or turned into a CPubKey object.
  28. maflcko approved
  29. maflcko commented at 3:26 pm on August 30, 2023: member

    Adding 0x6 is definitely not a refactor (tho still a useful change, imo), which is why I’ve removed refactor from the title.

    6 was already handled before, no?

  30. fuzz: add ConstructPubKeyBytes function
    Today, this code only has one spot where it needs well-formed pubkeys,
    but future PRs will want to reuse this code.
    
    Add a function which creates a well-formed byte array that can be turned
    into a pubkey. It is not required that the pubkey is valid, just that it
    can be recognized as a compressed or uncompressed pubkey.
    
    Note: while the main intent of this commit is to wrap the existing
    logic into a function, it also switches to `PickValueFromArray` so that
    we are only choosing one of 0x04, 0x06, or 0x07. The previous code,
    `ConsumeIntegralInRange` would have also picked 0x05, which is not
    definied in the context of compressed vs uncompressed keys.
    
    See https://bitcoin.stackexchange.com/questions/57855/c-secp256k1-what-do-prefixes-0x06-and-0x07-in-an-uncompressed-public-key-signif
    for more details.
    1580e3be83
  31. josibake force-pushed on Aug 30, 2023
  32. josibake commented at 3:57 pm on August 30, 2023: member
    @MarcoFalke squashed and removed the refactor label since we will no longer be creating pubkeys with an 0x05 prefix after this change.
  33. maflcko approved
  34. maflcko commented at 4:07 pm on August 30, 2023: member
    Seems fine to remove 5. The reason to have it was to also create “invalid” data and see if it causes a crash. But no strong opinion, as it may be covered by other fuzz tests already.
  35. josibake commented at 4:23 pm on August 30, 2023: member

    The reason to have it was to also create “invalid” data and see if it causes a crash

    That doesn’t match the comment, // Set first byte for GetLen() to pass, so if that was the intention for this specific test case, we should probably add a step that randomly mutates the bytes by adding an invalid prefix or just reads from the byte buffer without using the function.

    Going forward, I think it’s better to have a function just for creating well-formed pubkey data (i.e. correct prefixes) and then use random or invalid data for cases where we actually want to test for that.

  36. Sjors commented at 8:11 am on August 31, 2023: member

    Going forward, I think it’s better to have a function just for creating well-formed pubkey data (i.e. correct prefixes) and then use random or invalid data for cases where we actually want to test for that.

    That makes sense, but since we previously covered 0x05 (if only by accident), the ConsumeScript function should occasionally set 0x05 (or some other invalid value) as the prefix.

  37. maflcko commented at 10:42 am on August 31, 2023: member

    In theory malformed data can also be covered by the outer wrapper, see

    0                [&] {
    1                    // Insert byte vector directly to allow malformed or unparsable scripts
    2                    r_script.insert(r_script.end(), buffer.begin(), buffer.begin() + fuzzed_data_provider.ConsumeIntegralInRange(0U, MAX_BUFFER_SZ));
    3                },
    

    Absent data, I think anything is fine here. I just wanted to clarify that what you are changing is intentional.

  38. josibake commented at 11:07 am on August 31, 2023: member

    That makes sense, but since we previously covered 0x05 (if only by accident), the ConsumeScript function should occasionally set 0x05 (or some other invalid value) as the prefix.

    I’m inclined to leave this pull as is, and if we find there is a real need for mutating with an invalid prefix inside the “push multisig” case, we address it in a follow-up. The way things are written now seems more clear to me: we have a case for sending random bytes to r_script and we have a case for creating a well-formed but not necessarily valid multisig script. This seems to better match the intent of the ConsumeScript function, based on the comments and surrounding code.

  39. josibake commented at 12:50 pm on August 31, 2023: member
    @MarcoFalke @Sjors can you let me know if this looks good as-is (where some things can be addressed in follow-ups), or if you’d like to see more changes in this PR before merging?
  40. Sjors commented at 1:07 pm on August 31, 2023: member
    Nothing blocking, but haven’t re-reviewed the code yet.
  41. DrahtBot added the label CI failed on Sep 3, 2023
  42. DrahtBot removed the label CI failed on Sep 5, 2023
  43. josibake commented at 7:39 am on September 7, 2023: member
  44. Sjors commented at 2:33 pm on September 7, 2023: member
    ACK 1580e3be83bd03985b2f288ed70de510903068d9
  45. fanquake merged this on Sep 7, 2023
  46. fanquake closed this on Sep 7, 2023

  47. Frank-GER referenced this in commit 086142fb64 on Sep 8, 2023
  48. josibake deleted the branch on Jan 26, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-01 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me