fuzz: Add minisketch fuzz test #23496

pull MarcoFalke wants to merge 1 commits into bitcoin:master from MarcoFalke:2111-fuzzMiniSketch changing 2 files +66 −0
  1. MarcoFalke commented at 3:01 pm on November 12, 2021: member
  2. MarcoFalke force-pushed on Nov 12, 2021
  3. DrahtBot added the label Build system on Nov 12, 2021
  4. MarcoFalke removed the label Build system on Nov 12, 2021
  5. MarcoFalke added the label Tests on Nov 12, 2021
  6. fanquake requested review from naumenkogs on Nov 16, 2021
  7. fanquake requested review from sipa on Nov 16, 2021
  8. in src/test/fuzz/minisketch.cpp:27 in fa18631484 outdated
    22+    // Fill two sets and keep the difference in a map
    23+    std::map<uint32_t, bool> diff;
    24+    LIMITED_WHILE(fuzzed_data_provider.ConsumeBool(), 10000)
    25+    {
    26+        const auto entry{fuzzed_data_provider.ConsumeIntegralInRange<uint32_t>(1, std::numeric_limits<uint32_t>::max() - 1)};
    27+        const auto KeepDiff{[&] {
    


    mjdietzx commented at 3:53 pm on November 16, 2021:

    I’m a little confused about this. At first I was wondering why KeepDiff couldn’t just simply do:

    0diff[entry] = true;
    

    But it seems that you’re actually trying to account for the case where we encounter the same entry value at different iterations. ie sketch_a.Add(7); and then later on sketch_b.Add(7);, and you want to track that this is not a difference b/w the sets. Am I correct?

    If I’m understanding this correctly I have some other questions/concerns


    MarcoFalke commented at 4:24 pm on November 16, 2021:

    Am I correct?

    Yes


    mjdietzx commented at 4:31 pm on November 16, 2021:

    Ok, then my concern is: collisions like this should be so rare that I’m not even sure we should have the logic to catch these cases (maybe just a comment would be good enough)?

    But, if we do want to account for these collisions correctly, then I’m not sure this does the job. Bc/ in this case if you add the same element to sketch_a twice, it looks like the diff gets flipped. etc…

    All things considered, I’m wondering if it will be most robust to have a set_a and set_b alongside the sketches, num_differences is then just the size of that set after the LIMITED_WHILE


    MarcoFalke commented at 4:33 pm on November 16, 2021:
    Adding the same element twice to a minisketch set will erase it. This is a feature, not a rare edge case.

    mjdietzx commented at 4:36 pm on November 16, 2021:
    Ah wow, interesting. I kinda wondered this while reading the README but figured it wasn’t the case. Anyways, would it make sense to reduce the max possible value of const auto entry{fuzzed_data_provider.ConsumeIntegralInRange<uint32_t>(1, std::numeric_limits<uint32_t>::max() - 1)}; so that we see these “collisions” normally and get more test coverage?

    MarcoFalke commented at 4:37 pm on November 16, 2021:
    I expect the fuzz engine to copy the bytes in the input, so a collision might even be the “normal” case.

    sipa commented at 4:39 pm on November 16, 2021:

    What @MarcoFalke said, but also: “rare” isn’t relevant for fuzz tests. The inputs are not uniformly distributed; the fuzzer is actively trying to bias the inputs to trigger “rare” edge cases.

    EDIT: wrote this seeing marco’s latest comment above.


    mjdietzx commented at 4:48 pm on November 16, 2021:
    Interesting, thanks
  9. in src/test/fuzz/minisketch.cpp:56 in fa18631484 outdated
    52+    sketch_br.Deserialize(sketch_b.Serialize());
    53+
    54+    Minisketch sketch_diff{std::move(fuzzed_data_provider.ConsumeBool() ? sketch_a : sketch_ar)};
    55+    sketch_diff.Merge(fuzzed_data_provider.ConsumeBool() ? sketch_b : sketch_br);
    56+
    57+    if (capacity >= num_diff) {
    


    mjdietzx commented at 4:00 pm on November 16, 2021:
    We’ll probably only hit this condition ~3% of the time, right (very roughly)?

    mjdietzx commented at 4:15 pm on November 16, 2021:
    Am I correct in my understanding of Minisketch, that if if (capacity < num_diff) then ssize_t num_differences = minisketch_decode(sketch_diff, capacity, differences); will return <0 indicating that there are more than capacity differences in the set? Can we add an else clause and assert this?

    MarcoFalke commented at 4:26 pm on November 16, 2021:

    We’ll probably only hit this condition ~3% of the time, right (very roughly)?

    That depends on the fuzz engine’s randomness and exploration algorithm.


    MarcoFalke commented at 4:27 pm on November 16, 2021:

    will return <0

    I had the same thinking, but it may still succeed “accidentally” on some inputs.


    mjdietzx commented at 4:33 pm on November 16, 2021:
    Yep that makes sense. That’s where I arrived after reading the Minisketch README more, but wanted to double check anyways
  10. mjdietzx commented at 4:23 pm on November 16, 2021: contributor
    Concept and Approach ACK
  11. mjdietzx commented at 4:38 pm on November 16, 2021: contributor
    Code Review ACK fa186314840a7da165adbfa86a316d8a4c34b706
  12. MarcoFalke force-pushed on Nov 16, 2021
  13. MarcoFalke commented at 4:43 pm on November 16, 2021: member
    Rebased to fix silent merge conflict. Should be trivial to re-ACK
  14. in src/test/fuzz/minisketch.cpp:60 in faece2197c outdated
    55+    sketch_diff.Merge(fuzzed_data_provider.ConsumeBool() ? sketch_b : sketch_br);
    56+
    57+    if (capacity >= num_diff) {
    58+        const auto max_elements{fuzzed_data_provider.ConsumeIntegralInRange<size_t>(num_diff, capacity)};
    59+        const auto dec{*Assert(sketch_diff.Decode(max_elements))};
    60+        std::set<uint32_t> decoded_diff;
    


    sipa commented at 4:52 pm on November 16, 2021:

    I find this piece of code to compare dec with the expected diff a bit convoluted. What about:

    0assert(dec.size() == num_diff));
    1for (auto d : dec) assert(diff[d]);
    

    ?


    MarcoFalke commented at 6:19 pm on November 16, 2021:
    Good idea, done.
  15. fuzz: Add minisketch fuzz test fa74d45306
  16. in src/test/fuzz/minisketch.cpp:47 in faece2197c outdated
    42+                sketch_a.Add(entry);
    43+                sketch_b.Add(entry);
    44+            });
    45+    }
    46+    const auto num_diff{
    47+        std::accumulate(diff.begin(), diff.end(), size_t{0}, [](size_t n, const std::pair<const uint32_t, bool>& e) { return n + e.second; })};
    


    sipa commented at 5:04 pm on November 16, 2021:
    FWIW, you can use [](auto n, const auto e&) { return n + e.second; } as lambda too.

    MarcoFalke commented at 6:19 pm on November 16, 2021:

    Nice. I didn’t know this was added in C++14.

    Done.

  17. MarcoFalke force-pushed on Nov 16, 2021
  18. mjdietzx commented at 8:16 pm on November 16, 2021: contributor
    re-ACK fa74d45
  19. in src/test/fuzz/minisketch.cpp:28 in fa74d45306
    23+    std::map<uint32_t, bool> diff;
    24+    LIMITED_WHILE(fuzzed_data_provider.ConsumeBool(), 10000)
    25+    {
    26+        const auto entry{fuzzed_data_provider.ConsumeIntegralInRange<uint32_t>(1, std::numeric_limits<uint32_t>::max() - 1)};
    27+        const auto KeepDiff{[&] {
    28+            bool& mut{diff[entry]};
    


    sipa commented at 8:33 pm on November 16, 2021:
    Micronit: diff[entry] ^= true; or diff[entry] ^= 1; also works (feel free to ignore).
  20. sipa commented at 8:33 pm on November 16, 2021: member
    utACK fa74d4530615cfa02cf32a16fab6b13908266e6f
  21. MarcoFalke merged this on Nov 17, 2021
  22. MarcoFalke closed this on Nov 17, 2021

  23. MarcoFalke deleted the branch on Nov 17, 2021
  24. sidhujag referenced this in commit 244233d62a on Nov 17, 2021
  25. DrahtBot locked this on Nov 17, 2022


MarcoFalke mjdietzx sipa


naumenkogs

Labels
Tests


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me