fuzz: Add minisketch fuzz test #23496
pull MarcoFalke wants to merge 1 commits into bitcoin:master from MarcoFalke:2111-fuzzMiniSketch changing 2 files +66 −0-
MarcoFalke commented at 3:01 pm on November 12, 2021: member
-
MarcoFalke force-pushed on Nov 12, 2021
-
DrahtBot added the label Build system on Nov 12, 2021
-
MarcoFalke removed the label Build system on Nov 12, 2021
-
MarcoFalke added the label Tests on Nov 12, 2021
-
fanquake requested review from naumenkogs on Nov 16, 2021
-
fanquake requested review from sipa on Nov 16, 2021
-
in src/test/fuzz/minisketch.cpp:27 in fa18631484 outdated
22+ // Fill two sets and keep the difference in a map 23+ std::map<uint32_t, bool> diff; 24+ LIMITED_WHILE(fuzzed_data_provider.ConsumeBool(), 10000) 25+ { 26+ const auto entry{fuzzed_data_provider.ConsumeIntegralInRange<uint32_t>(1, std::numeric_limits<uint32_t>::max() - 1)}; 27+ const auto KeepDiff{[&] {
mjdietzx commented at 3:53 pm on November 16, 2021:I’m a little confused about this. At first I was wondering why
KeepDiff
couldn’t just simply do:0diff[entry] = true;
But it seems that you’re actually trying to account for the case where we encounter the same
entry
value at different iterations. iesketch_a.Add(7);
and then later onsketch_b.Add(7);
, and you want to track that this is not a difference b/w the sets. Am I correct?If I’m understanding this correctly I have some other questions/concerns
MarcoFalke commented at 4:24 pm on November 16, 2021:Am I correct?
Yes
mjdietzx commented at 4:31 pm on November 16, 2021:Ok, then my concern is: collisions like this should be so rare that I’m not even sure we should have the logic to catch these cases (maybe just a comment would be good enough)?
But, if we do want to account for these collisions correctly, then I’m not sure this does the job. Bc/ in this case if you add the same element to
sketch_a
twice, it looks like the diff gets flipped. etc…All things considered, I’m wondering if it will be most robust to have a
set_a
andset_b
alongside the sketches,num_differences
is then just the size of that set after theLIMITED_WHILE
MarcoFalke commented at 4:33 pm on November 16, 2021:Adding the same element twice to a minisketch set will erase it. This is a feature, not a rare edge case.
mjdietzx commented at 4:36 pm on November 16, 2021:Ah wow, interesting. I kinda wondered this while reading the README but figured it wasn’t the case. Anyways, would it make sense to reduce the max possible value ofconst auto entry{fuzzed_data_provider.ConsumeIntegralInRange<uint32_t>(1, std::numeric_limits<uint32_t>::max() - 1)};
so that we see these “collisions” normally and get more test coverage?
MarcoFalke commented at 4:37 pm on November 16, 2021:I expect the fuzz engine to copy the bytes in the input, so a collision might even be the “normal” case.
sipa commented at 4:39 pm on November 16, 2021:What @MarcoFalke said, but also: “rare” isn’t relevant for fuzz tests. The inputs are not uniformly distributed; the fuzzer is actively trying to bias the inputs to trigger “rare” edge cases.
EDIT: wrote this seeing marco’s latest comment above.
mjdietzx commented at 4:48 pm on November 16, 2021:Interesting, thanksin src/test/fuzz/minisketch.cpp:56 in fa18631484 outdated
52+ sketch_br.Deserialize(sketch_b.Serialize()); 53+ 54+ Minisketch sketch_diff{std::move(fuzzed_data_provider.ConsumeBool() ? sketch_a : sketch_ar)}; 55+ sketch_diff.Merge(fuzzed_data_provider.ConsumeBool() ? sketch_b : sketch_br); 56+ 57+ if (capacity >= num_diff) {
mjdietzx commented at 4:00 pm on November 16, 2021:We’ll probably only hit this condition ~3% of the time, right (very roughly)?
mjdietzx commented at 4:15 pm on November 16, 2021:Am I correct in my understanding of Minisketch, that ifif (capacity < num_diff)
thenssize_t num_differences = minisketch_decode(sketch_diff, capacity, differences);
will return<0
indicating that there are more thancapacity
differences in the set? Can we add anelse
clause and assert this?
MarcoFalke commented at 4:26 pm on November 16, 2021:We’ll probably only hit this condition ~3% of the time, right (very roughly)?
That depends on the fuzz engine’s randomness and exploration algorithm.
MarcoFalke commented at 4:27 pm on November 16, 2021:will return <0
I had the same thinking, but it may still succeed “accidentally” on some inputs.
mjdietzx commented at 4:33 pm on November 16, 2021:Yep that makes sense. That’s where I arrived after reading the Minisketch README more, but wanted to double check anywaysmjdietzx commented at 4:23 pm on November 16, 2021: contributorConcept and Approach ACKmjdietzx commented at 4:38 pm on November 16, 2021: contributorCode Review ACK fa186314840a7da165adbfa86a316d8a4c34b706MarcoFalke force-pushed on Nov 16, 2021MarcoFalke commented at 4:43 pm on November 16, 2021: memberRebased to fix silent merge conflict. Should be trivial to re-ACKin src/test/fuzz/minisketch.cpp:60 in faece2197c outdated
55+ sketch_diff.Merge(fuzzed_data_provider.ConsumeBool() ? sketch_b : sketch_br); 56+ 57+ if (capacity >= num_diff) { 58+ const auto max_elements{fuzzed_data_provider.ConsumeIntegralInRange<size_t>(num_diff, capacity)}; 59+ const auto dec{*Assert(sketch_diff.Decode(max_elements))}; 60+ std::set<uint32_t> decoded_diff;
sipa commented at 4:52 pm on November 16, 2021:I find this piece of code to compare
dec
with the expected diff a bit convoluted. What about:0assert(dec.size() == num_diff)); 1for (auto d : dec) assert(diff[d]);
?
MarcoFalke commented at 6:19 pm on November 16, 2021:Good idea, done.fuzz: Add minisketch fuzz test fa74d45306in src/test/fuzz/minisketch.cpp:47 in faece2197c outdated
42+ sketch_a.Add(entry); 43+ sketch_b.Add(entry); 44+ }); 45+ } 46+ const auto num_diff{ 47+ std::accumulate(diff.begin(), diff.end(), size_t{0}, [](size_t n, const std::pair<const uint32_t, bool>& e) { return n + e.second; })};
sipa commented at 5:04 pm on November 16, 2021:FWIW, you can use[](auto n, const auto e&) { return n + e.second; }
as lambda too.
MarcoFalke commented at 6:19 pm on November 16, 2021:Nice. I didn’t know this was added in C++14.
Done.
MarcoFalke force-pushed on Nov 16, 2021mjdietzx commented at 8:16 pm on November 16, 2021: contributorre-ACK fa74d45in src/test/fuzz/minisketch.cpp:28 in fa74d45306
23+ std::map<uint32_t, bool> diff; 24+ LIMITED_WHILE(fuzzed_data_provider.ConsumeBool(), 10000) 25+ { 26+ const auto entry{fuzzed_data_provider.ConsumeIntegralInRange<uint32_t>(1, std::numeric_limits<uint32_t>::max() - 1)}; 27+ const auto KeepDiff{[&] { 28+ bool& mut{diff[entry]};
sipa commented at 8:33 pm on November 16, 2021:Micronit:diff[entry] ^= true;
ordiff[entry] ^= 1;
also works (feel free to ignore).sipa commented at 8:33 pm on November 16, 2021: memberutACK fa74d4530615cfa02cf32a16fab6b13908266e6fMarcoFalke merged this on Nov 17, 2021MarcoFalke closed this on Nov 17, 2021
MarcoFalke deleted the branch on Nov 17, 2021sidhujag referenced this in commit 244233d62a on Nov 17, 2021DrahtBot locked this on Nov 17, 2022
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 18:12 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me