Consider removing unnecessarily large inputs which are causing excessive corpus processing runtime #56

practicalswift commented at 10:38 AM on March 25, 2021: contributor

The following corpus directories contain some very large input files which unnecessarily cause the fuzzing runtime to exceed what feels reasonable.

"Unnecessarily large" in this context means that the presence of these very large input files does not add any coverage beyond what is already achieved by processing only the significantly smaller input files already in the corpus.

| Corpus directory | Largest coverage-increasing file in directory | Largest input file in directory |
| --- | --- | --- |
| `addrman` | 143 118 bytes | 1 048 576 bytes |
| `banman` | 49 814 bytes | 50 125 bytes |
| `block` | 1 000 431 bytes | 1 048 576 bytes |
| `prevector` | 709 301 bytes | 709 301 bytes |
| `process_messages` | 984 807 bytes | 3 984 182 bytes |
| `script_flags` | 961 741 bytes | 1 855 780 bytes |
| `transaction` | 111 109 bytes | 111 872 bytes |

Perhaps we should consider removing these excessively large inputs that do not add any coverage at the moment and are unlikely to do so in the future?
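To make the candidates concrete, here is a minimal sketch that lists every file in a corpus directory that is larger than a given byte limit, largest first. The function name `oversized_inputs` and the idea of using the "largest coverage-increasing file" size as the limit are assumptions for illustration. Size alone does not prove a file adds no coverage, so the result is only a list of candidates to re-check.

```python
from pathlib import Path


def oversized_inputs(corpus_dir, size_limit):
    """Return (size_in_bytes, path) pairs for regular files in corpus_dir
    that are strictly larger than size_limit, sorted largest first.

    Hypothetical helper: it only looks at file size and knows nothing
    about coverage.
    """
    hits = []
    for path in Path(corpus_dir).iterdir():
        if not path.is_file():
            continue
        size = path.stat().st_size
        if size > size_limit:
            hits.append((size, path))
    # Sort on size only, so Path objects are never compared.
    return sorted(hits, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    # Example: the addrman row, using 143 118 bytes as the limit.
    for size, path in oversized_inputs("qa-assets/fuzz_seed_corpus/addrman", 143_118):
        print(f"{size:>10} {path}")
```

The limit is passed in explicitly rather than hard-coded, since it changes whenever the corpus or the fuzz target changes.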

practicalswift commented at 10:47 AM on March 25, 2021: contributor

On input size from the OSS-Fuzz documentation: "[…] if large inputs are not necessary to increase the coverage of your target API, it is important to add a limit here to significantly improve performance. […]".

maflcko commented at 10:59 AM on March 25, 2021: contributor

Obviously the size will have a negative effect on the fuzzing efficiency (just like the total number of files). However, when it comes to speed (iterating over the inputs), size isn't the only metric to look at. It also depends on the fuzz target. For example the muhash target used up more than 1 hour of CPU time (this is more than 50% of the total time needed for all targets), which is why I removed most of the muhash inputs in fd77d346a1c09e9dc711e6a41e0802b8b6da700b.

When it comes to the other targets, I don't have any objection to running delete_nonreduced_fuzz_inputs on them occasionally (https://github.com/bitcoin-core/qa-assets/pull/44#issuecomment-768442216). I guess the question here is on what schedule it should be run.

Alfonso-ops commented at 6:43 PM on April 9, 2021: none

Where can I run this?

practicalswift commented at 8:42 PM on May 25, 2021: contributor

I think this issue has largely been solved, with one exception. The banman corpus still contains some problematic seeds which make current master trigger slow-unit warnings (> 10 seconds) when processed with libFuzzer:

INFO: Running with entropic power schedule (0xFF, 100).
INFO: Seed: 3341068132
INFO: Loaded 1 modules   (224846 inline 8-bit counters): 224846 [0x55b51863e0e8, 0x55b518674f36),
INFO: Loaded 1 PC tables (224846 PCs): 224846 [0x55b518674f38,0x55b5189e3418),
INFO:     1358 files found in qa-assets/fuzz_seed_corpus/banman/
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 50125 bytes
INFO: seed corpus: files: 1358 min: 1b max: 50125b total: 5733831b rss: 78Mb
#64	pulse  cov: 352 ft: 524 corp: 15/119b exec/s: 32 rss: 79Mb
#128	pulse  cov: 504 ft: 896 corp: 58/938b exec/s: 18 rss: 79Mb
#256	pulse  cov: 652 ft: 1707 corp: 136/3458b exec/s: 18 rss: 79Mb
#512	pulse  cov: 685 ft: 2576 corp: 236/10551b exec/s: 16 rss: 79Mb
#1024	pulse  cov: 692 ft: 3814 corp: 482/157Kb exec/s: 7 rss: 79Mb
Slowest unit: 13 s:
artifact_prefix='./'; Test unit written to ./slow-unit-b54a4c6d71f852bc398296177de0c25394d98e26
Slowest unit: 15 s:
artifact_prefix='./'; Test unit written to ./slow-unit-043da9e72b316029cfb34a6365dd7b54cbae9f17
Slowest unit: 17 s:
artifact_prefix='./'; Test unit written to ./slow-unit-18a6305384eb34dc9c39f6329234b050d0773a53
Slowest unit: 21 s:
artifact_prefix='./'; Test unit written to ./slow-unit-eb18fab921f21c1ce5812be536c48b95ceb38ef5
Slowest unit: 28 s:
artifact_prefix='./'; Test unit written to ./slow-unit-9d4dacb52359b48832a1fba65ff3240f8f49d836
Slowest unit: 31 s:
artifact_prefix='./'; Test unit written to ./slow-unit-26952e97cc7186d5ef326669edef165c1d283668
Slowest unit: 45 s:
artifact_prefix='./'; Test unit written to ./slow-unit-4870a417f322de523ac63f42c0756ee32bf4f45f
#1359	INITED cov: 694 ft: 4265 corp: 602/1760Kb exec/s: 0 rss: 82Mb
#1359	DONE   cov: 694 ft: 4265 corp: 602/1760Kb lim: 49814 exec/s: 0 rss: 82Mb
Done 1359 runs in 1661 second(s)
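For anyone who wants to pull the slow seeds out of a log like the one above, here is a rough sketch that pairs each `Slowest unit: N s:` line with the `Test unit written to ...` line that follows it. The function name `slow_units` is made up for illustration, and the parsing assumes the two-line layout shown in this log, which may differ between libFuzzer versions.

```python
import re

# Patterns match the two-line layout seen in the log above (an assumption;
# other libFuzzer versions may print this differently).
SLOW_RE = re.compile(r"^Slowest unit: (\d+) s:")
ARTIFACT_RE = re.compile(r"Test unit written to (\S+)")


def slow_units(log_text):
    """Return a list of (seconds, artifact_path) pairs in log order."""
    results = []
    pending_seconds = None
    for line in log_text.splitlines():
        slow = SLOW_RE.match(line.strip())
        if slow:
            pending_seconds = int(slow.group(1))
            continue
        artifact = ARTIFACT_RE.search(line)
        if artifact and pending_seconds is not None:
            results.append((pending_seconds, artifact.group(1)))
            pending_seconds = None
    return results


if __name__ == "__main__":
    import sys

    # Usage: python slow_units.py < fuzz.log
    for seconds, path in slow_units(sys.stdin.read()):
        print(f"{seconds:>4} s  {path}")
```

Because libFuzzer only reports a unit when it is slower than every unit before it, the list is a running maximum and not a complete ranking of slow inputs.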

practicalswift commented at 9:01 PM on May 25, 2021: contributor

The banman issue might be solved by https://github.com/bitcoin/bitcoin/pull/22005. Let's see! :)

maflcko closed this on Jun 16, 2021