(attn @MarcoFalke and @practicalswift)
Looking at process_messages, it seems that msg_type is an arbitrary string (up to max length), not restricted to the NetMsgTypes in protocol.cpp. The fuzzer or a dictionary has to produce valid strings; given the coverage, this obviously often succeeds. But the code appears to drop a message with an invalid type outright, so would it be sensible to build a more efficient fuzzer that makes a nondeterministic choice among the valid NetMsgTypes?
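A minimal sketch of the "valid types only" idea, not the actual harness: one fuzzer-supplied byte indexes into a fixed table, so every input yields a valid type. The table is only an illustrative subset of the types in protocol.cpp, and `kValidMsgTypes` and `PickValidMsgType` are hypothetical names.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Illustrative subset only; the full list of message types lives in protocol.cpp.
constexpr std::array<std::string_view, 6> kValidMsgTypes{
    "version", "verack", "addr", "inv", "getdata", "ping"};

// Hypothetical helper: map one fuzzer-supplied byte onto a valid message type.
// The modulo guarantees that every selector value picks some valid entry,
// so no input is wasted on a type string that would be dropped.
std::string_view PickValidMsgType(uint8_t selector)
{
    return kValidMsgTypes[selector % kValidMsgTypes.size()];
}
```

With this encoding a single-byte mutation switches the message type, instead of the fuzzer having to synthesize a whole valid string.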
If so, it also opens up a version of the process_messages harness that does swarm testing, repeatedly hitting certain message types and omitting others, which can be very helpful sometimes (see https://www.cs.utah.edu/~regehr/papers/swarm12.pdf -- this is used a lot now in compiler testing, and in Apple's FoundationDB testing approach).
I'd like to tackle this, but want to make sure the concept makes sense:
- focusing on valid types only
- applying swarm
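A rough sketch of how swarm could be layered on top of valid-type selection, assuming the swarm configuration is a bitmask read once at the start of each input. The type table is an illustrative subset, and `kMsgTypes`, `EnabledTypes` and `PickFromSwarm` are hypothetical names, not existing harness code.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string_view>
#include <vector>

// Illustrative subset only; the full list of message types lives in protocol.cpp.
constexpr std::array<std::string_view, 6> kMsgTypes{
    "version", "verack", "addr", "inv", "getdata", "ping"};

// Decode a swarm configuration: bit i of `mask` enables kMsgTypes[i].
// Bits beyond the table size are ignored.
std::vector<std::string_view> EnabledTypes(uint8_t mask)
{
    std::vector<std::string_view> enabled;
    for (size_t i = 0; i < kMsgTypes.size(); ++i) {
        if (mask & (1u << i)) enabled.push_back(kMsgTypes[i]);
    }
    // A mask that enables nothing would leave no message to send,
    // so fall back to the full set in that case.
    if (enabled.empty()) enabled.assign(kMsgTypes.begin(), kMsgTypes.end());
    return enabled;
}

// Pick the type of the next message from the enabled subset only.
std::string_view PickFromSwarm(const std::vector<std::string_view>& enabled, uint8_t selector)
{
    assert(!enabled.empty());
    return enabled[selector % enabled.size()];
}
```

For example, a mask of `0b000101` restricts a whole run to `version` and `addr`, so those two types are hit repeatedly while the others are omitted.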
I'll also see about a translator that takes the valid-messages-only inputs from the process_messages corpus and converts them to the new input format. Initially they won't add coverage, but (some) fuzzers will have an easier time exploring this version, and I'm hoping swarm will be useful.
As a side issue, the harness now uses a bool to choose an arbitrary number of messages to try. Apparently going deep IS useful for state/coverage: I checked, and a large number of the corpus inputs get to depth 10 or deeper.
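To illustrate what "depth" means here, a simplified sketch of a bool-driven message loop. The input layout is an assumption made for the example (one "continue" byte before each message, one payload byte after it) and is not the real harness encoding; `MessageDepth` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Count how many messages an input would send under the assumed layout:
// an odd "continue" byte means "send another message", and it is followed
// by one payload byte. The loop stops at the first even continue byte,
// at end of input, or once max_depth messages have been counted.
size_t MessageDepth(const std::vector<uint8_t>& input, size_t max_depth)
{
    size_t depth = 0;
    size_t pos = 0;
    while (depth < max_depth && pos < input.size() && (input[pos] & 1)) {
        ++pos;                          // consume the continue byte
        if (pos < input.size()) ++pos;  // consume the payload byte, if present
        ++depth;
    }
    return depth;
}
```

Running a counter like this over a corpus gives the distribution of depths, which is the kind of check described above.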