You mean something along the lines of:
```cpp
template <typename T>
NODISCARD inline T ConsumeDeserializable(FuzzedDataProvider& fuzzed_data_provider, const size_t max_length = 4096) noexcept
{
    const std::vector<uint8_t> buffer = ConsumeRandomLengthByteVector(fuzzed_data_provider, max_length);
    CDataStream ds{buffer, SER_NETWORK, INIT_PROTO_VERSION};
    T obj;
    try {
        ds >> obj;
    } catch (const std::ios_base::failure&) {
    }
    return obj;
}
```
I’ve thought about that, and while it would be better from a developer ergonomics perspective, I think it might come with some negative impact on overall fuzzing. One risk I see is that the fuzzer would also spend time on “meaningless” inputs (inputs which cause `std::ios_base::failure`), which would now become “somewhat meaningful” due to default construction.

Take the extreme example of a huge input containing a few million bytes worth of 0x41 scream (AAAAAAAAAAA…) being deserialized to an object of type `CFoo`. The current fuzzer would quickly hit `std::ios_base::failure`, return a `nullopt` and then give up without further processing (fail early thanks to `if (!opt_foo) { return; }` or similar; see the sketch below). If we were to return a `CFoo{}` instead, the fuzzer would proceed with processing (fail late), which may slow down the fuzzing.
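For comparison, here is a rough sketch of the fail-early shape I mean. It only approximates the existing `std::optional`-returning helper and a typical call site; the exact signatures, the harness name `FuzzCFoo` and the type `CFoo` are placeholders for illustration, not verbatim code:

```cpp
// Sketch only: assumes the usual fuzz helpers (FuzzedDataProvider, CDataStream,
// ConsumeRandomLengthByteVector) are in scope; CFoo is a placeholder type.
#include <optional>

template <typename T>
NODISCARD inline std::optional<T> ConsumeDeserializable(FuzzedDataProvider& fuzzed_data_provider, const size_t max_length = 4096) noexcept
{
    const std::vector<uint8_t> buffer = ConsumeRandomLengthByteVector(fuzzed_data_provider, max_length);
    CDataStream ds{buffer, SER_NETWORK, INIT_PROTO_VERSION};
    T obj;
    try {
        ds >> obj;
    } catch (const std::ios_base::failure&) {
        return std::nullopt; // "meaningless" input: signal failure instead of default-constructing
    }
    return obj;
}

// Typical call site: fail early on inputs that do not deserialize.
void FuzzCFoo(FuzzedDataProvider& fuzzed_data_provider)
{
    const std::optional<CFoo> opt_foo = ConsumeDeserializable<CFoo>(fuzzed_data_provider);
    if (!opt_foo) {
        return; // give up without further processing
    }
    // ... exercise *opt_foo ...
}
```

The only difference to the snippet above is the `std::nullopt` return in the `catch`, which is what lets the harness bail out immediately rather than working on a default-constructed object.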
Does it make sense? :)
tl;dr – would be worth doing if no negative fuzzing speed impact can be measured :)
Edit: Proof of concept here if someone wants to experiment :)