I would like to propose splitting up our fuzz binary into one binary per fuzz harness (or at least have an option to build separate binaries). This would primarily enable properly compiling with LTO, which would have several benefits:
- In environments where individual binaries are required/desired (e.g. oss-fuzz), this has the benefit that each binary is as small as possible, resulting in less disk usage (nice for container images).
- Enable afl++ to create a token dictionary at compile time, which only contains tokens that are found in code paths reachable by each individual harness.
- Efficient, collision-free edge coverage tracking (and more) for afl++, see: https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/README.lto.md.
- People designing and developing fuzzing engines recommend not linking unreachable code into fuzz binaries: https://github.com/google/fuzzing/blob/master/docs/good-fuzz-target.md#unreachable-code.
The only downside would be that linking multiple binaries is slower (this was the only reason for switching to compiling only one binary), but I think we can work around this by simply making this optional.
To achieve this we would need to:
- Change the build system to have an option to compile individual binaries
- Change the fuzzing framework to (optionally) have `FUZZ_TARGET` include the actual fuzz entry point directly (e.g. `LLVMFuzzerTestOneInput`) instead of accumulating all harness functions into a global map
  - This probably requires splitting each harness into its own file
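
To make the second point more concrete, here is a rough sketch of what a `FUZZ_TARGET` that emits the entry point directly could look like. It is only an illustration, not the current framework code: `FuzzBuffer`, `CountNonZero`, the `example_count` harness and the `name##_fuzz_target` naming are all made up for this example, and the existing macro's signature and buffer type may well differ. In the single-binary mode the macro would presumably keep registering into the global map instead.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical buffer type for this sketch; the real framework may use a span type.
using FuzzBuffer = std::vector<uint8_t>;

// Sketch of a per-binary FUZZ_TARGET: instead of inserting the harness into a
// global map, the macro defines the libFuzzer entry point itself. Because
// LLVMFuzzerTestOneInput can only be defined once per binary, each harness
// has to live in its own translation unit and be linked into its own binary.
#define FUZZ_TARGET(name)                                                   \
    static void name##_fuzz_target(const FuzzBuffer& buffer);               \
    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) \
    {                                                                       \
        name##_fuzz_target(FuzzBuffer(data, data + size));                  \
        return 0;                                                           \
    }                                                                       \
    static void name##_fuzz_target(const FuzzBuffer& buffer)

// Hypothetical code under test: counts the non-zero bytes in the input.
static size_t CountNonZero(const FuzzBuffer& buffer)
{
    size_t count = 0;
    for (const uint8_t byte : buffer) {
        if (byte != 0) ++count;
    }
    return count;
}

// Hypothetical harness; this would be the only FUZZ_TARGET in its file.
FUZZ_TARGET(example_count)
{
    const size_t non_zero = CountNonZero(buffer);
    // Invariant checked on every input: the count never exceeds the input size.
    assert(non_zero <= buffer.size());
}
```

With this shape the linker only pulls in code reachable from the one harness, which is what LTO-based instrumentation (dictionary generation, collision-free coverage) would benefit from. The file defines no `main`, since the fuzzing engine provides it.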