RFC: Add multiprocess fuzz target

ryanofsky commented at 12:00 pm on September 17, 2021: member

Originally posted by @MarcoFalke in #22962 (comment)

It would be better to write a proper fuzz target for multiprocess (that ideally also covers serialization).

ryanofsky added the label Brainstorming on Sep 17, 2021

MarcoFalke added the label Tests on Sep 17, 2021

MarcoFalke added the label interfaces on Sep 17, 2021

ryanofsky commented at 12:14 pm on September 17, 2021: member

Adding multiprocess fuzz coverage would be similar to adding fuzz coverage for the RPC server, which I don’t think we have, so I don’t know what would be ideal here.

You could approach it by opening an IPC connection, writing arbitrary bytes to the socket, and making sure the IPC implementation on the other end is memory safe (doesn’t segfault or cause ASAN/MSAN errors).
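A minimal sketch of the bytes-on-a-socket idea, assuming a POSIX system. `ReadMessage` is a hypothetical stand-in for the IPC receive path (a toy 1-byte length prefix followed by a payload), not the real Cap'n Proto framing, and `FeedBytes` is an illustrative helper name. Under a fuzzer, `data` would be the fuzz input and the sanitizers would flag any out-of-bounds access in the reader.

```cpp
#include <sys/socket.h>
#include <unistd.h>

#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical receive path: reads one message (1-byte length, then payload).
// Returns std::nullopt on truncated input instead of trusting the length byte.
std::optional<std::string> ReadMessage(int fd)
{
    uint8_t len;
    if (read(fd, &len, 1) != 1) return std::nullopt;
    std::string payload(len, '\0');
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, payload.data() + got, len - got);
        if (n <= 0) return std::nullopt; // peer closed before the full payload arrived
        got += static_cast<size_t>(n);
    }
    return payload;
}

// Writes arbitrary bytes into one end of a socket pair and parses them on the other.
std::optional<std::string> FeedBytes(const std::vector<uint8_t>& data)
{
    int fds[2];
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0) return std::nullopt;
    // Cap the write so it always fits in the socket buffer and cannot block.
    const size_t size = std::min<size_t>(data.size(), 256);
    if (size > 0) {
        ssize_t written = write(fds[0], data.data(), size);
        (void)written;
    }
    close(fds[0]); // signal end of input so the reader never waits forever
    std::optional<std::string> result = ReadMessage(fds[1]);
    close(fds[1]);
    return result;
}
```

A fuzz target built this way would call `FeedBytes` with the raw fuzz buffer and ignore the return value; the interesting signal is whether the reader crashes or trips a sanitizer.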

You could also call specific IPC methods and check for postconditions.

~A complication here is that the IPC interface is similar to the RPC interface, and pretty privileged, so it might be able to do things like read/write files on disk depending on how you call it.~ (EDIT: nvm)

ryanofsky commented at 12:16 pm on September 17, 2021: member

A complication here is that the IPC interface is similar to the RPC interface, and pretty privileged, so it might be able to do things like read/write files on disk depending on how you call it.

Actually, this is not really a concern, because you could stub out all the server methods and just make sure they are being invoked correctly by the IPC and IPC serialization code.
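To illustrate the stubbing idea, here is a rough sketch. Every name in it (`Server`, `RecordingStub`, `EncodeSetValue`, `Dispatch`) is hypothetical, and the byte encoding is a toy stand-in for the real IPC serialization. The stub does no privileged work; it only records how it was invoked, so the test can check that the arguments survived the trip through the serialization layer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical server interface; not an actual Bitcoin Core interface.
class Server
{
public:
    virtual ~Server() = default;
    virtual void setValue(int32_t value) = 0;
};

// Stub that touches no files or state and only records the calls it receives.
class RecordingStub : public Server
{
public:
    std::vector<int32_t> calls;
    void setValue(int32_t value) override { calls.push_back(value); }
};

// Toy client-side serialization of a setValue request.
std::vector<uint8_t> EncodeSetValue(int32_t value)
{
    std::vector<uint8_t> bytes(sizeof(value));
    std::memcpy(bytes.data(), &value, sizeof(value));
    return bytes;
}

// Toy server-side decoding and dispatch. Returns false for malformed requests.
bool Dispatch(Server& server, const std::vector<uint8_t>& bytes)
{
    int32_t value;
    if (bytes.size() != sizeof(value)) return false;
    std::memcpy(&value, bytes.data(), sizeof(value));
    server.setValue(value);
    return true;
}
```

A fuzz target could feed either encoded requests or raw fuzz bytes to `Dispatch` and then inspect `RecordingStub::calls`.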

MarcoFalke commented at 12:24 pm on September 17, 2021: member

While the server part of the RPC server isn’t fuzzed, the RPC methods are fuzzed in the rpc target.

I guess multiprocess is hard to fuzz because there is no logic to fuzz, it is just an interface for other logic.

ryanofsky commented at 1:33 pm on September 17, 2021: member

While the server part of the RPC server isn’t fuzzed, the RPC methods are fuzzed in the rpc target.

Wow, the RPC fuzz target is very interesting! It is just calling whitelisted RPC methods with random arguments. It wouldn’t be hard to write the same test for a list of whitelisted IPC methods, since capnp does provide enough introspection to query method arguments and fill them with random values. So given a list of whitelisted methods, the IPC fuzz test could randomly call them with random argument values, and even randomly fill recursive data structures.
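The shape of such a test can be sketched without capnp. In the sketch below, the `Method` table plays the role that schema introspection would play, and `ByteSource` is a small hypothetical helper loosely modelled on libFuzzer's `FuzzedDataProvider`. None of these names come from Bitcoin Core or capnp, and only flat `int32_t` and string parameters are handled, not recursive structures.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <variant>
#include <vector>

using Arg = std::variant<int32_t, std::string>;
enum class ArgType { INT, STRING };

// One allowed method: its name, its parameter types, and a way to invoke it.
struct Method {
    std::string name;
    std::vector<ArgType> params;
    std::function<void(const std::vector<Arg>&)> call;
};

// Hands out values derived from the fuzz input; yields zeros once exhausted.
class ByteSource
{
    const std::vector<uint8_t>& m_data;
    size_t m_pos{0};

public:
    explicit ByteSource(const std::vector<uint8_t>& data) : m_data(data) {}
    bool empty() const { return m_pos >= m_data.size(); }
    uint8_t byte() { return empty() ? 0 : m_data[m_pos++]; }
    int32_t int32()
    {
        uint32_t v = 0;
        for (int i = 0; i < 4; ++i) v = (v << 8) | byte(); // big-endian
        return static_cast<int32_t>(v);
    }
    std::string str()
    {
        std::string s(byte() % 16, '\0'); // length 0..15 taken from the input
        for (char& c : s) c = static_cast<char>(byte());
        return s;
    }
};

// Calls allowed methods chosen and parameterised by the fuzz input.
// Returns the number of calls made.
size_t FuzzMethods(const std::vector<Method>& allowed, const std::vector<uint8_t>& input)
{
    ByteSource src(input);
    size_t calls = 0;
    while (!allowed.empty() && !src.empty()) {
        const Method& method = allowed[src.byte() % allowed.size()];
        std::vector<Arg> args;
        for (ArgType type : method.params) {
            if (type == ArgType::INT) {
                args.emplace_back(src.int32());
            } else {
                args.emplace_back(src.str());
            }
        }
        method.call(args);
        ++calls;
    }
    return calls;
}
```

With capnp, the `params` list would presumably be derived from the interface schema rather than written by hand, and `call` would send a request over the IPC connection.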

Since a test like this would be calling capnp methods directly, it would not provide fuzz coverage for the C++ wrapper methods which call the corresponding capnp methods. These wrapper methods don’t do very much, but are responsible for things like converting CFeeRate objects to and from streams of bytes, converting C++ vectors and maps to and from capnp lists. Separate fuzz tests could be written for this type conversion code, although from discussion #22962 (comment) it sounds like neither of us thinks this coverage would be very useful? There is no equivalent of this fuzzing for our RPC code just because we only provide an RPC server, not an RPC client (we don’t have C++ methods that help call corresponding RPC methods remotely).

I guess multiprocess is hard to fuzz because there is no logic to fuzz, it is just an interface for other logic.

The same thing is true for the RPC server. There should be little difference between IPC and RPC from a fuzzer’s point of view. If you wanted to tweak the RPC fuzz test to only fuzz test the RPC server and skip actual execution of the RPC methods, you could do that. Equivalently, you can test the IPC server with IPC method execution, or the IPC server without IPC method execution.

I will say overall I don’t have good intuition for what type of fuzzing is useful, and what type of fuzzing is not useful, and what the costs and benefits are for different kinds of fuzzing. I would love to read a simple guide that told me what fuzzing best practices are (and maybe told me how fuzzing works, because that is also largely a black box to me). But before that happens, I am happy to write whatever fuzzing test someone tells me to write.

ryanofsky commented at 2:49 pm on September 17, 2021: member

One idea to provide fuzz coverage of the IPC code could be to write an interface specifically meant for fuzz testing.

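#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

class ArgsManager; // defined elsewhere in Bitcoin Core
class CFeeRate;    // defined elsewhere in Bitcoin Core

namespace interfaces {
//! Interface used only for fuzzing the IPC layer.
class FuzzTest
{
public:
  virtual ~FuzzTest() = default;
  // Basic and custom types passed by reference (in/out parameters).
  virtual void fuzzInt32(int32_t&) = 0;
  virtual void fuzzString(std::string&) = 0;
  virtual void fuzzFeeRate(CFeeRate&) = 0;
  virtual void fuzzArgsManager(ArgsManager&) = 0;
  virtual void fuzzMapVectorFeeRates(std::map<std::string, std::vector<CFeeRate>>&) = 0;
  virtual std::vector<int> fuzzInputArgumentsAndReturnValue(int a, int b, int& c, int& d) = 0;
  // Callbacks and nested interfaces crossing the process boundary.
  virtual void fuzzCallbackFunction(std::function<int(int)> callback) = 0;
  virtual void fuzzCallbackInterface(std::unique_ptr<FuzzTest> callback) = 0;
  virtual std::unique_ptr<FuzzTest> fuzzReturnInterface() = 0;
};
} // namespace interfaces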

Each method could transform arguments and return values (do modulo arithmetic on integer values, transform, reverse, or shift string values) in some way that the fuzz test code could check. The fuzz test code could call the methods randomly across a process boundary over a socket.
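As a rough sketch of what "transform in a way the test can check" might look like, here is a trimmed-down two-method version of the interface with a server-side implementation and a checker. The particular transformations (modulo 1000, string reversal) and the names `FuzzTestImpl` and `CheckRoundTrip` are assumptions made for illustration only. In a real target, `remote` would be an IPC proxy object, so a mismatch would point at the IPC or serialization code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Trimmed-down version of the proposed interface (two methods only).
class FuzzTest
{
public:
    virtual ~FuzzTest() = default;
    virtual void fuzzInt32(int32_t&) = 0;
    virtual void fuzzString(std::string&) = 0;
};

// Server-side implementation with predictable transformations (assumed choices).
class FuzzTestImpl : public FuzzTest
{
public:
    void fuzzInt32(int32_t& value) override { value %= 1000; }
    void fuzzString(std::string& value) override { std::reverse(value.begin(), value.end()); }
};

// Calls each method once and compares the results with locally computed
// expectations. Returns true when every transformed value matches.
bool CheckRoundTrip(FuzzTest& remote, int32_t i, const std::string& s)
{
    int32_t got_i = i;
    remote.fuzzInt32(got_i);
    std::string got_s = s;
    remote.fuzzString(got_s);
    const std::string expected_s(s.rbegin(), s.rend());
    return got_i == i % 1000 && got_s == expected_s;
}
```

The fuzz input would supply `i` and `s`, and the target would assert that `CheckRoundTrip` returns true for every input.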

This type of test could probably provide more meaningful coverage of IPC code than the test described above, which executes real whitelisted IPC methods with random arguments. The two types of test could complement each other, though.

RFC: Add multiprocess fuzz target #23015