net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket.

practicalswift commented at 8:23 pm on June 7, 2020: contributor

Add regression fuzz harness for CVE-2017-18350. This fuzzing harness would have found CVE-2017-18350 within a minute of fuzzing :)

See doc/fuzzing.md for information on how to fuzz Bitcoin Core. Don’t forget to contribute any coverage increasing inputs you find to the Bitcoin Core fuzzing corpus repo.

Happy fuzzing :)

DrahtBot added the label Build system on Jun 7, 2020

DrahtBot added the label P2P on Jun 7, 2020

DrahtBot added the label Tests on Jun 7, 2020

MarcoFalke removed the label Build system on Jun 7, 2020

MarcoFalke removed the label Tests on Jun 7, 2020

MarcoFalke commented at 11:30 pm on June 7, 2020: member

Concept ACK

MarcoFalke renamed this:
~~tests/net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket. Add thin SOCKET wrapper.~~
net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket. Add thin SOCKET wrapper.
on Jun 7, 2020

DrahtBot commented at 11:34 pm on June 7, 2020: member

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#21328 (net, refactor: pass uint16 CService::port as uint16 by jonatack)
#19415 (net: Make DNS lookup mockable, add fuzzing harness by practicalswift)
#19259 (fuzz: Add fuzzing harness for LoadMempool(…) and DumpMempool(…) by practicalswift)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

practicalswift force-pushed on Jun 8, 2020

DrahtBot added the label Needs rebase on Jun 11, 2020

practicalswift force-pushed on Jun 11, 2020

DrahtBot removed the label Needs rebase on Jun 11, 2020

DrahtBot added the label Needs rebase on Jul 9, 2020

practicalswift force-pushed on Jul 9, 2020

DrahtBot removed the label Needs rebase on Jul 9, 2020

practicalswift force-pushed on Jul 10, 2020

DrahtBot added the label Needs rebase on Jul 11, 2020

practicalswift force-pushed on Jul 11, 2020

DrahtBot removed the label Needs rebase on Jul 11, 2020

DrahtBot added the label Needs rebase on Jul 18, 2020

practicalswift force-pushed on Jul 18, 2020

DrahtBot removed the label Needs rebase on Jul 18, 2020

in src/netbase.h:44 in 5da912d7cb outdated

40@@ -37,6 +41,98 @@ class proxyType
41     bool randomize_credentials;
42 };
43 
44+struct timeval MillisToTimeval(int64_t nTimeout);

Crypt-iQ commented at 2:09 am on July 21, 2020:

This is a redundant declaration, it’s defined lower on line 164.

practicalswift commented at 5:14 pm on August 14, 2020:

It is now used prior to L164, so it must be on L44 – but removing L164. Thanks!

in src/netbase.h:69 in 544a870557 outdated

61+};
62+
63+// Represents a non-owned SOCKET.
64+//
65+// The underlying SOCKET will NOT be closed automatically on object destruction.
66+class SocketImpl : public Socket

Crypt-iQ commented at 11:09 am on July 21, 2020:

Just would like to point out that this is currently unused. Is this intentional?

practicalswift commented at 5:08 pm on August 14, 2020:

It is used in netbase.cpp? :)

0$ git grep SocketImpl
1src/netbase.cpp:        SocketImpl socket{hSocket};
2src/netbase.cpp:        SocketImpl socket{hSocket};
3src/netbase.h:class SocketImpl : public Socket
4src/netbase.h:    explicit SocketImpl(SOCKET socket) : m_socket{socket}

Crypt-iQ commented at 5:11 pm on August 14, 2020:

I meant that even though it’s used, the coverage isn’t updated for SocketImpl functions. Might be a bug with lcov perhaps.

practicalswift commented at 5:17 pm on August 14, 2020:

I think that is expected: it is currently only from ConnectThroughProxy which is not fuzzed :)

Crypt-iQ commented at 5:22 pm on August 14, 2020:

Ah derp! I mistakenly thought it was fuzzed for some reason, but FuzzedSocket is the one that’s used. Is the idea that they be used together in some manner eventually?

practicalswift commented at 5:29 pm on August 14, 2020:

@Crypt-iQ It is only used as a way to wrap an existing SOCKET as a Socket instance: no further plans beyond that :)

in src/test/fuzz/util.h:475 in 4f18edbe55 outdated

469@@ -470,4 +470,63 @@ void ReadFromStream(FuzzedDataProvider& fuzzed_data_provider, Stream& stream) no
470     }
471 }
472 
473+class FuzzedSocket : public Socket
474+{
475+    FuzzedDataProvider& fuzzed_data_provider;

Crypt-iQ commented at 11:16 am on July 21, 2020:

nit: m_fuzzed_data_provider so the reference in the constructor can just be fuzzed_data_provider?

practicalswift commented at 5:17 pm on August 14, 2020:

Good nit! Adressed!

in src/test/fuzz/util.h:633 in 4f18edbe55 outdated

501+            if (len > random_bytes.size()) {
502+                std::memset(buf + random_bytes.size(), 0, len - random_bytes.size());
503+            }
504+            return len;
505+        }
506+        return random_bytes.size();

Crypt-iQ commented at 11:20 am on July 21, 2020:

Here it is still possible for random_bytes.size() < len since that condition is only checked if ConsumeBool returns true. Shouldn’t the correct number of bytes be returned if possible?

practicalswift commented at 5:23 pm on August 14, 2020:

We want to make it so that the fuzzer driven recv can return up to len bytes like the real recv, right? :)

Crypt-iQ commented at 7:15 pm on August 14, 2020:

Ah, yes that makes sense, thanks for clarifying.

Crypt-iQ commented at 11:30 am on July 21, 2020: contributor

Coverage from ~30 hours of libfuzzer fuzzing: https://crypt-iq.github.io/socks5_cov_outs/src/ Missing about 4 lines of coverage, will try to hit those again on a different run.

practicalswift force-pushed on Aug 14, 2020

Crypt-iQ commented at 5:04 am on September 1, 2020: contributor

Builds on Ubuntu. Fails to build on macOS with brew clang.

 0Making all in src
 1  CXX      test/fuzz/addition_overflow-addition_overflow.o
 2In file included from test/fuzz/addition_overflow.cpp:7:
 3./test/fuzz/util.h:335:13: error: no matching function for call to 'AdditionOverflow'
 4        if (AdditionOverflow((uint64_t)fuzzed_file->m_offset, random_bytes.size())) {
 5            ^~~~~~~~~~~~~~~~
 6./test/fuzz/util.h:201:16: note: candidate template ignored: deduced conflicting types for parameter 'T' ('unsigned long long' vs. 'unsigned long')
 7NODISCARD bool AdditionOverflow(const T i, const T j) noexcept
 8               ^
 9./test/fuzz/util.h:346:13: error: no matching function for call to 'AdditionOverflow'
10        if (AdditionOverflow(fuzzed_file->m_offset, n)) {
11            ^~~~~~~~~~~~~~~~
12./test/fuzz/util.h:201:16: note: candidate template ignored: deduced conflicting types for parameter 'T' ('long long' vs. 'long')
13NODISCARD bool AdditionOverflow(const T i, const T j) noexcept
14               ^
152 errors generated.
16make[2]: *** [test/fuzz/addition_overflow-addition_overflow.o] Error 1
17make[1]: *** [all-recursive] Error 1
18make: *** [all-recursive] Error 1

practicalswift force-pushed on Sep 1, 2020

practicalswift commented at 2:15 pm on September 1, 2020: contributor

@Crypt-iQ Thanks for letting me know. Now rebased. Should compile on macOS now? :)

Crypt-iQ commented at 7:08 pm on September 1, 2020: contributor

@practicalswift Now compiles on macOS :). Will run the builds again and re-review.

in src/test/fuzz/util.h:554 in 2d02f775fa outdated

507+    }
508+
509+#ifdef USE_POLL
510+    int poll(short events, int timeout_ms) override
511+    {
512+        return m_fuzzed_data_provider.ConsumeIntegralInRange<int>(-1, 1);

Crypt-iQ commented at 5:41 am on September 2, 2020:

Calling this with -1 as the min parameter to ConsumeIntegralInRange will result in a very large range since it will be cast to uint64_t: https://github.com/bitcoin/bitcoin/blob/48c1083632687a42ac603d4f241e70616a1d3815/src/test/fuzz/FuzzedDataProvider.h#L87

Maybe an alternative version of ConsumeIntegralInRange for just -1,0,1 can be used? Not your fault, it’s the header file’s fault.

Crypt-iQ commented at 5:51 am on September 2, 2020:

Nevermind, godbolt tells me I am wrong.

in src/test/fuzz/util.h:560 in 2d02f775fa outdated

513+    }
514+#endif
515+
516+    int select(bool watch_read, bool watch_write, int timeout_ms) override
517+    {
518+        return m_fuzzed_data_provider.ConsumeIntegralInRange<int>(-1, 1);

Crypt-iQ commented at 5:41 am on September 2, 2020:

Same here.

Crypt-iQ commented at 5:34 pm on September 2, 2020: contributor

ACK 2d02f775fa36c59bca5d7ef1a0e702c1bc2de117

Code change looks good to me. I did not try to reintroduce the CVE, but I did read up about it. Ubuntu 18:

./configure --enable-fuzz --with-sanitizers=address,undefined,integer,fuzzer - no errors
valgrind - no errors

macOS 10.15.4:

./configure --enable-fuzz --with-sanitizers=address,fuzzer --disable-asm - no errors
./configure --enable-fuzz --with-sanitizers=undefined,integer,fuzzer - one complaint (suppressed by #19713)

 0/usr/local/opt/llvm/bin/../include/c++/v1/memory:1876:35: runtime error: implicit conversion from type 'char' of value -125 (8-bit, signed) to type 'unsigned char' changed the value to 131 (8-bit, unsigned)
 1    [#0](/bitcoin-bitcoin/0/) 0x1072b4bd4 in void std::__1::allocator_traits<std::__1::allocator<unsigned char> >::construct<unsigned char, char const&>(std::__1::allocator<unsigned char>&, unsigned char*, char const&) memory:1595
 2    [#1](/bitcoin-bitcoin/1/) 0x1072b49f9 in void std::__1::allocator_traits<std::__1::allocator<unsigned char> >::__construct_range_forward<std::__1::__wrap_iter<char const*>, unsigned char*>(std::__1::allocator<unsigned char>&, std::__1::__wrap_iter<char const*>, std::__1::__wrap_iter<char const*>, unsigned char*&) memory:1688
 3    [#2](/bitcoin-bitcoin/2/) 0x1072b3598 in std::__1::enable_if<__is_cpp17_forward_iterator<std::__1::__wrap_iter<char const*> >::value, void>::type std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >::__construct_at_end<std::__1::__wrap_iter<char const*> >(std::__1::__wrap_iter<char const*>, std::__1::__wrap_iter<char const*>, unsigned long) vector:1076
 4    [#3](/bitcoin-bitcoin/3/) 0x10728e2a2 in std::__1::enable_if<(__is_cpp17_forward_iterator<std::__1::__wrap_iter<char const*> >::value) && (is_constructible<unsigned char, std::__1::iterator_traits<std::__1::__wrap_iter<char const*> >::reference>::value), std::__1::__wrap_iter<unsigned char*> >::type std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >::insert<std::__1::__wrap_iter<char const*> >(std::__1::__wrap_iter<unsigned char const*>, std::__1::__wrap_iter<char const*>, std::__1::__wrap_iter<char const*>) vector:1996
 5    [#4](/bitcoin-bitcoin/4/) 0x10728b18c in Socks5(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, ProxyCredentials const*, Socket&) netbase.cpp:492
 6    [#5](/bitcoin-bitcoin/5/) 0x10663a420 in test_one_input(std::__1::vector<unsigned char, std::__1::allocator<unsigned char> > const&) socks5.cpp:30
 7    [#6](/bitcoin-bitcoin/6/) 0x107663b28 in LLVMFuzzerTestOneInput fuzz.cpp:45
 8    [#7](/bitcoin-bitcoin/7/) 0x107862780 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) FuzzerLoop.cpp:556
 9    [#8](/bitcoin-bitcoin/8/) 0x107861ec5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) FuzzerLoop.cpp:470
10    [#9](/bitcoin-bitcoin/9/) 0x107863f76 in fuzzer::Fuzzer::MutateAndTestOne() FuzzerLoop.cpp:698
11    [#10](/bitcoin-bitcoin/10/) 0x107864be5 in fuzzer::Fuzzer::Loop(std::__1::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) FuzzerLoop.cpp:830
12    [#11](/bitcoin-bitcoin/11/) 0x107851d8d in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) FuzzerDriver.cpp:829
13    [#12](/bitcoin-bitcoin/12/) 0x10787da62 in main FuzzerMain.cpp:19
14    [#13](/bitcoin-bitcoin/13/) 0x7fff73778cc8 in start+0x0 (libdyld.dylib:x86_64+0x1acc8)

practicalswift force-pushed on Sep 17, 2020

practicalswift commented at 1:38 pm on September 17, 2020: contributor

Had to rebase also this one to make use of the new $(FUZZ_SUITE_LDFLAGS_COMMON). @Crypt-iQ May I ask you to review and hopefully give your second ACK on this PR? :)

Crypt-iQ commented at 10:54 am on September 20, 2020: contributor

ACK ea6cd7b2cc6b070b2630457d75b396de555b31c7

practicalswift commented at 8:36 am on October 15, 2020: contributor

Friendly review beg :)

Concept NACK:s are totally fine too: that would allow me to move forward with other fuzzing stuff without having to keep this PR updated :)

practicalswift commented at 8:38 am on October 15, 2020: contributor

@MarcoFalke Thanks for your Concept ACK. Would you mind reviewing the code too? :)

MarcoFalke commented at 10:21 am on November 10, 2020: member

Sorry, I don’t know enough about sockets to be able to review this.

practicalswift commented at 10:27 am on November 10, 2020: contributor

@MarcoFalke Ah, got it! Do you know who would be an appropriate reviewer? :)

FWIW I think sipa wrote the Socks5(...) code originally in #1141.

Any review help welcome! Concept NACK is obviously fine too: I just want to move forward towards close (due to NACK or close due to merge) :)

practicalswift force-pushed on Nov 13, 2020

practicalswift commented at 10:07 am on November 13, 2020: contributor

Rebased. Now using the names MockableSocket, Socket and FuzzedSocket.

Socket and FuzzedSocket both implement MockableSocket.

The introduced src/test/fuzz/socks5 would have caught the CVE-2017-18350 buffer overflow (which I caught via code review rather than fuzzing) within literally seconds of fuzzing :)

fanquake added this to the "Blockers" column in a project

DrahtBot added the label Needs rebase on Dec 15, 2020

practicalswift force-pushed on Dec 16, 2020

DrahtBot removed the label Needs rebase on Dec 16, 2020

in src/test/fuzz/util.h:529 in b7a45b3555 outdated

524+    ssize_t send(const char* buf, size_t len, int flags) override
525+    {
526+        if (m_fuzzed_data_provider.ConsumeBool()) {
527+            return len;
528+        }
529+        return m_fuzzed_data_provider.ConsumeIntegralInRange<ssize_t>(-1, len);

vasild commented at 10:24 am on December 21, 2020:

Just a note: this could return 0 from send(). I am not sure if send(2) could ever return 0:

On success, these calls return the number of characters sent. On error, -1 is returned

anyway, I think it is ok to return 0 - to check that our code does not get bricked by that (in case send(2) actually returns 0).

vasild commented at 10:30 am on December 21, 2020:

I think it would be good to set errno if we are going to return an error (-1). Application reaction on error should differ depending on the value of errno (should retry on “temporary” errors).

practicalswift commented at 6:09 pm on December 21, 2020:

Good point! Pushed an updated version addressing this.

in src/test/fuzz/util.h:534 in b7a45b3555 outdated

529+        return m_fuzzed_data_provider.ConsumeIntegralInRange<ssize_t>(-1, len);
530+    }
531+
532+    ssize_t recv(char* buf, size_t len, int flags) override
533+    {
534+        if (buf == nullptr || len == 0 || m_fuzzed_data_provider.ConsumeBool()) {

vasild commented at 10:34 am on December 21, 2020:

Maybe assert that if buf is nullptr then len is 0 before this if and then remove buf == nullptr from here.

practicalswift commented at 6:09 pm on December 21, 2020:

Good point! Pushed an updated version addressing this.

in src/test/fuzz/util.h:539 in b7a45b3555 outdated

534+        if (buf == nullptr || len == 0 || m_fuzzed_data_provider.ConsumeBool()) {
535+            return m_fuzzed_data_provider.ConsumeBool() ? 0 : -1;
536+        }
537+        const std::vector<uint8_t> random_bytes = m_fuzzed_data_provider.ConsumeBytes<uint8_t>(len);
538+        if (random_bytes.empty()) {
539+            return m_fuzzed_data_provider.ConsumeBool() ? 0 : -1;

vasild commented at 10:37 am on December 21, 2020:

Here and above, set errno to some “random” value if we are going to return -1. Maybe the “random” value should have higher probablitity of being one of EAGAIN, EWOULDBLOCK, EINTR or EINPROGRESS.

practicalswift commented at 6:10 pm on December 21, 2020:

Good point! Pushed an updated version addressing this.

in src/test/fuzz/util.h:629 in b7a45b3555 outdated

539+            return m_fuzzed_data_provider.ConsumeBool() ? 0 : -1;
540+        }
541+        std::memcpy(buf, random_bytes.data(), random_bytes.size());
542+        if (m_fuzzed_data_provider.ConsumeBool()) {
543+            if (len > random_bytes.size()) {
544+                std::memset(buf + random_bytes.size(), 0, len - random_bytes.size());

vasild commented at 10:40 am on December 21, 2020:

Why this? recv(2) would never set the remaining buffer to 0x0 bytes.

practicalswift commented at 10:32 pm on December 21, 2020:

This is intentional:

If m_fuzzed_data_provider.ConsumeBool() we pretend to read len bytes but we only consume random_bytes.size() from the fuzzing input and let the remaining bytes be 0x0.

That way we can satisfy a read of say 4096 bytes even if we only have say 20 bytes of input left: 20 bytes will be real data from the input and 4076 bytes will be 0x0 bytes.

Makes sense? :)

vasild commented at 10:48 am on December 22, 2020:

Oh yes, we return len. Sorry for the noise.

in src/test/fuzz/util.h:596 in b7a45b3555 outdated

532+    ssize_t recv(char* buf, size_t len, int flags) override
533+    {
534+        if (buf == nullptr || len == 0 || m_fuzzed_data_provider.ConsumeBool()) {
535+            return m_fuzzed_data_provider.ConsumeBool() ? 0 : -1;
536+        }
537+        const std::vector<uint8_t> random_bytes = m_fuzzed_data_provider.ConsumeBytes<uint8_t>(len);

vasild commented at 10:44 am on December 21, 2020:

What is the likelihood of this returning less than len? The ConsumeBytes() comment says:

… If fewer than // |num_bytes| of data remain, returns a shorter std::vector containing all // of the data that’s left.

I think we should make it more likely that this fuzzed recv() returns a partial read because this is when the “interesting” things happen in the app code as it should retry.

practicalswift commented at 10:38 pm on December 21, 2020:

Good point! Pushed an updated version addressing this.

vasild commented at 10:51 am on December 21, 2020: member

Concept ACK

I am considering how to make a RAII socket class that would best fit in this PR, the rest of the code and in #20685, as you suggested.

Some minor comments below in the meantime.

in src/netbase.h:76 in b7a45b3555 outdated

85+    ssize_t recv(char* buf, size_t len, int flags) override
86+    {
87+        return ::recv(m_socket, buf, len, flags);
88+    }
89+
90+#ifdef USE_POLL

laanwj commented at 5:08 pm on December 21, 2020:

I’d prefer to have this implementation in an implementation unit instead of the header. This is not a templated class and implementing virtual methods can be done everywhere. This would also avoid adding the optional include poll.h here, I think.

practicalswift commented at 6:32 pm on December 21, 2020:

Good point! Pushed an updated version addressing this.

laanwj commented at 5:08 pm on December 21, 2020: member

Concept ACK

practicalswift force-pushed on Dec 21, 2020

practicalswift commented at 10:39 pm on December 21, 2020: contributor

@laanwj @vasild

Thanks for very good review feedback!

I believe I’ve addressed all the issues raised. Please re-review :)

vasild commented at 3:49 pm on December 23, 2020: member

I am considering how to make a RAII socket class…

What about something like this: https://github.com/vasild/bitcoin/commit/bf3fa6a10a69f3d0dbd84188e3ba089a19248b20?

practicalswift commented at 5:24 pm on December 27, 2020: contributor

@vasild Great! Would you mind submitting that as a separate PR? :)

vasild commented at 4:19 pm on December 28, 2020: member

@vasild Great! Would you mind submitting that as a separate PR? :)

Here it is: #20788. If it gets merged, then the first commit in this PR can be dropped.

in src/netbase.h:95 in 0370d7a22a outdated

 96+#endif
 97+
 98+    // Simplified select(2) interface: Wait until the socket becomes ready for
 99+    // reading (if watch_read) and/or writing (if watch_write). Timeout after
100+    // timeout_ms milliseconds. See select(2) for return values.
101+    int select(bool watch_read, bool watch_write, int timeout_ms) override;

laanwj commented at 5:09 pm on January 7, 2021:

This could be in the #else of USE_POLL?

practicalswift commented at 5:01 pm on February 8, 2021:

Good idea! Now addressed. Thanks!

laanwj commented at 5:16 pm on January 7, 2021: member

I still like the concept of this very much, but I just wondered if there isn’t a general way to fuzz network applications that doesn’t require wrapping every BSD socketsI call. I mean it seem like something people would want to commonly do.

vasild commented at 11:00 am on January 14, 2021: member

@laanwj It is possible to override functions (e.g. send(2)) without modifying the source code with LD_PRELOAD. I think that is not supported by the current testing framework, out of the box?

practicalswift commented at 11:35 am on January 14, 2021: contributor

I still like the concept of this very much, but I just wondered if there isn’t a general way to fuzz network applications that doesn’t require wrapping every BSD socketsI call. I mean it seem like something people would want to commonly do.

Easy fuzzing of networking code is unfortunately an unsolved problem generally AFAIK.

There have been many attempts to solve this by using LD_PRELOAD hacks as @vasild mentions. One notable example is preeny’s desock. My experience is that these LD_PRELOAD hacks are pretty fragile and while they might be good enough for short-term one-off fuzzing jobs (think targeted vulnerability research) they are not suitable for more long-term robustness testing needs like in our case.

The best “off the shelf” solution to this problem I’ve seen is what Honggfuzz is doing with its networking mode libhfnetdriver (see http://blog.swiecki.net/2018/01/fuzzing-tcp-servers.html). Unfortunately that solutions is limited to Honggfuzz and it is more suited for “macro level” networking code fuzzing than the more “micro level” networking code fuzzing this PR targets.

That said: Honggfuzz NetDriver is great too have in the toolbox too: I’ve tried documenting how to fuzz Bitcoin Core using Honggfuzz’s networking mode. See the unmerged PR #20380 (“doc: Add instructions on how to fuzz the P2P layer using Honggfuzz NetDriver”). Review welcome! :)

tl;dr – Mocked networking functions driven by FuzzedDataProvider as suggested in this PR is AFAICT the most robust and appropriate way to solve our low-level network fuzzing needs. That said it would be interesting to know if anyone has any alternative approach which is both robust and fuzzer engine agnostic: automated robustness testing of networking code is an active area of research :)

laanwj commented at 9:59 pm on January 24, 2021: member

tl;dr – Mocked networking functions driven by FuzzedDataProvider as suggested in this PR is AFAICT the most robust and appropriate way to solve our low-level network fuzzing needs.

Sure. Fair enough. It’s just that I’m in principle not a fan of wrapping and abstracting basic operating system APIs such as BSD Sockets because anyone contributing to the code has to learn a special interface instead of being up to speed quickly with what they already know. It’s also another moving part that could have bugs itself.

That said, it seems this is useful for other things such as I2P support so that seems enough of a motivation to do it anyway. If we do it, it would be good to do so at a global level as #20788 does.

vasild commented at 10:03 am on January 25, 2021: member

anyone contributing to the code has to learn a special interface instead of being up to speed quickly with what they already know

Indeed! This is why #20788 mimics the system interface:

0ssize_t ret = recv(socket, data, len, 0);
1// vs (same arguments, return type and semantic)
2ssize_t ret = sock.Recv(data, len, 0);

practicalswift commented at 9:34 am on January 26, 2021: contributor

@laanwj Good points. When/if @vasild’s #20788 is merged I’ll use that facility instead :)

Crypt-iQ commented at 1:31 pm on January 26, 2021: contributor

@laanwj Another PR mocks calls to getaddrinfo (#19415). I think introducing LD_PRELOAD into fuzz builds quickly becomes complicated with more fuzz tests - where do these shared objects go and who maintains them? The tradeoff is more mocking code, so I favor the approach in this PR.

in src/netbase.h:14 in 0370d7a22a outdated

10@@ -11,6 +11,7 @@
11 
12 #include <compat.h>
13 #include <netaddress.h>
14+#include <protocol.h>

Crypt-iQ commented at 9:51 pm on February 7, 2021:

I can’t see where this is used

practicalswift commented at 5:00 pm on February 8, 2021:

Now removed. Thanks!

in src/netbase.h:44 in 0370d7a22a outdated

37@@ -37,6 +38,79 @@ class proxyType
38     bool randomize_credentials;
39 };
40 
41+/**
42+ * Convert milliseconds to a struct timeval for e.g. select.
43+ */
44+struct timeval MillisToTimeval(int64_t nTimeout);

Crypt-iQ commented at 9:51 pm on February 7, 2021:

This move is an artifact of an earlier commit

practicalswift commented at 5:01 pm on February 8, 2021:

Now addressed. Thanks!

in src/test/fuzz/util.h:652 in 0370d7a22a outdated

647+    {
648+        return m_fuzzed_data_provider.ConsumeBool();
649+    }
650+};
651+
652+FuzzedSocket ConsumeSocket(FuzzedDataProvider& fuzzed_data_provider)

Crypt-iQ commented at 9:54 pm on February 7, 2021:

Should use inline. See #20733

macOS build will warn with

0ld: warning: duplicate symbol 'ConsumeSocket(FuzzedDataProvider&)' in:
1    test/fuzz/fuzz-addition_overflow.o
2    libtest_fuzz.a(libtest_fuzz_a-util.o)

practicalswift commented at 5:01 pm on February 8, 2021:

Now addressed. Thanks!

Crypt-iQ commented at 9:56 pm on February 7, 2021: contributor

Needs rebase because of https://github.com/bitcoin/bitcoin/pull/20946

practicalswift force-pushed on Feb 8, 2021

practicalswift commented at 5:02 pm on February 8, 2021: contributor

@laanwj @Crypt-iQ Feedback addressed. Please re-review :)

practicalswift force-pushed on Feb 11, 2021

laanwj commented at 1:10 pm on February 11, 2021: member

Needs rebase after #20788 (should be a much smaller change now).

DrahtBot added the label Needs rebase on Feb 11, 2021

practicalswift force-pushed on Feb 16, 2021

DrahtBot removed the label Needs rebase on Feb 16, 2021

laanwj commented at 3:37 pm on February 16, 2021: member

Code review ACK ff13f099c922b90b6952d896d948d713ef6e9f7d

laanwj renamed this:
~~net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket. Add thin SOCKET wrapper.~~
net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket.
on Feb 16, 2021

in src/test/fuzz/util.h:257 in ff13f099c9 outdated

249@@ -250,6 +250,24 @@ template <class T>
250     return false;
251 }
252 
253+/**
254+ * Returns a copy of the value selected from the given vector `v`.
255+ */
256+template <typename T>
257+[[nodiscard]] T PickValue(FuzzedDataProvider& fuzzed_data_provider, const std::vector<T> v) noexcept

MarcoFalke commented at 3:47 pm on February 16, 2021:

why is the full vector copied?

practicalswift commented at 6:27 pm on February 16, 2021:

Oh, that was not intentional. Thanks!

in src/test/fuzz/util.h:266 in ff13f099c9 outdated

261+}
262+
263+/**
264+ * Sets errno to a value selected from the given vector `errnos`.
265+ */
266+inline void SetFuzzedErrNo(FuzzedDataProvider& fuzzed_data_provider, const std::vector<int> errnos) noexcept

MarcoFalke commented at 3:50 pm on February 16, 2021:

same

MarcoFalke commented at 3:51 pm on February 16, 2021:

Also I am wondering why this can’t use PickValueInArray

practicalswift commented at 7:15 pm on February 16, 2021:

Thanks! Fixed.
Now using std::array and PickValueInArray.

in src/test/fuzz/socks5.cpp:40 in ff13f099c9 outdated

35+    // will slow down fuzzing.
36+    SOCKS5_RECV_TIMEOUT = (fuzzed_data_provider.ConsumeBool() && std::getenv("FUZZED_SOCKET_FAKE_LATENCY") != nullptr) ? 1 : DEFAULT_SOCKS5_RECV_TIMEOUT;
37+    FuzzedSock fuzzed_sock = ConsumeSock(fuzzed_data_provider);
38+    // This Socks5(...) fuzzing harness would have caught CVE-2017-18350 within
39+    // a few seconds of fuzzing.
40+    (void)Socks5(fuzzed_data_provider.ConsumeRandomLengthString(512), fuzzed_data_provider.ConsumeIntegral<int>(), fuzzed_data_provider.ConsumeBool() ? &proxy_credentials : nullptr, fuzzed_sock);

MarcoFalke commented at 3:55 pm on February 16, 2021:

nit: Would be nice to break long lines after each , or ?, or :

practicalswift commented at 7:15 pm on February 16, 2021:

Fixed!

vasild commented at 1:37 pm on February 18, 2021:

Would be nice to break long lines …

Opened #21223 style: make clang-format break long lines

in src/test/fuzz/util.h:629 in ff13f099c9 outdated

632+            if (r == -1) {
633+                SetFuzzedErrNo(m_fuzzed_data_provider, recv_errnos);
634+            }
635+            return r;
636+        }
637+        const std::vector<uint8_t> random_bytes = m_fuzzed_data_provider.ConsumeBytes<uint8_t>(m_fuzzed_data_provider.ConsumeBool() ? len : m_fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, len));

MarcoFalke commented at 3:55 pm on February 16, 2021:

nit: Would be nice to break long lines after each ?, or :

practicalswift commented at 7:15 pm on February 16, 2021:

Fixed!

MarcoFalke commented at 3:56 pm on February 16, 2021: member

left some style feedback

in src/test/fuzz/socks5.cpp:14 in ff13f099c9 outdated

 9+
10+#include <cstdint>
11+#include <string>
12+#include <vector>
13+
14+bool Socks5(const std::string& strDest, int port, const ProxyCredentials* auth, const Sock& socket);

vasild commented at 5:24 pm on February 16, 2021:

nit: since Socks5() is not static now in netbase.cpp, I think it is more natural to put its declaration (prototype) in netbase.h and remove it from here.

Same for SOCKS5_RECV_TIMEOUT.

practicalswift commented at 1:23 pm on March 2, 2021:

Addressed!

practicalswift force-pushed on Feb 16, 2021

practicalswift commented at 7:16 pm on February 16, 2021: contributor

@laanwj Thanks for the quick review! ❤️ Pushed an updated version addressing @MarcoFalke’s feedback. Should hopefully be ready for final review.

in src/test/fuzz/util.h:641 in 88af03002c outdated

638+            return r;
639+        }
640+        std::memcpy(buf, random_bytes.data(), random_bytes.size());
641+        if (m_fuzzed_data_provider.ConsumeBool()) {
642+            if (len > random_bytes.size()) {
643+                std::memset((char*)buf + random_bytes.size(), 0, len - random_bytes.size());

vasild commented at 2:23 pm on February 17, 2021:

nit: the typecast (char*) in memset() should not be needed. Or if it is needed (why?), then it should also be needed for the memcpy() call a few lines earlier since both take void* argument.

ryanofsky commented at 2:32 pm on February 19, 2021:

re: #19203 (review)

nit: the typecast (char*) in memset() should not be needed. Or if it is needed (why?), then it should also be needed for the memcpy() call a few lines earlier since both take void* argument.

I think think the compiler won’t allow arithmetic + random_bytes.size() on a void pointer.

in src/test/fuzz/util.h:646 in 88af03002c outdated

643+                std::memset((char*)buf + random_bytes.size(), 0, len - random_bytes.size());
644+            }
645+            return len;
646+        }
647+        if (m_fuzzed_data_provider.ConsumeBool() && std::getenv("FUZZED_SOCKET_FAKE_LATENCY") != nullptr) {
648+            std::this_thread::sleep_for(std::chrono::milliseconds{2});

vasild commented at 2:35 pm on February 17, 2021:

SOCKS5_RECV_TIMEOUT can be lowered to 1 second for the tests or be the default 20 seconds. In half of the cases this will sleep for 2 ms. So we need 500 Recv() calls where this sleep is executed in order to trigger a higher level timeout of 1 second.

I think this timeout will practically never be reached. Maybe change this to a few 100s of ms to make a timeout more likely?

ryanofsky commented at 2:56 pm on February 19, 2021:

re: #19203 (review)

SOCKS5_RECV_TIMEOUT can be lowered to 1 second for the tests or be the default 20 seconds. In half of the cases this will sleep for 2 ms. So we need 500 Recv() calls where this sleep is executed in order to trigger a higher level timeout of 1 second.

I think this timeout will practically never be reached. Maybe change this to a few 100s of ms to make a timeout more likely?

SOCKS5_RECV_TIMEOUT looks like milliseconds not seconds so std::chrono::milliseconds{2} should always time out.

Also, I’m assuming this is fake-sleeping not really sleeping in the middle of a fuzz test?

vasild commented at 7:08 am on March 1, 2021:

You are right - it can be lowered to 1ms or be the default 20sec. If we want to always produce a timeout, maybe better to sleep for SOCKS5_RECV_TIMEOUT instead of std::chrono::milliseconds{2}?

Why would it be fake sleeping? My understanding is that it will actually sleep in the middle of the fuzz test.

practicalswift commented at 1:31 pm on March 2, 2021:

Unfortunately the sleeping required to make the timeout trigger is not mockable: that’s the reason for making the ugly 2 ms real sleep opt-in via the environment variable FUZZED_SOCKET_FAKE_LATENCY. That environment is not assumed to be set except in special circumstances such as when targeted fuzzing is performed by security researchers and others who are willing to pay the 2 ms cost in order to get full coverage including the timeout logic.

Making the sleeping required to trigger the timeout mockable is likely out of scope for this PR (but would be nice to have!), so if the opt-in-to-timeout-via-FUZZED_SOCKET_FAKE_LATENCY workaround is deemed too ugly I could simply drop that commit from this PR, and we’ll get partial instead of full coverage :)

Let me know what you think and I’ll adjust accordingly :)

vasild approved

vasild commented at 2:36 pm on February 17, 2021: member

ACK 88af03002c170d95dc04077f3bbcd739e2a9e5b7

vasild commented at 3:11 pm on February 17, 2021: member

Running the new fuzz test for a few tens of minutes produces no errors.

Reverting the fix for CVE-2017–18350 like this:

 0--- i/src/netbase.cpp
 1+++ w/src/netbase.cpp
 2@@ -533,13 +533,13 @@ bool Socks5(const std::string& strDest, int port, const ProxyCredentials* auth,
 3         case SOCKS5Atyp::DOMAINNAME:
 4         {
 5             recvr = InterruptibleRecv(pchRet3, 1, SOCKS5_RECV_TIMEOUT, sock);
 6             if (recvr != IntrRecvError::OK) {
 7                 return error("Error reading from proxy");
 8             }
 9-            int nRecv = pchRet3[0];
10+            int nRecv = (char)pchRet3[0];
11             recvr = InterruptibleRecv(pchRet3, nRecv, SOCKS5_RECV_TIMEOUT, sock);
12             break;
13         }
14         default: return error("Error: malformed proxy response");
15     }
16     if (recvr != IntrRecvError::OK) {

crashes the test in less than a minute.

vasild commented at 3:52 pm on February 17, 2021: member

The PR description needs an update since it mentions 4 commits, but there are just 2 now.

laanwj commented at 5:00 pm on February 17, 2021: member

The PR description needs an update since it mentions 4 commits, but there are just 2 now.

Ah yes I had already updated the PR title for it, but you’re right that the description is out of date too.

in src/test/fuzz/util.h:577 in 1d95ff9f73 outdated

572+        assert(false && "Not implemented yet.");
573+    }
574+
575+    ssize_t Send(const void* data, size_t len, int flags) const override
576+    {
577+        constexpr std::array send_errnos{

ryanofsky commented at 2:30 pm on February 19, 2021:

In commit “fuzz: Add fuzzing harness for Socks5(…)” (1d95ff9f731b0c5a317a9c3ce52e37b4a57e99ff)

It it better to make assumptions about return value and errno value with ConsumeIntegralInRange and PickValueInArray instead of just using ConsumeIntegral for these? ConsumeIntegral would definitely make the setup shorter and simpler and seem to make fuzzing more robust. Would it cause fuzzing to be too slow?

vasild commented at 7:12 am on March 1, 2021:

I think we want to mimic a conforming OS send() because our code is not prepared to handle a misbehaving send(), e.g.

one that returns -1 and does not set errno or
returns -1 and sets errno to 2379392 or
returns -9287 or
returns a number bigger than len.

in src/test/fuzz/util.h:632 in 1d95ff9f73 outdated

627+            return r;
628+        }
629+        const std::vector<uint8_t> random_bytes = m_fuzzed_data_provider.ConsumeBytes<uint8_t>(
630+            m_fuzzed_data_provider.ConsumeBool() ?
631+                len :
632+                m_fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, len));

ryanofsky commented at 2:37 pm on February 19, 2021:

In commit “fuzz: Add fuzzing harness for Socks5(…)” (1d95ff9f731b0c5a317a9c3ce52e37b4a57e99ff)

Again I don’t understand why need to bias the choice here instead of just choosing uniformly in the range. I guess I don’t understand enough about fuzzing but it seems annoying if you have to hold the fuzzer’s hand this way.

practicalswift commented at 3:34 pm on March 2, 2021:

Good question!

My intuition was that the fuzzer would be helped by being able to distinguish between the expected “success case” (enough to read) and the unexpected “fail” case ((likely) not enough to read) on a single byte input basis. In other words: a single byte input chooses between the “fail path” and the “success path” via ConsumeBool.

After some tinkering/measuring it seems like that effort to help the fuzzer didn’t make any significant difference.

Now skipping the ConsumeBool(…) ? len : ConsumeIntegralInRange(…) in favor of a straight ConsumeIntegralInRange(…) as suggested. Please re-review :)

in src/netbase.cpp:44 in 88af03002c outdated

40@@ -41,7 +41,7 @@ int nConnectTimeout = DEFAULT_CONNECT_TIMEOUT;
41 bool fNameLookup = DEFAULT_NAME_LOOKUP;
42 
43 // Need ample time for negotiation for very slow proxies such as Tor (milliseconds)
44-static const int SOCKS5_RECV_TIMEOUT = 20 * 1000;
45+int SOCKS5_RECV_TIMEOUT = 20 * 1000;

ryanofsky commented at 2:47 pm on February 19, 2021:

In commit “fuzz: Add FUZZED_SOCKET_FAKE_LATENCY mode to FuzzedSock to allow for fuzzing timeout logic” (88af03002c170d95dc04077f3bbcd739e2a9e5b7)

It seems dangerous to disguise a variable as an all caps constant. Best thing would be change this to a SOCKS5_RECV_TIMEOUT_DEFAULT default timeout, and add a runtime parameter to override the default. If that’s too complicated, next best thing would be to s/SOCKS5_RECV_TIMEOUT/g_socks5_recv_timeout/

practicalswift commented at 1:22 pm on March 2, 2021:

Addressed!

ryanofsky approved

ryanofsky commented at 3:07 pm on February 19, 2021: member

Code review ACK 88af03002c170d95dc04077f3bbcd739e2a9e5b7 with caveat that I don’t know very much about fuzzing so ignore my suggestions.

practicalswift force-pushed on Mar 2, 2021

vasild commented at 5:06 pm on March 2, 2021: member

b62b90fa0e642c89a75afcfeb7f0491fc8e93f00 looks good except that the first commit does not compile:

0test/fuzz/socks5.cpp:23:5: error: use of undeclared identifier 'default_socks5_recv_timeout'
1    default_socks5_recv_timeout = g_socks5_recv_timeout;
2    ^
3test/fuzz/socks5.cpp:23:35: error: use of undeclared identifier 'g_socks5_recv_timeout'
4    default_socks5_recv_timeout = g_socks5_recv_timeout;
5                                  ^

(the second commit fixes this)

fuzz: Add fuzzing harness for Socks5(...) b22d4c1607

fuzz: Add FUZZED_SOCKET_FAKE_LATENCY mode to FuzzedSock to allow for fuzzing timeout logic 366e3e1f89

practicalswift force-pushed on Mar 2, 2021

practicalswift commented at 9:59 pm on March 2, 2021: contributor

@vasild

b62b90f looks good except that the first commit does not compile: … (the second commit fixes this)

Oh, good catch!

Now addressed :)

Please re-review :)

vasild approved

vasild commented at 1:25 pm on March 3, 2021: member

ACK 366e3e1f89d99c62b548087384487b62fd602e17

MarcoFalke merged this on Mar 3, 2021

MarcoFalke closed this on Mar 3, 2021

sidhujag referenced this in commit 217b5ba671 on Mar 3, 2021

fanquake removed this from the "Blockers" column in a project

practicalswift deleted the branch on Apr 10, 2021

DrahtBot locked this on Aug 16, 2022

net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket. #19203

Conflicts