net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket. #19203

pull practicalswift wants to merge 2 commits into bitcoin:master from practicalswift:fuzzers-Socks5 changing 5 files +188 −17
  1. practicalswift commented at 8:23 pm on June 7, 2020: contributor

    Add regression fuzz harness for CVE-2017-18350. This fuzzing harness would have found CVE-2017-18350 within a minute of fuzzing :)

    See doc/fuzzing.md for information on how to fuzz Bitcoin Core. Don’t forget to contribute any coverage increasing inputs you find to the Bitcoin Core fuzzing corpus repo.

    Happy fuzzing :)

  2. DrahtBot added the label Build system on Jun 7, 2020
  3. DrahtBot added the label P2P on Jun 7, 2020
  4. DrahtBot added the label Tests on Jun 7, 2020
  5. MarcoFalke removed the label Build system on Jun 7, 2020
  6. MarcoFalke removed the label Tests on Jun 7, 2020
  7. MarcoFalke commented at 11:30 pm on June 7, 2020: member
    Concept ACK
  8. MarcoFalke renamed this:
    tests/net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket. Add thin SOCKET wrapper.
    net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket. Add thin SOCKET wrapper.
    on Jun 7, 2020
  9. DrahtBot commented at 11:34 pm on June 7, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #21328 (net, refactor: pass uint16 CService::port as uint16 by jonatack)
    • #19415 (net: Make DNS lookup mockable, add fuzzing harness by practicalswift)
    • #19259 (fuzz: Add fuzzing harness for LoadMempool(…) and DumpMempool(…) by practicalswift)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  10. practicalswift force-pushed on Jun 8, 2020
  11. DrahtBot added the label Needs rebase on Jun 11, 2020
  12. practicalswift force-pushed on Jun 11, 2020
  13. DrahtBot removed the label Needs rebase on Jun 11, 2020
  14. DrahtBot added the label Needs rebase on Jul 9, 2020
  15. practicalswift force-pushed on Jul 9, 2020
  16. DrahtBot removed the label Needs rebase on Jul 9, 2020
  17. practicalswift force-pushed on Jul 10, 2020
  18. DrahtBot added the label Needs rebase on Jul 11, 2020
  19. practicalswift force-pushed on Jul 11, 2020
  20. DrahtBot removed the label Needs rebase on Jul 11, 2020
  21. DrahtBot added the label Needs rebase on Jul 18, 2020
  22. practicalswift force-pushed on Jul 18, 2020
  23. practicalswift force-pushed on Jul 18, 2020
  24. DrahtBot removed the label Needs rebase on Jul 18, 2020
  25. in src/netbase.h:44 in 5da912d7cb outdated
    40@@ -37,6 +41,98 @@ class proxyType
    41     bool randomize_credentials;
    42 };
    43 
    44+struct timeval MillisToTimeval(int64_t nTimeout);
    


    Crypt-iQ commented at 2:09 am on July 21, 2020:
    This is a redundant declaration, it’s defined lower on line 164.

    practicalswift commented at 5:14 pm on August 14, 2020:
    It is now used prior to L164, so it must be on L44 – but removing L164. Thanks!
  26. in src/netbase.h:69 in 544a870557 outdated
    61+};
    62+
    63+// Represents a non-owned SOCKET.
    64+//
    65+// The underlying SOCKET will NOT be closed automatically on object destruction.
    66+class SocketImpl : public Socket
    


    Crypt-iQ commented at 11:09 am on July 21, 2020:
    Just would like to point out that this is currently unused. Is this intentional?

    practicalswift commented at 5:08 pm on August 14, 2020:

    It is used in netbase.cpp? :)

    0$ git grep SocketImpl
    1src/netbase.cpp:        SocketImpl socket{hSocket};
    2src/netbase.cpp:        SocketImpl socket{hSocket};
    3src/netbase.h:class SocketImpl : public Socket
    4src/netbase.h:    explicit SocketImpl(SOCKET socket) : m_socket{socket}
    

    Crypt-iQ commented at 5:11 pm on August 14, 2020:
    I meant that even though it’s used, the coverage isn’t updated for SocketImpl functions. Might be a bug with lcov perhaps.

    practicalswift commented at 5:17 pm on August 14, 2020:
    I think that is expected: it is currently only from ConnectThroughProxy which is not fuzzed :)

    Crypt-iQ commented at 5:22 pm on August 14, 2020:
    Ah derp! I mistakenly thought it was fuzzed for some reason, but FuzzedSocket is the one that’s used. Is the idea that they be used together in some manner eventually?

    practicalswift commented at 5:29 pm on August 14, 2020:
    @Crypt-iQ It is only used as a way to wrap an existing SOCKET as a Socket instance: no further plans beyond that :)
  27. in src/test/fuzz/util.h:475 in 4f18edbe55 outdated
    469@@ -470,4 +470,63 @@ void ReadFromStream(FuzzedDataProvider& fuzzed_data_provider, Stream& stream) no
    470     }
    471 }
    472 
    473+class FuzzedSocket : public Socket
    474+{
    475+    FuzzedDataProvider& fuzzed_data_provider;
    


    Crypt-iQ commented at 11:16 am on July 21, 2020:
    nit: m_fuzzed_data_provider so the reference in the constructor can just be fuzzed_data_provider?

    practicalswift commented at 5:17 pm on August 14, 2020:
    Good nit! Adressed!
  28. in src/test/fuzz/util.h:633 in 4f18edbe55 outdated
    501+            if (len > random_bytes.size()) {
    502+                std::memset(buf + random_bytes.size(), 0, len - random_bytes.size());
    503+            }
    504+            return len;
    505+        }
    506+        return random_bytes.size();
    


    Crypt-iQ commented at 11:20 am on July 21, 2020:
    Here it is still possible for random_bytes.size() < len since that condition is only checked if ConsumeBool returns true. Shouldn’t the correct number of bytes be returned if possible?

    practicalswift commented at 5:23 pm on August 14, 2020:
    We want to make it so that the fuzzer driven recv can return up to len bytes like the real recv, right? :)

    Crypt-iQ commented at 7:15 pm on August 14, 2020:
    Ah, yes that makes sense, thanks for clarifying.
  29. Crypt-iQ commented at 11:30 am on July 21, 2020: contributor
    Coverage from ~30 hours of libfuzzer fuzzing: https://crypt-iq.github.io/socks5_cov_outs/src/ Missing about 4 lines of coverage, will try to hit those again on a different run.
  30. practicalswift force-pushed on Aug 14, 2020
  31. Crypt-iQ commented at 5:04 am on September 1, 2020: contributor

    Builds on Ubuntu. Fails to build on macOS with brew clang.

     0Making all in src
     1  CXX      test/fuzz/addition_overflow-addition_overflow.o
     2In file included from test/fuzz/addition_overflow.cpp:7:
     3./test/fuzz/util.h:335:13: error: no matching function for call to 'AdditionOverflow'
     4        if (AdditionOverflow((uint64_t)fuzzed_file->m_offset, random_bytes.size())) {
     5            ^~~~~~~~~~~~~~~~
     6./test/fuzz/util.h:201:16: note: candidate template ignored: deduced conflicting types for parameter 'T' ('unsigned long long' vs. 'unsigned long')
     7NODISCARD bool AdditionOverflow(const T i, const T j) noexcept
     8               ^
     9./test/fuzz/util.h:346:13: error: no matching function for call to 'AdditionOverflow'
    10        if (AdditionOverflow(fuzzed_file->m_offset, n)) {
    11            ^~~~~~~~~~~~~~~~
    12./test/fuzz/util.h:201:16: note: candidate template ignored: deduced conflicting types for parameter 'T' ('long long' vs. 'long')
    13NODISCARD bool AdditionOverflow(const T i, const T j) noexcept
    14               ^
    152 errors generated.
    16make[2]: *** [test/fuzz/addition_overflow-addition_overflow.o] Error 1
    17make[1]: *** [all-recursive] Error 1
    18make: *** [all-recursive] Error 1
    
  32. practicalswift force-pushed on Sep 1, 2020
  33. practicalswift commented at 2:15 pm on September 1, 2020: contributor
    @Crypt-iQ Thanks for letting me know. Now rebased. Should compile on macOS now? :)
  34. Crypt-iQ commented at 7:08 pm on September 1, 2020: contributor
    @practicalswift Now compiles on macOS :). Will run the builds again and re-review.
  35. in src/test/fuzz/util.h:554 in 2d02f775fa outdated
    507+    }
    508+
    509+#ifdef USE_POLL
    510+    int poll(short events, int timeout_ms) override
    511+    {
    512+        return m_fuzzed_data_provider.ConsumeIntegralInRange<int>(-1, 1);
    


    Crypt-iQ commented at 5:41 am on September 2, 2020:

    Calling this with -1 as the min parameter to ConsumeIntegralInRange will result in a very large range since it will be cast to uint64_t: https://github.com/bitcoin/bitcoin/blob/48c1083632687a42ac603d4f241e70616a1d3815/src/test/fuzz/FuzzedDataProvider.h#L87

    Maybe an alternative version of ConsumeIntegralInRange for just -1,0,1 can be used? Not your fault, it’s the header file’s fault.


    Crypt-iQ commented at 5:51 am on September 2, 2020:
    Nevermind, godbolt tells me I am wrong.
  36. in src/test/fuzz/util.h:560 in 2d02f775fa outdated
    513+    }
    514+#endif
    515+
    516+    int select(bool watch_read, bool watch_write, int timeout_ms) override
    517+    {
    518+        return m_fuzzed_data_provider.ConsumeIntegralInRange<int>(-1, 1);
    


    Crypt-iQ commented at 5:41 am on September 2, 2020:
    Same here.
  37. Crypt-iQ commented at 5:34 pm on September 2, 2020: contributor

    ACK 2d02f775fa36c59bca5d7ef1a0e702c1bc2de117

    Code change looks good to me. I did not try to reintroduce the CVE, but I did read up about it. Ubuntu 18:

    • ./configure --enable-fuzz --with-sanitizers=address,undefined,integer,fuzzer - no errors
    • valgrind - no errors

    macOS 10.15.4:

    • ./configure --enable-fuzz --with-sanitizers=address,fuzzer --disable-asm - no errors
    • ./configure --enable-fuzz --with-sanitizers=undefined,integer,fuzzer - one complaint (suppressed by #19713)
     0/usr/local/opt/llvm/bin/../include/c++/v1/memory:1876:35: runtime error: implicit conversion from type 'char' of value -125 (8-bit, signed) to type 'unsigned char' changed the value to 131 (8-bit, unsigned)
     1    [#0](/bitcoin-bitcoin/0/) 0x1072b4bd4 in void std::__1::allocator_traits<std::__1::allocator<unsigned char> >::construct<unsigned char, char const&>(std::__1::allocator<unsigned char>&, unsigned char*, char const&) memory:1595
     2    [#1](/bitcoin-bitcoin/1/) 0x1072b49f9 in void std::__1::allocator_traits<std::__1::allocator<unsigned char> >::__construct_range_forward<std::__1::__wrap_iter<char const*>, unsigned char*>(std::__1::allocator<unsigned char>&, std::__1::__wrap_iter<char const*>, std::__1::__wrap_iter<char const*>, unsigned char*&) memory:1688
     3    [#2](/bitcoin-bitcoin/2/) 0x1072b3598 in std::__1::enable_if<__is_cpp17_forward_iterator<std::__1::__wrap_iter<char const*> >::value, void>::type std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >::__construct_at_end<std::__1::__wrap_iter<char const*> >(std::__1::__wrap_iter<char const*>, std::__1::__wrap_iter<char const*>, unsigned long) vector:1076
     4    [#3](/bitcoin-bitcoin/3/) 0x10728e2a2 in std::__1::enable_if<(__is_cpp17_forward_iterator<std::__1::__wrap_iter<char const*> >::value) && (is_constructible<unsigned char, std::__1::iterator_traits<std::__1::__wrap_iter<char const*> >::reference>::value), std::__1::__wrap_iter<unsigned char*> >::type std::__1::vector<unsigned char, std::__1::allocator<unsigned char> >::insert<std::__1::__wrap_iter<char const*> >(std::__1::__wrap_iter<unsigned char const*>, std::__1::__wrap_iter<char const*>, std::__1::__wrap_iter<char const*>) vector:1996
     5    [#4](/bitcoin-bitcoin/4/) 0x10728b18c in Socks5(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, ProxyCredentials const*, Socket&) netbase.cpp:492
     6    [#5](/bitcoin-bitcoin/5/) 0x10663a420 in test_one_input(std::__1::vector<unsigned char, std::__1::allocator<unsigned char> > const&) socks5.cpp:30
     7    [#6](/bitcoin-bitcoin/6/) 0x107663b28 in LLVMFuzzerTestOneInput fuzz.cpp:45
     8    [#7](/bitcoin-bitcoin/7/) 0x107862780 in fuzzer::Fuzzer::ExecuteCallback(unsigned char const*, unsigned long) FuzzerLoop.cpp:556
     9    [#8](/bitcoin-bitcoin/8/) 0x107861ec5 in fuzzer::Fuzzer::RunOne(unsigned char const*, unsigned long, bool, fuzzer::InputInfo*, bool*) FuzzerLoop.cpp:470
    10    [#9](/bitcoin-bitcoin/9/) 0x107863f76 in fuzzer::Fuzzer::MutateAndTestOne() FuzzerLoop.cpp:698
    11    [#10](/bitcoin-bitcoin/10/) 0x107864be5 in fuzzer::Fuzzer::Loop(std::__1::vector<fuzzer::SizedFile, fuzzer::fuzzer_allocator<fuzzer::SizedFile> >&) FuzzerLoop.cpp:830
    12    [#11](/bitcoin-bitcoin/11/) 0x107851d8d in fuzzer::FuzzerDriver(int*, char***, int (*)(unsigned char const*, unsigned long)) FuzzerDriver.cpp:829
    13    [#12](/bitcoin-bitcoin/12/) 0x10787da62 in main FuzzerMain.cpp:19
    14    [#13](/bitcoin-bitcoin/13/) 0x7fff73778cc8 in start+0x0 (libdyld.dylib:x86_64+0x1acc8)
    
  38. practicalswift force-pushed on Sep 17, 2020
  39. practicalswift commented at 1:38 pm on September 17, 2020: contributor
    Had to rebase also this one to make use of the new $(FUZZ_SUITE_LDFLAGS_COMMON). @Crypt-iQ May I ask you to review and hopefully give your second ACK on this PR? :)
  40. Crypt-iQ commented at 10:54 am on September 20, 2020: contributor
    ACK ea6cd7b2cc6b070b2630457d75b396de555b31c7
  41. practicalswift commented at 8:36 am on October 15, 2020: contributor

    Friendly review beg :)

    Concept NACK:s are totally fine too: that would allow me to move forward with other fuzzing stuff without having to keep this PR updated :)

  42. practicalswift commented at 8:38 am on October 15, 2020: contributor
    @MarcoFalke Thanks for your Concept ACK. Would you mind reviewing the code too? :)
  43. MarcoFalke commented at 10:21 am on November 10, 2020: member
    Sorry, I don’t know enough about sockets to be able to review this.
  44. practicalswift commented at 10:27 am on November 10, 2020: contributor

    @MarcoFalke Ah, got it! Do you know who would be an appropriate reviewer? :)

    FWIW I think sipa wrote the Socks5(...) code originally in #1141.

    Any review help welcome! Concept NACK is obviously fine too: I just want to move forward towards close (due to NACK or close due to merge) :)

  45. practicalswift force-pushed on Nov 13, 2020
  46. practicalswift commented at 10:07 am on November 13, 2020: contributor

    Rebased. Now using the names MockableSocket, Socket and FuzzedSocket.

    Socket and FuzzedSocket both implement MockableSocket.

    The introduced src/test/fuzz/socks5 would have caught the CVE-2017-18350 buffer overflow (which I caught via code review rather than fuzzing) within literally seconds of fuzzing :)

  47. fanquake added this to the "Blockers" column in a project

  48. DrahtBot added the label Needs rebase on Dec 15, 2020
  49. practicalswift force-pushed on Dec 16, 2020
  50. DrahtBot removed the label Needs rebase on Dec 16, 2020
  51. in src/test/fuzz/util.h:529 in b7a45b3555 outdated
    524+    ssize_t send(const char* buf, size_t len, int flags) override
    525+    {
    526+        if (m_fuzzed_data_provider.ConsumeBool()) {
    527+            return len;
    528+        }
    529+        return m_fuzzed_data_provider.ConsumeIntegralInRange<ssize_t>(-1, len);
    


    vasild commented at 10:24 am on December 21, 2020:

    Just a note: this could return 0 from send(). I am not sure if send(2) could ever return 0:

    On success, these calls return the number of characters sent. On error, -1 is returned

    anyway, I think it is ok to return 0 - to check that our code does not get bricked by that (in case send(2) actually returns 0).


    vasild commented at 10:30 am on December 21, 2020:
    I think it would be good to set errno if we are going to return an error (-1). Application reaction on error should differ depending on the value of errno (should retry on “temporary” errors).

    practicalswift commented at 6:09 pm on December 21, 2020:
    Good point! Pushed an updated version addressing this.
  52. in src/test/fuzz/util.h:534 in b7a45b3555 outdated
    529+        return m_fuzzed_data_provider.ConsumeIntegralInRange<ssize_t>(-1, len);
    530+    }
    531+
    532+    ssize_t recv(char* buf, size_t len, int flags) override
    533+    {
    534+        if (buf == nullptr || len == 0 || m_fuzzed_data_provider.ConsumeBool()) {
    


    vasild commented at 10:34 am on December 21, 2020:
    Maybe assert that if buf is nullptr then len is 0 before this if and then remove buf == nullptr from here.

    practicalswift commented at 6:09 pm on December 21, 2020:
    Good point! Pushed an updated version addressing this.
  53. in src/test/fuzz/util.h:539 in b7a45b3555 outdated
    534+        if (buf == nullptr || len == 0 || m_fuzzed_data_provider.ConsumeBool()) {
    535+            return m_fuzzed_data_provider.ConsumeBool() ? 0 : -1;
    536+        }
    537+        const std::vector<uint8_t> random_bytes = m_fuzzed_data_provider.ConsumeBytes<uint8_t>(len);
    538+        if (random_bytes.empty()) {
    539+            return m_fuzzed_data_provider.ConsumeBool() ? 0 : -1;
    


    vasild commented at 10:37 am on December 21, 2020:
    Here and above, set errno to some “random” value if we are going to return -1. Maybe the “random” value should have higher probablitity of being one of EAGAIN, EWOULDBLOCK, EINTR or EINPROGRESS.

    practicalswift commented at 6:10 pm on December 21, 2020:
    Good point! Pushed an updated version addressing this.
  54. in src/test/fuzz/util.h:629 in b7a45b3555 outdated
    539+            return m_fuzzed_data_provider.ConsumeBool() ? 0 : -1;
    540+        }
    541+        std::memcpy(buf, random_bytes.data(), random_bytes.size());
    542+        if (m_fuzzed_data_provider.ConsumeBool()) {
    543+            if (len > random_bytes.size()) {
    544+                std::memset(buf + random_bytes.size(), 0, len - random_bytes.size());
    


    vasild commented at 10:40 am on December 21, 2020:
    Why this? recv(2) would never set the remaining buffer to 0x0 bytes.

    practicalswift commented at 10:32 pm on December 21, 2020:

    This is intentional:

    If m_fuzzed_data_provider.ConsumeBool() we pretend to read len bytes but we only consume random_bytes.size() from the fuzzing input and let the remaining bytes be 0x0.

    That way we can satisfy a read of say 4096 bytes even if we only have say 20 bytes of input left: 20 bytes will be real data from the input and 4076 bytes will be 0x0 bytes.

    Makes sense? :)


    vasild commented at 10:48 am on December 22, 2020:
    Oh yes, we return len. Sorry for the noise.
  55. in src/test/fuzz/util.h:596 in b7a45b3555 outdated
    532+    ssize_t recv(char* buf, size_t len, int flags) override
    533+    {
    534+        if (buf == nullptr || len == 0 || m_fuzzed_data_provider.ConsumeBool()) {
    535+            return m_fuzzed_data_provider.ConsumeBool() ? 0 : -1;
    536+        }
    537+        const std::vector<uint8_t> random_bytes = m_fuzzed_data_provider.ConsumeBytes<uint8_t>(len);
    


    vasild commented at 10:44 am on December 21, 2020:

    What is the likelihood of this returning less than len? The ConsumeBytes() comment says:

    … If fewer than // |num_bytes| of data remain, returns a shorter std::vector containing all // of the data that’s left.

    I think we should make it more likely that this fuzzed recv() returns a partial read because this is when the “interesting” things happen in the app code as it should retry.


    practicalswift commented at 10:38 pm on December 21, 2020:
    Good point! Pushed an updated version addressing this.
  56. vasild commented at 10:51 am on December 21, 2020: member

    Concept ACK

    I am considering how to make a RAII socket class that would best fit in this PR, the rest of the code and in #20685, as you suggested.

    Some minor comments below in the meantime.

  57. in src/netbase.h:76 in b7a45b3555 outdated
    85+    ssize_t recv(char* buf, size_t len, int flags) override
    86+    {
    87+        return ::recv(m_socket, buf, len, flags);
    88+    }
    89+
    90+#ifdef USE_POLL
    


    laanwj commented at 5:08 pm on December 21, 2020:
    I’d prefer to have this implementation in an implementation unit instead of the header. This is not a templated class and implementing virtual methods can be done everywhere. This would also avoid adding the optional include poll.h here, I think.

    practicalswift commented at 6:32 pm on December 21, 2020:
    Good point! Pushed an updated version addressing this.
  58. laanwj commented at 5:08 pm on December 21, 2020: member
    Concept ACK
  59. practicalswift force-pushed on Dec 21, 2020
  60. practicalswift force-pushed on Dec 21, 2020
  61. practicalswift force-pushed on Dec 21, 2020
  62. practicalswift commented at 10:39 pm on December 21, 2020: contributor

    @laanwj @vasild

    Thanks for very good review feedback!

    I believe I’ve addressed all the issues raised. Please re-review :)

  63. vasild commented at 3:49 pm on December 23, 2020: member

    I am considering how to make a RAII socket class…

    What about something like this: https://github.com/vasild/bitcoin/commit/bf3fa6a10a69f3d0dbd84188e3ba089a19248b20?

  64. practicalswift commented at 5:24 pm on December 27, 2020: contributor
    @vasild Great! Would you mind submitting that as a separate PR? :)
  65. vasild commented at 4:19 pm on December 28, 2020: member

    @vasild Great! Would you mind submitting that as a separate PR? :)

    Here it is: #20788. If it gets merged, then the first commit in this PR can be dropped.

  66. in src/netbase.h:95 in 0370d7a22a outdated
     96+#endif
     97+
     98+    // Simplified select(2) interface: Wait until the socket becomes ready for
     99+    // reading (if watch_read) and/or writing (if watch_write). Timeout after
    100+    // timeout_ms milliseconds. See select(2) for return values.
    101+    int select(bool watch_read, bool watch_write, int timeout_ms) override;
    


    laanwj commented at 5:09 pm on January 7, 2021:
    This could be in the #else of USE_POLL?

    practicalswift commented at 5:01 pm on February 8, 2021:
    Good idea! Now addressed. Thanks!
  67. laanwj commented at 5:16 pm on January 7, 2021: member
    I still like the concept of this very much, but I just wondered if there isn’t a general way to fuzz network applications that doesn’t require wrapping every BSD socketsI call. I mean it seem like something people would want to commonly do.
  68. vasild commented at 11:00 am on January 14, 2021: member
    @laanwj It is possible to override functions (e.g. send(2)) without modifying the source code with LD_PRELOAD. I think that is not supported by the current testing framework, out of the box?
  69. practicalswift commented at 11:35 am on January 14, 2021: contributor

    I still like the concept of this very much, but I just wondered if there isn’t a general way to fuzz network applications that doesn’t require wrapping every BSD socketsI call. I mean it seem like something people would want to commonly do.

    Easy fuzzing of networking code is unfortunately an unsolved problem generally AFAIK.

    There have been many attempts to solve this by using LD_PRELOAD hacks as @vasild mentions. One notable example is preeny’s desock. My experience is that these LD_PRELOAD hacks are pretty fragile and while they might be good enough for short-term one-off fuzzing jobs (think targeted vulnerability research) they are not suitable for more long-term robustness testing needs like in our case.

    The best “off the shelf” solution to this problem I’ve seen is what Honggfuzz is doing with its networking mode libhfnetdriver (see http://blog.swiecki.net/2018/01/fuzzing-tcp-servers.html). Unfortunately that solutions is limited to Honggfuzz and it is more suited for “macro level” networking code fuzzing than the more “micro level” networking code fuzzing this PR targets.

    That said: Honggfuzz NetDriver is great too have in the toolbox too: I’ve tried documenting how to fuzz Bitcoin Core using Honggfuzz’s networking mode. See the unmerged PR #20380 (“doc: Add instructions on how to fuzz the P2P layer using Honggfuzz NetDriver”). Review welcome! :)

    tl;dr – Mocked networking functions driven by FuzzedDataProvider as suggested in this PR is AFAICT the most robust and appropriate way to solve our low-level network fuzzing needs. That said it would be interesting to know if anyone has any alternative approach which is both robust and fuzzer engine agnostic: automated robustness testing of networking code is an active area of research :)

  70. laanwj commented at 9:59 pm on January 24, 2021: member

    tl;dr – Mocked networking functions driven by FuzzedDataProvider as suggested in this PR is AFAICT the most robust and appropriate way to solve our low-level network fuzzing needs.

    Sure. Fair enough. It’s just that I’m in principle not a fan of wrapping and abstracting basic operating system APIs such as BSD Sockets because anyone contributing to the code has to learn a special interface instead of being up to speed quickly with what they already know. It’s also another moving part that could have bugs itself.

    That said, it seems this is useful for other things such as I2P support so that seems enough of a motivation to do it anyway. If we do it, it would be good to do so at a global level as #20788 does.

  71. vasild commented at 10:03 am on January 25, 2021: member

    anyone contributing to the code has to learn a special interface instead of being up to speed quickly with what they already know

    Indeed! This is why #20788 mimics the system interface:

    0ssize_t ret = recv(socket, data, len, 0);
    1// vs (same arguments, return type and semantic)
    2ssize_t ret = sock.Recv(data, len, 0);
    
  72. practicalswift commented at 9:34 am on January 26, 2021: contributor
    @laanwj Good points. When/if @vasild’s #20788 is merged I’ll use that facility instead :)
  73. Crypt-iQ commented at 1:31 pm on January 26, 2021: contributor
    @laanwj Another PR mocks calls to getaddrinfo (#19415). I think introducing LD_PRELOAD into fuzz builds quickly becomes complicated with more fuzz tests - where do these shared objects go and who maintains them? The tradeoff is more mocking code, so I favor the approach in this PR.
  74. in src/netbase.h:14 in 0370d7a22a outdated
    10@@ -11,6 +11,7 @@
    11 
    12 #include <compat.h>
    13 #include <netaddress.h>
    14+#include <protocol.h>
    


    Crypt-iQ commented at 9:51 pm on February 7, 2021:
    I can’t see where this is used

    practicalswift commented at 5:00 pm on February 8, 2021:
    Now removed. Thanks!
  75. in src/netbase.h:44 in 0370d7a22a outdated
    37@@ -37,6 +38,79 @@ class proxyType
    38     bool randomize_credentials;
    39 };
    40 
    41+/**
    42+ * Convert milliseconds to a struct timeval for e.g. select.
    43+ */
    44+struct timeval MillisToTimeval(int64_t nTimeout);
    


    Crypt-iQ commented at 9:51 pm on February 7, 2021:
    This move is an artifact of an earlier commit

    practicalswift commented at 5:01 pm on February 8, 2021:
    Now addressed. Thanks!
  76. in src/test/fuzz/util.h:652 in 0370d7a22a outdated
    647+    {
    648+        return m_fuzzed_data_provider.ConsumeBool();
    649+    }
    650+};
    651+
    652+FuzzedSocket ConsumeSocket(FuzzedDataProvider& fuzzed_data_provider)
    


    Crypt-iQ commented at 9:54 pm on February 7, 2021:

    Should use inline. See #20733

    macOS build will warn with

    0ld: warning: duplicate symbol 'ConsumeSocket(FuzzedDataProvider&)' in:
    1    test/fuzz/fuzz-addition_overflow.o
    2    libtest_fuzz.a(libtest_fuzz_a-util.o)
    

    practicalswift commented at 5:01 pm on February 8, 2021:
    Now addressed. Thanks!
  77. Crypt-iQ commented at 9:56 pm on February 7, 2021: contributor
  78. practicalswift force-pushed on Feb 8, 2021
  79. practicalswift commented at 5:02 pm on February 8, 2021: contributor
    @laanwj @Crypt-iQ Feedback addressed. Please re-review :)
  80. practicalswift force-pushed on Feb 11, 2021
  81. laanwj commented at 1:10 pm on February 11, 2021: member
    Needs rebase after #20788 (should be a much smaller change now).
  82. DrahtBot added the label Needs rebase on Feb 11, 2021
  83. practicalswift force-pushed on Feb 16, 2021
  84. DrahtBot removed the label Needs rebase on Feb 16, 2021
  85. laanwj commented at 3:37 pm on February 16, 2021: member
    Code review ACK ff13f099c922b90b6952d896d948d713ef6e9f7d
  86. laanwj renamed this:
    net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket. Add thin SOCKET wrapper.
    net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket.
    on Feb 16, 2021
  87. in src/test/fuzz/util.h:257 in ff13f099c9 outdated
    249@@ -250,6 +250,24 @@ template <class T>
    250     return false;
    251 }
    252 
    253+/**
    254+ * Returns a copy of the value selected from the given vector `v`.
    255+ */
    256+template <typename T>
    257+[[nodiscard]] T PickValue(FuzzedDataProvider& fuzzed_data_provider, const std::vector<T> v) noexcept
    


    MarcoFalke commented at 3:47 pm on February 16, 2021:
    why is the full vector copied?

    practicalswift commented at 6:27 pm on February 16, 2021:
    Oh, that was not intentional. Thanks!
  88. in src/test/fuzz/util.h:266 in ff13f099c9 outdated
    261+}
    262+
    263+/**
    264+ * Sets errno to a value selected from the given vector `errnos`.
    265+ */
    266+inline void SetFuzzedErrNo(FuzzedDataProvider& fuzzed_data_provider, const std::vector<int> errnos) noexcept
    


    MarcoFalke commented at 3:50 pm on February 16, 2021:
    same

    MarcoFalke commented at 3:51 pm on February 16, 2021:
    Also I am wondering why this can’t use PickValueInArray

    practicalswift commented at 7:15 pm on February 16, 2021:
    1. Thanks! Fixed.
    2. Now using std::array and PickValueInArray.
  89. in src/test/fuzz/socks5.cpp:40 in ff13f099c9 outdated
    35+    // will slow down fuzzing.
    36+    SOCKS5_RECV_TIMEOUT = (fuzzed_data_provider.ConsumeBool() && std::getenv("FUZZED_SOCKET_FAKE_LATENCY") != nullptr) ? 1 : DEFAULT_SOCKS5_RECV_TIMEOUT;
    37+    FuzzedSock fuzzed_sock = ConsumeSock(fuzzed_data_provider);
    38+    // This Socks5(...) fuzzing harness would have caught CVE-2017-18350 within
    39+    // a few seconds of fuzzing.
    40+    (void)Socks5(fuzzed_data_provider.ConsumeRandomLengthString(512), fuzzed_data_provider.ConsumeIntegral<int>(), fuzzed_data_provider.ConsumeBool() ? &proxy_credentials : nullptr, fuzzed_sock);
    


    MarcoFalke commented at 3:55 pm on February 16, 2021:
    nit: Would be nice to break long lines after each , or ?, or :

    practicalswift commented at 7:15 pm on February 16, 2021:
    Fixed!

    vasild commented at 1:37 pm on February 18, 2021:

    Would be nice to break long lines …

    Opened #21223 style: make clang-format break long lines

  90. in src/test/fuzz/util.h:629 in ff13f099c9 outdated
    632+            if (r == -1) {
    633+                SetFuzzedErrNo(m_fuzzed_data_provider, recv_errnos);
    634+            }
    635+            return r;
    636+        }
    637+        const std::vector<uint8_t> random_bytes = m_fuzzed_data_provider.ConsumeBytes<uint8_t>(m_fuzzed_data_provider.ConsumeBool() ? len : m_fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, len));
    


    MarcoFalke commented at 3:55 pm on February 16, 2021:
    nit: Would be nice to break long lines after each ?, or :

    practicalswift commented at 7:15 pm on February 16, 2021:
    Fixed!
  91. MarcoFalke commented at 3:56 pm on February 16, 2021: member
    left some style feedback
  92. in src/test/fuzz/socks5.cpp:14 in ff13f099c9 outdated
     9+
    10+#include <cstdint>
    11+#include <string>
    12+#include <vector>
    13+
    14+bool Socks5(const std::string& strDest, int port, const ProxyCredentials* auth, const Sock& socket);
    


    vasild commented at 5:24 pm on February 16, 2021:

    nit: since Socks5() is not static now in netbase.cpp, I think it is more natural to put its declaration (prototype) in netbase.h and remove it from here.

    Same for SOCKS5_RECV_TIMEOUT.


    practicalswift commented at 1:23 pm on March 2, 2021:
    Addressed!
  93. practicalswift force-pushed on Feb 16, 2021
  94. practicalswift force-pushed on Feb 16, 2021
  95. practicalswift commented at 7:16 pm on February 16, 2021: contributor
    @laanwj Thanks for the quick review! ❤️ Pushed an updated version addressing @MarcoFalke’s feedback. Should hopefully be ready for final review.
  96. in src/test/fuzz/util.h:641 in 88af03002c outdated
    638+            return r;
    639+        }
    640+        std::memcpy(buf, random_bytes.data(), random_bytes.size());
    641+        if (m_fuzzed_data_provider.ConsumeBool()) {
    642+            if (len > random_bytes.size()) {
    643+                std::memset((char*)buf + random_bytes.size(), 0, len - random_bytes.size());
    


    vasild commented at 2:23 pm on February 17, 2021:
    nit: the typecast (char*) in memset() should not be needed. Or if it is needed (why?), then it should also be needed for the memcpy() call a few lines earlier since both take void* argument.

    ryanofsky commented at 2:32 pm on February 19, 2021:

    re: #19203 (review)

    nit: the typecast (char*) in memset() should not be needed. Or if it is needed (why?), then it should also be needed for the memcpy() call a few lines earlier since both take void* argument.

    I think think the compiler won’t allow arithmetic + random_bytes.size() on a void pointer.

  97. in src/test/fuzz/util.h:646 in 88af03002c outdated
    643+                std::memset((char*)buf + random_bytes.size(), 0, len - random_bytes.size());
    644+            }
    645+            return len;
    646+        }
    647+        if (m_fuzzed_data_provider.ConsumeBool() && std::getenv("FUZZED_SOCKET_FAKE_LATENCY") != nullptr) {
    648+            std::this_thread::sleep_for(std::chrono::milliseconds{2});
    


    vasild commented at 2:35 pm on February 17, 2021:

    SOCKS5_RECV_TIMEOUT can be lowered to 1 second for the tests or be the default 20 seconds. In half of the cases this will sleep for 2 ms. So we need 500 Recv() calls where this sleep is executed in order to trigger a higher level timeout of 1 second.

    I think this timeout will practically never be reached. Maybe change this to a few 100s of ms to make a timeout more likely?


    ryanofsky commented at 2:56 pm on February 19, 2021:

    re: #19203 (review)

    SOCKS5_RECV_TIMEOUT can be lowered to 1 second for the tests or be the default 20 seconds. In half of the cases this will sleep for 2 ms. So we need 500 Recv() calls where this sleep is executed in order to trigger a higher level timeout of 1 second.

    I think this timeout will practically never be reached. Maybe change this to a few 100s of ms to make a timeout more likely?

    SOCKS5_RECV_TIMEOUT looks like milliseconds not seconds so std::chrono::milliseconds{2} should always time out.

    Also, I’m assuming this is fake-sleeping not really sleeping in the middle of a fuzz test?


    vasild commented at 7:08 am on March 1, 2021:

    You are right - it can be lowered to 1ms or be the default 20sec. If we want to always produce a timeout, maybe better to sleep for SOCKS5_RECV_TIMEOUT instead of std::chrono::milliseconds{2}?

    Why would it be fake sleeping? My understanding is that it will actually sleep in the middle of the fuzz test.


    practicalswift commented at 1:31 pm on March 2, 2021:

    Unfortunately the sleeping required to make the timeout trigger is not mockable: that’s the reason for making the ugly 2 ms real sleep opt-in via the environment variable FUZZED_SOCKET_FAKE_LATENCY. That environment is not assumed to be set except in special circumstances such as when targeted fuzzing is performed by security researchers and others who are willing to pay the 2 ms cost in order to get full coverage including the timeout logic.

    Making the sleeping required to trigger the timeout mockable is likely out of scope for this PR (but would be nice to have!), so if the opt-in-to-timeout-via-FUZZED_SOCKET_FAKE_LATENCY workaround is deemed too ugly I could simply drop that commit from this PR, and we’ll get partial instead of full coverage :)

    Let me know what you think and I’ll adjust accordingly :)

  98. vasild approved
  99. vasild commented at 2:36 pm on February 17, 2021: member
    ACK 88af03002c170d95dc04077f3bbcd739e2a9e5b7
  100. vasild commented at 3:11 pm on February 17, 2021: member

    Running the new fuzz test for a few tens of minutes produces no errors.

    Reverting the fix for CVE-2017–18350 like this:

     0--- i/src/netbase.cpp
     1+++ w/src/netbase.cpp
     2@@ -533,13 +533,13 @@ bool Socks5(const std::string& strDest, int port, const ProxyCredentials* auth,
     3         case SOCKS5Atyp::DOMAINNAME:
     4         {
     5             recvr = InterruptibleRecv(pchRet3, 1, SOCKS5_RECV_TIMEOUT, sock);
     6             if (recvr != IntrRecvError::OK) {
     7                 return error("Error reading from proxy");
     8             }
     9-            int nRecv = pchRet3[0];
    10+            int nRecv = (char)pchRet3[0];
    11             recvr = InterruptibleRecv(pchRet3, nRecv, SOCKS5_RECV_TIMEOUT, sock);
    12             break;
    13         }
    14         default: return error("Error: malformed proxy response");
    15     }
    16     if (recvr != IntrRecvError::OK) {
    

    crashes the test in less than a minute.

  101. vasild commented at 3:52 pm on February 17, 2021: member
    The PR description needs an update since it mentions 4 commits, but there are just 2 now.
  102. laanwj commented at 5:00 pm on February 17, 2021: member

    The PR description needs an update since it mentions 4 commits, but there are just 2 now.

    Ah yes I had already updated the PR title for it, but you’re right that the description is out of date too.

  103. in src/test/fuzz/util.h:577 in 1d95ff9f73 outdated
    572+        assert(false && "Not implemented yet.");
    573+    }
    574+
    575+    ssize_t Send(const void* data, size_t len, int flags) const override
    576+    {
    577+        constexpr std::array send_errnos{
    


    ryanofsky commented at 2:30 pm on February 19, 2021:

    In commit “fuzz: Add fuzzing harness for Socks5(…)” (1d95ff9f731b0c5a317a9c3ce52e37b4a57e99ff)

    It it better to make assumptions about return value and errno value with ConsumeIntegralInRange and PickValueInArray instead of just using ConsumeIntegral for these? ConsumeIntegral would definitely make the setup shorter and simpler and seem to make fuzzing more robust. Would it cause fuzzing to be too slow?


    vasild commented at 7:12 am on March 1, 2021:

    I think we want to mimic a conforming OS send() because our code is not prepared to handle a misbehaving send(), e.g.

    • one that returns -1 and does not set errno or
    • returns -1 and sets errno to 2379392 or
    • returns -9287 or
    • returns a number bigger than len.
  104. in src/test/fuzz/util.h:632 in 1d95ff9f73 outdated
    627+            return r;
    628+        }
    629+        const std::vector<uint8_t> random_bytes = m_fuzzed_data_provider.ConsumeBytes<uint8_t>(
    630+            m_fuzzed_data_provider.ConsumeBool() ?
    631+                len :
    632+                m_fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, len));
    


    ryanofsky commented at 2:37 pm on February 19, 2021:

    In commit “fuzz: Add fuzzing harness for Socks5(…)” (1d95ff9f731b0c5a317a9c3ce52e37b4a57e99ff)

    Again I don’t understand why need to bias the choice here instead of just choosing uniformly in the range. I guess I don’t understand enough about fuzzing but it seems annoying if you have to hold the fuzzer’s hand this way.


    practicalswift commented at 3:34 pm on March 2, 2021:

    Good question!

    My intuition was that the fuzzer would be helped by being able to distinguish between the expected “success case” (enough to read) and the unexpected “fail” case ((likely) not enough to read) on a single byte input basis. In other words: a single byte input chooses between the “fail path” and the “success path” via ConsumeBool.

    After some tinkering/measuring it seems like that effort to help the fuzzer didn’t make any significant difference.

    Now skipping the ConsumeBool(…) ? len : ConsumeIntegralInRange(…) in favor of a straight ConsumeIntegralInRange(…) as suggested. Please re-review :)

  105. in src/netbase.cpp:44 in 88af03002c outdated
    40@@ -41,7 +41,7 @@ int nConnectTimeout = DEFAULT_CONNECT_TIMEOUT;
    41 bool fNameLookup = DEFAULT_NAME_LOOKUP;
    42 
    43 // Need ample time for negotiation for very slow proxies such as Tor (milliseconds)
    44-static const int SOCKS5_RECV_TIMEOUT = 20 * 1000;
    45+int SOCKS5_RECV_TIMEOUT = 20 * 1000;
    


    ryanofsky commented at 2:47 pm on February 19, 2021:

    In commit “fuzz: Add FUZZED_SOCKET_FAKE_LATENCY mode to FuzzedSock to allow for fuzzing timeout logic” (88af03002c170d95dc04077f3bbcd739e2a9e5b7)

    It seems dangerous to disguise a variable as an all caps constant. Best thing would be change this to a SOCKS5_RECV_TIMEOUT_DEFAULT default timeout, and add a runtime parameter to override the default. If that’s too complicated, next best thing would be to s/SOCKS5_RECV_TIMEOUT/g_socks5_recv_timeout/


    practicalswift commented at 1:22 pm on March 2, 2021:
    Addressed!
  106. ryanofsky approved
  107. ryanofsky commented at 3:07 pm on February 19, 2021: member
    Code review ACK 88af03002c170d95dc04077f3bbcd739e2a9e5b7 with caveat that I don’t know very much about fuzzing so ignore my suggestions.
  108. practicalswift force-pushed on Mar 2, 2021
  109. practicalswift force-pushed on Mar 2, 2021
  110. practicalswift force-pushed on Mar 2, 2021
  111. practicalswift force-pushed on Mar 2, 2021
  112. vasild commented at 5:06 pm on March 2, 2021: member

    b62b90fa0e642c89a75afcfeb7f0491fc8e93f00 looks good except that the first commit does not compile:

    0test/fuzz/socks5.cpp:23:5: error: use of undeclared identifier 'default_socks5_recv_timeout'
    1    default_socks5_recv_timeout = g_socks5_recv_timeout;
    2    ^
    3test/fuzz/socks5.cpp:23:35: error: use of undeclared identifier 'g_socks5_recv_timeout'
    4    default_socks5_recv_timeout = g_socks5_recv_timeout;
    5                                  ^
    

    (the second commit fixes this)

  113. fuzz: Add fuzzing harness for Socks5(...) b22d4c1607
  114. fuzz: Add FUZZED_SOCKET_FAKE_LATENCY mode to FuzzedSock to allow for fuzzing timeout logic 366e3e1f89
  115. practicalswift force-pushed on Mar 2, 2021
  116. practicalswift commented at 9:59 pm on March 2, 2021: contributor

    @vasild

    b62b90f looks good except that the first commit does not compile: … (the second commit fixes this)

    Oh, good catch!

    Now addressed :)

    Please re-review :)

  117. vasild approved
  118. vasild commented at 1:25 pm on March 3, 2021: member
    ACK 366e3e1f89d99c62b548087384487b62fd602e17
  119. MarcoFalke merged this on Mar 3, 2021
  120. MarcoFalke closed this on Mar 3, 2021

  121. sidhujag referenced this in commit 217b5ba671 on Mar 3, 2021
  122. fanquake removed this from the "Blockers" column in a project

  123. practicalswift deleted the branch on Apr 10, 2021
  124. DrahtBot locked this on Aug 16, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me