net: fuzz: bypass network magic and checksum validation #31012

pull brunoerg wants to merge 3 commits into bitcoin:master from brunoerg:2024-09-fuzz-netdriver changing 3 files +15 −53
  1. brunoerg commented at 4:46 pm on October 1, 2024: contributor

    Fixes #30957

    We currently have some instructions on the documentation about how to fuzz the Bitcoin Core P2P layer using Honggfuzz NetDriver. It includes a patch to be applied to bypass network magic and checksum validation, that is outdated. However, we can change the code to bypass them using FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION. It also makes the p2p_transport_serialization fuzz target simpler since we do not need to assist the network magic and checksum creation anymore.

  2. DrahtBot commented at 4:46 pm on October 1, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK dergoegge, marcofleon

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. DrahtBot added the label P2P on Oct 1, 2024
  4. brunoerg commented at 4:47 pm on October 1, 2024: contributor
  5. dergoegge commented at 4:59 pm on October 1, 2024: member
    Concept ACK
  6. in doc/fuzzing.md:248 in 6d995d738f outdated
    265-+    if (false && memcmp(hash.begin(), hdr.pchChecksum, CMessageHeader::CHECKSUM_SIZE) != 0) { // skip checksum checking
    266-         LogDebug(BCLog::NET, "Header error: Wrong checksum (%s, %u bytes), expected %s was %s, peer=%d\n",
    267-                  SanitizeString(msg.m_type), msg.m_message_size,
    268-                  HexStr(Span{hash}.first(CMessageHeader::CHECKSUM_SIZE)),
    269 EOF
    270 $ cmake -B build_fuzz \
    


    dergoegge commented at 5:21 pm on October 1, 2024:
    CPPFLAGS will have to include -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION when building bitcoind for fuzzing with the honggfuzz netdriver.

    brunoerg commented at 7:40 pm on October 1, 2024:

    Isn’t FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION already set when using -DBUILD_FOR_FUZZING=ON?

     0if(BUILD_FOR_FUZZING)
     1  message(WARNING "BUILD_FOR_FUZZING=ON will disable all other targets and force BUILD_FUZZ_BINARY=ON.")
     2  set(BUILD_DAEMON OFF)
     3  set(BUILD_CLI OFF)
     4  set(BUILD_TX OFF)
     5  set(BUILD_UTIL OFF)
     6  set(BUILD_UTIL_CHAINSTATE OFF)
     7  set(BUILD_KERNEL_LIB OFF)
     8  set(BUILD_WALLET_TOOL OFF)
     9  set(BUILD_GUI OFF)
    10  set(ENABLE_EXTERNAL_SIGNER OFF)
    11  set(WITH_MINIUPNPC OFF)
    12  set(WITH_ZMQ OFF)
    13  set(BUILD_TESTS OFF)
    14  set(BUILD_GUI_TESTS OFF)
    15  set(BUILD_BENCH OFF)
    16  set(BUILD_FUZZ_BINARY ON)
    17
    18  target_compile_definitions(core_interface INTERFACE
    19    ABORT_ON_FAILED_ASSUME
    20    FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    21  )
    22endif()
    

    maflcko commented at 7:58 pm on October 1, 2024:

    DBUILD_FOR_FUZZING

    This isn’t set for the hfuzz net driver. It uses the HFND_FUZZING_ENTRY_FUNCTION_CXX


    brunoerg commented at 8:12 pm on October 1, 2024:
    Yes, missed that. Just added it.
  7. DrahtBot added the label CI failed on Oct 1, 2024
  8. brunoerg force-pushed on Oct 1, 2024
  9. brunoerg force-pushed on Oct 1, 2024
  10. brunoerg marked this as a draft on Oct 1, 2024
  11. brunoerg commented at 8:38 pm on October 1, 2024: contributor
    Just moved it to draft to get some conceptual feedbacks first. I added a commit to bypass some asserts in ConnmanTestMsg that would fail since we’re bypassing network magic and checksum validation. Also, I think we will have to change p2p_transport_bidirectional target since there are some sanity checks that would fail with the bypasses, especially the assertions about ReceivedBytes/GetReceivedMessage. Any thoughts?
  12. maflcko commented at 6:10 am on October 2, 2024: member

    I added a commit to bypass some asserts in ConnmanTestMsg that would fail since we’re bypassing network magic and checksum validation. Also, I think we will have to change p2p_transport_bidirectional target since there are some sanity checks that would fail with the bypasses, especially the assertions about ReceivedBytes/GetReceivedMessage. Any thoughts?

    Can you explain why the asserts and sanity checks would be failing? They are unrelated to the checksum, so if they are failing, it seems like this patch is incomplete and some part is still checking the magic or checksum, no?

  13. dergoegge commented at 9:54 am on October 2, 2024: member

    They are unrelated to the checksum, so if they are failing, it seems like this patch is incomplete and some part is still checking the magic or checksum, no?

    I’m guessing that we need to make sure that (if we are fuzzing) V1Transport::SetMessageToSend creates valid checksums/magic-bytes according to the new fuzzing mode check.

  14. brunoerg force-pushed on Oct 2, 2024
  15. brunoerg commented at 10:06 am on October 2, 2024: contributor

    Can you explain why the asserts and sanity checks would be failing? They are unrelated to the checksum, so if they are failing, it seems like this patch is incomplete and some part is still checking the magic or checksum, no?

    Some previously valid magic or checksum could be now be considered invalid. But I removed the commit that bypasses the asserts and changed the verification to:

    0#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    1    if ((hdr.pchMessageStart[0] & 1) != 0 && hdr.pchMessageStart != m_magic_bytes) {
    2#else
    

    instead of just:

    0#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    1    if ((hdr.pchMessageStart[0] & 1) != 0) {
    2#else
    

    I think this way we still facilitate it without invalidating valid magic/checksum.

  16. in src/net.cpp:734 in ffdbf6c45b outdated
    727@@ -728,7 +728,11 @@ int V1Transport::readHeader(Span<const uint8_t> msg_bytes)
    728     }
    729 
    730     // Check start string, network magic
    731+#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    732+    if ((hdr.pchMessageStart[0] & 1) != 0 && hdr.pchMessageStart != m_magic_bytes) {
    733+#else
    734     if (hdr.pchMessageStart != m_magic_bytes) {
    735+#endif
    


    maflcko commented at 10:50 am on October 2, 2024:

    maybe I am missing something, but this looks still wrong. I am reading this as fuzzing adding additional constraints, which seems the inverse of what we want to achieve.

    Also, this could be impossible to satisfy, if !(m_magic_bytes[0]&1) holds. What am I missing?

    Edit: Nvm, this is about avoiding the error, so additional constraints are useful to avoid hitting it.


    brunoerg commented at 11:07 am on October 2, 2024:
    Before, if hdr.pchMessageStart != m_magic_bytes we would return an error. That is, fuzzing would have to spend time matching it. Now, we will only invalidate it if hdr.pchMessageStart != m_magic_bytes and (hdr.pchMessageStart[0] & 1) != 0. So, it will make fuzzing easier by just needing to match (hdr.pchMessageStart[0] & 1) == 0.
  17. brunoerg marked this as ready for review on Oct 2, 2024
  18. maflcko removed the label CI failed on Oct 2, 2024
  19. in src/net.cpp:803 in ffdbf6c45b outdated
    796@@ -793,7 +797,11 @@ CNetMessage V1Transport::GetReceivedMessage(const std::chrono::microseconds time
    797     RandAddEvent(ReadLE32(hash.begin()));
    798 
    799     // Check checksum and header message type string
    800+#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    801+    if ((hdr.pchChecksum[0] & 1) != 0 && memcmp(hash.begin(), hdr.pchChecksum, CMessageHeader::CHECKSUM_SIZE) != 0) {
    802+#else
    803     if (memcmp(hash.begin(), hdr.pchChecksum, CMessageHeader::CHECKSUM_SIZE) != 0) {
    804+#endif
    


    dergoegge commented at 8:58 am on October 3, 2024:

    I think this could be de-duplicated and made a little cleaner (same for the magic byte check):

    0    bool checksum_ok{memcmp(hash.begin(), hdr.pchChecksum, CMessageHeader::CHECKSUM_SIZE) == 0};
    1#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    2    checksum_ok = checksum_ok || (hdr.pchChecksum[0] & 1) != 0;
    3#endif
    4
    5    if (!checksum_ok) {
    6        ...
    

    brunoerg commented at 10:39 am on October 3, 2024:
    Nice, I will address it.
  20. in doc/fuzzing.md:201 in ffdbf6c45b outdated
    197@@ -198,6 +198,7 @@ $ cmake -B build_fuzz \
    198    -DCMAKE_C_COMPILER="$(pwd)/honggfuzz/hfuzz_cc/hfuzz-clang" \
    199    -DCMAKE_CXX_COMPILER="$(pwd)/honggfuzz/hfuzz_cc/hfuzz-clang++" \
    200    -DBUILD_FOR_FUZZING=ON \
    201+   -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION=ON \
    


    dergoegge commented at 9:06 am on October 3, 2024:

    This won’t define the c++ macro FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION, as this only sets a cmake build flag (cmake probably complains that this flag doesn’t exist).

    You should be able to use -DAPPEND_CPPFLAGS="-DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION" (I see where the confusion comes from, as -D is accepted by compilers to define macros and by cmake to set build flags but they are different things).

    You will also need to add this to the correct section in the docs (i.e. the netdriver docs):

    https://github.com/bitcoin/bitcoin/blob/ffdbf6c45be2a69050401cd5dad785758cee21d4/doc/fuzzing.md?plain=1#L246-L258


    brunoerg commented at 10:50 am on October 3, 2024:

    You should be able to use -DAPPEND_CPPFLAGS="-DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION" (I see where the confusion comes from, as -D is accepted by compilers to define macros and by cmake to set build flags but they are different things).

    Ah yes, thank you. Just saw some examples from the CI scripts.

  21. dergoegge changes_requested
  22. brunoerg force-pushed on Oct 3, 2024
  23. brunoerg force-pushed on Oct 3, 2024
  24. brunoerg commented at 10:51 am on October 3, 2024: contributor
    Force-pushed addressing #31012 (review) and #31012 (review). Thanks, @dergoegge.
  25. marcofleon commented at 4:01 pm on October 4, 2024: contributor

    Approach ACK. The use of FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION in net.cpp looks good to me. Generated a coverage report to verify that both branches of both checks are hit.

    I haven’t yet worked through the honggfuzz instructions to confirm that they work. I’ll circle back to this to give it a try.

  26. dergoegge commented at 10:53 am on October 7, 2024: member

    Copy and pasting the command from the docs doesn’t work:

    0$ git apply << "EOF" ...
    1error: corrupt patch at line 15
    
  27. brunoerg force-pushed on Oct 7, 2024
  28. brunoerg commented at 12:51 pm on October 7, 2024: contributor

    Copy and pasting the command from the docs doesn’t work:

    Just fixed it. Can you check it?

  29. dergoegge commented at 2:37 pm on October 7, 2024: member

    -v2transport=0 should be added to the bitcoind args used for netdriver fuzzing. Otherwise, it doesn’t seem like honggfuzz ever fuzzes anything besides the v2 handshake.

    Also playing around with this and reading the netdriver post (i couldn’t find any other docs on it) I’m not sure if this will ever even get past the version handshake? So I’m wondering how useful this tool really is for us.

  30. brunoerg force-pushed on Oct 7, 2024
  31. brunoerg commented at 5:35 pm on October 7, 2024: contributor

    -v2transport=0 should be added to the bitcoind args used for netdriver fuzzing. Otherwise, it doesn’t seem like honggfuzz ever fuzzes anything besides the v2 handshake.

    Well noticed, just added it.

  32. DrahtBot added the label CI failed on Oct 7, 2024
  33. DrahtBot commented at 6:44 pm on October 7, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/31190077287

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  34. net: bypass network magic and checksum validation for fuzzing 25890351ef
  35. doc: fuzz: remove instructions to bypass network magic and checksum validation for Honggfuzz NetDriver 4ae2c950bf
  36. fuzz: remove checksum and magic bytes assist in `p2p_transport_serialization` d43966603b
  37. brunoerg force-pushed on Oct 7, 2024
  38. DrahtBot removed the label CI failed on Oct 7, 2024
  39. brunoerg commented at 9:20 pm on October 7, 2024: contributor

    Also playing around with this and reading the netdriver post (i couldn’t find any other docs on it) I’m not sure if this will ever even get past the version handshake? So I’m wondering how useful this tool really is for us.

    Yes, I don’t think it will pass the version handshake. I can’t see what kind of bugs it would find that other fuzzers would not get.

  40. dergoegge approved
  41. dergoegge commented at 1:05 pm on October 8, 2024: member

    ACK d43966603bdefb0402a2dc93ff990cb254cdf2df

    I think we could consider removing the honggfuzz netdriver instructions in a follow up but the changes look good to me anyway. I tested this by running honggfuzz in netdriver mode.

  42. DrahtBot requested review from marcofleon on Oct 8, 2024
  43. marcofleon commented at 1:45 pm on October 8, 2024: contributor
    Tested ACK d43966603bdefb0402a2dc93ff990cb254cdf2df
  44. theuni commented at 8:14 pm on October 9, 2024: member

    If we’re going to be peppering the code with these, could we add a safety harness to make sure it’s never being used accidentally for a live node?

    Maybe somewhere in init.cpp or bitcoind.cpp:

    0#ifdef FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION
    1exit(1);
    2#endif
    

    Edit: I realize that bitcoind should be disabled in that case. This would be belt-and-suspenders to make sure that always holds.

  45. brunoerg commented at 9:06 pm on October 9, 2024: contributor

    If we’re going to be peppering the code with these, could we add a safety harness to make sure it’s never being used accidentally for a live node?

    Seems good. Note that we’re already use FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION to bypass CheckProofOfWork.

  46. fanquake commented at 9:28 am on October 10, 2024: member

    could we add a safety harness to make sure it’s never being used accidentally for a live node? Maybe somewhere in init.cpp or bitcoind.cpp: Edit: I realize that bitcoind should be disabled in that case. This would be belt-and-suspenders to make sure that always holds.

    Just want to note that putting something in either of these files wouldn’t be enough to prevent someone producing a kernel lib with the fuzzing code.

  47. maflcko commented at 11:02 am on October 10, 2024: member

    I think we could consider removing the honggfuzz netdriver instructions in a follow

    Not sure about touching real code with the motivation to make instructions happy that may be removed anyway.

    If the changes are meaningful for the fuzz target, then it may be fine.

    It also makes the p2p_transport_serialization fuzz target simpler since we do not need to assist the network magic and checksum creation anymore.

    Can you provide more details here? IIUC a hash will still be calculated, just not compared, so this may only drop one hash call, with a possible maximum speed-up of 2x. However, it would be good to actually measure the difference. Otherwise, I am not sure if this is worth it.

  48. brunoerg commented at 1:17 pm on October 10, 2024: contributor

    Can you provide more details here? IIUC a hash will still be calculated, just not compared, so this may only drop one hash call, with a possible maximum speed-up of 2x. However, it would be good to actually measure the difference. Otherwise, I am not sure if this is worth it.

    This is not about performance. It makes the p2p_transport_serialization fuzz target simpler because we can remove the code that assists the checksum and magic bytes creation. So, we can achieve the same with less LoC.

  49. maflcko commented at 1:22 pm on October 10, 2024: member
    Ok, then I am nack-ish on this. Generally, test code should follow real code, not the other way round, unless there is a reason. Saving 15 lines of test-only code seems a weak reason to me, but I don’t mind if others like this.
  50. brunoerg commented at 1:24 pm on October 10, 2024: contributor
    Since everyone agrees on removing Honggfuzz netdriver, I think we can change the approach to simply just remove it from the documentation.
  51. brunoerg commented at 10:57 am on October 15, 2024: contributor
    Closing this in favor of #31092
  52. brunoerg closed this on Oct 15, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me