tests: Avoid accumulating allocated memory in a global state if LogPrintf is called when fuzzing #17894

pull practicalswift wants to merge 1 commits into bitcoin:master from practicalswift:fuzzers-deserialize-enable-logging changing 1 files +15 −0
  1. practicalswift commented at 11:38 PM on January 7, 2020: contributor

    Avoid accumulating allocated memory in a global state if LogPrintf is called when fuzzing.

    The accumulation takes place via the m_msgs_before_open buffering.

    Resolved by enabling logging and writing log messages to standard output.

    This has the added benefit of making the fuzzing operator aware of any log printing caused by fuzzing which is likely to be an anomaly in itself (in the general case).

    The only fuzzing harness in master that I've seen calling LogPrintf is messageheader_deserialize via the call to CMessageHeader::IsValid() which somewhat surprisingly does a LogPrintf in the case of nMessageSize > MAX_SIZE :)

    Before:

    $ src/test/fuzz/messageheader_deserialize corpus/
    …
    INFO: libFuzzer disabled leak detection after every mutation.
          Most likely the target function accumulates allocated
          memory in a global state w/o actually leaking it.
          You may try running this binary with -trace_malloc=[12]
          to get a trace of mallocs and frees.
          If LeakSanitizer is enabled in this process it will still
          run on the process shutdown.
    …
    

    After:

    $ src/test/fuzz/messageheader_deserialize corpus/
    …
    2020-01-07T23:20:34Z CMessageHeader::IsValid(): (@, 4278190080 bytes) nMessageSize > MAX_SIZE
    …
    

    How to test this PR

    $ make distclean
    $ ./autogen.sh
    $ CC=clang CXX=clang++ ./configure --enable-fuzz \
          --with-sanitizers=address,fuzzer,undefined
    $ make
    $ src/test/fuzz/messageheader_deserialize
    …
    
  2. fanquake added the label Tests on Jan 7, 2020
  3. Crypt-iQ commented at 1:27 AM on January 8, 2020: contributor

    Concept ACK.

    I'm not too familiar with libFuzzer so I can't speak to that, but won't this buildup only occur if AFL is in persistent mode since then there's a loop rather than a new binary being executed?

  4. fanquake commented at 2:44 AM on January 8, 2020: member

    Travis failure:

    Traceback (most recent call last):
      File "test/fuzz/test_runner.py", line 175, in <module>
        main()
      File "test/fuzz/test_runner.py", line 115, in main
        universal_newlines=True,
      File "/usr/lib/python3.6/subprocess.py", line 438, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command '['/home/travis/build/bitcoin/bitcoin/build/bitcoin-x86_64-pc-linux-gnu/src/test/fuzz/addr_info_deserialize', '-help=1']' 
    died with <Signals.SIGABRT: 6>.
    
  5. DrahtBot commented at 6:19 PM on January 10, 2020: member

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    No conflicts as of last run.

  6. practicalswift commented at 8:29 PM on January 12, 2020: contributor

    @Crypt-iQ Yes, the build-up will only occur if in-process fuzzing is used: libFuzzer is always in-process and AFL only when in persistent mode.

  7. in src/test/fuzz/deserialize.cpp:45 in 125de8a980 outdated
      38 | @@ -35,6 +39,15 @@ void initialize()
      39 |  {
      40 |      // Fuzzers using pubkey must hold an ECCVerifyHandle.
      41 |      static const auto verify_handle = MakeUnique<ECCVerifyHandle>();
      42 | +
      43 | +    // Enable logging. This avoids accumulating allocated memory in a global
      44 | +    // state (via m_msgs_before_open buffering) if LogPrintf is called when
      45 | +    // fuzzing. Log messages are written to standard output.
    


    MarcoFalke commented at 6:51 PM on January 13, 2020:

    Log messages are written to standard output

    Why?


    practicalswift commented at 9:16 PM on January 13, 2020:

    My reasoning is that any LogPrintf in the deserialization code is likely a bug or a mistake, and thus something the fuzzing operator would like to be aware of. I'm scratching my own itch here: as a fuzzing operator myself I sure would like to know at least! :)

    From the OP:

    This has the added benefit of making the fuzzing operator aware of any log printing caused by fuzzing which is likely to be an anomaly in itself (in the general case).

    Makes sense? :)


    MarcoFalke commented at 9:19 PM on January 13, 2020:

    Logging to stdout in the fuzzer is spammy, I think. It distracts from the normal fuzzer output.


    MarcoFalke commented at 9:33 PM on January 13, 2020:

    Ah, that is true for deserialization fuzzers. However, for validation fuzzers or net processing fuzzers, or any other high level fuzzer, log prints are no longer an anomaly.


    practicalswift commented at 10:28 PM on January 13, 2020:

    This PR is only making changes to the deserialization fuzzers.


    MarcoFalke commented at 7:32 PM on January 14, 2020:

    Sorry, I missed that.

  8. in src/test/fuzz/deserialize.cpp:49 in 125de8a980 outdated
      38 | @@ -35,6 +39,15 @@ void initialize()
      39 |  {
      40 |      // Fuzzers using pubkey must hold an ECCVerifyHandle.
      41 |      static const auto verify_handle = MakeUnique<ECCVerifyHandle>();
      42 | +
      43 | +    // Enable logging. This avoids accumulating allocated memory in a global
      44 | +    // state (via m_msgs_before_open buffering) if LogPrintf is called when
      45 | +    // fuzzing. Log messages are written to standard output.
      46 | +    SelectParams(CBaseChainParams::REGTEST);
      47 | +    InitLogging();
    


    MarcoFalke commented at 6:55 PM on January 13, 2020:

    Pretty sure this will log to (and maybe corrupt) the users main data directory


    practicalswift commented at 9:12 PM on January 13, 2020:

    The regtest debug.log file you mean?

    $ strace -f src/test/fuzz/messageheader_deserialize 2>&1 | grep -E '^open.*regtest'
    openat(AT_FDCWD, "/…/.bitcoin/regtest/debug.log", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
    

    That's not very nice and we shouldn't do that. The reason was that the m_print_to_file is effectively defaulting to true due to the initialization logic. It is easy to miss that since it is initialised to false first:

    $ git grep 'bool m_print_to_file'
    src/logging.h:        bool m_print_to_file = false;
    

    Good catch! Thanks!

    Now logging to /dev/null instead via ::gArgs.ForceSetArg("-debuglogfile", "/dev/null");:

    $ strace -f src/test/fuzz/messageheader_deserialize 2>&1
    openat(AT_FDCWD, "/dev/null", O_WRONLY|O_CREAT|O_APPEND, 0666) = 3
    

    Let me know if you think it should be solved in some other way. I'm open to suggestions :)



    MarcoFalke commented at 9:24 PM on January 13, 2020:

    If you decide to use this hack, it might make sense to move it to src/test/util/setup_common.h so that it can be more easily re-used.


    practicalswift commented at 12:16 PM on January 14, 2020:

    I think I prefer the more explicit setup method: that way all necessary setup preconditions for the fuzzer are clearly stated in the fuzzing harness.

    Now setting both -datadir (dummy) and -debuglogfile (/dev/null) to make sure there is no possibility of interfering with the main regtest location.

    Please re-review :)

  9. MarcoFalke changes_requested
  10. MarcoFalke commented at 6:55 PM on January 13, 2020: member

    Concept ACK. Approach NACK.

  11. practicalswift force-pushed on Jan 13, 2020
  12. tests: Avoid accumulating allocated memory in a global state if LogPrintf is called when fuzzing d7f458338a
  13. practicalswift force-pushed on Jan 14, 2020
  14. practicalswift commented at 12:17 PM on January 14, 2020: contributor

    @MarcoFalke Approach ACK now with the updated version? :)

  15. practicalswift commented at 1:08 PM on January 22, 2020: contributor

    @MarcoFalke How can we proceed with this one? Would be really nice to not have deal with this memory leak when fuzzing the deserialisation code :)

  16. practicalswift commented at 8:12 AM on March 10, 2020: contributor

    Closing due to lack of interest :)

  17. practicalswift closed this on Mar 10, 2020

  18. practicalswift deleted the branch on Apr 10, 2021
  19. DrahtBot locked this on Aug 16, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-16 15:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me