tests: Add fuzzing harnesses for CAutoFile, CBufferedFile, LoadExternalBlockFile and other FILE* consumers

practicalswift commented at 7:14 pm on June 2, 2020: contributor

Add fuzzing harnesses for CAutoFile, CBufferedFile, LoadExternalBlockFile and other FILE* consumers:

Add FuzzedFileProvider which provides a FILE* interface to FuzzedDataProvider using fopencookie
Add FuzzedAutoFileProvider which provides a CAutoFile interface to FuzzedDataProvider
Add serialization/deserialization fuzzing helpers WriteToStream(…)/ReadFromStream(…)
Add fuzzing harness for CAutoFile (streams.h)
Add fuzzing harness for CBufferedFile (streams.h)
Add fuzzing harness for LoadExternalBlockFile(...) (validation.h)
Add fuzzing harness for CBlockPolicyEstimator::Read and CBlockPolicyEstimator::Write (policy/fees.h)

See doc/fuzzing.md for information on how to fuzz Bitcoin Core. Don’t forget to contribute any coverage increasing inputs you find to the Bitcoin Core fuzzing corpus repo.

Happy fuzzing :)

DrahtBot added the label Build system on Jun 2, 2020

DrahtBot added the label Tests on Jun 2, 2020

DrahtBot commented at 0:08 am on June 8, 2020: member

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#19259 (tests: Add fuzzing harness for LoadMempool(…) and DumpMempool(…) by practicalswift)
#19203 (net: Add regression fuzz harness for CVE-2017-18350. Add FuzzedSocket. Add thin SOCKET wrapper. by practicalswift)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot added the label Needs rebase on Jun 11, 2020

practicalswift force-pushed on Jun 11, 2020

DrahtBot removed the label Needs rebase on Jun 11, 2020

practicalswift force-pushed on Jun 12, 2020

practicalswift force-pushed on Jun 15, 2020

in src/test/fuzz/util.h:336 in 97ee502b66 outdated

295+        const std::vector<uint8_t> random_bytes = fuzzed_file->m_fuzzed_data_provider.ConsumeBytes<uint8_t>(size);
296+        if (random_bytes.empty()) {
297+            return 0;
298+        }
299+        std::memcpy(buf, random_bytes.data(), random_bytes.size());
300+        const ssize_t n = fuzzed_file->m_fuzzed_data_provider.ConsumeBool() ? size : random_bytes.size();

Crypt-iQ commented at 10:58 pm on July 10, 2020:

I don’t understand the need for the call to ConsumeBool(). Isn’t size necessarily equal to random_bytes.size()?

MarcoFalke commented at 10:18 am on July 11, 2020:

If the m_fuzzed_data_provider is empty it should give out less bytes, so size would be larger. ConsumeBool would return false in that case and this condition would evaluate to size.

So it seems n can simply be replaced by size

Crypt-iQ commented at 10:50 am on July 11, 2020:

Oh I didn’t think about that edge case. Don’t you mean the condition should evaluate to random_bytes.size() if ConsumeBool() is false. So n replaced by random_bytes.size() instead?

MarcoFalke commented at 10:51 am on July 11, 2020:

Ah right off-by-one :man_facepalming:

practicalswift commented at 1:25 am on July 15, 2020:

Now simply using random_bytes.size(). I think I was trying to be too clever :) Please re-review :)

Crypt-iQ commented at 11:30 pm on July 10, 2020: contributor

utACK 27bef8c

DrahtBot added the label Needs rebase on Jul 11, 2020

practicalswift force-pushed on Jul 11, 2020

DrahtBot removed the label Needs rebase on Jul 11, 2020

in src/test/fuzz/buffered_file.cpp:43 in 30d545c15a outdated

37+                    opt_buffered_file->read((char*)arr.data(), fuzzed_data_provider.ConsumeIntegralInRange<size_t>(0, 4096));
38+                } catch (const std::ios_base::failure&) {
39+                }
40+                break;
41+            }
42+            case 1: {

Crypt-iQ commented at 8:13 pm on July 14, 2020:

Why did you leave out SetPos when calling CBufferedFile functions?

practicalswift commented at 0:11 am on July 15, 2020:

Good catch! Now fuzzing also SetPos :)

Crypt-iQ commented at 9:24 pm on July 14, 2020: contributor

Combined fuzzing coverage for only the fuzzers in this branch: https://crypt-iq.github.io/pr19143_cov/src/

Each fuzzer was run for about 24 hours. I used a variation of libFuzzer and AFL. There seems to be some missing coverage in CBlockPolicyEstimator::Write & CBlockPolicyEstimator::Read. That could probably be fixed by better seeds. There’s also missing coverage in LoadExternalBlockFile, and this can also be fixed by better seeds. I don’t know the format of the external block files (I copy pasted the magic bytes from MessageStart() as seeds but no luck). This was the only slow fuzzing harness with ~5 execs/s.

practicalswift force-pushed on Jul 15, 2020

practicalswift commented at 0:58 am on July 15, 2020: contributor

Combined fuzzing coverage for only the fuzzers in this branch: https://crypt-iq.github.io/pr19143_cov/src/

Each fuzzer was run for about 24 hours. I used a variation of libFuzzer and AFL.

Wow, that’s some thorough testing! Thanks a lot!

There seems to be some missing coverage in CBlockPolicyEstimator::Write & CBlockPolicyEstimator::Read. That could probably be fixed by better seeds.

I think the perceived lack of coverage is simply due to src/test/fuzz/policy_estimator being very slow (CBlockPolicyEstimator creation is costly): I believe that given enough time the fuzzer will span the full CFG :)

To speed things up I’ve now added also a dedicated fuzzer for CBlockPolicyEstimator::{Write,Read} which re-uses the CBlockPolicyEstimator object across runs (without sacrificing substantial coverage stability AFAICT): see src/test/fuzz/policy_estimator_io.cpp.

This dedicated fuzzer is running at >1000 exec/s compared to <50 exec/s for the non-dedicated one.

There’s also missing coverage in LoadExternalBlockFile, and this can also be fixed by better seeds. I don’t know the format of the external block files (I copy pasted the magic bytes from MessageStart() as seeds but no luck). This was the only slow fuzzing harness with ~5 execs/s.

libFuzzer from LLVM 10 is able to reach past the magic bytes check within a few seconds. Perhaps you were using an old version of libFuzzer when running the load_external_block_file fuzzer?

practicalswift force-pushed on Jul 15, 2020

in src/test/fuzz/util.h:361 in 46ce3bf4c6 outdated

356+        FuzzedFileProvider* fuzzed_file = (FuzzedFileProvider*)cookie;
357+        int64_t new_offset = 0;
358+        if (whence == SEEK_SET) {
359+            new_offset = *offset;
360+        } else if (whence == SEEK_CUR) {
361+            if (AdditionOverflow(new_offset, *offset)) {

Crypt-iQ commented at 2:23 am on July 15, 2020:

new_offset should be fuzzed_file->m_offset

practicalswift commented at 11:42 am on July 15, 2020:

Oh, nice catch! Thanks for noticing: now fixed!

in src/test/fuzz/util.h:318 in 3f12c9334f outdated

313+            FuzzedFileProvider::write,
314+            FuzzedFileProvider::seek,
315+            FuzzedFileProvider::close,
316+        };
317+        return fopencookie(this, mode.c_str(), io_hooks);
318+#else

Crypt-iQ commented at 8:17 am on July 15, 2020:

Could return early at the top of the function #ifndef _GNU_SOURCE, then you wouldn’t need these macros here?

practicalswift commented at 11:25 am on July 15, 2020:

This is intentional actually: I want exactly the same input bytes to be consumed by an open() call regardless of fopencookie being available or not.

in src/test/fuzz/autofile.cpp:66 in d03f2588a4 outdated

61+    }
62+    (void)auto_file.Get();
63+    (void)auto_file.GetType();
64+    (void)auto_file.GetVersion();
65+    (void)auto_file.IsNull();
66+    if (fuzzed_data_provider.ConsumeBool()) {

Crypt-iQ commented at 8:33 am on July 15, 2020:

Is this gated by ConsumeBool() to test the behavior that, if this isn’t hit, fclose() is called automatically in the destructor?

practicalswift commented at 11:27 am on July 15, 2020:

Yes it is, but I want the fuzzer to also exercise the release() code path here, and it isn’t called in the destructor :)

in src/test/fuzz/buffered_file.cpp:53 in 8cf3f3492d outdated

48@@ -48,17 +49,30 @@ void test_one_input(const std::vector<uint8_t>& buffer)
49                 break;
50             }
51             case 3: {
52+                if (!opt_buffered_file->SetPos(fuzzed_data_provider.ConsumeIntegralInRange<uint64_t>(0, 4096))) {
53+                    setpos_fail = true;

Crypt-iQ commented at 8:40 am on July 15, 2020:

Should this be set back to false upon successful call to SetPos?

practicalswift commented at 11:29 am on July 15, 2020:

Unfortunately I think a single failed SetPos call is sufficient to set a subsequent FindByte call into an infinite loop. A successful SetPos won’t necessarily undo the harm of a prior failed SetPos call :)

practicalswift force-pushed on Jul 15, 2020

in src/test/fuzz/policy_estimator_io.cpp:25 in bb8711426e outdated

20+    FuzzedDataProvider fuzzed_data_provider(buffer.data(), buffer.size());
21+    FuzzedAutoFileProvider fuzzed_auto_file_provider = ConsumeAutoFile(fuzzed_data_provider);
22+    CAutoFile fuzzed_auto_file = fuzzed_auto_file_provider.open();
23+    // Re-using block_policy_estimator across runs to avoid costly creation of CBlockPolicyEstimator object.
24+    static CBlockPolicyEstimator block_policy_estimator;
25+    if (block_policy_estimator.Read(fuzzed_auto_file)) {

Crypt-iQ commented at 8:48 am on July 15, 2020:

Since block_policy_estimator is declared static, it shouldn’t matter if the Read call fails because the Write call could still succeed so I don’t think it needs to be gated.

practicalswift commented at 11:33 am on July 15, 2020:

The thinking here is that a successful Read will “reset” (to some extent) the state of block_policy_estimator. Without the check we might be calling Write(…) on block_policy_estimator that is populated from a successful Read in a previous run: that would introduce unnecessary coverage instability.

More specifically this is the case I’m guarding against:

0run n:
1  block_policy_estimator.Read(…) returns true
2  block_policy_estimator.Write(…)
3run n+1:
4  block_policy_estimator.Read(…) returns false
5  block_policy_estimator.Write(…) # now with state from run n (instead of n+1)

I want to minimize the unnecessary coverage instability that unavoidably may arise from re-using block_policy_estimator between runs (which in turn is done to get decent performance for this otherwise super slow fuzzing harness).

Crypt-iQ commented at 11:42 am on July 15, 2020:

Ok this makes much more sense. Otherwise it’s not always possible to get the same coverage I think? Different machines on different runs may not hit the same Read->Write->Read(false) calls.

practicalswift commented at 11:50 am on July 15, 2020:

@Crypt-iQ Exactly! We want to have “coverage stability”: a given input should result in the same coverage for a single run regardless of what has happened in previous runs. Achieving 100% coverage stability is not always possible. It might be due to architectural reasons (globals is a typical example), or it might be due to performance reasons (re-using an object might be required to get decent performance in terms of execs/s). We want to do what we can to at least minimise unnecessary coverage instability.

Crypt-iQ commented at 10:16 am on July 15, 2020: contributor

With clang-10 I can pass the magic bytes check (I guess it uses coverage in memcmp to solve for it?). Currently running the new policy_estimator_io fuzzer as well.

tests: Add FuzzedFileProvider which provides a FILE* interface to FuzzedDataProvider using fopencookie 9dbcd6854c

tests: Add FuzzedAutoFileProvider which provides a CAutoFile interface to FuzzedDataProvider e48094a506

tests: Add serialization/deserialization fuzzing helpers WriteToStream(…)/ReadFromStream(…) e507c0799d

tests: Add fuzzing harness for CAutoFile (streams.h) f3aa659be6

tests: Add fuzzing harness for CBufferedFile (streams.h) 9823376030

tests: Add fuzzing harness for LoadExternalBlockFile(...) (validation.h) 7bcc71e5f8

tests: Add fuzzing harness for CBufferedFile::{SetPos,GetPos,GetType,GetVersion} (stream.h) 614e0807a8

tests: Add fuzzing harness for CBlockPolicyEstimator::{Read,Write} (policy/fees.h) ad6c34881d

practicalswift force-pushed on Jul 15, 2020

Crypt-iQ commented at 2:45 pm on July 16, 2020: contributor

Can confirm that the new policy_estimator_io fuzzer is significantly faster with libfuzzer (600-700 exec/s on one core) and hits checks in Read & Write that it otherwise doesn’t hit in the policy_estimator fuzzer due to being slower. Also tried with AFL but was very slow since I forgot to compile with persistent mode :/. Will review the latest set of changes now. Coverage for just the policy_estimator_io fuzzer is here https://crypt-iq.github.io/policy_cov_llvm_outs/src/policy/fees.cpp.gcov.html

Crypt-iQ commented at 2:33 am on July 18, 2020: contributor

Tested ACK ad6c348

MarcoFalke merged this on Jul 18, 2020

MarcoFalke closed this on Jul 18, 2020

sidhujag referenced this in commit 60a95acd67 on Jul 18, 2020

Fabcien referenced this in commit b52804961d on Feb 3, 2021

practicalswift deleted the branch on Apr 10, 2021

kittywhiskers referenced this in commit a6bb4410ab on May 7, 2022

kittywhiskers referenced this in commit 4a5287d999 on May 7, 2022

kittywhiskers referenced this in commit 48937a60eb on Jun 14, 2022

kittywhiskers referenced this in commit 8950416efc on Jun 14, 2022

kittywhiskers referenced this in commit c18816d0bd on Jun 14, 2022

kittywhiskers referenced this in commit 8049c7962c on Jun 18, 2022

kittywhiskers referenced this in commit 67997d3e44 on Jun 18, 2022

kittywhiskers referenced this in commit b12b2b7a36 on Jul 4, 2022

kittywhiskers referenced this in commit f83469f3fd on Jul 4, 2022

kittywhiskers referenced this in commit a211038703 on Jul 6, 2022

kittywhiskers referenced this in commit 8bde6cf816 on Jul 6, 2022

kittywhiskers referenced this in commit 915dcab3dc on Jul 6, 2022

kittywhiskers referenced this in commit c92c2cc09b on Jul 13, 2022

kittywhiskers referenced this in commit 60aa00a590 on Jul 13, 2022

kittywhiskers referenced this in commit eb2fa81288 on Jul 15, 2022

PastaPastaPasta referenced this in commit 30d6584cb6 on Jul 17, 2022

DrahtBot locked this on Aug 18, 2022

tests: Add fuzzing harnesses for CAutoFile, CBufferedFile, LoadExternalBlockFile and other FILE* consumers #19143

Conflicts