fuzz: Fix timeouts #28812

issue maflcko openend this issue on November 7, 2023
  1. maflcko commented at 1:38 pm on November 7, 2023: member

    Fuzz timeouts may hint at bugs or performance issues. Also, they may block the fuzz CI and fuzz farms from making progress in a reasonable time.

    Thus, they should be fixed.

    One timeout was fixed today in commit 2b3f43b96ef0b674bf350f50f317477b8d3e1e56.

    Currently OSS-Fuzz reports other timeouts (as non-deterministic):

    Screenshot 2023-11-07 at 14-32-56 Testcases

    Someone should investigate them.

  2. maflcko commented at 4:36 pm on November 7, 2023: member
  3. maflcko commented at 5:39 pm on November 7, 2023: member

    Flame graph for mocked_descriptor_parse:

    (Most of the time is spent in pubkey derive)

    Screenshot from 2023-11-07 18-37-42

  4. dergoegge commented at 10:50 am on November 8, 2023: member
    Are the graphs for the entire corpus or just the slow inputs?
  5. maflcko commented at 10:51 am on November 8, 2023: member
    Just the slow inputs. I can add steps to reproduce, if you want.
  6. dergoegge commented at 10:10 am on November 9, 2023: member

    I can add steps to reproduce, if you want.

    That would be great.

  7. maflcko commented at 11:01 am on November 9, 2023: member

    Steps to reproduce flamegraph:

    • Compile fuzz tests normally (libFuzzer recommended)
    • perf record, for example via FUZZ=mocked_descriptor_parse perf record -g --call-graph dwarf -F 1001 ./src/test/fuzz/fuzz -runs=2 /tmp/fuzz_input.dat (-runs can be increased to discount an expensive global setup initialization, --call-graph and frequency can be modified depending on your system)
    • Load the data as a flamegraph: hotspot ./perf.data
  8. maflcko commented at 11:02 am on November 9, 2023: member
    cc @darosior about mocked_descriptor_parse
  9. darosior commented at 11:03 am on November 9, 2023: member
    Yeah i was looking into it. It’s surprising we are spending so long inside the dup key check. Maybe we re-introduced the issue fixed by #25540.
  10. maflcko commented at 9:48 am on November 13, 2023: member
  11. maflcko referenced this in commit fa0693867a on Nov 14, 2023
  12. maflcko referenced this in commit fab5cb9066 on Nov 14, 2023
  13. maflcko commented at 1:06 pm on November 15, 2023: member
  14. maflcko commented at 4:29 pm on November 23, 2023: member
  15. fanquake referenced this in commit 5f9fd11680 on Nov 26, 2023
  16. fanquake referenced this in commit 31ce305d46 on Nov 28, 2023
  17. maflcko commented at 10:38 am on November 30, 2023: member
  18. brunoerg commented at 1:03 pm on November 30, 2023: contributor

    Another one: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=64593 cc @brunoerg

    I don’t have access to see it in depth 🙁.

    I suppose it might be related to the descriptor parse. I think we can use ImportDescriptors from wallet_notifications in scriptpubkeyman.

  19. maflcko commented at 1:06 pm on November 30, 2023: member

    I don’t have access to see it in depth 🙁.

    The “Reproducer Testcase” should be public.

    I suppose it might be related to the descriptor parse.

    Ah, good point. This was done intentionally. :sweat_smile:

  20. brunoerg commented at 1:18 pm on November 30, 2023: contributor

    The “Reproducer Testcase” should be public.

    “Access denied” when I try to access it.

  21. maflcko commented at 2:05 pm on November 30, 2023: member
    Ok, I see. I guess google requires login now, starting at some point in the past.
  22. maflcko added the label Tests on Nov 30, 2023
  23. maflcko commented at 1:42 pm on December 6, 2023: member
  24. fanquake referenced this in commit 2e8ec6b338 on Dec 6, 2023
  25. maflcko commented at 1:04 pm on December 14, 2023: member
  26. darosior commented at 3:18 pm on December 31, 2023: member

    Another one: https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=64593 cc @brunoerg

    I think i’ve fixed it in #28832 (comment). @maflcko can you share the repro from oss-fuzz here so i can confirm it?

  27. maflcko commented at 11:34 am on January 2, 2024: member

    I guess it was:

    0$ base64 ~/Downloads/clusterfuzz-testcase-scriptpubkeyman-5467976815738880 
    1dHIoJWQxLzEvMC8wMC8wMC8wLzAwLzAvMC8wLzAwLzE5LzEwOTAvMC8wLzAvMC8wLzAwLzE5LzEw
    2OTAvMC8wLzAwLzAvMC8wLzAwLzEvMC8wMC8wLzAvMDAvMDkvMTkvMC8wLzAvMC8wMC8wLzAvMC8w
    3MC8xOS8wLzAwMC8wMC8wLzAvMC8wMC8xLzAvMDAvMC8wLzAwLzA5LzE5LzAvMC8wLzAvMDAvMC8w
    4LzAvMDAvMTkvMC8wMC8wLzAvMC8wMC8xOS8xLzAwLyonKQ==
    
  28. achow101 referenced this in commit d44554567f on Jan 4, 2024
  29. maflcko commented at 11:03 am on January 5, 2024: member
  30. maflcko commented at 12:20 pm on January 5, 2024: member
  31. maflcko commented at 6:57 pm on January 9, 2024: member
    #28812 (comment) exists, still, as well.
  32. darosior commented at 9:49 am on January 10, 2024: member

    I think this is due to a missing std::move for j: which makes BuildScript copy subs[0]: https://github.com/bitcoin/bitcoin/blob/063a8b83875997068b3eb506b5f30f2691d18052/src/script/miniscript.h#L762-L763.

    EDIT: no, my bad, only the first argument to BuildScript can be moved.

    Reproducing locally now.

  33. janus referenced this in commit 48c72a0004 on Apr 1, 2024
  34. maflcko commented at 7:12 am on May 24, 2024: member

    Looks like there are different flames in descriptor_parse. Another one that took two minutes:

    Screenshot from 2024-05-24 09-08-02 1eab7f1648dd012d8efee262a9898d2a9b044fd2.bin.txt

    Looks like it is just a long sh(thresh(1,1,1,1,1,0,1,1,1,1,1,0,1,1,1,1....

  35. maflcko referenced this in commit cc5858cb70 on May 24, 2024
  36. darosior commented at 8:19 am on May 30, 2024: member

    1eab7f1648dd012d8efee262a9898d2a9b044fd2.bin.txt

    This one is just hitting the performance of our code. Some logic in the constructor of a thresh fragment is quadratic in the number of sub-fragments. This input is a thresh with more than 130k sub-fragments.

    Technically it would be possible to not have this logic in the constructor and cache it instead. But 1) that would only punt the issue 2) i don’t think it’s worth modifying the logic “just” to avoid the fuzz timeouts.

    I think we’ll have to limit the size of the input, unfortunately.

  37. darosior commented at 9:05 am on May 30, 2024: member

    Alright so i think we can still pinpoint this issue to avoid having to limit the size of the input. In the same way we limit the derivation paths depth, we could limit the number of sub-fragments per fragment in the input:

     0bool HasLargeFrag(const FuzzBufferType& buff, const int max_subs)
     1{
     2    std::stack<int> counts;
     3    for (const auto& ch: buff) {
     4        if (ch == '(') {
     5            counts.push(1);
     6        } else if (ch == ',' && !counts.empty()) {
     7            if (++counts.top() > max_subs) return true;
     8        } else if (ch == ')' && !counts.empty()) {
     9            counts.pop();
    10        }
    11    }
    12    return false;
    13}
    

    I’ll PR something to this effect.

  38. darosior commented at 10:38 am on May 30, 2024: member

    clusterfuzz-testcase-mocked_descriptor_parse-5224675118481408.bin.not.txt clusterfuzz-testcase-minimized-descriptor_parse-5888623437217792.bin.txt

    For those there is probably something to be done by someone more experienced at C++ than i am. I’ve tried fiddling around with trying to move the content of other arguments than the first one in BuildScript, which led me into unsuccessfully looking into prevector.

    Instead of having to dive in this rabbit hole (which could also potentially lead to touch consensus code), i’m going to limit the number of wrappers a fragment can have in the fuzz target in the same fashion as for ruling out the fragments with too many subs.

  39. maflcko commented at 10:49 am on May 30, 2024: member

    I think we’ll have to limit the size of the input, unfortunately.

    Yes, as long as all possible real-world scenarios are covered, this is fine as well.

  40. murchandamus commented at 7:38 pm on June 6, 2024: contributor

    I stumbled upon a fuzz seed that oom’ed in CI for package_rbf. Since the seed appears to be too big to paste, I provide the output of $ base64 95a3b1b42d43ac7190f24ff116b2cd6515e7a566 > /tmp/base64-95a3b1b4-oom.txt: base64-95a3b1b4-oom.txt

    If the prior format is unsatisfactory, the problematic seed 95a3b1b42d43ac7190f24ff116b2cd6515e7a566 is also in this commit to qa-assets: https://github.com/bitcoin-core/qa-assets/pull/186/commits/8d624b44419f46bcb5117baa16047dbf4bdb339f

  41. fanquake referenced this in commit 262260ce1e on Jul 15, 2024
  42. maflcko commented at 8:03 am on July 18, 2024: member
    Closing for now. It is probably better to create separate issues for each fuzz target, going forward.
  43. maflcko closed this on Jul 18, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-24 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me