fuzz: replace every fuzzer-controlled while loop with a macro #22508

pull apoelstra wants to merge 1 commits into bitcoin:master from apoelstra:2021-07--fuzzer-loops changing 29 files +34 −34
  1. apoelstra commented at 3:09 pm on July 20, 2021: contributor
    Limits the number of iterations to 1000 rather than letting the fuzzer do millions or billions of iterations on a single core.
  2. apoelstra commented at 3:11 pm on July 20, 2021: contributor

    There are several loops in the fuzz test where the number of iterations is directly controlled by the fuzzer, which oftentimes will go for millions of iterations, leading to certain fuzz tests taking forever on a single core. This PR limits these fuzzer-controlled loops to 1000 iterations, an arbitrary high number.

    IMHO letting the fuzzer control the number of iterations directly like this, with no restrictions, is a bad idea because the fuzzer will naturally try to hit super high numbers.

    I’m looking for a concept ACK (should we limit these) and an approach ACK (is doing this with a macro a reasonable way to go?). Actual ACKs/merges would also be nice :).

  3. apoelstra commented at 3:12 pm on July 20, 2021: contributor

    On an EPYC 7502 with 256Gb of RAM, just running time ./test/fuzz/test_runner.py ../../qa-assets/fuzz_seed_corpus/:

    before, -j1

    0real    67m58.883s
    1user    152m7.845s
    2sys     4m36.080s
    

    after, -j1

    0real    35m57.697s
    1user    103m10.803s
    2sys     3m52.910s
    

    before, -j32

    0real    60m6.374s
    1user    153m29.631s
    2sys     5m14.991s
    

    after, -j32

    0real    28m54.078s
    1user    102m52.337s
    2sys     4m16.605s
    
  4. MarcoFalke commented at 3:21 pm on July 20, 2021: member
    Concept ACK. @DrahtBot please list the conflicts for the alternative solution that has been proposed.
  5. DrahtBot commented at 3:22 pm on July 20, 2021: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #22919 (fees: skip pointless fee parameter calculation during IBD by RohitRanjangit)
    • #22702 (Add allocator for node based containers by martinus)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  6. MarcoFalke commented at 3:34 pm on July 20, 2021: member

    Ok, that didn’t work. Basically, it would be nice to be able to still overwrite the limit per fuzz target:

  7. apoelstra commented at 4:54 pm on July 20, 2021: contributor
    @MarcoFalke I added a #define variable and capped the tx_pool test to 300 iterations to demonstrate how to use it.
  8. DrahtBot added the label Tests on Jul 20, 2021
  9. in src/test/fuzz/tx_pool.cpp:147 in 5b266249bc outdated
    143@@ -142,7 +144,7 @@ FUZZ_TARGET_INIT(tx_pool_standard, initialize_tx_pool)
    144         return c.out.nValue;
    145     };
    146 
    147-    while (fuzzed_data_provider.ConsumeBool()) {
    148+    FUZZER_CONTROLLED_LOOP(fuzzed_data_provider) {
    


    MarcoFalke commented at 6:31 am on July 22, 2021:
    it would be nice to be able to pass the limit to the loop (in the same line). Otherwise it will get messy if there are nested loops and one forgets to adjust the limit or the loops are moved without moving the limit, which is in a different line.

    apoelstra commented at 2:56 pm on July 22, 2021:
    Sure – replaced the extra #define variable with a new macro argument

    apoelstra commented at 2:57 pm on July 22, 2021:
    Maybe we should just drop the “default 1000” one and always explicitly set the maximum number of iterations?

    MarcoFalke commented at 3:02 pm on July 22, 2021:
    Yeah, I wouldn’t mind that. An additional , 1000 doesn’t seem too verbose.
  10. apoelstra force-pushed on Jul 22, 2021
  11. in src/test/fuzz/addrman.cpp:47 in 9b7a105fb8 outdated
    43@@ -44,7 +44,7 @@ FUZZ_TARGET_INIT(addrman, initialize_addrman)
    44             addr_man.m_asmap.clear();
    45         }
    46     }
    47-    while (fuzzed_data_provider.ConsumeBool()) {
    48+    FUZZER_CONTROLLED_LOOP(fuzzed_data_provider) {
    


    MarcoFalke commented at 3:01 pm on July 22, 2021:
    0    LIMITED_WHILE(fuzzed_data_provider.ConsumeBool) {
    

    Since some loops (prefer to?) use fuzzed_data_provider.remaining_bytes(), maybe make this configurable as well? Feel free to ignore the naming suggestion.


    MarcoFalke commented at 9:33 am on August 2, 2021:
    @apoelstra Are you still working on this? If not, I am happy to pick up

    apoelstra commented at 5:33 pm on September 8, 2021:
    Hi, sorry, I can pick this up now.
  12. DrahtBot added the label Needs rebase on Jul 22, 2021
  13. practicalswift commented at 7:43 pm on July 24, 2021: contributor

    Concept ACK

    Nice idea!

  14. apoelstra force-pushed on Sep 8, 2021
  15. apoelstra commented at 6:00 pm on September 8, 2021: contributor
    Redid the PR. Now it takes the macro from #22649 and just applies it to every single remaining fuzzer loop.
  16. DrahtBot removed the label Needs rebase on Sep 8, 2021
  17. practicalswift commented at 8:13 am on September 9, 2021: contributor

    Thanks for fixing this!

    Is the mixing of LIMITED_WHILE ( and LIMITED_WHILE( intentional? (Personally I prefer the latter.)

    ACK modulo spacing consistency :)

  18. laanwj commented at 11:31 am on September 9, 2021: member

    There are several loops in the fuzz test where the number of iterations is directly controlled by the fuzzer, which oftentimes will go for millions of iterations, leading to certain fuzz tests taking forever on a single core.

    Concept and code review ACK be655e8b8111f3a7cd7909203c2bac3d16751406

    I would slightly prefer this “arbitrary high number” to be a constant as it’s repeated everywhere.

  19. MarcoFalke commented at 11:54 am on September 9, 2021: member

    I would slightly prefer this “arbitrary high number” to be a constant as it’s repeated everywhere.

    I’d say the number is dependent on the test context. The current ones in master are 30, 300, 3000, and 30000.

    I think each number needs to be picked carefully to be in the right 10^x.

    Is the mixing of LIMITED_WHILE ( and LIMITED_WHILE( intentional? (Personally I prefer the latter.)

    clang-format(-diff) should take care of those in any case.

  20. apoelstra commented at 2:22 pm on September 9, 2021: contributor

    Oops, I will fix the spacing inconsistency. It probably happened when I switched from manual edits to using sed. @MarcoFalke I think we could probably start with an “anbitrary high number”, say 30000 (or 10000), and then when when tests reveal that certain tests are taking too long we can reduce them in individual cases. The current situation effectively has a limit of infinity and things are mostly OK, so I don’ t think this’ll be a big deal.

    Once we’re at 5-digit numbers I don’t think there are any situations where we’d want to increase them.

  21. DrahtBot added the label Needs rebase on Oct 5, 2021
  22. MarcoFalke commented at 3:11 pm on October 5, 2021: member
    Sorry, will review soon
  23. apoelstra force-pushed on Oct 25, 2021
  24. apoelstra commented at 8:03 pm on October 25, 2021: contributor
    Rebased and changed formatting to be consistent.
  25. DrahtBot removed the label Needs rebase on Oct 25, 2021
  26. in src/test/fuzz/versionbits.cpp:202 in 9116bc17e1 outdated
    198@@ -199,7 +199,7 @@ FUZZ_TARGET_INIT(versionbits, initialize)
    199     const uint32_t signalling_mask = fuzzed_data_provider.ConsumeIntegral<uint32_t>();
    200 
    201     // mine prior periods
    202-    while (fuzzed_data_provider.remaining_bytes() > 0) {
    203+    LIMITED_WHILE(fuzzed_data_provider.remaining_bytes() > 0, 10000) {
    


    MarcoFalke commented at 9:37 am on November 3, 2021:
    Why is this needed? There is already a break below, no?

    apoelstra commented at 12:21 pm on November 12, 2021:
    Yep, you’re right, I misread the loop’s exit condition.
  27. in src/test/fuzz/rpc.cpp:351 in 9116bc17e1 outdated
    347@@ -348,7 +348,7 @@ FUZZ_TARGET_INIT(rpc, initialize_rpc)
    348         return;
    349     }
    350     std::vector<std::string> arguments;
    351-    while (fuzzed_data_provider.ConsumeBool()) {
    352+    LIMITED_WHILE(fuzzed_data_provider.ConsumeBool(), 10000) {
    


    MarcoFalke commented at 9:46 am on November 3, 2021:

    This seems a bit high? ConsumeRPCArgument might call ConsumeArrayRPCArgument, so the limit is 100'000'000. A user shouldn’t be passing in more than 100 items. So any value between 100-1000 seems more appropriate.

    See also https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40648

    Which is running base58 encoding (expensive) of one rpc arg 6000 times, thus times out. Your limit of 10000 wouldn’t affect that, I think.


    apoelstra commented at 12:23 pm on November 12, 2021:

    ACK. I reduced it to 100.

    I worry that reducing this will meaningfully affect test coverage since we presumably want to test what happens when you give gazillions of RPC arguments … but probably a general-purpose fuzz test is the wrong place to try to hit that.


    MarcoFalke commented at 3:52 pm on November 12, 2021:
    Sorry, I also meant in the line above in this file. 100 * 100 is still 10k.
  28. MarcoFalke approved
  29. MarcoFalke commented at 9:50 am on November 3, 2021: member
    ACK, beside comments
  30. apoelstra force-pushed on Nov 12, 2021
  31. in src/test/fuzz/versionbits.cpp:202 in 6cfcbdae4e outdated
    198@@ -199,7 +199,7 @@ FUZZ_TARGET_INIT(versionbits, initialize)
    199     const uint32_t signalling_mask = fuzzed_data_provider.ConsumeIntegral<uint32_t>();
    200 
    201     // mine prior periods
    202-    while (fuzzed_data_provider.remaining_bytes() > 0) {
    203+    while(fuzzed_data_provider.remaining_bytes() > 0) { // early exit; no need for LIMITED_WHILE
    


    MarcoFalke commented at 3:54 pm on November 12, 2021:
    0    while (fuzzed_data_provider.remaining_bytes() > 0) { // early exit; no need for LIMITED_WHILE
    

    nit


    apoelstra commented at 3:57 pm on November 12, 2021:
    fixed
  32. apoelstra force-pushed on Nov 12, 2021
  33. fuzz: replace every fuzzer-controlled loop with a LIMITED_WHILE loop
    Blindly chose a cap of 10000 iterations for every loop, except for
    the two in script_ops.cpp and scriptnum_ops.cpp which appeared to
    (sometimes) be deserializing individual bytes; capped those to one
    million to ensure that sometimes we try working with massive scripts.
    
    There was also one fuzzer-controlled loop in timedata.cpp which was
    already capped, so I left that alone.
    
    git grep 'while (fuzz' should now run clean except for timedata.cpp
    214d9055ac
  34. in src/test/fuzz/rpc.cpp:297 in 244eb3c09b outdated
    293@@ -294,7 +294,7 @@ std::string ConsumeScalarRPCArgument(FuzzedDataProvider& fuzzed_data_provider)
    294 std::string ConsumeArrayRPCArgument(FuzzedDataProvider& fuzzed_data_provider)
    295 {
    296     std::vector<std::string> scalar_arguments;
    297-    while (fuzzed_data_provider.ConsumeBool()) {
    298+    LIMITED_WHILE(fuzzed_data_provider.ConsumeBool(), 10000) {
    


    MarcoFalke commented at 4:10 pm on November 12, 2021:
    This one I meant, too ;)

    MarcoFalke commented at 4:10 pm on November 12, 2021:
  35. apoelstra force-pushed on Nov 12, 2021
  36. MarcoFalke commented at 3:50 pm on November 15, 2021: member

    Some times:

    master | this | relative | name

     0	206.2	76.67	0.3718234724	rpc			
     1	149.06	157.45	1.056286059	process_messages																				
     2	27.21	27.19	0.9992649761	addrman																				
     3	24.19	24.89	1.028937578	script_sign																				
     4	23.93	26.08	1.089845382	process_message																				
     5	23.32	20.44	0.8765008576	coins_view																				
     6	15.22	15.75	1.034822602	policy_estimator																				
     7	14.46	15.32	1.059474412	script_ops																				
     8	13.06	12.85	0.9839203675	bloom_filter																				
     9	12.35	12.41	1.0048583	pow																				
    10	11.93	8.79	0.7367979883	connman																				
    11	10.71	9.38	0.8758169935	net																				
    12	9.22	7.21	0.7819956616	node_eviction																				
    13	7.7	8.23	1.068831169	rbf																				
    14																	
    
  37. MarcoFalke commented at 3:51 pm on November 15, 2021: member
  38. MarcoFalke commented at 3:51 pm on November 15, 2021: member
    cr ACK 214d9055acdd72189a2f415477ce472ca8db4191
  39. MarcoFalke merged this on Nov 15, 2021
  40. MarcoFalke closed this on Nov 15, 2021

  41. sidhujag referenced this in commit 921b690a0d on Nov 16, 2021
  42. apoelstra deleted the branch on Nov 16, 2021
  43. DrahtBot locked this on Nov 17, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-05 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me