fuzz: improve utxo_snapshot target #30329

pull mzumsande wants to merge 1 commits into bitcoin:master from mzumsande:202406_fuzz_assumeutxo changing 3 files +47 −10
  1. mzumsande commented at 7:41 pm on June 24, 2024: contributor

    Add the possibility of giving more guidance to the creation of the metadata and/or coins, so that the fuzzer gets the chance to reach more error conditions in ActivateSnapshot and sometimes successfully creates a valid snapshot.

    This also changes the asserts for the success case that were outdated (after #29370) and only didn’t result in a crash because the fuzzer wasn’t able to reach this code before.

  2. DrahtBot commented at 7:41 pm on June 24, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, TheCharlatan, fjahr

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29656 (chainparams: Change nChainTx type to uint64_t by fjahr)
    • #29307 (util: explicitly close all AutoFiles that have been written by vasild)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Tests on Jun 24, 2024
  4. mzumsande force-pushed on Jun 24, 2024
  5. DrahtBot added the label CI failed on Jun 24, 2024
  6. DrahtBot commented at 8:51 pm on June 24, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/26619261177

  7. mzumsande force-pushed on Jun 24, 2024
  8. fjahr commented at 9:47 pm on June 24, 2024: contributor
    Concept ACK
  9. mzumsande force-pushed on Jun 24, 2024
  10. DrahtBot removed the label CI failed on Jun 24, 2024
  11. DrahtBot added the label Needs rebase on Jul 2, 2024
  12. mzumsande force-pushed on Jul 2, 2024
  13. DrahtBot removed the label Needs rebase on Jul 3, 2024
  14. in src/kernel/chainparams.cpp:498 in b9ba1a7309 outdated
    494@@ -495,12 +495,19 @@ class CRegTestParams : public CChainParams
    495         };
    496 
    497         m_assumeutxo_data = {
    498-            {
    499+            {   // For use by unit tests
    


    maflcko commented at 2:45 pm on July 3, 2024:
    nit: maybe also list the filename here, like in the other cases.

    mzumsande commented at 0:16 am on July 5, 2024:
    I didn’t do that because there are already multiple files that use it (validation_chainstatemanager_tests, validation_chainstate_tests). Seems less likely that the same would happen for fuzz tests.
  15. in src/kernel/chainparams.cpp:509 in b9ba1a7309 outdated
    505+            {
    506+                // For use by fuzz target src/test/fuzz/utxo_snapshot.cpp
    507+                .height = 200,
    508+                .hash_serialized = AssumeutxoHash{uint256S("0x998b6a337899f04a1cd298ecf0fdc6ab66bfb428f00e5a48998505e91b3b55e0")},
    509+                .nChainTx = 201,
    510+                .blockhash = uint256S("0x5e93653318f294fb5aa339d00bbf8cf1c3515488ad99412c37608b139ea63b27,")
    


    maflcko commented at 2:46 pm on July 3, 2024:
    style nit: Missing trailing ,?

    mzumsande commented at 0:17 am on July 5, 2024:
    fixed, the “,” went into the hash by mistake (and was ignored). I also fixed the hash_serialized, which was wrong somehow.

    maflcko commented at 6:04 am on July 5, 2024:

    the “,” went into the hash by mistake (and was ignored)

    ugh. I didn’t even notice this

  16. in src/test/fuzz/utxo_snapshot.cpp:50 in b9ba1a7309 outdated
    47+        if (fuzzed_data_provider.ConsumeBool()) {
    48+            std::vector<uint8_t> metadata{ConsumeRandomLengthByteVector(fuzzed_data_provider)};
    49+            outfile << Span{metadata};
    50+        } else {
    51+            DataStream data_stream{};
    52+            auto msg_start = Params().MessageStart();
    


    maflcko commented at 2:48 pm on July 3, 2024:

    style-nit:

    0            auto msg_start = chainman.Params().MessageStart();
    

    mzumsande commented at 0:18 am on July 5, 2024:
    fixed, here and at one more spot.
  17. in src/test/fuzz/utxo_snapshot.cpp:108 in b9ba1a7309 outdated
    107-            Assert(num_tx == 1);
    108-            chain_tx += num_tx;
    109+            Assert(index);
    110+            Assert(index->nTx == 0);
    111+            if (index->nHeight == chainman.GetSnapshotBaseHeight()) {
    112+                Assert(index->nChainTx != 0);
    


    maflcko commented at 2:52 pm on July 3, 2024:
    nit: Could assert the chain tx value from the chain params constant?

    mzumsande commented at 0:18 am on July 5, 2024:
    Done!
  18. maflcko approved
  19. maflcko commented at 2:53 pm on July 3, 2024: member
    lgtm. Thanks!
  20. in src/test/fuzz/utxo_snapshot.cpp:53 in b9ba1a7309 outdated
    50+        } else {
    51+            DataStream data_stream{};
    52+            auto msg_start = Params().MessageStart();
    53+            int base_blockheight{fuzzed_data_provider.ConsumeIntegralInRange<int>(1, 2 * COINBASE_MATURITY)};
    54+            uint256 base_blockhash{g_chain->at(base_blockheight - 1)->GetHash()};
    55+            uint64_t m_coins_count{fuzzed_data_provider.ConsumeIntegralInRange<uint64_t>(1, 2 * COINBASE_MATURITY)};
    


    maflcko commented at 2:59 pm on July 3, 2024:
    Should be fine to exceed the range here for fuzzing? Maybe +1, or *3 instead of *2?

    mzumsande commented at 0:18 am on July 5, 2024:
    Done, used *3
  21. maflcko commented at 3:04 pm on July 3, 2024: member

    ACK b9ba1a73094f4ad593b527e23d2681f41c22539 🎯

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: ACK b9ba1a73094f4ad593b527e23d2681f41c22539 🎯
    3iTX6VgeBMzgsveHFGIl5Qnsj/cWt0KYcv2B3dvhxmLm6ljOkWyT0bGPRv0zKVRDXMpcDBJcwelxlW8o6WHPEDw==
    
  22. DrahtBot requested review from fjahr on Jul 3, 2024
  23. TheCharlatan commented at 8:27 pm on July 3, 2024: contributor

    I checked if there is anything we could do in the testing setup to make the test faster. With this patchset, the test takes around 1:15 min for 10k runs on my machine. Removing the validation interface from the test setup cuts it down to 1:05 min , if I remove the networking setup, it takes 35 sec to complete.

    I also removed the mempool and it crashed when reaching validation.cpp:5684. Is it a hard requirement when using assumeutxo that a mempool exists, or is this a bug?

  24. fjahr commented at 10:24 pm on July 3, 2024: contributor

    I also removed the mempool and it crashed when reaching validation.cpp:5684. Is it a hard requirement when using assumeutxo that a mempool exists, or is this a bug?

    That appears to be a bug. We actually require the mempool to be empty so it not being there should not be a problem. I will open a PR.

  25. fjahr commented at 11:34 pm on July 3, 2024: contributor

    I also removed the mempool and it crashed when reaching validation.cpp:5684. Is it a hard requirement when using assumeutxo that a mempool exists, or is this a bug?

    That appears to be a bug. We actually require the mempool to be empty so it not being there should not be a problem. I will open a PR.

    This should work https://github.com/fjahr/bitcoin/commit/21530b489eee6b8db4b9af45430d9d3cb43a459c, but on second thought I am not sure if we need this (yet). Conceptually it makes sense that we change this because, as I said, we don’t need the mempool. However, since we don’t use the mempool I think this should also not make an impact on the testing speed and even in blocksonly mode there still is a mempool, so this won’t be an issue.

    This might be getting off-topic so I can open a separate issue to move the discussion.

  26. maflcko commented at 6:16 am on July 4, 2024: member

    I checked if there is anything we could do in the testing setup to make the test faster. With this patchset, the test takes around 1:15 min for 10k runs on my machine. Removing the validation interface from the test setup cuts it down to 1:05 min , if I remove the networking setup, it takes 35 sec to complete.

    Nice. Would be good to have a pull request with this, so that it can be reviewed and merged.

  27. fjahr commented at 12:23 pm on July 4, 2024: contributor
    utACK b9ba1a73094f4ad593b527e23d2681f41c225397
  28. TheCharlatan approved
  29. TheCharlatan commented at 1:44 pm on July 4, 2024: contributor

    ACK b9ba1a73094f4ad593b527e23d2681f41c225397

    … but it would be nice to address the nits, will re-ACK quickly :)

  30. fuzz: improve utxo_snapshot target
    Add the possibility of giving more guidance to the creation of the
    metadata and/or coins, so that the fuzzer gets the chance
    to reach more error conditions in ActivateSnapshot and sometimes
    successfully creates a valid snapshot.
    
    This also changes the asserts for the success case that were outdated,
    and only didn't result in a crash because the fuzzer wasn't able
    to reach this code before.
    de71d4dece
  31. mzumsande force-pushed on Jul 5, 2024
  32. mzumsande commented at 0:36 am on July 5, 2024: contributor

    Thanks for the reviews! I addressed the feedback by @maflcko and also corrected hash_serialized in the chainparams. Not sure why it was wrong before, I’m pretty sure I was able to activate the snapshot before opening the PR on another computer. Probably just an oversight, but maybe someone can confirm that everything is deterministic and the snapshot can be loaded now (e.g. by adding an assert and setting base_blockheight and m_coins_count to 200 it should crash almost immediately).

    I checked if there is anything we could do in the testing setup to make the test faster. (…)

    Sound like good ideas for a follow-up!

  33. maflcko commented at 6:09 am on July 5, 2024: member

    re-ACK de71d4dece604907afc4fc26b7788e9c1a4cbecb 🎆

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: re-ACK de71d4dece604907afc4fc26b7788e9c1a4cbecb 🎆
    3jzFq8Tx4Dv2D6Jmy+qeBCaFo50xs+4wd4o3IJT7NvMAv64PiF75BRT4tYBxjuU2NEb6pH6x4uKTzg4Rptjc3Bw==
    
  34. DrahtBot requested review from TheCharlatan on Jul 5, 2024
  35. TheCharlatan approved
  36. TheCharlatan commented at 7:11 am on July 5, 2024: contributor
    ACK de71d4dece604907afc4fc26b7788e9c1a4cbecb
  37. fjahr commented at 9:37 am on July 5, 2024: contributor
    utACK de71d4dece604907afc4fc26b7788e9c1a4cbecb
  38. mzumsande commented at 5:49 pm on July 8, 2024: contributor

    maybe someone can confirm that everything is deterministic and the snapshot can be loaded now (e.g. by adding an assert and setting base_blockheight and m_coins_count to 200 it should crash almost immediately).

    FYI, confirmed for myself on another computer that this was just an error, and the snapshot can be loaded by the fuzzer.

  39. in src/kernel/chainparams.cpp:509 in de71d4dece
    505+            {
    506+                // For use by fuzz target src/test/fuzz/utxo_snapshot.cpp
    507+                .height = 200,
    508+                .hash_serialized = AssumeutxoHash{uint256S("0x4f34d431c3e482f6b0d67b64609ece3964dc8d7976d02ac68dd7c9c1421738f2")},
    509+                .nChainTx = 201,
    510+                .blockhash = uint256S("0x5e93653318f294fb5aa339d00bbf8cf1c3515488ad99412c37608b139ea63b27"),
    


    ryanofsky commented at 8:01 pm on July 9, 2024:

    In commit “fuzz: improve utxo_snapshot target” (de71d4dece604907afc4fc26b7788e9c1a4cbecb)

    Since fuzz test is not deterministic, and doesn’t give us an easy way to know hash values here actually allow snapshot to be loaded, it might be good to write a separate regtest generating a an empty chain with height 200 and verifying these hashes.

  40. ryanofsky merged this on Jul 9, 2024
  41. ryanofsky closed this on Jul 9, 2024

  42. mzumsande deleted the branch on Jul 9, 2024
  43. ryanofsky commented at 8:20 pm on July 9, 2024: contributor

    Post-merge code review ACK de71d4dece604907afc4fc26b7788e9c1a4cbecb

    This seems like a good change in principle, but I’m not actually sure what additional test coverage it provides. The only valid snapshot that can be loaded seems to be one that is hardcoded for a chain with height 200 and no transactions except for coinbases. After the snapshot is generated there is only a single ConsumeBool call determining whether or not headers for the blocks are added before loading the snapshot.

    So basically it creates a hardcoded snapshot, and then tests loading it with and without headers. I could be missing something, but it seems like that is not really taking advantage of the fuzzer or testing something that would be not more convenient to run as a unit test.

    It would probably open up a lot more flexibility for the fuzz test if it were able to modify the m_assumeutxo_data vector instead of relying on hardcoded values.

  44. mzumsande commented at 9:12 pm on July 9, 2024: contributor

    This seems like a good change in principle, but I’m not actually sure what additional test coverage it provides. The only valid snapshot that can be loaded seems to be one that is hardcoded for a chain with height 200 and no transactions except for coinbases. After the snapshot is generated there is only a single ConsumeBool call determining whether or not headers for the blocks are added before loading the snapshot.

    Note that loading a valid snapshot is not the only goal of this PR. It also allows for a better coverage of ActivateSnapshot() which was previously hard to reach because the fuzzer already had a hard time fuzzing valid metadata (after #29612) and would abort without it. E.g. it has zero coverage in https://maflcko.github.io/b-c-cov/fuzz.coverage/src/validation.cpp.gcov.html though I’m not sure how up-do-date the corpus is, and fuzzing locally I also didn’t reach ActivateSnapshot() (maybe I wasn’t patient enough).

    I think for more interesting fuzzing we’d have to think about allowing dynamic registration of snapshots in tests (or enabling an option to just skip the check for the hash in chainparams).

    Some time ago, I thought about this in the context of functional tests (to avoid having to edit chainparams whenever we change something in a functional test), but after realizing that the registered snapshot data from chainparams is not only necessary during registration but also in a restart (nChainTx) it seemed too crude to me and I didn’t open a PR. Might make more sense for fuzzing though.

  45. maflcko commented at 7:39 pm on July 23, 2024: member
  46. Theghost256 approved

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-26 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me