Keep fuzz inputs for older branches #274

pull ekzyis wants to merge 1 commits into bitcoin-core:main from ekzyis:keep-fuzz-inputs-for-older-branches changing 1 files +139 −38
  1. ekzyis commented at 1:51 AM on April 12, 2026: contributor

    close #265

    see comment below for more information

  2. in delete-nonreduced-fuzz-inputs/src/main.rs:89 in ca43f569ba outdated
      87 |      }
      88 | +    Ok(git_refs)
      89 | +}
      90 | +
      91 | +fn app() -> AppResult {
      92 | +    let mut git_refs = vec!["master".to_string()];
    


    ekzyis commented at 1:53 AM on April 12, 2026:

    ~we might want to use the output of git rev-parse --short master instead of master in the commit messages~

    afaik, this wasn't a problem before, so I don't think it's a problem now

  3. ekzyis marked this as a draft on Apr 12, 2026
  4. ekzyis force-pushed on Apr 12, 2026
  5. ekzyis force-pushed on Apr 12, 2026
  6. ekzyis force-pushed on Apr 12, 2026
  7. ekzyis force-pushed on Apr 12, 2026
  8. ekzyis commented at 11:10 PM on April 12, 2026: contributor

    https://github.com/bitcoin-core/qa-assets/commit/4babc489a28d335b2e7e1a4c723a8e77efa61c9e looks good to me now. Updated PR description.

    I ran this against a test fuzz corpora in https://github.com/ekzyis/qa-assets/commit/a4ef010134b76b0a8a481069ca2255aef6a52410 (a different branch) to speed things up. Took 30 minutes with --extra-ref v30.2 --extra-ref v29.3. To do the same, run this in the VM:

    $ git clone --depth=1 --branch=test-keep-fuzz-inputs-for-older-branches https://github.com/ekzyis/qa-assets
    $ cp -r qa-assets/delete-nonreduced-fuzz-inputs .
    $ cd delete-nonreduced-fuzz-inputs
    $ sudo cargo run -- --extra-ref v30.2 --extra-ref v29.3
    

    I had to allow empty commits because fuzzing another ref might not lead to any new fuzz inputs.

  9. ekzyis marked this as ready for review on Apr 12, 2026
  10. in delete-nonreduced-fuzz-inputs/src/main.rs:74 in 4babc489a2
      72 | +    let mut git_refs = Vec::new();
      73 | +    let mut args = env::args().skip(1);
      74 | +    while let Some(arg) = args.next() {
      75 |          match arg.as_str() {
      76 |              "--help" | "-h" => return Err(help("help requested")),
      77 | +            "--extra-ref" => {
    


    maflcko commented at 8:05 AM on April 13, 2026:

    I don't think this matches the docs above, which say:

    Usage: delete-nonreduced-fuzz-inputs [--extra-ref=<ref>]...
    

    My preference would be to keep the = and do the parsing, as suggested earlier in #269 (review):

        for arg in args {
            if let Some(s) = arg.strip_prefix("-h") {
                Err(help("Help requested"))?;
            } else if let Some(e) = arg.strip_prefix("--extra-ref=) {
    //
    } else {
                Err(help(&format!(
                    "Too many args, or unknown named arg: {arg}"
                )))?;
            }
         }
    

    But just a nit. An alternative could be to remove the = in the docs. Up to you.


    ekzyis commented at 12:46 PM on April 13, 2026:

    Oh, I tried using strip_prefix before, but with the existing match, and the combination was an experimental feature not available on the cargo version in the ubuntu VM. Seems like I didn't consider to not use match.

    Using your pattern in e298604b38 now. I only changed this to recognize -hfoo as an unknown argument and support --help:

    -        if let Some(_) = arg.strip_prefix("-h") {
    +        if arg == "-h" || arg == "--help" {
    
  11. in delete-nonreduced-fuzz-inputs/src/main.rs:492 in 4babc489a2
     485 | @@ -407,6 +486,23 @@ fn run_afl_cmin<P: AsRef<Path>, Q: AsRef<Path>>(
     486 |      Ok(())
     487 |  }
     488 |  
     489 | +/// Move every file in `dir` to its parent directory.
     490 | +fn move_files_to_parent_dir(dir: &PathBuf) -> AppResult {
     491 | +    for entry in fs::read_dir(dir).map_err(|e| format!("fs::read_dir failed: {}", e.to_string()))? {
     492 | +        let entry = entry.map_err(|e| format!("failed to read file: {}", e.to_string()))?;
    


    maflcko commented at 8:15 AM on April 13, 2026:
            let entry = entry.map_err(|e| format!("failed to read entry: {}", e.to_string()))?;
    

    nit


    ekzyis commented at 12:47 PM on April 13, 2026:

    fixed in 4b7804521f

  12. in delete-nonreduced-fuzz-inputs/src/main.rs:501 in 4babc489a2
     496 | +            .ok_or_else(|| format!("expected {} to have parent", from.display()))?
     497 | +            .join(entry.file_name());
     498 | +        fs::rename(&from, &to).map_err(|e| format!("fs::rename failed: {}", e.to_string()))?;
     499 | +    }
     500 | +
     501 | +    fs::remove_dir_all(dir).map_err(|e| format!("fs::remove_dir_all failed: {}", e.to_string()))?;
    


    maflcko commented at 8:16 AM on April 13, 2026:

    ekzyis commented at 1:05 PM on April 13, 2026:

    Oh, good question! There was no reason. I tried using fs::remove_dir now, but I realized the directory isn't actually empty after the loop for v30.2, so fs::remove_dir fails.

    First, I thought it might be because one shouldn't read a dir and move files in the same loop, so I collected all file names before renaming in 76292bffb9. This didn't fix the issue (probably still a good thing to do).

    I then printed out all rename operations in 11d2d33b2d and can see that they all succeed, but the directory is indeed not empty when fs::remove_dir runs, see full log.

    I think I'm running into the same problem I had when implementing this part in #266 with mv in bash:

    $ mv v30.2/007c7bebe2400a5f05d47382971a33e2244b4fc2 007c7bebe2400a5f05d47382971a33e2244b4fc2
    mv: 'v30.2/007c7bebe2400a5f05d47382971a33e2244b4fc2' and '007c7bebe2400a5f05d47382971a33e2244b4fc2' are the same file
    

    except fs::rename doesn't even throw :thinking: I also didn't (and still don't) understand how they can be the same file. Sure, they have the same content, and same name, but wasn't one created when fuzzing with master, and the other one with v30.2? But when I run ls -i, sure enough, they point to the same inode:

    $ ls -i v30.2/007c7bebe2400a5f05d47382971a33e2244b4fc2
    4497220 v30.2/007c7bebe2400a5f05d47382971a33e2244b4fc2
    $ ls -i 007c7bebe2400a5f05d47382971a33e2244b4fc2
    4497220 007c7bebe2400a5f05d47382971a33e2244b4fc2
    

    Not sure what's going on here. It's why I started to use cp instead of mv, as mentioned in the comment.

    I'll figure this out to make sure this isn't hiding a bug instead of just using cp.


    update:

    Sure, they have the same content, and same name, but wasn't one created when fuzzing with master, and the other one with v30.2?

    Oh, one explanation could be that afl-cmin isn't actually creating new inodes. The input file in all_inputs is the same inode:

    $ ls -i all_inputs/addition_overflow/007c7bebe2400a5f05d47382971a33e2244b4fc2
    4497220 all_inputs/addition_overflow/007c7bebe2400a5f05d47382971a33e2244b4fc2
    

    maflcko commented at 1:36 PM on April 13, 2026:

    Maybe afl-cmin creates a hard link? According to my LLM:

    • afl-cmin primarily creates hard links to the minimized inputs (it uses link() in src/afl-cmin.c, os.link() in afl-cmin.py, and ln in afl-cmin.bash).
    

    maflcko commented at 1:45 PM on April 13, 2026:

    If those are really hard-links, I presume remove_dir_all is actually the correct function to call here


    ekzyis commented at 5:33 PM on April 13, 2026:

    Yes, afl-cmin in C++, bash and python is creating hard links. The script is installing the python version:

    $ file $(which afl-cmin)
    /usr/local/bin/afl-cmin: Python script, ASCII text executable
    

    Ok, this explains it! Using remove_dir_all in 4373c7715e.

  13. maflcko approved
  14. maflcko commented at 8:17 AM on April 13, 2026: contributor

    lgtm, just some nits

  15. ekzyis force-pushed on Apr 13, 2026
  16. ekzyis commented at 1:24 PM on April 13, 2026: contributor

    Thanks for the review! I didn't squash the commits yet. I thought it would make the next review easier.

    I'm currently debugging why I can't use fs::remove_dir, see #274 (review).

  17. Keep fuzz inputs for older branches 4373c7715e
  18. ekzyis force-pushed on Apr 13, 2026
  19. ekzyis commented at 5:34 PM on April 13, 2026: contributor

    Squashed all commits into 4373c7715e.

    Btw, I didn't make sure that libFuzzer will only add fuzz targets to the output directory, and not delete the ones in the output directory that don't increase coverage for the fuzzed binary. I only looked at the documentation for -merge, and it doesn't mention it would:

    You may use the same flag to add more interesting items to an existing corpus. Only the inputs that trigger new coverage will be added to the first corpus.

    For -set_cover_merge, which is what we're using, I only found this comment in the source code:

    // Merges all corpora into the first corpus. A file is added into
    // the first corpus only if it adds new features. Unlike `Merger::Merge`,
    // this implementation calculates an approximation of the minimum set
    // of corpora files, that cover all known features (set cover problem).
    // Generally, this means that files with more features are preferred for
    // merge into the first corpus. When two files have the same number of
    // features, the smaller one is preferred.
    size_t Merger::SetCoverMerge(const std::set<uint32_t> &InitialFeatures,
    
  20. in delete-nonreduced-fuzz-inputs/src/main.rs:500 in 4373c7715e
     495 | +        .ok_or_else(|| format!("expected {} to have parent", dir.display()))?;
     496 | +
     497 | +    for name in names {
     498 | +        let from = dir.join(&name);
     499 | +        let to = parent.join(&name);
     500 | +        fs::rename(&from, &to).map_err(|e| format!("fs::rename failed: {}", e.to_string()))?;
    


    maflcko commented at 5:45 PM on April 14, 2026:

    heh, I guess there could be a check to see if the file to exists, and then delete from. Otherwise, keep the move. This would allow to use remove_dir, but not important. Either way is fine.

  21. maflcko approved
  22. maflcko merged this on Apr 14, 2026
  23. maflcko closed this on Apr 14, 2026

  24. ekzyis deleted the branch on Apr 14, 2026

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/qa-assets. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-17 08:25 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me