bench: run FindByte across block-sized buffer #34046

pull l0rinc wants to merge 1 commits into bitcoin:master from l0rinc:l0rinc/findbyte-bench changing 1 files +14 −7
  1. l0rinc commented at 9:12 pm on December 10, 2025: contributor

    The benchmark was updated to start from different file positions and iterate with a stride of ~32kb to exercise a few “random” places.

    Found while reviewing #34044

    Note: this PR started off as making FindByte more realistic, but based on #34044 (comment) I decided to remove it, and after #34046 (comment) it was reverted to its original state

  2. DrahtBot added the label Tests on Dec 10, 2025
  3. DrahtBot commented at 9:12 pm on December 10, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34046.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, Raimo33
    Concept NACK sipa

    If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

  4. l0rinc force-pushed on Dec 10, 2025
  5. DrahtBot added the label CI failed on Dec 10, 2025
  6. DrahtBot commented at 9:58 pm on December 10, 2025: contributor

    🚧 At least one of the CI tasks failed. Task macOS native: https://github.com/bitcoin/bitcoin/actions/runs/20113425463/job/57716719394 LLM reason (✨ experimental): bench_sanity_check test segfault caused CI failure.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  7. DrahtBot removed the label CI failed on Dec 10, 2025
  8. in src/bench/streams_findbyte.cpp:39 in 031ee0b22c
    42+            assert(bf.GetPos() == found_index);
    43+        }
    44     });
    45-
    46     assert(file.fclose() == 0);
    47+    fs::remove(path);
    


    maflcko commented at 8:03 am on December 11, 2025:
    the file isn’t left around. This was fixed already in fa2fbaa4a29f80d3c7d5f0ad6b64035c3156dd12

    l0rinc commented at 12:15 pm on December 11, 2025:
    Deleted the manual removal, thanks!
  9. Raimo33 commented at 11:52 am on December 11, 2025: contributor

    Code Review ACK 031ee0b22c4ee4f04f66cb8f0be534b404dbe73c

    The new benchmark is more realistic

  10. l0rinc force-pushed on Dec 11, 2025
  11. in src/bench/streams_findbyte.cpp:26 in 464ee9e26f
    25-    uint8_t data[file_size] = {0};
    26-    data[file_size - 1] = 1;
    27-    file << data;
    28+    std::vector data(file_size, std::byte{0});
    29+    data[found_index] = target;
    30+    file.write_buffer(data);
    


    maflcko commented at 12:35 pm on December 11, 2025:

    I don’t understand this either. There is no obfuscation, so the two are the same either way?

    I’d prefer to just keep the << in high level code, unless there is a reason not to.


    l0rinc commented at 12:43 pm on December 11, 2025:
    the vector serializes differently, we don’t need the size prefix, it would make the assertion at the end a bit less intuitive

    maflcko commented at 12:50 pm on December 11, 2025:

    To use span-serialization, you can just type std::span{bla}, see also git grep '<< std::span{'.

    I don’t think it makes sense to use write_buffer for places where span-serialization is wanted. Otherwise, this leads to inconsistent code, such as:

    0file << int_val;
    1file.write_buffer(vec_val);
    2file << int_other_val;
    

    Also, this mutating the buffer (possibly) makes it doubly confusing.


    l0rinc commented at 1:01 pm on December 11, 2025:

    this mutating the buffer

    There was no obfuscation enabled, so no mutation was happening.

    don’t think it makes sense to use write_buffer for places where span-serialization is wanted

    I dislike magical operators and needless serialization when we want to directly write content, but I’m also fine with file << std::span{data} - pushed.

  12. maflcko commented at 12:41 pm on December 11, 2025: member
    lgtm 464ee9e26f7851ff27c2d18f807ae64c8dd0fc62, but I’d prefer to not use the low-level buffer writing questionable “optimization”
  13. bench: run `FindByte` across block-sized buffer
    The current benchmark doesn't represent a realistic scenario:
    * it only checks 200 bytes, while in reality we have 8 MB: https://github.com/bitcoin/bitcoin/blob/2c44c41984e0a170cf38d595635a9e861854573c/src/validation.cpp#L5009
    * it only checks the very last position (which is basically the missing case), while in reality it's a circular buffer, so the magic bytes can be anywhere
    * plain array was needlessly serialized instead of written directly
    
    The benchmark was updated to start from different file positions and iterate with a stride of ~32kb to exercise a few "random" places.
    93941dc74b
  14. l0rinc force-pushed on Dec 11, 2025
  15. DrahtBot added the label CI failed on Dec 11, 2025
  16. maflcko commented at 1:05 pm on December 11, 2025: member

    review ACK 93941dc74be8817463e024e5aa11ed0638fdc9f3 🐪

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: review ACK 93941dc74be8817463e024e5aa11ed0638fdc9f3 🐪
    3Yt2ihRdgRtgrO4buTQEVh9tmEonYfAwnXmc8GMkuzgWbdXqsilW8jJP/6QMesry75/BUwnPrr7ognDzINKfsAw==
    
  17. Raimo33 commented at 1:23 pm on December 11, 2025: contributor
    re ACK 93941dc74be8817463e024e5aa11ed0638fdc9f3
  18. l0rinc closed this on Dec 11, 2025

  19. l0rinc reopened this on Dec 11, 2025

  20. l0rinc commented at 6:31 pm on December 11, 2025: contributor
    (close/open dance to fix boost download issue)
  21. DrahtBot removed the label CI failed on Dec 11, 2025
  22. l0rinc force-pushed on Dec 14, 2025
  23. l0rinc renamed this:
    bench: run `FindByte` across block-sized buffer
    bench: remove `FindByte` benchmark
    on Dec 14, 2025
  24. l0rinc commented at 5:05 pm on December 14, 2025: contributor
    Decided to remove this benchmark instead of updating it - seeing that actual reindex got slower, while even the updated benchmark showed a speedup, since current blocks always contain the header in the first position.
  25. sipa commented at 6:11 pm on December 14, 2025: member

    Concept NACK.

    The existence of a benchmark does not imply it’s worth efforts for improving the code further. It’s still useful to see it doesn’t regress.

    It’s not that we don’t care about FindByte’s performance. It would probably be unacceptable to make it 100x slower. But as-is, it’s good enough.

  26. l0rinc force-pushed on Dec 14, 2025
  27. l0rinc commented at 6:17 pm on December 14, 2025: contributor
    Thanks for the feedback @sipa, I have reverted the previous version which checks multiple positions.
  28. DrahtBot added the label CI failed on Dec 14, 2025
  29. l0rinc renamed this:
    bench: remove `FindByte` benchmark
    bench: run `FindByte` across block-sized buffer
    on Dec 14, 2025
  30. DrahtBot removed the label CI failed on Dec 14, 2025
  31. l0rinc commented at 6:06 pm on December 15, 2025: contributor
    I’ll try something else, closing for now
  32. l0rinc closed this on Dec 15, 2025


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-02-02 06:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me