bench: Match ConnectBlock tx output counts #32216

pull monlovesmango wants to merge 1 commits into bitcoin:master from monlovesmango:connectblock-bench-fix-output-counts changing 1 files +5 −11
  1. monlovesmango commented at 5:37 pm on April 3, 2025: contributor

    There turned out to be a mismatch in the tx output counts which caused ‘ConnectBlockMixedEcdsaSchnorr’ benchmark to run slower than ‘ConnectBlockAllEcdsa’ and ‘ConnectBlockAllSchnorr’. This commit makes the tx output counts uniform across all benchmarks.

    This commit also renames the ’taproot_tx’ variable to ’tx’ to reflect that this variable represents a general tx and not just a taproot tx.

  2. DrahtBot commented at 5:37 pm on April 3, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32216.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK davidgumberg, janb84, josibake, Prabhat1308

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #31668 (Added rescan option for import descriptors by saikiran57)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Tests on Apr 3, 2025
  4. monlovesmango commented at 5:40 pm on April 3, 2025: contributor

    Background for this PR can be found in #31689.

    Happy to remove the renaming of taproot_tx if reviewers feel that it is inappropriate.

  5. janb84 commented at 6:31 pm on April 3, 2025: contributor
    see review below.
  6. Prabhat1308 commented at 6:57 pm on April 3, 2025: contributor

    tACK 1f0aef6

    My output is less verbose compared to the outputs by other reviewers for some reason.

    0
    1|            ns/block |             block/s |    err% |     total | benchmark
    2|--------------------:|--------------------:|--------:|----------:|:----------
    3|       30,309,900.52 |               32.99 |    0.2% |     65.84 | `ConnectBlockAllEcdsa`
    4|       29,992,461.34 |               33.34 |    0.4% |     65.71 | `ConnectBlockAllSchnorr`
    5|       30,330,863.62 |               32.97 |    0.2% |     66.21 | `ConnectBlockMixedEcdsaSchnorr`
    

    Also agree with the naming change since not all transactions are taproot and better to keep it general.

    • This benchmark is expected to be slower than the AllSchnorr or Ecdsa benchmark
    • because it uses transactions with both Schnorr and Ecdsa signatures
    • which requires the transaction to be hashed multiple times for
    • the different signature algorithms

    I see this mentioned but since for small numbers there isn’t much difference. The difference will be seen when number is increased to actual numbers ? I am of the view that benchmarks be close to the actual numbers and would probably like some actual input numbers to be put rather than the small dummy numbers since these aren’t the markers for this benchmark.

    EDIT: After examining the code again , I see that 1000 transactions are taken . Although still far from real world scenario (with 5 inputs/output) . Should be okay for a benchmark .

  7. DrahtBot requested review from janb84 on Apr 3, 2025
  8. in src/bench/connectblock.cpp:1 in 1f0aef6d70


    janb84 commented at 7:06 pm on April 3, 2025:

    NIT: Please change the comment on line 114-119, your change removes the need for the comment

    0/**
    1 * This benchmark is expected to be slower than the AllSchnorr or Ecdsa benchmark
    2 * because it uses transactions with both Schnorr and Ecdsa signatures
    3 * which requires the transaction to be hashed multiple times for
    4 * the different signature algorithms
    5 */
    

    monlovesmango commented at 9:35 pm on April 3, 2025:
    done! good catch

    janb84 commented at 6:54 am on April 4, 2025:
    Thanks for changing this, it was after the comment of @Prabhat1308 that I noticed it was still there. So props to @Prabhat1308 👏
  9. janb84 commented at 7:07 pm on April 3, 2025: contributor

    Approach ACK 1f0aef6

    The benchmark now runs with equal amount of outputs, this was also discussed during the PR review club. Seems logical to have an equal amount of outputs between the different benchmarks and the change also removes the outlier. Would like to know if the unequal number of outputs were there with a reason though.

  10. monlovesmango force-pushed on Apr 3, 2025
  11. bench: Match ConnectBlock tx output counts
    There turned out to be a mismatch in the tx output counts which caused
    'ConnectBlockMixedEcdsaSchnorr' benchmark to run slower than
    'ConnectBlockAllEcdsa' and 'ConnectBlockAllSchnorr'. This commit makes
    the tx output counts uniform across all benchmarks.
    
    This commit also renames the 'taproot_tx' variable to 'tx' to reflect
    that this variable represents a general tx and not just a taproot tx.
    924f25f6fc
  12. monlovesmango force-pushed on Apr 3, 2025
  13. monlovesmango commented at 9:47 pm on April 3, 2025: contributor

    My output is less verbose compared to the outputs by other reviewers for some reason.

    This is how it looks for me too when I run it on my mac, not sure why.

  14. davidgumberg commented at 1:19 am on April 4, 2025: contributor

    Tested ACK 924f25f6fc766cb89a23fb

    Nice find! Given that, as I understand, part of the motivation for this benchmark’s introduction (#31689) is to be able to compare ConnectBlock() performance with and without schnorr batch validation (#29491), and part of the motivation for using Schnorr signatures in the first place is that they’re susceptible to batch validation, it seems to me that it would be useful for these three benchmarks to be as like as possible, so that they can be compared, and this PR makes that the case in my testing.

    Recent PR review club where this was discussed: (https://bitcoincore.reviews/31689#l-60).

    Benchmark Results

    master (c66f7dab33fb08dc323)

    0$ git checkout --detach 924f25f~1
    1$ cmake -B build -DBUILD_BENCH=ON && cmake --build build -j $(nproc)
    2$ ./build/bin/bench_bitcoin --filter=ConnectBlock.* --min-time=5000
    
    ns/block block/s err% ins/block cyc/block IPC bra/block miss% total benchmark
    46,549,571.00 21.48 1.2% 672,393,930.64 161,287,432.73 4.169 15,122,140.18 1.7% 5.46 ConnectBlockAllEcdsa
    47,199,469.60 21.19 0.2% 680,249,499.27 164,132,414.09 4.145 15,120,173.64 1.7% 5.47 ConnectBlockAllSchnorr
    59,705,754.22 16.75 0.7% 839,091,363.50 207,657,368.33 4.041 18,945,901.12 1.8% 5.55 ConnectBlockMixedEcdsaSchnorr

    pr branch (924f25f6fc766cb)

    0$ git checkout --detach 924f25f
    1$ cmake -B build -DBUILD_BENCH=ON && cmake --build build -j $(nproc)
    2$ ./build/bin/bench_bitcoin --filter=ConnectBlock.* --min-time=5000
    
    ns/block block/s err% ins/block cyc/block IPC bra/block miss% total benchmark
    58,425,229.87 17.12 0.6% 840,316,339.12 203,096,705.00 4.138 18,694,845.12 1.7% 5.45 ConnectBlockAllEcdsa
    59,756,200.89 16.73 0.6% 846,439,909.12 207,679,122.78 4.076 18,670,295.75 1.8% 5.52 ConnectBlockAllSchnorr
    59,767,476.56 16.73 0.6% 840,513,726.44 207,604,950.00 4.049 18,968,510.88 1.8% 5.58 ConnectBlockMixedEcdsaSchnorr

    My output is less verbose compared to the outputs by other reviewers for some reason.

    This is because nanobench provides some extra stats via linux’s perf_events subsystem and omits them when they’re not available:

    CPU statistics like instructions, cycles, branches, branch misses are only available on Linux, through perf events. On some systems you might need to change permissions through perf_event_paranoid or use ACL.

    https://github.com/martinus/nanobench/blob/e4327893194f06928012eb81cabc606c4e4791ac/src/docs/tutorial.rst?plain=1#L79-L84

  15. DrahtBot requested review from janb84 on Apr 4, 2025
  16. DrahtBot requested review from Prabhat1308 on Apr 4, 2025
  17. janb84 commented at 6:52 am on April 4, 2025: contributor

    re ACK 924f25f

    Changes sinds last review:

    • Removed comment block which was not relevant anymore.
  18. josibake approved
  19. josibake commented at 7:56 am on April 4, 2025: member

    ACK https://github.com/bitcoin/bitcoin/pull/32216/commits/924f25f6fc766cb89a23fb1a6ad632947742da9f

    Thanks for picking this up! Very cool to see this coming from the PR review club. While not strictly necessary for all of these benchmarks to have the same number of transactions, I think this change better aligns with the motivation of #31689 and will help avoid confusion in the future. Also agree that the variable rename improves the readability.

  20. l0rinc changes_requested
  21. l0rinc commented at 8:27 am on April 4, 2025: contributor

    As mentioned in the original PR, I personally would prefer adjusting the bench.unit("block") instead (or besides), since a “block” doesn’t actually define what it contains, but number of transactions tells more about what we’re measuring. Changing it to

     0diff --git a/src/bench/connectblock.cpp b/src/bench/connectblock.cpp
     1index efe05dc2af..2317538f93 100644
     2--- a/src/bench/connectblock.cpp
     3+++ b/src/bench/connectblock.cpp
     4@@ -92,7 +92,7 @@ std::pair<std::vector<CKey>, std::vector<CTxOut>> CreateKeysAndOutputs(const CKe
     5 void BenchmarkConnectBlock(benchmark::Bench& bench, std::vector<CKey>& keys, std::vector<CTxOut>& outputs, TestChain100Setup& test_setup)
     6 {
     7     const auto& test_block{CreateTestBlock(test_setup, keys, outputs)};
     8-    bench.unit("block").run([&] {
     9+    bench.batch(outputs.size()).unit("tx").run([&] {
    10         LOCK(cs_main);
    11         auto& chainman{test_setup.m_node.chainman};
    12         auto& chainstate{chainman->ActiveChainstate()};
    

    for the original change results in:

    ns/tx tx/s err% total benchmark
    8,712,574.78 114.78 0.1% 11.00 ConnectBlockAllEcdsa
    8,630,085.42 115.87 0.0% 10.98 ConnectBlockAllSchnorr
    8,718,840.97 114.69 0.0% 10.95 ConnectBlockMixedEcdsaSchnorr

    and after this PR:

    ns/tx tx/s err% total benchmark
    8,702,884.12 114.90 0.0% 10.93 ConnectBlockAllEcdsa
    8,650,517.76 115.60 0.1% 11.04 ConnectBlockAllSchnorr
    8,722,221.53 114.65 0.0% 10.95 ConnectBlockMixedEcdsaSchnorr

    We can keep the current changes as well, but I think it would clarify things if we defined more precisely what we’re measuring.

  22. Prabhat1308 commented at 1:21 pm on April 4, 2025: contributor
    reACK 924f25f
  23. monlovesmango commented at 4:46 pm on April 4, 2025: contributor

    Hey @l0rinc ,

    I appreciate you reiterating this feedback! I actually tend to agree with you, however do not quite agree with the strategy laid out in your suggested changes, and my lack of experience makes me hesitant to make any sweeping changes.

    For your suggested changes, in order for the metric to be tx/s I believe the batch size would actually need to be the number of txs and not the outputs.size(). Additionally, I actually think the metric we would really want is tx outputs/s, as switching to tx/s wouldn’t actually resolve the issue this pr focused on. From a technical perspective it would be inaccurate to say that this benchmark is actually measuring tx outputs/s just by scaling the batch because what it actually would bench is just connectblock divided by the number of tx outputs which doesn’t change the metric in a meaningful way.

    More broadly, for a “ConnectBlock” benchmark my assumption is that this could potentially be used to benchmark other aspects of connectblock functionality, not just signature validation, in which case tx outputs/s would not really be an appropriate metric. I do think it might be valuable to have a separate benchmark that is more focused at the tx output granularity level.

    I think it would clarify things if we defined more precisely what we’re measuring.

    In my view, block/s is the most precise definition of what we’re measuring and tx/s (or tx output/x) would be an obfuscation.

  24. l0rinc commented at 8:34 pm on April 4, 2025: contributor

    In my view, block/s is the most precise definition of what we’re measuring and tx/s (or tx output/x) would be an obfuscation.

    Isn’t the point of this PR to change what blocks/s means? That’s why this PR exists, since a block doesn’t really narrow down what we’re measuring. We might as well call it thing/s.

    the metric we would really want is tx outputs/s

    I think that’s basically what I wrote (you can scale it to actual transaction outputs, but since the transaction count is the same for all, it’s just a multiplier, won’t change the ratios)

    We can also write:

    0bench.batch(test_block.vtx.size() * outputs.size()).unit("txo").run([&] {
    

    or even

    0bench.batch(std::accumulate(test_block.vtx.begin(), test_block.vtx.end(), size_t{0}, [](size_t sum, const auto& tx) { return sum + tx->vout.size(); })).unit("txo").run([&] {
    

    if you think that’s more accurate, but calling it “block/s” and than changing the value and still calling it “block/s” is not the sign of a reliable benchmark.

    Will let you decide, I basically already said these in the other PR as well.

  25. monlovesmango commented at 11:35 pm on April 4, 2025: contributor

    Isn’t the point of this PR to change what blocks/s means?

    The point of this PR is simply to make the existing benchmark code more consistent across benchmarks, not to change anything in the design of the benchmark that has already been merged. As I said before, I do not have the experience nor expertise to tackle something like that and am not convinced that just changing the unit of the benchmark is meaningful.

  26. ryanofsky assigned ryanofsky on Apr 8, 2025
  27. ryanofsky merged this on Apr 8, 2025
  28. ryanofsky closed this on Apr 8, 2025


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-04-16 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me