node: optimize CBlockIndexWorkComparator #33334

pull Raimo33 wants to merge 1 commits into bitcoin:master from Raimo33:index-work-comparator-branchless changing 1 files +7 −9
  1. Raimo33 commented at 10:00 pm on September 7, 2025: contributor

    Summary

    This PR optimizes CBlockIndexWorkComparator::operator() by removing 4 redundant branches.

    The previous implementation used multiple separate comparisons with explicit branches for greater-than and less-than cases, resulting in unnecessary code paths.

    The new implementation consolidates comparisons into single inequality checks and reduces complexity while preserving its original behavior. This change is particularly beneficial for loading blocks from files and reindexing.

    Benchmarks

    0taskset -c 1 ./bin/bench_bitcoin --filter="(CheckBlockIndex|LoadExternalBlockFile)" -output-csv=bench_old.csv --min-time=30000
    

    Before:

    ns/op op/s err% total benchmark
    129,988.82 7,692.97 0.0% 33.02 CheckBlockIndex
    21,661,396.96 46.17 0.5% 32.37 LoadExternalBlockFile

    After:

    ns/op op/s err% total benchmark
    115,346.65 8,669.52 0.0% 33.00 CheckBlockIndex
    20,389,679.85 49.04 0.4% 31.76 LoadExternalBlockFile

    Compared to master:

    • CheckBlockIndex +12.7% faster
    • LoadExternalBlockFile +6.2% faster
  2. DrahtBot commented at 10:00 pm on September 7, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33334.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK l0rinc

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #33637 (refactor: optimize block index comparisons (1.4-7.7x faster) by l0rinc)
    • #29640 (Fix tiebreak when loading blocks from disk (and add tests for comparing chain ties) by sr-gi)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. node: optimize CBlockIndexWorkComparator
    Refactor the comparator logic in CBlockIndexWorkComparator::operator() to reduce the amounts of branches and improve readability without changing semantics.
    
    The previous implementation used multiple separate comparisons with explicit branches for greater-than and less-than cases, resulting in
    unnecessary code paths.
    
    The new implementation consolidates comparisons into single inequality checks and reduces complexity while preserving its original behavior.
    This change is particularly beneficial for loading blocks from files and reindexing.
    
    taskset -c 1 ./bin/bench_bitcoin --filter="(CheckBlockIndex|LoadExternalBlockFile|BlockToJsonVerboseWrite)" -output-csv=bench_old.csv --min-time=30000
    
    |               ns/op |                op/s |    err% |     total | benchmark
    |--------------------:|--------------------:|--------:|----------:|:----------
    |       26,557,419.20 |               37.65 |    0.1% |     32.85 | `BlockToJsonVerboseWrite`
    |          129,988.82 |            7,692.97 |    0.0% |     33.02 | `CheckBlockIndex`
    |       21,661,396.96 |               46.17 |    0.5% |     32.37 | `LoadExternalBlockFile`
    
    taskset -c 1 ./bin/bench_bitcoin --filter="(CheckBlockIndex|LoadExternalBlockFile|BlockToJsonVerboseWrite|WalletIsMineDescriptors)" -output-csv=bench_new.csv --min-time=30000
    
    |               ns/op |                op/s |    err% |     total | benchmark
    |--------------------:|--------------------:|--------:|----------:|:----------
    |       27,930,130.95 |               35.80 |    0.1% |     32.96 | `BlockToJsonVerboseWrite`
    |          115,346.65 |            8,669.52 |    0.0% |     33.00 | `CheckBlockIndex`
    |       20,389,679.85 |               49.04 |    0.4% |     31.76 | `LoadExternalBlockFile`
    80ac0467ef
  4. Raimo33 force-pushed on Sep 9, 2025
  5. in src/node/blockstorage.cpp:173 in 80ac0467ef
    177-    if (pa > pb) return true;
    178-
    179-    // Identical blocks.
    180-    return false;
    181+    return pa > pb;
    182 }
    


    l0rinc commented at 9:07 pm on September 9, 2025:

    All this does is compare by 3 parameters - wouldn’t it be simpler to use std::tie for lexicographical comparison instead?

    0bool CBlockIndexWorkComparator::operator()(const CBlockIndex* pa, const CBlockIndex* pb) const
    1{
    2    // First sort by most total work (descending), then by earliest activatable time (ascending), then by pointer value (ascending)
    3    return std::tie(pa->nChainWork, pb->nSequenceId, pb) < std::tie(pb->nChainWork, pa->nSequenceId, pa);
    4}
    

    Note that pa & pb usages are mixed to follow the original algorithm.

    Since https://corecheck.dev/bitcoin/bitcoin/pulls/33334 hasn’t finished yet, I ran the benchmarks on my Mac with min-time of 100 seconds for stability several times, and took the fastest of each execution (since noise can only slow down execution), which reveals that:

    • BlockToJsonVerboseWrite speed is completely unaffected in all cases;
    • LoadExternalBlockFile is the same in the current implementation but would be ~7% faster with the suggested solution;
    • CheckBlockIndex is indeed 17% faster for the current solution, but would be 24% faster with the proposed one - all while making the code a lot simpler;

    Before:

    ns/op op/s err% total benchmark
    25,322,734.53 39.49 0.2% 109.93 BlockToJsonVerboseWrite
    39,049.20 25,608.72 0.2% 110.15 CheckBlockIndex
    5,955,124.54 167.92 0.1% 110.26 LoadExternalBlockFile

    After:

    ns/op op/s err% total benchmark
    25,333,028.32 39.47 0.2% 109.61 BlockToJsonVerboseWrite
    33,173.57 30,144.48 0.3% 109.82 CheckBlockIndex
    6,169,392.66 162.09 0.4% 110.79 LoadExternalBlockFile

    Suggested std::tie:

    ns/op op/s err% total benchmark
    25,360,146.33 39.43 0.3% 109.90 BlockToJsonVerboseWrite
    31,343.25 31,904.80 0.7% 104.88 CheckBlockIndex
    5,535,101.51 180.67 0.7% 967.85 LoadExternalBlockFile

    If you decide to take the suggestion, please add me as a coauthor.


    Raimo33 commented at 10:39 pm on September 9, 2025:

    I experimented with std::tie before. whether it is more readable is debatable. I’d say they’re the same in terms of readability (one might not be familiar with std::tie). I ran the benchmarks on my x86 with gcc laptop (not the same machine as my original benchmarks) and got different results from yours.

    0taskset -c 0 ./bin/bench_bitcoin --filter="(CheckBlockIndex|LoadExternalBlockFile)" --min-time=10000
    

    Master:

    ns/op op/s err% total benchmark
    200,279.21 4,993.03 0.1% 10.95 CheckBlockIndex
    29,798,546.03 33.56 0.3% 10.82 LoadExternalBlockFile

    This PR:

    ns/op op/s err% total benchmark
    166,134.37 6,019.22 0.1% 10.91 CheckBlockIndex
    29,547,138.28 33.84 0.1% 10.59 LoadExternalBlockFile

    CheckBlockIndex: +17% faster than master LoadExternalBlockFile: same as master

    @l0rinc suggestion:

    ns/op op/s err% total benchmark
    193,085.31 5,179.06 0.0% 11.00 CheckBlockIndex
    30,456,267.03 32.83 0.3% 10.96 LoadExternalBlockFile

    CheckBlockIndex: +3.6% faster than master LoadExternalBlockFile: -2.2% slower than master

    Overall very different results on different machines/compilers. probably std::tie would be the most standard implementation, giving room for smarter future compilers to optimize. but I genuinely don’t know how this could be less than 2 branches.


    l0rinc commented at 11:34 pm on September 9, 2025:

    but I genuinely don’t know how this could be less than 2 branches

    let’s find out - can you create a godbold reproducer with latest compilers?


    Raimo33 commented at 10:59 am on September 10, 2025:

    https://godbolt.org/z/6TjP137z1

    Looks like the current version wins both on x86 and arm, but I’m no assembly expert.


    l0rinc commented at 1:55 am on October 16, 2025:
    I still think the std::tie version is better, pushed it as an alternative as part of a bigger related optimization I had. Added you as coauthor there: https://github.com/bitcoin/bitcoin/pull/33637
  6. l0rinc changes_requested
  7. l0rinc commented at 9:28 pm on September 9, 2025: contributor
    Concept ACK - please consider my suggestion
  8. luke-jr referenced this in commit 39631839b7 on Sep 19, 2025
  9. Raimo33 commented at 9:17 am on October 17, 2025: contributor
    Follow up in #33637
  10. Raimo33 closed this on Oct 17, 2025

  11. Raimo33 deleted the branch on Oct 24, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-11-03 00:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me