bench: unrealistic ConnectBlock benchmarks

Raimo33 commented at 12:53 pm on September 12, 2025: none

The current ConnectBlock benchmarks in bench/connectblock.cpp do not reflect realistic mainnet workloads due to three key issues:

1. Unrealistic block composition

Every benchmarked block is constructed with a highly artificial transaction pattern:

    /*
     * - Each transaction has the same number of inputs and outputs
     * - All Taproot inputs use simple key path spends (no script path spends)
     * - All signatures use SIGHASH_ALL (default sighash)
     * - Each transaction spends all outputs from the previous transaction
     */

This setup avoids realistic UTXO set fragmentation and script diversity. The benchmark effectively measures validation of a synthetic “ladder” of transactions rather than a block resembling mainnet traffic.
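As a rough illustration of what a less uniform composition could look like, the sketch below draws per-transaction shapes from a seeded random generator. It is standalone C++ and does not touch Bitcoin Core: `TxShape`, `SpendKind`, and `MakeMixedShapes` are hypothetical names, and the ranges and the 25% script-path share are placeholders, not measured mainnet statistics.

```cpp
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical description of one transaction's shape; not a Bitcoin Core type.
enum class SpendKind { KeyPath, ScriptPath };

struct TxShape {
    std::size_t num_inputs;
    std::size_t num_outputs;
    SpendKind spend_kind;
};

// Draw `count` transaction shapes from a seeded RNG so that runs are reproducible.
// The ranges and the script-path probability are placeholder assumptions.
std::vector<TxShape> MakeMixedShapes(std::size_t count, std::uint32_t seed)
{
    std::mt19937 rng(seed);
    std::uniform_int_distribution<std::size_t> inputs(1, 8);
    std::uniform_int_distribution<std::size_t> outputs(1, 4);
    std::bernoulli_distribution script_path(0.25);

    std::vector<TxShape> shapes;
    shapes.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const std::size_t n_in = inputs(rng);
        const std::size_t n_out = outputs(rng);
        const SpendKind kind = script_path(rng) ? SpendKind::ScriptPath : SpendKind::KeyPath;
        shapes.push_back({n_in, n_out, kind});
    }
    return shapes;
}
```

A block builder could then consume these shapes instead of a fixed inputs/outputs pair. Using a fixed seed keeps the benchmark input identical between runs, which matters when comparing results across commits.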

2. Unrealistic UTXO cache state

Before benchmarking, the code creates a block that produces the outputs, then immediately spends them all in the benchmark block. This keeps the entire UTXO set hot in memory (CoinsTip()).

In reality:

Many UTXO lookups hit LevelDB and require disk access.
Cache misses and eviction policies significantly impact block validation cost.
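The difference between a hot and a cold cache can be pictured with a toy two-layer lookup. This is a minimal sketch and not Bitcoin Core code: `ToyCoinsView` is a hypothetical class, keys are plain integers rather than outpoints, and a second in-memory map stands in for the on-disk database, so it only counts misses and says nothing about real disk latency.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Toy stand-in for a layered UTXO view (hypothetical, not CCoinsViewCache).
// `m_backing` plays the role of the on-disk database, `m_cache` the in-memory layer.
class ToyCoinsView
{
public:
    void AddToBacking(std::uint64_t key, std::int64_t amount) { m_backing[key] = amount; }

    // Look up a coin. A cache miss falls through to the backing store and,
    // if the coin exists there, populates the cache for later lookups.
    bool Get(std::uint64_t key, std::int64_t& amount_out)
    {
        const auto cached = m_cache.find(key);
        if (cached != m_cache.end()) {
            ++m_hits;
            amount_out = cached->second;
            return true;
        }
        ++m_misses;
        const auto stored = m_backing.find(key);
        if (stored == m_backing.end()) return false;
        m_cache.emplace(key, stored->second);
        amount_out = stored->second;
        return true;
    }

    // Drop the in-memory layer to emulate a cold start.
    void DropCache() { m_cache.clear(); }

    std::size_t Hits() const { return m_hits; }
    std::size_t Misses() const { return m_misses; }

private:
    std::unordered_map<std::uint64_t, std::int64_t> m_backing;
    std::unordered_map<std::uint64_t, std::int64_t> m_cache;
    std::size_t m_hits{0};
    std::size_t m_misses{0};
};
```

A benchmark that spends only freshly created outputs corresponds to the all-hits case here. Calling `DropCache()` before the measured lookups corresponds to the cold case, where every first access to a coin is a miss.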

3. Unrealistic repetition

Each benchmark repeatedly validates the same synthetic block:

    const auto& test_block{CreateTestBlock(test_setup, keys, outputs)};
    bench.unit("block").run([&] {
        /* ... */
    });

There is no variability in transaction graph, script mix, or UTXO evolution across iterations. As a result, the benchmark never exercises cache churn, block-to-block dependency patterns, or realistic workload diversity.
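One way to picture per-iteration variability is a small helper that hands out a different pre-built workload on each call. This is a sketch under assumptions: `BlockRotation` is a hypothetical helper, not part of the benchmark framework, and it is generic over the workload type so that it compiles on its own.

```cpp
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical helper: cycles through a fixed pool of pre-built workloads,
// wrapping around to the first one after the last has been handed out.
template <typename Block>
class BlockRotation
{
public:
    explicit BlockRotation(std::vector<Block> blocks) : m_blocks(std::move(blocks))
    {
        if (m_blocks.empty()) throw std::invalid_argument("need at least one block");
    }

    // Return the next workload in round-robin order.
    const Block& Next()
    {
        const Block& block = m_blocks[m_next];
        m_next = (m_next + 1) % m_blocks.size();
        return block;
    }

private:
    std::vector<Block> m_blocks;
    std::size_t m_next{0};
};
```

Inside the lambda passed to `run`, a call such as `rotation.Next()` would select the workload for that iteration. Whether each of those blocks can be connected against the same chainstate is a separate problem that this sketch does not address.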

Why this matters

These issues mean the benchmark results do not reflect real-world ConnectBlock performance. Instead, they measure a best-case, memory-only workload on a synthetic block structure.

davidgumberg commented at 0:06 am on September 13, 2025: contributor

I am skeptical of trying to measure the end-to-end performance of Bitcoin Core systems like block connection, including LevelDB performance, with a microbenchmark suite. In my opinion, these benchmarks work best for measuring the performance of specific components and functions, à la unit tests. I would not object to adding more ConnectBlock() scenarios that exercise particular segments of block connection, but for measurements of realistic scenarios there is no substitute for measuring Bitcoin Core nodes doing IBD. The benchkit project for measuring IBD may be of interest: https://github.com/bitcoin-dev-tools/benchkit.

Raimo33 commented at 2:44 pm on September 13, 2025: none

It’s not “a little bit of faithfulness” that you sacrifice. You’re ignoring the cache completely.

l0rinc commented at 2:11 am on September 15, 2025: contributor

@Raimo33 if you can provide an alternative that’s reasonably simple for a micro-benchmark, I will happily review it.

But note that I already have a few micro-benchmark improvements in that area where review is needed:

#32554 - creates a configurable block so that we can measure the composition of the block
https://github.com/bitcoin/bitcoin/pull/32729/files#diff-547fa26a77a99ccd77aca7ce1c69c0544666f788d463dc7bff664001f9ff1c88R40 - splits checkblock measurements from serialization
https://github.com/bitcoin/bitcoin/pull/31868/files#diff-547fa26a77a99ccd77aca7ce1c69c0544666f788d463dc7bff664001f9ff1c88R24 - adds a separate benchmark for serialization size counting
https://github.com/bitcoin/bitcoin/pull/31682/files#diff-99f9f77ae4c2d8d3d3b611b8a38ee08b48239dcc30417179b07b1b092b1d9dd6R138 - adds a dedicated transaction processing benchmark

HowHsu commented at 1:34 pm on September 15, 2025: none

I can confirm this issue; I ran into it while testing my PR: https://github.com/bitcoin/bitcoin/pull/32791. To test ConnectBlock() realistically, I think we can leverage CVerifyDB::VerifyDB(), which is called on node init. When -checklevel is set to 4, it mainly replays the latest -checkblocks blocks of the best chain. As for cache state, you have to set it up manually at the beginning (flush it for the no-cache case, pre-load entries for the cache-hit case, etc.). You also have to find a good height range of blocks first.

bench: unrealistic ConnectBlock benchmarks #33375
