Let's get a baseline!
Bench 31132 base #35188
andrewtoth wants to merge 57 commits into bitcoin:master from bitcoin-dev-tools:bench-31132-base, changing 47 files (+6956 −28)
andrewtoth commented at 9:30 PM on April 30, 2026: contributor
-
50a568a764
benchcoin: add tooling
Adds build configuration, benchmarking CI workflows, Python dependencies, plotting tools, and documentation for benchcoin.
Co-authored-by: David Gumberg <davidzgumberg@gmail.com>
Co-authored-by: Lőrinc <pap.lorinc@gmail.com>
-
don't compare to master in prs e02107e8c0
-
only run single bins in prs 15d5723948
-
rebase at 0100 GMT 45eec4c48c
-
make charts taller 8721de5069
-
update machine configs and charts fece8dad3a
-
02f3ffbc19
fix nightly chart display and machine spec detection
- Fix empty chart: use get_chart_data() instead of to_dict() so JS filters can match config strings ("450", "32000") instead of objects
- Capture machine specs on the self-hosted runner during the build job and pass them via the --machine-specs flag to nightly append, instead of detecting them on the ubuntu-latest publish runner
-
chart: make chart series dynamic and unique c576ee42c5
-
rename history file e8edc3886b
-
use better colours in charts 9404f57b28
-
don't use inline html 8c63bee5cd
-
use commit date in chart data points ee95e61c1d
-
use nix flake in both publish workflow steps 4a61ed024e
-
fix nightly-history mismatch 22c3cfdd81
-
fix instrumented suffixes in reports e0663833a5
-
add clickable plotly links 3e6bec31d5
-
use correct path in index 321bb08f26
-
use scatter plot for leveldb compaction 65ac9b061c
-
add debug logs to artifacts b59ea9a0f5
-
dynamic charts test 097ddab8c4
-
fix theme render order cc9095faa0
-
add ruff and ty to flake 833fbf857b
-
add ty.toml fa00d49e38
-
add ruff.toml 2d958f7837
-
support a full IBD PR run ff14496c6f
-
a7157eae99
Generate static debug.log plots during report generation
Run LogParser + PlotGenerator from bench/analyze.py during artifact copying to produce static PNG charts from debug.log files. This pre-generates the same 11 chart types that were previously rendered client-side via JavaScript. Changes to report.py:
- Import HAS_MATPLOTLIB, LogParser, PlotGenerator from bench.analyze
- _copy_network_artifacts: generate plots after each debug.log with "{network}-{name}" prefix (e.g. "450-uninstrumented-pr")
- _copy_artifacts: generate plots for single-directory mode, including when input_dir == output_dir
- _prepare_graphs_data: add "plots" key with relative paths to PNGs
- generate(): reorder to copy artifacts before HTML rendering so _prepare_graphs_data can find the generated plot files
Plot generation is guarded by HAS_MATPLOTLIB for graceful fallback when matplotlib is unavailable.
-
7da70c7856
Replace client-side debug.log charts with static images
The pr-report.html template previously included debug-log-charts.html, which fetched multi-hundred-MB debug.log.gz files in the browser, decompressed them with pako.js, parsed every line, and rendered 11 Plotly charts client-side. This made report pages unresponsive. Now that report.py pre-generates the charts as static PNGs:
- pr-report.html: replace the debug-log-charts.html include with an img loop over graph.plots, using loading="lazy"
- debug-log-charts.html: delete (344 lines of client-side JS)
- base.html: remove pako.js and Plotly CDN scripts (both are independently included by pr-chart.html and nightly-chart.html via their own script tags)
The debug.log download link is preserved.
-
08ef37ebdb
Update bench/README.md to reflect current CLI interface
Rewrite to document the TOML config + matrix entry workflow, removing stale references to the old two-commit comparison CLI, --datadir requirement, profiles, and BENCH_DATADIR env var.
-
5ea33b9476
Stop publishing debug.log.gz to gh-pages, link to CI artifacts instead
Debug logs were consuming 388MB on gh-pages. They are already uploaded as CI artifacts with 90-day retention during benchmark runs.
- Remove gzip compression and copying of debug logs in report generation
- Remove debug log extraction in publish-results workflow
- Replace per-graph "Download debug.log" links with a single link to the CI run page where artifacts can be downloaded
- Keep matplotlib plot generation from debug logs (plots are still generated during the report phase; just the raw logs aren't published)
-
be1b2d4e97
Wait for Pages deployment before commenting on PR
The PR comment with result links was posted before GitHub Pages finished deploying, leading to broken links. Add a wait-for-pages job that polls for the pages-build-deployment run matching our exact gh-pages commit, then blocks until it completes.
-
Sort PR results index numerically instead of lexicographically 81b957d755
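The fix above can be sketched as follows. This is an illustration in C++ (the actual tooling is Python, and the function name is hypothetical): numeric directory names sort by value, and non-numeric entries such as pr-main, which a later commit in this series guards against, sort after them instead of crashing the parse.

```cpp
#include <algorithm>
#include <cassert>
#include <charconv>
#include <optional>
#include <string>
#include <vector>

// Parse a directory name as a number; returns nullopt for names
// like "pr-main" that are not purely numeric.
static std::optional<long long> ParseNum(const std::string& s)
{
    long long v{};
    auto [ptr, ec] = std::from_chars(s.data(), s.data() + s.size(), v);
    if (ec != std::errc{} || ptr != s.data() + s.size()) return std::nullopt;
    return v;
}

// Numeric names sort by value (so "9" precedes "10"); non-numeric
// names sort after all numeric ones, lexicographically.
std::vector<std::string> SortPrDirs(std::vector<std::string> dirs)
{
    std::sort(dirs.begin(), dirs.end(),
              [](const std::string& a, const std::string& b) {
                  const auto na = ParseNum(a), nb = ParseNum(b);
                  if (na && nb) return *na < *nb;
                  if (na.has_value() != nb.has_value()) return na.has_value();
                  return a < b;
              });
    return dirs;
}
```

Plain lexicographic order would have put "10" before "2", which is what the commit corrects.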
-
fa5cdfcdf7
set prune height to 1_000_000MB
Previously, prune=10000 was causing flushes of the UTXO set while block pruning was taking place, resulting in logs like:
❯ zcat 32000-instrumented-pr-debug.log.gz | rg UTXO
2026-02-12T07:22:57Z * Using 31990.0 MiB for in-memory UTXO set (plus up to 286.1 MiB of unused mempool space)
2026-02-12T07:28:51Z [warning] Flushing large (2 GiB) UTXO set to disk, it may take several minutes
2026-02-12T07:33:10Z [warning] Flushing large (3 GiB) UTXO set to disk, it may take several minutes
2026-02-12T07:37:23Z [warning] Flushing large (4 GiB) UTXO set to disk, it may take several minutes
2026-02-12T07:42:03Z [warning] Flushing large (4 GiB) UTXO set to disk, it may take several minutes
2026-02-12T07:46:34Z [warning] Flushing large (5 GiB) UTXO set to disk, it may take several minutes
2026-02-12T07:51:10Z [warning] Flushing large (6 GiB) UTXO set to disk, it may take several minutes
2026-02-12T07:55:57Z [warning] Flushing large (7 GiB) UTXO set to disk, it may take several minutes
2026-02-12T08:00:35Z [warning] Flushing large (8 GiB) UTXO set to disk, it may take several minutes
2026-02-12T08:05:16Z [warning] Flushing large (8 GiB) UTXO set to disk, it may take several minutes
2026-02-12T08:10:00Z [warning] Flushing large (8 GiB) UTXO set to disk, it may take several minutes
2026-02-12T08:14:36Z [warning] Flushing large (8 GiB) UTXO set to disk, it may take several minutes
2026-02-12T08:16:47Z [warning] Flushing large (8 GiB) UTXO set to disk, it may take several minutes
and generally interrupting benchmarking. Remove this effect by setting prune to such a high value that it will never trigger. Prune is **required** to permit us to continue syncing from a pruned datadir.
-
Include prune in nightly chart series key f1dccd1b66
-
Fix numeric sort crash on pr-main directory 9cb908e6d7
-
acfa586b46
Restore debug log extraction for PNG plot generation
ec9395419a removed debug log copying to stop publishing raw logs to gh-pages (388MB). But it also removed the extraction step that makes debug logs available during report generation, so matplotlib had no input files and PNG charts silently stopped appearing in PR reports. Restore the copy of debug-logs-${network} artifacts into the results directory before report generation. The raw logs are still not committed to gh-pages; only the small pre-rendered PNGs in plots/ are.
-
use local bitcoin node to test region network issues 896efb3406
-
e402e2830c
pin nightly benchmark job order: 450 before 32000
Replace the matrix strategy with explicit sequential jobs so the 450 benchmark always runs first in a consistent cache state, and the 32000 benchmark always runs second.
-
a9d2d618b1
run fstrim before each benchmark for consistent SSD performance
Weekly fstrim.timer on the runner caused a ~25% speedup every Monday (Sunday night run). Running fstrim in the prepare script before each benchmark ensures consistent write performance regardless of when the system timer last ran. Follows the same suid wrapper pattern as drop-caches.
-
a67b1867b0
Revert "use local bitcoin node to test region network issues"
This reverts commit fd8bdf64ec654c3dbc8677a2bfffa005cbac5ed0.
-
e6def8866b
fstrim the mount point, not a subdirectory
FITRIM ioctl requires the filesystem mount point. Resolve it from the tmp_datadir path by walking up to the mount boundary.
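The mount-boundary walk can be sketched with the usual POSIX approach of comparing st_dev: the mount point is the highest ancestor still on the same device as the datadir. This is an illustration under that assumption, not the PR's actual code, and it ignores edge cases such as bind mounts.

```cpp
#include <filesystem>
#include <sys/stat.h>

namespace fs = std::filesystem;

// Walk up from `path` until the parent directory lives on a different
// device (st_dev changes), which marks the filesystem mount boundary
// that the FITRIM ioctl requires.
fs::path FindMountPoint(fs::path path)
{
    path = fs::canonical(path);
    struct stat st{};
    if (stat(path.c_str(), &st) != 0) return path;
    while (path.has_parent_path() && path != path.parent_path()) {
        struct stat parent_st{};
        if (stat(path.parent_path().c_str(), &parent_st) != 0) break;
        if (parent_st.st_dev != st.st_dev) break;  // crossed a mount boundary
        path = path.parent_path();
    }
    return path;
}
```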
-
65edcc845a
show manual nightly re-runs as scatter points on chart
Manual (workflow_dispatch) runs are now stored separately from scheduled nightly runs. Scheduled runs still dedup by (date, commit, dbcache) to handle retries. Manual runs always append, appearing as diamond markers on the chart alongside the nightly trend line. Also ruff format.
-
Compare PR benchmarks against median of last 7 nightly runs b318990cb7
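The median baseline damps one-off outliers (such as the post-fstrim speedups mentioned elsewhere in this PR) better than comparing against a single nightly run. A sketch, assuming times are kept as a time-ordered series; the function name and shape are hypothetical:

```cpp
#include <algorithm>
#include <vector>

// Median of the last (up to) `n` samples of a time-ordered series.
double MedianOfLast(std::vector<double> series, size_t n = 7)
{
    if (series.size() > n) series.erase(series.begin(), series.end() - n);
    const size_t mid = series.size() / 2;
    std::nth_element(series.begin(), series.begin() + mid, series.end());
    double m = series[mid];
    if (series.size() % 2 == 0) {
        // Even count: the first `mid` elements now hold the `mid` smallest
        // values; find their maximum and average it with series[mid].
        std::nth_element(series.begin(), series.begin() + mid - 1,
                         series.begin() + mid);
        m = (m + series[mid - 1]) / 2.0;
    }
    return m;
}
```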
-
51fd6b8e6c
merge manual nightly runs into their series on the chart
Manual (workflow_dispatch) runs no longer get a separate "(manual)" legend entry with diamond markers. They appear as regular points in the same series trace as scheduled runs.
-
8956c2a50d
Add assumevalid=0 benchmark runs to PR workflow
Adds a separate benchmark job (benchmark-noav) that runs IBD with -assumevalid=0 to measure full script verification performance. Uses a dedicated TOML config with uninstrumented-only matrix, and prefixes artifacts with noav- so the publish workflow can handle them alongside existing runs.
-
43ba5645e8
validation: collect block inputs in CoinsViewOverlay before ConnectBlock
Introduce CoinsViewOverlay::StartFetching, which maps all input prevouts of a block to a new m_inputs vector of InputToFetch elements. It returns a ResetGuard whose lifetime is bound to the block, as are the InputToFetch elements themselves. Introduce StopFetching to clear the m_inputs vector. CCoinsViewCache::Reset is made virtual and is overridden in CoinsViewOverlay. StopFetching is called on Reset, so the InputToFetch objects will not outlive the block. Introduce ProcessInput to fetch the UTXO of an individual input in m_inputs. Each caller fetches the input at m_input_head and increments it, so each call fetches the next input in the queue. Fetch coins from the m_inputs vector in FetchCoinFromBase by scanning all inputs until we find the one with the matching outpoint. This is deliberately designed so multiple threads can call ProcessInput independently.
Co-authored-by: l0rinc <pap.lorinc@gmail.com>
Co-authored-by: Hodlinator <172445034+hodlinator@users.noreply.github.com>
-
619dd706f1
coins: track last accessed input using m_input_tail
Inputs are accessed by ConnectBlock in the same order as they are created in StartFetching (excepting BIP30 checks). We can use this information, as well as the fact that CoinsViewOverlay caches coins accessed via FetchCoinFromBase, to skip scanning over previously accessed coins. Co-authored-by: l0rinc <pap.lorinc@gmail.com>
-
fb34b4be36
coins: introduce QuickHashHasher
Collapses a 32-byte Txid into a uint64_t, using 4 random uint64_ts. Used in place of a hash function as a performance improvement. Co-authored-by: Pieter Wuille <pieter@wuille.net>
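The idea can be sketched as a salted multiply-xor over the txid's four 64-bit words: a txid is already uniformly distributed SHA256 output, so a full hash is unnecessary, and per-process random keys keep the mapping unpredictable to an attacker grinding txids. This is an illustrative guess at the scheme; the PR's actual mixing may differ.

```cpp
#include <array>
#include <cstdint>

// Sketch of a QuickHashHasher-style mixer: collapse four 64-bit txid
// words into one uint64_t using four keys drawn once at startup
// (from a CSPRNG in real code; fixed here only for illustration).
struct QuickHashHasher {
    std::array<uint64_t, 4> m_keys;

    uint64_t operator()(const std::array<uint64_t, 4>& txid_words) const
    {
        uint64_t h = 0;
        for (int i = 0; i < 4; ++i) h ^= txid_words[i] * m_keys[i];
        return h;
    }
};
```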
-
1a284c6a50
coins: filter inputs spending outputs of same block in ProcessInput
This is a performance improvement, because we can skip checking on disk that the input does not exist. Co-authored-by: l0rinc <pap.lorinc@gmail.com>
-
0fb5a8c001
coins: add ready flag to InputToFetch
Prepares for ProcessInput to be called from multiple threads. This flag acts as a memory fence around InputToFetch::coin. There is no lock guarding reads and writes of the coin field. Instead we use the flag's release/acquire semantics to ensure that when the main thread reads the coin it will have happened after a worker thread has finished writing it. Co-authored-by: l0rinc <pap.lorinc@gmail.com>
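The release/acquire handoff can be sketched like this (an int stands in for the coin; the real struct lives in the coins code and the reader would do useful work instead of spinning):

```cpp
#include <atomic>
#include <thread>

// The worker writes the plain (non-atomic) coin field, then sets the
// flag with release ordering; the reader waits on the flag with
// acquire ordering before touching the coin. The release/acquire pair
// guarantees the coin write is visible to the reader, without a lock
// around `coin` itself.
struct InputToFetch {
    int coin{0};                     // plain field, no lock
    std::atomic<bool> ready{false};  // acts as the memory fence around `coin`
};

int FetchAndWait()
{
    InputToFetch input;
    std::thread worker([&] {
        input.coin = 42;                                     // 1. write coin
        input.ready.store(true, std::memory_order_release);  // 2. publish
    });
    while (!input.ready.load(std::memory_order_acquire)) {}  // 3. wait
    const int c = input.coin;                                // 4. safe to read
    worker.join();
    return c;
}
```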
-
404cbd395b
coins: stop fetching before mutating base
Prepares for ProcessInput to be called from multiple threads. ProcessInput reads from base, so for it to be safe to call in parallel on separate threads, base must not be mutated mid-fetch. Flush, Sync, and SetBackend can modify base, so we override them and call StopFetching before delegating to the base class.
Co-authored-by: l0rinc <pap.lorinc@gmail.com>
-
67fa640c74
validation: add -inputfetchthreads configuration option
Add a configuration option for the number of worker threads used for parallel UTXO input fetching during block connection. Default is 4 threads, max is 15, 0 disables parallel fetching.
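The option's range handling reduces to a clamp over the documented bounds. A minimal sketch with a hypothetical helper name, assuming an unset argument means "use the default":

```cpp
#include <algorithm>
#include <optional>

// Resolve -inputfetchthreads: default 4 when unset, maximum 15,
// and 0 disables parallel fetching entirely.
int GetInputFetchThreads(std::optional<int> configured)
{
    return std::clamp(configured.value_or(4), 0, 15);
}
```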
-
d520e55b90
coins: introduce thread pool in CoinsViewOverlay
Prepares for ProcessInput to be called from multiple threads. Introduce a ThreadPool shared pointer to CoinsViewOverlay. A pool managed externally can be passed in the constructor. A global thread pool is used in fuzz harnesses since iterations can happen faster than the OS can create and tear down thread pools. This can cause a memory leak when fuzzing. Co-authored-by: l0rinc <pap.lorinc@gmail.com>
-
9ba8f3a492
coins: fetch inputs in parallel
Leverages the thread pool to fetch inputs on multiple threads, while the overlay serves inputs on the main thread. This is a performance improvement over blocking the main thread to fetch inputs. Co-authored-by: l0rinc <pap.lorinc@gmail.com>
-
34766acfa2
doc: update CoinsViewOverlay docstring to describe parallel fetching
Co-authored-by: l0rinc <pap.lorinc@gmail.com>
-
444f7750b1
test: add unit tests for CoinsViewOverlay::StartFetching
Co-authored-by: l0rinc <pap.lorinc@gmail.com>
-
77bdb8c3e7
fuzz: update harnesses to cover CoinsViewOverlay::StartFetching
Co-authored-by: l0rinc <pap.lorinc@gmail.com>
Co-authored-by: sedited <seb.kung@gmail.com>
-
fuzz: add coins_view_stacked fuzz harness to test concurrent leveldb reads b633d6e62c
-
andrewtoth closed this on Apr 30, 2026
-
andrewtoth commented at 9:31 PM on April 30, 2026: contributor
Oops, was supposed to go to benchcoin :grimacing:
-
DrahtBot commented at 9:31 PM on April 30, 2026: contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Reviews
See the guideline for information on the review process. A summary of reviews will appear here.
Contributors