contrib: USDT-based TxGraph tracing pipeline
Summary
Add a complete tracing pipeline for TxGraph based on USDT tracepoints:
- 27 USDT tracepoints covering every public TxGraph API (mutations, staging, queries, maintenance)
- BCC recording script that attaches to all tracepoints via eBPF and writes a binary TXGTRACE trace file
- Analysis script that parses trace files and reports cluster size distribution, chain-shaped topology classification, and dependency statistics
- Replay tool that reads trace files, reconstructs TxGraph API calls, and reports per-entry-point timing — used to compare TxGraph performance across different implementations
Motivation
When optimizing TxGraph internals, we need tools to:
- Capture real workloads: record the exact sequence of TxGraph operations from a live mainnet node
- Understand mempool structure: analyze cluster topology (size distribution, chain vs non-chain, dependency density)
- Compare implementations: replay the same trace on different branches for fair, reproducible performance comparison
The USDT-based approach requires no wrapper class and has negligible overhead when no tracer is attached (one branch check per tracepoint, predicted not-taken).
Commits
1. txgraph: add USDT tracepoints for all TxGraph API operations
Adds 27 USDT tracepoints to src/txgraph.cpp and one txgraph:init tracepoint in src/txmempool.cpp (at the MakeTxGraph call site).
Variable-length operations (get_ancestors_union, get_descendants_union, count_distinct_clusters) use TRACEPOINT_ACTIVE to conditionally build a fixed-size stack buffer of indices, passed as a pointer for eBPF to read via bpf_probe_read_user.
All arguments are explicitly cast to 64-bit types to work around a BCC 0.31.0 bug where bpf_usdt_readarg fails to read sub-8-byte stack arguments (4@offset(%rsp) descriptors always return zero; only 8@ descriptors work correctly).
2. contrib: add BCC script to record TxGraph traces via USDT
contrib/tracing/txgraph/txgraph_trace_recorder.py — a BCC Python script that attaches to all 27 tracepoints and writes events to a TXGTRACE binary file.
Each eBPF handler is generated as a standalone C function from Python data definitions (BCC does not allow its builtins inside C macro expansions).
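The generation step can be sketched roughly as follows. This is a minimal illustration, not the recorder's actual code: the tracepoint names, argument counts, `event_t` struct, and `events` perf buffer are all assumptions made for the example. It only shows how plain Python string formatting can emit one standalone C function per tracepoint, so that no BCC builtin ends up inside a C macro.

```python
# Hypothetical tracepoint table: name -> number of 64-bit arguments.
# The real recorder's names, argument counts, and event layout may differ.
TRACEPOINTS = {
    "add_transaction": 3,
    "remove_transaction": 1,
    "commit_staging": 0,
}

# One complete C function per tracepoint; event_t and events are assumed
# to be declared elsewhere in the eBPF program text.
HANDLER_TEMPLATE = """\
int trace_{name}(struct pt_regs *ctx) {{
    struct event_t ev = {{}};
    ev.op = {op_id};
{reads}
    events.perf_submit(ctx, &ev, sizeof(ev));
    return 0;
}}
"""


def generate_handlers(tracepoints):
    """Return C source with one standalone handler function per tracepoint."""
    parts = []
    for op_id, (name, nargs) in enumerate(tracepoints.items()):
        # USDT argument indices are 1-based in bpf_usdt_readarg.
        reads = "\n".join(
            f"    bpf_usdt_readarg({i + 1}, ctx, &ev.args[{i}]);"
            for i in range(nargs)
        )
        parts.append(HANDLER_TEMPLATE.format(name=name, op_id=op_id, reads=reads))
    return "\n".join(parts)


source = generate_handlers(TRACEPOINTS)
```

In a BCC script, text like `source` would typically be appended to the struct and perf-buffer declarations and passed to `BPF(text=..., usdt_contexts=[...])`, with each probe enabled through `USDT.enable_probe(probe=..., fn_name=...)`.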
For complete traces, bitcoind should be started with the TXGRAPH_WAIT_FOR_TRACER=1 environment variable set.
3. contrib: add trace analysis script for cluster topology
contrib/tracing/txgraph/analyze_trace.py — parses TXGTRACE files and reports peak/final mempool state, cluster size distribution, chain-shaped cluster classification, and edge sanity checks. Correctly handles staged mutations.
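As an illustration of the chain classification, the sketch below treats a cluster as chain-shaped when its transactions form a single linear path, meaning every transaction has at most one in-cluster parent and at most one in-cluster child. That definition, and the `is_chain` and `size_table` helpers, are assumptions for this example; the script's actual criteria may differ (for instance in how redundant dependencies are treated).

```python
from collections import Counter


def is_chain(txs, edges):
    """Return True if the cluster is a single linear path of transactions.

    txs: iterable of transaction ids in one cluster.
    edges: iterable of (parent, child) dependency pairs inside that cluster.
    Assumes the dependency graph is acyclic, as a mempool's is.
    """
    txs = set(txs)
    edges = set(edges)  # drop duplicate dependencies
    children = Counter(parent for parent, _ in edges)
    parents = Counter(child for _, child in edges)
    if any(count > 1 for count in children.values()):
        return False  # some transaction has more than one child
    if any(count > 1 for count in parents.values()):
        return False  # some transaction has more than one parent
    # Acyclic with in/out degree <= 1 means a union of paths;
    # exactly len(txs) - 1 edges means it is one path.
    return len(edges) == len(txs) - 1


def size_table(clusters):
    """Tally clusters by size into {size: (total_clusters, chain_clusters)}."""
    totals, chains = Counter(), Counter()
    for txs, edges in clusters:
        size = len(set(txs))
        totals[size] += 1
        chains[size] += is_chain(txs, edges)
    return {size: (totals[size], chains[size]) for size in sorted(totals)}
```

Under this definition a singleton cluster counts as a chain, which matches the 100% chain rate reported for size-1 clusters in the sample output.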
4. contrib: add txgraph-replay trace replay tool
contrib/tracing/txgraph/txgraph_replay.cpp — standalone C++ tool. Replays all operations from a trace file, timing query/trigger operations while leaving mutations untimed. Reports per-entry-point statistics.
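The timing policy can be illustrated with a small Python sketch (the tool itself is C++). The `ReplayStats` class and the contents of `TIMED_OPS` are assumptions for this example; the three operation names are simply the ones that appear in the sample output, and the real tool's set of timed entry points may be larger.

```python
import time
from collections import defaultdict

# Hypothetical split; the real tool decides which entry points are timed.
TIMED_OPS = {"DoWork", "GetMainMemoryUsage", "CommitStaging"}


class ReplayStats:
    """Accumulate per-entry-point call counts and elapsed time."""

    def __init__(self, clock=time.perf_counter_ns):
        self.clock = clock  # injectable so the timing can be tested
        self.calls = defaultdict(int)
        self.total_ns = defaultdict(int)

    def run(self, op, fn):
        """Execute one replayed call, timing it only if op is a timed entry point."""
        if op not in TIMED_OPS:
            return fn()  # mutations run untimed
        start = self.clock()
        result = fn()
        self.total_ns[op] += self.clock() - start
        self.calls[op] += 1
        return result

    def report(self):
        """Return rows of (operation, calls, total_us, avg_us)."""
        rows = []
        for op in sorted(self.calls):
            total_us = self.total_ns[op] / 1000
            rows.append((op, self.calls[op], total_us, total_us / self.calls[op]))
        return rows
```

Keeping mutations outside the timed region means the reported totals reflect only the work done at the measured entry points, which is what makes results comparable across branches replaying the same trace.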
Built separately from Bitcoin Core via contrib/tracing/txgraph/build_replay.sh, which links against pre-built Bitcoin Core libraries. No changes to Bitcoin Core's CMake build system are required.
When started with TXGRAPH_WAIT_FOR_TRACER=1, bitcoind waits for a tracer to attach before mempool initialization (with a 2-second grace period for the BCC perf buffer to become ready), ensuring the trace captures every event from the start.
Usage
# Build Bitcoin Core (standard build, no special flags needed)
cmake -B build
cmake --build build -j$(nproc)
# Build txgraph-replay tool separately
contrib/tracing/txgraph/build_replay.sh
# Start bitcoind (with env var to wait for tracer)
TXGRAPH_WAIT_FOR_TRACER=1 bitcoind -datadir=...
# Record trace (in another terminal)
sudo python3 contrib/tracing/txgraph/txgraph_trace_recorder.py \
-p $(pidof bitcoind) -o trace.bin
# Analyze cluster topology
python3 contrib/tracing/txgraph/analyze_trace.py trace.bin
# Replay for performance comparison
./build/bin/txgraph-replay trace.bin
Sample output
analyze_trace.py
Parameters: max_cluster_count=64 max_cluster_size=404000 acceptable_cost=75000
Processed 78223 operations, 9736 CommitStagings
Peak mempool size: 9688 transactions (at op #74231)
Peak state (9688 transactions)
9688 transactions, 3306 clusters
Size   Clusters   Chains   Non-chain   Chain%
1      2723       2723     0           100.0%
2      212        212      0           100.0%
25     207        207      0           100.0%
TOTAL  3306       3266     40          98.8%
txgraph-replay
=== TxGraph Replay Summary ===
Total ops replayed: 78222
Timed entry points:
Operation            Calls   Total (us)   Avg (us)
DoWork                9737       115647      11.88
GetMainMemoryUsage   19482            9       0.00
CommitStaging         9736           14       0.00
TOTAL                58441       115798