This change is part of [IBD] - Tracking PR for speeding up Initial Block Download
Summary
CheckBlock
’s latency is critical for efficiently validating correct inputs during transaction validation, including mempool acceptance and new block creation.
This PR improves performance and maintainability by introducing the following changes:
- Simplified checks for the most common cases (1 or 2 inputs - 70-90% of transactions have a single input).
- Optimized the general case by replacing
std::set
with sortedstd::vector
for improved locality. - Simplified Null prevout checks from linear to constant time.
Context, concerns
As noted by @sipa in #31682#pullrequestreview-2570273162, this is consensus-critical code (especially since we’ve had an actual inflation bug related to duplicate checking).
However, the changes here are very confined to a single function, easily reviewed, and I have a general preference for avoiding std::set
when sorted std::vector
works too as well.
The goal of this change is to make block validation faster via a low-risk alternative.
Measurements
Before:
ns/block | block/s | err% | total | benchmark |
---|---|---|---|---|
372,743.63 | 2,682.81 | 1.1% | 10.99 | CheckBlockBench |
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
3,304,694.54 | 302.60 | 0.5% | 11.05 | DuplicateInputs |
9,585.10 | 104,328.55 | 0.1% | 11.03 | ProcessTransactionBench |
After:
ns/block | block/s | err% | total | benchmark |
---|---|---|---|---|
179,971.00 | 5,556.45 | 0.3% | 11.02 | CheckBlockBench |
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
963,177.98 | 1,038.23 | 1.7% | 10.92 | DuplicateInputs |
9,410.90 | 106,259.75 | 0.3% | 11.01 | ProcessTransactionBench |
- 2.07x faster
CheckBlockBench
- 3.4x faster
DuplicateInputs
- 1.8% faster
ProcessTransactionBench
Before:
ns/block | block/s | err% | ins/block | cyc/block | IPC | bra/block | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
1,096,261.84 | 912.19 | 0.1% | 7,963,390.88 | 3,487,375.26 | 2.283 | 1,266,941.00 | 1.8% | 11.03 | CheckBlockBench |
ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
8,366,309.48 | 119.53 | 0.0% | 23,865,177.67 | 26,620,160.23 | 0.897 | 5,972,887.41 | 4.0% | 10.78 | DuplicateInputs |
56,199.57 | 17,793.73 | 0.1% | 229,263.01 | 178,766.31 | 1.282 | 15,509.97 | 0.5% | 10.91 | ProcessTransactionBench |
After:
ns/block | block/s | err% | ins/block | cyc/block | IPC | bra/block | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
834,855.94 | 1,197.81 | 0.0% | 6,518,548.86 | 2,656,039.78 | 2.454 | 919,160.84 | 1.5% | 10.78 | CheckBlockBench |
ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
4,261,492.75 | 234.66 | 0.0% | 17,379,823.40 | 13,559,793.33 | 1.282 | 4,265,714.28 | 3.4% | 11.00 | DuplicateInputs |
55,819.53 | 17,914.88 | 0.1% | 227,828.15 | 177,520.09 | 1.283 | 15,184.36 | 0.4% | 10.91 | ProcessTransactionBench |
- 31% faster
CheckBlockBench
- 2x faster
DuplicateInputs
- 0.5% faster
ProcessTransactionBench
While the point of the change isn’t necessarily to speed up IBD, but you can see the measurements at #31682 (comment)
Related PRs: