CheckBlock
’s latency is critical for efficiently validating correct inputs during transaction validation, including mempool acceptance and new block creation.
This PR improves performance and maintainability by introducing the following changes:
- Simplified checks for the most common cases (1 or 2 inputs - 70-90% of transactions have a single input).
- Optimized the general case by replacing
std::set
with sortedstd::vector
for improved locality. - Simplified Null prevout checks from linear to constant time.
The goal of this change is to make block validation faster via a low-risk alternative.
Before:
ns/block | block/s | err% | total | benchmark |
---|---|---|---|---|
372,743.63 | 2,682.81 | 1.1% | 10.99 | CheckBlockBench |
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
3,304,694.54 | 302.60 | 0.5% | 11.05 | DuplicateInputs |
9,585.10 | 104,328.55 | 0.1% | 11.03 | ProcessTransactionBench |
After:
ns/block | block/s | err% | total | benchmark |
---|---|---|---|---|
179,971.00 | 5,556.45 | 0.3% | 11.02 | CheckBlockBench |
ns/op | op/s | err% | total | benchmark |
---|---|---|---|---|
963,177.98 | 1,038.23 | 1.7% | 10.92 | DuplicateInputs |
9,410.90 | 106,259.75 | 0.3% | 11.01 | ProcessTransactionBench |
- 2.07x faster
CheckBlockBench
- 3.4x faster
DuplicateInputs
- 1.8% faster
ProcessTransactionBench
Before:
ns/block | block/s | err% | ins/block | cyc/block | IPC | bra/block | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
1,096,261.84 | 912.19 | 0.1% | 7,963,390.88 | 3,487,375.26 | 2.283 | 1,266,941.00 | 1.8% | 11.03 | CheckBlockBench |
ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
8,366,309.48 | 119.53 | 0.0% | 23,865,177.67 | 26,620,160.23 | 0.897 | 5,972,887.41 | 4.0% | 10.78 | DuplicateInputs |
56,199.57 | 17,793.73 | 0.1% | 229,263.01 | 178,766.31 | 1.282 | 15,509.97 | 0.5% | 10.91 | ProcessTransactionBench |
After:
ns/block | block/s | err% | ins/block | cyc/block | IPC | bra/block | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
834,855.94 | 1,197.81 | 0.0% | 6,518,548.86 | 2,656,039.78 | 2.454 | 919,160.84 | 1.5% | 10.78 | CheckBlockBench |
ns/op | op/s | err% | ins/op | cyc/op | IPC | bra/op | miss% | total | benchmark |
---|---|---|---|---|---|---|---|---|---|
4,261,492.75 | 234.66 | 0.0% | 17,379,823.40 | 13,559,793.33 | 1.282 | 4,265,714.28 | 3.4% | 11.00 | DuplicateInputs |
55,819.53 | 17,914.88 | 0.1% | 227,828.15 | 177,520.09 | 1.283 | 15,184.36 | 0.4% | 10.91 | ProcessTransactionBench |
- 31% faster
CheckBlockBench
- 2x faster
DuplicateInputs
- 0.5% faster
ProcessTransactionBench
While the point of the change isn’t necessarily to speed up IBD, but you can see the measurements at #31682 (comment)
Related PRs: