Profiling BlockAssemblerAddPackageTxns indicated excessive hash recalculations:
<img src="https://github.com/bitcoin/bitcoin/assets/1841944/e1687162-30b5-47b6-83b3-d234cd156395">
The original implementation used a set to manage the stack of transactions for a depth-first traversal of its child transactions. However, since we're already filtering via setDescendants before adding them to stage, duplicates cannot exist anyway. Therefore, we can change it to a simple vector, providing efficient push and pop operations, eliminating the need for redundant duplicate checks and hash calculations.
Benchmarking BlockAssemblerAddPackageTxns locally an on CI reveals a significant speed increase - please validate it on your machine!
./src/bench/bench_bitcoin --filter='BlockAssemblerAddPackageTxns' --min-time=1000
before:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 377,256,875.00 | 2.65 | 0.2% | 4.15 | BlockAssemblerAddPackageTxns
after:
| ns/op | op/s | err% | total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
| 275,087,833.00 | 3.64 | 0.1% | 3.02 | BlockAssemblerAddPackageTxns