crypto: optimize SipHash `Write()` method with chunked processing #33696

pull Raimo33 wants to merge 1 commits into bitcoin:master from Raimo33:optimize-siphash-chunked changing 1 files +45 −9

Raimo33 commented at 9:06 am on October 24, 2025: contributor

reopening #33325 as draft

Summary

The current default Write() implementation of SipHash iterates over the span byte by byte. For large inputs this adds significant overhead from repeated bounds checks and span manipulation, which the compiler does not optimize away.

This PR optimizes SipHash by replacing the byte-by-byte processing in CSipHasher::Write() with a chunked approach that processes data in 8-byte aligned blocks when possible.

Details

The new implementation is divided into 3 stages, which process:

1. initial unaligned bytes to reach an 8-byte boundary
2. aligned 8-byte chunks directly using memcpy for efficiency
3. remaining bytes at the end
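
The three stages can be sketched as follows. This is a minimal, self-contained illustration and not the code from this PR: the class `SketchSipHasher`, its methods, and the helper `ChunkedMatchesBytewise` are hypothetical names, and it assumes that "8-byte boundary" refers to the hasher's internal byte count rather than to the memory address of the input.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Illustrative SipHash-2-4 hasher. All names are hypothetical; not the PR's code.
class SketchSipHasher {
    std::uint64_t v[4];
    std::uint64_t tmp = 0;   // bytes of the current, partially filled 8-byte word
    std::uint8_t count = 0;  // total bytes written so far, modulo 256

    static std::uint64_t Rotl(std::uint64_t x, int b) { return (x << b) | (x >> (64 - b)); }

    void Round() {
        v[0] += v[1]; v[1] = Rotl(v[1], 13); v[1] ^= v[0]; v[0] = Rotl(v[0], 32);
        v[2] += v[3]; v[3] = Rotl(v[3], 16); v[3] ^= v[2];
        v[0] += v[3]; v[3] = Rotl(v[3], 21); v[3] ^= v[0];
        v[2] += v[1]; v[1] = Rotl(v[1], 17); v[1] ^= v[2]; v[2] = Rotl(v[2], 32);
    }

    // Absorb one full little-endian 8-byte word (two rounds, as in SipHash-2-4).
    void Compress(std::uint64_t m) { v[3] ^= m; Round(); Round(); v[0] ^= m; }

    void AddByte(unsigned char byte) {
        tmp |= std::uint64_t{byte} << (8 * (count & 7));
        count = static_cast<std::uint8_t>(count + 1);
        if ((count & 7) == 0) { Compress(tmp); tmp = 0; }
    }

    // memcpy-based load, byte-swapped on big-endian hosts so the result is little-endian.
    static std::uint64_t LoadLE64(const unsigned char* p) {
        std::uint64_t x;
        std::memcpy(&x, p, 8);
        const std::uint16_t probe = 1;
        unsigned char low;
        std::memcpy(&low, &probe, 1);
        if (low == 1) return x;  // little-endian host: nothing to do
        std::uint64_t r = 0;
        for (int i = 0; i < 8; ++i) { r = (r << 8) | (x & 0xff); x >>= 8; }
        return r;
    }

public:
    SketchSipHasher(std::uint64_t k0, std::uint64_t k1)
        : v{0x736f6d6570736575ULL ^ k0, 0x646f72616e646f6dULL ^ k1,
            0x6c7967656e657261ULL ^ k0, 0x7465646279746573ULL ^ k1} {}

    // Baseline: one byte at a time.
    void WriteBytewise(const unsigned char* data, std::size_t len) {
        for (std::size_t i = 0; i < len; ++i) AddByte(data[i]);
    }

    void WriteChunked(const unsigned char* data, std::size_t len) {
        // Stage 1: top up the partially filled word until count is a multiple of 8.
        while (len > 0 && (count & 7) != 0) { AddByte(*data++); --len; }
        // Stage 2: absorb whole 8-byte words directly (tmp is empty here).
        while (len >= 8) {
            Compress(LoadLE64(data));
            data += 8;
            len -= 8;
            count = static_cast<std::uint8_t>(count + 8);
        }
        // Stage 3: buffer the remaining 0..7 bytes.
        while (len > 0) { AddByte(*data++); --len; }
    }

    std::uint64_t Finalize() const {
        SketchSipHasher h = *this;  // finalize a copy so writing can continue
        h.Compress(tmp | (std::uint64_t{count} << 56));
        h.v[2] ^= 0xff;
        for (int i = 0; i < 4; ++i) h.Round();
        return h.v[0] ^ h.v[1] ^ h.v[2] ^ h.v[3];
    }
};

// Hashes data[0..len) both ways, feeding the chunked hasher in two pieces split at
// `split` so that stage 1 is exercised; returns true if the digests agree.
bool ChunkedMatchesBytewise(const unsigned char* data, std::size_t len, std::size_t split) {
    SketchSipHasher a(1, 2), b(1, 2);
    a.WriteBytewise(data, len);
    b.WriteChunked(data, split);
    b.WriteChunked(data + split, len - split);
    return a.Finalize() == b.Finalize();
}
```

Both write paths must produce the same digest for every input length and every split point, which is what `ChunkedMatchesBytewise` checks. Whether stage 2 is measurably faster depends on the compiler and on typical input sizes, so it would need benchmarking on the target workload.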

Every change was thoroughly tested and benchmarked to avoid overfitting, but replication is welcome and encouraged.

Benchmarks

taskset -c 1 ./bin/bench_bitcoin -filter="(GCSFilterConstruct)" --min-time=60000

Before:

ns/op	op/s	err%	total	benchmark
12,983,090.72	77.02	0.1%	66.00	`GCSFilterConstruct`

After:

ns/op	op/s	err%	total	benchmark
11,155,751.42	89.64	0.1%	65.99	`GCSFilterConstruct`

Compared to master:

GCSFilterConstruct: about 16% faster (77.02 → 89.64 op/s)
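
For anyone replicating the numbers, the percentage can be derived from the ns/op columns in two ways. The sketch below uses hypothetical helper names. With the figures reported above, the throughput gain comes out at roughly 16.4%, while the time per operation drops by roughly 14.1%.

```cpp
// Hypothetical helpers for comparing two ns/op measurements.

// Relative increase in operations per second, in percent.
double ThroughputGainPercent(double before_ns, double after_ns) {
    return (before_ns / after_ns - 1.0) * 100.0;
}

// Relative reduction in time per operation, in percent.
double TimeReductionPercent(double before_ns, double after_ns) {
    return (1.0 - after_ns / before_ns) * 100.0;
}
```

The "+16%" figure corresponds to the first definition (op/s), so it should not be read as "16% less time".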

DrahtBot added the label Utils/log/libs on Oct 24, 2025
DrahtBot commented at 9:06 am on October 24, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33696.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept NACK	dergoegge, l0rinc

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

crypto: optimize SipHash Write() method with chunked processing

Replace byte-by-byte processing in CSipHasher::Write() with an optimized
chunked approach that processes data in 8-byte aligned blocks when possible.

./bin/bench_bitcoin -filter="(GCSFilterConstruct)" --min-time=60000

before:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|       12,983,090.72 |               77.02 |    0.1% |     66.00 | `GCSFilterConstruct`

after:

|               ns/op |                op/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|       11,155,751.42 |               89.64 |    0.1% |     65.99 | `GCSFilterConstruct`

ca57c201f2

Raimo33 force-pushed on Oct 24, 2025
dergoegge commented at 9:13 am on October 24, 2025: member

Concept NACK

For a small gain in the GCSFilterConstruct benchmark this is not worth the extra complexity and review overhead.
l0rinc commented at 9:18 am on October 24, 2025: contributor

Why are you reopening a nacked PR?
Raimo33 commented at 9:35 am on October 24, 2025: contributor

Why are you reopening a nacked PR?

Reopened as WIP. I just need to make the diff simpler. You were the only NACK. It's still conceptually valid imo to process blocks in 8-byte chunks.
l0rinc commented at 9:55 am on October 24, 2025: contributor

Why are you doing that? Why not fix it in the original PR to retain context? NACK, please use the original PR for this change, seems very weird to open a new PR when you don’t like the feedback…
achow101 closed this on Oct 24, 2025

Contributors
Raimo33 DrahtBot dergoegge l0rinc

Labels
Utils/log/libs
