Replace current benchmarking framework with nanobench
This replaces the current benchmarking framework with nanobench [1], an MIT licensed single-header benchmarking library, of which I am the autor. This has in my opinion several advantages, especially on Linux:
fast: Running all benchmarks takes ~6 seconds instead of 4m13s on an Intel i7-8700 CPU @ 3.20GHz.
accurate: I ran e.g. the benchmark for SipHash_32b 10 times and calculate standard deviation / mean = coefficient of variation:
- 0.57% CV for old benchmarking framework
- 0.20% CV for nanobench
So the benchmark results with nanobench seem to vary less than with the old framework.
It automatically determines runtime based on clock precision, no need to specify number of evaluations.
measure instructions, cycles, branches, instructions per cycle, branch misses (only Linux, when performance counters are available)
output in markdown table format.
Warn about unstable environment (frequency scaling, turbo, ...)
For better profiling, it is possible to set the environment variable NANOBENCH_ENDLESS to force endless running of a particular benchmark without the need to recompile. This makes it to e.g. run "perf top" and look at hotspots.
Here is an example copy & pasted from the terminal output:
| ns/byte | byte/s | err% | ins/byte | cyc/byte | IPC | bra/byte | miss% | total | benchmark
|--------------------:|--------------------:|--------:|----------------:|----------------:|-------:|---------------:|--------:|----------:|:----------
| 2.52 | 396,529,415.94 | 0.6% | 25.42 | 8.02 | 3.169 | 0.06 | 0.0% | 0.03 | bench/crypto_hash.cpp RIPEMD160
| 1.87 | 535,161,444.83 | 0.3% | 21.36 | 5.95 | 3.589 | 0.06 | 0.0% | 0.02 | bench/crypto_hash.cpp SHA1
| 3.22 | 310,344,174.79 | 1.1% | 36.80 | 10.22 | 3.601 | 0.09 | 0.0% | 0.04 | bench/crypto_hash.cpp SHA256
| 2.01 | 496,375,796.23 | 0.0% | 18.72 | 6.43 | 2.911 | 0.01 | 1.0% | 0.00 | bench/crypto_hash.cpp SHA256D64_1024
| 7.23 | 138,263,519.35 | 0.1% | 82.66 | 23.11 | 3.577 | 1.63 | 0.1% | 0.00 | bench/crypto_hash.cpp SHA256_32b
| 3.04 | 328,780,166.40 | 0.3% | 35.82 | 9.69 | 3.696 | 0.03 | 0.0% | 0.03 | bench/crypto_hash.cpp SHA512