This adds cycle min/max/avg to the statistics.
Supported on x86 and x86_64 (natively through rdtsc), as well as for some other architectures on Linux (perf syscall). Will just show 0 on unsupported platforms.
Was tested on x86_64 and AARCH64.
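For readers who want to see the shape of such a counter, here is a minimal sketch (not the code from this pull request) of the three cases described above: rdtsc on x86/x86_64, the perf syscall on other Linux architectures, and 0 everywhere else. The name `ReadCycles` is hypothetical, and the x86 branch assumes GCC or Clang, which provide `__rdtsc` in `<x86intrin.h>`.

```cpp
#include <cstdint>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h>

// x86 / x86_64: read the time stamp counter directly from user space.
static uint64_t ReadCycles()
{
    return __rdtsc();
}

#elif defined(__linux__)
#include <cstring>
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>

// Other Linux architectures: ask the kernel for a hardware cycle counter
// covering the calling thread. Returns a file descriptor, or -1 on failure.
static int OpenCycleCounter()
{
    perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_HARDWARE;
    attr.size = sizeof(attr);
    attr.config = PERF_COUNT_HW_CPU_CYCLES;
    attr.exclude_kernel = 1; // count user-space cycles only
    attr.exclude_hv = 1;
    return static_cast<int>(syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0));
}

static uint64_t ReadCycles()
{
    static const int fd = OpenCycleCounter(); // opened once, on first use
    uint64_t value = 0;
    if (fd < 0 || read(fd, &value, sizeof(value)) != sizeof(value)) return 0;
    return value;
}

#else

// Unsupported platform: always report 0.
static uint64_t ReadCycles()
{
    return 0;
}

#endif
```

A caller would take one reading before and one after the measured code and subtract them. On the perf path each reading costs a `read` syscall, and the counter may be unavailable if the kernel restricts perf events, in which case this sketch falls back to 0.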
This adds cycle min/max/avg to the statistics.
Supported on x86 and x86_64 (natively through rdtsc), as well as Linux
(perf syscall).
Tested ACK (OSX) 3532818
Result with -O2 on OSX (2.6 GHz Intel Core i7)
#Benchmark,count,min,max,average,min_cycles,max_cycles,average_cycles
Base58CheckEncode,229376,0.000003975452273,0.000005225097993,0.000004511387877,10312,13553,11703
Base58Decode,851968,0.000001059088390,0.000001410629920,0.000001215919978,2747,3659,3154
Base58Encode,327680,0.000002935426892,0.000003485380148,0.000003217919584,7617,9042,8347
CCoinsCaching,90112,0.000009148381650,0.000012961449102,0.000011695591225,23730,33622,30338
CoinSelection,416,0.002168059349060,0.002760812640190,0.002422756873644,5623936,7161456,6284967
DeserializeAndCheckBlockTest,72,0.013411879539490,0.015962481498718,0.014648040135701,34790547,41406927,37996977
DeserializeBlockTest,88,0.010604500770569,0.012940049171448,0.011376557025042,27508165,33566117,29512372
LockedPool,512,0.001808419823647,0.003033317625523,0.002045841421932,4691069,7868411,5307190
MempoolEviction,15360,0.000059797894210,0.000087032094598,0.000065140891820,155116,225747,168975
RIPEMD160,384,0.002612933516502,0.002894565463066,0.002725711092353,6777939,7508452,7070852
RollingBloom-refresh,1,0.000611000000000,0.000611000000000,0.000611000000000
RollingBloom-refresh,1,0.000105000000000,0.000105000000000,0.000105000000000
RollingBloom-refresh,1,0.000101000000000,0.000101000000000,0.000101000000000
RollingBloom-refresh,1,0.000097000000000,0.000097000000000,0.000097000000000
RollingBloom-refresh,1,0.000113000000000,0.000113000000000,0.000113000000000
RollingBloom-refresh,1,0.000096000000000,0.000096000000000,0.000096000000000
RollingBloom-refresh,1,0.000096000000000,0.000096000000000,0.000096000000000
RollingBloom-refresh,1,0.000099000000000,0.000099000000000,0.000099000000000
RollingBloom-refresh,1,0.000096000000000,0.000096000000000,0.000096000000000
RollingBloom-refresh,1,0.000108000000000,0.000108000000000,0.000108000000000
RollingBloom-refresh,1,0.000128000000000,0.000128000000000,0.000128000000000
RollingBloom-refresh,1,0.000094000000000,0.000094000000000,0.000094000000000
RollingBloom-refresh,1,0.000151000000000,0.000151000000000,0.000151000000000
RollingBloom-refresh,1,0.000095000000000,0.000095000000000,0.000095000000000
RollingBloom-refresh,1,0.000106000000000,0.000106000000000,0.000106000000000
RollingBloom-refresh,1,0.000124000000000,0.000124000000000,0.000124000000000
RollingBloom-refresh,1,0.000115000000000,0.000115000000000,0.000115000000000
RollingBloom-refresh,1,0.000100000000000,0.000100000000000,0.000100000000000
RollingBloom-refresh,1,0.000100000000000,0.000100000000000,0.000100000000000
RollingBloom-refresh,1,0.000117000000000,0.000117000000000,0.000117000000000
RollingBloom-refresh,1,0.000101000000000,0.000101000000000,0.000101000000000
RollingBloom-refresh,1,0.000111000000000,0.000111000000000,0.000111000000000
RollingBloom,1310720,0.000000795478627,0.000000927659130,0.000000840967550,2063,2406,2181
SHA1,512,0.001935496926308,0.002218931913376,0.002032823860645,5020685,5755989,5273419
SHA256,208,0.004498481750488,0.005540251731873,0.004966990305827,11667956,14371842,12885041
SHA256_32b,4,0.345051527023315,0.346106529235840,0.345579028129578,895062320,897798922,896430621
SHA512,352,0.002845406532288,0.003299534320831,0.003069994124499,7380971,8558912,7963958
SipHash_32b,30,0.033124923706055,0.037207484245300,0.035290129979451,85925534,96516794,91547269
Sleep100ms,10,0.100992441177368,0.104498505592346,0.102697491645813,261974287,271068521,266396862
Trig,67108864,0.000000014460568,0.000000015428895,0.000000014972940,37,40,38
VerifyScriptBench,5632,0.000182222574949,0.000207984820008,0.000195238654586,472678,539492,506447
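As a quick sanity check on how the cycle columns relate to the time columns, dividing a row's cycle count by its elapsed time should give roughly the counter frequency. The helper below is a hypothetical illustration and not part of the benchmark framework.

```cpp
#include <cstdint>

// Implied counter frequency in Hz: cycles elapsed divided by seconds elapsed.
// Returns 0 when the elapsed time is not positive.
static double ImpliedFrequencyHz(uint64_t cycles, double seconds)
{
    return seconds > 0.0 ? static_cast<double>(cycles) / seconds : 0.0;
}
```

Taking the Sleep100ms averages from the table above, `ImpliedFrequencyHz(266396862, 0.102697491645813)` comes out at about 2.59e9, which is close to the stated 2.6 GHz clock.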
@laanwj I was playing around with this type of timing earlier and read that I should be wary of rdtsc getting reordered with respect to other instructions and that if you can't use rdtscp instead, then you should add a serializing instruction first like cpuid. Also do you not have any issues with the thread migrating to another core? I had to set cpu affinity.
I couldn't find where I was reading all that, but here is one link: http://blog.regehr.org/archives/330
> I was playing around with this type of timing earlier and read that I should be wary of rdtsc getting reordered with respect to other instructions
Yes, both the compiler and the CPU pipeline may reorder it. In this specific case it's not too bad, though, because the call is already from a function (State::KeepRunning) called inside the benchmark. So there is quite some overhead already, making reordering by a few instructions probably unnoticeable in the noise.
> and that if you can't use rdtscp instead, then you should add a serializing instruction first like cpuid.

I didn't know that. However, rdtscp does not seem to be available on all x86 processors, so I'll leave that as a future improvement.
x86 is already precise and low-overhead compared to the ARM path which has to do a syscall (the instructions aren't available to user-space).
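For reference, the "serializing instruction first" variant mentioned above could be sketched as follows. This is only an illustration of the idea, not something this pull request implements. The names `SerializeWithCpuid` and `ReadCyclesSerialized` are hypothetical, the inline assembly assumes GCC or Clang on x86/x86_64, and other platforms simply return 0.

```cpp
#include <cstdint>

#if (defined(__i386__) || defined(__x86_64__)) && (defined(__GNUC__) || defined(__clang__))
#include <x86intrin.h>

// Execute cpuid (leaf 0) purely for its serializing side effect. The
// "memory" clobber and volatile qualifier also keep the compiler from
// moving memory accesses across this point.
static inline void SerializeWithCpuid()
{
    uint32_t eax = 0, ebx, ecx = 0, edx;
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
}

// Read the time stamp counter only after earlier instructions have completed.
static uint64_t ReadCyclesSerialized()
{
    SerializeWithCpuid();
    return __rdtsc();
}

#else

// Not x86, or an unknown compiler: no cycle counter in this sketch.
static uint64_t ReadCyclesSerialized()
{
    return 0;
}

#endif
```

The trade-off is that cpuid is itself slow (and slower still under virtualization), so it adds a fixed cost to every reading that matters most for very short benchmarks.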
> Also do you not have any issues with the thread migrating to another core? I had to set cpu affinity.

Indeed, running the benchmark pinned to a single core, e.g. with taskset -c 0 bench_bitcoin, will likely give more precise cycle measurements.
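The same pinning can also be done from inside the process. The sketch below is Linux-only, assumes glibc's `sched_setaffinity` interface, and uses a hypothetical helper name; it is not part of this pull request and does not apply to OSX, where it just reports failure.

```cpp
#if defined(__linux__)
#include <sched.h>

// Pin the calling thread to the first CPU it is currently allowed to run on.
// Returns that CPU's index, or -1 on failure.
static int PinToFirstAllowedCpu()
{
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof(allowed), &allowed) != 0) return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &allowed)) {
            cpu_set_t target;
            CPU_ZERO(&target);
            CPU_SET(cpu, &target);
            return sched_setaffinity(0, sizeof(target), &target) == 0 ? cpu : -1;
        }
    }
    return -1;
}
#else
// No portable equivalent in this sketch: report failure.
static int PinToFirstAllowedCpu()
{
    return -1;
}
#endif
```

Calling this once before the benchmarks start keeps all cycle readings on one core. It picks the first allowed CPU rather than hard-coding CPU 0, because CPU 0 may be excluded by a container or cpuset.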
Running on OSX (3.4GHz i7)
#Benchmark,count,min,max,average,min_cycles,max_cycles,average_cycles
Base58CheckEncode,262144,0.000003828128683,0.000004008295946,0.000003908591680,12985,13597,13260
Base58Decode,983040,0.000000999269105,0.000001138963853,0.000001040822341,3389,3863,3530
Base58Encode,425984,0.000002385859261,0.000002805812983,0.000002487964454,8093,9517,8440
CCoinsCaching,106496,0.000009394483641,0.000010205199942,0.000010016403394,31871,34618,33978
CoinSelection,480,0.002068780362606,0.002542287111282,0.002170727153619,7017890,8624174,7364277
DeserializeAndCheckBlockTest,96,0.010975986719131,0.011605978012085,0.011256289978822,37233600,39370829,38184392
DeserializeBlockTest,112,0.009186625480652,0.010600864887238,0.009532500590597,31163698,35960873,32339308
LockedPool,640,0.001598000526428,0.001764506101608,0.001673145219684,5420783,5985758,5675764
MempoolEviction,14336,0.000070976559073,0.000092454254627,0.000074292616253,240772,313630,252040
RIPEMD160,416,0.002447441220284,0.002628095448017,0.002492115474664,8302338,8915072,8453936
RollingBloom-refresh,1,0.000569000000000,0.000569000000000,0.000569000000000
RollingBloom-refresh,1,0.000112000000000,0.000112000000000,0.000112000000000
RollingBloom-refresh,1,0.000121000000000,0.000121000000000,0.000121000000000
RollingBloom-refresh,1,0.000109000000000,0.000109000000000,0.000109000000000
RollingBloom-refresh,1,0.000109000000000,0.000109000000000,0.000109000000000
RollingBloom-refresh,1,0.000112000000000,0.000112000000000,0.000112000000000
RollingBloom-refresh,1,0.000114000000000,0.000114000000000,0.000114000000000
RollingBloom-refresh,1,0.000112000000000,0.000112000000000,0.000112000000000
RollingBloom-refresh,1,0.000114000000000,0.000114000000000,0.000114000000000
RollingBloom-refresh,1,0.000115000000000,0.000115000000000,0.000115000000000
RollingBloom-refresh,1,0.000125000000000,0.000125000000000,0.000125000000000
RollingBloom-refresh,1,0.000111000000000,0.000111000000000,0.000111000000000
RollingBloom-refresh,1,0.000108000000000,0.000108000000000,0.000108000000000
RollingBloom-refresh,1,0.000113000000000,0.000113000000000,0.000113000000000
RollingBloom-refresh,1,0.000116000000000,0.000116000000000,0.000116000000000
RollingBloom-refresh,1,0.000114000000000,0.000114000000000,0.000114000000000
RollingBloom-refresh,1,0.000109000000000,0.000109000000000,0.000109000000000
RollingBloom-refresh,1,0.000116000000000,0.000116000000000,0.000116000000000
RollingBloom-refresh,1,0.000149000000000,0.000149000000000,0.000149000000000
RollingBloom-refresh,1,0.000113000000000,0.000113000000000,0.000113000000000
RollingBloom-refresh,1,0.000114000000000,0.000114000000000,0.000114000000000
RollingBloom-refresh,1,0.000112000000000,0.000112000000000,0.000112000000000
RollingBloom-refresh,1,0.000111000000000,0.000111000000000,0.000111000000000
RollingBloom-refresh,1,0.000112000000000,0.000112000000000,0.000112000000000
RollingBloom,1441792,0.000000725554855,0.000000796730092,0.000000755561731,2465,2702,2563
SHA1,576,0.001806784421206,0.001858018338680,0.001830946240160,6129109,6302896,6211558
SHA256,240,0.004244878888130,0.004624485969543,0.004340062538783,14399626,15688224,14722663
SHA256_32b,4,0.300903081893921,0.302513957023621,0.301708519458771,1020885060,1026209988,1023547524
SHA512,384,0.002580255270004,0.002831816673279,0.002633192886909,8752964,9606202,8932496
SipHash_32b,28,0.036777973175049,0.037643551826477,0.037094610077994,124759771,127697052,125844822
Sleep100ms,10,0.102699518203735,0.104491472244263,0.103782296180725,348388267,354464246,352058326
Trig,67108864,0.000000015213971,0.000000016026718,0.000000015468853,51,54,52
VerifyScriptBench,6144,0.000170339830220,0.000214513391256,0.000176954781637,577836,727691,600278
@fanquake: did you compile with -O2 or -O0 (--enable-debug)?
It looks like RollingBloom-refresh bench is not changed to the new output format.
Yes, those lines are "faked": they don't go through the framework, so the cycle counts are missing there and will only be shown for the total. That's not a big deal, though at some point there will need to be a proper solution for nested benchmarks that doesn't involve printing from inside the benchmarked code. Not in this pull though.
Options used to compile and link:
debug enabled = yes
target os = darwin
build os = darwin
CC = /usr/local/bin/ccache gcc
CFLAGS = -g -O2 -g3 -O0
CPPFLAGS = -Qunused-arguments -DDEBUG -DDEBUG_LOCKORDER -DHAVE_BUILD_INFO -D__STDC_FORMAT_MACROS -I/usr/local/opt/berkeley-db4/include -DMAC_OSX
CXX = /usr/local/bin/ccache g++ -std=c++11
CXXFLAGS = -g -O2 -g3 -O0 -Wall -Wextra -Wformat -Wformat-security -Wno-unused-parameter -Wno-self-assign -Wno-unused-local-typedef -Wno-deprecated-register
LDFLAGS = -Wl,-headerpad_max_install_names -Wl,-dead_strip
#Benchmark,count,min,max,average,min_cycles,max_cycles,average_cycles
Base58CheckEncode,30720,0.000034503871575,0.000035417033359,0.000034728871348,117174,120144,117818
Base58Decode,73728,0.000013721641153,0.000014259770978,0.000013993813708,46547,48372,47470
Base58Encode,40960,0.000023265369236,0.000028600101359,0.000024570309324,78931,97019,83356
CCoinsCaching,13312,0.000073989387602,0.000080669764429,0.000076363722865,250992,273651,259056
CoinSelection,104,0.009642988443375,0.011777520179749,0.009857095204867,32711658,40027011,33439357
DeserializeAndCheckBlockTest,12,0.087463498115540,0.089210510253906,0.088053584098816,296699541,302626002,298725543
DeserializeBlockTest,16,0.068791985511780,0.069242000579834,0.068972617387772,233362289,234886757,233973693
LockedPool,160,0.004223585128784,0.006983995437622,0.006671081483364,14328786,23691500,22631932
MempoolEviction,2560,0.000405104830861,0.000430928543210,0.000418359413743,1374232,1461820,1419187
RIPEMD160,20,0.053014993667603,0.053706049919128,0.053378355503082,179840289,182186794,181088217
RollingBloom,229376,0.000004121757229,0.000005482856068,0.000004432338756,13982,18599,15035
SHA1,56,0.018210709095001,0.019991517066956,0.018770660672869,61774970,67816841,63680293
SHA256,32,0.032343983650208,0.033765912055969,0.033258154988289,109720940,114543363,112829591
SHA256_32b,2,2.213187932968140,2.213187932968140,2.213187932968140,7508019029,7508019029,7508019029
SHA512,52,0.020092487335205,0.020573496818542,0.020253401536208,68158867,69791050,68705021
SipHash_32b,8,0.155891418457031,0.157407522201538,0.156427383422852,528826462,534114465,530680206
Sleep100ms,10,0.100543022155762,0.104717016220093,0.102999615669250,341066705,355370400,349431503
Trig,62914560,0.000000015808098,0.000000016690024,0.000000016066789,53,56,54
VerifyScriptBench,3584,0.000285433605313,0.000293343327940,0.000287477991411,968261,996237,975283