bench: Add support for measuring CPU cycles #9202

pull laanwj wants to merge 1 commits into bitcoin:master from laanwj:2016_11_bench_cpu_cycles changing 5 files +121 −5
  1. laanwj commented at 11:23 AM on November 22, 2016: member

    This adds cycle min/max/avg to the statistics.

    Supported on x86 and x86_64 (natively through rdtsc), as well as for some other architectures on Linux (perf syscall). Will just show 0 on unsupported platforms.

    Was tested on x86_64 and AARCH64.
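
    For readers unfamiliar with the x86 path, the idea can be sketched roughly as follows. This is a minimal illustration, not the code from this pull: the names ReadCycleCounter and CyclesFor are hypothetical, and the sketch assumes GCC or Clang, where __rdtsc() is provided by <x86intrin.h>. Other platforms simply get 0, matching the behaviour described above.

```cpp
#include <cstdint>

#if defined(__i386__) || defined(__x86_64__)
#include <x86intrin.h> // __rdtsc() on GCC and Clang
#endif

// Hypothetical helper (not the name used in the pull request).
// Returns the time-stamp counter on x86/x86_64 and 0 on other platforms.
inline uint64_t ReadCycleCounter()
{
#if defined(__i386__) || defined(__x86_64__)
    return __rdtsc();
#else
    return 0; // unsupported platform: report 0
#endif
}

// Counter ticks spent in a callable, taken as the difference of two readings.
template <typename F>
uint64_t CyclesFor(F&& work)
{
    const uint64_t begin = ReadCycleCounter();
    work();
    const uint64_t end = ReadCycleCounter();
    return end - begin;
}
```

    A benchmark loop would call something like CyclesFor once per iteration and keep the min, max and running sum of the results.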

  2. bench: Add support for measuring CPU cycles
    This adds cycle min/max/avg to the statistics.
    
    Supported on x86 and x86_64 (natively through rdtsc), as well as Linux
    (perf syscall).
    3532818746
  3. laanwj added the label Tests on Nov 22, 2016
  4. jonasschnelli approved
  5. jonasschnelli commented at 12:56 PM on November 22, 2016: contributor

    Tested ACK (OSX) 3532818

    Result with -O2 on OSX (2.6 GHz Intel Core i7)

    #Benchmark,count,min,max,average,min_cycles,max_cycles,average_cycles
    Base58CheckEncode,229376,0.000003975452273,0.000005225097993,0.000004511387877,10312,13553,11703
    Base58Decode,851968,0.000001059088390,0.000001410629920,0.000001215919978,2747,3659,3154
    Base58Encode,327680,0.000002935426892,0.000003485380148,0.000003217919584,7617,9042,8347
    CCoinsCaching,90112,0.000009148381650,0.000012961449102,0.000011695591225,23730,33622,30338
    CoinSelection,416,0.002168059349060,0.002760812640190,0.002422756873644,5623936,7161456,6284967
    DeserializeAndCheckBlockTest,72,0.013411879539490,0.015962481498718,0.014648040135701,34790547,41406927,37996977
    DeserializeBlockTest,88,0.010604500770569,0.012940049171448,0.011376557025042,27508165,33566117,29512372
    LockedPool,512,0.001808419823647,0.003033317625523,0.002045841421932,4691069,7868411,5307190
    MempoolEviction,15360,0.000059797894210,0.000087032094598,0.000065140891820,155116,225747,168975
    RIPEMD160,384,0.002612933516502,0.002894565463066,0.002725711092353,6777939,7508452,7070852
    RollingBloom-refresh,1,0.000611000000000,0.000611000000000,0.000611000000000
    RollingBloom-refresh,1,0.000105000000000,0.000105000000000,0.000105000000000
    RollingBloom-refresh,1,0.000101000000000,0.000101000000000,0.000101000000000
    RollingBloom-refresh,1,0.000097000000000,0.000097000000000,0.000097000000000
    RollingBloom-refresh,1,0.000113000000000,0.000113000000000,0.000113000000000
    RollingBloom-refresh,1,0.000096000000000,0.000096000000000,0.000096000000000
    RollingBloom-refresh,1,0.000096000000000,0.000096000000000,0.000096000000000
    RollingBloom-refresh,1,0.000099000000000,0.000099000000000,0.000099000000000
    RollingBloom-refresh,1,0.000096000000000,0.000096000000000,0.000096000000000
    RollingBloom-refresh,1,0.000108000000000,0.000108000000000,0.000108000000000
    RollingBloom-refresh,1,0.000128000000000,0.000128000000000,0.000128000000000
    RollingBloom-refresh,1,0.000094000000000,0.000094000000000,0.000094000000000
    RollingBloom-refresh,1,0.000151000000000,0.000151000000000,0.000151000000000
    RollingBloom-refresh,1,0.000095000000000,0.000095000000000,0.000095000000000
    RollingBloom-refresh,1,0.000106000000000,0.000106000000000,0.000106000000000
    RollingBloom-refresh,1,0.000124000000000,0.000124000000000,0.000124000000000
    RollingBloom-refresh,1,0.000115000000000,0.000115000000000,0.000115000000000
    RollingBloom-refresh,1,0.000100000000000,0.000100000000000,0.000100000000000
    RollingBloom-refresh,1,0.000100000000000,0.000100000000000,0.000100000000000
    RollingBloom-refresh,1,0.000117000000000,0.000117000000000,0.000117000000000
    RollingBloom-refresh,1,0.000101000000000,0.000101000000000,0.000101000000000
    RollingBloom-refresh,1,0.000111000000000,0.000111000000000,0.000111000000000
    RollingBloom,1310720,0.000000795478627,0.000000927659130,0.000000840967550,2063,2406,2181
    SHA1,512,0.001935496926308,0.002218931913376,0.002032823860645,5020685,5755989,5273419
    SHA256,208,0.004498481750488,0.005540251731873,0.004966990305827,11667956,14371842,12885041
    SHA256_32b,4,0.345051527023315,0.346106529235840,0.345579028129578,895062320,897798922,896430621
    SHA512,352,0.002845406532288,0.003299534320831,0.003069994124499,7380971,8558912,7963958
    SipHash_32b,30,0.033124923706055,0.037207484245300,0.035290129979451,85925534,96516794,91547269
    Sleep100ms,10,0.100992441177368,0.104498505592346,0.102697491645813,261974287,271068521,266396862
    Trig,67108864,0.000000014460568,0.000000015428895,0.000000014972940,37,40,38
    VerifyScriptBench,5632,0.000182222574949,0.000207984820008,0.000195238654586,472678,539492,506447
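
    As a rough sanity check on the cycle columns, a row's cycle count divided by its time gives the implied counter frequency. The sketch below does this for the Sleep100ms row (min and min_cycles columns); the function names are hypothetical. The result is about 2.59e9 ticks per second, close to the 2.6 GHz quoted for this machine.

```cpp
#include <cstdint>

// Ticks per second implied by one benchmark row: cycle count / elapsed seconds.
inline double ImpliedHz(uint64_t cycles, double seconds)
{
    return static_cast<double>(cycles) / seconds;
}

// Sleep100ms row from the output above (min and min_cycles columns).
inline double Sleep100msImpliedHz()
{
    return ImpliedHz(261974287ULL, 0.100992441177368);
}
```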
    
  6. morcos commented at 3:38 PM on November 22, 2016: member

    @laanwj I was playing around with this type of timing earlier and read that I should be wary of rdtsc getting reordered with respect to other instructions and that if you can't use rdtscp instead, then you should add a serializing instruction first like cpuid. Also do you not have any issues with the thread migrating to another core? I had to set cpu affinity.

    I couldn't find where I was reading all that, but here is one link: http://blog.regehr.org/archives/330
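
    A fenced read along these lines might look like the sketch below. Note the substitution: it uses lfence on both sides of rdtsc instead of the cpuid or rdtscp variants mentioned above, since lfence is available on every x86_64 CPU. The name ReadCycleCounterFenced is hypothetical, and GCC or Clang is assumed for <x86intrin.h>.

```cpp
#include <cstdint>

#if defined(__x86_64__)
#include <x86intrin.h> // __rdtsc(), _mm_lfence() on GCC and Clang
#endif

// Hypothetical helper: a time-stamp counter read fenced on both sides.
// Returns 0 on platforms other than x86_64.
inline uint64_t ReadCycleCounterFenced()
{
#if defined(__x86_64__)
    _mm_lfence();                   // wait until earlier instructions have completed
    const uint64_t tsc = __rdtsc(); // read the time-stamp counter
    _mm_lfence();                   // later instructions do not start before the read
    return tsc;
#else
    return 0;
#endif
}
```

    The fences add a few cycles of overhead per read, so this only pays off when the measured region is short enough for reordering to matter.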

  7. laanwj commented at 5:54 AM on November 23, 2016: member

    > I was playing around with this type of timing earlier and read that I should be wary of rdtsc getting reordered with respect to other instructions

    Yes, both the compiler and the CPU pipeline may reorder it. In this specific case it's not too bad, though, because the call is already from a function (State::KeepRunning) called inside the benchmark. So there is quite some overhead already, making reordering by a few instructions probably unnoticeable in the noise.

    > and that if you can't use rdtscp instead, then you should add a serializing instruction first like cpuid.

    I didn't know that, although rdtscp does not seem to be available on all x86 processors. I'll leave that as a future improvement.

    x86 is already precise and low-overhead compared to the ARM path which has to do a syscall (the instructions aren't available to user-space).
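
    For comparison, the syscall-based path can be sketched roughly like this. It is an illustration under assumptions, not the code from this pull: the class name PerfCycleCounter is hypothetical, and the sketch opens a per-thread PERF_COUNT_HW_CPU_CYCLES counter with perf_event_open on Linux. If the counter cannot be opened (no permission, no hardware support, non-Linux), Read() returns 0.

```cpp
#include <cstdint>

#if defined(__linux__)
#include <cstring>
#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>
#endif

// Hypothetical wrapper around a per-thread hardware cycle counter.
class PerfCycleCounter
{
    int fd_ = -1; // stays -1 when the counter is unavailable

public:
    PerfCycleCounter()
    {
#if defined(__linux__)
        perf_event_attr attr;
        std::memset(&attr, 0, sizeof(attr));
        attr.type = PERF_TYPE_HARDWARE;
        attr.size = sizeof(attr);
        attr.config = PERF_COUNT_HW_CPU_CYCLES;
        attr.exclude_kernel = 1; // count user-space cycles only
        attr.exclude_hv = 1;
        // pid = 0 and cpu = -1: count the calling thread on whichever CPU it runs.
        fd_ = static_cast<int>(syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0));
#endif
    }

    PerfCycleCounter(const PerfCycleCounter&) = delete;
    PerfCycleCounter& operator=(const PerfCycleCounter&) = delete;

    ~PerfCycleCounter()
    {
#if defined(__linux__)
        if (fd_ >= 0) close(fd_);
#endif
    }

    // Current counter value, or 0 if the counter could not be opened or read.
    uint64_t Read() const
    {
#if defined(__linux__)
        uint64_t value = 0;
        if (fd_ >= 0 &&
            read(fd_, &value, sizeof(value)) == static_cast<ssize_t>(sizeof(value))) {
            return value;
        }
#endif
        return 0;
    }
};
```

    Every Read() is a system call, which is the overhead being contrasted with rdtsc here.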

    > Also do you not have any issues with the thread migrating to another core? I had to set cpu affinity.

    Indeed, calling bench with e.g. taskset -c 0 bench_bitcoin will likely get more precise cycle measurements.
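
    The same pinning can also be done from inside a process. The sketch below is a Linux-only illustration using sched_setaffinity; the helper name PinCurrentThreadToCpu is hypothetical and nothing like it is part of this pull. It returns false on other platforms, for an out-of-range CPU index, or if the kernel rejects the request.

```cpp
#if defined(__linux__)
#include <sched.h> // sched_setaffinity, cpu_set_t (GNU extension, enabled by default in g++)
#endif

// Hypothetical helper: restrict the calling thread to a single CPU.
inline bool PinCurrentThreadToCpu(int cpu)
{
#if defined(__linux__)
    if (cpu < 0 || cpu >= CPU_SETSIZE) return false;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    // pid 0 means the calling thread.
    return sched_setaffinity(0, sizeof(set), &set) == 0;
#else
    (void)cpu;
    return false;
#endif
}
```

    Calling PinCurrentThreadToCpu(0) before the first measurement has roughly the effect of launching under taskset -c 0.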

  8. fanquake commented at 6:40 AM on November 23, 2016: member

    Running on OSX (3.4GHz i7)

    #Benchmark,count,min,max,average,min_cycles,max_cycles,average_cycles
    Base58CheckEncode,262144,0.000003828128683,0.000004008295946,0.000003908591680,12985,13597,13260
    Base58Decode,983040,0.000000999269105,0.000001138963853,0.000001040822341,3389,3863,3530
    Base58Encode,425984,0.000002385859261,0.000002805812983,0.000002487964454,8093,9517,8440
    CCoinsCaching,106496,0.000009394483641,0.000010205199942,0.000010016403394,31871,34618,33978
    CoinSelection,480,0.002068780362606,0.002542287111282,0.002170727153619,7017890,8624174,7364277
    DeserializeAndCheckBlockTest,96,0.010975986719131,0.011605978012085,0.011256289978822,37233600,39370829,38184392
    DeserializeBlockTest,112,0.009186625480652,0.010600864887238,0.009532500590597,31163698,35960873,32339308
    LockedPool,640,0.001598000526428,0.001764506101608,0.001673145219684,5420783,5985758,5675764
    MempoolEviction,14336,0.000070976559073,0.000092454254627,0.000074292616253,240772,313630,252040
    RIPEMD160,416,0.002447441220284,0.002628095448017,0.002492115474664,8302338,8915072,8453936
    RollingBloom-refresh,1,0.000569000000000,0.000569000000000,0.000569000000000
    RollingBloom-refresh,1,0.000112000000000,0.000112000000000,0.000112000000000
    RollingBloom-refresh,1,0.000121000000000,0.000121000000000,0.000121000000000
    RollingBloom-refresh,1,0.000109000000000,0.000109000000000,0.000109000000000
    RollingBloom-refresh,1,0.000109000000000,0.000109000000000,0.000109000000000
    RollingBloom-refresh,1,0.000112000000000,0.000112000000000,0.000112000000000
    RollingBloom-refresh,1,0.000114000000000,0.000114000000000,0.000114000000000
    RollingBloom-refresh,1,0.000112000000000,0.000112000000000,0.000112000000000
    RollingBloom-refresh,1,0.000114000000000,0.000114000000000,0.000114000000000
    RollingBloom-refresh,1,0.000115000000000,0.000115000000000,0.000115000000000
    RollingBloom-refresh,1,0.000125000000000,0.000125000000000,0.000125000000000
    RollingBloom-refresh,1,0.000111000000000,0.000111000000000,0.000111000000000
    RollingBloom-refresh,1,0.000108000000000,0.000108000000000,0.000108000000000
    RollingBloom-refresh,1,0.000113000000000,0.000113000000000,0.000113000000000
    RollingBloom-refresh,1,0.000116000000000,0.000116000000000,0.000116000000000
    RollingBloom-refresh,1,0.000114000000000,0.000114000000000,0.000114000000000
    RollingBloom-refresh,1,0.000109000000000,0.000109000000000,0.000109000000000
    RollingBloom-refresh,1,0.000116000000000,0.000116000000000,0.000116000000000
    RollingBloom-refresh,1,0.000149000000000,0.000149000000000,0.000149000000000
    RollingBloom-refresh,1,0.000113000000000,0.000113000000000,0.000113000000000
    RollingBloom-refresh,1,0.000114000000000,0.000114000000000,0.000114000000000
    RollingBloom-refresh,1,0.000112000000000,0.000112000000000,0.000112000000000
    RollingBloom-refresh,1,0.000111000000000,0.000111000000000,0.000111000000000
    RollingBloom-refresh,1,0.000112000000000,0.000112000000000,0.000112000000000
    RollingBloom,1441792,0.000000725554855,0.000000796730092,0.000000755561731,2465,2702,2563
    SHA1,576,0.001806784421206,0.001858018338680,0.001830946240160,6129109,6302896,6211558
    SHA256,240,0.004244878888130,0.004624485969543,0.004340062538783,14399626,15688224,14722663
    SHA256_32b,4,0.300903081893921,0.302513957023621,0.301708519458771,1020885060,1026209988,1023547524
    SHA512,384,0.002580255270004,0.002831816673279,0.002633192886909,8752964,9606202,8932496
    SipHash_32b,28,0.036777973175049,0.037643551826477,0.037094610077994,124759771,127697052,125844822
    Sleep100ms,10,0.102699518203735,0.104491472244263,0.103782296180725,348388267,354464246,352058326
    Trig,67108864,0.000000015213971,0.000000016026718,0.000000015468853,51,54,52
    VerifyScriptBench,6144,0.000170339830220,0.000214513391256,0.000176954781637,577836,727691,600278
    
  9. jonasschnelli commented at 7:18 AM on November 23, 2016: contributor

    @fanquake: did you compile with -O2 or -O0 (--enable-debug)?

  10. paveljanik commented at 10:20 AM on November 23, 2016: contributor

    It looks like RollingBloom-refresh bench is not changed to the new output format.

  11. laanwj commented at 11:06 AM on November 23, 2016: member

    Yes, those lines are "faked": they don't go through the framework, so cycles are missing there and will only be shown for the total. Not a big deal, though at some point there will need to be a proper solution for nested benchmarks that doesn't involve printing from inside the benchmarked code. Not in this pull, though.

  12. fanquake commented at 11:18 AM on November 23, 2016: member

    @jonasschnelli

    --enable-debug

    Options used to compile and link:
      debug enabled = yes
      target os     = darwin
      build os      = darwin
    
      CC            = /usr/local/bin/ccache gcc
      CFLAGS        = -g -O2 -g3 -O0
      CPPFLAGS      = -Qunused-arguments  -DDEBUG -DDEBUG_LOCKORDER -DHAVE_BUILD_INFO -D__STDC_FORMAT_MACROS -I/usr/local/opt/berkeley-db4/include -DMAC_OSX
      CXX           = /usr/local/bin/ccache g++ -std=c++11
      CXXFLAGS      = -g -O2 -g3 -O0 -Wall -Wextra -Wformat -Wformat-security -Wno-unused-parameter -Wno-self-assign -Wno-unused-local-typedef -Wno-deprecated-register
      LDFLAGS       =  -Wl,-headerpad_max_install_names -Wl,-dead_strip
    
    #Benchmark,count,min,max,average,min_cycles,max_cycles,average_cycles
    Base58CheckEncode,30720,0.000034503871575,0.000035417033359,0.000034728871348,117174,120144,117818
    Base58Decode,73728,0.000013721641153,0.000014259770978,0.000013993813708,46547,48372,47470
    Base58Encode,40960,0.000023265369236,0.000028600101359,0.000024570309324,78931,97019,83356
    CCoinsCaching,13312,0.000073989387602,0.000080669764429,0.000076363722865,250992,273651,259056
    CoinSelection,104,0.009642988443375,0.011777520179749,0.009857095204867,32711658,40027011,33439357
    DeserializeAndCheckBlockTest,12,0.087463498115540,0.089210510253906,0.088053584098816,296699541,302626002,298725543
    DeserializeBlockTest,16,0.068791985511780,0.069242000579834,0.068972617387772,233362289,234886757,233973693
    LockedPool,160,0.004223585128784,0.006983995437622,0.006671081483364,14328786,23691500,22631932
    MempoolEviction,2560,0.000405104830861,0.000430928543210,0.000418359413743,1374232,1461820,1419187
    RIPEMD160,20,0.053014993667603,0.053706049919128,0.053378355503082,179840289,182186794,181088217
    RollingBloom,229376,0.000004121757229,0.000005482856068,0.000004432338756,13982,18599,15035
    SHA1,56,0.018210709095001,0.019991517066956,0.018770660672869,61774970,67816841,63680293
    SHA256,32,0.032343983650208,0.033765912055969,0.033258154988289,109720940,114543363,112829591
    SHA256_32b,2,2.213187932968140,2.213187932968140,2.213187932968140,7508019029,7508019029,7508019029
    SHA512,52,0.020092487335205,0.020573496818542,0.020253401536208,68158867,69791050,68705021
    SipHash_32b,8,0.155891418457031,0.157407522201538,0.156427383422852,528826462,534114465,530680206
    Sleep100ms,10,0.100543022155762,0.104717016220093,0.102999615669250,341066705,355370400,349431503
    Trig,62914560,0.000000015808098,0.000000016690024,0.000000016066789,53,56,54
    VerifyScriptBench,3584,0.000285433605313,0.000293343327940,0.000287477991411,968261,996237,975283
    
  13. laanwj merged this on Nov 29, 2016
  14. laanwj closed this on Nov 29, 2016

  15. laanwj referenced this in commit e56cf67e6b on Nov 29, 2016
  16. codablock referenced this in commit 8f47523822 on Jan 16, 2018
  17. codablock referenced this in commit a1d4a55600 on Jan 16, 2018
  18. codablock referenced this in commit 27fcec08f8 on Jan 17, 2018
  19. andvgal referenced this in commit b8eb5f453d on Jan 6, 2019
  20. CryptoCentric referenced this in commit 5de4e1c1b5 on Feb 25, 2019
  21. zkbot referenced this in commit aa225ebb0b on Jan 24, 2020
  22. zkbot referenced this in commit 74ff73abab on Jan 24, 2020
  23. furszy referenced this in commit 4ed15cc69d on Jun 8, 2020
  24. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 15:15 UTC
