Tests: Improve benchmark precision #11517

pull martinus wants to merge 2 commits into bitcoin:master from martinus:improved-benchmarking changing 15 files +285 −211
  1. martinus commented at 4:28 PM on October 17, 2017: contributor

    The benchmark's KeepRunning() used to make a function call for each call, inflating measurement times for short-running code. This change inlines the critical code that is executed on each run and moves the slow timer updates into a new function.

    This change reduces the measured average runtime for Trig from 0.000000082339208 sec to 0.000000080948591 sec.

  2. TheBlueMatt commented at 5:20 PM on October 17, 2017: member

    Cool! The benchmark suite is really not at all very good, sadly. If you feel up to it, I think we'd do better by just hardcoding the number of runs in each benchmark - it is often very easy to see very significant differences in the average runtime based on number of runs (even for things which take 1+ms per run!). Unless you have some better intuition than I about why this is the case, that is.

  3. martinus commented at 7:10 PM on October 17, 2017: contributor

    A while ago I wrote a microbenchmarking library similar to Google Benchmark. There I used the geometric mean, which I think is a much better statistic than the average here, and it's better when you have outliers. I also think having the mean and confidence interval is much more reasonable than just min, max, and avg values. Would you be interested in updates like this?

  4. TheBlueMatt commented at 10:25 PM on October 17, 2017: member

    I'd be happy with anything that gives more reliable values, really. Right now, to compare benchmark runs, I often have to pin to a CPU, then run multiple times, and only consider differences if both runs had the same iteration count, which is a huge pain.

  5. fanquake added the label Tests on Oct 17, 2017
  6. laanwj commented at 1:59 PM on October 18, 2017: member

    Cool! The benchmark suite is really not at all very good, sadly. If you feel up to it, I think we'd do better by just hardcoding the number of runs in each benchmark

    The idea of doing it this way is that the benchmarks can be run in roughly the same time on slower hardware, whereas on faster hardware they are more precise. The secp256k1 benchmarks have the problem that they take basically forever on lower-end CPUs. I agree it's probably broken though, and the range of CPUs that bitcoind can realistically run on is much smaller than secp256k1's. (if this is an issue, maybe a fixed scaling factor could be passed on the command line for this purpose, that scales all the benchmark iterations equally and is at least deterministic?)

  7. TheBlueMatt commented at 3:28 PM on October 18, 2017: member

    Yea, a scaling factor seems fine, even an auto-calculated one would probably also be fine, as long as it is very stable/deterministic on the same hardware between runs, which the current version is not.

  8. martinus commented at 6:43 AM on October 19, 2017: contributor

    I don't think using the exact same number of iterations is a huge benefit for stability. The problem is there will always be some random events in the system that cause delays (unwanted fluctuations). The other problem is fluctuations that are inherent to what's benchmarked: e.g. if the code does any allocations, the runtime will always fluctuate. This fluctuation won't go away with a fixed number of runs, and I think the best way to deal with it is with better statistics. In my old library I do this:

    1. Perform 5 runs, each approximately 0.5 seconds.
    2. Calculate geometric mean of the runs.
    3. Calculate the standard deviation of the runtime's logarithm, and print the 95% confidence interval of the runtime.

    Using the logarithm is much more appropriate for benchmark results, because runtime simply can't be negative. With that data we can use a Student's t-test to compare the means of two different algorithms to show whether the difference is actually statistically significant or might be just random fluctuation.

    That's a bit more complicated, but not by much; Boost has all the features for calculating these statistics in a few lines of code. I can work on that, but unfortunately I'm quite busy, so that will take some time.
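    The steps above can be sketched in a few lines of C++ (illustrative names, not the library's actual API; a normal-approximation quantile stands in here for the Student's t quantile a real implementation would use):

```cpp
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Geometric mean of positive runtimes: exp of the mean of the logs.
double GeometricMean(const std::vector<double>& runtimes)
{
    double log_sum = 0.0;
    for (double t : runtimes) log_sum += std::log(t);
    return std::exp(log_sum / runtimes.size());
}

// 95% confidence interval computed in log space and mapped back with
// exp(), so the bounds can never become negative.
std::pair<double, double> ConfidenceInterval95(const std::vector<double>& runtimes)
{
    const std::size_t n = runtimes.size();
    const double mean_log = std::log(GeometricMean(runtimes));
    double var_log = 0.0;
    for (double t : runtimes) {
        const double d = std::log(t) - mean_log;
        var_log += d * d;
    }
    var_log /= (n - 1); // sample variance of the log-runtimes
    // 1.96 is the normal-approximation quantile; for only 5 runs a real
    // implementation would use the Student's t quantile instead.
    const double half_width = 1.96 * std::sqrt(var_log / n);
    return {std::exp(mean_log - half_width), std::exp(mean_log + half_width)};
}
```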

  9. TheBlueMatt commented at 3:04 PM on October 19, 2017: member

    For my largest use of bench (FIBRE stuff, which has pretty wide-ranging benches that run through FEC encode/decode + mempool scan + block decode) the number of runs makes a huge difference. I'd be happy to be wrong, but my guess has always been cache-related/branch-prediction effects (the issues still appear when you give bench_bitcoin a dedicated core, minimizing any system disruption). Fluctuation is fine, but I often see fluctuations of 2-4% when the iteration count changes (on roughly-1ms-runtime!) when trying to tease out 5% changes. For something running that long, it seems strange to me that you'd see such huge fluctuations.

  10. martinus commented at 7:11 AM on October 20, 2017: contributor

    That's strange. Using a fixed number of iterations should be a simple change. How about adding a command line option to specify a fixed number of iterations? A filter would also be useful so only a specified test runs.

  11. thebostik commented at 10:14 PM on October 21, 2017: none

    The premise is a bit flawed (though this version might still be better; I like the idea of visualizing the data):

    used to make a function call for each call

    isn't true, because the value of countMask changes in two different ways inside this logic, increasing to prevent time and cycles from being collected on every other iteration (which is just how it starts). countMask jumps up pretty quickly through values of the form 2^k - 1, so the timing branch is only taken once the count reaches a pretty high power of 2 minus 1.

    There's a comment that says as much, but it's non-obvious since it's tucked pretty far inside that logic: // If the execution was much too fast (1/128th of maxElapsed), increase the count mask by 8x and restart timing. // The restart avoids including the overhead of this code in the measurement.

    I added another counter inside of KeepRunning to track those function invocations. It looks like when the benchmark has scaled up for the final timing loop for each, there are no more than 20 of these timing calls in the final output of the benchmark for the fastest benches (example of this, new column overheadCount, and new row TimeLookup). The timing calls themselves take 25ns avg each according to the same benchmark code (heh).

    So I guess the complexity/readability cost of achieving resource-independent 1s bench runs is the main thing that could be improved in the current version. I thought the same thing too, until really digging into it and playing with countMask some.
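    As a sanity check on that description, an 8x growth step of the form `mask * 8 + 7` keeps the mask a power of 2 minus 1, so `(count & mask) == 0` fires only once every `mask + 1` iterations (a standalone illustration, not the framework's actual code):

```cpp
#include <cstdint>

// Grow a mask the way the framework's comment describes: starting from 1
// and applying mask = mask * 8 + 7 keeps the value of the form 2^k - 1.
uint64_t GrowMask(uint64_t mask, int times)
{
    for (int i = 0; i < times; ++i) mask = mask * 8 + 7;
    return mask;
}
```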

  12. martinus commented at 7:13 AM on October 22, 2017: contributor

    isn't true because the value of countMask changes in two different ways inside of this logic, increasing to prevent time and cycles from being collected on every other iteration (how it starts at first).

    That's right, but KeepRunning() itself used to introduce a function call overhead. With my change the fast path is inlined, and the slow measuring path is in the non-inlined method.

    I've benchmarked an extreme example (with cout so the compiler doesn't optimize it away):

    double d = 0.01;
    while (state.KeepRunning()) {
        d += 0.000001;
    }
    std::cout << d << std::endl;

    Before my change the measurement was 3.19 ns; with the inlined change I measure 0.83 ns.
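    The split looks roughly like this (a simplified sketch with illustrative names and a stand-in stop condition, not the PR's exact code): the hot path is a header-inlined mask test and increment, and only the rare slow path crosses a non-inlined function boundary.

```cpp
#include <chrono>
#include <cstdint>

class State
{
    uint64_t m_count = 0;
    uint64_t m_count_mask = 127;

    // Out-of-line slow path: reads the clock, updates elapsed time,
    // and may grow m_count_mask.
    bool UpdateTimer(std::chrono::high_resolution_clock::time_point now);

public:
    // Inlined fast path: one increment and one mask test per iteration.
    inline bool KeepRunning()
    {
        if (m_count & m_count_mask) {
            ++m_count;
            return true;
        }
        return UpdateTimer(std::chrono::high_resolution_clock::now());
    }
};

// Would live in the .cpp file; the fixed stop condition here is a
// stand-in for the real elapsed-time logic.
bool State::UpdateTimer(std::chrono::high_resolution_clock::time_point /* now */)
{
    ++m_count;
    return m_count < 1000;
}
```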

    I also find the plot extremely useful, and the command line options. I can now do this:

    ./bench/bench_bitcoin -filter=Deser.* -evals=20 -printer=plot |tee out.html

    The output is a nice plot; I've uploaded it here: http://lawyer-maggie-58725.bitballoon.com/ There you can clearly see that there is an outlier for DeserializeAndCheckBlockTest. Using the median of multiple runs, it is nicely filtered out.

  13. martinus renamed this:
    Tests: Improve benchmark precision by inlining performance critical code
    Tests: Improve benchmark precision
    on Oct 27, 2017
  14. in src/bench/bench.h:119 in f4d9ade8f6 outdated
     144 | -        static void RunAll(double elapsedTimeForOne=1.0);
     145 | +        static void RunAll(Printer& printer, uint64_t numEvals = 1, double elapsedTimeForOneEval = 0.5, const std::string& filter = ".*", bool isListOnly = false);
     146 |      };
     147 |  }
     148 |  
     149 |  // BENCHMARK(foo) expands to:  benchmark::BenchRunner bench_11foo("foo", foo);
    


    ryanofsky commented at 6:14 PM on October 30, 2017:

    This comment is now out of date.

  15. in src/bench/bench.h:120 in f4d9ade8f6 outdated
     147 |  }
     148 |  
     149 |  // BENCHMARK(foo) expands to:  benchmark::BenchRunner bench_11foo("foo", foo);
     150 | -#define BENCHMARK(n) \
     151 | -    benchmark::BenchRunner BOOST_PP_CAT(bench_, BOOST_PP_CAT(__LINE__, n))(BOOST_PP_STRINGIZE(n), n);
     152 | +#define BENCHMARK(n, avgTime) \
    


    ryanofsky commented at 6:15 PM on October 30, 2017:

    Usage example at the top of bench.h is now out of date. Should also add comment there explaining the new macro argument.

  16. in src/bench/bench.h:121 in f4d9ade8f6 outdated
     148 |  
     149 |  // BENCHMARK(foo) expands to:  benchmark::BenchRunner bench_11foo("foo", foo);
     150 | -#define BENCHMARK(n) \
     151 | -    benchmark::BenchRunner BOOST_PP_CAT(bench_, BOOST_PP_CAT(__LINE__, n))(BOOST_PP_STRINGIZE(n), n);
     152 | +#define BENCHMARK(n, avgTime) \
     153 | +    benchmark::BenchRunner BOOST_PP_CAT(bench_, BOOST_PP_CAT(__LINE__, n))(BOOST_PP_STRINGIZE(n), n, avgTime);
    


    ryanofsky commented at 6:22 PM on October 30, 2017:

    Suggest expected_time or estimated_duration instead of avgTime. Developer guide recommends using snake case for variables, and avg doesn't tell you what it's supposed to be the average of.

  17. in src/bench/bench_bitcoin.cpp:27 in f4d9ade8f6 outdated
      23 |      SetupEnvironment();
      24 |      fPrintToDebugLog = false; // don't want to write to debug.log file
      25 |  
      26 | -    benchmark::BenchRunner::RunAll();
      27 | +    int64_t evaluations = gArgs.GetArg("-evals", 5);
      28 | +    std::string regexFilter = gArgs.GetArg("-filter", ".*");
    


    ryanofsky commented at 6:25 PM on October 30, 2017:

    Should add -help output describing what the options do.

    Should use snake_case for variable names (and ideally match them with option names).


    martinus commented at 5:49 AM on October 31, 2017:

    Thanks a lot for the review! I'll see if I can fix all that over the weekend.

  18. in src/bench/bench.h:50 in f4d9ade8f6 outdated
      37 | @@ -36,53 +38,86 @@ BENCHMARK(CODE_TO_TIME);
      38 |   */
      39 |   
      40 |  namespace benchmark {
      41 | +class State;
      42 |  
      43 | -    class State {
    


    ryanofsky commented at 6:49 PM on October 30, 2017:

    New code is missing indentation, so now it is inconsistent, and BenchRunner looks like it is nested inside State. Also, the State class is key for someone trying to understand the benchmark framework, so would recommend leaving on top of the printer classes (you can forward declare Printer).

  19. in src/bench/bench.h:115 in f4d9ade8f6 outdated
     140 |      public:
     141 | -        BenchRunner(std::string name, BenchFunction func);
     142 | +        BenchRunner(std::string name, BenchFunction func, double avgTime);
     143 |  
     144 | -        static void RunAll(double elapsedTimeForOne=1.0);
     145 | +        static void RunAll(Printer& printer, uint64_t numEvals = 1, double elapsedTimeForOneEval = 0.5, const std::string& filter = ".*", bool isListOnly = false);
    


    ryanofsky commented at 6:54 PM on October 30, 2017:

    Default argument values never actually seem to be used. Would drop so they aren't repeated and don't become inconsistent in the future.

  20. ryanofsky commented at 7:29 PM on October 30, 2017: member

    utACK f028ada09ee92a1950ccbc60415301b146eb349c if code comments/whitespace/variable names are cleaned up. This change adds some features, makes the code more portable, and effectively makes benchmarks run with a fixed number of iterations like Matt wanted, though I think it would be a little more straightforward if the BENCHMARK macro just took the number of KeepRunning iterations literally instead of taking an average time estimate and using 0.5/average_time as the number of iterations.
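    The iteration-count derivation being discussed can be sketched as follows (illustrative names, not the PR's exact code): a deterministic count from the estimated average runtime of one iteration and a user-supplied scaling factor, targeting roughly 0.5 seconds per evaluation.

```cpp
#include <algorithm>
#include <cstdint>

// Deterministic iteration count from the estimated average runtime of a
// single iteration (the BENCHMARK macro argument) and a scaling factor.
uint64_t NumIterations(double est_avg_time_sec, double scaling)
{
    const double target_total_sec = 0.5 * scaling; // ~0.5 s per evaluation
    const auto iters = static_cast<uint64_t>(target_total_sec / est_avg_time_sec);
    return std::max<uint64_t>(1, iters); // always run at least once
}
```

Taking the literal iteration count as the macro argument, as suggested, would remove the division entirely and make the count trivially reproducible across machines.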

  21. in src/bench/bench_bitcoin.cpp:32 in 08e77aa51a outdated
      27 | +        std::cout << HelpMessageGroup(_("Options:"))
      28 | +                  << HelpMessageOpt("-?", _("Print this help message and exit"))
      29 | +                  << HelpMessageOpt("-list", _("List benchmarks without executing them. Can be combined with -scaling and -filter"))
      30 | +                  << HelpMessageOpt("-evals=<n>", strprintf(_("Number of measurement evaluations to perform. (default: %u)"), DEFAULT_BENCH_EVALUATIONS))
      31 | +                  << HelpMessageOpt("-filter=<regex>", strprintf(_("Regular expression filter to select benchmark by name (default: %s)"), DEFAULT_BENCH_FILTER))
      32 | +                  << HelpMessageOpt("-scaling=<n>", strprintf(_("Scaling factor for iterations. The specified factor is then divided by 1000, so e.g. 500 results in a factor 0.5 (default: %u)"), DEFAULT_BENCH_SCALING))
    


    TheBlueMatt commented at 8:33 PM on November 2, 2017:

    This is somewhat confusing. Maybe call it the scaling factor for runtime to make it clear what it actually means?


    martinus commented at 6:55 PM on November 7, 2017:

    I've updated the description now too

  22. in src/bench/bench.cpp:49 in 08e77aa51a outdated
      45 | +void benchmark::ConsolePrinter::footer() {}
      46 | +void benchmark::PlotlyPrinter::header()
      47 |  {
      48 | -    benchmarks().insert(std::make_pair(name, func));
      49 | +    std::cout << "<html><head>"
      50 | +              << "<script src=\"https://cdn.plot.ly/plotly-latest.min.js\"></script>"
    


    TheBlueMatt commented at 8:37 PM on November 2, 2017:

    Grr, I kinda feel yucky having code that generates a tracking request for people who want to see the output. Is it possible to check plotly in (does it have a compatible license?) or can we just put the graphs in simple cli form and skip the html output?


    martinus commented at 6:50 PM on November 7, 2017:

    Plotly.js is released under the MIT license: https://github.com/plotly/plotly.js/blob/master/LICENSE, so that should be pretty compatible :) The HTML output is optional, and the default output is the command line, so I did not see a problem with that. I've chosen it because the box plots are really nice; especially when using more evaluations, the output makes it very visible how the runtime actually fluctuates. That makes comparisons much easier. It would be very hard to reproduce that on the command line.

  23. in src/bench/bench.cpp:17 in d61637bf94 outdated
      16 | -benchmark::BenchRunner::BenchmarkMap &benchmark::BenchRunner::benchmarks() {
      17 | -    static std::map<std::string, benchmark::BenchFunction> benchmarks_map;
      18 | -    return benchmarks_map;
      19 | +void benchmark::ConsolePrinter::header()
      20 | +{
      21 | +    std::cout << "# Benchmark, evals, iterations, total, min, max, median" << std::endl;
    


    TheBlueMatt commented at 8:55 PM on November 6, 2017:

    This is wrong for is_list_only - maybe is_list_only mode should also call result() with some other parameter for the mode?


    martinus commented at 6:51 PM on November 7, 2017:

    fixed in the following commit

  24. in src/bench/bench.h:57 in d61637bf94 outdated
      77 | -    typedef std::function<void(State&)> BenchFunction;
      78 | +    std::chrono::time_point<std::chrono::high_resolution_clock> m_start_time;
      79 |  
      80 | -    class BenchRunner
      81 | +    bool UpdateTimer(std::chrono::time_point<std::chrono::high_resolution_clock> finish_time);
      82 | +    void PrintResults() const;
    


    TheBlueMatt commented at 8:57 PM on November 6, 2017:

    This function is neither defined, nor used.


    martinus commented at 6:51 PM on November 7, 2017:

    thanks, I've removed it

  25. TheBlueMatt commented at 8:58 PM on November 6, 2017: member

    Looks awesome, thanks for tackling this!

  26. theuni commented at 10:26 PM on November 7, 2017: member

    Concept ACK. The plotly.js issue could be worked around by requiring that a url be supplied when plotting.

    It'd be nice if this could be rebased on #11562, as I think that's a simple logical subset of the changes here.

  27. laanwj commented at 3:50 AM on November 8, 2017: member

    Concept ACK. Adding HTML as an optional format is neat, please keep CSV support though I need it (but you do so that's great).

    NACK on checking plotly (or any JS code) into the repository.

    I think it's ok to use this URI by default for the HTML output, but agree that it should be possible to provide your own (e.g. if you've installed it somewhere on your local network, and want to avoid the internet access).

    Also agree on rebasing on top of #11562, that one is ready for merge and will go in today.

  28. martinus force-pushed on Nov 8, 2017
  29. martinus commented at 5:48 AM on November 9, 2017: contributor

    I've rebased and squashed my changes on top of the current master, and added a configurable plotly.js URL. That was the first time I've done a rebase; I hope I've done it correctly! The travis build is green, but for some reason it shows as "build is in progress"?

  30. MarcoFalke commented at 3:55 PM on November 9, 2017: member

    Don't worry about the travis-yellow. This happens from time to time.

  31. MarcoFalke commented at 3:34 PM on November 10, 2017: member

    Needs rebase

  32. martinus force-pushed on Nov 10, 2017
  33. in src/bench/bench.cpp:12 in 6c31a3599a outdated
       7 | @@ -8,98 +8,135 @@
       8 |  #include <assert.h>
       9 |  #include <iostream>
      10 |  #include <iomanip>
      11 | +#include <algorithm>
      12 | +#include <regex>
    


    MarcoFalke commented at 9:09 PM on November 13, 2017:

    Don't forget to #include <numeric> for std::accumulate.


    martinus commented at 9:24 AM on November 18, 2017:

    thanks, updated!

  34. MarcoFalke commented at 9:47 PM on November 13, 2017: member

    Looks good 6c31a3599a66a6a59b23919f6e3724b80e936485. Would you mind updating "RollingBloom-refresh" to the same interface/output as well?

    Also, you removed the cycles statistics from the output. Would be nice to briefly mention the reason for that.

    Also, you removed the cycles statistics from the output. Would be nice to shortly mention the reason for that.

  35. martinus force-pushed on Nov 18, 2017
  36. martinus commented at 9:42 AM on November 18, 2017: contributor

    I have removed the cycles statistics because I personally don't think they are necessary, and it simplifies the code. I could add them back, though, if others think it's a helpful statistic.

  37. in src/bench/bench.cpp:100 in 9b4567c7b8 outdated
     101 | -void
     102 | -benchmark::BenchRunner::RunAll(benchmark::duration elapsedTimeForOne)
     103 | +void benchmark::BenchRunner::RunAll(Printer& printer, uint64_t num_evals, double scaling, const std::string& filter, bool is_list_only)
     104 |  {
     105 |      perf_init();
     106 |      if (std::ratio_less_equal<benchmark::clock::period, std::micro>::value) {
    


    MarcoFalke commented at 3:31 PM on November 27, 2017:

    nit (unrelated to your changes): Condition should be negated.

    Mind to add a commit for this?


    martinus commented at 10:02 AM on December 2, 2017:

    Ah, thanks for noting. I have fixed that message and rebased & squashed everything

  38. MarcoFalke commented at 3:32 PM on November 27, 2017: member
  39. martinus force-pushed on Dec 2, 2017
  40. MarcoFalke commented at 10:47 PM on December 13, 2017: member

    utACK 846ae3e5143a3959a03fe9bdff78e0253ef38a37

  41. TheBlueMatt commented at 9:44 PM on December 14, 2017: member

    Code-level utACK 846ae3e5143a3959a03fe9bdff78e0253ef38a37, but there are a few iteration counts that seem much too high (or low). My output is below (take note of CCheckQueueSpeed, which took much, much too long, as well as CCheckQueueSpeedPrevectorJob, and a few which ran much too short).

    Benchmark, evals, iterations, total, min, max, median
    Base58CheckEncode, 5, 320000, 4.95443, 3.03136e-06, 3.15128e-06, 3.10137e-06
    Base58Decode, 5, 800000, 4.70946, 1.15549e-06, 1.19147e-06, 1.17997e-06
    Base58Encode, 5, 470000, 4.69162, 1.9516e-06, 2.03356e-06, 2.00767e-06
    BenchLockedPool, 5, 530, 4.77861, 0.00174317, 0.00186048, 0.00180833
    CCheckQueueSpeed, 5, 29000, 169.862, 0.00116512, 0.00117693, 0.00117196
    CCheckQueueSpeedPrevectorJob, 5, 1400, 24.7343, 0.00344175, 0.00358993, 0.00354375
    CCoinsCaching, 5, 170000, 5.63809, 6.55766e-06, 6.74362e-06, 6.64618e-06
    CoinSelection, 5, 650, 7.56354, 0.00229932, 0.00235094, 0.00234165
    DeserializeAndCheckBlockTest, 5, 160, 6.90174, 0.00853605, 0.00879497, 0.00859933
    DeserializeBlockTest, 5, 130, 4.66008, 0.00716357, 0.00717861, 0.00716775
    FastRandom_1bit, 5, 44000000, 0.552348, 2.49383e-09, 2.5533e-09, 2.50127e-09
    FastRandom_32bit, 5, 110000000, 5.47546, 9.9329e-09, 1.00259e-08, 9.93521e-09
    MempoolEviction, 5, 41000, 4.77384, 2.32089e-05, 2.34167e-05, 2.3244e-05
    PrevectorClear, 5, 5600, 5.79097, 0.000206768, 0.000206858, 0.000206819
    PrevectorDestructor, 5, 5700, 5.12332, 0.00017952, 0.00018031, 0.000179633
    RIPEMD160, 5, 440, 5.56629, 0.00252712, 0.00254049, 0.00252763
    RollingBloom, 5, 1500000, 6.21419, 8.1511e-07, 8.77143e-07, 8.16809e-07
    SHA1, 5, 570, 5.67104, 0.00198533, 0.00199761, 0.00198967
    SHA256, 5, 340, 5.67169, 0.00322694, 0.00361889, 0.00325424
    SHA256_32b, 5, 4700000, 5.55119, 2.33977e-07, 2.40388e-07, 2.35284e-07
    SHA512, 5, 330, 5.45279, 0.00328487, 0.00334672, 0.00329141
    SipHash_32b, 5, 40000000, 5.66765, 2.82588e-08, 2.84166e-08, 2.83323e-08
    Sleep100ms, 5, 10, 5.0036, 0.10007, 0.100075, 0.100071
    Trig, 5, 12000000, 1.45462, 2.28395e-08, 2.80425e-08, 2.35008e-08
    VerifyScriptBench, 5, 6300, 5.76116, 0.000181074, 0.000186656, 0.000182651

  42. jtimon commented at 11:38 PM on December 14, 2017: contributor

    Concept ACK

  43. MarcoFalke commented at 11:51 PM on December 15, 2017: member

    @martinus Can you share your results? @TheBlueMatt That is interesting. Running on my laptop gives similar results to yours with the exception of CCheckQueueSpeed and CCheckQueueSpeedPrevectorJob

    differ: ccheckqueuespeed

    similar: coinselection

    MarcoFalke.txt:

    # Benchmark, evals, iterations, total, min, max, median
    Base58CheckEncode, 5, 320000, 4.5474, 2.817e-06, 2.8636e-06, 2.84568e-06
    Base58Decode, 5, 800000, 3.82362, 9.30481e-07, 1.00586e-06, 9.49333e-07
    Base58Encode, 5, 470000, 4.7699, 1.86009e-06, 2.11548e-06, 2.05627e-06
    BenchLockedPool, 5, 530, 5.17102, 0.00175682, 0.00213842, 0.00190727
    CCheckQueueSpeed, 5, 29000, 23.8431, 0.000159138, 0.000169364, 0.000164902
    CCheckQueueSpeedPrevectorJob, 5, 1400, 6.47687, 0.000917759, 0.000934733, 0.000922342
    CCoinsCaching, 5, 170000, 4.86768, 5.59884e-06, 5.89609e-06, 5.68206e-06
    CoinSelection, 5, 650, 7.6276, 0.00218004, 0.00252972, 0.00234302
    DeserializeAndCheckBlockTest, 5, 160, 6.98462, 0.00763635, 0.0101189, 0.00814532
    DeserializeBlockTest, 5, 130, 4.60714, 0.00664361, 0.00766447, 0.00692787
    FastRandom_1bit, 5, 44000000, 0.428713, 1.93345e-09, 1.97208e-09, 1.94739e-09
    FastRandom_32bit, 5, 110000000, 4.87081, 8.54002e-09, 9.68865e-09, 8.64202e-09
    MempoolEviction, 5, 41000, 4.25868, 2.01973e-05, 2.16433e-05, 2.0531e-05
    PrevectorClear, 5, 5600, 4.10787, 0.000139865, 0.000152129, 0.000148594
    PrevectorDestructor, 5, 5700, 3.80067, 0.000130306, 0.000137978, 0.000133524
    RIPEMD160, 5, 440, 5.45262, 0.00240383, 0.00258753, 0.00243835
    RollingBloom, 5, 1500000, 6.32018, 7.26596e-07, 9.93319e-07, 7.94326e-07
    SHA1, 5, 570, 5.27207, 0.00177787, 0.00190894, 0.00186143
    SHA256, 5, 340, 5.01164, 0.00285627, 0.00317001, 0.00289666
    SHA256_32b, 5, 4700000, 4.91959, 2.04949e-07, 2.16811e-07, 2.09041e-07
    SHA512, 5, 330, 4.41158, 0.00265473, 0.00268366, 0.00267743
    SipHash_32b, 5, 40000000, 4.93749, 2.42605e-08, 2.48857e-08, 2.47638e-08
    Sleep100ms, 5, 10, 5.0058, 0.100097, 0.100133, 0.100118
    Trig, 5, 12000000, 1.16958, 1.89401e-08, 2.04877e-08, 1.90709e-08
    VerifyScriptBench, 5, 6300, 4.50527, 0.000139754, 0.000145372, 0.000143697
    
  44. martinus commented at 7:18 AM on December 16, 2017: contributor

    That's interesting, here are my results:

    # Benchmark, evals, iterations, total, min, max, median
    Base58CheckEncode, 5, 320000, 5.79601, 3.47514e-06, 3.90062e-06, 3.56941e-06
    Base58Decode, 5, 800000, 5.68374, 1.35576e-06, 1.47075e-06, 1.43981e-06
    Base58Encode, 5, 470000, 5.97008, 2.29896e-06, 2.8671e-06, 2.47352e-06
    BenchLockedPool, 5, 530, 5.74901, 0.00202647, 0.00244975, 0.00208988
    CCheckQueueSpeed, 5, 29000, 4.64417, 3.06441e-05, 3.39189e-05, 3.14987e-05
    CCheckQueueSpeedPrevectorJob, 5, 1400, 5.73779, 0.000769057, 0.000853848, 0.000828896
    CCoinsCaching, 5, 170000, 5.82152, 6.65853e-06, 7.02337e-06, 6.89329e-06
    CoinSelection, 5, 650, 5.41396, 0.00160549, 0.00172544, 0.00165988
    DeserializeAndCheckBlockTest, 5, 160, 6.57906, 0.00803611, 0.00852187, 0.00820457
    DeserializeBlockTest, 5, 130, 4.45676, 0.00673637, 0.00707143, 0.00679454
    FastRandom_1bit, 5, 44000000, 0.545512, 2.36596e-09, 2.6505e-09, 2.48319e-09
    FastRandom_32bit, 5, 110000000, 5.45518, 9.6993e-09, 1.00983e-08, 9.95917e-09
    MempoolEviction, 5, 41000, 5.60891, 2.63059e-05, 3.01792e-05, 2.64325e-05
    PrevectorClear, 5, 5600, 5.59075, 0.000189645, 0.000213659, 0.000199472
    PrevectorDestructor, 5, 5700, 5.83946, 0.000179477, 0.000234454, 0.000197302
    RIPEMD160, 5, 440, 5.58772, 0.00239881, 0.00266644, 0.00254531
    RollingBloom, 5, 1500000, 6.05577, 6.86181e-07, 9.65984e-07, 7.73467e-07
    SHA1, 5, 570, 5.15691, 0.00179377, 0.00183837, 0.00179627
    SHA256, 5, 340, 5.21523, 0.00300338, 0.0030959, 0.00308264
    SHA256_32b, 5, 4700000, 5.28716, 2.20308e-07, 2.30364e-07, 2.25683e-07
    SHA512, 5, 330, 5.22515, 0.00310101, 0.00323767, 0.00316392
    SipHash_32b, 5, 40000000, 5.31217, 2.60762e-08, 2.69529e-08, 2.68093e-08
    Sleep100ms, 5, 10, 5.03343, 0.10037, 0.10093, 0.100714
    Trig, 5, 12000000, 5.09343, 8.40781e-08, 8.57396e-08, 8.51671e-08
    VerifyScriptBench, 5, 6300, 5.47213, 0.000165754, 0.00018609, 0.000169795
    

    I have configured the runtimes so that the total time is somewhat close to 5 seconds on my PC; the only mistake I see is that I set FastRandom_1bit to 10 times too few iterations. Trig is a pretty useless benchmark, just a demonstration, so the runtimes there do not really matter IMHO. All other benchmarks take roughly a bit over 5 seconds for me. I'm using g++ 5.4.0 on Linux.

    It's really interesting that the runtimes of CCheckQueueSpeed are so different. What compiler are you using?

  45. TheBlueMatt commented at 1:06 AM on December 18, 2017: member

    Heh, I went and checked, and the CCheckQueue bench tests are just (rather uselessly) testing inter-core latency... when I turn the core count way down using taskset, things get much faster (2 non-HT cores seems to be the fastest; I didn't test across physical CPUs, but assume it would be very slow). I'd suggest we either turn the bench off wholesale or put some semi-realistic work in the check there.

  46. TheBlueMatt commented at 1:08 AM on December 18, 2017: member

    Oh, also this is on a many-core system, so the checkqueue bench spawns a ton of threads (it uses a dynamic count based on hardware), making the bench even more inconsistent between systems.

  47. MarcoFalke commented at 10:34 PM on December 19, 2017: member

    @martinus I don't want to hold up your pull request any longer. I really like how flexible the new design is to jump in and hack your own stuff into it. However, since there seem to be issues arising from hardcoding the number of iterations, I'd propose the following solution:

    (Struck through later:) Add a commit that also hardcodes the number of cores in checkqueue.cpp to 2. E.g.

    diff --git a/src/bench/checkqueue.cpp b/src/bench/checkqueue.cpp
    index e79c986792..55590c4a72 100644
    --- a/src/bench/checkqueue.cpp
    +++ b/src/bench/checkqueue.cpp
    @@ -5,3 +5,2 @@
     #include <bench/bench.h>
    -#include <util.h>
     #include <validation.h>
    @@ -17,3 +16,3 @@
     // particularly visible
    -static const int MIN_CORES = 2;
    +static constexpr int NUM_CORES = 2;
     static const size_t BATCHES = 101;
    @@ -33,4 +32,4 @@ static void CCheckQueueSpeed(benchmark::State& state)
         boost::thread_group tg;
    -    for (auto x = 0; x < std::max(MIN_CORES, GetNumCores()); ++x) {
    -       tg.create_thread([&]{queue.Thread();});
    +    for (auto x = 0; x < NUM_CORES; ++x) {
    +        tg.create_thread([&] { queue.Thread(); });
         }
    @@ -80,4 +79,4 @@ static void CCheckQueueSpeedPrevectorJob(benchmark::State& state)
         boost::thread_group tg;
    -    for (auto x = 0; x < std::max(MIN_CORES, GetNumCores()); ++x) {
    -       tg.create_thread([&]{queue.Thread();});
    +    for (auto x = 0; x < NUM_CORES; ++x) {
    +        tg.create_thread([&] { queue.Thread(); });
         }
    
  48. MarcoFalke commented at 10:36 PM on December 19, 2017: member

    @TheBlueMatt suggested to get rid of the test, since it is not really a useful benchmark. I am fine with that, too.

    (Make sure to add it as separate commit, so it is easier to review)

  49. in src/bench/bench_bitcoin.cpp:12 in 33dbba421d outdated
       8 | @@ -9,17 +9,64 @@
       9 |  #include <validation.h>
      10 |  #include <util.h>
      11 |  #include <random.h>
      12 | +#include <init.h>
    


    laanwj commented at 8:59 PM on December 20, 2017:

    Why does this add a dependency on init.h?

  50. in contrib/devtools/check-doc.py:18 in 33dbba421d outdated
      14 | @@ -15,9 +15,9 @@
      15 |  import sys
      16 |  
      17 |  FOLDER_GREP = 'src'
      18 | -FOLDER_TEST = 'src/test/'
      19 | +FOLDER_TEST = 'src/(test|bench)/'
    


    MarcoFalke commented at 9:01 PM on December 20, 2017:

    What's the reason for this change? It hides a valid warning for me at least:

    Args undocumented: 1
    set(['-plot-height'])
    
  51. in src/bench/bench_bitcoin.cpp:41 in 33dbba421d outdated
      36 | +                  << HelpMessageOpt("-filter=<regex>", strprintf(_("Regular expression filter to select benchmark by name (default: %s)"), DEFAULT_BENCH_FILTER))
      37 | +                  << HelpMessageOpt("-scaling=<n>", strprintf(_("Scaling factor for benchmark's runtime (default: %u)"), DEFAULT_BENCH_SCALING))
      38 | +                  << HelpMessageOpt("-printer=(console|plot)", strprintf(_("Choose printer format. console: print data to console. plot: Print results as HTML graph (default: %s)"), DEFAULT_BENCH_PRINTER))
      39 | +                  << HelpMessageOpt("-plot-plotlyurl=<uri>", strprintf(_("URL to use for plotly.js (default: %s)"), DEFAULT_PLOT_PLOTLYURL))
      40 | +                  << HelpMessageOpt("-plot-width=<x>", strprintf(_("Plot width in pixel (default: %u)"), DEFAULT_PLOT_WIDTH))
      41 | +                  << HelpMessageOpt("-plot-width=<x>", strprintf(_("Plot height in pixel (default: %u)"), DEFAULT_PLOT_HEIGHT));
    


    MarcoFalke commented at 9:02 PM on December 20, 2017:

    Arg should say "-plot-height=<x>" to make the python check happy.

  52. MarcoFalke commented at 9:02 PM on December 20, 2017: member

    re-ACK. Needs rebase and two nits fixed.

  53. martinus commented at 8:42 PM on December 21, 2017: contributor

    I think I made my branch kaputt while trying to rebase :-(

  54. MarcoFalke commented at 10:10 PM on December 21, 2017: member

    Git commits are unkaputtbar. Should be trivial to recover. Give me a sec...

    On Dec 21, 2017 15:43, "Martin Ankerl" notifications@github.com wrote:

    I think I made my branch kaputt while trying to rebase :-(


  55. MarcoFalke commented at 10:27 PM on December 21, 2017: member

    git checkout improved-benchmarking
    git reset --hard 48ea572e79e1f63a5de4025d02ba72a666bcdc95
    git rebase 604e08c83cf58ca7e7cda2ab284c1ace7bb12977

    Solve the merge conflict (should be straightforward): vim src/bench/checkqueue.cpp

    git add src/bench/checkqueue.cpp && git rebase --continue
    git rebase --interactive HEAD~4

    Adjust the prefixes and the order of the commits so that it reads (approximately):

    pick foobaa0 Improved microbenchmarking with multiple features.

    f foobaa1 increased number of iterations so it takes about 5 seconds

    f foobaa2 removed unused init.h include fixed argument help check for bench

    pick foobaa3 Removed CCheckQueueSpeed benchmark

    git push origin improved-benchmarking --force

  56. Improved microbenchmarking with multiple features.
    * inline performance critical code
    * Average runtime is specified and used to calculate iterations.
    * Console: show median of multiple runs
    * plot: show box plot
    * filter benchmarks
    * specify scaling factor
    * ignore src/test and src/bench in command line check script
    * number of iterations instead of time
    * Replaced runtime in BENCHMARK macro with number of iterations.
    * Added -? to bench_bitcoin
    * Benchmark plotly.js URL, width, height can be customized
    * Fixed incorrect precision warning
    00721e69f8
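    The core of the "inline performance critical code" item, per the PR description, is that the per-iteration check becomes a cheap inlined countdown while the expensive clock reads move into an out-of-line slow path. A self-contained sketch of that idea, with simplified names and batch handling that are illustrative rather than the exact bench.h interface:

    ```cpp
    #include <cassert>
    #include <chrono>
    #include <cstdint>

    class State
    {
        static constexpr uint64_t BATCH = 128;  // iterations between timer updates
        int64_t num_iters_left{0};              // fast countdown for current batch
        uint64_t total_done{0};
        uint64_t max_iterations;
        bool started{false};
        std::chrono::steady_clock::time_point batch_start{};
        std::chrono::steady_clock::duration elapsed{};

        bool UpdateTimer()                      // slow path, called once per batch
        {
            const auto now = std::chrono::steady_clock::now();
            if (started) {
                elapsed += now - batch_start;   // account for the finished batch
                total_done += BATCH;
            }
            started = true;
            if (total_done >= max_iterations) return false;
            num_iters_left = BATCH - 1;         // this call's `true` is one iteration
            batch_start = std::chrono::steady_clock::now();
            return true;
        }

    public:
        explicit State(uint64_t iters) : max_iterations{iters} {}

        inline bool KeepRunning()               // hot path: decrement and compare only
        {
            if (num_iters_left-- > 0) return true;
            return UpdateTimer();
        }

        uint64_t Iterations() const { return total_done; }
    };

    int main()
    {
        State state{256};                       // 256 is a multiple of BATCH
        uint64_t body_runs = 0;
        while (state.KeepRunning()) ++body_runs; // benchmark body goes here
        assert(body_runs == 256);
        assert(state.Iterations() == 256);
        return 0;
    }
    ```

    Because no clock call sits on the hot path, short-running benchmark bodies are no longer inflated by measurement overhead; the timer cost is amortized over each batch of iterations.
    
    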
  57. Removed CCheckQueueSpeed benchmark
    This benchmark's runtime was rather unpredictable across different machines, so it was not really a useful benchmark.
    760af84072
  58. in contrib/devtools/check-doc.py:20 in e63f31138c outdated
      16 | @@ -17,7 +17,7 @@
      17 |  FOLDER_GREP = 'src'
      18 |  FOLDER_TEST = 'src/test/'
      19 |  CMD_ROOT_DIR = '`git rev-parse --show-toplevel`/%s' % FOLDER_GREP
      20 | -CMD_GREP_ARGS = r"egrep -r -I '(map(Multi)?Args(\.count\(|\[)|Get(Bool)?Arg\()\"\-[^\"]+?\"' %s | grep -v '%s'" % (CMD_ROOT_DIR, FOLDER_TEST)
      21 | +CMD_GREP_ARGS = r"egrep -r -I '(map(Multi)?Args(\.count\(|\[)|Get(Bool)?Arg\()\"\-[^\"]+?\"' %s | grep -Ev '%s'" % (CMD_ROOT_DIR, FOLDER_TEST)
    


    MarcoFalke commented at 10:35 PM on December 21, 2017:

    Don't forget to remove this as well

  59. martinus force-pushed on Dec 23, 2017
  60. martinus commented at 10:22 AM on December 23, 2017: contributor

    Thanks a lot @MarcoFalke!

  61. laanwj commented at 1:52 PM on December 23, 2017: member

    utACK 760af84

  62. laanwj merged this on Dec 23, 2017
  63. laanwj closed this on Dec 23, 2017

  64. laanwj referenced this in commit 5180a86c96 on Dec 23, 2017
  65. MarcoFalke commented at 5:25 PM on December 23, 2017: member

    Post-merge checked that the rebase was done correctly:

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA512
    
    ACK 760af84072408ba53d009e868fccc25fb186d40c
    -----BEGIN PGP SIGNATURE-----
    
    iQIcBAEBCgAGBQJaPpFpAAoJENLqSFDnUosl6n4QAKHR/UGqeuxq0CJHWsaP2gX/
    JIG2y99LkAdepKBIUAaHxwSiLjUVPFX+8m+v5sdgcUuTMzhnF/oto59WDG7UJ5RN
    UTw+rGcZuFN8aBBcgtrLCKASjLVsAtImIESnx8uOgT+NQ2ShNtmwlbazNBmupx6e
    yChi665E30Olw5w7CCDHPKttzZozp5ZoCCTjMlYChJC9F7fr9+jmgr4cLhHrRCIt
    WtO874M6lzi45LJEPL6waLpV+DakCmss733iVwf9eb6aEvrJ6Ix2TSGiIRyLlarZ
    MSCEBEHJbEmYOxQmZtGAuTd+9dZHR1vuEQiWB9wbJaggb9K+/7aBTsHd7/h/XEvr
    DXb1wFiFmNjNytpqsIgYeOz1T3SK7vuLod9uFXSXx9TEc4/i91RuHZgQ9kBr1T8w
    BrYBzSlSAziJ5fqKthfShj1C3X3Wpj9axcVAxFyPo1h+JbamVUDAl8B5afSPtkY3
    h62TWy4n4KhYEqqBQfMi2UUNhyEvwR46MOVj9i3UrEemRnL/OtxnkitFa5mhqBi5
    eye5MeRXgYvLSRRHUoErtx3W5wcpC+8DCF/CnHLKUiyvCWD74XDhhss7TRdePcjq
    o103bXGqBvR8IjQT6WYqVZbkDkHBBIpqPrkEW/icp8MJLHsiqvI1ZR7eqXQxlHEC
    nfaCXEa0qfx+Qvd5+Ozp
    =siBw
    -----END PGP SIGNATURE-----
    
  66. martinus deleted the branch on May 22, 2018
  67. laanwj referenced this in commit 7c7508c268 on Jun 5, 2018
  68. PastaPastaPasta referenced this in commit 3301b73c37 on Jun 17, 2020
  69. PastaPastaPasta referenced this in commit 9a9aa3f467 on Jul 2, 2020
  70. PastaPastaPasta referenced this in commit 516aa1f4d5 on Jul 18, 2020
  71. PastaPastaPasta referenced this in commit f08b2f82f8 on Jul 19, 2020
  72. PastaPastaPasta referenced this in commit 7ccd49aede on Jul 19, 2020
  73. UdjinM6 referenced this in commit 83726d286e on Jul 20, 2020
  74. gades referenced this in commit cf5d8c4d57 on Jun 30, 2021
  75. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-16 18:15 UTC
