bench: Prevent thread oversubscription and decreases the variance of result values #19710

pull hebasto wants to merge 2 commits into bitcoin:master from hebasto:200813-var changing 2 files +10 −4
  1. hebasto commented at 2:26 pm on August 13, 2020: member

    Split out from #18710.

    Some results (borrowed from #18710): 89121718-a3329800-d4c1-11ea-8bd1-66da20619696

  2. hebasto commented at 2:26 pm on August 13, 2020: member
  3. JeremyRubin commented at 3:31 pm on August 13, 2020: contributor

    Hmmmm…

    I’d maybe rather just abort the test and return early if only 1 thread gets made because we shouldn’t ever be running with the checkqueue on a single core machine…

  4. DrahtBot added the label Tests on Aug 13, 2020
  5. hebasto commented at 4:43 pm on August 13, 2020: member

    @JeremyRubin

    Hmmmm…

    I’d maybe rather just abort the test and return early if only 1 thread gets made because we shouldn’t ever be running with the checkqueue on a single core machine…

    We do not know for sure who many CPU cores/threads users dedicate to Bitcoin Core :)

    With this PR benchmark results are quite consistent:

    0$ taskset --cpu-list 0-7 src/bench/bench_bitcoin -filter=CCheckQueueSpeedPrevectorJob
    1|              ns/job |               job/s |    err% |     total | benchmark
    2|--------------------:|--------------------:|--------:|----------:|:----------
    3|              177.62 |        5,629,902.65 |    1.9% |      0.01 | `CCheckQueueSpeedPrevectorJob`
    
    0$ taskset --cpu-list 0-3 src/bench/bench_bitcoin -filter=CCheckQueueSpeedPrevectorJob
    1|              ns/job |               job/s |    err% |     total | benchmark
    2|--------------------:|--------------------:|--------:|----------:|:----------
    3|              176.05 |        5,680,327.83 |    3.3% |      0.01 | `CCheckQueueSpeedPrevectorJob`
    
    0$ taskset --cpu-list 0-1 src/bench/bench_bitcoin -filter=CCheckQueueSpeedPrevectorJob
    1|              ns/job |               job/s |    err% |     total | benchmark
    2|--------------------:|--------------------:|--------:|----------:|:----------
    3|               55.45 |       18,034,367.56 |    2.6% |      0.00 | `CCheckQueueSpeedPrevectorJob`
    
    0$ taskset --cpu-list 0 src/bench/bench_bitcoin -filter=CCheckQueueSpeedPrevectorJob
    1|              ns/job |               job/s |    err% |     total | benchmark
    2|--------------------:|--------------------:|--------:|----------:|:----------
    3|               54.73 |       18,271,118.07 |    4.0% |      0.00 | `CCheckQueueSpeedPrevectorJob`
    

    And early return do not disable the test, right?

  6. laanwj commented at 10:50 am on August 14, 2020: member

    If it’s a benchmark that benchmarks thread synchronization/interaction (which is, I think, the only valid reason to run a benchmark multi-threaded) then I agree running it without threads makes little sense.

    (And if so you could even see the “two threads on a single-core machine” case as a realistic case, because bitcoind will create threads on such a machine)

  7. bench: Allow skip benchmark
    Co-authored-by: Martin Ankerl <Martin.Ankerl@gmail.com>
    ce3e6a7cb2
  8. hebasto force-pushed on Aug 14, 2020
  9. hebasto commented at 12:17 pm on August 14, 2020: member

    Updated d786765721e02e7d2c888c8276c02064e5056593 -> a65c1adc7638c2b6f2038cbe11a637b285dd8dd2 (pr19710.01 -> pr19710.02, diff).

    Addressed:

    Hmmmm…

    I’d maybe rather just abort the test and return early if only 1 thread gets made because we shouldn’t ever be running with the checkqueue on a single core machine…

    If it’s a benchmark that benchmarks thread synchronization/interaction (which is, I think, the only valid reason to run a benchmark multi-threaded) then I agree running it without threads makes little sense.

  10. laanwj added this to the "Blockers" column in a project

  11. DrahtBot commented at 8:36 pm on August 20, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #18731 (refactor: Make CCheckQueue RAII-styled by hebasto)
    • #18710 (Add local thread pool to CCheckQueue by hebasto)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  12. MarcoFalke commented at 6:56 am on August 21, 2020: member
    cr ACK a65c1adc7638c2b6f2038cbe11a637b285dd8dd2
  13. in src/bench/checkqueue.cpp:31 in a65c1adc76 outdated
    23@@ -26,6 +24,10 @@ static const unsigned int QUEUE_BATCH_SIZE = 128;
    24 // and there is a little bit of work done between calls to Add.
    25 static void CCheckQueueSpeedPrevectorJob(benchmark::Bench& bench)
    26 {
    27+    if (GetNumCores() < 2) {
    28+        return;
    29+    }
    


    vasild commented at 4:26 pm on August 25, 2020:

    I think this warrants a comment explaining “why?”.

    Also, maybe don’t drop the MIN_CORES constant and use it here instead of the magic number “2”?

  14. in src/bench/checkqueue.cpp:50 in a65c1adc76 outdated
    45@@ -44,7 +46,7 @@ static void CCheckQueueSpeedPrevectorJob(benchmark::Bench& bench)
    46     };
    47     CCheckQueue<PrevectorJob> queue {QUEUE_BATCH_SIZE};
    48     boost::thread_group tg;
    49-    for (auto x = 0; x < std::max(MIN_CORES, GetNumCores()); ++x) {
    50+    for (auto x = 0; x < GetNumCores() - 1; ++x) {
    


    vasild commented at 4:27 pm on August 25, 2020:
    Maybe also comment here “why we do - 1”.
  15. vasild approved
  16. vasild commented at 4:32 pm on August 25, 2020: member
    ACK a65c1adc
  17. hebasto force-pushed on Aug 25, 2020
  18. hebasto commented at 7:45 pm on August 25, 2020: member

    Updated a65c1adc7638c2b6f2038cbe11a637b285dd8dd2 -> 4f9164389a078f64258f3d7f2ecda5ab68968e8c (pr19710.02 -> pr19710.03, diff).

    Addressed @vasild’s comments:

    Also, maybe don’t drop the MIN_CORES constant and use it here instead of the magic number “2”?

    Maybe also comment here “why we do - 1”.

  19. in src/bench/checkqueue.cpp:29 in 4f9164389a outdated
    25@@ -26,6 +26,10 @@ static const unsigned int QUEUE_BATCH_SIZE = 128;
    26 // and there is a little bit of work done between calls to Add.
    27 static void CCheckQueueSpeedPrevectorJob(benchmark::Bench& bench)
    28 {
    29+    if (GetNumCores() < MIN_CORES) {
    


    promag commented at 8:38 pm on August 25, 2020:

    4f9164389a078f64258f3d7f2ecda5ab68968e8c

    nit, single line and also add comment.


    hebasto commented at 8:47 pm on August 25, 2020:
    Thanks! Updated.
  20. promag commented at 8:38 pm on August 25, 2020: member
    Concept ACK.
  21. hebasto force-pushed on Aug 25, 2020
  22. hebasto commented at 8:46 pm on August 25, 2020: member

    Updated 4f9164389a078f64258f3d7f2ecda5ab68968e8c -> 5e7acd65443d3fc7be084ea81be3089e4b28bc9a (pr19710.03 -> pr19710.04, diff).

    Addressed @promag’s comment:

    nit, single line and also add comment.

  23. promag commented at 9:48 pm on August 25, 2020: member
    Code review ACK 5e7acd65443d3fc7be084ea81be3089e4b28bc9a.
  24. vasild approved
  25. vasild commented at 7:14 am on August 26, 2020: member

    ACK 5e7acd654

    Thanks for the comments!

  26. hebasto commented at 9:31 am on August 26, 2020: member
    @MarcoFalke Mind having another look at this PR?
  27. bench: Prevent thread oversubscription
    This change decreases the variance of benchmark results.
    3edc4e34fe
  28. in src/bench/checkqueue.cpp:30 in 5e7acd6544 outdated
    25@@ -26,6 +26,9 @@ static const unsigned int QUEUE_BATCH_SIZE = 128;
    26 // and there is a little bit of work done between calls to Add.
    27 static void CCheckQueueSpeedPrevectorJob(benchmark::Bench& bench)
    28 {
    29+    // We shouldn't ever be running with the checkqueue on a single core machine.
    30+    if (GetNumCores() < MIN_CORES) return;
    


    promag commented at 9:40 am on August 26, 2020:
    Maybe ditch MIN_CORES and change this to GetNumCores() == 1?

    hebasto commented at 9:48 am on August 26, 2020:

    https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency:

    If the value is not well defined or not computable, returns ​0​.


    hebasto commented at 9:48 am on August 26, 2020:
    Maybe <=1 ?

    hebasto commented at 10:02 am on August 26, 2020:
    Thanks! Updated.
  29. hebasto force-pushed on Aug 26, 2020
  30. hebasto commented at 10:02 am on August 26, 2020: member

    Updated 5e7acd65443d3fc7be084ea81be3089e4b28bc9a -> 3edc4e34fe2f92e7066c1455f5e42af2fdb43b99 (pr19710.04 -> pr19710.05, diff).

    Addressed @promag’s comment:

    Maybe ditch MIN_CORES and change this to GetNumCores() == 1?

  31. fjahr commented at 3:25 pm on August 30, 2020: member
    Code review ACK 3edc4e34fe2f92e7066c1455f5e42af2fdb43b99
  32. MarcoFalke merged this on Aug 31, 2020
  33. MarcoFalke closed this on Aug 31, 2020

  34. hebasto deleted the branch on Aug 31, 2020
  35. promag commented at 10:10 am on August 31, 2020: member
    ACK
  36. laanwj removed this from the "Blockers" column in a project

  37. sidhujag referenced this in commit bfafd05ac3 on Aug 31, 2020
  38. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 21:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me