bench: Prevent thread oversubscription and decreases the variance of result values

hebasto commented at 2:26 pm on August 13, 2020: member

Split out from #18710.

Some results (borrowed from #18710): [image]

hebasto commented at 2:26 pm on August 13, 2020: member

cc @martinus @JeremyRubin

JeremyRubin commented at 3:31 pm on August 13, 2020: contributor

Hmmmm…

I’d maybe rather just abort the test and return early if only 1 thread gets made because we shouldn’t ever be running with the checkqueue on a single core machine…

DrahtBot added the label Tests on Aug 13, 2020

hebasto commented at 4:43 pm on August 13, 2020: member

@JeremyRubin

> Hmmmm…
>
> I’d maybe rather just abort the test and return early if only 1 thread gets made because we shouldn’t ever be running with the checkqueue on a single core machine…

We do not know for sure how many CPU cores/threads users dedicate to Bitcoin Core :)

With this PR benchmark results are quite consistent:

$ taskset --cpu-list 0-7 src/bench/bench_bitcoin -filter=CCheckQueueSpeedPrevectorJob
|              ns/job |               job/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|              177.62 |        5,629,902.65 |    1.9% |      0.01 | `CCheckQueueSpeedPrevectorJob`

$ taskset --cpu-list 0-3 src/bench/bench_bitcoin -filter=CCheckQueueSpeedPrevectorJob
|              ns/job |               job/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|              176.05 |        5,680,327.83 |    3.3% |      0.01 | `CCheckQueueSpeedPrevectorJob`

$ taskset --cpu-list 0-1 src/bench/bench_bitcoin -filter=CCheckQueueSpeedPrevectorJob
|              ns/job |               job/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|               55.45 |       18,034,367.56 |    2.6% |      0.00 | `CCheckQueueSpeedPrevectorJob`

$ taskset --cpu-list 0 src/bench/bench_bitcoin -filter=CCheckQueueSpeedPrevectorJob
|              ns/job |               job/s |    err% |     total | benchmark
|--------------------:|--------------------:|--------:|----------:|:----------
|               54.73 |       18,271,118.07 |    4.0% |      0.00 | `CCheckQueueSpeedPrevectorJob`

And an early return does not disable the test, right?

laanwj commented at 10:50 am on August 14, 2020: member

If it’s a benchmark that benchmarks thread synchronization/interaction (which is, I think, the only valid reason to run a benchmark multi-threaded) then I agree running it without threads makes little sense.

(And if so you could even see the “two threads on a single-core machine” case as a realistic case, because bitcoind will create threads on such a machine)

bench: Allow skip benchmark

Co-authored-by: Martin Ankerl <Martin.Ankerl@gmail.com>

ce3e6a7cb2

hebasto force-pushed on Aug 14, 2020

hebasto commented at 12:17 pm on August 14, 2020: member

Updated d786765721e02e7d2c888c8276c02064e5056593 -> a65c1adc7638c2b6f2038cbe11a637b285dd8dd2 (pr19710.01 -> pr19710.02, diff).

Addressed:

@JeremyRubin’s comment:

> Hmmmm…
>
> I’d maybe rather just abort the test and return early if only 1 thread gets made because we shouldn’t ever be running with the checkqueue on a single core machine…

@laanwj’s comment:

> If it’s a benchmark that benchmarks thread synchronization/interaction (which is, I think, the only valid reason to run a benchmark multi-threaded) then I agree running it without threads makes little sense.

laanwj added this to the "Blockers" column in a project

DrahtBot commented at 8:36 pm on August 20, 2020: member

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#18731 (refactor: Make CCheckQueue RAII-styled by hebasto)
#18710 (Add local thread pool to CCheckQueue by hebasto)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

MarcoFalke commented at 6:56 am on August 21, 2020: member

cr ACK a65c1adc7638c2b6f2038cbe11a637b285dd8dd2

in src/bench/checkqueue.cpp:31 in a65c1adc76 outdated

@@ -26,6 +24,10 @@ static const unsigned int QUEUE_BATCH_SIZE = 128;
 // and there is a little bit of work done between calls to Add.
 static void CCheckQueueSpeedPrevectorJob(benchmark::Bench& bench)
 {
+    if (GetNumCores() < 2) {
+        return;
+    }

vasild commented at 4:26 pm on August 25, 2020:

I think this warrants a comment explaining “why?”.

Also, maybe don’t drop the MIN_CORES constant and use it here instead of the magic number “2”?

in src/bench/checkqueue.cpp:50 in a65c1adc76 outdated

@@ -44,7 +46,7 @@ static void CCheckQueueSpeedPrevectorJob(benchmark::Bench& bench)
     };
     CCheckQueue<PrevectorJob> queue {QUEUE_BATCH_SIZE};
     boost::thread_group tg;
-    for (auto x = 0; x < std::max(MIN_CORES, GetNumCores()); ++x) {
+    for (auto x = 0; x < GetNumCores() - 1; ++x) {

vasild commented at 4:27 pm on August 25, 2020:

Maybe also comment here “why we do - 1”.
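The reasoning behind the `- 1` can be sketched with a little arithmetic. The snippet below is only an illustration, not code from this PR: it assumes that the thread driving the benchmark also processes checks while it waits on the queue, and that `MIN_CORES` is 2 (the "magic number" mentioned above). The helper names `TotalThreadsBefore` and `TotalThreadsAfter` are hypothetical.

```cpp
#include <algorithm>

// Assumed value, matching the "2" discussed in the review comments.
constexpr int MIN_CORES = 2;

// Before the change: max(MIN_CORES, cores) workers are spawned, and the
// main thread (assumed to take part in the work) comes on top of that.
constexpr int TotalThreadsBefore(int num_cores)
{
    return std::max(MIN_CORES, num_cores) + 1;
}

// After the change: cores - 1 workers plus the main thread, so the number
// of busy threads equals the number of cores.
constexpr int TotalThreadsAfter(int num_cores)
{
    return (num_cores - 1) + 1;
}
```

Under these assumptions, an 8-core machine previously had 9 threads competing for 8 cores, while the new loop bound yields exactly 8. One extra runnable thread per run is enough for the scheduler to introduce noise, which is consistent with the variance reduction the PR title claims.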

vasild approved

vasild commented at 4:32 pm on August 25, 2020: member

ACK a65c1adc

hebasto force-pushed on Aug 25, 2020

hebasto commented at 7:45 pm on August 25, 2020: member

Updated a65c1adc7638c2b6f2038cbe11a637b285dd8dd2 -> 4f9164389a078f64258f3d7f2ecda5ab68968e8c (pr19710.02 -> pr19710.03, diff).

Addressed @vasild’s comments:

#19710 (review):

> Also, maybe don’t drop the MIN_CORES constant and use it here instead of the magic number “2”?

#19710 (review):

> Maybe also comment here “why we do - 1”.

in src/bench/checkqueue.cpp:29 in 4f9164389a outdated

@@ -26,6 +26,10 @@ static const unsigned int QUEUE_BATCH_SIZE = 128;
 // and there is a little bit of work done between calls to Add.
 static void CCheckQueueSpeedPrevectorJob(benchmark::Bench& bench)
 {
+    if (GetNumCores() < MIN_CORES) {

promag commented at 8:38 pm on August 25, 2020:

4f9164389a078f64258f3d7f2ecda5ab68968e8c

nit, single line and also add comment.

hebasto commented at 8:47 pm on August 25, 2020:

Thanks! Updated.

promag commented at 8:38 pm on August 25, 2020: member

Concept ACK.

hebasto force-pushed on Aug 25, 2020

hebasto commented at 8:46 pm on August 25, 2020: member

Updated 4f9164389a078f64258f3d7f2ecda5ab68968e8c -> 5e7acd65443d3fc7be084ea81be3089e4b28bc9a (pr19710.03 -> pr19710.04, diff).

Addressed @promag’s comment:

> nit, single line and also add comment.

promag commented at 9:48 pm on August 25, 2020: member

Code review ACK 5e7acd65443d3fc7be084ea81be3089e4b28bc9a.

vasild approved

vasild commented at 7:14 am on August 26, 2020: member

ACK 5e7acd654

Thanks for the comments!

hebasto commented at 9:31 am on August 26, 2020: member

@MarcoFalke Mind having another look at this PR?

bench: Prevent thread oversubscription

This change decreases the variance of benchmark results.

3edc4e34fe

in src/bench/checkqueue.cpp:30 in 5e7acd6544 outdated

@@ -26,6 +26,9 @@ static const unsigned int QUEUE_BATCH_SIZE = 128;
 // and there is a little bit of work done between calls to Add.
 static void CCheckQueueSpeedPrevectorJob(benchmark::Bench& bench)
 {
+    // We shouldn't ever be running with the checkqueue on a single core machine.
+    if (GetNumCores() < MIN_CORES) return;

promag commented at 9:40 am on August 26, 2020:

Maybe ditch MIN_CORES and change this to GetNumCores() == 1?

hebasto commented at 9:48 am on August 26, 2020:

https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency:

> If the value is not well defined or not computable, returns 0.

hebasto commented at 9:48 am on August 26, 2020:

Maybe <=1 ?

hebasto commented at 10:02 am on August 26, 2020:

Thanks! Updated.
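To make the point about `== 1` versus `<= 1` concrete, here is a minimal sketch. It assumes that the core count ultimately comes from `std::thread::hardware_concurrency()`, which may return 0 when the value cannot be determined; `DetectedCores` and `ShouldSkipBenchmark` are hypothetical names used only for illustration and are not part of this PR.

```cpp
#include <thread>

// Hypothetical stand-in for a core-count query. The standard allows
// hardware_concurrency() to return 0 if the value is not computable.
unsigned int DetectedCores()
{
    return std::thread::hardware_concurrency();
}

// Skip the multi-threaded benchmark when fewer than two cores are known
// to be available. Using <= 1 instead of == 1 also covers the
// "unknown" case, where the reported count is 0.
bool ShouldSkipBenchmark(unsigned int num_cores)
{
    return num_cores <= 1;
}
```

With `== 1`, an undetermined count of 0 would slip past the guard, and a loop bounded by `num_cores - 1` would then start from a nonsensical thread count. The `<= 1` form handles both cases with one comparison.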

hebasto force-pushed on Aug 26, 2020

hebasto commented at 10:02 am on August 26, 2020: member

Updated 5e7acd65443d3fc7be084ea81be3089e4b28bc9a -> 3edc4e34fe2f92e7066c1455f5e42af2fdb43b99 (pr19710.04 -> pr19710.05, diff).

Addressed @promag’s comment:

> Maybe ditch MIN_CORES and change this to GetNumCores() == 1?

fjahr commented at 3:25 pm on August 30, 2020: member

Code review ACK 3edc4e34fe2f92e7066c1455f5e42af2fdb43b99

MarcoFalke merged this on Aug 31, 2020

MarcoFalke closed this on Aug 31, 2020

hebasto deleted the branch on Aug 31, 2020

promag commented at 10:10 am on August 31, 2020: member

ACK

laanwj removed this from the "Blockers" column in a project

sidhujag referenced this in commit bfafd05ac3 on Aug 31, 2020

DrahtBot locked this on Feb 15, 2022

bench: Prevent thread oversubscription and decreases the variance of result values #19710
