KVM-enabled containers are executed in separate VMs, which means the startup time is much slower. The Kubernetes cluster which runs regular containers was recently upgraded, so hopefully #20093 won't happen again.
[ci] Use regular containers #21473
pull fkorotkov wants to merge 4 commits into bitcoin:master from fkorotkov:patch-3 changing 2 files +2 −135-
fkorotkov commented at 1:16 AM on March 19, 2021: none
-
fanquake commented at 1:18 AM on March 19, 2021: member
We tested this in #21445, but didn't seem to see any advantage; potentially the builds were even under-resourced:
It seems this will result in a slow-down. E.g. https://cirrus-ci.com/task/6631575736549376?command=ci#L5821 vs https://cirrus-ci.com/task/5477946514210816?command=ci#L5513
Looks like it is only using 0.3 CPU even though 2 have been requested?
Is this no longer the case?
- fanquake added the label Tests on Mar 19, 2021
-
fkorotkov commented at 1:30 AM on March 19, 2021: none
Didn't know you already tried it. Seems weird, since the new cluster is using much faster CPUs (they have better single-core performance, 3.1 GHz vs 2.6 GHz). Let's see how this run performs.
BTW how does your build system detect how many cores are available? Could it be that it detects the count wrongly and tries to use the 32 cores of the host VM, which overwhelms the actual 2 cores? 🤔
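To make the question concrete, here is a minimal Python sketch of how the mismatch could be checked from inside a container. It compares the core count the process sees with the CPU limit set by cgroups. The helper names `parse_cpu_max` and `cgroup_cpu_limit` are hypothetical, and the sketch assumes the standard cgroup files are mounted under `/sys/fs/cgroup`; it is not taken from the project's ci scripts.

```python
import os
from pathlib import Path


def parse_cpu_max(text):
    """Parse cgroup v2 cpu.max content: "<quota> <period>" or "max <period>"."""
    quota, period = text.split()
    if quota == "max":
        return None  # no CPU limit configured
    return int(quota) / int(period)


def cgroup_cpu_limit(root="/sys/fs/cgroup"):
    """Return the cgroup CPU limit in cores, or None if unlimited or not found.

    Assumes the usual mount point; `root` is a parameter so it can be overridden.
    """
    root = Path(root)
    v2_file = root / "cpu.max"  # cgroup v2
    if v2_file.exists():
        return parse_cpu_max(v2_file.read_text())
    quota_file = root / "cpu" / "cpu.cfs_quota_us"  # cgroup v1
    period_file = root / "cpu" / "cpu.cfs_period_us"
    if quota_file.exists() and period_file.exists():
        quota = int(quota_file.read_text())
        period = int(period_file.read_text())
        if quota > 0 and period > 0:  # a quota of -1 means "unlimited"
            return quota / period
    return None


if __name__ == "__main__":
    # os.cpu_count() reports logical CPUs of the machine, not the cgroup quota.
    print("cores visible to the process:", os.cpu_count())
    print("cgroup CPU limit (cores):", cgroup_cpu_limit())
```

If the first number is 32 and the second is 2, anything that sizes its worker pool from the visible core count would oversubscribe the container.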
-
MarcoFalke commented at 6:37 AM on March 19, 2021: member
Our ci scripts use MAKEJOBS to pass down to make and the python tests. It is set to "-j4", so that the 2 CPU we requested are filled most of the time.
I believe the difference between the regular containers and the kvm cluster is that the kvm will spin up a "private" vm with the requested CPUs, whereas the regular cluster will use a large shared vm. So if 20 Cirrus users compete for the SSD on the shared vm, it might be faster to use a HDD on the private vm. The CPU will wait idle while the SSD is working.
-
hebasto commented at 10:46 AM on March 19, 2021: member
Still takes more time than the master branch.
-
6e4206038e
[ci] Use regular containers
KVM enabled containers are executed in separate VMs which means the startup time is much slower. Kubernetes cluster which runs these containers was recently upgraded so hopefully #20093 won't happen again.
-
fkorotkov commented at 11:18 AM on March 19, 2021: none
Still takes more time than the master branch.
Indeed. This is very surprising to me, since the local SSDs on the VMs are 24 times faster than an HDD and the CPUs are faster. On paper everything is faster, and indeed a few tasks were faster:
<img width="1555" alt="Screen Shot 2021-03-19 at 7 06 44 AM" src="https://user-images.githubusercontent.com/989066/111772318-1a674180-8883-11eb-8f91-78212acae87c.png">
Let me try to rebase and see how a new build performs. I'm very curious about this situation and I'd like to investigate a bit more. Maybe there is some room for improvement for Cirrus. 🤔
- fkorotkov force-pushed on Mar 19, 2021
-
fkorotkov commented at 12:45 PM on March 19, 2021: none
I think my initial feeling was right. If I understood the test code correctly, these are like integration tests where you have a bunch of nodes running and you verify the results at the end. I think the C++ code here gets the CPU count wrong when running in a container. With the KVM containers, the VM running the container is the same size as the container, so it works. I think in this case each node in the integration tests is trying to use 32 cores even though it's throttled by cgroups. Do you know if there is an easy way to verify my theory and see how many CPUs TestNodes are trying to use during tests?
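One rough way to check this on Linux, sketched in Python: read the `Threads:` field from `/proc/<pid>/status` for each running node process. The helpers `parse_thread_count` and `thread_count` are hypothetical names, not part of the test framework. A thread count is only a proxy: it shows how many threads a node spawned, not how many CPUs it is actually using.

```python
import os


def parse_thread_count(status_text):
    """Extract the "Threads:" value from the content of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("Threads:"):
            return int(line.split(":", 1)[1])
    raise ValueError("no 'Threads:' line found")


def thread_count(pid):
    """Return the number of threads of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_thread_count(f.read())


if __name__ == "__main__":
    # Demo on the current process; for a test node, pass that node's PID instead.
    if os.path.exists("/proc/self/status"):
        print("threads in this process:", thread_count(os.getpid()))
```

If the count per node is much higher in the regular container than in the KVM container for the same test, that would support the theory that thread pools are being sized from the host's core count.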
-
Explicitly set parallelism in tests in case of running inside of a Docker container b231be2051
-
Hardcode two threads in tests 0587188971
-
Hardcode CPUs in C++ for testing 77a7d97efa
-
fkorotkov commented at 6:07 PM on March 19, 2021: none
I've tried to hardcode the CPU amount but still got slower performance for multiprocess. I'm a bit confused how it can be slower given that on paper everything should be faster. 🤷‍♂️
I don't have any other ideas to try so will close the PR. Sorry for the noise.
- fkorotkov closed this on Mar 19, 2021
- fkorotkov deleted the branch on Mar 19, 2021
-
MarcoFalke commented at 11:25 AM on April 9, 2021: member
(This has been merged in #21445 (comment))
- DrahtBot locked this on Aug 16, 2022