ci: Switch remaining Linux tasks to self-hosted #21652

maflcko commented at 11:16 AM on April 11, 2021: member

Cirrus CI will be capping the free compute soon. For now, switch more tasks to persistent workers, as recommended by Cirrus CI.

(See slightly related discussion in #28098)

fanquake added the label Tests on Apr 11, 2021

maflcko force-pushed on Apr 11, 2021

DrahtBot commented at 3:24 PM on April 11, 2021: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	dergoegge, pinheadmz, hebasto

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#28210 (build: Bump minimum supported Clang to clang-13 by MarcoFalke)
#28173 (ci: Run Windows native task on GitHub Actions by hebasto)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

maflcko force-pushed on Apr 14, 2021

maflcko force-pushed on Apr 17, 2021

maflcko force-pushed on Jun 4, 2021

DrahtBot added the label Needs rebase on Dec 13, 2021

maflcko closed this on Dec 27, 2021

maflcko deleted the branch on Dec 27, 2021

bitcoin locked this on Dec 27, 2022

bitcoin unlocked this on Jul 24, 2023

maflcko restored the branch on Jul 24, 2023

maflcko renamed this:
~~[WIP NOMERGE DRAFT] ci: Switch more tasks to self-hosted~~
ci: Switch more tasks to self-hosted
on Jul 24, 2023

maflcko reopened this on Jul 24, 2023

maflcko force-pushed on Jul 24, 2023

DrahtBot added the label CI failed on Jul 24, 2023

DrahtBot removed the label Needs rebase on Jul 24, 2023

maflcko force-pushed on Jul 28, 2023

maflcko force-pushed on Jul 29, 2023

maflcko force-pushed on Jul 30, 2023

maflcko commented at 10:07 AM on July 30, 2023: member

Looks like there is an intermittent issue, which is fixed in podman 4.1:

#135258	REDUCE cov: 2818 ft: 7924 corp: 483/10261b lim: 254 exec/s: 3468 rss: 251Mb L: 15/177 MS: 1 EraseBytes-
#135494	NEWError: timed out waiting for file /var/lib/containers/storage/overlay-containers/2b5173104c7716f28471c2aed46932cd57b0904326ef3b5e3c9c0462dad553a6/userdata/75976ef6638693018268a2b4187292027a833405bd50e9ab4a3ddccae789be0b/exit/2b5173104c7716f28471c2aed46932cd57b0904326ef3b5e3c9c0462dad553a6: internal libpod error

Exit status: 255

maflcko force-pushed on Jul 31, 2023

maflcko force-pushed on Aug 4, 2023

maflcko force-pushed on Aug 9, 2023

maflcko force-pushed on Aug 15, 2023

DrahtBot added the label Needs rebase on Aug 15, 2023

maflcko force-pushed on Aug 16, 2023

DrahtBot removed the label Needs rebase on Aug 16, 2023

DrahtBot removed the label CI failed on Aug 16, 2023

DrahtBot added the label Needs rebase on Aug 16, 2023

maflcko force-pushed on Aug 16, 2023

DrahtBot added the label CI failed on Aug 16, 2023

DrahtBot removed the label Needs rebase on Aug 16, 2023

maflcko marked this as ready for review on Aug 17, 2023

maflcko force-pushed on Aug 17, 2023

maflcko force-pushed on Aug 18, 2023

ci: Switch remaining tasks to self-hosted

This allows dropping unused templates, such as
cirrus_ephemeral_worker_template_env, or container_depends_template.

Also, ccache_cache, previous_releases_cache, and
base_depends_built_cache can be dropped, because the caching is done in
container volumes on the self-hosted runners.

fad006fa0a

ci: Remove distro-name from task name

The exact distro name should not be important. Also, it is easy to find
out, if needed. Thus, remove it to avoid bloat and the maintenance overhead
of having to keep it in sync.

fa8e89d5e4

maflcko force-pushed on Aug 18, 2023

maflcko commented at 4:18 PM on August 18, 2023: member

Added a description to the commits, to make it easier to review.

maflcko added this to the milestone 26.0 on Aug 18, 2023

DrahtBot removed the label CI failed on Aug 18, 2023

maflcko requested review from hebasto on Aug 21, 2023

maflcko requested review from pinheadmz on Aug 21, 2023

maflcko renamed this:
~~ci: Switch more tasks to self-hosted~~
ci: Switch remaining tasks to self-hosted
on Aug 22, 2023

dergoegge approved

dergoegge commented at 11:55 AM on August 22, 2023: member

ACK fa8e89d5e48c4554eddef611eb002b61f3305272

pinheadmz commented at 5:09 PM on August 22, 2023: member

concept ACK fa8e89d5e48c4554eddef611eb002b61f3305272

Questions:

Is Win64 native still running on Cirrus? The code changes look like it shouldn't be, but it has a warning:

Monthly free compute limit exceeded and will be limited next month!

The MSan task took 2.5 hours! yow. Is the bottleneck there just hardware?

DrahtBot removed review request from pinheadmz on Aug 22, 2023

maflcko renamed this:
~~ci: Switch remaining tasks to self-hosted~~
ci: Switch remaining Linux tasks to self-hosted
on Aug 23, 2023

maflcko commented at 7:44 AM on August 23, 2023: member

Is Win64 native still running on Cirrus? The code changes look like it shouldn't be, but it has a warning:

Yes, sorry for the confusion. This is only about Linux. I've adjusted the title. There is another pull request about MSVC on Windows.

maflcko commented at 7:45 AM on August 23, 2023: member

The MSan task took 2.5 hours! yow. Is the bottleneck there just hardware?

On the first run it will build llvm/clang/msan + depends + Bitcoin Core on a fresh cache. See also the comment in the cirrus yaml.

maflcko requested review from fanquake on Aug 23, 2023

hebasto approved

hebasto commented at 10:30 AM on August 23, 2023: member

ACK fa8e89d5e48c4554eddef611eb002b61f3305272.

I am observing a very low ccache hit rate. For example, in https://cirrus-ci.com/task/6728143201894400:

ccache version 3.7.7
cache directory                     /tmp/ccache_dir
primary config                      /tmp/ccache_dir/ccache.conf
secondary config      (readonly)    /etc/ccache.conf
stats updated                       Fri Aug 18 17:06:37 2023
cache hit (direct)                   110
cache hit (preprocessed)              15
cache miss                           737
cache hit rate                     14.50 %
called for link                       12
cleanups performed                     0
files in cache                      1489
cache size                          33.6 MB
max cache size                     200.0 MB
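
For reference, the 14.50 % figure in the stats above can be reproduced from the three hit/miss counters. The following is a minimal sketch, assuming the hit rate is computed as all hits (direct plus preprocessed) divided by hits plus misses; the function name is illustrative and not part of ccache.

```python
def ccache_hit_rate(direct_hits: int, preprocessed_hits: int, misses: int) -> float:
    """Return the cache hit rate in percent.

    Assumption: hit rate = hits / (hits + misses), where hits counts both
    direct and preprocessed hits.
    """
    hits = direct_hits + preprocessed_hits
    total = hits + misses
    if total == 0:
        # No compilations were looked up, so there is no meaningful rate.
        return 0.0
    return 100.0 * hits / total


# Counters taken from the stats output quoted in this comment.
rate = ccache_hit_rate(direct_hits=110, preprocessed_hits=15, misses=737)
print(f"{rate:.2f} %")  # 14.50 %
```

In other words, 125 of 862 lookups were served from the cache.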

maflcko commented at 10:35 AM on August 23, 2023: member

I wonder why the hit rate is non-zero. On the first run, the cache is empty, obviously. Maybe a leftover ccache from a previous push to this pull request?

hebasto commented at 10:38 AM on August 23, 2023: member

@MarcoFalke

I wonder why the hit rate is non-zero. On the first run, the cache is empty, obviously. Maybe a leftover ccache from a previous push to this pull request?

Can you please re-run all Cirrus jobs to test caching facilities?

maflcko commented at 10:51 AM on August 23, 2023: member

Can you please re-run all Cirrus jobs to test caching facilities?

There are multiple workers per label type and multiple tasks using the same label type. Thus, it will take many re-runs to populate the initial cache on all workers.
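
To get a rough feel for "many re-runs", here is a minimal sketch under assumptions that the thread does not confirm: each run of a task is assigned uniformly at random to one of the workers sharing its label, and a worker's cache for that task is warm after one run there. Under those assumptions this is the classic coupon-collector problem. The worker counts below are placeholders, not the real pool sizes.

```python
from fractions import Fraction


def expected_runs_to_warm_all(workers: int) -> Fraction:
    """Expected number of runs of one task until every worker has run it once.

    Assumes uniform random assignment of runs to workers (coupon collector):
    E = W * (1 + 1/2 + ... + 1/W).
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    harmonic = sum(Fraction(1, k) for k in range(1, workers + 1))
    return workers * harmonic


# Hypothetical pool sizes, for illustration only.
for w in (1, 2, 4):
    print(w, float(expected_runs_to_warm_all(w)))
```

The expected count grows faster than the number of workers, and it applies to each task separately, which is consistent with a single re-run of all jobs not being enough to judge the cache.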

maflcko commented at 10:52 AM on August 23, 2023: member

I could limit the label type to each task, but that would be micro-management. And free resources on one label type cannot be used on another label type.

hebasto commented at 10:53 AM on August 23, 2023: member

Can you please re-run all Cirrus jobs to test caching facilities?

There are multiple workers per label type and multiple tasks using the same label type. Thus, it will take many re-runs to populate the initial cache on all workers.

Fair enough. We can observe caching quality in-progress later.

hebasto commented at 10:56 AM on August 23, 2023: member

There are multiple workers per label type and multiple tasks using the same label type.

Sounds like we need to increase CCACHE_MAXSIZE to cope with such usage, no?

maflcko commented at 11:02 AM on August 23, 2023: member

Sounds like we need to increase CCACHE_MAXSIZE to cope with such usage, no?

Why? Each task has its own name (space). The CI is now doing exactly what happens when you run the CI locally.
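
One way to picture "each task has its own name (space)" is sketched below. It assumes the per-task isolation comes from separate container volumes (the commit message says caching is done in container volumes on the self-hosted runners), all mounted at the same in-container path. The naming scheme and function names are hypothetical and are not taken from the CI scripts.

```python
# In-container cache path, as shown in the ccache stats quoted in this thread.
IN_CONTAINER_CCACHE_DIR = "/tmp/ccache_dir"


def ccache_volume_name(task_name: str) -> str:
    """Derive a per-task volume name (hypothetical naming scheme)."""
    safe = "".join(c if c.isalnum() else "_" for c in task_name.lower())
    return f"ci_{safe}_ccache"


def ccache_mount(task_name: str) -> tuple[str, str]:
    """Return (backing volume, in-container path) for a task."""
    return ccache_volume_name(task_name), IN_CONTAINER_CCACHE_DIR


# Hypothetical task names: same path inside the container, different storage.
for task in ("msan", "tsan", "fuzzer"):
    volume, path = ccache_mount(task)
    print(f"{task}: {volume} -> {path}")
```

Under this model, two tasks see the same CCACHE_DIR path but never share cache entries, so each task gets the full CCACHE_MAXSIZE budget to itself.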

hebasto commented at 11:12 AM on August 23, 2023: member

Sounds like we need to increase CCACHE_MAXSIZE to cope with such usage, no?

Why? Each task has its own name (space). The CI is now doing exactly what happens when you run the CI locally.

IIUC, multiple tasks share the same CCACHE_DIR. If they use different compilers or different compiler flags, they need more space to avoid purging cached items from another task, no?

maflcko commented at 11:31 AM on August 23, 2023: member

Sounds like we need to increase CCACHE_MAXSIZE to cope with such usage, no?

Why? Each task has its own name (space). The CI is now doing exactly what happens when you run the CI locally.

IIUC, multiple tasks share the same CCACHE_DIR. If they use different compilers or different compiler flags, they need more space to avoid purging cached items from another task, no?

No, and as I said this is unrelated to the changes here. If there is a bug in the CI system when running locally, it should be fixed separately, not as part of this pull.