Cirrus CI will be capping free compute soon. For now, switch more tasks to persistent workers, as recommended by Cirrus CI.
(See slightly related discussion in #28098)
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
See the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
Looks like there is an intermittent issue, which is fixed in podman 4.1:
#135258 REDUCE cov: 2818 ft: 7924 corp: 483/10261b lim: 254 exec/s: 3468 rss: 251Mb L: 15/177 MS: 1 EraseBytes-
#135494 NEWError: timed out waiting for file /var/lib/containers/storage/overlay-containers/2b5173104c7716f28471c2aed46932cd57b0904326ef3b5e3c9c0462dad553a6/userdata/75976ef6638693018268a2b4187292027a833405bd50e9ab4a3ddccae789be0b/exit/2b5173104c7716f28471c2aed46932cd57b0904326ef3b5e3c9c0462dad553a6: internal libpod error
Exit status: 255
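To tell whether a worker is still on a podman release older than the 4.1 mentioned above, a version comparison along these lines might help. This is only a sketch: the helper name `version_ge` is hypothetical, it relies on GNU `sort -V`, and the version string is passed in as an argument rather than read from the worker.

```shell
#!/usr/bin/env bash
# version_ge VERSION MINIMUM: succeed if VERSION >= MINIMUM.
# Hypothetical helper; relies on GNU coreutils `sort -V` (version sort).
version_ge() {
  local ver="$1" min="$2"
  # If MINIMUM sorts first (or both are equal), VERSION is new enough.
  [ "$(printf '%s\n%s\n' "$min" "$ver" | sort -V | head -n1)" = "$min" ]
}

# Example with hard-coded version strings (not taken from any real worker).
for v in 3.4.4 4.1.0 4.10.2; do
  if version_ge "$v" 4.1; then
    echo "$v: at least 4.1"
  else
    echo "$v: older than 4.1"
  fi
done
```

On a real worker, the first argument could be taken from the output of `podman --version`.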
This allows dropping unused templates, such as
cirrus_ephemeral_worker_template_env or container_depends_template.
Also, ccache_cache, previous_releases_cache, and
base_depends_built_cache can be dropped, because the caching is done in
container volumes on the self-hosted runners.
The exact distro name should not be important. Also, it is easy to find
out, if needed. Thus, remove it to avoid bloat and the maintenance
overhead of keeping it in sync.
Added a description to the commits, to make it easier to review.
ACK fa8e89d5e48c4554eddef611eb002b61f3305272
concept ACK fa8e89d5e48c4554eddef611eb002b61f3305272
Questions:
Is Win64 native still running on Cirrus? The code changes look like it shouldn't be, but it has a warning:
Monthly free compute limit exceeded and will be limited next month!
The MSan task took 2.5 hours! yow. Is the bottleneck there just hardware?
Is Win64 native still running on Cirrus? The code changes look like it shouldn't be, but it has a warning:
Yes, sorry for the confusion. This is only about Linux. I've adjusted the title. There is another pull about the msvc on Windows.
ACK fa8e89d5e48c4554eddef611eb002b61f3305272.
Observing a very low ccache hit rate. For example, in https://cirrus-ci.com/task/6728143201894400:
ccache version 3.7.7
cache directory                     /tmp/ccache_dir
primary config                      /tmp/ccache_dir/ccache.conf
secondary config      (readonly)    /etc/ccache.conf
stats updated                       Fri Aug 18 17:06:37 2023
cache hit (direct)                   110
cache hit (preprocessed)              15
cache miss                           737
cache hit rate                     14.50 %
called for link                       12
cleanups performed                     0
files in cache                      1489
cache size                          33.6 MB
max cache size                     200.0 MB
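As a sanity check on the numbers above, the reported hit rate appears to be the two hit counters divided by hits plus misses. A minimal sketch, assuming that formula (the function name `ccache_hit_rate` is made up for illustration):

```shell
#!/usr/bin/env bash
# ccache_hit_rate DIRECT PREPROCESSED MISS: print the hit rate in percent.
# Assumes rate = (direct + preprocessed) / (direct + preprocessed + miss).
ccache_hit_rate() {
  awk -v d="$1" -v p="$2" -v m="$3" 'BEGIN {
    total = d + p + m
    if (total == 0) { print "0.00"; exit }   # avoid division by zero
    printf "%.2f\n", 100 * (d + p) / total
  }'
}

# Counters from the stats output above: 110 direct, 15 preprocessed, 737 misses.
ccache_hit_rate 110 15 737
```

With these counters, that is 125 hits out of 862 lookups, which matches the 14.50 % in the stats.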
I wonder why the hit rate is non-zero. On the first run, the cache is empty, obviously. Maybe a leftover ccache from a previous push to this pull request?
I wonder why the hit rate is non-zero. On the first run, the cache is empty, obviously. Maybe a leftover ccache from a previous push to this pull request?
Can you please re-run all Cirrus jobs to test caching facilities?
Can you please re-run all Cirrus jobs to test caching facilities?
There are multiple workers per label type and multiple tasks using the same label type. Thus, it will take many re-runs to populate the initial cache on all workers.
I could limit the label type to each task, but that would be micro-management. Also, free resources on one label type cannot be used on another label type.
Can you please re-run all Cirrus jobs to test caching facilities?
There are multiple workers per label type and multiple tasks using the same label type. Thus, it will take many re-runs to populate the initial cache on all workers.
Fair enough. We can observe caching quality in-progress later.
There are multiple workers per label type and multiple tasks using the same label type.
Sounds like we need to increase CCACHE_MAXSIZE to cope with such usage, no?
Sounds like we need to increase CCACHE_MAXSIZE to cope with such usage, no?
Why? Each task has its own name (space). The CI is now doing exactly what happens when you run the CI locally.
Sounds like we need to increase CCACHE_MAXSIZE to cope with such usage, no?
Why? Each task has its own name (space). The CI is now doing exactly what happens when you run the CI locally.
IIUC, multiple tasks share the same CCACHE_DIR. If they use different compilers or different compiler flags, they need more space so that one task does not purge cached items from another, no?
Sounds like we need to increase CCACHE_MAXSIZE to cope with such usage, no?
Why? Each task has its own name (space). The CI is now doing exactly what happens when you run the CI locally.
IIUC, multiple tasks share the same CCACHE_DIR. If they use different compilers or different compiler flags, they need more space so that one task does not purge cached items from another, no?
No, and as I said, this is unrelated to the changes here. If there is a bug in the CI system when running locally, it should be fixed separately, not as part of this pull.
See also:
$ git grep _ccache ci
ci/test/04_install.sh: docker volume create "${CONTAINER_NAME}_ccache" || true
ci/test/04_install.sh: --mount "type=volume,src=${CONTAINER_NAME}_ccache,dst=$CCACHE_DIR" \
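To illustrate the name (space) point, here is a rough sketch of how the `${CONTAINER_NAME}_ccache` pattern from the grep output gives each container name its own volume. The helper functions and the container names `ci_task_a` and `ci_task_b` are hypothetical; only the naming pattern and the mount syntax come from the lines above.

```shell
#!/usr/bin/env bash
# Derive the ccache volume name from a container name, mirroring the
# "${CONTAINER_NAME}_ccache" pattern seen in ci/test/04_install.sh.
ccache_volume_name() {
  printf '%s_ccache\n' "$1"
}

# Print the --mount argument that would attach that volume at the given path.
ccache_mount_arg() {
  printf 'type=volume,src=%s,dst=%s\n' "$(ccache_volume_name "$1")" "$2"
}

# Hypothetical container names, for illustration only.
for name in ci_task_a ci_task_b; do
  ccache_mount_arg "$name" /tmp/ccache_dir
done
```

Both tasks mount a volume at the same path inside the container, but the volumes themselves differ, so under this assumption one task's cache would not evict entries from another's.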
https://github.com/bitcoin/bitcoin/runs/16142913468 -- timeouts. Probably unrelated...
yeah, the blob is missing.
Fetching xcb-proto-1.15.2.tar.xz from https://bitcoincore.org/depends-sources
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0  222k    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
curl: (22) The requested URL returned error: 404
It would be good to switch the mirror URL to something that supports ipv6 and is maintained. Hint: https://github.com/bitcoin/bitcoin/pull/17704
Apart from the blob missing, this seems to be an odd issue, because the download works outside of podman, and inside podman any other site works as well:
# podman run --rm -it ubuntu:mantic bash -c 'apt update && apt install curl -y && curl --location --fail --connect-timeout 3 --retry 0 -o /tmp/ab http://xorg.freedesktop.org/archive/individual/proto/xcb-proto-1.15.2.tar.xz '
curl: (28) Failed to connect to xorg.freedesktop.org port 80 after 3001 ms: Timeout was reached
# podman run --rm -it ubuntu:mantic bash -c 'apt update && apt install curl -y && curl --location --fail --connect-timeout 3 --retry 0 -o /tmp/ab https://drahtbot.space/depends_download_fallback/xcb-proto-1.15.2.tar.xz '
100 144k 100 144k 0 0 614k 0 --:--:-- --:--:-- --:--:-- 612k
# curl --location --fail --connect-timeout 3 --retry 0 -o /tmp/ab http://xorg.freedesktop.org/archive/individual/proto/xcb-proto-1.15.2.tar.xz
100 144k 100 144k 0 0 118k 0 0:00:01 0:00:01 --:--:-- 208k
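For reference, the try-one-source-then-the-next behaviour discussed here could be sketched roughly as below. This is not the actual depends download logic; the function names `fetch_one` and `download_with_fallback` are hypothetical, and only the curl flags are copied from the commands above.

```shell
#!/usr/bin/env bash
# fetch_one URL DEST: thin wrapper around curl, using the flags from the
# commands above. Kept separate so it can be swapped out (e.g. in tests).
fetch_one() {
  curl --location --fail --connect-timeout 3 --retry 0 -o "$2" "$1"
}

# download_with_fallback DEST URL...: try each URL in order and stop at
# the first one that succeeds. Returns non-zero if every URL fails.
download_with_fallback() {
  local dest="$1"
  shift
  local url
  for url in "$@"; do
    if fetch_one "$url" "$dest"; then
      echo "fetched from $url"
      return 0
    fi
    echo "failed: $url" >&2
  done
  return 1
}
```

A call would pass the destination first and then the candidate URLs in order of preference, for example the upstream URL followed by a fallback mirror.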
Milestone
26.0