ci: Move more tasks to GHA? #30304

issue maflcko opened this issue on June 19, 2024
  1. maflcko commented at 8:01 am on June 19, 2024: member

    Motivated by #29274 to make it easier to run the CI on forks, more tasks could be moved to GHA, similar to d97ddbe797f5b8b3bca0ee71b692e542b8990195?

    The downside would be that it is harder to re-run a task (only maintainers can do it, not the pull request author).

    Another downside would be that caching depends artefacts and docker images is hard on GHA. So ideally only tasks with NO_DEPENDS=1 are moved for now. It would be:

    • ci/test/00_setup_env_native_fuzz.sh:export NO_DEPENDS=1
    • ci/test/00_setup_env_native_tidy.sh:export NO_DEPENDS=1
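A minimal sketch, not part of the original post, of how the candidate tasks could be listed automatically. It assumes the environment files follow the `00_setup_env_*.sh` naming shown above and mark themselves with a plain `export NO_DEPENDS=1` line. The helper names and the file contents in `demo()` are hypothetical and only there so the sketch runs on its own.

```python
import re
import tempfile
from pathlib import Path

# Matches a line of the form `export NO_DEPENDS=1`, as quoted in the issue.
NO_DEPENDS_RE = re.compile(r"^\s*export\s+NO_DEPENDS=1\s*$", re.MULTILINE)


def tasks_without_depends(env_dir: Path) -> list[str]:
    """Return the names of 00_setup_env_*.sh files that export NO_DEPENDS=1."""
    return sorted(
        path.name
        for path in env_dir.glob("00_setup_env_*.sh")
        if NO_DEPENDS_RE.search(path.read_text())
    )


def demo() -> list[str]:
    """Run the scan against a throwaway directory with made-up file contents."""
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp)
        # Hypothetical contents; the real files contain more settings.
        (env_dir / "00_setup_env_native_fuzz.sh").write_text(
            "export NO_DEPENDS=1\n"
        )
        (env_dir / "00_setup_env_native_tidy.sh").write_text(
            "export NO_DEPENDS=1\n"
        )
        # Hypothetical task that does build depends, so it must not be listed.
        (env_dir / "00_setup_env_example_with_depends.sh").write_text(
            "export HOST=example\n"
        )
        return tasks_without_depends(env_dir)


if __name__ == "__main__":
    for name in demo():
        print(name)
```

In a checkout, calling `tasks_without_depends(Path("ci/test"))` would be the equivalent of grepping for the export line.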

    Any other thoughts, or volunteers to move the tasks?

  2. maflcko added the label Brainstorming on Jun 19, 2024
  3. maflcko added the label Tests on Jun 19, 2024
  4. maflcko commented at 8:01 am on June 19, 2024: member
  5. bitcoin deleted a comment on Jun 19, 2024
  6. m3dwards commented at 10:34 am on June 19, 2024: contributor

    I didn’t know PR authors could re-run tasks on Cirrus.

    It is nice that you can run the jobs on your own fork; I now quite often just push a random commit to my fork to trigger the CI jobs as an experiment.

    Conceivably the jobs could be on both Cirrus and GHA and only run on GHA for forks. Extra maintenance burden probably not worth it though.

    How are the depends artefacts cached on Cirrus? And which docker images are you referring to? The CI build one?

    Happy to volunteer to move more tasks.

  7. maflcko commented at 10:47 am on June 19, 2024: member

    How are the depends artefacts cached on Cirrus? And which docker images are you referring to? The CI build one?

    Cirrus itself has a simple and easy-to-use cache instruction. However, the cache is currently implicit, because persistent workers are used.

    With images I mean the ones listed by podman image ls, that is:

    REPOSITORY                                     TAG         IMAGE ID      CREATED        SIZE
    localhost/ci_native_asan                       latest      582be28ff8c1  15 hours ago   1.81 GB
    localhost/ci_native_valgrind                   latest      fa2461c0e0d5  3 days ago     1.36 GB
    localhost/ci_native_fuzz_msan                  latest      682198747e18  5 days ago     6.02 GB
    localhost/ci_native_fuzz_valgrind              latest      84fec02871f5  5 days ago     1.28 GB
    localhost/ci_macos_cross                       latest      49b2d3ad6d04  6 days ago     1.62 GB
    localhost/ci_s390x                             latest      d5fe9fb0978a  8 days ago     539 MB
    localhost/ci_native_msan                       latest      901b867ade25  8 days ago     6.02 GB
    localhost/ci_native_tidy                       latest      15fed375c141  8 days ago     2.64 GB
    localhost/ci_win64                             latest      5e2364ec8c8c  8 days ago     2.61 GB
    localhost/ci_native_previous_releases          latest      c8feaac3f9ea  8 days ago     537 MB
    localhost/ci_native_nowallet_libbitcoinkernel  latest      951fd4615e36  8 days ago     909 MB
    localhost/ci_native_fuzz                       latest      4875a0dcc4c7  8 days ago     1.21 GB
    localhost/ci_i686_centos                       latest      a331b16f0046  11 days ago    704 MB
    localhost/ci_arm_linux                         latest      34961f67c7ab  11 days ago    892 MB
    localhost/ci_native_tsan                       latest      9d7b28339df2  11 days ago    1.05 GB
    localhost/ci_i686_multiprocess                 latest      7e3205a702fd  12 days ago    1.08 GB
    

    When they only cache the result of apt install ..., they mostly serve to guard against outages of the Ubuntu mirror and to provide a small speed-up. However, for heavy images like the msan one, they cache the llvm compilation, which is quite CPU heavy.

  8. m3dwards commented at 11:01 am on June 19, 2024: contributor

    Could we use this to cache the images? https://docs.docker.com/build/cache/backends/gha/

    We are using the GHA cache at the moment, is there a reason why this wouldn’t work for depends? Or is it just the effort required to split up the current CI script into different steps to take advantage of GHA cache?

  9. maflcko commented at 11:10 am on June 19, 2024: member

    We are using the GHA cache at the moment, is there a reason why this wouldn’t work for depends?

    It has a limit of 10 GB, so I am not sure if it can fit everything. https://github.com/bitcoin/bitcoin/actions/caches
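As a rough back-of-the-envelope check, not from the original comment, the sizes in the `podman image ls` listing quoted earlier in the thread can be added up and compared with the 10 GB limit. This sketch assumes decimal units (1 GB = 1000 MB) and treats the local image sizes as an upper bound: images may share layers, and a cache may compress its contents, so the real footprint could be smaller.

```python
# Sizes copied from the `podman image ls` listing quoted in the thread.
IMAGE_SIZES = [
    "1.81 GB", "1.36 GB", "6.02 GB", "1.28 GB", "1.62 GB", "539 MB",
    "6.02 GB", "2.64 GB", "2.61 GB", "537 MB", "909 MB", "1.21 GB",
    "704 MB", "892 MB", "1.05 GB", "1.08 GB",
]

# Limit mentioned in the thread for the GHA cache.
GHA_CACHE_LIMIT_GB = 10.0

# Assumption: decimal units, so 1 GB = 1000 MB.
UNIT_TO_GB = {"GB": 1.0, "MB": 0.001}


def to_gb(size: str) -> float:
    """Convert a string such as '539 MB' or '1.81 GB' to gigabytes."""
    value, unit = size.split()
    return float(value) * UNIT_TO_GB[unit]


def total_image_size_gb(sizes: list[str]) -> float:
    """Sum all listed image sizes, ignoring any layer sharing."""
    return sum(to_gb(size) for size in sizes)


if __name__ == "__main__":
    total = total_image_size_gb(IMAGE_SIZES)
    print(f"total: {total:.2f} GB, limit: {GHA_CACHE_LIMIT_GB:.0f} GB")
```

Under these assumptions the images alone add up to roughly 30 GB, about three times the limit, and that is before any depends artefacts or ccache entries.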

  10. willcl-ark commented at 11:18 am on June 19, 2024: member

    I actually meant to ask this in #30193, but why do we cache using run-id in the key ${{ github.job }}-ccache-${{ github.run_id }} ? As we only cache on master, using only ${{ github.job }}-ccache would make more sense to me; a single rolling cache per job.

    When we search for the cache to load we use a “wildcard” restore restore-keys: ${{ github.job }}-ccache- (with no run_id).

    This would remove some “duplicates”, e.g. “macos-native-x86_64-ccache-” has 3 cache entries, when it only needs 1?

    Am I missing some reason for doing things this way?

  11. maflcko commented at 11:22 am on June 19, 2024: member

    Am I missing some reason for doing things this way?

    See #28292 (review) . This is one of the reasons why I personally don’t like GHA: It is a closed, confusing, and brittle ecosystem. The only benefit is that it is free (for now).

  12. m3dwards commented at 11:43 am on June 19, 2024: contributor

    It has a limit of 10 GB, so I am not sure if it can fit everything. https://github.com/bitcoin/bitcoin/actions/caches

    @fanquake might be able to get us more?

  13. maflcko commented at 11:56 am on June 19, 2024: member
    I am not sure how increasing the cache size limit would be possible.
  14. m3dwards commented at 2:32 pm on June 19, 2024: contributor
    Is the plan to eventually move everything from Cirrus to GHA?
  15. maflcko commented at 3:06 pm on June 19, 2024: member

    If someone finds a solution to all cache issues, then it can be done. (Moving back should be easy in any case)

    For now, see the issue description:

    ideally only tasks with NO_DEPENDS=1 are moved for now. It would be:

    • ci/test/00_setup_env_native_fuzz.sh:export NO_DEPENDS=1
    • ci/test/00_setup_env_native_tidy.sh:export NO_DEPENDS=1
    
  16. Sjors commented at 8:06 am on June 20, 2024: member

    This is one of the reasons why I personally don’t like GHA: It is a closed, confusing, and brittle ecosystem. The only benefit is that it is free (for now).

    I’m a bit hesitant as well. As I mentioned in #29274 (review) the only practical need I have currently is to run the native ARM job on Github CI.

    However, I’m fine with either skipping it, or following some (clear) instructions to run it on my AMD desktop with some virtualisation (if the performance is acceptable).

    The other jobs run fine on my Ubuntu machine(s) with not too much configuration.

    I also found, while working on that PR, that Cirrus has better configuration options. E.g. Github CI doesn’t even support custom env variables.

  17. maflcko commented at 8:22 am on June 20, 2024: member

    The other jobs run fine on my Ubuntu machine(s) with not too much configuration.

    Sure, but for others it may be too much hassle? See https://github.com/bitcoin-inquisition/bitcoin/pull/32#issue-1874824335

  18. Sjors commented at 8:29 am on June 20, 2024: member

    They didn’t try self-hosting or paying. It doesn’t seem like a good strategy to constantly flock to whichever company offers free resources. It would be nice if we can make it more flexible in an easy way.

    E.g. someone maintaining a fork could upload a yaml file somewhere that specifies which jobs should be run by which cloud provider / self host, and which should be skipped.

  19. maflcko commented at 8:48 am on June 20, 2024: member

    Personally I think

    • it is cleaner to not offer (and maintain) a bunch of config options for the CI services
    • self-hosting is too much overhead for forks (especially for Windows/macOS builds)
    • GHA is already required, and if they charged a price, someone would likely pay for it (if it is reasonably priced)
    • Anyone really wanting to self-host can already do it today by writing their own CI provider and CI integration (the CI system itself only requires docker/podman)

    Happy to close this issue, if there is no need or interest.

  20. m3dwards commented at 12:00 pm on June 20, 2024: contributor

    One nice aspect of GHA is the dev experience: contributors can easily have CI run on their personal forks before submitting a PR upstream. Yes, it should be possible for someone to self-host a runner, but realistically how many would?

    Flocking to a free provider could leave the project vulnerable to a bait and switch, and perhaps a bit of vendor lock-in, but it would also have a democratising effect: putting CI in the hands of any fork by default.

  21. Sjors commented at 12:20 pm on June 20, 2024: member

    the only practical need I have currently is to run the native ARM job on Github CI

    Which I’ve now solved with the magic power of qemu-user-static.

  22. hebasto commented at 8:45 pm on June 20, 2024: member

    I actually meant to ask this in #30193, but why do we cache using run-id in the key ${{ github.job }}-ccache-${{ github.run_id }} ? As we only cache on master, using only ${{ github.job }}-ccache would make more sense to me; a single rolling cache per job.

    When we search for the cache to load we use a “wildcard” restore restore-keys: ${{ github.job }}-ccache- (with no run_id).

    This would remove some “duplicates”, e.g. “macos-native-x86_64-ccache-” has 3 cache entries, when it only needs 1?

    Am I missing some reason for doing things this way?

    In addition to what @maflcko pointed out, this approach is documented here:

    A cache today is immutable and cannot be updated. But some use cases require the cache to be saved even though there was a “hit” during restore. To do so, use a key which is unique for every run and use restore-keys to restore the nearest cache.

  23. hebasto commented at 9:14 pm on June 20, 2024: member

    Here is a summary of the current GHA cache storage usage:

    prefix                            size, MB
    macos-native-x86_64-ccache        380
    win64-native-static-qt            61
    win64-native-ccache-installation  3
    win64-native-ccache               160
    win64-native-vcpkg-tools          5
    win64-native-vcpkg-binary         51

    Total: 660 MB.

    And 10 GB are available.
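The total can be re-derived from the per-prefix figures in the summary. In this short sketch the conversion of the 10 GB limit to 10000 MB is an assumption (decimal units); the figures themselves are copied from the summary.

```python
# Per-prefix cache usage in MB, copied from the summary.
CACHE_USAGE_MB = {
    "macos-native-x86_64-ccache": 380,
    "win64-native-static-qt": 61,
    "win64-native-ccache-installation": 3,
    "win64-native-ccache": 160,
    "win64-native-vcpkg-tools": 5,
    "win64-native-vcpkg-binary": 51,
}

# Assumption: the 10 GB limit expressed in decimal megabytes.
LIMIT_MB = 10_000


def total_usage_mb() -> int:
    """Sum the cache usage over all prefixes."""
    return sum(CACHE_USAGE_MB.values())


def headroom_mb() -> int:
    """Space left under the limit, given the assumed unit conversion."""
    return LIMIT_MB - total_usage_mb()


if __name__ == "__main__":
    print(f"used: {total_usage_mb()} MB, free: {headroom_mb()} MB")
```

Under that assumption the current caches use well under a tenth of the available space.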

    Another downside would be that caching depends artefacts and docker images is hard on GHA. So ideally only tasks with NO_DEPENDS=1 are moved for now.

    I agree.

  24. vasild commented at 3:52 am on June 21, 2024: contributor

    GitHub is convenient and free. IMO it is a vendor lock-in trap from which we will eventually want to get out. Adding more dependencies to it would make that harder.

    “GitHub KYC is here boys!” https://x.com/nitesh_btc/status/1802735626032210330

  25. maflcko commented at 6:29 am on June 21, 2024: member

    There is no vendor lock-in, because switching back is trivial: run git revert commit_id, where commit_id is the commit that switched the task, e.g. d97ddbe797f5b8b3bca0ee71b692e542b8990195.

    The CI system is written with exactly that in mind: it does not care about the outer host and can run anywhere. No one is holding anyone back from spinning up the CI anywhere they want. All they need is the source code at the commit id they want to test and a way to report back the CI logs and results.

    If you want to discuss moving away from GitHub completely, I’d suggest starting a separate discussion thread. This one is about the UX of the CI (on forks).


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-06-29 07:13 UTC
