ci: Future of CI after Cirrus pricing change #1392

issue real-or-random opened this issue on August 4, 2023
  1. real-or-random commented at 2:03 pm on August 4, 2023: contributor

    Roadmap (keeping this up to date):

    I think the natural way forward for us is:

    • Move native Windows tasks to GitHub Actions (#1389 and #1397)
    • Move SageMath task to GitHub Actions (#1399)
    • Move native macOS tasks to GitHub Actions, this will convert them to x86_64 unfortunately (#1394, #1404)
    • Move Linux tasks to the Bitcoin Core persistent workers or alternatively to GitHub Actions
      • wine/msvc tasks (converting them to native windows in #1401)
      • wine/mingw tasks (#1398)
      • actual normal Linux tasks (#1396)
      • special tasks like qemu/sanitizers (#1406, #1409)

    Possible follow-ups:

    • ~Consider using artifacts to move the Docker image from the Docker build job to the actual CI job (for Linux tasks).~ (Not worth the hassle, the current approach seems to work well.) Details: This should be a cleaner solution, but it adds some complexity. It’s also worth checking if this avoids network issues. In terms of delay, this adds about 12 min uploading time to the Docker build job, but avoids about 1 min delay in the actual CI jobs as compared to the current solution that relies purely on the GHA cache (#1398). So this would speed up CI only if we could avoid re-uploading existing artifacts, e.g., by keeping a separate digest file that just stores the SHA256, and re-uploading only if the SHA does not match. But all of this is probably not worth the complexity if the current approach with the cache turns out to be good enough.
    • Enable Valgrind on macOS again now that the macOS tasks run on x86_64 (#1151, #1412)
    • Bring back some ARM testing (see for details: #1394 (comment))
    • Consider moving the git safe directory stuff to run-in-docker-action (https://github.com/bitcoin-core/secp256k1/pull/1411)
    • After the migration, check if the build matrix still makes sense
    • cosmetics: Job names could need a rework
    • cosmetics: Printing of log files could be improved
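
    As a rough illustration of the digest idea in the first follow-up item, the check could look like the sketch below. Nothing like this exists in the repository: the helper names `file_digest`, `needs_upload` and `record_digest` and the file names are hypothetical, and the upload step itself is left out.

    ```shell
    # Print the SHA256 of a file (sha256sum prints "<hash>  <name>").
    file_digest() {
        sha256sum "$1" | cut -d ' ' -f 1
    }

    # Succeed (exit status 0) if the image must be uploaded, i.e., if no
    # digest was stored yet or the stored digest differs from the current one.
    needs_upload() {
        image_tar="$1"
        digest_file="$2"
        [ ! -f "$digest_file" ] || [ "$(cat "$digest_file")" != "$(file_digest "$image_tar")" ]
    }

    # Store the digest; meant to be called only after a successful upload.
    record_digest() {
        file_digest "$1" > "$2"
    }
    ```

    A job would call `needs_upload image.tar image.sha256`, upload only when it succeeds, and then call `record_digest`. The sketch assumes the digest file itself is kept somewhere that survives between runs.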

    Other related PRs:


    Corresponding Bitcoin Core issue: https://github.com/bitcoin/bitcoin/issues/28098

    Cirrus CI will cap the community cluster, see cirrus-ci.org/blog/2023/07/17/limiting-free-usage-of-cirrus-ci. As with Core, the pricing model makes it totally unreasonable to pay for compute credits (multiple thousand USD / month).

    The plan in Bitcoin Core is to move native Windows+macOS tasks to GitHub Actions, and move Linux tasks to persistent workers (=self-hosted). If I read the Bitcoin Core IRC meeting notes correctly, @MarcoFalke said these workers will also be available for libsecp256k1.

    But the devil is in the details:

    For macOS, we need to take also #1153 into account. It seems that GitHub-hosted macOS runners are on x86_64. The good news is that Valgrind should work again then, but the (very) bad news is that this will reduce our number of native ARM tasks to zero. We still have some QEMU tasks, but we can’t even run the Valgrind ctimetests on them (maybe this would now work with MSan?!) @MarcoFalke Are the self-hosted runners only x86_64?

    For Linux tasks, the meeting notes say that the main reason for using persistent workers is that some tasks require a very specific environment (e.g., the USDT ASan job). I don’t think we have such requirements, so I tend to think that moving everything to GitHub Actions is a bit cleaner for us. With a persistent worker, Cirrus CI anyway acts only as a “coordination layer” between the worker and GitHub. Yet another way is to use the self-hosted runners with GitHub Actions (see my comment https://github.com/bitcoin/bitcoin/issues/28098#issuecomment-1665661274).

  2. real-or-random added the label ci on Aug 4, 2023
  3. maflcko commented at 2:11 pm on August 4, 2023: none

    Are the self-hosted runners only x86_64?

    There is one aarch64 one. (It is required because GitHub doesn’t offer aarch64 Linux boxes, and Google Cloud doesn’t offer an aarch64 CPU that can run armhf 32-bit binaries)

  4. real-or-random commented at 2:13 pm on August 4, 2023: contributor
    Ok, then it probably makes sense to do what I suggested in #1153, namely move ARM tasks to Linux, and reduce the number of our macOS tasks.
  5. maflcko commented at 2:15 pm on August 4, 2023: none

    moving everything to GitHub Actions is a bit cleaner for us

    Sounds interesting. I wonder how (and if) docker images can be cached, along with ccache, etc…

  6. real-or-random commented at 2:36 pm on August 4, 2023: contributor

    moving everything to GitHub Actions is a bit cleaner for us

    Sounds interesting. I wonder how (and if) docker images can be cached, along with ccache, etc…

    Yeah, we’ll need to see.

    And I agree that “in the short run it seems easier to stick to Cirrus for now, because the diff is a lot smaller (just replace container: in the yml with persistent_worker:, etc)” (https://github.com/bitcoin/bitcoin/issues/28098#issuecomment-1665708491). We should probably do this first, and then see if we’re interested in moving to GitHub Actions fully.

    edit: I updated the roadmap above.

  7. hebasto cross-referenced this on Aug 5, 2023 from issue ci: Run "Windows (VS 2022)" job on GitHub Actions by hebasto
  8. hebasto commented at 12:29 pm on August 5, 2023: member

    For macOS, we need to take also #1153 into account. It seems that GitHub-hosted macOS runners are on x86_64. The good news is that Valgrind should work again then…

    For such a case, it is good to see some progress in #1274 :)

  9. hebasto cross-referenced this on Aug 5, 2023 from issue ci, gha: Run "x86_64: macOS Ventura" job on GitHub Actions by hebasto
  10. hebasto cross-referenced this on Aug 7, 2023 from issue ci, gha: Add "x86_64: Linux (Debian stable)" GitHub Actions job by hebasto
  11. hebasto commented at 6:59 am on August 7, 2023: member

    moving everything to GitHub Actions is a bit cleaner for us

    Sounds interesting. I wonder how (and if) docker images can be cached, along with ccache, etc…

    See #1396.

  12. hebasto commented at 8:28 pm on August 7, 2023: member

    There are open PRs for all of the mentioned items. It would be more productive, if we somehow prioritise them to spend our time until Sept. 1st more effectively.

  13. maflcko commented at 8:39 am on August 8, 2023: none

    It would be more productive, if we somehow prioritise them to spend our time until Sept. 1st more effectively.

    I’d say the Windows/macOS ones are probably easier, since they don’t require write permission and don’t have to deal with docker image caching.

  14. real-or-random commented at 5:02 pm on August 8, 2023: contributor
    Yes, we should in principle proceed in the order of the list above. But it doesn’t need to be very strict. For example, if it turns out that #1396 is ready by Sep 1st, we can skip “Move Linux tasks to the Bitcoin Core persistent workers”.
  15. real-or-random referenced this in commit 96294c00fb on Aug 9, 2023
  16. hebasto cross-referenced this on Aug 10, 2023 from issue ci, gha: Add Windows jobs based on Linux image by hebasto
  17. hebasto cross-referenced this on Aug 14, 2023 from issue ci, gha: Run "SageMath prover" job on GitHub Actions by hebasto
  18. hebasto commented at 2:15 pm on August 14, 2023: member
    • Move Linux tasks to the Bitcoin Core persistent workers

    It seems reasonable to split this task into two, depending on the underlying architecture: x86_64 and arm64, because the GitHub-hosted runners lack support for arm64.

  19. real-or-random commented at 5:08 pm on August 15, 2023: contributor

    @hebasto Hm, we currently don’t have native Linux arm64 jobs, so we can’t “move” them over. We could add some (see #1163 and #1394 (comment)).

    I tend to think that it is also acceptable to wait for https://github.com/github/roadmap/issues/528, it’s currently planned for the end of the year. Then we could move macOS back to ARM. Until that happens, perhaps we can add QEMU jobs that run the ctimetests on MSan (clang-only) at least. Note to self: We need apt-get install libclang-rt-dev:arm64 and this works with

    HOST="aarch64-linux-gnu" CC="clang --target=aarch64-linux-gnu" WRAPPER_CMD="qemu-aarch64"
    

    (The real tests fail with msan enabled on qemu. I think this is because the stack will explode.)

    I updated the list above with optional items.
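
    For readers unfamiliar with the wrapper approach, here is a minimal sketch of how a variable like WRAPPER_CMD from the settings above might be applied when running a test binary. The helper name `run_wrapped` is hypothetical and the binary name in the usage note is an assumption; the repository's CI script may do this differently.

    ```shell
    # Run a command through an optional wrapper such as qemu-aarch64.
    # If WRAPPER_CMD is empty or unset, the command runs natively.
    run_wrapped() {
        if [ -n "${WRAPPER_CMD:-}" ]; then
            # Intentionally unquoted so that a wrapper with its own
            # arguments is split into separate words.
            $WRAPPER_CMD "$@"
        else
            "$@"
        fi
    }
    ```

    With `WRAPPER_CMD="qemu-aarch64"` set, `run_wrapped ./ctime_tests` would execute the cross-compiled binary under user-mode QEMU, while the same call runs it directly on a native machine.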

  20. maflcko commented at 5:14 pm on August 15, 2023: none

    qemu-arm is a bit slower than native aarch64. You can use the already existing persistent worker, if you want:

    https://github.com/bitcoin/bitcoin/blob/cd43a8444ba44f86ddbb313a03a2782482beda89/.cirrus.yml#L210-L212

    (Currently not set up for this repo, but should be some time this week)

  21. real-or-random commented at 5:28 pm on August 15, 2023: contributor
    Sure, that’s an easy option. I just think we’re currently playing around with the idea to move everything to GHA, if it’s feasible for this repo.
  22. hebasto cross-referenced this on Aug 20, 2023 from issue ci, gha: Move more non-x86_64 tasks from Cirrus CI to GitHub Actions by hebasto
  23. real-or-random referenced this in commit 2e6cf9bae5 on Aug 21, 2023
  24. real-or-random referenced this in commit 6ee14550c8 on Aug 22, 2023
  25. hebasto commented at 5:31 pm on August 24, 2023: member

    While it worked on macOS Catalina back in time, it seems a couple of suppressions for /usr/lib/libSystem.B.dylib and /usr/lib/dyld are needed.

    Branch (POC): https://github.com/hebasto/secp256k1/tree/230824-valgrind
    CI: https://github.com/hebasto/secp256k1/actions/runs/5967987235

  26. real-or-random commented at 8:29 am on August 25, 2023: contributor
    Oh thanks for checking. Have you tried the supplied suppression file (https://github.com/LouisBrunner/valgrind-macos/blob/main/darwin19.supp)? If it doesn’t solve the problem, we could try to upstream the additional suppressions, see also https://github.com/LouisBrunner/valgrind-macos/issues/15.
  27. hebasto commented at 8:46 am on August 25, 2023: member

    Have you tried the supplied suppression file (LouisBrunner/valgrind-macos@main/darwin19.supp)?

    Yes, I have. It does not change the outcome.

    UPD. I used https://github.com/LouisBrunner/valgrind-macos/blob/main/darwin22.supp as we run Ventura.

  28. real-or-random commented at 9:23 am on August 25, 2023: contributor

    Do you think maintaining the suppressions is a problem? I don’t think it’s a big deal.

    UPD. I used LouisBrunner/valgrind-macos@main/darwin22.supp as we run Ventura.

    Okay, sure, I got confused and looked at the wrong file.

  29. hebasto commented at 9:26 am on August 25, 2023: member

    Do you think maintaining the suppressions is a problem? I don’t think it’s a big deal.

    You mean, in this repository?

  30. real-or-random commented at 11:37 am on August 25, 2023: contributor

    Do you think maintaining the suppressions is a problem? I don’t think it’s a big deal.

    You mean, in this repository?

    Yes… I don’t think it will be a lot of work, but I guess we should still submit it upstream first. If they merge it quickly, then it’s easiest for us. I can take care if you don’t have the bandwidth.

  31. hebasto commented at 2:43 pm on August 25, 2023: member

    While it worked on macOS Catalina back in time, it seems a couple of suppressions for /usr/lib/libSystem.B.dylib and /usr/lib/dyld are needed.

    FWIW, it works with no additional suppressions on macos-12.

  32. hebasto commented at 2:46 pm on August 25, 2023: member

    I can take care if you don’t have the bandwidth.

    It would be nice because I have no x86_64 macOS Ventura available.

  33. real-or-random commented at 5:11 pm on August 25, 2023: contributor

    FWIW, it works with no additional suppressions on macos-12.

    Oh ok, should we then just use this for now?

    I can take care if you don’t have the bandwidth.

    It would be nice because I have no x86_64 macOS Ventura available.

    I don’t have any macOS available. ;)

  34. hebasto cross-referenced this on Aug 26, 2023 from issue ci: Switch macOS from Ventura to Monterey and add Valgrind by hebasto
  35. hebasto commented at 10:12 am on August 26, 2023: member

    FWIW, it works with no additional suppressions on macos-12.

    Oh ok, should we then just use this for now?

    Done in #1412.

  36. hebasto commented at 12:28 pm on August 26, 2023: member

    Do you think maintaining the suppressions is a problem? I don’t think it’s a big deal.

    You mean, in this repository?

    Yes… I don’t think it will be a lot of work, but I guess we should still submit it upstream first.

    See https://github.com/LouisBrunner/valgrind-macos/pull/96 as a first step.

  37. real-or-random referenced this in commit 65c79fe2d0 on Aug 29, 2023
  38. real-or-random cross-referenced this on Aug 30, 2023 from issue ci/gha: Add ARM64 QEMU jobs for clang and clang-snapshot by real-or-random
  39. real-or-random referenced this in commit 727bec5bc2 on Sep 4, 2023
  40. real-or-random commented at 1:16 pm on September 5, 2023: contributor
    • Add a task for ctimetest on ARM64/Linux/Valgrind on Cirrus CI using free minutes or the self-hosted runner

    Hm, it appears that Cirrus’ “Dockerfile as a CI environment” feature won’t work with persistent workers (see #1418). Now that I think about it, that’s somewhat expected (e.g., where should the built images be pushed?).

    Alternatives:

    • Set up our own Docker pipeline on Cirrus (but that seems overkill)
    • Use the free minutes (but they’re shared among all repos of bitcoin-core)
    • Go back to the QEMU approach and try to compile compiler-rt on our own (but that’s a rabbit hole and doesn’t cover gcc)
    • Don’t use a Dockerfile (but that means we won’t get gcc-snapshot, unless we build it every time or cache it somehow)
    • Do nothing (and wait for native ARM64 on GHA)

    I think we should do one of the last two?

  41. maflcko commented at 1:26 pm on September 5, 2023: none

    A persistent worker will persist the docker image itself, after the first run on the hardware. I think all you need to do is call

    podman image --file $docker_file --name --env $bla --name $bla_image_name && podman container kill $ci_bla_name && podman run -it --rm --name $ci_bla_name $bla_image_name ./ci.sh

    Alternatively it may be possible to find a sponsor to cover the cost (if it is not too high) on cirrus directly, while native arm64 isn’t on GHA.

    I can look at the llvm issue next week, if time permits.
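
    The one-liner above reads as shorthand, so here is a hedged sketch of the same build, kill and run sequence spelled out with subcommands that podman provides (`podman build`, `podman container kill`, `podman run`). The function names, the image and container names, and the DRY_RUN switch are placeholders for illustration, not part of any existing CI script. The `-it` flags are omitted on the assumption that no terminal is attached in CI.

    ```shell
    # Execute a command, or only print it when DRY_RUN=1.
    run() {
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$*"
        else
            "$@"
        fi
    }

    # $1 = Dockerfile, $2 = image name, $3 = container name.
    ci_in_container() {
        docker_file="$1"
        image_name="$2"
        container_name="$3"

        # Unchanged layers are reused from the worker's local build cache.
        run podman build --file "$docker_file" --tag "$image_name" . || return 1
        # Remove a leftover container from an earlier run; failure to find
        # one is ignored.
        run podman container kill "$container_name" 2>/dev/null || true
        # Run the CI script in a fresh container that is deleted afterwards.
        run podman run --rm --name "$container_name" "$image_name" ./ci.sh
    }
    ```

    Calling `DRY_RUN=1; ci_in_container Dockerfile ci_image ci_container` prints the three podman commands without executing them, which makes it easy to check the sequence before trying it on a worker.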

  42. real-or-random commented at 3:05 pm on September 5, 2023: contributor

    A persistent worker will persist the docker image itself, after the first run on the hardware.

    Thanks for chiming in. Wouldn’t we also need to make sure that images get pruned from time to time? Or does podman handle this automatically?

    podman image --file $docker_file --name --env $bla --name $bla_image_name && podman container kill $ci_bla_name && podman run -it --rm --name $ci_bla_name $bla_image_name ./ci.sh

    I assume the first step performs the caching automatically, rebuilding layers only as necessary? Sorry, I’m not familiar with podman, I have only used Docker so far.

    Alternatively it may be possible to find a sponsor to cover the cost (if it is not too high) on cirrus directly, while native arm64 isn’t on GHA.

    Right, yeah, I’m just not sure if I want to spend time on this.

    I can look at the llvm issue next week, if time permits.

    Ok sure, but I recommend not spending too much time on it. It also won’t help with GCC (I added a note above).

  43. maflcko commented at 3:13 pm on September 5, 2023: none

    Thanks for chiming in. Wouldn’t we also need to make sure that images get pruned from time to time? Or does podman handle this automatically?

    Yeah, you can also run podman image prune, if you want. Pull requests to bitcoin-core/gui should already run it on the same machines, but that seems fragile to rely on.

    See:

    https://github.com/bitcoin-core/gui/blob/9d3b216e009a53ffcecd57e7f10df15cccd5fd6d/ci/test/04_install.sh#L30

    I assume the first step performs the caching automatically, rebuilding layers only as necessary? Sorry, I’m not familiar with podman, I have only used Docker so far.

    Yes, it is the same. You should be able to use docker as well, if you want, which is podman-docker.

    Right, yeah, I’m just not sure if I want to spend time on this.

    If you mean reaching out to a sponsor, I am happy to reach out, if there is a cost estimate.

  44. real-or-random commented at 3:24 pm on September 5, 2023: contributor

    Okay, then I think this approach is probably simpler than I expected. I’m not sure if I have the time this week, but I’ll look into that soon. (Or @hebasto, if you want to give it a try, feel free to go ahead, of course. My plan was to simply “abuse” the existing Dockerfile to avoid maintaining a second one, at the cost of a somewhat larger image. The existing file should build fine except that Debian won’t let you install an arm64 cross-compiler on arm64. So we’d need to add some check to skip these packages when we’re on arm64, see https://github.com/bitcoin-core/secp256k1/pull/1163/files#diff-751ef1d9fd31c5787e12221f590262dcf7d96cfb166d456e06bd0ccab115b60d .)

    If you mean reaching out to a sponsor, I am happy to reach out, if there is a cost estimate.

    Okay, thanks, but let’s first try docker/podman then.
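
    The check for skipping the arm64 cross-compiler could be sketched as below. This is only an illustration of the idea, not the content of the linked diff: the helper name `cross_packages` is hypothetical and the package list is a shortened example, not the real Dockerfile's list.

    ```shell
    # Print the packages to install for a given native architecture.
    # $1 = architecture as printed by `dpkg --print-architecture`.
    cross_packages() {
        native_arch="$1"
        # Base packages (example list only).
        pkgs="gcc clang"
        if [ "$native_arch" != "arm64" ]; then
            # The arm64 cross-compiler is only needed (and installable)
            # when the machine itself is not arm64.
            pkgs="$pkgs gcc-aarch64-linux-gnu"
        fi
        echo "$pkgs"
    }
    ```

    In a Dockerfile RUN step this could be used as `apt-get install -y $(cross_packages "$(dpkg --print-architecture)")`, so that one file builds on both x86_64 and arm64 hosts.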

  45. real-or-random cross-referenced this on Sep 20, 2023 from issue ci/cirrus: Add native ARM64 jobs by maflcko
  46. maflcko commented at 9:34 am on September 21, 2023: none
    Anything left to be done here?
  47. real-or-random cross-referenced this on Oct 2, 2023 from issue ci: Optimize build matrix by real-or-random
  48. real-or-random commented at 9:34 am on October 2, 2023: contributor
    The migration is done, but there are still a few unticked checkboxes. (And I’ve just added two.) None of them are crucial, but I plan to work on them soon, so I’d like to keep this open for now. We could also close this issue here and add a new tracking issue, or open separate issues for the remaining items, if people think that makes tracking easier.
  49. maflcko commented at 8:18 am on October 3, 2023: none

    https://github.blog/2023-10-02-introducing-the-new-apple-silicon-powered-m1-macos-larger-runner-for-github-actions/

    “With today’s launch, our macOS larger runners will be priced at $0.16/minute for XL and $0.12/minute for large.”

  50. real-or-random commented at 7:38 pm on October 3, 2023: contributor

    github.blog/2023-10-02-introducing-the-new-apple-silicon-powered-m1-macos-larger-runner-for-github-actions

    “With today’s launch, our macOS larger runners will be priced at $0.16/minute for XL and $0.12/minute for large.”

    This is a price decrease for private repos, and GHA remains free for public repos.

  51. maflcko commented at 7:39 pm on October 3, 2023: none
    Are large runners available for public repos?
  52. real-or-random commented at 8:01 pm on October 3, 2023: contributor

    Are large runners available for public repos?

    Ha, okay, you’re right. No, “larger runners” are always billed per minute, i.e., they’re not free for public repos. And it seems that they’re not planning to provide M1 “standard runners”. At least https://github.com/github/roadmap/issues/528#issuecomment-1743546984 has been closed now. That means we should stick to the Cirrus runners for ARM.

  53. real-or-random closed this on Oct 3, 2023

  54. real-or-random reopened this on Oct 3, 2023

  55. real-or-random cross-referenced this on Oct 26, 2023 from issue ci/cirrus: Add ARM32 valgrind tasks by real-or-random
  56. real-or-random commented at 2:34 pm on January 31, 2024: contributor

    And it seems that they’re not planning to provide M1 “standard runners”

    That has changed now: https://github.blog/changelog/2024-01-30-github-actions-introducing-the-new-m1-macos-runner-available-to-open-source/


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin-core/secp256k1. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-30 01:15 UTC