ci: allow passing extra args to docker build with CI_IMAGE_BUILD_EXTRA_ARGS #31377

0xB10C wants to merge 1 commit into bitcoin:master from 0xB10C:2024-11-CI_IMAGE_BUILD_EXTRA_ARGS, changing 1 file (+1 −0)
  1. 0xB10C commented at 7:21 pm on November 26, 2024: contributor
    By setting CI_IMAGE_BUILD_EXTRA_ARGS, a CI runner can influence the docker build of the CI container image. This allows, for example, passing --cache-to and --cache-from to the build, which is useful for ephemeral, self-hosted CI runners.
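    As a rough illustration (not the literal change in this PR; everything besides CI_IMAGE_BUILD_EXTRA_ARGS and CONTAINER_NAME is a placeholder), the hook amounts to splicing an optional, unquoted variable into the image build command:

     # Hypothetical CI image build step; assumes a BuildKit/buildx-capable builder
     # so that flags like --cache-to/--cache-from are accepted.
     docker build \
         --file "${CI_IMAGEFILE:-Dockerfile}" \
         --tag "${CONTAINER_NAME}" \
         ${CI_IMAGE_BUILD_EXTRA_ARGS:-} \
         .
     # The variable is deliberately left unquoted so that a value such as
     # "--cache-to type=local,dest=/cache --cache-from type=local,src=/cache"
     # splits into separate arguments. When unset, nothing is added.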
  2. ci: allow passing extra args to docker build
    By setting `CI_IMAGE_BUILD_EXTRA_ARGS`, a CI runner can influence the
    docker build of the CI container image. This allows, for example,
    passing `--cache-to` and `--cache-from` to the build, which is useful
    for ephemeral CI runners.
    f8c8e23c99
  3. DrahtBot commented at 7:21 pm on November 26, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31377.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

  4. 0xB10C commented at 7:21 pm on November 26, 2024: contributor

    I’ve been looking into setting up my own ephemeral CI runners. Ideally, these would cache the CI container images that we build. By setting CI_IMAGE_BUILD_EXTRA_ARGS to --cache-to type=local,dest=/cache/docker,mode=max --cache-from type=local,src=/cache/docker (see https://docs.docker.com/build/cache/backends/local/), I can cache the images (and reusable layers of the images) for docker build. This both speeds up the build (the docker build in the MSan task takes > 1h) and avoids failures when, e.g., the Ubuntu apt mirrors aren’t reachable.
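    Concretely, with that value set, the image build would effectively run something like the following (a sketch; the tag and cache path are examples, and a BuildKit/buildx builder is assumed):

     # /cache/docker is a persistent directory on the runner host.
     docker build \
         --cache-to   type=local,dest=/cache/docker,mode=max \
         --cache-from type=local,src=/cache/docker \
         --tag "${CONTAINER_NAME}" \
         .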

    The current CI runners don’t need this, as they aren’t ephemeral. They cache the images in their internal docker image store, which is reused in the next CI task. A general alternative would be to regularly build CI docker images, push them to a registry, and cache-from/pull them during the build. However, this seems like a lot more effort.
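    For reference, the registry-based variant mentioned above would look roughly like this (a sketch; the registry and image names are placeholders, and a BuildKit/buildx builder with push access is assumed):

     docker build \
         --cache-to   type=registry,ref=registry.example.org/bitcoin-ci/${CONTAINER_NAME}:buildcache,mode=max \
         --cache-from type=registry,ref=registry.example.org/bitcoin-ci/${CONTAINER_NAME}:buildcache \
         --tag "${CONTAINER_NAME}" \
         .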

  5. DrahtBot added the label Tests on Nov 26, 2024
  6. maflcko commented at 7:46 am on November 27, 2024: member

    The current CI runners don’t need this, as they aren’t ephemeral.

    Currently, each runner runs under its own user account, with its own podman image layer cache, so in theory they’d benefit from this as well.

    However, the images are quite large (especially msan), so my worry would be that after a sufficient number of builds, the workers will run out of storage, as the cache isn’t cleared like the local images are.

  7. 0xB10C commented at 11:11 am on November 27, 2024: contributor

    It occurred to me that the local cache needs to be keyed by e.g. ${CONTAINER_NAME}, since different tasks build different images. This wasn’t clear to me from the build cache documentation at first. Setting CI_IMAGE_BUILD_EXTRA_ARGS="--cache-to type=local,dest=/cache/docker/${CONTAINER_NAME}" doesn’t work, as bash does not re-expand a ${CONTAINER_NAME} placeholder stored inside another variable’s value. I’ll mark this as draft for now until I find a solution.
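    (One hypothetical workaround, not something this PR implements: keep the placeholder literal in the runner config and re-expand it at the point of use, where CONTAINER_NAME is known, e.g. with gettext's envsubst:)

     # Sketch only; requires gettext's envsubst. The runner config keeps the
     # literal string ...dest=/cache/docker/${CONTAINER_NAME}, which is
     # substituted here, where CONTAINER_NAME is defined.
     CI_IMAGE_BUILD_EXTRA_ARGS=$(CONTAINER_NAME="$CONTAINER_NAME" envsubst '$CONTAINER_NAME' <<< "$CI_IMAGE_BUILD_EXTRA_ARGS")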

    However, the images are quite large (especially msan), so my worry would be that after a sufficient number of builds, the workers will run out of storage, as the cache isn’t cleared like the local images are.

    Yes, the local build cache dir seems to grow indefinitely, as old layers aren’t cleaned up. Based on the “cache versioning” documentation, I think we wouldn’t need to keep old layers around, as we wouldn’t use the digest to reference them. A naive approach would be to use --cache-from type=local,src=/path/to/docker/build/cache/${CONTAINER_NAME} together with --cache-to type=local,dest=$tmpdir. After the build, overwrite /path/to/docker/build/cache/${CONTAINER_NAME} with the contents of $tmpdir. Only the most recent build cache would be kept.
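    A sketch of that swap (hypothetical paths; assumes a BuildKit/buildx builder, and note the raciness discussed below):

     # Naive "keep only the latest cache" approach.
     cache_dir="/path/to/docker/build/cache/${CONTAINER_NAME}"
     tmpdir="$(mktemp -d)"
     docker build \
         --cache-from "type=local,src=${cache_dir}" \
         --cache-to   "type=local,dest=${tmpdir},mode=max" \
         --tag "${CONTAINER_NAME}" \
         .
     # Replace the old cache with the fresh export; all unreferenced old
     # layers are dropped along with the old directory.
     rm -rf "${cache_dir}"
     mv "${tmpdir}" "${cache_dir}"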

  8. 0xB10C marked this as a draft on Nov 27, 2024
  9. maflcko commented at 11:21 am on November 27, 2024: member

    It occurred to me that the local cache needs to be keyed by e.g. ${CONTAINER_NAME} since different tasks build different images.

    Are you sure? The final layers should be different for each task, so should never collide. (Possibly early layers may be shared, if they happen to be the same). If this was an issue, shouldn’t the CI normally fail when running several tasks on the same machine in the same user account?

    After the build, overwrite /path/to/docker/build/cache/${CONTAINER_NAME} with the contents of $tempdir. Only the most recent build cache would be kept.

    That seems racy, as one worker may still be using the cache when it is (re)moved, which could lead to errors?

  10. 0xB10C commented at 12:40 pm on November 27, 2024: contributor

    It occurred to me that the local cache needs to be keyed by e.g. ${CONTAINER_NAME} since different tasks build different images.

    Are you sure? The final layers should be different for each task, so should never collide. (Possibly early layers may be shared, if they happen to be the same). If this was an issue, shouldn’t the CI normally fail when running several tasks on the same machine in the same user account?

    I’m not sure if we are talking about the same thing. I mean the docker build cache of type=local (https://docs.docker.com/build/cache/backends/local/) and not the final images the local docker knows about and lists when calling docker images.

    My understanding is that --cache-from type=local,src=/abc/ will look up /abc/index.json and read the digest from it, e.g.:

     {
       "schemaVersion": 2,
       "mediaType": "application/vnd.oci.image.index.v1+json",
       "manifests": [
         {
           "mediaType": "application/vnd.oci.image.index.v1+json",
           "digest": "sha256:3a897fa438c7965a7a4919f2d55e5ea578221b4686ebc38f4fb843fac19fc3ec",
           "size": 1559,
           "annotations": {
             "org.opencontainers.image.ref.name": "latest"
           }
         }
       ]
     }
    

    With that, it will start importing the cache from /abc/blobs/sha256/3a897fa438c7965a7a4919f2d55e5ea578221b4686ebc38f4fb843fac19fc3ec. This, however, does not include old, now-unreferenced blobs/layers.
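    To illustrate, the digest the importer follows can be read directly from that index (a sketch, assuming jq is available):

     # Print the manifest digest referenced by the local cache index.
     jq -r '.manifests[0].digest' /abc/index.json
     # -> sha256:3a897fa438c7965a7a4919f2d55e5ea578221b4686ebc38f4fb843fac19fc3ec
     # The cache content is then imported from /abc/blobs/sha256/3a897f..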

    That seems racy, as one worker may still be using the cache when it is (re)moved, which could lead to errors?

    At least on Linux/Unix, when you open a file you get a file descriptor and can keep reading from that descriptor even if the file is removed or overwritten with, e.g., mv (see the small demonstration after the log below). This might only fail if you got a file descriptor on an old index.json and then try to look up the referenced file in blobs/sha256/3a897f.., which has just been removed. However, --cache-from only warns if it can’t read the cache and then continues, treating it as a cache miss. To demonstrate, I changed the digest to start with b10c.. for https://cirrus-ci.com/task/4945786574733312?logs=ci#L176:

     #5 importing cache manifest from local:16451918948965588263
     #5 ERROR: failed to configure local cache importer: NotFound: rpc error: code = NotFound desc = content sha256:b10c7fa438c7965a7a4919f2d55e5ea578221b4686ebc38f4fb843fac19fc3ec: not found
     #6 [1/4] FROM docker.io/library/ubuntu:24.04@sha256:278628f08d4979fb9af9ead44277dbc9c92c2465922310916ad0c46ec9999295
     #6 resolve docker.io/library/ubuntu:24.04@sha256:278628f08d4979fb9af9ead44277dbc9c92c2465922310916ad0c46ec9999295
     #6 resolve docker.io/library/ubuntu:24.04@sha256:278628f08d4979fb9af9ead44277dbc9c92c2465922310916ad0c46ec9999295 0.9s done
     #6 DONE 0.9s
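    As a side note, a minimal bash demonstration of the file-descriptor behaviour mentioned above (hypothetical file name):

     echo "old index" > index.json
     exec 3< index.json   # open a read file descriptor on the file
     rm index.json        # unlink the file while the descriptor is still open
     cat <&3              # still prints "old index"
     exec 3<&-            # close the descriptor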
    
