ci: move ASan job to GitHub Actions from Cirrus CI #30193

pull m3dwards wants to merge 2 commits into bitcoin:master from m3dwards:move-asan-to-gha changing 4 files +50 −17
  1. m3dwards commented at 2:53 pm on May 29, 2024: contributor

    PR for moving the ASAN + LSAN + USDT + friends job to github actions from Cirrus.

    The motivation for this PR is that this task needs a full VM (or bare metal) to function, because of the tracepoints. It can not run in a container on an arbitrary Linux, because the outside machine must exactly match the specification of the distro used in the CI task config. This requires more maintenance for the persistent worker, and I think moving to GHA will reduce the maintenance burden, or at least make it possible for anyone to work on.

    Also, it makes it easier to run the task on forks (bitcoin-inquisition, bitcoin-knots, devel forks, …) without having to set-up a real machine.

  2. DrahtBot commented at 2:53 pm on May 29, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, hebasto, achow101
    Concept ACK 0xB10C, kristapsk

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30164 ([PoC] ci: Add FreeBSD GitHub Actions job by hebasto)
    • #29274 (Support self-hosted Cirrus workers on forks by Sjors)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Tests on May 29, 2024
  4. maflcko commented at 3:03 pm on May 29, 2024: member
    Concept ACK, may review tomorrow if GHA is back online by then.
  5. hebasto commented at 3:05 pm on May 29, 2024: member
    Concept ACK.
  6. m3dwards force-pushed on May 29, 2024
  7. m3dwards commented at 3:21 pm on May 29, 2024: contributor

    Question: Do we want to keep this job whole or split parts out (such as just running the USDT tests)?

    If we are keeping it whole as it is with ASAN and LSAN etc, I will investigate if we can get the IPV6 tests working. If I can’t get them working, I would like to update the python IPV6 check as it’s currently passing even though IPV6 isn’t working on the runner.

  8. in .github/workflows/ci.yml:322 in 9d26b3bb86 outdated
    317+    timeout-minutes: 120
    318+    env:
    319+      FILE_ENV: "./ci/test/00_setup_env_native_asan.sh"
    320+      CIRRUS_CI: true
    321+      TEST_RUNNER_EXTRA: --exclude "rpc_bind.py --ipv6,feature_proxy.py"
    322+      MAKEJOBS: '-j1'
    


    willcl-ark commented at 8:37 am on May 30, 2024:
    Is a single job chosen for a reson here? The free hosted GHA runners have 4 (v)CPUs available AFAIK…

    m3dwards commented at 9:33 am on May 30, 2024:
    I have been hitting issues on github actions runners cancelling the job after 4 minutes ,which is during compile, so this was an attempt to reduce the CPU load. It hasn’t helped. See https://github.com/actions/runner-images/issues/9848#issuecomment-2137525256
  9. 0xB10C commented at 10:01 am on May 30, 2024: contributor
    Concept ACK! Thanks for working on this!
  10. maflcko commented at 7:21 pm on May 31, 2024: member
    Please remove the Cirrus task in the same commit, to make review easier.
  11. m3dwards commented at 12:03 pm on June 3, 2024: contributor
    @maflcko @hebasto @fanquake do we want to just extract the USDT part of the job into Github Actions?
  12. maflcko commented at 12:13 pm on June 3, 2024: member
    Not sure. It seems more work and overhead to maintain two tasks, instead of one, no?
  13. maflcko commented at 12:21 pm on June 3, 2024: member
    Well, if you really wanted to do it, you could move it into the “test each commit” task, and make that one run every time, to keep the number of tasks the same.
  14. m3dwards commented at 5:10 pm on June 3, 2024: contributor
    @maflcko happy to keep the job together as it is. Nothing prevents a split later if it’s desired. I will look into the functional tests that require IPV6.
  15. hebasto added the label Needs CMake port on Jun 5, 2024
  16. m3dwards force-pushed on Jun 7, 2024
  17. m3dwards force-pushed on Jun 7, 2024
  18. in ci/test/02_run_container.sh:32 in 3fd7bf0105 outdated
    28@@ -29,6 +29,8 @@ if [ -z "$DANGER_RUN_CI_ON_HOST" ]; then
    29   docker volume create "${CONTAINER_NAME}_depends_SDKs_android" || true
    30   docker volume create "${CONTAINER_NAME}_previous_releases" || true
    31 
    32+  docker network create --ipv6 --subnet 2001:0DB8::/112 ip6net || true
    


    maflcko commented at 12:24 pm on June 7, 2024:
    Please clarify in the name that this was set up for CI

    maflcko commented at 12:34 pm on June 7, 2024:
    Also, it could split this up into a separate commit/pull, as this affects more than just this one task?
  19. in ci/test/02_run_container.sh:59 in 3fd7bf0105 outdated
    55@@ -54,8 +56,10 @@ if [ -z "$DANGER_RUN_CI_ON_HOST" ]; then
    56                   --mount "type=volume,src=${CONTAINER_NAME}_depends_sources,dst=$DEPENDS_DIR/sources" \
    57                   --mount "type=volume,src=${CONTAINER_NAME}_depends_SDKs_android,dst=$DEPENDS_DIR/SDKs/android" \
    58                   --mount "type=volume,src=${CONTAINER_NAME}_previous_releases,dst=$PREVIOUS_RELEASES_DIR" \
    59+                  --env INSTALL_BCC_TRACING_TOOLS="${INSTALL_BCC_TRACING_TOOLS}" \
    


    maflcko commented at 12:26 pm on June 7, 2024:

    I don’t think this works, does it? installation happens when the image is built (docker build), not when the tests are compiled and run.

    See the Cirrus CI comment:

    0   # In the image build step, no external environment variables are available,
    1    # so any settings will need to be written to the settings env file:
    

    m3dwards commented at 1:04 pm on June 7, 2024:
    Oh you are right. This was a late change after I had already had the USDT tests running and didn’t check that they were still running 🤦
  20. in ci/test/00_setup_env_native_asan.sh:25 in 3fd7bf0105 outdated
    21@@ -20,7 +22,7 @@ export CONTAINER_NAME=ci_native_asan
    22 export PACKAGES="systemtap-sdt-dev clang-18 llvm-18 libclang-rt-18-dev python3-zmq qtbase5-dev qttools5-dev-tools libevent-dev libboost-dev libdb5.3++-dev libminiupnpc-dev libnatpmp-dev libzmq3-dev libqrencode-dev libsqlite3-dev ${BPFCC_PACKAGE}"
    23 export NO_DEPENDS=1
    24 export GOAL="install"
    25-export BITCOIN_CONFIG="--enable-usdt --enable-zmq --with-incompatible-bdb --with-gui=qt5 \
    26+export BITCOIN_CONFIG="--enable-usdt --enable-zmq --with-incompatible-bdb --with-gui=no --enable-fuzz-binary=no --disable-bench \
    


    maflcko commented at 12:26 pm on June 7, 2024:
    ?

    m3dwards commented at 12:58 pm on June 7, 2024:
    I was under the impression that these things were not relevant for these sanitisation tests but if they are I will add them back in.

    maflcko commented at 1:02 pm on June 7, 2024:
    The gui tests and benchmarks should also be running under the sanitizer, otherwise bugs may be missed in them.

    maflcko commented at 8:09 am on June 9, 2024:
    In theory, only the fuzz binary compilation could be excluded, because there is a separate task for the fuzz tests under asan. But either seems fine.
  21. maflcko changes_requested
  22. maflcko commented at 12:27 pm on June 7, 2024: member
    lgtm, except for the feedback
  23. m3dwards renamed this:
    ci: move ASAN job to GitHub Actions from Cirrus CI
    ci: move ASan job to GitHub Actions from Cirrus CI
    on Jun 7, 2024
  24. m3dwards force-pushed on Jun 8, 2024
  25. ci: add IPV6 network to ci container
    Allows IPV6 functional tests to run inside the container
    4b527fa93b
  26. m3dwards force-pushed on Jun 8, 2024
  27. DrahtBot added the label CI failed on Jun 8, 2024
  28. in ci/test/02_run_container.sh:23 in b4b4f27fdc outdated
    15@@ -16,19 +16,24 @@ if [ -z "$DANGER_RUN_CI_ON_HOST" ]; then
    16   # System-dependent env vars must be kept as is. So read them from the container.
    17   docker run --rm "${CI_IMAGE_NAME_TAG}" bash -c "env | grep --extended-regexp '^(HOME|PATH|USER)='" | tee --append "/tmp/env-$USER-$CONTAINER_NAME"
    18   echo "Creating $CI_IMAGE_NAME_TAG container to run in"
    19+
    20   DOCKER_BUILDKIT=1 docker build \
    21       --file "${BASE_READ_ONLY_DIR}/ci/test_imagefile" \
    22       --build-arg "CI_IMAGE_NAME_TAG=${CI_IMAGE_NAME_TAG}" \
    23+      --build-arg "INSTALL_BCC_TRACING_TOOLS=${INSTALL_BCC_TRACING_TOOLS}" \
    


    maflcko commented at 8:06 am on June 9, 2024:

    Not sure about exposing this. This is an expert-only setting. Anyone changing it needs to understand exactly what they are doing, so they might as well modify the ci/test/00_setup_env_native_asan.sh config file manually.

    I’d prefer to keep the sed approach as-is. If you want to change it, it may be better to do in a separate pull request. This way, the focus here is kept on only moving the task from one infra to the other.


    m3dwards commented at 9:59 am on June 10, 2024:

    Initially I changed it as I didn’t understand why it needed the sed, but after I did understand, I thought it was nicer for someone to be able to run the USDT tests outside of CI by changing an environment variable rather than changing the contents of a file.

    Happy to change it back to the sed approach.


    maflcko commented at 10:33 am on June 10, 2024:

    I thought it was nicer for someone to be able to run the USDT tests outside of CI by changing an environment variable

    This is not enough. They will have to set up the exact system (kernel) outside the CI, as is used inside the CI. Either in a VM or on metal.

    Also the new build arg may invalidate the layer cache of unrelated images.

    Seems fine, if you feel strongly. Happy to ACK either approach.


    m3dwards commented at 1:22 pm on June 12, 2024:
    No strong feelings, switched it back to sed approach.
  29. in ci/test/02_run_container.sh:34 in b4b4f27fdc outdated
    30   docker volume create "${CONTAINER_NAME}_depends" || true
    31   docker volume create "${CONTAINER_NAME}_depends_sources" || true
    32   docker volume create "${CONTAINER_NAME}_depends_SDKs_android" || true
    33   docker volume create "${CONTAINER_NAME}_previous_releases" || true
    34 
    35+  docker network create --ipv6 --subnet 1111:1111::/112 ci-ip6net || true
    


    maflcko commented at 2:25 pm on June 9, 2024:

    Is this a fix for a bug outside of GitHub Actions? If yes, it would be good to have exact steps to test locally. Maybe even in a separate pull request.


    m3dwards commented at 10:06 am on June 10, 2024:

    Yes, there are two issues that this fixes. One is with the RPC server and another is with -proxy. Both issues stem from the fact that the containers have IPV6 but only on the loopback interface (::1).

    I have made a separate PR for the -proxy one which is #30245 (comment)

    There is an existing (closed) issue for the bind problem which is #13155 but this is harder to fix as it’s in libevent. I was planning on adding my comments to that closed issue and perhaps reopening it.

    If both these issues were fixed then we could remove this IPV6 network and all the tests should continue passing, including the IPV6 ones.

  30. ci: move Asan / LSan / USDT job to Github Actions
    Moving it from Cirrus CI so it can be easier to maintain and used by forks
    9eea51d905
  31. in .cirrus.yml:171 in b4b4f27fdc outdated
    166-    # so any settings will need to be written to the settings env file:
    167-    - sed -i "s|\${CIRRUS_CI}|true|g" ./ci/test/00_setup_env_native_asan.sh
    168-  << : *GLOBAL_TASK_TEMPLATE
    169-  persistent_worker:
    170-    labels:
    171-      type: noble  # Must use this specific worker (needed for USDT functional tests)
    


    maflcko commented at 7:49 pm on June 10, 2024:
    Should remove the noble type completely in this file

    m3dwards commented at 1:21 pm on June 12, 2024:
    Done
  32. m3dwards force-pushed on Jun 12, 2024
  33. maflcko commented at 3:30 pm on June 12, 2024: member

    review ACK 9eea51d9058ad638861aa4b94c1c6e71caeb8765

    lgtm

  34. DrahtBot requested review from hebasto on Jun 12, 2024
  35. DrahtBot requested review from 0xB10C on Jun 12, 2024
  36. m3dwards marked this as ready for review on Jun 12, 2024
  37. kristapsk commented at 4:31 pm on June 12, 2024: contributor
    Concept ACK
  38. in .github/workflows/ci.yml:320 in 9eea51d905
    315+    # No need to run on the read-only mirror, unless it is a PR.
    316+    if: github.repository != 'bitcoin-core/gui' || github.event_name == 'pull_request'
    317+    timeout-minutes: 120
    318+    env:
    319+      FILE_ENV: "./ci/test/00_setup_env_native_asan.sh"
    320+      INSTALL_BCC_TRACING_TOOLS: true
    


    maflcko commented at 4:52 pm on June 12, 2024:
    nit: Can remove this.

    m3dwards commented at 9:19 am on June 18, 2024:
  39. in .github/workflows/ci.yml:342 in 9eea51d905
    337+        # In the image build step, no external environment variables are available,
    338+        # so any settings will need to be written to the settings env file:
    339+        run: sed -i "s|\${INSTALL_BCC_TRACING_TOOLS}|true|g" ./ci/test/00_setup_env_native_asan.sh
    340+
    341+      - name: CI script
    342+        run: ./ci/test_run_all.sh
    


    maflcko commented at 4:58 pm on June 12, 2024:
    nit: could you link to a failing run? Just to double check that CI_FAILFAST_TEST_LEAVE_DANGLING works?

    m3dwards commented at 10:35 am on June 18, 2024:

    Failing run: https://github.com/m3dwards/bitcoin/actions/runs/9354822480/job/25748533616

    Looks ok to me, shows exit code 1.

    For completeness, here is a run with a failing test with CI_FAILFAST_TEST_LEAVE_DANGLING removed: https://github.com/m3dwards/bitcoin/actions/runs/9562790625/job/26359909811. The job fails with exit code 137. The windows job behaves differently but I don’t think that uses --failfast

  40. DrahtBot removed the label CI failed on Jun 12, 2024
  41. hebasto approved
  42. hebasto commented at 11:16 am on June 17, 2024: member
    ACK 9eea51d9058ad638861aa4b94c1c6e71caeb8765.
  43. achow101 commented at 7:35 pm on June 17, 2024: member
    ACK 9eea51d9058ad638861aa4b94c1c6e71caeb8765
  44. achow101 merged this on Jun 17, 2024
  45. achow101 closed this on Jun 17, 2024

  46. m3dwards deleted the branch on Jun 17, 2024
  47. maflcko commented at 6:40 am on June 18, 2024: member
    found a nit
  48. fanquake referenced this in commit 5fbdfe7104 on Jun 18, 2024
  49. in .github/workflows/ci.yml:332 in 9eea51d905
    327+
    328+      - name: Restore Ccache cache
    329+        id: ccache-cache
    330+        uses: actions/cache/restore@v4
    331+        with:
    332+          path: ${{ env.CCACHE_DIR }}
    


    maflcko commented at 11:07 am on June 19, 2024:

    https://github.com/bitcoin/bitcoin/actions/runs/9579271766/job/26411361263#step:7:12

    0Warning: Path Validation Error: Path(s) specified in the action for caching do(es) not exist, hence no cache is being saved.
    

    m3dwards commented at 1:41 pm on June 19, 2024:

    CCache is currently written to cache docker volume which I assume on Cirrus CI gets shared between multiple jobs? As a github actions is more idempotent / ephemeral this isn’t going to work as the ccache dir on the host is left empty at the end of the run.

    I can see that this is the same strategy used for depends output and sources and previous_releases.

    The solution could be to modify 02_run_container.sh script to have “GHA mode” where instead of using docker volumes we just bind mount to directories on the host that the github cache action can copy from / to.


    maflcko commented at 1:52 pm on June 19, 2024:
    Ah right. I forgot about this. Your suggestion sounds good.

    m3dwards commented at 6:48 pm on June 19, 2024:
  50. fanquake referenced this in commit 43c40dd808 on Jun 19, 2024
  51. fanquake referenced this in commit 0d524b1484 on Jun 19, 2024
  52. fanquake commented at 11:54 am on June 19, 2024: member
    Backported to 27.x in #30305.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-06-29 04:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me