ci: add option for running tests without volume #30310

pull m3dwards wants to merge 2 commits into bitcoin:master from m3dwards:240519-ci-gha-no-volume changing 3 files +27 −4
  1. m3dwards commented at 6:48 pm on June 19, 2024: contributor

    Fixes: #30193 (review)

    Cache wasn’t being saved when run on GHA because the default behaviour of the CI script was to store cache items in a docker volume. This works on Cirrus CI as the volumes are shared but it does not work on Github Actions in which each run is ephemeral.

    Kept the default behaviour the same so hopefully this continues to work for the Cirrus CI jobs.

  2. DrahtBot commented at 6:48 pm on June 19, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, hebasto

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Tests on Jun 19, 2024
  4. in .github/workflows/ci.yml:322 in ec8d84b87c outdated
    319         uses: actions/checkout@v4
    320 
    321-      - name: Set Ccache directory
    322-        run: echo "CCACHE_DIR=${RUNNER_TEMP}/ccache_dir" >> "$GITHUB_ENV"
    323+      # DEPENDS_DIR and PREVIOUS_RELEASES_DIR are not currently cached on GHA
    324+      # Directories provided to keep CI script happy
    


    maflcko commented at 6:03 am on June 20, 2024:

    Can you explain this sentence and the meaning of “happy”? (Are you trying to avoid a printed warning, or is this an error?)

    The asan task doesn’t use depends, nor previous releases. This is defined by the CI task config, not GHA.


    m3dwards commented at 10:33 am on June 20, 2024:

    The script makes sure the directories exist to mount into the container:

    mkdir -p "${DEPENDS_DIR}/built"

    This directory has to be in a location that GHA allows you to create a directory such as ${RUNNER_TEMP}. The default specified in 00_setup_env.sh would be /ci_container_base/depends.

    So even though this specific job doesn’t use depends this was done to make sure the container could mount all of the directories.

    I will look at solving this in a less confusing way, perhaps just setting BASE_ROOT_DIR like the MacOS job does.

  5. in ci/test/02_run_container.sh:76 in ec8d84b87c outdated
    75-                  --mount "type=volume,src=${CONTAINER_NAME}_depends_sources,dst=$DEPENDS_DIR/sources" \
    76-                  --mount "type=volume,src=${CONTAINER_NAME}_previous_releases,dst=$PREVIOUS_RELEASES_DIR" \
    77+                  --mount ${CI_DOCKER_CCACHE_MOUNT} \
    78+                  --mount ${CI_DOCKER_DEPENDS_MOUNT} \
    79+                  --mount ${CI_DOCKER_DEPENDS_SOURCES_MOUNT} \
    80+                  --mount ${CI_DOCKER_PREVIOUS_RELEASES_MOUNT} \
    


    maflcko commented at 6:05 am on June 20, 2024:
    missing quotes?
  6. in ci/test/00_setup_env.sh:73 in ec8d84b87c outdated
    69@@ -70,3 +70,4 @@ export CI_BASE_PACKAGES=${CI_BASE_PACKAGES:-build-essential libtool autotools-de
    70 export GOAL=${GOAL:-install}
    71 export DIR_QA_ASSETS=${DIR_QA_ASSETS:-${BASE_SCRATCH_DIR}/qa-assets}
    72 export CI_RETRY_EXE=${CI_RETRY_EXE:-"retry --"}
    73+export CI_DOCKER_USE_VOLUMES_FOR_CACHE=${CI_DOCKER_USE_VOLUMES_FOR_CACHE:-true}
    


    maflcko commented at 6:07 am on June 20, 2024:
    I’d say this should not be exposed. This is a dangerous operation, that may corrupt the host. It should be hidden similar to DANGER_RUN_CI_ON_HOST.
  7. in ci/test/02_run_container.sh:49 in ec8d84b87c outdated
    41@@ -47,15 +42,38 @@ if [ -z "$DANGER_RUN_CI_ON_HOST" ]; then
    42   # When detecting podman-docker, `--external` should be added.
    43   docker image prune --force --filter "label=$CI_IMAGE_LABEL"
    44 
    45+  if [ "$CI_DOCKER_USE_VOLUMES_FOR_CACHE" = true ]; then
    46+    docker volume create "${CONTAINER_NAME}_ccache" || true
    47+    docker volume create "${CONTAINER_NAME}_depends" || true
    48+    docker volume create "${CONTAINER_NAME}_depends_sources" || true
    49+    docker volume create "${CONTAINER_NAME}_previous_releases" || true
    


    maflcko commented at 6:10 am on June 20, 2024:

    nit: Could leave this on by default? Having empty volumes created in a CI env that will be discarded should be risk-free.

    I’d say the code should be:

    0docker volume create ...
    1CCACHE_MOUNT=...volume...
    2if DANGER_CI_ON_HOST_CACHE_FOLDERS:
    3  mkdir -p ...
    4  CCACHE_MOUNT=...bind...
    5fi
    
  8. maflcko approved
  9. m3dwards force-pushed on Jun 20, 2024
  10. m3dwards force-pushed on Jun 20, 2024
  11. m3dwards force-pushed on Jun 20, 2024
  12. in ci/test/00_setup_env.sh:21 in 5bf7a08787 outdated
    16@@ -17,7 +17,8 @@ export BASE_READ_ONLY_DIR
    17 # This folder only exists on the ci guest and will be a copy of BASE_READ_ONLY_DIR
    18 export BASE_ROOT_DIR="${BASE_ROOT_DIR:-/ci_container_base}"
    19 # The depends dir.
    20-# This folder exists only on the ci guest, and on the ci host as a volume.
    21+# This folder exists only on the ci guest, and on the ci host as a volume
    22+# unless DANGER_CI_ON_HOST_CACHE_FOLDERS is set, in which case it is created on the host also.
    


    maflcko commented at 12:49 pm on June 20, 2024:
    nit: no need to document this. For DANGER_ options, the user already fully understand what they are doing (brick their system), so no need to hold their hands or otherwise encourage it.
  13. m3dwards force-pushed on Jun 20, 2024
  14. m3dwards commented at 12:57 pm on June 20, 2024: contributor
    Setting to draft while I fix the job, will test on my own fork and re-open when ready for review.
  15. m3dwards marked this as a draft on Jun 20, 2024
  16. m3dwards force-pushed on Jun 20, 2024
  17. in .github/workflows/ci.yml:325 in 5611197fce outdated
    320 
    321       - name: Set Ccache directory
    322         run: echo "CCACHE_DIR=${RUNNER_TEMP}/ccache_dir" >> "$GITHUB_ENV"
    323 
    324+      - name: Set base root directory
    325+        run: echo "BASE_ROOT_DIR=${RUNNER_TEMP}" >> "$GITHUB_ENV"
    


    m3dwards commented at 1:28 pm on June 20, 2024:
    This was made as a step and not in the env: section of the job as ${{ runner.temp }} is only available in the run steps.

    m3dwards commented at 1:34 pm on June 20, 2024:
    RUNNER_TEMP was used over ${{ github.workspace }} (as is used in the MacOS job) because the CI container mounts that as readonly which means it can’t be used to write cache items to.
  18. in ci/test/02_run_container.sh:76 in 5611197fce outdated
    75-                  --mount "type=volume,src=${CONTAINER_NAME}_depends_sources,dst=$DEPENDS_DIR/sources" \
    76-                  --mount "type=volume,src=${CONTAINER_NAME}_previous_releases,dst=$PREVIOUS_RELEASES_DIR" \
    77+                  --mount "${CI_DOCKER_CCACHE_MOUNT}" \
    78+                  --mount "${CI_DOCKER_DEPENDS_MOUNT}" \
    79+                  --mount "${CI_DOCKER_DEPENDS_SOURCES_MOUNT}" \
    80+                  --mount "${CI_DOCKER_PREVIOUS_RELEASES_MOUNT}" \
    


    maflcko commented at 1:30 pm on June 20, 2024:
    0                  --mount "${CI_PREVIOUS_RELEASES_MOUNT}" \
    

    nit: Could remove the docker mention?

  19. in ci/test/02_run_container.sh:45 in 5611197fce outdated
    41@@ -47,15 +42,38 @@ if [ -z "$DANGER_RUN_CI_ON_HOST" ]; then
    42   # When detecting podman-docker, `--external` should be added.
    43   docker image prune --force --filter "label=$CI_IMAGE_LABEL"
    44 
    45+  docker volume create "${CONTAINER_NAME}_ccache" || true
    


    maflcko commented at 1:31 pm on June 20, 2024:
    nit: Could keep the order as-is and not move this?
  20. maflcko approved
  21. maflcko commented at 1:33 pm on June 20, 2024: member

    lgtm.

    Maybe wait for the CI to pass, then address the nits to test the created cache?

  22. kevkevinpal commented at 1:59 pm on June 20, 2024: contributor

    Is there a reason why we don’t want to just have cirrus behave in the same way we’d do it on GHA, seems like GHA cannot store to shared cache but cirrus can act the same was as GHA. Just thinking maybe theres a way to reduce complexity in ci/test/02_run_container.sh and have more shared logic between the two ci’s

    if I were to set DANGER_CI_ON_HOST_CACHE_FOLDERS = 1 on the cirrus pipeline would the ci fail?

  23. maflcko commented at 3:14 pm on June 20, 2024: member

    Is there a reason why we don’t want to just have cirrus behave in the same way we’d do it on GHA, seems like GHA cannot store to shared cache but cirrus can act the same was as GHA. Just thinking maybe theres a way to reduce complexity in ci/test/02_run_container.sh and have more shared logic between the two ci’s

    Yes, it is possible to do the same on cirrus. However, for local runs it can not be done, because DANGER_CI_ON_HOST_CACHE_FOLDERS is dangerous and bricks your system. So I’d say to leave it as-is.

  24. maflcko commented at 3:17 pm on June 20, 2024: member

    Maybe wait for the CI to pass, then address the nits to test the created cache?

    Oh, I guess only master creates the cache?

    0        if: github.event_name != 'pull_request' && steps.ccache-cache.outputs.cache-hit != 'true'
    

    So I guess this can be merged after addressing the nits.

  25. ci: add option for running tests without volume
    DANGER_CI_ON_HOST_CACHE_FOLDERS if set will mount caches in directories on the host rather than in docker volumes. Supports saving and restoring caches on Github Actions.
    4ecbbd9b7f
  26. m3dwards force-pushed on Jun 20, 2024
  27. m3dwards commented at 4:44 pm on June 20, 2024: contributor

    Oh, I guess only master creates the cache?

    Branches on forks work, see cache restore here: https://github.com/m3dwards/bitcoin/actions/runs/9601151359/job/26479048839

  28. m3dwards marked this as ready for review on Jun 20, 2024
  29. m3dwards commented at 4:46 pm on June 20, 2024: contributor
    Nits should be addressed @maflcko
  30. m3dwards commented at 4:59 pm on June 20, 2024: contributor
    This job has a ccache size limit of 100mb, the MacOS job has a ccache limit of 400m. @hebasto is there something specific to MacOS for why the ccache limit is higher there?
  31. hebasto commented at 9:21 pm on June 20, 2024: member

    This job has a ccache size limit of 100mb, the MacOS job has a ccache limit of 400m. @hebasto is there something specific to MacOS for why the ccache limit is higher there?

    When I adjusted ccache size limits, I picked a size that allowed to populate the cleared cache with no cleanups in the ccache --show-stats report. I’m not aware of any peculiarities that make macOS caches bigger than others.

  32. hebasto commented at 9:26 pm on June 20, 2024: member

    Maybe wait for the CI to pass, then address the nits to test the created cache?

    Oh, I guess only master creates the cache?

    0        if: github.event_name != 'pull_request' && steps.ccache-cache.outputs.cache-hit != 'true'
    

    So I guess this can be merged after addressing the nits.

    Maybe comment that line out in a temporary “NO MERGE” commit?

  33. maflcko commented at 7:10 am on June 21, 2024: member
    utACK 4ecbbd9b7fa6f30e1d297cd26b112d3148744f58
  34. hebasto commented at 9:35 am on June 21, 2024: member

    Maybe wait for the CI to pass, then address the nits to test the created cache?

    Oh, I guess only master creates the cache?

    0        if: github.event_name != 'pull_request' && steps.ccache-cache.outputs.cache-hit != 'true'
    

    So I guess this can be merged after addressing the nits.

    Maybe comment that line out in a temporary “NO MERGE” commit?

    I’ve pushed a modified branch to my personal account. Here is a CI job: https://github.com/hebasto/bitcoin/actions/runs/9611511964/job/26510228294.

  35. m3dwards commented at 9:41 am on June 21, 2024: contributor

    I’ve pushed a modified branch to my personal account.

    Me also a few comments up :)

  36. hebasto commented at 9:42 am on June 21, 2024: member

    I’ve pushed a modified branch to my personal account.

    Me also a few comments up :)

    Oh sorry, I missed it.

  37. hebasto commented at 9:45 am on June 21, 2024: member

    I’ve pushed a modified branch to my personal account.

    Me also a few comments up :)

    Could you please rerun that job one more time to ensure that ccache --show-stats will print nearly 100% hit rate?

  38. m3dwards commented at 9:47 am on June 21, 2024: contributor

    Although I can see the cache get restored the job is no faster so I’m not sure it’s working.

    Could you please rerun that job one more time to ensure that ccache –show-stats will print nearly 100% hit rate?

    Will do this

  39. m3dwards commented at 10:00 am on June 21, 2024: contributor

    Re-running but the cache hit rate at the moment is low:

     0ccache version 4.9.1
     1Cacheable calls:   770 / 782 (98.47%)
     2  Hits:             59 / 770 ( 7.66%)
     3    Direct:          7 /  59 (11.86%)
     4    Preprocessed:   52 /  59 (88.14%)
     5  Misses:          711 / 770 (92.34%)
     6Uncacheable calls:  12 / 782 ( 1.53%)
     7Local storage:
     8  Cache size (GB): 0.1 / 0.1 (102.4%)
     9  Cleanups:        604
    10  Hits:             59 / 770 ( 7.66%)
    11  Misses:          711 / 770 (92.34%)
    
  40. hebasto commented at 10:11 am on June 21, 2024: member

    Re-running but the cache hit rate at the moment is low:

    The recent successive “ASan + LSan + UBSan + integer, no depends, USDT” job for the master branch on the Cirrus CI was https://cirrus-ci.com/task/6204664989876224. The merged PR did not touch C++ code at all. However, the hit rate is far from 100%:

     0+ bash -c 'ccache --version | head -n 1 && ccache --show-stats'
     1ccache version 4.9.1
     2Cacheable calls:   769 / 781 (98.46%)
     3  Hits:            260 / 769 (33.81%)
     4    Direct:         75 / 260 (28.85%)
     5    Preprocessed:  185 / 260 (71.15%)
     6  Misses:          509 / 769 (66.19%)
     7Uncacheable calls:  12 / 781 ( 1.54%)
     8Local storage:
     9  Cache size (GB): 0.2 / 0.2 (101.7%)
    10  Cleanups:        454
    11  Hits:            260 / 769 (33.81%)
    12  Misses:          509 / 769 (66.19%)
    

    I wonder, is it ccache compatibility issue with some sanitizer instrumentations?

  41. maflcko commented at 10:16 am on June 21, 2024: member
    The Cirrus cache is shared with pull requests, so it can not be used to compare it here. You’d have to run it locally on a fresh VM, twice.
  42. m3dwards commented at 12:33 pm on June 21, 2024: contributor

    Re-run results on GHA:

     0ccache version 4.9.1
     1Cacheable calls:   770 / 782 (98.47%)
     2  Hits:             62 / 770 ( 8.05%)
     3    Direct:          9 /  62 (14.52%)
     4    Preprocessed:   53 /  62 (85.48%)
     5  Misses:          708 / 770 (91.95%)
     6Uncacheable calls:  12 / 782 ( 1.53%)
     7Local storage:
     8  Cache size (GB): 0.1 / 0.1 (102.4%)
     9  Cleanups:        616
    10  Hits:             62 / 770 ( 8.05%)
    11  Misses:          708 / 770 (91.95%)
    

    I will increase the size and see if it helps

  43. hebasto commented at 12:36 pm on June 21, 2024: member

    I will increase the size and see if it helps

    Yes. Especially, considering too many cleanups.

  44. hebasto commented at 12:53 pm on June 21, 2024: member

    I will increase the size and see if it helps

    From my tests it follows that the ccache cache size is about 227 MB. So setting CCACHE_MAXSIZE=250MB or CCACHE_MAXSIZE=300MB should work.

  45. ci: increase available ccache size to 300MB da205dda14
  46. maflcko commented at 3:46 pm on June 21, 2024: member
    utACK da205dda14d7e71fe0567944117362c1b820ba8f
  47. hebasto approved
  48. hebasto commented at 3:48 pm on June 21, 2024: member
    ACK da205dda14d7e71fe0567944117362c1b820ba8f.
  49. m3dwards commented at 3:50 pm on June 21, 2024: contributor

    Looking much better with bigger cache size:

     0ccache version 4.9.1
     1Cacheable calls:   770 / 782 (98.47%)
     2  Hits:            769 / 770 (99.87%)
     3    Direct:        769 / 769 (100.0%)
     4    Preprocessed:    0 / 769 ( 0.00%)
     5  Misses:            1 / 770 ( 0.13%)
     6Uncacheable calls:  12 / 782 ( 1.53%)
     7Local storage:
     8  Cache size (GB): 0.2 / 0.2 (96.47%)
     9  Hits:            769 / 770 (99.87%)
    10  Misses:            1 / 770 ( 0.13%)
    

    ASan job has gone from 79 minutes down to 46 minutes with cache.

  50. fanquake merged this on Jun 24, 2024
  51. fanquake closed this on Jun 24, 2024

  52. m3dwards deleted the branch on Jun 24, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-06-29 07:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me