guix: Use pigz as a faster gzip replacement #21478

dongcarl wants to merge 15 commits into bitcoin:master from dongcarl:2021-03-guix-pigz-can-fly, changing 12 files (+403 −81)
  1. dongcarl commented at 3:05 pm on March 19, 2021: member

    Based on: #21375

    When running guix builds, I found that a large portion of the time was being spent in gzip. Therefore I experimented with replacing gzip with pigz, which makes use of multiple cores. Here are the wall-clock durations for building all architectures on my workstation (AMD Ryzen Threadripper 2970WX 24-Core Processor):

                no depends cache       depends cache
    gzip        93.19 mins             44.75 mins
    pigz        75.76 mins             27.37 mins
    change      -17.43 mins (-19%)     -17.38 mins (-39%)
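
The percentages in the change row can be reproduced from the durations in the table. The following is a minimal sketch, not part of the PR; the helper name `pct_change` is hypothetical and `awk` is assumed to be available.

```shell
# Relative change in wall-clock time, computed from the durations in the table.
pct_change() {
    # $1 = baseline minutes (gzip), $2 = new minutes (pigz)
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%+.0f%%\n", (new - old) / old * 100 }'
}

pct_change 93.19 75.76   # no depends cache
pct_change 44.75 27.37   # depends cache
```

The absolute saving is nearly the same in both columns (about 17.4 minutes), so the relative gain is larger when the depends cache is already populated.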

    However, I do understand if people don’t think the gains are worth increasing our manifest size. Please let me know what you think!

    $ find guix-build-$(git rev-parse --short=12 HEAD)/output/ -type f -print0 | env LC_ALL=C sort -z | xargs -r0 sha256sum
    c5c26a1f9048ffe786d956cc4fc7ec3347a6084d0f91cc83bb40a775c57ad0dc  guix-build-ed8cfdd531de/output/aarch64-linux-gnu/bitcoin-ed8cfdd531de-aarch64-linux-gnu-debug.tar.gz
    8cbc109723f8877e98b7d453ffc3b41d7a443105a1e3325c59cb3d3ef24ae504  guix-build-ed8cfdd531de/output/aarch64-linux-gnu/bitcoin-ed8cfdd531de-aarch64-linux-gnu.tar.gz
    54d49f2aabb365aade70f7097d903fc3879405170d729c26de385ec1efde3d26  guix-build-ed8cfdd531de/output/arm-linux-gnueabihf/bitcoin-ed8cfdd531de-arm-linux-gnueabihf-debug.tar.gz
    2613ecc288989f2abe0037f2158b321b9991689d4cce0576aebc2eb4ad2eb0aa  guix-build-ed8cfdd531de/output/arm-linux-gnueabihf/bitcoin-ed8cfdd531de-arm-linux-gnueabihf.tar.gz
    5711f8bfbcc8c60b956bd9a74891a84972ef9568467fb04021c8315d5cb7372a  guix-build-ed8cfdd531de/output/dist-archive/bitcoin-ed8cfdd531de.tar.gz
    ac4c96ec8b17e08e021628d7311966be4df198360e6564a3441fb22d1590b0fb  guix-build-ed8cfdd531de/output/powerpc64-linux-gnu/bitcoin-ed8cfdd531de-powerpc64-linux-gnu-debug.tar.gz
    a84a9fe20e3f2c1ddff94794db2f5981965f48c37e5d0288ef5a6c904b4ef991  guix-build-ed8cfdd531de/output/powerpc64-linux-gnu/bitcoin-ed8cfdd531de-powerpc64-linux-gnu.tar.gz
    12af621c5c081ffbe0722f24a4616e22cca0623632990231f0d4016e5821a365  guix-build-ed8cfdd531de/output/powerpc64le-linux-gnu/bitcoin-ed8cfdd531de-powerpc64le-linux-gnu-debug.tar.gz
    623f9a6055cffae19b1b352e967fb94b6fd0d9d1e4bb3cac555997308256cdf1  guix-build-ed8cfdd531de/output/powerpc64le-linux-gnu/bitcoin-ed8cfdd531de-powerpc64le-linux-gnu.tar.gz
    c47ebaa781fad44ff249b61a4d7b05214b838580546b3c11de971c639835114b  guix-build-ed8cfdd531de/output/riscv64-linux-gnu/bitcoin-ed8cfdd531de-riscv64-linux-gnu-debug.tar.gz
    79717355815f39e291e5fe3ca5ba1390240344a4b9b2d12325fba9cdab81a735  guix-build-ed8cfdd531de/output/riscv64-linux-gnu/bitcoin-ed8cfdd531de-riscv64-linux-gnu.tar.gz
    6db76e37fcab591c13ebd7fe2527c62415f5ca16c91a8d395c49d36d84f22211  guix-build-ed8cfdd531de/output/x86_64-apple-darwin18/bitcoin-ed8cfdd531de-osx-unsigned.dmg
    fd79d768e2478e5ab6a0534fad3c8ff033efe34dd26d98040fa8ace7c642073c  guix-build-ed8cfdd531de/output/x86_64-apple-darwin18/bitcoin-ed8cfdd531de-osx-unsigned.tar.gz
    96239efb584677390a9d95c08b1ec5b29d01c40619f252eb7d2fb7ca3f6e2571  guix-build-ed8cfdd531de/output/x86_64-apple-darwin18/bitcoin-ed8cfdd531de-osx64.tar.gz
    d5c5017202f69a74e0c172295de2e08beb007c32acd7f0e6f6213a6876bf5960  guix-build-ed8cfdd531de/output/x86_64-linux-gnu/bitcoin-ed8cfdd531de-x86_64-linux-gnu-debug.tar.gz
    ff5de879eec3f3a6cdb25e50f77a105b82224bded2212411f204b414d87571f5  guix-build-ed8cfdd531de/output/x86_64-linux-gnu/bitcoin-ed8cfdd531de-x86_64-linux-gnu.tar.gz
    7ab80db27c3234c09804c39c72476b6a3ea92973d783b42031ac28911c914ac2  guix-build-ed8cfdd531de/output/x86_64-w64-mingw32/bitcoin-ed8cfdd531de-win-unsigned.tar.gz
    d351fc91e7b30ab73a6debf3f6fb46a1087e4e18c48b504ff56a01f37e74151f  guix-build-ed8cfdd531de/output/x86_64-w64-mingw32/bitcoin-ed8cfdd531de-win64-debug.zip
    b5bdd32eb5391e6cbb25b9ef485f386263a41651bbe7b3d48910d586e8078910  guix-build-ed8cfdd531de/output/x86_64-w64-mingw32/bitcoin-ed8cfdd531de-win64-setup-unsigned.exe
    23ea865a9916795bb3925bb9f29848e6d3b3e769d42acb99bd33880c8a0e93dc  guix-build-ed8cfdd531de/output/x86_64-w64-mingw32/bitcoin-ed8cfdd531de-win64.zip
    
  2. guix: Use --cores instead of --max-jobs
    In Guix, there are two flags for controlling parallelism:
    
    Note: When I say "derivation," think "package"
    
    --cores=n
      - controls the number of CPU cores to build each derivation. This is
        the value passed to `make`'s `--jobs=` flag.
      - defaults to 0: as many cores as are available
    
    --max-jobs=n
      - controls how many derivations can be built in parallel
      - defaults to 1
    
    Therefore, if we set --max-jobs=$MAX_JOBS and don't set --cores, Guix
    could theoretically spin up $MAX_JOBS * $(nproc) threads, and that's
    no good.
    
    So we could either default to --cores=1, --max-jobs=$MAX_JOBS
    
      - Pro: --cores=1 means that `make` will be invoked with `-j1`,
             avoiding problems with packages whose build systems and test
             suites break when running multi-threaded.
    
      - Con: There will be times when only 1 or 2 derivations can be built
             at a time, because the rest of the dependency graph all depend
             on those 1 or 2 derivations. During these times, the machine
             will be severely under-utilized.
    
    or --cores=$MAX_JOBS, --max-jobs=1
    
      - Pro: We don't encounter prolonged periods of
             severe under-utilization mentioned above.
    
      - Con: Many packages' build systems and test suites break when running
             multi-threaded.
    
    or --cores=1, --max-jobs=1 and let the user override with
    $ADDITIONAL_GUIX_COMMON_FLAGS
    70d4887dde
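
The trade-off described in this commit message can be made concrete with a rough upper bound on concurrent build threads. This is a simplified sketch under the assumption that each derivation uses at most `--cores` threads; the helper name `worst_case_threads` and the machine size of 24 cores are illustrative and do not come from the commit.

```shell
# Rough upper bound on concurrent build threads for a --max-jobs / --cores pair.
# A --cores value of 0 means "use all available cores".
worst_case_threads() {
    max_jobs="$1"; cores="$2"; ncpu="$3"
    if [ "$cores" -eq 0 ]; then
        cores="$ncpu"
    fi
    echo $(( max_jobs * cores ))
}

worst_case_threads 8 0 24   # --max-jobs=8, --cores left at its default of 0
worst_case_threads 1 8 24   # --max-jobs=1, --cores=8
worst_case_threads 1 1 24   # --max-jobs=1, --cores=1
```

The first call shows the case the message warns about: the bound grows with the product of the two values, not their sum.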
  3. contrib: Silence git-describe when looking for tag
    Otherwise, it prints a rather disturbing message to stderr:
    
        fatal: no tag exactly matches '<hash>'
    f194de43d2
  4. guix: Create windeploy inside distsrc-*
    ./windeploy is a "working directory", and therefore belongs inside
    distsrc-*. Many people have noticed their Guix builds failing after
    hours simply because they did not remove windeploy (but did remove the
    distsrc-* directories).
    48837c2598
  5. guix: Add source-able bash prelude and utils 73b2dbe4db
  6. guix: Remove guix-build.sh filename extension e463f98fbd
  7. guix: Adapt guix-build to prelude, restructure hier 0272ae6f36
  8. guix: Add troubleshooting documentation entries 6328800180
  9. guix: Supply --link-profile b4e6b4e815
  10. guix: More thoroughly control native toolchain 9f036f93a3
  11. guix: Add early health check for guix-daemon 23e43c9cca
  12. guix: Fallback to local build for substitute-enabled Guix users 7bccf7cfa8
  13. guix: Use clang-toolchain instead of clang 73b254c184
  14. guix: Remove codesign_allocate+pagestuff from unsigned tarball 61194db48c
  15. depends: libdmg-hfsplus: Skip CMake RPATH patching 8170456b5b
  16. guix: Use pigz for compression ed8cfdd531
  17. dongcarl added the label Needs Conceptual Review on Mar 19, 2021
  18. luke-jr commented at 3:14 pm on March 19, 2021: member
    Considering the huge performance impact of using Guix at all, this doesn’t make sense to me.
  19. dongcarl commented at 6:37 pm on March 19, 2021: member

    Considering the huge performance impact of using Guix at all, this doesn’t make sense to me.

    Sorry, I’m not fully understanding, are you saying: “Running Guix builds already strains my machine quite a bit, it wouldn’t make sense to strain it more by using pigz”?

  20. luke-jr commented at 6:41 pm on March 19, 2021: member
    No, I’m just saying gzip’s performance hit is trivial compared to migrating from gitian to guix.
  21. DrahtBot commented at 6:53 pm on March 19, 2021: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #21495 (build, qt: Fix static builds on macOS Big Sur by hebasto)
    • #21420 (build, qt: No longer need to patch translation.pro by hebasto)
    • #21375 (guix: Misc feedback-based fixes + hier restructuring by dongcarl)
    • #19817 (build: macOS toolchain bump by fanquake)
    • #17227 (Qt: Add Android packaging support by icota)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  22. sipa commented at 7:24 am on March 20, 2021: member

    Are there any concerns about pigz possibly introducing non-determinism, if it compresses in parallel? (looking at the documentation, my guess is no, but it’s worth testing).

    Otherwise, that’s a pretty nice and simple speedup. In fact, I’m kind of surprised we spend that much time on gzipping stuff that this even matters…

  23. laanwj commented at 9:20 am on March 20, 2021: member

    Otherwise, that’s a pretty nice and simple speedup. In fact, I’m kind of surprised we spend that much time on gzipping stuff that this even matters…

    I came here to say this, basically: given how much time is spent on compilation, I’m surprised that zlib packing takes that much time. It’s not like xz, which is comparatively much more CPU-heavy for compression.

    I’m also a little bit concerned about adding a non-standard dependency. Not enough to NACK this though.

    What about lowering the compression level for intermediate gz’s like caches? Depending on I/O speed versus CPU speed this can help, at the expense of somewhat more disk usage. [So to be clear I’m not suggesting less compression for the distribution.]

    A more high-level question would be: Are we at the point that we need to micro-optimize GUIX build speed, or is this something we can consider later? 🙂 (thanks for looking into it though! it’s very useful to know where time is spent)
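
To get a feel for the size cost of a lower compression level on intermediate archives, the compressed size of one file can be compared across gzip levels. This is only a sketch and was not run as part of this PR; the helper name `compressed_size` is hypothetical, and timings would have to be measured separately on the actual build machine.

```shell
# Print the compressed size in bytes of a file at a given gzip level.
compressed_size() {
    # $1 = compression level (1 = fastest, 9 = smallest), $2 = input file
    # -n omits the name and timestamp from the header, -c writes to stdout.
    gzip -n -c "-$1" "$2" | wc -c | tr -d ' '
}

# Usage, with any existing file:
#   compressed_size 1 some-cache.tar
#   compressed_size 9 some-cache.tar
```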

  24. dongcarl commented at 5:52 pm on March 20, 2021: member

    Are there any concerns about pigz possibly introducing non-determinism, if it compresses in parallel? (looking at the documentation, my guess is no, but it’s worth testing).

    I’ve asked this question here: https://github.com/madler/pigz/issues/90, and tested locally. It seems to be deterministic!

    What about lowering the compression level for intermediate gz’s like caches? Depending on I/O speed versus CPU speed this can help, at the expense of somewhat more disk usage. [So to be clear I’m not suggesting less compression for the distribution.]

    Probably worthwhile to experiment with that too!

    A more high-level question would be: Are we at the point that we need to micro-optimize GUIX build speed, or is this something we can consider later? 🙂 (thanks for looking into it though! it’s very useful to know where time is spent)

    Oh I think we can 100% consider this later, I’ve just done so many builds where I thought something was stuck for 3+ mins only to find out it’s gzip being slow :-) There’s no rush, that’s why it’s just a Draft PR. I just wanted to write down my crude findings somewhere public!
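
A local determinism check of the kind mentioned above might look like the following. This is a sketch, not the test that was actually run for this PR: the helper name `check_deterministic` is hypothetical, and it assumes `pigz` and `sha256sum` are on the PATH. It compresses the same file with different thread counts and compares the hashes of the resulting streams.

```shell
# Hypothetical helper: compress one file with several pigz thread counts
# and compare hashes of the compressed output.
check_deterministic() {
    input="$1"
    if ! command -v pigz >/dev/null 2>&1; then
        echo "pigz not installed"
        return 0
    fi
    # -n omits the name and timestamp from the gzip header, -c writes to
    # stdout, -p sets the number of compression threads.
    reference=$(pigz -n -c -p 2 "$input" | sha256sum)
    for threads in 4 8; do
        candidate=$(pigz -n -c -p "$threads" "$input" | sha256sum)
        if [ "$candidate" != "$reference" ]; then
            echo "output differs at -p $threads"
            return 1
        fi
    done
    echo "identical"
}
```

A matching result only covers one pigz version, one set of flags, and one input on one machine. It says nothing about byte-for-byte equality with gzip output.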

  25. sipa commented at 6:48 pm on March 20, 2021: member
    More fine-tuning / microoptimization: for purposes of just internally compressing things, zstd is a lot faster than gzip (at the same compression level).
  26. MarcoFalke commented at 7:07 am on March 21, 2021: member
    Is the release binary bit-identical with gzip vs pigz?
  27. sipa commented at 7:10 am on March 21, 2021: member

    Is the release binary bit-identical with gzip vs pigz?

    No, only between same-version same-settings pigz runs (with possibly different number of threads).

  28. luke-jr commented at 4:24 pm on March 21, 2021: member
    What about the same version/settings on different platforms? (Right now, we only support amd64, but maybe in the future…)
  29. fanquake commented at 2:53 am on September 2, 2021: member
    Let’s reconsider this in the future. Going to close for now.
  30. fanquake closed this on Sep 2, 2021

  31. DrahtBot locked this on Sep 2, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 00:12 UTC
