contrib: verify-binaries accept full arch-platform specifier #28418

pull willcl-ark wants to merge 1 commits into bitcoin:master from willcl-ark:verify-arch-os changing 2 files +50 −29
  1. willcl-ark commented at 9:18 pm on September 5, 2023: member

    Currently the verify-binaries script will accept “wildcards” such as “linux” or “darwin” as binary specifiers.

    Builders however usually only want to verify for their specific architecture-platform that they want to use, or verify everything. Update the script to accept single full architecture-platform specifiers as used by the binary repositories.

    Previously if you tried to use a complete specifier, e.g. 25.0-x86_64-linux-gnu it would download all linux-gnu platform binaries as this is parsed into 4 parts, and so version_rc and version_os are unspecified.

    With the new parser it’s still possible to specify “25.0-linux” or “25.0-x86_64” to verify multiple builds if so desired.

  2. DrahtBot commented at 9:18 pm on September 5, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Approach ACK TheCharlatan
    Approach NACK BenWestgate

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30147 (contrib: Fixup verify-binaries OS platform parsing by BenWestgate)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Scripts and tools on Sep 5, 2023
  4. TheCharlatan commented at 1:51 pm on September 8, 2023: contributor
    Approach ACK
  5. DrahtBot added the label CI failed on Jan 12, 2024
  6. contrib: verify-binaries accept full arch-platform specifier
    Currently the verify-binaries script will accept "wildcards" such as
    "linux" or "darwin" as binary specifiers.
    
    Builders however usually only want to verify for their specific
    architecture-platform that they want to use, or verify everything. Update the
    script to accept single full architecture-platform specifiers as used by the
    binary repositories.
    
    With the new parser it's also possible to specify "25.0-linux-gnu" or
    "25.0-x86_64" to verify multiple builds if so desired.
    1d16fee21d
  7. willcl-ark force-pushed on Jan 23, 2024
  8. DrahtBot removed the label CI failed on Jan 23, 2024
  9. achow101 requested review from laanwj on Apr 9, 2024
  10. DrahtBot added the label CI failed on Apr 26, 2024
  11. DrahtBot removed the label CI failed on May 1, 2024
  12. BenWestgate commented at 6:02 pm on May 22, 2024: contributor

    Concept ACK

    Have not tested this branch but based on reading the diff it Closes #30145.

    Approach: The script doesn’t need to know what part of the string is arch, os or platform to match the file. Considering this, a much simpler parser produces the same functionality: https://github.com/bitcoin/bitcoin/blob/9f6ccfced950cc4e334163af911ffc620bd5e216/contrib/verify-binaries/verify.py#L102-L111 Is there a reason to capture these in separate variables that I am missing?

    Comparing to VERSION_ARCHES and VERSION_PLATFORMS lists makes this file more work to maintain as it needs updates each time Bitcoin supports a new architecture or OS.

    The user interface strings of this branch are better than master. I have a test written for the “download only one file” scenario in the linked similar PR.

  13. in contrib/verify-binaries/README.md:63 in 1d16fee21d
    63-    --min-good-sigs 10 pub 22.0-x86
    64+    --min-good-sigs 10 pub 22.1-x86_64
    65 ```
    66 
    67-If you only want to download the binaries for a certain platform, add the corresponding suffix, e.g.:
    68+If you only want to download the binaries for a certain architecture and/or platform, add the corresponding suffix, e.g.:
    


    BenWestgate commented at 7:55 pm on May 22, 2024:
    I like this, it is better than master. I borrowed it.
  14. in contrib/verify-binaries/README.md:68 in 1d16fee21d
    70 ```sh
    71-./contrib/verify-binaries/verify.py pub 24.0.1-darwin
    72-./contrib/verify-binaries/verify.py pub 23.1-rc1-win64
    73+./contrib/verify-binaries/verify.py pub 24.2-rc1-win64
    74+./contrib/verify-binaries/verify.py pub 25.1-linux-gnu
    75+./contrib/verify-binaries/verify.py pub 26.0-x86_64-linux-gnu
    


    BenWestgate commented at 7:57 pm on May 22, 2024:
    It is good you give the “-x86_64-linux-gnu” example here as it’s a top choice a new user may want to specifically download & verify.
  15. in contrib/verify-binaries/verify.py:102 in 1d16fee21d
     95@@ -96,24 +96,35 @@ def bool_from_env(key, default=False) -> bool:
     96     raise ValueError(f"Unrecognized environment value {key}={raw!r}")
     97 
     98 
     99-VERSION_FORMAT = "<major>.<minor>[.<patch>][-rc[0-9]][-platform]"
    100-VERSION_EXAMPLE = "22.0-x86_64 or 23.1-rc1-darwin"
    101+VERSION_FORMAT = "<major>.<minor>[.<patch>][-rc[0-9]][-arch][-operating_system]"
    102+VERSION_EXAMPLE = "22.0-x86_64 or 23.1-rc1-apple-darwin or 25.0-x86_64-linux-gnu"
    103+VERSION_ARCHES = ["aarch64", "arm", "arm64", "osx", "osx64", "powerpc64", "powerpc64le", "riscv64", "x86_64"]
    104+VERSION_PLATFORMS = ["apple", "apple-darwin", "linux-gnu", "linux-gnueabihf", "win64"]
    


    BenWestgate commented at 8:03 pm on May 22, 2024:
    This will be more work to maintain than not worrying about what each part of the platform string is and only try to find matches in SHA256SUMS.
  16. in contrib/verify-binaries/verify.py:122 in 1d16fee21d
    131 
    132-    return version_base, version_rc, version_os
    133+    version_str_nonlocal = [version_str]  # wrapped in list to make it non-local for inner functions
    134+    version_rc = extract_segment([f"rc{i}" for i in range(20)])  # assume rc doesn't increment above 20
    135+    version_arch = extract_segment(VERSION_ARCHES)
    136+    version_platform = extract_segment(VERSION_PLATFORMS)
    


    BenWestgate commented at 8:32 pm on May 22, 2024:

    The VERSION_PLATFORM element "apple" will never match a fully specified release version string, it will always match "apple-darwin" first:

    0ben@ben:~/guix.sigs$ for file in $(grep all.SHA256SUMS <<< $(find .) | sed '/.asc/d' | uniq); do cat $file; done | grep -c "apple"
    14551
    2ben@ben:~/guix.sigs$ for file in $(grep all.SHA256SUMS <<< $(find .) | sed '/.asc/d' | uniq); do cat $file; done | grep -c "apple-darwin"
    34551
    

    Including both "apple" and "apple-darwin" in VERSION_PLATFORMS seems redundant unless this is intended as a shorthand.

  17. in contrib/verify-binaries/verify.py:125 in 1d16fee21d
    134+    version_rc = extract_segment([f"rc{i}" for i in range(20)])  # assume rc doesn't increment above 20
    135+    version_arch = extract_segment(VERSION_ARCHES)
    136+    version_platform = extract_segment(VERSION_PLATFORMS)
    137+
    138+    if version_str_nonlocal[0] != version_base:
    139+        raise ValueError(f"Unmatched segments found in specifier string '{version_str}'.")
    


    BenWestgate commented at 8:44 pm on May 22, 2024:

    This prevents people from specifying a version string like:

    0./verify.py pub 27.0-win64-setup.exe
    1./verify.py pub 27.0-win64.zip
    

    To fetch one of the specific files to download as we also have diversity in file extensions among -arch-platforms.

  18. in contrib/verify-binaries/verify.py:305 in 1d16fee21d
    300@@ -290,6 +301,8 @@ def join_url(host: str) -> str:
    301             "Have you specified the version number in the following format?\n"
    302             f"{VERSION_FORMAT} "
    303             f"(example: {VERSION_EXAMPLE})\n"
    304+            f"Supported architectures: {VERSION_ARCHES}\n"
    305+            f"Supported platforms: {VERSION_PLATFORMS}\n"
    


    BenWestgate commented at 8:53 pm on May 22, 2024:

    Suggest changing to:

    0f"Example architectures: {VERSION_ARCHES}\n"
    1f"Example platforms: {VERSION_PLATFORMS}\n"
    

    or

    0f"Suggested architectures: {VERSION_ARCHES}\n"
    1f"Suggested platforms: {VERSION_PLATFORMS}\n"
    

    so it can become out of date but not become wrong.

  19. in contrib/verify-binaries/verify.py:496 in 1d16fee21d
    492     except Exception as e:
    493         log.debug(e)
    494         log.error(f"unable to parse version; expected format is {VERSION_FORMAT}")
    495         log.error(f"  e.g. {VERSION_EXAMPLE}")
    496+        log.error(f"Supported architectures: {VERSION_ARCHES}")
    497+        log.error(f"Supported platforms: {VERSION_PLATFORMS}")
    


  20. in contrib/verify-binaries/verify.py:535 in 1d16fee21d
    531@@ -512,9 +532,9 @@ def cleanup():
    532         return sigs_status
    533 
    534     # Extract hashes and filenames
    535-    hashes_to_verify = parse_sums_file(SUMS_FILENAME, [os_filter])
    536+    hashes_to_verify = parse_sums_file(SUMS_FILENAME, [version_arch, os_filter])
    


    BenWestgate commented at 9:15 pm on May 22, 2024:

    As mentioned earlier, some values of version_arch and os_filter are still going to download multiple files but should not. Such as: “27.0-arm” fetches arm64 as well “27.0-arm64” fetches both .zip and .tar.gz “27.0-x86_64-apple-darwin” fetches both .zip and .tar.gz “27.0-win64” fetches both setup.exe and .zip etc.

    So we need to be able to specify more than this.

    That’s why I suggested a simpler approach that doesn’t try to understand what it’s matching and just gets the matching or partially matching file name(s) in SHA256SUMS I feel that gives users more flexibility and is less code to maintain.

  21. in contrib/verify-binaries/verify.py:537 in 1d16fee21d
    531@@ -512,9 +532,9 @@ def cleanup():
    532         return sigs_status
    533 
    534     # Extract hashes and filenames
    535-    hashes_to_verify = parse_sums_file(SUMS_FILENAME, [os_filter])
    536+    hashes_to_verify = parse_sums_file(SUMS_FILENAME, [version_arch, os_filter])
    537     if not hashes_to_verify:
    538-        log.error("no files matched the platform specified")
    539+        log.error("no files matched the platform specified. check available versions on bitcoincore.org/bin")
    


    BenWestgate commented at 9:16 pm on May 22, 2024:
    Another good error message.
  22. BenWestgate changes_requested
  23. BenWestgate commented at 9:45 pm on May 22, 2024: contributor

    Approach NACK

    Does not seem to completely solve the “Download just one file” use case for certain inputs. Will fail with error if the user attempts to specify further the file they want downloaded. More complex code than necessary may require updates if new platforms or architectures are supported by Bitcoin. Can’t specify certain architectures and platforms without also downloading other architectures and platforms.

    This is too much shoe horning and a pattern match on the file name in SHA256SUM, after special handling <version>[-rcN] since these have special rules, seems sufficient.

    I have tested this PR and found the following issues:

    This is the result whether 27.0-arm or 27.0-arm- are given to the command. It downloads two arches. In #30147, 27.0-arm- would just download only arm and not also arm64.

    0[INFO] downloading bitcoin-27.0-arm-linux-gnueabihf.tar.gz to /tmp/bitcoin_verify_binaries.27.0-arm
    1[INFO] downloading bitcoin-27.0-arm64-apple-darwin.zip to /tmp/bitcoin_verify_binaries.27.0-arm
    2[INFO] downloading bitcoin-27.0-arm64-apple-darwin.tar.gz to /tmp/bitcoin_verify_binaries.27.0-arm
    

    Here’s two files being downloaded despite full specification:

    0python3 verify.py -q pub 27.0-x86_64-apple-darwin
    1VERIFIED: bitcoin-27.0-x86_64-apple-darwin.zip
    2VERIFIED: bitcoin-27.0-x86_64-apple-darwin.tar.gz
    

    Trying to dial it down further makes it fail:

    0ben@ben$ python3 verify.py -q pub 27.0-x86_64-apple-darwin.zip
    1[ERROR] unable to parse version; expected format is <major>.<minor>[.<patch>][-rc[0-9]][-arch][-operating_system]
    2[ERROR]   e.g. 22.0-x86_64 or 23.1-rc1-apple-darwin or 25.0-x86_64-linux-gnu
    3[ERROR] Supported architectures: ['aarch64', 'arm', 'arm64', 'osx', 'osx64', 'powerpc64', 'powerpc64le', 'riscv64', 'x86_64']
    4[ERROR] Supported platforms: ['apple', 'apple-darwin', 'linux-gnu', 'linux-gnueabihf', 'win64']
    

    I can’t search for all “linux” (yes, I know I ignored the rules, but why should this be invalid?)

    0ben@ben$ python3 verify.py -q pub 27.0-linux
    1[ERROR] unable to parse version; expected format is <major>.<minor>[.<patch>][-rc[0-9]][-arch][-operating_system]
    2[ERROR]   e.g. 22.0-x86_64 or 23.1-rc1-apple-darwin or 25.0-x86_64-linux-gnu
    3[ERROR] Supported architectures: ['aarch64', 'arm', 'arm64', 'osx', 'osx64', 'powerpc64', 'powerpc64le', 'riscv64', 'x86_64']
    4[ERROR] Supported platforms: ['apple', 'apple-darwin', 'linux-gnu', 'linux-gnueabihf', 'win64']
    

    To conclude however, I want this feature so badly that I do not care about these issues since I’m only downloading x86_64-linux-gnu, this PR works for me. It is good enough as is, I just (biased ofc) liked my approach in #30147 better since it didn’t have these issues.

  24. willcl-ark closed this on May 24, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me