This PR attempts to make contrib/macdeploy/gen-sdk deterministic
Can anyone with the Xcode_12.2.xip confirm that gen-sdk produces the same hash? => e7ca56bc8804d16624fad68be2e71647747d6629cacaaa3de5fbfa7f444e9eae
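For anyone who would rather check the result from Python than with shasum, here is a minimal sketch. The helper names `sha256_of_file` and `matches_expected` are hypothetical and not part of gen-sdk; the expected value is the hash quoted above.

```python
import hashlib

# Hash quoted in the PR description above.
EXPECTED_SHA256 = "e7ca56bc8804d16624fad68be2e71647747d6629cacaaa3de5fbfa7f444e9eae"


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read fixed-size chunks so a large tarball is never held in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest equals the expected hex string."""
    return sha256_of_file(path) == expected.lower()
```

Calling `matches_expected("Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz")` in the directory that holds the tarball should give the same answer as comparing the shasum output by eye.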
Testing your branch @ 13a4a092dbeb3ec9a99398450941f4066dc92fcf:
shasum -a 256 Xcode_12.2.xip
28d352f8c14a43d9b8a082ac6338dc173cb153f964c6e8fb6ba389e5be528bd0 Xcode_12.2.xip
....
contrib/macdeploy/gen-sdk Xcode.app
Found Xcode (version: 12.2, build id: 12B45b)
Found MacOSX SDK (version: 11.0, build id: 20A2408)
Creating output .tar.gz file...
Adding MacOSX SDK 11.0 files...
Adding libc++ headers...
Done! Find the resulting gzipped tarball at:
Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
...
shasum -a 256 Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
a396dd24f61fb55a6d3ec98b8b58fc0b04cdb6b2695039869d04105f885e0867 Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
How did you extract the xip file? Via xip -x or apple-sdk-tools/extract_xcode.py ?
Edit: I tried unpacking the xip file using both methods. Although the extracted folders have different timestamps (xip -x keeps the timestamps from the archive, while the cpio method does not preserve them), the updated gen-sdk script produced the same hash for me.
Will investigate further ...
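The observation that differing on-disk timestamps do not change the hash is what one would expect if the archive writer normalises metadata before writing. The following is a rough sketch of that general technique, not the actual gen-sdk code: the names `normalise` and `deterministic_targz` are hypothetical, and directory entries and symlinked directories are left out for brevity.

```python
import gzip
import io
import os
import tarfile


def normalise(info):
    """Strip metadata that varies between machines (hypothetical filter)."""
    info.mtime = 0
    info.uid = info.gid = 0
    info.uname = info.gname = ""
    # Keep only the executable/non-executable distinction from the mode bits.
    info.mode = 0o755 if info.mode & 0o100 else 0o644
    return info


def deterministic_targz(src_dir):
    """Return the bytes of a .tar.gz of the regular files under `src_dir`."""
    raw = io.BytesIO()
    # mtime=0 keeps the current time out of the gzip header and filename=""
    # keeps any file name out of it.
    with gzip.GzipFile(filename="", mode="wb", fileobj=raw, mtime=0) as gz:
        with tarfile.open(fileobj=gz, mode="w", format=tarfile.GNU_FORMAT) as tar:
            for root, dirs, files in os.walk(src_dir):
                dirs.sort()  # sort in place so traversal order is fixed
                for name in sorted(files):
                    full = os.path.join(root, name)
                    arcname = os.path.relpath(full, src_dir)
                    tar.add(full, arcname=arcname, recursive=False, filter=normalise)
    return raw.getvalue()
```

With this approach two runs over the same file contents give identical bytes on the same Python and zlib build, regardless of the files' modification times. It does not by itself guarantee identical output across Python versions.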
It seems the default TAR format has changed from GNU to PAX in Python 3.8.
This is addressed by the change in 140f8e7831fc191f4981c4e681f501ef744a6129, which sets the GNU format explicitly.
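To illustrate why the default format matters, this small sketch serialises the same entry under both formats and compares the header bytes. It only uses the documented `tarfile.TarInfo.tobuf` call; the entry name `example.txt` is made up for the example.

```python
import tarfile

info = tarfile.TarInfo(name="example.txt")

# Serialise the same entry under both formats. The magic/version field at
# byte offsets 257..264 of the 512-byte header already differs between them.
gnu_header = info.tobuf(format=tarfile.GNU_FORMAT)
pax_header = info.tobuf(format=tarfile.PAX_FORMAT)

gnu_magic = gnu_header[257:265]
pax_magic = pax_header[257:265]

# tarfile.DEFAULT_FORMAT is what tarfile.open() uses when no format is given,
# so passing format=tarfile.GNU_FORMAT removes the dependency on that default.
print(tarfile.DEFAULT_FORMAT == tarfile.PAX_FORMAT, gnu_magic, pax_magic)
```

Because the header bytes differ, an archive written with whatever the default happens to be cannot hash the same across Python versions with different defaults.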
Argh, found another culprit - https://bugs.python.org/issue18819 (fixed via https://github.com/python/cpython/pull/18080)
Python 3.8 and older use the ASCII string 0000000\x00 for devmajor/devminor to indicate that the entry is not a device. Python 3.9 and newer use the binary string \x00\x00\x00\x00\x00\x00\x00\x00 (to match the original TAR behaviour).
This might be fixed by monkey-patching the tarfile.TarInfo._create_header function.
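To see which behaviour a given interpreter has, one can look at the devmajor/devminor fields of a header for a regular file. This is only a diagnostic sketch, not the monkey-patch itself; the offsets 329..336 and 337..344 are the standard ustar positions of the two fields, and the entry name is made up.

```python
import sys
import tarfile

# A plain regular-file entry, i.e. not a character or block device.
info = tarfile.TarInfo(name="example.txt")
header = info.tobuf(format=tarfile.GNU_FORMAT)

# devmajor and devminor are two 8-byte fields in the 512-byte header.
devmajor = header[329:337]
devminor = header[337:345]

# Shows either b'0000000\x00' or eight NUL bytes, depending on the
# Python version, as described above.
print(sys.version_info[:2], devmajor, devminor)
```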
I monkey-patched the Python bug in ba30a5407e065e9d6dd037351e83f56a43f38f19.
The new deterministic hash should be e7ca56bc8804d16624fad68be2e71647747d6629cacaaa3de5fbfa7f444e9eae.
I was able to reproduce this hash on:
apple-sdk-tools/extract_xcode.py
xip -x
Concept ACK.
Nice. I'm now getting a matching hash.
➜ SDK ./contrib/macdeploy/gen-sdk Xcode.app
Found Xcode (version: 12.2, build id: 12B45b)
Found MacOSX SDK (version: 11.0, build id: 20A2408)
Creating output .tar.gz file...
Adding MacOSX SDK 11.0 files...
Adding libc++ headers...
Done! Find the resulting gzipped tarball at:
Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
➜ SDK shasum -a 256 Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
e7ca56bc8804d16624fad68be2e71647747d6629cacaaa3de5fbfa7f444e9eae Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
Thanks. Switching from Draft to Ready for review.
Concept ACK. I think having a deterministic macOS SDK input is very useful, so that we can be sure to start from the same point based on hashes. I think we should at least get this in before the next SDK bump.
Concept ACK, would love to see this be deterministic but I have a hash mismatch with the resulting archive file.
I get the same hash for the downloaded xip file:
$ sha256sum ./Xcode_12.2.xip
28d352f8c14a43d9b8a082ac6338dc173cb153f964c6e8fb6ba389e5be528bd0
But, I get a different hash for the resulting archive file:
$ ./contrib/macdeploy/gen-sdk ./Xcode.app
Found Xcode (version: 12.2, build id: 12B45b)
Found MacOSX SDK (version: 11.0, build id: 20A2408)
Creating output .tar.gz file...
Adding MacOSX SDK 11.0 files...
Adding libc++ headers...
Done! Find the resulting gzipped tarball at:
/home/xyz/Code/Bitcoin/review/bitcoin-24534/bitcoin/Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
$ sha256sum ./Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
501625bd401d7f228b3bae18f264fd3739da5788464f0897e5c455451d2128c2 ./Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
OS version and Python version? Also what filesystem are you using?
Tested ACK ba30a5407e065e9d6dd037351e83f56a43f38f19
Output matches the hash in the OP (tried on Ubuntu 22.04, x86_64, Python 3.10.4):
$ sha256sum ~/Downloads/Xcode_12.2.xip
28d352f8c14a43d9b8a082ac6338dc173cb153f964c6e8fb6ba389e5be528bd0 /home/orion/Downloads/Xcode_12.2.xip
$ …/apple-sdk-tools/extract_xcode.py -f ~/Downloads/Xcode_12.2.xip | cpio -d -i
$ …/bitcoin/contrib/macdeploy/gen-sdk $PWD/Xcode.app/
Found Xcode (version: 12.2, build id: 12B45b)
Found MacOSX SDK (version: 11.0, build id: 20A2408)
Creating output .tar.gz file...
Adding MacOSX SDK 11.0 files...
Adding libc++ headers...
Done! Find the resulting gzipped tarball at:
…/apple-sdk-tools/Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
$ sha256sum Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
e7ca56bc8804d16624fad68be2e71647747d6629cacaaa3de5fbfa7f444e9eae Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
Tested ACK https://github.com/bitcoin/bitcoin/commit/ba30a5407e065e9d6dd037351e83f56a43f38f19
I don't know what happened in my original run; it's possible that I accidentally hadn't switched to the PR branch. I re-ran everything on my setup and now I get a matching hash :)
sha256sum ./Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz
e7ca56bc8804d16624fad68be2e71647747d6629cacaaa3de5fbfa7f444e9eae ./Xcode-12.2-12B45b-extracted-SDK-with-libcxx-headers.tar.gz