Solves one item in #1392.
This PR also has a few tweaks to the Dockerfile, see individual commits.
I’ll follow up soon with a PR for ARM64/gcc. This will rely on Cirrus CI.
https://github.com/bitcoin-core/secp256k1/actions/runs/6027879043/job/16355370189:
Run cat tests.log || true
MemorySanitizer: CHECK failed: sanitizer_allocator_primary64.h:133 "((kSpaceBeg)) == ((address_range.Init(TotalSpaceSize, PrimaryAllocatorName, kSpaceBeg)))" (0xe00000000000, 0xfffffffffffffff4) (tid=3244)
<empty stack>

FAIL tests (exit status: 1)
+ # Determine the version number of the LLVM development branch
+ LLVM_VERSION=$(apt-cache search --names-only '^clang-[0-9]+$' | sort | tail -1 | cut -f1 -d" " | cut -f2 -d"-" ) && \
+ # Install packages
+ apt-get install --no-install-recommends -y "clang-${LLVM_VERSION}" "libclang-rt-${LLVM_VERSION}-dev:arm64" && \
+ # Assert that we have exactly two clang versions now
+ ls /usr/bin/clang* && [[ $(ls /usr/bin/clang-?? | sort | wc -l) -eq "2" ]] && \
`tail -1` and `head -1`. That is not the case anymore.
https://github.com/bitcoin-core/secp256k1/actions/runs/6027879043/job/16355369997:
/usr/bin/aarch64-linux-gnu-ld: cannot find /usr/lib/llvm-14/lib/clang/14.0.6/lib/linux/libclang_rt.msan-aarch64.a: No such file or directory
Should be provided by the `libclang-rt-14-dev:arm64` package.
bitcoin-core/secp256k1/actions/runs/6027879043/job/16355370189:
Run cat tests.log || true
MemorySanitizer: CHECK failed: sanitizer_allocator_primary64.h:133 "((kSpaceBeg)) == ((address_range.Init(TotalSpaceSize, PrimaryAllocatorName, kSpaceBeg)))" (0xe00000000000, 0xfffffffffffffff4) (tid=3244)
<empty stack>

FAIL tests (exit status: 1)
That was quite a rabbit hole. I reported this at https://github.com/llvm/llvm-project/issues/65144. There’s not much we could try except compiling compiler-rt on our own, but this smells a bit like overkill. Let’s see if we get a response.
Edit, some more notes on this in case this is helpful in the future: Running MSan (and perhaps other sanitizers) on qemu-user seems to be a bit fragile in general. MSan makes specific assumptions about the virtual memory layout, and qemu-user’s Linux emulation may not be perfect here, in particular on AArch64, where many different virtual address space sizes are possible.
On my local machine, I get even more errors because qemu-user places the code at a location that doesn’t fit MSan’s assumptions. (Perhaps this is because I have qemu 8.1.0 and in Docker we have 7.2.4.) This can be mitigated by specifying a base address offset (`-B 0x700000000000`), but this just gives me the original assertion back.
__msan_init 0x7ffff55e59e4
app-10-13: 0 - fffffffffff
shadow-14: 100000000000 - 1fffffffffff
invalid: 200000000000 - 2fffffffffff
origin-14: 300000000000 - 3fffffffffff
shadow-15: 400000000000 - 5fffffffffff
origin-15: 600000000000 - 7fffffffffff
invalid: 800000000000 - 9fffffffffff
app-14: a00000000000 - afffffffffff
shadow-10-13: b00000000000 - bfffffffffff
invalid: c00000000000 - cfffffffffff
origin-10-13: d00000000000 - dfffffffffff
app-15: e00000000000 - ffffffffffff
FATAL: Code 0x7ffff55e59e4 is out of application range. Non-PIE build?
FATAL: MemorySanitizer can not mmap the shadow memory.
FATAL: Make sure to compile with -fPIE and to link with -pie.
FATAL: Disabling ASLR is known to cause this error.
FATAL: If running under GDB, try 'set disable-randomization off'.
==675507==Process memory map follows:
0x400000000000-0x400000001000
0x400000001000-0x400000801000 [stack]
0x400000801000-0x400000827000 /usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1
0x400000827000-0x40000083f000
0x40000083f000-0x400000841000 /usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1
0x400000841000-0x400000843000 /usr/aarch64-linux-gnu/lib/ld-linux-aarch64.so.1
0x400000843000-0x400000844000
0x400000844000-0x400000846000
0x400000850000-0x400000965000 /home/tim/bs/dev/secp256k1/.libs/libsecp256k1.so.2.0.3
0x400000965000-0x400000974000 /home/tim/bs/dev/secp256k1/.libs/libsecp256k1.so.2.0.3
0x400000974000-0x4000009c0000 /home/tim/bs/dev/secp256k1/.libs/libsecp256k1.so.2.0.3
0x4000009c0000-0x4000009cf000 /home/tim/bs/dev/secp256k1/.libs/libsecp256k1.so.2.0.3
0x4000009cf000-0x4000009d0000 /home/tim/bs/dev/secp256k1/.libs/libsecp256k1.so.2.0.3
0x4000009d0000-0x4000009df000 /home/tim/bs/dev/secp256k1/.libs/libsecp256k1.so.2.0.3
0x4000009df000-0x4000009e0000 /home/tim/bs/dev/secp256k1/.libs/libsecp256k1.so.2.0.3
0x4000009f0000-0x400000a70000 /usr/aarch64-linux-gnu/lib/libm.so.6
0x400000a70000-0x400000a7f000 /usr/aarch64-linux-gnu/lib/libm.so.6
0x400000a7f000-0x400000a80000 /usr/aarch64-linux-gnu/lib/libm.so.6
0x400000a80000-0x400000a81000 /usr/aarch64-linux-gnu/lib/libm.so.6
0x400000a90000-0x400000a9d000 /usr/aarch64-linux-gnu/lib/libresolv.so.2
0x400000a9d000-0x400000aaf000 /usr/aarch64-linux-gnu/lib/libresolv.so.2
0x400000aaf000-0x400000ab0000 /usr/aarch64-linux-gnu/lib/libresolv.so.2
0x400000ab0000-0x400000ab1000 /usr/aarch64-linux-gnu/lib/libresolv.so.2
0x400000ab1000-0x400000ab3000
0x400000ac0000-0x400000ad8000 /usr/aarch64-linux-gnu/lib64/libgcc_s.so.1
0x400000ad8000-0x400000aef000 /usr/aarch64-linux-gnu/lib64/libgcc_s.so.1
0x400000aef000-0x400000af0000 /usr/aarch64-linux-gnu/lib64/libgcc_s.so.1
0x400000af0000-0x400000af1000 /usr/aarch64-linux-gnu/lib64/libgcc_s.so.1
0x400000b00000-0x400000c7e000 /usr/aarch64-linux-gnu/lib/libc.so.6
0x400000c7e000-0x400000c8d000 /usr/aarch64-linux-gnu/lib/libc.so.6
0x400000c8d000-0x400000cf0000 /usr/aarch64-linux-gnu/lib/libc.so.6
0x400000cf0000-0x400000cf2000 /usr/aarch64-linux-gnu/lib/libc.so.6
0x400000cf2000-0x400000cfe000
0x400000d09000-0x400000d0e000
0x400000d12000-0x4000010a8000
0x400001100000-0x400001200000
0x4000012a8000-0x4000012a9000
0x400001300000-0x400001400000
0x400001500000-0x400001600000
0x400001700000-0x400001800000
0x4000018a9000-0x4000018be000
0x4000018bf000-0x4000018c1000
0x4000018c2000-0x4000018c4000
0x7ffff5567000-0x7ffff55b0000 /[REDACTED]/secp256k1/.libs/ctime_tests
0x7ffff55b0000-0x7ffff55bf000
0x7ffff55bf000-0x7ffff5647000 /[REDACTED]/secp256k1/.libs/ctime_tests
0x7ffff5647000-0x7ffff5656000
0x7ffff5656000-0x7ffff5658000 /[REDACTED]/secp256k1/.libs/ctime_tests
0x7ffff5658000-0x7ffff5667000
0x7ffff5667000-0x7ffff566b000 /[REDACTED]/secp256k1/.libs/ctime_tests
0x7ffff566b000-0x7ffff6fff000
==675507==End of process memory map.
Judging from the tables in the MSan source, it looks more like an x86_64 layout… :shrug:
bitcoin-core/secp256k1/actions/runs/6027879043/job/16355369997:
/usr/bin/aarch64-linux-gnu-ld: cannot find /usr/lib/llvm-14/lib/clang/14.0.6/lib/linux/libclang_rt.msan-aarch64.a: No such file or directory
Should be provided by the `libclang-rt-14-dev:arm64` package.
Yes, but it turns out that this package conflicts with `libclang-rt-14-dev` (the amd64 version), so at most one of the two can be installed at a time. I guess the simplest workaround is to install this one package only in the CI run, and not already in the Docker image.
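A rough sketch of what "install only in the CI run" could look like. The `HOST` variable and the helper name `extra_packages_for_host` are hypothetical and only serve the illustration; the package name and `apt-get` flags are the ones already used in this PR. The snippet only prints the command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: map the job's target triple to the extra package that
# cannot live in the Docker image because it conflicts with its amd64 twin.
extra_packages_for_host() {
    case $1 in
        aarch64-linux-gnu) echo "libclang-rt-14-dev:arm64" ;;
        *) echo "" ;;
    esac
}

# HOST is assumed to be set by the CI job definition (an assumption).
pkgs=$(extra_packages_for_host "${HOST:-}")
if [ -n "$pkgs" ]; then
    # Dry run: show the command a CI step would execute.
    echo "would run: apt-get install --no-install-recommends -y $pkgs"
fi
```

The downside of this approach is that the package gets downloaded on every CI run instead of being cached in the image.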
The idea of this PR was to run at least the clang ARM jobs on GHA, and then only the gcc ARM jobs on Cirrus. This would save some minutes on Cirrus.
But I’m not sure. Perhaps a better split is to run more on Cirrus CI.
I think the second option doesn’t sound too bad. I could rework this PR to drop MSan. This would resolve all the “ugly” issues mentioned above. And if it turns out that free minutes on Cirrus are not enough, we can still easily move to the Bitcoin Core runner. What do you think?
Looks good to me.
Ok, did this. Ready for review.
For the follow-up PR that uses Cirrus: I forgot that the free budget is shared between all repos in the bitcoin-core org. I guess it will be less hassle to use the persistent runner. (For example, the HWI repo hasn’t converted yet, so we’d compete with them for the free runners etc.) So I’ll probably do this.
- # Create symlinks for them
- ln -s $(ls /usr/bin/clang-?? | sort | tail -1) /usr/bin/clang-snapshot && \
- ln -s $(ls /usr/bin/clang-?? | sort | head -1) /usr/bin/clang
+ apt-get update && \
+ # Determine the version number of the LLVM development branch
+ LLVM_VERSION=$(apt-cache search --names-only '^clang-[0-9]+$' | sort | tail -1 | cut -f1 -d" " | cut -f2 -d"-" ) && \
e1ce5667749445c025a3225ff487b31e38195653
This line works only accidentally because the current stable Debian has no one-digit versions of clang.
Perhaps,
LLVM_VERSION=$(apt-cache search --names-only '^clang-[0-9]+$' | sort -V | tail -1 | cut -f1 -d" " | cut -f2 -d"-" ) && \
?
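To illustrate the concern, here is a small sketch comparing plain `sort` with `sort -V` on a made-up package list. The names in `packages` are hypothetical input (they are not what `apt-cache` currently returns), and the sketch assumes GNU coreutils `sort`, which provides `-V`.

```shell
#!/bin/sh
# Hypothetical package names, as they could look if a one-digit clang
# version were available next to two-digit ones.
packages='clang-9
clang-14
clang-18'

# Plain sort compares character by character, so "clang-9" ends up last.
latest_lexical=$(printf '%s\n' "$packages" | LC_ALL=C sort | tail -1 | cut -f2 -d"-")

# Version sort compares the numeric parts as numbers.
latest_version=$(printf '%s\n' "$packages" | sort -V | tail -1 | cut -f2 -d"-")

echo "sort:    $latest_lexical"
echo "sort -V: $latest_version"
```

With this input, plain `sort` picks 9 while `sort -V` picks 18.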
+# - sanitizers hanging: https://github.com/google/sanitizers/issues/1662
+# - valgrind crashing: https://stackoverflow.com/a/75293014
+# This is not a problem on our CI hosts, but developers who run the image
+# on their machines may run into this (e.g., on Arch Linux), so warn them.
+# (Note that .bashrc is only executed in interactive bash shells.)
+RUN echo 'if [[ $(ulimit -n) -gt 200000 ]]; then echo "WARNING: Very high value reported by \"ulimit -n\". Consider passing \"--ulimit nofile=32768\" to \"docker run\"."; fi' >> /root/.bashrc
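For readers who want to try the check outside the image, the one-liner can be unrolled into a small function. This is only a sketch: the function name `nofile_warning` is hypothetical, and the extra branch for `unlimited` is an addition that is not part of the line above.

```shell
#!/bin/sh
# Hypothetical helper: print the warning if the given "ulimit -n" value is
# very high. Taking the value as an argument makes the check easy to try out.
nofile_warning() {
    case $1 in
        unlimited) ;;                           # addition: treat as very high
        *) [ "$1" -gt 200000 ] || return 0 ;;   # same threshold as above
    esac
    echo 'WARNING: Very high value reported by "ulimit -n". Consider passing "--ulimit nofile=32768" to "docker run".'
}

# Check the limit of the current shell.
nofile_warning "$(ulimit -n)"
```

Calling `nofile_warning 1024` prints nothing, whereas a value above the threshold prints the warning.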
eff42a02c22dcce1a097ab6f122f0fbe5189538a
The downloadable size of the added Docker layer is 595 bytes, which is OK.
I’ve compared the downloadable Docker layer sizes:
sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 32B
sha256:dca4ea09548975c9d3eb83c5d6d12b942b499a4d2bd7421366172fe45c8d8788 201B
sha256:3a95c245563807a61f1929525812d0cd7a97e1af9a29bd1e340f782d0ac030d5 29.12MB
sha256:8c5c787d0228d1c83d0dd6bb53ae4dbd78fca356d6ded7231feb2ec5e5874f8d 106.06MB
sha256:1ecdca4ed1cd6f5998d1ecc18886973818b254fce924bb48abbf1478a65c7fc7 491.72MB
sha256:0ff9fabc2d015c9b9ed4a6693d02d19d1b11ce9edf9cf86e85af639eeefc09b2 1.43GB

sha256:4f4fb700ef54461cfa02571ae0db9a0dc1e0cdb5577484a6d75e68dc38e8acc1 32B
sha256:e14a3809ad13fbf2d8a6baa3412d21b97038009f65e7fcc6f7a43e6c0a2ed629 201B
sha256:dd221a06e6ff8bfca340a07201c9c1591bfbd04f878c5d87de77d5fd65563442 595B
sha256:3a95c245563807a61f1929525812d0cd7a97e1af9a29bd1e340f782d0ac030d5 29.12MB
sha256:b92eb44f78d023d7c830186e49b783965f4c2c8048c4b9007cfb5e0610f41eb4 70.68MB
sha256:146e8ec787b6e508eb96dbf8e9659c9ed52ccfcf1763a0f6af3684ba9656f125 492.50MB
sha256:bec59d901529dae853a627ccff960bbe3a2cf9e6a35c1776af625f9d98c3374a 1.43GB
It looks like there’s a decrease of approximately 35MB.
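The estimate can be reproduced from the layers whose sizes differ between the two listings. This is a rough sketch that assumes the first listing is the image before this change and the second is the image after it (the listings are not labelled; the 595-byte layer appears only in the second). The two 1.43GB layers have different digests, but their sizes are rounded too coarsely to contribute to the calculation.

```shell
#!/bin/sh
# Sum the layer sizes (in MB) that differ between the two listings.
# Layers with identical sizes in both listings cancel out and are omitted.
decrease=$(awk 'BEGIN {
    before = 106.06 + 491.72                  # first listing
    after  = 595 / 1000000 + 70.68 + 492.50   # second listing, 595B in MB
    printf "%.2f", before - after
}')
echo "decrease: ${decrease} MB"
```

This gives a decrease of about 34.6 MB, which matches the "approximately 35MB" above.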
This commit switches to a new strategy to make sure we're installing the
most recent LLVM packages. Before this commit, we used the unversioned
LLVM packages (e.g., `clang` instead of `clang-18`), which are supposed
to provide the latest snapshot, but this is broken for arm64 [1],
which we want to add in a later PR.
Anyway, the new approach is cleaner because it does not require us to
fiddle with the installed `clang` package by removing a symlink.
[1] https://github.com/llvm/llvm-project/issues/64790
Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
The underlying issue does not affect our CI hosts, but is an issue on my
development machine (Arch Linux). In particular, this affects the vanilla
configuration of Docker on systemd, which has effectively no limit:
https://github.com/docker/packaging/blob/11400a3f5a20f2e3eecc3e6347a2ad9ce41278c7/pkg/docker-engine/common/systemd/docker.service#L31
I hope this saves future generations some precious hours of their life.
- No need to have wget installed
- Clean up rm -rf /var/lib/apt/lists/, see
https://docs.docker.com/develop/develop-images/dockerfile_best-practices/#apt-get
It looks like there’s a decrease of approximately 35MB.
Ok, thanks for checking. Not sure if 35 MB was worth it, but I think removing the build deps after each stage is just cleaner.