valgrind currently does not work on GCC compiled executables, due to an upstream bug. https://bugs.kde.org/show_bug.cgi?id=472329
So temporarily switch to clang, so that a long term solution can be figured out in the meantime.
valgrind currently does not work on GCC compiled executables, due to an upstream bug. https://bugs.kde.org/show_bug.cgi?id=472329
So temporarily switch to clang, so that a long term solution can be figured out in the meantime.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
See the guideline for information on the review process.
| Type | Reviewers |
|---|---|
| ACK | l0rinc |
If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
16@@ -17,5 +17,6 @@ export FUZZ_TESTS_CONFIG="--valgrind"
17 export GOAL="all"
18 export BITCOIN_CONFIG="\
19 -DBUILD_FOR_FUZZING=ON \
20- -DCMAKE_CXX_FLAGS='-Wno-error=array-bounds' \
21+ -DCMAKE_C_COMPILER=clang \
22+ -DCMAKE_CXX_COMPILER=clang++ \
no-error=array-bounds related to the mentioned bug?
5@@ -6,9 +6,9 @@
6
7 export LC_ALL=C.UTF-8
8
9-export CI_IMAGE_NAME_TAG="mirror.gcr.io/debian:trixie"
10+export CI_IMAGE_NAME_TAG="mirror.gcr.io/ubuntu:26.04"
container image version
it’s not strictly just the version, is this part of a unification attempts?
26.04 version? In other cases we’ve used the24.04 LTS.
https://releases.ubuntu.com/26.04-snapshot1 seems empty.
The build is passing (though I don’t see the valgrind tasks referenced by the current GH Actions matrix anymore), not sure what I’m missing :/
Added to the commit msg:
26.04 tag - is that a custom snapshot that we want to use temporarily? We don’t usually depend on unreleased OS, I’m confused why it’s necessary here.
11@@ -12,13 +12,28 @@
12 # --error-limit=no build/bin/test_bitcoin
13 #
14 # Note that suppressions may depend on OS and/or library versions.
15-# Tested on aarch64 and x86_64 with Ubuntu Noble system libs, using clang-16
16-# and GCC, without gui.
17+# Tested on x86_64 with Ubuntu 26.04 system libs, using clang, without gui.
18+{
Reverting them will make the CI fail.
I think you can also test locally by compiling with clang-21 and then running:
./bld-cmake/test/functional/interface_bitcoin_cli.py --valgrind
I tried this on an RPI5 with Clang version 22.0.0 (aarch64) for fa173ec59a9dcc2e7956366344d68999d1d1249e.
It seems to trigger similar valgrind false positives on aarch64 - the updated suppression doesn’t seem to cover it, see:
02026-02-14T19:42:42.017831Z TestFramework (ERROR): Unexpected exception
1Traceback (most recent call last):
2 File "/mnt/my_storage/bitcoin/test/functional/test_framework/test_framework.py", line 138, in main
3 self.setup()
4 ~~~~~~~~~~^^
5 File "/mnt/my_storage/bitcoin/test/functional/test_framework/test_framework.py", line 272, in setup
6 self.setup_network()
7 ~~~~~~~~~~~~~~~~~~^^
8 File "/mnt/my_storage/bitcoin/test/functional/test_framework/test_framework.py", line 363, in setup_network
9 self.setup_nodes()
10 ~~~~~~~~~~~~~~~~^^
11 File "/mnt/my_storage/bitcoin/test/functional/test_framework/test_framework.py", line 385, in setup_nodes
12 self.start_nodes()
13 ~~~~~~~~~~~~~~~~^^
14 File "/mnt/my_storage/bitcoin/test/functional/test_framework/test_framework.py", line 531, in start_nodes
15 node.wait_for_rpc_connection()
16 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^
17 File "/mnt/my_storage/bitcoin/test/functional/test_framework/test_node.py", line 316, in wait_for_rpc_connection
18 raise FailedToStartError(self._node_msg(
19 f'bitcoind exited with status {self.process.returncode} during initialization. {str_error}'))
20test_framework.test_node.FailedToStartError: [node 0] bitcoind exited with status 1 during initialization. ==68399== Use of uninitialised value of size 8
21==68399== at 0x22AD68: _M_is_engaged (optional:563)
22==68399== by 0x22AD68: operator=<node::BlockfileCursor> (optional:1036)
23==68399== by 0x22AD68: node::BlockManager::LoadBlockIndexDB(std::optional<uint256> const&) (???:565)
24==68399== by 0x46C717: ChainstateManager::LoadBlockIndex() (../src/validation.cpp:4939)
25==68399== by 0x241A8F: node::CompleteChainstateInitialization(ChainstateManager&, node::ChainstateLoadOptions const&) (../src/node/chainstate.cpp:43)
26==68399== by 0x24142B: node::LoadChainstate(ChainstateManager&, kernel::CacheSizes const&, node::ChainstateLoadOptions const&) (../src/node/chainstate.cpp:184)
27==68399== by 0x1950C7: operator() (../src/init.cpp:1392)
28==68399== by 0x1950C7: operator()<(lambda at ../src/init.cpp:1392:45)> (???:1386)
29==68399== by 0x1950C7: InitAndLoadChainstate(node::NodeContext&, bool, bool, kernel::CacheSizes const&, ArgsManager const&) (???:1392)
30==68399== by 0x18FE6B: AppInitMain(node::NodeContext&, interfaces::BlockAndHeaderTipInfo*) (../src/init.cpp:1809)
31==68399== by 0x17BA97: AppInit (../src/bitcoind.cpp:242)
32==68399== by 0x17BA97: main (???:283)
33==68399==
34{
35 <insert_a_suppression_name_here>
36 Memcheck:Value8
37 fun:_M_is_engaged
38 fun:operator=<node::BlockfileCursor>
39 fun:_ZN4node12BlockManager16LoadBlockIndexDBERKSt8optionalI7uint256E
40 fun:_ZN17ChainstateManager14LoadBlockIndexEv
41 fun:_ZN4nodeL32CompleteChainstateInitializationER17ChainstateManagerRKNS_21ChainstateLoadOptionsE
42 fun:_ZN4node14LoadChainstateER17ChainstateManagerRKN6kernel10CacheSizesERKNS_21ChainstateLoadOptionsE
43 fun:operator()
44 fun:operator()<(lambda at ../src/init.cpp:1392:45)>
45 fun:_ZL21InitAndLoadChainstateRN4node11NodeContextEbbRKN6kernel10CacheSizesERK11ArgsManager
46 fun:_Z11AppInitMainRN4node11NodeContextEPN10interfaces21BlockAndHeaderTipInfoE
47 fun:AppInit
48 fun:main
49}
50==68399==
51==68399== Exit program on first error (--exit-on-first-error=yes)
52************************
concept ACK
Could you please add a link to the bug for more context and split the change to commits that explain all the different changes. And it’s “temporary”, can we add a todo to the code to make it official :)?
And it’s “temporary”, can we add a todo to the code to make it official :)?
It is not clear what to do, so a tracking issue seems more appropriate than a vague todo.
Could you please add a link to the bug for more context and split the change to commits that explain all the different changes.
The upstream bugs are https://bugs.kde.org/show_bug.cgi?id=485276 and https://bugs.kde.org/show_bug.cgi?id=472329
This allows to run the test under valgrind:
./bld-cmake/test/functional/feature_dbcrash.py --timeout-factor=10 --valgrind
For testing, the same test can be run multiple times in parallel:
./bld-cmake/test/functional/test_runner.py -j 10 $( printf 'feature_dbcrash.py %.0s' {1..10} ) --timeout-factor=10 --valgrind
(Running the test under valgrind may take several hours!)
I found that before this commit, 9 out of the 10 runs failed via:
```
...
TestFramework (INFO): Iteration 36, generating 2500 transactions [11, 5, 6]
TestFramework (ERROR): Unexpected exception
Traceback (most recent call last):
File "/b-c/test/functional/test_framework/test_framework.py", line 142, in main
self.run_test()
~~~~~~~~~~~~~^^
File "/b-c/bld-cmake/test/functional/feature_dbcrash.py", line 262, in run_test
self.sync_node3blocks(block_hashes)
~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^
File "/b-c/bld-cmake/test/functional/feature_dbcrash.py", line 151, in sync_node3blocks
nodei_utxo_hash = self.restart_node(i, block_hash)
File "/b-c/bld-cmake/test/functional/feature_dbcrash.py", line 102, in restart_node
raise AssertionError(f"Unable to successfully restart node {node_index} in allotted time")
AssertionError: Unable to successfully restart node 0 in allotted time
```
With this commit, all 10 runs passed.
valgrind currently does not work on GCC -O2 compiled executables, which
contain std::optional use, due to an upstream bug. See
https://bugs.kde.org/show_bug.cgi?id=472329
One workaround could be to use -O1. However, that seems brittle, as
variantions of the bug were seen with -O1 as well.
So temporarily use clang in the valgrind CI tasks, because this also
allows to drop a false-positive suppression for:
-DCMAKE_CXX_FLAGS='-Wno-error=array-bounds'
Also, bump the container CI_IMAGE_NAME_TAG, while touching the config.
This bumps the valgrind version to be more recent:
https://packages.ubuntu.com/resolute/valgrind vs
https://packages.debian.org/trixie/valgrind
Also, update the comment in contrib/valgrind.supp to remove GCC and
aarch64:
* GCC wasn't tested with the suppressions file, due to the mentioned bug
* Clang-17 (or later) on aarch64 wasn't tested due to bug
https://github.com/bitcoin/bitcoin/issues/29635 and the minimum
supported clang version is clang-17 right now.
This means the only tested config right now is the one mentioned in the
suppression file:
"Tested on x86_64 with Debian Trixie system libs, using clang, without gui."
11@@ -12,8 +12,7 @@
12 # --error-limit=no build/bin/test_bitcoin
13 #
14 # Note that suppressions may depend on OS and/or library versions.
15-# Tested on aarch64 and x86_64 with Ubuntu Noble system libs, using clang-16
16-# and GCC, without gui.
17+# Tested on x86_64 with Debian Trixie system libs, using clang, without gui.
A build with system libs (or with a normal depends build) will fail
with:
```sh
$ valgrind --exit-on-first-error=yes --error-exitcode=1 --quiet ./bld-cmake/bin/test_bitcoin-qt
Detected locale "C" with character encoding "ANSI_X3.4-1968", which is not UTF-8.
Qt depends on a UTF-8 locale, and has switched to "C.UTF-8" instead.
If this causes problems, reconfigure your locale. See the locale(1) manual
for more information.
********* Start testing of AppTests *********
Config: Using QtTest library 6.10.2, Qt 6.10.2 (x86_64-little_endian-lp64 shared (dynamic) release build; by GCC 15.2.0), ubuntu 26.04
PASS : AppTests::initTestCase()
QINFO : AppTests::appTests() Backing up GUI settings to "/tmp/test_common bitcoin/60d474ffae390f81657d/regtest/guisettings.ini.bak"
==18007== Conditional jump or move depends on uninitialised value(s)
==18007== at 0x12655E26: ???
==18007== by 0xCB28E7F: ???
==18007==
==18007==
==18007== Exit program on first error (--exit-on-first-error=yes)
```
A DEBUG=1 depends build would work, but that seems tedious for
questionable benefit.
16@@ -17,7 +17,9 @@ export TEST_RUNNER_EXTRA="--exclude rpc_bind --exclude feature_bind_extra"
17 export GOAL="install"
18 # TODO enable GUI
I wonder why this isn’t enabled. Let me try to enable it …
Edit: Running with system libs gives:
0$ valgrind --gen-suppressions=all --exit-on-first-error=yes --error-exitcode=1 --quiet ./bld-cmake/bin/test_bitcoin-qt
1
2
3...
4==906896== Conditional jump or move depends on uninitialised value(s)
5==906896== at 0x243F0E26: ???
6==906896== by 0x1F08261F: ???
7==906896==
8{
9 <insert_a_suppression_name_here>
10 Memcheck:Cond
11 obj:*
12 obj:*
13}
14==906896==
15==906896== Exit program on first error (--exit-on-first-error=yes)
Only seems to pass with DEBUG=1 depends.
For reference, the following command did pass for me:
0( cd depends && make DEBUG=1 -j $(nproc) ) && cmake --fresh -B ./bld-cmake --toolchain depends/*/toolchain.cmake -DAPPEND_CXXFLAGS='-O3 -g2' -DAPPEND_CFLAGS='-O3 -g2' -DCMAKE_BUILD_TYPE=Debug -DCMAKE_EXE_LINKER_FLAGS=-fuse-ld=mold -DCMAKE_C_COMPILER='clang' -DCMAKE_CXX_COMPILER='clang++' --preset=dev-mode -DCMAKE_EXPORT_COMPILE_COMMANDS=ON && cmake --build ./bld-cmake --parallel $(nproc) && valgrind --exit-on-first-error=yes --error-exitcode=1 --quiet ./bld-cmake/bin/test_bitcoin-qt
Pushed a new commit to remove the TODO for now.
I’ve also tried:
0diff --git a/ci/test/00_setup_env_native_msan.sh b/ci/test/00_setup_env_native_msan.sh
1index 4a17fb4..0e1697b 100755
2--- a/ci/test/00_setup_env_native_msan.sh
3+++ b/ci/test/00_setup_env_native_msan.sh
4@@ -16,14 +16,13 @@ export MSAN_AND_LIBCXX_FLAGS="${MSAN_FLAGS} ${LIBCXX_FLAGS}"
5 export CONTAINER_NAME="ci_native_msan"
6 export PACKAGES="clang-${APT_LLVM_V} llvm-${APT_LLVM_V} llvm-${APT_LLVM_V}-dev libclang-${APT_LLVM_V}-dev libclang-rt-${APT_LLVM_V}-dev python3-pip"
7 export PIP_PACKAGES="--break-system-packages pycapnp"
8-export DEP_OPTS="DEBUG=1 NO_QT=1 CC=clang CXX=clang++ CFLAGS='${MSAN_FLAGS}' CXXFLAGS='${MSAN_AND_LIBCXX_FLAGS}'"
9+export DEP_OPTS="DEBUG=1 CC=clang CXX=clang++ CFLAGS='${MSAN_FLAGS}' CXXFLAGS='${MSAN_AND_LIBCXX_FLAGS}'"
10 export GOAL="install"
11 export CI_LIMIT_STACK_SIZE=1
12 # Setting CMAKE_{C,CXX}_FLAGS_DEBUG flags to an empty string ensures that the flags set in MSAN_FLAGS remain unaltered.
13 # _FORTIFY_SOURCE is not compatible with MSAN.
14 export BITCOIN_CONFIG="\
15 --preset=dev-mode \
16- -DBUILD_GUI=OFF \
17 -DCMAKE_BUILD_TYPE=Debug \
18 -DCMAKE_C_FLAGS_DEBUG='' \
19 -DCMAKE_CXX_FLAGS_DEBUG='' \
Which prints:
0
1All tests passed.
2
3~InitExecutor : Stopping thread
4~InitExecutor : Stopped thread
5==87986==WARNING: MemorySanitizer: use-of-uninitialized-value
6 [#0](/bitcoin-bitcoin/0/) 0x56145624af4f in QLibraryStore::cleanup() /usr/qtbase/src/corelib/plugin/qlibrary.cpp:364:26
7 [#1](/bitcoin-bitcoin/1/) 0x56145623d05c in qlibraryCleanup() /usr/qtbase/src/corelib/plugin/qlibrary.cpp:378:5
8 [#2](/bitcoin-bitcoin/2/) 0x56145623d05c in (anonymous namespace)::qlibraryCleanup_dtor_class_::~qlibraryCleanup_dtor_class_() /usr/qtbase/src/corelib/plugin/qlibrary.cpp:380:23
9 [#3](/bitcoin-bitcoin/3/) 0x561450f4884f in MSanCxaAtExitWrapper(void*) (/ci_container_base/ci/scratch/build-x86_64-pc-linux-gnu/bin/test_bitcoin-qt+0x2df84f) (BuildId: 0a3f6add7bb2dd2a7bfc097a6edf88a12aebb62f)
10 [#4](/bitcoin-bitcoin/4/) 0x7fb04940da75 (/lib/x86_64-linux-gnu/libc.so.6+0x47a75) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
11 [#5](/bitcoin-bitcoin/5/) 0x7fb04940dbbd in exit (/lib/x86_64-linux-gnu/libc.so.6+0x47bbd) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
12 [#6](/bitcoin-bitcoin/6/) 0x7fb0493f01d0 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1d0) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
13 [#7](/bitcoin-bitcoin/7/) 0x7fb0493f028a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
14 [#8](/bitcoin-bitcoin/8/) 0x561450f16fb4 in _start (/ci_container_base/ci/scratch/build-x86_64-pc-linux-gnu/bin/test_bitcoin-qt+0x2adfb4) (BuildId: 0a3f6add7bb2dd2a7bfc097a6edf88a12aebb62f)
15
16 Member fields were destroyed
17 [#0](/bitcoin-bitcoin/0/) 0x561450f47de1 in __sanitizer_dtor_callback_fields (/ci_container_base/ci/scratch/build-x86_64-pc-linux-gnu/bin/test_bitcoin-qt+0x2dede1) (BuildId: 0a3f6add7bb2dd2a7bfc097a6edf88a12aebb62f)
18 [#1](/bitcoin-bitcoin/1/) 0x561455b213ae in QLoggingCategory::~QLoggingCategory() /usr/qtbase/src/corelib/io/qloggingcategory.h:41:32
19 [#2](/bitcoin-bitcoin/2/) 0x561455b213ae in QLoggingCategory::~QLoggingCategory() /usr/qtbase/src/corelib/io/qloggingcategory.cpp:194:1
20 [#3](/bitcoin-bitcoin/3/) 0x561450f4884f in MSanCxaAtExitWrapper(void*) (/ci_container_base/ci/scratch/build-x86_64-pc-linux-gnu/bin/test_bitcoin-qt+0x2df84f) (BuildId: 0a3f6add7bb2dd2a7bfc097a6edf88a12aebb62f)
21 [#4](/bitcoin-bitcoin/4/) 0x7fb04940dbbd in exit (/lib/x86_64-linux-gnu/libc.so.6+0x47bbd) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
22 [#5](/bitcoin-bitcoin/5/) 0x7fb0493f01d0 (/lib/x86_64-linux-gnu/libc.so.6+0x2a1d0) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
23 [#6](/bitcoin-bitcoin/6/) 0x7fb0493f028a in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2a28a) (BuildId: 8e9fd827446c24067541ac5390e6f527fb5947bb)
24 [#7](/bitcoin-bitcoin/7/) 0x561450f16fb4 in _start (/ci_container_base/ci/scratch/build-x86_64-pc-linux-gnu/bin/test_bitcoin-qt+0x2adfb4) (BuildId: 0a3f6add7bb2dd2a7bfc097a6edf88a12aebb62f)
25
26SUMMARY: MemorySanitizer: use-of-uninitialized-value /usr/qtbase/src/corelib/plugin/qlibrary.cpp:364:26 in QLibraryStore::cleanup()
27Exiting
28
29
3099% tests passed, 1 tests failed out of 157
31
32Total Test time (real) = 608.27 sec
33
34The following tests FAILED:
35 7 - test_bitcoin-qt (Failed)
36Errors while running CTest
37Command '['/ci_container_base/ci/test/03_test_script.sh']' returned non-zero exit status 8.