According to the ThreadSanitizer docs:
C++11 threading is supported with llvm libc++.
For example, the thread sanitizer build is currently not checking for double lock of mutexes.
Fixes (partially) #19038 (comment)
According to the ThreadSanitizer docs:
C++11 threading is supported with llvm libc++.
For example, the thread sanitizer build is currently not checking for double lock of mutexes.
Fixes (partially) #19038 (comment)
What does this change? Why?
@sipa Looks like it extends tsan checks to include checks on std::thread handling etc.
Adding libc++-dev seems to be a way to fix the build: https://packages.ubuntu.com/xenial/amd64/libc++-dev/filelist
How about running both the clang-9 wallet+qt version and the clang-10 libc++? Or is there not meaningful threading to justify it?
<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
<!--174a7506f384e20aa4161008e828411d-->
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
This fails because the suppressions can not be applied:
==8563==WARNING: invalid path to external symbolizer!
==8563==WARNING: Failed to use and restart external symbolizer!
==================
WARNING: ThreadSanitizer: double lock of a mutex (pid=8563)
[#0](/bitcoin-bitcoin/0/) <null> <null> (bitcoind+0x7e8d6)
[#1](/bitcoin-bitcoin/1/) <null> <null> (libc++.so.1+0x83505)
[#2](/bitcoin-bitcoin/2/) <null> <null> (bitcoind+0x13458a)
[#3](/bitcoin-bitcoin/3/) <null> <null> (bitcoind+0x360bdc)
[#4](/bitcoin-bitcoin/4/) <null> <null> (bitcoind+0x36b491)
[#5](/bitcoin-bitcoin/5/) <null> <null> (bitcoind+0x36af3e)
[#6](/bitcoin-bitcoin/6/) <null> <null> (bitcoind+0x3334e0)
[#7](/bitcoin-bitcoin/7/) <null> <null> (bitcoind+0x38568a)
[#8](/bitcoin-bitcoin/8/) <null> <null> (bitcoind+0x11a65a)
[#9](/bitcoin-bitcoin/9/) <null> <null> (bitcoind+0x600b6e)
Location is global '<null>' at 0x000000000000 (bitcoind+0x000001347f30)
Mutex M131381 (0x55c3485def30) created at:
[#0](/bitcoin-bitcoin/0/) <null> <null> (bitcoind+0x7e8d6)
[#1](/bitcoin-bitcoin/1/) <null> <null> (libc++.so.1+0x83505)
[#2](/bitcoin-bitcoin/2/) <null> <null> (bitcoind+0xf136c)
[#3](/bitcoin-bitcoin/3/) <null> <null> (libc.so.6+0x270b2)
SUMMARY: ThreadSanitizer: double lock of a mutex (/home/travis/build/bitcoin/bitcoin/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/bitcoind+0x7e8d6)
==================
Anyone knows how to fix this? @practicalswift maybe?
I already tried setting PATH to $PATH:/usr/lib/llvm-10/bin/ to add llvm-symbolizer directory to PATH, but that didn't help
Fixed all issues. Travis green on the tsan run. This is ready for review. The changes are split out into separate micro-commits, so should be easier to review.
ACK faf62e6ed0ca45db44c370844c3515eb5a8cda12
Given the number of new suppressions I suggest also opening an issue for tackling them :)
Concept ACK.
Could removing llvm path in faf62e6ed0ca45db44c370844c3515eb5a8cda12 affect other builds with sanitizers?
I still not fully understand what is (potentially) wrong with code which emits "double locks" warnings about recursive mutexes. I can easy reproduce such warning but it does not give answers though.
Could someone point me to resources to get knowledge?
@hebasto Recursive mutexes are generally considered to be a bad idea. They work, and do what they're designed for - reentering a mutex you already hold is perfectly legal with them. However, they're generally a sign of badly structured code. In well-designed systems, you have code running inside the mutex, and code outside, and there is a clean barrier between the two. The fact that you need a recursive mutex means you have code that's trying to be useful in both, only making the boundary unclear.
Could removing llvm path in faf62e6 affect other builds with sanitizers?
No, because the PATH is never passed into the ci container (docker) anyway and is therefore unset either way.
I still not fully understand what is (potentially) wrong with code which emits "double locks" warnings about recursive mutexes. I can easy reproduce such warning but it does not give answers though.
I checked the first warning and it is about a plain mutex, not recursive one:
src/init.cpp:static Mutex g_genesis_wait_mutex;
I suggest we simply accept the suppressions file for now and then work our way out of this by
ACK faf62e6ed0ca45db44c370844c3515eb5a8cda12, maybe re-organize commits to modify suppressions in a single one?
maybe re-organize commits to modify suppressions in a single one?
I have a slight preference to keep it as is:
32 | @@ -33,7 +33,9 @@ if [ -z "$NO_DEPENDS" ]; then 33 | else 34 | SHELL_OPTS="CONFIG_SHELL=" 35 | fi 36 | - DOCKER_EXEC $SHELL_OPTS make $MAKEJOBS -C depends HOST=$HOST $DEP_OPTS 37 | + # Temporary workaround for https://github.com/bitcoin/bitcoin/issues/16368
Is this being muted because the depends build in the tsan job is now noisier when building with Clang and libstdc++? Or is this an unrelated commit similar to fac2eeeb9d718bdb892eef9adf333ea61ba8f3d0?
It is a combination of all of the qt build issues I've been running into lately.
I am just hoping that one of you build people will solve this one day :pray: kthxbye
To limit the changes here to purely the ci folder, I will leave any build changes for the future.
To clarify, yes this commit is required. Otherwise the build log would exceed 40000 lines and be killed.
to the qt build from outside to disable the wdeprecated warnings in depends.
Can you add NO_QT=1 NO_WALLET=1 to DEP_OPTS? Given qt is no-longer used here, and bdb wasn't used beforehand. If anything that should reduce the build output.
Not sure if we can bump qt to a more recent version
That's blocked until c++17, unless someone wants to fix c++11 support for a newer version (5.12) of Qt for macOS.
Regardless of above, I'll try follow up here and improve this for you.
Can you add NO_QT=1 NO_WALLET=1 to DEP_OPTS? Given qt is no-longer used here, and bdb wasn't used beforehand. If anything that should reduce the build output.
Well the goal is to eventually enable tsan for the wallet and gui. I might wait for sqlite to avoid the bdb horror, but at least qt will be tested against hopefully soon.
Also, it is nice to have a test for the "can I compile qt depends with clang", even though the build result is thrown away.
That's blocked until c++17, unless someone wants to fix c++11 support for a newer version (5.12) of Qt for macOS.
Thanks, didn't know that they switched away from cxx11
Haven't tested anything locally but commits all look ok. Can ACK once my comment has been addressed.
ACK faf62e6ed0ca45db44c370844c3515eb5a8cda12
The commits don't mention that, but the first one is a result of running tsan in Fedora, the second commit is the result of running tsan in Focal
Does this mean we should run CI test on Fedora as well?