Enable functional tests in the ThreadSanitizer (TSan) build job.
This is a follow-up to @MarcoFalke's #14764, which added TSan but only for unit tests.
@@ -99,9 +99,9 @@ jobs:
  DOCKER_NAME_TAG=ubuntu:16.04
  PACKAGES="clang llvm python3-zmq qtbase5-dev qttools5-dev-tools libssl-dev libevent-dev bsdmainutils libboost-system-dev libboost-filesystem-dev libboost-chrono-dev libboost-test-dev libboost-thread-dev libdb5.3++-dev libminiupnpc-dev libzmq3-dev libprotobuf-dev protobuf-compiler libqrencode-dev"
  NO_DEPENDS=1
- RUN_FUNCTIONAL_TESTS=false # Disabled for now. TODO identify suppressions or exclude specific tests
+ FUNCTIONAL_TESTS_CONFIG="--exclude feature_block.py,p2p_invalid_messages.py"
Why are they excluded?
feature_block.py and p2p_invalid_messages.py fail for an unknown reason. Unfortunately the error log is not very informative, so I haven't figured out why, or which suppression (if any) would solve it.
They are timing out because TSan slows bitcoind down so much. You'd probably have to raise all the various timeouts (rpcwait and the timeout for polling loops).
At least that is what I assume based on my TSan runs a few days ago.
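To illustrate what raising the polling-loop timeouts could look like, here is a minimal sketch of a polling loop whose timeout is scaled by a single factor. It is not taken from the test framework: `wait_until` and `TIMEOUT_FACTOR` are hypothetical names chosen for this example.

```shell
#!/usr/bin/env bash
# Sketch only: wait_until and TIMEOUT_FACTOR are hypothetical names,
# not part of the functional test framework.

# Multiplier applied to every timeout; a slow TSan build might use e.g. 4.
TIMEOUT_FACTOR="${TIMEOUT_FACTOR:-1}"

# wait_until <base_timeout_seconds> <command...>
# Re-run the command once per second until it succeeds or the scaled
# timeout is used up. Only the sleep intervals are counted, not the time
# spent inside the command itself.
wait_until() {
  timeout_s=$(( $1 * TIMEOUT_FACTOR ))
  shift
  elapsed=0
  while [ "$elapsed" -lt "$timeout_s" ]; do
    if "$@"; then
      return 0
    fi
    sleep 1
    elapsed=$(( elapsed + 1 ))
  done
  return 1
}
```

With something like this, a sanitizer job would only need to export a larger `TIMEOUT_FACTOR` instead of touching each timeout individually.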
@MarcoFalke Ah, I see. What do you think about doing that in a follow-up PR once this is merged? It would be nice to keep this initial PR minimal to get basic functional testing under Travis. Debugging Travis is quite time-consuming, so I'd rather split the task in two if possible.
@@ -99,9 +99,9 @@ jobs:
  DOCKER_NAME_TAG=ubuntu:16.04
  PACKAGES="clang llvm python3-zmq qtbase5-dev qttools5-dev-tools libssl-dev libevent-dev bsdmainutils libboost-system-dev libboost-filesystem-dev libboost-chrono-dev libboost-test-dev libboost-thread-dev libdb5.3++-dev libminiupnpc-dev libzmq3-dev libprotobuf-dev protobuf-compiler libqrencode-dev"
  NO_DEPENDS=1
- RUN_FUNCTIONAL_TESTS=false # Disabled for now. TODO identify suppressions or exclude specific tests
+ FUNCTIONAL_TESTS_CONFIG="--exclude feature_block.py,p2p_invalid_messages.py"
  GOAL="install"
- BITCOIN_CONFIG="--enable-zmq --with-incompatible-bdb --with-gui=qt5 CPPFLAGS=-DDEBUG_LOCKORDER --with-sanitizers=thread --disable-hardening --disable-asm CC=clang CXX=clang++"
+ BITCOIN_CONFIG="--enable-zmq --disable-wallet --with-gui=qt5 --with-sanitizers=thread --disable-hardening --disable-asm CC=clang CXX=clang++"
Why are the wallet and DEBUG_LOCKORDER disabled?
For some strange reason, all functional wallet tests fail under Travis (wallet_basic, wallet_dump, etc.). The reason for the failure is not clear from the logs, so I haven't been able to solve this yet.
I've now re-added CPPFLAGS=-DDEBUG_LOCKORDER since it doesn't seem to cause any problems.
Looks like they time out with
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
https://travis-ci.org/MarcoFalke/bitcoin/jobs/462970867
which shouldn't happen, since we should be printing dots all the time. Not sure what is going on :(
+deadlock:WalletBatch::WritePool
+deadlock:WalletBatch::WritePurpose
+deadlock:WalletBatch::WriteTx
+deadlock:WalletBatch::WriteWatchOnly
+race:BerkeleyDatabase::Flush
+race:BerkeleyEnvironment::Flush
This seems like a lot. What about something like this instead?
diff --git a/test/sanitizer_suppressions/tsan b/test/sanitizer_suppressions/tsan
index 209c46f..4edf68c 100644
--- a/test/sanitizer_suppressions/tsan
+++ b/test/sanitizer_suppressions/tsan
@@ -14,9 +14,10 @@ deadlock:TestPotentialDeadLockDetected
race:src/qt/test/*
deadlock:src/qt/test/*
-# WIP: Unidentified suppressions to run the functional tests
-#race:zmqpublishnotifier.cpp
-#
+# External libraries
+deadlock:libdb
+race:libzmq
+
Fixed!
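For context, a suppression file like this only takes effect if the ThreadSanitizer runtime is pointed at it through the `TSAN_OPTIONS` environment variable. The sketch below shows one way to assemble that value. `tsan_options` is a hypothetical helper, and both paths are assumptions about the checkout layout, not necessarily what the CI scripts use.

```shell
#!/usr/bin/env bash
# Sketch only: tsan_options is a hypothetical helper, and both paths
# below are assumptions about the checkout layout.

# Join the suppression file and the log path prefix into a single
# TSAN_OPTIONS value. Sanitizer runtime flags can be separated by colons.
tsan_options() {
  printf 'suppressions=%s:log_path=%s' "$1" "$2"
}

TSAN_OPTIONS="$(tsan_options "${PWD}/test/sanitizer_suppressions/tsan" "${PWD}/sanitizer-output/tsan")"
export TSAN_OPTIONS
```

Any TSan-instrumented binary started from this shell would then read the suppressions and write its reports to files under the given `log_path` prefix instead of stderr.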
@MarcoFalke Please review the latest version.
I've now re-introduced CPPFLAGS=-DDEBUG_LOCKORDER.
However, I've been unable to make the Travis build pass without FUNCTIONAL_TESTS_CONFIG="--exclude feature_block.py,p2p_invalid_messages.py" and --disable-wallet.
@@ -41,12 +41,19 @@ DOCKER_EXEC ./configure --cache-file=../config.cache $BITCOIN_CONFIG_ALL $BITCOI
 END_FOLD

 BEGIN_FOLD build
-DOCKER_EXEC make $MAKEJOBS $GOAL || ( echo "Build failure. Verbose build follows." && DOCKER_EXEC make $GOAL V=1 ; false )
+DOCKER_EXEC make $MAKEJOBS $GOAL || (
+  echo "Build failure. Verbose build follows." && DOCKER_EXEC make $GOAL V=1
+  DOCKER_EXEC "cat ${TRAVIS_BUILD_DIR}/sanitizer-output/* 2> /dev/null"
+  false
+)
Could these commands be run in an after_failure step? Then you wouldn't have to copy-paste this code in three different places.
@ken2812221 I haven't tried after_failure before, but let me try to summarise: the suggestion is to add a script, say .travis/test_XX_after_failure.sh, containing ...
#!/bin/bash
DOCKER_EXEC "cat ${TRAVIS_BUILD_DIR}/sanitizer-output/* 2> /dev/null"
.. that would be referenced using ...
after_failure:
- set -o errexit; source .travis/test_XX_after_failure.sh
… in .travis.yml?
Is that a correct summary of the suggestion? :-)
@ken2812221 Friendly ping :-)
Could you just test it?
Ah, I guess it wouldn't work because we exit "hard" as opposed to just returning a non-zero exit code.
The trap approach seems to work: 8d78bb9536e44e6b79532f539b1dd0066ce7ee6c
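For readers who have not used the pattern, the general idea of a trap-based log dump can be sketched as follows. This is not the code from the commit above: `run_with_log_dump` is a hypothetical name, and the log directory is passed in as an argument rather than taken from the Travis environment.

```shell
#!/usr/bin/env bash
# Sketch only: run_with_log_dump is a hypothetical helper, not the code
# from the referenced commit.

# run_with_log_dump <log_dir> <command...>
# Run the command in a subshell. If it fails, even through a hard `exit`,
# print every file in <log_dir> and then propagate the original status.
run_with_log_dump() {
  log_dir="$1"
  shift
  (
    # The EXIT trap fires however the subshell terminates.
    trap 'status=$?
          if [ "$status" -ne 0 ]; then
            cat "$log_dir"/* 2> /dev/null
          fi
          exit "$status"' EXIT
    "$@"
  )
}
```

Because the handler is attached to EXIT rather than placed after the command, it still runs when a step calls `exit` directly, which is the case where an `||` fallback or a later step would never be reached.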
@MarcoFalke @ken2812221 Updated. Please re-review :-)
utACK 5e5138a721738f47053d915e4c65f925838ad5b4
@MarcoFalke Added suppression (race:InterruptRPC, fix in #14993). Please re-review :-)