Add Travis check to make sure unit test coverage reports stay deterministic.
Rationale:
A necessary condition for meaningful line coverage measurement is that the test suite is deterministic, in the sense that the set of lines executed at least once is identical between test suite runs.
This PR addresses issue #14343 (MarcoFalke): “coverage reports non-deterministic”:
Our unit tests and functional tests are non-deterministic in the overall execution, but the coverage should not be affected by that. I.e. some functions might be executed in a different order or sometimes skipped, but every line, function and branch should be executed at least once.
This is currently not true, even for serialization errors that should be hit exactly once.
Beside the obvious issue of missing test coverage on some runs, this also makes it impossible to see how test coverage changes between two commits.
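The determinism condition described above can be illustrated with a minimal Python sketch. It is not part of this PR (the actual check is a shell script); the function names and the representation of a run as a set of (file, line) pairs are assumptions made for illustration only.

```python
from typing import List, Set, Tuple

# Hypothetical representation: one executed source line is (file, line number).
Line = Tuple[str, int]


def nondeterministic_lines(runs: List[Set[Line]]) -> Set[Line]:
    """Return the lines executed in at least one run but not in every run."""
    if not runs:
        return set()
    executed_in_any = set().union(*runs)
    executed_in_all = set.intersection(*runs)
    return executed_in_any - executed_in_all


def coverage_is_deterministic(runs: List[Set[Line]]) -> bool:
    """Line coverage is deterministic when every run executes the same set of lines."""
    return not nondeterministic_lines(runs)
```

If one run executes line 363 of test/denialofservice_tests.cpp and another run does not, `nondeterministic_lines` returns exactly that line and `coverage_is_deterministic` returns `False`.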
Example output when the unit tests have deterministic line coverage:
[2019-06-30 08:32:59] Measuring coverage, run #1 of 3
[2019-06-30 08:36:38] Measuring coverage, run #2 of 3
[2019-06-30 08:40:15] Measuring coverage, run #3 of 3

Coverage test passed: Deterministic coverage across 3 runs.
Example output when the unit tests have non-deterministic line coverage:
[2019-06-30 08:32:59] Measuring coverage, run #1 of 3
[2019-06-30 08:36:38] Measuring coverage, run #2 of 3

The line coverage is non-deterministic between runs.

The test suite must be deterministic in the sense that the set of lines executed at least
once must be identical between runs. This is a necessary condition for meaningful coverage
measuring.

--- gcovr.run-1.txt 2019-01-30 23:14:07.419418694 +0100
+++ gcovr.run-2.txt 2019-01-30 23:15:57.998811282 +0100
@@ -471,7 +471,7 @@
 test/crypto_tests.cpp 270 270 100%
 test/cuckoocache_tests.cpp 142 142 100%
 test/dbwrapper_tests.cpp 148 148 100%
-test/denialofservice_tests.cpp 225 225 100%
+test/denialofservice_tests.cpp 225 224 99% 363
 test/descriptor_tests.cpp 116 116 100%
 test/fs_tests.cpp 24 3 12% 14,16-17,19-20,23,25-26,29,31-32,35-36,39,41-42,45-46,49,51-52
 test/getarg_tests.cpp 111 111 100%
@@ -585,5 +585,5 @@
 zmq/zmqpublishnotifier.h 5 0 0% 12,31,37,43,49
 zmq/zmqrpc.cpp 21 0 0% 16,18,20,22,33-35,38-45,49,52,56,60,62-63
 ------------------------------------------------------------------------------
-TOTAL 61561 27606 44%
+TOTAL 61561 27605 44%
 ------------------------------------------------------------------------------
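A unified diff like the one above, between two per-run coverage summaries, could be produced roughly as follows. This is only a sketch using Python's standard `difflib`. It is not how test_deterministic_coverage.sh is implemented, and the function name and default file names are assumptions chosen to mirror the example output.

```python
import difflib
from typing import List


def coverage_diff(report_a: str, report_b: str,
                  name_a: str = "gcovr.run-1.txt",
                  name_b: str = "gcovr.run-2.txt") -> List[str]:
    """Return unified-diff lines between two coverage summaries.

    An empty list means the two summaries are identical.
    """
    return list(difflib.unified_diff(
        report_a.splitlines(),
        report_b.splitlines(),
        fromfile=name_a,
        tofile=name_b,
        lineterm="",  # inputs were split without newlines, so emit none
    ))
```

A check built on this would treat any non-empty result as a failure and print the diff so that the affected files and missed lines are visible.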
This Travis check uses test_deterministic_coverage.sh, which was introduced in #15296.
To make sure test_deterministic_coverage.sh won’t trigger any line coverage non-determinism alarms with the current test suite, I’ve performed 8 000 test runs (against 98958c81f5065a5de13699d46995d278ecb6709e) which all gave identical line coverage results.
Note to reviewers: Which would be the most appropriate Travis job to put this on? I’m currently using x86_64 Linux [GOAL: install] [bionic] [uses qt5 dev package instead of depends Qt to speed up build and avoid timeout], but I’m afraid the total run-time of that job is a bit on the high end with this check included. Would it be preferable to add a new job instead of adding to an existing job? Please advise :-)