valgrind will incur a slowdown of at least 2, so increase the default timeout factor.
This should reduce the number of reported issues. See also #27112 (comment)
valgrind will incur a slowdown of at least 2, so increase the default timeout factor.
This should reduce the number of reported issues. See also #27112 (comment)
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
See the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
I’m still seeing #27208 with this change (fa27cf4cc7c24aa00a66dffabab849d4b3cb1c41). See here for the full combined log:
0...............................................................................................
1162/256 - p2p_ibd_stalling.py failed, Duration: 126 s
2
3stdout:
42023-03-08T04:33:24.745000Z TestFramework (INFO): PRNG seed is: 3703385557670057473
52023-03-08T04:33:24.745000Z TestFramework (INFO): Initializing test directory /home/ubuntu/ci_scratch/ci/scratch/test_runner/test_runner_₿_🏃_20230308_012459/p2p_ibd_stalling_97
62023-03-08T04:33:38.260000Z TestFramework (INFO): Prepare blocks without sending them to the node
72023-03-08T04:33:38.356000Z TestFramework (INFO): Check that a staller does not get disconnected if the 1024 block lookahead buffer is filled
82023-03-08T04:35:17.908000Z TestFramework.utils (ERROR): wait_until() failed. Predicate: ''''
9 test_function = lambda: self.is_connected
10'''
112023-03-08T04:35:17.908000Z TestFramework (ERROR): Assertion failed
12Traceback (most recent call last):
13 File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 132, in main
14 self.run_test()
15 File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/p2p_ibd_stalling.py", line 77, in run_test
16 peers.append(node.add_outbound_p2p_connection(P2PStaller(stall_block), p2p_idx=id, connection_type="outbound-full-relay"))
17 File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_node.py", line 663, in add_outbound_p2p_connection
18 p2p_conn.wait_for_connect()
19 File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 467, in wait_for_connect
20 wait_until_helper(test_function, timeout=timeout, lock=p2p_lock)
21 File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 281, in wait_until_helper
22 raise AssertionError("Predicate {} not true after {} seconds".format(predicate_source, timeout))
23AssertionError: Predicate ''''
24 test_function = lambda: self.is_connected
25''' not true after 60.0 seconds
262023-03-08T04:35:27.925000Z TestFramework.utils (ERROR): wait_until() failed. Predicate: ''''
27 wait_until_helper(lambda: not self.network_event_loop.is_running(), timeout=timeout)
28'''
29[node 0] Cleaning up leftover process
30
31stderr:
32Traceback (most recent call last):
33 File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/p2p_ibd_stalling.py", line 164, in <module>
34 P2PIBDStallingTest().main()
35 File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 155, in main
36 exit_code = self.shutdown()
37 File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 310, in shutdown
38 self.network_thread.close()
39 File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 603, in close
40 wait_until_helper(lambda: not self.network_event_loop.is_running(), timeout=timeout)
41 File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 281, in wait_until_helper
42 raise AssertionError("Predicate {} not true after {} seconds".format(predicate_source, timeout))
43AssertionError: Predicate ''''
44 wait_until_helper(lambda: not self.network_event_loop.is_running(), timeout=timeout)
45''' not true after 10.0 seconds
46Task was destroyed but it is pending!
47task: <Task pending name='Task-9' coro=<BaseSelectorEventLoop._accept_connection2() running at /usr/lib/python3.10/asyncio/selector_events.py:193>>
48/usr/lib/python3.10/asyncio/base_events.py:671: RuntimeWarning: coroutine 'BaseSelectorEventLoop._accept_connection2' was never awaited
49......
50p2p_ibd_stalling.py | ✖ Failed | 126 s
51
52ALL | ✖ Failed | 42880 s (accumulated)
53Runtime: 11402 s
--valgrind
passed (and no timeout factor).
Yeah, this doesn’t affect the valgrind CI system.
Ok. Can we consolidate this at all? The disparity is somewhat confusing because you would assume that the Valgrind CI job would use the dedicated --valgrind
option provided by the test framework.
bitcoind/-cli/-tx/-util
etc in the valgrind wrapper. Also, if someone wants to interactively re-run a single test inside the CI pod, they’d have to pass --valgrind
. Maybe open a new discussion thread or pull request?
Concept ACK. On my test machine --valgrind --timeout-factor=4
works, whereas without --timeout-factor
it doesn’t. Haven’t tried a lower value.
I also have to set --jobs
very low (e.g. 5 on a 32 thread CPU, well below the point where it uses ~100% CPU), or I still get timeouts and other errors (e.g. no RPC connection
in p2p_tx_download.py
). So perhaps 4 is not enough if you want to max-out a machine, but fine for CI. (Also, I compiled without BDB)
Even with that some tests fail, but they do so very quickly so that seems unrelated (see #27228).
--valgrind
(outside the CI system), but will follow up separately with that. Increasing the timeout here seems fine.