test: Default timeout factor to 4 under –valgrind #27221

pull maflcko wants to merge 1 commits into bitcoin:master from maflcko:2303-test-valgrind-avoid-timeout-🎛 changing 1 files +4 −3
  1. maflcko commented at 4:08 pm on March 7, 2023: member

    valgrind will incur a slowdown of at least 2, so increase the default timeout factor.

    This should reduce the number of reported issues. See also #27112 (comment)

  2. test: Default timeout factor to 4 under --valgrind fa27cf4cc7
  3. DrahtBot commented at 4:08 pm on March 7, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK fanquake
    Concept ACK Sjors

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  4. DrahtBot added the label Tests on Mar 7, 2023
  5. fanquake commented at 7:33 pm on March 7, 2023: member
    Concept ACK - will test
  6. fanquake commented at 7:54 am on March 8, 2023: member

    I’m still seeing #27208 with this change (fa27cf4cc7c24aa00a66dffabab849d4b3cb1c41). See here for the full combined log:

     0...............................................................................................
     1162/256 - p2p_ibd_stalling.py failed, Duration: 126 s
     2
     3stdout:
     42023-03-08T04:33:24.745000Z TestFramework (INFO): PRNG seed is: 3703385557670057473
     52023-03-08T04:33:24.745000Z TestFramework (INFO): Initializing test directory /home/ubuntu/ci_scratch/ci/scratch/test_runner/test_runner_₿_🏃_20230308_012459/p2p_ibd_stalling_97
     62023-03-08T04:33:38.260000Z TestFramework (INFO): Prepare blocks without sending them to the node
     72023-03-08T04:33:38.356000Z TestFramework (INFO): Check that a staller does not get disconnected if the 1024 block lookahead buffer is filled
     82023-03-08T04:35:17.908000Z TestFramework.utils (ERROR): wait_until() failed. Predicate: ''''
     9        test_function = lambda: self.is_connected
    10'''
    112023-03-08T04:35:17.908000Z TestFramework (ERROR): Assertion failed
    12Traceback (most recent call last):
    13  File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 132, in main
    14    self.run_test()
    15  File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/p2p_ibd_stalling.py", line 77, in run_test
    16    peers.append(node.add_outbound_p2p_connection(P2PStaller(stall_block), p2p_idx=id, connection_type="outbound-full-relay"))
    17  File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_node.py", line 663, in add_outbound_p2p_connection
    18    p2p_conn.wait_for_connect()
    19  File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 467, in wait_for_connect
    20    wait_until_helper(test_function, timeout=timeout, lock=p2p_lock)
    21  File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 281, in wait_until_helper
    22    raise AssertionError("Predicate {} not true after {} seconds".format(predicate_source, timeout))
    23AssertionError: Predicate ''''
    24        test_function = lambda: self.is_connected
    25''' not true after 60.0 seconds
    262023-03-08T04:35:27.925000Z TestFramework.utils (ERROR): wait_until() failed. Predicate: ''''
    27        wait_until_helper(lambda: not self.network_event_loop.is_running(), timeout=timeout)
    28'''
    29[node 0] Cleaning up leftover process
    30
    31stderr:
    32Traceback (most recent call last):
    33  File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/p2p_ibd_stalling.py", line 164, in <module>
    34    P2PIBDStallingTest().main()
    35  File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 155, in main
    36    exit_code = self.shutdown()
    37  File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 310, in shutdown
    38    self.network_thread.close()
    39  File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/p2p.py", line 603, in close
    40    wait_until_helper(lambda: not self.network_event_loop.is_running(), timeout=timeout)
    41  File "/home/ubuntu/ci_scratch/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/util.py", line 281, in wait_until_helper
    42    raise AssertionError("Predicate {} not true after {} seconds".format(predicate_source, timeout))
    43AssertionError: Predicate ''''
    44        wait_until_helper(lambda: not self.network_event_loop.is_running(), timeout=timeout)
    45''' not true after 10.0 seconds
    46Task was destroyed but it is pending!
    47task: <Task pending name='Task-9' coro=<BaseSelectorEventLoop._accept_connection2() running at /usr/lib/python3.10/asyncio/selector_events.py:193>>
    48/usr/lib/python3.10/asyncio/base_events.py:671: RuntimeWarning: coroutine 'BaseSelectorEventLoop._accept_connection2' was never awaited
    49......
    50p2p_ibd_stalling.py                                    | ✖ Failed  | 126 s
    51
    52ALL                                                    | ✖ Failed  | 42880 s (accumulated) 
    53Runtime: 11402 s
    
  7. maflcko commented at 10:37 am on March 8, 2023: member
    Yeah, this doesn’t affect the valgrind CI system. Only test runs with --valgrind passed (and no timeout factor).
  8. fanquake commented at 10:52 am on March 8, 2023: member

    Yeah, this doesn’t affect the valgrind CI system.

    Ok. Can we consolidate this at all? The disparity is somewhat confusing because you would assume that the Valgrind CI job would use the dedicated --valgrind option provided by the test framework.

  9. maflcko commented at 10:59 am on March 8, 2023: member
    Not sure if that makes sense. This would require special casing the non-test binaries bitcoind/-cli/-tx/-util etc in the valgrind wrapper. Also, if someone wants to interactively re-run a single test inside the CI pod, they’d have to pass --valgrind. Maybe open a new discussion thread or pull request?
  10. Sjors commented at 2:43 pm on March 8, 2023: member

    Concept ACK. On my test machine --valgrind --timeout-factor=4 works, whereas without --timeout-factor it doesn’t. Haven’t tried a lower value.

    I also have to set --jobs very low (e.g. 5 on a 32 thread CPU, well below the point where it uses ~100% CPU), or I still get timeouts and other errors (e.g. no RPC connection in p2p_tx_download.py). So perhaps 4 is not enough if you want to max-out a machine, but fine for CI. (Also, I compiled without BDB)

    Even with that some tests fail, but they do so very quickly so that seems unrelated (see #27228).

  11. fanquake approved
  12. fanquake commented at 3:10 pm on March 13, 2023: member
    ACK fa27cf4cc7c24aa00a66dffabab849d4b3cb1c41 - I still see at least one actual issue when running the functional tests under --valgrind (outside the CI system), but will follow up separately with that. Increasing the timeout here seems fine.
  13. fanquake merged this on Mar 13, 2023
  14. fanquake closed this on Mar 13, 2023

  15. maflcko deleted the branch on Mar 13, 2023
  16. sidhujag referenced this in commit 58324bbc35 on Mar 13, 2023
  17. sidhujag referenced this in commit 85f1a41dbd on Mar 15, 2023
  18. bitcoin locked this on Mar 12, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 01:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me