test: Remove --noshutdown flag, Tidy startup failures

maflcko commented at 10:21 am on January 8, 2025: member

The --noshutdown flag is brittle, confusing, and redundant:

Someone wanting to inspect the state after a test failure will likely also want to debug the state on the python side, so the option is redundant with --pdbonfailure. If there was a use case to replicate --pdbonfailure without starting pdb, a dedicated flag could be added for that use case.
It is brittle to use the flag for a passing test, because it will disable checks in the test. For example, on shutdown LSan will perform a leak check, and the test framework will check that the node did not crash, and it will check that the node did not print errors to stderr.

Fix all issues by removing it.

Also, tidy up startup error messages to be less confusing as a result.

test: Treat leftover process as error

Printing to stderr instead of stdout makes the test_runner.py fail on
leftover processes. This is desired and fine, because a leftover process
should only happen on a test failure anyway.

fad441fba0

test: Remove --noshutdown flag fa0dc09b90

test: Avoid redundant stop and error spam on startup failure

Trying to immediately shut down a node after a startup failure without
waiting for the RPC to be fully up will in most cases just fail and lead
to an RPC error.

Also, it is confusing to sidestep the existing fallback to kill any
leftover nodes on a test failure.

So just rely on the fallback.

fae3bf6b87

DrahtBot commented at 10:21 am on January 8, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31620.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	hodlinator, pablomartin4btc

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#30660 (qa: Shut down framework cleanly on RPC connection failure by hodlinator)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot renamed this:
~~test: Remove --noshutdown flag, Tidy startup failures~~
test: Remove --noshutdown flag, Tidy startup failures
on Jan 8, 2025

DrahtBot added the label Tests on Jan 8, 2025

maflcko commented at 10:40 am on January 8, 2025: member

For testing the third commit different ways are possible. For example:

0$ rm -rf ./releases
1$ ./test/get_previous_releases.py -b
2$ rm -rf ./releases/v28.0/
3$ ./build/test/functional/wallet_migration.py

Then, observing a shorter error message.

pablomartin4btc commented at 8:22 pm on January 8, 2025: member

cr ACK and tACK fae3bf6b870eb0f9cddd1adac82ba72890806ae3 (coming from #31462).

Adding sys.stderr was necessary for the CI checks (1st commit - fad441fba07877ea78ed6020fde11828307273a6). Redundant error messages were eliminated (e.g. in the proposed PR’s test: “Error: no RPC connection” is shown only once instead of twice as in master).

Just for reference, --noshutdown was added in #6032 and --pdbonfailure in #11023.

in test/functional/test_framework/test_framework.py:328 in fae3bf6b87 outdated

331-            for node in self.nodes:
332-                node.cleanup_on_exit = False
333-            self.log.info("Note: bitcoinds were not stopped and may still be running")
334+        self.log.info("Stopping nodes")
335+        if self.nodes:
336+            self.stop_nodes()

hodlinator commented at 10:09 am on January 23, 2025:

Used repurposed test script (556b4646985d4313925bf060692e7a690d396629) from #30660 to see how things behave.

Cleaner shutdowns when failing can be achieved by also avoiding stop-attempts here:

0        if self.success == TestStatus.FAILED:
1            self.log.info("Not stopping nodes as test failed")
2        else:
3            self.log.info("Stopping nodes")
4            if self.nodes:
5                self.stop_nodes()

Output without suggestion

 0₿ build/test/functional/feature_framework_rpc_failure.py 
 12025-01-23T10:07:35.733000Z TestFramework (INFO): PRNG seed is: 1128325736874135373
 22025-01-23T10:07:35.733000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_qgky8ew8
 32025-01-23T10:07:35.744000Z TestFramework (ERROR): Assertion failed
 4Traceback (most recent call last):
 5  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 131, in main
 6    self.setup()
 7  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 313, in setup
 8    self.setup_network()
 9  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 401, in setup_network
10    self.setup_nodes()
11  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 423, in setup_nodes
12    self.start_nodes()
13  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 573, in start_nodes
14    node.wait_for_rpc_connection()
15  File "/home/hodlinator/b2/test/functional/test_framework/test_node.py", line 334, in wait_for_rpc_connection
16    self._raise_assertion_error("Unable to connect to bitcoind after {}s".format(self.rpc_timeout))
17  File "/home/hodlinator/b2/test/functional/test_framework/test_node.py", line 198, in _raise_assertion_error
18    raise AssertionError(self._node_msg(msg))
19AssertionError: [node 0] Unable to connect to bitcoind after 0s
202025-01-23T10:07:35.795000Z TestFramework (INFO): Stopping nodes
21Traceback (most recent call last):
22  File "/home/hodlinator/b2/build/test/functional/feature_framework_rpc_failure.py", line 13, in <module>
23    FeatureFrameworkRPCFailure(__file__).main()
24  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 159, in main
25    exit_code = self.shutdown()
26                ^^^^^^^^^^^^^^^
27  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 328, in shutdown
28    self.stop_nodes()
29  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 587, in stop_nodes
30    node.stop_node(wait=wait, wait_until_stopped=False)
31  File "/home/hodlinator/b2/test/functional/test_framework/test_node.py", line 395, in stop_node
32    self.stop(wait=wait)
33    ^^^^^^^^^
34  File "/home/hodlinator/b2/test/functional/test_framework/test_node.py", line 215, in __getattr__
35    assert self.rpc_connected and self.rpc is not None, self._node_msg("Error: no RPC connection")
36           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
37AssertionError: [node 0] Error: no RPC connection
38[node 0] Cleaning up leftover process

Output with suggestion

 0₿ build/test/functional/feature_framework_rpc_failure.py 
 12025-01-23T10:07:43.326000Z TestFramework (INFO): PRNG seed is: 2921237778351099427
 22025-01-23T10:07:43.326000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_7ywaw2wa
 32025-01-23T10:07:43.336000Z TestFramework (ERROR): Assertion failed
 4Traceback (most recent call last):
 5  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 131, in main
 6    self.setup()
 7  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 313, in setup
 8    self.setup_network()
 9  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 404, in setup_network
10    self.setup_nodes()
11  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 426, in setup_nodes
12    self.start_nodes()
13  File "/home/hodlinator/b2/test/functional/test_framework/test_framework.py", line 576, in start_nodes
14    node.wait_for_rpc_connection()
15  File "/home/hodlinator/b2/test/functional/test_framework/test_node.py", line 334, in wait_for_rpc_connection
16    self._raise_assertion_error("Unable to connect to bitcoind after {}s".format(self.rpc_timeout))
17  File "/home/hodlinator/b2/test/functional/test_framework/test_node.py", line 198, in _raise_assertion_error
18    raise AssertionError(self._node_msg(msg))
19AssertionError: [node 0] Unable to connect to bitcoind after 0s
202025-01-23T10:07:43.387000Z TestFramework (INFO): Not stopping nodes as test failed
212025-01-23T10:07:43.387000Z TestFramework (WARNING): Not cleaning up dir /tmp/bitcoin_func_test_7ywaw2wa
222025-01-23T10:07:43.387000Z TestFramework (ERROR): Test failed. Test logging available at /tmp/bitcoin_func_test_7ywaw2wa/test_framework.log
232025-01-23T10:07:43.388000Z TestFramework (ERROR): 
242025-01-23T10:07:43.388000Z TestFramework (ERROR): Hint: Call /home/hodlinator/b2/test/functional/combine_logs.py '/tmp/bitcoin_func_test_7ywaw2wa' to consolidate all logs
252025-01-23T10:07:43.388000Z TestFramework (ERROR): 
262025-01-23T10:07:43.388000Z TestFramework (ERROR): If this failure happened unexpectedly or intermittently, please file a bug and provide a link or upload of the combined log.
272025-01-23T10:07:43.388000Z TestFramework (ERROR): https://github.com/bitcoin/bitcoin/issues
282025-01-23T10:07:43.388000Z TestFramework (ERROR): 
29[node 0] Cleaning up leftover process

Similar results can be observed by using wallet_migration.py as described in #31620 (comment).

maflcko commented at 1:40 pm on January 23, 2025:

Excellent idea to fully rely on the fallback to reduce even more error spam!

Thanks, pushed as a commit!

Similar results can be observed by using wallet_migration.py as described in #31620 (comment).

Just as a test-note: The steps may not work when this pull is rebased on master, so reverting #31462 may be helpful if someone wishes to. It should be possible to come up with other steps to test, but I think having one way to test is enough for now.

hodlinator commented at 1:23 pm on January 23, 2025: contributor

Concept ACK fae3bf6b870eb0f9cddd1adac82ba72890806ae3

Thanks for digging into the history @pablomartin4btc. #6032 from year 2015 added --noshutdown in order to change the default behavior of --nocleanup from leaving lingering bitcoind-processes into shutting them down by default. I suspect no one relies on lingering processes so it makes sense to remove it.

On a related note I just opened draft PR #31723 which makes the test framework tell bitcoind to pause during startup until one attaches a C++ debugger. That way one can have a debugging session running from the beginning of the test instead of attaching to a lingering process after the test already failed like with --noshutdown.

Good to avoid knock-on errors, similarly to #30660 but more minimal. Contrary to that one, this PR doesn’t add any tests, but I repurposed one from the other PR and found a way to further tidy up the output (see inline comment).

hodlinator approved

hodlinator commented at 2:11 pm on January 23, 2025: contributor

ACK fa0369cccf7c2cde244ec6d8669605a5814e4a0a

Thanks for taking my suggestion! (Wouldn’t mind being official co-author of that last commit, but it’s basically just an extra condition ;) ).

Didn’t fully realize it before but without that last commit one didn’t get this output block because of the exception inside shutdown():

0TestFramework (WARNING): Not cleaning up dir /tmp/bitcoin_func_test_bkz3vya4
1TestFramework (ERROR): Test failed. Test logging available at /tmp/bitcoin_func_test_bkz3vya4/test_framework.log
2TestFramework (ERROR): 
3TestFramework (ERROR): Hint: Call /home/hodlinator/b2/test/functional/combine_logs.py '/tmp/bitcoin_func_test_bkz3vya4' to consolidate all logs
4TestFramework (ERROR): 
5TestFramework (ERROR): If this failure happened unexpectedly or intermittently, please file a bug and provide a link or upload of the combined log.
6TestFramework (ERROR): https://github.com/bitcoin/bitcoin/issues
7TestFramework (ERROR):

So it doesn’t just prevent spam, but actually fixes this block not being printed.

Tested through recommended wallet_migration.py-method.

DrahtBot requested review from pablomartin4btc on Jan 23, 2025

test: Avoid redundant stop and error spam on shutdown

Trying to shut down a node after a test failure may fail and lead to an
RPC error.

Also, it is confusing to sidestep the existing fallback to kill any
leftover nodes on a test failure.

So just rely on the fallback.

Idea by Hodlinator.

Co-Authored-By: Hodlinator <172445034+hodlinator@users.noreply.github.com>

faf2f2c654

maflcko force-pushed on Jan 23, 2025

maflcko commented at 2:17 pm on January 23, 2025: member

Thanks for taking my suggestion! (Wouldn’t mind being official co-author of that last commit, but it’s basically just an extra condition ;) ).

Sure, fixed. I guess I could have just used the GitHub interface to create the commit and automatically get your email inserted. But I went ahead and did it manually.

hodlinator approved

hodlinator commented at 2:23 pm on January 23, 2025: contributor

re-ACK faf2f2c654d9aa18b2f49a157956f9ab0fce302a

Thanks for the attribution! Confirmed via range-diff to only be a commit message change.

pablomartin4btc commented at 9:37 pm on January 25, 2025: member

re tACK faf2f2c654d9aa18b2f49a157956f9ab0fce302a

Didn’t fully realize it before but without that last commit one didn’t get this output block because of the exception inside shutdown():

So it doesn’t just prevent spam, but actually fixes this block not being printed.

Since the PR hasn’t been rebased I tested that running (edited: with the v28.0 folder and its binaries in releases/) wallet_migration.py (fixed in #31451).

If the PR gets rebased I’ll consider this:

Just as a test-note: The steps may not work when this pull is rebased on master, so reverting #31462 may be helpful if someone wishes to. It should be possible to come up with other steps to test, but I think having one way to test is enough for now.

fanquake merged this on Jan 28, 2025

fanquake closed this on Jan 28, 2025

TheCharlatan referenced this in commit e1405325c9 on Feb 3, 2025

maflcko deleted the branch on Feb 5, 2025

test: Remove –noshutdown flag, Tidy startup failures #31620

Code Coverage & Benchmarks

Reviews

Conflicts

Output without suggestion

Output with suggestion