test: Do not pass tests on unhandled exceptions

maflcko commented at 10:43 am on July 17, 2025: member

Currently the functional tests are problematic, because they pass, even if they encounter an unhanded exception.

Fix this by handling all exceptions: Catch BaseException as fallback and mark it as failure.

Can be tested via:

 0diff --git a/test/functional/wallet_disable.py b/test/functional/wallet_disable.py
 1index da6e5d408f..ecc41fb041 100755
 2--- a/test/functional/wallet_disable.py
 3+++ b/test/functional/wallet_disable.py
 4@@ -19,6 +19,7 @@ class DisableWalletTest (BitcoinTestFramework):
 5         self.wallet_names = []
 6 
 7     def run_test (self):
 8+        import sys;sys.exit("fatal error")
 9         # Make sure wallet is really disabled
10         assert_raises_rpc_error(-32601, 'Method not found', self.nodes[0].getwalletinfo)
11         x = self.nodes[0].validateaddress('3J98t1WpEZ73CNmQviecrnyiWrnqRhWNLy')

Previously, the test would pass. With this patch, it would fail.

in test/functional/test_framework/test_framework.py:208 in fad0cd510c outdated

214@@ -215,6 +215,9 @@ def main(self):
215         except KeyboardInterrupt:
216             self.log.warning("Exiting after keyboard interrupt")
217             self.success = TestStatus.FAILED
218+        except BaseException as e:
219+            self.log.warning(f"Unhandled base exception {repr(e)}")
220+            self.success = TestStatus.FAILED

stickies-v commented at 1:36 pm on July 17, 2025:

All these exception handlers (except SkipTest) have the same behaviour, except for making the logs less useful by hiding the actual error. Can’t we just remove all (except SkipTest and BaseException) and let the exception speak for itself? + updates log to exception instead of warning.

 0diff --git a/test/functional/test_framework/test_framework.py b/test/functional/test_framework/test_framework.py
 1index 21007d9156..17e5a6b397 100755
 2--- a/test/functional/test_framework/test_framework.py
 3+++ b/test/functional/test_framework/test_framework.py
 4@@ -194,29 +194,11 @@ class BitcoinTestFramework(metaclass=BitcoinTestMetaClass):
 5             else:
 6                 self.run_test()
 7 
 8-        except JSONRPCException:
 9-            self.log.exception("JSONRPC error")
10-            self.success = TestStatus.FAILED
11         except SkipTest as e:
12             self.log.warning("Test Skipped: %s" % e.message)
13             self.success = TestStatus.SKIPPED
14-        except AssertionError:
15-            self.log.exception("Assertion failed")
16-            self.success = TestStatus.FAILED
17-        except KeyError:
18-            self.log.exception("Key error")
19-            self.success = TestStatus.FAILED
20-        except subprocess.CalledProcessError as e:
21-            self.log.exception("Called Process failed with '{}'".format(e.output))
22-            self.success = TestStatus.FAILED
23-        except Exception:
24-            self.log.exception("Unexpected exception caught during testing")
25-            self.success = TestStatus.FAILED
26-        except KeyboardInterrupt:
27-            self.log.warning("Exiting after keyboard interrupt")
28-            self.success = TestStatus.FAILED
29         except BaseException as e:
30-            self.log.warning(f"Unhandled base exception {repr(e)}")
31+            self.log.exception(repr(e))
32             self.success = TestStatus.FAILED
33         finally:
34             exit_code = self.shutdown()

maflcko commented at 2:57 pm on July 17, 2025:

Agree; thx, done

stickies-v commented at 3:15 pm on July 17, 2025:

Logging the exception e seems more useful than “Unexpected exception”, and imo completely obviates the need for having the CalledProcessError and KeyboardInterrupt special case handling?

stickies-v commented at 3:21 pm on July 17, 2025:

Logs with special case handling:

02025-07-17T15:07:39.774663Z TestFramework (ERROR): Called Process failed with 'None'
12025-07-17T15:18:22.039274Z TestFramework (WARNING): Exiting after keyboard interrupt

Logs without special case handling:

02025-07-17T15:19:09.169925Z TestFramework (ERROR): CalledProcessError(1, ['ls', '--invalid-flag'])
12025-07-17T15:19:26.639162Z TestFramework (ERROR): KeyboardInterrupt()

Imo, they’re equally informative for KeyboardInterrupt, and more informative without special case handling for CalledProcessError, with less code.

 0diff --git a/test/functional/feature_abortnode.py b/test/functional/feature_abortnode.py
 1index a5c8aa163a..748875b25d 100755
 2--- a/test/functional/feature_abortnode.py
 3+++ b/test/functional/feature_abortnode.py
 4@@ -22,6 +22,8 @@ class AbortNodeTest(BitcoinTestFramework):
 5         # We'll connect the nodes later
 6 
 7     def run_test(self):
 8+        import subprocess
 9+        subprocess.run(["ls", "--invalid-flag"], check=True)
10         self.generate(self.nodes[0], 3, sync_fun=self.no_op)
11 
12         # Deleting the undo file will result in reorg failure
13diff --git a/test/functional/test_framework/test_framework.py b/test/functional/test_framework/test_framework.py
14index 823d946cfe..17e5a6b397 100755
15--- a/test/functional/test_framework/test_framework.py
16+++ b/test/functional/test_framework/test_framework.py
17@@ -197,14 +197,8 @@ class BitcoinTestFramework(metaclass=BitcoinTestMetaClass):
18         except SkipTest as e:
19             self.log.warning("Test Skipped: %s" % e.message)
20             self.success = TestStatus.SKIPPED
21-        except subprocess.CalledProcessError as e:
22-            self.log.exception("Called Process failed with '{}'".format(e.output))
23-            self.success = TestStatus.FAILED
24-        except KeyboardInterrupt:
25-            self.log.warning("Exiting after keyboard interrupt")
26-            self.success = TestStatus.FAILED
27         except BaseException as e:
28-            self.log.exception(f"Unexpected exception")
29+            self.log.exception(repr(e))
30             self.success = TestStatus.FAILED
31         finally:
32             exit_code = self.shutdown()

maflcko commented at 3:59 pm on July 17, 2025:

more informative without special case handling for CalledProcessError, with less code.

I don’t think this is true. It is missing the output. You can try this for any command that captures the output:

subprocess.run(['bash', '-c', 'echo aaaaa && false'], check=True, capture_output=True)

There could be an argument about stderr, but this seems better as a separate commit or pull.

KeyboardInterrupt

It currently uses self.log.warning, which is different in verbosity. I don’t think it makes sense to show the traceback when the user pressed CTRL+C?

stickies-v commented at 5:47 pm on July 17, 2025:

You can try this for any command that captures the output:

Ah cool, thanks for the example. Agreed that improving is out of scope for this PR then.

I don’t think it makes sense to show the traceback when the user pressed CTRL+C?

No strong preference either way, I kinda like seeing where it got interrupted (and I’m already used to it because the KeyboardInterrupt exception handling doesn’t work for test_runner.py which is my usual interface), but makes sense to not change this behaviour here and leave as-is.

Can be marked as resolved.

maflcko commented at 5:35 am on July 18, 2025:

I kinda like seeing where it got interrupted

Yeah, I realize it is useful to debug timeouts. Pushed a fresh commit for KeyboardInterrupt.

maflcko added the label Tests on Jul 17, 2025

maflcko force-pushed on Jul 17, 2025

DrahtBot commented at 2:56 pm on July 17, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33001.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	stickies-v, pablomartin4btc

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

test: Do not pass tests on unhandled exceptions

This adds a missing catch for BaseException (e.g. SystemExit), which
would otherwise be silently ignored.

Also, remove the redundant other catches, which are just calling
log.exception with a redundant log message.

fa30b34026

maflcko force-pushed on Jul 17, 2025

DrahtBot added the label CI failed on Jul 17, 2025

DrahtBot commented at 3:05 pm on July 17, 2025: contributor

🚧 At least one of the CI tasks failed. Task lint: https://github.com/bitcoin/bitcoin/runs/46188973278 LLM reason (✨ experimental): The test_framework code contains two lint errors related to unused variables and f-string formatting, which are detected by the linter and cause the CI failure.

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

DrahtBot removed the label CI failed on Jul 17, 2025

stickies-v approved

stickies-v commented at 5:48 pm on July 17, 2025: contributor

ACK fa30b34026f76a5b8af997152fced2d281782e0d

test: Log KeyboardInterrupt as exception

log.exception is more verbose and useful to debug timeouts.

Also, log stderr for CalledProcessError to make debugging easier.

faa3e68411

stickies-v approved

stickies-v commented at 10:10 am on July 18, 2025: contributor

re-ACK faa3e684118bffa7a98cf76eeeb59243219df900

OrangeDoro commented at 10:38 am on July 20, 2025: none

Hi! I’m a grad student working on a research project about using large language models to automate code review. Based on your commit faa3e684118bffa7a98cf76eeeb59243219df900 and the changes in test/functional/test_framework/test_framework.py, my tool generated this comment:

Ensure that e.stdout and e.stderr are checked for existence before logging to prevent potential AttributeError. Consider using getattr(e, 'stdout', 'N/A') and getattr(e, 'stderr', 'N/A').
Change except BaseException to except Exception to avoid catching system-exiting exceptions unintentionally.
Ensure self.success is set to TestStatus.FAILED in the BaseException block to accurately reflect the test’s outcome.
Ensure that the shutdown method in the finally block is well-defined and handles potential exceptions to enhance robustness.
Reintroduce the KeyboardInterrupt exception handling to ensure user interruptions are logged and handled properly.
Consider sanitizing the output before logging or avoid logging the entire output to prevent sensitive information exposure.
Ensure that the exit code handling is appropriate for the application context. Document the intended behavior if sys.exit() is called.
The assertion assert hasattr(self, "num_nodes") could be replaced with an explicit check and raise a ValueError if self.num_nodes is not set.
Consider logging the command that was executed in the subprocess.CalledProcessError block for better debugging context.
Add a test case that simulates a subprocess.CalledProcessError and verifies that the log message contains both stdout and stderr.
Capture the log output during tests and assert that it matches the expected format, including the presence of stdout and stderr.
The self.success assignments in the exception handling blocks could use comments to clarify what each status means in the context of the test execution.

As part of my research, I’m trying to understand how useful these comments are in real-world development. If you have a moment, I’d be super grateful if you could quickly reply to these two yes/no questions:

Does this comment provide suggestions from a dimension you hadn’t considered?
Do you find this comment helpful?

Thanks a lot for your time and feedback! And sorry again if this message is a bother.

maflcko commented at 6:56 am on July 21, 2025: member

@OrangeDoro Every LLM generated point in your comment is wrong and completely misses the point. This is trivial to see, if you took a look at the previous comments and commit messages. (Please stop sharing LLM generated content that you didn’t review or even understand yourself)

pablomartin4btc approved

pablomartin4btc commented at 9:24 pm on July 22, 2025: member

tACK faa3e684118bffa7a98cf76eeeb59243219df900

Managed to reproduce the issue with the patch provided in the PR description, this branch fixes it. Nice detail handling KeyboardInterrupt for debugging purpose. Code-wise refactoring: better reading removing redundant catches.

test: Do not pass tests on unhandled exceptions #33001

Code Coverage & Benchmarks

Reviews