qa: Improvements to debug_assert_log + busy_wait_for_debug

hodlinator commented at 8:39 am on September 18, 2025: contributor

Remove incorrect docstring in busy_wait_for_debug_log().
Replace nerfed regex searches with X in Y expressions.
Only compute the log string to be printed on failure when we actually fail instead of every 0.05s.
As we find each needle (expected message) in the haystack (log output), stop searching for it. If we fail and time out, we will only complain about the needles (expected messages) we didn’t find. On master we also include found needles, which is less helpful.

Found while developing a new test case in #33336 (review)

DrahtBot added the label Tests on Sep 18, 2025

DrahtBot commented at 8:39 am on September 18, 2025: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33423.

Reviews

See the guideline for information on the review process. A summary of reviews will appear here.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#19461 (multiprocess: Add bitcoin-gui -ipcconnect option by ryanofsky)
#19460 (multiprocess: Add bitcoin-wallet -ipcconnect option by ryanofsky)
#10102 (Multiprocess bitcoin by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

in test/functional/test_framework/test_node.py:574 in acb3e9a660 outdated

570@@ -569,7 +571,7 @@ def assert_debug_log(self, expected_msgs, unexpected_msgs=None, timeout=2):
571             if time.time() >= time_end:
572                 break
573             time.sleep(0.05)
574-        self._raise_assertion_error('Expected messages "{}" does not partially match log:\n\n{}\n\n'.format(str(expected_msgs), print_log))
575+        self._raise_assertion_error(f'Expected message(s) {expected_msgs!s} not found in log:\n\n{print_log(log)}\n\n')

l0rinc commented at 5:59 pm on September 18, 2025:

this was weird before, it logged print_log which was created in different scope. Having log here is still weird, but I guess Python is weird.

nit: based on the docs, !r might be more appropriate here:

0        self._raise_assertion_error(f'Expected message(s) {expected_msgs!r} not found in log:\n\n{print_log(log)}\n\n')

hodlinator commented at 1:01 pm on September 19, 2025:

I prefer !s since it is closer to the original str(). !r seems more appropriate when one wants to copy the result into code, mirroring repr(), feverishly escaping backslashes etc:

0>>> repr(r'\w')
1"'\\\\w'"
2>>> str(r'\w')
3'\\w'

in test/functional/test_framework/test_node.py:565 in acb3e9a660 outdated

562                 log = dl.read()
563-            print_log = " - " + "\n - ".join(log.splitlines())
564             for unexpected_msg in unexpected_msgs:
565                 if re.search(re.escape(unexpected_msg), log, flags=re.MULTILINE):
566-                    self._raise_assertion_error('Unexpected message "{}" partially matches log:\n\n{}\n\n'.format(unexpected_msg, print_log))
567+                    self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" found in log:\n\n{print_log(log)}\n\n')

l0rinc commented at 6:00 pm on September 18, 2025:

nit: you could mention the reason for changing the texts as well in the commit message

hodlinator commented at 9:44 pm on September 19, 2025:

Done.

in test/functional/test_framework/test_node.py:553 in acb3e9a660 outdated

549@@ -550,17 +550,19 @@ def assert_debug_log(self, expected_msgs, unexpected_msgs=None, timeout=2):
550         time_end = time.time() + timeout * self.timeout_factor
551         prev_size = self.debug_log_size(encoding="utf-8")  # Must use same encoding that is used to read() below
552 
553+        def print_log(log):

l0rinc commented at 11:26 pm on September 18, 2025:

nit: print_log and log are both used in the vicinity, if you retouch again, consider different names nit2: any reason for putting this before the yield?

hodlinator commented at 1:57 pm on September 19, 2025:

Agree join_log is more accurate, changed.
It felt like the setup code was all before the yield.

in test/functional/test_framework/test_node.py:574 in 93511bdca5 outdated

574+            # We found remaining expected messages before encountering unexpected messages.
575+            if len(found) == len(expected_msgs):
576                 return
577+            # Remove the ones we found so far, so we only complain about remaining ones when timing out.
578+            for f in reversed(found):
579+                expected_msgs.pop(f)

l0rinc commented at 11:28 pm on September 18, 2025:

can we do this without mutating the input arg?

0remaining = list(expected_msgs)
1...
2remaining = [m for m in remaining if m not in found]

hodlinator commented at 9:13 pm on September 19, 2025:

Agreed that it’s nicer not modify inputs, done!

I had the list comprehension of

0remaining = [e for e in remaining if e not in log]

Then I realized it was wasteful to keep searching if we already failed to find one of the expected messages, so I reverted the list comprehension and added a break.

Found another way to probably get rid of found (not well tested). But it comes at the expense of copying the remaining enumerated list for each outer loop iteration, so holding off:

0for i, expected_msg in reversed(list(enumerate(remaining_expected))):
1    if expected_msg in log:
2        remaining_expected.pop(i)
3    else:
4        break

in test/functional/test_framework/test_node.py:567 in 93511bdca5 outdated

567-            for expected_msg in expected_msgs:
568-                if re.search(re.escape(expected_msg), log, flags=re.MULTILINE) is None:
569-                    found = False
570-            if found:
571+            for i, expected_msg in enumerate(expected_msgs):
572+                if re.search(re.escape(expected_msg), log, flags=re.MULTILINE):

l0rinc commented at 11:34 pm on September 18, 2025:

what’s the reason for regex iteration above and here if we’re just escaping it anyway? We’re not doing it in busy_wait_for_debug_log I have added

0            found = []
1            for i, expected_msg in enumerate(expected_msgs):
2                if re.search(re.escape(expected_msg), log, flags=re.MULTILINE):
3                    found.append(i)
4            found2 = [i for i, m in enumerate(expected_msgs) if m in log]
5            assert found == found2, f"Mismatch: {found} != {found2}"

and ran the whole suite and it seems to be the same, maybe we can simplify it. Note that found2 should likely be a set instead of a list, otherwise comparing the sizes may not make sense. It also helps with the in condition for the remaining values.

hodlinator commented at 9:29 pm on September 19, 2025:

At first I thought there was some special property with re.search(..., re.MULTILINE), but digging deeper I think not. Looking at the PR which added it, I found #14024 (review) which indicates that the original user of the function was passing in a search string with symbols that could be interpreted as a regex but shouldn’t be. #13006 which spawned the issue mentioned regexps, so that’s probably why it’s in this unfortunate nerfed state.

I agree on switch to needle in haystack form instead. Confirmed it to work by running --extended locally.

The inefficiency of print_log was also pointed out but dismissed: #14024 (review) - hopefully this PR is still acceptable.

in test/functional/test_framework/test_node.py:559 in 93511bdca5 outdated

555@@ -556,18 +556,22 @@ def print_log(log):
556         yield
557 
558         while True:
559-            found = True
560+            found = []

l0rinc commented at 11:34 pm on September 18, 2025:

we don’t actually need this here, can we move it closer to it actual usage? Or if we use a list/set comprehension we don’t even need this line

hodlinator commented at 9:45 pm on September 19, 2025:

Moved.

in test/functional/test_framework/test_node.py:602 in 93511bdca5 outdated

602+            for i, expected_msg in enumerate(expected_msgs):
603+                if expected_msg in log:
604+                    found.append(i)
605 
606-            if found:
607+            if len(found) == len(expected_msgs):

l0rinc commented at 0:31 am on September 19, 2025:

Wouldn’t this return if the same expected message appears multiple times in the log? I think we should collect the indexes into a set instead. Not sure that will keep the expected order (which we likely should), I haven’t checked or thought about extensively.

hodlinator commented at 9:21 pm on September 19, 2025:

We only report one found-entry per found expected message if it exists in the log, doesn’t matter if it exists twice in the log (the indexes are not into the log), so I think it should be alright.

l0rinc changes_requested

l0rinc commented at 0:39 am on September 19, 2025: contributor

Thanks for reducing useless work - did you measure any speedup? I think we could make the result simpler and more pythonic - it’s a test, I find visibly simpler code better here than theoretically faster one that we can’t even measure, so I suggested a few changes. Here’s the end-result containing most of my suggestions:

 0diff --git a/test/functional/test_framework/test_node.py b/test/functional/test_framework/test_node.py
 1index e28140c17b..ccbde4702c 100755
 2--- a/test/functional/test_framework/test_node.py
 3+++ b/test/functional/test_framework/test_node.py
 4@@ -550,32 +550,31 @@ class TestNode():
 5         time_end = time.time() + timeout * self.timeout_factor
 6         prev_size = self.debug_log_size(encoding="utf-8")  # Must use same encoding that is used to read() below
 7 
 8-        def print_log(log):
 9-            return " - " + "\n - ".join(log.splitlines())
10-
11         yield
12 
13+        def join_lines(logs):
14+            return " - " + "\n - ".join(logs.splitlines())
15+
16+        remaining = list(expected_msgs)
17         while True:
18-            found = []
19             with open(self.debug_log_path, encoding="utf-8", errors="replace") as dl:
20                 dl.seek(prev_size)
21                 log = dl.read()
22+
23             for unexpected_msg in unexpected_msgs:
24-                if re.search(re.escape(unexpected_msg), log, flags=re.MULTILINE):
25-                    self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" found in log:\n\n{print_log(log)}\n\n')
26-            for i, expected_msg in enumerate(expected_msgs):
27-                if re.search(re.escape(expected_msg), log, flags=re.MULTILINE):
28-                    found.append(i)
29+                if unexpected_msg in log:
30+                    self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" found in log:\n\n{join_lines(log)}\n\n')
31+
32+            found = {i for i, m in enumerate(remaining) if m in log}
33             # We found remaining expected messages before encountering unexpected messages.
34-            if len(found) == len(expected_msgs):
35+            if len(found) == len(remaining):
36                 return
37             # Remove the ones we found so far, so we only complain about remaining ones when timing out.
38-            for f in reversed(found):
39-                expected_msgs.pop(f)
40+            remaining = [m for m in remaining if m not in found]
41             if time.time() >= time_end:
42                 break
43             time.sleep(0.05)
44-        self._raise_assertion_error(f'Expected message(s) {expected_msgs!s} not found in log:\n\n{print_log(log)}\n\n')
45+        self._raise_assertion_error(f'Expected message(s) {remaining!r} not found in log:\n\n{join_lines(log)}\n\n') [@contextlib](/bitcoin-bitcoin/contributor/contextlib/).contextmanager
46     def busy_wait_for_debug_log(self, expected_msgs, timeout=60):
47@@ -589,21 +588,16 @@ class TestNode():
48 
49         yield
50 
51+        remaining = list(expected_msgs)
52         while True:
53-            found = []
54             with open(self.debug_log_path, "rb") as dl:
55                 dl.seek(prev_size)
56                 log = dl.read()
57 
58-            for i, expected_msg in enumerate(expected_msgs):
59-                if expected_msg in log:
60-                    found.append(i)
61-
62-            if len(found) == len(expected_msgs):
63+            found = {i for i, m in enumerate(remaining) if m in log}
64+            if len(found) == len(remaining):
65                 return
66-
67-            for f in reversed(found):
68-                expected_msgs.pop(f)
69+            remaining = [m for m in remaining if m not in found]
70 
71             if time.time() >= time_end:
72                 print_log = " - " + "\n - ".join(log.decode("utf8", errors="replace").splitlines())
73@@ -612,8 +606,7 @@ class TestNode():
74             # No sleep here because we want to detect the message fragment as fast as
75             # possible.
76 
77-        self._raise_assertion_error(
78-            f'Expected message(s) {expected_msgs!s} not found in log:\n\n{print_log}\n\n')
79+        self._raise_assertion_error(f'Expected message(s) {remaining!r} not found in log:\n\n{print_log}\n\n') [@contextlib](/bitcoin-bitcoin/contributor/contextlib/).contextmanager
80     def wait_for_new_peer(self, timeout=5):

hodlinator force-pushed on Sep 20, 2025

hodlinator commented at 9:33 pm on September 20, 2025: contributor

Thanks for the feedback!

Thanks for reducing useless work - did you measure any speedup?

Rebased new push on top of #33336 which has a lot of assert_debug_log() (010cf81407c0df8de766fd2a116463d180f25f33), the below represents somewhat average runs:

 0₿ git co 2025/09/assert_debug_log_rebased~3
 1₿ hyperfine -r 30 -N ./build/test/functional/feature_assumevalid.py
 2Benchmark 1: ./build/test/functional/feature_assumevalid.py
 3  Time (mean ± σ):      6.058 s ±  0.049 s    [User: 4.263 s, System: 0.520 s]
 4  Range (min … max):    5.985 s …  6.157 s    30 runs
 5
 6₿ git co 2025/09/assert_debug_log_rebased
 7₿ hyperfine -r 30 -N ./build/test/functional/feature_assumevalid.py
 8Benchmark 1: ./build/test/functional/feature_assumevalid.py
 9  Time (mean ± σ):      5.998 s ±  0.045 s    [User: 4.237 s, System: 0.496 s]
10  Range (min … max):    5.929 s …  6.109 s    30 runs

So the tendency is a difference of 1% for this test case - a bit underwhelming. Would expect to see more of a difference for longer log files and more expected_msgs per invocation.

hodlinator commented at 7:05 am on September 22, 2025: contributor

CI failure (https://github.com/bitcoin/bitcoin/actions/runs/17885097855/job/50857723211?pr=33423) seems unrelated:

ERROR: failed to build: failed to solve: short read: expected 659078263 bytes but got 90259897: unexpected EOF Command '['./ci/test/02_run_container.sh']' returned non-zero exit status 1.

hodlinator force-pushed on Sep 22, 2025

hodlinator commented at 7:01 pm on September 22, 2025: contributor

Latest push has somewhat simpler code (no found-list). Tested performance again, similarly to #33423#pullrequestreview-3244741286:

 0₿ git co 2025/09/assert_debug_log_rebased~3
 1₿ hyperfine -r 30 -N ./build/test/functional/feature_assumevalid.py
 2Benchmark 1: ./build/test/functional/feature_assumevalid.py
 3  Time (mean ± σ):      6.034 s ±  0.047 s    [User: 4.260 s, System: 0.488 s]
 4  Range (min … max):    5.964 s …  6.146 s    30 runs
 5 
 6₿ git co 2025/09/assert_debug_log_rebased
 7₿ hyperfine -r 30 -N ./build/test/functional/feature_assumevalid.py
 8Benchmark 1: ./build/test/functional/feature_assumevalid.py
 9  Time (mean ± σ):      5.990 s ±  0.046 s    [User: 4.233 s, System: 0.482 s]
10  Range (min … max):    5.881 s …  6.126 s    30 runs

So less than 1% performance improvement in this case. Again, should perform better for other cases.

hodlinator force-pushed on Sep 25, 2025

in test/functional/test_framework/test_node.py:566 in 6561ac312e outdated

569-                if re.search(re.escape(expected_msg), log, flags=re.MULTILINE) is None:
570-                    found = False
571-            if found:
572+                if unexpected_msg in log:
573+                    self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" '
574+                                                f'found in log:\n\n{join_log(log)}\n\n')

l0rinc commented at 5:57 pm on September 25, 2025:

Seems this was always eagerly evaluated since https://github.com/bitcoin/bitcoin/commit/fa3e9f7627784ee00980590e5bf044a0e1249999#diff-7f22a082e3a80ca2f212e36193f87dd80237035afedb7f15b116ef7fa265f3eeR245-R248 - the current approach of only calculating it once when the loop also terminates makes sense

nit: I know it’s a long line but I would still prefer this merged to reduce the noise. I’m fine either way:

0                    self._raise_assertion_error(f'Unexpected message "{unexpected_msg}" found in log:\n\n{join_log(log)}\n\n')

hodlinator commented at 9:02 pm on September 26, 2025:

Regarding nit: I sympathize with the way busy_wait_for_debug_log() on master puts the parameter to _raise_assertion_error() on it’s own line, so I would be tempted to do that for all of them - which makes me realize: if only I broke the message in two lines, I could raise up the first half and still keep close to 80 char width - and we’re back with the current approach. Left unchanged in latest push.

in test/functional/test_framework/test_node.py:571 in 6561ac312e

574+                                                f'found in log:\n\n{join_log(log)}\n\n')
575+            while len(remaining_expected) > 0:
576+                if remaining_expected[-1] in log:
577+                    remaining_expected.pop()
578+                else:
579+                    break

l0rinc commented at 6:00 pm on September 25, 2025:

this basically pops until the last element is still in logs, stops otherwise, right?

0        while remaining_expected and remaining_expected[-1] in log:
1            remaining_expected.pop()

in test/functional/test_framework/test_node.py:572 in 6561ac312e

575+            while len(remaining_expected) > 0:
576+                if remaining_expected[-1] in log:
577+                    remaining_expected.pop()
578+                else:
579+                    break
580+            if len(remaining_expected) == 0:

l0rinc commented at 6:00 pm on September 25, 2025:

it would be slightly more pythonic to do

0            if not remaining_expected:

in test/functional/test_framework/test_node.py:576 in 6561ac312e outdated

582             if time.time() >= time_end:
583                 break
584             time.sleep(0.05)
585-        self._raise_assertion_error('Expected messages "{}" does not partially match log:\n\n{}\n\n'.format(str(expected_msgs), print_log))
586+        self._raise_assertion_error(f'Expected message(s) {remaining_expected!s} '
587+                                    f'not found in log:\n\n{join_log(log)}\n\n')

l0rinc commented at 6:02 pm on September 25, 2025:

that’s not necessarily true, we have just stopped searching after the first failure, right?

hodlinator commented at 9:02 pm on September 26, 2025:

Thanks for noticing this issue!

Latest push inserts

0remaining_expected = [e for e in remaining_expected if e not in log]

before the error reporting for it to be correct, meaning we only need to do the full search on failure.

l0rinc commented at 6:04 pm on September 25, 2025: contributor

The error message seems off to me now, we’re potentially logging lines as missing that we haven’t checked.

doc: Remove no longer correct comment

Left over since fa7b07571f24b6def6effdd4cc1b96c7507bf959.

Co-authored-by: Lőrinc <pap.lorinc@gmail.com>

5c16e4631c

qa: Replace always-escaped regexps with "X in Y"

Always escaping the search string makes the use of regular expressions unnecessary.

Co-authored-by: Lőrinc <pap.lorinc@gmail.com>

a9021101dc

refactor(qa): Avoid unnecessary string operations

print_log was recalculated every 0.05s in assert_debug_log(), even during successful circumstances - changed to only be computed upon failure.

Simplified terminology from "(does not) partially match(es)" to "(not) found in" since it seems to reference the first function having used regular expression matching, while it always escaped the search strings (see parent commit). (Simplified grammar also avoids issues with singular/plural "was/were not found").

1e54125e2e

qa: Only complain about expected messages that were not found

Only leaves messages we never found in the final assert message of the functions, which is more helpful (remaining_expected).

Avoids repeatedly searching for messages we have already found (pop()).

Stops searching for other expected messages if we already failed finding one. Still need to clean remaining_expected at the end, but *only if we fail*.

Co-authored-by: Lőrinc <pap.lorinc@gmail.com>

a1f7623020

in test/functional/test_framework/test_node.py:579 in 6561ac312e outdated

585-        self._raise_assertion_error('Expected messages "{}" does not partially match log:\n\n{}\n\n'.format(str(expected_msgs), print_log))
586+        self._raise_assertion_error(f'Expected message(s) {remaining_expected!s} '
587+                                    f'not found in log:\n\n{join_log(log)}\n\n')
588 
589     @contextlib.contextmanager
590     def busy_wait_for_debug_log(self, expected_msgs, timeout=60):

l0rinc commented at 6:17 pm on September 25, 2025:

    Return:
       the number of log lines we encountered when matching

This doesn’t seem to be the case anymore

hodlinator commented at 9:06 pm on September 26, 2025:

Thanks! Added commit removing this from docstring with reference in commit message to ref commit where it was made incorrect.

hodlinator force-pushed on Sep 26, 2025

hodlinator commented at 9:21 pm on September 26, 2025: contributor

Latest push fixes bug from previous push (https://github.com/bitcoin/bitcoin/pull/33423#discussion_r2379926157), simplifies code (https://github.com/bitcoin/bitcoin/pull/33423#discussion_r2379920746) and adds commit to fix docstring (https://github.com/bitcoin/bitcoin/pull/33423#discussion_r2379958513).

Compared run-times for all functional tests and wasn’t able to find one which clearly benefited from this optimization. So I’m open to just replacing while remaining_expected and remaining_expected[-1] in log: remaining_expected.pop() with remaining_expected = [e for e in remaining_expected if e not in log] if reviewers prefer.

qa: Improvements to debug_assert_log + busy_wait_for_debug_log #33423

Code Coverage & Benchmarks

Reviews

Conflicts