RFC: assert_debug

ryanofsky commented at 11:32 am on April 17, 2018: member

This idea came up in conversation last night and two people told me it was terrible, so I was curious to try a wider audience. Idea would be to have an assert_debug_log(node, regex) function in python.

Example use:

0self.nodes[0].p2p.send_txs([tx_orphan_invalid])
1with assert_debug_log(self.nodes[0], "invalid orphan tx %s" % txid(tx_orphan_invalid)):
2   self.nodes[0].p2p.send_txs([tx_orphan_parent])

Possible implementation:

 0[@contextmanager](/bitcoin-bitcoin/contributor/contextmanager/)
 1def assert_debug_log(node, pattern, **other_options):
 2    prev_size = node.rpc.flush_debug_log()
 3    try:
 4       yield
 5    finally:
 6       new_size = node.rpc.flush_debug_log()
 7       with open(node.datadir_path("debug.log")) as log:
 8           log.seek(prev_size)
 9           if not re.match(pattern, log.read(new_size - prev_size)):
10               raise AssertionError

Advantages:

Log asserts could make existing tests simpler, more robust, and more readable because they would allow tests to check conditions they care about directly instead of reverse engineering node behaviour with counters and flags and inferences from p2p messages and disconnects.
Log asserts could improve test coverage. Lots of PRs are currently created that fix bugs and change behavior and have 0 accompanying tests because they would be difficult to write in the c++ test framework, and unreliable in the python framework.

Disadvantages:

Log asserts could be overused to check conditions that would be better to interrogate more directly.
Log asserts could be overused to check behaviours that shouldn’t be checked at all because they are implementation specific.
Log asserts could be overused to check trivial things not worth checking given the relative expense (performance, code verbosity, cost of maintaining fragile string comparisons).
Log asserts could encourage writing integration tests when unit tests would be more appropriate.
Log asserts could encourage writing white box tests when black box tests would be more appropriate.
Log asserts could encourage using the python test framework instead of the c++ framework.

Conclusion:

I think assert_debug_log could simplify existing tests, fill a gap in the testing framework, and help move closer to a world where PRs that change internal bitcoin behaviour more typically come with tests. I think assert_debug_log could be overused, but in practice this could be avoided with code review and documentation. More abstract concerns like white box vs black box, unit vs functional, and python vs c++ testing tend to come off a bit superstitious and cargo-culty to me so I’m not the best person to respond to them, but I wouldn’t discourage other discussion in such terms.

MarcoFalke commented at 12:56 pm on April 17, 2018: member

This might be useful if used appropriately. What we mostly care in functional tests is the behaviour (on the rpc and p2p interface) not so much the by-products that end up in the debug.log. One exception is where the rpc or p2p interface don’t provide sufficient information to be sure the behaviour happened for the right reason.

It seems asserting regex on the debug log can be done, but in almost all cases it would be orthogonal or rather in addition to existing test patterns. That is because a string logged to a text file does not imply the thing actually happened in the test. So if you expected, let’s say a disconnect, you could wait until you observe that disconnect in the debug log and then (in addition) wait until you observe the disconnect in the network. Rather, it could make more sense to only wait for the disconnect happening in the network and then go read the debug log of the node that initiated the disconnect and assert on the reason for the disconnect, since those are usually not communicated via existing interfaces.

It seems that is what your example use case is doing, so Concept ACK on that, but Concept NACK on other use cases.

sdaftuar commented at 1:17 pm on April 17, 2018: member

Concept ACK! I wish I had more examples at the top of my head but I’m pretty sure I’ve run into plenty of situations where an assert_debug_log() would have made a test more robust, by ensuring that some observed behavior happened for the reason we expected and not multiple other changes that broke the testing environment’s assumptions.

One example along these lines that comes to mind which is analogous to your suggestion is something like: assert_not_in_debug_log(node, "Unable to bind RPC port..."). This type of error comes up all the time for me in my CI runs of test_runner, resulting in cascading failures whenever something goes wrong in a way that prevents bitcoind from shutting down. This would be much easier to spot if we had scripted tools for looking at the debug log to eg ensure that a node starts up correctly when that’s what we expect.

Another thing that comes to mind is when you’re manually debugging something, being able to inspect the debug log from the test framework could be a powerful tool.

So while this could become a crutch for poorly written tests, I would appreciate the addition to the toolkit of something that could also help improve test robustness and error detection.

promag commented at 1:40 pm on April 17, 2018: member

Concept ACK.

Do you think it would have a big impact in the functional test performance, even an optimised implementation?

MarcoFalke commented at 1:46 pm on April 17, 2018: member

@promag I don’t think it will have any meaningful impact on test performance, given that we currently use polling loops all over the place. And it seems the example code above would only read the lines that were written to since the last read.

promag commented at 2:18 pm on April 17, 2018: member

@MarcoFalke I’m just asking because of the “double flush” (or whatever is done there) and the file open.

BTW, for me this would be the first time to see log expectations. Anyone knows of other test suite doing the same?

fanquake added the label Brainstorming on Apr 18, 2018

jnewbery commented at 6:14 pm on April 18, 2018: member

I don’t think I told you that this was a terrible idea, just that we’d want to avoid proliferation of this pattern all over the functional tests, since that would make them less readable and more brittle.

For certain scenarios, this would be very useful. For example, feature_block.py asserts on the contents of p2p REJECT messages. I think it’d be nicer if we were asserting on the debug log, since we may want to remove or re-engineer REJECT at some point.

promag commented at 6:18 pm on April 18, 2018: member

we’d want to avoid proliferation of this pattern all over the functional tests

Agree, but that’s easily moderated like @ryanofsky says.

skeees commented at 3:15 pm on April 23, 2018: contributor

concept ACK - its definitely useful - and allows you to test stuff that’s hard to interrogate over rpc

But - I worry that this will increase test fragility - i.e. simple changes to logging now might make tests break.

And - the more you tie tests to a particular implementation the more it potentially increases the work involved in (and thereby discourages) refactoring.

Also - i worry that this will increase the tendency for functional tests to cover things that should really be unit tests.

MarcoFalke added the label Tests on May 30, 2018

MarcoFalke added the label good first issue on May 30, 2018

MarcoFalke removed the label Brainstorming on May 30, 2018

laanwj referenced this in commit e24bf1ce18 on Jun 1, 2018

MarcoFalke referenced this in commit 55c18a4530 on Aug 24, 2018

MarcoFalke closed this on Aug 24, 2018

PastaPastaPasta referenced this in commit 5ebd537d54 on Apr 16, 2020

PastaPastaPasta referenced this in commit ff990120e9 on Apr 19, 2020

PastaPastaPasta referenced this in commit 334f68c483 on Apr 20, 2020

PastaPastaPasta referenced this in commit c75e7e2bad on Jun 12, 2020

PastaPastaPasta referenced this in commit 6ab97cbfca on Jun 13, 2020

PastaPastaPasta referenced this in commit b7d117c0ab on Jun 14, 2020

PastaPastaPasta referenced this in commit 329c56721b on Jun 14, 2020

DrahtBot locked this on Sep 8, 2021

gades referenced this in commit 93b7aff30b on Mar 11, 2022

RFC: assert_debug_log #13006