test: check for specific block reject reasons in p2p

theStack commented at 9:04 PM on August 15, 2021: contributor

In the test p2p_segwit.py, there are many instances where we send a segwit block to a node with the expectation that it is rejected. For this purpose, the helper function test_witness_block exists which allows also to check for a specific reject reason that is asserted in the debug log: https://github.com/bitcoin/bitcoin/blob/502d22ceed1f90ed41336260f8eb428d3acaf514/test/functional/p2p_segwit.py#L119-L120

This PR aims to increase the precision of the tests by adding the expected reject reasons to all test_witness_block call instances (found via greping after test_witness_block(.*accepted=False). For some blocks that are rejected due to failed script verification, the exact reason is only shown in the debug log if parallel script verification is disabled. For this reason, the addition of the reasons is split up in two commits:

first, all instances are tackled that are not subject to script verification, i.e. it doesn't matter whether parallel script verification is enabled or not (e.g. bad-blk-weight, bad-witness-merkle-match...)
then, we explicitely set -par=1 to only use one script thread, and add the remaining reasons (non-mandatory-script-verify-flag with the more specific reason in parantheses)

theStack force-pushed on Aug 15, 2021

DrahtBot added the label Tests on Aug 15, 2021

josibake approved

josibake commented at 11:43 AM on August 17, 2021: contributor

ACK https://github.com/bitcoin/bitcoin/pull/22711/commits/d742b1c7d05cfc56994833fd90aab0d8852ffe4a

this is a really important change: without it, the test could be failing for an entirely different reason and you would assume it was failing for the reason you expected. i tested this by adding the following lines to test_v0_outputs_arent_spendable:

from copy import deepcopy
fake_block = deepcopy(block)
# use a hardcoded hash from a previous test run
fake_block.hashPrevBlock = 45514493863593995281992143828725175249324913083213145988561565159796949646003
self.update_witness_block_with_transactions(block, [tx])
# Verify that segwit isn't activated. A block serialized with witness
# should be rejected prior to activation.
test_witness_block(self.nodes[0], self.test_node, fake_block, accepted=False, with_witness=True)

without a reason specified, this test passed! :scream:

after adding reason='unexpected-witness', the test failed with the following error:

AssertionError: [node 0] Expected messages "['unexpected-witness']" does not partially match log:

 - 2021-08-17T11:29:45.259050Z [msghand] [net_processing.cpp:2478] [ProcessMessage] received: block (144 bytes) peer=1
 - 2021-08-17T11:29:45.259090Z [msghand] [net_processing.cpp:3719] [ProcessMessage] received block 5be173022096f0ddfa40c27d8b1c6b8da20d2c2ede788c69f49c1c174559db08 peer=1
 - 2021-08-17T11:29:45.259114Z [msghand] [validation.cpp:3298] [AcceptBlockHeader] ERROR: AcceptBlockHeader: prev block not found

based on this, we should have assert_debug_log fail if both expected_msgs and unexpected_msgs are empty arrays. ill open a follow up PR for this.

in test/functional/p2p_segwit.py:868 in d742b1c7d0 outdated

 874 | @@ -869,7 +875,7 @@ def test_block_malleability(self):
 875 |      @subtest  # type: ignore
 876 |      def test_witness_block_size(self):
 877 |          # TODO: Test that non-witness carrying blocks can't exceed 1MB
 878 | -        # Skipping this test for now; this is covered in p2p-fullblocktest.py
 879 | +        # Skipping this test for now; this is covered in feature_block.py

jonatack commented at 4:24 PM on August 17, 2021:

(Per git show ca6523d0c8 (thanks for the commit reference!) this was changed in January 2018, is it still accurate, covered in feature_block.py?)

theStack commented at 5:28 PM on August 17, 2021:

Yes, I think the relevant part is here:

https://github.com/bitcoin/bitcoin/blob/fdd80b0a53b4af0b29cb6e03118e2456d053a757/test/functional/feature_block.py#L315-L337

These blocks (as all others in feature_block.py, IIRC) don't have a witness commitment, so they are "non-witness carrying blocks". For those the serialized size equals to the block weight divided by WITNESS_SCALE_FACTOR (4), i.e. with MAX_BLOCK_WEIGHT (4000000) this results in the 1MB size limit being checked. Note that these test parts were changed recently in #22378 to switch to weight-based accounting, the constant used was formerly named MAX_BLOCK_BASE_SIZE.

jonatack commented at 4:51 PM on August 17, 2021: contributor

Interesting. Are these error cases already covered by ./src/test/test_bitcoin -t script_tests?

theStack commented at 5:29 PM on August 17, 2021: contributor

Interesting. Are these error cases already covered by ./src/test/test_bitcoin -t script_tests?

It seems like each one of the following script interpreter errors (see src/script/script_error.cpp):

ERR_WITNESS_PROGRAM_WITNESS_EMPTY ("Witness program was passed an empty witness")
ERR_WITNESS_UNEXPECTED ("Witness provided for non-witness script")
ERR_CLEANSTACK ("Stack size must be exactly one after execution")
ERR_WITNESS_MALLEATED ("Witness requires empty scriptSig")
ERR_PUSH_SIZE ("Push value size limit exceeded")
ERR_SCRIPT_SIZE ("Script is too big")
ERR_INVALID_STACK_OPERATION ("Operation not valid with the current stack size")
ERR_EVAL_FALSE ("Script evaluated without error but finished with a false/empty top stack element")

can at least be found once in src/test/data/script_tests.json (without the ERR_ prefix), which is in turn used as input for the script tests in src/test/script_tests.cpp. I'm thinking that maybe a good follow-up PR idea would be to als introduce these script error constants in the functional test framework too, to not have those strings duplicated in the tests; it would be also easier to find which script errors are covered in unit and functional tests with a single grep. The test feature_taproot.py already does this locally:

https://github.com/bitcoin/bitcoin/blob/fdd80b0a53b4af0b29cb6e03118e2456d053a757/test/functional/feature_taproot.py#L554-L575

in test/functional/p2p_segwit.py:521 in d742b1c7d0 outdated

 518 | @@ -514,7 +519,8 @@ def test_v0_outputs_arent_spendable(self):
 519 |              # 'non-mandatory-script-verify-flag (Witness program was passed an
 520 |              # empty witness)' (otherwise).
 521 |              # TODO: support multiple acceptable reject reasons.

laanwj commented at 12:15 PM on August 18, 2021:

Is this TODO still relevant now that it explicitly sets single-threaded validation?

theStack commented at 12:55 PM on August 18, 2021:

Good question. One could argue that by implementing this TODO it'd be possible to further increase test coverage by asserting for the two different reject reasons dynamically, depending on whether g_parallel_script_checks is on; so on single-core systems, the specific message would be checked (like in this PR), and on multi-core systems, the more generic error message block-validation-failed would be checked. At least that's my interpreration of the TODO's original intention.

(Not sure though if it's worth it to do that. Currently, as far as I know we don't even have an easy way of determining of whether bitcoind uses parallel script verification if we don't pass -par explicitely. Also it makes the test less deterministic, and since most of the systems nowadays are multi-core, the case with the specific reject reasons would rarely be tested.)

MarcoFalke commented at 4:34 PM on August 26, 2021:

Maybe remove the TODO then?

theStack commented at 6:09 PM on August 26, 2021:

@MarcoFalke: Yes, seems to be better to remove it (forgot to do that after my comment). Done.

theStack cross-referenced this on Aug 18, 2021 from issue test: fix failure in feature_nulldummy.py on single-core machines by theStack

MarcoFalke referenced this in commit 0492b56e38 on Aug 26, 2021

theStack force-pushed on Aug 26, 2021

DrahtBot cross-referenced this on Aug 30, 2021 from issue test: Activate all regtest softforks at height 1, unless overridden by MarcoFalke

DrahtBot commented at 11:56 PM on August 30, 2021: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

josibake commented at 11:03 AM on August 31, 2021: contributor

reACK: https://github.com/bitcoin/bitcoin/pull/22711/commits/00b93e5e990b04e8f8b3e1d6d0a274de89e83f8b

DrahtBot added the label Needs rebase on Sep 24, 2021

test: fix reference to block processing test in p2p_segwit.py

The block test was renamed from `p2p-fullblocks.py` to
`feature_block.py` in commit ca6523d0c8 (PR #11774).

b1488c4dce

test: check for block reject reasons in p2p_segwit.py [1/2]

This commit adds specific expected reject reasons for segwit blocks
sent to the node, that are independent of whether multiple script threads
are activated or not.

4eb532ff8b

test: check for block reject reasons in p2p_segwit.py [2/2]

This commit adds specific expected reject reasons for segwit blocks
sent to the node, that are only showing up if one script threads
is used. For this reason, the node is started with the parameter
`-par=1`.

45827fd718

theStack force-pushed on Sep 24, 2021

theStack commented at 4:15 PM on September 24, 2021: contributor

Rebased on master.

DrahtBot removed the label Needs rebase on Sep 24, 2021

pg156 commented at 2:51 PM on September 29, 2021: none

Concept and approach ACK

stratospher commented at 7:51 AM on October 24, 2021: contributor

tested ACK 45827fd.

Checking the reject reasons in test_witness_block is a great addition to the test.

MarcoFalke merged this on Oct 25, 2021

MarcoFalke closed this on Oct 25, 2021

sidhujag referenced this in commit eaa91408c5 on Oct 25, 2021

bitcoin locked this on Oct 30, 2022

test: check for specific block reject reasons in p2p_segwit.py #22711

Conflicts