test: fix intermittent failure in p2p_v2_earlykeyresponse #29352

pull mzumsande wants to merge 1 commits into bitcoin:master from mzumsande:202401_fix_flaky_earlykey changing 1 files +4 −0
  1. mzumsande commented at 11:06 pm on January 30, 2024: contributor

    The test fails intermittently, see https://cirrus-ci.com/task/6403578080788480?logs=ci#L3521 and #24748 (comment). I think it’s because of a race between the python NetworkThread and the actual test, which will both call initiate_v2_handshake. I could reproduce it by adding a sleep into initiate_v2_handshake after the line self.sent_garbage = random.randbytes(garbage_len).

    Fix this by waiting for the first initiate_v2_handshake to have finished before calling it a second time.

  2. DrahtBot commented at 11:06 pm on January 30, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK theStack, stratospher, achow101

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29358 (test: use v2 everywhere for P2PConnection if –v2transport is enabled by mzumsande)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Tests on Jan 30, 2024
  4. in test/functional/p2p_v2_earlykeyresponse.py:63 in 1d04d58ecc outdated
    58@@ -59,6 +59,8 @@ def data_received(self, t):
    59         # check that data can be received on recvbuf only when mismatch from V1_PREFIX happens (send_net_magic = False)
    60         assert self.v2_state.can_data_be_received and not self.v2_state.send_net_magic
    61 
    62+    def on_open(self):
    63+        self.connection_made = True
    


    stratospher commented at 5:57 am on January 31, 2024:
    1d04d58: i think we should call it a different name - self.connected or something like that? i’m still getting a crash with the sleep statement because python is confusing connection_made variable and callback function.

    mzumsande commented at 3:27 pm on January 31, 2024:
    thanks, renamed to connection_opened
  5. stratospher commented at 5:58 am on January 31, 2024: contributor
    clever find! will ACK after this is fixed.
  6. fanquake added this to the milestone 27.0 on Jan 31, 2024
  7. theStack commented at 12:06 pm on January 31, 2024: contributor

    Concept ACK

    I think you need to also initialize the newly introduced variable in the PeerEarlyKey constructor, otherwise the following can happen in the wait_until loop:

    0AttributeError: 'PeerEarlyKey' object has no attribute 'connection_made2'
    
  8. test: fix intermittent failure in p2p_v2_earlykeyresponse
    This fixes a possible race between the python NetworkThread and the actual
    test, which will both call initiate_v2_handshake.
    9642aefb81
  9. mzumsande force-pushed on Jan 31, 2024
  10. mzumsande commented at 3:28 pm on January 31, 2024: contributor
    Fixed both issues, thanks!
  11. sr-gi commented at 3:48 pm on January 31, 2024: member

    I wonder if we will not better off having a simpler initiate_v2_handshake in this case and avoiding sync + async logic. Given we end up having to manually send raw messages and accessing peer1 internal state anyway, I think it would be simpler to send everything manually:

     0class TestEncryptedP2PState(EncryptedP2PState):
     1    def __init__(self):
     2        super().__init__(initiating=True, net='regtest')
     3        self.magic_sent = False
     4
     5    def initiate_v2_handshake(self):
     6        self.privkey_ours, self.ellswift_ours = ellswift_create()
     7        self.sent_garbage = random.randbytes(random.randrange(4096))
     8        # We'll take care to do the handshake manually
     9        return b""
    10
    11class PeerEarlyKey(P2PInterface):
    12    """Custom implementation of P2PInterface which uses modified v2 P2P protocol functions for testing purposes."""
    13    def __init__(self):
    14        super().__init__()
    15        self.v2_state = None
    16
    17    def connection_made(self, transport):
    18        """64 bytes ellswift is sent in 2 parts during `initial_v2_handshake()`"""
    19        self.v2_state = TestEncryptedP2PState()
    20        super().connection_made(transport)
    21
    22    def data_received(self, t):
    23        # check that data can be received on recvbuf only when mismatch from V1_PREFIX happens
    24        assert self.v2_state.magic_sent
    25
    26
    27class P2PEarlyKey(BitcoinTestFramework):
    28    def set_test_params(self):
    29        self.num_nodes = 1
    30        self.extra_args = [["-v2transport=1", "-peertimeout=3"]]
    31
    32    def run_test(self):
    33        self.log.info('Sending ellswift bytes in parts to ensure that response from responder is received only when')
    34        self.log.info('ellswift bytes have a mismatch from the 16 bytes(network magic followed by "version\\x00\\x00\\x00\\x00\\x00")')
    35        node0 = self.nodes[0]
    36        self.log.info('Sending first 4 bytes of ellswift which match network magic')
    37        self.log.info('If a response is received, assertion failure would happen in our custom data_received() function')
    38        peer1 = node0.add_p2p_connection(PeerEarlyKey(), wait_for_verack=False, send_version=False, supports_v2_p2p=True)
    39        # Send only the network magic first
    40        peer1.send_raw_message(b"\xfa\xbf\xb5\xda")
    41        peer1.v2_state.magic_sent = True
    42        self.log.info('Sending remaining ellswift and garbage which are different from V1_PREFIX. Since a response is')
    43        self.log.info('expected now, our custom data_received() function wouldn\'t result in assertion failure')
    44        # Send the remaining bytes, if any bytes were received before this point `data_received` would have resulted in an assertion fail
    45        peer1.send_raw_message(peer1.v2_state.ellswift_ours[4:] + peer1.v2_state.sent_garbage)
    46        peer1.wait_for_disconnect(timeout=5)
    47        self.log.info('successful disconnection when MITM happens in the key exchange phase')
    48
    49if __name__ == '__main__':
    50    P2PEarlyKey().main()
    
  12. theStack approved
  13. theStack commented at 3:56 pm on January 31, 2024: contributor

    Tested ACK 9642aefb81a9c87227eccf9997380024247ed1fc

    Reproduced the issue and verified its fix by adding an artificial delay in initiate_v2_handshake, as suggested in the PR description. (With artificial delays of 3 seconds or more, the test fails again with an IOError('Not connected'), but this is expected as the TestNode closes the connection if there is no activity: ... [InactivityCheck] [net] socket no message in first 3 seconds ...)

  14. DrahtBot requested review from stratospher on Jan 31, 2024
  15. mzumsande commented at 4:07 pm on January 31, 2024: contributor

    I wonder if we will not better off having a simpler initiate_v2_handshake in this case and avoiding sync + async logic. Given we end up having to manually send raw messages and accessing peer1 internal state anyway, I think it would be simpler to send everything manually:

    Sounds reasonable to me, but I’d like to refer that to @stratospher, who I believe has plans to extend / rewrite the test anyway (see the WIP branch linked here), and just fix the CI here straightforwardly.

  16. stratospher commented at 5:25 pm on January 31, 2024: contributor

    tested ACK 9642aef.

    Sounds reasonable to me, but I’d like to refer that to @stratospher, who I believe has plans to extend / rewrite the test anyway (see the WIP branch linked #24748 (comment)), and just fix the CI here straightforwardly.

    yes, will look into it! extending this test for more v2 disconnection scenarios in this WIP branch

  17. DrahtBot removed review request from stratospher on Jan 31, 2024
  18. achow101 commented at 9:30 pm on January 31, 2024: member
    ACK 9642aefb81a9c87227eccf9997380024247ed1fc
  19. achow101 merged this on Jan 31, 2024
  20. achow101 closed this on Jan 31, 2024

  21. mzumsande deleted the branch on Feb 1, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-28 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me