test: Make the unlikely race in p2p_invalid_messages impossible #27212

pull maflcko wants to merge 1 commits into bitcoin:master from maflcko:2303-test-race-less-👾 changing 1 files +5 −2
  1. maflcko commented at 3:58 PM on March 6, 2023: member

    After add_p2p_connection both sides have the verack processed. However the pong from conn in reply to the ping from the node has not been processed and recorded in totalbytesrecv. Flush the pong from conn by sending a ping from conn.

    This should make the unlikely race impossible.

  2. DrahtBot commented at 3:58 PM on March 6, 2023: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK mzumsande, pinheadmz

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label Tests on Mar 6, 2023
  4. maflcko commented at 4:00 PM on March 6, 2023: member

    Fixes: https://cirrus-ci.com/task/5392601050775552?logs=functional_tests#L275

     node0 2023-03-02T14:54:19.701023Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net_processing.cpp:3169] [ProcessMessage] [net] received: wtxidrelay (0 bytes) peer=0 
     node0 2023-03-02T14:54:19.701159Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net_processing.cpp:3169] [ProcessMessage] [net] received: verack (0 bytes) peer=0 
     node0 2023-03-02T14:54:19.701238Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net.cpp:2816] [PushMessage] [net] sending sendcmpct (9 bytes) peer=0 
     node0 2023-03-02T14:54:19.701450Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net.cpp:2816] [PushMessage] [net] sending ping (8 bytes) peer=0 
     node0 2023-03-02T14:54:19.701694Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net.cpp:2816] [PushMessage] [net] sending getheaders (69 bytes) peer=0 
     node0 2023-03-02T14:54:19.701827Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net_processing.cpp:5428] [SendMessages] [net] initial getheaders (0) to peer=0 (startheight:-1) 
     node0 2023-03-02T14:54:19.702044Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net.cpp:2816] [PushMessage] [net] sending feefilter (8 bytes) peer=0 
     node0 2023-03-02T14:54:19.702233Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net_processing.cpp:3169] [ProcessMessage] [net] received: getaddr (0 bytes) peer=0 
     node0 2023-03-02T14:54:19.702382Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\addrman.cpp:820] [GetAddr_] [addrman] GetAddr returned 0 random addresses 
     node0 2023-03-02T14:54:19.702518Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net_processing.cpp:3169] [ProcessMessage] [net] received: ping (8 bytes) peer=0 
     node0 2023-03-02T14:54:19.702599Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net.cpp:2816] [PushMessage] [net] sending pong (8 bytes) peer=0 
     node0 2023-03-02T14:54:19.706681Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\httpserver.cpp:239] [http_request_cb] [http] Received a POST request for / from 127.0.0.1:61710 
     node0 2023-03-02T14:54:19.706872Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\rpc\request.cpp:179] [parse] [rpc] ThreadRPCServer method=getpeerinfo user=__cookie__ 
     node0 2023-03-02T14:54:19.709880Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\httpserver.cpp:239] [http_request_cb] [http] Received a POST request for / from 127.0.0.1:61711 
     node0 2023-03-02T14:54:19.710091Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\rpc\request.cpp:179] [parse] [rpc] ThreadRPCServer method=getnettotals user=__cookie__ 
     node0 2023-03-02T14:54:19.712147Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\httpserver.cpp:239] [http_request_cb] [http] Received a POST request for / from 127.0.0.1:61712 
     node0 2023-03-02T14:54:19.712372Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\rpc\request.cpp:179] [parse] [rpc] ThreadRPCServer method=getnettotals user=__cookie__ 
     node0 2023-03-02T14:54:19.722676Z [C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\src\net_processing.cpp:3169] [ProcessMessage] [net] received: pong (8 bytes) peer=0 
     test  2023-03-02T14:54:19.773000Z TestFramework (ERROR): Assertion failed 
                                       Traceback (most recent call last):
                                         File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\test_framework\test_framework.py", line 134, in main
                                           self.run_test()
                                         File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\p2p_invalid_messages.py", line 64, in run_test
                                           self.test_buffer()
                                         File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\p2p_invalid_messages.py", line 94, in test_buffer
                                           assert_equal(middle, before + cut_pos)
                                         File "C:\Users\ContainerAdministrator\AppData\Local\Temp\cirrus-ci-build\test\functional\test_framework\util.py", line 56, in assert_equal
                                           raise AssertionError("not(%s)" % " == ".join(str(arg) for arg in (thing1, thing2) + args))
                                       AssertionError: not(283 == 251)
     test  2023-03-02T14:54:19.773000Z TestFramework (DEBUG): Closing down network thread 
    
  5. fanquake requested review from pinheadmz on Mar 8, 2023
  6. in test/functional/p2p_invalid_messages.py:86 in fa4732afd2 outdated
      79 | @@ -80,6 +80,11 @@ def run_test(self):
      80 |      def test_buffer(self):
      81 |          self.log.info("Test message with header split across two buffers is received")
      82 |          conn = self.nodes[0].add_p2p_connection(P2PDataStore())
      83 | +        # After add_p2p_connection both sides have the verack processed.
      84 | +        # However the pong from conn in reply to the ping from the node has not
      85 | +        # been processed and recorded in totalbytesrecv.
      86 | +        # Flush the pong from conn by sending a ping from conn.
    


    pinheadmz commented at 5:03 PM on March 8, 2023:

    Is the flush on the node side? Like the pong has been sent but is still sitting in nodes buffer? I'm just trying to reliably break the test on master to examine the patch


    maflcko commented at 5:11 PM on March 8, 2023:

    Is the flush on the node side? Like the pong has been sent but is still sitting in nodes buffer?

    Correct, it is in the node's receive buffer (from conn).

  7. pinheadmz commented at 5:03 PM on March 8, 2023: member

    concept ACK

    Would it make any more sense to instead add wait_for_verack=True to the initial add_p2p_connection()? Looks like that would also callsync_with_ping() albeit with a longer timeout (default 60s).

  8. maflcko commented at 5:10 PM on March 8, 2023: member

    Would it make any more sense to instead add wait_for_verack=True to the initial add_p2p_connection()?

    Yes, this is the default value already.

  9. fanquake requested review from pinheadmz on Mar 12, 2023
  10. fanquake requested review from mzumsande on Mar 12, 2023
  11. pinheadmz commented at 1:45 PM on March 14, 2023: member

    Yes, this is the default value already.

    Oh right -- why then does a second sync_with_ping() fix the race? Should if wait_for_verack wait for the node's ping before sending its own back?

  12. maflcko commented at 1:57 PM on March 14, 2023: member

    Should if wait_for_verack wait for the node's ping before sending its own back?

    Why? ping is sent after the verack.

  13. in test/functional/p2p_invalid_messages.py:97 in fa4732afd2 outdated
      93 | @@ -89,7 +94,7 @@ def test_buffer(self):
      94 |          # Wait until node has processed the first half of the message
      95 |          self.wait_until(lambda: self.nodes[0].getnettotals()['totalbytesrecv'] != before)
      96 |          middle = self.nodes[0].getnettotals()['totalbytesrecv']
      97 | -        # If this assert fails, we've hit an unlikely race
      98 | +        # If this assert fails, we've hit an unlikely (impossible?) race
    


    mzumsande commented at 9:59 PM on March 16, 2023:

    I'd just remove this line, otherwise it will stay there for a long time. After all we have an explanation for why this has failed that makes sense with the log of the failed run, plus a fix that avoid this.

  14. mzumsande commented at 10:17 PM on March 16, 2023: contributor

    ACK fa4732afd23f67cd1bb45a60a66a7c628b64d560

    Is there a reason for the question marks? To me, this looks like a straightforward fix: the explanation makes total sense (the "received: pong (...)" message occurs late in the log of the failed run), and so does the fix - the python conn sends another ping and waits for an answer, so the node must have also processed the earlier pong from its recv buffer for that happen.

  15. maflcko renamed this:
    test: Make the unlikely race in p2p_invalid_messages impossible?
    test: Make the unlikely race in p2p_invalid_messages impossible
    on Mar 17, 2023
  16. maflcko force-pushed on Mar 17, 2023
  17. test: Make the unlikely race in p2p_invalid_messages impossible fa1eb0ecae
  18. maflcko force-pushed on Mar 17, 2023
  19. maflcko commented at 8:43 AM on March 17, 2023: member

    Thanks! Removed ? and comment completely.

  20. mzumsande commented at 2:35 PM on March 17, 2023: contributor

    ACK fa1eb0ecaef14d428812f956082d29ab134fc728

  21. pinheadmz approved
  22. pinheadmz commented at 2:53 PM on March 17, 2023: member

    ACK fa1eb0ecaef14d428812f956082d29ab134fc728

    Thanks @mzumsande for the explanation, I compared the logs of failed and successful tests and I can see the message order is fixed

    <details><summary>Show Signature</summary>

    -----BEGIN PGP SIGNED MESSAGE-----
    Hash: SHA256
    
    ACK fa1eb0ecaef14d428812f956082d29ab134fc728
    -----BEGIN PGP SIGNATURE-----
    
    iQIzBAEBCAAdFiEE5hdzzW4BBA4vG9eM5+KYS2KJyToFAmQUfpsACgkQ5+KYS2KJ
    yTrJUg//TlRg8JalWfEd+Rb2hklRFFJpE7PcaEajr5in71WlXaCQ9mCRaXHNtkI5
    XepC5Ho7CCNt84FwRPZSr3giAh1W2S/ooJzKVJN2SOea2NdA1YzXJ1lFE1R84ylX
    oADbsSHOVj49BNRORY6PwxTH5WgE2xCCe63h/I9cbazhBMPT7fKlJRYhP6jyEd7c
    Z8dfhB68KB/XM8BEAqOlhsUedS4J41j/Xdaa3f9yYDNPuoHl4zjtPzjO+MMC1Zn8
    /nhWQxVPe5B9QJEWOQLV7RXuE7StLbFm+2geJdZJv/1NYncx8Zl/VeiO1laKM2J3
    nDBPA5pxhRMEi+u906fjU4ideGDC49SoR/k6SfPnbi9ZBx+cOvaBIZTDLQGXyCyU
    UnlARR4XbogJ8Lmn5CAxAi6Csv1zpChYtnSjMhSqe5AEbmbHBH1/NJY/rkweOhbJ
    Gf6+oo0oZme1yhrdZ8STa+kGNZaYVS239fmgOAoOdDWjnDZ/YGX1ms5EzzgqSW1F
    PfNtaIDBWiXrJqHhNBf8NefG/GK4tkU5HlLstkE3RpvWlKBvHs6v/hheQnL0oKgo
    wwsNU9Va9mZ3oxjIvJ/e8VST0a1WW3x5QZAnrNQYNrccySrZGfA35G80rYspaDuY
    OWPFNSM9YtyhGlZii2MlPUG7DUVnh3nQ4bb868N9YQniPajLTA0=
    =WS49
    -----END PGP SIGNATURE-----
    

    pinheadmz's public key is on keybase

    </details>

  23. fanquake merged this on Mar 17, 2023
  24. fanquake closed this on Mar 17, 2023

  25. sidhujag referenced this in commit 1851da8478 on Mar 17, 2023
  26. maflcko deleted the branch on Mar 17, 2023
  27. bitcoin locked this on Mar 16, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-17 03:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me