Tests: p2p_invalid_messages test can fail in runner due to stderr text when socket write buffer flushed #15400

issue JustinTArthur opened this issue on February 13, 2019
  1. JustinTArthur commented at 9:29 PM on February 13, 2019: contributor

    Running p2p_invalid_messages.py on its own shows a passing test, but it can fail when its run by test_runner.py. This is because test_runner checks for an empty stderr and we have some exceptions from the asyncio event loop that aren't caught in our code.

    The test does a send after we know the socket would have been closed so that any kind of RST will be realized and the connection is closed officially on our end; however, if the the extra send we do is done after the kernel has already ~acknowledged~ started the close, the send error is only caught during asyncio's asynchronous write buffer flushing task and the exception is printed to stderr.

    1/1 - p2p_invalid_messages.py failed, Duration: 6 s
    
    stdout:
    2019-02-13T20:21:33.270000Z TestFramework (INFO): Initializing test directory /var/folders/y0/l8_5z4393lg2qlwdjc11n9380000gn/T/test_runner_₿_🏃_20190213_142133/p2p_invalid_messages_0
    2019-02-13T20:21:34.068000Z TestFramework (INFO): Sending a bunch of large, junk messages to test memory exhaustion. May take a bit...
    2019-02-13T20:21:37.903000Z TestFramework (INFO): Waiting for node to drop junk messages.
    2019-02-13T20:21:38.011000Z TestFramework.mininode (WARNING): Connection lost to 127.0.0.1:11000 due to [Errno 41] Protocol wrong type for socket
    2019-02-13T20:21:38.116000Z TestFramework (INFO): Sending a message with incorrect size of 2
    2019-02-13T20:21:38.277000Z TestFramework (INFO): Sending a message with incorrect size of 77
    2019-02-13T20:21:38.433000Z TestFramework (INFO): Sending a message with incorrect size of 78
    2019-02-13T20:21:38.596000Z TestFramework (INFO): Sending a message with incorrect size of 79
    2019-02-13T20:21:38.972000Z TestFramework (INFO): Stopping nodes
    2019-02-13T20:21:39.232000Z TestFramework (INFO): Cleaning up /var/folders/y0/l8_5z4393lg2qlwdjc11n9380000gn/T/test_runner_₿_🏃_20190213_142133/p2p_invalid_messages_0 on exit
    2019-02-13T20:21:39.232000Z TestFramework (INFO): Tests successful
    
    
    stderr:
    Fatal write error on socket transport
    protocol: <test_framework.mininode.P2PDataStore object at 0x102fc8940>
    transport: <_SelectorSocketTransport fd=10 read=polling write=<idle, bufsize=0>>
    Traceback (most recent call last):
      File "/usr/local/Cellar/python/3.7.2_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/asyncio/selector_events.py", line 857, in write
        n = self._sock.send(data)
    OSError: [Errno 41] Protocol wrong type for socket
    

    We can probably check for close prior to the extra send and I think there's a way to register an exception handler to the event loop for errors that happen outside of what your code is awaiting.

  2. laanwj added the label Tests on Feb 13, 2019
  3. laanwj commented at 9:37 PM on February 13, 2019: member

    We can probably check for close prior to the extra send

    Definitely — [Errno 41] Protocol wrong type for socket sounds like a race, not only has the file descriptor been closed but it has been re-used for a different protocol since?

  4. JustinTArthur commented at 9:40 PM on February 13, 2019: contributor

    It could be that or Protocol wrong type for socket could just be darwin kernel-specific behavior when writing to the reset socket; I'm not sure off-hand.

  5. JustinTArthur renamed this:
    p2p_invalid_messages test can fail in runner due to stderr text when socket write buffer flushed
    Tests: p2p_invalid_messages test can fail in runner due to stderr text when socket write buffer flushed
    on Feb 13, 2019
  6. JustinTArthur commented at 10:02 PM on February 13, 2019: contributor

    According to a blog post by [@erickt](/bitcoin-bitcoin/contributor/erickt/), we might see the protocol error in macOS/XNU if we try to send while the kernel is tearing down the socket. So yea, a race of sorts.

  7. MarcoFalke added the label macOS on Feb 13, 2019
  8. IlyasRidhuan commented at 9:24 AM on February 15, 2019: none

    Running the test directly gives the same error for me as running with test_runner.py.

    If it's helpful at all, it seems that if you comment out either test 0 or test 1 (the msg_at_size and msg_over_size tests) the tests consistently pass even using test_runner.py. But when run together, they error out.

    Edit: It looks like if you swap the order of tests 0 & 1 (and add some lines to rebuild the connections after the induced disconnect) the test now run without error. There seems to be something about sending the oversized message after the at-sized one.

  9. MarcoFalke referenced this in commit 0626b8cbdf on Jul 24, 2019
  10. MarcoFalke commented at 11:22 AM on July 24, 2019: member

    Let me know if this is still an issue with current master

  11. MarcoFalke closed this on Jul 24, 2019

  12. MarcoFalke reopened this on Oct 18, 2019

  13. MarcoFalke commented at 12:56 PM on October 18, 2019: member

    I could only see this on travis. With the gitian binaries, or with our new native build: https://travis-ci.org/bitcoin/bitcoin/jobs/599413856 or https://travis-ci.org/bitcoin/bitcoin/jobs/599229377#L6967

    Anyone also seeing this locally?

  14. MarcoFalke commented at 1:26 PM on October 18, 2019: member

    If this is an issue on travis only, we should disable the functional tests there. If this can be reproduced locally, it should be fixed.

  15. laanwj referenced this in commit 4af0447169 on Oct 24, 2019
  16. MarcoFalke closed this on Apr 27, 2020

  17. monstrobishi referenced this in commit 3f61c97d44 on Sep 6, 2020
  18. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-14 21:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me