tests: race condition in stop_nodes #10237

laanwj opened this issue on April 20, 2017
  1. laanwj commented at 8:57 AM on April 20, 2017: member

    This seems to frequently happen while running the tests on OpenBSD 6.1:

    stdout:
    2017-04-20 07:43:54.548000 TestFramework (INFO): Initializing test directory /tmp/test9_kc23yk/582
    2017-04-20 07:44:01.068000 TestFramework (INFO): Mining blocks...
    2017-04-20 07:44:08.886000 TestFramework (INFO): Running tests
    2017-04-20 07:44:14.205000 TestFramework (INFO): Success
    2017-04-20 07:44:14.205000 TestFramework (INFO): Stopping nodes
    
    stderr:
    Traceback (most recent call last):
      File "/home/user/bitcoin/test/functional/test_framework/authproxy.py", line 125, in _request
        return self._get_response()
      File "/home/user/bitcoin/test/functional/test_framework/authproxy.py", line 167, in _get_response
        http_response = self.__conn.getresponse()
      File "/usr/local/lib/python3.5/http/client.py", line 1197, in getresponse
        response.begin()
      File "/usr/local/lib/python3.5/http/client.py", line 297, in begin
        version, status, reason = self._read_status()
      File "/usr/local/lib/python3.5/http/client.py", line 266, in _read_status
        raise RemoteDisconnected("Remote end closed connection without"
    http.client.RemoteDisconnected: Remote end closed connection without response
    
    During handling of the above exception, another exception occurred:
    
    Traceback (most recent call last):
      File "/home/user/bitcoin/test/functional/bumpfee.py", line 304, in <module>
        BumpFeeTest().main()
      File "/home/user/bitcoin/test/functional/test_framework/test_framework.py", line 166, in main
        stop_nodes(self.nodes)
      File "/home/user/bitcoin/test/functional/test_framework/util.py", line 381, in stop_nodes
        stop_node(node, i)
      File "/home/user/bitcoin/test/functional/test_framework/util.py", line 372, in stop_node
        node.stop()
      File "/home/user/bitcoin/test/functional/test_framework/coverage.py", line 46, in __call__
        return_val = self.auth_service_proxy_instance.__call__(*args, **kwargs)
      File "/home/user/bitcoin/test/functional/test_framework/authproxy.py", line 151, in __call__
        response = self._request('POST', self.__url.path, postdata.encode('utf-8'))
      File "/home/user/bitcoin/test/functional/test_framework/authproxy.py", line 129, in _request
        self.__conn.request(method, path, postdata, headers)
      File "/usr/local/lib/python3.5/http/client.py", line 1106, in request
        self._send_request(method, url, body, headers)
      File "/usr/local/lib/python3.5/http/client.py", line 1151, in _send_request
        self.endheaders(body)
      File "/usr/local/lib/python3.5/http/client.py", line 1102, in endheaders
        self._send_output(message_body)
      File "/usr/local/lib/python3.5/http/client.py", line 934, in _send_output
        self.send(msg)
      File "/usr/local/lib/python3.5/http/client.py", line 877, in send
        self.connect()
      File "/usr/local/lib/python3.5/http/client.py", line 849, in connect
        (self.host,self.port), self.timeout, self.source_address)
      File "/usr/local/lib/python3.5/socket.py", line 711, in create_connection
        raise err
      File "/usr/local/lib/python3.5/socket.py", line 702, in create_connection
        sock.connect(sa)
    ConnectionRefusedError: [Errno 61] Connection refused
    

    My guess of what happens is:

    • .stop() is called
    • The node exits before it is able to send back the response to stop(). This is possible by design, to prevent the HTTP server from delaying shutdown indefinitely; see this comment: https://github.com/bitcoin/bitcoin/blob/master/src/httpserver.cpp#L495
    • The Python code interprets this as a fatal error and does not clean up properly: it doesn't shut down the other bitcoinds that are still running.

    A possible way to resolve this would be to catch errors while calling stop() and continue stopping the other bitcoinds. Afterwards, check whether any child processes are still alive, and throw an error ("stuck in shutdown") only if so.
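
The approach proposed above could look roughly like the following. This is a minimal sketch, not the test framework's actual code: the helper name `stop_nodes_tolerant`, its parameters, and the assumption that each node has a matching `subprocess.Popen` object are all hypothetical.

```python
import http.client
import subprocess


def stop_nodes_tolerant(nodes, processes, timeout=60):
    """Hypothetical helper (not part of the test framework).

    nodes:     RPC proxy objects exposing a .stop() method
    processes: the matching subprocess.Popen objects, in the same order
    """
    for node in nodes:
        try:
            node.stop()
        except (http.client.HTTPException, ConnectionError):
            # The node may exit before answering the stop RPC
            # (RemoteDisconnected, ConnectionRefusedError). That alone is
            # not treated as fatal; keep stopping the remaining nodes.
            pass

    # Only now decide whether anything went wrong: a node is a problem
    # only if its process is still alive after the timeout.
    stuck = []
    for i, proc in enumerate(processes):
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            stuck.append(i)
    if stuck:
        raise RuntimeError("stuck in shutdown: node(s) %s" % stuck)
```

Catching `http.client.HTTPException` and `ConnectionError` covers both exceptions in the traceback above, since `RemoteDisconnected` and `ConnectionRefusedError` are subclasses of them. Other exception types would still propagate.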

  2. laanwj added the label Tests on Apr 20, 2017
  3. laanwj added the label Linux/Unix on Apr 20, 2017
  4. laanwj commented at 3:01 AM on November 8, 2017: member

    Haven't seen this for a long time, closing.

  5. laanwj closed this on Nov 8, 2017

  6. MarcoFalke locked this on Sep 8, 2021