Currently, functional tests may fail intermittently due to missing synchronization after a node's connection socket is closed gracefully: if node A is connected to node B and node B closes the connection, the test must wait until node A has also registered the disconnect before continuing. Otherwise, a subsequent re-connection may fail while the stale connection is still alive.
This can be reproduced locally via something like:
diff --git a/src/net.cpp b/src/net.cpp
index 6b79e913e8..32bd061500 100644
--- a/src/net.cpp
+++ b/src/net.cpp
@@ -2200,2 +2200,3 @@ void CConnman::SocketHandlerConnected(const std::vector<CNode*>& nodes,
 }
+ UninterruptibleSleep(599ms);
 pnode->CloseSocketDisconnect();
With this diff, the tests should fail on master, and pass after this fix.
The fix involves simply calling self.wait_until(lambda: len(node_a.getpeerinfo()) == 0) after each disconnect, and before each re-connection.
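To illustrate why polling helps, here is a minimal standalone sketch of the pattern. It does not use the real test framework: `wait_until` below is a simplified, hypothetical re-implementation, and `FakeNode` with its `close_delay` parameter is an invented stand-in for a node that only drops its peer some time after the remote side closed the socket.

```python
import time


def wait_until(predicate, *, timeout=5.0, poll_interval=0.05):
    """Poll predicate until it returns True, or fail after timeout seconds.

    Simplified stand-in for the framework helper, for illustration only.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if predicate():
            return
        time.sleep(poll_interval)
    raise AssertionError(f"predicate not true after {timeout} seconds")


class FakeNode:
    """Hypothetical node that keeps reporting a stale peer for close_delay seconds."""

    def __init__(self, close_delay):
        self._peer_gone_at = time.monotonic() + close_delay

    def getpeerinfo(self):
        # The stale peer stays visible until the simulated socket close is processed.
        if time.monotonic() >= self._peer_gone_at:
            return []
        return [{"id": 0}]


node_a = FakeNode(close_delay=0.2)
peers_before = len(node_a.getpeerinfo())

# Without this wait, a re-connection attempted here could race with the stale peer.
wait_until(lambda: len(node_a.getpeerinfo()) == 0)
peers_after = len(node_a.getpeerinfo())
```

Polling the peer list rather than sleeping for a fixed time keeps the test fast in the common case while still tolerating a slow disconnect, such as the one simulated by the diff above.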