Several tests call disconnect_nodes() on each node-pair in rapid succession, resulting in a race condition if a node disconnects a peer in-between the calculation of the nodeid's to disconnect and the invocation of the disconnectnode rpc call. Handle this.
[qa] Handle disconnect_node race #13201
pull sdaftuar wants to merge 1 commits into bitcoin:master from sdaftuar:2018-05-rpc-disconnect-race changing 1 files +8 −1-
sdaftuar commented at 5:23 PM on May 9, 2018: member
-
09c6699900
[qa] Handle disconnect_node race
Several tests call disconnect_nodes() on each node-pair in rapid succession, resulting in a race condition if a node disconnects a peer in-between the calculation of the nodeid's to disconnect and the invocation of the disconnectnode rpc call. Handle this.
-
sdaftuar commented at 5:28 PM on May 9, 2018: member
I believe I encountered the race condition I describe in the OP during a run of the
p2p_node_network_limited.pytest. An exception was raised indisconnect_all(), where the test disconnects each pair of peers. Here's the relevant test log snippet:node0 2018-05-09T16:27:51.966761Z Received a POST request for / from 127.0.0.1:57133 node0 2018-05-09T16:27:51.966844Z ThreadRPCServer method=getpeerinfo user=__cookie__ node0 2018-05-09T16:27:51.968238Z Received a POST request for / from 127.0.0.1:57133 node0 2018-05-09T16:27:51.973861Z ThreadRPCServer method=disconnectnode user=__cookie__ node0 2018-05-09T16:27:51.980630Z Received a POST request for / from 127.0.0.1:57133 node0 2018-05-09T16:27:51.986890Z ThreadRPCServer method=disconnectnode user=__cookie__ node0 2018-05-09T16:27:51.993866Z Received a POST request for / from 127.0.0.1:57133 node0 2018-05-09T16:27:51.998753Z ThreadRPCServer method=getpeerinfo user=__cookie__ node0 2018-05-09T16:27:51.999412Z disconnecting peer=0 node0 2018-05-09T16:27:52.007234Z disconnecting peer=1 node1 2018-05-09T16:27:52.007299Z socket closed node0 2018-05-09T16:27:52.007387Z Cleared nodestate for peer=0 node1 2018-05-09T16:27:52.008380Z Received a POST request for / from 127.0.0.1:57134 node0 2018-05-09T16:27:52.012901Z Cleared nodestate for peer=1 node1 2018-05-09T16:27:52.015917Z disconnecting peer=0 node1 2018-05-09T16:27:52.022234Z ThreadRPCServer method=getpeerinfo user=__cookie__ node1 2018-05-09T16:27:52.022266Z Cleared nodestate for peer=0 node1 2018-05-09T16:27:52.022394Z socket closed node1 2018-05-09T16:27:52.022435Z disconnecting peer=1 node1 2018-05-09T16:27:52.022482Z Cleared nodestate for peer=1 node1 2018-05-09T16:27:52.023900Z Received a POST request for / from 127.0.0.1:57134 node1 2018-05-09T16:27:52.028597Z ThreadRPCServer method=disconnectnode user=__cookie__ test 2018-05-09T16:27:52.034000Z TestFramework (ERROR): JSONRPC errorSo we're telling node0 to disconnect from node1 first, and then we're telling node1 to disconnect from node0. Note that the calculation of which nodes to disconnect node1 from (which happens in the node1.getpeerinfo() call) is before node1 disconnects node0, but we actual call node1.disconnectnode() after that. @jnewbery Does this look right to you?
- laanwj added the label Tests on May 9, 2018
-
jnewbery commented at 8:10 PM on May 9, 2018: member
utACK 09c6699900da962c5e35eb82e358240a4099838f.
Seems reasonable - the only way we can receive a
RPC_CLIENT_NODE_NOT_CONNECTEDis if the requested node to disconnect is not found in conman'svNodes. In that case it's fine to just continue, since we're already disconnected by definition. -
laanwj commented at 7:17 AM on May 10, 2018: member
Making disconnection idempotent seems a good idea. utACK 09c6699900da962c5e35eb82e358240a4099838f
- MarcoFalke added this to the milestone 0.16.1 on May 10, 2018
- MarcoFalke added the label Needs backport on May 10, 2018
-
MarcoFalke commented at 2:30 PM on May 10, 2018: member
utACK 09c6699900da962c5e35eb82e358240a4099838f. Nice fixup, that should be backported
- MarcoFalke merged this on May 10, 2018
- MarcoFalke closed this on May 10, 2018
- MarcoFalke referenced this in commit f3e747ee77 on May 10, 2018
- fanquake referenced this in commit 9425d8d5d4 on May 17, 2018
- fanquake removed the label Needs backport on May 17, 2018
- fanquake referenced this in commit b8aacd660e on May 17, 2018
- laanwj referenced this in commit 50b2c9e0df on May 24, 2018
- HashUnlimited referenced this in commit 3b79d2dd0c on Jun 29, 2018
- ccebrecos referenced this in commit 827f819bdc on Sep 14, 2018
- jasonbcox referenced this in commit 15971bd1f2 on Sep 27, 2019
- PastaPastaPasta referenced this in commit 9d269073de on Apr 3, 2020
- ckti referenced this in commit 792776fc78 on Mar 28, 2021
- DrahtBot locked this on Sep 8, 2021
Milestone
0.16.1