net: Refactor ThreadSocketHandler #14147

pstratem commented at 3:13 PM on September 4, 2018: contributor

Working towards using poll() on unix like systems.

A number of small changes designed to separate the actual socket handling from the rest of the logic in ThreadSocketHandler.

gmaxwell commented at 5:01 PM on September 4, 2018: contributor

utACK. Aside, this was much more easily reviewed with ?w=1

MarcoFalke added the label P2P on Sep 4, 2018

pstratem force-pushed on Sep 4, 2018

MarcoFalke added the label Refactoring on Sep 4, 2018

MarcoFalke renamed this:
~~[net] Cleanup ThreadSocketHandler~~
[net] Refactor ThreadSocketHandler
on Sep 4, 2018

MarcoFalke commented at 6:24 PM on September 4, 2018: member

Tagged with "Refactor", as this is supposed to not change behavior?

pstratem force-pushed on Sep 4, 2018

pstratem renamed this:
~~[net] Refactor ThreadSocketHandler~~
[Net] Refactor ThreadSocketHandler
on Sep 4, 2018

pstratem commented at 7:31 PM on September 4, 2018: contributor

@MarcoFalke The only change to the logic is grouping the InactivityChecks in a single function outside of the reference counting guards, that logic sets fDisconnect flags, which wont do anything until the next loop in the thread anyways.

fanquake renamed this:
~~[Net] Refactor ThreadSocketHandler~~
net: Refactor ThreadSocketHandler
on Sep 6, 2018

in src/net.cpp:1221 in 431cbca5ed outdated

1274 | -            nPrevNodeCount = vNodesSize;
1275 | -            if(clientInterface)
1276 | -                clientInterface->NotifyNumConnectionsChanged(vNodesSize);
1277 | +    }
1278 | +    if(vNodesSize != nPrevNodeCount) {
1279 | +        nPrevNodeCount = vNodesSize;

practicalswift commented at 3:00 PM on September 6, 2018:

This assignment is redundant, right? :-)

pstratem commented at 4:28 PM on September 6, 2018:

It is indeed.

pstratem force-pushed on Sep 6, 2018

gmaxwell commented at 7:08 PM on September 6, 2018: contributor

re-utACK

in src/net.cpp:1222 in d95e66e939 outdated

1224 | -    if(vNodesSize != nPrevNodeCount) {
1225 | -        nPrevNodeCount = vNodesSize;
1226 | -        if(clientInterface)
1227 | -            clientInterface->NotifyNumConnectionsChanged(vNodesSize);
1228 | +    if(clientInterface && vNodesSize != nPrevNodeCount) {
1229 | +        clientInterface->NotifyNumConnectionsChanged(vNodesSize);

promag commented at 8:12 PM on September 6, 2018:

Commit "Improve lock handling in DisconnectNodes."

Is it worth preventing notifying the same connection count?

pstratem commented at 11:36 PM on September 6, 2018:

Yes, it triggers a notification that updates the GUI and would happen every 50ms.

in src/net.cpp:1220 in d95e66e939 outdated

1223 | -    }
1224 | -    if(vNodesSize != nPrevNodeCount) {
1225 | -        nPrevNodeCount = vNodesSize;
1226 | -        if(clientInterface)
1227 | -            clientInterface->NotifyNumConnectionsChanged(vNodesSize);
1228 | +    if(clientInterface && vNodesSize != nPrevNodeCount) {

promag commented at 8:12 PM on September 6, 2018:

Commit "Improve lock handling in DisconnectNodes."

nit, space after if.

pstratem commented at 11:43 PM on September 6, 2018:

fixed

in src/net.cpp:1228 in 1518622660 outdated

1221 | @@ -1222,6 +1222,13 @@ void CConnman::DisconnectNodes() {
1222 |      }
1223 |  }
1224 |  
1225 | +void CConnman::InactivityChecks() {
1226 | +    LOCK(cs_vNodes);
1227 | +    for (CNode* pnode : vNodes) {
1228 | +        InactivityCheck(pnode);

promag commented at 8:19 PM on September 6, 2018:

Commit "Move InactivityChecks logic to private method."

Just noting that with this change InactivityCheck is now called with cs_vNodes locked.

pstratem commented at 11:44 PM on September 6, 2018:

The reasoning here is that the contents of InactivityCheck are virtually instant.

If this causes any issues in the future it can be trivially switched back to using reference counting.

in src/net.cpp:1221 in 2c325bd625 outdated

1232 | -void CConnman::InactivityCheck(CNode *pnode) {
1233 |      int64_t nMicroTime = GetTimeMicros();
1234 |      int64_t nTime = nMicroTime / 1000000;
1235 | -    if (nTime - pnode->nTimeConnected > 60)
1236 | +
1237 |      {

promag commented at 8:22 PM on September 6, 2018:

Commit "Merge InactivityCheck into InactivityChecks."

Could remove this block?

pstratem commented at 11:55 PM on September 6, 2018:

Not sure what you mean here

promag commented at 12:10 AM on September 7, 2018:

I mean like this:

void CConnman::InactivityChecks() {
    LOCK(cs_vNodes);
    int64_t nMicroTime = GetTimeMicros();
    int64_t nTime = nMicroTime / 1000000;
    for (CNode* pnode : vNodes) {
        if (nTime - pnode->nTimeConnected > 60)
        ...

pstratem commented at 12:57 AM on September 7, 2018:

yeah good point, the lock could take a while

in src/net.cpp:1226 in 2c325bd625 outdated

1228 | -        InactivityCheck(pnode);
1229 | -    }
1230 | -}
1231 | -
1232 | -void CConnman::InactivityCheck(CNode *pnode) {
1233 |      int64_t nMicroTime = GetTimeMicros();

promag commented at 8:23 PM on September 6, 2018:

Commit "Merge InactivityCheck into InactivityChecks."

Should be after LOCK(cs_vNodes)?

pstratem commented at 11:55 PM on September 6, 2018:

Not sure what you mean here

promag commented at 12:05 AM on September 7, 2018:

Move the call GetTimeMicros after the lock?

pstratem commented at 12:30 AM on September 7, 2018:

Ah I see

promag commented at 8:26 PM on September 6, 2018: member

Concept ACK.

pstratem force-pushed on Sep 7, 2018

MarcoFalke commented at 5:40 PM on September 7, 2018: member

I don't believe this is correct. NotifyNumConnectionsChanged will now only be called when nodes are disconnected, and no longer when nodes are connected. Also moving the notify into the DisconnectNodes member function is confusing. I'd prefer to move it into a static void NotifyNodeCount(CConman* cm, unsigned& prev_node_count) {...}. Note the static, to prevent having it to put into the header file and bloat it unnecessarily. (Imo you could do this as well for the other functions)

Let me know if I am mistaken in some way.

Also, it would help to see the code that builds on top of this refactor. This way it is easier to see why the refactor is required and why it was done in exactly this way.

in src/net.cpp:1232 in 718843aed5 outdated

1283 | +void CConnman::InactivityChecks() {
1284 | +    int64_t nMicroTime, nTime;
1285 | +
1286 | +    {
1287 | +        LOCK(cs_vNodes);
1288 | +        nMicroTime = GetTimeMicros();

MarcoFalke commented at 5:46 PM on September 7, 2018:

in commit e92a917117 Reduce calls to GetMicroTime in InactivityCheck.

newly introduced variables should always be snake_case, e.g. time_micro = ... .

pstratem force-pushed on Sep 9, 2018

in src/net.cpp:1272 in 7c203fb5ab outdated

1287 | +{
1288 | +    unsigned int nPrevNodeCount = 0;
1289 | +    while (!interruptNet)
1290 | +    {
1291 | +        //
1292 | +        // Disconnect nodes

ryanofsky commented at 6:21 PM on September 10, 2018:

In commit "Move DisconnectNodes logic to private method." (7c203fb5ab2558cd6e306d85a62c8bbe159a7c02)

Should delete this comment I think. It just takes up space and conveys no information.

in src/net.cpp:1156 in 7c203fb5ab outdated

1152 | @@ -1153,82 +1153,86 @@ void CConnman::AcceptConnection(const ListenSocket& hListenSocket) {
1153 |      }
1154 |  }
1155 |  
1156 | -void CConnman::ThreadSocketHandler()
1157 | -{
1158 | -    unsigned int nPrevNodeCount = 0;
1159 | -    while (!interruptNet)
1160 | +void CConnman::DisconnectNodes(unsigned int &nPrevNodeCount) {

ryanofsky commented at 6:23 PM on September 10, 2018:

In commit "Move DisconnectNodes logic to private method." (7c203fb5ab2558cd6e306d85a62c8bbe159a7c02)

Opening brace should go on next line.

in src/net.cpp:1195 in 57bd308a09 outdated

1189 | @@ -1190,6 +1190,8 @@ void CConnman::DisconnectNodes() {
1190 |                  vNodesDisconnected.push_back(pnode);
1191 |              }
1192 |          }
1193 | +
1194 | +        vNodesSize = vNodes.size();

ryanofsky commented at 6:34 PM on September 10, 2018:

In commit "Improve lock handling in DisconnectNodes." (57bd308a09556960e8fab1cd44f61ef519058a59)

I would combine this commit with the previous commit "Remove nPrevNodeCount parameter" because changes in that commit like moving the vNodesSize declaration further away from its assignment, and updating the nPrevNodeCount value before throwing it away really don't make sense standalone.

ryanofsky commented at 7:11 PM on September 10, 2018: contributor

Reviewed this, but the bug in commit "Remove nPrevNodeCount parameter from DisconnectNodes." (5878abfdd3fb2ff7bb6d20f16ec16b70e27ead92) which Marco pointed out needs to be fixed, and will change this PR quite a bit. Otherwise, though, the refactorings are nice.

Also, it would help to see the code that builds on top of this refactor. This way it is easier to see why the refactor is required and why it was done in exactly this way.

Doesn't use this refactor, but https://github.com/TheBlueMatt/bitcoin/commit/10b8cafcf500459c37cd228fd82220623ca17f81 from Matt shows generally what is involved in switching from select() to poll(). The changes in this PR probably just help by breaking up the large ThreadSocketHandler function more.

pstratem force-pushed on Sep 10, 2018

pstratem commented at 8:17 PM on September 10, 2018: contributor

@MarcoFalke you're definitely right, that is a bug

i'm not so sure about making these static functions though, they extensively use CConnman variables and are only called by CConnman methods

MarcoFalke commented at 7:10 PM on September 11, 2018: member

Yeah, feel free to ignore my comment about the static vs private

pstratem force-pushed on Sep 11, 2018

pstratem commented at 7:15 PM on September 11, 2018: contributor

@MarcoFalke ok fixed that issue and the variable naming

MarcoFalke commented at 7:36 PM on September 11, 2018: member

utACK 8491dfbadceb8052f852ebf0020a622194db525f

DrahtBot commented at 8:52 PM on September 12, 2018: contributor

Note to reviewers: This pull request conflicts with the following ones:

#14335 (net: refactor: cleanup ThreadSocketHandler by pstratem)
#14221 (wip: net: Implement poll by pstratem)
#14046 (net: Refactor message parsing (CNetMessage), adds flexibility by jonasschnelli)
#14032 (Add p2p layer encryption with ECDH/ChaCha20Poly1305 by jonasschnelli)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot cross-referenced this on Sep 12, 2018 from issue net: Refactor message parsing (CNetMessage), adds flexibility by jonasschnelli

pstratem cross-referenced this on Sep 14, 2018 from issue wip: net: Implement poll by pstratem

in src/net.cpp:1230 in 2d5c345507 outdated

1266 | -            vNodesSize = vNodes.size();
1267 | -        }
1268 | -        if(vNodesSize != nPrevNodeCount) {
1269 | -            nPrevNodeCount = vNodesSize;
1270 | -            if(clientInterface)
1271 | -                clientInterface->NotifyNumConnectionsChanged(vNodesSize);

ryanofsky commented at 7:24 PM on September 14, 2018:

In commit "Move DisconnectNodes logic to private method." (2d5c345507208d215c0621180305a9f1d363012a)

It seems like moving NotifyNumConnectionsChanged call from right after the disconnect to later in the loop below after the select call changes behavior. So instead of ui being updated immediately after a disconnection, it will only be updated after a disconnect followed by a receive on another socket? Is this correct / intentional / good?

pstratem commented at 9:13 PM on September 14, 2018:

Select timeout is 50ms, doesn't seem to particularly matter.

pstratem commented at 9:14 PM on September 14, 2018:

Can certainly change it back though, it really shouldn't matter.

in src/net.cpp:1258 in e817ce3090 outdated

1254 | @@ -1255,9 +1255,22 @@ void CConnman::InactivityChecks()
1255 |      }
1256 |  }
1257 |  
1258 | +void CConnman::NotifyNumConnectionsChanged()

ryanofsky commented at 7:50 PM on September 14, 2018:

In commit "Move NotifyNumConnectionsChanged logic to private method." (e817ce3090fa71c61170216090685ee142d0f1d5)

It would be nice if you combined this commit with commit "Move DisconnectNodes logic to private method" (2d5c345507208d215c0621180305a9f1d363012a), or moved this commit before it, so this block of code only moves once instead of twice inside the same PR.

pstratem commented at 2:13 AM on September 16, 2018:

I've restructured the first commit so that the logic isn't moved the first time.

in src/net.cpp:1263 in 8491dfbadc outdated

1259 | @@ -1260,6 +1260,7 @@ void CConnman::ThreadSocketHandler()
1260 |      unsigned int nPrevNodeCount = 0;
1261 |      while (!interruptNet)
1262 |      {
1263 | +        InactivityChecks();

ryanofsky commented at 7:55 PM on September 14, 2018:

In commit "Run InactivityChecks before other socket handling logic." (8491dfbadceb8052f852ebf0020a622194db525f)

It seems like this commit should have no effect, but if it has no effect, then it's not clear why it would be desirable? I think it would be helpful if the commit messages in this PR said explicitly whether the commit is or isn't intending to change behavior.

pstratem commented at 2:14 AM on September 16, 2018:

It doesn't really matter, but running it earlier might allow for marking nodes disconnected before the socket handling logic runs.

Move DisconnectNodes logic to private method. f3a4c7073c

Move InactivityCheck to private method. fcc8dcb63c

Reduce calls to GetMicroTime in InactivityCheck. 18fb57ef28

Move InactivityChecks logic to private method. d43ea87de6

Merge InactivityCheck into InactivityChecks. f4c0a4b12e

Move NotifyNumConnectionsChanged logic to private method. 9579a87d6c

Move SocketHandler logic to private method. 672543085a

pstratem force-pushed on Sep 16, 2018

DrahtBot cross-referenced this on Sep 18, 2018 from issue Add p2p layer encryption with ECDH/ChaCha20Poly1305 by jonasschnelli

ryanofsky commented at 6:00 PM on September 21, 2018: contributor

utACK 672543085ac4973f034ad30f897230581a2af8d2. Only changes since last review are dropping the "Run InactivityChecks before other socket handling logic" commit and tweaking order of changes to avoid moving the same code twice.

pstratem cross-referenced this on Sep 26, 2018 from issue net: refactor: cleanup ThreadSocketHandler by pstratem

MarcoFalke closed this on Sep 26, 2018

laanwj referenced this in commit 23419e4c49 on Oct 16, 2018

bitcoin locked this on Sep 8, 2021