Split CNode::cs_vSend: message processing and message sending #9535

pull TheBlueMatt wants to merge 2 commits into bitcoin:master from TheBlueMatt:2017-01-cs-vsend-split changing 2 files +12 −15
  1. TheBlueMatt commented at 7:51 PM on January 12, 2017: member

    cs_vSend is used for two purposes - to lock the datastructures used to queue messages to place on the wire and to only call SendMessages once at a time per-node. I believe SendMessages used to access some of the vSendMsg stuff, but it doesn't anymore, so these locks do not need to be on the same mutex, and also make deadlocking much more likely.

  2. TheBlueMatt commented at 7:54 PM on January 12, 2017: member

    Note that this commit was pulled out of #9419 as it stands alone and is by itself a big win.

  3. fanquake added the label P2P on Jan 12, 2017
  4. in src/net.cpp:None in 53474fd5a3 outdated
    1882 | @@ -1887,7 +1883,7 @@ void CConnman::ThreadMessageHandler()
    1883 |  
    1884 |              // Send messages
    1885 |              {
    1886 | -                TRY_LOCK(pnode->cs_vSend, lockSend);
    1887 | +                TRY_LOCK(pnode->cs_sendProcessing, lockSend);
    


    theuni commented at 3:59 AM on January 13, 2017:

    No need for the TRY_LOCK here. Death to 'em!


    TheBlueMatt commented at 4:09 AM on January 13, 2017:

    Ok

  5. theuni commented at 4:38 AM on January 13, 2017: member

    utACK 11290734ca7261f732c9f43f152c69de3a42546c, I ran a nearly identical version for over a week with no issues.

    Shameless beg/plea for 0.14. This was originally intended for #9441, but left out because #9419 (which has since been dropped for 0.14) already included it. I understand that we all have deep review queues atm though, not everything can make it.

  6. Split CNode::cs_vSend: message processing and message sending
    cs_vSend is used for two purposes - to lock the datastructures used
    to queue messages to place on the wire and to only call
    SendMessages once at a time per-node. I believe SendMessages used
    to access some of the vSendMsg stuff, but it doesn't anymore, so
    these locks do not need to be on the same mutex, and also make
    deadlocking much more likely.
    d7c58ad514
  7. Make the cs_sendProcessing a LOCK instead of a TRY_LOCK
    Technically cs_sendProcessing is entirely useless now because it
    is only ever taken on the one MessageHandler thread, but because
    there may be multiple of those in the future, it is left in place
    376b3c2c6e
  8. TheBlueMatt force-pushed on Jan 13, 2017
  9. TheBlueMatt commented at 6:35 PM on January 13, 2017: member

    Rebased after #9441 merge.

  10. sipa commented at 3:41 AM on January 14, 2017: member

    utACK 376b3c2c6e329357e4793c1d1b90d1dc0f30fed0. Happy to see TRY_LOCKs go.

  11. instagibbs commented at 9:48 PM on January 16, 2017: member

    lock pleb utACK 376b3c2c6e329357e4793c1d1b90d1dc0f30fed0

  12. morcos commented at 3:55 PM on January 17, 2017: member

    utACK 376b3c2

    similar to @instagibbs, this is relatively new code to me, but it seems like it makes sense

  13. in src/net.cpp:None in 376b3c2c6e
    1277 | -                        RecordBytesSent(nBytes);
    1278 | -                    }
    1279 | +                LOCK(pnode->cs_vSend);
    1280 | +                size_t nBytes = SocketSendData(pnode);
    1281 | +                if (nBytes) {
    1282 | +                    RecordBytesSent(nBytes);
    


    jtimon commented at 6:15 PM on January 17, 2017:

    RecordBytesSent() needs to be called from inside the lock? If so, I assume the same would be needed in CConnman::PushMessage


    theuni commented at 6:25 PM on January 17, 2017:

    @jtimon No, RecordBytesSent does not require the lock. It has its own.


    TheBlueMatt commented at 6:31 PM on January 17, 2017:

    Just double checked, it looks like every variable accessed in RecordBytesSent is only done with cs_totalBytesSent (and even if this weren't true, this change wouldnt be a regression, we still call PushMessage from other functions without cs_vSend).

  14. jtimon commented at 12:59 AM on January 18, 2017: contributor

    I was told by @TheBlueMatt and @theuni:

    <BlueMatt> well [#9535](/bitcoin-bitcoin/9535/) is just a lock split, so just going through each variable that is accessed in one side and making sure its not accessed on the other is a pretty good (though admittedly not 100% sufficient) review
    <BlueMatt> and one one of the sides in this case there are relatively few variables accessed, so its not so hard
    <cfields> BlueMatt: yep, will do after dinner
    <cfields> jtimon: yea, it's sane, just needs manual review of what's accessed under the locks before/after
    <cfields> (note that you can treat it as non-recursive, so very simple)
    

    This is what my "manual review what's accessed under the locks before/after" contains at this point:

    ****** Variables accessed while LOCK(pnode->cs_vSend) or TRY_LOCK(pnode->cs_vSend, lockSend);
    pnode->vSendMsg
    pnode->nSendOffset
    pnode->hSocket
    pnode->nLastSend
    pnode->nSendBytes
    pnode->nSendSize
    pnode->fPauseSend
    CConnman::nSendBufferMaxSize
    pnode->mapSendBytesPerMsgCmd
    pnode->cs_inventory
    ****** Variables accessed while LOCK(pnode->cs_sendProcessing)
    GetNodeSignals().SendMessages(pnode, *this, flagInterruptMsgProc);
    pto->nVersion
    pto->fDisconnect
    pto->GetSendVersion()
    pto->fPingQueued
    pto->nPingNonceSent
    pto->nPingUsecStart
    connman.PushMessage(pto, msgMaker.Make(NetMsgType::PING, nonce));
    pto->nPingNonceSent
    pto->GetId()
    pto->fWhitelisted
    pto->fAddnode
    pto->addr
    AdvertiseLocal(pto
    pto->nNextLocalAddrSend
    pto->nNextAddrSend
    pto->vAddrToSend
    pto->fClient
    pto->fOneShot
    pto->id
    pto->nStartingHeight
    Reindex && !fImporting && !IsInitialBlockDownload()
    GetMainSignals().Broadcast(nTimeBestReceived, &connman)
    pto->cs_inventory
    pto->vBlockHashesToAnnounce
    pto->PushInventory(CInv(MSG_BLOCK, hashToAnnounce));
    pto->vInventoryBlockToSend
    pto->nNextInvSend
    pto->fInbound
    pto->cs_filter
    pto->fRelayTxes
    pto->fSendMempool
    pto->cs_feeFilter
    pto->minFeeFilter
    pto->pfilter
    pto->pfilter->IsRelevantAndUpdate(*txinfo.tx)
    pto->filterInventoryKnown
    pto->timeLastMempoolReq
    GetFetchFlags(pto, pindex->pprev, consensusParams)
    pto->mapAskFor
    pto->nextSendTimeFeeFilter
    

    ...so far the method seems to be working, but would be nice to hear a confirmation that that's what I was told, to confirm that I'm not making any wrong assumption. Only SendMessages() left and already listed functions/methods to review, but it's long, the tempation of doing a reduced version of #4646 always come back whenever asked to review SendMessages()...sorry.

    At this point I doubt I will make my review before 0.14 is forked, but I'm fully motivated to finish it before or after it is merged. @theuni can you detail the note that you can treat it as non-recursive part?

  15. theuni commented at 2:05 AM on January 18, 2017: member

    @jtimon looks good.

    Re #4646, something similar is definitely the plan. See https://github.com/theuni/bitcoin/commit/1a6b10aea129ac363727c2d68fae809c2861e4da. That's definitely the next step, it was just too much of a refactor for 0.14.

    As for the non-recursive part, I simply meant that it should now be quite easy to spot potential deadlocks since cs_vSend is now limited enough in scope that it can't recurse. In other words, the lock never leaves PushMessage/SocketSendData/RecordBytesSent (where RecordBytesSent isn't necessary and could be avoided).

  16. jtimon commented at 1:27 AM on January 19, 2017: contributor

    Alright, going through all pnode->cs_vSend was easy, but after going though all SendMessages (I updated the list in my comment above), looking only at the pto parameter for the first pass (which is the most relevant for this pr) and not being exhaustive about all potential globals and not going recursively though connman.PushMessage() IsInitialBlockDownload(), Broadcast(nTimeBestReceived, &connman), pto->PushInventory(), pto->pfilter->IsRelevantAndUpdate()...the only "conflict" so far is cs_inventory, but it's just a finer smaller lock, so seems fine as well. I'm very sorry about not being able to give a full utACK for this one, I feel my ACK is worth way more than the simple concept ACK I was planning and I thank @TheBlueMatt and @theuni for pushing me to review further when I warned I'm not so familiar with the networking code and its concurrency. Slow review ACK.

  17. laanwj added this to the milestone 0.14.0 on Jan 19, 2017
  18. laanwj merged this on Jan 19, 2017
  19. laanwj closed this on Jan 19, 2017

  20. laanwj referenced this in commit 82274c02ed on Jan 19, 2017
  21. codablock referenced this in commit dd9f6afea1 on Jan 19, 2018
  22. codablock referenced this in commit 3dbe6a2b02 on Jan 20, 2018
  23. codablock referenced this in commit f8b880f4d8 on Jan 21, 2018
  24. andvgal referenced this in commit fabfc487c9 on Jan 6, 2019
  25. CryptoCentric referenced this in commit 0653e88778 on Feb 27, 2019
  26. jnewbery referenced this in commit 1996424bd9 on Jun 15, 2020
  27. jnewbery referenced this in commit 06ebb5deaf on Jun 18, 2020
  28. jnewbery referenced this in commit 8fc03f04a2 on Jun 18, 2020
  29. jnewbery referenced this in commit e602e5518c on Jun 19, 2020
  30. jnewbery referenced this in commit ee1a65a686 on Jun 19, 2020
  31. jnewbery referenced this in commit d1bb7c7c8d on Jul 7, 2020
  32. jnewbery referenced this in commit eed431388a on Jul 8, 2020
  33. jnewbery referenced this in commit 9e384f8cdf on Jul 8, 2020
  34. jnewbery referenced this in commit e49a95e3fa on Jul 8, 2020
  35. jnewbery referenced this in commit 0ab9aeced7 on Jul 10, 2020
  36. jnewbery referenced this in commit 1a1c23f8d4 on Jul 11, 2020
  37. jnewbery referenced this in commit 73fb51a5e5 on Jul 24, 2020
  38. random-zebra referenced this in commit 5fcad0c139 on Aug 2, 2020
  39. Warchant referenced this in commit 3e79e89fb7 on Aug 6, 2020
  40. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-24 15:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me