P2P: parse network datastream into header/data components in socket thread #2016

pull jgarzik wants to merge 1 commits into bitcoin:master from jgarzik:rx-no-buffer changing 3 files +176 −55
  1. jgarzik commented at 10:01 PM on November 15, 2012: contributor

    Replaces CNode::vRecv buffer with a vector of CNetMessage's. This simplifies ProcessMessages() and eliminates several redundant data copies.

    Overview:

    • socket thread now parses incoming message datastream into header/data components, as encapsulated by CNetMessage
    • socket thread adds each CNetMessage to a vector inside CNode
    • message thread (ProcessMessages) iterates through CNode's CNetMessage vector

    Message parsing is made more strict:

    • Socket is disconnected, if message larger than MAX_SIZE or if CMessageHeader deserialization fails (latter is impossible?). Previously, code would simply eat garbage data all day long.
    • Socket is disconnected, if we fail to find pchMessageStart. We do not search through garbage, to find pchMessageStart. Each message must begin precisely after the last message ends.

    ProcessMessages() always processes a complete message, and is more efficient:

    • buffer is always precisely sized, using CDataStream::resize(), rather than progressively sized in 64k chunks. More efficient for large messages like "block".
    • whole-buffer memory copy eliminated (vRecv -> vMsg)
    • other buffer-shifting memory copies eliminated (vRecv.insert, vRecv.erase)
  2. P2P: parse network datastream into header/data components in socket thread
    Replaces CNode::vRecv buffer with a vector of CNetMessage's.  This simplifies
    ProcessMessages() and eliminates several redundant data copies.
    
    Overview:
    
    * socket thread now parses incoming message datastream into
      header/data components, as encapsulated by CNetMessage
    * socket thread adds each CNetMessage to a vector inside CNode
    * message thread (ProcessMessages) iterates through CNode's CNetMessage vector
    
    Message parsing is made more strict:
    
    * Socket is disconnected, if message larger than MAX_SIZE
      or if CMessageHeader deserialization fails (latter is impossible?).
      Previously, code would simply eat garbage data all day long.
    * Socket is disconnected, if we fail to find pchMessageStart.
      We do not search through garbage, to find pchMessageStart.  Each
      message must begin precisely after the last message ends.
    
    ProcessMessages() always processes a complete message, and is more efficient:
    
    * buffer is always precisely sized, using CDataStream::resize(),
      rather than progressively sized in 64k chunks.  More efficient
      for large messages like "block".
    * whole-buffer memory copy eliminated (vRecv -> vMsg)
    * other buffer-shifting memory copies eliminated (vRecv.insert, vRecv.erase)
    8af12a9f5a
  3. in src/net.cpp:None in 67c2e23aba outdated
     635 | +        // get current incomplete message, or create a new one
     636 | +        if (vRecvMsg.size() == 0 ||
     637 | +            vRecvMsg[vRecvMsg.size() - 1].complete())
     638 | +            vRecvMsg.push_back(CNetMessage(SER_NETWORK, nRecvVersion));
     639 | +
     640 | +        CNetMessage& msg = vRecvMsg[vRecvMsg.size() - 1];
    


    sipa commented at 10:41 PM on November 15, 2012:

    vRecvMsg.back();

  4. jgarzik commented at 12:42 AM on November 16, 2012: contributor

    @sipa: ITYM back(). Updated.

  5. in src/net.h:None in 8af12a9f5a
     298 | +    // requires LOCK(cs_vRecvMsg)
     299 | +    void SetRecvVersion(int nVersionIn)
     300 | +    {
     301 | +        nRecvVersion = nVersionIn;
     302 | +        for (unsigned int i = 0; i < vRecvMsg.size(); i++)
     303 | +            vRecvMsg[i].SetVersion(nVersionIn);
    


    sipa commented at 6:56 PM on November 18, 2012:

    I'm not sure this is right. When the receive version is set, it is only applied to new messages. In practice that probably doesn't mean anything, as the version/verack message order is quite strict.

  6. sipa commented at 8:01 PM on November 18, 2012: member

    I like the general design, but this pull mixes an client-side optimization with changing the network protocol policy.

    I'm not against making it more strict and not trying to resync after a partial message or garbage data, but maybe that needs some discussion at least.

  7. mikehearn commented at 12:12 PM on November 21, 2012: contributor

    When I didn't have resync after garbage data in bitcoinj I did see failures due to it, though it was long ago.

    BTW, is it possible now to send huge numbers of messages and cause OOM conditions? Previously if you did that the unread data would stick around in the kernels socket buffers and be discarded automatically. Now I guess the socket thread can read faster than the main thread can process.

  8. sipa commented at 2:21 PM on November 21, 2012: member

    The last time I saw garbage occurring frequently was after the feb20 protocol upgrade. I suppose we can start requiring no garbage now...

    The is still a receive buffer flooding check, by the way.

  9. BitcoinPullTester commented at 5:40 PM on November 22, 2012: none

    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/8af12a9f5a56de594700abb5b0d9a05adf82d64b for binaries and test log.

  10. sipa commented at 8:36 PM on November 23, 2012: member

    I wonder why we even need that flood protection. You could just as well stop polling sockets for read events if their receive buffer is above some threshold.

  11. sipa commented at 8:48 PM on November 27, 2012: member

    For the record: I saw a node segfault with (among others) this patch, in the ~CNetMessage destructor (the CDataStream in it was not allocated, I assume uninitialized memory used as a CNetMessage).

  12. in src/main.cpp:None in 8af12a9f5a
    3365 | @@ -3380,7 +3366,10 @@ bool ProcessMessages(CNode* pfrom)
    3366 |              printf("ProcessMessage(%s, %u bytes) FAILED\n", strCommand.c_str(), nMessageSize);
    3367 |      }
    3368 |  
    3369 | -    vRecv.Compact();
    3370 | +    // remove processed messages; one incomplete message may remain
    


    sipa commented at 11:11 PM on December 13, 2012:

    Perhaps it's better to use an std::deque here instead of a std::vector?

  13. BitcoinPullTester commented at 1:21 PM on January 26, 2013: none

    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/8af12a9f5a56de594700abb5b0d9a05adf82d64b for binaries and test log.

  14. jgarzik commented at 4:29 PM on March 24, 2013: contributor

    Superceded by #2409

  15. jgarzik closed this on Mar 24, 2013

  16. jgarzik deleted the branch on Aug 24, 2014
  17. HashUnlimited referenced this in commit 12597d5e14 on Apr 6, 2018
  18. KolbyML referenced this in commit 564dce8f82 on Dec 5, 2020
  19. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-19 12:16 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me