Reduce latency in network processing #3180

pull pstratem wants to merge 1 commits into bitcoin:master from pstratem:processgetdata changing 2 files +25 −3
  1. pstratem commented at 11:40 pm on October 28, 2013: contributor

    Currently there is a 100ms sleep in the message processing thread that adds delay to processing messages.

    This patch skips the sleep when there is additional work to do.

    Additionally ProcessMessages & ProcessGetData are modified such that their maximum run time is reduced; this improves the average latency peers see when there is high load. (This has a significant impact on high latency networks during initial block synchronization).

  2. gmaxwell commented at 11:44 pm on October 28, 2013: contributor
    I’d really rather be never sleeping (except when waiting for IO, then event driven). Otherwise the delay will still delay block propagation.
  3. pstratem commented at 11:52 pm on October 28, 2013: contributor

    The problem is this isn’t entirely about IO.

    SendMessages is actually largely a polling function with timers and other logic not dependent on IO.

  4. cjdelisle commented at 7:05 am on October 29, 2013: none

    I understand this is important and it should be validated and pulled. I took a look at the code and as I understand it, it does the following:

    1. If a single getdata contains multiple requests, only the first those up to and including the first request for a block will be serviced.
    2. ProcessMessages will only process one message at a time so one message will be parsed from each node with outstanding messages to process rather than processing all outstanding messages from each node before moving on the the next.
    3. Don’t sleep if there are still outstanding messages to be processed, this alone should improve performance.

    Although I’m not particularly well versed in the bitcoin codebase this commit has my ACK on a code basis.

    I have built and tested it on my laptop and a vm which I -connect= and it synced the chain just fine. My laptop is showing surprisingly high processor load but it’s also serving a lot of block downloads on the order of 100kB-1500kB/s upload speed. Does anyone know if 100% processor load on one core is normal for bitcoin under these circumstances? I want to be sure there is no perf regression in the ProcessMessages codepath since it drops out of the function for each message when there are multiple messages stacked.

  5. in src/net.cpp: in 3c08b7a23f outdated
    1525@@ -1523,8 +1526,13 @@ void ThreadMessageHandler()
    1526             {
    1527                 TRY_LOCK(pnode->cs_vRecvMsg, lockRecv);
    1528                 if (lockRecv)
    1529+                {
    1530                     if (!g_signals.ProcessMessages(pnode))
    1531                         pnode->CloseSocketDisconnect();
    1532+                    
    1533+                    if (!pnode->vRecvGetData.empty() || !pnode->vRecvMsg.empty())
    1534+                        fSleep = false;
    


    cjdelisle commented at 8:32 am on October 29, 2013:
    I suspect that this is being set erroneously given I’m getting 100% CPU pretty reliably and not when the patch is removed. https://github.com/pstratem/bitcoin/blob/3c08b7a23fe3afc8025658316b99f750c9a90fbf/src/main.cpp#L3980 leaves vRecvMsg non-empty if the send buffer is full which will cause an infiniloop condition when a buffer fills up.
  6. cjdelisle commented at 9:05 am on October 29, 2013: none

    My assumption 1. turns out to have been wrong. This does not actually change the behavior of the node WRT how it handles requests with multiple getdata(block) in them. It simply drops out and leaves the balance of the requests till later which will be at most 100ms later when ProcessMessages() is next called.

    In https://github.com/pstratem/bitcoin/blob/3c08b7a23fe3afc8025658316b99f750c9a90fbf/src/main.cpp#L3974

    0    if (!pfrom->vRecvGetData.empty())
    1        ProcessGetData(pfrom);
    

    You’ll want to add another if (!pfrom->vRecvGetData.empty()) return fOk; so that it doesn’t try to continue parsing until the previous getdata is completely parsed.

  7. pstratem commented at 9:43 pm on October 29, 2013: contributor

    The goal here is for the loop to process 1 item from vRecvGetData and/or 1 item from vRecvMsg per peer.

    Probably ProcessGetData should be moved outside of ProcessMessages entirely, but that is outside the scope of this patch (I think).

  8. pstratem commented at 11:24 pm on October 29, 2013: contributor
    I see what you’re getting at, I have made the suggested change.
  9. gavinandresen commented at 11:32 pm on October 29, 2013: contributor
    ACK on the general idea of round-robin-processing requests. I haven’t had time to code review or test.
  10. sipa commented at 0:36 am on October 30, 2013: member

    ACK on approach and implementation. Haven’t tested.

    Some commits can be squashed together. In particular 3rd, which is a bugfix for the second.

  11. cjdelisle commented at 10:23 am on October 30, 2013: none
    Tested the patched version, still getting occasional 100% CPU. I think perhaps we should return a different value from ProcessMessages() if we do not want the calling routine to sleep, that allows us to be more explicit than checking that the queues are not empty. It’s just difficult to reliably prove that there is no other way out of ProcessMessages() which leaves non-empty queues.
  12. pstratem commented at 11:05 pm on October 30, 2013: contributor

    There are only 2 paths where ProcessMessages can run without consuming a message in vRecvMsg.

    Either the send buffer is full or the next message in the queue is not complete.

    I agree that a more general solution should be applied, but probably later as part of a larger reorganization of the networking code.

  13. sipa commented at 2:02 pm on November 2, 2013: member
    Rebase please?
  14. process received messages one at a time without sleeping between messages 75ef87dd93
  15. BitcoinPullTester commented at 5:11 am on November 4, 2013: none
    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/75ef87dd936cd9c84d9a9fd3afce6198409636c4 for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.
  16. gavinandresen commented at 5:52 am on November 4, 2013: contributor

    Tested on OSX, behaves properly.

    Message-handling should be eventually be reworked to use boost:asio (never sleep, be I/O event driven as @gmaxwell suggests) and to use some sort of active queue management to mitigate denial-of-service attacks.

    But this code is better than what we’ve got, so I am going to merge.

  17. gavinandresen referenced this in commit 97f844dd95 on Nov 4, 2013
  18. gavinandresen merged this on Nov 4, 2013
  19. gavinandresen closed this on Nov 4, 2013

  20. darkhosis commented at 12:34 pm on November 4, 2013: none

    Ah, I already had a similar change in my bitcoind for a year or so (though in my case, I simply reduced this 100ms to 10ms). I guess that’s why blockchain.info always picked up so much stuff. I thought it may be because of the 20ms latency..

    It would explain the four blocks picked up by blockchain.info from 198.12.127.2, which I’m running w/ similar settings (just sending out far fewer transactions).. didn’t think it’d have such an impact

  21. rebroad commented at 0:29 am on February 17, 2014: contributor
    #2125 was raised ages ago to address this. Glad to see it finally implemented!
  22. in src/main.cpp: in 75ef87dd93
    3934@@ -3929,6 +3935,8 @@ bool ProcessMessages(CNode* pfrom)
    3935 
    3936         if (!fRet)
    3937             LogPrintf("ProcessMessage(%s, %u bytes) FAILED\n", strCommand.c_str(), nMessageSize);
    3938+        
    3939+        break;
    3940     }
    


    rebroad commented at 4:35 pm on January 3, 2017:
    We don’t really need this while loop any longer since we break out after the first iteration of it….
  23. theuni referenced this in commit ba4cae284f on Jan 6, 2017
  24. theuni referenced this in commit c5a8b1b946 on Jan 13, 2017
  25. OlegGirko referenced this in commit 7533257571 on Jun 26, 2017
  26. OlegGirko referenced this in commit d306e5a867 on Jun 29, 2017
  27. OlegGirko referenced this in commit c1f9f01326 on Jun 29, 2017
  28. OlegGirko referenced this in commit b7857c4b51 on Jun 29, 2017
  29. OlegGirko referenced this in commit a0c1dfb6fb on Jun 30, 2017
  30. OlegGirko referenced this in commit 79e9abca56 on Jul 2, 2017
  31. OlegGirko referenced this in commit f1d32a0711 on Jul 3, 2017
  32. OlegGirko referenced this in commit 701f91d2da on Jul 4, 2017
  33. OlegGirko referenced this in commit 79b0ae3e8a on Jul 4, 2017
  34. OlegGirko referenced this in commit db9ce1e07a on Jul 5, 2017
  35. OlegGirko referenced this in commit 98d7ccdced on Jul 5, 2017
  36. OlegGirko referenced this in commit 105be60a9b on Jul 5, 2017
  37. OlegGirko referenced this in commit 8c689c10f5 on Jul 9, 2017
  38. OlegGirko referenced this in commit 5b6bbe8ca8 on Jul 10, 2017
  39. OlegGirko referenced this in commit e68f62ca26 on Jul 11, 2017
  40. OlegGirko referenced this in commit 92acec82dc on Jul 11, 2017
  41. OlegGirko referenced this in commit bb88ff2146 on Jul 12, 2017
  42. OlegGirko referenced this in commit 2158ce490a on Jul 13, 2017
  43. OlegGirko referenced this in commit 934105f7e6 on Jul 13, 2017
  44. OlegGirko referenced this in commit dc6e8eb409 on Jul 14, 2017
  45. OlegGirko referenced this in commit 90d0e4dc7c on Jul 14, 2017
  46. OlegGirko referenced this in commit a6812d555c on Jul 16, 2017
  47. OlegGirko referenced this in commit 4143b1a538 on Jul 16, 2017
  48. OlegGirko referenced this in commit 89f7467bf2 on Jul 17, 2017
  49. OlegGirko referenced this in commit bc64c1a0a9 on Jul 17, 2017
  50. OlegGirko referenced this in commit 4430f3213f on Jul 18, 2017
  51. OlegGirko referenced this in commit e7cd7c094c on Jul 21, 2017
  52. OlegGirko referenced this in commit d61e3ab22e on Jul 26, 2017
  53. OlegGirko referenced this in commit 4e1f02d514 on Jul 27, 2017
  54. OlegGirko referenced this in commit 5b0d9c861b on Jul 28, 2017
  55. OlegGirko referenced this in commit a7f5fc5b2d on Jul 30, 2017
  56. OlegGirko referenced this in commit 629f1324b0 on Aug 1, 2017
  57. OlegGirko referenced this in commit 1edfc3ec3a on Aug 2, 2017
  58. OlegGirko referenced this in commit 9476dc0043 on Aug 8, 2017
  59. OlegGirko referenced this in commit 6f5cb83330 on Aug 8, 2017
  60. OlegGirko referenced this in commit febf7a839b on Aug 8, 2017
  61. OlegGirko referenced this in commit 464eee6828 on Aug 9, 2017
  62. OlegGirko referenced this in commit 9c59227987 on Aug 15, 2017
  63. OlegGirko referenced this in commit b27679478e on Aug 16, 2017
  64. UdjinM6 referenced this in commit 2472999da0 on Aug 23, 2017
  65. OlegGirko referenced this in commit aae971d078 on Apr 17, 2018
  66. lateminer referenced this in commit 6b93a33514 on Jan 4, 2019
  67. KolbyML referenced this in commit 701b57888d on Dec 5, 2020
  68. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-04 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me