P2P: Make peer timeout configurable, speed up very slow test and ensure correct code path tested. #14733

pull zallarak wants to merge 2 commits into bitcoin:master from zallarak:p2ptimeout changing 4 files +36 −12
  1. zallarak commented at 11:42 PM on November 15, 2018: contributor

    Summary:

    1. Primary: Adds a debug_only=true flag for peertimeout, defaults to 60 sec., the current hard-coded setting.
    2. Secondary: Drastically speeds up p2p_timeout.py test.
    3. Secondary: Tests that the correct code path is being tested by adding log assertions to the test.

    Rationale:

    • P2P timeout was hard-coded: make it explicitly specified and configurable, instead of a magic number.
    • Addresses #13518; p2p_timeout.py takes 4 sec. to run instead of 61 sec.
    • Makes p2p_timeout.py more explicit. Previously, we relied on a comment to inform us of the timeout amount being tested. Now it is specified directly in the test via passing in the new arg; -peertimeout=3.
    • Opens us up to testing more P2P scenarios; oftentimes slow tests are the reason we don't test.

    Locally verified changes:

    With Proposed Change (4.7 sec.):

    $ time ./test/functional/p2p_timeouts.py
    2018-11-19T00:04:19.077000Z TestFramework (INFO): Initializing test directory /tmp/testhja7g2n7
    2018-11-19T00:04:23.479000Z TestFramework (INFO): Stopping nodes
    2018-11-19T00:04:23.683000Z TestFramework (INFO): Cleaning up /tmp/testhja7g2n7 on exit
    2018-11-19T00:04:23.683000Z TestFramework (INFO): Tests successful
    
    real    0m4.743s
    

    Currently on master (62.8 sec.):

    $ time ./test/functional/p2p_timeouts.py
    2018-11-19T00:06:10.948000Z TestFramework (INFO): Initializing test directory /tmp/test6mo6k21h
    2018-11-19T00:07:13.376000Z TestFramework (INFO): Stopping nodes
    2018-11-19T00:07:13.631000Z TestFramework (INFO): Cleaning up /tmp/test6mo6k21h on exit
    2018-11-19T00:07:13.631000Z TestFramework (INFO): Tests successful
    
    real    1m2.836s
    

    Error message demonstrated for new argument -peertimeout:

    $ ./bitcoind -peertimeout=-5
    ...
    Error: peertimeout cannot be configured with a negative value.
    
  2. fanquake added the label P2P on Nov 15, 2018
  3. in doc/man/bitcoin-qt.1:269 in f229697a39 outdated
     262 | @@ -263,6 +263,10 @@ nodes.
     263 |  .IP
     264 |  Specify connection timeout in milliseconds (minimum: 1, default: 5000)
     265 |  .HP
     266 | +\fB\-peertimeout=\fR<n>
     267 | +.IP
     268 | +Specify p2p connection timeout in seconds (minimum: 1, default: 60)
     269 | +.HP
    


    MarcoFalke commented at 12:09 AM on November 16, 2018:

    These files are autogenerated and shouldn't be modified.

  4. in src/netbase.cpp:33 in f229697a39 outdated
      29 | @@ -30,6 +30,7 @@ static proxyType proxyInfo[NET_MAX];
      30 |  static proxyType nameProxy;
      31 |  static CCriticalSection cs_proxyInfos;
      32 |  int nConnectTimeout = DEFAULT_CONNECT_TIMEOUT;
      33 | +int nPeerConnectTimeout = DEFAULT_PEER_CONNECT_TIMEOUT;
    


    MarcoFalke commented at 12:10 AM on November 16, 2018:

    Why here? Shouldn't this be in net (e.g. CNode or CConnman)?

  5. in src/init.cpp:1054 in f229697a39 outdated
    1049 | @@ -1049,6 +1050,10 @@ bool AppInitParameterInteraction()
    1050 |      if (nConnectTimeout <= 0)
    1051 |          nConnectTimeout = DEFAULT_CONNECT_TIMEOUT;
    1052 |  
    1053 | +    nPeerConnectTimeout = gArgs.GetArg("-peertimeout", DEFAULT_PEER_CONNECT_TIMEOUT);
    1054 | +    if (nPeerConnectTimeout <= 0)
    


    MarcoFalke commented at 12:10 AM on November 16, 2018:

    Missing {} after if

  6. in src/netbase.h:27 in f229697a39 outdated
      22 |  extern bool fNameLookup;
      23 |  
      24 |  //! -timeout default
      25 |  static const int DEFAULT_CONNECT_TIMEOUT = 5000;
      26 | +//! -peertimeout default
      27 | +static const int DEFAULT_PEER_CONNECT_TIMEOUT = 60;
    


    MarcoFalke commented at 12:12 AM on November 16, 2018:

    Time deltas are measured in type int64_t, not int

  7. zallarak force-pushed on Nov 16, 2018
  8. zallarak commented at 1:48 AM on November 16, 2018: contributor

    net.cpp|h was much more relevant than netbase.cpp|h - moved.

    I didn't move it into CNode because it didn't seem right to duplicate the timeout information into every peer. CConnman isn't a parameter to InactivityCheck, so having it as a top-level setting seemed to make most sense for now.

    Thanks for your prompt comments @MarcoFalke - looking forward to any further thoughts!

  9. MarcoFalke commented at 2:12 AM on November 16, 2018: member

    CConnman isn't a parameter to InactivityCheck

    conman is passed as this or self if you prefer python.

  10. zallarak force-pushed on Nov 16, 2018
  11. zallarak commented at 3:36 AM on November 16, 2018: contributor

    @MarcoFalke great call. I have updated with this per your feedback. Please let me know what you think of the change.

    • Pass in nPeerConnectTimeout through connOptions and add it to CConnman
  12. DrahtBot commented at 4:25 AM on November 16, 2018: member

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #14856 ([WIP] net: remove more CConnman globals (theuni) by dongcarl)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  13. meshcollider commented at 6:48 PM on November 16, 2018: contributor

    This looks good, although please use snake case even though the surrounding code doesn't :) Also please squash the nit commit

  14. zallarak force-pushed on Nov 16, 2018
  15. zallarak commented at 7:16 PM on November 16, 2018: contributor
  16. zallarak force-pushed on Nov 16, 2018
  17. in test/functional/p2p_timeouts.py:39 in 9801dcdd7b outdated
      35 | @@ -36,6 +36,7 @@ class TimeoutsTest(BitcoinTestFramework):
      36 |      def set_test_params(self):
      37 |          self.setup_clean_chain = True
      38 |          self.num_nodes = 1
      39 | +        self.extra_args=[["-peertimeout=3"]]
    


    practicalswift commented at 11:07 AM on November 17, 2018:

    Nit: Missing whitespace around = :-)


    zallarak commented at 6:11 PM on November 17, 2018:

    Great catch :-), thanks @practicalswift - updated.

  18. zallarak force-pushed on Nov 17, 2018
  19. zallarak force-pushed on Nov 17, 2018
  20. zallarak force-pushed on Nov 18, 2018
  21. zallarak renamed this:
    p2p: allow p2ptimeout to be configurable, speed up slow test
    P2P: allow peer timeout to be configurable
    on Nov 19, 2018
  22. zallarak renamed this:
    P2P: allow peer timeout to be configurable
    P2P: Allow peer timeout to be configurable and speed up very slow test.
    on Nov 19, 2018
  23. in src/init.cpp:406 in ee28e44c72 outdated
     402 | @@ -403,6 +403,7 @@ void SetupServerArgs()
     403 |      gArgs.AddArg("-proxyrandomize", strprintf("Randomize credentials for every proxy connection. This enables Tor stream isolation (default: %u)", DEFAULT_PROXYRANDOMIZE), false, OptionsCategory::CONNECTION);
     404 |      gArgs.AddArg("-seednode=<ip>", "Connect to a node to retrieve peer addresses, and disconnect. This option can be specified multiple times to connect to multiple nodes.", false, OptionsCategory::CONNECTION);
     405 |      gArgs.AddArg("-timeout=<n>", strprintf("Specify connection timeout in milliseconds (minimum: 1, default: %d)", DEFAULT_CONNECT_TIMEOUT), false, OptionsCategory::CONNECTION);
     406 | +    gArgs.AddArg("-peertimeout=<n>", strprintf("Specify p2p connection timeout in seconds (minimum: 1, default: %d)", DEFAULT_PEER_CONNECT_TIMEOUT), false, OptionsCategory::CONNECTION);
    


    MarcoFalke commented at 8:40 PM on November 27, 2018:

    It should be clear to the user that this is different (and how it is different) from the -timeout above without having to read the source code. Otherwise it should be made a hidden debug option.


    zallarak commented at 1:20 AM on November 29, 2018:

    I added a sentence that clarifies that this timeout refers to the period of inactivity we tolerate from peers before they are dropped. Great point.

  24. in src/net.h:561 in ee28e44c72 outdated
     557 | @@ -551,7 +558,7 @@ class CNodeStats
     558 |      bool fRelayTxes;
     559 |      int64_t nLastSend;
     560 |      int64_t nLastRecv;
     561 | -    int64_t nTimeConnected;
     562 | +    uint64_t nTimeConnected;
    


    MarcoFalke commented at 8:46 PM on November 27, 2018:

    Any reason to change this? Unless there is a strong reason, I'd say keep it the way it was. Also int64_t seems preferable to avoid (future) unsigned integer overflow issues.


    MarcoFalke commented at 8:48 PM on November 27, 2018:

    Same review applies for the newly introduced variables.


    dongcarl commented at 3:45 AM on November 28, 2018:

    (3/3) You've changed the type for CNodeStats's nTimeConnected, but not CNode's, which means that when CNode::copyStats is invoked, the X macro would be expanded and the int64_t from CNode would be implicitly converted to a uint64_t.

  25. in src/net.h:399 in ee28e44c72 outdated
     394 | @@ -391,6 +395,9 @@ class CConnman
     395 |      uint64_t nMaxOutboundLimit GUARDED_BY(cs_totalBytesSent);
     396 |      uint64_t nMaxOutboundTimeframe GUARDED_BY(cs_totalBytesSent);
     397 |  
     398 | +    // P2P timeout in seconds
     399 | +    uint64_t peer_connect_timeout;
    


    MarcoFalke commented at 8:46 PM on November 27, 2018:

    This is a member? Should start with m_...


    dongcarl commented at 3:03 AM on November 28, 2018:

    Nit: Class member variables should have a m_ prefix.

  26. in src/net.h:143 in ee28e44c72 outdated
     139 | @@ -138,6 +140,7 @@ class CConnman
     140 |          unsigned int nReceiveFloodSize = 0;
     141 |          uint64_t nMaxOutboundTimeframe = 0;
     142 |          uint64_t nMaxOutboundLimit = 0;
     143 | +        uint64_t peer_connect_timeout = DEFAULT_PEER_CONNECT_TIMEOUT;
    


    MarcoFalke commented at 8:47 PM on November 27, 2018:

    This is a member? Should start with m_...

  27. in src/net.cpp:1238 in ee28e44c72 outdated
    1235 | -    if (nTime - pnode->nTimeConnected > 60)
    1236 | +    uint64_t nTime = GetSystemTimeInSeconds();
    1237 | +    if (nTime - pnode->nTimeConnected > peer_connect_timeout)
    1238 |      {
    1239 |          if (pnode->nLastRecv == 0 || pnode->nLastSend == 0)
    1240 |          {
    


    MarcoFalke commented at 8:49 PM on November 27, 2018:

    Needs the debug log message adjusted

  28. in test/functional/p2p_timeouts.py:68 in ee28e44c72 outdated
      64 | @@ -63,7 +65,7 @@ def run_test(self):
      65 |          no_verack_node.send_message(msg_ping())
      66 |          no_version_node.send_message(msg_ping())
      67 |  
      68 | -        sleep(31)
      69 | +        sleep(2)
    


    MarcoFalke commented at 8:55 PM on November 27, 2018:

    Should sleep with assert_debug_log(... "timeout ..."):

    to assert we hit the right code path.


    zallarak commented at 1:21 AM on November 29, 2018:

    Excellent point! I added the correct log assertions. This makes this test even better.

  29. in src/net.cpp:1235 in ee28e44c72 outdated
    1230 | @@ -1231,8 +1231,8 @@ void CConnman::NotifyNumConnectionsChanged()
    1231 |  
    1232 |  void CConnman::InactivityCheck(CNode *pnode)
    1233 |  {
    1234 | -    int64_t nTime = GetSystemTimeInSeconds();
    1235 | -    if (nTime - pnode->nTimeConnected > 60)
    1236 | +    uint64_t nTime = GetSystemTimeInSeconds();
    1237 | +    if (nTime - pnode->nTimeConnected > peer_connect_timeout)
    


    dongcarl commented at 3:07 AM on November 28, 2018:

    The LogPrint on line 1239 should be changed to not be hardcoded to "60 seconds" anymore.

  30. dongcarl changes_requested
  31. in src/init.cpp:1059 in ee28e44c72 outdated
    1055 | +
    1056 | +    int64_t peer_connect_timeout_arg = gArgs.GetArg("-peertimeout", DEFAULT_PEER_CONNECT_TIMEOUT);
    1057 | +    if (peer_connect_timeout_arg <= 0) {
    1058 | +        return InitError(_("peertimeout cannot be configured with a negative value."));
    1059 | +    }
    1060 | +    peer_connect_timeout = (uint64_t) peer_connect_timeout_arg;
    


    dongcarl commented at 3:30 AM on November 28, 2018:

    (1/3) It would seem the insistence on making peer_connect_timeout a uint64_t is making the code a tiny more verbose (compared the block here with the one above for -timeout).

  32. in src/net.cpp:1234 in ee28e44c72 outdated
    1230 | @@ -1231,8 +1231,8 @@ void CConnman::NotifyNumConnectionsChanged()
    1231 |  
    1232 |  void CConnman::InactivityCheck(CNode *pnode)
    1233 |  {
    1234 | -    int64_t nTime = GetSystemTimeInSeconds();
    1235 | -    if (nTime - pnode->nTimeConnected > 60)
    1236 | +    uint64_t nTime = GetSystemTimeInSeconds();
    


    dongcarl commented at 3:45 AM on November 28, 2018:

    (2/3) The choice of using uint64_t here means that there's an implicit conversion.

  33. dongcarl changes_requested
  34. dongcarl commented at 4:02 AM on November 28, 2018: member

    Question for other reviewers: I'm wondering what the best types are for peer_connect_timeout, nTime, and nTimeConnected.

    I think that while it is logical for these things to be unsigned, and the conversion are largely safe given the current codebase, there are a few things to consider:

    1. Other timeout options (such as -timeout) are signed.
    2. There isn't an unsigned variant of GetArg() or GetSystemTimeInSeconds(), meaning that we have to cast/implicitly convert.
    3. If we're trying to make sure that peer_connect_timeout can't have invalid values by making it unsigned, it can still be 0, which is not valid.

    Perhaps what should be done is to leave the types alone until we use something like std::chrono.

  35. MarcoFalke commented at 3:54 PM on November 28, 2018: member

    I reviewed this yesterday, but didn't submit. Turns out most of my comments were already raised, but I am dropping this nonetheless.

  36. dongcarl commented at 4:20 PM on November 28, 2018: member

    Overall, I'm very excited that the test that's changed here will be sped up quite a bit! :smile:

  37. p2p: allow p2ptimeout to be configurable, speed up slow test 8042bbfbf0
  38. zallarak force-pushed on Nov 29, 2018
  39. zallarak commented at 1:24 AM on November 29, 2018: contributor

    @dongcarl @MarcoFalke - I have resolved every comment you left:

    • Add log assertions to the test to ensure the right code path is being hit -- this greatly improves the tests!
    • Make time values consistent with the rest of the codebase (signed)
    • Made cli helptext more clear
    • Fix log messages that had 60 seconds hard-coded
    • prefix class members with m_

    Really appreciate the great comments.

    Let me know how things look from here!

  40. dongcarl commented at 2:07 AM on November 29, 2018: member

    tACK 8042bbfbf001d60feead6f04c99e0fca93dbba3c

  41. zallarak renamed this:
    P2P: Allow peer timeout to be configurable and speed up very slow test.
    P2P: Make peer timeout configurable. Speeds up very slow test and ensures correct code path is tested.
    on Nov 29, 2018
  42. zallarak renamed this:
    P2P: Make peer timeout configurable. Speeds up very slow test and ensures correct code path is tested.
    P2P: Make peer timeout configurable, speed up very slow test and ensure correct code path tested.
    on Nov 29, 2018
  43. MarcoFalke commented at 8:42 PM on November 29, 2018: member

    utACK 8042bbfbf0

  44. in src/init.cpp:406 in 8042bbfbf0 outdated
     402 | @@ -403,6 +403,7 @@ void SetupServerArgs()
     403 |      gArgs.AddArg("-proxyrandomize", strprintf("Randomize credentials for every proxy connection. This enables Tor stream isolation (default: %u)", DEFAULT_PROXYRANDOMIZE), false, OptionsCategory::CONNECTION);
     404 |      gArgs.AddArg("-seednode=<ip>", "Connect to a node to retrieve peer addresses, and disconnect. This option can be specified multiple times to connect to multiple nodes.", false, OptionsCategory::CONNECTION);
     405 |      gArgs.AddArg("-timeout=<n>", strprintf("Specify connection timeout in milliseconds (minimum: 1, default: %d)", DEFAULT_CONNECT_TIMEOUT), false, OptionsCategory::CONNECTION);
     406 | +    gArgs.AddArg("-peertimeout=<n>", strprintf("Specify p2p connection timeout in seconds. This option determines the amount of time a peer may be inactive before the connection to it is dropped. (minimum: 1, default: %d)", DEFAULT_PEER_CONNECT_TIMEOUT), false, OptionsCategory::CONNECTION);
    


    MarcoFalke commented at 8:50 PM on November 29, 2018:

    Should probably set debug_only=true to hide the option from users, unless there is a reason to expose it.

  45. in src/init.cpp:1057 in 8042bbfbf0 outdated
    1053 |          nConnectTimeout = DEFAULT_CONNECT_TIMEOUT;
    1054 | +    }
    1055 | +
    1056 | +    peer_connect_timeout = gArgs.GetArg("-peertimeout", DEFAULT_PEER_CONNECT_TIMEOUT);
    1057 | +    if (peer_connect_timeout <= 0) {
    1058 | +        return InitError(_("peertimeout cannot be configured with a negative value."));
    


    MarcoFalke commented at 8:50 PM on November 29, 2018:

    No need to translate this error, imo.

  46. MarcoFalke approved
  47. MarcoFalke commented at 8:51 PM on November 29, 2018: member

    Just two minor nits I noticed, feel free to fix them in a separate commit.

  48. zallarak commented at 8:58 PM on November 29, 2018: contributor

    @MarcoFalke do you mean make a new pull request, or should I add the nit commits to this one? Thank you!

    EDIT: I will add the commit here, but not squash it so that the review is preserved. Let me know if I should do otherwise.

  49. make peertimeout a debug argument, remove error message translation 48b37db50f
  50. MarcoFalke commented at 9:13 PM on November 29, 2018: member

    utACK 48b37db50f4a5dc6d2dd1a01183bdb993ca97f8b

  51. dongcarl approved
  52. dongcarl commented at 11:53 PM on November 30, 2018: member

    utACK 48b37db50f4a5dc6d2dd1a01183bdb993ca97f8b

  53. MarcoFalke added this to the milestone 0.18.0 on Dec 3, 2018
  54. laanwj merged this on Dec 4, 2018
  55. laanwj closed this on Dec 4, 2018

  56. laanwj referenced this in commit 88445889f1 on Dec 4, 2018
  57. PastaPastaPasta referenced this in commit 987aa99056 on Apr 16, 2020
  58. PastaPastaPasta referenced this in commit d5156982ef on Apr 16, 2020
  59. PastaPastaPasta referenced this in commit 3f36c51de2 on Apr 19, 2020
  60. PastaPastaPasta referenced this in commit 29eada80a7 on Apr 20, 2020
  61. deadalnix referenced this in commit 94cd1f0f2b on Apr 24, 2020
  62. PastaPastaPasta referenced this in commit feb0c4949b on Jun 12, 2020
  63. PastaPastaPasta referenced this in commit f40d8f006b on Jun 14, 2020
  64. ftrader referenced this in commit 7185853c3d on Aug 17, 2020
  65. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-17 09:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me