BIP324 integration #28331

pull sipa wants to merge 10 commits into bitcoin:master from sipa:202308_bip324_integration changing 22 files +540 −83
  1. sipa commented at 0:25 am on August 24, 2023: member

    Part of #27634.

    This makes BIP324 support feature complete, through a (default off) -v2transport option for enabling V2 connections. If it is enabled:

    • The NODE_P2P_V2 service flag (1 « 11) is advertized.
    • Inbound connections can use V1 or V2 (automatically detected based on the protocol used by the peer)
    • V2 connections are used on outbound when the NODE_P2P_V2 service is available (or the new use_v2 parameter is set on the addnode RPC).
    • V2 outbound connections that instantly fail get retried as V1.

    There are two new RPC fields, "transport_protocol_type" and "session_id", in getpeerinfo.

  2. DrahtBot commented at 0:25 am on August 24, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK theStack, mzumsande, Sjors
    Approach ACK jonatack
    Stale ACK ajtowns, vasild

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #bitcoin-core/gui/754 (Add BIP324-specific labels to peer details by hebasto)
    • #28464 (net: improve max-connection limits code by mzumsande)
    • #28463 (p2p: Increase inbound capacity for block-relay only connections by mzumsande)
    • #28155 (net: improves addnode / m_added_nodes logic by sr-gi)
    • #27600 (net: Add new permission forceinbound to evict a random unprotected connection if all slots are otherwise full by pinheadmz)
    • #27581 (net: Continuous ASMap health check by fjahr)
    • #27114 (p2p: Allow whitelisting outgoing connections by brunoerg)
    • #27052 (test: rpc: add last block announcement time to getpeerinfo result by LarryRuane)
    • #26863 (test: merge banning test from p2p_disconnect_ban to rpc_setban by brunoerg)
    • #26728 (wallet: Have the wallet store the key for automatically generated descriptors by achow101)
    • #26114 (net: Make AddrFetch connections to fixed seeds by mzumsande)
    • #25907 (wallet: rpc to add automatically generated descriptors by achow101)
    • #22341 (rpc: add getxpub by Sjors)
    • #20892 (tests: Run both descriptor and legacy tests within a single test invocation by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. sipa force-pushed on Aug 24, 2023
  4. DrahtBot added the label CI failed on Aug 24, 2023
  5. DrahtBot removed the label CI failed on Aug 24, 2023
  6. sipa force-pushed on Aug 24, 2023
  7. glozow added the label P2P on Aug 24, 2023
  8. Sjors commented at 11:53 am on August 25, 2023: member

    Not needed for testing here, but it would still be cool to have a lock icon in the peers tab in the GUI, and reveal the session id if you click on it. cc @hebasto

    Tested a manual connection between macOS and Ubuntu, found matching session ids, pfew - no man in the middle attack on my local network :-)

    My DNS seed hasn’t discovered any peers yet announcing the service flag, that could take a while. https://github.com/sipa/bitcoin-seeder/pull/102

  9. sipa force-pushed on Aug 28, 2023
  10. sipa force-pushed on Aug 28, 2023
  11. DrahtBot added the label CI failed on Aug 28, 2023
  12. sipa force-pushed on Aug 29, 2023
  13. DrahtBot removed the label CI failed on Aug 29, 2023
  14. sipa commented at 2:07 pm on August 29, 2023: member

    @Sjors Re #28196 (comment):

    I’ve changed a net debug-level log message to say “trying v2 connection … lastseen=…hrs\n”

  15. sipa force-pushed on Aug 29, 2023
  16. DrahtBot added the label CI failed on Aug 29, 2023
  17. sipa force-pushed on Aug 29, 2023
  18. DrahtBot removed the label CI failed on Aug 29, 2023
  19. sipa force-pushed on Aug 30, 2023
  20. Sjors commented at 2:09 pm on August 30, 2023: member

    I’ve changed a net debug-level log message to say “trying v2 connection … lastseen=…hrs\n”

    Nice! Though I’m not seeing that in your latest push f770273b3842e45a612cb8bb81d46e3d0883230a.

  21. sipa force-pushed on Aug 30, 2023
  22. sipa force-pushed on Aug 30, 2023
  23. Sjors commented at 8:01 am on August 31, 2023: member
    Oh sorry, I lazy grepped for “v2 connection”.
  24. sipa force-pushed on Aug 31, 2023
  25. sipa force-pushed on Aug 31, 2023
  26. sipa force-pushed on Sep 1, 2023
  27. sipa force-pushed on Sep 2, 2023
  28. sipa force-pushed on Sep 3, 2023
  29. DrahtBot added the label CI failed on Sep 3, 2023
  30. sipa force-pushed on Sep 4, 2023
  31. sipa force-pushed on Sep 4, 2023
  32. sipa force-pushed on Sep 4, 2023
  33. sipa force-pushed on Sep 4, 2023
  34. DrahtBot removed the label CI failed on Sep 5, 2023
  35. sipa force-pushed on Sep 5, 2023
  36. sipa force-pushed on Sep 5, 2023
  37. sipa force-pushed on Sep 5, 2023
  38. sipa force-pushed on Sep 5, 2023
  39. DrahtBot added the label CI failed on Sep 5, 2023
  40. sipa force-pushed on Sep 5, 2023
  41. DrahtBot removed the label CI failed on Sep 5, 2023
  42. sipa force-pushed on Sep 6, 2023
  43. DrahtBot added the label CI failed on Sep 6, 2023
  44. sipa force-pushed on Sep 6, 2023
  45. sipa force-pushed on Sep 6, 2023
  46. DrahtBot removed the label CI failed on Sep 6, 2023
  47. sipa force-pushed on Sep 6, 2023
  48. sipa force-pushed on Sep 6, 2023
  49. sipa force-pushed on Sep 6, 2023
  50. DrahtBot added the label CI failed on Sep 6, 2023
  51. sipa force-pushed on Sep 6, 2023
  52. DrahtBot removed the label CI failed on Sep 7, 2023
  53. sipa force-pushed on Sep 7, 2023
  54. sipa force-pushed on Sep 8, 2023
  55. sipa force-pushed on Sep 9, 2023
  56. sipa force-pushed on Sep 10, 2023
  57. DrahtBot added the label CI failed on Sep 10, 2023
  58. sipa force-pushed on Sep 10, 2023
  59. sipa marked this as ready for review on Sep 10, 2023
  60. sipa commented at 3:20 pm on September 10, 2023: member
    Marked as ready for review.
  61. DrahtBot removed the label CI failed on Sep 10, 2023
  62. sipa force-pushed on Sep 10, 2023
  63. DrahtBot added the label Needs rebase on Sep 11, 2023
  64. hebasto commented at 11:19 am on September 11, 2023: member

    … it would still be cool to have a lock icon in the peers tab in the GUI…

    See https://github.com/bitcoin-core/gui/pull/754 (however, not a lock icon).

  65. sipa force-pushed on Sep 11, 2023
  66. DrahtBot removed the label Needs rebase on Sep 11, 2023
  67. in src/protocol.h:295 in 1768aff56d outdated
    290@@ -291,6 +291,9 @@ enum ServiceFlags : uint64_t {
    291     // See BIP159 for details on how this is implemented.
    292     NODE_NETWORK_LIMITED = (1 << 10),
    293 
    294+    // NODE_P2P_V2 means the node supports BIP324 transport
    295+    NODE_P2P_V2 = (1 << 11),
    


    theStack commented at 5:12 pm on September 11, 2023:
    in commit 1768aff56d6feedcb22b21df12e8926924fdac6a: this flag should also be added to the ALL_SERVICE_FLAGS array in test/util/net.h, I guess (used by fuzz tests): https://github.com/bitcoin/bitcoin/blob/8f7b9eb8711fdb32e8bdb59d2a7462a46c7a8086/src/test/util/net.h#L61-L68

    sipa commented at 10:37 pm on September 13, 2023:
    Done.
  68. in test/functional/p2p_v2.py:36 in be8a5dcddd outdated
    31+        assert_equal(self.nodes[1].getblockcount(), 0)
    32+        with self.nodes[0].assert_debug_log(expected_msgs=[sending_handshake],
    33+                                            unexpected_msgs=[downgrading_to_v1]):
    34+            self.connect_nodes(0, 1, True)
    35+        # sync_all() verifies that the block tips match
    36+        self.sync_all(self.nodes[0:2])
    


    theStack commented at 5:19 pm on September 11, 2023:
    in commit be8a5dcddd36be8a5f137baf3ebd32d590a0747d: Somewhere around here it might make sense to verify that v2 is indeed used by checking the new getpeerinfo fields (“transport_protocol_type” and “session_id”) for nodes 0 and 1.

    sipa commented at 10:37 pm on September 13, 2023:
    Done.
  69. theStack commented at 5:21 pm on September 11, 2023: contributor

    Concept ACK

    Did a quick first review round, a more detailled one is about to follow.

  70. in src/rpc/net.cpp:323 in 2aabbdf4de outdated
    319@@ -309,17 +320,21 @@ static RPCHelpMan addnode()
    320     CConnman& connman = EnsureConnman(node);
    321 
    322     std::string strNode = request.params[0].get_str();
    323+    bool use_p2p_v2 = !request.params[2].isNull() && request.params[2].get_bool();
    


    maflcko commented at 7:39 am on September 12, 2023:
    nit: Not really a fan of hard-coding default value in the code syntax. If the default was changed, this would require changing the code syntax. It would be better to hard-code the default value itself. Even best would be to not hard-code anything anywhere, which should be possible with self.Arg<bool>(2), or similar.

    sipa commented at 10:37 pm on September 13, 2023:
    Done, used self.Arg<bool>(2) (which needed adding an instantiation for).
  71. in src/net.cpp:469 in 2aabbdf4de outdated
    464@@ -465,7 +465,11 @@ CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCo
    465         }
    466     }
    467 
    468-    LogPrintLevel(BCLog::NET, BCLog::Level::Debug, "trying connection %s lastseen=%.1fhrs\n",
    469+    // Use v2 transport when both us and them signal NODE_P2P_V2.
    470+    const bool use_p2p_v2 = !!(addrConnect.nServices & GetLocalServices() & NODE_P2P_V2);
    


    maflcko commented at 7:46 am on September 12, 2023:
    0    const bool use_p2p_v2(addrConnect.nServices & GetLocalServices() & NODE_P2P_V2);
    

    nit: Can be written shorter


    sipa commented at 10:38 pm on September 13, 2023:
    Done.
  72. in src/net.cpp:1899 in 2aabbdf4de outdated
    1893@@ -1822,6 +1894,10 @@ void CConnman::CreateNodeFromAcceptedSocket(std::unique_ptr<Sock>&& sock,
    1894     }
    1895 
    1896     const bool inbound_onion = std::find(m_onion_binds.begin(), m_onion_binds.end(), addr_bind) != m_onion_binds.end();
    1897+    // The V2Transport transparently falls back to V1 behavior when an incoming V1 connection is
    1898+    // detected, so use it whenever we signal NODE_P2P_V2.
    1899+    const bool use_p2p_v2 = !!(nodeServices & NODE_P2P_V2);
    


    maflcko commented at 7:47 am on September 12, 2023:
    Same

    sipa commented at 10:38 pm on September 13, 2023:
    Done.
  73. in test/functional/test_framework/test_framework.py:632 in 2aabbdf4de outdated
    637+
    638+            initiator = list(filter(lambda x: x["addrbind"] == ip_port, peerinfo))[0]
    639+            return initiator['version'] != 0 and initiator['bytesrecv_per_msg'].pop('verack', 0) >= min_verack_msg_bytes and initiator['bytesrecv_per_msg'].pop("pong", 0) >= 29
    640+
    641+        self.wait_until(check_initiator_handshake)
    642+        self.wait_until(check_responder_handshake)
    


    maflcko commented at 7:55 am on September 12, 2023:
    q in be8a5dcddd36be8a5f137baf3ebd32d590a0747d: Is it required to re-write the logic here? Seems a smaller diff would be to replace 24 and 32?

    sipa commented at 10:38 pm on September 13, 2023:
    Apparently not needed, done.
  74. in test/functional/p2p_v2_transport.py:23 in 2aabbdf4de outdated
    18+
    19+    def run_test(self):
    20+        network_info = self.nodes[0].getnetworkinfo()
    21+        assert_equal(int(network_info["localservices"], 16) & NODE_P2P_V2, 0)
    22+        if "P2P_V2" in network_info["localservicesnames"]:
    23+            raise AssertionError("Did not expect P2P_V2 to be signaled for a v1 node")
    


    maflcko commented at 7:58 am on September 12, 2023:
    0        assert "P2P_V2" in network_info["localservicesnames"]
    1            
    

    nit: Can be written shorter? Also, the full test can be removed and the two checks can be added to the p2p_v2.py test? This avoids having to spin up a node twice to call a single RPC each time.


    sipa commented at 10:38 pm on September 13, 2023:
    Done, and done.
  75. in test/functional/p2p_v2.py:78 in 2aabbdf4de outdated
    73+        self.nodes[1].generatetoaddress(1, "bcrt1q3zsxn3qx0cqyyxgv90k7j6786mpe543wc4vy2v", invalid_call=False)
    74+        self.sync_all(self.nodes[0:4])
    75+        assert_equal(self.nodes[0].getblockcount(), 9) # sync_all() verifies tip hashes match
    76+
    77+        # V1 node mines another block and everyone gets it
    78+        self.nodes[3].generatetoaddress(2, "bcrt1q3zsxn3qx0cqyyxgv90k7j6786mpe543wc4vy2v", invalid_call=False)
    


    maflcko commented at 7:59 am on September 12, 2023:
    nit: The call is “invalid”. self.generate... would be more valid.

    sipa commented at 10:39 pm on September 13, 2023:
    I’ve stuck to self.nodes[3].generate(2) (and similarly elsewhere) because the synchronization is explicit and separate in the lines that follow.
  76. in test/functional/test_framework/test_framework.py:589 in 2aabbdf4de outdated
    585@@ -581,26 +586,50 @@ def restart_node(self, i, extra_args=None):
    586     def wait_for_node_exit(self, i, timeout):
    587         self.nodes[i].process.wait(timeout)
    588 
    589-    def connect_nodes(self, a, b):
    590+    def connect_nodes(self, a, b, peer_advertises_v2=None):
    


    maflcko commented at 8:01 am on September 12, 2023:
    0    def connect_nodes(self, a, b, *, peer_advertises_v2=None):
    

    optional integral types should use named arguments


    sipa commented at 10:39 pm on September 13, 2023:
    Fixed.
  77. maflcko commented at 8:02 am on September 12, 2023: member
    left some nits on the tests, feel free to ignore.
  78. achow101 commented at 6:41 pm on September 13, 2023: member

    Couple things while testing this out:

    • Would be nice to have the transport type and session id also show in the GUI’s peer info page
    • Could we log the connection version once the connection is established? Maybe once we receive their version message? The transport version can be figured out from the logs for outbound connections, but not for inbound.
  79. Sjors commented at 7:06 pm on September 13, 2023: member
    (deleted my last comment, forgot to add true to the RPC call)
  80. in src/net.h:1113 in 1fa8efaff7 outdated
    1080@@ -1077,7 +1081,11 @@ class CConnman
    1081         vWhitelistedRange = connOptions.vWhitelistedRange;
    1082         {
    1083             LOCK(m_added_nodes_mutex);
    1084-            m_added_nodes = connOptions.m_added_nodes;
    1085+
    1086+            for (const std::string& strAddedNode : connOptions.m_added_nodes) {
    1087+                // -addnode cli arg does not currently have a way to signal BIP324 support
    


    Sjors commented at 7:44 pm on September 13, 2023:

    1fa8efaff73bda19b7e8f2e27d24b08ccf0de764: we could add something like a -addnodev2 cli arg and then use .m_added_v2_nodes?

    Alternatively can we look the node up in our known addresses and use whatever the service flag says?


    sipa commented at 10:40 pm on September 13, 2023:
    Possibly, but at this stage my goal is to get enough code in for basic tests and experimentation, and for that, an RPC suffices. Further on, we’ll certainly want more convenient ways of interacting with this (perhaps we can treat it as a whitelist permission or so, too, but I don’t want to bikeshed about that here).

    Sjors commented at 11:15 pm on September 13, 2023:
    It can wait indeed. It’s just that I have some nodes that I’d like to connect by default.

    mzumsande commented at 7:58 pm on September 14, 2023:
    There are also other means of making outbound connections based on user input, like -connect or -seednode for which at some later point a decision must be made (make configurable / keep default at v1 / switch to v2 ).

    sipa commented at 8:10 pm on September 14, 2023:
    There is also the possibility of extending the reconnection mechanism to switch to V2 when you connect using V1 to a node and they tell you they support V2.
  81. in src/sync.h:316 in 8672b854fb outdated
    308@@ -309,25 +309,33 @@ class CSemaphore
    309     int value;
    310 
    311 public:
    312-    explicit CSemaphore(int init) : value(init) {}
    313+    explicit CSemaphore(int init) noexcept : value(init) {}
    


    Sjors commented at 7:49 pm on September 13, 2023:
    8672b854fb7a81a400150ef09e99a06c829b68f8: can you add some documentation for what this class is?

    sipa commented at 10:41 pm on September 13, 2023:
    I’ve linked to the Wikipedia article about it.
  82. in test/functional/test_framework/test_framework.py:192 in 2aabbdf4de outdated
    188@@ -189,6 +189,8 @@ def parse_args(self):
    189         parser.add_argument("--randomseed", type=int,
    190                             help="set a random seed for deterministically reproducing a previous test run")
    191         parser.add_argument("--timeout-factor", dest="timeout_factor", type=float, help="adjust test timeouts by a factor. Setting it to 0 disables all timeouts")
    192+        parser.add_argument("--v2transport", dest="v2transport", default=False, action="store_true",
    


    Sjors commented at 8:12 pm on September 13, 2023:

    2aabbdf4de01ef1a3fb0c1bb86ef163c9fba587b Nice! Only three tests fail when I enable this for all tests:

    • p2p_node_network_limited.py
    • p2p_timeouts.py
    • rpc_net.py

    theStack commented at 2:32 am on September 15, 2023:
    FWIW, I’ve started to adapt those tests for --v2transport support, and at least for p2p_node_network_limited.py and rpc_net.py there weren’t too many changes needed (see https://github.com/theStack/bitcoin/commits/enable_v2_transport_for_more_tests).
  83. Sjors commented at 8:12 pm on September 13, 2023: member

    Lightly reviewed 2aabbdf4de01ef1a3fb0c1bb86ef163c9fba587b. I’ve also been running this on mainnet for a while.

    I assume 55079fedeeca488f360d200b1d2d24c2f614bc73 will be replaced with whatever ends up merged in #28452?

    1768aff56d6feedcb22b21df12e8926924fdac6a look fine, modulo https://github.com/bitcoin/bitcoin/pull/28331/commits/1768aff56d6feedcb22b21df12e8926924fdac6a#r1321845616

    I didn’t study 8672b854fb7a81a400150ef09e99a06c829b68f8 much.

    Can you expand the description of 4cfbbceff35fcf22a232c66278c4ababd750d9bb a bit to explain under what circumstances PerformReconnections() is called and elaborate a bit on the various delays? Particuarly, if we tried a v2 connection and failed, how long do we wait before retrying with v1?

  84. in src/net.cpp:3509 in 2aabbdf4de outdated
    3513+    for (const auto& it : m_added_nodes) {
    3514+        if (added_node_params.m_added_node == it.m_added_node) return false;
    3515     }
    3516 
    3517-    m_added_nodes.push_back(strNode);
    3518+    m_added_nodes.push_back(added_node_params);
    


    jonatack commented at 8:25 pm on September 13, 2023:
    Note that both the AddNode() and RemoveAddedNode() methods become one-liners with the change in 7caaa9e (#28248), if you’d like to pull it in before this change.
  85. jonatack commented at 8:30 pm on September 13, 2023: contributor
    Approach ACK. Suggest reviewers consider testing this using the logging in #28248 that uncovered peer connection issues in the interactions between automatic and manual, outbound and inbound, and some that are network-specific, given that this pull adds another dimension.
  86. hebasto commented at 10:22 pm on September 13, 2023: member
    • Would be nice to have the transport type and session id also show in the GUI’s peer info page

    https://github.com/bitcoin-core/gui/pull/754

  87. jonatack commented at 10:26 pm on September 13, 2023: contributor

    Would be nice to have the transport type and session id also show in the GUI’s peer info page

    Am updating -netinfo for bip324 / v2 p2p as well.

  88. sipa force-pushed on Sep 13, 2023
  89. sipa commented at 10:43 pm on September 13, 2023: member

    @achow101

    • Would be nice to have the transport type and session id also show in the GUI’s peer info page

    https://github.com/bitcoin-core/gui/pull/754

    • Could we log the connection version once the connection is established? Maybe once we receive their version message? The transport version can be figured out from the logs for outbound connections, but not for inbound.

    I’ve added logging on VERACK for both inbound and outbound (but only with -debug=net for inbound, because possible spam). @Sjors

    I assume https://github.com/bitcoin/bitcoin/commit/55079fedeeca488f360d200b1d2d24c2f614bc73 will be replaced with whatever ends up merged in #28452?

    Rebased on top of it.

    Can you expand the description of https://github.com/bitcoin/bitcoin/commit/4cfbbceff35fcf22a232c66278c4ababd750d9bb a bit to explain under what circumstances PerformReconnections() is called and elaborate a bit on the various delays? Particuarly, if we tried a v2 connection and failed, how long do we wait before retrying with v1?

    It should be pretty much immediately - the very next time a connection is attempted to be created.

  90. in src/net.cpp:2886 in f204845319 outdated
    2882@@ -2834,6 +2883,8 @@ void CConnman::ThreadOpenAddedConnections()
    2883         // Retry every 60 seconds if a connection was attempted, otherwise two seconds
    2884         if (!interruptNet.sleep_for(std::chrono::seconds(tried ? 60 : 2)))
    2885             return;
    2886+        // See if any reconnections are desired.
    2887+        PerformReconnections();
    


    Sjors commented at 11:18 pm on September 13, 2023:
    Wouldn’t this wait for 60 seconds? (context: #28331 (comment))

    sipa commented at 12:08 pm on September 14, 2023:
    If you only have a ThreadOpenAddedConnections and no ThreadOpenConnections, indeed - though I think that only happens if you have a -connect= option.
  91. in test/functional/p2p_v2_transport.py:124 in 018594ef44 outdated
    119+        self.nodes[1].generate(1, invalid_call=False)
    120+        self.sync_all(self.nodes[0:4])
    121+        assert_equal(self.nodes[0].getblockcount(), 9) # sync_all() verifies tip hashes match
    122+
    123+        # V1 node mines another block and everyone gets it
    124+        self.nodes[3].generate(2, invalid_call=False)
    


    maflcko commented at 6:10 am on September 14, 2023:
    0        self.generate(self.nodes[3], 2, sync_fun=lambda: self.sync_all(self.nodes[0:4]))
    

    nit: Can be written shorter by combining the call with the line below.

    The general idea is that for each call to generate the dev is forced to specify which nodes are expected to be synced after the call, to make intermittent races/errors less common and assumptions easier to follow.

    (Same above)

    (My idea was to force devs to type invalid_call, which would make them realize that the call is considered “invalid”, but maybe that doesn’t work :sweat_smile:)


    sipa commented at 12:20 pm on September 14, 2023:

    Ooh! I interpreted it as “don’t expect an exception to be thrown” or so. Done.

    Perhaps it should be called i_know_test_framework_generate_exists_but_dont_want_to_use_it=True instead of invalid_call=False.

  92. in test/functional/p2p_v2_transport.py:2 in 018594ef44 outdated
    0@@ -0,0 +1,136 @@
    1+#!/usr/bin/env python3
    2+# Copyright (c) 2021 The Bitcoin Core developers
    


    maflcko commented at 6:14 am on September 14, 2023:
    0# Copyright (c) 2021-present The Bitcoin Core developers
    

    nit: To avoid having to ever touch this file again, for new code, could use -present, or drop the years?


    sipa commented at 12:56 pm on September 14, 2023:
    Done.
  93. maflcko approved
  94. maflcko commented at 6:15 am on September 14, 2023: member
    Clarified my test nit about invalid_call=False. Feel free to ignore
  95. DrahtBot added the label Needs rebase on Sep 14, 2023
  96. sipa force-pushed on Sep 14, 2023
  97. DrahtBot removed the label Needs rebase on Sep 14, 2023
  98. sipa force-pushed on Sep 14, 2023
  99. sipa force-pushed on Sep 14, 2023
  100. jsarenik commented at 2:05 pm on September 14, 2023: none
    One running on ln.uk.ms, see http://anyone.eu.org/bitcoin.txt or https://be.anyone.eu.org/node-details for addresses.
  101. in src/rpc/net.cpp:330 in eb05c92119 outdated
    309@@ -309,17 +310,21 @@ static RPCHelpMan addnode()
    310     CConnman& connman = EnsureConnman(node);
    311 
    312     std::string strNode = request.params[0].get_str();
    313+    bool use_v2transport = self.Arg<bool>(2);
    314 
    315     if (strCommand == "onetry")
    316     {
    317         CAddress addr;
    318+        if (use_v2transport) {
    


    mzumsande commented at 8:34 pm on September 14, 2023:
    could return an error if we explicitely set the bool with the RPC command but don’t have activated -v2transport ourselves (instead of silently downgrading to v1).

    sipa commented at 8:51 pm on September 15, 2023:
    Added.
  102. theStack commented at 11:31 am on September 15, 2023: contributor

    After startup I frequently see getpeerinfo outbound peer entries like the following:

     0$ ./src/bitcoin-cli getpeerinfo
     1{
     2  ...,
     3  "services": "0000000000000000",
     4  "servicesnames": [
     5  ],
     6  "bytessent_per_msg": {
     7    "": 2614
     8  },
     9  "bytesrecv_per_msg": {
    10  },
    11  "connection_type": "outbound-full-relay",
    12  "transport_protocol_type": "v2",
    13  "session_id": ""
    14}
    

    So it seems that even though the p2p v2 handshake isn’t completed yet, the “transport_protocol_type” is already reported as “v2”. Should we also use “detecting” (or something else) for the sender side in this case?

    Haven’t checked in detail, but I suspect that this happens if a peer according to addrman signals NODE_P2P_V2 and we hence try to initiate a v2 connection, but the other side only supports p2p v1? (or, for some other reason the handshake doesn’t complete).

  103. sipa force-pushed on Sep 15, 2023
  104. sipa commented at 12:06 pm on September 15, 2023: member

    @theStack Good call, that’s confusing.

    What’s going on is that the code only reported “detecting” for inbound possibly-v2 connections where the 12-byte prefix hasn’t been received yet. In inbound that’s more or less correct, except that peer that send non-bitcoin traffic would also get incorrectly classified as v2.

    I’ve changed it to align the transport version and the session id: only say v2 when the garbage auth packet has been completely received; before that point “detecting” will be reported.

    Diff:

     0--- a/src/net.cpp
     1+++ b/src/net.cpp
     2@@ -1637,15 +1637,14 @@ Transport::Info V2Transport::GetInfo() const noexcept
     3 
     4     Transport::Info info;
     5 
     6-    // Report DETECTING as transport type if we don't know whether the peer is V1 or V2.
     7-    info.transport_type = m_recv_state ==
     8-        RecvState::KEY_MAYBE_V1 ? TransportProtocolType::DETECTING : TransportProtocolType::V2;
     9-
    10-    // Do not report the session ID until the garbage authentication packet has been received and
    11-    // verified (confirming that the other side very likely has the same keys as us).
    12+    // Do not report v2 and session ID until the garbage authentication packet has been received
    13+    // and verified (confirming that the other side very likely has the same keys as us).
    14     if (m_recv_state != RecvState::KEY_MAYBE_V1 && m_recv_state != RecvState::KEY &&
    15         m_recv_state != RecvState::GARB_GARBTERM && m_recv_state != RecvState::GARBAUTH) {
    16+        info.transport_type = TransportProtocolType::V2;
    17         info.session_id = uint256(MakeUCharSpan(m_cipher.GetSessionID()));
    18+    } else {
    19+        info.transport_type = TransportProtocolType::DETECTING;
    20     }
    21 
    22     return info;
    
  105. vasild commented at 3:14 pm on September 15, 2023: contributor

    Concept ACK

    -v2transport is maybe a bit too obscure from the users’ point of view? Something like -p2pencryption may be clearer.

  106. sipa commented at 3:18 pm on September 15, 2023: member

    @vasild

    • I consider the -v2transport an option just for people who want to experiment with it, before it gets enabled by default.
    • I worry that advertising the feature as “encryption” would result in incorrect assumptions about what it achieves (which is raising costs for large-scale observers, if deployed at scale; it doesn’t really protect privacy for your own actions against a deliberate attacker).
  107. in src/net.cpp:3878 in 7d2bf51b9e outdated
    3844+            if (m_reconnections.empty()) break;
    3845+            todo.splice(todo.end(), m_reconnections, m_reconnections.begin());
    3846+        }
    3847+
    3848+        auto& [addr_connect, count_failure, grant, dest, conn_type] = *todo.begin();
    3849+        OpenNetworkConnection(addr_connect, count_failure, &grant, dest.empty() ? nullptr : dest.c_str(), conn_type);
    


    mzumsande commented at 3:25 pm on September 15, 2023:
    I think this means that reconnections for added nodes can be made from ThreadOpenConnections, and reconnections for automatic connections can be performed by ThreadOpenAddedConnections - which seems really weird, although I don’t think it breaks anything. I wonder if it makes sense to perform the respective reconnection attempts only by their designated threads.

    sipa commented at 9:08 pm on September 15, 2023:

    I agree it’s somewhat strange, but I’m not convinced that splitting them up is better.

    Pro current approach:

    • Slightly lower latency for reconnections with 2 threads considering them.
    • Simpler implementation.
    • Not entirely clear what to do with connections not created by either thread (like addnode oneshot, unsure if there are others). I guess they can go in the addnode thread too.

    Pro split:

    • No interference between the two types. For example, if you’d configure lots of addnodes errorneously as V2, with the currently approach they’ll end up slowing down normal (random) peer connection establishment.

    mzumsande commented at 7:28 pm on September 25, 2023:
    Ok - as pointed out below it could also help resolve problems with a blocking semaphore to have another thread trying as well.
  108. mzumsande commented at 3:34 pm on September 15, 2023: contributor
    Concept ACK
  109. sipa force-pushed on Sep 15, 2023
  110. sipa force-pushed on Sep 15, 2023
  111. DrahtBot added the label CI failed on Sep 15, 2023
  112. sipa force-pushed on Sep 19, 2023
  113. sipa commented at 9:21 am on September 19, 2023: member
    Rebased to retrigger CI.
  114. in src/net.cpp:3623 in 54544ace5e outdated
    3616@@ -3606,6 +3617,15 @@ ServiceFlags CConnman::GetLocalServices() const
    3617     return nLocalServices;
    3618 }
    3619 
    3620+static std::unique_ptr<Transport> MakeTransport(NodeId id, bool use_v2transport, bool inbound) noexcept
    3621+{
    3622+    if (use_v2transport) {
    3623+        return std::make_unique<V2Transport>(id, !inbound, SER_NETWORK, INIT_PROTO_VERSION);
    


    theStack commented at 9:26 am on September 19, 2023:

    named argument nit:

    0        return std::make_unique<V2Transport>(id, /*initiating=*/!inbound, SER_NETWORK, INIT_PROTO_VERSION);
    

    sipa commented at 4:50 pm on September 20, 2023:
    Done.
  115. in src/node/connection_types.h:84 in 0c345514b8 outdated
    78@@ -79,4 +79,14 @@ enum class ConnectionType {
    79 /** Convert ConnectionType enum to a string value */
    80 std::string ConnectionTypeAsString(ConnectionType conn_type);
    81 
    82+/** Transport layer version */
    83+enum class TransportProtocolType {
    84+    DETECTING, //!< Incoming connection; peer could be v1 or v2
    


    theStack commented at 9:32 am on September 19, 2023:
    0    DETECTING, //!< peer could be v1 or v2
    

    since the latest change in V2Transport::GetInfo (introduced in #28331 (comment)) the DETECTING state is now used both for in- and outbound connections


    sipa commented at 4:50 pm on September 20, 2023:
    Done.
  116. in src/net.h:285 in 0c345514b8 outdated
    280+        std::optional<uint256> session_id;
    281+
    282+        friend bool operator==(const Info& a, const Info& b) noexcept
    283+        {
    284+            return std::tie(a.transport_type, a.session_id) == std::tie(b.transport_type, b.session_id);
    285+        }
    


    theStack commented at 9:34 am on September 19, 2023:
    nit: this operator seems to be not used anywhere

    sipa commented at 4:50 pm on September 20, 2023:
    Gone.
  117. in test/functional/test_framework/test_framework.py:592 in 4a363e9754 outdated
    589         to_num_peers = 1 + len(to_connection.getpeerinfo())
    590         ip_port = "127.0.0.1:" + str(p2p_port(b))
    591-        from_connection.addnode(ip_port, "onetry")
    592+
    593+        if peer_advertises_v2:
    594+            from_connection.addnode(ip_port, "onetry", True)
    


    theStack commented at 9:36 am on September 19, 2023:

    nit: could use named arguments here and below

    0            from_connection.addnode(node=ip_port, command="onetry", v2transport=True)
    

    sipa commented at 4:49 pm on September 20, 2023:
    Done.
  118. theStack commented at 10:19 am on September 19, 2023: contributor

    Statistical data reporting nit: apparently the garbage data is accounted for the sender side as empty type ("") in "bytessent_per_msg", while for the receiver side ("bytesrecv_per_msg") it’s not, e.g.:

     0./src/bitcoin-cli getpeerinfo
     1[
     2  {
     3    "bytessent_per_msg": {                                          
     4      "": 3623,       
     5      "feefilter": 29,
     6      "getaddr": 33,
     7      ...
     8    },
     9    "bytesrecv_per_msg": {
    10      "addrv2": 16621,
    11      "block": 60014267,
    12      ...
    13    },
    14  }
    15]
    

    (seeing this for both in- and outbound connections)

    Left a few other nits below. I’ve been running this PR for several days now with several in- and outbound peers and everything seems to work just fine AFAICT. Currently also running an IBD with a single BIP324 peer (the only one that @Sjors’ DNS seed x809.seed.bitcoin.sprovoost.nl returned, which was recently added to bitcoin-seeder, see https://github.com/sipa/bitcoin-seeder/pull/102). @jonatack:

    Would be nice to have the transport type and session id also show in the GUI’s peer info page

    Am updating -netinfo for bip324 / v2 p2p as well.

    Looking very much forward to that :rocket:

  119. DrahtBot removed the label CI failed on Sep 19, 2023
  120. sipa force-pushed on Sep 20, 2023
  121. DrahtBot added the label Needs rebase on Sep 21, 2023
  122. sipa force-pushed on Sep 22, 2023
  123. sipa commented at 10:41 am on September 22, 2023: member
    Rebased. @theStack I guess the fact that the handshake bytes are reported in the send statistics but not in the receive statistics is inconsistent. There is a simple solution, I think, namely adding a “message” in the ReceivedMessageComplete sense to be retrieved to transition from VERSION to APP state (but one with no payload, and reject = true), to communicate the handshake bytes to the net layer. I prefer doing this in a follow-up, though, as there may be a wider question about how to report the various types of bytes (maybe we want to classify garbage/key/version separately, and maybe we want to support reporting decoy data too).
  124. DrahtBot removed the label Needs rebase on Sep 22, 2023
  125. DrahtBot added the label CI failed on Sep 22, 2023
  126. sipa force-pushed on Sep 22, 2023
  127. DrahtBot removed the label CI failed on Sep 22, 2023
  128. DrahtBot added the label CI failed on Sep 23, 2023
  129. theStack commented at 11:39 pm on September 23, 2023: contributor

    Light ACK 964a8b88ef69643181e11475f03f94cc2780772e (still planning to look deeper at 58ba59f39c217f6835e7839c391dd680ee9d62f2 and its implications w.r.t. the discussion #28331 (review))

    Tested more intensly over for the last few days with some more in- and outbound connections and everything was working fine. Unexpected disconnects and other seemingly weird behaviour patterns turned out to be wrong alarms not related to any of the new p2p v2 code, presumably the connection was just bad for outbound connection tests on my laptop (unstable/slow wifi at conferences…). Will keep testing over the following days.

    @theStack I guess the fact that the handshake bytes are reported in the send statistics but not in the receive statistics is inconsistent. There is a simple solution, I think, namely adding a “message” in the ReceivedMessageComplete sense to be retrieved to transition from VERSION to APP state (but one with no payload, and reject = true), to communicate the handshake bytes to the net layer. I prefer doing this in a follow-up, though, as there may be a wider question about how to report the various types of bytes (maybe we want to classify garbage/key/version separately, and maybe we want to support reporting decoy data too).

    Sounds good to me. Another easy alternative to get rid of the inconsistency would be to simply report the handshake bytes in neither side for now until we figure out how we want to report the different types of bytes. E.g.:

     0index 73dbb090bc..0bd7d68e96 100644
     1--- a/src/net.cpp
     2+++ b/src/net.cpp
     3@@ -1699,7 +1699,9 @@ std::pair<size_t, bool> CConnman::SocketSendData(CNode& node) const
     4             // Notify transport that bytes have been processed.
     5             node.m_transport->MarkBytesSent(nBytes);
     6             // Update statistics per message type.
     7-            node.AccountForSentBytes(msg_type, nBytes);
     8+            if (!msg_type.empty()) { // don't report v2 handshake bytes for now
     9+                node.AccountForSentBytes(msg_type, nBytes);
    10+            }
    11             nSentSize += nBytes;
    12             if ((size_t)nBytes != data.size()) {
    13                 // could not send full message; stop sending more
    

    (as far as I’m aware, this wouldn’t change the statistics for v1 peers as we haven’t used the empty string as reporting type anywhere before)

  130. DrahtBot removed the label CI failed on Sep 25, 2023
  131. in src/net.h:1575 in 5917b6b5d7 outdated
    1570+        CAddress /*addr_connect*/,
    1571+        bool /*count_failure*/,
    1572+        CSemaphoreGrant /*grant*/,
    1573+        std::string /*dest*/,
    1574+        ConnectionType /*conn_type*/
    1575+    >> m_reconnections GUARDED_BY(m_reconnections_mutex);
    


    ajtowns commented at 12:01 pm on September 25, 2023:
    struct ReconnectionInfo { ... }; std::list<ReconnectionInfo> m_reconnections; ? If you’re going to comment names for the entries, might as well actually give them names.

    sipa commented at 12:38 pm on September 27, 2023:
    Done.
  132. in src/net.cpp:2514 in 5917b6b5d7 outdated
    2504@@ -2459,6 +2505,8 @@ void CConnman::ThreadOpenConnections(const std::vector<std::string> connect)
    2505         if (!interruptNet.sleep_for(std::chrono::milliseconds(500)))
    2506             return;
    2507 
    2508+        PerformReconnections();
    


    ajtowns commented at 12:29 pm on September 25, 2023:

    This seems a little bit complex/brittle doesn’t it?

    The issue I’m thinking of is that ThreadOpenConnections will block on CSemaphoreGrant grant(*semOutbound) waiting for one to disconnect and release its grant, but the peer wanting a reconnection attempt doesn’t release its grant, so the thread here won’t be woken up. That’s okay normally though: there’s slots for 8 outbounds, 2 block relay connections and a feeler/extra block relay; so usually there will be a spare slot, so the code will continue through and arrive back to the 500ms wait, then handle reconnections again. And even if they are all used up, every minute, ThreadOpenAddedConnections will help out, unless all the slots there are also used up.

    I don’t think that’s super problematic – at worst I think it just performs a bit poorly, by delaying a reconnect for a minute or until your feeler/extra block relay peer gets disconnected. But mostly it feels like gating this through semaphores is just a bit too simple for what we’re trying to achieve.

    I guess I think it would be better to drop the semaphore approach and just have a queue of “open connection” work, one or two (or n) threads that attempt to open connections, and then they just wake up via a condition variable to see what work there is to do (outbound, block relay, addnode, or reconnection) and do it.


    sipa commented at 10:36 pm on September 26, 2023:
    @ajtowns I agree it would be a better to have a more integrated design for managing the creation of open connections. Do you think the code changes for that are simple enough to do something like that in this PR, or should we keep it for a follow-up improvement?

    ajtowns commented at 4:07 am on September 27, 2023:
    Seems better as a follow-up to me; even if the code changes ended up being simple, thinking through all the possible edge cases seems complex enough to justify its own PR.

    sipa commented at 12:41 pm on September 27, 2023:
    Listed as a potential follow-up in #27634
  133. in src/net.cpp:1956 in 5917b6b5d7 outdated
    1951+                    reconnections_to_add.emplace_back(
    1952+                        /*addr_connect=*/CAddress{pnode->addr, ServiceFlags{pnode->addr.nServices & ~NODE_P2P_V2}},
    1953+                        /*count_failure=*/pnode->m_count_failure_after_reconnect,
    1954+                        /*grant=*/std::move(pnode->grantOutbound),
    1955+                        /*dest=*/pnode->m_dest,
    1956+                        /*conn_type=*/pnode->m_conn_type);
    


    ajtowns commented at 1:16 pm on September 25, 2023:
    Did you consider an approach where the CNode object is preserved across the reconnection attempt? I guess that would involve an annoying amount of messing with ConnectNode. It would be nice if the peer id didn’t change between the v2 and v1 connection attempts though - it’s a bit awkward to correlate the two attempts when you can’t just grep for peer=nnn

    mzumsande commented at 6:44 pm on September 25, 2023:

    also, I think that this might lead to (temporary) overshoots in the number of connection subtypes that share semOutbound, for example:

    1. Make 8th full-relay connection using v2, get disconnected and add to m_reconnections.
    2. Make another outbound from ThreadOpenConnections()
    3. call PerformReconnections() and make a 9th full outbound

    Although this shouldn’t be too bad since EvictExtraOutboundPeers should deal with this and disconnect one of the extra connections both for block-relay and full outbound case.


    sipa commented at 11:28 pm on September 26, 2023:

    I did consider reusing the CNode object, but it seems hard to do, because in general we only construct a CNode object after the connection is established. I guess it would be possible to instead leave to-be-reconnected nodes in the nodes array, but it would need more substantial changes (factoring out initialization logic from the CNode constructor, figuring out the interaction with various special states, dealing with any code that assumes any node in the array is active, …). This would however solve both of the mentioned issues (node id numbers changing on reconnect, and accounting of connection subtypes).

    Just making the NodeId be reused would be a much smaller change, though I’m not convinced there is no code somewhere that assumes that distinct objects use distinct ids (and it seems possible in theory that the old object survives in m_nodes_disconnected actually while the new one is already created?).

    If possible, I’d prefer to keep this for a follow-up.


    ajtowns commented at 6:04 am on September 27, 2023:

    Certainly fine as a follow-up, if it’s changed at all. I was thinking something along the lines of https://github.com/ajtowns/bitcoin/commits/202309-reconnect-same-cnode might be workable:

     02023-09-27T06:01:39Z [net:debug] trying v2 connection XXX lastseen=0.0hrs
     12023-09-27T06:01:39Z [net] Added connection peer=1
     22023-09-27T06:01:39Z [net] sending version (103 bytes) peer=1
     32023-09-27T06:01:39Z [net] send version message: version 70016, blocks=162396, txrelay=1, peer=1
     42023-09-27T06:01:39Z [net] start sending v2 handshake to peer=1
     52023-09-27T06:01:39Z [net] socket closed for peer=1
     62023-09-27T06:01:39Z [net] disconnecting peer=1
     72023-09-27T06:01:39Z [net] will retry with v1 transport protocol for peer=1
     82023-09-27T06:01:41Z [net] reconnected with v1 transport protocol for peer=1
     92023-09-27T06:01:41Z [net] received: version (103 bytes) peer=1
    102023-09-27T06:01:41Z [net] sending wtxidrelay (0 bytes) peer=1
    

    sipa commented at 12:42 pm on September 27, 2023:
    Listed as a potential follow-up in #27634
  134. in test/functional/test_framework/messages.py:55 in 4988d8ef0c outdated
    51@@ -52,6 +52,7 @@
    52 NODE_WITNESS = (1 << 3)
    53 NODE_COMPACT_FILTERS = (1 << 6)
    54 NODE_NETWORK_LIMITED = (1 << 10)
    55+NODE_P2P_V2 = (1 << 11)
    


    ajtowns commented at 3:04 pm on September 25, 2023:

    This seems to be an okay way of getting addresses to try:

    bitcoin-cli getnodeaddresses 10000 | jq -r '.[] | select(.services >= 2047 and .services < 4096 and .network == "ipv4" and .port == 8333) | .address'

    Gives lots of false positives though (ie, nodes that don’t actually support v2, or addresses that aren’t actually accepting connections).

  135. in src/net.cpp:3679 in bcde50035e outdated
    3612@@ -3602,6 +3613,15 @@ ServiceFlags CConnman::GetLocalServices() const
    3613     return nLocalServices;
    3614 }
    3615 
    3616+static std::unique_ptr<Transport> MakeTransport(NodeId id, bool use_v2transport, bool inbound) noexcept
    3617+{
    3618+    if (use_v2transport) {
    3619+        return std::make_unique<V2Transport>(id, /*initiating=*/!inbound, SER_NETWORK, INIT_PROTO_VERSION);
    3620+    } else {
    3621+        return std::make_unique<V1Transport>(id, SER_NETWORK, INIT_PROTO_VERSION);
    


    ajtowns commented at 3:08 pm on September 25, 2023:
    I wonder how much sense it really makes to have the Transport virtual class – every V2Transport has a V1Transport m_v1_fallback member, so could just use V2Transport everywhere, and set m_recv_state and m_send_state to V1 when we want v1, instead of allocating a different structure.

    sipa commented at 12:19 pm on September 27, 2023:

    Fair point, when starting off with this approach I think we were assuming that downgrade from V2 to V1 would happen at the CNode level rather than inside the transports.

    It’s pretty easy to address this: add a start_v1 parameter to the V2Transport constructors, and change the std::unique_ptr<Transport> to V2Transport. Even if they keep deriving from Transport, the fact that they’re final classes means no virtual function call overhead.

    Still, as long as we don’t enable -v2transport by default though, the virtual class approach makes sure that no V2Transport code is active unless enabled, though at the cost of virtual function calls. I think that’s fine, but it’s also easy to change.


    sipa commented at 12:42 pm on September 27, 2023:
    Listed as a potential follow-up in #27634
  136. in src/net.cpp:1584 in 5917b6b5d7 outdated
    1598+    // we reconnect.
    1599+    if (m_recv_state != RecvState::KEY) return false;
    1600+    // Only when nothing has been received do we reconnect.
    1601+    if (!m_recv_buffer.empty()) return false;
    1602+    // Check if we've sent enough for the other side to disconnect us (if it was V1).
    1603+    LOCK(m_send_mutex);
    


    ajtowns commented at 3:14 pm on September 25, 2023:
    Should you release the m_recv_mutex lock before taking the m_send_mutex lock? Presumably it shouldn’t possibly matter by the time you’re worrying about reconnecting, though.

    sipa commented at 12:24 pm on September 27, 2023:
    I don’t think this matters; the connection is already closed here, so nothing except V2Transport::ShouldReconnectV1 should be accessing these locks anymore anyway. And if m_recv_mutex is released before, an overly keen code reviewer may wonder if the receive state couldn’t change in between the release and the grab of m_send_mutex.
  137. in src/net.cpp:1952 in 5917b6b5d7 outdated
    1943@@ -1917,6 +1944,19 @@ void CConnman::DisconnectNodes()
    1944                 // remove from m_nodes
    1945                 m_nodes.erase(remove(m_nodes.begin(), m_nodes.end(), pnode), m_nodes.end());
    1946 
    1947+                // Add to reconnection list if appropriate. We don't reconnect right here, because
    1948+                // the creation of a connection is a blocking operation (up to several seconds),
    1949+                // and we don't want to hold up the socket handler thread for that long.
    1950+                if (pnode->m_transport->ShouldReconnectV1()) {
    1951+                    reconnections_to_add.emplace_back(
    1952+                        /*addr_connect=*/CAddress{pnode->addr, ServiceFlags{pnode->addr.nServices & ~NODE_P2P_V2}},
    


    ajtowns commented at 3:19 pm on September 25, 2023:
    Is it worth adding a comment that removing NODE_P2P_V2 here is what ensures it’s retried as a v1 connection in ConnectNode ?

    sipa commented at 12:42 pm on September 27, 2023:
    Done.
  138. in src/rpc/net.cpp:292 in 1ca1108308 outdated
    288@@ -289,11 +289,12 @@ static RPCHelpMan addnode()
    289                 {
    290                     {"node", RPCArg::Type::STR, RPCArg::Optional::NO, "The address of the peer to connect to"},
    291                     {"command", RPCArg::Type::STR, RPCArg::Optional::NO, "'add' to add a node to the list, 'remove' to remove a node from the list, 'onetry' to try a connection to the node once"},
    292+                    {"v2transport", RPCArg::Type::BOOL, RPCArg::Default{false}, "Peer supports BIP324 v2 transport protocol"},
    


    mzumsande commented at 3:43 pm on September 25, 2023:
    nit: would prefer “Attempt to connect with BIP324 v2 transport protocol” or something similar, we usually don’t know for sure if the peer actually supports it (and will downgrade if they don’t).

    sipa commented at 12:42 pm on September 27, 2023:
    Done.
  139. ajtowns commented at 4:22 pm on September 25, 2023: contributor

    Code review ACK 964a8b88ef69643181e11475f03f94cc2780772e

    LGTM on the assumption all the prior implementation is also good

  140. in src/net.h:1405 in 1ca1108308 outdated
    1399@@ -1392,7 +1400,10 @@ class CConnman
    1400     const NetGroupManager& m_netgroupman;
    1401     std::deque<std::string> m_addr_fetches GUARDED_BY(m_addr_fetches_mutex);
    1402     Mutex m_addr_fetches_mutex;
    1403-    std::vector<std::string> m_added_nodes GUARDED_BY(m_added_nodes_mutex);
    1404+
    1405+    // connection string and whether to use v2 p2p
    1406+    std::vector<AddedNodeParams> m_added_nodes GUARDED_BY(m_added_nodes_mutex);
    


    mzumsande commented at 4:46 pm on September 25, 2023:
    nit: maybe call it m_added_node_params? The naming becomes a bit confusing, e.g. when we loop over m_added_nodes to access the m_added_node member.

    sipa commented at 12:43 pm on September 27, 2023:
    Renamed, and also renamed some related variables in other places to avoid name collisions.
  141. mzumsande commented at 7:50 pm on September 25, 2023: contributor

    ACK 964a8b88ef69643181e11475f03f94cc2780772e (comments are mostly nits)

    Reviewed the code, and also did some manual testing for the downgrade mechanism. I have tested this on mainnet for more than a week now and not encountered any problems (one v2 peer has been connected for 11 days).

  142. sipa force-pushed on Sep 27, 2023
  143. sipa commented at 12:48 pm on September 27, 2023: member
    @theStack I’ve included your patch to not include v2 handshake bytes in the per-message-type send byte statistics, as a separate commit. I’ve also mentioned the idea as a potential follow-up in #27634. @real-or-random I’ve added a commit to mention BIP324 in doc/bips.md.
  144. DrahtBot added the label CI failed on Sep 27, 2023
  145. sipa force-pushed on Sep 27, 2023
  146. theStack approved
  147. theStack commented at 9:25 pm on September 27, 2023: contributor

    ACK 830b23ab80ac089c15e232d1b1e6bbd781ddc749 :three::two::four:

    I think we also want to mention BIP324 in the release notes, even though it’s default-off? (doesn’t necessarily have to be in this PR though).

  148. DrahtBot requested review from ajtowns on Sep 27, 2023
  149. DrahtBot requested review from mzumsande on Sep 27, 2023
  150. DrahtBot removed the label CI failed on Sep 27, 2023
  151. in src/net.cpp:1582 in 830b23ab80 outdated
    1608+    LOCK(m_recv_mutex);
    1609+    // Only in the very first state (where m_recv_buffer.empty() means nothing received) do
    1610+    // we reconnect.
    1611+    if (m_recv_state != RecvState::KEY) return false;
    1612+    // Only when nothing has been received do we reconnect.
    1613+    if (!m_recv_buffer.empty()) return false;
    


    vasild commented at 12:09 pm on September 28, 2023:
    The two comments don’t quite match the two conditions - the first comment mentions m_recv_buffer.empty() which is checked in the second condition. Maybe move (where m_recv_buffer.empty() means nothing received) from the first comment to the second.

    sipa commented at 5:10 pm on September 28, 2023:
    I’ve rewritten the comments (and merged them into one).
  152. in src/net.cpp:1571 in 830b23ab80 outdated
    1596         m_send_pos = 0;
    1597         ClearShrink(m_send_buffer);
    1598     }
    1599 }
    1600 
    1601+bool V2Transport::ShouldReconnectV1() const noexcept
    


    vasild commented at 12:14 pm on September 28, 2023:
    Commit 75c02891b1 net: use V2Transport when NODE_P2P_V2 service flag is present is somewhat broken in isolation because it would try to connect using V2 without a fallback to V1. This is later amended in commit d22f715331 net: reconnect with V1Transport under certain conditions. Maybe squash both into one commit?

    sipa commented at 4:29 pm on September 28, 2023:
    I feel they’re sufficiently distinct aspects that having them in separate commits is helpful for review. Sure, the first commit on its own may not result in exactly ideal behavior, but I also don’t think it’s broken.
  153. in src/sync.h:397 in 830b23ab80 outdated
    398+    CSemaphoreGrant(CSemaphoreGrant&& other) noexcept
    399+    {
    400+        sem = other.sem;
    401+        fHaveGrant = other.fHaveGrant;
    402+        other.fHaveGrant = false;
    403+    }
    


    vasild commented at 12:29 pm on September 28, 2023:
    Might as well set other.sem = nullptr;.

    sipa commented at 5:10 pm on September 28, 2023:

    Done.

    EDIT: reverted. Apparently the (existing) semantics related moving of semaphore grants is that the grant moves, but the associated semaphore remains.


    vasild commented at 9:52 am on September 29, 2023:

    Is that necessary? I find it unexpected to leave the semaphore. Usually moved-from objects remain in some empty but valid state, e.g. as if default constructed. A semaphore grant that has a semaphore in it is not that. Did something break when you set other.sem = nullptr? I tried this patch:

     0@@ -390,20 +390,22 @@ public:
     1     // Allow move.
     2     CSemaphoreGrant(CSemaphoreGrant&& other) noexcept
     3     {   
     4         sem = other.sem;
     5         fHaveGrant = other.fHaveGrant;
     6         other.fHaveGrant = false;
     7+        other.sem = nullptr;
     8     }
     9     
    10     CSemaphoreGrant& operator=(CSemaphoreGrant&& other) noexcept
    11     {   
    12         Release();
    13         sem = other.sem;
    14         fHaveGrant = other.fHaveGrant;
    15         other.fHaveGrant = false;
    16+        other.sem = nullptr;
    17         return *this;
    18     }
    

    and both unit tests and functional tests passed. I think the moved-from semaphore grant is not used after the move and is destroyed shortly afterwards.


    sipa commented at 11:39 am on September 29, 2023:

    It very quickly crashes on normal mainnet for me with this patch, possibly because it’s related to automatic connections which aren’t used in the functional test framework.

    I do want to investigate more what’s actually necessary here, because I agree that relying on the state of a moved-from object is strange.


    Sjors commented at 4:29 pm on September 29, 2023:

    ThreadOpenAddedConnections creates a CSemaphoreGrant before it loops over vInfo. Maybe that causes problems for a subsequent item? This function seems to relate to manually added connections.

    Most functional tests use the onetry command when they call addnode, which calls OpenNetworkConnection directly. Except for rpc_net.py which uses add, but it’s never trying more than 1 node at a time.

    I’m able to illustrate the crash by modifying the above test to add a second peer connection: https://github.com/Sjors/bitcoin/commit/0079c5143a4efc50a016e970bcedbbf6eca79007 (and lowering the wait in ThreadOpenAddedConnections and having the test wait a bit)


    vasild commented at 5:10 pm on September 29, 2023:
    Hmm, so the code in CConnman::ThreadOpenAddedConnections() acquires the grant once and then opens connections to all entries in vInfo. Is this intended? Looks like a bug to me. It might as well exceed MAX_ADDNODE_CONNECTIONS. Unrelated to this PR, but should it acquire the grant inside the for (const AddedNodeInfo& info : vInfo) { loop?

    mzumsande commented at 5:19 pm on September 29, 2023:
    Wouldn’t grant.TryAcquire() inside the for loop fail and break once we reach MAX_ADDNODE_CONNECTIONS?

    ajtowns commented at 5:26 pm on September 29, 2023:

    @vasild Agree. The code is fine at the moment, because passing &grant into OpenNetworkConnections at most only releases our grant, without invalidating it. Changing it to invalidate the grant makes our next addnode attempt fail due to reusing an invalid object.

    (There’s no bug here I think – if we hid max addnode connections, the grant.TryAcquire() will fail and we’ll break out of the loop. Moving it into the loop as CSemaphoreGrant grant(*semAddNode, /*fTry=*/tried); would work, I think)


    vasild commented at 5:48 pm on September 29, 2023:

    Right, I did not see the TryAcquire() call :facepalm:, no bug in master.

    Now the question that is relevant to this PR: what about moving the grant creation inside the for loop:

     0--- i/src/net.cpp
     1+++ w/src/net.cpp
     2@@ -2861,18 +2861,18 @@ std::vector<AddedNodeInfo> CConnman::GetAddedNodeInfo() const
     3 void CConnman::ThreadOpenAddedConnections()
     4 {   
     5     AssertLockNotHeld(m_unused_i2p_sessions_mutex);
     6     AssertLockNotHeld(m_reconnections_mutex);
     7     while (true)
     8     {
     9-        CSemaphoreGrant grant(*semAddnode);
    10         std::vector<AddedNodeInfo> vInfo = GetAddedNodeInfo();
    11         bool tried = false;
    12         for (const AddedNodeInfo& info : vInfo) {
    13             if (!info.fConnected) {
    14-                if (!grant.TryAcquire()) {
    15+                CSemaphoreGrant grant(*semAddnode, /*fTry=*/true);
    16+                if (!grant) {
    17                     // If we've used up our semaphore and need a new one, let's not wait here since while we are waiti
    18                     // the addednodeinfo state might change.
    19                     break;
    20                 }
    

    and having the move semantic (more naturally) leave the moved-from object in a default-constructed state (have grant=false, sem=null)?


    sipa commented at 6:21 pm on September 29, 2023:
    I’ve re-introduced the other.sem = nullptr; statements in the move constructor/assignment, and made the corresponding fix in ThreadOpenAddedConnections too. I’ve also refactored the semaphore modernization commit a bit more to get rid of the pointers and use rvalue reference instead when moving is intended (to make it clearer when a variable shouldn’t be used anymore).

    sipa commented at 6:22 pm on September 29, 2023:
    @Sjors Thanks a lot for reproducing the error. It helped locate the issue, and I used it to confirm the code I just pushed fixes the problem. Would you mind adding the test as a separate PR, if possible?

    Sjors commented at 6:53 am on September 30, 2023:

    Nice! I don’t think my test is usable other than for manual testing. We currently can’t mock the 60 second delay involved. We also can’t configure the maximum connection count so it would need 10 test nodes, though that’s doable. I’ll open an issue.

    See also #28464 for some related changes.


    Sjors commented at 9:10 am on October 2, 2023:
    Updated test that demonstrates the crash issue has been fixed: https://github.com/Sjors/bitcoin/commit/bc3995c66300fb3139c5360e8f6ee807b02ea86e

    Sjors commented at 8:57 am on October 11, 2023:
    Issue to add a test: #28635
  154. in src/net.cpp:1961 in 830b23ab80 outdated
    1987+                        // Removing NODE_P2P_V2 here ensures we reconnect using V1.
    1988+                        .addr_connect = CAddress{pnode->addr, ServiceFlags{pnode->addr.nServices & ~NODE_P2P_V2}},
    1989+                        .count_failure = pnode->m_count_failure_after_reconnect,
    1990+                        .grant = std::move(pnode->grantOutbound),
    1991+                        .destination = pnode->m_dest,
    1992+                        .conn_type = pnode->m_conn_type});
    


    vasild commented at 12:45 pm on September 28, 2023:
    This will create a temporary object of type ReconnectionInfo, and will move-construct a new element in reconnections_to_add from the temporary. This defeats the purpose of emplace_back(), might as well use push_back(), or drop ReconnectionInfo{.

    sipa commented at 5:10 pm on September 28, 2023:
    Switched to push_back and dropped ReconnectionInfo{.
  155. in src/net.h:742 in 830b23ab80 outdated
    738@@ -704,8 +739,12 @@ class CNode
    739     // Bind address of our side of the connection
    740     const CAddress addrBind;
    741     const std::string m_addr_name;
    742+    /** The pszDest argument provided ConnectNode(). Only used for reconnections. */
    


    vasild commented at 1:06 pm on September 28, 2023:
    0    /** The pszDest argument provided to ConnectNode(). Only used for reconnections. */
    

    sipa commented at 5:10 pm on September 28, 2023:
    Done.
  156. in src/net.cpp:1956 in 830b23ab80 outdated
    1978@@ -1906,6 +1979,20 @@ void CConnman::DisconnectNodes()
    1979                 // remove from m_nodes
    1980                 m_nodes.erase(remove(m_nodes.begin(), m_nodes.end(), pnode), m_nodes.end());
    1981 
    1982+                // Add to reconnection list if appropriate. We don't reconnect right here, because
    1983+                // the creation of a connection is a blocking operation (up to several seconds),
    1984+                // and we don't want to hold up the socket handler thread for that long.
    1985+                if (pnode->m_transport->ShouldReconnectV1()) {
    


    vasild commented at 1:10 pm on September 28, 2023:
    When we fail to connect using V2 and succeed to connect using V1, should we bother removing NODE_P2P_V2 for that node from addrman?

    sipa commented at 4:32 pm on September 28, 2023:
    That already happens; whenever an outbound peer’s services are received from them, we overwrite their entry in addrman (see m_addrman.SetServices(pfrom.addr, nServices); line in net_processing). It’s not dependent on the actual transport used for the current connection, but I don’t think that incorporating that is helpful. If a peer lies about their supported services, they only hurt their own ability of being connected to.

    Sjors commented at 11:17 am on October 2, 2023:

    If a peer lies about their supported services, they only hurt their own ability of being connected to

    Peers could lie about each-other in gossip. But it doesn’t matter here, as you say.


    vasild commented at 12:21 pm on October 2, 2023:

    Worth noting here is that if an address is already in addrman and we receive it via gossip we update its services, but in a cumulative manner - only add new services, without removing existent services from addrman:

    https://github.com/bitcoin/bitcoin/blob/dcf6230f92d491f46d2bf6cfc096ab5874e385c9/src/addrman.cpp#L572


    sipa commented at 1:32 pm on October 2, 2023:
    @vasild FWIW, that’s the primary reason why we really need a v1-reconnect concept when v2 connections fail; otherwise it’d be trivial for attackers to inject the NODE_P2P_V2 flag in all addresses they relay and cause widespread inability to connect.
  157. in src/node/connection_types.h:88 in 830b23ab80 outdated
    78@@ -79,4 +79,14 @@ enum class ConnectionType {
    79 /** Convert ConnectionType enum to a string value */
    80 std::string ConnectionTypeAsString(ConnectionType conn_type);
    81 
    82+/** Transport layer version */
    83+enum class TransportProtocolType {
    84+    DETECTING, //!< Peer could be v1 or v2
    85+    V1, //!< Unencrypted, plaintext protocol
    86+    V2, //!< BIP324 protocol
    87+};
    


    vasild commented at 1:22 pm on September 28, 2023:

    I don’t know why I don’t see this warning when compiling, but only inside youcompleteme. Both should be using the same flags. Anyway, mentioning it here:

    src/node/connection_types.h|17 col 12 error| Enum ‘ConnectionType’ uses a larger base type (‘int’, size: 4 bytes) than necessary for its value set, consider using ‘std::uint8_t’ (1 byte) as the base type to reduce its size [performance- enum-size]


    sipa commented at 5:11 pm on September 28, 2023:
    Done.
  158. vasild approved
  159. vasild commented at 1:27 pm on September 28, 2023: contributor
    ACK 830b23ab80ac089c15e232d1b1e6bbd781ddc749
  160. sipa force-pushed on Sep 28, 2023
  161. DrahtBot added the label CI failed on Sep 28, 2023
  162. sipa force-pushed on Sep 28, 2023
  163. sipa force-pushed on Sep 28, 2023
  164. sipa commented at 8:38 pm on September 28, 2023: member

    Rebased on top of #28525 as I don’t think we should merge this PR without the BIP change, and this makes testing of the combined result easier.

    Note that this is a breaking change: earlier versions of this PR won’t be able to establish v2 connections with the new version.

  165. DrahtBot removed the label CI failed on Sep 28, 2023
  166. sipa force-pushed on Sep 28, 2023
  167. Sjors commented at 7:41 am on September 29, 2023: member
    Running e69f990
  168. vasild approved
  169. vasild commented at 11:25 am on September 29, 2023: contributor
    ACK e69f990618e489b9576cd50c312afa0f99db2c10
  170. DrahtBot requested review from theStack on Sep 29, 2023
  171. in test/functional/p2p_v2_transport.py:13 in e07a6bff37 outdated
     8+
     9+from test_framework.messages import NODE_P2P_V2
    10+from test_framework.test_framework import BitcoinTestFramework
    11+from test_framework.util import assert_equal
    12+
    13+class V2TransportTest(BitcoinTestFramework):
    


    Sjors commented at 12:52 pm on September 29, 2023:
    af8a90d0c27c3b0787b66094e7db6662596bed71 test seems unrelated to this commit and probably could have been introduced earlier in 9b378d3093a3c7ae3c944e469faf5bea67459b07.

    sipa commented at 6:22 pm on September 29, 2023:
    Good point, done.
  172. in src/net.cpp:472 in e07a6bff37 outdated
    471@@ -472,7 +472,7 @@ CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCo
    472         const std::vector<CService> resolved{Lookup(pszDest, default_port, fNameLookup && !HaveNameProxy(), 256)};
    473         if (!resolved.empty()) {
    474             const CService& rnd{resolved[GetRand(resolved.size())]};
    475-            addrConnect = CAddress{MaybeFlipIPv6toCJDNS(rnd), NODE_NONE};
    476+            addrConnect = CAddress{MaybeFlipIPv6toCJDNS(rnd), addrConnect.nServices};
    


    Sjors commented at 1:00 pm on September 29, 2023:

    af8a90d0c27c3b0787b66094e7db6662596bed71: isn’t it usually (always?) the case that either addrConnect or pszDest is provided by the caller? In which case this wouldn’t do anything, but it’s not harmful.

    cc @vasild who added this in #23077.


    sipa commented at 4:42 pm on September 29, 2023:
    “in which case this wouldn’t do anything” -> what is the “this” referring to?

    vasild commented at 4:58 pm on September 29, 2023:

    either addrConnect or pszDest is provided by the caller

    That used to be the case before this PR. This PR changes that to pass information via both params. I do not like this as I find it confusing but I did not bother mentioning it as too minor and subjective and I did not see a clear better way. Now that this has been raised already:

    This PR uses addrConnect to pass the service flags even if the address to connect to is in pszDest. Can this be improved? Maybe instead of pzsDest pass struct X { const char* dest, ServiceFlags services}? And then pass std::variant of either CAddress or X (one argument instead of two).


    sipa commented at 6:27 pm on September 29, 2023:

    I agree with making a change like this, but I fear the scope of that could easily grow too large for this PR. There are several other places in the network code where this “string name or CAddress” variant concept exist, and if we’re going to introduce a variant type or something similar for that, I’d prefer for it to be done everywhere. It is also relevant for #28155 (@sr-gi) I think.

    If you feel the mixing of the two argument is a problem in this PR, what about the more narrow change of adding a use_v2transport boolean argument to ConnectNode and OpenNetworkConnection instead, and ignoring the addrConnect.nServices field?


    Sjors commented at 6:49 am on September 30, 2023:

    what is the “this” referring to?

    Setting it to addrConnect.nServices instead of leaving the default initialized NODE_NONE.

    This PR changes that to pass information via both params. I do not like this

    Ah, I didn’t notice that. No strong opinion but in that case I think we should document the pszDest and addrConnect arguments of ConnectNode and OpenNetworkConnection.

    adding a use_v2transport boolean argument to ConnectNode and OpenNetworkConnection instead

    That sounds good.


    sipa commented at 1:12 pm on September 30, 2023:
    I’ve pushed again with this approach.
  173. sipa force-pushed on Sep 29, 2023
  174. Sjors commented at 4:31 pm on September 29, 2023: member
    Reviewed down to 7ad356c3e1d7319af7f9c96a1bfa30b997c6adde.
  175. sipa force-pushed on Sep 29, 2023
  176. sipa force-pushed on Sep 29, 2023
  177. DrahtBot added the label CI failed on Sep 29, 2023
  178. DrahtBot removed the label CI failed on Sep 29, 2023
  179. sipa force-pushed on Sep 30, 2023
  180. sipa commented at 1:14 pm on September 30, 2023: member

    I’ve added a use_v2transport argument to OpenNetworkConnection and ConnectNode, rather than looking at the addrConnect.nServices (as not all call sites use addrConnect):

      0diff --git a/src/net.cpp b/src/net.cpp
      1index 8bd4aebbae..7e70a0586a 100644
      2--- a/src/net.cpp
      3+++ b/src/net.cpp
      4@@ -439,7 +439,7 @@ static CAddress GetBindAddress(const Sock& sock)
      5     return addr_bind;
      6 }
      7 
      8-CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCountFailure, ConnectionType conn_type)
      9+CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCountFailure, ConnectionType conn_type, bool use_v2transport)
     10 {
     11     AssertLockNotHeld(m_unused_i2p_sessions_mutex);
     12     assert(conn_type != ConnectionType::INBOUND);
     13@@ -457,9 +457,6 @@ CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCo
     14         }
     15     }
     16 
     17-    // Use v2 transport when both us and them signal NODE_P2P_V2.
     18-    const bool use_v2transport(addrConnect.nServices & GetLocalServices() & NODE_P2P_V2);
     19-
     20     LogPrintLevel(BCLog::NET, BCLog::Level::Debug, "trying %s connection %s lastseen=%.1fhrs\n",
     21         use_v2transport ? "v2" : "v1",
     22         pszDest ? pszDest : addrConnect.ToStringAddrPort(),
     23@@ -572,7 +569,6 @@ CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCo
     24     if (!addr_bind.IsValid()) {
     25         addr_bind = GetBindAddress(*sock);
     26     }
     27-
     28     CNode* pnode = new CNode(id,
     29                              std::move(sock),
     30                              addrConnect,
     31@@ -1920,7 +1916,7 @@ bool CConnman::AddConnection(const std::string& address, ConnectionType conn_typ
     32     CSemaphoreGrant grant(*semOutbound, true);
     33     if (!grant) return false;
     34 
     35-    OpenNetworkConnection(CAddress(), false, std::move(grant), address.c_str(), conn_type);
     36+    OpenNetworkConnection(CAddress(), false, std::move(grant), address.c_str(), conn_type, /*use_v2transport=*/false);
     37     return true;
     38 }
     39 
     40@@ -1958,8 +1954,7 @@ void CConnman::DisconnectNodes()
     41                 // and we don't want to hold up the socket handler thread for that long.
     42                 if (pnode->m_transport->ShouldReconnectV1()) {
     43                     reconnections_to_add.push_back({
     44-                        // Removing NODE_P2P_V2 here ensures we reconnect using V1.
     45-                        .addr_connect = CAddress{pnode->addr, ServiceFlags{pnode->addr.nServices & ~NODE_P2P_V2}},
     46+                        .addr_connect = pnode->addr,
     47                         .count_failure = pnode->m_count_failure_after_reconnect,
     48                         .grant = std::move(pnode->grantOutbound),
     49                         .destination = pnode->m_dest,
     50@@ -2378,7 +2373,7 @@ void CConnman::ProcessAddrFetch()
     51     CAddress addr;
     52     CSemaphoreGrant grant(*semOutbound, /*fTry=*/true);
     53     if (grant) {
     54-        OpenNetworkConnection(addr, false, std::move(grant), strDest.c_str(), ConnectionType::ADDR_FETCH);
     55+        OpenNetworkConnection(addr, false, std::move(grant), strDest.c_str(), ConnectionType::ADDR_FETCH, /*use_v2transport=*/false);
     56     }
     57 }
     58 
     59@@ -2481,7 +2476,7 @@ void CConnman::ThreadOpenConnections(const std::vector<std::string> connect)
     60             for (const std::string& strAddr : connect)
     61             {
     62                 CAddress addr(CService(), NODE_NONE);
     63-                OpenNetworkConnection(addr, false, {}, strAddr.c_str(), ConnectionType::MANUAL);
     64+                OpenNetworkConnection(addr, false, {}, strAddr.c_str(), ConnectionType::MANUAL, /*use_v2transport=*/false);
     65                 for (int i = 0; i < 10 && i < nLoop; i++)
     66                 {
     67                     if (!interruptNet.sleep_for(std::chrono::milliseconds(500)))
     68@@ -2786,7 +2781,9 @@ void CConnman::ThreadOpenConnections(const std::vector<std::string> connect)
     69             // Don't record addrman failure attempts when node is offline. This can be identified since all local
     70             // network connections (if any) belong in the same netgroup, and the size of `outbound_ipv46_peer_netgroups` would only be 1.
     71             const bool count_failures{((int)outbound_ipv46_peer_netgroups.size() + outbound_privacy_network_peers) >= std::min(nMaxConnections - 1, 2)};
     72-            OpenNetworkConnection(addrConnect, count_failures, std::move(grant), /*strDest=*/nullptr, conn_type);
     73+            // Use BIP324 transport when both us and them have NODE_V2_P2P set.
     74+            const bool use_v2transport(addrConnect.nServices & GetLocalServices() & NODE_P2P_V2);
     75+            OpenNetworkConnection(addrConnect, count_failures, std::move(grant), /*strDest=*/nullptr, conn_type, use_v2transport);
     76         }
     77     }
     78 }
     79@@ -2876,12 +2873,7 @@ void CConnman::ThreadOpenAddedConnections()
     80                 }
     81                 tried = true;
     82                 CAddress addr(CService(), NODE_NONE);
     83-
     84-                if (info.m_params.m_use_v2transport) {
     85-                    addr.nServices = ServiceFlags(addr.nServices | NODE_P2P_V2);
     86-                }
     87-
     88-                OpenNetworkConnection(addr, false, std::move(grant), info.m_params.m_added_node.c_str(), ConnectionType::MANUAL);
     89+                OpenNetworkConnection(addr, false, std::move(grant), info.m_params.m_added_node.c_str(), ConnectionType::MANUAL, info.m_params.m_use_v2transport);
     90                 if (!interruptNet.sleep_for(std::chrono::milliseconds(500)))
     91                     return;
     92             }
     93@@ -2895,7 +2887,7 @@ void CConnman::ThreadOpenAddedConnections()
     94 }
     95 
     96 // if successful, this moves the passed grant to the constructed node
     97-void CConnman::OpenNetworkConnection(const CAddress& addrConnect, bool fCountFailure, CSemaphoreGrant&& grant_outbound, const char *pszDest, ConnectionType conn_type)
     98+void CConnman::OpenNetworkConnection(const CAddress& addrConnect, bool fCountFailure, CSemaphoreGrant&& grant_outbound, const char *pszDest, ConnectionType conn_type, bool use_v2transport)
     99 {
    100     AssertLockNotHeld(m_unused_i2p_sessions_mutex);
    101     assert(conn_type != ConnectionType::INBOUND);
    102@@ -2917,7 +2909,7 @@ void CConnman::OpenNetworkConnection(const CAddress& addrConnect, bool fCountFai
    103     } else if (FindNode(std::string(pszDest)))
    104         return;
    105 
    106-    CNode* pnode = ConnectNode(addrConnect, pszDest, fCountFailure, conn_type);
    107+    CNode* pnode = ConnectNode(addrConnect, pszDest, fCountFailure, conn_type, use_v2transport);
    108 
    109     if (!pnode)
    110         return;
    111@@ -3854,7 +3846,8 @@ void CConnman::PerformReconnections()
    112                               item.count_failure,
    113                               std::move(item.grant),
    114                               item.destination.empty() ? nullptr : item.destination.c_str(),
    115-                              item.conn_type);
    116+                              item.conn_type,
    117+                              /*use_v2transport=*/false);
    118     }
    119 }
    120 
    121diff --git a/src/net.h b/src/net.h
    122index c9a0afdd97..25a9814f63 100644
    123--- a/src/net.h
    124+++ b/src/net.h
    125@@ -1139,7 +1139,7 @@ public:
    126     bool GetNetworkActive() const { return fNetworkActive; };
    127     bool GetUseAddrmanOutgoing() const { return m_use_addrman_outgoing; };
    128     void SetNetworkActive(bool active);
    129-    void OpenNetworkConnection(const CAddress& addrConnect, bool fCountFailure, CSemaphoreGrant&& grant_outbound, const char* strDest, ConnectionType conn_type) EXCLUSIVE_LOCKS_REQUIRED(!m_unused_i2p_sessions_mutex);
    130+    void OpenNetworkConnection(const CAddress& addrConnect, bool fCountFailure, CSemaphoreGrant&& grant_outbound, const char* strDest, ConnectionType conn_type, bool use_v2transport) EXCLUSIVE_LOCKS_REQUIRED(!m_unused_i2p_sessions_mutex);
    131     bool CheckIncomingNonce(uint64_t nonce);
    132 
    133     // alias for thread safety annotations only, not defined
    134@@ -1355,7 +1355,7 @@ private:
    135     bool AlreadyConnectedToAddress(const CAddress& addr);
    136 
    137     bool AttemptToEvictConnection();
    138-    CNode* ConnectNode(CAddress addrConnect, const char *pszDest, bool fCountFailure, ConnectionType conn_type) EXCLUSIVE_LOCKS_REQUIRED(!m_unused_i2p_sessions_mutex);
    139+    CNode* ConnectNode(CAddress addrConnect, const char *pszDest, bool fCountFailure, ConnectionType conn_type, bool use_v2transport) EXCLUSIVE_LOCKS_REQUIRED(!m_unused_i2p_sessions_mutex);
    140     void AddWhitelistPermissionFlags(NetPermissionFlags& flags, const CNetAddr &addr) const;
    141 
    142     void DeleteNode(CNode* pnode);
    143diff --git a/src/rpc/net.cpp b/src/rpc/net.cpp
    144index 8f6ec29813..8d796b8e9b 100644
    145--- a/src/rpc/net.cpp
    146+++ b/src/rpc/net.cpp
    147@@ -299,7 +299,7 @@ static RPCHelpMan addnode()
    148                 {
    149                     {"node", RPCArg::Type::STR, RPCArg::Optional::NO, "The address of the peer to connect to"},
    150                     {"command", RPCArg::Type::STR, RPCArg::Optional::NO, "'add' to add a node to the list, 'remove' to remove a node from the list, 'onetry' to try a connection to the node once"},
    151-                    {"v2transport", RPCArg::Type::BOOL, RPCArg::Default{false}, "Attempt to connect using BIP324 v2 transport protocol"},
    152+                    {"v2transport", RPCArg::Type::BOOL, RPCArg::Default{false}, "Attempt to connect using BIP324 v2 transport protocol (ignored for 'remove' command)"},
    153                 },
    154                 RPCResult{RPCResult::Type::NONE, "", ""},
    155                 RPCExamples{
    156@@ -327,10 +327,7 @@ static RPCHelpMan addnode()
    157     if (command == "onetry")
    158     {
    159         CAddress addr;
    160-        if (use_v2transport) {
    161-            addr.nServices = ServiceFlags(addr.nServices | NODE_P2P_V2);
    162-        }
    163-        connman.OpenNetworkConnection(addr, /*fCountFailure=*/false, /*grant_outbound=*/{}, node_arg.c_str(), ConnectionType::MANUAL);
    164+        connman.OpenNetworkConnection(addr, /*fCountFailure=*/false, /*grant_outbound=*/{}, node_arg.c_str(), ConnectionType::MANUAL, use_v2transport);
    165         return UniValue::VNULL;
    166     }
    
  181. in src/net.cpp:1605 in d5c8b42a16 outdated
    1600+    Transport::Info info;
    1601+
    1602+    // Do not report v2 and session ID until the garbage authentication packet has been received
    1603+    // and verified (confirming that the other side very likely has the same keys as us).
    1604+    if (m_recv_state != RecvState::KEY_MAYBE_V1 && m_recv_state != RecvState::KEY &&
    1605+        m_recv_state != RecvState::GARB_GARBTERM) {
    


    theStack commented at 7:38 pm on October 1, 2023:

    I’ve intentionally tested connecting to a node running a version that still uses the garbage authentication packet (i.e. it doesn’t have the recent BIP324 change included yet: #28525) and noticed that we report v2 and the session id too early, before the garbage authentication (now included in VERSION) is received. I think that’s fixed by the following patch:

    0        m_recv_state != RecvState::GARB_GARBTERM && m_recv_state != RecvState::VERSION) {
    

    sipa commented at 3:09 am on October 2, 2023:
    Nice catch. Since the version packet now takes on the role of garbage authentication, we need to wait for the version packet before deciding it’s confirmed to be a v2 peer.
  182. in src/net.cpp:2772 in 5784e825b7 outdated
    2779@@ -2780,19 +2780,19 @@ void CConnman::ThreadOpenAddedConnections()
    2780     AssertLockNotHeld(m_unused_i2p_sessions_mutex);
    2781     while (true)
    2782     {
    2783-        CSemaphoreGrant grant(*semAddnode);
    


    mzumsande commented at 2:00 am on October 2, 2023:
    I think that this is a small change in behavior (fyi @vasild because of #28331 (review)): On master, the thread will block here if all addnode slots are taken because we call Acquire() (and not TryAcquire()) so there won’t be any sleep_for and no continuous calls to GetAddedNodeInfo() in this situation (until the grant is released because one of the existing added nodes has disconnected). With this PR removing non-fTry acquiry of the grant, it means that we will now continuously cycle through the while loop, calling GetAddedNodeInfo() and sleeping for 2 seconds in the same situation where all addnode slots are taken. Is this change intended / better than the behavior on master? On first glance, repeated calls to GetAddedNodeInfo() seem unnecessary when we don’t have a grant anyway, so the behavior on master seems more natural.

    sipa commented at 3:10 am on October 2, 2023:
    Nice find. I’ve adapted the code to match the old behavior (I hope; please have a look); even if we decide different behavior is desirable, this isn’t the PR to introduce it.

    mzumsande commented at 9:40 pm on October 2, 2023:
    Looks good, I tested that the behavior doesn’t change anymore.
  183. sipa force-pushed on Oct 2, 2023
  184. sipa commented at 3:10 am on October 2, 2023: member

    Changes to address the comments by @mzumsande and @theStack above:

     0diff --git a/src/net.cpp b/src/net.cpp
     1index 7e70a0586a..a17c215a2c 100644
     2--- a/src/net.cpp
     3+++ b/src/net.cpp
     4@@ -1603,10 +1603,10 @@ Transport::Info V2Transport::GetInfo() const noexcept
     5 
     6     Transport::Info info;
     7 
     8-    // Do not report v2 and session ID until the garbage authentication packet has been received
     9+    // Do not report v2 and session ID until the version packet has been received
    10     // and verified (confirming that the other side very likely has the same keys as us).
    11     if (m_recv_state != RecvState::KEY_MAYBE_V1 && m_recv_state != RecvState::KEY &&
    12-        m_recv_state != RecvState::GARB_GARBTERM) {
    13+        m_recv_state != RecvState::GARB_GARBTERM && m_recv_state != RecvState::VERSION) {
    14         info.transport_type = TransportProtocolType::V2;
    15         info.session_id = uint256(MakeUCharSpan(m_cipher.GetSessionID()));
    16     } else {
    17@@ -2861,11 +2861,12 @@ void CConnman::ThreadOpenAddedConnections()
    18     AssertLockNotHeld(m_reconnections_mutex);
    19     while (true)
    20     {
    21+        CSemaphoreGrant grant(*semAddnode);
    22         std::vector<AddedNodeInfo> vInfo = GetAddedNodeInfo();
    23         bool tried = false;
    24         for (const AddedNodeInfo& info : vInfo) {
    25             if (!info.fConnected) {
    26-                CSemaphoreGrant grant(*semAddnode, /*fTry=*/true);
    27+                if (!grant) grant = CSemaphoreGrant(*semAddnode, /*fTry=*/true);
    28                 if (!grant) {
    29                     // If we've used up our semaphore and need a new one, let's not wait here since while we are waiting
    30                     // the addednodeinfo state might change.
    
  185. DrahtBot added the label CI failed on Oct 2, 2023
  186. fanquake commented at 9:15 am on October 2, 2023: member

    From clang-tidy (https://github.com/bitcoin/bitcoin/pull/28331/checks?check_run_id=17301328580):

    0clang-tidy-16 -p=/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu -quiet -load=/tidy-build/libbitcoin-tidy.so /ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/net.cpp
    1net.cpp:2869:22: error: 'grant' used after it was moved [bugprone-use-after-move,-warnings-as-errors]
    2                if (!grant) grant = CSemaphoreGrant(*semAddnode, /*fTry=*/true);
    3                     ^
    4net.cpp:2877:17: note: move occurred here
    5                OpenNetworkConnection(addr, false, std::move(grant), info.m_params.m_added_node.c_str(), ConnectionType::MANUAL, info.m_params.m_use_v2transport);
    6                ^
    7--
    
  187. in src/rpc/net.cpp:306 in cd5e72d6fb outdated
    303                 },
    304                 RPCResult{RPCResult::Type::NONE, "", ""},
    305                 RPCExamples{
    306-                    HelpExampleCli("addnode", "\"192.168.0.6:8333\" \"onetry\"")
    307-            + HelpExampleRpc("addnode", "\"192.168.0.6:8333\", \"onetry\"")
    308+                    HelpExampleCli("addnode", "\"192.168.0.6:8333\" \"onetry\" true")
    


    vasild commented at 10:09 am on October 2, 2023:
    This adds a way to specify a v2 peer to addnode. What about something similar for -connect= peers? Right now there is no way to specify “use v2” for addresses provided to -connect=. What about if -v2transport=1 then always try v2 first (and on failure v1) for all addnode and connect peers? Then this extra argument to addnode is not needed.

    sipa commented at 1:24 pm on October 2, 2023:

    There have been several comments related to this before. The situation is certainly incomplete w.r.t. the ability to specify v1/v2 for manual connections (e.g. also the -addnode / -seednode options lacks that), but my goal with this PR is getting the minimal in to support testing. Making things convenient for wider adoption I think needs some more UX discussion (e.g. do we permit specifying IP address/ranges to be V2, or do we use a network permission, or some special suffix -addnode=IP@v2; also how does this interact with a potential future where peer pubkeys can be specified, …), which I think is better done separately.

    Regarding automatically assuming v2 for all addnode… that would (for now) mean 2 connection attempts for every manual peer; I’m not sure that’s an acceptable tradeoff right now.

  188. vasild approved
  189. vasild commented at 10:12 am on October 2, 2023: contributor

    ACK cd5e72d6fb82c580dbf693b14e842d6204327311 modulo the use-after-move which I think is benign but should be silenced regardless. This should resolve it (untested):

    0                 OpenNetworkConnection(addr, false, std::move(grant), info.m_params.m_added_node.c_str(), ConnectionTyp
    1+                grant = CSemaphoreGrant{};
    2                 if (!interruptNet.sleep_for(std::chrono::milliseconds(500)))
    
  190. DrahtBot requested review from theStack on Oct 2, 2023
  191. DrahtBot requested review from mzumsande on Oct 2, 2023
  192. in src/net.cpp:1929 in 6a2dbed120 outdated
    1888@@ -1868,6 +1889,11 @@ bool CConnman::AddConnection(const std::string& address, ConnectionType conn_typ
    1889 
    1890 void CConnman::DisconnectNodes()
    1891 {
    1892+    AssertLockNotHeld(m_nodes_mutex);
    1893+    AssertLockNotHeld(m_reconnections_mutex);
    1894+
    1895+    decltype(m_reconnections) reconnections_to_add;
    


    Sjors commented at 11:23 am on October 2, 2023:
    6a2dbed1209b18d923da64b019f6f378d3c30695: could a comment here that we // Use temporary variable to track new reconnection candidates, because m_reconnections_mutex may be blocked [also for several seconds??? or much shorter but still don't want that] (supplementing the comment below why we don’t do the reconnect itself here either).

    sipa commented at 1:24 pm on October 2, 2023:
    It’s to avoid a lock dependency between m_nodes_mutex and m_reconnections_mutex. I’ve added a comment.
  193. in src/net.cpp:3818 in 6a2dbed120 outdated
    3813+        OpenNetworkConnection(item.addr_connect,
    3814+                              item.count_failure,
    3815+                              std::move(item.grant),
    3816+                              item.destination.empty() ? nullptr : item.destination.c_str(),
    3817+                              item.conn_type,
    3818+                              /*use_v2transport=*/false);
    


    Sjors commented at 11:30 am on October 2, 2023:
    6a2dbed1209b18d923da64b019f6f378d3c30695 ReconnectionInfo could specify use_v2transport so that PerformReconnections can be used more broadly than just v2 -> v1 downgrades. Can always be done later of course.

    sipa commented at 1:24 pm on October 2, 2023:
    Good idea, done.
  194. in src/net.h:381 in 6a2dbed120 outdated
    360@@ -361,6 +361,11 @@ class Transport {
    361 
    362     /** Return the memory usage of this transport attributable to buffered data to send. */
    363     virtual size_t GetSendMemoryUsage() const noexcept = 0;
    364+
    365+    // 3. Miscellaneous functions.
    366+
    367+    /** Whether upon disconnections, a reconnect with V1 is warranted. */
    368+    virtual bool ShouldReconnectV1() const noexcept = 0;
    


    Sjors commented at 11:31 am on October 2, 2023:
    6a2dbed1209b18d923da64b019f6f378d3c30695 could make this more generic ShouldReconnect() (see comment above).

    sipa commented at 1:25 pm on October 2, 2023:
    I don’t think that works, because even while ReconnectionInfo can be generic for any kind of reconnection attempt, this function and its call site are very much specific to a V2 connection downgrading to a V1.
  195. in src/net.cpp:1925 in 6a2dbed120 outdated
    1920+                // the creation of a connection is a blocking operation (up to several seconds),
    1921+                // and we don't want to hold up the socket handler thread for that long.
    1922+                if (pnode->m_transport->ShouldReconnectV1()) {
    1923+                    reconnections_to_add.push_back({
    1924+                        .addr_connect = pnode->addr,
    1925+                        .count_failure = pnode->m_count_failure_after_reconnect,
    


    Sjors commented at 11:45 am on October 2, 2023:

    6a2dbed1209b18d923da64b019f6f378d3c30695 I’m confused about the intention of count_failure / m_count_failure_after_reconnect. If this is set to true then AddrManImpl marks the most recent failed attempt and bumps info.nAttempts.

    The latter suggests that we should set this to false if we’r trying a v2 -> v1 downgrade, otherwise we’d be counting double. Even better: we should not count the failed v2 attempt if we intend to try again with v1.


    sipa commented at 1:28 pm on October 2, 2023:

    Nice catch. I was clearly confused when writing this.

    The situation is even simpler, because the fCountFailure argument to OpenNetworkConnection is only related to connection failure (i.e., TCP/IP failing to establish a connection) and not related to failures that happen after that point (perhaps it should, but that seems out of scope for this PR). We only ever attempt a V1 reconnection if the establishment of the V2 connection succeeded before that point, so by the time we reconnect, we already know the peer can be reached (in a TCP/IP sense), and there is no need to keep track of failure counting after that.

    I’ve changed it to set fCountFailure always to false on reconnect.

  196. in src/net.h:676 in 6a2dbed120 outdated
    672@@ -662,6 +673,7 @@ struct CNodeOptions
    673     bool prefer_evict = false;
    674     size_t recv_flood_size{DEFAULT_MAXRECEIVEBUFFER * 1000};
    675     bool use_v2transport = false;
    676+    bool count_failure_after_reconnect = false;
    


    Sjors commented at 11:47 am on October 2, 2023:

    6a2dbed1209b18d923da64b019f6f378d3c30695: The word count is ambigous here, because AddrManImpl literally counts failures, but I think the meaning here is “it counts / we care about it”.

    See also my above comment. Depending how this is intended to work, maybe I can think of a better name.


    sipa commented at 1:28 pm on October 2, 2023:
    This code is gone.
  197. Sjors approved
  198. Sjors commented at 11:50 am on October 2, 2023: member
    Tested and mostly happy with cd5e72d6fb82c580dbf693b14e842d6204327311.
  199. sipa force-pushed on Oct 2, 2023
  200. sipa commented at 1:19 pm on October 2, 2023: member

    Addressed comments by @Sjors, and (hopefully for real) fixed the semaphore move issue.

      0diff --git a/src/net.cpp b/src/net.cpp
      1index a17c215a2c..c4c2d55c77 100644
      2--- a/src/net.cpp
      3+++ b/src/net.cpp
      4@@ -582,7 +582,6 @@ CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCo
      5                                  .i2p_sam_session = std::move(i2p_transient_session),
      6                                  .recv_flood_size = nReceiveFloodSize,
      7                                  .use_v2transport = use_v2transport,
      8-                                 .count_failure_after_reconnect = fCountFailure,
      9                              });
     10     pnode->AddRef();
     11 
     12@@ -1925,6 +1924,8 @@ void CConnman::DisconnectNodes()
     13     AssertLockNotHeld(m_nodes_mutex);
     14     AssertLockNotHeld(m_reconnections_mutex);
     15 
     16+    // Use a temporary variable to accumulate desired reconnections, so we don't need
     17+    // m_reconnections_mutex while holding m_nodes_mutex.
     18     decltype(m_reconnections) reconnections_to_add;
     19 
     20     {
     21@@ -1955,10 +1956,10 @@ void CConnman::DisconnectNodes()
     22                 if (pnode->m_transport->ShouldReconnectV1()) {
     23                     reconnections_to_add.push_back({
     24                         .addr_connect = pnode->addr,
     25-                        .count_failure = pnode->m_count_failure_after_reconnect,
     26                         .grant = std::move(pnode->grantOutbound),
     27                         .destination = pnode->m_dest,
     28-                        .conn_type = pnode->m_conn_type});
     29+                        .conn_type = pnode->m_conn_type,
     30+                        .use_v2transport = false});
     31                     LogPrint(BCLog::NET, "retrying with v1 transport protocol for peer=%d\n", pnode->GetId());
     32                 }
     33 
     34@@ -2866,7 +2867,6 @@ void CConnman::ThreadOpenAddedConnections()
     35         bool tried = false;
     36         for (const AddedNodeInfo& info : vInfo) {
     37             if (!info.fConnected) {
     38-                if (!grant) grant = CSemaphoreGrant(*semAddnode, /*fTry=*/true);
     39                 if (!grant) {
     40                     // If we've used up our semaphore and need a new one, let's not wait here since while we are waiting
     41                     // the addednodeinfo state might change.
     42@@ -2875,8 +2875,8 @@ void CConnman::ThreadOpenAddedConnections()
     43                 tried = true;
     44                 CAddress addr(CService(), NODE_NONE);
     45                 OpenNetworkConnection(addr, false, std::move(grant), info.m_params.m_added_node.c_str(), ConnectionType::MANUAL, info.m_params.m_use_v2transport);
     46-                if (!interruptNet.sleep_for(std::chrono::milliseconds(500)))
     47-                    return;
     48+                if (!interruptNet.sleep_for(std::chrono::milliseconds(500))) return;
     49+                grant = CSemaphoreGrant(*semAddnode, /*fTry=*/true);
     50             }
     51         }
     52         // Retry every 60 seconds if a connection was attempted, otherwise two seconds
     53@@ -3699,7 +3699,6 @@ CNode::CNode(NodeId idIn,
     54       m_addr_name{addrNameIn.empty() ? addr.ToStringAddrPort() : addrNameIn},
     55       m_dest(addrNameIn),
     56       m_inbound_onion{inbound_onion},
     57-      m_count_failure_after_reconnect{node_opts.count_failure_after_reconnect},
     58       m_prefer_evict{node_opts.prefer_evict},
     59       nKeyedNetGroup{nKeyedNetGroupIn},
     60       m_conn_type{conn_type_in},
     61@@ -3844,11 +3843,15 @@ void CConnman::PerformReconnections()
     62 
     63         auto& item = *todo.begin();
     64         OpenNetworkConnection(item.addr_connect,
     65-                              item.count_failure,
     66+                              // We only reconnect if the first attempt to connect succeeded at
     67+                              // connection time, but then failed after the CNode object was
     68+                              // created. Since we already know connecting is possible, do not
     69+                              // count failure to reconnect.
     70+                              /*fCountFailure=*/false,
     71                               std::move(item.grant),
     72                               item.destination.empty() ? nullptr : item.destination.c_str(),
     73                               item.conn_type,
     74-                              /*use_v2transport=*/false);
     75+                              item.use_v2transport);
     76     }
     77 }
     78 
     79diff --git a/src/net.h b/src/net.h
     80index 25a9814f63..2f7b832fba 100644
     81--- a/src/net.h
     82+++ b/src/net.h
     83@@ -689,7 +689,6 @@ struct CNodeOptions
     84     bool prefer_evict = false;
     85     size_t recv_flood_size{DEFAULT_MAXRECEIVEBUFFER * 1000};
     86     bool use_v2transport = false;
     87-    bool count_failure_after_reconnect = false;
     88 };
     89 
     90 /** Information about a peer */
     91@@ -738,8 +737,6 @@ public:
     92     const std::string m_dest;
     93     //! Whether this peer is an inbound onion, i.e. connected via our Tor onion service.
     94     const bool m_inbound_onion;
     95-    //! Whether (reconnections of) connections to this peer should count as failure.
     96-    const bool m_count_failure_after_reconnect;
     97     std::atomic<int> nVersion{0};
     98     Mutex m_subver_mutex;
     99     /**
    100@@ -1578,10 +1575,10 @@ private:
    101     struct ReconnectionInfo
    102     {
    103         CAddress addr_connect;
    104-        bool count_failure;
    105         CSemaphoreGrant grant;
    106         std::string destination;
    107         ConnectionType conn_type;
    108+        bool use_v2transport;
    109     };
    110 
    111     /**
    
  201. sipa commented at 1:34 pm on October 2, 2023: member
    @vasild I’ve switched to a slightly different approach to avoid the use-after-move (see diff above).
  202. in src/net.cpp:2870 in 2e0f8a6d39 outdated
    2784@@ -2785,16 +2785,16 @@ void CConnman::ThreadOpenAddedConnections()
    2785         bool tried = false;
    2786         for (const AddedNodeInfo& info : vInfo) {
    2787             if (!info.fConnected) {
    2788-                if (!grant.TryAcquire()) {
    2789+                if (!grant) {
    


    Sjors commented at 3:56 pm on October 2, 2023:
    2e0f8a6d39a15ecac60b9d463606e16ba5707bb5: I’m confused by this change. I see !fgrant calls the bool() operator which checks fHaveGrant. But if that returns false we now break out of the for loop. Wheveras we used to call TryAcquire() and continue if that worked.

    sipa commented at 3:58 pm on October 2, 2023:
    We try acquire at the end of the previous loop iteration, or before entering the loop.

    Sjors commented at 4:09 pm on October 2, 2023:

    before entering the loop

    Oh I see. The CSemaphoreGrant grant(*semAddnode) constructor before the loop doesn’t fTry, but uses Acquire(), which waits (unchanged behavior) and always sets grant to true.

    At the end of the first iteration we set fTry, so we either get a grant for the next iteration or we don’t.

    The first time we don’t get a grant we bail out of the loop. This is effectively the same as before, it’s just that we did TryAquire() at the start of each iteration (which also always would have succeed in the first iteration).

  203. Sjors commented at 4:46 pm on October 2, 2023: member
    tACK 7b255b56698392019dd575c3bb73f6d8b1334f4b
  204. DrahtBot requested review from vasild on Oct 2, 2023
  205. DrahtBot removed the label CI failed on Oct 2, 2023
  206. in doc/bips.md:52 in 7b255b5669 outdated
    48@@ -49,6 +49,7 @@ BIPs that are implemented by Bitcoin Core:
    49 * [`BIP 173`](https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki): Bech32 addresses for native Segregated Witness outputs are supported as of **v0.16.0** ([PR 11167](https://github.com/bitcoin/bitcoin/pull/11167)). Bech32 addresses are generated by default as of **v0.20.0** ([PR 16884](https://github.com/bitcoin/bitcoin/pull/16884)).
    50 * [`BIP 174`](https://github.com/bitcoin/bips/blob/master/bip-0174.mediawiki): RPCs to operate on Partially Signed Bitcoin Transactions (PSBT) are present as of **v0.17.0** ([PR 13557](https://github.com/bitcoin/bitcoin/pull/13557)).
    51 * [`BIP 176`](https://github.com/bitcoin/bips/blob/master/bip-0176.mediawiki): Bits Denomination [QT only] is supported as of **v0.16.0** ([PR 12035](https://github.com/bitcoin/bitcoin/pull/12035)).
    52+* [`BIP 324`](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki): Support for the BIP324 v2 transport protocol and the associated `NODE_P2P_V2` service bit is supported as of **v26.0**, but off by default ([PR 28331](https://github.com/bitcoin/bitcoin/pull/28331)).
    


    theStack commented at 8:57 pm on October 2, 2023:

    pedantic nit to avoid the “support for … is supported” repetition:

    0* [`BIP 324`](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki): The BIP324 v2 transport protocol and the associated `NODE_P2P_V2` service bit is supported as of **v26.0**, but off by default ([PR 28331](https://github.com/bitcoin/bitcoin/pull/28331)).
    

    sipa commented at 10:29 pm on October 2, 2023:
    Done (made some more editorial changes to this sentence).
  207. theStack approved
  208. theStack commented at 8:58 pm on October 2, 2023: contributor
    ACK 7b255b56698392019dd575c3bb73f6d8b1334f4b
  209. in src/net.cpp:472 in 4aaf815330 outdated
    468@@ -469,7 +469,7 @@ CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCo
    469         const std::vector<CService> resolved{Lookup(pszDest, default_port, fNameLookup && !HaveNameProxy(), 256)};
    470         if (!resolved.empty()) {
    471             const CService& rnd{resolved[GetRand(resolved.size())]};
    472-            addrConnect = CAddress{MaybeFlipIPv6toCJDNS(rnd), NODE_NONE};
    473+            addrConnect = CAddress{MaybeFlipIPv6toCJDNS(rnd), addrConnect.nServices};
    


    mzumsande commented at 9:15 pm on October 2, 2023:
    I think that changing this line is no longer necessary after adding the use_v2transport bool.

    sipa commented at 10:29 pm on October 2, 2023:
    Done.
  210. mzumsande commented at 9:41 pm on October 2, 2023: contributor
    ACK 7b255b56698392019dd575c3bb73f6d8b1334f4b
  211. net: advertise NODE_P2P_V2 if CLI arg -v2transport is on
    Co-authored-by: Dhruv Mehta <856960+dhruv@users.noreply.github.com>
    abf343b320
  212. rpc: don't report v2 handshake bytes in the per-type sent byte statistics
    This matches the behavior for per-type received bytes.
    a4706bc877
  213. net: use V2Transport when NODE_P2P_V2 service flag is present
    Co-authored-by: Dhruv Mehta <856960+dhruv@users.noreply.github.com>
    62d21ee097
  214. rpc: addnode arg to use BIP324 v2 p2p
    Co-authored-by: Pieter Wuille <bitcoin-dev@wuille.net>
    c73cd42363
  215. sync: modernize CSemaphore / CSemaphoreGrant 4d265d0342
  216. net: reconnect with V1Transport under certain conditions
    When an outbound v2 connection is disconnected without receiving anything, but at
    least 24 bytes of our pubkey were sent out (enough to constitute an invalid v1
    header), add them to a queue of reconnections to be tried.
    
    The reconnections are in a queue rather than performed immediately, because we should
    not block the socket handler thread with connection creation (a blocking operation
    that can take multiple seconds).
    432a62c4dc
  217. net: expose transport types/session IDs of connections in RPC and logs
    Co-authored-by: Dhruv Mehta <856960+dhruv@users.noreply.github.com>
    b815cce50e
  218. test: Functional test for opportunistic encryption
    Co-authored-by: Pieter Wuille <bitcoin-dev@wuille.net>
    05d19fbcc1
  219. test: enable v2 transport between nodes in some functional tests 64ca7210f0
  220. doc: mention BIP324 support in bips.md 75a3291035
  221. sipa force-pushed on Oct 2, 2023
  222. sipa commented at 10:29 pm on October 2, 2023: member

    Addressed nits by @mzumsande and @theStack:

     0diff --git a/doc/bips.md b/doc/bips.md
     1index 87b5918c72..952d289daa 100644
     2--- a/doc/bips.md
     3+++ b/doc/bips.md
     4@@ -49,7 +49,7 @@ BIPs that are implemented by Bitcoin Core:
     5 * [`BIP 173`](https://github.com/bitcoin/bips/blob/master/bip-0173.mediawiki): Bech32 addresses for native Segregated Witness outputs are supported as of **v0.16.0** ([PR 11167](https://github.com/bitcoin/bitcoin/pull/11167)). Bech32 addresses are generated by default as of **v0.20.0** ([PR 16884](https://github.com/bitcoin/bitcoin/pull/16884)).
     6 * [`BIP 174`](https://github.com/bitcoin/bips/blob/master/bip-0174.mediawiki): RPCs to operate on Partially Signed Bitcoin Transactions (PSBT) are present as of **v0.17.0** ([PR 13557](https://github.com/bitcoin/bitcoin/pull/13557)).
     7 * [`BIP 176`](https://github.com/bitcoin/bips/blob/master/bip-0176.mediawiki): Bits Denomination [QT only] is supported as of **v0.16.0** ([PR 12035](https://github.com/bitcoin/bitcoin/pull/12035)).
     8-* [`BIP 324`](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki): Support for the BIP324 v2 transport protocol and the associated `NODE_P2P_V2` service bit is supported as of **v26.0**, but off by default ([PR 28331](https://github.com/bitcoin/bitcoin/pull/28331)).
     9+* [`BIP 324`](https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki): The v2 transport protocol specified by BIP324 and the associated `NODE_P2P_V2` service bit are supported as of **v26.0**, but off by default ([PR 28331](https://github.com/bitcoin/bitcoin/pull/28331)).
    10 * [`BIP 325`](https://github.com/bitcoin/bips/blob/master/bip-0325.mediawiki): Signet test network is supported as of **v0.21.0** ([PR 18267](https://github.com/bitcoin/bitcoin/pull/18267)).
    11 * [`BIP 339`](https://github.com/bitcoin/bips/blob/master/bip-0339.mediawiki): Relay of transactions by wtxid is supported as of **v0.21.0** ([PR 18044](https://github.com/bitcoin/bitcoin/pull/18044)).
    12 * [`BIP 340`](https://github.com/bitcoin/bips/blob/master/bip-0340.mediawiki)
    13diff --git a/src/net.cpp b/src/net.cpp
    14index c4c2d55c77..6b2ef5f43d 100644
    15--- a/src/net.cpp
    16+++ b/src/net.cpp
    17@@ -469,7 +469,7 @@ CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCo
    18         const std::vector<CService> resolved{Lookup(pszDest, default_port, fNameLookup && !HaveNameProxy(), 256)};
    19         if (!resolved.empty()) {
    20             const CService& rnd{resolved[GetRand(resolved.size())]};
    21-            addrConnect = CAddress{MaybeFlipIPv6toCJDNS(rnd), addrConnect.nServices};
    22+            addrConnect = CAddress{MaybeFlipIPv6toCJDNS(rnd), NODE_NONE};
    23             if (!addrConnect.IsValid()) {
    24                 LogPrint(BCLog::NET, "Resolver returned invalid address %s for %s\n", addrConnect.ToStringAddrPort(), pszDest);
    25                 return nullptr;
    
  223. theStack approved
  224. theStack commented at 11:28 pm on October 2, 2023: contributor
    re-ACK 75a329103505736acb9036224da2dfa8ab038c43
  225. DrahtBot requested review from Sjors on Oct 2, 2023
  226. DrahtBot requested review from mzumsande on Oct 2, 2023
  227. mzumsande commented at 11:59 pm on October 2, 2023: contributor
    re-ACK 75a329103505736acb9036224da2dfa8ab038c43
  228. DrahtBot removed review request from mzumsande on Oct 2, 2023
  229. Sjors commented at 9:12 am on October 3, 2023: member
    re-ACK 75a329103505736acb9036224da2dfa8ab038c43
  230. DrahtBot removed review request from Sjors on Oct 3, 2023
  231. fanquake merged this on Oct 3, 2023
  232. fanquake closed this on Oct 3, 2023

  233. vasild commented at 9:36 am on October 3, 2023: contributor
    ACK 75a329103505736acb9036224da2dfa8ab038c43
  234. rsantacroce commented at 8:13 am on October 4, 2023: none
    any clear reason why noise protocol hasn’t been considered here ?
  235. Sjors commented at 8:22 am on October 4, 2023: member
    @rsantacroce the BIP explains this under “Why not use a general-purpose transport encryption protocol?”: https://github.com/bitcoin/bips/blob/master/bip-0324.mediawiki
  236. rsantacroce commented at 8:35 am on October 4, 2023: none
    Thank you, i missed that one! very good explanation now it’s the other way around why we don’t use it on stratumv2 …. this is for the other repo. thank you and thank you for the great work.
  237. sipa added the label Needs release note on Oct 4, 2023
  238. hebasto referenced this in commit 0e3de3b83e on Oct 5, 2023
  239. Frank-GER referenced this in commit 95e4527ea0 on Oct 13, 2023
  240. in test/functional/p2p_v2_transport.py:45 in 05d19fbcc1 outdated
    47+            self.connect_nodes(0, 1, peer_advertises_v2=True)
    48+        self.generate(self.nodes[0], 5, sync_fun=lambda: self.sync_all(self.nodes[0:2]))
    49+        assert_equal(self.nodes[1].getblockcount(), 5)
    50+        # verify there is a v2 connection between node 0 and 1
    51+        node_0_info = self.nodes[0].getpeerinfo()
    52+        node_1_info = self.nodes[0].getpeerinfo()
    


    kashifs commented at 4:04 pm on November 10, 2023:

    @sipa, am I missing something or should this line read:

    node_1_info = self.nodes[1].getpeerinfo()


    sipa commented at 4:13 pm on November 10, 2023:
    This indeed looks like a bug.

    fanquake commented at 5:35 pm on November 10, 2023:

    theStack commented at 5:54 pm on November 10, 2023:
    Good catch @kashifs. Do you want to open a PR for the fix? If not (for whatever reason), I’m happy to do it and add the commit with your authorship.

    kashifs commented at 9:28 pm on November 10, 2023:
    I’ve opened a PR here. Please let me know if it’s done correctly and if there is more that I should do.
  241. achow101 referenced this in commit 30a0557829 on Nov 28, 2023
  242. fanquake removed the label Needs release note on Dec 7, 2023
  243. fanquake commented at 1:22 pm on December 7, 2023: member
    Removing label, as this got a release note.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me