[net] Cleanup logic around connection types #19316

pull amitiuttarwar wants to merge 19 commits into bitcoin:master from amitiuttarwar:2020-06-conn-refactor changing 9 files +186 −140
  1. amitiuttarwar commented at 9:40 pm on June 17, 2020: contributor

    This is part 1 of #19315, which enables the ability to test outbound and block-relay-only connections from the functional tests. Please see that PR for more information of overall functionality.

    This PR simplifies how we manage different connection types. It introduces an enum with the various types of connections so we can explicitly define the connection type. The existing system relies on a series of independent flags, then has asserts scattered around to ensure that conflicting flags are not enabled at the same time. I find this approach to be both brittle and confusing. While making these changes, I found a small bug due to the silent assumptions.

    This PR also proposes a rename from OneShot to AddrFetch. I find the name OneShot to be very confusing, especially when we also have onetry manual connections. Everyone I’ve talked to offline has agreed that the name is confusing, so I propose a potential alternative. I think this is a good opportunity for a rename since I’m creating an enum to explicitly define the connection types. (some context for the unfamiliar: oneshot or addrfetch connections are short-lived connections created on startup. They connect to the seed peers, send a getaddr to solicit addresses, then close the connection.)

    Overview of this PR:

    • rename oneshot to addrfetch
    • introduce ConnectionType enum
    • one by one, add different connection types to the enum
    • expose the conn_type on CNode, and use this to reduce reliance on flags (& asserts)
    • fix the bug in counting different type of connections
    • some additional cleanup to simplify logic and make expectations explicit/inclusive rather than implicit/exclusive.
  2. DrahtBot added the label P2P on Jun 17, 2020
  3. DrahtBot added the label RPC/REST/ZMQ on Jun 17, 2020
  4. DrahtBot added the label Tests on Jun 17, 2020
  5. DrahtBot added the label Validation on Jun 17, 2020
  6. amitiuttarwar force-pushed on Jun 17, 2020
  7. DrahtBot commented at 11:07 pm on June 17, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #19607 ([p2p] Add Peer struct for per-peer data in net processing by jnewbery)
    • #19064 (refactor: Cleanup thread ctor calls by hebasto)
    • #18421 (Periodically update DNS caches for better privacy of non-reachable nodes by naumenkogs)
    • #17428 (p2p: Try to preserve outbound block-relay-only connections during restart by hebasto)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  8. amitiuttarwar force-pushed on Jun 17, 2020
  9. naumenkogs commented at 7:18 am on June 18, 2020: member

    Concept ACK.

    Is enum ConnectionType the right approach? It worries me that these things are not mutually-exclusive. Like, ADDRFETCH is also OUTBOUND. It’s true that it’s non-ambiguous right now, but maybe we want to make them a bitmask in advance for the sake of explicitness, since we are already refactoring it?

  10. jnewbery commented at 7:56 pm on June 18, 2020: member

    Concept ACK.

    Is enum ConnectionType the right approach? It worries me that these things are not mutually-exclusive. Like, ADDRFETCH is also OUTBOUND.

    I think enum is the right approach. OUTBOUND here means a connection that is created automatically by the OpenConnections thread. Yes, there are other kinds of connection that the node originates, but for clarity, I don’t think those should be referred to as ‘OUTBOUND’.

    I think we definitely don’t want a bitmask of different connection capabilities. That leads to an combinatorial explosion of connection types that you need to either test or explicitly disallow.

  11. amitiuttarwar commented at 9:03 pm on June 18, 2020: contributor

    @naumenkogs its a good question to ask, but I think enforcing mutual exclusivity makes more sense and what allows for a lot of the simplicity. @jnewbery explained the main reasons, I want to add some additional points

    • everything but INBOUND originates from the node, but OUTBOUND refers to a specific type, which is the most common, full-relay, outbound connection. so I do think its exclusive to the other connection types. on the enum I added OUTBOUND, // full relay connections (blocks, addrs, txns), but lmk if you have suggestions on how it could be clarified better.
    • if we support combinations, we have to add many behavioral checks to ensure logical adherence. lots of the cleanup in this PR shed light on how brittle/annoying it is to have those checks scattered everywhere. so, I think if we are adding connection types in the future, designing the conn_type to fit into the flat structure better supports code-simplicity-over-time as well.

    On a different note, I have a question with the fuzz tests & would appreciate any guidance.

    Since I change the signature of the CNode constructor, I’m causing a test failure. I found this function to support fuzz testing with enums, but am unclear on how to implement. It seems I need to add kMaxValue to the enum.

    I’m wondering if the pattern is: a) mimic the enum in the tests & add kMaxValue there, or b) import something into net.cpp and define directly on the code implementation of the enum. Option B seems strange to add that dependency to net, but option A seems like it would be easy to forget to add an enum type to the fuzz test if we add a type to the cpp code in the future. @practicalswift, @MarcoFalke, any chance either of you are able to advise? 🙏🏽

  12. practicalswift commented at 9:25 am on June 22, 2020: contributor

    @amitiuttarwar Thanks for the ping! I’d suggest using PickValueInArray like in these examples:

     0$ git grep -E 'PickValueInArray.*\(\{.+' "src/test/fuzz/**.cpp"
     1src/test/fuzz/bloom_filter.cpp:        static_cast<unsigned char>(fuzzed_data_provider.PickValueInArray({BLOOM_UPDATE_NONE, BLOOM_UPDATE_ALL, BLOOM_UPDATE_P2PUBKEY_ONLY, BLOOM_UPDATE_MASK}))};
     2src/test/fuzz/fees.cpp:    const FeeReason fee_reason = fuzzed_data_provider.PickValueInArray({FeeReason::NONE, FeeReason::HALF_ESTIMATE, FeeReason::FULL_ESTIMATE, FeeReason::DOUBLE_ESTIMATE, FeeReason::CONSERVATIVE, FeeReason::MEMPOOL_MIN, FeeReason::PAYTXFEE, FeeReason::FALLBACK, FeeReason::REQUIRED});
     3src/test/fuzz/kitchen_sink.cpp:    const TransactionError transaction_error = fuzzed_data_provider.PickValueInArray<TransactionError>({TransactionError::OK, TransactionError::MISSING_INPUTS, TransactionError::ALREADY_IN_CHAIN, TransactionError::P2P_DISABLED, TransactionError::MEMPOOL_REJECTED, TransactionError::MEMPOOL_ERROR, TransactionError::INVALID_PSBT, TransactionError::PSBT_MISMATCH, TransactionError::SIGHASH_MISMATCH, TransactionError::MAX_FEE_EXCEEDED});
     4src/test/fuzz/message.cpp:        (void)SigningResultString(fuzzed_data_provider.PickValueInArray({SigningResult::OK, SigningResult::PRIVATE_KEY_NOT_AVAILABLE, SigningResult::SIGNING_FAILED}));
     5src/test/fuzz/netaddress.cpp:    const Network network = fuzzed_data_provider.PickValueInArray({Network::NET_IPV4, Network::NET_IPV6, Network::NET_INTERNAL, Network::NET_ONION});
     6src/test/fuzz/policy_estimator.cpp:        (void)block_policy_estimator.estimateRawFee(fuzzed_data_provider.ConsumeIntegral<int>(), fuzzed_data_provider.ConsumeFloatingPoint<double>(), fuzzed_data_provider.PickValueInArray({FeeEstimateHorizon::SHORT_HALFLIFE, FeeEstimateHorizon::MED_HALFLIFE, FeeEstimateHorizon::LONG_HALFLIFE}), fuzzed_data_provider.ConsumeBool() ? &result : nullptr);
     7src/test/fuzz/policy_estimator.cpp:        (void)block_policy_estimator.HighestTargetTracked(fuzzed_data_provider.PickValueInArray({FeeEstimateHorizon::SHORT_HALFLIFE, FeeEstimateHorizon::MED_HALFLIFE, FeeEstimateHorizon::LONG_HALFLIFE}));
     8src/test/fuzz/script_interpreter.cpp:                (void)SignatureHash(script_code, tx_to, in, fuzzed_data_provider.ConsumeIntegral<int>(), ConsumeMoney(fuzzed_data_provider), fuzzed_data_provider.PickValueInArray({SigVersion::BASE, SigVersion::WITNESS_V0}), nullptr);
     9src/test/fuzz/script_interpreter.cpp:                    (void)SignatureHash(script_code, tx_to, in, fuzzed_data_provider.ConsumeIntegral<int>(), ConsumeMoney(fuzzed_data_provider), fuzzed_data_provider.PickValueInArray({SigVersion::BASE, SigVersion::WITNESS_V0}), &precomputed_transaction_data);
    10src/test/fuzz/script_sign.cpp:                (void)signature_creator.CreateSig(provider, vch_sig, address, ConsumeScript(fuzzed_data_provider), fuzzed_data_provider.PickValueInArray({SigVersion::BASE, SigVersion::WITNESS_V0}));
    11src/test/fuzz/signature_checker.cpp:    const SigVersion sig_version = fuzzed_data_provider.PickValueInArray({SigVersion::BASE, SigVersion::WITNESS_V0});
    12src/test/fuzz/system.cpp:            const OptionsCategory options_category = fuzzed_data_provider.PickValueInArray<OptionsCategory>({OptionsCategory::OPTIONS, OptionsCategory::CONNECTION, OptionsCategory::WALLET, OptionsCategory::WALLET_DEBUG_TEST, OptionsCategory::ZMQ, OptionsCategory::DEBUG_TEST, OptionsCategory::CHAINPARAMS, OptionsCategory::NODE_RELAY, OptionsCategory::BLOCK_CREATION, OptionsCategory::RPC, OptionsCategory::GUI, OptionsCategory::COMMANDS, OptionsCategory::REGISTER_COMMANDS, OptionsCategory::HIDDEN});
    
  13. in src/net.cpp:1834 in f5f0e07e0a outdated
    1837-                    setConnected.insert(pnode->addr.GetGroup(addrman.m_asmap));
    1838-                    if (pnode->m_tx_relay == nullptr) {
    1839+                switch (pnode->m_conn_type){
    1840+                    case ConnectionType::INBOUND:
    1841+                    case ConnectionType::MANUAL:
    1842+                        // Netgroups for inbound and addnode peers are not excluded because our goal here
    


    jnewbery commented at 2:13 pm on June 22, 2020:
    s/addnode/manual

    amitiuttarwar commented at 8:02 pm on June 24, 2020:
    fixed
  14. in src/net.cpp:1836 in f5f0e07e0a outdated
    1839+                switch (pnode->m_conn_type){
    1840+                    case ConnectionType::INBOUND:
    1841+                    case ConnectionType::MANUAL:
    1842+                        // Netgroups for inbound and addnode peers are not excluded because our goal here
    1843+                        // is to not use multiple of our limited outbound slots on a single netgroup
    1844+                        // but inbound and addnode peers do not use our outbound slots.  Inbound peers
    


    jnewbery commented at 2:18 pm on June 22, 2020:
    s/addnode/manual

    amitiuttarwar commented at 8:02 pm on June 24, 2020:
    fixed
  15. in src/net.h:119 in f5f0e07e0a outdated
    112@@ -113,6 +113,14 @@ struct CSerializedNetMsg
    113     std::string command;
    114 };
    115 
    116+enum class ConnectionType {
    


    jnewbery commented at 2:23 pm on June 22, 2020:
    Consider using doxygen-style comments for these enum entries

    amitiuttarwar commented at 8:02 pm on June 24, 2020:
    done
  16. in src/net.h:118 in f5f0e07e0a outdated
    112@@ -113,6 +113,14 @@ struct CSerializedNetMsg
    113     std::string command;
    114 };
    115 
    116+enum class ConnectionType {
    117+    INBOUND, // peer initiated connections
    118+    OUTBOUND, // full relay connections (blocks, addrs, txns)
    


    jnewbery commented at 2:26 pm on June 22, 2020:
    Maybe add that these connections are made automatically to addresses selected from AddrMan.

    amitiuttarwar commented at 8:03 pm on June 24, 2020:
    added
  17. in src/net.h:121 in f5f0e07e0a outdated
    112@@ -113,6 +113,14 @@ struct CSerializedNetMsg
    113     std::string command;
    114 };
    115 
    116+enum class ConnectionType {
    117+    INBOUND, // peer initiated connections
    118+    OUTBOUND, // full relay connections (blocks, addrs, txns)
    119+    MANUAL, // connections to addresses added via addnode or the connect command line argument
    120+    FEELER, // short lived connections used to test address validity
    121+    BLOCK_RELAY, // only relay blocks to these connections
    


    jnewbery commented at 2:27 pm on June 22, 2020:
    Add that these are automatic outbound peers made to addresses selected from AddrMan.

    amitiuttarwar commented at 8:03 pm on June 24, 2020:
    added
  18. jnewbery commented at 2:29 pm on June 22, 2020: member

    utACK f5f0e07e0a43b6decd147671f82c06b6abc64689 (once the fuzzer is fixed)

    I’ve left some optional docs suggestions inline and one more here:

    In net.cpp L2387: s/Initiate outbound connections from -addnode/Initiated manual connections

    Functionally this looks good to me. Very nice cleanup!

  19. amitiuttarwar force-pushed on Jun 24, 2020
  20. amitiuttarwar commented at 8:30 pm on June 24, 2020: contributor

    fixed fuzz tests, rebased & incorporated review comments @jnewbery thank you for review 🙌🏽 I took all your suggestions. @practicalswift thanks for the tip. Tests are now passing. Is there a time the ConsumeEnum() template would make more sense? @ariard continuing convo from your comment here.

    IMO I would adopt a more fine-grained typology. Right now it sounds to confuse type of traffic (block, addr, tx) and type of connection selection (outbound/inbound, normal/anchor, …)

    Can you explain what you are suggesting? I agree that this proposal flattens different aspects of connection behaviors, but I believe this is advantageous for simpler code, as explained above [1] [2].

    As a follow-up, I think we may scope [wtxid relay] under a connection type

    Considering this possibility. I haven’t reviewed #18044 in depth, but I’m not sure it conceptually makes sense. wtxid relay seems more in line with content sent over the connection (like preferences expressed in the version message) than type of connection. I think you could have wtxid relay for connections that are inbound, outbound, and manual connections?

  21. jnewbery commented at 3:07 pm on June 25, 2020: member
    ACK the final state of this, but the intermediate commits break the fuzz build because the fuzz changes only appear in the final commit. Ideally those changes would be in the same commits as the changes to the CNode ctors so all intermediate commits build.
  22. vasild commented at 7:17 pm on June 25, 2020: member
    0enum class ConnectionType {
    1    INBOUND, /**< peer initiated connections */
    2    OUTBOUND, /**< full relay connections (blocks, addrs, txns) made automatically. Addresses selected from AddrMan. */
    3    MANUAL, /**< connections to addresses added via addnode or the connect command line argument */
    4    FEELER, /**< short lived connections used to test address validity */
    5    BLOCK_RELAY, /**< only relay blocks to these automatic outbound connections. Addresses selected from AddrMan. */
    6    ADDR_FETCH, /**< short lived connections used to solicit addrs */
    7};
    

    This is somewhat confusing because it mixes unrelated things in one enum - connection initiator (us/outbound, them/inbound) with connection capabilities. Something like having enum Status { valid, invalid, blue }.

    I think the following questions have unclear answer, looking at the above enum:

    • Are INBOUND connections full relay connections (blocks, addrs, txns)?
    • Are MANUAL connections full relay connections (blocks, addrs, txns)?
    • Are FEELER connections inbound or outbound, that is - who initiated the connection?
    • Are ADDR_FETCH connections inbound or outbound, that is - who initiated the connection?
  23. jnewbery commented at 7:33 pm on June 25, 2020: member

    This is somewhat confusing because it mixes unrelated things in one enum - connection initiator (us/outbound, them/inbound) with connection capabilities. Something like having enum Status { valid, invalid, blue }.

    Please see the above discussion about this, particularly #19316 (comment). Another way to explain this is that these different connection types are already logically separate in the code, but that separation is done using multiple bools. However, combining those different bools is invalid - there’s no such thing as an inbound feeler connection, or a manual block_relay connection for example. That leads to logic bugs like the one fixed in this PR here: https://github.com/bitcoin/bitcoin/pull/19316/commits/f2e183e5791be0322137cb9d90fb410ebcacbfd4. Flattening the internal model of these different connection types should make those bugs less likely.

    I think the following questions have unclear answer, looking at the above enum…

    That’s fair. Perhaps these comments could be expanded to explicitly say who initiates the connection for each type (hint: it’s our side for all connection types except inbound).

  24. in src/net.h:418 in 43cec7b2a1 outdated
    413@@ -406,8 +414,8 @@ class CConnman
    414     std::atomic<bool> fNetworkActive{true};
    415     bool fAddressesInitialized{false};
    416     CAddrMan addrman;
    417-    std::deque<std::string> vOneShots GUARDED_BY(cs_vOneShots);
    418-    RecursiveMutex cs_vOneShots;
    419+    std::deque<std::string> vAddrFetchs GUARDED_BY(cs_vAddrFetchs);
    420+    RecursiveMutex cs_vAddrFetchs;
    


    vasild commented at 12:59 pm on June 26, 2020:
    nit: if these member variables are going to be renamed, why not rename them in a way that abides the naming convention?

    amitiuttarwar commented at 9:43 pm on July 16, 2020:

    good point. I was focused on writing the scripted diff & forgot about new conventions.

    do these renames seem correct? CNode: fAddrFetch -> m_addr_fetch CConnman: vAddrFetchs -> m_addr_fetchs CConnman: cs_vAddrFetchs -> cs_addr_fetch

    according to the guidelines, cs_vAddrFetchs should be renamed to m_cs_addr_fetch, but cs_addr_fetch seems more consistent with our code, so I’m currently opting for that. if agreed, I can PR a small update to the guidelines.


    jnewbery commented at 12:43 pm on July 17, 2020:
    m_addr_fetch and m_addr_fetches seem fine. For the mutex, I’d suggest m_addr_fetches_mutex. The only reason we have the cs name for mutexes is that in v0.1, Satoshi used Windows CRITICAL_SECTIONs for locking.

    amitiuttarwar commented at 0:08 am on July 23, 2020:
    thanks! updated
  25. in src/net.h:776 in 43cec7b2a1 outdated
    771@@ -764,9 +772,8 @@ class CNode
    772     }
    773     // This boolean is unusued in actual processing, only present for backward compatibility at RPC/QT level
    774     bool m_legacyWhitelisted{false};
    775-    bool fFeeler{false}; // If true this node is being used as a short lived feeler.
    776-    bool fOneShot{false};
    777-    bool m_manual_connection{false};
    778+    ConnectionType m_conn_type{ConnectionType::OUTBOUND};
    779+    bool fAddrFetch{false};
    


    vasild commented at 1:01 pm on June 26, 2020:

    Can fAddrFetch be removed and m_conn_type == ConnectionType::ADDR_FETCH used instead?

    nit: naming convention on fAddrFetch


    amitiuttarwar commented at 9:49 pm on July 16, 2020:
    I explored the option but there were enough call sites that I thought it would be unnecessarily verbose for no additional value. Since fAddrFetch is set via this comparison here, its mostly a simpler way to reference & doesn’t introduce room for divergence I believe. Although one thing I’ll do to strengthen this guarantee is remove the initialization to false here & make it a const.

    vasild commented at 1:10 pm on July 17, 2020:

    I see, maybe also consider a method like

    0bool IsAddrFetch() { return m_conn_type == ConnectionType::ADDR_FETCH; }
    

    amitiuttarwar commented at 0:12 am on July 23, 2020:
    is there a reason you prefer the function over member var? I haven’t incorporated yet but am open to it. just trying to understand the reasoning better

    vasild commented at 11:38 am on July 23, 2020:
    • A bunch of methods would define the interface better. Otherwise the interface is “I expose m_conn_type to you and you do whatever you want with it (including changing it). OTOH if methods are used m_conn_type can and should be made private.
    • It is shorter at the caller site (and I believe more readable).
    • Can have methods like IsOutbound() which do not translate to a single enum value and would need more complex expressions at the caller site.
    • Then it would be possible to change the enum m_conn_type to a set of boolean flags, bitmask or whatever other ideas there might be without changing the external interface.

    amitiuttarwar commented at 8:14 pm on July 23, 2020:

    ah, I think my question was unclear. let me try again-

    in the latest push, m_conn_type is indeed a private member var & I’ve introduced several accessor functions for callers.

    My question here is asking about leaving m_addr_fetch as a member var on CNode VS another accessor function IsAddrFetchConn(). I’m wondering what the advantages are of adding the function vs leaving it in the current / past state of being accessed through this member var, m_addr_fetch.

    (same for inbound connection type)


    vasild commented at 8:55 am on July 24, 2020:

    I suggested to use a method IsAddrFetch() because you said m_conn_type == ConnectionType::ADDR_FETCH would be unnecessarily verbose at the callers’ sites (it is, I agree).

    Now that accessor methods have been added I think there is no need to keep fAddrFetch and it can be replaced by IsAddrFetch()?


    amitiuttarwar commented at 3:39 am on July 30, 2020:
    removed fAddrFetch and introduced IsAddrFetch() in latest push 🎈
  26. in src/net.h:781 in 43cec7b2a1 outdated
    777-    bool m_manual_connection{false};
    778+    ConnectionType m_conn_type{ConnectionType::OUTBOUND};
    779+    bool fAddrFetch{false};
    780     bool fClient{false}; // set by version message
    781     bool m_limited_node{false}; //after BIP159, set by version message
    782     const bool fInbound;
    


    vasild commented at 2:27 pm on June 26, 2020:
    Can fInbound be removed and m_conn_type == ConnectionType::INBOUND used instead?

    amitiuttarwar commented at 9:49 pm on July 16, 2020:
    same answer as #19316 (review), but fInbound is already a const.

    amitiuttarwar commented at 3:40 am on July 30, 2020:
    removed fInbound & exposed via IsInboundConn()
  27. in src/net_processing.cpp:802 in 43cec7b2a1 outdated
    796@@ -797,11 +797,9 @@ void UpdateLastBlockAnnounceTime(NodeId node, int64_t time_in_seconds)
    797     if (state) state->m_last_block_announcement = time_in_seconds;
    798 }
    799 
    800-// Returns true for outbound peers, excluding manual connections, feelers, and
    801-// one-shots.
    802 static bool IsOutboundDisconnectionCandidate(const CNode& node)
    803 {
    804-    return !(node.fInbound || node.m_manual_connection || node.fFeeler || node.fOneShot);
    805+    return (node.m_conn_type == ConnectionType::OUTBOUND || node.m_conn_type == ConnectionType::BLOCK_RELAY);
    


    vasild commented at 3:13 pm on June 26, 2020:

    If done with a switch:

    0    switch (node.m_conn_type) {
    1    case ConnectionType::INBOUND: 
    2    case ConnectionType::MANUAL:
    3    case ConnectionType::FEELER:
    4    case ConnectionType::ADDR_FETCH:
    5        return false;
    6    case ConnectionType::OUTBOUND:
    7    case ConnectionType::BLOCK_RELAY:
    8        return true;
    9    }   
    

    and a new entry is added to the enum, it will enforce the developer to assess what should be returned for the new entry instead of silently returning false, like the current code.


    amitiuttarwar commented at 0:52 am on July 23, 2020:
    resolving this comment in favor of #19316 (comment) which has also been extracted to IsOutboundOrBlockRelayConn
  28. in src/net.cpp:1943 in 43cec7b2a1 outdated
    1958+                conn_type = ConnectionType::OUTBOUND;
    1959+            } else if (nOutboundBlockRelay < m_max_outbound_block_relay) {
    1960+                conn_type = ConnectionType::BLOCK_RELAY;
    1961+            } else {
    1962+                //this case should be unreachable
    1963+                conn_type = ConnectionType::OUTBOUND;
    


    vasild commented at 7:12 pm on June 26, 2020:
    If this is really unreachable then, would it be more suitable to put an assert here? Or, if in doubt - print something to the log and don’t open a connection. It looks strange to silently open a connection when we are here because nOutboundFullRelay >= m_max_outbound_full_relay.

    amitiuttarwar commented at 10:25 pm on June 30, 2020:

    totally agree the current state is weird. I dug more into this case & turns out.. I think the comment is wrong & the code is right. So I will update the comment.

    here’s my understanding-

    this previous conditional earlier in the function means if full-relay & block-relay are both at capacity, one of the following happens: (A)fFeeler will be set to true (B) we continue to next iteration of the while loop or (C)GetTryNewOutboundPeer is true, fFeeler stays false, and we continue to the rest of the function.

    GetTryNewOutboundPeer is simply a wrapper for m_try_another_outbound_peer, which is set to true when a stale tip is detected (link).

    so, when we get to this current conditional, if fFeeler if false & our full-relay and block-relay connection counts are at the allocated max -> GetTryNewOutboundPeer must be true. I believe the intent in this scenario is to open a new outbound connection.

    This is a tangent but I’m leaving here incase its interesting for reviewers / to share learnings:

    This got me taking a closer look at GetTryNewOutboundPeer

    It doesn’t look like eviction / outbound counting logic checks for this bool. So if m_try_another_outbound_peer is true, the node would add an outbound connection, then GetExtraOutboundCount would count it as extra & EvictExtraOutboundPeers would evict a connection.

    Maybe this is a mechanism for peer rotation? Not sure how that would work though because evicting extra outbounds is triggered every 45 seconds, and decides what connection to evict based on the oldest recent-block-announcement. A newly added connection defaults to a value of 0 so would probably be selected for eviction?

    update: the part I was missing is that we would proactively query the new connection for blocks, so if they don’t return any, it makes sense to not keep them.


    vasild commented at 3:30 pm on July 2, 2020:

    Ok, I confirm what you write above and I verified that the new code behaves in an identical way with the old one, so this PR is good (except the //this case should be unreachable comment).

    But the code (old and new) is confusing - I imagine somebody may attempt to “fix” this code which opens a connection even when both “full” and “block relay” capacities are exhausted.

    What about checking here that GetTryNewOutboundPeer() is true instead of relying that if fFeeler is false then GetTryNewOutboundPeer() must have been true 85 lines above in the code?

    It is kind of easier to understand that GetTryNewOutboundPeer() being true acts as an “override the limits and open a connection anyway”.


    dongcarl commented at 5:41 pm on July 16, 2020:
    Without restructuring ThreadOpenConnections completely, perhaps we can just do a simple assert(false) here just so that if the previous conditional changes, we’ll notice in test failures?

    amitiuttarwar commented at 0:30 am on July 23, 2020:
    I looked into starting with the conditional of GetTryNewOutboundPeer(), but I think that subtly changes the logic of the function. on current master, block-relay-only takes precedence over GetTryNewOutboundPeer, so starting with that check would change the behavior slightly. I don’t feel comfortable enough with all scenarios to evaluate if that change is fine, so I made it explicit and added an assert(false) as a tiny improvement: link. does this seem any better?

    vasild commented at 11:43 am on July 23, 2020:
    Yes, it looks better to me (until the day when this assertion kicks in :-D)
  29. in src/net.cpp:1645 in 43cec7b2a1 outdated
    1641@@ -1640,7 +1642,7 @@ void CConnman::ThreadDNSAddressSeed()
    1642                     {
    1643                         LOCK(cs_vNodes);
    1644                         for (const CNode* pnode : vNodes) {
    1645-                            nRelevant += pnode->fSuccessfullyConnected && !pnode->fFeeler && !pnode->fOneShot && !pnode->m_manual_connection && !pnode->fInbound;
    1646+                            nRelevant += pnode->fSuccessfullyConnected && (pnode->m_conn_type == ConnectionType::OUTBOUND || pnode->m_conn_type == ConnectionType::BLOCK_RELAY);
    


    vasild commented at 7:34 pm on June 26, 2020:

    A switch without default: would produce a compiler warning when new enum element is added and this code is forgotten:

     0                            switch (pnode->m_conn_type) {
     1                            case ConnectionType::INBOUND:
     2                            case ConnectionType::MANUAL:
     3                            case ConnectionType::FEELER:
     4                            case ConnectionType::ADDR_FETCH:
     5                                break;
     6                            case ConnectionType::OUTBOUND:
     7                            case ConnectionType::BLOCK_RELAY:
     8                                if (pnode->fSuccessfullyConnected) {
     9                                    ++nRelevant;
    10                                }   
    11                                break;
    12                            }
    

    amitiuttarwar commented at 0:50 am on July 23, 2020:

    ah yeah. totally agree with the nice compiler guarantee of switch statements. I’ve extracted IsOutboundOrBlockRelayConn into its own function, but looks like I missed incorporating the switch statement. I’ve updated locally and will push with next round of updates.

    one question- in the other property-test-with-switch-statement functions example, I’ve had to add a default case because otherwise the build with -Werror enabled fails. is this a good way to handle? is there a cleaner way?


    amitiuttarwar commented at 1:23 am on July 23, 2020:
    updated

    vasild commented at 12:02 pm on July 23, 2020:

    I was going to suggest that this can be replaced with something like if (pnode->IsOutboundOrBlockRelay()) - in which case the caller looks short and clean and the switch or chain of ifs and the enum are hidden into the body of that method.

    See https://github.com/bitcoin/bitcoin/pull/19316/files#r459397380


    amitiuttarwar commented at 8:26 pm on July 23, 2020:

    this can be replaced with something like if (pnode->IsOutboundOrBlockRelay())

    yup, already done link :)

  30. in src/net.cpp:1755 in 43cec7b2a1 outdated
    1751@@ -1761,7 +1752,7 @@ int CConnman::GetExtraOutboundCount()
    1752     {
    1753         LOCK(cs_vNodes);
    1754         for (const CNode* pnode : vNodes) {
    1755-            if (!pnode->fInbound && !pnode->m_manual_connection && !pnode->fFeeler && !pnode->fDisconnect && !pnode->fOneShot && pnode->fSuccessfullyConnected) {
    1756+            if (pnode->fSuccessfullyConnected && !pnode->fDisconnect && (pnode->m_conn_type == ConnectionType::OUTBOUND || pnode->m_conn_type == ConnectionType::BLOCK_RELAY)) {
    


    vasild commented at 7:38 pm on June 26, 2020:
    As above, a switch without default: will enforce this to be reviewed when new enum element is added. If the pattern type == OUTBOUND || type == BLOCK_RELAY repeats too often then the switch be put in a new method bool CNode::IsFooBar() and that method used here.

    amitiuttarwar commented at 0:52 am on July 23, 2020:
    resolving this comment in favor of #19316 (review) which has also been extracted to IsOutboundOrBlockRelayConn
  31. in src/net.h:775 in 43cec7b2a1 outdated
    771@@ -764,9 +772,8 @@ class CNode
    772     }
    773     // This boolean is unusued in actual processing, only present for backward compatibility at RPC/QT level
    774     bool m_legacyWhitelisted{false};
    775-    bool fFeeler{false}; // If true this node is being used as a short lived feeler.
    776-    bool fOneShot{false};
    777-    bool m_manual_connection{false};
    778+    ConnectionType m_conn_type{ConnectionType::OUTBOUND};
    


    vasild commented at 7:43 pm on June 26, 2020:

    CNode::m_conn_type is initialized here in its declaration and also in the only constructor of CNode from an argument:

    https://github.com/bitcoin/bitcoin/blob/43cec7b2a1a932f7784b037601a4fa3895e082b7/src/net.h#L866

    https://github.com/bitcoin/bitcoin/blob/43cec7b2a1a932f7784b037601a4fa3895e082b7/src/net.cpp#L2754

    So, maybe remove the initialization from here?

    0    ConnectionType m_conn_type;
    

    amitiuttarwar commented at 0:54 am on July 23, 2020:
    👍🏽 initialization removed (and some other stuff like making it private, const)
  32. in src/net.h:866 in 43cec7b2a1 outdated
    862@@ -856,7 +863,7 @@ class CNode
    863 
    864     std::set<uint256> orphan_work_set;
    865 
    866-    CNode(NodeId id, ServiceFlags nLocalServicesIn, int nMyStartingHeightIn, SOCKET hSocketIn, const CAddress &addrIn, uint64_t nKeyedNetGroupIn, uint64_t nLocalHostNonceIn, const CAddress &addrBindIn, const std::string &addrNameIn = "", bool fInboundIn = false, bool block_relay_only = false);
    867+    CNode(NodeId id, ServiceFlags nLocalServicesIn, int nMyStartingHeightIn, SOCKET hSocketIn, const CAddress &addrIn, uint64_t nKeyedNetGroupIn, uint64_t nLocalHostNonceIn, const CAddress &addrBindIn, const std::string &addrNameIn = "", ConnectionType conn_type_in = ConnectionType::OUTBOUND);
    


    vasild commented at 7:56 pm on June 26, 2020:

    All the callers of this constructor specify all arguments, so the default values of the last 2 arguments can be removed:

    0    CNode(NodeId id, ServiceFlags nLocalServicesIn, int nMyStartingHeightIn, SOCKET hSocketIn, const CAddress &addrIn, uint64_t nKeyedNetGroupIn, uint64_t nLocalHostNonceIn, const CAddress &addrBindIn, const std::string &addrNameIn, ConnectionType conn_type_in);
    

    amitiuttarwar commented at 0:55 am on July 23, 2020:
    done
  33. in src/net_processing.cpp:2253 in 43cec7b2a1 outdated
    2249@@ -2252,7 +2250,7 @@ void ProcessMessage(
    2250         {
    2251             connman->SetServices(pfrom.addr, nServices);
    2252         }
    2253-        if (!pfrom.fInbound && !pfrom.fFeeler && !pfrom.m_manual_connection && !HasAllDesirableServiceFlags(nServices))
    2254+        if ((pfrom.m_conn_type == ConnectionType::OUTBOUND || pfrom.m_conn_type == ConnectionType::BLOCK_RELAY || pfrom.m_conn_type == ConnectionType::ADDR_FETCH) && !HasAllDesirableServiceFlags(nServices))
    


    vasild commented at 8:01 pm on June 26, 2020:
    Consider using a switch without default: to assign to a boolean variable the expression type == FOO || type == BAR || type == BAZ to enforce the code to be reassessed when new enum element is added.

    amitiuttarwar commented at 0:56 am on July 23, 2020:
    extracted into ExpectServicesFromConn & handled as switch statement
  34. vasild commented at 8:14 pm on June 26, 2020: member

    Approach ACK

    Just some minor suggestions.

    Was the bug fixed by commit [net] Fix bug where AddrFetch connections would be counted as outbound full relay present in master before this PR or it was introduced by some of the preceding commits in this PR? If the latter then maybe squash it into the commit that introduced the bug.

    Maybe also squash the fixup from [test] Update fuzz test with new CNode function signature into the commit which broke the tests.

  35. in src/net.h:125 in 43cec7b2a1 outdated
    117+    INBOUND, /**< peer initiated connections */
    118+    OUTBOUND, /**< full relay connections (blocks, addrs, txns) made automatically. Addresses selected from AddrMan. */
    119+    MANUAL, /**< connections to addresses added via addnode or the connect command line argument */
    120+    FEELER, /**< short lived connections used to test address validity */
    121+    BLOCK_RELAY, /**< only relay blocks to these automatic outbound connections. Addresses selected from AddrMan. */
    122+    ADDR_FETCH, /**< short lived connections used to solicit addrs */
    


    jonatack commented at 1:14 am on July 1, 2020:
    I wonder if all of these except INBOUND should be prefixed with OUTBOUND_ to indicate that they are different types of outbound connections, and OUTBOUND become OUTBOUND_FULL_RELAY.

    jonatack commented at 1:39 am on July 1, 2020:

    I wonder if all of these except INBOUND should be prefixed with OUTBOUND_ to indicate that they are different types of outbound connections, and OUTBOUND become OUTBOUND_FULL_RELAY.

    See also #19405 (comment)


    vasild commented at 12:37 pm on July 2, 2020:

    Indeed! And I would take this one step further: to make the naming consistent and obvious we can distinguish 3 properties of a connection:

    1. inbound or outbound
    2. automatic or manual
    3. by capabilities: full relay, feeler, block relay, solicit addresses

    and then name the constants like prop1_prop2_prop3. Something like:

    0INBOUND_AUTO_FULL
    1OUTBOUND_AUTO_FULL
    2OUTBOUND_MANUAL_FULL
    3OUTBOUND_AUTO_FEELER
    4OUTBOUND_AUTO_BLOCK
    5OUTBOUND_AUTO_ADDR
    

    sipa commented at 7:02 pm on July 7, 2020:

    I see how that would add some clarity, but it’s also a bigger pain to maintain (if any other dimension of classification is added, all existing constants may need to be renamed).

    Perhaps having property test functions (IsFull(), IsOutbound(), IsFeeler(), …) on ConnectionType gets you some of the advantages of more elaborate type names, without the maintenance burden?


    amitiuttarwar commented at 1:07 am on July 23, 2020:
    introduced property test functions & these behaviors should be captured in the code comments of each type
  36. in src/net.cpp:108 in c1aed85090 outdated
    103@@ -104,10 +104,10 @@ std::map<CNetAddr, LocalServiceInfo> mapLocalHost GUARDED_BY(cs_mapLocalHost);
    104 static bool vfLimited[NET_MAX] GUARDED_BY(cs_mapLocalHost) = {};
    105 std::string strSubVersion;
    106 
    107-void CConnman::AddOneShot(const std::string& strDest)
    108+void CConnman::AddAddrFetch(const std::string& strDest)
    


    ariard commented at 2:41 am on July 3, 2020:
    OneShot was a terrible name for sure but at least the temporary or boostrap-side of this type of connection was implied by old name. How about AddrSeeding to underscore this connection is only made at the starting phase ?

    amitiuttarwar commented at 1:13 am on July 23, 2020:
    hmm, I feel like this aspect could change over time- eg. we decide to use these conns to seed more addresses at other times. although I will improve the comment on the enum to express the current usage.

    amitiuttarwar commented at 9:14 pm on July 23, 2020:
    updated the comment to short lived connections used to solicit addrs when starting the node without a populated AddrMan, let me know if this feels clear
  37. in src/net.cpp:2743 in c035da9ba9 outdated
    2744@@ -2745,7 +2745,7 @@ CNode::CNode(NodeId idIn, ServiceFlags nLocalServicesIn, int nMyStartingHeightIn
    2745     // Don't relay addr messages to peers that we connect to as block-relay-only
    2746     // peers (to prevent adversaries from inferring these links from addr
    2747     // traffic).
    2748-    m_addr_known{conn_type_in == ConnectionType::BLOCK_RELAY ? nullptr : MakeUnique<CRollingBloomFilter>(5000, 0.001)},
    2749+    m_addr_known{nullptr},
    


    ariard commented at 2:47 am on July 3, 2020:
    Have you tried to introduce a ADDR_RELAY_ONLY or change the equality of IsAddrRelayPeer() to conn_type_in != ConnectionType::BLOCK_RELAY ?

    amitiuttarwar commented at 1:22 am on July 23, 2020:

    what’s the advantage?

    its true that currently m_addr_known is set for everything block-relay-only connections, but if we hard code IsAddrRelayPeer() to be a block-relay-only connection, we’d have two entry points that could potentially get out of sync unless we remember to check & update, right?


    ariard commented at 2:41 pm on July 26, 2020:

    I think it simplifies code understanding. To know how IsAddrRelayPeer is going to behave, I have first to read function declaration and then CNode constructor on how we initialize m_addr_known in function of connection type. Note, it’s already how the current code behaves, so it was just a suggestion of improvement.

    I guess this discussion belong to the more general one of a connection-attribute model where we annotate them on a node at connection initialization.


    amitiuttarwar commented at 10:34 pm on August 14, 2020:
    I tried it out here: 9345f9c9d47a88987606d1421823914a4c48dbf9, but still have the same concern, which I noted here: #19724 (comment).
  38. in src/net.cpp:1838 in f2e183e579 outdated
    1834@@ -1835,7 +1835,7 @@ void CConnman::ThreadOpenConnections(const std::vector<std::string> connect)
    1835                     setConnected.insert(pnode->addr.GetGroup(addrman.m_asmap));
    1836                     if (pnode->m_tx_relay == nullptr) {
    1837                         nOutboundBlockRelay++;
    1838-                    } else if (pnode->m_conn_type != ConnectionType::FEELER) {
    1839+                    } else if (pnode->m_conn_type == ConnectionType::OUTBOUND) {
    


    ariard commented at 2:57 am on July 3, 2020:
    Side-note, I think it would have be better to fix bug before introducing connection type for review clarity. Reviewers now have to review a fix relying on new code, less understood IMHO

    amitiuttarwar commented at 1:16 am on July 23, 2020:
    👍 noted
  39. in src/net.h:120 in 4f0449ae4a outdated
    118-    OUTBOUND,
    119-    MANUAL,
    120-    FEELER,
    121-    BLOCK_RELAY,
    122-    ADDR_FETCH,
    123+    INBOUND, /**< peer initiated connections */
    


    ariard commented at 3:02 am on July 3, 2020:

    It’s great to see documentation around types. I think you should go further and describe exactly the set of assumptions we are doing for each type (like outbound more trusted than inbound, outbound favored for relay, block_relay_only to avoid leaking the full of our block-topology …) and design goals of more obscure ones (feeler to sanitize addr db, addr fetch to boostrap our view of the network). With ConnectionType you can even grep code locations and points directly to where we are taking decisions based on type. That would be awesome for newcomers :)

    As a follow-up ofc.

  40. ariard commented at 3:43 am on July 3, 2020: member

    Concept ACK, I think this simplify understanding of our zoo of peers and open the way a) to better coverage of p2p b) introduction of more fine-grained connection policy c) introduction of new kind of connection (e.g anchors, …)

    But I disagree on the “flat types” approach as it has been underscored by some other reviewers.

    I think types can be analyzed as a set of attributes, each of them representing some property of the connection, like initiation (inbound/outbound), mode of establishment (manual/auto), classes of traffic (blocks, addrs, transactions, …), special behaviors to sanitize or boostrap internal states (feeler, addr_fetch). Encoding connections at the attributes-level is a potential alternative to the type one currently proposed here.

    To give a better idea of the attributes approach, I did a PoC implem here : https://github.com/bitcoin/bitcoin/compare/master...ariard:2020-07-review-conn-types

    How to judge between them ?

    It has been argued that the flat approach is better for clarity. I suppose that the “clarity” referred here is reviewers understanding but a) for newcomers it’s counter-intuitive to think that distinc types overlap in their behavior, like both MANUAL and OUTBOUND are currently full-relay, b) at review, to understand any change of policy connection based on type you would have to refer to type documentation instead of directly observing the attribute involved in control flow.

    Another argument is combinatorial complexity for testing with a bitmask. I agree there, see the attributes-approach implemation proposed, I do think it’s super easy to test behavior adherence assuming a default layout of full-relay, outbound, auto (and I guess the type-approach implicitely do the same). The ConnectionAttributes can leak as parameters to the new addconnection call, this not having to test all cases. If you have a precise test example where the attributes-approach fails, can you point it to me ?

    With regards to code-simplicity, we make connection policy decisions based on attributes assertion (“Is this node an outbound one ?”, “Is this node a feeler ?”), a mutual exclusivity approach to answer to the first example must test for OUTBOUND, BLOCK_RELAY, MANUAL, FEELER, ADDR_FETCH, etco i.e O(n) check for any type supporting the attribute. An API-based approach (IsOutbound) is just O(1). This issue is going to become worst if we introduce anchor peers or tx-relay-only ones, both of them likely to be outbound.

    A more pragmatic example where the type approach fails for code simplicity is here : https://github.com/bitcoin/bitcoin/blob/44912e821aad58b176e79598aa7f1845b1345aae/src/net_processing.cpp#L2253. How are going to code the check if we add manual block_relay_only ?

    Overall, I think that avoiding the combinatorial complexity is the right move but we can still have a more fine-grained approach that the type one assuming we analyze and decouple attributes rightly. It might be more time-review consuming but likely being beneficial as we introduce more connection policies and new connections behaviors in the future.

  41. DrahtBot added the label Needs rebase on Jul 7, 2020
  42. sipa commented at 6:56 pm on July 7, 2020: member

    Big concept ACK.

    I’m a bit uncomfortable with the == ConnectionType::X tests, and to a lesser extent with switch/case statements spread out over the code. Things are relatively simple because all “special” types are outbound, but it’s not too hard to imagine that another class of inbound connections is added at some point, which would need reconsideration of all == ConnectionType::INBOUND tests.

    What do you think about having a few simple functions IsOutbound(), IsManual(), IsBlocksOnly(), … that take a ConnectionType, and which are the only places where any comparisons or switch/case statements on ConnectionTypes are peformed?

  43. vasild commented at 2:07 pm on July 9, 2020: member

    What do you think about having a few simple functions IsOutbound(), IsManual(), IsBlocksOnly(), … that take a ConnectionType

    +1 :bulb:

    Or even have it all together in a class like:

     0class ConnectionType
     1{
     2public:
     3    ConnectionType(bool flag1, bool flag2, ...); // assert or throw if the flags contradict
     4
     5    bool IsInbound();
     6    bool IsManual();
     7    bool IsGreen();
     8    ...
     9 
    10private:
    11    bool flag1;
    12    bool flag2;
    13    ...
    14};
    
  44. dongcarl commented at 5:49 pm on July 16, 2020: member

    utACK 43cec7b2a1a932f7784b037601a4fa3895e082b7

    Very nice simplification, reasonably easy to see its correctness too!


    From sipa:

    What do you think about having a few simple functions IsOutbound(), IsManual(), IsBlocksOnly(), … that take a ConnectionType, and which are the only places where any comparisons or switch/case statements on ConnectionTypes are peformed?

    I would very much like to see something like this implemented with simple functions, some “property tests” may be layer specific, so the logic would fit better in a the right file rather than in the ConnectionType class.

    I was worried about the case where new cases are added to the enum in the future, but realized that since we’ll still have the switch statements in the functions the compiler will still force us to handle these new cases.

  45. amitiuttarwar force-pushed on Jul 22, 2020
  46. amitiuttarwar force-pushed on Jul 23, 2020
  47. amitiuttarwar commented at 0:07 am on July 23, 2020: contributor

    hi everyone, thanks so much for these thoughtful reviews! 🙌🏽

    Current state of PR:

    I’ve been considering the different options & tinkering. here’s what I’ve got: I’ve incorporated miscellaneous review comments and updated the design to make m_conn_type private & expose an interface of accessor functions. I’m currently proposing the final state of this PR. Once the changes look good, I’ll need to fix up the commit breakdown a bit. That said, I hope the current breakdown is conceptually helpful for review. I’d just need to incorporate the m_conn_type changes to be earlier before it makes sense for master.

    What do you think about having a few simple functions IsOutbound(), IsManual(), IsBlocksOnly(), … that take a ConnectionType, and which are the only places where any comparisons or switch/case statements on ConnectionTypes are performed? @sipa I’ve taken a stab at it. could you review the final state of this branch? does this match your expectations? (changes are essentially in 05e85161c390ba5aa6149bcdf4a4c560a45032f3) @dongcarl I didn’t fully understand what you meant here: I would very much like to see something like this implemented with simple functions, some “property tests” may be layer specific, so the logic would fit better in a the right file rather than in the ConnectionType class.

    do my changes fit into what you’re looking for? otherwise can you help me understand what you mean? @ariard, @vasild you two expressed interest in something very similar. what do you think of the current changeset?

    Conceptual design conversation:

    a few different ideas have been proposed through the review on this PR so far. at a high level- I stand by the core concept of this PR as extracting the connection types into an enum. I think we could potentially incorporate a class level abstraction in the future. but that the enum suffices for our needs right now, simplifies the current state of the code, and supports increased test coverage of connections as proposed in #19315.

    I’d like to emphasize the purpose of the enum: it represents the information we have available at the time of opening or accepting the connection. which is why, for example, we only have 1 inbound type. thus, the information the enum is communicating is only a subset of what is relevant to a connection.

    it doesn’t have information like noban logic, tx relay info negotiated through the p2p, etc. I don’t think its of interest to try to capture all possible permutations of that information here. For example, we support blocks-relay-only connections, but also have blocks-only mode. So full-relay connections for a node started in blocks-only mode are pretty much blocks-relay-only connections, but still have some small differences in practice. and then layer on nuances of noban permissions and omg its wild.

    basically, my point is, there is a ton of nuance to the interactions of different types of behaviors of connections. by capturing one concrete piece of information (what is our intent when we open the connection?) and flattening it into an XOR via using an enum, I believe this greatly simplifies the complexity of considering all possibilities. here, the compiler takes care of enforcing valid states rather than our minds. @ariard do these ideas make sense to you? Let’s get into some specifics.

    I reviewed your proposed changes. the main thing thats missing from 0b7c86158e9a344e0f379d3cef98cab1b96ecc29 is the lack of behavioral enforcement. Eg, according to this structure, it would be valid to have an inbound, manual, feeler, block_relay_only connection. The 4 different attributes give us 36 possible types of connections. This makes it very difficult for me to reason about c8ec734715a5d5efdf96736495c0e3c06b67d061. To strengthen your proposal, I would add a series of assertions on the class constructor that enforce things like “if its inbound, it must be auto, full_relay, and regular” But then we would be left with just 6 valid combinations from the 36 possibilities- the ones that are currently captured in the ConnectionTypes enum.

    RE adding manual block-relay-only connections: if the enum approach is accepted and I wanted to implement this as a new connection type, I would simply s/BLOCK_RELAY/BLOCK_RELAY_AUTO and introduce a new BLOCK_RELAY_MANUAL, then visit the call-sites to decide the desired behavior in various circumstances.

    as I mentioned above, I think this extraction into a separate class might make sense as a direction to be moving towards in the long run, but I think it’s premature right now. what do you think about the latest proposal of having the interface to m_conn_type as a private member? to me, it seems like a step in that direction.

    Other misc review comments:

    • I believe I’ve fixed up the fuzz builds on all the intermediary commits, but could use a double check @vasild - think you might have answers to these from the review process, but just incase
    • I updated the enum to have more descriptive explanations. hows it look?
    • the bug fix is for a bug currently on master

    … and if you’ve made this far, THANKS FOR READING 🙃

  48. DrahtBot removed the label Needs rebase on Jul 23, 2020
  49. amitiuttarwar force-pushed on Jul 23, 2020
  50. in src/net.h:849 in 2e3514a35d outdated
    838+            case ConnectionType::FEELER:
    839+                return true;
    840+            default:
    841+                assert(false);
    842+        }
    843+    }
    


    vasild commented at 12:01 pm on July 23, 2020:

    I think remove the default: (so that we get a warning when a new enum is added and this code not updated to handle it) and put the assert(false) after the switch (so that the compiler stops complaining about reaching end of non-void function).

    If some compiler is not smart enough to see that assert(false) is not going to return, then use abort() which is marked as [[noreturn]].


    amitiuttarwar commented at 9:13 pm on July 23, 2020:
    updated. I think this will also appease the compiler, but will wait for travis build to confirm..
  51. amitiuttarwar force-pushed on Jul 23, 2020
  52. amitiuttarwar force-pushed on Jul 23, 2020
  53. amitiuttarwar commented at 9:17 pm on July 23, 2020: contributor

    updated comment & switch statement handling for non-enum values

    all review comments should be addressed

  54. ariard commented at 3:27 pm on July 26, 2020: member

    Why we do we have different types of connections ? AFAICT we enforce different p2p policies in function of them, i.e don’t evict outbound ones (AttemptToEvictConnection) when slots are full or enforce automatic, outbound connections offer desirable services (at version reception). These types are either automatically selected at connection opening, negotiated through p2p messages (i.e blocks-only), set by node operators, … Mapping attributes (or property, same thing) directly to a p2p policy make it far easier to analyze. You have a one-to-one relation, attribute -> policy, rather with your model you have to go through the process: type (set of implicit attributes) -> attribute -> policy.

    To strengthen your proposal, I would add a series of assertions on the class constructor that enforce things like “if its inbound, it must be auto, full_relay, and regular” But then we would be left with just 6 valid combinations from the 36 possibilities- the ones that are currently captured in the ConnectionTypes enum.

    That was a PoC :p Of course you could add assertions. But again if you capture the same combination in current ConnectionTypes enum, for each property/attribute test you have to list each of them which support it. A class level abstraction asks would encapsulate complexity all for once.

    RE adding manual block-relay-only connections: if the enum approach is accepted and I wanted to implement this as a new connection type, I would simply s/BLOCK_RELAY/BLOCK_RELAY_AUTO and introduce a new BLOCK_RELAY_MANUAL, then visit the call-sites to decide the desired behavior in various circumstances.

    If you just call SetBlockRelayOnly, SetManual at CNode creation there is no more need to sweep through every callsites to be sure every p2p policy is consistent with each new combination of properties/attributes.

    That said, I think we agree mostly on the direction, it’s more about the steps. 6a1259c is moving in the right discussion, and gathering all “known-at-connection-initiation” types in same place is required before to do something more structured like a class.

  55. in src/net.h:819 in 6a1259c574 outdated
    816+
    817+    bool IsBlockOnlyConn() const {
    818+        return m_conn_type == ConnectionType::BLOCK_RELAY;
    819+    }
    820+
    821+    bool IsFeelerConn() const {
    


    ariard commented at 3:34 pm on July 26, 2020:

    I agree this is clearly an improvement. When we fine-grain them more in the future, for each of them I think we could have comments like

    0This connection implements the feeler-behavior, aiming to periodically sanitize our addr db.
    1It's automatically-initiated every FEELER_INTERVAL. It can't be user-triggered.
    2Once we verify an online peer is associated to the addr by receiving a VERSION msg, disconnect
    

    amitiuttarwar commented at 3:47 am on July 30, 2020:
    ya, agree this would be a good place for any additional docs- could add doxygen annotations to these functions. but going to leave for a follow-up (feel free to propose if you’re interested :))

    amitiuttarwar commented at 10:37 pm on August 14, 2020:
    added more docs in follow up PR #19724 here: 899f3b7416317f8d9aacf915dfdc7fb0beb9f28b
  56. DrahtBot added the label Needs rebase on Jul 26, 2020
  57. dongcarl commented at 5:54 pm on July 27, 2020: member
    @amitiuttarwar The changes in 6a1259c574c96c7059e3b6a61b79d56d59c48508 are exactly what I meant :-) Looks like perhaps we can remove m_addr_fetch as a followup.
  58. amitiuttarwar force-pushed on Jul 30, 2020
  59. amitiuttarwar commented at 3:58 am on July 30, 2020: contributor

    thanks for the feedback @vasild, @ariard, @dongcarl !

    I’ve restructured the commits to incorporate having m_conn_type as a private member, and removed all the CNode flags (including fInbound and m_addr_fetch). There are a lot of commits, but I tried hard to make it coherent & each commit super simple. Hopefully it should be fairly straightforward to review in its current state.

    all feedback is addressed, commits are coherently structured. please evaluate for master

  60. DrahtBot removed the label Needs rebase on Jul 30, 2020
  61. in src/chainparams.cpp:113 in c9bce99b25 outdated
    109@@ -110,7 +110,7 @@ class CMainParams : public CChainParams {
    110 
    111         // Note that of those which support the service bits prefix, most only support a subset of
    112         // possible options.
    113-        // This is fine at runtime as we'll fall back to using them as a oneshot if they don't support the
    114+        // This is fine at runtime as we'll fall back to using them as a addrfetch if they don't support the
    


    jnewbery commented at 3:28 pm on August 3, 2020:
    nit: should be “an addrfetch”.

    amitiuttarwar commented at 0:33 am on August 8, 2020:
    fixed
  62. in src/net.h:876 in c9bce99b25 outdated
    872@@ -792,8 +873,8 @@ class CNode
    873 
    874     // flood relay
    875     std::vector<CAddress> vAddrToSend;
    876-    const std::unique_ptr<CRollingBloomFilter> m_addr_known;
    877-    bool fGetAddr{false};
    878+    std::unique_ptr<CRollingBloomFilter> m_addr_known;
    


    jnewbery commented at 3:30 pm on August 3, 2020:
    nit: default initialize this to nullptr here and remove it from the initializer list.

    amitiuttarwar commented at 0:35 am on August 8, 2020:
    done
  63. in src/net.h:877 in c9bce99b25 outdated
    872@@ -792,8 +873,8 @@ class CNode
    873 
    874     // flood relay
    875     std::vector<CAddress> vAddrToSend;
    876-    const std::unique_ptr<CRollingBloomFilter> m_addr_known;
    877-    bool fGetAddr{false};
    878+    std::unique_ptr<CRollingBloomFilter> m_addr_known;
    879+    bool fGetAddr{false}; //!< used to prevent relaying addr messages sent in response to a GETADDR request
    


    jnewbery commented at 3:36 pm on August 3, 2020:
    nit: This comment doesn’t seem to be part of the rest of the changes in this PR. (I also think fGetAddr should probably be removed since the way it currently gets set to false doesn’t make much sense).

    amitiuttarwar commented at 5:25 pm on August 4, 2020:

    heh, yeah, I was trying to sneak it in. but you’re right, it doesn’t fit in. I’ll remove from this PR

    agree that fGetAddr is weird. +1 to a rethink.


    amitiuttarwar commented at 0:35 am on August 8, 2020:
    removed
  64. in src/net.cpp:1945 in c9bce99b25 outdated
    1946+            } else if (nOutboundBlockRelay < m_max_outbound_block_relay) {
    1947+                conn_type = ConnectionType::BLOCK_RELAY;
    1948+            } else if (GetTryNewOutboundPeer()) {
    1949+                conn_type = ConnectionType::OUTBOUND;
    1950+            } else {
    1951+                assert(false);
    


    jnewbery commented at 4:19 pm on August 3, 2020:
    I don’t like this new assert. It seems safe for now because of the logic earlier in this function, but I don’t think we should rely on that (or something else) not changing. Why not just maintain the old behaviour of this always creating OUTBOUND connections when they’re not FEELER or BLOCK_RELAY

    amitiuttarwar commented at 5:23 pm on August 4, 2020:

    hm, I don’t quite follow your logic. If a developer were to change the function’s logic flow without properly attending to this switch statement, in one case we’d silently open (short lived) OUTBOUND connections & in the other case the developer would hit a compiler error. is this different than how you’re thinking about the use cases?

    also, see #19316 (review) for more convo on this topic


    jnewbery commented at 12:24 pm on August 6, 2020:

    This isn’t a switch statement, and there aren’t any compile time checks. assert(false) is a runtime error.

    Here’s what I’m worried about:

    • in line 1856 (https://github.com/bitcoin/bitcoin/pull/19316/files#diff-9a82240fe7dfe86564178691cc57f2f1R1856), both outbound and block-relay-only are full, and GetTryNewOutboundPeer() is true, so we don’t set fFeeler to true.
    • meanwhile, on the ~message handler~ scheduler thread, SetTryNewOutboundPeer(false) is called.
    • we then reach this line where fFeeler is false, both outbound and block-relay-only are full, and GetTryNewOutboundPeer() is false. We assert and terminate.

    You’ve justified this change above:

    GetTryNewOutboundPeer is simply a wrapper for m_try_another_outbound_peer, which is set to true when a stale tip is detected (link).

    so, when we get to this current conditional, if fFeeler if false & our full-relay and block-relay connection counts are at the allocated max -> GetTryNewOutboundPeer must be true.

    m_try_another_outbound_peer is updated by another thread, and there’s no lock on that data between the conditional above and this line, so there’s no guarantee that it won’t change. We may be lucky and some combination of other conditions in the eviction/stale tip logic in net_processing means this won’t actually happen, but it seems possible that there’s a window condition where it could, which would result in users’ nodes very intermittently terminating.

    I think this conn_type logic can probably be moved up and merged with the fFeeler setting logic above (fFeeler, nOutboundFullRelay and nOutboundBlockRelay don’t get updated in the intervening code, so why not just set conn_type above?), but I don’t think we should complicate this PR by doing that here. For this PR, I think we should just not add this assert.

    EDIT: SetTryNewOutboundPeer() is called on the scheduler thread, not message handler thread. Everything else still stands.


    amitiuttarwar commented at 9:39 pm on August 7, 2020:
    ah, I missed that its updated by a separate thread. thanks for this explanation!

    amitiuttarwar commented at 0:36 am on August 8, 2020:
    updated. nice catch!

    amitiuttarwar commented at 10:43 pm on August 14, 2020:
    I’ve taken a stab at a further simplification on the follow-up PR #19724 here: 6bdc1e3a693081e1f12a1a0b2411ac129d2b2923
  65. in src/net.h:831 in c9bce99b25 outdated
    826+
    827+    bool IsInboundConn() const {
    828+        return m_conn_type == ConnectionType::INBOUND;
    829+    }
    830+
    831+    bool IsUniqueNetBlockConn() const {
    


    jnewbery commented at 4:33 pm on August 3, 2020:

    I don’t love this (or ExpectServicesFromConn() below) where logic that is only used in one calling function is placed inside the class. IsUniqueNetBlockConn() is very specific to ThreadOpenConnections() so it makes sense to me that the logic and comment should live in that function.

    I realise this sounds like a contradiction of #19316 (comment), but to me these very specific cases don’t belong inside the object.


    amitiuttarwar commented at 5:33 pm on August 4, 2020:

    I don’t have a strong preference either way, but agree with you around IsUniqueNetBlockConn and ExpectServicesFromConn not being particularly helpful. Since both are implemented as switch statements, I think I’ll revert to having them integrated into their calling functions.

    some of the other m_conn_type wrappers are also only used in one location. eg. IsFeelerConn & IsBlockOnlyConn. But I’m going to leave them as is.

    reviewers- if you disagree, please help me understand why!


    amitiuttarwar commented at 0:38 am on August 8, 2020:
    I ended up inlining IsUniqueNetBlockConn, but left ExpectServicesFromConn because I couldn’t figure out a clear way to inline while maintaining a switch statement.
  66. jnewbery commented at 4:37 pm on August 3, 2020: member

    Looks good. A few small nits inline. There’s one change that I don’t like (adding the assert in ThreadOpenConnections), and one style change which I don’t love but could live with (IsUniqueNetBlockConn). Otherwise this looks great.

    I’d really like to rebase my net/net_processing split branch on this because I expect it’ll simplify some of the changes I’m making, so it’d be good to get this merged soon.

  67. scripted-diff: Rename OneShot to AddrFetch
    -BEGIN VERIFY SCRIPT-
    sed -i 's/a oneshot/an addrfetch/g' src/chainparams.cpp #comment
    sed -i 's/oneshot/addrfetch/g' src/net.cpp #comment
    sed -i 's/AddOneShot/AddAddrFetch/g' src/net.h src/net.cpp
    sed -i 's/cs_vOneShots/m_addr_fetches_mutex/g' src/net.h src/net.cpp
    sed -i 's/vOneShots/m_addr_fetches/g' src/net.h src/net.cpp
    sed -i 's/fOneShot/m_addr_fetch/g' src/net.h src/net.cpp src/net_processing.cpp
    sed -i 's/ProcessOneShot/ProcessAddrFetch/g' src/net.h src/net.cpp
    -END VERIFY SCRIPT-
    3f1b7140e9
  68. [net/refactor] Introduce an enum to distinguish type of connection
    - extract inbound & outbound types
    26304b4100
  69. [net/refactor] Add manual connections to ConnectionType enum 1521c47438
  70. [net/refactor] Add feeler connections to ConnectionType enum 0e52a659a2
  71. [net/refactor] Add block relay only connections to ConnectionType enum e1bc29812d
  72. [net/refactor] Extract m_addr_known logic from initializer list af59feb052
  73. [net/refactor] Add AddrFetch connections to ConnectionType enum
    - AddrFetch connections are short lived connections used to getaddr from a peer
    - previously called "one shot" connections
    442abae2ba
  74. [doc] Describe different connection types 46578c03e9
  75. [net/refactor] Add connection type as a member var to CNode
    - Directly maintaining the connection type prevents having to deduce it from
      several flags.
    d3698b5ee3
  76. [net/refactor] Remove m_manual_connection flag from CNode 49efac5cae
  77. [net/refactor] Remove fFeeler flag from CNode 14923422b0
  78. [net/refactor] Remove m_addr_fetch member var from CNode 7b322df629
  79. [net/refactor] Remove fInbound flag from CNode 60156f5fc4
  80. [net/refactor] Clarify logic for selecting connections in ThreadOpenConnections 4972c21b67
  81. [net] Fix bug where AddrFetch connections would be counted as outbound full relay
    The desired logic is for us to only open feeler connections after we have hit
    the max count for outbound full relay connections.  A short lived AddrFetch
    connection (previously called oneshot) could cause ThreadOpenConnections to
    miscount and mistakenly open a feeler instead of full relay.
    35839e963b
  82. [net/refactor] Rework ThreadOpenConnections logic
    Make the connection counts explicit and extract into interface functions around
    m_conn_type. Using explicit counting and switch statements where possible
    should help prevent counting bugs in the future.
    7f7b83deb2
  83. [net/refactor] Simplify multiple-connection checks
    Extract logic that check multiple connection types into interface functions &
    structure as switch statements. This makes it very clear what touch points are
    for accessing `m_conn_type` & using the switch statements enables the compiler
    to warn if a new connection type is introduced but not handled for these cases.
    2f2e13b6c2
  84. [refactor] Remove IsOutboundDisconnectionCandidate bc5d65b3ca
  85. [net] Remove unnecessary default args on CNode constructor 01e283068b
  86. amitiuttarwar force-pushed on Aug 8, 2020
  87. in src/net.h:125 in 46578c03e9 outdated
    126+    INBOUND, /**< peer initiated connections */
    127+    OUTBOUND, /**< full relay connections (blocks, addrs, txns) made automatically. Addresses selected from AddrMan. */
    128+    MANUAL, /**< connections to addresses added via addnode or the connect command line argument */
    129+    FEELER, /**< short lived connections used to test address validity */
    130+    BLOCK_RELAY, /**< only relay blocks to these automatic outbound connections. Addresses selected from AddrMan. */
    131+    ADDR_FETCH, /**< short lived connections used to solicit addrs when starting the node without a populated AddrMan */
    


    sdaftuar commented at 10:15 am on August 8, 2020:

    I wonder if it might make sense to prefix the different outbound types with OUTBOUND_, so that it’s clear to code readers in the future that these are all different types of outbound peers? In the future, for instance, I’m hoping to propose a way to negotiate block-relay-only connections at connection startup time, and then we’d be able to have both OUTBOUND_BLOCK_RELAY and INBOUND_BLOCK_RELAY peers.

    If you were to go this route, then I’d suggest OUTBOUND could be renamed to OUTBOUND_FULL_RELAY or something like that.


    jonatack commented at 10:52 am on August 8, 2020:
    This was my feedback as well: #19316 (review)

    sdaftuar commented at 11:57 am on August 8, 2020:
    Oops, sorry for the noise and missing the earlier discussion. This would be fine to do later as well once we get around to adding more inbound types.

    amitiuttarwar commented at 5:30 pm on August 8, 2020:

    Oops, sorry for the noise and missing the earlier discussion.

    no worries. would rather you read the code :)

    I’m hoping to propose a way to negotiate block-relay-only connections at connection startup time, and then we’d be able to have both OUTBOUND_BLOCK_RELAY and INBOUND_BLOCK_RELAY peers.

    oh interesting! curious to see how it develops.

    I’ve tended towards sticking with the simpler names because (1) its hard to discern exactly what properties will be relevant for the future eg. ariard mentioned manual block relay connections / could lead to maintenance burden as sipa mentioned. (2) it should be easy to rename in the future with a scripted diff (3) just as a personal preference, eg. OUTBOUND_AUTO_FEELER feels unnecessarily verbose when we don’t have an idea of any other types of feeler connections. or INBOUND_AUTO_FULL could be slightly misleading because maybe its actually only blocks being relayed on this connection.


    sdaftuar commented at 3:30 pm on August 10, 2020:
    Perhaps it would be helpful to mention that -seednode peers specified on the command line get converted to ADDR_FETCH peers?

  88. sdaftuar commented at 10:32 am on August 8, 2020: member

    Concept ACK, thanks for taking this on, this looks like a great cleanup of some pretty messy code.

    I would need to do some more thorough code review than the one-pass I just did in order to ACK the change myself; in particular I am not sure we have comprehensive testing in place for all this code, so the next step for me is to see what kinds of errors I can include that would not cause any tests to fail (as a way to focus my review).

    EDIT: I noticed there was some discussion before of having flat types versus a set of attributes. I agree with the approach of this PR, which is to not have a set of attributes, many combinations of which don’t make sense – the flat-types here match our design understanding of how we reason about different behaviors, and the specific details around exactly what happens for each is not something we should want to expose in too many places.

  89. in src/net.h:121 in 01e283068b
    112@@ -113,6 +113,17 @@ struct CSerializedNetMsg
    113     std::string m_type;
    114 };
    115 
    116+/** Different types of connections to a peer. This enum encapsulates the
    117+ * information we have available at the time of opening or accepting the
    118+ * connection. Aside from INBOUND, all types are initiated by us. */
    119+enum class ConnectionType {
    120+    INBOUND, /**< peer initiated connections */
    121+    OUTBOUND, /**< full relay connections (blocks, addrs, txns) made automatically. Addresses selected from AddrMan. */
    


    laanwj commented at 3:03 pm on August 10, 2020:

    I like the idea of a connection type enum, but I’d like to rename INBOUND and OUTBOUND here as it’s a bit of a odd man out (as commented earlier, it’s not the exclusive thing identifying this type of connection). Maybe FULL_RELAY_INBOUND and FULL_RELAY_OUTBOUND?

    Edit: Oh, I see this has already been discussed a lot, sorry!


    amitiuttarwar commented at 4:44 pm on August 10, 2020:
    with INBOUND I do think this is the only known characteristic of the connection, but I’m happy to rename OUTBOUND to FULL_RELAY_OUTBOUND.

    laanwj commented at 5:15 pm on August 10, 2020:
    Sounds good to me!

    amitiuttarwar commented at 10:54 pm on August 14, 2020:
    renamed OUTBOUND to OUTBOUND_FULL_RELAY in PR #19724 here: 34135a51775dc72809097cfb90b8611eee00106e
  90. in src/net.h:878 in d3698b5ee3 outdated
    874@@ -875,6 +875,7 @@ class CNode
    875 private:
    876     const NodeId id;
    877     const uint64_t nLocalHostNonce;
    878+    const ConnectionType m_conn_type;
    


    sdaftuar commented at 3:43 pm on August 10, 2020:
    Would be great to expose this in the getpeerinfo RPC (not necessarily in this PR).

    amitiuttarwar commented at 2:28 am on August 15, 2020:
    done in #19725
  91. in src/net_processing.cpp:3451 in 01e283068b
    3446@@ -3455,7 +3447,7 @@ void ProcessMessage(
    3447         // to users' AddrMan and later request them by sending getaddr messages.
    3448         // Making nodes which are behind NAT and can only make outgoing connections ignore
    3449         // the getaddr message mitigates the attack.
    3450-        if (!pfrom.fInbound) {
    3451+        if (!pfrom.IsInboundConn()) {
    3452             LogPrint(BCLog::NET, "Ignoring \"getaddr\" from outbound connection. peer=%d\n", pfrom.GetId());
    


    sdaftuar commented at 4:09 pm on August 10, 2020:

    Would it make sense to update the log message here to output the actual ConnectionType of the peer? The use of “outbound” in the log message could be confusing otherwise, given that our enum is OUTBOUND but this could be any of the outbound connection types.

    It might similarly be good to update the LogPrintf a few lines down (where we print “Ignoring getaddr from block-relay-only peer”).


    amitiuttarwar commented at 6:23 pm on August 11, 2020:

    I still have to test to make sure this works properly, but first pass at an implementation here: https://github.com/amitiuttarwar/bitcoin/commit/9162d44d2509e38e2e83e1f4315232d34d1fd777.

    • since I’m touching it anyway, I removed the second LogPrintf because it looks like dead code and I think concerns should be sufficiently addressed with this PR. convo on original for reference: https://github.com/bitcoin/bitcoin/pull/15759/files#r314860750
    • the ConnectionTypeToString function can be used for other log statements, and I’m hoping to use the same to expose in getpeerinfo (but haven’t looked at implementing that yet).

    amitiuttarwar commented at 2:28 am on August 15, 2020:
    done in #19725
  92. in src/net.h:211 in 01e283068b
    207@@ -197,7 +208,7 @@ class CConnman
    208     bool GetNetworkActive() const { return fNetworkActive; };
    209     bool GetUseAddrmanOutgoing() const { return m_use_addrman_outgoing; };
    210     void SetNetworkActive(bool active);
    211-    void OpenNetworkConnection(const CAddress& addrConnect, bool fCountFailure, CSemaphoreGrant *grantOutbound = nullptr, const char *strDest = nullptr, bool fOneShot = false, bool fFeeler = false, bool manual_connection = false, bool block_relay_only = false);
    212+    void OpenNetworkConnection(const CAddress& addrConnect, bool fCountFailure, CSemaphoreGrant *grantOutbound = nullptr, const char *strDest = nullptr, ConnectionType conn_type = ConnectionType::OUTBOUND);
    


    sdaftuar commented at 4:21 pm on August 10, 2020:
    I think we can get rid of all default arguments here now?


    amitiuttarwar commented at 10:57 pm on August 14, 2020:
    incorporated into #19724 (eda5b1582dc8ae808dc957f69f5fa0ec8eb7ef0b)
  93. in src/net_processing.cpp:1968 in 01e283068b
    1965                 }
    1966             }
    1967         }
    1968 
    1969-        if (!pfrom.fDisconnect && IsOutboundDisconnectionCandidate(pfrom) && nodestate->pindexBestKnownBlock != nullptr && pfrom.m_tx_relay != nullptr) {
    1970+        if (!pfrom.fDisconnect && pfrom.IsOutboundOrBlockRelayConn() && nodestate->pindexBestKnownBlock != nullptr && pfrom.m_tx_relay != nullptr) {
    


    sdaftuar commented at 4:26 pm on August 10, 2020:

    I think we can clean this up now to look like:

    0if (!pfrom.fDisconnect && pfrom.IsFullOutboundConn() && nodestate->pindexBestKnownBlock != nullptr) {
    

    amitiuttarwar commented at 10:57 pm on August 14, 2020:
    done in PR #19724 (68f9f9bddd1e5cb481934c1d725d7e7891b8f4fa)
  94. in src/net_processing.cpp:2406 in 01e283068b
    2402@@ -2410,7 +2403,7 @@ void ProcessMessage(
    2403         UpdatePreferredDownload(pfrom, State(pfrom.GetId()));
    2404         }
    2405 
    2406-        if (!pfrom.fInbound && pfrom.IsAddrRelayPeer())
    2407+        if (!pfrom.IsInboundConn() && pfrom.IsAddrRelayPeer())
    


    sdaftuar commented at 4:29 pm on August 10, 2020:

    I wonder if we should rewrite this to be more explicit about which connection types we’re interested in? Here, we’re mixing a new flat-type check with a bucket-of-bools-style check.

    Perhaps we should try to nuke IsAddrRelayPeer() (a function which checks for whether a data structure exists), or change its definition to only reference connection types?


    amitiuttarwar commented at 6:51 pm on August 11, 2020:

    agreed this can be made more clear. I took a stab at a restructure that essentially replaces IsAddrRelayPeer with RelayAddrsWithConn that checks the connection type instead of presence of m_addr_known. https://github.com/amitiuttarwar/bitcoin/commit/6662f5dbe1e513ea2cec9adbcc730555832f85d0. I’m still deciding how I feel about it and am interested in any feedback.

    pros:

    • checking for addr relay via connection type feels more conceptually straightforward based on current conditions
    • easy to update the logic in the future
    • consistent interface being adopted with this PR

    cons:

    • the main downside is that if the initialization conditions for m_addr_known are ever updated, RelayAddrWithConn must also be updated. and would be problematic if missed by the proposer / reviewers.

    amitiuttarwar commented at 11:02 pm on August 14, 2020:
    incorporated in #19724 (https://github.com/bitcoin/bitcoin/pull/19724/commits/9345f9c9d47a88987606d1421823914a4c48dbf9). I’ve noted this outstanding concern and am seeking opinions: #19724 (comment).
  95. laanwj commented at 4:38 pm on August 10, 2020: member
    Code review ACK 01e283068b9e6214f2d77a2f772a4244ebfe2274, the commits are pretty straightforward to follow, and I think this is a move in the right direction overall Let’s wait for @sdaftuar to do an additional pass at looking at the testing coverage before merge.
  96. sdaftuar approved
  97. sdaftuar commented at 5:24 pm on August 10, 2020: member

    ACK 01e283068b9e6214f2d77a2f772a4244ebfe2274.

    As I surmised, our testing of this code is pretty inadequate – I was able to break many things without seeing any test failures.

    So I did manually test each connection type, to make sure it appeared to work as expected still (and in the process learned something interesting about -seednode’s).

    I also did a more thorough reading of the code than I did in my first pass, and I agree with the other reviewers these changes look straightforward and correct.

    I left a few comments/suggestions, none of which should be blockers, as this code is a big improvement and I don’t think we need to hold this up while we nitpick minor improvements or aesthetics.

  98. in src/net.cpp:1847 in 01e283068b
    1854+                // also have the added issue that they could be attacker controlled and used
    1855+                // to prevent us from connecting to particular hosts if we used them here.
    1856+                switch(pnode->m_conn_type){
    1857+                    case ConnectionType::INBOUND:
    1858+                    case ConnectionType::MANUAL:
    1859+                        break;
    


    jnewbery commented at 12:42 pm on August 11, 2020:

    ~This should be a continue~

    Oops, no this is fine.

  99. in src/net.cpp:1958 in 01e283068b
    1959+            } else if (nOutboundFullRelay < m_max_outbound_full_relay) {
    1960+                conn_type = ConnectionType::OUTBOUND;
    1961+            } else if (nOutboundBlockRelay < m_max_outbound_block_relay) {
    1962+                conn_type = ConnectionType::BLOCK_RELAY;
    1963+            } else {
    1964+                // GetTryNewOutboundPeer() is true
    


    jnewbery commented at 12:43 pm on August 11, 2020:
    I’d prefer this comment if it said GetTryNewOutboundPeer() was true above, to avoid future confusion about thread race conditions.

    amitiuttarwar commented at 11:02 pm on August 14, 2020:
    fixed in #19724 by simplifying the function
  100. in src/net.h:856 in 01e283068b
    852@@ -792,7 +853,7 @@ class CNode
    853 
    854     // flood relay
    855     std::vector<CAddress> vAddrToSend;
    856-    const std::unique_ptr<CRollingBloomFilter> m_addr_known;
    857+    std::unique_ptr<CRollingBloomFilter> m_addr_known = nullptr;
    


    jnewbery commented at 12:45 pm on August 11, 2020:
    nit: more common style these days is to use braced initialization (although it’s not in the style guide).
  101. jnewbery commented at 12:48 pm on August 11, 2020: member

    ~There’s one bug that should be fixed before merge (break -> continue).~

    I’ve also made a couple of nits inline which you should ignore or deal with in a follow-up PR.

  102. jnewbery commented at 1:16 pm on August 11, 2020: member
    utACK 01e283068b9e6214f2d77a2f772a4244ebfe2274
  103. vasild commented at 1:33 pm on August 11, 2020: member

    Filtered code coverage report (files not modified by this PR are omitted and not modified lines in files that are otherwise modified are dimmed).

    List of modified and not covered lines.

  104. in src/test/net_tests.cpp:185 in 01e283068b
    185-    // Test that fFeeler is false by default.
    186-    std::unique_ptr<CNode> pnode1 = MakeUnique<CNode>(id++, NODE_NETWORK, height, hSocket, addr, 0, 0, CAddress(), pszDest, fInboundIn);
    187-    BOOST_CHECK(pnode1->fInbound == false);
    188-    BOOST_CHECK(pnode1->fFeeler == false);
    189+    std::unique_ptr<CNode> pnode1 = MakeUnique<CNode>(id++, NODE_NETWORK, height, hSocket, addr, 0, 0, CAddress(), pszDest, ConnectionType::OUTBOUND);
    190+    BOOST_CHECK(pnode1->IsInboundConn() == false);
    


    vasild commented at 4:43 pm on August 11, 2020:
    What about adding BOOST_CHECK(!pnode->IsFeelerConn());?

    amitiuttarwar commented at 11:04 pm on August 14, 2020:
    added a few more tests for explicit checking on #19724 here: 2df9ad4edd8263227e481bb732e1552286bfe960
  105. in src/net_processing.cpp:838 in 01e283068b
    842         LOCK(cs_main);
    843-        mapNodeState.emplace_hint(mapNodeState.end(), std::piecewise_construct, std::forward_as_tuple(nodeid), std::forward_as_tuple(addr, std::move(addrName), pnode->fInbound, pnode->m_manual_connection));
    844+        mapNodeState.emplace_hint(mapNodeState.end(), std::piecewise_construct, std::forward_as_tuple(nodeid), std::forward_as_tuple(addr, std::move(addrName), pnode->IsInboundConn(), pnode->IsManualConn()));
    845     }
    846-    if(!pnode->fInbound)
    847+    if(!pnode->IsInboundConn())
    


    vasild commented at 4:58 pm on August 11, 2020:
    nit: space after if
  106. in src/net.cpp:1844 in 01e283068b
    1851+                // Netgroups for inbound and manual peers are not excluded because our goal here
    1852+                // is to not use multiple of our limited outbound slots on a single netgroup
    1853+                // but inbound and manual peers do not use our outbound slots. Inbound peers
    1854+                // also have the added issue that they could be attacker controlled and used
    1855+                // to prevent us from connecting to particular hosts if we used them here.
    1856+                switch(pnode->m_conn_type){
    


    vasild commented at 4:59 pm on August 11, 2020:
    nit: space after switch
  107. in src/net.h:793 in 01e283068b
    788@@ -782,6 +789,60 @@ class CNode
    789     std::atomic_bool fPauseRecv{false};
    790     std::atomic_bool fPauseSend{false};
    791 
    792+    bool IsOutboundOrBlockRelayConn() const {
    793+        switch(m_conn_type) {
    


    vasild commented at 4:59 pm on August 11, 2020:
    nit: space after switch
  108. in src/net.h:832 in 01e283068b
    827+    bool IsInboundConn() const {
    828+        return m_conn_type == ConnectionType::INBOUND;
    829+    }
    830+
    831+    bool ExpectServicesFromConn() const {
    832+        switch(m_conn_type) {
    


    vasild commented at 4:59 pm on August 11, 2020:
    nit: space after switch
  109. vasild approved
  110. vasild commented at 5:00 pm on August 11, 2020: member

    ACK 01e283068

    Feel free to ignore some minor comments below.

  111. amitiuttarwar commented at 6:12 pm on August 11, 2020: contributor

    thanks for all the reviews! since tip currently has 4 ACKs, I’m hoping these changes are RFM.

    I’m currently incorporating review suggestions on this branch. I’m planning to open a follow-up PR addressing some of these further improvements.

    other possible follow-ups from review comments:

  112. jb55 commented at 6:45 pm on August 11, 2020: member
    wow this code was messy before… ACK 01e283068b9e6214f2d77a2f772a4244ebfe2274
  113. fanquake approved
  114. fanquake commented at 1:40 am on August 12, 2020: member
    ACK 01e283068b9e6214f2d77a2f772a4244ebfe2274 - I don’t have as much experience with the networking code but these changes look fairly straight forward, the new code seems more robust/understandable and the additional documentation is great. I’m glad that a followup branch is already underway. There might be some more review comments here later today, so keep an eye on the discussion, however I’m going to merge this now.
  115. fanquake merged this on Aug 12, 2020
  116. fanquake closed this on Aug 12, 2020

  117. sidhujag referenced this in commit a338d21a64 on Aug 14, 2020
  118. sipa commented at 11:50 pm on August 14, 2020: member
    Posthumous ACK 01e283068b9e6214f2d77a2f772a4244ebfe2274
  119. laanwj referenced this in commit 620ac8c475 on Sep 3, 2020
  120. MarcoFalke referenced this in commit 4f45ea1f73 on Sep 26, 2020
  121. sidhujag referenced this in commit e1572349ed on Sep 26, 2020
  122. Fabcien referenced this in commit 595b16f901 on Dec 21, 2020
  123. Fabcien referenced this in commit 63c2381bcc on Dec 21, 2020
  124. Fabcien referenced this in commit 46c6dfa251 on Dec 21, 2020
  125. Fabcien referenced this in commit 8657a7180b on Dec 21, 2020
  126. Fabcien referenced this in commit 503fd17ab9 on Dec 21, 2020
  127. Fabcien referenced this in commit e1e5aca390 on Dec 21, 2020
  128. Fabcien referenced this in commit e96417b802 on Dec 21, 2020
  129. Fabcien referenced this in commit 96337b2b77 on Dec 21, 2020
  130. deadalnix referenced this in commit 45f9e2cab4 on Dec 21, 2020
  131. deadalnix referenced this in commit 6a9f46ff27 on Dec 21, 2020
  132. Fabcien referenced this in commit f7d2a6ea27 on Dec 21, 2020
  133. Fabcien referenced this in commit 5b34890d57 on Dec 21, 2020
  134. Fabcien referenced this in commit ae844b952f on Dec 21, 2020
  135. Fabcien referenced this in commit a2e0908910 on Dec 21, 2020
  136. Fabcien referenced this in commit a0d96686f5 on Dec 21, 2020
  137. Fabcien referenced this in commit 5c39c9d8f0 on Dec 21, 2020
  138. Fabcien referenced this in commit e7ca7c2357 on Dec 21, 2020
  139. Fabcien referenced this in commit 91061791c0 on Dec 21, 2020
  140. Fabcien referenced this in commit 7584740e3f on Dec 21, 2020
  141. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me