[net] Cleanup connection types- followups #19724

pull amitiuttarwar wants to merge 10 commits into bitcoin:master from amitiuttarwar:2020-08-conn-refactor-followups changing 8 files +139 −75
  1. amitiuttarwar commented at 10:23 pm on August 14, 2020: contributor
    This PR addresses outstanding review comments from #19316. It further simplifies net.cpp complexity and adds documentation about design goals about different connection types.
  2. amitiuttarwar commented at 10:32 pm on August 14, 2020: contributor

    I’m not 100% convinced about replacing IsAddrRelayPeer() with RelayAddrsWithConn() 9345f9c9d47a88987606d1421823914a4c48dbf9.

    Pros:

    • It is more direct to check the connection type instead of a (missing) data structure based on the connection type.
    • Easy to update logic in the future
    • Less mental overhead, less code touch points to glean intention.

    Cons:

    • We are now checking the same conditional in two places- [1] & [2], and need to manually ensure they do not fall out of sync.

    Curious to hear what other reviewers think is preferable.

  3. amitiuttarwar renamed this:
    [net] Cleanup connection type- followups
    [net] Cleanup connection types- followups
    on Aug 14, 2020
  4. DrahtBot added the label P2P on Aug 14, 2020
  5. DrahtBot added the label Tests on Aug 14, 2020
  6. amitiuttarwar force-pushed on Aug 14, 2020
  7. sipa commented at 11:53 pm on August 14, 2020: member
    @amitiuttarwar I think it makes sense to make the addr-relay decision depend on the connection type. That should be the most authoritative information about the connection. Sure, it’s duplication, but if we accidentally forget to create the necessary data structures, test will fail; if we accidentally create them if they’re not needed, at worst we’ve wasted a bit of memory we’re already ok with for other nodes.
  8. MarcoFalke removed the label Tests on Aug 15, 2020
  9. in src/net.cpp:1844 in b672f8d84f outdated
    1840@@ -1841,11 +1841,11 @@ void CConnman::ThreadOpenConnections(const std::vector<std::string> connect)
    1841                 // but inbound and manual peers do not use our outbound slots. Inbound peers
    1842                 // also have the added issue that they could be attacker controlled and used
    1843                 // to prevent us from connecting to particular hosts if we used them here.
    1844-                switch(pnode->m_conn_type){
    1845+                switch (pnode->m_conn_type){
    


    MarcoFalke commented at 7:38 am on August 15, 2020:

    style-nit:

    0                switch (pnode->m_conn_type) {
    

    if you decide to do style-fixups, it would probably be good to do all of them in one go or not at all. Maybe with clang-format-diff? https://github.com/bitcoin/bitcoin/tree/master/contrib/devtools#clang-format-diffpy


    amitiuttarwar commented at 3:30 am on August 21, 2020:

    heh. thanks for the nudge. I’ve updated my tooling so I’ll pay better attention to the clang-format-diff moving forward. I think I’ve now cleaned everything up properly in this PR? I’m not sure what exactly you mean by “all in one go or not at all”…

    (disclaimer: I didn’t take the suggestions to not IndentCaseBlocks. I would much prefer if we turned that format option to true.)

  10. in src/net_processing.cpp:2436 in b672f8d84f outdated
    2432@@ -2433,9 +2433,8 @@ void ProcessMessage(
    2433         UpdatePreferredDownload(pfrom, State(pfrom.GetId()));
    2434         }
    2435 
    2436-        if (!pfrom.IsInboundConn() && pfrom.IsAddrRelayPeer())
    2437+        if (pfrom.AdvertiseAddressToConn())
    


    jnewbery commented at 10:16 am on August 15, 2020:

    If we’re cleaning this up, we should really clean it up and move the connman.MarkAddressGood() out of this if statement, since it’s not related to address advertising at all. Having it in the same conditional is a trap for developers to fall into.

    (specifically, a developer might think “we don’t need to advertise our address to FEELER connections, since we disconnect before we actually send the ADDR or GETADDR messages, so we can remove FEELER from AdvertiseAddressToConn”, but doing that breaks our FEELER connection logic, because MarkAddressGood() needs to be called to mark the address as good)

    cc @sdaftuar


    sdaftuar commented at 6:08 pm on August 15, 2020:

    I was looking at this more and I believe we should leave the MarkAddressGood call in the if-block, though it needs an explanation— the reason (I think) is that we don’t want block-relay connections to leave a trace in the addrman that could be leaked by an attacker sniffing our addr responses.

    So this just needs a comment explaining why it’s in here.


    jnewbery commented at 9:38 am on August 16, 2020:

    Can you explain this a bit? I think you’re saying that if we connect out to a peer that we use for block-relay-only we don’t want to mark it as good (essentially move it from New to Tried and update the last successful connect time) because then another peer might be able to tell that we’ve connected to them?

    My understanding of how we respond to getaddr messages is that we’ll respond with (up to) 1000 records taken at random from all records, regardless of whether they’re in New or Tried, so marking a peer as good won’t change whether we include it in addr responses.


    sdaftuar commented at 4:29 pm on August 16, 2020:

    I don’t have a great understanding of the details of how addrman works or how we interact with it – so possibly the fear was unfounded. It used to be the case that you could trivially download a listening node’s addrman though, and it also used to be the case (perhaps only long ago?) that we’d leak information about current connections based on time stamps stored in our addrman, so when I worked on #15759 I tried to leave the addrman state unchanged when making block-relay-only connections.

    I’m not sure that I fully succeeded in not leaking information, or that the concern is correct to begin with given how addrman already works. So it’s possible that with some analysis we could make improvements here. Maybe not appropriate for this pr though…


    jnewbery commented at 10:42 am on August 18, 2020:

    I believe there are three ways that a peer’s record is updated in addrman when we’re connected to them:

    1. when we receive a version message from them, we’ll mark their address good. That moves the address from the new table to the tried table, and updates the nLastSuccess/nLastTry/nAttempts fields. We only do this if the peer isn’t inbound and isn’t block-relay-only. I don’t believe this change can easily be detected by a peer sending us getaddr messages.
    2. if a peer self-advertises to us, we’ll call AddAddressKnown(), which could update the nTime, which is included in the address record that we send to a peer in response to a getaddr request.
    3. when we disconnect from the peer, we’ll update the nTime, as long as the peer was marked as fCurrentlyConnected. That happens when we receive a verack from a peer which is not inbound. This doesn’t include feeler peers, since we disconnect from them as soon as we receive a version message from them. The nTime is only updated during disconnection so spies can’t find out who we’re currently connected to (https://github.com/bitcoin/bitcoin/pull/5860#issuecomment-78397140).

    In summary, I think block-relay-only peers should have their address marked good when we connect to them (1). If we don’t, then we’ll update the nLastTry and nAttempts fields when we attempt to connect to them, but not update the nLastSuccess.

    The reason it’s important in this PR is that placing MarkAddressGood() behind a conditional if (AdvertiseAddressToConn()) is very confusing for anyone reading the code, and makes it easy to introduce subtle bugs.


    amitiuttarwar commented at 3:24 am on August 21, 2020:

    I agree that having MarkAddressGood in this conditional is surprising & that changing to AdvertiseAddressToConn augments the confusion.

    For this PR- I’ve updated the conditional to be !pfrom.IsInboundConn() && !pfrom.IsBlockOnlyConn() (aka removed the misleading function), and added a comment.

    For future work- I’ve noted this down. I’ve taken an initial look and @jnewbery I believe what you’re saying makes sense, but I’ll have to spend more time with it to fully convince myself. The fact that I missed this behavior in my initial pass affirms that its easy to miss or confuse. I’d like to either (1) understand why it should be in this conditional & comment or (2) understand why it shouldn’t be & change it. But I’m leaving this for another PR :)


    ariard commented at 11:32 pm on August 22, 2020:

    I agree with John that block-relay-only peers should have their address marked as good when we receive their version. To answer getaddr we fetch randomly a list of address from both vvNew/vvTried without record bias as far as a quick read of CAddrMan::GetAddr_ let it think. So moving address from a table to another doesn’t change the ability of an attacker to learn about a block-relay-only address.

    Maybe we could improve block-relay-only peers unobservability in future works around AddrMan by filtering out block-relay-only address from getaddr replies, without introducing an oracle for the original sender of the used address ? @naumenkogs


    sdaftuar commented at 5:56 pm on August 25, 2020:

    My instinct is that it’d be best if the addrman were completely unchanged as a result of block-relay connections, at least while we are connected to the peer. So if there are places where block-relay connections cause some kind of update in addrman, then those should be bugs that we fix.

    Here’s my thinking: (a) it’s not clear to me that there’s not some way to sniff out things in the tried table from things in the new table. For instance, I think the new table tends to be much bigger than the tried table, yet I think when we respond to getaddr requests, we sample 50% from the tried table and 50% from the new table. This creates a bias that, if you had the ability to read someone’s addrman, might leak information about who someone might be connected to.

    (b) Even if we have limited the ability to read someone’s addrman today (eg with fixes like #18991), I think it’s helpful to separate the policies around addrman from being too tightly interwoven into other features like block-relay peers. The easiest way to reason about this stuff is to say that block-relay peers have no effect on addrman, and then we know that any changes in addrman / addr relay can’t possibly leak information about our block-relay peers.

    The only potential hesitation I have advancing this point of view is if there are side effects from not updating addrman that might be detrimental. For instance, it does seem like it would be terrible if an existing block-relay peer that we are successfully peered with for some time could be evicted from the addrman altogether, and then we forget about it. But maybe it’s enough to either update addrman at the time of disconnection, or save the peer as is suggested in #17428?

    At any rate, this discussion over the best behavior for the interaction between block-relay peers and addrman deserves its own issue. I do agree with the current version of the proposed code which checks for the peer being a block-relay peer.


    naumenkogs commented at 12:20 pm on September 1, 2020:

    I think we’re taking the wrong approach in the PR as it is right now. At least that condition should be split in 2 blocks:

    I think the current code change here is not useful, but only distracts: I) The comment Advertise our address is removed for no reason II) The comment (unrelated to advertising our address) is very confusing. I managed to understand it only after reading this discussion III) The comment Moves address from New to Tried table in Addrman is not sufficient IV) The change of IsAddrRelayPeer to IsBlockOnlyConn here is very confusing, the two blocks should be split into 2 very separate blocks according to the following reasoning:

    1. We self-announce to every peer that supports self-announcement AND it’s a good idea: 1a) not useful to self-announce to feelers as we won’t actually send ADDR to them (i think?) 1b) bad to self-announce to block-relay-only (maybe? wrt privacy)
    2. We MarkAddressGood according to the logic we have

    naumenkogs commented at 12:23 pm on September 1, 2020:

    w.r.t block-relay-only peers and AddrMan, I agree it’s a separate issue, so i’d prefer to preserve the approach status quo here.

    The current PR currently satisfies it, but I’m still unhappy with the code changes in this particular area (see the previous comment).


    sdaftuar commented at 5:39 pm on September 1, 2020:

    @naumenkogs Would it be enough to just add some more comments to this block? I think it’s fine to leave as one block until we actually need to split it, so I’d propose something like this:

     0diff --git a/src/net_processing.cpp b/src/net_processing.cpp
     1index 7cd4f7c4cb5..4285e8ba898 100644
     2--- a/src/net_processing.cpp
     3+++ b/src/net_processing.cpp
     4@@ -2427,6 +2427,21 @@ void PeerLogicValidation::ProcessMessage(CNode& pfrom, const std::string& msg_ty
     5         }
     6 
     7         if (!pfrom.IsInboundConn() && !pfrom.IsBlockOnlyConn()) {
     8+            // For outbound peers, we try to relay our address (so that other
     9+            // nodes can try to find us more quickly, as we have no guarantee
    10+            // that an outbound peer is even aware of how to reach us) and do a
    11+            // one-time address fetch (to help populate/update our addrman). If
    12+            // we're starting up for the first time, our addrman may be pretty
    13+            // empty and no one will know who we are.
    14+            //
    15+            // We also will update the addrman to record connection success for
    16+            // these peers (which include OUTBOUND_FULL_RELAY peers and FEELER
    17+            // connections) so that addrman will have an up-to-date notion of
    18+            // which peers are online and available.
    19+            //
    20+            // We skip these operations for BLOCK_RELAY peers to avoid
    21+            // potentially leaking information about our BLOCK_RELAY
    22+            // connections via the addrman or address relay.
    23             if (fListen && !::ChainstateActive().IsInitialBlockDownload())
    24             {
    25                 CAddress addr = GetLocalAddress(&pfrom.addr, pfrom.GetLocalServices());
    26@@ -2446,8 +2461,8 @@ void PeerLogicValidation::ProcessMessage(CNode& pfrom, const std::string& msg_ty
    27             m_connman.PushMessage(&pfrom, CNetMsgMaker(nSendVersion).Make(NetMsgType::GETADDR));
    28             pfrom.fGetAddr = true;
    29 
    30-            // Moves address from New to Tried table in Addrman
    31-            // (unrelated to advertising our address)
    32+            // Moves address from New to Tried table in Addrman, resolves
    33+            // tried-table collisions, etc.
    34             m_connman.MarkAddressGood(pfrom.addr);
    35         }
    

    naumenkogs commented at 7:46 am on September 2, 2020:

    I’d suggest making If we're starting up for the first time, our addrman may be pretty empty and no one will know who we are. less ambiguous: If we're starting up for the first time, it would improve our peering chances. Otherwise, our addrman may be pretty empty, and also no one will know who we are.

    But generally, I think this comment:

    • makes it much more clear
    • actually convinces me that it’s an improvement over the existent code.

    amitiuttarwar commented at 0:08 am on September 3, 2020:

    incorporated these comments into the changes,

    I agree this should be further improved, but I’m hoping this is sufficient for this PR. @naumenkogs are you comfortable with the current state?

  11. jnewbery commented at 10:18 am on August 15, 2020: member

    Strong strong concept ACK.

    I think it’d be good to extract everything in the while (!interruptNet) in ThreadOpenConnections() into its own function since deeply nested while/for/if blocks obscure control flow and are very often the sources of bugs. Doing that might be scope creep for this PR though.

  12. in src/net.h:902 in b672f8d84f outdated
    898+        return m_conn_type != ConnectionType::BLOCK_RELAY;
    899+    }
    900+
    901+    /* Whether we attempt to advertise our own address to the peer when
    902+     * processing their VERSION message */
    903+    bool AdvertiseAddressToConn() const {
    


    jnewbery commented at 9:06 am on August 20, 2020:
    I really dislike these functions being inside CNode (same as #19316 (review)). This is very specific net_processing logic, rather than a general property of the CNode. Having the logic inside net.h rather than at the point it’s needed makes it less readable.

    amitiuttarwar commented at 3:16 am on August 21, 2020:
    yeah, I agree. this is why I initially opted for having m_conn_type as a public member var, but feedback on #19316 led me to switching to private. anyway, given the current state, its a bit unfortunate we don’t get the additional guarantee of a switch, but I agree with you & incorporated this feedback so it can be more understandable from the call site.
  13. in src/net.h:841 in b672f8d84f outdated
    837@@ -838,37 +838,89 @@ class CNode
    838         assert(false);
    839     }
    840 
    841+    /* These are the default connections that we use to connect with the
    


    jnewbery commented at 9:18 am on August 20, 2020:
    Format these as doxygen comments (start with /**) https://www.doxygen.nl/manual/docblocks.html

    jnewbery commented at 9:21 am on August 20, 2020:
    These are great comments. Thank you! Should they be merged with the comments above on the enum members?

    amitiuttarwar commented at 5:40 pm on August 20, 2020:

    yeah I felt very indecisive when trying to figure out the best spots for the comments. here what I’ve gone for is having the enum comments be a more concise description of the nature of the connection, and the function provide more context around design goals, capacity, etc.

    I don’t think the split is super clear though, so yeah, maybe it would be more helpful to consolidate it all into the enum member comments. is that what you’re advocating for? or just trying to weigh like me? ⚖️ 😛


    jnewbery commented at 6:22 pm on August 20, 2020:
    yes, consolidate them to one place. I think the enum is the right place for them.

    amitiuttarwar commented at 3:09 am on August 21, 2020:
    done!

    amitiuttarwar commented at 3:11 am on August 21, 2020:
    oof. so many options and yet I still missed the mark 😅 . moved the comments to the enum, but fixed this problem. thanks!
  14. jnewbery commented at 9:26 am on August 20, 2020: member

    ACK everything except the AdvertiseAddressToConn() change. I’d much rather leave the conditional as

    if (!pfrom.IsInboundConn() && !pfrom.IsBlockOnlyConn())

    and add a large comment to above the connman.MarkAddressGood(pfrom.addr); to say that this isn’t anything to do with sending an addr or getaddr message.

  15. DrahtBot commented at 8:24 pm on August 20, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #19860 (Improve diversification of new connections: privacy and stability by naumenkogs)
    • #19858 (Periodically make block-relay connections and sync headers by sdaftuar)
    • #19843 (Refactoring and minor improvement for self-advertisements by naumenkogs)
    • #19829 ([net processing] Move block inventory state to net_processing by jnewbery)
    • #19794 (p2p: Remove fGetAddr flag by mzumsande)
    • #19763 (net: don’t try to relay to the address’ originator by vasild)
    • #19725 ([RPC] Add connection type to getpeerinfo, improve logs by amitiuttarwar)
    • #17785 (p2p: Unify Send and Receive protocol versions by hebasto)
    • #17428 (p2p: Try to preserve outbound block-relay-only connections during restart by hebasto)
    • #17194 (p2p: Avoid forwarding ADDR messages to SPV nodes by naumenkogs)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  16. amitiuttarwar force-pushed on Aug 21, 2020
  17. amitiuttarwar commented at 3:40 am on August 21, 2020: contributor

    thanks for the feedback! all comments should be addressed. @sipa I agree with this:

    make the addr-relay decision depend on the connection type. That should be the most authoritative information about the connection.

    But unfortunately I don’t think this is true:

    but if we accidentally forget to create the necessary data structures, test will fail;

    I updated RelayAddrsWithConn to always return true, recompiled & ran test_runner.py & unit tests. Nothing failed. My concern would be addressed if it did :) I think with #19315, we should be able to add functional tests that would fail if a discrepancy occurs?

    Currently I think the downside is probably acceptable, but want to highlight since I don’t think we have any automatic protection in place. (unless I’m missing something?)


    @jnewbery, RE:

    I think it’d be good to extract everything in the while (!interruptNet) in ThreadOpenConnections() into its own function since deeply nested while/for/if blocks obscure control flow and are very often the sources of bugs. Doing that might be scope creep for this PR though.

    I 100% agree. I spent so many hours trying to interpret the exact logic around what types of connections are open, and have gotten very confused by the deeply nested logic flows & nuances of things like where a break / continue that’s 4 levels deep will pop you back out to. I’ve taken a first pass at extracting the outer while (!interruptNet) loop here (yes, there are two levels nested): https://github.com/amitiuttarwar/bitcoin/commit/eecc5877c08744c55200846bb7b3b70c922b75ec. However, it needs to be done with great care and I would need to spend a lot more time / build a lot more confidence with these changes before I’d feel comfortable proposing them. Feel free to take a look or further the branch if you’re interested :)

  18. jnewbery commented at 10:43 am on August 21, 2020: member

    utACK c99b26010eaf4d446eb5118e38dbcc03fabba11c

    have gotten very confused by the deeply nested logic flows & nuances of things like where a break / continue that’s 4 levels deep will pop you back out to.

    Yes, deeply nested if/while/switch statements hide bugs. If it’s possible to factor them out to clarify control flow, we should aim to do that.

    I’ve taken a first pass at extracting the outer while (!interruptNet) loop here

    I’ve had a very quick skim of that branch and it looks like the right direction to me.

    I would need to spend a lot more time / build a lot more confidence with these changes before I’d feel comfortable proposing them. Feel free to take a look or further the branch if you’re interested :)

    I think you’re very well positioned to open this PR if you want to. You’ve already spent hours getting to grips with the logic here :slightly_smiling_face: . I’ve already got my quota of refactor PRs open, but I can definitely commit to reviewing a PR if you open it. (Equally, we don’t need to fix this all at once. We can always leave this for another time).

  19. in src/net_processing.cpp:2021 in 5dba7ab241 outdated
    1981@@ -1982,7 +1982,7 @@ static void ProcessHeadersMessage(CNode& pfrom, CConnman& connman, ChainstateMan
    1982             }
    1983         }
    1984 
    1985-        if (!pfrom.fDisconnect && pfrom.IsOutboundOrBlockRelayConn() && nodestate->pindexBestKnownBlock != nullptr && pfrom.m_tx_relay != nullptr) {
    1986+        if (!pfrom.fDisconnect && pfrom.IsFullOutboundConn() && nodestate->pindexBestKnownBlock != nullptr) {
    


    ariard commented at 10:26 pm on August 22, 2020:

    EDIT: We may have already a bug here. I think comment “Note that block-relay-only peers are already implicitly protected so we only consider setting m_protect for the full-relay peers” is wrong as block-relay-only peers seems to not be protected from eviction by lagging chain logic and as such we should modify this check to scope them in setting m_protect==true (or either modify ConsiderEviction’s first check)

    AFAICT, we have 2 different logics to evict outbound peers : lagging-chain (ConsiderEviction) and (CheckForStaleTipAndEvictPeers). The former to sanitize out lazy/buggy peers who have never sent us a valid header and are always staying behind our tip, the latter triggered in case of stale tip due to block variance to seek a better chain by rotating our outbound peers if we already reach an outbound limit (EvictExtraOutboundPeers).

    With regards to stale-tip, outbound block-relay-only are protected based on m_tx_relay==nullptr : https://github.com/bitcoin/bitcoin/blob/197450f80868fe752c6107955e5da80704212b34/src/net_processing.cpp#L3977

    With regards to lagging-chain, outbound block-relay-only are protected based on m_protect==true : https://github.com/bitcoin/bitcoin/blob/197450f80868fe752c6107955e5da80704212b34/src/net_processing.cpp#L3909

    So we may evict them if due to block variance they fulfilled the lagging chain conditions, contrary to the intent of unconditionally conserve them once they announced to us a header.


    amitiuttarwar commented at 10:39 pm on August 26, 2020:

    ok, I’ve looked into this and am happy to discuss with you, but want to clarify we are on the same page that this PR doesn’t change the behavior.

    I agree that the comment initially seems inconsistent with the logic in ConsiderEviction, but I’m still trying to wrap my head around all the mechanisms (since the chainwork check is repeated as the first conditional in ConsiderEviction, is the protection here essential?). I have many observations & questions, but I don’t think this PR is the right place to have this conversation. are you willing to open an issue?


    ariard commented at 0:19 am on August 27, 2020:

    Yes as far as I understand this code it is neither introduced by this PR or #19316.

    (since the chainwork check is repeated as the first conditional in ConsiderEviction, is the protection here essential?).

    You mean the check pindexBestKnownBlock != nullptr ? But you have a conditional above and it won’t exclude block-relay-only with m_protect=false. I’m not sure about the exact semantic of m_protect but yes I should be able to write a test for this with #19315 to assert bug :)


    sdaftuar commented at 7:00 pm on August 31, 2020:
    Looks like a documentation oversight when I changed around how this code worked while working on #15759. See #15759#pullrequestreview-281092623. I think originally I was going to make block-relay-only peers immune from all eviction, and then changed it to only be immune to outbound peer rotation-based eviction.

    ariard commented at 9:48 pm on September 3, 2020:
    Tracked in #19863#issue-692406060, either to document block-relay-only eviction expectations or align protection scope among all our outbound peers.
  20. in src/net.cpp:1864 in 51f5181365 outdated
    1879-                continue;
    1880-            }
    1881+        // Determine what type of connection to open. Opening
    1882+        // OUTBOUND_FULL_RELAY connections gets the highest priority until we
    1883+        // meet our full-relay capacity. Then we open BLOCK_RELAY connection
    1884+        // until we hit our block-relay peer limit. GetTryNewOutboundPeer()
    


    ariard commented at 10:30 pm on August 22, 2020:
    block-relay-only to be consistent with mention of this connection type spread around the codebase.

    amitiuttarwar commented at 7:49 pm on August 27, 2020:
    done
  21. in src/net.cpp:1868 in 51f5181365 outdated
    1883+        // meet our full-relay capacity. Then we open BLOCK_RELAY connection
    1884+        // until we hit our block-relay peer limit. GetTryNewOutboundPeer()
    1885+        // gets set when a stale tip is detected, so we try opening an
    1886+        // additional OUTBOUND_FULL_RELAY connection. If none of these
    1887+        // conditions are met, check the nNextFeeler timer to decide if we
    1888+        // should open a feeler.
    


    ariard commented at 10:34 pm on August 22, 2020:
    s/feeler/FEELER/g

    amitiuttarwar commented at 7:50 pm on August 27, 2020:
    done
  22. in src/net.cpp:1881 in 51f5181365 outdated
    1894+        } else if (GetTryNewOutboundPeer()) {
    1895+            // OUTBOUND_FULL_RELAY
    1896+        } else if (nTime > nNextFeeler) {
    1897+            nNextFeeler = PoissonNextSend(nTime, FEELER_INTERVAL);
    1898+            conn_type = ConnectionType::FEELER;
    1899+            fFeeler = true;
    


    ariard commented at 10:43 pm on August 22, 2020:
    I think you can use now conn_type.IsFeelerConn() instead of fFeeler=true in the feeler-related following checks, semantically they are the same ?

    amitiuttarwar commented at 7:52 pm on August 27, 2020:
    IsFeelerConn() is a function on CNode, not ConnectionType, so I don’t think this is possible

    sdaftuar commented at 7:10 pm on August 31, 2020:
    It does seem a little weird to still have fFeeler as a variable after this change, but I don’t feel strongly.

    naumenkogs commented at 10:14 am on September 1, 2020:
    Instead of if (fFeeler), we can now further do if (conn_type == ConnectionType::FEELER), and drop the fFeeler variable. I would prefer this.

    naumenkogs commented at 10:19 am on September 1, 2020:
    Or, now that I see how many if (fFeeler) we have (a lot), I’m not so sure…

    amitiuttarwar commented at 8:52 pm on September 2, 2020:
    yup, I left it in essentially as a local alias. we use it 6 times in the function and it didn’t seem helpful to switch those all over to the more verbose conn_type == ConnectionType::FEELER
  23. in src/net.h:149 in a2d4fe3d2e outdated
    153+
    154+    /**
    155+     * Feeler connections are short lived connections used to increase the
    156+     * number of connectable addresses in our AddrMan. Approximately every
    157+     * FEELER_INTERVAL, we attempt to connect to a random address from the new
    158+     * table. If successful, we add it to the tried table.
    


    ariard commented at 10:47 pm on August 22, 2020:
    Only attempted after finishing to open other types of outbound connections, comment on order of connections was dropped from original location in ThreadOpenConnections

    ariard commented at 0:25 am on August 27, 2020:
    Actually it does also solve bucket conflict between 2 tried tables entries, see : https://github.com/bitcoin/bitcoin/pull/9037

    amitiuttarwar commented at 7:58 pm on August 27, 2020:

    I already added a comment in ThreadOpenConnections with the specifics of the ordering:

    0        // Determine what type of connection to open. Opening
    1        // OUTBOUND_FULL_RELAY connections gets the highest priority until we
    2        // meet our full-relay capacity. Then we open BLOCK_RELAY connection
    3        // until we hit our block-relay-only peer limit.
    4        // GetTryNewOutboundPeer() gets set when a stale tip is detected, so we
    5        // try opening an additional OUTBOUND_FULL_RELAY connection. If none of
    6        // these conditions are met, check the nNextFeeler timer to decide if
    7        // we should open a FEELER.
    

    and I believe that is the right place for documenting the order. I realized that the block-relay-only enum comment mentioned ordering so I’ve gone ahead and removed that. I think these comments are most valuable if they focus on the design goals and the high level description of the connection type & we look at the relevant code for more specifics. otherwise we also risk the comments falling out of date.


    Actually it does also solve bucket conflict between 2 tried tables entries, see : #9037

    what is this in response to? is there a suggested change?


    naumenkogs commented at 11:41 am on September 1, 2020:

    I think these comments are most valuable if they focus on the design goals and the high level description of the connection type & we look at the relevant code for more specifics.

    I agree

    what is this in response to? is there a suggested change?

    Antoine just mentions that there is another thing done when considering a feeler. See here. I think this part, if explained, also belongs there and not here.

  24. in src/net.h:161 in a2d4fe3d2e outdated
    162+    /**
    163+     * After meeting our quota for full outbound connections, we attempt to
    164+     * open MAX_BLOCK_RELAY_ONLY_CONNECTIONS to help prevent against partition
    165+     * attacks. By not relaying transactions or addresses, these connections
    166+     * are harder to detect by a third party, thus helping obfuscate the
    167+     * network topology. Addresses are selected from AddrMan.
    


    ariard commented at 10:57 pm on August 22, 2020:

    You can mark that addresses for {BLOCK_RELAY,OUTBOUND_FULL_RELAY} are drawn from tried table.

    Also block-relay network topology.


    amitiuttarwar commented at 7:59 pm on August 27, 2020:
    are they always? what happens if your tried table isn’t populated yet ?

    ariard commented at 9:51 pm on September 3, 2020:
    Ah sorry, in fact they are drawn from both new/tried tables at anytime. Surely being confused by SelectTriedCollision in net.cpp path!
  25. in src/net.h:163 in a2d4fe3d2e outdated
    167+     * network topology. Addresses are selected from AddrMan.
    168+     */
    169+    BLOCK_RELAY,
    170+
    171+    /**
    172+     * AddrFetch connections are short lived connections used to solicit
    


    ariard commented at 11:00 pm on August 22, 2020:
    For both {ADDR_FETCH, FEELER} you can specify the condition suspending the connection (either VERSION or ADDR reception).

    amitiuttarwar commented at 8:02 pm on August 27, 2020:
    feels a bit deep in the weeds for this patch, both have nuances of how disconnection is executed
  26. in src/net.h:155 in a2d4fe3d2e outdated
    160+    FEELER,
    161+
    162+    /**
    163+     * After meeting our quota for full outbound connections, we attempt to
    164+     * open MAX_BLOCK_RELAY_ONLY_CONNECTIONS to help prevent against partition
    165+     * attacks. By not relaying transactions or addresses, these connections
    


    ariard commented at 11:02 pm on August 22, 2020:
    transactions and addresses to be strict

    amitiuttarwar commented at 8:05 pm on August 27, 2020:
    hmm, seems like our minds are putting parenthesis in different places for this patch. I find “or” clearer. If you feel strongly I can try to rework the sentence. but otherwise, seems fairly trivial to me
  27. in src/net.h:156 in a2d4fe3d2e outdated
    161+
    162+    /**
    163+     * After meeting our quota for full outbound connections, we attempt to
    164+     * open MAX_BLOCK_RELAY_ONLY_CONNECTIONS to help prevent against partition
    165+     * attacks. By not relaying transactions or addresses, these connections
    166+     * are harder to detect by a third party, thus helping obfuscate the
    


    ariard commented at 11:08 pm on August 22, 2020:
    It’s unclear to me how do you qualify what is a “third party” in a P2P network, either another connected peer, a not-directly connected peer to local node or an infrastructure observer ? For the last one it doesn’t help at all. So maybe replace by any other peer, either directly paired with local node or not ?

    amitiuttarwar commented at 8:06 pm on August 27, 2020:
    if two nodes are connected, a third party is any other entity. I don’t really understand your suggestion- if a node is “directly paired” with another peer, how would it be possible to obfuscate that connection to the peer? of course the peer can know what connections they have open?

    ariard commented at 9:53 pm on September 3, 2020:

    I think the detection hardness assumption is function on how you define the third-party.

    I meaned “directed paired” as Alice being connected with Bob and Caroll. Caroll guessing the connection Alice-Bob thanks to some relay leaks (like txn or addr).

  28. in src/net_processing.cpp:2450 in 652e06117d outdated
    2451@@ -2454,6 +2452,9 @@ void ProcessMessage(
    2452             // Get recent addresses
    2453             connman.PushMessage(&pfrom, CNetMsgMaker(nSendVersion).Make(NetMsgType::GETADDR));
    2454             pfrom.fGetAddr = true;
    2455+
    2456+            // Moves address from New to Tried table in Addrman
    2457+            // (unrelated to advertising our address)
    


    ariard commented at 11:24 pm on August 22, 2020:
    I know this is likely temporary but comment may be made better like “Our address sanitization logic (feeler) is less prone to manipulation by outbound connections rather than potentially attacker controlled inbound. We don’t sanitize block-relay-only as it might be a leak of their presence”.

    amitiuttarwar commented at 8:09 pm on August 27, 2020:
    I feel strongly that we should not cement opinions into docs in the codebase. since there are divergent opinions on whether or not we should have this logic here, I’d like to avoid documentation that indicates it must be here.

    sdaftuar commented at 6:47 pm on August 31, 2020:
    nit: MarkAddressGood() has other side effects besides moving addresses from new to tried (such as resolving collisions in the tried table?) so maybe it’d be better to update the comment to indicate that’s not all this does, eg // Moves address from New to Tried table in Addrman, among other things)

    naumenkogs commented at 8:38 am on September 1, 2020:
    I don’t find this comment particularly helpful… Why MarkAddressGood is called here, why for outbounds and non-blocks-only?

    amitiuttarwar commented at 0:10 am on September 3, 2020:
    resolving this conversation just to close off threads, feel free to continue over here
  29. DrahtBot added the label Needs rebase on Aug 24, 2020
  30. jonatack commented at 1:03 pm on August 25, 2020: member
    Concept ACK on these changes. Late to the party but will try to review soon-ish.
  31. amitiuttarwar force-pushed on Aug 27, 2020
  32. amitiuttarwar commented at 8:10 pm on August 27, 2020: contributor

    thanks for the review @jnewbery & @ariard ! all review comments are addressed

    rebased & made some small changes to docs

  33. DrahtBot removed the label Needs rebase on Aug 27, 2020
  34. in src/net_processing.cpp:846 in 8efe70ca9c outdated
    842@@ -843,7 +843,7 @@ void PeerLogicValidation::InitializeNode(CNode *pnode) {
    843         LOCK(cs_main);
    844         mapNodeState.emplace_hint(mapNodeState.end(), std::piecewise_construct, std::forward_as_tuple(nodeid), std::forward_as_tuple(addr, std::move(addrName), pnode->IsInboundConn(), pnode->IsManualConn()));
    845     }
    846-    if(!pnode->IsInboundConn())
    847+    if (!pnode->IsInboundConn())
    


    sdaftuar commented at 6:48 pm on August 31, 2020:
    nit: If you’re including small style changes anyway, this could use curly braces.

    amitiuttarwar commented at 11:51 pm on September 2, 2020:
    done
  35. sdaftuar approved
  36. sdaftuar commented at 7:13 pm on August 31, 2020: member
    utACK, just a couple nits but nothing blocking.
  37. in src/net.h:907 in f26b0ddac5 outdated
    860@@ -861,6 +861,12 @@ class CNode
    861         return m_conn_type == ConnectionType::INBOUND;
    862     }
    863 
    864+    /* Whether we send addr messages over this connection */
    865+    bool RelayAddrsWithConn() const
    


    naumenkogs commented at 8:13 am on September 1, 2020:
    nit: not a big fan of this name… I actually think IsAddrRelayPeer is much better :) I already saw some discussion around the distinction between “node” and “peer”, and now you start referring to it as a “connection”.

    naumenkogs commented at 8:22 am on September 1, 2020:

    Also, this method should be now used here: https://github.com/bitcoin/bitcoin/blob/4c66cc0a5a99cbad51448e51d6b28e11e5232e33/src/net.cpp#L2772

    The motivation is that we don’t want changing the addr condition in one place, and forgetting to change in the other place.


    naumenkogs commented at 8:29 am on September 1, 2020:

    I think the new code should be:

    0    if (conn_type_in != ConnectionType::BLOCK_RELAY) {
    1        m_tx_relay = MakeUnique<TxRelay>();
    2    }
    3    if (IsAddrRelayPeer()) {
    4        m_addr_known = MakeUnique<CRollingBloomFilter>(5000, 0.001);
    5    }
    

    amitiuttarwar commented at 11:55 pm on September 2, 2020:

    ah nice. I like it! I was also concerned about the duplicated logic making room for future discrepancy. I’ve updated to include this, thanks!

    RE name: seems like an issue with the whole convention around connection types? The name I’ve chosen is consistent with all the m_conn_type functions. Its a little late to change all that so I’m going to resolve this comment.

  38. in src/net_processing.cpp:2463 in f26b0ddac5 outdated
    2425@@ -2426,9 +2426,7 @@ void PeerLogicValidation::ProcessMessage(CNode& pfrom, const std::string& msg_ty
    2426         UpdatePreferredDownload(pfrom, State(pfrom.GetId()));
    2427         }
    2428 
    2429-        if (!pfrom.IsInboundConn() && pfrom.IsAddrRelayPeer())
    2430-        {
    2431-            // Advertise our address
    


    naumenkogs commented at 8:31 am on September 1, 2020:
    nit: Why removing the comment? :)

    naumenkogs commented at 12:11 pm on September 1, 2020:
    Sort of justified here, but i’m not convinced it’s the right approach. See my comment there.

    amitiuttarwar commented at 11:59 pm on September 2, 2020:

    the comment removal was leftover from a previous iteration of this code where I updated the conditional to call a function AdvertiseAddressToConn or something that made the comment feel redundant.

    latest push includes the more in depth commenting proposed in that thread, will continue convo there

  39. in src/net_processing.cpp:2462 in f26b0ddac5 outdated
    2425@@ -2426,9 +2426,7 @@ void PeerLogicValidation::ProcessMessage(CNode& pfrom, const std::string& msg_ty
    2426         UpdatePreferredDownload(pfrom, State(pfrom.GetId()));
    2427         }
    2428 
    2429-        if (!pfrom.IsInboundConn() && pfrom.IsAddrRelayPeer())
    2430-        {
    2431-            // Advertise our address
    2432+        if (!pfrom.IsInboundConn() && !pfrom.IsBlockOnlyConn()) {
    


    naumenkogs commented at 8:32 am on September 1, 2020:
    This is technically not a behavior change, but I don’t understand the logic behind replacing IsAddrRelayPeer with !IsBlockOnlyConn(). Why not replacing it with RelayAddrsWithConn?

    naumenkogs commented at 12:10 pm on September 1, 2020:
    I see it is justified (sort of?) here.

    sdaftuar commented at 5:09 pm on September 1, 2020:

    The skipping of this logic for block-relay-only outbound peers was introduced in #15759 – gating it on whether the address-relay data structures are instantiated is confusing because the very last thing we do is unrelated to whether those data structures exist.

    Instead, what this block has in common is that we only want to skip it for inbound or block-relay only peers, which is what this patch does. So I think this is right, and while we could split this block into two parts, one gated on whether the peer has the addr-relay data structures, and another gated on whether the peers is an outbound block-relay only connection, that seems pretty unnecessary to me. If someone adds another connection type that is outbound and non-block-relay only and only needs part of this, it can be refactored then.

  40. jnewbery commented at 8:44 am on September 1, 2020: member

    reACK 4c66cc0a5a

    It looks like all of the remaining review comments are on style/documentation. Perhaps we should aim to merge this and address those in a follow-up?

  41. in src/net.cpp:1859 in 1d89fe4ad4 outdated
    1855@@ -1856,28 +1856,32 @@ void CConnman::ThreadOpenConnections(const std::vector<std::string> connect)
    1856             }
    1857         }
    1858 
    1859-        // Feeler Connections
    


    naumenkogs commented at 8:56 am on September 1, 2020:
    Can this probably migrate to the enum?

    naumenkogs commented at 11:55 am on September 1, 2020:
    Partially done in 630b9a5e1872ac91949b1e30f8cd7f28d1c946ab, but some of the details are lost.

    amitiuttarwar commented at 0:02 am on September 3, 2020:
    what details are lost? the only thing not included is “Start attempting feeler connections only after node finishes making outbound connections,” but the ordering of connections is captured in ThreadOpenConnections. this seems like the same point as addressed over in this thread where you’ve agreed with me??
  42. in src/net.h:130 in 630b9a5e18 outdated
    131+     */
    132+    INBOUND,
    133+
    134+    /**
    135+     * These are the default connections that we use to connect with the
    136+     * network. There is no restriction on the inventory relayed- by default we
    


    naumenkogs commented at 11:34 am on September 1, 2020:
    nit: mix of terminology, we normally don’t refer to addr as “inventory”. Let’s keep it that way :) Probably use something “types of stuff” or whatever else

    amitiuttarwar commented at 0:03 am on September 3, 2020:
    oh, didn’t realize. fixed
  43. naumenkogs commented at 12:34 pm on September 1, 2020: member

    It looks like all of the remaining review comments are on style/documentation. Perhaps we should aim to merge this and address those in a follow-up?

    Most of the changes in this commit are style/documentation. It’s not like it’s a big logic-changing PR with a lot of ACKs will have to invalidate :)

    But actually, I’d be fine with ACKing this one as long as I get some feedback (code change, or at least a comment) on these two comments of mine I consider substantial: first, second.

  44. sdaftuar commented at 1:13 pm on September 2, 2020: member

    I think there may be a silent merge conflict here, as a result of merging #19067 a few days ago (after the CI tests had run on this PR). That PR introduces one new place where the OUTBOUND enum is used.

    (Noticed this because of the failing CI run in #19858, built on top of this one.)

  45. scripted-diff: Rename `OUTBOUND` ConnectionType to `OUTBOUND_FULL_RELAY`
    -BEGIN VERIFY SCRIPT-
    sed -i 's/OUTBOUND, /OUTBOUND_FULL_RELAY, /g' src/net.h
    sed -i 's/ConnectionType::OUTBOUND/ConnectionType::OUTBOUND_FULL_RELAY/g' src/test/net_tests.cpp src/test/fuzz/process_message.cpp src/test/fuzz/process_messages.cpp src/net.cpp src/test/denialofservice_tests.cpp src/net.h src/test/fuzz/net.cpp
    -END VERIFY SCRIPT-
    8d6ff46f55
  46. [net] Remove unnecessary default args on OpenNetworkConnection a6ab1e81f9
  47. amitiuttarwar force-pushed on Sep 2, 2020
  48. [refactor] Restructure logic to check for addr relay.
    We previously identified if we relay addresses to the connection by checking
    for the existence of the m_addr_known data structure. With this commit, we
    answer this question based on the connection type.
    
    IsAddrRelayPeer() checked for the existence of the m_addr_known
    dff16b184b
  49. [doc] Explain address handling logic in process messages
    Co-authored-by: Suhas Daftuar <sdaftuar@gmail.com>
    ff6b9081ad
  50. [trivial] Small style updates 1d74fc7df6
  51. [test] Add explicit tests that connection types get set correctly da3a0be61b
  52. [refactor] Simplify check for block-relay-only connection.
    Previously we deduced it was a block-relay-only based on presence of the
    m_tx_relay structure. Now we have the ability to identify it directly via a
    connection type accessor function.
    1e563aed78
  53. [refactor] Simplify connection type logic in ThreadOpenConnections
    Consolidate the logic to determine connection type into one conditional to
    clarify how they are chosen.
    4829b6fcc6
  54. [doc] Describe connection types in more depth. d5a57cef62
  55. [doc] Follow developer notes, add comment about missing default. eb1c5d090f
  56. amitiuttarwar force-pushed on Sep 3, 2020
  57. amitiuttarwar commented at 0:43 am on September 3, 2020: contributor

    thank you for the reviews!

    good catch on the silent merge conflict @sdaftuar, I’ve rebased to address. (turns out, there was more than one)

    I’ve addressed all review comments. They were getting pretty web-like, so I’ve tried my best to resolve and focus on any outstanding conversations.

    It looks like all of the remaining review comments are on style/documentation. Perhaps we should aim to merge this and address those in a follow-up?

    Most of the changes in this commit are style/documentation. It’s not like it’s a big logic-changing PR with a lot of ACKs will have to invalidate :)

    I’d like to get this PR merged sooner rather than later for a couple reasons:

    • I’d like to get the renames on master to avoid possibility of more silent merge conflicts
    • there are now two PRs that build on this (#19858 & #19860). further improvements can be addressed in parallel rather than holding up the whole sequence.
    • I think its important not to discourage review / lose momentum. @jnewbery has ACKed the tip twice already & I’ve had to invalidate because of needing rebase. with 10 commits deep, it doesn’t seem trivial to reACK, so I’d rather use that review energy on smaller isolated changes that can be handled in another PR.

    just trying to express that if I do get ACKs, I’m going to try to be cautious about invalidating, so might opt for starting another branch/PR if more non-essential changes are requested.

  58. amitiuttarwar commented at 2:24 am on September 3, 2020: contributor
    there’s one failing build, but I’m confused by it. nothing in the logs appears to have failed?? https://travis-ci.org/github/bitcoin/bitcoin/jobs/723629947
  59. RandyMcMillan commented at 2:36 am on September 3, 2020: contributor
    @amitiuttarwar - This PR is based on parent commits that aren’t passing - the fail may have nothing to do with your change. I recommend rebasing your commit on a recent passing commit.
  60. naumenkogs commented at 7:34 am on September 3, 2020: member
    ACK eb1c5d090f4ef844dabc1e8bf3864c1afe1c791c
  61. laanwj commented at 11:27 am on September 3, 2020: member
    Code review ACK eb1c5d090f4ef844dabc1e8bf3864c1afe1c791c
  62. laanwj merged this on Sep 3, 2020
  63. laanwj closed this on Sep 3, 2020

  64. ariard commented at 10:04 pm on September 3, 2020: member
    post-merge Code Review ACK eb1c5d0
  65. Fabcien referenced this in commit 7ef27e505c on Jul 13, 2021
  66. Fabcien referenced this in commit 86e28fbbbf on Jul 13, 2021
  67. Fabcien referenced this in commit 6ad16ebd6d on Jul 13, 2021
  68. Fabcien referenced this in commit 6166450c9c on Jul 13, 2021
  69. Fabcien referenced this in commit 54e18211fb on Jul 13, 2021
  70. Fabcien referenced this in commit 74e38fe222 on Jul 13, 2021
  71. Fabcien referenced this in commit 45b19beb72 on Jul 13, 2021
  72. Fabcien referenced this in commit a508b5b670 on Jul 13, 2021
  73. deadalnix referenced this in commit f4f912e5ca on Jul 13, 2021
  74. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me