Connection slot exhaustion DoS mitigation #6374

pull pstratem wants to merge 15 commits into bitcoin:master from pstratem:connlimit-fix changing 4 files +220 −99
  1. pstratem commented at 2:13 am on July 4, 2015: contributor

    Connection slots are a limited resource which can be the target of DoS attacks.

    This issue was introduced in 2011 by 5a3e82f.

    In mitigating this issue it is important to take steps to avoid network partitioning.

    I have taken the approach of protecting connections with various properties from eviction.

    Of the nodes still available for eviction the most recently connected node from the CNetAddr with the most connections is selected and evicted.

    The largest class of protected connections is those which have been connected for the longest time. The intent is to maintain the strong bias towards these connections which exists today and thus avoid any risk of network partition.

  2. laanwj added the label P2P on Jul 5, 2015
  3. pstratem commented at 7:46 am on July 5, 2015: contributor

    This has been tested by setting up a node and then connected to it from the same source ip in a loop.

    Additional testing is needed around multiple source ips.

  4. laanwj commented at 7:55 am on July 5, 2015: member

    Concept ACK

    This does need extensive testing in various scenarios - e.g. what happens with Tor hidden service connections, which all appear to come from one IP (localhost) address.

  5. pstratem commented at 8:47 am on July 5, 2015: contributor

    Tested with multiple inbound connections from 128+ source ips.

    Long lived connections were stable and the newer connections dropped.

  6. petertodd commented at 10:39 pm on July 6, 2015: contributor

    @pstratem How did you actually test that?

    We could make good use of automated scripts to make such testing relocatable.

  7. pstratem commented at 11:28 pm on July 6, 2015: contributor

    @petertodd python script that connects to the node in a loop running on top of torify and me sitting there hitting “new identity” in vidalia a bunch until i had unique ips connecting…

    not exactly an automated process

  8. pstratem commented at 9:55 pm on July 12, 2015: contributor

    @laanwj Missed the second part of your comment.

    This wont ever evict localhost connections, so inbound connections to a hidden service wont ever be disconnected by this.

    That’s actually not optimal, but unfortunately getting info on inbound hidden service connections requires interfacing with tors control port.

    That’s definitely out of scope for this patch set.

  9. laanwj commented at 9:16 am on July 28, 2015: member

    @pstratem Absolutely - I wasn’t implying that you’d have to interact with Tor’s control port in this pull, just that it’s a requirement that it didn’t make the current situation worse.

    Needs a trivial rebase in net.cpp due to #5288.

  10. in src/net.cpp: in 5f33e1ec5d outdated
    1073-                {
    1074-                    LogPrintf("connection from %s dropped (banned)\n", addr.ToString());
    1075+
    1076+                if (pnode->fDisconnect) {
    1077                     CloseSocket(hSocket);
    1078+                    delete pnode;
    


    laanwj commented at 4:36 pm on August 3, 2015:
    Deleting CNodes happens in the “Delete disconnected nodes” loop, which first makes sure that no one is using the node anymore. Adding a delete pnode here seems like a danger for race conditions?

    pstratem commented at 9:52 pm on August 3, 2015:

    This is deleting the CNode which was created at 984 and which was not added to vNodes.

    Not deleting it here would be a memory leak.


    laanwj commented at 7:09 am on August 7, 2015:
    Bah. I wonder if we could use e.g. boost::scoped_ptr or auto_ptr to avoid this. It’s too easy to get memory leaks with manual deallocation along a subset of code paths, especially when factoring in exceptions.

    pstratem commented at 9:39 pm on August 7, 2015:
    Alternatively I could just add it to vNodes and rely on the normal cleanup logic.
  11. TheBlueMatt commented at 11:08 am on August 13, 2015: member
    Spoke with @pstratem about this for a bit and we agreed this should be refactored so that the connection acceptance stuff is in a separate function. Also after this change the nWhiteConnections variable is set in init but goes unused.
  12. pstratem commented at 0:35 am on August 14, 2015: contributor
    @TheBlueMatt I’ve simply removed the whiteconnections option in favor of protecting any inbound whitelisted connection. This is safe as outbound connections are never disconnected.
  13. in src/net.cpp: in 1f1c87438d outdated
    854+    unsigned int nMostConnections = 0;
    855+    std::map<CNetAddr, std::vector<CNode*> > mapAddrCounts;
    856+    BOOST_FOREACH(CNode *node, vEvictionCandidates) {
    857+        mapAddrCounts[node->addr].push_back(node);
    858+
    859+        if (mapAddrCounts[node->addr].size() > nMostConnections) {
    


    Diapolo commented at 5:53 am on August 14, 2015:
    size_t to unsigned int is safe? You could use size_t for nMostConnections also.

    pstratem commented at 4:01 pm on August 14, 2015:
    This is almost certainly safe because the values are heavily constrained, however it’s not an issue to switch to size_t so I will

    dcousens commented at 11:28 am on August 15, 2015:
    @Diapolo aren’t there warnings for this?

    pstratem commented at 11:35 pm on August 15, 2015:
    @dcousens they’re both unsigned, so no
  14. in src/net.cpp: in 1f1c87438d outdated
    833+                continue;
    834+            vEvictionCandidates.push_back(node);
    835+        }
    836+    }
    837+
    838+    // Protect connections with certain characteristics
    


    Diapolo commented at 5:54 am on August 14, 2015:
    This should be a more detailed comment IMHO.

    TheBlueMatt commented at 11:35 pm on August 20, 2015:
    Yes, please add a comment above each sort/erase block that explains what/why you are protecting that group.

    pstratem commented at 11:50 pm on August 20, 2015:
    Comments added
  15. in src/net.cpp: in 1f1c87438d outdated
    888+
    889+    if (hSocket != INVALID_SOCKET)
    890+        if (!addr.SetSockAddr((const struct sockaddr*)&sockaddr))
    891+            LogPrintf("Warning: Unknown socket family\n");
    892+
    893+    bool whitelisted = hListenSocket.whitelisted || CNode::IsWhitelistedRange(addr);
    


    Diapolo commented at 5:56 am on August 14, 2015:
    You seem to use this bool first at line 916, why not move the check there? At first look I had the impression it belongs to the LOCK block.

    dcousens commented at 9:13 am on August 14, 2015:

    I also had this impression, though that is a common theme with several of the variables (nInbound, nMaxInbound). It’d be nice to see them declared where they are used, rather than C99 style at the top of the function.

    edit: Though to be fair, this code is simply copied, so this is not @pstratem’s fault.


    pstratem commented at 4:01 pm on August 14, 2015:

    This is all copied directly from what was there before.

    I can clean up the logic a bit more later, but for now it’s nice that the diff is minimal.


    dcousens commented at 11:24 am on August 15, 2015:
    Agreed on the clean diff, @pstratem could you put a TODO in there (or make an issue?) though such that this isn’t just forgotten after this though?
  16. in src/net.cpp: in 1f1c87438d outdated
    896+        BOOST_FOREACH(CNode* pnode, vNodes)
    897+            if (pnode->fInbound)
    898+                nInbound++;
    899+    }
    900+
    901+    if (hSocket == INVALID_SOCKET)
    


    Diapolo commented at 5:58 am on August 14, 2015:
    Is this the check for the accept call? Would be cleaner if it was below the call. You also wouldn’t need the hSocket != INVALID_SOCKET check at the top.

    pstratem commented at 4:02 pm on August 14, 2015:
    I agree, but I think that’s for a separate pull request.

    dcousens commented at 11:28 am on August 15, 2015:
    Do we file these as issues or TODOs?

    pstratem commented at 9:13 pm on August 15, 2015:
    I would say file them as a single issue. something like “clean up logic/code in AcceptConnection”
  17. in src/main.cpp: in 1f1c87438d outdated
    4521@@ -4522,6 +4522,7 @@ bool static ProcessMessage(CNode* pfrom, string strCommand, CDataStream& vRecv,
    4522                     if (pingUsecTime > 0) {
    4523                         // Successful ping time measurement, replace previous
    4524                         pfrom->nPingUsecTime = pingUsecTime;
    4525+                        pfrom->nMinPingUsecTime = std::min(pfrom->nMinPingUsecTime, pingUsecTime);
    


    Diapolo commented at 6:00 am on August 14, 2015:
    Is this interresting for getpeerinfo?

    pstratem commented at 4:02 pm on August 14, 2015:
    Probably, but again separate pr
  18. dcousens commented at 9:15 am on August 14, 2015: contributor
    Concept ACK
  19. laanwj commented at 2:36 pm on August 17, 2015: member
    I think i’m missing a thing: how does this change make whiteconnections redundant? There is no information in the commit message.
  20. pstratem commented at 10:56 pm on August 19, 2015: contributor
    @laanwj All whitelisted connections are protected. Which makes a parameter for protecting a specific number of whitelisted connections redundant.
  21. Refactor: AcceptConnection 541a1dd9e6
  22. Refactor: Bail early in AcceptConnection 1ef4817614
  23. Refactor: Move failure conditions to the top of AcceptConnection ae037b707c
  24. Record nMinPingUsecTime 4bac601610
  25. AttemptToEvictConnection 2c701537c8
  26. Prefer to disconnect peers in favor of whitelisted peers b105ba398b
  27. Remove redundant whiteconnections option a8f6e45249
  28. Add comments to AttemptToEvictConnection df23937422
  29. RAII wrapper for CNode* 1317cd1928
  30. Better support for nodes with non-standard nMaxConnections 17f3533c84
  31. Return false early if vEvictionCandidates is empty dc81dd02a1
  32. in src/net.cpp: in 1f1c87438d outdated
    829+                continue;
    830+            if (node->fDisconnect)
    831+                continue;
    832+            if (node->addr.IsLocal())
    833+                continue;
    834+            vEvictionCandidates.push_back(node);
    


    TheBlueMatt commented at 11:13 pm on August 20, 2015:
    Dont you need to increment the use count on the node here?
  33. in src/net.cpp: in 1f1c87438d outdated
    836+    }
    837+
    838+    // Protect connections with certain characteristics
    839+    static CompareNetGroupKeyed comparerNetGroupKeyed;
    840+    std::sort(vEvictionCandidates.begin(), vEvictionCandidates.end(), comparerNetGroupKeyed);
    841+    vEvictionCandidates.erase(vEvictionCandidates.end() - std::min(4, static_cast<int>(vEvictionCandidates.size())), vEvictionCandidates.end());
    


    TheBlueMatt commented at 11:14 pm on August 20, 2015:
    These constants (4, 8, 64) really need to be a function of your configured maximum connections.
  34. in src/net.cpp: in d6b18c5bab outdated
    860+    vEvictionCandidates.erase(vEvictionCandidates.end() - std::min(8, static_cast<int>(vEvictionCandidates.size())), vEvictionCandidates.end());
    861+
    862+    // Protect the 64 nodes which have been connected the longest.
    863+    // This replicates the existing implicit behavior.
    864+    std::sort(vEvictionCandidates.begin(), vEvictionCandidates.end(), ReverseCompareNodeTimeConnected);
    865+    vEvictionCandidates.erase(vEvictionCandidates.end() - std::min(64, static_cast<int>(vEvictionCandidates.size())), vEvictionCandidates.end());
    


    TheBlueMatt commented at 6:30 pm on August 21, 2015:
    Seems my comment was lost in a rebase, can you change the constants (4, 8, 64) to be some multiple of max connections?

    pstratem commented at 10:47 pm on August 21, 2015:

    Hmm not sure that makes sense actually.

    The goal is to prevent a sybil attack, constants are probably best for that.

  35. TheBlueMatt commented at 0:56 am on August 24, 2015: member
    utACK (may want to squash a few of the last commits, but doesnt matter).
  36. laanwj commented at 0:41 am on August 25, 2015: member
    @pstratem Fair enough.
  37. CNodeRef copy constructor and assignment operator 69ee1aab00
  38. in src/net.cpp: in dc81dd02a1 outdated
    779+public:
    780+    CNodeRef(CNode *pnode) : _pnode(pnode)  {_pnode->AddRef();}
    781+    ~CNodeRef() {_pnode->Release();}
    782+
    783+    CNode& operator *() const {return *_pnode;};
    784+    CNode* operator ->() const {return _pnode;};
    


    laanwj commented at 2:08 am on August 25, 2015:

    This class needs an explicit copy constructor and assignment operator to be safe to use inside STL containers:

     0    CNodeRef& operator =(const CNodeRef& other)
     1    {
     2        if (this != &other) {
     3            _pnode->Release();
     4            _pnode = other._pnode;
     5            _pnode->AddRef();
     6        }
     7        return *this;
     8    }
     9    CNodeRef(const CNodeRef& other):
    10        _pnode(other._pnode)
    11    {
    12        _pnode->AddRef();
    13    }
    

    Otherwise, copying it won’t properly update reference counts.

  39. pstratem commented at 10:39 pm on August 25, 2015: contributor
    @sipa you’re commenting on a commit from ~13 days ago, most of those issues have been fixed since
  40. sipa commented at 11:13 pm on August 25, 2015: member
    @pstratem I’m going through the commits one by one.
  41. pstratem commented at 11:15 pm on August 25, 2015: contributor
    @sipa ah ok then
  42. sipa commented at 11:24 pm on August 25, 2015: member

    Concept ACK. I think that calling AddRef/Release without holding cs_vNodes should not be done.

    I think the biases can be improved still - for example by computing a score per node based on ping time, and then penalizing the scores of nodes from the same netgroup if there are multiple. But that can be done later.

  43. Acquire cs_vNodes before changing refrence counts fed30940ef
  44. Fix comment 000c18aace
  45. pstratem commented at 0:08 am on August 26, 2015: contributor
    @sipa this is definitely intended mostly as a framework, the initial rules are an improvement but certainly not intended to be the final set of rules.
  46. Use network group instead of CNetAddr in final pass to select node to disconnect 027de94e1f
  47. laanwj commented at 4:23 pm on September 3, 2015: member
    utACK
  48. laanwj merged this on Sep 3, 2015
  49. laanwj closed this on Sep 3, 2015

  50. laanwj referenced this in commit 69dc5b51a0 on Sep 3, 2015
  51. str4d referenced this in commit 0a1ac9e3ce on Jul 13, 2017
  52. str4d referenced this in commit e3564bae93 on Nov 9, 2017
  53. str4d referenced this in commit d7f224b2be on Apr 5, 2018
  54. random-zebra referenced this in commit 8bbc0650e6 on Jul 1, 2020
  55. zkbot referenced this in commit 89d9e557e1 on Feb 17, 2021
  56. str4d referenced this in commit 39dddced8f on Feb 18, 2021
  57. str4d referenced this in commit d3a2f120f5 on Feb 18, 2021
  58. zkbot referenced this in commit e10008da66 on Feb 18, 2021
  59. zkbot referenced this in commit 777deea264 on Feb 19, 2021
  60. zkbot referenced this in commit b62e35dee8 on Feb 19, 2021
  61. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me