addrman: treat Tor/I2P/CJDNS as a single group #22563

pull vasild wants to merge 1 commits into bitcoin:master from vasild:addrman_per_group_bucketing changing 1 files +5 −0
  1. vasild commented at 2:43 pm on July 27, 2021: contributor
    Sub-netting does not exist in Tor/I2P/CJDNS. Obtaining such addresses is easier compared to obtaining IP addresses. Also, obtaining Tor/I2P/CJDNS addresses that do not have a common prefix is not harder than obtaining ones with common prefix.
  2. vasild commented at 2:51 pm on July 27, 2021: contributor

    This is somewhat hackish, but a very small change, in case it is desired for 22.0.

    The underlying problem is that CNetAddr::GetGroup() (edit: this is now NetGroupManager::GetGroup()) and the places where it is called are build around the assumption that it is easier to obtain addresses with common prefix (e.g. belonging to one IP subnet). However this assumption is not true for Tor, I2P and CJDNS addresses.

    For example, we would avoid connecting to aabcd.onion if we already have connection to aaxyz.onion, assuming that the common prefix makes it more likely they belong to the same actor.

    ~CNetAddr::GetGroup()~ NetGroupManager::GetGroup only makes sense for the IPv4 and IPv6 networks. To resolve this properly a bigger refactor would be required - where we call GetGroup() only for IPv4 and IPv6 addresses and treat Tor/I2P/CJDNS separately.

  3. mzumsande commented at 3:31 pm on July 27, 2021: contributor

    The underlying problem is that CNetAddr::GetGroup() and the places where it is called are build around the assumption that it is easier to obtain addresses with common prefix (e.g. belonging to one IP subnet).

    Doesn’t CNetAddr::GetGroup() in netaddress.cpp already have special casing for Tor/I2P/CJDNS (nBits = 4 , resulting in num_bytes == 0 ) that would make this unnecessary (because it already treats aabcd.onion the same as aaxyz.onion and every other Tor address)?

  4. vasild commented at 3:41 pm on July 27, 2021: contributor

    I don’t think so - currently CNetAddr::GetGroup() treats aabcd.onion and aaxyz.onion the same, but not the same as every other Tor address. The way I read it, CNetAddr::GetGroup() will return a vector of 2 bytes for a Tor address: the first byte will be 0x03 (NET_ONION), set here:

    https://github.com/bitcoin/bitcoin/blob/979f410e69a0350da8bf67f329f760d4dd3a4f44/src/netaddress.cpp#L781

    the second byte is set here:

    https://github.com/bitcoin/bitcoin/blob/979f410e69a0350da8bf67f329f760d4dd3a4f44/src/netaddress.cpp#L814

    and will consist of the first 4 bits of the Tor address, followed by 4 1-bits.

  5. mzumsande commented at 4:02 pm on July 27, 2021: contributor
    You are right, I missed the spot where the second byte is set.
  6. DrahtBot added the label P2P on Jul 27, 2021
  7. vasild commented at 4:15 pm on July 27, 2021: contributor
    Your eyes, on their own, decided to skip the line that contains ] | ((1 << (8 - nBits)) - 1)); (mine did) :exploding_head: :rofl:
  8. mzumsande commented at 8:23 pm on July 27, 2021: contributor

    Yes, I wonder why :grinning:

    Another question (still not sure I have completely understood the bucketing…): Wouldn’t this mean that addresses advertised to us by peers with a Tor address would always map to the same bucket/position, regardless of the source? And, as a result:

    1. if we only have Tor peers, we would only ever reach a multiplicity of one in the New tables, so that the nRefCount mechanism to give addrs advertised to us by many diverse peers more weight in New would no longer work in this situation?
    2. We’d only ever use 64 of the 1024 buckets for addrs received from Tor addresses
  9. vasild commented at 9:10 am on July 28, 2021: contributor

    The relevant code is here.

    (talking about the “new” buckets)

    Correct. This PR makes all Tor sources be treated as a single source, so the same address received from different Tor sources will end up in the same bucket/position. Overall, all addresses received by Tor sources will be dispersed amongst 64 ADDRMAN_NEW_BUCKETS_PER_SOURCE_GROUP buckets.

    Notice that this is already the case for Tor peers that have connected to us (incoming from our point of view) because all of them look to us as coming from 127.0.0.1, so all of them belong to a single group. This plays nicely because an attacker looking to fill a victim’s addrman new buckets can generate many different Tor addresses and connect from all of them to the victim (mimicking different sources), but all will be treated as one source (127.0.0.1). This is not the case for I2P and CJDNS.

    Maybe this patch is too aggressive and should treat as one group sources that are incoming Tor/I2P/CJDNS and leave the 16 groups (4 bits) for sources that we have connected to (outgoing).

  10. jonatack commented at 4:35 pm on July 28, 2021: contributor
    Concept ACK, will look more deeply.
  11. DrahtBot commented at 10:44 pm on September 7, 2021: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK jonatack, kristapsk

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  12. DrahtBot added the label Needs rebase on Apr 22, 2022
  13. jsarenik commented at 8:29 am on April 24, 2022: none
    Please also consider adding Yggdrasil to the list. https://github.com/bitcoin/bitcoin/pull/23531
  14. vasild force-pushed on Dec 12, 2022
  15. vasild commented at 10:30 am on December 12, 2022: contributor

    228a8c975a...71cb144cb4: rebase due to conflicts

    This is much relevant as it was when I opened it. I am just not sure whether the change may have some unexpected adverse effects. I will eventually reconsider this and maybe move it from “draft” to “open”.

  16. kristapsk commented at 11:53 am on December 12, 2022: contributor
    Concept ACK
  17. DrahtBot removed the label Needs rebase on Dec 12, 2022
  18. mzumsande commented at 6:50 pm on March 23, 2023: contributor

    We’d only ever use 64 of the 1024 buckets for addrs received from Tor addresses

    Correct. This PR makes all Tor sources be treated as a single source, so the same address received from different Tor sources will end up in the same bucket/position. Overall, all addresses received by Tor sources will be dispersed amongst 64 ADDRMAN_NEW_BUCKETS_PER_SOURCE_GROUP buckets.

    I think that we were both wrong above, it’s much less than that, let’s say we run -onlynet=onion and don’t allow inbounds: In that case, the only entropy in GetNewBucket() comes from netgroupman.GetGroup(*this), the netgroup of the addr itself. This is only 4 bits, so tor addresses can only go into 16 buckets (if we are very lucky! Depending on nKey, some of these 16 values likely hash to the same value modulo ADDRMAN_NEW_BUCKETS_PER_SOURCE_GROUP, so usually it’s more like 12-15 buckets). That means we’d likely be able to store less than 1024 onion addresses in the new tables (as opposed to >10000 on master).

    Considering how large the tor network has become and how common it has become to run -onlynet=tor despite the higher eclipse risks (see e.g. https://bitnodes.io/nodes/) that reduction seems too drastic to me.

  19. vasild commented at 11:19 am on March 30, 2023: contributor
    Yes. But I think I should not give up on treating Tor sources as one source because this is what it is in reality. And then, addrman should be able to accommodate 10s of thousands of addresses from one source (if the source is Tor, I2P or CJDNS) and still not allow a single malicious entity to flood it.
  20. DrahtBot added the label CI failed on Jun 9, 2023
  21. DrahtBot removed the label CI failed on Jun 18, 2023
  22. DrahtBot added the label Needs rebase on Aug 28, 2023
  23. addrman: treat Tor/I2P/CJDNS as a single group
    Sub-netting does not exist in Tor/I2P/CJDNS. Obtaining such addresses
    is easier compared to obtaining IP addresses. Also, obtaining
    Tor/I2P/CJDNS addresses that do not have a common prefix is not harder
    than obtaining ones with common prefix.
    9d2cc3f8ab
  24. vasild force-pushed on Aug 29, 2023
  25. vasild commented at 7:58 am on August 29, 2023: contributor

    71cb144cb4...9d2cc3f8ab: rebase due to conflicts

    To resolve this properly a bigger refactor would be required - where we call GetGroup() only for IPv4 and IPv6 addresses and treat Tor/I2P/CJDNS separately.

    Part of this was already done in #27374 which limited GetGroup() calls for IPv4/6 peers.

    Remaining pieces of code that depend on the output of GetGroup() and pass non-IPv4/6 addresses are:

    • the eviction logic (see nKeyedNetGroup in src/node/eviction.cpp) and
    • addrman bucketing (see GetTriedBucket() and GetNewBucket() in addrman.cpp).
  26. DrahtBot removed the label Needs rebase on Aug 29, 2023
  27. 0xB10C commented at 11:50 am on September 12, 2023: contributor
    Is this ready for review?
  28. vasild commented at 8:14 am on September 13, 2023: contributor
    @0xB10C this is a bit hackish and, as an undesirable side-effect, will change the number of buckets in addrman that are available for tor/i2p/cjdns addresses. I would rather embark on a bigger change where GetGroup() is only called for IPv4/6 addresses and the distinction between IPv4/6=use GetGroup() VS tor/i2p/cjdns=do something else be made in the higher level code (eviction logic and addrman).
  29. DrahtBot added the label CI failed on Jan 14, 2024
  30. DrahtBot commented at 1:17 am on April 13, 2024: contributor

    🤔 There hasn’t been much activity lately and the CI seems to be failing.

    If no one reviewed the current pull request by commit hash, a rebase can be considered. While the CI failure may be a false positive, the CI hasn’t been running for some time, so there may be a real issue hiding as well. A rebase triggers the latest CI and makes sure that no silent merge conflicts have snuck in.

  31. vasild commented at 9:00 am on May 7, 2024: contributor

    Closing this because I think the proper solution is a bigger change as mentioned above in #22563 (comment):

    If the address is IPv4 or IPv6 then use GetGroup() Else do something else (for Tor, I2P and CJDNS addresses).

    Currently GetGroup() is called for all types of addresses from: AddrInfo::GetTriedBucket() AddrInfo::GetNewBucket() CConnman::CalculateKeyedNetGroup() // saved in CNode::nKeyedNetGroup and later used during eviction CConnman::ThreadOpenConnections() // already only called for IPv4 or IPv6 addresses, good!

    If somebody works on this, I would be happy to review. Otherwise maybe I will pick it at a later time.

  32. vasild closed this on May 7, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me