Listen on random port by default (not 8333) #31036

issue vasild opened this issue on October 5, 2024
  1. vasild commented at 5:04 am on October 5, 2024: contributor

    Please describe the feature you’d like to see added.

    Connections to port 8333 can be recognized right away as Bitcoin P2P connections. While it is still possible to recognize Bitcoin P2P connections regardless of the port, random ports would make network-wide monitoring harder.

    Network-wide monitoring.

    Describe the solution you’d like

    The listening address and port of a node are propagated and saved in other nodes’ databases, so the port has to be constant. Thus, after generating a random port it would need to be saved on disk (e.g. settings.json) and reused after restarts.

    This applies to new installations. Existent ones have already propagated with port 8333 (if not changed by the node operator). So, something like: if a new installation and port is not explicitly provided, instead of using 8333 generate a random one and save it to settings.json.

    This applies only to listening on IPv4 and IPv6 addresses.
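
The rule described above (an explicit port wins, a previously saved random port is reused, and only new installations get a fresh random port) could be sketched as follows. This is a minimal Python illustration, not Bitcoin Core code: the function name `choose_listen_port`, the `listen_port` key, the `is_new_install` flag and the 1024-65535 range are all assumptions made for the example.

```python
import json
import os
import random

DEFAULT_PORT = 8333


def choose_listen_port(settings_path, explicit_port=None, is_new_install=True, rng=random):
    """Return the port to listen on, persisting a random choice on first run."""
    # An explicitly configured port (e.g. port=8333) always wins.
    if explicit_port is not None:
        return explicit_port

    settings = {}
    if os.path.exists(settings_path):
        with open(settings_path) as f:
            settings = json.load(f)

    # Reuse a previously generated port so the advertised address stays constant.
    # "listen_port" is a hypothetical key name.
    if "listen_port" in settings:
        return settings["listen_port"]

    # Existing installations have already propagated with the default port.
    if not is_new_install:
        return DEFAULT_PORT

    # New installation: pick a random unprivileged port and save it.
    port = rng.randint(1024, 65535)
    settings["listen_port"] = port
    with open(settings_path, "w") as f:
        json.dump(settings, f)
    return port
```

Calling the function twice with the same settings file returns the same port, which is the property the propagation argument relies on.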

    Please leave any additional context

    This is more of a network-wide measure. Individual nodes have stronger means to protect themselves.

  2. vasild added the label Feature on Oct 5, 2024
  3. laanwj added the label Brainstorming on Oct 5, 2024
  4. mzumsande commented at 3:52 pm on October 5, 2024: contributor
    This would make the DNS seeds in their current form unusable (see #30900 for a related suggestion, though that would only remedy that for IPv4).
  5. 1440000bytes commented at 1:14 am on October 6, 2024: none

    There are 7000 IPv4 and 2000 IPv6 nodes. I expect this ratio to continue for the next 10 years based on the adoption stats.

    IPv4 nodes will not be able to benefit from non-default port support because of DNS seeds and IPv6.

  6. vasild commented at 11:47 am on October 7, 2024: contributor

    Alright, looks like getting the DNS seeds to propagate ports would be really nice to have or an outright blocker for this.

    Currently, about 7% of my IPv4/IPv6 addrman entries use a non-8333 port.

    $ bitcoin-cli getnodeaddresses 0 | jq -r 'map(select((.network == "ipv6" or .network == "ipv4") and .port != 8333)) | length'
    $ bitcoin-cli getnodeaddresses 0 | jq -r 'map(select(.network == "ipv6" or .network == "ipv4")) | length'
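
The ratio of those two counts can also be computed in one pass over the JSON. The sketch below is an illustration in Python; it assumes the input is the parsed output of `bitcoin-cli getnodeaddresses 0`, that is, a list of objects with `network` and `port` fields. The function name and the sample entries are made up for the example.

```python
import json


def non_default_port_share(addresses, default_port=8333):
    """Fraction of IPv4/IPv6 entries whose port differs from the default."""
    clearnet = [a for a in addresses if a["network"] in ("ipv4", "ipv6")]
    if not clearnet:
        return 0.0
    non_default = [a for a in clearnet if a["port"] != default_port]
    return len(non_default) / len(clearnet)


# Made-up sample using documentation address ranges; real input would come
# from `bitcoin-cli getnodeaddresses 0`.
sample = json.loads("""
[
  {"address": "192.0.2.1",   "port": 8333,  "network": "ipv4"},
  {"address": "192.0.2.2",   "port": 8333,  "network": "ipv4"},
  {"address": "192.0.2.3",   "port": 39388, "network": "ipv4"},
  {"address": "2001:db8::1", "port": 8333,  "network": "ipv6"},
  {"address": "example.onion", "port": 8333, "network": "onion"}
]
""")

share = non_default_port_share(sample)
print(f"{share:.0%} of clearnet entries use a non-default port")
```

The onion entry is ignored because only IPv4 and IPv6 addresses are counted.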
    
  7. Wronskode commented at 10:43 pm on October 10, 2024: none
    This may cause problems if we have to open the port on the router?
  8. jonatack commented at 10:55 pm on October 10, 2024: member

    Currently, about 7% of my IPv4/IPv6 addrman entries use a non-8333 port.

    Same (8% of IPv4, 2% of IPv6, 7% overall)

  9. vasild commented at 7:06 am on October 11, 2024: contributor

    This may cause problems if we have to open the port on the router?

    You mean so that the node behind the router, which listens on a non-8333 port, is reachable from the internet? From the perspective of opening the port on the router, a random port is no different from 8333. The idea only concerns new installations; existing ones won’t be affected. And it is only about the default behavior, so even on new installations one would be able to use port=8333 or bind=whatever:8333.

  10. virtu commented at 8:09 am on October 18, 2024: contributor

    Concept ACK

    Using a default port defeats some parts of v2transport, I guess.

    Concerning DNS seeds, two ideas come to mind:

    1. In theory, the issue could be solved by increasing the scope of #31062: adding port numbers to the encoding and also encoding IP addresses with it. Whether the proposal is practical remains to be seen.
    2. A more radical solution would be to dispense with DNS seeds and instead rely entirely on hardcoded seeds for all network types. Historical data suggests this could be feasible, even more so because the number of hardcoded seed addresses can be increased at virtually no cost. Still, this seems rather controversial because past performance does not guarantee future results and it appears hard to reason about downstream effects.
  11. sipa commented at 1:00 pm on October 19, 2024: member

    Concept ACK.

    Regarding DNS seeds, I think there is a 3rd option, which I favor. We keep the DNS seeds, but treat them always as addrfetch peers, rather than as names to resolve directly. This is what already happens when you’re running in tor-only mode: a P2P connection is made “to” the DNS seed name at port 8333 (i.e., the Tor exit node resolves the name for us, picks one of the IP addresses it resolves to, establishes a connection to it, and forwards it to us without revealing what that IP was), and we send a GETADDR message to it and insert the results into addrman.

    This obviously isn’t possible in I2P-only or tor-hidden-service-only setups (which will need to rely on hardcoded seeds instead), but I believe it will keep functioning fine for tor-only setups. As long as a decent portion of port-8333 nodes remains, they will keep getting found by the DNS seeders, and get queried (but, due to the extra indirection step, what actually ends up in addrman can have any port, as well as any service flags, …). Longer term, seeders could become P2P based, if need be.

    Downsides:

    • Loses the caching effect that ISP resolvers have on DNS seeds (in honest operation, this reduces the load on DNS seeds somewhat).
    • If the network evolves to a point where barely anyone runs 8333 anymore (or only spy nodes…?), this is not a final solution.
    • Depending on how DNS seeders evolve, this may lose the advantage that the seeders do not observe the IP address of new nodes starting up (by being hidden behind ISP resolvers).

    Upsides:

    • Gets to distribute everything that ADDRV2 can encode (any BIP155 network type, port number, service flags).
    • If v2 connections get used, this removes ISP ability to gratuitously observe Bitcoin node IP addresses being distributed to new nodes starting up (assuming no ISP-operated Bitcoin-specific full connection MitM).
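
The difference in what the two channels can carry may be easier to see side by side. The sketch below is only an illustration: `PeerAddress`, `from_dns_answer` and `from_addr_entry` are hypothetical names, and the dictionary stands in for one decoded entry of an address message rather than any real wire format.

```python
from dataclasses import dataclass
from typing import Optional

DEFAULT_PORT = 8333


@dataclass(frozen=True)
class PeerAddress:
    network: str
    address: str
    port: int
    services: Optional[int]


def from_dns_answer(ip):
    """An A/AAAA record carries only an IP, so the port has to be assumed."""
    network = "ipv6" if ":" in ip else "ipv4"
    # Service flags are not part of the record itself; left unknown here.
    return PeerAddress(network, ip, DEFAULT_PORT, None)


def from_addr_entry(entry):
    """An address entry received over P2P carries network, port and services."""
    return PeerAddress(entry["network"], entry["address"], entry["port"], entry["services"])


via_dns = from_dns_answer("192.0.2.7")
via_p2p = from_addr_entry(
    {"network": "ipv4", "address": "192.0.2.7", "port": 39388, "services": 1033}
)
```

With the DNS path the node at `192.0.2.7` can only be tried on port 8333, while the addrfetch path can hand over whatever port it actually listens on.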
  12. mzumsande commented at 5:59 pm on October 20, 2024: contributor

    Downsides:

    Another downside would be that finding an initial batch of peers would take longer. GETADDR answers will include many addresses that are not reachable, whereas DNS seeders only resolve to addresses of nodes that the crawler could reach very recently. I would expect that there would be quite a noticeable effect.

  13. 1440000bytes commented at 6:47 pm on October 20, 2024: none

    Downsides:

    Another downside would be that finding an initial batch of peers would take longer. GETADDR answers will include many addresses that are not reachable, whereas DNS seeders only resolve to addresses of nodes that the crawler could reach very recently. I would expect that there would be quite a noticeable effect.

    Let’s fix DNS seeds: #30900

    Or someone experienced and influential writes an RFC which changes https://datatracker.ietf.org/doc/html/rfc4291#section-2.5.5.2 and allows anything between 0000 and FFFF in those 16 bits.
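
For context, RFC 4291 section 2.5.5.2 defines IPv4-mapped IPv6 addresses as 80 zero bits, 16 bits fixed to FFFF, and the 32-bit IPv4 address. One way to read the comment above is that those 16 bits could carry an IPv4 node's port inside an AAAA record. The Python sketch below illustrates only that reading; it is an assumption, not a description of what #30900 implements, and the function names are hypothetical.

```python
import ipaddress


def pack_ipv4_and_port(ipv4, port):
    """Place `port` in the 16-bit field that RFC 4291 fixes to FFFF."""
    if not 0 <= port <= 0xFFFF:
        raise ValueError("port out of range")
    value = (port << 32) | int(ipaddress.IPv4Address(ipv4))
    return ipaddress.IPv6Address(value)


def unpack_ipv4_and_port(ipv6):
    """Inverse of pack_ipv4_and_port; rejects addresses with non-zero upper 80 bits."""
    value = int(ipaddress.IPv6Address(ipv6))
    if value >> 48:
        raise ValueError("upper 80 bits must be zero")
    ipv4 = ipaddress.IPv4Address(value & 0xFFFFFFFF)
    port = (value >> 32) & 0xFFFF
    return str(ipv4), port


packed = pack_ipv4_and_port("192.0.2.1", 8333)
```

With `port=0xFFFF` the result is an ordinary IPv4-mapped address, which is why such a scheme would need the RFC's fixed value to be relaxed.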

  14. vasild commented at 12:00 pm on November 5, 2024: contributor

    We keep the DNS seeds, but treat them always as addrfetch peers

    Another downside would be that finding an initial batch of peers would take longer

    What about having P2P-seeds, an alternative to DNS-seeds, which are serving the P2P protocol and are used as addrfetch peers and they return only high-quality addresses from the crawler? The IP/Tor/I2P addresses of those P2P-seeds could be hardcoded or (for clearnet only) we could hardcode a hostname, e.g. p2pseed.bitcoin.org, which resolves to one or more P2P-seeds. This would make it possible to change the location of a P2P-seed quickly, rather than waiting for the next release. We could have all of them: hardcoded Tor and I2P addresses of P2P-seeds and hardcoded clearnet hostnames (which depend on DNS) that resolve to P2P-seeds.

  15. mzumsande commented at 3:22 pm on November 5, 2024: contributor

    What about having P2P-seeds, an alternative to DNS-seeds, which are serving the P2P protocol and are used as addrfetch peers and they return only high-quality addresses from the crawler?

    Yes, I think that could work. Also, I wouldn’t say that this particular downside I mentioned would be a blocker - even if it would take on average 1 minute instead of 20 seconds to find the initial 10 peers, that wouldn’t be the end of the world, since it’s just a one-time event for an empty addrman.

  16. darosior commented at 9:42 pm on November 8, 2024: member

    What about having P2P-seeds, an alternative to DNS-seeds

    I guess the biggest drawback with this is how bootstrapping nodes will actually connect to these “P2P-seeds” instead of leveraging the DNS cache of their ISP.

  17. darosior commented at 10:04 pm on November 8, 2024: member

    Besides the load, which can be dealt with by having a large number of seeds, I wonder how much getting rid of the DNS caching impacts “privacy”, loosely defined. Of course from the perspective of a node, if you want to hide your IP you use an anonymizing network period. But it’s also the case that being a P2P seed gives you a privileged observation position on all the nodes joining the network, which is less true for DNS seeds since most of the time the bootstrapping nodes would not actually connect to them directly. Can this be also dealt with by having a large number of seed nodes such that if every node joining the network picks randomly from the list you only get to observe very few of them? If so how many reachable seeds is enough? And at what rate do they decay?

    Of course this drawback needs to be considered on balance with all the advantages brought by seeding through the Bitcoin protocol instead of DNS:

    • Makes it possible to leverage BIP324 transport
    • Makes it possible to have seeds for anonymizing networks
    • Removes DNS as a build time dependency (it could help releasing static builds against musl for instance)
    • Works fine in a future where most nodes on the network listen on a random port
  18. vasild commented at 1:55 pm on November 12, 2024: contributor

    @darosior, good observations and questions! Maybe use the P2P-seeds only as a fallback if it takes too long?

    I mean this - right now there are around 2000 hardcoded addresses in contrib/seeds/nodes_main.txt. Some of them will be down, thus the problem mentioned above: “Another downside would be that finding an initial batch of peers would take longer”. So, maybe if a new node can’t find enough peers in some reasonable time from those 2000 (and GETADDR replies if it manages to connect to some of them), only then fall back to the P2P-seeds (that have high quality GETADDR responses). This way P2P-seeds will not see every new node. That plus having a lot of P2P seeds should mitigate the privacy issue of having a P2P seeder know all newly joining nodes.
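
The fallback order proposed here could look roughly like the Python sketch below. It is a simplification under stated assumptions: an attempt budget stands in for "some reasonable time", GETADDR replies from hardcoded peers are left out, and `find_initial_peers`, `try_connect` and `query_seed` are hypothetical names supplied by the caller.

```python
def find_initial_peers(hardcoded, p2p_seeds, try_connect, query_seed, needed=10, budget=50):
    """Return (peers, entered_fallback).

    try_connect(addr) -> bool     whether a connection to addr succeeded
    query_seed(seed)  -> list     addresses returned by a P2P seed
    """
    peers = []

    # Phase 1: hardcoded addresses only, limited by the attempt budget.
    for addr in hardcoded[:budget]:
        if try_connect(addr):
            peers.append(addr)
            if len(peers) >= needed:
                return peers, False

    # Phase 2: contact P2P seeds only because phase 1 did not find enough peers.
    for seed in p2p_seeds:
        for addr in query_seed(seed):
            if addr not in peers and try_connect(addr):
                peers.append(addr)
                if len(peers) >= needed:
                    return peers, True
    return peers, True
```

A node that fills its quota in phase 1 never contacts a P2P seed, which is what keeps the seeds from observing every newly joining node.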


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC
