tldr:
Block-relay-only connections help improve partition resistance on the bitcoin network by increasing connectivity while obfuscating the network graph. To be conservative and reduce chances of unexpected network effects, initially 2 outbound block-relay-only connections per node were introduced. Since then, we have improved resource utilization and network behaviors to work towards increasing that number. Now, let’s try and do it!
This is joint work with mzumsande. #28463 proposes a specific implementation to increase the number of inbound slots for block-relay-only connections, which is a prerequisite to later increase the outbound slots. This issue is a place to discuss any conceptual questions or concerns.
History & Context:
In 2019, PR #15759 introduced 2 outbound block-relay-only connections for bitcoind nodes. The primary motivation of introducing these connections was to help obfuscate the network graph, since leaked information could help an adversary execute a partition attack.
From the beginning, there was an open question around how many block-relay-only connections we should add. More connections increase robustness, but the effects to consider include resource utilization and addr relay implications. Introducing 2 extra connections was evaluated to be a good first step that balanced tangible benefits with potential risk. gmaxwell mentioned in this comment, “ultimately I’d like to optimize memory usage for both inbound and outbound blocksonly links, then at least double the inbound connection limit for blocks only links, and make 8 blocks only links.” @sdaftuar expressed agreement with this direction here.
At the time, there were a few different mechanisms that would be a cause for concern if the number of block-relay-only connections were increased. Many have been resolved since and we have highlighted one significant question that needs to be evaluated in the context of this proposal. Additionally, reviewers should consider if there may be any other undesirable side effects.
- Addrman interactions (fixed) [PR #20187](https://github.com/bitcoin/bitcoin/pull/20187): There were a couple of nuanced interactions between block-relay-only connections and addrman that were addressed in this PR. The changes fix a privacy leak, ensure block-relay-only addresses are recognized as reliable connections, and fix a pre-existing addrman bug around active connections.
- Addr relay implications (fixed) [PR #21528](https://github.com/bitcoin/bitcoin/pull/21528): As @ajtowns described in this comment, introducing block-relay-only connections initially degraded the propagation of addr messages on the network. [PR #21528](https://github.com/bitcoin/bitcoin/pull/21528) reduced addr blackholes by improving the behavior for honest nodes. The default behavior was updated to not treat inbound connections as addr relay peers until they indicate interest by initiating an address-related p2p message.
- Memory utilization (fixed) [PR #22778](https://github.com/bitcoin/bitcoin/pull/22778): block-relay-only connections do not require as much memory as transaction relay peers. They don’t need a `TxRelay` data structure, which is significant because the tx relay bloom filter uses approximately 500kb per peer. [PR #22778](https://github.com/bitcoin/bitcoin/pull/22778) allowed nodes to identify whether an inbound peer might ever relay transactions over the lifetime of the connection, and stop initializing the TxRelay data structure when unnecessary.
- Netgroup limitation (fixed) [PR #27374](https://github.com/bitcoin/bitcoin/pull/27374): In certain circumstances, the logic to diversify netgroups of our outbound connections was limiting the total number of connections permitted by the node. This PR fixed this issue by applying separate logic for clearnet peers vs privacy network peers.
- Number of available connection slots on the network (OPEN): available connection slots are a shared and limited network resource. Increasing the defaults for outbound connections should be carefully calibrated against the expected values for inbound slots over time. The next section shares context for how numbers have been selected for our proposal, and this is an area where we are very interested in reviewer feedback.
- Is there anything else we are missing?
Availability of Connection Slots:
Context
The patch in #28463 proposes values based on observing network statistics & calculating expected memory utilization. This section provides more reasoning behind selecting those numbers, so reviewers can evaluate the methodology.
The fundamental question is: how much do we need to increase inbound capacity to accommodate increasing the number of outbounds from 10 to 16? This is tricky to answer with precision because (1) estimating the number of non-reachable nodes is hard and (2) these numbers will inevitably change over time, and we want to accommodate fluctuations.
There is no way to guarantee sufficient connection slots: if all users disabled inbounds, the network would fail. However, we can still observe network behaviors to build confidence around projecting likely proportions that would be maintained over time. After all, the default number of 8 outbound connections dates back to Satoshi's code from 2010, and has successfully held up over the years 🙂
Estimating the number of reachable clearnet nodes (as of Sept 2023)
- 6155 (bitnodes)
- 8509 (KIT)
- 3862 (Luke Dashjr)
- 7910 (21 Ninja)
Quality of data: The bitnodes number is verifiable because peers can be queried, and a random sampling has demonstrated a high probability of successfully connecting. Luke Dashjr reports a significantly lower number than the other sources, but the methodology is not disclosed. The data from 21 Ninja also clearly specifies its methodology and crawler code.
Estimating the number of non-reachable clearnet nodes
- 50956 (Luke, sept 2023, unknown methodology and included networks)
- 32300 (KIT Addr Spam Paper 07/21, estimated from the degrees calculated from observed addr spam)
- 27000 - 35000 (KIT Monitoring Paper 12/21, estimated from the number of gossip addrs received)
- 35000 (bitnodes, estimated from gossip addrs received)
Quality of data: Both KIT methods don’t take into account nodes that don’t self-advertise (e.g. listen=0, or SPV clients) but do permit inbound slots. Estimating based on addr gossip is going to include many spam addresses, so may provide a reasonable upper bound.
Extrapolating numbers
Let’s use rough estimates of 8,000 reachable clearnet nodes & 40,000 non-reachable nodes.
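If every node, reachable or not, opens one additional outbound connection, the network needs roughly 8,000 + 40,000 = 48,000 additional inbound slots for each increment to the default number of outbound connections. Only reachable nodes can provide those slots, so the required additional inbound slots per reachable node for each increment to default outbounds is 48000/8000 = 6.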
With the current network estimates, we would need at least 6 additional inbound slots for each increment to the default number of outbounds. To be more conservative, we should probably add ~8-10 inbound slots for each additional outbound.
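The arithmetic above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions that are ours, not measured facts: every node adopts the new default, and the extra inbound load is spread evenly across reachable nodes. The function and variable names are hypothetical and do not correspond to anything in the Bitcoin Core codebase.

```python
import math


def inbound_slots_per_outbound(reachable: int, non_reachable: int) -> float:
    """Extra inbound slots each reachable node must offer for every +1
    to the default outbound count.

    Assumes (hypothetically) that all nodes adopt the new default and that
    load is spread evenly over reachable nodes.
    """
    total_nodes = reachable + non_reachable  # each node opens one more outbound
    return total_nodes / reachable


# Rough estimates from the text above.
reachable = 8_000
non_reachable = 40_000

per_increment = inbound_slots_per_outbound(reachable, non_reachable)

# Going from 10 to 16 default outbounds is 6 increments.
extra_outbounds = 16 - 10
minimum_extra_inbounds = math.ceil(per_increment) * extra_outbounds

# The more conservative 8-10 inbound slots per additional outbound.
conservative_range = (8 * extra_outbounds, 10 * extra_outbounds)

print(f"per increment: {per_increment}")
print(f"minimum extra inbound slots per reachable node: {minimum_extra_inbounds}")
print(f"conservative range: {conservative_range}")
```

Changing `reachable` and `non_reachable` to any of the other estimates listed earlier shows how sensitive the result is to the assumed ratio of non-reachable to reachable nodes.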
These estimates cover clearnet only. However, it seems likely that inbound capacity on privacy networks is higher in comparison, because unlike clearnet, there is no need to unblock ports, so accepting inbounds is the default behavior. While node operators are able to easily disable those inbounds, we’d anticipate that to happen less frequently than on clearnet where the reverse effort is required.
Future work:
We need users to adopt the changes to increase the default for inbound block-relay-only connections before we can safely increase the default for outbounds. #28463 proposes the increase in inbounds. If these changes are accepted, we would want to wait until the corresponding release is widely adopted, then have a future release update the outbound default.
Anchors were implemented using block-relay-only connections to mitigate against restart-based eclipse attacks. For more context see [issue #17326](https://github.com/bitcoin/bitcoin/issues/17326) & [PR #17428](https://github.com/bitcoin/bitcoin/pull/17428). If we increase the number of outbound block-relay-only connections, we will want to thoughtfully design the interactions with anchors. As sdaftuar mentions in this comment, we will likely want to cap the anchors at 2, which entails selection logic. In this comment, @brunoerg mentions the idea of having (at least) one anchor per network instead of just two generic ones. While discussing this in depth would be premature, we also want to keep these implications and options in mind as we advance the current work.