mzumsande
commented at 8:36 pm on September 12, 2023:
contributor
This is joint work with amitiuttarwar.
See issue #28462 for a broader discussion on increasing the number of block-relay-only connections independent of this particular implementation proposal.
We suggest to increase the number of inbound slots allocated to block-relay-only peers by increasing the default maximum connections from 125 to 200, with 50% of inbound slots accessible for tx-relaying peers.
This is a prerequisite for being able to increase the default number of outgoing block-relay-only peers later, because the current inbound capacity of the network is not sufficient.
In order to account for incoming tx-relaying peers separately from incoming block-relay peers, changes to the inbound eviction logic are necessary.
See the next post in this thread for a more detailed explanation and motivation of the changes.
DrahtBot
commented at 8:36 pm on September 12, 2023:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
#29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
DrahtBot added the label
P2P
on Sep 12, 2023
mzumsande marked this as a draft
on Sep 12, 2023
mzumsande
commented at 8:37 pm on September 12, 2023:
contributor
Status Quo
We have a default maximum of 125 connections (not counting manual ones). Out of these,
11 (8 full outbounds, 2 block-relay-only, 1 feeler) are reserved for outbounds, leaving us with
114 inbound slots. There are currently no limits on how many of the inbounds can support transaction relay.
Problem
We want to increase the number of inbound slots specifically offered to block-relay-only connections, but don’t want to significantly change the number of tx-relaying inbound peers, because we don’t want to radically change the memory and traffic requirements that come with running a full node with the default configuration.
This is complicated by a timing issue: When a new peer connects to us, we don’t know yet whether it is block-relay-only before we have processed their version message.
Inbound peer eviction, however, happens right after a peer connects to us and before processing its version msg. This means that we don’t yet know during inbound eviction, if the new peer wants tx relay.
Approach
We propose to introduce a separate maximum of tx-relaying inbound connections, in addition to the existing global maximum.
The new maximum for tx-relaying inbounds gets chosen such that 50% of the total inbound slots can be utilized by tx-relaying peers. We choose a ratio here instead of a fixed number here so that it scales with users changing their -maxconnections value from the default. The ratio of 50% is also made adjustable in the form of a new startup parameter.
Changes to the eviction logic
We don’t change the eviction logic for total connections, which happens in CreateNodeFromAcceptedSocket().
We add a tx-relay-specific eviction check to the version message processing which runs if we don’t have capacity for another tx-relaying peer: This check uses the same logic from AttemptToEvictConnection, but specifically evicts only tx-relaying peers (by protecting the rest of the peers). Only if we cannot find another peer to evict (e.g. all peers are protected, which should only happen if a low -maxconnections is used) we disconnect the new peer. This approach achieves that the number of tx-relaying peers can never exceed the limit.
Numbers
Specifically, we propose to increase the total number of connections from 125 to 200, with a suggested ratio of 50% tx-relay inbounds. With 11 connection slots reserved for outbound peers,
this results in (200 - 11) * 0.5 = 94.5 slots for tx-relaying inbound peers, which closely aligns with the typical tx-relaying inbounds seen in today’s world, where typically ~20% of inbound slots are already taken by block-relay-only connections.
These numbers were chosen to be sufficient to raise the number of block-relay-only connections from 2 to 8 once this patch is widely deployed (see Issue #28462 for more details on this).
[Note: In a previous iteration, a ratio of 60% was suggested, see this post for more discussion.]
Handling bloom filter peers
The logic gets a bit complicated if a node supports BIP37 (bloom filters), because then an inbound peer can be switched to tx-relay after sending a FILTERLOAD or FILTERCLEAR message (but never back!), which could result in an excess of tx-relaying inbounds. We propose to solve this by triggering the eviction logic for a tx-relaying peer from net_processing whenever we switch a peer to tx-relay due to a BIP37-related message.
naumenkogs
commented at 8:28 am on September 13, 2023:
member
Concept ACK, specifically on changes to the eviction logic, and handling bloom filters.
Numbers also seem fine I guess.
I was curious whether there is (or will be) any other good reason to delay eviction after the version (apart from making this logic cleaner), but I haven’t identified any.
mzumsande
commented at 4:16 pm on September 28, 2023:
contributor
I was curious whether there is (or will be) any other good reason to delay eviction after the version (apart from making this logic cleaner), but I haven’t identified any.
Discussed this with @amitiuttarwar and we think that it would be better to move the tx-related eviction into the version processing logic.
Pros:
It’s conceptually simpler to only evict a tx-relaying peer when we are sure that it is one
We wouldn’t have to evict anyone in the situation where all tx inbound slots are full and a new block-relay inbound connects
We need the logic for this (EvictTxPeerIfFull()) anyway when dealing with bloom-filter peers.
Cons:
The eviction now happens in two different places, after accepting a connection and during version processing.
I will update the PR with this soon!
DrahtBot added the label
Needs rebase
on Oct 3, 2023
mzumsande force-pushed
on Oct 3, 2023
DrahtBot removed the label
Needs rebase
on Oct 3, 2023
mzumsande force-pushed
on Oct 13, 2023
mzumsande
commented at 6:38 pm on October 13, 2023:
contributor
Changed the approach to do the eviction of tx-relaying peers within the version message processing, when we can make use of the fRelay flag they sent us - this has the advantage that we don’t have to evict new peers before being sure if they want tx relay.
As a result of the later eviction, it became necessary to protect the new peer from being evicted right by AttemptToEvictConnection() right after connecting (so that the new peer only gets evicted if no other peer can be found). This isn’t done for the existing unconditional inobound eviction in net (CreateNodeFromAcceptedSocket), because there we would evict a peer before adding the new peer to m_nodes.
mzumsande force-pushed
on Oct 13, 2023
DrahtBot added the label
CI failed
on Oct 13, 2023
DrahtBot removed the label
CI failed
on Oct 13, 2023
in
src/net.h:1201
in
ab5ad61f67outdated
1195@@ -1193,6 +1196,16 @@ class CConnman
1196 int GetExtraFullOutboundCount() const;
1197 // Count the number of block-relay-only peers we have over our limit.
1198 int GetExtraBlockRelayCount() const;
1199+ /**
1200+ * If we are at capacity for inbound tx-relay peers, attempt to evict one.
1201+ * @param[in] protect_peer NodeId of a peer we want to protect
amitiuttarwar
commented at 11:26 pm on October 16, 2023:
spacing nit to line up with the return description
0 * [@param](/bitcoin-bitcoin/contributor/param/)[in] protect_peer NodeId of a peer we want to protect
mzumsande
commented at 9:42 pm on October 19, 2023:
done
in
src/net.cpp:1658
in
ab5ad61f67outdated
1699@@ -1700,7 +1700,7 @@ std::pair<size_t, bool> CConnman::SocketSendData(CNode& node) const
1700 * to forge. In order to partition a node the attacker must be
1701 * simultaneously better at all of them than honest peers.
1702 */
1703-bool CConnman::AttemptToEvictConnection(bool evict_tx_relay_peer, std::optional<NodeId> protected_peer)
1704+bool CConnman::AttemptToEvictConnection(bool evict_tx_relay_peer, std::optional<NodeId> protect_peer)
amitiuttarwar
commented at 11:34 pm on October 16, 2023:
I’m guessing you renamed to protect_peer to match the function signature of EvictTxPeerIfFull, which sounds good to me. but this change should be squashed into ce8cd83a81ac5da882e8c6862a8d53efafc3ae86 which introduced the param.
mzumsande
commented at 9:42 pm on October 19, 2023:
Yes, some changes made it into the wrong commit. Fixed now.
in
src/net_processing.cpp:3492
in
ab5ad61f67outdated
3488@@ -3489,6 +3489,16 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3489 }
3490 }
34913492+ // If we have too many tx-relaying inbound peers, attempt to evict one (not necessarily this one).
amitiuttarwar
commented at 0:07 am on October 17, 2023:
current comment feels misleading since we specifically protect this peer
0 // If we have too many tx-relaying inbound peers, attempt to evict an existing one.
mzumsande
commented at 9:43 pm on October 19, 2023:
Thanks, took your suggestion.
in
src/net.h:1364
in
8247eaa09doutdated
1359@@ -1351,7 +1360,13 @@ class CConnman
1360 */
1361 bool AlreadyConnectedToAddress(const CAddress& addr);
13621363- bool AttemptToEvictConnection();
1364+ /**
1365+ * Try to find an inbound connection to evict when the node is full.
amitiuttarwar
commented at 0:58 am on October 17, 2023:
0 * Attempt to evict an inbound connection.
unlike EvictTxPeerIfFull, this function doesn’t explicitly check “full”, right? there’s some implicit logic around min number of connections required for there to be unprotected peers, but that’s slightly different than max capacity?
this comment is also on the function definition in net.cpp on master & I think we should remove it from there as well.
mzumsande
commented at 9:43 pm on October 19, 2023:
True, that is something the caller has to check. I removed the notion of “full” from both net.h and net.cpp.
in
src/net_processing.cpp:3494
in
8247eaa09doutdated
3488@@ -3489,6 +3489,16 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3489 }
3490 }
34913492+ // If we have too many tx-relaying inbound peers, attempt to evict one (not necessarily this one).
3493+ // Only if this fails, disconnect this peer.
3494+ if (pfrom.IsInboundConn()) {
amitiuttarwar
commented at 8:02 pm on October 17, 2023:
I think we can strengthen the check here to prevent a iterating through all the nodes if the incoming peer doesn’t want tx relay enabled (and we are not offering NODE_BLOOM service)
0 if (pfrom.IsInboundConn() && pfrom.m_relay_txs) {
mzumsande
commented at 9:43 pm on October 19, 2023:
Good point - done.
in
src/net.h:1369
in
8247eaa09doutdated
1365+ * Try to find an inbound connection to evict when the node is full.
1366+ * @param[in] evict_tx_relay_peer Whether to only select full relay peers for eviction
1367+ * @param[in] protect_peer Protect peer with node id
1368+ * @return True if a node was marked for disconnect
1369+ */
1370+ bool AttemptToEvictConnection(bool evict_tx_relay_peer, std::optional<NodeId> protect_peer);
amitiuttarwar
commented at 8:09 pm on October 17, 2023:
adding default values here would allow us not to have to touch the call site from CreateNodeFromAcceptedSocket & make more clear that the existing functionality shouldn’t change unless additional info is passed in.
mzumsande
commented at 9:44 pm on October 19, 2023:
changed to use default values.
in
src/net.h:1209
in
8247eaa09doutdated
1204+ * no need for eviction, or a peer was evicted).
1205+ * Returns false, if we are full but couldn't find
1206+ * a peer to evict (all eligible peers are protected)
1207+ * so that the caller can deal with this.
1208+ */
1209+ bool EvictTxPeerIfFull(std::optional<NodeId> protect_peer);
amitiuttarwar
commented at 8:47 pm on October 17, 2023:
I wonder if it could be clearer to call this function something like TxInboundCapacityAvailable. imo that seems more useful from the POV of the net processing caller, but either is ok with me.
amitiuttarwar
commented at 8:51 pm on October 17, 2023:
default param here makes the FILTERLOAD/FILTERCLEAR callers slightly cleaner
mzumsande
commented at 9:49 pm on October 19, 2023:
changed to use default values.
Not sure about the rename suggestion, I think I prefer EvictTxPeerIfFull, because I think for naming describing what the function does is more important than describing the return value. Otherwise other spots might call this function in the future with the intention to just perform a check, and then might be surprised about the side effects.
amitiuttarwar
commented at 10:19 pm on October 19, 2023:
and then might be surprised about the side effects.
yeah totally. I was trying to make the return values more self-evident, but on further thought, I agree with you that emphasizing eviction is more important - people can always read the doc string to understand the return value. can’t think of anything to concisely express HasOrMakeInboundTxCapacity, so leaving as is sounds good :)
amitiuttarwar
commented at 0:27 am on October 18, 2023:
contributor
overall, I like this approach significantly more. although we are introducing another touchpoint for inbound eviction logic, it is more straightforward to reason about. but the biggest improvement is not unnecessarily disconnecting a new or existing connection when tx relay isn’t being requested.
reviewed this in-depth. the fundamental structure & logic looks good to me. left a lot of comments along the way 😛
some additional thoughts:
commit message for ab5ad61f67f3bcab503ac3eb012c4f35d733b946 says “When we receive an inbound connection and don’t have space for another full-relay peer, we now attempt to evict specifically a full-relay inbound.” I think this should mention timing of processing the version message for clarity.
the order of events when processing the VERSION message makes sense, because we want to instantiate tx_relay for the before before triggering the new eviction logic (which uses that in the count).
this code strengthens the distinction between inbound-tx-relay connections from inbound-block-relay connections. ideally, this would be captured in separate connection types, but that would require more invasive changes since we don’t know the distinction at the time of accepting the connection. I don’t think the benefit would be worth the cost, but is something worth keeping in mind if we continue to build stronger distinctions between the two types of inbounds.
mzumsande force-pushed
on Oct 19, 2023
mzumsande
commented at 9:52 pm on October 19, 2023:
contributor
Thanks for the review @amitiuttarwar! i pushed an update, addressing all feedback.
amitiuttarwar
commented at 10:29 pm on October 19, 2023:
contributor
DrahtBot removed the label
Needs rebase
on Nov 9, 2023
0xB10C
commented at 4:17 pm on November 14, 2023:
contributor
I’ve been running this for about a week now. I currently have 161 inbound connections: 113 (maximum) full-relay connections and 48 block-relay-only connections.
0xB10C
commented at 4:18 pm on November 14, 2023:
contributor
[we] don’t want to significantly change the number of tx-relaying inbound peers, because we don’t want to radically change the memory and traffic requirements that come with running a full node with the default configuration.
On multiple full-inbound master nodes I’m getting roughly an 80%/20% full-relay/block-relay-only inbound split. This makes sense as we currently open 8 full-relay outbound and 2 block-relay outbound connections (ignoring feelers here). That’s roughly 91 full-relay and 23 block-relay inbound connections. With the new maximum of 113 that an increase of 22 full-relay inbound slots being used and nearly an increase of 25%. I’d argue that this is a significant change.
mzumsande
commented at 4:02 pm on November 15, 2023:
contributor
I’d argue that this is a significant change.
Hmm yes, in that comment I was thinking about the theoretical limit of all inbound slots being filled with tx-relaying peers. You are right, in practice that amount will be lower because some block-relay-only connections are made to us. In that sense, the introduction of block-relay-only peers likely came with a reduction in bandwidth for popular nodes that were already at capacity before.
One way to accommodate this would be to reduce the percentage, so that there would only be ~90-100 slots for tx-relaying inbound peers instead of 113.
DrahtBot added the label
Needs rebase
on Nov 28, 2023
mzumsande force-pushed
on Nov 28, 2023
DrahtBot removed the label
Needs rebase
on Nov 28, 2023
DrahtBot added the label
CI failed
on Nov 29, 2023
mzumsande force-pushed
on Dec 1, 2023
mzumsande force-pushed
on Dec 1, 2023
DrahtBot removed the label
CI failed
on Dec 1, 2023
mzumsande
commented at 5:13 pm on December 29, 2023:
contributor
After discussing wit @amitiuttarwar I’ve now lowered the inbound percentage from 60% to 50% (leading to 95 instead of 113 slots for tx-relaying inbounds). This way, we’ll be more aligned with the typical number of tx-relaying inbounds today, because nodes today usually have a significant number of block-relay-only inbounds as @0xB10C noted.
The previous number of 113 was aligned with the maximum number of tx-relaying inbounds.
mzumsande force-pushed
on Dec 29, 2023
mzumsande marked this as ready for review
on Dec 29, 2023
DrahtBot added the label
CI failed
on Jan 17, 2024
DrahtBot added the label
Needs rebase
on Jan 29, 2024
mzumsande force-pushed
on Feb 5, 2024
mzumsande
commented at 8:14 pm on February 5, 2024:
contributor
Make sure to run all tests locally, according to the documentation.
The failure may happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
mzumsande force-pushed
on Jul 25, 2024
mzumsande
commented at 3:00 pm on July 25, 2024:
contributor
mzumsande
commented at 6:19 pm on August 5, 2024:
contributor
related doc should also be updated?
Good point, thanks! I added a commit that updates these spots. This would obviously also need release notes, but those could be added at a later point.
mzumsande force-pushed
on Aug 5, 2024
DrahtBot
commented at 9:11 am on September 6, 2024:
contributor
Make sure to run all tests locally, according to the documentation.
The failure may happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot added the label
CI failed
on Sep 6, 2024
in
doc/reduce-traffic.md:8
in
13a51962e8outdated
2@@ -3,8 +3,9 @@ Reduce Traffic
3 4 Some node operators need to deal with bandwidth caps imposed by their ISPs.
5 6-By default, Bitcoin Core allows up to 125 connections to different peers, 11 of
7-which are outbound. You can therefore, have at most 114 inbound connections.
8+By default, Bitcoin Core allows up to 200 connections to different peers, 11 of
9+which are outbound. You can therefore, have at most 189 inbound connections, half of
10+which can only be taken up by low-traffic block-relay-only peers.
naiyoma
commented at 2:13 pm on September 14, 2024:
nit: I’m trying to understand something. According to the PR description, 50% of inbound slots are allocated for tx-relaying peers, but the commit message seems to imply that 50% are for block-relay-only peers. Does this mean that the inbound slots are equally split between tx-relaying and block-relay-only peers? If so, could the description be expanded for clarity?
mzumsande
commented at 10:33 pm on September 17, 2024:
What is meant is that half of the slots can only be accessed by block-relay-only peers, and the other half is accessible to both:
“with 50% of inbound slots accessible for tx-relaying peers” and “half of
which can only be taken up by low-traffic block-relay-only peers” are therefore both saying the same thing:
If we’d have 0 tx-relay inobund peers, in theory all inbound slots could be filled up with block-relay-only peers, whereas if we had 0 block-relay-only inbound peers, still only half of the slots could be filled with tx-relay peers.
naiyoma
commented at 5:01 pm on September 15, 2024:
contributor
Concept ACK to increase the number of maximum connections to 200, enabling an increase in outgoing block relay connections later on.
net: add options to AttemptToEvictConnection
If the added bool evict_tx_relay_peers set, non-tx-relay
peers will be exempt from disconnection.
Also allow to specify a peer that exempt from eviction.
Both options will be used in later commits.
Co-authored-by: Amiti Uttarwar <amiti@uttarwar.org>
d794e5157b
net: increase inbound capacity for block-relay-only connections
..and adjust the eviction logic.
The new default max connection number is 200, the default maximum of tx-relaying
inbounds is limited to 50% of all inbound connections.
With 11 outbound connections, that is (200 - 11) * 0.5 = 94.5.
As a result, the tx-related maximum traffic should not change
drastically.
When we receive an inbound connection and don't have space for another
full-relay peer, we now attempt to evict specifically a full-relay inbound
after receiving the version message of the new peer.
Once this commit is widely deployed, the added inbound capacity will
allow us to increase the number of outgoing block-relay-only connections.
Co-authored-by: Amiti Uttarwar <amiti@uttarwar.org>
7d53b0672c
test: add functional test for inbound maxconnection limits
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
d489f6d701
init: make inbound tx relay percentage configurable
Permit users to change the amount of inbounds that are permitted to relay
transactions. This is particularly relevant to ensure that superusers that are
not concerned with resource usage are not artificially restricted from offering
many transaction relay slots to the network.
Co-authored-by: Martin Zumsande <mzumsande@gmail.com>
7e84285532
p2p: trigger possible eviction if we support bloom filters and change a peer to tx relay
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me