p2p: Add 2 outbound block-relay-only connections

sdaftuar commented at 7:29 pm on April 5, 2019: member

Transaction relay is optimized for a combination of redundancy/robustness as well as bandwidth minimization – as a result transaction relay leaks information that adversaries can use to infer the network topology.

Network topology is better kept private for (at least) two reasons:

(a) Knowledge of the network graph can make it easier to find the source IP of a given transaction.

(b) Knowledge of the network graph could be used to split a target node or nodes from the honest network (eg by knowing which peers to attack in order to achieve a network split).

We can eliminate the risks of (b) by separating block relay from transaction relay; inferring network connectivity from the relay of blocks/block headers is much more expensive for an adversary.

After this commit, bitcoind will make 2 additional outbound connections that are only used for block relay. (In the future, we might consider rotating our transaction-relay peers to help limit the effects of (a).)

gmaxwell commented at 7:40 pm on April 5, 2019: contributor

I agree with the concept. I wish we could announce our own addresses to these peers, but I think doing so results in a topology leak. :( The reason I’d like this is that if an attacker eclipses a node except for its blocks only links it will eventually eclipse its blocks-only links too through control of the addrman state.

sdaftuar commented at 7:41 pm on April 5, 2019: member

This approach was motivated by the TxProbe paper. Fundamentally, transaction relay is going to leak information about our node’s peers (as long as we care about not wasting bandwidth). Rather than combat transaction-relay based inferences, which I think is a lost cause in the long-run even if there are improvements we can make in the short-term, I figure we can directly address why we care about keeping the graph private.

Some considerations I had while working on this:

Is 2 the right number of blocks-only connections? From a robustness standpoint, more connections like this are better; however there’s a tradeoff with resource utilization. Two resources that occurred to me to worry about were the memory used by each CNode object (hence the refactor commits that allow the transaction-relay data structures to not be instantiated for the blocks-only peers), and the number of connection slots on the network (I figure going from 8 to 10 is not that bad?). I also did some simulations to verify that a random graph of 10000 peers would be fully connected if each peer makes 2 outbound connections to random other peers, which I thought was a good justification for 2 being a useful number of outbounds to add.
How should addrman interactions work on these blocks-only connections? My approach attempts to leave addrman unchanged as a result of these connections, neither processing addr messages nor sending any, to avoid leaking information via addr messages that we relay to other peers. But I have not given a lot of thought to what kind of attacks are possible using addrman, nor what the consequences are to not updating addrman with information from our blocks-only peers.

gmaxwell commented at 7:49 pm on April 5, 2019: contributor

I also did some simulations to verify that a random graph of 10000 peers would be fully connected if each peer makes 2 outbound connections to random other peers,

Can you run these simulations assuming x% of the 10,000 peers are black holes (don’t connect the graph), for various values of x%?

2 might be fine for now, but ultimately I’d like to optimize memory usage for both inbound and outbound blocksonly links, then at least double the inbound connection limit for blocks only links, and make 8 blocks only links. (I suspect running that simulation assuming reasonable number of black hole peers will show that 2 is not really enough).

instagibbs commented at 8:15 pm on April 5, 2019: member

I figure we can directly address why we care about keeping the graph private

To the uninformed p2p reader this reason is?

DrahtBot commented at 9:12 pm on April 5, 2019: member

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#16762 (Rust-based Backup over-REST block downloader by TheBlueMatt)
#16702 (p2p: supplying and using asmap to improve IP bucketing in addrman by naumenkogs)
#16698 ([WIP] Mempool: rework rebroadcast logic to improve privacy by amitiuttarwar)
#16682 ([WIP] p2p: Disconnect peer that send us tx INVs when we opted out of tx relay by jnewbery)
#16507 (feefilter: Compute the absolute fee rather than stored rate by instagibbs)
#15502 (Speed up initial connection to p2p network by ajtowns)
#14033 (p2p: Drop CADDR_TIME_VERSION checks now that MIN_PEER_PROTO_VERSION is greater by Empact)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

gmaxwell commented at 9:37 pm on April 5, 2019: contributor

To the uninformed p2p reader this reason is?

(b) Knowledge of the network graph could be used to split a target node or nodes from the honest network (eg by knowing which peers to attack in order to achieve a network split).

DrahtBot added the label P2P on Apr 5, 2019

sdaftuar commented at 2:59 pm on April 8, 2019: member

Can you run these simulations assuming x% of the 10,000 peers are black holes (don’t connect the graph), for various values of x%?

This table shows my simulation results for various black hole rates, with various values of how many outbound connections we make (k=2 is what I’ve implemented in this PR). This is on a 10,000 node graph with no non-listening nodes. Each table entry shows the success count (where success is defined as “the subgraph of non-black-hole nodes is connected”) out of 1000 simulations.

black hole rate	k=2	k=4	k=6	k=8
0	1000	1000	1000	1000
5%	30	998	1000	1000
10%	0	976	1000	1000
15%		867	999	1000
20%		598	994	1000
25%		229	978	1000
30%		33	928	998
35%		2	812	992

2 might be fine for now, but ultimately I’d like to optimize memory usage for both inbound and outbound blocksonly links, then at least double the inbound connection limit for blocks only links, and make 8 blocks only links.

This is roughly what I have in mind as well. I think we need a p2p protocol change to allow negotiating blocks-only links (so that the peer on the other side of our connection knows that the inbound peer won’t turn transaction relay on later), so I figured we could start small for now.

practicalswift commented at 5:32 pm on April 8, 2019: contributor

@sdaftuar Thanks for sharing the results of the analysis! Would you be willing to share the simulation scripts?

sdaftuar commented at 11:57 am on April 9, 2019: member

@practicalswift My simulation code is unreadable, and also may have bugs, so I’d prefer that anyone interested in this would independently corroborate my results from scratch.

naumenkogs commented at 5:34 pm on April 10, 2019: member

separating block relay from transaction relay;

This is indeed a good idea. I also see it as a small step towards peer rotation, I think this suggestion makes it easier to reason about rotation.

I’m not sure if this is strongly related to the data you provided, but I’m also curious how many nodes have to be attacked (suspended?) to split network in halves etc.

luke-jr commented at 6:41 pm on April 18, 2019: member

Wouldn’t this break all the benefits of compact blocks? (Ignoring that with the current implementation, block relay would likely still occur on the normal connections.)

sdaftuar commented at 7:57 pm on April 18, 2019: member

@luke-jr I don’t think so. The idea is that nodes are still expected to do transaction relay with other peers, and my belief is that transaction propagation is good enough that compact block relay works just fine on these blocksonly links. I believe my testing of this branch confirms that as well, but I’d welcome additional testers to corroborate those results in case I’m overlooking something.

DrahtBot added the label Needs rebase on May 16, 2019

practicalswift commented at 7:13 pm on May 16, 2019: contributor

Concept ACK: I like the idea

laanwj added this to the "Blockers" column in a project

sdaftuar force-pushed on May 17, 2019

DrahtBot removed the label Needs rebase on May 17, 2019

in src/net.h:697 in c36939ca2e outdated

698-    int64_t nNextAddrSend GUARDED_BY(cs_sendProcessing){0};
699-    int64_t nNextLocalAddrSend GUARDED_BY(cs_sendProcessing){0};
700+    struct AddrRelay {
701+        CCriticalSection cs_addrsend;
702+        std::vector<CAddress> vAddrToSend GUARDED_BY(cs_addrsend);
703+        CRollingBloomFilter addrKnown GUARDED_BY(cs_addrsend);

promag commented at 9:01 am on May 18, 2019:

https://github.com/bitcoin/bitcoin/pull/15759/commits/c36939ca2ea636dce8047065d88c2e70bd418405

Move initialization here and drop constructor:

0CRollingBloomFilter addrKnown GUARDED_BY(cs_addrsend){5000, 0.001};

sdaftuar commented at 3:09 pm on June 5, 2019:

Thanks, fixed.

in src/net.h:693 in c36939ca2e outdated

696-    CRollingBloomFilter addrKnown;
697-    bool fGetAddr{false};
698-    int64_t nNextAddrSend GUARDED_BY(cs_sendProcessing){0};
699-    int64_t nNextLocalAddrSend GUARDED_BY(cs_sendProcessing){0};
700+    struct AddrRelay {
701+        CCriticalSection cs_addrsend;

promag commented at 9:04 am on May 18, 2019:

https://github.com/bitcoin/bitcoin/pull/15759/commits/c36939ca2ea636dce8047065d88c2e70bd418405

Could explain why the new mutex? This commit isn’t just refactor.

sdaftuar commented at 9:37 am on June 5, 2019:

I think one issue is that I wasn’t sure how to do the GUARDED_BY annotations with a variable that didn’t live in the class. But also I thought it might make sense to explicitly lock things in this class with its own mutex, as that seems like a clearer design for making this code modular.

The reason I included this refactor commit at all was in preparation for being able to optionally not instantiate this data structure, if there was concern about minimizing resource utilization as much as possible for these additional blocksonly connections. Since I ended up not doing that (I believe I concluded that it probably wasn’t all that much additional memory?), I could also drop this commit instead.

promag commented at 1:55 pm on June 11, 2019:

Thanks for the explanation, I’ll review again with that in mind.

in src/net.h:831 in c36939ca2e outdated

830         // SendMessages will filter it again for knowns that were added
831         // after addresses were pushed.
832-        if (_addr.IsValid() && !addrKnown.contains(_addr.GetKey())) {
833-            if (vAddrToSend.size() >= MAX_ADDR_TO_SEND) {
834-                vAddrToSend[insecure_rand.randrange(vAddrToSend.size())] = _addr;
835+        if (_addr.IsValid() && !m_addr_relay.addrKnown.contains(_addr.GetKey())) {

promag commented at 9:06 am on May 18, 2019:

https://github.com/bitcoin/bitcoin/pull/15759/commits/c36939ca2ea636dce8047065d88c2e70bd418405

nit, coult early return before the above lock:

0if (!_addr.IsValid()) return;
1LOCK(m_addr_relay.cs_addrsend);
2if (!m_addr_relay.addrKnown.contains(_addr.GetKey())) {

sdaftuar commented at 3:09 pm on June 5, 2019:

Done.

in src/net_processing.cpp:3566 in c36939ca2e outdated

3564             if (!vAddr.empty())
3565                 connman->PushMessage(pto, msgMaker.Make(NetMsgType::ADDR, vAddr));
3566             // we only send the big addr message once
3567-            if (pto->vAddrToSend.capacity() > 40)
3568-                pto->vAddrToSend.shrink_to_fit();
3569+            if (pto->m_addr_relay.vAddrToSend.capacity() > 40)

promag commented at 9:07 am on May 18, 2019:

https://github.com/bitcoin/bitcoin/pull/15759/commits/c36939ca2ea636dce8047065d88c2e70bd418405

nit, could add { }

sdaftuar commented at 3:09 pm on June 5, 2019:

Done.

in src/net.h:716 in 5e6e12ef04 outdated

717+        //    unless it loads a bloom filter.
718+        bool fRelayTxes GUARDED_BY(cs_filter){false};
719+        std::unique_ptr<CBloomFilter> pfilter PT_GUARDED_BY(cs_filter);
720+
721+        mutable CCriticalSection cs_tx_inventory;
722+        CRollingBloomFilter filterInventoryKnown GUARDED_BY(cs_tx_inventory);

promag commented at 9:15 am on May 18, 2019:

https://github.com/bitcoin/bitcoin/pull/15759/commits/5e6e12ef04b640d32d0341a3a1874fde46b0f95e

Initialize here

0CRollingBloomFilter filterInventoryKnown GUARDED_BY(cs_tx_inventory){50000, 0.000001};

And can drop .reset() call, CRollingBloomFilter constructor already calls it.

dongcarl commented at 6:22 pm on May 28, 2019:

And can drop .reset() call, CRollingBloomFilter constructor already calls it.

Permalink for convenience: https://github.com/bitcoin/bitcoin/blob/fdc951ad040a9117bc79f12086ce874b4c2aa55a/src/bloom.cpp#L206-L231

sdaftuar commented at 3:10 pm on June 5, 2019:

Thanks, done.

promag commented at 9:21 am on May 18, 2019: member

Maybe commits up to 7f2ce0a4c5054f8a4061f34968c5928f1f0bbde6 could be in a separate pull request.

jonasschnelli commented at 10:12 am on May 18, 2019: contributor

Concept ACK. I’m still trying to understand this more detailed: does that mean that with 2 block-only connection and an assumed 5% black-hole-nodes rate, we only have a 3% chance to succeed? Since we relay, I guess there is no chance we could identify (and thus disconnect) black-hole peers?

in src/net.h:715 in 1b90f9cafb outdated

712+
713+        AddrRelay() : addrKnown(5000, 0.001) {}
714+    };
715+
716+    AddrRelay m_addr_relay;
717+    const bool m_addr_relay_peer;

Empact commented at 9:33 pm on May 27, 2019:

nit: m_addr_relay_peer could be private

sdaftuar commented at 3:10 pm on June 5, 2019:

Meh, going to leave this as-is.

in src/net.cpp:2643 in 1b90f9cafb outdated

2639@@ -2628,8 +2640,10 @@ CNode::CNode(NodeId idIn, ServiceFlags nLocalServicesIn, int nMyStartingHeightIn
2640     hSocket = hSocketIn;
2641     addrName = addrNameIn == "" ? addr.ToStringIPPort() : addrNameIn;
2642     hashContinue = uint256();
2643-    filterInventoryKnown.reset();
2644-    pfilter = MakeUnique<CBloomFilter>();
2645+    if (fRelayTxes && !blocks_only) {

Empact commented at 9:44 pm on May 27, 2019:

0net.cpp:2643:9: error: ‘fRelayTxes’ was not declared in this scope
1     if (fRelayTxes && !blocks_only) {
2         ^~~~~~~~~~

https://travis-ci.org/bitcoin/bitcoin/jobs/533781128

sdaftuar commented at 3:10 pm on June 5, 2019:

Thanks, fixed now.

in src/net.h:722 in 5e6e12ef04 outdated

704-    std::set<uint256> setInventoryTxToSend;
705     // List of block ids we still have announce.
706     // There is no final sorting before sending, as they are always sent immediately
707     // and in the order requested.
708     std::vector<uint256> vInventoryBlockToSend GUARDED_BY(cs_inventory);
709     CCriticalSection cs_inventory;

dongcarl commented at 6:30 pm on May 28, 2019:

Nit: Would it be appropriate to rename this to cs_block_inventory for clarity?

sdaftuar commented at 3:11 pm on June 5, 2019:

I suppose but I’m going to leave this for now, this diff is already getting big.

in src/net.cpp:2635 in 7f2ce0a4c5 outdated

2631@@ -2632,8 +2632,10 @@ CNode::CNode(NodeId idIn, ServiceFlags nLocalServicesIn, int nMyStartingHeightIn
2632     hSocket = hSocketIn;
2633     addrName = addrNameIn == "" ? addr.ToStringIPPort() : addrNameIn;
2634     hashContinue = uint256();
2635-    m_tx_relay = MakeUnique<TxRelay>();
2636-    m_tx_relay->pfilter = MakeUnique<CBloomFilter>();
2637+    if (fRelayTxes) {

dongcarl commented at 7:11 pm on May 28, 2019:

Perhaps this is a C++ quirk I’m not aware of, but I think that CNode’s fRelayTxes was moved into TxRelay here: https://github.com/bitcoin/bitcoin/pull/15759/commits/5e6e12ef04b640d32d0341a3a1874fde46b0f95e#diff-1a8b9d1ad0a6fda5e751286c73102fc2R712. Perhaps this commit can be dropped as it seems like blocks_only does the same thing in the next commit: https://github.com/bitcoin/bitcoin/pull/15759/commits/d7991eb708589203b854aed19222864bca02c9dc#diff-9a82240fe7dfe86564178691cc57f2f1R2639

sdaftuar commented at 3:13 pm on June 5, 2019:

This was just a mistake when I rebased after the global fRelayTxes got renamed. Fixed now – I actually ended up dropping the commit where I made -blocksonly mode run without the transaction relay data structures.

in src/init.cpp:1781 in d7991eb708 outdated

1776@@ -1777,6 +1777,7 @@ bool AppInitMain(InitInterfaces& interfaces)
1777     connOptions.nLocalServices = nLocalServices;
1778     connOptions.nMaxConnections = nMaxConnections;
1779     connOptions.nMaxOutbound = std::min(MAX_OUTBOUND_CONNECTIONS, connOptions.nMaxConnections);
1780+    connOptions.nMaxOutboundBlocksOnly = std::min(MAX_BLOCKS_ONLY_CONNECTIONS, std::max(connOptions.nMaxOutbound-6, 0));

dongcarl commented at 7:40 pm on May 28, 2019:

Trying to understand the logic here…

According to the commit description, bitcoind will make 2 additional outbound connections, but with this logic, if connOptions.nMaxOutbound <= 6 then there will be no additional outbound connections: https://www.wolframalpha.com/input/?i=min(2,+max(x-6,+0))

From elsewhere in this commit, it seems like we consider the accounting for nMaxOutboundBlocksOnly as separate from that of nMaxOutbound, so perhaps just

0 connOptions.nMaxOutboundBlocksOnly = MAX_BLOCKS_ONLY_CONNECTIONS;

would do?

EthanHeilman commented at 8:14 pm on June 3, 2019:

This - 6 is a little confusing and requires that you look up the default value of connOptions.nMaxOutbound to understand what this is doing. Also echoing @dongcarl comments that this doesn’t work well when connOptions.nMaxOutbound <= 6.

Given that this fails for particular values and you only want 2 block only outbound why not change this to:

0connOptions.nMaxOutboundBlocksOnly = MAX_BLOCKS_ONLY_CONNECTIONS

sdaftuar commented at 1:37 pm on June 5, 2019:

Perhaps the accounting logic should be improved to be clearer, but I believe the behavior here is at least (roughly) correct. The way this works is that if a user sets -maxconnections very low, that will be a hard limit on the number of inbound, outbound and outbound-blocks-only connections. Before this PR, it would take the user setting their max connections value to less than 8 to change the number of outbound connections we make (we first reduce all the inbound we take and only then reduce the outbound).

This PR adds new blocks-only connections and preferences those connections over regular outbound connections (whenever we make an outbound connection, we check to see if we are below our blocks-only target, and if so we initiate the connection as blocks-only). As a result, I thought it didn’t make sense to have any blocks-only links if we have too few regular outbound connections, because I don’t think it makes sense to have blocks-only links at the expense of transaction relay.

In particular, the problem with the suggestion of just setting nMaxOutboundBlocksOnly to 2 is that if the user passes -maxconnections=2, then they’ll only have 2 blocksonly peers and won’t receive any transactions.

So I believe the way I implemented things here is that we try to reserve 6 connections as regular outbound, and then if we add additional connections we will have those be blocks-only until we get to 2, and then any further connections are regular outbounds up to 8, and then finally any additional ones are used for feelers and inbounds.

Of course we could change this to be more conservative and only allocate blocks-only connections if we have all 8 regular outbounds. However I don’t think we should be less conservative and risk having fewer regular outbounds.

TheBlueMatt commented at 7:58 pm on July 10, 2019:

I think just do std::min(MAX_BLOCKS_ONLY_CONNECTIONS, std::max(connOptions.nMaxConnections - MAX_OUTBOUND_CONNECTIONS, 0)).

sdaftuar commented at 2:36 pm on July 12, 2019:

@TheBlueMatt I pushed a commit that changes the behavior so that the first 8 outbounds are reserved to be regular full-relay peers, and only then do we allocate up to 2 blocksonly peers.

in src/net.h:735 in 5e6e12ef04 outdated

736+        int64_t nextSendTimeFeeFilter{0};
737+
738+        TxRelay() : filterInventoryKnown(50000, 0.000001) { filterInventoryKnown.reset(); }
739+    };
740+
741+    TxRelay m_tx_relay;

dongcarl commented at 8:11 pm on May 28, 2019:

Nit: Perhaps we should add documentation that m_tx_relay == nullptr means that the CNode is blocks_only, and !pnode->fInbound && (pnode->m_tx_relay == nullptr) is how we check if a CNode is outbound-blocks-only or not.

sdaftuar commented at 3:15 pm on June 5, 2019:

Added a comment explaining the == nullptr.

in src/init.cpp:1780 in d7991eb708 outdated

1776@@ -1777,6 +1777,7 @@ bool AppInitMain(InitInterfaces& interfaces)
1777     connOptions.nLocalServices = nLocalServices;
1778     connOptions.nMaxConnections = nMaxConnections;
1779     connOptions.nMaxOutbound = std::min(MAX_OUTBOUND_CONNECTIONS, connOptions.nMaxConnections);

dongcarl commented at 8:20 pm on May 28, 2019:

Nit: Perhaps this is a good opportunity to rename nMaxOutbound to something more precise… As it’s easy to think that:

nMaxOutbound bounds nOutbound in CConnman::ThreadOpenConnections(const std::vector<std::string>)
nMaxOutbound includes nMaxOutboundBlocksOnly added below

Both of which are untrue as far as I can tell.

sipa commented at 11:35 pm on July 29, 2019:

Bumping this. It seems the code changes would be slightly simpler if nMaxOutbound referred to the total of all outbound connections (including the blocksonly ones). That would also make it easier to reason whether no places where nMaxOutbound needed to be replaced with nMaxOutbound + nMaxOutboundBlocksOnly were forgotten.

sdaftuar commented at 3:44 pm on July 31, 2019:

Your point appears to be underscored by the fact that I did overlook at least one place where nMaxOutbound needed to be replaced by the sum. I’ll try to improve this as you both suggest.

sdaftuar commented at 4:08 pm on August 1, 2019:

Fixed now.

DrahtBot added the label Needs rebase on Jun 5, 2019

sdaftuar force-pushed on Jun 5, 2019

sdaftuar commented at 3:15 pm on June 5, 2019: member

Rebased and addressed review comments.

DrahtBot removed the label Needs rebase on Jun 5, 2019

sdaftuar commented at 7:58 pm on June 11, 2019: member

I’m still trying to understand this more detailed: does that mean that with 2 block-only connection and an assumed 5% black-hole-nodes rate, we only have a 3% chance to succeed? Since we relay, I guess there is no chance we could identify (and thus disconnect) black-hole peers? @jonasschnelli With 2 blocks-only connections, and if we assume 5% of nodes are useless black holes, then my simulation suggests we have a 3% chance of the network graph being connected by only the blocks-only connections. For additional robustness, we will want to add more blocks-only connections in the future – but for resource utilization reasons I don’t want to do that until we add better support at the p2p protocol layer for establishing these connections.

Note that we are still going to relay blocks on the existing 8 outbound connections, so this PR is purely additive robustness, at the cost of a little more resource utilization. We now will have 10 outbound peers, rather than 8, which are able to provide blocks to us. (We just aren’t going to do tx relay on the two additional connections.)

One issue I’d like to flag for reviewers is how these new blocksonly peers interact with outbound peer disconnection (introduced in #11560 and #11490). I believe I’ve implemented this so that the new blocksonly peers will not be subject to other forms of disconnection, because non-tx-relay outbounds are excluded from being outbound disconnection candidates. However, it’s worth verifying the implementation is correct and worth thinking about whether this behavior is the best choice.

JeremyRubin commented at 9:07 pm on June 11, 2019: contributor

I think that this should be correct code in case someone wants to set it up to tabulate against Suhas’s simulation.

Edit: I’ve rewritten it to be much much better. Scipy is so cool!

 0 import random
 1 import copy
 2 import scipy
 3 import scipy.sparse
 4 import scipy.linalg
 5 import numpy as np
 6 
 7 N = 1000
 8 def biased_coin(bias):
 9     return random.random() <= bias
10 def make_graph(n_outbound, p_blackhole, n_total = N):
11     graph = np.zeros((n_total, n_total))
12     for i in range(n_total):
13         if not biased_coin(p_blackhole):
14             for j in range(n_outbound):
15                 graph[i][j] = 1
16             # rejection sample no self-edges
17             while True:
18                 np.random.shuffle(graph[i])
19                 if not graph[i][i]:
20                     break
21     return graph
22 
23def is_connected(graph):
24      return scipy.sparse.csgraph.connected_components(graph, directed=True, return_labels=False) == 1
25
26 print(sum(is_connected(make_graph(8, 0.05, 1000)) for x in range(100)))

My results were, for 100 samples, 92. This suggests Suhas’s simulation was not correct.

edit: Nevermind, Suhas’s simulation was on n=10,000. will re-run

JeremyRubin commented at 9:53 pm on June 11, 2019: contributor

Improved the code for the above sim, in case anyone’s following by email-only.

JeremyRubin commented at 10:16 pm on June 11, 2019: contributor

Deep apologies for the noise; I missed Suhas’s earlier note about the connectivity of the non black hole nodes, not the total connectivity. The below should work for that.

 0import random
 1 import copy
 2 import scipy
 3 import scipy.sparse
 4 import scipy.linalg
 5 import numpy as np
 6 
 7 N = 1000
 8 def biased_coin(bias):
 9     return random.random() <= bias
10 def make_graph(n_outbound, p_blackhole, n_total = N):
11     graph = np.zeros((n_total, n_total))
12     count = 0
13     for i in range(n_total):
14         if not biased_coin(p_blackhole):
15             count += 1
16             for j in range(n_outbound):
17                 graph[i][j] = 1
18             # rejection sample no self-edges
19             while True:
20                 np.random.shuffle(graph[i])
21                 if not graph[i][i]:
22                     break
23     return (graph,count)
24 
25 def is_connected(graph, count):
26     first_non_blackhole = None
27     for (i,row) in enumerate(graph):
28         if any(r != 0 for r in row):
29             first_non_blackhole = i
30             break
31     if first_non_blackhole is None:
32         return False
33     n, labels = scipy.sparse.csgraph.connected_components(graph, directed=True, return_labels=True)
34     label = labels[first_non_blackhole]
35     size = sum(l == label  for l in labels)
36     return size >= count
37 
38 
39 
40 print(sum(is_connected(*make_graph(8,0.05,10000)) for _ in range(10)))

in src/net_processing.cpp:3577 in bacfdab463 outdated

3586                 }
3587+                pto->m_addr_relay.vAddrToSend.clear();
3588+                if (!vAddr.empty())
3589+                    connman->PushMessage(pto, msgMaker.Make(NetMsgType::ADDR, vAddr));
3590+                // we only send the big addr message once
3591+                if (pto->m_addr_relay.vAddrToSend.capacity() > 40)

promag commented at 10:34 pm on June 11, 2019:

bacfdab

nit, could fix indentation and keep braces?

JeremyRubin commented at 4:32 am on June 18, 2019: contributor

The earlier versions have some issues with the scipy connected_components API (their documentation of the definition of ‘weak’ connected was incorrect, since been patched)

After some back and forth offline with @sdaftuar, we’re on the same page that the below simulation should be an accurate model for the scenario. Suhas and I have spot checked it (with 100 samples) for a few of the k, p, N=10000 and verified it confirms Suhas’s earlier results (I checked k = 2, p = 0.05 and k = 4, p = 0.2).

 0import scipy
 1import scipy.sparse
 2import numpy as np
 3from matplotlib import pyplot as plt
 4
 5def make_graph(n_outbound, p_blackhole, n_total):
 6    assert(n_outbound < n_total)
 7    # Flip n coins, with P(honest_peer[i] == True) = 1-p_blackhole
 8    honest_peer = np.random.random(n_total) >= p_blackhole
 9    graph = np.zeros((n_total, n_total), dtype=bool)
10    for i in range(n_total):
11        if not honest_peer[i]:
12            continue
13        # set n_outbound peers connected
14        for j in range(n_outbound):
15            graph[i][j] = True
16        # shuffle the peers
17        np.random.shuffle(graph[i])
18        # rejection sample so no self-edges exist
19        while graph[i][i]:
20            np.random.shuffle(graph[i])
21        # Prune all connections to hole nodes for analysis
22        for j in range(n_total):
23            if graph[i][j] and not honest_peer[j]:
24                graph[i][j] = False
25    return (graph, honest_peer)
26
27def is_connected(graph, honest_peer):
28    try:
29        first_non_blackhole = np.argwhere(honest_peer)[0][0]
30    except IndexError:
31        # No Honest Peers
32        return False
33    # We use weak connections to imply transitivity of links
34    n, labels = scipy.sparse.csgraph.connected_components(graph, directed=True, return_labels=True, connection="weak")
35
36    # Check that all honest peers are connected
37    label = labels[first_non_blackhole]
38    component_count = np.sum(np.logical_and((label == labels), honest_peer))
39    honest_count = np.sum(honest_peer)
40    return component_count == honest_count
41
42
43# Draws the nodes on a circle and the edges between them
44def draw_graph(g, honest_peer):
45    n = len(g)
46    # Add one extra point and then remove it to not overlap
47    y = np.sin(np.linspace(0, 2*3.1415, n+1))[:n]
48    x = np.cos(np.linspace(0, 2*3.1415, n+1))[:n]
49
50    # Plot arrows
51    for i, col in enumerate(g):
52        for j, connected in enumerate(col):
53            if connected == 1:
54                plt.arrow(x[i], y[i], x[j]-x[i], y[j]-y[i], head_width= 0.0025)
55    # Plot blackholes
56    x_black = [x[j] for j in range(n) if not honest_peer[j]]
57    y_black = [y[j] for j in range(n) if not honest_peer[j]]
58    plt.scatter(x_black,y_black, color='b')
59    # Plot Honest Peers
60    x_red = [x[j] for j in range(n) if  honest_peer[j]]
61    y_red = [y[j] for j in range(n) if  honest_peer[j]]
62    plt.scatter(x_red,y_red, color='r')
63    plt.show()
64
65def sim(x):
66    # We must reseed for each fork...
67    scipy.random.seed()
68    return is_connected(*make_graph(2, 0.05, 10000))
69if __name__ == "__main__":
70    from multiprocessing import Pool
71    p = Pool(16)
72    r = p.map(sim, range(100))
73    print(sum(r))

We can keep this code around as reference because it’s useful to simulate other networks. I also added some code to visualize (small!) networks (~100 nodes looks fine for me), which is useful to visually inspect some low N examples. The models can also be simulated in parallel to speed things up, just adjust the number of cores for your system.

JeremyRubin commented at 5:11 am on June 18, 2019: contributor

I ran the sim with 16 samples per param group. With a small sample size, there’s some variance/loss of precision, but it seems in line. results below:

 0k = 2, p = 0%, 100%
 1k = 2, p = 5%, 0%
 2k = 4, p = 0%, 100%
 3k = 4, p = 5%, 100%
 4k = 4, p = 10%, 100%
 5k = 4, p = 15%, 93%
 6k = 4, p = 20%, 37%
 7k = 4, p = 25%, 37%
 8k = 4, p = 30%, 6%
 9k = 4, p = 35%, 0%
10k = 6, p = 0%, 100%
11k = 6, p = 5%, 100%
12k = 6, p = 10%, 100%
13k = 6, p = 15%, 100%
14k = 6, p = 20%, 100%
15k = 6, p = 25%, 93%
16k = 6, p = 30%, 68%
17k = 6, p = 35%, 93%
18k = 8, p = 0%, 100%
19k = 8, p = 5%, 100%
20k = 8, p = 10%, 100%
21k = 8, p = 15%, 100%
22k = 8, p = 20%, 100%
23k = 8, p = 25%, 100%
24k = 8, p = 30%, 100%
25k = 8, p = 35%, 100%

if you care to run it yourself, with a full 1000 samples:

 0 def sim(k,p):
 1     # We must reseed for each fork...
 2     scipy.random.seed()
 3     return is_connected(*make_graph(k, p, 10000))
 4 if __name__ == "__main__":
 5     from multiprocessing import Pool
 6     pool = Pool(16)
 7     for y in range(1,5):
 8         for x in range(8):
 9             p = x*0.05
10             k = y*2
11             r = pool.starmap(sim, [(k,p)]*(1000))
12             connected = 100.0*sum(r)/(1000.0)
13             print("k = %d, p = %d%%, %d%%"%(k, p*100, connected))
14             if connected == 0:
15                 break

sdaftuar commented at 4:52 pm on July 5, 2019: member

I believe there is a bug in this PR relating to address relay – I will investigate further and update.

sdaftuar commented at 8:37 pm on July 5, 2019: member

Pushed a fixup for an address relay issue I spotted (to be squashed).

jamesob commented at 7:02 pm on July 8, 2019: member

Hit a potential deadlock on this branch during IBD close to tip:

 02019-07-08T18:58:46Z UpdateTip: new best=0000000000000000001ec53a5934fa44779d8ab375605cc80fc1f2eb96c76ce8 height=5806
 189 version=0x20000000 log2_work=90.738975 tx=424259718 date='2019-06-14T12:56:31Z' progress=0.982123 cache=214.6MiB(1
 2639768txo)
 3...
 42019-07-08T18:58:58Z sending getdata (37 bytes) peer=9
 52019-07-08T18:58:58Z received: inv (37 bytes) peer=10
 62019-07-08T18:58:58Z got inv: tx 8290f778e9061a73ae752756aa6982d57d47ffae1317f2feea638c343e4c22d0  new peer=10
 72019-07-08T18:58:58Z received: block (1033841 bytes) peer=11
 82019-07-08T18:58:58Z received block 000000000000000000041665748801ac08d492429c1ce68778cc58d5d6f575a2 peer=11
 92019-07-08T18:58:58Z Requesting block 00000000000000000016c31a145239b8f7acadc11d0b583155a3f31dd6fdde45 (581015) peer=11
102019-07-08T18:58:58Z sending getdata (37 bytes) peer=11
112019-07-08T18:58:58Z received: block (1298943 bytes) peer=12
122019-07-08T18:58:58Z received block 0000000000000000001b2d966e013d6d7dd015702ccf818241a7d626e544c68a peer=12
132019-07-08T18:58:58Z Requesting block 00000000000000000005299461d084e73c72072c23d1372b2c0019d175e2378c (581016) peer=12
142019-07-08T18:58:58Z sending getdata (37 bytes) peer=12
152019-07-08T18:58:58Z received: block (1108360 bytes) peer=0
162019-07-08T18:58:58Z received block 0000000000000000001c3b6640c035a3e8a90162dc71cadf2ca71358c5638778 peer=0
172019-07-08T18:58:59Z Requesting block 0000000000000000001dee2f0e23f87b0826506ed41e88bfb01ab7f82e95daa8 (581017) peer=0
182019-07-08T18:58:59Z sending getdata (37 bytes) peer=0
192019-07-08T18:58:59Z received: block (1256520 bytes) peer=1
202019-07-08T18:58:59Z received block 0000000000000000000ba8d577ae11329b271bf89616afa1203654a411003309 peer=1
212019-07-08T18:58:59Z Requesting block 0000000000000000001c404446e1d35457832e18f7fed03daf06a2e3695df8f8 (581018) peer=1
222019-07-08T18:58:59Z sending getdata (37 bytes) peer=1
232019-07-08T18:58:59Z received: addr (31 bytes) peer=3
242019-07-08T18:58:59Z POTENTIAL DEADLOCK DETECTED
252019-07-08T18:58:59Z Previous lock order was:
262019-07-08T18:58:59Z  pfrom->m_addr_relay.cs_addrsend net_processing.cpp:2124 (in thread msghand)
272019-07-08T18:58:59Z  (1) cs_vNodes ./net.h:227 (in thread msghand)
282019-07-08T18:58:59Z  (2) m_addr_relay.cs_addrsend ./net.h:841 (in thread msghand)
292019-07-08T18:58:59Z Current lock order is:
302019-07-08T18:58:59Z  (2) pfrom->m_addr_relay.cs_addrsend net_processing.cpp:2124 (in thread msghand)
312019-07-08T18:58:59Z  (1) cs_vNodes ./net.h:227 (in thread msghand)
32Assertion failed: detected inconsistent lock order at sync.cpp:121, details in debug log.

sdaftuar force-pushed on Jul 9, 2019

sdaftuar commented at 3:50 pm on July 9, 2019: member

@jamesob Thanks for catching that! The addr relay refactor introduced that “potential deadlock”; while it’s straightforward to fix, I decided to just drop that refactor since I ended up not using it anyway.

I’ve updated the PR accordingly (the previous version of this branch is 15759.1).

in src/net.cpp:1824 in c987790152 outdated

1825@@ -1822,7 +1826,7 @@ void CConnman::ThreadOpenConnections(const std::vector<std::string> connect)
1826                 LogPrint(BCLog::NET, "Making feeler connection to %s\n", addrConnect.ToString());
1827             }
1828 
1829-            OpenNetworkConnection(addrConnect, (int)setConnected.size() >= std::min(nMaxConnections - 1, 2), &grant, nullptr, false, fFeeler);

jamesob commented at 6:29 pm on July 9, 2019:

This logic as written will mean the first two connections we open are blocks-only… would this be a problem if for some reason we only had two reachable peers? I.e. would it make any sense to delay creating blocks-only connections until we’ve already made, say, 2 regular connections? Probably not, but figured it was worth checking.

sdaftuar commented at 12:58 pm on July 12, 2019:

My intuition is that we have bigger problems than not receiving transactions if we’re trying to connect to 8 peers but can only make 2 connections, so this isn’t a big deal – happy to reconsider if others disagree.

TheBlueMatt commented at 8:30 pm on August 13, 2019:

I mean I dont think its all that critical, but I do think I’d prefer if it were non-blocks-only first. Specifically, if you have a rather-stale addrman, you’d like to use your first few peers to exchange addresses so speed up getting further connections (instead of being stuck spinning).

sdaftuar commented at 6:28 pm on August 14, 2019:

Fixed in the latest version.

jamesob commented at 7:07 pm on July 9, 2019: member

So far I’ve tested by applying a patch that adds blocksonly to getpeerinfo output, running an IBD-near-tip, and checking to see that we have blocksonly connections:

 0$ ./src/bitcoin-cli -datadir=/data/bitcoin getpeerinfo | grep blocksonly
 1
 2    "blocksonly": true,
 3    "blocksonly": true,
 4    "blocksonly": false,
 5    "blocksonly": false,
 6    "blocksonly": false,
 7    "blocksonly": false,
 8    "blocksonly": false,
 9    "blocksonly": false,
10    "blocksonly": false,
11    "blocksonly": false,

Code looks pretty straightforward and conceptually this seems like a good idea.

in src/net.h:724 in 32d8ac4ce6 outdated

710     // and in the order requested.
711     std::vector<uint256> vInventoryBlockToSend GUARDED_BY(cs_inventory);
712     CCriticalSection cs_inventory;
713-    int64_t nNextInvSend{0};
714+
715+    struct TxRelay {

TheBlueMatt commented at 8:06 pm on July 10, 2019:

Can you move this into CNodeState? They never belonged in CNode and the only reason they’re still there is because its not been worth doing a free-standing PR to do it, but since you’re moving them, might as well merge it all with TxDownloadState anyway.

sdaftuar commented at 12:54 pm on July 12, 2019:

That turns out to be more invasive than I realized when we discussed offline – fRelayTxes and the bloom filter are used in net.cpp, particularly for the inbound peer eviction logic, so refactoring to allow that to live in net_processing would be more complicated than I think makes sense for this PR.

jnewbery commented at 8:24 am on March 30, 2022:

Done in #21160

naumenkogs commented at 5:57 pm on July 11, 2019: member

I measured (see source code) how extra blocksonly links affect compact blocks relay latency.
Low block relay latency is important to reduce orphan rates and, thus, increase the security of the network. Just to remind, compact block relay latency can be measured in RTTs.

If a connected peer has all transactions from the block exccept coinbase tx (current strategy), it would receive a compact block in 0.5 RTT. If a connected peer misses at least one transaction from the block, it would take 1.5 RTT.

So, increasing the number of block relaying links might speed up compact block relay. But how much? I measured compact block relay RTT between 2 random private nodes in a Bitcoin-like network with pre-relayed transactions (via Erlay or BTCFlood).

	Best case (empty block)	Erlay tx relay	BTCFlood tx relay
8 + 0 links	1.86 RTT	1.9 RTT	2.06 RTT
8 + 2 links	1.8 RTT	1.82 RTT	2.0 RTT
8 + 8 links	1.71 RTT	1.71 RTT	1.91 RTT

It seems like we can reduce compact block relay latency by 7% with 8 extra blocks-only links. It’s hard to tell how it translates into orphan rate, but seems not significant to me.

jamesob commented at 6:41 pm on July 23, 2019: member

ACK https://github.com/bitcoin/bitcoin/pull/15759/commits/c73ff2cfa5e5532f908bfdd454efc02e4356f37a (modulo squashing the last commit into c987790152)

I’m excited for this change. Adding blocks-only connections gives us a lot of resiliency for very little added complexity, and opens the door to safely doing more sophisticated peer management of our regular outbounds. Hopefully we can soon bump up the number of blocks-only connections we make (over the regular 8 outbounds) after we make the marginal memory burden per blocks-only connection low (some CNodeState tweaks?).

Review notes

 0- [x] e101770e19 Remove unused variable
 1
 2- [x] 3742ffd1c2 [refactor] Move tx relay state to separate structure
 3
 4  Lock splitting happens (cs_inventory -> {cs_inventory,cs_tx_inventory}
 5
 6- [x] 1794eef544 [refactor] Change tx_relay structure to be unique_ptr
 7
 8  Super easy to review.
 9
10- [x] 196913ae33 Check that tx_relay is initialized before access
11
12  Grep'd for all usages of m_tx_relay, checked that they were guarded by
13  conditional.
14
15  Most easily reviewed with `-w`.
16
17- [x] c72414f65c Add comment explaining intended use of m_tx_relay
18
19- [x] c987790152 Add 2 outbound blocks-only connections
20
21  Long lines are hard to review on Github, but that's an out-of-scope nit.
22
23- [x] 32d8ac4ce6 Don't relay addr messages to block-relay-only peers
24
25- [x] c73ff2cfa5 fixup! Add 2 outbound blocks-only connections
26
27  Needs fixup into c987790152.

Testing

Built locally, ran with getpeerinfo RPC patch to observe that my first two outgoing connections (out of a total of ten) were blocks-only. Successfully synced to network tip.

Verified that starting up with -maxconnections=4 makes no blocks-only connections.

sdaftuar force-pushed on Jul 24, 2019

sdaftuar commented at 1:52 pm on July 24, 2019: member

@jamesob Thanks for reviewing, I’ve gone ahead and squashed the fixup commit into place.

jamesob commented at 9:57 pm on July 24, 2019: member

reACK https://github.com/bitcoin/bitcoin/pull/15759/commits/085f5b2805f5a202f7d553f9914c94c634c8092c based on an identical comparison

in src/net.cpp:2642 in 64024a0fd1 outdated

2636@@ -2633,8 +2637,10 @@ CNode::CNode(NodeId idIn, ServiceFlags nLocalServicesIn, int nMyStartingHeightIn
2637     hSocket = hSocketIn;
2638     addrName = addrNameIn == "" ? addr.ToStringIPPort() : addrNameIn;
2639     hashContinue = uint256();
2640-    m_tx_relay = MakeUnique<TxRelay>();
2641-    m_tx_relay->pfilter = MakeUnique<CBloomFilter>();
2642+    if (!blocks_only) {
2643+        m_tx_relay = MakeUnique<TxRelay>();
2644+        m_tx_relay->pfilter = MakeUnique<CBloomFilter>();

sipa commented at 11:36 pm on July 29, 2019:

Nit: this initialization of the TxRelay::pfilter field probably belongs better in the TxRelay constructor.

sdaftuar commented at 4:08 pm on August 1, 2019:

Done.

sdaftuar force-pushed on Aug 1, 2019

sdaftuar commented at 4:42 pm on August 1, 2019: member

Updated this PR to incorporate feedback from @sipa / @dongcarl. Previous version is 15759.2

jamesob approved

jamesob commented at 5:33 pm on August 1, 2019: member

ACK https://github.com/bitcoin/bitcoin/pull/15759/commits/aff2a198276f1659a096403cf17e919a103f806b

Only changes since last tip are clarifying the naming of max outbound connection variables and moving TxRelay.pfilter’s initialization to the TxRelay constructor.

Cloned locally, built, and tested with the patch referenced above and -maxconnections=4.

in src/net.h:743 in 3ffdaaba37 outdated

720+        std::set<uint256> setInventoryTxToSend;
721+        // Used for BIP35 mempool sending
722+        bool fSendMempool GUARDED_BY(cs_tx_inventory){false};
723+        // Last time a "MEMPOOL" request was serviced.
724+        std::atomic<int64_t> timeLastMempoolReq{0};
725+        int64_t nNextInvSend{0};

ariard commented at 0:25 am on August 4, 2019:

nit: could take opportunity to comment all members

sdaftuar commented at 2:09 pm on August 13, 2019:

Leaving this alone for now, since I think these are relatively self-commented anyway, but also when I looked at this again it seems like another improvement here would be to conform all these variables to the style guide as well, which I’d prefer to not do in this PR at this point.

in src/net.h:752 in c496ae11cf outdated

724@@ -725,6 +725,7 @@ class CNode
725         int64_t nextSendTimeFeeFilter{0};
726     };
727 
728+    // m_tx_relay == nullptr if we're not relaying transactions with this peer

ariard commented at 0:28 am on August 4, 2019:

Isn’t a boolean a better practice than a nullptr ? Do we prefer to have a node crashing because of non-initialized acces or a topology leak because of a boolean in wrong state ?

MarcoFalke commented at 2:30 pm on August 12, 2019:

@ariard Isn’t a boolean a better practice than a nullptr ? Do we prefer to have a node crashing because of non-initialized acces or a topology leak because of a boolean in wrong state ?

I think the goal here is to not to construct the tx relay object if we don’t want to use it. With a boolean that wouldn’t be possible. However, it is possible with a pointer or an optional.

ariard commented at 6:15 pm on August 14, 2019:

Sorry for the noise on this one, I thought about it after the fact, maybe the ideal would be to have special classes CBlockNode/CTxNode but would need different code paths so that’s for future works.

ajtowns commented at 9:24 am on August 20, 2019:

I could see bool IsBlocksOnly() const { return !m_tx_relay; } being a bit easier to follow, but otherwise thing the way it is is fine. unique_ptr saves memory compared to boost::optional so seems the right choice between those two.

jkczyz commented at 3:50 pm on August 28, 2019:

Agreed here regarding IsBlocksOnly. Would also make future updates simpler given the number of times the check is performed.

sdaftuar commented at 8:51 pm on August 28, 2019:

IMO an explicit nullptr check in the places in the code where we’re about to dereference makes more sense than only using an IsBlocksOnly() function, where it’s less clear to the reader whether a null-dereference could happen. This happens in a lot of places, so I plan to leave many of the == nullptr checks.

There are a couple places in the code where we logically only care whether the peer is a block-relay-only peer, where something like this might be nice; but given the other usages of the nullptr comparisons, I’d prefer to not have two ways to check for this condition, so will leave as-is.

in src/net_processing.cpp:746 in 76f1ffdfc0 outdated

742@@ -743,11 +743,11 @@ void UpdateLastBlockAnnounceTime(NodeId node, int64_t time_in_seconds)
743     if (state) state->m_last_block_announcement = time_in_seconds;
744 }
745 
746-// Returns true for outbound peers, excluding manual connections, feelers, and
747-// one-shots
748+// Returns true for outbound transaction-relay peers, excluding manual

ariard commented at 0:30 am on August 4, 2019:

nit: comment is a bit misleading and without context lets suggest we have already transaction-only peers, isn’t full-relay peers a better wording ?

sdaftuar commented at 2:10 pm on August 13, 2019:

Addressed in latest commit.

ariard commented at 0:39 am on August 4, 2019: member

Reviewed aff2a19.

If long-term goal is to split block relay from tx relay, what’s the state of art of network topology inference based only on blocks (you mentioned it was harder do you have more resources on it?). That would help a lot to decide for the right counter-measures because if you can infer from block propagation too we may need also to segregate blocks-only in addrman subsets with different block relay/addresses announcement policies ?

Specially reviewed than every m_tx_relay access is protected before access. Also don’t spend time to think if we can extend scope of TxRelay, maybe we can for better per-blocks-only node memory economy.

Waiting to sync on tip with this patch on a pruned node to ACK.

naumenkogs commented at 3:57 am on August 4, 2019: member

what’s the state of art of network topology inference based only on blocks

Not much. There is map-z paper which infers topology in Zcash, but it’s a lot simplier than Bitcoin, because they have smaller network, blocks are processed longer, a significant portion of nodes produce blocks, etc. I’m trying to see how easy it would be to use something similar in Bitcoin at the moment, but I tend to agree it’s difficult now (especially with Fibre and Compact Blocks).

sipa commented at 4:30 am on August 4, 2019: member

Ping @sr-gi, perhaps you know of ways for topology inference based on block relay?

sr-gi commented at 10:19 am on August 6, 2019: none

Ping @sr-gi, perhaps you know of ways for topology inference based on block relay?

Not really. We focused on tx relay since the difference in handling normal / orphan transaction could leak useful information. That is not the case for blocks AFAIK. I’ve had no chance to check map-z yet, but I’ll do and see if I can think of something.

in src/net_processing.cpp:1522 in 3ffdaaba37 outdated

1518@@ -1519,11 +1519,11 @@ void static ProcessGetData(CNode* pfrom, const CChainParams& chainparams, CConnm
1519             if (mi != mapRelay.end()) {
1520                 connman->PushMessage(pfrom, msgMaker.Make(nSendFlags, NetMsgType::TX, *mi->second));
1521                 push = true;
1522-            } else if (pfrom->timeLastMempoolReq) {
1523+            } else if (pfrom->m_tx_relay.timeLastMempoolReq) {

MarcoFalke commented at 2:12 pm on August 12, 2019:

note to other reviewers: It is fine to not take a lock here because timeLastMempoolReq is atomic

in src/net_processing.cpp:1520 in 7e6f3738e0 outdated

1498@@ -1499,7 +1499,7 @@ void static ProcessGetData(CNode* pfrom, const CChainParams& chainparams, CConnm
1499     std::deque<CInv>::iterator it = pfrom->vRecvGetData.begin();
1500     std::vector<CInv> vNotFound;
1501     const CNetMsgMaker msgMaker(pfrom->GetSendVersion());
1502-    {
1503+    if (pfrom->m_tx_relay != nullptr) {

MarcoFalke commented at 3:11 pm on August 12, 2019:

Would be nice to add a comment here to explain that this will effectively treat MSG_TX and MSG_WITNESS_TX as “unknown” types, thus blocking the queue, thus starving this peer completely and making it a zombie.

sdaftuar commented at 2:10 pm on August 13, 2019:

Addressed in latest commit.

MarcoFalke approved

MarcoFalke commented at 3:43 pm on August 12, 2019: member

ACK aff2a198276f1659a096403cf17e919a103f806b (code looks good, did not compile, left some doc-nits or questions)

My comment suggestion is obviously not blocking.

Signature:

 0-----BEGIN PGP SIGNED MESSAGE-----
 1Hash: SHA512
 2
 3ACK aff2a198276f1659a096403cf17e919a103f806b (code looks good, did not compile, left some doc-nits or questions)
 4-----BEGIN PGP SIGNATURE-----
 5
 6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
 7pUgNfAwAgUQwQBVBwDFpfLKsj/XmKAkJSRuiF5vIV0YfxazkrJKKDVVy/xv/oZ6l
 89hMEaE9VUYcvcDCCUNcRffsZWR3Z1RPKtXT5ZCsm6ao+jDPCu/jaTj44QU2Kup+V
 9nsVBI+RgTKsEQxq79CjE9/W7pwQxHDGv3Q/GrhwYbapm0+ChOiPu60yDslLd1Dmo
10tDzHj+97/9fQv8tGGpYkVhrZYFDooQwG0+0nm8fNFzx3u2uavUFVUytoU9pQXiBR
11FdnuZC8D6OHNXy9kY1g21WBwbADALlfzjGqY9ZzLatP+K3RJdocakju7J3Tz5OH+
12Ove4q1p0qPArP0dAWzJH504vrDv3ROAAB9s7GbxMxDINC5VoeR+mnIks13Wetg1f
13XzMu4vNcLXwbRwQmJG5R+okuQn2CJgaXW+0Jd8oz10pLhAW/kZuTyia6ZDa0rKCE
14ipDod3me1995V3Ss9v7E6Re6zEQHK4TXOgsDPDRjmA8XTTx/z1PFJRIyukW1vkkZ
15rdqAMVd4
16=YXe1
17-----END PGP SIGNATURE-----

Timestamp of file with hash 0779afcec8ee151d90bebc83b9f03a65c34007cf2b11641bd977acc4c7885919 -

sdaftuar commented at 2:11 pm on August 13, 2019: member

Updated to address some comment nits from @MarcoFalke and @ariard.

MarcoFalke commented at 2:19 pm on August 13, 2019: member

re-ACK 08478f1409c65335a28dcf796f8e87a43d1b7040 (only change is adding a commit to clarify documentation)

Signature:

 0-----BEGIN PGP SIGNED MESSAGE-----
 1Hash: SHA512
 2
 3re-ACK 08478f1409c65335a28dcf796f8e87a43d1b7040 (only change is adding a commit to clarify documentation)
 4-----BEGIN PGP SIGNATURE-----
 5
 6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
 7pUiQCAwAuhyBP+uOlfgbfa2sDwacm+CIVhPYIlXf06rqaW3j/aepwC2hsAQmV7bZ
 8UJ+jIJEniX7VMKjRU9v8bEc7BN3cUFaQVrWSPdVUslZGxKmP1TY7PaCzoBSabHhs
 9hY2ep0HpWGjaSSf32F+OsqX0qEP/+Pe65tAydZjzqQPMNz5pMgkfedFEHOpJF2vW
10bUAlQR7f1/1l/aJNmuF9to+GqP7iQGzf30EMSWI6YYPWtk2jUlTalBRd8SLZsE5e
11lkaamFWjVfTK7IgfhI7y5wg5/CwhdZWCOQpkycEu5moNhyUqCIBmi6IAqSDSn6iU
12DOLw7mZ29TU8Y92YebfYu+kkSJ93ZKNElSm7JlCDIfu+zKWYeGjQjDhFh6aNjN3J
13exjBXG0EI+uyiF/8DpjvgletvHkCqHEHPMexisH0Ly6T8PnHEblOFqIgTviwEKfU
148yAgIcdiVzYnHYCvFUrncFSPowh0rdJyhx07pLcaaHOKNLWSqDH+q330vz/a1nQX
15Mab5hYsv
16=DLTl
17-----END PGP SIGNATURE-----

Timestamp of file with hash 52bbf433e2ba99292fc1ad0a59de737cfa33cafaa42dd8ee010da3bd41aa1df1 -

MarcoFalke added this to the milestone 0.19.0 on Aug 13, 2019

mzumsande approved

mzumsande commented at 5:38 pm on August 13, 2019: member

tested ACK 08478f1409c65335a28dcf796f8e87a43d1b7040 (tested with aff2a198276f1659a096403cf17e919a103f806b and checked that the last commit is docs-only).

I tested this for a couple of hours on mainnet with debug=net during final 1% of IBD and after:

Checked that disconnections of normal outbound and blocks-only nodes don’t change the targeted composition.
Checked that blocks-only nodes download and relay blocks and don’t relay tx-related messages.
Tested number of blocks-only connections for different maxconnections (7->0,9->1)

Reviewed code and didn’t find issues, though I am not too familiar yet with net.

I think that the patch for the getpeerinfo RPC by @jamesob would be really helpful as a follow-up PR. I relied on the relaytxes property of getpeerinfo which is not ideal.

MarcoFalke commented at 5:50 pm on August 13, 2019: member

I think there are a bunch of “doc-level” cleanups that should come afterwards. Like release notes, rpc changes, functional tests, whatnot… I think it is best to leave them for later.

MarcoFalke added the label Needs release note on Aug 13, 2019

sdaftuar commented at 7:48 pm on August 13, 2019: member

I think it would be great to know whether this meaningfully improves our defenses against topology inference, but it also seems like something we will only know definitively if someone publishes a successful attack (I doubt we’ll get any proofs anytime soon that no attacks are possible!).

So while I think it’s good to give reviewers some time to see if anyone can come up with an attack against this approach, it also seems to me that we’ll get more information about this after code is deployed. And adding additional outbound peers should make the network more robust to partition attacks. So even if topology inference is still possible, this PR should only be making our code strictly better, from a security standpoint.

From a performance standpoint, this PR does increase the resource usage for these additional connections, but if that is sufficiently small (as I believe it would be, in this patch) I think we would not regret it.

(Also, if it turns out that approaches like the one described in the Map-Z paper work on the bitcoin network to infer these blocks-only links, then I suspect it should still be relatively straightforward to neutralize that kind of approach by, say, intentionally delaying block relay on these links so that in the absence of an attack on the rest of a node’s peers, these would not be inferable. I think we can discuss these kinds of hypothetical mitigations once an attack becomes apparent.)

in src/net_processing.cpp:3148 in aff2a19827 outdated

3099@@ -3091,12 +3100,12 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
3100             LOCK(cs_main);
3101             Misbehaving(pfrom->GetId(), 100);
3102         }
3103-        else
3104+        else if (pfrom->m_tx_relay != nullptr)

TheBlueMatt commented at 8:06 pm on August 13, 2019:

Hmm, should you add an else branch here and disconnect them if we opened a blocksonly connection to some peer that asked for filtering?

ajtowns commented at 10:05 am on August 20, 2019:

Would be nice to have the same behaviour for all the invalid-on-blocksonly-connections messages; whether ignoring it, logging it, or an immediate disconnect…

jamesob commented at 8:26 pm on August 13, 2019: member

If the resource usage for these additional connections is sufficiently small (as I believe it would be, in this patch) I think we would not regret it.

I agree with this. Phrased differently, blocks-only connections offer a strict subset of the information that standard connections do. Given that, I don’t see how these two additional connections could enable any new attacks, since these additional links don’t offer any additional information to would-be attackers.

What these connections do offer is increased connectedness in terms of block relay. So I think that even though we lack conclusive proof that these connections don’t enable some kind of attack, we have enough of an argument that they probably don’t. Netting that against the certainty that this change makes partitionability less of a concern (which is a much more severe risk than, say, topology inference) makes the low, maybe-nonexistent risk seem well worth it.

So this change seems like an uncontroversial win to me. I think we should merge it soon.

TheBlueMatt commented at 8:37 pm on August 13, 2019: member

utACK 08478f1409c65335a28dcf796f8e87a43d1b7040, though I’d prefer to open full connections before blocks-only connections.

instagibbs commented at 8:50 pm on August 13, 2019: member

I don’t see how these two additional connections could enable any new attacks, since these additional links don’t offer any additional information to would-be attackers.

Given correct implementation, of course!

DrahtBot added the label Needs rebase on Aug 14, 2019

Remove unused variable 26a93bce29

sdaftuar force-pushed on Aug 14, 2019

sdaftuar commented at 6:29 pm on August 14, 2019: member

I pushed changes to address feedback from @thebluematt and @jamesob so that we now connect to full relay peers before blocks-only ones.

I also improved the logging of new outbound connections so that we also output whether it’s a blocksonly peer or feeler peer.

DrahtBot removed the label Needs rebase on Aug 14, 2019

naumenkogs commented at 2:53 pm on August 15, 2019: member

utACK 6962113

in src/net.cpp:1726 in 4e6a635f4e outdated

1721@@ -1722,7 +1722,8 @@ void CConnman::ThreadOpenConnections(const std::vector<std::string> connect)
1722         CAddress addrConnect;
1723 
1724         // Only connect out to one peer per network group (/16 for IPv4).
1725-        int nOutbound = 0;
1726+        int nOutboundFullRelay = 0;
1727+        int nOutboundBlocksOnly = 0;

jamesob commented at 3:41 pm on August 15, 2019:

(https://github.com/bitcoin/bitcoin/pull/15759/commits/4e6a635f4ea3c1a6a026877e561e7671ac00b449)

Only if you have to rebase for other reasons: might wanna use the current naming convention for these.

in src/net.cpp:1842 in 4e6a635f4e outdated

1838+            // (It should not be possible for fFeeler to be set if we're not
1839+            // also at our blocks-only peer limit, but check against that as
1840+            // well for sanity.)
1841+            bool blocks_only = nOutboundBlocksOnly < m_max_outbound_blocks_only && !fFeeler && nOutboundFullRelay >= m_max_outbound_full_relay;
1842+
1843+            OpenNetworkConnection(addrConnect, (int)setConnected.size() >= std::min(nMaxConnections - 1, 2), &grant, nullptr, false, fFeeler, false, blocks_only);

jamesob commented at 3:59 pm on August 15, 2019:

These long lines are tough to review in Github.

Selection_132 (fullscreen)

Obviously not worth addressing unless revisiting for other reasons.

jnewbery commented at 6:18 pm on August 16, 2019:

I agree. It’d also be nice to comment the nullptr/false args.

jkczyz commented at 4:01 pm on August 28, 2019:

This may be a sign that the interface for OpenNetworkConnection needs to be rethought, given the way booleans are piped through different functions. Though this is probably outside the scope of the PR.

jamesob approved

jamesob commented at 4:21 pm on August 15, 2019: member

reACK https://github.com/bitcoin/bitcoin/pull/15759/commits/69621136f51c547e22bf6a095cbcd2edabb195f5

Reviewed the interdiff. Changes:
- Fills all full-relay outbound connections before creating blocks-only connections.
- Adds documentation regarding getdata response to blocks-only peers.
- Specify peer as blocks-only during VERACK receipt logging.
- Indentation change (net_processing.cpp:3816).
Built locally, tested we get two blocks-only connections and eight full relay connections with -maxconnections=12.

in src/net_processing.cpp:1517 in 69621136f5 outdated

1511@@ -1512,7 +1512,12 @@ void static ProcessGetData(CNode* pfrom, const CChainParams& chainparams, CConnm
1512     std::deque<CInv>::iterator it = pfrom->vRecvGetData.begin();
1513     std::vector<CInv> vNotFound;
1514     const CNetMsgMaker msgMaker(pfrom->GetSendVersion());
1515-    {
1516+
1517+    // Note that if we receive a getdata for a MSG_TX or MSG_WITNESS_TX from a
1518+    // blocks-only outbound peer, we will stop processing further getdata

jnewbery commented at 4:02 pm on August 16, 2019:

I believe we’ll stop processing all messages in this case (since ProcessMessages() will exit early if the vRecvGetData vector is non-empty after ProcessGetData() was called)

in src/net_processing.cpp:3198 in 69621136f5 outdated

3192         vRecv >> newFeeFilter;
3193         if (MoneyRange(newFeeFilter)) {
3194-            {
3195-                LOCK(pfrom->cs_feeFilter);
3196-                pfrom->minFeeFilter = newFeeFilter;
3197+            if (pfrom->m_tx_relay != nullptr) {

jnewbery commented at 4:56 pm on August 16, 2019:

micronit: change this to early-exit like in the NetMsgType::FILTERCLEAR block (otherwise there will be a meaningless log when a FEEFILTER is received from a blocks-only peer).

in src/net_processing.cpp:3009 in 69621136f5 outdated

2998@@ -2989,6 +2999,10 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
2999             LogPrint(BCLog::NET, "Ignoring \"getaddr\" from outbound connection. peer=%d\n", pfrom->GetId());
3000             return true;
3001         }
3002+        if (!pfrom->IsAddrRelayPeer()) {

jnewbery commented at 7:28 pm on August 16, 2019:

I don’t think this branch is necessary (or can ever be hit). IsAddrRelayPeer() can only be false for blocks_only peers, which are outbound connections, and so will be caught by the if (!pfrom->fInbound) branch above.

There’s no harm in this being here, but I think it’s dead code.

ajtowns commented at 7:38 am on August 20, 2019:

I think it makes sense to keep this as belt-and-suspenders – it’s not that clear that fInbound will always be false whenever m_addr_relay_peer is false. Might make sense to have this test before the !fInbound test though, so that it is at least exercised?

jnewbery commented at 7:22 pm on August 21, 2019:

Seems reasonable. Perhaps also add a comment explaining that all blocks_only peers are outbound, so this check is currently redundant.

sdaftuar commented at 9:31 pm on August 28, 2019:

Improving this was below my bar for necessary changes with my last rework, so I have not addressed this (and may not).

jnewbery commented at 8:09 pm on August 16, 2019: member

utACK 69621136f51c547e22bf6a095cbcd2edabb195f5.

A few nits inline and some more general points:

Some tests here would be really nice. #14210 can’t come soon enough.
The GETDATA queue poisoning is horrible. This PR doesn’t introduce that issue, but it makes it a bit more obvious.
I don’t fully understand the reasoning behind the Don’t relay addr messages to block-relay-only peers. The commit log says “We don’t want relay of addr messages to leak information about these network links.” but I can’t see exactly what attack we’re mitigating, and how an adversary could use addr relay from/to blocks_only to leak information that they couldn’t get already. I don’t think the code in that commit can be harmful, but I’m just not sure whether it’s necessary.

in src/net_processing.cpp:763 in 69621136f5 outdated

761+// Returns true for outbound full-relay peers, excluding manual
762+// connections, feelers, one-shots, and blocks-only connections
763 static bool IsOutboundDisconnectionCandidate(const CNode *node)
764 {
765-    return !(node->fInbound || node->m_manual_connection || node->fFeeler || node->fOneShot);
766+    return !(node->fInbound || node->m_manual_connection || node->fFeeler || node->fOneShot || node->m_tx_relay == nullptr);

ajtowns commented at 6:49 am on August 20, 2019:

As far as I can see, this prevents blocks only connections from being disconnected during IBD if their header set has less than nMinimumChainWork (in net_processing:ProcessHeadersMessage()). It also prevents blocks only nodes from getting evicted in PeerLogicValidation::ConsiderEviction() if they’re not keeping up with the blocks we see. Not sure that makes sense?

sdaftuar commented at 12:58 pm on August 20, 2019:

Yeah this was the question I raised above – we currently protect 4 of our outbound peers from this logic anyway. So this PR changes that to 6 and always includes these blocks-only peers. Protecting some peers at all from this logic is just belt-and-suspenders prevention of some kind of cascading failure, so it’s not clear to me how to think about changing the logic, whether I should include these blocks-only peers in the protected set or not?

One advantage to protecting them is that it simplifies the reasoning over how to connect to a new peer when connecting to an extra outbound peer if the tip hasn’t advanced in a while – right now, we always connect as a full-relay peer, and then we evict one of our full-relay peers. If we don’t protect the blocks-only peers, this logic will be more annoying to write.

ajtowns commented at 4:27 am on August 22, 2019:

Yeah this was the question I raised above

(Link for anyone playing along at home)

Hmm, m_protect only applies to ConsiderEviction not the nMinimumChainWork check, if I’m reading this right, so this makes blocks only peers more protected from eviction than the normal protected outbound peers?

Maybe adding if (::g_relay_txes && pto->m_tx_relay == nulltr) return; after the m_protect check in EvictExtraOutboundPeers would work as an alternative to changing IsOutboundDisconnectionCandidate?

(Actually, I think we have to retain pto->m_tx_relay != nullptr for the initial 8 outbounds even when g_relay_txes == false or the way we count normal outbounds vs the extra blocks only ones will be wrong, so the g_relay_txes test above is redundant)

sdaftuar commented at 9:30 pm on August 28, 2019:

I think this is better now.

in src/net.h:725 in 69621136f5 outdated

721     std::vector<uint256> vInventoryBlockToSend GUARDED_BY(cs_inventory);
722     CCriticalSection cs_inventory;
723-    int64_t nNextInvSend{0};
724+
725+    struct TxRelay {
726+        TxRelay() { pfilter = MakeUnique<CBloomFilter>(); }

ajtowns commented at 9:17 am on August 20, 2019:

I don’t think pfilter can ever be null, so most of the if (...->pfilter) tests seem redundant. It’s only ever changed by either: pfilter.reset(new CBloomFilter(filter)) or pfilter.reset(new CBloomFilter()), which I think could be simplified to *pfilter = filter and *pfilter = CBloomFilter(), or, at that point, just having a CBloomFilter as a member directly.

sdaftuar commented at 1:01 pm on August 20, 2019:

I concluded this as well, but figured I’d save that simplification for a later PR to avoid complicating the reasoning here.

in src/net.h:726 in 69621136f5 outdated

728+        // We use fRelayTxes for two purposes -
729+        // a) it allows us to not relay tx invs before receiving the peer's version message
730+        // b) the peer may tell us in its version message that we should not relay tx invs
731+        //    unless it loads a bloom filter.
732+        bool fRelayTxes GUARDED_BY(cs_filter){false};
733+        std::unique_ptr<CBloomFilter> pfilter PT_GUARDED_BY(cs_filter);

ajtowns commented at 9:21 am on August 20, 2019:

I think this should be both PT_GUARDED_BY(cs_filter) and GUARDED_BY(cs_filter) – you shouldn’t be (and aren’t) resetting the pointer without holding the lock, since that would destroy the underlying data that some other thread might have held the lock and be accessing it.

sdaftuar commented at 9:30 pm on August 28, 2019:

Done.

ajtowns commented at 12:23 pm on August 20, 2019: member

This doesn’t actually seem to work right? I’m seeing:

02019-08-20T12:06:01Z receive version message: /Satoshi:0.18.0/: version 70015, b
1locks=590955, us=XXX, peer=11
22019-08-20T12:06:01Z New outbound peer connected: version: 70015, blocks=590955, blocks-only=true, feeler=false, peer=11
3
42019-08-20T12:06:04Z got inv: tx 9c42701fa9e819f9a858c0805058de2bd53a20e4f82f2a7539a3521dbb659ed8  have peer=11
52019-08-20T12:06:04Z got inv: tx 21a2a8e8940b0a066ad061afabcf6ce47e2e09d59620a0138e83b308cfb20dca  new peer=11
6
72019-08-20T12:06:04Z Requesting witness-tx 21a2a8e8940b0a066ad061afabcf6ce47e2e09d59620a0138e83b308cfb20dca peer=11

I guess bool fBlocksOnly = !g_relay_txes in ProcessMessage() should have an || !pfrom->m_tx_relay ? Possibly the same thing in the strCommand == NetMsgType::TX block?

sdaftuar commented at 12:35 pm on August 20, 2019: member

@ajtowns

This doesn’t actually seem to work right? I’m seeing:

02019-08-20T12:06:01Z receive version message: /Satoshi:0.18.0/: version 70015, b
1locks=590955, us=XXX, peer=11
22019-08-20T12:06:01Z New outbound peer connected: version: 70015, blocks=590955, blocks-only=true, feeler=false, peer=11
3
42019-08-20T12:06:04Z got inv: tx 9c42701fa9e819f9a858c0805058de2bd53a20e4f82f2a7539a3521dbb659ed8  have peer=11
52019-08-20T12:06:04Z got inv: tx 21a2a8e8940b0a066ad061afabcf6ce47e2e09d59620a0138e83b308cfb20dca  new peer=11
6
72019-08-20T12:06:04Z Requesting witness-tx 21a2a8e8940b0a066ad061afabcf6ce47e2e09d59620a0138e83b308cfb20dca peer=11

I guess bool fBlocksOnly = !g_relay_txes in ProcessMessage() should have an || !pfrom->m_tx_relay ? Possibly the same thing in the strCommand == NetMsgType::TX block?

Thanks, that is a good catch – the intention was for peers violating blocks-only mode to be disconnected, as you suggest. I’ll push a fix. EDIT: Actually, it appears that we do not disconnect peers from sending us tx inv’s if we’re running in -blocksonly mode, as I thought we did. I think we should, as the whole point is to reduce bandwidth in -blocksonly mode, but this will be a behavior change.

As an aside, I think I’ve occasionally seen this behavior from other peers myself where blocks-only connections still result in our peer sending us inv’s. I have never been able to reproduce with Bitcoin Core, but it does make me wonder if there might be some kind of rare bug where the blocks-only field is somehow ignored?

fanquake renamed this:
~~[p2p] Add 2 outbound blocks-only connections~~
p2p: Add 2 outbound blocks-only connections
on Aug 20, 2019

ajtowns commented at 3:37 am on August 22, 2019: member

Ignoring the inv’s (so we don’t send getdata and don’t receive the tx back) is already a bandwidth saving, and prevents revealing the topology, so maybe just maintain that behaviour for this PR, rather than dropping the connection?

I think dropping the connection does make sense though. Seems more likely that the observed misbehaviour in the wild is just due to spy nodes not having implemented blocks-only connections, than an obscure bug in bitcoin 0.18 to me. Not that it couldn’t be both, of course!

in src/net.h:725 in 8fb2d9f10c outdated

710     std::vector<uint256> vInventoryBlockToSend GUARDED_BY(cs_inventory);
711     CCriticalSection cs_inventory;
712-    int64_t nNextInvSend{0};
713+
714+    struct TxRelay {
715+        TxRelay() { pfilter = MakeUnique<CBloomFilter>(); }

jkczyz commented at 11:28 pm on August 27, 2019:

Can the constructor’s initializer list be used here for member initialization?

sdaftuar commented at 8:46 pm on August 28, 2019:

Seems like a style preference – I don’t see any harm the way I’ve written it, so will leave as-is.

jkczyz commented at 3:51 pm on August 29, 2019:

FWIW, pfilter is initialized first then reset here whereas when using the initializer list syntax it is simply initialized to the value you give it. Using the initializer syntax is considered safer as there is no risk of accessing an uninitialized member. Additionally, it allows you to make members const where appropriate.

in src/net.cpp:2636 in 4914f153b5 outdated

2632@@ -2633,6 +2633,7 @@ CNode::CNode(NodeId idIn, ServiceFlags nLocalServicesIn, int nMyStartingHeightIn
2633     hSocket = hSocketIn;
2634     addrName = addrNameIn == "" ? addr.ToStringIPPort() : addrNameIn;
2635     hashContinue = uint256();
2636+    m_tx_relay = MakeUnique<TxRelay>();

jkczyz commented at 0:03 am on August 28, 2019:

Likewise, can this be initialized in the above initializer list? It’s unclear to me why this and the above members are not initialized there like the other members.

sdaftuar commented at 8:47 pm on August 28, 2019:

This is in preparation for eventually not initializing this, in some cases. Again I think this is a code-style preference and there’s nothing wrong with this (intermediate) commit, so will leave as-is.

in src/net.cpp:826 in 90ef0a6ef4 outdated

817@@ -818,11 +818,17 @@ bool CConnman::AttemptToEvictConnection()
818                 continue;
819             if (node->fDisconnect)
820                 continue;
821-            LOCK(node->m_tx_relay->cs_filter);
822+            bool peer_relay_txes = false;
823+            bool peer_filter_not_null = false;

jkczyz commented at 0:12 am on August 28, 2019:

Consider naming peer_has_bloom_filter or something similarly readable corresponding to the member fBloomFilter.

jkczyz commented at 4:45 pm on August 28, 2019: contributor

Mainly nits.

[refactor] Move tx relay state to separate structure 4de0dbac9b

[refactor] Change tx_relay structure to be unique_ptr c4aa2ba822

sdaftuar force-pushed on Aug 28, 2019

sdaftuar commented at 9:29 pm on August 28, 2019: member

I’ve pushed several changes:

Fixed the bug where we’d request transactions from peers that violate the protocol by announcing transactions even though we’ve set fRelay=false
Added a commit where we go ahead and disconnect such peers
Reworked the outbound eviction logic so that we’ll evict block-relay-only peers that fail the nMinimumChainWork test, but protect them otherwise.
Renamed these “blocks-only” peers to be “block-relay” peers in our code and comments, to avoid confusion with “blocksonly” mode.

And I’ve addressed a few nits as well.

Prior version of the PR is 15759.4

ajtowns commented at 1:05 am on August 29, 2019: member

Added a commit where we go ahead and disconnect such peers

Need to update p2p_blocksonly test to match this behaviour change I think

sdaftuar force-pushed on Aug 29, 2019

sdaftuar renamed this:
~~p2p: Add 2 outbound blocks-only connections~~
p2p: Add 2 outbound block-relay-only connections
on Aug 29, 2019

sdaftuar commented at 6:45 pm on August 29, 2019: member

Fixed the p2p_blocksonly.py test to match the new behavior.

ajtowns commented at 5:59 am on September 2, 2019: member

* I don't fully understand the reasoning behind the _Don't relay addr messages to block-relay-only peers_.

I think I get this now.

I think the logic is that tx relay will allow an attacker to identify our tx relay peers, attack them, and eventually control all our tx relay connections; but if that happens we’ll still relay blocks via our block relay connections. What we don’t want is for the addr messages to potentially reveal which nodes we’ve selected for block relay to the attacker who by now controls all our other peers.

However an attacker that’s connected to a large proportion of the entire network can identify your peers (or near-peers) by sending tainted addresses to you – you then tell them to your peers and those peers then report back to the attacker identifying themselves. The patch here stops you from reporting tainted addresses from your blocks only peer (you ignore ADDR messages from them and won’t send a GETADDR message), and stops you sending potentially tainted addresses to your blocks only peers (you don’t choose them as the best peer in RelayAddress and even if you did you won’t send addr messages to them in SendMessages).

I think this might degrade addr connectivity a little bit though – RelayAddress won’t choose your block-relay outbound connection, but it might choose an inbound peer that thinks of you as a block-relay connection, so will drop your messages into the void. If this change is widely deployed, this will be 20% of your inbound connections (or more if we eventually go from 8+2 to 8+8 or similar), so could be significant. I think since we AdvertiseLocal regularly to our non-block-relay peers, we could just set a flag on the peer if we’ve ever seen an ADDR message from them, and only choose nodes with that flag in RelayAddress?

in src/net_processing.cpp:2489 in b21b291075 outdated

2484@@ -2470,9 +2485,11 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
2485     if (strCommand == NetMsgType::TX) {
2486         // Stop processing the transaction early if
2487         // We are in blocks only mode and peer is either not whitelisted or whitelistrelay is off
2488-        if (!g_relay_txes && !pfrom->HasPermission(PF_RELAY))
2489+        // or if this peer is supposed to be a block-relay-only peer
2490+        if ((!g_relay_txes && !pfrom->HasPermission(PF_RELAY)) || (pfrom->m_tx_relay == nullptr))

ajtowns commented at 6:12 am on September 2, 2019:

This doesn’t match the fBlocksOnly test for ::INV – if it did it would be:

0((!g_relay_txes || pfrom->m_tx_relay == nullptr) && !pfrom->HasPermission(PF_RELAY))

I think the logic here is probably better, though maybe there should be a check to prevent choosing a node for blocks-relay if it would have PF_RELAY perms?

sdaftuar commented at 2:12 pm on September 3, 2019:

I believe it’s impossible for an outbound peer to be whitelisted and have PF_RELAY permissions, so these are the same…

ajtowns commented at 2:23 pm on September 3, 2019:

I think you’re right; HasPermission() checks m_permissionFlags and the only way m_permissionFlags is anything but PF_NONE is via AcceptConnection(), so inbound only.

sdaftuar commented at 1:59 pm on September 3, 2019: member

I think this might degrade addr connectivity a little bit though – RelayAddress won’t choose your block-relay outbound connection, but it might choose an inbound peer that thinks of you as a block-relay connection, so will drop your messages into the void. If this change is widely deployed, this will be 20% of your inbound connections (or more if we eventually go from 8+2 to 8+8 or similar), so could be significant. I think since we AdvertiseLocal regularly to our non-block-relay peers, we could just set a flag on the peer if we’ve ever seen an ADDR message from them, and only choose nodes with that flag in RelayAddress?

I think the addr relay logic may need an overhaul, but I think the change made here is minor – as I understand it, since we relay addr messages by choosing among all our inbounds as potential next hops, there’s likely already a black-hole problem where light clients, spy nodes, adversaries, etc may be receiving addr messages and not propagating them further.

So adding 2 extra inbounds that don’t participate in addr propagation should be a relatively minor effect. I just checked one of my listening nodes running a recent release of Bitcoin Core, and observed that out of 46 total connections, 18 have never sent me an addr message, and 24 have never sent me a getaddr message. I would surmise that at least those 18-24 peers do not participate in addr relay (and possibly more)… So I think the 20% estimate of addr-relay-degradation due to this PR is a large overestimate.

I do think your suggestion is likely to be some kind of an improvement (at least in the non-adversarial case) but I’d prefer to defer addr-relay improvements to a future PR.

Check that tx_relay is initialized before access e75c39cd42

Add comment explaining intended use of m_tx_relay b83f51a4bb

Add 2 outbound block-relay-only connections

Transaction relay is primarily optimized for balancing redundancy/robustness
with bandwidth minimization -- as a result transaction relay leaks information
that adversaries can use to infer the network topology.

Network topology is better kept private for (at least) two reasons:

(a) Knowledge of the network graph can make it easier to find the source IP of
a given transaction.

(b) Knowledge of the network graph could be used to split a target node or
nodes from the honest network (eg by knowing which peers to attack in order to
achieve a network split).

We can eliminate the risks of (b) by separating block relay from transaction
relay; inferring network connectivity from the relay of blocks/block headers is
much more expensive for an adversary.

After this commit, bitcoind will make 2 additional outbound connections that
are only used for block relay. (In the future, we might consider rotating our
transaction-relay peers to help limit the effects of (a).)

3a5e885306

Don't relay addr messages to block-relay-only peers

We don't want relay of addr messages to leak information about
these network links.

430f489027

doc: improve comments relating to block-relay-only peers 937eba91e1

Disconnect peers violating blocks-only mode

If we set fRelay=false in our VERSION message, and a peer sends an INV or TX
message anyway, disconnect. Since we use fRelay=false to minimize bandwidth,
we should not tolerate remaining connected to a peer violating the protocol.

0ba08020c9

sdaftuar force-pushed on Sep 4, 2019

in src/net_processing.cpp:1518 in 937eba91e1 outdated

1511@@ -1512,6 +1512,11 @@ void static ProcessGetData(CNode* pfrom, const CChainParams& chainparams, CConnm
1512     std::deque<CInv>::iterator it = pfrom->vRecvGetData.begin();
1513     std::vector<CInv> vNotFound;
1514     const CNetMsgMaker msgMaker(pfrom->GetSendVersion());
1515+
1516+    // Note that if we receive a getdata for a MSG_TX or MSG_WITNESS_TX from a
1517+    // block-relay-only outbound peer, we will stop processing further getdata
1518+    // messages from this peer (likely resulting in our peer eventually

TheBlueMatt commented at 7:39 pm on September 4, 2019:

Note that this comment is incomplete, we’ll stop processing all messages from this peer (including pongs, so we’ll eventually disconnect them as well).

jnewbery commented at 10:25 pm on September 4, 2019:

I agree that we should update this comment (see #15759 (review))

TheBlueMatt approved

TheBlueMatt commented at 7:40 pm on September 4, 2019: member

re-utACK 0ba08020c9791f7caf5986ad6490c16a2b66cd83. Pointed out that stats.fRelayTxes was sometimes uninitialized for blocksonly peers (though its not a big deal and only effects RPC), which has since been fixed here. Otherwise changes are pretty trivial so looks good.

jnewbery commented at 10:25 pm on September 4, 2019: member

utACK 0ba08020c9791f7caf5986ad6490c16a2b66cd83

in src/init.cpp:1757 in 3a5e885306 outdated

1752@@ -1753,7 +1753,8 @@ bool AppInitMain(InitInterfaces& interfaces)
1753     CConnman::Options connOptions;
1754     connOptions.nLocalServices = nLocalServices;
1755     connOptions.nMaxConnections = nMaxConnections;
1756-    connOptions.nMaxOutbound = std::min(MAX_OUTBOUND_CONNECTIONS, connOptions.nMaxConnections);
1757+    connOptions.m_max_outbound_full_relay = std::min(MAX_OUTBOUND_FULL_RELAY_CONNECTIONS, connOptions.nMaxConnections);
1758+    connOptions.m_max_outbound_block_relay = std::min(MAX_BLOCKS_ONLY_CONNECTIONS, connOptions.nMaxConnections-connOptions.m_max_outbound_full_relay);

jamesob commented at 8:18 pm on September 6, 2019:

Way easier to reason about than the first iteration of this change. :+1:

jamesob approved

jamesob commented at 8:52 pm on September 6, 2019: member

ACK https://github.com/bitcoin/bitcoin/commit/0ba08020c9791f7caf5986ad6490c16a2b66cd83

Re-reviewed all commits. Cloned locally, built, and synced a few hundred blocks near the tip. Let the client run for most of the day. Found that getpeerinfo’s bytes(sent|recv)_per_msg for peers with relaytxes=false compares as expected to full relay peers.

Blocks-only peer 1

 0    "bytessent_per_msg": {
 1      "getdata": 6396,
 2      "getheaders": 3159,
 3      "headers": 1272,
 4      "ping": 5472,
 5      "pong": 5472,
 6      "sendcmpct": 66,
 7      "sendheaders": 24,
 8      "verack": 24,
 9      "version": 127
10    },
11    "bytesrecv_per_msg": {
12      "addr": 7660,
13      "block": 121277168,
14      "feefilter": 32,
15      "getheaders": 1053,
16      "headers": 3448,
17      "ping": 5472,
18      "pong": 5472,
19      "sendcmpct": 66,
20      "sendheaders": 24,
21      "verack": 24,
22      "version": 126
23    }

Blocks-only peer 2

 0    "bytessent_per_msg": {
 1      "getdata": 5969,
 2      "getheaders": 2106,
 3      "headers": 742,
 4      "ping": 5440,
 5      "pong": 5472,
 6      "sendcmpct": 66,
 7      "sendheaders": 24,
 8      "verack": 24,
 9      "version": 127
10    },
11    "bytesrecv_per_msg": {
12      "addr": 10975,
13      "block": 111063475,
14      "feefilter": 32,
15      "getheaders": 1053,
16      "headers": 4009,
17      "ping": 5472,
18      "pong": 5440,
19      "sendcmpct": 66,
20      "sendheaders": 24,
21      "verack": 24,
22      "version": 126
23    }

Full-relay peer

 0    "bytessent_per_msg": {
 1      "addr": 15415,
 2      "feefilter": 32,
 3      "getaddr": 24,
 4      "getblocktxn": 953,
 5      "getdata": 475098,
 6      "getheaders": 1053,
 7      "headers": 106,
 8      "inv": 1919584,
 9      "notfound": 2501,
10      "ping": 5472,
11      "pong": 5472,
12      "sendcmpct": 99,
13      "sendheaders": 24,
14      "tx": 325453,
15      "verack": 24,
16      "version": 127
17    },
18    "bytesrecv_per_msg": {
19      "addr": 38192,
20      "block": 107235996,
21      "blocktxn": 889600,
22      "cmpctblock": 572593,
23      "feefilter": 32,
24      "getdata": 29315,
25      "getheaders": 1053,
26      "headers": 106,
27      "inv": 1362767,
28      "notfound": 13058,
29      "ping": 5472,
30      "pong": 5472,
31      "sendcmpct": 66,
32      "sendheaders": 24,
33      "tx": 5666974,
34      "verack": 24,
35      "version": 126
36    }

I think this is ready for merge (perhaps after a signoff from @ajtowns?). I know I’ve harped on this before, but the risk-reward ratio here seems exceptionally low.

sipa commented at 10:07 pm on September 6, 2019: member

ACK 0ba08020c9791f7caf5986ad6490c16a2b66cd83

ajtowns commented at 9:21 am on September 7, 2019: member

ACK 0ba08020c9791f7caf5986ad6490c16a2b66cd83 – code review, ran tests. ran it on mainnet for a couple of days with MAX_BLOCKS_ONLY_CONNECTIONS upped from 2 to 16 and didn’t observe any unexpected behaviour: it disconnected a couple of peers that tried sending inv’s, and it successfully did compact block relay with some block relay peers.

fanquake referenced this in commit 189c19e012 on Sep 7, 2019

fanquake merged this on Sep 7, 2019

fanquake closed this on Sep 7, 2019

fanquake removed this from the "Blockers" column in a project

MarcoFalke commented at 2:21 pm on October 4, 2019: member

Still needs release notes at https://github.com/bitcoin-core/bitcoin-devwiki/wiki/0.19.0-Release-Notes-Draft#p2p-changes

MarcoFalke commented at 2:22 pm on October 4, 2019: member

Maybe just copy-paste the section from the optech newsletter?

sdaftuar commented at 2:37 pm on October 4, 2019: member

@MarcoFalke I added a little text to that wiki page referencing this PR.

MarcoFalke removed the label Needs release note on Oct 4, 2019

deadalnix referenced this in commit 13a97f03d3 on Jun 6, 2020

deadalnix referenced this in commit d80ffc2408 on Jun 6, 2020

deadalnix referenced this in commit 864a067138 on Jun 6, 2020

deadalnix referenced this in commit 1e4607d670 on Jun 8, 2020

deadalnix referenced this in commit de9e8eb0ec on Jun 8, 2020

deadalnix referenced this in commit da361cf611 on Jun 8, 2020

deadalnix referenced this in commit a93f717e00 on Jun 8, 2020

deadalnix referenced this in commit ce6365f592 on Jun 8, 2020

deadalnix referenced this in commit cdf97a64a5 on Jun 8, 2020

ftrader referenced this in commit 131e4f8f5d on Aug 17, 2020

ariard referenced this in commit 87ad980248 on Sep 4, 2020

ariard referenced this in commit b94ebc334a on Sep 4, 2020

ariard referenced this in commit 61c2a4282e on Sep 4, 2020

ariard referenced this in commit ac71fe936d on Sep 7, 2020

laanwj referenced this in commit 597488d37c on Oct 2, 2020

sidhujag referenced this in commit 2f44b26c5b on Oct 4, 2020

laanwj referenced this in commit 9855422e65 on Oct 15, 2020

sidhujag referenced this in commit f5fa561d87 on Oct 16, 2020

taurus228 commented at 6:38 pm on January 5, 2021: none

src/init.cpp

str4d referenced this in commit 57cf4c271c on Aug 13, 2021

str4d referenced this in commit 8469683eb6 on Aug 13, 2021

vijaydasmp referenced this in commit 26600e7394 on May 19, 2022

vijaydasmp referenced this in commit f0ce2630b1 on May 19, 2022

fanquake referenced this in commit 986bae8e72 on May 19, 2022

vijaydasmp referenced this in commit 22ce9eee9f on May 19, 2022

vijaydasmp referenced this in commit 3f083de493 on May 26, 2022

vijaydasmp referenced this in commit 835324b656 on Jun 14, 2022

PastaPastaPasta referenced this in commit ef3f738f6f on Jun 19, 2022

DrahtBot locked this on Mar 30, 2023

p2p: Add 2 outbound block-relay-only connections #15759

Conflicts

Review notes

Testing

Blocks-only peer 1

Blocks-only peer 2

Full-relay peer