ASN-based bucketing of the network nodes

naumenkogs commented at 8:29 pm on August 13, 2019: member

Currently we bucket peers (or potential peers) based on /16 network groups which directly correlate to the IP-addresses. This is done to diversify connections every node maintains, for example to avoid connecting to the nodes all belonging to the same region/provider.

Currently peers.dat (serialized version of addrman) does not store ip->bucket mappings explicitly, and all the known ips from peers.dat are re-hashed and re-bucketed at every restart (although it’s very cheap).
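As a rough illustration of the two ideas above (a /16 network group, and re-deriving the bucket from a salted hash at startup), here is a minimal sketch. The function names, the SHA-256 hash, and the bucket count of 1024 are assumptions made for this example; the real addrman mixes more inputs and is not reproduced here. IPv4 only.

```python
import hashlib
import ipaddress


def netgroup_16(ip: str) -> str:
    """Return the /16 network group of an IPv4 address, e.g. '203.0.0.0/16'."""
    net = ipaddress.ip_network(f"{ip}/16", strict=False)
    return str(net)


def bucket_for(ip: str, salt: bytes, n_buckets: int = 1024) -> int:
    """Map an address to a bucket via a salted hash of its group.

    Illustrative only: nothing is stored, so the mapping can be
    recomputed from the address and the node's secret salt at any time.
    """
    digest = hashlib.sha256(salt + netgroup_16(ip).encode()).digest()
    return int.from_bytes(digest[:8], "little") % n_buckets


# Two addresses in the same /16 share a group, and here also a bucket.
same_group = netgroup_16("203.0.113.7") == netgroup_16("203.0.200.1")
```

Because the bucket depends only on the address and the salt, nothing about the mapping needs to be serialized, which is why recomputing it on restart is cheap.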

Idea

It was recently suggested by @TheBlueMatt to use ASN-based bucketing instead. This is strictly better if the goal is to diversify connections: the distribution of IPs among the ASNs is not uniform, and because of that netgroup-based bucketing may result in having 8 peers from just 2 large ASNs. If we allow connecting to each ASN at most once, this would increase the security of the network.
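To make the "at most once per ASN" rule concrete, here is a small sketch of peer selection with one peer per group. `select_outbound`, the toy address-to-ASN table, and the limit of 8 are hypothetical and only for illustration; this is not how Bitcoin Core's connection logic is written.

```python
import random
from typing import Callable, Hashable, Iterable, List, Optional


def select_outbound(candidates: Iterable[str],
                    group_of: Callable[[str], Hashable],
                    max_peers: int = 8,
                    rng: Optional[random.Random] = None) -> List[str]:
    """Pick up to max_peers addresses, allowing at most one per group."""
    rng = rng or random.Random()
    pool = list(candidates)
    rng.shuffle(pool)
    chosen, used = [], set()
    for addr in pool:
        group = group_of(addr)
        if group in used:
            continue  # this group (ASN or netgroup) already has a peer
        used.add(group)
        chosen.append(addr)
        if len(chosen) == max_peers:
            break
    return chosen


# Toy data: two addresses in different /16s but in the same (made-up) ASN.
toy_asn = {"198.51.100.1": 64500, "203.0.113.1": 64500, "192.0.2.1": 64501}
peers = select_outbound(toy_asn, toy_asn.get, rng=random.Random(1))
```

With /16 grouping all three toy addresses would be eligible at once; with the ASN as the group key only two of them can be selected.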

We have @sipa’s script to create a compressed representation of mapping (ip->ASN), which is less than 2 megabytes.

However, there are integration-related design questions.

Distribution of the .map file

During the meeting there was a rough consensus (not unanimous though, @jnewbery ) that mapping file should be distributed along with the release, instead of becoming part of the binary.

If you want to question these, feel free to comment below.

Legacy /16 bucketing

There was a suggestion of having an old method as well. I think we should do it.

Loading the map

Maybe there will be concerns here, I have an understanding for now.

fanquake added the label P2P on Aug 13, 2019

Sjors commented at 5:51 pm on August 15, 2019: member

Concept ACK on ASN-based bucketing, no preference at the moment for how updates should work.

practicalswift commented at 7:31 pm on August 15, 2019: contributor

Concept ACK on ASN-based bucketing in addition to legacy /16 bucketing.

Making sure peers are diverse both across a.) the AS-number axis (ASN-based bucketing) and b.) the prefix axis (legacy /16 bucketing) should maximise overall network robustness.

In addition to this: given that we’ll have a prefix-to-ASN map – has the wild idea of opening a connection to one peer from within the same AS-number as oneself been discussed?

Sjors commented at 7:46 pm on August 15, 2019: member

@practicalswift I believe so, see IRC discussion from a few days ago: http://www.erisian.com.au/bitcoin-core-dev/log-2019-08-09.html#l-266

Such a nearby node can be useful for fetching blocks quickly, but at the same time e.g. creates a privacy risk for transaction broadcast. I believe that’s why @TheBlueMatt suggested to connect to them in blocksonly mode.

kristapsk commented at 9:38 pm on August 15, 2019: contributor

Definitely Concept ACK

mapping file should be distributed along with the release, instead of becoming part of the binary.

There’s GeoLite2 ASN database, which could be updated by the user independently (e.g. from the cron script) from Bitcoin Core updates. Actually, I’m not even sure it should be bundled with Bitcoin Core. If database is unavailable - fallback to old legacy /16 bucketing.
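The fallback described here could look roughly like the sketch below. `make_group_fn` and the `asn_lookup` callable are hypothetical; how the lookup is backed (GeoLite2, a compressed asmap file, or anything else) is deliberately left out, and only IPv4 is handled.

```python
import ipaddress
from typing import Callable, Optional


def make_group_fn(asn_lookup: Optional[Callable[[str], Optional[int]]]):
    """Build a grouping function that prefers ASNs and falls back to /16."""
    def group_of(ip: str) -> str:
        if asn_lookup is not None:
            asn = asn_lookup(ip)
            if asn:  # addresses missing from the map also fall back
                return f"AS{asn}"
        net = ipaddress.ip_network(f"{ip}/16", strict=False)
        return f"net:{net}"
    return group_of


legacy = make_group_fn(None)                        # no database available
with_map = make_group_fn({"203.0.113.7": 64500}.get)  # toy one-entry map
```

Prefixing the keys (`AS…` vs `net:…`) keeps ASN groups and netgroups from colliding when both kinds appear side by side.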

Anybody planning to actually work on this?

practicalswift commented at 12:14 pm on August 16, 2019: contributor

@Sjors The “connect-to-own-ASN” idea has another drawback if implemented naïvely without considering the node’s total connectivity in terms of ASN distribution:

Consider an organisation disallowing Bitcoin traffic at the edge router level. A newly launched bitcoind would only be able to connect to ASN-local nodes. The node would risk becoming part of an ASN-local “Bitcoin network” which may not be connected to the global network.

Perhaps we should require N connections to prefixes announced by external ASN:s before considering opening connections within our own ASN :-)
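A minimal sketch of that gating rule, assuming we already know the ASN of each current connection. The function name and the default of 4 are placeholders; the comment above leaves N open.

```python
from typing import Iterable


def may_connect_within_own_asn(own_asn: int,
                               connected_asns: Iterable[int],
                               min_external: int = 4) -> bool:
    """Allow an ASN-local connection only after N external connections exist."""
    external = sum(1 for asn in connected_asns if asn != own_asn)
    return external >= min_external
```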

naumenkogs commented at 12:47 pm on August 16, 2019: member

Anybody planning to actually work on this?

@kristapsk I am sketching an implementation.

Sjors commented at 2:03 pm on August 16, 2019: member

@practicalswift alternatively “blocks only” in this case would mean only downloading blocks for which you already have the headers (unlike -blocksonly).

sipa commented at 6:06 am on August 17, 2019: member

@kristapsk @naumenkogs and I are working on a way to load a compressed IP-to-ASN map into bitcoind and use it for grouping IPs. The compressed scheme currently needs slightly less than 1 MB for a full map of the Internet.

It’s an open question I think how this map will come to be. Initially I guess it’ll just be optional and available for people to experiment with (using a command line flag and a file). Eventually we may want to bundle a map with Bitcoin Core or even make it part of the binary if this approach turns out to be useful.

I wasn’t aware of that publicly available GeoLite2 database; that looks useful to experiment with. So far I’ve been using a BGP router dump I got from somewhere.

Sjors commented at 11:04 am on August 17, 2019: member

Relevant background reading: https://erebus-attack.comp.nus.edu.sg

One thing this attack leverages is that it can fake ASes “behind” it, from the victim node perspective. In light of that, for nodes that know their own IP address, would it make sense to divide AS buckets from an individual node perspective rather than from a global perspective?

For example if our node is in AS 1 and AS 1 has routes to AS 2 and 3, then for each new peer we check if it will route through AS 2 or AS 3 and spread equally over buckets. When AS 2 connects to AS 5 and 6, we again split the AS2 “bucket” between those two. No idea how much number crunching that involves (could wait until after IBD), or if there’s even a reasonable algorithm.
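The example above can be sketched as a recursive equal split of the bucket space. This assumes the routes seen from our node form a tree (no cycles), which real AS topology does not guarantee, and it gives intermediate ASes no share of their own; `split_buckets` and the `routes` table are made up for illustration.

```python
from fractions import Fraction
from typing import Dict, List


def split_buckets(routes: Dict[int, List[int]], root: int) -> Dict[int, Fraction]:
    """Share of bucket space per leaf AS, split equally at every hop."""
    shares: Dict[int, Fraction] = {}

    def walk(asn: int, share: Fraction) -> None:
        children = routes.get(asn, [])
        if not children:
            # Leaf: accumulate, in case it is reachable via several branches.
            shares[asn] = shares.get(asn, Fraction(0)) + share
            return
        for child in children:
            walk(child, share / len(children))

    walk(root, Fraction(1))
    return shares


# AS 1 routes to AS 2 and AS 3; AS 2 routes on to AS 5 and AS 6.
shares = split_buckets({1: [2, 3], 2: [5, 6]}, root=1)
# AS 3 gets 1/2 of the space, AS 5 and AS 6 get 1/4 each.
```

The effect is that an attacker who fakes many ASes behind AS 2 still only shares the half of the bucket space assigned to the AS 2 branch.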

Sjors commented at 9:33 am on August 28, 2019: member

@TheBlueMatt wrote in #16702 (comment):

One thing we can play with after we build an initial table is to look at the paths, instead of looking only at the last ASN in the path. eg if, from many vantage points on the internet, a given IP block always passes from AS 1 to AS 2, we could consider it as a part of AS 1 (given it appears to only have one provider - AS 1). In order to avoid Western bias we’d need to do it across geographic regions and from many vantage points (eg maybe contact a Tier 1 and get their full routing table view, not just the selected routes), but once we get the infrastructure in place, further filtering can be played with.
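One way to read the "only one provider" rule is sketched below: given the AS paths to a prefix as seen from several vantage points, attribute the prefix to the upstream AS when every path enters the origin through that same upstream. `effective_asn` and the path format (origin AS last) are assumptions for this example.

```python
from typing import List


def effective_asn(paths: List[List[int]]) -> int:
    """Return the AS to bucket a prefix under, given observed AS paths.

    Each path lists ASes from the vantage point to the origin (origin last).
    """
    origins = {path[-1] for path in paths}
    if len(origins) != 1:
        raise ValueError("paths disagree on the origin AS")
    origin = origins.pop()
    if all(len(path) >= 2 for path in paths):
        upstreams = {path[-2] for path in paths}
        if len(upstreams) == 1:
            return upstreams.pop()  # single provider: fold into it
    return origin
```

For example, paths `[10, 1, 2]` and `[20, 1, 2]` both reach AS 2 through AS 1, so the prefix would be treated as part of AS 1, while `[10, 1, 2]` and `[20, 3, 2]` leave it under AS 2.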

Would it make sense to traceroute some of the nodes we connect to and re-bucket based on the ASNs of the first couple of hops? Or does such active probing draw too much attention?

practicalswift commented at 2:46 pm on August 28, 2019: contributor

@Sjors To do the equivalent of what traceroute does would require setting time-to-live on outgoing packets (bypassing the socket interface).

That would require the end-user to run bitcoind as root (bad), or having bitcoind invoke a third-party SUID root binary such as traceroute which is also bad: the various traceroute:s were clearly not written with security in mind – see history of heap overflows, etc.

sdaftuar commented at 1:04 pm on September 20, 2019: member

So there are two parts to this proposal:

Use ASN information for addrman bucketing.
Use ASN information for determining which peers to connect to.

It seems to me that using ASN information for addrman bucketing is likely to be very beneficial, as it prevents being blinded to peers outside of (eg) ASNs that an attacker might want to divert traffic towards. I figure we should be able to make addrman large enough (if it’s not already) that the increased collisions for AS’s that have more nodes should not be a big problem.

But it’s less clear to me if enforcing ASN-diversity on our outbound peers is beneficial or not, as it might drive connections to a relatively small part of the overall network graph. For instance if there is an ASN with only 1 node (A) in it, and let’s say there are a couple hundred ASNs in total with any nodes at all (fair assumption?), then rather than having a ~1 in 10000 chance of A being selected by a node making an outbound connection, A’s chances will be more like 1 in a couple hundred. This could have unfortunate side effects for other network-graph attacks (and EREBUS might even be easier in some situations as a result, since that attack revolves around using any AS for which the victim would route through the adversary’s network to reach). So this seems like a potentially large effect that I think would be worthy of more careful study before deploying.

An alternative approach might be to just aim for node diversity at the addrman level (using ASN information if available, as suggested here), and then use peer rotation or frequent chain-tip-sync with random peers (like #16859) in order to reduce the likelihood of being eclipsed.

EDIT: I realized I missed the IRC conversation on this, which I just read. I did misunderstand the sampling effect of randomly sampling from all of addrman, as we do now with our existing /16 group limit, and presumably we would also do if we were to enforce an ASN connection limit as well. Still, this seems to me like something we should model and study before deployment, as it’s not clear at all to me what effect this would have on the network topology. AS-diversity of our immediate peers is not clearly the thing we should be trying to maximize; if for instance dishonest peers can gain a connectivity advantage by locating themselves in small AS groups, that seems potentially problematic.

practicalswift commented at 4:53 pm on September 20, 2019: contributor

@sdaftuar It should also be noted that it is possible for a BGP-speaking attacker to export routes on behalf of an arbitrary number of fake downstream ASN:s (making it look like the attacker is providing transit service to the fake downstream ASN:s). Thereby gaining access to an arbitrary number of tickets in the ASN lottery :)

As described more in depth here: #16702 (comment)

naumenkogs commented at 8:28 pm on November 21, 2019: member

To understand how the proposed asmap solves the problem and affects the topology of the network, I think it’s important to distinguish the physical and logical level.

Physical level

Legacy /16 bucketing always attempted to not create more than one connection per node to the same /16 subnet, even if a lot of nodes are located there. It turns out that the correlation between location/owner and /16 is much weaker these days. Asmap diversifies by ASN, which is a better representation of a piece of infrastructure than /16 group. Asmap adjusts bucketing to be robust in more realistic scenarios (large AS gets corrupted, trouble with the particular AS-level infrastructure). I think we can agree that in terms of the physical level, this is an improvement.

Asmap might make it worse if an attacker manages to spin up fake AS, but that can be handled upon asmap file distribution.

Logical level

Any diversification makes certain nodes (placed in rare /16 groups or rare ASes) more likely to be chosen for connection. This might make topology look less like a random graph. This is generally not a good sign, as it creates weakly connected components, which are easier to attack. The effect on topology can be represented by the variation of the [(AS/netgroup)->nodes] distribution. In practice, however, salted hashing + bucketing makes the effect of the uneven distribution less noticeable. (see next message)

naumenkogs commented at 8:28 pm on November 21, 2019: member

I made this simulation script (operating over the real current list of reachable nodes) to understand two things:

how much less of a random graph we are getting because of AS/netgroup bucketing
how much benefit can an attacker get from this non-uniformity

Both of these questions can be answered with the same metric: the probability of choosing a node from the rare [AS/netgroup].

This means that, if we pick 10% of nodes to be placed in the rarest groups, one per group, the probability of choosing them by other nodes (with different salts) in the network should be 10%.

As I mentioned before, I believe the result depends on the variation (AS/netgroup)->nodes. Since for netgroups the variation is low, it does not affect the graph, and the probability, in this case, is around 10%, no matter how many peers every node chooses.

For Asmap it does slightly affect the graph, and the probability is 11.5% if every node chooses 8 peers. If every node chooses 32 peers, the probability is 15%.

To reduce this, we can split the top N ASes into artificial sub-ASes, which reduces the variation and makes the distribution more even. For instance, if we randomly split each of the top-25 ASes into 20 sub-ASes, the probability becomes 10% no matter how many peers every node chooses. Alternatively, we can bucket small ASes together, which would also reduce the variance.
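The measurement and the splitting idea can be reproduced in miniature. The sketch below is not the script referred to above: it uses a synthetic node list, models selection as "uniformly random node, skipping groups already used", and all names are hypothetical.

```python
import random
from typing import List, Sequence, Set


def pick_peers(groups: Sequence[int], k: int, rng: random.Random) -> List[int]:
    """Pick k node indices uniformly at random, at most one per group."""
    order = list(range(len(groups)))
    rng.shuffle(order)
    chosen, used = [], set()
    for i in order:
        if groups[i] in used:
            continue
        used.add(groups[i])
        chosen.append(i)
        if len(chosen) == k:
            break
    return chosen


def rare_share(groups: Sequence[int], rare: Set[int], k: int,
               trials: int, seed: int = 0) -> float:
    """Fraction of all selected peers that belong to the 'rare' node set."""
    rng = random.Random(seed)
    hits = total = 0
    for _ in range(trials):
        picks = pick_peers(groups, k, rng)
        hits += sum(1 for i in picks if i in rare)
        total += len(picks)
    return hits / total


def split_group(groups: Sequence[int], target: int, parts: int,
                seed: int = 0) -> List[int]:
    """Randomly reassign members of one group to artificial sub-groups."""
    rng = random.Random(seed)
    base = max(groups) + 1  # fresh ids that cannot clash with existing ones
    return [base + rng.randrange(parts) if g == target else g for g in groups]


# Extreme toy network: one AS with 900 nodes plus 100 single-node ASes.
groups = [0] * 900 + list(range(1, 101))
rare = set(range(900, 1000))  # the 10% of nodes sitting alone in their AS
before = rare_share(groups, rare, k=8, trials=300)
after = rare_share(split_group(groups, 0, 300), rare, k=8, trials=300)
```

In this deliberately skewed setup `before` is far above the 10% baseline, and splitting the large group brings `after` back close to it, which is the same direction of effect described above.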

Overall, I believe this is a positive result, and asmap should be integrated. But let me know if you would like any other measurements or if you have an opinion on AS-splitting.

practicalswift commented at 9:25 pm on November 21, 2019: contributor

@naumenkogs Thanks for the simulation and analysis.

If going the asmap route (which I think is a nice idea) I guess there is the option of using the AS/netgroup bucketing method with probability p and using the legacy /16 bucketing with probability 1-p. Could that make us more robust against adversaries able to game only one of the methods (but not both), and increase the cost of attack for adversaries who have the capability to game both methods? Is it overkill?

Sorry if that has been discussed previously or if I’ve misunderstood how things are meant to work.

sipa commented at 10:07 pm on November 25, 2019: member

@practicalswift I believe that would be overkill. If we think using an AS map is the wrong approach, we should just improve the map. For example we could choose to merge the smallest AS’es until they’re all (say) as large as a (random example) /22.
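A greedy version of that merging could look like the sketch below. Measuring an AS by its number of announced IPv4 addresses, the threshold of 1024 (the size of a /22), and the merge order are all assumptions for illustration.

```python
import heapq
from typing import Dict, List, Tuple


def merge_small_ases(sizes: Dict[int, int],
                     min_size: int = 1024) -> List[Tuple[List[int], int]]:
    """Repeatedly merge the two smallest groups until all reach min_size.

    Returns (member ASNs, combined size) for each resulting group.
    """
    heap = [(size, [asn]) for asn, size in sizes.items()]
    heapq.heapify(heap)
    while len(heap) > 1 and heap[0][0] < min_size:
        size_a, members_a = heapq.heappop(heap)
        size_b, members_b = heapq.heappop(heap)
        heapq.heappush(heap, (size_a + size_b, sorted(members_a + members_b)))
    return sorted((members, size) for size, members in heap)


# Toy sizes in addresses: three small ASes and one large one.
merged = merge_small_ases({1: 256, 2: 256, 3: 512, 4: 4096})
# AS 1, 2 and 3 end up in one group of 1024 addresses; AS 4 stays alone.
```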

jnewbery commented at 6:08 pm on December 11, 2019: contributor

During the meeting there was a rough consensus (not unanimous though, @jnewbery ) that mapping file should be distributed along with the release, instead of becoming part of the binary.

Just noticed this. I said in the meeting that it should be distributed with the release:

0601 2019-06-20T19:22:58  <jnewbery> yes, i think the distribution should include it

http://www.erisian.com.au/bitcoin-core-dev/log-2019-06-20.html#l-601

narula commented at 6:35 pm on December 11, 2019: contributor

I have questions about the generation and maintenance of the asmap file (independent of distribution method):

How often should this be regenerated?
What constitutes a “bad” asmap file?
How might an attacker take advantage of a very stale asmap file (let’s say people forget to update it for a few months… or years)?
What is the mechanism by which one detects a “bad” generated asmap? What should people look for?

practicalswift commented at 6:52 pm on December 11, 2019: contributor

Good questions!

How might an attacker take advantage of a very stale asmap file (let’s say people forget to update it for a few months… or years)?

From a BGP speaking attacker’s perspective wouldn’t a very stale asmap be harder to take advantage of compared to the other extreme of say a daily updated asmap? My thinking is that the attacker can easily influence the result of say next day’s asmap generation process by adjusting what prefixes and AS paths he/she chooses to communicate.

I’m not suggesting we should use very stale asmap:s - just making the point that newer is not necessarily less attacker friendly :)

One mitigation could perhaps be to only consider routes and/or AS paths that have been stable for N months when generating the asmap.

Assumptions I’m making:

Active attackers are typically participating during shorter time horizons compared to active non-attackers.
Routing table anomalies may go unnoticed over short time horizons, but are less likely to go unnoticed over long time horizons. (Perhaps that can be said generally for all types of observable anomalies? :))
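The mitigation suggested above (only using mappings that have been stable for N months) could be approximated by intersecting periodic snapshots. This sketch assumes each snapshot is a plain prefix-to-ASN dictionary, which is a simplification of real routing data; `stable_mappings` is a hypothetical helper.

```python
from typing import Dict, List


def stable_mappings(snapshots: List[Dict[str, int]]) -> Dict[str, int]:
    """Keep only prefix->ASN entries that are identical in every snapshot."""
    if not snapshots:
        return {}
    first, rest = snapshots[0], snapshots[1:]
    return {
        prefix: asn
        for prefix, asn in first.items()
        if all(snap.get(prefix) == asn for snap in rest)
    }


# Three monthly toy snapshots; the second prefix changes origin in month 3.
months = [
    {"198.51.100.0/24": 64500, "203.0.113.0/24": 64501},
    {"198.51.100.0/24": 64500, "203.0.113.0/24": 64501},
    {"198.51.100.0/24": 64500, "203.0.113.0/24": 64666},
]
stable = stable_mappings(months)
```

Prefixes dropped this way would need some default treatment, for instance the legacy /16 grouping.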

taylorjdawson commented at 11:11 pm on December 31, 2019: none

Overall, I believe this is a positive result, and asmap should be integrated. But let me know if you would like any other measurements or if you have an opinion on AS-splitting.

How do you measure the effectiveness of ASN bucketing vs /16 bucketing? In other words, how do you measure node diversity/distribution?

laanwj referenced this in commit 01fc5891fb on Jan 29, 2020

sidhujag referenced this in commit aec7334f19 on Feb 1, 2020

maflcko closed this on Apr 27, 2020

maflcko reopened this on Apr 27, 2020

leto commented at 1:44 pm on July 14, 2020: contributor

Now that #16702 has been merged, what is the status of this issue? What issues/bugs are remaining to be solved or changed with the new asmap code? I am looking for something to work on, maybe more tests or docs.

naumenkogs commented at 8:58 am on July 15, 2020: member

I asked @MarcoFalke to keep this issue open because it has some useful discussion about how we distribute asmap files. But that’s more of a coordination question, not really about coding etc.

Regarding other future steps, I’m writing up a call to action regarding asmap tests etc., and hope to post it later this week. I’ll post the link here too. Basically, it is about testing the full cycle of asmap use (mostly out of scope of Bitcoin Core, I’d say): see the asmap-rs repo, play with it, look at the issues there, try to improve the full experience.

sidhujag referenced this in commit 61c24989ae on Nov 10, 2020

SRv6d commented at 4:07 pm on February 26, 2021: none

Has there been any progress here and are there any alternatives to sipa/asmap and rrybarczyk/asmap-rs for creating an asmap ?

naumenkogs commented at 9:41 pm on February 26, 2021: member

@xkeyscored

To be clear, sipa/asmap and rrybarczyk/asmap-rs are complementary (asmap-rs can’t work without sipa/asmap). Maybe you meant this, but readers might get confused.
Not much progress, but I wrote a call to action on testing asmap.
Otherwise, the next step would probably be figuring out the map distribution strategy and extensive testing of the tools (see previous point).

Also, I know there’s a new paper coming (not public yet), saying how we could expand asmap approach even further. The current asmap approach is still an improvement, but we should at least discuss going further with their recommendations, either right away or incrementally. I was going to bring that up once the paper is public (perhaps in a month or so)

Rspigler commented at 1:11 pm on June 23, 2021: contributor

Were the suggestions discussed here implemented?

Is the plan eventually to have ASN based bucketing be the default in Bitcoin Core (however the asmap is decided to be distributed)?

naumenkogs commented at 7:26 am on July 5, 2021: member

@Rspigler I’d have to revisit those observations, but I think it’s not the top priority (working on default asmap integration is). Yeah, I think that’s the plan.

Sjors commented at 5:50 pm on April 4, 2023: member

During Advancing Bitcoin 2023 @virtu showed some simulations for the number of inbound connections nodes on “obscure” ASN’s would get. Perhaps there’s a way to reduce the bandwidth burden for these nodes.

It’s good in general to connect to a diverse set of ASN’s so we don’t get eclipsed. But for downloading blocks it seems unnecessary, since we know exactly what we’re looking for. Perhaps during IBD (and long catchup) we could make additional outbound connections to a fast ASN. Hardcoding Amazon for this would be a bit dubious, but we could e.g. use whichever ASN gave us the fastest connection.

kouloumos commented at 6:22 pm on April 4, 2023: contributor

During Advancing Bitcoin 2023 @virtu showed some simulations for the number of inbound connections nodes on “obscure” ASN’s would get.

For more context: https://github.com/virtu/talks/tree/master/2023-03-02-advancing-bitcoin

virtu commented at 4:04 pm on January 30, 2024: contributor

Seeing some of my previous work already got linked, I wanted to share an updated view on one major and one minor concern I came across during my research on ASMAP.

The major concern relates to second- and third-order effects of establishing only one outbound per AS that lead to several negative outcomes for the P2P network graph. To demonstrate the effects, consider a clearnet node opening ten outbound connections: given that Hetzner’s AS comprises around 1k out of a total of 8k reachable clearnet nodes, let’s assume that the first selected node’s address resides in Hetzner’s AS; as a consequence, the remaining 999 Hetzner-hosted nodes just became ineligible. The second selected node comes from Amazon: there go another several hundred nodes. Google’s next: several hundred more. And so on.

Viewed through the lens of inbound connections, a significant decrease in the number of inbound connections observed by nodes residing in „large“ AS (i.e., AS with a large number of reachable nodes in them) can be expected; after all, each AS is getting at most one inbound connection per Bitcoin node, and the total of inbound connections from all nodes to the AS will be distributed across the reachable nodes inside the AS. Vice versa, nodes in „small“ AS are expected to see increased demand for their inbound connection slots. (Note that a similar effect is already at play at the netgroup-level today but it is much less pronounced because hosters evenly distribute the IP addresses they allocate across their subnets/netgroups.)

Simulation data supports this thought experiment. The following graph compares the inbound connection histograms of three simulation runs for both netgroup prefix- and ASMAP-based connection policies, based on real-world reachable node data (~8k reachable clearnet node addresses with corresponding netgroups and kartograf-based ASNs) and a conservative assumption of 30k non-listening nodes. Data for netgroup bucketing looks normally distributed around the expected mean (38k total nodes times ten outbound connections per node distributed across 8k reachable nodes implies a mean of 47.5 inbound connections per reachable node). The ASMAP distribution has the appearance of two superimposed normal distributions (in fact, it is more than two, but this is not relevant here): Essentially, “large-AS” nodes underlie the normal distribution whose mean is located at around 27 inbound connections and “small-AS” nodes underlie the second one whose mean is at around 53 inbound connections.
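The expected mean quoted above follows from simple counting: every node, listening or not, opens the same number of outbound connections, and all of them land on reachable nodes. A small worked version, with a hypothetical helper name:

```python
def mean_inbound(reachable: int, non_listening: int,
                 outbound_per_node: int = 10) -> float:
    """Average inbound connections per reachable node."""
    total_nodes = reachable + non_listening
    return total_nodes * outbound_per_node / reachable


# 8k reachable + 30k non-listening nodes, ten outbound connections each.
mean = mean_inbound(8_000, 30_000)  # 380,000 connections / 8,000 nodes = 47.5
```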

So, what are the implications?

Bitcoin nodes are going to become less equal in terms of connectivity: „large-AS“ nodes (typically housed in data centers providing high bandwidth and well-equipped to handle lots of inbound connections) will see a reduction in inbound connections and associated resource utilization (e.g., bandwidth, compute) that „small-AS“ nodes are going to have to absorb.
Third-order effect: Simulations based on Luke’s ~65k non-reachable node estimate indicate that some “small-AS” nodes are going to run into their inbound connection limit, potentially making Bitcoin nodes’ outbound connections to nodes in small AS less stable because connections can not be opened or sustained. This means that when crossing a particular non-reachable-to-reachable-node ratio, the partitioning effect can spill over from inbound to outbound connections. If Luke’s numbers are correct (given that his graph shows only reachable IPv4 nodes, I expect his total node numbers to be IPv4-only too; moreover, the methodology is not disclosed, so I wouldn’t wager too much on these numbers), the effect is not academic either but to be expected in the wild. If non-reachable node numbers are higher than Luke’s estimate, a conceivable scenario entails a non-negligible share of Bitcoin nodes not being able to sustain ten outbound connections because they are in constant competition with other nodes for inbound slots on „small-AS“ nodes although ample unused inbound resources are available on „large-AS“ nodes.

The minor concern I wanted to bring up relates to the degree and rate of change of AS topology (i.e. how many and how often do subnets migrate between AS). Since there’s an effort to ship ASMAP snapshots with releases, having some idea if and when snapshots become stale seems advisable (imagine several subnets a node is connected to having migrated to a single AS since the ASMAP used by the node was created). Moreover, we should keep in mind the large fraction of conservative and/or lazy node operators (per my crawler, only 25% of reachable nodes are running the latest release; 10%, 25%, 10%, 7% and 7% are running 25.1, 25.0, 24, 23, 22, respectively), so ASMAPs should be good for at least a year, maybe two, to avoid potentially exposing older nodes to harm.

To this end, I’ve done some cursory analysis of two kartograf-generated ASMAPs roughly one year apart (Feb 6, 2023 vs. Jan 25, 2024). The outcome: of 8,233 Bitcoin nodes reachable via IPv4/IPv6 on Jan 25, 2024, 1,024 (12.4%) have non-agreeing ASN. Whether this is a problem or not is hard to say without closer examination. (Data and code to reproduce these results available here.)
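For readers who want to run this kind of comparison on their own data, a minimal sketch follows. It is not the linked code: the two maps are modelled as plain address-to-ASN dictionaries rather than real asmap files, and `asn_disagreement` is a made-up name.

```python
from typing import Dict, Iterable, Optional, Tuple


def asn_disagreement(nodes: Iterable[str],
                     old_map: Dict[str, Optional[int]],
                     new_map: Dict[str, Optional[int]]) -> Tuple[int, float]:
    """Count nodes whose ASN differs between two maps, plus the fraction."""
    node_list = list(nodes)
    differing = sum(1 for ip in node_list if old_map.get(ip) != new_map.get(ip))
    fraction = differing / len(node_list) if node_list else 0.0
    return differing, fraction


# Toy example: one of four nodes moved to a different ASN.
old = {"a": 64500, "b": 64501, "c": 64502, "d": 64503}
new = {"a": 64500, "b": 64501, "c": 64502, "d": 64999}
count, fraction = asn_disagreement(old, old, new)

# The figure reported above: 1,024 of 8,233 nodes.
reported = round(1024 / 8233 * 100, 1)  # 12.4
```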

I’m sure these issues can be solved: the first one potentially by falling back to more than one connection per AS in case of connection problems; the second one might resolve itself by just looking into the data. But in the meantime, I think netgroup bucketing is doing a better job than one might expect (after all, Gleb’s initial post referred to a scenario of being connected to just two AS). I’ve run some Monte Carlo simulations that involve picking ten addresses for outbound connections in line with Bitcoin’s netgroup bucketing policy from a real-world reachable clearnet node dataset and determining the number of distinct AS the ten netgroups correspond to, and it looks like the large majority of IPv4-only, IPv6-only, and IPv4/IPv6 nodes are connected to at least 8 AS. (Data and code to reproduce this graph available here.)

ASN-based bucketing of the network nodes #16599
