EthanHeilman commented at 10:25 pm on June 29, 2015: contributor

These changes implement countermeasures 3 (feeler connections) and 4 (test-before-evict) suggested in our paper: “Eclipse Attacks on Bitcoin’s Peer-to-Peer Network”.

Design:

The primary change is the creation of a feeler connection thread. Every 2 minutes this feeler thread launches one feeler connection, increasing the default number of max outgoing connections to 9. Feeler connections are very short lived and disconnect upon verifying the tested host is running bitcoind. Feeler connections exist only to test if the remote host to test is online. The feeler thread pulls the addresses to test from two sources:

Source 1. Tried table collisions. A collision occurs when an address, addr1, is being moved to the tried table from the new table, but maps to a position in the tried table which already contains an address. This change ensures that during a collision, addr1 is not inserted into tried but instead inserted into a buffer. The to-be-evicted address, addr2, is then tested by the feeler thread. If addr2 is found to be online, we remove addr1 from the buffer and addr2 is not evicted, on the other hand if addr2 is found be offline it is replaced by addr1.

Source 2. The new table. If the feeler thread has no tried table collisions to be tested, it selects an address from the new table. It does this to grow the number of fresh (recently online) addresses in the tried table.

Advantages:

In our paper we sample several peer lists. We found that a large percentage of addresses in tried tables are stale IP addresses (the lowest was 72 percent stale, the highest was 95 percent stale), which increases the risk of eclipse attacks. This change remedies this by ensuring that the tried table grows quickly and contains many recently online addresses. Countermeasure 4 (feeler connections) strengthens countermeasure 3 (test-before-evict).
Another small side advantage is that, as no more than ten addresses can be in the test buffer at once, and addresses are only cleared one at a time from the test buffer, an attacker is forced to wait at least two minutes to insert a new address into tried after filling up the test buffer. This rate limits an attacker attempting to launch an eclipse attack.

See our paper for a full analysis of the benefits of these countermeasures.

Risk mitigation:

To prevent this functionality from being used as a DoS vector, we limit the number of addresses which are to be tested to ten. If we have more than ten addresses to test, we drop new addresses being added to tried if they would evict an address. Since the feeler thread only creates one new connection every 2 minutes the additional network overhead is limited.
An address in tried gains immunity from tests for 4 hours after it has been tested or successfully connected to.
To avoid issues of synchronization, the feeler thread sleeps for between 0 and 3 seconds prior to making a connection.

Tests:

We ran an instance with our changes for two days against our in house developed attack code to induce many collisions in the tried table. Under these conditions we used valgrind to look for memory leaks. See output of the test here:

 0e0@ubuntu:~/bitcoin-fork/src$ valgrind ./bitcoind -debug -printtoconsole -testnet > output.txt
 1==1918== Memcheck, a memory error detector
 2==1918== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
 3==1918== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
 4==1918== Command: ./bitcoind -debug -printtoconsole -testnet
 5==1918==
 6^C==1918==
 7==1918== HEAP SUMMARY:
 8==1918==     in use at exit: 3,864 bytes in 16 blocks
 9==1918==   total heap usage: 224,281,816 allocs, 224,281,800 frees, 32,337,629,302 bytes allocated
10==1918==
11==1918== LEAK SUMMARY:
12==1918==    definitely lost: 0 bytes in 0 blocks
13==1918==    indirectly lost: 0 bytes in 0 blocks
14==1918==      possibly lost: 304 bytes in 1 blocks
15==1918==    still reachable: 3,560 bytes in 15 blocks
16==1918==         suppressed: 0 bytes in 0 blocks
17==1918== Rerun with --leak-check=full to see details of leaked memory
18==1918==
19==1918== For counts of detected and suppressed errors, rerun with: -v
20==1918== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

As we have made some cosmetic code changes since this test was run we are rerunning this test and will update this pull request when it is finished. We are launching several test nodes.

If you want to test my code, and you don’t want to simulate a large number of incoming connections, you need to generate a bunch of collisions that would trigger feeler connections. The way to do this is to reduce the the number of buckets in tried to 1. (That way, every address inserted into tried will have a high probability (at least p=1/64) to be a collision.)

addrman_tests.cpp contains unit tests for the code I added to addrman. As a small side note, because addrman bucket placement depends on a randomly chosen seed (nKey) I needed to create a method to set this seed to a known value so that the unit tests would be deterministic. This method is only available during the addrman unittests.

Added test-before-evict discipline in Addrman, feeler connections.

Changes addrman to use the test-before-evict discipline in which an
address is to be evicted from the tried table is first tested and if
it is still online it is not evicted.

Creates a new thread which tests if addresses are online or offline by
briefly connecting to them. These short lived connections are referred
to as feeler connections. Feeler connections have two purposes:
First, to increase the number of addresses in tried, by selecting and
connecting to addresses in new. Second, to implement the testing stage
of the test-before-evict discipline.

Adds tests to provide test coverage for these changes.

This change was suggested as Countermeasure 3 and 4 in
Eclipse Attacks on Bitcoin’s Peer-to-Peer Network, Ethan Heilman,
Alison Kendler, Aviv Zohar, Sharon Goldberg. ePrint Archive Report
2015/263. March 2015.

caad33fb23

laanwj added the label P2P on Jun 30, 2015

EthanHeilman commented at 3:35 pm on July 13, 2015: contributor

We ran a node with these changes between July 9th to July 13th. We attempted to connect to our node 1375078 times (not all connections succeeded due to connection exhaustion) from 16384 district IP addresses (256 IPs per group, 64 groups using the unallocated prefix 249\8). Both the output file and valgrind are nominal.

Valgrind output:

 0e0@ubuntu:~/bitcoin/src$ valgrind ./bitcoind -testnet -debug -printtoconsole > output.txt
 1==53495== Memcheck, a memory error detector
 2==53495== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
 3==53495== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
 4==53495== Command: ./bitcoind -testnet -debug -printtoconsole
 5==53495== 
 6==53495== 
 7==53495== HEAP SUMMARY:
 8==53495==     in use at exit: 5,976 bytes in 18 blocks
 9==53495==   total heap usage: 1,094,555,059 allocs, 1,094,555,041 frees, 202,228,428,605 bytes allocated
10==53495== 
11==53495== LEAK SUMMARY:
12==53495==    definitely lost: 0 bytes in 0 blocks
13==53495==    indirectly lost: 0 bytes in 0 blocks
14==53495==      possibly lost: 304 bytes in 1 blocks
15==53495==    still reachable: 5,672 bytes in 17 blocks
16==53495==         suppressed: 0 bytes in 0 blocks
17==53495== Rerun with --leak-check=full to see details of leaked memory
18==53495== 
19==53495== For counts of detected and suppressed errors, rerun with: -v
20==53495== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Abstracts from the output:

We are connecting from 249\8 and running on testnet. We can see here bitcoin swapping an offline address for a recently connected address.

 0trying connection 243.7.23.180:18333 lastseen=371333.3hrs
 1received: inv (37 bytes) peer=378634
 2got inv: tx 4fa9bb2a4f2022cca3652f1a376a28080e7dcebfc28256a13ce3f117c1deb752  have peer=378634
 3sending: inv (37 bytes) peer=453392
 4sending: inv (37 bytes) peer=515854
 5sending: inv (37 bytes) peer=555025
 6--
 7sending: inv (73 bytes) peer=1290674
 8sending: inv (37 bytes) peer=1288871
 9sending: inv (37 bytes) peer=1288619
10received: ping (8 bytes) peer=515854
11sending: pong (8 bytes) peer=515854
12connection to 243.7.23.180:18333 timeout
13sending: inv (37 bytes) peer=453392
14sending: inv (73 bytes) peer=1289172
15sending: inv (37 bytes) peer=1289071
16received: ping (8 bytes) peer=566065
17sending: pong (8 bytes) peer=566065
18--
19sending: inv (37 bytes) peer=1290273
20received: ping (8 bytes) peer=515854
21sending: pong (8 bytes) peer=515854
22sending: inv (37 bytes) peer=1288971
23sending: inv (37 bytes) peer=1289923
24Swapping 249.48.23.239:18333 for 243.7.23.180:18333 in tried table
25Moving 249.48.23.239:18333 to tried

jgarzik commented at 1:31 pm on September 16, 2015: contributor

concept ACK

laanwj commented at 11:30 am on May 5, 2016: member

What is the status here? Needs rebase, and more review/testing.

EthanHeilman commented at 4:26 pm on May 5, 2016: contributor

@laanwj I’ve been auditing/fuzzing the existing network code and slowing adding unittests for net/addrman to establish a behavior baseline and make unittesting of this feature easier. See #6720, #7212, #7291, #7696

I’m currently planning on breaking this commit into two commits (feeler connections and test-before-evict) and testing them independently. My current roadmap is:

Late June: push out an cleaned up version of feeler connections, spin up some test nodes.
Early July: post results from test nodes.
July-August: depending on how things go, I’d like to push a cleaned up version of test-before-evict.

I have found some other minor bugs in the networking code, I’m trying to figure out if I should prioritize them over this.

in src/net.cpp: in caad33fb23

57@@ -55,6 +58,7 @@ using namespace std;
58 
59 namespace {
60     const int MAX_OUTBOUND_CONNECTIONS = 8;
61+    const int MAX_FEELER_CONNECTIONS = 1;

rebroad commented at 4:05 pm on August 6, 2016:

This number seems to only effectively increase MAX_OUTBOUND_CONNECTIONS as it is not used other than adding a number to this constant.

sipa commented at 11:44 am on August 25, 2016: member

Rebase now that #8282 is merged?

laanwj commented at 10:16 am on September 9, 2016: member

Is this still relevant after #8282?

sipa commented at 10:39 am on September 9, 2016: member

Yes, #8282 only implements feeler connections, not test-before-evict.

TheBlueMatt commented at 5:12 pm on October 28, 2016: member

Needs rebase since Aug 25 :(

EthanHeilman commented at 8:20 pm on October 28, 2016: contributor

@TheBlueMatt I have a new version of this change in which test-before-evict is broken out separately from feelers, since feelers is already in 13.1. I haven’t created a pull request for it because I’m hunting for typos and mistakes. It should be out either today or Monday.

https://github.com/EthanHeilman/bitcoin/commit/e7157a076429ab45151f228c4825a5d55f6ad68e

laanwj commented at 2:28 pm on November 2, 2016: member

Closing in favor of #9037

laanwj closed this on Nov 2, 2016

laanwj referenced this in commit a36834f10b on Mar 6, 2018

PastaPastaPasta referenced this in commit 84620651c8 on Jun 10, 2020

PastaPastaPasta referenced this in commit 8e50809b48 on Jun 13, 2020

PastaPastaPasta referenced this in commit 79b2ce9b86 on Jun 13, 2020

PastaPastaPasta referenced this in commit 8b9012d292 on Jun 13, 2020

PastaPastaPasta referenced this in commit f442923aee on Jun 17, 2020

gades referenced this in commit 7ad9414b05 on Jun 25, 2021

MarcoFalke locked this on Sep 8, 2021

Added test-before-evict discipline in Addrman, feeler connections. #6355

Design:

Advantages:

Risk mitigation:

Tests: