This PR attempts to solve the problem explained in #16599. A particular attack which encouraged us to work on this issue is explained here [Erebus Attack against Bitcoin Peer-to-Peer Network] (by @muoitranduc)
Instead of relying on /16 prefix to diversify the connections every node creates, we would instead rely on the (ip -> ASN) mapping, if this mapping is provided.
A .map file can be created by every user independently based on a router dump, or provided along with the Bitcoin release. Currently we use the python scripts written by @sipa to create a .map file, which is no larger than 2MB (awesome!).
Here I suggest adding a field to peers.dat which would represent a hash of asmap file used while serializing addrman (or 0 for /16 prefix legacy approach). In this case, every time the file is updated (or grouping method changed), all buckets will be re-computed. I believe that alternative selective re-bucketing for only updated ranges would require substantial changes.
TODO:
more unit testsfind a way to test the code without including >1 MB mapping file in the repo.- find a way to check that mapping file is not corrupted (checksum?)
- comments and separate tests for asmap.cpp
- make python code for .map generation public
- figure out asmap distribution (?)
Interesting corner case: I’m using std::hash to compute a fingerprint of asmap, and std::hash returns size_t. I guess if a user updates the OS to 64-bit, then the hash of asap will change? Does it even matter?