The overall goal here is to make it clearer what criteria go into selecting seeds, but it is really just “general clean-up”.
Stuff in this (somewhat large) change
- Added some objects for the main concepts (`Entry` = a line in the file, `Address` = that first field).
- Much more use of generators (`main`, `get_entries_limited_by_asn`), which shaves off some time (40s -> 20s for me).
- Sorting by 30-day uptime and last success rather than by address. There is no grouping by IP type: assuming more people will gradually be able to consume IPv6, this mixes the more highly available IPv6 addresses into the list rather than putting them after all IPv4 addresses.
- Using Python libraries for address validation (socket library).
- Split apart a few things that had logic buried away (`get_address_type`, `get_asn`).
- Use a schema to load a single line to remove any overhead with changes in seeds.txt.
- Consolidated “what makes a seed OK” (in `Entry.is_valid()`).
- Use Python’s `set` for uniquely selecting addresses (where port is ignored).
- Added error logging and warnings when values are invalid.
- Clarified the steps happening in the `main` method.
- Moved suspicious hosts into a separate file.
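
To make the validation, sorting, and de-duplication points above a bit more concrete, here is a rough sketch of how those pieces could fit together. This is not the code in this change: the field names `uptime_30d` and `last_success`, the helpers `unique_by_host` and `select`, and the simple `.onion` suffix check are all assumptions made for illustration. Only `Entry`, `Address`, `Entry.is_valid()`, and `get_address_type` are names taken from the list above, and their bodies here are guesses.

```python
import socket
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Optional


def get_address_type(host: str) -> Optional[str]:
    """Return 'ipv4', 'ipv6', 'onion', or None if the host is not recognised."""
    # Assumption: onion addresses are detected by suffix only.
    if host.endswith(".onion"):
        return "onion"
    for family, name in ((socket.AF_INET, "ipv4"), (socket.AF_INET6, "ipv6")):
        try:
            # inet_pton raises OSError when the text is not a valid address
            # for the given family.
            socket.inet_pton(family, host)
            return name
        except OSError:
            continue
    return None


@dataclass(frozen=True)
class Address:
    host: str
    port: int


@dataclass(frozen=True)
class Entry:
    address: Address
    uptime_30d: float   # hypothetical field: 30-day uptime percentage
    last_success: int   # hypothetical field: unix timestamp of last success

    def is_valid(self) -> bool:
        # The real check presumably covers more criteria than this.
        return get_address_type(self.address.host) is not None


def unique_by_host(entries: Iterable[Entry]) -> Iterator[Entry]:
    """Yield the first entry seen for each host, ignoring the port."""
    seen = set()
    for entry in entries:
        if entry.address.host not in seen:
            seen.add(entry.address.host)
            yield entry


def select(entries: Iterable[Entry]) -> List[Entry]:
    """Filter invalid entries, rank by uptime then last success, de-duplicate."""
    valid = (e for e in entries if e.is_valid())
    ranked = sorted(valid, key=lambda e: (e.uptime_30d, e.last_success), reverse=True)
    return list(unique_by_host(ranked))
```

Because the ranking ignores address type, an IPv6 entry with better uptime lands ahead of an IPv4 entry, and a second entry for the same host on a different port is dropped.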
Stuff that might be worth doing
- Making sure there are at least N addresses of each type (v4, v6, tor)
- ASN checks on IPv6 addresses
- Add lots of tests
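
For the first item above (at least N addresses of each type), one possible approach is sketched below. Nothing like this exists in the change yet; the function name `select_with_minimums` and its parameters are hypothetical. The idea is to reserve the best-ranked N items of each type first, then fill the remaining slots in overall rank order.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Sequence, Set, TypeVar

T = TypeVar("T")


def select_with_minimums(
    ranked: Sequence[T],
    type_of: Callable[[T], str],
    min_per_type: int,
    limit: int,
) -> List[T]:
    """Pick up to `limit` items from an already-ranked sequence, keeping
    at least `min_per_type` of each type when that many are available."""
    counts: Dict[str, int] = defaultdict(int)
    chosen: Set[int] = set()

    # Pass 1: reserve the best-ranked items of each type.
    for i, item in enumerate(ranked):
        kind = type_of(item)
        if counts[kind] < min_per_type:
            counts[kind] += 1
            chosen.add(i)

    # Pass 2: fill any remaining slots in overall rank order.
    for i in range(len(ranked)):
        if len(chosen) >= limit:
            break
        chosen.add(i)

    # Preserve the original ranking in the output.
    return [ranked[i] for i in sorted(chosen)]
```

Note that if the reserved items alone exceed `limit`, this sketch returns more than `limit` items rather than dropping a reserved one; whether that is the right trade-off would need deciding.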
I assume this script is run only once in a blue moon, so I get it if you want to punt on taking a look. In the meantime I can always add tests to make everyone more confident that I didn’t just bork the deployment process for bitcoin-core :p