seeds: Add additional seed source and bump uptime requirements for Onion and I2P nodes #30695

pull virtu wants to merge 3 commits into bitcoin:master from virtu:add-seed-source-bump-uptime-requirements changing 5 files +3397 −1112
  1. virtu commented at 2:25 pm on August 22, 2024: contributor

    This builds on #30008 and adds data exported by my crawler as an additional source for seed nodes. The data covers all supported network types.

    [edit: Added Luke’s seeder as input as well.]

    Motivation

    • Further decentralizes the seed node selection process (in the long term potentially enabling an n-source threshold for nodes to prevent a single source from entering malicious nodes)
    • Removes the need to manually curate the seed node list for any network type: See last paragraph of OP in #30008. My crawler has been discovering the handful of available cjdns nodes for around two months, all but one of which meet the reliability criteria.
    • Aligns the uptime requirements for Onion and I2P nodes with those of clearnet nodes (50%): If I’m reading the code correctly, seeders appear to optimize for up-to-dateness by using lower connection timeouts than Bitcoin Core to maximize throughput. Since my crawler does not have the same timeliness requirements, it opts for accuracy by using generous timeouts. As a result, its data contains additional eligible Onion (and other darknet) nodes, as shown in the histogram below. Around 4500 Onion nodes have been discovered so far (blue); my data adds ~6400 more (orange); ~1500 nodes take longer than Bitcoin Core’s default 20-second timeout and won’t qualify as “good”.

    Connection time histogram for Onion nodes
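
    A minimal sketch of the timeout rule described above, assuming each node comes with a measured connection time in seconds. The 20-second value is the default quoted in this description; the function name and sample addresses are hypothetical and not taken from the crawler or from makeseeds.py.

```python
# Assumption: 20 seconds is the Bitcoin Core default timeout quoted in the PR text.
CORE_DEFAULT_TIMEOUT_S = 20.0


def classify_by_connect_time(connect_times):
    """Split a {address: seconds} mapping into (good, slow) address lists.

    A node counts as "good" only if it answered within the default timeout.
    """
    good, slow = [], []
    for addr, seconds in sorted(connect_times.items()):
        if seconds <= CORE_DEFAULT_TIMEOUT_S:
            good.append(addr)
        else:
            slow.append(addr)
    return good, slow


# Hypothetical measurements, for illustration only.
samples = {"a.onion:8333": 3.2, "b.onion:8333": 19.9, "c.onion:8333": 27.5}
good, slow = classify_by_connect_time(samples)
print("good:", good)
print("slow:", slow)
```

    A crawler with generous timeouts can still record the slow nodes and simply flag them, which is what the second to-do item below refers to.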

    Here are the current results, with 512 nodes for every network except cjdns:

     IPv4   IPv6  Onion    I2P  CJDNS  Pass
    10335   2531  11545   1589     10  Initial
    10335   2531  11545   1589     10  Skip entries with invalid address
     5639   1431  11163   1589      8  After removing duplicates
     5606   1417  11163   1589      8  Enforce minimal number of blocks
     5606   1417  11163   1589      8  Require service bit 1
     4873   1228  11163   1589      8  Require minimum uptime
     4846   1225  11161   1588      8  Require a known and recent user agent
     4846   1225  11161   1588      8  Filter out hosts with multiple bitcoin ports
      512    512    512    512      8  Look up ASNs and limit results per ASN and per net
    
     IPv4   IPv6  Onion    I2P  CJDNS  Pass
     5772   1323    443      0      2  Initial
     5772   1323    443      0      2  Skip entries with invalid address
     4758   1110    443      0      2  After removing duplicates
     4723   1094    443      0      2  Enforce minimal number of blocks
     4723   1094    443      0      2  Require service bit 1
     3732    867    443      0      2  Require minimum uptime
     3718    864    443      0      2  Require a known and recent user agent
     3718    864    443      0      2  Filter out hosts with multiple bitcoin ports
      512    409    443      0      2  Look up ASNs and limit results per ASN and per net
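
    The last pass in the tables caps the number of seeds per ASN and per network. A rough sketch of that idea follows; it is an illustration under stated assumptions, not the code in contrib/seeds. The function name, the tuple layout, and the limits are hypothetical, and nodes without an ASN (for example Onion or I2P) are assumed to be limited per network only.

```python
from collections import defaultdict


def limit_per_asn_and_net(nodes, max_per_asn, max_per_net):
    """Keep nodes in order while capping how many share an ASN or a network.

    nodes: iterable of (address, net, asn) tuples in preference order;
           asn is None when no ASN applies (assumed for Onion/I2P/CJDNS).
    """
    per_net = defaultdict(int)
    per_asn = defaultdict(int)
    kept = []
    for addr, net, asn in nodes:
        if per_net[net] >= max_per_net:
            continue  # this network already has enough seeds
        if asn is not None:
            if per_asn[(net, asn)] >= max_per_asn:
                continue  # too many seeds from the same ASN
            per_asn[(net, asn)] += 1
        per_net[net] += 1
        kept.append(addr)
    return kept


# Hypothetical candidates using documentation address ranges and ASNs.
candidates = [
    ("192.0.2.1:8333", "ipv4", 64500),
    ("192.0.2.2:8333", "ipv4", 64500),
    ("192.0.2.3:8333", "ipv4", 64500),
    ("198.51.100.1:8333", "ipv4", 64501),
    ("x.onion:8333", "onion", None),
    ("y.onion:8333", "onion", None),
]
selected = limit_per_asn_and_net(candidates, max_per_asn=2, max_per_net=3)
print(selected)
```

    With these limits the third node from AS64500 is dropped, so a single ASN cannot fill a network's quota.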
    

    To dos

    • Remove manual nodes and update README
    • Mark nodes with connection times exceeding Bitcoin Core’s default as bad in exporter: done
    • Regenerate mainnet seeds
    • Rebase, then remove WIP label once #30008 gets merged
  2. DrahtBot commented at 2:25 pm on August 22, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK achow101, fjahr

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. virtu renamed this:
    WIP: seeds: Add additional seed source and bump uptime requirements for Onion and I2P nodes
    [WIP] seeds: Add additional seed source and bump uptime requirements for Onion and I2P nodes
    on Aug 22, 2024
  4. in contrib/seeds/README.md:19 in 8332e219c7 outdated
     
     ```
     curl https://bitcoin.sipa.be/seeds.txt.gz | gzip -dc > seeds_main.txt
     curl https://mainnet.achownodes.xyz/seeds.txt.gz | gzip -dc >> seeds_main.txt
     curl https://testnet.achownodes.xyz/seeds.txt.gz | gzip -dc > seeds_test.txt
    +curl https://21.ninja/seeds.txt.gz | gzip -dc > seeds_main.txt
    


    achow101 commented at 2:43 pm on August 22, 2024:
    This will overwrite instead of append.
  5. virtu force-pushed on Aug 22, 2024
  6. in contrib/seeds/nodes_main.txt:720 in f0a7a15459 outdated
    +[2a12:8e40:5668:e429::1]:8333 # AS34465
     [2a12:8e40:5668:f001::1]:8333 # AS34465
    -[2a12:a302:1:a180::b5ca]:8333 # AS23959
    -[2c0f:f4a8:b:b108:807d:b2d6:9146:38be]:8333 # AS37254
    -[2c0f:f4a8:b:b108:c458:5c61:dcca:cb10]:8333 # AS37254
    -iy7go4454pb4p2zmnkwrgsi6v6oqv53zxnmalz6rnfjemxftapfa.b32.i2p:0
    


    jonatack commented at 5:05 pm on August 22, 2024:
    I could be misreading, but it looks like all the tor and i2p peers have been removed without being added elsewhere.

    virtu commented at 9:55 am on August 23, 2024:
    I hadn’t regenerated the node lists because I wanted to wait until Ava’s seeder resumed exporting good I2P node data. I have now pushed a preliminary export anyway. Everything should be there.
  7. jonatack commented at 5:09 pm on August 22, 2024: member

    As a result, its data contains additional eligible Onion (and other darknet) nodes, as shown in the histogram below.

    Some very good Tor and I2P peers seem absent from the update. I addnode them, but hm.

  8. virtu commented at 10:03 am on August 23, 2024: contributor

    I’ve pushed a commit including the node lists generated with all sources. Could you recheck for the peers that were missing? [On a side note, I noticed that what might be your cjdns node (judging by the user agent string) is no longer part of the cjdns node list because the node didn’t have the NODE_NETWORK service bit set.]

    Also, how do you define very good nodes? Right now, seeds are selected more or less randomly from the set of nodes that pass all selection criteria. Instead of doing a random.shuffle(), it might be a better idea to sort them based on availability.
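
    A rough sketch of the two selection strategies mentioned above, assuming every candidate carries an uptime fraction between 0 and 1. The field names `addr` and `uptime` and both helper functions are hypothetical; this is not how makeseeds.py is written.

```python
import random


def pick_random(candidates, count, seed=None):
    """Current approach as described: shuffle the eligible nodes, take the first few."""
    rng = random.Random(seed)
    shuffled = list(candidates)
    rng.shuffle(shuffled)
    return shuffled[:count]


def pick_by_availability(candidates, count):
    """Alternative: prefer the nodes with the highest measured uptime.

    Ties are broken by address so the result is deterministic.
    """
    ranked = sorted(candidates, key=lambda n: (-n["uptime"], n["addr"]))
    return ranked[:count]


# Hypothetical candidates that already passed all filters.
eligible = [
    {"addr": "a.b32.i2p:0", "uptime": 0.62},
    {"addr": "b.b32.i2p:0", "uptime": 0.99},
    {"addr": "c.b32.i2p:0", "uptime": 0.87},
]
best = pick_by_availability(eligible, 2)
print([n["addr"] for n in best])
print([n["addr"] for n in pick_random(eligible, 2, seed=1)])
```

    Sorting favors long-lived nodes but always picks the same top set, whereas shuffling spreads load across all eligible nodes; which trade-off is preferable is left open in this thread.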

  9. DrahtBot added the label Needs rebase on Aug 26, 2024
  10. jonatack commented at 9:17 pm on August 26, 2024: member

    your cjdns node is no longer part of the cjdns node list because the node didn’t have the NODE_NETWORK service bit set

    Yes, I reluctantly had to begin pruning due to the increased rate of chain data growth over the past year and a half.

    Is there now only one CJDNS seed node? Edit: OK, I see at least 4 other CJDNS seed nodes.
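
    To illustrate why a pruned node drops out of the list: the "Require service bit 1" pass keeps only nodes that advertise NODE_NETWORK, while pruned nodes advertise NODE_NETWORK_LIMITED (BIP 159) instead. The flag values below are the ones defined by the Bitcoin P2P protocol; the helper function and example values are a hypothetical sketch, not the seed script itself.

```python
NODE_NETWORK = 1 << 0           # node can serve the full block chain
NODE_WITNESS = 1 << 3           # node supports segregated witness
NODE_NETWORK_LIMITED = 1 << 10  # node serves only recent blocks (BIP 159)


def passes_service_filter(services):
    """Return True if the advertised service flags include NODE_NETWORK."""
    return (services & NODE_NETWORK) != 0


# Example flag combinations for an archival node and a pruned node.
full_node = NODE_NETWORK | NODE_WITNESS | NODE_NETWORK_LIMITED
pruned_node = NODE_WITNESS | NODE_NETWORK_LIMITED

print(full_node, passes_service_filter(full_node))
print(pruned_node, passes_service_filter(pruned_node))
```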

  11. jonatack commented at 9:33 pm on August 26, 2024: member

    Also, how do you define very good nodes?

    I was manually curating I2P nodes based on trusted colleagues (akin to addnode peer selection), filtered by connection reliability and regularly seeing blocks from them. (Edit: I now see that you have ones that I’d recommend, so seems good. There were a couple that were missing, but I see that they, like me, began pruning.)

  12. in contrib/seeds/README.md:21 in a963725378 outdated
     curl https://bitcoin.sipa.be/seeds.txt.gz | gzip -dc > seeds_main.txt
    -curl https://bitcoin.sipa.be/asmap-filled.dat > asmap-filled.dat
    +curl https://mainnet.achownodes.xyz/seeds.txt.gz | gzip -dc >> seeds_main.txt
    +curl https://testnet.achownodes.xyz/seeds.txt.gz | gzip -dc > seeds_test.txt
    +curl https://21.ninja/seeds.txt.gz | gzip -dc >> seeds_main.txt
    +curl https://raw.githubusercontent.com/fjahr/asmap-data/main/latest_asmap.dat > asmap-filled.dat
    


    luke-jr commented at 0:40 am on August 27, 2024:

    virtu commented at 4:36 am on August 27, 2024:
    nice, added!
  13. virtu force-pushed on Aug 27, 2024
  14. virtu force-pushed on Aug 27, 2024
  15. seeds: Pull nodes from virtu's crawler
    Pull additional nodes from virtu's crawler. Data includes sufficient
    Onion and I2P nodes to align the uptime requirements for these networks
    with those of clearnet nodes (i.e., 50%). Data also includes more than
    three times the number of CJDNS nodes currently hardcoded into
    nodes_main_manual.txt, so the hardcoded nodes become obsolete.
    7a2068a0ff
  16. seeds: Pull nodes from Luke's seeder
    Pull additional nodes from Luke's seeder to further decentralize the
    generation of seed nodes.
    02dc45c506
  17. seeds: Regenerate mainnet seeds
    Regenerate mainnet seeds from new sources without the need for hardcoded
    data. Result has 512 nodes from each network type except cjdns, for
    which only eight nodes were found that match the seed node criteria.
    b061b35105
  18. virtu force-pushed on Aug 27, 2024
  19. virtu commented at 5:12 am on August 27, 2024: contributor

    I was manually curating I2P nodes based on trusted colleagues (akin to addnode peer selection), filtered by connection reliability and regularly seeing blocks from them. (Edit: I now see that you have ones that I’d recommend, so seems good. There were a couple that were missing, but I see that they, like me, began pruning.)

    Good to know the automatic process is getting all of the cjdns nodes you were tracking (modulo pruning).

    Data from @luke-jr’s seeder is now included as well. As before: 512 nodes for each network type except cjdns, for which there are eight matching the seed node criteria.
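
    With several independent sources in place, the n-source threshold floated under Motivation could look roughly like the sketch below. This is an assumption about how such a rule might work, not something implemented in this PR; the source labels, addresses, and function name are hypothetical.

```python
from collections import Counter


def nodes_seen_by_at_least(sources, threshold):
    """Return addresses reported by at least `threshold` distinct sources.

    sources: mapping of source name -> iterable of addresses.
    """
    counts = Counter()
    for addresses in sources.values():
        counts.update(set(addresses))  # each source counts once per address
    return sorted(addr for addr, seen in counts.items() if seen >= threshold)


# Hypothetical exports from three independent sources.
exports = {
    "seeder_a": ["192.0.2.1:8333", "192.0.2.2:8333", "192.0.2.2:8333"],
    "seeder_b": ["192.0.2.2:8333", "192.0.2.3:8333"],
    "crawler_c": ["192.0.2.1:8333", "192.0.2.2:8333", "192.0.2.9:8333"],
}
confirmed = nodes_seen_by_at_least(exports, threshold=2)
print(confirmed)
```

    Under such a rule, an address injected by a single source (here 192.0.2.9) would not become a seed unless another source also reports it.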

  20. virtu renamed this:
    [WIP] seeds: Add additional seed source and bump uptime requirements for Onion and I2P nodes
    seeds: Add additional seed source and bump uptime requirements for Onion and I2P nodes
    on Aug 27, 2024
  21. DrahtBot removed the label Needs rebase on Aug 27, 2024
  22. achow101 added this to the milestone 28.0 on Aug 27, 2024
  23. achow101 commented at 3:31 pm on August 27, 2024: member

    ACK b061b3510585a1fe113cc9d1af65852b155aba45

    Instruction updates look correct; did not check any of the changed seeds but did check that i2p and onion mainnet seeds have been readded.

    If anyone has a testnet crawler, we probably want to have additional sources for testnet and add more i2p and onion testnet seeds.

  24. in contrib/seeds/README.md:12 in 7a2068a0ff outdated
    @@ -8,16 +8,17 @@ and remove old versions as necessary (at a minimum when SeedsServiceFlags()
     changes its default return value, as those are the services which seeds are added
     to addrman with).
     
    -The seeds compiled into the release are created from sipa's and achow101's DNS seed and AS map
    -data. Run the following commands from the `/contrib/seeds` directory:
    +The seeds compiled into the release are created from sipa's and achow101's DNS seed,
    +virtu's crawler, and fjahr's community AS map data. Run the following commands from the
    


    fjahr commented at 3:47 pm on August 27, 2024:

    nit: It’s clear what belongs to who because all the other links have the names in the URL but still might be nicer to be explicit in the future. Ignore if you don’t have to retouch.

    virtu's crawler (21.ninja), and fjahr's community AS map data. Run the following commands from the
    
  25. fjahr commented at 4:18 pm on August 27, 2024: contributor

    utACK b061b3510585a1fe113cc9d1af65852b155aba45

    I have reviewed the changes and verified that the included files generate the included chainparamsseeds.h. I have not tested the seed nodes but I did a quick plausibility check on the txt file changes.

    Fun side note: TIL that there is a stable node with a single digit ASN, AS9, which is Carnegie Mellon University.

  26. achow101 merged this on Aug 27, 2024
  27. achow101 closed this on Aug 27, 2024

  28. virtu deleted the branch on Aug 28, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 12:12 UTC
