seeds: Pull additional nodes from my seeder and update fixed seeds #30008

pull achow101 wants to merge 10 commits into bitcoin:master from achow101:my-seeder-fixed-seeds changing 8 files +2072 −4177
  1. achow101 commented at 8:05 pm on April 30, 2024: member

    The DNS seeder that I wrote collects statistics on node reliability in the same way that sipa’s seeder does, and also outputs this information in the same file format. Thus it can also be used in our fixed seeds update scripts. My seeder additionally crawls onion v3, i2p, and cjdns, so will now be able to set those fixed seeds automatically rather than curating manual lists.

    In doing this update, I’ve found that makeseeds.py is missing newer versions from the regex as well as cjdns support; both of these have been updated.

    I also noticed that the testnet fixed seeds are all manually curated and sipa’s seeder does not appear to publish any testnet data. Since I am also running the seeder for testnet, I’ve added the commands to generate testnet fixed seeds from my seeder’s data too.

    Lastly, I’ve updated all of the fixed seeds. However, since my seeder has not found any cjdns nodes that met the reliability criteria (possibly due to connectivity issues present in those networks), I’ve left the previous manual seeds for that network.

  2. DrahtBot commented at 8:05 pm on April 30, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    | Type | Reviewers |
    | --- | --- |
    | ACK | fjahr, virtu |
    | Concept ACK | jaonoctus |

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30695 ([WIP] seeds: Add additional seed source and bump uptime requirements for Onion and I2P nodes by virtu)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. laanwj added the label P2P on Apr 30, 2024
  4. in contrib/seeds/nodes_main_manual.txt:513 in ef78deb080 outdated
    509@@ -510,518 +510,6 @@ zvchlrjuzqdlx37fhibhnym4y6p56vtlymujjuzhh2cp34yqfrtq.b32.i2p:0
    510 zxsd3fqczh6ddgejc24nnmb3ww7nalieq3a7cs2mqiy6tmff3wia.b32.i2p:0
    511 zy2ywvyqds5bgdoo4tgbu3bwjp3ygyn3zfuby44jemc6xa6fbwta.b32.i2p:0
    512 zzre44vh766jgfordw2ehu2r6p44j23uyovgvm7iwuhp3g5iz4ca.b32.i2p:0
    513-ycdw2e4ufgfwhcqna4g3m2qsvaly23ozaexawcj3x4gtgcehgwujjgid.onion:8333
    


    jonatack commented at 8:27 pm on April 30, 2024:
    It looks like something odd happened to the manual onion and i2p seeds. Only a small range of first letters were present, and seeds run by colleagues and the bitcoin community were no longer present.

    achow101 commented at 9:05 pm on April 30, 2024:

    They were updated in #29561, and the addresses were pulled from my node’s addrman. Some sorting happened somewhere, and because makeseeds.py doesn’t shuffle (this PR adds a commit that does that), when it applied the max node count, it ended up with the tail end of that list.

    There seem to be sufficient i2p and onion nodes now that there is no need to specifically include nodes run by known people. We don’t do this for IPv4 or IPv6.


    jonatack commented at 9:51 pm on June 26, 2024:
    Yes, I see that your script removed my onion and i2p nodes then and also here.
  5. DrahtBot added the label Needs rebase on May 13, 2024
  6. achow101 force-pushed on May 14, 2024
  7. achow101 force-pushed on May 14, 2024
  8. achow101 commented at 3:45 am on May 14, 2024: member
    My seeder has now found several i2p nodes, so I’ve gone ahead and removed the manually curated ones for both mainnet and testnet. These are now filled in by the script. The only manual ones remaining are cjdns. My seeder has also found a couple of cjdns nodes, and those have been added as well.
  9. DrahtBot removed the label Needs rebase on May 14, 2024
  10. achow101 force-pushed on Jun 5, 2024
  11. makeseeds: Update user agent regex
    Update the user agent regex to match all 3 digits of the version number,
    not just the first 2 digits.
    
    Also updates it to include 24.2, 25.2, 26.1, 27.0, 27.1, 27.99, 28.0 and
    28.99.
    d5a8c4c4bd
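A minimal sketch of what matching all three version components could look like. The version list, the pattern name, and the helper are illustrative assumptions; they are not the expression actually used in makeseeds.py, and user agents carrying a comment suffix are not handled here.

```python
import re

# Hypothetical list of accepted major.minor prefixes, taken from the commit
# message above; the real list lives in makeseeds.py and may differ.
ACCEPTED_VERSIONS = ["24.2", "25.2", "26.1", "27.0", "27.1", "27.99", "28.0", "28.99"]

# Build an alternation such as 24\.2|25\.2|... and require a third numeric
# component, so "/Satoshi:27.1.0/" matches but "/Satoshi:27.1/" does not.
_versions = "|".join(re.escape(v) for v in ACCEPTED_VERSIONS)
PATTERN_AGENT = re.compile(rf"^/Satoshi:({_versions})\.\d+/$")


def is_accepted_agent(user_agent: str) -> bool:
    """Return True if the user agent is an accepted three-part Satoshi version."""
    return PATTERN_AGENT.match(user_agent) is not None
```

Anchoring the pattern at both ends keeps a two-component prefix from matching an unrelated longer version string.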
  12. makeseeds: Support CJDNS af550b3a0f
  13. makeseeds: Shuffle ips after parsing
    The crawlers are not guaranteed to output nodes in a random order, so
    shuffle the ips list after parsing to break any biasing that may be
    caused by the output order.
    d2465dfac6
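The bias this commit addresses can be illustrated with a small sketch: if the input happens to be sorted and a maximum node count is applied without shuffling, only one end of the list survives. The function name and the placeholder addresses below are assumptions for illustration, not code from makeseeds.py.

```python
import random


def cap_nodes(nodes, max_count, rng=None):
    """Shuffle a copy of nodes, then keep at most max_count of them."""
    rng = rng or random.Random()
    shuffled = list(nodes)  # copy so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[:max_count]


# Hypothetical sorted crawler output; these are placeholders, not real addresses.
sorted_nodes = [f"{prefix}example.onion:8333" for prefix in "abcdefghijklmnopqrstuvwxyz"]

# Without shuffling, a cap keeps only one end of the sorted list.
biased = sorted_nodes[-5:]

# With shuffling, the kept nodes are drawn from the whole list.
unbiased = cap_nodes(sorted_nodes, 5, random.Random(0))
```

Passing an explicit `random.Random` instance is a design choice made here only so the sketch can be run reproducibly.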
  14. achow101 force-pushed on Aug 14, 2024
  15. achow101 commented at 5:22 pm on August 14, 2024: member

    Updated for testnet4. I don’t think there are any seeders publishing testnet4 data yet, so I’ve just used the fixed seeds in chainparamsseeds.h and turned them into a nodes_testnet4.txt, in addition to adding instructions.

    Also refreshed the seeds for pre-28.0.

  16. achow101 added this to the milestone 28.0 on Aug 14, 2024
  17. in contrib/seeds/makeseeds.py:201 in 71c91db172 outdated
    197@@ -198,6 +198,7 @@ def parse_args():
    198     argparser = argparse.ArgumentParser(description='Generate a list of bitcoin node seed ip addresses.')
    199     argparser.add_argument("-a","--asmap", help='the location of the asmap asn database file (required)', required=True)
    200     argparser.add_argument("-s","--seeds", help='the location of the DNS seeds file (required)', required=True)
    201+    argparser.add_argument("-m", "--minblocks", help="The minimum number of blocks each node must have", default=MIN_BLOCKS, type=int)
    


    fjahr commented at 9:59 am on August 16, 2024:
    nit: If we want this MIN_BLOCKS to default to mainnet then you could update it to 840_000 while you make edits here. But maybe it would be better to have a more concise behavior here in the future. I guess the file could have a header with the network, number of min blocks that makes sense etc. based on the numbers the seeder node has.

    achow101 commented at 3:30 pm on August 16, 2024:

    Updated MIN_BLOCKS.

    The seeders don’t necessarily have their own node.
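As a rough sketch of how a configurable minimum block count might be wired up, the following parses `-m`/`--minblocks` and filters candidates by height. The default of 840_000 follows the suggestion in this thread; the `blocks` field name, the helper function, and the sample entries are assumptions for illustration only.

```python
import argparse

# Assumed mainnet default, following the review suggestion above.
MIN_BLOCKS = 840_000


def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Filter seed candidates by block height.")
    parser.add_argument("-m", "--minblocks", type=int, default=MIN_BLOCKS,
                        help="The minimum number of blocks each node must have")
    return parser.parse_args(argv)


def filter_by_height(nodes, min_blocks):
    """Keep nodes whose reported height is at least min_blocks."""
    return [n for n in nodes if n["blocks"] >= min_blocks]


# Hypothetical parsed entries; addresses come from a documentation range.
nodes = [
    {"ip": "192.0.2.1", "blocks": 856_000},
    {"ip": "192.0.2.2", "blocks": 41_000},
]

# A lower threshold suits a shorter chain such as testnet4.
args = parse_args(["-m", "30000"])
kept = filter_by_height(nodes, args.minblocks)
```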

  18. in contrib/seeds/nodes_testnet4.txt:1 in 6c2a87fb1c outdated
    0@@ -0,0 +1,9 @@
    1+18.189.156.102:48333 # AS16509
    


    fjahr commented at 10:01 am on August 16, 2024:
    nit: Typo in the commit message of 6c2a87fb1ccf26daa754e6d5a4c4ae687da562e3: “tesnet4”

    achow101 commented at 3:30 pm on August 16, 2024:
    Fixed
  19. in contrib/seeds/README.md:22 in 1e0c478889 outdated
    13@@ -14,8 +14,11 @@ data. Run the following commands from the `/contrib/seeds` directory:
    14 ```
    15 curl https://bitcoin.sipa.be/seeds.txt.gz | gzip -dc > seeds_main.txt
    16 curl https://mainnet.achownodes.xyz/seeds.txt.gz | gzip -dc >> seeds_main.txt
    17+curl https://testnet.achownodes.xyz/seeds.txt.gz | gzip -dc > seeds_test.txt
    18 curl https://bitcoin.sipa.be/asmap-filled.dat > asmap-filled.dat
    19 python3 makeseeds.py -a asmap-filled.dat -s seeds_main.txt > nodes_main.txt
    20 cat nodes_main_manual.txt >> nodes_main.txt
    21+python3 makeseeds.py -a asmap-filled.dat -s seeds_test.txt > nodes_test.txt
    22+python3 makeseeds.py -a asmap-filled.dat -s seeds_test.txt -m 30000 > nodes_testnet4.txt
    


    fjahr commented at 10:12 am on August 16, 2024:
    Hm, either I’m missing something or this would mean that the testnet3 nodes would still be included here? Since you say this isn’t really working yet (how I understand this here) maybe rather put a TODO here instead?

    achow101 commented at 3:30 pm on August 16, 2024:
    I’ve commented it out and added a todo.
  20. in contrib/seeds/README.md:18 in 7f55140007 outdated
    14 
    15 ```
    16 curl https://bitcoin.sipa.be/seeds.txt.gz | gzip -dc > seeds_main.txt
    17+curl https://mainnet.achownodes.xyz/seeds.txt.gz | gzip -dc >> seeds_main.txt
    18+curl https://testnet.achownodes.xyz/seeds.txt.gz | gzip -dc > seeds_test.txt
    19 curl https://bitcoin.sipa.be/asmap-filled.dat > asmap-filled.dat
    


    fjahr commented at 10:16 am on August 16, 2024:

    How up-to-date is that ASMap file? If it’s not updated regularly it might be better to use one of the ones available at https://github.com/fjahr/asmap-data. Maybe this is a change in process that requires more eyes on it and I wouldn’t call this a blocker then but I would like to propose this as a goal for the next release to use something more recent.

    cc: @sipa


    sipa commented at 1:02 pm on August 16, 2024:
    The file is from May 2022. No reason to keep using it; there is no inherent reason why it should be considered more reliable than yours.

    achow101 commented at 3:31 pm on August 16, 2024:
    Yes, if it’s old, we should use a more up to date one. Can that repo somehow have a permalink to the most recent asmap? Otherwise we would run into the same problem with outdated asmaps.

    fjahr commented at 4:11 pm on August 16, 2024:

    Can that repo somehow have a permalink to the most recent asmap? Otherwise we would run into the same problem with outdated asmaps.

    I hadn’t really thought about this so far. This is a bit hacky, but for now I copied the latest asmap file there to latest_asmap.dat, and I will update that file as we create new maps so that this link always points to the latest file. I will look into making this a bit nicer by having latest.asmap.org (page is still WIP) link to that file automatically, but that shouldn’t block this PR here. I can update this as a follow-up.


    achow101 commented at 4:35 pm on August 16, 2024:
    I think you can make a symlink in the repo and just make sure it’s updated every time.

    achow101 commented at 4:53 pm on August 16, 2024:
    Changed to use that asmap, although did not regenerate the seeds.

    fjahr commented at 2:05 pm on August 18, 2024:

    I think you can make a symlink in the repo and just make sure it’s updated every time.

    I tried this but it seems like it’s not supported. I didn’t find an official statement but here is a discussion that seems to confirm there is no way to use symlinks for this: https://github.com/dear-github/dear-github/issues/156

  21. fjahr commented at 10:38 am on August 16, 2024: contributor

    tACK 7f55140007186cda876ad0a5da812e391cddbcc4

    I reviewed the code and the changes look correct to me. I tested that the updated instructions in the README work as expected and they did (though see my comment on the testnet4 line). The resulting files showed some reasonable differences from the files included here, which is expected. I also confirmed that generating chainparamsseeds.h from the txt files included here yields the same result.

  22. makeseeds: Configurable minimum blocks for testnet4's smaller chain 5bab3175a6
  23. seeds: Also pull from achow101 seeder 0676515397
  24. seeds: Add testnet instructions ed5b86cbe4
  25. seeds: Remove manual onion and i2p seeds
    The seeders now produce onion and i2p seeds, so there is no need to keep these
    in the manual list.
    
    Although cjdns seeds should also be produced, there are not enough
    good ones detected by the seeder, so we keep the manual seeds for them.
    8ace71c737
  26. seeds: Add testnet4 fixed seeds file f1f24d7214
  27. seeds: Fixed seeds update
    Update the fixed seeds for both mainnet and testnet
    d8fd1e0faf
  28. achow101 force-pushed on Aug 16, 2024
  29. seeds: Use fjahr's more up to date asmap 41ad84a00c
  30. fjahr commented at 9:55 pm on August 17, 2024: contributor

    re-ACK 41ad84a00c20f54b520aab7f6f975231da0ee2d0

    Only changes were addressing above (minor) comments: https://github.com/bitcoin/bitcoin/compare/7f55140007186cda876ad0a5da812e391cddbcc4..41ad84a00c20f54b520aab7f6f975231da0ee2d0

  31. virtu commented at 9:48 am on August 19, 2024: contributor

    ACK 41ad84a

    Reviewed the code; changes look fine. Also in favor of using the regularly-updated asmap from collaborative runs.

    I noticed the seeds.txt.gz file I used (~2024-08-19T08:30Z) did not contain any good I2P nodes.

    Also, since I’ve been working on exporting my crawler’s results as well, I noticed the Onion node numbers seem rather low. Here are some statistics that I generated by skipping the final stage of the makeseeds.py script (so as to not apply the per-AS and per-network limit, thus retaining all viable nodes), applying the script to all input sources individually, and comparing the resulting addresses.

    | Network | Addresses (before limits) | Overlap |
    | --- | --- | --- |
    | IPv4 | sipa=1714, ava=2886, virtu=4554 | sipa-ava=704, sipa-virtu=1684, ava-virtu=2355, sipa-ava-virtu=687 |
    | IPv6 | sipa=469, ava=583, virtu=1203 | sipa-ava=178, sipa-virtu=460, ava-virtu=523, all=176 |
    | Onion | ava=443, virtu=11384 | ava-virtu=388 |
    | CJDNS | ava=2, virtu=9 | ava-virtu=2 |
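The comparison described above, running each source separately and then counting shared addresses, can be sketched with set intersections. The source labels, the helper name, and the addresses are placeholders invented for illustration; this is not the script that produced the numbers in the table.

```python
from itertools import combinations


def overlap_report(sources):
    """Return {label: count} for every combination of two or more sources."""
    report = {}
    names = sorted(sources)
    for size in range(2, len(names) + 1):
        for combo in combinations(names, size):
            # Addresses present in every source of this combination.
            common = set.intersection(*(sources[name] for name in combo))
            report["-".join(combo)] = len(common)
    return report


# Placeholder addresses from a documentation range, not real crawl results.
sources = {
    "a": {"192.0.2.1:8333", "192.0.2.2:8333", "192.0.2.3:8333"},
    "b": {"192.0.2.2:8333", "192.0.2.3:8333"},
    "c": {"192.0.2.3:8333", "192.0.2.4:8333"},
}
report = overlap_report(sources)
```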
  32. achow101 commented at 3:36 pm on August 19, 2024: member

    I noticed the seeds.txt.gz file I used (~2024-08-19T08:30Z) did not contain any good I2P nodes.

    I think it got disconnected from I2P and started marking every I2P node as down when it was actually that the crawler’s I2P connection was down. That’s something I’ll need to fix.

  33. jaonoctus approved
  34. jaonoctus commented at 6:17 pm on August 26, 2024: none
    utACK
  35. achow101 referenced this in commit 17071e47f5 on Aug 26, 2024
  36. achow101 merged this on Aug 26, 2024
  37. achow101 closed this on Aug 26, 2024

  38. virtu commented at 5:53 am on August 27, 2024: contributor

    just rebased #30695 since #30008 got merged and noticed it accidentally removes all hardcoded onion and i2p seeds in src/chainparamsseeds.h (and seeds_main.txt).

    [edited for clarity]

  39. fjahr commented at 10:39 am on August 27, 2024: contributor

    it accidentally removes all hardcoded onion and i2p seeds

    Wasn’t that the plan? From the description:

    My seeder additionally crawls onion v3, i2p, and cjdns, so will now be able to set those fixed seeds automatically rather than curating manual lists.

  40. virtu commented at 10:46 am on August 27, 2024: contributor

    it accidentally removes all hardcoded onion and i2p seeds

    Wasn’t that the plan? From the description:

    My seeder additionally crawls onion v3, i2p, and cjdns, so will now be able to set those fixed seeds automatically rather than curating manual lists.

    “hardcoded” was a poor choice, I guess. Updated my OP to avoid further confusion.

    I wasn’t referring to the removal of the manually curated addresses in nodes_main_manual.txt but to the fact that there won’t be any Onion or I2P seeds hardcoded into the binary at all (via src/chainparamsseeds.h), which I believe is an accident.

  41. achow101 commented at 3:19 pm on August 27, 2024: member

    I wasn’t referring to the removal of the manually curated addresses in nodes_main_manual.txt but to the fact that there won’t be any Onion or I2P seeds hardcoded into the binary at all (via src/chainparamsseeds.h), which I believe is an accident.

    Yes, just noticed that too.

    I’ve added #30695 to the milestone to get those added back in.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-21 15:12 UTC
