contrib: makeseeds.py improvements #17020

issue laanwj opened this issue on October 2, 2019
  1. laanwj commented at 11:46 am on October 2, 2019: member

    Some suggestions for improvements, or at least for discussion between major releases:

    #7398 was a previous attempt to improve this script, but was abandoned, maybe some changes are useful.

  2. laanwj added the label P2P on Oct 2, 2019
  3. laanwj added the label Scripts and tools on Oct 2, 2019
  4. maflcko commented at 12:25 pm on October 2, 2019: member
    good first issue? :trollface:
  5. laanwj added the label good first issue on Oct 2, 2019
  6. solon referenced this in commit 0b1dcd32bf on Oct 3, 2019
  7. maflcko referenced this in commit f1f284aa75 on Oct 3, 2019
  8. sidhujag referenced this in commit 206951bc3b on Oct 4, 2019
  9. laanwj added this to the milestone 0.20.0 on Oct 15, 2019
  10. brakmic commented at 2:11 pm on November 21, 2019: contributor

    Hi,

    Would it be acceptable to get the suspicious hosts from a URL instead of a file? I expanded the makeseeds.py with a small example that gets them from Greg Maxwell’s page.

    Here’s the commit: https://github.com/brakmic/bitcoin/commit/93d1c47b9e127d1312caa33c76eb1c0aced009ed

    Regards,

  11. laanwj commented at 7:46 pm on November 22, 2019: member

    Would it be acceptable to get the suspicious hosts from a URL instead of a file? I expanded the makeseeds.py with a small example that gets them from Greg Maxwell’s page.

    I don’t think the script should fetch from a URL automatically (this interferes with deterministic use), but telling the user to get it themselves is fine.
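A minimal sketch of the "local file" approach discussed here, assuming a plain-text file with one host per line where blank lines and `#` comments are ignored. The function name `load_suspicious_hosts` and the file format are illustrative assumptions, not the script's actual interface.

```python
from pathlib import Path


def load_suspicious_hosts(path):
    """Return the set of host entries listed in a local text file.

    Assumed format: one entry per line; blank lines and lines starting
    with '#' are skipped. No network access is involved, so repeated
    runs on the same file give the same result.
    """
    hosts = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        entry = line.strip()
        if entry and not entry.startswith("#"):
            hosts.add(entry)
    return hosts
```

The user would fetch or curate the file themselves and pass its path in, which keeps the run deterministic for a given input file.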

  12. sanjaykdragon commented at 0:39 am on December 29, 2019: contributor
    Updated your first point: “Read suspicious hosts (and ASNs) from a file instead of hardcoding”
  13. laanwj referenced this in commit 7e841f3f9b on Jan 20, 2020
  14. sidhujag referenced this in commit ac6cf09572 on Jan 24, 2020
  15. fanquake removed this from the milestone 0.20.0 on Apr 7, 2020
  16. fanquake added this to the milestone 0.21.0 on Apr 7, 2020
  17. vasild commented at 7:49 pm on September 23, 2020: contributor

    Would it be acceptable to get the suspicious hosts from a URL instead of a file?

    A file that comes with the release is signed. Because the release is made e.g. 1 month ago the user knows that this same exact file has been published 1 month ago and if there was some problem with it then somebody would have complained. Accessing a local file does not depend on a centralized service running (the web server).

    A file that has to be downloaded is not signed (yes, we can sign it with the same keys as the release and make bitcoind verify the signature). The user never knows when the file was created, maybe the web server is supplying one content to one user and another content to another user. If the web server just bricks, then the file is not accessible.
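One possible mitigation for the "different content for different users" concern is to pin a digest in the signed release and compare any downloaded copy against it. The sketch below assumes SHA-256 and hypothetical helper names; it does not address the availability problem, and it is not something the project is known to do.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the lowercase hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


def matches_pinned_digest(data: bytes, pinned_hex: str) -> bool:
    """Check downloaded bytes against a digest shipped with the release.

    pinned_hex is a hypothetical value that would be distributed in the
    signed release; the comparison is case-insensitive.
    """
    return sha256_hex(data) == pinned_hex.strip().lower()
```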

  18. laanwj commented at 10:36 am on October 29, 2020: member

    Now that (since #20237) we get the full number of Tor seeds (512), we could look into increasing, or even normalizing the uptime requirement for onions:

    # Require at least 50% 30-day uptime for clearnet, 10% for onion.
    req_uptime = {
        'ipv4': 50,
        'ipv6': 50,
        'onion': 10,
    }
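To illustrate what raising or normalizing the onion requirement would change, here is a hedged sketch of a per-network uptime filter. The candidate records (dicts with `net` and `uptime` keys), the function name, and the `>=` comparison are assumptions for illustration, not the script's actual data model.

```python
# Thresholds as quoted above; a normalized variant sets 'onion' to 50 as well.
REQ_UPTIME = {'ipv4': 50, 'ipv6': 50, 'onion': 10}
NORMALIZED_UPTIME = {'ipv4': 50, 'ipv6': 50, 'onion': 50}


def filter_by_uptime(candidates, req_uptime):
    """Keep candidates whose uptime meets the threshold for their network."""
    return [c for c in candidates if c['uptime'] >= req_uptime[c['net']]]


# Hypothetical candidates: uptime is a 30-day percentage.
candidates = [
    {'net': 'ipv4', 'uptime': 80},
    {'net': 'onion', 'uptime': 20},
    {'net': 'onion', 'uptime': 60},
]
print(len(filter_by_uptime(candidates, REQ_UPTIME)))
print(len(filter_by_uptime(candidates, NORMALIZED_UPTIME)))
```

With the current thresholds all three hypothetical candidates pass; with the normalized ones the onion entry at 20% is dropped.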
    
  19. maflcko removed this from the milestone 0.21.0 on Nov 1, 2020
  20. maflcko commented at 9:03 am on November 1, 2020: member
    Cleared the milestone for now. This can happen anytime but shouldn’t hold up the next release.
  21. sidhujag referenced this in commit 54aa12fd8b on Nov 10, 2020
  22. ghost commented at 11:21 am on February 23, 2022: none

    A. Filtering hosts with multiple ports can be removed IMO:

    https://github.com/bitcoin/bitcoin/blob/c44e734dca64a15fae92255a5d848c04adaad2fa/contrib/seeds/makeseeds.py#L215

    B. Tor v3 can also be included in the results.

    C. Recent observation which can be confirmed with:

    wget https://gitlab.com/api/v4/projects/33695681/packages/generic/nrich/0.1.1/nrich_0.1.1_amd64.deb
    sudo dpkg -i nrich_0.1.1_amd64.deb
    host -t a seed.bitcoin.sipa.be | sed -e 's/seed.bitcoin.sipa.be has address //g' | nrich -
    

    Possible reasons for vulnerable machines used for bitcoin nodes:

    1. False positives
    2. Users not aware or don’t care
    3. Attackers prefer using these for better results
    4. Honeypots
    5. Other reasons

    Setting aside reason 1, which won’t hold for all the results, filtering such nodes in makeseeds.py should make sense. Below is an example for one IP copied from suspicious_hosts.txt:

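    import requests

    # Example IP copied from suspicious_hosts.txt.
    ip = '88.198.17.7'

    # Query Shodan's InternetDB for this address.
    url = 'https://internetdb.shodan.io/' + ip
    response = requests.get(url)

    # Crude check: any 'CVE' substring in the response marks the host as vulnerable.
    if response.text.find('CVE') != -1:
        print('vulnerable')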
    
  23. russeree commented at 6:44 am on April 10, 2022: contributor

    From “Read suspicious hosts (and ASNs) from a file instead of hardcoding (scripts: Read suspicious hosts from a file instead of hardcoding #17823 does this for suspicious hosts)”

    I would like to tackle the “(and ASNs)” portion of this task. Firstly, regarding ASNs, nothing seems to be hardcoded into the generation process currently, other than a lookup through an external service: each seed IP address is currently resolved through xxx.xxx.xxx.xxx.origin.asn.cymru.com. Does this task imply that a complete list of ASNs should be stored locally and updated periodically? A complete compressed list of ASNs and their IP address blocks currently takes ~5.5MB.

    Secondly does this mean there should be a suspicious ASNs file? I don’t think this is the correct interpretation.

    If all my interpretations are incorrect please let me know.
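For reference, the external lookup mentioned above is a DNS query against a name derived from the address. The sketch below only builds that name for IPv4 and performs no DNS request; the function name is hypothetical, and the reversed-octet ordering reflects how the Team Cymru service is generally documented rather than anything stated in this thread.

```python
import ipaddress


def cymru_origin_query(ip: str) -> str:
    """Build the origin.asn.cymru.com query name for an IPv4 address.

    The octets are written in reverse order, as in reverse-DNS names.
    Raises ValueError for anything that is not a valid IPv4 address.
    """
    addr = ipaddress.ip_address(ip)
    if addr.version != 4:
        raise ValueError("this sketch only handles IPv4 addresses")
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return reversed_octets + ".origin.asn.cymru.com"
```

Replacing this per-address query with a local ASN table is what makes the question about storing a ~5.5MB list relevant.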

  24. RF5 commented at 7:55 am on April 12, 2022: contributor
    I’ve tried to address some of the more minor points of improvement in #24818 , namely differentiating max seeds per ASN for ipv4 and ipv6, with some more docs and a MIN_BLOCKS bump. Any feedback would be appreciated.
  25. laanwj referenced this in commit 7da4f65a00 on Apr 15, 2022
  26. laanwj commented at 8:50 am on April 15, 2022: member

    Secondly does this mean there should be a suspicious ASNs file? I don’t think this is the correct interpretation.

    No, a list of suspicious ASNs has not been proposed. To be honest I don’t think that’s very useful, neither is the list of suspicious hosts, it makes no sense to maintain that as part of this project. The more general rules “N hosts per ASN” make sense, though.
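The general "N hosts per ASN" rule can be sketched as a single pass that counts accepted hosts per (network, ASN) pair. The record layout, the function name, and the per-network limits below are illustrative assumptions, not the values or structures used by makeseeds.py.

```python
from collections import defaultdict


def limit_per_asn(candidates, max_per_asn):
    """Keep at most max_per_asn[net] candidates for each (net, asn) pair.

    Candidates are visited in order, so earlier (for example, better
    ranked) entries win when an ASN exceeds its limit.
    """
    counts = defaultdict(int)
    kept = []
    for c in candidates:
        key = (c['net'], c['asn'])
        if counts[key] >= max_per_asn[c['net']]:
            continue  # this ASN already has its quota for this network
        counts[key] += 1
        kept.append(c)
    return kept
```

Keying the counter on the network as well as the ASN allows different limits for ipv4 and ipv6, which is the kind of split described in #24818.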

  27. fanquake referenced this in commit d2e04196b6 on Apr 18, 2022
  28. Munkybooty referenced this in commit ad766b68c3 on Jun 9, 2022
  29. Munkybooty referenced this in commit c04179d2fe on Jun 21, 2022
  30. Munkybooty referenced this in commit 0dfe999d0d on Jun 25, 2022
  31. Munkybooty referenced this in commit 5c21cac562 on Jun 28, 2022
  32. Munkybooty referenced this in commit e34c7fe83d on Jul 6, 2022
  33. Munkybooty referenced this in commit 94a2c574ac on Aug 3, 2022
  34. fernandezpablo85 commented at 3:40 pm on August 9, 2022: contributor

    The description should be updated.

    Read ASNs from a file (in progress

    Should read:

    Read ASNs from a file (done in #24864)

  35. Munkybooty referenced this in commit 07e7f2a7c8 on Aug 16, 2022
  36. Munkybooty referenced this in commit edfe46a5e6 on Aug 22, 2022
  37. Munkybooty referenced this in commit 7ccdd9386e on Aug 22, 2022
  38. Munkybooty referenced this in commit df6c5a30c0 on Aug 23, 2022
  39. Munkybooty referenced this in commit 93d4768e99 on Sep 6, 2022
  40. Munkybooty referenced this in commit c685dbc502 on Sep 19, 2022
  41. Munkybooty referenced this in commit 8ddd492e6b on Oct 3, 2022
  42. Munkybooty referenced this in commit 5aee25d674 on Oct 13, 2022
  43. Munkybooty referenced this in commit 3909e38ecb on Oct 13, 2022
  44. Munkybooty referenced this in commit d8e62ca6bf on Oct 17, 2022
  45. PastaPastaPasta referenced this in commit 8c6fb5622d on Oct 17, 2022
  46. russeree commented at 2:29 pm on January 29, 2023: contributor

    The per-ASN limit IS being enforced properly. I wrote a Python script to validate the output of node_main.txt, and the limits are being enforced. The previous thread is locked.

    Output of script. Each line shows the ASN group, IP Type and Count. https://gist.github.com/russeree/850d299f386fa19a9d819b2887190a2b

    Script source https://github.com/russeree/bitcoin/blob/e9693097fef78a01c47fb244cbebe8f580e179c4/contrib/seeds/test/asn_limit.py

  47. laanwj commented at 9:10 pm on January 29, 2023: member
    Thanks, I ticked that one off. I’m fairly sure also that it was fixed in the latest round of changes.
  48. willcl-ark commented at 11:57 am on May 31, 2024: member

    [ ] Bring back onion functionality past TorV3 switch
    [ ] We need a source of non-hardcoded V3 peers, currently the only ones are hardcoded and the seeder hasn’t been updated to crawl v3 nodes yet (see crawler: Collect Tor v3 and I2P addresses? sipa/bitcoin-seeder#92)

    I think these two can be checked off post-https://github.com/bitcoin/bitcoin/pull/30008 ?


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-18 18:12 UTC
