net: use interruptible async getaddrinfo wrapper from libevent for DNS #27505

pull pinheadmz wants to merge 2 commits into bitcoin:master from pinheadmz:dns-libevent changing 5 files +96 −30
  1. pinheadmz commented at 9:07 PM on April 20, 2023: member

    Closes #16778

    Bitcoin uses getaddrinfo to make DNS requests for DNS seed servers and for adding peers with -addnode, -seednode and -connect. Depending on the platform this can be clunky and a system issue could prevent name resolution from completing at all, blocking the thread and in some cases preventing a clean shutdown.

    An attempt was made to switch to the asynchronous getaddrinfo_a in #4421 but that was reverted in #9229 after discovering that function has a segfault!

    Taking BlueMatt's suggestion in #10215#issue-221886328, this PR modifies our g_dns_lookup function to use evdns_getaddrinfo() from libevent. This is an asynchronous function but I've implemented it in a polling loop so it still blocks -- but now will timeout after 2 seconds.

    TODO:

    • We could make the polling loop interruptible but I'm not sure the best approach to that, since these functions don't live in a class with a flag like interruptNet
    • Is it possible to add functional tests that mess with DNS? Probably not.
    • The unit test I added makes a live DNS request for nic.com that test will fail the platform has no DNS

    Future work:

    • libevent has more low-level DNS functions as well, we could ultimately use those to (for example) request TXT records with onion addresses from our DNS seeders
  2. DrahtBot commented at 9:07 PM on April 20, 2023: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--021abf342d371248e50ceaed478a90ca-->

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #27530 (Remove now-unnecessary poll, fcntl includes from net(base).cpp by Empact)
    • #27405 (util: Use steady clock instead of system clock to measure durations by MarcoFalke)
    • #27375 (net: support unix domain sockets for -proxy and -onion by pinheadmz)
    • #27071 (Handle CJDNS from LookupSubNet() by vasild)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label P2P on Apr 20, 2023
  4. DrahtBot added the label CI failed on Apr 20, 2023
  5. dergoegge commented at 9:06 AM on April 21, 2023: member

    Afaik we currently do not expose libevent on any public facing interface (e.g. p2p) and only use it for things that aren't supposed to be exposed (e.g. rpc or rest). By using it for DNS queries we would be changing that (e.g. a malicious DNS seeder) and I'm not sure if that is the best idea given that libevent is pretty archaic (I think the MSan, ASan, TSan and LSan failures in the CI are kind of proving my point).

  6. pinheadmz commented at 11:27 AM on April 21, 2023: member

    @dergoegge I think at least some of the CI failures are memory leaks from my code, I'm going to fix that. But I hear your point about the arcane library. Any suggestions?

  7. dergoegge commented at 12:49 PM on April 21, 2023: member

    Any suggestions?

    The issue seems somewhat stale so maybe no need to fix? Judging by your last comment, you were also not able to reproduce? (https://github.com/bitcoin/bitcoin/issues/16778#issuecomment-1458423187)

  8. pinheadmz commented at 12:53 PM on April 21, 2023: member

    I was able to somewhat reproduce -- by sending all DNS requests on my machine to a blackhole resolver, I observed a 30 second pause during shutdown. But I think that 30 seconds may be platform-specific, or whatever the local getaddrinfo() implementation is.

  9. in src/netbase.cpp:115 in fd6679b8f3 outdated
     127 | +        if (++checks > 20) break;
     128 | +        std::this_thread::sleep_for(std::chrono::milliseconds(100));
     129 |      }
     130 | -    freeaddrinfo(ai_res);
     131 |  
     132 | +    thread.join();
    


    dergoegge commented at 1:34 PM on April 21, 2023:

    Wouldn't this also wait indefinitely for getaddrinfo to finish before the thread can join?


    pinheadmz commented at 1:53 PM on April 21, 2023:

    2 sec timeout waiting for the callback and then the loop will close. With my black hole dns resolver, the patch saves 30 seconds against master on the unit test in the second commit.

    It took me some work to figure out that using ..._once() to start the event loop would allow me to quit the loop manually which allows join() to succeed


    pinheadmz commented at 1:56 PM on April 21, 2023:

    (libevent does not call the platform getaddrinfo for dns requests, it has its own async methods)


    dergoegge commented at 2:26 PM on April 21, 2023:

    libevent does not call the platform getaddrinfo for dns requests, it has its own async methods

    That explains why it doesn't hang, thanks!

    But it does seem like it uses getaddrinfo in some cases: https://github.com/libevent/libevent/blob/75208132d5b7a8fff59ca3bf47253179ec314951/evutil.c#L1686


    pinheadmz commented at 3:22 PM on April 21, 2023:

    Yeah I think thats just to decode numeric addresses like "100.200.30.40" etc. It looked to me like if an actual DNS request is necessary it has its own methods to make reqests directly to the nameservers provided by the platform? When things get "serious":

    https://github.com/libevent/libevent/blob/master/evdns.c#L5623

  10. pinheadmz force-pushed on Apr 26, 2023
  11. net: use interruptible async getaddrinfo wrapper from libevent for DNS c3ecbf6a8a
  12. test: cover DNS lookups with new libevent method c94962d700
  13. pinheadmz force-pushed on Apr 26, 2023
  14. fanquake commented at 1:30 PM on May 2, 2023: member

    In general, I'm ~0 on leaning into further libevent usage, especially for something like this. The upstream code is currently not necessarily very well maintained or tested. There is also at least one open issue which reports evdns_getaddrinfo just "crashing occasionally": https://github.com/libevent/libevent/issues/1130. Although it's not entirely clear if this is user error.

  15. dergoegge commented at 2:17 PM on May 2, 2023: member

    As an alternative to this PR, could we give the DNS thread time to join cleanly (e.g. 5s) and if it doesn't we just detach it and let the OS handle the clean up? That risks memory leaks but that shouldn't really matter when the program is about to exit anyway.

    I would also be fine with not addressing this at all, because it seems like this only happens when a system generally fails to make DNS requests? In the absence of nice solutions, it seems like that isn't our problem and the user should fix their system instead.

  16. pinheadmz commented at 2:50 PM on May 2, 2023: member

    @fanquake @dergoegge good feedback, thanks. I'll look into a different approach moving the lib call getaddrinfo to a detachable thread before abandoning, and then we may just have to close #16778 as "won't fix"

    There is another thought I had about name resolution moving forward: If bitcoin can make more interesting DNS queries (either using a library or just implementing some bare necessities) we could ask the DNS seeders for I2P and onion addresses as TXT records

  17. fanquake commented at 2:52 PM on May 2, 2023: member

    either using a library

    It's very unlikely we are going to add a new external dependency to make "interesting" DNS queries.

  18. pinheadmz commented at 6:29 PM on May 2, 2023: member

    Closing this now for alternative in #27557 which just calls getaddrinfo() from a detachable thread. No libevent at all and, yeeeeeesh it is way simpler.

  19. pinheadmz closed this on May 2, 2023

  20. bitcoin locked this on May 1, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-15 15:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me