docs: Improve netbase comments #15824

pull dongcarl wants to merge 1 commits into bitcoin:master from dongcarl:2019-04-netbase-comments changing 1 files +202 −20
  1. dongcarl commented at 8:27 PM on April 15, 2019: member

    Second in a series of PRs documenting the net stack. Contributed with sincere thanks to sipa, laanwj, and gmaxwell for providing much of the history, context, and rationale.

  2. dongcarl added the label Docs on Apr 15, 2019
  3. practicalswift commented at 7:04 AM on April 16, 2019: contributor

    Concept ACK

    Very nice!

  4. jonasschnelli commented at 7:37 AM on April 16, 2019: contributor

    Concept ACK ping @theuni

  5. instagibbs commented at 5:13 PM on April 16, 2019: member

    concept ACK!

  6. in src/netbase.cpp:177 in 8becd94b71 outdated
     165 | @@ -137,6 +166,26 @@ bool LookupHost(const char *pszName, CNetAddr& addr, bool fAllowLookup)
     166 |      return true;
     167 |  }
     168 |  
     169 | +/**
     170 | + * Resolve a service string to its corresponding service.
     171 | + *
     172 | + * @param pszName The string representing a service. Could be a name or a
    


    ariard commented at 2:10 PM on April 18, 2019:

    nit: maybe align name params and description comments to ease readability, at least some functions in others parts of code are doing it


    dongcarl commented at 3:53 PM on May 15, 2019:

    Not sure what exact alignment you mean, but my netaddress comments have been aligned like this as well. Lmk!


    ariard commented at 12:53 AM on May 16, 2019:

    Was thinking something likely to BroadcastTransaction in transaction.h, but that's a nit, doesn't matter!

  7. in src/netbase.cpp:229 in 8becd94b71 outdated
     219 | @@ -165,6 +220,16 @@ bool Lookup(const char *pszName, CService& addr, int portDefault, bool fAllowLoo
     220 |      return true;
     221 |  }
     222 |  
     223 | +/**
     224 | + * Resolve a service string with a numeric IP to its first corresponding
    


    ariard commented at 2:13 PM on April 18, 2019:

    maybe add a TODO to refactored out LookupNumeric by first-corresponding-service Lookup it seems feasible?


    laanwj commented at 4:24 PM on May 1, 2019:

    please don't add TODOs to the comments, especially not for refactors

  8. in src/netbase.cpp:314 in 8becd94b71 outdated
     316 | - * @note This function requires that hSocket is in non-blocking mode.
     317 | + * @returns An IntrRecvError indicating the resulting status of this read.
     318 | + *          IntrRecvError::OK only if all of the specified number of bytes were
     319 | + *          read.
     320 | + *
     321 | + * @note This function requires that the specified socket is in non-blocking
    


    ariard commented at 2:15 PM on April 18, 2019:

    redundancy between param hSocket and the note, maybe in param description stress on with "has to be in non-blocking mode"


    dongcarl commented at 4:01 PM on May 15, 2019:

    Done. Thanks!

  9. in src/netbase.cpp:322 in 8becd94b71 outdated
     321 | + * @note This function requires that the specified socket is in non-blocking
     322 | + *       mode.
     323 | + *
     324 | + * @see This function can be interrupted by calling InterruptSocks5(bool).
     325 | + *      Sockets can be made non-blocking with SetSocketNonBlocking(const
     326 | + *      SOCKET&, bool).
    


    ariard commented at 2:18 PM on April 18, 2019:

    nit: SetSocketNonBlocking(const SOCKET&, true), bool doesn't imply socket is blocking/non-blocking (and maybe add a TODO to refactor SetSocketNonBlocking without fNonBlocking flag, which seems useless given function name expected semantic)


    dongcarl commented at 3:18 PM on May 15, 2019:

    I wrote SetSocketNonBlocking(const SOCKET&, bool) so that Doxygen would render it and auto-link to the correct function if we ever made this function public. The name for SetSocketNonBlocking is a little confusing but I think one look at the function declaration/comment is probably enough to clear that up.


    ariard commented at 12:57 AM on May 16, 2019:

    Ah okay, need to unset non-blocking socket was removed long time ago in #4869, but maybe there could be an usage of it for future new types of socket..

  10. in src/netbase.cpp:325 in 8becd94b71 outdated
     328 |  static IntrRecvError InterruptibleRecv(uint8_t* data, size_t len, int timeout, const SOCKET& hSocket)
     329 |  {
     330 |      int64_t curTime = GetTimeMillis();
     331 |      int64_t endTime = curTime + timeout;
     332 | -    // Maximum time to wait in one select call. It will take up until this time (in millis)
     333 | +    // Maximum time to wait for I/O. It will take up until this time (in millis)
    


    ariard commented at 2:20 PM on April 18, 2019:

    nit: maximum to wait for "readyness" of any I/O operation?


    dongcarl commented at 4:01 PM on May 15, 2019:

    Done. Thanks!

  11. in src/netbase.cpp:71 in 8becd94b71 outdated
      67 | @@ -68,6 +68,8 @@ bool static LookupIntern(const char *pszName, std::vector<CNetAddr>& vIP, unsign
      68 |  
      69 |      {
      70 |          CNetAddr addr;
      71 | +        // We can't actually resolve onion addresses, return a network address
    


    ariard commented at 2:26 PM on April 18, 2019:

    ^ above, nit: maybe ParseNetwork could get also a little comment on which networks are currently supported due to addr v1, notably diff between tor and onion


    dongcarl commented at 3:59 PM on May 15, 2019:

    sipa commented at 6:17 PM on May 20, 2019:

    I think it would be clearer to point out that we treat ".onion" addresses the same way as we treat IPv4 a.b.c.d or IPv6 hex addresses, with the exception that the standard library resolving function don't support them.


    dongcarl commented at 7:38 PM on May 20, 2019:

    I think it would be clearer to point out that we treat ".onion" addresses the same way as we treat IPv4 a.b.c.d or IPv6 hex addresses

    Not 100% sure what you mean here, specifically, what is your specific meaning of "treat the same way"?


    sipa commented at 7:57 PM on May 20, 2019:

    For example "1.2.3.4" is a name, but it's one that directly encodes an IP address; no resolving is done when you turn it into a CNetAddr, it's just converted. The same is true for onion addresses; from our perspective, those are not hostnames - they're direct encodings of CNetAddr for onion addresses.

    In contrast, "sipa.be" is a hostname that needs network activity to resolve to a CNetAddr.


  12. promag commented at 10:03 AM on April 22, 2019: member

    Nice, concept ACK.

  13. DrahtBot commented at 1:44 PM on May 8, 2019: member

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    No conflicts as of last run.

  14. dongcarl force-pushed on May 15, 2019
  15. dongcarl commented at 5:02 PM on May 15, 2019: member

    Addressed comments

  16. ariard commented at 12:57 AM on May 16, 2019: member

    ACK 375d9c8

  17. theuni commented at 4:42 PM on May 16, 2019: member

    Concept ACK. Will review.

  18. in src/netbase.cpp:91 in 375d9c8d0f outdated
      78 | @@ -77,15 +79,21 @@ bool static LookupIntern(const char *pszName, std::vector<CNetAddr>& vIP, unsign
      79 |      struct addrinfo aiHint;
      80 |      memset(&aiHint, 0, sizeof(struct addrinfo));
      81 |  
      82 | +    // We want a TCP port, which is a streaming socket type
      83 |      aiHint.ai_socktype = SOCK_STREAM;
      84 |      aiHint.ai_protocol = IPPROTO_TCP;
      85 | +    // We don't care which address family (IPv4 or IPv6) is returned
      86 |      aiHint.ai_family = AF_UNSPEC;
      87 | +    // If we allow lengthy lookups of hostnames, only return addresses whose
    


    theuni commented at 5:39 PM on May 20, 2019:

    Nit: eliminate "lengthy" here, as it misrepresents the primary reason for disabling lookups (privacy, not performance).


    dongcarl commented at 5:12 PM on May 22, 2019:

    What privacy concerns are there? What call-sites are you talking about?


    sipa commented at 5:19 PM on May 22, 2019:

    DNS lookups leak your query to your DNS server (or when done over tor, to the exit node that does the resolving).


    dongcarl commented at 5:32 PM on May 22, 2019:

    @sipa If I'm understanding correctly, one such instance is the "Resolve" stage of ConnectNode where, if we have a name proxy, Lookup won't do external queries?


    sipa commented at 5:35 PM on May 22, 2019:

    If you set AI_NUMERICHOST then all getaddrinfo does is parse the IP address, and return the result immediately without network interaction.


    dongcarl commented at 5:42 PM on May 22, 2019:

    Ah, I thought that was obvious and wanted to explain why we needed AI_ADDRCONFIG for the case where fAllowLookup was true instead of just giving no flags. I should probably talk about both though.


  19. in src/netbase.cpp:101 in 375d9c8d0f outdated
      91 |      int nErr = getaddrinfo(pszName, nullptr, &aiHint, &aiRes);
      92 |      if (nErr)
      93 |          return false;
      94 |  
      95 | +    // Traverse the linked list starting with aiTrav, add all non-internal
      96 | +    // IPv4,v6 addresses to vIP while respecting nMaxSolutions.
    


    theuni commented at 5:54 PM on May 20, 2019:

    It occurs to me (because of this helpful new comment) that the addresses that we return for AF_UNSPEC lookups are hugely biased based on the order in which the OS returns them. For example, if the OS always returns ipv4 addresses before ipv6 addresses rather than interleaving, we may hit nMaxSolutions before adding a single ipv6 result.

    Similarly, it could mean that different OSs are biased differently towards ipv4 or ipv6. That's an opportunity for fingerprinting, though the complexities of ipv6 deployments probably immunize us against that.

    Without researching I'm not sure what to do, but we definitely should make sure that we understand and document the current behavior.


    theuni commented at 7:41 PM on May 20, 2019:

    Maybe @EthanHeilman has some experience with this?


    EthanHeilman commented at 8:55 PM on May 20, 2019:

    I would have assumed just reading this code that the OS would not impose an order on the addresses returned. However after @theuni's comment I see no reason why that assumption is justified. It seems entirely plausible that the OS or some upstream source would impose an order. Randomizing the order of the linked list would negate all of this.


    theuni commented at 9:01 PM on May 20, 2019:

    @EthanHeilman Because AF_UNSPEC actually results in two separate/parallel resolves internally, I would assume that for performance reasons the final list would simply be those two lists concatenated.

    Edit: Unless libc's randomize internally to harden against this type of thing.


    EthanHeilman commented at 6:18 PM on May 21, 2019:

    @theuni Even if it turns out the OS randomizes the list, that make not hold in the future and/or across OSes. If we expect the lists to be randomized in this code and it seems like we do, we should randomize the lists ourselves. Seems like a neat starter project.


    dongcarl commented at 6:49 PM on May 21, 2019:

    Having talked to TheBlueMatt about this, the only place we would care would be when loading addresses from seeds. @theuni @EthanHeilman are there specific call-sites you guys are talking about?


    EthanHeilman commented at 3:06 PM on June 6, 2019:

    I was just thinking about it with regard to seeds. Not sure what other features randomization would be important for.


    dongcarl commented at 4:55 PM on June 12, 2019:

    @EthanHeilman It occurred to me that this doesn't really matter until we decrease the nMaxIPs when learning from seeds. Right now it's 256, and no peer returns more than that, so there's no need to shuffle as there's no cut-off. Noted here: https://github.com/bitcoin/bitcoin/issues/16070

  20. in src/netbase.cpp:181 in 375d9c8d0f outdated
     176 | + *                [2001:db8:85a3:8d3:1319:8a2e:370:7348]:420)
     177 | + * @param[out] vAddr The resulting services to which the specified service string
     178 | + *                   resolved.
     179 | + * @param portDefault The default port for resulting services if not specified
     180 | + *                    by the service string.
     181 | + * @param fAllowLookup Whether or not we allow lengthy lookups by hostnames, or
    


    theuni commented at 6:04 PM on May 20, 2019:

    See above comment about "lengthy".

    Edit: Aha! You got it from the manpages ;)

    The AI_NUMERICHOST flag suppresses any potentially lengthy network host address lookups.
    

    We're more interested in the "suppresses" part than the "lengthy" part.

  21. in src/netbase.cpp:302 in 375d9c8d0f outdated
     298 | @@ -234,22 +299,29 @@ enum class IntrRecvError {
     299 |  };
     300 |  
     301 |  /**
     302 | - * Read bytes from socket. This will either read the full number of bytes requested
     303 | - * or return False on error or timeout.
     304 | - * This function can be interrupted by calling InterruptSocks5()
     305 | + * Try to read a specified number of bytes from a socket. Please read the "note"
    


    sipa commented at 6:19 PM on May 20, 2019:

    What note?


    dongcarl commented at 6:54 PM on May 22, 2019:

    Removed!

  22. in src/netbase.cpp:182 in 375d9c8d0f outdated
     177 | + * @param[out] vAddr The resulting services to which the specified service string
     178 | + *                   resolved.
     179 | + * @param portDefault The default port for resulting services if not specified
     180 | + *                    by the service string.
     181 | + * @param fAllowLookup Whether or not we allow lengthy lookups by hostnames, or
     182 | + *                     only allow parsing of numerical IPs.
    


    theuni commented at 6:20 PM on May 20, 2019:

    "only allow parsing" is dangerous here, because it strongly implies that only string parsing will be done. Additionally, onion addresses are "resolved" and getaddrinfo may still be called with AI_NUMERICHOST.

    How about something more elegant than: "fAllowLookup Whether or not hostname lookups are permitted. If yes, external queries may be performed."


    dongcarl commented at 5:56 PM on May 22, 2019:

    Perhaps instead of "external queries", it's more accurate to say "hostname lookup"? I'm thinking about the case where it's an entry in the /etc/hosts file and that's not external. After a closer reading of your statement, it is accurate :smile:


  23. sipa commented at 6:22 PM on May 20, 2019: member

    ACK

  24. in src/netbase.cpp:577 in 375d9c8d0f outdated
     575 | -    //Disable Nagle's algorithm
     576 | +    // Set the no-delay option (disable Nagle's algorithm) on the TCP socket.
     577 |      SetSocketNoDelay(hSocket);
     578 |  
     579 | -    // Set to non-blocking
     580 | +    // Set the non-blocking option on the file descriptor.
    


    theuni commented at 6:37 PM on May 20, 2019:

    Is it a socket or a file descriptor? :)


    dongcarl commented at 6:56 PM on May 22, 2019:

    Changed.

  25. in src/netbase.cpp:629 in 375d9c8d0f outdated
     624 |          int nErr = WSAGetLastError();
     625 |          // WSAEINVAL is here because some legacy version of winsock uses it
     626 |          if (nErr == WSAEINPROGRESS || nErr == WSAEWOULDBLOCK || nErr == WSAEINVAL)
     627 |          {
     628 | +            // Connection didn't actually fail, but is being established
     629 | +            // asynchronously. Thus, use async I/O api (select/poll) to check
    


    theuni commented at 7:03 PM on May 20, 2019:

    Nit: "async I/O api (select/poll) synchronously". Just so nobody goes looking for the missing magic to make this happen in the background.


    dongcarl commented at 6:57 PM on May 22, 2019:

    Changed.

  26. in src/netbase.cpp:710 in 375d9c8d0f outdated
     701 | @@ -572,6 +702,20 @@ bool GetProxy(enum Network net, proxyType &proxyInfoOut) {
     702 |      return true;
     703 |  }
     704 |  
     705 | +/**
     706 | + * Set the proxy to use for populating our peer list using a one-shot `getaddr`
    


    theuni commented at 7:31 PM on May 20, 2019:

    I believe most of this comment is incorrect, or at least very misleading :(.

    SetNameProxy makes sure that all resolves are routed through a proxy rather than being done ourselves. No need to bring the higher-level 'getaddr' into it.

    Maybe you grepped for HasNameProxy and missed the use of GetNameProxy in ConnectNode()? If so, looking at that may change your understanding somewhat.


    dongcarl commented at 5:36 PM on May 22, 2019:

    Okay this might also inform my documentation on net.cpp proper. I see that in the "Resolve" stage of ConnectNode, when we call Lookup we are guaranteed a non-external query lookup if we have a name proxy. However, if pszDest was a hostname, aforementioned Lookup would fail, and we would end up in the else if (pszDest && GetNameProxy(proxy)) case in the "Connect" stage, where we connect to the host through the proxy without doing any resolution ourselves (and delegate the resolution to the proxy). Is that a correct understanding?


    sipa commented at 5:39 PM on May 22, 2019:

    That matches what I remember about this logic. The reason is that if we're using Tor, we don't want to perform DNS lookups ourselves (as those would leak to DNS servers, instead of over the proxy). Instead, we do a name-based connection where we give the name directly as host to connect to the proxy (which in the case of Tor will cause the exit node to do the resolving locally and establish a connection with the result).


  27. theuni commented at 7:35 PM on May 20, 2019: member

    This was much-needed, thank you!

    Because this is documentation of some of our most quirky quirks, I was a little extra pedantic in my review.

  28. dongcarl force-pushed on May 22, 2019
  29. dongcarl force-pushed on May 22, 2019
  30. dongcarl commented at 7:34 PM on May 22, 2019: member

    Updated and addressed all concerns.

  31. in src/netbase.cpp:726 in 4ccd27e520 outdated
     721 | + * longer leak their external hostname queries to their DNS servers.
     722 | + *
     723 | + * @returns Whether or not the operation succeeded.
     724 | + *
     725 | + * @note Alternatively, SOCKS5 does have support for UDP-over-SOCKS5, but a DNS
     726 | + *       resolver is beyond the scope of this project.
    


    laanwj commented at 2:43 PM on June 6, 2019:

    Note that no SOCK5 server in common use, most notably Tor, actually implements UDP support. So it'd be kind of pointless to do this. The Tor socks extension spec does mention a RESOLVE command that they implement that is specific to Tor, this would be a more realistic option.


    dongcarl commented at 8:49 PM on June 10, 2019:

    @laanwj Cool, will include. I'm thinking we use @note as a place to discuss the design space we've considered and why they don't work so it's recorded somewhere for future developers. Does that good to you? We could also do this somewhere else and reference from @note.


    laanwj commented at 6:33 PM on July 15, 2019:

    Yes, sounds good to me, I was just mentioning it to prevent people from going up this road to implement this (reading it as a TODO) only to conclude it's pointless, so it makes sense to mention that.


    dongcarl commented at 6:48 PM on July 15, 2019:

    Reworded a little bit to make it more clear this is just a design space comment not a TODO.

  32. fanquake added the label Waiting for author on Jun 19, 2019
  33. docs: Improve netbase comments
    - Improve and add various Lookup* docs
    - Improve InterruptibleRecv docs
    - Improve Socks5 docs
    - Add CreateSocket docs
    - Add ConnectSocketDirectly docs
    - Add SetNameProxy docs
    - Add ConnectThroughProxy docs
    - Add LookupSubNet docs
    c7f6ce74d3
  34. dongcarl force-pushed on Jul 15, 2019
  35. dongcarl commented at 6:50 PM on July 15, 2019: member

    Addressed all concerns and rebased.

    Reviewers, you can verify that there were no substantial code changes that would invalidate my comments by invoking:

    git log 65526fc8666fef35ef908dbc225f706bef642c7e..origin/master -- src/netbase.{h,cpp}
    
  36. laanwj commented at 7:24 PM on July 15, 2019: member

    ACK c7f6ce74d3a5cf2a0c5bac20eab1efd997175a72

  37. laanwj merged this on Jul 15, 2019
  38. laanwj closed this on Jul 15, 2019

  39. laanwj referenced this in commit 6d37ed888e on Jul 15, 2019
  40. laanwj removed the label Waiting for author on Oct 24, 2019
  41. deadalnix referenced this in commit 76df4df8e6 on Jun 18, 2020
  42. vijaydasmp referenced this in commit 0b3ce66e7e on Oct 27, 2021
  43. vijaydasmp referenced this in commit 2541a51679 on Oct 28, 2021
  44. vijaydasmp referenced this in commit 1c43180754 on Oct 29, 2021
  45. vijaydasmp referenced this in commit 5eb7852322 on Oct 30, 2021
  46. vijaydasmp referenced this in commit ccaedbc4cd on Nov 2, 2021
  47. vijaydasmp referenced this in commit ce8d9055a9 on Nov 7, 2021
  48. vijaydasmp referenced this in commit f9721a578b on Nov 11, 2021
  49. vijaydasmp referenced this in commit 6df6549b11 on Nov 12, 2021
  50. vijaydasmp referenced this in commit 1a58f3b0b1 on Nov 13, 2021
  51. vijaydasmp referenced this in commit ac689184d1 on Nov 14, 2021
  52. vijaydasmp referenced this in commit dae8bdc6ea on Nov 14, 2021
  53. vijaydasmp referenced this in commit 22ef848f6e on Nov 16, 2021
  54. DrahtBot locked this on Dec 16, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-22 18:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me