p2p, rpc, test: expose tried and refcount in getnodeaddresses, update/improve tests #23035

pull jonatack wants to merge 5 commits into bitcoin:master from jonatack:getnodeaddresses-tried-and-reference_count changing 9 files +65 −41
  1. jonatack commented at 2:39 pm on September 19, 2021: member

    Based on #22831, this pull implements the suggestion in #22831 (comment) to extend RPC getnodeaddresses to return whether the address is in the tried or new table, and if it’s in the new table, its refcount. Doing so allows us to better test the contents of the addrman using a public interface instead of relying on assert_debug_log in the functional tests. The pull then updates the getnodeaddresses and addpeeraddress test coverage, followed by improving the relevant asmap test coverage.

    Credit to John Newbery for the idea.

  2. DrahtBot added the label P2P on Sep 19, 2021
  3. DrahtBot added the label RPC/REST/ZMQ on Sep 19, 2021
  4. DrahtBot commented at 3:22 pm on September 19, 2021: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #23053 ([fuzz] Use public methods in addrman fuzz tests by jnewbery)
    • #22950 ([p2p] Pimpl AddrMan to abstract implementation details by amitiuttarwar)
    • #22872 (log: improve checkaddrman logging with duration in milliseconds by jonatack)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  5. jnewbery commented at 3:56 pm on September 19, 2021: member

    Concept ACK, approach NACK. I don’t think it makes sense to add this addrman-specific data into the CAddress class, which is used by many different components.

    Two ways to expose them are (1) move them from CAddrInfo to CAddress or (2) use CAddrInfo instead of CAddress for the GetAddr_(), GetAddr() and GetAddresses() functions and callers.

    I don’t think we want to do (2) either. In general, we’re trying to encapsulate addrman more. CAddrInfo is an addrman implementation-specific class, so ideally it wouldn’t be exposed at all in the header. That’s done in #22950, which introduces a very clear compilation firewall between implementation and public interface.

    I think the best way to do this is update CAddrMan::GetAddr() to return a vector of tuples, std::vector<std::tuple<CAddress, /* tried */ bool, /* refcount*/ int>>, similar to how Select() is updated to return a std::pair<> in https://github.com/bitcoin/bitcoin/pull/22950/files#diff-164bd9e2e30f54d0a79eb7cc372309e2f2155edc6c3f051290ab078f03f6a771R89. That means that the public method is returning only what the caller needs, minimizing the coupling between implementation and client code.

  6. jonatack commented at 4:01 pm on September 19, 2021: member
    Thanks @jnewbery. I agree the two approaches aren’t ideal. Will have a look at your suggested approach.
  7. mzumsande commented at 11:24 am on September 20, 2021: member

    I think the best way to do this is update CAddrMan::GetAddr() to return a vector of tuples, std::vector<std::tuple<CAddress, /* tried */ bool, /* refcount*/ int>>

    A vector of std::pair might be sufficient, because nRefCount==0 means it’s in Tried.

  8. jonatack commented at 11:37 am on September 20, 2021: member

    A vector of std::pair might be sufficient, because nRefCount==0 means it’s in Tried.

    Good point, though I’m not sure we want to depend on that coupling.

    (Ideally I’ll try to use a struct as it’s nicer to work with.)

  9. jonatack force-pushed on Sep 20, 2021
  10. jnewbery commented at 5:07 pm on September 20, 2021: member

    Ideally I’ll try to use a struct as it’s nicer to work with.

    I’m curious why that is. I would have thought that using structured bindings to unpack a returned tuple would be just as easy: https://en.cppreference.com/w/cpp/language/structured_binding#Case_2:_binding_a_tuple-like_type.

    Another possible approach would be to add a new public method GetAddressesWithRefcounts() that could be used by the RPC code. That would avoid the need to update the CConnman and PeerManager code.

  11. jonatack commented at 5:32 pm on September 20, 2021: member
    Pushed an update that doesn’t touch CAddress or CAddrInfo while attempting to be clear and simple.
  12. amitiuttarwar commented at 5:32 pm on September 20, 2021: contributor

    A vector of std::pair might be sufficient, because nRefCount==0 means it’s in Tried.

    Good point, though I’m not sure we want to depend on that coupling.

    I don’t think there would be much logic coupling with a clearly documented function that returns pairs of < CAddress, multiplicity_on_new > with 0 for the second field indicating that the address is on the tried table.

    Another possible approach would be to add a new public method GetAddressesWithRefcounts() that could be used by the RPC code. That would avoid the need to update the CConnman and PeerManager code.

    I like the idea of minimizing impact to the p2p code paths for this test-only functionality.

  13. jonatack commented at 6:02 pm on September 20, 2021: member
    Hm, this isn’t urgent, so it might be an idea to try the new public method variant based on/built on #22950 to be sure it’s compatible.
  14. jnewbery commented at 8:39 am on September 21, 2021: member

    Hm, this isn’t urgent, so it might be an idea to try the new public method variant based on/built on #22950 to be sure it’s compatible.

    As long as you’re just adding a new public method, it’s compatible. @amitiuttarwar will need to update the header file to include that public method in the interface class declaration if this gets merged first, but I don’t think that’s any more/less work than merging the pimpl refactor first and then adding the public method.

    I’m happy to have a first attempt at implementing my suggestion if that’d be helpful to you.

  15. jonatack commented at 12:31 pm on September 21, 2021: member

    Rebased after merge of #22831.

    Structs/classes: I guess I like the friendly, readable built-in access to members via named methods, and their flexibility and adaptability to changing future requirements, like adding/removing members.

    I’m happy to have a first attempt at implementing my suggestion if that’d be helpful to you.

    SGTM! can pull it in. With that alternative, maybe we could also remove the std::optional network arg from the existing GetAddr/GetAddresses functions.

  16. jonatack force-pushed on Sep 21, 2021
  17. jnewbery commented at 1:08 pm on September 21, 2021: member

    SGTM! can pull it in.

    Here you go: https://github.com/jnewbery/bitcoin/tree/2021-09-getaddrwithrefcount. The first two commits add the new GetAddrWithRefcount() method. Let me know what you think!

    With that alternative, maybe we could also remove the std::optional network arg from the existing GetAddr/GetAddresses functions.

    Great idea! I added that as the final commit in the branch above.

  18. jonatack commented at 1:41 pm on September 21, 2021: member
    Thanks! That is helpful. I see that I was misreading the previous suggestion.
  19. DrahtBot commented at 10:40 am on September 23, 2021: member

    🐙 This pull request conflicts with the target branch and needs rebase.

    Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

  20. DrahtBot added the label Needs rebase on Sep 23, 2021
  21. amitiuttarwar commented at 0:11 am on September 24, 2021: contributor

    I’m uncertain about this concept (adding table & refcount in getnodeaddresses), I don’t currently have a strong opinion, so am just sharing some thoughts & tradeoffs:

    • From the POV of a user, I can see how this additional information could be interesting. Since the getnodeaddresses endpoint is already essentially for super users or clients utilizing addrman, it seems reasonable to add this implementation-specific information.
    • For functional tests, I don’t think it makes sense to regularly test the internal storage mechanism of addrman. That seems better suited for unit tests. So, I’m not sure this is adding much value. If the root issue is expectations around what is on new vs tried because of collisions, adding a way to construct a deterministic addrman (eg. as suggested in #22831 (review)) seems like a more robust solution.
    • For the unit tests, simply returning the address table & multiplicity if on new seems insufficient to support the current tests with a smaller interface, because several of the tests require the specific bucket & position of the address.
    • For the fuzz tests, specifically addrman_serdeser, this more informational endpoint could potentially be used to replace the operator==() functionality. That would be nice to simplify the interface, but reduce the thoroughness of the check, and require multiplicity + collisions to detect any inconsistencies in bucket hashing logic. Since the majority of the time in the fuzz test is spent filling an addrman, I think more thorough checks would be preferable.

    So overall, I think the biggest benefit to this added functionality would be for superusers or clients using the RPC interface to observe or build on addrman. Additionally, there seems to be some marginal benefit for the functional tests. I am uncertain if these benefits outweigh the maintenance costs of the code.

    Many thanks to @mzumsande for collaborating to form these thoughts :)

  22. p2p: create AddrMan namespace, Address struct, Addresses type
    for GetAddr(), CConnMan::GetAddresses() and RPC getnodeaddresses
    ae637372eb
  23. p2p: use AddrMan::Addresses in GetAddr/GetAddresses/getnodeaddresses e3a3552a10
  24. rpc: expose tried and reference_count fields in getnodeaddresses 1febfb90bf
  25. test: update coverage for getnodeaddresses, addpeeraddress
    with the new getnodeaddresses "tried" and "reference_count" fields
    529e63f7f6
  26. test: improve asmap feature tests with getnodeaddresses#tried ad3c725a0c
  27. jonatack force-pushed on Oct 2, 2021
  28. jonatack marked this as a draft on Oct 2, 2021
  29. jonatack commented at 11:02 am on October 5, 2021: member
    It may also be handy for getnodeaddresses to return all addresses, including IsTerrible() ones, along with a boolean field indicating whether the address is terrible or not. Along with the other feedback here, it may be a good idea for getnodeaddresses to have its own GetAddr-like function that is separate from the one used by internal components.
  30. jnewbery commented at 10:31 am on December 22, 2021: member
    This overlaps with #23826. It’s not completely redundant, since that PR doesn’t expose the tried/new and multiplicity through a public API, but it does add a method to fetch that data that can be used by the unit tests. I think that’s probably enough, since addrman is now well-contained enough that it can be fully tested through the unit tests. WDYT @jonatack?
  31. fanquake commented at 9:51 am on April 20, 2022: member
    What is the state of this change? It seems like it could have been (at least somewhat) obsoleted by other now-merged changes.
  32. jonatack commented at 11:34 am on April 20, 2022: member
    Will update or close after reviewing the changes.
  33. fanquake commented at 9:05 am on May 4, 2022: member
    Feel free to reopen once you’ve gotten a chance to look at this again.
  34. fanquake closed this on May 4, 2022

  35. DrahtBot locked this on May 4, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-04 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me