net, rpc: expose connection type in getpeerinfo #19883

pull jonatack wants to merge 1 commits into bitcoin:master from jonatack:getpeerinfo-conn-type changing 4 files +13 −1
  1. jonatack commented at 3:33 pm on September 5, 2020: member

    Expose conn_type via a practical API for JSON-RPC consumers.

    • returns the conn_type as an integer id for API clients. It is a simple and small change to implement and maintain, and the API can remain stable even if the ConnectionType element naming or order changes.
    • adds a uint8_t type to the ConnectionType enum class; if preferred, this can be dropped
     0$ ./src/bitcoin-cli help getpeerinfo
     1...
     2    "inbound" : true|false,        (boolean) Inbound (true) or Outbound (false)
     3    "addnode" : true|false,        (boolean) Whether connection was due to addnode/-connect or if it was an automatic/inbound connection
     4    "conn_type" : n,               (numeric) Connection type between 0 and 5:
     5                                   0 - inbound (initiated by the peer)
     6                                   1 - outbound-full-relay (default automatic connections)
     7                                   2 - manual (added using the -addnode/-connect configuration options or the addnode RPC)
     8                                   3 - feeler (short-lived automatic connection to test addresses)
     9                                   4 - block-relay-only (does not relay transactions or addresses)
    10                                   5 - addr-fetch (short-lived automatic connection to request addresses)
    
  2. in src/rpc/net.cpp:135 in 0d24183e71 outdated
    131@@ -113,6 +132,8 @@ static UniValue getpeerinfo(const JSONRPCRequest& request)
    132                             {RPCResult::Type::STR, "subver", "The string version"},
    133                             {RPCResult::Type::BOOL, "inbound", "Inbound (true) or Outbound (false)"},
    134                             {RPCResult::Type::BOOL, "addnode", "Whether connection was due to addnode/-connect or if it was an automatic/inbound connection"},
    135+                            {RPCResult::Type::NUM, "conn_type", "Connection type between 0 and 5:\n" + ConnTypeList()},
    


    promag commented at 5:09 pm on September 5, 2020:
    Why does this matter?

    jonatack commented at 5:19 pm on September 5, 2020:
    To be sure I’m understanding, the suggestion is to drop “between 0 and 5”?

    promag commented at 5:34 pm on September 5, 2020:
    No, I’m asking why is this is needed.

    jonatack commented at 5:39 pm on September 5, 2020:
    It’s explained in the PR description.
  3. DrahtBot added the label P2P on Sep 5, 2020
  4. DrahtBot added the label RPC/REST/ZMQ on Sep 5, 2020
  5. jonatack force-pushed on Sep 5, 2020
  6. promag commented at 10:05 am on September 6, 2020: member
    I’m not sure if it’s good idea to return enum’s int value. If the enum is changed/refactored clients get broken. IMO string is fine and enough.
  7. jonatack commented at 10:40 am on September 6, 2020: member

    I’m not sure if it’s good idea to return enum’s int value. If the enum is changed/refactored clients get broken. IMO string is fine and enough.

    If a string was returned, changing the string name would be a breaking change, and string names are likely to be bike-shed or changed. OTOH there is no reason why the enum integer values would ever have to change. The enum class itself can separate the int values from the rest with no need for extra methods; it can remain stable even if the enum element naming or order is otherwise changed. And the integer is the easiest data for API clients to parse, bounds check and use as an index into a data structure.

  8. in src/rpc/net.cpp:218 in 515e30d1a4 outdated
    213@@ -193,6 +214,8 @@ static UniValue getpeerinfo(const JSONRPCRequest& request)
    214         obj.pushKV("subver", stats.cleanSubVer);
    215         obj.pushKV("inbound", stats.fInbound);
    216         obj.pushKV("addnode", stats.m_manual_connection);
    217+        obj.pushKV("conn_type", stats.m_conn_type);
    218+        obj.pushKV("conn_type_name", std::get<1>(CONN_TYPES.at(stats.m_conn_type)));
    


    promag commented at 10:52 am on September 6, 2020:
    This is not safe. If you shuffle the enum then this returns bad values.

    jonatack commented at 11:33 am on September 6, 2020:
    Dropped this field.
  9. in src/rpc/net.cpp:33 in 515e30d1a4 outdated
    28@@ -29,6 +29,25 @@
    29 
    30 #include <univalue.h>
    31 
    32+static const std::vector<std::tuple<int, const std::string, const std::string>> CONN_TYPES{
    33+    std::make_tuple(0, "inbound", "initiated by the peer"),
    


    promag commented at 10:53 am on September 6, 2020:
    I think this is complicated and fragile. This assumes 0 to be ConnectionType::INBOUND and if the enum is changed then this brakes.

    jonatack commented at 11:33 am on September 6, 2020:
    Removed this. Thanks for the feedback.
  10. promag commented at 10:53 am on September 6, 2020: member

    I think nobody cares about the int value, if that was the case we wouldn’t use enum class. Sending int and have this documented is fine too (but the redundant string is then unnecessary) but the way it is implemented doesn’t look safe, mainly because not all conn_type are tested.

    This code would be much simpler, and IMO safer, if you just add std::map<ConnectionType, std::string> CONNECTION_TYPE_NAME, CONNECTION_TYPE_DESCRIPTION.

  11. jonatack force-pushed on Sep 6, 2020
  12. net, rpc: expose connection type in getpeerinfo bd2aa75c33
  13. jonatack force-pushed on Sep 6, 2020
  14. jonatack commented at 11:46 am on September 6, 2020: member

    I think nobody cares about the int value

    Exactly – no one caring about the int value is an advantage. The enum class int values can remain stable for the API even if people want to change the enum element naming or order.

    Sending int and have this documented is fine too (but the redundant string is then unnecessary)

    Thanks for reviewing and making this better, @promag. Dropped the string field and put the doc directly in RPCHelpMan.

  15. promag commented at 9:44 pm on September 6, 2020: member
    Ideally this needs a test for all possible conn_type before merge. You can also add a release note of the new field.
  16. jnewbery commented at 9:57 am on September 7, 2020: member
    I prefer #19725. We shouldn’t leak our internal enum indexes out to a public API (since that locks us into a specific implementation)
  17. jonatack commented at 10:10 am on September 7, 2020: member

    since that locks us into a specific implementation

    If the ConnectionType enum were to be abandoned for a hypothetical different implementation, it would just require adding a method to serialise the ids. For now that’s not needed.

  18. promag commented at 10:12 am on September 7, 2020: member
    @jnewbery I’ve made that point too, but you can also assume that ATM the mapping function used is the identity function.
  19. amitiuttarwar commented at 4:22 am on September 8, 2020: contributor

    this PR isn’t quite a replacement for #19725 because it doesn’t include the deprecation of getpeerinfo.addnode, or the logging improvement in net_processing.

    I think the proposal of adding conn_type as an integer id is a reasonable proposal, but wanted to clarify the differences for reviewers. I’m personally -0 because I think a string suffices.

  20. laanwj commented at 7:36 am on September 8, 2020: member

    I think the proposal of adding conn_type as an integer id is a reasonable proposal, but wanted to clarify the differences for reviewers. I’m personally -0 because I think a string suffices.

    I agree here that a string suffices. I don’t think it’s wise to expose the enumeration IDs on the JSON-RPC interface, as they are an internal implementation detail, and for better or worse (no real enum type) the common way is to use strings as enumerators in JSON.

  21. jonatack commented at 9:29 am on September 8, 2020: member

    By any objective technical criteria, ersatz long-format string ids in the place of integer ones seem a substantially worse choice

    • code
    • complexity
    • robustness
    • API stability
    • API flexibility for clients
    • memory
    • speed
    • network bandwidth
    • maintainability

    in every way maybe an order of magnitude worse.

    An API client can bounds check an integer id, then call a vector element with it.

    With a long format string, the client has to match against every possible expected value first in order to error check the value.

    This is a lean, clean implementation that adds 2 lines to net.{h,cpp} versus 30.

    Overall, I think it’s objectively multiple times better for both Bitcoin Core and for software clients of the RPC API.

    Also, this unbundles #19725 which adds extraneous logging refactoring and a controversial deprecation into the same PR. Just unbundling it adds value before considering the order of magnitude technical improvement.

  22. jonatack commented at 9:33 am on September 8, 2020: member
    The ids can be serialised via a separate method, but that doesn’t seem needed here and would just be added complexity for no gain. The enum itself can separate the order and naming from the id values.
  23. jonatack commented at 9:48 am on September 8, 2020: member

    this PR isn’t quite a replacement for #19725 because it doesn’t include the deprecation of getpeerinfo.addnode, or the logging improvement in net_processing.

    Yes, it’s not intended to replace the logging refactoring or the deprecation. I’m -0.9 on both for the reasons I’ve stated in that PR.

  24. ajtowns commented at 7:13 am on September 19, 2020: member

    If a string was returned, changing the string name would be a breaking change, and string names are likely to be bike-shed or changed. OTOH there is no reason why the enum integer values would ever have to change

    The addition to getpeerinfo and the logging change is for helping humans understand what’s going on, not for interoperability (there’s no standard and different behaviours within the existing specs are perfectly reasonably), nor for command and control (there’s no way for other programs to act on the connection type info), so no, changing the string names isn’t a breaking change. There are obvious reasons why enum values change: if an entry is removed, or if the entries are rearranged. Yes, you can hardcode the values to prevent that, but there’s no reason to do so: this isn’t a standard, it’s an aid for debugging problems with your node.

    This is a small, focused, simple, performant alternative

    getpeerinfo isn’t a performance critical call, outputting strings via it isn’t performance critical (both since all the numbers are encoded as strings anyway – it’s json; and since we’re already decoding services to an array of strings via servicenames), and even if none of that were true, you should be providing benchmarks if claiming a performance improvement.

    JSON-RPC expects integer ids.

    I don’t know where this is coming from, but it’s not even literally true: “id - The request id. This can be of any type.” and “id .. MUST contain a String, Number, or NULL value if included”

  25. sipa commented at 7:33 am on September 19, 2020: member
    There is precedent for string values for enumerated types too already: the transaction types in listtransactions (“receive”, “send”, “generate”, “immature”, …), and the branch types in getchaintips come to mind.
  26. DrahtBot commented at 1:54 pm on September 19, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #20002 (net, rpc, cli: expose GetNetClass()/ConnectedViaTor() in getpeerinfo, use in -netinfo by jonatack)
    • #19877 ([test] clarify rpc_net & p2p_disconnect_ban functional tests by amitiuttarwar)
    • #19829 (net processing: Move block inventory state to net_processing by jnewbery)
    • #19776 (net, rpc: expose high bandwidth mode state via getpeerinfo by theStack)
    • #19725 ([RPC] Add connection type to getpeerinfo, improve logs by amitiuttarwar)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  27. jonatack commented at 10:06 am on September 20, 2020: member

    If a string was returned, changing the string name would be a breaking change, and string names are likely to be bike-shed or changed. OTOH there is no reason why the enum integer values would ever have to change

    The addition to getpeerinfo and the logging change

    Unless I’m mistaken, the logging change would have the same output.

    is for helping humans understand what’s going on

    From what I’ve been able to understand, the CLI client-side options are the human-first ones and not constrained by API stability constraints, which is why I added features to -getinfo and also created -netinfo that has been described as getpeerinfo for humans. The RPC API, on the other hand, is more-or-less machine-first and constrained by stability and the desire to avoid causing suffering for software clients downstream, even if it does not for human users of the RPC like us. That is what this PR focuses on. An additional human-friendly convenience field could be an option if people feel the tradeoffs are worth it, but that’s orthogonal to this proposal.

    changing the string names isn’t a breaking change

    Renaming a connection type not only entails cascading codebase changes for long-format name ids, it is also a breaking API change if the client no longer recognizes the connection type. This can be avoided by using a standard integer id decoupled from the naming. I will provide a demonstration with code a bit later.

    getpeerinfo isn’t a performance critical call

    This field is called in a loop. API clients may be interested in performance, either because they call this frequently or at high frequency (I do) or from clients/to servers that are constrained in CPU, memory, or internet bandwidth. If it’s available with a couple of lines, why dunk on it?

    all the numbers are encoded as strings anyway – it’s json

    I could be wrong but don’t think this is true. https://tools.ietf.org/html/rfc7159

    and even if none of that were true, you should be providing benchmarks

    Requesting benchmarks for the simplest, standard practice is a bit pedantic. No one asked for benchmarks, for instance, when in #19731 I proposed to send Unix epoch times instead of a human-friendly datetime format.

    I don’t know where this is coming from, but it’s not even literally true: “id - The request id. This can be of any type.” and “id .. MUST contain a String, Number, or NULL value if included”

    Thanks for confirming that we can use numbers. Now…do you see any long-format, non-numerical string ids in https://www.jsonrpc.org/specification#examples and https://www.jsonrpc.org/specification_v1#a4.CommunicationExamples? Right, neither do I (maybe I need glasses; I have been putting it off for a long time). Sure, you could use ersatz long-format strings, but ideally not as your primary id. “I want to depend on long-format names that people are already proposing to change, please send me that” said no API client to me ever. However, the clients might be ok with it as an optional convenience field that they can request on a per-call basis.

    At any rate, thank you for having a look. Mind giving a concept ACK?

  28. jonatack commented at 10:15 am on September 20, 2020: member

    There is precedent for string values for enumerated types too already: the transaction types in listtransactions (“receive”, “send”, “generate”, “immature”, …), and the branch types in getchaintips come to mind.

    Sure, but like the kebab-case and snake_case config args, precedents may have varying degrees of desirability and relevance.

    I wonder if we shouldn’t have more separation between the API for software and the CLI for humans (seems to be a trend already?)… e.g. the human-friendly CLI versions could have human-readable datetime formats instead of Unix epoch time, etc.

  29. michaelfolkson commented at 11:29 am on September 20, 2020: contributor

    Approach ACK

    I can see why some people think this argument is a touch pedantic. But at the very least this isn’t a worse approach than #19725 and it is definitely simpler. Plus there appears there could be benefits to this approach downstream.

    (PR #19725 could still do the logging refactoring and deprecation. I’m not convinced these are controversial.)

  30. sipa commented at 1:49 am on September 26, 2020: member

    Concept NACK.

    I don’t think we should be exposing internal enums, as its mapping between numbers and connection type semantics is arbitary. Exposing it via RPC is cementing it in stone, for no good reason. The set of available connection types will change over time, and that will very likely mean that some types that currently exist won’t remain.

    As I’ve pointed out #19725 (comment), if you want an actual stable mapping with the advantages of machine-readability of numbers over the strings assigned by #19725, I think you’d need to maintain a separate set of numbers for the RPC interface, in which old numbers may retire if connection types are removed/split or even just substantially change meaning. I don’t think that’s worth the effort.

  31. DrahtBot added the label Needs rebase on Sep 26, 2020
  32. DrahtBot commented at 5:21 pm on September 26, 2020: member

    🐙 This pull request conflicts with the target branch and needs rebase.

    Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

  33. jonatack closed this on Sep 28, 2020

  34. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 13:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me