rpc: add ‘getnetmsgstats’ RPC #28926

pull willcl-ark wants to merge 9 commits into bitcoin:master from willcl-ark:2023-07-getnetmsgstats changing 13 files +728 −13
  1. willcl-ark commented at 12:37 pm on November 22, 2023: contributor

    This picks up #27534 with a rebase and a few fixups, now squashed. See #27534 for more background information.

    Considerations from the previous draft:

    Should we allow dimensions to be rearraged?

    When it comes time to gather up the RPC results, @vasild has provided an alternate implementation that uses an array instead of the MultiMap structure. This results in two changes:

    using the stack over the heap (yay!) enforcing a strict ordering of dimensions (direction, network, connection type, message type)

    Aside from being good for memory, the reasoning here is allowing users to rearrange dimensions may be too confusing. I personally really like the ability to rearrange dimensions, which is why I have not integrated that solution into this PR, but am open to whatever the larger community prefers :) Link

    ^ I have looked at the alternative implementation (thanks @vasild!), but prefer the current version myself so long as the performance is not an issue. I’ve left this as-is for now. If folks here think that it might be an issue then I can investigate the alternative further. See the next consideration for additional reasoning on sticking with the current approach.

    Now that an effort has been made to remove the friendship between CConnman and CNode (see this PR: #27257) I’m unsure if this (recording network stats on a CConnman object) belongs here. Link

    ^ These changes make @valid’s alternative approach a bit harder to rebase and AB test with the current approach, as it relied on access to the CNode msg queue, and #27257 included the change: “CNode’s message processing queue is made private in the process and its mutex is turned into a non-recursive mutex.” It can be worked around, but I’m not convinced it will be worth it.

    this function should handle if index > NUM_NET_MESSAGE_TYPES Link

    ^ I took this suggestion as part in 800ec2ac762cb48ac667cdf04de263634404569e

    #27534 (review)

    ^ I refactored this array in c909dc5704465aa96ce4496ed14452719e0b7860 to make it more comprehensible in the source code.

    Additional considerations from OP:

    • The code organisation was mainly taken care of by @satsie, with src/netstats.{h|cpp} now code from the netstats namespace instead of being contained within net.cpp
  2. DrahtBot commented at 12:37 pm on November 22, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK brunoerg, amitiuttarwar, Sjors, vasild, mzumsande, kristapsk

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29421 (net: make the list of known message types a compile time constant by vasild)
    • #29418 (rpc: provide per message stats for global traffic via new RPC ‘getnetmsgstats’ by vasild)
    • #27770 (Introduce ‘getblockfileinfo’ RPC command by furszy)
    • #27114 (p2p: Allow whitelisting manual connections by brunoerg)
    • #19463 (Prune locks by luke-jr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label RPC/REST/ZMQ on Nov 22, 2023
  4. in src/protocol.cpp:97 in 0341129ed1 outdated
    91@@ -90,6 +92,10 @@ const static std::vector<std::string> g_all_net_message_types{
    92     NetMsgType::SENDTXRCNCL,
    93 };
    94 
    95+static_assert(NUM_NET_MESSAGE_TYPES == sizeof(allNetMessageTypes) / sizeof(allNetMessageTypes[0]), "Please update NUM_NET_MESSAGE_TYPES");
    96+
    97+const static std::vector<std::string> allNetMessageTypesVec(std::begin(allNetMessageTypes), std::end(allNetMessageTypes));
    


    vasild commented at 2:40 pm on November 22, 2023:

    All these plus the NUM_NET_MESSAGE_TYPES constant can be avoided if the list of all message types is turned into a std::array. Consider this:

      0diff --git a/src/net.cpp b/src/net.cpp
      1index a2f80cbcf7..133abae117 100644
      2--- a/src/net.cpp
      3+++ b/src/net.cpp
      4@@ -3670,13 +3670,13 @@ CNode::CNode(NodeId idIn,
      5       nLocalHostNonce{nLocalHostNonceIn},
      6       m_recv_flood_size{node_opts.recv_flood_size},
      7       m_i2p_sam_session{std::move(node_opts.i2p_sam_session)}
      8 {
      9     if (inbound_onion) assert(conn_type_in == ConnectionType::INBOUND);
     10 
     11-    for (const std::string &msg : getAllNetMessageTypes())
     12+    for (const auto& msg : g_all_net_message_types)
     13         mapRecvBytesPerMsgType[msg] = 0;
     14     mapRecvBytesPerMsgType[NET_MESSAGE_TYPE_OTHER] = 0;
     15 
     16     if (fLogIPs) {
     17         LogPrint(BCLog::NET, "Added connection to %s peer=%d\n", m_addr_name, id);
     18     } else {
     19diff --git a/src/protocol.cpp b/src/protocol.cpp
     20index 27a0a2ffc1..0b22cc47aa 100644
     21--- a/src/protocol.cpp
     22+++ b/src/protocol.cpp
     23@@ -46,53 +46,12 @@ const char* CFHEADERS = "cfheaders";
     24 const char* GETCFCHECKPT = "getcfcheckpt";
     25 const char* CFCHECKPT = "cfcheckpt";
     26 const char* WTXIDRELAY = "wtxidrelay";
     27 const char* SENDTXRCNCL = "sendtxrcncl";
     28 } // namespace NetMsgType
     29 
     30-/** All known message types. Keep this in the same order as the list of
     31- * messages above and in protocol.h.
     32- */
     33-const static std::vector<std::string> g_all_net_message_types{
     34-    NetMsgType::VERSION,
     35-    NetMsgType::VERACK,
     36-    NetMsgType::ADDR,
     37-    NetMsgType::ADDRV2,
     38-    NetMsgType::SENDADDRV2,
     39-    NetMsgType::INV,
     40-    NetMsgType::GETDATA,
     41-    NetMsgType::MERKLEBLOCK,
     42-    NetMsgType::GETBLOCKS,
     43-    NetMsgType::GETHEADERS,
     44-    NetMsgType::TX,
     45-    NetMsgType::HEADERS,
     46-    NetMsgType::BLOCK,
     47-    NetMsgType::GETADDR,
     48-    NetMsgType::MEMPOOL,
     49-    NetMsgType::PING,
     50-    NetMsgType::PONG,
     51-    NetMsgType::NOTFOUND,
     52-    NetMsgType::FILTERLOAD,
     53-    NetMsgType::FILTERADD,
     54-    NetMsgType::FILTERCLEAR,
     55-    NetMsgType::SENDHEADERS,
     56-    NetMsgType::FEEFILTER,
     57-    NetMsgType::SENDCMPCT,
     58-    NetMsgType::CMPCTBLOCK,
     59-    NetMsgType::GETBLOCKTXN,
     60-    NetMsgType::BLOCKTXN,
     61-    NetMsgType::GETCFILTERS,
     62-    NetMsgType::CFILTER,
     63-    NetMsgType::GETCFHEADERS,
     64-    NetMsgType::CFHEADERS,
     65-    NetMsgType::GETCFCHECKPT,
     66-    NetMsgType::CFCHECKPT,
     67-    NetMsgType::WTXIDRELAY,
     68-    NetMsgType::SENDTXRCNCL,
     69-};
     70-
     71 CMessageHeader::CMessageHeader(const MessageStartChars& pchMessageStartIn, const char* pszCommand, unsigned int nMessageSizeIn)
     72     : pchMessageStart{pchMessageStartIn}
     73 {
     74     // Copy the command name
     75     size_t i = 0;
     76     for (; i < COMMAND_SIZE && pszCommand[i] != 0; ++i) pchCommand[i] = pszCommand[i];
     77@@ -175,17 +134,12 @@ std::string CInv::ToString() const
     78         return strprintf("%s %s", GetCommand(), hash.ToString());
     79     } catch(const std::out_of_range &) {
     80         return strprintf("0x%08x %s", type, hash.ToString());
     81     }
     82 }
     83 
     84-const std::vector<std::string> &getAllNetMessageTypes()
     85-{
     86-    return g_all_net_message_types;
     87-}
     88-
     89 /**
     90  * Convert a service flag (NODE_*) to a human readable string.
     91  * It supports unknown service flags which will be returned as "UNKNOWN[...]".
     92  * [@param](/bitcoin-bitcoin/contributor/param/)[in] bit the service flag is calculated as (1 << bit)
     93  */
     94 static std::string serviceFlagToStr(size_t bit)
     95diff --git a/src/protocol.h b/src/protocol.h
     96index e405253632..2a6b85dea5 100644
     97--- a/src/protocol.h
     98+++ b/src/protocol.h
     99@@ -264,14 +264,50 @@ extern const char* WTXIDRELAY;
    100  * The salt is used to compute short txids needed for efficient
    101  * txreconciliation, as described by BIP 330.
    102  */
    103 extern const char* SENDTXRCNCL;
    104 }; // namespace NetMsgType
    105 
    106-/* Get a vector of all valid message types (see above) */
    107-const std::vector<std::string>& getAllNetMessageTypes();
    108+/** All known message types (see above). Keep this in the same order as the list of messages above. */
    109+static const std::array g_all_net_message_types{
    110+    NetMsgType::VERSION,
    111+    NetMsgType::VERACK,
    112+    NetMsgType::ADDR,
    113+    NetMsgType::ADDRV2,
    114+    NetMsgType::SENDADDRV2,
    115+    NetMsgType::INV,
    116+    NetMsgType::GETDATA,
    117+    NetMsgType::MERKLEBLOCK,
    118+    NetMsgType::GETBLOCKS,
    119+    NetMsgType::GETHEADERS,
    120+    NetMsgType::TX,
    121+    NetMsgType::HEADERS,
    122+    NetMsgType::BLOCK,
    123+    NetMsgType::GETADDR,
    124+    NetMsgType::MEMPOOL,
    125+    NetMsgType::PING,
    126+    NetMsgType::PONG,
    127+    NetMsgType::NOTFOUND,
    128+    NetMsgType::FILTERLOAD,
    129+    NetMsgType::FILTERADD,
    130+    NetMsgType::FILTERCLEAR,
    131+    NetMsgType::SENDHEADERS,
    132+    NetMsgType::FEEFILTER,
    133+    NetMsgType::SENDCMPCT,
    134+    NetMsgType::CMPCTBLOCK,
    135+    NetMsgType::GETBLOCKTXN,
    136+    NetMsgType::BLOCKTXN,
    137+    NetMsgType::GETCFILTERS,
    138+    NetMsgType::CFILTER,
    139+    NetMsgType::GETCFHEADERS,
    140+    NetMsgType::CFHEADERS,
    141+    NetMsgType::GETCFCHECKPT,
    142+    NetMsgType::CFCHECKPT,
    143+    NetMsgType::WTXIDRELAY,
    144+    NetMsgType::SENDTXRCNCL,
    145+};
    146 
    147 /** nServices flags */
    148 enum ServiceFlags : uint64_t {
    149     // NOTE: When adding here, be sure to update serviceFlagToStr too
    150     // Nothing
    151     NODE_NONE = 0,
    152diff --git a/src/test/fuzz/p2p_transport_serialization.cpp b/src/test/fuzz/p2p_transport_serialization.cpp
    153index 21d8dab536..853661b4d4 100644
    154--- a/src/test/fuzz/p2p_transport_serialization.cpp
    155+++ b/src/test/fuzz/p2p_transport_serialization.cpp
    156@@ -18,19 +18,18 @@
    157 #include <limits>
    158 #include <optional>
    159 #include <vector>
    160 
    161 namespace {
    162 
    163-std::vector<std::string> g_all_messages;
    164+auto g_all_messages = g_all_net_message_types;
    165 
    166 void initialize_p2p_transport_serialization()
    167 {
    168     ECC_Start();
    169     SelectParams(ChainType::REGTEST);
    170-    g_all_messages = getAllNetMessageTypes();
    171     std::sort(g_all_messages.begin(), g_all_messages.end());
    172 }
    173 
    174 } // namespace
    175 
    176 FUZZ_TARGET(p2p_transport_serialization, .init = initialize_p2p_transport_serialization)
    177@@ -147,13 +146,13 @@ void SimulationTest(Transport& initiator, Transport& responder, R& rng, FuzzedDa
    178                 if (c < ' ' || c > 0x7E) break;
    179                 ret += c;
    180             }
    181             return ret;
    182         } else {
    183             // Otherwise, use it as index into the list of known messages.
    184-            return g_all_messages[v % g_all_messages.size()];
    185+            return std::string{g_all_messages[v % g_all_messages.size()]};
    186         }
    187     };
    188 
    189     // Function to construct a CSerializedNetMsg to send.
    190     auto make_msg_fn = [&](bool first) {
    191         CSerializedNetMsg msg;
    192diff --git a/src/test/fuzz/process_message.cpp b/src/test/fuzz/process_message.cpp
    193index d38d1bb40e..24c711f421 100644
    194--- a/src/test/fuzz/process_message.cpp
    195+++ b/src/test/fuzz/process_message.cpp
    196@@ -42,13 +42,13 @@ std::string_view LIMIT_TO_MESSAGE_TYPE{};
    197 } // namespace
    198 
    199 void initialize_process_message()
    200 {
    201     if (const auto val{std::getenv("LIMIT_TO_MESSAGE_TYPE")}) {
    202         LIMIT_TO_MESSAGE_TYPE = val;
    203-        Assert(std::count(getAllNetMessageTypes().begin(), getAllNetMessageTypes().end(), LIMIT_TO_MESSAGE_TYPE)); // Unknown message type passed
    204+        Assert(std::count(g_all_net_message_types.begin(), g_all_net_message_types.end(), LIMIT_TO_MESSAGE_TYPE)); // Unknown message type passed
    205     }
    206 
    207     static const auto testing_setup = MakeNoLogFileContext<const TestingSetup>(
    208             /*chain_type=*/ChainType::REGTEST,
    209             /*extra_args=*/{"-txreconciliation"});
    210     g_setup = testing_setup.get();
    

    willcl-ark commented at 9:30 pm on November 22, 2023:

    Thanks for the idea Vasil.

    I implemented it in this tag to see what it would look like.

    Overall, I prefer the design, but in writing the commit message for the change I began to feel unsure that it actually results in g_all_net_message_types being a compile-time constant. IIUC the array will reference a const char*. The pointer addresses can only be allocated at runtime and in turn g_all_net_message_types will be static at runtime but not at compile time?

    I’m not sure I see an easy way to have this be a compile time constant, if my understanding above is correct.


    sipa commented at 9:33 pm on November 22, 2023:
    Make it constexpr instead of const. That’ll force the compiler to evaluate the array at compile-time (or give an error if that’s not possible).

    willcl-ark commented at 9:37 pm on November 22, 2023:

    Thanks, I forgot to mention I had tried that. It results in something like this:

     0$ bear -- make
     1Making all in src
     2make[1]: Entering directory '/home/will/src/bitcoin/src'
     3make[2]: Entering directory '/home/will/src/bitcoin/src'
     4make[3]: Entering directory '/home/will/src/bitcoin'
     5make[3]: Leaving directory '/home/will/src/bitcoin'
     6  CXX      bitcoind-bitcoind.o
     7In file included from ./netstats.h:10,
     8                 from ./net.h:22,
     9                 from ./interfaces/node.h:10,
    10                 from ./interfaces/init.h:10,
    11                 from bitcoind.cpp:19:
    12./protocol.h:276:65: error: the type const std::array<std::__cxx11::basic_string<char>, 35> of constexpr variable g_all_net_message_types is not literal
    13  276 | static constexpr std::array<std::string, NUM_NET_MESSAGE_TYPES> g_all_net_message_types{
    14      |                                                                 ^~~~~~~~~~~~~~~~~~~~~~~
    15In file included from ./uint256.h:13,
    16                 from ./consensus/params.h:9,
    17                 from ./kernel/chainparams.h:9,
    18                 from ./chainparams.h:9,
    19                 from bitcoind.cpp:10:
    20/usr/include/c++/12/array:99:12: note: std::array<std::__cxx11::basic_string<char>, 35> is not literal because:
    21   99 |     struct array
    22      |            ^~~~~
    23/usr/include/c++/12/array:99:12: note:   std::array<std::__cxx11::basic_string<char>, 35> has a non-trivial destructor
    24make[2]: *** [Makefile:16350: bitcoind-bitcoind.o] Error 1
    25make[2]: Leaving directory '/home/will/src/bitcoin/src'
    26make[1]: *** [Makefile:20245: all-recursive] Error 1
    27make[1]: Leaving directory '/home/will/src/bitcoin/src'
    28make: *** [Makefile:813: all-recursive] Error 1
    

    sipa commented at 9:39 pm on November 22, 2023:
    Oh, yes, you cannot have constexpr std::string until C++20.

    vasild commented at 1:28 pm on November 24, 2023:
    const std::array a{x, y, z}; - at compile time the compiler knows the size of the array. a.size() can be used to define the size of other arrays, e.g. std::array<int, a.size()> anoher;. This is what matters. Whether the contents of the array is a compile time constant (the pointers), I am not sure.

    maflcko commented at 3:23 pm on December 8, 2023:
    C++20 is available to use now
  5. vasild commented at 2:45 pm on November 22, 2023: contributor

    Concept ACK

    I was looking at this recently, and got some ideas how to simplify and better organize this, will share here. Thanks for taking charge!

  6. DrahtBot added the label CI failed on Nov 22, 2023
  7. fanquake commented at 4:26 pm on November 22, 2023: member
    0clang-tidy-17 -p=/ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu -quiet -load=/tidy-build/libbitcoin-tidy.so /ci_container_base/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/src/rpc/net.cpp
    1rpc/net.cpp:726:30: error: use emplace_back instead of push_back [modernize-use-emplace,-warnings-as-errors]
    2  726 |                         path.push_back("totals");
    3      |                              ^~~~~~~~~~
    4      |                              emplace_back(
    
  8. brunoerg commented at 5:02 pm on November 22, 2023: contributor
    Concept ACK
  9. net: move NET_MESSAGE_TYPE_OTHER
    Move NET_MESSAGE_TYPE_OTHER to protocol.h/cpp, where the rest of the
    message types live.
    
    Co-authored-by: Will Clark <will@256k1.dev>
    d3b34b82af
  10. net: make constants for total conn and msg types
    Add consts to keep track of the number of connection and message types.
    The number of message types should match the number of elements in the
    allNetMessageTypes array.
    
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    d1650639de
  11. net: add methods to convert a message type to/from an index
    Co-authored-by: Will Clark <will@256k1.dev>
    9383acb2d5
  12. net: create a data structure for network message stats
    New data structure to hold the atomic message and byte counts of each
    network message statistic at the most granular level. Uses a 4D array
    that indexes by direction, network type, connection type, and message
    type.
    
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    Co-authored-by: Will Clark <will@256k1.dev>
    548a14c455
  13. net: add to/fromIndex() methods to NetStats
    Add to/fromIndex() methods to safely map enums to indexes.
    
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    Co-authored-by: Will Clark <will@256k1.dev>
    5371845e7b
  14. net: count transferred bytes & messages globally
    In addition to per-peer stats, start recording stats (the number of
    messages and bytes) globally in `CConnman`. This happens every time
    messages are sent or received. The per-peer stats are lost when a peer
    disconnects.
    
    Also expose this object with a getter.
    
    Co-authored-by: Will Clark <will@256k1.dev>
    ed7fafaf00
  15. rpc: aggregate network message stats
    Aggregate stats to show zero or more of the following dimensions:
    direction, network, connection, message type
    
    The depth of the return object depends the number of dimensions
    specified to be shown. MultiLevelStatsMap makes this possible, and
    converts the object to a UniValue.
    
    Co-authored-by: Anthony Towns <aj@erisian.com.au>
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    22506994e0
  16. rpc: create a new getnetmsgstats rpc
    Introduce a new getnetmsgstats rpc that is able to return network
    message statistics arranged in a number of different ways (direction,
    network, connection, message type). If no dimension types are
    specified to be shown, return the totals.
    
    Includes helper code to convert a string to a DimensionType enum.
    
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    Co-authored-by: Will Clark <will@256k1.dev>
    b352853b78
  17. test: add functional tests for getnetmsgstats rpc
    Co-authored-by: Will Clark <will@256k1.dev>
    33b44f2693
  18. willcl-ark force-pushed on Nov 22, 2023
  19. amitiuttarwar commented at 11:16 pm on November 22, 2023: contributor
    yay!! thanks for picking this up. approach ACK
  20. DrahtBot removed the label CI failed on Nov 22, 2023
  21. Sjors commented at 10:26 am on November 23, 2023: member

    Concept ACK

    The RPC help is a bit abstract at the moment, e.g. “Results may optionally be grouped so only certain dimensions are shown.”

    It might be enough to only show 1 Result: section in the help, e.g. just with two dimensions.

    In the example section (and in the PR description), you could add a description, e.g.:

    • Get the total number of messages and bytes sent/received: bitcoin-cli getnetmsgstats
    • Show number of messages and bytes per message type: bitcoin-cli getnetmsgstats '["message_type"]'
    • Group statistics first by message type, then by connection type: bitcoin-cli getnetmsgstats '["message_type", "connection_type"]'

    It might be better to add a totals field to all groups, in every dimension. In that case I would also have the default getnetmsgstats return something a bit more interesting, e.g. the equivalent of getnetmsgstats '["message_type"].

  22. vasild commented at 1:43 pm on November 24, 2023: contributor

    It might be better to add a totals field to all groups, in every dimension. In that case I would also have the default getnetmsgstats return something a bit more interesting, e.g. the equivalent of getnetmsgstats '["message_type"].

    That would be interesting. However it would further complicate this PR. The grouping/aggregating part I think is already a bit convoluted. Maybe it would make sense to drop it and print the full output without any grouping in this PR and do the grouping in another PR.

  23. Sjors commented at 1:51 pm on November 28, 2023: member

    drop it

    Do you mean dropping the default total field that this PR already has? That sounds good to me.

  24. vasild commented at 2:24 pm on November 29, 2023: contributor
    No, I meant to drop the possibility to aggregate or nest the result, i.e. remove the arguments of the RPC and always return the full JSON. I did not see (yet) that total field. I guess it would be cleaner without it.
  25. willcl-ark commented at 4:50 pm on November 29, 2023: contributor

    No, I meant to drop the possibility to aggregate or nest the result, i.e. remove the arguments of the RPC and always return the full JSON. I did not see (yet) that total field. I guess it would be cleaner without it.

    I think dropping the totals would probably make sense, as these can be totaled up using jq or similar tools pretty easily.

    The only issue I see with not performing (optional) aggregation on the CLI, is that a node active on all networks could return a json object with a maximum of 246*35 = 1680 fields, making further processing a hard requirement for human consuption. Arguably this approach fits the unix philosophy better.

    I personally think it’s a nice for human to be able to aggregate on the RPC in case tools like jq or fx aren’t available or something, but as mentioned removing this ability would simplify one of the commits here.

  26. vasild commented at 5:42 pm on November 29, 2023: contributor

    I personally think it’s a nice for human to be able to aggregate on the RPC in case tools like jq or fx aren’t available

    I agree. Would be nice to have that finally eventually. I meant that if it is too complex for review or causes controversy then to get the simpler one first merged (without aggregation) and then in another PR do the aggregation. Maybe that would not be necessary. I am updating the WIP from https://github.com/bitcoin/bitcoin/compare/master...vasild:bitcoin:getnetmsgstats and will compare with this…

  27. DrahtBot added the label CI failed on Dec 2, 2023
  28. in src/protocol.h:270 in 33b44f2693
    266@@ -267,9 +267,20 @@ extern const char* WTXIDRELAY;
    267 extern const char* SENDTXRCNCL;
    268 }; // namespace NetMsgType
    269 
    270+extern const std::string NET_MESSAGE_TYPE_OTHER;
    


    vasild commented at 1:21 pm on December 5, 2023:
    Commit d3b34b82af33909eaeffe8d596a6c70951dc844c net: move NET_MESSAGE_TYPE_OTHER is not needed and can be dropped if messageTypeFromIndex() and messageTypeToIndex() standalone functions are made private methods in the NetStats class and their usage limited to within the class. That is desirable anyway because they are only relevant for an index in an array that is defined in a particular way and is (should be) private within NetStats.

    mzumsande commented at 10:34 pm on January 5, 2024:
    I agree: Code and constants that are only used in RPCs are not part of the protocol and therefore don’t belong in protocol.h / protocol.cpp.
  29. in src/netstats.h:73 in 33b44f2693
    68+                        stat.byte_count = 0;
    69+                    }
    70+                }
    71+            }
    72+        }
    73+    }
    


    vasild commented at 1:36 pm on December 5, 2023:

    This method is not needed. It is simpler and suffices to explicitly initialize the members of MsgStat to zero:

    0        std::atomic_uint64_t byte_count{0}; //!< Number of bytes transferred.
    1        std::atomic_uint64_t msg_count{0};  //!< Number of messages transferred.
    
  30. in src/netstats.h:91 in 33b44f2693
    86+    // assuming MessageTypeToIndex("ping") == 15, then everything stored in
    87+    // m_data[x][y][z][15] is traffic from "ping" messages (for any x, y or z).
    88+
    89+    [[nodiscard]] static Direction DirectionFromIndex(size_t index);
    90+    [[nodiscard]] static Network NetworkFromIndex(size_t index);
    91+    [[nodiscard]] static ConnectionType ConnectionTypeFromIndex(size_t index);
    


    vasild commented at 1:43 pm on December 5, 2023:

    These might as well be private. It would be good to completely hide this to/from index from the outside world because it can be misused in a dangerous way. Then provide a way to call a provided function for each element:

     0void TrafficStats::ForEach(std::function<void(TrafficStats::Direction dir,
     1                                              Network net,
     2                                              ConnectionType con,
     3                                              const std::string& msg,
     4                                              const BytesAndCount& data)> func) const
     5{
     6    for (size_t dir_i = 0; dir_i < m_data.size(); ++dir_i) {
     7        for (size_t net_i = 0; net_i < m_data[dir_i].size(); ++net_i) {
     8            for (size_t con_i = 0; con_i < m_data[dir_i][net_i].size(); ++con_i) {
     9                for (size_t msg_i = 0; msg_i < m_data[dir_i][net_i][con_i].size(); ++msg_i) {
    10                    func(DirectionFromIndex(dir_i),
    11                         NetworkFromIndex(net_i),
    12                         ConnectionTypeFromIndex(con_i),
    13                         MessageTypeFromIndex(msg_i),
    14                         m_data[dir_i][net_i][con_i][msg_i]);
    15                }
    16            }
    17        }
    18    }
    19}
    
  31. in src/net.cpp:3722 in 33b44f2693
    3717@@ -3715,6 +3718,12 @@ void CNode::MarkReceivedMsgsForProcessing()
    3718         // vRecvMsg contains only completed CNetMessage
    3719         // the single possible partially deserialized message are held by TransportDeserializer
    3720         nSizeAdded += msg.m_raw_message_size;
    3721+
    3722+        net_stats.Record(NetStats::Direction::RECV,
    


    vasild commented at 2:14 pm on December 5, 2023:

    The code already collects the traffic data for the per peer stats. To avoid confusion and ease maintenance it would be nice if the per peer and global traffic data is collected at the same place. This is where the data is collected in master for the per peer stats:

    Received: https://github.com/bitcoin/bitcoin/blob/6d5790956f45e3de5c6c4ee6fda21878b0d1287b/src/net.cpp#L663 https://github.com/bitcoin/bitcoin/blob/6d5790956f45e3de5c6c4ee6fda21878b0d1287b/src/net.cpp#L669-L674

    Sent: https://github.com/bitcoin/bitcoin/blob/6d5790956f45e3de5c6c4ee6fda21878b0d1287b/src/net.cpp#L1620

  32. in src/rpc/net.cpp:672 in 33b44f2693
    667+ * |            |            |            |
    668+ * a+e          c+g          b+f          d+h          (direction)
    669+ * @endcode
    670+ *
    671+ */
    672+UniValue Aggregate(const std::vector<DimensionType>& dimensions, const NetStats::MultiDimensionalStats& raw_stats)
    


    vasild commented at 1:15 pm on December 6, 2023:
    I realized that this can be done in a simpler way - if the stats are put in a map and "" used for the aggregated dimensions. See commit b7c56affe9 rpc: make it possible to aggregate the result in getnetmsgstats from vasild/getnetmsgstats.
  33. in test/functional/rpc_net.py:285 in 33b44f2693
    282         info = self.nodes[0].getnetworkinfo()
    283         assert_equal(info['networkactive'], True)
    284-        assert_equal(info['connections'], 2)
    285-        assert_equal(info['connections_in'], 1)
    286+        assert_equal(info['connections'], 3)
    287+        assert_equal(info['connections_in'], 2)
    


    vasild commented at 1:18 pm on December 6, 2023:

    This shouldn’t be necessary. It means that the previous test did not leave the environment as it was when it started. That is the newly added test_getnetmsgstats. It is good to be able to run the tests in isolation and that they do not rely on the leftovers from previous tests. For example if:

    0         self.test_connection_count()
    1         self.test_getpeerinfo() 
    2         self.test_getnettotals()
    3         self.test_getnetmsgstats()
    4         self.test_getnetworkinfo()
    5         self.test_addnode_getaddednodeinfo()
    6         self.test_service_flags()
    

    is changed to:

    0         #self.test_connection_count()
    1         #self.test_getpeerinfo() 
    2         #self.test_getnettotals()
    3         #self.test_getnetmsgstats()
    4         self.test_getnetworkinfo()
    5         #self.test_addnode_getaddednodeinfo()
    6         #self.test_service_flags()
    

    would be good to have rpc_net.py still pass.

  34. vasild commented at 1:28 pm on December 6, 2023: contributor

    Concept ACK

    I have applied the suggestions below in vasild/getnetmsgstats plus some other simplifications. Feel free to pick them.

    Thanks!

  35. in src/protocol.h:279 in 9383acb2d5 outdated
    274@@ -275,6 +275,12 @@ static constexpr size_t NUM_NET_MESSAGE_TYPES{35};
    275 /* Get a vector of all valid message types (see above) */
    276 const std::vector<std::string>& getAllNetMessageTypes();
    277 
    278+/* Get the index of a message type from the vector of all message types */
    279+int messageTypeToIndex(const std::string& msg_type);
    


    mzumsande commented at 10:36 pm on January 5, 2024:
    nit: capital M, here and in messageTypeFromIndex to make it similar to the related functions for other dimensions.
  36. in src/rpc/net.cpp:850 in b352853b78 outdated
    845+            const CConnman& connman = EnsureConnman(node);
    846+            std::vector<net_stats::DimensionType> dimensions;
    847+
    848+            // Aggregate the messages to show only requested dimensions
    849+            if (request.params[0].isArray()) {
    850+                const UniValue raw_dimensions = request.params[0].get_array();
    


    mzumsande commented at 11:08 pm on January 5, 2024:
    Would be nice to have some validation of user input here - getnetmsgstats '["foo","bar"]' could return an error message, listing the supported dimensions.
  37. mzumsande commented at 11:12 pm on January 5, 2024: contributor
    Concept ACK. We are using this (in an adjusted form with another dimension for txrelay) for analysing the traffic implications of #28463, and it works great so far.
  38. vasild commented at 4:07 pm on January 6, 2024: contributor

    I have applied the suggestions below in vasild/getnetmsgstats plus some other simplifications. Feel free to pick them.

    Should I PR the above branch?

  39. kristapsk commented at 5:45 pm on January 6, 2024: contributor
    Concept ACK
  40. willcl-ark commented at 4:08 pm on January 24, 2024: contributor

    I have applied the suggestions below in vasild/getnetmsgstats plus some other simplifications. Feel free to pick them.

    Should I PR the above branch?

    Hey Vasil, I think this makes most sense, especially seeing as you’ve co-authored/written most of the changes here, and your improvements on the new branch!

    I’ve been testing your branch out today and am a fan of the simplicity vs this changeset. I did kind of prefer the “aggregation” RPC behaviour in this branch, even though it doesn’t really aggregate, more accurately it “groups by”.

    In any case, your new branch does actually aggregate results, and the same effect can be easily achieved. e.g. to produce output like:

     0{
     1  "wtxidrelay": {
     2    "bytes": 1146,
     3    "count": 47
     4  },
     5  "headers": {
     6    "bytes": 10846,
     7    "count": 55
     8  },
     9  "tx": {
    10    "bytes": 511341,
    11    "count": 801
    12  },
    13  "ping": {
    14    "bytes": 1786,
    15    "count": 56
    16  }
    17}
    

    In this branch we would do: .src/bitcoin-cli getnetmsgstats '["message_type"]', whereas on your new fork (with actual aggregation) this requires the “inverse”: ./src/bitcoin-cli getnetmsgstats '["direction", "network", "connection_type"]'.

    Your comments here did get me thinking whether we could use mapRecvBytesPerMsgType to return the netmsgstats, or conversely use data from net_stats to return stats to the getpeerinfo RPC and avoid counting the same message twice everywhere. But I’m not convinced that the changes that would be needed to do that would be worth it.

  41. vasild commented at 3:27 pm on February 10, 2024: contributor

    Should I PR the above branch?

    Hey Vasil, I think this makes most sense …

    I am on it!

    Your comments #28926 (review) did get me thinking whether we could use mapRecvBytesPerMsgType to return the netmsgstats, or conversely use data from net_stats to return stats to the getpeerinfo RPC and avoid counting the same message twice everywhere. But I’m not convinced that the changes that would be needed to do that would be worth it.

    Hmm,

    • If we use CNode::mapRecvBytesPerMsgType to create the global netmsgstats report, then we still need to store the data somewhere when the CNode disconnects. So we need another storage anyway.
    • If we use the new global net_stats to create the per-peer report, then we have to add another dimension to it - “peer id”. Then query only for a given peer for getpeerinfo and always aggregate by “peer id” (hide it) from the new totals report from getnetmsgstats. That seems like it may be a good idea? But what when a peer disconnects? We don’t want to bloat net_stats with individual info for disconnected peers. So then we would have to have one dummy “peer id”, e.g. -1 and sum disconnected peers’ stats into that one. That way net_stats cannot be an array anymore because the size of the “peer id” dimension is not known at compile time. What do others think about this?
  42. vasild commented at 3:35 pm on February 10, 2024: contributor

    … this requires the “inverse” …

    Ha, I did not realize this before! I am fine either way. Also I am not sure what would be the best name of the RPC parameter, e.g. bitcoin-cli -named getnetmsgstats whatname='[...]'. Lets do a poll below (use :+1:).

  43. vasild commented at 3:36 pm on February 10, 2024: contributor
    To produce the output in #28926 (comment) it is better to use bitcoin-cli getnetmsgstats '["message_type"]'. How to name the RPC parameter?
  44. vasild commented at 3:36 pm on February 10, 2024: contributor
    To produce the output in #28926 (comment) it is better to use bitcoin-cli getnetmsgstats '["direction", "network", "connection_type"]'. How to name the RPC parameter?
  45. vasild commented at 9:35 am on February 11, 2024: contributor

    Should I PR the above branch?

    Opened https://github.com/bitcoin/bitcoin/pull/29418

  46. satsie commented at 7:22 pm on February 11, 2024: contributor

    RE:

    … this requires the “inverse” …

    Ha, I did not realize this before! I am fine either way.

    I voted for bitcoin-cli getnetmsgstats '["message_type"]' because as a node operator it’s simpler to remember the dimensions I want (i.e. message) instead of knowing the RPC well enough to list every dimension I don’t want (i.e. direction, network, connection type). But of course anyone using an RPC should read the docs :smile:

  47. willcl-ark closed this on Feb 12, 2024

  48. willcl-ark commented at 8:30 am on February 12, 2024: contributor
    • But what when a peer disconnects? We don’t want to bloat net_stats with individual info for disconnected peers.

    Yes, and in fact if a peer repeatedly connected and disconnected we would end up with unbounded growth of net_stats size, which would not be good… Let’s shelve this thought for now.

    To produce the output in #28926 (comment) it is better to use bitcoin-cli getnetmsgstats ‘[“message_type”]’. How to name the RPC parameter?

    In a previous revision, I had renamed the RPC parameter group_by to address this, as I felt it made sense.

    Closed in favour of #29418


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-03 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me