Split CConnman #30988

pull vasild wants to merge 17 commits into bitcoin:master from vasild:sockman changing 20 files +1416 −729
  1. vasild commented at 7:57 am on September 27, 2024: contributor

    Currently CConnman is a mixture of:

    • low level socket handling, e.g. send, recv, poll, bind, listen, connect, and
    • higher level logic that is specific to the Bitcoin P2P protocol, e.g. V1/V2 transport, choosing which address to connect to, if we manage to connect mark the address good in AddrMan, maintaining the number of inbound and outbound connections, banning of peers, interacting with PeerManager.

    This PR splits the socket handling into a new class which makes the code more modular and reusable. Having more modular and reusable code is a good thing on its own, even if the code is not reused. Stratum V2 and libevent-less RPC/HTTP server could benefit from this, but it makes sense on its own, even without those projects.


    The socket operations are driven by the new class SockMan which informs the higher level via provided methods when e.g. new data arrives on the socket or a new connection is accepted. For this, SockMan provides some non-virtual methods to start it rolling and then it calls pure virtual methods which are implemented by the higher level (e.g. CConnman) on certain events, for example “got this new data on this node’s socket”.

    The interface of SockMan is:

      0/**
      1 * A socket manager class which handles socket operations.
      2 * To use this class, inherit from it and implement the pure virtual methods.
      3 * Handled operations:
      4 * - binding and listening on sockets
      5 * - starting of necessary threads to process socket operations
      6 * - accepting incoming connections
      7 * - making outbound connections
      8 * - closing connections
      9 * - waiting for IO readiness on sockets and doing send/recv accordingly
     10 */
     11class SockMan
     12{
     13public:
     14
     15    //
     16    // Non-virtual functions, to be reused by children classes.
     17    //
     18
     19    /**
     20     * Bind to a new address:port, start listening and add the listen socket to `m_listen`.
     21     * Should be called before `StartSocketsThreads()`.
     22     * [@param](/bitcoin-bitcoin/contributor/param/)[in] to Where to bind.
     23     * [@param](/bitcoin-bitcoin/contributor/param/)[out] err_msg Error string if an error occurs.
     24     * [@retval](/bitcoin-bitcoin/contributor/retval/) true Success.
     25     * [@retval](/bitcoin-bitcoin/contributor/retval/) false Failure, `err_msg` will be set.
     26     */
     27    bool BindAndStartListening(const CService& to, bilingual_str& err_msg);
     28
     29    /**
     30     * Start the necessary threads for sockets IO.
     31     */
     32    void StartSocketsThreads(const Options& options);
     33
     34    /**
     35     * Join (wait for) the threads started by `StartSocketsThreads()` to exit.
     36     */
     37    void JoinSocketsThreads();
     38
     39    /**
     40     * Make an outbound connection, save the socket internally and return a newly generated connection id.
     41     * [@param](/bitcoin-bitcoin/contributor/param/)[in] to The address to connect to, either as CService or a host as string and port as
     42     * an integer, if the later is used, then `proxy` must be valid.
     43     * [@param](/bitcoin-bitcoin/contributor/param/)[in] is_important If true, then log failures with higher severity.
     44     * [@param](/bitcoin-bitcoin/contributor/param/)[in] proxy Proxy to connect through, if set.
     45     * [@param](/bitcoin-bitcoin/contributor/param/)[out] proxy_failed If `proxy` is valid and the connection failed because of the
     46     * proxy, then it will be set to true.
     47     * [@param](/bitcoin-bitcoin/contributor/param/)[out] me If the connection was successful then this is set to the address on the
     48     * local side of the socket.
     49     * [@return](/bitcoin-bitcoin/contributor/return/) Newly generated id, or std::nullopt if the operation fails.
     50     */
     51    std::optional<SockMan::Id> ConnectAndMakeId(const std::variant<CService, StringHostIntPort>& to,
     52                                                bool is_important,
     53                                                std::optional<Proxy> proxy,
     54                                                bool& proxy_failed,
     55                                                CService& me)
     56        EXCLUSIVE_LOCKS_REQUIRED(!m_connected_mutex, !m_unused_i2p_sessions_mutex);
     57
     58    /**
     59     * Destroy a given connection by closing its socket and release resources occupied by it.
     60     * [@param](/bitcoin-bitcoin/contributor/param/)[in] id Connection to destroy.
     61     * [@return](/bitcoin-bitcoin/contributor/return/) Whether the connection existed and its socket was closed by this call.
     62     */
     63    bool CloseConnection(Id id)
     64        EXCLUSIVE_LOCKS_REQUIRED(!m_connected_mutex);
     65
     66    /**
     67     * Try to send some data over the given connection.
     68     * [@param](/bitcoin-bitcoin/contributor/param/)[in] id Identifier of the connection.
     69     * [@param](/bitcoin-bitcoin/contributor/param/)[in] data The data to send, it might happen that only a prefix of this is sent.
     70     * [@param](/bitcoin-bitcoin/contributor/param/)[in] will_send_more Used as an optimization if the caller knows that they will
     71     * be sending more data soon after this call.
     72     * [@param](/bitcoin-bitcoin/contributor/param/)[out] errmsg If <0 is returned then this will contain a human readable message
     73     * explaining the error.
     74     * [@retval](/bitcoin-bitcoin/contributor/retval/) >=0 The number of bytes actually sent.
     75     * [@retval](/bitcoin-bitcoin/contributor/retval/) <0 A permanent error has occurred.
     76     */
     77    ssize_t SendBytes(Id id,
     78                      std::span<const unsigned char> data,
     79                      bool will_send_more,
     80                      std::string& errmsg) const
     81        EXCLUSIVE_LOCKS_REQUIRED(!m_connected_mutex);
     82
     83    /**
     84     * Stop listening by closing all listening sockets.
     85     */
     86    void StopListening();
     87
     88    //
     89    // Pure virtual functions must be implemented by children classes.
     90    //
     91
     92    /**
     93     * Be notified when a new connection has been accepted.
     94     * [@param](/bitcoin-bitcoin/contributor/param/)[in] id Id of the newly accepted connection.
     95     * [@param](/bitcoin-bitcoin/contributor/param/)[in] me The address and port at our side of the connection.
     96     * [@param](/bitcoin-bitcoin/contributor/param/)[in] them The address and port at the peer's side of the connection.
     97     * [@retval](/bitcoin-bitcoin/contributor/retval/) true The new connection was accepted at the higher level.
     98     * [@retval](/bitcoin-bitcoin/contributor/retval/) false The connection was refused at the higher level, so the
     99     * associated socket and id should be discarded by `SockMan`.
    100     */
    101    virtual bool EventNewConnectionAccepted(Id id,
    102                                            const CService& me,
    103                                            const CService& them) = 0;
    104
    105    /**
    106     * Called when the socket is ready to send data and `ShouldTryToSend()` has
    107     * returned true. This is where the higher level code serializes its messages
    108     * and calls `SockMan::SendBytes()`.
    109     * [@param](/bitcoin-bitcoin/contributor/param/)[in] id Id of the connection whose socket is ready to send.
    110     * [@param](/bitcoin-bitcoin/contributor/param/)[out] cancel_recv Should always be set upon return and if it is true,
    111     * then the next attempt to receive data from that connection will be omitted.
    112     */
    113    virtual void EventReadyToSend(Id id, bool& cancel_recv) = 0;
    114
    115    /**
    116     * Called when new data has been received.
    117     * [@param](/bitcoin-bitcoin/contributor/param/)[in] id Connection for which the data arrived.
    118     * [@param](/bitcoin-bitcoin/contributor/param/)[in] data Received data.
    119     */
    120    virtual void EventGotData(Id id, std::span<const uint8_t> data) = 0;
    121
    122    /**
    123     * Called when the remote peer has sent an EOF on the socket. This is a graceful
    124     * close of their writing side, we can still send and they will receive, if it
    125     * makes sense at the application level.
    126     * [@param](/bitcoin-bitcoin/contributor/param/)[in] id Connection whose socket got EOF.
    127     */
    128    virtual void EventGotEOF(Id id) = 0;
    129
    130    /**
    131     * Called when we get an irrecoverable error trying to read from a socket.
    132     * [@param](/bitcoin-bitcoin/contributor/param/)[in] id Connection whose socket got an error.
    133     * [@param](/bitcoin-bitcoin/contributor/param/)[in] errmsg Message describing the error.
    134     */
    135    virtual void EventGotPermanentReadError(Id id, const std::string& errmsg) = 0;
    136};
    

    Resolves: #30694


    Review hint: this PR moves some code around, so reviewers may find this helpful: git show --color-moved --color-moved-ws=allow-indentation-change.

  2. DrahtBot commented at 7:57 am on September 27, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30988.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK tdb3, Sjors, hodlinator, jonatack
    Stale ACK pinheadmz

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #31519 (refactor: Use std::span over Span by maflcko)
    • #30951 (net: option to disallow v1 connection on ipv4 and ipv6 peers by stratospher)
    • #30116 (p2p: Fill reconciliation sets (Erlay) attempt 2 by sr-gi)
    • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)
    • #29418 (rpc: provide per message stats for global traffic via new RPC ‘getnetmsgstats’ by vasild)
    • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
    • #28463 (p2p: Increase inbound capacity for block-relay only connections by mzumsande)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. in src/rpc/net.cpp:210 in 03f6cc2b4a outdated
    202@@ -203,6 +203,15 @@ static RPCHelpMan getpeerinfo()
    203     std::vector<CNodeStats> vstats;
    204     connman.GetNodeStats(vstats);
    205 
    206+    // An undocumented side effect of how CConnman previously stored nodes was
    207+    // that they were returned ordered by id. At least some functional tests
    208+    // rely on that, so keep it that way. An alternative is to remove this sort
    209+    // and fix the tests and take the risk of breaking other users of the
    210+    // "getpeerinfo" RPC.
    


    maflcko commented at 8:41 am on September 27, 2024:
    nit: Arrays are ordered data structures and I don’t think it is user-friendly to shuffle them by ID. I think this comment can just be removed.

    vasild commented at 3:05 pm on September 27, 2024:
    Removed, thanks!
  4. DrahtBot added the label CI failed on Sep 27, 2024
  5. Sjors commented at 11:41 am on September 27, 2024: member

    Nice! I’ll try to use this for Sv2Connman in https://github.com/Sjors/bitcoin/pull/50 and will let you know if anything is missing.

    Can you put sockman.h in libbitcoin_common instead of libbitcoin_node? For the Template Provider I’m trying to prevent a circular dependency on the node. This should do the trick: https://github.com/bitcoin/bitcoin/commit/4dd51b2924860bf10466da080ea4ff7bed1a3e3f

  6. vasild force-pushed on Sep 27, 2024
  7. vasild commented at 3:07 pm on September 27, 2024: contributor

    03f6cc2b4a...70c2f13f83: fix CI failure, and address suggestions

    Can you put sockman.h in libbitcoin_common

    Done.

  8. DrahtBot removed the label CI failed on Sep 27, 2024
  9. Sjors commented at 4:51 pm on September 27, 2024: member

    Here’s an initial sketch of making Sv2Connman a subclass of SockMan. The test gets through the handshake but fails later on, so I’ll need to study it a bit more closely.

    https://github.com/Sjors/bitcoin/pull/64

  10. DrahtBot added the label CI failed on Sep 29, 2024
  11. DrahtBot removed the label CI failed on Sep 29, 2024
  12. tdb3 commented at 10:05 pm on September 29, 2024: contributor
    Concept ACK
  13. pinheadmz commented at 1:38 pm on September 30, 2024: member

    Concept ACK

    This looks great and the API in the header looks easy, thanks.

    I’m in the process of cleaning up my HTTP branch for a pull request and then I can start reviewing this and rebasing on top.

    One element of libevent I’m not immediately seeing here is timed events. Really the only thing HTTP needs it for is walletpassphrase which calls RPCRunLater() which interacts with HTTPRPCTimerInterface(). I don’t think Conman has a specific mechanism for this because timed things are attached directly to nodes like m_last_getheaders_timestamp etc. The current HTTPRPCTimerInterface uses libevent event_new() and evtimer_add(), I accomplish this with a map of timestamps and callback functions in my event loop: https://github.com/pinheadmz/bitcoin/commit/42b7240378ffc4890ab4ba1453623e5986ea2a71

  14. maflcko commented at 2:07 pm on September 30, 2024: member

    I accomplish this with a map of timestamps and callback functions in my event loop

    I wonder why the existing scheduler can’t be used for re-locking the wallet? I know there is #18488 and #14289, but the thread is already filled with random stuff such as BerkeleyDatabase::PeriodicFlush(), and relocking the wallet seems(?) fast (I haven’t benchmarked), so should be fine to put in there as well, at least from that perspective?

  15. vasild commented at 10:07 am on October 3, 2024: contributor

    @pinheadmz, I think that the functionality of “execute this code after some time”, is not much related to the sockets handling and better be implemented at some higher level, not inside SockMan. Maybe the scheduler, like @maflcko suggested, or in the EventIOLoopCompletedForAllPeers() method which will be called periodically by SockMan:

    0    /**
    1     * SockMan has completed send+recv for all nodes.
    2     * Can be used to execute periodic tasks for all nodes.
    3     * The implementation in SockMan does nothing.
    4     */
    5    virtual void EventIOLoopCompletedForAllPeers();
    

    Edit: I guess TriggerEvents() from https://github.com/pinheadmz/bitcoin/commit/42b7240378ffc4890ab4ba1453623e5986ea2a71 can be called from EventIOLoopCompletedForAllPeers() or from the scheduler.

  16. Sjors commented at 1:54 pm on October 4, 2024: member
    @vasild if you rebase past #31011, tidy might point out that sockman.cpp.o depends on i2p.cpp. So you probably need to either move i2p.cpp to common as well, or remove the dependency.
  17. Jacksonearl2468 approved
  18. DrahtBot added the label CI failed on Oct 7, 2024
  19. in src/common/sockman.cpp:5 in 70c2f13f83 outdated
    0@@ -0,0 +1,528 @@
    1+// Copyright (c) 2024-present The Bitcoin Core developers
    2+// Distributed under the MIT software license, see the accompanying
    3+// file COPYING or https://opensource.org/license/mit/.
    4+
    5+#include <config/bitcoin-config.h> // IWYU pragma: keep
    


    Sjors commented at 11:15 am on October 16, 2024:
    Silent merge conflict with #30937
  20. in src/net.cpp:1715 in 1bfc1ca9b6 outdated
    1795@@ -1796,10 +1796,10 @@ void CConnman::CreateNodeFromAcceptedSocket(std::unique_ptr<Sock>&& sock,
    1796 
    1797     CNode* pnode = new CNode(id,
    1798                              std::move(sock),
    1799-                             addr,
    1800+                             CAddress{addr, NODE_NONE},
    1801                              CalculateKeyedNetGroup(addr),
    1802                              nonce,
    1803-                             addr_bind,
    1804+                             CAddress{addr_bind, NODE_NONE},
    


    pinheadmz commented at 7:27 pm on October 16, 2024:

    1bfc1ca9b6b68c6356ffc23ecf01e417152ade95

    Why do CNode .addr and .addrBind still need to be CAddress? maybe it makes some sense for .addr if that’s where we also store the service flags for easier gossiping, but why would the local bind address need anything besides an IP and port?


    vasild commented at 12:43 pm on December 3, 2024:

    CNode::addr needs to be CAddress because at least PeerManagerImpl::PushNodeVersion() is using addr.nServices.

    CNode::addrBind indeed does not need to be CAddress. I changed it to CService.

  21. in src/common/sockman.cpp:89 in 41c87ddb3d outdated
    52+            strError = strprintf(Untranslated("Error setting IPV6_PROTECTION_LEVEL on socket: %s, continuing anyway"), NetworkErrorString(WSAGetLastError()));
    53+            LogPrintf("%s\n", strError.original);
    54+        }
    55+#endif
    56+    }
    57+
    


    pinheadmz commented at 7:38 pm on October 16, 2024:

    41c87ddb3d7a18d2c0fa7eccfbde57e9d6e898c2

    Maybe these are expected to be set by callers in future commits, so I’m just leaving myself a note to look for SO_KEEPALIVE and TCP_NODELAY


    pinheadmz commented at 3:19 pm on October 17, 2024:

    Ok I see NODELAY set by ConnMan in EventNewConnectionAccepted in b94f9d338fd0d3b47a4ea6165ce9cde48f3b19bc

    I’m guessing we dont use KEEPALIVE for bitcoin p2p because we have out own ping/pong protocol


    pinheadmz commented at 6:04 pm on October 17, 2024:
    Moved to NewSockAccepted() in 9d1b352a4d050ec88b59cdf59209c0f086fea4e6
  22. in src/common/sockman.h:23 in 5d4920f630 outdated
     8@@ -9,9 +9,12 @@
     9 #include <util/sock.h>
    10 #include <util/translation.h>
    11 
    12+#include <atomic>
    13 #include <memory>
    14 #include <vector>
    15 
    16+typedef int64_t NodeId;
    


    pinheadmz commented at 2:53 pm on October 17, 2024:

    5d4920f630417ceeee1c79304faac6803795cd64

    I don’t really have any better suggestion for this but the term “node” is now a higher-level ConnMan context, for HTTP and StratumV2 I’d probably refer to them as “client”.


    vasild commented at 12:46 pm on December 3, 2024:
    “Peer” is a good description for this, but that is already used in net_processing.

    Sjors commented at 12:01 pm on January 15, 2025:

    1b05e1d4ba55a42ba74026b68fa4e616b973e06d: maybe ConnectionId?

    And then in net.h use typedef ConnectionId NodeId;

    https://github.com/Sjors/bitcoin/commit/f33049b45b7013022ed4c75c0bc52d878fe15fd5

    Most of the churn would be in sockman.{h,cpp}, so that seems acceptable.

    That said, I don’t think it’s very important.


    vasild commented at 1:24 pm on February 6, 2025:

    Changed. This is now SockMan::Id and in net.h: using NodeId = SockMan::Id;. In SockMan comments change the word “node” to “connection”. This caused a lot of mechanical renames within this PR, but it looks better now, thanks!

    SockMan::GetNewNodeId() renamed to SockMan::GetNewId()

    SockMan::EventIOLoopCompletedForNode() renamed to SockMan::EventIOLoopCompletedForOne()

    SockMan::EventIOLoopCompletedForAllPeers() renamed to SockMan::EventIOLoopCompletedForAll()

    SockMan::ConnectAndMakeNodeId() renamed to SockMan::ConnectAndMakeId().

    Will be in next push.

  23. in src/common/sockman.cpp:309 in b94f9d338f outdated
    160+    };
    161+
    162+    while (!interruptNet) {
    163+
    164+        if (!m_i2p_sam_session->Listen(conn)) {
    165+            EventI2PListen(conn.me, /*success=*/false);
    


    pinheadmz commented at 3:11 pm on October 17, 2024:

    b94f9d338fd0d3b47a4ea6165ce9cde48f3b19bc

    Reminding myself to check for a test that covers a child class with i2p true but no Event listener

  24. in src/common/sockman.h:105 in b94f9d338f outdated
    100+     * Be notified when a new connection has been accepted.
    101+     * @param[in] sock Connected socket to communicate with the peer.
    102+     * @param[in] me The address and port at our side of the connection.
    103+     * @param[in] them The address and port at the peer's side of the connection.
    104+     */
    105+    virtual void EventNewConnectionAccepted(std::unique_ptr<Sock>&& sock,
    


    pinheadmz commented at 3:29 pm on October 17, 2024:

    b94f9d338fd0d3b47a4ea6165ce9cde48f3b19bc

    Seems weird to me to move the unique pointer to a connected socket out of sockman, implying that sockman itself doesn’t own the connections. Peeking ahead at future commits though I think this gets resolved.


    pinheadmz commented at 6:22 pm on October 17, 2024:
    resolved in b94f9d338fd0d3b47a4ea6165ce9cde48f3b19bc
  25. in src/net.h:1414 in b96beb27d2 outdated
    1410@@ -1410,7 +1411,7 @@ class CConnman : private SockMan
    1411     std::vector<AddedNodeParams> m_added_node_params GUARDED_BY(m_added_nodes_mutex);
    1412 
    1413     mutable Mutex m_added_nodes_mutex;
    1414-    std::vector<CNode*> m_nodes GUARDED_BY(m_nodes_mutex);
    1415+    std::unordered_map<NodeId, CNode*> m_nodes GUARDED_BY(m_nodes_mutex);
    


    pinheadmz commented at 4:52 pm on October 17, 2024:

    b96beb27d2b3d06a45644c537c114a08f0ccd285

    Love this, great optimization and makes a lot of sense.

  26. in src/test/fuzz/connman.cpp:49 in b96beb27d2 outdated
    44@@ -45,12 +45,13 @@ FUZZ_TARGET(connman, .init = initialize_connman)
    45     connman.Init(options);
    46 
    47     CNetAddr random_netaddr;
    48-    CNode random_node = ConsumeNode(fuzzed_data_provider);
    49+    NodeId node_id{0};
    50+    CNode random_node = ConsumeNode(fuzzed_data_provider, node_id++);
    


    pinheadmz commented at 4:55 pm on October 17, 2024:

    b96beb27d2b3d06a45644c537c114a08f0ccd285

    Why not use GetNewNodeId() here? (and below) Don’t you have a Connman with its own m_next_node_id?


    vasild commented at 1:10 pm on December 3, 2024:
    GetNewNodeId() is private. Not worth weakening the encapsulation for this test.
  27. in src/common/sockman.h:228 in bb5b91d430 outdated
    119+     */
    120+    virtual bool ShouldTryToSend(NodeId node_id) const;
    121+
    122+    /**
    123+     * SockMan would only call Recv() on a node's socket if this returns true.
    124+     * Can be used to temporary pause receives for a node.
    


    pinheadmz commented at 5:03 pm on October 17, 2024:

    bb5b91d430026c4826a2107c8c1a311518cc97ce

    nit s/temporary/temporarily

    and above


    vasild commented at 1:13 pm on December 3, 2024:
    Done
  28. in src/common/sockman.h:257 in 50cb52470e outdated
    139+    /**
    140+     * SockMan has completed send+recv for all nodes.
    141+     * Can be used to execute periodic tasks for all nodes.
    142+     * The implementation in SockMan does nothing.
    143+     */
    144+    virtual void EventIOLoopCompletedForAllPeers();
    


    pinheadmz commented at 5:12 pm on October 17, 2024:

    50cb52470ea6cc4de5a5e5260b8f308353942ec0

    This is great, can be used for timed actions (like locking the wallet) we used to depend on libevent for

  29. in src/net.cpp:2035 in 50cb52470e outdated
    2030+void CConnman::EventIOLoopCompletedForAllPeers()
    2031+{
    2032+    AssertLockNotHeld(m_nodes_mutex);
    2033+    AssertLockNotHeld(m_reconnections_mutex);
    2034+
    2035+    DisconnectNodes();
    


    pinheadmz commented at 5:19 pm on October 17, 2024:

    50cb52470ea6cc4de5a5e5260b8f308353942ec0

    Interesting that the parent class handles accepting connections but the child is expected to handle disconnections, and inside the utility event loop function. Of course only the derived class would know when disconnection is appropriate, I wonder if anything it makes sense just to add a comment in sockman.h that EventIOLoopCompletedForAllPeers() is expected to handle that, erasing items from m_nodes


    vasild commented at 1:34 pm on December 3, 2024:

    Good observation. I guess this is the nature of things: accepting new connections is driven (initiated) by low level socket event when a new connection arrives, while disconnecting can be low level driven or high level driven - e.g. misbehaving peer at the application layer.

    I extended the comment of EventIOLoopCompletedForAllPeers() a little bit:

    0    -+     * Can be used to execute periodic tasks for all nodes.                   
    1    ++     * Can be used to execute periodic tasks for all nodes, like disconnecting
    2    ++     * nodes due to higher level logic.           
    
  30. in src/common/sockman.cpp:25 in 3bb0145514 outdated
     9@@ -10,6 +10,19 @@
    10 #include <util/sock.h>
    11 #include <util/thread.h>
    12 
    13+CService GetBindAddress(const Sock& sock)
    14+{
    15+    CService addr_bind;
    16+    struct sockaddr_storage sockaddr_bind;
    17+    socklen_t sockaddr_bind_len = sizeof(sockaddr_bind);
    18+    if (!sock.GetSockName((struct sockaddr*)&sockaddr_bind, &sockaddr_bind_len)) {
    


    pinheadmz commented at 5:32 pm on October 17, 2024:

    3bb0145514978092ae966e30693cf619e4034837

    the else {error} looked backwards to me here at first, might be more clear to use == 0 instead of ! but i dunno what the official style is for this.


    vasild commented at 1:41 pm on December 3, 2024:

    I don’t think we have an official recommendation. My personal preference is to not treat integers as booleans: use if (x == 0) instead of if (!x) when x is an integer. But I am leaving this as it is because it is like that in master and this PR only moves it from net.cpp to sockman.cpp verbatim. I hope it is easier to reviewers to just check that a blob of code is moved without having to validate its correctness since that same code is already in master. This helps with such moves:

    ~/.gitconfig:

    0[diff]  
    1        colorMoved = dimmed-zebra
    2        colorMovedWS = allow-indentation-change
    
  31. in src/common/sockman.cpp:13 in 3bb0145514 outdated
     9@@ -10,6 +10,19 @@
    10 #include <util/sock.h>
    11 #include <util/thread.h>
    12 
    13+CService GetBindAddress(const Sock& sock)
    


    pinheadmz commented at 5:35 pm on October 17, 2024:

    3bb0145514978092ae966e30693cf619e4034837

    Would this make more sense as a direct method of Sock ? Especially now that it returns the simpler CService instead of CAddress


    vasild commented at 1:46 pm on December 3, 2024:
    Maybe yes, but sock.{h,cpp} does not currently include netaddress.h and I guess including it may cause circular dependency. It would also increase the scope of this PR. I am leaving it as it is.
  32. in src/common/sockman.h:110 in 3bb0145514 outdated
     99+     * local side of the socket.
    100+     * @param[out] sock Connected socket, if the operation is successful.
    101+     * @param[out] i2p_transient_session I2P session, if the operation is successful.
    102+     * @return Newly generated node id, or std::nullopt if the operation fails.
    103+     */
    104+    std::optional<NodeId> ConnectAndMakeNodeId(const std::variant<CService, StringHostIntPort>& to,
    


    pinheadmz commented at 5:44 pm on October 17, 2024:

    3bb0145514978092ae966e30693cf619e4034837

    This is great. Won’t be needed immediately for HTTP or StratumV2 but means we can more easily remove libevent from bitcoin-cli !

  33. in src/common/sockman.cpp:282 in c133a634d1 outdated
    334@@ -262,8 +335,16 @@ void SockMan::EventIOLoopCompletedForAllPeers() {}
    335 
    336 void SockMan::EventI2PListen(const CService&, bool) {}
    337 
    338+void SockMan::TestOnlyAddExistentNode(NodeId node_id, std::unique_ptr<Sock>&& sock)
    339+{
    340+    LOCK(m_connected_mutex);
    341+    m_connected.emplace(node_id, std::make_shared<NodeSockets>(std::move(sock)));
    


    pinheadmz commented at 2:57 pm on October 18, 2024:

    c133a634d10e172652ebbf05ca114037c8e551b9

    I know this is for testing only, but what happens if there is a NodeId collision (An existing node is inserted with a duplicate ID, replacing a connected node in sockman)?


    vasild commented at 1:53 pm on December 3, 2024:
    It would be a mess. I added an assert to make it clear that a test which inserts duplicate ids is broken.
  34. in src/common/sockman.cpp:348 in c133a634d1 outdated
    390+        // listening sockets in one call ("readiness" as in poll(2) or
    391+        // select(2)). If none are ready, wait for a short while and return
    392+        // empty sets.
    393+        auto io_readiness{GenerateWaitSockets()};
    394+        if (io_readiness.events_per_sock.empty() ||
    395+            !io_readiness.events_per_sock.begin()->first->WaitMany(SELECT_TIMEOUT,
    


    pinheadmz commented at 3:13 pm on October 18, 2024:

    c133a634d10e172652ebbf05ca114037c8e551b9

    You might want to add another comment here - when I was trying to handle socks myself I found this a bit weird and wrote:

    0    // WaitMany() mine as well be a static function, the context
    1    // of the first Sock in the vector is not relevant.
    

    vasild commented at 1:57 pm on December 3, 2024:
    Done
  35. pinheadmz approved
  36. pinheadmz commented at 3:14 pm on October 18, 2024: member

    ACK 70c2f13f83a5cc740330d0b4af9cbd74515be6b2

    Built and ran tests on macos/arm as well as debian/x86.

    Synced to signet with this build on a debian server, then synced from that node locally using the macos build.

    I left a lot of comments below but most are for myself to track during rebases, indicate to the maintainers the depth that I’m actually reviewing, and also some notes for myself – I’ll be rebaing my non-libevent http server on this banch and consuming the new SockMan.

    I have a working branch with my own sock handler for http, so I was able to compare that work to the code in this PR to follow along.

    There are two areas I am not entirely clear on that I will review more closely in the coming weeks which is I2P stuff, and reviewing all the mutex stuff.

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA256
     2
     3ACK 70c2f13f83a5cc740330d0b4af9cbd74515be6b2
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQIzBAEBCAAdFiEE5hdzzW4BBA4vG9eM5+KYS2KJyToFAmcSeTIACgkQ5+KYS2KJ
     7yTp/1A//ZzhksZzWlBbV6rcfIXnUVQS0SIgmY7sZCN8AN0RoiN4ECALWyrXz9Rdg
     8fONCWiQSTyxE5u2TnFo7os+o0Z+3ybxApufAx/OzqHaJwR8d5wnZPHU/uFTPYm/z
     9qArPT3SHQgKb1sftt8S5ve7j37RW7IbshiJ+6bq8+DNMY1Ep1DX4BBzZgwbzx2ek
    10qMcRFQkOkTbJg3/yOT1Vz361qCPLmbKjvbRyjYgYYG7oxtdwb5r/ezeagSCWKDJo
    11Yx/DJ2cwZ3uMWDvfnhNJOeql4MhOLZcU5hKyGGTRT3B5ER7ELNye5+Sz+pv0cJuP
    12R+zfoxAfYV3opqB/ewTD1sM+gtFZFUiLNHigPgKFmK5GerLguXUqY+vgQ/4VIEpg
    13XD2tHOCwwDULsekVLZ+hXJPSTBDmZAlHmJvcCBQgzLkjCMvrlmxJR8KpKg3lDgKM
    14GwFcYoIWubyFKgRCAQL6vc1qFnGc7PJ1TWJLoKxpjXeyE6igHquSK0KE3flhCJ97
    15CpWhZflrHwcaomSf9bDYsSaNvMN2MiFUzC4lbmLRWTzeOCS9oMFRMcIayQ+psUqC
    160gmbZzi+Qcf32gCNYvhxYwKqTeAlKih9x+vurTuK8831AZoUYwusetGaTsklwvOn
    17bDpgEGoqFPdPIB8yHAiFBwQPcGmn7tVmTbVhck59B3HvdS/wOr4=
    18=rnjt
    19-----END PGP SIGNATURE-----
    

    pinheadmz’s public key is on keybase

  37. DrahtBot requested review from tdb3 on Oct 18, 2024
  38. Sjors commented at 3:09 pm on October 21, 2024: member
    cc @theuni any takes?
  39. DrahtBot added the label Needs rebase on Oct 28, 2024
  40. Sjors commented at 4:47 pm on November 14, 2024: member

    This needs a rebase. Otherwise it’s non-trivial for me to keep https://github.com/Sjors/bitcoin/pull/67 rebased.

    Conflicts:

  41. vasild force-pushed on Nov 14, 2024
  42. vasild commented at 4:58 pm on November 14, 2024: contributor

    70c2f13f83...c4f51c7f61: rebase due to conflicts, next I will look at @pinheadmz feedback above and implement changes from coredev discussions:

    • Store the node id -> CNode map in SockMan (then the SockMan class becomes templated)
    • SockMan <-> Connman calls based on CNode, not node id
    • Make the virtual methods in SockMan private, since they are called only from SockMan
  43. DrahtBot removed the label CI failed on Nov 14, 2024
  44. DrahtBot removed the label Needs rebase on Nov 14, 2024
  45. vasild force-pushed on Nov 19, 2024
  46. vasild commented at 1:28 pm on November 19, 2024: contributor

    c4f51c7f61...2b0103705f: rebase and make the virtual methods of SockMan private since they are only used by SockMan.

    About storing the CNode in SockMan (which would be templated like SockMan<CNode>) and making the communication between SockMan and the higher class (e.g. CConnman) based on CNode instead of node id: this is an excellent idea that will make the code more straight-forward and the higher classes simpler. However it would make this PR larger. I will leave it off for now. There is some specific CNode ref-counting in CConnman and the code around deleting a node is a bit complicated, so it would require further non-trivial changes to move that to SockMan.

    I think in general, in the long term, independently of this PR, it would be good to get rid of the manual CNode ref-counting and tap std::shared_ptr to do that for us. In other words, to revive #28222.

  47. vasild force-pushed on Dec 3, 2024
  48. vasild commented at 1:59 pm on December 3, 2024: contributor
    2b0103705f...645f625e29: rebase and address suggestions
  49. DrahtBot added the label Needs rebase on Dec 6, 2024
  50. in src/common/sockman.h:128 in 645f625e29 outdated
    123+     * explaining the error.
    124+     * @retval >=0 The number of bytes actually sent.
    125+     * @retval <0 A permanent error has occurred.
    126+     */
    127+    ssize_t SendBytes(NodeId node_id,
    128+                      Span<const unsigned char> data,
    


    pinheadmz commented at 8:12 pm on December 10, 2024:
    Why not use std::span? I thought Span was only a temporary fill-in until C++20 something something? Also, I’m using std::span elsewhere… is Span the preference?

    Sjors commented at 7:10 am on December 16, 2024:
    Even in relatively new code folks often use Span, e.g. #30043. cc @sipa, @laanwj

    vasild commented at 5:47 am on December 19, 2024:

    Must have been just inertia. The caller of the newly added SendBytes() has an argument that is already Span from elsewhere in the code, coming from here:

    https://github.com/bitcoin/bitcoin/blob/c1252b14d714448295200a595086dd3e78b46c8f/src/net.h#L307-L311

    Now I changed the argument of the newly added SendBytes() to std::span, looks better and the Span variable is accepted. Thanks!

    wen s/Span/std::span/ all over the place?


    vasild commented at 6:36 pm on December 19, 2024:

    wen s/Span/std::span/ all over the place?

    now: https://github.com/bitcoin/bitcoin/pull/31519

  51. Sjors commented at 2:07 am on December 19, 2024: member
    Looks like the merge conflict is with #31072 as well as #31223.
  52. vasild force-pushed on Dec 19, 2024
  53. vasild commented at 5:48 am on December 19, 2024: contributor
    645f625e29...b8b042626e: rebase due to conflicts and use std::span instead of Span in new code: #30988 (review)
  54. DrahtBot removed the label Needs rebase on Dec 19, 2024
  55. DrahtBot added the label Needs rebase on Dec 27, 2024
  56. vasild force-pushed on Jan 13, 2025
  57. vasild commented at 2:31 pm on January 13, 2025: contributor
    b8b042626e...f71f1a346c: rebase due to conflicts
  58. DrahtBot removed the label Needs rebase on Jan 13, 2025
  59. Sjors commented at 2:09 pm on January 14, 2025: member
    I found myself confused by e8b589ef45d5c94c47209fb952e8b05d631a9c47 dropping AddSocketPermissionFlags. Can you move that to a separate (earlier?) commit?
  60. in src/net.cpp:3156 in 4c23aacc2b outdated
    3152             m_client_interface->ThreadSafeMessageBox(strError, "", CClientUIInterface::MSG_ERROR);
    3153         }
    3154         return false;
    3155     }
    3156 
    3157+    LogPrintLevel(BCLog::NET, BCLog::Level::Info, "Bound to and listening at %s\n", addr.ToStringAddrPort());
    


    Sjors commented at 3:05 pm on January 14, 2025:

    4c23aacc2b82083f96255963f543dc35fff28334: I’m inclined to say “listening on %s”

    As in: we bind to a port in order to listen on it.

    The linux manual for listen also uses “on”, as in “listen for connections on a socket” https://man7.org/linux/man-pages/man2/listen.2.html


    vasild commented at 4:38 pm on January 14, 2025:
    Done, s/at/on/
  61. in src/common/sockman.cpp:32 in 4c23aacc2b outdated
     8@@ -9,68 +9,77 @@
     9 #include <netbase.h>
    10 #include <util/sock.h>
    11 
    12-bool SockMan::BindListenPort(const CService& addrBind, bilingual_str& strError)
    13+bool SockMan::BindAndStartListening(const CService& to, bilingual_str& errmsg)
    


    Sjors commented at 3:15 pm on January 14, 2025:
    4c23aacc2b82083f96255963f543dc35fff28334: since you need to print log statements here in a few places anyway, might as well drop the errmsg argument and always print logs here.

    vasild commented at 4:38 pm on January 14, 2025:
    BindAndStartListening() needs to return an error string to the caller, because the caller not only logs it but also passes it to ThreadSafeMessageBox().

    Sjors commented at 4:45 pm on January 14, 2025:
    Ah ok, I missed that bit.
  62. vasild force-pushed on Jan 14, 2025
  63. vasild commented at 4:40 pm on January 14, 2025: contributor

    f71f1a346c...d83ff66fd7:

    … dropping AddSocketPermissionFlags. Can you move that to a separate (earlier?) commit?

    Done.

  64. DrahtBot added the label CI failed on Jan 14, 2025
  65. vasild force-pushed on Jan 14, 2025
  66. vasild commented at 5:41 pm on January 14, 2025: contributor
    d83ff66fd7...bcf1254e91: adjust test after change of the log message in the previous push
  67. DrahtBot removed the label CI failed on Jan 14, 2025
  68. in src/net.cpp:1678 in fd81820214 outdated
    1736@@ -1737,7 +1737,10 @@ void CConnman::AcceptConnection(const ListenSocket& hListenSocket) {
    1737     const CService addr_bind{MaybeFlipIPv6toCJDNS(GetBindAddress(*sock))};
    1738 
    1739     NetPermissionFlags permission_flags = NetPermissionFlags::None;
    1740-    hListenSocket.AddSocketPermissionFlags(permission_flags);
    1741+    auto it{m_listen_permissions.find(addr_bind)};
    1742+    if (it != m_listen_permissions.end()) {
    


    Sjors commented at 11:18 am on January 15, 2025:
    fd81820214e695ba228a954506397c3d781fe3fe: do want to add an Assume here, given that Bind always adds an entry?

    vasild commented at 1:49 pm on January 28, 2025:
    Right, currently Bind() always adds an entry, but I do not want to impose that because the code here does not require it. It is fine if there is nothing in m_listen_permissions. Then the permissions will remain at NetPermissionFlags::None. So it is kind of optional for Bind() to add an entry to m_listen_permissions and no reason to enforce it.
  69. Sjors commented at 11:28 am on January 15, 2025: member

    … dropping AddSocketPermissionFlags. Can you move that to a separate (earlier?) commit?

    Done.

    Thanks. Don’t forget to drop “Also, separate the listening socket from the permissions” from the commit description of e5d36eea015efc31aa38d540af4cf39c9e2e46b0.

    Along similar lines, though less important, you could also have a separate commit that drops ListenSocket and introduces m_listen as a member of CConnman, before moving it to Sockman.

    (it makes git show --color-moved=dimmed-zebra more grey, less red and green when you introduce SockMan)

  70. in src/net.cpp:2177 in 0241b04cf4 outdated
    2173+            CService addr_accepted;
    2174+
    2175+            auto sock_accepted{AcceptConnection(*sock, addr_accepted)};
    2176+
    2177+            if (sock_accepted) {
    2178+                addr_accepted = MaybeFlipIPv6toCJDNS(addr_accepted);
    


    Sjors commented at 11:54 am on January 15, 2025:
    0241b04cf406d482abfac3fddfad9a9c28725f32: previously this would not get called if addr.SetSockAddr failed in AcceptConnection. I suspect it doesn’t matter though.

    vasild commented at 2:01 pm on January 28, 2025:

    Yes, good observation! Indeed the flip will do nothing on the invalid 0.0.0.0 addr_accepted.

    The code in master is also a bit odd that it continues if addr.SetSockAddr() fails in CConnman::AcceptConnection() and it will pass the default constructed, invalid 0.0.0.0 address to CreateNodeFromAcceptedSocket(). I didn’t change that in this PR as it would be a functional change that is not needed for this PR.

  71. Sjors commented at 12:37 pm on January 15, 2025: member
    Reviewed up to 1b05e1d4ba55a42ba74026b68fa4e616b973e06d.
  72. in src/common/sockman.h:263 in 14fcef6b0d outdated
    77+
    78+    /**
    79+     * Be notified of a change in the state of listening for incoming I2P connections.
    80+     * The default behavior, implemented by `SockMan`, is to ignore this event.
    81+     * @param[in] addr Our listening address.
    82+     * @param[in] success If true then the listen succeeded and we are now
    


    Sjors commented at 1:17 pm on January 15, 2025:

    14fcef6b0d1d1fa9395f9af2bafbf3de63d14ac2: I find this confusing.

    I think the problem starts with the strange naming of ThreadI2PAccept, instead of ThreadI2PHandler akin to ThreadSocketHandler.

    Anyway, what that thread seems to do is, in a loop, try to Listen on the socket and once listening, Accept( new connections.

    Whenever listening fails or succeeds it calls EventI2PListen which then calls AddLocal / RemoveLocal to keep the announced public addresses up to date.

    With that in mind, I would call this new method EventI2PConnectivity and rename success to either connected or listening. Then the rest makes sense.


    vasild commented at 3:09 pm on January 28, 2025:

    ThreadI2PAccept vs ThreadI2PHandler. The thing is that this thread is only accepting connections. It does not do anything about I2P connections that are already established. ThreadSocketHandler does both accepting new connections and doing send/recv on existent connections.

    Yes, your observation is correct. Listening can fail temporary, e.g. if the I2P router daemon is restarted. This is unlike the TCP listening where once bind() and listen() succeed, then we are listening until we decide to close the socket. Tor is like I2P - it can fail to listen for a while if the Tor daemon is restarted. Our Tor code has the deficiency that it does not handle that, so it would act as if we are listening (e.g. advertising our Tor address) even if we are not listening.

    Current EventI2PListen vs EventI2PConnectivity - I think “Connectivity” would be too broad because this method is only called in the event where we start or stop listening for incoming I2P connections. It is not related to whether we have existent I2P connections or to our ability to make new outgoing I2P connections which “Connectivity” would imply.


    Sjors commented at 3:41 pm on January 28, 2025:
    Maybe EventI2PReady or EventI2PStatus? Something to distinguish it from listen.

    vasild commented at 2:23 pm on February 6, 2025:
    Changed to EventI2PStatus() and instead of bool it now takes an enum with (currently) two possible values: START_LISTENING and STOP_LISTENING. Will be in next push.
  73. in src/net.h:1154 in 21c5e05619 outdated
    1150@@ -1150,7 +1151,7 @@ class CConnman : private SockMan
    1151     void ForEachNode(const NodeFn& func)
    1152     {
    1153         LOCK(m_nodes_mutex);
    1154-        for (auto&& node : m_nodes) {
    1155+        for (auto& [id, node] : m_nodes) {
    


    Sjors commented at 1:59 pm on January 15, 2025:

    21c5e05619c8a6eb736bb1c61725f4b5f669ffb4: maybe use [_, node] in places where you don’t need the id.

    It’s a bit unfortunate that you can’t (?) directly loop over all T in std::unordered_map<Key, T>. Since most of the time we don’t need Key (id) here.


    vasild commented at 4:19 pm on January 28, 2025:
    Changed to [_, node], thanks! :)
  74. in src/common/sockman.h:244 in dc6393cb93 outdated
    132@@ -133,6 +133,22 @@ class SockMan
    133     // alone to use the default implementation from SockMan.
    134     //
    135 
    136+    /**
    137+     * SockMan would only call EventReadyToSend() if this returns true.
    


    Sjors commented at 2:10 pm on January 15, 2025:
    dc6393cb93c4851a363b69fd474656cac1ae3b3b: EventReadyToSend doesn’t exist yet in this commit, so maybe the commit message can announce it.

    vasild commented at 4:33 pm on January 28, 2025:
    Right! Adjusted the comments.
  75. in src/net.cpp:2060 in dc6393cb93 outdated
    2065-            // once a potential message from vSendMsg is handed to the transport. GetBytesToSend
    2066-            // determines both of these in a single call.
    2067-            const auto& [to_send, more, _msg_type] = pnode->m_transport->GetBytesToSend(!pnode->vSendMsg.empty());
    2068-            select_send = !to_send.empty() || more;
    2069-        }
    2070+        const bool select_recv{ShouldTryToRecv(pnode->GetId())};
    


    Sjors commented at 2:16 pm on January 15, 2025:
    dc6393cb93c4851a363b69fd474656cac1ae3b3b: it would be good to annotate in CConnman::SocketHandlerConnected above sendSet = it->second.occurred & Sock::SEND; that Sock::SEND is unset when ShouldTryToRecv is false.

    vasild commented at 4:47 pm on January 28, 2025:

    Hmm, you probably mean ShouldTryToSend is false, right? And then the same applies to receiving. Amended this:

    0-                recvSet = it->second.occurred & Sock::RECV;
    1-                sendSet = it->second.occurred & Sock::SEND;
    2+                recvSet = it->second.occurred & Sock::RECV; // Sock::RECV could only be set if ShouldTryToRecv() has returned true in GenerateWaitSockets().
    3+                sendSet = it->second.occurred & Sock::SEND; // Sock::SEND could only be set if ShouldTryToSend() has returned true in GenerateWaitSockets().
    
  76. Sjors commented at 2:25 pm on January 15, 2025: member
    Concept ACK
  77. in src/common/sockman.cpp:113 in bcf1254e91 outdated
    108+    return true;
    109+}
    110+
    111+void SockMan::StartSocketsThreads(const Options& options)
    112+{
    113+    m_thread_socket_handler = std::thread(&util::TraceThread, "net", [this] { ThreadSocketHandler(); });
    


    pinheadmz commented at 5:47 pm on January 17, 2025:
    Does this hard-coded thread name cause any issues if sockman is used more than once?

    vasild commented at 5:15 pm on January 28, 2025:
    Yes, it would mess up the logging. I put both "net" and "i2paccept" in SockMan::Options, so that they are not hardcoded here anymore and can be set by the callers of SockMan.
  78. in src/common/sockman.h:196 in bcf1254e91 outdated
    184+     * and calls `SockMan::SendBytes()`.
    185+     * @param[in] node_id Id of the node whose socket is ready to send.
    186+     * @param[out] cancel_recv Should always be set upon return and if it is true,
    187+     * then the next attempt to receive data from that node will be omitted.
    188+     */
    189+    virtual void EventReadyToSend(NodeId node_id, bool& cancel_recv) = 0;
    


    pinheadmz commented at 4:18 pm on January 23, 2025:

    I might be misunderstanding cancel_recv, in my HTTP server I set it to true when I’m planning on disconnecting a node like if there was an error sending data or if "Connection: close" is set. However this causes weird behavior where new connections get hung up forever. In particular, interface_rpc.py stalls on the ultimate node shutdown and eventually times out after 120 seconds of rpc stop not working. When I set cancel_recv=false everywhere, i don’t have this issue. (Still a lot of other issues to clean up in this branch!)

    https://github.com/pinheadmz/bitcoin/commit/df1965a4e97ce7f3df067df9b3dcc2bbd8e6e15b


    vasild commented at 7:36 pm on January 28, 2025:

    @param[out] cancel_recv Should always be set upon return

    I think the problem was that HTTPServer::EventReadyToSend() could end up not setting cancel_recv in which case it remains uninitialized, containing a “random” value from the stack at the caller (which was probably true, so it canceled all receives from the client, never receiving anything causing an infinite loop):

    0              bool cancel_recv;
    1  
    2              EventReadyToSend(node_id, cancel_recv);
    3  
    4              if (cancel_recv) {
    5                  recv_ready = false;
    6              }
    

    Valgrind or memory sanitizer should complain about this code reading uninitialized value in if (cancel_recv).

    The below fixes it, remove the last commit from your branch and move the setting of cancel_recv from the bottom to the top of the method. So that cancel_recv is set in all code paths inside HTTPServer::EventReadyToSend().

     0--- i/src/httpserver.cpp
     1+++ w/src/httpserver.cpp
     2@@ -1123,12 +1123,15 @@ bool HTTPServer::EventNewConnectionAccepted(NodeId node_id,
     3     m_connected_clients.emplace(client->m_node_id, std::move(client));
     4     return true;
     5 }
     6 
     7 void HTTPServer::EventReadyToSend(NodeId node_id, bool& cancel_recv)
     8 {
     9+    // Next attempt to receive data from this node is permitted
    10+    cancel_recv = false;
    11+
    12     // Get the HTTPClient
    13     auto client{WITH_LOCK(m_clients_mutex, return GetClientById(node_id);)};
    14     if (client == nullptr) {
    15         return;
    16     }
    17 
    18@@ -1149,26 +1152,26 @@ void HTTPServer::EventReadyToSend(NodeId node_id, bool& cancel_recv)
    19             LogDebug(
    20                 BCLog::HTTP,
    21                 "Error sending HTTP response headers to client %s (id=%lld): %s\n",
    22                 client->m_origin,
    23                 client->m_node_id,
    24                 err);
    25-            cancel_recv = false;
    26+            cancel_recv = true;
    27             client->m_disconnect = true;
    28             return;
    29         }
    30 
    31         ssize_t body_bytes_sent = SendBytes(node_id, MakeUCharSpan(res.m_body), /*will_send_more=*/true, err);
    32         if (body_bytes_sent < 0) {
    33             LogDebug(
    34                 BCLog::HTTP,
    35                 "Error sending HTTP response body to client %s (id=%lld): %s\n",
    36                 client->m_origin,
    37                 client->m_node_id,
    38                 err);
    39-            cancel_recv = false;
    40+            cancel_recv = true;
    41             client->m_disconnect = true;
    42             return;
    43         }
    44 
    45         LogDebug(
    46             BCLog::HTTP,
    47@@ -1176,20 +1179,17 @@ void HTTPServer::EventReadyToSend(NodeId node_id, bool& cancel_recv)
    48             hdr_bytes_sent + body_bytes_sent,
    49             client->m_origin,
    50             client->m_node_id);
    51 
    52         // Our work is done here
    53         if (!res.m_keep_alive) {
    54-            cancel_recv = false;
    55+            cancel_recv = true;
    56             client->m_disconnect = true;
    57             return;
    58         }
    59     }
    60-
    61-    // Next attempt to receive data from this node is permitted
    62-    cancel_recv = false;
    63 }
    64 
    65 void HTTPServer::EventGotData(NodeId node_id, const uint8_t* data, size_t n)
    66 {
    67     // Get the HTTPClient
    68     auto client{WITH_LOCK(m_clients_mutex, return GetClientById(node_id);)};
    

    pinheadmz commented at 8:23 pm on January 28, 2025:
    awesome thank you
  79. vasild force-pushed on Jan 28, 2025
  80. vasild commented at 5:20 pm on January 28, 2025: contributor

    bcf1254e91...266ac32673: address suggestions

    Don’t forget to drop “Also, separate …

    Done.

    Along similar lines, though less important, you could also have a separate commit that drops ListenSocket

    Done.

    Thanks!

  81. pinheadmz commented at 8:29 pm on January 28, 2025: member
    Something else I see using Sockman is in the mempool_limit functional test, we call getrawtransaction and expect around 540,000 bytes back. The call to send() returns the total amount of bytes immediately, but it requires 60-70 TCP packets to actually transmit the response. I think what I’m seeing in wireshark is that this transmission is incomplete when I close the HTTP connection with CloseConnection() which closes the underlying socket. The client waits 120 seconds for the rest of its data which never arrives before timing out. I’m not sure if there’s a flag in one of the syscalls or another way to check the kernel’s socket buffer before closing?
  82. vasild commented at 4:27 pm on January 29, 2025: contributor

    Something else I see using Sockman is in the mempool_limit functional test, we call getrawtransaction and expect around 540,000 bytes back. The call to send() returns the total amount of bytes immediately …

    Hmm, what I observed is that the send call sends less bytes than requested.

    (Lets move the discussion away from the main thread of this PR into the below link)

    Some a more elaborate explanation plus patch that fixes the problem in https://github.com/pinheadmz/bitcoin/commit/df1965a4e97ce7f3df067df9b3dcc2bbd8e6e15b#r151906181

  83. in src/common/sockman.h:54 in 266ac32673 outdated
    49+     * @param[in] to Where to bind.
    50+     * @param[out] errmsg Error string if an error occurs.
    51+     * @retval true Success.
    52+     * @retval false Failure, `strError` will be set.
    53+     */
    54+    bool BindAndStartListening(const CService& to, bilingual_str& errmsg);
    


    hodlinator commented at 9:01 pm on January 29, 2025:

    strError only appears in the comment, and “error message” should probably be snake_case:

    0     * [@param](/bitcoin-bitcoin/contributor/param/)[out] err_msg Error string if an error occurs.
    1     * [@retval](/bitcoin-bitcoin/contributor/retval/) true Success.
    2     * [@retval](/bitcoin-bitcoin/contributor/retval/) false Failure, `err_msg` will be set.
    3     */
    4    bool BindAndStartListening(const CService& to, bilingual_str& err_msg);
    

    vasild commented at 5:26 am on February 7, 2025:
    Done.

    hodlinator commented at 11:19 am on February 10, 2025:
    Thanks! (PR summary is out of date, but I understand if you don’t want to constantly manually update it).
  84. in src/net.cpp:3241 in 266ac32673 outdated
    3213     for (CNode* pnode : m_nodes_disconnected) {
    3214         DeleteNode(pnode);
    3215     }
    3216     m_nodes_disconnected.clear();
    3217-    vhListenSocket.clear();
    3218+    m_listen_permissions.clear();
    


    hodlinator commented at 9:39 pm on January 29, 2025:

    Do we ever want to remove individual entries from m_listen_permissions upon disconnect of single peers, long before .clear()?

    Same goes for m_listen. I’m not claiming vhListenSocket was any better before this PR.


    vasild commented at 5:33 am on February 7, 2025:
    No, we do not want to remove entries from m_listen_permissions or from m_listen when disconnecting a single peer. The entries in those two maps are per listening address. E.g. if we listen on 1.1.1.1:8333 and designate that every peer that connects to that address gets permissions1 and listen on 2.2.2.2:8333 and peers that connect to that address get permissions2. This stays unchanged as peers connect and disconnect. We would want to remove entries from those maps if we stop listening - i.e. if we close the listening socket. This is only done at shutdown.

    hodlinator commented at 9:53 am on February 7, 2025:
    :facepalm: makes total sense, I was thinking about the remote sockets.
  85. in src/net.cpp:3012 in 266ac32673 outdated
    2981         return false;
    2982     }
    2983 
    2984+    LogPrintLevel(BCLog::NET, BCLog::Level::Info, "Bound to and listening on %s\n", addr.ToStringAddrPort());
    2985+
    2986+    m_listen_permissions.emplace(addr, permissions);
    


    hodlinator commented at 9:42 pm on January 29, 2025:

    Seems like .emplace() will not replace existing entries. In the off chance that someone takes a peer offline and then restarts it with other permissions, would it not be better to do this?

    0    m_listen_permissions[addr] = permissions;
    

    vasild commented at 5:44 am on February 7, 2025:

    addr is our listening address. We can’t listen on the same address two times:

    If I put this in my bitcoind.conf:

    0bind=127.0.0.1:20001
    1bind=127.0.0.1:20002
    2bind=127.0.0.1:20003
    3bind=127.0.0.1:20003
    

    I get:

    02025-02-07T05:39:56Z [net:info] Bound to and listening on 127.0.0.1:20001
    12025-02-07T05:39:56Z [net:info] Bound to and listening on 127.0.0.1:20002
    22025-02-07T05:39:56Z [net:info] Bound to and listening on 127.0.0.1:20003
    32025-02-07T05:39:56Z [net:error] Cannot bind to 127.0.0.1:20003: Address already in use (48) (Bitcoin Core already running?)
    

    and startup is aborted.

    I could add Assume() or return an error from here if insertion fails due to an already existent entry?


    hodlinator commented at 9:55 am on February 7, 2025:
    Yeah, sorry, me confusing sockets again. An Assume() would be nice though.
  86. in src/net.cpp:1680 in 266ac32673 outdated
    1706-{
    1707-    int nInbound = 0;
    1708+    auto it{m_listen_permissions.find(addr_bind)};
    1709+    if (it != m_listen_permissions.end()) {
    1710+        NetPermissions::AddFlag(permission_flags, it->second);
    1711+    }
    


    hodlinator commented at 9:57 pm on January 29, 2025:

    In fd81820214e695ba228a954506397c3d781fe3fe: nit: Possible less mutating simplification (assumes NetPermissionFlags::None == 0):

    0-     NetPermissionFlags permission_flags = NetPermissionFlags::None;
    1-     auto it{m_listen_permissions.find(addr_bind)};
    2-     if (it != m_listen_permissions.end()) {
    3-         NetPermissions::AddFlag(permission_flags, it->second);
    4-     }
    5+     auto it{m_listen_permissions.find(addr_bind)};
    6+     NetPermissionFlags permission_flags = it != m_listen_permissions.end() ?
    7+         it->second : NetPermissionFlags::None;
    

    vasild commented at 5:53 am on February 7, 2025:
    The proposed snippet is more elegant and performant, I like it. It relies, however, on None being 0 and will break in a subtle way if that is changed. I will leave it as it is. This is executed once per newly accepted connection which is not super-often and the performance gain would be negligible.
  87. DrahtBot added the label Needs rebase on Jan 29, 2025
  88. in src/common/sockman.h:451 in 266ac32673 outdated
    418+    std::unique_ptr<i2p::sam::Session> m_i2p_sam_session;
    419+
    420+    /**
    421+     * List of listening sockets.
    422+     */
    423+    std::vector<std::shared_ptr<Sock>> m_listen;
    


    hodlinator commented at 10:04 pm on January 29, 2025:

    nit: The comment adds no information.

    0    std::vector<std::shared_ptr<Sock>> m_listen;
    

    Same for m_unused_i2p_sessions_mutex.


    Sjors commented at 9:19 am on February 3, 2025:
    It’s easier to read in doxygen html, and can always be expanded.

    vasild commented at 5:56 am on February 7, 2025:
    Agree, will leave it as it is.
  89. in src/common/sockman.cpp:267 in 266ac32673 outdated
    262+    }
    263+    errmsg = NetworkErrorString(err);
    264+    return -1;
    265+}
    266+
    267+void SockMan::CloseSockets()
    


    hodlinator commented at 10:15 pm on January 29, 2025:

    nit: Might have a more precise name since we are releasing references to shared_ptrs?

    0void SockMan::ReleaseSockets()
    

    Sjors commented at 9:30 am on February 3, 2025:
    IIUC the difference between release and close is only relevant in edge cases. The main purpose of this function it to close them. I would keep the name, but maybe document the function to point out the edge case.

    vasild commented at 6:07 am on February 7, 2025:
    This method was only dealing with listening sockets, so the name CloseSockets() with a comment “Close all sockets” was confusing. Renamed to StopListening().
  90. in src/net.cpp:2975 in 266ac32673 outdated
    2971@@ -3212,13 +2972,18 @@ bool CConnman::Bind(const CService& addr_, unsigned int flags, NetPermissionFlag
    2972     const CService addr{MaybeFlipIPv6toCJDNS(addr_)};
    2973 
    2974     bilingual_str strError;
    2975-    if (!BindListenPort(addr, strError, permissions)) {
    2976+    if (!BindAndStartListening(addr, strError)) {
    2977+        LogPrintLevel(BCLog::NET, BCLog::Level::Error, "%s\n", strError.original);
    


    hodlinator commented at 10:20 pm on January 29, 2025:

    nit: Sufficiently critical for a generic error?

    0        LogError("%s", strError.original);
    

    (Can also omit trailing newline here and in other added/modified log lines).


    Sjors commented at 9:34 am on February 3, 2025:

    According to developer notes, LogError is for:

    severe problems that require the node (or a subsystem) to shut down entirely

    And IIUC we indeed shut down here.


    vasild commented at 9:26 am on February 7, 2025:
    Changed to LogError() and removed the newline. I did not know the trailing newlines are now unnecessary (since bbbb2e43ee95c9a8866aa1f65e3f001f752dfed2).
  91. in src/common/sockman.cpp:96 in 266ac32673 outdated
    91+        errmsg = strprintf(_("Cannot bind to %s: %s%s"),
    92+                           to.ToStringAddrPort(),
    93+                           NetworkErrorString(err),
    94+                           err == WSAEADDRINUSE
    95+                               ? std::string{" ("} + CLIENT_NAME + " already running?)"
    96+                               : "");
    


    hodlinator commented at 10:34 pm on January 29, 2025:

    Problem: " (<CLIENT_NAME> already running?)"-string is untranslated.

    Would suggest bringing back old strings to reduce translation churn, or ensuring the new version is fully translated. Old strings:

    0        if (err == WSAEADDRINUSE) {
    1            err_msg = strprintf(_("Unable to bind to %s on this computer. %s is probably already running."), to.ToStringAddrPort(), CLIENT_NAME);
    2        } else {
    3            err_msg = strprintf(_("Unable to bind to %s on this computer (bind returned error %s)"), to.ToStringAddrPort(), NetworkErrorString(nErr));
    4        }
    

    vasild commented at 9:35 am on February 7, 2025:
    Done.
  92. in src/common/sockman.cpp:60 in 266ac32673 outdated
    54+    if (sock->SetSockOpt(SOL_SOCKET, SO_REUSEADDR, reinterpret_cast<sockopt_arg_type>(&one), sizeof(one)) == SOCKET_ERROR) {
    55+        LogPrintLevel(BCLog::NET,
    56+                      BCLog::Level::Info,
    57+                      "Cannot set SO_REUSEADDR on %s listen socket: %s, continuing anyway\n",
    58+                      to.ToStringAddrPort(),
    59+                      NetworkErrorString(WSAGetLastError()));
    


    hodlinator commented at 10:41 pm on January 29, 2025:

    nit: Curious why you changed these to be categorized (used to be LogPrintf() without category) - worth noting in commit message?

    0        LogInfo("Cannot set SO_REUSEADDR on %s listen socket: %s, continuing anyway",
    1                to.ToStringAddrPort(),
    2                NetworkErrorString(WSAGetLastError()));
    

    vasild commented at 9:38 am on February 7, 2025:
    Done.
  93. in src/common/sockman.cpp:102 in 266ac32673 outdated
     97+        return false;
     98+    }
     99+
    100+    // Listen for incoming connections
    101+    if (sock->Listen(SOMAXCONN) == SOCKET_ERROR) {
    102+        errmsg = strprintf(_("Cannot listen to %s: %s"), to.ToStringAddrPort(), NetworkErrorString(WSAGetLastError()));
    


    hodlinator commented at 10:42 pm on January 29, 2025:

    nit: listen to -> listen on?

    0        errmsg = strprintf(_("Cannot listen on %s: %s"), to.ToStringAddrPort(), NetworkErrorString(WSAGetLastError()));
    

    vasild commented at 9:58 am on February 7, 2025:
    Done.
  94. in src/common/sockman.cpp:25 in 266ac32673 outdated
    20+{
    21+    CService addr_bind;
    22+    struct sockaddr_storage sockaddr_bind;
    23+    socklen_t sockaddr_bind_len = sizeof(sockaddr_bind);
    24+    if (!sock.GetSockName((struct sockaddr*)&sockaddr_bind, &sockaddr_bind_len)) {
    25+        addr_bind.SetSockAddr((const struct sockaddr*)&sockaddr_bind);
    


    hodlinator commented at 9:18 am on January 30, 2025:

    nit: Leftover struct for instance and C-style casts post-modernization. Also in BindAndStartListening.

    0    sockaddr_storage sockaddr_bind;
    1    socklen_t sockaddr_bind_len = sizeof(sockaddr_bind);
    2    if (!sock.GetSockName(reinterpret_cast<sockaddr*>(&sockaddr_bind), &sockaddr_bind_len)) {
    3        addr_bind.SetSockAddr(reinterpret_cast<const sockaddr*>(&sockaddr_bind));
    

    vasild commented at 10:08 am on February 7, 2025:

    Removed struct from BindAndStartListening() in commit style: modernize the style of SockMan::BindListenPort().

    GetBindAddress() was just moved verbatim without any mods from net.cpp to sockman.cpp in another commit. This makes it easier to review with

    0[diff]  
    1        colorMoved = dimmed-zebra
    2        colorMovedWS = allow-indentation-change
    

    in ~/.gitconfig. Will leave it as it is. Don’t want to bloat this PR further with one more “style changing” commit.

  95. in src/common/sockman.h:54 in 266ac32673 outdated
    35+ */
    36+class SockMan
    37+{
    38+public:
    39+
    40+    virtual ~SockMan() = default;
    


    hodlinator commented at 9:35 am on January 30, 2025:

    Since it’s only inherited privately, destructor could be non-virtual and protected unless we want to destroy by SockMan-reference?

    Happy to see you made the virtual methods only called by SockMan private. :)


    vasild commented at 10:22 am on February 7, 2025:
    0error: delete called on non-final 'CConnman' that has virtual functions but non-virtual destructor [-Werror,-Wdelete-non-abstract-non-virtual-dtor]
    

    hodlinator commented at 11:12 am on February 7, 2025:
    Ah, and we can’t make CConnman final while we have struct ConnmanTestMsg : public CConnman, got it. Should have tested before suggesting.

    hodlinator commented at 3:10 pm on February 7, 2025:
    (Could still make the SockMan destructor protected and non-virtual if the CConnman destructor was made virtual, but there is little to gain from it).
  96. in src/net.cpp:2908 in 266ac32673 outdated
    2990-        if (sock->SetSockOpt(IPPROTO_IPV6, IPV6_PROTECTION_LEVEL, (const char*)&nProtLevel, sizeof(int)) == SOCKET_ERROR) {
    2991-            strError = Untranslated(strprintf("Error setting IPV6_PROTECTION_LEVEL on socket: %s, continuing anyway", NetworkErrorString(WSAGetLastError())));
    2992-            LogPrintf("%s\n", strError.original);
    2993-        }
    2994-#endif
    2995+        return;
    


    hodlinator commented at 9:43 am on January 30, 2025:
    nit: Could replace return with else if below?

    vasild commented at 10:18 am on February 7, 2025:

    In general, less indentation makes the code more readable, so I prefer to have return; and then the subsequent code to be with less indentation.

    Anyway this code was changed and now uses switch (see CConnman::EventI2PStatus() after I push).

  97. in src/common/sockman.h:102 in 266ac32673 outdated
    83+    void StartSocketsThreads(const Options& options);
    84+
    85+    /**
    86+     * Join (wait for) the threads started by `StartSocketsThreads()` to exit.
    87+     */
    88+    void JoinSocketsThreads();
    


    hodlinator commented at 9:58 am on January 30, 2025:

    nit: One plural word is enough?

    0     * Start the necessary threads for socket IO.
    1     */
    2    void StartSocketThreads(const Options& options);
    3
    4    /**
    5     * Join (wait for) the threads started by `StartSocketThreads()` to exit.
    6     */
    7    void JoinSocketThreads();
    

    vasild commented at 10:28 am on February 7, 2025:
    There are multiple sockets and multiple threads. “socket threads” looks like there is one socket and multiple threads. Will leave it as it is.

    Sjors commented at 11:15 am on February 7, 2025:
    SocketsAndThreads would be clearer, but I think the current name is fine.

    hodlinator commented at 11:20 am on February 7, 2025:
    We open/close sockets, we don’t start/join them, so I prefer the current name over adding “And”.
  98. in src/net.cpp:3427 in 266ac32673 outdated
    3400-            LogDebug(BCLog::NET, "disconnect by id peer=%d; disconnecting\n", pnode->GetId());
    3401-            pnode->fDisconnect = true;
    3402-            return true;
    3403-        }
    3404+    auto it{m_nodes.find(id)};
    3405+    if (it == m_nodes.end()) {
    


    hodlinator commented at 10:26 am on January 30, 2025:

    nit: Could go with something closer to original?

    0    if (auto it{m_nodes.find(id)}; it != m_nodes.end()) {
    1        auto node{it->second};
    2        LogDebug(BCLog::NET, "disconnect by id peer=%d; disconnecting\n", id);
    3        node->fDisconnect = true;
    4        return true;
    5    } else {
    6        return false;
    7    }
    

    Same pattern of auto it{m_nodes.find( appearing on it’s own line when it could be hoisted into the if to reduce scope is repeated elsewhere too.


    vasild commented at 10:34 am on February 7, 2025:

    Less indentation makes the code more readable in general. So I prefer:

    0if (A) {
    1    return;
    2}
    3CODE;
    

    over:

    0if (!A) {
    1    CODE;
    2} else {
    3    return;
    4}
    

    Reduced scope is nice, but I find if (auto it{m_nodes.find(id)}; it != m_nodes.end()) { hard to read. Sometimes I confuse it for a for-loop: for (foo; bar; baz) {. Also in this particular case the function ends right after this, so the scope of it is the same.

    Will leave it as it is.

  99. in src/net_processing.cpp:5093 in 266ac32673 outdated
    5087+                    youngest_peer.first = pnode->GetId();
    5088+                    youngest_peer.second = pnode->m_last_block_time;
    5089+                } else {
    5090+                    next_youngest_peer.first = pnode->GetId();
    5091+                    next_youngest_peer.second = pnode->m_last_block_time;
    5092+                }
    


    hodlinator commented at 10:32 am on January 30, 2025:
    Seems like a slight behavior change warranting it’s own commit or at least a mention in the commit message doing the change?

    vasild commented at 10:53 am on February 7, 2025:

    There should be no change in behavior. Note that the original code in master was very subtle - it required that ForEachNode() would iterate the nodes in increasing order of id:

    0std::pair<NodeId, std::chrono::seconds> youngest_peer{-1, 0}, next_youngest_peer{-1, 0};
    1
    2m_connman.ForEachNode([&](CNode* pnode) {
    3    if (!pnode->IsBlockOnlyConn() || pnode->fDisconnect) return;
    4    if (pnode->GetId() > youngest_peer.first) {
    5        next_youngest_peer = youngest_peer;
    6        youngest_peer.first = pnode->GetId();
    7        youngest_peer.second = pnode->m_last_block_time;
    8    }
    9});
    

    The condition pnode->GetId() > youngest_peer.first would be true every time. Because of that, this code is equivalent to:

    0std::pair<NodeId, std::chrono::seconds> youngest_peer{-1, 0}, next_youngest_peer{-1, 0};
    1
    2m_connman.ForEachNode([&](CNode* pnode) {
    3    if (!pnode->IsBlockOnlyConn() || pnode->fDisconnect) return;
    4    next_youngest_peer = youngest_peer;
    5    youngest_peer.first = pnode->GetId();
    6    youngest_peer.second = pnode->m_last_block_time;
    7});
    

    The new code will again select the two youngest peers (the two with the bigger ids), but does not require that the iterating is in increasing order of id:

     0std::pair<NodeId, std::chrono::seconds> youngest_peer{-1, 0}, next_youngest_peer{-1, 0};
     1
     2m_connman.ForEachNode([&](CNode* pnode) {
     3    if (!pnode->IsBlockOnlyConn() || pnode->fDisconnect) return;
     4    if (pnode->GetId() > next_youngest_peer.first) {
     5        if (pnode->GetId() > youngest_peer.first) {
     6            next_youngest_peer = youngest_peer;
     7            youngest_peer.first = pnode->GetId();
     8            youngest_peer.second = pnode->m_last_block_time;
     9        } else {  
    10            next_youngest_peer.first = pnode->GetId();
    11            next_youngest_peer.second = pnode->m_last_block_time;
    12        }
    13    }
    14});
    

    hodlinator commented at 12:19 pm on February 7, 2025:

    Ah, that was subtle, thanks for laying it out so clearly.

    Maybe amend the commit message in “net: index nodes in CConnman by id” with something like:

    Change PeerManagerImpl::EvictExtraOutboundPeers to account for nodes no longer always being in order. The old code would have failed to update next_youngest_peer correctly if CConnman::m_nodes hadn’t always had node ids in ascending order.


    vasild commented at 9:20 am on February 10, 2025:
    Done, will be in next push.
  100. in src/test/net_peer_connection_tests.cpp:158 in 266ac32673 outdated
    155         BOOST_CHECK(connman->AlreadyConnectedPublic(node->addr));
    156     }
    157 
    158     // Clean up
    159-    for (auto node : connman->TestNodes()) {
    160+    for (const auto& [id, node] : connman->TestNodes()) {
    


    hodlinator commented at 10:38 am on January 30, 2025:

    nit:

    0    for (const auto& [_, node] : connman->TestNodes()) {
    

    vasild commented at 10:55 am on February 7, 2025:
    Done.
  101. in src/test/fuzz/connman.cpp:68 in 266ac32673 outdated
    63@@ -64,13 +64,15 @@ FUZZ_TARGET(connman, .init = initialize_connman)
    64     connman.Init(options);
    65 
    66     CNetAddr random_netaddr;
    67-    CNode random_node = ConsumeNode(fuzzed_data_provider);
    68+    NodeId node_id{0};
    69+    CNode& random_node{*ConsumeNodeAsUniquePtr(fuzzed_data_provider, node_id++).release()};
    


    hodlinator commented at 10:43 am on January 30, 2025:
    Why are we now setting the node_id so deterministically?

    vasild commented at 11:00 am on February 7, 2025:
    To avoid overlaps. In master, without this PR, this test could generate nodes with duplicate ids.

    hodlinator commented at 11:07 am on February 10, 2025:
    (Might be worth noting that in commit message for the change that fixes the bug, 60f6cbb9b9f83e25217d30c889147ad517960ec7 / “net: index nodes in CConnman by id”).

    vasild commented at 11:42 am on February 10, 2025:
    Done.
  102. in src/test/util/net.h:60 in 266ac32673 outdated
    57+
    58     void AddTestNode(CNode& node)
    59     {
    60         LOCK(m_nodes_mutex);
    61-        m_nodes.push_back(&node);
    62+        m_nodes.emplace(node.GetId(), &node);
    


    hodlinator commented at 10:48 am on January 30, 2025:

    Might want to assert in case a test assigns ids randomly and the birthday paradox occurs on annoyingly rare test-runs.

    0        auto [_, success] = m_nodes.emplace(node.GetId(), &node);
    1        Assert(success);
    

    vasild commented at 11:05 am on February 7, 2025:
    Done.
  103. in src/net.cpp:2089 in 7aca044a93 outdated
    2221@@ -2191,6 +2222,7 @@ void CConnman::SocketHandlerListening(const Sock::EventsPerSock& events_per_sock
    2222 
    2223 void CConnman::ThreadSocketHandler()
    2224 {
    2225+    AssertLockNotHeld(m_nodes_mutex);
    


    hodlinator commented at 11:00 am on January 30, 2025:
    7aca044a932e163445f2fccec7739aa7768cf36e: Why assert this at the beginning of the thread, is it to help linters/sanitizers?

    vasild commented at 11:13 am on February 7, 2025:

    It is for every function, not just “thread” ones - doc/developer-notes.md says:

    • Combine annotations in function declarations with run-time asserts in function definitions …

    This means:

    0void f() EXCLUSIVE_LOCKS_REQUIRED(!m);
    1
    2...
    3
    4void f()
    5{
    6    AssertLockNotHeld(m);
    7    ...
    8}
    

    The reasoning is that the annotation EXCLUSIVE_LOCKS_REQUIRED can be ignored. It is nice that the check is done at compile time, but if the compiler is not clang then it will be ignored, or if the compiler is clang but is running without -Werror, then it will only result in a compile warning which is likely to be missed in the build output. In other words, it is possible to create a violating code and compile and run it with both gcc and clang. In that case AssertLockNotHeld() will stop the program at runtime.


    hodlinator commented at 11:23 am on February 7, 2025:
    Ah, so it’s done as a general rule even if it makes less logical sense for the function at the root of a thread (as nothing can have taken the lock before it unless the thread is wrapped in some kind of utility that is weird enough to take locks).
  104. in src/net.cpp:1605 in 266ac32673 outdated
    1599+CNode* CConnman::GetNodeById(NodeId node_id) const
    1600+{
    1601+    LOCK(m_nodes_mutex);
    1602+    auto it{m_nodes.find(node_id)};
    1603+    if (it != m_nodes.end()) {
    1604+        return it->second;
    


    hodlinator commented at 11:07 am on January 30, 2025:

    An issue here is that we return a CNode*, unlock m_nodes_mutex, then another thread comes along and deletes the node - leaving the caller with a dangling pointer.

    Maybe there are implementation details that make that impossible right now, but it would feel safer to instead have this function require the caller locks m_nodes_mutex, or do an AddRef() internally and require caller to do Release(). (Switching to shared_ptr instead of custom ref-counting might be more straight forward but maybe too disruptive).


    vasild commented at 1:12 pm on February 7, 2025:

    Maybe it is worth doing this: #30988 (comment). It is a super good change and normally I would do it, but the reason I didn’t is that it will increase the size of this PR which, I am afraid, would turn reviewers away.

    I will proceed to other suggestions and give this some thought…


    vasild commented at 5:44 am on February 18, 2025:

    After thinking about this a bit more and doing some changes to the code, I decided to:

    1. Open a separate PR to remove the manual ref counting on CNode and replace it with shared_ptr and change CConnman::m_nodes from vector<CNode*> to unordered_set<shared_ptr<CNode>>. That PR would be independent from this one.

    2. In this PR, change the communication between SockMan and CConman to be based on pointer to CNode instead of node id. Similarly to #30988 (comment). I will start with a raw pointer and the responsibility of destruction will be in CConnman, like it is now. If 1. is merged in the mean time before this PR, then I will just change this PR to use shared_ptr instead of a raw pointer.

  105. vasild force-pushed on Jan 30, 2025
  106. vasild commented at 2:02 pm on January 30, 2025: contributor
    266ac32673...7866c736c8: rebase due to conflicts
  107. DrahtBot removed the label Needs rebase on Jan 30, 2025
  108. in src/common/sockman.h:204 in 7866c736c8 outdated
    199+     * Called when new data has been received.
    200+     * @param[in] node_id Node for which the data arrived.
    201+     * @param[in] data Data buffer.
    202+     * @param[in] n Number of bytes in `data`.
    203+     */
    204+    virtual void EventGotData(NodeId node_id, const uint8_t* data, size_t n) = 0;
    


    hodlinator commented at 3:19 pm on January 30, 2025:

    nit: Could use span, and maybe even byte?

    0    virtual void EventGotData(NodeId node_id, const std::span<std::byte> data) = 0;
    

    vasild commented at 1:24 pm on February 7, 2025:
    Done, but used uint8_t because ReceiveMsgBytes() (in master) takes a span of that and conversion is not possible.

    maflcko commented at 1:30 pm on February 7, 2025:

    conversion is not possible.

    It should be possible to convert std::byte* to uint8_t* (and vice-versa). The two are the almost the same anyway (https://en.cppreference.com/w/cpp/types/byte). And a span is just a pointer+size, so conversion between the two span types should also be possible.


    vasild commented at 9:35 am on February 10, 2025:

    By default trying to pass std::span<std::byte> to a function that takes std::span<uint8_t> gives:

    0no known conversion from 'span<std::byte>' to 'span<uint8_t>'
    

    If I really insist then I can convert it like:

    0void f(std::span<uint8_t> s);
    1...
    2std::span<std::byte> a;
    3f(std::span<uint8_t>{reinterpret_cast<uint8_t*>(a.data()), a.size()});
    

    I find it cleaner to use uint8_t because it is already used in the existent code (ReceiveMsgBytes()) and avoid such forced coversion.


    hodlinator commented at 10:20 am on February 10, 2025:

    Certainly won’t insist, but another possibility is to use bit_cast for trivially constructible types:

    0void f(std::span<uint8_t> s);
    1...
    2std::span<std::byte> a;
    3f(std::bit_cast<std::span<uint8_t>>(a));
    

    maflcko commented at 10:40 am on February 10, 2025:

    bit_cast

    While this will likely work in practise, I don’t think there is any inherent guarantee that the layout of span is identical for all underlying types. So my recommendation would be to just use std::as_bytes instead.

    (Same for the reinterpret_cast: While it works, it is a bit verbose and std::as_bytes is the existing alias in the std lib, which is already used in the code today.)

    Obviously, anything is fine. I just left a comment to say it is possible to convert :sweat_smile:


    vasild commented at 10:50 am on February 10, 2025:
    Will leave it as uint8_t because it is already used in master in ReceiveMsgBytes() to which we have to pass that variable.
  109. in src/net.h:1320 in 248ec2d268 outdated
    1311@@ -1312,6 +1312,12 @@ class CConnman : private SockMan
    1312     virtual bool ShouldTryToRecv(NodeId node_id) const override
    1313         EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex);
    1314 
    1315+    virtual void EventIOLoopCompletedForNode(NodeId node_id) override
    1316+        EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex);
    1317+
    1318+    virtual void EventIOLoopCompletedForAllPeers() override
    1319+        EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex, !m_reconnections_mutex);
    1320+ 
    


    Sjors commented at 3:25 pm on January 30, 2025:

    248ec2d2687fae47b63688e00b9ef18d4c0c9676 nit: when I run the linter (along with all other tests) on this commit it complains about trailing whitespace.

     0diff --git a/src/net.cpp b/src/net.cpp
     1index 8d0dd84d91..0228ddce57 100644
     2--- a/src/net.cpp
     3+++ b/src/net.cpp
     4@@ -2068,7 +2068,7 @@ void CConnman::EventIOLoopCompletedForAllPeers()
     5     DisconnectNodes();
     6     NotifyNumConnectionsChanged();
     7 }
     8- 
     9+
    10 Sock::EventsPerSock CConnman::GenerateWaitSockets(Span<CNode* const> nodes)
    11 {
    12     AssertLockNotHeld(m_nodes_mutex);
    13diff --git a/src/net.h b/src/net.h
    14index 8778bcb88a..6c2c202fbb 100644
    15--- a/src/net.h
    16+++ b/src/net.h
    17@@ -1317,7 +1317,7 @@ private:
    18 
    19     virtual void EventIOLoopCompletedForAllPeers() override
    20         EXCLUSIVE_LOCKS_REQUIRED(!m_nodes_mutex, !m_reconnections_mutex);
    21- 
    22+
    23     /**
    24      * Generate a collection of sockets to check for IO readiness.
    25      * [@param](/bitcoin-bitcoin/contributor/param/)[in] nodes Select from these nodes' sockets.
    

    vasild commented at 2:30 pm on February 6, 2025:
    Fixed.
  110. in src/common/sockman.cpp:156 in 7866c736c8 outdated
    147+    std::unique_ptr<i2p::sam::Session> i2p_transient_session;
    148+
    149+    Assume(!me.IsValid());
    150+
    151+    if (std::holds_alternative<CService>(to)) {
    152+        const CService& addr_to{std::get<CService>(to)};
    


    hodlinator commented at 3:42 pm on January 30, 2025:

    nit:

    0    if (auto addr_to{std::get_if<CService>(&to)}) {
    

    vasild commented at 1:37 pm on February 7, 2025:
    That would make addr_to a pointer, so *addr_to would have to be used in the 6 places below. I slightly prefer the current “if holds alternative CService” because it is dump and easy to follow. Leaving it as it is.
  111. in src/test/fuzz/net.cpp:46 in f7dd4373b0 outdated
    41@@ -42,9 +42,6 @@ FUZZ_TARGET(net, .init = initialize_net)
    42     LIMITED_WHILE(fuzzed_data_provider.ConsumeBool(), 10000) {
    43         CallOneOf(
    44             fuzzed_data_provider,
    45-            [&] {
    46-                node.CloseSocketDisconnect();
    


    Sjors commented at 9:01 am on January 31, 2025:
    f7dd4373b06daf34033fa84fa99408a3938e4ad4: would it make sense to call MarkAsDisconnectAndCloseConnection in the connman fuzzer?

    vasild commented at 2:56 pm on February 6, 2025:
    Yes, done.
  112. in src/net.cpp:1705 in 0be6883603 outdated
    1703+                                          std::unique_ptr<Sock>&& sock,
    1704+                                          const CService& addr_bind_,
    1705+                                          const CService& addr_)
    1706 {
    1707+    const CService addr_bind{MaybeFlipIPv6toCJDNS(addr_bind_)};
    1708+    const CService addr{MaybeFlipIPv6toCJDNS(addr_)};
    


    hodlinator commented at 11:18 am on January 31, 2025:

    In commit 0be688360318511ecebcfc9cbabacfa6960fa5ef: The new code has the ThreadI2PAccept() -> NewSockAccepted() -> EventNewConnectionAccepted() which now contains:

    0    const CService addr_bind{MaybeFlipIPv6toCJDNS(addr_bind_)};
    1    const CService addr{MaybeFlipIPv6toCJDNS(addr_)};
    

    Is it okay to combine I2P with CJDNS, is this a bugfix?


    Sjors commented at 1:22 pm on February 3, 2025:

    MaybeFlipIPv6toCJDNS only does something if addr_(bind_) IsIPv6(), which checks m_net == NET_IPV6.

    This is the default for a CNetAddr, however deeply buried in ThreadI2PAccept - > i2p::sam::Session::Accept -> DestB64ToAddr -> DestBinToAddr -> CNetAddr::SetSpecial -> SetI2P do we find m_net = NET_I2P. So the flip won’t interfere with an I2P connection (in the case its prefix coincidentally matches the IPv6 CJDNS prefix).

    But maybe you should add a Assume(IsI2P(conn.peer));

    (in 0be688360318511ecebcfc9cbabacfa6960fa5ef “net: tweak EventNewConnectionAccepted()”)


    vasild commented at 1:42 pm on February 7, 2025:

    Yeah, EventNewConnectionAccepted() is generic used also for non-I2P connections and the flip will not do anything on I2P addresses.

    But maybe you should add a Assume(IsI2P(conn.peer));

    In which function?


    Sjors commented at 1:51 pm on February 7, 2025:
    I think in ThreadI2PAccept() right before it calls NewSockAccepted()

    vasild commented at 10:16 am on February 13, 2025:
    Added Assume(IsI2P(conn.peer)) and Assume(IsI2P(conn.me)) in ThreadI2PAccept().
  113. in src/common/sockman.h:308 in f7dd4373b0 outdated
    301@@ -267,17 +302,107 @@ class SockMan
    302      */
    303     virtual void EventI2PListen(const CService& addr, bool success);
    304 
    305+    /**
    306+     * The sockets used by a connected node - a data socket and an optional I2P session.
    307+     */
    308+    struct NodeSockets {
    


    Sjors commented at 11:49 am on January 31, 2025:

    f7dd4373b06daf34033fa84fa99408a3938e4ad4: why plural and not NodeSocket?

    Does the I2P session sess also contain a socket? Or does it use s?


    vasild commented at 3:00 pm on February 6, 2025:
  114. in src/common/sockman.h:350 in f7dd4373b0 outdated
    345+    /**
    346+     * Info about which socket has which event ready and its node id.
    347+     */
    348+    struct IOReadiness {
    349+        Sock::EventsPerSock events_per_sock;
    350+        std::unordered_map<Sock::EventsPerSock::key_type, NodeId> node_ids_per_sock;
    


    Sjors commented at 1:14 pm on January 31, 2025:
    f7dd4373b06daf34033fa84fa99408a3938e4ad4: it would be useful to document these two values.

    vasild commented at 3:08 pm on February 6, 2025:
    Done.
  115. in src/common/sockman.cpp:479 in f7dd4373b0 outdated
    474+
    475+            const ssize_t nrecv{WITH_LOCK(
    476+                node_sockets->mutex,
    477+                return node_sockets->sock->Recv(buf, sizeof(buf), MSG_DONTWAIT);)};
    478+
    479+            switch (nrecv) {
    


    Sjors commented at 2:21 pm on January 31, 2025:
    f7dd4373b06daf34033fa84fa99408a3938e4ad4: maybe add a comment here that both -1 and 0 still warrant (require?) reaching EventIOLoopCompletedForNode, so we can’t just have individual if ... continue guards for them.

    vasild commented at 3:13 pm on February 6, 2025:
    Done.
  116. Sjors commented at 2:24 pm on January 31, 2025: member

    Reviewing in reverse order.

    f7dd4373b06daf34033fa84fa99408a3938e4ad4 is a beast, but looks correct. I’m not sure if it can be split further. There’s some mixing of code modernisation with the move which make the diff a bit trickier to follow.

  117. in src/net.h:689 in 7866c736c8 outdated
    682-     * `shared_ptr` (instead of `unique_ptr`) is used to avoid premature close of
    683-     * the underlying file descriptor by one thread while another thread is
    684-     * poll(2)-ing it for activity.
    685-     * @see https://github.com/bitcoin/bitcoin/issues/21744 for details.
    686-     */
    687-    std::shared_ptr<Sock> m_sock GUARDED_BY(m_sock_mutex);
    


    hodlinator commented at 8:14 pm on January 31, 2025:
    Forgot to remove CNode::m_sock_mutex.

    vasild commented at 1:45 pm on February 7, 2025:
    Removed!
  118. in src/net.cpp:1563 in 7866c736c8 outdated
    1575+
    1576+        std::string errmsg;
    1577+
    1578+        const ssize_t sent{SendBytes(node.GetId(), data, more, errmsg)};
    1579+
    1580+        if (sent > 0) {
    


    hodlinator commented at 8:18 pm on January 31, 2025:

    nit: Much whitespace

    0        std::string errmsg;
    1        const ssize_t sent{SendBytes(node.GetId(), data, more, errmsg)};
    2        if (sent > 0) {
    

    vasild commented at 1:47 pm on February 7, 2025:
    Done.
  119. in src/net.cpp:1662 in 7866c736c8 outdated
    1682-    } else {
    1683-        addr = CAddress{MaybeFlipIPv6toCJDNS(addr), NODE_NONE};
    1684-    }
    1685+bool CConnman::EventNewConnectionAccepted(NodeId node_id,
    1686+                                          const CService& addr_bind_,
    1687+                                          const CService& addr_)
    


    hodlinator commented at 2:57 pm on February 1, 2025:
    nit: Don’t you want to rename these arguments to the new interface names from SockMan and net.h?

    vasild commented at 1:56 pm on February 7, 2025:
    Done.
  120. in src/net.cpp:1580 in 7866c736c8 outdated
    1604-                    LogDebug(BCLog::NET, "socket send error, %s: %s\n", node.DisconnectMsg(fLogIPs), NetworkErrorString(nErr));
    1605-                    node.CloseSocketDisconnect();
    1606-                }
    1607+            if (sent < 0) {
    1608+                LogDebug(BCLog::NET, "socket send error, %s: %s\n", node.DisconnectMsg(fLogIPs), errmsg);
    1609+                MarkAsDisconnectAndCloseConnection(node);
    


    hodlinator commented at 3:46 pm on February 1, 2025:

    Before looking at this PR, I was playing around with the idea of changing the raw access of CNode::fDisconnect everywhere into two functions, QueueDisconnect and DisconnectNow. One would just set the flag and the other would also release the socket reference. My idea was to have both of them require a reason-argument and an optional BCLog::Level-argument. They would log internally if fDisconnect was not already set, instead of having every call site log. MarkAsDisconnectAndCloseConnection, seems like an opportunity to do such a thing.

    WIP disconnecting++ branch

    Totally fine with considering it out of scope for this PR.

    0                MarkAsDisconnectAndCloseConnection(node, "socket send error: %s", err_msg);
    

    vasild commented at 2:08 pm on February 7, 2025:

    Hmm, is not straight forward. From all callers of MarkAsDisconnectAndCloseConnection():

    • 3 unconditional LogDebug(BCLog::NET
    • 1 no log
    • 2 conditional LogDebug(BCLog::NET based on fDisconnect

    So it is not like all the callers do the same thing. Maybe it is ok to consolidate those into one behavior, but would be a functional change that is not the purpose of this PR. Leaving it as it is, but sounds like something to explore as a followup.

  121. in src/common/sockman.cpp:492 in 7866c736c8 outdated
    487+            case 0:
    488+                EventGotEOF(node_id);
    489+                break;
    490+            default:
    491+                EventGotData(node_id, buf, nrecv);
    492+                break;
    


    hodlinator commented at 8:18 pm on February 1, 2025:

    Worth double-checking in case it doesn’t hold for all third-party recv()-implementations?

    0            default:
    1                Assert(nrecv > 0); // Should have caught all negative values above.
    2                EventGotData(node_id, buf, nrecv);
    3                break;
    

    vasild commented at 2:17 pm on February 7, 2025:
    Hmm, if we are worried that recv(2) will return a value less than -1, then we would better treat that as an error instead of terminating the whole program. Changed the switch to if/else, treating all negative values as an error.
  122. hodlinator changes_requested
  123. hodlinator commented at 8:37 pm on February 1, 2025: contributor

    Concept ACK 46131d44faeb797909e0fc3a2042492adef9aa0d

    (Still plan to check some in-depth aspects).

    Nice to separate out some concerns!

    Relieved to see sockets actually moved down into Sockman in next-to-last commit. :)

    WinSock

    Would be nice to have socket/WinSock code in fewer files. Current status:

    0₿ git grep -c -E "\bWSA"
    1src/common/pcp.cpp:10
    2src/common/sockman.cpp:13
    3src/common/system.cpp:2
    4src/compat/compat.h:12
    5src/net.cpp:1
    6src/netbase.cpp:11
    7src/util/sock.cpp:5
    

    Could at least remove the remaining call in net.cpp by moving CNetCleanup to sockman.cpp?

    Terminology

    nit: I prefer “local”/“remote” or “self”/“other” over “me”/“them”. i2p::Connection uses “me”/“peer” though.

    7866c736c87a908cef75dc6901c9c0594c65b0eb / net: move-only: improve encapsulation of SockMan

    Commit message typo: m_listen private

    645f83354c0304fd339a45d9439084a44b7ca415 / net: reduce CAddress usage to CService or CNetAddr

    Are we going to change file/network serialization by switching CAddress -> CService? Might try to answer this in later review.

  124. in src/net.cpp:442 in 91d97a198f outdated
    438@@ -454,52 +439,29 @@ CNode* CConnman::ConnectNode(CAddress addrConnect, const char *pszDest, bool fCo
    439     // Connect
    440     std::unique_ptr<Sock> sock;
    441     Proxy proxy;
    442-    CService addr_bind;
    443-    assert(!addr_bind.IsValid());
    444+    assert(!proxy.IsValid());
    


    Sjors commented at 1:41 pm on February 3, 2025:
    In 91d97a198fb57d47870ce094454c3776bc737ba0 “net: split CConnman::ConnectNode()”: why this new assert? If you need it all, maybe make it an Assume?

    vasild commented at 2:49 pm on February 7, 2025:
    Because later on in ConnectAndMakeId() there is logic that will be confused if the default constructed proxy is valid. But it is better to use a std::optional to designate “no proxy” instead of default-contructed-and-invalid proxy. Then this assert/Assume is not necessary. Changed.
  125. in src/common/sockman.h:100 in 91d97a198f outdated
     95+    /**
     96+     * Make an outbound connection, save the socket internally and return a newly generated node id.
     97+     * @param[in] to The address to connect to, either as CService or a host as string and port as
     98+     * an integer, if the later is used, then `proxy` must be valid.
     99+     * @param[in] is_important If true, then log failures with higher severity.
    100+     * @param[in] proxy Proxy to connect through if `proxy.IsValid()` is true.
    


    Sjors commented at 1:49 pm on February 3, 2025:

    In 91d97a198fb57d47870ce094454c3776bc737ba0 “net: split CConnman::ConnectNode()”: looking at the body of ConnectAndMakeNodeId the code paths with and without proxy are vastly different. They also have separate call sites.

    So instead of using an std::variant, it seems more clear to just have both ConnectAndMakeNodeId and ProxyConnectAndMakeNodeId.


    vasild commented at 2:54 pm on February 7, 2025:

    But there is still some common code that would have to be duplicated:

     0{
     1    AssertLockNotHeld(m_connected_mutex);
     2    AssertLockNotHeld(m_unused_i2p_sessions_mutex);
     3
     4    std::unique_ptr<Sock> sock;
     5    std::unique_ptr<i2p::sam::Session> i2p_transient_session;
     6
     7    Assume(!me.IsValid());
     8
     9    if (std::holds_alternative<CService>(to)) {
    10        ...
    11    } else {
    12        ...
    13    }
    14
    15    if (!sock) {
    16        return std::nullopt;
    17    }
    18
    19    if (!me.IsValid()) {
    20        me = GetBindAddress(*sock);
    21    }
    22
    23    const Id id{GetNewId()};
    24
    25    {
    26        LOCK(m_connected_mutex);
    27        m_connected.emplace(id, std::make_shared<ConnectionSockets>(std::move(sock),
    28                                                                    std::move(i2p_transient_session)));
    29    }
    30
    31    return id;
    32}
    

    Sjors commented at 3:02 pm on February 7, 2025:
    Everything starting at if (!sock) could go into a helper function? FinishConnectStuff()

    vasild commented at 11:25 am on February 10, 2025:

    I am considering this, but just realized that the distinction is not “proxy vs no-proxy” because in both cases of the std::variant we may end up connecting via the proxy. The logic (as convoluted as it is, it is the same in master) is:

    • if connecting to CService
      • if to an I2P CService then proxy must be used
      • otherwise the proxy is optional, if provided it will be used
    • if connecting to string host, then proxy must be used

    So, if std::variant is to be avoided and two functions provided instead of one, then they should be called something like ConnectToCService() and ConnectToString() :-|

    Edit: or a bunch of functions covering each case:

    0ConnectToI2P(CService, proxy); // proxy must be used
    1ConnectDirectly(CService); // no proxy
    2ConnectViaProxy(CService, proxy);
    3ConnectViaProxy(string, proxy); // overload
    
  126. DrahtBot added the label Needs rebase on Feb 5, 2025
  127. Sjors commented at 4:26 pm on February 5, 2025: member
    Merge conflict is probably from #25832.
  128. in src/common/sockman.cpp:17 in 7866c736c8 outdated
    12+
    13+#include <cassert>
    14+
    15+// The set of sockets cannot be modified while waiting
    16+// The sleep time needs to be small to avoid new sockets stalling
    17+static constexpr auto SELECT_TIMEOUT{50ms};
    


    pinheadmz commented at 2:58 pm on February 6, 2025:

    I think libevent might actually use 45 for the poll() loop:

    https://github.com/libevent/libevent/blob/112421c8fa4840acd73502f2ab6a674fc025de37/http-internal.h#L17-L20

    What’s the best way to determine this constant? Sockman-based HTTP certainly gets through the functional test suite a lot faster when this value is reduced (I tried 10 here)


    vasild commented at 3:41 pm on February 7, 2025:

    I just copied this from master:

    0static const uint64_t SELECT_TIMEOUT_MILLISECONDS = 50;
    

    I guess some empirical testing… This constant is used in two places: in WaitMany() and in the sleep if no sockets at all, maybe those two should be separate constants (btw the same is in master).

  129. in src/common/sockman.cpp:351 in 7866c736c8 outdated
    339+        if (io_readiness.events_per_sock.empty() ||
    340+            // WaitMany() may as well be a static method, the context of the first Sock in the vector is not relevant.
    341+            !io_readiness.events_per_sock.begin()->first->WaitMany(SELECT_TIMEOUT,
    342+                                                                   io_readiness.events_per_sock)) {
    343+            interruptNet.sleep_for(SELECT_TIMEOUT);
    344+        }
    


    pinheadmz commented at 5:07 pm on February 6, 2025:
    if WaitMany() returns false, we would have already waited the duration of SELECT_TIMEOUT – do we need to sleep_for() again?

    pinheadmz commented at 7:19 pm on February 6, 2025:

    For example I tried this. I expected a performance boost but didn’t really observe one:

    0
    1        auto io_readiness{GenerateWaitSockets()};
    2        if (io_readiness.events_per_sock.empty()) {
    3            interruptNet.sleep_for(SELECT_TIMEOUT);
    4        } else {
    5            // WaitMany() may as well be a static method, the context of the first Sock in the vector is not relevant.
    6            io_readiness.events_per_sock.begin()->first->WaitMany(SELECT_TIMEOUT,
    7                                                                   io_readiness.events_per_sock);
    8        }
    

    vasild commented at 4:06 pm on February 7, 2025:

    if WaitMany() returns false, we would have already waited

    No, false means an error, e.g. poll(2) returning -1. A timeout is signaled by a return value of true and all what[].occurred returned as 0.

  130. vasild force-pushed on Feb 7, 2025
  131. vasild commented at 7:26 pm on February 7, 2025: contributor

    7866c736c8...7d84f431f9: rebase due to conflicts and address suggestions

    Interface changes:

    SockMan::CloseSockets() renamed to SockMan::StopListening()

    SockMan::GetNewNodeId() renamed to SockMan::GetNewId()

    NodeId renamed to SockMan::Id

    SockMan::EventI2PListen() renamed to SockMan::EventI2PStatus() and instead of bool it now takes an enum I2PStatus argument

    SockMan::EventIOLoopCompletedForNode() renamed to SockMan::EventIOLoopCompletedForOne()

    SockMan::EventIOLoopCompletedForAllPeers() renamed to SockMan::EventIOLoopCompletedForAll()

    EventGotData() now takes std::span instead of pointer+length

    SockMan::ConnectAndMakeNodeId() renamed to SockMan::ConnectAndMakeId() and takes std::optional<Proxy> instead of Proxy to more clearly denote the “no proxy given” case (empty optional, was an invalid proxy before).

    private:

    SockMan::NodeSockets renamed to SockMan::ConnectionSockets

    SockMan::GetNodeSockets() renamed to SockMan::GetConnectionSockets()

  132. vasild commented at 9:12 am on February 10, 2025: contributor

    It is better to post such suggestions not in the main thread of the PR but as a comment to some line of code, even if that is a random, unrelated line. That way replies will be grouped together with the questions instead of being scattered around in the main PR thread. In the main thread it is easier to forget to reply to some questions. As comments to some line of code, they can be tracked and eventually “resolved” to collapse them and reduce the main PR thread noise.

    WinSock

    Would be nice to have socket/WinSock code in fewer files. Current status: … Could at least remove the remaining call in net.cpp by moving CNetCleanup to sockman.cpp?

    I agree, but that is out of the scope of this PR, even CNetCleanup is untouched by this PR, so I will leave it as it is.

    Terminology

    nit: I prefer “local”/“remote” or “self”/“other” over “me”/“them”. i2p::Connection uses “me”/“peer” though.

    No strong opinion.

    7866c73 / net: move-only: improve encapsulation of SockMan

    Commit message typo: m_listen private

    Fixed.

    645f833 / net: reduce CAddress usage to CService or CNetAddr

    Are we going to change file/network serialization by switching CAddress -> CService? Might try to answer this in later review.

    No.

  133. vasild force-pushed on Feb 10, 2025
  134. vasild commented at 11:59 am on February 10, 2025: contributor

    7d84f431f9...7b63b4ca1c: minor repush to elaborate a commit message as suggested.

    Are you in the mood of reviewing a change to this PR, #30988 (comment), that stores the “client” objects in SockMan (as shared_ptr, not as raw pointers)? In the case of CConnman that is CNode. This will:

    • make the interaction between SockMan and CConnman simpler
    • probably faster (less lookups by id)
    • resolve the issue in #30988 (review)
    • resolve the issue which #28222 tried to resolve
    • will get rid of the manual CNode reference counting (CNode::AddRef()).

    Or I could find a smaller workaround to #30988 (review).

  135. hodlinator commented at 12:47 pm on February 10, 2025: contributor

    Would really like to get rid of the custom ref-counting of CNode and avoid id-lookups. But fear it will grow this PR a bit much.

    Why not remove the custom ref-counting as a separate PR to begin with? Things like #28222 (review) make me think it may not be trivial, but maybe you have a more elegant approach.

    I personally commit to review regardless of whether you split it out to it’s own PR.

  136. vasild force-pushed on Feb 13, 2025
  137. vasild commented at 10:42 am on February 13, 2025: contributor
    7b63b4ca1c...c4ab7f82d6: rebase due to conflicts and address a suggestion
  138. vasild commented at 10:52 am on February 13, 2025: contributor
    Extracted the first commit into #31854. It is not strictly related to this PR and makes sense on its own. If merged will reduce the size of this PR.
  139. DrahtBot removed the label Needs rebase on Feb 13, 2025
  140. Sjors commented at 4:09 pm on February 13, 2025: member

    Can you update the PR description to say that this is based on #31854? The convention then is to mark this PR draft until that one is merged.

    Additionally, it’s 25% easier to review #31854 if you rebase this PR on the exact same commit. Since then I can just range-diff it with what I already reviewed here.

  141. vasild force-pushed on Feb 14, 2025
  142. vasild commented at 8:54 am on February 14, 2025: contributor

    c4ab7f82d6...563afdd975: pick change from #31854 into the first commit of this PR

    Can you update the PR description to say that this is based on #31854? The convention then is to mark this PR draft until that one is merged.

    Added a note at the bottom of the description of this PR. This PR was not a draft before #31854. The creation of #31854 did not change anything about this PR, so I think the existence of a chop off PR shouldn’t render this one as a draft. There is no dependency between the two - either one can be merged first.

    Additionally, it’s 25% easier to review #31854 if you rebase this PR on the exact same commit. Since then I can just range-diff it with what I already reviewed here.

    Done.

  143. in src/common/sockman.h:339 in 563afdd975 outdated
    334+        /**
    335+         * Map of socket -> connection id (in `m_connected`). For example
    336+         * socket1 -> id=23
    337+         * socket2 -> id=56
    338+         */
    339+        std::unordered_map<Sock::EventsPerSock::key_type, SockMan::Id> ids_per_sock;
    


    pinheadmz commented at 11:12 am on February 14, 2025:

    Since key_type here is std::shared_ptr<const Sock> shouldn’t this map also use the optimized hash/equal functions defined in:

    https://github.com/bitcoin/bitcoin/blob/86528937e5c4da2e12c46085fc41e87ed759258e/src/util/sock.h#L208


    vasild commented at 12:18 pm on February 14, 2025:
    Yes, done!
  144. vasild force-pushed on Feb 14, 2025
  145. vasild commented at 12:18 pm on February 14, 2025: contributor
    563afdd975...e1671ff42c: do #30988 (review)
  146. net: separate the listening socket from the permissions
    They were coupled in `struct ListenSocket`, but the socket belongs to
    the lower level transport protocol, whereas the permissions are specific
    to the higher Bitcoin P2P protocol.
    7745ea523c
  147. net: drop CConnman::ListenSocket
    Now that `CConnman::ListenSocket` is a `struct` that contains only one
    member variable of type `std::shared_ptr<Sock>`, drop `ListenSocket` and
    use `shared_ptr` directly.
    
    Replace the vector of `ListenSocket` with a vector of `shared_ptr`.
    69ac6802ba
  148. net: split CConnman::BindListenPort() off CConnman
    Introduce a new low-level socket managing class `SockMan`
    and move the `CConnman::BindListenPort()` method to it.
    98ba5c7965
  149. style: modernize the style of SockMan::BindListenPort()
    It was copied verbatim from `CConnman::BindListenPort()` in the previous
    commit. Modernize its variables and style and log the error messages
    from the caller. Also categorize the informative messages to the "net"
    category because they are quite specific to the networking layer.
    9d4e7e3bd7
  150. net: split CConnman::AcceptConnection() off CConnman
    Move the `CConnman::AcceptConnection()` method to `SockMan` and split
    parts of it:
    * the flip-to-CJDNS part: to just after the `AcceptConnection()` call
    * the permissions part: at the start of `CreateNodeFromAcceptedSocket()`
    c31fc1a993
  151. style: modernize the style of SockMan::AcceptConnection() 221c9224b2
  152. net: move the generation of ids for new nodes from CConnman to SockMan
    Move `CConnman::GetNewNodeId()` to `SockMan::GetNewId()`. Avoid using
    the word "node" because that is too specific for `CConnman`.
    25203720a1
  153. net: move CConnman-specific parts away from ThreadI2PAcceptIncoming()
    CConnman-specific or in other words, Bitcoin P2P specific. Now
    the `ThreadI2PAcceptIncoming()` method is protocol agnostic and
    can be moved to `SockMan`.
    a367d556fd
  154. net: move I2P-accept-incoming code from CConnman to SockMan be1d7418c1
  155. net: index nodes in CConnman by id
    Change `CConnman::m_nodes` from `std::vector<CNode*>` to
    `std::unordered_map<NodeId, CNode*>` because interaction
    between `CConnman` and `SockMan` is going to be based on
    `NodeId` and finding a node by its id would better be fast.
    
    Change `PeerManagerImpl::EvictExtraOutboundPeers()` to account for nodes
    no longer always being in order of id. The old code would have failed to
    update `next_youngest_peer` correctly if `CConnman::m_nodes` hadn't
    always had nodes in ascending order of id.
    
    During fuzzing make sure that we don't generate duplicate `CNode` ids.
    The easiest way to do that is to use sequential ids.
    
    As a nice side effect the existent search-by-id operations in
    `CConnman::AttemptToEvictConnection()`,
    `CConnman::DisconnectNode()` and
    `CConnman::ForNode()` now become `O(1)` (were `O(number of nodes)`),
    as well as the erase in `CConnman::DisconnectNodes()`.
    93df2db96d
  156. net: isolate P2P specifics from GenerateWaitSockets()
    Move the parts of `CConnman::GenerateWaitSockets()` that are specific to
    the Bitcoin-P2P protocol to dedicated methods:
    `ShouldTryToSend()` and `ShouldTryToRecv()`.
    
    This brings us one step closer to moving `GenerateWaitSockets()` to the
    protocol agnostic `SockMan` (which would call `ShouldTry...()` from
    `CConnman`).
    1cea2e4b84
  157. net: isolate P2P specifics from SocketHandlerConnected() and ThreadSocketHandler()
    Move some parts of `CConnman::SocketHandlerConnected()` and
    `CConnman::ThreadSocketHandler()` that are specific to the Bitcoin-P2P
    protocol to dedicated methods:
    `EventIOLoopCompletedForOne(id)` and
    `EventIOLoopCompletedForAll()`.
    
    This brings us one step closer to moving `SocketHandlerConnected()` and
    `ThreadSocketHandler()` to the protocol agnostic `SockMan` (which would
    call `EventIOLoopCompleted...()` from `CConnman`).
    f0b40d1c06
  158. net: isolate all remaining P2P specifics from SocketHandlerConnected()
    Introduce 4 new methods for the interaction between `CConnman` and
    `SockMan`:
    
    * `EventReadyToSend()`:
      called when there is readiness to send and do the actual sending of data.
    
    * `EventGotData()`, `EventGotEOF()`, `EventGotPermanentReadError()`:
      called when the corresponing recv events occur.
    
    These methods contain logic that is specific to the Bitcoin-P2P protocol
    and move it away from `CConnman::SocketHandlerConnected()` which will
    become a protocol agnostic method of `SockMan`.
    
    Also, move the counting of sent bytes to `CConnman::SocketSendData()` -
    both callers of that method called `RecordBytesSent()` just after the
    call, so move it from the callers to inside
    `CConnman::SocketSendData()`.
    ab9de0f226
  159. net: split CConnman::ConnectNode()
    Move the protocol agnostic parts of `CConnman::ConnectNode()` into
    `SockMan::ConnectAndMakeId()` and leave the Bitcoin-P2P specific
    stuff in `CConnman::ConnectNode()`.
    
    Move the protocol agnostic `CConnman::m_unused_i2p_sessions`, its mutex
    and `MAX_UNUSED_I2P_SESSIONS_SIZE` to `SockMan`.
    
    Move `GetBindAddress()` from `net.cpp` to `sockman.cpp`.
    455185b665
  160. net: tweak EventNewConnectionAccepted()
    Move `MaybeFlipIPv6toCJDNS()`, which is Bitcoin P2P specific from the
    callers of `CConnman::EventNewConnectionAccepted()` to inside that
    method.
    
    Move the IsSelectable check, the `TCP_NODELAY` option set and the
    generation of new connection id out of
    `CConnman::EventNewConnectionAccepted()` because those are protocol
    agnostic. Move those to a new method `SockMan::NewSockAccepted()` which
    is called instead of `CConnman::EventNewConnectionAccepted()`.
    52106d0136
  161. net: move sockets from CNode to SockMan
    Move `CNode::m_sock` and `CNode::m_i2p_sam_session` to `SockMan::m_connected`.
    Also move all the code that handles sockets to `SockMan`.
    
    `CNode::CloseSocketDisconnect()` becomes
    `CConnman::MarkAsDisconnectAndCloseConnection()`.
    
    `CConnman::SocketSendData()` is renamed to
    `CConnman::SendMessagesAsBytes()` and its sockets-touching bits are moved to
    `SockMan::SendBytes()`.
    
    `CConnman::GenerateWaitSockets()` goes to
    `SockMan::GenerateWaitSockets()`.
    
    `CConnman::ThreadSocketHandler()` and
    `CConnman::SocketHandler()` are combined into
    `SockMan::ThreadSocketHandler()`.
    
    `CConnman::SocketHandlerConnected()` goes to
    `SockMan::SocketHandlerConnected()`.
    
    `CConnman::SocketHandlerListening()` goes to
    `SockMan::SocketHandlerListening()`.
    08dc1ee704
  162. net: move-only: improve encapsulation of SockMan
    `SockMan` members
    
    `AcceptConnection()`
    `NewSockAccepted()`
    `GetNewId()`
    `m_i2p_sam_session`
    `m_listen`
    
    are now used only by `SockMan`, thus make them private.
    741f17e51d
  163. vasild force-pushed on Feb 17, 2025
  164. vasild commented at 8:52 am on February 17, 2025: contributor
    e1671ff42c...741f17e51d: rebase and remove the first commit which was merged via #31854, thanks!
  165. jonatack commented at 6:44 pm on February 17, 2025: member
    Concept ACK

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-02-22 21:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me