Cache the -capturemessages setting in CConnman rather than re-evaluating it every time we invoke PushMessage.
We run InactivityChecks() for each node every time poll()/select() returns, every
50ms or so. Rather than calculating the current time once for each node,
just calculate it once and reuse it.
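As a rough sketch of the pattern this second commit moves to (simplified stand-ins, not the real CConnman/CNode interfaces; the timeout value is just illustrative):

```cpp
#include <chrono>
#include <vector>

using namespace std::chrono_literals;

struct Node {
    std::chrono::microseconds m_last_recv{0us};
    bool fDisconnect{false};
};

// Stand-in for CConnman::InactivityCheck(): takes `now` as a parameter
// instead of reading the clock itself.
bool InactivityCheck(const Node& node, std::chrono::microseconds now)
{
    constexpr auto TIMEOUT_INTERVAL = 20min;
    return now - node.m_last_recv > TIMEOUT_INTERVAL;
}

void SocketHandler(std::vector<Node*>& nodes)
{
    // One clock read per poll()/select() pass (~every 50ms), reused for every node.
    const auto now = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::system_clock::now().time_since_epoch());
    for (Node* node : nodes) {
        if (InactivityCheck(*node, now)) node->fDisconnect = true;
    }
}

int main()
{
    Node node;
    std::vector<Node*> nodes{&node};
    SocketHandler(nodes);
}
```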
Not sure if the flame graph is usable, but:
GetBoolArg takes up 0.31% of total time, as part of PushMessage that takes up 1.75% of total time, in b-msghand.
GetTime takes up 0.82% of total time, as part of InactivityCheck that takes up 1.78% of total time, in b-net.
Converting from std::chrono::microseconds to NodeClock::time_point is a lot more intrusive (impacting at least net_processing and node/eviction as well).
Note that CConnman was a friend of CNode until #27257 (no longer relevant)
@@ -2123,6 +2122,8 @@ void CConnman::SocketHandlerConnected(const std::vector<CNode*>& nodes,
 {
     AssertLockNotHeld(m_total_bytes_sent_mutex);

+    auto now = GetTime<std::chrono::microseconds>();
GetTime() is deprecated.
const auto now = NodeClock::now();
(this is moving the deprecated call from elsewhere, but now is a good time to change it)
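Roughly, the suggested shape (just a sketch; NodeClock is aliased to the system clock here so the snippet compiles standalone):

```cpp
#include <chrono>

using NodeClock = std::chrono::system_clock; // stand-in for the clock from util/time.h

int main()
{
    // Non-deprecated form: take a time point once...
    const auto now = NodeClock::now(); // NodeClock::time_point

    // ...and derive a duration-since-epoch only where existing code still compares
    // against std::chrono::microseconds, i.e. what GetTime<std::chrono::microseconds>()
    // used to hand back.
    const auto now_us =
        std::chrono::duration_cast<std::chrono::microseconds>(now.time_since_epoch());
    (void)now_us;
}
```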
Converting from std::chrono::microseconds to NodeClock::time_point is a lot more intrusive (impacting at least net_processing and node/eviction as well).
I did try that; it requires a lot of changes to all the things we compare now against, such as m_connected, m_last_send, m_last_recv. m_connected in particular is a big hit compared to the rest of this PR.
This works:
@@ -2125 +2125 @@ void CConnman::SocketHandlerConnected(const std::vector<CNode*>& nodes,
-    auto now = GetTime<std::chrono::microseconds>();
+    auto now = NodeClock::now();
@@ -2218 +2218 @@ void CConnman::SocketHandlerConnected(const std::vector<CNode*>& nodes,
-    if (InactivityCheck(*pnode, now)) pnode->fDisconnect = true;
+    if (InactivityCheck(*pnode, now.time_since_epoch())) pnode->fDisconnect = true;
If GetTime() is still useful because a lot of surrounding code uses e.g. std::chrono::seconds which we need to compare against, then should GetTime() be un-deprecated?
Converting from std::chrono::microseconds to NodeClock::time_point is a lot more intrusive (impacting at least net_processing and node/eviction as well).
Happy to take a look as well in a fresh commit, either here, or in a follow-up.
If GetTime() is still useful because a lot of surrounding code uses e.g. std::chrono::seconds which we need to compare against, then should GetTime() be un-deprecated?
GetTime returning a time duration is wrong, because the current time point (now) is not a duration, but a time point. A duration arises as the difference of two points in time. This duration can then be compared with any other duration (e.g. peer timeout). I don't think it makes sense to un-deprecate something just because it is used in the current code. If this were a valid reason, nothing could ever be marked deprecated as long as it is used.
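Concretely (a tiny illustration, not Bitcoin Core code; the timeout is arbitrary):

```cpp
#include <chrono>

using namespace std::chrono_literals;
using NodeClock = std::chrono::system_clock; // stand-in for the clock from util/time.h

int main()
{
    // A time point is a fixed moment; a duration is a span of time.
    const NodeClock::time_point connected = NodeClock::now();
    // ... later ...
    const NodeClock::time_point now = NodeClock::now();

    // The difference of two time points is a duration, which can then be
    // compared against any other duration, e.g. a peer timeout.
    const auto idle = now - connected; // NodeClock::duration
    constexpr auto TIMEOUT_INTERVAL = 20min;
    return idle > TIMEOUT_INTERVAL ? 1 : 0;
}
```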
@maflcko, yes, I agree. But existing code uses “duration”:
CNode::m_last_send
CNode::m_last_recv
CNode::m_last_tx_time
CNode::m_last_block_time
CNode::m_connected
so, if one is writing new code that needs to compare “now” to those, which one is preferred:
GetTime(); or NodeClock::now() and convert using time_since_epoch() in order to compare?
Happy to take a look as well in a fresh commit, either here, or in a follow-up.
Happy to review in a followup.
When looking at this previously, it seemed like having an AtomicTimePoint<Clock> template would be helpful (for things like m_last_send which is currently atomic<seconds>), because time_point doesn’t support being an atomic. Here’s a branch where I last looked at this topic; the “Add AtomicTimePoint” commit might be worth cherry-picking.
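For reference, here is roughly what such a wrapper could look like (a hedged sketch, not the commit from that branch): store the ticks-since-epoch in an atomic of the clock's representation type and convert at the boundaries, so callers keep time_point semantics while the storage stays a plain atomic integer.

```cpp
#include <atomic>
#include <chrono>

template <typename Clock>
class AtomicTimePoint
{
    std::atomic<typename Clock::rep> m_ticks{0};

public:
    typename Clock::time_point load() const noexcept
    {
        return typename Clock::time_point{typename Clock::duration{m_ticks.load()}};
    }
    void store(typename Clock::time_point tp) noexcept
    {
        m_ticks.store(tp.time_since_epoch().count());
    }
};

int main()
{
    // e.g. a stand-in for something like CNode::m_last_send
    AtomicTimePoint<std::chrono::system_clock> last_send;
    last_send.store(std::chrono::system_clock::now());
    const auto elapsed = std::chrono::system_clock::now() - last_send.load();
    (void)elapsed;
}
```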
because time_point doesn’t support being an atomic.
Duration does not either. I think it is fine to type .load() or .store(), where needed.
I guess there is no rush here, and this can be done, when all other places (non-atomic) are switched.
There are atomic<seconds> and atomic<microseconds> in various places in net, net_processing and util/time. atomic<time_point> doesn't work because even if the duration is noexcept (necessary for atomic), that doesn't propagate to the time_point's constructor also being noexcept, which is a requirement for wrapping in atomic.
noexcept. Can you give a minimal reproducer, or link to the std docs?
What I was seeing was that the duration constructor was noexcept in practice, but time_point has an explicit constructor that wouldn’t throw but wasn’t declared as noexcept, which was then rejected by atomic…
Stackoverflow answer about it: https://stackoverflow.com/questions/22701617/should-constructors-on-stdchrono-time-point-be-noexcept-or-why-arent
Ah, it looks like it changed between C++17 and C++20, going from atomic() noexcept = default; to constexpr atomic() noexcept(is_nothrow_default_constructible_v<T>); which fixes it.
Hmm, it looks to me like libstdc++ didn’t match the C++20 spec until https://github.com/gcc-mirror/gcc/commit/613f8ddbe3d7da63d827e588bf0333813c184b8a which doesn’t seem to have been merged until March this year in libstdc++15? C++20 spec change was https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p0883r2.pdf
However I can’t reproduce the “core” bug LWG#2165 CWG#1778 even with -std=c++11, so apparently modern compilers just no longer behave this way? I can’t quite figure out what the change was or when it was made though. :man_shrugging:
EDIT: ah, looks like the underlying language issue was present in g++ 9.x but fixed in g++ 10.x (May 2020).
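For what it's worth, a small probe along these lines illustrates the distinction (what it prints, and whether the commented-out line compiles, depends on the compiler/library versions discussed above):

```cpp
#include <atomic>
#include <chrono>
#include <cstdio>
#include <type_traits>

int main()
{
    using secs = std::chrono::seconds;
    using tp = std::chrono::steady_clock::time_point;

    // Pre-C++20, std::atomic's default constructor was `atomic() noexcept = default;`,
    // so std::atomic<T>{} hinges on T's default constructor being noexcept. Durations
    // get a defaulted (hence noexcept) default constructor, while time_point's is
    // user-provided and typically not declared noexcept (see the SO link above).
    std::printf("seconds nothrow default-constructible:    %d\n",
                static_cast<int>(std::is_nothrow_default_constructible_v<secs>));
    std::printf("time_point nothrow default-constructible: %d\n",
                static_cast<int>(std::is_nothrow_default_constructible_v<tp>));

    std::atomic<secs> last_send{}; // fine on the toolchains discussed above
    last_send.store(secs{42});
    std::printf("last_send = %lld s\n", static_cast<long long>(last_send.load().count()));

    // std::atomic<tp> last_send_tp{}; // the line that older gcc/libstdc++ combinations rejected
}
```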
@@ -2214,7 +2215,7 @@ void CConnman::SocketHandlerConnected(const std::vector<CNode*>& nodes,
         }
     }

-    if (InactivityCheck(*pnode)) pnode->fDisconnect = true;
+    if (InactivityCheck(*pnode, now)) pnode->fDisconnect = true;
now is not the best name for it. What about time_at_start_of_loop or something like that?
@@ -1666,6 +1674,11 @@ class CConnman
     */
    bool whitelist_relay;

+    /**
+     * flag for whether messages are captured
+     */
+    bool capture_messages{false};
m_capture_messages
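To sketch the caching pattern this PR introduces (simplified stand-ins, not the real ArgsManager/CConnman interfaces; the member follows the m_ naming suggested above):

```cpp
#include <string>
#include <unordered_map>

// Minimal stand-in for the settings lookup that PushMessage currently performs
// per message via GetBoolArg.
struct Args {
    std::unordered_map<std::string, bool> settings;
    bool GetBoolArg(const std::string& name, bool def) const
    {
        const auto it = settings.find(name);
        return it == settings.end() ? def : it->second;
    }
};

class Connman
{
public:
    void Init(const Args& args)
    {
        // Read the option once at startup and cache it.
        m_capture_messages = args.GetBoolArg("-capturemessages", false);
    }

    void PushMessage(/* CNode& node, CSerializedNetMsg&& msg */)
    {
        // Hot path: check the cached member instead of doing a string-keyed
        // settings lookup for every message.
        if (m_capture_messages) {
            // CaptureMessage(...) would run here.
        }
        // ... queue the message for sending ...
    }

private:
    bool m_capture_messages{false};
};

int main()
{
    Args args;
    args.settings["-capturemessages"] = true;
    Connman connman;
    connman.Init(args);
    connman.PushMessage();
}
```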
@@ -978,6 +967,8 @@ class CNode
    }

private:
+    friend class CConnman;
I do not like the second commit a542bf10f0 net: [refactor] move ReceiveMsgBytes to CConnman - it reduces the encapsulation of CNode and goes against #27257. Its commit message says:
This allows ReceiveMsgBytes() to access CConnman members without needing to copy them to every CNode. This is used in the next commit.
But I don’t think the next commit b4d1007221 net: Cache -capturemessages setting does any of that, hmm?
@@ -2123,6 +2122,8 @@ void CConnman::SocketHandlerConnected(const std::vector<CNode*>& nodes,
 {
     AssertLockNotHeld(m_total_bytes_sent_mutex);

+    auto now = GetTime<std::chrono::microseconds>();
+
net: Pass time to InactivityChecks functions is well grounded. No need to retrieve the current time for every node, given that we only care about seconds-precision here. I measured on my slow node with a few tens of connections: all nodes are processed in a few milliseconds or less. So, at worst the time is stale by that much, which is fine IMO.
Re: #34025 (comment)
Just to confirm, this flamegraph is from a node that has finished syncing to the tip? i.e. IBD is not included in this graph, right?
GetTime takes up 0.82% of total time, as part of InactivityCheck that takes up 1.78% of total time, in b-net.
Interesting. I was wondering why getting the time eats so much CPU. Though, in the happy path, InactivityCheck is just loading a few atomics, which means getting the time costs as much as loading a few atomics. Also, the flame graph probably shows the CPU time, and not the wall clock time. So the patch here likely won’t cut the wall clock time between two calls of SocketHandlerConnected, but only the CPU time inside a single SocketHandlerConnected call?
Concept ACK. Seems fine to still make the changes here.
Just to confirm, this flamegraph is from a node that has finished syncing to the tip? i.e. IBD is not included in this graph, right?
Yes; it’s taken over a longish period though, iirc, I think either 30m or 2h. You can see ProcessNewBlock at 0.20% just before ProcessTransaction at 4.56% fwiw.
Also, the flame graph probably shows the CPU time, and not the wall clock time. So the patch here likely won’t cut the wall clock time between two calls of SocketHandlerConnected, but only the CPU time inside a single SocketHandlerConnected call?
Yeah, presuming SocketHandlerConnected isn’t using 100% of a core, it’ll be spending its time waiting for the SELECT_TIMEOUT_MILLISECONDS timeout to hit, which is 50ms.
ACK 5f5c1ea01955d277581f6c2acbeb982949088960
Would be happy to see the call to the deprecated GetTime() removed: #34025 (review).
I do not see it as a blocker because this PR is actually moving the call around, not planting a new one.
review ACK 5f5c1ea01955d277581f6c2acbeb982949088960 🏣
Signature:
untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
trusted comment: review ACK 5f5c1ea01955d277581f6c2acbeb982949088960 🏣
DIf+EMRUrE9g+3ldiGUW0pHeRyThkZk3bbOL6sas10+I4f60l14Vo/S1nCwtrHEqiue3z1X4uTE3S7uN3Tg3DQ==
ACK 5f5c1ea01955d277581f6c2acbeb982949088960
Looks like an improvement, but I wonder if, instead of doing micro-optimizations like the ones here, it wouldn’t be better to have the entire InactivityCheck() procedure run somewhere else than in SocketHandlerConnected, or at least less often - it seems a bit absurd to check every 50ms for a timeout of 20 minutes. Even the initial waiting time m_peer_connect_timeout from ShouldRunInactivityChecks() for a new peer doesn’t really need to be checked that frequently.
it seems a bit absurd to check every 50ms for a timeout of 20 minutes. Even the initial waiting time m_peer_connect_timeout from ShouldRunInactivityChecks() for a new peer doesn’t really need to be checked that frequently.
I think if you’re running a listening node with many real inbound peers, on mainnet where there seems to be maybe 4.5 tx/s, you probably expect to receive socket data from each peer every 2 seconds or so (announcing the ~9 txs that have been received in that time). With 80 (inbound, tx relay) peers, that’s an interrupt every 25ms, rather than every 50ms. Worse if you have more peers, better if many of your peers are block-relay only or inactive spy nodes or similar.
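Spelling out that arithmetic (the inputs are the rough assumptions above, not measurements):

```cpp
#include <cstdio>

int main()
{
    const double tx_per_sec = 4.5;          // rough mainnet transaction rate
    const double announce_interval_s = 2.0; // rough gap between INV batches from one peer
    const int inbound_tx_relay_peers = 80;

    std::printf("txs per INV batch: ~%.0f\n", tx_per_sec * announce_interval_s);
    std::printf("expected socket wakeup interval: ~%.0f ms\n",
                1000.0 * announce_interval_s / inbound_tx_relay_peers);
}
```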