stickies-v
commented at 6:18 pm on March 11, 2024:
contributor
An earlier approach in #28956 involved simplifying and refactoring the network-adjusted time calculation logic, but this was eventually left out of the PR to make it easier for reviewers to focus on consensus logic changes.
Since network-adjusted time is now only used for warning/informational purposes, cleaning up the logic (building on @dergoegge’s approach in #28956) should be quite straightforward and uncontroversial. The main changes are:
Previously, we would only calculate the time offset from the first 199 outbound peers that we connected to. This limitation is now removed, and we have a proper rolling calculation. I’ve reduced the set to 50 outbound peers, which seems plenty.
Previously, we would automatically use the network-adjusted time if the difference was < 70 mins, and warn the user if the difference was larger than that. Since there is no longer any automated time adjustment, I’ve changed the warning threshold to 20 10 minutes (which is an arbitrary number).
Previously, a warning would only be raised once, and then never again until node restart. This behaviour is now updated to 1) warn to log for every new outbound peer for as long as we appear out of sync, 2) have the RPC warning toggled on/off whenever we go in/out of sync, and 3) have the GUI warn whenever we are out of sync (again), but limited to 1 messagebox per 60 minutes
#29015 (kernel: Streamline util library by ryanofsky)
#27052 (test: rpc: add last block announcement time to getpeerinfo result by LarryRuane)
#26966 (index: blockfilter initial sync speedup, parallelize process by furszy)
#15218 (validation: sync chainstate to disk after syncing to tip by andrewtoth)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
stickies-v force-pushed
on Mar 11, 2024
DrahtBot added the label
CI failed
on Mar 11, 2024
DrahtBot
commented at 6:26 pm on March 11, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
Could you add a fuzz test for TimeOffsets to replace the timedata one?
stickies-v
commented at 2:09 pm on March 12, 2024:
Can do, but given how simple the interface is I think it wouldn’t be anything more than the below (untested)? Could e.g. add the .Size() method back but adding it just for fuzzing might not be ideal?
but given how simple the interface is I think it wouldn’t be anything more than the below
That’s good enough. We just want the code to be under test, in case we have change it in the future.
stickies-v
commented at 4:39 pm on March 12, 2024:
Done. I was a bit worried WarnIfOutOfSync() outputting to stderr would mess with the fuzzer (as here), but running it locally seems to work fine so far, not sure if the appropriate paths have been hit already though.
Resolving for now.
stickies-v
commented at 2:09 pm on March 12, 2024:
contributor
Force pushed to:
address a thread safety analysis issue by annotating WarnIfOutOfSync() with a EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
fix failing functional test by resetting node_miner’s mocktime before connecting to the next outbound
rebase to latest master to fix native macos job
alfonsoromanz
commented at 2:46 pm on March 12, 2024:
contributor
Verified that the peer window now shows the offset of each peer.
FYI this is not new functionality, you should see the same behaviour on master.
stickies-v force-pushed
on Mar 13, 2024
vasild
commented at 8:02 am on March 13, 2024:
contributor
Concept ACK, I love simplifications or refactors that are net removing code (+213 −363).
DrahtBot removed the label
CI failed
on Mar 13, 2024
ariard
commented at 5:14 pm on March 16, 2024:
none
Previously, we would only calculate the time offset from the first 199 outbound peers that we connected to. This limitation is now removed, and we have a proper rolling calculation. I’ve reduced the set to 50 outbound peers, which seems plenty.
Previously, we would automatically use the network-adjusted time if the difference was < 70 mins, and warn the user if the difference was larger than that. Since there is no longer any automated time adjustment, I’ve changed the warning threshold to 20 minutes (which is an arbitrary number).
on the 2 value selections (50 outbound peers):
50 sounds reasonable, to be fair best would be to add timestamp in m_outbound_time_offsets from FEELER connection, it’s an outbound connection type per se. shuffling things more at the AS-level.
20 min, i can suggest to pick up 10 min to match average interval-block-time of ~10 min
sipa
commented at 4:52 pm on March 17, 2024:
member
utACK5c8e71cb2b827586efd052a87ccc463c99387fcf
DrahtBot requested review from vasild
on Mar 17, 2024
count isn’t type-safe. If the type were to change, the RPC result would silently change as well. It may be better to use a casting tick counter Ticks<>.
stickies-v
commented at 12:11 pm on March 18, 2024:
Done
maflcko
commented at 8:23 am on March 18, 2024:
member
left a nit, might review later
in
src/node/timeoffsets.cpp:48
in
5c8e71cb2boutdated
43+ auto median{std::max(Median(), std::chrono::seconds(std::numeric_limits<int64_t>::min() + 1))};
44+ if (std::chrono::abs(median) <= m_warn_threshold) return false;
45+
46+ bilingual_str msg{_("Your computer's date and time appear out of sync with the network, "
47+ "this may lead to consensus failure. Please ensure it is correct.")};
48+ LogWarning("%s\n", msg.translated);
The debug log should not log translated messages, but the original ones
stickies-v
commented at 11:56 am on March 18, 2024:
Thanks, changed
in
src/node/timeoffsets.cpp:33
in
5c8e71cb2boutdated
28+std::chrono::seconds TimeOffsets::Median() const
29+{
30+ LOCK(m_mutex);
31+
32+ // Only calculate the median if we have 5 or more offsets
33+ if (m_offsets.size() < 5) return {0s};
in fuzz tests, you’ll have to use LIMITED_WHILE. Otherwise, the fuzz target will never terminate, if the fuzz input is unlimited in length.
stickies-v
commented at 11:54 am on March 18, 2024:
Thanks, now using LIMITED_WHILE(..., 4'000) to keep the same number of iterations as the max in the previous fuzz test.
stickies-v force-pushed
on Mar 18, 2024
stickies-v
commented at 12:24 pm on March 18, 2024:
contributor
Thank you for the reviews everyone, I’ve force-pushed to address all comments (and no other changes).
50 sounds reasonable, to be fair best would be to add timestamp in m_outbound_time_offsets from FEELER connection, it’s an outbound connection type per se. shuffling things more at the AS-level.
That’s already happening, we’re considering all types of outbound connections (including feelers)
20 min, i can suggest to pick up 10 min to match average interval-block-time of ~10 min
What’s your rationale for matching it with the average block-time interval? I don’t see how they’re related. I’m not opposed to a shorter threshold (like 10 minutes), just want to avoid warnings occurring too frequently so users start to get used to and ignore them. Want to avoid bikeshedding over this, so going to keep as is for now until more reviewers voice their preference.edit: updated threshold to 10 minutes
stickies-v force-pushed
on Mar 18, 2024
stickies-v force-pushed
on Mar 18, 2024
stickies-v
commented at 2:00 pm on March 18, 2024:
contributor
Force-pushed to revert the fuzz harness name change, because for a couple of commits, we now have two harnesses with the same name - until the old one is removed in Remove timedata commit. I’ve opened a PR in qa-assets to rename the input folder (thx dergoegge for the suggestion).
dergoegge approved
dergoegge
commented at 4:26 pm on March 18, 2024:
member
DrahtBot requested review from sipa
on Mar 18, 2024
sipa
commented at 4:40 pm on March 18, 2024:
member
utACKa42d9b5041cdc522e8a4b81944e45b7f30458c26
stickies-v referenced this in commit
84c2d9607b
on Mar 18, 2024
in
src/node/timeoffsets.cpp:47
in
a42d9b5041outdated
42+ // when median == std::numeric_limits<int64_t>::min(), calling std::chrono::abs is UB
43+ auto median{std::max(Median(), std::chrono::seconds(std::numeric_limits<int64_t>::min() + 1))};
44+ if (std::chrono::abs(median) <= m_warn_threshold) return false;
45+
46+ bilingual_str msg{_("Your computer's date and time appear out of sync with the network, "
47+ "this may lead to consensus failure. Please ensure it is correct.")};
“Your computer’s clock is 765 seconds behind the network. The difference is more than 600 seconds and this may lead to a consensus failure. Please ensure your computer’s clock is correct.”
Consider this ^ while retouching. I think it is more useful because it tells the user how wrong is the clock and what are the expectations of Bitcoin Core. Otherwise (to the user) it is not clear whether Bitcoin Core is too picky and whether this is about +/- 1 second or something like that. Referring the user to the getnetworkinfo RPC may also be useful.
I think it is more useful because it tells the user how wrong is the clock
Not sure. Just because it is the median does not mean that it is an accurate representation of the current time. For example, the time offsets could all be widely different numbers larger than the warning threshold. Or, it could be an outdated number, because the median is calculated based on samples from random points in the past.
stickies-v
commented at 6:05 pm on March 22, 2024:
I agree that the specific numbers don’t matter too much here. The main goal is to have the user double check, and correct their settings if necessary, which should be obvious very quickly. I’ve updated the help message to reinforce that, and also to give context on when the message will disappear:
0 bilingual_str msg{strprintf(_(
1"Your computer's date and time appears to be more than %d minutes out of sync with the network, "2"this may lead to consensus failure. After you've corrected your settings, this message "3"should no longer appear when you restart your node. Without a restart, it should stop showing "4"automatically after you've connected to a sufficient number of new outbound peers, which may "5"take some time."6 ), Ticks<std::chrono::minutes>(m_warn_threshold))};
It’s quite a long message, but it should only very rarely pop up, and when it does, we really want the user to address it ASAP, so I suppose that’s fair.
I hope this works for both of you?
in
src/node/timeoffsets.h:21
in
a42d9b5041outdated
16+{
17+private:
18+ static constexpr size_t N{50};
19+ //! Minimum difference between system and network time for a warning to be raised.
20+ static constexpr std::chrono::minutes m_warn_threshold{10};
21+ mutable std::atomic<bool> m_warning_emitted{false}; // Avoid flooding the user with warnings.
This way the warning will be printed just once in the lifetime of the process. If the user does something in an attempt to resolve the issue, how would they know that the issue is resolved? Usually if a message is logged every, e.g. 10 minutes (to avoid flooding the log), the disappearance of the message is an indication that the problem is resolved.
Is there a way for the user to see the median network time, as perceived by his node? If yes, then they can manually compare.
Also, the time offset issue is one that can come and go as time passes. Imagine a bitcoind running for 1 month. It had issues with the time soon after it was started which the operator has resolved. 1 month later the issues resurface - there is no warning or indication about it?
Maybe replace that bool with a timestamp of the last log and print the error message if the last log was e.g. 10 minutes ago?
stickies-v
commented at 11:08 pm on March 19, 2024:
For GUI and RPC warnings, this PR pretty much preserves the current behaviour: GUI messagebox gets shown only once (until restarted), and the RPC will keep firing forever (on master, it will be reset when another SetMiscWarning() is called, but in practice I don’t think that really happens frequently).
For the debug log, however, this PR quite significantly improves things by
doing an actual rolling calculation, which means that if clocks start drifting, the user will eventually still get a warning, whereas on master the time adjustment system completely stops after the first 199 outbound handshakes
producing repeated log warnings (one for every outbound handshake) for as long and only for as long as the median time offset exceeds the threshold.
While I agree that for GUI & RPC things should be improved, this PR is no regression, and I’d rather it be done in a separate PR. Avoiding flooding seems less trivial to me than adding a simple timer: having a GUI message popup every 10 minutes is quite annoying if someone decides to flood the network with nodes with wrong clocks.
Is there a way for the user to see the median network time, as perceived by his node?
yes, through getnetworkinfo a user can see the median offset
1 month later the issues resurface - there is no warning or indication about it?
there will be warning logs every time we handshake with an outbound peer and the median timeoffset exceeds the threshold
stickies-v
commented at 4:45 pm on March 20, 2024:
See https://github.com/stickies-v/bitcoin/commit/0146553aaab53b6092bb54e0b2c71282fef44787 for a potential implementation on turning {GUI, RPC} warnings off again when the offsets fall within the threshold again. Happy to open a follow-up PR with this if you like the approach, I think this PR is too close to merge to overhaul it again as per my previous comment, what do you think?
I would be happy to review that commit if you PR it. Early suggestion:
0- m_warning_emitted.value() - now > m_warning_wait
1+ now - m_warning_emitted.value() > m_warning_wait
stickies-v
commented at 9:05 pm on March 22, 2024:
The latest force-push now also includes proper GUI and RPC warning behaviour. RPC warnings are now only shown when the median exceeds the threshold. GUI warnings are shown whenever the node falls out of sync, but max once per 60 minutes. I think this should fully address your comment? Thanks!
stickies-v referenced this in commit
373778c1fd
on Mar 19, 2024
in
src/warnings.cpp:68
in
a42d9b5041outdated
61@@ -56,6 +62,11 @@ bilingual_str GetWarnings(bool verbose)
62 warnings_verbose.emplace_back(warnings_concise);
63 }
6465+ if (g_timeoffset_warning) {
66+ warnings_concise = _("Your computer's date and time appear out of sync with the network, this may lead to consensus failure. Please ensure it is correct.");
67+ warnings_verbose.emplace_back(warnings_concise);
68+ }
Once g_timeoffset_warning is set to true it will never be set back to false and this will warning will persist even after the problem is resolved?
stickies-v
commented at 11:09 pm on March 19, 2024:
Correct, as is roughly the current behaviour, as per #29623 (review)
in
test/functional/feature_maxtipage.py:48
in
a42d9b5041outdated
42@@ -43,6 +43,10 @@ def test_maxtipage(self, maxtipage, set_parameter=True, test_deltas=True):
43 self.generate(node_miner, 1)
44 assert_equal(node_ibd.getblockchaininfo()['initialblockdownload'], False)
4546+ # reset time to system time so we don't have a time offset with the ibd node the next
47+ # time we connect to it, ensuring TimeOffsets::WarnIfOutOfSync() doesn't output to stderr
48+ node_miner.setmocktime(0)
Could squash commits:
587349072f[net processing] Use TimeOffsets to warn when node is out of syncd790e776ca[test] reset miner mocktime in maxtipage to prevent time offset warning
stickies-v
commented at 6:27 pm on March 22, 2024:
Squashed
DrahtBot requested review from vasild
on Mar 19, 2024
in
src/timedata.cpp:70
in
a42d9b5041outdated
64- // But in this case the 'bug' is protective against some attacks, and may
65- // actually explain why we've never seen attacks which manipulate the
66- // clock offset.
67- //
68- // So we should hold off on fixing this and clean it up as part of
69- // a timing cleanup that strengthens it in a number of other ways.
Right, it dawned on me that previously we would assume the network’s time, ie “adjust” our time according to the network and this has been removed, so “the network” can no longer mess with our clock. Sorry for the noise.
DrahtBot requested review from vasild
on Mar 19, 2024
vasild approved
vasild
commented at 9:31 am on March 21, 2024:
contributor
ACKa42d9b5041cdc522e8a4b81944e45b7f30458c26
DrahtBot added the label
CI failed
on Mar 21, 2024
stickies-v force-pushed
on Mar 21, 2024
stickies-v
commented at 7:19 pm on March 21, 2024:
contributor
While working on the fuzz inputs, I noticed that the memory usage for this new timeoffsets harness is unusually high. Doing one more small force-push to patch the fuzz and unit test to:
disable logging in the fuzz test, which was causing the memory leak (thanks @dergoegge)
use noui.h to prevent the GUI from flooding the fuzz and unit test with (expected) warnings
It’s a pretty small test-only diff, so I’m hoping reviewers can give this a quick re-review? Thank you!
maflcko
commented at 7:22 pm on March 21, 2024:
member
Maybe there should be an assertion failure in the fuzz tests, if log messages have been written to the buffer, but logging is still disabled when the program exits?
stickies-v marked this as a draft
on Mar 21, 2024
stickies-v
commented at 8:05 pm on March 21, 2024:
contributor
Changed to draft until I’ve resolved the merge conflict with #29279 which relies on setmocktime
theStack
commented at 11:43 pm on March 21, 2024:
contributor
Concept ACK
in
src/node/timeoffsets.h:13
in
9f2904b621outdated
nit in 9f2904b621cc945005570fc6e94832d73d36d92b: <cstddef>
stickies-v
commented at 3:56 pm on March 22, 2024:
fixed
in
src/node/timeoffsets.cpp:65
in
9f2904b621outdated
43+ auto median{std::max(Median(), std::chrono::seconds(std::numeric_limits<int64_t>::min() + 1))};
44+ if (std::chrono::abs(median) <= m_warn_threshold) return false;
45+
46+ bilingual_str msg{_("Your computer's date and time appear out of sync with the network, "
47+ "this may lead to consensus failure. Please ensure it is correct.")};
48+ LogWarning("%s\n", msg.original);
nit in 9f2904b621cc945005570fc6e94832d73d36d92b: This may keep printing the warning, even if the user adjusted the clock, because the offsets are not discarded when the clock changes.
stickies-v
commented at 9:09 pm on March 22, 2024:
The log message has been updated to highlight that the message may persist for a while, thanks.
Grepping in my node’s log for how frequent outbound connections are made - it would take about 1.5 days to make 200 new outbound connections. Since this is about the median, I guess 101 new connections after fixing the clock will be enough for us to stop the warning. That is like 20ish hours. Quite long. Can we do better?
Currently we only save the difference between our time and the peer’s time at the start of the connection and we have no clue that this difference has changed due to our clock being adjusted/corrected. If we save the value of the system_clock and steady_clock at the start of the connection, in addition to the peer’s offset, at a later time we can deduce that our clock has been adjusted and add that diff/adjustment to the peer’s offset to immediately figure out that “now our clock is in sync with that peer’s clock”.
So, instead of peer_offset, it would be a matter of using the following instead:
peer_offset + steady_clock_diff - system_clock_diff, assuming:
steady_clock_diff = steady_clock_now - steady_clock_when_connectedsystem_clock_diff = system_clock_now - system_clock_when_connected.
Example:
Lets say our clock is wrong, 15 minutes ahead. When we connect to a peer at 17:00:00 our time they send us a time that is 15 minutes ago and we save:
peer_offset = -15min,
system_clock_when_connected = 17:00:00,
steady_clock_when_connected = 1000 (some obscure value [seconds]).
One hour later, our clock has been fixed/corrected, by moving it 15 min backwards (or stopping to move for 15 min). So,
system_clock_now = 17:45:00,
steady_clock_now = 4600 (the obscure value 1000 plus 3600 because 1 hour has passed).
And then:
steady_clock_diff = 4600 - 1000 = 60 minsystem_clock_diff = 17:45:00 - 17:00:00 = 45 min
The difference between those is 15 minutes and adding that to peer_offset gives us offset 0, meaning that now our clock is in sync with the peer’s clock.
stickies-v
commented at 10:10 pm on March 30, 2024:
Using both system and steady clock is an interesting idea and one I hadn’t considered, thanks. I’m going to look into this a bit more and will follow up shortly.
it would take about 1.5 days to make 200 new outbound connections
FYI this PR reduces 200 -> 50 outbounds so your calculations are a bit off.
in
src/node/timeoffsets.cpp:52
in
9f2904b621outdated
47+ "this may lead to consensus failure. Please ensure it is correct.")};
48+ LogWarning("%s\n", msg.original);
49+
50+ if (!m_warning_emitted) {
51+ m_warning_emitted = true;
52+ SetMedianTimeOffsetWarning();
stickies-v
commented at 4:22 pm on March 22, 2024:
I didn’t think about it much, have re-added a debug log line.
in
src/net_processing.cpp:3628
in
e5c3b28079outdated
3623@@ -3625,8 +3624,8 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3624 if (!pfrom.IsInboundConn()) {
3625 // Don't use timedata samples from inbound peers to make it
3626 // harder for others to tamper with our adjusted time.
3627- AddTimeData(pfrom.addr, Ticks<std::chrono::seconds>(peer->m_time_offset.load()));
nit in e5c3b2807921caa6f2717fa84915cb02b1f4d9fb: Not that it matters, but the commit itself is broken, because GetTimeOffset in the rpc will now always be zero in this commit. The next commit fixes it.
stickies-v
commented at 6:27 pm on March 22, 2024:
Ah, right. I’ve squashed the commit so it’s no longer relevant.
maflcko
commented at 11:10 am on March 22, 2024:
member
I think it would be fine to globally set this in MakeNoLogFileContext, or all tests, or should there ever be a case where one would want stuff printed to stderr (from ThreadSafeMessageBox) in tests instead of to the test log?
stickies-v
commented at 7:30 pm on March 22, 2024:
Seems sensible, but given that MakeNoLogFileContext is used in quite a few other places I’d rather keep that for a separate pull
in
src/net_processing.cpp:3613
in
0b372c0ab3outdated
stickies-v
commented at 5:23 pm on March 22, 2024:
Neat, updated thanks
stickies-v referenced this in commit
032a597482
on Mar 22, 2024
fanquake referenced this in commit
a175efe768
on Mar 22, 2024
stickies-v force-pushed
on Mar 22, 2024
stickies-v force-pushed
on Mar 22, 2024
stickies-v
commented at 10:39 pm on March 22, 2024:
contributor
Thanks for the in-depth review everyone.
Force-pushed to rebase to make CI green again (#29704) and address all outstanding review comments, given that this has now become a larger push anyway. The main changes are:
RPC warnings are properly reset when the issue is resolved, GUI warnings can now be shown more than once (e.g. when the node falls out of sync again), but only once per 60 minutes to prevent flooding.
the warning message has been improved to make it more clear what the user has to/can do
re-added debug log whenever an offset is added (i.e. an outbound peer connects)
stickies-v marked this as ready for review
on Mar 22, 2024
DrahtBot removed the label
CI failed
on Mar 22, 2024
in
src/net_processing.cpp:394
in
5084ad4846outdated
389@@ -390,6 +390,10 @@ struct Peer {
390 /** Whether this peer wants invs or headers (when possible) for block announcements */
391 bool m_prefers_headers GUARDED_BY(NetEventsInterface::g_msgproc_mutex){false};
392393+ /** Time offset computed during the version handshake based on the
394+ * timestamp the peer send in the version message. */
in
src/net_processing.cpp:395
in
16dededcd8outdated
391@@ -392,7 +392,7 @@ struct Peer {
392393 /** Time offset computed during the version handshake based on the
394 * timestamp the peer send in the version message. */
395- std::atomic<int64_t> m_time_offset{0};
396+ std::atomic<std::chrono::seconds> m_time_offset{0s};
In 16dededcd8d52b6d0fdbe9e0ffab6a177aa8aa17, why is this defined as 0s, but time_offset is defined as 0 in net_processing.cpp?
stickies-v
commented at 11:10 am on April 6, 2024:
It’s wrapped in a std::atomic here which requires the chrono literal, doesn’t compile otherwise:
0net_processing.cpp:395:52: error: no matching constructor for initialization of 'std::atomic<std::chrono::seconds>' (aka 'atomic<duration<long long>>')
1 std::atomic<std::chrono::seconds> m_time_offset{0};
+1, was going to say this should be removed in 2ef71c73582be554e565ada3f8a6ca77c2cd79f1, or say we are avoiding false warnings instead of preventing tampering of adjusted time
stickies-v
commented at 10:26 am on April 6, 2024:
Thanks, I’ve updated the doc to to make it harder for others to create false warnings about our clock being out of sync.
in
src/node/timeoffsets.cpp:21
in
e6d7bcf768outdated
I removed it and this seems to run fine, at least in my system
stickies-v
commented at 4:40 pm on April 15, 2024:
We #include <util/time.h> which has using namespace std::chrono_literals; which is why it compiles. I thought we would prefer (IWYU-style) to be explicit about used namespaces, but upon further investigation the purpose of #20602 seems to have been to avoid that, so I’ll remove this if I re-touch.
in
src/node/timeoffsets.h:21
in
e6d7bcf768outdated
16+{
17+private:
18+ static constexpr size_t N{50};
19+ //! Minimum difference between system and network time for a warning to be raised.
20+ static constexpr std::chrono::minutes m_warn_threshold{10};
21+ //! Log the last time a GUI warning was raised to avoid flooding user with them
stickies-v
commented at 12:46 pm on April 10, 2024:
Thanks, no longer relevant since the variable has been removed in latest push
in
src/node/timeoffsets.h:23
in
e6d7bcf768outdated
18+ static constexpr size_t N{50};
19+ //! Minimum difference between system and network time for a warning to be raised.
20+ static constexpr std::chrono::minutes m_warn_threshold{10};
21+ //! Log the last time a GUI warning was raised to avoid flooding user with them
22+ mutable std::optional<std::chrono::steady_clock::time_point> m_warning_emitted GUARDED_BY(m_mutex){};
23+ //! Minimum time to wait before raising a new GUI warning
The m_warn wait (now WARN_WAIT) var was used to check that we’re not creating too many messageboxes (through uiInterface.ThreadSafeMessageBox()). The messagebox call and the relevant variables have now been removed and the GetWarnings() implementation updated so that the GUI now shows warnings this way, so this comment is no longer relevant.
stickies-v
commented at 12:46 pm on April 10, 2024:
Thanks, no longer relevant since the variable has been removed in latest push
in
src/node/timeoffsets.cpp:39
in
e6d7bcf768outdated
34+
35+std::chrono::seconds TimeOffsets::Median() const
36+{
37+ LOCK(m_mutex);
38+
39+ // Only calculate the median if we have 5 or more offsets
e6d7bcf7689e38dcda45a1cdbff98fd610864e47 it may be worth mentioning this in the header file
stickies-v
commented at 11:14 am on April 6, 2024:
Agreed, updated doc in the header.
in
src/node/timeoffsets.cpp:42
in
e6d7bcf768outdated
37+ LOCK(m_mutex);
38+
39+ // Only calculate the median if we have 5 or more offsets
40+ if (m_offsets.size() < 5) return 0s;
41+
42+ auto sorted_copy = m_offsets;
I think I’d prefer to leave this as is, it’s a well scoped mutex, performance impact should be negligible so keeping it concise makes sense I think.
in
src/node/timeoffsets.cpp:57
in
e6d7bcf768outdated
52+ SetMedianTimeOffsetWarning(std::nullopt);
53+ return false;
54+ }
55+
56+ bilingual_str msg{strprintf(_(
57+ "Your computer's date and time appears to be more than %d minutes out of sync with the network, "
Left some comments inline, mostly nits/things that may need renaming.
Also, qt/guiutils.h still refers to CNodeCombinedStats.nTimeOffset after 5084ad48462b3169c99a81aeb21a339c3a9a4728, in a couple of places, but it may be better to change that for time_offset given nTimeOffset is not a thing anymore.
in
src/node/timeoffsets.h:20
in
2ef71c7358outdated
15+class TimeOffsets
16+{
17+private:
18+ static constexpr size_t N{50};
19+ //! Minimum difference between system and network time for a warning to be raised.
20+ static constexpr std::chrono::minutes m_warn_threshold{10};
in
src/node/timeoffsets.h:24
in
2ef71c7358outdated
19+ //! Minimum difference between system and network time for a warning to be raised.
20+ static constexpr std::chrono::minutes m_warn_threshold{10};
21+ //! Log the last time a GUI warning was raised to avoid flooding user with them
22+ mutable std::optional<std::chrono::steady_clock::time_point> m_warning_emitted GUARDED_BY(m_mutex){};
23+ //! Minimum time to wait before raising a new GUI warning
24+ static constexpr std::chrono::minutes m_warning_wait{60};
This implies that int64_t is the same as the internal type of std::chrono::seconds. Better get a random integer in the range of [std::chrono::seconds::min().count(), std::chrono::seconds::max().count()].
stickies-v
commented at 2:06 pm on April 10, 2024:
Updated this line to use std::chrono::seconds::rep :
in
src/node/timeoffsets.cpp:58
in
2ef71c7358outdated
53+ return false;
54+ }
55+
56+ bilingual_str msg{strprintf(_(
57+ "Your computer's date and time appears to be more than %d minutes out of sync with the network, "
58+ "this may lead to consensus failure. After you've corrected your settings, this message "
In this case performance is irrelevant but in general in arguments that are expensive to copy like vector are passed by const reference.
stickies-v
commented at 11:31 am on April 6, 2024:
Oops, fixed.
in
src/test/timeoffsets_tests.cpp:52
in
2ef71c7358outdated
47+ AddMulti(offsets, {25, 15s});
48+ // we now have 25 offsets of 10s followed by 25 offsets of 15s
49+ BOOST_CHECK(offsets.Median() == 15s);
50+}
51+
52+bool IsWarningRaised(std::vector<std::chrono::seconds> check_offsets)
Wrt adjusting the peer’s offset with local clock corrections in order to stop the warnings immediately after the local clock is fixed (instead of after 4-5h) #29623 (review), consider this:
0diff --git i/src/node/timeoffsets.cpp w/src/node/timeoffsets.cpp
1index 8488e47ff5..4dce6f9edd 100644
2--- i/src/node/timeoffsets.cpp
3+++ w/src/node/timeoffsets.cpp
4@@ -24,27 +24,31 @@ void TimeOffsets::Add(std::chrono::seconds offset)
5 {
6 LOCK(m_mutex);
7 8 if (m_offsets.size() >= N) {
9 m_offsets.pop_front();
10 }
11- m_offsets.push_back(offset);
12+ m_offsets.emplace_back(offset, std::chrono::steady_clock::now(), std::chrono::system_clock::now());
13 LogDebug(BCLog::NET, "Added time offset %+ds, total samples %d\n",
14 Ticks<std::chrono::seconds>(offset), m_offsets.size());
15 }
1617 std::chrono::seconds TimeOffsets::Median() const
18 {
19 LOCK(m_mutex);
2021 // Only calculate the median if we have 5 or more offsets
22 if (m_offsets.size() < 5) return 0s;
2324+ const auto steady_now = std::chrono::steady_clock::now();
25+ const auto system_now = std::chrono::system_clock::now();
26 auto sorted_copy = m_offsets;
27- std::sort(sorted_copy.begin(), sorted_copy.end());
28- return sorted_copy[m_offsets.size() / 2]; // approximate median is good enough, keep it simple
29+ std::sort(sorted_copy.begin(), sorted_copy.end(), [&](const peer_time_data& a, const peer_time_data& b) {
30+ return a.Offset(steady_now, system_now) < b.Offset(steady_now, system_now);
31+ });
32+ return sorted_copy[m_offsets.size() / 2].Offset(steady_now, system_now); // approximate median is good enough, keep it simple
33 }
3435 bool TimeOffsets::WarnIfOutOfSync() const
36 {
37 // when median == std::numeric_limits<int64_t>::min(), calling std::chrono::abs is UB
38 auto median{std::max(Median(), std::chrono::seconds(std::numeric_limits<int64_t>::min() + 1))};
39diff --git i/src/node/timeoffsets.h w/src/node/timeoffsets.h
40index 8bccae20e8..1d7a9f0153 100644
41--- i/src/node/timeoffsets.h
42+++ w/src/node/timeoffsets.h
43@@ -21,13 +21,29 @@ private:
44 //! Log the last time a GUI warning was raised to avoid flooding user with them
45 mutable std::optional<std::chrono::steady_clock::time_point> m_warning_emitted GUARDED_BY(m_mutex){};
46 //! Minimum time to wait before raising a new GUI warning
47 static constexpr std::chrono::minutes m_warning_wait{60};
4849 mutable Mutex m_mutex;
50- std::deque<std::chrono::seconds> m_offsets GUARDED_BY(m_mutex){};
51+ struct peer_time_data {
52+ //! The difference between peer's time and our time, at the time of the P2P handshake.
53+ std::chrono::seconds handshake_offset;
54+ // The following two are used to detect system clock corrections/adjustments.
55+ std::chrono::time_point<std::chrono::steady_clock> handshake_steady;
56+ std::chrono::time_point<std::chrono::system_clock> handshake_system;
57+
58+ /// Calculate the offset as of now, taking into account any system clock corrections that
59+ /// have happened since the peer connected.
60+ std::chrono::seconds Offset(std::chrono::time_point<std::chrono::steady_clock> steady_now,
61+ std::chrono::time_point<std::chrono::system_clock> system_now) const {
62+ return handshake_offset +
63+ std::chrono::duration_cast<std::chrono::seconds>(steady_now - handshake_steady) -
64+ std::chrono::duration_cast<std::chrono::seconds>(system_now - handshake_system);
65+ }
66+ };
67+ std::deque<peer_time_data> m_offsets GUARDED_BY(m_mutex){};
6869 public:
70 /** Add a new time offset sample. */
71 void Add(std::chrono::seconds offset) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
7273 /** Compute and return the median of the collected time offset samples. */
in
src/node/timeoffsets.h:17
in
e6d7bcf768outdated
stickies-v
commented at 2:08 pm on April 10, 2024:
Ah right, removed.
in
src/net_processing.cpp:756
in
e6d7bcf768outdated
752@@ -753,6 +753,8 @@ class PeerManagerImpl final : public PeerManager
753 /** Next time to check for stale tip */
754 std::chrono::seconds m_stale_tip_check_time GUARDED_BY(cs_main){0s};
755756+ TimeOffsets m_outbound_time_offsets;
This isn’t changed behavior, but with the threshold reduced to 10 minutes it may be a bit more relevant now:
As far as I understand it, for GUI users ThreadSafeMessageBox() blocks the calling thread (here msghand) until the user has clicked away the warning:
That means that for as long as the user didn’t click anything (e.g. because they are afk), most activity stops: No more received messages are processed, the tip is no longer updated etc. The current peers will disconnect us soon (because we don’t answer their pings anymore), and we will constantly open new connections (that is done separately in the opencon thrad) that will then timeout because we never process their version message - leading to a lot of churn. This behavior doesn’t seem ideal to me, I wonder if it wouldn’t be better to display a warning in the GUI that is non blocking.
stickies-v
commented at 2:24 pm on April 10, 2024:
Great catch, thanks. I hadn’t considered the blocking nature being a problem, and you’re right that this PR makes the problem more apparent since the waiting threshold is reduced significantly.
I’ve removed the messagebox altogether, and updated the approach to properly use the GetWarnings() method to show the warning in the GUI. I think this simplifies the code, prevents blocking, and makes for nicer UI.
janus referenced this in commit
561b4b5d2d
on Apr 6, 2024
[net processing] Move nTimeOffset to net_processing038fd979ef
[net processing] Use std::chrono for type-safe time offsets55361a15d1
stickies-v force-pushed
on Apr 10, 2024
Add TimeOffsets helper class
This helper class is an alternative to CMedianFilter, but without a
lot of the special logic and exceptions that we needed while it was
still used for consensus.
ee178dfcc1
[net processing] Introduce PeerManagerInfo
For querying statistics/info from PeerManager. The median outbound time
offset is the only initial field.
7d9c3ec622
[net processing] Move IgnoresIncomingTxs to PeerManagerInfo92e72b5d0d
Remove timedata
With the introduction and usage of TimeOffsets, there is no more need
for this file. Remove it.
c6be144c4b
stickies-v force-pushed
on Apr 10, 2024
stickies-v
commented at 3:11 pm on April 10, 2024:
contributor
in GUI: display the warning along with other warnings, instead of in a messagebox. This prevents the warning from #29623 (review) (and is a nicer approach in general)
various nits around improved documentation, variable naming, references, help messages, …
I probably won’t be addressing nits in this PR anymore to minimize review burden, so if there are any left (please keep posting them if you see any) I’ll keep them for a follow-up.
Also, qt/guiutils.h still refers to CNodeCombinedStats.nTimeOffset
References updated, thanks
Wrt adjusting the peer’s offset with local clock corrections in order to stop the warnings immediately after the local clock is fixed (instead of after 4-5h) #29623 (review), consider this:
I like the idea of using steady_clock here, but I would prefer to do it in a follow-up if there’s appetite for it. It’s nice to have the warning disappear more quickly when the issue is resolved, but I think a downside is that the timeoffset peer stats in GUI and RPC will be inconsistent with the ones we use for the warning. I’d started implementing a PeerClock approach based on your suggestion, and I think it’s quite elegant, but it’s a lot more code overhaul so not really ideal for this PR. WIP commit at https://github.com/bitcoin/bitcoin/compare/master...stickies-v:bitcoin:2024-03/remove-timedata-keep-clock
sr-gi
commented at 3:54 pm on April 15, 2024:
member
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-05-14 21:12 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me