Simplify network-adjusted time warning logic

stickies-v commented at 6:18 pm on March 11, 2024: contributor

An earlier approach in #28956 involved simplifying and refactoring the network-adjusted time calculation logic, but this was eventually left out of the PR to make it easier for reviewers to focus on consensus logic changes.

Since network-adjusted time is now only used for warning/informational purposes, cleaning up the logic (building on @dergoegge’s approach in #28956) should be quite straightforward and uncontroversial. The main changes are:

Previously, we would only calculate the time offset from the first 199 outbound peers that we connected to. This limitation is now removed, and we have a proper rolling calculation. I’ve reduced the set to 50 outbound peers, which seems plenty.
Previously, we would automatically use the network-adjusted time if the difference was < 70 mins, and warn the user if the difference was larger than that. Since there is no longer any automated time adjustment, I’ve changed the warning threshold to 20 10 minutes (which is an arbitrary number).
Previously, a warning would only be raised once, and then never again until node restart. This behaviour is now updated to 1) warn to log for every new outbound peer for as long as we appear out of sync, 2) have the RPC warning toggled on/off whenever we go in/out of sync, and 3) have the GUI warn whenever we are out of sync (again), but limited to 1 messagebox per 60 minutes
no more globals
remove the -maxtimeadjustment startup arg

Closes #4521

DrahtBot commented at 6:18 pm on March 11, 2024: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage

For detailed information about the code coverage, see the test coverage report.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	sr-gi, achow101, dergoegge
Concept ACK	fanquake, theStack, glozow
Approach ACK	vasild
Stale ACK	alfonsoromanz, sipa

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#29845 (rpc: return warnings as an array instead of just a single one by stickies-v)
#29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)
#29494 (build: Assume HAVE_CONFIG_H, Add IWYU pragma keep to bitcoin-config.h includes by maflcko)
#29346 (Stratum v2 Noise Protocol by Sjors)
#29015 (kernel: Streamline util library by ryanofsky)
#27052 (test: rpc: add last block announcement time to getpeerinfo result by LarryRuane)
#26966 (index: blockfilter initial sync speedup, parallelize process by furszy)
#15218 (validation: sync chainstate to disk after syncing to tip by andrewtoth)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

stickies-v force-pushed on Mar 11, 2024

DrahtBot added the label CI failed on Mar 11, 2024

DrahtBot commented at 6:26 pm on March 11, 2024: contributor

🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

Leave a comment here, if you need help tracking down a confusing failure.

Debug: https://github.com/bitcoin/bitcoin/runs/22527379458

dergoegge commented at 10:10 am on March 12, 2024: member

Concept ACK

Thanks for picking this up!

fanquake commented at 10:46 am on March 12, 2024: member

Concept ACK

stickies-v force-pushed on Mar 12, 2024

in src/test/fuzz/timedata.cpp:1 in 55b476b015 outdated

-1@@ -1,31 +0,0 @@
0-// Copyright (c) 2020-2021 The Bitcoin Core developers

dergoegge commented at 1:07 pm on March 12, 2024:

Could you add a fuzz test for TimeOffsets to replace the timedata one?

stickies-v commented at 2:09 pm on March 12, 2024:

Can do, but given how simple the interface is I think it wouldn’t be anything more than the below (untested)? Could e.g. add the .Size() method back but adding it just for fuzzing might not be ideal?

 0diff --git a/src/Makefile.test.include b/src/Makefile.test.include
 1index c77056f29c..c20b531032 100644
 2--- a/src/Makefile.test.include
 3+++ b/src/Makefile.test.include
 4@@ -381,6 +381,7 @@ test_fuzz_fuzz_SOURCES = \
 5  test/fuzz/string.cpp \
 6  test/fuzz/strprintf.cpp \
 7  test/fuzz/system.cpp \
 8+ test/fuzz/timeoffsets.cpp \
 9  test/fuzz/torcontrol.cpp \
10  test/fuzz/transaction.cpp \
11  test/fuzz/tx_in.cpp \
12diff --git a/src/test/fuzz/timeoffsets.cpp b/src/test/fuzz/timeoffsets.cpp
13new file mode 100644
14index 0000000000..f0d2a8f739
15--- /dev/null
16+++ b/src/test/fuzz/timeoffsets.cpp
17@@ -0,0 +1,22 @@
18+// Copyright (c) 2020-present The Bitcoin Core developers
19+// Distributed under the MIT software license, see the accompanying
20+// file COPYING or http://www.opensource.org/licenses/mit-license.php.
21+
22+#include <test/fuzz/FuzzedDataProvider.h>
23+#include <test/fuzz/fuzz.h>
24+#include <test/fuzz/util.h>
25+#include <node/timeoffsets.h>
26+
27+#include <chrono>
28+
29+
30+FUZZ_TARGET(timeoffsets)
31+{
32+    FuzzedDataProvider fuzzed_data_provider(buffer.data(), buffer.size());
33+    TimeOffsets offsets{};
34+    while (fuzzed_data_provider.remaining_bytes() > 0) {
35+        (void)offsets.Median();
36+        offsets.Add(std::chrono::seconds{fuzzed_data_provider.ConsumeIntegral<int64_t>()});
37+        offsets.WarnIfOutOfSync();
38+    }
39+}

dergoegge commented at 2:24 pm on March 12, 2024:

but given how simple the interface is I think it wouldn’t be anything more than the below

That’s good enough. We just want the code to be under test, in case we have change it in the future.

stickies-v commented at 4:39 pm on March 12, 2024:

Done. I was a bit worried WarnIfOutOfSync() outputting to stderr would mess with the fuzzer (as here), but running it locally seems to work fine so far, not sure if the appropriate paths have been hit already though.

Resolving for now.

stickies-v commented at 2:09 pm on March 12, 2024: contributor

Force pushed to:

address a thread safety analysis issue by annotating WarnIfOutOfSync() with a EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
fix failing functional test by resetting node_miner’s mocktime before connecting to the next outbound
rebase to latest master to fix native macos job

alfonsoromanz commented at 2:46 pm on March 12, 2024: contributor

Code review ACK 55b476b015d3a79ec5ea176247d2a70163cd034b

I also:

Ran the unit test, which is passing
Ran the functional test, which is passing
Verified that the peer window now shows the offset of each peer.

DrahtBot requested review from dergoegge on Mar 12, 2024

DrahtBot requested review from fanquake on Mar 12, 2024

stickies-v force-pushed on Mar 12, 2024

stickies-v commented at 4:43 pm on March 12, 2024: contributor

Force pushed to:

remove dependency on C++20’s std::ranges::sort which is not yet supported by the no wallet, libbitcoinkernel ci task
add timeoffsets fuzz target as per review comment
improve includes and copyright headers

Verified that the peer window now shows the offset of each peer.

FYI this is not new functionality, you should see the same behaviour on master.

stickies-v force-pushed on Mar 13, 2024

vasild commented at 8:02 am on March 13, 2024: contributor

Concept ACK, I love simplifications or refactors that are net removing code (+213 −363).

DrahtBot removed the label CI failed on Mar 13, 2024

ariard commented at 5:14 pm on March 16, 2024: none

Previously, we would only calculate the time offset from the first 199 outbound peers that we connected to. This limitation is now removed, and we have a proper rolling calculation. I’ve reduced the set to 50 outbound peers, which seems plenty. Previously, we would automatically use the network-adjusted time if the difference was < 70 mins, and warn the user if the difference was larger than that. Since there is no longer any automated time adjustment, I’ve changed the warning threshold to 20 minutes (which is an arbitrary number).

on the 2 value selections (50 outbound peers):

50 sounds reasonable, to be fair best would be to add timestamp in m_outbound_time_offsets from FEELER connection, it’s an outbound connection type per se. shuffling things more at the AS-level.
20 min, i can suggest to pick up 10 min to match average interval-block-time of ~10 min

sipa commented at 4:52 pm on March 17, 2024: member

utACK 5c8e71cb2b827586efd052a87ccc463c99387fcf

DrahtBot requested review from vasild on Mar 17, 2024

in src/rpc/net.cpp:241 in 5c8e71cb2b outdated

237@@ -239,7 +238,7 @@ static RPCHelpMan getpeerinfo()
238         obj.pushKV("bytessent", stats.nSendBytes);
239         obj.pushKV("bytesrecv", stats.nRecvBytes);
240         obj.pushKV("conntime", count_seconds(stats.m_connected));
241-        obj.pushKV("timeoffset", stats.nTimeOffset);
242+        obj.pushKV("timeoffset", statestats.time_offset.count());

maflcko commented at 8:03 am on March 18, 2024:

count isn’t type-safe. If the type were to change, the RPC result would silently change as well. It may be better to use a casting tick counter Ticks<>.

maflcko commented at 8:04 am on March 18, 2024:

(See also the line below)

stickies-v commented at 11:43 am on March 18, 2024:

Good point, I’ve replaced all new usage of count() with Ticks<std::chrono::seconds>, thanks.

in src/rpc/net.cpp:683 in 5c8e71cb2b outdated

677@@ -679,9 +678,10 @@ static RPCHelpMan getnetworkinfo()
678         obj.pushKV("localservicesnames", GetServicesNames(services));
679     }
680     if (node.peerman) {
681-        obj.pushKV("localrelay", !node.peerman->IgnoresIncomingTxs());
682+        auto peerman_info{node.peerman->GetInfo()};
683+        obj.pushKV("localrelay", !peerman_info.ignores_incoming_txs);
684+        obj.pushKV("timeoffset", peerman_info.median_outbound_time_offset.count());

maflcko commented at 8:03 am on March 18, 2024:

Same

stickies-v commented at 12:11 pm on March 18, 2024:

Done

maflcko commented at 8:23 am on March 18, 2024: member

left a nit, might review later

in src/node/timeoffsets.cpp:48 in 5c8e71cb2b outdated

43+    auto median{std::max(Median(), std::chrono::seconds(std::numeric_limits<int64_t>::min() + 1))};
44+    if (std::chrono::abs(median) <= m_warn_threshold) return false;
45+
46+    bilingual_str msg{_("Your computer's date and time appear out of sync with the network, "
47+                        "this may lead to consensus failure. Please ensure it is correct.")};
48+    LogWarning("%s\n", msg.translated);

maflcko commented at 8:34 am on March 18, 2024:

The debug log should not log translated messages, but the original ones

stickies-v commented at 11:56 am on March 18, 2024:

Thanks, changed

in src/node/timeoffsets.cpp:33 in 5c8e71cb2b outdated

28+std::chrono::seconds TimeOffsets::Median() const
29+{
30+    LOCK(m_mutex);
31+
32+    // Only calculate the median if we have 5 or more offsets
33+    if (m_offsets.size() < 5) return {0s};

maflcko commented at 8:35 am on March 18, 2024:

nit:

0    if (m_offsets.size() < 5) return 0s;

stickies-v commented at 11:57 am on March 18, 2024:

Thanks, changed

in src/test/fuzz/timeoffsets.cpp:13 in 5c8e71cb2b outdated

 8+#include <test/fuzz/util.h>
 9+
10+#include <chrono>
11+
12+
13+FUZZ_TARGET(timeoffsets)

maflcko commented at 8:39 am on March 18, 2024:

I think it is fine to keep calling the target timedata, to avoid deleting a fuzz input folder and adding a fuzz input folder, with similar data.

stickies-v commented at 12:10 pm on March 18, 2024:

Done.

stickies-v commented at 2:01 pm on March 18, 2024:

Reverted back to timeoffsets

in src/test/fuzz/timeoffsets.cpp:17 in 5c8e71cb2b outdated

12+
13+FUZZ_TARGET(timeoffsets)
14+{
15+    FuzzedDataProvider fuzzed_data_provider(buffer.data(), buffer.size());
16+    TimeOffsets offsets{};
17+    while (fuzzed_data_provider.remaining_bytes() > 0) {

maflcko commented at 8:40 am on March 18, 2024:

in fuzz tests, you’ll have to use LIMITED_WHILE. Otherwise, the fuzz target will never terminate, if the fuzz input is unlimited in length.

stickies-v commented at 11:54 am on March 18, 2024:

Thanks, now using LIMITED_WHILE(..., 4'000) to keep the same number of iterations as the max in the previous fuzz test.

stickies-v force-pushed on Mar 18, 2024

stickies-v commented at 12:24 pm on March 18, 2024: contributor

Thank you for the reviews everyone, I’ve force-pushed to address all comments (and no other changes).

50 sounds reasonable, to be fair best would be to add timestamp in m_outbound_time_offsets from FEELER connection, it’s an outbound connection type per se. shuffling things more at the AS-level.

That’s already happening, we’re considering all types of outbound connections (including feelers)

20 min, i can suggest to pick up 10 min to match average interval-block-time of ~10 min

What’s your rationale for matching it with the average block-time interval? I don’t see how they’re related. I’m not opposed to a shorter threshold (like 10 minutes), just want to avoid warnings occurring too frequently so users start to get used to and ignore them. ~~Want to avoid bikeshedding over this, so going to keep as is for now until more reviewers voice their preference.~~ edit: updated threshold to 10 minutes

stickies-v force-pushed on Mar 18, 2024

stickies-v commented at 2:00 pm on March 18, 2024: contributor

Force-pushed to revert the fuzz harness name change, because for a couple of commits, we now have two harnesses with the same name - until the old one is removed in Remove timedata commit. I’ve opened a PR in qa-assets to rename the input folder (thx dergoegge for the suggestion).

dergoegge approved

dergoegge commented at 4:26 pm on March 18, 2024: member

Code review ACK a42d9b5041cdc522e8a4b81944e45b7f30458c26

DrahtBot requested review from sipa on Mar 18, 2024

sipa commented at 4:40 pm on March 18, 2024: member

utACK a42d9b5041cdc522e8a4b81944e45b7f30458c26

stickies-v referenced this in commit 84c2d9607b on Mar 18, 2024

in src/node/timeoffsets.cpp:47 in a42d9b5041 outdated

42+    // when median == std::numeric_limits<int64_t>::min(), calling std::chrono::abs is UB
43+    auto median{std::max(Median(), std::chrono::seconds(std::numeric_limits<int64_t>::min() + 1))};
44+    if (std::chrono::abs(median) <= m_warn_threshold) return false;
45+
46+    bilingual_str msg{_("Your computer's date and time appear out of sync with the network, "
47+                        "this may lead to consensus failure. Please ensure it is correct.")};

vasild commented at 11:14 am on March 19, 2024:

It would be more useful to print the offset and the threshold.

stickies-v commented at 11:11 pm on March 19, 2024:

I don’t immediately see what practical benefit it would offer to the user (but I’m happy to add it if I need to re-touch (or in follow-up)

vasild commented at 9:21 am on March 21, 2024:

“Your computer’s clock is 765 seconds behind the network. The difference is more than 600 seconds and this may lead to a consensus failure. Please ensure your computer’s clock is correct.”

vasild commented at 10:32 am on March 22, 2024:

Consider this ^ while retouching. I think it is more useful because it tells the user how wrong is the clock and what are the expectations of Bitcoin Core. Otherwise (to the user) it is not clear whether Bitcoin Core is too picky and whether this is about +/- 1 second or something like that. Referring the user to the getnetworkinfo RPC may also be useful.

maflcko commented at 11:20 am on March 22, 2024:

I think it is more useful because it tells the user how wrong is the clock

Not sure. Just because it is the median does not mean that it is an accurate representation of the current time. For example, the time offsets could all be widely different numbers larger than the warning threshold. Or, it could be an outdated number, because the median is calculated based on samples from random points in the past.

stickies-v commented at 6:05 pm on March 22, 2024:

I agree that the specific numbers don’t matter too much here. The main goal is to have the user double check, and correct their settings if necessary, which should be obvious very quickly. I’ve updated the help message to reinforce that, and also to give context on when the message will disappear:

0    bilingual_str msg{strprintf(_(
1        "Your computer's date and time appears to be more than %d minutes out of sync with the network, "
2        "this may lead to consensus failure. After you've corrected your settings, this message "
3        "should no longer appear when you restart your node. Without a restart, it should stop showing "
4        "automatically after you've connected to a sufficient number of new outbound peers, which may "
5        "take some time."
6    ), Ticks<std::chrono::minutes>(m_warn_threshold))};

It’s quite a long message, but it should only very rarely pop up, and when it does, we really want the user to address it ASAP, so I suppose that’s fair.

I hope this works for both of you?

in src/node/timeoffsets.h:21 in a42d9b5041 outdated

16+{
17+private:
18+    static constexpr size_t N{50};
19+    //! Minimum difference between system and network time for a warning to be raised.
20+    static constexpr std::chrono::minutes m_warn_threshold{10};
21+    mutable std::atomic<bool> m_warning_emitted{false};  // Avoid flooding the user with warnings.

vasild commented at 11:19 am on March 19, 2024:

This way the warning will be printed just once in the lifetime of the process. If the user does something in an attempt to resolve the issue, how would they know that the issue is resolved? Usually if a message is logged every, e.g. 10 minutes (to avoid flooding the log), the disappearance of the message is an indication that the problem is resolved.

Is there a way for the user to see the median network time, as perceived by his node? If yes, then they can manually compare.

Also, the time offset issue is one that can come and go as time passes. Imagine a bitcoind running for 1 month. It had issues with the time soon after it was started which the operator has resolved. 1 month later the issues resurface - there is no warning or indication about it?

Maybe replace that bool with a timestamp of the last log and print the error message if the last log was e.g. 10 minutes ago?

stickies-v commented at 11:08 pm on March 19, 2024:

For GUI and RPC warnings, this PR pretty much preserves the current behaviour: GUI messagebox gets shown only once (until restarted), and the RPC will keep firing forever (on master, it will be reset when another SetMiscWarning() is called, but in practice I don’t think that really happens frequently).

For the debug log, however, this PR quite significantly improves things by

doing an actual rolling calculation, which means that if clocks start drifting, the user will eventually still get a warning, whereas on master the time adjustment system completely stops after the first 199 outbound handshakes
producing repeated log warnings (one for every outbound handshake) for as long and only for as long as the median time offset exceeds the threshold.

While I agree that for GUI & RPC things should be improved, this PR is no regression, and I’d rather it be done in a separate PR. Avoiding flooding seems less trivial to me than adding a simple timer: having a GUI message popup every 10 minutes is quite annoying if someone decides to flood the network with nodes with wrong clocks.

Is there a way for the user to see the median network time, as perceived by his node?

yes, through getnetworkinfo a user can see the median offset

1 month later the issues resurface - there is no warning or indication about it?

there will be warning logs every time we handshake with an outbound peer and the median timeoffset exceeds the threshold

stickies-v commented at 4:45 pm on March 20, 2024:

See https://github.com/stickies-v/bitcoin/commit/0146553aaab53b6092bb54e0b2c71282fef44787 for a potential implementation on turning {GUI, RPC} warnings off again when the offsets fall within the threshold again. Happy to open a follow-up PR with this if you like the approach, I think this PR is too close to merge to overhaul it again as per my previous comment, what do you think?

vasild commented at 9:30 am on March 21, 2024:

what do you think?

I agree.

I would be happy to review that commit if you PR it. Early suggestion:

0-     m_warning_emitted.value() - now > m_warning_wait
1+     now - m_warning_emitted.value() > m_warning_wait

stickies-v commented at 9:05 pm on March 22, 2024:

The latest force-push now also includes proper GUI and RPC warning behaviour. RPC warnings are now only shown when the median exceeds the threshold. GUI warnings are shown whenever the node falls out of sync, but max once per 60 minutes. I think this should fully address your comment? Thanks!

stickies-v referenced this in commit 373778c1fd on Mar 19, 2024

in src/warnings.cpp:68 in a42d9b5041 outdated

61@@ -56,6 +62,11 @@ bilingual_str GetWarnings(bool verbose)
62         warnings_verbose.emplace_back(warnings_concise);
63     }
64 
65+    if (g_timeoffset_warning) {
66+        warnings_concise = _("Your computer's date and time appear out of sync with the network, this may lead to consensus failure. Please ensure it is correct.");
67+        warnings_verbose.emplace_back(warnings_concise);
68+    }

vasild commented at 5:27 pm on March 19, 2024:

Once g_timeoffset_warning is set to true it will never be set back to false and this will warning will persist even after the problem is resolved?

stickies-v commented at 11:09 pm on March 19, 2024:

Correct, as is roughly the current behaviour, as per #29623 (review)

in test/functional/feature_maxtipage.py:48 in a42d9b5041 outdated

42@@ -43,6 +43,10 @@ def test_maxtipage(self, maxtipage, set_parameter=True, test_deltas=True):
43         self.generate(node_miner, 1)
44         assert_equal(node_ibd.getblockchaininfo()['initialblockdownload'], False)
45 
46+        # reset time to system time so we don't have a time offset with the ibd node the next
47+        # time we connect to it, ensuring TimeOffsets::WarnIfOutOfSync() doesn't output to stderr
48+        node_miner.setmocktime(0)

vasild commented at 5:37 pm on March 19, 2024:

Could squash commits: 587349072f [net processing] Use TimeOffsets to warn when node is out of sync d790e776ca [test] reset miner mocktime in maxtipage to prevent time offset warning

stickies-v commented at 6:27 pm on March 22, 2024:

Squashed

DrahtBot requested review from vasild on Mar 19, 2024

in src/timedata.cpp:70 in a42d9b5041 outdated

64-    // But in this case the 'bug' is protective against some attacks, and may
65-    // actually explain why we've never seen attacks which manipulate the
66-    // clock offset.
67-    //
68-    // So we should hold off on fixing this and clean it up as part of
69-    // a timing cleanup that strengthens it in a number of other ways.

vasild commented at 5:47 pm on March 19, 2024:

#4526 (comment) I wonder what those “attacks” are? @gmaxwell

stickies-v commented at 11:10 pm on March 19, 2024:

I don’t know myself, but I’d argue it’s probably not relevant anymore as consensus dependencies on network-adjusted time have been removed

vasild commented at 4:38 pm on March 20, 2024:

Right, it dawned on me that previously we would assume the network’s time, ie “adjust” our time according to the network and this has been removed, so “the network” can no longer mess with our clock. Sorry for the noise.

DrahtBot requested review from vasild on Mar 19, 2024

vasild approved

vasild commented at 9:31 am on March 21, 2024: contributor

ACK a42d9b5041cdc522e8a4b81944e45b7f30458c26

DrahtBot added the label CI failed on Mar 21, 2024

stickies-v force-pushed on Mar 21, 2024

stickies-v commented at 7:19 pm on March 21, 2024: contributor

While working on the fuzz inputs, I noticed that the memory usage for this new timeoffsets harness is unusually high. Doing one more small force-push to patch the fuzz and unit test to:

disable logging in the fuzz test, which was causing the memory leak (thanks @dergoegge)
use noui.h to prevent the GUI from flooding the fuzz and unit test with (expected) warnings

It’s a pretty small test-only diff, so I’m hoping reviewers can give this a quick re-review? Thank you!

git range-diff a42d9b5041cdc522e8a4b81944e45b7f30458c26...3dfdb22c1da687bd86e3325ec02c7b38e15895b4

maflcko commented at 7:22 pm on March 21, 2024: member

Maybe there should be an assertion failure in the fuzz tests, if log messages have been written to the buffer, but logging is still disabled when the program exits?

stickies-v marked this as a draft on Mar 21, 2024

stickies-v commented at 8:05 pm on March 21, 2024: contributor

Changed to draft until I’ve resolved the merge conflict with #29279 which relies on setmocktime

theStack commented at 11:43 pm on March 21, 2024: contributor

Concept ACK

in src/node/timeoffsets.h:13 in 9f2904b621 outdated

 8+#include <sync.h>
 9+
10+#include <atomic>
11+#include <chrono>
12+#include <deque>
13+#include <stddef.h>

maflcko commented at 9:59 am on March 22, 2024:

nit in 9f2904b621cc945005570fc6e94832d73d36d92b: <cstddef>

stickies-v commented at 3:56 pm on March 22, 2024:

fixed

in src/node/timeoffsets.cpp:65 in 9f2904b621 outdated

43+    auto median{std::max(Median(), std::chrono::seconds(std::numeric_limits<int64_t>::min() + 1))};
44+    if (std::chrono::abs(median) <= m_warn_threshold) return false;
45+
46+    bilingual_str msg{_("Your computer's date and time appear out of sync with the network, "
47+                        "this may lead to consensus failure. Please ensure it is correct.")};
48+    LogWarning("%s\n", msg.original);

maflcko commented at 10:40 am on March 22, 2024:

nit in 9f2904b621cc945005570fc6e94832d73d36d92b: This may keep printing the warning, even if the user adjusted the clock, because the offsets are not discarded when the clock changes.

stickies-v commented at 9:09 pm on March 22, 2024:

The log message has been updated to highlight that the message may persist for a while, thanks.

vasild commented at 6:55 pm on March 29, 2024:

Grepping in my node’s log for how frequent outbound connections are made - it would take about 1.5 days to make 200 new outbound connections. Since this is about the median, I guess 101 new connections after fixing the clock will be enough for us to stop the warning. That is like 20ish hours. Quite long. Can we do better?

Currently we only save the difference between our time and the peer’s time at the start of the connection and we have no clue that this difference has changed due to our clock being adjusted/corrected. If we save the value of the system_clock and steady_clock at the start of the connection, in addition to the peer’s offset, at a later time we can deduce that our clock has been adjusted and add that diff/adjustment to the peer’s offset to immediately figure out that “now our clock is in sync with that peer’s clock”.

So, instead of peer_offset, it would be a matter of using the following instead: peer_offset + steady_clock_diff - system_clock_diff, assuming: steady_clock_diff = steady_clock_now - steady_clock_when_connected system_clock_diff = system_clock_now - system_clock_when_connected.

Example:

Lets say our clock is wrong, 15 minutes ahead. When we connect to a peer at 17:00:00 our time they send us a time that is 15 minutes ago and we save: peer_offset = -15min, system_clock_when_connected = 17:00:00, steady_clock_when_connected = 1000 (some obscure value [seconds]). One hour later, our clock has been fixed/corrected, by moving it 15 min backwards (or stopping to move for 15 min). So, system_clock_now = 17:45:00, steady_clock_now = 4600 (the obscure value 1000 plus 3600 because 1 hour has passed). And then: steady_clock_diff = 4600 - 1000 = 60 min system_clock_diff = 17:45:00 - 17:00:00 = 45 min The difference between those is 15 minutes and adding that to peer_offset gives us offset 0, meaning that now our clock is in sync with the peer’s clock.

stickies-v commented at 10:10 pm on March 30, 2024:

Using both system and steady clock is an interesting idea and one I hadn’t considered, thanks. I’m going to look into this a bit more and will follow up shortly.

it would take about 1.5 days to make 200 new outbound connections

FYI this PR reduces 200 -> 50 outbounds so your calculations are a bit off.

in src/node/timeoffsets.cpp:52 in 9f2904b621 outdated

47+                        "this may lead to consensus failure. Please ensure it is correct.")};
48+    LogWarning("%s\n", msg.original);
49+
50+    if (!m_warning_emitted) {
51+        m_warning_emitted = true;
52+        SetMedianTimeOffsetWarning();

maflcko commented at 10:43 am on March 22, 2024:

nit in https://github.com/bitcoin/bitcoin/commit/9f2904b621cc945005570fc6e94832d73d36d92b: Any reason to duplicate the message in two translation units? Why not pass it from here into this function?

stickies-v commented at 5:15 pm on March 22, 2024:

Thanks, updated.

in src/node/timeoffsets.cpp:23 in 9f2904b621 outdated

13+#include <chrono>
14+#include <deque>
15+
16+using namespace std::chrono_literals;
17+
18+void TimeOffsets::Add(std::chrono::seconds offset)

maflcko commented at 10:56 am on March 22, 2024:

nit in https://github.com/bitcoin/bitcoin/commit/9f2904b621cc945005570fc6e94832d73d36d92b: You are also removing the debug log that logged the time offset. Is it intentional?

stickies-v commented at 4:22 pm on March 22, 2024:

I didn’t think about it much, have re-added a debug log line.

in src/net_processing.cpp:3628 in e5c3b28079 outdated

3623@@ -3625,8 +3624,8 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3624         if (!pfrom.IsInboundConn()) {
3625             // Don't use timedata samples from inbound peers to make it
3626             // harder for others to tamper with our adjusted time.
3627-            AddTimeData(pfrom.addr, Ticks<std::chrono::seconds>(peer->m_time_offset.load()));

maflcko commented at 11:03 am on March 22, 2024:

nit in e5c3b2807921caa6f2717fa84915cb02b1f4d9fb: Not that it matters, but the commit itself is broken, because GetTimeOffset in the rpc will now always be zero in this commit. The next commit fixes it.

stickies-v commented at 6:27 pm on March 22, 2024:

Ah, right. I’ve squashed the commit so it’s no longer relevant.

maflcko commented at 11:10 am on March 22, 2024: member

lgtm 3dfdb22c1da687bd86e3325ec02c7b38e15895b4 🌽

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: lgtm 3dfdb22c1da687bd86e3325ec02c7b38e15895b4 🌽
33k5YN4x2rtvpoxcx88mjfPhCYoWnOCqHeZLoKsYaWrghV1HDaqNUzr4RiQ0kTXkWkdr/HPRYmcI77mwR2WiBBg==

in src/test/fuzz/timeoffsets.cpp:18 in 9f2904b621 outdated

12+#include <chrono>
13+
14+void initialize_timeoffsets()
15+{
16+    static const auto testing_setup = MakeNoLogFileContext<>(ChainType::MAIN);
17+    noui_test_redirect();  // prevent the TimeOffsets::WarnIfOutOfSync() GUI MessageBoxes from flooding the log

maflcko commented at 11:12 am on March 22, 2024:

I think it would be fine to globally set this in MakeNoLogFileContext, or all tests, or should there ever be a case where one would want stuff printed to stderr (from ThreadSafeMessageBox) in tests instead of to the test log?

stickies-v commented at 7:30 pm on March 22, 2024:

Seems sensible, but given that MakeNoLogFileContext is used in quite a few other places I’d rather keep that for a separate pull

in src/net_processing.cpp:3613 in 0b372c0ab3 outdated

3609@@ -3610,11 +3610,11 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3610                   peer->m_starting_height, addrMe.ToStringAddrPort(), fRelay, pfrom.GetId(),
3611                   remoteAddr, (mapped_as ? strprintf(", mapped_as=%d", mapped_as) : ""));
3612 
3613-        peer->m_time_offset = nTime - GetTime();
3614+        peer->m_time_offset = std::chrono::seconds{nTime - GetTime()};

maflcko commented at 11:26 am on March 22, 2024:

nit in 0b372c0ab3c800e70bd81fc647742f1d50953188: Could use the type-safe NodeClock/NodeSeconds, which is identical to GetTime?

0peer->m_time_offset = NodeSeconds{std::chrono::seconds{nTime}} - Now<NodeSeconds>();

stickies-v commented at 5:23 pm on March 22, 2024:

Neat, updated thanks

stickies-v referenced this in commit 032a597482 on Mar 22, 2024

fanquake referenced this in commit a175efe768 on Mar 22, 2024

stickies-v force-pushed on Mar 22, 2024

stickies-v commented at 10:39 pm on March 22, 2024: contributor

Thanks for the in-depth review everyone.

Force-pushed to rebase to make CI green again (#29704) and address all outstanding review comments, given that this has now become a larger push anyway. The main changes are:

RPC warnings are properly reset when the issue is resolved, GUI warnings can now be shown more than once (e.g. when the node falls out of sync again), but only once per 60 minutes to prevent flooding.
the warning message has been improved to make it more clear what the user has to/can do
re-added debug log whenever an offset is added (i.e. an outbound peer connects)

stickies-v marked this as ready for review on Mar 22, 2024

DrahtBot removed the label CI failed on Mar 22, 2024

in src/net_processing.cpp:394 in 5084ad4846 outdated

389@@ -390,6 +390,10 @@ struct Peer {
390     /** Whether this peer wants invs or headers (when possible) for block announcements */
391     bool m_prefers_headers GUARDED_BY(NetEventsInterface::g_msgproc_mutex){false};
392 
393+    /** Time offset computed during the version handshake based on the
394+     * timestamp the peer send in the version message. */

sr-gi commented at 1:47 pm on April 2, 2024:

5084ad48462b3169c99a81aeb21a339c3a9a4728 nit: sends/sent

in src/net_processing.cpp:395 in 16dededcd8 outdated

391@@ -392,7 +392,7 @@ struct Peer {
392 
393     /** Time offset computed during the version handshake based on the
394      * timestamp the peer send in the version message. */
395-    std::atomic<int64_t> m_time_offset{0};
396+    std::atomic<std::chrono::seconds> m_time_offset{0s};

sr-gi commented at 2:10 pm on April 2, 2024:

In 16dededcd8d52b6d0fdbe9e0ffab6a177aa8aa17, why is this defined as 0s, but time_offset is defined as 0 in net_processing.cpp?

stickies-v commented at 11:10 am on April 6, 2024:

It’s wrapped in a std::atomic here which requires the chrono literal, doesn’t compile otherwise:

0net_processing.cpp:395:52: error: no matching constructor for initialization of 'std::atomic<std::chrono::seconds>' (aka 'atomic<duration<long long>>')
1    std::atomic<std::chrono::seconds> m_time_offset{0};

sr-gi commented at 1:51 pm on April 11, 2024:

Oh, my point was that it could have also been defined as 0s there too, but I won’t insist

in src/net_processing.cpp:3677 in 5084ad4846 outdated

3670@@ -3666,12 +3671,11 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3671                   peer->m_starting_height, addrMe.ToStringAddrPort(), fRelay, pfrom.GetId(),
3672                   remoteAddr, (mapped_as ? strprintf(", mapped_as=%d", mapped_as) : ""));
3673 
3674-        int64_t nTimeOffset = nTime - GetTime();
3675-        pfrom.nTimeOffset = nTimeOffset;
3676+        peer->m_time_offset = nTime - GetTime();
3677         if (!pfrom.IsInboundConn()) {
3678             // Don't use timedata samples from inbound peers to make it
3679             // harder for others to tamper with our adjusted time.

sr-gi commented at 2:27 pm on April 2, 2024:

nit: Not sure what commit this may better fit in, but there is no such thing as adjusted time anymore, so the comment may need updating

glozow commented at 4:08 pm on April 3, 2024:

+1, was going to say this should be removed in 2ef71c73582be554e565ada3f8a6ca77c2cd79f1, or say we are avoiding false warnings instead of preventing tampering of adjusted time

stickies-v commented at 10:26 am on April 6, 2024:

Thanks, I’ve updated the doc to to make it harder for others to create false warnings about our clock being out of sync.

in src/node/timeoffsets.cpp:21 in e6d7bcf768 outdated

16+#include <cstdint>
17+#include <deque>
18+#include <limits>
19+#include <optional>
20+
21+using namespace std::chrono_literals;

sr-gi commented at 2:33 pm on April 2, 2024:

e6d7bcf7689e38dcda45a1cdbff98fd610864e47 doesn’t look like chrono_literals is required here

stickies-v commented at 10:32 am on April 6, 2024:

I think it is for if (m_offsets.size() < 5) return 0s;?

sr-gi commented at 1:49 pm on April 11, 2024:

I removed it and this seems to run fine, at least in my system

stickies-v commented at 4:40 pm on April 15, 2024:

We #include <util/time.h> which has using namespace std::chrono_literals; which is why it compiles. I thought we would prefer (IWYU-style) to be explicit about used namespaces, but upon further investigation the purpose of #20602 seems to have been to avoid that, so I’ll remove this if I re-touch.

in src/node/timeoffsets.h:21 in e6d7bcf768 outdated

16+{
17+private:
18+    static constexpr size_t N{50};
19+    //! Minimum difference between system and network time for a warning to be raised.
20+    static constexpr std::chrono::minutes m_warn_threshold{10};
21+    //! Log the last time a GUI warning was raised to avoid flooding user with them

sr-gi commented at 2:35 pm on April 2, 2024:

e6d7bcf7689e38dcda45a1cdbff98fd610864e47 nit: full stop (for consistency with the rest).

sr-gi commented at 2:37 pm on April 2, 2024:

https://github.com/bitcoin/bitcoin/commit/e6d7bcf7689e38dcda45a1cdbff98fd610864e47 nit: Also, the user/users

stickies-v commented at 12:46 pm on April 10, 2024:

Thanks, no longer relevant since the variable has been removed in latest push

in src/node/timeoffsets.h:23 in e6d7bcf768 outdated

18+    static constexpr size_t N{50};
19+    //! Minimum difference between system and network time for a warning to be raised.
20+    static constexpr std::chrono::minutes m_warn_threshold{10};
21+    //! Log the last time a GUI warning was raised to avoid flooding user with them
22+    mutable std::optional<std::chrono::steady_clock::time_point> m_warning_emitted GUARDED_BY(m_mutex){};
23+    //! Minimum time to wait before raising a new GUI warning

sr-gi commented at 2:36 pm on April 2, 2024:

e6d7bcf7689e38dcda45a1cdbff98fd610864e47 nit: full stop (for consistency with the rest)

maflcko commented at 1:10 pm on April 3, 2024:

I don’t think the use of “GUI” is correct here, is it?

The GUI only calls getWarnings(), aka GetWarnings(true), aka warnings_concise, which is never set in this pull request.

Currently it is only set for RPC and the “No-UI” interface.

stickies-v commented at 3:33 pm on April 6, 2024:

The m_warn wait (now WARN_WAIT) var was used to check that we’re not creating too many messageboxes (through uiInterface.ThreadSafeMessageBox()). The messagebox call and the relevant variables have now been removed and the GetWarnings() implementation updated so that the GUI now shows warnings this way, so this comment is no longer relevant.

stickies-v commented at 12:46 pm on April 10, 2024:

Thanks, no longer relevant since the variable has been removed in latest push

in src/node/timeoffsets.cpp:39 in e6d7bcf768 outdated

34+
35+std::chrono::seconds TimeOffsets::Median() const
36+{
37+    LOCK(m_mutex);
38+
39+    // Only calculate the median if we have 5 or more offsets

sr-gi commented at 2:39 pm on April 2, 2024:

e6d7bcf7689e38dcda45a1cdbff98fd610864e47 it may be worth mentioning this in the header file

stickies-v commented at 11:14 am on April 6, 2024:

Agreed, updated doc in the header.

in src/node/timeoffsets.cpp:42 in e6d7bcf768 outdated

37+    LOCK(m_mutex);
38+
39+    // Only calculate the median if we have 5 or more offsets
40+    if (m_offsets.size() < 5) return 0s;
41+
42+    auto sorted_copy = m_offsets;

sr-gi commented at 2:42 pm on April 2, 2024:

e6d7bcf7689e38dcda45a1cdbff98fd610864e47 nit: theoretically, you only need to hold the mutex up to here, right?

Not that it matters much given the small size of the collection anyway

stickies-v commented at 3:33 pm on April 6, 2024:

I think I’d prefer to leave this as is, it’s a well scoped mutex, performance impact should be negligible so keeping it concise makes sense I think.

in src/node/timeoffsets.cpp:57 in e6d7bcf768 outdated

52+        SetMedianTimeOffsetWarning(std::nullopt);
53+        return false;
54+    }
55+
56+    bilingual_str msg{strprintf(_(
57+        "Your computer's date and time appears to be more than %d minutes out of sync with the network, "

sr-gi commented at 2:47 pm on April 2, 2024:

e6d7bcf7689e38dcda45a1cdbff98fd610864e47 nit: appear

stickies-v commented at 11:18 am on April 6, 2024:

Thanks, fixed.

in src/net_processing.cpp:3686 in 2ef71c7358 outdated

3680@@ -3666,12 +3681,12 @@ void PeerManagerImpl::ProcessMessage(CNode& pfrom, const std::string& msg_type,
3681                   peer->m_starting_height, addrMe.ToStringAddrPort(), fRelay, pfrom.GetId(),
3682                   remoteAddr, (mapped_as ? strprintf(", mapped_as=%d", mapped_as) : ""));
3683 
3684-        int64_t nTimeOffset = nTime - GetTime();
3685-        pfrom.nTimeOffset = nTimeOffset;
3686+        peer->m_time_offset = NodeSeconds{std::chrono::seconds{nTime}} - Now<NodeSeconds>();
3687         if (!pfrom.IsInboundConn()) {
3688             // Don't use timedata samples from inbound peers to make it

sr-gi commented at 3:02 pm on April 2, 2024:

After 2ef71c73582be554e565ada3f8a6ca77c2cd79f1, timedata is not a thing anymore, so this may need updating

stickies-v commented at 11:20 am on April 6, 2024:

updated to Don't use time offset samples, thanks.

sr-gi commented at 3:52 pm on April 15, 2024:

I don’t think this was covered (or it may have been regressed)

stickies-v commented at 4:11 pm on April 15, 2024:

Sorry, I think it must have regressed with this update in the next line - will fix if I have to retouch.

in src/rpc/net.cpp:683 in 2ef71c7358 outdated

679@@ -681,8 +680,8 @@ static RPCHelpMan getnetworkinfo()
680     if (node.peerman) {
681         auto peerman_info{node.peerman->GetInfo()};
682         obj.pushKV("localrelay", !peerman_info.ignores_incoming_txs);
683+        obj.pushKV("timeoffset", Ticks<std::chrono::seconds>(peerman_info.median_outbound_time_offset));

sr-gi commented at 3:05 pm on April 2, 2024:

2ef71c73582be554e565ada3f8a6ca77c2cd79f1 shouldn’t this belong to d81d969290e911904636bdd0bf6ba732615a7b68?

glozow commented at 4:05 pm on April 3, 2024:

Or its own rpc commit?

stickies-v commented at 1:05 pm on April 10, 2024:

I’ve moved this change to [net processing] Introduce PeerManagerInfo

achow101 commented at 9:40 pm on April 2, 2024: member

ACK 2ef71c73582be554e565ada3f8a6ca77c2cd79f1

DrahtBot requested review from vasild on Apr 2, 2024

DrahtBot requested review from theStack on Apr 2, 2024

DrahtBot requested review from dergoegge on Apr 2, 2024

in src/test/timeoffsets_tests.cpp:64 in e6d7bcf768 outdated

59+
60+BOOST_AUTO_TEST_CASE(timeoffsets_warning)
61+{
62+    noui_test_redirect();  // prevent the TimeOffsets::WarnIfOutOfSync() GUI MessageBoxes from flooding the log
63+    BOOST_CHECK(IsWarningRaised({{-60min, -40min, -30min, 0min, 10min}}));
64+    BOOST_CHECK(IsWarningRaised({5, 21min}));

sr-gi commented at 11:04 am on April 3, 2024:

Since the warn threshold was reduced from 20 to 10, it may make more sense to change this to 11

stickies-v commented at 11:22 am on April 6, 2024:

Ah yes good catch, updated.

sr-gi changes_requested

sr-gi commented at 11:35 am on April 3, 2024: member

ACK https://github.com/bitcoin/bitcoin/commit/2ef71c73582be554e565ada3f8a6ca77c2cd79f1

Left some comments inline, mostly nits/things that may need renaming.

Also, qt/guiutils.h still refers to CNodeCombinedStats.nTimeOffset after 5084ad48462b3169c99a81aeb21a339c3a9a4728, in a couple of places, but it may be better to change that for time_offset given nTimeOffset is not a thing anymore.

in src/node/timeoffsets.h:20 in 2ef71c7358 outdated

15+class TimeOffsets
16+{
17+private:
18+    static constexpr size_t N{50};
19+    //! Minimum difference between system and network time for a warning to be raised.
20+    static constexpr std::chrono::minutes m_warn_threshold{10};

vasild commented at 12:43 pm on April 3, 2024:

Since this is a constant, like N:

0    static constexpr std::chrono::minutes WARN_THRESHOLD{10};

in src/node/timeoffsets.h:24 in 2ef71c7358 outdated

19+    //! Minimum difference between system and network time for a warning to be raised.
20+    static constexpr std::chrono::minutes m_warn_threshold{10};
21+    //! Log the last time a GUI warning was raised to avoid flooding user with them
22+    mutable std::optional<std::chrono::steady_clock::time_point> m_warning_emitted GUARDED_BY(m_mutex){};
23+    //! Minimum time to wait before raising a new GUI warning
24+    static constexpr std::chrono::minutes m_warning_wait{60};

vasild commented at 12:49 pm on April 3, 2024:

Same as above:

0    static constexpr std::chrono::minutes WARNING_WAIT{60};

stickies-v commented at 1:57 pm on April 10, 2024:

Thanks, incorporated both suggestions.

in src/test/fuzz/timeoffsets.cpp:27 in 2ef71c7358 outdated

22+{
23+    FuzzedDataProvider fuzzed_data_provider(buffer.data(), buffer.size());
24+    TimeOffsets offsets{};
25+    LIMITED_WHILE(fuzzed_data_provider.remaining_bytes() > 0, 4'000) {
26+        (void)offsets.Median();
27+        offsets.Add(std::chrono::seconds{fuzzed_data_provider.ConsumeIntegral<int64_t>()});

vasild commented at 1:11 pm on April 3, 2024:

This implies that int64_t is the same as the internal type of std::chrono::seconds. Better get a random integer in the range of [std::chrono::seconds::min().count(), std::chrono::seconds::max().count()].

stickies-v commented at 2:06 pm on April 10, 2024:

Updated this line to use std::chrono::seconds::rep :

0        offsets.Add(std::chrono::seconds{fuzzed_data_provider.ConsumeIntegral<std::chrono::seconds::rep>()});

in src/node/timeoffsets.cpp:58 in 2ef71c7358 outdated

53+        return false;
54+    }
55+
56+    bilingual_str msg{strprintf(_(
57+        "Your computer's date and time appears to be more than %d minutes out of sync with the network, "
58+        "this may lead to consensus failure. After you've corrected your settings, this message "

maflcko commented at 1:13 pm on April 3, 2024:

0        "this may lead to consensus failure. After you've confirmed your computer's clock, this message "

nit: The clock may or may not already be correct, so there may be nothing to correct, only to confirm.

maflcko commented at 1:20 pm on April 3, 2024:

nit: Could also mention the RPCs to get the per-peer offset, and the median, as well as the NodeClock time?

stickies-v commented at 1:56 pm on April 10, 2024:

Thanks, I’ve included your suggestion and added a line about the RPCs.

maflcko commented at 1:13 pm on April 3, 2024: member

lgtm 2ef71c73582be554e565ada3f8a6ca77c2cd79f1 🛴

Signature:

0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
2trusted comment: lgtm 2ef71c73582be554e565ada3f8a6ca77c2cd79f1   🛴
3Q9sgCN5S+/6cPhk6lYNjDonkzAs6Gm8QI7XynAlXDMCM29pQnMy67XGFMTPuBfZCMwdyVCo/S1sFWJI8l5xkCw==

maflcko commented at 1:14 pm on April 3, 2024: member

Reminder for myself: #29623 (review)

in src/test/timeoffsets_tests.cpp:17 in 2ef71c7358 outdated

12+#include <chrono>
13+#include <vector>
14+
15+using namespace std::chrono_literals;
16+
17+void AddMulti(TimeOffsets& offsets, std::vector<std::chrono::seconds> to_add)

vasild commented at 1:20 pm on April 3, 2024:

In this case performance is irrelevant but in general in arguments that are expensive to copy like vector are passed by const reference.

stickies-v commented at 11:31 am on April 6, 2024:

Oops, fixed.

in src/test/timeoffsets_tests.cpp:52 in 2ef71c7358 outdated

47+    AddMulti(offsets, {25, 15s});
48+    // we now have 25 offsets of 10s followed by 25 offsets of 15s
49+    BOOST_CHECK(offsets.Median() == 15s);
50+}
51+
52+bool IsWarningRaised(std::vector<std::chrono::seconds> check_offsets)

vasild commented at 1:26 pm on April 3, 2024:

0static bool IsWarningRaised(std::vector<std::chrono::seconds> check_offsets)

stickies-v commented at 2:07 pm on April 10, 2024:

Thanks, fixed. Also for AddMulti in the same file.

vasild commented at 3:28 pm on April 3, 2024: contributor

Approach ACK 2ef71c73582be554e565ada3f8a6ca77c2cd79f1

Wrt adjusting the peer’s offset with local clock corrections in order to stop the warnings immediately after the local clock is fixed (instead of after 4-5h) #29623 (review), consider this:

 0diff --git i/src/node/timeoffsets.cpp w/src/node/timeoffsets.cpp
 1index 8488e47ff5..4dce6f9edd 100644
 2--- i/src/node/timeoffsets.cpp
 3+++ w/src/node/timeoffsets.cpp
 4@@ -24,27 +24,31 @@ void TimeOffsets::Add(std::chrono::seconds offset)
 5 {
 6     LOCK(m_mutex);
 7 
 8     if (m_offsets.size() >= N) {
 9         m_offsets.pop_front();
10     }
11-    m_offsets.push_back(offset);
12+    m_offsets.emplace_back(offset, std::chrono::steady_clock::now(), std::chrono::system_clock::now());
13     LogDebug(BCLog::NET, "Added time offset %+ds, total samples %d\n",
14              Ticks<std::chrono::seconds>(offset), m_offsets.size());
15 }
16 
17 std::chrono::seconds TimeOffsets::Median() const
18 {
19     LOCK(m_mutex);
20 
21     // Only calculate the median if we have 5 or more offsets
22     if (m_offsets.size() < 5) return 0s;
23 
24+    const auto steady_now = std::chrono::steady_clock::now();
25+    const auto system_now = std::chrono::system_clock::now();
26     auto sorted_copy = m_offsets;
27-    std::sort(sorted_copy.begin(), sorted_copy.end());
28-    return sorted_copy[m_offsets.size() / 2];  // approximate median is good enough, keep it simple
29+    std::sort(sorted_copy.begin(), sorted_copy.end(), [&](const peer_time_data& a, const peer_time_data& b) {
30+        return a.Offset(steady_now, system_now) < b.Offset(steady_now, system_now);
31+    });
32+    return sorted_copy[m_offsets.size() / 2].Offset(steady_now, system_now);  // approximate median is good enough, keep it simple
33 }
34 
35 bool TimeOffsets::WarnIfOutOfSync() const
36 {
37     // when median == std::numeric_limits<int64_t>::min(), calling std::chrono::abs is UB
38     auto median{std::max(Median(), std::chrono::seconds(std::numeric_limits<int64_t>::min() + 1))};
39diff --git i/src/node/timeoffsets.h w/src/node/timeoffsets.h
40index 8bccae20e8..1d7a9f0153 100644
41--- i/src/node/timeoffsets.h
42+++ w/src/node/timeoffsets.h
43@@ -21,13 +21,29 @@ private:
44     //! Log the last time a GUI warning was raised to avoid flooding user with them
45     mutable std::optional<std::chrono::steady_clock::time_point> m_warning_emitted GUARDED_BY(m_mutex){};
46     //! Minimum time to wait before raising a new GUI warning
47     static constexpr std::chrono::minutes m_warning_wait{60};
48 
49     mutable Mutex m_mutex;
50-    std::deque<std::chrono::seconds> m_offsets GUARDED_BY(m_mutex){};
51+    struct peer_time_data {
52+        //! The difference between peer's time and our time, at the time of the P2P handshake.
53+        std::chrono::seconds handshake_offset;
54+        // The following two are used to detect system clock corrections/adjustments.
55+        std::chrono::time_point<std::chrono::steady_clock> handshake_steady;
56+        std::chrono::time_point<std::chrono::system_clock> handshake_system;
57+
58+        /// Calculate the offset as of now, taking into account any system clock corrections that
59+        /// have happened since the peer connected.
60+        std::chrono::seconds Offset(std::chrono::time_point<std::chrono::steady_clock> steady_now,
61+                                    std::chrono::time_point<std::chrono::system_clock> system_now) const {
62+            return handshake_offset +
63+                std::chrono::duration_cast<std::chrono::seconds>(steady_now - handshake_steady) -
64+                std::chrono::duration_cast<std::chrono::seconds>(system_now - handshake_system);
65+        }
66+    };
67+    std::deque<peer_time_data> m_offsets GUARDED_BY(m_mutex){};
68 
69 public:
70     /** Add a new time offset sample. */
71     void Add(std::chrono::seconds offset) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex);
72 
73     /** Compute and return the median of the collected time offset samples. */

in src/node/timeoffsets.h:17 in e6d7bcf768 outdated

12+#include <deque>
13+#include <optional>
14+
15+class TimeOffsets
16+{
17+private:

glozow commented at 3:59 pm on April 3, 2024:

nit: this is unnecessary

stickies-v commented at 2:08 pm on April 10, 2024:

Ah right, removed.

in src/net_processing.cpp:756 in e6d7bcf768 outdated

752@@ -753,6 +753,8 @@ class PeerManagerImpl final : public PeerManager
753     /** Next time to check for stale tip */
754     std::chrono::seconds m_stale_tip_check_time GUARDED_BY(cs_main){0s};
755 
756+    TimeOffsets m_outbound_time_offsets;

glozow commented at 4:00 pm on April 3, 2024:

needs doc?

stickies-v commented at 2:08 pm on April 10, 2024:

Added!

in src/node/timeoffsets.h:18 in e6d7bcf768 outdated

13+#include <optional>
14+
15+class TimeOffsets
16+{
17+private:
18+    static constexpr size_t N{50};

glozow commented at 4:17 pm on April 3, 2024:

nit: documentation please, e.g. number of elements to track

stickies-v commented at 2:08 pm on April 10, 2024:

Renamed variable to MAX_SIZE and added a docstring.

glozow commented at 4:17 pm on April 3, 2024: member

approach ACK

in src/node/timeoffsets.cpp:71 in e6d7bcf768 outdated

66+    const auto now{SteadyClock::now()};
67+    {
68+        LOCK(m_mutex);
69+        if (!m_warning_emitted || now - m_warning_emitted.value() > m_warning_wait) {
70+            m_warning_emitted = now;
71+            uiInterface.ThreadSafeMessageBox(msg, "", CClientUIInterface::MSG_WARNING);

mzumsande commented at 1:50 pm on April 4, 2024:

This isn’t changed behavior, but with the threshold reduced to 10 minutes it may be a bit more relevant now: As far as I understand it, for GUI users ThreadSafeMessageBox() blocks the calling thread (here msghand) until the user has clicked away the warning: That means that for as long as the user didn’t click anything (e.g. because they are afk), most activity stops: No more received messages are processed, the tip is no longer updated etc. The current peers will disconnect us soon (because we don’t answer their pings anymore), and we will constantly open new connections (that is done separately in the opencon thrad) that will then timeout because we never process their version message - leading to a lot of churn. This behavior doesn’t seem ideal to me, I wonder if it wouldn’t be better to display a warning in the GUI that is non blocking.

stickies-v commented at 2:24 pm on April 10, 2024:

Great catch, thanks. I hadn’t considered the blocking nature being a problem, and you’re right that this PR makes the problem more apparent since the waiting threshold is reduced significantly.

I’ve removed the messagebox altogether, and updated the approach to properly use the GetWarnings() method to show the warning in the GUI. I think this simplifies the code, prevents blocking, and makes for nicer UI.

janus referenced this in commit 561b4b5d2d on Apr 6, 2024

[net processing] Move nTimeOffset to net_processing 038fd979ef

[net processing] Use std::chrono for type-safe time offsets 55361a15d1

stickies-v force-pushed on Apr 10, 2024

Add TimeOffsets helper class

This helper class is an alternative to CMedianFilter, but without a
lot of the special logic and exceptions that we needed while it was
still used for consensus.

ee178dfcc1

[net processing] Introduce PeerManagerInfo

For querying statistics/info from PeerManager. The median outbound time
offset is the only initial field.

7d9c3ec622

[net processing] Move IgnoresIncomingTxs to PeerManagerInfo 92e72b5d0d

Remove timedata

With the introduction and usage of TimeOffsets, there is no more need
for this file. Remove it.

c6be144c4b

stickies-v force-pushed on Apr 10, 2024

stickies-v commented at 3:11 pm on April 10, 2024: contributor

Thank you very much @sr-gi @maflcko @vasild @glozow @mzumsande for your extensive review.

Force pushed to address all review comments:

in GUI: display the warning along with other warnings, instead of in a messagebox. This prevents the warning from #29623 (review) (and is a nicer approach in general)
various nits around improved documentation, variable naming, references, help messages, …
I probably won’t be addressing nits in this PR anymore to minimize review burden, so if there are any left (please keep posting them if you see any) I’ll keep them for a follow-up.

Also, qt/guiutils.h still refers to CNodeCombinedStats.nTimeOffset

References updated, thanks

Wrt adjusting the peer’s offset with local clock corrections in order to stop the warnings immediately after the local clock is fixed (instead of after 4-5h) #29623 (review), consider this:

I like the idea of using steady_clock here, but I would prefer to do it in a follow-up if there’s appetite for it. It’s nice to have the warning disappear more quickly when the issue is resolved, but I think a downside is that the timeoffset peer stats in GUI and RPC will be inconsistent with the ones we use for the warning. I’d started implementing a PeerClock approach based on your suggestion, and I think it’s quite elegant, but it’s a lot more code overhaul so not really ideal for this PR. WIP commit at https://github.com/bitcoin/bitcoin/compare/master...stickies-v:bitcoin:2024-03/remove-timedata-keep-clock

sr-gi commented at 3:54 pm on April 15, 2024: member

Re-ACK c6be144

DrahtBot requested review from vasild on Apr 15, 2024

DrahtBot requested review from achow101 on Apr 15, 2024

DrahtBot requested review from glozow on Apr 15, 2024

achow101 commented at 7:30 pm on April 16, 2024: member

reACK c6be144c4b774a03a8bcab5a165768cf81e9b06b

achow101 removed review request from achow101 on Apr 16, 2024

dergoegge approved

dergoegge commented at 10:42 am on April 30, 2024: member

utACK c6be144c4b774a03a8bcab5a165768cf81e9b06b

achow101 merged this on Apr 30, 2024

achow101 closed this on Apr 30, 2024

stickies-v deleted the branch on May 1, 2024

hebasto added the label Needs CMake port on May 2, 2024

hebasto commented at 12:05 pm on May 4, 2024: member

Ported to the CMake-based build system in https://github.com/hebasto/bitcoin/pull/186.

hebasto removed the label Needs CMake port on May 4, 2024

achow101 referenced this in commit 8efd03ad04 on May 7, 2024

bitcoin locked this on May 4, 2025

Simplify network-adjusted time warning logic #29623

Code Coverage

Reviews

Conflicts

Example: