Switch CNode::m_ping_start
and CNetMessage::m_time
to mockable time, so that tests can be added.
Mockable time is also type-safe, since it uses std::chrono
Switch CNode::m_ping_start
and CNetMessage::m_time
to mockable time, so that tests can be added.
Mockable time is also type-safe, since it uses std::chrono
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
Coverage | Change (pull 18638, 0f96c4b9c8eed32c3fdf1203a2f9a2be59a5f4a2) | Reference (master, 75fcfdaf8b4b68ef1d1fcd606fb36c9265e15918) |
---|---|---|
Lines | +0.0020 % | 90.3582 % |
Functions | +0.1431 % | 85.0787 % |
Branches | +0.0089 % | 51.8202 % |
Updated at: 2020-04-15T00:09:25.322904.
Concept ACK.
Won’t this potentially cause issues (e.g. ping timeouts) in P2P tests that already use mockable time for other things, but are not expecting it to have an effect on pings?
(seems not, given that all the tests still pass)
The -peertimeout
is still measured in wall clock time, and because most of the test run through faster than what the -peertimeout
is, this should not lead to issues in general. There are two tests test/functional/feature_maxuploadtarget.py
and test/functional/feature_bip68_sequence.py
which mine a lot of blocks (more than 100) under mocktime. This can take time when the tests are run under valgrind, so I had to bump the -peertimeout
for those tests.
Other tests should not be affected. If they are, it should be trivial to fix them up.
97+ ping_delay = 29
98+ self.mock_forward(ping_delay)
99+ wait_until(lambda: 'ping' in no_pong_node.last_message)
100+ no_pong_node.send_and_ping(msg_pong(no_pong_node.last_message.pop('ping').nonce))
101+ self.check_peer_info(pingtime=ping_delay, minping=ping_delay, pingwait=None)
102+
minping
is the best/shortest round trip time measured so far. Thus, would it be a good idea to repeat this chunk with a shorter ping_delay
in order to check that minping
gets updated correctly? I tried with ping_delay
= 25 locally and it was updated from 29 to 25 as expected.
ping_delay
29? I’m not sure if it’s arbitrary or a chosen magic number.
60+ self.mock_forward(3)
61+ assert no_pong_node.last_message.pop('ping').nonce != 0
62+ self.check_peer_info(pingtime=None, minping=None, pingwait=3)
63+
64+ self.log.info('Reply without nonce cancels ping')
65+ with self.nodes[0].assert_debug_log(['pong peer=0: Short payload']):
msg_pong_corrupt
is sent (not a msg_pong
)
73@@ -74,7 +74,7 @@ void test_one_input(const std::vector<uint8_t>& buffer)
74 p2p_node.SetSendVersion(PROTOCOL_VERSION);
75 g_setup->m_node.peer_logic->InitializeNode(&p2p_node);
76 try {
77- (void)ProcessMessage(&p2p_node, random_message_type, random_bytes_data_stream, GetTimeMillis(), Params(), *g_setup->m_node.mempool, g_setup->m_node.connman.get(), g_setup->m_node.banman.get(), std::atomic<bool>{false});
Yes, this looks like a bug in the test to me as well. Luckily by making the time type safe, this is less likely to happen in the future.
Regardless, instead of depending on the current time, which is not deterministic and might cause issues while fuzzing, the fuzz test should ask the fuzz engine for a time value.
However, this seems out of scope for the changes in this pull request.
685@@ -686,7 +686,8 @@ const uint256& V1TransportDeserializer::GetMessageHash() const
686 return data_hash;
687 }
688
689-CNetMessage V1TransportDeserializer::GetMessage(const CMessageHeader::MessageStartChars& message_start, int64_t time) {
690+CNetMessage V1TransportDeserializer::GetMessage(const CMessageHeader::MessageStartChars& message_start, const std::chrono::microseconds& time)
std::chrono
times are cheaply copied ints; would it be simpler/safer/faster to pass by value rather than reference to const?
If yes, idem in net.h::L645+698
, net_processing.cpp::L2085
, and fuzz/process_message.cpp::L32
Reference: https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rf-in
I always follow rule 1 (const&), because it is always cheap and prevents the overhead of having to think how many words a type has.
0void f1(const string& s); // OK: pass by reference to const; always cheap
1141@@ -1141,9 +1142,9 @@ void CConnman::InactivityCheck(CNode *pnode)
1142 LogPrintf("socket receive timeout: %is\n", nTime - pnode->nLastRecv);
1143 pnode->fDisconnect = true;
1144 }
1145- else if (pnode->nPingNonceSent && pnode->nPingUsecStart + TIMEOUT_INTERVAL * 1000000 < GetTimeMicros())
1146+ else if (pnode->nPingNonceSent && pnode->m_ping_start.load() + std::chrono::seconds{TIMEOUT_INTERVAL} < GetTime<std::chrono::microseconds>())
nPingNonceSent
is tested for truthiness with implicit conversion instead of against an integer value.
560
561 // Raw ping time is in microseconds, but show it to user as whole seconds (Bitcoin users should be well used to small numbers with many decimal places by now :)
562 stats.m_ping_usec = nPingUsecTime;
563 stats.m_min_ping_usec = nMinPingUsecTime;
564- stats.m_ping_wait_usec = nPingUsecWait;
565+ stats.m_ping_wait_usec = count_microseconds(ping_wait);
duration
to no longer need helpers like count_microseconds
? (or am I confused)
I admit that the documentation isn’t the best https://doxygen.bitcoincore.org/time_8h.html#a50aae909d5edf426afb66cd4d30c8d00
However, those helpers will be needed to convert durations to raw integers for:
ACK fa704719529 good to reduce use of primitives for counting time in favor of the standard library and mockability, though it’s fiddly and easy to introduce regressions while doing so (thanks for adding the tests). Builds with gcc and clang compile cleanly and the node seems to run normally. Built and ran the fuzzers without issue.
0$ src/test/fuzz/process_message ../qa-assets/fuzz_seed_corpus/
1
2INFO: Seed: 2449945750
3INFO: Loaded 1 modules (498754 inline 8-bit counters): 498754 [0x561ad95753c8, 0x561ad95ef00a),
4INFO: Loaded 1 PC tables (498754 PCs): 498754 [0x561ad95ef010,0x561ad9d8b430),
5INFO: 52251 files found in ../qa-assets/fuzz_seed_corpus/
6INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048576 bytes
7INFO: seed corpus: files: 52251 min: 1b max: 3984182b total: 462001668b rss: 227Mb
8[#512](/bitcoin-bitcoin/512/) pulse cov: 6973 ft: 7092 corp: 10/13b exec/s: 256 rss: 516Mb
9[#1024](/bitcoin-bitcoin/1024/) pulse cov: 7037 ft: 7168 corp: 12/19b exec/s: 256 rss: 516Mb
10[#2048](/bitcoin-bitcoin/2048/) pulse cov: 7317 ft: 7541 corp: 18/45b exec/s: 292 rss: 516Mb
11
12.../...
13
14[#3094033](/bitcoin-bitcoin/3094033/) REDUCE cov: 34621 ft: 110638 corp: 2221/41Mb exec/s: 23 rss: 1025Mb L: 92/1048576 MS: 1 EraseBytes-
15[#3102635](/bitcoin-bitcoin/3102635/) NEW cov: 34621 ft: 110890 corp: 2222/41Mb exec/s: 22 rss: 1025Mb L: 111/1048576 MS: 2 CopyPart-ChangeByte-
16[#3102867](/bitcoin-bitcoin/3102867/) NEW cov: 34621 ft: 110899 corp: 2223/41Mb exec/s: 22 rss: 1025Mb L: 107/1048576 MS: 2 CopyPart-CMP- DE: "-bans"-
0$ src/test/fuzz/p2p_transport_deserializer ../qa-assets/fuzz_seed_corpus/
1
2INFO: Seed: 159618322
3INFO: Loaded 1 modules (498301 inline 8-bit counters): 498301 [0x55c1d40d0d48, 0x55c1d414a7c5),
4INFO: Loaded 1 PC tables (498301 PCs): 498301 [0x55c1d414a7c8,0x55c1d48e4f98),
5INFO: 52840 files found in ../qa-assets/fuzz_seed_corpus/
6INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 1048576 bytes
7INFO: seed corpus: files: 52840 min: 1b max: 3984182b total: 491369132b rss: 167Mb
8[#8192](/bitcoin-bitcoin/8192/) pulse cov: 452 ft: 453 corp: 1/1b exec/s: 4096 rss: 254Mb
9[#16384](/bitcoin-bitcoin/16384/) pulse cov: 1131 ft: 2680 corp: 34/1054b exec/s: 4096 rss: 389Mb
10[#32768](/bitcoin-bitcoin/32768/) pulse cov: 1137 ft: 5091 corp: 77/7383b exec/s: 2978 rss: 536Mb
11
12.../...
13
14[#1036084](/bitcoin-bitcoin/1036084/) REDUCE cov: 1137 ft: 7819 corp: 149/844Kb exec/s: 293 rss: 824Mb L: 6113/666283 MS: 4 ChangeByte-EraseBytes-CMP-CopyPart- DE: "\x0d\x00\x00\x00\x00\x00\x00\x00"-
15[#1048576](/bitcoin-bitcoin/1048576/) pulse cov: 1137 ft: 7819 corp: 149/844Kb exec/s: 294 rss: 824Mb
16[#1050168](/bitcoin-bitcoin/1050168/) REDUCE cov: 1137 ft: 7819 corp: 149/843Kb exec/s: 294 rss: 824Mb L: 25076/666283 MS: 4 ShuffleBytes-InsertByte-ShuffleBytes-EraseBytes-
35+ pass
36+
37+
38+class PingPongTest(BitcoinTestFramework):
39+ def set_test_params(self):
40+ self.setup_clean_chain = True
self.setup_clean_chain = False
for this reason. I reckoned the default was the cheaper option and the more expensive one was opt-in, so I always used False
where possible.
functional/README.md
where it states “Set the self.setup_clean_chain
variable in set_test_params()
to control whether or not to use the cached data directories.”
The reason this default to a chain is that most tests require a chain, so so the least verbose option is the default.
Also, defaulting to a chain prevents test writers from accidentally mining a chain when it is not needed. Mining is orders of magnitude slower than copying.
git diff fa70471 fabb382
are passing time by value to GetMessage
and ProcessMessage
and improved documentation.
15+ wait_until,
16+)
17+from test_framework.test_framework import BitcoinTestFramework
18+from test_framework.util import assert_equal
19+
20+PING_INTERVAL = 2 * 60
nit: if you retouch, could add a comment expressing the time units or along the lines of:
0PING_INTERVAL = 2 * 60 # 2 minutes (corresponds to net_processing::PING_INTERVAL)
git range-diff b3ec1fe fabb382 fa94d6f
are updates for a recent merge converting pointers to references in net processing code. Re-did clang and fuzz debug builds, re-ran the new test p2p_ping
and the process_message
and p2p_transport_deserializer
fuzz tests locally.
git range-diff bitcoin-core/master A B
git range-diff 4b5c919 fa94d6f fa33654
, previous tested ACKs still valid
55+ self.mock_time = int(time.time())
56+ self.mock_forward(0)
57+
58+ self.log.info('Check that ping is sent after connection is established')
59+ no_pong_node = self.nodes[0].add_p2p_connection(NodeNoPong())
60+ self.mock_forward(3)
peertimeout
right? If yes, we use peertimeout
at 3 places any maybe it’s more clear to define it at the outermost scope as well like PING_INTERVAL
? Or maybe there’s no need :)
peertimeout
is wall clock time, not mock time, I think. So this can be any value, as long as it matches pingwait=3 below, I’d say
56+ self.mock_forward(0)
57+
58+ self.log.info('Check that ping is sent after connection is established')
59+ no_pong_node = self.nodes[0].add_p2p_connection(NodeNoPong())
60+ self.mock_forward(3)
61+ assert no_pong_node.last_message.pop('ping').nonce != 0
Why pop here, as opposed to just accessing the value? Seems like having (message_count['ping'] == 1 and 'ping' not in last_message) == True
could be confusing.
Maybe decrementing message_count['ping']
every pop is an easier solution, being that this idea is used throughout the test.
0 assert no_pong_node.last_message['ping'].nonce != 0
Or maybe
0 assert no_pong_node.last_message.get('ping').nonce != 0
or
0 assert no_pong_node.last_message.pop('ping').nonce != 0
1 no_pong_node.message_count['ping'] -= 1
message_count
exists, nor am I using it in this test. Given that it is almost unused right now and causes (potential) confusion, I’d say to remove it
68+
69+ self.log.info('Reply without ping')
70+ with self.nodes[0].assert_debug_log([
71+ 'pong peer=0: Unsolicited pong without ping, 0 expected, 0 received, 8 bytes',
72+ ]):
73+ no_pong_node.send_and_ping(msg_pong())
send_and_ping
and not send_message
, here and throughout? Doesn’t seem like it would make a difference, and maybe it’s best that we don’t have extraneous pings and pongs.
I don’t like to use assert_debug_log
as a way to synchronize/poll for a change
send_and_ping
is a way to flush all messages in the buffer (on both sides, send and receive) and ensure they are processed (and any errors have been logged).
7+
8+import time
9+
10+from test_framework.messages import (
11+ msg_pong,
12+)
ACK fa3365430c5fb57d7c0b5f2bce9fbbe290be93c3
Reviewed, verified everything runs.
I enjoyed how each sub-test flows into the next one. Hard to see at first but they really work together.
A few questions.