sr-gi
commented at 4:00 PM on December 20, 2023:
member
Motivation
While checking the outbound eviction code I realized a case was not considered within the comments, which in turn made me realize we had no functional tests for the outbound eviction case (when I went to check/add the test case).
This PR updates the aforementioned comment and adds functional tests to cover the outbound eviction logic, in addition to the existing unit tests found at src/test/denialofservice_tests.cpp.
DrahtBot
commented at 4:00 PM on December 20, 2023:
contributor
<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
<!--174a7506f384e20aa4161008e828411d-->
Conflicts
No conflicts as of last run.
DrahtBot added the label Tests on Dec 20, 2023
sr-gi
commented at 4:02 PM on December 20, 2023:
member
While the tests seem to run fine, I noticed a warning that appears every now and then:
TestFramework.p2p (WARNING): Connection lost to 0:0 due to [Errno 54] Connection reset by peer
I'm not sure if something unintended is happening under the hood. Disconnections are expected, but it's our node who is supposed to trigger them, so maybe I'm missing something.
sr-gi force-pushed on Dec 20, 2023
DrahtBot added the label CI failed on Dec 20, 2023
sr-gi force-pushed on Dec 20, 2023
sr-gi force-pushed on Dec 20, 2023
sr-gi renamed this: test: adds outbound eviction functional tests, updated comment on ConsiderEviction test: adds outbound eviction functional tests, updates comment in ConsiderEviction on Dec 20, 2023
DrahtBot removed the label CI failed on Dec 20, 2023
sr-gi force-pushed on Dec 20, 2023
in
src/net_processing.cpp:5101
in
9ba568c816outdated
5097 | @@ -5098,9 +5098,10 @@ void PeerManagerImpl::ConsiderEviction(CNode& pto, Peer& peer, std::chrono::seco
5098 | state.m_chain_sync.m_sent_getheaders = false;
5099 | }
5100 | } else if (state.m_chain_sync.m_timeout == 0s || (state.m_chain_sync.m_work_header != nullptr && state.pindexBestKnownBlock != nullptr && state.pindexBestKnownBlock->nChainWork >= state.m_chain_sync.m_work_header->nChainWork)) {
5101 | - // Our best block known by this peer is behind our tip, and we're either noticing5102 | - // that for the first time, OR this peer was able to catch up to some earlier point5103 | - // where we checked against our tip.5104 | + // Either:
naumenkogs
commented at 9:34 AM on December 22, 2023:
I've spent a ridiculous amount of time understanding what's going on here, and I'm still far from being 100% sure. Perhaps if you touch this comment, you can expand it even further? E.g., connecting to each of the conditions in else if
Rebased and updated the comment. I also extended the test slightly to cover all possible cases
sr-gi force-pushed on Jan 3, 2024
sr-gi force-pushed on Jan 3, 2024
DrahtBot added the label CI failed on Jan 3, 2024
sr-gi force-pushed on Jan 3, 2024
DrahtBot removed the label CI failed on Jan 3, 2024
sr-gi force-pushed on Jan 4, 2024
DrahtBot added the label CI failed on Jan 15, 2024
sr-gi force-pushed on Jan 29, 2024
sr-gi
commented at 10:06 PM on January 29, 2024:
member
Rebased to fix CI
DrahtBot removed the label CI failed on Jan 29, 2024
DrahtBot added the label CI failed on Feb 6, 2024
sr-gi force-pushed on Feb 6, 2024
sr-gi
commented at 2:55 PM on February 6, 2024:
member
Rebased. This has failed CI once when waiting for the disconnection after:
cur_mock_time += (HEADERS_RESPONSE_TIME + 1)
I've tried to reproduce it but I cannot in my local setup, the only thing that comes to mind is that mock time may be too close to the timeout (only one second ahead), so I gave it a delta of 10 seconds instead.
PS: For context, you can check the failure here: https://cirrus-ci.com/task/4624744360706048?logs=ci#L6520. See how the peer is given the final notice by sending a getheaders message, but the check is performed before the peer gets disconnected.
sr-gi force-pushed on Feb 6, 2024
DrahtBot removed the label CI failed on Feb 6, 2024
in
src/net_processing.cpp:5167
in
edc14bc9daoutdated
5131 | @@ -5132,16 +5132,19 @@ void PeerManagerImpl::ConsiderEviction(CNode& pto, Peer& peer, std::chrono::seco
5132 | // unless it's invalid, in which case we should find that out and
5133 | // disconnect from them elsewhere).
5134 | if (state.pindexBestKnownBlock != nullptr && state.pindexBestKnownBlock->nChainWork >= m_chainman.ActiveChain().Tip()->nChainWork) {
I could be missing something, but a test case covering the reset of eviction timeout when an outbound peer sends us a valid block with more chainwork (i.e. the timeout set to 0 to prevent eviction, and m_work_header/m_sent_getheaders also reinitialized) was not observed. Do you think it's worth adding?
Added notes below to provide insight into what was reviewed:
If eviction does not occur when it should, then wait_for_disconnect() will raise (AssertionError, originally from wait_until_helper_internal()). Confirmed by temporarily setting cur_mock_time to a value significantly lower than CHAIN_SYNC_TIMEOUT.
Similarly, if unexpected eviction occurs (e.g. in the "keep catching up" case), then send_and_ping() or sync_with_ping() will raise IOError (from send_raw_message()). Confirmed by temporarily inserting a call to disconnect_p2ps().
in
test/functional/p2p_eviction.py:222
in
edc14bc9daoutdated
217 | + self.log.info("Test that the peer does not get evicted") 218 | + test_node.sync_with_ping() 219 | + 220 | + node.disconnect_p2ps() 221 | + 222 | + def test_outbound_eviction_protected(self):
Added notes below to provide insight into what was reviewed:
In test_outbound_eviction_protected() if unexpected eviction occurs (i.e. the P2P connection is no longer present), then sync_with_ping() will raise (IOError('Not connected'), causing the test to fail. Moved node.disconnect_p2ps() above the last test_node.sync_with_ping() call and, as expected, this occurred.
nit: The heading comment in the test_outbound_eviction_protected() omits that a protected peer is also non-block-only, but this is implicit in the call to add_outbound_p2p_connection() using outbound-full-relay as the connection_type.
net_processing.cpp:
A peer is marked as protected if all of these are true:
- its connection type is IsBlockOnlyConn() == false
- it gave us a valid connecting header
- we haven't reached MAX_OUTBOUND_PEERS_TO_PROTECT_FROM_DISCONNECT yet
- its chain tip has at least as much work as ours
nit: I could be missing something, but a test case covering the limit of protected outbound peers (i.e. MAX_OUTBOUND_PEERS_TO_PROTECT_FROM_DISCONNECT) was not observed. Do you think it's worth adding?
Thinking out loud:
MAX_OUTBOUND_PEERS_TO_PROTECT_FROM_DISCONNECT = 4, so maybe adding another peer that would normally be considered protected and seeing that it is not protected could test this?
The limit is tested in this test. We create 8 connections, from which only 4 are protected. If you mean that there are no unprotected peers that behave properly, that could be added. e.g.
4 protected, send initial headers, do nothing, don't get evicted
2 unprotected, keep in sync, don't get evicted
2 unprotected, do not keep up, get evicted
Thank you, yes, I should have been clearer. The updated test_outbound_eviction_protected_mixed in 91005f7459993e1f7139e46b53eef1bb04860a7c looks great (includes 4 protected, 4 unprotected (2 honest, 2 misbehaving by sending no headers or old)). ACK91005f7459993e1f7139e46b53eef1bb04860a7c
tdb3
commented at 12:06 AM on February 11, 2024:
contributor
Excellent work finding a gap and filling it by increasing functional test robustness.
ACK for edc14bc9da332cae39f8803db559b532b3c74e16
Checked out the PR branch, built, and ran all functional tests (all passed).
The review was focused on verifying edge case coverage of the tests and that tests will accurately catch both pass and fail conditions.
Did not observe previously reported warning message:
TestFramework.p2p (WARNING): Connection lost to 0:0 due to [Errno 54] Connection reset by peer
sr-gi
commented at 4:00 PM on February 12, 2024:
member
sr-gi
commented at 4:22 PM on February 12, 2024:
member
I redefined how peers are split in p2p_eviction:test_outbound_eviction_protected_mixed to address #29122 (review). Both commits can be squashed (leaving them as they are for now in case someone does not agree with the approach)
sr-gi force-pushed on Feb 12, 2024
sr-gi
commented at 9:31 PM on February 12, 2024:
member
Rebased. This has failed CI once when waiting for the disconnection after:
cur_mock_time += (HEADERS_RESPONSE_TIME + 1)
I've tried to reproduce it but I cannot in my local setup, the only thing that comes to mind is that mock time may be too close to the timeout (only one second ahead), so I gave it a delta of 10 seconds instead.
PS: For context, you can check the failure here: https://cirrus-ci.com/task/4624744360706048?logs=ci#L6520. See how the peer is given the final notice by sending a getheaders message, but the check is performed before the peer gets disconnected.
I've been looking more into this, given an error showed up again on CI. I think the issue arose from the initial mock time not being properly set. I've reset the increments to be just one second apart, plus set the initial mock time before the tests were run.
Also, I've moved the outbound eviction tests to their own file, given p2p_eviction was being described as testing inbound eviction, plus used some custom classes that were only relevant for inbounds.
sr-gi force-pushed on Feb 12, 2024
sr-gi force-pushed on Feb 12, 2024
DrahtBot
commented at 9:42 PM on February 12, 2024:
contributor
<!--85328a0da195eb286784d51f73fa0af9-->
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot added the label CI failed on Feb 12, 2024
sr-gi force-pushed on Feb 12, 2024
DrahtBot removed the label CI failed on Feb 12, 2024
tdb3
commented at 12:41 AM on February 13, 2024:
contributor
Rebased. This has failed CI once when waiting for the disconnection after:
cur_mock_time += (HEADERS_RESPONSE_TIME + 1)
I've tried to reproduce it but I cannot in my local setup, the only thing that comes to mind is that mock time may be too close to the timeout (only one second ahead), so I gave it a delta of 10 seconds instead.
PS: For context, you can check the failure here: https://cirrus-ci.com/task/4624744360706048?logs=ci#L6520. See how the peer is given the final notice by sending a getheaders message, but the check is performed before the peer gets disconnected.
I've been looking more into this, given an error showed up again on CI. I think the issue arose from the initial mock time not being properly set. I've reset the increments to be just one second apart, plus set the initial mock time before the tests were run.
Also, I've moved the outbound eviction tests to their own file, given p2p_eviction was being described as testing inbound eviction, plus used some custom classes that were only relevant for inbounds.
(now non-stale) ACK for f06b50dfd45000f34bf258d48101718b4b5f9772.
Pulled. Built. Ran p2p_eviction.py and p2p_outbound_eviction.py. Both passed.
in
test/functional/p2p_outbound_eviction.py:84
in
f06b50dfd4outdated
79 | + # Test that if the peer never catches up with our current tip, but it does with the 80 | + # expected work that we set when setting the timer (that is, our tip at the time) 81 | + # we do not disconnect the peer 82 | + test_node = node.add_outbound_p2p_connection(P2PInterface(), p2p_idx=0, connection_type="outbound-full-relay") 83 | + self.log.info("Mine a block so our peer starts lagging") 84 | + best_block_hash = self.generateblock(self.nodes[0], output="raw(42)", transactions=[])["hash"]
cbergqvist
commented at 1:22 PM on February 22, 2024:
self.nodes[0] => node here and in a few other places.
in
test/functional/p2p_outbound_eviction.py:67
in
f06b50dfd4outdated
62 | + 63 | + # Connect a peer and make it send us headers ending in our tip's parent 64 | + test_node = node.add_outbound_p2p_connection(P2PInterface(), p2p_idx=0, connection_type="outbound-full-relay") 65 | + headers_message = msg_headers() 66 | + headers_message.headers = [prev_header] 67 | + test_node.send_and_ping(headers_message)
cbergqvist
commented at 1:22 PM on February 22, 2024:
# Sending two pings back-to-back, requires that the node calls
# `ProcessMessage` twice, and thus ensures `SendMessages` must have
# been called at least once
For the sake of this test, this means that we are triggering the outbound eviction checks given we mock the node time right before the call to be past the check timeout
cbergqvist
commented at 8:56 AM on February 29, 2024:
Yeah, I saw that comment. Your explanation for the context of this test helps clarify it, thanks!
Calling "sync" (synchronize) on a bitcoin node and having it get the highest tip from it's peers would be less surprising. Maybe "update" or "process" would be better verbs but let's not rename it in this PR.
in
test/functional/p2p_outbound_eviction.py:70
in
f06b50dfd4outdated
cbergqvist
commented at 9:48 PM on February 22, 2024:
How about also verifying whether MaybeSendGetHeaders() is actually sent (or not) from node to test_node by PeerManagerImpl::ConsiderEviction() in relevant cases?
That is already being tested implicitly by sync_with_ping() + wait + check for the disconnection.
By mocking a time passed CHAIN_SYNC_TIMEOUT and syncing (and not getting an error) we test that the connection has not been dropped, therefore node must have followed to send the get headers message.
I'll add it for clarity anyway
cbergqvist
commented at 10:08 PM on February 22, 2024:
contributor
Reviewed f06b50d. Appreciate the additional test coverage!
cbergqvist
commented at 10:15 PM on February 22, 2024:
contributor
Couldn't reproduce the CI timeout issue btw. But I have encountered the warning:
2024-02-22T22:01:47.354000Z TestFramework.p2p (WARNING): Connection lost to 0:0 due to [Errno 104] Connection reset by peer
in
src/net_processing.cpp:5141
in
f06b50dfd4outdated
5141 | - } else if (state.m_chain_sync.m_timeout == 0s || (state.m_chain_sync.m_work_header != nullptr && state.pindexBestKnownBlock != nullptr && state.pindexBestKnownBlock->nChainWork >= state.m_chain_sync.m_work_header->nChainWork)) {5142 | - // Our best block known by this peer is behind our tip, and we're either noticing5143 | - // that for the first time, OR this peer was able to catch up to some earlier point5144 | - // where we checked against our tip.5145 | - // Either way, set a new timeout based on current tip.5146 | + } else if (state.m_chain_sync.m_timeout == 0s || (state.m_chain_sync.m_work_header != nullptr && state.pindexBestKnownBlock != nullptr && state.pindexBestKnownBlock->nChainWork >= state.m_chain_sync.m_work_header->nChainWork)) {\
marcofleon
commented at 12:30 AM on February 23, 2024:
nit: It looks like the backslash at the end of this line is a typo, yes? Could be removed for clarity.
cbergqvist
commented at 10:03 PM on February 28, 2024:
I assumed it was some super-rare syntax for making tooling understand the comment below applied to the line before it. :D
marcofleon
commented at 12:55 AM on February 23, 2024:
contributor
ACK. Well done on the new test. Seems like a good number of cases are included. Only thing I commented on was the backslash nit in net_processing.cpp.
No issues or warnings when built and tested. Tests p2p_eviction.py and p2p_outbound_eviction.py ran successfully.
sr-gi force-pushed on Feb 28, 2024
sr-gi
commented at 9:11 PM on February 28, 2024:
member
Thanks for the reviews @cbergqvist and @marcofleon, I've addressed the comments of both.
I also squashed the two commits from @tdb3 given looks like no one ha opposed to those
cbergqvist approved
cbergqvist
commented at 8:58 AM on February 29, 2024:
contributor
ACKbffbb17!
(Diffed test/functional/p2p_outbound_eviction.py against previous implementation and ran it a couple of times).
DrahtBot requested review from marcofleon on Feb 29, 2024
DrahtBot requested review from tdb3 on Feb 29, 2024
DrahtBot removed review request from marcofleon on Feb 29, 2024
DrahtBot removed review request from tdb3 on Feb 29, 2024
DrahtBot requested review from tdb3 on Feb 29, 2024
DrahtBot requested review from marcofleon on Feb 29, 2024
DrahtBot removed review request from tdb3 on Feb 29, 2024
DrahtBot removed review request from marcofleon on Feb 29, 2024
DrahtBot requested review from tdb3 on Feb 29, 2024
DrahtBot requested review from marcofleon on Feb 29, 2024
davidgumberg
commented at 7:02 AM on March 1, 2024:
contributor
What do you think about testing that block-relay-only nodes get evicted and aren't eligible for protection?
For example:
# Ensure that block-relay-only peers are not granted protection and ARE evicted
self.log.info("Create an outbound connection to a peer that shares our tip but is block-relay only, so it does not get protection")
test_node = node.add_outbound_p2p_connection(P2PInterface(), p2p_idx=1, connection_type="block-relay-only")
test_node.send_and_ping(msg_headers([from_hex(CBlockHeader(), node.getblockheader(node.getbestblockhash(), False))]))
self.log.info("Mine a new block and sync with our peer")
self.generateblock(node, output="raw(42)", transactions=[])
test_node.sync_with_ping()
self.log.info("Let enough time pass for the timeouts to go off")
# Trigger the timeouts and check how we are still connected
cur_mock_time += (CHAIN_SYNC_TIMEOUT + 1)
node.setmocktime(cur_mock_time)
test_node.sync_with_ping()
test_node.wait_for_getheaders()
cur_mock_time += (HEADERS_RESPONSE_TIME + 1)
node.setmocktime(cur_mock_time)
self.log.info("Test that the block-relay-only peer gets evicted")
test_node.wait_for_disconnect()
Otherwise, reviewed ConsiderEviction and the tests added here provide good coverage. I ran the p2p_outbound_eviction.py tests and they passed, I was unable to reproduce the connection reset mentioned above.
TL;DR; Protected peers imply outbound-full-relay, but given this seems to be a recurring question I'll add it to the test. I guess it wouldn't hurt having it to make sure we don't change the logic without noticing it.
DrahtBot removed review request from marcofleon on Mar 1, 2024
DrahtBot requested review from marcofleon on Mar 1, 2024
sr-gi force-pushed on Mar 1, 2024
sr-gi force-pushed on Mar 1, 2024
sr-gi
commented at 4:19 PM on March 1, 2024:
member
I've added a test to cover for @tdb3 and @davidgumberg comments on non-outbound-full-relay peers eviction, plus amended the original commit to rename test_outbound_eviction_mixed and added a missing comment to the method.
DrahtBot
commented at 5:59 PM on March 1, 2024:
contributor
<!--85328a0da195eb286784d51f73fa0af9-->
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
I could be missing something here but the new test test_outbound_eviction_blocks_relay_only isn't included in run_test. It does look good from just reading it though :)
You're right, this may have been removed while rebasing. Nice catch!
This should be fixed now
sr-gi force-pushed on Mar 1, 2024
marcofleon approved
marcofleon
commented at 1:27 AM on March 2, 2024:
contributor
re-ACK33353b7723d0704d64c0dfc1b0f5d072692e68b9. I've reviewed the new changes, focusing on the added negative test. I think it's a valuable addition, expanding the test's coverage of different scenarios in the network. I built and ran p2p_outbound_eviction multiple times and observed no issues or warnings.
Well done on addressing the feedback and enhancing the test suite.
DrahtBot requested review from tdb3 on Mar 2, 2024
DrahtBot requested review from cbergqvist on Mar 2, 2024
in
test/functional/p2p_outbound_eviction.py:213
in
33353b7723outdated
I don't think that's the case. In the previous commit this was the last test so there was no need to manually disconnect the nodes. This is only needed after the last commit to start fresh
davidgumberg
commented at 10:50 PM on April 25, 2024:
(still nit) On second thought it seems a bad idea to depend on the order in which tests are called for deciding whether or not to perform cleanup. I think that might be kind of a footgun for anyone modifying the test in the future (more relevant for test_outbound_eviction_blocks_relay_only)
DrahtBot removed review request from tdb3 on Mar 2, 2024
DrahtBot removed review request from cbergqvist on Mar 2, 2024
DrahtBot requested review from cbergqvist on Mar 2, 2024
DrahtBot requested review from tdb3 on Mar 2, 2024
DrahtBot requested review from davidgumberg on Mar 2, 2024
cbergqvist approved
cbergqvist
commented at 12:14 PM on March 2, 2024:
contributor
ACK33353b7.
Diffed p2p_outbound_eviction.py against previously acked version.
Ran the new version a few times, switched the node type from block-relay-only to outbound-full-relay and confirmed that wait_for_disconnect() times out (decreased timeout to 10s locally, but should still be more than enough).
DrahtBot removed review request from tdb3 on Mar 2, 2024
DrahtBot removed review request from davidgumberg on Mar 2, 2024
DrahtBot requested review from tdb3 on Mar 2, 2024
DrahtBot requested review from davidgumberg on Mar 2, 2024
davidgumberg
commented at 1:26 AM on March 9, 2024:
contributor
Tested by modifying bool IsOutboundOrBlockRelayConn(), and CHAIN_SYNC_TIMEOUT and this test catches. It would be nice if we also had a test that made sure we don't punish peers and set the timeout to HEADERS_RESPONSE_TIMEbefore we've given them CHAIN_SYNC_TIMEOUT to catch up, but I think this PR looks great as it stands!
DrahtBot removed review request from tdb3 on Mar 9, 2024
DrahtBot removed review request from davidgumberg on Mar 9, 2024
DrahtBot requested review from tdb3 on Mar 9, 2024
DrahtBot added the label Needs rebase on Mar 11, 2024
sr-gi force-pushed on Mar 11, 2024
sr-gi
commented at 3:11 PM on March 11, 2024:
member
Rebased to deal with merge conflicts
DrahtBot removed the label Needs rebase on Mar 11, 2024
davidgumberg
commented at 2:03 AM on March 12, 2024:
contributor
$ git range-diff a945f09...59affd0 015ac13...cd5b7a1
1: a12a4dfa90 = 1: 8fa272dccc test: adds outbound eviction functional tests, updates comment in ConsiderEviction
2: 59affd0bf9 = 2: cd5b7a11ee test: adds outbound eviction tests for non outbound-full-relay peers
$ python ./test/functional/test_runner.py
ALL | ✓ Passed | 1350 s (accumulated)
DrahtBot requested review from cbergqvist on Mar 16, 2024
DrahtBot requested review from marcofleon on Mar 16, 2024
cbergqvist approved
cbergqvist
commented at 9:38 PM on March 16, 2024:
contributor
reACKcd5b7a11ee55b1f86bb7662c4fc83616ffa735b1
(Diff didn't change). Built & ran functional tests.
sipa
commented at 4:26 PM on March 17, 2024:
member
Concept ACK
tdb3
commented at 12:07 AM on March 18, 2024:
contributor
re ACK for cd5b7a11ee55b1f86bb7662c4fc83616ffa735b1.
Built and re-ran functional tests locally. All passed.
DrahtBot requested review from sipa on Mar 18, 2024
marcofleon
commented at 7:22 PM on March 18, 2024:
contributor
re-ACKcd5b7a11ee55b1f86bb7662c4fc83616ffa735b1. All functional tests passed.
in
test/functional/p2p_outbound_eviction.py:8
in
8fa272dcccoutdated
0 | @@ -0,0 +1,219 @@ 1 | +#!/usr/bin/env python3 2 | +# Copyright (c) 2019-2021 The Bitcoin Core developers 3 | +# Distributed under the MIT software license, see the accompanying 4 | +# file COPYING or http://www.opensource.org/licenses/mit-license.php. 5 | + 6 | +""" Test node outbound peer eviction logic 7 | + 8 | +A subset of out outbound peers are subject to eviction logic if they cannot keep up
stratospher
commented at 2:53 AM on March 20, 2024:
8fa272d: typo - s/out/our
in
test/functional/p2p_outbound_eviction.py:13
in
8fa272dcccoutdated
8 | +A subset of out outbound peers are subject to eviction logic if they cannot keep up 9 | +with our vision of the best chain. This criteria applies only to non-protected peers, 10 | +and can be triggered by either not learning about any blocks from an outbound peer after 11 | +a certain deadline, or by them not being able to catch up fast enough (under the same deadline). 12 | + 13 | +This tests the different eviction paths based on the peer' behavior and on whether they are protected
stratospher
commented at 2:54 AM on March 20, 2024:
8fa272d: typo - peer's
in
test/functional/p2p_outbound_eviction.py:28
in
8fa272dcccoutdated
in
test/functional/p2p_outbound_eviction.py:44
in
8fa272dcccoutdated
39 | + node = self.nodes[0] 40 | + cur_mock_time = node.mocktime 41 | + 42 | + self.log.info("Create an outbound connection and don't send any headers") 43 | + # Test disconnect due to no block being announced in 22+ minutes (headers are not even exchanged) 44 | + test_node = node.add_outbound_p2p_connection(P2PInterface(), p2p_idx=0, connection_type="outbound-full-relay")
stratospher
commented at 3:06 AM on March 20, 2024:
8fa272d: can we call test_node variable peer instead? add_outbound_p2p_connection returns P2PInterface objects and not a TestNode object like test_node implies.
Yeah, I see your point. The rationale was that it was the node being tested, but that's indeed confusing given TestNode, I'll rename it to peer.
in
test/functional/p2p_outbound_eviction.py:46
in
8fa272dcccoutdated
41 | + 42 | + self.log.info("Create an outbound connection and don't send any headers") 43 | + # Test disconnect due to no block being announced in 22+ minutes (headers are not even exchanged) 44 | + test_node = node.add_outbound_p2p_connection(P2PInterface(), p2p_idx=0, connection_type="outbound-full-relay") 45 | + # Wait for over 20 min to trigger the first eviction timeout. This sets the last call past 2 min in the future. 46 | + test_node.sync_with_ping()
stratospher
commented at 3:12 AM on March 20, 2024:
8fa272d: add_outbound_p2p_connection calls sync_with_ping() internally. do we need to do it again?
Oh no, I meant I don't think we need to do it again 😆
in
test/functional/p2p_outbound_eviction.py:99
in
cd5b7a11eeoutdated
94 | + # Send a header with the old tip 95 | + test_node.send_and_ping(msg_headers([tip_header])) 96 | + tip_header = from_hex(CBlockHeader(), node.getblockheader(best_block_hash, False)) 97 | + 98 | + # Check that we are not evicted 99 | + test_node.wait_for_getheaders()
stratospher
commented at 1:04 PM on March 22, 2024:
8fa272dc: last_message needs to be explicitly cleared:
with p2p_lock:
test_node.last_message.pop("getheaders", None)
before calling wait_for_getheaders(). otherwise, it will always pass.
Ohh, TIL, I'm guessing this is because the collection is already populated with the getheaders from the interaction i-1 and we find it all the time
stratospher
commented at 8:14 AM on March 25, 2024:
yeah! follow up idea, out of scope for this PR - it'd be awesome to modify wait_for_getheaders() to make sure that it's the same getheaders (which was sent from TestNode's MaybeSendGetHeaders) we're expecting that is received!
stratospher
commented at 12:31 PM on March 25, 2024:
Fixed it by popping from last_message, may open a followup simplifying it after
#18614 gets fixed
in
test/functional/p2p_outbound_eviction.py:158
in
8fa272dcccoutdated
153 | + node = self.nodes[0] 154 | + cur_mock_time = node.mocktime 155 | + 156 | + self.log.info("Create a mix of protected and unprotected outbound connections to check against eviction") 157 | + 158 | + # Lets try this logic having multiple peers, some protected and some unprotected
stratospher
commented at 4:58 AM on March 23, 2024:
8fa272dc: Let's
stratospher
commented at 6:03 AM on March 25, 2024:
contributor
cool test! Concept ACK. left some suggestions, nits and 1 problem.
something I found interesting was when generateblock is called - inv, getdata and block messages are exchanged between node and peer. so in the tests (ex: test_outbound_eviction_unprotected), the peer is sending the header of an older block it already knows about! I don't think it affects the behaviour in the tests. but would like to make sure that this is expected.
sr-gi force-pushed on Mar 25, 2024
sr-gi force-pushed on Mar 25, 2024
sr-gi
commented at 4:05 PM on March 25, 2024:
member
Rebased + addressed comments from @stratospher. I had to slightly reshape test_outbound_eviction_unprotected to fit #29122 (review) given some time mocks + waits were unnecessary
in
test/functional/p2p_outbound_eviction.py:24
in
0f5193199aoutdated
stratospher
commented at 4:09 AM on March 26, 2024:
0f51931: style nit - different lines
in
test/functional/p2p_outbound_eviction.py:46
in
0f5193199aoutdated
39 | + node = self.nodes[0] 40 | + cur_mock_time = node.mocktime 41 | + 42 | + self.log.info("Create an outbound connection and don't send any headers") 43 | + # Test disconnect due to no block being announced in 22+ minutes (headers are not even exchanged) 44 | + peer = node.add_outbound_p2p_connection(P2PInterface(), p2p_idx=0, connection_type="outbound-full-relay")
stratospher
commented at 4:41 AM on March 26, 2024:
0f51931: I think we need to clear the initial getheaders which gets sent every time add_outbound_p2p_connection is done too. this makes test_outbound_eviction_protected fail :(
Yeah, I noticed that while patching your last comments for test_outbound_eviction_unprotected (which already has that change applied). I'm assessing how easy would be to supersede #29318 so this PR does not become super verbose and hacky
I patched this locally on top of the aforementioned PR and looks like all issues are gone :)
tdb3
commented at 3:09 PM on March 28, 2024:
contributor
reACK for 368389d973a85d21aa9ebf030ef5e7f7ac5343de.
Pulled, built, ran all unit/functional tests (all passed).
DrahtBot requested review from cbergqvist on Mar 28, 2024
DrahtBot requested review from marcofleon on Mar 28, 2024
DrahtBot requested review from davidgumberg on Mar 28, 2024
DrahtBot requested review from stratospher on Mar 28, 2024
in
test/functional/p2p_outbound_eviction.py:223
in
368389d973outdated
217 | + # This tests that other types of peers (blocks-relay-only for instance) are not granted protection 218 | + node = self.nodes[0] 219 | + cur_mock_time = node.mocktime 220 | + 221 | + self.log.info("Create an blocks-only outbound connection to a peer that shares our tip. This would usually grant protection") 222 | + test_node = node.add_outbound_p2p_connection(P2PInterface(), p2p_idx=0, connection_type="block-relay-only")
cbergqvist
commented at 11:09 AM on April 22, 2024:
nit: Noticed you renamed test_node -> peer above (based off #29122 (review)) but not inside test_outbound_eviction_blocks_relay_only().
(Was diffing against older head (59affd0bf9042d746a6b32739edb0cd45d28d569)).
Might be worth changing once you rebase on top of #29736.
DrahtBot added the label CI failed on Apr 25, 2024
DrahtBot
commented at 6:21 PM on April 25, 2024:
contributor
<!--85328a0da195eb286784d51f73fa0af9-->
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
It is worth noting that these tests will not catch if outbound eviction logic gets triggered too early. e.g.:
/** How long to wait for a peer to respond to a getheaders request */
static constexpr auto HEADERS_RESPONSE_TIME{0s};
// [...]
/** Timeout for (unprotected) outbound peers to sync to our chainwork */
static constexpr auto CHAIN_SYNC_TIMEOUT{0s};
will pass. That seems fine to me, maybe something for a follow up.
DrahtBot requested review from tdb3 on Apr 25, 2024
DrahtBot removed the label CI failed on Apr 26, 2024
test: adds outbound eviction functional tests, updates comment in ConsiderEvictiona8d9a0edc7
test: adds outbound eviction tests for non outbound-full-relay peers
Peer protection is only given to outbound-full-relay peers. Add a negative
test to check that other type of outbound peers are not given protection under
the circumstances that outbound-full-relay would
d53d848347
sr-gi
commented at 3:20 PM on April 26, 2024:
member
ACKbe2c551 Re-reviewed code, and attempted to break tests by making ConsiderEviction misbehave.
It is worth noting that these tests will not catch if outbound eviction logic gets triggered too early. e.g.:
/** How long to wait for a peer to respond to a getheaders request */
static constexpr auto HEADERS_RESPONSE_TIME{0s};
// [...]
/** Timeout for (unprotected) outbound peers to sync to our chainwork */
static constexpr auto CHAIN_SYNC_TIMEOUT{0s};
will pass. That seems fine to me, maybe something for a follow-up.
I don't think I follow you here. HEADERS_RESPONSE_TIME and CHAIN_SYNC_TIMEOUT are constants hardcoded in net_processing.cpp (and mapped here). I would expect unit test to fail if those are updated.
sr-gi force-pushed on Apr 26, 2024
sr-gi
commented at 3:21 PM on April 26, 2024:
member
I don't think I follow you here. HEADERS_RESPONSE_TIME and CHAIN_SYNC_TIMEOUT are constants hardcoded in net_processing.cpp (and mapped here). I would expect unit test to fail if those are updated.
I didn't mean to say that these tests should be responsible for enforcing the particular values of HEADERS_RESPONSE_TIME and CHAIN_SYNC_TIMEOUT, just that they would not catch a regression where outbound eviction logic gets triggered earlier than expected on peers, which can be demonstrated by setting both values to zero.
Anyways, I think if what I'm saying isn't confused in some way, it is more pedantic than useful.
cbergqvist approved
cbergqvist
commented at 9:53 PM on April 28, 2024:
contributor
ACKd53d84834747c37f4060a9ef379e0a6b50155246
Functional including --extended tests passed (skipped unrelated feature_dbcrash).
tdb3
commented at 3:54 PM on May 5, 2024:
contributor
Re ACK for d53d84834747c37f4060a9ef379e0a6b50155246
Pulled, built, ran all unit/functional tests (all passed).
Performed code review, but didn't see anything noteworthy over previous review.
bitcoin deleted a comment on May 5, 2024
marcofleon
commented at 3:30 PM on May 9, 2024:
contributor
Re ACK. Reviewed the code and the tests look good to me. I ran p2p_outbound_eviction.py individually and along with all the other functional tests and everything looks good.
achow101
commented at 8:13 PM on May 9, 2024:
member
ACKd53d84834747c37f4060a9ef379e0a6b50155246
DrahtBot requested review from marcofleon on May 9, 2024
achow101 merged this on May 9, 2024
achow101 closed this on May 9, 2024
PastaPastaPasta referenced this in commit 9db73d205c on Feb 17, 2025
PastaPastaPasta referenced this in commit cc9330c98e on Feb 17, 2025
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2026-04-15 00:13 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me