dergoegge
commented at 4:39 pm on July 8, 2022:
member
This PR splits off the next couple commits from #25268 that introduce the EvictionManager and use it for the inbound eviction logic.
One instance of the EvictionManager is created at start up and passed as a reference to the connection and peer managers. The connection and peer managers report all eviction relevant information (for inbound connections) to the eviction manager who ultimately suggests nodes to evict as the result of EvictionManager::SelectNodeToEvict.
fanquake added the label
P2P
on Jul 8, 2022
fanquake added the label
Refactoring
on Jul 8, 2022
in
src/node/eviction.cpp:255
in
964c048e03outdated
250+ bool noban, ConnectionType conn_type)
251+{
252+ LOCK(m_candidates_mutex);
253+ // The default values used to initialise the eviction candidates result in
254+ // newer candidates being more likely to be evicted.
255+ NodeEvictionCandidate candidate{
I wonder if it might make sense to split NodeEvictionCandidate into 2 parts: the const properties that don’t change (like conn_type/noban) and the non-const state that’s intended to be/might be updated?
I think that might be more straightforward to use as it’s self-documenting, while also allowing for safe lock-free access to the stuff that can’t change.
Maybe that’s easier said than done though. Thoughts?
I tried to do this but the src/test/net_peer_eviction_tests.cpp tests sadly rely on no members of NodeEvictionCandidate being const. Would like to leave this for a follow up, since it requires changing those tests quite a bit.
Hmm, I’d say if the tests rely on non-const values, then the tests need to be reworked somewhat (I assume they’re just re-using existing candidates as opposed to creating new ones for the sake of simplicity). Either that or they’re testing state-changes that aren’t possible.
Agree that it’s not worth refactoring the tests in this PR, but could you maybe organize the members and add a note to the definition of NodeEvictionCandidate to make the possible state changes clear? I think there are basically 3 types, though I may be oversimplifying:
These never change after creation
These may change once
These may change at any time
dergoegge
commented at 1:37 pm on September 26, 2022:
I have been looking a bit more at making the const NodeEvictionCandidate members const and it turns out that it is not just a test problem at the moment.
The eviction logic it self relies on non-constness due to multiple std::sort, vector::erase and std::remove_if calls. Will look at making a separate PR for this.
In the mean time i will add some comments in this PR like you suggested.
theuni
commented at 7:34 pm on July 8, 2022:
member
Concept ACK.
Added a high-level question about NodeEvictionCandidate to kick things off, will continue with review.
DrahtBot
commented at 10:31 pm on July 8, 2022:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Reviews
See the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#28222 (Use shared_ptr for CNode inside CConnman by willcl-ark)
#28170 (p2p: adaptive connections services flags by furszy)
#27912 (net: disconnect inside AttemptToEvictConnection by willcl-ark)
#27837 (net: introduce block tracker to retry to download blocks after failure by furszy)
#27600 (net: Add new permission forceinbound to evict a random unprotected connection if all slots are otherwise full by pinheadmz)
#27509 (Relay own transactions only via short-lived Tor or I2P connections by vasild)
#27385 (net, refactor: extract Network and BIP155Network logic to node/network by jonatack)
#27213 (p2p: Diversify automatic outbound connections with respect to networks by amitiuttarwar)
#26812 (test: add end-to-end tests for CConnman and PeerManager by vasild)
#26621 (refactor: Continue moving application data from CNode to Peer by dergoegge)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
in
src/bench/peer_eviction.cpp:6
in
2f93a9fc87outdated
Cory originally suggested to it this way to untangle the removal from m_nodes_mutex. (See #25268 (review))
I think it could make sense to do it in the Delete disconnected nodes loop below which is also independent of m_node_mutex. Another option would be to do it in PeerManagerImpl::FinalizeNode() but that might be weird since we add candidates in net.
I can’t see any reason why this call would have to be done outside the m_nodes_mutex. EvictionManager should never call into net, so there isn’t any chance of a lock inversion. Besides, there are other EvictionManager methods (eg the Get...() methods) which are called while holding m_nodes_mutex.
0 /** A ping-pong round trip has completed successfully. Update latest ping time. */
In fact, m_last_ping_time may as well be moved into Peer since it’s not used in any net level logic and is only for RPC/GUI stats. Could be done as a follow-up.
Once the outbound eviction logic has also been moved into EvictionManager, can these three getters be combined into a single GetEvictionCandidateStats() or similar, since they’ll only be used for reporting stats to the user. If you think that’s a good idea, perhaps add that to the function comment for now.
in
src/node/eviction.cpp:301
in
2f93a9fc87outdated
in
src/test/net_peer_eviction_tests.cpp:5
in
2f93a9fc87outdated
1@@ -2,6 +2,7 @@
2 // Distributed under the MIT software license, see the accompanying
3 // file COPYING or http://www.opensource.org/licenses/mit-license.php.
45+#include <node/eviction_impl.h>
Is there any reason why this and ProtectEvictionCandidatesByRatio() aren’t members of EvictionManagerImpl? I’m not saying they have to be, but I’m interested in the design choice.
This is something that @theuni originally also had on his branch and i just went with the same design. To me this seems good because some of the tests purely test these functions without needing the eviction manager and adding them as members would require more test refactoring.
I think that test refactoring would probably be a good thing, since it would ensure that the unit test is not testing a state that EvictionManager can’t get into in the product code. In general, that seems like a good model - unit tests use the public methods to update the state of the object being tested, but optionally also have access to const test-only methods to peek inside the state of the object if it’s difficult to expose that state in other ways.
Certainly not required in this PR. Could be done in a follow-up (or not at all).
jnewbery
commented at 10:43 am on July 13, 2022:
contributor
Big concept ACK!
Minor cosmetic comments inline.
DrahtBot added the label
Needs rebase
on Jul 13, 2022
dergoegge force-pushed
on Jul 14, 2022
DrahtBot removed the label
Needs rebase
on Jul 14, 2022
DrahtBot added the label
Needs rebase
on Jul 19, 2022
dergoegge force-pushed
on Jul 27, 2022
dergoegge
commented at 2:25 pm on July 27, 2022:
member
Rebased.
DrahtBot removed the label
Needs rebase
on Jul 27, 2022
DrahtBot added the label
Needs rebase
on Aug 3, 2022
dergoegge force-pushed
on Aug 3, 2022
DrahtBot removed the label
Needs rebase
on Aug 3, 2022
theuni
commented at 4:26 pm on August 26, 2022:
member
Needs a quick linter fix.
DrahtBot added the label
Needs rebase
on Aug 26, 2022
dergoegge force-pushed
on Sep 8, 2022
DrahtBot removed the label
Needs rebase
on Sep 8, 2022
dergoegge
commented at 5:14 pm on September 9, 2022:
member
This has been rebased (including a small linter fix) and is ready for review again!
in
src/node/eviction.cpp:253
in
82eb8d34d5outdated
248+ uint64_t keyed_net_group, bool prefer_evict,
249+ bool is_local, Network network,
250+ bool noban, ConnectionType conn_type)
251+{
252+ LOCK(m_candidates_mutex);
253+ // The default values used to initialise the eviction candidates result in
naumenkogs
commented at 8:00 am on September 12, 2022:
Why don’t go further and summarize which parameters actually matter? Or just mention why it is so
dergoegge
commented at 5:09 pm on October 18, 2022:
Added some comments for these values.
naumenkogs
commented at 8:17 am on September 12, 2022:
member
Concept ACK, light code review ACK29c853614ac7ffbd6ab6c009f297f92149843e32.
This is a great refactor.
fanquake requested review from theuni
on Sep 12, 2022
in
src/test/denialofservice_tests.cpp:51
in
535c5040a1outdated
44@@ -45,10 +45,19 @@ BOOST_FIXTURE_TEST_SUITE(denialofservice_tests, TestingSetup)
45 // work.
46 BOOST_AUTO_TEST_CASE(outbound_slow_chain_eviction)
47 {
48- ConnmanTestMsg& connman = static_cast<ConnmanTestMsg&>(*m_node.connman);
49- // Disable inactivity checks for this test to avoid interference
50- connman.SetPeerConnectTimeout(99999s);
51- PeerManager& peerman = *m_node.peerman;
52+ auto evictionman = std::make_unique<EvictionManager>();
theuni
commented at 1:57 pm on September 15, 2022:
Why is this switched away from using m_node? Not that I object, it seems reasonable to me.
If it’s just because “it’s the only use of m_node.connman in these tests”, maybe break that change out as a separate commit?
dergoegge
commented at 3:18 pm on September 20, 2022:
You actually had this in your original branch and I just kept it this way but it made sense to me since all the other (dos) tests do it this way as well (i.e. “it’s the only use of m_node.connman in these tests”).
Added a commit that splits out this change.
DrahtBot added the label
Needs rebase
on Sep 20, 2022
dergoegge force-pushed
on Sep 20, 2022
dergoegge force-pushed
on Sep 20, 2022
DrahtBot removed the label
Needs rebase
on Sep 20, 2022
DrahtBot added the label
Needs rebase
on Oct 17, 2022
dergoegge force-pushed
on Oct 18, 2022
dergoegge
commented at 5:08 pm on October 18, 2022:
member
Rebased and added some docs for the initial NodeEvictionCandidate values.
DrahtBot removed the label
Needs rebase
on Oct 18, 2022
dergoegge force-pushed
on Oct 28, 2022
dergoegge force-pushed
on Oct 28, 2022
DrahtBot added the label
Needs rebase
on Nov 2, 2022
dergoegge force-pushed
on Nov 26, 2022
dergoegge force-pushed
on Nov 28, 2022
DrahtBot removed the label
Needs rebase
on Nov 28, 2022
dergoegge force-pushed
on Nov 28, 2022
dergoegge
commented at 9:33 pm on November 28, 2022:
member
Finally got around to adding a fuzz target for the Eviction Manager. The testing oracle is pretty dumb right now, gonna think about how that can be improved… The oracle could also be improved on later.
DrahtBot added the label
Needs rebase
on Nov 30, 2022
dergoegge force-pushed
on Dec 1, 2022
DrahtBot removed the label
Needs rebase
on Dec 1, 2022
dergoegge force-pushed
on Dec 2, 2022
DrahtBot added the label
Needs rebase
on Mar 11, 2023
dergoegge force-pushed
on Mar 15, 2023
DrahtBot removed the label
Needs rebase
on Mar 15, 2023
DrahtBot added the label
Needs rebase
on Mar 23, 2023
dergoegge force-pushed
on Mar 27, 2023
DrahtBot removed the label
Needs rebase
on Mar 27, 2023
DrahtBot added the label
Needs rebase
on Mar 28, 2023
dergoegge force-pushed
on Mar 28, 2023
DrahtBot removed the label
Needs rebase
on Mar 28, 2023
fanquake referenced this in commit
53eb4b7a21
on Apr 11, 2023
DrahtBot added the label
Needs rebase
on Apr 11, 2023
sidhujag referenced this in commit
4fe31e65c8
on Apr 11, 2023
dergoegge force-pushed
on Apr 12, 2023
DrahtBot removed the label
Needs rebase
on Apr 12, 2023
in
src/net_processing.cpp:488
in
dccf022242outdated
490@@ -490,7 +491,7 @@ struct CNodeState {
491 class PeerManagerImpl final : public PeerManager
492 {
493 public:
494- PeerManagerImpl(CConnman& connman, AddrMan& addrman,
495+ PeerManagerImpl(CConnman& connman, AddrMan& addrman, EvictionManager& evictionman,
Why not just use the evictionman from connman directly (or have connman use its peerman’s eviction manager)? Is there any reason to ever want them to have different eviction managers?
Having done that, why not have CConnman (or PeerManager) create (and own) its own eviction manager, rather than needing one to be passed in?
dergoegge
commented at 10:31 am on April 13, 2023:
I think it makes sense to not have the eviction manager be owned by another component, in the context of also moving the outbound eviction logic to the eviction manager. I see it as more of a standalone component like the addrman or banman.
This might also give us flexibility in mocking it if we wanted to for testing purposes (e.g. to test that net processing and net are updating the stats correctly, or have a FuzzedEvictionManager to let the fuzzer choose which peers to evict).
in
src/node/eviction.cpp:294
in
14f06a2ea5outdated
Repeatedly looking up the eviction manager map and locking and unlocking access to it, and introducing the potential for maps to be out of sync and having to add asserts to check for that doesn’t seem ideal…
Repeatedly looking up the eviction manager map and locking and unlocking access to it
Two things:
There are very few full outbound or block relay only connections, so this happens at most maybe 3 or 4 times for the block relay only loop and 11 or 12 times for the full outbound one.
The outbound eviction logic is also moved to the eviction manager in a follow up (see #25268), which removes this repeated locking.
and introducing the potential for maps to be out of sync and having to add asserts to check for that doesn’t seem ideal…
I don’t remember what I was thinking at the time but now this Assert doesn’t seem great to me either. I think this is safe because ForEachNode locks m_nodes_mutex and only iterates over nodes that completed the version handshake but that is quite the spaghetti assumption.
Simply using the default value for the last block time if the eviction manager doesn’t know about this peer would probably be the right thing to do. And then without the assertions, I wouldn’t be concerned about the maps (assuming your are talking about the peer map and the eviction manager map) being out of sync.
(This also resolves itself if we end up moving the outbound eviction logic to the eviction manager as well, because then there is only one internal eviction map to think about)
dergoegge
commented at 10:50 am on April 13, 2023:
member
Could use Assume and continue in order to avoid the assert.
in
src/net.h:408
in
00230350cdoutdated
403@@ -404,7 +404,6 @@ class CNode
404 * from the wire. This cleaned string can safely be logged or displayed.
405 */
406 std::string cleanSubVer GUARDED_BY(m_subver_mutex){};
407- const bool m_prefer_evict{false}; // This peer is preferred for eviction.
This removes the check node->fDisconnect, which I think could be dangerous:
We set fDisconnect in AttemptToEvictConnection(), but don’t immediately disconnect. The actual cleanup happens at some later point, when ThreadSocketHandler calls DisconnectNodes().
So if we have an attacker that makes new connections to us rapidly, there would be a race between ThreadSocketHandler and the attacker making the next connection to us. If the attacker wins this race, we could accept two inbound connections, but, without the check that is being removed here, AttemptToEvictConnection() could mark the same inbound peer for eviction twice, so that we’ll actually only evict one peer. Since inbound eviction is triggered only in CreateNodeFromAcceptedSocket() (and not through some regularly scheduled job), this could lead to us having an extra inbound peer above our limit indefinitely.
Now if the attacker repeats this scheme many times, they could make thousands of inbound connections to us / exhaust our resources / crash our node.
mzumsande
commented at 11:51 pm on April 17, 2023:
hmm, thinking more about this, I didn’t see the whole picture, because everything happens in ThreadSocketHandler - after CConnman::SocketHandler() -> CConnman::SocketHandlerListening(), CConnman::DisconnectNodes() is called, so it’s not possible to accept new connections in between - however, CConnman::SocketHandlerListening() calls AcceptConnection() in a loop for multiple ListenSockets, so if we have multiple listening sockets and an attacker would connect to more than one at the same time, the scenario above may still be possible?
DrahtBot added the label
Needs rebase
on Apr 20, 2023
dergoegge force-pushed
on Apr 21, 2023
dergoegge
commented at 9:20 am on April 21, 2023:
member
Rebased, have not addressed Martin’s comments yet
DrahtBot removed the label
Needs rebase
on Apr 21, 2023
DrahtBot added the label
Needs rebase
on May 10, 2023
dergoegge force-pushed
on May 30, 2023
DrahtBot removed the label
Needs rebase
on May 30, 2023
DrahtBot added the label
CI failed
on May 30, 2023
DrahtBot removed the label
CI failed
on May 31, 2023
DrahtBot added the label
CI failed
on Jul 15, 2023
maflcko
commented at 8:43 am on July 21, 2023:
member
Needs rebase if still relevant
dergoegge force-pushed
on Jul 24, 2023
DrahtBot added the label
Needs rebase
on Jul 25, 2023
[test] Make all denial of service tests use their own connman/peermane86cd1f977
dergoegge force-pushed
on Jul 26, 2023
[eviction] Create EvictionManager stub981617d338
[eviction] Make EvictionManager keep track of candidates6de4364e8e
[net] Get candidates from eviction manager
We temporarily introduce a method to get a specific
NodeEvictionCandidate from the eviction manager for use in
`CConnman::AttemptToEvictConnection()`. `EvictionManager::GetCandidate`
is removed once we are done moving the remaining eviction state from
`CNode` to the eviction manager.
bc7e7fc60a
[net] Remove unused CNode state
Now that the eviction manager is aware of the keyed net group and
whether or not a peer is prefered for eviction, we can get rid of the
corresponding CNode members, since they are not used elsewhere.
7f7d02cdb6
[eviction] Move ping/block/tx times from CNode to EvictionManager157afa0b00
[eviction] Report services to EvictionManager5a3ecac7c0
[eviction] Move BIP37 tx relay and bloom filter status to EvictionManager98d484706e
[eviction] Move SelectNodeToEvict into EvictionManager3bdaabd27d
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2024-12-19 21:12 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me