dergoegge
commented at 12:34 pm on June 2, 2022:
member
At the moment, the eviction logic is mangled across two different components (CConnman, PeerManager), so we can’t really test it in isolation. This is not completely true for the inbound eviction logic as it exists as static functions in net.{h.cpp} for which tests already exist. However, the outbound eviction logic is not covered by any fuzz tests and is only testable by spinning up both a connman and peerman.
This PR splits out the eviction logic into its own component EvictionManager. In addition to isolating the eviction logic, we get rid of several layer violations (e.g. CConnman::ForEachNode/ForNode calls, CNode::m_last_block_time, etc.) between net and net processing.
One instance of the EvictionManager is created at start up and passed as a reference to the connection and peer managers. The connection and peer managers report all eviction relevant information to the eviction manager who ultimately suggests nodes to evict as the result of EvictionManager::SelectInboundNodeToEvict and EvictionManager::SelectOutboundNodesToEvict.
dergoegge force-pushed
on Jun 2, 2022
DrahtBot added the label
Refactoring
on Jun 2, 2022
DrahtBot
commented at 4:39 pm on June 2, 2022:
contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Reviews
See the guideline for information on the review process.
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#28222 (Use shared_ptr for CNode inside CConnman by willcl-ark)
#28170 (p2p: adaptive connections services flags by furszy)
#27912 (net: disconnect inside AttemptToEvictConnection by willcl-ark)
#27837 (net: introduce block tracker to retry to download blocks after failure by furszy)
#27600 (net: Add new permission forceinbound to evict a random unprotected connection if all slots are otherwise full by pinheadmz)
#27509 (Relay own transactions only via short-lived Tor or I2P connections by vasild)
#27385 (net, refactor: extract Network and BIP155Network logic to node/network by jonatack)
#27213 (p2p: Diversify automatic outbound connections with respect to networks by amitiuttarwar)
#26812 (test: add end-to-end tests for CConnman and PeerManager by vasild)
#26621 (refactor: Continue moving application data from CNode to Peer by dergoegge)
#15218 (validation: Flush state after initial sync by andrewtoth)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
jnewbery
commented at 10:20 am on June 3, 2022:
contributor
I think we need to justify these values for the sake of review.
For example, a default (unset) value for m_min_ping_time could reasonably also be 0, so it’s not clear in review why it’s set to max(). For values that have no real meaning until after version handshake, maybe std::optional would better communicate their state?
I’m gonna try to slim down the NodeEvictionCandidate interface a bit because a lot of these values don’t need to be set by users of the eviction manager. They could just be default initialized internally in the eviction manager. I’m also gonna add some comments explaining the defaults.
I think std::optional would make it a bit to complicated but i’ll see maybe it works out nicely.
I thought about using std::optional here but afaict we would end up treating a std::nullopt value for one of these in the same way that we treat them right now w.r.t. to the default values. The default values currently increase the likelihood of eviction for new peers (e.g. we prefer to disconnect peers with slow ping times, so the default is the max possible value) and i have added a comment to document that.
in
src/net_processing.cpp:2633
in
d8458fabafoutdated
CNodeState::nBlocksInFlight is not removed in this PR so i think it would be confusing to also do the check with the value from the eviction manager. This is one of those cases that i didn’t like because the variables have to be kept in sync. (Arguably it’s pretty easy for this one since nBlocksInFlight is only modified in two places)
in
src/net_processing.cpp:1188
in
ba938cfd15outdated
993@@ -993,6 +994,7 @@ bool PeerManagerImpl::BlockRequested(NodeId nodeid, const CBlockIndex& block, st
994 std::list<QueuedBlock>::iterator it = state->vBlocksInFlight.insert(state->vBlocksInFlight.end(),
995 {&block, std::unique_ptr<PartiallyDownloadedBlock>(pit ? new PartiallyDownloadedBlock(&m_mempool) : nullptr)});
996 state->nBlocksInFlight++;
997+ m_evictionman.AddBlockInFlight(nodeid);
I think this could/should be done outside of m_nodes_mutex for the sake of untangling? I’m not actually sure, but if it wouldn’t break anything IMO it’d be preferable.
We’d need to create a list of nodes to be evicted, then do the removals outside of the lock.
PeerManagerImpl::UpdateLastBlockAnnounceTime is removed later on but i still went with your suggestion here for the commits before that.
theuni changes_requested
theuni
commented at 7:58 pm on June 21, 2022:
member
Thanks so much for working on this. I took a first pass through all commits except for the last few. I’ll need to give those another go with fresh eyes.
Some of these commits are mine, so several of these comments are probably aimed back at myself, but I’m not sparing myself from my review :)
I think we may need to restructure this PR a bit for the sake of readability. Again, I realize that you used my approach to the commit series, but now that I’m trying to review it, it’s not so easy to follow.
I think it may be much more reviewable if instead of adding all functionality to EvictionManager and then dropping the old stuff, they were done at the same time. So for ex in a commit that adds ping time to eviction manager, it should be hooked up and used, and the old logic dropped. Imo that would make this series much much easier to understand.
dergoegge force-pushed
on Jun 22, 2022
dergoegge
commented at 4:26 pm on June 22, 2022:
member
Thanks for the review @theuni! I’ve addressed most of your comments (still working on #25268 (review) and restructuring the commits).
Do you also think it would be good to split this into multiple PRs, like John suggested #25268#pullrequestreview-994948655? I think that would also help with readability.
theuni
commented at 5:50 pm on June 22, 2022:
member
Do you also think it would be good to split this into multiple PRs, like John suggested #25268 (review)? I think that would also help with readability.
Yes, I agree with that. And in that case, as I mentioned above, I’d also request that the commits in the 2nd pull request would immediately use the new functionality as it’s added as opposed to adding it all, then switching it all in the end. To do that, I suspect that many of these current commits will need to be combined/dropped/re-worked. If that’s the case, don’t worry about git attribution or trying to use the original commits, I don’t care about that at all :)
DrahtBot added the label
Needs rebase
on Jun 23, 2022
jamesob
commented at 5:23 pm on June 23, 2022:
member
Is it possible to contextualize the benefit of this relatively large refactor? A lot of code is changing here, and the benefits listed in the PR description (resolution of layer violations) are somewhat abstract.
If this is, for example, more testable, wouldn’t it make sense to bundle the added test coverage with these changes so that potential reviewers are able to see the tangible benefit?
Is there some broader effort that this is part of?
dergoegge
commented at 10:47 am on June 28, 2022:
member
Is it possible to contextualize the benefit of this relatively large refactor?
I would say this PR is similar to #21148, #19988 or #787 in that it splits some logic (e.g. tx request, orphan handling) into its own component (some of the mentioned PRs also changed behavior which this PR does not), which is good for modularity and testability. At the moment, the eviction logic is mangled across two different components (CConnman, PeerManager), so we can’t really test it in isolation. This is not completely true for the inbound eviction logic as it exists as static functions in net but the outbound eviction logic is not covered by any fuzz tests and is only testable by spinning up both a connman and peerman.
I also think that isolating this logic will make it easier to reason about potential future changes to it.
The layer violation resolutions are more of a nice side effect and I should have mentioned/stressed the test/fuzz benefits in the PR description.
If this is, for example, more testable, wouldn’t it make sense to bundle the added test coverage with these changes so that potential reviewers are able to see the tangible benefit?
Good idea i will add some fuzz and isolated testing for the outbound eviction logic.
Is there some broader effort that this is part of?
Pinging @theuni for this question and to see if he has more to add here in general since this was originally his work.
dergoegge force-pushed
on Jun 28, 2022
dergoegge
commented at 5:58 pm on June 28, 2022:
member
Rebased and restructured the commits a bit as suggested here: #25268#pullrequestreview-1014123554.
DrahtBot removed the label
Needs rebase
on Jun 28, 2022
dergoegge force-pushed
on Jun 29, 2022
dergoegge force-pushed
on Jun 30, 2022
dergoegge force-pushed
on Jun 30, 2022
dergoegge force-pushed
on Jul 4, 2022
DrahtBot added the label
Needs rebase
on Jul 4, 2022
dergoegge force-pushed
on Jul 5, 2022
DrahtBot removed the label
Needs rebase
on Jul 5, 2022
theuni
commented at 2:42 pm on July 6, 2022:
member
I think @dergoegge explained well, but I’ll add a bit on top.
The larger effort, which has been improving over the years, is the notion of self-contained subsystems. As it stands, in order to evict a peer, we need to gather the current states of several otherwise-unrelated things. Not only is this slow and error-prone (real-time locking for each individual state capture), it also makes the eviction logic impossible to reason about (and test) on its own.
So the solution/pattern that’s emerged to untangle similar problems in the past is: rather than doing real-time peeks into required subsystems every time we need to do something, instead broadcast the triggering events to a new subsystem that maintains a full view of whatever it needs. In practice, what this means is a producer/consumer pattern where multiple subsystems are notified when something happens, and it’s up to each to decide what to do with those changes. Yes, this may mean that several subsystems hold duplicate values for the same thing, but now they don’t have to cross over each-other for a full view.
Eventually these patterns could be formalized into something like the signals/slots we have elsewhere, which has the nice side-effect of defining triggers that other subsystems may be interested in as well. But as a first step, imo just untangling the layer violations is a worthy goal.
fanquake referenced this in commit
d571cf2d24
on Jul 7, 2022
DrahtBot added the label
Needs rebase
on Jul 7, 2022
dergoegge force-pushed
on Jul 8, 2022
DrahtBot removed the label
Needs rebase
on Jul 8, 2022
dergoegge
commented at 4:41 pm on July 8, 2022:
member
#25500 has been merged, the PR that splits off the next couple commits is #25572.
jnewbery
commented at 10:42 am on July 13, 2022:
contributor
.
DrahtBot added the label
Needs rebase
on Jul 13, 2022
dergoegge force-pushed
on Jul 14, 2022
DrahtBot removed the label
Needs rebase
on Jul 14, 2022
DrahtBot added the label
Needs rebase
on Jul 19, 2022
dergoegge marked this as a draft
on Oct 12, 2022
dergoegge force-pushed
on Oct 28, 2022
DrahtBot removed the label
Needs rebase
on Oct 28, 2022
dergoegge force-pushed
on Oct 28, 2022
DrahtBot added the label
Needs rebase
on Nov 2, 2022
dergoegge force-pushed
on Nov 29, 2022
DrahtBot removed the label
Needs rebase
on Nov 29, 2022
dergoegge
commented at 10:27 am on November 29, 2022:
member
Rebased. This now also includes a fuzz target for EvictionManager, both inbound and outbound logic is being fuzzed.
DrahtBot added the label
Needs rebase
on Nov 30, 2022
dergoegge force-pushed
on Dec 1, 2022
DrahtBot removed the label
Needs rebase
on Dec 1, 2022
dergoegge force-pushed
on Dec 2, 2022
DrahtBot added the label
Needs rebase
on Dec 8, 2022
dergoegge force-pushed
on Jan 24, 2023
DrahtBot removed the label
Needs rebase
on Jan 24, 2023
DrahtBot added the label
Needs rebase
on Mar 11, 2023
dergoegge force-pushed
on Mar 15, 2023
DrahtBot removed the label
Needs rebase
on Mar 15, 2023
DrahtBot added the label
Needs rebase
on Mar 23, 2023
dergoegge force-pushed
on Mar 27, 2023
DrahtBot removed the label
Needs rebase
on Mar 27, 2023
DrahtBot added the label
Needs rebase
on Mar 28, 2023
dergoegge force-pushed
on Mar 28, 2023
DrahtBot removed the label
Needs rebase
on Mar 28, 2023
DrahtBot added the label
Needs rebase
on Apr 11, 2023
dergoegge force-pushed
on Apr 12, 2023
DrahtBot removed the label
Needs rebase
on Apr 12, 2023
DrahtBot added the label
Needs rebase
on Apr 20, 2023
dergoegge force-pushed
on Apr 21, 2023
DrahtBot removed the label
Needs rebase
on Apr 21, 2023
DrahtBot added the label
Needs rebase
on May 10, 2023
dergoegge force-pushed
on May 30, 2023
DrahtBot added the label
CI failed
on May 30, 2023
DrahtBot removed the label
Needs rebase
on May 30, 2023
DrahtBot removed the label
CI failed
on May 31, 2023
DrahtBot added the label
CI failed
on Jul 15, 2023
dergoegge force-pushed
on Jul 24, 2023
DrahtBot added the label
Needs rebase
on Jul 25, 2023
[test] Make all denial of service tests use their own connman/peermane86cd1f977
dergoegge force-pushed
on Jul 26, 2023
[eviction] Create EvictionManager stub981617d338
[eviction] Make EvictionManager keep track of candidates6de4364e8e
[net] Get candidates from eviction manager
We temporarily introduce a method to get a specific
NodeEvictionCandidate from the eviction manager for use in
`CConnman::AttemptToEvictConnection()`. `EvictionManager::GetCandidate`
is removed once we are done moving the remaining eviction state from
`CNode` to the eviction manager.
bc7e7fc60a
[net] Remove unused CNode state
Now that the eviction manager is aware of the keyed net group and
whether or not a peer is prefered for eviction, we can get rid of the
corresponding CNode members, since they are not used elsewhere.
7f7d02cdb6
[eviction] Move ping/block/tx times from CNode to EvictionManager157afa0b00
[eviction] Report services to EvictionManager5a3ecac7c0
[eviction] Move BIP37 tx relay and bloom filter status to EvictionManager98d484706e
[eviction] Move SelectNodeToEvict into EvictionManager3bdaabd27d
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-01-21 09:12 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me