This moves g_peer_map from a global in net_processing.cpp's unnamed namespace to being a member m_peer_map of PeerManager.
net processing: Move peer_map to PeerManager #19910
pull jnewbery wants to merge 4 commits into bitcoin:master from jnewbery:2020-08-peer-plv2 changing 4 files +94 −74-
jnewbery commented at 5:17 PM on September 7, 2020: member
-
jnewbery commented at 5:18 PM on September 7, 2020: member
Requested by @MarcoFalke @sdaftuar and @theuni . Review very much appreciated :pray:
-
in src/rpc/net.cpp:160 in e0b0d0c564 outdated
155 | @@ -156,7 +156,10 @@ static UniValue getpeerinfo(const JSONRPCRequest& request) 156 | for (const CNodeStats& stats : vstats) { 157 | UniValue obj(UniValue::VOBJ); 158 | CNodeStateStats statestats; 159 | - bool fStateStats = GetNodeStateStats(stats.nodeid, statestats); 160 | + bool fStateStats{false}; 161 | + if (node.peerman) {
MarcoFalke commented at 5:25 PM on September 7, 2020:It is currently not possible to not have
node.peerman. I think anEnsure..()likeEnsureMempoolwould be good. It doesn't really make sense to callgetpeerinfowithout a peerman anyway?
jnewbery commented at 6:08 PM on September 7, 2020:I've done something slightly different and checked existence at the top of this function. Let me know what you think.
in src/net_processing.h:130 in e0b0d0c564 outdated
125 | @@ -93,7 +126,14 @@ class PeerManager final : public CValidationInterface, public NetEventsInterface 126 | */ 127 | void Misbehaving(const NodeId pnode, const int howmuch, const std::string& message); 128 | 129 | + /** Get statistics from node state */ 130 | + bool GetNodeStateStats(NodeId nodeid, CNodeStateStats &stats);
MarcoFalke commented at 5:25 PM on September 7, 2020:bool GetNodeStateStats(NodeId nodeid, CNodeStateStats& stats);nit for new code
jnewbery commented at 6:09 PM on September 7, 2020:fixed. Thanks!
MarcoFalke commented at 5:31 PM on September 7, 2020: memberConcept ACK
DrahtBot added the label P2P on Sep 7, 2020DrahtBot added the label RPC/REST/ZMQ on Sep 7, 2020jnewbery force-pushed on Sep 7, 2020jnewbery force-pushed on Sep 7, 2020hebasto commented at 6:34 PM on September 7, 2020: memberConcept ACK.
MarcoFalke removed the label P2P on Sep 7, 2020MarcoFalke removed the label RPC/REST/ZMQ on Sep 7, 2020MarcoFalke added the label Refactoring on Sep 7, 2020practicalswift commented at 7:23 PM on September 7, 2020: contributorConcept ACK: decoupling is good!
in src/net_processing.h:184 in 322e87b603 outdated
175 | @@ -146,6 +176,16 @@ class PeerManager final : public CValidationInterface, public NetEventsInterface 176 | CTxMemPool& m_mempool; 177 | 178 | int64_t m_stale_tip_check_time; //!< Next time to check for stale tip 179 | + 180 | + /** Protects m_peer_map */ 181 | + Mutex m_peer_mutex; 182 | + /** 183 | + * Map of all Peer objects, keyed by peer id. This map is protected 184 | + * by the global m_peer_mutex. Once a shared pointer reference is
hebasto commented at 9:36 AM on September 8, 2020:322e87b6039877ed3457baea93cc87d78f3f4d96
* by the m_peer_mutex. Once a shared pointer reference is
jnewbery commented at 10:32 AM on September 8, 2020:good catch. Fixed.
in src/net_processing.h:199 in 322e87b603 outdated
175 | @@ -146,6 +176,16 @@ class PeerManager final : public CValidationInterface, public NetEventsInterface 176 | CTxMemPool& m_mempool; 177 | 178 | int64_t m_stale_tip_check_time; //!< Next time to check for stale tip 179 | + 180 | + /** Protects m_peer_map */
hebasto commented at 9:37 AM on September 8, 2020:322e87b6039877ed3457baea93cc87d78f3f4d96 nit: This comment duplicates the code :) Consider to drop it.
jnewbery commented at 10:32 AM on September 8, 2020:Normally I'd agree with removing unnecessary commenting, but I think it's useful for every member variable to have a doxygen comment. Happy to go either way if others think it should be dropped.
laanwj commented at 11:14 AM on September 8, 2020:I'm for keeping it, it's better for consistency to have a comment for every one.
hebasto changes_requestedhebasto commented at 9:39 AM on September 8, 2020: memberACK 322e87b6039877ed3457baea93cc87d78f3f4d96, I have reviewed the code and it looks OK.
jnewbery force-pushed on Sep 8, 2020hebasto approvedhebasto commented at 10:46 AM on September 8, 2020: memberre-ACK d350feb2013e8db8a7464ec40670ef609d77e778
in src/interfaces/node.cpp:110 in 6647af9361 outdated
110 | - for (auto& node_stats : stats) { 111 | - std::get<1>(node_stats) = 112 | - GetNodeStateStats(std::get<0>(node_stats).nodeid, std::get<2>(node_stats)); 113 | + if (m_context->peerman) { 114 | + TRY_LOCK(::cs_main, lockMain); 115 | + if (lockMain) {
ariard commented at 2:50 PM on September 8, 2020:Does it change something if this check is moved above at the
m_context->connmanlevel ? You may save one more control flow.
hebasto commented at 4:10 PM on September 8, 2020:Does it change something if this check is moved above at the
m_context->connmanlevel ? You may save one more control flow.Why increase code block with locked
::cs_main?
jnewbery commented at 4:27 PM on September 8, 2020:Does it change something if this check is moved above at the m_context->connman level ? You may save one more control flow.
Yes, it's a try lock. It's possible that we're not able to take the lock, and therefore that we're not able to get the node state stats, but we'd still get the node stats. If we move the try above the m_context->connman level, then if we couldn't get the cs_main lock, we wouldn't be able to get any stats.
Eventually, none of the statistics should require cs_main, so the try lock can be removed entirely.
in src/net_processing.cpp:855 in 6647af9361 outdated
952 | @@ -953,7 +953,7 @@ void PeerManager::FinalizeNode(NodeId nodeid, bool& fUpdateConnectionTime) { 953 | LogPrint(BCLog::NET, "Cleared nodestate for peer=%d\n", nodeid); 954 | } 955 | 956 | -bool GetNodeStateStats(NodeId nodeid, CNodeStateStats &stats) { 957 | +bool PeerManager::GetNodeStateStats(NodeId nodeid, CNodeStateStats &stats) {
ariard commented at 2:52 PM on September 8, 2020:It's a bit confusing to have
GetNodeStateStatswhich is actually a member ofPeerManager, and still have aCNodeState, even if inside this function you effectively fetch it. Maybe renameGetNodeProcessingStats?
jnewbery commented at 4:12 PM on September 8, 2020:I think the name right now is fine. It's getting stats related to the node's state.
ariard commented at 2:56 PM on September 8, 2020: memberConcept ACK
promag commented at 8:14 AM on September 9, 2020: memberConcept ACK.
DrahtBot commented at 1:46 PM on September 19, 2020: member<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
<!--174a7506f384e20aa4161008e828411d-->
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #20295 (rpc: getblockfrompeer by Sjors)
- #20217 (net: Remove g_relay_txes by jnewbery)
- #19858 (Periodically make block-relay connections and sync headers by sdaftuar)
- #19829 (net processing: Move block inventory state to net_processing by jnewbery)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
in src/net_processing.cpp:874 in d350feb201 outdated
870 | @@ -915,8 +871,8 @@ void PeerManager::FinalizeNode(NodeId nodeid, bool& fUpdateConnectionTime) { 871 | PeerRef peer = GetPeerRef(nodeid); 872 | assert(peer != nullptr); 873 | misbehavior = WITH_LOCK(peer->m_misbehavior_mutex, return peer->m_misbehavior_score); 874 | - LOCK(g_peer_mutex); 875 | - g_peer_map.erase(nodeid); 876 | + LOCK(m_peer_mutex);
theuni commented at 8:19 PM on September 22, 2020:Using
GetPeerRef()with an encapsulated lock form_peer_map, then manually locking the private mutex 3 lines later looks strange. It also requires two locks... all the drawbacks of mutex encapsulation without exploiting the benefits :)Since misbehavior changes to the peer will go unobserved after this point anyway, why not just take and remove with a single lock?
Something like (untested):
PeerRef PeerManager::RemovePeer(NodeId id) { PeerRef ret; { LOCK(m_peer_mutex); auto it = m_peer_map.find(id); if (it != m_peer_map.end()) { ret = std::move(it->second); m_peer_map.erase(it); } } return ret; }
jnewbery commented at 9:22 AM on September 24, 2020:Yes, that's much better. Thank you!
I've implemented this within
GetPeerRef()to avoid the code duplication, and also added another commit that tidies upFinalizeNode()a bit more. Let me know what you think.
jnewbery commented at 10:50 AM on September 24, 2020:EDIT: The comment below is wrong. I introduced a different bug when implementing this as Cory points out below.
~oops. There's a bug in the code above which caused this to fail. You can't move the shared_ptr out of the map and then erase it. More precisely, I think something like the following is happening:~
- ~
ret = std::move(it->second);meansretis reset to be a shared pointer toPeer. It steals the refcount fromit->second, sincestd::move(it->second)is an rvalue.std::movedoesn't do anything to the map entry.~ - ~the entry is then erased from the map, decrementing the refcount to zero and destructing the
Peerobject.~
~Removing the
std::movefixed this.~
theuni commented at 5:12 PM on October 5, 2020:This is really uninteresting minutia, but @jnewbery and I happened to get into a discussion/debate about it today so I'm following up with details. Everyone else can safely ignore this as boring.
I don't believe the comment above is correct. If removing the
std::movehelps, I'm wondering if there might be another issue causing problems.- ret = std::move(it->second); means ret is reset to be a shared pointer to Peer. It steals the refcount from it->second, since std::move(it->second) is an rvalue. std::move doesn't do anything to the map entry.
Indeed
std::movedoesn't do anything to the map entry as it's essentially just a cast, but theoperator=does likely (definitely?) reset it. See libc++'s implementation of shared_ptr's move assignment operator, which calls the move constructor, where the rhs is null'd.- the entry is then erased from the map, decrementing the refcount to zero and destructing the Peer object.
We can write a quick test to see that the refcount never hits zero after removal from the map, that copy vs. move refcounts are different as we would expect them to be, and that a shared_ptr moved from a map's value is still safe to use:
#include <cassert> #include <cstdio> #include <map> #include <memory> #include <string> static std::shared_ptr<std::string> copytest() { std::map<char, std::shared_ptr<std::string> > foo; foo.emplace('c', std::make_shared<std::string>("hello")); auto it = foo.find('c'); std::shared_ptr<std::string> ret; if (it != foo.end()) { assert(it->second.use_count() == 1); ret = it->second; assert(it->second != nullptr); // <------------------ assert(ret.use_count() == 2); // <------------------ foo.erase(it); assert(ret.use_count() == 1); } return ret; } static std::shared_ptr<std::string> movetest() { std::map<char, std::shared_ptr<std::string> > foo; foo.emplace('c', std::make_shared<std::string>("world")); auto it = foo.find('c'); std::shared_ptr<std::string> ret; if (it != foo.end()) { assert(it->second.use_count() == 1); ret = std::move(it->second); assert(it->second == nullptr); // <------------------ assert(ret.use_count() == 1); // <------------------ foo.erase(it); assert(ret.use_count() == 1); } return ret; } int main() { auto copyptr = copytest(); assert(copyptr.use_count() == 1); auto moveptr = movetest(); assert(moveptr.use_count() == 1); printf("%s %s!\n", copyptr->c_str(), moveptr->c_str()); }Note that this doesn't actually prove that the map itself is still in working order after the move+removal, but I don't see why that should break anything.
Lastly, we have cases in our code where we move from a map's value. One example: https://github.com/bitcoin/bitcoin/blob/master/src/net.cpp#L1012
tl;dr: If moving a
shared_ptrout of a mapped value causes a problem that a copy doesn't, we should make sure to understand why and potentially audit other places where we do it as well.
theuni commented at 5:20 PM on October 5, 2020:Aha, I think I found the problem. From your previous version of this commit:
... if (it != m_peer_map.end()) { ret = std::move(it->second); if (erase) m_peer_map.erase(it); } return ret;Moving only makes sense when erasing.
jnewbery commented at 9:42 AM on October 8, 2020:Indeed, the failure was caused by that. Sorry for the noise (and for casting aspersions on your code!)
jnewbery force-pushed on Sep 24, 2020in src/net_processing.cpp:867 in 658ca700ff outdated
863 | @@ -864,23 +864,18 @@ void PeerManager::ReattemptInitialBroadcast(CScheduler& scheduler) const 864 | } 865 | 866 | void PeerManager::FinalizeNode(NodeId nodeid, bool& fUpdateConnectionTime) { 867 | - fUpdateConnectionTime = false; 868 | + PeerRef peer = RemovePeer(nodeid);
theuni commented at 7:56 PM on September 24, 2020:This undoes this assumption: #19607 (review) , which I'm still not convinced is a good idea.
jnewbery commented at 8:01 AM on September 25, 2020:ok, removed that commit
jnewbery force-pushed on Sep 25, 2020jnewbery force-pushed on Sep 29, 2020jnewbery added the label P2P on Oct 1, 2020in src/net_processing.cpp:910 in 2ce2ec188a outdated
906 | @@ -909,11 +907,16 @@ void PeerManager::FinalizeNode(NodeId nodeid, bool& fUpdateConnectionTime) { 907 | LogPrint(BCLog::NET, "Cleared nodestate for peer=%d\n", nodeid); 908 | } 909 | 910 | -PeerRef PeerManager::GetPeerRef(NodeId id) 911 | +PeerRef PeerManager::GetPeerRef(NodeId id, bool erase)
theuni commented at 6:58 PM on October 5, 2020:This is (mostly) a style preference, but I'm not a fan of adding a boolean option to substantially change the functionality of a function as opposed to just creating two functions. GetPeerRef() is logically a read-only operation.
It's also easier easy to slip up and make mistakes like this one: #19910 (review) ;)
jnewbery commented at 10:13 AM on October 8, 2020:You got me. I've made it a separate function.
GetPeerRef() is logically a read-only operation.
Good point. I've made it
const(and made the mutexmutable)jnewbery force-pushed on Oct 8, 2020DrahtBot added the label Needs rebase on Oct 19, 2020jnewbery force-pushed on Oct 19, 2020jnewbery commented at 8:10 AM on October 19, 2020: memberRebased
DrahtBot removed the label Needs rebase on Oct 19, 2020jnewbery commented at 11:14 AM on November 19, 2020: memberRebased
jnewbery force-pushed on Nov 19, 2020laanwj added this to the "Blockers" column in a project
promag commented at 12:46 AM on November 30, 2020: memberCode review ACK 3bf59172859b484405321d9778da16e41dc049cc.
This is a simple and nice refactor. Verified moved code.
DrahtBot added the label Needs rebase on Dec 1, 2020jnewbery force-pushed on Dec 1, 2020jnewbery commented at 11:54 AM on December 1, 2020: memberrebased
DrahtBot removed the label Needs rebase on Dec 1, 2020DrahtBot added the label Needs rebase on Dec 2, 2020in src/net_processing.h:147 in aedd418fed outdated
142 | @@ -143,6 +143,10 @@ class PeerManager final : public CValidationInterface, public NetEventsInterface 143 | * May return nullptr if the Peer object can't be found. */ 144 | PeerRef GetPeerRef(NodeId id) const; 145 | 146 | + /** Get a shared pointer to the Peer object and remove it from m_peer_map. 147 | + * May return nullptr if the Peer object can't be found. */
jnewbery commented at 11:53 AM on December 4, 2020:fixed
in src/net_processing.cpp:842 in 6bfcef3797 outdated
841 | + PeerRef ret; 842 | LOCK(m_peer_mutex); 843 | auto it = m_peer_map.find(id); 844 | - return it != m_peer_map.end() ? it->second : nullptr; 845 | + if (it != m_peer_map.end()) ret = it->second; 846 | + return ret;
hebasto commented at 10:41 AM on December 3, 2020:6bfcef3797d4d105180e325033929684f84cfc08
Why are these changes required? What are benefits to return an empty shared pointer instead
nullptr-constructed one? Btw, in all caller places the return value is checked fornullptr, not being empty.
jnewbery commented at 11:54 AM on December 4, 2020:This makes is clear that NRVO can be used, since there's only one return statement. I'm not sure if the previous version would have used copy elision.
theuni commented at 2:54 PM on December 4, 2020:+1 for considering this!
I'd be curious to know if compilers are allowed to rearrange the code before deciding how many return statements there are. I could see this going either way.
jnewbery commented at 12:02 PM on December 7, 2020:Thanks @hebasto! Yes, I think you're right. If I'm reading it correctly, the result of the conditional operator is a prvalue (item 5 in https://en.cppreference.com/w/cpp/language/operator_other#Conditional_operator since 4 doesn't apply), and copy elision is mandatory for returning a prvalue.
I've reverted this.
hebasto commented at 12:05 PM on December 7, 2020:... and copy elision is mandatory for returning a prvalue.
Good to have C++17 :)
hebasto approveddongcarl commented at 4:39 PM on December 3, 2020: memberCode Review ACK aedd418
Private members >> Anon-namespaced statics/functions
Especially when the functionality is strongly intertwined with the class of which it's being made a member.
[net processing] Move GetNodeStateStats into PeerManager a529fd3e3fjnewbery force-pushed on Dec 4, 2020jnewbery commented at 11:55 AM on December 4, 2020: memberRebased and addressed outstanding review comments.
in src/net_processing.cpp:793 in 5a1f08ce1f outdated
789 | @@ -790,11 +790,9 @@ void PeerManager::FinalizeNode(const CNode& node, bool& fUpdateConnectionTime) { 790 | LOCK(cs_main); 791 | int misbehavior{0}; 792 | { 793 | - PeerRef peer = GetPeerRef(nodeid); 794 | + PeerRef peer = RemovePeer(nodeid);
promag commented at 12:00 PM on December 4, 2020:This also avoids duplicate lookup on m_peer_map :+1:
promag commented at 12:02 PM on December 4, 2020: memberCore review ACK 4fe77ae9aab73d8de6c40d504e079e30eff45f91.
Could squash last commit or fix the comment in 5a1f08ce1f4d49fe85b8571cd5da1b1096339a89.
DrahtBot removed the label Needs rebase on Dec 4, 2020in src/net_processing.cpp:794 in 4fe77ae9aa outdated
789 | @@ -842,11 +790,9 @@ void PeerManager::FinalizeNode(const CNode& node, bool& fUpdateConnectionTime) { 790 | LOCK(cs_main); 791 | int misbehavior{0}; 792 | { 793 | - PeerRef peer = GetPeerRef(nodeid); 794 | + PeerRef peer = RemovePeer(nodeid); 795 | assert(peer != nullptr);
theuni commented at 2:31 PM on December 4, 2020:This is a little awkward because of the comment in RemovePeer:
/** Get a shared pointer to the Peer object and remove it from m_peer_map. * May return an empty shared_ptr if the Peer object can't be found. */Might be worth of a comment explaining that at this point the peer is guaranteed to exist, but not afterwards. Then again, that's kinda what the assert means. Shrug.
jnewbery commented at 12:11 PM on December 7, 2020:Yeah, I think it's ok for now. Happy to change this if you decide you feel more strongly about it.
rebroad commented at 8:00 AM on September 15, 2021:I'm getting bitcoind crashing due to this assert. It appears that sometimes FinalizeNode gets called twice (from DeleteNode()). Perhaps nRefCount is being increased between the two occurances. This only seems to happen though when there's a delay between them due to UpdateTip(). It's probably a bug I've introduced somewhere, but just leaving this comment as a "heads up".
jnewbery commented at 9:54 AM on September 15, 2021:Thanks. I expect if the bug was present in master we'd have seen it in CI or from a user report.
theuni approvedhebasto approved[net_processing] Move peer_map to PeerManager ed7e469cee[net processing] Make GetPeerRef const a20ab227863025ca9e77[net processing] Add RemovePeer()
This allows us to avoid repeated locking in FinalizeNode()
jnewbery force-pushed on Dec 7, 2020in src/net_processing.cpp:853 in 3025ca9e77
849 | + if (it != m_peer_map.end()) { 850 | + ret = std::move(it->second); 851 | + m_peer_map.erase(it); 852 | + } 853 | + return ret; 854 | +}
hebasto commented at 12:15 PM on December 7, 2020:3025ca9e7743d9b96c22e9c6ed7ef051dcea7e54 I'm curios is it possible to force the mandatory copy/move elision for the return value here as well?
hebasto approvedtheuni commented at 7:23 PM on December 8, 2020: memberRe-ACK 3025ca9e7743d9b96c22e9c6ed7ef051dcea7e54.
dongcarl commented at 1:11 PM on December 9, 2020: memberRe-ACK 3025ca9
- Disambiguated between nullptr and empty shared_ptr
- Reverted refactoring in PeerManager::GetPeerRef
- Made
Peerconstructorexplicit
fanquake merged this on Dec 9, 2020fanquake closed this on Dec 9, 2020sidhujag referenced this in commit c27878a62e on Dec 9, 2020laanwj removed this from the "Blockers" column in a project
jnewbery deleted the branch on Sep 15, 2021fanquake locked this on Sep 15, 2021
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-05 12:14 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me