This continues the work of moving application layer data into net_processing, by moving all block inventory state into the new Peer object added in #19607.
For motivation, see #19398.
@@ -488,6 +488,9 @@ struct Peer {
     /** Whether this peer should be disconnected and marked as discouraged (unless it has the noban permission). */
    bool m_should_discourage GUARDED_BY(m_misbehavior_mutex){false};

+    /** Starting height of this peer */
+    std::atomic<int> nStartingHeight{-1};
@@ -928,9 +922,6 @@ class CNode
     // m_tx_relay == nullptr if we're not relaying transactions with this peer
     std::unique_ptr<TxRelay> m_tx_relay;

-    // Used for headers announcements - unfiltered blocks to relay
What’s the distinction you lose by dropping “unfiltered blocks to relay”?
I guess this is a reference to the sort done at headers announcement, L4218 of src/net_processing.cpp: https://github.com/bitcoin/bitcoin/blob/1cf73fb8eba3b89a30d5e943deaec5052a8b21b7/src/net_processing.cpp#L4218
Before announcing headers, we check both against the peer’s context (“Does this header connect to the peer’s known tip?”) and our own context (“Is this header part of the best chain?”). It wasn’t a clear comment, so yes, either drop it or extend it?
+    RecursiveMutex m_block_inv_mutex;
+    /** List of block ids we still have announce.
+     * There is no final sorting before sending, as they are always sent
+     * immediately and in the order requested. */
+    std::vector<uint256> vInventoryBlockToSend GUARDED_BY(m_block_inv_mutex);
+    /** List of headers to relay */
vBlockHashesToAnnounce is a list of block hashes, not headers? With regards to the renaming follow-up commit, I would lean towards the most verbose option m_blockids_for_inv_relay/m_blockids_for_headers_relay to denote that these vectors contain an identifier, even if the end goal is to relay the identifiee (either block or headers).
Ok, I’ll update the comment to “List of block ids to announce headers for”.
I would lean towards the most verbose option m_blockids_for_inv_relay/m_blockids_for_headers_relay
Sure. I’ll update.
@@ -488,6 +488,14 @@ struct Peer {
     /** Whether this peer should be disconnected and marked as discouraged (unless it has the noban permission). */
     bool m_should_discourage GUARDED_BY(m_misbehavior_mutex){false};

+    /** Protects block inventory data members */
+    RecursiveMutex m_block_inv_mutex;
Can the RecursiveMutex type be avoided? (as there was Mutex cs_inventory;)
cs_inventory was a RecursiveMutex when I opened this PR, and I missed the change when I rebased. Fixed.
+            peer_copies.push_back(peer.second);
+        }
+    }
+
+    for (auto&& peer : peer_copies) {
+        func(peer);
Is it safe to call func on potentially stale peer_copies?
peer_copies holds shared_ptr references, so the objects pointed to can’t be destructed while it’s in scope.
If another thread has already removed a peer from the g_peer_map, can we rely on the result of func(peer)?
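For reference, the full pattern under discussion, pieced together from the hunks quoted above and below (a reconstruction, not necessarily the exact committed code):

/** Get a reference to each Peer, then call a function on each of them */
template<typename Callable>
void ForEachPeer(Callable&& func)
{
    // Copy the shared_ptrs out while g_peer_mutex is held...
    std::vector<PeerRef> peer_copies;
    {
        LOCK(g_peer_mutex);
        for (const auto& peer : g_peer_map) {
            peer_copies.push_back(peer.second);
        }
    }
    // ...then call func without the map lock held. The copied shared_ptrs
    // keep each Peer alive even if another thread erases it from g_peer_map
    // in the meantime, which is why the "stale" copies are safe to use.
    for (auto&& peer : peer_copies) {
        func(peer);
    }
}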
@@ -1542,6 +1582,7 @@ static void RelayAddress(const CAddress& addr, bool fReachable, const CConnman&

 void static ProcessGetBlockData(CNode& pfrom, const CChainParams& chainparams, const CInv& inv, CConnman& connman)
 {
+    PeerRef peer = GetPeerRef(pfrom.GetId());
We use peer throughout the function, and there’s no downside to holding on to the object for a bit longer.
@@ -913,6 +948,10 @@ void PeerManager::FinalizeNode(NodeId nodeid, bool& fUpdateConnectionTime) {
     LOCK(cs_main);
     int misbehavior{0};
     {
+        // We remove the Peer object from g_peer_map here, but it's not guaranteed that
+        // we'll destruct Peer here since another thread could potentially still hold
+        // a reference to the object. Be careful not to do any processing here that
+        // assumes Peer won't be changed before it's destructed.
         PeerRef peer = GetPeerRef(nodeid);
         assert(peer != nullptr);
In commit 85e7deb, perhaps go a bit further in the added comment to clarify that the refcount is often > 1 here.
In my testing, adding assert(peer.use_count() <= 1);
after this line as suggested in #19607 (review) causes bitcoind on mainnet to halt nearly immediately after net processing begins and 1-2 peers start to connect.
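To make the failure mode concrete, the suggested assert would sit roughly here (placement per the description above; the explanatory comment is mine):

PeerRef peer = GetPeerRef(nodeid);
assert(peer != nullptr);
// Fires almost immediately on mainnet: another thread (e.g. an RPC thread
// inside GetNodeStateStats()) can hold its own PeerRef at this point, so
// the refcount here is frequently > 1.
// assert(peer.use_count() <= 1);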
@@ -502,6 +519,24 @@ using PeerRef = std::shared_ptr<Peer>;
 Mutex g_peer_mutex;
 static std::map<NodeId, PeerRef> g_peer_map GUARDED_BY(g_peer_mutex);

+/** Get a reference to each Peer, then call a function on each of them */
+template<typename Callable>
+void ForEachPeer(Callable&& func)
Is ForEachPeer() in your roadmap the end-game interface, or more of an intermediary step?
@@ -1148,6 +1147,8 @@ void RPCConsole::updateNodeDetail(const CNodeCombinedStats *stats)
         ui->peerCommonHeight->setText(QString("%1").arg(stats->nodeStateStats.nCommonHeight));
     else
         ui->peerCommonHeight->setText(tr("Unknown"));
+
+    ui->peerHeight->setText(QString::number(stats->nodeStateStats.m_starting_height));
This displays -1 until 3 commits later at f27a1e221d.
     // banscore is deprecated in v0.21 for removal in v0.22
     obj.pushKV("banscore", statestats.m_misbehavior_score);
 }
 obj.pushKV("synced_headers", statestats.nSyncHeight);
 obj.pushKV("synced_blocks", statestats.nCommonHeight);
+obj.pushKV("startingheight", statestats.m_starting_height);
Verified with ./src/bitcoin-cli getpeerinfo | grep 'startingheight' | sort, as there appear to be no functional tests for getpeerinfo#startingheight. As above, it is broken at this commit and returns only the initial value of -1. Functionality returns 3 commits later at f27a1e221d.
+    std::vector<uint256> m_blocks_for_inv_relay GUARDED_BY(m_block_inv_mutex);
+    /** List of headers to relay */
+    std::vector<uint256> m_blocks_for_headers_relay GUARDED_BY(m_block_inv_mutex);
+
+    /** Starting height of this peer */
+    std::atomic<int> m_starting_height{-1};
9b14eb372: m_starting_height presumably still needs to be std::atomic_int?
If you retouch, perhaps s/height/block height/ for the Doxygen doc.
m_starting_height presumably still needs to be std::atomic_int?
I don’t understand. It’s a std::atomic<int> before and after this PR.
If you retouch, perhaps s/height/block height/ for the Doxygen doc.
I’ve updated the Doxygen comment to be clearer.
@@ -2334,6 +2375,7 @@ void PeerManager::ProcessMessage(CNode& pfrom, const std::string& msg_type, CDat
         return;
     }

+    PeerRef peer = GetPeerRef(pfrom.GetId());
Will we now have PeerRef peer = GetPeerRef(...) in many net processing methods?
Eventually we can pass Peer& to many of the private functions (just as CNodeState* is passed to some of the functions now).
@@ -4196,7 +4239,7 @@ bool PeerManager::SendMessages(CNode* pto)
             // Try to find first header that our peer doesn't have, and
             // then send all headers past that one. If we come across any
             // headers that aren't on ::ChainActive(), give up.
-            for (const uint256 &hash : pto->vBlockHashesToAnnounce) {
+            for (const uint256 &hash : peer->m_blocks_for_headers_relay) {
f27a1e221 nit, while touching this and lines 4334 and 4366 (3 places), if so inclined:
for (const uint256& hash : peer->m_blocks_for_headers_relay) {
@@ -940,8 +940,6 @@ class CNode
     mapMsgCmdSize mapRecvBytesPerMsgCmd GUARDED_BY(cs_vRecv);

 public:
-    uint256 hashContinue;
-    std::atomic<int> nStartingHeight{-1};
+    /** List of headers to relay */
+    std::vector<uint256> m_blocks_for_headers_relay GUARDED_BY(m_block_inv_mutex);
+
+    /** Starting height of this peer */
+    std::atomic<int> m_starting_height{-1};
+    /** The final block hash in a INV message. When the peer requests this
s/a INV/an INV/ and s/them/it|the peer/
@@ -1399,11 +1439,11 @@ void PeerManager::UpdatedBlockTip(const CBlockIndex *pindexNew, const CBlockInde
         }
     }
     // Relay inventory, but don't relay old inventory during initial block download.
-    m_connman.ForEachNode([nNewHeight, &vHashes](CNode* pnode) {
-        LOCK(pnode->cs_inventory);
-        if (nNewHeight > (pnode->nStartingHeight != -1 ? pnode->nStartingHeight - 2000 : 0)) {
+    ForEachPeer([nNewHeight, &vHashes](const PeerRef& peer) {
+        LOCK(peer->m_block_inv_mutex);
Why remove LOCK(pnode->cs_inventory); this early? Another lock doesn’t replace it until 3 commits later in f27a1e22.
…getpeerinfo or the GUI peer window.
@@ -971,6 +1011,7 @@ bool GetNodeStateStats(NodeId nodeid, CNodeStateStats &stats) {
     PeerRef peer = GetPeerRef(nodeid);
     if (peer == nullptr) return false;
     stats.m_misbehavior_score = WITH_LOCK(peer->m_misbehavior_mutex, return peer->m_misbehavior_score);
+    stats.m_starting_height = peer->m_starting_height;
0a9839058c:

net_processing.cpp: In function ‘bool GetNodeStateStats(NodeId, CNodeStateStats&)’:
net_processing.cpp:982:11: error: ‘struct CNodeStateStats’ has no member named ‘m_starting_height’; did you mean ‘nStartingHeight’?
  982 |     stats.m_starting_height = peer->m_starting_height;
      |           ^~~~~~~~~~~~~~~~~
      |           nStartingHeight

The changes (git diff 8d3a068 1a9e191) look good. One rebase fixup in 0a9839058c.
At commits f9136f4bf, 7199d81968, 8ff3e9319c11 and 40ea4a42d9:

net_processing.cpp: In function ‘bool GetNodeStateStats(NodeId, CNodeStateStats&)’:
net_processing.cpp:982/1000/1009: error: ‘struct CNodeStateStats’ has no member named ‘nStartingHeight’; did you mean ‘m_starting_height’?
  982 |     stats.nStartingHeight = peer->nStartingHeight;
      |           ^~~~~~~~~~~~~~~
      |           m_starting_height

(we’ll get there!)
Running script for: fe391ca326f73337275d726f7de9db384ac35466

sed -i 's/vBlockHashesToAnnounce/m_blocks_for_headers_relay/g' src/net_processing.cpp
sed -i 's/vInventoryBlockToSend/m_blocks_for_inv_relay/g' src/net_processing.cpp

diff --git a/src/net_processing.h b/src/net_processing.h
index 1ad40982b..4998a1e04 100644
--- a/src/net_processing.h
+++ b/src/net_processing.h
@@ -66,9 +66,9 @@ struct Peer {
     /** List of block ids we still have announce.
      * There is no final sorting before sending, as they are always sent
      * immediately and in the order requested. */
-    std::vector<uint256> m_blocks_for_inv_relay GUARDED_BY(m_block_inv_mutex);
+    std::vector<uint256> vInventoryBlockToSend GUARDED_BY(m_block_inv_mutex);
     /** List of headers to relay */
-    std::vector<uint256> m_blocks_for_headers_relay GUARDED_BY(m_block_inv_mutex);
+    std::vector<uint256> vBlockHashesToAnnounce GUARDED_BY(m_block_inv_mutex);

     /** This peer's reported block height when we connected */
     std::atomic<int> m_starting_height{-1};

Failed
-        LOCK(pnode->cs_inventory);
-        for (const uint256& hash : reverse_iterate(vHashes)) {
-            pnode->vBlockHashesToAnnounce.push_back(hash);
+    {
+        LOCK(m_peer_mutex);
+        for (auto& it : m_peer_map) {
a50dee50fd2aee93b41bfb3b04b24939d0c2f6c1: ForEachNode performs an additional check, NodeFullyConnected (pnode->fSuccessfullyConnected && !pnode->fDisconnect;). Not sure if that matters much here. Could we prematurely send headers before VERACK?
This is an excellent observation. Block hashes could be pushed into m_blocks_for_headers_relay before the version/verack handshake is complete. However, we’ll never send any block inventory before we’re fully connected, due to the guard clause at the top of SendMessages():
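For reference, that guard looked approximately like this at the time (quoted from memory of that era’s net_processing.cpp, so verify against the tree under review):

// At the top of PeerManager::SendMessages():
// Don't send anything until the version handshake is complete
if (!pto->fSuccessfullyConnected || pto->fDisconnect) return true;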
I think this behaviour is ok. The version/verack handshake usually lasts no more than a second or two, and in the worst case times out after a minute, so we’re never going to push many block hashes onto this vector.
a50dee50fd2aee93b41bfb3b04b24939d0c2f6c1: Any reason to not have a ForEachPeer member function?
I originally had one but removed it because it didn’t seem worth it. We can revisit that decision later if it seems useful.
ForEachNode was useful because it allowed passing a lambda into CConnman which would be executed while the cs_vNodes lock was held. Here, the m_peer_mutex is in PeerMan, so it can be held directly.
     // banscore is deprecated in v0.21 for removal in v0.22
     obj.pushKV("banscore", statestats.m_misbehavior_score);
 }
 obj.pushKV("synced_headers", statestats.nSyncHeight);
 obj.pushKV("synced_blocks", statestats.nCommonHeight);
+obj.pushKV("startingheight", statestats.nStartingHeight);
"startingheight"
in RPCHelpMan should be changed accordingly.
@@ -61,6 +61,15 @@ struct Peer {
     /** Whether this peer should be disconnected and marked as discouraged (unless it has the noban permission). */
     bool m_should_discourage GUARDED_BY(m_misbehavior_mutex){false};

+    /** Protects block inventory data members */
+    Mutex m_block_inv_mutex;
+    /** List of block ids we still have announce.
a50dee50fd2aee93b41bfb3b04b24939d0c2f6c1: I understand that comment text has been preserved, but maybe:
    /** List of block ids we still have to announce.
?
utACK b5b27ea052905ff7f26a962baa961fcc762c5ee0
Is m_hash_continue protected by any locks? A “this is only accessed by the singular net_processing thread” rationale would be sufficient, but if there happens to be a lock that’s held in all access locations, perhaps it’s easy to add?
Tested the getpeerinfo RPC output, and detailed peer info in the “Peers” tab of the “Node window”.
-            pnode->vBlockHashesToAnnounce.push_back(hash);
+    {
+        LOCK(m_peer_mutex);
+        for (auto& it : m_peer_map) {
+            Peer& peer = *it.second;
+            LOCK(peer.m_block_inv_mutex);
I think this is the first place where we are acquiring a lock while already holding m_peer_mutex. Any thoughts on the best way to document this design, so that future code writers know that acquiring m_peer_mutex while holding a peer’s m_block_inv_mutex is not allowed?
Or, would it be better to first copy all the Peer objects to a temporary place, release the m_peer_mutex lock, and then grab the per-peer m_block_inv_mutex locks, to avoid holding both at the same time? (I don’t know or have an opinion; just trying to figure out the locking design here.)
Wouldn’t documenting the lock order on m_peer_mutex suffice here? This design (take m_peer_mutex first, then lock the mutexes protecting the individual data members) is analogous to what ForEachNode() was doing here before (take cs_vNodes first, then lock the mutexes protecting the individual data members in the lambda, here cs_inventory).
@@ -72,6 +72,11 @@ struct Peer {

     /** This peer's reported block height when we connected */
     std::atomic<int> m_starting_height{-1};
+    /** The final block hash in an INV message. When the peer requests this
+     * block, we send a BLOCK message to trigger the peer to request the next
Is m_hash_continue protected by any locks? A “this is only accessed by the singular net_processing thread” rationale would be sufficient, but if there happens to be a lock that’s held in all access locations, perhaps it’s easy to add? @sipa could we just make it a std::atomic? It doesn’t need to be synchronized with any other data, so as long as we’re guaranteeing that reads/writes are atomic, I think that’s enough?
@@ -1843,7 +1844,8 @@ void PeerManager::ProcessHeadersMessage(CNode& pfrom, const std::vector<CBlockHe
     // Headers message had its maximum size; the peer may have more headers.
     // TODO: optimize: if pindexLast is an ancestor of ::ChainActive().Tip or pindexBestHeader, continue
     // from there instead.
-    LogPrint(BCLog::NET, "more getheaders (%d) to end to peer=%d (startheight:%d)\n", pindexLast->nHeight, pfrom.GetId(), pfrom.nStartingHeight);
+    PeerRef peer = GetPeerRef(pfrom.GetId());
4ac61d85f1624b1cd7af4a85c8bca0bd430801f0:
Can you explain why a nullptr-Assert is not needed here? Also, why isn’t the peer passed in from outside, where it already exists?
+    Mutex m_block_inv_mutex;
+    /** List of block ids we still have announce.
+     * There is no final sorting before sending, as they are always sent
+     * immediately and in the order requested. */
+    std::vector<uint256> vInventoryBlockToSend GUARDED_BY(m_block_inv_mutex);
+    /** List of headers to relay */
a50dee50fd2aee93b41bfb3b04b24939d0c2f6c1: The list is unfiltered, as was mentioned in the previous comment. Any reason to remove that?
@@ -66,9 +66,9 @@ struct Peer {
     /** List of block ids we still have announce.
      * There is no final sorting before sending, as they are always sent
      * immediately and in the order requested. */
-    std::vector<uint256> vInventoryBlockToSend GUARDED_BY(m_block_inv_mutex);
+    std::vector<uint256> m_blocks_for_inv_relay GUARDED_BY(m_block_inv_mutex);
     /** List of headers to relay */
-    std::vector<uint256> vBlockHashesToAnnounce GUARDED_BY(m_block_inv_mutex);
+    std::vector<uint256> m_blocks_for_headers_relay GUARDED_BY(m_block_inv_mutex);
22c2b3eff601cbc1aed2e466502bdc5549e11ae2: I don’t think this name is correct. Headers relay isn’t implied; it might very well fall back to inv relay, most importantly when the peer asked us to. Though there are other reasons, too.
We hope to announce these blocks via headers. If we can’t, we’ll move the hashes onto m_blocks_for_inv_relay and then announce via inv. I’ve updated the comment. Let me know if you think it’s clear enough now.
@@ -72,6 +72,11 @@ struct Peer {

     /** This peer's reported block height when we connected */
     std::atomic<int> m_starting_height{-1};
+    /** The final block hash in an INV message. When the peer requests this
b5b27ea052905ff7f26a962baa961fcc762c5ee0: I don’t think this is true either. m_hash_continue is a purely internal helper, never sent over the wire directly. Instead, the tip hash is sent to trigger the peer to ask further.
I’ve updated the comment to “The final block hash that we sent in an inv message to this peer.” Let me know if you have suggestions for better wording.
review ACK b5b27ea052905ff7f26a962baa961fcc762c5ee0 I reviewed the code and it probably won’t crash, but I had a few questions 🚟
Suggested here:
- https://github.com/bitcoin/bitcoin/pull/19607#discussion_r467071878
- https://github.com/bitcoin/bitcoin/pull/19829#discussion_r546116786
Addressed all comments.
Is m_hash_continue protected by any locks? A “this is only accessed by the singular net_processing thread” rationale would be sufficient, but if there happens to be a lock that’s held in all access locations, perhaps it’s easy to add? @sipa I’ve added a commit to guard m_hash_continue with m_block_inv_mutex.
@@ -1764,7 +1775,9 @@ void PeerManager::SendBlockTransactions(CNode& pfrom, const CBlock& block, const
     m_connman.PushMessage(&pfrom, msgMaker.Make(nSendFlags, NetMsgType::BLOCKTXN, resp));
 }

-void PeerManager::ProcessHeadersMessage(CNode& pfrom, const std::vector<CBlockHeader>& headers, bool via_compact_block)
+void PeerManager::ProcessHeadersMessage(CNode& pfrom, const Peer& peer,
+                                        const std::vector<CBlockHeader>& headers,
+                                        bool via_compact_block)
@@ -1854,7 +1867,8 @@ void PeerManager::ProcessHeadersMessage(CNode& pfrom, const std::vector<CBlockHe
     // Headers message had its maximum size; the peer may have more headers.
     // TODO: optimize: if pindexLast is an ancestor of ::ChainActive().Tip or pindexBestHeader, continue
     // from there instead.
-    LogPrint(BCLog::NET, "more getheaders (%d) to end to peer=%d (startheight:%d)\n", pindexLast->nHeight, pfrom.GetId(), pfrom.nStartingHeight);
+    LogPrint(BCLog::NET, "more getheaders (%d) to end to peer=%d (startheight:%d)\n",
+             pindexLast->nHeight, pfrom.GetId(), peer.m_starting_height);
@@ -180,7 +202,9 @@ class PeerManager final : public CValidationInterface, public NetEventsInterface

     void ProcessOrphanTx(std::set<uint256>& orphan_work_set) EXCLUSIVE_LOCKS_REQUIRED(cs_main, g_cs_orphans);
     /** Process a single headers message from a peer. */
-    void ProcessHeadersMessage(CNode& pfrom, const std::vector<CBlockHeader>& headers, bool via_compact_block);
+    void ProcessHeadersMessage(CNode& pfrom, const Peer& peer,
     // banscore is deprecated in v0.21 for removal in v0.22
     obj.pushKV("banscore", statestats.m_misbehavior_score);
 }
 obj.pushKV("synced_headers", statestats.nSyncHeight);
 obj.pushKV("synced_blocks", statestats.nCommonHeight);
+obj.pushKV("startingheight", statestats.m_starting_height);
startingheight should remain before synced_headers and synced_blocks (here and in the help) for consistency with the GUI peers info (or, maybe less preferably, the GUI should be updated to follow this change).
+    std::vector<uint256> m_blocks_for_headers_relay GUARDED_BY(m_block_inv_mutex);
+    /** The final block hash that we sent in an `inv` message to this peer.
+     * When the peer requests this block, we send an `inv` message to trigger
+     * the peer to request the next sequence of block hashes.
+     * Most peers use headers-first syncing, which doesn't use this mechanism */
+    uint256 m_hash_continue GUARDED_BY(m_block_inv_mutex) {};
Now that we’re renaming hashContinue anyway, I wonder if it can be better named (but I don’t have one to propose).
m_continuation_block?
@@ -46,6 +46,8 @@ struct CNodeStateStats {
  * Memory is owned by shared pointers and this object is destructed when
  * the refcount drops to zero.
  *
+ * Mutexes inside this struct must not be held when locking m_peer_mutex.
717a374e74b64b7b90bc1b2995e8900212bd0bfe: nit, to avoid the inversion and give a rationale: “m_peer_mutex should be locked before locking a mutex inside this struct to avoid lock order violations”. This will also weaken the requirement, because for some mutexes it could make sense to not require m_peer_mutex beforehand.
That isn’t precise enough. It’s fine to lock m_peer_mutex, take a shared_ptr to one of the Peer objects, release m_peer_mutex and later lock one of the inner mutexes. The only rule is that we don’t want to lock m_peer_mutex while already holding a lock on one of the inner mutexes.
This is the same as cs_vNodes and the inner mutexes in CNode.
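A minimal sketch of the permitted ordering, reusing the names from this PR (nodeid and the surrounding function context are assumed for illustration):

// OK: take m_peer_mutex, copy out the shared_ptr, release, then lock the
// inner mutex with m_peer_mutex no longer held.
PeerRef peer;
{
    LOCK(m_peer_mutex);
    auto it = m_peer_map.find(nodeid);
    if (it != m_peer_map.end()) peer = it->second;
}
if (peer) {
    LOCK(peer->m_block_inv_mutex);
    // ... read or modify block inventory state ...
}

// Also OK: lock m_peer_mutex first, then an inner mutex (as UpdatedBlockTip does).
// NOT OK: lock m_peer_mutex while already holding any inner Peer mutex.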
@@ -210,7 +212,8 @@ class PeerManager final : public CValidationInterface, public NetEventsInterface
      * on extra block-relay-only peers. */
     bool m_initial_sync_finished{false};

-    /** Protects m_peer_map */
+    /** Protects m_peer_map. This mutex must not be locked while holding a lock
+     * on any of the mutexes inside a Peer object. */
-        // wait for other stuff first.
-        std::vector<CInv> vInv;
-        vInv.push_back(CInv(MSG_BLOCK, ::ChainActive().Tip()->GetBlockHash()));
-        connman.PushMessage(&pfrom, msgMaker.Make(NetMsgType::INV, vInv));
-        peer.m_hash_continue.SetNull();
+        LOCK(peer.m_block_inv_mutex); {
I don’t think LOCK(); { ... } appears anywhere in the codebase.
This is my brain not working properly. I’ve fixed the syntax.
This doesn’t violate the documentation. That says that it’s not permitted to lock m_peer_mutex while holding a lock on an inner mutex. This is locking an inner mutex.
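The corrected form is presumably along these lines (a sketch based on the hunk above, with the LOCK moved inside the braces):

{
    LOCK(peer.m_block_inv_mutex);
    std::vector<CInv> vInv;
    vInv.push_back(CInv(MSG_BLOCK, ::ChainActive().Tip()->GetBlockHash()));
    connman.PushMessage(&pfrom, msgMaker.Make(NetMsgType::INV, vInv));
    peer.m_hash_continue.SetNull();
}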
review ACK 9aad3e4f33 🔱
Changes since previous review:
* Add lock-order docs
* Pass in Peer& to ProcessHeadersMessage
* Shuffle order in RPC doc
* Rebase
* Doc fixups
* New commit with a lock
Not done as a scripted diff to avoid misnaming the local variable in
ProcessMessage().
-BEGIN VERIFY SCRIPT-
sed -i 's/vBlockHashesToAnnounce/m_blocks_for_headers_relay/g' src/net_processing.*
sed -i 's/vInventoryBlockToSend/m_blocks_for_inv_relay/g' src/net_processing.*
-END VERIFY SCRIPT-
Also rename to m_continuation_block to better communicate meaning.
Thanks for the latest reviews @jonatack and @MarcoFalke. I’ve addressed/responded to all of your comments.
I’d like to stabilize this PR now unless any functional issues are found. We have ACKs on the code change at various times from @MarcoFalke, @jonatack, @hebasto, @sipa and @Sjors. If there are any further style suggestions, I can address them in the next PR in this series.
re-ACK 3002b4af2b4fde63026f8f7c575452a5c989c662 🌓
only changes:
* Move RPC help once more
* Fix LOCK scope {} nit
* Rename a member
git diff 9aad3e4 3002b4a