Adds GUARDED_BY and const annotations to document how we currently ensure various members of CNode and Peer aren't subject to race conditions.
net/net_processing: Add thread safety related annotations for CNode and Peer #25174
pull ajtowns wants to merge 4 commits into bitcoin:master from ajtowns:202205-net-lock-annotations changing 3 files +16 −16-
ajtowns commented at 10:35 PM on May 19, 2022: contributor
- ajtowns added the label Refactoring on May 19, 2022
- ajtowns requested review from hebasto on May 19, 2022
- ajtowns requested review from jnewbery on May 19, 2022
- ajtowns requested review from vasild on May 19, 2022
-
in src/net_processing.cpp:651 in 5df0bd1e8d outdated
598 | @@ -599,14 +599,14 @@ class PeerManagerImpl final : public PeerManager 599 | std::atomic<int> m_best_height{-1}; 600 | 601 | /** Next time to check for stale tip */ 602 | - std::chrono::seconds m_stale_tip_check_time{0s}; 603 | + std::chrono::seconds m_stale_tip_check_time GUARDED_BY(cs_main){0s};
jnewbery commented at 7:24 AM on May 20, 2022:This member is only accessed by the scheduler thread, so I don't think it needs to be guarded. I see that all access happens to be while cs_main is held, but I'm not sure that's a good reason to add this annotation.
ajtowns commented at 3:15 PM on May 20, 2022:That it is accessible by other threads is why it needs to be guarded -- that's what ensures that it is only accessed by one thread at a time (or ever). Right now, if it were accessed by other threads who'd locked
cs_main, it would be safe -- all this does is document that.
vasild commented at 9:40 AM on May 24, 2022:Just because a variable is accessed under
cs_maindoes not mean that it has to be protected by it. It would have to be if there was a relationship betweenm_stale_tip_check_timeand some other variable that is already protected bycs_mainand both have to be accessed together, but that is not the case.This is the
m_stale_tip_check_timeaccess:if (now > m_stale_tip_check_time) { ... m_stale_tip_check_time = now + STALE_CHECK_INTERVAL; }If that would be called by multiple threads it would need a mutex protection. Not necessary
cs_main. I think it is best to either leave it as is (because it is only accessed by a single thread) or introduce a new mutex form_stale_tip_check_time.Consider also a future work that tries to split
cs_mainto improve concurrency - then things have to be chopped off fromcs_main. Needlessly addingm_stale_tip_check_timeunder it will create more work in the future.
ajtowns commented at 5:21 PM on May 26, 2022:Just because a variable is accessed under
cs_maindoes not mean that it has to be protected by it.That's true of everything protected by
cs_main-- if it weren't there would be no hope of replacing cs_main by some other scheme. The issue isn't whether guarding the variable in some other way would also be safe, it's about documenting some way in which we can guarantee that the use is safe, and have the compiler maintain that guarantee for us.
vasild commented at 7:31 AM on May 27, 2022:I see your point, but enforcing an unnecessary protection by
cs_mainseems like an abuse to me.IMO we shouldn't try to have the compiler maintain a thread-safety guarantee for private class members that are not accessed by multiple threads. If a future change expands the variable usage to multiple threads, then that change should introduce a suitable protection.
hebasto commented at 2:11 PM on July 18, 2022:Just because a variable is accessed under
cs_maindoes not mean that it has to be protected by it.Agree.
Mutex's purpose is to protect not an access to a variable, rather class's invariants (any thread must see an object in the consistent state). If we fail to clear describe a protected invariant, it would be better to avoid guarding by a mutex, no?
ajtowns commented at 12:20 PM on August 29, 2022:When I look at both
m_stale_tip_check_timeandm_initial_sync_finishedand want to convince myself that there's no chance of one thread racing to access them while another is in the middle of modifying them, the way I convince myself of that is "oh, they're both used from precisely one function, and that function just locked cs_main".It's certainly possible to look further, and say that
CheckForStaleTipAndEvictPeersis only called via the scheduler thread; but that's more work, and it's not something that I can easily get the compiler to check for me. I could tell the compiler to check it for me with more work; but that involves adding thread-specific mutexes which reviewers here have objected to previously #21527 (review) and the whole point of breaking this PR out was to try to make some incremental progress in the meantime.in src/net_processing.cpp:660 in 5df0bd1e8d outdated
606 | const bool m_ignore_incoming_txs; 607 | 608 | /** Whether we've completed initial sync yet, for determining when to turn 609 | * on extra block-relay-only peers. */ 610 | - bool m_initial_sync_finished{false}; 611 | + bool m_initial_sync_finished GUARDED_BY(cs_main){false};
jnewbery commented at 7:28 AM on May 20, 2022:As above, this doesn't need cs_main.
In fact, I don't see the point of this variable at all. It's only used to prevent calling
StartExtraBlockRelayPeers()more than once, but that function could be made idempotent to prevent logging multiple times by usingstd::atomic<bool>::exchange().
ajtowns commented at 3:17 PM on May 20, 2022:As above, if you have
cs_mainit is safe to access this. It is not unconditionally safe for anyone to access it (unlike if it were atomic or const), hence it should be guarded. This is attempting to be a minimal, documentation only PR, so removing the variable is out of scope.in src/net.h:396 in 5df0bd1e8d outdated
465 | @@ -466,7 +466,7 @@ class CNode 466 | * from the wire. This cleaned string can safely be logged or displayed. 467 | */ 468 | std::string cleanSubVer GUARDED_BY(m_subver_mutex){}; 469 | - bool m_prefer_evict{false}; // This peer is preferred for eviction. 470 | + bool m_prefer_evict{false}; // This peer is preferred for eviction. (treated as const)
jnewbery commented at 7:29 AM on May 20, 2022:As your commit log already says, I think it'd be better just to actually make this const and included as a ctor parameter (and also made private).
ajtowns commented at 3:58 PM on May 20, 2022:I guess I'm okay with that in a follow up, but this patchset's been stewing for about 15 months now, so I'm a bit skeptical about increasing the scope?
vasild commented at 10:26 AM on May 24, 2022:Same as m_permissionFlags, better make it
const. It is even easier thanm_permissionFlagsbecause there are no fuzzer changes. Or should we fuzzm_prefer_evicttoo? It would be one-line extension to the fuzzer'sConsumeNode()to pass one more boolean parameter fromConsumeBool().
jonatack commented at 10:36 AM on August 29, 2022:Agree to make these private data members (consider to prefer non-const), if they do not need to be public or protected for testing/fuzzing, e.g.
--- a/src/net.h +++ b/src/net.h @@ -339,6 +339,8 @@ class CNode { friend class CConnman; friend struct ConnmanTestMsg; +private: + bool m_prefer_evict{false}; // This peer is preferred for eviction. public: const std::unique_ptr<TransportDeserializer> m_deserializer; // Used only by SocketHandler thread @@ -393,7 +395,6 @@ public: * from the wire. This cleaned string can safely be logged or displayed. */ std::string cleanSubVer GUARDED_BY(m_subver_mutex){}; - bool m_prefer_evict{false}; // This peer is preferred for eviction. (treated as const) bool HasPermission(NetPermissionFlags permission) const { return NetPermissions::HasFlag(m_permissionFlags, permission); }
ajtowns commented at 11:51 AM on August 29, 2022:Making data members private does not make them thread safe. This member is thread safe in the current code, despite the lack of a guarding mutex, because its value is fixed immediately after creation, before another thread can access it, and it does not change until it's destructed.
jonatack commented at 12:03 PM on August 29, 2022:Yes, the point is to encapsulate as private if the data member participates in the object’s invariant, versus public (and const if thread-safe).
Edit: correcting myself, these don't seem to be invariants, so public and const seems fine.
ajtowns commented at 12:48 PM on August 29, 2022:The reason it's thread-safe is that it's (effectively) const; the reason it's (effectively) const is that its value is known at the time the connection is setup. This PR is only about documenting why data members are thread safe, not changing logic or encapsulation...
I don't really get the
privatesuggestion -- it's aCNodedata member, but it's not accessed by anyCNodemember functions, it's only used byCConnman::AttemptToEvictConnection. WhileCConnmanis a friend class so could get away with that, it seems like a weird thing to do.https://github.com/ajtowns/bitcoin/commits/202208-cnodeoptions has a commit with these actually made
constfor whatever that's worth. Again, it does more than just document the current state of the code, so I'm not adding it to this PR.
in src/net.h:344 in 5df0bd1e8d outdated
412 | @@ -413,10 +413,10 @@ class CNode 413 | friend struct ConnmanTestMsg; 414 | 415 | public: 416 | - std::unique_ptr<TransportDeserializer> m_deserializer; 417 | - std::unique_ptr<TransportSerializer> m_serializer; 418 | + const std::unique_ptr<TransportDeserializer> m_deserializer; // Used only by SocketHandler thread
jnewbery commented at 7:30 AM on May 20, 2022:I don't think the comment is useful. Anyone reading the code should be able to check this for themselves (and not trust a code comment that claims this).
ajtowns commented at 3:17 PM on May 20, 2022:The comment is so the person reading the code knows what to check for.
jnewbery commented at 7:31 AM on May 20, 2022: memberconcept ACK
DrahtBot commented at 9:25 AM on May 20, 2022: contributor<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
<!--174a7506f384e20aa4161008e828411d-->
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #23233 (BIP324: Add encrypted p2p transport {de}serializer by dhruv)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
w0xlt commented at 4:36 PM on May 20, 2022: contributorApproach ACK.
in src/net.h:345 in 5df0bd1e8d outdated
412 | @@ -413,10 +413,10 @@ class CNode 413 | friend struct ConnmanTestMsg; 414 | 415 | public: 416 | - std::unique_ptr<TransportDeserializer> m_deserializer; 417 | - std::unique_ptr<TransportSerializer> m_serializer; 418 | + const std::unique_ptr<TransportDeserializer> m_deserializer; // Used only by SocketHandler thread 419 | + const std::unique_ptr<const TransportSerializer> m_serializer;
vasild commented at 9:51 AM on May 24, 2022:m_serializerandm_deserializerdo not have to beunique_ptr. I thinkconst unique_ptr<T> m_member;is the same asT m_member;and the latter should be preferred because it is simpler, no?Consider this:
<details> <summary>diff</summary>
diff --git i/src/net.cpp w/src/net.cpp index e275c2964d..18110e553b 100644 --- i/src/net.cpp +++ w/src/net.cpp @@ -670,22 +670,22 @@ bool CNode::ReceiveMsgBytes(Span<const uint8_t> msg_bytes, bool& complete) const auto time = GetTime<std::chrono::microseconds>(); LOCK(cs_vRecv); m_last_recv = std::chrono::duration_cast<std::chrono::seconds>(time); nRecvBytes += msg_bytes.size(); while (msg_bytes.size() > 0) { // absorb network data - int handled = m_deserializer->Read(msg_bytes); + int handled = m_deserializer.Read(msg_bytes); if (handled < 0) { // Serious header problem, disconnect from the peer. return false; } - if (m_deserializer->Complete()) { + if (m_deserializer.Complete()) { // decompose a transport agnostic CNetMessage from the deserializer bool reject_message{false}; - CNetMessage msg = m_deserializer->GetMessage(time, reject_message); + CNetMessage msg = m_deserializer.GetMessage(time, reject_message); if (reject_message) { // Message deserialization failed. Drop the message but don't disconnect the peer. // store the size of the corrupt message mapRecvBytesPerMsgType.at(NET_MESSAGE_TYPE_OTHER) += msg.m_raw_message_size; continue; } @@ -3042,14 +3042,14 @@ ServiceFlags CConnman::GetLocalServices() const return nLocalServices; } unsigned int CConnman::GetReceiveFloodSize() const { return nReceiveFloodSize; } CNode::CNode(NodeId idIn, ServiceFlags nLocalServicesIn, std::shared_ptr<Sock> sock, const CAddress& addrIn, uint64_t nKeyedNetGroupIn, uint64_t nLocalHostNonceIn, const CAddress& addrBindIn, const std::string& addrNameIn, ConnectionType conn_type_in, bool inbound_onion) - : m_deserializer{std::make_unique<V1TransportDeserializer>(V1TransportDeserializer(Params(), idIn, SER_NETWORK, INIT_PROTO_VERSION))}, - m_serializer{std::make_unique<V1TransportSerializer>(V1TransportSerializer())}, + : m_deserializer{V1TransportDeserializer(Params(), idIn, SER_NETWORK, INIT_PROTO_VERSION)}, + m_serializer{V1TransportSerializer()}, m_sock{sock}, m_connected{GetTime<std::chrono::seconds>()}, addr(addrIn), addrBind(addrBindIn), m_addr_name{addrNameIn.empty() ? addr.ToStringIPPort() : addrNameIn}, m_inbound_onion(inbound_onion), @@ -3094,13 +3094,13 @@ void CConnman::PushMessage(CNode* pnode, CSerializedNetMsg&& msg) msg.data.size(), msg.data.data() ); // make sure we use the appropriate network transport format std::vector<unsigned char> serializedHeader; - pnode->m_serializer->prepareForTransport(msg, serializedHeader); + pnode->m_serializer.prepareForTransport(msg, serializedHeader); size_t nTotalSize = nMessageSize + serializedHeader.size(); size_t nBytesSent = 0; { LOCK(pnode->cs_vSend); bool optimisticSend(pnode->vSendMsg.empty()); diff --git i/src/net.h w/src/net.h index 5c43c5bfb8..f99a33a7d1 100644 --- i/src/net.h +++ w/src/net.h @@ -410,14 +410,14 @@ public: class CNode { friend class CConnman; friend struct ConnmanTestMsg; public: - const std::unique_ptr<TransportDeserializer> m_deserializer; // Used only by SocketHandler thread - const std::unique_ptr<const TransportSerializer> m_serializer; + V1TransportDeserializer m_deserializer; // Used only by SocketHandler thread + const V1TransportSerializer m_serializer; NetPermissionFlags m_permissionFlags{NetPermissionFlags::None}; // treated as const outside of fuzz tester std::atomic<ServiceFlags> nServices{NODE_NONE}; /** * Socket used for communication with the node. diff --git i/src/test/util/net.cpp w/src/test/util/net.cpp index 62b770753a..c6d0c185d1 100644 --- i/src/test/util/net.cpp +++ w/src/test/util/net.cpp @@ -30,13 +30,13 @@ void ConnmanTestMsg::NodeReceiveMsgBytes(CNode& node, Span<const uint8_t> msg_by } } bool ConnmanTestMsg::ReceiveMsgFrom(CNode& node, CSerializedNetMsg& ser_msg) const { std::vector<uint8_t> ser_msg_header; - node.m_serializer->prepareForTransport(ser_msg, ser_msg_header); + node.m_serializer.prepareForTransport(ser_msg, ser_msg_header); bool complete; NodeReceiveMsgBytes(node, ser_msg_header, complete); NodeReceiveMsgBytes(node, ser_msg.data, complete); return complete; }</details>
ajtowns commented at 5:21 PM on May 26, 2022:They need to be pointers because they're initialised to a subclass of the type they are.
vasild commented at 7:10 AM on May 27, 2022:The type they are can be changed:
V1TransportDeserializer m_deserializer;, as in the diff above. The polymorphism is not needed - we don't use it via a pointer to the base class.Before:
const std::unique_ptr<TransportDeserializer> m_deserializer; m_deserializer{std::make_unique<V1TransportDeserializer>(V1TransportDeserializer(...))}VS
After:
V1TransportDeserializer m_deserializer; m_deserializer{V1TransportDeserializer(...)}It looks like a pure simplification.
jnewbery commented at 10:33 AM on May 27, 2022:For context, the unique ptr to a deserializer object was added in #16202. The intention was to use a polymorphism so the deserializer could be upgraded to a p2p v2 (BIP 324) deserializer later.
I tend to agree with @vasild that we should change this to be simpler. There's been almost no progress on p2p v2 in the intervening years, and it's easy enough to change this back to a polymorphism later if required.
ajtowns commented at 3:06 PM on May 27, 2022:The draft p2p v2 patches make use of the polymorphism, so removing it now seems like it makes things worse to me. Even if it is a good idea, it's fine for a followup, and doesn't improve thread safety so isn't necessary here.
vasild commented at 12:12 PM on June 2, 2022:Ok,
unique_ptror not is not so much related to thread-safety, kind of scope creep for this PR.in src/net.h:347 in 5df0bd1e8d outdated
417 | - std::unique_ptr<TransportSerializer> m_serializer; 418 | + const std::unique_ptr<TransportDeserializer> m_deserializer; // Used only by SocketHandler thread 419 | + const std::unique_ptr<const TransportSerializer> m_serializer; 420 | 421 | - NetPermissionFlags m_permissionFlags{NetPermissionFlags::None}; 422 | + NetPermissionFlags m_permissionFlags{NetPermissionFlags::None}; // treated as const outside of fuzz tester
vasild commented at 10:14 AM on May 24, 2022:Instead of just a comment, consider actually making it
const:<details> <summary>diff</summary>
diff --git i/src/net.cpp w/src/net.cpp index e275c2964d..7bcaa6feef 100644 --- i/src/net.cpp +++ w/src/net.cpp @@ -1237,15 +1237,15 @@ void CConnman::CreateNodeFromAcceptedSocket(std::unique_ptr<Sock>&& sock, addr, CalculateKeyedNetGroup(addr), nonce, addr_bind, /*addrNameIn=*/"", ConnectionType::INBOUND, - inbound_onion); + inbound_onion, + permissionFlags); pnode->AddRef(); - pnode->m_permissionFlags = permissionFlags; pnode->m_prefer_evict = discouraged; m_msgproc->InitializeNode(pnode); LogPrint(BCLog::NET, "connection from %s accepted\n", addr.ToString()); { @@ -3041,15 +3041,16 @@ ServiceFlags CConnman::GetLocalServices() const { return nLocalServices; } unsigned int CConnman::GetReceiveFloodSize() const { return nReceiveFloodSize; } -CNode::CNode(NodeId idIn, ServiceFlags nLocalServicesIn, std::shared_ptr<Sock> sock, const CAddress& addrIn, uint64_t nKeyedNetGroupIn, uint64_t nLocalHostNonceIn, const CAddress& addrBindIn, const std::string& addrNameIn, ConnectionType conn_type_in, bool inbound_onion) +CNode::CNode(NodeId idIn, ServiceFlags nLocalServicesIn, std::shared_ptr<Sock> sock, const CAddress& addrIn, uint64_t nKeyedNetGroupIn, uint64_t nLocalHostNonceIn, const CAddress& addrBindIn, const std::string& addrNameIn, ConnectionType conn_type_in, bool inbound_onion, NetPermissionFlags permissoin_flags) : m_deserializer{std::make_unique<V1TransportDeserializer>(V1TransportDeserializer(Params(), idIn, SER_NETWORK, INIT_PROTO_VERSION))}, m_serializer{std::make_unique<V1TransportSerializer>(V1TransportSerializer())}, + m_permissionFlags{permissoin_flags}, m_sock{sock}, m_connected{GetTime<std::chrono::seconds>()}, addr(addrIn), addrBind(addrBindIn), m_addr_name{addrNameIn.empty() ? addr.ToStringIPPort() : addrNameIn}, m_inbound_onion(inbound_onion), diff --git i/src/net.h w/src/net.h index 5c43c5bfb8..8701e7502a 100644 --- i/src/net.h +++ w/src/net.h @@ -413,13 +413,13 @@ class CNode friend struct ConnmanTestMsg; public: const std::unique_ptr<TransportDeserializer> m_deserializer; // Used only by SocketHandler thread const std::unique_ptr<const TransportSerializer> m_serializer; - NetPermissionFlags m_permissionFlags{NetPermissionFlags::None}; // treated as const outside of fuzz tester + const NetPermissionFlags m_permissionFlags; std::atomic<ServiceFlags> nServices{NODE_NONE}; /** * Socket used for communication with the node. * May not own a Sock object (after `CloseSocketDisconnect()` or during tests). * `shared_ptr` (instead of `unique_ptr`) is used to avoid premature close of @@ -582,13 +582,13 @@ public: std::atomic<std::chrono::microseconds> m_last_ping_time{0us}; /** Lowest measured round-trip time. Used as an inbound peer eviction * criterium in CConnman::AttemptToEvictConnection. */ std::atomic<std::chrono::microseconds> m_min_ping_time{std::chrono::microseconds::max()}; - CNode(NodeId id, ServiceFlags nLocalServicesIn, std::shared_ptr<Sock> sock, const CAddress& addrIn, uint64_t nKeyedNetGroupIn, uint64_t nLocalHostNonceIn, const CAddress& addrBindIn, const std::string& addrNameIn, ConnectionType conn_type_in, bool inbound_onion); + CNode(NodeId id, ServiceFlags nLocalServicesIn, std::shared_ptr<Sock> sock, const CAddress& addrIn, uint64_t nKeyedNetGroupIn, uint64_t nLocalHostNonceIn, const CAddress& addrBindIn, const std::string& addrNameIn, ConnectionType conn_type_in, bool inbound_onion, NetPermissionFlags permission_flags = NetPermissionFlags::None); CNode(const CNode&) = delete; CNode& operator=(const CNode&) = delete; NodeId GetId() const { return id; } diff --git i/src/test/fuzz/util.cpp w/src/test/fuzz/util.cpp index 033c6e18d5..5bbaf903c0 100644 --- i/src/test/fuzz/util.cpp +++ w/src/test/fuzz/util.cpp @@ -233,13 +233,12 @@ bool FuzzedSock::IsConnected(std::string& errmsg) const } void FillNode(FuzzedDataProvider& fuzzed_data_provider, ConnmanTestMsg& connman, PeerManager& peerman, CNode& node) noexcept { const bool successfully_connected{fuzzed_data_provider.ConsumeBool()}; const ServiceFlags remote_services = ConsumeWeakEnum(fuzzed_data_provider, ALL_SERVICE_FLAGS); - const NetPermissionFlags permission_flags = ConsumeWeakEnum(fuzzed_data_provider, ALL_NET_PERMISSION_FLAGS); const int32_t version = fuzzed_data_provider.ConsumeIntegralInRange<int32_t>(MIN_PEER_PROTO_VERSION, std::numeric_limits<int32_t>::max()); const bool relay_txs{fuzzed_data_provider.ConsumeBool()}; const CNetMsgMaker mm{0}; CSerializedNetMsg msg_version{ @@ -268,13 +267,12 @@ void FillNode(FuzzedDataProvider& fuzzed_data_provider, ConnmanTestMsg& connman, assert(node.nVersion == version); assert(node.GetCommonVersion() == std::min(version, PROTOCOL_VERSION)); assert(node.nServices == remote_services); CNodeStateStats statestats; assert(peerman.GetNodeStateStats(node.GetId(), statestats)); assert(statestats.m_relay_txs == (relay_txs && !node.IsBlockOnlyConn())); - node.m_permissionFlags = permission_flags; if (successfully_connected) { CSerializedNetMsg msg_verack{mm.Make(NetMsgType::VERACK)}; (void)connman.ReceiveMsgFrom(node, msg_verack); node.fPauseSend = false; connman.ProcessMessagesOnce(node); { diff --git i/src/test/fuzz/util.h w/src/test/fuzz/util.h index 580105e442..4d3568ff81 100644 --- i/src/test/fuzz/util.h +++ w/src/test/fuzz/util.h @@ -297,23 +297,25 @@ auto ConsumeNode(FuzzedDataProvider& fuzzed_data_provider, const std::optional<N const uint64_t keyed_net_group = fuzzed_data_provider.ConsumeIntegral<uint64_t>(); const uint64_t local_host_nonce = fuzzed_data_provider.ConsumeIntegral<uint64_t>(); const CAddress addr_bind = ConsumeAddress(fuzzed_data_provider); const std::string addr_name = fuzzed_data_provider.ConsumeRandomLengthString(64); const ConnectionType conn_type = fuzzed_data_provider.PickValueInArray(ALL_CONNECTION_TYPES); const bool inbound_onion{conn_type == ConnectionType::INBOUND ? fuzzed_data_provider.ConsumeBool() : false}; + const NetPermissionFlags permission_flags = ConsumeWeakEnum(fuzzed_data_provider, ALL_NET_PERMISSION_FLAGS); if constexpr (ReturnUniquePtr) { return std::make_unique<CNode>(node_id, local_services, sock, address, keyed_net_group, local_host_nonce, addr_bind, addr_name, conn_type, - inbound_onion); + inbound_onion, + permission_flags); } else { return CNode{node_id, local_services, sock, address, keyed_net_group,</details>
hebasto commented at 10:46 AM on July 18, 2022: memberConcept ACK.
DrahtBot added the label Needs rebase on Jul 19, 2022ajtowns force-pushed on Aug 5, 2022DrahtBot removed the label Needs rebase on Aug 5, 2022DrahtBot added the label Needs rebase on Aug 26, 2022ajtowns force-pushed on Aug 29, 2022DrahtBot removed the label Needs rebase on Aug 29, 2022in src/net_processing.cpp:660 in 38ea8226ba outdated
656 | @@ -657,7 +657,7 @@ class PeerManagerImpl final : public PeerManager 657 | 658 | /** Whether we've completed initial sync yet, for determining when to turn 659 | * on extra block-relay-only peers. */ 660 | - bool m_initial_sync_finished{false}; 661 | + bool m_initial_sync_finished GUARDED_BY(cs_main){false};
MarcoFalke commented at 7:52 AM on August 29, 2022:Apart from the feedback already given, this also seems arbitrary. If you want to document symbols that happen to be under the cs_main scope, why only document this one, and not everything else, like
m_highest_fast_announce...Maybe leave this hunk for a separate pull request? Otherwise I am not sure if this will move forward.
ajtowns commented at 8:32 AM on August 29, 2022:m_highest_fast_announceis a member of PeerManagerImpl, not CNode or Peer, so is out of scope for this PR...in src/net.h:616 in 069fc1e4d8 outdated
612 | @@ -613,7 +613,7 @@ class CNode 613 | * closed. 614 | * Otherwise this unique_ptr is empty. 615 | */ 616 | - std::unique_ptr<i2p::sam::Session> m_i2p_sam_session GUARDED_BY(m_sock_mutex); 617 | + std::unique_ptr<i2p::sam::Session> m_i2p_sam_session GUARDED_BY(m_sock_mutex) PT_GUARDED_BY(m_sock_mutex);
jonatack commented at 10:33 AM on August 29, 2022:Question, are both guards useful here? (All but one of the other instances of
PT_GUARDED_BYdon't cumulate both annotations.)- std::unique_ptr<i2p::sam::Session> m_i2p_sam_session GUARDED_BY(m_sock_mutex) PT_GUARDED_BY(m_sock_mutex); + std::unique_ptr<i2p::sam::Session> m_i2p_sam_session PT_GUARDED_BY(m_sock_mutex); };
ajtowns commented at 11:47 AM on August 29, 2022:No, the original annotation was correct; for some reason I thought PT_GUARDED_BY added some checks GUARDED_BY doesn't have, but that doesn't seem to be the case.
Changing from GUARDED_BY to PT_GUARDED_BY alone would allow you to do
m_i2p_sam_session = nullptrwithout holding the lock, orif (m_i2p_sam_session != nullptr) { LOCK(m_sock_mutex); m_i2p_sam_session->foo(); }which would be racy.jonatack commented at 10:44 AM on August 29, 2022: contributorACK first commit "net/net_processing: add missing thread safety annotations" modulo the
GUARDED_BY(cs_main)annotationsnet/net_processing: add missing thread safety annotations 06ebdc886fbbec32c9adnet: mark TransportSerializer/m_serializer as const
The (V1)TransportSerializer instance CNode::m_serializer is used from multiple threads via PushMessage without protection by a mutex. This is only thread safe because the class does not have any mutable state, so document that by marking the methods and the object as "const".
ef26f2f421net: mark CNode unique_ptr members as const
Dereferencing a unique_ptr is not necessarily thread safe. The reason these are safe is because their values are set at construction and do not change later; so mark them as const and set them via the initializer list to guarantee that.
9816dc96b7net: note CNode members that are treated as const
m_permissionFlags and m_prefer_evict are treated as const -- they're only set immediately after construction before any other thread has access to the object, and not changed again afterwards. As such they don't need to be marked atomic or guarded by a mutex; though it would probably be better to actually mark them as const...
ajtowns force-pushed on Aug 29, 2022MarcoFalke commented at 1:04 PM on August 29, 2022: memberreview ACK 9816dc96b77fd9780bb97891cd45a1b9798db8f5 📍
<details><summary>Show signature</summary>
Signature:
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA512 review ACK 9816dc96b77fd9780bb97891cd45a1b9798db8f5 📍 -----BEGIN PGP SIGNATURE----- iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p pUiGPQwAk8l83FaAio2oGaavZm/DW+AtDfrRktDHvmfZC78JmukdBl+ZFGYStrY+ b7WdSH1n98Xf1Czr1oHrMc1uEHxj/ajAw0lR3ELurht0jRv6o12YMiltI6E21olh YFhrHVDPwPOgfcS8W1quNAv21MFX1tQTP7vFyyHiZbkP51dHlLKOJRah6SL9Ag9l CyFXjrDtIE6bgOF06+/T/3KFPDxbxy8vrhsMTGeg33oj6FcCvP+xhkLrwyrmdHnT mmcHCjfmoWRHcY5alEKOUQAFw+olpEzlLKq8mwsZoCt4prbk8tfihKu3If7LxscP uF4dDGD3nWiOL76mVgF27MMurdIY7yCI5IHNPOzBxQX8dfjuKY0OnPXLWFJjKOXf i2MOBQyxJ4fAkZ+KVyBo5qEh2PuwTA+oMFbIyNhaj/IDpOv9W9dHb7jqEkgVEynb BtmQztp7SkutBiv1rfarRRSWKRkrussJzMY43U7lFe+ClZFXMEw3NJh/AqtThtdV mG/YSC0D =34Gj -----END PGP SIGNATURE-----</details>
ajtowns commented at 1:16 PM on August 29, 2022: contributorHere's one way of documenting "This member is only accessed by the scheduler thread" with (I believe) no runtime impact:
class LOCKABLE CodeInspectionRequired { }; class SCOPED_LOCKABLE CodeInspectionSaysSafeToUse { public: CodeInspectionSaysSafeToUse(const CodeInspectionRequired& mut) EXCLUSIVE_LOCK_FUNCTION(mut) { } ~CodeInspectionSaysSafeToUse() UNLOCK_FUNCTION() { } }; #define SAFE_TO_USE(mut) CodeInspectionSaysSafeToUse UNIQUE_NAME(criticalblock)(mut)You'd then use it to annotate variables where you have to manually inspect the logic used to access them:
static constexpr CodeInspectionRequired FROM_SCHEDULER_THREAD; std::chrono::seconds m_stale_tip_check_time GUARDED_BY(FROM_SCHEDULER_THREAD){0s};but could pass that logic up through functions:
void CheckForStaleTipAndEvictPeers() EXCLUSIVE_LOCKS_REQUIRED(FROM_SCHEDULER_THREAD) override;to the caller:
scheduler.scheduleEvery([this] { SAFE_TO_USE(FROM_SCHEDULER_THREAD); this->CheckForStaleTipAndEvictPeers(); }, std::chrono::seconds{EXTRA_PEER_CHECK_INTERVAL});In this particular case, you'd need to put the
FROM_SCHEDULER_THREADdeclaration in thePeerManagervirtual class whereCheckForStaleTipAndEvictPeersis first declared there, and also addSAFE_TO_USEannotations in the unit tests. Without theSAFE_TO_USEannotation, clang errors with:net_processing.cpp:1695:81: error: calling function 'CheckForStaleTipAndEvictPeers' requires holding mutex 'FROM_SCHEDULER_THREAD' exclusively [-Werror,-Wthread-safety-analysis] scheduler.scheduleEvery([this] { this->CheckForStaleTipAndEvictPeers(); }, std::chrono::seconds{EXTRA_PEER_CHECK_INTERVAL});Is something like the above a good idea? It seems better than what we have now, where we don't document why a variable is safe at all, and could introduce a race by having one PR introduce a new access from another thread with
cs_mainheld, while another introduces a new access from the scheduler thread withoutcs_mainheld, with a bug introduced because neither reviewers nor the compiler hasn't been told what to expect.But compared to just saying
GUARDED_BY(cs_main)when it already is guarded by cs_main, the above seems needlessly complicated and like it puts too much work on developers rather than tooling.jonatack commented at 3:38 PM on August 29, 2022: contributorutACK 9816dc96b77fd9780bb97891cd45a1b9798db8f5
hebasto approvedhebasto commented at 9:27 AM on August 30, 2022: memberACK 9816dc96b77fd9780bb97891cd45a1b9798db8f5, I have reviewed the code and it looks OK. In particular, I verified the usage of variables which got
GUARDED_BYannotations.Still curious about PR author's opinion about #25174 (review).
MarcoFalke commented at 9:34 AM on August 30, 2022: memberStill curious about PR author's opinion about #25174 (review).
See #25174 (review)
MarcoFalke merged this on Aug 30, 2022MarcoFalke closed this on Aug 30, 2022sidhujag referenced this in commit 212684be09 on Aug 30, 2022bitcoin locked this on Aug 31, 2023Labels
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-19 21:13 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me