CConnman::Stop() resets semOutbound, yet m_reconnections is not cleared in Stop. Each ReconnectionInfo contains a grant member that points to the memory that semOutbound pointed to and ~CConnman will attempt to access the grant field (memory that was already freed) when destroying m_reconnections. Fix this by calling m_reconnections.clear() in CConnman::Stop() and add appropriate annotations.
I was able to reproduce the original issue #33615 with the following diff by randomly stopping my node while it was attempting to reconnect (and verified that this patch fixes the issue, at least in my ~40-50 runs):
<details> <summary> diff </summary>
diff --git a/src/net.cpp b/src/net.cpp
index ef1c63044a..9c1d161d8b 100644
--- a/src/net.cpp
+++ b/src/net.cpp
@@ -1918,8 +1918,8 @@ void CConnman::DisconnectNodes()
{
LOCK(m_nodes_mutex);
- const bool network_active{fNetworkActive};
- if (!network_active) {
+// const bool network_active{fNetworkActive};
+// if (!network_active) {
// Disconnect any connected nodes
for (CNode* pnode : m_nodes) {
if (!pnode->fDisconnect) {
@@ -1927,7 +1927,7 @@ void CConnman::DisconnectNodes()
pnode->fDisconnect = true;
}
}
- }
+// }
// Disconnect unused nodes
std::vector<CNode*> nodes_copy = m_nodes;
@@ -1941,7 +1941,7 @@ void CConnman::DisconnectNodes()
// Add to reconnection list if appropriate. We don't reconnect right here, because
// the creation of a connection is a blocking operation (up to several seconds),
// and we don't want to hold up the socket handler thread for that long.
- if (network_active && pnode->m_transport->ShouldReconnectV1()) {
+ if (true) {
reconnections_to_add.push_back({
.addr_connect = pnode->addr,
.grant = std::move(pnode->grantOutbound),
</details>
I'm curious to see if others can reproduce as well.