Fix AddRef() usage for CNode #8372

pull UdjinM6 wants to merge 1 commits into bitcoin:master from UdjinM6:fixNodeRefCount changing 2 files +7 −11
  1. UdjinM6 commented at 2:58 PM on July 19, 2016: contributor

    Using AddRef() in ConnectNode() for existing connections doesn't feel right imo considering how refs are released in ThreadSocketHandler() https://github.com/bitcoin/bitcoin/blob/master/src/net.cpp#L1113. I guess this could be the reason that sometimes refs stay >0 no matter what and nodes stuck in vNodesDisconnected forever which means that node never get deleted and FinalizeNode signal is never fired which in its turn means that for example mapBlocksInFlight can't be cleaned properly and #7596 happens.

    This commit should solve the issue by:

    1. removing AddRef() for existing connections
    2. adding AddRef() in CNode's constructor using the same conditions as in ThreadSocketHandler()
  2. Fix AddRef() usage for CNode 2fc83d945a
  3. jonasschnelli added the label P2P on Jul 19, 2016
  4. sdaftuar commented at 5:41 PM on August 5, 2016: member

    Interesting observation! Thanks for working on this -- I've wondered if we had this type of bug on a few occasions but was never able to pin it down. And I do think you're on to something, as your PR has made clear at least one situation where I can see that we might end up with extra references to the same node: I believe it can be reproduced by eg using -connect=<ip> and -connect=<hostname> on the command line, where hostname resolves to the same ip address (I added a LogPrintf() to FinalizeNode and observed that it doesn't seem to get called after disconnecting, eg via setban).

    Have you worked out how this extra reference might occur in typical node usage (or better yet, are you able to reproduce this in testing)? So far the only smoking gun I see is where we're connecting to a peer by address name rather than IP, as I described above, but I don't know if that would be something that would commonly affect a node.

    The only other thing I can think of is if there might be a race condition that could cause this -- eg in OpenNetworkConnections, is it possible that we might call ConnectNode() at: https://github.com/bitcoin/bitcoin/blob/6e6ab2c3238264b34c0c83ebf703502f5ec72848/src/net.cpp#L1788, after failing to find the ip address we're about to connect to in the call to FindNode at https://github.com/bitcoin/bitcoin/blob/6e6ab2c3238264b34c0c83ebf703502f5ec72848/src/net.cpp#L1782, but somehow in between that peer has connected to us, and thus it's found when we call FindNode at line https://github.com/bitcoin/bitcoin/blob/6e6ab2c3238264b34c0c83ebf703502f5ec72848/src/net.cpp#L388?

    I don't know if this is possible or if there's something in place already to prevent that... If it is, then perhaps in addition to cleaning up the reference counting, we should also be re-thinking locking here.

    Also ping @cfields in case this code is already set for demolition as part of the net refactor.

  5. theuni commented at 8:59 PM on August 5, 2016: member

    @sdaftuar I believe #8085 contains some of those locking fixes, though more are needed. I'll be rebasing that and begging for review after #8128 goes in. As a follow-up (or maybe it's in that branch?) CNodes move to shared_ptrs (which never leave CConnman iirc), so that we don't have to worry about manual refcounting.

    Not opposed to fixing this up as-is, but yes, this logic is set for removal.

  6. UdjinM6 closed this on Aug 28, 2016

  7. UdjinM6 deleted the branch on Aug 28, 2016
  8. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-02 12:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me