Small step towards demangling cs_main from CNodeState #10652

pull TheBlueMatt wants to merge 4 commits into bitcoin:master from TheBlueMatt:2017-06-cnodestateaccessors-1 changing 5 files +127 −80
  1. TheBlueMatt commented at 7:56 pm on June 22, 2017: member

    This is a rather long branch, split up into three PRs, with lots of cleanup and tweaks here and there that evntually gets us CNodeStates which have their own locks, locked before cs_main, which makes nodes not always require cs_main to do simple things like call Misbehaving(). The final result is at https://github.com/TheBlueMatt/bitcoin/tree/2017-06-cnodestateaccessors-3 (and next PR is https://github.com/TheBlueMatt/bitcoin/tree/2017-06-cnodestateaccessors-2).

    This PR is primarily some small cleanups and the first few commits of #9447 rebased on #10179 and with small tweaks.

    The result is that we take the lock for the CNodeState for the node we’re processing at the top of ProcessMessages/SendMessages, and hold it until the end, passing around an accessor to the CNodeState as we go. This moves towards a few goals: (1) ability to run ProcessMessages() in paralell, (2) move towards a world where ProcessMessage() is a function in CNodeState, which are objects owned by CConnman.

    This isn’t a trivial thing to pull off because of lockorder, which introduces two issues:

    • There are a few places we call State() as a result of callbacks from CValidationInterface. Addressed by extending the CValidationInterface threading introduced in #10179 and used for wallet to a few more functions.
    • You can’t take two State() accessors at once, because then you’d have undefined lockorder. Luckily with the CValidationInterface threading changes gets most of those, but also the first few commits from #9447 are needed (included here rebased and slightly tweaked) as well as two little tweaks to be able to apply DoS score after unlocking the CNodeState of the node we’re processing on.

    By the end DEBUG_LOCKORDER features are added to stricly enforce these potential issues (and I’ve been testing it a bunch in helgrind, etc).

  2. TheBlueMatt commented at 7:56 pm on June 22, 2017: member
    Ping @theuni.
  3. fanquake added the label Refactoring on Jun 23, 2017
  4. in src/net.cpp:2832 in f1d25b2d54 outdated
    2828+    bool res = found != nullptr && NodeFullyConnected(found) && func(found);
    2829+    found->Release();
    2830+    return res;
    2831 }
    2832 
    2833+void CConnman::ForEachNode(std::function<void (CNode* pnode)> func)
    


    jonasschnelli commented at 8:24 pm on June 28, 2017:
    Wouldn’t this discard the rvalue reference && (move schematic)?

    TheBlueMatt commented at 9:07 pm on June 28, 2017:
    I mean it does, but I’m not too worried about requiring move for a callable type (places where we std::move a lambda into this will still get the benifit of calling std::function(Callable&&)). The real concern here, of course, is that it introduces a performance penalty, but its not huge (an extra function call and few if’s on top of the callback function call) and we’re already calling this with lambda functions, so I dont think we were ever too worried about the performance here.

    TheBlueMatt commented at 9:19 pm on June 28, 2017:
    Moved this commit to #10697.
  5. TheBlueMatt force-pushed on Jun 28, 2017
  6. TheBlueMatt force-pushed on Jun 28, 2017
  7. TheBlueMatt commented at 9:28 pm on June 28, 2017: member

    Rebased on a rebased version of the updated #10179 and filled in more text for the " MarkBlockAsReceived on NewPoWValidBlock at receive." commit message which helps provide a bit more context for this PR:

    “Note that this also helps with future per-CNodeState locks, as those will require no two CNodeState locks to be held at the same time (for lockorder reasons), and this prevents us from accessing another peer’s mmapBlocksInFlight entry during processing as we can move the NewPoWValidBlock callback into the CValidationInterface background thread mechanics.”

  8. jonasschnelli added this to the "Blockers" column in a project

  9. jonasschnelli removed this from the "Blockers" column in a project

  10. in src/validation.cpp:3144 in df3001d00a outdated
    3022@@ -3023,9 +3023,7 @@ static bool AcceptBlock(const std::shared_ptr<const CBlock>& pblock, CValidation
    3023     }
    3024 
    3025     // Header is valid/has work, merkle tree and segwit merkle tree are good...RELAY NOW
    3026-    // (but if it does not build on our best tip, let the SendMessages loop relay it)
    3027-    if (!IsInitialBlockDownload() && chainActive.Tip() == pindex->pprev)
    3028-        GetMainSignals().NewPoWValidBlock(pindex, pblock);
    3029+    GetMainSignals().NewPoWValidBlock(pindex, pblock, chainActive.Tip() == pindex->pprev);
    


    theuni commented at 9:56 pm on July 6, 2017:
    I think this wants fInitialDownload as a param so that it’s always evaluated in its receiving context rather than having subscribers potentially coming to different conclusions.

    TheBlueMatt commented at 9:21 pm on July 7, 2017:
    Hmm? I’m not sure I caught your point - we never get here if we already have the block on disk, and the new uses of it in net_processing need to know that we got here for any ProcessNewBlock call they make.

    theuni commented at 10:20 pm on July 7, 2017:
    My point was that IsInitialBlockDownload() may be false by the time it’s checked in a callback on a separate thread (once that stuff comes in), though it was true in AcceptBlock. So it’s potentially racy to subscribers.

    TheBlueMatt commented at 11:18 pm on July 7, 2017:
    Ahh, ok. Yes, is not an issue in any of my work, I believe (worst-case you have an extra few fast-announcements when you first leave IBD), but I added a comment noting the issue.
  11. in src/net_processing.cpp:2430 in 35bd4b6204 outdated
    2423@@ -2431,7 +2424,10 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
    2424             LOCK(cs_main);
    2425             // Also always process if we requested the block explicitly, as we may
    2426             // need it even though it is not a candidate for a new best tip.
    2427-            forceProcessing |= MarkBlockAsReceived(hash);
    2428+            // TODO: Only process if requested from this peer?
    2429+            forceProcessing |= mmapBlocksInFlight.count(hash);
    2430+            // Block is no longer in flight from this peer
    2431+            MarkBlockAsNotInFlight(hash, pfrom->GetId());
    


    theuni commented at 9:58 pm on July 6, 2017:
    Probably makes sense for MarkBlockAsNotInFlight() to return whether it removed something for the peer? I think that would take care of the TODO as well.

    TheBlueMatt commented at 9:27 pm on July 7, 2017:
    Given the TODO has a “?” on it (and I dunno if we want it or not), I’ll leave this for a later PR?
  12. TheBlueMatt force-pushed on Jul 7, 2017
  13. TheBlueMatt commented at 9:28 pm on July 7, 2017: member
    Rebased.
  14. TheBlueMatt force-pushed on Jul 7, 2017
  15. in src/net_processing.cpp:312 in c248f0121b outdated
    307+    while (range.first != range.second) {
    308+        BlockDownloadMap::iterator itInFlight = range.first;
    309+        range.first++;
    310+        if (itInFlight->second.first == nodeid) {
    311+            if (clearState)
    312+                ClearDownloadState(itInFlight);
    


    promag commented at 2:08 pm on July 10, 2017:
    Missing block or same line as if ().

    morcos commented at 7:45 pm on July 10, 2017:
    same line or braces

    TheBlueMatt commented at 1:35 am on July 11, 2017:
    Fixed.
  16. in src/net_processing.cpp:2069 in 0010932a0d outdated
    2065@@ -2024,7 +2066,7 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
    2066         }
    2067 
    2068         // If we're not close to tip yet, give up and let parallel block fetch work its magic
    2069-        if (!fAlreadyInFlight && !CanDirectFetch(chainparams.GetConsensus()))
    2070+        if (!fInFlightFromSamePeer && !CanDirectFetch(chainparams.GetConsensus()))
    


    morcos commented at 9:01 pm on July 10, 2017:
    I’m a bit hesitant about this change. I can’t really remember exactly why we have this CanDirectFetch logic in compact block processing. But this could make us unnecessarily avoid an optimistic compact block reconstruction if for some reason it took more than 200 mins to receive 2 blocks (right we do reconstructions if within 2 blocks of the announced block?)

    morcos commented at 9:07 pm on July 10, 2017:
    Actually now that I think about it, that concern exists prior to the changes in this PR. Why do we have the CanDirectFetch guard for compact block processing?

    TheBlueMatt commented at 1:38 am on July 11, 2017:
    No, I think you’d have to go 200 minutes for one block for it to matter (if you go 200 minutes for two, and missed the first one, you wont compact-download that older one, but thats OK, you were probably offline or something anyway). Still, we dont have a much better way to say “ok, I really, really only want to do this if I’m caught up” (ie cause otherwise mempool will be worthless). We could drop it - its not the end of the world to take the extra RT to use the compact blocks and get a tiny bit of compression, but at that point you may very well be just as well off just doing a regular block fetch and not taking the extra RT.
  17. morcos approved
  18. morcos commented at 9:04 pm on July 10, 2017: member
    Looks pretty good to me. utACK 0010932
  19. TheBlueMatt force-pushed on Jul 11, 2017
  20. in src/scheduler.cpp:146 in 7145218d32 outdated
    138@@ -139,3 +139,69 @@ size_t CScheduler::getQueueInfo(boost::chrono::system_clock::time_point &first,
    139     }
    140     return result;
    141 }
    142+
    143+bool CScheduler::AreThreadsServicingQueue() const {
    144+    return nThreadsServicingQueue;
    145+}
    146+
    


    promag commented at 9:53 am on July 11, 2017:
    Remove extra line.
  21. in src/scheduler.cpp:143 in 7145218d32 outdated
    138@@ -139,3 +139,69 @@ size_t CScheduler::getQueueInfo(boost::chrono::system_clock::time_point &first,
    139     }
    140     return result;
    141 }
    142+
    143+bool CScheduler::AreThreadsServicingQueue() const {
    


    promag commented at 9:54 am on July 11, 2017:
    { in new line for all functions.
  22. in src/net_processing.cpp:2195 in 7145218d32 outdated
    2188@@ -2158,17 +2189,27 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
    2189         {
    2190             LOCK(cs_main);
    2191 
    2192-            std::map<uint256, std::pair<NodeId, std::list<QueuedBlock>::iterator> >::iterator it = mapBlocksInFlight.find(resp.blockhash);
    2193-            if (it == mapBlocksInFlight.end() || !it->second.second->partialBlock ||
    2194-                    it->second.first != pfrom->GetId()) {
    2195+            bool fExpectedBLOCKTXN = false;
    2196+            std::pair<BlockDownloadMap::iterator, BlockDownloadMap::iterator> rangeInFlight = mmapBlocksInFlight.equal_range(resp.blockhash);
    2197+            BlockDownloadMap::iterator it = rangeInFlight.first;
    


    promag commented at 10:03 am on July 11, 2017:
    Remove it and mutate rangeInFlight.first.
  23. in src/net_processing.cpp:2199 in 7145218d32 outdated
    2196+            std::pair<BlockDownloadMap::iterator, BlockDownloadMap::iterator> rangeInFlight = mmapBlocksInFlight.equal_range(resp.blockhash);
    2197+            BlockDownloadMap::iterator it = rangeInFlight.first;
    2198+            while (it != rangeInFlight.second) {
    2199+                if (it->second.first == pfrom->GetId()) {
    2200+                    if (it->second.second->partialBlock)
    2201+                        fExpectedBLOCKTXN = true;
    


    promag commented at 10:21 am on July 11, 2017:
    0if (it->second.first == pfrom->GetId()) {
    1    fExpectedBLOCKTXN = it->second.second->partialBlock;
    2    break;
    3}
    
  24. in src/net_processing.cpp:2193 in 7145218d32 outdated
    2188@@ -2158,17 +2189,27 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
    2189         {
    2190             LOCK(cs_main);
    2191 
    2192-            std::map<uint256, std::pair<NodeId, std::list<QueuedBlock>::iterator> >::iterator it = mapBlocksInFlight.find(resp.blockhash);
    2193-            if (it == mapBlocksInFlight.end() || !it->second.second->partialBlock ||
    2194-                    it->second.first != pfrom->GetId()) {
    2195+            bool fExpectedBLOCKTXN = false;
    


    promag commented at 10:22 am on July 11, 2017:
    expected_BLOCKTXN?
  25. in src/net_processing.cpp:2428 in 7145218d32 outdated
    2422@@ -2382,7 +2423,10 @@ bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStr
    2423             LOCK(cs_main);
    2424             // Also always process if we requested the block explicitly, as we may
    2425             // need it even though it is not a candidate for a new best tip.
    2426-            forceProcessing |= MarkBlockAsReceived(hash);
    2427+            // TODO: Only process if requested from this peer?
    2428+            forceProcessing |= mmapBlocksInFlight.count(hash);
    


    promag commented at 10:27 am on July 11, 2017:

    Replace these mmapBlocksInFlight.count() with, for instance:

    0forceProcessing |= IsBlockInFlight(hash);
    
  26. TheBlueMatt force-pushed on Jul 11, 2017
  27. TheBlueMatt commented at 3:16 pm on July 11, 2017: member
    Rebased to remove the useless old commits, but no changes. Still would be nice to see thtis for 0.15, but not a huge deal if it misses.
  28. laanwj added this to the milestone 0.15.0 on Jul 11, 2017
  29. morcos commented at 5:27 pm on July 11, 2017: member
    utACK ca66c4a
  30. in src/net_processing.cpp:108 in ca66c4a7d7 outdated
    104@@ -105,7 +105,8 @@ namespace {
    105         bool fValidatedHeaders;                                  //!< Whether this block has validated headers at the time of request.
    106         std::unique_ptr<PartiallyDownloadedBlock> partialBlock;  //!< Optional, used for CMPCTBLOCK downloads
    107     };
    108-    std::map<uint256, std::pair<NodeId, std::list<QueuedBlock>::iterator> > mapBlocksInFlight;
    109+    typedef std::multimap<uint256, std::pair<NodeId, std::list<QueuedBlock>::iterator>> BlockDownloadMap;
    


    promag commented at 10:52 pm on July 12, 2017:
    Suggestion, add NodeId to QueuedBlock structure?

    TheBlueMatt commented at 7:07 pm on August 3, 2017:
    Would be a bit cleaner, maybe, but would also make it redundant.
  31. TheBlueMatt commented at 9:30 pm on July 18, 2017: member
    This should likely be un-tagged 0.15. Its not a critical enough bugfix to merit merging at this late stage.
  32. MarcoFalke removed this from the milestone 0.15.0 on Jul 18, 2017
  33. TheBlueMatt force-pushed on Aug 3, 2017
  34. Turn mapBlocksInFlight into a multimap ff57eb4bbd
  35. Call NewPoWValidBlock callbacks for all new blocks, not just !IBD
    This pushes some "is this callback useful" logic down into
    net_processing, which is useful for later changes as it allows for
    more notifications to be used.
    b4592659b9
  36. MarkBlockAsReceived on NewPoWValidBlock at receive.
    The received block could be malleated, so this is both simpler, and
    supports parallel downloads.
    61c9b81df0
  37. Only request full blocks from the peer we thought had the block in-flight
    This is a change in behavior so that if for some reason we request a block from a peer, we don't allow an unsolicited CMPCT_BLOCK announcement for that same block to cause a request for a full block from the uninvited peer (as some type of request is already outstanding from the original peer)
    8689df48ce
  38. TheBlueMatt force-pushed on Aug 3, 2017
  39. TheBlueMatt commented at 7:05 pm on August 14, 2017: member
    Closing as I believe I’m taking a different approach to beter network parallelization, however the commits here are still part of #10984, where they will remain.
  40. TheBlueMatt closed this on Aug 14, 2017

  41. laanwj removed this from the "Blockers" column in a project

  42. laanwj referenced this in commit 723e580657 on Sep 7, 2017
  43. PastaPastaPasta referenced this in commit d9684ed18a on Sep 24, 2019
  44. codablock referenced this in commit 022e80a1c7 on Sep 26, 2019
  45. codablock referenced this in commit 2edc29ee32 on Sep 29, 2019
  46. barrystyle referenced this in commit a2bd5cd8fb on Jan 22, 2020
  47. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-05 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me