Block ActivateBestChain to empty validationinterface queue #11824

pull TheBlueMatt wants to merge 7 commits into bitcoin:master from TheBlueMatt:2017-12-11822-debug changing 11 files +292 −213
  1. TheBlueMatt commented at 0:00 am on December 5, 2017: member

    This should fix #11822.

    It ended up bigger than I hoped for, but its not too gnarly. Note that " Require no cs_main lock for ProcessNewBlock/ActivateBestChain" is mostly pure code-movement.

  2. TheBlueMatt force-pushed on Dec 5, 2017
  3. in src/validation.cpp:2571 in 151b17da19 outdated
    2474     CBlockIndex *pindexNewTip = nullptr;
    2475     int nStopAtHeight = gArgs.GetArg("-stopatheight", DEFAULT_STOPATHEIGHT);
    2476     do {
    2477         boost::this_thread::interruption_point();
    2478+
    2479+        if (GetMainSignals().CallbacksPending() > 10) {
    


    dcousens commented at 0:16 am on December 5, 2017:
    Should this be parameterized? Or at least a configurable constant? Is there any advantage to tweaking it?

    TheBlueMatt commented at 0:17 am on December 5, 2017:
    During normal (read: non-IBD/reindex) usage, this should virtually never reach 10. During IBD/reindex, your memory is probably better speint in more dbcache than storing blocks you recently connected to eventually give to your wallet. I dont think more knobs is helpful here.

    ryanofsky commented at 2:14 pm on December 22, 2017:

    #11824 (review)

    In commit “Block ActivateBestChain to empty validationinterface queue”

    During normal (read: non-IBD/reindex) usage, this should virtually never reach 10. During IBD/reindex, your memory is probably better speint in more dbcache than storing blocks you recently connected to eventually give to your wallet.

    Please write this in a code comment. (How is anybody supposed to know this?)


    TheBlueMatt commented at 5:21 pm on December 24, 2017:
    Done.

    sipa commented at 2:15 pm on December 25, 2017:
    It would be nice to introduce a named constant for this instead.
  4. fanquake added the label P2P on Dec 5, 2017
  5. fanquake added the label Validation on Dec 5, 2017
  6. TheBlueMatt force-pushed on Dec 5, 2017
  7. in src/net_processing.cpp:1193 in 3113573c84 outdated
    1179+    {
    1180+        LOCK(cs_main);
    1181 
    1182-        const CInv &inv = *it;
    1183-        {
    1184+        while (it != pfrom->vRecvGetData.end() && (it->type == MSG_TX || it->type == MSG_WITNESS_TX)) {
    


    achow101 commented at 3:58 am on December 5, 2017:
    It seems to me that this expects to see transactions first and then blocks. But can’t a getdata message have the types in any order?

    TheBlueMatt commented at 4:05 am on December 5, 2017:
    No, we are allowed to (in fact always have) not finish the full queue before returning, and we do for blocks (always did). A peer shouldn’t be able to make us spin giving them 100 blocks before processing any other peers’ requests.

    achow101 commented at 4:19 am on December 5, 2017:
    Ok, I see now. I didn’t see the break for when the type was a block.

    ryanofsky commented at 1:16 pm on December 22, 2017:

    #11824 (review)

    In commit “Block ActivateBestChain to empty validationinterface queue”

    Please add a code comment here saying this, I had exactly the same question as achow.

  8. achow101 commented at 4:01 am on December 5, 2017: member

    ACK-ish

    This definitely is fixing the problem I was seeing, although I have one question below.

  9. achow101 commented at 4:18 am on December 5, 2017: member
    tACK 3113573c847c76943c065b72d3e7a9edc643fed1
  10. dcousens approved
  11. dcousens commented at 4:33 am on December 5, 2017: contributor
    once-over ACK
  12. jamesob commented at 5:54 pm on December 5, 2017: member

    tACK https://github.com/bitcoin/bitcoin/pull/11824/commits/3113573c847c76943c065b72d3e7a9edc643fed1

    I’ve been running -reindex for over twelve hours and haven’t seen an OOM:

    0job@ali:~$ sudo egrep "Kill.*bitcoind" /var/log/syslog || echo "cool"
    1cool
    
  13. jamesob commented at 10:25 pm on December 5, 2017: member

    Just hit another OOM, but also realized I was on an outdated version of this branch (HEAD was 151b17da1 Block ActivateBestChain to empty validationinterface queue). @TheBlueMatt do you recall what changed between 151b17da1 and current HEAD and if it would explain the difference between another OOM or not?

    In any case, I’m pulling down the tip of this branch and starting another -reindex.

  14. TheBlueMatt commented at 10:27 pm on December 5, 2017: member
    @jamesob No, the only differences there were in test_bitcoin.cpp, so that should have had no affect.
  15. achow101 commented at 10:38 pm on December 5, 2017: member
    I haven’t seen any OOMs and my node is almost fully reindexed. Granted it did go down sometime last night as it ran out of disk space. It’s been running for ~6 hours now and was running for ~14 hours before then.
  16. TheBlueMatt commented at 11:25 pm on December 5, 2017: member
    Well this definitely fixes a bug…just did a reindex to 150k with 0.15, 0.15.1 master and this branch, peak memory usage on this branch and 0.15 and 0.15.1 were all about the same, master was 3x higher…I’ll look more into our memory footprint and see what else is hiding.
  17. jamesob commented at 7:01 pm on December 6, 2017: member

    tACK (again) https://github.com/bitcoin/bitcoin/pull/11824/commits/3113573c847c76943c065b72d3e7a9edc643fed1

    Definitely hitting fewer OOMs reindexing on this branch than I was a week ago. :)

  18. jamesob commented at 10:54 pm on December 8, 2017: member
    My bitcoind process has been running for 37 hours on this branch; memory usage has held steady through and after a partial reindex.
  19. TheBlueMatt force-pushed on Dec 9, 2017
  20. TheBlueMatt commented at 6:22 pm on December 9, 2017: member
    Rebased.
  21. laanwj added this to the milestone 0.16.0 on Dec 14, 2017
  22. Create new mutex for orphans, no cs_main in PLV::BlockConnected
    This should (marginally) speed up validationinterface queue
    draining by avoiding a cs_main lock in one client.
    818075adac
  23. TheBlueMatt force-pushed on Dec 15, 2017
  24. TheBlueMatt commented at 8:39 pm on December 15, 2017: member
    Rebased.
  25. jonasschnelli commented at 7:39 am on December 17, 2017: contributor
    Haven’t looked at the code, but can confirm that a -reindex OOM/leak issue (#11822) was fixed once I merged in that PR.
  26. laanwj added this to the "Blockers" column in a project

  27. in src/net_processing.cpp:1042 in 3c45fc8b77 outdated
    1037@@ -1038,180 +1038,193 @@ static void RelayAddress(const CAddress& addr, bool fReachable, CConnman* connma
    1038     connman->ForEachNodeThen(std::move(sortfunc), std::move(pushfunc));
    1039 }
    1040 
    1041-void static ProcessGetData(CNode* pfrom, const Consensus::Params& consensusParams, CConnman* connman, const std::atomic<bool>& interruptMsgProc)
    1042+// returns true if we need to call ActivateBestChain before responding
    1043+bool static ProcessGetBlockData(CNode* pfrom, const Consensus::Params& consensusParams, const CInv& inv, CConnman* connman, const std::atomic<bool>& interruptMsgProc, bool best_chain_active)
    


    ryanofsky commented at 1:22 pm on December 22, 2017:

    In commit “Avoid cs_main in net_processing ActivateBestChain calls”

    This change would be easier to understand if ProcessGetBlockData refactoring happened in a different commit than than the cs_main ActivateBestChain unlock.


    ryanofsky commented at 1:29 pm on December 22, 2017:

    In commit “Avoid cs_main in net_processing ActivateBestChain calls”

    best_chain_active argument is unused, should remove or use or explain


    TheBlueMatt commented at 5:21 pm on December 24, 2017:
    Oops, was a bug, fixed in the refactor.

    TheBlueMatt commented at 5:21 pm on December 24, 2017:
    Done. I think.
  28. in src/net_processing.cpp:1066 in 3c45fc8b77 outdated
    1065+            // If we have the block and all of its parents, but have not yet validated it,
    1066+            // we might be in the middle of connecting it (ie in the unlock of cs_main
    1067+            // before ActivateBestChain but after AcceptBlock).
    1068+            // In this case, we need to run ActivateBestChain prior to checking the relay
    1069+            // conditions below.
    1070+            return true;
    


    ryanofsky commented at 1:53 pm on December 22, 2017:

    In commit “Avoid cs_main in net_processing ActivateBestChain calls”

    This is pretty convoluted. If the point is just to release cs_main before calling ActivateBestChain, why not just use LEAVE_CRITICAL_SECTION / ENTER_CRITICAL_SECTION to release it? Is the problem just lack of a reverse_lock RAII helper?

    More importantly, if the point is to release cs_main before calling ActivateBestChain, you really need to say this in a code comment, because without looking through code history, I don’t think this would be clear at all.


    TheBlueMatt commented at 5:21 pm on December 24, 2017:
    Valid point, refactored a bit, though LEAVE_CRITICAL_SECTION also isnt sufficient as you need to do the mapBlockIndex lookup again, which I went ahead and did.
  29. in src/wallet/test/wallet_tests.cpp:635 in ddbb3fc367 outdated
    632+        LOCK(wallet->cs_wallet);
    633+        auto it = wallet->mapWallet.find(wtx.GetHash());
    634+        BOOST_CHECK(it != wallet->mapWallet.end());
    635+        blocktx = CMutableTransaction(*it->second.tx);
    636+        }
    637+        CreateAndProcessBlock({CMutableTransaction(blocktx)}, GetScriptForRawPubKey(coinbaseKey.GetPubKey()));
    


    ryanofsky commented at 2:10 pm on December 22, 2017:

    In commit “Require no cs_main lock for ProcessNewBlock/ActivateBestChain”

    Maybe simplify this using mapWallet.at() instead of .find() and BOOST_CHECK.


    TheBlueMatt commented at 5:21 pm on December 24, 2017:
    Done.
  30. in src/validation.cpp:2577 in da8903142f outdated
    2572+            // Block until the validation queue drains
    2573+            std::promise<void> promise;
    2574+            CallFunctionInValidationInterfaceQueue([&promise] {
    2575+                promise.set_value();
    2576+            });
    2577+            promise.get_future().wait();
    


    ryanofsky commented at 2:20 pm on December 22, 2017:

    In commit “Block ActivateBestChain to empty validationinterface queue”

    Could add a CValidationInterface::WaitForPendingCallbacks() method to wrap this up since this is also needed in other places.


    TheBlueMatt commented at 5:21 pm on December 24, 2017:
    Done, in a new commit.
  31. ryanofsky commented at 2:27 pm on December 22, 2017: member
    utACK da8903142f807be9295797c0e65f228230ffd467. Locking changes look good, and nice to break up ProcessGetData.
  32. TheBlueMatt force-pushed on Dec 24, 2017
  33. Damballahwedo commented at 5:33 pm on December 24, 2017: none

    Don’t send me anymore pub mail. How can’t unsubscribe.

    On Dec 24, 2017 9:22 AM, “Matt Corallo” notifications@github.com wrote:

    @TheBlueMatt commented on this pull request.

    In src/validation.cpp https://github.com/bitcoin/bitcoin/pull/11824#discussion_r158606648:

    @@ -2566,6 +2567,16 @@ bool CChainState::ActivateBestChain(CValidationState &state, const CChainParams& int nStopAtHeight = gArgs.GetArg("-stopatheight", DEFAULT_STOPATHEIGHT); do { boost::this_thread::interruption_point();

    •    if (GetMainSignals().CallbacksPending() > 10) {
      
    •        // Block until the validation queue drains
      
    •        std::promise<void> promise;
      
    •        CallFunctionInValidationInterfaceQueue([&promise] {
      
    •            promise.set_value();
      
    •        });
      
    •        promise.get_future().wait();
      

    Done, in a new commit.

    — You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/bitcoin/bitcoin/pull/11824#discussion_r158606648, or mute the thread https://github.com/notifications/unsubscribe-auth/AhH-NWCScYWtFJ0hUPdgOs8P8xXy58kkks5tDohCgaJpZM4Q1hJ- .

  34. Refactor ProcessGetData in anticipation of avoiding cs_main for ABC 66aa1d58a1
  35. TheBlueMatt force-pushed on Dec 24, 2017
  36. Avoid cs_main in net_processing ActivateBestChain calls a734896038
  37. TheBlueMatt force-pushed on Dec 24, 2017
  38. TheBlueMatt force-pushed on Dec 24, 2017
  39. TheBlueMatt force-pushed on Dec 24, 2017
  40. in src/test/miner_tests.cpp:222 in c4fafa2fe0 outdated
    216@@ -218,6 +217,8 @@ BOOST_AUTO_TEST_CASE(CreateNewBlock_validity)
    217     for (unsigned int i = 0; i < sizeof(blockinfo)/sizeof(*blockinfo); ++i)
    218     {
    219         CBlock *pblock = &pblocktemplate->block; // pointer for convenience
    220+        {
    221+        LOCK(cs_main);
    222         pblock->nVersion = 1;
    


    sipa commented at 2:12 pm on December 25, 2017:
    Indentation?
  41. in src/wallet/test/wallet_tests.cpp:632 in c4fafa2fe0 outdated
    626@@ -627,10 +627,15 @@ class ListCoinsTestingSetup : public TestChain100Setup
    627         BOOST_CHECK(wallet->CreateTransaction({recipient}, wtx, reservekey, fee, changePos, error, dummy));
    628         CValidationState state;
    629         BOOST_CHECK(wallet->CommitTransaction(wtx, reservekey, nullptr, state));
    630+        CMutableTransaction blocktx;
    631+        {
    632+        LOCK(wallet->cs_wallet);
    


    sipa commented at 2:12 pm on December 25, 2017:
    Indentation?
  42. in src/validationinterface.h:54 in af5a003d3c outdated
    49+ *     CallFunctionInValidationInterfaceQueue([&promise] {
    50+ *         promise.set_value();
    51+ *     });
    52+ *     promise.get_future().wait();
    53+ */
    54+void BlockOnValidationInterfaceQueueDrain();
    


    sipa commented at 2:19 pm on December 25, 2017:

    If I understand this correctly, it’s not actually waiting until a drain, as new events can be added to the queue before the callback added in this function is executed.

    It’s more a synchronization point (guaranteeing that at least all pre-existing events are executed first).

  43. sipa commented at 2:20 pm on December 25, 2017: member
    utACK af5a003d3cfdd3f3a26dfd7c2db5dde347b4ab3c (with a few non-blocking nits)
  44. Require no cs_main lock for ProcessNewBlock/ActivateBestChain
    This requires the removal of some very liberal (incorrect) cs_mains
    sprinkled in some tests. It adds some chainActive.Tip() races, but
    the tests are all single-threaded anyway.
    a99b76f269
  45. Add an interface to get the queue depth out of CValidationInterface 5a933cefcc
  46. Block ActivateBestChain to empty validationinterface queue 36137497f1
  47. Add helper to wait for validation interface queue to catch up 97d2b09c12
  48. TheBlueMatt force-pushed on Dec 26, 2017
  49. sipa commented at 9:31 am on December 29, 2017: member

    utACK 97d2b09c124e6e5803f7fd4503348d9710d1260f (and also some experimental evidence that this indeed removes the runaway memory usage).

    Confirmed only indentation/naming changes since af5a003.

  50. sipa merged this on Dec 29, 2017
  51. sipa closed this on Dec 29, 2017

  52. sipa referenced this in commit d9fdac130a on Dec 29, 2017
  53. Sjors commented at 10:09 am on December 29, 2017: member
    I’m about to try this as well, will let you know if I still see extreme memory use during a reindex.
  54. laanwj removed this from the "Blockers" column in a project

  55. UdjinM6 referenced this in commit 871b8585ca on Mar 25, 2020
  56. codablock referenced this in commit c383bca1a6 on Mar 25, 2020
  57. codablock referenced this in commit 65ddd34b21 on Mar 25, 2020
  58. codablock referenced this in commit fa9b91b50f on Mar 26, 2020
  59. furszy referenced this in commit 307d7b1e5c on Feb 26, 2021
  60. ckti referenced this in commit ef6b4d8ffd on Mar 28, 2021
  61. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me