This should fix #11822.
It ended up bigger than I hoped for, but its not too gnarly. Note that " Require no cs_main lock for ProcessNewBlock/ActivateBestChain" is mostly pure code-movement.
This should fix #11822.
It ended up bigger than I hoped for, but its not too gnarly. Note that " Require no cs_main lock for ProcessNewBlock/ActivateBestChain" is mostly pure code-movement.
2474 CBlockIndex *pindexNewTip = nullptr;
2475 int nStopAtHeight = gArgs.GetArg("-stopatheight", DEFAULT_STOPATHEIGHT);
2476 do {
2477 boost::this_thread::interruption_point();
2478+
2479+ if (GetMainSignals().CallbacksPending() > 10) {
In commit “Block ActivateBestChain to empty validationinterface queue”
During normal (read: non-IBD/reindex) usage, this should virtually never reach 10. During IBD/reindex, your memory is probably better speint in more dbcache than storing blocks you recently connected to eventually give to your wallet.
Please write this in a code comment. (How is anybody supposed to know this?)
1179+ {
1180+ LOCK(cs_main);
1181
1182- const CInv &inv = *it;
1183- {
1184+ while (it != pfrom->vRecvGetData.end() && (it->type == MSG_TX || it->type == MSG_WITNESS_TX)) {
getdata
message have the types in any order?
In commit “Block ActivateBestChain to empty validationinterface queue”
Please add a code comment here saying this, I had exactly the same question as achow.
ACK-ish
This definitely is fixing the problem I was seeing, although I have one question below.
tACK https://github.com/bitcoin/bitcoin/pull/11824/commits/3113573c847c76943c065b72d3e7a9edc643fed1
I’ve been running -reindex
for over twelve hours and haven’t seen an OOM:
0job@ali:~$ sudo egrep "Kill.*bitcoind" /var/log/syslog || echo "cool"
1cool
Just hit another OOM, but also realized I was on an outdated version of this branch (HEAD was 151b17da1 Block ActivateBestChain to empty validationinterface queue
). @TheBlueMatt do you recall what changed between 151b17da1
and current HEAD and if it would explain the difference between another OOM or not?
In any case, I’m pulling down the tip of this branch and starting another -reindex.
tACK (again) https://github.com/bitcoin/bitcoin/pull/11824/commits/3113573c847c76943c065b72d3e7a9edc643fed1
Definitely hitting fewer OOMs reindexing on this branch than I was a week ago. :)
This should (marginally) speed up validationinterface queue
draining by avoiding a cs_main lock in one client.
1037@@ -1038,180 +1038,193 @@ static void RelayAddress(const CAddress& addr, bool fReachable, CConnman* connma
1038 connman->ForEachNodeThen(std::move(sortfunc), std::move(pushfunc));
1039 }
1040
1041-void static ProcessGetData(CNode* pfrom, const Consensus::Params& consensusParams, CConnman* connman, const std::atomic<bool>& interruptMsgProc)
1042+// returns true if we need to call ActivateBestChain before responding
1043+bool static ProcessGetBlockData(CNode* pfrom, const Consensus::Params& consensusParams, const CInv& inv, CConnman* connman, const std::atomic<bool>& interruptMsgProc, bool best_chain_active)
In commit “Avoid cs_main in net_processing ActivateBestChain calls”
This change would be easier to understand if ProcessGetBlockData refactoring happened in a different commit than than the cs_main ActivateBestChain unlock.
In commit “Avoid cs_main in net_processing ActivateBestChain calls”
best_chain_active argument is unused, should remove or use or explain
1065+ // If we have the block and all of its parents, but have not yet validated it,
1066+ // we might be in the middle of connecting it (ie in the unlock of cs_main
1067+ // before ActivateBestChain but after AcceptBlock).
1068+ // In this case, we need to run ActivateBestChain prior to checking the relay
1069+ // conditions below.
1070+ return true;
In commit “Avoid cs_main in net_processing ActivateBestChain calls”
This is pretty convoluted. If the point is just to release cs_main before calling ActivateBestChain, why not just use LEAVE_CRITICAL_SECTION / ENTER_CRITICAL_SECTION to release it? Is the problem just lack of a reverse_lock
RAII helper?
More importantly, if the point is to release cs_main before calling ActivateBestChain, you really need to say this in a code comment, because without looking through code history, I don’t think this would be clear at all.
632+ LOCK(wallet->cs_wallet);
633+ auto it = wallet->mapWallet.find(wtx.GetHash());
634+ BOOST_CHECK(it != wallet->mapWallet.end());
635+ blocktx = CMutableTransaction(*it->second.tx);
636+ }
637+ CreateAndProcessBlock({CMutableTransaction(blocktx)}, GetScriptForRawPubKey(coinbaseKey.GetPubKey()));
In commit “Require no cs_main lock for ProcessNewBlock/ActivateBestChain”
Maybe simplify this using mapWallet.at() instead of .find() and BOOST_CHECK.
2572+ // Block until the validation queue drains
2573+ std::promise<void> promise;
2574+ CallFunctionInValidationInterfaceQueue([&promise] {
2575+ promise.set_value();
2576+ });
2577+ promise.get_future().wait();
In commit “Block ActivateBestChain to empty validationinterface queue”
Could add a CValidationInterface::WaitForPendingCallbacks() method to wrap this up since this is also needed in other places.
Don’t send me anymore pub mail. How can’t unsubscribe.
On Dec 24, 2017 9:22 AM, “Matt Corallo” notifications@github.com wrote:
@TheBlueMatt commented on this pull request.
In src/validation.cpp https://github.com/bitcoin/bitcoin/pull/11824#discussion_r158606648:
@@ -2566,6 +2567,16 @@ bool CChainState::ActivateBestChain(CValidationState &state, const CChainParams& int nStopAtHeight = gArgs.GetArg("-stopatheight", DEFAULT_STOPATHEIGHT); do { boost::this_thread::interruption_point();
if (GetMainSignals().CallbacksPending() > 10) {
// Block until the validation queue drains
std::promise<void> promise;
CallFunctionInValidationInterfaceQueue([&promise] {
promise.set_value();
});
promise.get_future().wait();
Done, in a new commit.
— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/bitcoin/bitcoin/pull/11824#discussion_r158606648, or mute the thread https://github.com/notifications/unsubscribe-auth/AhH-NWCScYWtFJ0hUPdgOs8P8xXy58kkks5tDohCgaJpZM4Q1hJ- .
216@@ -218,6 +217,8 @@ BOOST_AUTO_TEST_CASE(CreateNewBlock_validity)
217 for (unsigned int i = 0; i < sizeof(blockinfo)/sizeof(*blockinfo); ++i)
218 {
219 CBlock *pblock = &pblocktemplate->block; // pointer for convenience
220+ {
221+ LOCK(cs_main);
222 pblock->nVersion = 1;
626@@ -627,10 +627,15 @@ class ListCoinsTestingSetup : public TestChain100Setup
627 BOOST_CHECK(wallet->CreateTransaction({recipient}, wtx, reservekey, fee, changePos, error, dummy));
628 CValidationState state;
629 BOOST_CHECK(wallet->CommitTransaction(wtx, reservekey, nullptr, state));
630+ CMutableTransaction blocktx;
631+ {
632+ LOCK(wallet->cs_wallet);
49+ * CallFunctionInValidationInterfaceQueue([&promise] {
50+ * promise.set_value();
51+ * });
52+ * promise.get_future().wait();
53+ */
54+void BlockOnValidationInterfaceQueueDrain();
If I understand this correctly, it’s not actually waiting until a drain, as new events can be added to the queue before the callback added in this function is executed.
It’s more a synchronization point (guaranteeing that at least all pre-existing events are executed first).
This requires the removal of some very liberal (incorrect) cs_mains
sprinkled in some tests. It adds some chainActive.Tip() races, but
the tests are all single-threaded anyway.
utACK 97d2b09c124e6e5803f7fd4503348d9710d1260f (and also some experimental evidence that this indeed removes the runaway memory usage).
Confirmed only indentation/naming changes since af5a003.
TheBlueMatt
dcousens
ryanofsky
sipa
achow101
jamesob
jonasschnelli
Damballahwedo
Sjors
Labels
P2P
Validation
Milestone
0.16.0