Mining: Prevent slowdown in CreateNewBlock on large mempools #9959

Pull request: sdaftuar wants to merge 3 commits into bitcoin:master from sdaftuar:2017-03-cnb-optimizations. Changed 2 files, +43 −8.
  1. sdaftuar commented at 1:30 AM on March 9, 2017: member

    I've been investigating performance regressions in CreateNewBlock.

    addPackageTxs() is supposed to "fail fast" for transactions that don't fit in a block. However, while we generally skip the most expensive ancestor/descendant calculations for failing transactions, it turns out the other work we're doing (mostly map lookups?) can still be pretty slow.

    After trying various approaches to speed up those operations (replacing maps with unordered_maps, changing the sort order on the ancestor score function to avoid needless re-sorting, and even getting rid of the maps altogether in favor of storing set information directly in the mempool entry), the one optimization that dominates all these is to just return earlier when the block is nearly full.

    So in this PR: when we're within 4000 weight of the block being full, if we consider and fail to add 1000 transactions in a row, then give up. I've benchmarked this as reducing the average run of CNB from 74ms to 61ms, with negligible difference in fees. Most of CNB time is currently taken by the call to TestBlockValidity, which is unaffected by this PR; so focusing just on package selection, the improvement from this change is a reduction in average addPackageTxs() time from 19ms to 7ms, over the time period I analyzed (the first week of December 2016).
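The early-exit heuristic described above can be sketched as a standalone toy model (the names `SelectPackages` and `NEARLY_FULL_MARGIN` are hypothetical, for illustration only; this is not the actual miner.cpp change, which operates on mempool package iterators rather than raw weights):

```cpp
#include <cstdint>
#include <vector>

// Constants mirroring the PR's heuristic: give up after 1000 consecutive
// failures once the block is within 4000 weight units of full.
static const int64_t MAX_CONSECUTIVE_FAILURES = 1000;
static const int64_t NEARLY_FULL_MARGIN = 4000;

// Toy model: each candidate package is just a weight; a package "fails"
// if it doesn't fit in the remaining block space.
int SelectPackages(const std::vector<int64_t>& packageWeights,
                   int64_t nBlockMaxWeight)
{
    int64_t nBlockWeight = 0;
    int64_t nConsecutiveFailed = 0;
    int nPackagesSelected = 0;
    for (int64_t w : packageWeights) {
        if (nBlockWeight + w > nBlockMaxWeight) {
            // Package doesn't fit: count the failure, and stop selection
            // entirely once we've failed many times in a row *and* the
            // block is already nearly full.
            if (++nConsecutiveFailed > MAX_CONSECUTIVE_FAILURES &&
                nBlockWeight > nBlockMaxWeight - NEARLY_FULL_MARGIN) {
                break;
            }
            continue;
        }
        // Package fits; reset the failed counter.
        nConsecutiveFailed = 0;
        nBlockWeight += w;
        ++nPackagesSelected;
    }
    return nPackagesSelected;
}
```

The key design point is that the counter resets on every successful add, so a long run of failures only aborts selection when the block is already nearly full.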

    I also added some commits that provide benchmarking of CNB when running with -debug=bench, which I thought might be generally useful/interesting.
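The kind of instrumentation those benchmarking commits add can be sketched roughly as follows. Note this is a generic stand-in: the real code uses Bitcoin Core's GetTimeMicros() and LogPrint("bench", ...), while the `TimeMillis` helper below is a hypothetical equivalent using std::chrono:

```cpp
#include <chrono>

// Hypothetical stand-in for -debug=bench style instrumentation: times a
// callable and returns the elapsed wall-clock time in milliseconds.
template <typename F>
double TimeMillis(F&& fn)
{
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```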

  2. fanquake added the label Mining on Mar 9, 2017
  3. sdaftuar commented at 7:10 PM on March 9, 2017: member

    Updated the OP with a correction and a more relevant statistic.

  4. in src/miner.cpp:None in 03a407662b outdated
     178 | @@ -177,7 +179,11 @@ std::unique_ptr<CBlockTemplate> BlockAssembler::CreateNewBlock(const CScript& sc
     179 |      // transaction (which in most cases can be a no-op).
     180 |      fIncludeWitness = IsWitnessEnabled(pindexPrev, chainparams.GetConsensus());
     181 |  
     182 | -    addPackageTxs();
     183 | +    int nPackagesSelected = 0;
     184 | +    int nDescendantsUpdated = 0;
    


    luke-jr commented at 7:27 PM on March 9, 2017:

    Might it make more sense to put these on the class, rather than passing them by reference?


    sdaftuar commented at 1:33 PM on March 10, 2017:

    I'm working on a refactor of the BlockAssembler members separately (as part of a patch to exclude recent transactions from new blocks, when the fee difference is negligible), so I'd prefer to defer the decision of whether to include this in the class until I'm ready to PR that patch.

  5. in src/miner.cpp:None in 03a407662b outdated
     369 | @@ -358,6 +370,12 @@ void BlockAssembler::addPackageTxs()
     370 |  
     371 |      CTxMemPool::indexed_transaction_set::index<ancestor_score>::type::iterator mi = mempool.mapTx.get<ancestor_score>().begin();
     372 |      CTxMemPool::txiter iter;
     373 | +
     374 | +    // Limit the number of attempts to add transactions to the block when it is
     375 | +    // close to full.
     376 | +    const int64_t MAX_CONSECUTIVE_FAILURES = 1000;
    


    luke-jr commented at 7:29 PM on March 9, 2017:

    The old code used 50 for this. Does 1000 work well enough in practice?


    sdaftuar commented at 1:34 PM on March 10, 2017:

    Yes; I also tried setting this to 100 and the performance was not really distinguishable.

  6. in src/miner.cpp:None in 03a407662b outdated
     436 | @@ -419,6 +437,14 @@ void BlockAssembler::addPackageTxs()
     437 |                  mapModifiedTx.get<ancestor_score>().erase(modit);
     438 |                  failedTx.insert(iter);
     439 |              }
     440 | +
     441 | +            ++nConsecutiveFailed;
     442 | +
     443 | +            if (nConsecutiveFailed > MAX_CONSECUTIVE_FAILURES && nBlockWeight >
     444 | +                    nBlockMaxWeight - 4000) {
    


    luke-jr commented at 7:31 PM on March 9, 2017:

    Perhaps the additional condition(s) should be before incrementing, or we could end up just aborting immediately as we approach the max block weight even if we could easily fit a few more txs in.


    sdaftuar commented at 1:36 PM on March 10, 2017:

    I'm not following; every time we add a new tx to the block we reset the counter... Can you clarify?

  7. luke-jr approved
  8. luke-jr commented at 7:33 PM on March 9, 2017: member

    utACK, certainly an improvement either way. (this behaviour used to exist before the CNB refactoring, maybe the old code is worth looking at)

  9. laanwj added this to the milestone 0.14.1 on Mar 9, 2017
  10. laanwj added the label Needs backport on Mar 9, 2017
  11. TheBlueMatt commented at 8:05 PM on March 9, 2017: member

    utACK 03a407662b008fe030d81bb54dfd3b677b248392. I didn't bother to benchmark; these should be obvious wins, even if minor.

  12. laanwj added this to the "Blockers" column in a project

  13. TheBlueMatt commented at 9:24 PM on March 24, 2017: member

    Is this worth backporting (my vote would be weak yes)?

  14. sdaftuar commented at 5:20 PM on March 26, 2017: member

    @thebluematt I don't feel strongly about whether we backport this change, to be honest (though it's simple enough that I don't think there's any downside).

    But it would be nice to merge this into master if there are no further comments, so I can continue work on additional mining changes (to support excluding recently received transactions if the fee difference from doing so is below some threshold).

  15. in src/miner.cpp:297 in 03a407662b outdated
     290 | @@ -282,16 +291,18 @@ void BlockAssembler::AddToBlock(CTxMemPool::txiter iter)
     291 |      }
     292 |  }
     293 |  
     294 | -void BlockAssembler::UpdatePackagesForAdded(const CTxMemPool::setEntries& alreadyAdded,
     295 | +int BlockAssembler::UpdatePackagesForAdded(const CTxMemPool::setEntries& alreadyAdded,
     296 |          indexed_modified_transaction_set &mapModifiedTx)
     297 |  {
     298 | +    int nDescendants = 0;
    


    JeremyRubin commented at 8:20 PM on March 28, 2017:

    Can you rename this to nDescendantsUpdated, to make clear that it should not be initialized to alreadyAdded.size()?

  16. in src/miner.cpp:302 in 03a407662b outdated
     298 | +    int nDescendants = 0;
     299 |      BOOST_FOREACH(const CTxMemPool::txiter it, alreadyAdded) {
     300 |          CTxMemPool::setEntries descendants;
     301 |          mempool.CalculateDescendants(it, descendants);
     302 |          // Insert all descendants (not yet in block) into the modified set
     303 |          BOOST_FOREACH(CTxMemPool::txiter desc, descendants) {
    


    JeremyRubin commented at 8:48 PM on March 28, 2017:

    This is a really interesting section. I got a bit nerd-sniped, so apologies for the long writeup.

    Doing the following may be more efficient (I can code this up, but I don't want to step on your toes if you're operating here).

    std::vector<CTxMemPool::txiter> diff(descendants.size());
    auto end = std::set_difference(descendants.begin(), descendants.end(),
         alreadyAdded.begin(), alreadyAdded.end(), diff.begin()); // Linear time!
    nDescendants += end - diff.begin();
    // Note: iterate only up to `end`; entries past it are value-initialized.
    for (auto it = diff.begin(); it != end; ++it) {
      // ....
    }
    

    It requires an extra copy compared to the naive, but they're iterators so who cares (we can also reuse the vector for the entire call at the top of updatepackages)...

    Let N = alreadyAdded.size() and M = descendants.size(). This does O(M+N) work (I think most implementations actually do O(max(M, N)), but the standard specifies at most 2*(M+N)-1 comparisons), while what's currently happening would appear to be O(M*log(N)).

    There is also a pre-loop-check one could do if it's likely they don't intersect.

    // O(1) (std::set has no front()/back(); use *begin() and *rbegin())
    if (*descendants.rbegin() < *alreadyAdded.begin() || *descendants.begin() > *alreadyAdded.rbegin())
        // skip set_difference; descendants - alreadyAdded = descendants
    

    And a pre-loop narrowing one could do to make it O(min(M, N)).

    // O(log(M) + log(N)) narrowing: only the overlapping key ranges can intersect.
    // (std::set::equal_range takes a single key, so use lower_bound/upper_bound.)
    auto first1 = descendants.lower_bound(*alreadyAdded.begin());
    auto last1 = descendants.upper_bound(*alreadyAdded.rbegin());
    auto first2 = alreadyAdded.lower_bound(*descendants.begin());
    auto last2 = alreadyAdded.upper_bound(*descendants.rbegin());
    std::vector<CTxMemPool::txiter> diff(std::distance(first1, last1));
    auto end = std::set_difference(first1, last1,
         first2, last2, diff.begin()); // Linear time!
    // (Descendants outside [first1, last1) are trivially in the difference too.)
    nDescendants += end - diff.begin();
    for (auto it = diff.begin(); it != end; ++it) {
      // ....
    }
    

    sdaftuar commented at 3:38 PM on March 29, 2017:

    Thanks, will try some of these out in future optimization efforts.
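The linear-time difference idea from the comment above can be demonstrated standalone (using ints in place of CTxMemPool::txiter; `Difference` is a hypothetical helper name, not code from this PR):

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Compute descendants \ alreadyAdded in a single merge-style pass over both
// sorted sets, instead of one O(log N) lookup per descendant.
std::vector<int> Difference(const std::set<int>& descendants,
                            const std::set<int>& alreadyAdded)
{
    std::vector<int> diff(descendants.size());
    auto end = std::set_difference(descendants.begin(), descendants.end(),
                                   alreadyAdded.begin(), alreadyAdded.end(),
                                   diff.begin());
    diff.resize(end - diff.begin());
    return diff;
}
```

Compared to looking up each of the M descendants in the N-element alreadyAdded set (O(M*log(N))), the single merge pass is O(M+N).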

  17. in src/miner.cpp:470 in 03a407662b outdated
     464 | @@ -439,6 +465,9 @@ void BlockAssembler::addPackageTxs()
     465 |              continue;
     466 |          }
     467 |  
     468 | +        // This transaction will make it in; reset the failed counter.
     469 | +        nConsecutiveFailed = 0;
    


    JeremyRubin commented at 9:51 PM on March 28, 2017:

    Could you add more of a comment on why resetting is correct behavior here?

    It seems to me at first glance, that if we fail to add something earlier, we should continue to tally those as failures? Perhaps:

    Assuming that below a certain threshold `T` of probability
    (i.e.,`T = P(success at M | failure at M-1....0)` where `M = 1000`) 
    of adding something to the block we want to give up,
     we expect that 
    `1000>D>0. P(success at N+D+1 | failure at N+D,... success at N, failure at N-1, ...) 
    > P(success at N+D | failure at N, failure at N-1, ...)`?
    

    Maybe not the most accurate, but would be helpful for future reviewers trying to understand what the intention is.


    sdaftuar commented at 4:23 PM on March 29, 2017:

    This is incremented whenever we fail (not just for failures after we're above a certain block weight), so it needs to be reset when adding a new tx.

  18. JeremyRubin approved
  19. JeremyRubin commented at 9:57 PM on March 28, 2017: contributor

    utACK. I've reviewed the changes and they seem reasonable. @sdaftuar I also left some ideas for a few complexity optimizations you may not have yet considered for the expensive loop; but those are for future work :).

  20. sdaftuar commented at 4:40 PM on March 29, 2017: member

    Addressed @JeremyRubin's nits.

  21. JeremyRubin commented at 4:41 PM on March 29, 2017: contributor

    re-utack 4d1eb10

  22. Mining: return early when block is almost full eed816af6c
  23. Add benchmarking for CreateNewBlock 42cd8c890f
  24. Update benchmarking with package statistics 011124a2b2
  25. sdaftuar force-pushed on Mar 29, 2017
  26. sdaftuar commented at 5:59 PM on March 29, 2017: member

    Squashed 4d1eb10 -> 011124a

  27. TheBlueMatt commented at 7:13 PM on March 29, 2017: member

    re-utACK 011124a2b278c5a60bad5f1b0b4abbf7ebc95aa0

  28. gmaxwell approved
  29. gmaxwell commented at 4:33 PM on March 30, 2017: contributor

    utACK

  30. laanwj merged this on Mar 30, 2017
  31. laanwj closed this on Mar 30, 2017

  32. laanwj referenced this in commit cde9b1a864 on Mar 30, 2017
  33. laanwj removed this from the "Blockers" column in a project

  34. sdaftuar referenced this in commit b5c3440b05 on Mar 30, 2017
  35. sdaftuar referenced this in commit 10028fb555 on Mar 30, 2017
  36. sdaftuar referenced this in commit a296c6009f on Mar 30, 2017
  37. laanwj removed the label Needs backport on Mar 31, 2017
  38. laanwj commented at 9:46 AM on March 31, 2017: member

    Removing 'needs backport' label as a backport exists (#10127)

  39. codablock referenced this in commit 06c8714572 on Jan 26, 2018
  40. lateminer referenced this in commit a0230bec5d on Jan 5, 2019
  41. lateminer referenced this in commit cb96770a7a on Jan 5, 2019
  42. andvgal referenced this in commit aa32f4f219 on Jan 6, 2019
  43. CryptoCentric referenced this in commit c26e60bccf on Feb 27, 2019
  44. MarcoFalke locked this on Sep 8, 2021


This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-14 12:15 UTC
