Mining: Select transactions using feerate-with-ancestors

sdaftuar commented at 8:48 pm on February 25, 2016: member

This implements “Child-pays-for-parent” transaction selection in CreateNewBlock, where we sort the mempool using (modified) feerate with ancestors and consider transactions in that order during block creation. As transactions are selected, a separate map is used to track the new feerate-with-ancestors value to use for in-mempool descendants that haven’t yet been selected.

This is almost strictly better (ie, results in higher-fee blocks) than the existing mining algorithm, with some small variance due to edge cases around which transactions might get selected as the new block’s size approaches the maximum allowed. Moreover the performance impact on CreateNewBlock is minimal – I observed that the total run time of the function actually slightly improved, apparently due to TestBlockValidity being on average slightly faster (a somewhat surprising result).

The actual fee improvements observed were fairly low in the time period I looked at (October 2015), roughly a 1% fee improvement overall. I think that is driven first and foremost by blocks not really being full for a large bulk of the time period, so on a large number of data points the results were identical. Additionally, my guess is that user behavior may not have adapted to this type of transaction selection being widely deployed, and so in the future perhaps we might expect to see larger differences. I did note a handful of instances where the implementation here notably outperformed the existing algorithm, in some instances more than 20% better, but those were the exception and not the rule.

For comparison, I found no examples where the old algorithm produced a block with 0.5% or more in fees than the new algorithm (and few examples where the old algorithm’s block had more fees at all, around 1% of samples).

I’ve labeled this PR as a work-in-progress for now, as it builds off two other open PR’s (#7594 and #7598). Once those are reviewed and merged, I’ll rebase this and remove the WIP label. I opened this now to help motivate the work in those two PRs.

One open question I have is whether it’s worth keeping around the old transaction selection algorithm. The refactor in #7598 that this builds on makes it easy to leave it around, and I don’t think it would be hard to add command line arguments for policy configuration that would enable using the older algorithm. But I don’t know if that’s of interest to anyone, so for now I figure we can let people mull it over and decide what to do in a future PR.

A related question to that is whether to get rid of the existing mempool “score” index. The ancestor tracking pull adds a new ancestor feerate index without removing the old feerate index, but if we decide to get rid of the mining algorithm that uses that index, then perhaps we could also eliminate this mempool index as well.

To do:

Review/merge #7594
Review/merge #7598

jonasschnelli added the label Mining on Feb 25, 2016

dgenr8 commented at 8:19 pm on March 3, 2016: contributor

Does this algorithm claim to select the optimum descendant subtree for a given independent transaction?

sdaftuar commented at 9:04 pm on March 3, 2016: member

This algorithm is by no means guaranteed to produce maximum fee blocks for a given mempool, if that’s what you’re trying to ask.

The algorithm this PR is trying to implement is straightforward:

0Look at highest ancestor-feerate tx in mempool:
1 - Add tx's package (tx + unconfirmed ancestors) to block
2 - Remove tx's package from the mempool and re-sort (so that descendants of tx and its now in-block ancestors are treated as confirmed, and remaining in-mempool descendant's ancestor feerate is updated).

The implementation is complicated because we can’t actually remove things from the mempool. To deal with that, I added a new multi_index that CreateNewBlock uses to keep track of the effective ancestor feerate/size/sigops for transactions whose in-mempool ancestors have been selected for inclusion in the block. (The only other complication is that a package might not be includable, say because it’s too big or has too many sigops, but that’s easy to handle.)

Anyway, getting back to the algorithm – I think this algorithm is superior to the existing one (which iterates based on each tx’s individual feerate, not looking at ancestors) for optimizing block fees, and it has the property that (what I expect to be) typical uses of CPFP should generally be captured – namely those situations where a single child is paying a large fee for its parents.

However if there are multiple medium fee children that are chained below some low-fee parent, then the algorithm might easily fill the block without those transactions because they may not be considered until the block is full. For now though I suspect this algorithm is adequate for the current usage patterns, and perhaps we can come up with ways to monitor what’s in the mempool to look for patterns of usage that would motivate further improvements to CreateNewBlock.

sipa commented at 9:57 pm on March 4, 2016: member

Would it be possible to have a CTxMempool::check that accurately checks the ancestor-based statistics, rather than just lower bounding them to 1 level up?

sdaftuar force-pushed on Mar 17, 2016

sdaftuar commented at 3:11 pm on March 17, 2016: member

Updated now that #7594 has been merged.

btcdrak commented at 5:28 pm on March 17, 2016: contributor

Is this still WIP?

sdaftuar commented at 5:32 pm on March 17, 2016: member

@btcdrak: I was thinking I’d leave this marked as WIP until the other dependent PR (#7598) is merged, since I’ll have to rebase this on top of whatever the final version of the refactor ends up being.

I should add – if you want to start reviewing now, please do!

btcdrak commented at 5:43 pm on March 17, 2016: contributor

@sdaftuar Github tasklists are maybe better for this kind of thing, and they show up as tasks even on ticket lists. https://github.com/blog/1375-task-lists-in-gfm-issues-pulls-comments

sdaftuar force-pushed on May 18, 2016

sdaftuar renamed this:
~~[WIP] Mining: Select transactions using feerate-with-ancestors~~
Mining: Select transactions using feerate-with-ancestors
on May 18, 2016

sdaftuar commented at 6:39 pm on May 18, 2016: member

Rebased on the latest #7598

sdaftuar force-pushed on Jun 9, 2016

sdaftuar commented at 2:37 pm on June 9, 2016: member

Rebased again and built off the latest version of #7598.

seweso commented at 8:19 am on June 10, 2016: none

Very nice!

Additionally, my guess is that user behavior may not have adapted to this type of transaction selection being widely deployed, and so in the future perhaps we might expect to see larger differences.

Yes you would definitely expect bigger difference as this becomes available. As entities will be able to underpay transactions without risking getting dropped from the mempool. (for entities who continuously create transactions)

One of the dangers of the blocksize limit is that fees would go up, but have almost no reason/force to get it down. Things like this and RBF could really help with that. :+1:

Is this algorithm also used to determine which transactions to drop?

sdaftuar commented at 3:43 pm on June 11, 2016: member

Is this algorithm also used to determine which transactions to drop? @seweso The algorithm for eviction of transactions from the mempool is unchanged. In a sense, the transaction eviction algorithm for the mempool is already assuming we’re doing something like what this PR implements for transaction selection, in that a transaction is less likely to be evicted from the mempool if its feerate with descendants is high. But more fundamentally, mempool eviction is just a different problem than transaction selection: when selecting transactions for a block, a candidate transaction’s ancestors must be included (so it’s natural to sort by feerate-with-ancestors), but when considering a transaction for mempool eviction, a candidate transaction’s descendants must also be removed as well (so we use max(feerate, feerate with descendants) as our sort order for removal).

MarcoFalke commented at 9:52 am on June 13, 2016: member

Needs rebase

sdaftuar force-pushed on Jun 13, 2016

sdaftuar commented at 2:08 pm on June 13, 2016: member

Rebased after #7598 has been merged. This PR is ready for review!

MarcoFalke commented at 11:36 am on June 14, 2016: member

Concept ACK

in src/miner.cpp: in 157d24e2e3 outdated

189+        }
190+        else {
191+            iit++;
192+        }
193+    }
194+}

JeremyRubin commented at 5:04 pm on June 14, 2016:

~~Nit: may as well have ++iit in the for loop because it always occurs.~~

0for (CTxMemPool::setEntries::iterator iit = testSet.begin(); iit != testSet.end(); ++iit) {
1         // Only test txs not already in the block
2         if (inBlock.count(*iit)) 
3             testSet.erase(iit);
4 }

Edit: ignore, erasing an iterator invalidates iit.

in src/miner.cpp: in 157d24e2e3 outdated

198+    if (nBlockSize + packageSize >= nBlockMaxSize)
199+        return false;
200+    if (nBlockSigOps + packageSigOps >= MAX_BLOCK_SIGOPS)
201+        return false;
202+    return true;
203+}

JeremyRubin commented at 5:45 pm on June 14, 2016:

nit: Can just have return ! (nBlockSize + packageSize >= nBlockMaxSize || nBlockSigOps + packageSigOps >= MAX_BLOCK_SIGOPS)

or

return (nBlockSize + packageSize < nBlockMaxSize && nBlockSigOps + packageSigOps < MAX_BLOCK_SIGOPS)

JeremyRubin commented at 5:46 pm on June 14, 2016:

nit: >= should really be just > because we allow up to the max sizes.

JeremyRubin commented at 6:24 pm on June 14, 2016: contributor

utAck

in src/miner.cpp: in 0e1a2cc3fb outdated

215+        assert(potentialBlockSize < nBlockMaxSize);
216+        potentialBlockSigOps += it->GetSigOpCount();
217+        assert (potentialBlockSigOps < MAX_BLOCK_SIGOPS);
218+    }
219+    return true;
220+}

mrbandrews commented at 2:46 pm on June 15, 2016:

Nit: since block size and sigops have already been checked, it shouldn’t be necessary to re-check and assert here.

sdaftuar commented at 4:24 pm on June 15, 2016:

I guess I left these asserts in to make the logic in addPackageTxs easier to review, in that if you weren’t sure whether the mapModifiedTx calculations were correct, this assertion ensures that it doesn’t underestimate. But I suppose this would be caught in TestBlockValidity anyway, so I’ll remove.

in src/miner.cpp: in 0e1a2cc3fb outdated

399+// mapModifiedTxs with the next transaction in the mempool to decide what
400+// transaction package to work on next.
401+void BlockAssembler::addPackageTxs()
402+{
403+    // Start by adding all descendants of previously added txs to mapModifiedTx
404+    // and modifiying them for their already included ancestors

mrbandrews commented at 2:53 pm on June 15, 2016:

nit: modifiying -> modifying

in src/miner.cpp: in 0e1a2cc3fb outdated

408+    CTxMemPool::txiter iter;
409+    while (mi != mempool.mapTx.get<ancestor_score>().end() || !mapModifiedTx.empty())
410+    {
411+        // First try to find a new transaction in mapTx to evaluate.
412+        if (mi != mempool.mapTx.get<ancestor_score>().end() &&
413+                SkipMapTxEntry(mempool.mapTx.project<0>(mi))) {

mrbandrews commented at 3:02 pm on June 15, 2016:

Nit: perhaps SkipMapTxEntry could just be included here to avoid the separate method. Alternatively, if the code would be a little too messy, failedTxs could be a local variable passed to the helper function rather than a class member variable.

in src/miner.cpp: in 0e1a2cc3fb outdated

498+        }
499+
500+        // Update transactions that depend on each of these
501+        UpdatePackagesForAdded(ancestors);
502+    }
503+}

mrbandrews commented at 3:04 pm on June 15, 2016:

mapModifiedTxs should be cleared at the end of this function. Or it could be a local variable and passed as needed.

sdaftuar commented at 4:31 pm on June 15, 2016: member

Pushed commits addressing @mrbandrews comments.

re: @JeremyRubin’s nits, I’m leaving TestPackage alone, I think it’s pretty readable in its current form, and the >= shouldn’t matter for anything and is consistent with addScoreTx’s behavior.

I’ll do another round of testing locally, similar to what I did before when I opened the pull, with the current version of the code and report back.

sdaftuar commented at 9:15 pm on June 15, 2016: member

I re-tested my code by simulating on data from 2016-02-23 -to 2016-02-29, comparing the default mining code in master against the mining code introduced here, by calling CreateNewBlock every 100 transactions.

I looked at the fees produced in the last call to CNB before each block found by the network, and the code in this PR produced higher fee blocks in each instance (894 data points).

I was also benchmarking the runtime of CNB over all invocations, and this code was negligibly faster than the old code (summing over all invocations).

gmaxwell commented at 9:28 pm on June 15, 2016: contributor

ACK.

Beyond review, I tested this on several mining nodes for about a week (in addition to prior testing I did months back); I also took nodes running this and a tight getblocktemplate loop through a number of reorg tests with debugging turned up. Finally, rather than using sdaftuar record and replay framework, I scraped all the recent transactions out of recent blocks, reorged the chain back and put all the still-valid transactions into the mempool and confirmed it gave a valid gbt result with higher total fees than the before the patch, I tried this at a dozen different points.

mrbandrews commented at 3:37 pm on June 16, 2016: contributor

ACK. Code review ACK and in my testing, though not nearly as comprehensive as sdaftuar and gmaxwell’s testing, the new code yields higher fees by 1% to 15%.

in src/miner.cpp: in 0e1a2cc3fb outdated

400+// transaction package to work on next.
401+void BlockAssembler::addPackageTxs()
402+{
403+    // Start by adding all descendants of previously added txs to mapModifiedTx
404+    // and modifiying them for their already included ancestors
405+    UpdatePackagesForAdded(inBlock);

sipa commented at 3:50 pm on June 16, 2016:

If I understand it correctly, this means you can’t call addPackageTxs() twice (or have another add*Tx() that also calls UpdatePackagesForAdded(inBlock);. That could be solved by keeping a set of things in inBlock that have already been counted for package updates, but I guess it’s sufficient to just document that this must not be called twice.

sdaftuar commented at 3:53 pm on June 16, 2016:

Calling it twice doesn’t really make sense, as addPackageTx’s would fill up your block the first time. I think if you wanted to implement custom transaction selection, you’d implement it in a function that is called before you’d call addPackageTx’s to fill the rest of the block (like how priority tx’s are done).

I can add a comment though to make this clearer.

sdaftuar commented at 3:57 pm on June 16, 2016:

Oh I forgot to mention: this function no longer changes global state, after I addressed @mrbandrews comments in a later commit and moved mapModifiedTx to be a local variable in addPackageTxs. So this can be reused, though it’s pretty tied to the package selection.

sipa commented at 4:09 pm on June 16, 2016:

ACK, fixed by the later commit.

in src/miner.cpp: in 0e1a2cc3fb outdated

424+            // We're out of entries in mapTx; use the entry from mapModifiedTx
425+            iter = modit->iter;
426+            fUsingModified = true;
427+        } else {
428+            // Try to compare the mapTx entry to the mapModifiedTx entry
429+            iter = mempool.mapTx.project<0>(mi);

sipa commented at 3:56 pm on June 16, 2016:

Nit: this can move into the else branch below.

Edit: nevermind, you’re using it in the comparator.

in src/miner.h: in 0e1a2cc3fb outdated

84+};
85+
86+typedef boost::multi_index_container<
87+    CTxMemPoolModifiedEntry,
88+    boost::multi_index::indexed_by<
89+        boost::multi_index::ordered_unique<modifiedentry_txid>,

sipa commented at 4:03 pm on June 16, 2016:

Couldn’t CTxMempoolModifiedEntry be indexed by its CTxMemPool::txiter entry instead? Not sure if that’s comparable/hashable, though.

sdaftuar commented at 4:10 pm on June 16, 2016:

I don’t remember anymore but I feel like I tried that in the past and failed. If you have a patch, I will be happy to incorporate.

sipa commented at 4:25 pm on June 16, 2016:

See https://github.com/sipa/bitcoin/commits/cpfp

sdaftuar commented at 4:34 pm on June 16, 2016:

Thanks! Looks good.

in src/miner.cpp: in f3c65512a4 outdated

134@@ -134,7 +135,7 @@ CBlockTemplate* BlockAssembler::CreateNewBlock(const CScript& scriptPubKeyIn)
135                        : pblock->GetBlockTime();
136 
137     addPriorityTxs();
138-    addScoreTxs();
139+    addPackageTxs();

sipa commented at 4:05 pm on June 16, 2016:

The addScoreTxs method can be deleted, I think?

sdaftuar commented at 4:17 pm on June 16, 2016:

I raised that question at the beginning of this pull, and no discussion has happened; I’d like to defer that to its own PR to give anyone who is clinging to the old behavior a chance to argue for keeping the code?

sipa commented at 4:27 pm on June 16, 2016:

Fair enough.

sipa commented at 4:34 pm on June 16, 2016: member

ACK f3c65512a416cb9890a291d16204c83b5702b11e (with tree 66fb31f4aaaf3c49ef03c93b3b8155b531359e05).

Update: ACK b428fb2656b3cdf3021faf508eb78e6831e5a276 (with tree 2e6a8674a4e7179aea0041931d68076a442d0132).

Can you squash the fixups?

Use ancestor-feerate based transaction selection for mining

Includes changes by Pieter Wuille

c82a4e9a63

Add unit tests for ancestor feerate mining 29fac19c93

sdaftuar force-pushed on Jun 16, 2016

sdaftuar commented at 4:38 pm on June 16, 2016: member

Thanks, squashed.

sipa commented at 4:41 pm on June 16, 2016: member

ReACK 29fac19c93fabfed4163ee9ffa85f9188c9ee6ac (tree 2e6a8674a4e7179aea0041931d68076a442d0132).

sipa merged this on Jun 16, 2016

sipa closed this on Jun 16, 2016

sipa referenced this in commit 66db2d62d5 on Jun 16, 2016

codablock referenced this in commit 7f53ac9ea5 on Sep 16, 2017

codablock referenced this in commit 01b7745806 on Sep 19, 2017

codablock referenced this in commit f518f4cc40 on Dec 27, 2017

codablock referenced this in commit f362c610e4 on Dec 28, 2017

andvgal referenced this in commit a6b2047ce2 on Jan 6, 2019

furszy referenced this in commit b554c29785 on Jul 14, 2021

DrahtBot locked this on Sep 8, 2021

Mining: Select transactions using feerate-with-ancestors #7600