estimatefee / estimatepriority RPC methods

gavinandresen commented at 1:18 pm on March 25, 2014: contributor

New RPC methods: return an estimate of the fee (or priority) a transaction needs to be likely to confirm in a given number of blocks.

Fee-per-kilobyte estimates for 1..10 confirmations from this code:

 0gavin$ for i in {1..10}; do ./bitcoind estimatefee $i; done
 10.00044643
 20.00044248
 30.00044053
 40.00029499
 50.00020492
 60.00014970
 70.00012981
 80.00012190
 90.00011769
100.00011217

Mike Hearn created the first version of this method for estimating fees. It works as follows:

For transactions that took 1 to N (I picked N=25) blocks to confirm, keep N buckets with at most 100 entries in each recording the fees-per-kilobyte paid by those transactions.

(separate buckets are kept for transactions that confirmed because they are high-priority)

The buckets are filled as blocks are found, and are saved/restored as part of the mempool.dat file.

A few variations on Mike’s initial scheme:

To estimate the fee needed for a transaction to confirm in X buckets, all of the samples in all of the buckets are used and a median of all of the data is used to make the estimate. For example, imagine 25 buckets each containing the full 100 entries. Those 2,500 samples are sorted, and the estimate of the fee needed to confirm in the very next block is the 50’th-highest-fee-entry in that sorted list; the estimate of the fee needed to confirm in the next two blocks is the 150’th-highest-fee-entry, etc.

That algorithm has the nice property that estimates of how much fee you need to pay to get confirmed in block N will always be greater than or equal to the estimate for block N+1. It would clearly be wrong to say “pay 11 uBTC and you’ll get confirmed in 3 blocks, but pay 12 uBTC and it will take LONGER”.

A single block will not contribute more than 10 entries to any one bucket, so a single miner and a large block cannot overwhelm the estimates.

in qa/rpc-tests/smartfees.py: in 8a826e675c outdated

25+    for i in range(len(txdata["vout"])):
26+        if txdata["vout"][i]["value"] == amount:
27+            return i
28+    return -1
29+
30+def gather_inputs(from_node, amount_needed):

mikehearn commented at 4:49 pm on March 27, 2014:

A lot of this stuff looks like it should be in a utility library (the main function too), otherwise it’s getting copy/pasted everywhere.

gavinandresen commented at 5:52 pm on March 31, 2014:

Moving useful functions to util.py, but am going to hold off on moving main; there’s another pull request that bundles everything up into a framework…

in qa/rpc-tests/smartfees.py: in 8a826e675c outdated

43+def make_change(from_node, amount_in, amount_out, fee):
44+    outputs = {}
45+    amount = amount_out+fee
46+    change = amount_in - amount
47+    if change > amount*2:
48+        # Create an extra change output to break up big inputs

mikehearn commented at 4:49 pm on March 27, 2014:

Is this actually needed for the tests? It seems a bit odd to have it here.

in qa/rpc-tests/smartfees.py: in 8a826e675c outdated

23+def find_output(node, txid, amount):
24+    txdata = node.getrawtransaction(txid, 1)
25+    for i in range(len(txdata["vout"])):
26+        if txdata["vout"][i]["value"] == amount:
27+            return i
28+    return -1

mikehearn commented at 4:51 pm on March 27, 2014:

The error case is never checked by the caller. Better to throw, I think.

in qa/rpc-tests/smartfees.py: in 8a826e675c outdated

117+                             "-debug=mempool", "-debug=estimatefee"]))
118+    connect_nodes(nodes[1], 0)
119+
120+    # Node2 is a stingy miner, that
121+    # produces very small blocks (room for only 3 or so transactions)
122+    node2args = [ "-blockprioritysize=0", "-blockmaxsize=1500" ]

mikehearn commented at 4:52 pm on March 27, 2014:

But no debug flags here?

in qa/rpc-tests/smartfees.py: in 8a826e675c outdated

155+                                                  Decimal("0.0"), min_fee, 20))
156+        nodes[1].setgenerate(True, 1)
157+        sync_blocks(nodes)
158+
159+    all_estimates = [ nodes[0].estimatefee(i) for i in range(1,12) ]
160+    print("Fee estimates, more generous miner: "+str([ str(e) for e in all_estimates]))

mikehearn commented at 4:55 pm on March 27, 2014:

Don’t understand this bit about more generous miner: mining behaviour didn’t change in the step between the test being repeated?

in src/init.cpp: in 8a826e675c outdated

918+        {
919+            AcceptToMemoryPool(mempool, valState, mempoolEntry.GetTx(), false,
920+                               &fMissingInputs, false);
921+        }
922+        LogPrintf("Accepted %lu mempool transactions\n", mempool.size());
923+    }

mikehearn commented at 4:55 pm on March 27, 2014:

No log that loading failed?

gavinandresen commented at 5:46 pm on March 31, 2014:

No, logging of failures happens in mempool.Read(), no need to log a specific error and then a generic “read failed” error.

in src/txmempool.cpp: in 8a826e675c outdated

200+                seenTxConfirm(dFeePerKb, dPriority, i);
201+            }
202+        }
203+        for (size_t i = 0; i < history.size(); i++) {
204+            if (history[i].FeeSamples() + history[i].PrioritySamples() > 0)
205+// TODO: REMOVE ME

mikehearn commented at 5:00 pm on March 27, 2014:

// TODO: save me :)

in src/txmempool.cpp: in 8a826e675c outdated

305     // Sanity checks off by default for performance, because otherwise
306     // accepting transactions becomes O(N^2) where N is the number
307     // of transactions in the pool
308     fSanityCheck = false;
309+
310+    minerPolicyEstimator = new CMinerPolicyEstimator(25);

mikehearn commented at 5:05 pm on March 27, 2014:

// 25 blocks was determined empirically to give the best performance as of March 2014.

gavinandresen commented at 6:00 pm on March 31, 2014: contributor

Rebased on master, and tweaked to address Mike’s comments (thanks for the review!).

gavinandresen commented at 1:58 pm on April 1, 2014: contributor

Pull-tester errors were with the regression test:

pull-tester is running python 2.6 (fixed by using subprocess.check_call instead of python2.7 subprocess.check_output).

laanwj added this to the milestone 0.10.0 on May 6, 2014

mikehearn commented at 5:23 pm on May 7, 2014: contributor

What’s the status of this work? Does it need more review or testing or what?

laanwj commented at 5:34 pm on May 7, 2014: member

It’s queued to be merged for 0.10, so after the 0.9.2 branch split-off. But more review and testing is always welcome.

in src/qt/coincontroldialog.cpp: in eae89281ce outdated

594@@ -610,19 +595,19 @@ void CoinControlDialog::updateLabels(WalletModel *model, QDialog* dialog)
595 
596     // tool tips
597     QString toolTip1 = tr("This label turns red, if the transaction size is greater than 1000 bytes.") + "<br /><br />";
598-    toolTip1 += tr("This means a fee of at least %1 per kB is required.").arg(BitcoinUnits::formatWithUnit(nDisplayUnit, CTransaction::nMinTxFee)) + "<br /><br />";
599+    toolTip1 += tr("This means a fee of at least %1 per kB is required.").arg(BitcoinUnits::formatWithUnit(nDisplayUnit, CTransaction::minTxFee.GetFee(1000))) + "<br /><br />";

laanwj commented at 2:02 pm on May 21, 2014:

Nit: There are a lot of getFee(1000)’s. I don’t have an objection to them, although on first reading I was confused (‘what is this magic 1000 number’). But instead of suggesting defining a constant BYTES_PER_KB :p I think it would be nicer to have a function getFeePerKB() that’s simply an inline for getFee(1000).

gavinandresen commented at 4:33 pm on May 23, 2014: contributor

Nit-picked : replace GetFee(1000) with inline GetFeePerK() method.

gavinandresen commented at 4:35 pm on May 29, 2014: contributor

Rebased on master.

I’m very happy with how quickly this generates good estimates; after just three or four blocks it has a good idea of what priority or fee is needed to get into the next couple of blocks.

sipa commented at 5:18 pm on May 29, 2014: member

Save/restore of the mempool seems quite dangerous in making some unconfirmed transactions potentially live forever in the network…

jgarzik commented at 5:22 pm on May 29, 2014: contributor

Nod – you only want to save/restore once we have transactions expiring from the mempool (janitor etc.) after X blocks.

petertodd commented at 7:06 am on May 30, 2014: contributor

@sipa Yup. Even with expiration it makes many DoS attacks and similar vulnerabilities much more effective. For instance OOM crashes are hard to exploit right now because the txs are hard to propagate; not true with save restore.

jgarzik commented at 7:23 am on May 30, 2014: contributor

In general, lacking other tools, clearing the mempool [implicitly] at startup is a valuable tool.

gavinandresen commented at 1:22 pm on May 30, 2014: contributor

Consensus is saving/restoring memory pool is dangerous, so I’ve removed that functionality.

Fee estimates ARE saved/restored, to datadir fee_estimates.dat file.

Type-safe CFeeRate class

Use CFeeRate instead of an int64_t for quantities that are
fee-per-size.

Helps prevent unit-conversion mismatches between the wallet,
relaying, and mining code.

c6cb21d17a

Allow multiple regression tests to run at once

Choose ports at startup based on PID, so multiple regression tests
can run on the same system at the same time.

0193fb82a6

estimatefee / estimatepriority RPC methods

New RPC methods: return an estimate of the fee (or priority) a
transaction needs to be likely to confirm in a given number of
blocks.

Mike Hearn created the first version of this method for estimating fees.
It works as follows:

For transactions that took 1 to N (I picked N=25) blocks to confirm,
keep N buckets with at most 100 entries in each recording the
fees-per-kilobyte paid by those transactions.

(separate buckets are kept for transactions that confirmed because
they are high-priority)

The buckets are filled as blocks are found, and are saved/restored
in a new fee_estiamtes.dat file in the data directory.

A few variations on Mike's initial scheme:

To estimate the fee needed for a transaction to confirm in X buckets,
all of the samples in all of the buckets are used and a median of
all of the data is used to make the estimate. For example, imagine
25 buckets each containing the full 100 entries. Those 2,500 samples
are sorted, and the estimate of the fee needed to confirm in the very
next block is the 50'th-highest-fee-entry in that sorted list; the
estimate of the fee needed to confirm in the next two blocks is the
150'th-highest-fee-entry, etc.

That algorithm has the nice property that estimates of how much fee
you need to pay to get confirmed in block N will always be greater
than or equal to the estimate for block N+1. It would clearly be wrong
to say "pay 11 uBTC and you'll get confirmed in 3 blocks, but pay
12 uBTC and it will take LONGER".

A single block will not contribute more than 10 entries to any one
bucket, so a single miner and a large block cannot overwhelm
the estimates.

171ca7745e

jgarzik commented at 3:00 pm on June 6, 2014: contributor

ACK. Speaking conservatively, appears to be low impact as implemented at first glance.

The largest impact is to existing operations is the hook that is executed when a block appears, necessitating the removal of TXs from the mempool. At a quick glance, it was difficult to discern the big-O() of this operation, but at least it is all in CPU/RAM, and only executed once per incoming new block.

One nit (perhaps for post merge): I don’t like the mempool itself doing the reading. Would prefer a helper function outside the mempool class that perform the necessary I/O. Our normal pattern is a pure implement-serialize chain until actual file I/O is needed. This seems to mix metaphors a bit more than usual.

in src/qt/coincontroldialog.cpp: in 171ca7745e

535@@ -536,26 +536,11 @@ void CoinControlDialog::updateLabels(WalletModel *model, QDialog* dialog)
536         {
537             nChange = nAmount - nPayFee - nPayAmount;
538 
539-            // if sub-cent change is required, the fee must be raised to at least CTransaction::nMinTxFee

sipa commented at 3:08 pm on June 6, 2014:

Wouldn’t we want to keep this block, but do it to prevent dust change rather than subcent change?

gavinandresen commented at 3:34 pm on June 6, 2014:

Dust change is handled at line 543.

BitcoinPullTester commented at 4:50 pm on June 6, 2014: none

Automatic sanity-testing: FAILED BUILD/TEST, see http://jenkins.bluematt.me/pull-tester/171ca7745e77c9f78f26556457fe64e5b2004a75 for binaries and test log.

This could happen for one of several reasons:

It chanages paths in makefile.linux-mingw or otherwise changes build scripts in a way that made them incompatible with the automated testing scripts (please tweak those patches in qa/pull-tester)
It adds/modifies tests which test network rules (thanks for doing that), which conflicts with a patch applied at test time
It does not build on either Linux i386 or Win32 (via MinGW cross compile)
The test suite fails on either Linux i386 or Win32
The block test-cases failed (lookup the first bNN identifier which failed in https://github.com/TheBlueMatt/test-scripts/blob/master/FullBlockTestGenerator.java)

If you believe this to be in error, please ping BlueMatt on freenode or TheBlueMatt here.

This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.

laanwj merged this on Jun 6, 2014

laanwj closed this on Jun 6, 2014

laanwj referenced this in commit 95d68c48d7 on Jun 6, 2014

gmaxwell commented at 5:28 pm on November 20, 2014: contributor

If mempool saving ever comes back as a proposal we should keep in mind that doing so presents a pretty strong watermarking attack. (… not that we don’t have many other watermarking attacks.)

DrahtBot locked this on Sep 8, 2021

estimatefee / estimatepriority RPC methods #3959