Add feerate histogram to getmempoolinfo #21422

pull kiminuo wants to merge 2 commits into bitcoin:master from kiminuo:feature/2021-03-Feerate-histogram changing 6 files +273 −8
  1. kiminuo commented at 10:32 am on March 12, 2021: contributor

    This PR is a slightly modified version of #15836 (jonasschnelli):

    This follows the approach of adding statistical information to Bitcoin Core that would otherwise be inefficient to calculate outside of the codebase.

    Adds an optional feerate histogram to getmempoolinfo.

    The concept and code is heavily inspired by the stats jhoenicke runs (https://github.com/jhoenicke/mempool, http://bitcoin-mempool.info).

    If someone has a good idea how to make the feerate-groups dynamic but also semi-constant for similar fee environments, please comment.

    If this is feature we’d like to have in master (concept ACKs), I’d continue this with writing tests.

    A simple plot of the data is here. RPC output sample is here.

    My attempts to contact jonasschnelli were, unfortunately, unsuccessful so I decided to create this PR in an attempt to move this forward. If this is somehow problematic, please let me know to work it out. edit: Jonas is happy the work continues.

    This PR

    Note that REST support which is in #15836 is not included in this PR. It can be improved in this PR or in a follow-up one if it is deemed useful/required/etc.

    Applied review comments from the old PR

    Test commands

    0$ ./bitcoin-cli -testnet getmempoolinfo "[0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 17, 20, 25, 30, 40, 50, 60, 70, 80, 100, 120, 140, 170, 200]" # To test the new behavior
    
    0$ test/functional/test_runner.py mempool_fee_histogram.py # To run the new test
    
     0$ ./bitcoin-cli -testnet -rpcuser=test -rpcpassword=test help getmempoolinfo
     1getmempoolinfo ( [fee_rate,...] )
     2
     3Returns details on the active state of the TX memory pool.
     4
     5Arguments:
     61. fee_histogram    (json array, optional) Fee statistics grouped by fee rate ranges
     7     [
     8       fee_rate,    (numeric, required) Fee rate (in sat/vB) to group the fees by
     9       ...
    10     ]
    11
    12Result:
    13{                               (json object)
    14  "loaded" : true|false,        (boolean) True if the mempool is fully loaded
    15  "size" : n,                   (numeric) Current tx count
    16  "bytes" : n,                  (numeric) Sum of all virtual transaction sizes as defined in BIP 141. Differs from actual serialized size because witness data is discounted
    17  "usage" : n,                  (numeric) Total memory usage for the mempool
    18  "total_fee" : n,              (numeric) Total fees for the mempool in BTC, ignoring modified fees through prioritizetransaction
    19  "maxmempool" : n,             (numeric) Maximum memory usage for the mempool
    20  "mempoolminfee" : n,          (numeric) Minimum fee rate in BTC/kvB for tx to be accepted. Is the maximum of minrelaytxfee and minimum mempool fee
    21  "minrelaytxfee" : n,          (numeric) Current minimum relay fee for transactions
    22  "unbroadcastcount" : n,       (numeric) Current number of transactions that haven't passed initial broadcast yet
    23  "fee_histogram" : {           (json object)
    24    "fee_rate_groups" : {       (json object)
    25      "<fee_rate_group>" : {    (json object) Fee rate group named by its lower bound (in sat/vB), identical to the "from" field below
    26        "size" : n,             (numeric) Cumulative size of all transactions in the fee rate group (in vBytes)
    27        "count" : n,            (numeric) Number of transactions in the fee rate group
    28        "fees" : n,             (numeric) Cumulative fees of all transactions in the fee rate group (in sat)
    29        "from" : n             (numeric) Group contains transactions with fee rates equal or greater than this value (in sat/vB)
    30      },
    31      ...
    32    },
    33    "total_fees" : n            (numeric) Total available fees in mempool (in sat)
    34  }
    35}
    36
    37Examples:
    38> bitcoin-cli getmempoolinfo
    39> bitcoin-cli getmempoolinfo "[0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 17, 20, 25, 30, 40, 50, 60, 70, 80, 100, 120, 140, 170, 200]"
    40> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id": "curltest", "method": "getmempoolinfo", "params": []}' -H 'content-type: text/plain;' http://127.0.0.1:8332/
    41> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id": "curltest", "method": "getmempoolinfo", "params": [[0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 17, 20, 25, 30, 40, 50, 60, 70, 80, 100, 120, 140, 170, 200]]}' -H 'content-type: text/plain;' http://127.0.0.1:8332/
    

    Output on testnet (2022-07-09)

    0./bitcoin-cli -testnet -rpcuser=test -rpcpassword=test getmempoolinfo "[0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 17, 20, 25, 30, 40, 50, 60, 70, 80, 100, 120, 140, 170, 200]"
    
      0{
      1"loaded": true,
      2"size": 10,
      3"bytes": 2652,
      4"usage": 13504,
      5"total_fee": 0.00363010,
      6"maxmempool": 300000000,
      7"mempoolminfee": 0.00001000,
      8"minrelaytxfee": 0.00001000,
      9"unbroadcastcount": 0,
     10"fee_histogram": {
     11  "fee_rate_groups": {
     12    "0": {
     13      "size": 0,
     14      "count": 0,
     15      "fees": 0,
     16      "from": 0
     17    },
     18    "1": {
     19      "size": 403,
     20      "count": 2,
     21      "fees": 403,
     22      "from": 1
     23    },
     24    "2": {
     25      "size": 554,
     26      "count": 2,
     27      "fees": 561,
     28      "from": 2
     29    },
     30    "3": {
     31      "size": 0,
     32      "count": 0,
     33      "fees": 0,
     34      "from": 3
     35    },
     36    "4": {
     37      "size": 0,
     38      "count": 0,
     39      "fees": 0,
     40      "from": 4
     41    },
     42    "5": {
     43      "size": 0,
     44      "count": 0,
     45      "fees": 0,
     46      "from": 5
     47    },
     48    "6": {
     49      "size": 255,
     50      "count": 1,
     51      "fees": 1345,
     52      "from": 6
     53    },
     54    "7": {
     55      "size": 0,
     56      "count": 0,
     57      "fees": 0,
     58      "from": 7
     59    },
     60    "8": {
     61      "size": 0,
     62      "count": 0,
     63      "fees": 0,
     64      "from": 8
     65    },
     66    "10": {
     67      "size": 0,
     68      "count": 0,
     69      "fees": 0,
     70      "from": 10
     71    },
     72    "12": {
     73      "size": 0,
     74      "count": 0,
     75      "fees": 0,
     76      "from": 12
     77    },
     78    "14": {
     79      "size": 0,
     80      "count": 0,
     81      "fees": 0,
     82      "from": 14
     83    },
     84    "17": {
     85      "size": 0,
     86      "count": 0,
     87      "fees": 0,
     88      "from": 17
     89    },
     90    "20": {
     91      "size": 0,
     92      "count": 0,
     93      "fees": 0,
     94      "from": 20
     95    },
     96    "25": {
     97      "size": 352,
     98      "count": 1,
     99      "fees": 9505,
    100      "from": 25
    101    },
    102    "30": {
    103      "size": 144,
    104      "count": 1,
    105      "fees": 4520,
    106      "from": 30
    107    },
    108    "40": {
    109      "size": 0,
    110      "count": 0,
    111      "fees": 0,
    112      "from": 40
    113    },
    114    "50": {
    115      "size": 374,
    116      "count": 1,
    117      "fees": 20000,
    118      "from": 50
    119    },
    120    "60": {
    121      "size": 351,
    122      "count": 1,
    123      "fees": 23937,
    124      "from": 60
    125    },
    126    "70": {
    127      "size": 0,
    128      "count": 0,
    129      "fees": 0,
    130      "from": 70
    131    },
    132    "80": {
    133      "size": 0,
    134      "count": 0,
    135      "fees": 0,
    136      "from": 80
    137    },
    138    "100": {
    139      "size": 0,
    140      "count": 0,
    141      "fees": 0,
    142      "from": 100
    143    },
    144    "120": {
    145      "size": 0,
    146      "count": 0,
    147      "fees": 0,
    148      "from": 120
    149    },
    150    "140": {
    151      "size": 0,
    152      "count": 0,
    153      "fees": 0,
    154      "from": 140
    155    },
    156    "170": {
    157      "size": 0,
    158      "count": 0,
    159      "fees": 0,
    160      "from": 170
    161    },
    162    "200": {
    163      "size": 219,
    164      "count": 1,
    165      "fees": 302739,
    166      "from": 200
    167    }
    168  },
    169  "total_fees": 363010
    170}
    171}
    

    Various inputs

    0./bitcoin-cli -testnet -rpcuser=test -rpcpassword=test getmempoolinfo "[0]"       # OK, one fee rate group covering all possible transaction fee rates (i.e. `>=0`); total_fee = fee_histogram.total_fees
    1./bitcoin-cli -testnet -rpcuser=test -rpcpassword=test getmempoolinfo "[-1]"      # reports: Non-negative values are expected
    2./bitcoin-cli -testnet -rpcuser=test -rpcpassword=test getmempoolinfo "[0,0]"     # reports: Strictly increasing values are expected
    3./bitcoin-cli -testnet -rpcuser=test -rpcpassword=test getmempoolinfo "[0,1,0]"   # reports: Strictly increasing values are expected
    4./bitcoin-cli -testnet -rpcuser=test -rpcpassword=test getmempoolinfo "[0, 1000000000000000000000000000000000]" # reports: JSON integer out of range
    5./bitcoin-cli -testnet -rpcuser=test -rpcpassword=test getmempoolinfo '[1.2,2,3]' # reports: JSON integer out of range
    6./bitcoin-cli -testnet -rpcuser=test -rpcpassword=test getmempoolinfo '["2"]'     # reports: Error parsing JSON: ['2']
    
  2. promag commented at 10:45 am on March 12, 2021: member
    Hey @kiminuo, can you amend the commit and remove my mention? Keep in mind that mentions in commits lead to notifications. Ty.
  3. kiminuo commented at 10:46 am on March 12, 2021: contributor

    Hey @kiminuo, can you amend the commit and remove my mention? Keep in mind that mentions in commits lead to notifications. Ty.

    Yes, sorry for that.

  4. DrahtBot added the label RPC/REST/ZMQ on Mar 12, 2021
  5. kiminuo force-pushed on Mar 12, 2021
  6. jonasschnelli commented at 7:34 pm on March 12, 2021: contributor
    Thanks for picking this up.
  7. DrahtBot commented at 11:09 am on March 16, 2021: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept NACK 0xB10C
    Concept ACK jonatack, JeremyRubin, ghost, jamesob, molnard, sipa
    Stale ACK stratospher, kristapsk
    Ignored review glozow

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #26525 (Remove -mempoolfullrbf option by BitcoinErrorLog)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  8. kiminuo force-pushed on Mar 20, 2021
  9. kiminuo force-pushed on Mar 21, 2021
  10. kiminuo force-pushed on Mar 21, 2021
  11. jonatack commented at 10:31 am on March 21, 2021: contributor
    Concept ACK. For the test commit, maybe a more descriptive title that can be understood on its own.
  12. in src/rpc/blockchain.cpp:1518 in 81cd4bb482 outdated
    1513+        std::vector<uint64_t> sizes(limits.size(), 0);
    1514+        std::vector<uint64_t> count(limits.size(), 0);
    1515+        std::vector<uint64_t> fees(limits.size(), 0);
    1516+
    1517+        for (const CTxMemPoolEntry& e : pool.mapTx) {
    1518+            int size = (int)e.GetTxSize();
    


    jonatack commented at 10:45 am on March 21, 2021:
    Use a named cast (and can be const)
  13. in src/rpc/blockchain.cpp:1532 in 81cd4bb482 outdated
    1527+            CAmount dfpb = dfees / dsize; // fee per byte including descendants
    1528+            CAmount tfpb = (afees + dfees - fee) / (asize + dsize - size);
    1529+            CAmount feeperbyte = std::max(std::min(dfpb, tfpb), std::min(fpb, afpb));
    1530+
    1531+            // Distribute feerates into feelimits
    1532+            for (int i = limits.size() - 1; i >= 0; i--) {
    


    jonatack commented at 10:46 am on March 21, 2021:
    0            for (int i = limits.size() - 1; i >= 0; --i) {
    

    kiminuo commented at 7:20 am on March 22, 2021:
    Addressed, thanks!
  14. in src/rpc/blockchain.cpp:1535 in 81cd4bb482 outdated
    1530+
    1531+            // Distribute feerates into feelimits
    1532+            for (int i = limits.size() - 1; i >= 0; i--) {
    1533+                if (feeperbyte >= limits[i].second) {
    1534+                    sizes[i] += size;
    1535+                    count[i]++;
    


    jonatack commented at 10:46 am on March 21, 2021:
    0                    ++count[i];
    

    kiminuo commented at 7:21 am on March 22, 2021:
    Addressed, thanks!
  15. in src/rpc/blockchain.cpp:1527 in 81cd4bb482 outdated
    1522+            uint64_t dsize = e.GetSizeWithDescendants();
    1523+            CAmount dfees = e.GetModFeesWithDescendants();
    1524+
    1525+            CAmount fpb = fee / size; // fee per byte
    1526+            CAmount afpb = afees / asize; // fee per byte including ancestors
    1527+            CAmount dfpb = dfees / dsize; // fee per byte including descendants
    


    jonatack commented at 10:49 am on March 21, 2021:
    Are size, asize and dsize guaranteed to be non-zero? Can they be const?

    kiminuo commented at 9:46 pm on March 21, 2021:

    @jonatack This is a part of the original PR I need to research more.

    Do you possibly have a tip where to learn how to properly compute feeperbyte value for a transaction? Or possibly who would know that?

    Anyway, I’m slowly skimming Bitcoin Core codebase so maybe I’ll be lucky :-)


    jonatack commented at 7:16 pm on March 23, 2021:

    For computing a fee rate per byte, have a look in src/policy/feerate.{h, cpp}. Here’s a proposed diff (using the current code) to do explicit casts rather than implicit conversions and narrowing. The size denominators are calling GetTxSize(), which should be non-zero, but you can CHECK_NONFATAL on these. That said, you may be right that it can be done properly/better, as you asked, by using the CFeeRate ctors.

     0         for (const CTxMemPoolEntry& e : pool.mapTx) {
     1-            const int size = (int)e.GetTxSize();
     2-            CAmount fee = e.GetFee();
     3-            uint64_t asize = e.GetSizeWithAncestors();
     4-            CAmount afees = e.GetModFeesWithAncestors();
     5-            uint64_t dsize = e.GetSizeWithDescendants();
     6-            CAmount dfees = e.GetModFeesWithDescendants();
     7-
     8-            CAmount fpb = fee / size; // fee per byte
     9-            CAmount afpb = afees / asize; // fee per byte including ancestors
    10-            CAmount dfpb = dfees / dsize; // fee per byte including descendants
    11-            CAmount tfpb = (afees + dfees - fee) / (asize + dsize - size);
    12-            CAmount feeperbyte = std::max(std::min(dfpb, tfpb), std::min(fpb, afpb));
    13+            const CAmount fee{e.GetFee()};
    14+            const CAmount afees{e.GetModFeesWithAncestors()};
    15+            const CAmount dfees{e.GetModFeesWithDescendants()};
    16+
    17+            const int64_t size{static_cast<int64_t>(e.GetTxSize())};
    18+            const int64_t asize{static_cast<int64_t>(e.GetSizeWithAncestors())};
    19+            const int64_t dsize{static_cast<int64_t>(e.GetSizeWithDescendants())};
    20+
    21+            CHECK_NONFATAL(size > 0);
    22+            CHECK_NONFATAL(asize > 0);
    23+            CHECK_NONFATAL(dsize > 0);
    24+            CHECK_NONFATAL(asize + dsize - size > 0);
    25+
    26+            CAmount fpb{fee / size};     // fee per byte
    27+            CAmount afpb{afees / asize}; // fee per byte including ancestors
    28+            CAmount dfpb{dfees / dsize}; // fee per byte including descendants
    29+            CAmount tfpb{(afees + dfees - fee) / (asize + dsize - size)};
    30+            CAmount feeperbyte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
    

    jonatack commented at 7:18 pm on March 23, 2021:
    (note, CAmount is int64_t, see src/amount.h)

    kiminuo commented at 7:40 pm on March 23, 2021:
    Thanks. Will have a look!

    jonatack commented at 6:42 am on March 24, 2021:

    Ah, my comment didn’t make it through the internet:

    It would be good to use feerate values in sat/vB for both the user input and the output here, as we are slowly moving from BTC/kvB to sat/vB fee rate units, per user demand.

    Currently, you can construct a feerate from a sat/vB amount with CFeeRate{amount, COIN}. I proposed CFeeRate::FromSatB and FromBtcKb named constructors in #20790, #20391, #20546 but none of the three are/were merged.


    kiminuo commented at 6:46 am on March 24, 2021:
    Originally, fee rates were in sats but then I found #12240 which made me think that sats -> BTC is preferred.

    kiminuo commented at 6:49 am on March 24, 2021:
    So I will switch it back to use sats. Thanks for the CFeeRate{amount, COIN} tip!

    kiminuo commented at 6:57 pm on March 24, 2021:
    @jonatack Do I understand correctly that CAmount fpb = fee / size is equivalent with CAmount fpb = CFeeRate(fee, size).GetFee(1)? Is that what you propose to use?

    jonatack commented at 6:07 pm on March 25, 2021:

    You can verify this with, for example

    0            assert(fpb == CFeeRate(fee, size).GetFee(1));
    1            assert(afpb == CFeeRate(afees, asize).GetFee(1));
    2            assert(dfpb == CFeeRate(dfees, dsize).GetFee(1));
    3            assert(tfpb == CFeeRate(afees + dfees - fee, asize + dsize - size).GetFee(1));
    

    (I left this as a hint in the suggestions in #21422#pullrequestreview-621285483, but please remove it and decide what you think is best)


    kiminuo commented at 9:04 am on March 26, 2021:
    Thanks! I like CFeeRate better as then I don’t add new divisions myself. The asserts work for me okay.
  16. in src/rpc/blockchain.cpp:1544 in 81cd4bb482 outdated
    1539+            }
    1540+        }
    1541+
    1542+        CAmount total_fees = 0; // Track total amount of available fees in mempool
    1543+        UniValue ranges(UniValue::VOBJ);
    1544+        for (size_t i = 0; i < limits.size(); i++) {
    


    jonatack commented at 10:50 am on March 21, 2021:
    0        for (size_t i = 0; i < limits.size(); ++i) {
    

    kiminuo commented at 7:21 am on March 22, 2021:
    Addressed, thanks!
  17. in src/rpc/blockchain.cpp:1614 in 81cd4bb482 outdated
    1611+    MempoolHistogramFeeLimits feelimits;
    1612+    std::optional<MempoolHistogramFeeLimits> feelimits_opt = std::nullopt;
    1613+
    1614+    if (!request.params[0].isNull()) {
    1615+        const UniValue feelimits_univalue = request.params[0].get_array();
    1616+        for (unsigned int i = 0; i < feelimits_univalue.size(); i++) {
    


    jonatack commented at 10:51 am on March 21, 2021:
    0        for (unsigned int i = 0; i < feelimits_univalue.size(); ++i) {
    

    kiminuo commented at 7:21 am on March 22, 2021:
    Addressed, thanks!
  18. in src/rpc/blockchain.cpp:1605 in 81cd4bb482 outdated
    1601+                                {RPCResult::Type::NUM, "total_fees", "Total available fees in mempool"},
    1602+                            }},
    1603                     }},
    1604                 RPCExamples{
    1605                     HelpExampleCli("getmempoolinfo", "")
    1606             + HelpExampleRpc("getmempoolinfo", "")
    


    jonatack commented at 10:52 am on March 21, 2021:
    Can you add an example?

    kiminuo commented at 6:31 pm on March 23, 2021:
    I added an example.

    jonatack commented at 7:33 pm on March 23, 2021:

    New examples look good, thanks.

    0> bitcoin-cli getmempoolinfo 
    1> bitcoin-cli getmempoolinfo ["0.00000001","0.00000010","0.00000100","0.00000200","0.00000400","0.00000800"]
    2> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id": "curltest", "method": "getmempoolinfo", "params": []}' -H 'content-type: text/plain;' http://127.0.0.1:8332/
    3> curl --user myusername --data-binary '{"jsonrpc": "1.0", "id": "curltest", "method": "getmempoolinfo", "params": [["0.00000001","0.00000010","0.00000100","0.00000200","0.00000400","0.00000800"]]}' -H 'content-type: text/plain;' http://127.0.0.1:8332/
    

    kiminuo commented at 7:39 pm on March 23, 2021:
    Great!
  19. in test/functional/mempool_fee_histogram.py:14 in 81cd4bb482 outdated
     9+    assert_equal,
    10+    assert_greater_than,
    11+    assert_greater_than_or_equal,
    12+    assert_no_key,
    13+)
    14+from decimal import Decimal
    


    jonatack commented at 10:58 am on March 21, 2021:
    The decimal stdlib import should be before the framework imports per PEP8, separated by a blank line

    kiminuo commented at 7:23 am on March 22, 2021:
    Hopefully, it’s correct now.
  20. in test/functional/test_runner.py:269 in 81cd4bb482 outdated
    224@@ -225,6 +225,7 @@
    225     'feature_nulldummy.py --descriptors',
    226     'mempool_accept.py',
    227     'mempool_expiry.py',
    228+    'mempool_fee_histogram.py',
    


    jonatack commented at 11:00 am on March 21, 2021:

    The new test file has incorrect permissions:

    0-rwxr-xr-x 1   5133 Mar 17 15:30 mempool_expiry.py*
    1-rw-r--r-- 1   1716 Mar 21 11:56 mempool_fee_histogram.py
    2-rwxr-xr-x 1   3494 Mar  1 11:53 mempool_limit.py*
    

    I had to run sudo chmod 755 test/functional/mempool_fee_histogram.py to be able to call the test directly, e.g. with test/functional/mempool_fee_histogram.py


    kiminuo commented at 7:23 am on March 22, 2021:
    I fixed this. Thanks!
  21. jonatack commented at 11:00 am on March 21, 2021: contributor
    A few initial comments.
  22. in src/rpc/blockchain.cpp:1576 in 81cd4bb482 outdated
    1572     return RPCHelpMan{"getmempoolinfo",
    1573                 "\nReturns details on the active state of the TX memory pool.\n",
    1574-                {},
    1575+                {
    1576+                    // {"with_fee_histogram", RPCArg::Type::BOOL, /* default */ "false", "True for including the fee histogram in the response"},
    1577+                    {"fee_histogram", RPCArg::Type::STR, /* default */ "false", "Provide fee limits in format: '1,2,3,5,200,1000'"},
    


    jonatack commented at 11:06 am on March 21, 2021:
    Is the argument a string or a JSON array? (if an array, I think there needs to be another line that specifies the element type, e.g. number, string, or amount, probably the latter).

    kiminuo commented at 5:38 pm on March 23, 2021:
    The argument is a JSON array of amounts (fees). I have fixed it.

    jonatack commented at 7:25 pm on March 23, 2021:

    The argument is a JSON array of amounts (fees). I have fixed it.

    Yes, seems better now

    0Arguments:
    11. fee_histogram    (json array, optional) Fee amounts
    2     [
    3       fee,         (numeric or string, required) A fee amount
    4       ...
    5     ]
    
  23. kiminuo commented at 8:23 pm on March 21, 2021: contributor
    @jonatack Thanks for the superb review! I’m working on incorporating your suggestions.
  24. kiminuo force-pushed on Mar 22, 2021
  25. kiminuo force-pushed on Mar 23, 2021
  26. in src/rpc/blockchain.cpp:1595 in 3d923f4de9 outdated
    1588@@ -1520,15 +1589,41 @@ static RPCHelpMan getmempoolinfo()
    1589                         {RPCResult::Type::NUM, "maxmempool", "Maximum memory usage for the mempool"},
    1590                         {RPCResult::Type::STR_AMOUNT, "mempoolminfee", "Minimum fee rate in " + CURRENCY_UNIT + "/kB for tx to be accepted. Is the maximum of minrelaytxfee and minimum mempool fee"},
    1591                         {RPCResult::Type::STR_AMOUNT, "minrelaytxfee", "Current minimum relay fee for transactions"},
    1592-                        {RPCResult::Type::NUM, "unbroadcastcount", "Current number of transactions that haven't passed initial broadcast yet"}
    1593+                        {RPCResult::Type::NUM, "unbroadcastcount", "Current number of transactions that haven't passed initial broadcast yet"},
    1594+                        {RPCResult::Type::OBJ, "fee_histogram", "",
    1595+                            {
    1596+                                {RPCResult::Type::OBJ, "<feerate-group>", "Object per feerate group",
    


    jonatack commented at 7:36 pm on March 23, 2021:

    Looks like the help is missing the “ranges” JSON object, returned by the output, that encompasses the feerate groups (tested on signet):

    help

    0"fee_histogram" : {        (json object)
    1    "<feerate-group>" : {    (json object) Object per feerate group
    2      "sizes" : n,           (numeric) Cumulated size of all transactions in feerate group
    3      "count" : n,           (numeric) Amount of transactions in feerate group
    4      "fees" : n,            (numeric) Cumulated fee of all transactions in feerate group
    5      "from_feerate" : n,    (numeric) Group contains transaction with feerates equal or greater than this value
    6      "to_feerate" : n       (numeric) Group contains transaction with feerates less than than this value
    7    },
    

    actual output

    0  "fee_histogram": {
    1    "ranges": {
    2      "0.00000001": {
    3        "sizes": 0,
    4        "count": 0,
    5        "fees": 0,
    6        "from_feerate": "0.00000001",
    7        "to_feerate": "0.0000001"
    8      },
    9      ...
    

    kiminuo commented at 7:40 pm on March 23, 2021:
    I see. Will fix. Thank you

    kiminuo commented at 6:50 am on March 24, 2021:
    It should be fixed now. Even though I still think ranges is not a good word. groups?

    jonatack commented at 6:09 pm on March 25, 2021:
    maybe fee rate groups

    kiminuo commented at 9:04 am on March 26, 2021:
    I have changed it to fee_rate_groups. I think it looks nicer now. What do you think?

    jonatack commented at 10:36 pm on March 26, 2021:
    LGTM
  27. in src/rpc/blockchain.cpp:1577 in 3d923f4de9 outdated
    1573                 "\nReturns details on the active state of the TX memory pool.\n",
    1574-                {},
    1575+                {
    1576+                    {"fee_histogram", RPCArg::Type::ARR, RPCArg::Optional::OMITTED_NAMED_ARG, "Fee amounts",
    1577+                        {
    1578+                            {"fee", RPCArg::Type::AMOUNT, RPCArg::Optional::NO, "A fee amount"},
    


    jonatack commented at 7:44 pm on March 23, 2021:

    IIUC these should be feerates, not fees

    0                            {"fee_rate", RPCArg::Type::AMOUNT, RPCArg::Optional::NO, "Fee rate to group the fees by"},
    
  28. in src/rpc/blockchain.cpp:1575 in 3d923f4de9 outdated
    1571 {
    1572     return RPCHelpMan{"getmempoolinfo",
    1573                 "\nReturns details on the active state of the TX memory pool.\n",
    1574-                {},
    1575+                {
    1576+                    {"fee_histogram", RPCArg::Type::ARR, RPCArg::Optional::OMITTED_NAMED_ARG, "Fee amounts",
    


    jonatack commented at 7:50 pm on March 23, 2021:
    If I understand correctly, this should be something like “Fee statistics grouped by feerate ranges”

    kiminuo commented at 6:43 am on March 24, 2021:
    Fixed, thanks!
  29. in src/rpc/blockchain.cpp:1597 in 3d923f4de9 outdated
    1593+                        {RPCResult::Type::NUM, "unbroadcastcount", "Current number of transactions that haven't passed initial broadcast yet"},
    1594+                        {RPCResult::Type::OBJ, "fee_histogram", "",
    1595+                            {
    1596+                                {RPCResult::Type::OBJ, "<feerate-group>", "Object per feerate group",
    1597+                                {
    1598+                                    {RPCResult::Type::NUM, "sizes", "Cumulated size of all transactions in feerate group"},
    


    jonatack commented at 7:52 pm on March 23, 2021:
    0                                    {RPCResult::Type::NUM, "size", "Cumulated size of all transactions in feerate group"},
    

    kiminuo commented at 6:43 am on March 24, 2021:
    Fixed, thanks!
  30. in src/rpc/blockchain.cpp:1599 in 3d923f4de9 outdated
    1595+                            {
    1596+                                {RPCResult::Type::OBJ, "<feerate-group>", "Object per feerate group",
    1597+                                {
    1598+                                    {RPCResult::Type::NUM, "sizes", "Cumulated size of all transactions in feerate group"},
    1599+                                    {RPCResult::Type::NUM, "count", "Amount of transactions in feerate group"},
    1600+                                    {RPCResult::Type::NUM, "fees", "Cumulated fee of all transactions in feerate group"},
    


    jonatack commented at 7:56 pm on March 23, 2021:
    Probably both should be “fees” in this sentence, like in line 1603.

    kiminuo commented at 6:43 am on March 24, 2021:
    Fixed, thanks!
  31. in src/rpc/blockchain.cpp:1601 in 3d923f4de9 outdated
    1597+                                {
    1598+                                    {RPCResult::Type::NUM, "sizes", "Cumulated size of all transactions in feerate group"},
    1599+                                    {RPCResult::Type::NUM, "count", "Amount of transactions in feerate group"},
    1600+                                    {RPCResult::Type::NUM, "fees", "Cumulated fee of all transactions in feerate group"},
    1601+                                    {RPCResult::Type::NUM, "from_feerate", "Group contains transaction with feerates equal or greater than this value"},
    1602+                                    {RPCResult::Type::NUM, "to_feerate", "Group contains transaction with feerates less than than this value"},
    


    jonatack commented at 7:57 pm on March 23, 2021:

    s/transaction/transactions/ in both lines 1600 and 1601

    s/than than/than/


    kiminuo commented at 6:43 am on March 24, 2021:
    Fixed, thanks!
  32. kiminuo force-pushed on Mar 23, 2021
  33. kiminuo force-pushed on Mar 24, 2021
  34. kiminuo force-pushed on Mar 24, 2021
  35. in src/rpc/blockchain.cpp:1535 in fb0824ce3c outdated
    1530+
    1531+            CAmount fpb{fee / size};     // fee per byte
    1532+            CAmount afpb{afees / asize}; // fee per byte including ancestors
    1533+            CAmount dfpb{dfees / dsize}; // fee per byte including descendants
    1534+            CAmount tfpb{(afees + dfees - fee) / (asize + dsize - size)};
    1535+            CAmount feeperbyte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
    


    jonatack commented at 3:53 pm on March 25, 2021:

    naming style nit, per developer-notes.md

    0            CAmount fee_per_byte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
    

    kiminuo commented at 8:59 am on March 26, 2021:
    Thanks! Applied.
  36. in src/rpc/blockchain.cpp:1616 in fb0824ce3c outdated
    1614                     }},
    1615                 RPCExamples{
    1616-                    HelpExampleCli("getmempoolinfo", "")
    1617-            + HelpExampleRpc("getmempoolinfo", "")
    1618+                    HelpExampleCli("getmempoolinfo", "") +
    1619+                    HelpExampleCli("getmempoolinfo", R"([1, 10, 100, 200, 400, 800])") +
    


    jonatack commented at 4:49 pm on March 25, 2021:

    missing quotes in the first example, I think, and perhaps use example fee rates that return more interesting results

    0-                    HelpExampleCli("getmempoolinfo", R"([1, 10, 100, 200, 400, 800])") +
    1+                    HelpExampleCli("getmempoolinfo", R"('[1, 5, 10, 25, 50, 100]')") +
    2                     HelpExampleRpc("getmempoolinfo", "") +
    3-                    HelpExampleRpc("getmempoolinfo", R"([1, 10, 100, 200, 400, 800])")
    4+                    HelpExampleRpc("getmempoolinfo", R"([1, 5, 10, 25, 50, 100])")
    

    kiminuo commented at 9:00 am on March 26, 2021:
    You are right. I have used the values from the first chart on http://bitcoin-mempool.info website and those are actually the same as the values in the original PR.
  37. in src/rpc/blockchain.cpp:1611 in fb0824ce3c outdated
    1607+                                        {RPCResult::Type::NUM, "count", "Amount of transactions in feerate group"},
    1608+                                        {RPCResult::Type::NUM, "fees", "Cumulated fees of all transactions in feerate group (in satoshis)"},
    1609+                                        {RPCResult::Type::NUM, "from_feerate", "Group contains transactions with feerates equal or greater than this value (in satoshis)"},
    1610+                                        {RPCResult::Type::NUM, "to_feerate", "Group contains transactions with feerates less than this value (in satoshis)"},
    1611+                                    }}}},
    1612+                                {RPCResult::Type::NUM, "total_fees", "Total available fees in mempool (in satoshis)"},
    


    jonatack commented at 5:44 pm on March 25, 2021:
    This field seems to be the same as the total_fee field above in the same output (except that it is now in sat/vB instead of BTC/kvB)?

    kiminuo commented at 9:02 am on March 26, 2021:

    No. In short, it holds that fee_histogram.total_fees <= total_fee. More precisely, the total_fees (i.e. the histogram JSON property) here is a sum of fees JSON properties from fee rate groups (i.e. ranges). For example, ./bitcoin-cli getmempoolinfo "[0]" should show the difference but not on testnet because people mostly do not pay any fees there (🙄).

    However, I’m somewhat hesitant whether total_fees is useful or not. One can certainly compute it as it is a simple sum of already provided data. So it feels like I should remove it so that Bitcoin Core provides raw data and let customer applications do complex things based on the raw data. Of course, this is very subjective. I’m not sure. I’m neutral on this.

  38. jonatack commented at 6:00 pm on March 25, 2021: contributor

    Thanks for switching it from BTC/kvB to sat/vB!

    I looked over this more thoroughly and saw quite a few remaining fixups and suggestions. One thing to ensure is that the help docs are the same as the actual output and use the correct fee rate units and arg types.

    It would be good if the functional test actually verified that the values are correctly calculated after creating a few txns; ATM it is only really a smoke test that verifies the output structure.

      0diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
      1index 8d7ce13e57..fbae7cdbaf 100644
      2--- a/src/rpc/blockchain.cpp
      3+++ b/src/rpc/blockchain.cpp
      4@@ -1512,7 +1512,7 @@ UniValue MempoolInfoToJSON(const CTxMemPool& pool, const std::optional<MempoolHi
      5          * ... cumulated fees */
      6         std::vector<uint64_t> sizes(limits.size(), 0);
      7         std::vector<uint64_t> count(limits.size(), 0);
      8-        std::vector<uint64_t> fees(limits.size(), 0);
      9+        std::vector<CAmount> fees(limits.size(), 0);
     10 
     11         for (const CTxMemPoolEntry& e : pool.mapTx) {
     12             const CAmount fee{e.GetFee()};
     13@@ -1528,15 +1528,19 @@ UniValue MempoolInfoToJSON(const CTxMemPool& pool, const std::optional<MempoolHi
     14             CHECK_NONFATAL(dsize > 0);
     15             CHECK_NONFATAL(asize + dsize - size > 0);
     16
     17-            CAmount fpb{fee / size};     // fee per byte
     18-            CAmount afpb{afees / asize}; // fee per byte including ancestors
     19-            CAmount dfpb{dfees / dsize}; // fee per byte including descendants
     20-            CAmount tfpb{(afees + dfees - fee) / (asize + dsize - size)};
     21-            CAmount feeperbyte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
     22+            const CAmount fpb{fee / size};     // fee per byte
     23+            const CAmount afpb{afees / asize}; // fee per byte including ancestors
     24+            const CAmount dfpb{dfees / dsize}; // fee per byte including descendants
     25+            const CAmount tfpb{(afees + dfees - fee) / (asize + dsize - size)};
     26+            const CAmount fee_per_byte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
     27+ 
     28+            assert(fpb == CFeeRate(fee, size).GetFee(1));
     29+            assert(afpb == CFeeRate(afees, asize).GetFee(1));
     30+            assert(dfpb == CFeeRate(dfees, dsize).GetFee(1));
     31+            assert(tfpb == CFeeRate(afees + dfees - fee, asize + dsize - size).GetFee(1));
     32+
     33-            // Distribute feerates into feelimits
     34+            // Distribute fee rates into fee limits
     35             for (int i = limits.size() - 1; i >= 0; --i) {
     36-                if (feeperbyte >= limits[i]) {
     37+                if (fee_per_byte >= limits[i]) {
     38                     sizes[i] += size;
     39                     ++count[i];
     40                     fees[i] += fee;
     41@@ -1545,23 +1549,23 @@ UniValue MempoolInfoToJSON(const CTxMemPool& pool, const std::optional<MempoolHi
     42             }
     43         }
     44 
     45         UniValue ranges(UniValue::VOBJ);
     46         for (size_t i = 0; i < limits.size(); ++i) {
     47             UniValue info_sub(UniValue::VOBJ);
     48-            info_sub.pushKV("sizes", sizes[i]);
     49-            info_sub.pushKV("count", count[i]);
     50-            info_sub.pushKV("fees", fees[i]);
     51-            info_sub.pushKV("from_feerate", limits[i]);
     52+            info_sub.pushKV("size", sizes.at(i));
     53+            info_sub.pushKV("count", count.at(i));
     54+            info_sub.pushKV("fees", fees.at(i));
     55+            info_sub.pushKV("from", limits.at(i));
     56 
     57             if (i == limits.size() - 1) {
     58-                info_sub.pushKV("to_feerate", "Max"); // TODO.
     59+                info_sub.pushKV("to", "max"); // TODO.
     60             } else {
     61-                info_sub.pushKV("to_feerate", limits[i + 1]);
     62+                info_sub.pushKV("to", limits[i + 1]);
     63             }
     64 
     65-            total_fees += fees[i];
     66-            ranges.pushKV(ToString(limits[i]), info_sub);
     67+            total_fees += fees.at(i);
     68+            ranges.pushKV(ToString(limits.at(i)), info_sub);
     69         }
     70 
     71         UniValue info(UniValue::VOBJ);
     72@@ -1580,7 +1584,7 @@ static RPCHelpMan getmempoolinfo()
     73                 {
     74                     {"fee_histogram", RPCArg::Type::ARR, RPCArg::Optional::OMITTED_NAMED_ARG, "Fee statistics grouped by fee rate ranges",
     75                         {
     76-                            {"fee_rate", RPCArg::Type::NUM, RPCArg::Optional::NO, "Fee rate (in satoshis) to group the fees by"},
     77+                            {"fee_rate", RPCArg::Type::NUM, RPCArg::Optional::NO, "Fee rate (in " + CURRENCY_ATOM + "/vB) to group the fees by"},
     78                         },
     79                     },
     80                 },
     81@@ -1591,31 +1595,31 @@ static RPCHelpMan getmempoolinfo()
     82                         {RPCResult::Type::NUM, "size", "Current tx count"},
     83                         {RPCResult::Type::NUM, "bytes", "Sum of all virtual transaction sizes as defined in BIP 141. Differs from actual serialized size because witness data is discounted"},
     84                         {RPCResult::Type::NUM, "usage", "Total memory usage for the mempool"},
     85-                        {RPCResult::Type::STR_AMOUNT, "total_fee", "Total fees for the mempool in " + CURRENCY_UNIT + ", ignoring modified fees through prioritizetransaction"},
     86+                        {RPCResult::Type::STR_AMOUNT, "total_fee", "Total fees for the mempool in " + CURRENCY_UNIT + "/kvB, ignoring modified fees through prioritizetransaction"},
     87                         {RPCResult::Type::NUM, "maxmempool", "Maximum memory usage for the mempool"},
     88-                        {RPCResult::Type::STR_AMOUNT, "mempoolminfee", "Minimum fee rate in " + CURRENCY_UNIT + "/kB for tx to be accepted. Is the maximum of minrelaytxfee and minimum mempool fee"},
     89+                        {RPCResult::Type::STR_AMOUNT, "mempoolminfee", "Minimum fee rate in " + CURRENCY_UNIT + "/kvB for tx to be accepted. Is the maximum of minrelaytxfee and minimum mempool fee"},
     90                         {RPCResult::Type::STR_AMOUNT, "minrelaytxfee", "Current minimum relay fee for transactions"},
     91                         {RPCResult::Type::NUM, "unbroadcastcount", "Current number of transactions that haven't passed initial broadcast yet"},
     92                         {RPCResult::Type::OBJ, "fee_histogram", "",
     93                             {
     94-                                {RPCResult::Type::OBJ_DYN, "ranges", "Feerate groups",
     95+                                {RPCResult::Type::OBJ_DYN, "ranges", "",
     96                                 {
     97-                                    {RPCResult::Type::OBJ, "<feerate-group>", "Object per feerate group",
     98+                                    {RPCResult::Type::OBJ, "<fee_rate_group>", "Fee rate group named by its lower bound (in " + CURRENCY_ATOM + "/vB), identical to the \"from\" field below",
     99                                     {
    100-                                        {RPCResult::Type::NUM, "size", "Cumulated size of all transactions in feerate group"},
    101-                                        {RPCResult::Type::NUM, "count", "Amount of transactions in feerate group"},
    102-                                        {RPCResult::Type::NUM, "fees", "Cumulated fees of all transactions in feerate group (in satoshis)"},
    103-                                        {RPCResult::Type::NUM, "from_feerate", "Group contains transactions with feerates equal or greater than this value (in satoshis)"},
    104-                                        {RPCResult::Type::NUM, "to_feerate", "Group contains transactions with feerates less than this value (in satoshis)"},
    105+                                        {RPCResult::Type::NUM, "size", "Cumulative size of all transactions in the fee rate group"},
    106+                                        {RPCResult::Type::NUM, "count", "Number of transactions in the fee rate group"},
    107+                                        {RPCResult::Type::NUM, "fees", "Cumulative fees of all transactions in the fee rate group (in " + CURRENCY_ATOM + "/vB)"},
    108+                                        {RPCResult::Type::NUM, "from", "Group contains transactions with fee rates equal or greater than this value (in " + CURRENCY_ATOM + "/vB)"},
    109+                                        {RPCResult::Type::NUM, "to", "Group contains transactions with fee rates less than this value (in " + CURRENCY_ATOM + "/vB)"},
    110                                     }}}},
    111-                                {RPCResult::Type::NUM, "total_fees", "Total available fees in mempool (in satoshis)"},
    112+                                {RPCResult::Type::NUM, "total_fees", "Total available fees in mempool (in " + CURRENCY_ATOM + "/vB)"},
    113                             }},
    114                     }},
    115                 RPCExamples{
    116                     HelpExampleCli("getmempoolinfo", "") +
    117-                    HelpExampleCli("getmempoolinfo", R"([1, 10, 100, 200, 400, 800])") +
    118+                    HelpExampleCli("getmempoolinfo", R"('[1, 5, 10, 25, 50, 100]')") +
    119                     HelpExampleRpc("getmempoolinfo", "") +
    120-                    HelpExampleRpc("getmempoolinfo", R"([1, 10, 100, 200, 400, 800])")
    121+                    HelpExampleRpc("getmempoolinfo", R"([1, 5, 10, 25, 50, 100])")
    122                 },
    123
    124diff --git a/test/functional/mempool_fee_histogram.py b/test/functional/mempool_fee_histogram.py
    125index 0cbaa97bfc..91c6dccc30 100755
    126--- a/test/functional/mempool_fee_histogram.py
    127+++ b/test/functional/mempool_fee_histogram.py
    128@@ -31,15 +31,15 @@ class MempoolFeeHistogramTest(BitcoinTestFramework):
    129         total_fees = 0
    130 
    131         for key, bin in info['fee_histogram']['ranges'].items():
    132-            assert_equal(int(key), bin['from_feerate'])
    133+            assert_equal(int(key), bin['from'])
    134             if bin['fees'] > 0:
    135                 assert_greater_than(bin['count'], 0)
    136             else:
    137                 assert_equal(bin['count'], 0)
    138             assert_greater_than_or_equal(bin['fees'], 0)
    139-            assert_greater_than_or_equal(bin['sizes'], 0)
    140-            if bin['to_feerate'] != 'Max':
    141-                assert_greater_than(bin['to_feerate'], bin['from_feerate'])
    142+            assert_greater_than_or_equal(bin['size'], 0)
    143+            if bin['to'] != 'max':
    144+                assert_greater_than(bin['to'], bin['from'])
    145             total_fees += bin['fees']
    
  39. kiminuo force-pushed on Mar 26, 2021
  40. kiminuo commented at 9:07 am on March 26, 2021: contributor

    It would be good if the functional test actually verified that the values are correctly calculated after creating a few txns; ATM it is only really a smoke test that verifies the output structure.

    This is on my TODO list. So I will improve it over coming days.

  41. in src/rpc/blockchain.cpp:1528 in ccb043c809 outdated
    1523+            const int64_t dsize{static_cast<int64_t>(e.GetSizeWithDescendants())};
    1524+
    1525+            CHECK_NONFATAL(size > 0);
    1526+            CHECK_NONFATAL(asize > 0);
    1527+            CHECK_NONFATAL(dsize > 0);
    1528+            CHECK_NONFATAL(asize + dsize - size > 0);
    


    jonatack commented at 10:34 pm on March 26, 2021:
    Drive-by comment: now that you’re constructing the fee rates with the CFeeRate ctor, there is no longer a need for these greater than zero checks. Will re-review the rest.

    kiminuo commented at 11:25 am on March 27, 2021:
    Yes, good point. Thank you
  42. kiminuo force-pushed on Mar 27, 2021
  43. in src/rpc/blockchain.cpp:1523 in 7b6da9fa33 outdated
    1518+            const CAmount afees{e.GetModFeesWithAncestors()};
    1519+            const CAmount dfees{e.GetModFeesWithDescendants()};
    1520+
    1521+            const int64_t size{static_cast<int64_t>(e.GetTxSize())};
    1522+            const int64_t asize{static_cast<int64_t>(e.GetSizeWithAncestors())};
    1523+            const int64_t dsize{static_cast<int64_t>(e.GetSizeWithDescendants())};
    


    jonatack commented at 2:55 pm on March 28, 2021:

    A couple of missing headers and a suggested (tested) update now that we are using the CFeeRate{CAmount, size_t} ctor to construct the fee rates:

     0diff --git a/src/rest.cpp b/src/rest.cpp
     1index 809daa0ef8..400972c092 100644
     2--- a/src/rest.cpp
     3+++ b/src/rest.cpp
     4@@ -26,6 +26,8 @@
     5 
     6 #include <univalue.h>
     7 
     8+#include <optional>
     9+
    10 static const size_t MAX_GETUTXOS_OUTPOINTS = 15; //allow a max of 15 outpoints to be queried at once
    11diff --git a/src/rpc/blockchain.cpp b/src/rpc/blockchain.cpp
    12index d4035c3d0a..5e44990b57 100644
    13--- a/src/rpc/blockchain.cpp
    14+++ b/src/rpc/blockchain.cpp
    15@@ -45,6 +45,7 @@
    16 #include <condition_variable>
    17 #include <memory>
    18 #include <mutex>
    19+#include <optional>
    20 
    21 struct CUpdatedBlock
    22 {
    23@@ -1503,7 +1504,7 @@ UniValue MempoolInfoToJSON(const CTxMemPool& pool, const std::optional<MempoolHi
    24     ret.pushKV("unbroadcastcount", uint64_t{pool.GetUnbroadcastTxs().size()});
    25 
    26     if (feeLimits) {
    27-        const MempoolHistogramFeeRates& limits = feeLimits.value();
    28+        const MempoolHistogramFeeRates& limits{feeLimits.value()};
    29 
    30         /* Keep histogram per...
    31          * ... cumulated tx sizes
    32@@ -1514,23 +1515,18 @@ UniValue MempoolInfoToJSON(const CTxMemPool& pool, const std::optional<MempoolHi
    33         std::vector<CAmount> fees(limits.size(), 0);
    34 
    35         for (const CTxMemPoolEntry& e : pool.mapTx) {
    36-            const CAmount fee{e.GetFee()};
    37-            const CAmount afees{e.GetModFeesWithAncestors()};
    38-            const CAmount dfees{e.GetModFeesWithDescendants()};
    39+            const CAmount fee{e.GetFee()}, afees{e.GetModFeesWithAncestors()}, dfees{e.GetModFeesWithDescendants()};
    40+            const size_t size{e.GetTxSize()}, asize{e.GetSizeWithAncestors()}, dsize{e.GetSizeWithDescendants()};
    41 
    42-            const int64_t size{static_cast<int64_t>(e.GetTxSize())};
    43-            const int64_t asize{static_cast<int64_t>(e.GetSizeWithAncestors())};
    44-            const int64_t dsize{static_cast<int64_t>(e.GetSizeWithDescendants())};
    45-
    46-            const CAmount fpb{CFeeRate(fee, size).GetFee(1)};     // fee per byte
    47-            const CAmount afpb{CFeeRate(afees, asize).GetFee(1)}; // fee per byte including ancestors
    48-            const CAmount dfpb{CFeeRate(dfees, dsize).GetFee(1)}; // fee per byte including descendants
    49-            const CAmount tfpb{CFeeRate(afees + dfees - fee, asize + dsize - size).GetFee(1)};
    50-            const CAmount fee_per_byte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
    51+            const CAmount fpb{CFeeRate{fee, size}.GetFee(1)};     // fee rate per byte
    52+            const CAmount afpb{CFeeRate{afees, asize}.GetFee(1)}; // fee rate per byte including ancestors
    53+            const CAmount dfpb{CFeeRate{dfees, dsize}.GetFee(1)}; // fee rate per byte including descendants
    54+            const CAmount tfpb{CFeeRate{afees + dfees - fee, asize + dsize - size}.GetFee(1)};
    55+            const CAmount fee_rate_per_byte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
    56 
    57             // Distribute fee rates into fee limits
    58             for (int i = limits.size() - 1; i >= 0; --i) {
    59-                if (fee_per_byte >= limits[i]) {
    60+                if (fee_rate_per_byte >= limits[i]) {
    61                     sizes[i] += size;
    62                     ++count[i];
    63                     fees[i] += fee;
    

    kiminuo commented at 6:21 am on March 29, 2021:

    Thank you for the valuable feedback! Addressed.

    Should I add #include <optional> to blockchain.h too? Did you notice those missing #include <optional> or is there a tool that would warn me in the future?

  44. in test/functional/test_framework/util.py:60 in 7b6da9fa33 outdated
    54@@ -55,6 +55,11 @@ def assert_greater_than(thing1, thing2):
    55         raise AssertionError("%s <= %s" % (str(thing1), str(thing2)))
    56 
    57 
    58+def assert_no_key(k, d):
    59+    if k in d:
    60+        raise AssertionError("%s in %s" % (str(k), str(d)))
    


    jonatack commented at 3:09 pm on March 28, 2021:

    suggestion

    0-def assert_no_key(k, d):
    1-    if k in d:
    2-        raise AssertionError("%s in %s" % (str(k), str(d)))
    3+def assert_no_key(key, dictionary):
    4+    if key in dictionary:
    5+        raise AssertionError(f"Key '{key}' not expected to be found in {dictionary}")
    

    which uses Python f-strings per current practice and would improve the error message from, for example

    0    raise AssertionError("%s in %s" % (str(k), str(d)))
    1AssertionError: size in {'loaded': True, 'size': 1, 'bytes': 219, 'usage': 1072, 'total_fee': Decimal('0.00004380'), 'maxmempool': 300000000, 'mempoolminfee': Decimal('0.00001000'), 'minrelaytxfee': Decimal('0.00001000'), 'unbroadcastcount': 1}
    

    to

    0    raise AssertionError(f"Key '{key}' not expected to be found in {dictionary}")
    1AssertionError: Key 'size' not expected to be found in {'loaded': True, 'size': 1, 'bytes': 219, 'usage': 1072, 'total_fee': Decimal('0.00004380'), 'maxmempool': 300000000, 'mempoolminfee': Decimal('0.00001000'), 'minrelaytxfee': Decimal('0.00001000'), 'unbroadcastcount': 1}
    

    kiminuo commented at 6:23 am on March 29, 2021:
    Nice, I didn’t know about these f-string. Thanks.
  45. jonatack commented at 3:13 pm on March 28, 2021: contributor
    A couple more suggestions. Will re-review after the TODO you mention in #21422 (comment).
  46. in test/functional/mempool_fee_histogram.py:30 in 7b6da9fa33 outdated
    25+        node.sendtoaddress(node.getnewaddress(), 1)
    26+
    27+        info = node.getmempoolinfo()
    28+        assert_no_key('fee_histogram', info)
    29+
    30+        info = node.getmempoolinfo([100, 200, 300, 400, 500])
    


    jonatack commented at 3:23 pm on March 28, 2021:

    Oh, and can add logging:

     0diff --git a/test/functional/mempool_fee_histogram.py b/test/functional/mempool_fee_histogram.py
     1index 01fd7825b3..3edd31447b 100755
     2--- a/test/functional/mempool_fee_histogram.py
     3+++ b/test/functional/mempool_fee_histogram.py
     4@@ -21,12 +21,12 @@ class MempoolFeeHistogramTest(BitcoinTestFramework):
     5 
     6     def run_test(self):
     7         node = self.nodes[0]
     8-
     9         node.sendtoaddress(node.getnewaddress(), 1)
    10 
    11-        info = node.getmempoolinfo()
    12-        assert_no_key('fee_histogram', info)
    13+        self.log.info("Test getmempoolinfo does not return fee histogram by default")
    14+        assert_no_key('fee_histogram', node.getmempoolinfo())
    15 
    16+        self.log.info("Test getmempoolinfo returns fee histogram if fee rate array is passed")
    17         info = node.getmempoolinfo([100, 200, 300, 400, 500])
    

    kiminuo commented at 6:24 am on March 29, 2021:
    Thank you, fixed!
  47. kiminuo force-pushed on Mar 29, 2021
  48. kiminuo force-pushed on Mar 29, 2021
  49. kiminuo commented at 6:31 am on March 29, 2021: contributor

    Will re-review after the TODO you mention in #21422 (comment).

    Thank you. I’m slowly getting familiar with Bitcoin’s BitcoinTestFramework. I’m almost ready to start writing the test. So slow progress, but progress too. I would kind of want to start with self.setup_clean_chain = True because I will be more confident about the blockchain’s state. It can be improved further later.

  50. kiminuo force-pushed on Mar 30, 2021
  51. kiminuo force-pushed on Mar 31, 2021
  52. in src/rpc/blockchain.cpp:1671 in 3b2db54fbd outdated
    1522+
    1523+            const CAmount fpb{CFeeRate{fee, size}.GetFee(1)};     // fee rate per byte
    1524+            const CAmount afpb{CFeeRate{afees, asize}.GetFee(1)}; // fee rate per byte including ancestors
    1525+            const CAmount dfpb{CFeeRate{dfees, dsize}.GetFee(1)}; // fee rate per byte including descendants
    1526+            const CAmount tfpb{CFeeRate{afees + dfees - fee, asize + dsize - size}.GetFee(1)};
    1527+            const CAmount fee_rate_per_byte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
    


    kiminuo commented at 10:50 am on March 31, 2021:

    @jonatack I still struggle with this as I’m not sure whether this is correct or not. It is in the original PR but:

    • Is this how it should be done?
    • Is there a better way?
    • Is it a copy of some existing code?

    It’s very important and I don’t really know the answer now. So I put it here even for my own reference.


    kiminuo commented at 8:44 am on April 12, 2021:
    This is useful #12118 to understand how txs are sorted in mempool.
  53. in src/rpc/blockchain.cpp:1550 in 3b2db54fbd outdated
    1545+            info_sub.pushKV("count", count.at(i));
    1546+            info_sub.pushKV("fees", fees.at(i));
    1547+            info_sub.pushKV("from", limits.at(i));
    1548+
    1549+            if (i == limits.size() - 1) {
    1550+                info_sub.pushKV("to", "max"); // TODO.
    


    kiminuo commented at 11:08 am on April 11, 2021:

    I think we have the following choices here:

    1. Make it: “to”: null
    2. Represent range with arrays: [1, 2], [2, 3], [3]
    3. Remove “to”
    4. Remove both “from” & “to”

    I would go with the option 1 or 3.

  54. kiminuo force-pushed on Apr 12, 2021
  55. kiminuo force-pushed on Apr 14, 2021
  56. kiminuo commented at 2:23 pm on April 14, 2021: contributor

    A couple more suggestions. Will re-review after the TODO you mention in #21422 (comment).

    I have improved the test. I find it much better than before but I still think it should be tested more. What do you think?

  57. JeremyRubin commented at 2:34 pm on April 14, 2021: contributor

    Concept ACK!

    It might make sense to have the API be more similar to https://numpy.org/doc/stable/reference/generated/numpy.histogram.html

  58. kiminuo commented at 2:42 pm on April 14, 2021: contributor

    It might make sense to have the API be more similar to https://numpy.org/doc/stable/reference/generated/numpy.histogram.html

    Thank you for the suggestion, I’ll have a look!

  59. kiminuo marked this as ready for review on Apr 16, 2021
  60. DrahtBot added the label Needs rebase on Apr 17, 2021
  61. kiminuo force-pushed on Apr 18, 2021
  62. DrahtBot removed the label Needs rebase on Apr 18, 2021
  63. laanwj commented at 11:06 am on June 3, 2021: member

    This, as merged on master, gives me the following build error with clang 13:

     0/bitcoin/src/rpc/blockchain.cpp:1666:45: error: non-constant-expression cannot be narrowed from type 'size_t' (aka 'unsigned long') to 'uint32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]
     1            const CAmount fpb{CFeeRate{fee, size}.GetFee(1)};     // Fee rate per byte
     2                                            ^~~~
     3/bitcoin/src/rpc/blockchain.cpp:1666:45: note: insert an explicit cast to silence this issue
     4            const CAmount fpb{CFeeRate{fee, size}.GetFee(1)};     // Fee rate per byte
     5                                            ^~~~
     6                                            static_cast<uint32_t>( )
     7/bitcoin/src/rpc/blockchain.cpp:1667:48: error: non-constant-expression cannot be narrowed from type 'size_t' (aka 'unsigned long') to 'uint32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]
     8            const CAmount afpb{CFeeRate{afees, asize}.GetFee(1)}; // Fee rate per byte including ancestors
     9                                               ^~~~~
    10/bitcoin/src/rpc/blockchain.cpp:1667:48: note: insert an explicit cast to silence this issue
    11            const CAmount afpb{CFeeRate{afees, asize}.GetFee(1)}; // Fee rate per byte including ancestors
    12                                               ^~~~~
    13                                               static_cast<uint32_t>( )
    14/bitcoin/src/rpc/blockchain.cpp:1668:48: error: non-constant-expression cannot be narrowed from type 'size_t' (aka 'unsigned long') to 'uint32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]
    15            const CAmount dfpb{CFeeRate{dfees, dsize}.GetFee(1)}; // Fee rate per byte including descendants
    16                                               ^~~~~
    17/bitcoin/src/rpc/blockchain.cpp:1668:48: note: insert an explicit cast to silence this issue
    18            const CAmount dfpb{CFeeRate{dfees, dsize}.GetFee(1)}; // Fee rate per byte including descendants
    19                                               ^~~~~
    20                                               static_cast<uint32_t>( )
    21/bitcoin/src/rpc/blockchain.cpp:1671:62: error: non-constant-expression cannot be narrowed from type 'unsigned long' to 'uint32_t' (aka 'unsigned int') in initializer list [-Wc++11-narrowing]
    22            const CAmount tfpb{CFeeRate{afees + dfees - fee, asize + dsize - size}.GetFee(1)};
    23                                                             ^~~~~~~~~~~~~~~~~~~~
    24/bitcoin/src/rpc/blockchain.cpp:1671:62: note: insert an explicit cast to silence this issue
    25            const CAmount tfpb{CFeeRate{afees + dfees - fee, asize + dsize - size}.GetFee(1)};
    26                                                             ^~~~~~~~~~~~~~~~~~~~
    27                                                             static_cast<uint32_t>( )
    284 errors generated.
    29make[2]: *** [Makefile:9273: rpc/libbitcoin_server_a-blockchain.o] Error 1
    30make[2]: *** Waiting for unfinished jobs....
    31make[2]: Leaving directory '/store2/build/bitcoin/bitcoin/src'
    32make[1]: *** [Makefile:16124: all-recursive] Error 1
    33make[1]: Leaving directory '/store2/build/bitcoin/bitcoin/src'
    34make: *** [Makefile:821: all-recursive] Error 1
    

    I can’t say I’ve ever seen these before.

  64. in src/rpc/blockchain.cpp:1563 in ce1bf40927 outdated
    1558+        std::vector<uint64_t> count(limits.size(), 0);
    1559+        std::vector<CAmount> fees(limits.size(), 0);
    1560+
    1561+        for (const CTxMemPoolEntry& e : pool.mapTx) {
    1562+            const CAmount fee{e.GetFee()}, afees{e.GetModFeesWithAncestors()}, dfees{e.GetModFeesWithDescendants()};
    1563+            const size_t size{e.GetTxSize()}, asize{e.GetSizeWithAncestors()}, dsize{e.GetSizeWithDescendants()};
    


    maflcko commented at 12:13 pm on June 3, 2021:

    tx size is limited to uint32_t on all platforms on master

    0            const uint32_t size{e.GetTxSize()}, asize{e.GetSizeWithAncestors()}, dsize{e.GetSizeWithDescendants()};
    

    kiminuo commented at 6:13 pm on June 3, 2021:

    Thank you, I have tried to address that to avoid subsequent narrowing warnings.

    Edit: Any guidance on how to fix this correctly would be greatly appreaciated.


    luke-jr commented at 5:18 pm on August 12, 2021:
    Putting such an assumption here is unnecessarily bug-prone for no benefit. All the functions involved return 64-bit types, so should not be truncated. Furthermore, GetSizeWith* may very well include multiple transactions. size_t is the right type here IMO.

    maflcko commented at 7:20 am on August 16, 2021:

    Putting such an assumption here is unnecessarily bug-prone for no benefit.

    Then #21848 should be reverted first. Otherwise using size_t here is compiled down to the same code anyway.

  65. kiminuo force-pushed on Jun 3, 2021
  66. laanwj commented at 3:35 pm on June 7, 2021: member
    It builds succesfully now, thanks!
  67. in src/rpc/blockchain.cpp:1663 in 9d16921553 outdated
    1659+        std::vector<uint64_t> count(limits.size(), 0);
    1660+        std::vector<CAmount> fees(limits.size(), 0);
    1661+
    1662+        for (const CTxMemPoolEntry& e : pool.mapTx) {
    1663+            const CAmount fee{e.GetFee()}, afees{e.GetModFeesWithAncestors()}, dfees{e.GetModFeesWithDescendants()};
    1664+            const uint32_t size{(uint32_t)e.GetTxSize()}, asize{(uint32_t)e.GetSizeWithAncestors()}, dsize{(uint32_t)e.GetSizeWithDescendants()};
    


    luke-jr commented at 5:55 pm on June 18, 2021:
    Why are you downgrading these to uint32_t?

    kiminuo commented at 8:52 am on June 20, 2021:

    Your question is probably related to Marco’s comment here: #21422 (review). And as I said there, any guidance on this would be greatly appreciated.

    Edit: I have rebased this PR so that this PR builds on tx size is limited to uint32_t on all platforms on master


    sipa commented at 4:21 pm on November 22, 2021:
    Why not use uint64_t for everything? GetSizeWithAncestors() returns a uint64_t too.

    kiminuo commented at 9:16 am on November 24, 2021:

    Now we have:

    0const CAmount fee{e.GetFee()};
    1const uint32_t size{(uint32_t)e.GetTxSize()};
    2const CAmount fee_rate{CFeeRate{fee, size}.GetFee(1)};
    

    CFeeRate has the following constructor: CFeeRate::CFeeRate(const CAmount& nFeePaid, uint32_t num_bytes). If size is casted to uint64_t, then I will get a warning unless I do some other casting.

    Any suggestion how to improve this?

  68. in src/rpc/blockchain.cpp:1675 in 9d16921553 outdated
    1670+            // Fee rate per byte including ancestors & descendants
    1671+            const CAmount tfpb{CFeeRate{afees + dfees - fee, asize + dsize - size}.GetFee(1)};
    1672+            const CAmount fee_rate_per_byte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
    1673+
    1674+            // Distribute fee rates into fee limits
    1675+            for (int i = limits.size() - 1; i >= 0; --i) {
    


    luke-jr commented at 6:00 pm on June 18, 2021:

    Should probably check for overflow explicitly if you use int here.

    Or something like:

    0            for (size_t i = limits.size(); i-- > 0; ) {
    

    kiminuo commented at 8:42 am on June 20, 2021:
    I have applied your suggestion. It is certainly not straightforward though.
  69. in src/rpc/client.cpp:176 in 9d16921553 outdated
    148@@ -149,6 +149,7 @@ static const CRPCConvertParam vRPCConvertParams[] =
    149     { "pruneblockchain", 0, "height" },
    150     { "keypoolrefill", 0, "newsize" },
    151     { "getrawmempool", 0, "verbose" },
    152+    { "getmempoolinfo", 0, "fee_histogram" },
    


    luke-jr commented at 9:32 pm on June 19, 2021:
    This shouldn’t interrupt getrawmempool

    kiminuo commented at 8:38 am on June 20, 2021:
    Yes, thank you.
  70. in test/functional/mempool_fee_histogram.py:146 in 9d16921553 outdated
    92+    def histogram_stats(self, histogram):
    93+        total_fees = 0
    94+        empty_count = 0
    95+        non_empty_count = 0
    96+
    97+        for key, bin in histogram['fee_rate_groups'].items():
    


    luke-jr commented at 9:34 pm on June 19, 2021:
    Is bin reserved in Python?

    kiminuo commented at 7:58 am on June 20, 2021:
  71. luke-jr changes_requested
  72. luke-jr commented at 9:47 pm on June 19, 2021: member
    Note: This PR drops the REST support
  73. in src/rpc/blockchain.h:48 in 9d16921553 outdated
    40@@ -41,8 +41,10 @@ void RPCNotifyBlockChange(const CBlockIndex*);
    41 /** Block description to JSON */
    42 UniValue blockToJSON(const CBlock& block, const CBlockIndex* tip, const CBlockIndex* blockindex, bool txDetails = false) LOCKS_EXCLUDED(cs_main);
    43 
    44+typedef std::vector<CAmount> MempoolHistogramFeeRates;
    45+
    46 /** Mempool information to JSON */
    47-UniValue MempoolInfoToJSON(const CTxMemPool& pool);
    48+UniValue MempoolInfoToJSON(const CTxMemPool& pool, const std::optional<MempoolHistogramFeeRates> feeLimits);
    


    luke-jr commented at 9:57 pm on June 19, 2021:
    Need to #include <optional> in this header

    kiminuo commented at 8:38 am on June 20, 2021:
    Yes, thank you.
  74. kiminuo force-pushed on Jun 20, 2021
  75. kiminuo commented at 8:53 am on June 20, 2021: contributor

    Note: This PR drops the REST support

    I have added a note about this in the PR description. Thanks.

  76. kiminuo force-pushed on Jun 20, 2021
  77. luke-jr referenced this in commit 0e037084b4 on Jun 27, 2021
  78. luke-jr referenced this in commit 3cf62e989e on Jun 29, 2021
  79. in src/rest.cpp:32 in 36f5e224f5 outdated
    28@@ -29,6 +29,8 @@
    29 
    30 #include <univalue.h>
    31 
    32+#include <optional>
    


    luke-jr commented at 4:46 pm on August 12, 2021:
    This should be above, after <any>

    kiminuo commented at 6:36 am on August 13, 2021:
    Fixed, thank you.
  80. kiminuo force-pushed on Aug 13, 2021
  81. ghost commented at 6:30 am on August 16, 2021: none

    Concept ACK. Tested the PR commits building on master branch.

     0$ bitcoin-cli getmempoolinfo "[1,10,20,30]"
     1
     2  "loaded": true,
     3  "size": 22,
     4  "bytes": 4472,
     5  "usage": 31152,
     6  "total_fee": 0.00096947,
     7  "maxmempool": 300000000,
     8  "mempoolminfee": 0.00001000,
     9  "minrelaytxfee": 0.00001000,
    10  "unbroadcastcount": 0,
    11  "fee_histogram": {
    12    "fee_rate_groups": {
    13      "1": {
    14        "size": 3274,
    15        "count": 15,
    16        "fees": 4577,
    17        "from": 1,
    18        "to": 9
    19      },
    20      "10": {
    21        "size": 141,
    22        "count": 1,
    23        "fees": 2800,
    24        "from": 10,
    25        "to": 19
    26      },
    27      "20": {
    28        "size": 0,
    29        "count": 0,
    30        "fees": 0,
    31        "from": 20,
    32        "to": 29
    33      },
    34      "30": {
    35        "size": 1057,
    36        "count": 6,
    37        "fees": 89570,
    38        "from": 30,
    39        "to": null
    40      }
    41    },
    42    "total_fees": 96947
    43  }
    44}
    

    Fee distribution chart using the results from bitcoin-cli getmempoolinfo "[1,10,20,30]":

    image

     0from flask import Flask, render_template
     1import requests
     2import json
     3
     4app = Flask(__name__)
     5[@app](/bitcoin-bitcoin/contributor/app/).route('/')
     6
     7def chart(chartID = 'chart_ID', chart_type = 'bar', chart_height = 200):
     8	size_group1 = size(1)
     9	size_group2 = size(10)
    10	size_group3 = size(20)
    11	
    12	chart = {"renderTo": chartID, "type": chart_type, "height": chart_height,}
    13	series = [{"name": 'Size (vByte)', "data": [int(size_group1), int(size_group2), int(size_group3)]}]
    14	title = {"text": 'Fee rate distribution'}
    15	xAxis = {"categories": ['1-9', '10-19', '20-29']}
    16	yAxis = {"title": {"text": ''}}
    17	return render_template('index.html', chartID=chartID, chart=chart, series=series, title=title, xAxis=xAxis, yAxis=yAxis)
    18
    19def size(group):
    20	url = "http://127.0.0.1:18333/"
    21	payload = "{\"jsonrpc\": \"1.0\", \"id\": \"bitcoin-histogram\", \"method\": \"getmempoolinfo\", \"params\": [[1,10,20,30]]}"
    22	headers = {
    23  	  'Authorization': 'Basic dXNlcjMzOnBhc3N3b3JkMzM=',
    24  	  'Content-Type': 'text/plain'
    25	}
    26	response = requests.request("POST", url, headers=headers, data=payload, )
    27	fee_dist = response.json()
    28	size_group = fee_dist['result']['fee_histogram']['fee_rate_groups']["" +str(group)+ ""]['size']	
    29	return str(size_group)
    30
    31if __name__ == "__main__":
    32	app.run(debug = True, host='127.0.0.1', port=8080, passthrough_errors=True)
    
  82. ghost commented at 7:30 pm on November 3, 2021: none

    @kiminuo #22891 adds fee rate distribution bars in -getinfo. Its on top of this PR. Will be helpful if you could review :)

    image

    Also waiting for reviewers in this PR to leave some ACKs or comments to improve so that other projects can also use this histogram

  83. stratospher commented at 6:29 pm on November 12, 2021: contributor

    ACK f2ca3d3. I really like this idea.

    • 6466eb6
      • Introduces an optional argument feeLimits to MempoolInfoToJSON() in order to specify the required fee rate ranges of the histogram. fee_rate_per_byte is calculated and the transaction is placed in the correct fee rate group.
    • f2ca3d3
      • Tests whether the correct histogram is generated for various cases when the mempool contains 0, 1, 2 and 3 transactions.
  84. kristapsk commented at 7:38 pm on November 12, 2021: contributor
    Concept ACK
  85. kristapsk approved
  86. kristapsk commented at 8:47 pm on November 12, 2021: contributor
    ACK f2ca3d35ee999e0be309a65c4f69865824e27a4b. Reviewed code, ran new functional test and did basic manual testing on signet and mainnet.
  87. in src/rpc/blockchain.h:46 in f2ca3d35ee outdated
    41@@ -41,8 +42,10 @@ void RPCNotifyBlockChange(const CBlockIndex*);
    42 /** Block description to JSON */
    43 UniValue blockToJSON(const CBlock& block, const CBlockIndex* tip, const CBlockIndex* blockindex, bool txDetails = false) LOCKS_EXCLUDED(cs_main);
    44 
    45+typedef std::vector<CAmount> MempoolHistogramFeeRates;
    


    0xB10C commented at 12:37 pm on November 22, 2021:

    nit: I guess CAmount works here, but the values actually represent a feerate (∈ N*) and not an /** Amount in satoshis (Can be negative) */.

    Maybe?

    0typedef std::vector<uint64_t> MempoolHistogramFeeRates;
    

    Feel free to ignore this though.


    kiminuo commented at 8:17 pm on November 23, 2021:

    It is more complicated because of this line:

    0const CAmount fee_rate{CFeeRate{fee, size}.GetFee(1)}; // Fee rate per byte
    

    (https://github.com/bitcoin/bitcoin/pull/21422/files#diff-decae4be02fb8a47ab4557fe74a9cb853bdfa3ec0fa1b515c0a1e5de91f4ad0bR1688)

    I think your suggestion seems reasonable but then I don’t really know whether there are some edge cases I don’t know about.


    0xB10C commented at 4:45 pm on November 25, 2021:
    Agree, yes. I’ve tried changing this and it gave me a bunch of warnings. I’d say ignore this comment unless someone else finds this to be problematic.

    kiminuo commented at 8:02 pm on November 25, 2021:
    Ok!
  88. in src/rpc/blockchain.cpp:1731 in f2ca3d35ee outdated
    1728                         {RPCResult::Type::BOOL, "loaded", "True if the mempool is fully loaded"},
    1729                         {RPCResult::Type::NUM, "size", "Current tx count"},
    1730                         {RPCResult::Type::NUM, "bytes", "Sum of all virtual transaction sizes as defined in BIP 141. Differs from actual serialized size because witness data is discounted"},
    1731                         {RPCResult::Type::NUM, "usage", "Total memory usage for the mempool"},
    1732-                        {RPCResult::Type::STR_AMOUNT, "total_fee", "Total fees for the mempool in " + CURRENCY_UNIT + ", ignoring modified fees through prioritizetransaction"},
    1733+                        {RPCResult::Type::STR_AMOUNT, "total_fee", "Total fees for the mempool in " + CURRENCY_UNIT + "/kvB, ignoring modified fees through prioritizetransaction"},
    


    0xB10C commented at 1:24 pm on November 22, 2021:
    I think /kvB was mistakenly added here and should be dropped. total_fee is not a feerate. Do you know if this was meant to be added somewhere else?

    kiminuo commented at 7:41 am on November 23, 2021:
    I think it was originally suggested here: #21422#pullrequestreview-621285483 but you are right. I will drop it. Thank you.
  89. in src/rpc/blockchain.cpp:1767 in 6466eb65ad outdated
    1775+        for (size_t i = 0; i < feelimits_univalue.size(); ++i) {
    1776+            int64_t value = feelimits_univalue[i].get_int64();
    1777+
    1778+            if (value < 0) {
    1779+                throw JSONRPCError(RPC_INVALID_PARAMETER, "Non-negative values are expected");
    1780+            } else if (i > 0 && feelimits.back() >= value) {
    


    brunoerg commented at 1:37 pm on November 22, 2021:
    What’s the reason for this i > 0 here?

    kiminuo commented at 2:37 pm on November 22, 2021:
    So you take one value at a time and compare it with the previous one to make sure that the sequence is an increasing one. For i == 0, you don’t have any previous value to compare with.

    brunoerg commented at 4:10 pm on November 22, 2021:
    Perfect. Thank you.
  90. in test/functional/mempool_fee_histogram.py:32 in f2ca3d35ee outdated
    27+        node.generate(102)
    28+
    29+        # We have two utxos and we do this:
    30+        #
    31+        # coinbase-tx-101 <- tx1 (5 sat/vB) <- tx2 (14 sat/vB) <----\
    32+        # coinbase-tx-102 <--------------------------------------- tx3 (6 sat/vB)
    


    0xB10C commented at 3:02 pm on November 22, 2021:

    These are the coinbase transactions for block 1 and block 2 not block 101 and block 102. The coinbase utxos for block 101 and block 102 aren’t mature. However, I don’t think it’s important that the UTXOs are Coinbase UTXOs.

    I found the ASCII art to be harder/took longer to understand than e.g. a comment like:

    0 We have two UTXOs (utxo_1 and utxo_2) and create three changeless transactions:
    1 - tx1 (5 sat/vB): spending utxo_1
    2 - tx2 (14 sat/vB): spending output from tx1
    3 - tx3 (6 sat/vB): spending utxo_2 and the output from tx2 
    

    kiminuo commented at 8:55 am on November 23, 2021:

    The text is more clear. Thanks.

    (A nicer diagram would be even more clear … :))

  91. in src/rpc/blockchain.cpp:1758 in f2ca3d35ee outdated
    1766+    std::optional<MempoolHistogramFeeRates> feelimits_opt = std::nullopt;
    1767+
    1768+    if (!request.params[0].isNull()) {
    1769+        const UniValue feelimits_univalue = request.params[0].get_array();
    1770+
    1771+        if (feelimits_univalue.size() == 0 || feelimits_univalue.size() > 30) {
    


    0xB10C commented at 3:17 pm on November 22, 2021:

    Is there a reason to limit this to specifically 30?

    If there is, I think it should be a constant and it should be tested.

    If not, I could image someone might want to use this with [1, 2, 3, ..., 9999, 10000] (or even more).


    kiminuo commented at 7:56 am on November 23, 2021:

    The idea behind the limit was that I didn’t want to let it be unbounded (MempoolInfoToJSON acquires pool.cs lock). If this is not a concern, I can remove the check or I can increase it substantially.

    I’m not sure what option is better really.


    kiminuo commented at 10:19 am on November 24, 2021:
    @0xB10C Would it make more sense to have a limit or having no limit? I’m not sure what is preferrable.

    0xB10C commented at 4:53 pm on November 25, 2021:

    Good point on the lock.. Can we benchmark the time the RPC takes with 10, 100. 1000, 10000 histogram bins? I think if it scales linear O(n) and somewhere between 10ms or so, then it’s fine to have no limit. Users selecting a million bins should expect that this could take a bit longer than only 10 bins.

    I’ll see if I can test that.


    kiminuo commented at 8:22 pm on November 25, 2021:

    https://github.com/bitcoin/bitcoin/pull/21422/files#diff-decae4be02fb8a47ab4557fe74a9cb853bdfa3ec0fa1b515c0a1e5de91f4ad0bR1685-R1699 takes O(#mempoolTxs * #bins)

    Looking at https://jochen-hoenicke.de/queue/#BTC,1y,count, to have some sense about numbers, it can be something like: 20_000 * 10_000 = 200_000_000 that should be computed in a less than a second.

    If you can test it, it would be great. Thinking about it, maybe it’s easy just to modify the test we have to add a lot of transactions and just measure the time. So it should be relatively easy I guess.


    kiminuo commented at 10:34 pm on March 4, 2023:

    @0xB10C I modified

     0for (const CTxMemPoolEntry& e : pool.mapTx) {
     1    const CAmount fee{e.GetFee()};
     2    const uint32_t size{uint32_t(e.GetTxSize())};
     3    const CAmount fee_rate{CFeeRate{fee, size}.GetFee(1)};
     4
     5    // Distribute fee rates
     6    for (size_t i = floors.size(); i-- > 0;) {
     7        if (fee_rate >= floors[i]) {
     8            sizes[i] += size;
     9            ++count[i];
    10            fees[i] += fee;
    11            break;
    12        }
    13    }
    14}
    

    to

     0std::vector<int> test_fees;
     1std::vector<uint32_t> test_sizes;
     2
     3FastRandomContext rand;
     4
     5const int test_iterations = 1000;
     6const int test_mempool_tx_size = 30000;
     7
     8for (int r = 0; r < test_iterations * test_mempool_tx_size; r++) {
     9    test_fees.push_back(rand.randrange(100)); // 0 - 100
    10    test_sizes.push_back(rand.randrange(400) + 200); // 200 - 600
    11}
    12
    13const auto before =  std::chrono::system_clock::now();
    14
    15for (int t = 0; t < test_iterations; t++) {
    16    for (int r = 0; r < test_mempool_tx_size; r++) {
    17        const CAmount fee = test_fees[t * test_mempool_tx_size + r];
    18        const uint32_t size = test_sizes[t * test_mempool_tx_size + r];
    19        const CAmount fee_rate{CFeeRate{fee, size}.GetFee(1)};
    20
    21        // Distribute fee rates
    22        for (size_t i = floors.size(); i-- > 0;) {
    23            if (fee_rate >= floors[i]) {
    24                sizes[i] += size;
    25                ++count[i];
    26                fees[i] += fee;
    27                break;
    28            }
    29        }
    30    }
    31}
    32
    33const std::chrono::duration<double, std::milli> duration = std::chrono::system_clock::now() - before;
    34std::cout << "It took " << duration.count() << "ms" << std::endl; // divide by the test_iterations number (ie 1000)
    

    So that I can measure this code a bit. It’s pretty ugly code and it’s certainly not scientific but the goal is just to have some sense how fast/slow it is. I take it that even if I were off by the factor of 10, it’s still good enough for these particular measurements.

    So on my machine, I measured these numbers:

    • 1000 bins ~14 ms
    • 10000 bins ~ 97 ms
    • (not too sure if there is a limit how many bins one can pass actually as I would guess that there is a limit on how long an RPC input can be)

    That is to say that the artificial limitation to 30 bins is too strict. And I’m removing that restriction because for “normal” use-cases it looks like the restriction is just an unnecessary impediment.

    WDYT?

  92. in src/rpc/blockchain.cpp:1742 in f2ca3d35ee outdated
    1740+                            {
    1741+                                {RPCResult::Type::OBJ_DYN, "fee_rate_groups", "",
    1742+                                {
    1743+                                    {RPCResult::Type::OBJ, "<fee_rate_group>", "Fee rate group named by its lower bound (in " + CURRENCY_ATOM + "/vB), identical to the \"from\" field below",
    1744+                                    {
    1745+                                        {RPCResult::Type::NUM, "size", "Cumulative size of all transactions in the fee rate group"},
    


    0xB10C commented at 4:06 pm on November 22, 2021:

    nit: could add

    0                                        {RPCResult::Type::NUM, "size", "Cumulative size of all transactions in the fee rate group (in vBytes)"},
    

    kiminuo commented at 7:46 am on November 23, 2021:
    Thanks
  93. in src/rpc/blockchain.cpp:1748 in f2ca3d35ee outdated
    1746+                                        {RPCResult::Type::NUM, "count", "Number of transactions in the fee rate group"},
    1747+                                        {RPCResult::Type::NUM, "fees", "Cumulative fees of all transactions in the fee rate group (in " + CURRENCY_ATOM + "/vB)"},
    1748+                                        {RPCResult::Type::NUM, "from", "Group contains transactions with fee rates equal or greater than this value (in " + CURRENCY_ATOM + "/vB)"},
    1749+                                        {RPCResult::Type::NUM, "to", "Group contains transactions with fee rates equal or less than this value (in " + CURRENCY_ATOM + "/vB)"},
    1750+                                    }}}},
    1751+                                {RPCResult::Type::NUM, "total_fees", "Total available fees in mempool (in " + CURRENCY_ATOM + "/vB)"},
    


    0xB10C commented at 4:08 pm on November 22, 2021:

    total_fees is not a feerate.

    0                                {RPCResult::Type::NUM, "total_fees", "Total available fees in mempool (in " + CURRENCY_ATOM + ")"},
    

    kiminuo commented at 7:48 am on November 23, 2021:
  94. in src/rpc/blockchain.cpp:1780 in 6466eb65ad outdated
    1781+                throw JSONRPCError(RPC_INVALID_PARAMETER, "Strictly increasing values are expected");
    1782+            }
    1783+
    1784+            feelimits.push_back(value);
    1785+        }
    1786+        feelimits_opt = std::optional<MempoolHistogramFeeRates>(feelimits);
    


    sipa commented at 4:19 pm on November 22, 2021:
    Using std::move(feelimits) will avoid a vector copy. Same with the MempoolInfoToJSON call below.

    kiminuo commented at 10:19 am on November 24, 2021:
    Hopefully addressed. Thanks.
  95. in src/rpc/blockchain.cpp:1667 in 6466eb65ad outdated
    1662+            const CAmount fee{e.GetFee()}, afees{e.GetModFeesWithAncestors()}, dfees{e.GetModFeesWithDescendants()};
    1663+            const uint32_t size{(uint32_t)e.GetTxSize()}, asize{(uint32_t)e.GetSizeWithAncestors()}, dsize{(uint32_t)e.GetSizeWithDescendants()};
    1664+
    1665+            const CAmount fpb{CFeeRate{fee, size}.GetFee(1)};     // Fee rate per byte
    1666+            const CAmount afpb{CFeeRate{afees, asize}.GetFee(1)}; // Fee rate per byte including ancestors
    1667+            const CAmount dfpb{CFeeRate{dfees, dsize}.GetFee(1)}; // Fee rate per byte including descendants
    


    sipa commented at 4:23 pm on November 22, 2021:

    I think using ancestor feerates here is going to distort the result, because multiple transactions can have shared ancestors. Using descendant feerates is even more questionable to me, as they don’t matter for transaction selection in blocks at all.

    Imagine a situation with 3 transactions (example due to @Xekyo): one parent at 1 sat/vbyte, with two children each at 7 sat/vbyte, with all transactions the same size. The current code will classify the parent at 5 sat/vbyte, and the children each at 4 sat/vbyte. That’s not accurate; e.g. the (weight-weighted) average of the reported feerates is 4.33, while the real average is 5, so this doesn’t reflect the actual state of the mempool. It also doesn’t correspond to the decisions actually made by miners, which with the current CPFP code would consider first a package of 1 parent + 1 child at 4 sat/vbyte, and then another child at 7 sat/vbyte.

    My suggestion would be either:

    • (1) use only transaction feerates directly (no ancestors/descendants), and have a result that ignores CPFP-like mechanics.
    • (2) invoke the mining code to actually build a giant “block” from the entire mempool (ignoring weight limits), and have it report the feerates of its included packages (which are similar to what you’d get with ancestor fees/sizes, but is guaranteed not to count any transaction twice).

    kiminuo commented at 8:15 pm on November 22, 2021:

    I think using ancestor feerates here is going to distort the result, because multiple transactions can have shared ancestors. Using descendant feerates is even more questionable to me, as they don’t matter for transaction selection in blocks at all.

    So I have adopted #15836 and did some modifications but I left L1665-L1667 as they were in the original PR because I was not sure why it was implemented that way. My attempts to contact the original author were unsuccessful for me to understand the motivation. Anyway, I knew back then that I didn’t have a good explanation for the lines … evidently rightly so. I should have added a comment warning about this though, that’s my mistake, sorry for that.

    Thank you for the analysis and the suggestions. I’ll try to address the issue.


    sipa commented at 6:27 pm on November 23, 2021:
    @kiminuo Just in case it needs to be said: no need to apologize for trying to improve things. The final state of the code is everyone’s responsibility; developers and reviewers.

    kiminuo commented at 10:14 am on November 24, 2021:

    @sipa

    • (1) use only transaction feerates directly (no ancestors/descendants), and have a result that ignores CPFP-like mechanics.

    I have picked (1) approach for now because: I can modify the PR in reasonable time and then it can serve as some sort of baseline. So we can still decide later whether the approach is the best.

    Anyway, there are still questions:

    • Is (2) much better than (1) or not?
    • Is there room for having both (1) and (2) possibly?
    • (2) seems hard-ish to implement. If that is true, one concern here is whether the additional complexity is worth it. I don’t know.

    btw: I was interested how https://jochen-hoenicke.de/queue/#BTC,24h,fee works and it seems that #15836 was actually heavily inspired by: https://github.com/jhoenicke/mempool/blob/31d8dcb896ebc8139d3e60c3f275b94811293777/mempool_sql.py#L44-L49.


    luke-jr commented at 4:42 pm on November 24, 2021:
    It really depends on the use case for the histogram. If people are trying to make decisions on what fee to use, it’s probably better to under-estimate the fee rate of a given transaction (by applying a CPFP logic excessively) than over-estimate it (by ignoring CPFP, etc). Are there other use cases that would be impacted?

    sipa commented at 4:43 pm on November 24, 2021:
    • Is (2) much better than (1) or not?

    It depends on how much CPFP is going on of course in the mempool overall. I think (2) is “perfect” in the sense that it exactly answers the question how much vbytes are competing at every feerate level for block space, according to Bitcoin Core’s own block building algorithm. But I think (1) is probably a good first approximation.

    • Is there room for having both (1) and (2) possibly?

    I don’t think so. We should just use the best algorithm we have, and I think (2) is strictly better, but obviously a lot more work.

    • (2) seems hard-ish to implement. If that is true, one concern here is whether the additional complexity is worth it. I don’t know.

    It’s certainly more code changes. I’m not sure whether it’s worth it, but that discussion can be left to a future improvement too.

    Re: johoe’s site using that formula… interesting find. This earlier version of the code has a bit more comments: https://github.com/jhoenicke/mempool/blob/548698d6d255a2e1b7d6f1981403d3de55c2182a/mempool-sql.pl#L30L38. I see where it’s coming from now, but I can’t imagine that it doesn’t add as many inaccuracies as that it improves (see examples above).


    kiminuo commented at 8:34 am on November 25, 2021:

    @luke-jr

    It really depends on the use case for the histogram. If people are trying to make decisions on what fee to use, it’s probably better to under-estimate the fee rate of a given transaction (by applying a CPFP logic excessively) than over-estimate it (by ignoring CPFP, etc).

    From practicality point of view, I agree. But I would still like to follow this plan as it converges to the best possible solution. @sipa

    Re: johoe’s site using that formula… interesting find. This earlier version of the code has a bit more comments: https://github.com/jhoenicke/mempool/blob/548698d6d255a2e1b7d6f1981403d3de55c2182a/mempool-sql.pl#L30L38. I see where it’s coming from now,

    Ah, that’s good to know.

    but I can’t imagine that it doesn’t add as many inaccuracies as that it improves (see examples above).

    Without thinking about it too deeply, my guess is that one can come up with pathological mempool state cases to force it report bad values. If it behaves ok-ish for real-world mempools, I don’t really know. Anyway, #21422 (comment) makes sense to me (the worst alternative there is basically that people disagree and then we would chase approach (2) immediately).

  96. in src/rpc/blockchain.cpp:1671 in f2ca3d35ee outdated
    1666+            const CAmount afpb{CFeeRate{afees, asize}.GetFee(1)}; // Fee rate per byte including ancestors
    1667+            const CAmount dfpb{CFeeRate{dfees, dsize}.GetFee(1)}; // Fee rate per byte including descendants
    1668+
    1669+            // Fee rate per byte including ancestors & descendants
    1670+            const CAmount tfpb{CFeeRate{afees + dfees - fee, asize + dsize - size}.GetFee(1)};
    1671+            const CAmount fee_rate_per_byte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
    


    0xB10C commented at 4:35 pm on November 22, 2021:

    nit: my definition of feerate is fee per size. This doesn’t really work with fee rate per byte as it would be a fee per size per size? Also not sure about variable names like afpb and dfpb. Why not just spell it out as e.g. ancestor_feerate and descendant_feerate?

    (I’m assuming tfpb means total_feerate?)

    0            const CAmount fee{e.GetFee()}, ancestor_fees{e.GetModFeesWithAncestors()}, descendant_fees{e.GetModFeesWithDescendants()};
    1            const uint32_t size{(uint32_t)e.GetTxSize()}, ancestor_size{(uint32_t)e.GetSizeWithAncestors()}, descendant_size{(uint32_t)e.GetSizeWithDescendants()};
    2
    3            const CAmount feerate{CFeeRate{fee, size}.GetFee(1)};     // Fee rate per byte
    4            const CAmount ancestor_feerate{CFeeRate{ancestor_fees, ancestor_size}.GetFee(1)}; // Fee rate per byte including ancestors
    5            const CAmount descendant_feerate{CFeeRate{descendant_fees, descendant_size}.GetFee(1)}; // Fee rate per byte including descendants
    6
    7            // Fee rate per byte including ancestors & descendants
    8            const CAmount total_feerate{CFeeRate{ancestor_fees + descendant_fees - fee, ancestor_size + descendant_size - size}.GetFee(1)};
    9            const CAmount feerate{std::max(std::min(descendant_feerate, total_feerate), std::min(feerate, ancestor_feerate))};
    

    0xB10C commented at 4:37 pm on November 22, 2021:

    Can you explain these two lines?

    01670    const CAmount tfpb{CFeeRate{afees + dfees - fee, asize + dsize - size}.GetFee(1)};
    11671    const CAmount fee_rate_per_byte{std::max(std::min(dfpb, tfpb), std::min(fpb, afpb))};
    

    murchandamus commented at 6:37 pm on November 22, 2021:

    I’m confused by these as well.

    This breaks down when there are multiple descendants or ancestors in various scenarios (all txs assumed to be same size).

    If you e.g. had a parent transaction paying 5 sat/vB that had a child paying 1 sat/vB, the child would be irrelevant for the parent’s effective feerate and vice versa. If you however have a parent paying 3 sat/vB, and two children where one pays 5 sat/vB and one pays 1 sat/vB. The one with the 5 sat/vB would form a CPFP situation with the parent, while the other child would be effectively unrelated: since the parent’s feerate is larger than the child’s, the ancestor will be included before the child independently. Only children pay for parents, parents don’t pay for children in this case. ;)

    Beyond that, many transactions will have overlapping ancestries or descendants, which means that the bucketing across the complete graph of related transactions would often count related txs multiple times, e.g. where a parent paying 3 sat/vB has two children paying 4 sat/vB and 7 sat/vB, only the latter will form a CPFP and the parent will already be included when the child with 4 sat/vB is up for transaction selection.


    kiminuo commented at 8:19 pm on November 22, 2021:
    I can only say this: #21422 (review) and that I will try to fix it.

    kiminuo commented at 10:18 am on November 24, 2021:

    nit: my definition of feerate is fee per size. This doesn’t really work with fee rate per byte as it would be a fee per size per size?

    Yeah, I have the same definition of fee rate as “fee per size”.

    I’ve been trying to align with Sipa’s approach (1) for now.

  97. sipa commented at 4:39 pm on November 22, 2021: member
    Approach NACK as-is, this will result in distorted numbers. #21422 (review)
  98. 0xB10C changes_requested
  99. 0xB10C commented at 4:50 pm on November 22, 2021: contributor

    Thanks for picking this up. Left a few comments.

    Test currently only cover the count and not the fees or sizes per feerate group.

  100. in src/rpc/blockchain.cpp:1744 in 6466eb65ad outdated
    1742+                                {
    1743+                                    {RPCResult::Type::OBJ, "<fee_rate_group>", "Fee rate group named by its lower bound (in " + CURRENCY_ATOM + "/vB), identical to the \"from\" field below",
    1744+                                    {
    1745+                                        {RPCResult::Type::NUM, "size", "Cumulative size of all transactions in the fee rate group"},
    1746+                                        {RPCResult::Type::NUM, "count", "Number of transactions in the fee rate group"},
    1747+                                        {RPCResult::Type::NUM, "fees", "Cumulative fees of all transactions in the fee rate group (in " + CURRENCY_ATOM + "/vB)"},
    


    murchandamus commented at 6:28 pm on November 22, 2021:
    This also sounds like a fee not a feerate, i.e. should not have a /vB.

    kiminuo commented at 7:51 am on November 23, 2021:
    Yes, thanks.
  101. in src/rpc/blockchain.cpp:1634 in f2ca3d35ee outdated
    1630@@ -1630,7 +1631,7 @@ static RPCHelpMan getchaintips()
    1631     };
    1632 }
    1633 
    1634-UniValue MempoolInfoToJSON(const CTxMemPool& pool)
    1635+UniValue MempoolInfoToJSON(const CTxMemPool& pool, const std::optional<MempoolHistogramFeeRates> feeLimits)
    


    jamesob commented at 9:22 pm on November 22, 2021:
    feeLimits seems like a confusing name for this - maybe change to e.g. histogramFloors?

    kiminuo commented at 7:55 pm on November 23, 2021:
    Yes, makes sense.
  102. in src/rpc/blockchain.cpp:1752 in f2ca3d35ee outdated
    1760+                    HelpExampleRpc("getmempoolinfo", R"([0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 17, 20, 25, 30, 40, 50, 60, 70, 80, 100, 120, 140, 170, 200])")
    1761                 },
    1762         [&](const RPCHelpMan& self, const JSONRPCRequest& request) -> UniValue
    1763 {
    1764-    return MempoolInfoToJSON(EnsureAnyMemPool(request.context));
    1765+    MempoolHistogramFeeRates feelimits;
    


    jamesob commented at 10:42 pm on November 22, 2021:
    feeLimits also a confusing name here (see comment above)
  103. kiminuo force-pushed on Nov 23, 2021
  104. kiminuo force-pushed on Nov 23, 2021
  105. kiminuo force-pushed on Nov 23, 2021
  106. kiminuo marked this as a draft on Nov 23, 2021
  107. kiminuo force-pushed on Nov 23, 2021
  108. in test/functional/mempool_fee_histogram.py:28 in 47b5c3e03a outdated
    23+    def skip_test_if_missing_module(self):
    24+        self.skip_if_no_wallet()
    25+
    26+    def run_test(self):
    27+        node = self.nodes[0]
    28+        node.generate(COINBASE_MATURITY + 2)
    


    brunoerg commented at 11:19 am on November 23, 2021:
    Test is failing here. TypeError: generatetoaddress() missing 1 required keyword-only argument: 'invalid_call'. I think there is something wrong with the implementation of this function in test_node and I don’t see any other test calling generate from node, I will take a look on it. But I think you can use generatetoaddress from test_framework.

    kiminuo commented at 12:00 pm on November 23, 2021:

    Yes, that’s also a reason why I converted the PR to a draft. I have done as much as I could today. I will continue tomorrow.

    edit: Any help is certainly welcome. :)

    edit 2: It might be so that master branch has merged a PR that modifies this. I vaguely remember that I have seen a PR like that. If so, then rebase + fixing the call would do the trick probably.


    kiminuo commented at 7:59 pm on November 23, 2021:
    Rebased because of #23300.

    brunoerg commented at 11:09 am on November 24, 2021:
    Great. #23300 solves it!
  109. in test/functional/test_framework/util.py:59 in 47b5c3e03a outdated
    55@@ -56,6 +56,11 @@ def assert_greater_than(thing1, thing2):
    56         raise AssertionError("%s <= %s" % (str(thing1), str(thing2)))
    57 
    58 
    59+def assert_no_key(key, dictionary):
    


    jamesob commented at 3:09 pm on November 23, 2021:
    Is this necessary? Can’t just write assert(key not in dict) inline?

    kiminuo commented at 7:52 pm on November 23, 2021:
    It is not. Good idea. Thanks.
  110. jamesob commented at 3:10 pm on November 23, 2021: member

    Concept ACK

    It’s a good feature but needs some cleanup as many have noted. Rumor has it that a well-known wallet is currently using Bitcoin Knots solely because this feature isn’t available in this RPC interface, so maybe worth prioritizing review on that basis.

  111. kiminuo force-pushed on Nov 23, 2021
  112. kiminuo force-pushed on Nov 23, 2021
  113. kiminuo force-pushed on Nov 23, 2021
  114. kiminuo force-pushed on Nov 23, 2021
  115. kiminuo force-pushed on Nov 24, 2021
  116. sipa commented at 9:21 pm on November 24, 2021: member

    @kiminuo See https://github.com/sipa/bitcoin/commits/202111_mempoolfr for code that gets histogram data using the mining algorithm. Feel free to cherry pick or whatever; if you don’t, I may clean it up to make use it after this PR is merged.

    Now, it is somewhat slow (currently ~150ms for me for ~12000 mempool transactions, which is only a small fraction of the maximum). If we expect that go that direction, maybe that calls for having the functionality in a separate RPC than getmempoolinfo.

  117. kiminuo commented at 8:23 am on November 25, 2021: contributor

    @sipa That looks great!

    I would like to go with the simple approach first.

  118. kiminuo marked this as ready for review on Nov 26, 2021
  119. in src/rpc/blockchain.cpp:1687 in 2d2bae70ab outdated
    1683+        std::vector<uint64_t> count(floors.size(), 0);
    1684+        std::vector<CAmount> fees(floors.size(), 0);
    1685+
    1686+        for (const CTxMemPoolEntry& e : pool.mapTx) {
    1687+            const CAmount fee{e.GetFee()};
    1688+            const uint32_t size{(uint32_t)e.GetTxSize()};
    


    PastaPastaPasta commented at 1:40 pm on February 1, 2022:

    please remove this c-style cast

    0const uint32_t size{uint32_t(e.GetTxSize())};
    

    kiminuo commented at 10:01 am on February 2, 2022:

    Thank you. I have just changed it.

    Note: #23962 might affect this code later on.

    edit: Rebased. Hopefully, it will fix the mac test error.

  120. kiminuo force-pushed on Feb 2, 2022
  121. kiminuo force-pushed on Feb 2, 2022
  122. kiminuo force-pushed on Mar 10, 2022
  123. kiminuo commented at 8:51 am on March 12, 2022: contributor
    (1️⃣ year anniversary today 🎉)
  124. DrahtBot added the label Needs rebase on Mar 16, 2022
  125. kiminuo force-pushed on Mar 21, 2022
  126. kiminuo commented at 10:30 am on March 21, 2022: contributor
    Rebased after #24537.
  127. DrahtBot removed the label Needs rebase on Mar 21, 2022
  128. kiminuo commented at 2:21 pm on March 21, 2022: contributor

    @jamesob

    Concept ACK

    It’s a good feature but needs some cleanup as many have noted. Rumor has it that a well-known wallet is currently using Bitcoin Knots solely because this feature isn’t available in this RPC interface, so maybe worth prioritizing review on that basis.

    Can you clarify what you mean by the cleanup?

  129. in src/rpc/mempool.cpp:393 in 84418b05c3 outdated
    388@@ -386,39 +389,139 @@ UniValue MempoolInfoToJSON(const CTxMemPool& pool)
    389     ret.pushKV("bytes", (int64_t)pool.GetTotalTxSize());
    390     ret.pushKV("usage", (int64_t)pool.DynamicMemoryUsage());
    391     ret.pushKV("total_fee", ValueFromAmount(pool.GetTotalFee()));
    392-    int64_t maxmempool{gArgs.GetIntArg("-maxmempool", DEFAULT_MAX_MEMPOOL_SIZE) * 1000000};
    393-    ret.pushKV("maxmempool", maxmempool);
    394+    size_t maxmempool = gArgs.GetIntArg("-maxmempool", DEFAULT_MAX_MEMPOOL_SIZE) * 1000000;
    395+    ret.pushKV("maxmempool", (int64_t)maxmempool);
    


    luke-jr commented at 8:49 pm on March 21, 2022:
    This makes no sense? Why?

    kiminuo commented at 11:08 pm on March 21, 2022:
    Bad rebase :-|
  130. in src/rpc/mempool.cpp:457 in 84418b05c3 outdated
    474-            }},
    475-        RPCExamples{
    476-            HelpExampleCli("getmempoolinfo", "")
    477-            + HelpExampleRpc("getmempoolinfo", "")
    478-        },
    479+                "\nReturns details on the active state of the TX memory pool.\n",
    


    luke-jr commented at 9:14 pm on March 21, 2022:
    You’re unfixing the indentation here… bad rebase?

    kiminuo commented at 11:08 pm on March 21, 2022:
    Thanks, I have fixed it hopefully.

    luke-jr commented at 0:20 am on March 22, 2022:
    Nope

    luke-jr commented at 2:41 pm on March 22, 2022:
    Looks resolved. 288c4524fa0458a819664ccd568fb8e34e06cc7a -> 4af229650fe appears to be a clean rebase.
  131. kiminuo force-pushed on Mar 21, 2022
  132. kiminuo force-pushed on Mar 22, 2022
  133. DrahtBot added the label Needs rebase on Mar 28, 2022
  134. kiminuo force-pushed on Mar 28, 2022
  135. DrahtBot removed the label Needs rebase on Mar 28, 2022
  136. DrahtBot added the label Needs rebase on Apr 6, 2022
  137. kiminuo force-pushed on Apr 28, 2022
  138. kiminuo commented at 7:56 am on April 28, 2022: contributor
    @Xekyo Would you be interested in having another look at the PR, please?
  139. DrahtBot removed the label Needs rebase on Apr 28, 2022
  140. 0xB10C commented at 9:01 am on April 28, 2022: contributor
    The CI seems to be unhappy with the changes you’ve pushed. Probably best to make sure the tests pass locally first before there’s another round of review.
  141. kiminuo force-pushed on Apr 28, 2022
  142. in src/rpc/mempool.cpp:698 in 1de334c5f7 outdated
    694+                            {
    695+                                {RPCResult::Type::NUM, "size", "Cumulative size of all transactions in the fee rate group (in vBytes)"},
    696+                                {RPCResult::Type::NUM, "count", "Number of transactions in the fee rate group"},
    697+                                {RPCResult::Type::NUM, "fees", "Cumulative fees of all transactions in the fee rate group (in " + CURRENCY_ATOM + ")"},
    698+                                {RPCResult::Type::NUM, "from", "Group contains transactions with fee rates equal or greater than this value (in " + CURRENCY_ATOM + "/vB)"},
    699+                                {RPCResult::Type::ANY, "to", "Group contains transactions with fee rates equal or less than this value (in " + CURRENCY_ATOM + "/vB)"},
    


    luke-jr commented at 0:03 am on April 29, 2022:
    This is incorrect. It’s NUM, just optional (null is identical to omitted).

    kiminuo commented at 6:24 am on April 29, 2022:

    I know it’s incorrect. I just can’t see to make it work to make RPCResult::MatchesType happy. I have actually tried that NUM+optional but RPCResult::MatchesType reported an error.

    If you know correct solution, can you share it here?

    PS: I should probably have a deeper look how RPCResult::MatchesType handles NUMs.


    luke-jr commented at 8:00 pm on April 29, 2022:
    It’s entirely possible the MatchesType is buggy, in which case fixing it is the correct solution.

    maflcko commented at 6:31 am on April 30, 2022:

    null is identical to omitted

    That’s not true. In python it will throw a key-error.

    Maybe just not return to at all ever?


    luke-jr commented at 2:58 pm on June 10, 2022:
    The original PR returned max-int64 for the final range FWIW

    kiminuo commented at 8:17 am on July 10, 2022:

    I removed to as suggested. It feels right to me.

    CI failure looks unrelated to me … or not?


    kiminuo commented at 9:26 am on December 31, 2022:
    CI is green now.
  143. kiminuo force-pushed on Apr 29, 2022
  144. luke-jr referenced this in commit 01ab8e3dde on May 21, 2022
  145. luke-jr referenced this in commit 7fe3b24122 on May 21, 2022
  146. DrahtBot added the label Needs rebase on May 27, 2022
  147. kiminuo force-pushed on Jul 9, 2022
  148. DrahtBot removed the label Needs rebase on Jul 9, 2022
  149. DrahtBot added the label Needs rebase on Jul 18, 2022
  150. DrahtBot removed the label Needs rebase on Jul 19, 2022
  151. DrahtBot added the label Needs rebase on Aug 5, 2022
  152. kiminuo force-pushed on Aug 10, 2022
  153. kiminuo force-pushed on Aug 10, 2022
  154. DrahtBot removed the label Needs rebase on Aug 10, 2022
  155. DrahtBot added the label Needs rebase on Dec 19, 2022
  156. kiminuo force-pushed on Dec 29, 2022
  157. DrahtBot removed the label Needs rebase on Dec 29, 2022
  158. kiminuo force-pushed on Dec 29, 2022
  159. kiminuo force-pushed on Dec 30, 2022
  160. kiminuo force-pushed on Dec 30, 2022
  161. kiminuo commented at 9:32 am on December 31, 2022: contributor
    @sipa Would you mind reconsidering your concept NACK (https://github.com/bitcoin/bitcoin/pull/21422#issuecomment-975711424) given that the algorithm is now as you proposed (variant I)?
  162. kristapsk approved
  163. kristapsk commented at 5:32 pm on January 2, 2023: contributor
    re-ACK 56512a06800781cde7bc63fe386ba7301db6f40a
  164. molnard commented at 10:10 am on January 3, 2023: none
    cACK
  165. sipa commented at 3:00 pm on January 3, 2023: member
    Concept ACK
  166. kristapsk commented at 11:47 am on January 21, 2023: contributor
    Isn’t this good for merge? Or waiting for some more reviews?
  167. in src/rpc/mempool.cpp:776 in 32d872a2a5 outdated
    774+    std::optional<MempoolHistogramFeeRates> histogram_floors_opt = std::nullopt;
    775+
    776+    if (!request.params[0].isNull()) {
    777+        const UniValue histogram_floors_univalue = request.params[0].get_array();
    778+
    779+        if (histogram_floors_univalue.size() == 0 || histogram_floors_univalue.size() > 30) {
    


    aureleoules commented at 9:34 am on January 23, 2023:
    0        if (histogram_floors_univalue.empty() || histogram_floors_univalue.size() > 30) {
    

    kiminuo commented at 9:26 pm on January 23, 2023:
    Changed. Thanks.
  168. in src/rpc/mempool.cpp:28 in 32d872a2a5 outdated
    25@@ -25,6 +26,8 @@
    26 
    27 using kernel::DumpMempool;
    28 
    29+#include <optional>
    


    aureleoules commented at 9:35 am on January 23, 2023:
    Move this include above the using.

    kiminuo commented at 9:26 pm on January 23, 2023:
    Thanks. Changed.
  169. in src/rpc/mempool.cpp:711 in 32d872a2a5 outdated
    692+                    sizes[i] += size;
    693+                    ++count[i];
    694+                    fees[i] += fee;
    695+                    break;
    696+                }
    697+            }
    


    aureleoules commented at 9:57 am on January 23, 2023:

    32d872a2a58c79ff79049553c97cb78a7314bd8e: Could use std::algorithm.

    0            auto it = std::upper_bound(floors.begin(), floors.end(), fee_rate);
    1            if (it != floors.begin()) {
    2                size_t i = std::distance(floors.begin(), it - 1);
    3                sizes[i] += size;
    4                ++count[i];
    5                fees[i] += fee;
    6            }
    

    kiminuo commented at 9:31 pm on January 23, 2023:

    Reading about std::upper_bound, it looks to me that one should compare with floors.end() and not floors.begin().

    Also that it - 1 makes me a bit concerned because the current code allows works with i = 0.

    Anyway, I’m not too good with these idiomatic C++ code changes, so I might be completely wrong.

  170. in test/functional/mempool_fee_histogram.py:38 in 56512a0680 outdated
    33+        # - tx1 (5 sat/vB): spending utxo_1
    34+        # - tx2 (14 sat/vB): spending output from tx1
    35+        # - tx3 (6 sat/vB): spending utxo_2 and the output from tx2
    36+
    37+        self.log.info("Test getmempoolinfo does not return fee histogram by default")
    38+        assert("fee_histogram" not in node.getmempoolinfo())
    


    aureleoules commented at 10:00 am on January 23, 2023:

    missing whitespace after keyword

    0        assert ("fee_histogram" not in node.getmempoolinfo())
    

    kiminuo commented at 9:26 pm on January 23, 2023:

    Thank you.

    btw: It would be nice to have a lint warning about this. If it was not added since my last push.


    aureleoules commented at 9:36 pm on January 23, 2023:

    btw: It would be nice to have a lint warning about this. If it was not added since my last push.

    There was, the lint job failed with this error. I just copy-pasted it.

  171. aureleoules commented at 10:00 am on January 23, 2023: member
    Left some minor comments.
  172. aureleoules commented at 10:01 am on January 23, 2023: member
    Needs rebase
  173. kiminuo force-pushed on Jan 23, 2023
  174. kiminuo commented at 9:41 pm on January 23, 2023: contributor

    Needs rebase

    Rebased.

  175. kiminuo force-pushed on Mar 4, 2023
  176. kiminuo force-pushed on Mar 11, 2023
  177. kiminuo force-pushed on Mar 11, 2023
  178. kiminuo commented at 10:01 pm on March 11, 2023: contributor

    It would be good if the functional test actually verified that the values are correctly calculated after creating a few txns; ATM it is only really a smoke test that verifies the output structure. (by @jonatack) Test currently only cover the count and not the fees or sizes per feerate group. (by @0xB10C)

    Added explicit assert lines in the test for this.

  179. Introduce fee histogram in getmempoolinfo RPC command
    Co-authored-by: Jonas Schnelli <dev@jonasschnelli.ch>
    Co-authored-by: Jon Atack <jon@atack.com>
    d242aa52de
  180. test: Add mempool fee histogram test coverage
    Original commit: https://github.com/bitcoin/bitcoin/commit/0b6ba66238c377116bc6c21e19cffbf1b6dfc788
    
    Co-authored-by: João Barbosa <joao.paulo.barbosa@gmail.com>
    Co-authored-by: Jon Atack <jon@atack.com>
    c5e53d0d21
  181. kiminuo force-pushed on Mar 11, 2023
  182. kiminuo commented at 10:10 pm on March 11, 2023: contributor

    Concept ACK!

    It might make sense to have the API be more similar to https://numpy.org/doc/stable/reference/generated/numpy.histogram.html

    Honestly, I’m not sure. Anyone else who prefer that approach?

  183. kiminuo commented at 9:52 am on March 13, 2023: contributor
    (2️ year anniversary today 🎉)
  184. in src/rpc/mempool.cpp:698 in c5e53d0d21
    693+
    694+        std::vector<uint64_t> sizes(floors.size(), 0);
    695+        std::vector<uint64_t> count(floors.size(), 0);
    696+        std::vector<CAmount> fees(floors.size(), 0);
    697+
    698+        for (const CTxMemPoolEntry& e : pool.mapTx) {
    


    glozow commented at 10:46 am on March 13, 2023:
    fyi you can get the same information for each transaction by calling infoAll(). I think it’s not great practice for outside callers to reach into mapTx
  185. glozow commented at 11:31 am on March 13, 2023: member

    I’d be a pretty strong concept ACK to what seems like the original intent of the PR, enabling users/callers to get an understanding of the distribution of “effective” feerates of transactions taking into account ancestors/descendants. I definitely agree with the discussion at #21422 (review) and other comments about the implementation being much more complex and involved.

    However, given that the functionality added is currently to just build a histogram based on the individual feerates, I’m a little less clear on its utility and why it is necessary to add and maintain this in Bitcoin Core given how easy it would be to do it externally (which may or may not have been the case when people first asked for this?). I personally think we should have good reasons for putting something in Bitcoin Core as opposed to somewhere else in the software stack. Just my own opinion, and hopefully provides a reasonable explanation for why I haven’t prioritized reviewing this.

    Thank you very much for maintaining this PR for 2 years. If previous reviewers want to come back and drop a new review, maybe we can make some progress here.

  186. kiminuo commented at 3:26 pm on March 13, 2023: contributor

    @glozow Thank you for the review.

    given how easy it would be to do it externally (which may or may not have been the case when people first asked for this?)

    So what would be your approach to do it externally then? Especially if you want to get this information more often.

  187. kouloumos commented at 3:34 pm on March 13, 2023: contributor

    So what would be your approach to do it externally then? Especially if you want to get this information more often.

    Could a tracepoints approach, similar to #26531, work?

  188. kiminuo commented at 3:40 pm on March 13, 2023: contributor

    So what would be your approach to do it externally then? Especially if you want to get this information more often.

    Could a tracepoints approach, similar to #26531, work?

    I need to research that in detail but it looks promising. Thank you for sharing it here!

  189. 0xB10C commented at 11:53 am on March 14, 2023: contributor

    Could a tracepoints approach, similar to #26531, work?

    I need to research that in detail but it looks promising. Thank you for sharing it here!

    I don’t think tracepoints are suited for this use-case. Tracepoints allow you to react to a specific event. Here, you request / pull data when a consumer needs it.

    So what would be your approach to do it externally then? Especially if you want to get this information more often.

    You can query the getrawmempool RPC with the verbose flag set and build the histogram yourself. This Python script produces a similar histogram to what your adding in this PR.

     0from bisect import bisect_left
     1import json
     2import sys
     3
     4BUCKETS = [0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 17, 20, 25, 30, 40, 50, 60, 70, 80, 100, 120, 140, 170, 200]
     5COIN = 100000000
     6
     7histogram = {}
     8total_fees = 0
     9
    10def feerate_bucket(feerate):
    11    # Depending on how many buckets you have you probably don't even want to use a binary search. As the lower buckets are likely used more frequently a linear search should work fine too.
    12    i = bisect_left(BUCKETS, feerate)
    13    if i == len(BUCKETS):
    14        return BUCKETS[-1]
    15    return BUCKETS[i]
    16
    17mempool = json.load(sys.stdin)
    18
    19for txid in mempool:
    20    tx = mempool[txid]
    21    fee_sat = int(tx["fees"]["base"] * COIN)
    22    feerate = fee_sat / tx["vsize"]
    23    bucket = feerate_bucket(feerate)
    24    if bucket not in histogram:
    25        histogram[bucket] = { "vsize": 0, "count": 0, "fees": 0, "from": bucket }
    26    histogram[bucket]["vsize"] += tx["vsize"]
    27    histogram[bucket]["fees"] += fee_sat
    28    histogram[bucket]["count"] += 1
    29    total_fees += fee_sat
    30
    31fee_histogram = { "fee_rate_groups": histogram, "total_fees": total_fees }
    32print(json.dumps(fee_histogram, sort_keys = True, indent = 2))
    

    Runinng bitcoin-cli getrawmempool true | python3 feerate-histogram.py produces the following output for my current mempool.

      0{
      1  "fee_rate_groups": {
      2    "2": {
      3      "count": 7819,
      4      "fees": 88283562,
      5      "from": 2,
      6      "vsize": 44785089
      7    },
      8    "3": {
      9      "count": 255,
     10      "fees": 10961381,
     11      "from": 3,
     12      "vsize": 5227332
     13    },
     14    "4": {
     15      "count": 469,
     16      "fees": 9411549,
     17      "from": 4,
     18      "vsize": 3072663
     19    },
     20    "5": {
     21      "count": 341,
     22      "fees": 1203867,
     23      "from": 5,
     24      "vsize": 250260
     25    },
     26    "6": {
     27      "count": 325,
     28      "fees": 507162,
     29      "from": 6,
     30      "vsize": 88589
     31    },
     32    "7": {
     33      "count": 197,
     34      "fees": 574863,
     35      "from": 7,
     36      "vsize": 94766
     37    },
     38    "8": {
     39      "count": 99,
     40      "fees": 180911,
     41      "from": 8,
     42      "vsize": 23703
     43    },
     44    "10": {
     45      "count": 664,
     46      "fees": 2645382,
     47      "from": 10,
     48      "vsize": 270908
     49    },
     50    "12": {
     51      "count": 177,
     52      "fees": 605631,
     53      "from": 12,
     54      "vsize": 55444
     55    },
     56    "14": {
     57      "count": 588,
     58      "fees": 3090496,
     59      "from": 14,
     60      "vsize": 229933
     61    },
     62    "17": {
     63      "count": 495,
     64      "fees": 1748418,
     65      "from": 17,
     66      "vsize": 115549
     67    },
     68    "20": {
     69      "count": 1904,
     70      "fees": 9727850,
     71      "from": 20,
     72      "vsize": 510756
     73    },
     74    "25": {
     75      "count": 1335,
     76      "fees": 15926338,
     77      "from": 25,
     78      "vsize": 718764
     79    },
     80    "30": {
     81      "count": 1095,
     82      "fees": 8700416,
     83      "from": 30,
     84      "vsize": 320272
     85    },
     86    "40": {
     87      "count": 819,
     88      "fees": 7622226,
     89      "from": 40,
     90      "vsize": 221100
     91    },
     92    "50": {
     93      "count": 435,
     94      "fees": 4196189,
     95      "from": 50,
     96      "vsize": 94752
     97    },
     98    "60": {
     99      "count": 165,
    100      "fees": 2173791,
    101      "from": 60,
    102      "vsize": 39825
    103    },
    104    "70": {
    105      "count": 106,
    106      "fees": 1241456,
    107      "from": 70,
    108      "vsize": 19369
    109    },
    110    "80": {
    111      "count": 53,
    112      "fees": 1136711,
    113      "from": 80,
    114      "vsize": 15458
    115    },
    116    "100": {
    117      "count": 64,
    118      "fees": 26645889,
    119      "from": 100,
    120      "vsize": 269273
    121    },
    122    "120": {
    123      "count": 67,
    124      "fees": 42826729,
    125      "from": 120,
    126      "vsize": 418795
    127    },
    128    "140": {
    129      "count": 25,
    130      "fees": 598662,
    131      "from": 140,
    132      "vsize": 4723
    133    },
    134    "170": {
    135      "count": 76,
    136      "fees": 2294307,
    137      "from": 170,
    138      "vsize": 14943
    139    },
    140    "200": {
    141      "count": 60,
    142      "fees": 3159137,
    143      "from": 200,
    144      "vsize": 10590
    145    }
    146  },
    147  "total_fees": 245462923
    148}
    

    The getrawmempool true RPC was quite slow when the mempool is full (https://github.com/bitcoin/bitcoin/issues/14765, https://github.com/chris-belcher/electrum-personal-server/issues/96#issuecomment-481396276). This was improved in #14984 (merged after #15836 was opened) and might be fine now. I haven’t had problems with it in a while.

    However, given that the functionality added is currently to just build a histogram based on the individual feerates, I’m a little less clear on its utility and why it is necessary to add and maintain this in Bitcoin Core given how easy it would be to do it externally (which may or may not have been the case when people first asked for this?). I personally think we should have good reasons for putting something in Bitcoin Core as opposed to somewhere else in the software stack.

    IIRC there was also an out-of-band push to review this PR as “rumor has it that a well-known wallet is currently using Bitcoin Knots solely because this feature isn’t available in this RPC interface (https://github.com/bitcoin/bitcoin/pull/21422#pullrequestreview-813031121) “. Would be good to hear if that’s still the case and why calculating the histogram in the wallet isn’t possible/inefficient.

    I agree with @glozow. It might not be worth maintaining this in Core given the simplicity of doing it externally. As is, without utilizing information something that would indeed be inefficient to calculate, e.g. transaction lookups for each input, and without considering transaction relationships (https://github.com/bitcoin/bitcoin/pull/21422#issuecomment-978235017, #21422 (review)), I’m slightly leaning towards a Concept NACK at this point.

  190. kristapsk commented at 12:49 pm on March 14, 2023: contributor

    @0xB10C In general I agree it’s better to keep Core simple if things can be done with some simple workarounds using existing APIs, but I feel this is something that could make some use in Qt GUI too, and for that reason it makes sense to implement inside Core codebase.

    IIRC there was also an out-of-band push to review this PR as “rumor has it that a well-known wallet is currently using Bitcoin Knots solely because this feature isn’t available in this RPC interface (https://github.com/bitcoin/bitcoin/pull/21422#pullrequestreview-813031121) “. Would be good to hear if that’s still the case and why calculating the histogram in the wallet isn’t possible/inefficient.

    It’s used by Wasabi Wallet. https://github.com/zkSNACKs/WalletWasabi/blob/17f0995df568dbdbef9b1faf15da56aec29767bb/WalletWasabi/BitcoinCore/Rpc/RpcClientBase.cs#L63

  191. kiminuo commented at 3:41 pm on March 14, 2023: contributor

    @0xB10C

    Could a tracepoints approach, similar to #26531, work?

    I need to research that in detail but it looks promising. Thank you for sharing it here!

    I don’t think tracepoints are suited for this use-case. Tracepoints allow you to react to a specific event. Here, you request / pull data when a consumer needs it.

    Tracepoints would be nice if I need to compute the histograms at specified intervals and fast (to avoid getting complete mempool from a full node). It would be very similar to how people work with order books in trading - first, start listening to mempool updates and start queuing them, then fetch a mempool snapshot and possibly apply updates not present in the snapshot and keep updating. This gives one a local copy of the mempool that one updates as new transactions comes and the amount of transfered data is acceptable. So then you don’t need to get full mempool every time you want to do a computation based on mempool data. AFAIK it can be up to 300 MB of data which is a lot. But looking at the tracepoints, I’m not sure whether one can do that easily. From my point of view, this would be a nice solution to the problem this PR is about.

    The getrawmempool true RPC was quite slow when the mempool is full (#14765, chris-belcher/electrum-personal-server#96 (comment)). This was improved in #14984 (merged after #15836 was opened) and might be fine now. I haven’t had problems with it in a while.

    I agree that this is an acceptable solution when one does not want to compute the histogram too often. I need to research this more in detail to know what is “too often”.

  192. fanquake commented at 2:12 pm on March 20, 2023: member

    However, given that the functionality added is currently to just build a histogram based on the individual feerates, I’m a little less clear on its utility and why it is necessary to add and maintain this in Bitcoin Core given how easy it would be to do it externally

    I agree. I don’t think this is something that needs to exist in Core. The fact that you’ve had this PR open for a long time is unfortunate, but also not a reason to merge something. If the goal of this change can be achieved externally, with 20 lines of Python, then that’s where it should be done.

  193. kiminuo commented at 2:42 pm on March 20, 2023: contributor

    Closing the PR.

    If the goal of this change can be achieved externally, with 20 lines of Python, then that’s where it should be done.

    I would say it’s rather a proof that one can inefficiently achieve the goal.

    An efficient solution would be to mirror mempool using tracepoints (that did not exist when the PR was created), if1 it is possible.


    1. I have not found the time to test it yet but if it is not possible then that would be imho a very nice PR to allow people to externally compute various mempool statistics efficiently (including feerate histograms). ↩︎

  194. kiminuo closed this on Mar 20, 2023

  195. kiminuo deleted the branch on Mar 20, 2023
  196. prusnak commented at 4:37 pm on March 20, 2023: contributor
    I think it’s quite unfortunate there is no interest in merging this feature. How can we expect the Bitcoin Core wallet to behave reasonably if it can’t predict the fees well?
  197. ghost commented at 7:07 pm on March 20, 2023: none

    I think it’s quite unfortunate there is no interest in this feature. How can we expect the Bitcoin Core wallet to behave reasonably if it can’t predict the fees well?

    There is enough interest if you look at all the comments in this PR and the one opened by @jonasschnelli in 2019 except 3 developers. Could have been useful.

  198. prusnak commented at 7:39 pm on March 20, 2023: contributor

    There is enough interest if you look

    What I meant to say was “there is no interest in merging this feature”, updated the comment.

  199. maflcko commented at 7:41 am on March 21, 2023: member

    How can we expect the Bitcoin Core wallet to behave reasonably if it can’t predict the fees well?

    The Bitcoin Core wallet does not use this feature?

  200. jonasschnelli commented at 8:15 am on March 21, 2023: contributor

    The original idea of this patch was to add a small and easy to maintain efficient way to generate a fee histogram to bitcoin core (see original PR description #15836#issue-434109585).

    As described there, it was always possible to use getrawmempool (@jhoenicke script) to achieve this goal.

    However, it is not efficient to dumb out all transactions of the complete mempool in a json format just to calculate the fee histogram with a python script.

    Thus I think this is a valuable addition to Bitcoin Core.

    The patch size is minimal and it seems like there is some interest in this feature.

    EDIT: this PR has nothing to do with the Bitcoin cores internal wallet. It’s pure mempool statistics.

  201. kristapsk commented at 9:43 am on March 21, 2023: contributor

    How can we expect the Bitcoin Core wallet to behave reasonably if it can’t predict the fees well?

    The Bitcoin Core wallet does not use this feature?

    Does not use yet?

  202. prusnak commented at 10:03 am on March 21, 2023: contributor

    The Bitcoin Core wallet does not use this feature?

    My point is that a wallet without fee estimation is not very usable and harms the overall network. If we don’t want to have an effective way of determining the fee in Bitcoin Core, we might as well remove the entire wallet from the codebase.

  203. glozow commented at 10:19 am on March 21, 2023: member

    There seems to be confusion here. This PR does not touch the fee estimator or the wallet in any way. The fee estimator already keeps track of how many transactions at specific feerates {in the mempool, have confirmed within N blocks, haven’t confirmed within N blocks} there are internally; the wallet queries it directly for feerate estimates. I don’t know of any examples where the wallet needs to see a graph of something.

    If you feel that the fee estimation algorithms need improvement, by all means I’d say some review/PRs there would be welcome. Maybe take a look at #25380 or #21161.

    For those that feel very passionately about improving how the wallet calculates fees, my personal recommendation would be to review #26152 to have the strongest impact.

  204. 0xB10C commented at 10:21 am on March 21, 2023: contributor

    It’s confusing to me why this PR, tagged as [RPC/REST/ZMQ] and only touching src/rest.cpp and files in src/rpc/*, should have anything to do with the wallet or the GUI. AFAIK the wallet doesn’t use the RPC interface to communicate with bitcoind. If the wallet or the GUI would want a histogram, it couldn’t use the code added in this PR.

    edit: glozow was 1min faster

  205. kristapsk commented at 10:27 am on March 21, 2023: contributor

    If the wallet or the GUI would want a histogram, it couldn’t use the code added in this PR.

    That would require just moving fee histogram code out of MempoolInfoToJSON() to a separate function somewhere out of rpc/, code itself seems to me reusable.

  206. sipa commented at 11:54 am on March 21, 2023: member

    Bitcoin Core has had fee estimation logic for a long time (estimatefee RPC was added in 2014, replaced with estimatesmartfee in 2015).

    The Bitcoin Core wallet automatically uses this internal feerate estimation logic to decide on the fees of transactions. You need to override things, or use more low-level RPCs, to not use it.

    That feerate estimation logic works by looking at how quickly mempool transactions get confirmed, over longer time windows, not by looking at the current mempool composition (looking at just your own mempool may give very bad estimates if your relay policy isn’t exactly what the network and miners use, which is very hard to guarantee).

    This PR is about exposing mempool feerate histogram data through RPC. It’s not a feerate estimator, and is entirely unrelated to the wallet. While the logic could in theory be used for feerate estimation too, that’s not what this PR does, and I’m concerned that doing so would risk giving very inaccurate results when faced with diverging network policies.

    Conceptually, however, I think it’s reasonable to add something like this, as just RPC (or GUI) output. It can be informative for users, even if it can’t be directly used for automatic fee estimation. Of course, that’s subject to reviewer interest.

  207. ghost commented at 5:06 pm on March 21, 2023: none

    I am not against sipa as I understand fee estimation is difficult however estimatesmartfee sucks and could be improved.

    There are lot of occasions when I found it to be irrelevant and just look at mempool.space or multiple explorers

    One of the project that I found more useful: https://github.com/TrueLevelSA/btc-congestion-manager

    Anyway, this PR tries to improve how projects use mempool info and should not have been controversial at any point.

    Request to all developers including maintainers to re consider their opinion.

  208. nopara73 commented at 7:54 am on March 22, 2023: none

    For Wasabi’s purposes this output is needed purely for fee estimation, so it’d be ideal if Core would estimate fees properly. Here’s how I see things:

    • estimatesmartfee estimates fees based on the PAST
    • This PR enables us to put minimums on Core’s fee estimations. Eg. if Core estimates 6 hours at 10 sat/b, based on the past, but the mempool is already filled with 12 hours of 11 sat/b estimations, then what we do (currently with Knots) is that based on the mempool histogram we upgrade Core’s estimations accordingly. This’d mean we’re finally considering the PRESENT
    • Finally, for future work we can also aim somehow at predicting the FUTURE (neuro net, AI stuff? Update: I think this repo by @djkazic wanna do something like this: https://github.com/djkazic/matrix)
  209. sipa commented at 1:36 pm on March 22, 2023: member

    @nopara73 I’m aware of estimatesmartfee’s limitations, and understand the desire for something based on current mempool composition that can react faster to current conditions. The problem is that looking your own mempool only gives a reliable result in case you’re running with policies that more or less match what miners are doing, and if not, becomes easily gameable. That is the reason why estimatesmartfee intentionally does not use mempool composition.

    As I said, I think an RPC like this is valuable for people who want additional information for fee estimation, but using it automatically without at least someone looking at the result seems dangerous to me. E.g. future softforks your node doesn’t know about could permit an attacker to stuff your mempool with high-fee unminable transactions.

  210. prusnak commented at 2:25 pm on March 22, 2023: contributor
    @sipa By not providing sort-of accurate estimations from the current mempool, we force people to rely on third-party estimators, which is arguably much worse than having an independent estimation which works most of the time, although it may be games under certain specific conditions. I think that “perfect is enemy of good” rule applies here perfectly.
  211. nopara73 commented at 4:10 am on March 30, 2023: none

    @sipa sorry for the late reply, at first I was like “yes, that makes perfect sense,” however an argument came to my mind - this morning during shower :) - in favor of it, so I’d like to share that.

    Can’t we assume miners are seeking the highest fee transactions at all times? In other words: maximize their income from fees. If we could, then adjusting estimatesmartfee results based on the possibility of block acceptance of the current mempool state of the node does not seem gameable to me, because by extension we could assume that the highest fee transactions we have are precisely the ones that the miners seek.

    Now I may argue against this with “it is not guaranteed that the miners have the highest fee txs that my node does as well.”

    To this, first I’d say: we can guarantee they want to have them and isn’t that enough? Then I’d turn this around and look at it from an attacker’s point of view: the way to game this would be to create transactions those have high fees, get to the targeted node, but don’t get to the miners. And now I’d point out the attacker cannot guarantee his high fee transactions don’t get to the miners, especially with fullrbf around. This makes the attack super expensive, since in order to make the node think the fees are very high, the attacker has to create many blocks of transactions and guarantee that only the targeted node will get to know of these transactions.

    In summary, unless I am missing the point and you’re referring to another kind of game/attack here that does not involve the attacker creating blocks of high fee transactions, we can be reassured the attack is too expensive to be worth trying.

  212. craigraw commented at 12:48 pm on June 27, 2023: none

    Using getrawmempool true is unfortunately not a solution that can be applied generally to solve this use case. On systems with low memory (4Gb or less, for example a Raspberry Pi or the Futurebit Apollo) this RPC call causes the kernel to terminate the Bitcoin Core process (OOM) when the mempool is reasonably full. This occurs consistently on all released versions AFAIK.

    One workaround is to use getrawmempool false followed by repeated calls to getmempoolentry. This of course is even more inefficient, leading any reasonable client implementation to maintain a replica of mempool entries to avoid having to load them all, one by one, every time a fee rate histogram is required. This trades better performance for significantly increased memory utilisation.

    It’s a pity the PR has been closed since the complexity is low, there is clearly wide interest for it across many wallets, and a fee rate histogram is a useful tool in the toolbox when estimating fees. In addition, any middleware implementing the Electrum protocol needs to use one of the inefficient methods described to implement the mempool.get_fee_histogram call.

    Request to all developers including maintainers to re consider their opinion.

    +1.

  213. luke-jr referenced this in commit 1eeb2b0021 on Jun 27, 2023
  214. sipa commented at 5:47 pm on June 28, 2023: member

    I think the discussion here is conflating two things:

    • Problems with Bitcoin Core’s fee estimation algorithm.
    • The lack of a fee histogram RPC.

    It’s clear that the existing fee estimation algorithms make it inaccurate for many real use cases. There are good reasons for why it works the way it does, but that doesn’t change the fact that the outcome is bad. We can certainly discuss what can be done about that. I’ve posted about one possible approach in #27995.

    However, this PR is about something else; adding a feerate histogram. That seems to me like a vaguely reasonable, independent feature, but it has nothing to do with our fee estimation algorithm. And wanting a feerate histogram so you can implement fee estimation yourself just because the fee estimation feature itself does a bad job seems to be the wrong way around. If that’s the goal we should just think about how to improve the algorithm.

  215. craigraw commented at 6:12 am on June 29, 2023: none
    To be clear, my desire for an efficient way to retrieve fee rate histogram data has nothing to do with Bitcoin Core’s fee estimation algorithm. Because fee estimation is inherently difficult and no algorithm can know the user’s time preference for every transaction, Sparrow Wallet displays the fee rate histogram chart in the UI as another “tool in the toolbox”. This allows a user to observe trends and adjust the fee rate to place a transaction in a particular fee rate band to suit their time preference. This is not to suggest such an approach is always superior to algorithmic fee rate estimation, merely that more data is generally useful when trying to predict the future.
  216. maflcko commented at 6:32 am on June 29, 2023: member

    One workaround is to use getrawmempool false followed by repeated calls to getmempoolentry. This of course is even more inefficient, leading any reasonable client implementation to maintain a replica of mempool entries to avoid having to load them all, one by one, every time a fee rate histogram is required. This trades better performance for significantly increased memory utilisation.

    I think the getmempoolentry can be batched for a slight improvement. Also, you don’t have to store the full entry in the application, just the txid and fee info would be enough? Finally, you’d only have to call getmempoolentry once per transaction.

    Even if this pull was merged, it will just give one answer and I don’t think it is clear whether this is the answer that users actually want? See #21422 (comment) and the previous discussion threads. So anyone who wanted to give a better answer would have to implement some smarter algorithm in the application on a full copy of the mempool anyway, and this RPC wouldn’t help with that.

  217. glozow commented at 9:30 am on June 29, 2023: member

    Sparrow Wallet displays the fee rate histogram chart in the UI as another “tool in the toolbox”. This allows a user to observe trends and adjust the fee rate to place a transaction in a particular fee rate band to suit their time preference.

    a fee rate histogram is a useful tool in the toolbox when estimating fees

    I think it’s worth noting that a feerate histogram built from individual transaction feerates can be very misleading for fee estimation. A CPFP could look like n low feerate data points + 1 extremely high feerate data point. A few dozen 100kvB high-feerate transactions may dominate the block you are trying to get into, but they look identical to 100vB transactions as data points. As a user, I have no idea how I would look at a feerate histogram and get an idea of what feerate to put on my transaction.

    I agree that fee estimation using the current contents of the mempool can be very useful. However, the process of taking current mempool contents and meaningfully translating them into high/med/low priority bands is quite involved. Linking one implementation which is a lot more accurate than using individual feerates, but quite computationally heavy and not necessarily appropriate for the oom’d raspis either.

    Perhaps with cluster mempool we can expose a mempool entry’s mining score, but that score still doesn’t tell you much about how much block space is taken up by each data point.

    leading any reasonable client implementation to maintain a replica of mempool entries

    This is the most appropriate solution :thumbsup:. A simple map from txid to feerate would do if you’re trying to keep a feerate histogram.

    Also, you don’t have to store the full entry in the application, just the txid and fee info would be enough? Finally, you’d only have to call getmempoolentry once per transaction.

    Yeah, the nice thing about only needing fee and vsize is these will never change for a given txid in your mempool. You only need to query the entries you don’t already have in your external data structure.

    I think the getmempoolentry can be batched for a slight improvement.

    On Core’s side this idea seems the most worth pursuing, if somebody wants to open a pull.

  218. djkazic commented at 11:09 am on June 29, 2023: none

    Hey, developer of matrix here.

    I think a neural net fee estimator is going to be pretty performant compared to mempool space. My logger already takes into account things like CPFP and calculates the total package fee rate, and with approx 50MB of training data it’s capable of estimating fees such that if the necessary rate goes down, it reacts more quickly that estimatesmartfee.

    Like the built-in fee estimator, I do not use the current contents of the mempool. Rather, the training data packages up the time of tx seen, block count from seen to confirmed, and total package feerate. When it comes to inference-time, lighter nodes like RPis can download and validate a pretrained model. Hence, no fear of OOMing.

  219. craigraw commented at 2:33 pm on June 29, 2023: none

    The points regarding CPFP etc are well noted, although it’s possible these are in practice more applicable to higher time preference fee estimations? Estimation aside, there are not insignificant client costs to maintaining a “txid to vsize and fee rate” mempool replica. By my measurement, this data structure with ~125,000 transactions requires ~14Mb of RAM.

    Further, the initial load using getmempoolentry takes around 20 seconds on a locally running node, and would certainly be slower if the node was remote or running on an RPi. Because of this, I find it better to measure the time required for getrawmempool false and make a decision on whether to risk calling getrawmempool true instead of the getmempoolentry workaround. If a fee rate histogram RPC is not likely, I’d certainly be interested in a better solution to this.

  220. maflcko commented at 3:03 pm on June 29, 2023: member

    By my measurement, this data structure with ~125,000 transactions requires ~14Mb of RAM.

    If memory usage is the only concern, I am sure you could toss the transactions into (lossy) buckets and then subscribe via zmq to avoid duplicates, without having to remember txids, or individual size/fee details.

    … applicable to higher time preference fee estimations?

    If you only care about the top N blocks worth of transactions, the memory usage likely could also be reduced.

  221. santochibtc commented at 8:43 pm on June 30, 2023: none
    I don’t see the use of the histogram for fee estimation. You can have a lot of transactions with minimum fee that generate a large bar dominating the histogram. I see more useful a representation like mempool.space showing the list of transactions ordered by decreasing fee and grouped by blocks. That is a list of pending blocks with associated feerate ranges.
  222. sipa commented at 9:17 pm on July 6, 2023: member
    @santochibtc “Feerate histogram” here is a bytes vs fee diagram, not a #tx vs fee diagram.
  223. santochibtc commented at 5:51 pm on July 7, 2023: none
    Thanks @sipa, yes, it is “a histogram of the fee rates paid by transactions in the memory pool, weighted by transaction size”, but my point is still valid. If you want to include your tx in the next block you must estimate your fee based on the pending txs that are paying more per byte. I think that the histogram is not always a good picture of the mempool to estimate fees.
  224. nopara73 commented at 8:02 am on July 8, 2023: none
    It’s good enough.
  225. luke-jr referenced this in commit 315c64ff88 on Aug 16, 2023
  226. luke-jr referenced this in commit b4106f0654 on Aug 29, 2023
  227. luke-jr referenced this in commit 9caa0f7595 on Jan 30, 2024
  228. bitcoin locked this on Jul 7, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-28 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me