RPC: Add reserve member function to UniValue and use it in getblock RPC #31179

pull ismaelsadeeq wants to merge 2 commits into bitcoin:master from ismaelsadeeq:10-2024-add-reserve-to-univalue changing 4 files +13 −0
  1. ismaelsadeeq commented at 5:11 pm on October 29, 2024: member

    This PR is motivated by #30495 (comment), It adds a reserve member function to UniValue and applies it within the getblock RPC to pre-allocate memory, minimizing reallocations. This change provides a noticeable performance increase, particularly at verbosity levels 1 and 2.

    Here are benchmark for this conducted on a VPS with the specs:

    • 8 vCPU Cores, 24 GB RAM, 1.2 TB SSD, 32 TB Traffic
    • Ubuntu 22 with Bitcoin Core on the latest master da10e0bab4a3e98868dd663af02c43b1dc8b7f4a

    A script was used to retrieve 1000 blocks starting at block 840000, evaluated at verbosity levels 1, 2, and 3 using a sequential strategy. Each configuration was run in three iterations.

    Master Branch

    Verbosity Level 1

    Strategy Iteration 1 Iteration 2 Iteration 3 Mean Standard Deviation
    Sequential 202 sec 118 sec 119 sec 146 sec 39 sec

    Verbosity Level 2

    Strategy Iteration 1 Iteration 2 Iteration 3 Mean Standard Deviation
    Sequential 5004 sec 3517 sec 4952 sec 4491 sec 689 sec

    Verbosity Level 3

    Strategy Iteration 1 Iteration 2 Iteration 3 Mean Standard Deviation
    Sequential 4145 sec 4175 sec 4187 sec 4169 sec 18 sec

    PR Branch with UniValue .reserve

    Verbosity Level 1

    Strategy Iteration 1 Iteration 2 Iteration 3 Mean Standard Deviation
    Sequential 122 sec 105 sec 107 sec 111 sec 7 sec

    Verbosity Level 2

    Strategy Iteration 1 Iteration 2 Iteration 3 Mean Standard Deviation
    Sequential 3241 sec 3272 sec 3267 sec 3260 sec 14 sec

    Verbosity Level 3

    Strategy Iteration 1 Iteration 2 Iteration 3 Mean Standard Deviation
    Sequential 4089 sec 4213 sec 4202 sec 4168 sec 56 sec

    Reserving the space in UniValue results in noticeable speed improvements, especially for verbosity levels 1 and 2 by 24-27% on average.

  2. DrahtBot commented at 5:11 pm on October 29, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31179.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK andrewtoth

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  3. DrahtBot added the label RPC/REST/ZMQ on Oct 29, 2024
  4. andrewtoth commented at 5:27 pm on October 29, 2024: contributor
    Concept ACK
  5. andrewtoth commented at 7:13 pm on October 29, 2024: contributor
    I think we can take this further and reserve for VOBJ types instead of just VARR. The VOBJ uses both values and keys, so we can reserve both. We can count how many times we do pushKV and reserve that amount for both keys and values.
  6. kennet31526 approved
  7. andrewtoth commented at 0:53 am on October 30, 2024: contributor
    I believe you can get more fine grained benchmarks for this using the rpc_blockchain benchmarks.
  8. maflcko commented at 6:38 am on October 30, 2024: member
    In theory you could also add UniValue to VectorLikeClasses ( https://clang.llvm.org/extra/clang-tidy/checks/performance/inefficient-vector-operation.html#cmdoption-arg-VectorLikeClasses), but this is probably best done in a follow-up.
  9. UniValue: add reserve member function
    - Only reserve keys when the typ is of `VOBJ`.
    952463ccdc
  10. ismaelsadeeq force-pushed on Oct 30, 2024
  11. ismaelsadeeq commented at 7:23 am on October 30, 2024: member

    I think we can take this further and reserve for VOBJ types instead of just VARR. The VOBJ type uses both values and keys, so we can reserve for both.

    Done in the latest push, see diff, but I am reserving keys only when typ is of VOBJ.

    We can count how many times we use pushKV and reserve that amount for both keys and values.

    +1, this approach is possible but is it maintainable? Manually counting each pushKVs means updating the value thats being passed to .reserve every time we add another element, will be better to reserve for size’s that are determined at runtime or at least use a constant.

    Is there a constant for the number of items in a block so that I can reserve for results in blockheaderToJSON?

  12. ismaelsadeeq commented at 7:32 am on October 30, 2024: member

    I believe you can get more fine-grained benchmarks for this using the rpc_blockchain benchmarks.

    Yes, I can, but I’ll need to add verbosity levels 1 and 2, where we can see some performance improvements.

     0diff --git a/src/bench/rpc_blockchain.cpp b/src/bench/rpc_blockchain.cpp
     1index 7e3e2d8e48a..6d431685e66 100644
     2--- a/src/bench/rpc_blockchain.cpp
     3+++ b/src/bench/rpc_blockchain.cpp
     4@@ -45,7 +45,25 @@ struct TestBlockAndIndex {
     5 
     6 } // namespace
     7 
     8-static void BlockToJsonVerbose(benchmark::Bench& bench)
     9+static void BlockToJsonVerbose1(benchmark::Bench& bench)
    10+{
    11+    TestBlockAndIndex data;
    12+    bench.run([&] {
    13+        auto univalue = blockToJSON(data.testing_setup->m_node.chainman->m_blockman, data.block, data.blockindex, data.blockindex, TxVerbosity::SHOW_TXID);
    14+        ankerl::nanobench::doNotOptimizeAway(univalue);
    15+    });
    16+}
    17+
    18+static void BlockToJsonVerbose2(benchmark::Bench& bench)
    19+{
    20+    TestBlockAndIndex data;
    21+    bench.run([&] {
    22+        auto univalue = blockToJSON(data.testing_setup->m_node.chainman->m_blockman, data.block, data.blockindex, data.blockindex, TxVerbosity::SHOW_DETAILS);
    23+        ankerl::nanobench::doNotOptimizeAway(univalue);
    24+    });
    25+}
    26+
    27+static void BlockToJsonVerbose3(benchmark::Bench& bench)
    28 {
    29     TestBlockAndIndex data;
    30     bench.run([&] {
    31@@ -54,7 +72,9 @@ static void BlockToJsonVerbose(benchmark::Bench& bench)
    32     });
    33 }
    34 
    35-BENCHMARK(BlockToJsonVerbose, benchmark::PriorityLevel::HIGH);
    36+BENCHMARK(BlockToJsonVerbose1, benchmark::PriorityLevel::HIGH);
    37+BENCHMARK(BlockToJsonVerbose2, benchmark::PriorityLevel::HIGH);
    38+BENCHMARK(BlockToJsonVerbose3, benchmark::PriorityLevel::HIGH);
    39 
    40 static void BlockToJsonVerboseWrite(benchmark::Bench& bench)
    41 {
    

    Benchmark Results

    On master:

    ns/op op/s err% total benchmark
    190,342 5,254 2.3% 0.01 BlockToJsonVerbose1
    34,812,292 28.73 1.0% 0.38 BlockToJsonVerbose2
    34,457,167 29.02 1.0% 0.38 BlockToJsonVerbose3

    On this PR:

    ns/op op/s err% total benchmark
    172,278 5,805 0.7% 0.01 BlockToJsonVerbose1
    33,720,584 29.66 0.4% 0.37 BlockToJsonVerbose2
    33,884,417 29.51 1.2% 0.38 BlockToJsonVerbose3
  13. andrewtoth commented at 1:28 pm on October 30, 2024: contributor

    this approach is possible but is it maintainable?

    Hmm, yeah this would end up with reserve(<magic number>) littered throughout the json serialization. Not ideal.

    I took a quick look through most of the block parsing, and many of the reservations we would do would be <5, so unlikely to be a big benefit.

    The entry in TxToUniv could have up to 12 kv-pairs stored, and in blockheaderToJSON it could have 14. These would also be reserving two vectors so might have some visible benefit.

    Also, the witness stack array inside TxToUniv is a VARR that could be reserved with a parameter resolved at runtime.

  14. rpc: reserve space for `UniValue` variables in `blockToJSON`
    - Reserving space avoid reallocation, this provide noticeable
      performance increase in verbosity level 1, 2.
    28e3392d11
  15. ismaelsadeeq force-pushed on Nov 1, 2024
  16. ismaelsadeeq commented at 9:56 pm on November 1, 2024: member

    Also, the witness stack array inside TxToUniv is a VARR that could be reserved with a parameter resolved at runtime.

    Thanks, I’ve added this in the latest push.

  17. Eunovo commented at 6:06 am on November 17, 2024: none

    I believe you can get more fine-grained benchmarks for this using the rpc_blockchain benchmarks.

    Yes, I can, but I’ll need to add verbosity levels 1 and 2, where we can see some performance improvements.

    Show diff Benchmark Results

    On master:

    ns/op op/s err% total benchmark 190,342 5,254 2.3% 0.01 BlockToJsonVerbose1 34,812,292 28.73 1.0% 0.38 BlockToJsonVerbose2 34,457,167 29.02 1.0% 0.38 BlockToJsonVerbose3 On this PR:

    ns/op op/s err% total benchmark 172,278 5,805 0.7% 0.01 BlockToJsonVerbose1 33,720,584 29.66 0.4% 0.37 BlockToJsonVerbose2 33,884,417 29.51 1.2% 0.38 BlockToJsonVerbose3 @ismaelsadeeq, could you commit the diff you’ve provided and include the benchmark results in the commit message? Then, rebase your current changes on top, rerun the benchmarks for https://github.com/bitcoin/bitcoin/pull/31179/commits/952463ccdc2baeacb527c66732fd184cbcbd53ba and https://github.com/bitcoin/bitcoin/pull/31179/commits/28e3392d11355b1160dc1a7a5557081728a02840, and update their commit messages with the results.

    This way I can just go through your commits, run the benchmarks and compare with your results.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 12:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me