merkle: pre‑reserve leaves to prevent reallocs with odd vtx count #32497

pull l0rinc wants to merge 2 commits into bitcoin:master from l0rinc:l0rinc/pre‑reserve-merkle-leaves-to-max changing 7 files +43 −40
  1. l0rinc commented at 1:01 pm on May 14, 2025: contributor

    Summary

    ComputeMerkleRoot duplicates the last hash when the input size is odd. If the caller provides a std::vector whose capacity equals its size (common when the vector was created with resize()), that extra push_back forces a reallocation, doubling its capacity. We pay this penalty on every Merkle root calculation even though we know the final size in advance.

    Fix

    • Deduplicated collection of the txs to properly sized vectors;
    • Guarantees one extra slot when vtx.size() is odd (assigned later);
    • Removes default construction of uint256 objects we overwrite anyway.

    Validation

    The benchmark was updated to use an even leaf count for simplicity and to focus on hashing speed rather than reallocations or vector copying.

    A full -reindex-chainstate up to block 896 408 ran without triggering the asserts.

    Asserts (not pushed in this PR) confirm that push_back never reallocates and that the coinbase witness hash remains null:

    0if (hashes.size() & 1) {
    1    assert(hashes.size() < hashes.capacity()); // TODO remove
    2    hashes.push_back(hashes.back());
    3}
    4
    5leaves.reserve(block.vtx.size() + (block.vtx.size() & 1)); // capacity rounded up to even
    6leaves.emplace_back();
    7assert(leaves.back().IsNull()); // TODO remove
    

    Benchmarks

    Before:

    ns/leaf leaf/s err% total benchmark
    45.55 21,953,985.76 0.1% 1.10 MerkleRoot

    After:

    ns/leaf leaf/s err% total benchmark
    44.46 22,491,655.23 0.0% 1.10 MerkleRoot
  2. DrahtBot commented at 1:01 pm on May 14, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32497.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK luke-jr
    Stale ACK yuvicc, optout21

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. in src/consensus/merkle.cpp:69 in 39b6c139bd outdated
    65@@ -66,20 +66,20 @@ uint256 ComputeMerkleRoot(std::vector<uint256> hashes, bool* mutated) {
    66 uint256 BlockMerkleRoot(const CBlock& block, bool* mutated)
    67 {
    68     std::vector<uint256> leaves;
    69-    leaves.resize(block.vtx.size());
    70-    for (size_t s = 0; s < block.vtx.size(); s++) {
    71-        leaves[s] = block.vtx[s]->GetHash();
    72+    leaves.reserve((block.vtx.size() + 1) & ~1ULL); // capacity rounded up to even
    


    luke-jr commented at 11:25 am on May 20, 2025:
    This assumes size_t is unsigned long long I think? Can we avoid that easily?

    l0rinc commented at 1:27 pm on May 20, 2025:
    we could either do size_t{1} or leaves.reserve(block.vtx.size() + (block.vtx.size() & 1)), I think, if you think this could be problematic on 32 bit systems.

    l0rinc commented at 9:36 am on June 15, 2025:
    I ended up with block.vtx.size() + (block.vtx.size() & 1) - if the value ends with a 1, it’s an odd number and we increment it to make it even
  4. in src/bench/merkle_root.cpp:15 in 899e466237 outdated
    10@@ -11,16 +11,15 @@
    11 
    12 static void MerkleRoot(benchmark::Bench& bench)
    13 {
    14-    FastRandomContext rng(true);
    15-    std::vector<uint256> leaves;
    16-    leaves.resize(9001);
    17+    FastRandomContext rng{/*fDeterministic=*/true};
    18+    std::vector<uint256> leaves(9000); // even number of leaves
    


    luke-jr commented at 11:33 am on May 20, 2025:
    Wouldn’t it be better to benchmark the less-performant input?

    l0rinc commented at 1:24 pm on May 20, 2025:
    After this change all call-sites are pre-sized to even number of leaves, so this reflects the usage better. I could also round to 9002 to mimic the new leaves.reserve((9001 + 1) & ~1ULL) behavior, if you thinks that’s more representative.

    l0rinc commented at 9:35 am on June 15, 2025:
    added back 9001
  5. in src/bench/merkle_root.cpp:22 in 899e466237 outdated
    23-        bool mutation = false;
    24-        uint256 hash = ComputeMerkleRoot(std::vector<uint256>(leaves), &mutation);
    25-        leaves[mutation] = hash;
    26+        bool mutation{false};
    27+        const uint256 hash{ComputeMerkleRoot(std::vector(leaves), &mutation)};
    28+        assert(mutation == false && hash == uint256{"18ff9b2bef46a834b2625871b2839fec3b67fc1243920c424858324fbde1a8de"});
    


    luke-jr commented at 11:33 am on May 20, 2025:
    How does this avoid optimisations?

    l0rinc commented at 1:21 pm on May 20, 2025:
    the ComputeMerkleRoot calculation has to be performed to be able to validate this assert - it’s can’t eliminate the call.
  6. luke-jr approved
  7. luke-jr commented at 11:34 am on May 20, 2025: member
    utACK main commit. Not sure why the benchmark is being changed.
  8. l0rinc commented at 1:28 pm on May 20, 2025: contributor
    Thanks for the review!
  9. yuvicc commented at 8:34 pm on June 13, 2025: contributor

    Concept ACK

    Memory allocation/deallocation is slow and expensive.

  10. yuvicc commented at 8:42 pm on June 13, 2025: contributor
    I have thought on changing the benchmark tests as we would want to test the worst case scenario right? or else we could have both case for testing?
  11. yuvicc commented at 9:35 pm on June 13, 2025: contributor

    Benchmark Testing

    master@5757de4ddd37f9321ee6b338b40888fd3561fc00

    • With 9000
    0|             ns/leaf |              leaf/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|               52.62 |       19,005,746.07 |    0.2% |      0.01 | `MerkleRoot`
    3|               52.64 |       18,998,504.40 |    0.3% |      0.01 | `MerkleRoot`
    4|               52.63 |       18,999,727.67 |    0.2% |      0.01 | `MerkleRoot`
    
    • With 9001
    0|             ns/leaf |              leaf/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|               53.50 |       18,693,063.88 |    0.3% |      0.01 | `MerkleRoot`
    3|               53.53 |       18,681,211.49 |    0.5% |      0.01 | `MerkleRoot`
    4|               53.49 |       18,694,053.87 |    0.5% |      0.01 | `MerkleRoot`
    

    Commit 39b6c139bd6be33699af781f1d71f6fed303d468

    • With 9000
    0|             ns/leaf |              leaf/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|               52.51 |       19,043,628.95 |    0.2% |      0.01 | `MerkleRoot`
    3|               52.52 |       19,040,989.96 |    0.2% |      0.01 | `MerkleRoot`
    4|               52.53 |       19,036,358.39 |    0.2% |      0.01 | `MerkleRoot`
    
    • With 9001
    0|             ns/leaf |              leaf/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|               53.40 |       18,713,525.67 |    0.3% |      0.01 | `MerkleRoot`
    3|               53.44 |       19,314,655.73 |    0.3% |      0.01 | `MerkleRoot`
    4|               53.41 |       18,883,462.75 |    0.3% |      0.01 | `MerkleRoot`
    
  12. l0rinc force-pushed on Jun 15, 2025
  13. l0rinc commented at 10:06 am on June 15, 2025: contributor
    Updated the benchmark to showcase the before/after state better (resembles production code changes), by splitting out the vector copies to the unmetered lambda and having an odd number of elements. Changed the previous leaves.reserve((block.vtx.size() + 1) & ~1ULL) rounding to leaves.reserve(block.vtx.size() + (block.vtx.size() & 1)).
  14. l0rinc requested review from luke-jr on Jun 15, 2025
  15. l0rinc requested review from yuvicc on Jun 15, 2025
  16. yuvicc commented at 5:26 pm on June 15, 2025: contributor

    Code review ACK 4cc5942895673591de4edceb6dd0c7188c302a72

    Testing

    Before Master@9a7eece5a4

    ns/leaf leaf/s err% total benchmark
    53.75 18,605,160.70 0.7% 0.01 MerkleRoot
    53.64 18,643,002.35 0.1% 0.01 MerkleRoot
    53.91 18,548,704.52 0.4% 0.01 MerkleRoot

    After commit@4cc5942895673591de4edceb6dd0c7188c302a72

    ns/leaf leaf/s err% total benchmark
    57.67 17,340,305.39 0.0% 0.52 MerkleRoot
    57.69 17,332,534.96 0.0% 0.52 MerkleRoot
    58.14 17,199,057.10 0.0% 0.52 MerkleRoot
  17. in src/bench/merkle_root.cpp:19 in 064f33bbe2 outdated
    18     std::vector<uint256> leaves;
    19-    leaves.resize(9001);
    20-    for (auto& item : leaves) {
    21-        item = rng.rand256();
    22+    leaves.resize(block_vtx_size);
    23+    for (auto& h : leaves) h = rng.rand256();
    


    maflcko commented at 9:11 am on June 20, 2025:
    why is this changed. touching a line to only remove the {} after the for loop seems odd and I had the impression that the dev notes preferred to use {}

    l0rinc commented at 11:19 am on June 20, 2025:
    I’m fine with both, reverted.
  18. in src/bench/merkle_root.cpp:22 in 064f33bbe2 outdated
    21-        item = rng.rand256();
    22+    leaves.resize(block_vtx_size);
    23+    for (auto& h : leaves) h = rng.rand256();
    24+
    25+    constexpr size_t iterations{1000};
    26+    std::vector<std::vector<uint256>> cases(iterations);
    


    maflcko commented at 9:14 am on June 20, 2025:
    not sure about pre-allocating. All real code paths allocate in-line for each iteration, so the benchmark should follow that. This would also avoid eating 100+mb for a microbenchmark?

    l0rinc commented at 10:42 am on June 20, 2025:
    I wanted to explicitly only measure the ComputeMerkleRoot call here without collecting hashes into a vector. But if you think that part should de included, I don’t mind, but in that case we won’t measure the behavior change in ComputeMerkleRoot anymore. However, I think the proper allocation makes sense regardless - e.g. for 2001 entries now we’ll allocate 2002 entries instead of 4002 like before.
  19. in src/consensus/merkle.cpp:71 in 4cc5942895 outdated
    65@@ -66,20 +66,20 @@ uint256 ComputeMerkleRoot(std::vector<uint256> hashes, bool* mutated) {
    66 uint256 BlockMerkleRoot(const CBlock& block, bool* mutated)
    67 {
    68     std::vector<uint256> leaves;
    69-    leaves.resize(block.vtx.size());
    70-    for (size_t s = 0; s < block.vtx.size(); s++) {
    71-        leaves[s] = block.vtx[s]->GetHash();
    72+    leaves.reserve(block.vtx.size() + (block.vtx.size() & 1)); // capacity rounded up to even
    73+    for (const auto& tx : block.vtx) {
    74+        leaves.emplace_back(tx->GetHash());
    


    maflcko commented at 9:19 am on June 20, 2025:
    why is this changed? I don’t think a byte array can meaningfully be moved (of course it compiles, but moving a byte is no different than copying it)

    l0rinc commented at 11:19 am on June 20, 2025:
    After I have used leaves.emplace_back(); // The witness hash of the coinbase is 0 in BlockWitnessMerkleRoot it seemed more consistent to me to use emplace_back in BlockMerkleRoot as well - as you said it should do the same. I’m fine with both, I’ve changed them inside the loop to push_back.
  20. in src/consensus/merkle.cpp:81 in 4cc5942895 outdated
    83-    leaves[0].SetNull(); // The witness hash of the coinbase is 0.
    84-    for (size_t s = 1; s < block.vtx.size(); s++) {
    85-        leaves[s] = block.vtx[s]->GetWitnessHash();
    86+    leaves.reserve(block.vtx.size() + (block.vtx.size() & 1)); // capacity rounded up to even
    87+    leaves.emplace_back(); // The witness hash of the coinbase is 0.
    88+    for (size_t i{1}; i < block.vtx.size(); ++i) {
    


    maflcko commented at 9:20 am on June 20, 2025:
    why is this changed? I don’t think it is worth to touch a line just to change the name of the variable from one single char to another single char.

    l0rinc commented at 11:19 am on June 20, 2025:
    There was no other loops with the same variable after the change anymore, so it doesn’t need to be a non-standard loop variable name (wasn’t sure what s meant in this context) - but since it’s important to you I’ve reverted them.
  21. maflcko commented at 9:22 am on June 20, 2025: member
    not sure how meaningful this is, but it looks like there are a bunch of unrelated style changes that don’t seem to have a benefit?
  22. l0rinc force-pushed on Jun 20, 2025
  23. l0rinc commented at 11:24 am on June 20, 2025: contributor

    The changes revert the original benchmark’s behavior of including the leaf vector in the benchmark (more explicitly and reserved like the production code). I’ve also changed the emplace_back calls for the hashes to push_bash and reverted the original loop conditions to minimize the diff in critical code sections.

    Note to yuvicc: previous benchmark version was measuring something else than master (only the merkle call without the vector copy), so we can’t directly compare the results.

    The new results don’t show any significant speed increase, only the memory allocation is more predictable - which isn’t captured by the benchmark anymore.

  24. l0rinc requested review from yuvicc on Jun 25, 2025
  25. l0rinc requested review from maflcko on Jun 25, 2025
  26. l0rinc force-pushed on Jun 25, 2025
  27. l0rinc commented at 9:33 pm on June 25, 2025: contributor

    I’ve rebased the changed and adjusted the benchmark to be more similar to the other production usages in the first commit, rounded to even in the second (optimization) commit, so that we can realistically measure the speed difference before and after:

    % build/bin/bench_bitcoin -filter=‘MerkleRoot’ –min-time=1000

    Before 7f620cffebee593e48434cf182cc2fd64a6d76be:

    ns/leaf leaf/s err% total benchmark
    45.50 21,975,688.66 0.0% 1.10 MerkleRoot
    45.53 21,962,546.91 0.1% 1.10 MerkleRoot
    45.53 21,965,162.52 0.0% 1.10 MerkleRoot

    After a73dd45e95f5e909d0950ba7c76225589df74854:

    ns/leaf leaf/s err% total benchmark
    45.04 22,200,976.95 0.1% 1.10 MerkleRoot
    44.97 22,235,208.25 0.1% 1.10 MerkleRoot
    45.01 22,217,033.81 0.1% 1.10 MerkleRoot
  28. l0rinc force-pushed on Jun 25, 2025
  29. yuvicc commented at 7:22 am on June 27, 2025: contributor

    lgtm re-ACK be8f3053a7ad526823f32f1a70847b9c06c4c54b

    Before e87430ed5fad6f9d90c1e7d256b9b8276b936c24

    ns/leaf leaf/s err% total benchmark
    53.83 18,577,129.91 0.1% 1.10 MerkleRoot
    53.62 18,648,858.81 0.1% 1.10 MerkleRoot
    53.70 18,621,594.40 0.1% 1.10 MerkleRoot

    After be8f3053a7ad526823f32f1a70847b9c06c4c54b

    ns/leaf leaf/s err% total benchmark
    53.02 18,860,232.62 0.2% 1.10 MerkleRoot
    52.88 18,910,755.68 0.0% 1.10 MerkleRoot
    52.89 18,906,671.63 0.1% 1.10 MerkleRoot
  30. l0rinc force-pushed on Jul 30, 2025
  31. l0rinc commented at 11:58 pm on July 30, 2025: contributor
    Rebased and added a new commit on top for deduplicating the integer rounding. Ready for review again.
  32. l0rinc force-pushed on Jul 30, 2025
  33. DrahtBot added the label CI failed on Jul 30, 2025
  34. DrahtBot commented at 11:58 pm on July 30, 2025: contributor

    🚧 At least one of the CI tasks failed. Task lint: https://github.com/bitcoin/bitcoin/runs/47077763348 LLM reason (✨ experimental): Lint check failed due to missing include guards in src/util/ints.h.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  35. DrahtBot removed the label CI failed on Jul 31, 2025
  36. optout21 commented at 6:36 am on August 7, 2025: none

    LGTM! Minor comments:

    • Maybe add a note to the ComputeMerkleRoot declaration, that the hashes input is expected to have one extra capacity (in case size is odd). Minor, as this function is not used directly much.
    • Even more, it could debug-assert that capacity is even, but this may be a bit of a stretch.
  37. l0rinc force-pushed on Aug 7, 2025
  38. l0rinc commented at 7:03 pm on August 7, 2025: contributor

    it could debug-assert that capacity is even

    Added, thanks for the hint

    Edit: had to push a few variants, the fuzzing still seems to fail, so I removed the Assert/Assume from ComputeMerkleRoot (it’s not a requirement anyway, it’s just more performant)

  39. l0rinc force-pushed on Aug 7, 2025
  40. DrahtBot added the label CI failed on Aug 7, 2025
  41. DrahtBot commented at 7:31 pm on August 7, 2025: contributor

    🚧 At least one of the CI tasks failed. Task multiprocess, i686, DEBUG: https://github.com/bitcoin/bitcoin/runs/47624077326 LLM reason (✨ experimental): The CI failure is caused by an assertion failure in consensus/merkle.cpp during the merkle root computation.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  42. l0rinc force-pushed on Aug 7, 2025
  43. l0rinc force-pushed on Aug 7, 2025
  44. DrahtBot removed the label CI failed on Aug 8, 2025
  45. optout21 commented at 8:20 am on August 8, 2025: none

    ACK a7997d02ebe5b72e706712f43884644b705a1cd1

    Good optimization, nice catch!

  46. in src/test/fuzz/integer.cpp:85 in a7997d02eb outdated
    80@@ -80,7 +81,8 @@ FUZZ_TARGET(integer, .init = initialize_integer)
    81     }
    82     constexpr uint256 u256_min{"0000000000000000000000000000000000000000000000000000000000000000"};
    83     constexpr uint256 u256_max{"ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff"};
    84-    const std::vector<uint256> v256{u256, u256_min, u256_max};
    85+    std::vector<uint256> v256{u256, u256_min, u256_max};
    86+    v256.reserve(RoundUpToEven(v256.size()));
    


    achow101 commented at 1:34 am on August 9, 2025:
    This reserve seems to be for ComputeMerkleRoot. However, I think it would be better to have ComputeMerkleRoot do any reservations necessary rather than relying on the caller to do so.

    l0rinc commented at 2:32 am on August 9, 2025:

    ComputeMerkleRoot already received a list of values, usually sized to fit their current size exactly (i.e. capacity() == size()). We can of course increase the capacity inside - it’s what:

    0if (hashes.size() & 1) {
    1    hashes.push_back(hashes.back());
    2}
    

    does already, but if we want to avoid reallocations, we have to predict the size of the vector before we start populating it.


    We could add some kind of lambda as a parameter and an expected size maybe, something like:

    0template <typename Gen>
    1uint256 GenerateMerkleRoot(std::size_t count, Gen&& gen, bool* mutated = nullptr)
    2{
    3    std::vector<uint256> leaves;
    4    leaves.reserve(RoundUpToEven(count));
    5    for (std::size_t i{0}; i < count; ++i) {
    6        leaves.push_back(gen(i));
    7    }
    8    return ComputeMerkleRoot(std::move(leaves), mutated);
    9}
    

    and the production code callers would look like:

     0static uint256 ComputeModifiedMerkleRoot(const CMutableTransaction& cb, const CBlock& block)
     1{
     2    return GenerateMerkleRoot(block.vtx.size(), [&](auto i) { return i == 0 ? cb.GetHash() : block.vtx[i]->GetHash(); });
     3}
     4
     5uint256 BlockMerkleRoot(const CBlock& block, bool* mutated)
     6{
     7    return GenerateMerkleRoot(block.vtx.size(), [&](auto i) { return block.vtx[i]->GetHash(); }, mutated);
     8}
     9uint256 BlockWitnessMerkleRoot(const CBlock& block, bool* mutated)
    10{
    11    return GenerateMerkleRoot(block.vtx.size(), [&](auto i) { return i == 0 ? uint256{} : uint256(block.vtx[i]->GetWitnessHash()); }, mutated);
    12}
    

    Would that be simpler in your opinion?

     0diff --git a/src/consensus/merkle.cpp b/src/consensus/merkle.cpp
     1index 8c56d979a5..a0e358b005 100644
     2--- a/src/consensus/merkle.cpp
     3+++ b/src/consensus/merkle.cpp
     4@@ -5,7 +5,6 @@
     5 #include <consensus/merkle.h>
     6 #include <hash.h>
     7 #include <util/check.h>
     8-#include <util/ints.h>
     9 
    10 /*     WARNING! If you're reading this because you're learning about crypto
    11        and/or designing a new system that will use merkle trees, keep in mind
    12@@ -63,26 +62,14 @@ uint256 ComputeMerkleRoot(std::vector<uint256> hashes, bool* mutated) {
    13     return hashes[0];
    14 }
    15 
    16-
    17 uint256 BlockMerkleRoot(const CBlock& block, bool* mutated)
    18 {
    19-    std::vector<uint256> leaves;
    20-    leaves.reserve(RoundUpToEven(block.vtx.size()));
    21-    for (size_t s = 0; s < block.vtx.size(); s++) {
    22-        leaves.push_back(block.vtx[s]->GetHash());
    23-    }
    24-    return ComputeMerkleRoot(std::move(leaves), mutated);
    25+    return GenerateMerkleRoot(block.vtx.size(), [&](auto i) { return block.vtx[i]->GetHash(); }, mutated);
    26 }
    27 
    28 uint256 BlockWitnessMerkleRoot(const CBlock& block, bool* mutated)
    29 {
    30-    std::vector<uint256> leaves;
    31-    leaves.reserve(RoundUpToEven(block.vtx.size()));
    32-    leaves.emplace_back(); // The witness hash of the coinbase is 0.
    33-    for (size_t s = 1; s < block.vtx.size(); s++) {
    34-        leaves.push_back(block.vtx[s]->GetWitnessHash());
    35-    }
    36-    return ComputeMerkleRoot(std::move(leaves), mutated);
    37+    return GenerateMerkleRoot(block.vtx.size(), [&](auto i) { return i == 0 ? uint256{} : uint256(block.vtx[i]->GetWitnessHash()); }, mutated);
    38 }
    39 
    40 /* This implements a constant-space merkle root/path calculator, limited to 2^32 leaves. */
    41diff --git a/src/consensus/merkle.h b/src/consensus/merkle.h
    42index c722cbe446..8b5c14157a 100644
    43--- a/src/consensus/merkle.h
    44+++ b/src/consensus/merkle.h
    45@@ -9,9 +9,21 @@
    46 
    47 #include <primitives/block.h>
    48 #include <uint256.h>
    49+#include <util/ints.h>
    50 
    51 uint256 ComputeMerkleRoot(std::vector<uint256> hashes, bool* mutated = nullptr);
    52 
    53+template <typename Gen>
    54+uint256 GenerateMerkleRoot(std::size_t count, Gen&& gen, bool* mutated = nullptr)
    55+{
    56+    std::vector<uint256> leaves;
    57+    leaves.reserve(RoundUpToEven(count));
    58+    for (std::size_t i{0}; i < count; ++i) {
    59+        leaves.push_back(gen(i));
    60+    }
    61+    return ComputeMerkleRoot(std::move(leaves), mutated);
    62+}
    63+
    64 /*
    65  * Compute the Merkle root of the transactions in a block.
    66  * *mutated is set to true if a duplicated subtree was found.
    67diff --git a/src/signet.cpp b/src/signet.cpp
    68index 3104b7f2d7..5aea084f15 100644
    69--- a/src/signet.cpp
    70+++ b/src/signet.cpp
    71@@ -58,13 +58,8 @@ static bool FetchAndClearCommitmentSection(const std::span<const uint8_t> header
    72 
    73 static uint256 ComputeModifiedMerkleRoot(const CMutableTransaction& cb, const CBlock& block)
    74 {
    75-    std::vector<uint256> leaves;
    76-    leaves.reserve(RoundUpToEven(block.vtx.size()));
    77-    leaves.push_back(cb.GetHash());
    78-    for (size_t s = 1; s < block.vtx.size(); ++s) {
    79-        leaves.push_back(block.vtx[s]->GetHash());
    80-    }
    81-    return ComputeMerkleRoot(std::move(leaves));
    82+    return GenerateMerkleRoot(block.vtx.size(),
    83+        [&](auto i) { return i ? block.vtx[i]->GetHash() : cb.GetHash(); });
    84 }
    85 
    86 std::optional<SignetTxs> SignetTxs::Create(const CBlock& block, const CScript& challenge)
    

    achow101 commented at 6:15 am on August 9, 2025:

    Would that be simpler in your opinion?

    No.

    but if we want to avoid reallocations, we have to predict the size of the vector before we start populating it.

    We can also avoid inserting into the vector in the first place, e.g. this diff gives me about the same speedup without needing to change anything with allocations:

     0diff --git a/src/consensus/merkle.cpp b/src/consensus/merkle.cpp
     1index 7dd24e1868f..46056bfa3e0 100644
     2--- a/src/consensus/merkle.cpp
     3+++ b/src/consensus/merkle.cpp
     4@@ -44,6 +44,7 @@
     5 
     6 
     7 uint256 ComputeMerkleRoot(std::vector<uint256> hashes, bool* mutated) {
     8+    std::array<uint256, 2> odd_hash;
     9     bool mutation = false;
    10     while (hashes.size() > 1) {
    11         if (mutated) {
    12@@ -51,11 +52,20 @@ uint256 ComputeMerkleRoot(std::vector<uint256> hashes, bool* mutated) {
    13                 if (hashes[pos] == hashes[pos + 1]) mutation = true;
    14             }
    15         }
    16-        if (hashes.size() & 1) {
    17+        bool need_dup = hashes.capacity() == hashes.size() && (hashes.size() & 1);
    18+        if (need_dup) {
    19+            odd_hash[0] = hashes.back();
    20+            odd_hash[1] = hashes.back();
    21+        } else if (hashes.size() & 1) {
    22             hashes.push_back(hashes.back());
    23         }
    24         SHA256D64(hashes[0].begin(), hashes[0].begin(), hashes.size() / 2);
    25         hashes.resize(hashes.size() / 2);
    26+        if (need_dup) {
    27+            SHA256D64(odd_hash[0].begin(), odd_hash[0].begin(), 1);
    28+            assert(hashes.capacity() > hashes.size());
    29+            hashes.push_back(odd_hash[0]);
    30+        }
    31     }
    32     if (mutated) *mutated = mutation;
    33     if (hashes.size() == 0) return uint256();
    

    l0rinc commented at 6:21 am on August 9, 2025:
    Clever, we can of course calculate the extra value independently, but isn’t it a lot simpler to just precalculate the final size before insertions and keep ComputeMerkleRoot unchanged?

    achow101 commented at 6:58 am on August 9, 2025:
    I think it’s simpler if the caller does not have to be aware that they should allocate a larger vector to be passed in.

    l0rinc commented at 3:41 pm on August 9, 2025:
    The lambda solution is even simpler in that case, the caller doesn’t even have to allocate or iterate at all.

    l0rinc commented at 8:44 pm on August 13, 2025:
    Addressed it in a different way in latest push, let me know if this solves your concern or if it’s orthogonal.
  47. DrahtBot added the label Needs rebase on Aug 13, 2025
  48. refactor: make `MerkleRoot` benchmark more representative
    Instead of mutating the input after each round to avoid unwanted compiler optimizations, we assert the expected hash, which likewise inhibits aggressive optimization.
    
    To make the benchmark more similar to other `ComputeMerkleRoot` the input leaves copying is made explicit before each run.
    
    % build/bin/bench_bitcoin -filter='MerkleRoot' --min-time=1000
    
    |             ns/leaf |              leaf/s |    err% |     total | benchmark
    |--------------------:|--------------------:|--------:|----------:|:----------
    |               45.55 |       21,953,928.57 |    0.2% |      1.13 | `MerkleRoot`
    |               45.55 |       21,953,985.76 |    0.1% |      1.10 | `MerkleRoot`
    |               45.56 |       21,950,501.83 |    0.2% |      1.10 | `MerkleRoot`
    f2334a0ae0
  49. l0rinc force-pushed on Aug 13, 2025
  50. l0rinc commented at 8:44 pm on August 13, 2025: contributor

    Rebased after #33116 - the only conflict was adding .ToUint256() to the merkle hashes.

    Based on the suggestion of @achow101 (since I do agree that it’s kinda’ leaking abstractions and there’s also a lot of duplication for collecting the merkle leaves), I have also deduplicated all call sites without needing to touch the consensusy Merkle calculation itself. We can still do that in a follow-up, if needed, but this way we’ve made the code both simpler and more performant (which wasn’t the case with either the original or the suggestion), and separated the leaves abstraction and kept the risky merkle root calculation intact. I have remeasured the before/after benchmarks, no change in performance since last push, but updated the commit messages anyway.

  51. DrahtBot added the label CI failed on Aug 13, 2025
  52. DrahtBot commented at 9:49 pm on August 13, 2025: contributor

    🚧 At least one of the CI tasks failed. Task tidy: https://github.com/bitcoin/bitcoin/runs/48036589106 LLM reason (✨ experimental): The failure is caused by an error in clang-tidy related to the use of std::move on a const variable, indicating a code warning treated as an error.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  53. merkle: pre-reserve leaves to prevent reallocs with odd vtx count
    This prevents the input vector from doubling in size when the input count is odd.
    The rounding adds 1 when the size is odd (least significant bit is 1); otherwise it adds 0, keeping the value even.
    The new `ToMerkleLeaves` serves to deduplicate the collection of leaves, without leaking abstraction to the callers about the Merkle tree size evenness concerns.
    
    % build/bin/bench_bitcoin -filter='MerkleRoot' --min-time=1000
    
    |             ns/leaf |              leaf/s |    err% |     total | benchmark
    |--------------------:|--------------------:|--------:|----------:|:----------
    |               44.57 |       22,435,971.11 |    0.3% |      1.10 | `MerkleRoot`
    |               44.46 |       22,491,655.23 |    0.0% |      1.10 | `MerkleRoot`
    |               44.47 |       22,487,556.76 |    0.1% |      1.10 | `MerkleRoot`
    
    Co-authored-by: optout21 <13562139+optout21@users.noreply.github.com>
    Co-authored-by: Ava Chow <github@achow101.com>
    de6a28396b
  54. l0rinc force-pushed on Aug 13, 2025
  55. DrahtBot removed the label Needs rebase on Aug 13, 2025
  56. DrahtBot removed the label CI failed on Aug 13, 2025

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-09-02 15:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me