util: implement noexcept move assignment & move ctor for prevector #27334

pull martinus wants to merge 4 commits into bitcoin:master from martinus:2023-03-prevector-move changing 3 files +43 −4
  1. martinus commented at 2:55 pm on March 26, 2023: contributor

    prevector’s move assignment and move constructor were not noexcept, which makes it inefficient to use inside STL containers like std::vector. That’s the case e.g. for CScriptCheck. This PR adds noexcept, and also implements the move assignment & ctor, which makes it quite a bit more efficient to use prevector in an std::vector.

    The PR also adds a benchmark which grows an std::vector by adding prevector objects to it.

    merge-base:

    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    6,440.29 155,272.42 0.2% 40,713.01 20,473.84 1.989 7,132.01 0.2% 0.44 PrevectorFillVectorDirectNontrivial
    3,213.19 311,217.35 0.7% 35,373.01 10,214.07 3.463 6,945.00 0.2% 0.43 PrevectorFillVectorDirectTrivial
    34,749.70 28,777.23 0.1% 364,396.05 110,521.94 3.297 78,568.37 0.1% 0.43 PrevectorFillVectorIndirectNontrivial
    32,535.05 30,736.09 0.4% 353,823.31 103,464.53 3.420 79,871.80 0.2% 0.40 PrevectorFillVectorIndirectTrivial

    util: prevector’s move ctor and move assignment is noexcept:

    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    6,603.87 151,426.40 0.2% 23,734.01 21,009.63 1.130 2,445.01 0.3% 0.44 PrevectorFillVectorDirectNontrivial
    1,980.93 504,813.15 0.1% 13,784.00 6,304.32 2.186 2,258.00 0.3% 0.44 PrevectorFillVectorDirectTrivial
    19,110.54 52,327.15 0.1% 139,816.41 51,987.72 2.689 28,512.18 0.1% 0.43 PrevectorFillVectorIndirectNontrivial
    12,334.37 81,074.27 0.7% 125,655.12 39,253.58 3.201 27,854.46 0.2% 0.44 PrevectorFillVectorIndirectTrivial

    util: implement prevector’s move ctor & move assignment

    ns/op op/s err% ins/op cyc/op IPC bra/op miss% total benchmark
    5,262.66 190,018.01 0.2% 20,157.01 16,745.26 1.204 2,445.01 0.3% 0.44 PrevectorFillVectorDirectNontrivial
    1,687.07 592,744.35 0.2% 12,742.00 5,368.02 2.374 2,258.00 0.3% 0.44 PrevectorFillVectorDirectTrivial
    17,930.80 55,769.95 0.1% 136,237.69 47,903.31 2.844 28,512.02 0.2% 0.42 PrevectorFillVectorIndirectNontrivial
    11,893.75 84,077.78 0.2% 126,182.02 37,852.91 3.333 28,152.01 0.1% 0.44 PrevectorFillVectorIndirectTrivial

    As can be seen, mostly thanks to just noexcept the benchmark becomes about 2 times faster because std::vector can now use move operations instead of having to fall back to copy everything

    I had a look at how this change affects the other benchmarks, and they are all pretty much the same, the only noticable difference is CCheckQueueSpeedPrevectorJob goes from 364.56ns down to 346.21ns.

  2. bench: Add benchmark for prevector usage in std::vector d380d2877e
  3. util: prevector's move ctor and move assignment is `noexcept`
    Move operations already are `noexcept`, so add the keyword to the methods.
    This makes the `PrevectorFillVectorIndirect...` benchmarks about twice
    as fast on my machine, because otherwise `std::vector` has to use a copy
    when the vector resizes.
    81f67977f5
  4. test: CScriptCheck is used a lot in std::vector, make sure that's efficient
    Adds a few static_asserts so CScriptCheck stays is_nothrow_move_assignable,
    is_nothrow_move_constructible, and is_nothrow_destructible
    fffc86f49f
  5. util: implement prevector's move ctor & move assignment
    Using swap() was rather wasteful because it had to copy the whole direct
    memory data twice. Also, due to the swap() in move assignment the moved-from
    object might hold on to unused memory for longer than necessary.
    bfb9291a86
  6. DrahtBot commented at 2:55 pm on March 26, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK theStack, jonatack, achow101
    Concept ACK hebasto
    Approach ACK hernanmarino, john-moffett

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

  7. DrahtBot added the label Utils/log/libs on Mar 26, 2023
  8. hebasto commented at 3:59 pm on March 26, 2023: member
    Concept ACK. Thanks for picking this up.
  9. hernanmarino commented at 11:58 pm on March 26, 2023: contributor
    Approach ACK
  10. in src/prevector.h:280 in bfb9291a86
    277@@ -276,8 +278,13 @@ class prevector {
    278         return *this;
    279     }
    280 
    281-    prevector& operator=(prevector<N, T, Size, Diff>&& other) {
    282-        swap(other);
    


    jonatack commented at 9:59 pm on April 4, 2023:

    After removing these two swap calls in the last commit bfb9291, should we remove the unused swap function?

     0diff --git a/src/prevector.h b/src/prevector.h
     1index bcab1ff00cd..2e8510acbe6 100644
     2--- a/src/prevector.h
     3+++ b/src/prevector.h
     4@@ -459,12 +459,6 @@ public:
     5         return *item_ptr(size() - 1);
     6     }
     7 
     8-    void swap(prevector<N, T, Size, Diff>& other) noexcept
     9-    {
    10-        std::swap(_union, other._union);
    11-        std::swap(_size, other._size);
    12-    }
    13-
    14     ~prevector() {
    15         if (!is_direct()) {
    16             free(_union.indirect_contents.indirect);
    17diff --git a/src/test/fuzz/prevector.cpp b/src/test/fuzz/prevector.cpp
    18index 9cea32e304f..312e20a0998 100644
    19--- a/src/test/fuzz/prevector.cpp
    20+++ b/src/test/fuzz/prevector.cpp
    21@@ -161,12 +161,6 @@ public:
    22         pre_vector.shrink_to_fit();
    23     }
    24 
    25-    void swap() noexcept
    26-    {
    27-        real_vector.swap(real_vector_alt);
    28-        pre_vector.swap(pre_vector_alt);
    29-    }
    30-
    31     void move()
    32     {
    33         real_vector = std::move(real_vector_alt);
    34@@ -211,7 +205,7 @@ FUZZ_TARGET(prevector)
    35 
    36     LIMITED_WHILE(prov.remaining_bytes(), 3000)
    37     {
    38-        switch (prov.ConsumeIntegralInRange<int>(0, 13 + 3 * (test.size() > 0))) {
    39+        switch (prov.ConsumeIntegralInRange<int>(0, 12 + 3 * (test.size() > 0))) {
    40         case 0:
    41             test.insert(prov.ConsumeIntegralInRange<size_t>(0, test.size()), prov.ConsumeIntegral<int>());
    42             break;
    43@@ -261,21 +255,18 @@ FUZZ_TARGET(prevector)
    44             test.assign(prov.ConsumeIntegralInRange<size_t>(0, 32767), prov.ConsumeIntegral<int>());
    45             break;
    46         case 11:
    47-            test.swap();
    48-            break;
    49-        case 12:
    50             test.copy();
    51             break;
    52-        case 13:
    53+        case 12:
    54             test.move();
    55             break;
    56-        case 14:
    57+        case 13:
    58             test.update(prov.ConsumeIntegralInRange<size_t>(0, test.size() - 1), prov.ConsumeIntegral<int>());
    59             break;
    60-        case 15:
    61+        case 14:
    62             test.erase(prov.ConsumeIntegralInRange<size_t>(0, test.size() - 1));
    63             break;
    64-        case 16:
    65+        case 15:
    66             test.pop_back();
    67             break;
    68         }
    69diff --git a/src/test/prevector_tests.cpp b/src/test/prevector_tests.cpp
    70index 1559011fcd1..375a8772d8f 100644
    71--- a/src/test/prevector_tests.cpp
    72+++ b/src/test/prevector_tests.cpp
    73@@ -166,13 +166,6 @@ public:
    74         test();
    75     }
    76 
    77-    void swap() noexcept
    78-    {
    79-        real_vector.swap(real_vector_alt);
    80-        pre_vector.swap(pre_vector_alt);
    81-        test();
    82-    }
    83-
    84     void move() {
    85         real_vector = std::move(real_vector_alt);
    86         real_vector_alt.clear();
    87@@ -273,9 +266,6 @@ BOOST_AUTO_TEST_CASE(PrevectorTestInt)
    88             if (InsecureRandBits(9) == 12) {
    89                 test.assign(InsecureRandBits(5), int(InsecureRand32()));
    90             }
    91-            if (InsecureRandBits(3) == 3) {
    92-                test.swap();
    93-            }
    94             if (InsecureRandBits(4) == 8) {
    95                 test.copy();
    96             }
    

    john-moffett commented at 6:55 pm on May 12, 2023:
    I’d normally agree, but the documentation says that prevector is meant as a drop-in replacement for std::vector<T>, so I think it’s best to leave it.
  11. jonatack commented at 10:38 pm on April 4, 2023: contributor

    Tested Approach ACK bfb9291a8661fe5b26c14ed755cfa89d27c37110, ~ACK modulo re-verifying the implementation of the last commit.

    Rebased to current master, built and tested at each commit on macOS Ventura 13.3, ARM64 M1 Max, regular (non-debug) build, and I am seeing the following speedups in the added benchmarks after the second commit 81f67977f543f and the fourth commit bfb9291a8661fe.

    The only exception is a slowdown at 81f67977f54 for the DirectNontrivial benchmark, but all the other improvements look quite significant.

    Ran the benchmarks a few times at each step to re-verify.

    current master @ 369d4c0

    0|               ns/op |                op/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|            3,390.34 |          294,955.41 |    0.3% |      0.01 | `PrevectorFillVectorDirectNontrivial`
    3|            2,455.29 |          407,283.18 |    0.2% |      0.01 | `PrevectorFillVectorDirectTrivial`
    4|           23,923.60 |           41,799.74 |    1.0% |      0.01 | `PrevectorFillVectorIndirectNontrivial`
    5|           16,638.89 |           60,100.17 |    0.6% |      0.01 | `PrevectorFillVectorIndirectTrivial`
    

    prevector’s move ctor and move assignment is noexcept

    0|            3,720.82 |          268,758.17 |    0.1% |      0.01 | `PrevectorFillVectorDirectNontrivial`
    1|            2,028.66 |          492,935.82 |    0.2% |      0.01 | `PrevectorFillVectorDirectTrivial`
    2|           11,149.71 |           89,688.44 |    2.2% |      0.01 | `PrevectorFillVectorIndirectNontrivial`
    3|            5,943.73 |          168,244.45 |    0.1% |      0.01 | `PrevectorFillVectorIndirectTrivial`
    

    implement prevector’s move ctor & move assignment

    0|            2,761.52 |          362,119.10 |    0.2% |      0.01 | `PrevectorFillVectorDirectNontrivial`
    1|            1,655.06 |          604,207.56 |    0.8% |      0.01 | `PrevectorFillVectorDirectTrivial`
    2|            9,561.75 |          104,583.42 |    1.4% |      0.01 | `PrevectorFillVectorIndirectNontrivial`
    3|            5,192.09 |          192,600.85 |    0.4% |      0.01 | `PrevectorFillVectorIndirectTrivial`
    

    We use std::vector<CScriptCheck> data structures in CheckInputScripts() and ConnectBlock().

    Some relevant links on this topic:

  12. jonatack commented at 10:53 pm on April 4, 2023: contributor

    I had a look at how this change affects the other benchmarks, and they are all pretty much the same, the only noticable difference is CCheckQueueSpeedPrevectorJob goes from 364.56ns down to 346.21ns.

    Here are my results with ./src/bench/bench_bitcoin -filter='CCheckQueueSpeedPrevectorJob*.*'.

    master @ 369d4c03b7084de967576759545ba36a17fc18bb

    0|              ns/job |               job/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|              249.56 |        4,007,033.05 |    8.3% |      0.09 | :wavy_dash: `CCheckQueueSpeedPrevectorJob` (Unstable with ~10.9 iters. Increase `minEpochIterations` to e.g. 109)
    3|              260.60 |        3,837,227.86 |   10.3% |      0.09 | :wavy_dash: `CCheckQueueSpeedPrevectorJob` (Unstable with ~10.9 iters. Increase `minEpochIterations` to e.g. 109)
    4|              235.15 |        4,252,693.75 |    7.0% |      0.08 | :wavy_dash: `CCheckQueueSpeedPrevectorJob` (Unstable with ~10.9 iters. Increase `minEpochIterations` to e.g. 109)
    

    this pull (rebased to master)

    0|              ns/job |               job/s |    err% |     total | benchmark
    1|--------------------:|--------------------:|--------:|----------:|:----------
    2|              237.13 |        4,217,094.35 |    6.6% |      0.09 | :wavy_dash: `CCheckQueueSpeedPrevectorJob` (Unstable with ~10.9 iters. Increase `minEpochIterations` to e.g. 109)
    3|              246.34 |        4,059,432.96 |    6.6% |      0.08 | :wavy_dash: `CCheckQueueSpeedPrevectorJob` (Unstable with ~10.9 iters. Increase `minEpochIterations` to e.g. 109)
    4|              230.56 |        4,337,272.00 |    5.5% |      0.09 | :wavy_dash: `CCheckQueueSpeedPrevectorJob` (Unstable with ~10.9 iters. Increase `minEpochIterations` to e.g. 109)
    
  13. martinus commented at 5:45 am on April 5, 2023: contributor

    The only exception is a slowdown at https://github.com/bitcoin/bitcoin/commit/81f67977f543faca2dcc35846f73e2917375ae79 for the DirectNontrivial benchmark, but all the other improvements look quite significant.

    I think the reason why DirectNontrivial got slower when only adding noexcept is that now the supposedly faster move operations are used when resizing the vector, but since this was implemented as a swap, the implementation had to swap the union which I think involves 3 copies behind the scenes: copy a to tmp, copy b to a, copy tmp to b.

    The copy constructor for direct storage which was used before the noexcept is simpler, it only had to copy the union once.

    So only in combination with noexcept and non-swap based implementation the direct storage version is faster

  14. in src/prevector.h:286 in bfb9291a86
    283+    prevector& operator=(prevector<N, T, Size, Diff>&& other) noexcept {
    284+        if (!is_direct()) {
    285+            free(_union.indirect_contents.indirect);
    286+        }
    287+        _union = std::move(other._union);
    288+        _size = other._size;
    


    john-moffett commented at 7:07 pm on May 12, 2023:

    Maybe a good idea to:

    0if (!other.is_direct()) {
    1        other._union.indirect_contents.indirect = nullptr;
    2}
    

    in case anyone tries to free it?


    martinus commented at 6:38 am on May 13, 2023:
    I think it’s enough to have other._size = 0; because then other.is_direct() will return false and no free happens
  15. john-moffett approved
  16. john-moffett commented at 7:10 pm on May 12, 2023: contributor
    Approach ACK bfb9291a8661fe5b26c14ed755cfa89d27c37110
  17. theStack commented at 5:39 pm on June 9, 2023: contributor

    Concept ACK

    Very nice find! Here are the benchmark results on one of my slower machines (running OpenBSD 7.3, amd64). Interestingly, the speedup on the PrevectorFillVectorIndirect* benchmarks gained by marking noexcept seems to be even higher than 2x (close to 3x):

    merge-base:

    ns/op op/s err% total benchmark
    13,998.14 71,438.05 1.5% 0.01 PrevectorFillVectorDirectNontrivial
    9,583.81 104,342.67 2.0% 0.01 PrevectorFillVectorDirectTrivial
    204,851.50 4,881.58 3.4% 0.01 PrevectorFillVectorIndirectNontrivial
    180,488.00 5,540.53 4.1% 0.01 PrevectorFillVectorIndirectTrivial

    commit “util: prevector’s move ctor and move assignment is noexcept

    ns/op op/s err% total benchmark
    16,886.54 59,218.75 3.3% 0.01 PrevectorFillVectorDirectNontrivial
    6,777.13 147,555.06 3.8% 0.01 PrevectorFillVectorDirectTrivial
    70,299.40 14,224.87 1.6% 0.01 PrevectorFillVectorIndirectNontrivial
    62,582.36 15,978.94 1.2% 0.01 PrevectorFillVectorIndirectTrivial

    util: implement prevector’s move ctor & move assignment

    ns/op op/s err% total benchmark
    13,087.47 76,408.96 2.9% 0.01 PrevectorFillVectorDirectNontrivial
    6,000.36 166,656.77 0.7% 0.01 PrevectorFillVectorDirectTrivial
    71,839.56 13,919.91 3.6% 0.01 PrevectorFillVectorIndirectNontrivial
    61,748.83 16,194.64 1.9% 0.01 PrevectorFillVectorIndirectTrivial
  18. theStack approved
  19. theStack commented at 5:00 pm on June 10, 2023: contributor

    Tested and light code-review ACK bfb9291a8661fe5b26c14ed755cfa89d27c37110

    (My understanding of move semantics is limited, but the change looks correct to me. [All the links @jonatack posted above](/bitcoin-bitcoin/27334/#pullrequestreview-1371788591) for further reading were very helpful, especially this one: https://quuxplusone.github.io/blog/2022/08/26/vector-pessimization/) @ryanofsky: you might be interested in this PR?

  20. fanquake requested review from ryanofsky on Jun 12, 2023
  21. jonatack approved
  22. jonatack commented at 3:42 pm on June 15, 2023: contributor

    Tested Approach ACK bfb9291, ~ACK modulo re-verifying the implementation of the last commit.

    We use std::vector<CScriptCheck> data structures in CheckInputScripts() and ConnectBlock().

    ACK bfb9291a8661fe5b26c14ed755cfa89d27c37110 re-reviewed, verified no current merge conflict by rebasing to current master, and have been running on mainnet for a few days.

    Edit: I’d be happy to review a pull that documents the magic numbers in the prevector benchmarks.

  23. achow101 commented at 7:32 pm on June 27, 2023: member
    ACK bfb9291a8661fe5b26c14ed755cfa89d27c37110
  24. achow101 merged this on Jun 27, 2023
  25. achow101 closed this on Jun 27, 2023

  26. jonatack commented at 8:58 pm on June 27, 2023: contributor

    The merge script picked my quoted github snippet rather than the ACK; maybe it should ignore lines that begin with > .

    0    ACKs for top commit:
    1      achow101:
    2        ACK bfb9291a8661fe5b26c14ed755cfa89d27c37110
    3      jonatack:
    4        > Tested Approach ACK [bfb9291](https://github.com/bitcoin/bitcoin/commit/bfb9291a8661fe5b26c14ed755cfa89d27c37110), ~ACK modulo re-verifying the implementation of the last commit.
    5      john-moffett:
    6        Approach ACK bfb9291a8661fe5b26c14ed755cfa89d27c37110
    7      theStack:
    8        Tested and light code-review ACK bfb9291a8661fe5b26c14ed755cfa89d27c37110
    
  27. martinus deleted the branch on Jun 28, 2023
  28. sidhujag referenced this in commit aa869ee933 on Jun 30, 2023
  29. achow101 referenced this in commit 9e350dc74b on Mar 12, 2024
  30. bitcoin locked this on Jun 27, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-31 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me