Do not use std::vector = {} to release memory #28452

pull sipa wants to merge 1 commits into bitcoin:master from sipa:202309_really_no_memory changing 4 files +57 −14
  1. sipa commented at 6:16 pm on September 11, 2023: member

    It appears that invoking v = {}; for an std::vector<...> v is equivalent to v.clear(), which does not release its allocated memory. There are a number of places in the codebase where it appears to be used for that purpose however (mostly written by me). Replace those with std::vector<...>{}.swap(v); (using a helper function ClearShrink in util/vector.h).

    To explain what is going on: v = {...}; is equivalent in general to v.operator=({...});. For many types, the {} is converted to the type of v, and then assigned to v - which for std::vector would ordinarily have the effect of clearing its memory (constructing a new empty vector, and then move-assigning it to v). However, since std::vector<T> has an operator=(std::initializer_list<T>) defined, it has precedence (since no implicit conversion is needed), and with an empty list, that is equivalent to clear().

    I did consider using v = std::vector<T>{}; as replacement for v = {}; instances where memory releasing is desired, but it appears that it does not actually work universally either. V{}.swap(v); does.

  2. DrahtBot commented at 6:16 pm on September 11, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK stickies-v, theStack, ajtowns
    Concept ACK theuni, hebasto

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #27596 (assumeutxo (2) by jamesob)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. sipa force-pushed on Sep 11, 2023
  4. DrahtBot added the label CI failed on Sep 11, 2023
  5. sipa force-pushed on Sep 11, 2023
  6. theuni commented at 7:18 pm on September 11, 2023: member

    Concept ACK on the msgs, but do we really want to release all mem all the time for the others? Without the shrinks, we essentially keep a high-water mark for memory where future allocations are unnecessary.

    I poked quickly at m_send_buffer for in MarkBytesSent for example, and it seems like that means constant reallocation now, no?

  7. sipa commented at 7:25 pm on September 11, 2023: member

    @theuni For m_send_buffer, m_recv_buffer, and m_recv_decode_buffer it’s not unreasonable to e.g. permit a certain lingering memory usage. E.g. if the capacity is under 64 KiB, don’t wipe.

    I considered that before, but wasn’t sure it’d be worth the discussion. We’re already doing several allocations per P2P message already in various places, some of which will require more refactoring to get rid of, so one more or less probably doesn’t matter that much.

    For this PR, I wanted to just make it do what it was intending to do already.

  8. hebasto commented at 7:50 pm on September 11, 2023: member

    Concept ACK.

    I’m curious, does this change have an observable effect on peak memory usage?

  9. sipa commented at 8:01 pm on September 11, 2023: member
    @hebasto Not right now, as V2Transport isn’t hooked up to anything. But once it is, without this PR, after sending a 4 MB message you’d permanently keep 4 MB allocated for that connection (and similarly for the receive side).
  10. hebasto commented at 8:03 pm on September 11, 2023: member

    FWIW, shrink_to_fit:

    … is a non-binding request to reduce capacity() to size(). It depends on the implementation whether the request is fulfilled.

  11. sipa commented at 8:03 pm on September 11, 2023: member

    @hebasto I’m aware. But it appears to work better in practice than v = std::vector<T>{};.

    I guess we could also use the combination, v = std::vector<T>{}; v.shrink_to_fit();.

    v.swap(std::vector<T>{}); is also an option, possibly followed by v.shrink_to_fit();.

  12. hebasto commented at 8:09 pm on September 11, 2023: member

    v.swap(std::vector<T>{}); is also an option, possibly followed by v.shrink_to_fit();.

    I was thinking just about the same :)

  13. DrahtBot removed the label CI failed on Sep 11, 2023
  14. DrahtBot added the label CI failed on Sep 12, 2023
  15. in src/test/net_tests.cpp:1572 in 9a92fda90e outdated
    1567+    std::vector<uint8_t>{}.swap(v5);
    1568+    BOOST_CHECK(v5.capacity() == 0);
    1569+
    1570+    std::vector<uint8_t> v6 = {1, 2, 3};
    1571+    v6 = std::vector<uint8_t>{};
    1572+    BOOST_CHECK(v6.capacity() == 0);
    


    MarcoFalke commented at 6:21 am on September 12, 2023:

    On clang, from CI:

    0test/net_tests.cpp(1554): error: in "net_tests/vector_release_test": check v2.capacity() == 0 has failed
    1test/net_tests.cpp(1572): error: in "net_tests/vector_release_test": check v6.capacity() == 0 has failed
    

    (But only with depends DEBUG=1 enabled)

    clang godbolt: https://godbolt.org/z/eYq7Kooz7 gcc godbolt: https://godbolt.org/z/e688xb4vM


    sipa commented at 12:07 pm on September 12, 2023:
    So bizarre, but that served as a good test for which methods work and don’t work.
  16. in src/headerssync.cpp:54 in 9a92fda90e outdated
    50@@ -51,9 +51,11 @@ HeadersSyncState::HeadersSyncState(NodeId id, const Consensus::Params& consensus
    51 void HeadersSyncState::Finalize()
    52 {
    53     Assume(m_download_state != State::FINAL);
    54-    m_header_commitments = {};
    55+    m_header_commitments.clear();
    


    MarcoFalke commented at 6:25 am on September 12, 2023:
    0    ClearShrink(m_header_commitments);
    

    nit: I wonder if there should be a clang-tidy-plugin rule to catch these and whether there should be a util method to do the stuff. This would make it easier to adjust the approach if it needs to be.


    sipa commented at 12:08 pm on September 12, 2023:
    I’ve added a ClearShrink(V& v) to util/vector.h, and used that everywhere. I’ve also added a unit test for it.
  17. sipa force-pushed on Sep 12, 2023
  18. in src/test/util_tests.cpp:1800 in 8332d52211 outdated
    1795+BOOST_AUTO_TEST_CASE(clearshrink_test)
    1796+{
    1797+    std::vector<uint8_t> v = {1, 2, 3};
    1798+    ClearShrink(v);
    1799+    BOOST_CHECK(v.empty());
    1800+    BOOST_CHECK(v.capacity() == 0);
    


    hebasto commented at 12:20 pm on September 12, 2023:

    nit:

    0    BOOST_CHECK_EQUAL(v.capacity(), 0);
    

    ?


    sipa commented at 9:19 pm on September 12, 2023:
    Done.
  19. DrahtBot removed the label CI failed on Sep 12, 2023
  20. sipa force-pushed on Sep 12, 2023
  21. ajtowns commented at 6:11 am on September 13, 2023: contributor

    Replace those with v.clear(); v.shrink_to_fit();.

    PR description needs updating?

  22. in src/util/vector.h:54 in 43627af084 outdated
    48@@ -49,4 +49,22 @@ inline V Cat(V v1, const V& v2)
    49     return v1;
    50 }
    51 
    52+/** Clear a vector and release its allocated memory. */
    53+template<typename V>
    54+inline void ClearShrink(V& v) noexcept
    


    stickies-v commented at 10:39 am on September 13, 2023:

    I’m not sure if we want to use this for vectors only, but from the docs it seems like it? If so, perhaps worthwhile enforcing that on compile time?

    0inline void ClearShrink(std::vector<V>& v) noexcept
    

    MarcoFalke commented at 10:58 am on September 13, 2023:
    shrink_to_fit exists only on vectors, so I think this is already enforced? Also, why is there a need to enforce it? Finally, your suggestion would require the caller to specify the type V for every call, which seems suboptimal.

    sipa commented at 11:04 am on September 13, 2023:

    Note that the code isn’t using v.shrink_to_fit(); anymore, but V{}.swap(v);.

    Also, both of those formulas also work on std::deque, and the call sites in HeadersSyncState::Finalize actually use it for that. Sadly, std::deque has no capacity(), so you can’t actually test if it has the desired effect.


    sipa commented at 11:21 am on September 13, 2023:
    I’ve clarified in the comment that the function also works on std::deque, and added a test for it (though not the capacity test part).

    stickies-v commented at 12:04 pm on September 13, 2023:

    edit: sipa’s responses didn’t automatically load so the below is slightly outdated.

    shrink_to_fit exists only on vectors

    std::deque also has it, as well as e.g. our prevector

    Actually, with this change I’ve found that this PR is using ClearShrink on 2 other types: bitdeque<> m_header_commitments and std::deque<CompressedHeader> m_redownloaded_headers.

    Also, why is there a need to enforce it?

    I’m not sure if we need to, I’m just raising the question. The main reason I’m considering is that we’re currently only unit testing this on a std::vector and it seems a pretty specific function. Keeping usage in line with documentation and testing seems a sane thing to do, but again, it’s just a consideration.

    Finally, your suggestion would require the caller to specify the type V for every call, which seems suboptimal.

    Hmm, can you elaborate? This diff compiles fine for me:

     0diff --git a/src/headerssync.cpp b/src/headerssync.cpp
     1index f891063cd2..09327a33fd 100644
     2--- a/src/headerssync.cpp
     3+++ b/src/headerssync.cpp
     4@@ -52,9 +52,9 @@ HeadersSyncState::HeadersSyncState(NodeId id, const Consensus::Params& consensus
     5 void HeadersSyncState::Finalize()
     6 {
     7     Assume(m_download_state != State::FINAL);
     8-    ClearShrink(m_header_commitments);
     9+    m_header_commitments = {};
    10     m_last_header_received.SetNull();
    11-    ClearShrink(m_redownloaded_headers);
    12+    m_redownloaded_headers = {};
    13     m_redownload_buffer_last_hash.SetNull();
    14     m_redownload_buffer_first_prev_hash.SetNull();
    15     m_process_all_remaining_headers = false;
    16diff --git a/src/util/vector.h b/src/util/vector.h
    17index 2e83eb181b..09a9c409d4 100644
    18--- a/src/util/vector.h
    19+++ b/src/util/vector.h
    20@@ -51,7 +51,7 @@ inline V Cat(V v1, const V& v2)
    21 
    22 /** Clear a vector and release its allocated memory. */
    23 template<typename V>
    24-inline void ClearShrink(V& v) noexcept
    25+inline void ClearShrink(std::vector<V>& v) noexcept
    26 {
    27     // There are various ways to clear a vector and releasing its memory:
    28     //
    29@@ -64,7 +64,7 @@ inline void ClearShrink(V& v) noexcept
    30     // follows. (3) and (4) rely on std::vector::shrink_to_fit, which is only a non-binding
    31     // request. Therefore, we use method (1).
    32 
    33-    V{}.swap(v);
    34+    std::vector<V>{}.swap(v);
    35 }
    36 
    37 #endif // BITCOIN_UTIL_VECTOR_H
    

    MarcoFalke commented at 12:15 pm on September 13, 2023:

    Hmm, can you elaborate? This diff compiles fine for me:

    Ok, my bad then. :)


    stickies-v commented at 12:16 pm on September 13, 2023:
    But anyway, it seems like we don’t want to limit this to just vectors, given that both std::deque and bitdeque have the same initializer_list assignment behaviour. So this can be marked as resolved.
  23. in src/test/util_tests.cpp:1797 in 43627af084 outdated
    1790@@ -1791,4 +1791,13 @@ BOOST_AUTO_TEST_CASE(util_WriteBinaryFile)
    1791     BOOST_CHECK(valid);
    1792     BOOST_CHECK_EQUAL(actual_text, expected_text);
    1793 }
    1794+
    1795+BOOST_AUTO_TEST_CASE(clearshrink_test)
    1796+{
    1797+    std::vector<uint8_t> v = {1, 2, 3};
    


    stickies-v commented at 10:41 am on September 13, 2023:
    Given its specialization, would adding a test on std::vector<bool> be worthwhile?

    sipa commented at 11:20 am on September 13, 2023:
    Done.
  24. in src/util/vector.h:56 in 43627af084 outdated
    48@@ -49,4 +49,22 @@ inline V Cat(V v1, const V& v2)
    49     return v1;
    50 }
    51 
    52+/** Clear a vector and release its allocated memory. */
    53+template<typename V>
    54+inline void ClearShrink(V& v) noexcept
    55+{
    56+    // There are various ways to clear a vector and releasing its memory:
    


    stickies-v commented at 10:52 am on September 13, 2023:

    grammar nit: consistent verb forms

    0    // There are various ways to clear a vector and release its memory:
    

    sipa commented at 11:20 am on September 13, 2023:
    Done.
  25. stickies-v commented at 10:52 am on September 13, 2023: contributor

    code review ACK 43627af0848dbabde34209ce07690553fcf94054

    Found two more instances where ClearShrink seems appropriate:

  26. fanquake added the label Needs backport (25.x) on Sep 13, 2023
  27. Do not use std::vector = {} to release memory 3fcd7fc7ff
  28. sipa force-pushed on Sep 13, 2023
  29. sipa commented at 11:24 am on September 13, 2023: member

    @stickies-v

    Found two more instances where ClearShrink seems appropriate:

    The headers variable here is just a local variable that will go out of scope when the received headers are processed; there is no long-term memory usage we’re concerned about. I believe the intent of the = {}; there is just to clear the vector (so headers don’t get processed a second time).

    Yeah, not going to touch that in this PR.

  30. stickies-v approved
  31. stickies-v commented at 12:35 pm on September 13, 2023: contributor

    ACK 3fcd7fc7ff563bdc0e2bba66b4cbe72d898c876e

    The headers variable here is just a local variable that will go out of scope…

    Thanks, I looked at the callstacks and have come to the same conclusion.

  32. theStack approved
  33. theStack commented at 0:04 am on September 14, 2023: contributor

    Code-review ACK 3fcd7fc7ff563bdc0e2bba66b4cbe72d898c876e

    Question out of curiosity: does the C++17 standard guarantee that default-constructed (i.e. new and empty) std::vectors have an initial capacity of zero? If not (as e.g. suggested by https://stackoverflow.com/a/23415582 or https://tylerayoung.com/2020/08/20/default-capacity-growth-rate-of-c-stdvector/ – not sure what C++ version specifically they are talking about though), then strictly speaking there’d be no guaranteed way to clear the memory that is not implementation-dependent, it seems.

  34. sipa commented at 0:31 am on September 14, 2023: member

    Question out of curiosity: does the C++17 standard guarantee that default-constructed (i.e. new and empty) std::vectors have an initial capacity of zero? If not (as e.g. suggested by https://stackoverflow.com/a/23415582 or https://tylerayoung.com/2020/08/20/default-capacity-growth-rate-of-c-stdvector/ – not sure what C++ version specifically they are talking about though), then strictly speaking there’d be no guaranteed way to clear the memory that is not implementation-dependent, it seems.

    I skimmed over the C++17 spec, but can’t find anything about capacity of newly-constructed vectors. I don’t think we need to care about this though; as that SO link says, I think it’d be nonsensical for an implementation to allocate anything for an empty newly-constructed vector.

  35. fanquake requested review from ajtowns on Sep 14, 2023
  36. ajtowns commented at 2:43 am on September 15, 2023: contributor
    utACK 3fcd7fc7ff563bdc0e2bba66b4cbe72d898c876e
  37. DrahtBot removed review request from ajtowns on Sep 15, 2023
  38. fanquake merged this on Sep 15, 2023
  39. fanquake closed this on Sep 15, 2023

  40. fanquake referenced this in commit 8661c49901 on Sep 15, 2023
  41. fanquake removed the label Needs backport (25.x) on Sep 15, 2023
  42. fanquake commented at 9:10 am on September 15, 2023: member
    Backported (partial) to 25.x in #28487.
  43. hebasto commented at 10:26 am on September 15, 2023: member
    Post-merge ACK 3fcd7fc7ff563bdc0e2bba66b4cbe72d898c876e.

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me