Do not use std::vector = {} to release memory

sipa commented at 6:16 pm on September 11, 2023: member

It appears that invoking v = {}; for an std::vector<...> v is equivalent to v.clear(), which does not release its allocated memory. There are a number of places in the codebase where it appears to be used for that purpose however (mostly written by me). Replace those with std::vector<...>{}.swap(v); (using a helper function ClearShrink in util/vector.h).

To explain what is going on: v = {...}; is equivalent in general to v.operator=({...});. For many types, the {} is converted to the type of v, and then assigned to v - which for std::vector would ordinarily have the effect of clearing its memory (constructing a new empty vector, and then move-assigning it to v). However, since std::vector<T> has an operator=(std::initializer_list<T>) defined, it has precedence (since no implicit conversion is needed), and with an empty list, that is equivalent to clear().

I did consider using v = std::vector<T>{}; as replacement for v = {}; instances where memory releasing is desired, but it appears that it does not actually work universally either. V{}.swap(v); does.

DrahtBot commented at 6:16 pm on September 11, 2023: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type	Reviewers
ACK	stickies-v, theStack, ajtowns
Concept ACK	theuni, hebasto

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#27596 (assumeutxo (2) by jamesob)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

sipa force-pushed on Sep 11, 2023

DrahtBot added the label CI failed on Sep 11, 2023

sipa force-pushed on Sep 11, 2023

theuni commented at 7:18 pm on September 11, 2023: member

Concept ACK on the msgs, but do we really want to release all mem all the time for the others? Without the shrinks, we essentially keep a high-water mark for memory where future allocations are unnecessary.

I poked quickly at m_send_buffer for in MarkBytesSent for example, and it seems like that means constant reallocation now, no?

sipa commented at 7:25 pm on September 11, 2023: member

@theuni For m_send_buffer, m_recv_buffer, and m_recv_decode_buffer it’s not unreasonable to e.g. permit a certain lingering memory usage. E.g. if the capacity is under 64 KiB, don’t wipe.

I considered that before, but wasn’t sure it’d be worth the discussion. We’re already doing several allocations per P2P message already in various places, some of which will require more refactoring to get rid of, so one more or less probably doesn’t matter that much.

For this PR, I wanted to just make it do what it was intending to do already.

hebasto commented at 7:50 pm on September 11, 2023: member

Concept ACK.

I’m curious, does this change have an observable effect on peak memory usage?

sipa commented at 8:01 pm on September 11, 2023: member

@hebasto Not right now, as V2Transport isn’t hooked up to anything. But once it is, without this PR, after sending a 4 MB message you’d permanently keep 4 MB allocated for that connection (and similarly for the receive side).

hebasto commented at 8:03 pm on September 11, 2023: member

FWIW, shrink_to_fit:

… is a non-binding request to reduce capacity() to size(). It depends on the implementation whether the request is fulfilled.

sipa commented at 8:03 pm on September 11, 2023: member

@hebasto I’m aware. But it appears to work better in practice than v = std::vector<T>{};.

I guess we could also use the combination, v = std::vector<T>{}; v.shrink_to_fit();.

v.swap(std::vector<T>{}); is also an option, possibly followed by v.shrink_to_fit();.

hebasto commented at 8:09 pm on September 11, 2023: member

v.swap(std::vector<T>{}); is also an option, possibly followed by v.shrink_to_fit();.

I was thinking just about the same :)

DrahtBot removed the label CI failed on Sep 11, 2023

DrahtBot added the label CI failed on Sep 12, 2023

in src/test/net_tests.cpp:1572 in 9a92fda90e outdated

1567+    std::vector<uint8_t>{}.swap(v5);
1568+    BOOST_CHECK(v5.capacity() == 0);
1569+
1570+    std::vector<uint8_t> v6 = {1, 2, 3};
1571+    v6 = std::vector<uint8_t>{};
1572+    BOOST_CHECK(v6.capacity() == 0);

maflcko commented at 6:21 am on September 12, 2023:

On clang, from CI:

0test/net_tests.cpp(1554): error: in "net_tests/vector_release_test": check v2.capacity() == 0 has failed
1test/net_tests.cpp(1572): error: in "net_tests/vector_release_test": check v6.capacity() == 0 has failed

(But only with depends DEBUG=1 enabled)

clang godbolt: https://godbolt.org/z/eYq7Kooz7 gcc godbolt: https://godbolt.org/z/e688xb4vM

sipa commented at 12:07 pm on September 12, 2023:

So bizarre, but that served as a good test for which methods work and don’t work.

in src/headerssync.cpp:54 in 9a92fda90e outdated

50@@ -51,9 +51,11 @@ HeadersSyncState::HeadersSyncState(NodeId id, const Consensus::Params& consensus
51 void HeadersSyncState::Finalize()
52 {
53     Assume(m_download_state != State::FINAL);
54-    m_header_commitments = {};
55+    m_header_commitments.clear();

maflcko commented at 6:25 am on September 12, 2023:

0    ClearShrink(m_header_commitments);

nit: I wonder if there should be a clang-tidy-plugin rule to catch these and whether there should be a util method to do the stuff. This would make it easier to adjust the approach if it needs to be.

sipa commented at 12:08 pm on September 12, 2023:

I’ve added a ClearShrink(V& v) to util/vector.h, and used that everywhere. I’ve also added a unit test for it.

sipa force-pushed on Sep 12, 2023

in src/test/util_tests.cpp:1800 in 8332d52211 outdated

1795+BOOST_AUTO_TEST_CASE(clearshrink_test)
1796+{
1797+    std::vector<uint8_t> v = {1, 2, 3};
1798+    ClearShrink(v);
1799+    BOOST_CHECK(v.empty());
1800+    BOOST_CHECK(v.capacity() == 0);

hebasto commented at 12:20 pm on September 12, 2023:

nit:

0    BOOST_CHECK_EQUAL(v.capacity(), 0);

?

sipa commented at 9:19 pm on September 12, 2023:

Done.

DrahtBot removed the label CI failed on Sep 12, 2023

sipa force-pushed on Sep 12, 2023

ajtowns commented at 6:11 am on September 13, 2023: contributor

Replace those with v.clear(); v.shrink_to_fit();.

PR description needs updating?

in src/util/vector.h:54 in 43627af084 outdated

48@@ -49,4 +49,22 @@ inline V Cat(V v1, const V& v2)
49     return v1;
50 }
51 
52+/** Clear a vector and release its allocated memory. */
53+template<typename V>
54+inline void ClearShrink(V& v) noexcept

stickies-v commented at 10:39 am on September 13, 2023:

I’m not sure if we want to use this for vectors only, but from the docs it seems like it? If so, perhaps worthwhile enforcing that on compile time?

0inline void ClearShrink(std::vector<V>& v) noexcept

maflcko commented at 10:58 am on September 13, 2023:

shrink_to_fit exists only on vectors, so I think this is already enforced? Also, why is there a need to enforce it? Finally, your suggestion would require the caller to specify the type V for every call, which seems suboptimal.

sipa commented at 11:04 am on September 13, 2023:

Note that the code isn’t using v.shrink_to_fit(); anymore, but V{}.swap(v);.

Also, both of those formulas also work on std::deque, and the call sites in HeadersSyncState::Finalize actually use it for that. Sadly, std::deque has no capacity(), so you can’t actually test if it has the desired effect.

sipa commented at 11:21 am on September 13, 2023:

I’ve clarified in the comment that the function also works on std::deque, and added a test for it (though not the capacity test part).

stickies-v commented at 12:04 pm on September 13, 2023:

edit: sipa’s responses didn’t automatically load so the below is slightly outdated.

shrink_to_fit exists only on vectors

std::deque also has it, as well as e.g. our prevector

Actually, with this change I’ve found that this PR is using ClearShrink on 2 other types: bitdeque<> m_header_commitments and std::deque<CompressedHeader> m_redownloaded_headers.

Also, why is there a need to enforce it?

I’m not sure if we need to, I’m just raising the question. The main reason I’m considering is that we’re currently only unit testing this on a std::vector and it seems a pretty specific function. Keeping usage in line with documentation and testing seems a sane thing to do, but again, it’s just a consideration.

Finally, your suggestion would require the caller to specify the type V for every call, which seems suboptimal.

Hmm, can you elaborate? This diff compiles fine for me:

 0diff --git a/src/headerssync.cpp b/src/headerssync.cpp
 1index f891063cd2..09327a33fd 100644
 2--- a/src/headerssync.cpp
 3+++ b/src/headerssync.cpp
 4@@ -52,9 +52,9 @@ HeadersSyncState::HeadersSyncState(NodeId id, const Consensus::Params& consensus
 5 void HeadersSyncState::Finalize()
 6 {
 7     Assume(m_download_state != State::FINAL);
 8-    ClearShrink(m_header_commitments);
 9+    m_header_commitments = {};
10     m_last_header_received.SetNull();
11-    ClearShrink(m_redownloaded_headers);
12+    m_redownloaded_headers = {};
13     m_redownload_buffer_last_hash.SetNull();
14     m_redownload_buffer_first_prev_hash.SetNull();
15     m_process_all_remaining_headers = false;
16diff --git a/src/util/vector.h b/src/util/vector.h
17index 2e83eb181b..09a9c409d4 100644
18--- a/src/util/vector.h
19+++ b/src/util/vector.h
20@@ -51,7 +51,7 @@ inline V Cat(V v1, const V& v2)
21 
22 /** Clear a vector and release its allocated memory. */
23 template<typename V>
24-inline void ClearShrink(V& v) noexcept
25+inline void ClearShrink(std::vector<V>& v) noexcept
26 {
27     // There are various ways to clear a vector and releasing its memory:
28     //
29@@ -64,7 +64,7 @@ inline void ClearShrink(V& v) noexcept
30     // follows. (3) and (4) rely on std::vector::shrink_to_fit, which is only a non-binding
31     // request. Therefore, we use method (1).
32 
33-    V{}.swap(v);
34+    std::vector<V>{}.swap(v);
35 }
36 
37 #endif // BITCOIN_UTIL_VECTOR_H

maflcko commented at 12:15 pm on September 13, 2023:

Hmm, can you elaborate? This diff compiles fine for me:

Ok, my bad then. :)

stickies-v commented at 12:16 pm on September 13, 2023:

But anyway, it seems like we don’t want to limit this to just vectors, given that both std::deque and bitdeque have the same initializer_list assignment behaviour. So this can be marked as resolved.

in src/test/util_tests.cpp:1797 in 43627af084 outdated

1790@@ -1791,4 +1791,13 @@ BOOST_AUTO_TEST_CASE(util_WriteBinaryFile)
1791     BOOST_CHECK(valid);
1792     BOOST_CHECK_EQUAL(actual_text, expected_text);
1793 }
1794+
1795+BOOST_AUTO_TEST_CASE(clearshrink_test)
1796+{
1797+    std::vector<uint8_t> v = {1, 2, 3};

stickies-v commented at 10:41 am on September 13, 2023:

Given its specialization, would adding a test on std::vector<bool> be worthwhile?

sipa commented at 11:20 am on September 13, 2023:

Done.

in src/util/vector.h:56 in 43627af084 outdated

48@@ -49,4 +49,22 @@ inline V Cat(V v1, const V& v2)
49     return v1;
50 }
51 
52+/** Clear a vector and release its allocated memory. */
53+template<typename V>
54+inline void ClearShrink(V& v) noexcept
55+{
56+    // There are various ways to clear a vector and releasing its memory:

stickies-v commented at 10:52 am on September 13, 2023:

grammar nit: consistent verb forms

0    // There are various ways to clear a vector and release its memory:

sipa commented at 11:20 am on September 13, 2023:

Done.

stickies-v commented at 10:52 am on September 13, 2023: contributor

code review ACK 43627af0848dbabde34209ce07690553fcf94054

Found two more instances where ClearShrink seems appropriate:

https://github.com/bitcoin/bitcoin/blob/adc0921ea19f3b06878df6b22393fec519ed8f91/src/net_processing.cpp#L2602
https://github.com/bitcoin/bitcoin/blob/adc0921ea19f3b06878df6b22393fec519ed8f91/src/script/script.h#L575 (but probably not worth adding in this pull as it’s consensus code)

fanquake added the label Needs backport (25.x) on Sep 13, 2023

Do not use std::vector = {} to release memory 3fcd7fc7ff

sipa force-pushed on Sep 13, 2023

sipa commented at 11:24 am on September 13, 2023: member

@stickies-v

Found two more instances where ClearShrink seems appropriate:

https://github.com/bitcoin/bitcoin/blob/adc0921ea19f3b06878df6b22393fec519ed8f91/src/net_processing.cpp#L2602

The headers variable here is just a local variable that will go out of scope when the received headers are processed; there is no long-term memory usage we’re concerned about. I believe the intent of the = {}; there is just to clear the vector (so headers don’t get processed a second time).

https://github.com/bitcoin/bitcoin/blob/adc0921ea19f3b06878df6b22393fec519ed8f91/src/script/script.h#L575 (but probably not worth adding in this pull as it’s consensus code)

Yeah, not going to touch that in this PR.

stickies-v approved

stickies-v commented at 12:35 pm on September 13, 2023: contributor

ACK 3fcd7fc7ff563bdc0e2bba66b4cbe72d898c876e

The headers variable here is just a local variable that will go out of scope…

Thanks, I looked at the callstacks and have come to the same conclusion.

theStack approved

theStack commented at 0:04 am on September 14, 2023: contributor

Code-review ACK 3fcd7fc7ff563bdc0e2bba66b4cbe72d898c876e

Question out of curiosity: does the C++17 standard guarantee that default-constructed (i.e. new and empty) std::vectors have an initial capacity of zero? If not (as e.g. suggested by https://stackoverflow.com/a/23415582 or https://tylerayoung.com/2020/08/20/default-capacity-growth-rate-of-c-stdvector/ – not sure what C++ version specifically they are talking about though), then strictly speaking there’d be no guaranteed way to clear the memory that is not implementation-dependent, it seems.

sipa commented at 0:31 am on September 14, 2023: member

Question out of curiosity: does the C++17 standard guarantee that default-constructed (i.e. new and empty) std::vectors have an initial capacity of zero? If not (as e.g. suggested by https://stackoverflow.com/a/23415582 or https://tylerayoung.com/2020/08/20/default-capacity-growth-rate-of-c-stdvector/ – not sure what C++ version specifically they are talking about though), then strictly speaking there’d be no guaranteed way to clear the memory that is not implementation-dependent, it seems.

I skimmed over the C++17 spec, but can’t find anything about capacity of newly-constructed vectors. I don’t think we need to care about this though; as that SO link says, I think it’d be nonsensical for an implementation to allocate anything for an empty newly-constructed vector.

fanquake requested review from ajtowns on Sep 14, 2023

ajtowns commented at 2:43 am on September 15, 2023: contributor

utACK 3fcd7fc7ff563bdc0e2bba66b4cbe72d898c876e

DrahtBot removed review request from ajtowns on Sep 15, 2023

fanquake merged this on Sep 15, 2023

fanquake closed this on Sep 15, 2023

fanquake referenced this in commit 8661c49901 on Sep 15, 2023

fanquake removed the label Needs backport (25.x) on Sep 15, 2023

fanquake commented at 9:10 am on September 15, 2023: member

Backported (partial) to 25.x in #28487.

hebasto commented at 10:26 am on September 15, 2023: member

Post-merge ACK 3fcd7fc7ff563bdc0e2bba66b4cbe72d898c876e.

luke-jr referenced this in commit 3c9c582c15 on Sep 16, 2023

Frank-GER referenced this in commit 516e070e60 on Sep 19, 2023

fanquake referenced this in commit 47ab496b64 on Sep 26, 2023

fanquake referenced this in commit f17f329044 on Sep 26, 2023

fanquake referenced this in commit 67b6d99aea on Sep 26, 2023

fanquake referenced this in commit 2c51a07c08 on Oct 2, 2023

fanquake referenced this in commit 9f8d501cb8 on Oct 4, 2023

fanquake referenced this in commit 1416d09cba on Oct 6, 2023

bitcoin locked this on Sep 14, 2024

Do not use std::vector = {} to release memory #28452

Reviews

Conflicts