CCoinsViewDb
by using an in-memory LevelDB. We reuse the body of the existing fuzz target for coins_view
.
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/28216.
See the guideline for information on the review process.
Type | Reviewers |
---|---|
Concept ACK | brunoerg, l0rinc |
Stale ACK | TheCharlatan |
If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Reviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
coins_view
target. Catching the exception on AddCoin()
and continuing will cause cachedCoinsUsage
to underflow:
https://github.com/bitcoin/bitcoin/blob/4b1196a9855dcd188a24f393aa2fa21e2d61f061/src/test/fuzz/coins_view.cpp#L67-L75
https://github.com/bitcoin/bitcoin/blob/4b1196a9855dcd188a24f393aa2fa21e2d61f061/src/coins.cpp#L76-L100
Flush()
when the exception is hit to avoid the crash. See the first commit.
coins_db
: https://dergoegge.github.io/bitcoin-coverage/pr28216/fuzz.coverage/index.html
70@@ -71,6 +71,12 @@ FUZZ_TARGET(coins_view, .init = initialize_coins_view)
71 if (e.what() == std::string{"Attempted to overwrite an unspent coin (when possible_overwrite is false)"}) {
72 assert(!possible_overwrite);
73 expected_code_path = true;
74+ // AddCoin() decreases cachedCoinsUsage by the memory usage of the old coin at the beginning and
75+ // increase it by the value of the new coin at the end. If it throws in the process, the value
76+ // of cachedCoinsUsage would have been incorrectly decreased, leading to an underflow later on.
77+ // To avoid this, use Flush() to reset the value of cachedCoinsUsage in sync with the cacheCoins
78+ // mapping.
79+ (void)coins_view_cache.Flush();
AddCoin()
within a try
outside of tests, right?
possible_overwrite
is false
is always considered a fatal error elsewhere AFAICT.
87@@ -87,7 +88,9 @@ void TestCoinsView(FuzzedDataProvider& fuzzed_data_provider, CCoinsView& backend
88 (void)coins_view_cache.Sync();
89 },
90 [&] {
91- coins_view_cache.SetBestBlock(ConsumeUInt256(fuzzed_data_provider));
92+ uint256 best_block{coins_view_cache.GetBestBlock()};
93+ if (is_db && best_block.IsNull()) best_block = uint256::ONE;
94+ coins_view_cache.SetBestBlock(best_block);
To confirm: the removal of ConsumeUint256()
here is okay (and doesn’t cause this test to never rotate the best block) because of the BatchWrite()
call below, right?
For !is_db
cases this is now a no-op, whereas before it wasn’t. Is that intended?
It’s been a while since i wrote this code and to be honest i don’t remember what i was thinking. (Note-to-self: comments are good actually.)
As you point out it does change the behaviour of the existing target. Since i don’t have a rationale i’ve updated it to simply keep the existing behaviour.
43@@ -43,13 +44,12 @@ void initialize_coins_view()
44 g_setup = testing_setup.get();
Unrelated to this PR, but is there a reason this is using the full testing setup?
0diff --git a/src/test/fuzz/coins_view.cpp b/src/test/fuzz/coins_view.cpp
1index 8f3e357a84..c7563ae9f4 100644
2--- a/src/test/fuzz/coins_view.cpp
3+++ b/src/test/fuzz/coins_view.cpp
4@@ -30 +30 @@ namespace {
5-const TestingSetup* g_setup;
6+const BasicTestingSetup* g_setup;
7@@ -42 +42 @@ void initialize_coins_view()
8- static const auto testing_setup = MakeNoLogFileContext<const TestingSetup>();
9+ static const auto testing_setup = MakeNoLogFileContext<>(ChainType::MAIN);
Alright, i can’t reproduce the non-determinism he observed.
How did you try to reproduce it? Because I’m still seeing only ~70% stability when running with afl++. @marcofleon this could also be a target to test your “stability debugging” script on.
We'll reuse it for a target where the coins view is a DB.
🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/37422309131
Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
It reuses the logic from the `coins_view` target, except it uses an
in-memory CCoinsViewDB as the backend.
Note `CCoinsViewDB` will assert the best block hash is never null, so we
slightly modify the coins_view fuzz logic to take care of this.
212@@ -207,7 +213,7 @@ void TestCoinsView(FuzzedDataProvider& fuzzed_data_provider, CCoinsView& backend
213
214 {
215 std::unique_ptr<CCoinsViewCursor> coins_view_cursor = backend_coins_view.Cursor();
216- assert(!coins_view_cursor);
217+ assert(is_db ? !!coins_view_cursor : !coins_view_cursor);
Why explicit conversion with !!
? Doesn’t assert cause contextual conversion to bool of the expression, or is this implementation specific?
assert(!!coins_view_cursor == is_db);
?
I thought I commented on this already, but seems I forgot to submit: 👍 for
0 assert(is_db == !!coins_view_cursor);
or maybe even
0 assert(is_db != !coins_view_cursor);
Reviewed the added fuzz test and it looks good.
I let the fuzz binary run with FUZZ=coins_db
for about an hour and nothing crashed.
And, I generated some corpora (https://github.com/davidgumberg/qa-assets/tree/coins_db_corpora) for ~20 minutes by doing
0$ ./build_fuzz/test/fuzz/test_runner.py -g qa-assets/fuzz_corpora/ coins_db
and used the deterministic fuzz coverage tool from #31836 but it found instability:
0$ RUST_BACKTRACE=1 cargo run --manifest-path ./contrib/devtools/deterministic-fuzz-coverage/Cargo.toml -- $PWD/build_fuzz $PWD/qa-assets/fuzz_corpora/ coins_db
0--- /bitcoin/build_det/fuzz_det_cov.show.0.txt 2025-02-18 21:44:37.658467363 -0800
1+++ /bitcoin/build_det/fuzz_det_cov.show.1.txt 2025-02-18 21:44:38.217482311 -0800
2@@ -58994,32 +58994,32 @@
3 45| 0| return "leveldb.InternalKeyComparator";
4 46| 0|}
5 47| |
6- 48| 81.7k|int InternalKeyComparator::Compare(const Slice& akey, const Slice& bkey) const {
7+ 48| 81.8k|int InternalKeyComparator::Compare(const Slice& akey, const Slice& bkey) const {
8 49| | // Order by:
9 50| | // increasing user key (according to user-supplied comparator)
10 51| | // decreasing sequence number
11 52| | // decreasing type (though sequence# should be enough to disambiguate)
12- 53| 81.7k| int r = user_comparator_->Compare(ExtractUserKey(akey), ExtractUserKey(bkey));
13- 54| 81.7k| if (r == 0) {
14+ 53| 81.8k| int r = user_comparator_->Compare(ExtractUserKey(akey), ExtractUserKey(bkey));
15+ 54| 81.8k| if (r == 0) {
16 ------------------
17- | Branch (54:7): [True: 49.6k, False: 32.0k]
18+ | Branch (54:7): [True: 49.8k, False: 32.0k]
19 ------------------
20- 55| 49.6k| const uint64_t anum = DecodeFixed64(akey.data() + akey.size() - 8);
21- 56| 49.6k| const uint64_t bnum = DecodeFixed64(bkey.data() + bkey.size() - 8);
22- 57| 49.6k| if (anum > bnum) {
23+ 55| 49.8k| const uint64_t anum = DecodeFixed64(akey.data() + akey.size() - 8);
24+ 56| 49.8k| const uint64_t bnum = DecodeFixed64(bkey.data() + bkey.size() - 8);
25+ 57| 49.8k| if (anum > bnum) {
26 ------------------
27- | Branch (57:9): [True: 4.60k, False: 45.0k]
28+ | Branch (57:9): [True: 4.60k, False: 45.2k]
29 ------------------
30 58| 4.60k| r = -1;
31- 59| 45.0k| } else if (anum < bnum) {
32+ 59| 45.2k| } else if (anum < bnum) {
33 ------------------
34- | Branch (59:16): [True: 43.6k, False: 1.46k]
35+ | Branch (59:16): [True: 43.6k, False: 1.50k]
36 ------------------
37 60| 43.6k| r = +1;
38 61| 43.6k| }
39- 62| 49.6k| }
40- 63| 81.7k| return r;
41- 64| 81.7k|}
42+ 62| 49.8k| }
43+ 63| 81.8k| return r;
44+ 64| 81.8k|}
45 65| |
46 66| |void InternalKeyComparator::FindShortestSeparator(std::string* start,
47 67| 37| const Slice& limit) const {
48@@ -60131,17 +60131,17 @@
49 52| |
50 53| 3| ~MemTableIterator() override = default;
51 54| |
52- 55| 6.60k| bool Valid() const override { return iter_.Valid(); }
53+ 55| 6.59k| bool Valid() const override { return iter_.Valid(); }
54 56| 2| void Seek(const Slice& k) override { iter_.Seek(EncodeKey(&tmp_, k)); }
55 57| 1| void SeekToFirst() override { iter_.SeekToFirst(); }
56 58| 0| void SeekToLast() override { iter_.SeekToLast(); }
57 59| 6.62k| void Next() override { iter_.Next(); }
58 60| 0| void Prev() override { iter_.Prev(); }
59 61| 6.59k| Slice key() const override { return GetLengthPrefixedSlice(iter_.key()); }
60- 62| 6.58k| Slice value() const override {
61- 63| 6.58k| Slice key_slice = GetLengthPrefixedSlice(iter_.key());
62- 64| 6.58k| return GetLengthPrefixedSlice(key_slice.data() + key_slice.size());
63- 65| 6.58k| }
64+ 62| 6.59k| Slice value() const override {
65+ 63| 6.59k| Slice key_slice = GetLengthPrefixedSlice(iter_.key());
66+ 64| 6.59k| return GetLengthPrefixedSlice(key_slice.data() + key_slice.size());
67+ 65| 6.59k| }
68 66| |
69 67| 1| Status status() const override { return Status::OK(); }
70 68| |
71@@ -60475,12 +60475,12 @@
72 150| |
73 151| | // Accessors/mutators for links. Wrapped in methods so we can
74 152| | // add the appropriate barriers as necessary.
75- 153| 78.3k| Node* Next(int n) {
76- 154| 78.3k| assert(n >= 0);
77+ 153| 78.4k| Node* Next(int n) {
78+ 154| 78.4k| assert(n >= 0);
79 155| | // Use an 'acquire load' so that we observe a fully initialized
80 156| | // version of the returned Node.
81- 157| 78.3k| return next_[n].load(std::memory_order_acquire);
82- 158| 78.3k| }
83+ 157| 78.4k| return next_[n].load(std::memory_order_acquire);
84+ 158| 78.4k| }
85 159| 6.13k| void SetNext(int n, Node* x) {
86 160| 6.13k| assert(n >= 0);
87 161| | // Use a 'release store' so that anybody who reads through this
88@@ -60518,9 +60518,9 @@
89 193| 1.17k|}
90 194| |
91 195| |template <typename Key, class Comparator>
92- 196| 28.1k|inline bool SkipList<Key, Comparator>::Iterator::Valid() const {
93- 197| 28.1k| return node_ != nullptr;
94- 198| 28.1k|}
95+ 196| 28.2k|inline bool SkipList<Key, Comparator>::Iterator::Valid() const {
96+ 197| 28.2k| return node_ != nullptr;
97+ 198| 28.2k|}
98 199| |
99 200| |template <typename Key, class Comparator>
100 201| 14.1k|inline const Key& SkipList<Key, Comparator>::Iterator::key() const {
101@@ -60531,8 +60531,8 @@
102 206| |template <typename Key, class Comparator>
103 207| 6.62k|inline void SkipList<Key, Comparator>::Iterator::Next() {
104 208| 6.62k| assert(Valid());
105- 209| 6.62k| node_ = node_->Next(0);
106- 210| 6.62k|}
107+ 209| 6.63k| node_ = node_->Next(0);
108+ 210| 6.63k|}
109 211| |
110 212| |template <typename Key, class Comparator>
111 213| 0|inline void SkipList<Key, Comparator>::Iterator::Prev() {
112@@ -65829,10 +65829,10 @@
113 31| 11.3k| Slice() : data_(""), size_(0) {}
114 32| |
115 33| | // Create a slice that refers to d[0,n-1].
116- 34| 364k| Slice(const char* d, size_t n) : data_(d), size_(n) {}
117+ 34| 365k| Slice(const char* d, size_t n) : data_(d), size_(n) {}
118 35| |
119 36| | // Create a slice that refers to the contents of "s"
120- 37| 16.6k| Slice(const std::string& s) : data_(s.data()), size_(s.size()) {}
121+ 37| 16.5k| Slice(const std::string& s) : data_(s.data()), size_(s.size()) {}
122 38| |
123 39| | // Create a slice that refers to s[0,strlen(s)-1]
124 40| 26| Slice(const char* s) : data_(s), size_(strlen(s)) {}
125@@ -65842,10 +65842,10 @@
126 44| | Slice& operator=(const Slice&) = default;
127 45| |
128 46| | // Return a pointer to the beginning of the referenced data
129- 47| 345k| const char* data() const { return data_; }
130+ 47| 347k| const char* data() const { return data_; }
131 48| |
132 49| | // Return the length (in bytes) of the referenced data
133- 50| 576k| size_t size() const { return size_; }
134+ 50| 579k| size_t size() const { return size_; }
135 51| |
136 52| | // Return true iff the length of the referenced data is zero
137 53| 5.76k| bool empty() const { return size_ == 0; }
138@@ -65910,30 +65910,30 @@
139 101| |
140 102| 0|inline bool operator!=(const Slice& x, const Slice& y) { return !(x == y); }
141 103| |
142- 104| 83.5k|inline int Slice::compare(const Slice& b) const {
143- 105| 83.5k| const size_t min_len = (size_ < b.size_) ? size_ : b.size_;
144- ^201 ^83.3k
145+ 104| 83.7k|inline int Slice::compare(const Slice& b) const {
146+ 105| 83.7k| const size_t min_len = (size_ < b.size_) ? size_ : b.size_;
147+ ^201 ^83.5k
148 ------------------
149- | Branch (105:26): [True: 201, False: 83.3k]
150+ | Branch (105:26): [True: 201, False: 83.5k]
151 ------------------
152- 106| 83.5k| int r = memcmp(data_, b.data_, min_len);
153- 107| 83.5k| if (r == 0) {
154+ 106| 83.7k| int r = memcmp(data_, b.data_, min_len);
155+ 107| 83.7k| if (r == 0) {
156 ------------------
157- | Branch (107:7): [True: 51.7k, False: 31.8k]
158+ | Branch (107:7): [True: 51.8k, False: 31.9k]
159 ------------------
160- 108| 51.7k| if (size_ < b.size_)
161+ 108| 51.8k| if (size_ < b.size_)
162 ------------------
163- | Branch (108:9): [True: 0, False: 51.7k]
164+ | Branch (108:9): [True: 0, False: 51.8k]
165 ------------------
166 109| 0| r = -1;
167- 110| 51.7k| else if (size_ > b.size_)
168+ 110| 51.8k| else if (size_ > b.size_)
169 ------------------
170- | Branch (110:14): [True: 0, False: 51.7k]
171+ | Branch (110:14): [True: 0, False: 51.8k]
172 ------------------
173 111| 0| r = +1;
174- 112| 51.7k| }
175- 113| 83.5k| return r;
176- 114| 83.5k|}
177+ 112| 51.8k| }
178+ 113| 83.7k| return r;
179+ 114| 83.7k|}
180 115| |
181 116| |} // namespace leveldb
182 117| |
183@@ -69664,11 +69664,11 @@
184 46| 30.3k| return reinterpret_cast<char*>(ptr);
185 47| 30.3k|}
186 48| |
187- 49| 20.0k|void PutVarint32(std::string* dst, uint32_t v) {
188- 50| 20.0k| char buf[5];
189- 51| 20.0k| char* ptr = EncodeVarint32(buf, v);
190- 52| 20.0k| dst->append(buf, ptr - buf);
191- 53| 20.0k|}
192+ 49| 20.1k|void PutVarint32(std::string* dst, uint32_t v) {
193+ 50| 20.1k| char buf[5];
194+ 51| 20.1k| char* ptr = EncodeVarint32(buf, v);
195+ 52| 20.1k| dst->append(buf, ptr - buf);
196+ 53| 20.1k|}
197 54| |
198 55| 89|char* EncodeVarint64(char* dst, uint64_t v) {
199 56| 89| static const int B = 128;
200@@ -69979,9 +69979,9 @@
201 24| |
202 25| 3| const char* Name() const override { return "leveldb.BytewiseComparator"; }
203 26| |
204- 27| 83.5k| int Compare(const Slice& a, const Slice& b) const override {
205- 28| 83.5k| return a.compare(b);
206- 29| 83.5k| }
207+ 27| 83.7k| int Compare(const Slice& a, const Slice& b) const override {
208+ 28| 83.7k| return a.compare(b);
209+ 29| 83.7k| }
210 30| |
211 31| | void FindShortestSeparator(std::string* start,
212 32| 37| const Slice& limit) const override {
213
214The coverage was not determinstic between runs.
215The fuzz target input was /bitcoin/qa-assets/fuzz_corpora/coins_db/0d249c3962392259953e8ce2148addd7ed92cda5.
314+ .path = "", // Memory only.
315+ .cache_bytes = 1 << 20, // 1MB.
316+ .memory_only = true,
317+ };
318+ CCoinsViewDB coins_db{std::move(db_params), CoinsViewOptions{}};
319+ TestCoinsView(fuzzed_data_provider, coins_db, /* is_db = */ true);
The boolean parameter basically decides if the second parameter is a database view or not.
We could reduce this duplication by deducing it inside the TestCoinsView
method instead:
0void TestCoinsView(FuzzedDataProvider& fuzzed_data_provider, CCoinsView& backend_coins_view)
1{
2 const bool is_db{!!dynamic_cast<CCoinsViewDB*>(&backend_coins_view)};
or if you prefer:
0void TestCoinsView(FuzzedDataProvider& fuzzed_data_provider, CCoinsView& backend_coins_view)
1{
2 const bool is_db{typeid(backend_coins_view) == typeid(CCoinsViewDB)};
310+FUZZ_TARGET(coins_db, .init = initialize_coins_view)
311+{
312+ FuzzedDataProvider fuzzed_data_provider{buffer.data(), buffer.size()};
313+ auto db_params = DBParams{
314+ .path = "", // Memory only.
315+ .cache_bytes = 1 << 20, // 1MB.
nit: I know this was just copied over, but technically this is
0 .cache_bytes = 1 << 20, // 1MiB.
305+ FuzzedDataProvider fuzzed_data_provider{buffer.data(), buffer.size()};
306+ CCoinsView backend_coins_view;
307+ TestCoinsView(fuzzed_data_provider, backend_coins_view);
308+}
309+
310+FUZZ_TARGET(coins_db, .init = initialize_coins_view)
The file name previously coincided with the only fuzz target - now a coins_view
contains a coins_db
target as well.
Given that the type is CCoinsViewDB
, we could rename this to:
0FUZZ_TARGET(coins_view_db, .init = initialize_coins_view)
309+
310+FUZZ_TARGET(coins_db, .init = initialize_coins_view)
311+{
312+ FuzzedDataProvider fuzzed_data_provider{buffer.data(), buffer.size()};
313+ auto db_params = DBParams{
314+ .path = "", // Memory only.
This is already obvious from two lines below:
0 .path = "",
68@@ -69,6 +69,12 @@ FUZZ_TARGET(coins_view, .init = initialize_coins_view)
69 if (e.what() == std::string{"Attempted to overwrite an unspent coin (when possible_overwrite is false)"}) {
70 assert(!possible_overwrite);
71 expected_code_path = true;
72+ // AddCoin() decreases cachedCoinsUsage by the memory usage of the old coin at the beginning and
73+ // increase it by the value of the new coin at the end. If it throws in the process, the value
74+ // of cachedCoinsUsage would have been incorrectly decreased, leading to an underflow later on.
75+ // To avoid this, use Flush() to reset the value of cachedCoinsUsage in sync with the cacheCoins
76+ // mapping.
77+ (void)coins_view_cache.Flush();
I still can’t run any fuzzing on my Mac (for the past ~5 months), so I only added code review comments based on what I found.
Concept ACK
I explored the instablilty of the coins_db
target some more and it seems to be in LevelDB, so I think we can ignore it as an issue for this fuzz test.
I generated two coverage reports: one with a corpus of inputs that were all stable (afl++ showing 99.3%) and then one with that same corpus but just a single added unstable input (afl++ showing ~75%). You can see that basically all of the differences in coverage are in LevelDB. If you look at the coverage for db_impl.cc
you’ll see that the one unstable input reaches the compaction code (probably seen clearest in MaybeScheduleCompaction()
).
Anyway, all this to say that I don’t think the instability should be a blocker for this harness. Unless we want to start changing LevelDB…