coins: remove logic for spent-and-FRESH cache entries and writing non-DIRTY entries

andrewtoth commented at 9:31 pm on August 18, 2024: contributor

Following up from #28280 (review), which suggested a revival of #18746.

GetCoin will never return true for a spent entry, so we can safely assume that any entry we fetch will not be spent. This lets us remove the only non-test code path which adds a FRESH-but-not-DIRTY entry to the flagged linked list. This in turn ensures all entries being sent to BatchWrite are DIRTY entries.

A corollary is that all spent coins must be DIRTY. The only time a coin can be spent and not DIRTY is when the CCoinsViewCacheEntry is created with an empty coin. The last commit makes this more clear by checking for freshness if inserted instead of if spent and not DIRTY.

This is a pure refactor removing dead code which handles a non-existent corner case of a FRESH-but-not-DIRTY entry in the CCoinsViewCache or a spent entry in CCoinsViewDB. There is a lot of test code which tries to exercise this corner case, which is also updated in this PR to behave like non-test code.

DrahtBot commented at 9:31 pm on August 18, 2024: contributor

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Code Coverage & Benchmarks

For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30673.

Reviews

See the guideline for information on the review process.

Type	Reviewers
Concept ACK	l0rinc

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

#32313 (coins: fix cachedCoinsUsage accounting in CCoinsViewCache by l0rinc)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

DrahtBot added the label UTXO Db and Indexes on Aug 18, 2024

andrewtoth force-pushed on Aug 18, 2024

DrahtBot added the label CI failed on Aug 18, 2024

DrahtBot commented at 9:39 pm on August 18, 2024: contributor

🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/28917867890

Make sure to run all tests locally, according to the documentation.

The failure may happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

andrewtoth force-pushed on Aug 18, 2024

DrahtBot removed the label CI failed on Aug 18, 2024

in src/coins.cpp:53 in feaea74b04 outdated

48@@ -49,10 +49,8 @@ CCoinsMap::iterator CCoinsViewCache::FetchCoin(const COutPoint &outpoint) const
49             cacheCoins.erase(ret);
50             return cacheCoins.end();
51         }
52-        if (ret->second.coin.IsSpent()) {
53-            // The parent only has an empty entry for this outpoint; we can consider our version as fresh.
54-            ret->second.AddFlags(CCoinsCacheEntry::FRESH, *ret, m_sentinel);
55-        }
56+        // GetCoin will never return true if the coin is spent
57+        Assume(!ret->second.coin.IsSpent());

l0rinc commented at 9:47 am on August 19, 2024:

when do we throw and when do we Assume here?

* Assume is the identity function. * * - Should be used to run non-fatal checks. In debug builds it behaves like * Assert()/assert() to notify developers and testers about non-fatal errors. * In production it doesn’t warn or log anything. * - For fatal errors, use Assert().

Wouldn’t this be a fatal error?

Alternatively, if we want to separate the test runs from production runs, would it maybe make sense to call SanityCheck in test&debug mode every time we want to be absolutely sure the state is correct? I know it’s expensive, but might be reassuring during testing - and similarly to Assume, would be skipped in prod.

andrewtoth commented at 7:26 pm on August 31, 2024:

It would be a fatal error when running in DEBUG mode, but is not compiled in for release.

when do we throw and when do we Assume here?

I only added throws to BatchWrite because there are invalid states that can happen in our fuzz test harness which we want to handle properly. A follow up to remove setting the flags directly can help make that more clear and we wouldn’t need to have the throws there anymore.

l0rinc commented at 1:39 pm on September 15, 2024:

Split out to https://github.com/bitcoin/bitcoin/pull/30906

in src/coins.cpp:326 in feaea74b04 outdated

320@@ -325,14 +321,14 @@ void CCoinsViewCache::SanityCheck() const
321         if (entry.IsDirty()) attr |= 1;
322         if (entry.IsFresh()) attr |= 2;
323         if (entry.coin.IsSpent()) attr |= 4;
324-        // Only 5 combinations are possible.
325-        assert(attr != 2 && attr != 4 && attr != 7);
326+        // Only 4 combinations are possible.
327+        assert(attr != 2 && attr != 4 && attr != 6 && attr != 7);

l0rinc commented at 11:51 am on August 19, 2024:

So basically if we lay out the allowed combinations we’d get:

0assert((!entry.IsDirty() & !entry.IsFresh() & !entry.coin.IsSpent())  |  // attr = 0
1        (entry.IsDirty() & !entry.IsFresh() & !entry.coin.IsSpent())  |  // attr = 1
2        (entry.IsDirty() &  entry.IsFresh() & !entry.coin.IsSpent())  |  // attr = 3
3        (entry.IsDirty() & !entry.IsFresh() &  entry.coin.IsSpent()));   // attr = 5

Which could be simplified to

0assert((entry.IsDirty() && !(entry.coin.IsSpent() && entry.IsFresh())) || !(entry.coin.IsSpent() || entry.IsFresh()));

which can be laid out as an if expression which would probably document the situation better than != 2 && != 4 && != 6 && != 7:

0if (entry.coin.IsSpent()) {
1    assert(entry.IsDirty() && !entry.IsFresh());
2} else {
3    assert(entry.IsDirty() || !entry.IsFresh());
4}

 0BOOST_AUTO_TEST_CASE(sanity_check_assertions)
 1{
 2    for (int is_dirty = 0; is_dirty <= 1; ++is_dirty) {
 3        for (int is_fresh = 0; is_fresh <= 1; ++is_fresh) {
 4            for (int is_spent = 0; is_spent <= 1; ++is_spent) {
 5                // Original check
 6                unsigned attr = 0;
 7                if (is_dirty) attr |= 1;
 8                if (is_fresh) attr |= 2;
 9                if (is_spent) attr |= 4;
10                bool original_result = (attr != 2 && attr != 4 && attr != 6 && attr != 7);
11
12                bool new_result = (!is_dirty & !is_fresh & !is_spent) |  // attr = 0
13                                  (is_dirty & !is_fresh & !is_spent) |   // attr = 1
14                                  (is_dirty & is_fresh & !is_spent) |    // attr = 3
15                                  (is_dirty & !is_fresh & is_spent);     // attr = 5
16                BOOST_CHECK_EQUAL(original_result, new_result);
17
18                new_result = ((is_dirty && !(is_spent && is_fresh)) || !(is_spent || is_fresh));
19                BOOST_CHECK_EQUAL(original_result, new_result);
20
21                if (is_dirty) {
22                    new_result = !(is_spent && is_fresh);
23                } else {
24                    new_result = !(is_spent || is_fresh);
25                }
26                BOOST_CHECK_EQUAL(original_result, new_result);
27            }
28        }
29    }
30}

andrewtoth commented at 7:18 pm on August 31, 2024:

Done.

andrewtoth force-pushed on Aug 19, 2024

in src/test/fuzz/coinscache_sim.cpp:148 in 963804c5f1 outdated

147@@ -151,14 +148,10 @@ class CoinsViewBottom final : public CCoinsView
148     {

l0rinc commented at 12:41 pm on August 19, 2024:

I was surprised to see that the method called GetCoin now:

overwrites the dummy coin parameter with the value found in the cache
returns a boolean indicating whether the coin exists in the cache and is not spent (the name doesn’t have any hints about it being spent or not)

I don’t mind if you think it’s outside the scope of the current change, but checking the usages it seems to me we can now simplify GetCoin to actually return the coin and let the callsite check for spent and it might simplify the usages as well:

0std::optional<Coin> CCoinsViewCache::GetCoin(const COutPoint& outpoint) const
1{
2    auto it = FetchCoin(outpoint);
3    if (it != cacheCoins.end()) {
4        return it->second.coin;
5    }
6    return std::nullopt;
7}

Which would simplify usages from e.g.

0Coin coin_using_get_coin;
1const bool exists_using_get_coin = coins_view_cache.GetCoin(random_out_point, coin_using_get_coin);
2if (exists_using_get_coin) {
3    assert(coin_using_get_coin == coin_using_access_coin);
4}
5assert((exists_using_access_coin && exists_using_have_coin_in_cache && exists_using_have_coin && exists_using_get_coin) ||
6       (!exists_using_access_coin && !exists_using_have_coin_in_cache && !exists_using_have_coin && !exists_using_get_coin));

to something like:

0if (auto coin_optional = coins_view_cache.GetCoin(random_out_point); coin_optional && !coin_optional->IsSpent()) {
1    assert(*coin_optional == coin_using_access_coin);
2    assert(exists_using_access_coin && exists_using_have_coin_in_cache && exists_using_have_coin);
3} else {
4    assert(!exists_using_access_coin && !exists_using_have_coin_in_cache && !exists_using_have_coin);
5}

What do you think?

andrewtoth commented at 1:35 pm on August 19, 2024:

This behavior is documented here, so not sure why test code bothered to do this. Indeed, switching to std::optional would be cleaner, but I think a much bigger refactor. This is also suggested in the original PR here.

l0rinc commented at 1:40 pm on August 19, 2024:

but I think a much bigger refactor

Maybe, would it help if I did that in a different PR?

andrewtoth commented at 1:45 pm on August 19, 2024:

Yes, but it would conflict heavily with this one. Many of the tests would have to be removed as well.

l0rinc commented at 2:24 pm on August 19, 2024:

I’m fine with that, extracted to a branch for now, will push as a draft after your PR stabilizes

l0rinc commented at 7:59 pm on August 19, 2024:

you can resolve this comment, I see now that this should definitely be investigated in a separate PR (it’s a big change).

andrewtoth commented at 0:18 am on August 20, 2024:

I think you can resolve your own comment threads too 😉

l0rinc commented at 5:42 am on August 20, 2024:

I don’t have write access to your PR: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/about-pull-request-reviews#resolving-conversations

in src/coins.h:182 in ffabb2ddcd outdated

176@@ -184,13 +177,13 @@ struct CCoinsCacheEntry
177     inline bool IsDirty() const noexcept { return m_flags & DIRTY; }
178     inline bool IsFresh() const noexcept { return m_flags & FRESH; }
179 
180-    //! Only call Next when this entry is DIRTY, FRESH, or both
181+    //! Only call Next when this entry is DIRTY
182     inline CoinsCachePair* Next() const noexcept {
183         Assume(m_flags);

l0rinc commented at 12:46 pm on August 19, 2024:

0        Assume(IsDirty());

nit: it’s a cheap call, why not make it an assert instead? Note that after that change the comments will become redundant

andrewtoth commented at 7:24 pm on August 31, 2024:

Done.

l0rinc commented at 2:08 pm on September 1, 2024:

I don’t see it in doc: update comments to reflect possible code paths (b0737df12ed6922624d3e78bbb2bc86e610522cd)

in src/test/coins_tests.cpp:64 in 946a035cdc outdated

52@@ -53,14 +53,11 @@ class CCoinsViewTest : public CCoinsView
53 
54     bool BatchWrite(CoinsViewCacheCursor& cursor, const uint256& hashBlock) override
55     {
56-        for (auto it{cursor.Begin()}; it != cursor.End(); it = cursor.NextAndMaybeErase(*it)){
57-            if (it->second.IsDirty()) {

l0rinc commented at 12:50 pm on August 19, 2024:

is this is always the case, consider adding an assertion

0assert(it->second.IsDirty());

(we can remove them in the last commit, but would be useful for the intermediary commits at least)

andrewtoth commented at 7:18 pm on August 31, 2024:

Done.

l0rinc commented at 12:54 pm on August 19, 2024: contributor

The code became a lot simpler now - have a few suggestion I’d like us to consider, but I understand if you think it’s out of scope.

Once we get to a state that is ACK-worthy, I’ll enable hard assertions everywhere and run an IBD and check if it crashes anywhere.

in src/coins.cpp:52 in 1f2f8e516b outdated

48@@ -49,10 +49,8 @@ CCoinsMap::iterator CCoinsViewCache::FetchCoin(const COutPoint &outpoint) const
49             cacheCoins.erase(ret);
50             return cacheCoins.end();
51         }
52-        if (ret->second.coin.IsSpent()) {
53-            // The parent only has an empty entry for this outpoint; we can consider our version as fresh.
54-            ret->second.AddFlags(CCoinsCacheEntry::FRESH, *ret, m_sentinel);
55-        }
56+        // GetCoin will never return true if the coin is spent

l0rinc commented at 8:01 pm on August 19, 2024:

It doesn’t sound like it’s GetCoin’s responsibility, I think the call-sites should be checking for whether the coins are spent or not, not the cache accessor method

andrewtoth commented at 1:39 am on August 20, 2024:

It’s specifically documented that this is the behavior. Why shouldn’t we be able to rely on it?

l0rinc commented at 5:53 am on August 20, 2024:

If we need to document the behavior of a method, it suggests that we’ve acknowledged it may not be intuitive for everyone, indicating that we’ve conceded on finding a better design. I think there’s a way we can avoid the comment: by separating the two concerns of getting a value from cache and checking if it’s spent or not. Or if you insist, we can call it “GetUnspent” in which case we can also get rid of the comment and call-site asserts and keep the current behavior. What do you think?

andrewtoth commented at 4:40 pm on August 20, 2024:

What you are suggesting is basically a concept NACK on this PR. If we aren’t confident that we will never get a spent coin from GetCoin, then we need to keep the logic to handle spent and FRESH coins.

sipa commented at 4:43 pm on August 20, 2024:

@paplorinc I feel your criticism here is about the naming/expectations/documentation of the existing code, rather than about this PR. Do you feel that addressing that first should be a blocker for this PR?

l0rinc commented at 5:06 pm on August 20, 2024:

I won’t block this since I’m fine with discussing/addressing any issues in later commits as well. I also don’t yet understand the second order effects of these changes, so treat my comments as curious observations.

So my concern is that in this PR we’re tying coin.IsSpent() with retrieving values from the cache. So if my understanding is correct we should either:

be able to get a coin from the cache and ignore/assert whether it’s spent or not - e.g. HaveCoin will now return false when the cache contains it but it’s spent… and if that’s not possible in certain situations I would assert instead of returning the boolean. So in that case GetCoin would just get it from the cache and the call-site will decide whether it should assert for IsSpent or branch based on it, something like this: https://github.com/bitcoin/bitcoin/blob/9db8b30788eff2b8e9c9f629a6a03886a28a3281/src/test/fuzz/coins_view.cpp#L167
rename the method to reflect the new responsibility that it has

There’s also a third possibility that I’m completely off on what this PR does - your patience is appreciated :)

andrewtoth commented at 8:13 pm on August 22, 2024:

@l0rinc It seems to me like you are thinking this PR has some behavior change. Any coin retrieved from GetCoin will be unspent if GetCoin returns true. You can follow this logically from seeing the return value here and then the only base class is CCoinsViewDB which does not save spent coins so cannot retrieve them in GetCoin.

This PR is recognizing that FetchCoin logically cannot get to here with a spent coin. This is a dead code path.

l0rinc commented at 8:26 pm on August 22, 2024:

Yeah, I understand that.

What I’m struggling with is the cache retrieval, and especially the invalidation logic, which seems to be scattered throughout the code and tightly coupled with the business logic.

Would you be open to a screen-sharing session to discuss this? It would help me better understand the constraints and likely speed up my review process as well.

l0rinc commented at 11:24 am on August 26, 2024:

So my understanding is that if a coin is already spent it’s basically “less likely” to be in the cache (0% without reorgs, right?), so we want to get rid of it ASAP to have more likely cache hits. Given that the complete UTXO set doesn’t fit into memory anymore, would it maybe make sense to have a more complex retainment heuristic for each cache layer, e.g. spentness, recency, dust amount, unspendables, time-locked, etc?

andrewtoth commented at 7:23 pm on August 31, 2024:

Given that the complete UTXO set doesn’t fit into memory anymore, would it maybe make sense to have a more complex retainment heuristic for each cache layer,

Since we now have a linked list of entries, we could have another sentinel that tracks coins that have their flags cleared. Then, when doing a flush we could just remove a chunk from the bottom of the linked list (entries added least recently).

l0rinc commented at 2:24 pm on September 1, 2024:

comment states exactly what the code already states - it’s just noise, can be removed

l0rinc commented at 3:16 pm on September 1, 2024:

That sounds like a really good idea that we should investigate after this PR!

l0rinc commented at 8:52 pm on September 8, 2024:

Added #30849 to somewhat mitigate the GetCoin mutations in tests - hoping this will get us closer to excluding invalid tests states.

andrewtoth commented at 6:36 pm on August 20, 2024: contributor

Once we get to a state that is ACK-worthy, I’ll enable hard assertions everywhere and run an IBD and check if it crashes anywhere.

I think for a change like this, spending time fuzzing would be more valuable. I believe it already runs in debug mode for that so Assumes would trigger crashes.

in src/coins.h:187 in 1f2f8e516b outdated

184         return m_next;
185     }
186 
187-    //! Only call Prev when this entry is DIRTY, FRESH, or both
188+    //! Only call Prev when this entry is DIRTY
189     inline CoinsCachePair* Prev() const noexcept {

l0rinc commented at 9:27 am on August 26, 2024:

nit: https://github.com/bitcoin/bitcoin/blob/master/doc/developer-notes.md?plain=1#L70

in src/test/fuzz/coinscache_sim.cpp:153 in 1f2f8e516b outdated

152-                coin.Clear();
153-                return true;
154-            }
155             return false;
156         } else {
157             coin = it->second;

l0rinc commented at 9:58 am on August 26, 2024:

The docs state:

When false is returned, coin’s value is unspecified.

Eventually I hope we can untangle these two “return values” (coin & spentness), but temporarily maybe we can improve this guarantee (since I’m very much against unspecified mutations of non-local state) so that we only set the value of coin when it’s unspent (similarly to how Uncache handles this case), e.g.:

0bool GetCoin(const COutPoint& outpoint, Coin& coin) const final
1{
2    if (auto it = m_data.find(outpoint); it != m_data.end() && !it->second.IsSpent()) {
3        coin = it->second;
4        return true;
5    } else {
6        return false;
7    }
8}

in src/test/coins_tests.cpp:889 in 1f2f8e516b outdated

874@@ -883,8 +875,13 @@ BOOST_AUTO_TEST_CASE(ccoins_write)
875     for (const CAmount parent_value : {ABSENT, SPENT, VALUE1})
876         for (const CAmount child_value : {ABSENT, SPENT, VALUE2})
877             for (const char parent_flags : parent_value == ABSENT ? ABSENT_FLAGS : FLAGS)
878-                for (const char child_flags : child_value == ABSENT ? ABSENT_FLAGS : CLEAN_FLAGS)
879-                    CheckWriteCoins(parent_value, child_value, parent_value, parent_flags, child_flags, parent_flags);
880+                for (const char child_flags : child_value == ABSENT ? ABSENT_FLAGS : CLEAN_FLAGS) {
881+                    if (child_flags == FRESH) {

l0rinc commented at 10:04 am on August 26, 2024:

nit: are we repeating business logic here? Can we make it more black-box somehow?

in src/coins.cpp:333 in 1f2f8e516b outdated

337@@ -341,7 +338,7 @@ void CCoinsViewCache::SanityCheck() const
338         assert(it->second.Next()->second.Prev() == it);
339         assert(it->second.Prev()->second.Next() == it);
340         // Verify they are actually flagged.
341-        assert(it->second.IsDirty() || it->second.IsFresh());
342+        assert(it->second.IsDirty());

l0rinc commented at 10:06 am on August 26, 2024:

this will be redundant once we assert it in Prev and Next

l0rinc commented at 2:54 pm on September 1, 2024:

We can remove this now

andrewtoth commented at 8:08 pm on December 12, 2024:

We only Assume in prev and next, here we assert.

in src/coins.cpp:271 in 1f2f8e516b outdated

274@@ -278,7 +275,7 @@ bool CCoinsViewCache::Sync()
275 void CCoinsViewCache::Uncache(const COutPoint& hash)
276 {
277     CCoinsMap::iterator it = cacheCoins.find(hash);
278-    if (it != cacheCoins.end() && !it->second.IsDirty() && !it->second.IsFresh()) {
279+    if (it != cacheCoins.end() && !it->second.IsDirty()) {

l0rinc commented at 10:07 am on August 26, 2024:

nit: can be modernized to (note: can likely be deduplicated with GetCoin):

0    if (auto it{cacheCoins.find(hash)}; it != cacheCoins.end() && !it->second.IsDirty()) {

l0rinc commented at 10:53 am on December 6, 2024:

I still think it applies

andrewtoth commented at 2:53 pm on December 6, 2024:

Just seems like noise. What is the benefit of this refactor?

l0rinc commented at 3:06 pm on December 6, 2024:

This is what we’ve used in many other parts of the code since we’re narrowing the scope of the iterator - helps with understanding the rest of the code where the it isn’t available this way. But if you strongly disagree, this isn’t a blocker for me.

andrewtoth commented at 3:52 am on December 13, 2024:

Done.

in src/coins.cpp:325 in 1f2f8e516b outdated

321@@ -325,14 +322,14 @@ void CCoinsViewCache::SanityCheck() const
322         if (entry.IsDirty()) attr |= 1;
323         if (entry.IsFresh()) attr |= 2;
324         if (entry.coin.IsSpent()) attr |= 4;
325-        // Only 5 combinations are possible.
326-        assert(attr != 2 && attr != 4 && attr != 7);
327+        // Only 4 combinations are possible.

l0rinc commented at 10:39 am on August 26, 2024:

I’m having a hard time with https://github.com/bitcoin/bitcoin/blob/master/src/coins.cpp#L97, where the coin is obviously spent and can also be non-dirty (according to updatecoins_simulation_test, ccoins_flush_behavior and others), which violates attr != 4.

If it’s valid for coin to (temporarily?) have that state, shouldn’t SanityCheck allow it?

Or if this is the state when the parent-cache gets invalidated because of a re-org, could we maybe simplify the possible states by always Sync-ing when a reorg is detected and/or only deleting spent entries from parent after a certain block maturity?

andrewtoth commented at 7:20 pm on August 31, 2024:

A coin cannot be spent and not DIRTY. I made a new commit to address this in AddCoin. The only time it will be in this state is when a CCoinsViewCacheEntry is constructed with an empty coin, which then has the coin moved into it a few lines later.

I think doing a follow up where we remove direct access to flags can remove a lot of code paths that we test for.

l0rinc commented at 4:10 pm on September 1, 2024:

That makes sense, as stated in other comments, I would be more comfortable doing that before this refactoring.

in src/coins.cpp:184 in 1f2f8e516b outdated

187-            continue;
188+            throw std::logic_error("A non-DIRTY coin was returned from the cursor in BatchWrite");
189+        }
190+        if (it->second.IsFresh() && it->second.coin.IsSpent()) {
191+            throw std::logic_error("A FRESH coin was not removed when it was spent");
192         }

l0rinc commented at 11:25 am on August 26, 2024:

I find it inconsistent that we’re throwing here only for testability - since we assume this cannot happen in reality. Basically, this is another “return value” of the method, so declaring it a bool was essentially a lie (especially since it always seems to just return true). Could we use the return value instead to signal failure (and since we think this is impossible, skip the error messages)?

andrewtoth commented at 6:15 pm on August 31, 2024:

This can happen in our fuzz tests. I think a follow up to this change would be to remove direct access to the flags, and replace with methods SetDirty and SetDirtyAndFresh. Then we can get rid of fuzzing all kinds of flags and simplify this.

l0rinc commented at 4:09 pm on September 1, 2024:

Can we consider fixing that before this PR? I wouldn’t mind doing parts of that myself!

andrewtoth commented at 4:19 pm on September 8, 2024:

Are you suggesting to refactor direct flags access with SetDirty and SetDirtyAndFresh first? If we do that first, we would still need a SetFresh setter since we still have a case here with setting FRESH-but-not-DIRTY. It’s kind of a chicken-and-egg problem.

l0rinc commented at 6:22 pm on September 8, 2024:

I suggest we make our tests reflect reality before we do this change, since currently if we add asserts about the internal consistency of the code, the tests seem to constantly violate them.

in src/coins.cpp:210 in 1f2f8e516b outdated

230+            // We can mark it FRESH in the parent if it was FRESH in the child
231+            // Otherwise it might have just been flushed from the parent's cache
232+            // and already exist in the grandparent
233+            if (it->second.IsFresh()) {
234+                entry.AddFlags(CCoinsCacheEntry::FRESH, *itUs, m_sentinel);
235             }

l0rinc commented at 11:31 am on August 26, 2024:

Instead of calling AddFlags twice, in other cases we’ve compressed this into a single line:

0            entry.AddFlags(CCoinsCacheEntry::DIRTY | (it->second.IsFresh() ? CCoinsCacheEntry::FRESH : 0), *itUs, m_sentinel);

andrewtoth commented at 7:21 pm on August 31, 2024:

I think I only touch this in whitespace, so I don’t want to refactor this right now.

in src/coins.cpp:190 in 1f2f8e516b outdated

214-                if (it->second.IsFresh()) {
215-                    entry.AddFlags(CCoinsCacheEntry::FRESH, *itUs, m_sentinel);
216-                }
217+            // Create the coin in the parent cache, move the data up
218+            // and mark it as dirty.
219+            itUs = cacheCoins.try_emplace(it->first).first;

l0rinc commented at 11:33 am on August 26, 2024:

Now that this structure is simpler, we could modernize (and optimize) this find / try_emplace construct as:

0auto [itUs, inserted] = cacheCoins.try_emplace(it->first);
1if (inserted) {

andrewtoth commented at 6:17 pm on August 31, 2024:

Yes, that would be a good follow up!

l0rinc changes_requested

l0rinc commented at 11:51 am on August 26, 2024: contributor

Dug deeper and added more relevant questions - bear with me if they’re outside the scope of this PR.

DrahtBot added the label Needs rebase on Aug 28, 2024

andrewtoth force-pushed on Aug 31, 2024

DrahtBot added the label CI failed on Aug 31, 2024

DrahtBot removed the label Needs rebase on Aug 31, 2024

DrahtBot removed the label CI failed on Aug 31, 2024

in src/coins.cpp:88 in 7d2b49926c outdated

87-        // we would remove it from this cache and would never flush spentness
88-        // to the parent cache.
89+        // A spent FRESH coin cannot exist in the cache because a FRESH coin
90+        // is simply erased when it is spent.
91         //
92         // Re-adding a spent coin can happen in the case of a re-org (the coin

l0rinc commented at 2:23 pm on September 1, 2024:

can this still happen during a reorg? The comment here is really scary, do we really need so much context here? I’d prefer a test that fails when the code is wrong instead of a really long description…

andrewtoth commented at 10:27 pm on December 12, 2024:

Yes, it can happen during a reorg. But we know we can’t have a spent coin that is not DIRTY, so the comments being updated here are making this not be redundant. When the coin is spent in DisconnectBlock, it is either erased if FRESH or set DIRTY. The only time we would get into this block if the coin is not DIRTY is if it was just inserted.

in src/coins.cpp:93 in 7d2b49926c outdated

 95         //
 96-        // If the coin doesn't exist in the current cache, or is spent but not
 97-        // DIRTY, then it can be marked FRESH.
 98-        fresh = !it->second.IsDirty();
 99+        // If the coin doesn't exist in the current cache then it can be marked FRESH.
100+        fresh = inserted;

l0rinc commented at 2:51 pm on September 1, 2024:

The freshness flag depend on inserted, possible_overwrite and it->second.coin.IsSpent(), assigned and validated in different places. Could we obviate the freshness flag calculation instead with something like:

0bool fresh_flag = inserted && !possible_overwrite ? CCoinsCacheEntry::FRESH : 0;

Either here or in a follow-up we could simplify & modernize the method to something like:

 0void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possible_overwrite) {
 1    assert(!coin.IsSpent());
 2    if (coin.out.scriptPubKey.IsUnspendable()) return;
 3
 4    auto [it, inserted] = cacheCoins.try_emplace(outpoint);
 5    if (!inserted) {
 6        if (!possible_overwrite && !it->second.coin.IsSpent()) throw std::logic_error("Attempted to overwrite an unspent coin (when possible_overwrite is false)");
 7        cachedCoinsUsage -= it->second.coin.DynamicMemoryUsage();
 8    }
 9    cachedCoinsUsage += coin.DynamicMemoryUsage();
10    it->second.coin = std::move(coin);
11
12    bool fresh_flag = inserted && !possible_overwrite ? CCoinsCacheEntry::FRESH : 0;
13    it->second.AddFlags(CCoinsCacheEntry::DIRTY | fresh_flag, *it, m_sentinel);
14
15    TRACE5(utxocache, add,
16           outpoint.hash.data(),
17           (uint32_t) outpoint.n,
18           (uint32_t) it->second.coin.nHeight,
19           (int64_t) it->second.coin.out.nValue,
20           (bool) it->second.coin.IsCoinBase());
21}

andrewtoth commented at 10:29 pm on December 12, 2024:

Let’s leave this for a follow-up.

in src/test/coins_tests.cpp:752 in 7d2b49926c outdated

786@@ -793,7 +787,7 @@ BOOST_AUTO_TEST_CASE(ccoins_add)
787      */

l0rinc commented at 3:30 pm on September 1, 2024:

unrelated: could coins_cache_simulation_testbe moved into a fuzz test - or would it be too difficult to track all those states?. If you think any of these recommendations can be done in a parallel PR, let me know and I’ll do them myself before this is merged

andrewtoth commented at 3:49 am on December 13, 2024:

I’m not sure. It might just be redundant to our current fuzz harness. It’s also valuable to have these tests run on every CI.

in src/test/coins_tests.cpp:888 in 7d2b49926c outdated

884@@ -892,8 +885,13 @@ BOOST_AUTO_TEST_CASE(ccoins_write)
885     for (const CAmount parent_value : {ABSENT, SPENT, VALUE1})
886         for (const CAmount child_value : {ABSENT, SPENT, VALUE2})
887             for (const char parent_flags : parent_value == ABSENT ? ABSENT_FLAGS : FLAGS)
888-                for (const char child_flags : child_value == ABSENT ? ABSENT_FLAGS : CLEAN_FLAGS)
889-                    CheckWriteCoins(parent_value, child_value, parent_value, parent_flags, child_flags, parent_flags);
890+                for (const char child_flags : child_value == ABSENT ? ABSENT_FLAGS : CLEAN_FLAGS) {

l0rinc commented at 4:05 pm on September 1, 2024:

I might be completely off here, but in case of e.g. CheckWriteCoins(ABSENT, SPENT , FAIL , NO_ENTRY , DIRTY|FRESH, NO_ENTRY ); we’re calling .IsDirty() which returns false, but m_next and m_prev are set because CheckWriteCoins calls them with different flags.

Can this happen during actual usage, i.e. shouldn’t they only be part of the internal linked list when they’re dirty?

It bothers me a lot how many of the simulated combinations don’t always seem to reflect real usage - while comments warn us of situations that may not be covered with tests at all. Would it maybe make sense to fix the reliability of the tests before this PR?

l0rinc changes_requested

l0rinc commented at 4:11 pm on September 1, 2024: contributor

The unreliability of many of our tests worries me, would you be open to fixing them in a PR before this one?

DrahtBot added the label Needs rebase on Sep 2, 2024

andrewtoth force-pushed on Sep 2, 2024

DrahtBot removed the label Needs rebase on Sep 3, 2024

in src/test/fuzz/coinscache_sim.cpp:155 in 34a101d4ba outdated

147@@ -151,14 +148,10 @@ class CoinsViewBottom final : public CCoinsView
148     {
149         auto it = m_data.find(outpoint);
150         if (it == m_data.end()) {
151-            if ((outpoint.n % 5) == 3) {
152-                coin.Clear();

l0rinc commented at 1:25 pm on September 7, 2024:

I found this very confusing - a coin getter checking the cache, and if it’s not found, it clears the coin that it was supposed to return?

in src/test/fuzz/coinscache_sim.cpp:151 in 34a101d4ba outdated

147@@ -151,14 +148,10 @@ class CoinsViewBottom final : public CCoinsView
148     {
149         auto it = m_data.find(outpoint);
150         if (it == m_data.end()) {
151-            if ((outpoint.n % 5) == 3) {
152-                coin.Clear();
153-                return true;
154-            }
155             return false;

l0rinc commented at 1:56 pm on September 7, 2024:

since this always return false now for missing values, can https://github.com/bitcoin/bitcoin/pull/30673/files#diff-a8f78513bc27f9bf679eead54819a8e5be720401c6ae40858da226a66ca002e2R258 still happen?

andrewtoth referenced this in commit de0d22eb6c on Sep 7, 2024

in src/test/coins_tests.cpp:54 in 34a101d4ba outdated

50@@ -51,25 +51,19 @@ class CCoinsViewTest : public CCoinsView
51             return false;
52         }
53         coin = it->second;
54-        if (coin.IsSpent() && m_rng.randbool() == 0) {

l0rinc commented at 8:31 pm on September 8, 2024:

Did this change require any other test adjustment?

andrewtoth commented at 8:38 pm on September 8, 2024:

No, but without this the next commit would break.

in src/test/coins_tests.cpp:790 in 34a101d4ba outdated

786@@ -793,7 +787,7 @@ BOOST_AUTO_TEST_CASE(ccoins_add)
787      */
788     CheckAddCoin(ABSENT, VALUE3, VALUE3, NO_ENTRY   , DIRTY|FRESH, false);
789     CheckAddCoin(ABSENT, VALUE3, VALUE3, NO_ENTRY   , DIRTY      , true );
790-    CheckAddCoin(SPENT , VALUE3, VALUE3, 0          , DIRTY|FRESH, false);
791+    CheckAddCoin(SPENT , VALUE3, VALUE3, 0          , DIRTY      , false);

l0rinc commented at 11:21 am on September 12, 2024:

It seems to me that production code never actually calls AddFlags with a 0 value — we are always setting DIRTY, even when FRESH is set. Therefore, it would make sense to change AddFlags from using a custom uint8_t flags parameter (which is always DIRTY and sometimes FRESH in practice, and only called with a 0 in tests) to using a freshness parameter, such as SetDirtyAndFresh(CoinsCachePair& self, CoinsCachePair& sentinel, bool is_fresh = false).

This change would allow us to remove a lot of invalid test cases like this one (i.e. where flag is 0 or non-dirty).

As you’ve also mentioned, this would also allow us to remove the internal flags entirely. Instead of exposing the flags, we could simplify the implementation like this:

0inline bool IsDirty() const noexcept
1{
2    assert(!m_next == !m_prev);
3    return !!m_next;
4}
5inline bool IsFresh() const noexcept { return m_is_fresh; }

andrewtoth commented at 12:27 pm on September 12, 2024:

Yes, absolutely! But, this PR or something else would need to be merged first, or we would need to have a method SetFresh because there’s still a case where we have FRESH but not DIRTY.

I think we should keep the internal flags as a bitmap, since if we introduce new fields for each state it will increase the memory footprint of each CoinsCachePair.

l0rinc commented at 12:34 pm on September 12, 2024:

because there’s still a case where we have FRESH but not DIRTY.

I only see https://github.com/bitcoin/bitcoin/blob/34a101d4babd0cd2df86b414da385c3bded84027/src/coins.cpp#L204-L210 and https://github.com/bitcoin/bitcoin/blob/34a101d4babd0cd2df86b414da385c3bded84027/src/coins.cpp#L96 in production code - in both cases we’re only setting FRESH together with DIRTY. Did I miss anything in production code?

I think we should keep the internal flags as a bitmap, since if we introduce new fields for each state it will increase the memory footprint of each CoinsCachePair.

I would rather we simplify to boolean, and if we need more flags, change it to an uint8_t again - I don’t like over-generalizing for hypothetical scenarios - unless we’re sure about that future direction (which I understood isn’t the case).

andrewtoth commented at 1:28 pm on September 12, 2024:

This case will still have to be handled if this PR is not merged first: https://github.com/bitcoin/bitcoin/blob/master/src/coins.cpp#L52-L55

I would rather we simplify to boolean, and if we need more flags, change it to an uint8_t again

What I meant was adding a new field for DIRTY and a new field for FRESH. Since bools must be at least 1 byte it could make the pair larger depending on alignment.

l0rinc commented at 2:29 pm on September 12, 2024:

But we don’t need bool for dirty anymore, only for freshness - so we can even get rid of the enum (moving it over to the test maybe). I’ll experiment to see what the next smallest change is before this PR, we need to have tests that are more representative of what’s actually possible.

in src/coins.cpp:335 in 34a101d4ba outdated

336@@ -341,7 +337,7 @@ void CCoinsViewCache::SanityCheck() const
337         assert(it->second.Next()->second.Prev() == it);
338         assert(it->second.Prev()->second.Next() == it);
339         // Verify they are actually flagged.
340-        assert(it->second.IsDirty() || it->second.IsFresh());
341+        assert(it->second.IsDirty());
342         // Count the number of entries actually in the list.
343         ++count_linked;

l0rinc commented at 12:23 pm on September 12, 2024:

to avoid an infinite loop when the linking is incorrect (e.g. next returns itself), we could assert(count_linked++ < count_flagged);

andrewtoth commented at 3:50 am on December 13, 2024:

This is unrelated to this change. It will timeout if that happens, but I suppose a separate PR could be made to add that.

andrewtoth marked this as a draft on Sep 19, 2024

achow101 referenced this in commit 74fb19317a on Oct 24, 2024

DrahtBot added the label Needs rebase on Oct 24, 2024

ryanofsky referenced this in commit 17372d788e on Dec 4, 2024

andrewtoth force-pushed on Dec 4, 2024

andrewtoth marked this as ready for review on Dec 4, 2024

andrewtoth force-pushed on Dec 4, 2024

DrahtBot removed the label Needs rebase on Dec 5, 2024

andrewtoth commented at 3:50 pm on December 5, 2024: contributor

Rebased after #30849 and #30906. We can now specify the errors that should be thrown for the tests. In follow up PR, we can remove SetFresh and combine with SetDirty(bool isFresh) and remove tests for the case of FRESH-but-not-DIRTY.

in src/test/fuzz/coinscache_sim.cpp:152 in 9badfdb67e outdated

145@@ -148,8 +146,7 @@ class CoinsViewBottom final : public CCoinsView
146 public:
147     std::optional<Coin> GetCoin(const COutPoint& outpoint) const final
148     {
149-        // TODO GetCoin shouldn't return spent coins
150-        if (auto it = m_data.find(outpoint); it != m_data.end()) return it->second;
151+        if (auto it = m_data.find(outpoint); it != m_data.end() && !it->second.IsSpent()) return it->second;
152         return std::nullopt;
153     }
154

l0rinc commented at 10:21 am on December 6, 2024:

shouldn’t HaveCoin be synchronized with the behavior of GetCoin? This will return true now if we have entries that are in the db and spent:

0    bool HaveCoin(const COutPoint& outpoint) const override
1    {
2        return GetCoin(outpoint).has_value();
3    }

andrewtoth commented at 3:50 am on December 13, 2024:

Done.

in src/test/fuzz/coinscache_sim.cpp:151 in 9badfdb67e outdated

145@@ -148,8 +146,7 @@ class CoinsViewBottom final : public CCoinsView
146 public:
147     std::optional<Coin> GetCoin(const COutPoint& outpoint) const final
148     {
149-        // TODO GetCoin shouldn't return spent coins
150-        if (auto it = m_data.find(outpoint); it != m_data.end()) return it->second;
151+        if (auto it = m_data.find(outpoint); it != m_data.end() && !it->second.IsSpent()) return it->second;
152         return std::nullopt;
153     }

l0rinc commented at 10:24 am on December 6, 2024:

Nit: we could unify this methods with CCoinsViewTest#GetCoin in the same commit test: never return true from GetCoins for spent entries (brace init & override instead of final):

0    std::optional<Coin> GetCoin(const COutPoint& outpoint) const override
1    {
2        if (auto it{m_data.find(outpoint)}; it != m_data.end() && !it->second.IsSpent()) return it->second;
3        return std::nullopt;
4    }

andrewtoth commented at 3:51 am on December 13, 2024:

We would then have to share code between fuzz and unit tests, right? I’m not sure it’s worth it for this.

l0rinc commented at 2:30 pm on December 13, 2024:

I meant at least stylistically (helps with finding problems if the same logic is written in the same way)

in src/coins.cpp:52 in 2aaba5b704 outdated

48@@ -49,12 +49,9 @@ CCoinsMap::iterator CCoinsViewCache::FetchCoin(const COutPoint &outpoint) const
49     const auto [ret, inserted] = cacheCoins.try_emplace(outpoint);
50     if (inserted) {
51         if (auto coin{base->GetCoin(outpoint)}) {
52+            Assume(!coin.value().IsSpent());

l0rinc commented at 10:27 am on December 6, 2024:

This line could make sense while we’re trying to get rid of invalid tests, but I don’t see the value of keeping it in the code for GetCoin (unlike with AccessCoin which doesn’t explicitly check)

andrewtoth commented at 3:51 am on December 13, 2024:

Removed that line.

in src/test/coins_tests.cpp:858 in 57f33c2063 outdated

861-            auto expected{CoinOrError{parent}}; // TODO test failure cases as well
862+        for (const MaybeCoin& child : {MISSING, SPENT_CLEAN, VALUE2_CLEAN}) {
863+            auto expected{CoinOrError{parent}};
864+            CheckWriteCoins(parent, child, expected);
865+        }
866+        for (const MaybeCoin& child : {SPENT_FRESH, VALUE2_FRESH}) {

l0rinc commented at 10:31 am on December 6, 2024:

beautiful <3

in src/test/fuzz/coins_view.cpp:167 in 57f33c2063 outdated

150@@ -151,7 +151,9 @@ FUZZ_TARGET(coins_view, .init = initialize_coins_view)
151                     coins_view_cache.BatchWrite(cursor, fuzzed_data_provider.ConsumeBool() ? ConsumeUInt256(fuzzed_data_provider) : coins_view_cache.GetBestBlock());
152                     expected_code_path = true;
153                 } catch (const std::logic_error& e) {
154-                    if (e.what() == std::string{"FRESH flag misapplied to coin that exists in parent cache"}) {
155+                    if (e.what() == std::string{"FRESH flag misapplied to coin that exists in parent cache"}

l0rinc commented at 10:33 am on December 6, 2024:

I don’t understand why we’re okay with these values. Aren’t we simulating situations that cannot happen in reality (and should be a tragedy if they do appear, instead of whitelisting them)?

andrewtoth commented at 2:33 pm on December 6, 2024:

I think after this change is merged, we can remove SetFresh and then the fuzz harness will not be able to add a FRESH-but-not-DIRTY. Then we can remove these catches and actually break if the fuzzer reaches them.

Right now the fuzz harness can still call SetFresh without calling SetDirty.

andrewtoth commented at 2:34 pm on December 6, 2024:

The other 2 though I think you’re right. They should not be possible to hit even in the fuzz harness now. Will remove and fuzz.

andrewtoth commented at 3:34 am on December 13, 2024:

Looking again, these can still be called with the fuzz harness. Removing SetFresh in a follow-up would allow us to remove one of them, but the other 2 would not since the random_coin can be a spent coin.

l0rinc commented at 2:29 pm on December 13, 2024:

since the random_coin can be a spent coin

But can that happen in reality or are we testing something that’s difficult to justify?

There are no code paths which add a non-DIRTY entry to the cursor in BatchWrite, so we can safely make this a logic error and test for it.

andrewtoth commented at 5:52 pm on July 10, 2025:

I removed SetFresh in the last commit. This lets us remove these tests.

in src/coins.h:187 in c78fe97db8 outdated

180@@ -188,14 +181,14 @@ struct CCoinsCacheEntry
181     bool IsDirty() const noexcept { return m_flags & DIRTY; }
182     bool IsFresh() const noexcept { return m_flags & FRESH; }
183 
184-    //! Only call Next when this entry is DIRTY, FRESH, or both
185+    //! Only call Next when this entry is DIRTY
186     CoinsCachePair* Next() const noexcept
187     {
188         Assume(m_flags);

l0rinc commented at 10:36 am on December 6, 2024:

I still think we should Assume what the comment states instead - and get rid of the redundant comment:

0        Assume(IsDirty());

Edit: this seems to be done in a88c641 (#30673), please merge the two to avoid modifying a comment that is deleted later

andrewtoth commented at 3:52 am on December 13, 2024:

Done.

in src/test/coins_tests.cpp:65 in ca95ebb8d6 outdated

68-                }
69+        for (auto it{cursor.Begin()}; it != cursor.End(); it = cursor.NextAndMaybeErase(*it)) {
70+            assert(it->second.IsDirty());
71+            map_[it->first] = it->second.coin;
72+            if (it->second.coin.IsSpent() && m_rng.randrange(3) == 0) {
73+                // Randomly delete empty entries on write.

l0rinc commented at 10:43 am on December 6, 2024:

empty entries?

andrewtoth commented at 2:51 pm on December 6, 2024:

Spent entries are empty. I guess we could unify the terminology here.

andrewtoth commented at 3:52 am on December 13, 2024:

Done.

in src/txdb.cpp:148 in ca95ebb8d6 outdated

144@@ -147,7 +145,7 @@ bool CCoinsViewDB::BatchWrite(CoinsViewCacheCursor& cursor, const uint256 &hashB
145 
146     LogDebug(BCLog::COINDB, "Writing final batch of %.2f MiB\n", batch.SizeEstimate() * (1.0 / 1048576.0));
147     bool ret = m_db->WriteBatch(batch);
148-    LogDebug(BCLog::COINDB, "Committed %u changed transaction outputs (out of %u) to coin database...\n", (unsigned int)changed, (unsigned int)count);
149+    LogDebug(BCLog::COINDB, "Committed %u changed transaction outputs to coin database...\n", (unsigned int)count);

l0rinc commented at 10:47 am on December 6, 2024:

We shouldn’t need the C cast (or the trailing newline)

0    LogDebug(BCLog::COINDB, "Committed %u changed transaction outputs to coin database...", count);

andrewtoth commented at 2:52 pm on December 6, 2024:

I don’t think we can get rid of newlines…

l0rinc commented at 3:03 pm on December 6, 2024:

It was removed in https://github.com/bitcoin/bitcoin/pull/30929/files#diff-8c9b223f6f7e311ef2a0585813c17e4cc78119d39ef9956431164336e9158832R137 - setting it back demonstrates that it’s not needed anymore (but tolerated):

 0diff --git a/src/test/logging_tests.cpp b/src/test/logging_tests.cpp
 1--- a/src/test/logging_tests.cpp	(revision ac69069c050a3a3a34e2b9f27a458872c4704e42)
 2+++ b/src/test/logging_tests.cpp	(date 1733497035556)
 3@@ -134,7 +134,7 @@
 4 BOOST_FIXTURE_TEST_CASE(logging_LogPrintMacros, LogSetup)
 5 {
 6     LogTrace(BCLog::NET, "foo6: %s", "bar6"); // not logged
 7-    LogDebug(BCLog::NET, "foo7: %s", "bar7");
 8+    LogDebug(BCLog::NET, "foo7: %s\n", "bar7");
 9     LogInfo("foo8: %s", "bar8");
10     LogWarning("foo9: %s", "bar9");
11     LogError("foo10: %s", "bar10");

andrewtoth commented at 3:52 am on December 13, 2024:

Done.

l0rinc commented at 11:02 am on December 6, 2024: contributor

I went over the changes again after the rebase. I like that the change is made in small commits, but they’re not telling me a story yet, I don’t fully understand why some changes were split out and others done in the same commit. These refactorings are really important to make coins logic accessible again to all. Many of my previous questions still apply and I would need to understand them before I can ack.

andrewtoth force-pushed on Dec 13, 2024

in src/test/fuzz/coinscache_sim.cpp:149 in e04bb1b4d2 outdated

145@@ -148,14 +146,13 @@ class CoinsViewBottom final : public CCoinsView
146 public:
147     std::optional<Coin> GetCoin(const COutPoint& outpoint) const final
148     {
149-        // TODO GetCoin shouldn't return spent coins
150-        if (auto it = m_data.find(outpoint); it != m_data.end()) return it->second;
151+        if (auto it = m_data.find(outpoint); it != m_data.end() && !it->second.IsSpent()) return it->second;

l0rinc commented at 2:32 pm on December 13, 2024:

nit: style of this GetCoin is still a tiny bit different from the other one (i.e. brace init)

0        if (auto it{m_data.find(outpoint)}; it != m_data.end() && !it->second.IsSpent()) return it->second;

andrewtoth commented at 1:31 pm on July 9, 2025:

Done.

in src/test/coins_tests.cpp:53 in e04bb1b4d2 outdated

52-            if (!it->second.IsSpent() || m_rng.randbool()) {
53-                return it->second; // TODO spent coins shouldn't be returned
54-            }
55+        if (auto it{map_.find(outpoint)}; it != map_.end() && !it->second.IsSpent()) {
56+            return it->second;
57         }

l0rinc commented at 2:33 pm on December 13, 2024:

nit: can be single-line now to align with other GetCoin implementations:

0        if (auto it{map_.find(outpoint)}; it != map_.end() && !it->second.IsSpent()) return it->second

andrewtoth commented at 1:32 pm on July 9, 2025:

Done.

in src/coins.cpp:54 in f55bac7951 outdated

50@@ -51,10 +51,6 @@ CCoinsMap::iterator CCoinsViewCache::FetchCoin(const COutPoint &outpoint) const
51         if (auto coin{base->GetCoin(outpoint)}) {
52             ret->second.coin = std::move(*coin);
53             cachedCoinsUsage += ret->second.coin.DynamicMemoryUsage();
54-            if (ret->second.coin.IsSpent()) { // TODO GetCoin cannot return spent coins

l0rinc commented at 2:35 pm on December 13, 2024:

this commit seems to be the same as the previous one, but for production code - can we merge the two commits for better context?

andrewtoth commented at 1:32 pm on July 9, 2025:

Done.

in src/coins.h:105 in f44fbaea5a outdated

101@@ -102,7 +102,6 @@ using CoinsCachePair = std::pair<const COutPoint, CCoinsCacheEntry>;
102  * - unspent, FRESH, DIRTY (e.g. a new coin created in the cache)
103  * - unspent, not FRESH, DIRTY (e.g. a coin changed in the cache during a reorg)
104  * - unspent, not FRESH, not DIRTY (e.g. an unspent coin fetched from the parent cache)
105- * - spent, FRESH, not DIRTY (e.g. a spent coin fetched from the parent cache)

l0rinc commented at 2:38 pm on December 13, 2024:

Doesn’t f44fbae (#30673) belong with the previous commit instead, giving it more context?

andrewtoth commented at 1:32 pm on July 9, 2025:

Done.

in src/coins.h:191 in f44fbaea5a outdated

187@@ -195,7 +188,7 @@ struct CCoinsCacheEntry
188         return m_next;
189     }
190 
191-    //! Only call Prev when this entry is DIRTY, FRESH, or both
192+    //! Only call Next when this entry is DIRTY, FRESH, or both

l0rinc commented at 2:39 pm on December 13, 2024:

typo?

andrewtoth commented at 1:32 pm on July 9, 2025:

Fixed.

in src/coins.cpp:324 in 2c3e62d452 outdated

323         // Recompute cachedCoinsUsage.
324         recomputed_usage += entry.coin.DynamicMemoryUsage();
325 
326         // Count the number of entries we expect in the linked list.
327-        if (entry.IsDirty() || entry.IsFresh()) ++count_flagged;
328+        if (entry.IsDirty()) ++count_flagged;

l0rinc commented at 2:41 pm on December 13, 2024:

same: this commit seems related to the previous commit, can we merge them?

andrewtoth commented at 1:32 pm on July 9, 2025:

Done.

in src/coins.cpp:311 in 4c9c30eda3 outdated

310@@ -311,12 +311,13 @@ void CCoinsViewCache::SanityCheck() const
311     size_t recomputed_usage = 0;
312     size_t count_flagged = 0;
313     for (const auto& [_, entry] : cacheCoins) {

l0rinc commented at 2:46 pm on December 13, 2024:

It seems to me this can also be merged with the related SanityCheck commit

andrewtoth commented at 1:34 pm on July 9, 2025:

I merged it with https://github.com/bitcoin/bitcoin/pull/30673/commits/aaef2d6c0d801dc48798e9a7d0911c234870412e instead.

in src/test/fuzz/coins_view.cpp:171 in 6956ee9cc7 outdated

150@@ -151,7 +151,9 @@ FUZZ_TARGET(coins_view, .init = initialize_coins_view)
151                     coins_view_cache.BatchWrite(cursor, fuzzed_data_provider.ConsumeBool() ? ConsumeUInt256(fuzzed_data_provider) : coins_view_cache.GetBestBlock());
152                     expected_code_path = true;
153                 } catch (const std::logic_error& e) {
154-                    if (e.what() == std::string{"FRESH flag misapplied to coin that exists in parent cache"}) {
155+                    if (e.what() == std::string{"FRESH flag misapplied to coin that exists in parent cache"}
156+                        || e.what() == std::string{"A FRESH coin was not removed when it was spent"}
157+                        || e.what() == std::string{"A non-DIRTY coin was returned from the cursor in BatchWrite"}) {
158                         expected_code_path = true;
159                     }

l0rinc commented at 11:28 am on January 23, 2025:

We may want to add a Flush here as well: https://github.com/bitcoin/bitcoin/pull/28216/files#diff-1ef3b6a1936b50f3d5ec4a1786d9e2d63d1a3e1815b103e67f20601995f355b4R79

andrewtoth commented at 1:35 pm on July 9, 2025:

That’s being done in #32602.

DrahtBot added the label Needs rebase on Apr 8, 2025

l0rinc commented at 8:43 am on July 6, 2025: contributor

This is still important - @andrewtoth, how can I help?

andrewtoth force-pushed on Jul 9, 2025

DrahtBot added the label CI failed on Jul 9, 2025

DrahtBot commented at 1:22 am on July 9, 2025: contributor

🚧 At least one of the CI tasks failed. Task TSan, depends, gui: https://github.com/bitcoin/bitcoin/runs/45603902258 LLM reason (✨ experimental): The CI failure is caused by a syntax error in coinscache_sim.cpp, specifically an expected ‘}’ which leads to undeclared identifier errors during compilation.

Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.
An intermittent issue.

Leave a comment here, if you need help tracking down a confusing failure.

DrahtBot removed the label Needs rebase on Jul 9, 2025

coins: coins returned from GetCoin cannot be spent 320482e860

coins: throw errors for non-DIRTY or spent and FRESH coins in BatchWrite

There are no code paths which add a non-DIRTY entry to the cursor in
BatchWrite, so we can safely make this a logic error and test for it.

There are no code paths which add a spent and FRESH coin to the cursor
in BatchWrite, so we can safely make this a logic error and test for it.

ae76ec7bcf

coins: remove redundant IsDirty checks in BatchWrite

It is no longer possible to get non-DIRTY entries in BatchWrite

a421be4e95

coins: remove IsFresh check in Uncache

It is not possible for an entry to be FRESH if not already DIRTY.

aaef2d6c0d

coins: assume entry is dirty in Next and Prev b2622c47c6

test: simplify coins sanity check bcf4ec0e1f

coins: set all inserted spent coins to fresh

A spent coins must be DIRTY, so remove
references to spent but not DIRTY coins.
The only way a spent coin can be not DIRTY
is when creating the CCoinsCacheEntry with
an empty coin. This can be made more clear
by setting fresh if inserted, instead of
checking if an unspent coin is not DIRTY.

53dbebd748

andrewtoth force-pushed on Jul 9, 2025

DrahtBot removed the label CI failed on Jul 9, 2025

andrewtoth commented at 1:37 pm on July 9, 2025: contributor

@l0rinc rebased. Thank you for your reviews! I have tried to address all your comments. There are some older comments that I’m unsure are still relevant. Please let me know if there is still anything outstanding that needs to be addressed.

coins: remove SetFresh and replace with boolean flag passed to SetDirty

It is not possible to have a FRESH CCoinsCacheEntry that is not also
DIRTY. Test code uses the SetFresh method out of order to simulate
entries that are FRESH but not DIRTY. By removing the method entirely
we can simplify the test code and make the production code easier to
understand.

fd2338d915

andrewtoth force-pushed on Jul 10, 2025

in src/test/fuzz/coinscache_sim.cpp:147 in 320482e860 outdated

145@@ -148,14 +146,13 @@ class CoinsViewBottom final : public CCoinsView
146 public:
147     std::optional<Coin> GetCoin(const COutPoint& outpoint) const final

l0rinc commented at 11:25 pm on July 14, 2025:

nit: the class is already final + other overrides specify that explicitly for clarity

0    std::optional<Coin> GetCoin(const COutPoint& outpoint) const override

and

0    bool HaveCoin(const COutPoint& outpoint) const override

And most other such methods in the file - but I understand if you’d prefer to do in a follow-up instead

in src/coins.h:173 in fd2338d915

166@@ -174,8 +167,10 @@ struct CCoinsCacheEntry
167         SetClean();
168     }
169 
170-    static void SetDirty(CoinsCachePair& pair, CoinsCachePair& sentinel) noexcept { AddFlags(DIRTY, pair, sentinel); }
171-    static void SetFresh(CoinsCachePair& pair, CoinsCachePair& sentinel) noexcept { AddFlags(FRESH, pair, sentinel); }
172+    static void SetDirty(CoinsCachePair& pair, CoinsCachePair& sentinel, bool fresh = false) noexcept
173+    {
174+        AddFlags(fresh ? FRESH | DIRTY : DIRTY, pair, sentinel);
175+    }

l0rinc commented at 1:27 am on July 15, 2025:

Since it’s only called here, we could inline and simplify it now:

 0    static void SetDirty(CoinsCachePair& pair, CoinsCachePair& sentinel, bool fresh = false) noexcept
 1    {
 2        if (!pair.second.m_flags) {
 3            Assume(!pair.second.m_prev && !pair.second.m_next);
 4            pair.second.m_prev = sentinel.second.m_prev;
 5            pair.second.m_next = &sentinel;
 6            sentinel.second.m_prev = &pair;
 7            pair.second.m_prev->second.m_next = &pair;
 8        }
 9        Assume(pair.second.m_prev && pair.second.m_next);
10        pair.second.m_flags |= fresh ? FRESH | DIRTY : DIRTY;
11    }

Not sure, but do the flags still make sense after this or would a bool is_fresh also suffice? The dirtiness can be deduced from the m_next/m_prev pair so the only unknown is the freshness which doesn’t necessitate a flag anymore - right?

in src/coins.h:184 in fd2338d915

182@@ -188,17 +183,15 @@ struct CCoinsCacheEntry
183     bool IsDirty() const noexcept { return m_flags & DIRTY; }
184     bool IsFresh() const noexcept { return m_flags & FRESH; }

l0rinc commented at 1:29 am on July 15, 2025:

we don’t have Fresh anymore, only IsDirtyAndFresh, right?

0    bool IsDirtyAndFresh() const noexcept
1    {
2        const bool is_fresh = m_flags & FRESH;
3        Assume(IsDirty() || !is_fresh);
4        return is_fresh;
5    }

in src/test/coins_tests.cpp:51 in fd2338d915

47@@ -48,26 +48,20 @@ class CCoinsViewTest : public CCoinsView
48 
49     std::optional<Coin> GetCoin(const COutPoint& outpoint) const override
50     {
51-        if (auto it{map_.find(outpoint)}; it != map_.end()) {
52-            if (!it->second.IsSpent() || m_rng.randbool()) {
53-                return it->second; // TODO spent coins shouldn't be returned
54-            }
55-        }
56+        if (auto it{map_.find(outpoint)}; it != map_.end() && !it->second.IsSpent()) return it->second;

l0rinc commented at 1:37 am on July 15, 2025:

Given the lack of reviews, we may want to split out the test cleanups (removing invalid states) and refactors to a separate PR, where this will be a tracking PR for where we want to get to in safer/smaller steps

in src/test/coinscachepair_tests.cpp:158 in fd2338d915

153@@ -154,22 +154,14 @@ BOOST_AUTO_TEST_CASE(linked_list_set_state)
154     BOOST_CHECK_EQUAL(sentinel.second.Next(), &n1);
155     BOOST_CHECK_EQUAL(sentinel.second.Prev(), &n1);
156 
157-    // Check that setting FRESH on new node inserts it after n1
158-    CCoinsCacheEntry::SetFresh(n2, sentinel);
159-    BOOST_CHECK(n2.second.IsFresh() && !n2.second.IsDirty());
160+    // Check that setting DIRTY and FRESH on new node inserts it after n1
161+    CCoinsCacheEntry::SetDirty(n2, sentinel, true);

l0rinc commented at 1:38 am on July 15, 2025:

0    CCoinsCacheEntry::SetDirty(n2, sentinel, /*fresh=*/true);

in src/test/fuzz/coinscache_sim.cpp:155 in fd2338d915

153     }
154 
155     bool HaveCoin(const COutPoint& outpoint) const final
156     {
157-        return m_data.count(outpoint);
158+        return GetCoin(outpoint).has_value();

l0rinc commented at 1:39 am on July 15, 2025:

we should be able to do this safely in a prefactor PR regardless of the other changes

l0rinc commented at 1:46 am on July 15, 2025: contributor

The src/coins.cpp prod changes are a bit scary - while it’s a very important change, based on the lack of enthusiasm by others, we may want to separate the tests and small refactors into PRs that will get us closer to this.

One could remove the invalid test states - without removing the invalid production code states, just potentially adding TODOs there in case it’s missing. The SetFresh and flag removal could be next and if all of those are merged, we can continue with the scary FetchCoin, AddCoin, BatchWrite and Uncache.

Left a few new comments quickly, let me know what you think.

andrewtoth commented at 7:22 pm on July 19, 2025: contributor

Closing in favor of #33018.

andrewtoth closed this on Jul 19, 2025

coins: remove logic for spent-and-FRESH cache entries and writing non-DIRTY entries #30673

Code Coverage & Benchmarks

Reviews

Conflicts