andrewtoth
commented at 10:38 pm on August 16, 2023:
contributor
Since #17487 we no longer need to clear the coins cache when syncing to disk. A warm coins cache significantly speeds up block connection, and only needs to be fully flushed when nearing the dbcache limit.
For frequent pruning flushes there’s no need to empty the cache and kill connect block speed. However, simply using Sync in place of Flush actually slows down a pruned full IBD with a high dbcache value. This is because as the cache grows, sync takes longer since every coin in the cache is scanned to check if it’s dirty. For frequent prune flushes and a large cache this constant scanning starts to really slow IBD down, and just emptying the cache on every prune becomes faster.
To fix this, we can add two pointers to each cache entry and construct a doubly linked list of dirty entries. We can then only iterate through all dirty entries on each Sync, and simply clear the pointers after.
With this approach a full IBD with dbcache=16384 and prune=550 was 32% faster than master. For default dbcache=450 speedup was ~9%. All benchmarks were run with stopatheight=800000.
| | prune | dbcache | time | max RSS | speedup |
|---|---|---|---|---|---|
| master | 550 | 16384 | 8:52:57 | 2,417,464k | - |
| branch | 550 | 16384 | 6:01:00 | 16,216,736k | 32% |
| branch | 550 | 450 | 8:05:08 | 2,818,072k | 8.8% |
| master | 10000 | 5000 | 8:19:59 | 2,962,752k | - |
| branch | 10000 | 5000 | 5:56:39 | 6,179,764k | 28.8% |
| master | 0 | 16384 | 4:51:53 | 14,726,408k | - |
| branch | 0 | 16384 | 4:43:11 | 16,526,348k | 2.7% |
| master | 0 | 450 | 7:08:07 | 3,005,892k | - |
| branch | 0 | 450 | 6:57:24 | 3,013,556k | 2.6% |
While the two pointers add memory to each cache entry, they did not slow down IBD; non-pruned IBD times were similar on this branch and master. When I performed the initial IBD, the full UTXO set could be held in memory at the max dbcache value, so a non-pruned IBD to tip with max dbcache used 12% more memory on this branch, yet was also 2.7% faster. For smaller dbcache values the dbcache limit is respected, so this branch does not consume more memory, and the potentially more frequent flushes were not significant enough to cause any slowdown.
For reviewers, the commits in order do the following:
- The first 4 commits encapsulate all accesses to flags on cache entries, and the 5th makes the flags private.
- Commits *refactor: add CoinsCachePair alias* and *coins: call ClearFlags in CCoinsCacheEntry destructor* create the linked list head nodes and the cache entry self references, and pass them into AddFlags.
- Commit *coins: track flagged cache entries in linked list* actually adds entries to the linked list when they are flagged DIRTY or FRESH, and removes them when they are destroyed or their flags are cleared manually. However, the linked list is not yet used anywhere.
- Commit *test: add cache entry linked list tests* adds unit tests for the linked list.
- Commit *coins: pass linked list of flagged entries to BatchWrite* uses the linked list to iterate through DIRTY entries instead of iterating the entire coins cache.
- Commit *validation: don't erase coins cache on prune flushes* uses Sync instead of Flush for pruning flushes, so the cache is no longer cleared.
Conflicts
Reviewers, this pull request conflicts with the following ones:
#30326 (optimization: Reduce cache lookups in CCoinsViewCache::FetchCoin by paplorinc)
#28216 (fuzz: a new target for the coins database by darosior)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
in src/coins.h:109 (commit dbd663c446, outdated)
```diff
@@ -103,8 +103,16 @@ class Coin
  */
 struct CCoinsCacheEntry
 {
+private:
+    //! These are used to create a doubly linked list of dirty entries. They are
+    //! set in SetFlags and unset in ClearFlags.
+    CCoinsCacheEntry* m_prev{nullptr};
```
Actually, it seems iterators may get invalidated when a map rehash occurs, but pointers to elements don’t.
I think you can however store pointers to std::pair<const COutPoint, CoinsCacheEntry> (the value_type of the map). Would that help alleviate the need for m_outpoint?
andrewtoth
commented at 2:28 pm on August 17, 2023:
Brilliant, thanks! Was just trying this last night, this looks like the way. Will give it a shot.
andrewtoth
commented at 8:21 pm on August 17, 2023:
DrahtBot added the label
CI failed
on Aug 17, 2023
DrahtBot removed the label
CI failed
on Aug 18, 2023
andrewtoth force-pushed
on Aug 18, 2023
andrewtoth force-pushed
on Aug 18, 2023
andrewtoth force-pushed
on Aug 19, 2023
andrewtoth force-pushed
on Aug 19, 2023
andrewtoth force-pushed
on Aug 20, 2023
andrewtoth force-pushed
on Aug 23, 2023
andrewtoth force-pushed
on Aug 24, 2023
andrewtoth force-pushed
on Aug 24, 2023
jamesob
commented at 3:14 pm on August 24, 2023:
contributor
Concept ACK! I’ll try to repro the bench results in the next week or so.
DrahtBot added the label
CI failed
on Aug 24, 2023
DrahtBot removed the label
CI failed
on Aug 25, 2023
0xB10C
commented at 7:52 am on August 30, 2023:
contributor
@andrewtoth Not sure if you have seen #20827 which is a different approach but has a similar effect.
andrewtoth
commented at 7:13 pm on August 30, 2023:
contributor
@0xB10C indeed I have seen that one. I believe they complement each other. #20827 reduces IBD time for larger prune values and this PR reduces it even more for smaller prune values. I should benchmark this on top of that one.
Also, I’m curious how you are performing your benchmarks and generating all those nice graphs. Do you mind sharing what tools you use?
0xB10C
commented at 6:35 am on August 31, 2023:
contributor
Also, I’m curious how you are performing your benchmarks and generating all those nice graphs. Do you mind sharing what tools you use?
At the time I had a spare and otherwise idle machine set up I could use. I don’t have the bash script at hand but IIRC I had a merge base (MB) and a pull-request (PR) binary I’d run with different prune and dbcache configurations. After each run, the script would rename and move the debug log (with extra -debug=coindb -debug=prune) to a separate folder. After all runs were done, I loaded the debug logs into the following Jupyter notebooks to parse the debug logs and to generate the graphs. For anyone revisiting this, these are the relevant comments #20827 (comment), #20827 (comment), and #20827 (comment) for my setup and results.
I’ve uploaded the jupyter nodebooks here https://gist.github.com/0xB10C/89c2903290cfb1840792d41dcd079646. The first was used for a partial IBD from 500k-600k and the second one with a full IBD. Both times syncing from a local node on the same host. The notebooks expect the debug log files with the name debug_{binary}_{cset}_{run}.log where {binary} is either MB or PR, {cset} is an integer specifying the configuration set (pruning and dbcache parameter), and {run} is an integer numbering the run of this binary-cset combination if I did multiple. Feel free to remix and share in any way you want. Keep in mind that parsing logs is quite fragile as log message format can change. Feel free to reach out if there are questions.
DrahtBot added the label
CI failed
on Sep 4, 2023
DrahtBot removed the label
CI failed
on Sep 5, 2023
jamesob
commented at 5:47 pm on September 20, 2023:
contributor
Bench running now, should have some results in the next day or so.
jamesob
commented at 10:25 am on September 27, 2023:
contributor
Local IBD of 30_000 blocks with -prune=550 from a Taproot-enabled height didn’t show any difference. I didn’t have -log=prune enabled, so my bench debug.log was basically useless. Rerunning…
DrahtBot added the label
Needs rebase
on Nov 8, 2023
andrewtoth force-pushed
on Nov 30, 2023
DrahtBot removed the label
Needs rebase
on Dec 1, 2023
andrewtoth force-pushed
on Dec 3, 2023
andrewtoth
commented at 6:30 pm on December 3, 2023:
contributor
Rebased, ran the following benchmark using hyperfine. Sync to block 800k 3 times on master commit and the last commit of this branch (I updated the commit recently, but no code changes only commit order changes to satisfy CI), dbcache=4000 and prune=550:
The second commit is this branch, so it runs 24% faster than master.
@jamesob I suppose 30k blocks is not enough to appreciate the effect of syncing with a full cache?
maflcko
commented at 11:46 am on December 5, 2023:
One thing I don’t like about modernize-use-equals-default is that it makes aggregate initialization possible again, when a developer disabled it by declaring a constructor.
Though, now that C++20 will be required soon, I guess it can be enabled. C.f.
```cpp
struct A {
    // A() {}; // "main" fn fails to compile under C++17 and 20
    A() = default; // "main" fn fails to compile under C++20
    int b{};
};

int main() { (void)A{1}; }
```
martinus
commented at 12:00 pm on December 5, 2023:
martinus
commented at 7:25 pm on December 4, 2023:
I think SetFlags is a bit of a confusing name, since it's really an AddFlags: it doesn't simply set the flags, it ORs them together.
Alternatively (and I think this is the better solution), really use m_flags = flags instead of m_flags |= flags.
andrewtoth
commented at 2:15 am on December 5, 2023:
Only 2 of the 6 previous flags assignments (aside from = 0) used simple assignment; the other 4 use OR assignment. For the 2 that use simple assignment, we know logically that the entry must have just been created, so its flags are 0 and we are not overwriting any existing flags. Therefore an OR assignment in those 2 cases is logically equivalent.
So if we were to go with your second suggestion, we would need to modify behavior to make sure we assign the new flags OR'd with the current flags so we don't overwrite existing flags, or create separate Set and Add methods. I think that would make this patch more cumbersome.
It seems to me your first suggestion is the path of least resistance. Updated to AddFlags.
It seems to me your first suggestion would be the path of least resistance. Updated to AddFlags.
martinus
commented at 7:32 pm on December 4, 2023:
Not sure if that is possible in the call, but when emplace doesn’t create a new object because the outpoint is already present, then previously the flags didn’t change. Now, the flag DIRTY is always added, regardless if the outpoint was already present in cacheCoins or not.
in src/coins.cpp:234 (commit ba09f5039c, outdated)
```diff
-            if (!(it->second.flags & CCoinsCacheEntry::FRESH && it->second.coin.IsSpent())) {
+            if (!(it->second.GetFlags() & CCoinsCacheEntry::FRESH && it->second.coin.IsSpent())) {
                 // Create the coin in the parent cache, move the data up
                 // and mark it as dirty.
-                CCoinsCacheEntry& entry = cacheCoins[it->first];
+                itUs = cacheCoins.emplace(std::piecewise_construct, std::forward_as_tuple(it->first), std::tuple<>()).first;
```
martinus
commented at 7:40 pm on December 4, 2023:
I’d use itUs = cacheCoins.try_emplace(it->first).first; instead, it’s just less code and does the same thing
in src/coins.cpp:317 (commit ba09f5039c, outdated)
```diff
+            for (auto it{m_flagged_head.second.Next()}; it != nullptr; ) {
                 if (it->second.coin.IsSpent()) {
+                    // We need to get the next entry before we erase this entry.
+                    // Since cacheCoins owns the entry, it will be deallocated before we
+                    // can access the next entry if we erase first.
+                    const auto next_entry{it->second.Next(/*clear_flags=*/true)};
```
martinus
commented at 8:12 pm on December 4, 2023:
nit, you could move that line in front of the if, then remove the else, like so:
andrewtoth
commented at 3:03 am on December 5, 2023:
contributor
@martinus thank you very much for your review! I updated the PR with all your suggestions.
andrewtoth force-pushed
on Dec 5, 2023
DrahtBot added the label
CI failed
on Dec 5, 2023
DrahtBot removed the label
CI failed
on Dec 5, 2023
martinus
commented at 5:55 am on December 5, 2023:
contributor
I didn’t think this through yet, but this might be an alternative way to achieve the same thing: Instead of the double linked list, split the cacheCoins map up into 4 different maps, one map for each flag setting, e.g. as a std::array<CCoinMap, 4> m_cache_coins_per_flag;. Index is the flag, e.g. m_cache_coins_per_flag[FRESH | DIRTY] to get the map with all entries that are fresh & dirty. Instead of setting a flag & linking the list, use the unordered_map’s auto node = extract(...) and insert(std::move(node)) to move the entry to the corresponding map. This only relinks the unordered_map node and causes no allocation, pointers to the entry stay valid.
This approach would have a few advantages:
- Less memory usage: no m_flags member is needed (which actually required 8 bytes due to alignment), no 2 pointers are needed for interlinking, and multiple maps mean the overall bucket sizes stay smaller.
- Processing the maps might be faster: iteration is fast because e.g. skipping non-dirty entries is trivial (they are in a different map). There might be a hashing overhead from extract() and insert(), though.

For this to work, all 4 maps need to use the same m_cache_coins_memory_resource (insert() into a different map whose allocator doesn't compare equal is undefined behavior).
andrewtoth
commented at 1:19 pm on December 5, 2023:
contributor
@martinus something similar was mentioned in #15265 (comment). However, with your approach every access would require 4 lookups.
martinus
commented at 7:43 pm on December 5, 2023:
contributor
@martinus something similar was mentioned in #15265 (comment). However, with your approach every access would require 4 lookups.
Ha, of course; I didn’t think about that at all.
andrewtoth force-pushed
on Dec 13, 2023
DrahtBot added the label
CI failed
on Jan 12, 2024
andrewtoth force-pushed
on Mar 8, 2024
andrewtoth force-pushed
on Mar 8, 2024
DrahtBot removed the label
CI failed
on Mar 8, 2024
andrewtoth force-pushed
on Mar 14, 2024
andrewtoth force-pushed
on Mar 14, 2024
andrewtoth force-pushed
on Mar 14, 2024
DrahtBot added the label
CI failed
on Mar 14, 2024
DrahtBot
commented at 6:08 pm on March 14, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot removed the label
CI failed
on Mar 14, 2024
andrewtoth force-pushed
on Mar 16, 2024
in src/coins.h:110 (commit 56faed97cd, outdated)
```diff
@@ -103,6 +106,14 @@ class Coin
  */
 struct CCoinsCacheEntry
 {
+private:
+    //! These are used to create a doubly linked list of flagged entries. They
```
In commit “coins: track dirty cache entries in doubly linked list”
Use a C++17 structured binding:
```cpp
auto [it, inserted] = cacheCoins.emplace(
```
in src/coins.cpp:54 (commit 75a248f770, outdated)
```diff
@@ -51,7 +51,7 @@ CCoinsMap::iterator CCoinsViewCache::FetchCoin(const COutPoint &outpoint) const
     if (ret->second.coin.IsSpent()) {
         // The parent only has an empty entry for this outpoint; we can consider our
         // version as fresh.
-        ret->second.flags = CCoinsCacheEntry::FRESH;
+        ret->second.AddFlags(CCoinsCacheEntry::FRESH, *ret, m_flagged_head);
```
In commit “coins: track dirty cache entries in doubly linked list”
Should FRESH-but-not-DIRTY be flagged? I believe those correspond to spent but unmodified coins fetched from the parent cache.
If the rule is flagged == dirty, then perhaps it makes sense to replace AddFlags and ClearFlags with a single SetFlags which knows whether the new state is supposed to be in the list or not (moving the responsibility of flagged-or-not into that function instead of the caller). EDIT: I notice you used to have a SetFlags, actually, but my comment is more about where the responsibility lies for deciding whether an entry ought to be flagged than about the actual function names.
andrewtoth
commented at 6:32 pm on March 20, 2024:
Should FRESH-but-not-DIRTY be flagged? I believe those correspond to spent but unmodified coins fetched from the parent cache.
Yes, but only because in this case they are also spent. We erase all spent coins from the cache after doing a non-erasing batch write. There would be no way to track spent-but-not-DIRTY otherwise except by scanning the entire cache.
Since the only case where we have FRESH-but-not-DIRTY is where the coin is spent, it makes sense to me to just add this to the linked list as well. Before proposing this PR I did call the linked list head m_dirty_head, but after not finding an efficient way to remove non-dirty spent entries, I decided to call it m_flagged_head.
If this makes sense to you I can add a comment here explaining this.
see #28280 (review), I think that FRESH-but-not-DIRTY coins may not exist.
andrewtoth force-pushed
on Mar 21, 2024
andrewtoth
commented at 2:11 pm on March 21, 2024:
contributor
@0xB10C rebased after #20827 was merged, and ran the following benchmark:
```shell
hyperfine --export-markdown ~/bench.md --show-output --parameter-list commit 015ac13dcc964a31ef06dfdb565f88f901607f0e,25bdbc7f340884b3ab871c7948a4f2693037b871 --prepare 'git checkout {commit} && make -j$(nproc) src/bitcoind; sync; sudo /sbin/sysctl vm.drop_caches=3; rm -rf ~/.bitcoin/blocks; rm -rf ~/.bitcoin/chainstate;' -M 3 './src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=<local network node ip> -stopatheight=800000'
...

Benchmark 1: ./src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=<local network node ip> -stopatheight=800000 (commit = 015ac13dcc964a31ef06dfdb565f88f901607f0e)

  Time (mean ± σ):     27241.365 s ± 33.693 s    [User: 34791.898 s, System: 7357.098 s]
  Range (min … max):   27202.978 s … 27266.037 s    3 runs

Benchmark 2: ./src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=<local network node ip> -stopatheight=800000 (commit = 25bdbc7f340884b3ab871c7948a4f2693037b871)

  Time (mean ± σ):     22153.568 s ± 37.663 s    [User: 27818.681 s, System: 4309.878 s]
  Range (min … max):   22119.682 s … 22194.118 s    3 runs

Summary
  ./src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=192.168.2.13 -stopatheight=800000 (commit = 25bdbc7f340884b3ab871c7948a4f2693037b871) ran
    1.23 ± 0.00 times faster than ./src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=192.168.2.13 -stopatheight=800000 (commit = 015ac13dcc964a31ef06dfdb565f88f901607f0e)
```
So this is 23% faster than master when using dbcache=4000, prune=550. (I just rebased again but 015ac13dcc964a31ef06dfdb565f88f901607f0e is master and 25bdbc7f340884b3ab871c7948a4f2693037b871 is my branch).
@sipa thank you for your review! Addressed your nits and added more comments about flagged entries.
DrahtBot added the label
Needs rebase
on Mar 22, 2024
andrewtoth force-pushed
on Apr 3, 2024
DrahtBot removed the label
Needs rebase
on Apr 3, 2024
DrahtBot added the label
CI failed
on Apr 3, 2024
andrewtoth
commented at 9:26 pm on April 4, 2024:
contributor
Rebased and ran some faster benchmarks, 2 each but only until 400k. Will do some more going to 800k.
DrahtBot removed the label
CI failed
on Apr 4, 2024
andrewtoth
commented at 4:03 am on April 9, 2024:
contributor
Ran benchmarks to 800k blocks, 2 times each. As expected, results are very good when running pruned. When running non-pruned with default dbcache, however, master was 1% faster. I think this slowdown is negligible though and the improvements to syncing with pruning enabled make this worth the changes here. Also, potential improvements like #28945 will make up for it.
| | prune | dbcache | mean time | speedup |
|---|---|---|---|---|
| master | 550 | 16384 | 7h 29m | - |
| branch | 550 | 16384 | 4h 56m | 34% |
| branch | 550 | 450 | 6h 33m | 12% |
| master | 0 | 16384 | 3h 34m | - |
| branch | 0 | 16384 | 3h 29m | 2% |
| master | 0 | 450 | 5h 34m | 1% |
| branch | 0 | 450 | 5h 40m | - |
hernanmarino
commented at 11:37 pm on April 12, 2024:
contributor
Concept ACK. I'd really like to see this merged.
DrahtBot added the label
CI failed
on Apr 21, 2024
DrahtBot removed the label
CI failed
on Apr 23, 2024
achow101
commented at 8:21 pm on May 13, 2024:
member
For frequent pruning flushes there’s no need to empty the cache and kill connect block speed. However, simply using Sync in place of Flush actually slows down a pruned full IBD with a high dbcache value. This is because as the cache grows, sync takes longer since every coin in the cache is scanned to check if it’s dirty. For frequent prune flushes and a large cache this constant scanning starts to really slow IBD down, and just emptying the cache on every prune becomes faster.
To fix this, we can add two pointers to each cache entry and construct a doubly linked list of dirty entries. We can then only iterate through all dirty entries on each Sync, and simply clear the pointers after.
Perhaps I’m missing something, but it seems to me that the obvious solution to this problem is to have BatchWrite always delete dirty entries and clear the flags on all entries, rather than implementing a whole new data structure? BatchWrite already iterates the entire cache anyways, so it would make sense to me that we can consolidate all of the operations that occur for each cache entry.
achow101 referenced this in commit
d6069cb8d6
on May 13, 2024
@achow101 That works, but it misses the point. If you delete the dirty entries on a prune flush, then those entries need to be re-read from disk when they are (likely soon) spent.
achow101
commented at 8:36 pm on May 13, 2024:
member
@achow101 That works, but it misses the point. If you delete the dirty entries on a prune flush, then those entries need to be re-read from disk when they are (likely soon) spent.
Sorry, I meant clear the Dirty and spent ones. Basically just move the stuff that Sync() does into the loop that BatchWrite is doing.
DrahtBot added the label
Needs rebase
on May 13, 2024
andrewtoth
commented at 2:32 am on May 14, 2024:
contributor
@achow101 I see what you mean, clear the flags and delete spent entries in the loop in BatchWrite. That would reduce each Sync from 2 full scans of the coinsCache to 1. It still scans the entire coinsCache, while the current change here only loops through flagged entries. Will experiment with how that performs, and possibly combining both ideas to do a single loop of only the flagged entries.
Am I right that after this change (or alternatives which let BatchWrite just iterate dirty/fresh entries), it would be possible to (perhaps dramatically) reduce the periodic flush time, so that after IBD, we basically always have a mostly non-dirty cache? Such a change could be made today of course, but the cost of iterating the entire cache every time would add up quickly.
andrewtoth
commented at 4:44 pm on May 21, 2024:
contributor
Am I right that after this change (or alternatives which let BatchWrite just iterate dirty/fresh entries), it would be possible to (perhaps dramatically) reduce the periodic flush time, so that after IBD, we basically always have a mostly non-dirty cache? Such a change could be made today of course, but the cost of iterating the entire cache every time would add up quickly.
I believe you are right, but I’m unsure what the benefit is that you see. From my understanding, the periodic flushes are solely to prevent data loss in case of unclean shutdown (otherwise, flushing from filling the dbcache or flushing from shutting down cleanly prevent data loss). With a 24 hour periodic flush schedule and #15218, the most that is at risk is ~144 blocks worth of data?
If you are talking about periodic flushes during IBD though, that might be a benefit that could obviate #15218 and properly fix #11600. If we reduce the periodic flush interval to an hour or so, then only an hour or so of IBD data would be at risk of rolling back. This would of course mean that there is less opportunity for the cache to delete FRESH & DIRTY entries on spend so they never have to hit the disk, so it would be less performant. But it would still keep the cache for connecting blocks quickly.
@andrewtoth I was talking about post-IBD. And one advantage to more frequent flushing is reducing the latency spikes you get when a flush is triggered.
andrewtoth
commented at 12:43 pm on June 3, 2024:
contributor
BatchWrite already iterates the entire cache anyways
@achow101 I think this is what you might not be clear about. BatchWrite iterates the entire cache, but this patch does not. After this patch BatchWrite only iterates flagged entries, which is a much smaller subset of the cache.
I tried moving everything in the loop in Sync into BatchWrite, along with cachedCoinsUsage so it can be decremented as we delete spent coins from the child cache, and it looks like it will take many times longer to sync with dbcache=16384 prune=550 than current master. So I don’t think that’s a viable alternative solution to what is proposed here.
andrewtoth force-pushed
on Jun 3, 2024
andrewtoth
commented at 2:17 pm on June 3, 2024:
contributor
Rebased, and added a small optimization based on @achow101’s suggestion. In BatchWrite, we also clear the flags when not erasing if the coin is unspent. This reduces the amount of flagged entries needed to be iterated in Sync to find the spent entries to delete. We can also throw a logic error now if we find a flagged entry in Sync that is not spent.
andrewtoth force-pushed
on Jun 3, 2024
DrahtBot removed the label
Needs rebase
on Jun 3, 2024
achow101
commented at 4:16 pm on June 3, 2024:
member
@achow101 I think this is what you might not be clear about. BatchWrite iterates the entire cache, but this patch does not. After this patch BatchWrite only iterates flagged entries, which is a much smaller subset of the cache.
Yes, that was what I was missing. It just wasn’t clear to me from the description that this was implemented as another performance improvement in addition to clearing dirty and spent UTXOs.
DrahtBot added the label
CI failed
on Jun 3, 2024
andrewtoth force-pushed
on Jun 3, 2024
DrahtBot removed the label
CI failed
on Jun 4, 2024
Would it make sense to use cacheCoins.try_emplace without the forward_as_tuple parts (similarly to what I've tried in a related PR), in a separate commit or PR?
Sorry, can you elaborate on what benefit that would have here, and any possible downsides? I’m not 100% sure what all the differences are between the two methods, so preferred to stick to status quo.
```diff
+     * the parent cache to do a batch write. This is a performance optimization
+     * compared to giving all entries in the cache to the parent and having the
+     * parent scan for only modified entries.
+     *
+     * FRESH entries are tracked because a FRESH-but-not-DIRTY entry can only be
+     * a spent coin that exists in the parent cache. When performing a batch
```
This comment and the implementation further persists the notion of FRESH-but-not-DIRTY coin, which I think is unfortunate, because it should be impossible for them to exist:
The only place in which they could be created is here, while fetching a coin from the parent that is spent. However that is not possible because we’d only get to that point if the GetCoin() call a few lines above returned true - which is impossible for spent coins because according to the abstract CCoinsView interface, GetCoin “Returns true only when an unspent coin was found, which is returned in coin.” This is followed both by CCoinsViewCache::GetCoin, which returns false if spent and CCoinsViewDB::GetCoin which doesn’t return spent coins because they are never saved to disk.
Maybe it would be good to revive #18746 which attempted to fix this but got closed.
With that, this PR could add only DIRTY coins and ignore the FRESH flag, which would be simpler.
With that, this PR could add only DIRTY coins and ignore the FRESH flag, which would be simpler.
I’m not sure how that would be simpler. The two flags are coupled and so adding one and not the other would be a more complex diff. Perhaps I don’t understand what you are suggesting?
I could just replace the single FRESH-but-not-DIRTY check with an assertion and remove this comment about tracking FRESH coins?
Yes, I thought some more and don’t find a great way to simplify the logic. I think it would be enough to remove / amend the comment, maybe mentioning instead that FRESH-but-not-DIRTY coins are not expected to occur but would be handled well if they did.
replace the single FRESH-but-not-DIRTY check
Not sure if that’s the “check” you mean, but replacing this with an assertion was essentially what #18746 did in its latest version. Since that PR was mildly controversial (I think mostly because it originally had the spin of “These coins may be unnecessarily added today; don’t add them anymore” instead of “We already don’t add them. Remove unreachable code”) I think the question of adding an assert should be separated from this PR.
I think that it should be explained in a comment similar as it’s already done in CCoinsViewCache::BatchWrite why we keep unspent coins because this is not obvious.
mzumsande
commented at 6:45 pm on June 27, 2024:
contributor
A non-pruned IBD with full dbcache to tip ended up using 12% more memory, but it was also 2.7% faster somehow.
I’m not sure if I understand this, is this still the case?
Why would a node use more memory after this pull? Shouldn’t the 2 added pointers be included in the dbcache accounting, resulting in more frequent flushes due to the cache filling up faster, but not in an increase in memory when the cache is full?
l0rinc
commented at 7:36 pm on June 27, 2024:
contributor
Also, potential improvements like #28945 will make up for it.
andrewtoth
commented at 7:40 pm on June 27, 2024:
contributor
A non-pruned IBD with full dbcache to tip ended up using 12% more memory, but it was also 2.7% faster somehow.
I’m not sure if I understand this, is this still the case? Why would a node use more memory after this pull? Shouldn’t the 2 added pointers be included in the dbcache accounting, resulting in more frequent flushes due to the cache filling up faster, but not in an increase in memory when the cache is full?
This is no longer the case. When I did the initial benchmarks the full UTXO set could be stored in memory, and so a full IBD on mainnet would be 14.7 GB for master and 16.5 GB for this pull. Today both master and this pull will hit the max limit and flush at some point, so their memory usage should be near identical. You are correct that if the dbcache limit is set below the maximum UTXO set size then this pull will not use more memory.
For clarity, this pull only uses more memory if the entire IBD can occur without a single flush. What I meant in the description is this pull uses more memory when the entire utxo set is stored in memory and never flushed.
This pull could use more memory again on mainnet if #28358 is merged, and this pull will use more memory on other chains than mainnet if the UTXO size is smaller than the dbcache setting.
andrewtoth
commented at 7:44 pm on June 27, 2024:
contributor
Also, potential improvements like #28945 will make up for it.
In my recent benchmarks it was again slightly faster on this branch for default dbcache and no pruning. So I think this PR is essentially the same.
in src/coins.h:122 (commit 0a2bc92b0c, outdated)
```diff
+     * compared to giving all entries in the cache to the parent and having the
+     * parent scan for only modified entries.
+     *
+     * FRESH entries are tracked because a FRESH-but-not-DIRTY entry can only be
+     * a spent coin that exists in the parent cache. When performing a batch
+     * write but not clearing the cache (CCoinsViewCache::Sync), we must also
```

```diff
+     * These are used to create a doubly linked list of flagged entries.
+     * They are set in AddFlags and unset in ClearFlags.
+     * A flagged entry is any entry that is either DIRTY, FRESH, or both.
+     *
+     * DIRTY entries are tracked so that only modified entries can be passed to
+     * the parent cache to do a batch write. This is a performance optimization
```
Is this really obvious though that we need to clear flags before destruction? Normally flags wouldn’t affect destroying an object, but in this case they also tie the object to a linked list.
No, but it’s in a destructor, calling the ClearFlags method - I didn’t see what the comment added beyond what the code already explained well.
It could explain why you chose this style, or what alternatives you’ve considered, or something else that the code cannot, but this just seemed redundant to me.
Probably not a realistic scenario (may require multithreaded access because of the m_flags guard, not sure), but m_prev will never be null when we get here, right?
Yes, this code is single threaded so not possible to get past the previous line if m_prev is null.
It is possible for m_prev to point to a pair that has been freed if the linked list head is destroyed, which is why cacheCoins must always be destroyed before the linked list head.
in
src/test/coinscachepair_tests.cpp:168
in
6508bdfe43 (outdated)
163+ // Check that we can clear flags then re-add them
164+ n1.second.ClearFlags();
165+ BOOST_CHECK_EQUAL(n1.second.GetFlags(), 0);
166+ BOOST_CHECK_EQUAL(n2.second.Next(), nullptr);
167+
168+ // Check that calling `ClearFlags` with 0 flags has no effect
197+
198+ inline uint8_t GetFlags() const { return m_flags; }
199+
200+ //! Get the next entry in the flagged linked list, optionally removing the
201+ //! current entry and clearing the flags.
202+ inline CoinsCachePair* Next(bool clear_flags = false)
It’s mostly convenience. Otherwise to clear flags while iterating we would have to bind m_next to another variable first, then call ClearFlags on the current pair, and then assign the other variable holding the old m_next to the current pair variable.
Maybe, but now the method returns AND mutates, that’s a confusing combination.
I would expect it to be either void and mutate or return and not change internal state.
Can we maybe move the mutation out to the call-site?
The std lib uses this pattern though, with std::unordered_map::erase. It mutates and returns next. Surely we should be following std lib conventions, no?
165+ }
166+
167+ //! Adding a flag also requires a self reference to the pair that contains
168+ //! this entry in the CCoinsCache map and a reference to the head of the
169+ //! flagged pair linked list.
170+ inline void AddFlags(uint8_t flags, CoinsCachePair& self, CoinsCachePair& head)
I find it quite confusing that AddFlags modifies the linked list as well (vs GetFlags only returning the flags, of course). Could make sense to split it to multiple methods or add the side-effect to the name
We want to tie adding a flag to adding the entry to the linked list. We don’t want to have an interface that lets something add a flag without adding it to the linked list. I think splitting that functionality into multiple methods would be dangerous.
ClearFlags is similar. Does it matter to external callers how these methods modify the linked list? That should be transparent to callers. They just know that calling Next() from the head gets all flagged entries. The how should not be exposed IMO.
Could the overall intent be better expressed here, something like SetCacheStateAndLink/ResetCacheStateAndUnlink or Sync/Reset or similar?
It’s obviously doing more after this commit than what it announces…
in
src/test/coinscachepair_tests.cpp:43
in
6508bdfe43 (outdated)
38+ CoinsCachePair head;
39+ auto nodes{CreatePairs(head)};
40+
41+ // Check iterating through pairs is identical to iterating through a list
42+ auto lit{nodes.begin()};
43+ for (auto it{head.second.Next()}; it != nullptr; it = it->second.Next(/*clear_flags=*/false), ++lit) {
Since iteration could misbehave severely and this would pass (e.g. if Next() would return nullptr we wouldn’t even enter the loop), it could make more sense to iterate over the target nodes instead, something like:
Hmm well the AddFlags function is in production code, so this is constructing the linked list via production code. The std::list we create is a test harness helper that we use to check that our linked list behaves the same.
in
src/test/coinscachepair_tests.cpp:70
in
6508bdfe43 (outdated)
It just checks that removing the last element sets head to nullptr. I suppose just for completeness, since this case is the same as when we delete n4 above.
doing refactoring and logic changes in the same commit makes reviewing slightly harder - could you extract all non-critical changes in a separate commit so that we can concentrate on high-risk code more than on stylistic changes (which are important, but shouldn’t distract us)
this it->second.GetFlags() & CCoinsCacheEntry::DIRTY appears quite often, maybe it would make sense to extract it to it->second.IsDirty() (and it->second.IsFresh(), of course)
l0rinc
commented at 4:37 pm on June 28, 2024:
contributor
Fantastic change, would be helpful if you could make it a bit more digestible by separating it into smaller commits (especially splitting high-risk changes from low-risk refactoring), explaining the changes in commit messages.
in
src/coins.h:127
in
6ed20408df (outdated)
123+ * erase all spent entries. Since only DIRTY or FRESH entries can be spent,
124+ * we can track them in the linked list to erase them. This is a performance
125+ * optimization compared to scanning all cache entries to find any that are
126+ * spent-but-not-DIRTY.
127+ */
128+ CoinsCachePair* m_prev{nullptr};
I don’t fully understand why we need m_prev here, but have you investigated whether it’s possible to get rid of it and rely on m_next only?
Sometimes we still have a reference to both (current and next, so we wouldn’t need the next’s prev), but I assume that wasn’t always the case, right?
What prohibits us from accessing the previous ones in constant time during iteration?
Nothing during iteration. But we need to delete entries randomly when a FRESH coin is spent, which is not during any iteration.
andrewtoth
commented at 5:56 pm on June 28, 2024:
contributor
Fantastic change, would be helpful if you could make it a bit more digestible by separating it into smaller commits (especially splitting high-risk changes from low-risk refactoring), explaining the changes in commit messages.
Thank you @paplorinc! I did try initially to separate the changes more. However, I found the third commit has to have all those changes done in one commit otherwise the test-all-commits CI job will fail. Changing only parts of it can break the tests, and break building the fuzz targets. I will explore to see if I can break it down any further and keep CI happy.
l0rinc
commented at 7:22 pm on June 28, 2024:
contributor
Changing only parts of it can break the tests
I usually try to recreate every line, after I have all the changes by interactively rebasing the current state and adding small refactoring under the current commit.
E.g. these could be separate commits doing a single thing:
renaming bool erase to bool will_erase
changing it->second.flags usages to it->second.GetFlags() (again, without change in behavior, but you could also add another commit after this which extracts the usages and returns it->second.IsFresh() instead, making commits after it easier to read)
changing it->second.flags |= to use it->second.AddFlags(CCoinsCacheEntry::DIRTY); (without any change in behavior, just doing it in a method)
changing iterators to the new linked list - without any other change in logic.
etc, I think you got the picture. By extracting these to separate trivial commits (which are followed by high risk changes, done in as few lines as possible without any refactorings in the way), we can concentrate on the difficult ones (which are a lot simpler and in separate commits themselves).
Thanks!
andrewtoth force-pushed
on Jun 29, 2024
andrewtoth force-pushed
on Jun 29, 2024
DrahtBot added the label
CI failed
on Jun 29, 2024
DrahtBot
commented at 3:02 pm on June 29, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot removed the label
CI failed
on Jun 29, 2024
andrewtoth
commented at 6:06 pm on June 29, 2024:
contributor
@mzumsande@paplorinc thank you very much for your reviews! I’ve rebased and updated with all your suggested changes.
I also updated the commits for easier review. First, flags is encapsulated whenever used, then made private. Then linked list head and self references are added to AddFlags, followed by AddFlags and ClearFlags actually maintaining a linked list of entries. Lastly, the linked list is used for BatchWrite instead of passing the entire map of entries.
The flags seem like an implementation detail to me which we might not want to expose (we could have stored it in two booleans as well and the outside behavior shouldn’t change), so usages such as:
It isn’t, because entry is constructed on the line above, and we know from the default constructor that flags will be initialized to 0. So `0 | flags` is the same as `flags`.
in
src/test/coinscachepair_tests.cpp:66
in
f5443052bf (outdated)
55+ // Check that head is empty
56+ BOOST_CHECK_EQUAL(head.second.Next(), nullptr);
57+
58+ // Delete the nodes from the list to make sure there are no dangling pointers
59+ for (auto it{nodes.begin()}; it != nodes.end(); it = nodes.erase(it)) {
60+ BOOST_CHECK_EQUAL(it->second.GetFlags(), 0);
so this basically inserts “in the middle” of the linked list all the time, i.e. not at the beginning or the end, right?
I also find that quite confusing, i.e.
head <-> newest node <-> second newest <-> third newest <-> oldest node -> nullptr
what if we iterated the initialized nodes backwards so that they reflect the order of insertion, something like:
l0rinc
commented at 9:02 pm on June 29, 2024:
contributor
Added a few more comments, it’s a lot more understandable to me now - though I’m still lacking a few important details, I’ll investigate those next week.
Thanks for all your effort!
in
src/coins.h:127
in
c36363f6b2 (outdated)
121+ * mean a spent coin exists in the parent CCoinsView and not in the child
122+ * CCoinsViewCache. Nevertheless, if a spent coin is retrieved from the
123+ * parent cache, the FRESH-but-not-DIRTY coin will be tracked by the linked
124+ * list and deleted when Sync or Flush is called on the CCoinsViewCache.
125+ */
126+ CoinsCachePair* m_prev{nullptr};
186- it != mapCoins.end();
187- it = erase ? mapCoins.erase(it) : std::next(it)) {
188+bool CCoinsViewCache::BatchWrite(CoinsCachePair *pairs, const uint256 &hashBlockIn, bool will_erase) {
189+ for (auto it{pairs};
190+ it != nullptr;
191+ it = it->second.Next(/*clear_flags=*/will_erase || !it->second.coin.IsSpent())) { // Keep flags for spent coins so they can be erased in Sync
nit: it seems to me we’re really pushing it with the loop iteration + deletion here, could we do the deletion at the end of the loop instead (like in src/txdb.cpp) to have a cleaner separation or the responsibilities? It may make the Next method cleaner as well (I know it was similar before as well).
l0rinc
commented at 2:18 pm on June 30, 2024:
contributor
This kept me up at night, we should be able to get rid of m_prev by doing insertions and deletions slightly differently:
we’d have an immutable SENTINEL value always serving as the last element (instead of nullptr);
insertion would always be in front of the previous element, which would serve as the new head (iterating from which will always get us to the SENTINEL);
deletion would copy the content (node, flag and next) of the next node instead (unless it’s a SENTINEL), and dispose of that instead.
Something like this:
diff --git a/src/coins.h b/src/coins.h
--- a/src/coins.h	(revision c36363f6b24c7ab2afe198d9855f507ddf096e1f)
+++ b/src/coins.h	(date 1719755615706)
@@ -123,11 +123,11 @@
      * parent cache, the FRESH-but-not-DIRTY coin will be tracked by the linked
      * list and deleted when Sync or Flush is called on the CCoinsViewCache.
      */
-    CoinsCachePair* m_prev{nullptr};
     CoinsCachePair* m_next{nullptr};
     uint8_t m_flags{0};

 public:
+    static CoinsCachePair SENTINEL;
     Coin coin; // The actual cached data.

     enum Flags {
@@ -154,22 +154,19 @@
     //! Adding a flag also requires a self reference to the pair that contains
     //! this entry in the CCoinsCache map and a reference to the head of the
     //! flagged pair linked list.
-    inline void AddFlags(uint8_t flags, CoinsCachePair& self, CoinsCachePair& head)
+    inline void AddFlags(uint8_t flags, CoinsCachePair& next)
     {
         if (!m_flags && flags) {
-            m_prev = &head;
-            m_next = head.second.m_next;
-            if (m_next) m_next->second.m_prev = &self;
-            head.second.m_next = &self;
+            m_next = &next;
         }
         m_flags |= flags;
     }
     inline void ClearFlags()
     {
-        if (m_flags) {
-            if (m_next) m_next->second.m_prev = m_prev;
-            m_prev->second.m_next = m_next;
-            m_prev = m_next = nullptr;
+        if (m_flags && m_next != &SENTINEL) {
+            coin = std::move(m_next->second.coin);
+            m_flags = m_next->second.m_flags;
+            m_next = m_next->second.m_next;
         }
         m_flags = 0;
     }
What do you think?
andrewtoth
commented at 2:50 pm on June 30, 2024:
contributor
we should be able to get rid of m_prev by doing insertions and deletions slightly differently
But we can’t move the coin from the next entry to us. The coinsCache map will still point to the next entry for that outpoint.
l0rinc
commented at 3:06 pm on June 30, 2024:
contributor
The coinsCache map will still point to the next entry for that outpoint
Yeah, but we can update the map in constant time, right? Do you think that would measurably slow down execution?
Could the memory savings of one less pointer allow us to increase cache hit rate, or am I completely off here?
andrewtoth
commented at 3:13 pm on June 30, 2024:
contributor
Using a singly linked list would be better, agreed. I just don’t see how this design would work. How do we clear the flags but keep the coin? That is the purpose of this patch, to not delete an entry after clearing the flags.
andrewtoth
commented at 3:16 pm on June 30, 2024:
contributor
Oh wait, we only need random access for deletion! Otherwise we can clear the flags on iteration!
Brilliant! I think this could work then.
andrewtoth
commented at 9:24 pm on June 30, 2024:
contributor
@paplorinc was trying to implement your suggestion, but it’s much too complex I think.
How do we delete the node that points to the sentinel? If we move the sentinel into the node and reassign sentinel to that node, we would still need to erase that node from the coinsCache map. We could extract the node handle from the map and have the CCoinsViewCache own that, but this seems like far too much code being changed. I don’t think there’s that much benefit for removing one pointer per entry. We would also have to extract the node from the map after the next node moves into it, and then reinsert the node with the next node’s original outpoint.
sipa
commented at 9:28 pm on June 30, 2024:
member
@andrewtoth Just high-level brainstorming, and I really don’t want to push this PR into something more complex than necessary.
But would it be possible to have a map from outpoints to entries, with a singly-linked list through the entries, such that *(map[key].m_prev) stores the data for key key, rather than map[key] itself?
andrewtoth
commented at 9:49 pm on June 30, 2024:
contributor
But would it be possible to have a map from outpoints to entries, with a singly-linked list through the entries, such that *(map[key].m_prev) stores the data for key key, rather than map[key] itself?
@sipa Sorry I’m not sure I follow. What does *(map[key].m_prev) store?
sipa
commented at 10:18 pm on June 30, 2024:
member
So the data structure we want has (a) an index on COutPoint and (b) a single-linked list that stores an iteration order. The issue (I think, I haven’t followed in detail) is that we need the ability to delete entries found through the (a) index, and if we naively do that, the entry that (b) points to the deleted entry will have its prev pointer invalidated.
My suggestion is “shifting” the values of the map by one. map[key] stores the UTXO for the entry that is (b)-after key, so the UTXO for outpoint key is found in *(map[key]->m_prev) instead. This means an extra indirection on lookup (which may make the whole thing already not worth it from a performance perspective), but it does mean entries can be deleted (to delete key, do { auto to_del = map[key]->m_prev; map[key]->m_prev = to_del->m_prev; delete(to_del); }.
andrewtoth
commented at 10:42 pm on June 30, 2024:
contributor
@sipa thank you for the suggestion! I think it’s missing the fact that we want entries in the map to possibly not belong to the linked list. So we want to clear the linked list after a non-erasing flush, but leave the map intact otherwise. I don’t think that’s possible with just this data structure. We would need 2 parallel maps, one with unflagged entries and one with dirty entries. I’m not sure if that would be more performant with 2 lookups for every access? I think if we have 2 maps we can also just get rid of the linked list entirely.
You did mention the 2 maps idea a while ago #15265 (comment) and @martinus had a similar idea earlier in this PR discussion.
sipa
commented at 10:45 pm on June 30, 2024:
member
@andrewtoth You can have 1 map, with 2 sentinels, each being the end of one linked list (one for flagged entries, one for unflagged ones)… but yeah, fair point, this perhaps complicates matters even more. And the performance penalty of this idea is probably not insignificant either.
andrewtoth force-pushed
on Jul 1, 2024
andrewtoth force-pushed
on Jul 1, 2024
in
src/test/coinscachepair_tests.cpp:165
in
5f753defe5 (outdated)
147+ BOOST_CHECK_EQUAL(n1.second.Next(), nullptr);
148+ BOOST_CHECK_EQUAL(head.Next(), &n1);
149+
150+ // Check that adding FRESH flag on new node inserts it after head
151+ n2.second.AddFlags(CCoinsCacheEntry::FRESH, n2, head);
152+ BOOST_CHECK_EQUAL(n2.second.GetFlags(), CCoinsCacheEntry::FRESH);
But since these are unit tests we are kind of testing the internals, so it would be good to catch if any other flags sneak in. With IsFresh it just checks that the FRESH flag is set, but any other flag could be set as well.
in
src/coins.cpp:274
in
670084adc5 (outdated)
284- ++it;
285+ cacheCoins.erase(it->first);
286+ } else if (fOk) {
287+ /* BatchWrite must unset all unspent entries when will_erase=false. */
288+ throw std::logic_error("Not all unspent flagged entries were unset");
289 }
Since inline substitution is unobservable in the standard semantics, compilers are free to use inline substitution for any function that’s not marked inline, and are free to generate function calls to any function
marked inline.
Does it make sense to check what the main compilers do in this case (the ones I checked via godbolt automatically inline), or was that just a waste of time, since seemingly unrelated changes could change the outcome (but if that’s the case, shouldn’t we let the compiler decide)?
I think not inlining would mean I would also have to add these methods to the implementation file instead of the header.
Since you’re checking assembly, is there any difference if I add noexcept to these methods? I can’t find a definitive answer on if that will make any difference.
It allows the function to appear in multiple compilation units (which makes it mandatory here). Templated entities automatically have this behavior (with or without explicit inline).
It suggests to the compiler that its implementation should be inlined in call sites. Because this inlining has no observable effect on compiler behavior, compilers can do this (or not do this) with or without inline. It has become less important as compilers are smart enough to make this decision themselves, but not entirely gone.
@andrewtoth The C++ Core Guidelines suggest adding noexcept whenever you know a function won’t throw. Bitcoin Core treats failure to allocate memory as an unrecoverable condition, so having dynamic memory allocations in a function is not a reason not to add it.
It may or may not have a performance benefit (especially when it’s a function called from a different compilation unit - and we’re not compiling with LTO enabled, or in default/move constructors of types used inside STL containers which may pick more efficient algorithm variants in that case), but even if it doesn’t, it’s useful as human documentation that a function isn’t expected to throw.
That also holds for inline in my opinion, actually: it documents that the author intends for a function to be inlined, whether or not the compiler actually does that (at -O0 it won’t ever inline, at -O2 all tiny functions probably will).
@sipa thank you for the explanation.
@paplorinc that was an example where it made a difference to performance that came to my mind, yes. I was just curious whether there would be any performance benefit of adding noexcept on these methods. If the asm generated with or without is the same then the answer is no. But the advice for adding them for human documentation makes sense.
l0rinc
commented at 9:24 am on July 1, 2024:
contributor
utACK 670084adc53e3d661ea5b4a19743659609a42d5e, will run the benchmarks in the following weeks, thanks for addressing the suggestions.
DrahtBot requested review from jamesob
on Jul 1, 2024
andrewtoth force-pushed
on Jul 3, 2024
andrewtoth force-pushed
on Jul 3, 2024
in
src/coins.h:142
in
9d539b7451 (outdated)
135 @@ -136,7 +136,10 @@ struct CCoinsCacheEntry
136     CCoinsCacheEntry() noexcept = default;
137     explicit CCoinsCacheEntry(Coin&& coin_) noexcept : coin(std::move(coin_)) {}
138
139-    inline void AddFlags(uint8_t flags) noexcept { m_flags |= flags; }
140+    //! Adding a flag also requires a self reference to the pair that contains
141+    //! this entry in the CCoinsCache map and a reference to the head of the
142+    //! flagged pair linked list.
143+    inline void AddFlags(uint8_t flags, CoinsCachePair& self, CCoinsCacheEntry& head) noexcept { m_flags |= flags; }
Yeah I suppose we can rename it m_flagged_sentinel or something? Do you have a suggestion that would have made this clearer if that was the name originally?
we have “sentinels” on both ends now, right? A dummy at the beginning and a nullptr on the end. This is what I found confusing in the test, why not store the actual head and have a nullptr or sentinel at the end only? I haven’t investigated this, maybe I’m completely off again…
we have “sentinels” on both ends now, right? A dummy at the beginning and a nullptr on the end.
Yes, that sounds right.
why not store the actual head and have a nullptr or sentinel at the end only?
I found it was more code to keep track of a changing head in the CCoinsViewCache, instead of a dummy sentinel I called m_flagged_head that remained static and all insertions happened after the sentinel.
If we tracked the actual head, we would have to assign the latest iterator added to be head for each call, and then on deletion we would have to check if it was head as well and then reassign head to the next entry. That last part couldn’t be done without access to the coinsCache, so we couldn’t just add ClearFlags into the destructor of the entry.
I think I will refrain from adding these annotations for now, since they are more than just documentation and may have side effects in the generated code. I might experiment with a separate commit to see if they affect performance.
l0rinc
commented at 6:19 pm on July 3, 2024:
contributor
I ran the benchmarks without prune (on the previous push), with just a lazy -reindex or -reindex-chainstate, but couldn’t measure any difference - do I need more than 200k blocks? Or is prune critical here? Or is lazy reindexing not enough?
andrewtoth
commented at 6:34 pm on July 3, 2024:
contributor
I ran the benchmarks without prune (on the previous push), with just a lazy -reindex or -reindex-chainstate, but couldn’t measure any difference - do I need more than 200k blocks? Or is prune critical here? Or is lazy reindexing not enough?
It shouldn’t have much effect on non-pruning nodes, so it’s good that there’s no regression.
You will need pruning enabled to see the performance benefits. The higher the dbcache and lower the prune settings the better. The higher the block height the better.
l0rinc
commented at 6:35 pm on July 3, 2024:
contributor
You will need pruning enabled to see the performance benefits
That requires a bit more setup to be able to do it offline
andrewtoth
commented at 6:43 pm on July 3, 2024:
contributor
That requires a bit more setup to be able to do it offline
I don’t think it’s possible to reindex the chainstate of a pruned node, since that requires having all blocks. So you will need to redownload blocks to benchmark this. In my setup I have another fully synced non-pruned node on a separate machine in my local network and I only connect the node being benchmarked to that other node.
In commit “refactor: encapsulate flags setting with AddFlags and ClearFlags”
I don’t think this test helps; testing a value is likely slower than just overwriting it. Testing with godbolt does seem like it doesn’t get optimized out, which surprises me a bit.
EDIT: I see this matters in a future commit. Feel free to ignore, or move the conditional there.
in
src/test/fuzz/coins_view.cpp:128
in
4890c85efd (outdated)
121 @@ -122,12 +122,16 @@ FUZZ_TARGET(coins_view, .init = initialize_coins_view)
122 random_mutable_transaction = *opt_mutable_transaction;
123 },
124 [&] {
125+ // flagged_head must go out of scope and be destroyed after coins_map.
126+ // Any remaining flagged entries in coins_map need to reference the head
127+ // when they are destroyed.
128+ CCoinsCacheEntry flagged_head;
If I can suggest a slightly different linked-list design, which would avoid some conditionals.
A doubly-linked, circular, list. The sentinel is both the first and the last element of the list of flagged entries. Every non-flagged entry is also a linked list, containing just itself. In this case, adding/removing from the flagged list is just concatenating/splicing linked lists, which can be done without conditions/edge cases.
CCoinsCacheEntry() noexcept
{
    ...
    m_prev = this;
    m_next = this;
}

/** Insert this after other. */
void InsertAfter(CCoinsCacheEntry* other) noexcept
{
    auto other_next = other->m_next;
    std::swap(other->m_next, m_prev->m_next);
    std::swap(m_prev, other_next->m_prev);
}

/** Insert this before other. */
void InsertBefore(CCoinsCacheEntry* other)
{
    auto other_prev = other->m_prev;
    std::swap(other->m_prev, m_next->m_prev);
    std::swap(m_next, other_prev->m_next);
}

/** Make this a singleton list. Can be done on entries that are already singletons. */
void TakeOut()
{
    auto old_prev = m_prev;
    std::swap(m_next->m_prev, m_prev);
    std::swap(old_prev->m_next, m_next);
}
Actually, even simpler. By delaying the setting of m_next = this and m_prev = this to just before insertion, inlining that, and not bothering with getting m_next and m_prev correct for taken-out elements:
Also, m_prev and m_next would have to be CoinsCachePairs, not CCoinsCacheEntrys, so we can’t set the pointers in the constructor since we don’t have a pair.
Done! Thank you for your review, I think it’s pretty clean now. I just had to add a SelfRef method on CCoinsCacheEntry for the sentinel to init its pointers to itself.
Also, we need to not call Next() randomly on any CCoinsCacheEntry, only from the sentinel and then iterating through. This isn’t done anywhere in the patch, but it’s not protected against if anyone does. Now instead of returning a nullptr if it’s not in the linked list it could potentially return a pointer to an already freed entry.
davidgumberg
commented at 9:53 pm on August 1, 2024:
we need to not call Next() randomly on any CCoinsCacheEntry, only from the sentinel and then iterating through. This isn’t done anywhere in the patch, but it’s not protected against if anyone does. Now instead of returning a nullptr if it’s not in the linked list it could potentially return a pointer to an already freed entry.
Feel free to disregard: Would it make sense for Next()/Prev() to add an Assume(m_flags) or to do if(m_flags) { return m_next; } to protect against the dangling reference?
andrewtoth
commented at 4:21 am on August 2, 2024:
Added Assume(m_flags), thank you for the suggestion @davidgumberg! I did have to add m_flags = DIRTY; to the SelfRef method so that we can call Next on the sentinel to get the first entry.
andrewtoth force-pushed
on Jul 7, 2024
in
src/coins.h:144
in
4cc0f95c40 (outdated)
140+ //! Adding a flag also requires a self reference to the pair that contains
141+ //! this entry in the CCoinsCache map and a reference to the sentinel of the
142+ //! flagged pair linked list.
143+ inline void AddFlags(uint8_t flags, CoinsCachePair& self, CoinsCachePair& sentinel) noexcept
144+ {
145+ assert(&self.second == this);
I see there’s also an assert(!coin.IsSpent()); at the beginning of AddCoin, which would also be in the hot path. When should we use assert vs Assume then?
In commit “coins: track flagged cache entries in linked list”
This doesn’t matter much, but if you’d swap all prevs/nexts here, the entry would get inserted at the end of the list rather than the beginning, making the list be ordered old-to-new. Unless there is a reason to write things out from new to old?
In commit “coins: track flagged cache entries in linked list”
Would it make sense to add code to CCoinsViewCache’s destructor to explicitly wipe the map, and then test that m_sentinel.m_next == &m_sentinel in an Assume or so? To ascertain that no deleted CCoinsCacheEntrys remain in the list?
274- // FRESH/DIRTY flags of any coin that isn't spent.
275- for (auto it = cacheCoins.begin(); it != cacheCoins.end(); ) {
276+ bool fOk = base->BatchWrite(m_sentinel, hashBlock, /*will_erase=*/false);
277+ // Instead of clearing `cacheCoins` as we would in Flush(), just erase the
278+ // spent coins. All coins left in the linked list after BatchWrite should be spent.
279+ for (auto it{m_sentinel.second.Next()}; it != &m_sentinel; ) {
In commit “coins: pass linked list of flagged entries to BatchWrite”
It’s a bit unfortunate that this means iterating through spent entries twice; wouldn’t it be possible to let BatchWrite do the deletion too in the same loop? This will probably mean passing both cacheCoins and m_sentinel in.
I ended up doing it. I pass in sentinel and cachedCoinsUsage and delete the spent entries when not erasing.
I think calling erase with the outpoint only is slower than erasing with the iterator itself, so I only erase when calling Sync, and clear the map at the end of the loop if erase=true.
Hmm, it’s not at all obvious that this will be faster, as an erase by outpoint involves hashing and lookup in the hash table. It may be worth benchmarking the two options to make sure.
Another option is switching the coinscache map from std::unordered_map to a Boost multiindex, which can simultaneously maintain a hashtable and a linked list transparently without needing to hack our own linked list on top. Practically that would mean no issues due to lack of ability to convert from an entry pointer to an iterator.
Hmm, it’s not at all obvious that this will be faster, as an erase by outpoint involves hashing and lookup in the hash table. It may be worth benchmarking the two options to make sure.
Sorry, I may not have been clear. For a normal erasing flush I am not erasing by outpoint. I only erase the spent coins by outpoint when erase=false, because that’s the only way to find them in constant time.
Otherwise, for a normal erasing flush, we just call mapCoins.clear() at the end to wipe them all at once.
I think I have used the fastest options for each path.
I understand now. My point stands that both versions are doing something unnecessary, but I see now it’s already a strict improvement:
For erase=false, you erase spent coins while iterating, which involves a hashtable lookup, even though you already have a pointer to the element you’re erasing (and it’s just due to an interface limitation that this cannot be turned into an iterator that can be erased directly).
For erase=true, you still iterate a second time (inside mapCoins.clear()), even though again you already passed them once (this is very low in CPU cost, but if the cache is big enough to not fit in CPU caches, this will incur a second RAM latency cost).
Using a multiindex would permit getting rid of both, but let’s keep that for a follow-up, if anyone wants to investigate that.
Boost multi-index is headers-only, so it’s just a build-time dependency, not a runtime one. For the mempool we already realistically can’t avoid having something that’s functionally equivalent to boost multiindex, so I’m not too worried about adding more reliance on it. I do know @theuni is looking into having a simple ad-hoc replacement for it, though.
you still iterate a second time (inside mapCoins.clear()),
Hmm I added this in the latest revision because we now have the mapCoins in BatchWrite and could clear it.
Before this patch was just not erasing the entries and removing checking that it is empty in ReallocateCache. If the map is destroyed does it not essentially do a clear as well, so this is a wash whether we clear now or on destruction?
andrewtoth force-pushed
on Jul 7, 2024
andrewtoth force-pushed
on Jul 7, 2024
in src/coins.cpp:199
in 91c262bae9 (outdated)

```diff
-        if (!(it->second.flags & CCoinsCacheEntry::DIRTY)) {
+        if (!(it->second.IsDirty())) {
+            const auto next_entry{it->second.Next(/*clear_flags=*/true)};
+            if (!erase && it->second.coin.IsSpent()) {
+                childCachedCoinsUsage -= it->second.coin.DynamicMemoryUsage();
+                mapCoins.erase(it->first);
```
Yes, but only if !erase. This is what Sync was doing before I moved this into BatchWrite. We only erase the spent outpoint when we don’t want to clear the cache. This is what I already benchmarked to be much faster.
Err, not master, but the previous version of this PR that I force pushed over an hour ago.
In master it is looping through the entire cacheCoins map, and deleting spent entries. With this approach we do have to do a hash and lookup, but it’s constant time instead of linear.
Hmm, right. But so there are still two possible variants for the erase=false case?
(1) One where they are erased during BatchWrite, using outpoint-based erase.
(2) One where all the erasing is done inside Sync, and BatchWrite only does flag clearing.
(1) has a higher CPU cost (due to hashing and hash-table search), while (2) involves accessing entries’ memory twice. Which of the two is faster isn’t clear to me, as it’ll depend on CPU cache sizes etc.
Ah, but (2) necessarily also involves an outpoint-based erase, as you only want to erase flagged entries without iterating the non-flagged ones. Right.
(2) also requires an outpoint-based deletion, so this approach just moves that deletion into the same loop, instead of clearing unspent entries’ flags in the first loop and then looping through the spent entries again in a second loop and deleting them via outpoint.
Ok, but is there no C++-fu incantation that can get us from a std::pair<const COutPoint, CCoinsCacheEntry>* to a CCoinsMap::iterator? I can’t figure out a way to get that to compile but I’m sure there’s a way using void* or something.
I don’t seem to be able to get it to work. It may be that’s because the iterator doesn’t actually encapsulate an element pointer, but a pointer to where in the hashtable the entry exists or so. If so, then that just isn’t possible.
andrewtoth
commented at 5:11 pm on July 7, 2024:
contributor
Thank you for your thorough review @sipa! Will run some more extensive benchmarks that should be done in the coming weeks.
l0rinc
commented at 6:05 pm on July 8, 2024:
contributor
nice, I like this version a lot more.
I haven’t investigated it yet, but now that we have a circular list, does the double linking still offer a measurable advantage or could we do the same with a single link and maybe 2 iterations instead (and would it be worth it, if it’s possible)?
@paplorinc With only a single link, how would you delete arbitrary map entries? Barring large hacks like putting the data in the next entry rather than the entry which the key points to, I don’t see how you could delete entries in less than $\mathcal{O}(n)$ time.
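As a sketch of why the second pointer matters: with both neighbours reachable from the entry itself, unlinking an arbitrary node is O(1), whereas a singly linked list would need an O(n) walk from the head to find the predecessor. This is a hypothetical minimal intrusive list, not the PR’s actual CCoinsCacheEntry code:

```cpp
#include <cassert>

// Minimal intrusive doubly linked list node (hypothetical; the PR stores
// similar prev/next pointers inside each cache entry, with a sentinel).
struct Node {
    Node* m_prev{this};
    Node* m_next{this};

    // Append this node just before the sentinel (end of the list).
    void LinkBefore(Node& sentinel)
    {
        m_prev = sentinel.m_prev;
        m_next = &sentinel;
        m_prev->m_next = this;
        sentinel.m_prev = this;
    }

    // O(1) removal: both neighbours are in hand, so no traversal is needed.
    // With only an m_next pointer we could not patch the predecessor's link
    // without an O(n) walk from the head.
    void Unlink()
    {
        m_prev->m_next = m_next;
        m_next->m_prev = m_prev;
        m_prev = m_next = this;
    }
};
```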
l0rinc
commented at 7:07 pm on July 8, 2024:
contributor
With only a single link, how would you delete arbitrary map entries?
My naive thinking was that maybe we don’t actually need random-element deletion. Based on how ClearFlags and erase are currently called from loops that already have the previous node, we could often call ClearFlags(*prev).
The destructor is trickier. Maybe we can handle this in the CCoinsViewCache destructor instead, batch-destructing entries there, which would involve a single extra $\mathcal{O}(n)$ traversal.
I’m sure this has some other disadvantages (if it works at all), which is why I’m asking if it could potentially work or not.
@paplorinc When UTXOs are spent while FRESH, they get deleted from the cache; this is quite possibly the most significant performance optimization Bitcoin Core has during IBD, as it avoids the need for those UTXOs to ever end up on disk and makes space for new ones. It’s done based on COutPoint lookup, not during iteration.
While it’s not impossible to instead do this in a cleanup step separate from normal operation, I think that would come with a significant performance cost, as it would need frequent iterations over all UTXOs to discover these to-be-deleted entries (not just flagged ones, like this PR does).
l0rinc
commented at 7:19 pm on July 8, 2024:
contributor
andrewtoth
commented at 3:12 am on July 9, 2024:
contributor
Added a CoinsViewCacheCursor struct to encapsulate the iteration of the linked list when passed up to BatchWrite.
andrewtoth force-pushed
on Jul 9, 2024
DrahtBot added the label
CI failed
on Jul 9, 2024
DrahtBot
commented at 3:13 am on July 9, 2024:
contributor
🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the
documentation.
Possibly this is due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot removed the label
CI failed
on Jul 9, 2024
in src/coins.cpp:259
in adcccaae6b (outdated)

```diff
+    bool fOk = base->BatchWrite(cursor, hashBlock, /*erase=*/true);
     if (fOk) {
-        if (!cacheCoins.empty()) {
-            /* BatchWrite must erase all cacheCoins elements when erase=true. */
-            throw std::logic_error("Not all cached coins were erased");
-        }
```
Which test should I run to check?
I tried src/test/test_bitcoin --run_test=coins_tests and make check; they all passed when I commented out the throw. Also, code coverage (I know it’s incomplete, but still) indicates this line is not exercised.
```diff
@@ -93,7 +93,7 @@ void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possi
     //
     // If the coin doesn't exist in the current cache, or is spent but not
     // DIRTY, then it can be marked FRESH.
-    fresh = !(it->second.flags & CCoinsCacheEntry::DIRTY);
+    fresh = !(it->second.IsDirty());
```

```diff
         // the calling code.
         throw std::logic_error("FRESH flag misapplied to coin that exists in parent cache");
     }

-    if ((itUs->second.flags & CCoinsCacheEntry::FRESH) && it->second.coin.IsSpent()) {
+    if ((itUs->second.IsFresh()) && it->second.coin.IsSpent()) {
```
in which case Begin and Next modify the state but don’t actually return anything.
Or even better, we could get rid of CoinsCachePair* m_curr{nullptr}; (Begin would also become const) and add it to Next(CoinsCachePair* current, bool erase), and the usage would simply become cursor.Next(it, erase) instead.
nit: you haven’t touched this line, but Sonar complains that: "std::move" should not be called on an object of a type with neither move constructor nor move-assignment operator. - ignore it, if it’s unrelated (the rest from https://corecheck.dev/bitcoin/bitcoin/pulls/28280 are also not very useful, but I can’t judge the severity of this one)
andrewtoth
commented at 3:43 am on August 2, 2024:
Here it says an explicit move constructor does not allow certain optimizations. So this is likely complaining due to the only move constructor being explicit. I’m not sure what the implications are exactly for removing the explicit keyword here.
… but, I think inside that constructor, the attempt to construct the coin member of CCoinsCacheEntry using a move is ineffective, because there is no move constructor for Coin.
Note that CCoinsCacheEntry::CCoinsCacheEntry(Coin&&) is not a move constructor at all because the argument type does not match the type being constructed. I think the warning from Sonar is unrelated to this constructor.
My thinking is we should make sure Coin has a move constructor.
l0rinc
commented at 2:55 pm on July 9, 2024:
contributor
The benchmarks indicate that BlockToJsonVerboseWrite was slowed down
I ran a before/after against master, found only a minor (4%) diff - is this expected?
make -j10 && ./src/bench/bench_bitcoin --filter=BlockToJsonVerboseWrite --min-time=10000
before:

| ns/op | op/s | err% | total | benchmark |
|---:|---:|---:|---:|:---|
| 27,574,397.06 | 36.27 | 0.2% | 11.03 | `BlockToJsonVerboseWrite` |
| 27,586,627.40 | 36.25 | 0.1% | 11.01 | `BlockToJsonVerboseWrite` |
| 27,638,839.92 | 36.18 | 0.2% | 10.99 | `BlockToJsonVerboseWrite` |

after:

| ns/op | op/s | err% | total | benchmark |
|---:|---:|---:|---:|:---|
| 28,573,375.00 | 35.00 | 0.2% | 11.06 | `BlockToJsonVerboseWrite` |
| 28,531,835.59 | 35.05 | 0.1% | 11.08 | `BlockToJsonVerboseWrite` |
| 28,602,410.56 | 34.96 | 0.1% | 11.09 | `BlockToJsonVerboseWrite` |
andrewtoth force-pushed
on Jul 9, 2024
andrewtoth
commented at 9:29 pm on July 9, 2024:
contributor
@paplorinc thank you again for your review! I addressed all your nits I think. I don’t think that benchmark is related, could just be noise.
l0rinc
commented at 10:25 am on July 10, 2024:
contributor
I don’t think that benchmark is related, could just be noise.
Indeed, after your rebase the speed seems to be the same as on master.
in src/coins.h:285
in dddb59a63d (outdated)

```diff
@@ -253,8 +281,8 @@ class CCoinsView
     virtual std::vector<uint256> GetHeadBlocks() const;

     //! Do a bulk modification (multiple Coin changes + BestBlock change).
-    //! The passed mapCoins can be modified.
-    virtual bool BatchWrite(CCoinsMap &mapCoins, const uint256 &hashBlock, bool erase = true);
+    //! The passed cursor is used to iterate through the coins.
+    virtual bool BatchWrite(CoinsViewCacheCursor &cursor, const uint256 &hashBlock, bool erase = true);
```
Ahh yes, this comment is removed in this patch, and a variation of it should actually probably go right here. Perhaps I should do this in a follow-up to not invalidate your ACK?
variation of it should actually probably go right here
Maybe, but I would still be surprised, regardless of the context, if not X led to X. Could the parameter name explain it better than “erase”, so that the reviewer isn’t surprised that it means “keep”?
invalidate your ACK
I expect that I will have to review and ack your changes many more times before it’s merged, don’t worry about it!
So why is erase == true || it->second.coin.IsSpent() == true an erasure in one place, but erase == false && current.second.coin.IsSpent() == true an erasure in the other method? Shouldn’t the Next method rather be something like this instead:
When erase = true, we will be erasing the entire map at the end of BatchWrite. When erase = false, we want to keep all unspent coins in the map and have their flags cleared. This also means we want to erase only the spent coins when erase = false.
While looping through CCoinsMap::iterators, we could just do it = erase ? mapCoins.erase(it) : std::next(it), and then delete all spent coins and clear their flags right after we return from BatchWrite when erase = false.
Now that we are looping through CoinsCachePairs though, we can’t call mapCoins.erase(it) because a CoinsCachePair can not be coerced back to a CCoinsMap::iterator. So, instead of erasing while we iterate when erase = true, we just keep all entries and call mapCoins.clear() after we return from BatchWrite.
But, as a performance optimization for the case where erase = false, we can also delete the spent coins in the same loop so we don’t have to loop through them again when we return. So in this case we can erase by the map key instead of the map iterator. This is slower because we have to hash the key and lookup the entry in the map, whereas with the iterator we could just delete it directly.
When erase = true we don’t want to erase by the key, because it is slower than calling clear at the end of the loop.
So we’re moving when erase = true because we’ve already erased it, right?
We’re moving because we are going to erase it in either case, not because we already have. We can’t move an erased entry because it will have already been destroyed. And we can’t use a moved-from entry later because it is invalidated by the move. So when we want to keep the entry, we need to copy it and not move out of it.
When erase = true, we will be erasing the entry after we return, so all entries can be moved. When coin.IsSpent(), we will be erasing it when erase = false because we only want to keep unspent entries.
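The move-vs-copy rule described above can be illustrated with a small sketch (`Entry` and `Absorb` are hypothetical names, not the real BatchWrite code): move only when the source entry is about to be erased anyway, copy when the child cache keeps it.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for a cache entry owning heap-allocated coin data.
struct Entry { std::string coin; };

// Sketch of the decision under discussion: stealing the child's buffer is
// only safe when the child entry is about to be erased; otherwise the
// child would be left holding a moved-from (invalidated) coin.
Entry Absorb(Entry& child, bool will_erase)
{
    Entry parent;
    if (will_erase) {
        parent.coin = std::move(child.coin); // child is going away: move
    } else {
        parent.coin = child.coin;            // child keeps the entry: copy
    }
    return parent;
}
```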
I did have a commit before that changed erase to will_erase, I can’t remember why I removed it. Would that make it clearer if I renamed that again?
I will have to read that again tomorrow, it’s the end of the day here :/
will_erase could help, but I find it hard to keep track of so many independent mutations, I’m hoping there’s a simpler way to do this…
Do you think you could add a test that at least doesn’t pass for my suggested code above? Maybe that will help me understand why something this simple (iteration with/without simple/bulk cleanup) has this many loose parts, where the exact same parameter means the opposite from one line to the next…
You’re right, the logic for moving or not should not leak into BatchWrite. Perhaps we can get rid of erase in BatchWrite and have the cursor return a pair - the iterator and a bool for whether it can move out or not?
I had it like that before but others suggested to move everything into 1 loop instead of looping through the dirty entries again after BatchWrite to clear flags and delete spent entries.
I haven’t been able to come up with something that I think would be an improvement. I did split the second last commit up into two commits though. Maybe that will make it more clear when reviewing.
While reviewing (and before reading this discussion) I also got stuck on this. Because of the expectations the naming raises, it seems like a potential footgun (future code may try to iterate over the linked list without intending to erase entries).
I think the side effect should be documented more clearly for the Next() function, maybe even rename to NextAndMaybeErase() or something similar.
andrewtoth
commented at 10:02 pm on July 22, 2024:
This comment thread is pointing to an out of date version of the PR. There’s more documentation on this struct and methods now, as well as changing a variable and field name. Maybe it is more clear now?
andrewtoth
commented at 3:44 am on August 2, 2024:
Renamed the method to be NextAndMaybeErase.
andrewtoth
commented at 3:45 am on August 2, 2024:
Also added more comments around this struct.
l0rinc
commented at 10:40 am on July 10, 2024:
contributor
Thanks, I will try to properly test the speed in the following weeks (maybe I’ll just rent a VM and do it there)
utACK 5c37dfdda55384358540cfae50715faefa79c933
andrewtoth
commented at 5:31 pm on July 12, 2024:
contributor
Some benchmarks with the latest changes. Did 2 IBD runs with prune=550 and 2 -reindex-chainstate runs with prune=0 to block 820k. Results were similar to before. This time no-prune was 1% faster on this branch with default dbcache, and 1% slower on this branch with max dbcache. This just means there’s no regression IMO.
It’s 34% faster pruned with max dbcache and 13% faster with default dbcache on this branch vs a pruned node on master with max dbcache.
Note I did the dbcache=450 prune=0 benchmarks on a slower machine so I could do these benchmarks in parallel.
| | prune | dbcache | mean time (s) | speedup |
|---|---:|---:|---:|---:|
| master | 550 | 16384 | 30,085 | - |
| branch | 550 | 16384 | 19,984 | 34% |
| branch | 550 | 450 | 26,278 | 13% |
| master | 0 | 16384 | 12,541 | 1% |
| branch | 0 | 16384 | 12,669 | - |
| master | 0 | 450 | 21,942 | - |
| branch | 0 | 450 | 21,671 | 1% |
andrewtoth force-pushed
on Jul 12, 2024
andrewtoth force-pushed
on Jul 12, 2024
l0rinc
commented at 10:01 am on July 13, 2024:
contributor
@andrewtoth, I’m running the benchmark now on a cosy Hetzner dedicated server (without a local Bitcoin node, let’s see if that makes it too unreliable).
Do you think we should try different -dbbatchsize values as well?
in src/coins.cpp:203
in fa0494d7ff (outdated)

```diff
-            // The `move` call here is purely an optimization; we rely on the
-            // `mapCoins.erase` call in the `for` expression to actually remove
-            // the entry from the child map.
+            if (cursor.WillErase(*it)) {
+                // Since this entry will be erased,
+                // we can move the coin into us instead of copying it
```
You might be right here, but this function in particular is very heavily commented due to its importance. I think modifying the comment is better than just removing it.
```diff
+                // Since this entry will be erased,
+                // we can move the coin into us instead of copying it
                 entry.coin = std::move(it->second.coin);
             } else {
                 entry.coin = it->second.coin;
             }
```
I believe in making important parts of the code stick out and letting details of secondary importance fade into the background.
This move is of secondary importance, right? But it occupies a big part of the method; maybe we can simplify it with a ternary:
```diff
@@ -230,6 +230,40 @@ class CCoinsViewCursor
     uint256 hashBlock;
 };

+struct CoinsViewCacheCursor
+{
+    //! will_erase should be set to true in CCoinsViewCache::Flush and false in CCoinsViewCache::Sync
```
In commit “coins: pass linked list of flagged entries to BatchWrite”
I’d prefer a comment that explains what the flag does rather than just when to set it. For example “If will_erase is not set, iterating through the cursor will erase spent coins from the map, and other coins will be unflagged (removing them from the linked list). If will_erase is set, the underlying map will not be modified, as the caller is expected to wipe the entire map anyway.”
```diff
+        }
+        return next_entry;
+    }
+
+    //! Whether the current pair will be erased after being passed to Next
+    inline bool WillErase(CoinsCachePair& current) const noexcept { return m_will_erase || current.second.coin.IsSpent(); }
```
andrewtoth
commented at 2:43 pm on July 13, 2024:
contributor
@andrewtoth, I’m running the benchmark now on a cosy Hetzner dedicated server (without a local Bitcoin node, let’s see if that makes it too unreliable).
Just wanted to note the first commit should probably be 94d56b9def44bfa5e002ef2c666e5961b1deacdc, the one before this PR, and not 0383defd049fa1feb6de7abea259041487ccda1d which is the first in this PR.
Do you think we should try different -dbbatchsize values as well?
This PR doesn’t do anything related to -dbbatchsize, so no I don’t think so.
andrewtoth
commented at 3:13 pm on July 13, 2024:
contributor
Also, you should probably wrap hyperfine with nohup ... & so you can exit the ssh session.
l0rinc
commented at 11:38 am on July 14, 2024:
contributor
the first commit should probably be
The first commit was just an unrelated refactor; it was easier to use that as a ‘before’.
a warmup run is going to take a long time
Yeah, but this makes the measurements simpler (this way I don’t need the sync and drop_caches), both runs start from the same (slightly warmed up) state. Plus I didn’t want to test the exact same settings as you did :)
wrap hyperfine with nohup … &
was using tmux for that, the warmup and the first 2 iterations for 0383de are almost finished now.
l0rinc
commented at 8:53 am on July 16, 2024:
contributor
I come bearing good news: the benchmarks have finished running!
```diff
@@ -230,6 +230,40 @@ class CCoinsViewCursor
     uint256 hashBlock;
 };

+struct CoinsViewCacheCursor
+{
+    //! will_erase should be set to true in CCoinsViewCache::Flush and false in CCoinsViewCache::Sync
+    CoinsViewCacheCursor(size_t& usage, CoinsCachePair& sentinel, CCoinsMap& map, bool will_erase) noexcept
```
In commit “coins: pass linked list of flagged entries to BatchWrite”
Can you add LIFETIMEBOUND to the usage, sentinel, and map arguments? That may make the compiler complain if the CoinsViewCacheCursor object is used after the arguments go out of scope.
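For reference, a minimal sketch of what the suggestion could look like. `CursorLike` is a hypothetical stand-in; in Bitcoin Core, LIFETIMEBOUND is a macro (in src/attributes.h) around Clang’s `[[clang::lifetimebound]]` attribute, which only has an effect under Clang:

```cpp
#include <cassert>

#if defined(__clang__)
#  define LIFETIMEBOUND [[clang::lifetimebound]]
#else
#  define LIFETIMEBOUND
#endif

// Hypothetical cursor that stores a reference to its constructor argument.
// With lifetimebound, Clang can warn at compile time when the cursor is
// bound to an argument whose lifetime ends before the cursor's does.
struct CursorLike {
    explicit CursorLike(int& usage LIFETIMEBOUND) : m_usage(usage) {}
    int& m_usage;
};

// Clang would flag something like this (returning a cursor bound to a
// local) with a dangling-reference warning:
//   CursorLike Dangling() { int tmp{0}; return CursorLike(tmp); }

int UseCursor()
{
    int usage{5};
    CursorLike c(usage);
    c.m_usage += 1; // fine: the referenced object outlives the cursor
    return usage;
}
```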
```diff
@@ -206,10 +204,9 @@ bool CCoinsViewCache::BatchWrite(CCoinsMap &mapCoins, const uint256 &hashBlockIn
             // and mark it as dirty.
             itUs = cacheCoins.try_emplace(it->first).first;
             CCoinsCacheEntry& entry{itUs->second};
-            if (erase) {
-                // The `move` call here is purely an optimization; we rely on the
-                // `mapCoins.erase` call in the `for` expression to actually remove
-                // the entry from the child map.
+            if (cursor.WillErase(*it)) {
```
In commit “coins: pass linked list of flagged entries to BatchWrite”
The behavior change is a bit confusing here, as cursor.WillErase(*it) will be true for spent items, which was not true for the old code’s erase. Would it make sense to split this commit up in two; one part which introduces the cursor and only iterate over flagged entry, and then a second part which moves the deleting of spent entries from the separate loop in CCoinsViewCache::Sync call into the CCoinsViewCache::BatchWrite loop?
sipa
commented at 4:05 pm on July 16, 2024:
member
ACK 21090d032f5c4d90a41c3ca2030690c75899ec1a
I have a few non-blocking suggested improvements in the new commit. I have also pushed a sanity check commit in https://github.com/sipa/bitcoin/commits/pr_28280, feel free to cherry-pick or squash in.
Now that we have this, could you please cover it with tests as well (where we’re testing that the linked list is constructed correctly)?
Or would it make more sense to just call SanityCheck() since it’s already checking the linked list’s integrity?
Is it just bad luck that it wasn’t triggered by CI or could the fuzz tests be incorrect or just not yet triggered at this stage?
Could we use it in coins_tests or coinscachepair_tests as well?
And if it’s a test-only method, maybe document, that it’s heavy and should only be used in testing, not in prod? Or maybe even move it completely to the tests…
Thanks for the info.
I assumed that some of the randomized tests contributed to the coverage here. Isn’t that why we have Lost baseline coverage for PRs? Or did I misunderstand the reason for unrelated code not being covered anymore?
I assumed that some of the randomized tests contributed to the coverage here.
The section where your screenshot is from is “Uncovered new code”. The screenshot means that the code wasn’t covered previously and isn’t covered now either. There should be no randomness.
yes, of course.
I assumed that the existence of Lost baseline coverage means that there was some randomness in the tests or infrastructure or production code or threading or something (i.e. covered before, but not covered now for a reason that’s unrelated to this change):
Ah, my mistake, this was an intermediary commit - fine either way
achow101 added this to the milestone 28.0
on Jul 18, 2024
andrewtoth force-pushed
on Jul 18, 2024
l0rinc
commented at 4:30 pm on July 18, 2024:
contributor
ACK d7f94e0fa9ec3641405e6909ab7da4e48865e840
DrahtBot requested review from sipa
on Jul 18, 2024
in src/coins.cpp:54
in 5907ace1e6 (outdated)

```diff
@@ -51,7 +51,7 @@ CCoinsMap::iterator CCoinsViewCache::FetchCoin(const COutPoint &outpoint) const
     if (ret->second.coin.IsSpent()) {
         // The parent only has an empty entry for this outpoint; we can consider our
         // version as fresh.
-        ret->second.flags = CCoinsCacheEntry::FRESH;
```
Did you consider setting m_next and m_prev back to nullptr here? Doesn’t seem strictly necessary but maybe a bit cleaner?
andrewtoth
commented at 10:00 pm on July 22, 2024:
An earlier version of this PR that relied on checking for null did do this. I just think it’s unnecessary work to set them to null, because access is guarded by m_flags.
in src/test/fuzz/coins_view.cpp:149
in 81f59cb109 (outdated)
It would have to be coins_map.clear(), which is passed up in the cursor to coins_view_cache. This block is equivalent to Flush, and coins_view_cache is the base.
However, coins_map goes out of scope right after this block and is not reused as in Flush, so clearing it is not necessary.
mzumsande
commented at 9:47 pm on July 22, 2024:
contributor
Code Review ACK d7f94e0fa9ec3641405e6909ab7da4e48865e840
(disclaimer: I’m not very experienced with the coins db but started to look into it more recently).
I did some testing on signet to ensure that the dumptxoutset data is not affected by this.
I’m in the process of doing some test runs on mainnet, but that’ll take me a few more days.
in src/txdb.cpp:117
in 81f59cb109 (outdated)

```diff
@@ -114,7 +114,7 @@ bool CCoinsViewDB::BatchWrite(CCoinsMap &mapCoins, const uint256 &hashBlock, boo
         batch.Erase(DB_BEST_BLOCK);
         batch.Write(DB_HEAD_BLOCKS, Vector(hashBlock, old_tip));

-    for (CCoinsMap::iterator it = mapCoins.begin(); it != mapCoins.end();) {
+    for (auto it{cursor.Begin()}; it != cursor.End();) {
```
nit: After this, the log
Committed X changed transaction outputs (out of Y) to coin database...
has become less interesting because we only iterate over dirty coins, so X=Y always.
It might be more helpful to report the initial size of m_map of the cursor (before entries are deleted from it) as the total.
There are other places that could be cleaned up, such as obviated dirty checks like here and here, but this should probably be done after a #18746 revival so there’s no confusion.
yes, no need to do that here. I just mentioned the logging here because I was interested in the ratios to assess the efficiency of this PR, and had to write my own log.
refactor: encapsulate flags access for dirty and fresh checks
No behavior change. This prepares moving the cache entry flags field
to private access.
Co-Authored-By: l0rinc <pap.lorinc@gmail.com>
df34a94e57
refactor: encapsulate flags get access for all other checks
No behavior change. This prepares moving the cache entry
flags field to private access.
9715d3bf1e
refactor: encapsulate flags setting with AddFlags and ClearFlags
No behavior change. This prepares moving the cache entry
flags field to private access.
8737c0cefa
refactor: disallow setting flags in CCoinsCacheEntry constructors
No behavior change because any entries that are added in EmplaceCoinInternalDANGER have DIRTY assigned to them after, and if they are not inserted then they will not be modified, as before.
This prepares moving the cache entry flags field to private access.
Co-Authored-By: Martin Leitner-Ankerl <martin.ankerl@gmail.com>
Co-Authored-By: Martin Leitner-Ankerl <martin.ankerl@gmail.com>
4e4fb4cbab
refactor: move flags to private uint8_t and rename to m_flags
No behavior change. This prepares to add CCoinsCacheEntrys
to a doubly linked list when a flag is added.
davidgumberg
commented at 0:07 am on August 1, 2024:
Is this destructor defined just to document this assumption?
andrewtoth
commented at 0:17 am on August 1, 2024:
It is actually here to clear cacheCoins and make sure all CCoinsCacheEntrys are destroyed and call ClearFlags in their destructors. Since the first and last flagged entries have a reference to m_sentinel, we must be sure they remove themselves from the linked list and set m_flags to 0 before m_sentinel is destroyed.
Without this destructor, we can make this requirement implicit by having m_sentinel come before cacheCoins in the CCoinsViewCache members. That way m_sentinel will be destroyed after cacheCoins and all its entries have already been destroyed. But, that is brittle since someone could reorder the fields. By having this destructor here we make sure all entries are unflagged before destroying the sentinel.
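The ordering argument above can be demonstrated with a tiny sketch (hypothetical demonstration types, not the real cache classes): C++ destroys non-static data members in reverse declaration order, so declaring the sentinel first makes it outlive the entries, which is exactly the implicit invariant described as brittle here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the order in which members are destroyed.
static std::vector<std::string> g_destroyed;

struct Member {
    std::string name;
    ~Member() { g_destroyed.push_back(name); }
};

// Members are destroyed in reverse declaration order, so the sentinel
// (declared first) is destroyed last -- after every "entry" member has
// already run its destructor. Reordering the fields silently breaks this.
struct CacheLike {
    Member sentinel{"sentinel"}; // declared first  -> destroyed last
    Member entries{"entries"};   // declared second -> destroyed first
};
```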
davidgumberg
commented at 0:21 am on August 1, 2024:
This was stumping me for a while, thank you!!
andrewtoth
commented at 0:51 am on August 1, 2024:
Perhaps I should comment that here in a follow-up then?
davidgumberg
commented at 1:05 am on August 1, 2024:
Follow-up: Why wouldn’t m_sentinel being destroyed first be OK, since when m_sentinel.second.ClearFlags() happens, it makes the last (youngest) entry’s m_next point to the first (oldest) entry and vice versa?
andrewtoth
commented at 1:08 am on August 1, 2024:
m_sentinel does not have m_flags set. So when m_sentinel is destroyed it doesn’t actually remove itself from the list.
davidgumberg
commented at 1:22 am on August 1, 2024:
Thanks, I missed that! I think a comment in a follow-up could be helpful.
andrewtoth
commented at 3:36 am on August 2, 2024:
Added a comment
andrewtoth
commented at 4:23 am on August 2, 2024:
Now that we mark the sentinel as dirty, my response to your follow-up is outdated. So perhaps we don’t need this destructor anymore? The last entry to be destroyed will just point to itself in ClearFlags, right?
andrewtoth
commented at 4:31 am on August 2, 2024:
I removed the CCoinsViewCache destructor in light of this. Thank you!
andrewtoth force-pushed
on Aug 2, 2024
andrewtoth force-pushed
on Aug 2, 2024
DrahtBot
commented at 4:11 am on August 2, 2024:
contributor
Make sure to run all tests locally, according to the documentation.
The failure may happen due to a number of reasons, for example:
Possibly due to a silent merge conflict (the changes in this pull request being
incompatible with the current code in the target branch). If so, make sure to rebase on the latest
commit of the target branch.
A sanitizer issue, which can only be found by compiling with the sanitizer and running the
affected test.
An intermittent issue.
Leave a comment here, if you need help tracking down a confusing failure.
DrahtBot added the label
CI failed
on Aug 2, 2024
andrewtoth
commented at 4:17 am on August 2, 2024:
contributor
Rebased and addressed most comments.
Added more comments in and around CoinsViewCacheCursor
Renamed Next to NextAndMaybeErase in CoinsViewCacheCursor
Added Assume(m_flags) to Next and Prev of CCoinsCacheEntry
Added tests for Prev in coinscachepair_tests and added calls to SanityCheck in coins_tests
Set m_flags = DIRTY in SelfRef of CCoinsCacheEntry so we can call Next on the sentinel with Assume(m_flags) and removed CCoinsViewCache destructor. This last change is the only change in behavior for non-debug, non-test code. This means that we don’t have to worry about clearing the coins map before destroying the linked list sentinel. The sentinel will remove its references from the entries it points to when it gets destroyed. This will leave the last entry pointing to itself when it is destroyed.
refactor: require self and sentinel parameters for AddFlags
No behavior change. Prepares for adding the CoinsCachePairs to
a linked list when flagged.
8bd3959fea
coins: call ClearFlags in CCoinsCacheEntry destructor
No behavior change. Prepares for flags adding CCoinsCacheEntrys to a linked list, from which they need to be removed on destruction.
58b7ed156d
andrewtoth force-pushed
on Aug 2, 2024
DrahtBot removed the label
CI failed
on Aug 2, 2024
andrewtoth
commented at 4:55 pm on August 2, 2024:
contributor
The compare feature is kind of broken now that I rebased. You can check out the changes from the previous version with the following command:
git range-diff 0383defd049fa1feb6de7abea259041487ccda1d..d7f94e0fa9ec3641405e6909ab7da4e48865e840 df34a94e57659c31873c26c86a8de5a032400061..28f0dd2d60d0f9a8d8a5aa0ed4a5eb1bd4f605bf
or smaller
git range-diff 0383def..d7f94e0 df34a94..28f0dd2
there’s a hidden setter and updater here, which is applicable to ClearFlags and SelfRef as well; if you need to change the commits again, please consider extracting the repetition:
this isn’t exactly what the code checks, right?
It can also be both (which you’ve specified in other cases where it was allowed), so we should either add DIRTY, FRESH, or both to the comment, or change the precondition to Assume(IsDirty() ^ IsFresh());
I guess it’s a bit ambiguous, but I’d interpret this as “Only call … when this entry has either the DIRTY or FRESH flags set” and not as “Only call … when this entry’s flags are either exactly DIRTY or exactly FRESH”.
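The distinction the two readings hinge on is inclusive-or versus exactly-one. A hedged sketch (illustrative flag values and helper names, not the actual CCoinsCacheEntry API) of why a DIRTY|FRESH entry satisfies the inclusive reading but fails the XOR precondition:

```cpp
#include <cstdint>

// Illustrative flag bits; an entry may carry either or both.
enum : uint8_t { CLEAN = 0, DIRTY = 1, FRESH = 2 };

struct EntryFlags {
    uint8_t m_flags{CLEAN};
    bool IsDirty() const { return m_flags & DIRTY; }
    bool IsFresh() const { return m_flags & FRESH; }
    // Inclusive reading: "has the DIRTY or FRESH flag set" means at
    // least one bit is set, i.e. the Assume(m_flags) style of check.
    bool DirtyOrFresh() const { return m_flags != CLEAN; }
    // Exclusive reading: exactly one of the two flags is set.
    bool ExactlyOne() const { return IsDirty() ^ IsFresh(); }
};
```

An entry flagged DIRTY | FRESH passes `DirtyOrFresh()` but not `ExactlyOne()`, which is why the comment wording matters.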
andrewtoth
commented at 11:45 pm on August 5, 2024:
Fixed.
in
src/test/coins_tests.cpp:644
in
28f0dd2d60 (outdated)
since we can now do that in the constructor. What do you think? I understand if you want to do it in a separate PR, but I will gladly reapprove as well.
l0rinc
commented at 3:52 pm on August 3, 2024:
contributor
re-ACK 28f0dd2d60d0f9a8d8a5aa0ed4a5eb1bd4f605bf
Left a few nits, if you edit again, please consider them!
DrahtBot requested review from mzumsande
on Aug 3, 2024
DrahtBot requested review from sipa
on Aug 3, 2024
sipa
commented at 1:20 pm on August 4, 2024:
member
re-ACK 28f0dd2d60d0f9a8d8a5aa0ed4a5eb1bd4f605bf
vostrnad
commented at 10:06 am on August 5, 2024:
none
Benchmark ACK (that’s not a thing but it totally should be)
I benchmarked the latest revision (28f0dd2d60d0f9a8d8a5aa0ed4a5eb1bd4f605bf) against the merge base (9774a958b501a6d439a734e18b29e04f59f973f6) with a few runs to block 400,000, syncing from a local node, with pruning enabled and all other settings at default. With dbcache=450 it finished 9.4% faster, dbcache=4096 was 14.3% faster.
dbcache=450:
master: 3203s
pr: 2901s (9.4% speedup)
dbcache=4096:
master: 3241s
pr: 2778s (14.3% speedup)
mzumsande
commented at 4:54 pm on August 5, 2024:
contributor
re-ACK 28f0dd2d60d0f9a8d8a5aa0ed4a5eb1bd4f605bf
in
src/coins.h:277
in
6453d8e38a (outdated)
273@@ -257,9 +274,24 @@ struct CoinsViewCacheCursor
274 inline CoinsCachePair* Begin() const noexcept { return m_sentinel.second.Next(); }
275 inline CoinsCachePair* End() const noexcept { return &m_sentinel; }
276
277- inline CoinsCachePair* NextAndMaybeErase(CoinsCachePair& current) noexcept { return current.second.Next(); }
278+ //! Return the next entry after current, erasing current if WillErase(current) returns true
In 6453d8e38a51311b23c980c5c61a60a0d0bd68b2 “coins: move Sync logic to CoinsViewCacheCursor”
This comment seems to be incorrect: WillErase(current) will return true when m_will_erase == true, but this function will not erase the entry in that case.
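The reason a cursor method like this must fetch the successor before the current entry can be erased is the classic erase-while-iterating pitfall: once the node is gone, its next pointer is gone too. A generic, self-contained illustration of the capture-next-then-erase pattern (std::list standing in for the intrusive list; names are illustrative, not the Bitcoin Core API):

```cpp
#include <iterator>
#include <list>

// Walk the list, consuming every element. The successor is captured
// *before* erasing the current node, so iteration never follows a
// pointer into freed memory.
int SumAndErase(std::list<int>& l) {
    int sum = 0;
    for (auto it = l.begin(); it != l.end();) {
        auto next = std::next(it); // capture successor first
        sum += *it;
        l.erase(it);               // erasing current is now safe
        it = next;
    }
    return sum;
}
```

This mirrors the contract under discussion: the function returns the next entry, while the actual erasure of the current one may happen elsewhere (or not at all, as the comment thread points out).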
andrewtoth
commented at 9:22 pm on August 5, 2024:
Ianilfy
commented at 10:23 am on September 13, 2024:
What do I do
andrewtoth force-pushed
on Aug 5, 2024
coins: track flagged cache entries in linked list
No visible behavior change. This commit tracks the flagged
entries internally but the list is not iterated by anything.
Co-Authored-By: Pieter Wuille <pieter@wuille.net>
Co-Authored-By: l0rinc <pap.lorinc@gmail.com>
24ce37cb86
test: add cache entry linked list tests
Co-Authored-By: l0rinc <pap.lorinc@gmail.com>
a14edada8a
coins: pass linked list of flagged entries to BatchWrite
BatchWrite now iterates through the linked
list of flagged entries instead of the entire
coinsCache map.
Co-Authored-By: l0rinc <pap.lorinc@gmail.com>
7825b8b9ae
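The payoff of the linked list is visible in this commit: a sync pass touches only the flagged entries rather than scanning every coin in the map. A simplified sketch (hypothetical names; a singly linked list here for brevity, where the PR uses a doubly linked one) of a cache whose Sync walks only the dirty chain:

```cpp
#include <string>
#include <unordered_map>

struct CacheEntry {
    int value{0};
    bool dirty{false};
    CacheEntry* next{nullptr}; // intrusive link into the dirty chain
};

struct Cache {
    // unordered_map never invalidates pointers to elements on rehash,
    // so storing raw entry pointers in the chain is safe.
    std::unordered_map<std::string, CacheEntry> map;
    CacheEntry* dirty_head{nullptr};

    void Write(const std::string& key, int value) {
        CacheEntry& e = map[key];
        e.value = value;
        // Link into the dirty chain only on the first modification.
        if (!e.dirty) { e.dirty = true; e.next = dirty_head; dirty_head = &e; }
    }

    // Visit only flagged entries, clear their flags, and reset the chain;
    // clean entries in the map are never touched.
    int Sync() {
        int flushed = 0;
        for (CacheEntry* e = dirty_head; e;) {
            CacheEntry* next = e->next;
            e->dirty = false;
            e->next = nullptr;
            ++flushed;
            e = next;
        }
        dirty_head = nullptr;
        return flushed;
    }
};
```

With a large, mostly-clean cache this turns each Sync from O(cache size) into O(dirty entries), which is exactly why frequent prune flushes stop dominating IBD time.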
andrewtoth force-pushed
on Aug 5, 2024
andrewtoth
commented at 11:45 pm on August 5, 2024:
contributor
Might as well address some nits since I invalidated the ACKs anyway.
Erase spent cache entries and clear flags of unspent
entries inside the BatchWrite loop, instead of an
additional loop after BatchWrite.
Co-Authored-By: Pieter Wuille <pieter@wuille.net>
05cf4e1875
Add linked-list test to CCoinsViewCache::SanityCheck
0e8918755f
validation: don't erase coins cache on prune flushes
589db872e1
andrewtoth force-pushed
on Aug 6, 2024
l0rinc
commented at 6:38 am on August 6, 2024:
contributor
Benchmark 1: COMMIT=df34a9 ./src/bitcoind -datadir=/root/BitcoinData -dbcache=16384 -prune=550 -stopatheight=300000 -printtoconsole=0
  Time (mean ± σ):   993.561 s ± 36.138 s  [User: 1001.395 s, System: 96.911 s]
  Range (min … max): 968.008 s … 1019.115 s  2 runs

Benchmark 2: COMMIT=28f0dd ./src/bitcoind -datadir=/root/BitcoinData -dbcache=16384 -prune=550 -stopatheight=300000 -printtoconsole=0
  Time (mean ± σ):   1168.303 s ± 39.657 s  [User: 919.521 s, System: 95.169 s]
  Range (min … max): 1140.262 s … 1196.345 s  2 runs

Summary
  'COMMIT=df34a9 ./src/bitcoind -datadir=/root/BitcoinData -dbcache=16384 -prune=550 -stopatheight=300000 -printtoconsole=0' ran
    1.18 ± 0.06 times faster than 'COMMIT=28f0dd ./src/bitcoind -datadir=/root/BitcoinData -dbcache=16384 -prune=550 -stopatheight=300000 -printtoconsole=0'
andrewtoth
commented at 11:50 am on August 6, 2024:
contributor
@paplorinc if you’re not connecting to a local trusted node, there will be too much variance in how quickly you can connect to peers and how reliably they serve the blocks. Also, there are barely any prune flushes in the first 300k blocks anyway. That’s not a good benchmark for this.
l0rinc
commented at 12:16 pm on August 6, 2024:
contributor
if you’re not connecting to a local trusted node
This is by design. I want a real-world test. Others have already tested the local-node scenario, so I’ll just work around the volatility by increasing the repetition.
That’s a not a good benchmark for this
I’m running 3 rounds now until 500k - that should be representative, right?
l0rinc
commented at 12:34 pm on August 7, 2024:
contributor
I come bearing good news again \:D/
Benchmark 1: COMMIT=df34a9 ./src/bitcoind -datadir=/mnt/sda1/BitcoinData -dbcache=16384 -prune=550 -stopatheight=500000 -printtoconsole=0
  Time (mean ± σ):   9216.193 s ± 165.608 s  [User: 10809.113 s, System: 1931.597 s]
  Range (min … max): 9065.956 s … 9393.770 s  3 runs

Benchmark 2: COMMIT=589db87 ./src/bitcoind -datadir=/mnt/sda1/BitcoinData -dbcache=16384 -prune=550 -stopatheight=500000 -printtoconsole=0
  Time (mean ± σ):   7087.173 s ± 320.442 s  [User: 8476.776 s, System: 789.276 s]
  Range (min … max): 6737.609 s … 7367.011 s  3 runs

resulting in:

Summary
  'COMMIT=589db87 ./src/bitcoind -datadir=/mnt/sda1/BitcoinData -dbcache=16384 -prune=550 -stopatheight=500000 -printtoconsole=0' ran
    1.30 ± 0.06 times faster than 'COMMIT=df34a9 ./src/bitcoind -datadir=/mnt/sda1/BitcoinData -dbcache=16384 -prune=550 -stopatheight=500000 -printtoconsole=0'
i.e. 30% faster than master (I’ve used the first refactoring commit here as reference, seemed easier to validate) for the first 500k blocks with fully remote nodes to mimic realistic usage.
davidgumberg
commented at 5:41 pm on August 7, 2024:
I think this is right as-is; I just wanted to note that I believe @mzumsande’s earlier comment is relevant here. Namely, that it’s not necessary to check for freshness here if we cannot have fresh-but-not-dirty entries, and even if we could, I’m not sure why we wouldn’t remove them from the cache in this instance. But I think this should be addressed by any revival of #18746.
andrewtoth
commented at 10:10 pm on August 18, 2024:
This is a metadata mirror of the GitHub repository
bitcoin/bitcoin.
This site is not affiliated with GitHub.
Content is generated from a GitHub metadata backup.
generated: 2025-06-17 15:13 UTC
This site is hosted by @0xB10C More mirrored repositories can be found on mirror.b10c.me