Since #17487 we no longer need to clear the coins cache when syncing to disk. A warm coins cache significantly speeds up block connection, and only needs to be fully flushed when nearing the `dbcache` limit.
For frequent pruning flushes there’s no need to empty the cache and kill connect block speed. However, simply using `Sync` in place of `Flush` actually slows down a pruned full IBD with a high `dbcache` value. This is because as the cache grows, `Sync` takes longer, since every coin in the cache is scanned to check if it’s dirty. With frequent prune flushes and a large cache, this constant scanning slows IBD down so much that just emptying the cache on every prune becomes faster.
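To make the cost concrete, here is a minimal sketch of a sync that keeps the cache warm but has to scan every entry. It is not Bitcoin Core's code: `Entry`, `Cache`, `Disk`, and `SyncByFullScan` are hypothetical names, and a plain `std::unordered_map` stands in for the coins cache and the database.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical, simplified stand-in for a coins cache entry.
struct Entry {
    int64_t value{0};
    bool dirty{false};
};

using Cache = std::unordered_map<uint64_t, Entry>;
using Disk = std::unordered_map<uint64_t, int64_t>;

// Write dirty entries to `disk` without erasing anything from `cache`.
// Returns the number of entries visited, which is always the full cache
// size, no matter how few entries are dirty.
size_t SyncByFullScan(Cache& cache, Disk& disk)
{
    size_t visited{0};
    for (auto& kv : cache) {
        ++visited;
        Entry& entry = kv.second;
        if (!entry.dirty) continue; // clean entries still cost a visit
        disk[kv.first] = entry.value;
        entry.dirty = false;
    }
    assert(visited == cache.size());
    return visited;
}
```

With 1000 cached entries of which one is dirty, this sketch still visits all 1000. Repeating that on every prune flush is the overhead described above.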
To fix this, we can add two pointers to each cache entry and construct a doubly linked list of dirty entries. Each `Sync` then only iterates through the dirty entries, and simply clears their pointers afterwards.
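The idea can be sketched roughly as follows. This is an illustration under assumptions, not the code in this branch: `Entry`, `Cache`, `Write`, `LoadClean`, and this `Sync` signature are hypothetical, and a circular list with a sentinel node is one possible way to lay out the doubly linked list. The sketch relies on `std::unordered_map` keeping pointers to its elements valid across rehashing.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical cache entry carrying the two extra pointers.
struct Entry {
    uint64_t key{0};
    int64_t value{0};
    Entry* prev{nullptr}; // both pointers are null while the entry is clean
    Entry* next{nullptr};
    bool IsDirty() const { return next != nullptr; }
};

struct Cache {
    std::unordered_map<uint64_t, Entry> map;
    Entry head; // sentinel of the circular dirty list

    Cache() { head.prev = head.next = &head; }
    Cache(const Cache&) = delete; // entries point at `head`, so forbid copies
    Cache& operator=(const Cache&) = delete;

    // Insert or overwrite a value and link the entry into the dirty list.
    void Write(uint64_t key, int64_t value)
    {
        Entry& e = map[key];
        e.key = key;
        e.value = value;
        if (e.IsDirty()) return; // already linked
        e.prev = head.prev;
        e.next = &head;
        head.prev->next = &e;
        head.prev = &e;
    }

    // Add an entry that matches the disk state; it is not linked.
    void LoadClean(uint64_t key, int64_t value)
    {
        Entry& e = map[key];
        e.key = key;
        e.value = value;
    }

    // Write only the dirty entries, then clear their pointers.
    // The map itself is left untouched, so the cache stays warm.
    size_t Sync(std::unordered_map<uint64_t, int64_t>& disk)
    {
        size_t visited{0};
        for (Entry* e = head.next; e != &head;) {
            assert(e->IsDirty());
            Entry* next = e->next;
            disk[e->key] = e->value;
            e->prev = e->next = nullptr;
            ++visited;
            e = next;
        }
        head.prev = head.next = &head;
        return visited;
    }
};
```

In this sketch the work done by `Sync` scales with the number of dirty entries rather than with the size of the cache, at the cost of two pointers per entry.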
With this approach a full IBD with `dbcache=16384` and `prune=550` was 32% faster than master. For the default `dbcache=450` the speedup was ~9%. All benchmarks were run with `stopatheight=800000`.
| | prune | dbcache | time | max RSS | speedup |
|---|---|---|---|---|---|
| master | 550 | 16384 | 8:52:57 | 2,417,464k | - |
| branch | 550 | 16384 | 6:01:00 | 16,216,736k | 32% |
| branch | 550 | 450 | 8:05:08 | 2,818,072k | 8.8% |
| master | 10000 | 5000 | 8:19:59 | 2,962,752k | - |
| branch | 10000 | 5000 | 5:56:39 | 6,179,764k | 28.8% |
| master | 0 | 16384 | 4:51:53 | 14,726,408k | - |
| branch | 0 | 16384 | 4:43:11 | 16,526,348k | 2.7% |
| master | 0 | 450 | 7:08:07 | 3,005,892k | - |
| branch | 0 | 450 | 6:57:24 | 3,013,556k | 2.6% |
While the two pointers add memory to each cache entry, they did not slow down IBD. For non-pruned IBD, results were similar for this branch and master. When I performed the initial IBD, the full UTXO set could be held in memory when using the max `dbcache` value. Non-pruned IBD to tip with max `dbcache` ended up using 12% more memory, but it was also 2.7% faster somehow. For smaller `dbcache` values the `dbcache` limit is respected, so this branch does not consume more memory, and the potentially more frequent flushes were not significant enough to cause any slowdown.
For reviewers, the commits in order do the following:
Commits bc95669d72d1aa9519514c0c7efd026c254e9e90 to 8ca27c7bb60f1d0042806010e60478e5190ddc52 encapsulate all accesses to `flags` on cache entries, and then e0f394ec9d955d543dd8e4553aa3c566f5aea9cf makes `flags` private.
Commits 784b8db2db7068954c0f7b6e1acb9670fe5182bd to 6321388b430b62d6b9b74982a4e31a6bfedc1bc6 create the linked list head nodes and cache entry self references and pass them into `AddFlags`.
Commit fd53f603645a7a9baee9d5cb9a6c0f47c6b8bf87 actually adds the entries into a linked list when they are flagged DIRTY or FRESH and removes them from the linked list when they are destroyed or the flags are cleared manually. However, the linked list is not yet used anywhere.
Commit f5443052bf5db66c922e9239344afcb07f486070 adds unit tests for the linked list.
Commit 7b73a9b99f26964d413e3805cb25f5e8c7da324a uses the linked list to iterate through DIRTY entries instead of using the entire coins cache.
Commit c36363f6b24c7ab2afe198d9855f507ddf096e1f uses `Sync` instead of `Flush` for pruning flushes, so the cache is no longer cleared.
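The entry lifecycle described in the commits above could look roughly like this. It is a hedged sketch rather than the actual class in this branch: only `AddFlags`, `DIRTY`, and `FRESH` are names taken from the description, while `Entry`, `ClearFlags`, `MakeSentinel`, `Next`, and `CountFlagged` are hypothetical, and the real entries link key/value pairs rather than bare entries.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of an entry with private flags that keeps itself
// in a circular doubly linked list while any flag is set.
class Entry
{
public:
    enum Flags : uint8_t { DIRTY = 1 << 0, FRESH = 1 << 1 };

    Entry() = default;
    Entry(const Entry&) = delete; // neighbours hold pointers to this object
    Entry& operator=(const Entry&) = delete;
    ~Entry() { ClearFlags(); } // destroyed entries leave the list

    // Turn this entry into an empty list head that points at itself.
    void MakeSentinel() { m_prev = m_next = this; }

    // Set flags; the first flag set links the entry in before `sentinel`.
    void AddFlags(uint8_t flags, Entry& sentinel)
    {
        assert(flags != 0);
        if (m_flags == 0) {
            m_prev = sentinel.m_prev;
            m_next = &sentinel;
            m_prev->m_next = this;
            sentinel.m_prev = this;
        }
        m_flags |= flags;
    }

    // Clear all flags manually and unlink the entry.
    void ClearFlags()
    {
        if (m_flags == 0) return;
        m_prev->m_next = m_next;
        m_next->m_prev = m_prev;
        m_prev = m_next = nullptr;
        m_flags = 0;
    }

    bool IsDirty() const { return m_flags & DIRTY; }
    bool IsFresh() const { return m_flags & FRESH; }
    const Entry* Next() const { return m_next; }

private:
    Entry* m_prev{nullptr};
    Entry* m_next{nullptr};
    uint8_t m_flags{0};
};

// Walk the list from the sentinel and count the flagged entries.
size_t CountFlagged(const Entry& sentinel)
{
    size_t n{0};
    for (const Entry* e = sentinel.Next(); e != &sentinel; e = e->Next()) ++n;
    return n;
}
```

Because the flags are private, every change goes through `AddFlags` or `ClearFlags`, so in this sketch the list cannot drift out of step with the flags. The sentinel must outlive every entry linked to it.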
Inspired by this comment.
Fixes #11315.