Don’t empty dbcache on prune flushes: >30% faster IBD #28280

pull andrewtoth wants to merge 13 commits into bitcoin:master from andrewtoth:sync-dirty changing 10 files +358 −79
  1. andrewtoth commented at 10:38 pm on August 16, 2023: contributor

    Since #17487 we no longer need to clear the coins cache when syncing to disk. A warm coins cache significantly speeds up block connection, and only needs to be fully flushed when nearing the dbcache limit.

    For frequent pruning flushes there’s no need to empty the cache and kill connect block speed. However, simply using Sync in place of Flush actually slows down a pruned full IBD with a high dbcache value. This is because as the cache grows, sync takes longer since every coin in the cache is scanned to check if it’s dirty. For frequent prune flushes and a large cache this constant scanning starts to really slow IBD down, and just emptying the cache on every prune becomes faster.

    To fix this, we can add two pointers to each cache entry and construct a doubly linked list of dirty entries. We can then only iterate through all dirty entries on each Sync, and simply clear the pointers after.

    With this approach a full IBD with dbcache=16384 and prune=550 was 32% faster than master. For default dbcache=450 speedup was ~9%. All benchmarks were run with stopatheight=800000.

    prune dbcache time max RSS speedup
    master 550 16384 8:52:57 2,417,464k -
    branch 550 16384 6:01:00 16,216,736k 32%
    branch 550 450 8:05:08 2,818,072k 8.8%
    master 10000 5000 8:19:59 2,962,752k -
    branch 10000 5000 5:56:39 6,179,764k 28.8%
    master 0 16384 4:51:53 14,726,408k -
    branch 0 16384 4:43:11 16,526,348k 2.7%
    master 0 450 7:08:07 3,005,892k -
    branch 0 450 6:57:24 3,013,556k 2.6%

    While the 2 pointers add memory to each cache entry, it did not slow down IBD. For non-pruned IBD results were similar for this branch and master. When I performed the initial IBD, the full UTXO set could be held in memory when using the max dbcache value. For non-pruned IBD with max dbcache to tip ended up using 12% more memory, but it was also 2.7% faster somehow. For smaller dbcache values the dbcache limit is respected so does not consume more memory, and the potentially more frequent flushes were not significant enough to cause any slowdown.

    For reviewers, the commits in order do the following: Commits bc95669d72d1aa9519514c0c7efd026c254e9e90 to 8ca27c7bb60f1d0042806010e60478e5190ddc52 encapsulate all accesses to flags on cache entries, and then e0f394ec9d955d543dd8e4553aa3c566f5aea9cf makes flags private. Commits 784b8db2db7068954c0f7b6e1acb9670fe5182bd to 6321388b430b62d6b9b74982a4e31a6bfedc1bc6 create the linked list head nodes and cache entry self references and pass them into AddFlags. Commit fd53f603645a7a9baee9d5cb9a6c0f47c6b8bf87 actually adds the entries into a linked list when they are flagged DIRTY or FRESH and removes them from the linked list when they are destroyed or the flags are cleared manually. However, the linked list is not yet used anywhere. Commit f5443052bf5db66c922e9239344afcb07f486070 adds unit tests for the linked list. Commit 7b73a9b99f26964d413e3805cb25f5e8c7da324a uses the linked list to iterate through DIRTY entries instead of using the entire coins cache. Commit c36363f6b24c7ab2afe198d9855f507ddf096e1f uses Sync instead of Flush for pruning flushes, so the cache is no longer cleared.

    Inspired by this comment.

    Fixes #11315.

  2. DrahtBot commented at 10:38 pm on August 16, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK paplorinc
    Concept ACK jamesob, hernanmarino

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30370 (sync: improve CCoinsViewCache ReallocateCache - 2nd try by fjahr)
    • #30326 (optimization: Reduce cache lookups in CCoinsViewCache::FetchCoin by paplorinc)
    • #28216 (fuzz: a new target for the coins database by darosior)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. in src/coins.h:109 in dbd663c446 outdated
    102@@ -103,8 +103,16 @@ class Coin
    103  */
    104 struct CCoinsCacheEntry
    105 {
    106+private:
    107+    //! These are used to create a doubly linked list of dirty entries. They are
    108+    //! set in SetFlags and unset in ClearFlags.
    109+    CCoinsCacheEntry *m_prev{nullptr};
    


    sipa commented at 11:14 pm on August 16, 2023:
    If these were iterators into the map, rather than pointers to the value, I think you wouldn’t need the m_outpoint.

    sipa commented at 2:23 pm on August 17, 2023:

    Actually, it seems iterators may get invalidated when a map rehash occurs, but pointers to elements don’t.

    I think you can however store pointers to std::pair<const COutPoint, CoinsCacheEntry> (the value_type of the map). Would that help alleviate the need for m_outpoint?


    andrewtoth commented at 2:28 pm on August 17, 2023:
    Brilliant, thanks! Was just trying this last night, this looks like the way. Will give it a shot.

    andrewtoth commented at 8:21 pm on August 17, 2023:
    @sipa done! Thank you for the elegant solution.
  4. andrewtoth force-pushed on Aug 17, 2023
  5. andrewtoth force-pushed on Aug 17, 2023
  6. andrewtoth force-pushed on Aug 17, 2023
  7. DrahtBot added the label CI failed on Aug 17, 2023
  8. DrahtBot removed the label CI failed on Aug 18, 2023
  9. andrewtoth force-pushed on Aug 18, 2023
  10. andrewtoth force-pushed on Aug 18, 2023
  11. andrewtoth force-pushed on Aug 19, 2023
  12. andrewtoth force-pushed on Aug 19, 2023
  13. andrewtoth force-pushed on Aug 20, 2023
  14. andrewtoth force-pushed on Aug 23, 2023
  15. andrewtoth force-pushed on Aug 24, 2023
  16. andrewtoth force-pushed on Aug 24, 2023
  17. jamesob commented at 3:14 pm on August 24, 2023: member
    Concept ACK! I’ll try to repro the bench results in the next week or so.
  18. DrahtBot added the label CI failed on Aug 24, 2023
  19. DrahtBot removed the label CI failed on Aug 25, 2023
  20. 0xB10C commented at 7:52 am on August 30, 2023: contributor
    @andrewtoth Not sure if you have seen #20827 which is a different approach but has a similar effect.
  21. andrewtoth commented at 7:13 pm on August 30, 2023: contributor
    @0xB10C indeed I have seen that one. I believe they complement each other. #20827 reduces IBD time for larger prune values and this PR reduces it even more for smaller prune values. I should benchmark this on top of that one. Also, I’m curious how you are performing your benchmarks and generating all those nice graphs. Do you mind sharing what tools you use?
  22. 0xB10C commented at 6:35 am on August 31, 2023: contributor

    Also, I’m curious how you are performing your benchmarks and generating all those nice graphs. Do you mind sharing what tools you use?

    At the time I had a spare and otherwise idle machine set up I could use. I don’t have the bash script at hand but IIRC I had a merge base (MB) and a pull-request (PR) binary I’d run with different prune and dbcache configurations. After each run, the script would rename and move the debug log (with extra -debug=coindb -debug=prune) to a separate folder. After all runs were done, I loaded the debug logs into the following Jupyter notebooks to parse the debug logs and to generate the graphs. For anyone revisiting this, these are the relevant comments #20827 (comment), #20827 (comment), and #20827 (comment) for my setup and results.

    I’ve uploaded the jupyter nodebooks here https://gist.github.com/0xB10C/89c2903290cfb1840792d41dcd079646. The first was used for a partial IBD from 500k-600k and the second one with a full IBD. Both times syncing from a local node on the same host. The notebooks expect the debug log files with the name debug_{binary}_{cset}_{run}.log where {binary} is either MB or PR, {cset} is an integer specifying the configuration set (pruning and dbcache parameter), and {run} is an integer numbering the run of this binary-cset combination if I did multiple. Feel free to remix and share in any way you want. Keep in mind that parsing logs is quite fragile as log message format can change. Feel free to reach out if there are questions.

  23. DrahtBot added the label CI failed on Sep 4, 2023
  24. DrahtBot removed the label CI failed on Sep 5, 2023
  25. jamesob commented at 5:47 pm on September 20, 2023: member

    Bench running now, should have some results in the next day or so.

    0$ ( bitcoinperf --no-teardown bench-pr --num-blocks 30_000 --run-id no-flush-on-prune \
    1    --bitcoind-args='-dbcache=4000 -prune=550' --run-count 3 28280 && \
    2    pushover 'Bench for andrewtoth SUCCEEDED!' ) || \
    3  pushover 'Bench for andrewtoth failed'
    
  26. jamesob commented at 10:25 am on September 27, 2023: member
    Local IBD of 30_000 blocks with -prune=550 from a Taproot-enabled height didn’t show any difference. I didn’t have -log=prune enabled, so my bench debug.log was basically useless. Rerunning…
  27. DrahtBot added the label Needs rebase on Nov 8, 2023
  28. andrewtoth force-pushed on Nov 30, 2023
  29. DrahtBot removed the label Needs rebase on Dec 1, 2023
  30. andrewtoth force-pushed on Dec 3, 2023
  31. andrewtoth commented at 6:30 pm on December 3, 2023: contributor

    Rebased, ran the following benchmark using hyperfine. Sync to block 800k 3 times on master commit and the last commit of this branch (I updated the commit recently, but no code changes only commit order changes to satisfy CI), dbcache=4000 and prune=550:

    0hyperfine \
    1--export-markdown ~/bench.md \
    2--show-output --parameter-list commit 498994b6f55d04a7940f832e7fbd17e5acdaff15,216bd93138a46cdfa66523878a36d0e634476079 \
    3--prepare 'git checkout {commit} && make -j$(nproc) src/bitcoind; sync; sudo /sbin/sysctl vm.drop_caches=3; rm -rf ~/.bitcoin/blocks; rm -rf ~/.bitcoin/chainstate;' \
    4-M 3 './src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=<local_node> -stopatheight=800000'
    

    Results from ~/bench.md:

    Command Mean [s] Min [s] Max [s] Relative
    ./src/bitcoind -dbcache=4000 -datadir=/home/user/.bitcoin -prune=550 -printtoconsole=0 -connect=192.168.2.11:8333 -stopatheight=800000 27609.574 ± 37.857 27580.531 27652.389 1.24 ± 0.00
    ./src/bitcoind -dbcache=4000 -datadir=/home/user/.bitcoin -prune=550 -printtoconsole=0 -connect=192.168.2.11:8333 -stopatheight=800000 22188.405 ± 52.561 22128.517 22226.878 1.00

    The second commit is this branch, so it runs 24% faster than master. @jamesob I suppose 30k blocks is not enough to appreciate the effect of syncing with a full cache?

  32. in src/coins.h:140 in ba09f5039c outdated
    136@@ -138,9 +137,9 @@ struct CCoinsCacheEntry
    137         FRESH = (1 << 1),
    138     };
    139 
    140-    CCoinsCacheEntry() : flags(0) {}
    141-    explicit CCoinsCacheEntry(Coin&& coin_) : coin(std::move(coin_)), flags(0) {}
    142-    CCoinsCacheEntry(Coin&& coin_, unsigned char flag) : coin(std::move(coin_)), flags(flag) {}
    143+    CCoinsCacheEntry() {}
    


    martinus commented at 5:26 pm on December 4, 2023:
    While you are at it, I’d use CCoinsCacheEntry = default;. clang-tidy usually warns against this: https://clang.llvm.org/extra/clang-tidy/checks/modernize/use-equals-default.html

    maflcko commented at 11:46 am on December 5, 2023:

    One thing I don’t like about modernize-use-equals-default is that it makes aggregate initialization possible again, when a developer disabled it by declaring a constructor.

    Though, now that C++20 will be required soon, I guess it can be enabled. C.f.

    0struct A {
    1    // A() {}; // "main" fn fails to compile under C++17 and 20
    2    A() = default;  // "main" fn fails to compile under C++20
    3    int b{};
    4};
    5
    6int main() { (void)A{1}; }
    

    martinus commented at 12:00 pm on December 5, 2023:
    Ah, I wasn’t aware of that, good to know :+1:
  33. in src/coins.cpp:12 in d66ad8efaf outdated
     8@@ -9,6 +9,42 @@
     9 #include <random.h>
    10 #include <util/trace.h>
    11 
    12+void CCoinsCacheEntry::SetFlags(uint8_t flags, CoinsCachePair &self, CoinsCachePair &head)
    


    martinus commented at 7:25 pm on December 4, 2023:

    I think SetFlags is a bit confusing name, since it’s more a AddFlags, because it doesn’t simply set the flags, it ORs them together.

    Alternatively (and I think this is the better solution) really use m_flags = flags instead of mflags |= flags.


    andrewtoth commented at 2:15 am on December 5, 2023:

    Only 2 of the 6 previous flags assignments (aside from = 0) used simple assignment, the other 4 use OR assignment. The 2 that use simple assignment we know logically that they must have been just created and therefore their flags are 0 so we are not overwriting any existing flags. Therefore an OR assignment in those 2 cases is logically equivalent.

    So if we were to go with your second suggestion we would need to modify behavior to make sure we assign the new flags we wish to set OR’d with the current flags so we don’t overwrite existing flags. Or create separate Set and Add methods. I think that would make this patch more cumbersome.

    It seems to me your first suggestion would be the path of least resistance. Updated to AddFlags.

  34. in src/coins.cpp:151 in ba09f5039c outdated
    148+    auto it{cacheCoins.emplace(
    149         std::piecewise_construct,
    150         std::forward_as_tuple(std::move(outpoint)),
    151-        std::forward_as_tuple(std::move(coin), CCoinsCacheEntry::DIRTY));
    152+        std::forward_as_tuple(std::move(coin))).first};
    153+    it->second.SetFlags(CCoinsCacheEntry::DIRTY, *it, m_flagged_head);
    


    martinus commented at 7:32 pm on December 4, 2023:
    Not sure if that is possible in the call, but when emplace doesn’t create a new object because the outpoint is already present, then previously the flags didn’t change. Now, the flag DIRTY is always added, regardless if the outpoint was already present in cacheCoins or not.
  35. in src/coins.cpp:234 in ba09f5039c outdated
    236-            if (!(it->second.flags & CCoinsCacheEntry::FRESH && it->second.coin.IsSpent())) {
    237+            if (!(it->second.GetFlags() & CCoinsCacheEntry::FRESH && it->second.coin.IsSpent())) {
    238                 // Create the coin in the parent cache, move the data up
    239                 // and mark it as dirty.
    240-                CCoinsCacheEntry& entry = cacheCoins[it->first];
    241+                itUs = cacheCoins.emplace(std::piecewise_construct, std::forward_as_tuple(it->first), std::tuple<>()).first;
    


    martinus commented at 7:40 pm on December 4, 2023:
    I’d use itUs = cacheCoins.try_emplace(it->first).first; instead, it’s just less code and does the same thing
  36. in src/coins.cpp:317 in ba09f5039c outdated
    314+    for (auto it{m_flagged_head.second.Next()}; it != nullptr; ) {
    315         if (it->second.coin.IsSpent()) {
    316+            // We need to get next entry before we erase this entry.
    317+            // Since cacheCoins owns the entry, it will be deallocated before we
    318+            // can access the next entry if we erase first.
    319+            const auto next_entry{it->second.Next(/*clear_flags=*/true)};
    


    martinus commented at 8:12 pm on December 4, 2023:

    nit, you could move that line in front of the if, then remove the else, like so:

    0const auto next_entry{it->second.Next(/*clear_flags=*/true)};
    1if (it->second.coin.IsSpent()) {
    2    cachedCoinsUsage -= it->second.coin.DynamicMemoryUsage();
    3    cacheCoins.erase(it->first);
    4}
    5it = next_entry;
    
  37. andrewtoth force-pushed on Dec 5, 2023
  38. andrewtoth commented at 3:03 am on December 5, 2023: contributor
    @martinus thank you very much for your review! I updated the PR with all your suggestions.
  39. andrewtoth force-pushed on Dec 5, 2023
  40. DrahtBot added the label CI failed on Dec 5, 2023
  41. DrahtBot removed the label CI failed on Dec 5, 2023
  42. martinus commented at 5:55 am on December 5, 2023: contributor

    I didn’t think this through yet, but this might be an alternative way to achieve the same thing: Instead of the double linked list, split the cacheCoins map up into 4 different maps, one map for each flag setting, e.g. as a std::array<CCoinMap, 4> m_cache_coins_per_flag;. Index is the flag, e.g. m_cache_coins_per_flag[FRESH | DIRTY] to get the map with all entries that are fresh & dirty. Instead of setting a flag & linking the list, use the unordered_map’s auto node = extract(...) and insert(std::move(node)) to move the entry to the corresponding map. This only relinks the unordered_map node and causes no allocation, pointers to the entry stay valid.

    This approach would have a few advantages:

    • Less memory usage: no m_flag member is needed that actually required 8 bytes due to alignment, no 2 pointers needed for interlinking, and multiple maps means overall bucket sizes will stay smaller.
    • processing the maps might be faster: iteration is fast because e.g. skipping non-dirty entries is trivial (they are in a different map). There might be a hashing overhead though from extract() and insert().

    For this to work, all 4 maps need to use the same m_cache_coins_memory_resource (insert() into a different map where the allocator doesn’t compare equal is undefined behavior).

  43. andrewtoth commented at 1:19 pm on December 5, 2023: contributor
    @martinus something similar was mentioned in #15265 (comment). However, with your approach every access would require 4 lookups.
  44. martinus commented at 7:43 pm on December 5, 2023: contributor

    @martinus something similar was mentioned in #15265 (comment). However, with your approach every access would require 4 lookups.

    Ha, of course; I didn’t think about that at all.

  45. andrewtoth force-pushed on Dec 13, 2023
  46. DrahtBot added the label CI failed on Jan 12, 2024
  47. andrewtoth force-pushed on Mar 8, 2024
  48. andrewtoth force-pushed on Mar 8, 2024
  49. DrahtBot removed the label CI failed on Mar 8, 2024
  50. andrewtoth force-pushed on Mar 14, 2024
  51. andrewtoth force-pushed on Mar 14, 2024
  52. andrewtoth force-pushed on Mar 14, 2024
  53. DrahtBot added the label CI failed on Mar 14, 2024
  54. DrahtBot commented at 6:08 pm on March 14, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/22665800478

  55. andrewtoth force-pushed on Mar 14, 2024
  56. DrahtBot removed the label CI failed on Mar 14, 2024
  57. andrewtoth force-pushed on Mar 16, 2024
  58. in src/coins.h:110 in 56faed97cd outdated
    105@@ -103,6 +106,14 @@ class Coin
    106  */
    107 struct CCoinsCacheEntry
    108 {
    109+private:
    110+    //! These are used to create a doubly linked list of flagged entries. They
    


    sipa commented at 3:12 pm on March 17, 2024:

    In commit “coins: add cache entry linked list fields and methods”

    Nit: //! is for doxygen comments on the next definition. Since this is really a comment on the next two, it probably shouldn’t be a doxygen comment.

  59. in src/coins.cpp:113 in 75a248f770 outdated
    107@@ -108,10 +108,15 @@ void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possi
    108 
    109 void CCoinsViewCache::EmplaceCoinInternalDANGER(COutPoint&& outpoint, Coin&& coin) {
    110     cachedCoinsUsage += coin.DynamicMemoryUsage();
    111-    cacheCoins.emplace(
    112+    CCoinsMap::iterator it;
    113+    bool inserted;
    114+    std::tie(it, inserted) = cacheCoins.emplace(
    


    sipa commented at 3:16 pm on March 17, 2024:

    In commit “coins: track dirty cache entries in doubly linked list”

    Use a C++17 structured binding:

    0auto [it, inserted] = cacheCoins.emplace(
    
  60. in src/coins.cpp:54 in 75a248f770 outdated
    50@@ -51,7 +51,7 @@ CCoinsMap::iterator CCoinsViewCache::FetchCoin(const COutPoint &outpoint) const
    51     if (ret->second.coin.IsSpent()) {
    52         // The parent only has an empty entry for this outpoint; we can consider our
    53         // version as fresh.
    54-        ret->second.flags = CCoinsCacheEntry::FRESH;
    55+        ret->second.AddFlags(CCoinsCacheEntry::FRESH, *ret, m_flagged_head);
    


    sipa commented at 3:19 pm on March 17, 2024:

    In commit “coins: track dirty cache entries in doubly linked list”

    Should FRESH-but-not-DIRTY be flagged? I believe those correspond to spent but unmodified coins fetched from the parent cache.

    If the rule is flagged == dirty, then perhaps it makes sense to instead of having AddFlags and Clearflags have a single SetFlags, which knows whether the new state is supposed to be in the list or not (moving the responsibility of flagged-or-not to that function instead of the caller). EDIT: I notice you used to have a SetFlags, actually, but my comments is more about where the responsibility lies for deciding whether an entry ought to be flagged or not than the actual function names.


    andrewtoth commented at 6:32 pm on March 20, 2024:

    Should FRESH-but-not-DIRTY be flagged? I believe those correspond to spent but unmodified coins fetched from the parent cache.

    Yes, but only because in this case they are also spent. We erase all spent coins from the cache after doing a non-erasing batch write. There would be no way to track spent-but-not-DIRTY otherwise except by scanning the entire cache.

    Since the only case where we have FRESH-but-not-DIRTY is where the coin is spent, it makes sense to me to just add this to the linked list as well. Before proposing this PR I did call the linked list head m_dirty_head, but after not finding an efficient way to remove non-dirty spent entries, I decided to call it m_flagged_head.

    If this makes sense to you I can add a comment here explaining this.


    sipa commented at 6:35 pm on March 20, 2024:
    @andrewtoth Ah, I see, that makes sense. It would be good to add comments to explain that flagged means “FRESH, DIRTY, or both” and why.

    mzumsande commented at 6:48 pm on June 27, 2024:
    see #28280 (review), I think that FRESH-but-not-DIRTY coins may not exist.
  61. andrewtoth force-pushed on Mar 21, 2024
  62. andrewtoth commented at 2:11 pm on March 21, 2024: contributor

    @0xB10C rebased after #20827 was merged, and ran the following benchmark:

     0hyperfine --export-markdown ~/bench.md --show-output --parameter-list commit 015ac13dcc964a31ef06dfdb565f88f901607f0e,25bdbc7f340884b3ab871c7948a4f2693037b871 --prepare 'git checkout {commit} && make -j$(nproc) src/bitcoind; sync; sudo /sbin/sysctl vm.drop_caches=3; rm -rf ~/.bitcoin/blocks; rm -rf ~/.bitcoin/chainstate;' -M 3 './src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=<local network node ip> -stopatheight=800000'
     1
     2...
     3
     4Benchmark 1: ./src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=<local network node ip> -stopatheight=800000 (commit = 015ac13dcc964a31ef06dfdb565f88f901607f0e)
     5
     6  Time (mean ± σ):     27241.365 s ± 33.693 s    [User: 34791.898 s, System: 7357.098 s]
     7  Range (min  max):   27202.978 s  27266.037 s    3 runs
     8
     9Benchmark 2: ./src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=<local network node ip> -stopatheight=800000 (commit = 25bdbc7f340884b3ab871c7948a4f2693037b871)
    10
    11  Time (mean ± σ):     22153.568 s ± 37.663 s    [User: 27818.681 s, System: 4309.878 s]
    12  Range (min  max):   22119.682 s  22194.118 s    3 runs
    13 
    14Summary
    15  ./src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=192.168.2.13 -stopatheight=800000 (commit = 25bdbc7f340884b3ab871c7948a4f2693037b871) ran
    16    1.23 ± 0.00 times faster than ./src/bitcoind -dbcache=4000 -prune=550 -printtoconsole=0 -connect=192.168.2.13 -stopatheight=800000 (commit = 015ac13dcc964a31ef06dfdb565f88f901607f0e)
    

    So this is 23% faster than master when using dbcache=4000, prune=550. (I just rebased again but 015ac13dcc964a31ef06dfdb565f88f901607f0e is master and 25bdbc7f340884b3ab871c7948a4f2693037b871 is my branch). @sipa thank you for your review! Addressed your nits and added more comments about flagged entries.

  63. DrahtBot added the label Needs rebase on Mar 22, 2024
  64. andrewtoth force-pushed on Apr 3, 2024
  65. DrahtBot removed the label Needs rebase on Apr 3, 2024
  66. DrahtBot added the label CI failed on Apr 3, 2024
  67. andrewtoth commented at 9:26 pm on April 4, 2024: contributor

    Rebased and ran some faster benchmarks, 2 each but only until 400k. Will do some more going to 800k.

    prune dbcache mean time (s) speedup
    master 550 16384 2,018 -
    branch 550 16384 1,667 17.4%
    master 550 450 2,069 -
    branch 550 450 1,822 11.9%
    master 0 16384 1,447 -
    branch 0 16384 1,396 3.5%
    master 0 450 1,546 -
    branch 0 450 1,527 1.2%

    Command was

     0hyperfine \
     1--export-markdown ~/bench.md \
     2--show-output \
     3--parameter-list commit 0d509bab45d292caeaf34600e57b5928757c6005,499f1f029a265fbeb4824e5cb84726f82ca2824b \
     4--parameter-list dbcache 450,16384 \
     5--parameter-list prune 550,0 \
     6--prepare 'git checkout {commit} && \
     7make -j$(nproc) src/bitcoind; \
     8sync; sudo /sbin/sysctl vm.drop_caches=3; \
     9rm -rf /home/user/.bitcoin/blocks; \
    10rm -rf /home/user/.bitcoin/chainstate;' \
    11-M 2 \
    12'echo '{commit}' && \
    13./src/bitcoind -datadir=/home/user/.bitcoin \
    14-dbcache={dbcache} \
    15-prune={prune} \
    16-printtoconsole=0 \
    17-connect=<local node> \
    18-stopatheight=400000'
    
  68. DrahtBot removed the label CI failed on Apr 4, 2024
  69. andrewtoth commented at 4:03 am on April 9, 2024: contributor

    Ran benchmarks to 800k blocks, 2 times each. As expected, results are very good when running pruned. When running non-pruned with default dbcache, however, master was 1% faster. I think this slowdown is negligible though and the improvements to syncing with pruning enabled make this worth the changes here. Also, potential improvements like #28945 will make up for it.

    prune dbcache mean time speedup
    master 550 16384 7h 29m -
    branch 550 16384 4h 56m 34%
    branch 550 450 6h 33m 12%
    master 0 16384 3h 34m -
    branch 0 16384 3h 29m 2%
    master 0 450 5h 34m 1%
    branch 0 450 5h 40m -
  70. hernanmarino commented at 11:37 pm on April 12, 2024: contributor
    Concept ACK . I ’d really like to see this merged
  71. DrahtBot added the label CI failed on Apr 21, 2024
  72. DrahtBot removed the label CI failed on Apr 23, 2024
  73. achow101 commented at 8:21 pm on May 13, 2024: member

    For frequent pruning flushes there’s no need to empty the cache and kill connect block speed. However, simply using Sync in place of Flush actually slows down a pruned full IBD with a high dbcache value. This is because as the cache grows, sync takes longer since every coin in the cache is scanned to check if it’s dirty. For frequent prune flushes and a large cache this constant scanning starts to really slow IBD down, and just emptying the cache on every prune becomes faster.

    To fix this, we can add two pointers to each cache entry and construct a doubly linked list of dirty entries. We can then only iterate through all dirty entries on each Sync, and simply clear the pointers after.

    Perhaps I’m missing something, but it seems to me that the obvious solution to this problem is to have BatchWrite always delete dirty entries and clear the flags on all entries, rather than implementing a whole new data structure? BatchWrite already iterates the entire cache anyways, so it would make sense to me that we can consolidate all of the operations that occur for each cache entry.

  74. achow101 referenced this in commit d6069cb8d6 on May 13, 2024
  75. sipa commented at 8:35 pm on May 13, 2024: member
    @achow101 That works, but it misses the point. If you delete the dirty entries on a prune flush, then those entries need to be re-read from disk when they are (likely soon) spent.
  76. achow101 commented at 8:36 pm on May 13, 2024: member

    @achow101 That works, but it misses the point. If you delete the dirty entries on a prune flush, then those entries need to be re-read from disk when they are (likely soon) spent.

    Sorry, I meant clear the Dirty and spent ones. Basically just move the stuff that Sync() does into the loop that BatchWrite is doing.

  77. DrahtBot added the label Needs rebase on May 13, 2024
  78. andrewtoth commented at 2:32 am on May 14, 2024: contributor
    @achow101 I see what you mean, clear the flags and delete spent entries in the loop in BatchWrite. That would reduce each Sync from 2 full scans of the coinsCache to 1. It still scans the entire coinsCache, while the current change here only loops through flagged entries. Will experiment with how that performs, and possibly combining both ideas to do a single loop of only the flagged entries.
  79. sipa commented at 7:44 pm on May 20, 2024: member
    Am I right that after this change (or alternatives which let BatchWrite just iterate dirty/fresh entries), it would be possible to (perhaps dramatically) reduce the periodic flush time, so that after IBD, we basically always have a mostly non-dirty cache? Such a change could be made today of course, but the cost of iterating the entire cache every time would add up quickly.
  80. andrewtoth commented at 4:44 pm on May 21, 2024: contributor

    Am I right that after this change (or alternatives which let BatchWrite just iterate dirty/fresh entries), it would be possible to (perhaps dramatically) reduce the periodic flush time, so that after IBD, we basically always have a mostly non-dirty cache? Such a change could be made today of course, but the cost of iterating the entire cache every time would add up quickly.

    I believe you are right, but I’m unsure what the benefit is that you see. From my understanding, the periodic flushes are solely to prevent data loss in case of unclean shutdown (otherwise, flushing from filling the dbcache or flushing from shutting down cleanly prevent data loss). With a 24 hour periodic flush schedule and #15218, the most that is at risk is ~144 blocks worth of data?

    If you are talking about periodic flushes during IBD though, that might be a benefit that could obviate #15218 and properly fix #11600. If we reduce the periodic flush interval to an hour or so, then only an hour or so of IBD data would be at risk of rolling back. This would of course mean that there is less opportunity for the cache to delete FRESH & DIRTY entries on spend so they never have to hit the disk, so it would be less performant. But it would still keep the cache for connecting blocks quickly.

  81. sipa commented at 4:51 pm on May 21, 2024: member
    @andrewtoth I was talking about post-IBD. And one advantage to more frequent flushing is reducing the latency spikes you get when a flush is triggered.
  82. andrewtoth commented at 12:43 pm on June 3, 2024: contributor

    BatchWrite already iterates the entire cache anyways @achow101 I think this is what you might not be clear about. BatchWrite iterates the entire cache, but this patch does not. After this patch BatchWrite only iterates flagged entries, which is a much smaller subset of the cache.

    I tried moving everything in the loop in Sync into BatchWrite, along with cachedCoinsUsage so it can be decremented as we delete spent coins from the child cache, and it looks like it will take many times longer to sync with dbcache=16384 prune=550 than current master. So I don’t think that’s a viable alternative solution to what is proposed here.

  83. andrewtoth force-pushed on Jun 3, 2024
  84. andrewtoth commented at 2:17 pm on June 3, 2024: contributor
    Rebased, and added a small optimization based on @achow101’s suggestion. In BatchWrite, we also clear the flags when not erasing if the coin is unspent. This reduces the amount of flagged entries needed to be iterated in Sync to find the spent entries to delete. We can also throw a logic error now if we find a flagged entry in Sync that is not spent.
  85. andrewtoth force-pushed on Jun 3, 2024
  86. DrahtBot removed the label Needs rebase on Jun 3, 2024
  87. achow101 commented at 4:16 pm on June 3, 2024: member

    @achow101 I think this is what you might not be clear about. BatchWrite iterates the entire cache, but this patch does not. After this patch BatchWrite only iterates flagged entries, which is a much smaller subset of the cache.

    Yes, that was what I was missing. It just wasn’t clear to me from the description that this was implemented as another performance improvement in addition to clearing dirty and spent UTXOs.

  88. DrahtBot added the label CI failed on Jun 3, 2024
  89. andrewtoth force-pushed on Jun 3, 2024
  90. DrahtBot removed the label CI failed on Jun 4, 2024
  91. in src/coins.cpp:111 in 6ed20408df outdated
    107@@ -108,10 +108,13 @@ void CCoinsViewCache::AddCoin(const COutPoint &outpoint, Coin&& coin, bool possi
    108 
    109 void CCoinsViewCache::EmplaceCoinInternalDANGER(COutPoint&& outpoint, Coin&& coin) {
    110     cachedCoinsUsage += coin.DynamicMemoryUsage();
    111-    cacheCoins.emplace(
    112+    auto [it, inserted] = cacheCoins.emplace(
    


    paplorinc commented at 1:44 pm on June 25, 2024:
    would it make sense to use cacheCoins.try_emplace without the forward_as_tuple parts (similarly to what I’ve tried in a related PR) (in a separate commit or PR)

    andrewtoth commented at 8:48 pm on June 28, 2024:
    Sorry, can you elaborate on what benefit that would have here, and any possible downsides? I’m not 100% sure what all the differences are between the two methods, so preferred to stick to status quo.

    paplorinc commented at 9:01 pm on June 28, 2024:
    In that case stick to it and I’ll add it in a separate PR once this is merged!
  92. in src/coins.h:121 in 0a2bc92b0c outdated
    116+     * the parent cache to do a batch write. This is a performance optimization
    117+     * compared to giving all entries in the cache to the parent and having the
    118+     * parent scan for only modified entries.
    119+     *
    120+     * FRESH entries are tracked because a FRESH-but-not-DIRTY entry can only be
    121+     * a spent coin that exists in the parent cache. When performing a batch
    


    mzumsande commented at 8:24 pm on June 26, 2024:

    This comment and the implementation further persists the notion of FRESH-but-not-DIRTY coin, which I think is unfortunate, because it should be impossible for them to exist: The only place in which they could be created is here, while fetching a coin from the parent that is spent. However that is not possible because we’d only get to that point if the GetCoin() call a few lines above returned true - which is impossible for spent coins because according to the abstract CCoinsView interface, GetCoinReturns true only when an unspent coin was found, which is returned in coin.” This is followed both by CCoinsViewCache::GetCoin, which returns false if spent and CCoinsViewDB::GetCoin which doesn’t return spent coins because they are never saved to disk.

    Maybe it would be good to revive #18746 which attempted to fix this but got closed.

    With that, this PR could add only DIRTY coins and ignore the FRESH flag, which would be simpler.


    andrewtoth commented at 2:36 am on June 28, 2024:

    With that, this PR could add only DIRTY coins and ignore the FRESH flag, which would be simpler.

    I’m not sure how that would be simpler. The two flags are coupled and so adding one and not the other would be a more complex diff. Perhaps I don’t understand what you are suggesting?

    I could just replace the single FRESH-but-not-DIRTY check with an assertion and remove this comment about tracking FRESH coins?


    mzumsande commented at 3:31 pm on June 28, 2024:

    Yes, I thought some more and don’t find a great way to simplify the logic. I think it would be enough to remove / amend the comment, maybe mentioning instead that FRESH-but-not-DIRTY coins are not expected to occur but would be handled well if they did.

    replace the single FRESH-but-not-DIRTY check

    Not sure if that’s the “check” you mean, but replacing this with an assertion was essentially what #18746 did in its latest version. Since that PR was mildly controversial (I think mostly because it originally had the spin of “These coins may be unnecessarily added today; don’t add them anymore” instead of “We already don’t add them. Remove unreachable code”) I think the question of adding an assert should be separated from this PR.


    andrewtoth commented at 2:58 pm on June 29, 2024:
    Updated the comment.
  93. in src/txdb.cpp:127 in a6324db9c6 outdated
    123@@ -124,7 +124,7 @@ bool CCoinsViewDB::BatchWrite(CCoinsMap &mapCoins, const uint256 &hashBlock, boo
    124             changed++;
    125         }
    126         count++;
    127-        it = erase ? mapCoins.erase(it) : std::next(it);
    128+        it = it->second.Next(will_erase || !it->second.coin.IsSpent());
    


    mzumsande commented at 2:51 pm on June 27, 2024:
    I think that it should be explained in a comment similar as it’s already done in CCoinsViewCache::BatchWrite why we keep unspent coins because this is not obvious.

    paplorinc commented at 4:35 pm on June 28, 2024:
    Yeah, too many changes in one commit

    andrewtoth commented at 2:58 pm on June 29, 2024:
    Done.
  94. mzumsande commented at 6:45 pm on June 27, 2024: contributor

    A non-pruned IBD with full dbcache to tip ended up using 12% more memory, but it was also 2.7% faster somehow.

    I’m not sure if I understand this, is this still the case? Why would a node use more memory after this pull? Shouldn’t the 2 added pointers be included in the dbcache accounting, resulting in more frequent flushes due to the cache filling up faster, but not in an increase in memory when the cache is full?

  95. paplorinc commented at 7:36 pm on June 27, 2024: contributor

    Also, potential improvements like #28945 will make up for it.

    It seems #28945 (comment) was abandoned since

  96. andrewtoth commented at 7:40 pm on June 27, 2024: contributor

    A non-pruned IBD with full dbcache to tip ended up using 12% more memory, but it was also 2.7% faster somehow.

    I’m not sure if I understand this, is this still the case? Why would a node use more memory after this pull? Shouldn’t the 2 added pointers be included in the dbcache accounting, resulting in more frequent flushes due to the cache filling up faster, but not in an increase in memory when the cache is full?

    This is no longer the case. When I did the initial benchmarks the full UTXO set could be stored in memory, and so a full IBD on mainnet would be 14.7 GB for master and 16.5 GB for this pull. Today both master and this pull will hit the max limit and flush at some point, so their memory usage should be near identical. You are correct that if the dbcache limit is set below the maximum UTXO set size then this pull will not use more memory.

    For clarity, this pull only uses more memory if the entire IBD can occur without a single flush. What I meant in the description is this pull uses more memory when the entire utxo set is stored in memory and never flushed.

    This pull could use more memory again on mainnet if #28358 is merged, and this pull will use more memory on other chains than mainnet if the UTXO size is smaller than the dbcache setting.

  97. andrewtoth commented at 7:44 pm on June 27, 2024: contributor

    Also, potential improvements like #28945 will make up for it.

    It seems #28945 (comment) was abandoned since

    In my recent benchmarks it was again slightly faster on this branch for default dbcache and no pruning. So I think this PR is essentially the same.

  98. in src/coins.h:122 in 0a2bc92b0c outdated
    117+     * compared to giving all entries in the cache to the parent and having the
    118+     * parent scan for only modified entries.
    119+     *
    120+     * FRESH entries are tracked because a FRESH-but-not-DIRTY entry can only be
    121+     * a spent coin that exists in the parent cache. When performing a batch
    122+     * write but not clearing the cache (CCoinsViewCache::Sync), we must also
    


    paplorinc commented at 9:03 am on June 28, 2024:

    nit:

    0     * write without clearing the cache (CCoinsViewCache::Sync), we must also
    

    andrewtoth commented at 2:58 pm on June 29, 2024:
    Done.
  99. in src/coins.h:116 in 0a2bc92b0c outdated
    111+     * These are used to create a doubly linked list of flagged entries.
    112+     * They are set in AddFlags and unset in ClearFlags.
    113+     * A flagged entry is any entry that is either DIRTY, FRESH, or both.
    114+     *
    115+     * DIRTY entries are tracked so that only modified entries can be passed to
    116+     * the parent cache to do a batch write. This is a performance optimization
    


    paplorinc commented at 9:04 am on June 28, 2024:

    Nit:

    0     * the parent cache for batch writing. This is a performance optimization
    

    andrewtoth commented at 2:58 pm on June 29, 2024:
    Done.
  100. in src/coins.h:163 in 0a2bc92b0c outdated
    156@@ -130,6 +157,56 @@ struct CCoinsCacheEntry
    157     CCoinsCacheEntry() : flags(0) {}
    158     explicit CCoinsCacheEntry(Coin&& coin_) : coin(std::move(coin_)), flags(0) {}
    159     CCoinsCacheEntry(Coin&& coin_, unsigned char flag) : coin(std::move(coin_)), flags(flag) {}
    160+    ~CCoinsCacheEntry()
    161+    {
    162+        // We must always clear the flags when destroying this to remove it from
    163+        // the flagged linked list
    


    paplorinc commented at 9:06 am on June 28, 2024:
    nit: redundant comment

    andrewtoth commented at 8:39 pm on June 28, 2024:
    Is this really obvious though that we need to clear flags before destruction? Normally flags wouldn’t affect destroying an object, but in this case they also tie the object to a linked list.

    paplorinc commented at 8:44 pm on June 28, 2024:
    No, but it’s in a destructor, calling the ClearFlags method - I didn’t seen what the comment added beyond what the code already explained well. It could explain why you chose this style, or what alternatives you’ve considered, or something else that the code cannot, but this just seemed redundant to me.
  101. in src/coins.h:189 in 0a2bc92b0c outdated
    184+    }
    185+
    186+    inline void ClearFlags()
    187+    {
    188+        if (m_flags == 0) return;
    189+        m_prev->second.m_next = m_next;
    


    paplorinc commented at 9:11 am on June 28, 2024:
    Probably not a realistic scenario (may require multithreaded access because of the m_flags guard, not sure) , but m_prev will never be null when we get here, right?

    andrewtoth commented at 7:00 pm on June 28, 2024:
    Yes, this code is single threaded so not possible to get past the previous line if m_prev is null. It is possible for m_prev to point to a pair that has been freed if the linked list head is destroyed, which is why cacheCoins must always be destroyed before the linked list head.

    paplorinc commented at 7:27 pm on June 28, 2024:
    Got it, thanks
  102. in src/test/coinscachepair_tests.cpp:168 in 6508bdfe43 outdated
    163+    // Check that we can clear flags then re-add them
    164+    n1.second.ClearFlags();
    165+    BOOST_CHECK_EQUAL(n1.second.GetFlags(), 0);
    166+    BOOST_CHECK_EQUAL(n2.second.Next(), nullptr);
    167+
    168+    // Check that we calling `ClearFlags` with 0 flags has no effect
    


    paplorinc commented at 9:26 am on June 28, 2024:

    nit:

    0    // Check that calling `ClearFlags` with 0 flags has no effect
    

    andrewtoth commented at 2:59 pm on June 29, 2024:
    Done.
  103. in src/coins.h:183 in 0a2bc92b0c outdated
    197+
    198+    inline uint8_t GetFlags() const { return m_flags; }
    199+
    200+    //! Get the next entry in the flagged linked list, optionally removing the
    201+    //! current entry and clearing the flags.
    202+    inline CoinsCachePair* Next(bool clear_flags = false)
    


    paplorinc commented at 12:49 pm on June 28, 2024:
    Not yet sure why iteration is tied to mutation here

    andrewtoth commented at 6:57 pm on June 28, 2024:
    It’s mostly convenience. Otherwise to clear flags while iterating we would have to bind m_next to another variable first, then call ClearFlags on the current pair, and then bind the the other variable holding the old m_next to the current pair variable.

    paplorinc commented at 7:27 pm on June 28, 2024:
    Can you find a way to separate the side-effects? I find this very confusing.

    andrewtoth commented at 5:49 pm on June 29, 2024:
    Before this patch, the iteration is it = erase ? mapCoins.erase(it) : std::next(it);. So this is basically the same behavior, is it not?

    paplorinc commented at 7:34 pm on June 29, 2024:
    Maybe, but now the method returns AND mutates, that’s a confusing combination. I would expect it to be either void and mutate or return and not change internal state. Can we maybe move the mutation out to the call-site?

    andrewtoth commented at 0:31 am on July 1, 2024:
    The std lib uses this pattern though, with std::unordered_map::erase. It mutates and returns next. Surely we should be following std lib conventions, no?

    paplorinc commented at 8:47 am on July 1, 2024:
    ok, thanks for checking
  104. in src/coins.h:157 in 0a2bc92b0c outdated
    165+    }
    166+
    167+    //! Adding a flag also requires a self reference to the pair that contains
    168+    //! this entry in the CCoinsCache map and a reference to the head of the
    169+    //! flagged pair linked list.
    170+    inline void AddFlags(uint8_t flags, CoinsCachePair& self, CoinsCachePair& head)
    


    paplorinc commented at 1:26 pm on June 28, 2024:
    I find it quite confusing that AddFlags modifies the linked list as well (vs GetFlags only returning the flags, of course). Could make sense to split it to multiple methods or add the side-effect to the name

    andrewtoth commented at 5:53 pm on June 29, 2024:

    We want to tie adding a flag to adding the entry to the linked list. We don’t want to have an interface that lets something add a flag without adding it to the linked list. I think splitting that functionality into multiple methods would be dangerous.

    ClearFlags is similar. Does it matter to external callers how these methods modify the linked list? That should be transparent to callers. They just know that calling Next() from the head gets all flagged entries. The how should not be exposed IMO.


    paplorinc commented at 7:26 pm on June 29, 2024:
    Could the overall intent be better expressed here, something like SetCacheStateAndLink/ResetCacheStateAndUnlink or Sync/Reset or similar? It’s obviously doing more after this commit than what it announces…
  105. in src/test/coinscachepair_tests.cpp:43 in 6508bdfe43 outdated
    38+    CoinsCachePair head;
    39+    auto nodes{CreatePairs(head)};
    40+
    41+    // Check iterating through pairs is identical to iterating through a list
    42+    auto lit{nodes.begin()};
    43+    for (auto it{head.second.Next()}; it != nullptr; it = it->second.Next(/*clear_flags=*/false), ++lit) {
    


    paplorinc commented at 1:38 pm on June 28, 2024:

    Since iteration could misbehave severely and this would pass (e.g. if Next() would return nullptr we wouldn’t even enter the loop), it could make more sense to iterate over the target nodes instead, something like:

    0    auto node{head.second.Next()};
    1    for (const auto& expected : nodes) {
    2        BOOST_CHECK_EQUAL(&expected, node);
    3        node = node->second.Next();
    4    }
    5    BOOST_CHECK_EQUAL(nullptr, node);
    

    or

    0    auto node{&head};
    1    for (const auto& expected : nodes) {
    2        node = node->second.Next();
    3        BOOST_CHECK_EQUAL(&expected, node);
    4    }
    5    BOOST_CHECK_EQUAL(node->second.Next(), nullptr);
    

    andrewtoth commented at 2:59 pm on June 29, 2024:
    Done
  106. in src/test/coinscachepair_tests.cpp:15 in 6508bdfe43 outdated
    10+
    11+BOOST_AUTO_TEST_SUITE(coinscachepair_tests)
    12+
    13+static constexpr auto NUM_NODES{4};
    14+
    15+std::list<CoinsCachePair> CreatePairs(CoinsCachePair& head)
    


    paplorinc commented at 1:45 pm on June 28, 2024:
    shouldn’t we construct this via production code somehow? Otherwise we’re just testing test code…

    andrewtoth commented at 1:58 pm on June 29, 2024:
    Hmm well the AddFlags function is in production code, so this is constructing the linked list via production code. The std::list we create is a test harness helper that we use to check that our linked list behaves the same.
  107. in src/test/coinscachepair_tests.cpp:67 in 6508bdfe43 outdated
    58+        BOOST_CHECK_EQUAL(it->second.GetFlags(), 0);
    59+        BOOST_CHECK_EQUAL(it->second.Next(), nullptr);
    60+    }
    61+}
    62+
    63+BOOST_AUTO_TEST_CASE(linked_list_iterate_erase)
    


    paplorinc commented at 2:03 pm on June 28, 2024:
    I’m not exactly sure what this test is checking, that the doubly linked list is unaffected by the std::list’s modification?

    andrewtoth commented at 2:31 pm on June 29, 2024:
    This checks that the linked list nodes get removed from the linked list due to the list iterator deleting them, not due to clearing the flags in Next.

    paplorinc commented at 6:46 pm on June 29, 2024:
    Got it, thanks!
  108. in src/test/coinscachepair_tests.cpp:127 in 6508bdfe43 outdated
    115+    BOOST_CHECK_EQUAL(head.second.Next(), &(*n3));
    116+    BOOST_CHECK_EQUAL(n3->second.Next(), nullptr);
    117+
    118+    // Delete n3
    119+    // head->nullptr
    120+    nodes.erase(n3);
    


    paplorinc commented at 2:13 pm on June 28, 2024:
    is this here just for completeness’ sake? Or how would a list operation affect the head node’s behavior?

    andrewtoth commented at 2:04 pm on June 29, 2024:
    It just checks that removing the last element sets head to nullptr. I suppose just for completeness, since this case is the same as when we delete n4 above.

    paplorinc commented at 7:47 pm on June 29, 2024:
    same, it seems to me that it’s not the head that’s tested, but its next pointer

    andrewtoth commented at 0:31 am on July 1, 2024:
    Updated comment.
  109. in src/test/coinscachepair_tests.cpp:25 in 6508bdfe43 outdated
    20+
    21+        auto node{nodes.begin()};
    22+        node->second.AddFlags(CCoinsCacheEntry::DIRTY, *node, head);
    23+
    24+        BOOST_CHECK_EQUAL(node->second.GetFlags(), CCoinsCacheEntry::DIRTY);
    25+        BOOST_CHECK_EQUAL(head.second.Next(), &(*node));
    


    paplorinc commented at 2:15 pm on June 28, 2024:
    we should validate the m_prev as well here

    andrewtoth commented at 1:57 pm on June 29, 2024:
    There’s no accessor to m_prev though. This is checked below in the random deletion test.
  110. in src/coins.h:174 in 0a2bc92b0c outdated
    169+    //! flagged pair linked list.
    170+    inline void AddFlags(uint8_t flags, CoinsCachePair& self, CoinsCachePair& head)
    171+    {
    172+        if (flags == 0) return;
    173+        if (m_flags != 0) {
    174+            m_flags |= flags;
    


    paplorinc commented at 2:54 pm on June 28, 2024:
    I find it confusing that m_flags is set multiple times (suggestion below)

    andrewtoth commented at 2:59 pm on June 29, 2024:
    Done.
  111. in src/coins.h:180 in 0a2bc92b0c outdated
    175+            return;
    176+        }
    177+        m_prev = &head;
    178+        m_next = head.second.m_next;
    179+        head.second.m_next = &self;
    180+        if (m_next != nullptr) {
    


    paplorinc commented at 3:02 pm on June 28, 2024:

    we could move the m_next setter closer to the check, something like:

    0    inline void AddFlags(uint8_t flags, CoinsCachePair& self, CoinsCachePair& head)
    1    {
    2        if (!m_flags && flags) {
    3            m_prev = &head;
    4            m_next = head.second.m_next;
    5            if (m_next) m_next->second.m_prev = &self;
    6            head.second.m_next = &self;
    7        }
    8        m_flags |= flags;
    9    }
    

    andrewtoth commented at 2:59 pm on June 29, 2024:
    Done.
  112. in src/coins.h:175 in 0a2bc92b0c outdated
    190+        if (m_next != nullptr) {
    191+            m_next->second.m_prev = m_prev;
    192+        }
    193+        m_prev = nullptr;
    194+        m_next = nullptr;
    195+        m_flags = 0;
    


    paplorinc commented at 3:05 pm on June 28, 2024:

    we could apply the same concept here to separate the constants from the condition (i.e. flags are always 0 after this call):

    0    inline void ClearFlags()
    1    {
    2        if (m_flags) {
    3            if (m_next) m_next->second.m_prev = m_prev;
    4            m_prev->second.m_next = m_next;
    5            m_prev = m_next = nullptr;
    6        }
    7        m_flags = 0;
    8    }
    

    andrewtoth commented at 2:57 pm on June 29, 2024:
    Done
  113. paplorinc changes_requested
  114. paplorinc commented at 3:07 pm on June 28, 2024: contributor
    First half of the review:
  115. in src/coins.cpp:15 in a6324db9c6 outdated
    11@@ -12,7 +12,7 @@
    12 bool CCoinsView::GetCoin(const COutPoint &outpoint, Coin &coin) const { return false; }
    13 uint256 CCoinsView::GetBestBlock() const { return uint256(); }
    14 std::vector<uint256> CCoinsView::GetHeadBlocks() const { return std::vector<uint256>(); }
    15-bool CCoinsView::BatchWrite(CCoinsMap &mapCoins, const uint256 &hashBlock, bool erase) { return false; }
    16+bool CCoinsView::BatchWrite(CoinsCachePair *pairs, const uint256 &hashBlock, bool will_erase) { return false; }
    


    paplorinc commented at 4:21 pm on June 28, 2024:
    doing refactoring and logic changes in the same commit makes reviewing slightly harder - could you extract all non-critical changes in a separate commit so that we can concentrate on high-risk code more than on stylistic changes (which are important, but shouldn’t distract us)

    andrewtoth commented at 2:56 pm on June 29, 2024:
    Done
  116. in src/coins.cpp:332 in a6324db9c6 outdated
    327@@ -326,8 +328,8 @@ void CCoinsViewCache::SanityCheck() const
    328     size_t recomputed_usage = 0;
    329     for (const auto& [_, entry] : cacheCoins) {
    330         unsigned attr = 0;
    331-        if (entry.flags & CCoinsCacheEntry::DIRTY) attr |= 1;
    332-        if (entry.flags & CCoinsCacheEntry::FRESH) attr |= 2;
    333+        if (entry.GetFlags() & CCoinsCacheEntry::DIRTY) attr |= 1;
    334+        if (entry.GetFlags() & CCoinsCacheEntry::FRESH) attr |= 2;
    


    paplorinc commented at 4:30 pm on June 28, 2024:
    0        auto attr = entry.GetFlags();
    

    (or just remove the assert, doesn’t make a lot of sense to me)


    andrewtoth commented at 3:00 pm on June 29, 2024:
    Not sure I want to touch this right now.
  117. in src/test/coins_tests.cpp:61 in a6324db9c6 outdated
    59+    bool BatchWrite(CoinsCachePair* pairs, const uint256& hashBlock, bool will_erase = true) override
    60     {
    61-        for (CCoinsMap::iterator it = mapCoins.begin(); it != mapCoins.end(); it = erase ? mapCoins.erase(it) : std::next(it)) {
    62-            if (it->second.flags & CCoinsCacheEntry::DIRTY) {
    63+        for (auto it{pairs}; it != nullptr; it = it->second.Next(will_erase || !it->second.coin.IsSpent())) {
    64+            if (it->second.GetFlags() & CCoinsCacheEntry::DIRTY) {
    


    paplorinc commented at 4:34 pm on June 28, 2024:
    this it->second.GetFlags() & CCoinsCacheEntry::DIRTY appears quite often, maybe it would make sense to extract it to it->second.IsDirty() (and it->second.IsFresh(), of course)

    andrewtoth commented at 6:53 pm on June 28, 2024:
    I think this makes the diff easier to follow, since in git it highlights only the flags and GetFlags() in the line.

    paplorinc commented at 7:12 pm on June 28, 2024:
    if there was a separate commit doing only this, it would simplify the other important commits a lot

    andrewtoth commented at 3:00 pm on June 29, 2024:
    Done.
  118. paplorinc commented at 4:37 pm on June 28, 2024: contributor
    Fantastic change, would be helpful if you could make it a bit more digestible by separating it into smaller commits (especially splitting high-risk changes from low-risk refactoring), explaining the changes in commit messages.
  119. in src/coins.h:126 in 6ed20408df outdated
    123+     * erase all spent entries. Since only DIRTY or FRESH entries can be spent,
    124+     * we can track them in the linked list to erase them. This is a performance
    125+     * optimization compared to scanning all cache entries to find any that are
    126+     * spent-but-not-DIRTY.
    127+     */
    128+    CoinsCachePair* m_prev{nullptr};
    


    paplorinc commented at 4:50 pm on June 28, 2024:
    I don’t fully understand why we need m_prev here, but have you investigated whether it’s possible to get rid of it and rely on m_next only? Sometimes we still have a reference to both (current and next, so we wouldn’t need the next’s prev), but I assume that wasn’t always the case, right?

    andrewtoth commented at 6:49 pm on June 28, 2024:

    With m_next we have constant time insertion at the head, which would be fine if we only added flagged entries and cleared them as we traverse them.

    However, we also need to delete entries in constant time if they are both FRESH and DIRTY and get spent. For this we need a doubly linked list.


    paplorinc commented at 7:25 pm on June 28, 2024:
    What prohibits us from accessing the previous ones in constant time during iteration (I haven’t had the energy to delve into it properly yet)?

    andrewtoth commented at 8:16 pm on June 28, 2024:

    What prohibits us from accessing the previous ones in constant time during iteration

    Nothing during iteration. But we need to delete entries randomly when a FRESH coin is spent, which is not during any iteration.

  120. andrewtoth commented at 5:56 pm on June 28, 2024: contributor

    Fantastic change, would be helpful if you could make it a bit more digestible by separating it into smaller commits (especially splitting high-risk changes from low-risk refactoring), explaining the changes in commit messages.

    Thank you @paplorinc! I did try initially to separate the changes more. However, I found the third commit has to have all those changes done in one commit otherwise the test-all-commits CI job will fail. Changing only parts of it can break the tests, and break building the fuzz targets. I will explore to see if I can break it down any further and keep CI happy.

  121. paplorinc commented at 7:22 pm on June 28, 2024: contributor

    Changing only parts of it can break the tests

    I usually try to recreate every line, after I have all the changes by interactively rebasing the current state and adding small refactoring under the current commit. E.g. these could be separate commits doing a single thing:

    • renaming bool erase to bool will_erase
    • changing it->second.flags usages to it->second.GetFlags() (again, without change in behavior, but you could also add another commit after this which extracts the usages and returns it->second.IsFresh() instead, making commits after it easier to read)
    • changing it->second.flags |= to use it->second.AddFlags(CCoinsCacheEntry::DIRTY); (without any change in behavior, just doing it in a method)
    • changing iterators to the new linked list - without any other change in logic.

    etc, I think you got the picture. By extracting these to separate trivial commits (which are followed by high risk changes, done in as few lines as possible without any refactorings in the way), we can concentrate on the difficult ones (which are a lot simpler and in separate commits themselves).

    Thanks!

  122. andrewtoth force-pushed on Jun 29, 2024
  123. andrewtoth force-pushed on Jun 29, 2024
  124. DrahtBot added the label CI failed on Jun 29, 2024
  125. DrahtBot commented at 3:02 pm on June 29, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/26841759796

  126. DrahtBot removed the label CI failed on Jun 29, 2024
  127. andrewtoth commented at 6:06 pm on June 29, 2024: contributor

    @mzumsande @paplorinc thank you very much for your reviews! I’ve rebased and updated with all your suggested changes.

    I also updated the commits for easier review. First, flags is encapsulated whenever used, then made private. Then linked list head and self references are added to AddFlags, followed by AddFlags and ClearFlags actually maintaining a linked list of entries. Lastly, the linked list is used for BatchWrite instead of passing the entire map of entries.

  128. in src/coins.cpp:286 in 7f66953d5b outdated
    282@@ -283,7 +283,7 @@ bool CCoinsViewCache::Sync()
    283 void CCoinsViewCache::Uncache(const COutPoint& hash)
    284 {
    285     CCoinsMap::iterator it = cacheCoins.find(hash);
    286-    if (it != cacheCoins.end() && it->second.flags == 0) {
    287+    if (it != cacheCoins.end() && it->second.GetFlags() == 0) {
    


    paplorinc commented at 6:54 pm on June 29, 2024:

    I don’t mind the current one either, but now that you have helper method, maybe we don’t even need to expose the internals:

    0    if (it != cacheCoins.end() && !it->second.IsDirty() && !it->second.IsFresh()) {
    

    andrewtoth commented at 11:33 pm on June 30, 2024:
    This one is problematic though https://github.com/bitcoin/bitcoin/pull/28280/files#diff-3d0856e8b7f136c588b229e0cbd3b2e2c309cd097ade0879521daba4e1bb2a33L609. That would be a big change if we couldn’t get the flags.

    paplorinc commented at 8:36 am on July 1, 2024:

    The flgas seem like an implementation detail to me which we might not want to expose (we could have stored it in two booleans as well and the outside behavior shouldn’t change), so usages such as:

    0flags = it->second.GetFlags();
    

    could become something like:

    0setDirty(it->second.IsDirty());
    1setFresh(it->second.IsFresh());
    
  129. in src/coins.h:130 in d92dc3e21c outdated
    126@@ -127,6 +127,8 @@ struct CCoinsCacheEntry
    127         FRESH = (1 << 1),
    128     };
    129 
    130+    inline void AddFlags(unsigned int flags) { flags |= flags; }
    


    paplorinc commented at 6:59 pm on June 29, 2024:
    hmmm, shouldn’t this be this->flags |= flags;? I wonder how the tests passed…

    andrewtoth commented at 0:29 am on July 1, 2024:
    Done.
  130. in src/test/coins_tests.cpp:590 in d92dc3e21c outdated
    586@@ -587,7 +587,7 @@ static size_t InsertCoinsMapEntry(CCoinsMap& map, CAmount value, char flags)
    587     }
    588     assert(flags != NO_ENTRY);
    589     CCoinsCacheEntry entry;
    590-    entry.flags = flags;
    


    paplorinc commented at 6:59 pm on June 29, 2024:
    this seems like a change in behavior (= to |=)

    andrewtoth commented at 9:48 pm on June 29, 2024:
    It isn’t because entry is constructed the line above, and we know from default constructor flags will be initialized to 0. So 0 | flags is same as flags.

    paplorinc commented at 6:40 am on June 30, 2024:
    Indeed, same for line 130
  131. in src/coins.cpp:115 in 8ca27c7bb6 outdated
    112+    auto [it, inserted] = cacheCoins.emplace(
    113         std::piecewise_construct,
    114         std::forward_as_tuple(std::move(outpoint)),
    115-        std::forward_as_tuple(std::move(coin), CCoinsCacheEntry::DIRTY));
    116+        std::forward_as_tuple(std::move(coin)));
    117+    if (inserted) {
    


    paplorinc commented at 7:00 pm on June 29, 2024:
    can you please explain why this isn’t a change in behavior? (in the commit it was introduced in)

    andrewtoth commented at 0:29 am on July 1, 2024:
    Added a comment in the commit message.
  132. in src/coins.h:128 in e0f394ec9d outdated
    102@@ -103,8 +103,11 @@ class Coin
    103  */
    104 struct CCoinsCacheEntry
    105 {
    106+private:
    107+    uint8_t m_flags{0};
    


    paplorinc commented at 7:04 pm on June 29, 2024:
    nice!
  133. in src/test/coinscachepair_tests.cpp:62 in f5443052bf outdated
    55+    // Check that head is empty
    56+    BOOST_CHECK_EQUAL(head.second.Next(), nullptr);
    57+
    58+    // Delete the nodes from the list to make sure there are no dangling pointers
    59+    for (auto it{nodes.begin()}; it != nodes.end(); it = nodes.erase(it)) {
    60+        BOOST_CHECK_EQUAL(it->second.GetFlags(), 0);
    


    paplorinc commented at 7:37 pm on June 29, 2024:
    As mentioned before, instead of exposing GetFlags, we could check that it’s not fresh and not dirty

    andrewtoth commented at 0:29 am on July 1, 2024:
    Mentioned in another comment, we need GetFlags or we will have to change a lot of test code in coins_tests.cpp.

    paplorinc commented at 8:40 am on July 1, 2024:

    I’ll leave it up to you of course, but the instances I found would be simpler as well:

    0BOOST_CHECK_EQUAL(n1.second.GetFlags(), CCoinsCacheEntry::DIRTY);
    

    would become

    0BOOST_CHECK(n1.second.IsDirty());
    
  134. in src/test/coinscachepair_tests.cpp:78 in f5443052bf outdated
    73+    auto node{head.second.Next()};
    74+    for (auto expected{nodes.begin()}; expected != nodes.end(); expected = nodes.erase(expected)) {
    75+        BOOST_CHECK_EQUAL(&(*expected), node);
    76+        node = node->second.Next(/*clear_flags=*/false);
    77+    }
    78+    // Check that head is empty
    


    paplorinc commented at 7:45 pm on June 29, 2024:
    hmm, but that’s not what we’re checking, right, but that head’s next is empty

    andrewtoth commented at 0:28 am on July 1, 2024:
    Updated comment.
  135. in src/test/coinscachepair_tests.cpp:22 in f5443052bf outdated
    17+    std::list<CoinsCachePair> nodes;
    18+    for (auto i{0}; i < NUM_NODES; ++i) {
    19+        nodes.emplace_front();
    20+
    21+        auto node{nodes.begin()};
    22+        node->second.AddFlags(CCoinsCacheEntry::DIRTY, *node, head);
    


    paplorinc commented at 8:17 pm on June 29, 2024:

    so this basically inserts “in the middle” of the linked list all the time, i.e. not at the beginning or the end, right? I also find that quite confusing, i.e.

    0head <-> newest node <-> second newest <-> third newest <-> oldest node -> nullptr
    

    what if we iterated the initialized nodes backwards so that they reflect the order of insertion, something like:

     0std::list<CoinsCachePair> CreatePairs(CoinsCachePair& head)
     1{
     2    std::list<CoinsCachePair> nodes(NUM_NODES);
     3    for (auto it = nodes.rbegin(); it != nodes.rend(); ++it) {
     4        it->second.AddFlags(CCoinsCacheEntry::DIRTY, *it, head);
     5
     6        BOOST_CHECK_EQUAL(it->second.GetFlags(), CCoinsCacheEntry::DIRTY);
     7        BOOST_CHECK_EQUAL(head.second.Next(), &(*it));
     8        BOOST_CHECK_EQUAL(it->second.Next(), it == nodes.rbegin() ? nullptr : &(*std::prev(it)));
     9    }
    10    return nodes;
    11}
    
  136. in src/coins.h:163 in c36363f6b2 outdated
    161+    {
    162+        if (!m_flags && flags) {
    163+            m_prev = &head;
    164+            m_next = head.second.m_next;
    165+            if (m_next) m_next->second.m_prev = &self;
    166+            head.second.m_next = &self;
    


    paplorinc commented at 9:00 pm on June 29, 2024:

    I’m still thinking about this part, it’s basically a static method that doesn’t need this, which seems to indicate to me that it might be misplaced:

     0    inline void AddFlags(uint8_t flags, CoinsCachePair& self, CoinsCachePair& head)
     1    {
     2        if (!m_flags && flags) {
     3            AddToList(self, head);
     4        }
     5        m_flags |= flags;
     6    }
     7    static inline void AddToList(CoinsCachePair& self, CoinsCachePair& head) {
     8        self.second.m_prev = &head;
     9        self.second.m_next = head.second.m_next;
    10        if (head.second.m_next) head.second.m_next->second.m_prev = &self;
    11        head.second.m_next = &self;
    12    }
    

    No recommendation yet, just thinking out loud…

  137. paplorinc commented at 9:02 pm on June 29, 2024: contributor
    Added a few more comments, it’s a lot more understandable to me now - though I’m still lacking a few important details, I’ll investigate those next week. Thanks for all your effort!
  138. in src/coins.h:126 in c36363f6b2 outdated
    121+     * mean a spent coin exists in the parent CCoinsView and not in the child
    122+     * CCoinsViewCache. Nevertheless, if a spent coin is retrieved from the
    123+     * parent cache, the FRESH-but-not-DIRTY coin will be tracked by the linked
    124+     * list and deleted when Sync or Flush is called on the CCoinsViewCache.
    125+     */
    126+    CoinsCachePair* m_prev{nullptr};
    


    paplorinc commented at 7:00 am on June 30, 2024:

    Not sure if it helps, just an observation: m_prev only uses .second, so it might as well be a CCoinsCacheEntry*

     0diff --git a/src/coins.h b/src/coins.h
     1--- a/src/coins.h	(revision c36363f6b24c7ab2afe198d9855f507ddf096e1f)
     2+++ b/src/coins.h	(date 1719730674870)
     3@@ -123,7 +123,7 @@
     4      * parent cache, the FRESH-but-not-DIRTY coin will be tracked by the linked
     5      * list and deleted when Sync or Flush is called on the CCoinsViewCache.
     6      */
     7-    CoinsCachePair* m_prev{nullptr};
     8+    CCoinsCacheEntry* m_prev{nullptr};
     9     CoinsCachePair* m_next{nullptr};
    10     uint8_t m_flags{0};
    11 
    12@@ -157,9 +157,9 @@
    13     inline void AddFlags(uint8_t flags, CoinsCachePair& self, CoinsCachePair& head)
    14     {
    15         if (!m_flags && flags) {
    16-            m_prev = &head;
    17+            m_prev = &head.second;
    18             m_next = head.second.m_next;
    19-            if (m_next) m_next->second.m_prev = &self;
    20+            if (m_next) m_next->second.m_prev = &self.second;
    21             head.second.m_next = &self;
    22         }
    23         m_flags |= flags;
    24@@ -168,8 +168,9 @@
    25     {
    26         if (m_flags) {
    27             if (m_next) m_next->second.m_prev = m_prev;
    28-            m_prev->second.m_next = m_next;
    29-            m_prev = m_next = nullptr;
    30+            m_prev->m_next = m_next;
    31+            m_prev = nullptr;
    32+            m_next = nullptr;
    33         }
    34         m_flags = 0;
    35     }
    

    andrewtoth commented at 0:28 am on July 1, 2024:
    Done, nice catch!
  139. in src/coins.h:178 in c36363f6b2 outdated
    176+        }
    177+        m_flags = 0;
    178+    }
    179+    inline uint8_t GetFlags() const { return m_flags; }
    180+    inline bool IsDirty() const { return m_flags & CCoinsCacheEntry::DIRTY; }
    181+    inline bool IsFresh() const { return m_flags & CCoinsCacheEntry::FRESH; }
    


    paplorinc commented at 1:35 pm on June 30, 2024:

    we’re inside CCoinsCacheEntry, might as well omit the prefix:

    0    inline bool IsDirty() const { return m_flags & DIRTY; }
    1    inline bool IsFresh() const { return m_flags & FRESH; }
    
  140. in src/coins.cpp:187 in c36363f6b2 outdated
    186-            it != mapCoins.end();
    187-            it = erase ? mapCoins.erase(it) : std::next(it)) {
    188+bool CCoinsViewCache::BatchWrite(CoinsCachePair *pairs, const uint256 &hashBlockIn, bool will_erase) {
    189+    for (auto it{pairs};
    190+            it != nullptr;
    191+            it = it->second.Next(/*clear_flags=*/will_erase || !it->second.coin.IsSpent())) { // Keep flags for spent coins so they can be erased in Sync
    


    paplorinc commented at 1:39 pm on June 30, 2024:
    nit: it seems to me we’re really pushing it with the loop iteration + deletion here, could we do the deletion at the end of the loop instead (like in src/txdb.cpp) to have a cleaner separation or the responsibilities? It may make the Next method cleaner as well (I know it was similar before as well).
  141. in src/test/coinscachepair_tests.cpp:48 in c36363f6b2 outdated
    42+    auto node{head.second.Next()};
    43+    for (const auto& expected : nodes) {
    44+        BOOST_CHECK_EQUAL(&expected, node);
    45+        node = node->second.Next();
    46+    }
    47+
    


    paplorinc commented at 1:43 pm on June 30, 2024:
    after iteration we may still have elements left in node, right? Can we assert that we don’t?

    andrewtoth commented at 0:27 am on July 1, 2024:
    Done
  142. paplorinc commented at 2:18 pm on June 30, 2024: contributor

    This kept me up at night, we should be able to get rid of m_prev by doing insertions and deletions slightly differently:

    • we’d have an immutable SENTINEL value always serving as the last element (instead of nullptr);
    • insertion would always be in front of the previous element, which would serve as the new head (iterating from which will always get us to the SENTINEL);
    • deletion would copy the content (node, flag and next) of the next node instead (unless it’s a SENTINEL), and dispose of that instead.

    Something like this:

     0diff --git a/src/coins.h b/src/coins.h
     1--- a/src/coins.h	(revision c36363f6b24c7ab2afe198d9855f507ddf096e1f)
     2+++ b/src/coins.h	(date 1719755615706)
     3@@ -123,11 +123,11 @@
     4      * parent cache, the FRESH-but-not-DIRTY coin will be tracked by the linked
     5      * list and deleted when Sync or Flush is called on the CCoinsViewCache.
     6      */
     7-    CoinsCachePair* m_prev{nullptr};
     8     CoinsCachePair* m_next{nullptr};
     9     uint8_t m_flags{0};
    10 
    11 public:
    12+    static CoinsCachePair SENTINEL;
    13     Coin coin; // The actual cached data.
    14 
    15     enum Flags {
    16@@ -154,22 +154,19 @@
    17     //! Adding a flag also requires a self reference to the pair that contains
    18     //! this entry in the CCoinsCache map and a reference to the head of the
    19     //! flagged pair linked list.
    20-    inline void AddFlags(uint8_t flags, CoinsCachePair& self, CoinsCachePair& head)
    21+    inline void AddFlags(uint8_t flags, CoinsCachePair& next)
    22     {
    23         if (!m_flags && flags) {
    24-            m_prev = &head;
    25-            m_next = head.second.m_next;
    26-            if (m_next) m_next->second.m_prev = &self;
    27-            head.second.m_next = &self;
    28+            m_next = &next;
    29         }
    30         m_flags |= flags;
    31     }
    32     inline void ClearFlags()
    33     {
    34-        if (m_flags) {
    35-            if (m_next) m_next->second.m_prev = m_prev;
    36-            m_prev->second.m_next = m_next;
    37-            m_prev = m_next = nullptr;
    38+        if (m_flags && m_next != &SENTINEL) {
    39+            coin = std::move(m_next->second.coin);
    40+            m_flags = m_next->second.m_flags;
    41+            m_next = m_next->second.m_next;
    42         }
    43         m_flags = 0;
    44     }
    

    What do you think?

  143. andrewtoth commented at 2:50 pm on June 30, 2024: contributor

    we should be able to get rid of m_prev by doing insertions and deletions slightly differently

    But we can’t move the coin from the next entry to us. The coinsCache map will still point to the next entry for that outpoint.

  144. paplorinc commented at 3:06 pm on June 30, 2024: contributor

    The coinsCache map will still point to the next entry for that outpoint

    Yeah, but we can update the map in constant time, right? Do you think that would measurably slow down execution? Could the memory savings of one less pointer allow us to increase cache hit rate, or am I completely off here?

  145. andrewtoth commented at 3:13 pm on June 30, 2024: contributor
    Using a singly linked list would be better, agreed. I just don’t see how this design would work. How do we clear the flags but keep the coin? That is the purpose of this patch, to not delete an entry after clearing the flags.
  146. andrewtoth commented at 3:16 pm on June 30, 2024: contributor

    Oh wait, we only need random access for deletion! Otherwise we can clear the flags on iteration!

    Brilliant! I think this could work then.

  147. andrewtoth commented at 9:24 pm on June 30, 2024: contributor
    @paplorinc was trying to implement your suggestion, but it’s much too complex I think. How do we delete the node that points to the sentinel? If we move the sentinel into the node and reassign sentinel to that node, we would still need to erase that node from the coinsCache map. We could extract the node handle from the map and have the CCoinsViewCache own that, but this seems like far too much code being changed. I don’t think there’s that much benefit for removing one pointer per entry. We would also have to extract the node from the map after the next node moves into it, and then reinsert the node with the next node’s original outpoint.
  148. sipa commented at 9:28 pm on June 30, 2024: member

    @andrewtoth Just high-level brainstorming, and I really don’t want to push this PR into something more complex than necessary.

    But would it be possible to have a map from outpoints to entries, with a singly-linked list through the entries, such that *(map[key].m_prev) stores the data for key key, rather than map[key] itself?

  149. andrewtoth commented at 9:49 pm on June 30, 2024: contributor

    But would it be possible to have a map from outpoints to entries, with a singly-linked list through the entries, such that *(map[key].m_prev) stores the data for key key, rather than map[key] itself? @sipa Sorry I’m not sure I follow. What does *(map[key].m_prev) store?

  150. sipa commented at 10:18 pm on June 30, 2024: member

    @andrewtoth

    So the data structure we want has (a) an index on COutPoint and (b) a single-linked list that stores an iteration order. The issue (I think, I haven’t followed in detail) is that we need the ability to delete entries found through the (a) index, and if we naively do that, the entry that (b) points to the deleted entry will have its prev pointer invalidated.

    My suggestion is “shifting” the values of the map by one. map[key] stores the UTXO for the entry that is (b)-after key, so the UTXO for outpoint key is found in *(map[key]->m_prev) instead. This means an extra indirection on lookup (which may make the whole thing already not worth it from a performance perspective), but it does mean entries can be deleted (to delete key, do { auto to_del = map[key]->m_prev; map[key]->m_prev = to_del->m_prev; delete(to_del); }.

  151. andrewtoth commented at 10:42 pm on June 30, 2024: contributor

    @sipa thank you for the suggestion! I think it’s missing the fact that we want entries in the map to possibly not belong to the linked list. So we want to clear the linked list after a non-erasing flush, but leave the map intact otherwise. I don’t think that’s possible with just this data structure. We would need 2 parallel maps, one with unflagged entries and one with dirty entries. I’m not sure if that would be more performant with 2 lookups for every access? I think if we have 2 maps we can also just get rid of the linked list entirely.

    You did mention the 2 maps idea a while ago #15265 (comment) and @martinus had a similar idea earlier in this PR discussion.

  152. sipa commented at 10:45 pm on June 30, 2024: member
    @andrewtoth You can have 1 map, with 2 sentinels, each being the end of one linked list (one for flagged entries, one for unflagged ones)… but yeah, fair point, this perhaps complicates matters even more. And the performance penalty of this idea is probably not insignificant either.
  153. andrewtoth force-pushed on Jul 1, 2024
  154. refactor: encapsulate flags access for dirty and fresh checks
    No behavior change. This prepares moving the cache entry flags field
    to private access.
    
    Co-Authored-By: l0rinc <1841944+paplorinc@users.noreply.github.com>
    cf5bfda650
  155. refactor: encapsulate flags get access for all other checks
    No behavior change. This prepares moving the cache entry
    flags field to private access.
    405bd91716
  156. refactor: encapsulate flags setting by introducing AddFlags and ClearFlags
    No behavior change. This prepares moving the cache entry
    flags field to private access.
    62260e6dcf
  157. refactor: disallow setting flags in CCoinsCacheEntry constructors
    No behavior change because any entries that are added in EmplaceCoinInternalDANGER
    have DIRTY assigned to them after, and if they
    are not inserted then they will not be
    modified as before.
    This prepares moving the cache entry
    flags field to private access.
    
    
    Co-Authored-By: Martin Leitner-Ankerl <martin.ankerl@gmail.com>
    47bd57a92d
  158. refactor: move flags to private uint8_t and rename to m_flags
    No behavior change. This prepares to add CCoinsCacheEntrys
    to a doubly linked list when a flag is added.
    78ac8b0120
  159. refactor: add CoinsCachePair alias 4b95dd02b7
  160. refactor: require self and head parameters for AddFlags
    No behavior change. Prepares for adding the CoinsCachePairs to
    a linked list when flagged.
    564c5aa3f6
  161. refactor: rename erase to will_erase
    No behavior change. Prepare for change where BatchWrite
    will not erase coins internally.
    0b8c1ad6e9
  162. coins: call ClearFlags in CCoinsCacheEntry destructor
    No behavior change. Prepares for flags adding CCoinsCacheEntrys
    to a linked list which need to be removed on destruction.
    63aadf52c7
  163. coins: add cache entry linked list fields and methods
    No visible behavior change. This commit tracks the flagged
    entries internally but the list is not iterated by anything.
    
    Co-Authored-By: Pieter Wuille <pieter@wuille.net>
    a4275e3a7c
  164. test: add cache entry linked list tests
    
    
    Co-Authored-By: l0rinc <1841944+paplorinc@users.noreply.github.com>
    5f753defe5
  165. coins: track dirty cache entries in doubly linked list 8360a02992
  166. validation: don't erase coins cache on prune flushes 670084adc5
  167. andrewtoth force-pushed on Jul 1, 2024
  168. in src/test/coinscachepair_tests.cpp:152 in 5f753defe5 outdated
    147+    BOOST_CHECK_EQUAL(n1.second.Next(), nullptr);
    148+    BOOST_CHECK_EQUAL(head.Next(), &n1);
    149+
    150+    // Check that adding FRESH flag on new node inserts it after head
    151+    n2.second.AddFlags(CCoinsCacheEntry::FRESH, n2, head);
    152+    BOOST_CHECK_EQUAL(n2.second.GetFlags(), CCoinsCacheEntry::FRESH);
    


    paplorinc commented at 8:51 am on July 1, 2024:

    as mentioned before, after https://github.com/bitcoin/bitcoin/pull/28280/commits/cf5bfda650bd57636c7dd13f0ca03910d2aebe2e we can simplify these asserts

    0BOOST_CHECK(n2.second.IsFresh());
    
  169. in src/coins.cpp:281 in 670084adc5
    284-            ++it;
    285+            cacheCoins.erase(it->first);
    286+        } else if (fOk) {
    287+            /* BatchWrite must unset all unspent entries when will_erase=false. */
    288+            throw std::logic_error("Not all unspent flagged entries were unset");
    289         }
    


    paplorinc commented at 8:58 am on July 1, 2024:

    nit:

    0        if (!it->second.coin.IsSpent()) {
    1            /* BatchWrite must unset all unspent entries when will_erase=false. */
    2            throw std::logic_error("Not all unspent flagged entries were unset");
    3        }
    4        const auto next_entry{it->second.Next(/*clear_flags=*/true)};
    5        cachedCoinsUsage -= it->second.coin.DynamicMemoryUsage();
    6        cacheCoins.erase(it->first);
    

    andrewtoth commented at 1:15 pm on July 1, 2024:
    We don’t want to throw an error if fOk is false. This happens in tests when using mock parent caches that just return false and don’t unset any flags.

    paplorinc commented at 1:30 pm on July 1, 2024:
    Yeah, my suggestion was completely off here, thanks for checking.
  170. in src/coins.h:131 in cf5bfda650 outdated
    126@@ -127,6 +127,9 @@ struct CCoinsCacheEntry
    127         FRESH = (1 << 1),
    128     };
    129 
    130+    inline bool IsDirty() const { return flags & DIRTY; }
    131+    inline bool IsFresh() const { return flags & FRESH; }
    


    paplorinc commented at 9:13 am on July 1, 2024:

    nit: modern compilers will likely decide the need for inlining themselves

    Since inline substitution is unobservable in the standard semantics, compilers are free to use inline substitution for any function that’s not marked inline, and are free to generate function calls to any function marked inline.

    https://godbolt.org/z/ePYnhnYGM indicates with & without inline the same asm is produced


    sipa commented at 12:01 pm on July 1, 2024:
    Yes and no. Compilers do decide for themselves what to inline, but the inline keyword does increase compilers’ eagerness to inline.

    paplorinc commented at 1:34 pm on July 1, 2024:
    Does it make sense to check what the main compilers do in this case (the ones I checked via godbolt automatically inline), or was that just a waste of time, since seemingly unrelated changes could change the outcome (but if that’s the case, shouldn’t we let the compiler decide)?

    andrewtoth commented at 1:49 pm on July 1, 2024:

    I think not inlining would mean I would also have to add these methods to the implementation file instead of the header.

    Since you’re checking assembly, is there any difference if I add noexcept to these methods? I can’t find a definitive answer on if that will make any difference.


    sipa commented at 1:59 pm on July 1, 2024:

    So inline has two different effects in compilers:

      1. It allows the function to appear in multiple compilation units (which makes it mandatory here). Templated entities automatically have this behavior (with or without explicit inline).
      1. It suggests to the compiler that its implementation should be inlined in call sites. Because this inlining has no observable effect on compiler behavior, compilers can do this (or not do this) with or without inline. It has become less important as compilers are smart enough to make this decision themselves, but not entirely gone.

    sipa commented at 2:08 pm on July 1, 2024:

    @andrewtoth The C++ Core Guidelines suggest adding noexcept whenever you know a function won’t throw. Bitcoin Core treats failure to allocate memory as an unrecoverable condition, so having dynamic memory allocations in a function is not a reason not to add it.

    It may or may not have a performance benefit (especially when it’s a function called from a different compilation unit - and we’re not compiling with LTO enabled, or in default/move constructors of types used inside STL containers which may pick more efficient algorithm variants in that case), but even if it doesn’t, it’s useful as human documentation that a function isn’t expected to throw.

    That also holds for inline in my opinion, actually: it documents that the author intends for a function to be inlined, whether or not the compiler actually does that (at -O0 it won’t ever inline, at -O2 all tiny functions probably will).


    paplorinc commented at 2:14 pm on July 1, 2024:
    @andrewtoth are you asking because of the hasher involved in caching, i.e. https://github.com/bitcoin/bitcoin/blob/master/src/util/hasher.h#L41-L50?

    andrewtoth commented at 2:21 pm on July 1, 2024:
    @sipa thank you for the explanation. @paplorinc that was an example where it made a difference to performance that came to my mind, yes. I was just curious whether there would be any performance benefit of adding noexcept on these methods. If the asm generated with or without is the same then the answer is no. But the advice for adding them for human documentation makes sense.
  171. paplorinc commented at 9:24 am on July 1, 2024: contributor
    utACK 670084adc53e3d661ea5b4a19743659609a42d5e, will run the benchmarks in the following weeks, thanks for addressing the suggestions.
  172. DrahtBot requested review from jamesob on Jul 1, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-03 10:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me