sync: improve CCoinsViewCache ReallocateCache #28945

pull martinus wants to merge 3 commits into bitcoin:master from martinus:2023-11-improve-ccoinsviewcache-reallocatecache changing 3 files +13 −6
  1. martinus commented at 6:43 am on November 27, 2023: contributor

    TL;DR: this change improves sync time on my PC by ~6%.

    While syncing when the CCoinsViewCache gets full it is flushed to disk and then memory is freed. Depending on available cache size, this happens several times. This PR removes the unnecessary reallocations for temporary CCoinsViewCache and reserves the cacheCoins’s bucket array to its previous size.

    I benchmarked the change on an AMD Ryzen 9 7950X with this command:

    0hyperfine \
    1--show-output --parameter-list commit b5a271334ca81a6adcb1c608d85c83621a9eae47,ceeb71816512642e92a110b2cf5d2549d090fe78 \
    2--setup 'git checkout {commit} && make -j$(nproc) src/bitcoind' \
    3--prepare 'sync; sudo /sbin/sysctl vm.drop_caches=3' \
    4-M 3 './src/bitcoind -dbcache=500 -reindex-chainstate -printtoconsole=0 -stopatheight=500000'
    

    Where b5a271334ca81a6adcb1c608d85c83621a9eae47 is mergebase, and ceeb71816512642e92a110b2cf5d2549d090fe78 this PR.

    Which runs ./src/bitcoind -dbcache=500 -reindex-chainstate -printtoconsole=0 -stopatheight=500000 3 times and compares both commits, giving this result:

     0Benchmark 1: ./src/bitcoind -dbcache=500 -reindex-chainstate -printtoconsole=0 -stopatheight=500000 (commit = b5a271334ca81a6adcb1c608d85c83621a9eae47)
     1  Time (mean ± σ):     2815.648 s ± 70.720 s    [User: 2639.959 s, System: 640.051 s]
     2  Range (min … max):   2736.616 s … 2872.964 s    3 runs
     3
     4Benchmark 2: ./src/bitcoind -dbcache=500 -reindex-chainstate -printtoconsole=0 -stopatheight=500000 (commit = ceeb71816512642e92a110b2cf5d2549d090fe78)
     5  Time (mean ± σ):     2661.520 s ± 23.205 s    [User: 2473.010 s, System: 624.040 s]
     6  Range (min … max):   2635.816 s … 2680.926 s    3 runs
     7
     8Summary
     9  ./src/bitcoind -dbcache=500 -reindex-chainstate -printtoconsole=0 -stopatheight=500000 (commit = ceeb71816512642e92a110b2cf5d2549d090fe78) ran
    10    1.06 ± 0.03 times faster than ./src/bitcoind -dbcache=500 -reindex-chainstate -printtoconsole=0 -stopatheight=500000 (commit = b5a271334ca81a6adcb1c608d85c83621a9eae47)
    
  2. coins: Add parameter to specify if ReallocateCache is necessary 83d10e9688
  3. validation: don't reallocate cache for short lived CCoinsViewCache
    The short lived CCoinsViewCache's are destructed right after the Flush(), therefore it is not necessary to call ReallocateCache.
    985ddd9541
  4. coins: reserve bucket count ceeb718165
  5. DrahtBot commented at 6:43 am on November 27, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29236 (log: Nuke error(…) by maflcko)
    • #28280 (Don’t empty dbcache on prune flushes: >30% faster IBD by andrewtoth)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  6. helpau commented at 6:42 am on December 9, 2023: none
    Hi, is this benchmark for HDD or SSD? Did you tried to run benchmark with huge dbcache and high height?
  7. martinus commented at 8:56 am on December 9, 2023: contributor
    I only measured it on an SSD, and only with relatively low dbcache size so flushing is triggered more often which I think would give more accurate results. But more benchmarks are definitely appreciated :)
  8. in src/coins.cpp:328 in ceeb718165
    323     ::new (&m_cache_coins_memory_resource) CCoinsMapMemoryResource{};
    324     ::new (&cacheCoins) CCoinsMap{0, SaltedOutpointHasher{/*deterministic=*/m_deterministic}, CCoinsMap::key_equal{}, &m_cache_coins_memory_resource};
    325+
    326+    // After a flush all the cacheCoins's memory has now been freed. It is likely that the map gets full again so flush is necessary, and it will
    327+    // happen around the same memory usage. So we can already reserve the bucket array to the previous size, reducing the number of rehashes needed.
    328+    cacheCoins.reserve(current_bucket_count);
    


    andrewtoth commented at 10:03 pm on December 16, 2023:

    The purpose of ReallocateCache originally was to balance the snapshot and background chainstate caches (see #18637). However, in https://github.com/bitcoin/bitcoin/pull/25325/commits/5e4ac5abf54f8e6d6330df0c73119aa0cca4c103 this function was removed from ResizeCoinsCaches and moved to Flush. This created an implicit assumption that calling Flush would clear the memory in the cache. This change violates that assumption, since the memory will no longer be clear and therefore ResizeCoinsCaches will likely no longer respect the dbcache limit during assumeutxo syncing.

    Also see the docs for this function https://github.com/bitcoin/bitcoin/blob/master/src/coins.h#L336-L341. The purpose of this is to clear the memory of the cache.

    Perhaps we could get the bucket count in Flush before calling ReallocateCache and reserve it in Flush, then re-add ReallocateCache to ResizeCoinsCaches?

    edit: I see that reserve only allocates memory for the hashtable, and not the actual nodes stored in the unordered_map. I’m not sure how significant that extra memory is.

  9. in src/coins.cpp:253 in ceeb718165
    249@@ -250,14 +250,16 @@ bool CCoinsViewCache::BatchWrite(CCoinsMap &mapCoins, const uint256 &hashBlockIn
    250     return true;
    251 }
    252 
    253-bool CCoinsViewCache::Flush() {
    254+bool CCoinsViewCache::Flush(bool reallocate_cache) {
    


    andrewtoth commented at 10:43 pm on December 16, 2023:
    Should this parameter be added to the fuzz targets or unit tests?
  10. andrewtoth commented at 2:03 pm on December 17, 2023: contributor

    Ran same hyperfine benchmark with base, 985ddd954199cb7686ac75d12492d25b16d235c8, and head of branch. Got similar results :rocket:.

    0Summary
    1  ./src/bitcoind -dbcache=500 -reindex-chainstate -printtoconsole=0 -stopatheight=500000 (commit = ceeb71816512642e92a110b2cf5d2549d090fe78) ran
    2    1.06 ± 0.00 times faster than ./src/bitcoind -dbcache=500 -reindex-chainstate -printtoconsole=0 -stopatheight=500000 (commit = 985ddd954199cb7686ac75d12492d25b16d235c8)
    3    1.07 ± 0.00 times faster than ./src/bitcoind -dbcache=500 -reindex-chainstate -printtoconsole=0 -stopatheight=500000 (commit = b5a271334ca81a6adcb1c608d85c83621a9eae47)
    

    It’s unclear to me though why 5e4ac5abf54f8e6d6330df0c73119aa0cca4c103 was necessary at all, and if reverting that change would be a cleaner fix for this instead? From the commit message:

    This is necessary in preparation for using the PoolAllocator for CCoinsMap, which does not actually free any memory on clear().

    Why is it necessary to free memory on clear()? Is the unfreed memory not just recycled for new coins as they are added? The only purpose of ReallocateCache seemed to be to clear all memory when rebalancing caches during assumeutxo syncing, and is unrelated to Flush.

  11. DrahtBot added the label CI failed on Jan 16, 2024
  12. DrahtBot added the label Needs rebase on Mar 12, 2024
  13. DrahtBot commented at 11:30 am on March 12, 2024: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

  14. achow101 requested review from fjahr on Apr 9, 2024
  15. in src/coins.h:310 in 83d10e9688 outdated
    306@@ -307,7 +307,7 @@ class CCoinsViewCache : public CCoinsViewBacked
    307      * to be forgotten.
    308      * If false is returned, the state of this cache (and its backing view) will be undefined.
    309      */
    310-    bool Flush();
    311+    bool Flush(bool reallocate_cache = true);
    


    fjahr commented at 7:39 pm on April 27, 2024:
    Should probably add something about this to the docstring.
  16. fjahr commented at 7:51 pm on April 27, 2024: contributor
    @martinus Are you still working on this? If yes, it would be great if you could rebase and address @andrewtoth ’s comment. If not, I can try to take this over and get it the attention it deserves.
  17. martinus commented at 8:17 pm on April 27, 2024: contributor
    @fjahr I’m not working on this any more, feel free to take over!
  18. fanquake closed this on Apr 28, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me