CCoinsCache: Split read and write caches, MRU cache for read, add stats to RPC #4700

pull laanwj wants to merge 13 commits into bitcoin:master from laanwj:2014_08_ccoins_write changing 6 files +234 −44
  1. laanwj commented at 11:04 AM on August 14, 2014: member

    Most of these changes depend on each other, but are in separate commits and can be split off into separate pulls if this is deemed too high-impact to apply at once.

    Track cache statistics in getblockchaininfo

    This makes it possible to measure and compare caching strategies.

    "coinscache" : {
        "positive_hits" : 9186,
        "negative_hits" : 7579,
        "positive_misses" : 50791,
        "negative_misses" : 43200,
        "writes" : 8569,
        "cache_size" : 93722,
        "rcache_size" : 91207,
        "wcache_size" : 2515,
        "cache_limit" : 171267
    }
    

    Currently this is part of getblockchaininfo. I'm not sure about that, it probably should get its own RPC.

    Separate read and write cache

    Keep track of entries that are read-only or r/w in separate caches. This is an optimization that avoids writing back all accessed entries back to the database on a flush.

    The read and write cache can be flushed separately. Only the write entries will be written back.

    This also opens the door to using a different data structure for the read and write cache later on.

    Cache negative hits

    Currently the cache doesn't remember when a coin was not found. It could cache negative results as well.

    Looking at the statistics this can make a difference in number of database queries almost as large as positive caching during normal node operation.

    The implementation makes use of the fact that a a CCoins with 'pcoins->vout.empty()' is regarded as non-existent.

    Change read cache to MRU cache

    Changes the read cache to a MRU cache implemented using boost::bimap. This allows cleaning up the cache to a fixed size instead of flushing the entire cache periodically, and should result in more effective caching. Write cache flushes are done with the same frequency as before, and are still all-or-nothing.

    Flush write cache into read cache

    After writing the entries to the underlying store, move them to the read cache. It is likely that they will be accessed again soon. If not, the entries will be evicted eventually.

    Memory usage

    Peak memory usage hasn't changed compared to before. However, the evolution of memory usage in time has changed. Before, the cache was flushed completely every 10 minutes (when not in initial block sync) or when the cache was full (when initial block sync). Now, instead of alternating between empty and full it will plateau.

    Testing

    I did quite a bit of testing with these changes, amongst others various reindexes of the mainnet and testnet block chains as well as tests with -checkblocks and running a node for a while. Still, especially the MRU cache switch needs to be tested much more extensively before it can be merged.

  2. Create a constant for chainstate write period in WriteChainState c6c14d91e2
  3. Add coinscache information to RPC
    Makes it possible to measure and compare caching strategies.
    542a3fdec3
  4. Split CCoinsView::GetCoins into GetCoins and ModifyCoins
    Make intention more clear:
    
    - Call ModifyCoins if you intend to change the coins
    - Call GetCoins to get a const reference to the coins
    
    This can be used for dirty tracking later on.
    4d417c23f5
  5. Implement CCoinsViewCache::GetCoins with FetchCoins
    Avoids duplicating logic.
    Also move the function together with its other overload in the
    implementation file.
    15158ca8eb
  6. Adds stats counting to CCoinsViewCache 2488d41382
  7. Make CCoinsViewCache::FetchCoins return a pointer instead of an iterator
    This makes it possible to use different data structures for different
    parts of the cache.
    b1b1a54651
  8. CCoinsViewCache: separate read and write caches
    Keep track of entries that are read-only or r/w in separate
    caches. This is an optimization that avoids writing back all
    accessed entries back to the database on a flush.
    
    Also makes the way clear for future optimizations such as putting a
    fixed limit on the size of the read cache.
    b935d02264
  9. Report rcache and wcache size separately in RPC cfdd3b6442
  10. Change WriteChainState to use cache flags
    - Flush READ and WRITE caches if maximum size exceeded (`pcoinsTip->GetCacheSize() > nCoinCacheSize`)
    - Flush WRITE cache periodically if !InitialBlockDownload()
    3bb0fb2cfd
  11. laanwj added the label UTXO Db and Indexes on Aug 14, 2014
  12. Move entries from write to read cache on flush
    There's a large chance that entries will be accessed again after
    being written.
    133499ead7
  13. Cache negative hits 71f14baff1
  14. Change the read cache to a MRU cache 5469a9f27c
  15. Update VerifyDB and WriteChainState to make use of LRU cleanup 1648f256f7
  16. sipa commented at 10:09 AM on August 18, 2014: member

    I think the largest benefit of this will come later, from being able to track in-cache-created entries and removing them in-memory if they are fully spent, preventing them from ever needing to be serialized/written. I'm sure that tuning things like how frequently and when to write what can bring additional benefits.

    I'll look at the code in more detail later. Perhaps we can try to first implement such optimizations as further commits on top of this, to see whether there are actual benefits before merging.

  17. laanwj commented at 10:33 AM on August 18, 2014: member

    Indeed, there are a few other things we could try. But basically the MRU read cache seems to be useless at the moment:

    • During checkblocks: everything enters the write cache of the child cache (coins), so there is no use to the read cache in pcoinsTip
    • During initial sync: the read cache does have a lot of hits, but it appears that misses don't figure much into the overall timings (probably as leveldb has its own caching). So using the CCoinsCache as we do now, as a 'write buffer/batcher' is OK performance-wise
    • During normal node use: the cache gets lots of hits during transaction and block validation, but performance consequences of this are hard to measure. At least with current transaction volumes maybe it's just not so bad to query the underlying database. Especially as leveldb does caching of its own
  18. BitcoinPullTester commented at 5:35 PM on August 18, 2014: none

    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/p4700_1648f256f7104e7164f8aa81ae81b82bc68918e3/ for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.

  19. laanwj closed this on Aug 26, 2014

  20. MarcoFalke locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 15:15 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me