[Performance] LevelDB options.max_open_files = 64 parameter (Windows 10) #12123

donaloconnor opened this issue on January 8, 2018
  1. donaloconnor commented at 10:01 pm on January 8, 2018: contributor

    Observations

    Bitcoind startup performance (Fully synced node)

    While running procmon when starting bitcoind.exe, I noticed millions of file open/read/close events on the chainstate LevelDB dir. The high-frequency file open/close events occurred during this call in init.cpp:

    if (!ActivateBestChain(state, chainparams)) ..

    Investigating this further led me to LevelDB’s LRUCache. We use a value of 64 for max_open_files:

    options.max_open_files = 64; in static leveldb::Options GetOptions(size_t nCacheSize)

    As far as I know, and from what I have read online, the default for LevelDB is 1000.

    I’ve noticed some (I consider significant) performance improvements by increasing max_open_files to the default of 1000. This avoids the overhead of many thousands of open/close operations per second on the files in the chainstate dir. It also avoids the unnecessary high-frequency heap allocations made on each open (LevelDB’s Win32RandomAccessFile objects).

    I am not a LevelDB expert, but from what I can gather this value must not exceed the maximum number of file handles that the process can have. Even so, 64 seems a bit on the low side.
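
    To illustrate why a small handle limit can produce so many open/close events, here is a minimal sketch of an LRU handle cache. It is a toy model, not LevelDB's implementation: the class HandleCache, the function CountOpens, and the workload (repeated sweeps over 500 files) are all hypothetical, chosen only to show how the miss count depends on the capacity.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_map>

// Toy stand-in for a table cache: keeps at most `capacity` "open" files and
// evicts the least recently used one when full. Not LevelDB's implementation.
class HandleCache {
public:
    explicit HandleCache(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Access a file by number; every cache miss counts as one "open".
    void Access(int file_number) {
        auto it = index_.find(file_number);
        if (it != index_.end()) {
            // Hit: move the entry to the front (most recently used).
            lru_.splice(lru_.begin(), lru_, it->second);
            return;
        }
        ++opens_;
        if (lru_.size() == capacity_) {
            // Miss with a full cache: "close" the least recently used file.
            index_.erase(lru_.back());
            lru_.pop_back();
            ++closes_;
        }
        lru_.push_front(file_number);
        index_[file_number] = lru_.begin();
    }

    std::size_t opens() const { return opens_; }
    std::size_t closes() const { return closes_; }

private:
    std::size_t capacity_;
    std::size_t opens_ = 0;
    std::size_t closes_ = 0;
    std::list<int> lru_;
    std::unordered_map<int, std::list<int>::iterator> index_;
};

// Hypothetical workload: sweep over `num_files` files, `rounds` times.
std::size_t CountOpens(std::size_t capacity, int num_files, int rounds) {
    HandleCache cache(capacity);
    for (int r = 0; r < rounds; ++r) {
        for (int f = 0; f < num_files; ++f) {
            cache.Access(f);
        }
    }
    return cache.opens();
}
```

    With this workload, CountOpens(64, 500, 10) reports 5000 opens because every access misses, while CountOpens(1000, 500, 10) reports 500 because each file is opened once and then stays cached. A cyclic sweep is the worst case for LRU, so real access patterns will likely fall somewhere in between.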

    Results

    Here are some of my results from 5 iterations each with max_open_files = 64 and 1000. The timings cover the ActivateBestChain function call, measured with a high-resolution timer.

    [image: ActivateBestChain timings for max_open_files = 64 and 1000]
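
    A measurement of this kind can be sketched as follows. The original post does not say which timer was used, so this sketch assumes std::chrono::steady_clock as a portable substitute. The helper name TimeMillis is hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Runs `fn` once and returns the elapsed wall-clock time in milliseconds,
// measured with a monotonic clock (an assumed substitute for the timer
// used in the original measurements).
double TimeMillis(const std::function<void()>& fn) {
    assert(fn);  // an empty std::function would throw when called
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(end - start).count();
}
```

    The call under test would be wrapped in a lambda and passed to TimeMillis. Repeating the measurement several times per setting, as done above, helps separate real differences from run-to-run noise.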

    Questions:

    1. Why did we choose 64 as the fixed global value of max_open_files?
    2. Should we expose max_open_files via a command line option, or should we be smarter with this value since it has performance benefits (perhaps even with initial chain sync)?

    My system: Dell XPS 17 9560, i7-7700HQ CPU @ 2.8 GHz (2801 MHz), 4 cores, 16 GB RAM, 512 GB PCIe SSD, Windows 10

    Release build (MSVC optimizations on /O2)

    It would be interesting if someone can try some tests on a Linux machine. It could be related to some overhead with Windows’ CreateFile while opening the files.

    If it’s accepted that this is a performance bottleneck, then I am happy to propose a solution or do more research on this parameter. At a minimum, we could expose it as a setting or command line option.

    Thanks, Donal

  2. MarcoFalke added the label UTXO Db and Indexes on Jan 11, 2018
  3. donaloconnor commented at 9:40 pm on January 15, 2018: contributor

    I did some more CPU profiling. Bitcoind running ~ 15 minutes (Start up time + one new block)

    [image: CPU profile of the most active functions]

    These are the most active functions (exclusive samples, i.e. not including nested function calls).

    It’s evident that the max_open_files parameter makes a difference. Finding the correct value for this is another story but I think we could save some CPU cycles by considering a number > 64 at least.

    LevelDB’s paranoid flag is also turned on, meaning that we compute a CRC checksum when opening tables (files). 7% of the collected samples include this. The number falls off as we require fewer file opens.

    I hope someone can comment on this research and maybe run similar numbers on a Linux machine.

  4. martinus commented at 9:04 am on January 19, 2018: contributor
    See #2557 for why it was reduced to 64
  5. donaloconnor commented at 8:14 pm on January 19, 2018: contributor

    @martinus thanks. It’s difficult to tell the main reason for dropping to 64, but I did see the PR referencing issues with crashes using the default of 1000. Maybe these crashes are no longer an issue, since this was 5 years ago. @sipa - do you think we should consider taking a look at this again?

    Cheers, Donal

  6. eklitzke commented at 6:48 pm on February 20, 2018: contributor

    I have been looking at this independently. I believe that the frequent close/munmap calls are causing the kernel to make poor decisions about page cache management during IBD/reindexing, which is what I’m trying to fix.

    There’s a trick to make this work safely on Linux (maybe other POSIX systems as well?):

    • If you close a file descriptor that is associated with a mmap mapping, the mapping remains valid and does not count against your file descriptor limit
    • The default mmap limit on Linux (sysctl vm.max_map_count) is 64k, which is huge and way more than we need (currently the chainstate database has about 1500 files in it on mainnet)
    • The only overhead of doing this is the extra page table entries, which use a negligible amount of extra memory

    This means that the code can be reworked to have all chainstate LevelDB files mmaped (for reads at least) simultaneously, without adversely affecting performance or using a lot of file descriptors. However, it does require some surgery to LevelDB internals. In concept the change sounds minimal but I have to do some actual hacking on the LevelDB code to see what the diff actually looks like when it’s fully implemented. I plan to work on this and present the changes for discussion if it ends up being sufficiently minimal/self contained.
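
    The file-descriptor trick from the bullets above can be demonstrated in isolation. This is a minimal POSIX sketch, not code from LevelDB or from any patch: the function name ReadViaMappingAfterClose is hypothetical, and it assumes a writable /tmp directory.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

// Writes `data` to a temporary file, maps it read-only, closes the file
// descriptor, and then reads the contents back through the mapping.
// Returns an empty string on any failure.
std::string ReadViaMappingAfterClose(const std::string& data) {
    assert(!data.empty());  // mmap rejects zero-length mappings
    char path[] = "/tmp/mmap_sketch_XXXXXX";  // assumes /tmp is writable
    int fd = mkstemp(path);
    if (fd < 0) return "";
    unlink(path);  // the file lives on until the fd and mapping are gone

    std::string result;
    const ssize_t written = write(fd, data.data(), data.size());
    if (written == static_cast<ssize_t>(data.size())) {
        void* addr = mmap(nullptr, data.size(), PROT_READ, MAP_SHARED, fd, 0);
        close(fd);  // the descriptor is released here...
        fd = -1;
        if (addr != MAP_FAILED) {
            // ...but the mapping is still readable.
            result.assign(static_cast<const char*>(addr), data.size());
            munmap(addr, data.size());
        }
    }
    if (fd >= 0) close(fd);
    return result;
}
```

    Calling ReadViaMappingAfterClose("chainstate") should return the same string on a POSIX system, even though the read happens after close(). Keeping many such mappings alive at once is what would need changes inside LevelDB, and that part is not shown here.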

  7. eklitzke commented at 8:44 pm on February 22, 2018: contributor
    I created #12495 to increase the file count on 64-bit Unix systems. The explanation I give there (about keeping block indexes loaded in memory) is likely the reason that increasing the file count also helps on Windows. However, I think the concerns about the number of open file descriptors are valid, and I don’t know whether Windows has the same optimization that Unix has for mmap’ing files without using extra file handles. (I’m not saying it doesn’t; I just don’t know anything about Windows.)
  8. adamjonas commented at 8:21 pm on April 29, 2020: member
    Believe this was closed by increasing LevelDB max_open_files to 1000 on Windows and other systems (except 32-bit POSIX hosts) in #12495.
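
    The platform-dependent choice described in this comment could look roughly like the sketch below. It is not the code from #12495: the function ChooseMaxOpenFiles and its parameters are hypothetical, and it assumes that the excluded 32-bit POSIX hosts keep the earlier value of 64.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: picks a max_open_files value from the platform traits
// mentioned in the thread. The values 1000 and 64 come from the discussion;
// that 32-bit POSIX hosts stay at 64 is an assumption of this sketch.
int ChooseMaxOpenFiles(bool is_windows, std::size_t pointer_bytes) {
    assert(pointer_bytes == 4 || pointer_bytes == 8);
    const int kLevelDbDefault = 1000;
    const int kReduced = 64;
    if (!is_windows && pointer_bytes < 8) {
        // 32-bit POSIX host: keep the smaller limit.
        return kReduced;
    }
    return kLevelDbDefault;
}
```

    A caller could pass sizeof(void*) as pointer_bytes and assign the result to options.max_open_files.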
  9. MarcoFalke closed this on Apr 29, 2020

  10. DrahtBot locked this on Feb 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-04 22:12 UTC