Add option dbfilesize to control LevelDB target (“max”) file size #30059

pull luke-jr wants to merge 1 commits into bitcoin:master from luke-jr:dbfilesize_param changing 4 files +12 −0
  1. luke-jr commented at 4:06 am on May 8, 2024: member

    Debug option to control LevelDB file sizes. Since LevelDB seems to always overshoot, “max” didn’t seem fitting.

    Intended to be followed up with a change to the default in a rebased #30039

  2. Add option dbfilesize to control LevelDB target ("max") file size 9160cc0b1f
  3. DrahtBot commented at 4:06 am on May 8, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30059.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK tdb3

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28358 (Drop -dbcache limit by Sjors)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. laanwj commented at 9:50 am on May 9, 2024: member
    i’m broadly on board with making this an option (seperately from changing the default in #30039). Assuming there are legit reasons for varying this based on the kind of server, operating system, amount of RAM, kind of disk, etc, and there in’t a single sweet spot.
  5. in src/init.cpp:453 in 9160cc0b1f
    448@@ -448,6 +449,11 @@ void SetupServerArgs(ArgsManager& argsman)
    449     argsman.AddArg("-datadir=<dir>", "Specify data directory", ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    450     argsman.AddArg("-dbbatchsize", strprintf("Maximum database write batch size in bytes (default: %u)", nDefaultDbBatchSize), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);
    451     argsman.AddArg("-dbcache=<n>", strprintf("Maximum database cache size <n> MiB (%d to %d, default: %d). In addition, unused mempool memory is shared for this cache (see -maxmempool).", nMinDbCache, nMaxDbCache, nDefaultDbCache), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    452+    argsman.AddArg("-dbfilesize",
    453+                   strprintf("Target size of files within databases, in MiB (%u to %u, default: %u).",
    


    tdb3 commented at 4:10 am on May 11, 2024:
    The text states a range of 1MiB to 1024MiB, but these values didn’t seem to be enforced (e.g. 0.5, -3, and 1025 were not rejected). Is the intent here to provide guidance for the user (i.e. tell the user to choose 1 to 1024 MiB) rather than enforce a specific range of values?

    luke-jr commented at 5:40 pm on May 11, 2024:
    LevelDB itself enforces it. Would it make sense to check it redundantly on our end to control error behaviour? (But then we have to remember to keep it in sync with LevelDB… but maybe just having the docs necessitates that)

    tdb3 commented at 4:07 pm on May 12, 2024:

    Yeah, redundant checking of range (beyond LevelDB) seems a bit much for a debug option (which is used by a more niche userbase). The range in the help message seems to be more of an example of a usable range rather than a recommendation of range or a strictly enforced range. The default value would be the defacto recommendation.

    If we’re set on continuing to have LevelDB handle the range rather than Core, it seems like it would make sense to inform the user in the help message that the range is an example. This could help prevent a developer from wasting time on a PR that misses the intent (e.g. could envision a PR starting with: ...dbfilesize specifies a range of allowable values, but doesn't enforce this range. This PR fixes this...).

    It might also help to clarify to the user which files this option impacts.

    This is a bit of a nit of a nit, so feel free to disregard it. Example:

     0diff --git a/src/init.cpp b/src/init.cpp
     1index 5cc513e18c9..f4be033f1c1 100644
     2--- a/src/init.cpp
     3+++ b/src/init.cpp
     4@@ -450,7 +450,7 @@ void SetupServerArgs(ArgsManager& argsman)
     5     argsman.AddArg("-dbbatchsize", strprintf("Maximum database write batch size in bytes (default: %u)", nDefaultDbBatchSize), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);
     6     argsman.AddArg("-dbcache=<n>", strprintf("Maximum database cache size <n> MiB (%d to %d, default: %d). In addition, unused mempool memory is shared for this cache (see -maxmempool).", nMinDbCache, nMaxDbCache, nDefaultDbCache), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
     7     argsman.AddArg("-dbfilesize",
     8-                   strprintf("Target size of files within databases, in MiB (%u to %u, default: %u).",
     9+                   strprintf("Target size of files within databases (chainstate, indexes), in MiB (example: %u to %u, default: %u).",
    10                              1, 1024,
    11                              DEFAULT_DB_FILE_SIZE),
    12                    ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);
    
  6. tdb3 commented at 4:16 am on May 11, 2024: contributor

    Concept ACK.

    Thank you. Seems reasonable to include this as a debug option, and increases flexibility over the single value chosen for PR #30039. Left an inline comment/question.

    Ran a quick sanity check: With dbfilesize=64 set (along with txindex=1), performed an IBD on Signet from a node on the same LAN, then stopped and restarted bitcoind. Saw chainstate and indexes/txindex files that were close to 64MB (blocks/index for signet was less than 64MB, but still had a file over 2MB).

  7. sipa commented at 3:49 pm on May 17, 2024: member

    I think it would be good to understand what the trade-offs are for this before considering making it configurable (as in: understand in what situations we’d want to recommend people use lower vs higher values). I assume lower values mean more open files/IO and less outdated data on disk, and higher values mean fewer compactions. Depending on whether those are the only effects, and to what extent these things are affected by the value, maybe we can get away with a default value that’s a few % of the expected total db size.

    FWIW, RocksDB uses a default of 64 MiB, and has an option (default off) to make higher database-level files become larger (e.g.: level L files would target 64 * (2^L) MiB).

  8. DrahtBot commented at 9:04 pm on September 16, 2024: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

  9. DrahtBot added the label Needs rebase on Sep 16, 2024
  10. in src/dbwrapper.cpp:230 in 9160cc0b1f
    226@@ -227,6 +227,7 @@ CDBWrapper::CDBWrapper(const DBParams& params)
    227     DBContext().iteroptions.fill_cache = false;
    228     DBContext().syncoptions.sync = true;
    229     DBContext().options = GetOptions(params.cache_bytes);
    230+    DBContext().options.max_file_size = params.options.max_file_size;
    


    l0rinc commented at 12:05 pm on October 2, 2024:

    setting the option here after we’ve partially constructed it seems inconsequential, could we add it as a parameter instead?

     0diff --git a/src/dbwrapper.cpp b/src/dbwrapper.cpp
     1--- a/src/dbwrapper.cpp	(revision 5f69dc11cb53c1794e9e71e6f398b95c1037cdc2)
     2+++ b/src/dbwrapper.cpp	(date 1727870700414)
     3@@ -134,9 +134,10 @@
     4              options->max_open_files, default_open_files);
     5 }
     6 
     7-static leveldb::Options GetOptions(size_t nCacheSize)
     8+static leveldb::Options GetOptions(size_t nCacheSize, size_t max_file_size)
     9 {
    10     leveldb::Options options;
    11+    options.max_file_size = max_file_size;
    12     options.block_cache = leveldb::NewLRUCache(nCacheSize / 2);
    13     options.write_buffer_size = nCacheSize / 4; // up to two write buffers may be held in memory simultaneously
    14     options.filter_policy = leveldb::NewBloomFilterPolicy(10);
    15@@ -226,8 +227,7 @@
    16     DBContext().iteroptions.verify_checksums = true;
    17     DBContext().iteroptions.fill_cache = false;
    18     DBContext().syncoptions.sync = true;
    19-    DBContext().options = GetOptions(params.cache_bytes);
    20-    DBContext().options.max_file_size = params.options.max_file_size;
    21+    DBContext().options = GetOptions(params.cache_bytes, params.options.max_file_size);
    22     DBContext().options.create_if_missing = true;
    23     if (params.memory_only) {
    24         DBContext().penv = leveldb::NewMemEnv(leveldb::Env::Default());
    

    luke-jr commented at 10:02 pm on December 5, 2024:
    That just seems uglier?
  11. l0rinc commented at 12:21 pm on October 2, 2024: contributor
    I’ve rebased this and ran a few benchmarks with different values. Will do a few more measurements, but so far I’m rather leaning on finding a better default value instead of making this configurable. See: #30039 (comment)

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-22 00:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me