Add option dbfilesize to control LevelDB target (“max”) file size #30059

pull luke-jr wants to merge 1 commits into bitcoin:master from luke-jr:dbfilesize_param changing 4 files +12 −2
  1. luke-jr commented at 4:06 am on May 8, 2024: member

    Debug option to control LevelDB file sizes. Since LevelDB seems to always overshoot, “max” didn’t seem fitting.

    Intended to be followed up with a change to the default in a rebased #30039

  2. DrahtBot commented at 4:06 am on May 8, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30059.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK tdb3

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. laanwj commented at 9:50 am on May 9, 2024: member
    i’m broadly on board with making this an option (seperately from changing the default in #30039). Assuming there are legit reasons for varying this based on the kind of server, operating system, amount of RAM, kind of disk, etc, and there in’t a single sweet spot.
  4. in src/init.cpp:494 in 9160cc0b1f outdated
    448@@ -448,6 +449,11 @@ void SetupServerArgs(ArgsManager& argsman)
    449     argsman.AddArg("-datadir=<dir>", "Specify data directory", ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    450     argsman.AddArg("-dbbatchsize", strprintf("Maximum database write batch size in bytes (default: %u)", nDefaultDbBatchSize), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);
    451     argsman.AddArg("-dbcache=<n>", strprintf("Maximum database cache size <n> MiB (%d to %d, default: %d). In addition, unused mempool memory is shared for this cache (see -maxmempool).", nMinDbCache, nMaxDbCache, nDefaultDbCache), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    452+    argsman.AddArg("-dbfilesize",
    453+                   strprintf("Target size of files within databases, in MiB (%u to %u, default: %u).",
    


    tdb3 commented at 4:10 am on May 11, 2024:
    The text states a range of 1MiB to 1024MiB, but these values didn’t seem to be enforced (e.g. 0.5, -3, and 1025 were not rejected). Is the intent here to provide guidance for the user (i.e. tell the user to choose 1 to 1024 MiB) rather than enforce a specific range of values?

    luke-jr commented at 5:40 pm on May 11, 2024:
    LevelDB itself enforces it. Would it make sense to check it redundantly on our end to control error behaviour? (But then we have to remember to keep it in sync with LevelDB… but maybe just having the docs necessitates that)

    tdb3 commented at 4:07 pm on May 12, 2024:

    Yeah, redundant checking of range (beyond LevelDB) seems a bit much for a debug option (which is used by a more niche userbase). The range in the help message seems to be more of an example of a usable range rather than a recommendation of range or a strictly enforced range. The default value would be the defacto recommendation.

    If we’re set on continuing to have LevelDB handle the range rather than Core, it seems like it would make sense to inform the user in the help message that the range is an example. This could help prevent a developer from wasting time on a PR that misses the intent (e.g. could envision a PR starting with: ...dbfilesize specifies a range of allowable values, but doesn't enforce this range. This PR fixes this...).

    It might also help to clarify to the user which files this option impacts.

    This is a bit of a nit of a nit, so feel free to disregard it. Example:

     0diff --git a/src/init.cpp b/src/init.cpp
     1index 5cc513e18c9..f4be033f1c1 100644
     2--- a/src/init.cpp
     3+++ b/src/init.cpp
     4@@ -450,7 +450,7 @@ void SetupServerArgs(ArgsManager& argsman)
     5     argsman.AddArg("-dbbatchsize", strprintf("Maximum database write batch size in bytes (default: %u)", nDefaultDbBatchSize), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);
     6     argsman.AddArg("-dbcache=<n>", strprintf("Maximum database cache size <n> MiB (%d to %d, default: %d). In addition, unused mempool memory is shared for this cache (see -maxmempool).", nMinDbCache, nMaxDbCache, nDefaultDbCache), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
     7     argsman.AddArg("-dbfilesize",
     8-                   strprintf("Target size of files within databases, in MiB (%u to %u, default: %u).",
     9+                   strprintf("Target size of files within databases (chainstate, indexes), in MiB (example: %u to %u, default: %u).",
    10                              1, 1024,
    11                              DEFAULT_DB_FILE_SIZE),
    12                    ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);
    
  5. tdb3 commented at 4:16 am on May 11, 2024: contributor

    Concept ACK.

    Thank you. Seems reasonable to include this as a debug option, and increases flexibility over the single value chosen for PR #30039. Left an inline comment/question.

    Ran a quick sanity check: With dbfilesize=64 set (along with txindex=1), performed an IBD on Signet from a node on the same LAN, then stopped and restarted bitcoind. Saw chainstate and indexes/txindex files that were close to 64MB (blocks/index for signet was less than 64MB, but still had a file over 2MB).

  6. sipa commented at 3:49 pm on May 17, 2024: member

    I think it would be good to understand what the trade-offs are for this before considering making it configurable (as in: understand in what situations we’d want to recommend people use lower vs higher values). I assume lower values mean more open files/IO and less outdated data on disk, and higher values mean fewer compactions. Depending on whether those are the only effects, and to what extent these things are affected by the value, maybe we can get away with a default value that’s a few % of the expected total db size.

    FWIW, RocksDB uses a default of 64 MiB, and has an option (default off) to make higher database-level files become larger (e.g.: level L files would target 64 * (2^L) MiB).

  7. DrahtBot added the label Needs rebase on Sep 16, 2024
  8. in src/dbwrapper.cpp:230 in 9160cc0b1f outdated
    226@@ -227,6 +227,7 @@ CDBWrapper::CDBWrapper(const DBParams& params)
    227     DBContext().iteroptions.fill_cache = false;
    228     DBContext().syncoptions.sync = true;
    229     DBContext().options = GetOptions(params.cache_bytes);
    230+    DBContext().options.max_file_size = params.options.max_file_size;
    


    l0rinc commented at 12:05 pm on October 2, 2024:

    setting the option here after we’ve partially constructed it seems inconsequential, could we add it as a parameter instead?

     0diff --git a/src/dbwrapper.cpp b/src/dbwrapper.cpp
     1--- a/src/dbwrapper.cpp	(revision 5f69dc11cb53c1794e9e71e6f398b95c1037cdc2)
     2+++ b/src/dbwrapper.cpp	(date 1727870700414)
     3@@ -134,9 +134,10 @@
     4              options->max_open_files, default_open_files);
     5 }
     6 
     7-static leveldb::Options GetOptions(size_t nCacheSize)
     8+static leveldb::Options GetOptions(size_t nCacheSize, size_t max_file_size)
     9 {
    10     leveldb::Options options;
    11+    options.max_file_size = max_file_size;
    12     options.block_cache = leveldb::NewLRUCache(nCacheSize / 2);
    13     options.write_buffer_size = nCacheSize / 4; // up to two write buffers may be held in memory simultaneously
    14     options.filter_policy = leveldb::NewBloomFilterPolicy(10);
    15@@ -226,8 +227,7 @@
    16     DBContext().iteroptions.verify_checksums = true;
    17     DBContext().iteroptions.fill_cache = false;
    18     DBContext().syncoptions.sync = true;
    19-    DBContext().options = GetOptions(params.cache_bytes);
    20-    DBContext().options.max_file_size = params.options.max_file_size;
    21+    DBContext().options = GetOptions(params.cache_bytes, params.options.max_file_size);
    22     DBContext().options.create_if_missing = true;
    23     if (params.memory_only) {
    24         DBContext().penv = leveldb::NewMemEnv(leveldb::Env::Default());
    

    luke-jr commented at 10:02 pm on December 5, 2024:
    That just seems uglier?
  9. l0rinc commented at 12:21 pm on October 2, 2024: contributor
    I’ve rebased this and ran a few benchmarks with different values. Will do a few more measurements, but so far I’m rather leaning on finding a better default value instead of making this configurable. See: #30039 (comment)
  10. luke-jr force-pushed on Apr 5, 2025
  11. Add option dbfilesize to control LevelDB target ("max") file size c283a57214
  12. luke-jr force-pushed on Apr 5, 2025
  13. DrahtBot removed the label Needs rebase on Apr 5, 2025
  14. l0rinc commented at 5:37 pm on April 5, 2025: contributor
    Given that we couldn’t find any measurable difference for bigger values, do you still think it makes sense to make this configurable? And if so, what’s the intended usecase?
  15. luke-jr commented at 9:38 pm on April 5, 2025: member
    Can’t predict the behaviours of all storage devices and filesystems. This would enable users to tweak it and see if another configuration works better for them.
  16. laanwj commented at 9:15 am on April 7, 2025: member

    Given that we couldn’t find any measurable difference for bigger values, do you still think it makes sense to make this configurable? And if so, what’s the intended usecase?

    No strong opinion but i veer toward “no”, we should not be adding unneccesary knobs, and beyond this size the comparative overhead of handling files is neglible. Maybe there comes a future time when this size needs to be increased again, but it’s not something end-users are going to play with.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-04-27 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me