dbwrapper: Bump LevelDB max file size to 128 MiB to avoid system slowdown from high disk cache flush rate #30039

pull maciejsszmigiero wants to merge 1 commits into bitcoin:master from maciejsszmigiero:dbwrapper-bump-max-file-size changing 2 files +2 −0
  1. maciejsszmigiero commented at 9:41 am on May 4, 2024: none

    The default max file size for LevelDB is 2 MiB, which results in the LevelDB compaction code generating ~4 disk cache flushes per second when syncing with the Bitcoin network. These disk cache flushes are triggered by the fdatasync() syscall issued by the LevelDB compaction code when reaching the max file size.

    If the database is on an HDD, this flush rate brings the whole system to a crawl. It also results in very slow throughput, since 2 MiB * 4 flushes per second is about 8 MiB / second max throughput, while even an old HDD can pull 100 - 200 MiB / second streaming throughput.

    Increase the max file size for LevelDB to 128 MiB instead so the flush rate drops to about 1 flush / 2 seconds and the system no longer gets so sluggish.

    The max file size value chosen also matches the MAX_BLOCKFILE_SIZE file size setting already used by the block storage.
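
The arithmetic above can be sketched as follows. This is a minimal illustration, assuming (as the description states) one disk cache flush per completed file; the helper names are hypothetical and exist in neither Bitcoin Core nor LevelDB.

```cpp
#include <cassert>

// Hypothetical helpers for illustration only.
// Assumption: one fdatasync() (one disk cache flush) per completed LevelDB file.

// Write throughput in MiB/s sustained at a given file size and flush rate.
constexpr double ThroughputMiBPerSec(double file_size_mib, double flushes_per_sec)
{
    return file_size_mib * flushes_per_sec;
}

// Flush rate (flushes per second) needed to sustain a given throughput.
constexpr double FlushesPerSec(double throughput_mib_per_sec, double file_size_mib)
{
    return throughput_mib_per_sec / file_size_mib;
}
```

With 2 MiB files and 4 flushes per second, `ThroughputMiBPerSec(2, 4)` gives the 8 MiB/s ceiling mentioned above. With 128 MiB files, one flush every 2 seconds corresponds to 64 MiB/s, and the original 8 MiB/s would need only one flush every 16 seconds.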

  2. DrahtBot commented at 9:41 am on May 4, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30039.

    Reviews

    See the guideline for information on the review process.

    Type          Reviewers
    Concept ACK   davidgumberg

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. sipa commented at 2:13 pm on May 4, 2024: member
    @jamesob Feel like benchmarking a reindex or so with this?
  4. laanwj commented at 4:41 pm on May 4, 2024: member
    Are there any drawbacks to this?
  5. laanwj added the label UTXO Db and Indexes on May 4, 2024
  6. willcl-ark commented at 4:52 pm on May 4, 2024: member
  7. maciejsszmigiero commented at 4:57 pm on May 4, 2024: none

    Are there any drawbacks to this?

    I didn’t notice any.

    It’s worth mentioning that the total amount of data stored in this database is at least two orders of magnitude higher than even 128 MiB file size.

  8. laanwj commented at 5:06 pm on May 4, 2024: member

    It’s worth mentioning that the total amount of data stored in this database is at least two orders of magnitude higher than even 128 MiB file size.

    Oh yes, i asked because i like this even from a “leveldb creates less files” point of view, eg anecdotally on one of my nodes the counter exceeds 6 digits bitcoin-core/bitcoin-maintainer-tools#161 . Of course, this includes deleted files, the active number is “only” about 6000.

  9. tdb3 commented at 6:30 pm on May 5, 2024: contributor

    Are there any drawbacks to this?

    I didn’t notice any.

    It’s worth mentioning that the total amount of data stored in this database is at least two orders of magnitude higher than even 128 MiB file size.

    It would be great if there are few/no drawbacks. Do you mind sharing the methods used so far to test this? It would be great to have some data for comparison.

    Other questions that come to mind (thinking out loud before I dig deeper or perform testing):

    • Does the change from 2MB to 128MB have any impact on consistent or transient RAM usage (i.e. for resource-constrained nodes)?
    • Is the file size (or an option to use the legacy smaller file size) something we would want to expose in bitcoin.conf (e.g. as a debug option)?
  10. andrewtoth commented at 9:31 pm on May 5, 2024: contributor

    Might partially address #29662

    That issue is complaining about long compaction times. From https://github.com/bitcoin/bitcoin/blob/master/src/leveldb/include/leveldb/options.h#L111-L112:

    The downside will be longer compactions and hence longer latency/performance hiccups.

    it seems this change would make compaction times longer, so would exacerbate that issue?

  11. maciejsszmigiero commented at 10:51 pm on May 5, 2024: none

    Do you mind sharing the methods used so far to test this?

    I am simply watching the disk cache flush rate in iostat(1). In addition to that, the difference in the system interactivity is also pretty apparent.

    Does the change from 2MB to 128MB have any impact on consistent or transient RAM usage (i.e. for resource-constrained nodes)?

    Did not observe any such effect; the RAM usage of the Bitcoin process seems to vary within roughly the same bounds when syncing with the Bitcoin network with or without this change.

    Is the file size (or an option to use the legacy smaller file size) something we would want to expose in bitcoin.conf (e.g. as a debug option)?

    Maybe, but I don’t know whether it makes sense to expose additional tuning option with respect to, for example, maintenance impact.

    The downside will be longer compactions and hence longer latency/performance hiccups.

    it seems this change would make compaction times longer, so would exacerbate that issue?

    For me, the biggest performance impact of compaction is from disk cache flushes this operation generates. This patch significantly reduces such flush rate and so should make compaction less painful.

  12. in src/dbwrapper.cpp:150 in 7f15e71f7e outdated
    146@@ -147,6 +147,7 @@ static leveldb::Options GetOptions(size_t nCacheSize)
    147         // on corruption in later versions.
    148         options.paranoid_checks = true;
    149     }
    150+    options.max_file_size = 128 << 20;
    


    andrewtoth commented at 0:55 am on May 6, 2024:
    Should we make this a constant? Would it be appropriate to reuse MAX_BLOCKFILE_SIZE?

    laanwj commented at 7:43 am on May 6, 2024:
    +1 on a constant, but i don’t think it’s appropriate to reuse MAX_BLOCKFILE_SIZE, better to define a new one

    maciejsszmigiero commented at 10:00 pm on May 7, 2024:
    Added a relevant constant.
  13. andrewtoth commented at 2:18 am on May 6, 2024: contributor

    Benchmarked IBD with an SSD to block 800k, dbcache=450, prune=0 with a local node serving the blocks. This branch is 27% (!) faster than master :rocket:

    commit 7f15e71f7e762645dbd1ea5eba9ecc6f9ad60236 (branch)
      Time (mean ± σ):     14711.490 s ± 225.376 s    [User: 19465.517 s, System: 1147.712 s]
      Range (min … max):   14552.125 s … 14870.854 s    2 runs

    commit eb0bdbdd753bca97120247b921fd29d606fea6e9 (master)
      Time (mean ± σ):     20274.276 s ± 106.042 s    [User: 21762.310 s, System: 4546.936 s]
      Range (min … max):   20199.293 s … 20349.259 s    2 runs
    

    This patch significantly reduces such flush rate and so should make compaction less painful.

    From what I understand, this patch reduces the frequency of flushes, but they will take longer when they do occur. This is great for IBD, but for #29662 the issue is an unavoidable compaction at startup. The compaction could potentially take longer with this patch.
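
As a cross-check of the headline figure, the two mean times above can be compared in two common ways. This is only a sketch with hypothetical helper names: the quoted 27% matches the reduction in wall-clock time, while the same data expressed as a speedup ratio is about 1.38x.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helpers for illustration only.

// Fraction of wall-clock time saved relative to the baseline.
double TimeReduction(double baseline_s, double candidate_s)
{
    return (baseline_s - candidate_s) / baseline_s;
}

// How many times faster the candidate ran than the baseline.
double SpeedupRatio(double baseline_s, double candidate_s)
{
    return baseline_s / candidate_s;
}
```

For master at 20274.276 s and the branch at 14711.490 s, `TimeReduction` yields roughly 0.274 and `SpeedupRatio` roughly 1.378. Stating which of the two is meant makes percentages from different reviewers easier to compare.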

  14. laanwj commented at 7:56 am on May 6, 2024: member

    This branch is 27% (!) faster than master

    That’s impressive!

    From what I understand, this patch reduces the frequency of flushes

    Not only the frequency of flushes; another potential advantage here is that leveldb will spend less time open()ing and close()ing files to maintain its allowed number of open files (eg the fd_limiter stuff).

  15. luke-jr commented at 5:49 pm on May 7, 2024: member
    If there’s no drawbacks, why not go even larger?
  16. willcl-ark commented at 6:38 pm on May 7, 2024: member

    I ran some benchmarks of IBD to block 800,000 vs master for comparison, and got some similar, if slightly less impressive, results with default dbcache.

    With -dbcache=16384:

    • master@ fdb41e08: 9607 s
    • master@ fdb41e08 + 7f15e71f7e762645dbd1ea5eba9ecc6f9ad60236: 9351 s
    • 3% faster with this change

    With -dbcache=450:

    • master@ fdb41e08: 15338 s
    • master@ fdb41e08 + 7f15e71f7e762645dbd1ea5eba9ecc6f9ad60236: 13246 s
    • ~16% faster with this change

    I only did a single run of each though. Sync was performed from a single second local node with datadir on a separate SSD.

  17. andrewtoth commented at 7:18 pm on May 7, 2024: contributor
    FWIW re: #29662 I did not notice any difference in compaction time at startup on an SSD. It takes about 5 seconds to finish with debug=leveldb both on master and this branch.
  18. dbwrapper: Bump max file size to 128 MiB
    The default max file size for LevelDB is 2 MiB, which results in the
    LevelDB compaction code generating ~4 disk cache flushes per second when
    syncing with the Bitcoin network.
    These disk cache flushes are triggered by fdatasync() syscall issued by the
    LevelDB compaction code when reaching the max file size.
    
    If the database is on a HDD this flush rate brings the whole system to a
    crawl.
    It also results in very slow throughput since 2 MiB * 4 flushes per second
    is about 8 MiB / second max throughput, while even an old HDD can pull
    100 - 200 MiB / second streaming throughput.
    
    Increase the max file size for LevelDB to 128 MiB instead so the flush rate
    drops to about 1 flush / 2 seconds and the system no longer gets so
    sluggish.
    
    The max file size value chosen also matches the MAX_BLOCKFILE_SIZE file
    size setting already used by the block storage.
    3e32d23c9e
  19. maciejsszmigiero force-pushed on May 7, 2024
  20. maciejsszmigiero commented at 10:07 pm on May 7, 2024: none

    If there’s no drawbacks, why not go even larger?

    I used 128 MiB as the new size for commonality with MAX_BLOCKFILE_SIZE already used by the block storage and because it gives me a nice low disk cache flush rate of about 1 flush / 2 seconds that no longer impacts the overall system performance.

    But just to be sure, changed the patch to use std::max() around this max_file_size option so if at some point LevelDB decides to increase its default above 128 MiB we won’t be lowering it accidentally.
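
A minimal sketch of the std::max() guard described above. The `Options` struct here is a stand-in for leveldb::Options reduced to the single relevant field, `ApplyMaxFileSize` is a hypothetical helper, and the exact declaration of the constant is an assumption (only its name appears in the PR diff).

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Stand-in for leveldb::Options, reduced to the one field discussed here.
struct Options {
    std::size_t max_file_size = 2 * 1024 * 1024; // LevelDB default of 2 MiB
};

// 128 MiB. The name follows the PR diff; the exact declaration is assumed.
static constexpr std::size_t DBWRAPPER_MAX_FILE_SIZE{128 << 20};

// Hypothetical helper: raise max_file_size to at least the constant,
// but never lower a larger value that is already set.
Options ApplyMaxFileSize(Options options)
{
    options.max_file_size = std::max(options.max_file_size, DBWRAPPER_MAX_FILE_SIZE);
    return options;
}
```

With the 2 MiB default the result is 128 MiB; if the incoming value were already, say, 256 MiB, it would be left unchanged.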

  21. mzumsande commented at 5:02 pm on May 8, 2024: contributor
    I’ve played around with this branch a bit, upgrading and downgrading between it and master with existing datadirs on signet and didn’t run into any issues. Also just noting that this will affect all leveldb databases, also the indexes and the block/index db.
  22. sipa commented at 6:23 pm on May 17, 2024: member
    It appears that RocksDB (a more-developed derivative of LevelDB) uses a default of 64 MiB (https://github.com/facebook/rocksdb/blob/main/include/rocksdb/advanced_options.h#L468). See also my comment in #30059 (comment).
  23. l0rinc commented at 12:18 pm on October 2, 2024: contributor

    I did a few benchmarks on HDD and SSD separately (no raspberry pi yet, but I understood @davidgumberg did some of those and saw a significant speedup), to see the effect of the different values on IBD.

    I have tried different values via #30059 (rebased), namely 1,2,4,8,16,32,64,128,256,512 MiB (current value is 2) with default dbcache, until 600k blocks using real nodes (which introduces some randomness, but the repeated runs should still indicate a trend).

    hyperfine \
      --runs 1 \
      --export-json /mnt/my_storage/ibd_benchmark.json \
      --parameter-list DBFILESIZE 1,2,4,8,16,32,64,128,256,512 \
      --prepare 'rm -rf /mnt/my_storage/BitcoinData/*' \
      './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize={DBFILESIZE} -printtoconsole=0'
    
    Benchmark 1: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=1 -printtoconsole=0
      Time (abs ≡):        9376.982 s               [User: 8939.258 s, System: 2037.366 s]

    Benchmark 2: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=2 -printtoconsole=0
      Time (abs ≡):        7809.227 s               [User: 8399.808 s, System: 1258.152 s]

    Benchmark 3: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=4 -printtoconsole=0
      Time (abs ≡):        7060.817 s               [User: 8210.950 s, System: 626.069 s]

    Benchmark 4: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=8 -printtoconsole=0
      Time (abs ≡):        7201.632 s               [User: 8046.769 s, System: 615.964 s]

    Benchmark 5: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=16 -printtoconsole=0
      Time (abs ≡):        7848.417 s               [User: 8394.320 s, System: 713.182 s]

    Benchmark 6: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=32 -printtoconsole=0
      Time (abs ≡):        8289.161 s               [User: 8183.729 s, System: 599.698 s]

    Benchmark 7: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=64 -printtoconsole=0
      Time (abs ≡):        7580.532 s               [User: 8077.446 s, System: 612.879 s]

    Benchmark 8: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=128 -printtoconsole=0
      Time (abs ≡):        9060.371 s               [User: 8140.057 s, System: 606.641 s]

    Benchmark 9: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=256 -printtoconsole=0
      Time (abs ≡):        8778.117 s               [User: 8001.854 s, System: 620.595 s]

    Benchmark 10: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=512 -printtoconsole=0
      Time (abs ≡):        7856.151 s               [User: 7970.946 s, System: 680.476 s]

    Summary
      './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=4 -printtoconsole=0' ran
        1.02 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=8 -printtoconsole=0'
        1.07 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=64 -printtoconsole=0'
        1.11 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=2 -printtoconsole=0'
        1.11 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=16 -printtoconsole=0'
        1.11 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=512 -printtoconsole=0'
        1.17 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=32 -printtoconsole=0'
        1.24 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=256 -printtoconsole=0'
        1.28 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=128 -printtoconsole=0'
        1.33 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=500000 -dbfilesize=1 -printtoconsole=0'
    
    Benchmark 1: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=1 -printtoconsole=0
      Time (abs ≡):        10150.860 s               [User: 8046.261 s, System: 1557.130 s]

    Benchmark 2: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=2 -printtoconsole=0
      Time (abs ≡):        8935.037 s               [User: 7746.422 s, System: 981.186 s]

    Benchmark 3: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=4 -printtoconsole=0
      Time (abs ≡):        7636.675 s               [User: 7348.012 s, System: 547.172 s]

    Benchmark 4: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=8 -printtoconsole=0
      Time (abs ≡):        7633.078 s               [User: 7306.267 s, System: 572.424 s]

    Benchmark 5: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=16 -printtoconsole=0
      Time (abs ≡):        7639.829 s               [User: 7266.532 s, System: 591.955 s]

    Benchmark 6: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=32 -printtoconsole=0
      Time (abs ≡):        7345.802 s               [User: 7265.908 s, System: 584.797 s]

    Benchmark 7: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=64 -printtoconsole=0
      Time (abs ≡):        7617.101 s               [User: 7092.537 s, System: 551.785 s]

    Benchmark 8: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=128 -printtoconsole=0
      Time (abs ≡):        7508.948 s               [User: 7065.206 s, System: 580.337 s]

    Benchmark 9: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=256 -printtoconsole=0
      Time (abs ≡):        7563.822 s               [User: 7093.650 s, System: 599.636 s]

    Benchmark 10: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=512 -printtoconsole=0
      Time (abs ≡):        7600.085 s               [User: 6997.129 s, System: 536.973 s]

    Summary
      ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=32 -printtoconsole=0 ran
        1.02 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=128 -printtoconsole=0
        1.03 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=256 -printtoconsole=0
        1.03 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=512 -printtoconsole=0
        1.04 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=64 -printtoconsole=0
        1.04 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=8 -printtoconsole=0
        1.04 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=4 -printtoconsole=0
        1.04 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=16 -printtoconsole=0
        1.22 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=2 -printtoconsole=0
        1.38 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=500000 -dbfilesize=1 -printtoconsole=0
    

    While these measurements aren’t definitive, both hinted at -dbfilesize=4 being better than -dbfilesize=2 (the default), and at -dbfilesize=128 possibly not being a lot better than -dbfilesize=4.

    I’ll rerun these with 2,4,8,64,128 and 800k blocks on the HDD to validate the findings.

  24. davidgumberg commented at 7:10 pm on October 4, 2024: contributor

    I cherry picked your branch onto master and did two runs syncing from a stable, dedicated local node on a Raspberry Pi 5 4GB using microSD for storage, with a prune of 2000 and the default dbcache, using the following command:

    ./build/src/bitcoind -daemon=0 -connect=ryzen7900xnode:8333 -stopatheight=800000 -prune=2000 -debug=bench -debug=blockstorage -debug=coindb -debug=mempool -debug=prune
    

    I saw a massive improvement, with your branch taking, on average, ~67.8% of the time taken by the master branch to reach block height 800,000 [1]:

    Branch                              Avg (hh:mm:ss)        Run 1                 Run 2
    Master                              47:17:14 (170,234s)   48:38:05 (175,085s)   45:56:22 (165,382s)
    Branch, cherry picked onto master   32:01:14 (115,274s)   34:06:26 (122,786s)   29:56:01 (107,761s)

    For me this validates that a substantial performance improvement is possible, I suspect especially on disk I/O constrained setups, and I’m really interested in making IBD on Raspberry Pis faster.

    Concept ACK on looking into the tradeoffs of different settings here. Not to try and duplicate discussion too much between this and #30059, but I second @l0rinc that looking for one good default seems better than making this configurable, unless we find evidence that different setups benefit substantially from different values.

    But I think more work needs to be done to identify what value works best here, and hopefully to come up with an account of why. I will try to bench some different max file size values on the Raspberry Pi that I have, similar to @l0rinc’s work above.



    1. These benchmarks took so long that the weather had changed between run 1 and run 2, and I am not running these in a room where the temperature is very well controlled, which I believe is the primary cause of run 2 being faster for both.

  25. l0rinc commented at 3:20 pm on October 7, 2024: contributor

    Finished benchmarking with the default 2 mb file size vs 4, 8, 64 and 128 mb. This time it’s full IBD with real peers until 800k blocks on a HDD.

    hyperfine   --runs 1   --export-json /mnt/ibd_DBFILESIZE.json   --parameter-list DBFILESIZE 2,4,8,64,128   --prepare 'rm -rf /mnt/BitcoinData/*'   './build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize={DBFILESIZE} -printtoconsole=0'
    
    Benchmark 1: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize=2 -printtoconsole=0
      Time (abs ≡):        36403.630 s               [User: 31186.459 s, System: 5761.138 s]

    Benchmark 2: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize=4 -printtoconsole=0
      Time (abs ≡):        30540.101 s               [User: 29188.931 s, System: 3430.547 s]

    Benchmark 3: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize=8 -printtoconsole=0
      Time (abs ≡):        28913.948 s               [User: 28857.575 s, System: 2292.117 s]

    Benchmark 4: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize=64 -printtoconsole=0
      Time (abs ≡):        27911.380 s               [User: 28268.729 s, System: 2179.778 s]

    Benchmark 5: ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize=128 -printtoconsole=0
      Time (abs ≡):        28191.359 s               [User: 27915.963 s, System: 2045.088 s]

      ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize=64 -printtoconsole=0 ran
        1.01 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize=128 -printtoconsole=0
        1.04 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize=8 -printtoconsole=0
        1.09 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize=4 -printtoconsole=0
        1.30 times faster than ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=800000 -dbfilesize=2 -printtoconsole=0
    

    Edit:

    Repeated the same for SSD, very similar results:

    hyperfine   --runs 1   --export-json /mnt/ibd_DBFILESIZE-ssd.json   --parameter-list DBFILESIZE 2,8,16,32,64 --prepare 'rm -rf /mnt/my_storage/BitcoinData/*'  './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize={DBFILESIZE} -printtoconsole=0 -dbcache=1000'
    
    Benchmark 1: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize=2 -printtoconsole=0 -dbcache=1000
      Time (abs ≡):        32323.964 s               [User: 30174.040 s, System: 6349.312 s]

    Benchmark 2: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize=8 -printtoconsole=0 -dbcache=1000
      Time (abs ≡):        24513.755 s               [User: 27618.551 s, System: 1728.897 s]

    Benchmark 3: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize=16 -printtoconsole=0 -dbcache=1000
      Time (abs ≡):        24648.438 s               [User: 27925.669 s, System: 1893.671 s]

    Benchmark 4: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize=32 -printtoconsole=0 -dbcache=1000
      Time (abs ≡):        24797.871 s               [User: 27621.893 s, System: 1755.004 s]

    Benchmark 5: ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize=64 -printtoconsole=0 -dbcache=1000
      Time (abs ≡):        25078.417 s               [User: 27879.669 s, System: 2064.851 s]

      './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize=8 -printtoconsole=0 -dbcache=1000' ran
        1.01 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize=16 -printtoconsole=0 -dbcache=1000'
        1.01 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize=32 -printtoconsole=0 -dbcache=1000'
        1.02 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize=64 -printtoconsole=0 -dbcache=1000'
        1.32 times faster than './build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbfilesize=2 -printtoconsole=0 -dbcache=1000'
    

    In conclusion it seems to me that 2mb is indeed too low, there seems to be a significant jump when doubling the file size (~20-30% faster), but after that the advantage is smaller (8mb is 25% faster, 64 mb is 30% faster and 128 is 29% faster).

    Since we’re not yet sure of all the second order effects of this change (longer compaction, more memory, migration problems, etc), I wouldn’t yet recommend jumping to 128, but to 8 or 16 only.

  26. in src/dbwrapper.cpp:150 in 3e32d23c9e
    146@@ -147,6 +147,7 @@ static leveldb::Options GetOptions(size_t nCacheSize)
    147         // on corruption in later versions.
    148         options.paranoid_checks = true;
    149     }
    150+    options.max_file_size = std::max(options.max_file_size, DBWRAPPER_MAX_FILE_SIZE);
    


    l0rinc commented at 9:10 pm on October 30, 2024:

    As mentioned in the comments, it seems to me that 16 may be a better default value based on the measured IBDs - basically just as fast as 128, without having to worry about the increase in e.g. MaxGrandParentOverlapBytes and ExpandedCompactionByteSizeLimit (10x and 25x this value) called e.g. in IsTrivialMove with a warning: "the move could create a parent file that will require a very expensive merge later on" (or any other such surprise) - which we likely want to avoid:

        options.max_file_size = 16 << 20;
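
To make the scaling concern concrete, here is a small sketch of how those two derived limits grow with max_file_size. It uses only the 10x and 25x multipliers quoted in the comment above; the functions are illustrative re-implementations that take the file size directly, not LevelDB's own code, and `MiB` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative only: the multipliers (10x and 25x) are those quoted in the
// review comment; the signatures here are simplified assumptions.
constexpr std::int64_t MaxGrandParentOverlapBytes(std::int64_t max_file_size)
{
    return 10 * max_file_size;
}

constexpr std::int64_t ExpandedCompactionByteSizeLimit(std::int64_t max_file_size)
{
    return 25 * max_file_size;
}

// Hypothetical convenience helper: n MiB expressed in bytes.
constexpr std::int64_t MiB(std::int64_t n) { return n << 20; }
```

Under these multipliers, 2 MiB files give limits of 20 MiB and 50 MiB, 16 MiB files give 160 MiB and 400 MiB, and 128 MiB files give 1280 MiB and 3200 MiB.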
    
  27. l0rinc commented at 8:54 pm on November 6, 2024: contributor

    @maciejsszmigiero, are you still working on this or should we take over?


    I can also confirm that it’s possible to just switch file size values back-and-forth without needing a reindex. I have reindexed until block 600k with master vs 16 mb blocks (instead of the 128 for the reasons mentioned before).

    The LevelDB files seem to effortlessly change from 2 MB to 17 MB:

    • chainstate/062435.ldb - 906'412 bytes
    • chainstate/061885.ldb - 2'171'330 bytes
    • chainstate/063212.ldb - 1'936'570 bytes
    • chainstate/064711.ldb - 982'165 bytes
    • chainstate/061518.ldb - 2'171'520 bytes
    • chainstate/062708.ldb - 2'169'653 bytes
    • chainstate/061659.ldb - 2'171'631 bytes
    • chainstate/063237.ldb - 2'170'487 bytes
    • chainstate/062435.ldb - 906'412 bytes
    • chainstate/065302.ldb - 17'347'086 bytes
    • chainstate/062708.ldb - 2'169'653 bytes
    • chainstate/063237.ldb - 2'170'487 bytes

    And when reverting to master, effortlessly go back:

    • chainstate/062468.ldb - 2'171'399 bytes
    • chainstate/065305.ldb - 17'358'270 bytes
    • chainstate/062543.ldb - 2'172'244 bytes
    • chainstate/062468.ldb - 2'171'399 bytes
    • chainstate/065579.ldb - 2'170'605 bytes
    • chainstate/068994.ldb - 2'169'617 bytes
    • chainstate/068954.ldb - 2'170'158 bytes
    • chainstate/062543.ldb - 2'172'244 bytes

    The total bytes on disk seems to be basically the same, but the number of files is reduced considerably (might alleviate open file problems):

    • before 2168 files, 4'383'947'229 bytes
    • after 280 files, 4'386'553'693 bytes
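
The before and after totals imply the following mean file sizes. This is a sketch with a hypothetical helper name; it uses integer division, so the remainder is discarded.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mean file size in bytes (integer division).
constexpr std::uint64_t MeanFileSize(std::uint64_t total_bytes, std::uint64_t file_count)
{
    return total_bytes / file_count;
}
```

With the numbers above, the mean goes from about 2.0 MB per file (2168 files) to about 15.7 MB per file (280 files), which is roughly 7.7 times fewer files for nearly the same number of bytes.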
  28. maciejsszmigiero commented at 9:40 pm on November 9, 2024: none

    @l0rinc

    are you still working on this or should we take over?

    I can obviously change the default in this PR to 16 MiB but I think having #30059 is important too: as you measured here on Oct 2 the best performing size on HDD storage actually seems to be 32 MiB.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 09:12 UTC
