coins: compact chainstate regularly after post-IBD flushes #35465

pull l0rinc wants to merge 3 commits into bitcoin:master from l0rinc:l0rinc/forcecompactdb-ibd-exit changing 6 files +111 −7
  1. l0rinc commented at 4:02 PM on June 4, 2026: contributor

    Problem: https://github.com/bitcoin-core/leveldb-subtree/pull/61 disabled read-triggered seek compactions to avoid large chainstate write amplification from random UTXO lookups. That avoids repeated read-driven rewrites, but fragmented LevelDB layouts no longer self-repair during IBD.

    During IBD, this can leave validation reading through too many SSTables in compactible levels, putting pressure on LevelDB's table-cache/open-file/mmap budget. After IBD, normal chainstate churn can also leave obsolete entries behind until ordinary LevelDB compaction naturally reaches the affected levels.

    Fix: After each completed full chainstate flush, decide whether to compact the chainstate.

    After IBD, compact randomly with 1/1000 probability per full flush. This spreads compactions across nodes and keeps recurring maintenance stateless, without storing last-compaction height or timestamp metadata in the chainstate database. Compaction runs on a background thread (`utxocompact`) so it does not block validation.
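The stateless trigger described above can be sketched as follows. This is a minimal illustration, not the PR's code: `ShouldCompactAfterFlush` and `CountTriggers` are hypothetical names, and `std::mt19937` stands in for Bitcoin Core's own random number generator.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Hypothetical helper: decide after a completed full flush whether to start a
// compaction. No height or timestamp is stored; every flush is an independent
// draw with probability 1/flush_ratio.
template <typename Rng>
bool ShouldCompactAfterFlush(Rng& rng, bool in_ibd, uint32_t flush_ratio = 1000)
{
    assert(flush_ratio > 0);
    if (in_ibd) return false; // Random compaction only applies after IBD.
    std::uniform_int_distribution<uint32_t> dist(0, flush_ratio - 1);
    return dist(rng) == 0;
}

// Count how many of `flushes` simulated post-IBD flushes would trigger compaction.
inline uint32_t CountTriggers(uint32_t flushes, uint32_t flush_ratio, uint32_t seed)
{
    std::mt19937 rng(seed);
    uint32_t hits = 0;
    for (uint32_t i = 0; i < flushes; ++i) {
        if (ShouldCompactAfterFlush(rng, /*in_ibd=*/false, flush_ratio)) ++hits;
    }
    return hits;
}
```

Because each draw is independent, two nodes that flush at the same height are unlikely to compact at the same time, and nothing has to be persisted between restarts.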

  2. DrahtBot added the label Validation on Jun 4, 2026
  3. DrahtBot commented at 4:02 PM on June 4, 2026: contributor


    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.


    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/35465.


    Reviews

    See the guideline for information on the review process.

    | Type | Reviewers |
    |---|---|
    | ACK | andrewtoth |

    If your review is incorrectly listed, please copy-paste `<!--meta-tag:bot-skip-->` into the comment that the bot should ignore.


    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #34897 (indexes: Don't commit ahead of the flushed chainstate by mzumsande)
    • #34320 (coins: delegate CCoinsViewDB::HaveCoin to GetCoin by l0rinc)
    • #30342 (kernel, logging: Pass Logger instances to kernel objects by ryanofsky)
    • #24230 (indexes: Stop using node internal types and locking cs_main, improve sync logic by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.


  4. in src/validation.cpp:3418 in dbf0762734
    3414 | @@ -3385,6 +3415,7 @@ bool Chainstate::ActivateBestChain(BlockValidationState& state, std::shared_ptr<
    3415 |                      pindexMostWork = nullptr;
    3416 |                  }
    3417 |                  pindexNewTip = m_chain.Tip();
    3418 | +                force_compaction |= pindexNewTip->GetBlockHash() == m_chainman.AssumedValidBlock();
    


    sedited commented at 4:30 PM on June 4, 2026:

    Could we retrieve the assumed valid block from m_chainman.GetConsensus().defaultAssumevalid instead?


    l0rinc commented at 6:30 PM on June 4, 2026:

    That would have made testing more difficult, so I ended up relying on minimum chainwork instead.

  5. DrahtBot added the label CI failed on Jun 4, 2026
  6. DrahtBot commented at 5:22 PM on June 4, 2026: contributor


    🚧 At least one of the CI tasks failed. Task test ancestor commits: https://github.com/bitcoin/bitcoin/actions/runs/26963717717/job/79561784206 LLM reason (✨ experimental): CI failed because CTest reported 1 failing test: coins_tests.

    <details><summary>Hints</summary>

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

    </details>

  7. l0rinc renamed this:
    validation: compact chainstate after connecting assumevalid block
    validation: compact chainstate at minimum chain work
    on Jun 4, 2026
  8. l0rinc force-pushed on Jun 4, 2026
  9. in src/validation.cpp:1951 in ed7cb385f5
    1946 | +        if (m_chainman.m_chainstates.size() != 1) return; // Skip assumeutxo.
    1947 | +        Assert(this == &m_chainman.CurrentChainstate());
    1948 | +        coins_db = &CoinsDB();
    1949 | +    }
    1950 | +
    1951 | +    if (BlockValidationState state; !FlushStateToDisk(state, FlushStateMode::FORCE_FLUSH)) { // Ensure compaction reflects the latest chainstate.
    


    andrewtoth commented at 6:48 PM on June 4, 2026:

    We want the latest in memory chainstate to be consistent with leveldb before we compact, but we don't need to clear the cache. That would needlessly cause a lot more read requests at the same time that we are compacting in the background.

        if (BlockValidationState state; !FlushStateToDisk(state, FlushStateMode::FORCE_SYNC)) { // Ensure compaction reflects the latest chainstate.
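The suggestion relies on the difference between writing dirty entries to disk and also emptying the in-memory cache. A toy model of that difference, assuming nothing about the real coins cache (`ToyCache`, `Sync()` and `Flush()` here are hypothetical and only mirror the idea):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// One cached value plus a flag saying whether it still has to be written out.
struct Entry {
    int value;
    bool dirty;
};

// Toy write-back cache in front of a "disk" map.
class ToyCache
{
public:
    explicit ToyCache(std::map<std::string, int>& disk) : m_disk(disk) {}

    void Write(const std::string& key, int value) { m_entries[key] = Entry{value, true}; }

    // Write dirty entries to disk but keep everything cached for later reads.
    void Sync()
    {
        for (auto& [key, entry] : m_entries) {
            if (entry.dirty) {
                m_disk[key] = entry.value;
                entry.dirty = false;
            }
        }
    }

    // Write dirty entries to disk and drop the in-memory entries.
    void Flush()
    {
        Sync();
        m_entries.clear();
    }

    std::size_t CachedEntries() const { return m_entries.size(); }

private:
    std::map<std::string, int>& m_disk;
    std::map<std::string, Entry> m_entries;
};
```

After `Sync()` the disk is up to date and later lookups can still be served from memory; after `Flush()` every lookup has to go back to disk, which is the extra read load the comment wants to avoid while a compaction is running.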
    
  10. in doc/release-notes-35465.md:1 in ed7cb385f5
       0 | @@ -0,0 +1,7 @@
       1 | +Performance
    


    andrewtoth commented at 6:48 PM on June 4, 2026:

    I'm not sure this change warrants a release note. This change should not be visible to the user.

  11. andrewtoth commented at 6:48 PM on June 4, 2026: contributor

    Concept ACK

  12. l0rinc force-pushed on Jun 4, 2026
  13. l0rinc force-pushed on Jun 4, 2026
  14. DrahtBot removed the label CI failed on Jun 4, 2026
  15. l0rinc force-pushed on Jun 4, 2026
  16. andrewtoth commented at 12:32 PM on June 5, 2026: contributor

    I tested this on mainnet with a pruned IBD with default min work params. The compaction is triggered after 938344 is connected on the scheduler thread, and completes in 2m 32s while 110 blocks were connected on msghand thread. The resulting chainstate size after 952425 was 12GB. Doing the same IBD on master had a chainstate size of 15GB.

    However, the current size of the chainstate db of 11 GB is rather unfortunate, since it straddles L4 and L5. Most files would naturally sit at L4, but a full compaction moves all files to L5. This means that when spending any UTXO, the tombstone entry will never reach L5 until 11 GB of new data is added. This means that in a year the db could be 22 GB even if there is a net negative growth in the UTXO set. Because of this, I think we will need a scheduled compaction instead of doing this only once.
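The "straddles L4 and L5" observation can be checked with a small sketch. It assumes LevelDB's usual level sizing of 10 MiB for L1 with a 10x multiplier per level, which is an assumption about the configuration and is not stated in this thread; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Assumed LevelDB sizing: L1 holds up to 10 MiB and each deeper level holds
// ten times more. L0 is limited by file count instead and is skipped here.
constexpr uint64_t MaxMiBForLevel(int level)
{
    uint64_t result = 10;
    for (int i = 1; i < level; ++i) result *= 10;
    return result;
}

// Total MiB that fit in levels 1..deepest before size compaction has to push
// data into the next level.
constexpr uint64_t CumulativeMiB(int deepest)
{
    uint64_t total = 0;
    for (int level = 1; level <= deepest; ++level) total += MaxMiBForLevel(level);
    return total;
}
```

Under that assumption `CumulativeMiB(4)` is 11,110 MiB, so a database of roughly 11 GB sits right where L1 through L4 are full. Once a full compaction has moved everything to L5, L4 alone can absorb about 10,000 MiB of new data before anything is merged into L5, which is why tombstones can linger.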

  17. andrewtoth commented at 9:11 PM on June 5, 2026: contributor

    I ran an experiment with a chainstate synced up to 952529.

    First, we took the chainstate that is ~14.7 GB and erased every entry via an iterator and batch write. Since many files are in L4, size compaction was able to erase many of them, reducing chainstate size by over 4 GB. Still, not ideal since our empty chainstate db is still over 10 GB!

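    | Level | Start files | Start MB | Erased files | Erased MB |
    |---|---:|---:|---:|---:|
    | L0 | 4 | 9 | 0 | 0 |
    | L1 | 1 | 3 | 0 | 0 |
    | L2 | 5 | 81 | 3 | 96 |
    | L3 | 45 | 983 | 31 | 998 |
    | L4 | 311 | 9,939 | 172 | 5,471 |
    | L5 | 117 | 3,643 | 117 | 3,643 |
    | L6 | 0 | 0 | 0 | 0 |
    | Total (du) | | 14,666 | | 10,231 |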

    Second, I took the synced chainstate and did a manual full compaction on it (what this PR aims to do). This reduced the size of the db to ~10.6 GB, but every file is now on L5. Then I did the same as above, erasing every single entry. Now, the empty db balloons to ~16 GB. This can be explained because size compaction no longer pushes any entries past L4 until there's more than 10 GB of new data.

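    | Level | Start files | Start MB | Erased files | Erased MB |
    |---|---:|---:|---:|---:|
    | L0 | 0 | 0 | 0 | 0 |
    | L1 | 0 | 0 | 0 | 0 |
    | L2 | 0 | 0 | 3 | 97 |
    | L3 | 0 | 0 | 31 | 995 |
    | L4 | 0 | 0 | 131 | 4,203 |
    | L5 | 337 | 10,618 | 337 | 10,618 |
    | L6 | 0 | 0 | 0 | 0 |
    | Total (du) | | 10,618 | | 15,936 |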

    So, I think this PR has a decent approach to scheduling the compaction, but I think we need to also schedule a compaction randomly every month or two. After triggering another manual compaction, both dbs correctly reduce to size 0.

  18. l0rinc commented at 9:15 PM on June 5, 2026: contributor

    I have also measured the performance; the extra compaction doesn't add any meaningful slowdown.

    <details><summary>reindex-chainstate without and with compaction</summary>

    for DBCACHE in 5000; do
      COMMITS="47da4f9b716d11294d4fb0f30b04a7bcf128cc14 4bcac7f699457fb001410a49795d48e4941b27bd";
      STOP=950059; CC=gcc; CXX=g++;
      BASE_DIR="/mnt/my_storage"; DATA_DIR="$BASE_DIR/BitcoinData"; LOG_DIR="$BASE_DIR/logs";
      (echo ""; for c in $COMMITS; do git fetch -q origin "$c" 2>/dev/null || true; git log -1 --pretty='%h %s' $c || exit 1; done) &&
      (echo "" && echo "$(date -I) | reindex-chainstate | ${STOP} blocks | dbcache ${DBCACHE} | $(hostname) | $(uname -m) | $(lscpu | grep 'Model name' | head -1 | cut -d: -f2 | xargs) | $(nproc) cores | $(free -h | awk '/^Mem:/{print $2}') RAM | $(lsblk -no ROTA $(df --output=source $BASE_DIR | tail -1) | grep -q 1 && echo HDD || echo SSD)"; echo "") &&
      hyperfine \
        --sort command \
        --runs 1 \
        --export-json "$BASE_DIR/rdx-$(sed -E 's/[^ ]+/\L&/g;s/[.]/_/g;s/ /-/g'<<<"$COMMITS")-$STOP-$DBCACHE-$CC.json" \
        --parameter-list COMMIT ${COMMITS// /,} \
        --prepare "killall -9 bitcoind 2>/dev/null; rm -f ./build/bin/bitcoind; git clean -fxd; git reset --hard {COMMIT} && \
          cmake -B build -G Ninja -DCMAKE_BUILD_TYPE=Release && ninja -C build bitcoind -j1 && \
          ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=1000 -printtoconsole=0; sleep 20; rm -f $DATA_DIR/debug.log; rm -rfd $DATA_DIR/indexes;" \
        --conclude "killall bitcoind || true; sleep 5; grep -q 'height=0' $DATA_DIR/debug.log && grep -q 'Disabling script verification at block #1' $DATA_DIR/debug.log && grep -q 'height=$STOP' $DATA_DIR/debug.log && grep 'Bitcoin Core version' $DATA_DIR/debug.log | grep -q \"\$(git rev-parse --short=12 {COMMIT})\"; \
                    cp $DATA_DIR/debug.log $LOG_DIR/debug-{COMMIT}-$(date +%s).log" \
        "COMPILER=$CC ./build/bin/bitcoind -datadir=$DATA_DIR -stopatheight=$STOP -dbcache=$DBCACHE -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0";
    done
    
    47da4f9b71 Merge bitcoin/bitcoin#35410: net: use the proxy if overriden when doing v2->v1 reconnections
    4bcac7f699 validation: compact chainstate at minimum work
    
    2026-06-04 | reindex-chainstate | 950059 blocks | dbcache 5000 | umbrel | x86_64 | Intel(R) N150 | 4 cores | 15Gi RAM | SSD
    
    Benchmark 1: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=5000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 47da4f9b716d11294d4fb0f30b04a7bcf128cc14)
      Time (abs ≡):        34274.786 s               [User: 32144.272 s, System: 3560.024 s]
    
    Benchmark 2: COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=5000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 4bcac7f699457fb001410a49795d48e4941b27bd)
      Time (abs ≡):        34745.423 s               [User: 32484.966 s, System: 3508.027 s]
    
    Relative speed comparison
            1.00          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=5000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 47da4f9b716d11294d4fb0f30b04a7bcf128cc14)
            1.01          COMPILER=gcc ./build/bin/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=950059 -dbcache=5000 -reindex-chainstate -blocksonly -connect=0 -printtoconsole=0 (COMMIT = 4bcac7f699457fb001410a49795d48e4941b27bd)
    

    </details>

    The logs also show that compaction happened close to the assumevalid height:

    <details><summary>Compaction logs</summary>

    cat ../logs/debug-4bcac7f699457fb001410a49795d48e4941b27bd-1780604207.log | grep -C4 compaction
    2026-06-05T14:56:31Z Enabling script verification at block #938344 (00000000000000000001731222e3df1a47a0944429559d4c4e4caf53e71676e5): block height above assumevalid height.
    2026-06-05T14:56:31Z UpdateTip: new best=00000000000000000001731222e3df1a47a0944429559d4c4e4caf53e71676e5 height=938344 version=0x2004a000 log2_work=96.100823 tx=1315808700 date='2026-02-25T21:37:38Z' progress=0.965842 cache=1973.3MiB(6654179txo)
    2026-06-05T14:56:32Z UpdateTip: new best=0000000000000000000157898e02aeb20f847dd3ef8973abe68510119b7e6a33 height=938345 version=0x26946000 log2_work=96.100834 tx=1315811999 date='2026-02-25T21:49:08Z' progress=0.965844 cache=1973.3MiB(6661457txo)
    2026-06-05T14:56:32Z UpdateTip: new best=0000000000000000000009325afaa03dd4686deea9781d754a82d2b7ba4ceba8 height=938346 version=0x34000000 log2_work=96.100844 tx=1315815668 date='2026-02-25T22:28:46Z' progress=0.965853 cache=1973.3MiB(6672983txo)
    2026-06-05T14:56:32Z Starting chainstate compaction of /mnt/my_storage/BitcoinData/chainstate
    2026-06-05T14:59:55Z Finished chainstate compaction of /mnt/my_storage/BitcoinData/chainstate
    2026-06-05T14:59:55Z UpdateTip: new best=000000000000000000003028d4699d1d2566de2934ca3f2301d990a95e7db0fc height=938347 version=0x22644000 log2_work=96.100855 tx=1315819222 date='2026-02-25T22:31:42Z' progress=0.965853 cache=1973.3MiB(6679310txo)
    2026-06-05T14:59:55Z UpdateTip: new best=000000000000000000009c20b9c211fd01503cfcabf44c939f83bc54d9a89b2c height=938348 version=0x20866000 log2_work=96.100865 tx=1315820982 date='2026-02-25T22:35:24Z' progress=0.965854 cache=1973.3MiB(6681830txo)
    2026-06-05T14:59:56Z UpdateTip: new best=00000000000000000001c0bc82a20646b8daadc382af5f06486f016e931cf332 height=938349 version=0x20036000 log2_work=96.100876 tx=1315824111 date='2026-02-25T22:45:22Z' progress=0.965857 cache=1973.3MiB(6686145txo)
    2026-06-05T14:59:56Z UpdateTip: new best=00000000000000000001dc00d7c939a9d780c72dc64e378c8c9638d3e8401ace height=938350 version=0x23f2c000 log2_work=96.100886 tx=1315826175 date='2026-02-25T22:55:02Z' progress=0.965859 cache=1973.3MiB(6690199txo)
    

    </details>

    Compaction took ~3.5 minutes, but it seems to have been blocking new block connections - I'm not yet sure why (cc: @andrewtoth).

    > This means that in a year the db could be 22 GB even if there is a net negative growth in the UTXO set

    I'm still trying to simulate this, will get back to you on it.

    > but I think we need to also schedule a compaction randomly every month or two

    That was my original approach, to schedule one every 10k blocks after IBD. I'll investigate it, thanks a lot for the measurements.

  19. l0rinc marked this as a draft on Jun 5, 2026
  20. l0rinc force-pushed on Jun 6, 2026
  21. l0rinc commented at 10:37 PM on June 6, 2026: contributor

    Since the last push, chainstate compaction no longer runs through the validation-interface queue (blocking block connections). CCoinsViewDB::CompactFull() now starts a one-shot background compaction job, so validation only force-syncs the latest chainstate before triggering the work.

    The first compaction is still deterministic when the active chain leaves IBD. After that, recurring maintenance is randomized: each post-IBD full chainstate flush has a 1/1000 chance to start compaction. With the current roughly hourly flush cadence, that averages to about six weeks.
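The "about six weeks" figure follows from the mean of a geometric distribution. A minimal sketch of the arithmetic, assuming exactly one full flush per hour (the thread only says "roughly hourly"); the function names are hypothetical:

```cpp
#include <cassert>
#include <cmath>

// Mean number of flushes until the first compaction for per-flush probability p
// (mean of a geometric distribution).
inline double ExpectedFlushes(double p) { return 1.0 / p; }

// Convert a flush count to weeks, given the hours between flushes.
inline double FlushesToWeeks(double flushes, double hours_per_flush)
{
    return flushes * hours_per_flush / (24.0 * 7.0);
}

// Probability that no compaction has started after n flushes.
inline double ProbNoCompaction(double p, double n) { return std::pow(1.0 - p, n); }
```

With p = 1/1000 the mean is 1000 flushes, or about 5.95 weeks at one flush per hour. The spread is wide: roughly 37% of nodes would still not have compacted after 1000 flushes.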

  22. l0rinc marked this as ready for review on Jun 6, 2026
  23. l0rinc renamed this:
    validation: compact chainstate at minimum chain work
    validation: compact chainstate after IBD and post-IBD flushes
    on Jun 6, 2026
  24. andrewtoth commented at 11:34 PM on June 6, 2026: contributor

    If we are doing a compaction randomly after every full_flush_completed, then I don't see a need for the min work check. Since it's done in the background, we can randomly compact during IBD too. There's no need for logic that tracks IBD state. That should simplify the logic here quite a bit.

    Is there a way to prevent the compaction from occurring during shutdown? We flush a few times during shutdown and it would not be great if it triggered a compaction that prevented shutdown for several minutes.

    We should use a named TraceThread for the compaction thread.

  25. l0rinc marked this as a draft on Jun 8, 2026
  26. l0rinc force-pushed on Jun 8, 2026
  27. DrahtBot added the label CI failed on Jun 8, 2026
  28. DrahtBot commented at 5:26 PM on June 8, 2026: contributor


    🚧 At least one of the CI tasks failed. Task Windows-cross to x86_64, ucrt: https://github.com/bitcoin/bitcoin/actions/runs/27151249634/job/80142552112 LLM reason (✨ experimental): CI failed at the link step due to an undefined reference to util::TraceThread(...) when building libbitcoinkernel.dll.


  29. l0rinc force-pushed on Jun 8, 2026
  30. l0rinc renamed this:
    validation: compact chainstate after IBD and post-IBD flushes
    validation: compact fragmented chainstate after full flushes
    on Jun 8, 2026
  31. DrahtBot removed the label CI failed on Jun 8, 2026
  32. l0rinc force-pushed on Jun 8, 2026
  33. in src/validation.cpp:134 in 849ffc18cc
     129 | +static bool ShouldCompactChainstate(CCoinsViewDB& coins_db, bool in_ibd)
     130 | +{
     131 | +    static constexpr uint32_t flush_ratio{1'000}; // Roughly every 6 weeks with hourly flushes
     132 | +    static constexpr uint32_t leveldb_compaction_threshold{1'000}; // LevelDB table-cache/open-file budget and POSIX mmap limit
     133 | +    if (sizeof(void*) < 8 || !in_ibd) return FastRandomContext().randrange(flush_ratio) == 0;
     134 | +    return CompactableFileCount(coins_db) > leveldb_compaction_threshold;
    


    sipa commented at 8:12 PM on June 8, 2026:

    If this threshold is exceeded outside of IBD somehow, wouldn't we also want to compact?


    l0rinc commented at 8:21 PM on June 8, 2026:

    But that's deterministic, isn't it?


    andrewtoth commented at 8:33 PM on June 8, 2026:

    Hmm if the chainstate ever exceeds 32GB (which we can expect at some point), then this will compact every flush.

    If we just want to compact a chainstate with older 2MB files, we can instead check if L3 has > 1GB / 32 MB file count?
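The file-count check suggested here could look like the sketch below. It is only an illustration: the helper names are hypothetical, and the 1 GB budget for L3 is taken as 1024 MiB for the arithmetic.

```cpp
#include <cassert>
#include <cstdint>

// Largest file count a level should hold if all files have the target size.
constexpr uint64_t MaxFilesAtLevel(uint64_t level_budget_mib, uint64_t file_size_mib)
{
    return level_budget_mib / file_size_mib;
}

// True if a level holds more files than its budget allows at the target file
// size, which hints that it still contains smaller (older) files.
constexpr bool HasSmallFiles(uint64_t file_count, uint64_t level_budget_mib, uint64_t file_size_mib)
{
    return file_count > MaxFilesAtLevel(level_budget_mib, file_size_mib);
}
```

With 32 MiB files the limit for L3 is 32 files, while 2 MiB files could fill the same budget with up to 512 files. The experiment table earlier in the thread lists 45 files in L3, so a real check would likely need some margin.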


    l0rinc commented at 8:51 PM on June 8, 2026:

    I'm pushing a new version since this won't reliably trigger compaction during IBD. The old code did not compare against total chainstate size, but against the file count of the level above the deepest occupied level. The new version I'm about to push checks for occupancy instead; I'll try some reindex-chainstates overnight to see how it performs.

    Note that changing the default max_open_files from 1000 to 2048 alone would avoid the speed regression, but I want to achieve the same through careful compaction instead.

    > If this threshold is exceeded outside of IBD somehow, wouldn't we also want to compact?

    I deliberately wanted to avoid that, i.e. went for fast compactions during IBD, unpredictable ones in steady state. Let me know if reviewers don't share this goal.


    andrewtoth commented at 9:38 PM on June 8, 2026:

    > this won't reliably trigger compaction during IBD.

    Sorry, I'm confused. We don't want to trigger compaction during IBD, right? We want to compact afterwards. If we just compact every ~1000 flushes and also if there are too many files at a level than max file size (to compact an older 2mb file size chainstate), then we have accomplished the goal of the PR.

    I don't think this PR should be concerned with compacting during IBD. I'm not convinced we need to do that. I think doing compactions to speed up IBD can be done as a follow-up.

    I also don't think we need to benchmark this. It should not change IBD or reindex-chainstate from current master behavior.


    sipa commented at 9:41 PM on June 8, 2026:

    @l0rinc Did you mean to thumbs-up your own comment? 😅


    l0rinc commented at 10:12 AM on June 9, 2026:

    > Sorry, I'm confused. We don't want to trigger compaction during IBD, right?

    How else would we fix the regression? We can also bump max_open_files, but I want to achieve the same through careful compaction instead.

    > doing compactions to speed up IBD can be done as a follow-up

    Not speed up, but revert the 8% regression that seek compaction + mmap reduction caused.

    > Did you mean to thumbs-up your own comment? 😅

    I had a weak moment, felt really proud of that comment.


    sipa commented at 12:09 PM on June 9, 2026:

    Maybe it makes sense to separate the two concerns into separate PRs?

    1. The persistent post-IBD inflated disk usage without compactions, especially after UTXO set shrinkage.
    2. The performance regression caused by many small files that haven't been compacted into large files yet, combined with the mmap file count reduction.

    The solution to the first is just a randomized occasional full compaction, post IBD. It's an easy fix that doesn't need much investigation.

    The second is possibly more involved if we want to address it during IBD as well, may need benchmarks, and safeguards against overly aggressive compaction runs if the over-1000-files condition is continuously hit. (One random idea: what if we just schedule a full compaction once, on startup, if the file count is above 1000?)


    andrewtoth commented at 12:10 PM on June 9, 2026:

    This PR was initially about fixing the disk space regression caused by https://github.com/bitcoin-core/leveldb-subtree/pull/61. That change alone has been backported, so if we want a change on master that we can backport, we should keep the fix focused on the size regression alone.

    That change in combination with https://github.com/bitcoin-core/leveldb-subtree/pull/52 has introduced a speed regression (https://github.com/bitcoin/bitcoin/issues/35457) but is not backported. How we deal with the speed regression seems out of scope for a targeted fix to the disk size regression.

    > I want to achieve the same through careful compaction instead.

    Perhaps we could discuss in #35457? I'm of the opinion https://github.com/bitcoin-core/leveldb-subtree/pull/52 should be reverted instead. More compactions will regress the disk IO improvements of https://github.com/bitcoin-core/leveldb-subtree/pull/61.


    l0rinc commented at 12:19 PM on June 9, 2026:

    > Maybe it makes sense to separate the two concerns into separate PRs

    Sure, makes sense. I'm at a conference now, but I can easily split it today.

    > One random idea: what if we just schedule a full compaction once, on startup, if the file count is above 1000?

    That's basically what #35467 tried (though it still warns for now since I'm not sure if it still makes sense after this PR, but @andrewtoth also mentioned it should do it instead of just warning).

    > I'm of the opinion https://github.com/bitcoin-core/leveldb-subtree/pull/52 should be reverted instead. More compactions will regress the disk IO improvements of https://github.com/bitcoin-core/leveldb-subtree/pull/61.

    I'll split the two concerns and we can discuss alternatives on the one concerning the IBD compactions. I'm currently measuring how compactions during IBD affect the speed - and since they happen after flushes, I don't think they should affect IBD speed anymore. Also, the parallel fetcher likely changes our usage patterns, giving more time between bulk reads, so we likely have less blockage caused by background compactions (since they're not spread out evenly).

  34. l0rinc force-pushed on Jun 8, 2026
  35. coins: test chainstate flush baseline
    Add `CDBWrapper::GetProperty()` and expose it through `CCoinsViewDB::GetDBProperty()` so coins tests can inspect LevelDB runtime properties through the coins view.
    Use it in a coins DB flush baseline that records the LevelDB layout after flushing while keeping readback coverage for the flushed coin and best block.
    
    Co-authored-by: Andrew Toth <andrewstoth@gmail.com>
    5d3fc1154f
  36. l0rinc force-pushed on Jun 9, 2026
  37. l0rinc renamed this:
    validation: compact fragmented chainstate after full flushes
    validation: compact chainstate regularly after post-IBD flushes
    on Jun 9, 2026
  38. l0rinc renamed this:
    validation: compact chainstate regularly after post-IBD flushes
    coins: compact chainstate regularly after post-IBD flushes
    on Jun 9, 2026
  39. validation: randomly compact chainstate
    Full chainstate flushes are convenient maintenance points for long-term LevelDB cleanup because the chainstate was just written.
    Randomize the trigger so nodes that flush near the same height do not compact together.
    
    Add blocking chainstate compaction through `CCoinsViewDB::CompactFull()` and give each full flush a 1/1000 chance to start compaction when only the normal chainstate is active.
    This keeps the schedule stateless and leaves last-compaction height or timestamp bookkeeping out of chainstate metadata.
    
    Co-authored-by: Andrew Toth <andrewstoth@gmail.com>
    bc1a8f760d
  40. coins: compact chainstate in background
    Full chainstate compaction can take minutes on large databases.
    Move `CCoinsViewDB::CompactFull()` to a named `utxocompact` one-shot background thread so validation only schedules the work.
    
    Validation calls compaction after a full flush, when the chainstate was just written and another write is less likely to be needed immediately.
    Repeated calls reuse any running job.
    The coins view destructor waits for completion, and a mutex prevents compaction from using `m_db` while `ResizeCache()` replaces it.
    
    Co-authored-by: Andrew Toth <andrewstoth@gmail.com>
    d15668d535
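The behaviour this commit message describes (a one-shot worker, repeated calls reusing a running job, and a destructor that waits) can be sketched in isolation. This is not the PR's implementation: `OneShotJob` is a hypothetical class, it uses a plain `std::thread` rather than a named trace thread, and it does not model the mutex that guards `m_db` against `ResizeCache()`.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical one-shot background runner. Start() launches `job` on a worker
// thread unless a previous job is still running; the destructor waits for it.
class OneShotJob
{
public:
    ~OneShotJob() { Wait(); }

    // Returns true if a new job was started, false if one is already running.
    bool Start(std::function<void()> job)
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_running.load()) return false;       // Reuse the running job.
        if (m_thread.joinable()) m_thread.join(); // Reap a finished worker.
        m_running.store(true);
        try {
            m_thread = std::thread([this, job = std::move(job)] {
                job();
                m_running.store(false);
            });
        } catch (...) {
            m_running.store(false); // Thread creation failed; allow a later retry.
            throw;
        }
        return true;
    }

    // Block until the current job, if any, has finished.
    void Wait()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        if (m_thread.joinable()) m_thread.join();
    }

    bool IsRunning() const { return m_running.load(); }

private:
    std::mutex m_mutex;
    std::thread m_thread;
    std::atomic<bool> m_running{false};
};
```

The caller only pays for the thread start; a second `Start()` while the first job is still running returns immediately, and destroying the object blocks until the job is done.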
  41. l0rinc force-pushed on Jun 9, 2026
  42. DrahtBot added the label CI failed on Jun 9, 2026
  43. in src/validation.cpp:120 in d15668d535
     112 | @@ -113,6 +113,13 @@ const std::vector<std::string> CHECKLEVEL_DOC {
     113 |   * */
     114 |  static constexpr int PRUNE_LOCK_BUFFER{10};
     115 |  
     116 | +// Return whether the completed full flush should compact chainstate
     117 | +static bool ShouldCompactChainstate(bool in_ibd)
     118 | +{
     119 | +    static constexpr uint32_t flush_ratio{1'000}; // Roughly every 6 weeks with hourly flushes
     120 | +    return !in_ibd && FastRandomContext().randrange(flush_ratio) == 0;
    


    andrewtoth commented at 1:46 PM on June 9, 2026:

    What is the rationale for not compacting if we are in IBD? The node should be able to make progress just the same if we are compacting on a background thread. Is there any evidence we would slow down if we triggered a compaction during IBD as well? Your other comments seem to suggest the opposite.


    l0rinc commented at 2:34 PM on June 9, 2026:

    I don't want randomness during IBD, makes benchmarking impossible. We already know most of the payload during IBD, we can optimize for speed through a deterministic algorithm.


    andrewtoth commented at 2:57 PM on June 9, 2026:

    > I don't want randomness during IBD, makes benchmarking impossible.

    Hmm but we already randomly do periodic syncs if you have a high enough dbcache value.


    l0rinc commented at 4:41 PM on June 9, 2026:

    Isn't all of that deterministic currently? Subsequent runs are surprisingly stable, often within 1-2 minutes of each other.


    andrewtoth commented at 4:58 PM on June 9, 2026:

    > Isn't all of that deterministic currently?

    There's a 20 minute window of random time to periodically sync.

    I don't think doing a full compaction randomly during IBD will change anything here. We were doing constant seek compaction before anyways.
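The random window mentioned above can be pictured as drawing each periodic sync delay uniformly from a fixed range. In this sketch the 50 and 70 minute bounds are assumptions chosen to match the "roughly hourly" cadence and the 20 minute window mentioned in this thread; `NextSyncDelay` is hypothetical and `std::mt19937` stands in for the node's own RNG.

```cpp
#include <cassert>
#include <chrono>
#include <random>

using namespace std::chrono_literals;

// Pick the delay until the next periodic sync uniformly from
// [min_delay, max_delay]. The default bounds are assumed values.
template <typename Rng>
std::chrono::seconds NextSyncDelay(Rng& rng,
                                   std::chrono::seconds min_delay = 50min,
                                   std::chrono::seconds max_delay = 70min)
{
    assert(min_delay <= max_delay);
    std::uniform_int_distribution<long long> dist(min_delay.count(), max_delay.count());
    return std::chrono::seconds{dist(rng)};
}
```

Two runs with different seeds flush at different wall-clock times, so the flush schedule already carries some randomness even before any random compaction is added.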

  44. DrahtBot removed the label CI failed on Jun 9, 2026
  45. in src/validation.cpp:117 in d15668d535
     112 | @@ -113,6 +113,13 @@ const std::vector<std::string> CHECKLEVEL_DOC {
     113 |   * */
     114 |  static constexpr int PRUNE_LOCK_BUFFER{10};
     115 |  
     116 | +// Return whether the completed full flush should compact chainstate
     117 | +static bool ShouldCompactChainstate(bool in_ibd)
    


    andrewtoth commented at 4:22 PM on June 9, 2026:

    This function seems simple enough to inline below.

  46. in src/validation.cpp:2843 in d15668d535
    2841 | +            m_chainman.m_options.signals->ChainStateFlushed(this->GetRole(), GetLocator(m_chain.Tip()));
    2842 | +        }
    2843 | +
    2844 | +        if (!m_chainman.m_interrupt && m_chainman.m_chainstates.size() == 1) { // Skip AssumeUTXO
    2845 | +            if (ShouldCompactChainstate(m_chainman.IsInitialBlockDownload())) {
    2846 | +                CoinsDB().CompactFull();
    


    andrewtoth commented at 4:23 PM on June 9, 2026:

    The thread-spawning can throw. Do we want to catch here and log?
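One way to handle this is to wrap the spawn in a small helper that catches `std::system_error`, which is what the `std::thread` constructor throws when a thread cannot be started. A minimal sketch with a hypothetical `TrySpawn` helper that writes to `std::cerr` in place of the node's logging:

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <system_error>

// Hypothetical wrapper: run `spawn` (expected to create a thread) and report
// failure instead of letting std::system_error propagate into validation.
inline bool TrySpawn(const std::function<void()>& spawn)
{
    try {
        spawn();
        return true;
    } catch (const std::system_error& e) {
        std::cerr << "Failed to start compaction thread: " << e.what() << '\n';
        return false;
    }
}
```

A failed spawn then only skips this round of compaction; a later full flush can try again.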

  47. andrewtoth approved
  48. andrewtoth commented at 4:32 PM on June 9, 2026: contributor

    ACK d15668d53558ac78953b50634e8272d5bc55d654

    Nice approach with randomly compacting every 1000 flushes.

    Tested by setting flush_ratio to 1 and DATABASE_WRITE_INTERVAL_... to be between 5 and 7 minutes.

    Compaction was triggered every periodic flush in the background.

    No compaction was triggered on shutdown.

    Not convinced we need to not compact during IBD. There doesn't seem to be sufficient rationale to have a check for this and not just compact as well if one of the flushes during IBD is chosen.

    PR description should be updated. The motivation for this is not about pressure on leveldb data structures, but on the disk footprint of the chainstate db. Also, can mention #35298 is fixed by this.

  49. l0rinc marked this as ready for review on Jun 9, 2026

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-06-10 04:51 UTC
