More control of maintenance processes at startup/restart #29662

issue domZippilli openend this issue on March 15, 2024
  1. domZippilli commented at 4:53 pm on March 15, 2024: none

    Please describe the feature you’d like to see added.

    I’d like some way to control the levelDB compaction processes a bit more. Specifically, if I could control when they are scheduled (hourly in the background, for example), and possibly limit resources they consume (I/O, in particular), I think it would help.

    The problem I’m having is that when I reboot my bitcoin-core node (which runs with txindex=1), the RPC listener comes up pretty promptly, so my healthchecks (currently just TCP) pass. However, the service is not in its usual baseline state; it is doing a lot of read I/O, and logging about levelDB compaction. This condition lasts for about an hour.

    Here is an illustration of the I/O level relative to baseline. The left side is the restart, and the lower right side is after this abates. Dark blue is read:

    Here’s a summary of the logs at this time:

    The “Compacting” log line is extremely elevated during this period (though it does occur at a much lower level after):

    Here’s a graph of RPC trace P50 duration before, during, and after this phase:

    The real problem is the last graph, the elevated RPC latency. The tail and head latencies are also much worse than normal, so it’s not just a tail latency issue I could solve with timeouts / hedging. The server goes from microsecond/millisecond latency to 10s of seconds, especially for sendrawtransaction (10-30s max), with listunspent a distant second (3-4s max).

    This latency abates as soon as the high I/O and compaction logging stops. I am therefore making the intuitive leap (so experts, please consider this critically and I welcome other explanations) that resource utilization during compaction is causing some RPCs to be very slow. I also considered lock contention (perhaps cs_main) but I couldn’t see it in the code.

    Describe the solution you’d like

    I’d like the experts to recommend a solution. Intuitively, it seems like levelDB could amortize this compaction work during normal operation as a background task (the README seems to imply it already should?). Or maybe some way to limit resources used for compaction?

    Describe any alternatives you’ve considered

    Currently, I’m looking at alternative ways to do the RPC I enabled txindex for, which is getrawtransaction without the blockhash. But, that will only sidestep this issue with compaction and RPC latency.

    Please leave any additional context

    I am using a slower filesystem than most. It is a regionally-replicated NFS store, which we chose for resiliency reasons. Intuitively, I’d expect this problem to be less severe (or shorter duration) with lower-latency storage, but still present.

    Command line args:

    0txindex="1", rpcworkqueue="1024", rpc_*="redacted", debug="coindb", debug="estimatefee", debug="reindex", debug="leveldb", debug="walletdb", debug="lock", debug="rpc", dbcache="5734", datadir="/home/bitcoin/data", chain="main"
    
  2. domZippilli added the label Feature on Mar 15, 2024
  3. willcl-ark commented at 11:39 am on March 19, 2024: contributor

    Hi @domZippilli.

    Whilst I think having more control over leveldb compaction would be nice, it seems to me that the changes would be pretty invasive to our leveldb subtree, which we don’t particularly want to touch.

    I tried some minimal changes, such as skipping the scheduled compaction on startup, however it’s clear to me on testing this that these compaction jobs are scheduled very regularly by the DB implementation (and see below), as this made almost no difference to when I observed compaction running. It would also be possible to increase the leveldb cache size to trigger compactions less frequently (in combination with the previous change).

    Having read the leveldb documentation compaction is run:

    1. when the DB is opened (which I tested changes to)
    2. when the log gets full (unavoidable) or
    3. when manual compaction is called.

    You might be able to get some different performance characteristics by changing the database cache size in the source code manually, but I think it’s unlikely that these changes/options will see implementation in main branch here.

    If you are in IBD compactions will be unavoidable as you sync and the cache fills, but it sounds like you are either restarting or recovering from failure?

    The startup compactions I saw are mostly coming from initial chainstate verifcation – we verify the first 6 blocks at startup which triggers a few of these. Without disabling this (highly not recommended) I don’t see any way to avoid them either. Disabling these check did see fewer startup compactions. You could probably therefore see improved performance by setting -checkblocks=1, which would be preferable to the below (crude, untested, don’t run these!) changes which skips block checks altogether.

     0diff --git a/src/init.cpp b/src/init.cpp
     1index 1a4fce4678..f55aadc9de 100644
     2--- a/src/init.cpp
     3+++ b/src/init.cpp
     4@@ -614,6 +614,7 @@ void SetupServerArgs(ArgsManager& argsman)
     5     argsman.AddArg("-checkblocks=<n>", strprintf("How many blocks to check at startup (default: %u, 0 = all)", DEFAULT_CHECKBLOCKS), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     6     argsman.AddArg("-checklevel=<n>", strprintf("How thorough the block verification of -checkblocks is: %s (0-4, default: %u)", Join(CHECKLEVEL_DOC, ", "), DEFAULT_CHECKLEVEL), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     7     argsman.AddArg("-checkblockindex", strprintf("Do a consistency check for the block tree, chainstate, and other validation data structures occasionally. (default: %u, regtest: %u)", defaultChainParams->DefaultConsistencyChecks(), regtestChainParams->DefaultConsistencyChecks()), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     8+    argsman.AddArg("-skipverify", strprintf("Skip chain verification at startup (default %b)", DEFAULT_SKIP_VERIFY), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
     9     argsman.AddArg("-checkaddrman=<n>", strprintf("Run addrman consistency checks every <n> operations. Use 0 to disable. (default: %u)", DEFAULT_ADDRMAN_CONSISTENCY_CHECKS), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
    10     argsman.AddArg("-checkmempool=<n>", strprintf("Run mempool consistency checks every <n> transactions. Use 0 to disable. (default: %u, regtest: %u)", defaultChainParams->DefaultConsistencyChecks(), regtestChainParams->DefaultConsistencyChecks()), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
    11     argsman.AddArg("-checkpoints", strprintf("Enable rejection of any forks from the known historical chain until block %s (default: %u)", defaultChainParams->Checkpoints().GetHeight(), DEFAULT_CHECKPOINTS_ENABLED), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::DEBUG_TEST);
    12@@ -1573,6 +1574,7 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    13         options.mempool = Assert(node.mempool.get());
    14         options.reindex = node::fReindex;
    15         options.reindex_chainstate = fReindexChainState;
    16+        options.skip_verify = args.GetIntArg("-skipverify", DEFAULT_SKIP_VERIFY);
    17         options.prune = chainman.m_blockman.IsPruneMode();
    18         options.check_blocks = args.GetIntArg("-checkblocks", DEFAULT_CHECKBLOCKS);
    19         options.check_level = args.GetIntArg("-checklevel", DEFAULT_CHECKLEVEL);
    20diff --git a/src/node/chainstate.cpp b/src/node/chainstate.cpp
    21index bf1fc06b0b..d241b580e6 100644
    22--- a/src/node/chainstate.cpp
    23+++ b/src/node/chainstate.cpp
    24@@ -249,6 +249,10 @@ ChainstateLoadResult VerifyLoadedChainstate(ChainstateManager& chainman, const C
    25     auto is_coinsview_empty = [&](Chainstate* chainstate) EXCLUSIVE_LOCKS_REQUIRED(::cs_main) {
    26         return options.reindex || options.reindex_chainstate || chainstate->CoinsTip().GetBestBlock().IsNull();
    27     };
    28+    if (options.skip_verify) {
    29+        LogPrintf("[chainstate] WARNING: skipping chain verification due to --skipverify flag\n");
    30+        return {ChainstateLoadStatus::SUCCESS, {}};
    31+    };
    32 
    33     LOCK(cs_main);
    34 
    35diff --git a/src/node/chainstate.h b/src/node/chainstate.h
    36index a6e9a0331b..36321dbebd 100644
    37--- a/src/node/chainstate.h
    38+++ b/src/node/chainstate.h
    39@@ -24,6 +24,7 @@ struct ChainstateLoadOptions {
    40     bool coins_db_in_memory{false};
    41     bool reindex{false};
    42     bool reindex_chainstate{false};
    43+    bool skip_verify{false};
    44     bool prune{false};
    45     //! Setting require_full_verification to true will require all checks at
    46     //! check_level (below) to succeed for loading to succeed. Setting it to
    47diff --git a/src/validation.h b/src/validation.h
    48index b64ba4dcbc..147fd13ed9 100644
    49--- a/src/validation.h
    50+++ b/src/validation.h
    51@@ -65,6 +65,7 @@ class SignalInterrupt;
    52 /** Block files containing a block-height within MIN_BLOCKS_TO_KEEP of ActiveChain().Tip() will not be pruned. */
    53 static const unsigned int MIN_BLOCKS_TO_KEEP = 288;
    54 static const signed int DEFAULT_CHECKBLOCKS = 6;
    55+static const bool DEFAULT_SKIP_VERIFY{false};
    56 static constexpr int DEFAULT_CHECKLEVEL{3};
    57 // Require that user allocate at least 550 MiB for block & undo files (blk???.dat and rev???.dat)
    58 // At 1MB per block, 288 blocks = 288MB.
    

    Apart from that startup option, your only other solution as I see it is to speed up your storage backend/replication somehow. These compactions are barely noticable for me on a local SSD…

    I hope this helps in some way.

  4. willcl-ark added the label Resource usage on Mar 20, 2024
  5. TheBlueMatt commented at 2:56 pm on March 26, 2024: contributor
    Can you describe in more detail why skipping the on-open compaction is a bad idea in general (and, eg, schedule it to run in five minutes)? This behavior is very annoying even outside of a high-latency storage - right when you want your machine to be a bit more free so you can start other things LevelDB decides to spam I/O as fast as it can.
  6. willcl-ark commented at 4:13 pm on March 27, 2024: contributor

    Can you describe in more detail why skipping the on-open compaction is a bad idea in general (and, eg, schedule it to run in five minutes)?

    Sorry I haven’t made myself clear; what I think is a bad idea is default disabling the startup checkblocks, not DB compaction scheduling.

    I don’t see any great requirement for DB compaction on-open (but I am no LevelDB wizard), only as far as I can tell disabling it for Bitcoin Core would be quite invasive to the subtree.

    This behavior is very annoying even outside of a high-latency storage - right when you want your machine to be a bit more free so you can start other things LevelDB decides to spam I/O as fast as it can.

    If this problem is affecting more than just 1 user with high-latency storage then perhaps it warrants more investigation than I’ve given it so far, but it wasn’t clear to me from this report that this might be the case.

    When I profile from cold start to “Loading Wallet…” using an SSD, the compaction stuff doesn’t seem problematic: https://tmp.256k1.dev/check_block_6_startup.svg?s=DoCompaction

  7. TheBlueMatt commented at 0:30 am on March 29, 2024: contributor

    If this problem is affecting more than just 1 user with high-latency storage then perhaps it warrants more investigation than I’ve given it so far, but it wasn’t clear to me from this report that this might be the case.

    This is definitely not just a one person issue. I would posit a guess that most bitcoin core nodes are on network-attached storage (though mostly a VM’s backing storage in the same datacenter, not trying to sync storage across two datacenters). But also this can be nasty if you crashed and you have to do replay on startup, in that case you have a lot of IO and now compaction is not only much larger but also competing with the block validation.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-18 19:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me