validation: write chainstate to disk every hour #30611

pull andrewtoth wants to merge 5 commits into bitcoin:master from andrewtoth:write-chainstate-every-hour changing 4 files +94 −49
  1. andrewtoth commented at 6:21 pm on August 8, 2024: contributor

    Since #28233, periodically writing the chainstate to disk every 24 hours does not clear the dbcache. Since #28280, periodically writing the chainstate to disk is proportional only to the amount of dirty entries in the cache. Due to these changes, it is no longer beneficial to only write the chainstate to disk every 24 hours. The periodic flush interval was necessary because every write of the chainstate would clear the dbcache. Now, we can get rid of the periodic flush interval and simply write the chainstate along with blocks and block index at least every hour.

    Two benefits of doing this:

    1. For IBD or reindex-chainstate with a combination of large dbcache setting, slow CPU, slow internet speed/unreliable peers, it could be up to 24 hours until the chainstate is persisted to disk. A power outage or crash could potentially lose up to 24 hours of progress, causing #11600. By syncing every hour in unison with the block index we avoid this problem entirely. For IBD with lower dbcache settings, faster CPU, or better internet speed/reliable peers, chainstate writes are already triggered more often than every hour so this change will have no effect on IBD.
    2. Based on discussion in #28280, writing only once every 24 hours during long running operation of a node causes IO spikes. Writing smaller chainstate changes every hour like we do with blocks and block index will reduce IO spikes.

    Inspired by this comment.

    Resolves https://github.com/bitcoin/bitcoin/issues/11600

  2. DrahtBot commented at 6:21 pm on August 8, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/30611.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Approach ACK l0rinc

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30610 (validation: do not wipe utxo cache for stats/scans/snapshots by sipa)
    • #29365 (Extend signetchallenge to set target block spacing by starius)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Validation on Aug 8, 2024
  4. andrewtoth force-pushed on Aug 8, 2024
  5. gmaxwell commented at 8:51 pm on August 8, 2024: contributor

    Before someone suggests that it should do this every block, a reason to not do this too much is that delaying writes prevents many writes from ever happening at all because many outputs are quickly spent.

    I don’t see a particular problem for this, there will be some write traffic increase but I don’t expect it will be enough to meaningfully reduce the life time of flash on cheap devices (and if that were a concern the default level of logging is probably more of an issue). It’s the sort of thing someone could test, however.

    Someone might also want to test the latency impact (e.g. on block template creation) of the traversal.

    Though long term it would be preferable to introduce a data structure to track pending writes and keep a constantly flushing going which always lags the tip by N blocks, both avoiding the latency spikes and preventing unnecessary writes of entries that are just going to be spent in the next couple blocks.

  6. mzumsande commented at 3:49 pm on August 9, 2024: contributor
    Did you check if this would slow down IBD noticeably when running with a large dbcache? If i understand it right, coins that are created and destroyed more than 1 hour of syncing apart would not have been written to disk before, but will be written and deleted after this PR - though presumably this is not a frequent event, so there wouldn’t be much of an impact?!
  7. andrewtoth commented at 4:47 pm on August 9, 2024: contributor

    Before someone suggests that it should do this every block, a reason to not do this too much is that delaying writes prevents many writes from ever happening at all because many outputs are quickly spent.

    If i understand it right, coins that are created and destroyed more than 1 hour of syncing apart would not have been written to disk before, but will be written and deleted after this PR

    Correct, there will be significantly less utxo cut through with this PR.

    I don’t see a particular problem for this, there will be some write traffic increase but I don’t expect it will be enough to meaningfully reduce the life time of flash on cheap devices (and if that were a concern the default level of logging is probably more of an issue). It’s the sort of thing someone could test, however.

    though presumably this is not a frequent event, so there wouldn’t be much of an impact?!

    For normal operation, this would mean writing 10s of MB every hour, vs 100s of MB every 24 hours. I don’t think that we should be concerned about that increase in disk IO? Seems negligible to me.

    Did you check if this would slow down IBD noticeably when running with a large dbcache?

    Not yet. I will run a reindex-chainstate benchmark with max dbcache and see what the impact is.

  8. sipa commented at 8:40 pm on August 9, 2024: member

    It would be good to know what impact it has on IBD, as it’s necessarily going to make things slower (more syncs means more disk writes, and more memory/cpu to remember UTXOs which must be marked spent on disk), for the benefit of reducing losses in case of crashes. The question is how much time & I/O it adds.

    Regarding doing it for every block (during steady state), the trade-off there is different but still exists. On the pro side there is a reduction of latency spikes instead of reducing crash losses (as these are inherently less relevant in steady state). On the con side there is an increase in frequency of writes (and the latency spikes they bring, albeit smaller ones) instead of a sync speed slowdown.

    We don’t necessarily need to use the same sync frequency during IBD and during steady state, but the effects during steady state seem harder to measure, so it’s hard to say what is acceptable there.

    One related idea: maybe we should randomize the sync event times a bit (say, uniformly random intervals between 50 and 70 minutes) to prevent a situation where the network over time settles into a few cohorts of synchronized syncers (think like communicating metronomes: they start randomly, but have synchronized events (blocks arriving) that slightly delay sync events if they were to coincide).

  9. andrewtoth commented at 9:44 pm on August 9, 2024: contributor

    It would be good to know what impact it has on IBD, as it’s necessarily going to make things slower

    Note that it is only necessarily slower when running with very high dbcache settings. For default settings the cache fills up and is flushed far more often than every hour, so this will have no effect.

  10. andrewtoth commented at 1:54 pm on August 10, 2024: contributor

    This change didn’t appear to have any speed impact for a reindex-chainstate using max dbcache. Ran the following benchmark:

     0nohup hyperfine \
     1--export-markdown ~/bench.md \
     2--show-output \
     3--parameter-list commit 81508954f1f89f95b6fdca10c3c471cdb6566c80,27a770b34b8f1dbb84760f442edb3e23a0c2420b \
     4--prepare 'git checkout {commit} && \
     5make -j$(nproc) src/bitcoind; \
     6sync; sudo /sbin/sysctl vm.drop_caches=3;' \
     7-M 2 \
     8'echo '{commit}' && \
     9./src/bitcoind -datadir=/home/user/.bitcoin \
    10-reindex-chainstate \
    11-dbcache=16384 \
    12-printtoconsole=0 \
    13-connect=0 \
    14-stopatheight=820000' &
    

    Results are the same:

     0Benchmark 1: echo 81508954f1f89f95b6fdca10c3c471cdb6566c80 && \
     1./src/bitcoind -datadir=/home/user/.bitcoin \
     2-reindex-chainstate \
     3-dbcache=16384 \
     4-printtoconsole=0 \
     5-connect=0 \
     6-stopatheight=820000
     7
     881508954f1f89f95b6fdca10c3c471cdb6566c80
     9  Time (mean ± σ):     12239.620 s ±  3.156 s    [User: 12366.988 s, System: 562.121 s]
    10  Range (min … max):   12237.388 s … 12241.851 s    2 runs
    11 
    12Benchmark 2: echo 27a770b34b8f1dbb84760f442edb3e23a0c2420b && \
    13./src/bitcoind -datadir=/home/user/.bitcoin \
    14-reindex-chainstate \
    15-dbcache=16384 \
    16-printtoconsole=0 \
    17-connect=0 \
    18-stopatheight=820000
    19
    2027a770b34b8f1dbb84760f442edb3e23a0c2420b
    21  Time (mean ± σ):     12243.072 s ±  6.588 s    [User: 12382.386 s, System: 559.349 s]
    22  Range (min … max):   12238.413 s … 12247.730 s    2 runs
    23 
    24Summary
    25  'echo 81508954f1f89f95b6fdca10c3c471cdb6566c80 && \
    26./src/bitcoind -datadir=/home/user/.bitcoin \
    27-reindex-chainstate \
    28-dbcache=16384 \
    29-printtoconsole=0 \
    30-connect=0 \
    31-stopatheight=820000' ran
    32    1.00 ± 0.00 times faster than 'echo 27a770b34b8f1dbb84760f442edb3e23a0c2420b && \
    33./src/bitcoind -datadir=/home/user/.bitcoin \
    34-reindex-chainstate \
    35-dbcache=16384 \
    36-printtoconsole=0 \
    37-connect=0 \
    38-stopatheight=820000'
    

    So, I would say this PR is strictly beneficial for IBD/reindex-chainstate. It is the same speed, but protects against data loss.

  11. gmaxwell commented at 6:32 pm on August 10, 2024: contributor

    I didn’t comment on IBD because I figured a hit per hour isn’t that significant in the context of IBD and likely pays for itself in improving crash robustness. I’m a little surprised that the effect is quite so small, but it does probably improve performance too by making writes a little more concurrent.

    Sipa: good point on the injection locking, I’ve previously wondered if the the 24 hour behavior could sync up.

  12. andrewtoth commented at 2:14 am on August 15, 2024: contributor

    I’m a little surprised that the effect is quite so small @gmaxwell I was as well, so I ran the same benchmark with -reindex as well and it gave me similar results:

    Command Mean [s] Min [s] Max [s] Relative
    echo 81508954f1f89f95b6fdca10c3c471cdb6566c80 && ./src/bitcoind -datadir=/home/user/.bitcoin -reindex -dbcache=16384 -printtoconsole=0 -connect=0 -stopatheight=820000 40765.075 ± 132.452 40671.417 40858.733 1.01 ± 0.00
    echo 27a770b34b8f1dbb84760f442edb3e23a0c2420b && ./src/bitcoind -datadir=/home/user/.bitcoin -reindex -dbcache=16384 -printtoconsole=0 -connect=0 -stopatheight=820000 40536.484 ± 46.379 40503.689 40569.279 1.00

    Will get some measurements for total chainstate bytes written and total time writing chainstate before and after this change during steady state. But, this will take a few weeks to get meaningful data. Even then there will be a lot of variance depending on how many utxos are being eagerly spent within 24 hours vs 1 hour at the time I am measuring.

    Someone might also want to test the latency impact (e.g. on block template creation) of the traversal.

    I’m not sure I understand what you are suggesting to measure here, or what you mean by traversal. We can measure the total time writing chainstate to disk with -debug=bench, which would be the latency impact if that’s what you mean.

    Though long term it would be preferable to introduce a data structure to track pending writes and keep a constantly flushing going which always lags the tip by N blocks, both avoiding the latency spikes and preventing unnecessary writes of entries that are just going to be spent in the next couple blocks.

    You mentioned a similar idea a while ago here. I agree with you there that it is a much bigger project. If this change is accepted and merged, would there really be a benefit anymore of adding something like that?

  13. l0rinc commented at 1:26 pm on August 19, 2024: contributor
    I’ll review this in more detail later, but I have a question first: Whenever a production change is made without corresponding test changes (yet the build passes), it suggests to me that there might be a gap in our coverage. Do you think it would make sense to add a test that reproduces some aspect of the old behavior before making the change—one that would need to be updated in the commit that introduces the change?
  14. andrewtoth commented at 1:31 pm on August 19, 2024: contributor

    it would make sense to add a test that reproduces some aspect of the old behavior before making the change—one that would need to be updated in the commit that introduces the change?

    Good point, I can add a test that that expects flush every 24 hours, then reduce it to 1 hour after this change.

  15. in src/validation.cpp:2849 in 81508954f1 outdated
    2851         bool fPeriodicWrite = mode == FlushStateMode::PERIODIC && nNow > m_last_write + DATABASE_WRITE_INTERVAL;
    2852-        // It's been very long since we flushed the cache. Do this infrequently, to optimize cache usage.
    2853-        bool fPeriodicFlush = mode == FlushStateMode::PERIODIC && nNow > m_last_flush + DATABASE_FLUSH_INTERVAL;
    2854         // Combine all conditions that result in a full cache flush.
    2855-        fDoFullFlush = (mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical || fPeriodicFlush || fFlushForPrune;
    2856+        bool fDoFullFlush = (mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical || fPeriodicWrite || fFlushForPrune;
    


    l0rinc commented at 1:37 pm on August 19, 2024:
    To simplify review, could you please separate the concerns - having low-risk changes (e.g. moves, renames, inlines, rewording) in separate commits from high risk ones?

    andrewtoth commented at 5:44 pm on August 31, 2024:
    There really isn’t much I am doing here that can be split up. I added a second commit for moving the declaration of fDoFullFlush here.
  16. andrewtoth force-pushed on Aug 31, 2024
  17. andrewtoth force-pushed on Aug 31, 2024
  18. andrewtoth force-pushed on Aug 31, 2024
  19. DrahtBot added the label CI failed on Aug 31, 2024
  20. DrahtBot commented at 5:59 pm on August 31, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/29514781087

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  21. DrahtBot removed the label CI failed on Aug 31, 2024
  22. in src/validation.cpp:2843 in 1c1b63a041 outdated
    2842         if (m_last_write == decltype(m_last_write){}) {
    2843             m_last_write = nNow;
    2844-        }
    2845-        if (m_last_flush == decltype(m_last_flush){}) {
    2846-            m_last_flush = nNow;
    2847+            m_random_write_delay = std::chrono::minutes{(FastRandomContext().randrange(DATABASE_WRITE_INTERVAL_RANDOM_DELAY_RANGE))};
    


    sipa commented at 5:40 pm on September 1, 2024:

    It’d be a bit cleaner (and more granular) to have:

    0static constexpr auto DATABASE_WRITE_INTERVAL_MIN{50min};
    1static constexpr auto DATABASE_WRITE_INTERVAL_MAX{70min};
    2...
    3m_random_write_delay = DATABASE_WRITE_INTERVAL_MIN +
    4    FastRandomContext().rand_uniform_duration<NodeClock>(DATABASE_WRITE_INTERVAL_MAX - DATABASE_WRITE_INTERVAL_MIN);`
    

    l0rinc commented at 8:19 pm on September 1, 2024:

    given that this delay is written multiple times, would that be more verbose? Alternatively maybe we could add an m_next_write next to m_last_write, like:

     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index b54446be4e..e2c7272531 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -86,10 +86,9 @@ using node::CBlockIndexHeightOnlyComparator;
     5 using node::CBlockIndexWorkComparator;
     6 using node::SnapshotMetadata;
     7 
     8-/** Time to wait between writing blocks/block index and chainstate to disk. */
     9-static constexpr std::chrono::minutes DATABASE_WRITE_INTERVAL{50};
    10-/** Range of random delay to add to DATABASE_WRITE_INTERVAL */
    11-static constexpr uint32_t DATABASE_WRITE_INTERVAL_RANDOM_DELAY_RANGE{20};
    12+/** Minimum and maximum time to wait between writing blocks/block index and chainstate to disk. */
    13+static constexpr auto DATABASE_WRITE_INTERVAL_MIN{50min};
    14+static constexpr auto DATABASE_WRITE_INTERVAL_MAX{70min};
    15 /** Maximum age of our tip for us to be considered current for fee estimation */
    16 static constexpr std::chrono::hours MAX_FEE_ESTIMATION_TIP_AGE{3};
    17 const std::vector<std::string> CHECKLEVEL_DOC {
    18@@ -2776,6 +2775,11 @@ CoinsCacheSizeState Chainstate::GetCoinsCacheSizeState(
    19     return CoinsCacheSizeState::OK;
    20 }
    21 
    22+NodeClock::time_point Chainstate::CalculateNextWrite(NodeClock::time_point after) const
    23+{
    24+    return FastRandomContext().rand_uniform_delay(after + DATABASE_WRITE_INTERVAL_MIN, DATABASE_WRITE_INTERVAL_MAX - DATABASE_WRITE_INTERVAL_MIN);
    25+}
    26+
    27 bool Chainstate::FlushStateToDisk(
    28     BlockValidationState &state,
    29     FlushStateMode mode,
    30@@ -2840,14 +2844,14 @@ bool Chainstate::FlushStateToDisk(
    31         // Avoid writing/flushing immediately after startup.
    32         if (m_last_write == decltype(m_last_write){}) {
    33             m_last_write = nNow;
    34-            m_random_write_delay = std::chrono::minutes{(FastRandomContext().randrange(DATABASE_WRITE_INTERVAL_RANDOM_DELAY_RANGE))};
    35+            m_next_write = CalculateNextWrite(m_last_write);
    36         }
    37         // The cache is large and we're within 10% and 10 MiB of the limit, but we have time now (not in the middle of a block processing).
    38         bool fCacheLarge = mode == FlushStateMode::PERIODIC && cache_state >= CoinsCacheSizeState::LARGE;
    39         // The cache is over the limit, we have to write now.
    40         bool fCacheCritical = mode == FlushStateMode::IF_NEEDED && cache_state >= CoinsCacheSizeState::CRITICAL;
    41         // It's been a while since we wrote the block index to disk. Do this frequently, so we don't need to redownload after a crash.
    42-        bool fPeriodicWrite = mode == FlushStateMode::PERIODIC && nNow > m_last_write + DATABASE_WRITE_INTERVAL + m_random_write_delay;
    43+        bool fPeriodicWrite = mode == FlushStateMode::PERIODIC && nNow >= m_next_write;
    44         // Combine all conditions that result in a full cache flush.
    45         bool fDoFullFlush = (mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical || fPeriodicWrite || fFlushForPrune;
    46         // Write blocks, block index and best chain related state to disk.
    47@@ -2882,7 +2886,7 @@ bool Chainstate::FlushStateToDisk(
    48                 m_blockman.UnlinkPrunedFiles(setFilesToPrune);
    49             }
    50             m_last_write = nNow;
    51-            m_random_write_delay = std::chrono::minutes{(FastRandomContext().randrange(DATABASE_WRITE_INTERVAL_RANDOM_DELAY_RANGE))};
    52+            m_next_write = CalculateNextWrite(m_last_write);
    53 
    54             if (!CoinsTip().GetBestBlock().IsNull()) {
    55                 LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("write coins cache to disk (%d coins, %.2fkB)",
    56diff --git a/src/validation.h b/src/validation.h
    57index dfe33b8e29..953793bdf7 100644
    58--- a/src/validation.h
    59+++ b/src/validation.h
    60@@ -810,7 +810,9 @@ private:
    61         EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    62 
    63     NodeClock::time_point m_last_write{};
    64-    std::chrono::minutes m_random_write_delay{};
    65+    NodeClock::time_point m_next_write{};
    66+
    67+    NodeClock::time_point CalculateNextWrite(NodeClock::time_point after) const;
    68 
    69     /**
    70      * In case of an invalid snapshot, rename the coins leveldb directory so
    

    note: this way it should probably be nNow >= m_next_write instead of nNow > m_next_write to be consistent with the naming.


    andrewtoth commented at 2:25 am on September 2, 2024:
    Thanks for the suggestions! I could not get @sipa’s to compile, so went with @l0rinc’s.
  23. andrewtoth commented at 1:50 am on September 2, 2024: contributor

    I got some stats for running in steady state for this branch and master. I ran 4 nodes for 11 days and 18 hours using identical aws t2-small instances backed by 20GB EBS volumes all connected to another node in the same VPC.

    Branch Total time writing chainstate Total bytes written
    master 1 26372ms 497 MiB
    master 2 28188ms 497 MiB
    branch 1 32496ms 760 MiB
    branch 2 32566ms 760 MiB

    By writing every hour, we write about 50% more data, but we are only spending 20% more time writing. On master we write about 42 MiB every 24 hours which locks the main thread for 2.3 seconds. On this branch we write about 2.7 MiB every hour which locks the main thread for 115 milliseconds. So it’s about 0.9 MiB more data written every hour, which I think is a negligible amount. Note that this extra data is ephemeral and will be erased as these extra utxos are spent. It will not cause nodes to have to store more data.

    Here are the write operations charts for the time period. You can see the write operations are more evenly spaced out by writing every hour. Master 1 Master 2 Branch 2 Branch 1

  24. andrewtoth force-pushed on Sep 2, 2024
  25. andrewtoth commented at 2:26 am on September 2, 2024: contributor
    @gmaxwell @l0rinc @mzumsande @sipa thank you all for your reviews! I’ve updated the PR to have a random window for periodic flushes of between 50 and 70 minutes. I’ve also added a unit test that checks for a flush in this window. That unit test can be cherry-picked to master and verified that it does not pass.
  26. andrewtoth force-pushed on Sep 2, 2024
  27. in src/validation.cpp:2786 in c529a595dc outdated
    2774@@ -2774,6 +2775,11 @@ CoinsCacheSizeState Chainstate::GetCoinsCacheSizeState(
    2775     return CoinsCacheSizeState::OK;
    2776 }
    2777 
    2778+SteadyClock::time_point CalculateNextWrite(SteadyClock::time_point after)
    2779+{
    2780+    return FastRandomContext().rand_uniform_delay(after + DATABASE_WRITE_INTERVAL_MIN, DATABASE_WRITE_INTERVAL_MAX - DATABASE_WRITE_INTERVAL_MIN);
    2781+}
    


    l0rinc commented at 11:27 am on September 2, 2024:

    I was a bit hasty yesterday - the MIN/MAX naming indicates that the interval is closed (i.e. can actually reach the values), so we should add 1 minute to the range:

    0NodeClock::time_point CalculateNextWrite(const NodeClock::time_point after)
    1{
    2    const auto time{after + DATABASE_WRITE_INTERVAL_MIN};
    3    constexpr auto range{DATABASE_WRITE_INTERVAL_MAX - DATABASE_WRITE_INTERVAL_MIN + 1min};
    4    return FastRandomContext().rand_uniform_delay(time, range);
    5}
    

    See:

     0BOOST_AUTO_TEST_CASE(CalculateNextWrite_interval)
     1{
     2    NodeClock::time_point after = NodeClock::now();
     3    std::set<int64_t> results;
     4    for (int i = 0; i < 100; ++i) {
     5        auto next = CalculateNextWrite(after);
     6        auto diff = (next - after) / std::chrono::minutes(1);
     7        results.insert(diff);
     8    }
     9    BOOST_CHECK_EQUAL(results.size(), 21);
    10}
    

    andrewtoth commented at 2:27 pm on September 2, 2024:
    Done.
  28. in src/test/chainstate_write_tests.cpp:40 in 5cad9d16dd outdated
    30+
    31+    // Time window to periodically flush is between 50 and 70 minutes
    32+    SetMockTime(GetTime<std::chrono::minutes>() + 49min);
    33+    chainstate.FlushStateToDisk(state_dummy, FlushStateMode::PERIODIC);
    34+    m_node.validation_signals->SyncWithValidationInterfaceQueue();
    35+    BOOST_CHECK(!sub->m_did_flush);
    


    l0rinc commented at 12:11 pm on September 2, 2024:

    The initial run should never flush, right? SetMockTime(GetTime<std::chrono::minutes>() + 55min); passes also, so we should likely update the test . Actually, sometimes it passes, other times it fails - this is the reason for my NACK

     0% build/src/test/test_bitcoin --run_test=chainstate_write_tests/chainstate_write_interval        
     1Running 1 test case...
     2src/test/chainstate_write_tests.cpp:35: error: in "chainstate_write_tests/chainstate_write_interval": check !sub->m_did_flush has failed
     3
     4*** 1 failure is detected in the test module "Bitcoin Core Test Suite"
     5% build/src/test/test_bitcoin --run_test=chainstate_write_tests/chainstate_write_interval        
     6Running 1 test case...
     7
     8*** No errors detected
     9% build/src/test/test_bitcoin --run_test=chainstate_write_tests/chainstate_write_interval
    10Running 1 test case...
    11
    12*** No errors detected
    13% build/src/test/test_bitcoin --run_test=chainstate_write_tests/chainstate_write_interval
    14Running 1 test case...
    15
    16*** No errors detected
    17% build/src/test/test_bitcoin --run_test=chainstate_write_tests/chainstate_write_interval
    18Running 1 test case...
    19
    20*** No errors detected
    21% build/src/test/test_bitcoin --run_test=chainstate_write_tests/chainstate_write_interval
    22Running 1 test case...
    23src/test/chainstate_write_tests.cpp:35: error: in "chainstate_write_tests/chainstate_write_interval": check !sub->m_did_flush has failed
    24
    25*** 1 failure is detected in the test module "Bitcoin Core Test Suite"
    

    andrewtoth commented at 1:55 pm on September 2, 2024:
    Well, changing the test to periodically flush after 55 minutes makes it a different test. That would mean the first flush is in the window and could randomly trigger a flush or not. Was the failure after changing to 55min?

    andrewtoth commented at 2:16 pm on September 2, 2024:
    Ahh, I see what you mean. We need to trigger an initial flush to set m_next_write first before testing the periodic flush window. Fixed.
  29. in src/test/chainstate_write_tests.cpp:24 in 5cad9d16dd outdated
    19+    {
    20+        m_did_flush = true;
    21+    }
    22+};
    23+
    24+BOOST_FIXTURE_TEST_CASE(chainstate_write_interval, TestingSetup)
    


    l0rinc commented at 12:41 pm on September 2, 2024:

    Besides the stability and the initial run, I think we should test the valid values as well inside the range, e.g.

     0BOOST_FIXTURE_TEST_CASE(chainstate_write_interval, TestingSetup)
     1{
     2    auto DATABASE_WRITE_INTERVAL_MIN{50min};
     3    auto DATABASE_WRITE_INTERVAL_MAX{70min};
     4    auto now = GetTime<std::chrono::minutes>();
     5    auto sub{std::make_shared<TestSubscriber>()};
     6    m_node.validation_signals->RegisterSharedValidationInterface(sub);
     7    auto test_flush = [&](std::chrono::minutes offset) {
     8        assert(!sub->m_did_flush);
     9        Chainstate& chainstate{Assert(m_node.chainman)->ActiveChainstate()};
    10        BlockValidationState state_dummy{};
    11
    12        SetMockTime(now + offset);
    13        chainstate.FlushStateToDisk(state_dummy, FlushStateMode::PERIODIC);
    14        m_node.validation_signals->SyncWithValidationInterfaceQueue();
    15        return sub->m_did_flush;
    16    };
    17
    18    // Initial run shouldn't flush
    19    BOOST_CHECK_EQUAL(test_flush(DATABASE_WRITE_INTERVAL_MIN + 5min), false);
    20
    21    // Test all values (+1) in the interval
    22    BOOST_CHECK_EQUAL(test_flush(DATABASE_WRITE_INTERVAL_MIN - 1min), false);
    23    for (auto m = DATABASE_WRITE_INTERVAL_MIN; m <= DATABASE_WRITE_INTERVAL_MAX + 1min; m += 1min) {
    24        BOOST_CHECK_EQUAL(test_flush(m), true);
    25    }
    26}
    

    andrewtoth commented at 1:59 pm on September 2, 2024:
    Why would we need to test every value in the window? Testing the edges of the window accomplishes the same thing, no?

    l0rinc commented at 4:14 pm on September 2, 2024:

    ~Using SteadyClock in #30611 (review) seems to have stabilized the tests.~

    Edit: it seems the addition of the first periodic flush stabilized the test.

    Approach ACK

  30. in src/validation.cpp:89 in 5cad9d16dd outdated
    85@@ -86,10 +86,9 @@ using node::CBlockIndexHeightOnlyComparator;
    86 using node::CBlockIndexWorkComparator;
    87 using node::SnapshotMetadata;
    88 
    89-/** Time to wait between writing blocks/block index to disk. */
    90-static constexpr std::chrono::hours DATABASE_WRITE_INTERVAL{1};
    91-/** Time to wait between flushing chainstate to disk. */
    92-static constexpr std::chrono::hours DATABASE_FLUSH_INTERVAL{24};
    93+/** Time window to wait between writing blocks/block index and chainstate to disk. */
    


    l0rinc commented at 12:43 pm on September 2, 2024:
    can we add @sipa’s reasoning here for why it’s not a fixed value?

    andrewtoth commented at 2:27 pm on September 2, 2024:
    Done.
  31. in src/validation.cpp:2849 in 5cad9d16dd outdated
    2839@@ -2837,26 +2840,21 @@ bool Chainstate::FlushStateToDisk(
    2840                 }
    2841             }
    2842         }
    2843-        const auto nNow{SteadyClock::now()};
    2844+        const auto nNow{NodeClock::now()};
    2845         // Avoid writing/flushing immediately after startup.
    


    l0rinc commented at 12:51 pm on September 2, 2024:

    If my understanding is correct:

    • at startup we never want to flush, but the initial value of m_next_write would trigger one, so we have to set the next write value in that case quickly to prevent one
    • when we do flush, we want to set the next flush’s time again

    I think we can deduplicate setting this value twice (which also enables us to inline CalculateNextWrite):

    • setting m_next_write to max, which won’t trigger fDoFullFlush since it’s >> nNow
    • add another boolean where we check for initial run i.e. m_next_write == NodeClock::time_point::max() (or we can inline this, it’s only used once)
    • set the next flush time when fInitialRun || fDoFullFlush, independently from the if (fDoFullFlush) { block
     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index 6b401799bb..33fcc30806 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -2775,11 +2775,6 @@ CoinsCacheSizeState Chainstate::GetCoinsCacheSizeState(
     5     return CoinsCacheSizeState::OK;
     6 }
     7 
     8-NodeClock::time_point CalculateNextWrite(NodeClock::time_point after)
     9-{
    10-    return FastRandomContext().rand_uniform_delay(after + DATABASE_WRITE_INTERVAL_MIN, DATABASE_WRITE_INTERVAL_MAX - DATABASE_WRITE_INTERVAL_MIN);
    11-}
    12-
    13 bool Chainstate::FlushStateToDisk(
    14     BlockValidationState &state,
    15     FlushStateMode mode,
    16@@ -2841,10 +2836,6 @@ bool Chainstate::FlushStateToDisk(
    17             }
    18         }
    19         const auto nNow{NodeClock::now()};
    20-        // Avoid writing/flushing immediately after startup.
    21-        if (m_next_write == decltype(m_next_write){}) {
    22-            m_next_write = CalculateNextWrite(nNow);
    23-        }
    24         // The cache is large and we're within 10% and 10 MiB of the limit, but we have time now (not in the middle of a block processing).
    25         bool fCacheLarge = mode == FlushStateMode::PERIODIC && cache_state >= CoinsCacheSizeState::LARGE;
    26         // The cache is over the limit, we have to write now.
    27@@ -2853,6 +2844,15 @@ bool Chainstate::FlushStateToDisk(
    28         bool fPeriodicWrite = mode == FlushStateMode::PERIODIC && nNow >= m_next_write;
    29         // Combine all conditions that result in a full cache flush.
    30         bool fDoFullFlush = (mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical || fPeriodicWrite || fFlushForPrune;
    31+        // Avoid writing/flushing immediately after startup
    32+        bool fInitialRun = m_next_write == NodeClock::time_point::max();
    33+
    34+        if (fInitialRun || fDoFullFlush) {
    35+            const auto time{nNow + DATABASE_WRITE_INTERVAL_MIN};
    36+            constexpr auto range{DATABASE_WRITE_INTERVAL_MAX - DATABASE_WRITE_INTERVAL_MIN + 1min};
    37+            m_next_write = FastRandomContext().rand_uniform_delay(time, range);
    38+        }
    39+
    40         // Write blocks, block index and best chain related state to disk.
    41         if (fDoFullFlush) {
    42             // Ensure we can write block index
    43@@ -2884,7 +2884,6 @@ bool Chainstate::FlushStateToDisk(
    44 
    45                 m_blockman.UnlinkPrunedFiles(setFilesToPrune);
    46             }
    47-            m_next_write = CalculateNextWrite(nNow);
    48 
    49             if (!CoinsTip().GetBestBlock().IsNull()) {
    50                 LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("write coins cache to disk (%d coins, %.2fkB)",
    51diff --git a/src/validation.h b/src/validation.h
    52index 1c2fd0bd3c..ea737a9c4c 100644
    53--- a/src/validation.h
    54+++ b/src/validation.h
    55@@ -808,7 +808,7 @@ private:
    56     void UpdateTip(const CBlockIndex* pindexNew)
    57         EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    58 
    59-    NodeClock::time_point m_next_write{};
    60+    NodeClock::time_point m_next_write{NodeClock::time_point::max()};
    61 
    62     /**
    63      * In case of an invalid snapshot, rename the coins leveldb directory so
    

    andrewtoth commented at 2:01 pm on September 2, 2024:
    I’m not sure this is an improvement. We would be adding a new variable, instead of calling a function twice. Seems like a wash to me.

    l0rinc commented at 3:53 pm on September 2, 2024:

    we don’t have to add a variable, we can just inline it to:

    0if (fDoFullFlush || m_next_write == NodeClock::time_point::max()) {
    

    Which would simplify a few places (logically, making it obvious when we’re queuing the next run, and also codewise, reducing duplication), removing the helper method and explanatory comments and multiple assignments (i.e. business logic being spread around).


    andrewtoth commented at 9:15 pm on September 2, 2024:
    This seems like just replacing if (m_next_write == decltype(m_next_write){}) { with if (m_next_write == NodeClock::time_point::max()) {, and since it’s max we can just move the if block below fPeriodicWrite assignment. I don’t see how this is an improvement? Doesn’t having multiple checks for fDoFullFlush make it more complex?

    l0rinc commented at 9:24 pm on September 2, 2024:

    it’s not just replacing the if, please apply this patch on top to see what I mean:

     0diff --git a/src/validation.cpp b/src/validation.cpp
     1index 87a144cb99..6fdc070db3 100644
     2--- a/src/validation.cpp
     3+++ b/src/validation.cpp
     4@@ -2778,13 +2778,6 @@ CoinsCacheSizeState Chainstate::GetCoinsCacheSizeState(
     5     return CoinsCacheSizeState::OK;
     6 }
     7 
     8-NodeClock::time_point CalculateNextWrite(NodeClock::time_point after)
     9-{
    10-    const auto time{after + DATABASE_WRITE_INTERVAL_MIN};
    11-    constexpr auto range{DATABASE_WRITE_INTERVAL_MAX - DATABASE_WRITE_INTERVAL_MIN + 1min};
    12-    return FastRandomContext().rand_uniform_delay(time, range);
    13-}
    14-
    15 bool Chainstate::FlushStateToDisk(
    16     BlockValidationState &state,
    17     FlushStateMode mode,
    18@@ -2846,10 +2839,6 @@ bool Chainstate::FlushStateToDisk(
    19             }
    20         }
    21         const auto nNow{NodeClock::now()};
    22-        // Avoid writing/flushing immediately after startup.
    23-        if (m_next_write == decltype(m_next_write){}) {
    24-            m_next_write = CalculateNextWrite(nNow);
    25-        }
    26         // The cache is large and we're within 10% and 10 MiB of the limit, but we have time now (not in the middle of a block processing).
    27         bool fCacheLarge = mode == FlushStateMode::PERIODIC && cache_state >= CoinsCacheSizeState::LARGE;
    28         // The cache is over the limit, we have to write now.
    29@@ -2858,6 +2847,12 @@ bool Chainstate::FlushStateToDisk(
    30         bool fPeriodicWrite = mode == FlushStateMode::PERIODIC && nNow >= m_next_write;
    31         // Combine all conditions that result in a full cache flush.
    32         bool fDoFullFlush = (mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical || fPeriodicWrite || fFlushForPrune;
    33+
    34+        if (fDoFullFlush || m_next_write == NodeClock::time_point::max()) {
    35+            constexpr auto range{DATABASE_WRITE_INTERVAL_MAX - DATABASE_WRITE_INTERVAL_MIN + 1min};
    36+            m_next_write = FastRandomContext().rand_uniform_delay(nNow + DATABASE_WRITE_INTERVAL_MIN, range);
    37+        }
    38+
    39         // Write blocks, block index and best chain related state to disk.
    40         if (fDoFullFlush) {
    41             // Ensure we can write block index
    42@@ -2889,7 +2884,6 @@ bool Chainstate::FlushStateToDisk(
    43 
    44                 m_blockman.UnlinkPrunedFiles(setFilesToPrune);
    45             }
    46-            m_next_write = CalculateNextWrite(nNow);
    47 
    48             if (!CoinsTip().GetBestBlock().IsNull()) {
    49                 LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("write coins cache to disk (%d coins, %.2fkB)",
    50diff --git a/src/validation.h b/src/validation.h
    51index 1c2fd0bd3c..ea737a9c4c 100644
    52--- a/src/validation.h
    53+++ b/src/validation.h
    54@@ -808,7 +808,7 @@ private:
    55     void UpdateTip(const CBlockIndex* pindexNew)
    56         EXCLUSIVE_LOCKS_REQUIRED(::cs_main);
    57 
    58-    NodeClock::time_point m_next_write{};
    59+    NodeClock::time_point m_next_write{NodeClock::time_point::max()};
    60 
    61     /**
    62      * In case of an invalid snapshot, rename the coins leveldb directory so
    

    andrewtoth commented at 2:33 am on September 3, 2024:
    Ok, done!
  32. l0rinc commented at 12:57 pm on September 2, 2024: contributor

    The test runs unreliably for some reason, we need to fix that first.

    NACK 5cad9d16dd07859a76c6637789695bf1b4f36e1c

    On this branch […] main thread for 115 milliseconds.

    This sounds awesome!

    Good point, I can add a test that that expects flush every 24 hours, then reduce it to 1 hour after this change.

    Do you think we could still do something like this, i.e. add a test as a very first commit, which reproduces the old behavior first?

  33. l0rinc changes_requested
  34. andrewtoth force-pushed on Sep 2, 2024
  35. andrewtoth force-pushed on Sep 2, 2024
  36. andrewtoth force-pushed on Sep 2, 2024
  37. andrewtoth commented at 2:48 pm on September 2, 2024: contributor

    add a test as a very first commit, which reproduces the old behavior first?

    Yeah, I realize the test commit can’t be cherry-picked cleanly to master. Updated to have the test commit first, and update the test for each change.

  38. andrewtoth force-pushed on Sep 2, 2024
  39. in src/test/chainstate_write_tests.cpp:36 in 1d50f72d16 outdated
    31+    // The first periodic flush sets m_last_write and does not flush
    32+    chainstate.FlushStateToDisk(state_dummy, FlushStateMode::PERIODIC);
    33+    m_node.validation_signals->SyncWithValidationInterfaceQueue();
    34+    BOOST_CHECK(!sub->m_did_flush);
    35+
    36+    // Time to periodically flush is between 50 and 70 minutes
    


    l0rinc commented at 4:17 pm on September 2, 2024:
    0    // Time to periodically flush is between 50 and 70 minutes (inclusive)
    

    andrewtoth commented at 2:32 am on September 3, 2024:
    Done.
  40. in src/test/chainstate_write_tests.cpp:40 in 1d50f72d16 outdated
    35+
    36+    // Time to periodically flush is between 50 and 70 minutes
    37+    SetMockTime(GetTime<std::chrono::minutes>() + 49min);
    38+    chainstate.FlushStateToDisk(state_dummy, FlushStateMode::PERIODIC);
    39+    m_node.validation_signals->SyncWithValidationInterfaceQueue();
    40+    BOOST_CHECK(!sub->m_did_flush);
    


    l0rinc commented at 4:19 pm on September 2, 2024:
    if you decide to keep this structure, please add boundary test for 50min turning m_did_flush to true. Would also cover + 71min for the same reason.

    andrewtoth commented at 4:21 pm on September 2, 2024:
    I don’t think it would be true, because it’s random.

    l0rinc commented at 6:25 pm on September 2, 2024:
    got it, you can resolve this comment
  41. andrewtoth force-pushed on Sep 2, 2024
  42. andrewtoth force-pushed on Sep 3, 2024
  43. in src/test/chainstate_write_tests.cpp:31 in 1b448f5bca outdated
    26+    const auto sub{std::make_shared<TestSubscriber>()};
    27+    m_node.validation_signals->RegisterSharedValidationInterface(sub);
    28+    Chainstate& chainstate{Assert(m_node.chainman)->ActiveChainstate()};
    29+    BlockValidationState state_dummy{};
    30+
    31+    // The first periodic flush sets m_last_write and does not flush
    


    l0rinc commented at 8:34 am on September 3, 2024:
    we don’t have m_last_write anymore

    andrewtoth commented at 3:58 pm on September 8, 2024:
    Fixed.
  44. in src/test/chainstate_write_tests.cpp:36 in 81e9d97b86 outdated
    33     chainstate.FlushStateToDisk(state_dummy, FlushStateMode::PERIODIC);
    34     m_node.validation_signals->SyncWithValidationInterfaceQueue();
    35     BOOST_CHECK(!sub->m_did_flush);
    36 
    37-    SetMockTime(GetTime<std::chrono::minutes>() + 23h + 59min);
    38+    // Time to periodically flush is 1 hour
    


    l0rinc commented at 8:38 am on September 3, 2024:

    nit: 1 hour is a duration or interval or period, but technically not a time:

    0    // The periodic flush interval is 1 hour
    

    andrewtoth commented at 3:58 pm on September 8, 2024:
    Fixed.
  45. in src/validation.cpp:93 in 81e9d97b86 outdated
    85@@ -86,10 +86,8 @@ using node::CBlockIndexHeightOnlyComparator;
    86 using node::CBlockIndexWorkComparator;
    87 using node::SnapshotMetadata;
    88 
    89-/** Time to wait between writing blocks/block index to disk. */
    90+/** Time to wait between writing blocks/block index and chainstate to disk. */
    91 static constexpr std::chrono::hours DATABASE_WRITE_INTERVAL{1};
    92-/** Time to wait between flushing chainstate to disk. */
    93-static constexpr std::chrono::hours DATABASE_FLUSH_INTERVAL{24};
    


    l0rinc commented at 8:41 am on September 3, 2024:
    could you please explain the flush removal in the corresponding commit message as well?

    andrewtoth commented at 3:58 pm on September 8, 2024:
    Added a sentence in the commit message.
  46. in src/validation.cpp:2850 in 81e9d97b86 outdated
    2853-        bool fPeriodicFlush = mode == FlushStateMode::PERIODIC && nNow > m_last_flush + DATABASE_FLUSH_INTERVAL;
    2854         // Combine all conditions that result in a full cache flush.
    2855-        fDoFullFlush = (mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical || fPeriodicFlush || fFlushForPrune;
    2856-        // Write blocks and block index to disk.
    2857-        if (fDoFullFlush || fPeriodicWrite) {
    2858+        fDoFullFlush = (mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical || fPeriodicWrite || fFlushForPrune;
    


    l0rinc commented at 8:52 am on September 3, 2024:
    is the name still accurate, now that a periodic write can also trigger it (i.e. not just flushes, like before)?

    andrewtoth commented at 3:59 pm on September 8, 2024:
    Renamed to should_write.
  47. in src/validation.cpp:2879 in 81e9d97b86 outdated
    2897-            // Flush the chainstate (which may refer to block index entries).
    2898-            const auto empty_cache{(mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical};
    2899-            if (empty_cache ? !CoinsTip().Flush() : !CoinsTip().Sync()) {
    2900-                return FatalError(m_chainman.GetNotifications(), state, _("Failed to write to coin database."));
    2901+
    2902+            if (!CoinsTip().GetBestBlock().IsNull()) {
    


    l0rinc commented at 9:01 am on September 3, 2024:

    Is it even possible for this to be false, given that we skip flush on first run?

    nit: just an observation, the starting ! hides well here, if you think it read better, here’s an alternative (though I see that IsNull is used in a few other places):

    0            if (CoinsTip().GetBestBlock() != uint256::ZERO) {
    

    andrewtoth commented at 4:02 pm on September 3, 2024:

    Is it even possible for this to be false, given that we skip flush on first run?

    Can we logically determine it will never be false? If so, we can simplify this quite a bit.


    l0rinc commented at 4:17 pm on September 3, 2024:
    what would happen when we restart with a different data directory, it will have to refetch everything, right? Or if we had a reorg on the very first block maybe? Maybe @sipa can help us out here…

    andrewtoth commented at 6:08 pm on September 6, 2024:

    Hmm there are a lot of entry points to FlushStateToDisk, and many places where SetBestBlock is called. I think this PR should leave this as is to not complicate review.

    I was thinking we could remove the full_flush_completed variable if we didn’t have this check.


    l0rinc commented at 6:49 pm on September 8, 2024:
    you can resolve this comment
  48. in src/validation.cpp:2893 in 81e9d97b86 outdated
    2911+                if (!CheckDiskSpace(m_chainman.m_options.datadir, 48 * 2 * 2 * CoinsTip().GetCacheSize())) {
    2912+                    return FatalError(m_chainman.GetNotifications(), state, _("Disk space is too low!"));
    2913+                }
    2914+                // Flush the chainstate (which may refer to block index entries).
    2915+                const auto empty_cache{(mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical};
    2916+                if (empty_cache ? !CoinsTip().Flush() : !CoinsTip().Sync()) {
    


    l0rinc commented at 9:19 am on September 3, 2024:
    I know this line didn’t change here, I’m looking for answers: I find it quite confusing that most of the side-effects of this method are actually done inside if conditions - I wanted to understand whether the name fDoFullFlush is accurate, but had a hard time understanding which methods are just checking state (e.g. CheckDiskSpace, I assume) and which ones are changing state (e.g. empty_cache ? !CoinsTip().Flush() : !CoinsTip().Sync()). So my question is whether it’s fair to call the condition fDoFullFlush, when sometimes it just Syncs (if I understand the algo correctly)?

    andrewtoth commented at 4:04 pm on September 3, 2024:
    Yeah we could rename fDoFullFlush. What about should_write?

    andrewtoth commented at 3:59 pm on September 8, 2024:
    Renamed.
  49. in src/test/chainstate_write_tests.cpp:1 in 1b448f5bca outdated
    0@@ -0,0 +1,48 @@
    1+// Copyright (c) 2024- The Bitcoin Core developers
    


    l0rinc commented at 9:27 am on September 3, 2024:

    I think the convention is:

    0// Copyright (c) 2024-present The Bitcoin Core developers
    

    andrewtoth commented at 3:59 pm on September 8, 2024:
    Removed the dates altogether.
  50. in src/validation.cpp:2851 in 1b448f5bca outdated
    2851-        bool fPeriodicWrite = mode == FlushStateMode::PERIODIC && nNow > m_last_write + DATABASE_WRITE_INTERVAL;
    2852+        bool fPeriodicWrite = mode == FlushStateMode::PERIODIC && nNow >= m_next_write;
    2853         // Combine all conditions that result in a full cache flush.
    2854         bool fDoFullFlush = (mode == FlushStateMode::ALWAYS) || fCacheLarge || fCacheCritical || fPeriodicWrite || fFlushForPrune;
    2855+
    2856+        if (fDoFullFlush || m_next_write == NodeClock::time_point::max()) {
    


    l0rinc commented at 9:30 am on September 3, 2024:
    we do more than add randomness to periodic write interval in this commit now, could you please split it up a bit more?

    andrewtoth commented at 3:59 pm on September 8, 2024:
    Split into 2 commits.
  51. l0rinc commented at 9:34 am on September 3, 2024: contributor
    Thanks for quick responses, my main concern at this stage is that the commit messages don’t explain the changes in enough detail and some symbols might need a rename after the change
  52. DrahtBot added the label CI failed on Sep 7, 2024
  53. andrewtoth force-pushed on Sep 8, 2024
  54. DrahtBot removed the label CI failed on Sep 8, 2024
  55. l0rinc commented at 6:39 pm on September 8, 2024: contributor

    Running

    0hyperfine \
    1--runs 5 \
    2--export-json /mnt/ibd_full.json \
    3--parameter-list COMMIT 8aabdfa6,1b448f5b \
    4--prepare 'git checkout {COMMIT} && git clean -fxd && git reset --hard && cmake -B build && cmake --build build -j8 && rm -rf /mnt/BitcoinData/*' \
    5'COMMIT={COMMIT} ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=820000 -dbcache=16384 -printtoconsole=0'
    

    on a HDD SATA 2,0 TB Enterprise resulted in:

    0 Benchmark 1: COMMIT=8aabdfa6 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=820000 -dbcache=16384 -printtoconsole=0
    1  Time (mean ± σ):     33934.695 s ± 530.080 s    [User: 27857.218 s, System: 1873.581 s]
    2  Range (min  max):   33208.510 s  34511.668 s    5 runs
    
    0Benchmark 2: COMMIT=1b448f5b ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=820000 -dbcache=16384 -printtoconsole=0
    1  Time (mean ± σ):     36605.266 s ± 584.663 s    [User: 29487.564 s, System: 2228.572 s]
    2  Range (min  max):   35846.681 s  37275.497 s    5 runs
    
    0Summary
    1  COMMIT=8aabdfa6 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=820000 -dbcache=16384 -printtoconsole=0 ran
    2    1.08 ± 0.02 times faster than COMMIT=1b448f5b ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=820000 -dbcache=16384 -printtoconsole=0
    
     0 "results": [
     1    {
     2      "command": "COMMIT=8aabdfa6 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=820000 -dbcache=16384 -printtoconsole=0",
     3      "mean": 33934.69470762795,
     4      "stddev": 530.0799467530879,
     5      "median": 34177.67060207214,
     6      "user": 27857.2184555,
     7      "system": 1873.58075582,
     8      "min": 33208.50992151314,
     9      "max": 34511.66842756214,
    10      "times": [
    11        33208.50992151314,
    12        34511.66842756214,
    13        33571.37685305914,
    14        34177.67060207214,
    15        34204.24773393314
    16      ],
    17      "exit_codes": [
    18        0,
    19        0,
    20        0,
    21        0,
    22        0
    23      ],
    24      "parameters": {
    25        "COMMIT": "8aabdfa6"
    26      }
    27    },
    28    {
    29      "command": "COMMIT=1b448f5b ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=820000 -dbcache=16384 -printtoconsole=0",
    30      "mean": 36605.26595186774,
    31      "stddev": 584.6634786404139,
    32      "median": 36732.99369363214,
    33      "user": 29487.564016900003,
    34      "system": 2228.57235822,
    35      "min": 35846.68126299314,
    36      "max": 37275.49678005814,
    37      "times": [
    38        36182.10911623014,
    39        37275.49678005814,
    40        36732.99369363214,
    41        36989.04890642514,
    42        35846.68126299314
    43      ],
    44      "exit_codes": [
    45        0,
    46        0,
    47        0,
    48        0,
    49        0
    50      ],
    51      "parameters": {
    52        "COMMIT": "1b448f5b"
    53      }
    54    }
    55  ]
    56}
    

    So running the benchmark for almost 100 hours, having a standard deviation of ~9 minutes for both benchmarks (5 runs per commit) reveals that the full pre-assumevalid IBD consistently takes ~45 minutes (8%) longer after the change.

    Note that this was also run with actual nodes (i.e. non-local copies) to be as realistic as possible and was using a HDD.

    Edit: could we maybe test a 100-200 minute window instead of 60?

  56. DrahtBot added the label CI failed on Sep 9, 2024
  57. DrahtBot removed the label CI failed on Sep 13, 2024
  58. andrewtoth force-pushed on Sep 22, 2024
  59. andrewtoth commented at 1:23 pm on September 22, 2024: contributor

    Rebased to benchmark reindex-chainstate with #28358 and a cache size holding the entire utxo set.

    the full pre-assumevalid IBD consistently takes ~45 minutes (8%) longer after the change

    I got a hold of an HDD, and surprisingly my benchmarks show the opposite. I kept the blocksdir on the SSD and only had the datadir on the HDD to isolate the effects of this change. When changing from writing at the end to writing every hour, writing every hour was ~25 minutes (6%) faster.

    Command Mean [s] Min [s] Max [s] Relative
    echo 484aee0e16aff9c5fcfe66157c67bf29c66baa2a && ./build/src/bitcoind -datadir=/media/user/Barracuda/bitcoin -blocksdir=/home/user/.bitcoin -dbcache=27000 -reindex-chainstate -printtoconsole=0 -connect=0 -stopatheight=840000 25447.840 ± 390.609 25171.638 25724.043 1.00
    echo 17cf3d64b8a56763b9aff8b8af6ac8bd2b100ff6 && ./build/src/bitcoind -datadir=/media/user/Barracuda/bitcoin -blocksdir=/home/user/.bitcoin -dbcache=27000 -reindex-chainstate -printtoconsole=0 -connect=0 -stopatheight=840000 26932.607 ± 232.182 26768.430 27096.785 1.06 ± 0.02

    Notably, before this change flushing the chainstate to disk at the end of the reindex took longer than the entire reindex into memory on an HDD (4 hours of flushing and ~7.5 hours total time). During that entire flush bitcoind will be unresponsive as the main lock is held. So, this change is also a UX improvement because the writes will be more spaced out and not block RPC responses.

  60. l0rinc commented at 12:58 pm on September 25, 2024: contributor

    I did a rebased -reindex-chainstate as well, this seems to be basically the same as before:

     0# git checkout 33adc7521cc8bb24b941d959022b084002ba7c60
     1HEAD is now at 33adc7521c Merge bitcoin/bitcoin#30765: refactor: Allow `CScript`'s `operator<<` to accept spans, not just vectors
     2
     3# git checkout 391c87640d78d9821b87e3f3de755af40a191d24
     4HEAD is now at 391c87640d validation: add randomness to periodic write interval
     5
     6hyperfine \
     7--runs 3 \
     8--export-json /mnt/ibd_full-30611.json \
     9--parameter-list COMMIT 33adc7521cc8bb24b941d959022b084002ba7c60,391c87640d78d9821b87e3f3de755af40a191d24 \
    10--prepare 'git checkout {COMMIT} && git clean -fxd && git reset --hard && cmake -B build && cmake --build build -j8' \
    11'COMMIT={COMMIT} ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -reindex-chainstate -printtoconsole=0'
    

    Benchmark 1: COMMIT=33adc7521cc8bb24b941d959022b084002ba7c60 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -reindex-chainstate -printtoconsole=0

    0  Time (mean ± σ):     22795.930 s ± 42.220 s    [User: 19545.844 s, System: 926.963 s]
    1  Range (min … max):   22756.042 s … 22840.149 s    3 runs
    

    Benchmark 2: COMMIT=391c87640d78d9821b87e3f3de755af40a191d24 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -reindex-chainstate -printtoconsole=0

    0  Time (mean ± σ):     22962.909 s ± 457.340 s    [User: 19563.315 s, System: 923.347 s]
    1  Range (min … max):   22461.120 s … 23356.337 s    3 runs
    
    0  COMMIT=33adc7521cc8bb24b941d959022b084002ba7c60 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -reindex-chainstate -printtoconsole=0 ran
    1    1.01 ± 0.02 times faster than COMMIT=391c87640d78d9821b87e3f3de755af40a191d24 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -reindex-chainstate -printtoconsole=0
    

    I’ll rerun on a full IBD for the same params:

    0hyperfine \
    1--runs 3 \
    2--export-json /mnt/ibd-30611-full.json \
    3--parameter-list COMMIT 33adc7521cc8bb24b941d959022b084002ba7c60,391c87640d78d9821b87e3f3de755af40a191d24 \
    4--prepare 'git checkout {COMMIT} && git clean -fxd && git reset --hard && cmake -B build && cmake --build build -j8 && rm -rf /mnt/BitcoinData/*' \
    5'COMMIT={COMMIT} ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -printtoconsole=0'
    
  61. andrewtoth commented at 1:20 pm on September 25, 2024: contributor

    I did a rebased -reindex-chainstate as well, this seems to be basically the same as before:

    Nice, so there’s no speed regression but crash resistance gained!

    Also, I benchmarked only at 484aee0e16aff9c5fcfe66157c67bf29c66baa2a instead of 391c87640d78d9821b87e3f3de755af40a191d24 so that there’s no randomness in the benchmarks. It can give us a better picture of what this PR affects.

  62. l0rinc commented at 1:24 pm on September 25, 2024: contributor

    so that there’s no randomness in the benchmarks

    I’ve rebased on master to have up-to-date data. What kind of randomness are you referring to?

    there’s no speed regression

    3 benchmarks, all showed different things. We have to find out what causes these differences, that’s why I want to try a complete IBD again (which seems like the only downside so far), since ultimately most people won’t have blocksdir on the SSD and datadir on the HDD doing a reindex.

  63. andrewtoth commented at 1:29 pm on September 25, 2024: contributor

    What kind of randomness are you referring to?

    The random window for when this PR will flush introduced in the last commit. That is good for steady state operation to not have all nodes converge on the same flush time. But, for benchmarking a reindex or IBD it will just add randomness to the results.

    3 benchmarks, all showed different things.

    Well, they are all different benchmarks with different variables. Changing more variables by running different benchmarks won’t necessarily give us a clearer picture.

  64. l0rinc commented at 7:36 pm on September 28, 2024: contributor

    running different benchmarks won’t necessarily give us a clearer picture.

    It could get confusing if they’re not pointing in the same direction (which seems to be the case here). But we need to investigate, we can’t have surprises, right?


    I reran the previous full IBD on the previous setup with HDD, but this time until 860000 with 30gb memory.

    0hyperfine --runs 3 \
    1--export-json /mnt/ibd-30611-full.json \
    2--parameter-list COMMIT 33adc7521cc8bb24b941d959022b084002ba7c60,391c87640d78d9821b87e3f3de755af40a191d24 \
    3--prepare 'git checkout {COMMIT} && git clean -fxd && git reset --hard && cmake -B build && cmake --build build -j8 && rm -rf /mnt/BitcoinData/*' \
    4'COMMIT={COMMIT} ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -printtoconsole=0'
    

    Benchmark 1: COMMIT=33adc7521cc8bb24b941d959022b084002ba7c60 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -printtoconsole=0

    0  Time (mean ± σ):     37410.149 s ± 264.840 s    [User: 32792.866 s, System: 1938.561 s]
    1  Range (min … max):   37105.210 s … 37582.606 s    3 runs
    

    I.e. the three independent measurements before the change were:

    • 10.42 hours
    • 10.43 hours
    • 10.30 hours
    • 37542.6300163161
    • 37582.6056244741
    • 37105.2102307041

    Benchmark 2: COMMIT=391c87640d78d9821b87e3f3de755af40a191d24 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -printtoconsole=0 Time (mean ± σ): 41970.584 s ± 995.527 s [User: 35046.785 s, System: 2172.531 s] Range (min … max): 41203.415 s … 43095.560 s 3 runs

    Having the measurements:

    • 11.44 hours
    • 11.97 hours
    • 11.55 hours
    • 41203.4147152671
    • 43095.5604828261
    • 41612.7761091461

    Summary COMMIT=33adc7521cc8bb24b941d959022b084002ba7c60 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -printtoconsole=0 ran 1.12 ± 0.03 times faster than COMMIT=391c87640d78d9821b87e3f3de755af40a191d24 ./build/src/bitcoind -datadir=/mnt/BitcoinData -stopatheight=860000 -dbcache=30000 -printtoconsole=0

    i.e. 10% slower, roughly 1 hour longer IBD on a HDD if we flush every hour.

    As you can see the measurements are pretty consistently faster before and slower after. Repeated the same experiment twice, 5 times first time, 3 times now, it doesn’t seem like a fluke.

    I understand that it’s inconvenient (as the change makes theoretical sense to me), but maybe we can find a different interval (e.g. 100 minutes maybe?) where the difference isn’t this big.

  65. l0rinc commented at 11:44 am on September 30, 2024: contributor
    @davidgumberg mentioned that it’s possible flushing the block index is causing the regression. This could also explain why there isn’t a regression when the blocksdir is on an SSD and the coinsdb is not.
  66. l0rinc commented at 10:18 am on October 13, 2024: contributor

    I redid the benchmarks on an SSD with 100-120 second range:

    • (before) 5e90501abe test: chainstate write test for periodic chainstate flush
    • (write every 50-70 minutes) ffc9df87a0 validation: add randomness to periodic write interval
    • (write every 100-120 minutes) f5e7b1508a Increase DATABASE_WRITE_INTERVAL MIN/MAX to 100/120
    0hyperfine \
    1--runs 1 \
    2--export-json /mnt/my_storage/ibd_full-write-chainstate-every-hour.json \
    3--parameter-list COMMIT 5e90501abe13731579c61b13092b31b9fc4f8c4d,ffc9df87a0da35b451720556bdb3f0f65f315c8f,f5e7b1508a06c0c7a9c06d1ac270840fd72e9e6f \
    4--prepare 'rm -rf /mnt/my_storage/BitcoinData/* && git checkout {COMMIT} && git clean -fxd && git reset --hard && cmake -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_UTIL=OFF -DBUILD_TX=OFF -DBUILD_TESTS=OFF -DENABLE_WALLET=OFF -DINSTALL_MAN=OFF && cmake \
    5--build build -j$(nproc)' \
    6'COMMIT={COMMIT} ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0'
    
     0Benchmark 1: COMMIT=5e90501abe13731579c61b13092b31b9fc4f8c4d ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0
     1  Time (abs ≡):        30685.829 s               [User: 29155.745 s, System: 4840.419 s]
     2
     3Benchmark 2: COMMIT=ffc9df87a0da35b451720556bdb3f0f65f315c8f ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0
     4  Time (abs ≡):        31140.007 s               [User: 29761.900 s, System: 4909.715 s]
     5
     6Benchmark 3: COMMIT=f5e7b1508a06c0c7a9c06d1ac270840fd72e9e6f ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0
     7  Time (abs ≡):        31935.055 s               [User: 29501.483 s, System: 4915.473 s]
     8
     9Summary
    10  'COMMIT=5e90501abe13731579c61b13092b31b9fc4f8c4d ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0' ran
    11    1.01 times faster than 'COMMIT=ffc9df87a0da35b451720556bdb3f0f65f315c8f ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0'
    12    1.04 times faster than 'COMMIT=f5e7b1508a06c0c7a9c06d1ac270840fd72e9e6f ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0'
    

    The measurements are pretty close this time, I’ll try the same on an HDD as well.

  67. l0rinc commented at 11:26 pm on October 14, 2024: contributor

    Finished the same benchmarks on a HDD with very similar results:

    0# hyperfine \
    1--runs 1 \
    2--export-json /mnt/my_storage/ibd_full-write-chainstate-every-hour.json \
    3--parameter-list COMMIT 5e90501abe13731579c61b13092b31b9fc4f8c4d,ffc9df87a0da35b451720556bdb3f0f65f315c8f,f5e7b1508a06c0c7a9c06d1ac270840fd72e9e6f \
    4--prepare 'rm -rf /mnt/my_storage/BitcoinData/* && git checkout {COMMIT} && git clean -fxd && git reset \
    5--hard && cmake -B build -DCMAKE_BUILD_TYPE=Release -DBUILD_UTIL=OFF -DBUILD_TX=OFF -DBUILD_TESTS=OFF -DENABLE_WALLET=OFF -DINSTALL_MAN=OFF && cmake \
    6--build build -j$(nproc)' \
    7'COMMIT={COMMIT} ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0'
    
     0Benchmark 1: COMMIT=5e90501abe13731579c61b13092b31b9fc4f8c4d ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0
     1  Time (abs ≡):        35216.011 s               [User: 28946.209 s, System: 4439.914 s]
     2
     3Benchmark 2: COMMIT=ffc9df87a0da35b451720556bdb3f0f65f315c8f ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0
     4  Time (abs ≡):        35007.008 s               [User: 28473.073 s, System: 4140.873 s]
     5
     6Benchmark 3: COMMIT=f5e7b1508a06c0c7a9c06d1ac270840fd72e9e6f ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0
     7  Time (abs ≡):        35625.822 s               [User: 28383.805 s, System: 4280.052 s]
     8
     9Summary
    10  COMMIT=ffc9df87a0da35b451720556bdb3f0f65f315c8f ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0 ran
    11    1.01 times faster than COMMIT=5e90501abe13731579c61b13092b31b9fc4f8c4d ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0
    12    1.02 times faster than COMMIT=f5e7b1508a06c0c7a9c06d1ac270840fd72e9e6f ./build/src/bitcoind -datadir=/mnt/my_storage/BitcoinData -stopatheight=800000 -dbcache=2000 -printtoconsole=0
    

    git checkout 5e90501abe13731579c61b13092b31b9fc4f8c4d

    HEAD is now at 5e90501abe test: chainstate write test for periodic chainstate flush

    git checkout ffc9df87a0da35b451720556bdb3f0f65f315c8f

    HEAD is now at ffc9df87a0 validation: add randomness to periodic write interval

    git checkout f5e7b1508a06c0c7a9c06d1ac270840fd72e9e6f

    HEAD is now at f5e7b1508a Increase DATABASE_WRITE_INTERVAL MIN/MAX to 100/120


    Based on these I’m fine with 50/70 or 100/120 minute writes. Approach ACK - let’s discuss whether this is the final form.

  68. DrahtBot added the label Needs rebase on Nov 11, 2024
  69. test: chainstate write test for periodic chainstate flush 118305452d
  70. validation: write chainstate to disk every hour
    Remove the 24 hour periodic flush interval and
    write the chainstate along with blocks and block
    index every hour
    a61abb003e
  71. refactor: rename fDoFullFlush to should_write 97184419fe
  72. refactor: replace m_last_write with m_next_write
    Co-Authored-By: l0rinc <pap.lorinc@gmail.com>
    63bfc82ce0
  73. validation: add randomness to periodic write interval
    Co-Authored-By: Pieter Wuille <pieter@wuille.net>
    Co-Authored-By: l0rinc <pap.lorinc@gmail.com>
    5cf6dcb08a
  74. andrewtoth force-pushed on Nov 11, 2024
  75. andrewtoth commented at 3:26 pm on November 11, 2024: contributor
    Rebased for renaming TRACE5 -> TRACEPOINT.
  76. DrahtBot removed the label Needs rebase on Nov 11, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me