coins: warn on shutdown for big UTXO set flushes #31534

pull l0rinc wants to merge 1 commits into bitcoin:master from l0rinc:l0rinc/warn-big-flush changing 1 files +5 −2
  1. l0rinc commented at 6:44 pm on December 18, 2024: contributor

    Split out of #30611 (comment)

    Setting a large -dbcache size postpones the index writes until the coins cache size exceeds the specified limit. This causes the final flush after manual termination to seemingly hang forever (e.g. tens of minutes for 20 GiB); Now that the dbcache upper cap has been lifted, this will become even more apparent, so a warning will be shown when large UTXO sets are flushed (currently >1 GiB), such as:

    2024-12-18T18:25:03Z Flushed fee estimates to fee_estimates.dat. 2024-12-18T18:25:03Z [warning] Flushing large (1 GiB) UTXO set to disk, it may take several minutes 2024-12-18T18:25:09Z Shutdown: done


    You can reproduce it by starting bitcoind with a large -dbcache:

    mkdir demo && cmake -B build -DCMAKE_BUILD_TYPE=Release && cmake –build build -j$(nproc) && build/src/bitcoind -datadir=demo -dbcache=10000

    Waiting until the used memory is over 1 GiB

    2024-12-18T18:25:02Z UpdateTip: […] progress=0.069009 cache=1181.1MiB(8827981txo)

    And cancelling the process from the terminal:

    ^C2024-12-18T18:25:03Z tor: Thread interrupt […] 2024-12-18T18:25:03Z *[warning] Flushing large (1 GiB) UTXO set to disk, it may take several minutes

  2. DrahtBot commented at 6:44 pm on December 18, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/31534.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK tdb3, 1440000bytes, sipa, danielabrozzoni
    Concept ACK theuni, BrandonOdiwuor

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30611 (validation: write chainstate to disk every hour by andrewtoth)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label UTXO Db and Indexes on Dec 18, 2024
  4. 1440000bytes commented at 7:48 pm on December 18, 2024: none
    Why is this a warning and not normal log?
  5. l0rinc commented at 7:55 pm on December 18, 2024: contributor

    Why is this a warning and not normal log?

    Instead of logging every time that we’re flushing, we only show this when the process will hang for a (likely unexpectedly) long time - hence the warn.

  6. 1440000bytes approved
  7. in src/validation.cpp:2932 in 2abfb8372d outdated
    2928@@ -2929,6 +2929,7 @@ bool Chainstate::FlushStateToDisk(
    2929         }
    2930         // Flush best chain related state. This can only be done if the blocks / block index write was also done.
    2931         if (fDoFullFlush && !CoinsTip().GetBestBlock().IsNull()) {
    2932+            if (auto mem_gib{coins_mem_usage >> 30}) LogWarning("Flushing large %d GiB UTXO set to disk...", mem_gib);
    


    theuni commented at 9:56 pm on December 18, 2024:

    Could you please name a constant for this?

    And while we’re at it… it’d be nice to express these sizes in terms of user-defined literals. See: https://github.com/theuni/bitcoin/commits/user-defined-size-literals/

    So this could become:

    0static constexpr auto WARN_FLUSH_COINS_SIZE = 1GiB;
    1if(coins_mem_usage >= WARN_FLUSH_COINS_SIZE)
    2...
    

    l0rinc commented at 10:15 pm on December 18, 2024:

    I don’t mind user defined literals (it’s what we did for hexadecimal conversion as well) - but to be fair, 32<<20 isn’t a lot worse than32_MiB (but the second may be more descriptive, indeed).

    Extracted the constant, let me know how you’d like to apply the user-defined literals - my understanding is that they’re not merged yet, right?

    Edit: how would we convert bytes to gigabytes that way? coins_mem_usage / 1_GiB instead of coins_mem_usage >> 30?


    theuni commented at 10:25 pm on December 18, 2024:

    Yeah, I find coins_mem_usage / 1_GiB to be much more obvious in meaning.

    But sure, it’s out of scope for this PR. I’ll open another one for that.

  8. theuni commented at 9:56 pm on December 18, 2024: member
    Concept ACK. Good idea!
  9. l0rinc force-pushed on Dec 18, 2024
  10. l0rinc force-pushed on Dec 18, 2024
  11. in src/validation.cpp:2934 in 36a8133c7b outdated
    2930@@ -2929,6 +2931,7 @@ bool Chainstate::FlushStateToDisk(
    2931         }
    2932         // Flush best chain related state. This can only be done if the blocks / block index write was also done.
    2933         if (fDoFullFlush && !CoinsTip().GetBestBlock().IsNull()) {
    2934+            if (coins_mem_usage >= WARN_FLUSH_COINS_SIZE) LogWarning("Flushing large %d GiB UTXO set to disk...", coins_mem_usage >> 30);
    


    theuni commented at 10:30 pm on December 18, 2024:

    Thanks for the constant.

    This is now printing GiB, and the very next log line will be printing kB. Not exactly pretty, but I suppose it’s only when bench is on.

    Also, to the average user seeing this, it may not be exactly clear what you’re trying to communicate. How about adding “This may take some time.” or so?


    l0rinc commented at 10:45 pm on December 18, 2024:

    This is now printing GiB, and the very next log line will be printing kB. Not exactly pretty, but I suppose it’s only when bench is on.

    Yeah, I saw that, it’s dividing by 1000 - changed to KiB and >> 10.

  12. theuni commented at 10:33 pm on December 18, 2024: member
    utACK after addressing the string nit.
  13. l0rinc force-pushed on Dec 18, 2024
  14. BrandonOdiwuor commented at 10:39 am on December 19, 2024: contributor
    Concept ACK
  15. tdb3 approved
  16. tdb3 commented at 4:36 pm on December 21, 2024: contributor

    Tested ACK ec8fa17872c8cb7f214a451aaa23701ced8bd5fb

    Great addition! This will prevent many users from believing bitcoind is “stuck/frozen”.

    Performed IBD to block ~100k (dbcache=6000, cache used < 1000MiB). Stopped bitcoind and (as expected) did not see the warning.

    Restarted IBD and stopped at block ~300k (cache used > 1000MiB). Warning displayed as expected.

     0^C
     12024-12-21T16:28:00Z Shutdown: In progress...
     22024-12-21T16:28:00Z opencon thread exit
     32024-12-21T16:28:00Z addcon thread exit
     42024-12-21T16:28:00Z net thread exit
     52024-12-21T16:28:02Z UpdateTip: new best=00000000000000006a0f8a4137c14c4ebbfd4845f3d17777d8efc1308fe4616b height=300085 version=0x00000002 log2_work=78.509378 tx=38492174 date='2014-05-10T18:18:22Z' progress=0.033936 cache=1443.7MiB(11747928txo)
     62024-12-21T16:28:02Z msghand thread exit
     72024-12-21T16:28:02Z scheduler thread exit
     82024-12-21T16:28:02Z Writing 0 mempool transactions to file...
     92024-12-21T16:28:02Z Writing 0 unbroadcast transactions to file.
    102024-12-21T16:28:02Z Dumped mempool: 0.000s to copy, 0.172s to dump, 27 bytes dumped to file
    112024-12-21T16:28:02Z Flushed fee estimates to fee_estimates.dat.
    122024-12-21T16:28:03Z [warning] Flushing large 1 GiB UTXO set to disk may take several minutes.
    132024-12-21T16:29:07Z Shutdown: done
    

    I’d be ok with KiB, MiB, or GiB. The important part is “may take several minutes,” which is already included.

  17. DrahtBot requested review from BrandonOdiwuor on Dec 21, 2024
  18. DrahtBot requested review from theuni on Dec 21, 2024
  19. DrahtBot requested review from 1440000bytes on Dec 21, 2024
  20. in src/validation.cpp:2934 in ec8fa17872 outdated
    2930@@ -2929,8 +2931,9 @@ bool Chainstate::FlushStateToDisk(
    2931         }
    2932         // Flush best chain related state. This can only be done if the blocks / block index write was also done.
    2933         if (fDoFullFlush && !CoinsTip().GetBestBlock().IsNull()) {
    2934-            LOG_TIME_MILLIS_WITH_CATEGORY(strprintf("write coins cache to disk (%d coins, %.2fkB)",
    2935-                coins_count, coins_mem_usage / 1000), BCLog::BENCH);
    2936+            if (coins_mem_usage >= WARN_FLUSH_COINS_SIZE) LogWarning("Flushing large %d GiB UTXO set to disk may take several minutes.", coins_mem_usage >> 30);
    


    1440000bytes commented at 2:51 am on December 22, 2024:
    0            if (coins_mem_usage >= WARN_FLUSH_COINS_SIZE) LogWarning("Flushing %d GiB UTXO set to disk, it may take several minutes", coins_mem_usage >> 30);
    

    l0rinc commented at 12:48 pm on December 22, 2024:
    The large part is important, we’re not always displaying this, only when flushing is expected to take a long time

    1440000bytes commented at 12:52 pm on December 22, 2024:
    It is grammatically incorrect to write “large N GiB”.

    l0rinc commented at 1:09 pm on December 22, 2024:
    K, rephrased it to Flushing large (1 GiB) UTXO set to disk, it may take several minutes
  21. DrahtBot requested review from 1440000bytes on Dec 22, 2024
  22. l0rinc force-pushed on Dec 22, 2024
  23. coins: warn on shutdown for big UTXO set flushes
    Setting a large `-dbcache` size postpones the index writes until the coins cache size exceeds the specified limit.
    This causes the final flush after manual termination to seemingly hang forever (e.g. tens of minutes for 20 GiB);
    Now that the `dbcache` upper cap has been lifted, this will become even more apparent, so a warning will be shown when large UTXO sets are flushed (currently >1 GiB), such as:
    > 2024-12-18T18:25:03Z Flushed fee estimates to fee_estimates.dat.
    > 2024-12-18T18:25:03Z [warning] Flushing large (1 GiB) UTXO set to disk, it may take several minutes
    > 2024-12-18T18:25:09Z Shutdown: done
    
    Note that the related BCLog::BENCH units were also converted to `KiB` from `kB` to unify the bases.
    
    Co-authored-by: Cory Fields <cory-nospam-@coryfields.com>
    5709718b83
  24. l0rinc force-pushed on Dec 22, 2024
  25. tdb3 approved
  26. tdb3 commented at 2:32 pm on December 22, 2024: contributor
    re ACK 5709718b830161b7c2ba0db545ef0cfa98423597
  27. 1440000bytes approved
  28. sipa commented at 4:30 pm on December 23, 2024: member
    utACK 5709718b830161b7c2ba0db545ef0cfa98423597
  29. danielabrozzoni approved
  30. danielabrozzoni commented at 6:49 pm on December 23, 2024: contributor

    tACK 5709718b830161b7c2ba0db545ef0cfa98423597

    Code looks good to me, I did some manual testing by starting the node with a large dbcache, waiting for the cache to go over 1GiB and then stopping the node, as suggested in the PR description.

  31. ryanofsky merged this on Dec 27, 2024
  32. ryanofsky closed this on Dec 27, 2024

  33. l0rinc deleted the branch on Dec 27, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-21 06:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me