Drop -dbcache limit #28358

pull Sjors wants to merge 1 commits into bitcoin:master from Sjors:2023/08/double-your-coins---cache changing 6 files +9 −7
  1. Sjors commented at 3:57 pm on August 28, 2023: member

    Due to recent UTXO set growth, the current maximum value for -dbcache of 16GB is ~just months away from being~ insufficient (for those who wish to complete IBD with the UTXO set held in RAM).

    This drops the limit. It also adds a warning that it’s up to users to check that they have enough RAM.

    Fixes #28249.


    A previous version of this PR increased the maximum to 64GB. It also made startup abort if the value provided is too high, rather than quietly round it down. But this didn’t get much support.

  2. DrahtBot commented at 3:57 pm on August 28, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30059 (Add option dbfilesize to control LevelDB target (“max”) file size by luke-jr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. Sjors marked this as ready for review on Aug 28, 2023
  4. Sjors force-pushed on Aug 28, 2023
  5. DrahtBot added the label CI failed on Aug 28, 2023
  6. maflcko commented at 4:11 pm on August 28, 2023: member

    insufficient

    Can you explain this? Why is there a requirement that the whole utxo set must fit in the cache? This can easily lead to OOM, as you explain, in which case the user would have to start from zero, because potentially nothing has been persisted at all. Also, for slow storage it could lead to hangs, which could lead the user (or OS) to kill the process, causing the same issues. Finally, with assumeutxo, I wonder why there is any need to chase this? No objection to this pull, but I wonder if this really makes users happier on average, or will just lead to more hangs, crashes and -reindex.

  7. in doc/release-notes-25358.md:4 in f311e4549b outdated
    0@@ -0,0 +1,5 @@
    1+Node
    2+------
    3+
    4+- The maximum value for `-dbcache` has been increased to 32,000 MiB due to recent UTXO set growth. Be sure to check that you have enough RAM before increasing this.
    


    maflcko commented at 4:17 pm on August 28, 2023:
    Instead of implying that users should check their RAM and then increase this, it would be more helpful to explain why a user should change the setting in the first place, what they can expect, and what the tradeoffs are, but this seems better put in a separate document or the -help.

    Sjors commented at 4:41 pm on August 28, 2023:
    That was not the intended implication. If they feel like increasing it, they should check their RAM first.

    jonatack commented at 2:33 pm on August 30, 2023:
    0- The maximum allowed value for the `-dbcache` configuration option has been increased to 32,000 MiB due to recent UTXO set growth. Be sure to check that you have enough RAM before increasing this. (#28358)
    

    Sjors commented at 2:44 pm on August 30, 2023:
    I might also add something like “Even with sufficient RAM the maximum setting may not be optimal.”

    jonatack commented at 2:49 pm on August 30, 2023:
    Yes – and why, or link to an improved In-memory caches section in doc/reduce-memory.md.

    maflcko commented at 2:53 pm on August 30, 2023:
    It is just that this is the only thing you mention in the release notes. In any case, I don’t think the release notes are the right place for extended documentation. Either put it in the -help or a separate md.

    Sjors commented at 2:07 pm on February 13, 2024:
    Taken the suggested text diff.

    Sjors commented at 2:11 pm on February 13, 2024:
    I dropped the “check RAM” sentence.
  8. jamesob commented at 4:25 pm on August 28, 2023: member

    This can easily lead to OOM

    Wonder if it’s worth warning/failing if the dbcache param exceeds available memory.

  9. Sjors commented at 4:40 pm on August 28, 2023: member

    My recollection is that an IBD without a single flush is faster, but have not benched this in recent years.

    We can always recommend a lower setting if that turns out to be better. By not increasing it to infinity at least we constrain the worst case here, e.g. if it turns out the failures you mean get exponentially worse as the UTXO set keeps growing. While at the same time not removing the existing ability to sync in one go.

  10. Sjors commented at 4:41 pm on August 28, 2023: member

    if the dbcache param exceeds available memory.

    My understanding was that we can’t measure available RAM reliably.

  11. andrewtoth commented at 4:50 pm on August 28, 2023: contributor

    My recollection is that an IBD without a single flush is faster, but have not benched this in recent years.

    A single flush throughout the entire multi-hour IBD will not make a noticeable difference to a user. My initial version of #28280 had 3 extra pointers per cache entry, which caused the dbcache limit to be hit. It was still >2% faster than master which did not have a single flush.

  12. kristapsk commented at 5:00 pm on August 28, 2023: contributor
    Concept ACK. This does not change defaults, just gives more options to power users. And UTXO set will grow in future. For some server environments 16 GB more or less will mean nothing.
  13. DrahtBot removed the label CI failed on Aug 28, 2023
  14. TheCharlatan commented at 5:21 am on August 29, 2023: contributor
    Concept ACK.
  15. henrikhk commented at 4:51 pm on August 29, 2023: none
    Concept ACK.
  16. in doc/release-notes-25358.md:5 in f311e4549b outdated
    0@@ -0,0 +1,5 @@
    1+Node
    2+------
    3+
    4+- The maximum value for `-dbcache` has been increased to 32,000 MiB due to recent UTXO set growth. Be sure to check that you have enough RAM before increasing this.
    5+- Setting `-dbcache` above the maximum causes the node to abort startup. Previously it would round the number down.
    


    BenWestgate commented at 2:25 am on August 30, 2023:

    I would prefer it quietly round down to the maximum. Especially since it rounds up to the minimum.

    I know a few scripts people use that set dbcache to a percentage of available memory that this part of the PR would break for 64GB RAM.

    What problem is failing to start meant to solve?

  17. BenWestgate commented at 2:37 am on August 30, 2023: contributor

    Concept ACK.

    Except for:

    make startup abort if the value provided is too high, rather than quietly rounding it down.

    This breaks scripts that set dbcache to a % of available RAM.

    And is inconsistent with it rounding up silently rather than fail to start if the value is too low.

  18. 0xB10C commented at 7:46 am on August 30, 2023: contributor

    Concept ACK on increasing the maximum dbcache.

    I had to manually bump the dbcache to run bitcoin-cli verifychain 4 0 last year (https://github.com/bitcoin/bitcoin/pull/24851#issuecomment-1214040286). This came close to using 32 GB and might not be enough anymore (or soon - I haven’t checked again). Maybe worth considering going directly to 64 GB here to be able to actually use the verifychain 4 0 RPC?

  19. darosior commented at 7:57 am on August 30, 2023: member
    Concept ACK on giving more options.
  20. Sjors commented at 8:14 am on August 30, 2023: member

    This breaks scripts that set dbcache to a % of available RAM.

    Do you know any example of people doing that?

    In general I think we should fail to startup if the user provides an invalid configuration option, as this often leads to problems. I believe this has been gradually introduced for various other options too (see also #16545).

    At the same time I don’t want to needless break scripts out there.

    From my own experience I was very surprised to find that this rounding down behavior happened, which is which I prefer an error. But we could also print a warning in the debug log.

    Maybe worth considering going directly to 64 GB here to be able to actually use the verifychain 4 0 RPC? @sipa any thoughts on a sane maximum?

  21. BenWestgate commented at 12:57 pm on August 30, 2023: contributor

    This breaks scripts that set dbcache to a % of available RAM.

    Do you know any example of people doing that?

    My project Bails puts this in bitcoin-qt.desktop as well as .config/autostart/Bitcoin/bitcoin.desktop: -dbcache=$(($(grep Available /proc/meminfo | sed s/[^0-9]//g)/1024-2000))

    It’s an alpha with about 20 users, makes a portable USB stick node that keeps Bitcoin Core up to date, none probably own a 64GB computer, but it will break if they do.

    I found a couple other projects with the idea: https://github.com/Start9Labs/bitcoind-wrapper/issues/39 https://github.com/JWWeatherman/yeticold/issues/145 https://github.com/epiccurious/bitcoin-core-node-builder/issues/38

    They’ll waste time if they miss the change notes. Maybe a one time warning would be better when the value rounds up or down? It took me a long time to notice that it did that.

  22. in src/init.cpp:449 in f311e4549b outdated
    445@@ -446,7 +446,7 @@ void SetupServerArgs(ArgsManager& argsman)
    446     argsman.AddArg("-conf=<file>", strprintf("Specify path to read-only configuration file. Relative paths will be prefixed by datadir location (only useable from command line, not configuration file) (default: %s)", BITCOIN_CONF_FILENAME), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    447     argsman.AddArg("-datadir=<dir>", "Specify data directory", ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    448     argsman.AddArg("-dbbatchsize", strprintf("Maximum database write batch size in bytes (default: %u)", nDefaultDbBatchSize), ArgsManager::ALLOW_ANY | ArgsManager::DEBUG_ONLY, OptionsCategory::OPTIONS);
    449-    argsman.AddArg("-dbcache=<n>", strprintf("Maximum database cache size <n> MiB (%d to %d, default: %d). In addition, unused mempool memory is shared for this cache (see -maxmempool).", nMinDbCache, nMaxDbCache, nDefaultDbCache), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    450+    argsman.AddArg("-dbcache=<n>", strprintf("Maximum database cache size <n> MiB (%d to %d, default: %d). Make sure you have enough RAM. In addition, unused mempool memory is shared for this cache (see -maxmempool).", nMinDbCache, nMaxDbCache, nDefaultDbCache), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    


    jonatack commented at 2:28 pm on August 30, 2023:

    While touching this help, I think the following would be much clearer (particularly s/for/with).

    0    argsman.AddArg("-dbcache=<n>", strprintf("Maximum database cache size <n> MiB (%d to %d, default: %d). Make sure you have enough RAM. In addition, unused memory allocated to the mempool is shared with this cache (see -maxmempool).", nMinDbCache, nMaxDbCache, nDefaultDbCache), ArgsManager::ALLOW_ANY, OptionsCategory::OPTIONS);
    

    Sjors commented at 2:07 pm on February 13, 2024:
    Taken
  23. in src/init.cpp:974 in f311e4549b outdated
    924@@ -925,6 +925,10 @@ bool AppInitParameterInteraction(const ArgsManager& args)
    925         return InitError(Untranslated("Cannot set -listen=0 together with -listenonion=1"));
    926     }
    927 
    928+    if (args.GetIntArg("-dbcache", nDefaultDbCache) >  nMaxDbCache) {
    929+        return InitError(Untranslated(strprintf("-dbcache must be at most %d MiB", nMaxDbCache)));
    930+    }
    


    jonatack commented at 2:31 pm on August 30, 2023:

    Could add a test.

    Edit: maybe -dbcache configuration option can be at most %d MiB, but %d was passed.


    Sjors commented at 2:07 pm on February 13, 2024:
    Used the text, added a test.
  24. in doc/release-notes-25358.md:1 in f311e4549b outdated
    0@@ -0,0 +1,5 @@
    1+Node
    


    jonatack commented at 2:35 pm on August 30, 2023:

    Maybe (I’m not sure this matters).

    0Updated settings
    

    Sjors commented at 2:46 pm on August 30, 2023:

    Release notes get edited later anyway, but I’ll change once I retouch.

    (waiting for more inputs in the discussion on warning/failure and the max size)


    Sjors commented at 2:07 pm on February 13, 2024:
    Updated.
  25. jonatack commented at 2:36 pm on August 30, 2023: contributor

    Concept ACK

    In general I think we should fail to startup if the user provides an invalid configuration option, as this often leads to problems. I believe this has been gradually introduced for various other options too (see also #16545).

    #27632 was another recent example of switching from a warning to raising an error. It’s too easy for users, even experienced ones, not to notice a warning line in the log.

  26. BenWestgate commented at 2:50 pm on August 30, 2023: contributor

    Perhaps it should be set so high that you can’t easily buy a Desktop with more RAM than the limit.

    I could go to Microcenter today and walk out with a 64GB gaming tower, or a Xeon workstation with 128GB, likely a 64GB laptop as well.

    Lots of guides on internet (foolishly) suggest setting it to half the total RAM.

  27. in src/init.cpp:929 in f311e4549b outdated
    924@@ -925,6 +925,10 @@ bool AppInitParameterInteraction(const ArgsManager& args)
    925         return InitError(Untranslated("Cannot set -listen=0 together with -listenonion=1"));
    926     }
    927 
    928+    if (args.GetIntArg("-dbcache", nDefaultDbCache) >  nMaxDbCache) {
    929+        return InitError(Untranslated(strprintf("-dbcache must be at most %d MiB", nMaxDbCache)));
    


    maflcko commented at 2:51 pm on August 30, 2023:

    Given that this will break software out there, why not simply keep the previous behavior and leave it as-is? What problem are you trying to solve?

    Printing a warning and/or updating the docs should be more than enough.


    Sjors commented at 8:03 am on August 31, 2023:
    That indeed seems like the best approach. Silently rounding down is annoying, but doesn’t break anything, so a warning should be fine.

    jonatack commented at 12:59 pm on August 31, 2023:
    OTOH, it seems that raising would be a clearer user interface than rounding down silently as now, or with an added warning. See https://en.wikipedia.org/wiki/Principle_of_least_astonishment. I don’t want to break anything, but the only software mentioned so far that this might marginally affect is that of @BenWestgate (thank you for reporting), who would be aware of the change.

    Sjors commented at 2:08 pm on February 13, 2024:

    I’m leaving this error as is for now, unless there are other examples of breaking software.

    I would also rather drop nMaxDbCache entirely than quietly enforce it.


    maflcko commented at 11:07 am on February 22, 2024:

    See https://en.wikipedia.org/wiki/Principle_of_least_astonishment

    If I imagined myself running into this error. I’d be indeed astonished and ask myself why the dbcache is capped at all. Again, if there is a reason for this change, it should be explained. It can’t hurt to put the reason in the error string.

  28. DrahtBot added the label CI failed on Sep 3, 2023
  29. DrahtBot removed the label CI failed on Sep 5, 2023
  30. luke-jr commented at 8:52 pm on September 5, 2023: member

    Why do we have an upper limit at all?

    My understanding was that we can’t measure available RAM reliably.

    #19873 does exactly that

  31. in src/txdb.h:41 in f311e4549b outdated
    37@@ -38,7 +38,7 @@ static const int64_t nDefaultDbCache = 450;
    38 //! -dbbatchsize default (bytes)
    39 static const int64_t nDefaultDbBatchSize = 16 << 20;
    40 //! max. -dbcache (MiB)
    41-static const int64_t nMaxDbCache = sizeof(void*) > 4 ? 16384 : 1024;
    42+static const int64_t nMaxDbCache = sizeof(void*) > 4 ? 32768 : 1024;
    


    luke-jr commented at 9:55 pm on September 5, 2023:
    Might as well make it constexpr while you’re touching it

    Sjors commented at 2:08 pm on February 13, 2024:
    Done
  32. DrahtBot added the label CI failed on Jan 12, 2024
  33. Sjors force-pushed on Feb 13, 2024
  34. Sjors commented at 2:10 pm on February 13, 2024: member

    Rebased, incorporated most of the text feedback above. Dropped the RAM warning from the release note, since that’s already in the help text.

    Why do we have an upper limit at all? @sipa seemed worried about users going overboard with this setting: #28249 (comment)

  35. tdb3 commented at 10:53 pm on February 17, 2024: contributor
    crACK
  36. Sjors force-pushed on Feb 22, 2024
  37. Sjors commented at 10:33 am on February 22, 2024: member
    Fixed broken feature_config_args.py test, taking into account the lower limit on 32 bit systems.
  38. maflcko commented at 11:09 am on February 22, 2024: member
    I still don’t understand the state of the current pull request. This leaves the currently hard-failing use case un-addressed (https://github.com/bitcoin/bitcoin/pull/28358#issuecomment-1698663976) and adds more hard-failing behavior for no reason (https://github.com/bitcoin/bitcoin/pull/28358#discussion_r1310404672)?
  39. maflcko commented at 11:34 am on February 22, 2024: member
    Also, CI still fails
  40. Sjors commented at 1:22 pm on February 22, 2024: member

    I was still waiting for someone to suggest a new value. But it seems fine to bump to 64 GB in order to support verifychain 4 0, so added that. @maflcko do you prefer to drop the limit? We should not quietly change a user setting, so imo an invalid value should explicitly fail, rather than silently fail.

    Also, CI still fails

    Indeed. I copied the is_64bits = sys.maxsize > 2**32 check from netutil.py, but it seems the check doesn’t actually work. So there might be a bug in netutil.py.

    https://cirrus-ci.com/task/5137529851084800?logs=ci#L10796

    Meanwhile the multiprocess, i686 CI compiler seems to think it’s 32 bit (while the test framework thinks it’s 64 bit), so the sizeof(void*) check in txdb.h seems broken too. (which could be another reason to just drop the limit entirely)

    https://cirrus-ci.com/task/4715317386018816?logs=ci#L3814

  41. Sjors force-pushed on Feb 22, 2024
  42. Sjors marked this as a draft on Feb 22, 2024
  43. Sjors renamed this:
    Double -dbache maximum to 32GB
    Double -dbache maximum to 64GB
    on Feb 22, 2024
  44. maflcko commented at 1:32 pm on February 22, 2024: member

    i686 is 32 bit and python on that CI machine is compiled to 64 bit.

    Edit: Assuming they are the same is the broken assumption

  45. Sjors force-pushed on Feb 22, 2024
  46. Sjors commented at 2:38 pm on February 22, 2024: member
    We could expose whether the program is compiled for 32 or 64 bit in the test framework… but I just added a regex for the error message instead.
  47. DrahtBot removed the label CI failed on Feb 22, 2024
  48. Sjors marked this as ready for review on Feb 23, 2024
  49. in src/node/caches.cpp:16 in 39a3d48a4a outdated
    12@@ -13,7 +13,7 @@ CacheSizes CalculateCacheSizes(const ArgsManager& args, size_t n_indexes)
    13 {
    14     int64_t nTotalCache = (args.GetIntArg("-dbcache", nDefaultDbCache) << 20);
    15     nTotalCache = std::max(nTotalCache, nMinDbCache << 20); // total cache cannot be less than nMinDbCache
    16-    nTotalCache = std::min(nTotalCache, nMaxDbCache << 20); // total cache cannot be greater than nMaxDbcache
    17+    assert(nTotalCache <= nMaxDbCache << 20);
    


    cbergqvist commented at 10:19 pm on March 4, 2024:

    nit: How about appending a comment to the assert such as // error already handled in AppInitParameterInteraction()?

    (Someone might have accidentally implemented a version of this function “properly” returning util::Result<CacheSizes> before realizing this was the case :) ).


    Sjors commented at 8:43 am on March 5, 2024:
    I’ll add a comment if I have to retouch.

    Sjors commented at 10:54 am on March 12, 2024:
    No longer needed since the limit is dropped now.
  50. cbergqvist approved
  51. cbergqvist commented at 11:04 pm on March 4, 2024: contributor

    ACK 39a3d48.

    Concept

    Adding this kind of hard error might make it more annoying to run old Bitcoin Core software in the far future. (But the consequence of a few extra flushes should be fine).

    Tested

    Starting bitcoind with -dbcache set to low value, max value, and beyond max value with expected results.

    Ran ./test/functional/feature_config_args.py with and without modified expected assert message, with expected results.

  52. DrahtBot requested review from TheCharlatan on Mar 4, 2024
  53. DrahtBot requested review from tdb3 on Mar 4, 2024
  54. DrahtBot requested review from jonatack on Mar 4, 2024
  55. DrahtBot requested review from 0xB10C on Mar 4, 2024
  56. DrahtBot requested review from darosior on Mar 4, 2024
  57. DrahtBot removed review request from tdb3 on Mar 4, 2024
  58. DrahtBot requested review from tdb3 on Mar 4, 2024
  59. DrahtBot removed review request from tdb3 on Mar 5, 2024
  60. DrahtBot requested review from tdb3 on Mar 5, 2024
  61. Sjors renamed this:
    Double -dbache maximum to 64GB
    Quadruple -dbache maximum to 64GB
    on Mar 11, 2024
  62. maflcko commented at 8:43 am on March 12, 2024: member

    @maflcko do you prefer to drop the limit? We should not quietly change a user setting, so imo an invalid value should explicitly fail, rather than silently fail.

    Why would the value be “invalid”?

  63. Sjors commented at 8:56 am on March 12, 2024: member
    If there is a limit, a value above the limit is invalid. That is the entire point of a limit. Which is why I asked if you prefer to drop the limit.
  64. maflcko commented at 9:47 am on March 12, 2024: member

    I don’t know if it is safe to drop the limit. For example, one could run into a signed integer overflow for some values. If that is the reason for the limit, you should add a comment for it.

    However, if there is no reason for the limit, I don’t see why a limit should exist.

  65. Sjors force-pushed on Mar 12, 2024
  66. Sjors commented at 10:50 am on March 12, 2024: member

    It seems there’s more enthusiasm for dropping this limit entirely, so I did that. This also impacts the GUI (but none of the translated strings), cc @hebasto.

    For future reference, c016405dadcdc645f31dc77cb0a92e2e10fe358a + 39a3d48a4ae87fd659a58ce93109a722fc436dd8 is the last version of this PR which does have a limit.

  67. Sjors renamed this:
    Quadruple -dbache maximum to 64GB
    Drop -dbache limit
    on Mar 12, 2024
  68. in src/qt/optionsdialog.cpp:96 in 5f313495ee outdated
    93@@ -94,8 +94,7 @@ OptionsDialog::OptionsDialog(QWidget* parent, bool enableWallet)
    94     ui->setupUi(this);
    95 
    96     /* Main elements init */
    97-    ui->databaseCache->setMinimum(nMinDbCache);
    98-    ui->databaseCache->setMaximum(nMaxDbCache);
    99+    ui->databaseCache->setRange(nMinDbCache, std::numeric_limits<int>::max());
    


    Sjors commented at 10:56 am on March 12, 2024:
    Even on a 32 bit system the maximum would be 2 petabyte (2 billion MiB), so not a practical concern imo.
  69. Sjors renamed this:
    Drop -dbache limit
    Drop -dbcache limit
    on Mar 12, 2024
  70. sipa commented at 11:24 am on March 12, 2024: member

    On 32-bit systems we shouldn’t raise the cache size too close to 4 GiB or more, as it’ll just result in memory allocation errors.

    On 64-bit systems, the limit is much less necessary, but I’d still worry about wildly too large numbers, as they also affect leveldb cache sizes. Setting those (way) too large is probably counterproductive.

  71. Sjors commented at 12:26 pm on March 12, 2024: member

    @sipa since 32 bit systems can’t have more than 4 GiB of RAM, the “Make sure you have enough RAM” wording in the documentation should cover that. Though perhaps “Stay at least 1 GiB below your RAM size” may be safer, also because of swap behaviour (see e.g. #29603 (comment))

    With something like #19873 we could warn the use if they pick a value too close to the limit (on both 32 and 64 bit systems).

    wildly too large

    We could suggest an optimum, but without further experimentation we don’t know where that is.

    Ideally we would measure performance in real time and flush when it starts degrading (unless the user really insists that we don’t).

  72. cbergqvist commented at 8:12 pm on March 14, 2024: contributor

    Built 5f313495eea36b566109894697b4a90085c53113 and ran bitcoin-qt because I was worried that setting a very high max value might cause issues with the QT UI control. It’s not some kind of horizontal slider that goes crazy, but rather a textbox with up/down arrows. It ignores key inputs that would make the value go further than std::numeric_limits<int>::max(). So all good from that perspective.

    bitcoin-qt_screenshot

  73. hebasto commented at 4:34 pm on March 18, 2024: member

    This also impacts the GUI (but none of the translated strings), cc @hebasto.

    5f313495eea36b566109894697b4a90085c53113: changes to the GUI code look good to me. It might worth considering adding the “Make sure you have enough RAM.” warning to the GUI as well (as a follow-up).

  74. Sjors commented at 4:37 pm on March 18, 2024: member

    “Make sure you have enough RAM.” warning to the GUI as well (as a follow-up).

    Will do if I need to retouch.

  75. andrewtoth commented at 3:53 am on May 14, 2024: contributor
    After #28233 keeping the entire utxo set in memory will also help connect new blocks as fast as possible, not just for IBD.
  76. Sjors force-pushed on May 14, 2024
  77. Sjors commented at 6:54 am on May 14, 2024: member

    Rebased to make it easier to test the interaction with (merged) #28233. However, any potential benefit goes away after the first restart. The only time -dbcache is filled beyond its previous 16GB limit is during IDB.

    In order to benefit from having the full UTXO set in RAM after a restart, we’d have to proactively load the entire UTXO set into -dbcache if space permits. Or maybe fill half the available space with coins, picked randomly.

    Added “Make sure you have enough RAM.” warning to the GUI.

  78. andrewtoth commented at 1:38 am on May 15, 2024: contributor

    The only time -dbcache is filled beyond its previous 16GB limit is during IDB

    I’m sure it will happen if you restart your node now and leave it running for long enough ;)

    In order to benefit from having the full UTXO set in RAM after a restart, we’d have to proactively load the entire UTXO set into -dbcache if space permits.

    I did make a crude attempt at that years ago #18941. Not sure it is worth it.

  79. BenWestgate approved
  80. BenWestgate commented at 11:22 am on May 23, 2024: contributor
    crACK
  81. in doc/release-notes-25358.md:1 in 2a71819af6 outdated
    0@@ -0,0 +1,6 @@
    1+Updated settings
    


    sdaftuar commented at 12:27 pm on May 23, 2024:
    Just noticed the filename here should be fixed (25358 vs 28358).
  82. sdaftuar commented at 12:32 pm on May 23, 2024: member

    Concept ACK.

    I don’t see much of a philosophical difference between having no limit and having a limit of 16GB on systems where a user has less than 16GB actually available… If we’re already requiring users to do something smart, we can keep doing that – and if we are worried about someone using an inappropriate value we can try to solve that separately (maybe by testing that we can allocate that much at startup or something?).

  83. BenWestgate commented at 3:53 pm on May 23, 2024: contributor

    If we are worried about someone using an inappropriate value we can try to solve that separately (maybe by testing that we can allocate that much at startup or something?).

    The best would be never setting it greater than total RAM. -dbcache=<total RAM> is still higher than optimal as there will be applications in memory that won’t be swapped to disk including Bitcoin. But measuring Available RAM or swap is precarious as it changes.

    Since other reviewers mentioned only wildly excessive values are seriously detrimental with swap, capping to max RAM is sensible.

    In order to benefit from having the full UTXO set in RAM after a restart, we’d have to proactively load the entire UTXO set into -dbcache if space permits. Or maybe fill half the available space with coins, picked randomly.

    Some have used cat .bitcoin/chainstate/* > /dev/null to make a node that’s far behind start syncing faster. Because of the disk io to read the chainstate folder, there’s a trade off between initial speed and medium-term speed. A node mere hours behind shouldn’t read the entire UTXO set into memory. But one that’s weeks behind will reach tip faster if it does.

    Though perhaps “Stay at least 1 GiB below your RAM size” may be safer, also because of swap behaviour (see e.g. #29603 (comment))

    That’s barely safe without swap, Bitcoin-QT and the OS will use most of the 1 GiB offset leaving it pretty close to locking up.

    Without a swap file, an offset from total ram (1-2 GiB?) may avoid OOM, but this becomes arbitrary and may limit performance if set too low. It’s better users w/o swap who set it too high OOM, than to limit performance of systems w/ swap that may benefit from more.

    So total memory is still the best limit, if any is enforced.

    We could suggest an optimum, but without further experimentation we don’t know where that is.

    I’ve done a lot of tests on 4 GB and 8 GB RAM machines with zram swap and the -datadir on USB sticks and available_RAM - 2 GiB is about optimal for keeping those systems responsive. See this closed issue for more data: https://github.com/BenWestgate/Bails/issues/41

    With -dbcache=<available_RAM - 2 GiB> both 4 GB and 8 GB RAM systems can unlock the screen and manipulate the desktop without exceeding a 10 second lag, after a user locks the screen and lets Bitcoin sync several hours. With 4 GB RAM, this is approximately our default -dbcache=450 as suggested elsewhere to NOT raise it on 4 GB. If -dbcache is too high with a slow swap the machine may take minutes to respond. That’s true even when the value is set lower than total or available memory. But whether that lag is undesirable or not depends on whether the user intends to use the machine or not before sync completes.

    Suggesting dbcache=$(total_ram - 3650) as an optimum place to start matches what some of our devs have recommended on stack exchange.

  84. validation: drop maximum -dbcache
    Fixes #28249
    bb3b980dfd
  85. Sjors force-pushed on May 28, 2024
  86. Sjors commented at 6:42 am on May 28, 2024: member

    @BenWestgate thanks for the analysis. If we want to add a recommendation to the help text, I suggest we do that in another PR though. It’s probably going to take multiple people doing benchmark on wildly different machines.

    Checking against RAM can also be done in a followup.

    I rebased and changed the .md file name.

  87. jlopp commented at 1:49 pm on June 24, 2024: contributor

    Concept ACK.

    For reference, I ran several sync tests with different dbcache sizes to see how much of an effect they have on performance. A good 25% improvement vs default when my machine abstained from flushing to disk. https://blog.lopp.net/effects-dbcache-size-bitcoin-node-sync-speed/

  88. Sjors commented at 5:24 pm on June 25, 2024: member

    @jlopp thanks for the testing and your article.

    I assume “to reach a given block height with a 28 GB dbcache size” you did not run with this patch? I’d be curious to see the result.

    And if you feel like battle testing this further, you could make a custom signet, produce a similar UTXO set to mainnet (maybe #24952 can help), increase it to a few hundred gigabyte and then see if and where there’s a value -dbcache where performance peaks and goes down.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-07-05 22:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me