refactor: Remove gArgs from bdb.h and sqlite.h #23732

pull kiminuo wants to merge 1 commits into bitcoin:master from kiminuo:feature/2021-12-02-db-without-gArgs-base changing 21 files +93 −60
  1. kiminuo commented at 9:34 am on December 10, 2021: contributor

    Contributes to #21005.

    The goal of this PR is to remove gArgs from database classes (i.e. bdb.h and sqlite.h) so that they can be tested without relying on gArgs in tests.

    Notes:

    • My goal is to enable unit-testing without relying on gArgs as much as possible. Global variables are hard to reason about which in turn makes it slightly harder to contribute to this codebase. When the compiler does the heavy lifting for us and allows us only to construct an object (or call a method) with valid parameters, we may also save some time in code reviews. The cost for this is passing an argument which is not for free but the cost is very miniscule compared to benefits, I think.
      • GUI code is an exception because it seems fine to have gArgs there so I don’t plan to make changes in src/qt folder, for example.
    • My approach to removal of gArgs uses is moving from lower levels to upper ones and pass ArgsManager as an argument as needed. The approach is very similar to what #20158.
  2. DrahtBot added the label GUI on Dec 10, 2021
  3. DrahtBot added the label Refactoring on Dec 10, 2021
  4. DrahtBot added the label Wallet on Dec 10, 2021
  5. ryanofsky commented at 2:35 pm on December 10, 2021: member

    In general this seems good, but I think you could shrink the PR and improve it by not exposing ArgsManager to low-level BDB and sqlite code, by instead adding fields to the existing DatabaseOptions struct

    0struct DatabaseOptions {
    1  ...
    2  bool use_unsafe_sync = false;
    3  bool use_shared_memory = !DEFAULT_WALLET_PRIVDB;
    4  int64_t checkpoint_log_size = DEFAULT_WALLET_DBLOGSIZE;
    5};
    

    and adding corresponding m_ members to the SQLiteDatabse and BerkeleyEnvironment classes where necessary. This way, database options are consolidated in one location, less code needs to change, and ArgsManager is not involved in database setup or operation.

  6. kiminuo commented at 3:10 pm on December 10, 2021: contributor

    Thank you for the suggestion! Yes, that is an alternative. From the design point of view, we seem to balance the following properties:

    • Easiness to use new arguments in bdb and sqlite code.
    • Whether ArgsManager is meant to be basic class for all other code to use. In your approach, it hints it is not. I guess it makes sense not to expand what we consider to be “basic” unless really needed.
    • Both approaches remove gArgs which is my main goal really.

    Your sugestion seems reasonable. I’ll try to modify it.

  7. DrahtBot commented at 5:49 pm on December 10, 2021: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #24470 (Disallow more unsafe string->path conversions allowed by path append operators by ryanofsky)
    • #20205 (wallet: Properly support a wallet id by achow101)
    • #19873 (Flush dbcache early if system is under memory pressure by luke-jr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  8. achow101 commented at 9:43 pm on December 15, 2021: member
    I agree with @ryanofsky. I don’t think we should be adding ArgsManager to the low level db implementations.
  9. DrahtBot added the label Needs rebase on Dec 16, 2021
  10. kiminuo force-pushed on Dec 16, 2021
  11. kiminuo force-pushed on Dec 16, 2021
  12. kiminuo force-pushed on Dec 16, 2021
  13. DrahtBot removed the label Needs rebase on Dec 16, 2021
  14. kiminuo commented at 10:28 am on December 16, 2021: contributor
    Thank you. I’m slowly trying to refactor it to follow ryanofsky’s suggestion.
  15. kiminuo force-pushed on Dec 18, 2021
  16. kiminuo force-pushed on Dec 18, 2021
  17. kiminuo commented at 9:32 am on December 19, 2021: contributor

    @ryanofsky Would the latest approach make more sense to you? There are still rough edges - i.e.:

    • the new names of DatabaseOptions parameters
    • whether it is acceptable to force construction of DatabaseOptions via ArgsManager.
    • whether std::shared_ptr is appropriate for passing of DatabaseOptions. Attempt to go with std::unique_ptr instead?
    • we can also strive for std::shared_ptr<const DatabaseOptions> (with the const modifier).
  18. ryanofsky commented at 3:59 am on December 21, 2021: member

    @ryanofsky Would the latest approach make more sense to you?

    There are some good changes here, but I think it is a mistake change DatabaseOptions usage so much and share it so widely. The goal of the suggestion was really to make PR smaller, not larger.

    Here’s an implementation that just adds a few fields to the DatabaseOptions struct without changing how the struct is used: 17176efddb1b9afaacc1e4c8411d1c505cae4740 (branch). This version (+90/-60 lines) is smaller than both the original version (+121/-101) e0500ac13b50e0953e6cc188570b6b8f941fb3d4 and the current version (+157/-133) 869fab49bfd7032f3fe5b65504f9c0a66e451026.

    I think DatabaseOptions shouldn’t be a part of the BerkeleyEnvironment / BerkeleyDatabase / SQLiteDatabase classes, because most of the information it exposes is stale and not relevant to them.

    I also think ArgsManager shouldn’t be required to use the MakeDatabase function or database functionality in general. It should be possible to get database options from other sources like RPC parameters, or GUI selections, or choose different options depending on whether database is opened for a batch operation or continuous use. The point of getting rid of gArgs is not just to get rid of the variable, but to use less shared state in general and make option setting more explicit.

  19. kiminuo force-pushed on Dec 21, 2021
  20. kiminuo commented at 1:57 pm on December 21, 2021: contributor

    I think DatabaseOptions shouldn’t be a part of the BerkeleyEnvironment / BerkeleyDatabase / SQLiteDatabase classes, because most of the information it exposes is stale and not relevant to them.

    I couldn’t decide what approach to choose. I went with the more flexible one (in the sense that adding a new option is easy) because that is what gArgs provides too. This of course makes the assumption that we need that flexibility and I understand now that is not needed. Anyway, I like your code and how nicely it is written.

    I also think ArgsManager shouldn’t be required to use the MakeDatabase function or database functionality in general.

    I see. I did it because it helps find all the places where DatabaseOptions are being instantiated and also I like the guarrantee that the object cannot be instantiated incorrectly.

  21. kiminuo force-pushed on Dec 21, 2021
  22. ryanofsky approved
  23. ryanofsky commented at 2:47 pm on December 21, 2021: member

    Code review ACK 0a0a6de78c00605d4fdfe5745a37f4b9876160b6, which uses suggested changes I posted yesterday and fixes doxygen comments.

    I see. I did it because it helps find all the places where DatabaseOptions are being instantiated and also I like the guarrantee that the object cannot be instantiated incorrectly.

    This is reasonable and I also added a non-default DatabaseOptions constructor temporarily to help verify the PR would not change behavior. But in general I don’t think having an ArgsManager instance should be required to use the database API. And even if shared global options are used everywhere now, they shouldn’t be required to be used for new databases and new database operations in the future.

  24. kiminuo marked this as ready for review on Dec 21, 2021
  25. DrahtBot added the label Needs rebase on Dec 23, 2021
  26. kiminuo force-pushed on Dec 23, 2021
  27. DrahtBot removed the label Needs rebase on Dec 23, 2021
  28. in src/wallet/bdb.cpp:63 in c1ef77498c outdated
    57@@ -58,12 +58,12 @@ bool WalletDatabaseFileId::operator==(const WalletDatabaseFileId& rhs) const
    58  * erases the weak pointer from the g_dbenvs map.
    59  * @post A new BerkeleyEnvironment weak pointer is inserted into g_dbenvs if the directory path key was not already in the map.
    60  */
    61-std::shared_ptr<BerkeleyEnvironment> GetBerkeleyEnv(const fs::path& env_directory)
    62+std::shared_ptr<BerkeleyEnvironment> GetBerkeleyEnv(const fs::path& env_directory, bool use_shared_memory)
    


    laanwj commented at 5:53 pm on January 5, 2022:
    What’s the rationale for using “shared memory” here instead of just “private”? Especially as it comes from a “-privdb” flag and is eventually converted to a DB_PRIVATE flag. I don’t think any flavor of actual shared memory is used here? Another option would be to use std::option<fs::path> and a single argument. After all, the path is irrelevant in case of a private database.

    ryanofsky commented at 6:08 pm on January 5, 2022:
    I’ve never heard of a “private database” before this PR and I think that terminology is a little arcane, but shared memory is a concept I’m familiar with, and that’s what the flag controls, so that’s what i called the DatabaseOptions member. This particular function argument and a few local variables are named to be consistent with the DatabaseOptions field name, but if you want to call them something different, that sounds fine. Maybe say what specific variables you would like to rename and what you would like to call them. If you want everything in bdb.h/bdb.cpp to use a bool db_private instead of bool use_shared_memory that seems reasonable (though IMO use_shared_memory is still clearer).

    kiminuo commented at 3:38 pm on January 11, 2022:

    ryanofsky commented at 4:12 pm on January 11, 2022:

    https://github.com/observerdev/db-4.8.30/blob/3c75267f11336acb1be06bed419e8056cc689d09/dbinc/region.h#L26-L28 - Is this possibly the correct explanation of DB_PRIVATE?

    I’m sure it’s correct. I don’t think is any question about what that flag does. The questions here are just about readability and variable names in this PR. I don’t think there’s anything that needs to be done yet, except rebase


    kiminuo commented at 4:21 pm on January 11, 2022:
    That was supposed to be for my information :)
  29. DrahtBot added the label Needs rebase on Jan 11, 2022
  30. kiminuo force-pushed on Jan 11, 2022
  31. DrahtBot removed the label Needs rebase on Jan 11, 2022
  32. DrahtBot added the label Needs rebase on Feb 9, 2022
  33. kiminuo force-pushed on Feb 10, 2022
  34. DrahtBot removed the label Needs rebase on Feb 10, 2022
  35. kiminuo commented at 9:42 pm on March 9, 2022: contributor
    @ryanofsky Would you mind re-ACKing, please?
  36. ryanofsky approved
  37. ryanofsky commented at 9:53 pm on March 9, 2022: member

    Code review ACK 3f730adbdf1467744f722bd43ccc4c7249cd3dc9. No changes since last review other than rebase


    @ryanofsky Would you mind re-ACKing, please?

    Sure

  38. in src/wallet/db.cpp:144 in 3f730adbdf outdated
    137@@ -137,4 +138,12 @@ bool IsSQLiteFile(const fs::path& path)
    138     // Check the application id matches our network magic
    139     return memcmp(Params().MessageStart(), app_id, 4) == 0;
    140 }
    141+
    142+void ReadDatabaseArgs(const ArgsManager& args, DatabaseOptions& options)
    143+{
    144+    if (args.IsArgSet("-unsafesqlitesync")) options.use_unsafe_sync = args.GetBoolArg("-unsafesqlitesync", options.use_unsafe_sync);
    


    MarcoFalke commented at 3:19 pm on March 15, 2022:
    Why is the IsArgSet check needed? Looks like this either makes the default argument path unused, or the default argument path makes this redundant.

    ryanofsky commented at 3:54 pm on March 15, 2022:

    Why is the IsArgSet check needed? Looks like this either makes the default argument path unused, or the default argument path makes this redundant.

    That’s true. I think I suggested IsArgSet to try to make it clear that all the current option values would be unchanged if they weren’t explicitly overridden by ArgsManager. But probably better to simplify. Maybe:

    0// Override current options with args values, if any were specified
    1options.use_unsafe_sync = args.GetBoolArg("-unsafesqlitesync", options.use_unsafe_sync);
    2options.use_shared_memory = !args.GetBoolArg("-privdb", !options.use_shared_memory);
    3options.max_log_mb = args.GetIntArg("-dblogsize", options.max_log_mb);
    

    kiminuo commented at 7:30 am on March 16, 2022:
    I have modified it as ryanofsky suggested.
  39. in src/wallet/bdb.cpp:148 in 3f730adbdf outdated
    144@@ -145,7 +145,7 @@ bool BerkeleyEnvironment::Open(bilingual_str& err)
    145     LogPrintf("BerkeleyEnvironment::Open: LogDir=%s ErrorFile=%s\n", fs::PathToString(pathLogDir), fs::PathToString(pathErrorFile));
    146 
    147     unsigned int nEnvFlags = 0;
    148-    if (gArgs.GetBoolArg("-privdb", DEFAULT_WALLET_PRIVDB))
    149+    if (!m_use_shared_memory)
    


    MarcoFalke commented at 3:59 pm on March 15, 2022:
    0    if (!m_use_shared_memory) {
    

    nit (if you retouch)


    kiminuo commented at 7:30 am on March 16, 2022:
    Thank you. Modified.
  40. Replace use of `ArgsManager` with `DatabaseOptions`
    Co-authored-by: Russell Yanofsky <russ@yanofsky.org>
    39b1763730
  41. kiminuo force-pushed on Mar 16, 2022
  42. ryanofsky approved
  43. ryanofsky commented at 8:37 pm on March 17, 2022: member
    Code review ACK 39b1763730177cd7d6a32fd9321da640b0d65e0e. Just the two small ReadDatabaseArgs and Berkeley open changes that were discussed since the last review
  44. achow101 commented at 5:15 pm on March 23, 2022: member
    Since you’re basically adding ReadDatabaseArgs everywhere a DatabaseOptions is created, could this function turned into a constructor for DatabaseOptions? This would also help in the future so that we don’t forget that we need to set some options based on the cli args.
  45. ryanofsky commented at 7:22 pm on March 23, 2022: member

    Since you’re basically adding ReadDatabaseArgs everywhere a DatabaseOptions is created, could this function turned into a constructor for DatabaseOptions? This would also help in the future so that we don’t forget that we need to set some options based on the cli args.

    My thinking is that I would like it to be possible to call wallet database functions without having any ArgsManager. It’s true if that wallet database code doesn’t depend on ArgsManager, it can’t require an ArgsManager argument, and it is theoretically possible for a bug to arise where ArgsManager options are not applied despite it making sense to apply them. But I think potential for having this type of bug is a reasonable tradeoff for wallet database code to be more modular.

    I wouldn’t oppose if you want to DatabaseOptions to require an ArgsManager instance to be constructed, but I don’t think it’s the best approach. Also doing this would require changing the AddArg code and maybe reintroducing the DEFAULT_WALLET_DBLOGSIZE and DEFAULT_WALLET_PRIVDB constants I thought we would be happy to get rid of.

  46. achow101 commented at 7:41 pm on March 23, 2022: member
    ACK 39b1763730177cd7d6a32fd9321da640b0d65e0e
  47. kiminuo commented at 8:57 pm on March 23, 2022: contributor

    Since you’re basically adding ReadDatabaseArgs everywhere a DatabaseOptions is created, could this function turned into a constructor for DatabaseOptions? This would also help in the future so that we don’t forget that we need to set some options based on the cli args.

    I actually did that in a previous iteration of this PR but @ryanofsky persuaded me that it was/is better to avoid doing that.

  48. MarcoFalke merged this on Mar 24, 2022
  49. MarcoFalke closed this on Mar 24, 2022

  50. kiminuo deleted the branch on Mar 24, 2022
  51. sidhujag referenced this in commit de4bc42549 on Apr 2, 2022
  52. DrahtBot locked this on Mar 24, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-19 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me