RFC: Instanced logs #30338

pull theuni wants to merge 2 commits into bitcoin:master from theuni:instance-log3 changing 94 files +1053 −1052
  1. theuni commented at 5:47 pm on June 25, 2024: member

    PR’ing this early as an RFC because it’s a lot of work that will require constant rebasing. This is the first logical step and is mostly mechanical.

    This PR adds an “instance” param to each log function and callsite. As is, there should be no functional change here.

    The intention is to allow for a specific instance to be passed in each time we use the logger, rather than the implicit global as we have it now. Eventually (one subsystem/chunk at a time), this will allow us to instantiate a per-context logger, which is necessary for the kernel.

    This is my second iteration. The first approach can be seen here: https://github.com/theuni/bitcoin/commit/533e75e81a18a06708bc3dfcee554d344cf1c81b. It was much cleaner and used c++20’s std::source_location, however it proved to be impossible to do lazy variadic parameter evaluation that way. The approach here is rather dumb, but it should incur no performance penalty.

    If there’s a rough consensus on this approach, I’d like to go ahead and get this PR cleaned up and merged so that I can begin the work of converting global LogInstance()s to real instances. The first major goal would be to allow the kernel/bitcoin-chainstate to be used without the static global instance.

    Note that the first commit will not compile. The second is a massive scripted-diff. If that’s a problem, I can add a commit which introduces dummy macros, then switch them back after the scripted-diff.

  2. util: add instance to all logging macros
    THIS COMMIT DOES NOT COMPILE!
    
    Add an instance param to every logging macro/function. The next commit will
    add the global LogInstance() to each callsite.
    
    This will allow us to use an instantiated logger rather than assuming a static
    global, which will be especially useful for the kernel.
    a50e2b2e1e
  3. scripted-diff: log transform
    -BEGIN VERIFY SCRIPT-
    sed -i -e 's/LogInfo(/LogInfo(LogInstance(), /g' $(git grep -l LogInfo\( ./src | grep -v logging.h) && \
    sed -i -e 's/LogWarning(/LogWarning(LogInstance(), /g' $(git grep -l LogWarning\( ./src | grep -v logging.h) && \
    sed -i -e 's/LogError(/LogError(LogInstance(), /g' $(git grep -l LogError\( ./src | grep -v logging.h) && \
    sed -i -e 's/LogDebug(/LogDebug(LogInstance(), /g' $(git grep -l LogDebug\( ./src | grep -v logging.h) && \
    sed -i -e 's/LogTrace(/LogTrace(LogInstance(), /g' $(git grep -l LogTrace\( ./src | grep -v logging.h) && \
    sed -i -e 's/LogPrintfCategory(/LogPrintfCategory(LogInstance(), /g' $(git grep -l LogPrintfCategory\( ./src | grep -v logging.h) && \
    sed -i -e '/WalletLogPrintf(/!s/LogPrintf(/LogPrintf(LogInstance(), /g' $(git grep -l LogPrintf\( ./src | grep -v logging.h) && \
    sed -i -e 's/LogPrintLevel(/LogPrintLevel(LogInstance(), /g' $(git grep -l LogPrintLevel\( ./src | grep -v logging.h) && \
    sed -i -e 's/LogPrint(/LogPrint(LogInstance(), /g' $(git grep -l LogPrint\( ./src | grep -v logging.h) && \
    sed -i -e 's/LogAcceptCategory(/LogAcceptCategory(LogInstance(), /g' $(git grep -l LogAcceptCategory\( ./src | grep -v logging.h) && \
    sed -i -e 's/ LogPrintf_(/ LogInstance().LogPrintf_(/g' src/test/logging_tests.cpp
    -END VERIFY SCRIPT-
    07d68fdf09
  4. DrahtBot commented at 5:47 pm on June 25, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process. A summary of reviews will appear here.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #30324 (optimization: Moved repeated -printpriority fetching out of AddToBlock by paplorinc)
    • #30320 (assumeutxo: Don’t load a snapshot if it’s not in the best header chain by mzumsande)
    • #30277 ([DO NOT MERGE] Erlay: bandwidth-efficient transaction relay protocol (Full implementation) by sr-gi)
    • #30267 (assumeutxo: Check snapshot base block is not in invalid chain by fjahr)
    • #30155 (validation: Make ReplayBlocks interruptible by mzumsande)
    • #30141 (kernel: De-globalize validation caches by TheCharlatan)
    • #30116 (p2p: Fill reconciliation sets (Erlay) attempt 2 by sr-gi)
    • #30065 (init: fixes file descriptor accounting by sr-gi)
    • #30064 (net: log connections failures via SOCKS5 with less severity by vasild)
    • #30043 (net: Replace libnatpmp with built-in PCP+NATPMP implementation by laanwj)
    • #29833 (i2p: fix and improve logs by brunoerg)
    • #29702 (fees: Remove CLIENT_VERSION serialization by maflcko)
    • #29656 (chainparams: Change nChainTx type to uint64_t by fjahr)
    • #29641 (scripted-diff: Use LogInfo/LogDebug over LogPrintf/LogPrint by maflcko)
    • #29640 (Fix tiebreak when loading blocks from disk (and add tests for comparing chain ties) by sr-gi)
    • #29625 (Several randomness improvements by sipa)
    • #29605 (net: Favor peers from addrman over fetching seednodes by sr-gi)
    • #29480 (Drop log category in SeedStartup by hebasto)
    • #29431 (test/BIP324: disconnection scenarios during v2 handshake by stratospher)
    • #29415 (Broadcast own transactions only via short-lived Tor or I2P connections by vasild)
    • #29307 (util: explicitly close all AutoFiles that have been written by vasild)
    • #29256 (Improve new LogDebug/Trace/Info/Warning/Error Macros by ryanofsky)
    • #29141 (Bugfix: RPC: Check for blank rpcauth on a per-param basis by luke-jr)
    • #28792 (Embedding ASMap files as binary dump header file by fjahr)
    • #28521 (net: additional disconnect logging by Sjors)
    • #28167 (init: Add option for rpccookie permissions (replace 26088) by willcl-ark)
    • #27826 (validation: log which peer sent us a header by Sjors)
    • #27052 (test: rpc: add last block announcement time to getpeerinfo result by LarryRuane)
    • #27006 (reduce cs_main scope, guard block index ’nFile’ under a local mutex by furszy)
    • #26114 (net: Make AddrFetch connections to fixed seeds by mzumsande)
    • #26022 (Add util::ResultPtr class by ryanofsky)
    • #25832 (tracing: network connection tracepoints by 0xB10C)
    • #25722 (refactor: Use util::Result class for wallet loading by ryanofsky)
    • #25665 (refactor: Add util::Result failure values, multiple error and warning messages by ryanofsky)
    • #19463 (Prune locks by luke-jr)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  5. theuni commented at 5:48 pm on June 25, 2024: member
    Ping @TheCharlatan @ajtowns @maflcko for Concept (N)ACKs
  6. ryanofsky commented at 6:15 pm on June 25, 2024: contributor
    @theuni I made a similar change in #29256. That approach might be less invasive because it makes the instance argument optional. So kernel code can be changed to log to a configurable instance while other code uses a global instance.
  7. DrahtBot added the label CI failed on Jun 25, 2024
  8. DrahtBot commented at 9:42 pm on June 25, 2024: contributor

    🚧 At least one of the CI tasks failed. Make sure to run all tests locally, according to the documentation.

    Possibly this is due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    Leave a comment here, if you need help tracking down a confusing failure.

    Debug: https://github.com/bitcoin/bitcoin/runs/26668500153

  9. EarlTZ approved
  10. EarlTZ approved
  11. in src/dbwrapper.cpp:103 in 07d68fdf09
     99@@ -100,7 +100,7 @@ class CBitcoinLevelDBLogger : public leveldb::Logger {
    100 
    101                 assert(p <= limit);
    102                 base[std::min(bufsize - 1, (int)(p - base))] = '\0';
    103-                LogPrintLevel(BCLog::LEVELDB, BCLog::Level::Debug, "%s", base); // NOLINT(bitcoin-unterminated-logprintf)
    104+                LogPrintLevel(LogInstance(), BCLog::LEVELDB, BCLog::Level::Debug, "%s", base); // NOLINT(bitcoin-unterminated-logprintf)
    


    vasild commented at 1:15 pm on June 26, 2024:

    This makes the call sites even more verbose and LogInstance(), is repeated everywhere. In this case one improvement could be to use a default argument for that parameter which defaults to LogInstance() but I can see that this may not play well with the ... arguments.

    Maybe introduce another function that takes (category, level, instance, ...)? Or check if the first of ... arguments in (category, level, ...) is a log instance and if not, then default the instance to LogInstance()? Something like:

     0  class Logger
     1  {
     2  };
     3  
     4  template <typename... Args>
     5  void Log(int, const Logger&, const Args&...)
     6  {
     7  }
     8  
     9  template <typename... Args>
    10  void f(int a, const Args&... args)
    11  {
    12      auto CallLog = [&](const auto& first, const auto&... rest) {
    13          if constexpr (std::is_same<decltype(first), Logger>::value) {
    14              Log(a, first, rest...);
    15          } else {
    16              Log(a, Logger{}, first, rest...);
    17          }
    18      };
    19  
    20      CallLog(args...);
    21  }
    22
    23  int main(int, char**)
    24  {
    25      f(5, 6);
    26      f(8, Logger{}, 9);
    27      return 0;
    28  }
    

    ryanofsky commented at 6:23 pm on June 26, 2024:

    re: #30338 (review)

    This makes the call sites even more verbose and LogInstance(), is repeated everywhere

    See #30342 for an alternative which should make everything less verbose. For example, this line changes to LogDebug(m_log, "%s", base);

  12. DrahtBot added the label Needs rebase on Jun 26, 2024
  13. DrahtBot commented at 7:51 pm on June 26, 2024: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

  14. ajtowns commented at 2:53 pm on June 27, 2024: contributor

    this will allow us to instantiate a per-context logger, which is necessary for the kernel.

    I don’t understand where this is coming from. Libraries can export globals, eg stdin, so I don’t see why you’re claiming it’s necessary for the kernel when it’s not even necessary for libc.

  15. TheCharlatan commented at 4:11 pm on June 28, 2024: contributor

    Re #30338 (comment)

    I don’t understand where this is coming from. Libraries can export globals, eg stdin, so I don’t see why you’re claiming it’s necessary for the kernel when it’s not even necessary for libc.

    Being forced to eliminate all globals doesn’t seem to be the claim here? As long as the globals don’t carry over annoying state from previous usage, require annoying initialization, or aren’t mostly safe from user misuse, their usage is fine. Next to what was brought up before about having the ability to distinguish between between the logged lines from multiple instances, there are a few other problems with the existing implementation that make its usage as a global annoying at the moment. If an instantiated logger is a red line here, we might be able to find other solutions though.

    The current implementation buffers without a limit until StartLogging() is called. This forces the user to at least eventually call StartLogging(), even if they don’t wish to consume logs. We could limit the buffer to some acceptably low amount, or ensure that StartLogging() is statically called when the library is loaded (similar to how the static secp context is created when bitcoin is started). This would discard previous logging messages though.

    I also noticed that any logs buffered before the user applies their own user-defined options will be printed in the default format and with default categories. Re-starting logging is also awkward to do at the moment.

    Logging is workable for the kernel library as is, but it would be nice to address the remaining pain points. I think having instantiated logging would be the best way to solve these issues, making it clear to the user when something will be logged, as well as giving guarantees on its format.

  16. ajtowns commented at 6:30 pm on June 28, 2024: contributor

    Re #30338 (comment)

    I don’t understand where this is coming from. Libraries can export globals, eg stdin, so I don’t see why you’re claiming it’s necessary for the kernel when it’s not even necessary for libc.

    Being forced to eliminate all globals doesn’t seem to be the claim here?

    The opposite of instanced parameters is globals; if instanced parameters are necessary, then globals aren’t possible. The claim in the OP is that the “instanced logs” proposed here are “necessary for the kernel”.

    Next to what was brought up before about having the ability to distinguish between between the logged lines from multiple instances, there are a few other problems with the existing implementation that make its usage as a global annoying at the moment.

    I assume that’s referring to the concept of rewriting contrib/linearize from python to C++. That doesn’t make sense to me as any kind of priority (and seems to conflict with other proposed benefits of instanced logging), so I’m going to assume that the “few other problems” are the real issue.

    The current implementation buffers without a limit until StartLogging() is called. This forces the user to at least eventually call StartLogging(), even if they don’t wish to consume logs. We could limit the buffer to some acceptably low amount, or ensure that StartLogging() is statically called when the library is loaded (similar to how the static secp context is created when bitcoin is started). This would discard previous logging messages though.

    StartLogging is meant to indicate the appropriate configuration values have been applied to the logger; so calling it on library load wouldn’t make sense, as far as I can see.

    Limiting the buffer could make sense. We could also tweak the defaults to be more suitable for the kernel (eg, no buffering of logs prior to StartLogging() at all), and override the settings for bitcoind/etc with something like:

     0// logging.h
     1class LoggingLaunchSettings
     2{
     3public:
     4   template<typename Fn>
     5   LoggingLaunchSettings(Fn&& fn) { Logger logger = LogInstance(); fn(logger); }
     6};
     7
     8// init.cpp
     9static BCLog::LoggingLaunchSettings bitcoind_logging_control{
    10 [](auto& logger) { logger.m_early_buffer = true; }
    11};
    

    Other users of libkernel who wanted buffering could then do the same thing.

    I also noticed that any logs buffered before the user applies their own user-defined options will be printed in the default format and with default categories.

    Buffering debug/trace logging prior to StartLogging doesn’t really make sense as far as I can see.

    Fixing the formatting of buffered logs is a bit of work, but not very complicated. It doesn’t seem particularly related to the kernel, though. Would you review a PR to do that?

    Re-starting logging is also awkward to do at the moment.

    I’m not sure what you mean? There’s DisconnectTestLogger(); StartLogging(); ?

    Logging is workable for the kernel library as is, but it would be nice to address the remaining pain points. I think having instantiated logging would be the best way to solve these issues, making it clear to the user when something will be logged, as well as giving guarantees on its format.

    Touching every single line of logging code for a “nice to have” doesn’t make much sense to me. It’s nice to be able to just shove a LogInfo("XXX %s happened", thing); anywhere in the code and have it work, without having to find/add an enclosing manager object that has had a logging context passed in from init.

  17. ryanofsky commented at 7:05 pm on June 28, 2024: contributor

    Touching every single line of logging code for a “nice to have” doesn’t make much sense to me.

    AJ I know we are disagreeing about a lot of things, and I do not fault you in any way for this, but your reactions seem very strong relative the actual scope of changes we would like to make.

    We should set aside Cory’s big scripted diff commit here, which I think is a proof of concept and an illustrative change, not an actual change being proposed to merge. I think the actual changes being considered would look a lot more like:

    • e9b91f5c5c9283516e67dd4e21d61c6b154672b6 refactor: Pass Logger instances to kernel objects
    • facc2d0edff9831b35dd3d7bcc82fc9b94e4cdd6 refactor: Log kernel output to local log instances

    Which come from #30342 and yes, I can understand, probably you do not love. But I don’t understand how they would have a discernible negative impact or why you would want to spend time arguing against them? These seem like basically trivial changes.

    I am sincerely asking about this, because I do not understand what you see as being at stake here.

  18. theuni commented at 7:10 pm on June 28, 2024: member

    Yes, @ryanofsky. This was a POC (FWIW I am actually very ok with the scripted-diff approach which touches every line, but I didn’t expect it to be popular). At this point I’d prefer to throw my weight behind #30342 as it accomplishes what I want without being so invasive.

    I won’t add to the arguing here as I think you and @TheCharlatan have said what I would’ve anyway.

  19. TheCharlatan commented at 7:40 pm on June 28, 2024: contributor

    Re #30338 (comment)

    Thanks for going through this. Having something like LaunchSettings seems like a good thing.

    Fixing the formatting of buffered logs is a bit of work, but not very complicated. It doesn’t seem particularly related to the kernel, though. Would you review a PR to do that?

    yes :)

    I’m not sure what you mean? There’s DisconnectTestLogger(); StartLogging(); ?

    Besides the incoherent naming between the two, it also starts buffering again when calling DisconnectTestLogger(), which does not seem immediately intuitive to me. It seems like what these methods are actually doing is disconnect all callbacks and start buffering, and stop buffering and flushing the buffered messages respectively. I’m not saying this is bad, just that they should have better ergonomics that clearly convey what they do. The current state is workable as is, albeit a bit rough.

  20. ajtowns commented at 9:26 pm on June 28, 2024: contributor

    Re #30338 (comment)

    Thanks for going through this. Having something like LaunchSettings seems like a good thing.

    Fixing the formatting of buffered logs is a bit of work, but not very complicated. It doesn’t seem particularly related to the kernel, though. Would you review a PR to do that?

    yes :)

    Have a look at https://github.com/ajtowns/bitcoin/pull/9 and see what you think; obviously needs a little cleaning up before PRing properly. It also has a commit to add m_max_buffer_memusage so that log buffering doesn’t blow up if you never call StartLogging() if you think that makes sense.

    I’m not sure what you mean? There’s DisconnectTestLogger(); StartLogging(); ?

    Besides the incoherent naming between the two, it also starts buffering again when calling DisconnectTestLogger(), which does not seem immediately intuitive to me.

    Not really sure what else you’d expect it to do? I don’t really see why you’d ever “restart” logging rather than “reconfiguring” it while it’s still running though, at least outside of tests.

  21. ajtowns commented at 0:00 am on June 29, 2024: contributor

    Touching every single line of logging code for a “nice to have” doesn’t make much sense to me.

    AJ I know we are disagreeing about a lot of things, and I do not fault you in any way for this, but your reactions seem very strong relative the actual scope of changes we would like to make.

    I don’t know what you want me to say here? I didn’t say “THAT’S CRAZY YOU SHOULD ALL BE PUT IN AN ASYLUM”; I said “it doesn’t make much sense to me”. The only way to be less strong about my opinion is to be entirely silent…

    Your patches are roughly how I imagine Cory’s would eventually look, and they still touch every single line of logging code (*in the kernel, anyway). That just seems completely unnecessary to me. The kernel library already works; I don’t think logging needs major changes in order for it to not only work but work well with the kernel.

    But I don’t understand how they would have a discernible negative impact

    For instance, once we have LogInstance() removed from the kernel, so that you have to pass a logger down to any function that wants to log, how do we add temporary logging for development in kernel functions that don’t have a manager object handy? Alternatively, I think your patches didn’t touch sync.cpp despite that being obviously essential to the kernel and which does some logging in debug modes. I think passing a logger to it without just recreating the LogInstance() global would already be awkward.

  22. ryanofsky commented at 5:58 pm on July 1, 2024: contributor

    Thanks AJ, it’s great to get specific and find out what actual problems you are concerned about, before coming to the conclusion that it is too difficult to remove one hard dependency on global shared state, and therefore we should resort to workarounds such as running code in multiple processes, or finding ways to reset the state and coordinate access to it.

    how do we add temporary logging for development in kernel functions that don’t have a manager object handy?

    The same way you add temporary logging now. With #30342, if you are modifying bitcoind or bitcoin-qt, you can continue to write LogInfo("Hello world\n") anywhere in the codebase and it works the same as before.

    The goal is to allow (and probably require) external applications using the kernel library to specify log instances. The goal is not to get rid of global state in bitcoind and bitcoin-qt. The idea is to let applications using the kernel library work with multiple data directories and simulate actions of multiple nodes, while collecting output from each separately, not mixed in a single stream. (Also, more speculatively, being able to specify log instances could make it easier to use the kernel library to write software that connects multiple networks, or to use kernel library as a dependency of another library that allows having separate instances of its own state.)

    Obviously, we can have a debate about how important these features are, or discuss a myriad of workarounds to implement them in different ways, but we should also evaluate what a straightforward approach of just declaring logger variables and using them looks like, and what the advantages and disadvantages of this approach are. Having implemented e9b91f5c5c9283516e67dd4e21d61c6b154672b6 and facc2d0edff9831b35dd3d7bcc82fc9b94e4cdd6 I don’t see real disadvantages, as

    0- LogPrintf("Wiping LevelDB in %s\n", fs::PathToString(params.path));
    1+ LogInfo(m_log, "Wiping LevelDB in %s\n", fs::PathToString(params.path));
    

    seems like a boring and straightforward change to me.

    I think your patches didn’t touch sync.cpp despite that being obviously essential to the kernel and which does some logging in debug modes.

    Right, sync.cpp logs debug information if locks are not acquired in a consistent order. If there is a desire to specify which log instance that information gets sent to, there are different ways we could support that, such as by adding an optional Logger* pointer in the Mutex class. I would not foresee any need or desire to do this, however.

    In bitcoind and bitcoin-qt, if a log statement does not specify a logger, it should just go to the global log instance. By contrast, libbitcoinkernel probably would be more flexible if it didn’t come with a default log instance, so any instance could be the default instance. A straightforward approach would be to a have a Logger* g_logger; global and if (!g_logger) g_logger = this; in the Logger constructor and if (g_logger == this) g_logger = nullptr; in the destructor. But other approaches are possible too. The important thing is just to allow log instances to be optionally specified in log statements and not hardcode them to use one global instance.

  23. ajtowns commented at 3:31 am on July 3, 2024: contributor

    Thanks AJ, it’s great to get specific and find out what actual problems you are concerned about,

    What I’m concerned about is making logging unnecessarily complicated to use – it should be easy to use, and it should be easy to understand what’s going on without needing any context from the rest of the module where the logging is happening, or from debates about how logging should work.

    You asked about specific problems this change could cause; and I’m happy to give examples, but they’re not my fundamental concern.

    before coming to the conclusion that it is too difficult to remove one hard dependency on global shared state,

    The same way you add temporary logging now. With #30342, if you are modifying bitcoind or bitcoin-qt, you can continue to write LogInfo("Hello world\n") anywhere in the codebase and it works the same as before.

    The goal is to allow (and probably require) external applications using the kernel library to specify log instances. The goal is not to get rid of global state in bitcoind and bitcoin-qt.

    To me, the logic goes: okay, there’s going to be global shared state anyway, how can we make it as simple as possible to deal with? And having multiple references to the same object that we pass around, even though there’s a global anyway, doesn’t make things simpler.

    The idea is to let applications using the kernel library work with multiple data directories and simulate actions of multiple nodes, while collecting output from each separately, not mixed in a single stream.

    I don’t think it makes sense to do that within a single process / address space, personally. I really don’t think it makes sense to make it more complicated to maintain bitcoind/bitcoin-qt in order to optimise speculative simulation software.

    Obviously, we can have a debate about how important these features are, or discuss a myriad of workarounds to implement them in different ways, but we should also evaluate what a straightforward approach of just declaring logger variables and using them looks like,

    I don’t think this change is straightforward: if it were, this PR wouldn’t need to be a WIP that still needs a bunch of other changes in a followup to work.

    I think your patches didn’t touch sync.cpp despite that being obviously essential to the kernel and which does some logging in debug modes.

    Right, sync.cpp logs debug information if locks are not acquired in a consistent order. If there is a desire to specify which log instance that information gets sent to, there are different ways we could support that, such as by adding an optional Logger* pointer in the Mutex class. I would not foresee any need or desire to do this, however.

    The idea, as I understand it, would be that you’d invoke:

    0    int main() {
    1        Logger my_logger("/tmp/debug.log");
    2        ChainstateManager chainman{..., my_logger};
    3        ...
    4        return 0;
    5    }
    

    at which point chainman could initialise Chainstate::m_chainstate_mutex by passing in my_logger, eg. The key point being that the logger is fully initialised (on construction) before potentially being used anywhere, so there’s no concern about buffering etc.

    But that approach breaks if you’re initialising the logger based on a conf file that you manage with ArgsManager – since ArgsManager needs to initialise its cs_args, which then needs a logger, but you’re still trying to work out how to configure the logger at this point.

    It also breaks for the various global mutexes that are still in the kernel (just cs_main, g_best_block_mutex? perhaps DumpMutex()::dump_mutex).

    The goal is to allow (and probably require) external applications using the kernel library to specify log instances.

    The important thing is just to allow log instances to be optionally specified in log statements and not hardcode them to use one global instance.

    That’s begging the question… The goal, the important thing, should be to have some observable benefit, that is achieved by that change, not to make the change itself.

    I believe the important thing this PR is attempting to resolve is more along the lines of “it should be easy to use libbitcoinkernel from a quick and dirty C++ program (along the lines of bitcoin-chainstate), without having to worry about doing anything with logging at all – at most creating a dummy logger because it’s a mandatory argument to some constructors. That should then work fine without potentially leaking memory or any other problems. Preferably, adding a simple one-liner should be enough to enable outputting all the logging information so that you can debug your program just like in bitcoind”.

    I think “being able to use two (or more) different BlockManager objects in the same program at the same time and distinguish their logs” isn’t important, though it might be nice. Likewise “if I write a libbitcoinkernel program and haven’t added a one-liner to enable logging, logging is a no-op and it should be optimised as such and execute as quick as possible”.

  24. TheCharlatan commented at 12:04 pm on July 3, 2024: contributor

    Re #30338 (comment)

    Have a look at https://github.com/ajtowns/bitcoin/pull/9 and see what you think;

    This looks good to me, thank you for implementing. Regarding disabling the buffering, I’m not sure if defaulting to off in our code is a good idea. How about this as an alternative: https://github.com/TheCharlatan/bitcoin/commit/86fed95f15fa6cee2e28679fe1ac8c537c4555a0 ? The library would always be configured to not buffer. The user could choose at compile time to leave this initialization out, or override it at runtime.

  25. ryanofsky commented at 1:15 pm on July 3, 2024: contributor

    But that approach breaks if you’re initialising the logger based on a conf file that you manage with ArgsManager – since ArgsManager needs to initialise its cs_args, which then needs a logger, but you’re still trying to work out how to configure the logger at this point.

    It also breaks for the various global mutexes that are still in the kernel (just cs_main, g_best_block_mutex? perhaps DumpMutex()::dump_mutex).

    I don’t think anything is broken here, or maybe we have different ideas of what broken means. In any case, I don’t understand what problem you see here, and think it would be helpful if you could describe whatever the issue is more fully.

    I appreciate you acknowledging the benefits of applications being able to declare their own loggers that are “fully initialised (on construction) before potentially being used anywhere” and even saying “might be nice” to be able to use different log instances for different purposes. I think, at this point, we both understand the benefits of not hardcoding logging macros to use a single global leaked log instance, even while we disagree about the magnitude of the benefits.

    I am trying to focus discussion on the costs, not the benefits, because we seem to disagree about not just the magnitude of the costs, but whether they exist at all. For example you are mentioning complexity without pointing to complicated changes or complicated code, and you are saying certain things are broken when they don’t seem to be obviously broken. It would be good to clear these things up so neither of us have misconceptions.

  26. ajtowns commented at 1:37 pm on July 3, 2024: contributor

    But that approach breaks if you’re initialising the logger based on a conf file that you manage with ArgsManager – since ArgsManager needs to initialise its cs_args, which then needs a logger, but you’re still trying to work out how to configure the logger at this point. It also breaks for the various global mutexes that are still in the kernel (just cs_main, g_best_block_mutex? perhaps DumpMutex()::dump_mutex).

    I don’t think anything is broken here, or maybe we have different ideas of what broken means. In any case, I don’t understand what problem you see here, and think it would be helpful if you could describe whatever the issue is more fully.

    Not really convinced explaining things is getting anywhere? Sometimes you have to try fully implementing your idea before you realise it doesn’t work…

    As I understand it, your proposal is (a) that the kernel will not have a global logger instance; but since mutexes do logging, they will need a logger instance, and since we use mutexes in the kernel they cannot use the global logger instance; therefore (b) you suggest that the each Mutex class can just have a logger instance of its own; however this doesn’t work for two reasons:

    • global mutexes like cs_main have to be initialised prior to main, so can’t be initialised with a non-global logger
    • (in bitcoind) we use mutexes when parsing the configuration, prior to gaining the information we’d like to have to be able to initialise our logger

    I appreciate you acknowledging the benefits of

    I really wish you’d do likewise at some point… It feels like I’m saying “just treat everything as orbiting the sun, it’s simpler” and just getting “no, we’re here on earth, and we can make the maths work if we just …” in response.

    I am trying to focus discussion on the costs, not the benefits,

    The point of looking at the benefits is you can see whether there might be some other way of achieving the same benefits. “I’d like to get from point A to B, and a Lambo will get me there, and I know it’s expensive, but I just want to focus on if whether we can afford it”. I’m not sure what you think the benefits actually are (eg, if we keep a global logger for mutexes around, then there’s still a global logger in the kernel) – and they seem to keep changing while the solution remains the same…

    because we seem to disagree about not just the magnitude of the costs, but whether they exist at all. For example you are mentioning complexity without pointing to complicated changes or complicated code, and you are saying certain things are broken when they don’t seem to be obviously broken. It would be good to clear these things up so neither of us have misconceptions.

    The PR that we’re commenting on is +1000 -1000 and isn’t even remotely complete. You’re coming up with random extra ideas to perhaps fix edge cases, though you don’t think they’re necessary. Things look broken to me, but not to you. You keep saying things that suggest you don’t think I’m an idiot, but if that’s the case, then all of that is evidence that this isn’t just a simple tweak.

    Passing a logger through constructors and keep tracking of it in member variables is already more complex than just having a global; additionally having a global that is automatically initialised and cleared would be much more complicated when you start taking thread-safety into account.

  27. ryanofsky commented at 3:42 pm on July 3, 2024: contributor

    Not really convinced explaining things is getting anywhere? Sometimes you have to try fully implementing your idea before you realise it doesn’t work…

    I think #30342 is a full implementation of making it possible the specify log instances kernel code should use. If you are looking for an implementation of a change to remove the hardcoded global logger instance from the kernel (replacing it as described with a global pointer to a default fallback log instance), see ecadc8f6e5554704748237354ed20628f54b0ba4. (EDIT: added link to implementation)

    As I understand it, your proposal is (a) that the kernel will not have a global logger instance; but since mutexes do logging, they will need a logger instance, and since we use mutexes in the kernel they cannot use the global logger instance; therefore (b) you suggest that the each Mutex class can just have a logger instance of its own

    Thank you for clarifying. This is not what I’m proposing. Mutexes don’t normally log anything except lock contention prints and inconsistent lock order warnings when certain debug options are turned on. I don’ t think there is a need for libbitcoinkernel users to be able specify which specific logger instance these are logged to. Allowing kernel users to create multiple logger instances but warning them that if (1) they create multiple instances simultaneously and (2) they enable very low-level log prints, the prints might be sent to the first instance seems fine. If it is not fine, with a small amount of extra work we can also allow them to specify which instance even kernel lock order warnings get sent to. I just don’t see a need to do this.

    I really wish you’d do likewise at some point… It feels like I’m saying “just treat everything as orbiting the sun, it’s simpler” and just getting “no, we’re here on earth, and we can make the maths work if we just …” in response.

    I am sorry, but the situation where we heap-allocate a global logger and leak it and hardcode all log prints in the codebase to use a shared logger does not seem simpler than the approach I am proposing where a log instance is just a normal object, and if you want code control where log output goes, you just pass it the log instance you want to use, and do not worry about global initialization order and heaving a memory leak, and coordinating shared access to the logger with reset functions, and buffering etc.

    I understand the current situation works well for bitcoind and bitcoin-qt code, but I think it will hamstring potential uses of libbitcoinkernel, and I think there is a way to have the best of both worlds that will reduce rather than increase complexity.

    I understand the previous statements are subjective, that passing the variable you want to use to the function you want to call may be simpler to some people, and making the variable a global accessed from everywhere may be simpler to other people. I just think if you look specifically at kernel logging, the two approaches are just not very different from each other and #30342 is a small, straightforward change that not does not have other objectively negative effects.

    I also want to apologize if this discussion is frustrating and I am sorry for only responding the technical parts of your posts and not responding to “meta” comments about lambos and +1000-1000 changes. I just think right now you are thinking about a different change than what I am proposing and it is important to clear up technical, objective disagreements before focusing on more subjective ones.

  28. ajtowns commented at 4:52 pm on July 3, 2024: contributor

    Re #30338 (comment)

    Have a look at ajtowns#9 and see what you think;

    This looks good to me, thank you for implementing.

    Okay, see #30386 for the more serious take. I have another patch that I could add on top that would also buffer early debug/trace logs, but that seemed a bit unnecessary (LMK if it would be useful).

    Regarding disabling the buffering, I’m not sure if defaulting to off in our code is a good idea. How about this as an alternative: TheCharlatan@86fed95 ? The library would always be configured to not buffer. The user could choose at compile time to leave this initialization out, or override it at runtime.

    With the PR above, the behaviour is:

    1. default to buffering logs, but apps that don’t care can just let the buffer rollover (and if you’re debugging things, you can invoke LogInstance().m_print_to_console = true; LogInstance().StartLogging(); to get at the buffered logs to see what was going on
    2. apps that hate logs can add a one liner to disable them (namely LogInstance().DisableLogging();) which will clear the buffer and make it so LogInstance().Enabled() returns false, at which point all the LogInfo() functions will early exit before even formatting the string.

    That feels fine to me, but ymmv. I think having the same “default” between the kernel code as used by bitcoind and the kernel code as exposed in a library is better than having it be different; not too fussed on what that default is.

  29. DrahtBot commented at 1:59 am on September 30, 2024: contributor

    ⌛ There hasn’t been much activity lately and the patch still needs rebase. What is the status here?

    • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
    • Is it no longer relevant? ➡️ Please close.
    • Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
  30. achow101 closed this on Oct 15, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-03 15:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me