kernel: Improve logging API #33847

pull ryanofsky wants to merge 1 commits into bitcoin:master from ryanofsky:pr/klog changing 5 files +118 −108
  1. ryanofsky commented at 11:30 pm on November 10, 2025: contributor

    Simplify kernel logging API and try to connect it to the rest of the kernel API by letting log options be set on Logger objects directly and Logger objects to be associated with Context objects directly. Also drop logging_disable() function and avoid having library accumulate log messages in a 1MB buffer by default.

    Changes are intended to make the API a little less surprising and more future proof. Rationale is described in some more detail the commit message.

    This change is not backwards compatible but I’d expect it to port over pretty smoothly to the language bindings, and offer the same benefits for them as for C/C++ callers.

  2. kernel: Simplify logging API
    Current kernel logging API is too complicated and too restrictive. This PR
    tries to improves it.
    
    I'd expect an API that supported logging to allow creating log streams with
    different options and providing some way to specify which library operations
    should be logged to which streams.
    
    By contrast, the current API allows creating multiple log streams, but all the
    streams receive the same messages because options can only be set globally and
    the stream objects can't be passed to any kernel API functions. They are not
    even referenced by the `btck_Context` struct which holds other shared state. If
    no log streams are created, log messages are generated anyway, but they are
    stored in a 1MB buffer and not sent anywhere, unless a log stream is created
    later, at which point they are sent in bulk to that stream. More log streams
    can be created after that, but they only receive new messages, not the buffered
    ones. If log output is not needed, a btck_logging_disable() call is required to
    prevent log messages from accumulating in the 1MB buffer. Calling this will
    abort the program if any log streams were created before it was called, and
    also abort the program if any new log streams are created later.
    
    None of these behaviors seem necessary or ideal, and it would be better to
    provide a simpler logging API that allows creating a log stream, setting
    options on it, registering it with the `btck_Context` instance and receiving
    log messages from it. Another advantage of this approach is that it could allow
    (with #30342) different log streams to be used for different purposes, which
    would be not be possible with the current interface.
    0926479924
  3. maflcko added the label Insufficient Review on Nov 11, 2025
  4. maflcko removed the label Insufficient Review on Nov 11, 2025
  5. DrahtBot added the label Validation on Nov 11, 2025
  6. sedited commented at 2:05 pm on November 11, 2025: contributor

    Concept ACK

    Thank you for contributing this @ryanofsky! We’ve discussed doing something similar a few months ago. This is close to how I hoped to implement the logging capabilities, but reviewers highlighted some inconsistencies with this approach: A secondary logger object cannot receive its own non-global options. The docstrings do at least still explain this even with this change, I am not sure if it is really less surprising for users. Maybe it would be preferable to push forward #30342 before this is done? Tagging @stringintech and @stickies-v here, since they both left comments on the logging capabilities during review of #30595.

  7. DrahtBot commented at 2:05 pm on November 11, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33847.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK sedited, stickies-v

    If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #33796 (kernel: Expose CheckTransaction consensus validation function by w0xlt)
    • #33728 (test: Add bitcoin-chainstate test for assumeutxo functionality by stringintech)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • return new -> return a new [grammar: needs the indefinite article “a” before “new” to be grammatical]
    • effect -> affect [word choice: “affect” (verb) is correct here; “effect” is a noun and makes the sentence incorrect]

    drahtbot_id_5_m

  8. ryanofsky force-pushed on Nov 11, 2025
  9. ryanofsky commented at 2:28 pm on November 11, 2025: contributor
    Updated d0bd114e974500ad11c8ecac26d6606166ab7438 -> 0926479924f453c38b5e0ad069435fb3241b2389 (pr/klog.1 -> pr/klog.2, compare) updating btck_logging_connection_create documentation
  10. ryanofsky commented at 3:29 pm on November 11, 2025: contributor

    re: #33847 (comment)

    Thanks @TheCharlatan. I think the approach you took in #30595 of making the kernel API mirror the current logging.h API was the best one to take as a starting point. #30595 covers a large surface area, and it would have been asking too much of reviewers to evaluate all the different ways the kernel API and existing APIs could or should diverge in one PR. It should be more manageable to make kernel API improvements in targeted followups like this PR, and hopefully the improvements here are welcome.

    On the question of whether to push forward with this PR or #30342 first: I think this PR should have much higher priority than #30342. The reason is that this PR is not backwards compatible and will require downstream changes to language bindings and probably to most external programs using those bindings. By contrast, #30342 is backwards compatible. If #30342 were merged later, it would just improve how requested log options are applied and how log messages are forwarded, without requiring external changes. #30342 is also a bigger PR that will take more effort to review.

    It is true that a potential drawback of merging this PR without #30342 is that the design of the API exposed here could make it seem like it provides more functionality than it currently does. Users would not know about the limitations without reading the documentation, and could be surprised or confused. I just would not expect this to be a problem in practice, because the surprising behavior should only occur if users create multiple log streams or multiple kernel context objects at the same time, and that should be uncommon. And even if it did happen, the consequences would not seem very bad: log options might be applied in a different way than expected, or some log messages might be received unexpectedly. It seems like it would place a bigger burden on users to first roll out a logging API that does not support multiple options and streams and then later change it incompatibly, than to roll out an API that could support them and then implement that support internally later.

  11. stringintech commented at 10:07 pm on November 13, 2025: contributor

    Thanks @ryanofsky for explaining the trade-offs!

    I think I also prefer we push this change after we support separate logger instances per connection though, and at this point keep the API consistent with what it’s currently capable of. I’m not sure how big of an impact delaying this would have on bindings/users, especially if we’re assuming their use cases are mostly limited to a single logging connection.

  12. ryanofsky commented at 1:29 am on November 14, 2025: contributor

    Thanks for the feedback @stringintech. Can you explain what specifically you may be looking for which this PR does not provide? The API defined here is consistent and logical. It just happens to be compatible with new features like per-connection options and separate log streams (already implemented in #30342) instead of being pointlessly incompatible with them. It also eliminates the btck_logging_disable() function which is unsafe and confusing. And it gets rid of the kernel’s inefficient default behavior of storing up log messages in a 1MB buffer in case a logger is attached at later time, instead taking a simpler and more rational approach: logging when a logger is attached and not logging otherwise. I think if you look at the API and documentation defined here you will see that it makes more sense than the current API and should be less confusing. If there are any specific problems I’d be happy to address them.

    If the logging API is not going to be rationalized soon (it’s only been around a week since the API was introduced in #30595), I am concerned it will be more difficult to rationalize later. Logging is a fundamental part of the kernel API so if backwards incompatible changes are made, they are likely to break most language bindings and potentially many applications written in different languages. It should make sense to provide a logging API that’s forward compatible and exposes fewer quirks and implementation details sooner instead of later, if we can do that.

  13. stringintech commented at 12:58 pm on November 14, 2025: contributor

    By inconsistency, I mean that refactoring set_options/set_level_category/enable_category/etc. to receive a btck_LoggingConnection* parameter suggests that settings are supported per connection, while calling them on a newly instantiated logging connection would actually override settings for all previously instantiated connections. While the actual behavior is well-documented, the function signatures themselves may still create expectations of per-connection isolation.

    I understand you already acknowledged this concern in your last paragraph here, and believe the benefits of pushing this change forward outweigh this downside. My preference was to wait for #30342, since the API refactoring and implementation naturally seem to be parts of a single patch, though I may be too optimistic about how much time/effort it takes to push that PR forward, or the effort it takes to adapt the bindings/other projects to the breaking changes.

    Regarding the StartLogging() call being moved from the logging connection constructor to context creation: This delay could mean logs won’t be produced by context-free operations (like btck_script_pubkey_verify()). If i am not wrong, it appears we don’t currently have such operations producing logs, but if we add them in the future while still using a global logger, would we need to move StartLogging() back to the logging connection constructor?

  14. stickies-v commented at 1:59 pm on November 14, 2025: contributor

    Concept ACK for making the logging interface local instead of global, but approach ~0 leaning towards nack. I think the approach that merging the backwards-incompatible interface early is not unreasonable but not my preferred one, because:

    • I don’t think we should at all care about breaking the kernel interface at the moment, the kernel API is experimental and upstream development flexibility and pace should absolutely be prioritized until kernel is in a more stable state
    • having the interface behave in unexpected ways is just not desirable, even if this gap is not catastrophic as I think you correctly described
    • internal logging refactorings / changes (e.g. #30342, but also others) may surface roadblocks or ideas that inform us about a better interface design, so waiting a bit more seems helpful to me

    I would prefer to convert this to draft, and have this PR serve as the WIP discussion grounds for polishing up the kernel logging API while we make changes such as #30342. I don’t have a proposal ready, but I would also like to make breaking changes to the category/level/enable/disable logic (at least for kernel API) which I think currently is extremely unergonomic and unintuitive, as well as support richer callbacks that do not require users (language bindings, especially) to parse already formatted log strings.

  15. ryanofsky commented at 2:15 pm on November 14, 2025: contributor

    This delay could mean logs won’t be produced by context-free operations

    Thanks, I’ll look into this. I am a little surprised there would be any context-free operations except very basic ones that shouldn’t log. It’d expect there to be some context object, even a minimal one, associated with all non-trivial operations to control things like logging, random number generation, caching, etc.

    Of course I do understand your abstract concern that adding a connection parameter “suggests that settings are supported per connection.” This is similar to how in many filesystem API’s, flushing one file can cause other files to be flushed to disk. Or how in the python database API, cursor objects have commit() and rollback() methods even though in some implementations calling these can affect other cursors. It’s not unusual for APIs to provide a little more generality than is guaranteed in every case, if this provides other benefits and avoids other types of pain.

    So I would like to understand any ways your abstract concern could translate to confusion or problems in practice. When I consider all use-cases I can think of, the new logging API here is less confusing and more coherent than the current one. I already mentioned the current nonsensical default behavior of storing all log messages into a 1MB buffer even when no logging is requested. But lets say you do want to enable logging. You might see the current btck_logging_set_* functions and think they actually do something. Well, they don’t, because in order to have logging, you need to create a connection object. Adding a connection parameter to these functions makes it obvious that a connection object is actually necessary. Or, in the current API you might see the btck_logging_connection_create function and notice that takes a callback and returns a btck_LoggingConnection handle. But where is that handle used? Only by the btck_logging_connection_destroy function and nowhere else. In the current API, kernel functions which actually log and set logging options are less discoverable because the API is not coherent and the functions using the connection do not reference it.

    More generally, I’d expect the majority of scripts and applications using the kernel library to only set one logging callback, and in that use-case, the new API should be a strict improvement over the old one. It is true that in the alternate case where an application does create multiple log callbacks, if the application calls btck_logging_set_* functions, and the author did not read the documentation for those functions, they might mistakenly think those options apply to a single callback instead of multiple. But those applications will still benefit from this PR because of the follow-up in #30342 which can apply log options singly. So I am trying to think of any application or use-case which would not benefit from this PR and I don’t see one. Would be interested to know if I am looking at this from the wrong perspective or am missing something.

  16. ryanofsky marked this as a draft on Nov 14, 2025
  17. ryanofsky commented at 2:33 pm on November 14, 2025: contributor

    prefer to convert this to draft

    Sure happy to do that while there is conceptual discussion.

    having the interface behave in unexpected ways is just not desirable

    I think I’ve pointed out practical ways the current interface behaves in unexpected ways and is counterintuitive, and practical ways this PR is an improvement. The concerns you and stringintech raised do seem plausible, but also vague so if there are any concrete scenarios you can think of where this PR might create a problem, they would be helpful to know about.

    Thanks for the thoughtful reviews!

  18. in src/kernel/bitcoinkernel.h:807 in 0926479924
    802+ * @brief Sets log instance for the kernel context to use.
    803+ *
    804+ * @param[in] context_options  Non-null, previously created by @ref btck_context_options_create.
    805+ * @param[in] logger           Is set to the context options.
    806+ */
    807+BITCOINKERNEL_API void btck_context_options_set_logger(
    


    stickies-v commented at 1:20 pm on December 10, 2025:

    How do we deal with ownership of the logger here? Is it owned by the context? If yes:

    • should we still expose the logging_connection_destroy function?
    • how do users update the logging_connection options? Their handle should be invalid after ownership is passed to the context? A btck_context_get_logger() could be an alternative?
    • users may need to create multiple logging connections, e.g. to pass to context-free functions that need logging

    If no:

    • managing lifetimes becomes more brittle, i.e. users must ensure logger is kept alive for the duration of context. This feels like a terrible idea, but one approach could be to let the static kernel::Context own the loggers, with the implication that connections can be created, but not destroyed (until the application aborts)?

    ryanofsky commented at 2:27 pm on December 10, 2025:

    Thanks for following up! If I understanding your concerns correctly, I don’t think there is any reason to think this PR can cause problems.

    In the kernel API, log messages are only written to logging connections, not to files or other possible log sinks. So with #30342 a “logger” is just a logging connection. If you create a logging connection, you create a BCLog::Logger instance and if you destroy the logging connection you destroy it.

    Before #30342, there is only a single BCLog::Logger instance and all logging connections share it. This PR does not change that. It keeps behavior the same and just adds function parameters so the association between log connections and log sources is explicit, rather than that implicit, and so the more flexible logging enabled by #30342 can be introduced without API changes, binding changes, script and application changes.

    How do we deal with ownership of the logger here?

    It is owned by the logging connection.

    Is it owned by the context?

    No.

    • should we still expose the logging_connection_destroy function?

    Yes.

    • how do users update the logging_connection options? Their handle should be invalid after ownership is passed to the context?

    No, a logging connection is just a handle referring to a BCLog::Logger instance. The handle can be used until the connection is closed.

    • users may need to create multiple logging connections, e.g. to pass to context-free functions that need logging

    No, a logging connection can be passed to different log sources, the log sources do not own the connection.

    • managing lifetimes becomes more brittle, i.e. users must ensure logger is kept alive for the duration of context. This feels like a terrible idea

    No, they only need to keep the logging connection open as long as they want to receive logging callbacks. If they don’t want to receive logging callbacks, there is no need for a connection or logger.

    but one approach could be to let the static kernel::Context own the loggers, with the implication that connections can be created, but not destroyed (until the application aborts)?

    This sounds very kludgy and there should be no need for it. (Having a static context in general seems like a kludge to me and I hope there are plans to get rid of it. There should definitely be no need to use it here.)


    stickies-v commented at 2:49 pm on December 10, 2025:

    I agree that with the current implementation (where btck_LoggingConnection just represents a callback) there are no meaningful lifetime concerns, except perhaps conceptually. However, with something like #30342, it does look problematic, since Context would be keeping a reference to the logger instance, which may have already been invalidated by the user (who owns the logger)?

    Assume the following pseudo-code:

    0opts = btck_context_options_create();
    1logger = btck_logging_connection_create();
    2btck_context_options_set_logger(opts, logger);
    3ctx = btck_context_create(opts);
    4btck_logging_connection_destroy(loger);
    

    I think ctx is now in an invalid state? We had a similar concern in earlier versions of this PR, where context had to be kept alive for the duration of chainman, so eventually the approach was changed to let chainman own the context it was created with. Arguably, the same could be done here, as per the bullet points under the “yes” section? It’s currently my preferred approach over passing just a view (which, I think should be passed by const reference and documented as such, if you decide to keep it).

    In my previous comment, in the context-owns-logger approach, I suggested we might want to remove logging_connection_destroy. Of course, if/when we have context-free functions that need logging, it would be perfectly fine to pass a stand-alone btck_LoggingConnection, at which point exposing logging_connection_destroy would make sense again.


    ryanofsky commented at 3:50 pm on December 10, 2025:

    I see comment has been edited, but just to respond to the example, since it is illustrative:

    0opts = btck_context_options_create();
    1logger = btck_logging_connection_create();
    2btck_context_options_set_logger(opts, logger);
    3ctx = btck_context_create(opts);
    4btck_logging_connection_destroy(loger);
    

    I’d want to resolve this by adding a int use_count field to the LoggingConnection struct and making the btck_logging_connection_destroy call fail if the use_count is not 0.

    I would imagine using a similar pattern in other parts of the kernel when references to handles need to be stored.

    I don’t think it would be smart in general to work around the need to share handles by resorting to using global variables or large context objects, or by limiting flexibility of the API.


    stickies-v commented at 4:40 pm on December 10, 2025:

    I was actually wrong about my earlier example about chainman owning context: it is passed as a non-owning (const ptr) view, and lifetime then ensured via std::shared_ptr. Whether we prevent destroying or automatically extend lifetimes is not really my concern here, so I think we can wrap this up on the conclusion that you prefer letting the user manage btck_LoggingConnection’s lifetime (with safeguards in place), which I think works well.

    Thank you for the in-depth response. Essentially, the misunderstanding I had is that if the context doesn’t own the logging connection it can’t ensure its lifetime. I now understand that’s not actually true, and we already use this pattern in existing kernel code, e.g. with context associating with chainman. I agree then that there is no need for context to own the logging connection.


    ryanofsky commented at 6:59 pm on December 10, 2025:

    re: #33847 (review)

    my earlier example about chainman owning context: it is passed as a non-owning (const ptr) view, and lifetime then ensured via std::shared_ptr.

    That’s pretty interesting. I can see how the lifetime extension idea also works, although my first reaction is that it seems a little fragile, and it seems safer in general to return an error to applications if they try to delete objects that are currently in use, instead of internally giving the objects longer lifetimes.

    In the case of btck_LoggingConnection, I agree either approach could work. The only nuance is that if a shared_ptr is used, regardless of whether the logger object is freed, btck_logging_connection_destroy needs to unregister the logging callback function so the kernel does not try to call the application after it is no longer expecting calls.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-12-12 00:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me