kernel: Improve logging API #33847

pull ryanofsky wants to merge 1 commits into bitcoin:master from ryanofsky:pr/klog changing 5 files +118 −108
  1. ryanofsky commented at 11:30 pm on November 10, 2025: contributor

    Simplify kernel logging API and try to connect it to the rest of the kernel API by letting log options be set on Logger objects directly and Logger objects to be associated with Context objects directly. Also drop logging_disable() function and avoid having library accumulate log messages in a 1MB buffer by default.

    Changes are intended to make the API a little less surprising and more future proof. Rationale is described in some more detail the commit message.

    This change is not backwards compatible but I’d expect it to port over pretty smoothly to the language bindings, and offer the same benefits for them as for C/C++ callers.

  2. kernel: Simplify logging API
    Current kernel logging API is too complicated and too restrictive. This PR
    tries to improves it.
    
    I'd expect an API that supported logging to allow creating log streams with
    different options and providing some way to specify which library operations
    should be logged to which streams.
    
    By contrast, the current API allows creating multiple log streams, but all the
    streams receive the same messages because options can only be set globally and
    the stream objects can't be passed to any kernel API functions. They are not
    even referenced by the `btck_Context` struct which holds other shared state. If
    no log streams are created, log messages are generated anyway, but they are
    stored in a 1MB buffer and not sent anywhere, unless a log stream is created
    later, at which point they are sent in bulk to that stream. More log streams
    can be created after that, but they only receive new messages, not the buffered
    ones. If log output is not needed, a btck_logging_disable() call is required to
    prevent log messages from accumulating in the 1MB buffer. Calling this will
    abort the program if any log streams were created before it was called, and
    also abort the program if any new log streams are created later.
    
    None of these behaviors seem necessary or ideal, and it would be better to
    provide a simpler logging API that allows creating a log stream, setting
    options on it, registering it with the `btck_Context` instance and receiving
    log messages from it. Another advantage of this approach is that it could allow
    (with #30342) different log streams to be used for different purposes, which
    would be not be possible with the current interface.
    0926479924
  3. maflcko added the label Insufficient Review on Nov 11, 2025
  4. maflcko removed the label Insufficient Review on Nov 11, 2025
  5. DrahtBot added the label Validation on Nov 11, 2025
  6. TheCharlatan commented at 2:05 pm on November 11, 2025: contributor

    Concept ACK

    Thank you for contributing this @ryanofsky! We’ve discussed doing something similar a few months ago. This is close to how I hoped to implement the logging capabilities, but reviewers highlighted some inconsistencies with this approach: A secondary logger object cannot receive its own non-global options. The docstrings do at least still explain this even with this change, I am not sure if it is really less surprising for users. Maybe it would be preferable to push forward #30342 before this is done? Tagging @stringintech and @stickies-v here, since they both left comments on the logging capabilities during review of #30595.

  7. DrahtBot commented at 2:05 pm on November 11, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33847.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK TheCharlatan, stickies-v

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #33796 (kernel: Expose CheckTransaction consensus validation function by w0xlt)
    • #33728 (test: Add bitcoin-chainstate test for assumeutxo functionality by stringintech)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • return new -> return a new [grammar: needs the indefinite article “a” before “new” to be grammatical]
    • effect -> affect [word choice: “affect” (verb) is correct here; “effect” is a noun and makes the sentence incorrect]

    drahtbot_id_5_m

  8. ryanofsky force-pushed on Nov 11, 2025
  9. ryanofsky commented at 2:28 pm on November 11, 2025: contributor
    Updated d0bd114e974500ad11c8ecac26d6606166ab7438 -> 0926479924f453c38b5e0ad069435fb3241b2389 (pr/klog.1 -> pr/klog.2, compare) updating btck_logging_connection_create documentation
  10. ryanofsky commented at 3:29 pm on November 11, 2025: contributor

    re: #33847 (comment)

    Thanks @TheCharlatan. I think the approach you took in #30595 of making the kernel API mirror the current logging.h API was the best one to take as a starting point. #30595 covers a large surface area, and it would have been asking too much of reviewers to evaluate all the different ways the kernel API and existing APIs could or should diverge in one PR. It should be more manageable to make kernel API improvements in targeted followups like this PR, and hopefully the improvements here are welcome.

    On the question of whether to push forward with this PR or #30342 first: I think this PR should have much higher priority than #30342. The reason is that this PR is not backwards compatible and will require downstream changes to language bindings and probably to most external programs using those bindings. By contrast, #30342 is backwards compatible. If #30342 were merged later, it would just improve how requested log options are applied and how log messages are forwarded, without requiring external changes. #30342 is also a bigger PR that will take more effort to review.

    It is true that a potential drawback of merging this PR without #30342 is that the design of the API exposed here could make it seem like it provides more functionality than it currently does. Users would not know about the limitations without reading the documentation, and could be surprised or confused. I just would not expect this to be a problem in practice, because the surprising behavior should only occur if users create multiple log streams or multiple kernel context objects at the same time, and that should be uncommon. And even if it did happen, the consequences would not seem very bad: log options might be applied in a different way than expected, or some log messages might be received unexpectedly. It seems like it would place a bigger burden on users to first roll out a logging API that does not support multiple options and streams and then later change it incompatibly, than to roll out an API that could support them and then implement that support internally later.

  11. stringintech commented at 10:07 pm on November 13, 2025: contributor

    Thanks @ryanofsky for explaining the trade-offs!

    I think I also prefer we push this change after we support separate logger instances per connection though, and at this point keep the API consistent with what it’s currently capable of. I’m not sure how big of an impact delaying this would have on bindings/users, especially if we’re assuming their use cases are mostly limited to a single logging connection.

  12. ryanofsky commented at 1:29 am on November 14, 2025: contributor

    Thanks for the feedback @stringintech. Can you explain what specifically you may be looking for which this PR does not provide? The API defined here is consistent and logical. It just happens to be compatible with new features like per-connection options and separate log streams (already implemented in #30342) instead of being pointlessly incompatible with them. It also eliminates the btck_logging_disable() function which is unsafe and confusing. And it gets rid of the kernel’s inefficient default behavior of storing up log messages in a 1MB buffer in case a logger is attached at later time, instead taking a simpler and more rational approach: logging when a logger is attached and not logging otherwise. I think if you look at the API and documentation defined here you will see that it makes more sense than the current API and should be less confusing. If there are any specific problems I’d be happy to address them.

    If the logging API is not going to be rationalized soon (it’s only been around a week since the API was introduced in #30595), I am concerned it will be more difficult to rationalize later. Logging is a fundamental part of the kernel API so if backwards incompatible changes are made, they are likely to break most language bindings and potentially many applications written in different languages. It should make sense to provide a logging API that’s forward compatible and exposes fewer quirks and implementation details sooner instead of later, if we can do that.

  13. stringintech commented at 12:58 pm on November 14, 2025: contributor

    By inconsistency, I mean that refactoring set_options/set_level_category/enable_category/etc. to receive a btck_LoggingConnection* parameter suggests that settings are supported per connection, while calling them on a newly instantiated logging connection would actually override settings for all previously instantiated connections. While the actual behavior is well-documented, the function signatures themselves may still create expectations of per-connection isolation.

    I understand you already acknowledged this concern in your last paragraph here, and believe the benefits of pushing this change forward outweigh this downside. My preference was to wait for #30342, since the API refactoring and implementation naturally seem to be parts of a single patch, though I may be too optimistic about how much time/effort it takes to push that PR forward, or the effort it takes to adapt the bindings/other projects to the breaking changes.

    Regarding the StartLogging() call being moved from the logging connection constructor to context creation: This delay could mean logs won’t be produced by context-free operations (like btck_script_pubkey_verify()). If i am not wrong, it appears we don’t currently have such operations producing logs, but if we add them in the future while still using a global logger, would we need to move StartLogging() back to the logging connection constructor?

  14. stickies-v commented at 1:59 pm on November 14, 2025: contributor

    Concept ACK for making the logging interface local instead of global, but approach ~0 leaning towards nack. I think the approach that merging the backwards-incompatible interface early is not unreasonable but not my preferred one, because:

    • I don’t think we should at all care about breaking the kernel interface at the moment, the kernel API is experimental and upstream development flexibility and pace should absolutely be prioritized until kernel is in a more stable state
    • having the interface behave in unexpected ways is just not desirable, even if this gap is not catastrophic as I think you correctly described
    • internal logging refactorings / changes (e.g. #30342, but also others) may surface roadblocks or ideas that inform us about a better interface design, so waiting a bit more seems helpful to me

    I would prefer to convert this to draft, and have this PR serve as the WIP discussion grounds for polishing up the kernel logging API while we make changes such as #30342. I don’t have a proposal ready, but I would also like to make breaking changes to the category/level/enable/disable logic (at least for kernel API) which I think currently is extremely unergonomic and unintuitive, as well as support richer callbacks that do not require users (language bindings, especially) to parse already formatted log strings.

  15. ryanofsky commented at 2:15 pm on November 14, 2025: contributor

    This delay could mean logs won’t be produced by context-free operations

    Thanks, I’ll look into this. I am a little surprised there would be any context-free operations except very basic ones that shouldn’t log. It’d expect there to be some context object, even a minimal one, associated with all non-trivial operations to control things like logging, random number generation, caching, etc.

    Of course I do understand your abstract concern that adding a connection parameter “suggests that settings are supported per connection.” This is similar to how in many filesystem API’s, flushing one file can cause other files to be flushed to disk. Or how in the python database API, cursor objects have commit() and rollback() methods even though in some implementations calling these can affect other cursors. It’s not unusual for APIs to provide a little more generality than is guaranteed in every case, if this provides other benefits and avoids other types of pain.

    So I would like to understand any ways your abstract concern could translate to confusion or problems in practice. When I consider all use-cases I can think of, the new logging API here is less confusing and more coherent than the current one. I already mentioned the current nonsensical default behavior of storing all log messages into a 1MB buffer even when no logging is requested. But lets say you do want to enable logging. You might see the current btck_logging_set_* functions and think they actually do something. Well, they don’t, because in order to have logging, you need to create a connection object. Adding a connection parameter to these functions makes it obvious that a connection object is actually necessary. Or, in the current API you might see the btck_logging_connection_create function and notice that takes a callback and returns a btck_LoggingConnection handle. But where is that handle used? Only by the btck_logging_connection_destroy function and nowhere else. In the current API, kernel functions which actually log and set logging options are less discoverable because the API is not coherent and the functions using the connection do not reference it.

    More generally, I’d expect the majority of scripts and applications using the kernel library to only set one logging callback, and in that use-case, the new API should be a strict improvement over the old one. It is true that in the alternate case where an application does create multiple log callbacks, if the application calls btck_logging_set_* functions, and the author did not read the documentation for those functions, they might mistakenly think those options apply to a single callback instead of multiple. But those applications will still benefit from this PR because of the follow-up in #30342 which can apply log options singly. So I am trying to think of any application or use-case which would not benefit from this PR and I don’t see one. Would be interested to know if I am looking at this from the wrong perspective or am missing something.

  16. ryanofsky marked this as a draft on Nov 14, 2025
  17. ryanofsky commented at 2:33 pm on November 14, 2025: contributor

    prefer to convert this to draft

    Sure happy to do that while there is conceptual discussion.

    having the interface behave in unexpected ways is just not desirable

    I think I’ve pointed out practical ways the current interface behaves in unexpected ways and is counterintuitive, and practical ways this PR is an improvement. The concerns you and stringintech raised do seem plausible, but also vague so if there are any concrete scenarios you can think of where this PR might create a problem, they would be helpful to know about.

    Thanks for the thoughtful reviews!


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-11-20 21:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me