doc: Drop description of LogError messages as fatal #30361

pull ryanofsky wants to merge 1 commits into bitcoin:master from ryanofsky:pr/nofat changing 1 files +6 −6
  1. ryanofsky commented at 10:18 pm on June 28, 2024: contributor

    Remove documentation that says LogError should be used for “severe problems that require the node (or a subsystem) to shut down entirely” because:

    • It is not how LogError and Level::Error are used currently. Of 129 current uses only 58 cases are fatal according to #30348

    • “[T]here’s not much benefit in a log that says “hey this error is about to cause the node to stop” – you already get that information by seeing “Shutdown: in progress…” immediately following.” according to #30347 (comment)

    Fixes #30348

  2. DrahtBot commented at 10:18 pm on June 28, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept NACK maflcko
    Ignored review l0rinc

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    No conflicts as of last run.

  3. DrahtBot added the label Docs on Jun 28, 2024
  4. ajtowns commented at 11:58 pm on June 28, 2024: contributor

    I think both entries should continue to include examples.

    I think removing “severe” from the LogWarning description doesn’t make sense – anything the admin might have to address is a severe (potential) problem, isn’t it?

    Having LogError mean “the node admin will need to address” vs LogWarning mean “the node admin may need to address” seems to argue towards retaining the “LogError is fatal” interpretation to me – fatal errors are ones the admin does need address; anything else they can ignore until it becomes fatal.

    A potential advantage of having LogError (or LogFatal) reserved for fatal errors is that you could grep through the codebase for LogError to see all the potentially fatal errors, and perhaps work on finding ways to make them less fatal. Not convinced that’s very beneficial though.

    I don’t think there’s much other benefit to splitting potential problems up from actual problems otherwise if the admin still needs to attend to them either way, though; so if we’re already making LogError just mean actual errors; is there any reason not to merge them both into a single LogAlert() or similar?

  5. doc: Drop description of LogError messages as fatal
    Remove documentation that says LogError should be used for "severe problems
    that require the node (or a subsystem) to shut down entirely" because:
    
    - This is not how `LogError` and `Level::Error` are used currently. Of 129
      current uses only 58 cases are fatal according to
      https://github.com/bitcoin/bitcoin/issues/30348
    
    - "[T]here's not much benefit in a log that says "hey this error is about to
      cause the node to stop" -- you already get that information by seeing
      "Shutdown: in progress..." immediately following." according to
      https://github.com/bitcoin/bitcoin/pull/30347#discussion_r1659096893
    b7aae361b2
  6. ryanofsky force-pushed on Jun 30, 2024
  7. ryanofsky commented at 3:55 pm on June 30, 2024: contributor

    re: #30361 (comment)

    Thanks, added back examples and “severe”.

    I think probably just looking at AbortNode callers would be as effective as as searching for fatal error messages, if looking for places to make error handling more robust.

    I do like the alert level idea and posted an implementation in #30364.

    Updated 5175cbcbb00d6212cf548de428f43cb0c2a1be4b -> b7aae361b273f2f439d3b278214b7e37908c8cb0 (pr/nofat.1 -> pr/nofat.2, compare) with clarifications and examples.

  8. ryanofsky marked this as ready for review on Jun 30, 2024
  9. maflcko commented at 8:12 pm on July 16, 2024: member
    • It is not how LogError and Level::Error are used currently. Of 129 current uses only 58 cases are fatal according to #30348

    The reason is that many stem from error(), see also #30364#pullrequestreview-2181072864.

    • “[T]here’s not much benefit in a log that says “hey this error is about to cause the node to stop” – you already get that information by seeing “Shutdown: in progress…” immediately following.” according to #30347 (comment)

    Agree, but then just go with #30364? Seems odd to change something twice, when it can be changed once?

    weak NACK for now.

  10. ryanofsky commented at 9:58 pm on August 16, 2024: contributor

    re: #30361 (comment)

    I think I might not have enough information to understand the weak nack, because I don’t actually see a reason in there to reject a documentation fix that is only dropping advice to interpret Error as “fatal” and Warning as “nonfatal”.

    The advice is not followed by current code, but more importantly, I don’t think anybody (even the original author) currently thinks this advice is something we should try to implement anymore. If something in this PR makes the documentation worse, it’d be helpful to know where the problems are. Otherwise this seems like a clean fix for the issue described in #30348, and a way to close that issue and move on to other fixes and improvements.

    I do understand what you are saying about the error() removal in #29236 making the issue described in #30348 worse because it increased the amount of Error spam. But it’s not like this PR is changing documentation to say that Error spam is ok. It is clearly not. It is just trying to remove a separate point of confusion.

  11. maflcko commented at 7:18 am on August 27, 2024: member

    I think I might not have enough information to understand the weak nack

    It is the line prior: “Seems odd to change something twice, when it can be changed once? weak NACK for now.”

    Now that the other pull is closed, I guess this is fine.

  12. maflcko commented at 7:23 am on August 27, 2024: member

    lgtm ack b7aae361b273f2f439d3b278214b7e37908c8cb0

    Seems fine to change the documentation here to be a description of what the code does right now. Also, it seems fine for this description to be applied going forward, because both an “Error” or “Warning” should be considered an Alert for the node admin to investigate (and possibly dismiss) (with or without the node shutting down).

    There are probably cases where an Error can be “downgraded” to a Warning in the code, but they can be done in follow-up changes.

  13. ajtowns commented at 9:22 am on August 27, 2024: contributor

    I don’t think anybody (even the original author) currently thinks this advice is something we should try to implement anymore.

    Presuming that if it’s me who’s he-who-must-not-be-named here rather than referring back to #24464 etc, what I think is that if we’re going to have distinct log levels for “error” and “warning”, I think there should be a clear documented difference between what they are, so that PR authors, reviewers and users can know what to expect. If there’s places in our code that don’t meet that expectation, then that’s something that should be fixed, though it doesn’t need to be treated as some huge priority. On the other hand, if there’s no distinction between the levels that’s worth making, we shouldn’t have two levels. So I think either the status quo or #30364 is better than this PR. It’s not really worth arguing about, but it’s frustrating having people put words in my mouth that aren’t mine.

  14. Tarhanshop approved
  15. maflcko commented at 8:45 am on September 18, 2024: member

    So I think either the status quo or #30364 is better than this PR.

    I tend to agree on a second thought. This pull just seems to increase the long-term confusion by treating the levels equivalent or similar, just because some parts of the code happen to confuse them (for various reasons).

    Also, as pointed out in #30364 (review) (and other comments) a good chunk of the log statements that this doc-only pull is trying to “fix” shouldn’t be log statements at all.

    So overall I am NACK -ish on increasing confusion in the docs, motivated by code that will be removed or changed long-term anyway.

  16. in doc/developer-notes.md:747 in b7aae361b2
    742@@ -742,14 +743,13 @@ logging messages. They should be used as follows:
    743   attacker to fill up storage. Note that `LogPrintf(fmt, params...)` is
    744   a deprecated alias for `LogInfo`.
    745 
    746-- `LogError(fmt, params...)` should be used in place of `LogInfo` for
    747-  severe problems that require the node (or a subsystem) to shut down
    748-  entirely (e.g., insufficient storage space).
    749+- `LogError(fmt, params...)` should be used in place of `LogInfo` for severe
    750+  errors the node admin will need to address (e.g., failure to write data).
    


    l0rinc commented at 12:34 pm on September 18, 2024:

    nit: instead of seems slightly more common than in place of + that to streamline readability + needs to address to imply urgency:

    0- `LogError(fmt, params...)` should be used instead of `LogInfo` for severe errors
    1  that the node admin needs to address (e.g., failure to write data).
    
  17. in doc/developer-notes.md:754 in b7aae361b2
    755-  appears to be wrong, unknown soft fork appears to have activated).
    756+  unexpected conditions indicating potentially severe problems the node admin
    757+  should address (e.g. system time appears to be wrong, unknown soft fork
    758+  appears to have activated).
    759 
    760 - `LogTrace(BCLog::CATEGORY, fmt, params...)` should be used in place of
    


    l0rinc commented at 12:42 pm on September 18, 2024:
    nit: the logger order is weird, why debug -> info -> error -> warning -> trace? Especially after announcing LogInfo, LogDebug, LogTrace, LogWarning and LogError in the intro.
  18. l0rinc commented at 12:43 pm on September 18, 2024: contributor

    While I understand why @maflcko doesn’t want to change twice, this seems to me like a slight improvement.

    utACK b7aae361b273f2f439d3b278214b7e37908c8cb0

  19. l0rinc approved

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-26 21:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me