log: Use ConstevalFormatString #30889

pull maflcko wants to merge 3 commits into bitcoin:master from maflcko:2407-log changing 7 files +19 −34
  1. maflcko commented at 6:37 pm on September 12, 2024: member

    This changes all logging (including the wallet logging) to produce a ConstevalFormatString at compile time, so that the format string can be validated at compile-time.

    I tested with clang and found that the compiler will use less than 1% more of time and memory.

    When an error is found, the compile-time error depends on the compiler, but it may look similar to:

    0src/util/string.h: In function int main(int, char**):
    1src/bitcoind.cpp:265:5:   in constexpr expansion of util::ConstevalFormatString<1>(((const char*)"Hi %s %s"))
    2src/util/string.h:38:98:   in constexpr expansion of util::ConstevalFormatString<1>::Detail_CheckNumFormatSpecifiers(std::basic_string_view<char>(((const char*)((util::ConstevalFormatString<1>*)this)->util::ConstevalFormatString<1>::fmt)))
    3src/util/string.h:78:34: error: expression <throw-expression> is not a constant expression
    4   78 |         if (num_params != count) throw "Format specifier count must match the argument count!";
    5      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    

    This refactor does not change behavior of the compiled executables.

  2. doc: move-only logging warning
    Put the warning closer to where it is relevant. That is, put it close to
    the functions that actually do unconditional logging.
    
    Also, remove a stray empty line.
    fa39b1ca63
  3. DrahtBot commented at 6:37 pm on September 12, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK hodlinator, l0rinc, ryanofsky, pablomartin4btc
    Approach ACK stickies-v

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)
    • #29418 (rpc: provide per message stats for global traffic via new RPC ‘getnetmsgstats’ by vasild)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  4. DrahtBot renamed this:
    log: Use ConstevalFormatString
    log: Use ConstevalFormatString
    on Sep 12, 2024
  5. DrahtBot added the label Utils/log/libs on Sep 12, 2024
  6. maflcko force-pushed on Sep 12, 2024
  7. ryanofsky commented at 7:31 pm on September 12, 2024: contributor

    Code review ACK 555513ac4e40f25b54d5f0473904adb23f4e3df7. This is great and I’m surprised it could be implemented so easily.

    I do have questions about the second commit “log: Make format errors visible in debug builds” (fa0383761696d2c5c2e88208676cc993694fc1d4). I understand compile time checking should prevent most runtime errors, so the behavior in that commit of aborting the program when an invalid format string is specified is not as annoying for log-print debugging as it would be otherwise. But is it really the case that compile time checking will prevent all runtime errors? If some runtime errors are still possible I think it would better not to abort when they happen, or to at least move the commit to a different PR so we can evaluate whether to this independently of the other changes here, which are strict improvements otherwise.

  8. test: Use LogPrintStr to test m_log_sourcelocations
    This test checks m_log_sourcelocations, not the formatting with format
    specifiers. Those are tested in logging_LogPrintMacros below.
    
    So just use LogPrintStr directly in this test, without format specifiers
    and format args.
    
    This is required for a follow-up commit.
    fae9b60c4f
  9. maflcko force-pushed on Sep 12, 2024
  10. maflcko commented at 10:10 pm on September 12, 2024: member

    Sure, dropped the commit. It probably matters so little that reviewing it isn’t worth it. (Edit: To clarify the commit uses Assume, which only aborts in CI or other test-only devel builds)

    Edit: To add even more context, IIUC the only remaining possible runtime errors should be due to the use of * (impossible to check at compile-time without a full compile-time re-implementation of POSIX or tinyformat), or %n: TINYFORMAT_ERROR("tinyformat: %n conversion spec not supported");.

  11. pablomartin4btc commented at 11:26 am on September 13, 2024: member

    ACK fabd78e2e30e557e04739e29c9c221ad47245df1

    I find this very practical. Regarding the abort on CI specifically would be very useful as well in order to catch the issues earlier, not sure about the cons on test-only builds, perhaps someone can add some issues related with it.

  12. DrahtBot requested review from ryanofsky on Sep 13, 2024
  13. maflcko commented at 12:02 pm on September 13, 2024: member

    Regarding the abort on CI specifically would be very useful as well in order to catch the issues earlier, not sure about the cons on test-only builds

    It should be trivial to extend the check to reject %n, but I still don’t think it matters much either way, and it would be better in a separate pull request anyway.

  14. ryanofsky approved
  15. ryanofsky commented at 2:50 pm on September 13, 2024: contributor

    Code review ACK fabd78e2e30e557e04739e29c9c221ad47245df1. This is a simple change that should be useful for developers.


    Do we know if this change affects compile time in any noticeable way? Might be worth checking since there are a lot of print statements, and compiler is now doing a little more work for each of them.


    It might also be worth updating description to give a sense of what errors look like. For me they look like:

    0src/util/string.h: In function int main(int, char**):
    1src/bitcoind.cpp:265:5:   in constexpr expansion of util::ConstevalFormatString<1>(((const char*)"Hi %s %s"))
    2src/util/string.h:38:98:   in constexpr expansion of util::ConstevalFormatString<1>::Detail_CheckNumFormatSpecifiers(std::basic_string_view<char>(((const char*)((util::ConstevalFormatString<1>*)this)->util::ConstevalFormatString<1>::fmt)))
    3src/util/string.h:78:34: error: expression <throw-expression> is not a constant expression
    4   78 |         if (num_params != count) throw "Format specifier count must match the argument count!";
    5      |                                  ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    

    To be clear about dropped commit fa0383761696d2c5c2e88208676cc993694fc1d4, my concern is that I tend to add a lot of hasty print statements when debugging, and am usually debugging in debug mode, and would find it pretty annoying if I now have to go through potentially multiple unnecessary build/run cycles just because there is a new crash when hitting debug prints that don’t use the right specifiers. This may not be a real issue depending on how reliable compile-time checking is, but I don’t know. In any case, it’s a change better done separately to keep this PR focused.

  16. in src/logging.h:248 in fabd78e2e3 outdated
    244@@ -245,22 +245,22 @@ static inline bool LogAcceptCategory(BCLog::LogFlags category, BCLog::Level leve
    245 /** Return true if str parses as a log category and set the flag */
    246 bool GetLogCategory(BCLog::LogFlags& flag, std::string_view str);
    247 
    248-template <typename... Args>
    249-static inline void LogPrintf_(std::string_view logging_function, std::string_view source_file, const int source_line, const BCLog::LogFlags flag, const BCLog::Level level, const char* fmt, const Args&... args)
    250+template <typename... Params>
    


    hodlinator commented at 2:00 pm on September 17, 2024:

    nit: The Params name was already used in scriptpubkeyman.h and wallet.h before the PR but Args is used in util/string.h, index/base.h/cpp, netbase.cpp and used to be used here. Why switch and touch 2 additional lines?

    The old way feels more consistent with the C “varargs”. But maybe there’s a convincing argument for “Params”.

    Did the change locally and it compiles without causing any non-obvious issue.


    maflcko commented at 4:23 pm on September 17, 2024:
    Ok, done
  17. in src/logging.h:249 in fabd78e2e3 outdated
    250-// peer can fill up a user's disk with debug.log entries.
    251-
    252-template <typename... Args>
    253-static inline void LogPrintf_(std::string_view logging_function, std::string_view source_file, const int source_line, const BCLog::LogFlags flag, const BCLog::Level level, const char* fmt, const Args&... args)
    254+template <typename... Params>
    255+inline void LogPrintFormatInternal(std::string_view logging_function, std::string_view source_file, const int source_line, const BCLog::LogFlags flag, const BCLog::Level level, util::ConstevalFormatString<sizeof...(Params)> fmt, const Params&... params)
    


    hodlinator commented at 2:12 pm on September 17, 2024:

    nit: Line grows from 209 to 253 chars, please consider inserting line break(s).

    0inline void LogPrintFormatInternal(std::string_view logging_function, std::string_view source_file, const int source_line,
    1    const BCLog::LogFlags flag, const BCLog::Level level, util::ConstevalFormatString<sizeof...(Params)> fmt, const Params&... params)
    

    maflcko commented at 4:23 pm on September 17, 2024:
  18. hodlinator approved
  19. hodlinator commented at 2:29 pm on September 17, 2024: contributor

    ACK fabd78e2e30e557e04739e29c9c221ad47245df1

    Concept: Great to have compile time format string error checking on the common logging functions!

    Tested changing this in logging.h:

    0- log_msg = tfm::format(fmt, params...);
    1+ log_msg = tfm::format("%s %s\n", 1.1f);
    

    and ran with expected error output.

    Agree with ryanofsky that it would be good to document rough compile time impact if any.

  20. log: Use ConstevalFormatString
    This changes all logging (including the wallet logging) to produce a
    ConstevalFormatString at compile time, so that the format string can be
    validated at compile-time.
    
    Also, while touching the wallet logging, avoid a copy of the template
    Params by using const Params&.
    facbcd4cef
  21. maflcko force-pushed on Sep 17, 2024
  22. maflcko commented at 4:48 pm on September 17, 2024: member

    Agree with ryanofsky that it would be good to document rough compile time impact if any.

    Sure added a note that it is less than 1% of both memory and time. The time should be offset by dropping the corresponding stuff from the Python linter, which is slower than a compiler at iterating over strings.

  23. hodlinator approved
  24. hodlinator commented at 9:28 pm on September 17, 2024: contributor

    re-ACK facbcd4cef8890ae18976fb53b67ea56b3c04454

    git range-diff master fabd78e facbcd4: Undid Args -> Params rename based on nit.

  25. DrahtBot requested review from pablomartin4btc on Sep 17, 2024
  26. DrahtBot requested review from ryanofsky on Sep 17, 2024
  27. in src/logging.h:257 in facbcd4cef
    254         try {
    255             log_msg = tfm::format(fmt, args...);
    256         } catch (tinyformat::format_error& fmterr) {
    257             /* Original format string will have newline so don't add one here */
    258-            log_msg = "Error \"" + std::string(fmterr.what()) + "\" while formatting log message: " + fmt;
    259+            log_msg = "Error \"" + std::string{fmterr.what()} + "\" while formatting log message: " + fmt.fmt;
    


    l0rinc commented at 11:04 am on September 18, 2024:

    nit: maybe we could dogfood this and use:

    0            log_msg = tfm::format("Error \"%s\" while formatting log message: %s", fmterr.what(), fmt.fmt);
    

    maflcko commented at 11:41 am on September 18, 2024:
    No. It seems fragile to fall back to tinyformat when a tinyformat error occurred. Even if unlikley, that just seems like it could lead to another tfm exception, rendering this whole code useless in the first place.

    l0rinc commented at 11:47 am on September 18, 2024:
    I understand (and somewhat share) the reasoning, but doesn’t this mean that we don’t trust tfm::format even in a fixed case? E.g. isn’t it possible after this call for log_msg to contain dangling format specifiers, which would be passed on to LogInfo anyway?

    maflcko commented at 11:55 am on September 18, 2024:

    doesn’t this mean that we don’t trust tfm::format even in a fixed case?

    See #30889 (comment), but this should never happen. So I don’t think micro-optimizing this is worth it.


    l0rinc commented at 12:04 pm on September 18, 2024:
    Ok, what about the case when fmterr.what() contains formatting, like %s?

    maflcko commented at 12:53 pm on September 18, 2024:

    Ok, what about the case when fmterr.what() contains formatting, like %s?

    It may or may not contain it. It doesn’t matter either way. Also, I am not changing that (or any behavior). So a separate issue or pull request seems better.

  28. in src/wallet/scriptpubkeyman.h:259 in facbcd4cef
    253@@ -254,9 +254,9 @@ class ScriptPubKeyMan
    254 
    255     /** Prepends the wallet name in logging output to ease debugging in multi-wallet use cases */
    256     template <typename... Params>
    257-    void WalletLogPrintf(const char* fmt, Params... parameters) const
    258+    void WalletLogPrintf(util::ConstevalFormatString<sizeof...(Params)> wallet_fmt, const Params&... params) const
    259     {
    260-        LogPrintf(("%s " + std::string{fmt}).c_str(), m_storage.GetDisplayName(), parameters...);
    261+        LogInfo("%s %s", m_storage.GetDisplayName(), tfm::format(wallet_fmt, params...));
    


    l0rinc commented at 11:11 am on September 18, 2024:
    Should this always be info? Some WalletLogPrintf usages seem like warnings, or errors.

    maflcko commented at 11:41 am on September 18, 2024:
    A refactor is not allowed to change behavior. If you want to change behavior, a separate issue or pull request seems more appropriate.

    maflcko commented at 11:45 am on September 18, 2024:
    Added “This refactor does not change behavior of the compiled executables.” to the pull description and applied the label.

    l0rinc commented at 11:53 am on September 18, 2024:
    That’s my concern as well, it seems to me we’re changing behavior (or at least meaning) here by swapping a neutral LogPrintf to Info level logging, when in fact it’s sometimes a warning or error.

    maflcko commented at 12:02 pm on September 18, 2024:

    LogInfo is neutral, because there is no level. See GetLogPrefix:

    0    // If there is no category, Info is implied
    1    if (!has_category && level == Level::Info) return {};
    

    In any case, it is literally a macro alias:

    0// Deprecated unconditional logging.
    1#define LogPrintf(...) LogInfo(__VA_ARGS__)
    

    l0rinc commented at 12:06 pm on September 18, 2024:
    Got it, thanks. Would it make sense to add an optional log-level parameter here to be able to set some of those logs to errors/warns in a future PR?

    maflcko commented at 12:52 pm on September 18, 2024:

    Would it make sense … in a future PR?

    I don’t know and I don’t think it matters much one way or the other.

    Got it, thanks.

    Ok, resolving this for now.

  29. in src/wallet/wallet.h:932 in facbcd4cef
    926@@ -927,9 +927,9 @@ class CWallet final : public WalletStorage, public interfaces::Chain::Notificati
    927 
    928     /** Prepends the wallet name in logging output to ease debugging in multi-wallet use cases */
    929     template <typename... Params>
    930-    void WalletLogPrintf(const char* fmt, Params... parameters) const
    931+    void WalletLogPrintf(util::ConstevalFormatString<sizeof...(Params)> wallet_fmt, const Params&... params) const
    932     {
    933-        LogPrintf(("%s " + std::string{fmt}).c_str(), GetDisplayName(), parameters...);
    934+        LogInfo("%s %s", GetDisplayName(), tfm::format(wallet_fmt, params...));
    


    l0rinc commented at 11:23 am on September 18, 2024:
    clever workaround

    stickies-v commented at 11:02 am on September 19, 2024:

    By not letting LogInfo do the formatting, there’s no tinyformat::format_error error handling anymore. This makes the code less robust to the malformed format strings we do not catch at compile time. For example, with this diff:

     0diff --git a/src/wallet/wallet.cpp b/src/wallet/wallet.cpp
     1index 83e96adf07..e6a2576f0a 100644
     2--- a/src/wallet/wallet.cpp
     3+++ b/src/wallet/wallet.cpp
     4@@ -3822,7 +3822,7 @@ void CWallet::LoadActiveScriptPubKeyMan(uint256 id, OutputType type, bool intern
     5     // Legacy wallets have only one ScriptPubKeyManager and it's active for all output and change types.
     6     Assert(IsWalletFlagSet(WALLET_FLAG_DESCRIPTORS));
     7 
     8-    WalletLogPrintf("Setting spkMan to active: id = %s, type = %s, internal = %s\n", id.ToString(), FormatOutputType(type), internal ? "true" : "false");
     9+    WalletLogPrintf("Setting spkMan to active: id = %s, type = %s, internal = %*s\n", id.ToString(), FormatOutputType(type), internal ? "true" : "false");
    10     auto& spk_mans = internal ? m_internal_spk_managers : m_external_spk_managers;
    11     auto& spk_mans_other = internal ? m_external_spk_managers : m_internal_spk_managers;
    12     auto spk_man = m_spk_managers.at(id).get();
    

    on master, this would still allow a wallet to be loaded (with just the logging failing):

     0% bitcoin-cli loadwallet test
     1{
     2  "name": "test"
     3}
     4
     5...
     6
     72024-09-19T10:32:57Z init message: Loading wallet
     82024-09-19T10:32:57Z [test] Wallet file version = 10500, last client version = 289900
     92024-09-19T10:32:57Z [test] Descriptors: 8, Descriptor Keys: 1 plaintext, 0 encrypted, 1 total.
    102024-09-19T10:32:57Z Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: %s Setting spkMan to active: id = %s, type = %s, internal = %*s
    112024-09-19T10:32:57Z Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: %s Setting spkMan to active: id = %s, type = %s, internal = %*s
    122024-09-19T10:32:57Z Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: %s Setting spkMan to active: id = %s, type = %s, internal = %*s
    132024-09-19T10:32:57Z Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: %s Setting spkMan to active: id = %s, type = %s, internal = %*s
    142024-09-19T10:32:57Z Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: %s Setting spkMan to active: id = %s, type = %s, internal = %*s
    152024-09-19T10:32:57Z Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: %s Setting spkMan to active: id = %s, type = %s, internal = %*s
    162024-09-19T10:32:57Z Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: %s Setting spkMan to active: id = %s, type = %s, internal = %*s
    172024-09-19T10:32:57Z Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: %s Setting spkMan to active: id = %s, type = %s, internal = %*s
    182024-09-19T10:32:57Z [test] Wallet completed loading in             281ms
    

    whereas on facbcd4cef8890ae18976fb53b67ea56b3c04454, you get:

     0% bitcoin-cli loadwallet test
     1error code: -4
     2error message:
     3Wallet loading failed. Error loading <...>/signet/wallets/test/wallet.dat: Wallet corrupted
     4
     5...
     6
     72024-09-19T11:00:08Z init message: Loading wallet
     82024-09-19T11:00:08Z [test] Wallet file version = 10500, last client version = 289900
     92024-09-19T11:00:09Z [test] Descriptors: 8, Descriptor Keys: 1 plaintext, 0 encrypted, 1 total.
    102024-09-19T11:00:09Z [test] Releasing wallet test..
    

    Similar concern in scriptpubkeyman.h.


    maflcko commented at 11:37 am on September 19, 2024:

    Good point and good catch. This should be trivial to fix by using a macro approach to concatenate the two format strings. (In C++ "a" "b" is equivalent to "ab"). A macro approach was my initial implementation when I wrote this in July, but the benefit wasn’t clear, so I switched to this. Basically,

    • Breaking this requires an intentionally malformed format string, such as %n or %*s, which may or may not be caught during review.
    • The wallet logging is unconditional, so this requires the code to be completely untested. Otherwise tests would hopefully catch the intentional malformed format string before review.
    • There are many other places where tinyformat is used without a catch, so it is unclear why it should be checked here, but not in the other places.

    But again, it should be trivial to add this back, if reviewers think this would be useful to add back.


    stickies-v commented at 2:50 pm on September 19, 2024:

    Addressed in #30928.

    You’re right that the unconditional logging should make these errors quite likely to be caught in review, and that there is no consistency in when we catch formatting exceptions or not. In #30928 I tried to address the latter by enforcing run-time exceptions are handled whenever we use util::ConstevalFormatString, which should correlate with needing to prioritize run-time robustness.

  30. in src/logging.h:249 in facbcd4cef
    249-// unconditionally log to debug.log! It should not be the case that an inbound
    250-// peer can fill up a user's disk with debug.log entries.
    251-
    252 template <typename... Args>
    253-static inline void LogPrintf_(std::string_view logging_function, std::string_view source_file, const int source_line, const BCLog::LogFlags flag, const BCLog::Level level, const char* fmt, const Args&... args)
    254+inline void LogPrintFormatInternal(std::string_view logging_function, std::string_view source_file, const int source_line, const BCLog::LogFlags flag, const BCLog::Level level, util::ConstevalFormatString<sizeof...(Args)> fmt, const Args&... args)
    


    l0rinc commented at 11:29 am on September 18, 2024:
    is there a conceptual difference between Args (used here) and Params (used in WalletLogPrintf)?

    maflcko commented at 11:44 am on September 18, 2024:
    No

    l0rinc commented at 11:54 am on September 18, 2024:
    So can we unify the two and use only one to avoid confusion?

    maflcko commented at 11:59 am on September 18, 2024:
    I don’t want to touch lines in this pull request that don’t have to be touched. Also, I don’t want to increase the scope here.
  31. l0rinc approved
  32. l0rinc commented at 11:29 am on September 18, 2024: contributor
    ACK facbcd4cef8890ae18976fb53b67ea56b3c04454
  33. maflcko added the label Refactoring on Sep 18, 2024
  34. ryanofsky approved
  35. ryanofsky commented at 10:14 am on September 19, 2024: contributor

    Code review ACK facbcd4cef8890ae18976fb53b67ea56b3c04454

    Only change since last review is tweaking log function declaration

  36. pablomartin4btc approved
  37. pablomartin4btc commented at 10:58 am on September 19, 2024: member

    re-ACK facbcd4cef8890ae18976fb53b67ea56b3c04454

    Checked the code changes since my last review.

  38. stickies-v commented at 11:16 am on September 19, 2024: contributor

    Approach ACK and code LGTM facbcd4cef8890ae18976fb53b67ea56b3c04454 modulo a tinyformat::format_error concern.

    Very elegant way to force more compile-time format string checks with minimal code overhaul, I really like this.

  39. fanquake merged this on Sep 19, 2024
  40. fanquake closed this on Sep 19, 2024

  41. maflcko deleted the branch on Sep 19, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-09-29 04:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me