tinyformat: refactor: increase compile-time checks and don’t throw for tfm::format_error #30928

pull stickies-v wants to merge 4 commits into bitcoin:master from stickies-v:2024-09/try-format changing 15 files +144 −130
  1. stickies-v commented at 2:45 pm on September 19, 2024: contributor

    Avoid unexpected run-time crashes from string formatting by enforcing compile-time format string checks for most* tfm::format and strprintf usage, and by returning an error string instead of throwing for a run-time tfm::format_error so callsites don’t need to implement this error handling.

    This PR should introduce no behaviour change. The main changes are:

    • remove the const std::string& tfm::format overload since it’s not necessary. Usage of this overload is removed by one of:
      • replacing string concatenation in the format string with just an extra parameter
      • using the bilingual_str overload
      • using the tfm::format_raw functions (only for tinyformat implementation or tests)
    • rename the non-compile-time-validated tfm::format overloads to tfm::format_raw, so existing callsites by default use the safer util::ConstevalFormatString tfm::format overloads. Callsites that for some reason don’t pass the compile-time checks (such as in bitcoin-cli.cpp) can use tfm::format_raw.
    • make tfm::format_error internal to tinyformat and just return error strings instead, so callsites don’t need to implement error handling to prevent crashing.

    *The bilingual_str format(const bilingual_str& fmt, const Args&... args) is not yet compile-time checked, which means the lint-format-strings.py test can’t be removed yet.

    Follow-up on #30889 (review), an alternative to #15926.

  2. DrahtBot commented at 2:45 pm on September 19, 2024: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage

    For detailed information about the code coverage, see the test coverage report.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, l0rinc
    Concept ACK hodlinator

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #31061 (refactor: Check translatable format strings at compile-time by maflcko)
    • #31042 (build: Rename PACKAGE_* variables to CLIENT_* by hebasto)
    • #30997 (build: Switch to Qt 6 by hebasto)
    • #30930 (netinfo: add peer services column and outbound-only option by jonatack)
    • #30221 (wallet: Ensure best block matches wallet scan state by achow101)
    • #29936 (fuzz: wallet: add target for CreateTransaction by brunoerg)
    • #29418 (rpc: provide per message stats for global traffic via new RPC ‘getnetmsgstats’ by vasild)
    • #28710 (Remove the legacy wallet and BDB dependency by achow101)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label Utils/log/libs on Sep 19, 2024
  4. stickies-v force-pushed on Sep 19, 2024
  5. DrahtBot added the label CI failed on Sep 19, 2024
  6. DrahtBot commented at 2:55 pm on September 19, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/30382700932

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  7. in src/util/string.h:258 in bb9b55f51c outdated
    255+        log_msg = tfm::format(fmt.fmt, args...);
    256+    } catch (tinyformat::format_error& fmterr) {
    257+        /* Original format string will have newline so don't add one here */
    258+        log_msg = "Error \"" + std::string{fmterr.what()} + "\" while formatting log message: \"" + fmt.fmt + "\"";
    259+    }
    260+    return log_msg;
    


    l0rinc commented at 3:02 pm on September 19, 2024:

    Since we have simple, atomic instructions in both branches after which we immediately return, we might as well return directly:

    0    try {
    1        return tfm::format(fmt.fmt, args...);
    2    } catch (tinyformat::format_error& fmterr) {
    3        /* Original format string will have newline so don't add one here */
    4        return "Error \""s + fmterr.what() + "\" while formatting log message: \"" + fmt.fmt + "\"";
    5    }
    

    Note: I’ve used using namespace std::string_literals to simplify "Error \""s + fmterr.what() concat here.


    stickies-v commented at 3:33 pm on September 19, 2024:
    Taken (minus string literals), thanks!
  8. in src/netbase.cpp:562 in ed4c60e38b outdated
    563-    if (manual_connection) {
    564-        LogPrintf("%s\n", error_message);
    565-    } else {
    566-        LogDebug(BCLog::NET, "%s\n", error_message);
    567-    }
    568+    auto log_level{manual_connection ? BCLog::Level::Info : BCLog::Level::Debug};
    


    maflcko commented at 3:06 pm on September 19, 2024:
    Not sure about the first commit. One is unconditionally logging and the other not. That is a pretty stark difference and would be better to leave on separate lines. In any case, this isn’t a refactor, because you drop the trailing \n, and it isn’t needed for the changes here, so it may be best to drop the first commit.

    stickies-v commented at 3:33 pm on September 19, 2024:
    Thanks for pointing out the behaviour change, I’ve dropped the commit.
  9. in src/netbase.cpp:563 in ed4c60e38b outdated
    564-        LogPrintf("%s\n", error_message);
    565-    } else {
    566-        LogDebug(BCLog::NET, "%s\n", error_message);
    567-    }
    568+    auto log_level{manual_connection ? BCLog::Level::Info : BCLog::Level::Debug};
    569+    LogPrintLevel(BCLog::NET, log_level, fmt, args...);
    


    l0rinc commented at 3:07 pm on September 19, 2024:
    Is the LogFlags::NONE -> BCLog::NET category in case of manual_connection a behavior change?

    stickies-v commented at 3:34 pm on September 19, 2024:
    It is indeed, thanks for pointing it out. Resolved by dropping the first commit and applying a minimal diff to netbase.cpp.
  10. in src/test/util_string_tests.cpp:91 in bb9b55f51c outdated
    83@@ -83,4 +84,12 @@ BOOST_AUTO_TEST_CASE(ConstevalFormatString_NumSpec)
    84     FailFmtWithError<1>("%1$", err_term);
    85 }
    86 
    87+BOOST_AUTO_TEST_CASE(tinyformat_try_format)
    88+{
    89+    // ensure invalid format strings don't throw at run-time
    


    l0rinc commented at 3:08 pm on September 19, 2024:
    nit: comment seems redundant here

    stickies-v commented at 3:37 pm on September 19, 2024:

    I don’t agree it’s redundant:

    • it’s not obvious from the tests that we’re not just checking error messages, but that we’re ensuring no errors are thrown, which is the difference with tfm::format()
    • additional tests to try_format may be added in the future that test other aspects, so this acts as a little delineator

    l0rinc commented at 5:54 pm on September 19, 2024:
    thanks for clarifying
  11. stickies-v force-pushed on Sep 19, 2024
  12. in src/util/string.h:249 in 6dc6e4583c outdated
    245+ * to handle the error.
    246+ */
    247 namespace tinyformat {
    248 template <typename... Args>
    249-std::string format(util::ConstevalFormatString<sizeof...(Args)> fmt, const Args&... args)
    250+std::string try_format(util::ConstevalFormatString<sizeof...(Args)> fmt, const Args&... args)
    


    maflcko commented at 3:37 pm on September 19, 2024:

    Not sure about this. tfm::format is the only remaining function preventing the removal of ./test/lint/lint-format-strings.py. I wanted to do this in a follow-up, but if this function is renamed, it will be harder, because all call sites will have to be renamed.

    My preference would be to leave the name as-is and instead just delete the unchecked one, or rename that one to format_maybe_throw (or similar).

    If you don’t mind, it would be good to delete ./test/lint/lint-format-strings.py in this pull request. Otherwise there will need to be another follow-up, or a possibly conflicting pull.


    stickies-v commented at 1:03 pm on September 20, 2024:

    I renamed it to try_format to be clear about the different behaviour compared to tfm::format, but agreed that avoiding merge conflicts is useful to coordinate on.

    My preference would be to leave the name as-is and instead just delete the unchecked one

    Could you elaborate on this a bit more? We have these 4 other format overloads, I don’t think you want to just remove them all? At least 2) and 4) would imply touching quite a bit of code to refactor, and it seems not all call sites of format are currently constant expressions?

    1. std::string format(const std::string &fmt, const Args&... args)
    2. std::string format(const char* fmt, const Args&... args)
    3. void format(std::ostream& out, const char* fmt, const Args&... args)
    4. bilingual_str format(const bilingual_str& fmt, const Args&... args)

    maflcko commented at 1:20 pm on September 20, 2024:

    IIUC the compiler will lookup format(const char*, ...) for format("%s, 1), so format(const char*, ...) would have to be deleted for the compiler to pick format(ConstevalFormatString, ...). (My recommendation for this pull would be to do at least this)

    IIUC the remaining format taking a std::string can be left as-is for now, as they are less common anyway than the ones taking a compile-time string literal. (Possibly something can be done for them as well, but it seems fine to leave them for a follow-up)

    If there is a need for the throwing format, it seems less churn to add a throwing_format than to switch everything else to try_format.

    But maybe I am missing something obvious?


    stickies-v commented at 1:49 pm on September 20, 2024:

    so format(const char*, ...) would have to be deleted for the compiler to pick format(ConstevalFormatString, ...)

    I think just deleting format(const char*, ...) leads to ambiguous overloading?

     0../../../../src/checkqueue.h:136:36: error: call to 'format' is ambiguous
     1                util::ThreadRename(strprintf("scriptch.%i", n));
     2                                   ^~~~~~~~~
     3../../../../src/tinyformat.h:1151:19: note: expanded from macro 'strprintf'
     4#define strprintf tfm::format
     5                  ^~~~~~~~~~~
     6../../../../src/tinyformat.h:1140:13: note: candidate function [with Args = <int>]
     7std::string format(const std::string &fmt, const Args&... args)
     8            ^
     9../../../../src/util/string.h:249:13: note: candidate function [with Args = <int>]
    10std::string format(util::ConstevalFormatString<sizeof...(Args)> fmt, const Args&... args)
    

    IIUC the remaining format taking a std::string can be left as-is for now

    That (and keeping the bilingual_str overload) would be my preference too, but you also asked to remove ./test/lint/lint-format-strings.py in this pull which I think we shouldn’t do before all tfm::format overloads are compile-time checked? So that’s why I’m a bit confused.


    maflcko commented at 1:57 pm on September 20, 2024:

    I think just deleting format(const char*, ...) leads to ambiguous overloading?

    Thanks for actually checking. I guess this means that the std::string overload would have to be renamed to avoid the overload problem. Hopefully only the fuzz tests and a few lines in real code use the string overload?

    you also asked to remove ./test/lint/lint-format-strings.py in this pull which I think we shouldn’t do before all tfm::format overloads are compile-time checked? So that’s why I’m a bit confused.

    Jup, something would have to be done to the std::string overloads. It should be possible and I can take a look at this later, however it hopefully shouldn’t be a blocker for this pull request either way, if the (temporary?) rename isn’t too involved. I just have the feeling that it is easier to rename the one taking std::string, than the one taking the consteval literal.


    maflcko commented at 4:47 pm on September 24, 2024:

    Hmm, so I took a look here and the following diff seems to compile (almost):

     0diff --git a/src/tinyformat.h b/src/tinyformat.h
     1index f536306375..079b0ccbff 100644
     2--- a/src/tinyformat.h
     3+++ b/src/tinyformat.h
     4@@ -142,6 +142,7 @@ namespace tfm = tinyformat;
     5 //------------------------------------------------------------------------------
     6 // Implementation details.
     7 #include <algorithm>
     8+#include <concepts> // Added for Bitcoin Core
     9 #include <iostream>
    10 #include <sstream>
    11 #include <stdexcept> // Added for Bitcoin Core
    12@@ -1056,7 +1057,7 @@ inline void vformat(std::ostream& out, const char* fmt, FormatListRef list)
    13 
    14 /// Format list of arguments to the stream according to given format string.
    15 template<typename... Args>
    16-void format(std::ostream& out, const char* fmt, const Args&... args)
    17+void format_raw(std::ostream& out, const char* fmt, const Args&... args) // Renamed for Bitcoin Core
    18 {
    19     vformat(out, fmt, makeFormatList(args...));
    20 }
    21@@ -1064,24 +1065,24 @@ void format(std::ostream& out, const char* fmt, const Args&... args)
    22 /// Format list of arguments according to the given format string and return
    23 /// the result as a string.
    24 template<typename... Args>
    25-std::string format(const char* fmt, const Args&... args)
    26+std::string format_raw(const char* fmt, const Args&... args) // Renamed for Bitcoin Core
    27 {
    28     std::ostringstream oss;
    29-    format(oss, fmt, args...);
    30+    format_raw(oss, fmt, args...);
    31     return oss.str();
    32 }
    33 
    34 /// Format list of arguments to std::cout, according to the given format string
    35 template<typename... Args>
    36-void printf(const char* fmt, const Args&... args)
    37+void printf_raw(const char* fmt, const Args&... args) // Renamed for Bitcoin Core
    38 {
    39-    format(std::cout, fmt, args...);
    40+    format_raw(std::cout, fmt, args...);
    41 }
    42 
    43 template<typename... Args>
    44-void printfln(const char* fmt, const Args&... args)
    45+void printfln_raw(const char* fmt, const Args&... args) // Renamed for Bitcoin Core
    46 {
    47-    format(std::cout, fmt, args...);
    48+    format_raw(std::cout, fmt, args...);
    49     std::cout << '\n';
    50 }
    51 
    52@@ -1146,11 +1147,12 @@ TINYFORMAT_FOREACH_ARGNUM(TINYFORMAT_MAKE_FORMAT_FUNCS)
    53 #endif
    54 
    55 // Added for Bitcoin Core
    56-template<typename... Args>
    57-std::string format(const std::string &fmt, const Args&... args)
    58+template <typename String, typename... Args>
    59+    requires std::same_as<String, std::string>
    60+std::string format(const String& fmt, const Args&... args)
    61 {
    62     std::ostringstream oss;
    63-    format(oss, fmt.c_str(), args...);
    64+    format_raw(oss, fmt.c_str(), args...);
    65     return oss.str();
    66 }
    67 
    68@@ -1160,4 +1162,6 @@ std::string format(const std::string &fmt, const Args&... args)
    69 /** Format arguments and return the string or write to given std::ostream (see tinyformat::format doc for details) */
    70 #define strprintf tfm::format
    71 
    72+#include <util/string.h> // Added for Bitcoin Core
    73+
    74 #endif // TINYFORMAT_H_INCLUDED
    75diff --git a/src/util/string.h b/src/util/string.h
    76index c5183d6c80..5a9b55602f 100644
    77--- a/src/util/string.h
    78+++ b/src/util/string.h
    79@@ -239,7 +239,12 @@ namespace tinyformat {
    80 template <typename... Args>
    81 std::string format(util::ConstevalFormatString<sizeof...(Args)> fmt, const Args&... args)
    82 {
    83-    return format(fmt.fmt, args...);
    84+    return tfm::format_raw(fmt.fmt, args...);
    85+}
    86+template <typename... Args>
    87+void format(std::ostream& out, util::ConstevalFormatString<sizeof...(Args)> fmt, const Args&... args)
    88+{
    89+    tfm::format_raw(out, fmt.fmt, args...);
    90 }
    91 } // namespace tinyformat
    92 
    

    The problem is that %*s are actually used in bitcoin-cli. However, in an unspecified form (%*s can not be used after using positional specifiers). I am not sure if it is worth it to change the format strings, or to adjust the consteval parser. So an easy fix would be to just call the unchecked c-string functions in bitcoin-cli, which I think is fine.


    stickies-v commented at 1:51 pm on September 25, 2024:

    Thanks for looking into this further! I was also exploring the concepts route, but hadn’t yet resolved all concerns.

    1. Re naming: having format overloads with significantly differently behaviour (throwing vs non-throwing) didn’t quite sit right with me, so I think your format_raw (or format_throw) suggestion is preferable, and I agree it makes more sense to rename the throwing variant instead of the non-throwing variant (although the downside is that it makes reviewing this PR less straightforward).
    2. forcing the util::ConstevalFormatString overload for all string literals either requires that: a) all callsites include util/string.h b) tinyformat.h includes util/string.h (as in your suggestion, although we should probably iwyu export it too), but this introduces a circular dependency c) the util::ConstevalFormatString overloads are moved to tinyformat.h, so the tinyformat.h dependency in util/string.h can be removed.

    I think c) is the preferable approach. It adds more core-specific LoC to tinyformat but I think it is the cleanest overall?

    So an easy fix would be to just call the unchecked c-string functions in bitcoin-cli, which I think is fine.

    I agree, although I think I’ll clean up the incorrect bitcoin-cli format strings here too since that seems like an easy win.


    maflcko commented at 2:13 pm on September 25, 2024:

    I think c) is the preferable approach. It adds more core-specific LoC to tinyformat but I think it is the cleanest overall?

    Yeah, probably. Maybe a large header // === Everything below was added for Bitcoin Core === in the end and then appending the Bitcoin Core stuff would be cleaner than inlining every function into the existing header with a comment by itself.


    stickies-v commented at 12:13 pm on September 26, 2024:
    Latest force-push incorporates what we’ve discussed here, and goes one step further by removing the const std::string& overload entirely. I inspected the callsites and none of them seemed necessary to have the overload (see updated PR description), so this felt like the better way forward.
  13. DrahtBot removed the label CI failed on Sep 19, 2024
  14. l0rinc approved
  15. l0rinc commented at 5:55 pm on September 19, 2024: contributor
    utACK 6dc6e4583c02f0d21dffeb6b2c81358b49a1f340
  16. stickies-v force-pushed on Sep 26, 2024
  17. stickies-v force-pushed on Sep 26, 2024
  18. DrahtBot commented at 12:06 pm on September 26, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/30701048188

    Make sure to run all tests locally, according to the documentation.

    The failure may happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  19. DrahtBot added the label CI failed on Sep 26, 2024
  20. stickies-v renamed this:
    util: refactor: add and use run-time safe tinyformat::try_format
    tinyformat: refactor: increase compile-time checks and don't throw for tfm::format_error
    on Sep 26, 2024
  21. stickies-v commented at 12:16 pm on September 26, 2024: contributor

    Latest force-push increases the scope of this PR, see updated PR description for an overview. In a nutshell:

    • all non-bilingual_str format/strprintf usage is now compile-time checked. Format errors are returned as strings instead of thrown as error, but throwing behaviour can be maintained through tfm::format_raw.
    • the const std::string& format overload is removed because it’s unnecessary

    Thanks a lot @maflcko and @l0rinc for your review!

  22. in src/tinyformat.h:1174 in ed57f6ee02 outdated
    1171 {
    1172-    return format_raw(out, fmt.fmt, args...);
    1173+    try {
    1174+        tfm::format_raw(out, fmt.fmt, args...);
    1175+    } catch (tinyformat::format_error& fmterr) {
    1176+        out << "Error \"" + std::string{fmterr.what()} + "\" while formatting log message: \"" + fmt.fmt + "\"";
    


    maflcko commented at 12:21 pm on September 26, 2024:

    nit in ed57f6ee02d5a6ada23abaea6e2176cd4f173e4f: Would be nice to do it in tinyformat itself, or not at all. ConstevalFormatString is already checked, so extremely unlikely to hit. However, translated strings (which are apparently unreviewed, see #30897) are left unchecked and may throw.

    It would be good to validate and review the translated strings, or remove translations completely for now, but this is probably a separate issue.

    Here, it would be good to apply the try-catch to them.


    stickies-v commented at 10:51 am on September 27, 2024:

    I assumed the next step would be to validate the bilingual_str overload at compile-time too, but I’ve been looking at implementing that and with the run-time translation dependency (QCoreApplication::translate) it doesn’t really seem feasible.

    I hadn’t considered that translations can introduce format errors too, which makes the case for bilingual_str format errors not throwing ever stronger indeed. I was worried that blanket catching all formatting errors at the tfm level could be considered controversial so went for a throwing format_raw and non-throwing format, but perhaps just never having throwing behaviour makes for a more consistent and safer interface, so I’ll move the try/catch to tfm::vformat now.

  23. maflcko approved
  24. maflcko commented at 12:22 pm on September 26, 2024: member

    lgtm

    Thanks for taking the feedback

  25. DrahtBot removed the label CI failed on Sep 26, 2024
  26. stickies-v force-pushed on Sep 27, 2024
  27. stickies-v commented at 11:18 am on September 27, 2024: contributor
    Force-pushed to address @maflcko’s comment about generally not letting tinyformat throw format errors, instead of just for the util::ConstevalFormatString overloads. Most notably, this changes behaviour for the bilingual_str format overload to also not throw.
  28. maflcko approved
  29. maflcko commented at 11:28 am on September 27, 2024: member
    lgtm
  30. in src/wallet/test/fuzz/notifications.cpp:67 in 5d69f2d33b outdated
    63@@ -64,7 +64,7 @@ void ImportDescriptors(CWallet& wallet, const std::string& seed_insecure)
    64 
    65     for (const std::string& desc_fmt : DESCS) {
    66         for (bool internal : {true, false}) {
    67-            const auto descriptor{(strprintf)(desc_fmt.c_str(), "[5aa9973a/66h/4h/2h]" + seed_insecure, int{internal})};
    68+            const auto descriptor{(tfm::format_raw)(desc_fmt.c_str(), "[5aa9973a/66h/4h/2h]" + seed_insecure, int{internal})};
    


    maflcko commented at 1:01 pm on September 27, 2024:
    nit in 5d69f2d33b64a025d0b93b81558f2f51bd54b104: I think the () are no longer needed to silence the linter and can be removed: {tfm::format_raw(desc_fmt.c_str(), ...}

    stickies-v commented at 2:49 pm on September 27, 2024:
    Ah, I wasn’t sure why that was. Fuzzer compiles and linter seems happy so updated, thanks!
  31. maflcko approved
  32. maflcko commented at 1:07 pm on September 27, 2024: member

    review ACK bb12305737e3c22325930b5bdb04028f1a6f2ef6 💅

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: review ACK bb12305737e3c22325930b5bdb04028f1a6f2ef6 💅
    3Fge/JMP/cEBpCNfrVWYhr7mIRY4zXGiZRkBzWtbVouQ4q6WPnKlif9AdGqIb4Jrmkp/sIcvaECTG9QyCy1/mBA==
    
  33. DrahtBot requested review from l0rinc on Sep 27, 2024
  34. stickies-v force-pushed on Sep 27, 2024
  35. in doc/developer-notes.md:955 in 2d4f82d590 outdated
    949@@ -950,7 +950,9 @@ Strings and formatting
    950 
    951   - Do not use it when passing strings to `tfm::format`, `strprintf`, `LogInfo`, `LogDebug`, etc.
    952 
    953-    - *Rationale*: This is redundant. Tinyformat handles strings.
    954+    - *Rationale*: Tinyformat handles string parameters. Format strings
    955+      should be known at compile-time, so using a string literal or
    956+      `constexpr const char*` allows for compile-time validation.
    


    l0rinc commented at 2:55 pm on September 27, 2024:

    nit: we could avoid mentioning compile-time twice:

    0    - *Rationale*: Tinyformat handles string parameters.
    1      Using a string literal or `constexpr const char*` ensures
    2      format strings are validated at compile time.
    

    stickies-v commented at 9:53 am on September 30, 2024:
    I don’t think this is an improvement, I find it less clear about the distinction between the compile-time format string and the run-time string parameters, so I’m going to leave this as is for now.

    l0rinc commented at 10:31 am on September 30, 2024:
    Not sure why it’s less clear than before, but it’s just a nit from my part anyway

    hodlinator commented at 10:10 am on October 1, 2024:

    Agree the current message is confusing. The distinction between format string and parameters to the format string should ideally be made clearer, as well as the actual string type.

    0    - *Rationale*: Tinyformat handles `std::string` parameters. Format strings
    1      on the other hand should be known at compile-time, so using a string literal or
    2      `constexpr const char*` allows for compile-time validation.
    

    stickies-v commented at 10:17 am on October 2, 2024:
    I used std::string at first too, but then it was inconsistent with the rest of the documentation, which I think is even more unclear. I’ll reconsider updating all std::string references in this block if I have to force-push again unless you think this is blocking?

    maflcko commented at 10:39 am on October 2, 2024:

    If Tinyformat handles string parameters. is too controversial, I think it can be removed in the follow-up that removes the linter.

    Otherwise, if this is replaced with std::string, someone else is going to suggest to mention std::string_view as well.


    hodlinator commented at 1:30 pm on October 2, 2024:

    Changing tinyformat to support string_view is definitely out of scope for this PR.

    I would at least want to make it clearer that format strings are separate from other parameters, like this or with other words to that effect:

    0    - *Rationale*: Tinyformat handles string parameters. Format strings
    1      on the other hand should be known at compile-time, so using a string literal or
    2      `constexpr const char*` allows for compile-time validation.
    

    Even though I decided to drop into l0rinc’s thread here, I independently found it unclear when doing a review-pass on it.

    More precise suggestion, but not blocking:

    0    - *Rationale*: Tinyformat handles string parameters. Format strings
    1      on the other hand should (except for translations) be known at compile-time, so using a string literal or
    2      `constexpr const char*` allows for compile-time validation.
    

    stickies-v commented at 3:06 pm on October 2, 2024:

    I would at least want to make it clearer that format strings are separate from other parameters

    I agree with that, and I’ll incorporate your first suggestion (I don’t agree with the translations bit) if I have to force-push, thanks!

  36. in src/test/fuzz/strprintf.cpp:49 in 2d4f82d590 outdated
    47@@ -48,35 +48,32 @@ FUZZ_TARGET(str_printf)
    48     //
    49     // Upstream bug report: https://github.com/c42f/tinyformat/issues/70
    


    l0rinc commented at 2:58 pm on September 27, 2024:
    is this bug still possible now? If it is, consider updating the strprintf example in the comment

    stickies-v commented at 9:55 am on September 30, 2024:
    That seems unrelated to this PR though?

    l0rinc commented at 10:26 am on September 30, 2024:

    The comment states:

    0    // Avoid triggering the following crash bug:
    1    // * strprintf("%.1s", (char*)nullptr);
    

    but we’re not calling strprintf directly anymore, hence my comment


    maflcko commented at 10:32 am on September 30, 2024:

    but we’re not calling strprintf directly anymore, hence my comment

    strprintf is an alias for tfm::format, introduced by and for Bitcoin Core. This pull request does not touch the alias and does not change it. It exists before an after this pull request.


    l0rinc commented at 10:35 am on September 30, 2024:
    Ok, it’s not that important
  37. in src/clientversion.cpp:88 in 9b05c64dd0 outdated
    86@@ -87,15 +87,17 @@ std::string LicenseInfo()
    87 {
    88     const std::string URL_SOURCE_CODE = "<https://github.com/bitcoin/bitcoin>";
    89 
    90-    return CopyrightHolders(strprintf(_("Copyright (C) %i-%i").translated, 2009, COPYRIGHT_YEAR) + " ") + "\n" +
    91+    return CopyrightHolders(strprintf(_("Copyright (C) %i-%i"), 2009, COPYRIGHT_YEAR).translated + " ") + "\n" +
    


    l0rinc commented at 3:05 pm on September 27, 2024:

    I’m not sure I understand how this change is supposed to work.

    "Copyright (C) %i-%i" can be translated by checking the exact string from https://github.com/bitcoin/bitcoin/blob/master/src/qt/bitcoinstrings.cpp#L280, but how is "Copyright (C) 2009-2024" supposed to have a translated counterpart?


    stickies-v commented at 5:27 pm on September 27, 2024:
    "Copyright (C) %i-%i" remains the translateable string both before and after this change. The only difference is that we use the bilingual_str format overload instead of the const std::string& one.

    l0rinc commented at 3:39 pm on September 28, 2024:
    Got it, thanks for clarifying!
  38. in src/test/txrequest_tests.cpp:181 in 9b05c64dd0 outdated
    177@@ -178,9 +178,9 @@ class Scenario
    178             size_t real_total = runner.txrequest.Count(peer);
    179             size_t real_candidates = runner.txrequest.CountCandidates(peer);
    180             size_t real_inflight = runner.txrequest.CountInFlight(peer);
    181-            BOOST_CHECK_MESSAGE(real_total == total, strprintf("[" + comment + "] total %i (%i expected)", real_total, total));
    


    l0rinc commented at 3:06 pm on September 27, 2024:
    lol
  39. in src/tinyformat.h:148 in 1f2d0ffd56 outdated
    144@@ -145,6 +145,7 @@ namespace tfm = tinyformat;
    145 #include <iostream>
    146 #include <sstream>
    147 #include <stdexcept> // Added for Bitcoin Core
    148+#include <util/string.h> // Added for Bitcoin Core
    


    l0rinc commented at 3:10 pm on September 27, 2024:
    👍, should have been there all along
  40. in src/test/txrequest_tests.cpp:184 in 9b05c64dd0 outdated
    182-            BOOST_CHECK_MESSAGE(real_inflight == inflight, strprintf("[" + comment + "] inflight %i (%i expected)", real_inflight, inflight));
    183-            BOOST_CHECK_MESSAGE(real_candidates == candidates, strprintf("[" + comment + "] candidates %i (%i expected)", real_candidates, candidates));
    184+            BOOST_CHECK_MESSAGE(real_total == total, strprintf("[%s] total %i (%i expected)", comment, real_total, total));
    185+            BOOST_CHECK_MESSAGE(real_inflight == inflight, strprintf("[%s] inflight %i (%i expected)", comment, real_inflight, inflight));
    186+            BOOST_CHECK_MESSAGE(real_candidates == candidates, strprintf("[%s] candidates %i (%i expected)", comment, real_candidates, candidates));
    187             BOOST_CHECK_MESSAGE(ret == expected, "[" + comment + "] mismatching requestables");
    


    l0rinc commented at 3:14 pm on September 27, 2024:

    for consistency, consider using strprintf here as well:

    0            BOOST_CHECK_MESSAGE(ret == expected, strprintf("[%s] mismatching requestables", comment");
    

    Would be cool if we could use BOOST_CHECK_EQUAL_COLLECTIONS instead, but may not be trivial to do, haven’t investigated in detail.


    stickies-v commented at 9:41 am on September 30, 2024:

    I think BOOST_TEST_CONTEXT would simplify this whole block quite nicely (if we add left shift support to GenTxid), but it’s out of scope for this PR so I’m going not going to push that here. I’ll update l184 with a strprintf as per your suggestion.

     0diff --git a/src/test/txrequest_tests.cpp b/src/test/txrequest_tests.cpp
     1index a6c4a3613d..e70d9aa826 100644
     2--- a/src/test/txrequest_tests.cpp
     3+++ b/src/test/txrequest_tests.cpp
     4@@ -164,7 +164,6 @@ public:
     5         size_t completed, const std::string& checkname,
     6         std::chrono::microseconds offset = std::chrono::microseconds{0})
     7     {
     8-        const auto comment = m_testname + " " + checkname;
     9         auto& runner = m_runner;
    10         const auto now = m_now;
    11         assert(offset.count() <= 0);
    12@@ -178,10 +177,12 @@ public:
    13             size_t real_total = runner.txrequest.Count(peer);
    14             size_t real_candidates = runner.txrequest.CountCandidates(peer);
    15             size_t real_inflight = runner.txrequest.CountInFlight(peer);
    16-            BOOST_CHECK_MESSAGE(real_total == total, strprintf("[%s] total %i (%i expected)", comment, real_total, total));
    17-            BOOST_CHECK_MESSAGE(real_inflight == inflight, strprintf("[%s] inflight %i (%i expected)", comment, real_inflight, inflight));
    18-            BOOST_CHECK_MESSAGE(real_candidates == candidates, strprintf("[%s] candidates %i (%i expected)", comment, real_candidates, candidates));
    19-            BOOST_CHECK_MESSAGE(ret == expected, "[" + comment + "] mismatching requestables");
    20+            BOOST_TEST_CONTEXT(m_testname + " " + checkname) {
    21+                BOOST_CHECK_EQUAL(real_total, total);
    22+                BOOST_CHECK_EQUAL(real_inflight, inflight);
    23+                BOOST_CHECK_EQUAL(real_candidates, candidates);
    24+                BOOST_CHECK_EQUAL_COLLECTIONS(ret.begin(), red.end(), expected.begin(), expected.end());
    25+            }
    26         });
    27     }
    28 
    

    l0rinc commented at 10:30 am on September 30, 2024:
    wasn’t aware of BOOST_TEST_CONTEXT, would indeed be an improvement here
  41. in src/test/util_string_tests.cpp:88 in 2d4f82d590 outdated
    85@@ -83,4 +86,15 @@ BOOST_AUTO_TEST_CASE(ConstevalFormatString_NumSpec)
    86     FailFmtWithError<1>("%1$", err_term);
    87 }
    88 
    89+BOOST_AUTO_TEST_CASE(tinyformat_ConstevalFormatString)
    


    l0rinc commented at 4:05 pm on September 27, 2024:
    Unrelated: just realized that we should eventually add a test for "%1", since while ('0' <= *it && *it <= '9') { can go beyond the string and passes accidentally - but seems to fail in tinyformat, see: https://godbolt.org/z/nbTKnrqch. Added it in the patch, feel free to use it here, not worth adding a new PR for it

    stickies-v commented at 10:00 am on September 30, 2024:
    Looks like this is already covered in #30999, resolving this.

    l0rinc commented at 11:09 am on September 30, 2024:
    Yes, your review is welcome there as well
  42. in src/bitcoin-cli.cpp:543 in 2d4f82d590 outdated
    539@@ -540,15 +540,15 @@ class NetinfoRequestHandler : public BaseRequestHandler
    540         // Report detailed peer connections list sorted by direction and minimum ping time.
    541         if (DetailsRequested() && !m_peers.empty()) {
    542             std::sort(m_peers.begin(), m_peers.end());
    543-            result += strprintf("<->   type   net  v  mping   ping send recv  txn  blk  hb %*s%*s%*s ",
    544-                                m_max_addr_processed_length, "addrp",
    545-                                m_max_addr_rate_limited_length, "addrl",
    546-                                m_max_age_length, "age");
    547+            result += tfm::format_raw("<->   type   net  v  mping   ping send recv  txn  blk  hb %*s%*s%*s ",
    


    l0rinc commented at 4:29 pm on September 27, 2024:

    we could avoid formatting these by adding something like:

    0if (*it == '*' || ((*it == '_' || *it == '-') && *std::next(it) == '*')) {
    1    return; // We can't validate format string with %* or %_
    2}
    

    to Detail_CheckNumFormatSpecifiers , which would enable us using tfm::format here - maybe later we’ll implement full validation, but this way at least everything before the * format is validated properly.


    maflcko commented at 2:25 pm on September 28, 2024:
    I am not sure about “partial validation”. Either the string is fully validated, or not at all. Prepending a valid string to an invalid string doesn’t make it valid.

    l0rinc commented at 3:06 pm on September 28, 2024:

    Weren’t we already doing partial validation? Doing a full one for these examples wouldn’t be that difficult - do you suggest we do that instead?

    e.g. something like:

    0if (*it == '*' || ((*it == '_' || *it == '-') && *std::next(it) == '*')) {
    1    count_normal += 2; // width + value
    2    it += 2;
    3    continue;
    4}
    

    l0rinc commented at 4:08 pm on September 28, 2024:

    Alternatively, if that’s not desired, what if we replaced the dynamic widths with explicit PadLeft/PadRight calls, e.g.

    0result += tfm::format("<->   type   net  v  mping   ping send recv  txn  blk  hb %s%s%s ",
    1                      PadRight("addrp", m_max_addr_processed_length),
    2                      PadRight("addrl", m_max_addr_rate_limited_length),
    3                      PadRight("age", m_max_age_length));
    

    and

    0result += tfm::format("%s %s%s\n",
    1                      PadLeft("id", m_max_id_length),
    2                      IsAddressSelected() ? PadRight("address", m_max_addr_length) : "",
    3                      IsVersionSelected() ? "version" : "");
    

    etc. I can provide either of these in a follow-up, if the reviewers think it’s a good direction.

     0[[nodiscard]] inline std::string PadLeft(const std::string_view& str, size_t width, const char pad = ' ')
     1{
     2    if (str.size() >= width) return std::string(str);
     3    return std::string(width - str.size(), pad) + std::string(str);
     4}
     5
     6[[nodiscard]] inline std::string PadRight(const std::string_view& str, size_t width, const char pad = ' ')
     7{
     8    if (str.size() >= width) return std::string(str);
     9    return std::string(str) + std::string(width - str.size(), pad);
    10}
    


    stickies-v commented at 10:01 am on September 30, 2024:
    Resolving since it has its own PR now.
  43. in src/test/util_string_tests.cpp:91 in 2d4f82d590 outdated
    85@@ -83,4 +86,15 @@ BOOST_AUTO_TEST_CASE(ConstevalFormatString_NumSpec)
    86     FailFmtWithError<1>("%1$", err_term);
    87 }
    88 
    89+BOOST_AUTO_TEST_CASE(tinyformat_ConstevalFormatString)
    90+{
    91+    // ensure invalid format strings don't throw at run-time
    


    l0rinc commented at 4:29 pm on September 27, 2024:

    nit:

    0    // Ensure invalid format strings don't throw at run-time
    
  44. in src/test/util_string_tests.cpp:87 in 2d4f82d590 outdated
    85@@ -83,4 +86,15 @@ BOOST_AUTO_TEST_CASE(ConstevalFormatString_NumSpec)
    86     FailFmtWithError<1>("%1$", err_term);
    87 }
    88 
    


    l0rinc commented at 4:30 pm on September 27, 2024:

    We could add the examples from bitcoin-cli here:

    0BOOST_AUTO_TEST_CASE(ConstevalFormatString_SpecificUsageTests)
    1{
    2    PassFmt<6>("<->   type   net  v  mping   ping send recv  txn  blk  hb %*s%*s%*s ");
    3    PassFmt<5>("%*s %-*s%s\n");
    4    PassFmt<24>("%3s %6s %5s %2s%7s%7s%5s%5s%5s%5s  %2s %*s%*s%*s%*i %*s %-*s%s\n");
    5    PassFmt<2>("                        ms     ms  sec  sec  min  min                %*s\n\n");
    6    PassFmt<4>("\n%-*s    port %6i    score %6i");
    7    PassFmt<3>("%*s %s\n");
    8}
    

    stickies-v commented at 10:03 am on September 30, 2024:
    Addressed in #30999, resolving.
  45. l0rinc changes_requested
  46. l0rinc commented at 4:37 pm on September 27, 2024: contributor

    I have concerns about the translations, using raw format instead of adjusting the validator, and I think we could sneak in a fix for unterminated positionals.

      0Index: src/util/string.h
      1<+>UTF-8
      2===================================================================
      3diff --git a/src/util/string.h b/src/util/string.h
      4--- a/src/util/string.h	(revision 2d4f82d590845f9b2ea11bbcfe9377ef14df8775)
      5+++ b/src/util/string.h	(date 1727560531914)
      6@@ -51,12 +51,19 @@
      7                 ++it;
      8                 continue;
      9             }
     10+            if (*it == '*' || ((*it == '_' || *it == '-') && *std::next(it) == '*')) {
     11+                count_normal += 2; // width + value
     12+                it += 2;
     13+                continue;
     14+            }
     15 
     16             unsigned maybe_num{0};
     17             while ('0' <= *it && *it <= '9') {
     18                 maybe_num *= 10;
     19                 maybe_num += *it - '0';
     20-                ++it;
     21+                if (++it >= str.end()) throw "Format specifier incorrectly terminated by end of string";
     22             };
     23 
     24             if (*it == '$') {
     25Index: src/test/util_string_tests.cpp
     26<+>UTF-8
     27===================================================================
     28diff --git a/src/test/util_string_tests.cpp b/src/test/util_string_tests.cpp
     29--- a/src/test/util_string_tests.cpp	(revision 2d4f82d590845f9b2ea11bbcfe9377ef14df8775)
     30+++ b/src/test/util_string_tests.cpp	(date 1727560389361)
     31@@ -61,7 +61,6 @@
     32     // The `*` specifier behavior is unsupported and can lead to runtime
     33     // errors when used in a ConstevalFormatString. Please refer to the
     34     // note in the ConstevalFormatString docs.
     35-    PassFmt<1>("%*c");
     36     PassFmt<2>("%2$*3$d");
     37     PassFmt<1>("%.*f");
     38 
     39@@ -83,18 +82,27 @@
     40 
     41     auto err_term{"Format specifier incorrectly terminated by end of string"};
     42     FailFmtWithError<1>("%", err_term);
     43+    FailFmtWithError<1>("%1", err_term);
     44     FailFmtWithError<1>("%1$", err_term);
     45 }
     46 
     47+BOOST_AUTO_TEST_CASE(ConstevalFormatString_SpecificUsageTests)
     48+{
     49+    PassFmt<6>("<->   type   net  v  mping   ping send recv  txn  blk  hb %*s%*s%*s ");
     50+    PassFmt<5>("%*s %-*s%s\n");
     51+    PassFmt<24>( "%3s %6s %5s %2s%7s%7s%5s%5s%5s%5s  %2s %*s%*s%*s%*i %*s %-*s%s\n");
     52+    PassFmt<2>("                        ms     ms  sec  sec  min  min                %*s\n\n");
     53+    PassFmt<4>("\n%-*s    port %6i    score %6i");
     54+    PassFmt<3>("%*s %s\n");
     55+}
     56+
     57 BOOST_AUTO_TEST_CASE(tinyformat_ConstevalFormatString)
     58 {
     59-    // ensure invalid format strings don't throw at run-time
     60-    BOOST_CHECK_EQUAL(tfm::format("%*c", "dummy"), R"(Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: "%*c")");
     61+    // Ensure invalid format strings don't throw at run-time
     62     BOOST_CHECK_EQUAL(tfm::format("%2$*3$d", "dummy", "value"), R"(Error "tinyformat: Positional argument out of range" while formatting log message: "%2$*3$d")");
     63     std::ostringstream oss;
     64     tfm::format(oss, "%.*f", 5);
     65     BOOST_CHECK_EQUAL(oss.str(), R"(Error "tinyformat: Too many conversion specifiers in format string" while formatting log message: "%.*f")");
     66 }
     67 
     68-
     69 BOOST_AUTO_TEST_SUITE_END()
     70Index: src/bitcoin-cli.cpp
     71<+>UTF-8
     72===================================================================
     73diff --git a/src/bitcoin-cli.cpp b/src/bitcoin-cli.cpp
     74--- a/src/bitcoin-cli.cpp	(revision 2d4f82d590845f9b2ea11bbcfe9377ef14df8775)
     75+++ b/src/bitcoin-cli.cpp	(date 1727560250888)
     76@@ -540,42 +540,46 @@
     77         // Report detailed peer connections list sorted by direction and minimum ping time.
     78         if (DetailsRequested() && !m_peers.empty()) {
     79             std::sort(m_peers.begin(), m_peers.end());
     80-            result += tfm::format_raw("<->   type   net  v  mping   ping send recv  txn  blk  hb %*s%*s%*s ",
     81-                                      m_max_addr_processed_length, "addrp",
     82-                                      m_max_addr_rate_limited_length, "addrl",
     83-                                      m_max_age_length, "age");
     84+            result += tfm::format("<->   type   net  v  mping   ping send recv  txn  blk  hb %*s%*s%*s ",
     85+                                  m_max_addr_processed_length, "addrp",
     86+                                  m_max_addr_rate_limited_length, "addrl",
     87+                                  m_max_age_length, "age");
     88             if (m_is_asmap_on) result += " asmap ";
     89-            result += tfm::format_raw("%*s %-*s%s\n", m_max_id_length, "id", IsAddressSelected() ? m_max_addr_length : 0, IsAddressSelected() ? "address" : "", IsVersionSelected() ? "version" : "");
     90+            result += tfm::format("%*s %-*s%s\n",
     91+                                  m_max_id_length, "id",
     92+                                  IsAddressSelected() ? m_max_addr_length : 0,
     93+                                  IsAddressSelected() ? "address" : "",
     94+                                  IsVersionSelected() ? "version" : "");
     95             for (const Peer& peer : m_peers) {
     96                 std::string version{ToString(peer.version) + peer.sub_version};
     97-                result += tfm::format_raw(
     98-                    "%3s %6s %5s %2s%7s%7s%5s%5s%5s%5s  %2s %*s%*s%*s%*i %*s %-*s%s\n",
     99-                    peer.is_outbound ? "out" : "in",
    100-                    ConnectionTypeForNetinfo(peer.conn_type),
    101-                    peer.network,
    102-                    (peer.transport_protocol_type.size() == 2 && peer.transport_protocol_type[0] == 'v') ? peer.transport_protocol_type[1] : ' ',
    103-                    PingTimeToString(peer.min_ping),
    104-                    PingTimeToString(peer.ping),
    105-                    peer.last_send ? ToString(time_now - peer.last_send) : "",
    106-                    peer.last_recv ? ToString(time_now - peer.last_recv) : "",
    107-                    peer.last_trxn ? ToString((time_now - peer.last_trxn) / 60) : peer.is_tx_relay ? "" : "*",
    108-                    peer.last_blck ? ToString((time_now - peer.last_blck) / 60) : "",
    109-                    strprintf("%s%s", peer.is_bip152_hb_to ? "." : " ", peer.is_bip152_hb_from ? "*" : " "),
    110-                    m_max_addr_processed_length, // variable spacing
    111-                    peer.addr_processed ? ToString(peer.addr_processed) : peer.is_addr_relay_enabled ? "" : ".",
    112-                    m_max_addr_rate_limited_length, // variable spacing
    113-                    peer.addr_rate_limited ? ToString(peer.addr_rate_limited) : "",
    114-                    m_max_age_length, // variable spacing
    115-                    peer.age,
    116-                    m_is_asmap_on ? 7 : 0, // variable spacing
    117-                    m_is_asmap_on && peer.mapped_as ? ToString(peer.mapped_as) : "",
    118-                    m_max_id_length, // variable spacing
    119-                    peer.id,
    120-                    IsAddressSelected() ? m_max_addr_length : 0, // variable spacing
    121-                    IsAddressSelected() ? peer.addr : "",
    122-                    IsVersionSelected() && version != "0" ? version : "");
    123+                result += tfm::format("%3s %6s %5s %2s%7s%7s%5s%5s%5s%5s  %2s %*s%*s%*s%*i %*s %-*s%s\n",
    124+                                      peer.is_outbound ? "out" : "in",
    125+                                      ConnectionTypeForNetinfo(peer.conn_type),
    126+                                      peer.network,
    127+                                      (peer.transport_protocol_type.size() == 2 && peer.transport_protocol_type[0] == 'v') ? peer.transport_protocol_type[1] : ' ',
    128+                                      PingTimeToString(peer.min_ping),
    129+                                      PingTimeToString(peer.ping),
    130+                                      peer.last_send ? ToString(time_now - peer.last_send) : "",
    131+                                      peer.last_recv ? ToString(time_now - peer.last_recv) : "",
    132+                                      peer.last_trxn ? ToString((time_now - peer.last_trxn) / 60) : peer.is_tx_relay ? "" : "*",
    133+                                      peer.last_blck ? ToString((time_now - peer.last_blck) / 60) : "",
    134+                                      strprintf("%s%s", peer.is_bip152_hb_to ? "." : " ", peer.is_bip152_hb_from ? "*" : " "),
    135+                                      m_max_addr_processed_length, // variable spacing
    136+                                      peer.addr_processed ? ToString(peer.addr_processed) : peer.is_addr_relay_enabled ? "" : ".",
    137+                                      m_max_addr_rate_limited_length, // variable spacing
    138+                                      peer.addr_rate_limited ? ToString(peer.addr_rate_limited) : "",
    139+                                      m_max_age_length, // variable spacing
    140+                                      peer.age,
    141+                                      m_is_asmap_on ? 7 : 0, // variable spacing
    142+                                      m_is_asmap_on && peer.mapped_as ? ToString(peer.mapped_as) : "",
    143+                                      m_max_id_length, // variable spacing
    144+                                      peer.id,
    145+                                      IsAddressSelected() ? m_max_addr_length : 0, // variable spacing
    146+                                      IsAddressSelected() ? peer.addr : "",
    147+                                      IsVersionSelected() && version != "0" ? version : "");
    148             }
    149-            result += tfm::format_raw("                        ms     ms  sec  sec  min  min                %*s\n\n", m_max_age_length, "min");
    150+            result += tfm::format("                        ms     ms  sec  sec  min  min                %*s\n\n",
    151+                                  m_max_age_length, "min");
    152         }
    153 
    154         // Report peer connection totals by type.
    155@@ -624,7 +628,10 @@
    156                 max_addr_size = std::max(addr["address"].get_str().length() + 1, max_addr_size);
    157             }
    158             for (const UniValue& addr : local_addrs) {
    159-                result += tfm::format_raw("\n%-*s    port %6i    score %6i", max_addr_size, addr["address"].get_str(), addr["port"].getInt<int>(), addr["score"].getInt<int>());
    160+                result += tfm::format("\n%-*s    port %6i    score %6i",
    161+                                      max_addr_size, addr["address"].get_str(),
    162+                                      addr["port"].getInt<int>(),
    163+                                          addr["score"].getInt<int>());
    164             }
    165         }
    166 
    167@@ -1116,10 +1123,9 @@
    168         }
    169 
    170         for (const std::string& wallet : result["balances"].getKeys()) {
    171-            result_string += tfm::format_raw("%*s %s\n",
    172-                                             max_balance_length,
    173-                                             result["balances"][wallet].getValStr(),
    174-                                             wallet.empty() ? "\"\"" : wallet);
    175+            result_string += tfm::format("%*s %s\n",
    176+                                         max_balance_length, result["balances"][wallet].getValStr(),
    177+                                         wallet.empty() ? "\"\"" : wallet);
    178         }
    179         result_string += "\n";
    180     }
    181Index: src/test/txrequest_tests.cpp
    182<+>UTF-8
    183===================================================================
    184diff --git a/src/test/txrequest_tests.cpp b/src/test/txrequest_tests.cpp
    185--- a/src/test/txrequest_tests.cpp	(revision 2d4f82d590845f9b2ea11bbcfe9377ef14df8775)
    186+++ b/src/test/txrequest_tests.cpp	(date 1727560250869)
    187@@ -181,7 +181,7 @@
    188             BOOST_CHECK_MESSAGE(real_total == total, strprintf("[%s] total %i (%i expected)", comment, real_total, total));
    189             BOOST_CHECK_MESSAGE(real_inflight == inflight, strprintf("[%s] inflight %i (%i expected)", comment, real_inflight, inflight));
    190             BOOST_CHECK_MESSAGE(real_candidates == candidates, strprintf("[%s] candidates %i (%i expected)", comment, real_candidates, candidates));
    191-            BOOST_CHECK_MESSAGE(ret == expected, "[" + comment + "] mismatching requestables");
    192+            BOOST_CHECK_MESSAGE(ret == expected, strprintf("[%s] mismatching requestables", comment));
    193         });
    194     }
    
  47. in src/bitcoin-cli.cpp:551 in 2d4f82d590 outdated
    552-            result += strprintf("%*s %-*s%s\n", m_max_id_length, "id", IsAddressSelected() ? m_max_addr_length : 0, IsAddressSelected() ? "address" : "", IsVersionSelected() ? "version" : "");
    553+            result += tfm::format_raw("%*s %-*s%s\n", m_max_id_length, "id", IsAddressSelected() ? m_max_addr_length : 0, IsAddressSelected() ? "address" : "", IsVersionSelected() ? "version" : "");
    554             for (const Peer& peer : m_peers) {
    555                 std::string version{ToString(peer.version) + peer.sub_version};
    556-                result += strprintf(
    557+                result += tfm::format_raw(
    


    l0rinc commented at 5:24 pm on September 28, 2024:
    do we need to keep the strprintf("%s%s" on line 563?

    stickies-v commented at 10:02 am on September 30, 2024:
    I suppose we don’t, but that seems unrelated to this PR so going to leave that as is.

    l0rinc commented at 10:34 am on September 30, 2024:
    Can you expand on that? We’ve changed other strprintf instance here to enable compile-time validation, how come this one didn’t make the cut?

    stickies-v commented at 10:37 am on September 30, 2024:
    Hmm, then I don’t think I understand what you mean. Would you be able to provide a diff/example?

    l0rinc commented at 10:58 am on September 30, 2024:

    My mistake, rereading it it’s not clear what I was asking. So I think we could simplify the nested formats by pulling up the internal pattern: Having %2s in the main pattern and having a %s%s inside could be simplified to having %s%s at the parent and expand values in the parameters:

     0Index: src/bitcoin-cli.cpp
     1<+>UTF-8
     2===================================================================
     3diff --git a/src/bitcoin-cli.cpp b/src/bitcoin-cli.cpp
     4--- a/src/bitcoin-cli.cpp	(revision 2e33dbadcf69838c8ea591dccdc666f809c3134a)
     5+++ b/src/bitcoin-cli.cpp	(date 1727693490146)
     6@@ -549,7 +549,7 @@
     7             for (const Peer& peer : m_peers) {
     8                 std::string version{ToString(peer.version) + peer.sub_version};
     9                 result += tfm::format_raw(
    10-                    "%3s %6s %5s %2s%7s%7s%5s%5s%5s%5s  %2s %*s%*s%*s%*i %*s %-*s%s\n",
    11+                    "%3s %6s %5s %2s%7s%7s%5s%5s%5s%5s  %s%s %*s%*s%*s%*i %*s %-*s%s\n",
    12                     peer.is_outbound ? "out" : "in",
    13                     ConnectionTypeForNetinfo(peer.conn_type),
    14                     peer.network,
    15@@ -560,7 +560,8 @@
    16                     peer.last_recv ? ToString(time_now - peer.last_recv) : "",
    17                     peer.last_trxn ? ToString((time_now - peer.last_trxn) / 60) : peer.is_tx_relay ? "" : "*",
    18                     peer.last_blck ? ToString((time_now - peer.last_blck) / 60) : "",
    19-                    strprintf("%s%s", peer.is_bip152_hb_to ? "." : " ", peer.is_bip152_hb_from ? "*" : " "),
    20+                    peer.is_bip152_hb_to ? "." : " ",
    21+                    peer.is_bip152_hb_from ? "*" : " ",
    22                     m_max_addr_processed_length, // variable spacing
    23                     peer.addr_processed ? ToString(peer.addr_processed) : peer.is_addr_relay_enabled ? "" : ".",
    24                     m_max_addr_rate_limited_length, // variable spacing
    25Index: src/test/util_string_tests.cpp
    26<+>UTF-8
    27===================================================================
    28diff --git a/src/test/util_string_tests.cpp b/src/test/util_string_tests.cpp
    29--- a/src/test/util_string_tests.cpp	(revision 2e33dbadcf69838c8ea591dccdc666f809c3134a)
    30+++ b/src/test/util_string_tests.cpp	(date 1727693570815)
    31@@ -96,5 +96,10 @@
    32     BOOST_CHECK_EQUAL(oss.str(), R"(Error "tinyformat: Too many conversion specifiers in format string" while formatting log message: "%.*f")");
    33 }
    34 
    35+BOOST_AUTO_TEST_CASE(TwoCharFormattingTest)
    36+{
    37+    BOOST_CHECK_EQUAL(tfm::format("%2s", strprintf("%s%s", ".", "*")),
    38+                      tfm::format("%s%s", ".", "*"));
    39+}
    40 
    41 BOOST_AUTO_TEST_SUITE_END()
    

    (the dummy test was added just to make sure we’re doing it correctly, I don’t think it should be committed)


    stickies-v commented at 11:36 am on September 30, 2024:
    Thanks for providing the diff. There are probably plenty of opportunities to clean up the bitcoin-cli string printing, but this line is not touched and it does not relate to the goal of this PR, so it seems best to me to leave this as is.

    l0rinc commented at 11:46 am on September 30, 2024:
    Ok, thanks for checking
  48. stickies-v force-pushed on Sep 30, 2024
  49. stickies-v commented at 10:21 am on September 30, 2024: contributor
    Force-pushed to address two nits from @l0rinc - thanks for the review!
  50. maflcko commented at 10:35 am on September 30, 2024: member

    Only changes are some minor style-only test-only fixups.

    re-ACK 2e33dbadcf69838c8ea591dccdc666f809c3134a 🔰

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: re-ACK 2e33dbadcf69838c8ea591dccdc666f809c3134a 🔰
    3h0UdH750a2MwfQDDQ6yAM8tgC42014X4xvJwvJx2c8A+MKy2Ic1ul1lWUqsX8+xhu3EQRJWsZpa3GZbGzbsGCg==
    
  51. DrahtBot requested review from l0rinc on Sep 30, 2024
  52. in src/tinyformat.h:1056 in 2e33dbadcf outdated
    1048@@ -1049,7 +1049,12 @@ TINYFORMAT_FOREACH_ARGNUM(TINYFORMAT_MAKE_MAKEFORMATLIST)
    1049 /// list of format arguments is held in a single function argument.
    1050 inline void vformat(std::ostream& out, const char* fmt, FormatListRef list)
    1051 {
    1052-    detail::formatImpl(out, fmt, list.m_args, list.m_N);
    1053+    // Modified for Bitcoin Core to not throw for formatting errors
    1054+    try {
    1055+        detail::formatImpl(out, fmt, list.m_args, list.m_N);
    1056+    } catch (tinyformat::format_error& fmterr) {
    1057+        out << "Error \"" + std::string{fmterr.what()} + "\" while formatting log message: \"" + fmt + "\"";
    


    maflcko commented at 10:55 am on September 30, 2024:
    last commit: Not that it matters, but my preference would be to not change behavior here and keep the previous formatting. If fmt has a trailing newline, it would be “removed” by this patch and it would break the internal logging consistency of the debug log.

    stickies-v commented at 12:30 pm on September 30, 2024:

    Ah, good catch, thanks. Updated to remove the quotes and revert to existing behaviour, to avoid messing up the logs like e.g. the below:

    With quotes:

    02024-09-30T12:22:36Z Warning: Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: "Warning: %*s Support for testnet3 is deprecated and will be removed in an upcoming release. Consider switching to testnet4.
    1"Script verification uses 7 additional threads
    

    Without quotes:

    02024-09-30T12:21:18Z Warning: Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: Warning: %*s Support for testnet3 is deprecated and will be removed in an upcoming release. Consider switching to testnet4.
    12024-09-30T12:21:18Z Script verification uses 7 additional threads
    
  53. l0rinc commented at 11:48 am on September 30, 2024: contributor

    Most of my observations should rather be addressed in a separate PR (can be merged before or after, I don’t mind rebasing, would appreciate a review there). I’m fine with merging this one as is, thanks for the improvement.

    ACK 2e33dbadcf69838c8ea591dccdc666f809c3134a

  54. stickies-v force-pushed on Sep 30, 2024
  55. in src/test/util_string_tests.cpp:92 in c081b63777 outdated
    85@@ -83,4 +86,15 @@ BOOST_AUTO_TEST_CASE(ConstevalFormatString_NumSpec)
    86     FailFmtWithError<1>("%1$", err_term);
    87 }
    88 
    89+BOOST_AUTO_TEST_CASE(tinyformat_ConstevalFormatString)
    90+{
    91+    // Ensure invalid format strings don't throw at run-time.
    92+    BOOST_CHECK_EQUAL(tfm::format("%*c", "dummy"), R"(Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: "%*c")");
    


    l0rinc commented at 12:29 pm on September 30, 2024:
    you should probably update these tests as well
  56. stickies-v force-pushed on Sep 30, 2024
  57. stickies-v commented at 12:31 pm on September 30, 2024: contributor
    Force-pushed to revert the changed behaviour that wraps the format string into quotes, which doesn’t play nice with newlines.
  58. l0rinc commented at 12:31 pm on September 30, 2024: contributor
    ACK 4f33e9a85c3a8f546deddc2f78cf74bceff30315
  59. DrahtBot requested review from maflcko on Sep 30, 2024
  60. maflcko commented at 7:26 am on October 1, 2024: member

    Only change is #30928 (review)

    re-ACK 4f33e9a85c3a8f546deddc2f78cf74bceff30315 👾

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: re-ACK 4f33e9a85c3a8f546deddc2f78cf74bceff30315 👾
    3ouSYrq7MNfE0e0Yqx0hbAqK94p/zuOqik+Tjz2hCaVlVZ5IdU1RQWwbMq6dKpL6cn6Szqf0JvwuGbX7mt2ZYAA==
    
  61. hodlinator commented at 2:00 pm on October 1, 2024: contributor

    Concept ACK 4f33e9a85c3a8f546deddc2f78cf74bceff30315

    It would be cleaner in the implementation to use util::Result instead of overwriting the returned value with the error message, but it would mess up the call sites too much probably. Current approach in this PR seems one of the least worse ones.

    Passed locally: FUZZ=str_printf src/test/fuzz/fuzz ../qa-assets/fuzz_seed_corpus/str_printf/*.

    PR summary nit

    rename the throwing tfm::format functions to tfm::format_raw

    The last commit makes it so that none of the format_raw functions throw either? format_rawends up meaning “call with raw const char* fmt, not ConstevalFormatString<> fmt”.

  62. stickies-v commented at 10:28 am on October 2, 2024: contributor

    It would be cleaner in the implementation to use util::Result instead of overwriting the returned value with the error message

    I don’t think that’s a good approach for our codebase, even if there was minimal code churn. All format strings are known at compile-time, so any formatting errors are programming errors. Handling that with util::Result would be both a cumbersome interface and bad practice imo. We don’t need to let callsites handle formatting errors, they should never occur, but if/when they do after all we don’t want them to crash the application because they are harmless to program execution, until we fix the error in the code.

    The last commit makes it so that none of the format_raw functions throw either?

    Sorry, missed that line when updating the PR description, fixed now!

  63. maflcko commented at 10:36 am on October 2, 2024: member

    I agree that util::Result doesn’t make any sense here. Apart from the reasons mentioned, it would also make it harder to switch to std::format, as well as being confusing, because no other format lib I am aware of is doing that.

    • using the tfm::format_raw functions (only for tinyformat implementation or tests)

    In the PR description: I think this should also say or bitcoin-cli.

  64. stickies-v commented at 10:48 am on October 2, 2024: contributor

    In the PR description: I think this should also say or bitcoin-cli.

    bitcoin-cli wasn’t using the const std::string& overload which that bulletpoint is about. It is mentioned in the next bulletpoint about compile-time checks:

    Callsites that for some reason don’t pass the compile-time checks (such as in bitcoin-cli.cpp) can use tfm::format_raw.

  65. hodlinator commented at 1:53 pm on October 2, 2024: contributor

    stickies-v:

    All format strings are known at compile-time, so any formatting errors are programming errors. Handling that with util::Result would be both a cumbersome interface and bad practice imo.

    This is a good argument except we do have some strings only known at runtime, notably translations, which probably don’t get full testing. Maybe there’s something that automatically compares format strings for equivalence across translations though.

    maflcko:

    I agree that util::Result doesn’t make any sense here.

    (See above. I’d say it makes some sense).

    Apart from the reasons mentioned, it would also make it harder to switch to std::format, as well as being confusing, because no other format lib I am aware of is doing that.

    “errors as values” rather than errors as exceptions are generally a step in the right direction for runtime errors. Modern formatting libraries tend to use a lot of compile time checking though AFAIK, decreasing the need for runtime errors, so I’m happy about that direction in Bitcoin Core.

    One could argue that if we returned util::Result, most call sites would probably opt to print an error + the offending format string to the log anyway, so doing that as a blanket rule for everything is acceptable.

  66. maflcko commented at 2:17 pm on October 2, 2024: member

    This is a good argument except we do have some strings only known at runtime, notably translations, which probably don’t get full testing. Maybe there’s something that automatically compares format strings for equivalence across translations though.

    If translations are malformed to induce a tinyformat error where one wouldn’t be present in the non-translated string, I don’t think the solution would to use Result. This seems no different from an otherwise fully tinyformat-valid string that is completely malformed by the translation. Trying to recover from this error in the code at runtime is the wrong approach and it was specifically changed for this reason, see #30928 (review). A better approach would be to do it when updating the translations, or at build-time. But this can be done in a follow-up and shouldn’t be a blocker here.

    If you are referring to the fact that the non-translated format string in a translation may be malformed: They are still checked by the current python linter, so there shouldn’t be an issue either. If there is an issue, it should be fixed in the format string, not with a Result.

    One could argue that if we returned util::Result, most call sites would probably opt to print an error + the offending format string to the log anyway, so doing that as a blanket rule for everything is acceptable.

    Yes, indeed. I also can’t see a reason for a call-site to do anything else here.

  67. stickies-v commented at 3:02 pm on October 2, 2024: contributor

    This is a good argument except we do have some strings only known at runtime, notably translations

    I chose my wording “known at compile-time” carefully for exactly this reason: they’re known at compile-time, we just choose not to implement the (complete) validation logic because it much more complex than just returning an error string. It’s not worth arguing semantics too much, but in my view that’s very much still a programming error and not something that callsites should be aware of, let alone handle.

  68. ryanofsky commented at 6:12 pm on October 2, 2024: contributor

    Will not nack this pr since I don’t think consequences are great, but I would prefer not to see this PR merged.

    I don’t think putting “error while formatting log message” in a std::string and then returning that std::string is a good way for a C++ API to return errors, especially errors that are almost certainly bugs in the code. Throwing an exception seems far more appropriate, and more likely to result in bugs being detected and fixed. (Also nit: the wording of that message doesn’t make sense because tinyformat is used to format things other than log messages, and is also useful for low level string manipulation and things like formatting numbers or constructing urls)

    I understand that practically speaking impact of this change should be small because the new compile time checking is pretty good, but that just means to my mind that there should be less of a need for this PR.

    EDIT: I also think the new uses of .c_str() are gross, though I didn’t look into why those are necessary and that is not my main objection to this PR.

  69. hodlinator commented at 7:28 pm on October 2, 2024: contributor
    My Concept ACK accepted the current concept, while stating that I see it as one of the least worst alternatives (implying I don’t see the error handling part as blocking). I’m happy to get feedback on that but my intent was not to start a debate. @ryanofsky how would you feel about adding an Assume() statement that could fail Debug-builds when formatting fails?
  70. ryanofsky commented at 8:20 pm on October 2, 2024: contributor

    Thanks, I didn’t read the previous discussion history, so my feedback is only based on the implementation of this PR. not on earlier comments.

    I think tinyformat’s original decision to throw exceptions when invalid format strings were specified was flexible and sound, and the additional consteval checking we were able to add on top of that was a nice improvement that should be extended more places.

    I just don’t see a need to rip out tinyformat’s simple handling mechanism and replace it with more complicated and unusual one, whether it’s a self-referencing string apologizing for not being formatted correctly, or an inconsistently enforced check that behaves differently based on build flags.

    To put it another way:

    • I think adding consteval format string checking was a great improvement that should improve developer experience and catch bugs.
    • I think the consteval string checking should be extended more places, for example supporting the _ function to check translated original strings, and maybe even the imported translations of those strings.
    • I don’t like the runtime behavior change implemented in the fourth commit of this PR, and I’m not sure about the other commits. The only cited benefit in the PR description is “Avoid unexpected run-time crashes from string formatting” but I feel like the way to achieve that is to extend compile time checks more places, not to take a detour into changing runtime behavior.
  71. hodlinator commented at 8:47 pm on October 2, 2024: contributor

    My Concept ACK accepted the current concept, while stating that I see it as one of the least worst alternatives (implying I don’t see the error handling part as blocking). I’m happy to get feedback on that but my intent was not to start a debate.

    (This part my previous comment was more for maflcko & stickies-v. I think they were interpreting me as being more critical of the error-handling than I intended).

    That said, I also see ryanofsky’s perspective. The PR is borderline over-incremental instead of extending build time checks.

    Worst case example that could help motivate the current PR approach:

    A formatting error in a rarely used codepath in the Icelandic translation causes an unhandled exception, leading a lightning node to becoming unable to broadcast a penalty transaction in time. => Angry angry vikings. :crossed_swords:

  72. DrahtBot added the label Needs rebase on Oct 2, 2024
  73. stickies-v force-pushed on Oct 8, 2024
  74. stickies-v force-pushed on Oct 8, 2024
  75. DrahtBot added the label CI failed on Oct 8, 2024
  76. DrahtBot commented at 1:49 pm on October 8, 2024: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/31237946658

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  77. move-only: ConstevalFormatString tfm::format to tinyformat.h
    In future commits, the non-ConstevalFormatString tfm::format
    overloads will be renamed or removed. To avoid all callsites
    having to include util/string.h (or to have a circular dependency
    between tinyformat.h and util/string.h), just move the
    ConstevalFormatString overloads to tinyformat.h
    104c805373
  78. tinyformat: remove std::string overload
    All (non-test) format strings are known at compile-time, so we
    don't need the std::string overload. Remove it so users use the
    safer util::ConstevalFormatString where possible, or the
    bilingual_str overload where applicable.
    4ce60d82a0
  79. tinyformat: force compile-time checks for format string literals
    Callsites such as bitcoin-cli that (incorrectly) fail compile-time
    validation can still use tfm::format_raw.
    4cf1221695
  80. stickies-v force-pushed on Oct 8, 2024
  81. stickies-v commented at 2:39 pm on October 8, 2024: contributor

    Force-pushed to rebase addressing a merge conflict, preserve tfm::format_error throwing behaviour for DEBUG builds to improve visibility into programming errors before they manifest, and improve an error string.


    Thank you for your thoughtful review @ryanofsky. tl;dr I share your concerns, but I think I have a different opinion on the trade-offs made.

    I think everyone here is in agreement that compile-time checks (ideally with std::format) are the proper way forward, and everything else is a temporary patch. I think there are differing views on what we do in the meanwhile. It appears, especially without std::format and having to deal with unverified translated strings, that implementing bulletproof compile-time checking on format strings is not really feasible or the best use of our time.

    I don’t think putting “error while formatting log message” in a std::string and then returning that std::string is a good way for a C++ API to return errors, especially errors that are almost certainly bugs in the code.

    I agree that returning error strings is not a good way to handle errors in general, but I don’t think throwing a std::runtime_error is a good way to deal with programming errors either. Typically, these would be handled with assert statements (which is also tinyformat’s default), but crashing the node for malformed (usually) informational strings seems perhaps unnecessarily strict, in similar vein to our usage of CHECK_NONFATAL instead of assert?

    Can you agree that the concept of not exposing tfm::format_error to callsites makes sense in our codebase, at least philosophically? Fomat errors should never occur at run-time, even if they still can at present. If you agree with that, then I think it all comes down to what we do before we have full compile-time checking? It appears you prefer keeping the increased visibility when errors occur, whereas in this PR I prioritize not crashing the software in case we do miss it or if it’s part of an erroneously translated format string.

    I’ve force-pushed to preserve tfm::format_error throwing behaviour in DEBUG builds to keep more of the visibility without reducing stability. If you still feel like that new behaviour is still not an improvement over master (which is a view I can absolutely understand), I will drop the last commit and rework the first ones a bit. Even though the intent of the original version of this PR was to fix a regression, I’d be happy to change course and keep just the first commits that force more compile-time checks, without changing run-time behaviour, which I think everyone here agrees is an improvement.

    EDIT: I also think the new uses of .c_str() are gross, though I didn’t look into why those are necessary and that is not my main objection to this PR.

    The const std::string& overload was dropped so that the util::ConstevalFormatString overload could be forced for all string literals. Although I don’t like the additional .c_str() usage this introduced, it is mostly limited to fuzz and bitcoin-cli, and I think is worth the trade-off?

    I think tinyformat’s original decision to throw exceptions when invalid format strings were specified was flexible and sound

    Note: tinyformat natively asserts that there are no formatting errors, the tfm::format_error is Bitcoin Core-specific.

    or an inconsistently enforced check that behaves differently based on build flags.

    I don’t think this is correct: https://github.com/bitcoin/bitcoin/blob/caf44e500eb257376c36c0e043f71f72b949f743/src/tinyformat.h#L139

    (Also nit: the wording of that message doesn’t make sense because tinyformat is used to format things other than log messages, and is also useful for low level string manipulation and things like formatting numbers or constructing urls)

    Oops, good catch, thanks. This string was lifted from logging.h and I didn’t properly update it - done now.

    how would you feel about adding an Assume() statement that could fail Debug-builds when formatting fails?

    Adding util/check.h to tinyformat.h introduced a couple of circular dependencies so instead I’m using #ifdef DEBUG, but the effect should be the same! (We could expand on this adding a TFM_THROW_FORMAT_ERROR flag (that defaults to set in DEBUG) so it can be added to more CI jobs without overhead if people would like that).

  82. DrahtBot removed the label Needs rebase on Oct 8, 2024
  83. tinyformat: don't throw format string errors
    In almost all cases, invalid format strings should not lead to crashing
    behaviour, so suppress run-time format errors by returning the error as
    a string instead of throwing a tinyformat::format_error.
    
    In DEBUG builds, tinyformat::format_error throwing behaviour is
    preserved.
    
    Note: most format string literals are partially validated at compile-time
    through util::ConstevalFormatString. Notably, bilingual_str format strings
    do not have this compile-time validation.
    49f73071a8
  84. stickies-v force-pushed on Oct 8, 2024
  85. in src/wallet/wallet.cpp:2006 in 4ce60d82a0 outdated
    2002@@ -2003,7 +2003,7 @@ CWallet::ScanResult CWallet::ScanForWalletTransactions(const uint256& start_bloc
    2003         WalletLogPrintf("Scanning current mempool transactions.\n");
    2004         WITH_LOCK(cs_wallet, chain().requestMempoolTransactions(*this));
    2005     }
    2006-    ShowProgress(strprintf("%s " + _("Rescanning…").translated, GetDisplayName()), 100); // hide progress dialog in GUI
    2007+    ShowProgress(strprintf("%s %s", GetDisplayName(), _("Rescanning…").translated), 100); // hide progress dialog in GUI
    


    maflcko commented at 5:34 pm on October 8, 2024:

    nit in 4ce60d82a0c1e21ee7ecba69e4c8926720829429: Could split the changes to make the format string a bit more const, and never exclusively translated in real code into a separate commit? The benefit would be that the doc changes and (constroversial?) c_str changes are in a separate commit. It would also allow easier cherry-picking into other pulls.

    Edit: Done in commit fa788db64b4bb05c0fccc7155ddac97fa42fd5e8 (and the previous commit). Feel free to rebase on them, or not, or otherwise take them, and modify them as you like.

  86. maflcko commented at 5:35 pm on October 8, 2024: member

    Left a nit.

    Only change is adding throw when DEBUG, which seems fine. Edit: However, I think any attempt to catch bugs at runtime in this context is fragile and inefficient. If bugs should be found, it would be better to catch them at (or before) compile-time with the compiler, or with a tool.

    re-ACK 49f73071a8ec4131595090a619d6198a5f8b7c3c 🎖

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: re-ACK 49f73071a8ec4131595090a619d6198a5f8b7c3c 🎖
    3XSyui/vuov5yPzjlUZ4UeGQZHo/JZ6LsYy2ftwOqSStKhn/L2aHSd/uDBco6LvvTWd3tPsTlSY2y9myvd7H5Aw==
    
  87. DrahtBot requested review from l0rinc on Oct 8, 2024
  88. DrahtBot requested review from hodlinator on Oct 8, 2024
  89. DrahtBot removed the label CI failed on Oct 8, 2024
  90. maflcko commented at 12:11 pm on October 9, 2024: member

    EDIT: I also think the new uses of .c_str() are gross, though I didn’t look into why those are necessary and that is not my main objection to this PR.

    I removed tfm::format_raw(fmt.original.c_str(), ... in #31061, where it remains just tfm::format(fmt.original, .... Obviously the two pulls conflict a bit, but the conflicts are trivial to solve either way.

  91. hodlinator commented at 8:16 am on October 23, 2024: contributor

    Worth bringing back the try/catch in src/test/fuzz/strprintf.cpp for Debug builds?

     0 cmake --preset=libfuzzer -DCMAKE_BUILD_TYPE=Debug
     1...
     2 cmake --build build_fuzz -j<X>
     3...
     4 FUZZ=str_printf build_fuzz/src/test/fuzz/fuzz ../qa-assets/fuzz_seed_corpus/str_printf/
     5INFO: Running with entropic power schedule (0xFF, 100).
     6INFO: Seed: 734467281
     7INFO: Loaded 1 modules   (1292656 inline 8-bit counters): 1292656 [0x5d5503ee60a0, 0x5d5504021a10), 
     8INFO: Loaded 1 PC tables (1292656 PCs): 1292656 [0x5d5504021a10,0x5d55053db110), 
     9INFO:     1133 files found in ../qa-assets/fuzz_seed_corpus/str_printf/
    10INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 16446 bytes
    11terminate called after throwing an instance of 'tinyformat::format_error'
    12  what():  tinyformat: Not enough conversion specifiers in format string
    13==59713== ERROR: libFuzzer: deadly signal
    14    [#0](/bitcoin-bitcoin/0/) 0x5d54f8395fea  (/home/hodlinator/bitcoin/build_fuzz/src/test/fuzz/fuzz+0x5bd1fea)
    15...
    16    [#30](/bitcoin-bitcoin/30/) 0x5d54f8245174  (/home/hodlinator/bitcoin/build_fuzz/src/test/fuzz/fuzz+0x5a81174)
    17
    18NOTE: libFuzzer has rudimentary signal handlers.
    19      Combine libFuzzer with AddressSanitizer or similar for better crash reports.
    20SUMMARY: libFuzzer: deadly signal
    21MS: 0 ; base unit: 0000000000000000000000000000000000000000
    22
    23
    24artifact_prefix='./'; Test unit written to ./crash-da39a3ee5e6b4b0d3255bfef95601890afd80709
    25Base64: 
    
  92. in src/tinyformat.h:1055 in 49f73071a8
    1048@@ -1048,40 +1049,48 @@ TINYFORMAT_FOREACH_ARGNUM(TINYFORMAT_MAKE_MAKEFORMATLIST)
    1049 /// list of format arguments is held in a single function argument.
    1050 inline void vformat(std::ostream& out, const char* fmt, FormatListRef list)
    1051 {
    1052-    detail::formatImpl(out, fmt, list.m_args, list.m_N);
    1053+    // Modified for Bitcoin Core to not throw for formatting errors
    1054+    try {
    1055+        detail::formatImpl(out, fmt, list.m_args, list.m_N);
    1056+    } catch (tinyformat::format_error& fmterr) {
    


    l0rinc commented at 3:49 pm on October 23, 2024:

    nit: we’re already inside namespace tinyformat:

    0    } catch (format_error& fmterr) {
    

    (same for formatImpl)

  93. in src/test/util_string_tests.cpp:91 in 49f73071a8
    84@@ -82,4 +85,17 @@ BOOST_AUTO_TEST_CASE(ConstevalFormatString_NumSpec)
    85     FailFmtWithError<1>("%1$", err_term);
    86 }
    87 
    88+BOOST_AUTO_TEST_CASE(tinyformat_ConstevalFormatString)
    89+{
    90+    // Ensure invalid format strings don't throw at run-time, when not in DEBUG
    91+#ifndef DEBUG
    


    l0rinc commented at 3:55 pm on October 23, 2024:
    might as well put the whole test inside, instead of having an empty passing test when DEBUG
  94. l0rinc commented at 3:59 pm on October 23, 2024: contributor

    I’m not in love with the DEBUG separation, but I’m fine with it if others are.

    ACK 49f73071a8ec4131595090a619d6198a5f8b7c3c

    git range-diff 4f33e9a85c3a8f546deddc2f78cf74bceff30315~4..4f33e9a85c3a8f546deddc2f78cf74bceff30315 aba9ce0eb6e67945f2209a17676f356fe9748 7e6..928538e3782ccbb22f54f4a3b1152b1faba95471

     01:  1f2d0ffd56 < -:  ---------- move-only: ConstevalFormatString tfm::format to tinyformat.h
     12:  c937ebf1ce = 1:  efa6f3f372 tinyformat: remove std::string overload
     23:  be7de807c8 = 2:  9329e334eb tinyformat: force compile-time checks for format string literals
     34:  4f33e9a85c ! 3:  928538e378 tinyformat: don't throw format string errors
     4    @@ Commit message
     5         behaviour, so suppress run-time format errors by returning the error as
     6         a string instead of throwing a tinyformat::format_error.
     7     
     8    +    In DEBUG builds, tinyformat::format_error throwing behaviour is
     9    +    preserved.
    10    +
    11         Note: most format string literals are partially validated at compile-time
    12         through util::ConstevalFormatString. Notably, bilingual_str format strings
    13         do not have this compile-time validation.
    14    @@ src/logging.h: template <typename... Args>
    15     -        try {
    16     -            log_msg = tfm::format(fmt, args...);
    17     -        } catch (tinyformat::format_error& fmterr) {
    18    --            /* Original format string will have newline so don't add one here */
    19     -            log_msg = "Error \"" + std::string{fmterr.what()} + "\" while formatting log message: " + fmt.fmt;
    20     -        }
    21     +        const auto log_msg{tfm::format(fmt, args...)};
    22    @@ src/test/util_string_tests.cpp
    23      #include <util/string.h>
    24      
    25      #include <boost/test/unit_test.hpp>
    26    + #include <test/util/setup_common.h>
    27      
    28     +#include <sstream>
    29     +
    30    @@ src/test/util_string_tests.cpp: BOOST_AUTO_TEST_CASE(ConstevalFormatString_NumSp
    31      
    32     +BOOST_AUTO_TEST_CASE(tinyformat_ConstevalFormatString)
    33     +{
    34    -+    // Ensure invalid format strings don't throw at run-time.
    35    -+    BOOST_CHECK_EQUAL(tfm::format("%*c", "dummy"), R"(Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting log message: %*c)");
    36    -+    BOOST_CHECK_EQUAL(tfm::format("%2$*3$d", "dummy", "value"), R"(Error "tinyformat: Positional argument out of range" while formatting log message: %2$*3$d)");
    37    ++    // Ensure invalid format strings don't throw at run-time, when not in DEBUG
    38    ++#ifndef DEBUG
    39    ++    BOOST_CHECK_EQUAL(tfm::format("%*c", "dummy"), R"(Error "tinyformat: Cannot convert from argument type to integer for use as variable width or precision" while formatting: %*c)");
    40    ++    BOOST_CHECK_EQUAL(tfm::format("%2$*3$d", "dummy", "value"), R"(Error "tinyformat: Positional argument out of range" while formatting: %2$*3$d)");
    41     +    std::ostringstream oss;
    42     +    tfm::format(oss, "%.*f", 5);
    43    -+    BOOST_CHECK_EQUAL(oss.str(), R"(Error "tinyformat: Too many conversion specifiers in format string" while formatting log message: %.*f)");
    44    ++    BOOST_CHECK_EQUAL(oss.str(), R"(Error "tinyformat: Too many conversion specifiers in format string" while formatting: %.*f)");
    45    ++#endif
    46     +}
    47     +
    48     +
    49    @@ src/tinyformat.h: TINYFORMAT_FOREACH_ARGNUM(TINYFORMAT_MAKE_MAKEFORMATLIST)
    50     +    try {
    51     +        detail::formatImpl(out, fmt, list.m_args, list.m_N);
    52     +    } catch (tinyformat::format_error& fmterr) {
    53    -+        out << "Error \"" + std::string{fmterr.what()} + "\" while formatting log message: " + fmt;
    54    ++        out << "Error \"" + std::string{fmterr.what()} + "\" while formatting: " + fmt;
    55    ++#ifdef DEBUG
    56    ++        throw;
    57    ++#endif
    58     +    }
    59      }
    

    Worth bringing back the try/catch in src/test/fuzz/strprintf.cpp for Debug builds?

    We should definitely be able to see easily what the invalid values were (and add unit tests for those cases)

  95. stickies-v marked this as a draft on Oct 24, 2024
  96. stickies-v commented at 2:40 pm on October 24, 2024: contributor

    I’ve carved out the first 3 commits into #31149, since other PRs (e.g. #31061 and #31072) somewhat rely on this, and everyone seems in agreement that adding compile-time checks is good. Existing ACKs should be trivially transferable.

    Converting this PR to draft until #31149 is merged to keep the controversial error suppression discussion until later.

  97. hodlinator commented at 8:16 pm on October 24, 2024: contributor

    Worth bringing back the try/catch in src/test/fuzz/strprintf.cpp for Debug builds?

    We should definitely be able to see easily what the invalid values were (and add unit tests for those cases)

    (Not sure that would add much value, we know the fuzzing will produce mostly garbage format strings that will trigger errors).

  98. DrahtBot added the label Needs rebase on Oct 25, 2024
  99. DrahtBot commented at 9:11 am on October 25, 2024: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 09:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me