Use C++ streams instead of C-style printf functions #3271

pull brandondahler wants to merge 1 commits into bitcoin:master from brandondahler:printf-to-iostream changing 64 files +2408 −1731
  1. brandondahler commented at 4:34 AM on November 17, 2013: contributor

    Convert aoti[64] to fromstr, itostr[64] to tostr. Add CLogStream (logstream.h), wraps ostringstream but preserved type after insertions. Use fstream where possible (when no fsync is required). Change error function to have void return type. Fix functions relying on the false return from error function. Use boost::format for localized string token replacement.

    Follow up for #3236, #3237, and #3244.

    This should remove C99 requirements. Replaces *printf functions with C++ equivalents.

  2. jgarzik commented at 4:36 AM on November 17, 2013: contributor

    While the motivation is appreciated... meh, a bit disinclined to stir every printf line, and break tons of other patches, for this.

  3. in src/noui.cpp:None in fb684fc3c5 outdated
      42 | @@ -41,7 +43,7 @@ static bool noui_ThreadSafeAskFee(int64_t /*nFeeRequired*/)
      43 |  
      44 |  static void noui_InitMessage(const std::string &message)
      45 |  {
      46 | -    LogPrintf("init message: %s\n", message.c_str());
      47 | +    LogPrint(CLogStream() << "init message: " << message << "\n");
    


    laanwj commented at 7:40 AM on November 17, 2013:

    Why is the outer LogPrint() needed? It looks like a combination of old and new style Why not just

    CLogStream([optional category]) << "init message: " << message << "\n";
    

    (similar to std::cout/cin)


    brandondahler commented at 5:56 AM on November 18, 2013:

    This has been rectified with the new Log class.


    laanwj commented at 8:16 AM on November 18, 2013:

    @brandondahler thanks; much better this way

  4. laanwj commented at 7:41 AM on November 17, 2013: member

    @jgarzik That's also why I have discouraged doing this.

    On the other hand, it appears pretty trivial to rebase patches to this and it mostly (only) affects debug printing.

  5. in src/main.cpp:None in fb684fc3c5 outdated
    2092 | @@ -1974,7 +2093,7 @@ bool FindBlockPos(CValidationState &state, CDiskBlockPos &pos, unsigned int nAdd
    2093 |      }
    2094 |  
    2095 |      if (!pblocktree->WriteBlockFileInfo(nLastBlockFile, infoLastBlockFile))
    2096 | -        return state.Abort(_("Failed to write file info"));
    2097 | +        return state.Abort(_("Failed to write file info").str());
    


    laanwj commented at 7:48 AM on November 17, 2013:

    Why add .str() here? There should be no reason to thouch these lines as they don't use *printf.


    brandondahler commented at 5:51 AM on November 18, 2013:

    Had to convert translation (aka _("...") function) to something that still provides printf funcationality, without having printf avaliable. _("...") now has a return type of boost::format, which allows passing of parameters into the format via the % operator. The down side to this is that the boost::format class then has to be converted to a std::string, meaning calling .str() on the boost::format, or using the str(...) function.


    laanwj commented at 2:58 PM on November 19, 2013:

    Agreed, makes perfect sense, though it's a little bit ugly, as _() is used for non-format messages in many places. Not much to do about it I guess, no way to overload functions on return type.

  6. laanwj commented at 9:02 AM on November 17, 2013: member

    Apart from the above nits I think this is overall a good idea (printf is ugly and un-typesafe after all, so getting rid of it in favor of a safer method reduces risks. Also depending on C99 as well as C++ was a bit weird), but we may want to merge other pull requests first to prevent conflicts.

  7. sipa commented at 2:27 PM on November 17, 2013: member

    I'd prefer a CLogStream("class") << ..., with "class" optional as well.

    Since this is something that is often written during debugging, maybe just call it Log()?

  8. sipa commented at 2:29 PM on November 17, 2013: member

    Why remove error's return type?

  9. brandondahler commented at 2:54 PM on November 17, 2013: contributor

    Because error always returned false, which made the fact that it returns anything confusing (until you looked up the definition). Error was a subroutine acting like a function.

    On Nov 17, 2013, at 8:29 AM, Pieter Wuille notifications@github.com wrote:

    Why remove error's return value?

    — Reply to this email directly or view it on GitHub.

  10. brandondahler commented at 6:07 AM on November 18, 2013: contributor

    @sipa and @laanwj: following your suggestions, I have created the Log class (log.h and log.cpp). At its core it works by multiplexing an input to multiple outputs.

    Since generally flags won't change after instantiation of the class (except for changes from other threads), the constructor of the instantiation reads the different flags and decides what streams to send the data to.

    For all types except strings and const char_s, the data is just passed along to the streams below (manipulators work too). For strings and const char_s, we do a little processing based on what stream it is, specifically seeing if/when we are writing new lines.

    For the error function, I removed it all-together and replaced them with log calls ("ERROR: " prepended, "\n" appened). If we want to make it so the "ERROR: " part is configurable, I would recommend a public static const char* on Log instead of making a wrapping function.

  11. in src/bitcoinrpc.cpp:None in 1624b8d086 outdated
     437 | @@ -431,7 +438,7 @@ bool ReadHTTPRequestLine(std::basic_istream<char>& stream, int &proto,
     438 |      proto = 0;
     439 |      const char *ver = strstr(strProto.c_str(), "HTTP/1.");
     440 |      if (ver != NULL)
     441 | -        proto = atoi(ver+7);
     442 | +        fromstr(ver+7, &proto);
    


    laanwj commented at 8:15 AM on November 18, 2013:

    When a function has only one output using a return value is clearer than output parameter syntax. Any specific reason to switch to &pointer here? (maybe parse error reporting?)


    brandondahler commented at 1:29 AM on November 19, 2013:

    The reason for the switch is because fromstr is now a templated function. In order to have it detect what type the function should return, the function needs an input of that type.

    In some other places, I've used the alternate usage of this function:

    proto = fromstr(ver+7, (int64_t*) NULL);
    

    If desired, we can easily add back int64fromstr like so:

    inline int64_t int64fromstr(std::string s)
    {
        return fromstr(s, (int64_t*) NULL);
    }
    

    HapeMask commented at 5:57 AM on November 19, 2013:

    Was bored and reading PRs when I saw this. There is no need to take a pointer argument solely for type deduction. You can be explicit when you call the function:

    proto = fromstr<int64_t>(ver+7);
    

    At least in the (type*)NULL case, this is cleaner. It would require that fromstr default the "pn" parameter to NULL.


    laanwj commented at 7:15 AM on November 19, 2013:

    I was about to suggest what @HapeMask says


    brandondahler commented at 1:19 AM on November 20, 2013:

    That is something that I never have run into in C++, thanks for syntax lesson! Fixing to take template argument this way instead.

  12. brandondahler commented at 1:32 AM on November 19, 2013: contributor

    Fixed problems hopefully, rebased.

  13. in src/db.cpp:None in fd8e818529 outdated
     516 | @@ -503,7 +517,9 @@ bool CAddrDB::Write(const CAddrMan& addr)
     517 |      // Generate random temporary filename
     518 |      unsigned short randv = 0;
     519 |      RAND_bytes((unsigned char *)&randv, sizeof(randv));
     520 | -    std::string tmpfn = strprintf("peers.dat.%04x", randv);
     521 | +
     522 | +    std::ostringstream tmpfn("peers.dat.");
     523 | +    tmpfn << std::hex << std::setfill('0') << std::setw(4) << randv << std::dec << std::setfill(' ');
    


    laanwj commented at 2:06 PM on November 19, 2013:

    I'm still not really convinced that this is better than "%04x", it is much much more verbose. Wouldn't it be possible to keep mostly the same short syntax, but gain typesafeness, using boost::format? (I mean specifically in these cases where the output will end up in a string anyway, not the debug logging, of course you need to use std::stream syntax there)


    Diapolo commented at 7:41 AM on November 22, 2013:

    This is really too verbose, so that I don't think it's a benefit to merge this (yet)...

  14. gavinandresen commented at 1:43 AM on November 20, 2013: contributor

    I prefer printf-style syntax over << " " << etc, so I don't like this pull. It is 700 extra lines of code for approximately zero practical benefit that I can see.

  15. brandondahler commented at 4:10 AM on November 20, 2013: contributor

    @gavinandresen: I understand feeling that there is a lot of one-time risk involved in moving from printf to stream style conversion; however, I completely disagree that there is no practical benefit.

    1. This makes maintenance easier

    The code is much less complex overall. This removes some ugly compatibility #defines, attribute markings, and even hacks.

    1. Platform independence

    We are no longer relying on the platform to have a correct/compliant printf function. Instead we are relying on the C++ standard library implementation which we already rely on since we use C++ in the first place. This is specifically a problem for int64_t since Windows doesn't use the same syntax as Linux unless you use some compatibility defines.

    1. The way it reads

    Streams read fluently instead of requiring the user to place the parameters in the correct place. Consider:

    return strprintf(
            "HTTP/1.1 %d %s\r\n"
            "Date: %s\r\n"
            "Connection: %s\r\n"
            "Content-Length: %"PRIszu"\r\n"
            "Content-Type: application/json\r\n"
            "Server: bitcoin-json-rpc/%s\r\n"
            "\r\n"
            "%s",
        nStatus,
        cStatus,
        rfc1123Time().c_str(),
        keepalive ? "keep-alive" : "close",
        strMsg.size(),
        FormatFullVersion().c_str(),
        strMsg.c_str());
    

    Becomes

    ossReply << "HTTP/1.1 " << nStatus << " " << cStatus << "\r\n"
             << "Date: " << rfc1123Time() << "\r\n"
             << "Connection: " << (keepalive ? "keep-alive" : "close") << "\r\n"
             << "Content-Length: " << strMsg.size() << "\r\n"
             << "Content-Type: application/json\r\n"
             << "Server: bitcoin-json-rpc/" << FormatFullVersion() << "\r\n"
             << "\r\n"
             << strMsg;
    return ossReply.str();
    

    Likewise logging lines stand out just like normal cout lines (given the new Log class).


    All of this being said, I will agree that the shorter changes, such as:

    std::string tmpfn = strprintf("peers.dat.%04x", randv);
    

    Becoming

    std::ostringstream tmpfn("peers.dat.");
    tmpfn << std::hex << std::setfill('0') << std::setw(4) << randv << std::dec << std::setfill(' ');
    

    Would probably work better off as a boost::format. The reason I started out with not using boost::format more is because I personally prefer to keep dependence on boost to a minimum. If y'all disagree I do think it could make things less cluttered when making simple, smalls strings.

    Likewise creating formatting functions for specific types of data would allow us to change the output formatting of the data in one place instead of across a bunch of files, while fitting in with the stream style. Using the above function again, having a static call makes things cleaner all-together:

    ...
    std::string tmpfn = CAddrDB::PeersFilePath(randv);
    ...
    
    std::string CAddrDB::PeersFilePath(unsigned short randv)
    {
        std::ostringstream tmpfn();
        tmpfn << "peers.dat." << std::hex << std::setfill('0') << std::setw(4) << randv;
        return tmpfn.str();
    }
    

    Further we can make custom manipulators to reduce duplication of code:

    ios_base& hex04(ios_base& ib)
    {
        ib << std::hex << std::setfill('0') << std::setw(4);
        return ib;
    }
    
    std::string CAddrDB::PeersFilePath(unsigned short randv)
    {
        std::ostringstream tmpfn();
        tmpfn << "peers.dat." << hex04 << randv;
        return tmpfn.str();
    }
    
  16. brandondahler commented at 4:18 AM on November 20, 2013: contributor

    On another note, does anyone have any idea why pull tester is failing? The only lines that look bad are:

    =============
    1 test passed
    =============
    make[4]: Leaving directory `/mnt/bitcoin/linux-build/src/test'
    make[3]: Leaving directory `/mnt/bitcoin/linux-build/src/test'
    make[2]: Leaving directory `/mnt/bitcoin/linux-build/src/test'
    make[1]: Leaving directory `/mnt/bitcoin/linux-build/src'
    make  check-local
    make[2]: Entering directory `/mnt/bitcoin/linux-build'
    /bin/mkdir -p qa/tmp
    make[2]: *** [check-local] Killed
    make[2]: Leaving directory `/mnt/bitcoin/linux-build'
    make[1]: *** [check-am] Error 2
    make[1]: Leaving directory `/mnt/bitcoin/linux-build'
    make: *** [check-recursive] Error 1
    tail: `/mnt/bitcoin/linux-build/.bitcoin/regtest/debug.log' has become inaccessible: No such file or directory
    

    From http://jenkins.bluematt.me/pull-tester/fd77e459e07f3684c39b7861e758028ace756456/test.log . Whereas a successful one shows the following at the same spot:

    =============
    1 test passed
    =============
    make[4]: Leaving directory `/mnt/bitcoin/linux-build/src/test'
    make[3]: Leaving directory `/mnt/bitcoin/linux-build/src/test'
    make[2]: Leaving directory `/mnt/bitcoin/linux-build/src/test'
    make[1]: Leaving directory `/mnt/bitcoin/linux-build/src'
    make[1]: Entering directory `/mnt/bitcoin/linux-build'
    make  check-local
    make[2]: Entering directory `/mnt/bitcoin/linux-build'
    /bin/mkdir -p qa/tmp
    tail: write error: Broken pipe
    tail: write error
    
  17. laanwj commented at 7:07 AM on November 20, 2013: member

    @gavinandresen Agreed, the problem isn't with printf-style syntax, it's with the printf function - which (in our usage) isn't part of the C++ standard, so we had a dependency on C99-like formats that broke down on Windows without special mingw defines. Also it's not type safe, easy to mess up the stack with some wrong % char, and uses a variable-number-of-parameters hack which breaks down for std::string so a dummy '0' argument is inserted with a macro (the real_strprintf etc...).

    Many reasons to get rid of the sprintf hacks. But boost::format would be better in that regard as it mostly keeps printf syntax but is typesafe and has none of those mentioned drawbacks.

    I don't think the actual risk is that large, at most some debug messages will be formatted differently.

    It could map pretty much one-on-one. Could we do this without adding (so many) lines?

  18. laanwj commented at 7:09 AM on November 20, 2013: member

    Also -- do we really need our own class for handling time? (bitcointime.cpp/h) How is this related to the subject of this pull? Reducing depenency on boost is not a goal! If boost offers some functionality, please use that instead of rolling our own.

    This changes way too many things, please keep it focused.

  19. in src/log.h:None in fd77e459e0 outdated
      73 | +
      74 | +        if (*it == &fileout)
      75 | +        {
      76 | +            // Debug print useful for profiling
      77 | +            if (fLogTimestamps && fStartedNewLine)
      78 | +                **it << BitcoinTime::DateTimeStrFormat("%Y-%m-%d %H:%M:%S", BitcoinTime::GetTime()) << " ";
    


    brandondahler commented at 12:47 PM on November 20, 2013:

    @laanwj bitcointime.h/cpp had to be added to remove the circular dependency that was caused by adding log.h into its own file, since util.h uses Log in various places.

    Moving it into its own files also allows us to namespace the functions, making it clear where they are coming from when seeing them dispersed across several files.

  20. brandondahler commented at 1:04 PM on November 20, 2013: contributor

    @laanwj I would not consider the use of ostringstream as "rolling my own", namely because that is the only way provided by the C++ standard library to convert random types to strings and vice versa.

    If you are talking about the Log class, the first commit started off without it, but it was added because it simplifies the actual logging process (the business logic that goes in to putting type T data into X, Y, and Z streams) and it makes reading the actual print lines prettier:

    Log() << something << " something else"; 
    

    vs

    LogPrint(str(boost::format("%d something else") % something)); ).
    
  21. gavinandresen commented at 8:59 PM on November 29, 2013: contributor

    Closing; no consensus this is a good idea. Maybe bring back as a boost::format version that doesn't break so many other pull requests.

  22. gavinandresen closed this on Nov 29, 2013

  23. brandondahler commented at 2:46 AM on November 30, 2013: contributor

    @gavinandresen Can I request a temporary re-opening--I made changes but did not request comments on them (was busy and didn't write them up).

    Updated summary of changes that are in this pull request:

    1. Replaced LogPrintf with new Log class that is functionally the same, but instead works by multiplexing log data into a set of streams, determined at the time of the construction of the class. This causes >95% of all line changes here.
    2. Removed strprintf and all hacks associated. Replaced usage with boost::format.
    3. Used ostringstreams in place of large string concatenations (many examples in src/rpcnet.cpp).
    4. Update translation function std::string _(const char* psz) to boost::format _(const char* psz). Add ability to get direct string instead of boost::format by specifying _std::string(const char* psz).
    5. Replaced ato* and *tostr functions with templated functions tostr and fromstr.
    6. Moved time functions to bitcointime.{h,cpp} from util.{h,cpp}. Fixes circular dependency and allows namespacing of functions to provide context clarity.
    7. Removed error function completely, always returned false before, logging input(s) with strings preappened and appended.

    Edit: Also rebased.

  24. sipa commented at 5:18 PM on December 1, 2013: member

    I'm personally fine with the general direction this is going in. In particular:

    • Splitting off log and time functionality to their own source files is good - util is a mess.
    • No problem with using a more ideomatic C++ way for formatting output - especially if it provides better type-safety.
    • The Log(class) << some << messages looks ok to me.

    I'm less sure about things like _<std::string>(...) (quite verbose - if we really need a separate method, maybe __ or something?).

    Also, error() most certainly was a hack, but I'm in the middle about getting rid of it, if it requires adding that many extra lines to deal with it. @gavinandresen Given that it's now (in some places) using boost::format, care to reopen?

  25. in src/bitcointime.cpp:None in 8750e9b84f outdated
      19 | +
      20 | +using namespace std;
      21 | +
      22 | +namespace BitcoinTime
      23 | +{
      24 | +    int64_t GetPerformanceCounter()
    


    sipa commented at 5:20 PM on December 1, 2013:

    Though time-related, this is really just for initialized the PRNG. I think it belongs more in util.


    brandondahler commented at 4:27 AM on December 5, 2013:

    Moving back.

  26. in src/bitcointime.h:None in 8750e9b84f outdated
       6 | +#define BITCOIN_BITCOINTIME_H
       7 | +
       8 | +#include <stdint.h>
       9 | +#include <string>
      10 | +
      11 | +class CNetAddr;
    


    sipa commented at 5:21 PM on December 1, 2013:

    It's really quite ugly that this depends on netbase data structures. I don't have a better proposal for now, though.


    brandondahler commented at 4:27 AM on December 5, 2013:

    It is just a forward declare here, are you referring to the #include in the cpp?


    sipa commented at 11:35 AM on December 5, 2013:

    By 'depends' I don't mean a code dependency. Just the fact that this module is not usable without the other.

  27. in src/core.cpp:None in 8750e9b84f outdated
      12 | +#include <sstream>
      13 | +#include <stdint.h>
      14 | +
      15 | +#include <boost/format.hpp>
      16 | +
      17 | +/** Fees smaller than this (in satoshi) are considered zero fee (for transaction creation) */
    


    sipa commented at 5:23 PM on December 1, 2013:

    I don't think these belong in core at all (or should be part of CTransaction). They are validation/relaying related constants that belong in main. Given that the mempool code refactorings that are going on probably rely on it, we can keep it like this for now, but I'd prefer to not have these in core.


    laanwj commented at 6:27 AM on December 2, 2013:

    Agreed @sipa, it's not part of the block verification/consensus so should likely be somewhere else. (then again, this pull is not the place to fix that)


    brandondahler commented at 3:19 AM on December 5, 2013:

    Not sure what flawed logic I was following -- reverted.

  28. in src/log.h:None in 8750e9b84f outdated
      19 | +// Log() has been broken a couple of times now
      20 | +// by well-meaning people adding mutexes in the most straightforward way.
      21 | +// It breaks because it may be called by global destructors during shutdown.
      22 | +// Since the order of destruction of static/global objects is undefined,
      23 | +// defining a mutex as a global object doesn't work (the mutex gets
      24 | +// destroyed, and then some later destructor calls OutputDebugStringF,
    


    sipa commented at 5:26 PM on December 1, 2013:

    Is this comment up to date?


    brandondahler commented at 4:25 AM on December 5, 2013:

    Wasn't following the advice either, after reviewing it.

    Solved problem in more complex, but more understandable way, including new comments that explains problem and reason why the solution works.

  29. brandondahler commented at 6:24 AM on December 5, 2013: contributor

    Updated to address items above and rebased, they will not propagate to this pull request unless the request is re-opened.

  30. laanwj reopened this on Dec 5, 2013

  31. laanwj commented at 10:31 AM on December 5, 2013: member

    Reopening this -- I don't expect that it gets merged for 0.9, as we don't want to break all other pulls, and it needs a lot of work, but overall it's going the right way IMO (boost::format instead of ugly over-verbose stdio stream syntax).

  32. Use C++ streams instead of C-style printf functions
    Convert aoti[64] to fromstr, itostr[64] to tostr.
    Add Log class (log.{cpp,h}), replaces LogPrintf function.
    Remove error(...) function.
    Fix functions relying on the false return from error function.
    Use fstream where possible (when no fsync is required).
    Use boost::format for localized string token replacement.
    Add ability to request the translation as a string.
    Move time functions from util.{h,cpp} to bitcointime.{h,cpp} (can not use time.h).
    Use boost::format instead of ostringstream when possible.
    e39b26c0e8
  33. BitcoinPullTester commented at 3:43 AM on December 6, 2013: none

    Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/e39b26c0e839dc5b2e8436da2b25816ccb1938d9 for binaries and test log. This test script verifies pulls every time they are updated. It, however, dies sometimes and fails to test properly. If you are waiting on a test, please check timestamps to verify that the test.log is moving at http://jenkins.bluematt.me/pull-tester/current/ Contact BlueMatt on freenode if something looks broken.

  34. p12tic commented at 11:05 PM on December 28, 2013: none

    https://github.com/c42f/tinyformat might be a good replacement. As opposed to boost::format, the format is exactly the same as the one that printf uses.

  35. laanwj commented at 9:46 AM on December 29, 2013: member

    @p12tic that actually looks very good, it is a type-safe strprintf() implementation in one header file and is not dependent on the platform's printf functions. According to the description it supports C99 format characters, which is a requirement. Could be a drop-in replacement.

  36. laanwj commented at 4:00 PM on January 16, 2014: member

    See #3549

  37. laanwj closed this on Jan 16, 2014

  38. brandondahler deleted the branch on Feb 27, 2017
  39. Bushstar referenced this in commit 9c9cac6d67 on Apr 8, 2020
  40. DrahtBot locked this on Sep 8, 2021

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 18:16 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me