util: ParseByteUnits - Parse a string with suffix unit #23249

pull dougEfresh wants to merge 1 commits into bitcoin:master from dougEfresh:util-parsebyteunit changing 6 files +138 −4
  1. dougEfresh commented at 10:25 AM on October 11, 2021: contributor

    A convenience utility for parsing human readable strings sizes e.g. 500G is 500 * 1 << 30

    The argument/setting maxuploadtarget now accept human readable byte units [k|K|m|M|g|G||t|T] This change backward compatible, defaults to M if no unit specified.

  2. DrahtBot added the label Utils/log/libs on Oct 11, 2021
  3. in src/init.cpp:1753 in 3f900a6946 outdated
    1749 | @@ -1749,7 +1750,7 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1750 |      connOptions.nReceiveFloodSize = 1000 * args.GetIntArg("-maxreceivebuffer", DEFAULT_MAXRECEIVEBUFFER);
    1751 |      connOptions.m_added_nodes = args.GetArgs("-addnode");
    1752 |  
    1753 | -    connOptions.nMaxOutboundLimit = 1024 * 1024 * args.GetIntArg("-maxuploadtarget", DEFAULT_MAX_UPLOAD_TARGET);
    1754 | +    connOptions.nMaxOutboundLimit = ParseByteUnits(args.GetArg("-maxuploadtarget", "")).value_or(DEFAULT_MAX_UPLOAD_TARGET);
    


    laanwj commented at 2:10 PM on October 11, 2021:

    When we change this, it would be nice to do actual error handling instead of silently using the default if the value is unparseable.


    dougEfresh commented at 4:26 PM on October 11, 2021:

    The current/master branch implementation is silent, I agree we should do some type of error handling, should we:

    1. Return InitError()
    2. Log a warning

    If we throw an InitError, some people won't notice it, as this part of the code comes after level DB initialize. The error is shown like ~45-60 seconds after start up. (I know I didn't notice it myself)

    ryanofsky has other PR(s) that seems to address this by refactoring ArgsManager #22766 / #16545 . Perhaps wait for that.


    ryanofsky commented at 11:46 PM on October 11, 2021:

    ryanofsky has other PR(s) that seems to address this by refactoring ArgsManager #22766 / #16545 . Perhaps wait for that.

    I'd definitely suggest fatal, explicit InitError for any bad or ambiguous values provided. ArgsManager class and PRs are a mess, I'd stay far away and keep this separate!

    EDIT: More concretely, I'd suggest changing DEFAULT_MAX_UPLOAD_TARGET to a string and writing something like:

    const std::string DEFAULT_MAX_UPLOAD_TARGET = "0M";
    ...
    auto max_upload_str = args.GetArg("-maxuploadtarget", DEFAULT_MAX_UPLOAD_TARGET);
    auto max_upload = ParseByteUnits(max_upload_str);
    if (!max_upload) {
        return InitError(strprintf("Bad -maxuploadtarget %s", max_upload_str));
    }
    connOptions.nMaxOutboundLimit = *max_upload;
    
  4. in src/util/strencodings.cpp:542 in 3f900a6946 outdated
     537 | +    size_t nBytes{0};
     538 | +
     539 | +    v.pop_back();
     540 | +    switch (ToLower(unit)) {
     541 | +        case 'k':
     542 | +            return ParseUInt64(v, &nBytes) ? std::optional<size_t>{nBytes << 10} : std::nullopt;
    


    laanwj commented at 2:11 PM on October 11, 2021:

    Could factor out the ParseUInt64(v, &nBytes) part here, it's the same for all cases.

  5. in src/util/strencodings.h:315 in 3f900a6946 outdated
     310 | + *
     311 | + * Examples: 4m,6M,9g,78T
     312 | + *
     313 | + * @param[in] str                  the string to convert into bytes
     314 | + * @param[in] default_multiplier   if no unit is found in str, use this multiplier. default is MiB
     315 | + * @returns                        optional size_t bytes from str
    


    laanwj commented at 2:12 PM on October 11, 2021:

    Please document under which conditions it returns a value or nullopt (e.g. parse errors).

  6. in src/util/strencodings.cpp:551 in 3f900a6946 outdated
     546 | +            break;
     547 | +        case 'g':
     548 | +            return ParseUInt64(v, &nBytes) ? std::optional<size_t>{nBytes << 30} : std::nullopt;
     549 | +            break;
     550 | +        case 't':
     551 | +            return ParseUInt64(v, &nBytes) ? std::optional<size_t>{nBytes << 40} : std::nullopt;
    


    laanwj commented at 4:06 PM on October 11, 2021:

    Are people going to fight over factor 1000 versus 1024 or has this been settled?


    sipa commented at 4:14 PM on October 11, 2021:

    I'd support both, lowercase for 1000-based, uppercase for 1024-based? That hopefully avoids most controversy.

    Also, this looks like it could easily overflow.

  7. dougEfresh force-pushed on Oct 13, 2021
  8. dougEfresh commented at 2:58 PM on October 13, 2021: contributor

    Updated with

    • Replacing size_t in favor of uint64_t (windows/mac didn't play nice with size_t)
    • Adding InitError on parsing failure
    • Simplified case statement in ParseByteUnit
    • 1000 for lowercase 1024 base for UPPERCASE

    What I need help with is overflows. I use __builtin_mul_overflow but there must be a better/easier solution.

  9. dougEfresh renamed this:
    util: ParseByteUnits - Parse a string with a suffix unit
    util: ParseByteUnits - Parse a string with suffix unit
    on Oct 13, 2021
  10. DrahtBot commented at 8:37 PM on October 14, 2021: contributor

    <!--e57a25ab6845829454e8d69fc972939a-->

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    <!--174a7506f384e20aa4161008e828411d-->

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #22834 (net: respect -onlynet= when making outbound connections by vasild)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  11. in src/util/strencodings.cpp:578 in 7d96a50d56 outdated
     573 | +            // use default_multipiler
     574 | +            break;
     575 | +    }
     576 | +
     577 | +    int64_t total{0};
     578 | +#if defined(HAVE_BUILTIN_MUL_OVERFLOW)
    


    practicalswift commented at 8:13 AM on October 15, 2021:

    Could use MultiplicationOverflow instead if it is moved out from src/test/fuzz/util.h and proper unit tests are added fot it?


    dougEfresh commented at 4:32 PM on October 15, 2021:

    Since I'm checking an unsigned int, I used

    nBytes > std::numeric_limits<uint64_t>::max() / multiplier
    

    If there is interest, I can create another PR with MultiplicationOverflow out of src/test/fuzz/util.h

  12. dougEfresh force-pushed on Oct 15, 2021
  13. dougEfresh force-pushed on Oct 15, 2021
  14. in src/util/strencodings.cpp:542 in 0d786a1094 outdated
     537 | +    auto unit = v.back();
     538 | +    uint64_t nBytes{0};
     539 | +    uint64_t multiplier{default_multiplier};
     540 | +
     541 | +    v.pop_back();
     542 | +    if (!(ParseUInt64(v, &nBytes) || ParseUInt64(str, &nBytes))) {
    


    promag commented at 9:00 PM on October 15, 2021:

    I think parsing fails for number > 9 without unit?


    dougEfresh commented at 11:04 AM on October 16, 2021:

    Added this test case:

    BOOST_CHECK_EQUAL(ParseByteUnits("10").value_or(0), 10ULL << 20UL);
    
  15. in src/util/strencodings.cpp:573 in 0d786a1094 outdated
     568 | +            break;
     569 | +        case 'T':
     570 | +            multiplier = 1ULL << 40;
     571 | +            break;
     572 | +        default:
     573 | +            // use default_multipiler
    


    promag commented at 9:00 PM on October 15, 2021:

    Should be digit?


    dougEfresh commented at 11:04 AM on October 16, 2021:

    Updated to parse the original str in the default case

  16. promag commented at 9:03 PM on October 15, 2021: member

    Concept ACK. However, I think the implementation can be improved. Why not check the last char first? If it's not a digit then extract it and validate to get the multiplier. Only then parse the number.

  17. dougEfresh force-pushed on Oct 16, 2021
  18. dougEfresh force-pushed on Nov 1, 2021
  19. in src/init.cpp:1754 in c5d4a59cf9 outdated
    1749 | @@ -1749,7 +1750,11 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1750 |      connOptions.nReceiveFloodSize = 1000 * args.GetIntArg("-maxreceivebuffer", DEFAULT_MAXRECEIVEBUFFER);
    1751 |      connOptions.m_added_nodes = args.GetArgs("-addnode");
    1752 |  
    1753 | -    connOptions.nMaxOutboundLimit = 1024 * 1024 * args.GetIntArg("-maxuploadtarget", DEFAULT_MAX_UPLOAD_TARGET);
    1754 | +    auto opt_max_upload = ParseByteUnits(args.GetArg("-maxuploadtarget", DEFAULT_MAX_UPLOAD_TARGET));
    1755 | +    if (!opt_max_upload) {
    


    dougEfresh commented at 8:13 AM on November 1, 2021:

    I want to point out to reviewers that this changes start up behavior. Previously, -maxuploadtarget would silently fail and use DEFAULT_MAX_UPLOAD_TARGET. This change will throw an error upon parse failure, but only after 30-50 seconds after start-up


    ryanofsky commented at 4:12 PM on November 2, 2021:

    re: #23249 (review)

    This change will throw an error upon parse failure, but only after 30-50 seconds after start-up

    I'm not seeing this take 30-50 seconds, just a second or two. Still it would be a good to provide faster checking, and just moving the parsing call up a little higher in the AppInitMain function seems to give good results. I tried:

    diff --git a/src/init.cpp b/src/init.cpp
    index 517ad2342ff..513cc24faeb 100644
    --- a/src/init.cpp
    +++ b/src/init.cpp
    @@ -1114,6 +1114,12 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
     {
         const ArgsManager& args = *Assert(node.args);
         const CChainParams& chainparams = Params();
    +
    +    auto opt_max_upload = ParseByteUnits(args.GetArg("-maxuploadtarget", DEFAULT_MAX_UPLOAD_TARGET));
    +    if (!opt_max_upload) {
    +        return InitError(strprintf(_("Unable to parse -maxuploadtarget: '%s' (possible integer overflow?)"), args.GetArg("-maxuploadtarget","")));
    +    }
    +
         // ********************************************************* Step 4a: application initialization
         if (!CreatePidFile(args)) {
             // Detailed error printed inside CreatePidFile().
    @@ -1749,11 +1755,6 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
         connOptions.nSendBufferMaxSize = 1000 * args.GetIntArg("-maxsendbuffer", DEFAULT_MAXSENDBUFFER);
         connOptions.nReceiveFloodSize = 1000 * args.GetIntArg("-maxreceivebuffer", DEFAULT_MAXRECEIVEBUFFER);
         connOptions.m_added_nodes = args.GetArgs("-addnode");
    -
    -    auto opt_max_upload = ParseByteUnits(args.GetArg("-maxuploadtarget", DEFAULT_MAX_UPLOAD_TARGET));
    -    if (!opt_max_upload) {
    -        return InitError(strprintf(_("Unable to parse -maxuploadtarget: '%s' (possible integer overflow?)"), args.GetArg("-maxuploadtarget","")));
    -    }
         connOptions.nMaxOutboundLimit = *opt_max_upload;;
         connOptions.m_peer_connect_timeout = peer_connect_timeout;
     
    
    

    dougEfresh commented at 10:13 PM on November 2, 2021:

    Moved

  20. in src/util/strencodings.cpp:588 in c5d4a59cf9 outdated
     583 | +
     584 | +    if (multiplier == 0) {
     585 | +        return opt_uint64{nBytes};
     586 | +    }
     587 | +    // check overflow
     588 | +    return nBytes > std::numeric_limits<uint64_t>::max() / multiplier ? std::nullopt : opt_uint64{nBytes * multiplier};
    


    ryanofsky commented at 6:19 PM on November 2, 2021:

    In commit "util: ParseByteUnits - Parse a string with suffix unit [k|K] [m|M] [g|G] [t|T]" (c5d4a59cf91f54a2027851b32cf83cfcb04cd210)

    There's repetition and unneeded variables in this function. Also nBytes holds the number before multiplication instead of the actual number of bytes, and the name doesn't follow current coding convention. Would suggest an overall cleanup here like:

    diff --git a/src/util/strencodings.cpp b/src/util/strencodings.cpp
    index c490cc7f3e1..0075e69c5e7 100644
    --- a/src/util/strencodings.cpp
    +++ b/src/util/strencodings.cpp
    @@ -530,19 +530,8 @@ std::string HexStr(const Span<const uint8_t> s)
     
     std::optional<uint64_t> ParseByteUnits(const std::string& str, const uint64_t default_multiplier)
     {
    -    if (str.empty()) {
    -        return std::nullopt;
    -    }
    -
    -    using opt_uint64 = std::optional<uint64_t>;
    -
    -    uint64_t nBytes{0};
    -    uint64_t multiplier{default_multiplier};
    -    auto unit = str.back();
    -    auto v = str;
    -
    -    v.pop_back();
    -
    +    auto multiplier = default_multiplier ? default_multiplier : 1;
    +    char unit = str.empty() ? 0 : str.back();
         switch (unit) {
             case 'k':
                 multiplier = 1000;
    @@ -569,21 +558,13 @@ std::optional<uint64_t> ParseByteUnits(const std::string& str, const uint64_t de
                 multiplier = 1ULL << 40;
                 break;
             default:
    -            // unit should be digit, parse original str
    -            if (!ParseUInt64(str, &nBytes)) {
    -                return std::nullopt;
    -            }
    -            //check overflow and use default_multiplier
    -            return default_multiplier == 0 ? opt_uint64{nBytes} : nBytes > std::numeric_limits<uint64_t>::max() / default_multiplier ? std::nullopt : opt_uint64{nBytes * default_multiplier};
    +            unit = 0;
    +            break;
         }
    -
    -    if (!ParseUInt64(v, &nBytes)) {
    +    uint64_t num;
    +    if (!ParseUInt64(unit ? str.substr(0, str.size()-1) : str, &num) ||
    +        num > std::numeric_limits<uint64_t>::max() / multiplier) {
             return std::nullopt;
         }
    -
    -    if (multiplier == 0) {
    -        return opt_uint64{nBytes};
    -    }
    -    // check overflow
    -    return nBytes > std::numeric_limits<uint64_t>::max() / multiplier ? std::nullopt : opt_uint64{nBytes * multiplier};
    +    return num * multiplier;
     }
    
    

    dougEfresh commented at 10:13 PM on November 2, 2021:

    Great suggestion, code now looks cleaner.

  21. in src/test/util_tests.cpp:2478 in c5d4a59cf9 outdated
    2473 | +
    2474 | +    BOOST_CHECK_EQUAL(ParseByteUnits("4t").value_or(0), 4ULL * 1000 * 1000 * 1000 * 1000);
    2475 | +    BOOST_CHECK_EQUAL(ParseByteUnits("4T").value_or(0), 4ULL << 40);
    2476 | +
    2477 | +    // override default multiplier
    2478 | +    BOOST_CHECK_EQUAL(ParseByteUnits("5", 1ULL << 10).value_or(0), 5ULL << 10);
    


    ryanofsky commented at 6:27 PM on November 2, 2021:

    In commit "util: ParseByteUnits - Parse a string with suffix unit [k|K] [m|M] [g|G] [t|T]" (c5d4a59cf91f54a2027851b32cf83cfcb04cd210)

    Would be good to check behavior with 0 multiplier to make sure it doesn't accidentally change in the future.


    dougEfresh commented at 10:14 PM on November 2, 2021:

    Added 0 multiplier test

  22. in src/test/util_tests.cpp:2491 in c5d4a59cf9 outdated
    2486 | +
    2487 | +    // be positive
    2488 | +    BOOST_CHECK(!ParseByteUnits("-123m").has_value());
    2489 | +
    2490 | +    // zero padding
    2491 | +    BOOST_CHECK_EQUAL(ParseByteUnits("020M").value_or(1), 20ULL << 20);
    


    ryanofsky commented at 6:31 PM on November 2, 2021:

    In commit "util: ParseByteUnits - Parse a string with suffix unit [k|K] [m|M] [g|G] [t|T]" (c5d4a59cf91f54a2027851b32cf83cfcb04cd210)

    Minor inconsistency here where this case is using .value_or(1) and cases above are using .value_or(0). I think it would best if all of these cases were consistent and better if they were using .value() instead of .value_or()


    dougEfresh commented at 10:14 PM on November 2, 2021:

    Agreed

  23. ryanofsky approved
  24. ryanofsky commented at 6:47 PM on November 2, 2021: contributor

    Code review and lightly tested ACK c5d4a59cf91f54a2027851b32cf83cfcb04cd210. I left some suggestions below, but none are important so feel free to ignore them.

    It might be good to add a release note in the doc/ directory mentioning the new argument verification, since in theory it could make a configuration that used to work previously now cause an error on startup.

    It would also be good to mention new behavior in the PR description, since current description is a little ambiguous about whether this is only adding a utility function or changing behavior.

  25. dougEfresh force-pushed on Nov 2, 2021
  26. dougEfresh commented at 10:15 PM on November 2, 2021: contributor

    It might be good to add a release note in the doc/ directory mentioning the new argument verification, since in theory it could make a configuration that used to work previously now cause an error on startup.

    Added to release docs

    It would also be good to mention new behavior in the PR description, since current description is a little ambiguous about whether this is only adding a utility function or changing behavior.

    Updated PR description.

    Thanks for feedback @ryanofsky

  27. dougEfresh requested review from laanwj on Nov 2, 2021
  28. ryanofsky approved
  29. ryanofsky commented at 5:31 PM on November 8, 2021: contributor

    Code review ACK f1f8a772b0daf8cc0c40f0fc54e82618ad242ddd. Just suggested changes since last review (thanks!)

  30. in src/init.cpp:1768 in f1f8a772b0 outdated
    1754 | @@ -1748,8 +1755,7 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1755 |      connOptions.nSendBufferMaxSize = 1000 * args.GetIntArg("-maxsendbuffer", DEFAULT_MAXSENDBUFFER);
    1756 |      connOptions.nReceiveFloodSize = 1000 * args.GetIntArg("-maxreceivebuffer", DEFAULT_MAXRECEIVEBUFFER);
    1757 |      connOptions.m_added_nodes = args.GetArgs("-addnode");
    1758 | -
    1759 | -    connOptions.nMaxOutboundLimit = 1024 * 1024 * args.GetIntArg("-maxuploadtarget", DEFAULT_MAX_UPLOAD_TARGET);
    1760 | +    connOptions.nMaxOutboundLimit = *opt_max_upload;;
    


    vasild commented at 9:05 AM on November 9, 2021:

    nit: double ;; at the end


    dougEfresh commented at 12:19 PM on November 9, 2021:

    removed

  31. in src/test/util_tests.cpp:2459 in f1f8a772b0 outdated
    2455 | @@ -2456,4 +2456,49 @@ BOOST_AUTO_TEST_CASE(remove_prefix)
    2456 |      BOOST_CHECK_EQUAL(RemovePrefix("", ""), "");
    2457 |  }
    2458 |  
    2459 | +BOOST_AUTO_TEST_CASE(util_ParseByteUnits)
    


    vasild commented at 9:17 AM on November 9, 2021:

    Maybe add a test for unknown suffix, e.g. 123X.


    dougEfresh commented at 12:17 PM on November 9, 2021:

    Added invalid unit test

  32. in src/util/strencodings.cpp:566 in f1f8a772b0 outdated
     561 | +            unit = 0;
     562 | +            break;
     563 | +    }
     564 | +
     565 | +    uint64_t num;
     566 | +    if (!ParseUInt64(unit ? str.substr(0, str.size()-1) : str, &num) ||
    


    vasild commented at 9:19 AM on November 9, 2021:

    nit: missing spaces around -.


    dougEfresh commented at 12:20 PM on November 9, 2021:

    When I added the space, it didn't look aesthetically pleasing to me, normally I do add a space, but this if statement looks better without I don't mind adding a space if you like or if someone else asks.


    vasild commented at 1:51 PM on November 9, 2021:

    The thing is that lots of people have lots of different preferences. "Aesthetically pleasant" for somebody may not be for another. The project uses clang-format to avoid endless low-value discussions about styling.


    dougEfresh commented at 4:48 PM on November 9, 2021:

    @vasild I changed const uint64_t to uint64_t

  33. in src/init.cpp:440 in f1f8a772b0 outdated
     435 | @@ -435,7 +436,7 @@ void SetupServerArgs(ArgsManager& argsman)
     436 |      argsman.AddArg("-maxreceivebuffer=<n>", strprintf("Maximum per-connection receive buffer, <n>*1000 bytes (default: %u)", DEFAULT_MAXRECEIVEBUFFER), ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
     437 |      argsman.AddArg("-maxsendbuffer=<n>", strprintf("Maximum per-connection send buffer, <n>*1000 bytes (default: %u)", DEFAULT_MAXSENDBUFFER), ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
     438 |      argsman.AddArg("-maxtimeadjustment", strprintf("Maximum allowed median peer time offset adjustment. Local perspective of time may be influenced by peers forward or backward by this amount. (default: %u seconds)", DEFAULT_MAX_TIME_ADJUSTMENT), ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
     439 | -    argsman.AddArg("-maxuploadtarget=<n>", strprintf("Tries to keep outbound traffic under the given target (in MiB per 24h). Limit does not apply to peers with 'download' permission. 0 = no limit (default: %d)", DEFAULT_MAX_UPLOAD_TARGET), ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
     440 | +    argsman.AddArg("-maxuploadtarget=<n>", strprintf("Tries to keep outbound traffic under the given target (in MiB, GiB or TiB per 24h). Limit does not apply to peers with 'download' permission or blocks created within past week. 0 = no limit (default: %s). Optional suffix units [m|M] MiB, [g|G] GiB, [t|T] TiB. Lowercase is 1000 base while uppercase is 1024 base (default).", DEFAULT_MAX_UPLOAD_TARGET), ArgsManager::ALLOW_ANY, OptionsCategory::CONNECTION);
    


    vasild commented at 9:38 AM on November 9, 2021:

    I find this text confusing:

    Optional suffix units [m|M] MiB, [g|G] GiB, [t|T] TiB. Lowercase is 1000 base while uppercase is 1024 base

    Maybe:

    Optional suffix units [m|M|g|G|t|T]. Lowercase is 1000 base while uppercase is 1024 base


    dougEfresh commented at 12:18 PM on November 9, 2021:

    @vasild Good suggestion, I've changed the doc to:

    Tries to keep outbound traffic under the given target per 24h. Limit does not apply to peers with 'download' permission or blocks created within past week. 0 = no limit (default: %s). Optional suffix units [k|K|m|M|g|G|t|T] (default: M). Lowercase is 1000 base while uppercase is 1024 base
    
  34. dougEfresh force-pushed on Nov 9, 2021
  35. dougEfresh force-pushed on Nov 9, 2021
  36. in src/util/strencodings.h:319 in f3f88062f4 outdated
     314 | + * @param[in] str                  the string to convert into bytes
     315 | + * @param[in] default_multiplier   if no unit is found in str, use this multiplier. default is MiB
     316 | + * @returns                        optional uint64_t bytes from str or nullopt
     317 | + *                                 if ParseUInt64 is false, str is empty, trailing whitespace or overflow
     318 | + */
     319 | +std::optional<uint64_t> ParseByteUnits(const std::string& str, const uint64_t default_multiplier = 1024UL*1024UL);
    


    vasild commented at 1:37 PM on November 9, 2021:

    nit: drop unnecessary const from the integer argument style: missing spaces around *:

    std::optional<uint64_t> ParseByteUnits(const std::string& str, uint64_t default_multiplier = 1024UL * 1024UL);
    

    dougEfresh commented at 2:34 PM on November 9, 2021:

    I ran clang-format-diff.py


    vasild commented at 2:48 PM on November 9, 2021:

    The const is still there. I think there was discussion somewhere about it, but I can't find it now...

  37. in src/util/strencodings.cpp:568 in f3f88062f4 outdated
     563 | +    }
     564 | +
     565 | +    uint64_t num;
     566 | +    if (!ParseUInt64(unit ? str.substr(0, str.size()-1) : str, &num) ||
     567 | +        num > std::numeric_limits<uint64_t>::max() / multiplier) { // check overflow
     568 | +         return std::nullopt;
    


    vasild commented at 1:41 PM on November 9, 2021:

    style: this has 5 spaces for indentation, should be 4.

  38. in src/util/strencodings.cpp:536 in f3f88062f4 outdated
     531 | +std::optional<uint64_t> ParseByteUnits(const std::string& str, const uint64_t default_multiplier)
     532 | +{
     533 | +    auto multiplier = default_multiplier ? default_multiplier : 1;
     534 | +    char unit = str.empty() ? 0 : str.back();
     535 | +    switch (unit) {
     536 | +        case 'k':
    


    vasild commented at 1:42 PM on November 9, 2021:

    style: case should be aligned below switch:

        switch (unit) {
        case 'k':
            multiplier = 1000;
    
  39. in src/init.cpp:1121 in f3f88062f4 outdated
    1114 | @@ -1114,6 +1115,12 @@ bool AppInitMain(NodeContext& node, interfaces::BlockAndHeaderTipInfo* tip_info)
    1115 |  {
    1116 |      const ArgsManager& args = *Assert(node.args);
    1117 |      const CChainParams& chainparams = Params();
    1118 | +
    1119 | +    auto opt_max_upload = ParseByteUnits(args.GetArg("-maxuploadtarget", DEFAULT_MAX_UPLOAD_TARGET));
    1120 | +    if (!opt_max_upload) {
    1121 | +        return InitError(strprintf(_("Unable to parse -maxuploadtarget: '%s' (possible integer overflow?)"), args.GetArg("-maxuploadtarget","")));
    


    vasild commented at 1:44 PM on November 9, 2021:

    style: missing space

            return InitError(strprintf(_("Unable to parse -maxuploadtarget: '%s' (possible integer overflow?)"), args.GetArg("-maxuploadtarget", "")));
    
  40. vasild approved
  41. vasild commented at 1:47 PM on November 9, 2021: contributor

    ACK f3f88062f4cdb0cc2d1af221a5a9da9ee2ea0e46

    Some styling nits below, which would be good to address before merge.

    The style rules are defined in src/.clang-format. You can use contrib/devtools/clang-format-diff.py to properly format the patch.

  42. dougEfresh force-pushed on Nov 9, 2021
  43. vasild approved
  44. vasild commented at 2:44 PM on November 9, 2021: contributor

    ACK 6ebb47626ee30576548e8b3c01ad961761e28562

  45. dougEfresh force-pushed on Nov 9, 2021
  46. dougEfresh commented at 4:49 PM on November 9, 2021: contributor

    @MarcoFalke would you mind reviewing? Thanks

  47. vasild approved
  48. vasild commented at 8:21 AM on November 10, 2021: contributor

    ACK 4451a0000dda5daba4b1624dce1c95f1914e7251

  49. in src/util/strencodings.cpp:533 in 4451a0000d outdated
     526 | @@ -526,3 +527,45 @@ std::string HexStr(const Span<const uint8_t> s)
     527 |      assert(it == rv.end());
     528 |      return rv;
     529 |  }
     530 | +
     531 | +std::optional<uint64_t> ParseByteUnits(const std::string& str, uint64_t default_multiplier)
     532 | +{
     533 | +    auto multiplier = default_multiplier ? default_multiplier : 1;
    


    promag commented at 9:15 AM on November 10, 2021:

    nit, why not assert(default_multiplier > 0)?


    dougEfresh commented at 11:38 AM on November 10, 2021:

    Depends on how strict or how we want to use this function. For me, a 0 default_multiplier means "don not apply a multiplier" if no unit found, parse str as is. This may not be intuitive. I'm fine with restricting multiplier > 0. @MarcoFalke what's your opinion ? either way, I'll update the docs in strencoding.h to explaining what a default_multiplier of zero means.


    MarcoFalke commented at 11:50 AM on November 10, 2021:

    Wouldn't "don not apply a multiplier" mean 1?


    dougEfresh commented at 11:57 AM on November 10, 2021:

    ok, i will add an assert

  50. in src/util/strencodings.cpp:534 in 4451a0000d outdated
     526 | @@ -526,3 +527,45 @@ std::string HexStr(const Span<const uint8_t> s)
     527 |      assert(it == rv.end());
     528 |      return rv;
     529 |  }
     530 | +
     531 | +std::optional<uint64_t> ParseByteUnits(const std::string& str, uint64_t default_multiplier)
     532 | +{
     533 | +    auto multiplier = default_multiplier ? default_multiplier : 1;
     534 | +    char unit = str.empty() ? 0 : str.back();
    


    promag commented at 9:15 AM on November 10, 2021:

    nit,

    if (str.empty()) return std::nullopt;
    char unit = str.back();
    
  51. promag commented at 9:20 AM on November 10, 2021: member

    Code review ACK 4451a0000dda5daba4b1624dce1c95f1914e7251.

  52. in src/util/strencodings.cpp:566 in 4451a0000d outdated
     561 | +        unit = 0;
     562 | +        break;
     563 | +    }
     564 | +
     565 | +    uint64_t num;
     566 | +    if (!ParseUInt64(unit ? str.substr(0, str.size() - 1) : str, &num) ||
    


    MarcoFalke commented at 9:24 AM on November 10, 2021:

    With this breaking change (no longer ignoring whitespace), which I think is fine, it might be best to also reject sign characters (+-) and use ToIntegral<uint64_t>.


    dougEfresh commented at 4:13 PM on November 10, 2021:

    I have tests for +-, whitespace, invalid unit...etc I'm now using ToIntegral. I've added an assert(default_multiplier > 0) but I don't know how to test this. Any ideas examples?


    vasild commented at 11:31 AM on November 11, 2021:

    You cannot have the test trigger an assert() and continue successfully. Instead, you can throw std::out_of_range from the function and try/catch it from the test (BOOST_CHECK_THROW()). But in this case maybe it is best/simpler to just remove the parameter. Then there is no need to test edge cases for it :)

  53. in src/util/strencodings.h:319 in 4451a0000d outdated
     314 | + * @param[in] str                  the string to convert into bytes
     315 | + * @param[in] default_multiplier   if no unit is found in str, use this multiplier. default is MiB
     316 | + * @returns                        optional uint64_t bytes from str or nullopt
     317 | + *                                 if ParseUInt64 is false, str is empty, trailing whitespace or overflow
     318 | + */
     319 | +std::optional<uint64_t> ParseByteUnits(const std::string& str, uint64_t default_multiplier = 1024UL * 1024UL);
    


    MarcoFalke commented at 9:26 AM on November 10, 2021:

    Should the default multiplier have a default? It seems like the default is something that is per-arg and not project-wide.


    dougEfresh commented at 11:21 AM on November 10, 2021:

    Agreed. I will remove the default.


    dougEfresh commented at 3:01 PM on November 10, 2021:

    How about 1 as a default multiplier ?


    promag commented at 3:19 PM on November 10, 2021:

    How about removing the argument? It's not needed. Either way please update doc comment above.


    dougEfresh commented at 4:14 PM on November 10, 2021:

    default of 1 seems natural, so my latest change defaults to 1


    vasild commented at 11:26 AM on November 11, 2021:

    How about removing the argument? It's not needed.

    Indeed! I would remove that unused code. No parameter, no problem :) (what should be its default value? should it have default value? how to test 0 if we have assert()?)


    dougEfresh commented at 2:45 PM on November 11, 2021:

    i do need this multiplier argument. the intent of this function is to be used with maxupload, dbcache...etc. they inheritly have a default multiplier (mostly MiB). the main purpose is maintain backwards capability, e. g. maxupload=300 becomes 300m. As I write this, i think I will change the argument to a an enum with the valid units [k|K|m|M...] with no default value. This makes it clear and easy to verify/test.


    dougEfresh commented at 8:12 AM on November 12, 2021:

    @vasild @MarcoFalke @promag I added an enum ByteUnitMultiplier to src/util/strencodings.src/util/strencodings.h Let me know what you think.

  54. dougEfresh force-pushed on Nov 10, 2021
  55. in src/util/strencodings.cpp:553 in e4e2441eba outdated
     548 | +        break;
     549 | +    case 'M':
     550 | +        multiplier = 1ULL << 20;
     551 | +        break;
     552 | +    case 'g':
     553 | +        multiplier = 1000ULL * 1000 * 1000;
    


    MarcoFalke commented at 4:20 PM on November 10, 2021:
            multiplier = 1000'000'000ULL;
    

    Haven't tried this, but can the compiler parse this?


    dougEfresh commented at 8:07 AM on November 12, 2021:

    it does!

  56. in src/util/strencodings.cpp:571 in e4e2441eba outdated
     566 | +        break;
     567 | +    }
     568 | +
     569 | +    auto parsed_num = ToIntegral<uint64_t>(unit ? str.substr(0, str.size() - 1) : str);
     570 | +    // check overflow
     571 | +    return !parsed_num || parsed_num > std::numeric_limits<uint64_t>::max() / multiplier ? std::nullopt : std::optional{*parsed_num * multiplier};
    


    MarcoFalke commented at 4:22 PM on November 10, 2021:
        if( !parsed_num || parsed_num > std::numeric_limits<uint64_t>::max() / multiplier){
        return std::nullopt;
        }
        return *parsed_num * multiplier;
    

    Maybe split up into two lines to avoid the std::optional{} constructor for the value?


    dougEfresh commented at 8:07 AM on November 12, 2021:

    I agree, it looks better split

  57. MarcoFalke commented at 4:22 PM on November 10, 2021: member

    feel free to ignore the style nits

  58. in src/util/strencodings.h:47 in 97c0c4477d outdated
      41 | +    M_IEC = 1ULL << 20,
      42 | +    G_SI = 1'000'000'000ULL,
      43 | +    G_IEC = 1ULL << 30,
      44 | +    T_SI = 1'000'000'000'000ULL,
      45 | +    T_IEC = 1ULL << 40,
      46 | +};
    


    dougEfresh commented at 8:06 AM on November 12, 2021:

    I use the "technical" terms but I don't think they are intuitive, open to suggestions.

  59. in src/util/strencodings.h:36 in 97c0c4477d outdated
      28 | @@ -29,6 +29,22 @@ enum SafeChars
      29 |      SAFE_CHARS_URI, //!< Chars allowed in URIs (RFC 3986)
      30 |  };
      31 |  
      32 | +/**
      33 | + * Used by ParseByteUnits()
      34 | + * SI are 1000 base while IEC is 1024 base
      35 | + */
      36 | +enum ByteUnitMultiplier {
    


    promag commented at 10:07 AM on November 12, 2021:

    Use enum class, which can be used like ByteUnit::g or ByteUnit::K.


    dougEfresh commented at 5:27 PM on November 12, 2021:

    changed. more intuitive, clear.

  60. dougEfresh force-pushed on Nov 12, 2021
  61. dougEfresh force-pushed on Nov 13, 2021
  62. ryanofsky approved
  63. ryanofsky commented at 5:12 PM on November 15, 2021: contributor

    Code review ACK 1f51a25aa5037210ba39e14f048b3c9ec1d342ca. Main changes since last review are removing default multiplier, asserting multiplier is positive, adding enum

  64. in src/test/util_tests.cpp:2501 in 1f51a25aa5 outdated
    2496 | +
    2497 | +    // fractions not allowed
    2498 | +    BOOST_CHECK(!ParseByteUnits("0.5T", noop));
    2499 | +
    2500 | +    // overflow
    2501 | +    BOOST_CHECK(!ParseByteUnits("18446744073709551615g", noop).has_value());
    


    vasild commented at 10:02 AM on November 17, 2021:

    nit: remove .has_value() for consistency with other tests:

        BOOST_CHECK(!ParseByteUnits("18446744073709551615g", noop));
    

    dougEfresh commented at 10:49 AM on November 17, 2021:

    fixed in 21b58f430fa05fdb7c5db79b545302417a5dbceb

  65. in src/util/strencodings.h:326 in 1f51a25aa5 outdated
     321 | @@ -305,4 +322,17 @@ std::string ToUpper(const std::string& str);
     322 |   */
     323 |  std::string Capitalize(std::string str);
     324 |  
     325 | +/**
     326 | + * Parse a string with suffix unit [k|K|m|M|g|G|t|T] .
    


    vasild commented at 10:20 AM on November 17, 2021:

    nit: extra white space

     * Parse a string with suffix unit [k|K|m|M|g|G|t|T].
    

    dougEfresh commented at 10:49 AM on November 17, 2021:

    fixed

  66. vasild commented at 10:22 AM on November 17, 2021: contributor

    The code looks ok (1f51a25aa5037210ba39e14f048b3c9ec1d342ca). Some nits below, feel free to ignore.

    I think the 3 commits should be squashed into one - subsequent ones just change stuff added by the first commit.

  67. util: ParseByteUnits - Parse a string with suffix unit [k|K|m|M|g|G|t|T]
    A convenience utility for human readable arguments/config e.g. -maxuploadtarget=500g
    21b58f430f
  68. dougEfresh force-pushed on Nov 17, 2021
  69. dougEfresh commented at 10:53 AM on November 17, 2021: contributor

    @MarcoFalke hopefully, last review 21b58f43 Main changes since last review are removing assigning a default multiplier, adding enum for ByteUnit(s)

  70. vasild approved
  71. vasild commented at 11:39 AM on November 17, 2021: contributor

    ACK 21b58f430fa05fdb7c5db79b545302417a5dbceb

    Just squash and 2 minor nits since 1f51a25aa5, verified with

    diff -u <(git diff 1f51a25aa5~3..1f51a25aa5) <(git show 21b58f430f) |vim -
    

    Thanks!

  72. ryanofsky approved
  73. ryanofsky commented at 5:31 AM on November 24, 2021: contributor

    Code review ACK 21b58f430fa05fdb7c5db79b545302417a5dbceb. Only changes since last review are dropping optional has_value call, fixing comment punctuation, squashing commits.

  74. in src/test/util_tests.cpp:2467 in 21b58f430f
    2462 | +
    2463 | +    // no multiplier
    2464 | +    BOOST_CHECK_EQUAL(ParseByteUnits("1", noop).value(), 1);
    2465 | +    BOOST_CHECK_EQUAL(ParseByteUnits("0", noop).value(), 0);
    2466 | +
    2467 | +    BOOST_CHECK_EQUAL(ParseByteUnits("1k", noop).value(), 1000ULL);
    


    MarcoFalke commented at 9:47 AM on November 24, 2021:

    nit: Could also add a test that default_multiplier is ignored completely if passed in via the string? Maybe also add a fuzz test?

  75. MarcoFalke approved
  76. MarcoFalke merged this on Nov 24, 2021
  77. MarcoFalke closed this on Nov 24, 2021

  78. sidhujag referenced this in commit 9a3c515945 on Nov 24, 2021
  79. sidhujag referenced this in commit 9ac3ebeb11 on Nov 24, 2021
  80. rebroad commented at 8:38 PM on December 24, 2021: contributor

    This fails to compile on some (unsupported, e.g. cygwin) systems:

      CXX      bitcoind-bitcoind.o
    In file included from ./netaddress.h:19,
                     from ./chainparams.h:11,
                     from bitcoind.cpp:10:
    ./util/strencodings.h:41:10: warning: multi-character character constant [-Wmultichar]
       41 |     m = 1'000'000ULL,
          |          ^~~~~
    
    
  81. sipa commented at 8:53 PM on December 24, 2021: member

    @rebroad You need a C++17 compiler.

  82. delta1 referenced this in commit 3f0bf16b6a on May 30, 2023
  83. bitcoin locked this on Sep 21, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-13 15:14 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me