util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17) #20452

pull practicalswift wants to merge 1 commits into bitcoin:master from practicalswift:remove-atoi changing 15 files +145 −52
  1. practicalswift commented at 12:05 pm on November 22, 2020: contributor
    Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17).
  2. laanwj commented at 12:15 pm on November 22, 2020: member

    Note that we have a function ParseInt32 (as well as 64 and UInt variants) for this and I once tried to use it in more places, see #17385.

    However these didn’t turn out to be actually locale-independent. Maybe this can replace them eventually.

  3. in src/util/strencodings.h:95 in be0efcd213 outdated
    85+        }
    86+        s = s.substr(1);
    87+    }
    88+    auto [_, error_condition] = std::from_chars(s.data(), s.data() + s.size(), result);
    89+    if (error_condition != std::errc{}) {
    90+        return 0;
    


    laanwj commented at 12:16 pm on November 22, 2020:
    Please, do not just return 0 in case of a parse error (same comment above!) but use explicit error feedback. The current formulation encourages ignoring errors. Most C++17ish would be to return an Option. (then handle it at the call site, if it wants to squash any errors to 0, fine)

    laanwj commented at 1:06 pm on November 22, 2020:
    I think you need to check the returned pointer against last as well, otherwise it ignores any remainder.

    practicalswift commented at 2:12 pm on November 22, 2020:
    I agree on both for a sane safe interface, but since this function (LocaleIndependentAtoi) is meant as a quirk-by-quirk compatible to the not-entirely-sane atoi(…) I’m afraid we need to return a T (instead of std::optional<T>) and ignore any remainder, no? :)
  4. laanwj commented at 12:23 pm on November 22, 2020: member

    locale-independent std::from_chars(…) (C++17).

    This is good to hear at least. From https://en.cppreference.com/w/cpp/utility/from_chars#Notes :

    Unlike other parsing functions in C++ and C libraries, std::from_chars is locale-independent, non-allocating, and non-throwing.

    Concept ACK.

  5. in src/util/strencodings.h:86 in be0efcd213 outdated
    76+template <typename T>
    77+T ToIntegral(const std::string& str)
    78+{
    79+    T result;
    80+    // Emulate atoi(...) handling of white space and leading +/-.
    81+    std::string s = TrimString(str);
    


    laanwj commented at 12:32 pm on November 22, 2020:

    Making a copy of the string here, as well as below for the +/- check, is kind of efficient. Not sure it matters here, it’s not like this function is used in any performance critical places at the moment.

    But as the function takes a range, we could do better.


    practicalswift commented at 4:15 pm on December 1, 2020:

    As a general rule I try to stay away from pointer arithmetic (rationale here): that’s the reason for the technically extraneous copy which I think is insignificant in this context :)

    Personally I find code that avoids pointer arithmetic much easier to reason about from a safety perspective.

    That said: I’ll let other chime in :)

  6. DrahtBot commented at 1:18 pm on November 22, 2020: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #15936 (Unify bitcoin-qt and bitcoind persistent settings by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  7. laanwj commented at 1:29 pm on November 22, 2020: member

    I think we need to make a choice here (either one or both):

    • Make an emulation of all broken and surprising atoi behavior, but call it something else than ToIntegral, something like LocaleIndependentAtoi, make it clear it should not be used in new code, it’s only meant for preserving atoi behavior for backward compatibility reasons.
    • Make a sane integer parsing function that can eventually replace Parse[U]IntXX or their contents in a locale-independent way. I’m fine with the name then.
  8. DrahtBot added the label GUI on Nov 22, 2020
  9. DrahtBot added the label P2P on Nov 22, 2020
  10. DrahtBot added the label Utils/log/libs on Nov 22, 2020
  11. practicalswift force-pushed on Nov 22, 2020
  12. practicalswift commented at 2:07 pm on November 22, 2020: contributor

    @laanwj

    I think we need to make a choice here (either one or both):

    • Make an emulation of all broken and surprising atoi behavior, but call it something else than ToIntegral, something like LocaleIndependentAtoi, make it clear it should not be used in new code, it’s only meant for preserving atoi behavior for backward compatibility reasons.
    • Make a sane integer parsing function that can eventually replace Parse[U]IntXX or their contents in a locale-independent way. I’m fine with the name then.

    Good point. I thought about those options as well. I think we should do both. This PR is meant as a pure refactoring: it is not meant to change any behaviour that is defined by atoi.

    I’ve now made it more clear that this PR is meant as a quirk-by-quirk compatible replacement for atoi by calling the function LocaleIndependentAtoi :)

    Good point about naming: we should keep the name ToIntegral for a future sane version that doesn’t emulate atoi or any other legacy function.

  13. practicalswift force-pushed on Nov 22, 2020
  14. practicalswift force-pushed on Nov 22, 2020
  15. laanwj commented at 5:48 pm on November 22, 2020: member
    Okay, agree. I guess it could replace atoi as well as atoi64 in that case (as it’s parametrized on type)?
  16. practicalswift force-pushed on Nov 22, 2020
  17. practicalswift commented at 7:48 pm on November 22, 2020: contributor
    @laanwj Done! I’ve now replaced all instances of atoi64(s) with LocaleIndependentAtoi<int64_t>(s).
  18. in src/util/strencodings.h:80 in 6a6a71e098 outdated
    69@@ -68,8 +70,25 @@ std::string EncodeBase32(Span<const unsigned char> input, bool pad = true);
    70 std::string EncodeBase32(const std::string& str, bool pad = true);
    71 
    72 void SplitHostPort(std::string in, int& portOut, std::string& hostOut);
    73-int64_t atoi64(const std::string& str);
    74-int atoi(const std::string& str);
    75+
    76+template <typename T>
    


    laanwj commented at 9:54 am on November 23, 2020:
    Please add a comment what this function does (e.g.: replicate the exact behavior of atoi for backwards compatibility) and that new code should use the Parse[U]IntXX functions with parse error feedback.

    practicalswift commented at 3:50 pm on November 23, 2020:
    Done! :)
  19. laanwj commented at 9:55 am on November 23, 2020: member

    @laanwj Done! I’ve now replaced all instances of atoi64(s) with LocaleIndependentAtoi<int64_t>(s).

    Thanks, looks good to me now except the documentation nit.

  20. practicalswift force-pushed on Nov 23, 2020
  21. practicalswift force-pushed on Nov 23, 2020
  22. practicalswift renamed this:
    Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)
    util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)
    on Nov 24, 2020
  23. MarcoFalke referenced this in commit 9402159b9b on Nov 24, 2020
  24. practicalswift force-pushed on Nov 24, 2020
  25. sidhujag referenced this in commit 3a1441004a on Nov 24, 2020
  26. laanwj commented at 2:28 pm on November 25, 2020: member

    I don’t get the CI error. Is charconv non-standard in some way?

    0In file included from primitives/transaction.cpp:10:0:
    1./util/strencodings.h:16:10: fatal error: charconv: No such file or directory
    2 #include <charconv>
    3          ^~~~~~~~~~
    4compilation terminated.
    
  27. practicalswift commented at 3:09 pm on December 1, 2020: contributor

    @laanwj I think the CI failures are from relatively older distros, and that C++17 library support lagged behind C++17 compiler support for a while in C++17 infancy.

    According to cppreference the elementary string conversions (<charconv>) were added in GCC libstdc++ 8, Clang libc++ 7 and MSVC Standard Library 19.14.

  28. MarcoFalke removed the label GUI on Dec 1, 2020
  29. MarcoFalke removed the label P2P on Dec 1, 2020
  30. MarcoFalke added the label Waiting for author on Dec 1, 2020
  31. MarcoFalke commented at 3:51 pm on December 1, 2020: member
    ok, this is blocked on #20460
  32. practicalswift commented at 12:57 pm on December 15, 2020: contributor
    Could remove “Waiting for author”? I don’t know if we have any “Blocked on another PR/issue” (or “Waiting for toolchain upgrade”) tag, but this PR is blocked on #20460 :)
  33. MarcoFalke removed the label Waiting for author on Dec 15, 2020
  34. MarcoFalke added the label Waiting for other on Dec 15, 2020
  35. DrahtBot added the label Needs rebase on Dec 15, 2020
  36. practicalswift force-pushed on Dec 16, 2020
  37. DrahtBot removed the label Needs rebase on Dec 16, 2020
  38. practicalswift force-pushed on Mar 2, 2021
  39. DrahtBot added the label Needs rebase on Mar 19, 2021
  40. practicalswift force-pushed on Mar 19, 2021
  41. DrahtBot removed the label Needs rebase on Mar 19, 2021
  42. practicalswift force-pushed on Mar 21, 2021
  43. practicalswift force-pushed on Apr 24, 2021
  44. DrahtBot added the label Needs rebase on May 5, 2021
  45. practicalswift force-pushed on May 12, 2021
  46. practicalswift force-pushed on May 12, 2021
  47. DrahtBot removed the label Needs rebase on May 12, 2021
  48. DrahtBot added the label Needs rebase on May 30, 2021
  49. practicalswift force-pushed on Jun 13, 2021
  50. practicalswift commented at 8:12 am on June 13, 2021: contributor

    Rebased!

    Could this PR get a 23.0 milestone like the related PR #20457 (“util: Make Parse{Int,UInt}{32,64} use locale independent std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull}”)? :)

  51. MarcoFalke added this to the milestone 23.0 on Jun 13, 2021
  52. DrahtBot removed the label Needs rebase on Jun 13, 2021
  53. practicalswift force-pushed on Jun 13, 2021
  54. practicalswift force-pushed on Jun 22, 2021
  55. practicalswift force-pushed on Jun 22, 2021
  56. practicalswift force-pushed on Jun 22, 2021
  57. PastaPastaPasta referenced this in commit 46a05ad537 on Jun 27, 2021
  58. PastaPastaPasta referenced this in commit f059da98e6 on Jun 28, 2021
  59. PastaPastaPasta referenced this in commit 840d3332e4 on Jun 29, 2021
  60. practicalswift force-pushed on Jun 30, 2021
  61. PastaPastaPasta referenced this in commit 1c80f26441 on Jul 1, 2021
  62. PastaPastaPasta referenced this in commit dea0bfecd1 on Jul 1, 2021
  63. PastaPastaPasta referenced this in commit 1621639e85 on Jul 15, 2021
  64. practicalswift force-pushed on Aug 15, 2021
  65. DrahtBot added the label Needs rebase on Aug 24, 2021
  66. practicalswift force-pushed on Sep 5, 2021
  67. DrahtBot removed the label Needs rebase on Sep 5, 2021
  68. DrahtBot added the label Needs rebase on Sep 17, 2021
  69. PastaPastaPasta referenced this in commit 0bb4bb8db0 on Sep 17, 2021
  70. practicalswift force-pushed on Sep 18, 2021
  71. DrahtBot removed the label Needs rebase on Sep 18, 2021
  72. PastaPastaPasta referenced this in commit 4485b3ea20 on Sep 19, 2021
  73. PastaPastaPasta referenced this in commit 40f2d3e692 on Sep 21, 2021
  74. PastaPastaPasta referenced this in commit 2328ead832 on Sep 24, 2021
  75. fanquake referenced this in commit 67eae69f3f on Sep 28, 2021
  76. MarcoFalke removed the label Waiting for other on Sep 28, 2021
  77. practicalswift commented at 2:19 pm on September 29, 2021: contributor

    Now that #20460 has been merged I think this PR should be ready for final review :)

    See also related PR #20457.

  78. practicalswift force-pushed on Sep 30, 2021
  79. Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)
    test: Add test cases for LocaleIndependentAtoi
    
    fuzz: Assert legacy atoi(s) == LocaleIndependentAtoi<int>(s)
    
    fuzz: Assert legacy atoi64(s) == LocaleIndependentAtoi<int64_t>(s)
    4343f114cc
  80. practicalswift force-pushed on Sep 30, 2021
  81. jonatack commented at 3:33 pm on September 30, 2021: member
    Concept ACK
  82. laanwj commented at 4:57 pm on September 30, 2021: member
    Code review ACK 4343f114cc661cf031ec915538c11b9b030e2e15
  83. in src/rpc/mining.cpp:705 in 4343f114cc
    701@@ -702,7 +702,7 @@ static RPCHelpMan getblocktemplate()
    702             std::string lpstr = lpval.get_str();
    703 
    704             hashWatchedChain = ParseHashV(lpstr.substr(0, 64), "longpollid");
    705-            nTransactionsUpdatedLastLP = atoi64(lpstr.substr(64));
    706+            nTransactionsUpdatedLastLP = LocaleIndependentAtoi<int64_t>(lpstr.substr(64));
    


    jonatack commented at 8:20 pm on September 30, 2021:
    nTransactionsUpdatedLastLP is unsigned int; should this be defined or cast to that?

    laanwj commented at 7:07 am on October 1, 2021:
    Yes it probably needs the same kind of ugly cast as nTimeSmart in the wallet code. In the longer run the field should imo be converted to an explicitly sized type, and parsed as that, but keeping the behavior exactly the same makes sense here to keep it uncontroversial.

    practicalswift commented at 2:19 pm on October 1, 2021:
    @jonatack Good point! I suggest we tackle that in a follow-up PR to keep this PR pure refactoring only :)
  84. jonatack commented at 8:25 pm on September 30, 2021: member
    Code review ACK 4343f114cc661cf031ec915538c11b9b030e2e15
  85. in src/util/strencodings.h:76 in 4343f114cc
    73-int atoi(const std::string& str);
    74+
    75+// LocaleIndependentAtoi is provided for backwards compatibility reasons.
    76+//
    77+// New code should use the ParseInt64/ParseUInt64/ParseInt32/ParseUInt32 functions
    78+// which provide parse error feedback.
    


    laanwj commented at 7:01 am on October 1, 2021:
    This is a really important comment. I hope we can get rid of these functions entirely at some point. The implicit error handling behavior of atoi and atoi64 is pretty much always undesirable. (so I would normally comment “these function names are long and clunky, please shorten them” but it’s good in this case, we want using them to be ugly :smile: )

    MarcoFalke commented at 10:17 am on October 5, 2021:
    The linter says to use ToIntegral, this contradicts by saying to use ParseInt*??

    practicalswift commented at 11:55 am on October 5, 2021:
    ToIntegral sounds like a good suggestion now that it is part of master :)
  86. laanwj merged this on Oct 4, 2021
  87. laanwj closed this on Oct 4, 2021

  88. sidhujag referenced this in commit 95062277f7 on Oct 4, 2021
  89. in src/core_read.cpp:72 in 4343f114cc
    68@@ -69,7 +69,7 @@ CScript ParseScript(const std::string& s)
    69             (w->front() == '-' && w->size() > 1 && std::all_of(w->begin()+1, w->end(), ::IsDigit)))
    70         {
    71             // Number
    72-            int64_t n = atoi64(*w);
    73+            int64_t n = LocaleIndependentAtoi<int64_t>(*w);
    


    MarcoFalke commented at 10:03 am on October 5, 2021:
    The condition literally checks for the required format for ParseIntegral<>, so why not use that?

    practicalswift commented at 11:54 am on October 5, 2021:
    Sounds like a good idea now that ParseIntegral is part of master :)

    MarcoFalke commented at 12:03 pm on October 5, 2021:
    Sorry, I was wrong. This is not a refactor.
  90. in src/qt/rpcconsole.cpp:253 in 4343f114cc
    249@@ -250,7 +250,7 @@ bool RPCConsole::RPCParseCommandLine(interfaces::Node* node, std::string &strRes
    250                                     for(char argch: curarg)
    251                                         if (!IsDigit(argch))
    252                                             throw std::runtime_error("Invalid result query");
    253-                                    subelement = lastResult[atoi(curarg.c_str())];
    254+                                    subelement = lastResult[LocaleIndependentAtoi<int>(curarg)];
    


    MarcoFalke commented at 10:04 am on October 5, 2021:
    Same

    MarcoFalke commented at 12:03 pm on October 5, 2021:
    Sorry, I was wrong. This is not a refactor.
  91. in src/util/moneystr.cpp:80 in 4343f114cc
    76@@ -77,8 +77,7 @@ std::optional<CAmount> ParseMoney(const std::string& money_string)
    77         return std::nullopt;
    78     if (nUnits < 0 || nUnits > COIN)
    79         return std::nullopt;
    80-    int64_t nWhole = atoi64(strWhole);
    81-
    82+    int64_t nWhole = LocaleIndependentAtoi<int64_t>(strWhole);
    


    MarcoFalke commented at 10:36 am on October 5, 2021:
    same

    MarcoFalke commented at 12:03 pm on October 5, 2021:
    Sorry, I was wrong. This is not a refactor.
  92. MarcoFalke commented at 10:37 am on October 5, 2021: member
    concept ack
  93. MarcoFalke commented at 11:11 am on October 5, 2021: member
    Follow up in #23184
  94. MarcoFalke commented at 12:04 pm on October 5, 2021: member
    withdrawn my comments
  95. kittywhiskers referenced this in commit 91f5c0d329 on Oct 12, 2021
  96. jamesob referenced this in commit 5b78055d59 on Dec 22, 2021
  97. jamesob referenced this in commit fc11f8882c on Dec 22, 2021
  98. jamesob referenced this in commit 6d2a15adff on Dec 22, 2021
  99. jamesob referenced this in commit 1d229c783c on Jan 3, 2022
  100. jamesob referenced this in commit ac1a5b113c on Jan 4, 2022
  101. ryanofsky referenced this in commit 4a15ddc19c on Jan 11, 2022
  102. ryanofsky referenced this in commit 6184764899 on Jan 11, 2022
  103. ryanofsky referenced this in commit d5c0e2ddd8 on Jan 11, 2022
  104. ryanofsky referenced this in commit 389c47a78e on Jan 11, 2022
  105. ryanofsky referenced this in commit 1b0b1ddc42 on Jan 11, 2022
  106. ryanofsky referenced this in commit b5c9bb5cb9 on Jan 12, 2022
  107. MarcoFalke referenced this in commit 290ff5ef6d on Jan 12, 2022
  108. rebroad referenced this in commit 4c76d815b2 on Feb 3, 2022
  109. MarcoFalke locked this on Aug 15, 2022

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-12-23 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me