Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17).
util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17) #20452
pull practicalswift wants to merge 1 commits into bitcoin:master from practicalswift:remove-atoi changing 15 files +145 −52-
practicalswift commented at 12:05 PM on November 22, 2020: contributor
-
laanwj commented at 12:15 PM on November 22, 2020: member
Note that we have a function
ParseInt32(as well as 64 and UInt variants) for this and I once tried to use it in more places, see #17385.However these didn't turn out to be actually locale-independent. Maybe this can replace them eventually.
-
in src/util/strencodings.h:95 in be0efcd213 outdated
85 | + } 86 | + s = s.substr(1); 87 | + } 88 | + auto [_, error_condition] = std::from_chars(s.data(), s.data() + s.size(), result); 89 | + if (error_condition != std::errc{}) { 90 | + return 0;
laanwj commented at 12:16 PM on November 22, 2020:Please, do not just return 0 in case of a parse error (same comment above!) but use explicit error feedback. The current formulation encourages ignoring errors. Most C++17ish would be to return an
Option. (then handle it at the call site, if it wants to squash any errors to 0, fine)
laanwj commented at 1:06 PM on November 22, 2020:I think you need to check the returned pointer against
lastas well, otherwise it ignores any remainder.
practicalswift commented at 2:12 PM on November 22, 2020:I agree on both for a sane safe interface, but since this function (
LocaleIndependentAtoi) is meant as a quirk-by-quirk compatible to the not-entirely-saneatoi(…)I'm afraid we need to return aT(instead ofstd::optional<T>) and ignore any remainder, no? :)laanwj commented at 12:23 PM on November 22, 2020: memberlocale-independent std::from_chars(…) (C++17).
This is good to hear at least. From https://en.cppreference.com/w/cpp/utility/from_chars#Notes :
Unlike other parsing functions in C++ and C libraries, std::from_chars is locale-independent, non-allocating, and non-throwing.
Concept ACK.
in src/util/strencodings.h:86 in be0efcd213 outdated
76 | +template <typename T> 77 | +T ToIntegral(const std::string& str) 78 | +{ 79 | + T result; 80 | + // Emulate atoi(...) handling of white space and leading +/-. 81 | + std::string s = TrimString(str);
laanwj commented at 12:32 PM on November 22, 2020:Making a copy of the string here, as well as below for the +/- check, is kind of efficient. Not sure it matters here, it's not like this function is used in any performance critical places at the moment.
But as the function takes a range, we could do better.
practicalswift commented at 4:15 PM on December 1, 2020:As a general rule I try to stay away from pointer arithmetic (rationale here): that's the reason for the technically extraneous copy which I think is insignificant in this context :)
Personally I find code that avoids pointer arithmetic much easier to reason about from a safety perspective.
That said: I'll let other chime in :)
DrahtBot commented at 1:18 PM on November 22, 2020: member<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
<!--174a7506f384e20aa4161008e828411d-->
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #15936 (Unify bitcoin-qt and bitcoind persistent settings by ryanofsky)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
laanwj commented at 1:29 PM on November 22, 2020: memberI think we need to make a choice here (either one or both):
- Make an emulation of all broken and surprising
atoibehavior, but call it something else thanToIntegral, something likeLocaleIndependentAtoi, make it clear it should not be used in new code, it's only meant for preservingatoibehavior for backward compatibility reasons. - Make a sane integer parsing function that can eventually replace
Parse[U]IntXXor their contents in a locale-independent way. I'm fine with the name then.
DrahtBot added the label GUI on Nov 22, 2020DrahtBot added the label P2P on Nov 22, 2020DrahtBot added the label Utils/log/libs on Nov 22, 2020practicalswift force-pushed on Nov 22, 2020practicalswift commented at 2:07 PM on November 22, 2020: contributorI think we need to make a choice here (either one or both):
- Make an emulation of all broken and surprising
atoibehavior, but call it something else thanToIntegral, something likeLocaleIndependentAtoi, make it clear it should not be used in new code, it's only meant for preservingatoibehavior for backward compatibility reasons. - Make a sane integer parsing function that can eventually replace
Parse[U]IntXXor their contents in a locale-independent way. I'm fine with the name then.
Good point. I thought about those options as well. I think we should do both. This PR is meant as a pure refactoring: it is not meant to change any behaviour that is defined by
atoi.I've now made it more clear that this PR is meant as a quirk-by-quirk compatible replacement for
atoiby calling the functionLocaleIndependentAtoi:)Good point about naming: we should keep the name
ToIntegralfor a future sane version that doesn't emulateatoior any other legacy function.practicalswift force-pushed on Nov 22, 2020practicalswift force-pushed on Nov 22, 2020laanwj commented at 5:48 PM on November 22, 2020: memberOkay, agree. I guess it could replace
atoias well asatoi64in that case (as it's parametrized on type)?practicalswift force-pushed on Nov 22, 2020practicalswift commented at 7:48 PM on November 22, 2020: contributor@laanwj Done! I've now replaced all instances of
atoi64(s)withLocaleIndependentAtoi<int64_t>(s).in src/util/strencodings.h:80 in 6a6a71e098 outdated
69 | @@ -68,8 +70,25 @@ std::string EncodeBase32(Span<const unsigned char> input, bool pad = true); 70 | std::string EncodeBase32(const std::string& str, bool pad = true); 71 | 72 | void SplitHostPort(std::string in, int& portOut, std::string& hostOut); 73 | -int64_t atoi64(const std::string& str); 74 | -int atoi(const std::string& str); 75 | + 76 | +template <typename T>
laanwj commented at 9:54 AM on November 23, 2020:Please add a comment what this function does (e.g.: replicate the exact behavior of
atoifor backwards compatibility) and that new code should use theParse[U]IntXXfunctions with parse error feedback.
practicalswift commented at 3:50 PM on November 23, 2020:Done! :)
practicalswift force-pushed on Nov 23, 2020practicalswift force-pushed on Nov 23, 2020practicalswift renamed this:Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)
util: Replace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)
on Nov 24, 2020MarcoFalke referenced this in commit 9402159b9b on Nov 24, 2020practicalswift force-pushed on Nov 24, 2020sidhujag referenced this in commit 3a1441004a on Nov 24, 2020laanwj commented at 2:28 PM on November 25, 2020: memberI don't get the CI error. Is
charconvnon-standard in some way?In file included from primitives/transaction.cpp:10:0: ./util/strencodings.h:16:10: fatal error: charconv: No such file or directory #include <charconv> ^~~~~~~~~~ compilation terminated.practicalswift commented at 3:09 PM on December 1, 2020: contributor@laanwj I think the CI failures are from relatively older distros, and that C++17 library support lagged behind C++17 compiler support for a while in C++17 infancy.
According to cppreference the elementary string conversions (
<charconv>) were added in GCC libstdc++ 8, Clang libc++ 7 and MSVC Standard Library 19.14.MarcoFalke removed the label GUI on Dec 1, 2020MarcoFalke removed the label P2P on Dec 1, 2020MarcoFalke added the label Waiting for author on Dec 1, 2020MarcoFalke commented at 3:51 PM on December 1, 2020: memberok, this is blocked on #20460
practicalswift commented at 12:57 PM on December 15, 2020: contributorCould remove "Waiting for author"? I don't know if we have any "Blocked on another PR/issue" (or "Waiting for toolchain upgrade") tag, but this PR is blocked on #20460 :)
MarcoFalke removed the label Waiting for author on Dec 15, 2020MarcoFalke added the label Waiting for other on Dec 15, 2020DrahtBot added the label Needs rebase on Dec 15, 2020practicalswift force-pushed on Dec 16, 2020DrahtBot removed the label Needs rebase on Dec 16, 2020practicalswift force-pushed on Mar 2, 2021DrahtBot added the label Needs rebase on Mar 19, 2021practicalswift force-pushed on Mar 19, 2021DrahtBot removed the label Needs rebase on Mar 19, 2021practicalswift force-pushed on Mar 21, 2021practicalswift force-pushed on Apr 24, 2021DrahtBot added the label Needs rebase on May 5, 2021practicalswift force-pushed on May 12, 2021practicalswift force-pushed on May 12, 2021DrahtBot removed the label Needs rebase on May 12, 2021DrahtBot added the label Needs rebase on May 30, 2021practicalswift force-pushed on Jun 13, 2021practicalswift commented at 8:12 AM on June 13, 2021: contributorRebased!
Could this PR get a 23.0 milestone like the related PR #20457 ("util: Make Parse{Int,UInt}{32,64} use locale independent std::from_chars(…) (C++17) instead of locale dependent strto{l,ll,ul,ull}")? :)
MarcoFalke added this to the milestone 23.0 on Jun 13, 2021DrahtBot removed the label Needs rebase on Jun 13, 2021practicalswift force-pushed on Jun 13, 2021practicalswift force-pushed on Jun 22, 2021practicalswift force-pushed on Jun 22, 2021practicalswift force-pushed on Jun 22, 2021PastaPastaPasta referenced this in commit 46a05ad537 on Jun 27, 2021PastaPastaPasta referenced this in commit f059da98e6 on Jun 28, 2021PastaPastaPasta referenced this in commit 840d3332e4 on Jun 29, 2021practicalswift force-pushed on Jun 30, 2021PastaPastaPasta referenced this in commit 1c80f26441 on Jul 1, 2021PastaPastaPasta referenced this in commit dea0bfecd1 on Jul 1, 2021PastaPastaPasta referenced this in commit 1621639e85 on Jul 15, 2021practicalswift force-pushed on Aug 15, 2021DrahtBot added the label Needs rebase on Aug 24, 2021practicalswift force-pushed on Sep 5, 2021DrahtBot removed the label Needs rebase on Sep 5, 2021DrahtBot added the label Needs rebase on Sep 17, 2021PastaPastaPasta referenced this in commit 0bb4bb8db0 on Sep 17, 2021practicalswift force-pushed on Sep 18, 2021DrahtBot removed the label Needs rebase on Sep 18, 2021PastaPastaPasta referenced this in commit 4485b3ea20 on Sep 19, 2021PastaPastaPasta referenced this in commit 40f2d3e692 on Sep 21, 2021PastaPastaPasta referenced this in commit 2328ead832 on Sep 24, 2021fanquake referenced this in commit 67eae69f3f on Sep 28, 2021MarcoFalke removed the label Waiting for other on Sep 28, 2021practicalswift commented at 2:19 PM on September 29, 2021: contributorpracticalswift force-pushed on Sep 30, 20214343f114ccReplace use of locale dependent atoi(…) with locale-independent std::from_chars(…) (C++17)
test: Add test cases for LocaleIndependentAtoi fuzz: Assert legacy atoi(s) == LocaleIndependentAtoi<int>(s) fuzz: Assert legacy atoi64(s) == LocaleIndependentAtoi<int64_t>(s)
practicalswift force-pushed on Sep 30, 2021jonatack commented at 3:33 PM on September 30, 2021: memberConcept ACK
laanwj commented at 4:57 PM on September 30, 2021: memberCode review ACK 4343f114cc661cf031ec915538c11b9b030e2e15
in src/rpc/mining.cpp:705 in 4343f114cc
701 | @@ -702,7 +702,7 @@ static RPCHelpMan getblocktemplate() 702 | std::string lpstr = lpval.get_str(); 703 | 704 | hashWatchedChain = ParseHashV(lpstr.substr(0, 64), "longpollid"); 705 | - nTransactionsUpdatedLastLP = atoi64(lpstr.substr(64)); 706 | + nTransactionsUpdatedLastLP = LocaleIndependentAtoi<int64_t>(lpstr.substr(64));
jonatack commented at 8:20 PM on September 30, 2021:nTransactionsUpdatedLastLPisunsigned int; should this be defined or cast to that?
laanwj commented at 7:07 AM on October 1, 2021:Yes it probably needs the same kind of ugly cast as
nTimeSmartin the wallet code. In the longer run the field should imo be converted to an explicitly sized type, and parsed as that, but keeping the behavior exactly the same makes sense here to keep it uncontroversial.
practicalswift commented at 2:19 PM on October 1, 2021:@jonatack Good point! I suggest we tackle that in a follow-up PR to keep this PR pure refactoring only :)
jonatack commented at 8:25 PM on September 30, 2021: memberCode review ACK 4343f114cc661cf031ec915538c11b9b030e2e15
in src/util/strencodings.h:76 in 4343f114cc
73 | -int atoi(const std::string& str); 74 | + 75 | +// LocaleIndependentAtoi is provided for backwards compatibility reasons. 76 | +// 77 | +// New code should use the ParseInt64/ParseUInt64/ParseInt32/ParseUInt32 functions 78 | +// which provide parse error feedback.
laanwj commented at 7:01 AM on October 1, 2021:This is a really important comment. I hope we can get rid of these functions entirely at some point. The implicit error handling behavior of
atoiandatoi64is pretty much always undesirable. (so I would normally comment "these function names are long and clunky, please shorten them" but it's good in this case, we want using them to be ugly :smile: )
MarcoFalke commented at 10:17 AM on October 5, 2021:The linter says to use
ToIntegral, this contradicts by saying to useParseInt*??
practicalswift commented at 11:55 AM on October 5, 2021:ToIntegralsounds like a good suggestion now that it is part ofmaster:)laanwj merged this on Oct 4, 2021laanwj closed this on Oct 4, 2021sidhujag referenced this in commit 95062277f7 on Oct 4, 2021in src/core_read.cpp:72 in 4343f114cc
68 | @@ -69,7 +69,7 @@ CScript ParseScript(const std::string& s) 69 | (w->front() == '-' && w->size() > 1 && std::all_of(w->begin()+1, w->end(), ::IsDigit))) 70 | { 71 | // Number 72 | - int64_t n = atoi64(*w); 73 | + int64_t n = LocaleIndependentAtoi<int64_t>(*w);
MarcoFalke commented at 10:03 AM on October 5, 2021:The condition literally checks for the required format for ParseIntegral<>, so why not use that?
practicalswift commented at 11:54 AM on October 5, 2021:Sounds like a good idea now that
ParseIntegralis part ofmaster:)
MarcoFalke commented at 12:03 PM on October 5, 2021:Sorry, I was wrong. This is not a refactor.
in src/qt/rpcconsole.cpp:253 in 4343f114cc
249 | @@ -250,7 +250,7 @@ bool RPCConsole::RPCParseCommandLine(interfaces::Node* node, std::string &strRes 250 | for(char argch: curarg) 251 | if (!IsDigit(argch)) 252 | throw std::runtime_error("Invalid result query"); 253 | - subelement = lastResult[atoi(curarg.c_str())]; 254 | + subelement = lastResult[LocaleIndependentAtoi<int>(curarg)];
MarcoFalke commented at 10:04 AM on October 5, 2021:Same
MarcoFalke commented at 12:03 PM on October 5, 2021:Sorry, I was wrong. This is not a refactor.
in src/util/moneystr.cpp:80 in 4343f114cc
76 | @@ -77,8 +77,7 @@ std::optional<CAmount> ParseMoney(const std::string& money_string) 77 | return std::nullopt; 78 | if (nUnits < 0 || nUnits > COIN) 79 | return std::nullopt; 80 | - int64_t nWhole = atoi64(strWhole); 81 | - 82 | + int64_t nWhole = LocaleIndependentAtoi<int64_t>(strWhole);
MarcoFalke commented at 10:36 AM on October 5, 2021:same
MarcoFalke commented at 12:03 PM on October 5, 2021:Sorry, I was wrong. This is not a refactor.
MarcoFalke commented at 10:37 AM on October 5, 2021: memberconcept ack
MarcoFalke commented at 11:11 AM on October 5, 2021: memberFollow up in #23184
MarcoFalke commented at 12:04 PM on October 5, 2021: memberwithdrawn my comments
kittywhiskers referenced this in commit 91f5c0d329 on Oct 12, 2021jamesob referenced this in commit 5b78055d59 on Dec 22, 2021jamesob referenced this in commit fc11f8882c on Dec 22, 2021jamesob referenced this in commit 6d2a15adff on Dec 22, 2021jamesob referenced this in commit 1d229c783c on Jan 3, 2022jamesob referenced this in commit ac1a5b113c on Jan 4, 2022ryanofsky referenced this in commit 4a15ddc19c on Jan 11, 2022ryanofsky referenced this in commit 6184764899 on Jan 11, 2022ryanofsky referenced this in commit d5c0e2ddd8 on Jan 11, 2022ryanofsky referenced this in commit 389c47a78e on Jan 11, 2022ryanofsky referenced this in commit 1b0b1ddc42 on Jan 11, 2022ryanofsky referenced this in commit b5c9bb5cb9 on Jan 12, 2022MarcoFalke referenced this in commit 290ff5ef6d on Jan 12, 2022rebroad referenced this in commit 4c76d815b2 on Feb 3, 2022MarcoFalke locked this on Aug 15, 2022ContributorsLabelsMilestone
23.0
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-05-02 18:14 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me