Modernize util/strencodings and util/string: string_view and optional #24764

pull sipa wants to merge 10 commits into bitcoin:master from sipa:202204_stringview changing 18 files +196 −240
  1. sipa commented at 7:14 pm on April 4, 2022: member

    Make use of std::string_view and std::optional in the util/{strencodings, string} files.

    This avoids many temporary string/vector objects being created, while making the interface easier to read. Changes include:

    • Make all input arguments in functions in util/strencodings and util/string take std::string_view instead of std::string.
    • Add RemovePrefixView and TrimStringView which also return std::string_view objects (the corresponding RemovePrefix and TrimString keep returning an std::string, as that’s needed in many call sites still).
    • Stop returning std::string from DecodeBase32 and DecodeBase64, but return vectors. Base32/64 are fundamentally algorithms for encoding bytes as strings; returning std::string from those (especially doing it conditionally based on the input arguments/types) is just bizarre.
    • Stop taking a bool* pf_invalid output argument pointer in DecodeBase32 and DecodeBase64; return an std::optional instead.
    • Make DecodeBase32 and DecodeBase64 more efficient by doing the conversion from characters to integer symbols on-the-fly rather than through a temporary vector.
  2. sipa force-pushed on Apr 4, 2022
  3. DrahtBot added the label GUI on Apr 4, 2022
  4. DrahtBot added the label P2P on Apr 4, 2022
  5. DrahtBot added the label RPC/REST/ZMQ on Apr 4, 2022
  6. DrahtBot added the label Utils/log/libs on Apr 4, 2022
  7. DrahtBot commented at 11:40 pm on April 4, 2022: member

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #24742 ([POC] build: prune Boost headers in depends by fanquake)
    • #24185 (refactor: only use explicit reinterpret/const casts, not implicit by PastaPastaPasta)
    • #24058 (BIP-322 basic support by kallewoof)
    • #23595 (util: Add ParseHexstd::byte() helper by MarcoFalke)
    • #22953 (refactor: introduce single-separator split helper (boost::split replacement) by theStack)
    • #22087 (Validate port-options by amadeuszpawlik)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  8. MarcoFalke removed the label GUI on Apr 5, 2022
  9. MarcoFalke removed the label RPC/REST/ZMQ on Apr 5, 2022
  10. MarcoFalke removed the label P2P on Apr 5, 2022
  11. MarcoFalke removed the label Utils/log/libs on Apr 5, 2022
  12. MarcoFalke added the label Refactoring on Apr 5, 2022
  13. in src/util/strencodings.cpp:96 in d59c1f99bd outdated
    105+            ++it;
    106+            continue;
    107+        }
    108+        auto c1 = HexDigit(*(it++));
    109+        auto c2 = HexDigit(*(it++));
    110+        if (c1 == (signed char)-1 || c2 == (signed char)-1) break;
    


    MarcoFalke commented at 7:52 am on April 5, 2022:

    nit in the first commit:

    Is there any reason for the cast? My understanding is that compilers will treat this as int either way. So I think this can be clearer written as c1==-1||c2==-1. Alternatively, c1<0||c2<0, similar to the other places where HexDigit is used.


    sipa commented at 12:09 pm on April 5, 2022:
    I left the casts there because the original calls had them too. I believe they are indeed redundant here, I’ll remove them.

    sipa commented at 1:45 pm on April 5, 2022:
    Replaced it with < 0 checks.
  14. laanwj commented at 8:03 am on April 5, 2022: member

    Concept ACK

    Add RemovePrefixView and TrimStringView which also return std::string_view objects (the corresponding RemovePrefix and TrimString keep returning an std::string, as that’s needed in many call sites still).

    Returning a string_view can be somewhat dangerous (in combination with auto? also when the original string goes out of scope, especially if it’s a temporary), so giving these functions new names is a good idea.

  15. in src/util/strencodings.cpp:79 in 6e1f147804 outdated
    87-    return (str.size() > starting_location);
    88+    return str.size() > 0;
    89 }
    90 
    91-std::vector<unsigned char> ParseHex(const char* psz)
    92+std::vector<unsigned char> ParseHex(std::string_view str)
    


    laanwj commented at 8:09 am on April 5, 2022:
    ParseHex itself could use error feedback using an optional at some point too. But this will need revisiting all call sites, too much for this PR.

    sipa commented at 1:45 pm on April 5, 2022:
    Right, good for a follow-up.
  16. in src/httprpc.cpp:136 in e2fcab1457 outdated
    131@@ -132,7 +132,8 @@ static bool RPCAuthorized(const std::string& strAuth, std::string& strAuthUserna
    132     if (strAuth.substr(0, 6) != "Basic ")
    133         return false;
    134     std::string strUserPass64 = TrimString(strAuth.substr(6));
    135-    std::string strUserPass = DecodeBase64(strUserPass64);
    136+    std::vector<unsigned char> userpass_data = DecodeBase64(strUserPass64);
    137+    std::string strUserPass(userpass_data.begin(), userpass_data.end());
    


    MarcoFalke commented at 8:19 am on April 5, 2022:

    nit in e2fcab145708e43daed290e1b951ff5e43d8890b:

    I prefer {} over (). (feel free to ignore)


    MarcoFalke commented at 8:44 am on April 5, 2022:
    Edit: Removed in a later commit, pls ignore.
  17. in src/test/base32_tests.cpp:26 in e2fcab1457 outdated
    21@@ -22,8 +22,8 @@ BOOST_AUTO_TEST_CASE(base32_testvectors)
    22         BOOST_CHECK_EQUAL(strEnc, vstrOut[i]);
    23         strEnc = EncodeBase32(vstrIn[i], false);
    24         BOOST_CHECK_EQUAL(strEnc, vstrOutNoPadding[i]);
    25-        std::string strDec = DecodeBase32(vstrOut[i]);
    26-        BOOST_CHECK_EQUAL(strDec, vstrIn[i]);
    27+        auto dec = DecodeBase32(vstrOut[i]);
    28+        BOOST_CHECK_MESSAGE(MakeByteSpan(dec) == MakeByteSpan(vstrIn[i]), vstrIn[i]);
    


    MarcoFalke commented at 8:37 am on April 5, 2022:

    nit in https://github.com/bitcoin/bitcoin/commit/e2fcab145708e43daed290e1b951ff5e43d8890b:

    Isn’t it problematic to print binary data as is to the terminal? vstrIn[i] (the error message) is binary data?

    Wouldn’t it be better to call BOOST_CHECK_EQUAL(std::string{dec.begin(), dec.end()}, vstrIn[i]);, if it compiles?


    MarcoFalke commented at 10:11 am on April 5, 2022:

    edit: Looks like the binary data in this test are printable-ascii-only strings, so feel free to ignore. Also it looks like boost doesn’t attempt to sanitize the binary data anyway.

    Though, I still slightly prefer to print both the decoded string and the input string on failure, as opposed to just the input string.


    sipa commented at 1:46 pm on April 5, 2022:
    I’ve changed them to just reporting the encoded string in case of failure; that should still be sufficient to notice the offending test case, and keeps working if we’d add cases that have non-printable input. Ok?
  18. in src/test/base64_tests.cpp:23 in e2fcab1457 outdated
    18@@ -19,8 +19,8 @@ BOOST_AUTO_TEST_CASE(base64_testvectors)
    19     {
    20         std::string strEnc = EncodeBase64(vstrIn[i]);
    21         BOOST_CHECK_EQUAL(strEnc, vstrOut[i]);
    22-        std::string strDec = DecodeBase64(strEnc);
    23-        BOOST_CHECK_EQUAL(strDec, vstrIn[i]);
    24+        auto dec = DecodeBase64(strEnc);
    25+        BOOST_CHECK_MESSAGE(MakeByteSpan(dec) == MakeByteSpan(vstrIn[i]), vstrIn[i]);
    


    MarcoFalke commented at 8:37 am on April 5, 2022:
    Same

    sipa commented at 1:47 pm on April 5, 2022:
  19. in src/i2p.cpp:73 in 576799874b outdated
    74-    if (invalid) {
    75-        throw std::runtime_error(strprintf("Cannot decode Base64: \"%s\"", i2p_b64));
    76-    }
    77-    return decoded;
    78+    auto decoded = DecodeBase64(std_b64);
    79+    if (!decoded) throw std::runtime_error(strprintf("Cannot decode Base64: \"%s\"", i2p_b64));
    


    MarcoFalke commented at 8:49 am on April 5, 2022:

    nit in 576799874b7e4d026ab888dc2bef84a5c02adfa1:

    I think it is fine if you prefer single-line if over multi-line ones for your own code. Though, I don’t think we should convert unrelated lines to that style. This makes the diff larger than needed and I think single-line ifs shouldn’t be used for anything non-trivial (continue;, break;, return;). For anything larger they promote large lines and make it hard to see coverage information as well as attaching a debugger to the right statements. Also they increase the diffs for future patches that need to add lines to the if-body.


    sipa commented at 1:48 pm on April 5, 2022:
    I prefer code with less clutter lines in general, but you make a good point. Reverted.
  20. in src/test/base32_tests.cpp:26 in 576799874b outdated
    22@@ -23,19 +23,14 @@ BOOST_AUTO_TEST_CASE(base32_testvectors)
    23         strEnc = EncodeBase32(vstrIn[i], false);
    24         BOOST_CHECK_EQUAL(strEnc, vstrOutNoPadding[i]);
    25         auto dec = DecodeBase32(vstrOut[i]);
    26-        BOOST_CHECK_MESSAGE(MakeByteSpan(dec) == MakeByteSpan(vstrIn[i]), vstrIn[i]);
    27+        BOOST_CHECK_MESSAGE(dec && MakeByteSpan(*dec) == MakeByteSpan(vstrIn[i]), vstrIn[i]);
    


    MarcoFalke commented at 9:00 am on April 5, 2022:

    nit in https://github.com/bitcoin/bitcoin/commit/576799874b7e4d026ab888dc2bef84a5c02adfa1:

    An alternative would be to use dec.value(), which gives a better indication of where the error happened.

    Edit: Or maybe even a BOOST_REQUIRE(dec); in a separate line?


    sipa commented at 1:49 pm on April 5, 2022:
    Used BOOST_REQUIRE(dec);.
  21. in src/test/base64_tests.cpp:23 in 576799874b outdated
    19@@ -20,7 +20,7 @@ BOOST_AUTO_TEST_CASE(base64_testvectors)
    20         std::string strEnc = EncodeBase64(vstrIn[i]);
    21         BOOST_CHECK_EQUAL(strEnc, vstrOut[i]);
    22         auto dec = DecodeBase64(strEnc);
    23-        BOOST_CHECK_MESSAGE(MakeByteSpan(dec) == MakeByteSpan(vstrIn[i]), vstrIn[i]);
    24+        BOOST_CHECK_MESSAGE(dec && MakeByteSpan(*dec) == MakeByteSpan(vstrIn[i]), vstrIn[i]);
    


    MarcoFalke commented at 9:00 am on April 5, 2022:
    Same

    sipa commented at 1:49 pm on April 5, 2022:
    Used BOOST_REQUIRE(dec);.
  22. in src/util/strencodings.cpp:249 in 576799874b outdated
    253-            *pf_invalid = true;
    254-        }
    255-        return {};
    256-    }
    257-    return DecodeBase32(str.c_str(), pf_invalid);
    258+    if (!ValidAsCString(str)) return {};
    


    MarcoFalke commented at 9:06 am on April 5, 2022:

    nit in https://github.com/bitcoin/bitcoin/commit/576799874b7e4d026ab888dc2bef84a5c02adfa1:

    Diff could be smaller if the multi-line if is kept.


    sipa commented at 1:49 pm on April 5, 2022:
    Done.
  23. in src/util/strencodings.cpp:150 in 2e2645a9b1 outdated
    154-        ++p;
    155-    }
    156+    if (str.size() % 4 != 0) return {};
    157+    /* One or two = characters at the end are permitted. */
    158+    if (str.size() >= 1 && str.back() == '=') str.remove_suffix(1);
    159+    if (str.size() >= 1 && str.back() == '=') str.remove_suffix(1);
    


    MarcoFalke commented at 9:47 am on April 5, 2022:

    nit in 2e2645a9b169824f064f1d69eb80d1a1945c68fb: I tagged this pull with “refactoring”, but I think the validation is now minimally stricter?

    • DecodeBase64 fails if the size isn’t %4. Previously it may have succeed. For example if a trailing padding char was removed.

    It may be possible to check this with the following test on current master:

     0diff --git a/src/test/base64_tests.cpp b/src/test/base64_tests.cpp
     1index 6ee1b83691..e06bd3a542 100644
     2--- a/src/test/base64_tests.cpp
     3+++ b/src/test/base64_tests.cpp
     4@@ -14,12 +14,11 @@ BOOST_AUTO_TEST_SUITE(base64_tests)
     5 BOOST_AUTO_TEST_CASE(base64_testvectors)
     6 {
     7     static const std::string vstrIn[]  = {"","f","fo","foo","foob","fooba","foobar"};
     8-    static const std::string vstrOut[] = {"","Zg==","Zm8=","Zm9v","Zm9vYg==","Zm9vYmE=","Zm9vYmFy"};
     9+    static const std::string vstrOut[] = {"","Zg=","Zm8","Zm9v","Zm9vYg","Zm9vYmE=","Zm9vYmFy"};
    10     for (unsigned int i=0; i<std::size(vstrIn); i++)
    11     {
    12         std::string strEnc = EncodeBase64(vstrIn[i]);
    13-        BOOST_CHECK_EQUAL(strEnc, vstrOut[i]);
    14-        std::string strDec = DecodeBase64(strEnc);
    15+        std::string strDec = DecodeBase64(vstrOut[i]);
    16         BOOST_CHECK_EQUAL(strDec, vstrIn[i]);
    17     }
    18 
    

    MarcoFalke commented at 10:18 am on April 5, 2022:
    If true, it could make sense to mention this behaviour change in the commit description (or pull request description).

    sipa commented at 1:57 pm on April 5, 2022:

    Interesting! The answer is that it’s complicated.

    So, the test code you suggest here actually passes, but only because it isn’t checking the invalid output flag. Current master, I believe, decodes input which isn’t a multiple of 4 characters correctly (as if it had padding) but does set the *pf_invalid flag.

    So this is indeed a behavior change, but only for call sites which don’t check the invalid flag. I think there is only one such site: RPCAuthorized() in httprpc.cpp, and I think that’s a bug (the HTTP spec RFC 7617 for Basic authentication specifies the use of padded Base64 on very casual reading).

    If so, the commit that switches to std::optional-wrapped results actually fixes this.

    Thoughts?


    MarcoFalke commented at 3:19 pm on April 5, 2022:

    Oh right. The commit that changes behaviour is likely the std::optional one (889a5f107c5187c9232f340adfaedfe1b90361d5), not the string_view one I mentioned earlier.

    I guess there is nothing needed to be done here. I just wanted to mention the behaviour change here, so that we aren’t surprised when someone creates a bug report “My broken javascript rpc library doesn’t work”.


    sipa commented at 3:35 am on April 9, 2022:
    Moved the behavior-changing aspect to a separate commit.
  24. MarcoFalke approved
  25. MarcoFalke commented at 10:04 am on April 5, 2022: member

    Returning a string_view can be somewhat dangerous (in combination with auto? also when the original string goes out of scope, especially if it’s a temporary), so giving these functions new names is a good idea.

    A new name is good, but doesn’t solve the problem. I think this can be fixed with an attribute: #20493

    I think this can be merged, but I left a few nits and questions.

    review ACK 6e1f14780452d956503f1dbc3c91b17b44287f77 💢

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3review ACK 6e1f14780452d956503f1dbc3c91b17b44287f77 💢
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUghZAv+JPOBXiEZ9L1jCF9HPLDO1vGLEuvq6hw6EkaMSKd0Zn3UaiYvP98jPBMZ
     8HzTidM3yPWwDfLw3QVTlPCdt0J8wBAwOg5ZgjGRZ7fu1hn0SZ93q960giU68DbeJ
     9ZZCF4fAYPOkTqf1a06DnV7anglhEIG/JXMg+XESTvUGmtdKnP48sdwek4NZOw1h7
    10PcA+Ck38qlb68W38HQMwsQYr/aO2Ftc2Utw2GMv3Kkr7nD6kIwtuH9WOTWc8Kw2H
    11vr5NeRfQS1smm9rP+VrixKmCaC1h37r6NMuXDqh/MRNj77ONwmLDLWES/4bP9ioc
    12w814fQled+vIue+Mv5yBUNeGZmlOryQsC7wJwNkiRNKxG5H+bSn2KUKEtZAGjHpv
    13rfNI9X9nbDCvDQdHemrYCessHaCfGFCMeUDC4PWDZ0uNaZSC6RrU0y8pESeMOzxe
    14BJJxSpq0sHCOAVt8LKGyumBlma/k2+n9d6UNo3++K/am9Uxei4kFgLJ0QebpXKqm
    15NqMH+EA3
    16=/7qh
    17-----END PGP SIGNATURE-----
    
  26. laanwj commented at 11:04 am on April 5, 2022: member

    A new name is good, but doesn’t solve the problem. I think this can be fixed with an attribute: #20493

    Agree, I just meant that putting a string_view-returning function as a drop-in replacement would have been a very bad idea, a new function with a new name means that people using it can reason about the risks anew. Sure, if an attribute can help avoid such problems more widely that’s of course even better.

  27. sipa force-pushed on Apr 5, 2022
  28. MarcoFalke commented at 1:57 pm on April 5, 2022: member

    review ACK e4092c5230bdd1f6439c70fa22b68381e5cc6899 🐤

    Only changes:

    • Remove confusing (signed char) cast
    • Use shorter str.substr(0, 2) == “0x” over manual matching/implementation
    • A minor change in a unit test
    • Whitespace changes

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3review ACK e4092c5230bdd1f6439c70fa22b68381e5cc6899 🐤
     4
     5Only changes:
     6* Remove confusing (signed char) cast
     7* Use shorter str.substr(0, 2) == "0x" over manual matching/implementation
     8* A minor change in a unit test
     9* Whitespace changes
    10-----BEGIN PGP SIGNATURE-----
    11
    12iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
    13pUjhQQwAsEPZBkw/LyKgYc/xITU1DL5Oausc0jYNBH7WrBEh055OyfUZ5NgxM87K
    14bcJTNQMCmLVfWnOcg0FUELaXCGbqua8qPgiKwQT5nxlHN0DXpakSSWrkGh2HKIGe
    15zwPKwNrMEH7TrCs3Tkn3ZTkiHI3gl+AFTL5zdw1b/sMwPapeZLDqoAQZXTqj+LRB
    160d44jHX6oO5faHRI6p+fErVtg40Q+3s49CkYXbTl/u1HEWaKcWBxk0wZ9JHUUWuY
    17Z2K1OWDWH/KuR/gYlUUuVfJk5jgG6NqNgzErLUjCtYjQ6LGpI90e7ud/ahv2Xndb
    18vRzi/r/XAQrzlYLKM4ZuoeiTTSKxLJHD6353KVgPVKETioM6o1eyn3qWpg5JwnT4
    19VglcsozZzoVbSdghoQ397NhFjI+Lfs+d60EcUB8lzPgzP00gAo2smlXXM0QMBRV/
    20XOweIVcmYulLYK0LnBXbQy4GSqyE06JsIKpugwFJEcKb7mkdJSdJ8Xg4xxwwYyjR
    21CdV7knXn
    22=yyjx
    23-----END PGP SIGNATURE-----
    
  29. Empact commented at 6:17 pm on April 6, 2022: member

    Concept ACK - the ConvertBits functionality is unused in this PR - maybe it could be saved for a time when it is needed? (misread)

    nits:

    • Mild preference for more strict separation between refactoring and behavior modification - would make this easier to ACK, at least in part
    • There is a check in IsHex (% 2 == 0) that could be applied to IsHexNumber
    • Could inline or return early in some cases where you’re converting to return optional (previous implementation was constrained by the need to set the val) 889a5f107c5187c9232f340adfaedfe1b90361d5
  30. sipa commented at 6:21 pm on April 6, 2022: member

    Mild preference for more strict separation between refactoring and behavior modification - would make this easier to ACK, at least in part

    Yeah, I’ll separate them (now that we figured out the issue with ignoring the pfInvalid flag in the http auth logic).

    There is a check in IsHex (% 2 == 0) that could be applied to IsHexNumber

    That would be incorrect. Hex numbers don’t need to be a multiple of 2 characters (0xF is a valid hex number, e.g.).

  31. Make ParseHex use string_view f80dedcc61
  32. Make IsHex use string_view 6208dc43f9
  33. Make IsHexNumber use string_view d44fc2cf74
  34. Make SanitizeString use string_view 2f152771c1
  35. Reject incorrect base64 in HTTP auth
    In addition, to make sure that no call site ignores the invalid
    decoding status, make the pf_invalid argument mandatory.
    196c5ac178
  36. Make DecodeBase{32,64} always return vector, not string
    Base32/base64 are mechanisms for encoding binary data. That they'd
    decode to a string is just bizarre. The fact that they'd do that
    based on the type of input arguments even more so.
    e779a46b22
  37. Make DecodeBase{32,64} return optional instead of taking bool* 5a103b7664
  38. Generalize ConvertBits to permit transforming the input 1d980ffdd8
  39. Make DecodeBase{32,64} take string_view arguments e19891c653
  40. Use std::string_view throughout util strencodings/string 5ea072ed08
  41. sipa force-pushed on Apr 9, 2022
  42. MarcoFalke approved
  43. MarcoFalke commented at 9:55 am on April 11, 2022: member

    re-ACK 5ea072ed08b43cdfdf8554597dcf8c1a51ef4dae only effective change is adding a “return false” 🌷

    Signature:

     0-----BEGIN PGP SIGNED MESSAGE-----
     1Hash: SHA512
     2
     3re-ACK 5ea072ed08b43cdfdf8554597dcf8c1a51ef4dae only effective change is adding a "return false" 🌷
     4-----BEGIN PGP SIGNATURE-----
     5
     6iQGzBAEBCgAdFiEE+rVPoUahrI9sLGYTzit1aX5ppUgFAlwqrYAACgkQzit1aX5p
     7pUgUyQwAhIAVoAgFscYNynT/fORwnmC2JJczY8Zne+7IKHyQadrQGB3uGARyJceI
     8WDRjN7aLHHJwFmFZ/Lz4HGa5sZYwAUC/Vz9y2eX0nu2PInfZgdtNbx8JT0cx5wkX
     903LWXiek/qiy6BeEctpE1nP37iSAqB4E18yEhCMqdLe+m7KCq/rTZN2XdF/R3DHK
    10j9mqyyy7M5dYQUE2dViBdqTydSrhGIhw2HqHXytP8/urfFB04wLXeJHNDZ6lRfZ1
    11zq28KVL2FOuFXdYTIAtdeCAFAXFgwiZ2lYik1+aftCBBO9BdBnmKqRD0/ERAC2u8
    12TfVPFMQECHbWxcxGwZGBjhNhcXGI2fVUF2g4SqJVssqPnG53GcSRvN6H0tcKDAH7
    13g1x6fqDibKBIUXcrTp8l5tlZYknN2wU0wq9M7U6uJIt5gGWOTlpYpbQsMIlzg6Ls
    14omRKFXjTXHvFmO8eilHMaMVk6qY/12MjoTyHZ1JwN7TWHRxH6Q9XNTFeq1VJE/qd
    15uqaqLwJe
    16=bejp
    17-----END PGP SIGNATURE-----
    
  44. in src/util/string.h:88 in 5ea072ed08
    84@@ -75,9 +85,12 @@ inline std::string MakeUnorderedList(const std::vector<std::string>& items)
    85 /**
    86  * Check if a string does not contain any embedded NUL (\0) characters
    87  */
    88-[[nodiscard]] inline bool ValidAsCString(const std::string& str) noexcept
    89+[[nodiscard]] inline bool ValidAsCString(std::string_view str) noexcept
    


    martinus commented at 6:22 am on April 19, 2022:

    I think the name ValidAsCString is now a bit dangerous. I’d assume that it returns true when the argument is 0 terminated at the end (which was the case for the const std::string& str argument), but with a std::string_view argument this might not be the case.

    So now even when ValidAsCString(str) returns true, it is not allowed to use e.g. strlen(str.data()).

    I would rename the method to something like ContainsNoNUL to make it more clear what’s checked, or keep the const std::string& argument for now


    laanwj commented at 5:24 pm on April 21, 2022:

    According to the documentation, it only checks if a string has embedded \0 characters. That std::string is zero-terminated internally is an implementation detail, the caller is still responsible for passing c_str() (and never .data()) when passing it to a function that takes a C string.

    it is not allowed to use e.g. strlen(str.data()).

    I’d really frown on code like that in any case.

    Edit: The only time it’d be remotely acceptable to pass .data() as a C string would be if there is an embedded \0 in the array, which is exactly what this function avoids :smile:


    martinus commented at 9:26 am on April 22, 2022:
    well, maybe it’s just me, but from the interface ValidAsCString(std::string_view str) I’d guess that it returns true when it actually finds a \0 somewhere, but it’s exactly the opposite :slightly_smiling_face:

    MarcoFalke commented at 11:58 am on April 22, 2022:
    I have no objection to adding a scripted-diff to rename ValidAsCString to ContainsNoNUL. Either in this pull or as a follow-up.

    MarcoFalke commented at 8:47 am on April 25, 2022:

    MarcoFalke commented at 12:20 pm on April 27, 2022:
  45. DrahtBot commented at 10:06 am on April 26, 2022: member

    🐙 This pull request conflicts with the target branch and needs rebase.

    Want to unsubscribe from rebase notifications on this pull request? Just convert this pull request to a “draft”.

  46. DrahtBot added the label Needs rebase on Apr 26, 2022
  47. MarcoFalke commented at 12:07 pm on April 27, 2022: member
    I am happy to pick this up if you are no longer working on it
  48. sipa commented at 12:08 pm on April 27, 2022: member
    @MarcoFalke Go for it.
  49. sipa closed this on Apr 27, 2022

  50. DrahtBot locked this on Apr 27, 2023

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-01-22 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me