- Add defensive bounds check for
ruleparameter inSanitizeStringto prevent out-of-range access toSAFE_CHARSarray - Pre-reserve result capacity and cache reference to allowed charset to avoid repeated lookups
- No behavior change for valid inputs; improves robustness and minor performance
util/strencodings: guard SAFE_CHARS index and pre-reserve result #33396
pull forkfury wants to merge 1 commits into bitcoin:master from forkfury:ff changing 1 files +9 −1-
forkfury commented at 9:00 PM on September 15, 2025: none
-
Update strencodings.cpp abed6fb21e
-
in src/util/strencodings.cpp:30 in abed6fb21e
29 | @@ -30,8 +30,16 @@ static const std::string SAFE_CHARS[] = 30 | std::string SanitizeString(std::string_view str, int rule)
l0rinc commented at 12:51 AM on September 16, 2025:the
ruleis an enum in reality, what's the reason for this addition?
furszy commented at 12:57 AM on September 16, 2025:what about changing the
SafeCharsenum to be an enum class instead? (enums auto-increment their value so you would just need to cast it to int here)DrahtBot commented at 12:58 AM on September 16, 2025: contributor<!--e57a25ab6845829454e8d69fc972939a-->
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
<!--006a51241073e994b41acfe9ec718e94-->
Code Coverage & Benchmarks
For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33396.
<!--021abf342d371248e50ceaed478a90ca-->
Reviews
See the guideline for information on the review process. A summary of reviews will appear here.
in src/util/strencodings.cpp:40 in abed6fb21e
35 | + if (rule < 0 || static_cast<size_t>(rule) >= safe_chars_count) { 36 | + return result; 37 | + } 38 | + 39 | + result.reserve(str.size()); 40 | + const std::string& allowed = SAFE_CHARS[rule];
l0rinc commented at 1:09 AM on September 16, 2025:There's no point in extracting this, the compilers are already smart enough to optimize this, see: https://godbolt.org/z/9rej918W7 (compare the
SanitizeString_DirectandSanitizeString_Referenceassemblies, they're the same).But the string reserve that still make sense and we can change the parameter to the enum instead, e.g.
diff --git a/src/util/strencodings.cpp b/src/util/strencodings.cpp index 289f49d46a..d1b99be943 100644 --- a/src/util/strencodings.cpp +++ b/src/util/strencodings.cpp @@ -27,9 +27,10 @@ static const std::string SAFE_CHARS[] = CHARS_ALPHA_NUM + "!*'();:@&=+$,/?#[]-_.~%", // SAFE_CHARS_URI }; -std::string SanitizeString(std::string_view str, int rule) +std::string SanitizeString(std::string_view str, SafeChars rule) { std::string result; + result.reserve(str.size()); for (char c : str) { if (SAFE_CHARS[rule].find(c) != std::string::npos) { result.push_back(c); diff --git a/src/util/strencodings.h b/src/util/strencodings.h index 0106385804..b13ff22713 100644 --- a/src/util/strencodings.h +++ b/src/util/strencodings.h @@ -27,12 +27,12 @@ #include <vector> /** Used by SanitizeString() */ -enum SafeChars +enum SafeChars: uint8_t { - SAFE_CHARS_DEFAULT, //!< The full set of allowed chars - SAFE_CHARS_UA_COMMENT, //!< BIP-0014 subset - SAFE_CHARS_FILENAME, //!< Chars allowed in filenames - SAFE_CHARS_URI, //!< Chars allowed in URIs (RFC 3986) + SAFE_CHARS_DEFAULT = 0, //!< The full set of allowed chars + SAFE_CHARS_UA_COMMENT = 1, //!< BIP-0014 subset + SAFE_CHARS_FILENAME = 2, //!< Chars allowed in filenames + SAFE_CHARS_URI = 3, //!< Chars allowed in URIs (RFC 3986) }; /** @@ -59,7 +59,7 @@ enum class ByteUnit : uint64_t { * [@param](/bitcoin-bitcoin/contributor/param/)[in] rule The set of safe chars to choose (default: least restrictive) * [@return](/bitcoin-bitcoin/contributor/return/) A new string without unsafe chars */ -std::string SanitizeString(std::string_view str, int rule = SAFE_CHARS_DEFAULT); +std::string SanitizeString(std::string_view str, SafeChars rule = SAFE_CHARS_DEFAULT); /** Parse the hex string into bytes (uint8_t or std::byte). Ignores whitespace. Returns nullopt on invalid input. */ template <typename Byte = std::byte> std::optional<std::vector<Byte>> TryParseHex(std::string_view str);l0rinc changes_requestedl0rinc commented at 1:10 AM on September 16, 2025: contributorNot sure if this is an LLM generated patch or not, it's clear from the usages that the
int ruledoesn't need bounds check - but the reserve does make sense and we can swap the int param to the enum which could make sense.fanquake commented at 11:02 AM on September 16, 2025: memberThanks, however we can leave this code as-is.
fanquake closed this on Sep 16, 2025
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-04-24 15:12 UTC
More mirrored repositories can be found on mirror.b10c.me