- Add defensive bounds check for
rule
parameter inSanitizeString
to prevent out-of-range access toSAFE_CHARS
array - Pre-reserve result capacity and cache reference to allowed charset to avoid repeated lookups
- No behavior change for valid inputs; improves robustness and minor performance
util/strencodings: guard SAFE_CHARS index and pre-reserve result #33396
pull forkfury wants to merge 1 commits into bitcoin:master from forkfury:ff changing 1 files +9 −1-
forkfury commented at 9:00 pm on September 15, 2025: none
-
Update strencodings.cpp abed6fb21e
-
in src/util/strencodings.cpp:30 in abed6fb21e
29@@ -30,8 +30,16 @@ static const std::string SAFE_CHARS[] = 30 std::string SanitizeString(std::string_view str, int rule)
l0rinc commented at 0:51 am on September 16, 2025:therule
is an enum in reality, what’s the reason for this addition?
furszy commented at 0:57 am on September 16, 2025:what about changing theSafeChars
enum to be an enum class instead? (enums auto-increment their value so you would just need to cast it to int here)DrahtBot commented at 0:58 am on September 16, 2025: contributorThe following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Code Coverage & Benchmarks
For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/33396.
Reviews
See the guideline for information on the review process. A summary of reviews will appear here.
in src/util/strencodings.cpp:40 in abed6fb21e
35+ if (rule < 0 || static_cast<size_t>(rule) >= safe_chars_count) { 36+ return result; 37+ } 38+ 39+ result.reserve(str.size()); 40+ const std::string& allowed = SAFE_CHARS[rule];
l0rinc commented at 1:09 am on September 16, 2025:There’s no point in extracting this, the compilers are already smart enough to optimize this, see: https://godbolt.org/z/9rej918W7 (compare the
SanitizeString_Direct
andSanitizeString_Reference
assemblies, they’re the same).But the string reserve that still make sense and we can change the parameter to the enum instead, e.g.
0diff --git a/src/util/strencodings.cpp b/src/util/strencodings.cpp 1index 289f49d46a..d1b99be943 100644 2--- a/src/util/strencodings.cpp 3+++ b/src/util/strencodings.cpp 4@@ -27,9 +27,10 @@ static const std::string SAFE_CHARS[] = 5 CHARS_ALPHA_NUM + "!*'();:@&=+$,/?#[]-_.~%", // SAFE_CHARS_URI 6 }; 7 8-std::string SanitizeString(std::string_view str, int rule) 9+std::string SanitizeString(std::string_view str, SafeChars rule) 10 { 11 std::string result; 12+ result.reserve(str.size()); 13 for (char c : str) { 14 if (SAFE_CHARS[rule].find(c) != std::string::npos) { 15 result.push_back(c); 16diff --git a/src/util/strencodings.h b/src/util/strencodings.h 17index 0106385804..b13ff22713 100644 18--- a/src/util/strencodings.h 19+++ b/src/util/strencodings.h 20@@ -27,12 +27,12 @@ 21 #include <vector> 22 23 /** Used by SanitizeString() */ 24-enum SafeChars 25+enum SafeChars: uint8_t 26 { 27- SAFE_CHARS_DEFAULT, //!< The full set of allowed chars 28- SAFE_CHARS_UA_COMMENT, //!< BIP-0014 subset 29- SAFE_CHARS_FILENAME, //!< Chars allowed in filenames 30- SAFE_CHARS_URI, //!< Chars allowed in URIs (RFC 3986) 31+ SAFE_CHARS_DEFAULT = 0, //!< The full set of allowed chars 32+ SAFE_CHARS_UA_COMMENT = 1, //!< BIP-0014 subset 33+ SAFE_CHARS_FILENAME = 2, //!< Chars allowed in filenames 34+ SAFE_CHARS_URI = 3, //!< Chars allowed in URIs (RFC 3986) 35 }; 36 37 /** 38@@ -59,7 +59,7 @@ enum class ByteUnit : uint64_t { 39 * [@param](/bitcoin-bitcoin/contributor/param/)[in] rule The set of safe chars to choose (default: least restrictive) 40 * [@return](/bitcoin-bitcoin/contributor/return/) A new string without unsafe chars 41 */ 42-std::string SanitizeString(std::string_view str, int rule = SAFE_CHARS_DEFAULT); 43+std::string SanitizeString(std::string_view str, SafeChars rule = SAFE_CHARS_DEFAULT); 44 /** Parse the hex string into bytes (uint8_t or std::byte). Ignores whitespace. Returns nullopt on invalid input. */ 45 template <typename Byte = std::byte> 46 std::optional<std::vector<Byte>> TryParseHex(std::string_view str);
l0rinc changes_requestedl0rinc commented at 1:10 am on September 16, 2025: contributorNot sure if this is an LLM generated patch or not, it’s clear from the usages that theint rule
doesn’t need bounds check - but the reserve does make sense and we can swap the int param to the enum which could make sense.fanquake commented at 11:02 am on September 16, 2025: memberThanks, however we can leave this code as-is.fanquake closed this on Sep 16, 2025
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-09-26 15:13 UTC
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-09-26 15:13 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me
More mirrored repositories can be found on mirror.b10c.me