Prepare string and net utils for future HTTP operations #34242

pull pinheadmz wants to merge 6 commits into bitcoin:master from pinheadmz:http-utils changing 10 files +355 −17
  1. pinheadmz commented at 6:11 pm on January 9, 2026: member

    This is a component of removing libevent as a dependency of the project. It is the first six commits of #32061 and provides a string-parsing utility (LineReader) that is also consumed by #34158.

    These are the functions that are added / updated for HTTP and Torcontrol:

    • GetBindAddress(): Given a socket, provides the bound address as a CService. Currently used by p2p but moved from net to netbase so other modules can call it.
    • ToIntegral(): Already used to parse numbers from strings, added new argument base = 10 so it can also be used to parse hexadecimal integers. HTTP chunked transfer-encoding uses hex-encoded integers to specify payload size: https://datatracker.ietf.org/doc/html/rfc7230.html#section-4.1
    • AsciiCaseInsensitive comparators: Needed to store HTTP headers in an unordered_map. Headers are key-value pairs that are parsed with case-insensitive keys: https://httpwg.org/specs/rfc9110.html#rfc.section.5.1
    • FormatRFC1123DateTime(): The required datetime format for HTTP headers (e.g. Fri, 31 May 2024 19:18:04 GMT)
    • LineReader: Fields in HTTP requests are newline-terminated. This struct is given an input buffer and provides methods to read lines as strings.
  2. DrahtBot commented at 6:11 pm on January 9, 2026: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34242.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK maflcko, furszy, sedited
    Stale ACK fjahr, vasild

    If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #34342 (cli: Replace libevent usage with simple http client by fjahr)
    • #34158 (torcontrol: Remove libevent usage by fjahr)
    • #32061 (Replace libevent with our own HTTP and socket-handling implementation by pinheadmz)
    • #31672 (rpc: add cpu_load to getpeerinfo by vasild)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. in src/netbase.cpp:956 in 67ba7c9302 outdated
    951+CService GetBindAddress(const Sock& sock)
    952+{
    953+    CService addr_bind;
    954+    struct sockaddr_storage sockaddr_bind;
    955+    socklen_t sockaddr_bind_len = sizeof(sockaddr_bind);
    956+    if (!sock.GetSockName((struct sockaddr*)&sockaddr_bind, &sockaddr_bind_len)) {
    


    furszy commented at 6:22 pm on January 9, 2026:

    GetSockName() currently forwards the raw getsockname() return value (0 on success), which forces callers to use an inverted check (!GetSockName(...)).

    Can we normalize it so it returns true on success?

    0bool Sock::GetSockName(sockaddr* name, socklen_t* name_len) const
    1{
    2    return getsockname(m_socket, name, name_len) == 0;
    3}
    

    It would make the code slightly more readable and maintainable.


    pinheadmz commented at 6:47 pm on January 9, 2026:
    I agree with you, but this might be better addressed in a separate PR. The entire Sock class wraps POSIX socket management syscalls that return 0 on success (connect, bind, listen, close, etc…)

    vasild commented at 12:43 pm on January 12, 2026:

    The idea behind the Sock methods is that they are mockable versions of the OS functions. Zero learning curve for newcomers (assuming familiarity with the OS interface). Sock::Foo(x, y, z) is equivalent to calling the OS’s foo(x, y, z).

    inverted check

    the function returns an integer error code, not a boolean. A check like if (foo() == 0) looks better (less “inverted”).


    furszy commented at 4:33 pm on January 12, 2026:
    I see. Thanks @vasild. It would be nice to have more documentation on the return codes then. And +1 on the == 0 suggestion.

    pinheadmz commented at 7:59 pm on January 12, 2026:
    This line gets “modernized” in the next commit, including a comment and == 0, so I think it’s cool for now.

    vasild commented at 5:26 pm on January 13, 2026:

    “returns an integer error code” – oh, I meant this:

    https://www.man7.org/linux/man-pages/man2/getsockname.2.html#RETURN_VALUE

    RETURN VALUE On success, zero is returned. On error, -1 is returned, and errno is set to indicate the error.

    or the check could be if (foo() != -1).

  4. fjahr commented at 2:32 pm on January 10, 2026: contributor

    ACK ea993ff2af1faf863c106f855d3722b29a1f4483

    I reviewed the same 6 commits as part of #32061 and all my feedback from there has been addressed.

  5. in src/netbase.cpp:957 in ea993ff2af
    952+{
    953+    CService addr_bind;
    954+    sockaddr_storage storage;
    955+    socklen_t len = sizeof(storage);
    956+
    957+    auto sa = static_cast<sockaddr*>(static_cast<void*>(&storage));
    


    vasild commented at 12:48 pm on January 12, 2026:

    The inner cast is unnecessary?

    0    auto sa = static_cast<sockaddr*>(&storage);
    

    pinheadmz commented at 8:33 pm on January 12, 2026:
    This works with a reinterpret_cast because sockaddr_storage and sockaddr are technically unrelated types even though they overlap… I’ll clean it up, thanks. I think I had static_cast there before because of hints from Sonarcloud on corecheck.dev.
  6. in src/util/string.cpp:42 in ea993ff2af
    37+        // If the character we just consumed gives us a line length greater
    38+        // than max_read, and we are not at the end of the line (or buffer) yet,
    39+        // that means the line we are currently reading is too long, and we throw.
    40+        if (count > max_read) throw std::runtime_error("max_read exceeded by LineReader");
    41+    }
    42+    const std::string_view untrimmed_line(reinterpret_cast<const char *>(std::to_address(line_start)), count);
    


    vasild commented at 1:03 pm on January 12, 2026:
    0    const std::string_view untrimmed_line(reinterpret_cast<const char*>(std::to_address(line_start)), count);
    

    pinheadmz commented at 3:34 pm on January 13, 2026:
    taken, thanks
  7. vasild approved
  8. vasild commented at 1:05 pm on January 12, 2026: contributor
    ACK ea993ff2af1faf863c106f855d3722b29a1f4483
  9. in src/util/time.cpp:140 in 9423d8ae48 outdated
    135+    std::string_view month{months.at(unsigned{ymd.month()} - 1)};
    136+    const std::chrono::hh_mm_ss hms{secs - days};
    137+    // examples: Mon, 27 Jul 2009 12:28:53 GMT
    138+    //           Fri, 31 May 2024 19:18:04 GMT
    139+    return strprintf("%03s, %02u %03s %04i %02i:%02i:%02i GMT", weekday, unsigned{ymd.day()}, month, signed{ymd.year()}, hms.hours().count(), hms.minutes().count(), hms.seconds().count());
    140+}
    


    furszy commented at 6:32 pm on January 12, 2026:

    nit: I think C++20 has a short version of this (no extra arrays needed):

    0std::string FormatRFC1123DateTime(int64_t time)
    1{
    2    if (time < -62167219200 || 253402300799 < time) {
    3        // RFC7231 mandates 4-digit year, so only support years 0 to 9999
    4        return "";
    5    }
    6    const std::chrono::sys_seconds secs{std::chrono::seconds{time}};
    7    // Example: "Mon, 27 Jul 2009 12:28:53 GMT"
    8    return std::format("{:%a, %d %b %Y %H:%M:%S} GMT", secs);
    9}
    

    maybe you already tried it and it doesn’t work on all our supported platforms?


    vasild commented at 5:38 pm on January 13, 2026:

    This should be locale-independent. I think it is:

    https://en.cppreference.com/w/cpp/chrono/system_clock/formatter.html#Format_specification

    the default “C” locale if L is not present in the format specification


    vasild commented at 6:04 pm on January 13, 2026:

    https://github.com/bitcoin/bitcoin/actions/runs/20963499370/job/60247105398?pr=34242#step:11:1480

    0/home/admin/actions-runner/_work/_temp/src/util/time.cpp:129:17: error: no member named 'format' in namespace 'std'
    1  129 |     return std::format("{:%a, %d %b %Y %H:%M:%S} GMT", secs);
    2      |            ~~~~~^
    

    :cry:


    pinheadmz commented at 6:14 pm on January 13, 2026:
    Indeed, will revert on next push.

    furszy commented at 8:21 pm on January 13, 2026:
    too good to be true
  10. in src/util/string.cpp:36 in ea993ff2af
    31+        ++it;
    32+        ++count;
    33+        // If the character we just consumed was \n, the line is terminated
    34+        if (c == '\n') break;
    35+        // If we are at the end of the incoming buffer, the line is terminated
    36+        if (it == end) break;
    


    furszy commented at 7:06 pm on January 12, 2026:

    In ea993ff2af1faf863c106f855d3722b29a1f4483:

    Seems you can remove this last line. You already do a while (it != end)


    pinheadmz commented at 3:38 pm on January 13, 2026:
    Youre right thanks, its gone.

    vasild commented at 6:02 pm on January 13, 2026:

    Note that it is incremented inside the body of the loop, after the check in while (it != end). So after the increment it could be equal to end.

    0    const std::vector<std::byte> input{StringToBuffer("ab")};
    1    LineReader reader(input, /*max_read=*/1);
    2    std::optional<std::string> line{reader.ReadLine()};
    

    Before this change the above snippet would have set line to ab and after the change it throws std::runtime_error: max_read exceeded by LineReader. I guess the new behavior is the correct? Maybe worth adding such a test.


    pinheadmz commented at 6:41 pm on January 13, 2026:
    Thanks @vasild I added this as a test case, and am scratching my head how it slipped through. Removing the line is not only cleaner code, it is a bug fix. LineReader should not return a line longer than max_read
  11. in src/util/string.cpp:34 in ea993ff2af
    29+        // Read a character from the incoming buffer and increment the iterator
    30+        auto c = static_cast<char>(*it);
    31+        ++it;
    32+        ++count;
    33+        // If the character we just consumed was \n, the line is terminated
    34+        if (c == '\n') break;
    


    furszy commented at 7:09 pm on January 12, 2026:
    tiny nit: could make the termination character customizable.

    pinheadmz commented at 3:34 pm on January 13, 2026:
    Going to leave this for now, next dev who needs a custom EOL character can add it in the future ;-)
  12. furszy commented at 7:10 pm on January 12, 2026: member
    utACK ea993ff2af1faf863c106f855d3722b29a1f4483
  13. Make GetBindAddress() callable from outside net.cpp
    The function logic is moved-only from net.cpp to netbase.cpp
    and redeclared (as non-static) in netbase.h
    a0ca851d26
  14. pinheadmz force-pushed on Jan 13, 2026
  15. DrahtBot added the label CI failed on Jan 13, 2026
  16. DrahtBot commented at 6:22 pm on January 13, 2026: contributor

    🚧 At least one of the CI tasks failed. Task macOS-cross to arm64: https://github.com/bitcoin/bitcoin/actions/runs/20963499370/job/60247105377 LLM reason (✨ experimental): Compilation failed: std::format is unavailable in the used standard library (missing C++20 std::format support).

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  17. pinheadmz force-pushed on Jan 13, 2026
  18. in src/test/util_string_tests.cpp:201 in 1d248166ef outdated
    196+        BOOST_CHECK(line2);
    197+        BOOST_CHECK(line3);
    198+        BOOST_CHECK(!line4);
    199+        BOOST_CHECK_EQUAL(line1.value(), "once upon a time");
    200+        BOOST_CHECK_EQUAL(line2.value(), "there was a dog");
    201+        BOOST_CHECK_EQUAL(line3.value(), "who liked food");
    


    furszy commented at 8:56 pm on January 13, 2026:

    In 1d248166efe6cba330cfb14e652289f68c894652:

    Tiny, non-blocking, idea to see if it makes sense to simplify the LineReader class. (you can reinterpret std::byte to char in order to make it work with std::istringstream)

     0// Check three lines terminated by \n, \r\n, and end of buffer, trimming whitespace
     1const std::string input = "once upon a time\n there was a dog \r\nwho liked food";
     2std::istringstream stream(input);
     3
     4std::string line1, line2, line3, line4;
     5std::getline(stream, line1);
     6std::getline(stream, line2);
     7std::getline(stream, line3);
     8BOOST_CHECK(!std::getline(stream, line4));
     9
    10BOOST_CHECK_EQUAL(line1, "once upon a time");
    11BOOST_CHECK_EQUAL(line2, "there was a dog \r"); // TODO: trim whitespaces at the end.
    12BOOST_CHECK_EQUAL(line3, "who liked food");
    
  19. DrahtBot removed the label CI failed on Jan 13, 2026
  20. vasild approved
  21. vasild commented at 7:55 am on January 14, 2026: contributor
    ACK 1d248166efe6cba330cfb14e652289f68c894652
  22. DrahtBot requested review from fjahr on Jan 14, 2026
  23. DrahtBot requested review from furszy on Jan 14, 2026
  24. pinheadmz commented at 2:50 pm on January 14, 2026: member
    switching to draft for today: Im going to see if we can just use std::istringstream and if so, remove the LineReader commit
  25. pinheadmz marked this as a draft on Jan 14, 2026
  26. pinheadmz commented at 3:38 pm on January 14, 2026: member

    Ok did some learning about std::istringstream and I think the current implementation is better, and not reinventing a wheel from the standard library. Will convert the PR to ready for review.

    Here’s my reasoning (up for debate, @furszy ):

    Raw data is copied as std::byte from the network socket into the receive buffer. LineReader is my boundary between the raw data and text-based HTTP stuff, it operates on a span of that data and returns valid, allocated strings. It only allocates if the line is <= max_read, and only once, after reading from the span.

    std::istringstream can’t operate on std::byte (only on strings) so that would either require one big copy in the beginning, or moving the “boundary” lower where we could read as char from the socket. Even if I made that change, std::istringstream::getline() appends one character at a time as it reads the input, expanding the output string. Only when getline() is done would it be possible to check the line against max_read.

  27. pinheadmz marked this as ready for review on Jan 14, 2026
  28. fjahr commented at 2:03 pm on January 15, 2026: contributor
    re-ACK 1d248166efe6cba330cfb14e652289f68c894652
  29. in src/util/string.h:291 in 1d248166ef outdated
    283+     * and advances iterator. May exceed max_read but will not read past end of buffer.
    284+     * @param[in]   len     The number of bytes to read from the buffer
    285+     * @returns a string of the expected length.
    286+     * @throws a std::runtime_error if there is not enough data in the buffer.
    287+     */
    288+    std::string ReadLength(size_t len);
    


    furszy commented at 2:59 pm on January 15, 2026:
    tiny nit: Would just call this Read(size_t len).

    pinheadmz commented at 6:39 pm on January 15, 2026:
    I’m gonna push back on this, I think in a class called “Line Reader” it’s important to distinguish explicitly between reading a line, and reading a number of bytes. A method called Read() seems ambiguous to me.
  30. in src/util/string.h:293 in 1d248166ef
    288+    std::string ReadLength(size_t len);
    289+
    290+    /**
    291+     * Returns remaining size of bytes in buffer
    292+     */
    293+    size_t Left() const;
    


    furszy commented at 3:00 pm on January 15, 2026:
    nano nit: Maybe Left() is not the best term. Could use Remaining()?. But at this point it doesn’t matter much.

    pinheadmz commented at 6:41 pm on January 15, 2026:
    Ok sure, taking this. 🙏
  31. in src/util/string.cpp:54 in 1d248166ef
    39+        if (count > max_read) throw std::runtime_error("max_read exceeded by LineReader");
    40+    }
    41+    const std::string_view untrimmed_line(reinterpret_cast<const char*>(std::to_address(line_start)), count);
    42+    const std::string_view line = TrimStringView(untrimmed_line); // delete trailing \r and/or \n
    43+    return std::string(line);
    44+}
    


    furszy commented at 3:21 pm on January 15, 2026:

    Since we advance the class member iterator it while scanning the span, throwing a max_read exception drops the bytes already read. We should advance it only at the very end to avoid loosing them (a test for this case would be good).

    Also, small extra nit: what if there only char is a \n? The returned string will be empty?

    As a separate topic, could use std::memchr to find the first instance of \n which would avoid the loop (haven’t fully tried it but seems to be an option as well - dropping it in case you want to investigate it, all good if not).


    pinheadmz commented at 6:37 pm on January 15, 2026:

    Great catch thanks. I now reset the iterator just before throwing. Tested by catching the “exceeded” error then using a more granular operation (ReadLength) to get the remainder of bytes from the buffer. The test fails on the original code because ReadLength returned bytes form whatever point it threw the error instead of from the last point of success.

    Also added test cases for what makes the most sense to me when using the tool:

    • If the buffer is only "\n" then you get one empty "" line, nullopt after that
    • If the buffer is completely empty "" you get nullopt (not an error)

    re: memchr, I played with that a bit, then I also tried out std::find and felt like neither approach produced clean code without also dropping a little efficiency. So I still think the long implementation here is best for our precise use-case.

  32. DrahtBot requested review from furszy on Jan 15, 2026
  33. pinheadmz force-pushed on Jan 15, 2026
  34. pinheadmz commented at 7:12 pm on January 15, 2026: member

    push to 9b3612beaf76189aa50ab66b334a3b258a563f95

    Address feedback from @furszy about LineReader, fix an edge case after an error is thrown and add more tests.

  35. in src/util/string.h:271 in 9b3612beaf outdated
    266+    const std::span<const std::byte>::iterator start;
    267+    const std::span<const std::byte>::iterator end;
    268+    const size_t max_read;
    269+    std::span<const std::byte>::iterator it;
    270+
    271+    explicit LineReader(std::span<const std::byte> buffer, size_t max_read);
    


    furszy commented at 7:57 pm on January 15, 2026:
    nit: As we can read pass max_read limit if max_read + 1 is a jump line, I would rename the arg and member to max_output_length.
  36. furszy commented at 7:58 pm on January 15, 2026: member
    utACK 9b3612beaf76189aa50ab66b334a3b258a563f95
  37. DrahtBot requested review from vasild on Jan 15, 2026
  38. fjahr commented at 9:41 pm on January 15, 2026: contributor
    re-ACK 9b3612beaf76189aa50ab66b334a3b258a563f95
  39. in src/util/strencodings.h:330 in 9b3612beaf outdated
    325@@ -324,6 +326,15 @@ std::string Capitalize(std::string str);
    326  */
    327 std::optional<uint64_t> ParseByteUnits(std::string_view str, ByteUnit default_multiplier);
    328 
    329+/**
    330+ * Returns a byte vector filled with data from a string. Used to test string-
    


    fjahr commented at 4:05 pm on January 19, 2026:

    super-nit complaint by drahtbot on my PR where I cherry-picked this:

    0string- encoded -> string-encoded [The word is split across the line with a stray space, making "string- encoded" incorrect; use "string-encoded" (or "string encoded") for clarity.]
    

    pinheadmz commented at 6:07 pm on January 20, 2026:
    fixed, thanks
  40. pinheadmz force-pushed on Jan 20, 2026
  41. pinheadmz commented at 6:05 pm on January 20, 2026: member

    push to 1a057b3f6a8c3b4f787f99c2a478c90addc6063f

    Remove the functional equivalence of end of buffer and \n, addressing the discussion thread in #34158 (review). I rebased both the torcontrol PR and the HTTP server PR on this new updated LineReader and confirmed its the correct behavior.

  42. pinheadmz force-pushed on Jan 20, 2026
  43. pinheadmz commented at 6:08 pm on January 20, 2026: member

    push to a7bde7143ea7d943ce4aa0df8ef7979673b1aa10

    Fix DrahtBot lint warning: #34242 (review)

  44. DrahtBot added the label CI failed on Jan 20, 2026
  45. DrahtBot removed the label CI failed on Jan 20, 2026
  46. in src/util/strencodings.h:336 in a7bde7143e
    331+ * data from a socket like HTTP headers.
    332+ *
    333+ * @param[in] str                  the string to convert into bytes
    334+ * @returns                        byte vector
    335+ */
    336+std::vector<std::byte> StringToBuffer(const std::string& str);
    


    furszy commented at 9:47 pm on January 20, 2026:
    Based on the function’s description, this is a test-only function. So you could move it to util_string_tests.cpp?

    pinheadmz commented at 8:52 pm on January 21, 2026:
    Yes 💪
  47. in src/util/string.cpp:54 in a7bde7143e outdated
    49+    // End of buffer reached without finding a \n or exceeding max_read.
    50+    // Reset the iterator so the rest of the buffer can be read granularly
    51+    // with ReadLength() and return null to indicate a line was not found.
    52+    it = line_start;
    53+    return std::nullopt;
    54+}
    


    furszy commented at 10:05 pm on January 20, 2026:

    Maybe could:

     0std::optional<std::string> LineReader::ReadLine()
     1{
     2    if (it == end) return std::nullopt;
     3
     4    // Limit search to max_read bytes
     5    const auto limit = std::distance(it, end) > static_cast<long>(max_read) ? std::next(it, max_read) : end;
     6    const auto line = std::find(it, limit, std::byte{'\n'});
     7
     8    if (line == limit) {
     9        if (limit != end) // scanned max_read bytes without line
    10            throw std::runtime_error("max_read exceeded by LineReader");
    11        return std::nullopt; // reached end of buffer without line
    12    }
    13
    14    const std::string_view untrimmed{
    15        reinterpret_cast<const char*>(std::to_address(it)),
    16        static_cast<size_t>(std::distance(it, line) + 1)
    17    };
    18
    19    it = std::next(line); // advance iterator
    20    return std::string(TrimStringView(untrimmed));
    21}
    

    The only change from your code is that this version has a hard limit on max_read. Which in my view is much simpler to reason about. The max_read + 1 retrieval when the “past the limit char” is \n seems confusing to me. In networking code, the usual convention is to stop at a fixed size, without caring about what comes next.


    fjahr commented at 1:30 pm on January 21, 2026:
    Maybe I introduced some of this confusion and also the EOF changes by reviewing this function in isolation and forgetting about the bigger picture. Sorry about that. I am not sure if I find this code easier on first glance but I want to say that it seems like the max_read is really only used as DoS protection measure and not for a protocol limit. Maybe we can just use the behavior that makes the code the easiest to read and not have to be super strict about all edge cases where it’s unclear what the “right” behavior could be. It would be good to document that this is currently only used as a DoS protection thought so if the LineReader class is used in a different context the new user pays attention.

    furszy commented at 2:52 pm on January 21, 2026:
    All good @fjahr, I’ve been thinking about simplifying this code for a while; the last push just gave me another reason to actually do it. Overall, I don’t see why we should read the next character after max_read. If we want to go beyond that value, we should instead increase max_read by 1.

    fjahr commented at 4:08 pm on January 21, 2026:

    I don’t see why we should read the next character after max_read

    In my earlier review, I interpreted max_read as “I want to get a line back that has this length and is not longer” and then the question is if you count \n into that or not because we don’t return the \n with the rest of the line. It seemed unituitive to me that you would give max_read=10 and then the maximum length you would get back is actually 9 not 10. Returning 9 would ask the caller to understand more of the internals which seemed wrong to me. But of course the point of my previous comment was, it probably doesn’t matter because this is only used for DoS and we might not be able to agree on the “right” behavior because there is actually no more concrete use-case of max_read that we could tie it to and without that both ways may be equally “right”.

    I would support a rewrite of the function body (I suggested some reordering here too) or keeping the current structure, whichever way @pinheadmz prefers. But I think we don’t need to continue to split hairs over max_read edge cases because it may just ultimately bike shedding. That’s something that wasn’t clear to me initially.

    Maybe we can agree on just trying to follow the libevent behavior if we have nothing else to go on. Over there evbuffer_readln uses evbuffer_search_eol to find the EOL char and then returns everything that came before it. And then what is returned is checked against > max_headers_size, so from my understanding the behavior better matches what we have currently, e.g. the EOL character are all free and don’t count towards the max value.

    But I am just giving another misdirection ;) As I said, I am no longer so passionate about this and will go with whatever @pinheadmz decides to do now ;)

    If we want to go beyond that value, we should instead increase max_read by 1.

    Not sure if you mean the function should do that internally or the caller should do that. Internally seems fine if that makes the code simpler. On the caller side this doesn’t sound so appealing, but I already said it doesn’t really matter :)

    EDIT: Whatever @pinheadmz decides it will be easier to read than the libevent code :D


    pinheadmz commented at 8:47 pm on January 21, 2026:

    I think the logic is dialed in now, but referring back to #34242 (review) it might make more sense if we called the parameter max_line_length since we will typically actually read + 1 more byte than that.

    As far as DoS, I don’t think its super crucial because:

    • It’s HTTP interface, it shouldn’t be exposed to untrusted clients anyway
    • LineReader is handed a buffer with a fixed size to begin with
  48. pinheadmz force-pushed on Jan 21, 2026
  49. pinheadmz commented at 9:01 pm on January 21, 2026: member

    push to 7462df6bcb3974db47059cd2a7c03412709ada67

    • Rename max_read to max_line_length
    • Move StringToBuffer() into the unit test file, the only place it is used.
  50. pinheadmz force-pushed on Jan 21, 2026
  51. pinheadmz commented at 9:42 pm on January 21, 2026: member

    Push to ea454c08ca4b5736b4b82dddd7560b550dabf26b

    • Tidy up 🤦
  52. DrahtBot added the label CI failed on Jan 21, 2026
  53. DrahtBot commented at 9:44 pm on January 21, 2026: contributor

    🚧 At least one of the CI tasks failed. Task tidy: https://github.com/bitcoin/bitcoin/actions/runs/21225545585/job/61072081681 LLM reason (✨ experimental): Clang-tidy failures: argument-comment mismatch for LineReader max_line_length in test/util_string_tests.cpp, causing CI to fail.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  54. fjahr commented at 10:23 pm on January 21, 2026: contributor

    ACK ea454c08ca4b5736b4b82dddd7560b550dabf26b

    Renaming to max_line_length clarifies quite a bit, thanks!

  55. DrahtBot requested review from furszy on Jan 21, 2026
  56. DrahtBot removed the label CI failed on Jan 22, 2026
  57. vasild approved
  58. vasild commented at 10:37 am on January 22, 2026: contributor
    ACK ea454c08ca4b5736b4b82dddd7560b550dabf26b
  59. in src/netbase.cpp:959 in e93778dd82
    958+    sockaddr_storage storage;
    959+    socklen_t len = sizeof(storage);
    960+
    961+    auto sa = reinterpret_cast<sockaddr*>(&storage);
    962+
    963+    // POSIX convention: 0 is success
    


    maflcko commented at 11:02 am on January 22, 2026:
    nit in e93778dd82795060288ebda3897c8695386101e8: I think this comment can be removed. It should be self-explanatory via the _ == 0 below. There are other places that also call GetSockName without explaining the return type and just comparing it to zero.

    pinheadmz commented at 3:35 pm on January 22, 2026:
    OK, sure.
  60. in src/util/strencodings.h:172 in 0717e882b1
    168@@ -169,17 +169,18 @@ constexpr inline bool IsSpace(char c) noexcept {
    169 /**
    170  * Convert string to integral type T. Leading whitespace, a leading +, or any
    171  * trailing character fail the parsing. The required format expressed as regex
    172- * is `-?[0-9]+`. The minus sign is only permitted for signed integer types.
    173+ * is `-?[0-9]+` by default (or e.g. `-?[0-9a-fA-F]+` if base = 16).
    


    maflcko commented at 11:06 am on January 22, 2026:
    nit in 0717e882b1244d8cd71e27805f7e2f91ea8b2b8b: I think the e.g. is wrong here? I think it can be dropped. or should already be correct and sufficient.

    maflcko commented at 11:12 am on January 22, 2026:
    Also, there could be a unit test to check that negative hex works? Maybe BOOST_CHECK_EQUAL(*ToIntegral<int64_t>("-1", 16), -1);

    pinheadmz commented at 3:46 pm on January 22, 2026:
    Thanks, removed “e.g.” and added the test - it passes!

    maflcko commented at 7:23 pm on January 22, 2026:

    Thanks, removed “e.g.” and added the test - it passes!

    heh, that’s funny. I would have assumed that n is u64, and the result is i64, so the isan should fail CI here. But both std::optional and BOOST_CHECK_EQUAL silence the warnings. The following would fail the integer sanitizer:

     0diff --git a/src/test/util_tests.cpp b/src/test/util_tests.cpp
     1index eb6057a6ff..b8cf961f7d 100644
     2--- a/src/test/util_tests.cpp
     3+++ b/src/test/util_tests.cpp
     4@@ -873,7 +873,7 @@ BOOST_AUTO_TEST_CASE(test_ToIntegralHex)
     5     n = ToIntegral<uint64_t>("FfFfFfFfFfFfFfFf", 16);
     6     BOOST_CHECK_EQUAL(*n, 0xFfFfFfFfFfFfFfFfULL);
     7     n = ToIntegral<int64_t>("-1", 16);
     8-    BOOST_CHECK_EQUAL(*n, -1);
     9+    BOOST_CHECK(*n== -1);
    10     // Invalid values
    11     BOOST_CHECK(!ToIntegral<uint64_t>("", 16));
    12     BOOST_CHECK(!ToIntegral<uint64_t>("-1", 16));
    

    If you want you can push my initial suggestion, or leave this as-is, because apparently the stdlib and boost like it as-is.

  61. in src/util/strencodings.h:370 in bbb9d95d88
    365+    size_t operator()(const std::string_view& s) const {
    366+        std::string lowered;
    367+        lowered.resize(s.size());
    368+        std::ranges::transform(s, lowered.begin(),
    369+            [](char c) {
    370+                // Avoid implicit-integer-sign-change UB by only
    


    maflcko commented at 12:07 pm on January 22, 2026:

    I don’t understand this. sign-change is not UB. Also, i fail to see how it could happen with ToLower? Also, no need to create two copies, when one is enough. The following diff passes isan for me:

     0diff --git a/src/test/util_string_tests.cpp b/src/test/util_string_tests.cpp
     1index 82ddd209c2..4a49e5d090 100644
     2--- a/src/test/util_string_tests.cpp
     3+++ b/src/test/util_string_tests.cpp
     4@@ -178,6 +178,7 @@ BOOST_AUTO_TEST_CASE(ascii_case_insensitive_hash_test)
     5     BOOST_CHECK_NE(hsh("AA"), hsh("A"));
     6     BOOST_CHECK_EQUAL(hsh("A"), hsh("a"));
     7     BOOST_CHECK_EQUAL(hsh("Ab"), hsh("aB"));
     8+    BOOST_CHECK_EQUAL(hsh("A\xfe"), hsh("a\xfe"));
     9 }
    10 
    11 BOOST_AUTO_TEST_SUITE_END()
    12diff --git a/src/util/strencodings.h b/src/util/strencodings.h
    13index b005628f24..3b49eafd6e 100644
    14--- a/src/util/strencodings.h
    15+++ b/src/util/strencodings.h
    16@@ -355,25 +355,16 @@ struct Hex {
    17 } // namespace detail
    18 
    19 struct AsciiCaseInsensitiveKeyEqual {
    20-    bool operator()(const std::string& s1, const std::string& s2) const
    21+    bool operator()(std::string_view s1, std::string_view s2) const
    22     {
    23         return ToLower(s1) == ToLower(s2);
    24     }
    25 };
    26 
    27 struct AsciiCaseInsensitiveHash {
    28-    size_t operator()(const std::string_view& s) const {
    29-        std::string lowered;
    30-        lowered.resize(s.size());
    31-        std::ranges::transform(s, lowered.begin(),
    32-            [](char c) {
    33-                // Avoid implicit-integer-sign-change UB by only
    34-                // processing ASCII.
    35-                auto uc = static_cast<unsigned char>(c);
    36-                if (uc >= 128) return c;
    37-                return ToLower(c);
    38-            });
    39-        return std::hash<std::string>{}(lowered);
    40+    size_t operator()(std::string_view s) const
    41+    {
    42+        return std::hash<std::string>{}(ToLower(s));
    43     }
    44 };
    45 
    
    0$ UBSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/ubsan:print_stacktrace=1:halt_on_error=1:report_error_type=1" ./bld-cmake/bin/test_bitcoin -t util_string_tests
    1Running 3 test cases...
    2
    3*** No errors detected
    

    maflcko commented at 12:17 pm on January 22, 2026:
    (nit in bbb9d95d88b19d3971c555da5de4dbaffbbe11aa)

    pinheadmz commented at 3:54 pm on January 22, 2026:
    Great catch and thanks for testing, I’ll take the whole diff. I think this was leftover from a few different iterations. I was using a std::map at first and this function was a comparator before it was a hash (now http headers are in an unordered_map). I also had started off with standard library tolower() before switching to our locale-independent version. I think I got the UB warning from the fuzzer on ci, but it was too long ago to find the log. Anyway, this is much cleaner.

    maflcko commented at 7:23 pm on January 22, 2026:
    You may also fully drop the whole commit and just use ToLower when adding the key to the map. But I guess this is a behavior change vs libevent and the header fields may read a bit less nice, when debugging.
  62. in src/test/util_string_tests.cpp:219 in ea454c08ca
    214+
    215+        LineReader reader1(input, /*max_line_length=*/22);
    216+        // First line is exactly the length of max_line_length
    217+        BOOST_CHECK_EQUAL(reader1.ReadLine(), "once upon a time there");
    218+        // Second line is +1 character too long
    219+        BOOST_CHECK_THROW(reader1.ReadLine(), std::runtime_error);
    


    maflcko commented at 12:36 pm on January 22, 2026:
    nit in ea454c08ca4b5736b4b82dddd7560b550dabf26b: I think you can use BOOST_CHECK_EXCEPTION(code that throws, exception type, HasReason("foo")); here?

    maflcko commented at 1:32 pm on January 22, 2026:

    Pushed https://github.com/maflcko/DrahtBot/commit/01638c720f4941e84603b7d70e92be59f12be593 to drahtbot, so the llm should say something like:

    • [src/test/util_string_tests.cpp] BOOST_CHECK_THROW(reader1.ReadLine(), std::runtime_error); -> Replace with BOOST_CHECK_EXCEPTION(reader1.ReadLine(), std::runtime_error, HasReason{“max_line_length exceeded by LineReader”})
    • [src/test/util_string_tests.cpp] BOOST_CHECK_THROW(reader.ReadLine(), std::runtime_error); -> Replace with BOOST_CHECK_EXCEPTION(reader.ReadLine(), std::runtime_error, HasReason{“max_line_length exceeded by LineReader”})
    • [src/test/util_string_tests.cpp] BOOST_CHECK_THROW(reader.ReadLine(), std::runtime_error); // “baboon” is too long -> Replace with BOOST_CHECK_EXCEPTION(reader.ReadLine(), std::runtime_error, HasReason{“max_line_length exceeded by LineReader”})
    • [src/test/util_string_tests.cpp] BOOST_CHECK_THROW(reader.ReadLine(), std::runtime_error); // “on” is too long -> Replace with BOOST_CHECK_EXCEPTION(reader.ReadLine(), std::runtime_error, HasReason{“max_line_length exceeded by LineReader”})
    • [src/test/util_string_tests.cpp] BOOST_CHECK_THROW(reader.ReadLength(128), std::runtime_error); -> Replace with BOOST_CHECK_EXCEPTION(reader.ReadLength(128), std::runtime_error, HasReason{“Not enough data in buffer”})

    pinheadmz commented at 4:53 pm on January 22, 2026:
    Ahh this is great thanks. I couldn’t figure out how to match error messages when I first wrote these tests and forgot to go back and fix it.
  63. in src/util/string.cpp:37 in ea454c08ca
    32+        ++count;
    33+        // If the character we just consumed was \n, the line is terminated.
    34+        // The \n itself does not count against max_line_length.
    35+        if (c == '\n') {
    36+            const std::string_view untrimmed_line(reinterpret_cast<const char*>(std::to_address(line_start)), count);
    37+            const std::string_view line = TrimStringView(untrimmed_line); // delete trailing \r and/or \n
    


    maflcko commented at 12:44 pm on January 22, 2026:
    nit in ea454c0: This will also trim leading \r, like \rfoo\r\n. Not sure if the comment/test should be updated.

    pinheadmz commented at 5:10 pm on January 22, 2026:
    I’ll expand the comment

    maflcko commented at 7:27 pm on January 22, 2026:

    The test, if you want to push one:

     0diff --git a/src/test/util_string_tests.cpp b/src/test/util_string_tests.cpp
     1index 9f7513fbb3..d2e0b3d31c 100644
     2--- a/src/test/util_string_tests.cpp
     3+++ b/src/test/util_string_tests.cpp
     4@@ -240,6 +240,15 @@ BOOST_AUTO_TEST_CASE(line_reader_test)
     5         LineReader reader(input, /*max_line_length=*/1024);
     6         BOOST_CHECK(!reader.ReadLine());
     7     }
     8+    {
     9+        // Whitespace counts against the max length
    10+        const std::vector<std::byte> input{StringToBuffer(" \n  \n")};
    11+        LineReader reader(input, /*max_line_length=*/1);
    12+        BOOST_CHECK_EQUAL(reader.ReadLine(), "");
    13+        // Second line has two spaces
    14+        BOOST_CHECK_EXCEPTION(reader.ReadLine(), std::runtime_error, HasReason{"max_line_length exceeded by LineReader"});
    15+        BOOST_CHECK_EQUAL(reader.Remaining(), 3);
    16+    }
    17     {
    18         // Even one character is too long, if it's not \n
    19         const std::vector<std::byte> input{StringToBuffer("ab\n")};
    
  64. maflcko approved
  65. maflcko commented at 12:52 pm on January 22, 2026: member

    Left some nits/questions, which may be addressed, but are not blockers.

    review ACK ea454c08ca4b5736b4b82dddd7560b550dabf26b 🍃

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: review ACK ea454c08ca4b5736b4b82dddd7560b550dabf26b 🍃
    38MESW5/YR+Eg4kVH3u/jSlXOmtoeyzvsZFeXc8AUu8W0WxfizC5JHH1Uo29pcCseIllWU3YD5uEfC8FUUs28AA==
    
  66. Modernize GetBindAddress()
    Replace the C-style casting with C++ reinterpret_cast
    0b0d9125c1
  67. string: add `base` argument for ToIntegral to operate on hexadecimal 4e300df712
  68. string: add AsciiCaseInsensitive{KeyEqual, Hash} for unordered map
    https://httpwg.org/specs/rfc9110.html#rfc.section.5.1
    Field names in HTTP headers are case-insensitive. These
    structs will be used in the headers map to search by key.
    
    In libevent field names are also converted to lowercase for comparison:
      evhttp_find_header()
      evutil_ascii_strcasecmp()
      EVUTIL_TOLOWER_()
    eea38787b9
  69. time: implement and test RFC1123 timestamp string
    HTTP 1.1 responses require a timestamp header with a format
    specified (currently) by:
    https://datatracker.ietf.org/doc/html/rfc9110#section-5.6.7
    
    This specific format is defined in RFC1123:
    https://www.rfc-editor.org/rfc/rfc1123#page-55
    
    The libevent implementation can be referenced in evutil_time.c
    evutil_date_rfc1123()
    ee62405cce
  70. string: add LineReader
    This is a helper struct to parse HTTP messages from data in buffers
    from sockets. HTTP messages begin with headers which are
    CRLF-terminated lines (\n or \r\n) followed by an arbitrary amount of
    body data. Whitespace is trimmed from the field lines but not the body.
    
    https://httpwg.org/specs/rfc9110.html#rfc.section.5
    1911db8c6d
  71. pinheadmz force-pushed on Jan 22, 2026
  72. pinheadmz commented at 5:51 pm on January 22, 2026: member

    push to 1911db8c6dc6b32c8971b14b2b271ec39d9f3ab9

    • Address nits from @maflcko review
    • Assert error messages in unit test
    • Simplify AsciiCaseInsensitiveHash
  73. maflcko commented at 7:28 pm on January 22, 2026: member

    review ACK 1911db8c6dc6b32c8971b14b2b271ec39d9f3ab9 👲

    Signature:

    0untrusted comment: signature from minisign secret key on empty file; verify via: minisign -Vm "${path_to_any_empty_file}" -P RWTRmVTMeKV5noAMqVlsMugDDCyyTSbA3Re5AkUrhvLVln0tSaFWglOw -x "${path_to_this_whole_four_line_signature_blob}"
    1RUTRmVTMeKV5npGrKx1nqXCw5zeVHdtdYURB/KlyA/LMFgpNCs+SkW9a8N95d+U4AP1RJMi+krxU1A3Yux4bpwZNLvVBKy0wLgM=
    2trusted comment: review ACK 1911db8c6dc6b32c8971b14b2b271ec39d9f3ab9 👲
    39zWo7ZoxhGTWyRAMC5nyKycjSjnJQuZgbAURC4gLzY/GeA/sJF45ZMc16S2v17rRuRwogEefZlQ3syulcLdJCA==
    
  74. DrahtBot requested review from vasild on Jan 22, 2026
  75. DrahtBot requested review from fjahr on Jan 22, 2026
  76. furszy commented at 8:48 pm on January 22, 2026: member
    utACK 1911db8c6dc6b32c8971b14b2b271ec39d9f3ab9
  77. in src/test/util_tests.cpp:391 in 1911db8c6d
    384@@ -385,6 +385,21 @@ BOOST_AUTO_TEST_CASE(util_FormatISO8601Date)
    385     BOOST_CHECK_EQUAL(FormatISO8601Date(1317425777), "2011-09-30");
    386 }
    387 
    388+
    389+BOOST_AUTO_TEST_CASE(util_FormatRFC1123DateTime)
    390+{
    391+    BOOST_CHECK_EQUAL(FormatRFC1123DateTime(std::numeric_limits<int64_t>::max()), "");
    


    sedited commented at 10:48 am on January 23, 2026:
    Nit: Could add a min test case too for symmetry.
  78. sedited approved
  79. sedited commented at 12:09 pm on January 23, 2026: contributor
    ACK 1911db8c6dc6b32c8971b14b2b271ec39d9f3ab9
  80. sedited merged this on Jan 23, 2026
  81. sedited closed this on Jan 23, 2026

  82. fjahr commented at 12:26 pm on January 23, 2026: contributor

    re-ACK 1911db8c6dc6b32c8971b14b2b271ec39d9f3ab9

    Reviewed the addressed changes and they all look ok to me.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-02-17 15:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me