Prepare string and net utils for future HTTP operations #34242

pull pinheadmz wants to merge 6 commits into bitcoin:master from pinheadmz:http-utils changing 11 files +329 −17
  1. pinheadmz commented at 6:11 pm on January 9, 2026: member

    This is a component of removing libevent as a dependency of the project. It is the first six commits of #32061 and provides a string-parsing utility (LineReader) that is also consumed by #34158.

    These are the functions that are added / updated for HTTP and Torcontrol:

    • GetBindAddress(): Given a socket, provides the bound address as a CService. Currently used by p2p but moved from net to netbase so other modules can call it.
    • ToIntegral(): Already used to parse numbers from strings, added new argument base = 10 so it can also be used to parse hexadecimal integers. HTTP chunked transfer-encoding uses hex-encoded integers to specify payload size: https://datatracker.ietf.org/doc/html/rfc7230.html#section-4.1
    • AsciiCaseInsensitive comparators: Needed to store HTTP headers in an unordered_map. Headers are key-value pairs that are parsed with case-insensitive keys: https://httpwg.org/specs/rfc9110.html#rfc.section.5.1
    • FormatRFC1123DateTime(): The required datetime format for HTTP headers (e.g. Fri, 31 May 2024 19:18:04 GMT)
    • LineReader: Fields in HTTP requests are newline-terminated. This struct is given an input buffer and provides methods to read lines as strings.
  2. DrahtBot commented at 6:11 pm on January 9, 2026: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/34242.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Stale ACK fjahr, vasild, furszy

    If your review is incorrectly listed, please copy-paste <!–meta-tag:bot-skip–> into the comment that the bot should ignore.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #34158 (torcontrol: Remove libevent usage by fjahr)
    • #31672 (rpc: add cpu_load to getpeerinfo by vasild)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • string- encoded -> string-encoded [the hyphen is split by a space, making the compound adjective unclear; use “string-encoded” or “string encoded”]

    2026-01-13

  3. in src/netbase.cpp:956 in 67ba7c9302 outdated
    951+CService GetBindAddress(const Sock& sock)
    952+{
    953+    CService addr_bind;
    954+    struct sockaddr_storage sockaddr_bind;
    955+    socklen_t sockaddr_bind_len = sizeof(sockaddr_bind);
    956+    if (!sock.GetSockName((struct sockaddr*)&sockaddr_bind, &sockaddr_bind_len)) {
    


    furszy commented at 6:22 pm on January 9, 2026:

    GetSockName() currently forwards the raw getsockname() return value (0 on success), which forces callers to use an inverted check (!GetSockName(...)).

    Can we normalize it so it returns true on success?

    0bool Sock::GetSockName(sockaddr* name, socklen_t* name_len) const
    1{
    2    return getsockname(m_socket, name, name_len) == 0;
    3}
    

    It would make the code slightly more readable and maintainable.


    pinheadmz commented at 6:47 pm on January 9, 2026:
    I agree with you, but this might be better addressed in a separate PR. The entire Sock class wraps POSIX socket management syscalls that return 0 on success (connect, bind, listen, close, etc…)

    vasild commented at 12:43 pm on January 12, 2026:

    The idea behind the Sock methods is that they are mockable versions of the OS functions. Zero learning curve for newcomers (assuming familiarity with the OS interface). Sock::Foo(x, y, z) is equivalent to calling the OS’s foo(x, y, z).

    inverted check

    the function returns an integer error code, not a boolean. A check like if (foo() == 0) looks better (less “inverted”).


    furszy commented at 4:33 pm on January 12, 2026:
    I see. Thanks @vasild. It would be nice to have more documentation on the return codes then. And +1 on the == 0 suggestion.

    pinheadmz commented at 7:59 pm on January 12, 2026:
    This line gets “modernized” in the next commit, including a comment and == 0, so I think it’s cool for now.

    vasild commented at 5:26 pm on January 13, 2026:

    “returns an integer error code” – oh, I meant this:

    https://www.man7.org/linux/man-pages/man2/getsockname.2.html#RETURN_VALUE

    RETURN VALUE On success, zero is returned. On error, -1 is returned, and errno is set to indicate the error.

    or the check could be if (foo() != -1).

  4. fjahr commented at 2:32 pm on January 10, 2026: contributor

    ACK ea993ff2af1faf863c106f855d3722b29a1f4483

    I reviewed the same 6 commits as part of #32061 and all my feedback from there has been addressed.

  5. in src/netbase.cpp:957 in ea993ff2af
    952+{
    953+    CService addr_bind;
    954+    sockaddr_storage storage;
    955+    socklen_t len = sizeof(storage);
    956+
    957+    auto sa = static_cast<sockaddr*>(static_cast<void*>(&storage));
    


    vasild commented at 12:48 pm on January 12, 2026:

    The inner cast is unnecessary?

    0    auto sa = static_cast<sockaddr*>(&storage);
    

    pinheadmz commented at 8:33 pm on January 12, 2026:
    This works with a reinterpret_cast because sockaddr_storage and sockaddr are technically unrelated types even though they overlap… I’ll clean it up, thanks. I think I had static_cast there before because of hints from Sonarcloud on corecheck.dev.
  6. in src/util/string.cpp:42 in ea993ff2af
    37+        // If the character we just consumed gives us a line length greater
    38+        // than max_read, and we are not at the end of the line (or buffer) yet,
    39+        // that means the line we are currently reading is too long, and we throw.
    40+        if (count > max_read) throw std::runtime_error("max_read exceeded by LineReader");
    41+    }
    42+    const std::string_view untrimmed_line(reinterpret_cast<const char *>(std::to_address(line_start)), count);
    


    vasild commented at 1:03 pm on January 12, 2026:
    0    const std::string_view untrimmed_line(reinterpret_cast<const char*>(std::to_address(line_start)), count);
    

    pinheadmz commented at 3:34 pm on January 13, 2026:
    taken, thanks
  7. vasild approved
  8. vasild commented at 1:05 pm on January 12, 2026: contributor
    ACK ea993ff2af1faf863c106f855d3722b29a1f4483
  9. in src/util/time.cpp:140 in 9423d8ae48 outdated
    135+    std::string_view month{months.at(unsigned{ymd.month()} - 1)};
    136+    const std::chrono::hh_mm_ss hms{secs - days};
    137+    // examples: Mon, 27 Jul 2009 12:28:53 GMT
    138+    //           Fri, 31 May 2024 19:18:04 GMT
    139+    return strprintf("%03s, %02u %03s %04i %02i:%02i:%02i GMT", weekday, unsigned{ymd.day()}, month, signed{ymd.year()}, hms.hours().count(), hms.minutes().count(), hms.seconds().count());
    140+}
    


    furszy commented at 6:32 pm on January 12, 2026:

    nit: I think C++20 has a short version of this (no extra arrays needed):

    0std::string FormatRFC1123DateTime(int64_t time)
    1{
    2    if (time < -62167219200 || 253402300799 < time) {
    3        // RFC7231 mandates 4-digit year, so only support years 0 to 9999
    4        return "";
    5    }
    6    const std::chrono::sys_seconds secs{std::chrono::seconds{time}};
    7    // Example: "Mon, 27 Jul 2009 12:28:53 GMT"
    8    return std::format("{:%a, %d %b %Y %H:%M:%S} GMT", secs);
    9}
    

    maybe you already tried it and it doesn’t work on all our supported platforms?


    vasild commented at 5:38 pm on January 13, 2026:

    This should be locale-independent. I think it is:

    https://en.cppreference.com/w/cpp/chrono/system_clock/formatter.html#Format_specification

    the default “C” locale if L is not present in the format specification


    vasild commented at 6:04 pm on January 13, 2026:

    https://github.com/bitcoin/bitcoin/actions/runs/20963499370/job/60247105398?pr=34242#step:11:1480

    0/home/admin/actions-runner/_work/_temp/src/util/time.cpp:129:17: error: no member named 'format' in namespace 'std'
    1  129 |     return std::format("{:%a, %d %b %Y %H:%M:%S} GMT", secs);
    2      |            ~~~~~^
    

    :cry:


    pinheadmz commented at 6:14 pm on January 13, 2026:
    Indeed, will revert on next push.

    furszy commented at 8:21 pm on January 13, 2026:
    too good to be true
  10. in src/util/string.cpp:36 in ea993ff2af
    31+        ++it;
    32+        ++count;
    33+        // If the character we just consumed was \n, the line is terminated
    34+        if (c == '\n') break;
    35+        // If we are at the end of the incoming buffer, the line is terminated
    36+        if (it == end) break;
    


    furszy commented at 7:06 pm on January 12, 2026:

    In ea993ff2af1faf863c106f855d3722b29a1f4483:

    Seems you can remove this last line. You already do a while (it != end)


    pinheadmz commented at 3:38 pm on January 13, 2026:
    Youre right thanks, its gone.

    vasild commented at 6:02 pm on January 13, 2026:

    Note that it is incremented inside the body of the loop, after the check in while (it != end). So after the increment it could be equal to end.

    0    const std::vector<std::byte> input{StringToBuffer("ab")};
    1    LineReader reader(input, /*max_read=*/1);
    2    std::optional<std::string> line{reader.ReadLine()};
    

    Before this change the above snippet would have set line to ab and after the change it throws std::runtime_error: max_read exceeded by LineReader. I guess the new behavior is the correct? Maybe worth adding such a test.


    pinheadmz commented at 6:41 pm on January 13, 2026:
    Thanks @vasild I added this as a test case, and am scratching my head how it slipped through. Removing the line is not only cleaner code, it is a bug fix. LineReader should not return a line longer than max_read
  11. in src/util/string.cpp:34 in ea993ff2af
    29+        // Read a character from the incoming buffer and increment the iterator
    30+        auto c = static_cast<char>(*it);
    31+        ++it;
    32+        ++count;
    33+        // If the character we just consumed was \n, the line is terminated
    34+        if (c == '\n') break;
    


    furszy commented at 7:09 pm on January 12, 2026:
    tiny nit: could make the termination character customizable.

    pinheadmz commented at 3:34 pm on January 13, 2026:
    Going to leave this for now, next dev who needs a custom EOL character can add it in the future ;-)
  12. furszy commented at 7:10 pm on January 12, 2026: member
    utACK ea993ff2af1faf863c106f855d3722b29a1f4483
  13. Make GetBindAddress() callable from outside net.cpp
    The function logic is moved-only from net.cpp to netbase.cpp
    and redeclared (as non-static) in netbase.h
    a0ca851d26
  14. Modernize GetBindAddress()
    Replace the C-style casting with C++ reinterpret_cast
    e93778dd82
  15. string: add `base` argument for ToIntegral to operate on hexadecimal 0717e882b1
  16. string: add AsciiCaseInsensitive{KeyEqual, Hash} for unordered map
    https://httpwg.org/specs/rfc9110.html#rfc.section.5.1
    Field names in HTTP headers are case-insensitive. These
    structs will be used in the headers map to search by key.
    
    In libevent field names are also converted to lowercase for comparison:
      evhttp_find_header()
      evutil_ascii_strcasecmp()
      EVUTIL_TOLOWER_()
    bbb9d95d88
  17. pinheadmz force-pushed on Jan 13, 2026
  18. time: implement and test RFC1123 timestamp string
    HTTP 1.1 responses require a timestamp header with a format
    specified (currently) by:
    https://datatracker.ietf.org/doc/html/rfc9110#section-5.6.7
    
    This specific format is defined in RFC1123:
    https://www.rfc-editor.org/rfc/rfc1123#page-55
    
    The libevent implementation can be referenced in evutil_time.c
    evutil_date_rfc1123()
    36449a1720
  19. DrahtBot added the label CI failed on Jan 13, 2026
  20. DrahtBot commented at 6:22 pm on January 13, 2026: contributor

    🚧 At least one of the CI tasks failed. Task macOS-cross to arm64: https://github.com/bitcoin/bitcoin/actions/runs/20963499370/job/60247105377 LLM reason (✨ experimental): Compilation failed: std::format is unavailable in the used standard library (missing C++20 std::format support).

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  21. string: add LineReader
    This is a helper struct to parse HTTP messages from data in buffers
    from sockets. HTTP messages begin with headers which are
    CRLF-terminated lines (\n or \r\n) followed by an arbitrary amount of
    body data. Whitespace is trimmed from the field lines but not the body.
    
    https://httpwg.org/specs/rfc9110.html#rfc.section.5
    1d248166ef
  22. pinheadmz force-pushed on Jan 13, 2026
  23. in src/test/util_string_tests.cpp:201 in 1d248166ef
    196+        BOOST_CHECK(line2);
    197+        BOOST_CHECK(line3);
    198+        BOOST_CHECK(!line4);
    199+        BOOST_CHECK_EQUAL(line1.value(), "once upon a time");
    200+        BOOST_CHECK_EQUAL(line2.value(), "there was a dog");
    201+        BOOST_CHECK_EQUAL(line3.value(), "who liked food");
    


    furszy commented at 8:56 pm on January 13, 2026:

    In 1d248166efe6cba330cfb14e652289f68c894652:

    Tiny, non-blocking, idea to see if it makes sense to simplify the LineReader class. (you can reinterpret std::byte to char in order to make it work with std::istringstream)

     0// Check three lines terminated by \n, \r\n, and end of buffer, trimming whitespace
     1const std::string input = "once upon a time\n there was a dog \r\nwho liked food";
     2std::istringstream stream(input);
     3
     4std::string line1, line2, line3, line4;
     5std::getline(stream, line1);
     6std::getline(stream, line2);
     7std::getline(stream, line3);
     8BOOST_CHECK(!std::getline(stream, line4));
     9
    10BOOST_CHECK_EQUAL(line1, "once upon a time");
    11BOOST_CHECK_EQUAL(line2, "there was a dog \r"); // TODO: trim whitespaces at the end.
    12BOOST_CHECK_EQUAL(line3, "who liked food");
    
  24. DrahtBot removed the label CI failed on Jan 13, 2026

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2026-01-14 06:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me