Replace libevent with our own HTTP and socket-handling implementation #32061

pull pinheadmz wants to merge 34 commits into bitcoin:master from pinheadmz:http-rewrite-13march2025 changing 26 files +2542 −607
  1. pinheadmz commented at 7:32 pm on March 13, 2025: member

    This is a major component of removing libevent as a dependency of the project

    It is based on #30988 but only in the sense that it copies the Sockman class introduced in that PR. The p2p / Connman refactor isn’t needed for HTTP and therefore this branch can be reviewed and merged independently of the p2p changes.

    Commit strategy:

    • Import sockman.{h,cpp} from #30988
    • Assert current behavior of HTTP with additional functional tests, including copying from libevent’s tests
    • Implement a few helper functions for strings, timestamps, etc needed by HTTP protocol
    • Isolate the existing libevent-based HTTP server in a namespace http_libevent
    • Implement HTTP in a new namespace http_bitcoin (the namespace manages duplicate HTTPRequest definitions, etc)
    • Switch bitcoind from the libevent server to the new server
    • Clean up (delete http_libevent)

    I am currently seeing about a 10% speed up in the functional tests on my own arm/macos machine.

    Integration testing:

    I am testing the new HTTP server by forking projects that integrate with bitcoin via HTTP and running their integration tests with bitcoind built from this branch (on Github actions). I will continue adding integrations over time, and re-running these CI tests as this branch gets rebased:

  2. DrahtBot commented at 7:32 pm on March 13, 2025: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Code Coverage & Benchmarks

    For details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32061.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK laanwj, fjahr, w0xlt
    Approach ACK vasild
    Stale ACK romanz

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #32796 (rpc: use CScheduler for HTTPRPCTimer by pinheadmz)
    • #32747 (Introduce SockMan (“lite”): low-level socket handling for HTTP by pinheadmz)
    • #32522 (util: C++20 ToIntegral() Improvement by w0xlt)
    • #32297 (bitcoin-cli: Add -ipcconnect option by ryanofsky)
    • #31929 (http: Make server shutdown more robust by hodlinator)
    • #31672 (rpc: add cpu_load to getpeerinfo by vasild)
    • #30988 (Split CConnman by vasild)
    • #29641 (scripted-diff: Use LogInfo over LogPrintf [WIP, NOMERGE, DRAFT] by maflcko)
    • #27731 (Prevent file descriptor exhaustion from too many RPC calls by fjahr)
    • #26812 (test: add end-to-end tests for CConnman and PeerManager by vasild)
    • #26022 (Add util::ResultPtr class by ryanofsky)
    • #25665 (refactor: Add util::Result failure values, multiple error and warning messages by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

    LLM Linter (✨ experimental)

    Possible typos and grammar issues:

    • “an unique id” -> “a unique id” [‘unique’ begins with a consonant sound, so use “a” instead of “an.”]

    drahtbot_id_4_m

  3. pinheadmz marked this as a draft on Mar 13, 2025
  4. DrahtBot added the label CI failed on Mar 13, 2025
  5. DrahtBot commented at 8:29 pm on March 13, 2025: contributor

    🚧 At least one of the CI tasks failed. Debug: https://github.com/bitcoin/bitcoin/runs/38735177073

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  6. laanwj added the label RPC/REST/ZMQ on Mar 14, 2025
  7. laanwj commented at 12:52 pm on March 14, 2025: member
    Concept ACK, nice work
  8. vasild commented at 1:57 pm on March 14, 2025: contributor
    Concept ACK
  9. fjahr commented at 8:29 pm on March 14, 2025: contributor

    Concept ACK

    My understanding from the low-level networking discussion at CoreDev was that this wouldn’t build on top of sockman. I guess the devil is in the details but can you address that in what sense the current approach follows what was discussed there? Thanks!

  10. pinheadmz commented at 0:03 am on March 15, 2025: member

    My understanding from the low-level networking discussion at CoreDev was that this wouldn’t build on top of sockman. I guess the devil is in the details but can you address that in what sense the current approach follows what was discussed there? Thanks!

    Sure, by coredev I had already written most of this implementation (based on sockman) but the performance was bad, and that was part of the motivation behind the deep-dive talk. However, by the end of the week I had reviewed that code in person with smart attendees and not only improved the performance of my code but started to improve performance vs master branch as well! Those updates came in the days just after the deep-dive discussion.

    SOME kind of sockman is needed to replace libevent. The one @vasild wrote does actually seem to work well for this purpose as well as for p2p, and it would be “nice” to only have to maintain one I/O loop structure in bitcoind. @theuni is investigating how a sockman for http could be optimized if it had no other purpose, and I think that is the kind of feedback that will help us decide which path to take.

  11. pinheadmz force-pushed on Mar 18, 2025
  12. pinheadmz force-pushed on Mar 18, 2025
  13. vasild commented at 12:22 pm on March 19, 2025: contributor

    SOME kind of sockman is needed to replace libevent … it would be “nice” to only have to maintain one I/O loop structure in bitcoind.

    :100:

  14. pinheadmz force-pushed on Mar 19, 2025
  15. DrahtBot added the label Needs rebase on Mar 20, 2025
  16. pinheadmz force-pushed on Mar 20, 2025
  17. pinheadmz force-pushed on Mar 20, 2025
  18. DrahtBot removed the label Needs rebase on Mar 20, 2025
  19. pinheadmz force-pushed on Mar 21, 2025
  20. DrahtBot added the label Needs rebase on Mar 24, 2025
  21. pinheadmz force-pushed on Mar 29, 2025
  22. pinheadmz force-pushed on Mar 29, 2025
  23. DrahtBot removed the label CI failed on Mar 30, 2025
  24. DrahtBot removed the label Needs rebase on Mar 30, 2025
  25. pinheadmz force-pushed on Mar 31, 2025
  26. pinheadmz commented at 10:18 am on March 31, 2025: member

    I rebased this branch on a single squashed commit from #30988 essentially just cherry-picking sockman.{h,cpp} by @vasild and leaving out the p2p refactor. This will make rebase maintenance on master a lot easier by reducing conflicting scope, and hopefully also makes review easier. It also means to some extent this PR can be merged independently of #30988, and also gives @theuni some room to rewrite a specific HTTP sockman if a more efficient purpose-focused module can be written. (Will update PR description in a moment)

    I’ve finally gotten all CI to pass so I’m going to mark this PR as ready for review as I move on to integration testing with all the bitcoin client libraries I can find!

  27. pinheadmz marked this as ready for review on Mar 31, 2025
  28. pinheadmz renamed this:
    [draft] Replace libevent with our own HTTP and socket-handling implementation
    Replace libevent with our own HTTP and socket-handling implementation
    on Mar 31, 2025
  29. pinheadmz force-pushed on Apr 3, 2025
  30. pinheadmz commented at 7:14 pm on April 3, 2025: member
    rebase to a225633e34 includes a new test for “pipelining” HTTP requests (thanks @theuni for pointing out this oversight) and also adds a queue of requests to each HTTPClient to ensure requests are processed in series, in the order they were received.
  31. pinheadmz force-pushed on Apr 3, 2025
  32. DrahtBot added the label CI failed on Apr 3, 2025
  33. DrahtBot removed the label CI failed on Apr 4, 2025
  34. w0xlt commented at 7:06 pm on April 17, 2025: contributor
    Concept ACK.
  35. in src/common/sockman.cpp:396 in cfe5eba446 outdated
    391+        LogPrintf("connection from %s dropped: non-selectable socket\n", them.ToStringAddrPort());
    392+        return;
    393+    }
    394+
    395+    // According to the internet TCP_NODELAY is not carried into accepted sockets
    396+    // on all platforms.  Set it again here just to be sure.
    


    romanz commented at 5:44 am on April 21, 2025:
    nit: this comment seems to be incorrect, since TCP_NODELAY is only set once by SockMan.

    pinheadmz commented at 1:53 pm on April 28, 2025:

    This comment may need to be applied upstream in #30988

    FWIW, the Sockman backend might not be what we end up going with, as @theuni and I are considering an alternative.

  36. in src/util/time.cpp:127 in aeb8352a9d outdated
    118@@ -116,6 +119,20 @@ std::optional<int64_t> ParseISO8601DateTime(std::string_view str)
    119     return int64_t{TicksSinceEpoch<std::chrono::seconds>(tp)};
    120 }
    121 
    122+std::string FormatRFC7231DateTime(int64_t nTime)
    123+{
    124+    const std::chrono::sys_seconds secs{std::chrono::seconds{nTime}};
    125+    const auto days{std::chrono::floor<std::chrono::days>(secs)};
    126+    // 1970-01-01 was a Thursday
    127+    std::string weekday{weekdays[(days.time_since_epoch().count() + 4) % 7]};
    


    romanz commented at 5:49 am on April 21, 2025:
    nit: weekday and month can be a string_view.

    pinheadmz commented at 5:38 pm on April 28, 2025:
    done
  37. in src/util/strencodings.cpp:503 in a1e151c774 outdated
    496@@ -497,3 +497,10 @@ std::optional<uint64_t> ParseByteUnits(std::string_view str, ByteUnit default_mu
    497     }
    498     return *parsed_num * unit_amount;
    499 }
    500+
    501+std::vector<std::byte> StringToBuffer(const std::string& str)
    502+{
    503+    return std::vector<std::byte>(
    


    romanz commented at 6:03 am on April 21, 2025:

    nit: maybe using std::span will be simpler? For example:

    0    auto span = std::as_bytes(std::span(str));
    1    return {span.begin(), span.end()};
    

    pinheadmz commented at 3:38 pm on April 28, 2025:
    Great thanks, done.
  38. in src/util/string.cpp:30 in a1e151c774 outdated
    25+
    26+    auto line_start = it;
    27+    std::string line{};
    28+    while (it != end) {
    29+        char c = static_cast<char>(*it);
    30+        line += c;
    


    romanz commented at 6:33 am on April 21, 2025:

    Maybe it would be better (performance-wise) to create the string after the loop is over?

     0std::optional<std::string> LineReader::ReadLine()
     1{
     2    if (it == end) {
     3        return std::nullopt;
     4    }
     5
     6    auto line_start = it;
     7    size_t count = 0;
     8    while (it != end) {
     9        char c = static_cast<char>(*it);
    10        ++it;
    11        ++count;
    12        if (c == '\n') break;
    13        if (count >= max_read) throw std::runtime_error("max_read exceeded by LineReader");
    14    }
    15    const std::byte *data = &*line_start;
    16    std::string line{reinterpret_cast<const char *>(data), count};
    17    line = TrimString(line); // delete trailing \r and/or \n
    18    return line;
    19}
    

    pinheadmz commented at 3:43 pm on April 28, 2025:
    awesome, done
  39. in src/util/string.cpp:36 in a1e151c774 outdated
    31+        ++it;
    32+        if (c == '\n') break;
    33+        if ((size_t)std::distance(line_start, it) >= max_read) throw std::runtime_error("max_read exceeded by LineReader");
    34+    }
    35+
    36+    line = TrimString(line); // delete trailing \r and/or \n
    


    romanz commented at 6:35 am on April 21, 2025:
    Can we return a string_view to the internal buffer? (preventing allocation & copy)

    pinheadmz commented at 5:31 pm on April 28, 2025:
    by internal buffer I assume you mean m_recv_buffer which im trying to copy from into the HTTPRequest object, then clear the buffer. I left another response about maybe changing the type of that buffer to reduce unnecessary copies, but am going to leave it alone for this current rebase. What I did here though was swap out TrimString() with TrimStringView() so at least that’s one less.
  40. in src/util/string.cpp:45 in a1e151c774 outdated
    40+// Ignores max_read but won't overflow
    41+std::string LineReader::ReadLength(size_t len)
    42+{
    43+    if (len == 0) return "";
    44+    if (Left() < len) throw std::runtime_error("Not enough data in buffer");
    45+    std::string out(reinterpret_cast<const char*>(&(*it)), len);
    


    romanz commented at 6:37 am on April 21, 2025:
    Can we return a string_view to the internal buffer? (preventing allocation & copy)

    pinheadmz commented at 5:36 pm on April 28, 2025:
    I think I addressed this already by taking your suggestion here https://github.com/bitcoin/bitcoin/pull/32061/files#r2052024440

    romanz commented at 10:15 am on May 3, 2025:

    7d30118401 seems to still return a std::string:

    0$ git grep ReadLength
    1src/util/string.cpp:  std::string LineReader::ReadLength(size_t len)
    2src/util/string.h:    std::string ReadLength(size_t len);
    

    pinheadmz commented at 1:43 pm on May 9, 2025:
    Oh I see I misunderstood. For now I want LineReader to return new string objects because it reads directly from the HTTPClient’s m_recv_buffer which is then erased. I know we can prevent that copy by being more clever with m_recv_buffer and I still plan on exploring that, maybe as a follow up? It’s nice and simple now and there is a clean separation between the socket layer receiving data and the HTTPRequest

    romanz commented at 1:46 pm on May 9, 2025:
    Sounds good, thanks for the context :)
  41. in src/httpserver.cpp:787 in 70d003ca10 outdated
    780@@ -781,3 +781,69 @@ void UnregisterHTTPHandler(const std::string &prefix, bool exactMatch)
    781         pathHandlers.erase(i);
    782     }
    783 }
    784+
    785+
    786+namespace http_bitcoin {
    787+std::optional<std::string> HTTPHeaders::Find(const std::string key) const
    


    romanz commented at 6:48 am on April 21, 2025:
    nit: consider using string_view instead of string here and below.

    pinheadmz commented at 7:42 pm on April 28, 2025:
    done.

    romanz commented at 10:23 am on May 3, 2025:
    nit: key & value can also be a string_view (here and below)

    pinheadmz commented at 1:48 pm on May 9, 2025:
    Cool will address this nit on next rebase thanks!

    vasild commented at 5:25 pm on May 14, 2025:

    Bikeshed, feel free to ignore.

    It is redundant to have “http” in both the namespace and in the class name: http_bitcoin::HTTPHeaders::Find(). Also no need to suffix anything with _bitcoin - if we would do that, then we would have to have e.g. namespace common_bitcoin, namespace i2p_bitcoin, namespace node_bitcoin etc. So http_bitcoin::HTTPHeaders::Find() could be shortened to http::Headers::Find().


    pinheadmz commented at 6:09 pm on May 14, 2025:
    Happy to change the names, I just needed to separate the old libevent server from the new one in the PR in the series of commits where each namespace has classes like HTTPRequest
  42. in src/httpserver.cpp:840 in 70d003ca10 outdated
    835+}
    836+
    837+std::string HTTPHeaders::Stringify() const
    838+{
    839+    std::string out;
    840+    for (auto it = m_map.begin(); it != m_map.end(); ++it) {
    


    romanz commented at 4:11 pm on April 21, 2025:

    nit: maybe the following would be simpler?

    0    for (const auto& [k, v] : m_map) {
    1        out += k + ": " + v + "\r\n";
    2    }
    

    pinheadmz commented at 7:41 pm on April 28, 2025:
    thanks I’ll take it!
  43. in src/httpserver.cpp:648 in 482382bd14 outdated
    915+    if (!req->LoadBody(reader)) return false;
    916+
    917+    // Remove the bytes read out of the buffer.
    918+    // If one of the above calls throws an error, the caller must
    919+    // catch it and disconnect the client.
    920+    m_recv_buffer.erase(
    


    romanz commented at 8:49 am on April 27, 2025:
    nit: maybe it’s better to return a std::span from m_recv_buffer to avoid a copy here.

    pinheadmz commented at 3:21 pm on April 28, 2025:
    we’d still want to erase the data after fulfilling the request though (if I’m understanding your comment correctly). Would it make more sense if m_recv_buffer was a std::deque or something where we can truncate the structure without copying/moving/shifting left?

    romanz commented at 5:53 am on April 30, 2025:

    Would it make more sense if m_recv_buffer was a std::deque or something where we can truncate the structure without copying/moving/shifting left?

    Sounds good :)

  44. in src/httpserver.cpp:536 in 73c3c2e3d3 outdated
    940@@ -940,6 +941,92 @@ bool HTTPRequest::LoadBody(LineReader& reader)
    941     }
    942 }
    943 
    944+void HTTPRequest::WriteReply(HTTPStatusCode status, std::span<const std::byte> reply_body)
    


    romanz commented at 9:02 am on April 27, 2025:
    Would it be possible to allow writing a reply composed of several span<byte>? It should allow sending while serializing a response object.

    pinheadmz commented at 2:33 pm on April 28, 2025:
    Could this be done in a followup? I actually just checked #30321 to see why we didn’t do this back then – I remember when that was authored and merged but I didn’t realize you were the author!

    romanz commented at 5:54 am on April 30, 2025:
    Sure, thanks :)
  45. in src/httpserver.cpp:610 in 73c3c2e3d3 outdated
    1009+        m_client->m_send_buffer.insert(m_client->m_send_buffer.end(), headers_bytes.begin(), headers_bytes.end());
    1010+
    1011+        // We've been using std::span up until now but it is finally time to copy
    1012+        // data. The original data will go out of scope when WriteReply() returns.
    1013+        // This is analogous to the memcpy() in libevent's evbuffer_add()
    1014+        m_client->m_send_buffer.insert(m_client->m_send_buffer.end(), reply_body.begin(), reply_body.end());
    


    romanz commented at 9:04 am on April 27, 2025:
    Maybe use a list of buffers (instead a single FIFO buffer)?

    pinheadmz commented at 3:06 pm on April 28, 2025:
    I don’t understand – would that be an optimization? It’s not really a FIFO, since we insert at the end of the buffer here but send data from the begin() of the buffer in SendBytesFromBuffer()

    romanz commented at 5:57 am on April 30, 2025:

    I suggest considering a similar approach to libevent’s evbuffers:

    evbuffers are represented using a linked list of memory chunks, with pointers to the first and last chunk in the chain.

    https://libevent.org/doc/buffer_8h.html


    romanz commented at 6:45 am on May 1, 2025:
    Thinking about it, doing the copy here is definitely fine for now (we can optimize it in a following PR).
  46. in src/httpserver.h:275 in 73c3c2e3d3 outdated
    270+    // Response headers may be set in advance before response body is known
    271+    HTTPHeaders m_response_headers;
    272+    void WriteReply(HTTPStatusCode status, std::span<const std::byte> reply_body = {});
    273+    void WriteReply(HTTPStatusCode status, const char* reply_body) {
    274+        auto reply_body_view = std::string_view(reply_body);
    275+        std::span<const std::byte> byte_span(reinterpret_cast<const std::byte*>(reply_body_view.data()), reply_body_view.size());
    


    romanz commented at 9:16 am on April 27, 2025:

    nit: can be simplified using

    0    void WriteReply(HTTPStatusCode status, std::string_view reply_body_view)
    1    {
    2        WriteReply(status, std::as_bytes(std::span{reply_body_view}));
    3    }
    

    pinheadmz commented at 3:36 pm on April 28, 2025:
    awesome thanks
  47. in src/httpserver.h:137 in 6a6285d268 outdated
    170+    {
    171+        auto reply_body_view = std::string_view(reply_body);
    172+        std::span<const std::byte> byte_span(reinterpret_cast<const std::byte*>(reply_body_view.data()), reply_body_view.size());
    173+        WriteReply(status, byte_span);
    174+    }
    175+    void WriteReply(HTTPStatusCode status, const std::string& reply_body)
    


    romanz commented at 9:20 am on April 27, 2025:
    Can be removed if we’ll add WriteReply(HTTPStatusCode status, std::string_view reply_body_view) (see above).

    pinheadmz commented at 3:36 pm on April 28, 2025:
    👍
  48. in src/httpserver.cpp:246 in 90761d5026 outdated
    242@@ -243,18 +243,18 @@ static bool InitHTTPAllowList()
    243 }
    244 
    245 /** HTTP request method as string - use for logging only */
    246-std::string RequestMethodString(HTTPRequest::RequestMethod m)
    247+std::string RequestMethodString(HTTPRequestMethod m)
    


    romanz commented at 9:43 am on April 27, 2025:

    nit:

    0std::string_view RequestMethodString(HTTPRequestMethod m)
    

    pinheadmz commented at 1:59 pm on April 28, 2025:
    done
  49. in src/httpserver.cpp:980 in b828fa1e29 outdated
    975+    if (end == std::string::npos) {
    976+        end = decoded_uri.length();
    977+    }
    978+    const std::string query{decoded_uri.substr(start + 1, end - start - 1)};
    979+    // find requested parameter in query
    980+    const std::vector<std::string> params{SplitString(query, "&")};
    


    romanz commented at 9:44 am on April 27, 2025:
    nit: wouldn’t it copy the substrings?

    pinheadmz commented at 2:12 pm on April 28, 2025:
    You’re right thanks, I haven’t been dilligent about using string_view when copying isn’t needed. Fixed that here and also in HTTPRequest::LoadControlData()
  50. in src/httpserver.cpp:522 in b828fa1e29 outdated
    989+        }
    990+    }
    991+    return std::nullopt;
    992+}
    993+
    994+std::pair<bool, std::string> HTTPRequest::GetHeader(const std::string& hdr) const
    


    romanz commented at 9:45 am on April 27, 2025:

    nit:

    0std::pair<bool, std::string_view> HTTPRequest::GetHeader(const std::string& hdr) const
    

    pinheadmz commented at 2:28 pm on April 28, 2025:
    thanks, for this fix I added a new commit instead of a fixup. The reason is because in the “story” of the commits, GetHeader() is already defined and used by the legacy HTTP server returning a string, and the goal at this point is to match the exiting API. But we can optimize it after the big “switch HTTP servers” commit.
  51. romanz commented at 1:00 pm on April 27, 2025: contributor
    LGTM, many thanks! Added some comments/suggestions above :)
  52. pinheadmz force-pushed on Apr 29, 2025
  53. pinheadmz commented at 1:23 pm on April 29, 2025: member
    Thanks so much for the great review @romanz I took most of your suggestions, there’s still an open question about m_recv_buffer but I really appreciate all your attention to string copying!
  54. romanz commented at 7:54 am on May 3, 2025: contributor
  55. DrahtBot requested review from laanwj on May 3, 2025
  56. DrahtBot requested review from fjahr on May 3, 2025
  57. DrahtBot requested review from vasild on May 3, 2025
  58. in test/functional/interface_http.py:132 in 7d30118401 outdated
    127+        req2 += body2
    128+        # Get the underlying socket from HTTP connection so we can send something unusual
    129+        conn = http.client.HTTPConnection(urlNode2.hostname, urlNode2.port)
    130+        conn.connect()
    131+        sock = conn.sock
    132+        sock.settimeout(1)
    


    vasild commented at 7:41 am on May 15, 2025:
    1 second timeout to send or receive seems more than enough for local testing on a dev machine. However, CI virtual machines sometimes are surprisingly slow. To avoid unnecessary test failures maybe it would be better to have this be 5 or 10 seconds for the sendall() calls and then set to 1 for the recv() call which we expect to timeout.

    pinheadmz commented at 2:27 pm on May 18, 2025:
    I should have mentioned the initial test commits have been split off into #32408 and the test as written here failed CI, so has been modified with rpcservertimeout=2 and then expects a timeout between 1 and 4 seconds

    pinheadmz commented at 7:04 pm on June 17, 2025:
    Addressed in #32408
  59. in test/functional/interface_http.py:140 in 7d30118401 outdated
    135+        sock.sendall(req2.encode("utf-8"))
    136+        try:
    137+            # The server should not respond to the fast, second request
    138+            # until the (very) slow first request has been handled:
    139+            res = sock.recv(1024)
    140+            assert not res
    


    vasild commented at 7:53 am on May 15, 2025:

    Shouldn’t this be assert False? Here the expectation is that the recv() will throw an exception due to timeout.

    https://docs.python.org/3/library/socket.html#socket.socket.recv

    A returned empty bytes object indicates that the client has disconnected.

    An “empty bytes object” will not trigger the assert assert not res but if that happens (= disconnect) then the test should fail.

    suggestion:

    0-            assert not res
    1+            assert False
    

    pinheadmz commented at 7:04 pm on June 17, 2025:
    Addressed in #32408
  60. in test/functional/interface_http.py:217 in 7d30118401 outdated
    212+        try:
    213+            conn.request('GET', '/')
    214+            conn.getresponse()
    215+        #       macos/linux           windows
    216+        except (ConnectionResetError, ConnectionAbortedError):
    217+            pass
    


    vasild commented at 10:21 am on May 15, 2025:
    This will also pass if no exception is thrown. Either add assert False after line 214 or have a boolean variable to false before the try and set it to true inside except and assert that it is true afterwards.

    pinheadmz commented at 7:04 pm on June 17, 2025:
    Addressed in #32408
  61. in test/functional/interface_http.py:169 in 7d30118401 outdated
    164+            b'0A' * 1000000,
    165+            b'0B' * 1000000,
    166+            b'0C' * 1000000,
    167+            b'0D' * 1000000,
    168+            b'"]}'
    169+        ]
    


    vasild commented at 11:02 am on May 15, 2025:
    Is the intention here to send 8MB of data?

    pinheadmz commented at 7:04 pm on June 17, 2025:
    Addressed in #32408
  62. in test/functional/interface_http.py:179 in 7d30118401 outdated
    174+            url='/',
    175+            body=iter(body_chunked),
    176+            headers=headers_chunked,
    177+            encode_chunked=True)
    178+        out1 = conn.getresponse().read()
    179+        assert out1 == b'{"result":"high-hash","error":null}\n'
    


    vasild commented at 11:05 am on May 15, 2025:

    Here and elsewhere in the added tests, assert_equal() produces a better error message:

    assert (value of out1 is not printed):

    0    assert out1 == b'{"result":"high-hash","error":null}\n'
    1           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    2AssertionError
    

    vs

    assert_equal():

    0AssertionError: not(b'{"result":null,"error":{"code":-32700,"message":"Parse error"},"id":null}\n' == b'{"result":"high-hash","error":null}\n')
    

    pinheadmz commented at 7:04 pm on June 17, 2025:
    Addressed in #32408
  63. in src/test/util_tests.cpp:881 in 7d30118401 outdated
    1068+    BOOST_CHECK(!ParseUInt64Hex("", &n));
    1069+    BOOST_CHECK(!ParseUInt64Hex("-1", &n));
    1070+    BOOST_CHECK(!ParseUInt64Hex("10 00", &n));
    1071+    BOOST_CHECK(!ParseUInt64Hex("1 ", &n));
    1072+    BOOST_CHECK(!ParseUInt64Hex("0xAB", &n));
    1073+}
    


    vasild commented at 12:41 pm on May 15, 2025:

    Add some more test cases and avoid BOOST_CHECK(A && B) because when that fails it is unclear which one of A or B was false. Prefer BOOST_CHECK(A); BOOST_CHECK(B);. Also, because B is n == 123, do BOOST_CHECK_EQUAL(n, 123):

     0BOOST_AUTO_TEST_CASE(test_ParseUInt64Hex)
     1{
     2    uint64_t n;
     3    // Valid values
     4    BOOST_CHECK(ParseUInt64Hex("1234", nullptr));
     5    BOOST_CHECK(ParseUInt64Hex("1234", &n));
     6    BOOST_CHECK_EQUAL(n, 0x1234);
     7    BOOST_CHECK(ParseUInt64Hex("a", &n));
     8    BOOST_CHECK_EQUAL(n, 0xA);
     9    BOOST_CHECK(ParseUInt64Hex("0000000a", &n));
    10    BOOST_CHECK_EQUAL(n, 0xA);
    11    BOOST_CHECK(ParseUInt64Hex("100", &n));
    12    BOOST_CHECK_EQUAL(n, 0x100);
    13    BOOST_CHECK(ParseUInt64Hex("DEADbeef", &n));
    14    BOOST_CHECK_EQUAL(n, 0xDEADbeef);
    15    BOOST_CHECK(ParseUInt64Hex("FfFfFfFf", &n));
    16    BOOST_CHECK_EQUAL(n, 0xFfFfFfFf);
    17    BOOST_CHECK(ParseUInt64Hex("123456789", &n));                                      
    18    BOOST_CHECK_EQUAL(n, 0x123456789ULL);
    19    BOOST_CHECK(ParseUInt64Hex("0", &n));
    20    BOOST_CHECK_EQUAL(n, 0);
    21    BOOST_CHECK(ParseUInt64Hex("FfFfFfFfFfFfFfFf", &n));
    22    BOOST_CHECK_EQUAL(n, 0xFfFfFfFfFfFfFfFfULL);
    23    // Invalid values
    24    BOOST_CHECK(!ParseUInt64Hex("", &n));
    25    BOOST_CHECK(!ParseUInt64Hex("-1", &n));
    26    BOOST_CHECK(!ParseUInt64Hex("10 00", &n));
    27    BOOST_CHECK(!ParseUInt64Hex("1 ", &n));
    28    BOOST_CHECK(!ParseUInt64Hex("0xAB", &n));
    29    BOOST_CHECK(!ParseUInt64Hex("FfFfFfFfFfFfFfFf0", &n));
    30}
    

    pinheadmz commented at 7:09 pm on June 17, 2025:
    taken, thanks
  64. in src/util/strencodings.cpp:270 in 7d30118401 outdated
    265+    }
    266+    if (out != nullptr) {
    267+        *out = total;
    268+    }
    269+    return true;
    270+}
    


    vasild commented at 12:46 pm on May 15, 2025:

    This would not recognize numbers larger than FFFFFFFF (4 bytes, 8 chars). That should be if (str.size() > 16) return false; instead of 8.

    But better not roll our own and use std::from_chars() like the others:

     0diff --git i/src/util/strencodings.cpp w/src/util/strencodings.cpp
     1index d923bdd121..05a5dced62 100644
     2--- i/src/util/strencodings.cpp
     3+++ w/src/util/strencodings.cpp
     4@@ -199,21 +199,21 @@ std::optional<std::vector<unsigned char>> DecodeBase32(std::string_view str)
     5 
     6     return ret;
     7 }
     8 
     9 namespace {
    10 template <typename T>
    11-bool ParseIntegral(std::string_view str, T* out)
    12+bool ParseIntegral(std::string_view str, T* out, size_t base = 10)
    13 {
    14     static_assert(std::is_integral_v<T>);
    15     // Replicate the exact behavior of strtol/strtoll/strtoul/strtoull when
    16     // handling leading +/- for backwards compatibility.
    17     if (str.length() >= 2 && str[0] == '+' && str[1] == '-') {
    18         return false;
    19     }
    20-    const std::optional<T> opt_int = ToIntegral<T>((!str.empty() && str[0] == '+') ? str.substr(1) : str);
    21+    const std::optional<T> opt_int = ToIntegral<T>((!str.empty() && str[0] == '+') ? str.substr(1) : str, base);
    22     if (!opt_int) {
    23         return false;
    24     }
    25     if (out != nullptr) {
    26         *out = *opt_int;
    27     }
    28@@ -250,26 +250,13 @@ bool ParseUInt64(std::string_view str, uint64_t* out)
    29 {
    30     return ParseIntegral<uint64_t>(str, out);
    31 }
    32 
    33 bool ParseUInt64Hex(std::string_view str, uint64_t* out)
    34 {
    35-    if (str.size() > 8) return false;
    36-    if (str.size() < 1) return false;
    37-    uint64_t total{0};
    38-    auto it = str.begin();
    39-    while (it != str.end()) {
    40-        auto v = HexDigit(*(it++));
    41-        if (v < 0) return false;
    42-        total <<= 4;
    43-        total |= v;
    44-    }
    45-    if (out != nullptr) {
    46-        *out = total;
    47-    }
    48-    return true;
    49+    return ParseIntegral<uint64_t>(str, out, 16);
    50 }
    51 
    52 std::string FormatParagraph(std::string_view in, size_t width, size_t indent)
    53 {
    54     assert(width >= indent);
    55     std::stringstream out;
    56diff --git i/src/util/strencodings.h w/src/util/strencodings.h
    57index b95b21dafd..d83fa3c841 100644
    58--- i/src/util/strencodings.h
    59+++ w/src/util/strencodings.h
    60@@ -173,17 +173,17 @@ constexpr inline bool IsSpace(char c) noexcept {
    61  * is `-?[0-9]+`. The minus sign is only permitted for signed integer types.
    62  *
    63  * [@returns](/bitcoin-bitcoin/contributor/returns/) std::nullopt if the entire string could not be parsed, or if the
    64  *   parsed value is not in the range representable by the type T.
    65  */
    66 template <typename T>
    67-std::optional<T> ToIntegral(std::string_view str)
    68+std::optional<T> ToIntegral(std::string_view str, size_t base = 10)
    69 {
    70     static_assert(std::is_integral_v<T>);
    71     T result;
    72-    const auto [first_nonmatching, error_condition] = std::from_chars(str.data(), str.data() + str.size(), result);
    73+    const auto [first_nonmatching, error_condition] = std::from_chars(str.data(), str.data() + str.size(), result, base);
    74     if (first_nonmatching != str.data() + str.size() || error_condition != std::errc{}) {
    75         return std::nullopt;
    76     }
    77     return result;
    78 }
    79 
    

    pinheadmz commented at 7:39 pm on June 17, 2025:
    Due to #32520 I had to change this entire commit anyway. Took your suggestion about adding base argument here and just dropped the wrapper function. Will call ToIntegral( ... 16) directly when reading chunk size.
  65. vasild commented at 1:57 pm on May 15, 2025: contributor
    Posting review midway. Reviewed up to and including 144a777f86 string: add CaseInsensitiveComparator. I like that the PR starts with some new tests to enforce correct behavior.
  66. DrahtBot requested review from vasild on May 15, 2025
  67. in src/util/time.cpp:127 in 7d30118401 outdated
    118@@ -116,6 +119,20 @@ std::optional<int64_t> ParseISO8601DateTime(std::string_view str)
    119     return int64_t{TicksSinceEpoch<std::chrono::seconds>(tp)};
    120 }
    121 
    122+std::string FormatRFC7231DateTime(int64_t nTime)
    123+{
    124+    const std::chrono::sys_seconds secs{std::chrono::seconds{nTime}};
    125+    const auto days{std::chrono::floor<std::chrono::days>(secs)};
    126+    // 1970-01-01 was a Thursday
    127+    std::string_view weekday{weekdays[(days.time_since_epoch().count() + 4) % 7]};
    


    vasild commented at 12:13 pm on May 16, 2025:

    Passing -1717429609 would make this code try to access weekdays[-1] :fire:

    For modern style, better use std::array instead of C arrays and, more importantly, use the array’s at() method which has boundary checks, just in case. Here is a change that adds some more tests and fixes the out-of-bounds access:

     0diff --git i/src/test/util_tests.cpp w/src/test/util_tests.cpp
     1index 387493152d..ebb40dd713 100644
     2--- i/src/test/util_tests.cpp
     3+++ w/src/test/util_tests.cpp
     4@@ -384,15 +384,23 @@ BOOST_AUTO_TEST_CASE(util_FormatISO8601Date)
     5     BOOST_CHECK_EQUAL(FormatISO8601Date(0), "1970-01-01");
     6     BOOST_CHECK_EQUAL(FormatISO8601Date(1317425777), "2011-09-30");
     7 }
     8 
     9 BOOST_AUTO_TEST_CASE(util_FormatRFC7231DateTime)
    10 {
    11+    BOOST_CHECK_EQUAL(FormatRFC7231DateTime(std::numeric_limits<int64_t>::max()), "");
    12+    BOOST_CHECK_EQUAL(FormatRFC7231DateTime(253402300800), "");
    13+    BOOST_CHECK_EQUAL(FormatRFC7231DateTime(253402300799), "Fri, 31 Dec 9999 23:59:59 GMT");
    14     BOOST_CHECK_EQUAL(FormatRFC7231DateTime(253402214400), "Fri, 31 Dec 9999 00:00:00 GMT");
    15     BOOST_CHECK_EQUAL(FormatRFC7231DateTime(1717429609), "Mon, 03 Jun 2024 15:46:49 GMT");
    16     BOOST_CHECK_EQUAL(FormatRFC7231DateTime(0), "Thu, 01 Jan 1970 00:00:00 GMT");
    17+    BOOST_CHECK_EQUAL(FormatRFC7231DateTime(-1), "Wed, 31 Dec 1969 23:59:59 GMT");
    18+    BOOST_CHECK_EQUAL(FormatRFC7231DateTime(-1717429609), "Sat, 31 Jul 1915 08:13:11 GMT");
    19+    BOOST_CHECK_EQUAL(FormatRFC7231DateTime(-62167219200), "Sat, 01 Jan 0000 00:00:00 GMT");
    20+    BOOST_CHECK_EQUAL(FormatRFC7231DateTime(-62167219201), "");
    21+    BOOST_CHECK_EQUAL(FormatRFC7231DateTime(std::numeric_limits<int64_t>::min()), "");
    22 }
    23 
    24 BOOST_AUTO_TEST_CASE(util_FormatMoney)
    25 {
    26     BOOST_CHECK_EQUAL(FormatMoney(0), "0.00");
    27     BOOST_CHECK_EQUAL(FormatMoney((COIN/10000)*123456789), "12345.6789");
    28diff --git i/src/util/time.cpp w/src/util/time.cpp
    29index 9b9167d19a..c0f375956a 100644
    30--- i/src/util/time.cpp
    31+++ w/src/util/time.cpp
    32@@ -7,21 +7,22 @@
    33 
    34 #include <compat/compat.h>
    35 #include <tinyformat.h>
    36 #include <util/check.h>
    37 #include <util/strencodings.h>
    38 
    39+#include <array>
    40 #include <atomic>
    41 #include <chrono>
    42 #include <optional>
    43 #include <string>
    44 #include <string_view>
    45 #include <thread>
    46 
    47-static const std::string weekdays[7] = {"Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat"};
    48-static const std::string months[12] = {"Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"};
    49+static constexpr std::array<std::string_view, 7> weekdays{"Thu", "Fri", "Sat", "Sun", "Mon", "Tue", "Wed"}; // 1970-01-01 was a Thursday.
    50+static constexpr std::array<std::string_view, 12> months{"Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"};
    51 
    52 void UninterruptibleSleep(const std::chrono::microseconds& n) { std::this_thread::sleep_for(n); }
    53 
    54 static std::atomic<std::chrono::seconds> g_mock_time{}; //!< For testing
    55 std::atomic<bool> g_used_system_time{false};
    56 static std::atomic<std::chrono::milliseconds> g_mock_steady_time{}; //!< For testing
    57@@ -116,20 +117,24 @@ std::optional<int64_t> ParseISO8601DateTime(std::string_view str)
    58     }
    59     const auto time{std::chrono::hours{*hour} + std::chrono::minutes{*min} + std::chrono::seconds{*sec}};
    60     const auto tp{std::chrono::sys_days{ymd} + time};
    61     return int64_t{TicksSinceEpoch<std::chrono::seconds>(tp)};
    62 }
    63 
    64-std::string FormatRFC7231DateTime(int64_t nTime)
    65+std::string FormatRFC7231DateTime(int64_t time)
    66 {
    67-    const std::chrono::sys_seconds secs{std::chrono::seconds{nTime}};
    68+    if (time < -62167219200 || 253402300799 < time) {
    69+        // RFC7231 mandates 4-digit year, so only support years 0 to 9999
    70+        return "";
    71+    }
    72+    const std::chrono::sys_seconds secs{std::chrono::seconds{time}};
    73     const auto days{std::chrono::floor<std::chrono::days>(secs)};
    74-    // 1970-01-01 was a Thursday
    75-    std::string_view weekday{weekdays[(days.time_since_epoch().count() + 4) % 7]};
    76+    const auto w{days.time_since_epoch().count() % 7}; // will be in the range [-6, 6]
    77+    std::string_view weekday{weekdays.at(w >= 0 ? w : w + 7)};
    78     const std::chrono::year_month_day ymd{days};
    79-    std::string_view month{months[unsigned{ymd.month()} - 1]};
    80+    std::string_view month{months.at(unsigned{ymd.month()} - 1)};
    81     const std::chrono::hh_mm_ss hms{secs - days};
    82     // examples: Mon, 27 Jul 2009 12:28:53 GMT
    83     //           Fri, 31 May 2024 19:18:04 GMT
    84     return strprintf("%03s, %02u %03s %04i %02i:%02i:%02i GMT", weekday, unsigned{ymd.day()}, month, signed{ymd.year()}, hms.hours().count(), hms.minutes().count(), hms.seconds().count());
    85 }
    86 
    

    pinheadmz commented at 7:56 pm on June 17, 2025:
    ah really really great suggestion thanks, taken
  68. in src/httpserver.cpp:314 in 7d30118401 outdated
    327 
    328-bool InitHTTPServer(const util::SignalInterrupt& interrupt)
    329+namespace http_bitcoin {
    330+using util::Split;
    331+
    332+std::optional<std::string_view> HTTPHeaders::Find(const std::string key) const
    


    vasild commented at 1:40 pm on May 16, 2025:

    e95c6f5b6511ae35141b1e440e1f22e1004d3de6

    Copying strings is expensive:

    0std::optional<std::string_view> HTTPHeaders::Find(const std::string& key) const
    

    pinheadmz commented at 4:12 pm on June 19, 2025:
    👍
  69. in src/httpserver.cpp:321 in 7d30118401 outdated
    340 
    341-    // Redirect libevent's logging to our own log
    342-    event_set_log_callback(&libevent_log_cb);
    343-    // Update libevent's log handling.
    344-    UpdateHTTPServerLogging(LogInstance().WillLogCategory(BCLog::LIBEVENT));
    345+void HTTPHeaders::Write(const std::string key, const std::string value)
    


    vasild commented at 1:41 pm on May 16, 2025:
    0void HTTPHeaders::Write(const std::string& key, const std::string& value)
    

    pinheadmz commented at 4:29 pm on June 19, 2025:
    :+1:
  70. in src/httpserver.cpp:332 in 7d30118401 outdated
    356-#ifdef WIN32
    357-    evthread_use_windows_threads();
    358-#else
    359-    evthread_use_pthreads();
    360-#endif
    361+void HTTPHeaders::Remove(const std::string key)
    


    vasild commented at 1:43 pm on May 16, 2025:
    0void HTTPHeaders::Remove(const std::string& key)
    

    pinheadmz commented at 4:30 pm on June 19, 2025:
    :+1:
  71. in src/httpserver.h:78 in 7d30118401 outdated
     96+{
     97+public:
     98+    std::optional<std::string_view> Find(const std::string key) const;
     99+    void Write(const std::string key, const std::string value);
    100+    void Remove(const std::string key);
    101+    bool Read(util::LineReader& reader);
    


    vasild commented at 2:38 pm on May 16, 2025:
    Would be good to document when Read() throws, returns true and false.

    pinheadmz commented at 4:36 pm on June 19, 2025:
    sure will add
  72. vasild commented at 2:44 pm on May 16, 2025: contributor
    Posting review midway. Reviewed up to and including e95c6f5b6511ae35141b1e440e1f22e1004d3de6 http: Implement HTTPHeaders class
  73. DrahtBot requested review from vasild on May 16, 2025
  74. DrahtBot added the label Needs rebase on May 20, 2025
  75. in src/rpc/protocol.h:25 in 7d30118401 outdated
    19@@ -20,6 +20,20 @@ enum HTTPStatusCode
    20     HTTP_SERVICE_UNAVAILABLE   = 503,
    21 };
    22 
    23+// Copied from libevent http.c success_phrases[] and client_error_phrases[]
    24+// TODO: Should HTTPStatusCode and HTTPReason be moved since they are not RPC protocols?
    25+const std::map<HTTPStatusCode, std::string> HTTPReason{
    26+    {HTTP_OK, "OK"},
    


    vasild commented at 11:24 am on May 22, 2025:

    TODO: Should HTTPStatusCode and HTTPReason be moved since they are not RPC protocols?

    I think no because this file already contains generic HTTP constants, e.g. HTTP_SERVICE_UNAVAILABLE = 503 above.


    pinheadmz commented at 2:28 pm on June 18, 2025:
    I’m asking if both should be moved, including the generic HTTP constants. They are also used in rest.cpp which has to #include <rpc/protocol.h> even though REST is not RPC protocol. I’ll remove the comment for now.
  76. in src/httpserver.h:96 in 7d30118401 outdated
    129-private:
    130-    struct evhttp_request* req;
    131-    const util::SignalInterrupt& m_interrupt;
    132-    bool replySent;
    133+public:
    134+    std::string m_method;
    


    vasild commented at 1:20 pm on May 22, 2025:
    Maybe enum is better for this?

    pinheadmz commented at 2:38 pm on June 18, 2025:
    yeah good call, I will add that in to “define HTTP request methods at module level outside of class” (currently 7a1359064e326f2b28a1c57aaa94fdbe17f9dee2) which is a few commits ahead
  77. in src/test/httpserver_tests.cpp:300 in 7d30118401 outdated
    301+        HTTPRequest req;
    302+        std::vector<std::byte> buffer{StringToBuffer(bad_content_length)};
    303+        LineReader reader(buffer, MAX_HEADERS_SIZE);
    304+        BOOST_CHECK(req.LoadControlData(reader));
    305+        BOOST_CHECK(req.LoadHeaders(reader));
    306+        BOOST_CHECK_THROW(req.LoadBody(reader), std::runtime_error);
    


    vasild commented at 1:45 pm on May 22, 2025:

    It is fine as it is. I am just wondering if it is not more appropriate for LoadHeaders() to throw instead of succeeding because the header Content-Length: eleven is already not according to the spec: https://httpwg.org/specs/rfc9110.html#field.content-length:

    Content-Length = 1*DIGIT


    pinheadmz commented at 2:45 pm on June 18, 2025:
    Hm yeah I see your point, we don’t validate individual headers in LoadHeaders(). I’ll leave as is for now, for what it’s worth, libevent doesn’t appear to check this until evhttp_get_body_length() as well…
  78. in src/httpserver.cpp:462 in 7d30118401 outdated
    512+        if (!ParseUInt64(content_length_value.value(), &content_length)) throw std::runtime_error("Cannot parse Content-Length value");
    513+
    514+        // Not enough data in buffer for expected body
    515+        if (reader.Left() < content_length) return false;
    516+
    517+        m_body = reader.ReadLength(content_length);
    


    vasild commented at 2:13 pm on May 22, 2025:

    If the body comes in pieces, then this parsing will be repeated for every piece and the return on line 460 will be executed a few times until enough data is received. This is suboptimal - to parse the same thing multiple times. I guess it is fine because we are not building a top performance HTTP server.

    If this becomes an issue at some point, then it would be better to remember that it has been parsed until e.g. byte 456 and only continue parsing from there as new data is received over the socket. I guess this will complicate the current implementation significantly.


    pinheadmz commented at 6:40 pm on June 19, 2025:

    Yeah its true, a client could send a request with a huge list of HTTP headers and an incomplete body and we would parse the headers over and over again from m_recv_buffer until the body was finally completed, when we finally pinch off a HTTPRequest and erase the data from the buffer.

    There certainly is an optimization there, keeping an in-progress request attached to HTTPClient with a “where we left off” pointer. It sounds like you would be ok with seeing that in a follow-up PR but let me know if its blocking for you here.

  79. in src/test/httpserver_tests.cpp:350 in 7d30118401 outdated
    351+        accepted_sockets->Push(std::move(connected_socket));
    352+
    353+        // Prepare a request handler that just stores received requests so we can examine them
    354+        // Mutex is required to prevent a race between this test's main thread and the Sockman I/O loop.
    355+        Mutex requests_mutex;
    356+        std::deque<std::unique_ptr<HTTPRequest>> requests;
    


    vasild commented at 2:50 pm on May 22, 2025:
    0        Mutex requests_mutex;
    1        std::deque<std::unique_ptr<HTTPRequest>> requests GUARDED_BY(requests_mutex);
    

    pinheadmz commented at 2:48 pm on June 18, 2025:

    Compiler didn’t like this actually:

    warning: ‘guarded_by’ attribute only applies to non-static data members and global variables [-Wignored-attributes]

  80. in src/test/httpserver_tests.cpp:354 in 7d30118401 outdated
    352+
    353+        // Prepare a request handler that just stores received requests so we can examine them
    354+        // Mutex is required to prevent a race between this test's main thread and the Sockman I/O loop.
    355+        Mutex requests_mutex;
    356+        std::deque<std::unique_ptr<HTTPRequest>> requests;
    357+        auto StoreRequest = [&](std::unique_ptr<HTTPRequest> req) {
    


    vasild commented at 2:54 pm on May 22, 2025:
    Passing unique_ptr by value looks a bit strange since it is not supposed to be copied. I guess this works when the caller std::moves the object, but isn’t it more clear to use std::unique_ptr<HTTPRequest>&& req?

    pinheadmz commented at 3:03 pm on June 18, 2025:
    Agreed this better, thanks.
  81. vasild commented at 2:57 pm on May 22, 2025: contributor
    Posting review midway. Reviewed up to and including 980a9cd3d38abd0e0da85363056a2be6098d3919 http: read requests from connected clients.
  82. DrahtBot requested review from vasild on May 22, 2025
  83. in src/httpserver.cpp:433 in 7d30118401 outdated
    468+bool HTTPRequest::LoadBody(LineReader& reader)
    469+{
    470+    // https://httpwg.org/specs/rfc9112.html#message.body
    471+
    472+    auto transfer_encoding_header = m_headers.Find("Transfer-Encoding");
    473+    if (transfer_encoding_header && ToLower(transfer_encoding_header.value()) == "chunked") {
    


    vasild commented at 10:18 am on June 5, 2025:
    Would be good to have tests to exercise the chunked transfer encoding from d7778f426c http: support "chunked" Transfer-Encoding.

    pinheadmz commented at 3:22 pm on June 18, 2025:
    That was merged in #32408 unless you mean a unit test?
  84. in src/httpserver.cpp:716 in 7d30118401 outdated
    829+    if (CloseConnection(client->m_node_id)) {
    830+        LogDebug(BCLog::HTTP, "Disconnected HTTP client %s (id=%d)\n", client->m_origin, client->m_node_id);
    831+    } else {
    832+        LogDebug(BCLog::HTTP, "Failed to disconnect non-existent HTTP client %s (id=%d)\n", client->m_origin, client->m_node_id);
    833+    }
    834 }
    


    vasild commented at 11:16 am on June 5, 2025:

    The CloseConnectionInternal() method can and should take just a reference, no need for shared pointer:

    0- void HTTPServer::CloseConnectionInternal(std::shared_ptr<HTTPClient>& client)
    1+ void HTTPServer::CloseConnectionInternal(const HTTPClient& client)
    

    pinheadmz commented at 4:58 pm on June 19, 2025:
    yes thanks
  85. in src/httpserver.h:189 in 7d30118401 outdated
    222+    std::atomic_bool m_send_ready{false};
    223+
    224+    // Set to true when we receive request data and set to false once m_send_buffer is cleared.
    225+    // Checked during DisconnectClients(). All of these operations take place in the Sockman I/O loop,
    226+    // however it may get set my a worker thread during an "optimistic send".
    227+    std::atomic_bool m_prevent_disconnect{false};
    


    vasild commented at 11:35 am on June 5, 2025:

    Would be good to note that this overrides m_disconnect:

    0// If set, then the client will not be disconnected even if `m_disconnect` is true.
    

    pinheadmz commented at 5:00 pm on June 19, 2025:
    :+1:
  86. in src/httpserver.h:182 in 7d30118401 outdated
    224+    // Set to true when we receive request data and set to false once m_send_buffer is cleared.
    225+    // Checked during DisconnectClients(). All of these operations take place in the Sockman I/O loop,
    226+    // however it may get set my a worker thread during an "optimistic send".
    227+    std::atomic_bool m_prevent_disconnect{false};
    228+
    229+    // Client request to keep connection open after all requests have been responded to.
    


    vasild commented at 11:36 am on June 5, 2025:

    nit:

    0    // Client requested to keep the connection open after all requests have been responded to.
    

    pinheadmz commented at 3:24 pm on June 18, 2025:
    👍
  87. in src/httpserver.cpp:904 in 7d30118401 outdated
    1134-        struct evkeyvalq params_q;
    1135-        evhttp_parse_query_str(query, &params_q);
    1136+bool InitHTTPServer(const util::SignalInterrupt& interrupt)
    1137+{
    1138+    if (!InitHTTPAllowList())
    1139+        return false;
    


    vasild commented at 12:45 pm on June 5, 2025:

    nit:

    0    if (!InitHTTPAllowList()) {
    1        return false;
    2    }
    

    pinheadmz commented at 4:43 pm on June 18, 2025:
    👍
  88. in src/httpserver.cpp:909 in 7d30118401 outdated
    1148+    g_http_server->m_rpcservertimeout = std::chrono::seconds(gArgs.GetIntArg("-rpcservertimeout", DEFAULT_HTTP_SERVER_TIMEOUT));
    1149+
    1150+    // Bind HTTP server to specified addresses
    1151+    std::vector<std::pair<std::string, uint16_t>> endpoints{GetBindAddresses()};
    1152+    bool bind_success{false};
    1153+    for (std::vector<std::pair<std::string, uint16_t> >::iterator i = endpoints.begin(); i != endpoints.end(); ++i) {
    


    vasild commented at 12:48 pm on June 5, 2025:
    0    for (std::vector<std::pair<std::string, uint16_t>>::iterator i = endpoints.begin(); i != endpoints.end(); ++i) {
    

    pinheadmz commented at 4:43 pm on June 18, 2025:
    👍
  89. in src/httpserver.cpp:914 in 7d30118401 outdated
    1153+    for (std::vector<std::pair<std::string, uint16_t> >::iterator i = endpoints.begin(); i != endpoints.end(); ++i) {
    1154+        LogPrintf("Binding RPC on address %s port %i\n", i->first, i->second);
    1155+        const std::optional<CService> addr{Lookup(i->first, i->second, false)};
    1156+        if (addr) {
    1157+            if (addr->IsBindAny()) {
    1158+                LogPrintf("WARNING: the RPC server is not safe to expose to untrusted networks such as the public internet\n");
    


    vasild commented at 12:49 pm on June 5, 2025:

    LogPrintf() is using the Info severity.

    0                LogWarning("The RPC server is not safe to expose to untrusted networks such as the public internet\n");
    

    pinheadmz commented at 4:44 pm on June 18, 2025:
    good catch thanks
  90. in src/httpserver.cpp:984 in 7d30118401 outdated
    1241+    if (g_http_server) {
    1242+        // Disconnect clients as their remaining responses are flushed
    1243+        g_http_server->m_disconnect_all_clients = true;
    1244+        // Wait for all disconnections
    1245+        while (!g_http_server->m_no_clients) {
    1246+            std::this_thread::sleep_for(std::chrono::milliseconds{50});
    


    vasild commented at 1:07 pm on June 5, 2025:

    nit:

    0            std::this_thread::sleep_for(50ms);
    

    pinheadmz commented at 4:49 pm on June 18, 2025:
    :+1:
  91. in src/httpserver.cpp:995 in 7d30118401 outdated
    1246+            std::this_thread::sleep_for(std::chrono::milliseconds{50});
    1247+        }
    1248+        // Break sockman I/O loop: stop accepting connections, sending and receiving data
    1249+        g_http_server->interruptNet();
    1250+        // Wait for sockman threads to exit
    1251+        g_http_server->JoinSocketsThreads();
    


    vasild commented at 1:09 pm on June 5, 2025:
    Could a new client sneak in after the loop while (!g_http_server->m_no_clients) exits? Shouldn’t that loop be after interruptNet() so that new clients cannot be accepted?

    pinheadmz commented at 5:00 pm on June 18, 2025:

    Hm that’s a good catch. I need the loop running to disconnect clients though – although I might be able to call HTTPServer::DisconnectClients() one extra time after interruptNet() to clean up edge cases like that.

    Otherwise I’m not sure if it’s even really a problem. We reject all requests in InterruptHTTPServer() which is called first, so the only risk on the server side would be if a lingering connection means JoinSocketsThreads() blocks forever.

  92. vasild commented at 2:02 pm on June 5, 2025: contributor
    Posting review midway. Reviewed up to and including b90f808e30 http: disconnect after idle timeout (-rpcservertimeout)
  93. DrahtBot requested review from vasild on Jun 5, 2025
  94. in src/httprpc.cpp:47 in 7d30118401 outdated
    48     {
    49-        struct timeval tv;
    50-        tv.tv_sec = millis/1000;
    51-        tv.tv_usec = (millis%1000)*1000;
    52-        ev.trigger(&tv);
    53+        context->scheduler->scheduleFromNow(func, std::chrono::milliseconds(millis));
    


    vasild commented at 4:31 am on June 6, 2025:

    Might use std::chrono::milliseconds for the type of the argument instead of int64_t:

    0    HTTPRPCTimer(NodeContext* context, std::function<void()>& func, std::chrono::milliseconds after)
    1    {
    2        context->scheduler->scheduleFromNow(func, after);
    

    (in commit 2950597694 use CScheduler for HTTPRPCTimer)


    pinheadmz commented at 5:07 pm on June 18, 2025:
    Can I though? Parent class RPCTimerInterface uses int64_t millis and so does our sibling class QtRPCTimerBase because of its QTimer.
  95. in src/httprpc.cpp:66 in 7d30118401 outdated
    59@@ -62,10 +60,10 @@ class HTTPRPCTimerInterface : public RPCTimerInterface
    60     }
    61     RPCTimerBase* NewTimer(std::function<void()>& func, int64_t millis) override
    62     {
    63-        return new HTTPRPCTimer(base, func, millis);
    64+        return new HTTPRPCTimer(m_context, func, millis);
    65     }
    66 private:
    67-    struct event_base* base;
    68+    NodeContext* m_context;
    


    vasild commented at 4:35 am on June 6, 2025:
    Because the context will always be set, ie will never be nullptr, then it is better to use a reference here instead of a pointer: NodeContext& m_context;

    pinheadmz commented at 5:16 pm on June 18, 2025:

    This makes sense to me but the problem is we are only ever given a pointer to a NodeContext (line 718):

    https://github.com/bitcoin/bitcoin/blob/5e6dbfd14ea9eace1c7e5ee76b140be46a0ec855/src/init.cpp#L710-L723

    It was explicitly changed from a reference to a pointer in #21574

    I tried passing node here and then jsut extracting the &address for HTTPReq_JSONRPC() but that resulted in JSONRPCException: Node context not found (-32603).

    Going to put this daemon back in the bottle for now, but can re-examine in a future rebase or follow-up

  96. in src/httprpc.cpp:133 in 7d30118401 outdated
    129@@ -132,7 +130,7 @@ static bool multiUserAuthorized(std::string strUserPass)
    130     return false;
    131 }
    132 
    133-static bool RPCAuthorized(const std::string& strAuth, std::string& strAuthUsernameOut)
    134+static bool RPCAuthorized(const std::string_view& strAuth, std::string& strAuthUsernameOut)
    


    vasild commented at 4:48 am on June 6, 2025:

    Can simplify the code a few lines below:

    0- std::string_view strUserPass64 = TrimStringView(std::string_view{strAuth}.substr(6));
    1+ std::string_view strUserPass64 = TrimStringView(strAuth.substr(6));
    

    (in commit 54a36093f7 http: using string_view to avoid unnecessary copy in Headers)


    pinheadmz commented at 4:12 pm on June 19, 2025:
    nice catch thanks
  97. in src/httpserver.h:74 in 7d30118401 outdated
    100+    void Remove(const std::string key);
    101+    bool Read(util::LineReader& reader);
    102+    std::string Stringify() const;
    103+
    104+private:
    105+    std::map<std::string, std::string, util::CaseInsensitiveComparator> m_map;
    


    vasild commented at 7:02 am on June 6, 2025:

    std::map would keep the entries sorted by the key in alphabetical order and lookup time is O(log(number of elements in the map)) whereas std::unordered_map does not maintain an order and has a lookup time of O(1). We do not need any order here, right? This probably is irrelevant since we are only going to put a bunch of entries in this map, but anyway, if there is no reason to use std::map I guess we should default to std::unordered_map.

    (from commit e95c6f5b65 http: Implement HTTPHeaders class)


    pinheadmz commented at 5:16 pm on June 19, 2025:
    This is great feedback thanks. I made this change by first editing the "string: add CaseInsensitiveComparator" commit which is now "string: add CaseInsensitive{KeyEqual, Hash} for unordered map" and implements the two operators needed for the unordered map.
  98. in src/httpserver.cpp:318 in 7d30118401 outdated
    333 {
    334-    if (!InitHTTPAllowList())
    335-        return false;
    336+    const auto it = m_map.find(key);
    337+    if (it == m_map.end()) return std::nullopt;
    338+    return std::string_view(it->second);
    


    vasild commented at 7:10 am on June 6, 2025:

    This returns a reference (or a “pointer”) to the outside world (from the point of view of the HTTPHeaders class) of an element from the private m_map member. That map is not immutable. What happens if the strings it contains get moved around if the map is internally resized when adding new entries? Also there is the HTTPHeaders::Remove() method which deletes entries from the map, rendering any returned pointers by Find() as dangling.

    54a36093f7 http: using string_view to avoid unnecessary copy in Headers – if there is no any measurable performance gain from this commit then it would be safer to drop it. If there is a measurable performance gain then hmm…


    pinheadmz commented at 6:30 pm on June 19, 2025:
    Ah that is another great catch thank you. Glad I wrote that as an extra commit! I’ll pop it off, it was added in response to this review comment: #32061 (review)
  99. vasild commented at 7:17 am on June 6, 2025: contributor
    Approach ACK 7d301184016a3f59c2e363dff631263cdbe21da0
  100. achow101 referenced this in commit d978a43d05 on Jun 9, 2025
  101. SockMan: introduce class and implement binding to listening socket
    Introduce a new low-level socket managing class `SockMan`.
    Unit-test it with a new class `SocketTestingSetup` which mocks
    `CreateSock()` and will enable mock client I/O in future commits.
    
    `SockMan` and `SocketTestingSetup` are designed to be generic and
    reusbale for higher-level network protocol implementation and testing.
    
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    9299d5dbc5
  102. style: modernize the style of SockMan::BindListenPort()
    It was copied verbatim from `CConnman::BindListenPort()` in the previous
    commit. Modernize its variables and style and log the error messages
    from the caller. Also categorize the informative messages to the "net"
    category because they are quite specific to the networking layer.
    
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    de633698a1
  103. SockMan: implement and test AcceptConnection()
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    f6af0c8cd1
  104. style: modernize the style of SockMan::AcceptConnection() cca484c7ce
  105. SockMan: generate sequential Ids for each newly accepted connection
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    1deaa087cd
  106. SockMan: start an I/O loop in a new thread and accept connections
    Socket handling methods are copied from CConnMan:
    
    `CConnman::GenerateWaitSockets()` goes to
    `SockMan::GenerateWaitSockets()`.
    
    `CConnman::ThreadSocketHandler()` and
    `CConnman::SocketHandler()` are combined into
    `SockMan::ThreadSocketHandler()`.
    
    `CConnman::SocketHandlerListening()` goes to
    `SockMan::SocketHandlerListening()`.
    
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    2e6cdb4804
  107. SockMan: handle connected sockets: read data from socket
    `CConnman::SocketHandlerConnected()` copied to
    `SockMan::SocketHandlerConnected()`.
    
    Testing this requires adding a new feature to the SocketTestingSetup,
    inserting a "request" payload into the mock client that conencts
    to us.
    
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    674a5ff8f1
  108. pinheadmz force-pushed on Jun 21, 2025
  109. DrahtBot removed the label Needs rebase on Jun 21, 2025
  110. DrahtBot added the label CI failed on Jun 21, 2025
  111. DrahtBot commented at 2:50 am on June 21, 2025: contributor

    🚧 At least one of the CI tasks failed. Task fuzzer,address,undefined,integer, no depends: https://github.com/bitcoin/bitcoin/runs/44516172178 LLM reason (✨ experimental): The CI failure is caused by a runtime error due to an implicit integer sign change detected by the sanitizer during fuzzing.

    Try to run the tests locally, according to the documentation. However, a CI failure may still happen due to a number of reasons, for example:

    • Possibly due to a silent merge conflict (the changes in this pull request being incompatible with the current code in the target branch). If so, make sure to rebase on the latest commit of the target branch.

    • A sanitizer issue, which can only be found by compiling with the sanitizer and running the affected test.

    • An intermittent issue.

    Leave a comment here, if you need help tracking down a confusing failure.

  112. pinheadmz force-pushed on Jun 21, 2025
  113. SockMan: handle connected sockets: write data to socket
    Sockets-touching bits from `CConnman::SocketSendData()` copied to
    `SockMan::SendBytes()`.
    
    Testing this requires adding a new feature to the SocketTestingSetup,
    returning the DynSock I/O pipes from the mock socket so the recevied
    data can be checked.
    
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    7a9fc9bd4a
  114. SockMan: dispatch cyclical events from I/O loop
    Copy from some parts of `CConnman::SocketHandlerConnected()` and
    `CConnman::ThreadSocketHandler()` to:
    `EventIOLoopCompletedForOne(id)` and `EventIOLoopCompletedForAll()`.
    
    Co-authored-by: Vasil Dimov <vd@FreeBSD.org>
    130708803d
  115. string: add `base` argument for ToIntegral to operate on hexadecimal adbc7a20e9
  116. string: add CaseInsensitive{KeyEqual, Hash} for unordered map
    https://httpwg.org/specs/rfc9110.html#rfc.section.5.1
    Field names in HTTP headers are case-insensitive. These
    structs will be used in the headers map to search by key.
    In libevent these are compared in lowercase:
      evhttp_find_header()
      evutil_ascii_strcasecmp()
      EVUTIL_TOLOWER_()
    cca0673af9
  117. time: implement and test RFC7231 timestamp string
    HTTP 1.1 responses require a timestamp header with a
    specific format, specified in:
    https://www.rfc-editor.org/rfc/rfc7231#section-7.1.1.1
    383df3aaf6
  118. string: add LineReader
    This is a helper struct to parse HTTP messages from data in buffers
    from sockets. HTTP messages begin with headers which are
    CRLF-terminated lines (\n or \r\n) followed by an arbitrary amount of
    body data. Whitespace is trimmed from the field lines but not the body.
    
    https://httpwg.org/specs/rfc9110.html#rfc.section.5
    a281625717
  119. http: enclose libevent-dependent code in a namespace
    This commit is a no-op to isolate HTTP methods and objects that
    depend on libevent. Following commits will add replacement objects
    and methods in a new namespace for testing and review before
    switching over the server.
    1e7f98f553
  120. http: Implement HTTPHeaders class
    see:
    https://www.rfc-editor.org/rfc/rfc2616#section-4.2
    https://www.rfc-editor.org/rfc/rfc7231#section-5
    https://www.rfc-editor.org/rfc/rfc7231#section-7
    https://httpwg.org/specs/rfc9111.html#header.field.definitions
    a6664c97e7
  121. http: Implement HTTPResponse class
    HTTP Response message:
    https://datatracker.ietf.org/doc/html/rfc1945#section-6
    
    Status line (first line of response):
    https://datatracker.ietf.org/doc/html/rfc1945#section-6.1
    
    Status code definitions:
    https://datatracker.ietf.org/doc/html/rfc1945#section-9
    cad86314bb
  122. http: Implement HTTPRequest class
    HTTP Request message:
    https://datatracker.ietf.org/doc/html/rfc1945#section-5
    
    Request Line aka Control Line aka first line:
    https://datatracker.ietf.org/doc/html/rfc1945#section-5.1
    
    See message_read_status() in libevent http.c for how
    `MORE_DATA_EXPECTED` is handled there
    81ba60c27a
  123. http: Begin implementation of HTTPClient and HTTPServer 06cacfafbe
  124. http: read requests from connected clients f2e0501837
  125. http: support "chunked" Transfer-Encoding 197a80a494
  126. http: compose and send replies to connected clients eaf0e39503
  127. http: disconnect clients badf3c38b3
  128. Allow http workers to send data optimistically as an optimization a93664b0a0
  129. http: use a queue to pipeline requests from each connected client
    See https://www.rfc-editor.org/rfc/rfc7230#section-6.3.2
    
    > A server MAY process a sequence of pipelined requests in
      parallel if they all have safe methods (Section 4.2.1 of [RFC7231]),
      but it MUST send the corresponding responses in the same order that
      the requests were received.
    
    We choose NOT to process requests in parallel. They are executed in
    the order recevied as well as responded to in the order received.
    This prevents race conditions where old state may get sent in response
    to requests that are very quick to process but were requested later on
    in the queue.
    e0190da19f
  130. define HTTP request methods at module level outside of class
    This is a refactor to prepare for matching the API of HTTPRequest
    definitions in both namespaces http_bitcoin and http_libevent. In
    particular, to provide a consistent return type for GetRequestMethod()
    in both classes.
    dac14d51f1
  131. Add helper methods to HTTPRequest to match original API
    These methods are called by http_request_cb() and are present in the
    original http_libevent::HTTPRequest.
    1d3eca0218
  132. refactor: split http_request_cb into libevent callback and dispatch
    The original function is passed to libevent as a callback when HTTP
    requests are received and processed. It wrapped the libevent request
    object in a http_libevent::HTTPRequest and then handed that off to
    bitcoin for basic checks and finally dispatch to worker threads.
    
    In this commit we split the function after the
    http_libevent::HTTPRequest is created, and pass that object to a new
    function that maintains the logic of checking and dispatching.
    
    This will be the merge point for http_libevent and http_bitcoin,
    where HTTPRequest objects from either namespace have the same
    downstream lifecycle.
    3183038711
  133. refactor: split HTTPBindAddresses into config parse and libevent setup
    The original function was already naturally split into two chunks:
    First, we parse and validate the users' RPC configuration for IPs and
    ports. Next we bind libevent's http server to the appropriate
    endpoints.
    
    This commit splits these chunks into two separate functions, leaving
    the argument parsing in the common space of the module and moving the
    libevent-specific binding into the http_libevent namespace.
    
    A future commit will implement http_bitcoin::HTTPBindAddresses to
    bind the validate list of endpoints by the new HTTP server.
    0e551ce797
  134. http: implement new server control methods to match legacy API b07848850d
  135. http: disconnect after idle timeout (-rpcservertimeout) cf894ea810
  136. pinheadmz force-pushed on Jun 22, 2025
  137. DrahtBot removed the label CI failed on Jun 22, 2025
  138. use CScheduler for HTTPRPCTimer
    This removes the dependency on libevent for scheduled events,
    like re-locking a wallet some time after decryption.
    6d3d593161
  139. http: switch servers from libevent to bitcoin 8f502246c6
  140. fuzz: switch http_libevent::HTTPRequest to http_bitcoin::HTTPRequest 26bd32ce20
  141. httpserver: delete libevent! e531a7cd2c
  142. pinheadmz force-pushed on Jun 22, 2025
  143. pinheadmz commented at 11:20 pm on June 22, 2025: member

    Rebased to address comments by @vasild.

    Also switched from #30988 (“Split CConnman”) to #32747 (“SockMan lite”) for the back end.

    That also included a rebase on master where #32408 was merged, cherry picking some commits from this PR.


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2025-06-27 12:13 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me