http: bugfix: allow server shutdown in case of remote client disconnection #28551

pull stickies-v wants to merge 3 commits into bitcoin:master from stickies-v:2023-09/http-use-conn-counter changing 1 files +66 −17
  1. stickies-v commented at 4:15 pm on September 29, 2023: contributor

    #26742 significantly increased the http server shutdown speed, but also introduced a bug (#27722 - see #27722 (comment) for steps to reproduce on master) that causes http server shutdown to halt in case of a remote client disconnection. This happens because evhttp_request_set_on_complete_cb is never called and thus the request is never removed from g_requests.

    This PR fixes that bug, and improves robustness of the code by encapsulating the request tracking logic. Earlier approaches (#27909, #27245, #19434) attempted to resolve this but imo are fundamentally unsafe because of differences in lifetime between an evhttp_request and evhttp_connection.

    We don’t need to keep track of open requests or connections, we just need to ensure that there are no active requests on server shutdown. Because a connection can have multiple requests, and a request can be completed in various ways (the request actually being handled, or the client performing a remote disconnect), keeping a counter per connection seems like the approach with the least overhead to me.

    Fixes #27722

  2. DrahtBot commented at 4:16 pm on September 29, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    ACK vasild, theStack
    Concept ACK josibake, erikarvstedt
    Approach ACK willcl-ark

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28051 (Get rid of shutdown.cpp/shutdown.h, use SignalInterrupt directly by ryanofsky)
    • #27909 (http: add evhttp_connection_set_closecb to avoid g_requests hang by Crypt-iQ)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. DrahtBot added the label RPC/REST/ZMQ on Sep 29, 2023
  4. fanquake added this to the milestone 26.0 on Sep 29, 2023
  5. stickies-v force-pushed on Sep 29, 2023
  6. DrahtBot added the label CI failed on Sep 29, 2023
  7. stickies-v force-pushed on Sep 29, 2023
  8. DrahtBot removed the label CI failed on Sep 29, 2023
  9. stickies-v commented at 6:19 pm on September 29, 2023: contributor
    cc @vasild - to follow up on our conversation last week
  10. theStack commented at 12:23 pm on October 1, 2023: contributor

    Concept ACK

    Haven’t looked at the code changes yet, but can confirm that with this branch the problem as described in #27722 (comment) can’t be reproduced anymore on my machine.

  11. fanquake requested review from willcl-ark on Oct 2, 2023
  12. fanquake commented at 9:36 am on October 2, 2023: member
  13. fanquake added the label Needs backport (25.x) on Oct 2, 2023
  14. in src/httpserver.cpp:161 in e3cd46c66b outdated
    156+ *
    157+ */
    158+class HTTPRequestTracker
    159+{
    160+private:
    161+    mutable GlobalMutex m_mutex;
    


    vasild commented at 12:39 pm on October 2, 2023:

    My understanding is that GlobalMutex should only be used for mutexes that are defined as global variables.

    0    mutable Mutex m_mutex;
    

    stickies-v commented at 10:32 am on October 3, 2023:
    Ah right, didn’t think about that. Thanks, fixed.
  15. in src/httpserver.cpp:170 in e3cd46c66b outdated
    168+        auto it{m_tracker.find(Assert(conn))};
    169+        if (it != m_tracker.end()) {
    170+            m_tracker.erase(it);
    171+            if (m_tracker.empty()) m_cv.notify_all();
    172+        }
    173+    }
    


    vasild commented at 12:48 pm on October 2, 2023:

    If nullptr is not allowed, then might as well pass const reference. Also, can erase it right away, without lookup first.

    0    void RemoveConnectionInternal(const evhttp_connection& conn) EXCLUSIVE_LOCKS_REQUIRED(m_mutex)
    1    { 
    2        m_tracker.erase(&conn);
    3        if (m_tracker.empty()) { 
    4            m_cv.notify_all();
    5        } 
    6    } 
    

    stickies-v commented at 10:35 am on October 3, 2023:
    Outdated since incorporating this suggestion to use an iterator.
  16. in src/httpserver.cpp:181 in e3cd46c66b outdated
    181+        if (it == m_tracker.end()) {
    182+            // new connection, initialize counter to 1
    183+            m_tracker[conn] = 1;
    184+        } else {
    185+            ++(it->second);
    186+        }
    


    vasild commented at 12:51 pm on October 2, 2023:

    If an entry is not in the map, then operator[] would insert it with a default value (0 in this case). So the following is equivalent but shorter:

    0        ++m_tracker[conn];
    

    stickies-v commented at 10:47 am on October 3, 2023:
    Beautiful, wasn’t aware of that operator[] behaviour. Thanks, adopted (putting it inside WITH_LOCK)
  17. in src/httpserver.cpp:189 in e3cd46c66b outdated
    189+    void RemoveRequest(evhttp_request* req)
    190+    {
    191+        const evhttp_connection* conn{Assert(evhttp_request_get_connection(Assert(req)))};
    192+        LOCK(m_mutex);
    193+        auto it{m_tracker.find(conn)};
    194+        if(it != m_tracker.end() && it->second > 0) {
    


    vasild commented at 12:52 pm on October 2, 2023:

    whitespace:

    0        if (it != m_tracker.end() && it->second > 0) {
    

    stickies-v commented at 10:33 am on October 3, 2023:
    Thanks, fixed (+ in line below)
  18. josibake commented at 4:59 pm on October 2, 2023: member
    Concept ACK
  19. erikarvstedt commented at 6:06 pm on October 2, 2023: contributor

    Concept ACK

    The nix-bitcoin test suite is now succeeding with this PR.

    It seems there’s no simpler strategy for implementing this:

    • libevent doesn’t allow attaching user data to connections
    • @vasild’s suggestion that avoids adding a connection tracking map doesn’t work for multiple requests with a single connection (because only the last registered callback gets called).

    Here’s a small fixup that avoids a double map lookup in RemoveRequest.

  20. stickies-v force-pushed on Oct 2, 2023
  21. stickies-v commented at 8:32 pm on October 2, 2023: contributor

    Here’s a small fixup that avoids a double map lookup in RemoveRequest.

    Excellent suggestion, thank you. Force pushed to incorporate this - otherwise no changes.

  22. erikarvstedt commented at 9:47 pm on October 2, 2023: contributor
  23. in src/httpserver.cpp:531 in 2a0f5bba5c outdated
    526@@ -535,7 +527,7 @@ void StopHTTPServer()
    527     boundSockets.clear();
    528     {
    529         if (g_requests.CountActiveConnections() != 0) {
    530-            LogPrint(BCLog::HTTP, "Waiting for %d requests to stop HTTP server\n", g_requests.CountActiveRequests());
    531+            LogPrint(BCLog::HTTP, "Waiting for %d connections to stop HTTP server\n", g_requests.CountActiveConnections());
    


    theStack commented at 10:03 pm on October 2, 2023:

    to avoid grabbing the mutex twice in a row in the request tracker, could call .CountActiveConnections only once:

    0        if (auto connections = g_requests.CountActiveConnections(); connections != 0) {
    1            LogPrint(BCLog::HTTP, "Waiting for %d connections to stop HTTP server\n", connections);
    2        }
    

    vasild commented at 8:14 am on October 3, 2023:
    Calling CountActiveConnections() just once will also avoid a weird message like “Waiting for 0 connections to stop HTTP server” which can happen if the function is called two times and the first call returns >0 and the second call returns 0 (the connection was closed in the meantime, after the first and before the second call).

    stickies-v commented at 10:37 am on October 3, 2023:
    Makes sense - considered it but didn’t think about CountActiveConnections() acquiring a lock. Definitely an improvement, so incorporated this suggestion, thanks.
  24. stickies-v force-pushed on Oct 2, 2023
  25. stickies-v commented at 10:13 pm on October 2, 2023: contributor

    fixup! AddRequest: simplify, avoid double map lookup

    Makes sense, thank you. Force pushed to incorporate this - otherwise no changes.

  26. theStack commented at 10:34 pm on October 2, 2023: contributor

    Light ACK a86f3f46b786d1dd60ff55b987df784107008d11

    Code looks good and the bugfix approach seems solid (I’m not too familiar with libevent internals though).

    Is there a simple way to trigger the “multiple-requests-per-connection” case (maybe with our functional test framework’s AuthServiceProxy)? I’ve only tested the typical “one request per connection” scenario via bitcoin-cli so far.

  27. vasild commented at 4:43 am on October 3, 2023: contributor
    Approach ACK e3cd46c66b2d6ca866583d5ff406c62533e74d99
  28. willcl-ark commented at 10:23 am on October 3, 2023: member

    Approach ACK

    I started with a light code review but decided to make my own minimal diff (without the connection tracking) and ended up reimplementing #27909. I’ve now actually read all the context, including why the approach from 27909 is less desirable, and agree with the approach here.

    Incidentally, I still cannot reproduce the original issue on my system with libevent version 2.1.12-stable-8ubuntu3. I tried with a remote bitcoin-cli to try and rule out some local networking-fu, but master and 25.0 still seem happy to shutdown for me no matter if there are remote-closed connections outstanding… FWIW This does mean that, without a test included, I can’t actually “test” that this fix does something new.

    However I can confirm that the changes here also work for me:

     02023-10-03T09:35:35Z [http] Received a POST request for / from 127.0.0.1:51856
     12023-10-03T09:35:38Z [http] Received a POST request for / from 127.0.0.1:60486
     22023-10-03T09:35:38Z [http] Interrupting HTTP server
     32023-10-03T09:35:38Z [http] Unregistering HTTP handler for / (exactmatch 1)
     42023-10-03T09:35:38Z [http] Unregistering HTTP handler for /wallet/ (exactmatch 0)
     52023-10-03T09:35:38Z [http] Stopping HTTP server
     62023-10-03T09:35:38Z [http] Waiting for HTTP worker threads to exit
     72023-10-03T09:35:38Z [http] Waiting for 3 connections to stop HTTP server
     82023-10-03T09:35:38Z [http] Waiting for HTTP event thread to exit
     92023-10-03T09:35:38Z [http] Exited http event loop
    102023-10-03T09:35:38Z [http] Stopped HTTP server
    

    I’d be happy to ACK this if the suggestion from @theStack was addressed.

  29. stickies-v force-pushed on Oct 3, 2023
  30. stickies-v commented at 11:21 am on October 3, 2023: contributor

    Force pushed to incorporate all outstanding review comments, thank you @theStack @vasild @willcl-ark for your review.

    Is there a simple way to trigger the “multiple-requests-per-connection” case (maybe with our functional test framework’s AuthServiceProxy)? I’ve only tested the typical “one request per connection” scenario via bitcoin-cli so far.

    I’ll work on that, don’t think it can be done with bitcoin-cli. I also would like to add some unit testing on HTTPRequestTracker, but from earlier work I’ve found that mocking evhttp_request objects proved to be a huge pain to get the memory sanitizers to pass, so I’m not sure I’ll be able to get that working for this PR either (welcoming contributions there).

  31. in src/httpserver.cpp:173 in aab70f4485 outdated
    168+        m_tracker.erase(it);
    169+        if (m_tracker.empty()) m_cv.notify_all();
    170+    }
    171+public:
    172+    //! Increase request counter for the associated connection by 1
    173+    void AddRequest(evhttp_request* req)
    


    vasild commented at 11:51 am on October 3, 2023:

    You need this:

     0diff --git i/src/httpserver.cpp w/src/httpserver.cpp
     1index d9203f8715..069511563c 100644
     2--- i/src/httpserver.cpp
     3+++ w/src/httpserver.cpp
     4@@ -167,37 +167,40 @@ private:
     5     {
     6         m_tracker.erase(it);
     7         if (m_tracker.empty()) m_cv.notify_all();
     8     }
     9 public:
    10     //! Increase request counter for the associated connection by 1
    11-    void AddRequest(evhttp_request* req)
    12+    void AddRequest(evhttp_request* req) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
    13     {
    14         const evhttp_connection* conn{Assert(evhttp_request_get_connection(Assert(req)))};
    15         WITH_LOCK(m_mutex, ++m_tracker[conn]);
    16     }
    17     //! Decrease request counter for the associated connection by 1, remove connection if counter is 0
    18-    void RemoveRequest(evhttp_request* req)
    19+    void RemoveRequest(evhttp_request* req) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
    20     {
    21         const evhttp_connection* conn{Assert(evhttp_request_get_connection(Assert(req)))};
    22         LOCK(m_mutex);
    23         auto it{m_tracker.find(conn)};
    24         if (it != m_tracker.end() && it->second > 0) {
    25             if (--(it->second) == 0) RemoveConnectionInternal(it);
    26         }
    27     }
    28     //! Remove a connection entirely
    29-    void RemoveConnection(const evhttp_connection* conn)
    30+    void RemoveConnection(const evhttp_connection* conn) EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
    31     {
    32         LOCK(m_mutex);
    33         auto it{m_tracker.find(Assert(conn))};
    34         if (it != m_tracker.end()) RemoveConnectionInternal(it);
    35     }
    36-    size_t CountActiveConnections() const { return WITH_LOCK(m_mutex, return m_tracker.size()); }
    37+    size_t CountActiveConnections() const EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
    38+    {
    39+        return WITH_LOCK(m_mutex, return m_tracker.size());
    40+    }
    41     //! Wait until there are no more connections with active requests in the tracker
    42-    void WaitUntilEmpty() const
    43+    void WaitUntilEmpty() const EXCLUSIVE_LOCKS_REQUIRED(!m_mutex)
    44     {
    45         WAIT_LOCK(m_mutex, lock);
    46         m_cv.wait(lock, [this]() EXCLUSIVE_LOCKS_REQUIRED(m_mutex) { return m_tracker.empty(); });
    47     }
    48 };
    49 //! Track active requests
    

    Mutex has stronger requirements/guarantees. Or, in other words, GlobalMutex is a Mutex with silenced warnings because of whatever reasons.


    stickies-v commented at 12:38 pm on October 3, 2023:
    Thank you, fixed!
  32. vasild commented at 11:53 am on October 3, 2023: contributor
    aab70f4485dab0653f3df940c29e97f8c88b0bd6 LGTM, just missing thread safety annotations:
  33. vasild commented at 11:59 am on October 3, 2023: contributor

    Is there a simple way to trigger the “multiple-requests-per-connection” case

    I did this:

    1. Prepare /tmp/reqests:
     0POST / HTTP/1.1
     1Host: 127.0.0.1
     2Content-Type: application/json
     3Authorization: Basic X91hj28...
     4Content-Length: 37
     5
     6{"method":"help","params":[],"id":1}
     7POST / HTTP/1.1
     8Host: 127.0.0.1
     9Content-Type: application/json
    10Authorization: Basic X91hj28...
    11Content-Length: 58
    12
    13{"method":"waitforblockheight","params":[1000000],"id":1}
    

    where X91hj28... is stolen beforehand using tcpdump from a normal bitcoin-cli help session.

    1. Then run nc 127.0.0.1 8332 < /tmp/requests.

    This is doing two requests one after another and the second one hangs. I don’t think it is possible to do two concurrent requests in one connection (at least not in HTTP 1.x).

  34. http: refactor: use encapsulated HTTPRequestTracker
    Introduces and uses a HTTPRequestTracker class to keep track of
    how many HTTP requests are currently active, so we don't stop the
    server before they're all handled.
    
    This has two purposes:
    1. In a next commit, allows us to untrack all requests associated
    with a connection without running into lifetime issues of the
    connection living longer than the request
    (see https://github.com/bitcoin/bitcoin/pull/27909#discussion_r1265614783)
    
    2. Improve encapsulation by making the mutex and cv internal members,
    and exposing just the WaitUntilEmpty() method that can be safely
    used.
    41f9027813
  35. http: log connection instead of request count
    There is no significant benefit in logging the request count instead
    of the connection count. Reduces amount of code and computational
    complexity.
    084d037231
  36. http: bugfix: track closed connection
    It is possible that the client disconnects before the request is
    handled. In those cases, evhttp_request_set_on_complete_cb is never
    called, which means that on shutdown the server we'll keep waiting
    endlessly.
    
    By adding evhttp_connection_set_closecb, libevent automatically
    cleans up those dead connections at latest when we shutdown, and
    depending on the libevent version already at the moment of remote
    client disconnect. In both cases, the bug is fixed.
    68f23f57d7
  37. stickies-v force-pushed on Oct 3, 2023
  38. stickies-v commented at 12:38 pm on October 3, 2023: contributor
    Force pushed to address thread safety annotations comment, thank you for the quick re-review!
  39. vasild approved
  40. vasild commented at 12:39 pm on October 3, 2023: contributor
    ACK 68f23f57d77bc172ed39ecdd4d2d5cd5e13cf483
  41. DrahtBot requested review from theStack on Oct 3, 2023
  42. in src/httpserver.cpp:156 in 41f9027813 outdated
    150@@ -149,10 +151,68 @@ static GlobalMutex g_httppathhandlers_mutex;
    151 static std::vector<HTTPPathHandler> pathHandlers GUARDED_BY(g_httppathhandlers_mutex);
    152 //! Bound listening sockets
    153 static std::vector<evhttp_bound_socket *> boundSockets;
    154+
    155+/**
    156+ * @brief Helps keep track of open `evhttp_connection`s with active `evhttp_requests`
    


    theStack commented at 12:52 pm on October 3, 2023:

    tiny nit (if you have to retouch):

    0 * [@brief](/bitcoin-bitcoin/contributor/brief/) Helps keep track of open `evhttp_connection`s with active `evhttp_request`s
    
  43. theStack approved
  44. theStack commented at 12:53 pm on October 3, 2023: contributor
    ACK 68f23f57d77bc172ed39ecdd4d2d5cd5e13cf483
  45. stickies-v commented at 3:28 pm on October 3, 2023: contributor

    I don’t think it is possible to do two concurrent requests in one connection (at least not in HTTP 1.x).

    HTTP/1.1 supports HTTP pipelining, but libevent doesn’t seem to support that. Pipelining is superseded by multiplexing in HTTP/2.0, which the libevent http server hasn’t implemented (but seems possible to add on our side, not that I think we should at this point).

    So, in practice it seems like our http server will currently never have > 1 active requests per connection, but I think we still need to keep the current accounting system (i.e. a counter per connection) because:

    • I find it difficult to prove that > 1 request per connection can never happen
    • it makes the existing code robust to changes in future libevent versions that do allow e.g. pipelining or multiplexing
  46. fanquake merged this on Oct 4, 2023
  47. fanquake closed this on Oct 4, 2023

  48. fanquake removed the label Needs backport (25.x) on Oct 4, 2023
  49. fanquake commented at 9:14 am on October 4, 2023: member
    Added to #28487 for backporting to 25.x.
  50. fanquake referenced this in commit ae86adabe4 on Oct 4, 2023
  51. fanquake referenced this in commit 752a456fa8 on Oct 4, 2023
  52. fanquake referenced this in commit 45a5fcb165 on Oct 4, 2023
  53. willcl-ark commented at 9:27 am on October 4, 2023: member
    post-merge ACK based on git range-diff a86f3f46b786d1dd60ff55b987df784107008d11...68f23f57d77bc172ed39ecdd4d2d5cd5e13cf483
  54. stickies-v deleted the branch on Oct 4, 2023
  55. fanquake referenced this in commit 9f8d501cb8 on Oct 4, 2023
  56. Frank-GER referenced this in commit b0fea09705 on Oct 13, 2023
  57. bitcoin locked this on Oct 3, 2024

github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-11-21 18:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me