httpserver, rest: improving URI validation #27253

pull pablomartin4btc wants to merge 2 commits into bitcoin:master from pablomartin4btc:http-rest-fix-segfault changing 5 files +122 −99
  1. pablomartin4btc commented at 4:22 am on March 14, 2023: member
    This PR contains an isolated enhancement of the segfault bugfix #27468 (already merged), improving the way we handle the URI validation, which will be performed only once instead of on each query parameter of a rest service endpoint.
  2. DrahtBot commented at 4:22 am on March 14, 2023: contributor

    The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

    Reviews

    See the guideline for information on the review process.

    Type Reviewers
    Concept ACK brunoerg
    Stale ACK stickies-v, vasild

    If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

    Conflicts

    Reviewers, this pull request conflicts with the following ones:

    • #28051 (Get rid of shutdown.cpp/shutdown.h, use SignalInterrupt directly by ryanofsky)

    If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

  3. pablomartin4btc force-pushed on Mar 14, 2023
  4. pablomartin4btc force-pushed on Mar 14, 2023
  5. pablomartin4btc force-pushed on Mar 14, 2023
  6. pablomartin4btc force-pushed on Mar 14, 2023
  7. pablomartin4btc force-pushed on Mar 14, 2023
  8. pablomartin4btc force-pushed on Mar 14, 2023
  9. fanquake added the label RPC/REST/ZMQ on Mar 14, 2023
  10. fanquake requested review from stickies-v on Mar 14, 2023
  11. stickies-v commented at 11:19 am on March 14, 2023: contributor

    Concept ACK, nice catch.

    As for the approach, I think there is a simpler one that requires no overhead for every endpoint using GetQueryParameter(). We could move uri_parsed to a private HTTPRequest member instead of a local GetQueryParameterFromUri variable, and have http_request_cb set it (for example after checking unknown HTTP methods) - or return an early 400 if null. Subsequently, we can assume that uri_parsed is valid and non-null, simplifying the rest code (and avoiding multiple evhttp_uri_parse calls when accessing multiple query params). What do you think?

  12. brunoerg commented at 12:42 pm on March 14, 2023: contributor
    Concept ACK
  13. pablomartin4btc commented at 9:14 pm on March 14, 2023: member

    What do you think?

    As @stickies-v explained to me outside this PR: “… by moving the uri parsing to http_request_cb we’re failing super early, and even when the query parameter would never be called by the endpoint… “, this makes more sense.

    I’m working on this approach which I think it’s better as well.

  14. pablomartin4btc force-pushed on Mar 16, 2023
  15. in src/httpserver.cpp:679 in 02cbb58327 outdated
    684-
    685-std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key)
    686-{
    687-    evhttp_uri* uri_parsed{evhttp_uri_parse(uri)};
    688-    const char* query{evhttp_uri_get_query(uri_parsed)};
    689+    if (!m_uri_parsed) { return std::nullopt; }
    


    stickies-v commented at 5:46 pm on March 16, 2023:

    nit

    0    if (!m_uri_parsed) return std::nullopt;
    

    pablomartin4btc commented at 8:14 pm on March 16, 2023:
    thanks, corrected other similar 2.
  16. pablomartin4btc force-pushed on Mar 16, 2023
  17. pablomartin4btc commented at 5:59 pm on March 16, 2023: member

    Updated changes:

    • move the validation/parsing logic up in the stack as soon as the request is initiated even before the work item thread is created
    • updated unit and functional tests
  18. in src/test/httpserver_tests.cpp:18 in 213554cb97 outdated
    14@@ -13,26 +15,34 @@ BOOST_AUTO_TEST_CASE(test_query_parameters)
    15 {
    16     std::string uri {};
    17 
    18+    evhttp_request* evreq = evhttp_request_new(nullptr, nullptr);
    


    stickies-v commented at 6:10 pm on March 16, 2023:
    See #24012 - I couldn’t get mocking an evhttp_request* past the ASan test, and it looks like they’re failing again here too. That’s why in #24098 I introduced the GetQueryParameterFromUri() helper function: it could easily be unit tested, and then HTTPRequest::GetQueryParameter() was just a simple wrapper left untested.

    pablomartin4btc commented at 8:18 pm on March 16, 2023:
    If I bring it back I won’t be able to access to m_uri_parsed from the helper function, unless there’s another alternative, I’ll have to roll-back to the original approach of the PR.

    stickies-v commented at 1:56 pm on March 17, 2023:

    What do you think about this approach? I think it’s quite minimal:

      0diff --git a/src/httpserver.cpp b/src/httpserver.cpp
      1index 942caa042..6a6176a78 100644
      2--- a/src/httpserver.cpp
      3+++ b/src/httpserver.cpp
      4@@ -15,6 +15,7 @@
      5 #include <rpc/protocol.h> // For HTTP status codes
      6 #include <shutdown.h>
      7 #include <sync.h>
      8+#include <util/check.h>
      9 #include <util/strencodings.h>
     10 #include <util/syscall_sandbox.h>
     11 #include <util/system.h>
     12@@ -253,6 +254,14 @@ static void http_request_cb(struct evhttp_request* req, void* arg)
     13         return;
     14     }
     15 
     16+    // Early reject invalid URI
     17+    if (!hreq->GetURIParsed()) {
     18+        LogPrint(BCLog::HTTP, "HTTP request from %s rejected: Invalid URI\n",
     19+                 hreq->GetPeer().ToStringAddrPort());
     20+        hreq->WriteReply(HTTP_BAD_REQUEST, "Invalid URI");
     21+        return;
     22+    }
     23+
     24     LogPrint(BCLog::HTTP, "Received a %s request for %s from %s\n",
     25              RequestMethodString(hreq->GetRequestMethod()), SanitizeString(hreq->GetURI(), SAFE_CHARS_URI).substr(0, 100), hreq->GetPeer().ToStringAddrPort());
     26 
     27@@ -535,18 +544,22 @@ void HTTPEvent::trigger(struct timeval* tv)
     28     else
     29         evtimer_add(ev, tv); // trigger after timeval passed
     30 }
     31-HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) : req(_req), replySent(_replySent)
     32+HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) :
     33+        req{_req},
     34+        m_uri_parsed{evhttp_uri_parse(evhttp_request_get_uri(req))},
     35+        replySent{_replySent}
     36 {
     37 }
     38 
     39 HTTPRequest::~HTTPRequest()
     40 {
     41+    // evhttpd cleans up the request, as long as a reply was sent.
     42     if (!replySent) {
     43         // Keep track of whether reply was sent to avoid request leaks
     44         LogPrintf("%s: Unhandled request\n", __func__);
     45         WriteReply(HTTP_INTERNAL_SERVER_ERROR, "Unhandled request");
     46     }
     47-    // evhttpd cleans up the request, as long as a reply was sent.
     48+    if(m_uri_parsed) evhttp_uri_free(m_uri_parsed);
     49 }
     50 
     51 std::pair<bool, std::string> HTTPRequest::GetHeader(const std::string& hdr) const
     52@@ -665,14 +678,13 @@ HTTPRequest::RequestMethod HTTPRequest::GetRequestMethod() const
     53 
     54 std::optional<std::string> HTTPRequest::GetQueryParameter(const std::string& key) const
     55 {
     56-    const char* uri{evhttp_request_get_uri(req)};
     57-
     58-    return GetQueryParameterFromUri(uri, key);
     59+    if (!Assume(m_uri_parsed)) return std::nullopt;  // http_request_cb should have checked that m_uri_parsed is not nullptr
     60+    return GetQueryParameterFromUri(m_uri_parsed, key);
     61 }
     62 
     63-std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key)
     64+std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri* uri_parsed, const std::string& key)
     65 {
     66-    evhttp_uri* uri_parsed{evhttp_uri_parse(uri)};
     67+    assert(uri_parsed);
     68     const char* query{evhttp_uri_get_query(uri_parsed)};
     69     std::optional<std::string> result;
     70 
     71@@ -689,7 +701,6 @@ std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::
     72         }
     73         evhttp_clear_headers(&params_q);
     74     }
     75-    evhttp_uri_free(uri_parsed);
     76 
     77     return result;
     78 }
     79diff --git a/src/httpserver.h b/src/httpserver.h
     80index 036a39a02..84906e121 100644
     81--- a/src/httpserver.h
     82+++ b/src/httpserver.h
     83@@ -14,6 +14,7 @@ static const int DEFAULT_HTTP_WORKQUEUE=16;
     84 static const int DEFAULT_HTTP_SERVER_TIMEOUT=30;
     85 
     86 struct evhttp_request;
     87+struct evhttp_uri;
     88 struct event_base;
     89 class CService;
     90 class HTTPRequest;
     91@@ -57,6 +58,7 @@ class HTTPRequest
     92 {
     93 private:
     94     struct evhttp_request* req;
     95+    evhttp_uri* m_uri_parsed;
     96     bool replySent;
     97 
     98 public:
     99@@ -75,6 +77,9 @@ public:
    100      */
    101     std::string GetURI() const;
    102 
    103+    /** Get libevent parsed URI, which is nullptr if uri parsing failed */
    104+    const evhttp_uri* GetURIParsed() const { return m_uri_parsed; }
    105+
    106     /** Get CService (address:ip) for the origin of the http request.
    107      */
    108     CService GetPeer() const;
    109@@ -135,10 +140,10 @@ public:
    110  *
    111  * Helper function for HTTPRequest::GetQueryParameter.
    112  *
    113- * [@param](/bitcoin-bitcoin/contributor/param/)[in] uri is the entire request uri
    114+ * [@param](/bitcoin-bitcoin/contributor/param/)[in] uri_parsed is the request uri after parsing it with evhttp_uri_parse
    115  * [@param](/bitcoin-bitcoin/contributor/param/)[in] key represents the query parameter of which the value is returned
    116  */
    117-std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key);
    118+std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri* uri_parsed, const std::string& key);
    119 
    120 /** Event handler closure.
    121  */
    

    pablomartin4btc commented at 4:46 pm on March 17, 2023:
    I see, ok, it makes sense, let me make some changes.

    pablomartin4btc commented at 4:14 pm on March 20, 2023:
    After some tweaks around the unit tests (can’t test an assertion failure onBOOST and the a solution usingBOOST_REQUIRE_THROW(function_w_failing_assert(), boost::execution_exception);works only on windows), this is done now.
  19. in test/functional/interface_rest.py:51 in 213554cb97 outdated
    43@@ -44,6 +44,11 @@ class RetType(Enum):
    44     BYTES = 2
    45     JSON = 3
    46 
    47+class HTTPStatusCode(Enum):
    48+    HTTP_OK = 200
    49+    HTTP_BAD_REQUEST = 400
    50+    HTTP_NOT_FOUND = 404
    51+
    


    stickies-v commented at 6:11 pm on March 16, 2023:
    Seems like unrelated/unnecessary churn imo, and redefining HTTP status codes in a single functional test doesn’t seem like a productive approach. Would leave this out?

    pablomartin4btc commented at 8:16 pm on March 16, 2023:
    I’m now reusing the HTTPStatus enum from the python http module in python; there were like 20 places where these codes were introduced as int. I think it makes it clearer but pls let me know if that’s not the case.

    stickies-v commented at 8:53 pm on March 16, 2023:

    Every line of code that you change needs review. If those changed lines of code are unrelated to what you’re trying to achieve, you risk unnecessarily slowing doing the PR or even getting NACKs for reasons unrelated to what you’re really trying to do. I’d strongly suggest keeping the scope as small as possible.

    I’m not saying it’s necessarily a bad idea to use enumerated HTTP status codes, but it would be better off in a separate PR/issue (and have you checked if it’s been suggested before?)


    pablomartin4btc commented at 4:46 pm on March 17, 2023:
    Ok, got it now and I agree. Thanks.
  20. pablomartin4btc force-pushed on Mar 16, 2023
  21. fanquake commented at 9:01 pm on March 16, 2023: member

    https://cirrus-ci.com/task/4643666984173568:

    0 test  2023-03-16T20:06:03.567000Z TestFramework (ERROR): Unexpected exception caught during testing 
    1                                   Traceback (most recent call last):
    2                                     File "/private/var/folders/v7/fs2b0v3s0lz1n57gj9y4xb5m0000gn/T/cirrus-ci-build/ci/scratch/build/bitcoin-arm64-apple-darwin/test/functional/test_framework/test_framework.py", line 132, in main
    3                                       self.run_test()
    4                                     File "/private/var/folders/v7/fs2b0v3s0lz1n57gj9y4xb5m0000gn/T/cirrus-ci-build/ci/scratch/build/bitcoin-arm64-apple-darwin/test/functional/interface_rest.py", line 372, in run_test
    5                                       resp = self.test_rest_request("/mempool/contents", ret_type=RetType.OBJ, status=400, query_params={"verbose": "true", "mempool_sequence": "true"})
    6                                     File "/private/var/folders/v7/fs2b0v3s0lz1n57gj9y4xb5m0000gn/T/cirrus-ci-build/ci/scratch/build/bitcoin-arm64-apple-darwin/test/functional/interface_rest.py", line 86, in test_rest_request
    7                                       assert_equal(resp.status, status.value)
    8                                   AttributeError: 'int' object has no attribute 'value'
    
  22. pablomartin4btc commented at 9:13 pm on March 16, 2023: member

    @fanquake that’s something perhaps I’ve introduced with my last push, I’m on it, thanks. Weird, it’s not failing locally for me:

     0$ ./test/functional/interface_rest.py 
     12023-03-16T21:14:11.025000Z TestFramework (INFO): PRNG seed is: 326626678636704986
     22023-03-16T21:14:11.026000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_vwqauy2g
     32023-03-16T21:14:11.812000Z TestFramework (INFO): Broadcast test transaction and sync nodes
     42023-03-16T21:14:12.865000Z TestFramework (INFO): Test the /tx URI
     52023-03-16T21:14:12.876000Z TestFramework (INFO): Query an unspent TXO using the /getutxos URI
     62023-03-16T21:14:12.934000Z TestFramework (INFO): Query a spent TXO using the /getutxos URI
     72023-03-16T21:14:12.935000Z TestFramework (INFO): Query two TXOs using the /getutxos URI
     82023-03-16T21:14:12.938000Z TestFramework (INFO): Query the TXOs using the /getutxos URI with a binary response
     92023-03-16T21:14:12.939000Z TestFramework (INFO): Test the /getutxos URI with and without /checkmempool
    102023-03-16T21:14:14.065000Z TestFramework (INFO): Test the /block, /blockhashbyheight, /headers, and /blockfilterheaders URIs
    112023-03-16T21:14:15.221000Z TestFramework (INFO): Test tx inclusion in the /mempool and /block URIs
    122023-03-16T21:14:16.326000Z TestFramework (INFO): Test the /chaininfo URI
    132023-03-16T21:14:16.334000Z TestFramework (INFO): Test compatibility of deprecated and newer endpoints
    142023-03-16T21:14:16.341000Z TestFramework (INFO): Test the /deploymentinfo URI
    152023-03-16T21:14:16.404000Z TestFramework (INFO): Stopping nodes
    162023-03-16T21:14:16.611000Z TestFramework (INFO): Cleaning up /tmp/bitcoin_func_test_vwqauy2g on exit
    172023-03-16T21:14:16.612000Z TestFramework (INFO): Tests successful
    

    I think this was the issue: Python Enums were introduced in version 3.4, and the value attribute was added in version 3.6. I’m fixing it.

  23. pablomartin4btc force-pushed on Mar 16, 2023
  24. pablomartin4btc force-pushed on Mar 20, 2023
  25. in src/httpserver.cpp:710 in 0115dccef6 outdated
    702@@ -689,11 +703,12 @@ std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::
    703         }
    704         evhttp_clear_headers(&params_q);
    705     }
    706-    evhttp_uri_free(uri_parsed);
    707 
    708     return result;
    709 }
    710 
    711+evhttp_uri* HTTPRequest::GetURIParsed() { return m_uri_parsed; }
    


    stickies-v commented at 4:11 pm on March 20, 2023:
    A simple getter is best implemented in the header

    pablomartin4btc commented at 4:50 pm on March 20, 2023:
    ok
  26. in src/httpserver.h:131 in 0115dccef6 outdated
    140- * @param[in] uri is the entire request uri
    141- * @param[in] key represents the query parameter of which the value is returned
    142- */
    143-std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key);
    144+    /** Get libevent parsed URI, which is nullptr if uri parsing failed */
    145+    evhttp_uri* GetURIParsed();
    


    stickies-v commented at 4:11 pm on March 20, 2023:
    0    const evhttp_uri* GetURIParsed() const;
    

    pablomartin4btc commented at 5:26 pm on March 20, 2023:
    Yeah, putting it back, before the compiler was complaining as in the previous approach we were modifying m_uri_parsed in this function.
  27. in src/httpserver.h:180 in 0115dccef6 outdated
    175+ *
    176+ * @param[in] uri_parsed is the evhttp_uri object (obtained from evhttp_uri_parse - libevent function)
    177+ * @param[in] key represents the query parameter of which the value is returned
    178+ */
    179+std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri* uri_parsed, const std::string& key);
    180+
    


    stickies-v commented at 4:15 pm on March 20, 2023:
    What’s the rationale behind moving all this code? I don’t think it’s necessary?

    pablomartin4btc commented at 4:49 pm on March 20, 2023:
    Was unintentional, going to place it on its original location.
  28. in src/httpserver.h:60 in 0115dccef6 outdated
    56@@ -56,7 +57,8 @@ struct event_base* EventBase();
    57 class HTTPRequest
    58 {
    59 private:
    60-    struct evhttp_request* req;
    61+    struct evhttp_request* req{nullptr};
    


    stickies-v commented at 4:17 pm on March 20, 2023:
    seems unrelated to this PR, and no need to touch this otherwise?

    pablomartin4btc commented at 5:10 pm on March 20, 2023:
    I wanted to make sure then to use the free up function for the request in the HTTPRequest destructor: evhttp_request_free. This was something I was doing while investigating the previous failure of the ASan tests and possible memory leaks. I’ll remove it if you think unnecessary,

    pablomartin4btc commented at 5:30 pm on March 20, 2023:
    I’ll take it back.
  29. in src/httpserver.h:97 in 0115dccef6 outdated
    93@@ -92,7 +94,7 @@ class HTTPRequest
    94      *
    95      * @param[in] key represents the query parameter of which the value is returned
    96      */
    97-    std::optional<std::string> GetQueryParameter(const std::string& key) const;
    98+    std::optional<std::string> GetQueryParameter(const std::string& key);
    


    stickies-v commented at 4:17 pm on March 20, 2023:
    Why remove const?

    pablomartin4btc commented at 5:27 pm on March 20, 2023:
    Similar reason to the above comment onGetURIParsed() function, back again.
  30. pablomartin4btc force-pushed on Mar 20, 2023
  31. in src/test/httpserver_tests.cpp:1 in 9b0a0dc923


    stickies-v commented at 4:19 pm on March 20, 2023:
    I think this would benefit from helper function(s) to avoid duplicating all the parsing/memclearing logic and make the tests easier to read.

    pablomartin4btc commented at 5:11 pm on March 20, 2023:
    ok, let me add some helper(s) there.

    pablomartin4btc commented at 6:01 pm on March 20, 2023:
    done.

    stickies-v commented at 12:04 pm on March 21, 2023:

    I think the tests are still more complex than necessary. How about this:

     0diff --git a/src/test/httpserver_tests.cpp b/src/test/httpserver_tests.cpp
     1index 0462d3ae3..6ae57bdf0 100644
     2--- a/src/test/httpserver_tests.cpp
     3+++ b/src/test/httpserver_tests.cpp
     4@@ -11,42 +11,46 @@
     5 
     6 BOOST_FIXTURE_TEST_SUITE(httpserver_tests, BasicTestingSetup)
     7 
     8-static evhttp_uri* GetURIParsed(const std::string uri){
     9-    return evhttp_uri_parse(uri.c_str());
    10+/** Test if query parameter exists in the provided uri. */
    11+bool QueryParameterExists(std::string_view uri, const std::string& key)
    12+{
    13+    auto uri_parsed{evhttp_uri_parse(uri.data())};
    14+    const bool exists{GetQueryParameterFromUri(uri_parsed, key).has_value()};
    15+    if(uri_parsed) evhttp_uri_free(uri_parsed);
    16+    return exists;
    17+}
    18+
    19+/** Test if query parameter exists in the provided uri and if its value matches the expected_value. */
    20+bool QueryParameterEquals(std::string_view uri, const std::string& key, std::string_view expected_value)
    21+{
    22+    auto uri_parsed{evhttp_uri_parse(uri.data())};
    23+    const auto param_value{GetQueryParameterFromUri(uri_parsed, key)};
    24+    const bool is_equal{param_value.has_value() && *param_value == expected_value};
    25+    if(uri_parsed) evhttp_uri_free(uri_parsed);
    26+    return is_equal;
    27 }
    28 
    29 BOOST_AUTO_TEST_CASE(test_query_parameters)
    30 {
    31     // No parameters
    32-    evhttp_uri* uri_parsed{GetURIParsed("localhost:8080/rest/headers/someresource.json")};
    33-    BOOST_CHECK(!GetQueryParameterFromUri(uri_parsed, "p1").has_value());
    34-    evhttp_uri_free(uri_parsed);
    35+    BOOST_CHECK(!QueryParameterExists("localhost:8080/rest/headers/someresource.json", "p1"));
    36 
    37     // Single parameter
    38-    uri_parsed = GetURIParsed("localhost:8080/rest/endpoint/someresource.json?p1=v1");
    39-    BOOST_CHECK_EQUAL(GetQueryParameterFromUri(uri_parsed, "p1").value(), "v1");
    40-    BOOST_CHECK(!GetQueryParameterFromUri(uri_parsed, "p2").has_value());
    41-    evhttp_uri_free(uri_parsed);
    42+    BOOST_CHECK(QueryParameterEquals("localhost:8080/rest/endpoint/someresource.json?p1=v1", "p1", "v1"));
    43+    BOOST_CHECK(!QueryParameterExists("localhost:8080/rest/endpoint/someresource.json?p1=v1", "p2"));
    44 
    45     // Multiple parameters
    46-    uri_parsed = GetURIParsed("/rest/endpoint/someresource.json?p1=v1&p2=v2");
    47-    BOOST_CHECK_EQUAL(GetQueryParameterFromUri(uri_parsed, "p1").value(), "v1");
    48-    BOOST_CHECK_EQUAL(GetQueryParameterFromUri(uri_parsed, "p2").value(), "v2");
    49-    evhttp_uri_free(uri_parsed);
    50+    BOOST_CHECK(QueryParameterEquals("/rest/endpoint/someresource.json?p1=v1&p2=v2", "p1", "v1"));
    51+    BOOST_CHECK(QueryParameterEquals("/rest/endpoint/someresource.json?p1=v1&p2=v2", "p2", "v2"));
    52 
    53     // If the query string contains duplicate keys, the first value is returned
    54-    uri_parsed = GetURIParsed("/rest/endpoint/someresource.json?p1=v1&p1=v2");
    55-    BOOST_CHECK_EQUAL(GetQueryParameterFromUri(uri_parsed, "p1").value(), "v1");
    56-    evhttp_uri_free(uri_parsed);
    57+    BOOST_CHECK(QueryParameterEquals("/rest/endpoint/someresource.json?p1=v1&p1=v2", "p1", "v1"));
    58 
    59     // Invalid query string syntax is the same as not having parameters
    60-    uri_parsed = GetURIParsed("/rest/endpoint/someresource.json&p1=v1&p2=v2");
    61-    BOOST_CHECK(!GetQueryParameterFromUri(uri_parsed, "p1").has_value());
    62-    evhttp_uri_free(uri_parsed);
    63+    BOOST_CHECK(!QueryParameterExists("/rest/endpoint/someresource.json&p1=v1&p2=v2", "p1"));
    64 
    65     // Invalid URI with % symbol won't return any values as not having parameters at all
    66-    uri_parsed = GetURIParsed("/rest/endpoint/someresource.json&p1=v1&p2=v2%");
    67-    BOOST_CHECK_EQUAL(uri_parsed, nullptr);
    68-    BOOST_CHECK(!GetQueryParameterFromUri(uri_parsed, "p1").has_value());
    69+    BOOST_CHECK(!QueryParameterExists("/rest/endpoint/someresource.json&p1=v1&p2=v2%", "p1"));
    70+
    71 }
    72 BOOST_AUTO_TEST_SUITE_END()
    

    Can probably still be deduplicated a bit more using an RAII wrapper for uri_parsed, if you like.


    pablomartin4btc commented at 8:32 pm on March 21, 2023:
    Done with one more tweak. I’ll use an RAII wrapper if I have to touch the code again, and if that’s the case, could you pls point me to an example already here in the bitcoin core code if already exists? Thanks.

    stickies-v commented at 10:42 am on May 18, 2023:
    The two req->GetQueryParameter calls in rest_mempool need to be updated as well.

    pablomartin4btc commented at 2:26 pm on May 27, 2023:
    Yeah missed them, done now, thanks!
  32. in test/functional/interface_rest.py:218 in 9b0a0dc923 outdated
    218-        self.test_rest_request(f"/getutxos/checkmempool/{long_uri}", http_method='POST', status=400, ret_type=RetType.OBJ)
    219+        self.test_rest_request(f"/getutxos/checkmempool/{long_uri}", http_method='POST', status=HTTPStatus.BAD_REQUEST, ret_type=RetType.OBJ)
    220 
    221         long_uri = '/'.join([f'{txid}-{n_}' for n_ in range(15)])
    222-        self.test_rest_request(f"/getutxos/checkmempool/{long_uri}", http_method='POST', status=200)
    223+        self.test_rest_request(f"/getutxos/checkmempool/{long_uri}", http_method='POST', status=HTTPStatus.OK)
    


    stickies-v commented at 4:22 pm on March 20, 2023:
    These changes are unrelated to the segmentation fault and at the very least need to be in a separate commit

    pablomartin4btc commented at 4:48 pm on March 20, 2023:
    Sorry, forgot to remove all that.
  33. pablomartin4btc commented at 4:24 pm on March 20, 2023: member

    Updated changes:

    • Updated the code with a new workaround to avoid failures on unit tests (httpserver_tests).

    Thanks @stickies-v for your prompt response and your support on quick workarounds on issues raised.

  34. stickies-v commented at 4:26 pm on March 20, 2023: contributor
    nit: the commit message could benefit from additional information on the nature of the bug and how it’s fixed
  35. in src/test/httpserver_tests.cpp:17 in 9b0a0dc923 outdated
    16-    std::string uri {};
    17-
    18     // No parameters
    19-    uri = "localhost:8080/rest/headers/someresource.json";
    20-    BOOST_CHECK(!GetQueryParameterFromUri(uri.c_str(), "p1").has_value());
    21+    std::string uri = "localhost:8080/rest/headers/someresource.json";
    


    brunoerg commented at 5:01 pm on March 20, 2023:

    nit:

    0    const std::string uri{"localhost:8080/rest/headers/someresource.json"};
    

    pablomartin4btc commented at 5:02 pm on March 20, 2023:
    right
  36. pablomartin4btc force-pushed on Mar 20, 2023
  37. pablomartin4btc force-pushed on Mar 20, 2023
  38. pablomartin4btc force-pushed on Mar 20, 2023
  39. pablomartin4btc force-pushed on Mar 20, 2023
  40. pablomartin4btc force-pushed on Mar 20, 2023
  41. in src/httpserver.h:135 in 36be160a9b outdated
    132 };
    133 
    134-/** Get the query parameter value from request uri for a specified key, or std::nullopt if the key
    135- * is not found.
    136+/** Get the query parameter value from a evhttp_uri (parsed) object for a specified key, or std::nullopt
    137+ * if the key is not found or uri_parsed parameter is nullptr (most probably from a failed parsing).
    


    stickies-v commented at 11:26 am on March 21, 2023:

    nit: it’s actually an argument, but I would just leave it out

    0 * if the key is not found or if uri_parsed is nullptr (most probably from a failed parsing).
    

    pablomartin4btc commented at 7:04 pm on March 21, 2023:
    done.
  42. in src/httpserver.h:143 in 36be160a9b outdated
    141  * currently not needed in any of the endpoints.
    142  *
    143  * Helper function for HTTPRequest::GetQueryParameter.
    144  *
    145- * @param[in] uri is the entire request uri
    146+ * @param[in] uri_parsed is the evhttp_uri object (obtained from evhttp_uri_parse - libevent function)
    


    stickies-v commented at 11:27 am on March 21, 2023:

    nit

    0 * [@param](/bitcoin-bitcoin/contributor/param/)[in] uri_parsed is the uri parsed with libevent's evhttp_uri_parse
    

    pablomartin4btc commented at 7:05 pm on March 21, 2023:
    done.
  43. stickies-v commented at 12:12 pm on March 21, 2023: contributor
    Approach ACK 36be160a9
  44. pablomartin4btc force-pushed on Mar 21, 2023
  45. pablomartin4btc force-pushed on Mar 21, 2023
  46. pablomartin4btc force-pushed on Mar 21, 2023
  47. pablomartin4btc force-pushed on Mar 21, 2023
  48. in src/test/httpserver_tests.cpp:17 in 729287530e outdated
    10 #include <boost/test/unit_test.hpp>
    11 
    12 BOOST_FIXTURE_TEST_SUITE(httpserver_tests, BasicTestingSetup)
    13 
    14-BOOST_AUTO_TEST_CASE(test_query_parameters)
    15+std::optional<std::string> GetParameterFromURI(const std::string_view uri, const std::string& key){
    


    stickies-v commented at 2:15 pm on March 22, 2023:
    Missing #include <string> and #include <string_view>

    pablomartin4btc commented at 3:14 pm on March 22, 2023:
    Ok, thanks.
  49. stickies-v approved
  50. stickies-v commented at 2:18 pm on March 22, 2023: contributor

    ACK 729287530e458febf32e01e542a36e7be4a74fbf

    Also checked that the new functional test fails without the patch and succeeds with the patch.

    but I think the commit message should be updated to better describe the bug, the fix, and the behaviour change introduced

  51. in src/test/httpserver_tests.cpp:22 in 729287530e outdated
    14-BOOST_AUTO_TEST_CASE(test_query_parameters)
    15+std::optional<std::string> GetParameterFromURI(const std::string_view uri, const std::string& key){
    16+    auto uri_parsed{evhttp_uri_parse(uri.data())};
    17+    const auto param_value{GetQueryParameterFromUri(uri_parsed, key)};
    18+    if(uri_parsed) evhttp_uri_free(uri_parsed);
    19+    return param_value;
    


    maflcko commented at 2:44 pm on March 22, 2023:
    0test/httpserver_tests.cpp:18:12: error: constness of 'param_value' prevents automatic move [performance-no-automatic-move,-warnings-as-errors]
    1    return param_value;
    2           ^
    3clang-tidy-14 --us
    

    pablomartin4btc commented at 3:12 pm on March 22, 2023:
    I’ll fix this.
  52. pablomartin4btc commented at 3:15 pm on March 22, 2023: member

    but I think the commit message should be updated to better describe the bug, the fix, and the behaviour change introduced

    Yes, I have that pending to do asap.

  53. pablomartin4btc force-pushed on Mar 23, 2023
  54. vasild commented at 10:11 am on March 23, 2023: contributor

    Concept ACK

    The code as of 8bdea6316c looks ok, but it seems to be mixing the fix with an optimization to only parse once (instead of every time a parameter is retrieved). It would be better to have two separate commits - one with the non-functional optimization and one with the fix.

    Idea: I think it would be better to disallow the creation of HTTPRequest if the request is invalid/unparsable. This would avoid having a nullptr member m_uri_parsed and having to handle the nullness. Would it work to check the sanity in the HTTPRequest constructor and if a problem is detected, then throw some object that contains an error code and a string (to convey e.g. HTTP_BAD_REQUEST, "Invalid URI")? Then catch this in http_request_cb() and return early.

    There are other similar bugs in this code which exist in master and remain unfixed by this PR:

    • evhttp_request_get_uri() may return nullptr which we pass unconditionally to evhttp_uri_parse() which will crash on nullptr.
    • In HTTPRequest::GetURI() we implicitly create std::string from the return value of evhttp_request_get_uri(). Creating a string from nullptr is undefined behavior.

    I am not sure under what circumstances would evhttp_request_get_uri() return nullptr, but it is not documented that it will never return nullptr and also its source code implies that it can, so better safe than sorry. I don’t think fixing those is a requirement for this PR. If they are not, then I will follow up with a subsequent PR.

  55. stickies-v commented at 1:06 pm on March 23, 2023: contributor

    It would be better to have two separate commits - one with the non-functional optimization and one with the fix.

    Agreed, that could probably be improved.

    Idea: I think it would be better to disallow the creation of HTTPRequest if the request is invalid/unparsable.

    This was the first approach I considered, but I think it’s better not to. A request with an invalid URI is still a request, and thus it should be representable. Moving the URI validation to the constructor feels like a shortcut and coupling we don’t need to take?

    evhttp_request_get_uri() may return nullptr

    Like you, I don’t see how that can happen in practice but can never hurt to make more robust. I think it’s orthogonal to this PR but happy to review your PR - feel free to ping me?

  56. vasild commented at 1:48 pm on March 23, 2023: contributor

    Moving the URI validation to the constructor feels like a shortcut and coupling we don’t need to take?

    I do not have a strong opinion. This PR moved the parsing to the constructor, but left the checking of the parsing result for later. Now the code looks like (simplified):

    0// constructs the object, does the parsing, leaves null m_uri_parsed if it fails
    1HTTPRequest hreq(...);
    2
    3if (hreq.m_uri_parsed == nullptr) {
    4    handle error and dispose hreq
    5}
    6
    7// elsewhere a soft assert is needed to ensure hreq.m_uri_parsed != nullptr
    

    I find the following more robust and compact:

    0std::unique_ptr<HTTPRequest> hreq;
    1try {
    2    hreq = std::make_unique<HTTPRequest>(...);
    3} catch (e) {
    4    handle error e.code + e.msg
    5}
    6
    7// no need for further error handling or checking here or elsewhere
    

    happy to review your PR - feel free to ping me

    Sure, lets first see what happens with this PR. Maybe @pablomartin4btc would decide to check the return value of evhttp_request_get_uri() in this PR.

  57. pablomartin4btc commented at 3:21 pm on March 23, 2023: member

    It would be better to have two separate commits - one with the non-functional optimization and one with the fix.

    Agreed, that could probably be improved.

    I can do this.

    Sure, lets first see what happens with this PR. Maybe @pablomartin4btc would decide to check the return value of evhttp_request_get_uri() in this PR.

    It was under my radar and I thought to do it in a follow up, but I could do it here if it’s more practical (need to update title and description of this PR).

    I find the following more robust and compact:

    0std::unique_ptr<HTTPRequest> hreq;
    1try {
    2    hreq = std::make_unique<HTTPRequest>(...);
    3} catch (e) {
    4    handle error e.code + e.msg
    5}
    6
    7// no need for further error handling or checking here or elsewhere
    

    I like this approach, let me check and play a bit with the code as if I need to raise an exception or can return the HTTP_BAD_REQUEST from there, and how the consumer (rest obj) would behave/ interact with it.

  58. achow101 added this to the milestone 25.0 on Mar 23, 2023
  59. vasild commented at 9:01 am on March 24, 2023: contributor

    If you decide to go this way, something like this should work:

     0// in HTTPRequest::HTTPRequest():
     1if (cannot parse uri) {
     2    const int code{HTTP_BAD_METHOD};
     3    const char* what{"Invalid URI"};
     4    WriteReply(code, what);
     5    throw std::system_error(code, std::generic_category(), what);
     6}
     7
     8// in http_request_cb():
     9std::unique_ptr<HTTPRequest> hreq;
    10try {
    11    hreq = std::make_unique<HTTPRequest>(...);
    12} catch (const std::system_error& e) {
    13    LogPrint(BCLog::HTTP, "%d %s\n", e.code().value(), e.what());
    14    return;
    15}
    
  60. pablomartin4btc force-pushed on Mar 25, 2023
  61. pablomartin4btc commented at 4:12 am on March 25, 2023: member

    Updated changes:

    • Moving the uri parsing validation from the http_request_cb to the HTTPRequest constructor, while doing this realised that it’s better to validate the result of evhttp_request_get_uri() also at that time. @vasild, @stickies-v: please let me know what you think and I’ll update title & description of the PR and will separate the code in 2 commits (proper fix & non-functional optimisation).
  62. pablomartin4btc commented at 5:52 pm on March 25, 2023: member

    Regarding CI failures:

    • ./test/functional/wallet_create_tx.py passes locally, actioned manually re-run and it passes too.
    • checking the fuzzer one cirrus-ci-build/ci/scratch/qa-assets/fuzz_seed_corpus/http_request" failed with exit code 77 - fuzz: httpserver.cpp:618: void HTTPRequest::WriteReply(int, const std::string &): Assertion '!replySent && req' failed. (crash-75d231d5e022cc38377bbbf4549058d37b7c59a3)
    0FUZZ=process_message ./src/test/fuzz/fuzz ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3
    1process_message: succeeded against 1 files in 0s.
    
  63. maflcko commented at 7:51 am on March 27, 2023: member

    FUZZ=process_message ./src/test/fuzz/fuzz ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3

    You’ll need to set the correct fuzz target:

    FUZZ=http_request ./src/test/fuzz/fuzz ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3

  64. in src/httpserver.cpp:265 in 0aa0626c50 outdated
    261@@ -253,11 +262,11 @@ static void http_request_cb(struct evhttp_request* req, void* arg)
    262         return;
    263     }
    264 
    265+    std::string strURI{hreq->GetURI()};
    


    vasild commented at 9:18 am on March 27, 2023:

    nit:

    0    const std::string strURI{hreq->GetURI()};
    

    vasild commented at 10:27 am on March 27, 2023:

    GetURI() (which calls evhttp_request_get_uri()) could still return null. It seems too indirect to that we rely that we have called evhttp_request_get_uri() before on the same input and back then it returned non-null (from the constructor) so it must also return non-null now.

    I think it would be better to save the result in a variable and document that it will never be null. This will also avoid calling evhttp_request_get_uri() two times. Like this:

      0diff --git i/src/httpserver.h w/src/httpserver.h
      1index 5b21a678c5..d7cd81be7e 100644
      2--- i/src/httpserver.h
      3+++ w/src/httpserver.h
      4@@ -55,13 +55,16 @@ struct event_base* EventBase();
      5  * Thin C++ wrapper around evhttp_request.
      6  */
      7 class HTTPRequest
      8 {
      9 private:
     10     struct evhttp_request* req;
     11-    evhttp_uri* m_uri_parsed{nullptr};
     12+    /// The URI, as returned by `evhttp_request_get_uri()`. Will never be `nullptr`.
     13+    const char* const m_uri;
     14+    /// The parsed URI, as returned by `evhttp_uri_parse()`. Will never be `nullptr`.
     15+    evhttp_uri* const m_uri_parsed;
     16     bool replySent;
     17 
     18 public:
     19     explicit HTTPRequest(struct evhttp_request* req, bool replySent = false);
     20     ~HTTPRequest();
     21 
     22diff --git i/src/httpserver.cpp w/src/httpserver.cpp
     23index e9b68a36c1..45d25bc32f 100644
     24--- i/src/httpserver.cpp
     25+++ w/src/httpserver.cpp
     26@@ -238,14 +238,14 @@ static void http_request_cb(struct evhttp_request* req, void* arg)
     27     }
     28 
     29     std::unique_ptr<HTTPRequest> hreq;
     30 
     31     try {
     32         hreq = std::make_unique<HTTPRequest>(req);
     33-    } catch (const std::system_error& e) {
     34-        LogPrint(BCLog::HTTP, "%d %s\n", e.code().value(), e.what());
     35+    } catch (const std::runtime_error& e) {
     36+        LogPrint(BCLog::HTTP, "%s\n", e.what());
     37         return;
     38     }
     39 
     40     // Early address-based allow check
     41     if (!ClientAllowed(hreq->GetPeer())) {
     42         LogPrint(BCLog::HTTP, "HTTP request from %s rejected: Client network is not allowed RPC access\n",
     43@@ -543,34 +543,33 @@ void HTTPEvent::trigger(struct timeval* tv)
     44         event_active(ev, 0, 0); // immediately trigger event in main thread
     45     else
     46         evtimer_add(ev, tv); // trigger after timeval passed
     47 }
     48 HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) :
     49         req{_req},
     50+        m_uri{evhttp_request_get_uri(req)},
     51+        m_uri_parsed{m_uri != nullptr && m_uri[0] != '\0' ? evhttp_uri_parse(m_uri) : nullptr},
     52         replySent{_replySent}
     53 {
     54-    const char* uri_to_parse{GetURI()};
     55-    if (uri_to_parse != nullptr && uri_to_parse[0] != '\0') m_uri_parsed = evhttp_uri_parse(uri_to_parse);
     56-
     57-    if (!m_uri_parsed) {
     58-        const int code{HTTP_BAD_REQUEST};
     59-        const char* what{"Invalid URI"};
     60-        WriteReply(code, what);
     61-        throw std::system_error(code, std::generic_category(), what);
     62+    if (m_uri_parsed == nullptr) {
     63+        const char* err{"Invalid URI"};
     64+        WriteReply(HTTP_BAD_REQUEST, err);
     65+        throw std::runtime_error(
     66+            strprintf("HTTP request from %s rejected: %s", GetPeer().ToStringAddr(), err));
     67     }
     68 }
     69 
     70 HTTPRequest::~HTTPRequest()
     71 {
     72     // evhttpd cleans up the request, as long as a reply was sent.
     73     if (!replySent) {
     74         // Keep track of whether reply was sent to avoid request leaks
     75         LogPrintf("%s: Unhandled request\n", __func__);
     76         WriteReply(HTTP_INTERNAL_SERVER_ERROR, "Unhandled request");
     77     }
     78-    if (m_uri_parsed) evhttp_uri_free(m_uri_parsed);
     79+    evhttp_uri_free(m_uri_parsed);
     80 }
     81 
     82 std::pair<bool, std::string> HTTPRequest::GetHeader(const std::string& hdr) const
     83 {
     84     const struct evkeyvalq* headers = evhttp_request_get_input_headers(req);
     85     assert(headers);
     86@@ -662,13 +661,13 @@ CService HTTPRequest::GetPeer() const
     87     }
     88     return peer;
     89 }
     90 
     91 const char* HTTPRequest::GetURI() const
     92 {
     93-    return evhttp_request_get_uri(req);
     94+    return m_uri;
     95 }
     96 
     97 HTTPRequest::RequestMethod HTTPRequest::GetRequestMethod() const
     98 {
     99     switch (evhttp_request_get_command(req)) {
    100     case EVHTTP_REQ_GET:
    

    pablomartin4btc commented at 3:06 am on March 28, 2023:
    I agree with this suggestion, first the manipulation of null values, mainly its return, and also removing the extra call to evhttp_request_get_uri().
  65. in src/httpserver.h:81 in 0aa0626c50 outdated
    74@@ -73,7 +75,7 @@ class HTTPRequest
    75 
    76     /** Get requested URI.
    


    vasild commented at 9:23 am on March 27, 2023:
    0    /** Get requested URI. Will always return non-nullptr.
    
  66. in src/httpserver.h:132 in 0aa0626c50 outdated
    127@@ -126,19 +128,19 @@ class HTTPRequest
    128     void WriteReply(int nStatus, const std::string& strReply = "");
    129 };
    130 
    131-/** Get the query parameter value from request uri for a specified key, or std::nullopt if the key
    132- * is not found.
    133+/** Get the query parameter value from a evhttp_uri (parsed) object for a specified key, or std::nullopt
    134+ * if the key is not found or if uri_parsed argument is nullptr (most probably from a failed parsing).
    


    vasild commented at 10:25 am on March 27, 2023:

    It is possible to enforce that this argument will never be null - if changed to reference. Simplifies the code a bit:

     0diff --git i/src/httpserver.cpp w/src/httpserver.cpp
     1index e9b68a36c1..1dccb196be 100644
     2--- i/src/httpserver.cpp
     3+++ w/src/httpserver.cpp
     4@@ -683,20 +683,18 @@ HTTPRequest::RequestMethod HTTPRequest::GetRequestMethod() const
     5         return UNKNOWN;
     6     }
     7 }
     8 
     9 std::optional<std::string> HTTPRequest::GetQueryParameter(const std::string& key) const
    10 {
    11-    if (!Assume(m_uri_parsed)) return std::nullopt;  // HTTPRequest constructor should have checked that m_uri_parsed is not nullptr
    12-    return GetQueryParameterFromUri(m_uri_parsed, key);
    13+    return GetQueryParameterFromUri(*m_uri_parsed, key);
    14 }
    15 
    16-std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri* uri_parsed, const std::string& key)
    17+std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri& uri_parsed, const std::string& key)
    18 {
    19-    if (!uri_parsed) return std::nullopt;
    20-    const char* query{evhttp_uri_get_query(uri_parsed)};
    21+    const char* query{evhttp_uri_get_query(&uri_parsed)};
    22     std::optional<std::string> result;
    23 
    24     if (query) {
    25         // Parse the query string into a key-value queue and iterate over it
    26         struct evkeyvalq params_q;
    27         evhttp_parse_query_str(query, &params_q);
    28diff --git i/src/httpserver.h w/src/httpserver.h
    29index 5b21a678c5..3f2ef06c11 100644
    30--- i/src/httpserver.h
    31+++ w/src/httpserver.h
    32@@ -126,24 +126,24 @@ public:
    33      * main thread, do not call any other HTTPRequest methods after calling this.
    34      */
    35     void WriteReply(int nStatus, const std::string& strReply = "");
    36 };
    37 
    38 /** Get the query parameter value from a evhttp_uri (parsed) object for a specified key, or std::nullopt
    39- * if the key is not found or if uri_parsed argument is nullptr (most probably from a failed parsing).
    40+ * if the key is not found.
    41  *
    42  * If the query string contains duplicate keys, the first value is returned. Many web frameworks
    43  * would instead parse this as an array of values, but this is not (yet) implemented as it is
    44  * currently not needed in any of the endpoints.
    45  *
    46  * Helper function for HTTPRequest::GetQueryParameter.
    47  *
    48  * [@param](/bitcoin-bitcoin/contributor/param/)[in] uri_parsed is the uri parsed with libevent's evhttp_uri_parse
    49  * [@param](/bitcoin-bitcoin/contributor/param/)[in] key represents the query parameter of which the value is returned
    50  */
    51-std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri* uri_parsed, const std::string& key);
    52+std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri& uri_parsed, const std::string& key);
    53 
    54 /** Event handler closure.
    55  */
    56 class HTTPClosure
    57 {
    58 public:
    59diff --git i/src/test/httpserver_tests.cpp w/src/test/httpserver_tests.cpp
    60index 120c6d102f..6e2128863d 100644
    61--- i/src/test/httpserver_tests.cpp
    62+++ w/src/test/httpserver_tests.cpp
    63@@ -13,13 +13,13 @@
    64 #include <boost/test/unit_test.hpp>
    65 
    66 BOOST_FIXTURE_TEST_SUITE(httpserver_tests, BasicTestingSetup)
    67 
    68 std::optional<std::string> GetParameterFromURI(const std::string_view uri, const std::string& key){
    69     auto uri_parsed{evhttp_uri_parse(uri.data())};
    70-    auto param_value{GetQueryParameterFromUri(uri_parsed, key)};
    71+    auto param_value{GetQueryParameterFromUri(*uri_parsed, key)};
    72     if(uri_parsed) evhttp_uri_free(uri_parsed);
    73     return param_value;
    74 }
    75 
    76 /** Test if query parameter exists in the provided uri. */
    77 bool QueryParameterExists(const std::string_view uri, const std::string& key)
    

    pablomartin4btc commented at 3:03 am on March 28, 2023:
    I see, changing a pointer to a reference does not inherently prevent null pointer dereferences at runtime. However, by changing a pointer to a reference and documenting that the reference is expected to be non-null, we are effectively enforcing a contract that the caller of the function must provide a valid (non-null) reference. Suggestion taken, thanks!

    pablomartin4btc commented at 4:51 am on March 28, 2023:
    The problem removing if (!uri_parsed) return std::nullopt; line from GetQueryParameterFromUri() is that passing nullptr as the uri_parsed (from an invalid uri), as in the last test case in httpserver_test.cpp will produce the segfault, so I could opt that in the test but I dont think it’s the real intention of testing GetQueryParameterFromUri().

    vasild commented at 8:18 am on March 30, 2023:

    It is not possible to pass nullptr as an argument to GetQueryParameterFromUri(const evhttp_uri& uri_parsed, const std::string& key). This is the point. Inside the function there is no need to worry about null pointer.

    The crash occurs inside the test:

    0    auto uri_parsed{evhttp_uri_parse(uri.data())}; // ends up nullptr
    1    auto param_value{GetQueryParameterFromUri(*uri_parsed, key)}; // crash here, before calling GetQueryParameterFromUri().
    

    The function is safe, the caller has to be adjusted.


    vasild commented at 8:36 am on March 30, 2023:
    I see you have adjusted the test in the latest push, I think that is the right approach. Thanks!
  67. pablomartin4btc force-pushed on Mar 28, 2023
  68. pablomartin4btc force-pushed on Mar 28, 2023
  69. pablomartin4btc force-pushed on Mar 28, 2023
  70. pablomartin4btc commented at 5:24 am on March 28, 2023: member

    Working on functional failureinterface_reston the invalid uri case, something didn’t like from latest changes.

    I think I’ve found the issue!

    Both, .py and fuzz were getting the same failure Assertion '!replySent && req' failed.

    The problem for the functional was thatGeetPeer()was called while throwing the runtime exception, so moved it before.

    On the fuzz ./src/test/fuzz/test http_request.cpp, the issue is that the test itself is passing replySent as true, so the WriteReply() call on the httpserver constructor when the uri_parse is invalid. I’m investigating the rest of the test if we need to change anything there. Also, not sure, in which cases we need to create an instance of the HTTPRequest obj passing replySent as true, I didn’t find a case in the entire code, unless as in the current fuzz we are talking about.

  71. pablomartin4btc force-pushed on Mar 28, 2023
  72. in src/test/httpserver_tests.cpp:21 in 1c278adc1a outdated
    17-BOOST_AUTO_TEST_CASE(test_query_parameters)
    18+std::optional<std::string> GetParameterFromURI(const std::string_view uri, const std::string& key){
    19+    auto uri_parsed{evhttp_uri_parse(uri.data())};
    20+    if (uri_parsed == nullptr) return std::nullopt;
    21+    auto param_value{GetQueryParameterFromUri(*uri_parsed, key)};
    22+    if(uri_parsed) evhttp_uri_free(uri_parsed);
    


    brunoerg commented at 8:58 pm on March 28, 2023:

    nit

    0    if (uri_parsed) evhttp_uri_free(uri_parsed);
    
  73. brunoerg commented at 9:00 pm on March 28, 2023: contributor

    On the fuzz ./src/test/fuzz/test http_request.cpp, the issue is that the test itself is passing replySent as true, so the WriteReply() call on the httpserver constructor when the uri_parse is invalid. I’m investigating the rest of the test if we need to change anything there. Also, not sure, in which cases we need to create an instance of the HTTPRequest obj passing replySent as true, I didn’t find a case in the entire code, unless as in the current fuzz we are talking about.

    Try to set it to false since we are calling ReadBody() to read and process the result.

  74. pablomartin4btc commented at 10:22 pm on March 28, 2023: member

    Try to set it to false since we are calling ReadBody() to read and process the result.

    I think we need to re-think this fuzz test (http_request.cpp), as due to the new approach we have these issues with it:

    • replySent is set to true from test and it will fail in WriteReply() as assert(!replySent && req);.
    • even we set the above to false, later during GetPeer() as the connection is null (we can see in the fuzz test checking later assert(service.ToStringAddrPort() == "[::]:0");) the assignment of GetPeer().ToStringAddr(); will result into <incomplete type> and it will error when throwing the runtime exception.
    • but even we solve the above issues going around the code, now as in the HTTPRequest constructor we are raising the runtime exception, as there’s no call into http_request_cb (because there’s no InitHTTPServer) nothing is catching the error and the test will fail unless we catch the error but then we won’t be able to perform all the other checks the fuzz test does.
  75. vasild commented at 9:50 am on March 30, 2023: contributor

    Since the replySent parameter of the HTTPRequest constructor is used only from the tests, I would suggest to remove it and adjust the test as needed. I believe the tests should be testing the same API that is used by the real code, otherwise they may end up testing scenarios that never happen in real code and missing real code scenarios.

    Furthermore that replySent = true from the test is a hack to workaround another problem. It is actually a lie - the reply has not been sent. The problem being workedaround is that the WriteReply() will be called from the HTTPRequest destructor from the test function, after ev*free() functions have freed the buffers and it will crash due to use-after-free. A convenient way to call the destructor before ev*free() is to put it in a narrower scope:

    0{
    1    HTTPRequest http_request;
    2    ...
    3}
    4evbuffer_free(evbuf);
    5evhttp_request_free(evreq);
    

    That’s cool because we need to try/catch nevertheless. Here is it:

     0diff --git i/src/httpserver.cpp w/src/httpserver.cpp
     1index 99cf147866..3e97d621ce 100644
     2--- i/src/httpserver.cpp
     3+++ w/src/httpserver.cpp
     4@@ -541,17 +541,17 @@ void HTTPEvent::trigger(struct timeval* tv)
     5 {
     6     if (tv == nullptr)
     7         event_active(ev, 0, 0); // immediately trigger event in main thread
     8     else
     9         evtimer_add(ev, tv); // trigger after timeval passed
    10 }
    11-HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) :
    12+HTTPRequest::HTTPRequest(struct evhttp_request* _req) :
    13         req{_req},
    14         m_uri{evhttp_request_get_uri(req)},
    15         m_uri_parsed{m_uri != nullptr && m_uri[0] != '\0' ? evhttp_uri_parse(m_uri) : nullptr},
    16-        replySent{_replySent}
    17+        replySent{false}
    18 {
    19     if (m_uri_parsed == nullptr) {
    20         const char* err{"Invalid URI"};
    21         const std::string peerAddress{GetPeer().ToStringAddr()};
    22         WriteReply(HTTP_BAD_REQUEST, err);
    23         throw std::runtime_error(
    24diff --git i/src/httpserver.h w/src/httpserver.h
    25index 00b22bdadc..56a4d89d22 100644
    26--- i/src/httpserver.h
    27+++ w/src/httpserver.h
    28@@ -62,13 +62,13 @@ private:
    29     const char* const m_uri;
    30     /// The parsed URI, as returned by `evhttp_uri_parse()`. Will never be `nullptr`.
    31     evhttp_uri* const m_uri_parsed;
    32     bool replySent;
    33 
    34 public:
    35-    explicit HTTPRequest(struct evhttp_request* req, bool replySent = false);
    36+    explicit HTTPRequest(struct evhttp_request* req);
    37     ~HTTPRequest();
    38 
    39     enum RequestMethod {
    40         UNKNOWN,
    41         GET,
    42         POST,
    43diff --git i/src/test/fuzz/http_request.cpp w/src/test/fuzz/http_request.cpp
    44index 9928c4a1ab..258de7c4cd 100644
    45--- i/src/test/fuzz/http_request.cpp
    46+++ w/src/test/fuzz/http_request.cpp
    47@@ -44,23 +44,26 @@ FUZZ_TARGET(http_request)
    48         evhttp_parse_firstline_(evreq, evbuf) != 1 || evhttp_parse_headers_(evreq, evbuf) != 1) {
    49         evbuffer_free(evbuf);
    50         evhttp_request_free(evreq);
    51         return;
    52     }
    53 
    54-    HTTPRequest http_request{evreq, true};
    55-    const HTTPRequest::RequestMethod request_method = http_request.GetRequestMethod();
    56-    (void)RequestMethodString(request_method);
    57-    (void)http_request.GetURI();
    58-    (void)http_request.GetHeader("Host");
    59-    const std::string header = fuzzed_data_provider.ConsumeRandomLengthString(16);
    60-    (void)http_request.GetHeader(header);
    61-    (void)http_request.WriteHeader(header, fuzzed_data_provider.ConsumeRandomLengthString(16));
    62-    (void)http_request.GetHeader(header);
    63-    const std::string body = http_request.ReadBody();
    64-    assert(body.empty());
    65-    const CService service = http_request.GetPeer();
    66-    assert(service.ToStringAddrPort() == "[::]:0");
    67+    try {
    68+        HTTPRequest http_request{evreq};
    69+        const HTTPRequest::RequestMethod request_method = http_request.GetRequestMethod();
    70+        (void)RequestMethodString(request_method);
    71+        (void)http_request.GetURI();
    72+        (void)http_request.GetHeader("Host");
    73+        const std::string header = fuzzed_data_provider.ConsumeRandomLengthString(16);
    74+        (void)http_request.GetHeader(header);
    75+        (void)http_request.WriteHeader(header, fuzzed_data_provider.ConsumeRandomLengthString(16));
    76+        (void)http_request.GetHeader(header);
    77+        const std::string body = http_request.ReadBody();
    78+        assert(body.empty());
    79+        const CService service = http_request.GetPeer();
    80+        assert(service.ToStringAddrPort() == "[::]:0");
    81+    } catch (const std::runtime_error&) {
    82+    }
    83 
    84     evbuffer_free(evbuf);
    85     evhttp_request_free(evreq);
    86 }
    

    … but then we won’t be able to perform all the other checks the fuzz test does.

    I think that is fine because if the input is such that HTTPRequest constructor throws, then we need not to test those because they will never happen in the real code either.

  76. pablomartin4btc commented at 4:55 pm on March 30, 2023: member

    I think that is fine because if the input is such that HTTPRequest constructor throws, then we need not to test those because they will never happen in the real code either.

    Thanks @vasild, it makes sense what you are saying, I’m still playing with some options and will get back to you soon.

     0$ FUZZ=http_request ./src/test/fuzz/fuzz ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3
     1INFO: Seed: 2298450543
     2INFO: Loaded 1 modules   (250057 inline 8-bit counters): 250057 [0x565010b53868, 0x565010b90931), 
     3INFO: Loaded 1 PC tables (250057 PCs): 250057 [0x565010b90938,0x565010f615c8), 
     4./src/test/fuzz/fuzz: Running 1 inputs 1 time(s) each.
     5Running: ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3
     6Executed ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3 in 2 ms
     7***
     8*** NOTE: fuzzing was not performed, you have only
     9***       executed the target code on a fixed set of inputs.
    10***
    
  77. pablomartin4btc force-pushed on Mar 31, 2023
  78. pablomartin4btc commented at 5:39 am on March 31, 2023: member

    Updated changes:

    • Fixed fuzz test failing.

    Since the replySent parameter of the HTTPRequest constructor is used only from the tests, I would suggest to remove it and adjust the test as needed. @vasild I left open the possibility of passing the replySent arg as true, this way we can “build” a “safer” “invalid” request as @stickies-v mentioned above, so we can test the almost entire protocol of HTTPRequest if that was originally the intention of this fuzz. Then, as part of the above, if replySent is passed as true, the constructor won’t call the libevent functions in there but as m_uri_parsed is null (in a real use case we won’t see this as the exception will be thrown), we need to guard the free of m_uri_parsed at the httpserver destructor otherwise produces a segfault.

    I’ve checked another option that would be to create a mock HTTPRequest but I had to create another constructor to avoid calling the default and I preferred not to touch the HTTPRequest class. I think leaving the replySent arg to be optionally passed on the constructor and the fuzz test behaviour as the original commit would be ok for now unless there are other objections.

  79. pablomartin4btc force-pushed on Apr 4, 2023
  80. pablomartin4btc force-pushed on Apr 4, 2023
  81. in src/httpserver.cpp:591 in 4f68c30b2d outdated
    587@@ -563,8 +588,7 @@ std::pair<bool, std::string> HTTPRequest::GetHeader(const std::string& hdr) cons
    588 std::string HTTPRequest::ReadBody()
    589 {
    590     struct evbuffer* buf = evhttp_request_get_input_buffer(req);
    591-    if (!buf)
    592-        return "";
    593+    if (!buf) return "";
    


    vasild commented at 5:16 pm on April 7, 2023:
    Remove this unrelated whitespace change.

    pablomartin4btc commented at 1:36 pm on April 10, 2023:
    yes, that was something I touched while I was verifying the req class member.
  82. in src/test/httpserver_tests.cpp:21 in 4f68c30b2d outdated
    17-BOOST_AUTO_TEST_CASE(test_query_parameters)
    18+std::optional<std::string> GetParameterFromURI(const std::string_view uri, const std::string& key){
    19+    auto uri_parsed{evhttp_uri_parse(uri.data())};
    20+    if (uri_parsed == nullptr) return std::nullopt;
    21+    auto param_value{GetQueryParameterFromUri(*uri_parsed, key)};
    22+    if (uri_parsed) evhttp_uri_free(uri_parsed);
    


    vasild commented at 5:19 pm on April 7, 2023:
    This will always be true because if uri_parsed is nullptr we would have returned 2 lines above.

    pablomartin4btc commented at 1:38 pm on April 10, 2023:
    I’ll change it. Thanks.
  83. vasild approved
  84. vasild commented at 5:24 pm on April 7, 2023: contributor

    ACK 4f68c30b2d52f39c7c7465d49ba41c2e70f486ba

    Would be nice to split the actual fix from the other changes. I guess it would be easier for other reviewers too.

    I would have liked to remove the replySent argument, but that’s secondary to this PR which fixes a serious bug. Let’s get the bug fixed and if I feel strong enough about it, then maybe I will open a followup to remove that parameter.

  85. DrahtBot requested review from stickies-v on Apr 7, 2023
  86. pablomartin4btc commented at 10:29 am on April 11, 2023: member

    Would be nice to split the actual fix from the other changes. I guess it would be easier for other reviewers too.

    I’m working on it.

    I would have liked to remove the replySent argument, but that’s secondary to this PR which fixes a serious bug. Let’s get the bug fixed and if I feel strong enough about it, then maybe I will open a followup to remove that parameter.

    I agree, later on a follow-up, I’d try to investigate if evhttp_request* req can be mocked somehow so the fuzz can go thru all the buffering behaviour and other stuff once the replySent argument is removed.

  87. vasild commented at 12:42 pm on April 11, 2023: contributor

    I’d try to investigate if evhttp_request* req can be mocked somehow so the fuzz can go thru all the buffering behaviour and other stuff once the replySent argument is removed.

    The fuzz will go through this:

    https://github.com/bitcoin/bitcoin/blob/53eb4b7a212db7563291509ff4e53327440a9df2/src/test/fuzz/http_request.cpp#L51-L62

    if the HTTPRequest object is created successfully (once replySent is removed and the test does try/catch). If the input is such that HTTPRequest:HTTPRequest() throws, then it does not have to test the above because it will never be executed in the real code.

  88. pablomartin4btc commented at 1:07 pm on April 12, 2023: member

    if the HTTPRequest object is created successfully (once replySent is removed and the test does try/catch). If the input is such that HTTPRequest:HTTPRequest() throws, then it does not have to test the above because it will never be executed in the real code.

    I’m sorry, I said nonsense, ofc that code won’t be executed because the uri_parsed will be null and so on, my concern is that we’ll lose the test coverage by the current fuzz, even it’s not a real case scenario, the test is performing ’exhaustive’ calls to specific protocol (not sure how ’exhaustive’ because, on the other hand, the internal buffer in the HTTPRequest instance is empty while testing).

  89. pablomartin4btc force-pushed on Apr 14, 2023
  90. pablomartin4btc force-pushed on Apr 14, 2023
  91. pablomartin4btc force-pushed on Apr 14, 2023
  92. pablomartin4btc force-pushed on Apr 14, 2023
  93. pablomartin4btc commented at 1:30 am on April 14, 2023: member

    Updated changes:

    • Split commit in 2 (as suggested by @stickies-v and @vasild): one for the fix and one for the logic improvement.
    • Added some comments to each commit (as suggested by @stickies-v).
  94. in src/httpserver.cpp:683 in 0f9cba3f07 outdated
    679 
    680 std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key)
    681 {
    682     evhttp_uri* uri_parsed{evhttp_uri_parse(uri)};
    683+    if (!uri_parsed) {
    684+        throw std::runtime_error(strprintf("Invalid uri detected while parsing \"%s\"", uri));
    


    stickies-v commented at 9:59 am on April 14, 2023:

    uri is unhandled user input so including that in the error seems dangerous to me, would just change it to

    0        throw std::runtime_error("URI parsing failed, it likely contained RFC 3986 invalid characters);
    

    vasild commented at 4:54 pm on April 14, 2023:
    Or use SanitizeString(uri).

    stickies-v commented at 4:58 pm on April 14, 2023:
    That wouldn’t prevent someone from passing really long URIs, causing e.g. bloated logs. We’re already (debug) logging the full URI in http_request_cb. Adding it here introduces risks and little value imo.

    pablomartin4btc commented at 5:33 pm on April 14, 2023:
    Yeah, it’s a clearer message. Thanks.
  95. in src/httpserver.cpp:676 in 0f9cba3f07 outdated
    672+        return GetQueryParameterFromUri(uri, key);
    673+    } catch (const std::runtime_error& e) {
    674+        LogPrint(BCLog::HTTP, "%s\n", e.what());
    675+        // add additional information and rethrow the exception as we can't send a reply from the worker
    676+        throw std::runtime_error(strprintf("Error occurred trying to get query param key \"%s\": ", key) + e.what());
    677+    }
    


    stickies-v commented at 4:19 pm on April 14, 2023:

    What’s the benefit of re-handling this error? The error message from GetQueryParameterFromUri is enough (or could be updated there to reference query param fetching, but it doesn’t really have anything to do with the URI being invalid, so I’d prefer leaving it out).

     0diff --git a/src/httpserver.cpp b/src/httpserver.cpp
     1index eb1f0bd93..3f0d82497 100644
     2--- a/src/httpserver.cpp
     3+++ b/src/httpserver.cpp
     4@@ -667,13 +667,7 @@ std::optional<std::string> HTTPRequest::GetQueryParameter(const std::string& key
     5 {
     6     const char* uri{evhttp_request_get_uri(req)};
     7 
     8-    try {
     9-        return GetQueryParameterFromUri(uri, key);
    10-    } catch (const std::runtime_error& e) {
    11-        LogPrint(BCLog::HTTP, "%s\n", e.what());
    12-        // add additional information and rethrow the exception as we can't send a reply from the worker
    13-        throw std::runtime_error(strprintf("Error occurred trying to get query param key \"%s\": ", key) + e.what());
    14-    }
    15+    return GetQueryParameterFromUri(uri, key);
    16 }
    17 
    18 std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key)
    19diff --git a/test/functional/interface_rest.py b/test/functional/interface_rest.py
    20index fa6193039..fb647a62f 100755
    21--- a/test/functional/interface_rest.py
    22+++ b/test/functional/interface_rest.py
    23@@ -282,7 +282,7 @@ class RESTTest (BitcoinTestFramework):
    24 
    25         # Check invalid uri (% symbol at the end of the request)
    26         resp = self.test_rest_request(f"/headers/{bb_hash}%", ret_type=RetType.OBJ, status=400)
    27-        assert_equal(resp.read().decode('utf-8').rstrip(), f"Error occurred trying to get query param key \"count\": Invalid uri detected while parsing \"/rest/headers/{bb_hash}%.json\"")
    28+        assert_equal(resp.read().decode('utf-8').rstrip(), f"Invalid uri detected while parsing \"/rest/headers/{bb_hash}%.json\"")
    29 
    30         # Compare with normal RPC block response
    31         rpc_block_json = self.nodes[0].getblock(bb_hash)
    

    pablomartin4btc commented at 5:42 pm on April 14, 2023:
    Yeah, the intention was to add the param key on top of the previous error, adding more details at this level but that info is also available from the caller of GetQueryParameterFromUri. Ok, I’ll leave it out.
  96. in src/test/httpserver_tests.cpp:38 in 0f9cba3f07 outdated
    33@@ -34,5 +34,9 @@ BOOST_AUTO_TEST_CASE(test_query_parameters)
    34     // Invalid query string syntax is the same as not having parameters
    35     uri = "/rest/endpoint/someresource.json&p1=v1&p2=v2";
    36     BOOST_CHECK(!GetQueryParameterFromUri(uri.c_str(), "p1").has_value());
    37+
    38+    // Invalid query string with invalid will raise a runtime error no matter how many parameters passed or their syntax
    


    stickies-v commented at 4:23 pm on April 14, 2023:

    I’d rephrase to the below - any invalid character in the URI will trigger this, not just in the query string.

    0    // URI with invalid characters (%) raises a runtime error regardless of which query parameter is queried
    
  97. stickies-v commented at 4:41 pm on April 14, 2023: contributor

    Left some comments on the bugfix commit. I’d carve that out in a separate PR to get it merged for v25.

    Also, suggested rephrasing for the commit message:

     0bugfix: rest: avoid segfault for invalid URI
     1
     2`evhttp_uri_parse` can return a nullptr, for example when the URI 
     3contains invalid characters (e.g. "%"). 
     4`GetQueryParameterFromUri` passes the output of `evhttp_uri_parse`
     5straight into `evhttp_uri_get_query`, which means that anyone calling
     6a REST endpoint in which query parameters are used (e.g. `rest_headers`)
     7can cause a segfault.
     8
     9This bugfix is designed to be minimal and without additional behaviour change.
    10Follow-up work should be done to resolve this in a more general and robust way,
    11so not every endpoint has to handle it individually.
    
  98. vasild approved
  99. vasild commented at 4:59 pm on April 14, 2023: contributor

    ACK 9b9c48cf81549c7eacfbfb78534b7eaba46ecb34 modulo #27253 (review)

    I agree with suggestions from @stickies-v, would be happy to re-review.

  100. DrahtBot requested review from stickies-v on Apr 14, 2023
  101. pablomartin4btc commented at 5:34 pm on April 14, 2023: member

    Left some comments on the bugfix commit. I’d carve that out in a separate PR to get it merged for v25.

    I’ll create a new PR linked to this.

    Also, suggested rephrasing for the commit message:

    Sure, I’ll take your suggestion, thanks.

  102. pablomartin4btc commented at 5:35 pm on April 14, 2023: member

    I agree with suggestions from @stickies-v, would be happy to re-review.

    I’ll let you both know ones the new PR is ready, very soon. Thanks.

  103. pablomartin4btc commented at 11:52 pm on April 14, 2023: member

    Updated changes:

    • Created new PR #27468 with a minimal bugfix and single commit in order to merge it faster for 25.0 release, as suggested above by @stickies-v and supported by @vasild.
    • The new PR contains also the suggested changes from @stickies-v.
    • I’ll create another PR for the 2nd commit as a follow-up.
  104. fanquake removed this from the milestone 25.0 on Apr 15, 2023
  105. fanquake referenced this in commit e054b7390c on Apr 17, 2023
  106. sidhujag referenced this in commit 200330a67d on Apr 17, 2023
  107. DrahtBot added the label Needs rebase on Apr 17, 2023
  108. pablomartin4btc force-pushed on Apr 23, 2023
  109. pablomartin4btc commented at 1:59 pm on April 23, 2023: member

    Updated changes:

    • First commit of this PR was already merged into 25.0 release branch (#27468).
    • Rebased & incorporated some nits & suggestions from latest reviews including above PR.
    • Updated original description of this PR.
  110. DrahtBot added the label CI failed on Apr 23, 2023
  111. DrahtBot removed the label Needs rebase on Apr 23, 2023
  112. maflcko commented at 8:42 am on April 24, 2023: member
    Fuzzing segfaults, see CI
  113. pablomartin4btc commented at 10:30 am on April 24, 2023: member

    Fuzzing segfaults, see CI

    Saw it yesterday, working on it, thanks.

  114. pablomartin4btc force-pushed on Apr 29, 2023
  115. pablomartin4btc force-pushed on Apr 29, 2023
  116. DrahtBot removed the label CI failed on Apr 29, 2023
  117. pablomartin4btc commented at 7:33 am on April 29, 2023: member
    • Fixedhttp_requestfuzz failure and added some relevant comments in the code.
  118. pablomartin4btc force-pushed on Apr 29, 2023
  119. pablomartin4btc commented at 11:08 am on April 29, 2023: member
    • Updated 2nd commit message.
  120. stickies-v commented at 11:32 am on May 11, 2023: contributor
    With #27468 merged, I don’t really see the need for 8ef940d39bfa7bc4ba867b70495b68c507297694 anymore? Also the PR title and description need updating, this is not a bugfix PR anymore.
  121. pablomartin4btc force-pushed on May 11, 2023
  122. pablomartin4btc force-pushed on May 11, 2023
  123. pablomartin4btc renamed this:
    httpserver, rest: fix segmentation fault on evhttp_uri_get_query
    httpserver, rest: segfault bugfix (#27468) logic improvement
    on May 11, 2023
  124. pablomartin4btc renamed this:
    httpserver, rest: segfault bugfix (#27468) logic improvement
    httpserver, rest: improving URI validation
    on May 11, 2023
  125. pablomartin4btc force-pushed on May 11, 2023
  126. pablomartin4btc commented at 12:47 pm on May 11, 2023: member

    With #27468 merged, I don’t really see the need for 8ef940d anymore? Also the PR title and description need updating, this is not a bugfix PR anymore.

    Updated them, thanks!

  127. DrahtBot added the label CI failed on May 11, 2023
  128. pablomartin4btc force-pushed on May 14, 2023
  129. DrahtBot removed the label CI failed on May 14, 2023
  130. in src/httpserver.cpp:674 in c72967f48a outdated
    668@@ -643,9 +669,9 @@ CService HTTPRequest::GetPeer() const
    669     return peer;
    670 }
    671 
    672-std::string HTTPRequest::GetURI() const
    673+const char* HTTPRequest::GetURI() const
    674 {
    675-    return evhttp_request_get_uri(req);
    676+    return m_uri;
    


    stickies-v commented at 2:44 pm on May 17, 2023:

    Since docs guarantee that GetURI() returns non-nullptr, would add an Assert here

    0    return Assert(m_uri);
    

    pablomartin4btc commented at 2:26 pm on May 27, 2023:
    Ok, done.
  131. in src/httpserver.h:80 in c72967f48a outdated
    76@@ -71,9 +77,9 @@ class HTTPRequest
    77         PUT
    78     };
    79 
    80-    /** Get requested URI.
    81+    /** Get requested URI. Will always return non-nullptr.
    


    stickies-v commented at 10:39 am on May 18, 2023:

    nit: clearer language

    0    /** Get requested URI. Will never return nullptr.
    

    pablomartin4btc commented at 2:26 pm on May 27, 2023:
    Yeah and also it’ll be more consistent with the comments above in the private declaration section. https://github.com/bitcoin/bitcoin/blob/c72967f48ab9eb02b8dc9231fdcc6ee85d66b7b9/src/httpserver.h#L60-L63
  132. in src/httpserver.cpp:550 in c72967f48a outdated
    543@@ -536,18 +544,36 @@ void HTTPEvent::trigger(struct timeval* tv)
    544     else
    545         evtimer_add(ev, tv); // trigger after timeval passed
    546 }
    547-HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) : req(_req), replySent(_replySent)
    548+HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) :
    549+        req{_req},
    550+        m_uri{req != nullptr ? evhttp_request_get_uri(req) : nullptr},
    551+        m_uri_parsed{m_uri != nullptr && m_uri[0] != '\0' ? evhttp_uri_parse(m_uri) : nullptr},
    


    stickies-v commented at 11:05 am on May 18, 2023:
    Is there a reason we need to check for m_uri not being empty (&& m_uri[0] != '\0')? Doesn’t seem to be a documented requirement for evhttp_uri_parse.

    pablomartin4btc commented at 2:25 pm on May 27, 2023:
    Well, I think that an empty string wouldn’t be a valid URI, at least the scheme component (the first part of it which is followed by a “:”, which is again followed by the hier-part component (which may be empty) must be present according to the RFC. I thought it was perhaps validated inside the evhttp_uri_parse but I couldn’t find it there. I’m happy to remove it if you think it’s not worth it.

    stickies-v commented at 12:03 pm on May 30, 2023:

    We don’t need m_uri to be a valid URI, that’s what evhttp_uri_parse is checking, we just need it to not behave unexpectedly (eg. segfault) when passed to evhttp_uri_parse. “a” also would not be a valid URI, but we’re not checking that here either, that would just be duplicating what evhttp_uri_parse is already doing.

    So unless there’s another reason to not pass an empty char array to evhttp_uri_parse, I’d leave it out yeah.


    pablomartin4btc commented at 3:59 pm on May 30, 2023:
    ok

    pablomartin4btc commented at 5:16 pm on June 17, 2023:
    done.
  133. in src/httpserver.h:82 in c72967f48a outdated
    76@@ -71,9 +77,9 @@ class HTTPRequest
    77         PUT
    78     };
    79 
    80-    /** Get requested URI.
    81+    /** Get requested URI. Will always return non-nullptr.
    82      */
    83-    std::string GetURI() const;
    84+    const char* GetURI() const;
    


    stickies-v commented at 11:12 am on May 18, 2023:
    What’s the rationale for this change? No callsite seems to be preferring a C-style string, and returning a std::string would also make the non-nullptr promise explicit.

    pablomartin4btc commented at 2:25 pm on May 27, 2023:
    I can’t remember tbh, try to find it in the history with no success, not sure if I did it due to matching the consumers(?) or @vasild suggested it at some point…
  134. stickies-v commented at 5:43 pm on May 18, 2023: contributor

    I would have liked to remove the replySent argument, but that’s secondary to this PR which fixes a serious bug.

    Me too, especially since this is now no longer a bugfix PR. I think the workarounds that need to introduced in this PR make things unnecessarily messy, and updating the fuzzer as e.g. per your suggestion here look like a reasonable approach for this PR to me.

  135. pablomartin4btc force-pushed on May 27, 2023
  136. pablomartin4btc commented at 2:51 pm on May 27, 2023: member

    Updates:

    • Incorporated most of @stickies-v’s suggestions.
    • I’ll remove the replySent argument from the httpserver constructor (as @vasild suggested and @stickies-v agrees with), and update the fuzz test accordingly. I can squash it later into 1 if you like.
  137. DrahtBot added the label CI failed on May 27, 2023
  138. pablomartin4btc force-pushed on May 28, 2023
  139. DrahtBot removed the label CI failed on May 28, 2023
  140. DrahtBot added the label CI failed on May 29, 2023
  141. pablomartin4btc force-pushed on May 29, 2023
  142. pablomartin4btc commented at 5:44 pm on May 29, 2023: member

    Updates:

    • Working on fuzzer failure.
  143. in src/test/fuzz/http_request.cpp:66 in ca8941100d outdated
    76+        assert(service.ToStringAddrPort() == "[::]:0");
    77 
    78-    evbuffer_free(evbuf);
    79-    evhttp_request_free(evreq);
    80+        evbuffer_free(evbuf);
    81+        evhttp_request_free(evreq);
    


    stickies-v commented at 12:15 pm on May 30, 2023:
    Don’t we need to clean this up regardless of whether an exception was thrown?

    pablomartin4btc commented at 12:44 pm on May 30, 2023:
    Let me check again when I do revert the order of the commits, when I tested before those were needed as the fuzz was complaining about memory leaks.

    pablomartin4btc commented at 8:59 am on June 5, 2023:
    Those are needed, I’ve checked the test code line by line. I’ll do another check while I fix the current issue/ failure. Thanks!
  144. stickies-v commented at 12:15 pm on May 30, 2023: contributor

    I’m in favour of doing the replySent refactoring in a separate commit, but it makes more sense to have that be the first commit imo, to avoid adding logic and then removing it in the very next commit.

    Edit: and please also include the rationale for removing replySent from the constructor in the commit message?

  145. pablomartin4btc commented at 12:41 pm on May 30, 2023: member

    I’m in favour of doing the replySent refactoring in a separate commit, but it makes more sense to have that be the first commit imo, to avoid adding logic and then removing it in the very next commit.

    Yeah, true, I agree… let me revert the order of the commits.

  146. pablomartin4btc commented at 8:57 am on June 5, 2023: member

    Updates:

    • I’ve found the issue with thefuzztest I’ll push a fix with the commits in reverse order.
  147. pablomartin4btc force-pushed on Jun 5, 2023
  148. pablomartin4btc force-pushed on Jun 5, 2023
  149. pablomartin4btc force-pushed on Jun 5, 2023
  150. pablomartin4btc commented at 11:32 am on June 5, 2023: member

    Updates:

    • It seems I broke some functional tests with the WriteReply function “fix”, ha, working on it.
  151. pablomartin4btc force-pushed on Jun 5, 2023
  152. pablomartin4btc force-pushed on Jun 6, 2023
  153. pablomartin4btc force-pushed on Jun 6, 2023
  154. pablomartin4btc force-pushed on Jun 6, 2023
  155. pablomartin4btc force-pushed on Jun 9, 2023
  156. pablomartin4btc force-pushed on Jun 9, 2023
  157. pablomartin4btc force-pushed on Jun 9, 2023
  158. pablomartin4btc force-pushed on Jun 9, 2023
  159. pablomartin4btc force-pushed on Jun 13, 2023
  160. DrahtBot removed the label CI failed on Jun 13, 2023
  161. pablomartin4btc force-pushed on Jun 13, 2023
  162. pablomartin4btc force-pushed on Jun 13, 2023
  163. DrahtBot added the label CI failed on Jun 13, 2023
  164. pablomartin4btc force-pushed on Jun 13, 2023
  165. pablomartin4btc force-pushed on Jun 14, 2023
  166. DrahtBot removed the label CI failed on Jun 14, 2023
  167. DrahtBot added the label CI failed on Jun 16, 2023
  168. DrahtBot removed the label CI failed on Jun 17, 2023
  169. pablomartin4btc commented at 5:37 pm on June 17, 2023: member

    Updates:

    • Reverted order of the commits (now the very first is the replySentarg removal then secondly the URI validation enhancement approach) as suggested by @stickies-v.
    • replySentarg removed fromhttprequestobj constructor in first commit (rationale of the removal added into commits message). In order to do the removal, first I had to fix the underlying issue onWriteReplywhich was corrected also in the first commit (I didn’t want to split it in a different commit since it was going to be like an isolated change and in order to test that I would have to test thelibeventand I’m not too sure if what we want to do here regardin the scope of this PR and so on, happy to extend it and follow up in a different PR).
    • Went thru previous comments, reviews and suggestions to see if I’ve missed any nit or anything since this PR changed quite a bit, found this: simple getter best to be implemented in the header by @stickies-v, which I’ve incorporated in the 2nd, commit.
  170. in src/httpserver.cpp:549 in 68bc89bf9d outdated
    545+        req{_req},
    546+        m_uri{req != nullptr ? evhttp_request_get_uri(req) : nullptr},
    547+        m_uri_parsed{m_uri != nullptr ? evhttp_uri_parse(m_uri) : nullptr}
    548 {
    549+    if (m_uri_parsed == nullptr) {
    550+        const char* err{"URI parsing failed, it likely contained RFC 3986 invalid characters"};
    


    stickies-v commented at 1:54 pm on June 26, 2023:

    nit: no need to use a C-style string here I think

    0        const std::string err{"URI parsing failed, it likely contained RFC 3986 invalid characters"};
    
  171. in src/httpserver.cpp:20 in 68bc89bf9d outdated
    16@@ -17,6 +17,7 @@
    17 #include <rpc/protocol.h> // For HTTP status codes
    18 #include <shutdown.h>
    19 #include <sync.h>
    20+#include <util/check.h>
    


    stickies-v commented at 1:48 pm on June 27, 2023:
    Do we need this?

    pablomartin4btc commented at 3:53 pm on June 30, 2023:
    Not anymore, there was an Assume() somewhere in HTTPRequest::GetQueryParameter checking for m_uri_parsed at some point. Thanks for checking, I’ll remove it.
  172. in src/httpserver.cpp:243 in 68bc89bf9d outdated
    239+    try {
    240+        hreq = std::make_unique<HTTPRequest>(req);
    241+    } catch (const std::runtime_error& e) {
    242+        LogPrint(BCLog::HTTP, "%s\n", e.what());
    243+        return;
    244+    }
    


    stickies-v commented at 10:50 am on June 28, 2023:

    I don’t think it’s great that we’re not doing the address-based checks before anything else. Since we’re now adding validation into the constructor, how about we do it consistently and do it for all validation, so we can also control what we validate first? I’m still feeling a bit uneasy about coupling the construction of HTTPRequest with validation, but if we’re doing it we might as well do it more consistently? Curious to hear @vasild’s thoughts on this too.

     0diff --git a/src/httpserver.cpp b/src/httpserver.cpp
     1index f7dc344f4..10f8ec24d 100644
     2--- a/src/httpserver.cpp
     3+++ b/src/httpserver.cpp
     4@@ -235,6 +235,7 @@ static void http_request_cb(struct evhttp_request* req, void* arg)
     5     }
     6     std::unique_ptr<HTTPRequest> hreq;
     7 
     8+    // Initializing the HTTPRequest also performs basic validation such as address, method and URI checks
     9     try {
    10         hreq = std::make_unique<HTTPRequest>(req);
    11     } catch (const std::runtime_error& e) {
    12@@ -242,22 +243,6 @@ static void http_request_cb(struct evhttp_request* req, void* arg)
    13         return;
    14     }
    15 
    16-    // Early address-based allow check
    17-    if (!ClientAllowed(hreq->GetPeer())) {
    18-        LogPrint(BCLog::HTTP, "HTTP request from %s rejected: Client network is not allowed RPC access\n",
    19-                 hreq->GetPeer().ToStringAddrPort());
    20-        hreq->WriteReply(HTTP_FORBIDDEN);
    21-        return;
    22-    }
    23-
    24-    // Early reject unknown HTTP methods
    25-    if (hreq->GetRequestMethod() == HTTPRequest::UNKNOWN) {
    26-        LogPrint(BCLog::HTTP, "HTTP request from %s rejected: Unknown HTTP request method\n",
    27-                 hreq->GetPeer().ToStringAddrPort());
    28-        hreq->WriteReply(HTTP_BAD_METHOD);
    29-        return;
    30-    }
    31-
    32     const std::string strURI{hreq->GetURI()};
    33     LogPrint(BCLog::HTTP, "Received a %s request for %s from %s\n",
    34              RequestMethodString(hreq->GetRequestMethod()), SanitizeString(strURI, SAFE_CHARS_URI).substr(0, 100), hreq->GetPeer().ToStringAddrPort());
    35@@ -545,13 +530,7 @@ HTTPRequest::HTTPRequest(struct evhttp_request* _req) :
    36         m_uri{req != nullptr ? evhttp_request_get_uri(req) : nullptr},
    37         m_uri_parsed{m_uri != nullptr ? evhttp_uri_parse(m_uri) : nullptr}
    38 {
    39-    if (m_uri_parsed == nullptr) {
    40-        const char* err{"URI parsing failed, it likely contained RFC 3986 invalid characters"};
    41-        const std::string peerAddress{GetPeer().ToStringAddr()};
    42-        WriteReply(HTTP_BAD_REQUEST, err);
    43-        throw std::runtime_error(
    44-            strprintf("HTTP request from %s rejected: %s", peerAddress, err));
    45-    }
    46+    Validate();
    47 }
    48 
    49 HTTPRequest::~HTTPRequest()
    50@@ -565,6 +544,29 @@ HTTPRequest::~HTTPRequest()
    51     evhttp_uri_free(m_uri_parsed);
    52 }
    53 
    54+void HTTPRequest::Validate()
    55+{
    56+    const std::string rejected{strprintf("HTTP request from %s rejected: ", GetPeer().ToStringAddrPort())};
    57+    // Early address-based allow check
    58+    if (!ClientAllowed(GetPeer())) {
    59+        WriteReply(HTTP_FORBIDDEN);
    60+        throw std::runtime_error(rejected + "Client network is not allowed RPC access");
    61+    }
    62+
    63+    // Early reject unknown HTTP methods
    64+    if (GetRequestMethod() == HTTPRequest::UNKNOWN) {
    65+        WriteReply(HTTP_BAD_METHOD);
    66+        throw std::runtime_error(rejected + "Unknown HTTP request method");
    67+    }
    68+
    69+    // Ensure valid URI
    70+    if (m_uri_parsed == nullptr) {
    71+        const std::string err{"URI parsing failed, it likely contained RFC 3986 invalid characters"};
    72+        WriteReply(HTTP_BAD_REQUEST, err);
    73+        throw std::runtime_error(rejected + err);
    74+    }
    75+}
    76+
    77 std::pair<bool, std::string> HTTPRequest::GetHeader(const std::string& hdr) const
    78 {
    79     const struct evkeyvalq* headers = evhttp_request_get_input_headers(req);
    80diff --git a/src/httpserver.h b/src/httpserver.h
    81index 77a249a6a..3c5af5b0b 100644
    82--- a/src/httpserver.h
    83+++ b/src/httpserver.h
    84@@ -64,6 +64,9 @@ private:
    85     /// The parsed URI, as returned by `evhttp_uri_parse()`. Will never be `nullptr`.
    86     evhttp_uri* const m_uri_parsed;
    87 
    88+    // Perform early validity checks. Send a response and throw std::runtime_error if invalid.
    89+    void Validate();
    90+
    91 public:
    92     explicit HTTPRequest(struct evhttp_request* req);
    93     ~HTTPRequest();
    

    pablomartin4btc commented at 6:02 pm on June 30, 2023:
    I understand your worries on coupling the validation within the constructor (separation of/ different concerns, error handling, flexibility, etc) but since those validations/ “early checks” also would impact into the state of the object itself it would make sense to me to put them together, consistently and cleaner. Happy to make the change if @vasild also agree with it.
  173. DrahtBot added the label Needs rebase on Jun 28, 2023
  174. pablomartin4btc force-pushed on Jul 3, 2023
  175. pablomartin4btc force-pushed on Jul 3, 2023
  176. pablomartin4btc force-pushed on Jul 3, 2023
  177. httpserver: removing replySent arg from constructor
    Removing what it was originally a hack/ workaround only for
    fuzz test 'http_request', no other client/ consumer was using it,
    to deal with an underlying buffering issue with WriteReply being
    called from the destructor, which was also fixed here not allowing
    buffering when conn is not valid.
    870b2c9ae5
  178. http, rest: URI validation logic improvement
    Non-functional optimisation: moving the logic up
    in the stack, specifically into the constructor
    of the HTTPRequest object, this way we catch the invalid uri earlier,
    removing the exception handling from the rest enpoints as we catch
    the exception raised from the constructor within the request
    call back function where we return an http bad request to consumers.
    31dd51c605
  179. pablomartin4btc force-pushed on Jul 3, 2023
  180. pablomartin4btc commented at 4:32 pm on July 3, 2023: member

    Updates:

    • Rebased
    • Implemented refactoring suggestion from @stickies-v, moving other http reject validations into the constructor and regrouping them under a Validate() function to make the entire validation more consistent.
  181. DrahtBot removed the label Needs rebase on Jul 3, 2023
  182. DrahtBot added the label CI failed on Jul 3, 2023
  183. pablomartin4btc commented at 5:48 pm on July 3, 2023: member

    Updates:

    • Checking fuzz test failure
  184. pablomartin4btc marked this as a draft on Jul 4, 2023
  185. pablomartin4btc commented at 4:53 pm on July 4, 2023: member
    After a chat with @stickies-v where we were discussing different approaches on this enhancement and other details regarding fuzz testing, libevent functionality and each commit intention, I’ve decided to put this onto draft. Firstly we would need to define the purpose of the httprequest object, do we want/ need the httprequest obj to exist even with an invalid request?, as @stickies-v raised his concerns before, it seems coupling the validation within the constructor perhaps is not the best approach after all. I’ll check the different alternatives of moving the validation out from the constructor and perhaps throwing the error later in the callback, also maybe I’ll split the commits in different PRs once I revisit the fuzz test and logic perform some libevent tests.
  186. DrahtBot commented at 9:22 am on October 5, 2023: contributor
    Are you still working on this?
  187. pablomartin4btc commented at 11:44 am on October 5, 2023: member

    Are you still working on this?

    I do

  188. DrahtBot added the label Needs rebase on Dec 14, 2023
  189. DrahtBot commented at 9:53 pm on December 14, 2023: contributor

    🐙 This pull request conflicts with the target branch and needs rebase.

  190. DrahtBot commented at 10:42 am on January 5, 2024: contributor

    There hasn’t been much activity lately and the patch still needs rebase. What is the status here?

    • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
    • Is it no longer relevant? ➡️ Please close.
    • Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
  191. fanquake commented at 2:23 pm on March 6, 2024: member
    Closing for now. @pablomartin4btc feel free to ping for a re-open when you are actively working on this again.
  192. fanquake closed this on Mar 6, 2024


github-metadata-mirror

This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-30 03:12 UTC

This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me