httpserver, rest: improving URI validation #27253
pull pablomartin4btc wants to merge 2 commits into bitcoin:master from pablomartin4btc:http-rest-fix-segfault changing 5 files +122 −99-
pablomartin4btc commented at 4:22 am on March 14, 2023: memberThis PR contains an isolated enhancement of the segfault bugfix #27468 (already merged), improving the way we handle the URI validation, which will be performed only once instead of on each query parameter of a rest service endpoint.
-
DrahtBot commented at 4:22 am on March 14, 2023: contributor
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.
Reviews
See the guideline for information on the review process.
Type Reviewers Concept ACK brunoerg Stale ACK stickies-v, vasild If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.
Conflicts
Reviewers, this pull request conflicts with the following ones:
- #28051 (Get rid of shutdown.cpp/shutdown.h, use SignalInterrupt directly by ryanofsky)
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.
-
pablomartin4btc force-pushed on Mar 14, 2023
-
pablomartin4btc force-pushed on Mar 14, 2023
-
pablomartin4btc force-pushed on Mar 14, 2023
-
pablomartin4btc force-pushed on Mar 14, 2023
-
pablomartin4btc force-pushed on Mar 14, 2023
-
pablomartin4btc force-pushed on Mar 14, 2023
-
fanquake added the label RPC/REST/ZMQ on Mar 14, 2023
-
fanquake requested review from stickies-v on Mar 14, 2023
-
stickies-v commented at 11:19 am on March 14, 2023: contributor
Concept ACK, nice catch.
As for the approach, I think there is a simpler one that requires no overhead for every endpoint using
GetQueryParameter()
. We could moveuri_parsed
to a privateHTTPRequest
member instead of a localGetQueryParameterFromUri
variable, and havehttp_request_cb
set it (for example after checking unknown HTTP methods) - or return an early 400 if null. Subsequently, we can assume thaturi_parsed
is valid and non-null, simplifying the rest code (and avoiding multipleevhttp_uri_parse
calls when accessing multiple query params). What do you think? -
brunoerg commented at 12:42 pm on March 14, 2023: contributorConcept ACK
-
pablomartin4btc commented at 9:14 pm on March 14, 2023: member
What do you think?
As @stickies-v explained to me outside this PR: “… by moving the uri parsing to http_request_cb we’re failing super early, and even when the query parameter would never be called by the endpoint… “, this makes more sense.
I’m working on this approach which I think it’s better as well.
-
pablomartin4btc force-pushed on Mar 16, 2023
-
in src/httpserver.cpp:679 in 02cbb58327 outdated
684- 685-std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key) 686-{ 687- evhttp_uri* uri_parsed{evhttp_uri_parse(uri)}; 688- const char* query{evhttp_uri_get_query(uri_parsed)}; 689+ if (!m_uri_parsed) { return std::nullopt; }
stickies-v commented at 5:46 pm on March 16, 2023:nit
0 if (!m_uri_parsed) return std::nullopt;
pablomartin4btc commented at 8:14 pm on March 16, 2023:thanks, corrected other similar 2.pablomartin4btc force-pushed on Mar 16, 2023pablomartin4btc commented at 5:59 pm on March 16, 2023: memberUpdated changes:
- move the validation/parsing logic up in the stack as soon as the request is initiated even before the work item thread is created
- updated unit and functional tests
in src/test/httpserver_tests.cpp:18 in 213554cb97 outdated
14@@ -13,26 +15,34 @@ BOOST_AUTO_TEST_CASE(test_query_parameters) 15 { 16 std::string uri {}; 17 18+ evhttp_request* evreq = evhttp_request_new(nullptr, nullptr);
stickies-v commented at 6:10 pm on March 16, 2023:See #24012 - I couldn’t get mocking anevhttp_request*
past the ASan test, and it looks like they’re failing again here too. That’s why in #24098 I introduced theGetQueryParameterFromUri()
helper function: it could easily be unit tested, and thenHTTPRequest::GetQueryParameter()
was just a simple wrapper left untested.
pablomartin4btc commented at 8:18 pm on March 16, 2023:If I bring it back I won’t be able to access to m_uri_parsed from the helper function, unless there’s another alternative, I’ll have to roll-back to the original approach of the PR.
stickies-v commented at 1:56 pm on March 17, 2023:What do you think about this approach? I think it’s quite minimal:
0diff --git a/src/httpserver.cpp b/src/httpserver.cpp 1index 942caa042..6a6176a78 100644 2--- a/src/httpserver.cpp 3+++ b/src/httpserver.cpp 4@@ -15,6 +15,7 @@ 5 #include <rpc/protocol.h> // For HTTP status codes 6 #include <shutdown.h> 7 #include <sync.h> 8+#include <util/check.h> 9 #include <util/strencodings.h> 10 #include <util/syscall_sandbox.h> 11 #include <util/system.h> 12@@ -253,6 +254,14 @@ static void http_request_cb(struct evhttp_request* req, void* arg) 13 return; 14 } 15 16+ // Early reject invalid URI 17+ if (!hreq->GetURIParsed()) { 18+ LogPrint(BCLog::HTTP, "HTTP request from %s rejected: Invalid URI\n", 19+ hreq->GetPeer().ToStringAddrPort()); 20+ hreq->WriteReply(HTTP_BAD_REQUEST, "Invalid URI"); 21+ return; 22+ } 23+ 24 LogPrint(BCLog::HTTP, "Received a %s request for %s from %s\n", 25 RequestMethodString(hreq->GetRequestMethod()), SanitizeString(hreq->GetURI(), SAFE_CHARS_URI).substr(0, 100), hreq->GetPeer().ToStringAddrPort()); 26 27@@ -535,18 +544,22 @@ void HTTPEvent::trigger(struct timeval* tv) 28 else 29 evtimer_add(ev, tv); // trigger after timeval passed 30 } 31-HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) : req(_req), replySent(_replySent) 32+HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) : 33+ req{_req}, 34+ m_uri_parsed{evhttp_uri_parse(evhttp_request_get_uri(req))}, 35+ replySent{_replySent} 36 { 37 } 38 39 HTTPRequest::~HTTPRequest() 40 { 41+ // evhttpd cleans up the request, as long as a reply was sent. 42 if (!replySent) { 43 // Keep track of whether reply was sent to avoid request leaks 44 LogPrintf("%s: Unhandled request\n", __func__); 45 WriteReply(HTTP_INTERNAL_SERVER_ERROR, "Unhandled request"); 46 } 47- // evhttpd cleans up the request, as long as a reply was sent. 48+ if(m_uri_parsed) evhttp_uri_free(m_uri_parsed); 49 } 50 51 std::pair<bool, std::string> HTTPRequest::GetHeader(const std::string& hdr) const 52@@ -665,14 +678,13 @@ HTTPRequest::RequestMethod HTTPRequest::GetRequestMethod() const 53 54 std::optional<std::string> HTTPRequest::GetQueryParameter(const std::string& key) const 55 { 56- const char* uri{evhttp_request_get_uri(req)}; 57- 58- return GetQueryParameterFromUri(uri, key); 59+ if (!Assume(m_uri_parsed)) return std::nullopt; // http_request_cb should have checked that m_uri_parsed is not nullptr 60+ return GetQueryParameterFromUri(m_uri_parsed, key); 61 } 62 63-std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key) 64+std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri* uri_parsed, const std::string& key) 65 { 66- evhttp_uri* uri_parsed{evhttp_uri_parse(uri)}; 67+ assert(uri_parsed); 68 const char* query{evhttp_uri_get_query(uri_parsed)}; 69 std::optional<std::string> result; 70 71@@ -689,7 +701,6 @@ std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std:: 72 } 73 evhttp_clear_headers(¶ms_q); 74 } 75- evhttp_uri_free(uri_parsed); 76 77 return result; 78 } 79diff --git a/src/httpserver.h b/src/httpserver.h 80index 036a39a02..84906e121 100644 81--- a/src/httpserver.h 82+++ b/src/httpserver.h 83@@ -14,6 +14,7 @@ static const int DEFAULT_HTTP_WORKQUEUE=16; 84 static const int DEFAULT_HTTP_SERVER_TIMEOUT=30; 85 86 struct evhttp_request; 87+struct evhttp_uri; 88 struct event_base; 89 class CService; 90 class HTTPRequest; 91@@ -57,6 +58,7 @@ class HTTPRequest 92 { 93 private: 94 struct evhttp_request* req; 95+ evhttp_uri* m_uri_parsed; 96 bool replySent; 97 98 public: 99@@ -75,6 +77,9 @@ public: 100 */ 101 std::string GetURI() const; 102 103+ /** Get libevent parsed URI, which is nullptr if uri parsing failed */ 104+ const evhttp_uri* GetURIParsed() const { return m_uri_parsed; } 105+ 106 /** Get CService (address:ip) for the origin of the http request. 107 */ 108 CService GetPeer() const; 109@@ -135,10 +140,10 @@ public: 110 * 111 * Helper function for HTTPRequest::GetQueryParameter. 112 * 113- * [@param](/bitcoin-bitcoin/contributor/param/)[in] uri is the entire request uri 114+ * [@param](/bitcoin-bitcoin/contributor/param/)[in] uri_parsed is the request uri after parsing it with evhttp_uri_parse 115 * [@param](/bitcoin-bitcoin/contributor/param/)[in] key represents the query parameter of which the value is returned 116 */ 117-std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key); 118+std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri* uri_parsed, const std::string& key); 119 120 /** Event handler closure. 121 */
pablomartin4btc commented at 4:46 pm on March 17, 2023:I see, ok, it makes sense, let me make some changes.
pablomartin4btc commented at 4:14 pm on March 20, 2023:After some tweaks around the unit tests (can’t test an assertion failure onBOOST
and the a solution usingBOOST_REQUIRE_THROW(function_w_failing_assert(), boost::execution_exception);
works only on windows), this is done now.in test/functional/interface_rest.py:51 in 213554cb97 outdated
43@@ -44,6 +44,11 @@ class RetType(Enum): 44 BYTES = 2 45 JSON = 3 46 47+class HTTPStatusCode(Enum): 48+ HTTP_OK = 200 49+ HTTP_BAD_REQUEST = 400 50+ HTTP_NOT_FOUND = 404 51+
stickies-v commented at 6:11 pm on March 16, 2023:Seems like unrelated/unnecessary churn imo, and redefining HTTP status codes in a single functional test doesn’t seem like a productive approach. Would leave this out?
pablomartin4btc commented at 8:16 pm on March 16, 2023:I’m now reusing the HTTPStatus enum from the python http module in python; there were like 20 places where these codes were introduced as int. I think it makes it clearer but pls let me know if that’s not the case.
stickies-v commented at 8:53 pm on March 16, 2023:Every line of code that you change needs review. If those changed lines of code are unrelated to what you’re trying to achieve, you risk unnecessarily slowing doing the PR or even getting NACKs for reasons unrelated to what you’re really trying to do. I’d strongly suggest keeping the scope as small as possible.
I’m not saying it’s necessarily a bad idea to use enumerated HTTP status codes, but it would be better off in a separate PR/issue (and have you checked if it’s been suggested before?)
pablomartin4btc commented at 4:46 pm on March 17, 2023:Ok, got it now and I agree. Thanks.pablomartin4btc force-pushed on Mar 16, 2023fanquake commented at 9:01 pm on March 16, 2023: memberhttps://cirrus-ci.com/task/4643666984173568:
0 test 2023-03-16T20:06:03.567000Z TestFramework (ERROR): Unexpected exception caught during testing 1 Traceback (most recent call last): 2 File "/private/var/folders/v7/fs2b0v3s0lz1n57gj9y4xb5m0000gn/T/cirrus-ci-build/ci/scratch/build/bitcoin-arm64-apple-darwin/test/functional/test_framework/test_framework.py", line 132, in main 3 self.run_test() 4 File "/private/var/folders/v7/fs2b0v3s0lz1n57gj9y4xb5m0000gn/T/cirrus-ci-build/ci/scratch/build/bitcoin-arm64-apple-darwin/test/functional/interface_rest.py", line 372, in run_test 5 resp = self.test_rest_request("/mempool/contents", ret_type=RetType.OBJ, status=400, query_params={"verbose": "true", "mempool_sequence": "true"}) 6 File "/private/var/folders/v7/fs2b0v3s0lz1n57gj9y4xb5m0000gn/T/cirrus-ci-build/ci/scratch/build/bitcoin-arm64-apple-darwin/test/functional/interface_rest.py", line 86, in test_rest_request 7 assert_equal(resp.status, status.value) 8 AttributeError: 'int' object has no attribute 'value'
pablomartin4btc commented at 9:13 pm on March 16, 2023: member@fanquake that’s something perhaps I’ve introduced with my last push, I’m on it, thanks. Weird, it’s not failing locally for me:
0$ ./test/functional/interface_rest.py 12023-03-16T21:14:11.025000Z TestFramework (INFO): PRNG seed is: 326626678636704986 22023-03-16T21:14:11.026000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_vwqauy2g 32023-03-16T21:14:11.812000Z TestFramework (INFO): Broadcast test transaction and sync nodes 42023-03-16T21:14:12.865000Z TestFramework (INFO): Test the /tx URI 52023-03-16T21:14:12.876000Z TestFramework (INFO): Query an unspent TXO using the /getutxos URI 62023-03-16T21:14:12.934000Z TestFramework (INFO): Query a spent TXO using the /getutxos URI 72023-03-16T21:14:12.935000Z TestFramework (INFO): Query two TXOs using the /getutxos URI 82023-03-16T21:14:12.938000Z TestFramework (INFO): Query the TXOs using the /getutxos URI with a binary response 92023-03-16T21:14:12.939000Z TestFramework (INFO): Test the /getutxos URI with and without /checkmempool 102023-03-16T21:14:14.065000Z TestFramework (INFO): Test the /block, /blockhashbyheight, /headers, and /blockfilterheaders URIs 112023-03-16T21:14:15.221000Z TestFramework (INFO): Test tx inclusion in the /mempool and /block URIs 122023-03-16T21:14:16.326000Z TestFramework (INFO): Test the /chaininfo URI 132023-03-16T21:14:16.334000Z TestFramework (INFO): Test compatibility of deprecated and newer endpoints 142023-03-16T21:14:16.341000Z TestFramework (INFO): Test the /deploymentinfo URI 152023-03-16T21:14:16.404000Z TestFramework (INFO): Stopping nodes 162023-03-16T21:14:16.611000Z TestFramework (INFO): Cleaning up /tmp/bitcoin_func_test_vwqauy2g on exit 172023-03-16T21:14:16.612000Z TestFramework (INFO): Tests successful
I think this was the issue: Python Enums were introduced in version 3.4, and the value attribute was added in version 3.6. I’m fixing it.
pablomartin4btc force-pushed on Mar 16, 2023pablomartin4btc force-pushed on Mar 20, 2023in src/httpserver.cpp:710 in 0115dccef6 outdated
702@@ -689,11 +703,12 @@ std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std:: 703 } 704 evhttp_clear_headers(¶ms_q); 705 } 706- evhttp_uri_free(uri_parsed); 707 708 return result; 709 } 710 711+evhttp_uri* HTTPRequest::GetURIParsed() { return m_uri_parsed; }
stickies-v commented at 4:11 pm on March 20, 2023:A simple getter is best implemented in the header
pablomartin4btc commented at 4:50 pm on March 20, 2023:okin src/httpserver.h:131 in 0115dccef6 outdated
140- * @param[in] uri is the entire request uri 141- * @param[in] key represents the query parameter of which the value is returned 142- */ 143-std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key); 144+ /** Get libevent parsed URI, which is nullptr if uri parsing failed */ 145+ evhttp_uri* GetURIParsed();
stickies-v commented at 4:11 pm on March 20, 2023:0 const evhttp_uri* GetURIParsed() const;
pablomartin4btc commented at 5:26 pm on March 20, 2023:Yeah, putting it back, before the compiler was complaining as in the previous approach we were modifying m_uri_parsed in this function.in src/httpserver.h:180 in 0115dccef6 outdated
175+ * 176+ * @param[in] uri_parsed is the evhttp_uri object (obtained from evhttp_uri_parse - libevent function) 177+ * @param[in] key represents the query parameter of which the value is returned 178+ */ 179+std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri* uri_parsed, const std::string& key); 180+
stickies-v commented at 4:15 pm on March 20, 2023:What’s the rationale behind moving all this code? I don’t think it’s necessary?
pablomartin4btc commented at 4:49 pm on March 20, 2023:Was unintentional, going to place it on its original location.in src/httpserver.h:60 in 0115dccef6 outdated
56@@ -56,7 +57,8 @@ struct event_base* EventBase(); 57 class HTTPRequest 58 { 59 private: 60- struct evhttp_request* req; 61+ struct evhttp_request* req{nullptr};
stickies-v commented at 4:17 pm on March 20, 2023:seems unrelated to this PR, and no need to touch this otherwise?
pablomartin4btc commented at 5:10 pm on March 20, 2023:I wanted to make sure then to use the free up function for the request in theHTTPRequest
destructor:evhttp_request_free
. This was something I was doing while investigating the previous failure of the ASan tests and possible memory leaks. I’ll remove it if you think unnecessary,
pablomartin4btc commented at 5:30 pm on March 20, 2023:I’ll take it back.in src/httpserver.h:97 in 0115dccef6 outdated
93@@ -92,7 +94,7 @@ class HTTPRequest 94 * 95 * @param[in] key represents the query parameter of which the value is returned 96 */ 97- std::optional<std::string> GetQueryParameter(const std::string& key) const; 98+ std::optional<std::string> GetQueryParameter(const std::string& key);
stickies-v commented at 4:17 pm on March 20, 2023:Why removeconst
?
pablomartin4btc commented at 5:27 pm on March 20, 2023:Similar reason to the above comment onGetURIParsed()
function, back again.pablomartin4btc force-pushed on Mar 20, 2023in src/test/httpserver_tests.cpp:1 in 9b0a0dc923
stickies-v commented at 4:19 pm on March 20, 2023:I think this would benefit from helper function(s) to avoid duplicating all the parsing/memclearing logic and make the tests easier to read.
pablomartin4btc commented at 5:11 pm on March 20, 2023:ok, let me add some helper(s) there.
pablomartin4btc commented at 6:01 pm on March 20, 2023:done.
stickies-v commented at 12:04 pm on March 21, 2023:I think the tests are still more complex than necessary. How about this:
0diff --git a/src/test/httpserver_tests.cpp b/src/test/httpserver_tests.cpp 1index 0462d3ae3..6ae57bdf0 100644 2--- a/src/test/httpserver_tests.cpp 3+++ b/src/test/httpserver_tests.cpp 4@@ -11,42 +11,46 @@ 5 6 BOOST_FIXTURE_TEST_SUITE(httpserver_tests, BasicTestingSetup) 7 8-static evhttp_uri* GetURIParsed(const std::string uri){ 9- return evhttp_uri_parse(uri.c_str()); 10+/** Test if query parameter exists in the provided uri. */ 11+bool QueryParameterExists(std::string_view uri, const std::string& key) 12+{ 13+ auto uri_parsed{evhttp_uri_parse(uri.data())}; 14+ const bool exists{GetQueryParameterFromUri(uri_parsed, key).has_value()}; 15+ if(uri_parsed) evhttp_uri_free(uri_parsed); 16+ return exists; 17+} 18+ 19+/** Test if query parameter exists in the provided uri and if its value matches the expected_value. */ 20+bool QueryParameterEquals(std::string_view uri, const std::string& key, std::string_view expected_value) 21+{ 22+ auto uri_parsed{evhttp_uri_parse(uri.data())}; 23+ const auto param_value{GetQueryParameterFromUri(uri_parsed, key)}; 24+ const bool is_equal{param_value.has_value() && *param_value == expected_value}; 25+ if(uri_parsed) evhttp_uri_free(uri_parsed); 26+ return is_equal; 27 } 28 29 BOOST_AUTO_TEST_CASE(test_query_parameters) 30 { 31 // No parameters 32- evhttp_uri* uri_parsed{GetURIParsed("localhost:8080/rest/headers/someresource.json")}; 33- BOOST_CHECK(!GetQueryParameterFromUri(uri_parsed, "p1").has_value()); 34- evhttp_uri_free(uri_parsed); 35+ BOOST_CHECK(!QueryParameterExists("localhost:8080/rest/headers/someresource.json", "p1")); 36 37 // Single parameter 38- uri_parsed = GetURIParsed("localhost:8080/rest/endpoint/someresource.json?p1=v1"); 39- BOOST_CHECK_EQUAL(GetQueryParameterFromUri(uri_parsed, "p1").value(), "v1"); 40- BOOST_CHECK(!GetQueryParameterFromUri(uri_parsed, "p2").has_value()); 41- evhttp_uri_free(uri_parsed); 42+ BOOST_CHECK(QueryParameterEquals("localhost:8080/rest/endpoint/someresource.json?p1=v1", "p1", "v1")); 43+ BOOST_CHECK(!QueryParameterExists("localhost:8080/rest/endpoint/someresource.json?p1=v1", "p2")); 44 45 // Multiple parameters 46- uri_parsed = GetURIParsed("/rest/endpoint/someresource.json?p1=v1&p2=v2"); 47- BOOST_CHECK_EQUAL(GetQueryParameterFromUri(uri_parsed, "p1").value(), "v1"); 48- BOOST_CHECK_EQUAL(GetQueryParameterFromUri(uri_parsed, "p2").value(), "v2"); 49- evhttp_uri_free(uri_parsed); 50+ BOOST_CHECK(QueryParameterEquals("/rest/endpoint/someresource.json?p1=v1&p2=v2", "p1", "v1")); 51+ BOOST_CHECK(QueryParameterEquals("/rest/endpoint/someresource.json?p1=v1&p2=v2", "p2", "v2")); 52 53 // If the query string contains duplicate keys, the first value is returned 54- uri_parsed = GetURIParsed("/rest/endpoint/someresource.json?p1=v1&p1=v2"); 55- BOOST_CHECK_EQUAL(GetQueryParameterFromUri(uri_parsed, "p1").value(), "v1"); 56- evhttp_uri_free(uri_parsed); 57+ BOOST_CHECK(QueryParameterEquals("/rest/endpoint/someresource.json?p1=v1&p1=v2", "p1", "v1")); 58 59 // Invalid query string syntax is the same as not having parameters 60- uri_parsed = GetURIParsed("/rest/endpoint/someresource.json&p1=v1&p2=v2"); 61- BOOST_CHECK(!GetQueryParameterFromUri(uri_parsed, "p1").has_value()); 62- evhttp_uri_free(uri_parsed); 63+ BOOST_CHECK(!QueryParameterExists("/rest/endpoint/someresource.json&p1=v1&p2=v2", "p1")); 64 65 // Invalid URI with % symbol won't return any values as not having parameters at all 66- uri_parsed = GetURIParsed("/rest/endpoint/someresource.json&p1=v1&p2=v2%"); 67- BOOST_CHECK_EQUAL(uri_parsed, nullptr); 68- BOOST_CHECK(!GetQueryParameterFromUri(uri_parsed, "p1").has_value()); 69+ BOOST_CHECK(!QueryParameterExists("/rest/endpoint/someresource.json&p1=v1&p2=v2%", "p1")); 70+ 71 } 72 BOOST_AUTO_TEST_SUITE_END()
Can probably still be deduplicated a bit more using an RAII wrapper for uri_parsed, if you like.
pablomartin4btc commented at 8:32 pm on March 21, 2023:Done with one more tweak. I’ll use an RAII wrapper if I have to touch the code again, and if that’s the case, could you pls point me to an example already here in the bitcoin core code if already exists? Thanks.
stickies-v commented at 10:42 am on May 18, 2023:The tworeq->GetQueryParameter
calls inrest_mempool
need to be updated as well.
pablomartin4btc commented at 2:26 pm on May 27, 2023:Yeah missed them, done now, thanks!in test/functional/interface_rest.py:218 in 9b0a0dc923 outdated
218- self.test_rest_request(f"/getutxos/checkmempool/{long_uri}", http_method='POST', status=400, ret_type=RetType.OBJ) 219+ self.test_rest_request(f"/getutxos/checkmempool/{long_uri}", http_method='POST', status=HTTPStatus.BAD_REQUEST, ret_type=RetType.OBJ) 220 221 long_uri = '/'.join([f'{txid}-{n_}' for n_ in range(15)]) 222- self.test_rest_request(f"/getutxos/checkmempool/{long_uri}", http_method='POST', status=200) 223+ self.test_rest_request(f"/getutxos/checkmempool/{long_uri}", http_method='POST', status=HTTPStatus.OK)
stickies-v commented at 4:22 pm on March 20, 2023:These changes are unrelated to the segmentation fault and at the very least need to be in a separate commit
pablomartin4btc commented at 4:48 pm on March 20, 2023:Sorry, forgot to remove all that.pablomartin4btc commented at 4:24 pm on March 20, 2023: memberUpdated changes:
- Updated the code with a new workaround to avoid failures on unit tests (
httpserver_tests
).
Thanks @stickies-v for your prompt response and your support on quick workarounds on issues raised.
stickies-v commented at 4:26 pm on March 20, 2023: contributornit: the commit message could benefit from additional information on the nature of the bug and how it’s fixedin src/test/httpserver_tests.cpp:17 in 9b0a0dc923 outdated
16- std::string uri {}; 17- 18 // No parameters 19- uri = "localhost:8080/rest/headers/someresource.json"; 20- BOOST_CHECK(!GetQueryParameterFromUri(uri.c_str(), "p1").has_value()); 21+ std::string uri = "localhost:8080/rest/headers/someresource.json";
brunoerg commented at 5:01 pm on March 20, 2023:nit:
0 const std::string uri{"localhost:8080/rest/headers/someresource.json"};
pablomartin4btc commented at 5:02 pm on March 20, 2023:rightpablomartin4btc force-pushed on Mar 20, 2023pablomartin4btc force-pushed on Mar 20, 2023pablomartin4btc force-pushed on Mar 20, 2023pablomartin4btc force-pushed on Mar 20, 2023pablomartin4btc force-pushed on Mar 20, 2023in src/httpserver.h:135 in 36be160a9b outdated
132 }; 133 134-/** Get the query parameter value from request uri for a specified key, or std::nullopt if the key 135- * is not found. 136+/** Get the query parameter value from a evhttp_uri (parsed) object for a specified key, or std::nullopt 137+ * if the key is not found or uri_parsed parameter is nullptr (most probably from a failed parsing).
stickies-v commented at 11:26 am on March 21, 2023:nit: it’s actually an argument, but I would just leave it out
0 * if the key is not found or if uri_parsed is nullptr (most probably from a failed parsing).
pablomartin4btc commented at 7:04 pm on March 21, 2023:done.in src/httpserver.h:143 in 36be160a9b outdated
141 * currently not needed in any of the endpoints. 142 * 143 * Helper function for HTTPRequest::GetQueryParameter. 144 * 145- * @param[in] uri is the entire request uri 146+ * @param[in] uri_parsed is the evhttp_uri object (obtained from evhttp_uri_parse - libevent function)
stickies-v commented at 11:27 am on March 21, 2023:nit
0 * [@param](/bitcoin-bitcoin/contributor/param/)[in] uri_parsed is the uri parsed with libevent's evhttp_uri_parse
pablomartin4btc commented at 7:05 pm on March 21, 2023:done.stickies-v commented at 12:12 pm on March 21, 2023: contributorApproach ACK 36be160a9pablomartin4btc force-pushed on Mar 21, 2023pablomartin4btc force-pushed on Mar 21, 2023pablomartin4btc force-pushed on Mar 21, 2023pablomartin4btc force-pushed on Mar 21, 2023in src/test/httpserver_tests.cpp:17 in 729287530e outdated
10 #include <boost/test/unit_test.hpp> 11 12 BOOST_FIXTURE_TEST_SUITE(httpserver_tests, BasicTestingSetup) 13 14-BOOST_AUTO_TEST_CASE(test_query_parameters) 15+std::optional<std::string> GetParameterFromURI(const std::string_view uri, const std::string& key){
stickies-v commented at 2:15 pm on March 22, 2023:Missing#include <string>
and#include <string_view>
pablomartin4btc commented at 3:14 pm on March 22, 2023:Ok, thanks.stickies-v approvedstickies-v commented at 2:18 pm on March 22, 2023: contributorACK 729287530e458febf32e01e542a36e7be4a74fbf
Also checked that the new functional test fails without the patch and succeeds with the patch.
but I think the commit message should be updated to better describe the bug, the fix, and the behaviour change introduced
in src/test/httpserver_tests.cpp:22 in 729287530e outdated
14-BOOST_AUTO_TEST_CASE(test_query_parameters) 15+std::optional<std::string> GetParameterFromURI(const std::string_view uri, const std::string& key){ 16+ auto uri_parsed{evhttp_uri_parse(uri.data())}; 17+ const auto param_value{GetQueryParameterFromUri(uri_parsed, key)}; 18+ if(uri_parsed) evhttp_uri_free(uri_parsed); 19+ return param_value;
maflcko commented at 2:44 pm on March 22, 2023:0test/httpserver_tests.cpp:18:12: error: constness of 'param_value' prevents automatic move [performance-no-automatic-move,-warnings-as-errors] 1 return param_value; 2 ^ 3clang-tidy-14 --us
pablomartin4btc commented at 3:12 pm on March 22, 2023:I’ll fix this.pablomartin4btc commented at 3:15 pm on March 22, 2023: memberbut I think the commit message should be updated to better describe the bug, the fix, and the behaviour change introduced
Yes, I have that pending to do asap.
pablomartin4btc force-pushed on Mar 23, 2023vasild commented at 10:11 am on March 23, 2023: contributorConcept ACK
The code as of 8bdea6316c looks ok, but it seems to be mixing the fix with an optimization to only parse once (instead of every time a parameter is retrieved). It would be better to have two separate commits - one with the non-functional optimization and one with the fix.
Idea: I think it would be better to disallow the creation of
HTTPRequest
if the request is invalid/unparsable. This would avoid having anullptr
memberm_uri_parsed
and having to handle the nullness. Would it work to check the sanity in theHTTPRequest
constructor and if a problem is detected, thenthrow
some object that contains an error code and a string (to convey e.g.HTTP_BAD_REQUEST, "Invalid URI"
)? Then catch this inhttp_request_cb()
and return early.There are other similar bugs in this code which exist in
master
and remain unfixed by this PR:evhttp_request_get_uri()
may returnnullptr
which we pass unconditionally toevhttp_uri_parse()
which will crash onnullptr
.- In
HTTPRequest::GetURI()
we implicitly createstd::string
from the return value ofevhttp_request_get_uri()
. Creating a string fromnullptr
is undefined behavior.
I am not sure under what circumstances would
evhttp_request_get_uri()
returnnullptr
, but it is not documented that it will never returnnullptr
and also its source code implies that it can, so better safe than sorry. I don’t think fixing those is a requirement for this PR. If they are not, then I will follow up with a subsequent PR.stickies-v commented at 1:06 pm on March 23, 2023: contributorIt would be better to have two separate commits - one with the non-functional optimization and one with the fix.
Agreed, that could probably be improved.
Idea: I think it would be better to disallow the creation of HTTPRequest if the request is invalid/unparsable.
This was the first approach I considered, but I think it’s better not to. A request with an invalid URI is still a request, and thus it should be representable. Moving the URI validation to the constructor feels like a shortcut and coupling we don’t need to take?
evhttp_request_get_uri()
may returnnullptr
Like you, I don’t see how that can happen in practice but can never hurt to make more robust. I think it’s orthogonal to this PR but happy to review your PR - feel free to ping me?
vasild commented at 1:48 pm on March 23, 2023: contributorMoving the URI validation to the constructor feels like a shortcut and coupling we don’t need to take?
I do not have a strong opinion. This PR moved the parsing to the constructor, but left the checking of the parsing result for later. Now the code looks like (simplified):
0// constructs the object, does the parsing, leaves null m_uri_parsed if it fails 1HTTPRequest hreq(...); 2 3if (hreq.m_uri_parsed == nullptr) { 4 handle error and dispose hreq 5} 6 7// elsewhere a soft assert is needed to ensure hreq.m_uri_parsed != nullptr
I find the following more robust and compact:
0std::unique_ptr<HTTPRequest> hreq; 1try { 2 hreq = std::make_unique<HTTPRequest>(...); 3} catch (e) { 4 handle error e.code + e.msg 5} 6 7// no need for further error handling or checking here or elsewhere
happy to review your PR - feel free to ping me
Sure, lets first see what happens with this PR. Maybe @pablomartin4btc would decide to check the return value of
evhttp_request_get_uri()
in this PR.pablomartin4btc commented at 3:21 pm on March 23, 2023: memberIt would be better to have two separate commits - one with the non-functional optimization and one with the fix.
Agreed, that could probably be improved.
I can do this.
Sure, lets first see what happens with this PR. Maybe @pablomartin4btc would decide to check the return value of evhttp_request_get_uri() in this PR.
It was under my radar and I thought to do it in a follow up, but I could do it here if it’s more practical (need to update title and description of this PR).
I find the following more robust and compact:
0std::unique_ptr<HTTPRequest> hreq; 1try { 2 hreq = std::make_unique<HTTPRequest>(...); 3} catch (e) { 4 handle error e.code + e.msg 5} 6 7// no need for further error handling or checking here or elsewhere
I like this approach, let me check and play a bit with the code as if I need to raise an exception or can return the
HTTP_BAD_REQUEST
from there, and how the consumer (rest obj) would behave/ interact with it.achow101 added this to the milestone 25.0 on Mar 23, 2023vasild commented at 9:01 am on March 24, 2023: contributorIf you decide to go this way, something like this should work:
0// in HTTPRequest::HTTPRequest(): 1if (cannot parse uri) { 2 const int code{HTTP_BAD_METHOD}; 3 const char* what{"Invalid URI"}; 4 WriteReply(code, what); 5 throw std::system_error(code, std::generic_category(), what); 6} 7 8// in http_request_cb(): 9std::unique_ptr<HTTPRequest> hreq; 10try { 11 hreq = std::make_unique<HTTPRequest>(...); 12} catch (const std::system_error& e) { 13 LogPrint(BCLog::HTTP, "%d %s\n", e.code().value(), e.what()); 14 return; 15}
pablomartin4btc force-pushed on Mar 25, 2023pablomartin4btc commented at 4:12 am on March 25, 2023: memberUpdated changes:
- Moving the uri parsing validation from the http_request_cb to the HTTPRequest constructor, while doing this realised that it’s better to validate the result of
evhttp_request_get_uri()
also at that time. @vasild, @stickies-v: please let me know what you think and I’ll update title & description of the PR and will separate the code in 2 commits (proper fix & non-functional optimisation).
pablomartin4btc commented at 5:52 pm on March 25, 2023: memberRegarding CI failures:
./test/functional/wallet_create_tx.py
passes locally, actioned manually re-run and it passes too.- checking the fuzzer one
cirrus-ci-build/ci/scratch/qa-assets/fuzz_seed_corpus/http_request" failed with exit code 77
-fuzz: httpserver.cpp:618: void HTTPRequest::WriteReply(int, const std::string &): Assertion '!replySent && req' failed
. (crash-75d231d5e022cc38377bbbf4549058d37b7c59a3)- it passes locally when I perform manually (following fuzzing instructions):
0FUZZ=process_message ./src/test/fuzz/fuzz ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3 1process_message: succeeded against 1 files in 0s.
maflcko commented at 7:51 am on March 27, 2023: memberFUZZ=process_message ./src/test/fuzz/fuzz ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3
You’ll need to set the correct fuzz target:
FUZZ=http_request ./src/test/fuzz/fuzz ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3
in src/httpserver.cpp:265 in 0aa0626c50 outdated
261@@ -253,11 +262,11 @@ static void http_request_cb(struct evhttp_request* req, void* arg) 262 return; 263 } 264 265+ std::string strURI{hreq->GetURI()};
vasild commented at 9:18 am on March 27, 2023:nit:
0 const std::string strURI{hreq->GetURI()};
vasild commented at 10:27 am on March 27, 2023:GetURI()
(which callsevhttp_request_get_uri()
) could still return null. It seems too indirect to that we rely that we have calledevhttp_request_get_uri()
before on the same input and back then it returned non-null (from the constructor) so it must also return non-null now.I think it would be better to save the result in a variable and document that it will never be null. This will also avoid calling
evhttp_request_get_uri()
two times. Like this:0diff --git i/src/httpserver.h w/src/httpserver.h 1index 5b21a678c5..d7cd81be7e 100644 2--- i/src/httpserver.h 3+++ w/src/httpserver.h 4@@ -55,13 +55,16 @@ struct event_base* EventBase(); 5 * Thin C++ wrapper around evhttp_request. 6 */ 7 class HTTPRequest 8 { 9 private: 10 struct evhttp_request* req; 11- evhttp_uri* m_uri_parsed{nullptr}; 12+ /// The URI, as returned by `evhttp_request_get_uri()`. Will never be `nullptr`. 13+ const char* const m_uri; 14+ /// The parsed URI, as returned by `evhttp_uri_parse()`. Will never be `nullptr`. 15+ evhttp_uri* const m_uri_parsed; 16 bool replySent; 17 18 public: 19 explicit HTTPRequest(struct evhttp_request* req, bool replySent = false); 20 ~HTTPRequest(); 21 22diff --git i/src/httpserver.cpp w/src/httpserver.cpp 23index e9b68a36c1..45d25bc32f 100644 24--- i/src/httpserver.cpp 25+++ w/src/httpserver.cpp 26@@ -238,14 +238,14 @@ static void http_request_cb(struct evhttp_request* req, void* arg) 27 } 28 29 std::unique_ptr<HTTPRequest> hreq; 30 31 try { 32 hreq = std::make_unique<HTTPRequest>(req); 33- } catch (const std::system_error& e) { 34- LogPrint(BCLog::HTTP, "%d %s\n", e.code().value(), e.what()); 35+ } catch (const std::runtime_error& e) { 36+ LogPrint(BCLog::HTTP, "%s\n", e.what()); 37 return; 38 } 39 40 // Early address-based allow check 41 if (!ClientAllowed(hreq->GetPeer())) { 42 LogPrint(BCLog::HTTP, "HTTP request from %s rejected: Client network is not allowed RPC access\n", 43@@ -543,34 +543,33 @@ void HTTPEvent::trigger(struct timeval* tv) 44 event_active(ev, 0, 0); // immediately trigger event in main thread 45 else 46 evtimer_add(ev, tv); // trigger after timeval passed 47 } 48 HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) : 49 req{_req}, 50+ m_uri{evhttp_request_get_uri(req)}, 51+ m_uri_parsed{m_uri != nullptr && m_uri[0] != '\0' ? evhttp_uri_parse(m_uri) : nullptr}, 52 replySent{_replySent} 53 { 54- const char* uri_to_parse{GetURI()}; 55- if (uri_to_parse != nullptr && uri_to_parse[0] != '\0') m_uri_parsed = evhttp_uri_parse(uri_to_parse); 56- 57- if (!m_uri_parsed) { 58- const int code{HTTP_BAD_REQUEST}; 59- const char* what{"Invalid URI"}; 60- WriteReply(code, what); 61- throw std::system_error(code, std::generic_category(), what); 62+ if (m_uri_parsed == nullptr) { 63+ const char* err{"Invalid URI"}; 64+ WriteReply(HTTP_BAD_REQUEST, err); 65+ throw std::runtime_error( 66+ strprintf("HTTP request from %s rejected: %s", GetPeer().ToStringAddr(), err)); 67 } 68 } 69 70 HTTPRequest::~HTTPRequest() 71 { 72 // evhttpd cleans up the request, as long as a reply was sent. 73 if (!replySent) { 74 // Keep track of whether reply was sent to avoid request leaks 75 LogPrintf("%s: Unhandled request\n", __func__); 76 WriteReply(HTTP_INTERNAL_SERVER_ERROR, "Unhandled request"); 77 } 78- if (m_uri_parsed) evhttp_uri_free(m_uri_parsed); 79+ evhttp_uri_free(m_uri_parsed); 80 } 81 82 std::pair<bool, std::string> HTTPRequest::GetHeader(const std::string& hdr) const 83 { 84 const struct evkeyvalq* headers = evhttp_request_get_input_headers(req); 85 assert(headers); 86@@ -662,13 +661,13 @@ CService HTTPRequest::GetPeer() const 87 } 88 return peer; 89 } 90 91 const char* HTTPRequest::GetURI() const 92 { 93- return evhttp_request_get_uri(req); 94+ return m_uri; 95 } 96 97 HTTPRequest::RequestMethod HTTPRequest::GetRequestMethod() const 98 { 99 switch (evhttp_request_get_command(req)) { 100 case EVHTTP_REQ_GET:
pablomartin4btc commented at 3:06 am on March 28, 2023:I agree with this suggestion, first the manipulation of null values, mainly its return, and also removing the extra call toevhttp_request_get_uri()
.in src/httpserver.h:81 in 0aa0626c50 outdated
74@@ -73,7 +75,7 @@ class HTTPRequest 75 76 /** Get requested URI.
vasild commented at 9:23 am on March 27, 2023:0 /** Get requested URI. Will always return non-nullptr.
in src/httpserver.h:132 in 0aa0626c50 outdated
127@@ -126,19 +128,19 @@ class HTTPRequest 128 void WriteReply(int nStatus, const std::string& strReply = ""); 129 }; 130 131-/** Get the query parameter value from request uri for a specified key, or std::nullopt if the key 132- * is not found. 133+/** Get the query parameter value from a evhttp_uri (parsed) object for a specified key, or std::nullopt 134+ * if the key is not found or if uri_parsed argument is nullptr (most probably from a failed parsing).
vasild commented at 10:25 am on March 27, 2023:It is possible to enforce that this argument will never be null - if changed to reference. Simplifies the code a bit:
0diff --git i/src/httpserver.cpp w/src/httpserver.cpp 1index e9b68a36c1..1dccb196be 100644 2--- i/src/httpserver.cpp 3+++ w/src/httpserver.cpp 4@@ -683,20 +683,18 @@ HTTPRequest::RequestMethod HTTPRequest::GetRequestMethod() const 5 return UNKNOWN; 6 } 7 } 8 9 std::optional<std::string> HTTPRequest::GetQueryParameter(const std::string& key) const 10 { 11- if (!Assume(m_uri_parsed)) return std::nullopt; // HTTPRequest constructor should have checked that m_uri_parsed is not nullptr 12- return GetQueryParameterFromUri(m_uri_parsed, key); 13+ return GetQueryParameterFromUri(*m_uri_parsed, key); 14 } 15 16-std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri* uri_parsed, const std::string& key) 17+std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri& uri_parsed, const std::string& key) 18 { 19- if (!uri_parsed) return std::nullopt; 20- const char* query{evhttp_uri_get_query(uri_parsed)}; 21+ const char* query{evhttp_uri_get_query(&uri_parsed)}; 22 std::optional<std::string> result; 23 24 if (query) { 25 // Parse the query string into a key-value queue and iterate over it 26 struct evkeyvalq params_q; 27 evhttp_parse_query_str(query, ¶ms_q); 28diff --git i/src/httpserver.h w/src/httpserver.h 29index 5b21a678c5..3f2ef06c11 100644 30--- i/src/httpserver.h 31+++ w/src/httpserver.h 32@@ -126,24 +126,24 @@ public: 33 * main thread, do not call any other HTTPRequest methods after calling this. 34 */ 35 void WriteReply(int nStatus, const std::string& strReply = ""); 36 }; 37 38 /** Get the query parameter value from a evhttp_uri (parsed) object for a specified key, or std::nullopt 39- * if the key is not found or if uri_parsed argument is nullptr (most probably from a failed parsing). 40+ * if the key is not found. 41 * 42 * If the query string contains duplicate keys, the first value is returned. Many web frameworks 43 * would instead parse this as an array of values, but this is not (yet) implemented as it is 44 * currently not needed in any of the endpoints. 45 * 46 * Helper function for HTTPRequest::GetQueryParameter. 47 * 48 * [@param](/bitcoin-bitcoin/contributor/param/)[in] uri_parsed is the uri parsed with libevent's evhttp_uri_parse 49 * [@param](/bitcoin-bitcoin/contributor/param/)[in] key represents the query parameter of which the value is returned 50 */ 51-std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri* uri_parsed, const std::string& key); 52+std::optional<std::string> GetQueryParameterFromUri(const evhttp_uri& uri_parsed, const std::string& key); 53 54 /** Event handler closure. 55 */ 56 class HTTPClosure 57 { 58 public: 59diff --git i/src/test/httpserver_tests.cpp w/src/test/httpserver_tests.cpp 60index 120c6d102f..6e2128863d 100644 61--- i/src/test/httpserver_tests.cpp 62+++ w/src/test/httpserver_tests.cpp 63@@ -13,13 +13,13 @@ 64 #include <boost/test/unit_test.hpp> 65 66 BOOST_FIXTURE_TEST_SUITE(httpserver_tests, BasicTestingSetup) 67 68 std::optional<std::string> GetParameterFromURI(const std::string_view uri, const std::string& key){ 69 auto uri_parsed{evhttp_uri_parse(uri.data())}; 70- auto param_value{GetQueryParameterFromUri(uri_parsed, key)}; 71+ auto param_value{GetQueryParameterFromUri(*uri_parsed, key)}; 72 if(uri_parsed) evhttp_uri_free(uri_parsed); 73 return param_value; 74 } 75 76 /** Test if query parameter exists in the provided uri. */ 77 bool QueryParameterExists(const std::string_view uri, const std::string& key)
pablomartin4btc commented at 3:03 am on March 28, 2023:I see, changing a pointer to a reference does not inherently prevent null pointer dereferences at runtime. However, by changing a pointer to a reference and documenting that the reference is expected to be non-null, we are effectively enforcing a contract that the caller of the function must provide a valid (non-null) reference. Suggestion taken, thanks!
pablomartin4btc commented at 4:51 am on March 28, 2023:The problem removingif (!uri_parsed) return std::nullopt;
line fromGetQueryParameterFromUri()
is that passing nullptr as the uri_parsed (from an invalid uri), as in the last test case inhttpserver_test.cpp
will produce thesegfault
, so I could opt that in the test but I dont think it’s the real intention of testingGetQueryParameterFromUri()
.
vasild commented at 8:18 am on March 30, 2023:It is not possible to pass
nullptr
as an argument toGetQueryParameterFromUri(const evhttp_uri& uri_parsed, const std::string& key)
. This is the point. Inside the function there is no need to worry about null pointer.The crash occurs inside the test:
0 auto uri_parsed{evhttp_uri_parse(uri.data())}; // ends up nullptr 1 auto param_value{GetQueryParameterFromUri(*uri_parsed, key)}; // crash here, before calling GetQueryParameterFromUri().
The function is safe, the caller has to be adjusted.
vasild commented at 8:36 am on March 30, 2023:I see you have adjusted the test in the latest push, I think that is the right approach. Thanks!pablomartin4btc force-pushed on Mar 28, 2023pablomartin4btc force-pushed on Mar 28, 2023pablomartin4btc force-pushed on Mar 28, 2023pablomartin4btc commented at 5:24 am on March 28, 2023: memberWorking on functional failure
interface_rest
on the invalid uri case, something didn’t like from latest changes.I think I’ve found the issue!
Both, .py and fuzz were getting the same failure
Assertion '!replySent && req' failed
.The problem for the functional was that
GeetPeer()
was called while throwing the runtime exception, so moved it before.On the fuzz
./src/test/fuzz/test http_request.cpp
, the issue is that the test itself is passingreplySent
astrue
, so theWriteReply()
call on the httpserver constructor when the uri_parse is invalid. I’m investigating the rest of the test if we need to change anything there. Also, not sure, in which cases we need to create an instance of theHTTPRequest
obj passingreplySent
astrue
, I didn’t find a case in the entire code, unless as in the current fuzz we are talking about.pablomartin4btc force-pushed on Mar 28, 2023in src/test/httpserver_tests.cpp:21 in 1c278adc1a outdated
17-BOOST_AUTO_TEST_CASE(test_query_parameters) 18+std::optional<std::string> GetParameterFromURI(const std::string_view uri, const std::string& key){ 19+ auto uri_parsed{evhttp_uri_parse(uri.data())}; 20+ if (uri_parsed == nullptr) return std::nullopt; 21+ auto param_value{GetQueryParameterFromUri(*uri_parsed, key)}; 22+ if(uri_parsed) evhttp_uri_free(uri_parsed);
brunoerg commented at 8:58 pm on March 28, 2023:nit
0 if (uri_parsed) evhttp_uri_free(uri_parsed);
brunoerg commented at 9:00 pm on March 28, 2023: contributorOn the fuzz ./src/test/fuzz/test http_request.cpp, the issue is that the test itself is passing replySent as true, so the WriteReply() call on the httpserver constructor when the uri_parse is invalid. I’m investigating the rest of the test if we need to change anything there. Also, not sure, in which cases we need to create an instance of the HTTPRequest obj passing replySent as true, I didn’t find a case in the entire code, unless as in the current fuzz we are talking about.
Try to set it to
false
since we are callingReadBody()
to read and process the result.pablomartin4btc commented at 10:22 pm on March 28, 2023: memberTry to set it to
false
since we are callingReadBody()
to read and process the result.I think we need to re-think this fuzz test (
http_request.cpp
), as due to the new approach we have these issues with it:replySent
is set totrue
from test and it will fail in WriteReply() as assert(!replySent && req);.- even we set the above to
false
, later duringGetPeer()
as the connection is null (we can see in the fuzz test checking laterassert(service.ToStringAddrPort() == "[::]:0");
) the assignment of GetPeer().ToStringAddr(); will result into<incomplete type>
and it will error when throwing the runtime exception. - but even we solve the above issues going around the code, now as in the HTTPRequest constructor we are raising the runtime exception, as there’s no call into
http_request_cb
(because there’s no InitHTTPServer) nothing is catching the error and the test will fail unless we catch the error but then we won’t be able to perform all the other checks the fuzz test does.
vasild commented at 9:50 am on March 30, 2023: contributorSince the
replySent
parameter of theHTTPRequest
constructor is used only from the tests, I would suggest to remove it and adjust the test as needed. I believe the tests should be testing the same API that is used by the real code, otherwise they may end up testing scenarios that never happen in real code and missing real code scenarios.Furthermore that
replySent = true
from the test is a hack to workaround another problem. It is actually a lie - the reply has not been sent. The problem being workedaround is that theWriteReply()
will be called from theHTTPRequest
destructor from the test function, afterev*free()
functions have freed the buffers and it will crash due to use-after-free. A convenient way to call the destructor beforeev*free()
is to put it in a narrower scope:0{ 1 HTTPRequest http_request; 2 ... 3} 4evbuffer_free(evbuf); 5evhttp_request_free(evreq);
That’s cool because we need to
try/catch
nevertheless. Here is it:0diff --git i/src/httpserver.cpp w/src/httpserver.cpp 1index 99cf147866..3e97d621ce 100644 2--- i/src/httpserver.cpp 3+++ w/src/httpserver.cpp 4@@ -541,17 +541,17 @@ void HTTPEvent::trigger(struct timeval* tv) 5 { 6 if (tv == nullptr) 7 event_active(ev, 0, 0); // immediately trigger event in main thread 8 else 9 evtimer_add(ev, tv); // trigger after timeval passed 10 } 11-HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) : 12+HTTPRequest::HTTPRequest(struct evhttp_request* _req) : 13 req{_req}, 14 m_uri{evhttp_request_get_uri(req)}, 15 m_uri_parsed{m_uri != nullptr && m_uri[0] != '\0' ? evhttp_uri_parse(m_uri) : nullptr}, 16- replySent{_replySent} 17+ replySent{false} 18 { 19 if (m_uri_parsed == nullptr) { 20 const char* err{"Invalid URI"}; 21 const std::string peerAddress{GetPeer().ToStringAddr()}; 22 WriteReply(HTTP_BAD_REQUEST, err); 23 throw std::runtime_error( 24diff --git i/src/httpserver.h w/src/httpserver.h 25index 00b22bdadc..56a4d89d22 100644 26--- i/src/httpserver.h 27+++ w/src/httpserver.h 28@@ -62,13 +62,13 @@ private: 29 const char* const m_uri; 30 /// The parsed URI, as returned by `evhttp_uri_parse()`. Will never be `nullptr`. 31 evhttp_uri* const m_uri_parsed; 32 bool replySent; 33 34 public: 35- explicit HTTPRequest(struct evhttp_request* req, bool replySent = false); 36+ explicit HTTPRequest(struct evhttp_request* req); 37 ~HTTPRequest(); 38 39 enum RequestMethod { 40 UNKNOWN, 41 GET, 42 POST, 43diff --git i/src/test/fuzz/http_request.cpp w/src/test/fuzz/http_request.cpp 44index 9928c4a1ab..258de7c4cd 100644 45--- i/src/test/fuzz/http_request.cpp 46+++ w/src/test/fuzz/http_request.cpp 47@@ -44,23 +44,26 @@ FUZZ_TARGET(http_request) 48 evhttp_parse_firstline_(evreq, evbuf) != 1 || evhttp_parse_headers_(evreq, evbuf) != 1) { 49 evbuffer_free(evbuf); 50 evhttp_request_free(evreq); 51 return; 52 } 53 54- HTTPRequest http_request{evreq, true}; 55- const HTTPRequest::RequestMethod request_method = http_request.GetRequestMethod(); 56- (void)RequestMethodString(request_method); 57- (void)http_request.GetURI(); 58- (void)http_request.GetHeader("Host"); 59- const std::string header = fuzzed_data_provider.ConsumeRandomLengthString(16); 60- (void)http_request.GetHeader(header); 61- (void)http_request.WriteHeader(header, fuzzed_data_provider.ConsumeRandomLengthString(16)); 62- (void)http_request.GetHeader(header); 63- const std::string body = http_request.ReadBody(); 64- assert(body.empty()); 65- const CService service = http_request.GetPeer(); 66- assert(service.ToStringAddrPort() == "[::]:0"); 67+ try { 68+ HTTPRequest http_request{evreq}; 69+ const HTTPRequest::RequestMethod request_method = http_request.GetRequestMethod(); 70+ (void)RequestMethodString(request_method); 71+ (void)http_request.GetURI(); 72+ (void)http_request.GetHeader("Host"); 73+ const std::string header = fuzzed_data_provider.ConsumeRandomLengthString(16); 74+ (void)http_request.GetHeader(header); 75+ (void)http_request.WriteHeader(header, fuzzed_data_provider.ConsumeRandomLengthString(16)); 76+ (void)http_request.GetHeader(header); 77+ const std::string body = http_request.ReadBody(); 78+ assert(body.empty()); 79+ const CService service = http_request.GetPeer(); 80+ assert(service.ToStringAddrPort() == "[::]:0"); 81+ } catch (const std::runtime_error&) { 82+ } 83 84 evbuffer_free(evbuf); 85 evhttp_request_free(evreq); 86 }
… but then we won’t be able to perform all the other checks the fuzz test does.
I think that is fine because if the input is such that
HTTPRequest
constructor throws, then we need not to test those because they will never happen in the real code either.pablomartin4btc commented at 4:55 pm on March 30, 2023: memberI think that is fine because if the input is such that
HTTPRequest
constructor throws, then we need not to test those because they will never happen in the real code either.Thanks @vasild, it makes sense what you are saying, I’m still playing with some options and will get back to you soon.
0$ FUZZ=http_request ./src/test/fuzz/fuzz ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3 1INFO: Seed: 2298450543 2INFO: Loaded 1 modules (250057 inline 8-bit counters): 250057 [0x565010b53868, 0x565010b90931), 3INFO: Loaded 1 PC tables (250057 PCs): 250057 [0x565010b90938,0x565010f615c8), 4./src/test/fuzz/fuzz: Running 1 inputs 1 time(s) each. 5Running: ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3 6Executed ../qa-assets/fuzz_seed_corpus/http_request/75d231d5e022cc38377bbbf4549058d37b7c59a3 in 2 ms 7*** 8*** NOTE: fuzzing was not performed, you have only 9*** executed the target code on a fixed set of inputs. 10***
pablomartin4btc force-pushed on Mar 31, 2023pablomartin4btc commented at 5:39 am on March 31, 2023: memberUpdated changes:
- Fixed fuzz test failing.
Since the replySent parameter of the HTTPRequest constructor is used only from the tests, I would suggest to remove it and adjust the test as needed. @vasild I left open the possibility of passing the
replySent
arg astrue
, this way we can “build” a “safer” “invalid” request as @stickies-v mentioned above, so we can test the almost entire protocol ofHTTPRequest
if that was originally the intention of this fuzz. Then, as part of the above, ifreplySent
is passed astrue
, the constructor won’t call thelibevent
functions in there but asm_uri_parsed
is null (in a real use case we won’t see this as the exception will be thrown), we need to guard the free ofm_uri_parsed
at thehttpserver
destructor otherwise produces a segfault.I’ve checked another option that would be to create a mock
HTTPRequest
but I had to create another constructor to avoid calling the default and I preferred not to touch theHTTPRequest
class. I think leaving thereplySent
arg to be optionally passed on the constructor and the fuzz test behaviour as the original commit would be ok for now unless there are other objections.pablomartin4btc force-pushed on Apr 4, 2023pablomartin4btc force-pushed on Apr 4, 2023in src/httpserver.cpp:591 in 4f68c30b2d outdated
587@@ -563,8 +588,7 @@ std::pair<bool, std::string> HTTPRequest::GetHeader(const std::string& hdr) cons 588 std::string HTTPRequest::ReadBody() 589 { 590 struct evbuffer* buf = evhttp_request_get_input_buffer(req); 591- if (!buf) 592- return ""; 593+ if (!buf) return "";
vasild commented at 5:16 pm on April 7, 2023:Remove this unrelated whitespace change.
pablomartin4btc commented at 1:36 pm on April 10, 2023:yes, that was something I touched while I was verifying thereq
class member.in src/test/httpserver_tests.cpp:21 in 4f68c30b2d outdated
17-BOOST_AUTO_TEST_CASE(test_query_parameters) 18+std::optional<std::string> GetParameterFromURI(const std::string_view uri, const std::string& key){ 19+ auto uri_parsed{evhttp_uri_parse(uri.data())}; 20+ if (uri_parsed == nullptr) return std::nullopt; 21+ auto param_value{GetQueryParameterFromUri(*uri_parsed, key)}; 22+ if (uri_parsed) evhttp_uri_free(uri_parsed);
vasild commented at 5:19 pm on April 7, 2023:This will always betrue
because ifuri_parsed
isnullptr
we would have returned 2 lines above.
pablomartin4btc commented at 1:38 pm on April 10, 2023:I’ll change it. Thanks.vasild approvedvasild commented at 5:24 pm on April 7, 2023: contributorACK 4f68c30b2d52f39c7c7465d49ba41c2e70f486ba
Would be nice to split the actual fix from the other changes. I guess it would be easier for other reviewers too.
I would have liked to remove the
replySent
argument, but that’s secondary to this PR which fixes a serious bug. Let’s get the bug fixed and if I feel strong enough about it, then maybe I will open a followup to remove that parameter.DrahtBot requested review from stickies-v on Apr 7, 2023pablomartin4btc commented at 10:29 am on April 11, 2023: memberWould be nice to split the actual fix from the other changes. I guess it would be easier for other reviewers too.
I’m working on it.
I would have liked to remove the
replySent
argument, but that’s secondary to this PR which fixes a serious bug. Let’s get the bug fixed and if I feel strong enough about it, then maybe I will open a followup to remove that parameter.I agree, later on a follow-up, I’d try to investigate if
evhttp_request* req
can be mocked somehow so the fuzz can go thru all the buffering behaviour and other stuff once thereplySent
argument is removed.vasild commented at 12:42 pm on April 11, 2023: contributorI’d try to investigate if
evhttp_request* req
can be mocked somehow so the fuzz can go thru all the buffering behaviour and other stuff once the replySent argument is removed.The fuzz will go through this:
if the
HTTPRequest
object is created successfully (oncereplySent
is removed and the test doestry/catch
). If the input is such thatHTTPRequest:HTTPRequest()
throws, then it does not have to test the above because it will never be executed in the real code.pablomartin4btc commented at 1:07 pm on April 12, 2023: memberif the
HTTPRequest
object is created successfully (oncereplySent
is removed and the test doestry/catch
). If the input is such thatHTTPRequest:HTTPRequest()
throws, then it does not have to test the above because it will never be executed in the real code.I’m sorry, I said nonsense, ofc that code won’t be executed because the
uri_parsed
will be null and so on, my concern is that we’ll lose the test coverage by the currentfuzz
, even it’s not a real case scenario, the test is performing ’exhaustive’ calls to specific protocol (not sure how ’exhaustive’ because, on the other hand, the internal buffer in theHTTPRequest
instance is empty while testing).pablomartin4btc force-pushed on Apr 14, 2023pablomartin4btc force-pushed on Apr 14, 2023pablomartin4btc force-pushed on Apr 14, 2023pablomartin4btc force-pushed on Apr 14, 2023pablomartin4btc commented at 1:30 am on April 14, 2023: memberUpdated changes:
- Split commit in 2 (as suggested by @stickies-v and @vasild): one for the fix and one for the logic improvement.
- Added some comments to each commit (as suggested by @stickies-v).
in src/httpserver.cpp:683 in 0f9cba3f07 outdated
679 680 std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key) 681 { 682 evhttp_uri* uri_parsed{evhttp_uri_parse(uri)}; 683+ if (!uri_parsed) { 684+ throw std::runtime_error(strprintf("Invalid uri detected while parsing \"%s\"", uri));
stickies-v commented at 9:59 am on April 14, 2023:uri
is unhandled user input so including that in the error seems dangerous to me, would just change it to0 throw std::runtime_error("URI parsing failed, it likely contained RFC 3986 invalid characters);
vasild commented at 4:54 pm on April 14, 2023:Or useSanitizeString(uri)
.
stickies-v commented at 4:58 pm on April 14, 2023:That wouldn’t prevent someone from passing really long URIs, causing e.g. bloated logs. We’re already (debug) logging the full URI inhttp_request_cb
. Adding it here introduces risks and little value imo.
pablomartin4btc commented at 5:33 pm on April 14, 2023:Yeah, it’s a clearer message. Thanks.in src/httpserver.cpp:676 in 0f9cba3f07 outdated
672+ return GetQueryParameterFromUri(uri, key); 673+ } catch (const std::runtime_error& e) { 674+ LogPrint(BCLog::HTTP, "%s\n", e.what()); 675+ // add additional information and rethrow the exception as we can't send a reply from the worker 676+ throw std::runtime_error(strprintf("Error occurred trying to get query param key \"%s\": ", key) + e.what()); 677+ }
stickies-v commented at 4:19 pm on April 14, 2023:What’s the benefit of re-handling this error? The error message from
GetQueryParameterFromUri
is enough (or could be updated there to reference query param fetching, but it doesn’t really have anything to do with the URI being invalid, so I’d prefer leaving it out).0diff --git a/src/httpserver.cpp b/src/httpserver.cpp 1index eb1f0bd93..3f0d82497 100644 2--- a/src/httpserver.cpp 3+++ b/src/httpserver.cpp 4@@ -667,13 +667,7 @@ std::optional<std::string> HTTPRequest::GetQueryParameter(const std::string& key 5 { 6 const char* uri{evhttp_request_get_uri(req)}; 7 8- try { 9- return GetQueryParameterFromUri(uri, key); 10- } catch (const std::runtime_error& e) { 11- LogPrint(BCLog::HTTP, "%s\n", e.what()); 12- // add additional information and rethrow the exception as we can't send a reply from the worker 13- throw std::runtime_error(strprintf("Error occurred trying to get query param key \"%s\": ", key) + e.what()); 14- } 15+ return GetQueryParameterFromUri(uri, key); 16 } 17 18 std::optional<std::string> GetQueryParameterFromUri(const char* uri, const std::string& key) 19diff --git a/test/functional/interface_rest.py b/test/functional/interface_rest.py 20index fa6193039..fb647a62f 100755 21--- a/test/functional/interface_rest.py 22+++ b/test/functional/interface_rest.py 23@@ -282,7 +282,7 @@ class RESTTest (BitcoinTestFramework): 24 25 # Check invalid uri (% symbol at the end of the request) 26 resp = self.test_rest_request(f"/headers/{bb_hash}%", ret_type=RetType.OBJ, status=400) 27- assert_equal(resp.read().decode('utf-8').rstrip(), f"Error occurred trying to get query param key \"count\": Invalid uri detected while parsing \"/rest/headers/{bb_hash}%.json\"") 28+ assert_equal(resp.read().decode('utf-8').rstrip(), f"Invalid uri detected while parsing \"/rest/headers/{bb_hash}%.json\"") 29 30 # Compare with normal RPC block response 31 rpc_block_json = self.nodes[0].getblock(bb_hash)
pablomartin4btc commented at 5:42 pm on April 14, 2023:Yeah, the intention was to add the param key on top of the previous error, adding more details at this level but that info is also available from the caller ofGetQueryParameterFromUri
. Ok, I’ll leave it out.in src/test/httpserver_tests.cpp:38 in 0f9cba3f07 outdated
33@@ -34,5 +34,9 @@ BOOST_AUTO_TEST_CASE(test_query_parameters) 34 // Invalid query string syntax is the same as not having parameters 35 uri = "/rest/endpoint/someresource.json&p1=v1&p2=v2"; 36 BOOST_CHECK(!GetQueryParameterFromUri(uri.c_str(), "p1").has_value()); 37+ 38+ // Invalid query string with invalid will raise a runtime error no matter how many parameters passed or their syntax
stickies-v commented at 4:23 pm on April 14, 2023:I’d rephrase to the below - any invalid character in the URI will trigger this, not just in the query string.
0 // URI with invalid characters (%) raises a runtime error regardless of which query parameter is queried
stickies-v commented at 4:41 pm on April 14, 2023: contributorLeft some comments on the bugfix commit. I’d carve that out in a separate PR to get it merged for v25.
Also, suggested rephrasing for the commit message:
0bugfix: rest: avoid segfault for invalid URI 1 2`evhttp_uri_parse` can return a nullptr, for example when the URI 3contains invalid characters (e.g. "%"). 4`GetQueryParameterFromUri` passes the output of `evhttp_uri_parse` 5straight into `evhttp_uri_get_query`, which means that anyone calling 6a REST endpoint in which query parameters are used (e.g. `rest_headers`) 7can cause a segfault. 8 9This bugfix is designed to be minimal and without additional behaviour change. 10Follow-up work should be done to resolve this in a more general and robust way, 11so not every endpoint has to handle it individually.
vasild approvedvasild commented at 4:59 pm on April 14, 2023: contributorACK 9b9c48cf81549c7eacfbfb78534b7eaba46ecb34 modulo #27253 (review)
I agree with suggestions from @stickies-v, would be happy to re-review.
DrahtBot requested review from stickies-v on Apr 14, 2023pablomartin4btc commented at 5:34 pm on April 14, 2023: memberLeft some comments on the bugfix commit. I’d carve that out in a separate PR to get it merged for v25.
I’ll create a new PR linked to this.
Also, suggested rephrasing for the commit message:
Sure, I’ll take your suggestion, thanks.
pablomartin4btc commented at 5:35 pm on April 14, 2023: memberI agree with suggestions from @stickies-v, would be happy to re-review.
I’ll let you both know ones the new PR is ready, very soon. Thanks.
pablomartin4btc commented at 11:52 pm on April 14, 2023: memberUpdated changes:
- Created new PR #27468 with a minimal bugfix and single commit in order to merge it faster for 25.0 release, as suggested above by @stickies-v and supported by @vasild.
- The new PR contains also the suggested changes from @stickies-v.
- I’ll create another PR for the 2nd commit as a follow-up.
fanquake removed this from the milestone 25.0 on Apr 15, 2023fanquake referenced this in commit e054b7390c on Apr 17, 2023sidhujag referenced this in commit 200330a67d on Apr 17, 2023DrahtBot added the label Needs rebase on Apr 17, 2023pablomartin4btc force-pushed on Apr 23, 2023pablomartin4btc commented at 1:59 pm on April 23, 2023: memberUpdated changes:
- First commit of this PR was already merged into 25.0 release branch (#27468).
- Rebased & incorporated some nits & suggestions from latest reviews including above PR.
- Updated original description of this PR.
DrahtBot added the label CI failed on Apr 23, 2023DrahtBot removed the label Needs rebase on Apr 23, 2023maflcko commented at 8:42 am on April 24, 2023: memberFuzzing segfaults, see CIpablomartin4btc commented at 10:30 am on April 24, 2023: memberFuzzing segfaults, see CI
Saw it yesterday, working on it, thanks.
pablomartin4btc force-pushed on Apr 29, 2023pablomartin4btc force-pushed on Apr 29, 2023DrahtBot removed the label CI failed on Apr 29, 2023pablomartin4btc commented at 7:33 am on April 29, 2023: member- Fixed
http_request
fuzz failure and added some relevant comments in the code.
pablomartin4btc force-pushed on Apr 29, 2023pablomartin4btc commented at 11:08 am on April 29, 2023: member- Updated 2nd commit message.
stickies-v commented at 11:32 am on May 11, 2023: contributorWith #27468 merged, I don’t really see the need for 8ef940d39bfa7bc4ba867b70495b68c507297694 anymore? Also the PR title and description need updating, this is not a bugfix PR anymore.pablomartin4btc force-pushed on May 11, 2023pablomartin4btc force-pushed on May 11, 2023pablomartin4btc renamed this:
httpserver, rest: fix segmentation fault on evhttp_uri_get_query
httpserver, rest: segfault bugfix (#27468) logic improvement
on May 11, 2023pablomartin4btc renamed this:
httpserver, rest: segfault bugfix (#27468) logic improvement
httpserver, rest: improving URI validation
on May 11, 2023pablomartin4btc force-pushed on May 11, 2023pablomartin4btc commented at 12:47 pm on May 11, 2023: memberDrahtBot added the label CI failed on May 11, 2023pablomartin4btc force-pushed on May 14, 2023DrahtBot removed the label CI failed on May 14, 2023in src/httpserver.cpp:674 in c72967f48a outdated
668@@ -643,9 +669,9 @@ CService HTTPRequest::GetPeer() const 669 return peer; 670 } 671 672-std::string HTTPRequest::GetURI() const 673+const char* HTTPRequest::GetURI() const 674 { 675- return evhttp_request_get_uri(req); 676+ return m_uri;
stickies-v commented at 2:44 pm on May 17, 2023:Since docs guarantee that GetURI() returns non-nullptr, would add an Assert here
0 return Assert(m_uri);
pablomartin4btc commented at 2:26 pm on May 27, 2023:Ok, done.in src/httpserver.h:80 in c72967f48a outdated
76@@ -71,9 +77,9 @@ class HTTPRequest 77 PUT 78 }; 79 80- /** Get requested URI. 81+ /** Get requested URI. Will always return non-nullptr.
stickies-v commented at 10:39 am on May 18, 2023:nit: clearer language
0 /** Get requested URI. Will never return nullptr.
pablomartin4btc commented at 2:26 pm on May 27, 2023:Yeah and also it’ll be more consistent with the comments above in the private declaration section. https://github.com/bitcoin/bitcoin/blob/c72967f48ab9eb02b8dc9231fdcc6ee85d66b7b9/src/httpserver.h#L60-L63in src/httpserver.cpp:550 in c72967f48a outdated
543@@ -536,18 +544,36 @@ void HTTPEvent::trigger(struct timeval* tv) 544 else 545 evtimer_add(ev, tv); // trigger after timeval passed 546 } 547-HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) : req(_req), replySent(_replySent) 548+HTTPRequest::HTTPRequest(struct evhttp_request* _req, bool _replySent) : 549+ req{_req}, 550+ m_uri{req != nullptr ? evhttp_request_get_uri(req) : nullptr}, 551+ m_uri_parsed{m_uri != nullptr && m_uri[0] != '\0' ? evhttp_uri_parse(m_uri) : nullptr},
stickies-v commented at 11:05 am on May 18, 2023:Is there a reason we need to check form_uri
not being empty (&& m_uri[0] != '\0'
)? Doesn’t seem to be a documented requirement forevhttp_uri_parse
.
pablomartin4btc commented at 2:25 pm on May 27, 2023:Well, I think that an empty string wouldn’t be a valid URI, at least thescheme component
(the first part of it which is followed by a “:”, which is again followed by thehier-part
component (which may be empty) must be present according to the RFC. I thought it was perhaps validated inside theevhttp_uri_parse
but I couldn’t find it there. I’m happy to remove it if you think it’s not worth it.
stickies-v commented at 12:03 pm on May 30, 2023:We don’t need
m_uri
to be a valid URI, that’s whatevhttp_uri_parse
is checking, we just need it to not behave unexpectedly (eg. segfault) when passed toevhttp_uri_parse
. “a” also would not be a valid URI, but we’re not checking that here either, that would just be duplicating whatevhttp_uri_parse
is already doing.So unless there’s another reason to not pass an empty char array to
evhttp_uri_parse
, I’d leave it out yeah.
pablomartin4btc commented at 3:59 pm on May 30, 2023:ok
pablomartin4btc commented at 5:16 pm on June 17, 2023:done.in src/httpserver.h:82 in c72967f48a outdated
76@@ -71,9 +77,9 @@ class HTTPRequest 77 PUT 78 }; 79 80- /** Get requested URI. 81+ /** Get requested URI. Will always return non-nullptr. 82 */ 83- std::string GetURI() const; 84+ const char* GetURI() const;
stickies-v commented at 11:12 am on May 18, 2023:What’s the rationale for this change? No callsite seems to be preferring a C-style string, and returning astd::string
would also make the non-nullptr promise explicit.
pablomartin4btc commented at 2:25 pm on May 27, 2023:I can’t remember tbh, try to find it in the history with no success, not sure if I did it due to matching the consumers(?) or @vasild suggested it at some point…stickies-v commented at 5:43 pm on May 18, 2023: contributorI would have liked to remove the
replySent
argument, but that’s secondary to this PR which fixes a serious bug.Me too, especially since this is now no longer a bugfix PR. I think the workarounds that need to introduced in this PR make things unnecessarily messy, and updating the fuzzer as e.g. per your suggestion here look like a reasonable approach for this PR to me.
pablomartin4btc force-pushed on May 27, 2023pablomartin4btc commented at 2:51 pm on May 27, 2023: memberUpdates:
- Incorporated most of @stickies-v’s suggestions.
- I’ll remove the
replySent
argument from thehttpserver
constructor (as @vasild suggested and @stickies-v agrees with), and update thefuzz
test accordingly. I can squash it later into 1 if you like.
DrahtBot added the label CI failed on May 27, 2023pablomartin4btc force-pushed on May 28, 2023DrahtBot removed the label CI failed on May 28, 2023DrahtBot added the label CI failed on May 29, 2023pablomartin4btc force-pushed on May 29, 2023pablomartin4btc commented at 5:44 pm on May 29, 2023: memberUpdates:
- Working on
fuzzer
failure.
in src/test/fuzz/http_request.cpp:66 in ca8941100d outdated
76+ assert(service.ToStringAddrPort() == "[::]:0"); 77 78- evbuffer_free(evbuf); 79- evhttp_request_free(evreq); 80+ evbuffer_free(evbuf); 81+ evhttp_request_free(evreq);
stickies-v commented at 12:15 pm on May 30, 2023:Don’t we need to clean this up regardless of whether an exception was thrown?
pablomartin4btc commented at 12:44 pm on May 30, 2023:Let me check again when I do revert the order of the commits, when I tested before those were needed as the fuzz was complaining about memory leaks.
pablomartin4btc commented at 8:59 am on June 5, 2023:Those are needed, I’ve checked the test code line by line. I’ll do another check while I fix the current issue/ failure. Thanks!stickies-v commented at 12:15 pm on May 30, 2023: contributorI’m in favour of doing the
replySent
refactoring in a separate commit, but it makes more sense to have that be the first commit imo, to avoid adding logic and then removing it in the very next commit.Edit: and please also include the rationale for removing
replySent
from the constructor in the commit message?pablomartin4btc commented at 12:41 pm on May 30, 2023: memberI’m in favour of doing the
replySent
refactoring in a separate commit, but it makes more sense to have that be the first commit imo, to avoid adding logic and then removing it in the very next commit.Yeah, true, I agree… let me revert the order of the commits.
pablomartin4btc commented at 8:57 am on June 5, 2023: memberUpdates:
- I’ve found the issue with the
fuzz
test I’ll push a fix with the commits in reverse order.
pablomartin4btc force-pushed on Jun 5, 2023pablomartin4btc force-pushed on Jun 5, 2023pablomartin4btc force-pushed on Jun 5, 2023pablomartin4btc commented at 11:32 am on June 5, 2023: memberUpdates:
- It seems I broke some functional tests with the
WriteReply
function “fix”, ha, working on it.
pablomartin4btc force-pushed on Jun 5, 2023pablomartin4btc force-pushed on Jun 6, 2023pablomartin4btc force-pushed on Jun 6, 2023pablomartin4btc force-pushed on Jun 6, 2023pablomartin4btc force-pushed on Jun 9, 2023pablomartin4btc force-pushed on Jun 9, 2023pablomartin4btc force-pushed on Jun 9, 2023pablomartin4btc force-pushed on Jun 9, 2023pablomartin4btc force-pushed on Jun 13, 2023DrahtBot removed the label CI failed on Jun 13, 2023pablomartin4btc force-pushed on Jun 13, 2023pablomartin4btc force-pushed on Jun 13, 2023DrahtBot added the label CI failed on Jun 13, 2023pablomartin4btc force-pushed on Jun 13, 2023pablomartin4btc force-pushed on Jun 14, 2023DrahtBot removed the label CI failed on Jun 14, 2023DrahtBot added the label CI failed on Jun 16, 2023DrahtBot removed the label CI failed on Jun 17, 2023pablomartin4btc commented at 5:37 pm on June 17, 2023: memberUpdates:
- Reverted order of the commits (now the very first is the
replySent
arg removal then secondly the URI validation enhancement approach) as suggested by @stickies-v. replySent
arg removed fromhttprequest
obj constructor in first commit (rationale of the removal added into commits message). In order to do the removal, first I had to fix the underlying issue onWriteReply
which was corrected also in the first commit (I didn’t want to split it in a different commit since it was going to be like an isolated change and in order to test that I would have to test thelibevent
and I’m not too sure if what we want to do here regardin the scope of this PR and so on, happy to extend it and follow up in a different PR).- Went thru previous comments, reviews and suggestions to see if I’ve missed any nit or anything since this PR changed quite a bit, found this: simple getter best to be implemented in the header by @stickies-v, which I’ve incorporated in the 2nd, commit.
in src/httpserver.cpp:549 in 68bc89bf9d outdated
545+ req{_req}, 546+ m_uri{req != nullptr ? evhttp_request_get_uri(req) : nullptr}, 547+ m_uri_parsed{m_uri != nullptr ? evhttp_uri_parse(m_uri) : nullptr} 548 { 549+ if (m_uri_parsed == nullptr) { 550+ const char* err{"URI parsing failed, it likely contained RFC 3986 invalid characters"};
stickies-v commented at 1:54 pm on June 26, 2023:nit: no need to use a C-style string here I think
0 const std::string err{"URI parsing failed, it likely contained RFC 3986 invalid characters"};
in src/httpserver.cpp:20 in 68bc89bf9d outdated
16@@ -17,6 +17,7 @@ 17 #include <rpc/protocol.h> // For HTTP status codes 18 #include <shutdown.h> 19 #include <sync.h> 20+#include <util/check.h>
stickies-v commented at 1:48 pm on June 27, 2023:Do we need this?
pablomartin4btc commented at 3:53 pm on June 30, 2023:Not anymore, there was anAssume()
somewhere inHTTPRequest::GetQueryParameter
checking form_uri_parsed
at some point. Thanks for checking, I’ll remove it.in src/httpserver.cpp:243 in 68bc89bf9d outdated
239+ try { 240+ hreq = std::make_unique<HTTPRequest>(req); 241+ } catch (const std::runtime_error& e) { 242+ LogPrint(BCLog::HTTP, "%s\n", e.what()); 243+ return; 244+ }
stickies-v commented at 10:50 am on June 28, 2023:I don’t think it’s great that we’re not doing the address-based checks before anything else. Since we’re now adding validation into the constructor, how about we do it consistently and do it for all validation, so we can also control what we validate first? I’m still feeling a bit uneasy about coupling the construction of
HTTPRequest
with validation, but if we’re doing it we might as well do it more consistently? Curious to hear @vasild’s thoughts on this too.0diff --git a/src/httpserver.cpp b/src/httpserver.cpp 1index f7dc344f4..10f8ec24d 100644 2--- a/src/httpserver.cpp 3+++ b/src/httpserver.cpp 4@@ -235,6 +235,7 @@ static void http_request_cb(struct evhttp_request* req, void* arg) 5 } 6 std::unique_ptr<HTTPRequest> hreq; 7 8+ // Initializing the HTTPRequest also performs basic validation such as address, method and URI checks 9 try { 10 hreq = std::make_unique<HTTPRequest>(req); 11 } catch (const std::runtime_error& e) { 12@@ -242,22 +243,6 @@ static void http_request_cb(struct evhttp_request* req, void* arg) 13 return; 14 } 15 16- // Early address-based allow check 17- if (!ClientAllowed(hreq->GetPeer())) { 18- LogPrint(BCLog::HTTP, "HTTP request from %s rejected: Client network is not allowed RPC access\n", 19- hreq->GetPeer().ToStringAddrPort()); 20- hreq->WriteReply(HTTP_FORBIDDEN); 21- return; 22- } 23- 24- // Early reject unknown HTTP methods 25- if (hreq->GetRequestMethod() == HTTPRequest::UNKNOWN) { 26- LogPrint(BCLog::HTTP, "HTTP request from %s rejected: Unknown HTTP request method\n", 27- hreq->GetPeer().ToStringAddrPort()); 28- hreq->WriteReply(HTTP_BAD_METHOD); 29- return; 30- } 31- 32 const std::string strURI{hreq->GetURI()}; 33 LogPrint(BCLog::HTTP, "Received a %s request for %s from %s\n", 34 RequestMethodString(hreq->GetRequestMethod()), SanitizeString(strURI, SAFE_CHARS_URI).substr(0, 100), hreq->GetPeer().ToStringAddrPort()); 35@@ -545,13 +530,7 @@ HTTPRequest::HTTPRequest(struct evhttp_request* _req) : 36 m_uri{req != nullptr ? evhttp_request_get_uri(req) : nullptr}, 37 m_uri_parsed{m_uri != nullptr ? evhttp_uri_parse(m_uri) : nullptr} 38 { 39- if (m_uri_parsed == nullptr) { 40- const char* err{"URI parsing failed, it likely contained RFC 3986 invalid characters"}; 41- const std::string peerAddress{GetPeer().ToStringAddr()}; 42- WriteReply(HTTP_BAD_REQUEST, err); 43- throw std::runtime_error( 44- strprintf("HTTP request from %s rejected: %s", peerAddress, err)); 45- } 46+ Validate(); 47 } 48 49 HTTPRequest::~HTTPRequest() 50@@ -565,6 +544,29 @@ HTTPRequest::~HTTPRequest() 51 evhttp_uri_free(m_uri_parsed); 52 } 53 54+void HTTPRequest::Validate() 55+{ 56+ const std::string rejected{strprintf("HTTP request from %s rejected: ", GetPeer().ToStringAddrPort())}; 57+ // Early address-based allow check 58+ if (!ClientAllowed(GetPeer())) { 59+ WriteReply(HTTP_FORBIDDEN); 60+ throw std::runtime_error(rejected + "Client network is not allowed RPC access"); 61+ } 62+ 63+ // Early reject unknown HTTP methods 64+ if (GetRequestMethod() == HTTPRequest::UNKNOWN) { 65+ WriteReply(HTTP_BAD_METHOD); 66+ throw std::runtime_error(rejected + "Unknown HTTP request method"); 67+ } 68+ 69+ // Ensure valid URI 70+ if (m_uri_parsed == nullptr) { 71+ const std::string err{"URI parsing failed, it likely contained RFC 3986 invalid characters"}; 72+ WriteReply(HTTP_BAD_REQUEST, err); 73+ throw std::runtime_error(rejected + err); 74+ } 75+} 76+ 77 std::pair<bool, std::string> HTTPRequest::GetHeader(const std::string& hdr) const 78 { 79 const struct evkeyvalq* headers = evhttp_request_get_input_headers(req); 80diff --git a/src/httpserver.h b/src/httpserver.h 81index 77a249a6a..3c5af5b0b 100644 82--- a/src/httpserver.h 83+++ b/src/httpserver.h 84@@ -64,6 +64,9 @@ private: 85 /// The parsed URI, as returned by `evhttp_uri_parse()`. Will never be `nullptr`. 86 evhttp_uri* const m_uri_parsed; 87 88+ // Perform early validity checks. Send a response and throw std::runtime_error if invalid. 89+ void Validate(); 90+ 91 public: 92 explicit HTTPRequest(struct evhttp_request* req); 93 ~HTTPRequest();
pablomartin4btc commented at 6:02 pm on June 30, 2023:I understand your worries on coupling the validation within the constructor (separation of/ different concerns, error handling, flexibility, etc) but since those validations/ “early checks” also would impact into the state of the object itself it would make sense to me to put them together, consistently and cleaner. Happy to make the change if @vasild also agree with it.DrahtBot added the label Needs rebase on Jun 28, 2023pablomartin4btc force-pushed on Jul 3, 2023pablomartin4btc force-pushed on Jul 3, 2023pablomartin4btc force-pushed on Jul 3, 2023httpserver: removing replySent arg from constructor
Removing what it was originally a hack/ workaround only for fuzz test 'http_request', no other client/ consumer was using it, to deal with an underlying buffering issue with WriteReply being called from the destructor, which was also fixed here not allowing buffering when conn is not valid.
http, rest: URI validation logic improvement
Non-functional optimisation: moving the logic up in the stack, specifically into the constructor of the HTTPRequest object, this way we catch the invalid uri earlier, removing the exception handling from the rest enpoints as we catch the exception raised from the constructor within the request call back function where we return an http bad request to consumers.
pablomartin4btc force-pushed on Jul 3, 2023pablomartin4btc commented at 4:32 pm on July 3, 2023: memberUpdates:
- Rebased
- Implemented refactoring suggestion from @stickies-v, moving other http reject validations into the constructor and regrouping them under a
Validate()
function to make the entire validation more consistent.
DrahtBot removed the label Needs rebase on Jul 3, 2023DrahtBot added the label CI failed on Jul 3, 2023pablomartin4btc commented at 5:48 pm on July 3, 2023: memberUpdates:
- Checking fuzz test failure
pablomartin4btc marked this as a draft on Jul 4, 2023pablomartin4btc commented at 4:53 pm on July 4, 2023: memberAfter a chat with @stickies-v where we were discussing different approaches on this enhancement and other details regarding fuzz testing, libevent functionality and each commit intention, I’ve decided to put this onto draft. Firstly we would need to define the purpose of thehttprequest
object, do we want/ need thehttprequest
obj to exist even with an invalid request?, as @stickies-v raised his concerns before, it seems coupling the validation within the constructor perhaps is not the best approach after all. I’ll check the different alternatives of moving the validation out from the constructor and perhaps throwing the error later in the callback, also maybe I’ll split the commits in different PRs once I revisit the fuzz test and logic perform some libevent tests.DrahtBot commented at 9:22 am on October 5, 2023: contributorAre you still working on this?pablomartin4btc commented at 11:44 am on October 5, 2023: memberAre you still working on this?
I do
DrahtBot added the label Needs rebase on Dec 14, 2023DrahtBot commented at 9:53 pm on December 14, 2023: contributor🐙 This pull request conflicts with the target branch and needs rebase.
DrahtBot commented at 10:42 am on January 5, 2024: contributorThere hasn’t been much activity lately and the patch still needs rebase. What is the status here?
- Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
- Is it no longer relevant? ➡️ Please close.
- Did the author lose interest or time to work on this? ➡️ Please close it and mark it ‘Up for grabs’ with the label, so that it can be picked up in the future.
fanquake commented at 2:23 pm on March 6, 2024: memberClosing for now. @pablomartin4btc feel free to ping for a re-open when you are actively working on this again.fanquake closed this on Mar 6, 2024
github-metadata-mirror
This is a metadata mirror of the GitHub repository bitcoin/bitcoin. This site is not affiliated with GitHub. Content is generated from a GitHub metadata backup.
generated: 2024-10-30 03:12 UTC
This site is hosted by @0xB10C
More mirrored repositories can be found on mirror.b10c.me