Skip to content

Conversation

@pablomartin4btc
Copy link
Member

@pablomartin4btc pablomartin4btc commented Mar 14, 2023

This PR contains an isolated enhancement of the segfault bugfix #27468 (already merged), improving the way we handle the URI validation, which will be performed only once instead of on each query parameter of a rest service endpoint.

@DrahtBot
Copy link
Contributor

DrahtBot commented Mar 14, 2023

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Reviews

See the guideline for information on the review process.

Type Reviewers
Concept ACK brunoerg
Stale ACK stickies-v, vasild

If your review is incorrectly listed, please react with 👎 to this comment and the bot will ignore it on the next update.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #28051 (Get rid of shutdown.cpp/shutdown.h, use SignalInterrupt directly by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@pablomartin4btc pablomartin4btc force-pushed the http-rest-fix-segfault branch 6 times, most recently from 50e18c0 to ec14b3c Compare March 14, 2023 07:00
@fanquake fanquake requested a review from stickies-v March 14, 2023 07:28
Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK, nice catch.

As for the approach, I think there is a simpler one that requires no overhead for every endpoint using GetQueryParameter(). We could move uri_parsed to a private HTTPRequest member instead of a local GetQueryParameterFromUri variable, and have http_request_cb set it (for example after checking unknown HTTP methods) - or return an early 400 if null. Subsequently, we can assume that uri_parsed is valid and non-null, simplifying the rest code (and avoiding multiple evhttp_uri_parse calls when accessing multiple query params). What do you think?

Copy link
Contributor

@brunoerg brunoerg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@pablomartin4btc
Copy link
Member Author

What do you think?

As @stickies-v explained to me outside this PR: "... by moving the uri parsing to http_request_cb we're failing super early, and even when the query parameter would never be called by the endpoint... ", this makes more sense.

I'm working on this approach which I think it's better as well.

@pablomartin4btc pablomartin4btc force-pushed the http-rest-fix-segfault branch 2 times, most recently from 02cbb58 to 213554c Compare March 16, 2023 17:51
@pablomartin4btc
Copy link
Member Author

Updated changes:

  • move the validation/parsing logic up in the stack as soon as the request is initiated even before the work item thread is created
  • updated unit and functional tests

@pablomartin4btc pablomartin4btc force-pushed the http-rest-fix-segfault branch from 213554c to 7318e23 Compare March 16, 2023 19:56
@fanquake
Copy link
Member

fanquake commented Mar 16, 2023

https://cirrus-ci.com/task/4643666984173568:

 test  2023-03-16T20:06:03.567000Z TestFramework (ERROR): Unexpected exception caught during testing 
                                   Traceback (most recent call last):
                                     File "/private/var/folders/v7/fs2b0v3s0lz1n57gj9y4xb5m0000gn/T/cirrus-ci-build/ci/scratch/build/bitcoin-arm64-apple-darwin/test/functional/test_framework/test_framework.py", line 132, in main
                                       self.run_test()
                                     File "/private/var/folders/v7/fs2b0v3s0lz1n57gj9y4xb5m0000gn/T/cirrus-ci-build/ci/scratch/build/bitcoin-arm64-apple-darwin/test/functional/interface_rest.py", line 372, in run_test
                                       resp = self.test_rest_request("/mempool/contents", ret_type=RetType.OBJ, status=400, query_params={"verbose": "true", "mempool_sequence": "true"})
                                     File "/private/var/folders/v7/fs2b0v3s0lz1n57gj9y4xb5m0000gn/T/cirrus-ci-build/ci/scratch/build/bitcoin-arm64-apple-darwin/test/functional/interface_rest.py", line 86, in test_rest_request
                                       assert_equal(resp.status, status.value)
                                   AttributeError: 'int' object has no attribute 'value'

@pablomartin4btc
Copy link
Member Author

pablomartin4btc commented Mar 16, 2023

@fanquake that's something perhaps I've introduced with my last push, I'm on it, thanks. Weird, it's not failing locally for me:

$ ./test/functional/interface_rest.py 
2023-03-16T21:14:11.025000Z TestFramework (INFO): PRNG seed is: 326626678636704986
2023-03-16T21:14:11.026000Z TestFramework (INFO): Initializing test directory /tmp/bitcoin_func_test_vwqauy2g
2023-03-16T21:14:11.812000Z TestFramework (INFO): Broadcast test transaction and sync nodes
2023-03-16T21:14:12.865000Z TestFramework (INFO): Test the /tx URI
2023-03-16T21:14:12.876000Z TestFramework (INFO): Query an unspent TXO using the /getutxos URI
2023-03-16T21:14:12.934000Z TestFramework (INFO): Query a spent TXO using the /getutxos URI
2023-03-16T21:14:12.935000Z TestFramework (INFO): Query two TXOs using the /getutxos URI
2023-03-16T21:14:12.938000Z TestFramework (INFO): Query the TXOs using the /getutxos URI with a binary response
2023-03-16T21:14:12.939000Z TestFramework (INFO): Test the /getutxos URI with and without /checkmempool
2023-03-16T21:14:14.065000Z TestFramework (INFO): Test the /block, /blockhashbyheight, /headers, and /blockfilterheaders URIs
2023-03-16T21:14:15.221000Z TestFramework (INFO): Test tx inclusion in the /mempool and /block URIs
2023-03-16T21:14:16.326000Z TestFramework (INFO): Test the /chaininfo URI
2023-03-16T21:14:16.334000Z TestFramework (INFO): Test compatibility of deprecated and newer endpoints
2023-03-16T21:14:16.341000Z TestFramework (INFO): Test the /deploymentinfo URI
2023-03-16T21:14:16.404000Z TestFramework (INFO): Stopping nodes
2023-03-16T21:14:16.611000Z TestFramework (INFO): Cleaning up /tmp/bitcoin_func_test_vwqauy2g on exit
2023-03-16T21:14:16.612000Z TestFramework (INFO): Tests successful

I think this was the issue: Python Enums were introduced in version 3.4, and the value attribute was added in version 3.6. I'm fixing it.

@pablomartin4btc pablomartin4btc force-pushed the http-rest-fix-segfault branch 3 times, most recently from 0115dcc to 9b0a0dc Compare March 20, 2023 16:18
@pablomartin4btc
Copy link
Member Author

Updated changes:

  • Updated the code with a new workaround to avoid failures on unit tests (httpserver_tests).

Thanks @stickies-v for your prompt response and your support on quick workarounds on issues raised.

Copy link
Contributor

@stickies-v stickies-v left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the commit message could benefit from additional information on the nature of the bug and how it's fixed

@pablomartin4btc pablomartin4btc force-pushed the http-rest-fix-segfault branch 5 times, most recently from 2bf34b7 to 36be160 Compare March 20, 2023 21:28
@pablomartin4btc
Copy link
Member Author

Updates:

  • Reverted order of the commits (now the very first is the replySentarg removal then secondly the URI validation enhancement approach) as suggested by @stickies-v.
  • replySentarg removed fromhttprequestobj constructor in first commit (rationale of the removal added into commits message).
    In order to do the removal, first I had to fix the underlying issue onWriteReplywhich was corrected also in the first commit (I didn't want to split it in a different commit since it was going to be like an isolated change and in order to test that I would have to test thelibeventand I'm not too sure if what we want to do here regardin the scope of this PR and so on, happy to extend it and follow up in a different PR).
  • Went thru previous comments, reviews and suggestions to see if I've missed any nit or anything since this PR changed quite a bit, found this: simple getter best to be implemented in the header by @stickies-v, which I've incorporated in the 2nd, commit.

Comment on lines 238 to 243
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's great that we're not doing the address-based checks before anything else. Since we're now adding validation into the constructor, how about we do it consistently and do it for all validation, so we can also control what we validate first? I'm still feeling a bit uneasy about coupling the construction of HTTPRequest with validation, but if we're doing it we might as well do it more consistently? Curious to hear @vasild's thoughts on this too.

git diff
diff --git a/src/httpserver.cpp b/src/httpserver.cpp
index f7dc344f4..10f8ec24d 100644
--- a/src/httpserver.cpp
+++ b/src/httpserver.cpp
@@ -235,6 +235,7 @@ static void http_request_cb(struct evhttp_request* req, void* arg)
     }
     std::unique_ptr<HTTPRequest> hreq;
 
+    // Initializing the HTTPRequest also performs basic validation such as address, method and URI checks
     try {
         hreq = std::make_unique<HTTPRequest>(req);
     } catch (const std::runtime_error& e) {
@@ -242,22 +243,6 @@ static void http_request_cb(struct evhttp_request* req, void* arg)
         return;
     }
 
-    // Early address-based allow check
-    if (!ClientAllowed(hreq->GetPeer())) {
-        LogPrint(BCLog::HTTP, "HTTP request from %s rejected: Client network is not allowed RPC access\n",
-                 hreq->GetPeer().ToStringAddrPort());
-        hreq->WriteReply(HTTP_FORBIDDEN);
-        return;
-    }
-
-    // Early reject unknown HTTP methods
-    if (hreq->GetRequestMethod() == HTTPRequest::UNKNOWN) {
-        LogPrint(BCLog::HTTP, "HTTP request from %s rejected: Unknown HTTP request method\n",
-                 hreq->GetPeer().ToStringAddrPort());
-        hreq->WriteReply(HTTP_BAD_METHOD);
-        return;
-    }
-
     const std::string strURI{hreq->GetURI()};
     LogPrint(BCLog::HTTP, "Received a %s request for %s from %s\n",
              RequestMethodString(hreq->GetRequestMethod()), SanitizeString(strURI, SAFE_CHARS_URI).substr(0, 100), hreq->GetPeer().ToStringAddrPort());
@@ -545,13 +530,7 @@ HTTPRequest::HTTPRequest(struct evhttp_request* _req) :
         m_uri{req != nullptr ? evhttp_request_get_uri(req) : nullptr},
         m_uri_parsed{m_uri != nullptr ? evhttp_uri_parse(m_uri) : nullptr}
 {
-    if (m_uri_parsed == nullptr) {
-        const char* err{"URI parsing failed, it likely contained RFC 3986 invalid characters"};
-        const std::string peerAddress{GetPeer().ToStringAddr()};
-        WriteReply(HTTP_BAD_REQUEST, err);
-        throw std::runtime_error(
-            strprintf("HTTP request from %s rejected: %s", peerAddress, err));
-    }
+    Validate();
 }
 
 HTTPRequest::~HTTPRequest()
@@ -565,6 +544,29 @@ HTTPRequest::~HTTPRequest()
     evhttp_uri_free(m_uri_parsed);
 }
 
+void HTTPRequest::Validate()
+{
+    const std::string rejected{strprintf("HTTP request from %s rejected: ", GetPeer().ToStringAddrPort())};
+    // Early address-based allow check
+    if (!ClientAllowed(GetPeer())) {
+        WriteReply(HTTP_FORBIDDEN);
+        throw std::runtime_error(rejected + "Client network is not allowed RPC access");
+    }
+
+    // Early reject unknown HTTP methods
+    if (GetRequestMethod() == HTTPRequest::UNKNOWN) {
+        WriteReply(HTTP_BAD_METHOD);
+        throw std::runtime_error(rejected + "Unknown HTTP request method");
+    }
+
+    // Ensure valid URI
+    if (m_uri_parsed == nullptr) {
+        const std::string err{"URI parsing failed, it likely contained RFC 3986 invalid characters"};
+        WriteReply(HTTP_BAD_REQUEST, err);
+        throw std::runtime_error(rejected + err);
+    }
+}
+
 std::pair<bool, std::string> HTTPRequest::GetHeader(const std::string& hdr) const
 {
     const struct evkeyvalq* headers = evhttp_request_get_input_headers(req);
diff --git a/src/httpserver.h b/src/httpserver.h
index 77a249a6a..3c5af5b0b 100644
--- a/src/httpserver.h
+++ b/src/httpserver.h
@@ -64,6 +64,9 @@ private:
     /// The parsed URI, as returned by `evhttp_uri_parse()`. Will never be `nullptr`.
     evhttp_uri* const m_uri_parsed;
 
+    // Perform early validity checks. Send a response and throw std::runtime_error if invalid.
+    void Validate();
+
 public:
     explicit HTTPRequest(struct evhttp_request* req);
     ~HTTPRequest();

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your worries on coupling the validation within the constructor (separation of/ different concerns, error handling, flexibility, etc) but since those validations/ "early checks" also would impact into the state of the object itself it would make sense to me to put them together, consistently and cleaner. Happy to make the change if @vasild also agree with it.

@pablomartin4btc pablomartin4btc force-pushed the http-rest-fix-segfault branch 3 times, most recently from f70a6b2 to c9c19c5 Compare July 3, 2023 16:07
Removing what it was originally a hack/ workaround only for
fuzz test 'http_request', no other client/ consumer was using it,
to deal with an underlying buffering issue with WriteReply being
called from the destructor, which was also fixed here not allowing
buffering when conn is not valid.
Non-functional optimisation: moving the logic up
in the stack, specifically into the constructor
of the HTTPRequest object, this way we catch the invalid uri earlier,
removing the exception handling from the rest enpoints as we catch
the exception raised from the constructor within the request
call back function where we return an http bad request to consumers.
@pablomartin4btc pablomartin4btc force-pushed the http-rest-fix-segfault branch from c9c19c5 to 31dd51c Compare July 3, 2023 16:19
@pablomartin4btc
Copy link
Member Author

Updates:

  • Rebased
  • Implemented refactoring suggestion from @stickies-v, moving other http reject validations into the constructor and regrouping them under a Validate() function to make the entire validation more consistent.

@pablomartin4btc
Copy link
Member Author

Updates:

  • Checking fuzz test failure

@pablomartin4btc pablomartin4btc marked this pull request as draft July 4, 2023 16:53
@pablomartin4btc
Copy link
Member Author

After a chat with @stickies-v where we were discussing different approaches on this enhancement and other details regarding fuzz testing, libevent functionality and each commit intention, I've decided to put this onto draft. Firstly we would need to define the purpose of the httprequest object, do we want/ need the httprequest obj to exist even with an invalid request?, as @stickies-v raised his concerns before, it seems coupling the validation within the constructor perhaps is not the best approach after all. I'll check the different alternatives of moving the validation out from the constructor and perhaps throwing the error later in the callback, also maybe I'll split the commits in different PRs once I revisit the fuzz test and logic perform some libevent tests.

@DrahtBot
Copy link
Contributor

DrahtBot commented Oct 5, 2023

Are you still working on this?

@pablomartin4btc
Copy link
Member Author

Are you still working on this?

I do

@DrahtBot
Copy link
Contributor

🐙 This pull request conflicts with the target branch and needs rebase.

@DrahtBot
Copy link
Contributor

DrahtBot commented Jan 5, 2024

There hasn't been much activity lately and the patch still needs rebase. What is the status here?

  • Is it still relevant? ➡️ Please solve the conflicts to make it ready for review and to ensure the CI passes.
  • Is it no longer relevant? ➡️ Please close.
  • Did the author lose interest or time to work on this? ➡️ Please close it and mark it 'Up for grabs' with the label, so that it can be picked up in the future.

@fanquake
Copy link
Member

fanquake commented Mar 6, 2024

Closing for now. @pablomartin4btc feel free to ping for a re-open when you are actively working on this again.

@fanquake fanquake closed this Mar 6, 2024
BOOST_CHECK(!QueryParameterExists("/rest/endpoint/someresource.json&p1=v1&p2=v2", "p1"));

// URI with invalid characters (%) raises a runtime error regardless of which query parameter is queried
uri = "/rest/endpoint/someresource.json&p1=v1&p2=v2%";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed this while reviewing #27468 (comments locked on that PR) for my own work on HTTPRequest -- this test vector doesn't have a ? so even if the URI didn't contain invalid characters it would still return null for the query parameter request. It doesn't really matter for the invalid-URI-throws-error test but it should matter for this follow up, part of which I'm trying to integrate.

So, my question is - should the test vector always have been:

uri = "/rest/endpoint/someresource.json?p1=v1&p2=v2%";

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything in your observation looks correct to me, yeah. It's fine in master but not in this PR.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!
I must have copied that URI (without the ?) from the previous check (// Invalid query string syntax is the same as not having parameters).
But for this test in this PR it doesn't really matter I think (we can have even 2, one with ? and one without it - or just clarify it in the comment to avoid confusion) as the URI would be invalid anyways (it includes invalid char %).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But for this test in this PR it doesn't really matter I think

I think it does, previously we had three states (value, no value, exception) which kinda accidentally still made the test work, but in this PR there are only two (value, no value). So this test does not actually test what it claims it does, because if QueryParameterExists were to be changed to allow invalid characters, this test would still pass because there aren't any query parameters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

previously we had three states (value, no value, exception) which kinda accidentally still made the test work

I think the URI validation was performed before checking the params, so the exception was raised and caught by the test?

// URI with invalid characters (%) raises a runtime error regardless of which query parameter is queried
uri = "/rest/endpoint/someresource.json&p1=v1&p2=v2%";
BOOST_CHECK_EXCEPTION(GetQueryParameterFromUri(uri.c_str(), "p1"), std::runtime_error, HasReason("URI parsing failed, it likely contained RFC 3986 invalid characters"));

So this test does not actually test what it claims it does,

// URI with invalid characters (%) won't return any values regardless of which query parameter is queried

Perhaps it should be clarified also for this incorrect query syntax case, but the entire PR should be revisited anyways.

if QueryParameterExists were to be changed to allow invalid characters, this test would still pass because there aren't any query parameters.

It's not the current state of this closed/ inactive PR but it could be so I agree with your statement in that sense (and with your states analysis as well) and that the PR should be updated to avoid confusion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the URI validation was performed before checking the params, so the exception was raised and caught by the test?

It seems that the libevent parser doesn't differentiate if the URI is invalid whether the query syntax is wrong (missing the '?') or there's an invalid char within the URI (% not followed by 2 hexa chars - RFC 3986 and libevent only interprets it if the encoded chars are withing the variable's value, not in the path).

DashCoreAutoGuix pushed a commit to DashCoreAutoGuix/dash that referenced this pull request Sep 30, 2025
11422cc bugfix: rest: avoid segfault for invalid URI (pablomartin4btc)

Pull request description:

  Minimal fix to get it promptly into 25.0 release (suggested by  [stickies-v](bitcoin#27253 (review)) and supported by [vasild](bitcoin#27253 (review))  )

  Please check bitcoin#27253 for reviewers comments and acks regarding this PR and read the commit comment message body for more details about the fix.

ACKs for top commit:
  achow101:
    ACK 11422cc
  stickies-v:
    re-ACK 11422cc

Tree-SHA512: 5af6b53fb266a12b463f960910556d5e97bc88b3c2a4f437ffa343886b38749e1eb058cf7bc64d62e82e1acf6232a186bddacd8f3b4500c87bf9e550a0153386
knst pushed a commit to knst/dash that referenced this pull request Oct 22, 2025
11422cc bugfix: rest: avoid segfault for invalid URI (pablomartin4btc)

Pull request description:

  Minimal fix to get it promptly into 25.0 release (suggested by  [stickies-v](bitcoin#27253 (review)) and supported by [vasild](bitcoin#27253 (review))  )

  Please check bitcoin#27253 for reviewers comments and acks regarding this PR and read the commit comment message body for more details about the fix.

ACKs for top commit:
  achow101:
    ACK 11422cc
  stickies-v:
    re-ACK 11422cc

Tree-SHA512: 5af6b53fb266a12b463f960910556d5e97bc88b3c2a4f437ffa343886b38749e1eb058cf7bc64d62e82e1acf6232a186bddacd8f3b4500c87bf9e550a0153386
@bitcoin bitcoin locked and limited conversation to collaborators Jan 12, 2026
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants