-
Notifications
You must be signed in to change notification settings - Fork 38.7k
Replace libevent with our own HTTP and socket-handling implementation #32061
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. Code Coverage & BenchmarksFor details see: https://corecheck.dev/bitcoin/bitcoin/pulls/32061. ReviewsSee the guideline for information on the review process.
If your review is incorrectly listed, please copy-paste ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. LLM Linter (✨ experimental)Possible typos and grammar issues:
2026-01-09 |
|
🚧 At least one of the CI tasks failed. HintsTry to run the tests locally, according to the documentation. However, a CI failure may still
Leave a comment here, if you need help tracking down a confusing failure. |
|
Concept ACK, nice work |
|
Concept ACK |
|
Concept ACK My understanding from the low-level networking discussion at CoreDev was that this wouldn't build on top of sockman. I guess the devil is in the details but can you address that in what sense the current approach follows what was discussed there? Thanks! |
Sure, by coredev I had already written most of this implementation (based on sockman) but the performance was bad, and that was part of the motivation behind the deep-dive talk. However, by the end of the week I had reviewed that code in person with smart attendees and not only improved the performance of my code but started to improve performance vs master branch as well! Those updates came in the days just after the deep-dive discussion. SOME kind of sockman is needed to replace libevent. The one @vasild wrote does actually seem to work well for this purpose as well as for p2p, and it would be "nice" to only have to maintain one I/O loop structure in bitcoind. @theuni is investigating how a sockman for http could be optimized if it had no other purpose, and I think that is the kind of feedback that will help us decide which path to take. |
640f38f to
49784b3
Compare
💯 |
pinheadmz
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| CService MaybeFlipIPv6toCJDNS(const CService& service); | ||
|
|
||
| /** Get the bind address for a socket as CService. */ | ||
| CService GetBindAddress(const Sock& sock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 re-worded the commit message
| sockaddr_storage storage; | ||
| socklen_t len = sizeof(storage); | ||
|
|
||
| auto sa = static_cast<sockaddr*>(static_cast<void*>(&storage)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah thanks, fixed this commit message
src/netbase.cpp
Outdated
|
|
||
| auto sa = static_cast<sockaddr*>(static_cast<void*>(&storage)); | ||
|
|
||
| if (!sock.GetSockName(sa, &len)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure, taken. every little bit of legibility helps ;-)
| */ | ||
| template <typename T> | ||
| std::optional<T> ToIntegral(std::string_view str) | ||
| std::optional<T> ToIntegral(std::string_view str, size_t base = 10) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I will leave as-is, a bool would require explanation and this way we have explicit argument name and default value.
| lowered.resize(s.size()); | ||
| std::ranges::transform(s, lowered.begin(), | ||
| [](char c) { | ||
| // Avoid implicit-integer-sign-change UB by only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok taken. I was doubtful at first but learned about "case folding" in unicode so, the qualification is justified.
src/util/strencodings.h
Outdated
| }; | ||
| } // namespace detail | ||
|
|
||
| struct CaseInsensitiveKeyEqual { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The important bit is the case-insensitivity of the HTTP protocol. I reference libevent functions in comments and commit messages throughout the PR because I referenced them when writing and also for reviewers to compare. I'll reword this commit message to better explain.
src/util/time.h
Outdated
| std::optional<int64_t> ParseISO8601DateTime(std::string_view str); | ||
|
|
||
| /** | ||
| * RFC7231 formatting https://www.rfc-editor.org/rfc/rfc7231#section-7.1.1.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Replaced the reference to HTTP spec with reference to the actual date/time spec implemented in the function.
src/util/time.h
Outdated
| * RFC7231 formatting https://www.rfc-editor.org/rfc/rfc7231#section-7.1.1.1 | ||
| * Used in HTTP/1.1 responses | ||
| */ | ||
| std::string FormatRFC7231DateTime(int64_t nTime); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great catch, I had to dig through the forest of RFCs again, and rewrote the commit message and function name. I'm going with RFC1123 because that is the RFC referenced by libevent. I also wanted to keep the pattern of naming similar to neighbor functions like FormatISO8601DateTime()
src/util/string.cpp
Outdated
| ++it; | ||
| ++count; | ||
| if (c == '\n') break; | ||
| if (count >= max_read) throw std::runtime_error("max_read exceeded by LineReader"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think everything is correct as-is, but I refactored the tests in order to make sure I hit the edge correctly. I still could be missing an off-by-one though, so I appreciate your patience and re-review.
If max_read is 8, I want to read no more than 8 characters looking for \n. So 123567\n is fine, meaning that indeed we read up to AND including max_read.
However 12345678\n would throw an error because we would consume max_read characters and not find an EOL. In this case the EOL is "past max_read".
fjahr
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit-picking a bit more on LineReader edge cases :) Looking forward to the separate PR!
src/test/util_string_tests.cpp
Outdated
| // but then throws because there is no EOL within max_read | ||
| LineReader reader2(input, /*max_read=*/23); | ||
| BOOST_CHECK_EQUAL(reader2.ReadLine(), "once upon a time there"); | ||
| BOOST_CHECK_THROW(reader2.ReadLine(), std::runtime_error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
EOL is at position equal to max_read
This is a 23-char line so for me EOL is at max_read+1 again, same to what is said above in the 22-char line case:
EOL is +1 past max_read
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, another edge case is a string without any \n or a last line without it, e.g. instead of EOL there is EOF. If I understand correctly, because of how the loop is structured, a string without \n and length 10 will throw the error if max_read=10 but a string with length 9 without a \n will not throw at all and just return the string. I would expect only a line of 11+ without \n to throw.
Maybe this ordering of the lines in the loop would do a bit better for this edge case? I didn't test it yet so it's possible there are other issues with it, feel free to ignore if that's the case.
while (it != end) {
if (count >= max_read) throw std::runtime_error("max_read exceeded by LineReader");
++count;
auto c = static_cast<char>(*it);
if (c == '\n') break;
++it;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right and I think your original instinct on this was right as well. I modified the logic of the ReadLine() loop, added more comments, and covered the end-of-buffer case with another unit test. So it should be more clear: max_read is the length of the longest line we will return, and the line can be terminated either by \n (not counted) or by the end of the buffer.
max_read = 8
"12345678" OK
"12345678\n" OK
"123456789" bad
"123456789\n" bad
src/util/string.cpp
Outdated
| ++it; | ||
| ++count; | ||
| if (c == '\n') break; | ||
| if (count >= max_read) throw std::runtime_error("max_read exceeded by LineReader"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, yeah, I think the behavior is fine after looking at this again. I guess the first thing that made this a bit hard to parse is the ordering of the lines but I can see now that this is probably the most efficient way to do it. It may just not be the most readable and I needed some extra time to stare at it :)
I will add some nits at the appropriate lines.
src/util/string.h
Outdated
| /** | ||
| * Returns a string from current iterator position up to next \n | ||
| * and advances iterator, does not return trailing \n or \r. | ||
| * Will not search for \n past max_read. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It sounds a bit like this might just return: "Will throw if a line length exceeds max_read."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, updated
This is a helper struct to parse HTTP messages from data in buffers from sockets. HTTP messages begin with headers which are CRLF-terminated lines (\n or \r\n) followed by an arbitrary amount of body data. Whitespace is trimmed from the field lines but not the body. https://httpwg.org/specs/rfc9110.html#rfc.section.5
This commit is a no-op to isolate HTTP methods and objects that depend on libevent. Following commits will add replacement objects and methods in a new namespace for testing and review before switching over the server.
see: https://www.rfc-editor.org/rfc/rfc2616#section-4.2 https://www.rfc-editor.org/rfc/rfc7231#section-5 https://www.rfc-editor.org/rfc/rfc7231#section-7 https://httpwg.org/specs/rfc9111.html#header.field.definitions
HTTP Response message: https://datatracker.ietf.org/doc/html/rfc1945#section-6 Status line (first line of response): https://datatracker.ietf.org/doc/html/rfc1945#section-6.1 Status code definitions: https://datatracker.ietf.org/doc/html/rfc1945#section-9
HTTP Request message: https://datatracker.ietf.org/doc/html/rfc1945#section-5 Request Line aka Control Line aka first line: https://datatracker.ietf.org/doc/html/rfc1945#section-5.1 See message_read_status() in libevent http.c for how `MORE_DATA_EXPECTED` is handled there
…ocket Introduce a new low-level socket managing class `HTTPServer`. BindAndStartListening() was copied from CConnMan's BindListenPort() in net.cpp and modernized. Unit-test it with a new class `SocketTestingSetup` which mocks `CreateSock()` and will enable mock client I/O in future commits. Co-authored-by: Vasil Dimov <[email protected]>
AcceptConnection() is mostly copied from CConmann in net.cpp and then modernized. Co-authored-by: Vasil Dimov <[email protected]>
Co-authored-by: Vasil Dimov <[email protected]>
Socket handling methods are copied from CConnMan: `CConnman::GenerateWaitSockets()` `CConnman::SocketHandlerListening()` `CConnman::ThreadSocketHandler()` and `CConnman::SocketHandler()` are combined into ThreadSocketHandler()`. Co-authored-by: Vasil Dimov <[email protected]>
`SocketHandlerConnected()` adapted from CConnman Testing this requires adding a new feature to the SocketTestingSetup, inserting a "request" payload into the mock client that connects to us. Co-authored-by: Vasil Dimov <[email protected]>
Sockets-touching bits copied and adapted from `CConnman::SocketSendData()` Testing this requires adding a new feature to the SocketTestingSetup, returning the DynSock I/O pipes from the mock socket so the received data can be checked. Co-authored-by: Vasil Dimov <[email protected]>
See https://www.rfc-editor.org/rfc/rfc7230#section-6.3.2 > A server MAY process a sequence of pipelined requests in parallel if they all have safe methods (Section 4.2.1 of [RFC7231]), but it MUST send the corresponding responses in the same order that the requests were received. We choose NOT to process requests in parallel. They are executed in the order recevied as well as responded to in the order received. This prevents race conditions where old state may get sent in response to requests that are very quick to process but were requested later on in the queue.
This is a refactor to prepare for matching the API of HTTPRequest definitions in both namespaces http_bitcoin and http_libevent. In particular, to provide a consistent return type for GetRequestMethod() in both classes.
These methods are called by http_request_cb() and are present in the original http_libevent::HTTPRequest.
The original function is passed to libevent as a callback when HTTP requests are received and processed. It wrapped the libevent request object in a http_libevent::HTTPRequest and then handed that off to bitcoin for basic checks and finally dispatch to worker threads. In this commit we split the function after the http_libevent::HTTPRequest is created, and pass that object to a new function that maintains the logic of checking and dispatching. This will be the merge point for http_libevent and http_bitcoin, where HTTPRequest objects from either namespace have the same downstream lifecycle.
The original function was already naturally split into two chunks: First, we parse and validate the users' RPC configuration for IPs and ports. Next we bind libevent's http server to the appropriate endpoints. This commit splits these chunks into two separate functions, leaving the argument parsing in the common space of the module and moving the libevent-specific binding into the http_libevent namespace. A future commit will implement http_bitcoin::HTTPBindAddresses to bind the validate list of endpoints by the new HTTP server.
63a9487 to
b6afd81
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/util/string.h
Outdated
| /** | ||
| * Returns a string from current iterator position up to next \n | ||
| * and advances iterator, does not return trailing \n or \r. | ||
| * Will not search for \n past max_read. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, updated
src/test/util_string_tests.cpp
Outdated
| // but then throws because there is no EOL within max_read | ||
| LineReader reader2(input, /*max_read=*/23); | ||
| BOOST_CHECK_EQUAL(reader2.ReadLine(), "once upon a time there"); | ||
| BOOST_CHECK_THROW(reader2.ReadLine(), std::runtime_error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right and I think your original instinct on this was right as well. I modified the logic of the ReadLine() loop, added more comments, and covered the end-of-buffer case with another unit test. So it should be more clear: max_read is the length of the longest line we will return, and the line can be terminated either by \n (not counted) or by the end of the buffer.
max_read = 8
"12345678" OK
"12345678\n" OK
"123456789" bad
"123456789\n" bad
This is a major component of removing libevent as a dependency of the project, by replacing the HTTP server used for RPC and REST with one implemented entirely within the Bitcoin Core codebase. The new
HTTPServerclass runs its own I/O thread, handling socket connections with code based on #30988, but tailored specifically for HTTP.Some functional tests were added in #32408 to cover libevent behaviors that remain consistent with this branch.
Commit strategy:
http_libeventhttp_bitcoin(classes likeHTTPRequest,HTTPClient, etc...)http_libevent)I am currently seeing about a 10% speed up in the functional tests on my own arm/macos machine.
Integration testing:
I am testing the new HTTP server by forking projects that integrate with bitcoin via HTTP and running their integration tests with bitcoind built from this branch (on Github actions). I will continue adding integrations over time, and re-running these CI tests as this branch gets rebased:
rpc-bitcoin