-
Notifications
You must be signed in to change notification settings - Fork 201
Peek sockaddr for unix domain sockets according to non-null-terminating semantics #305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
…ng semantics On certain linux distros, the returned sockaddr for a unix domain socket isn't guaranteed to be null terminated. This implements the recommended portable method to handle potentially non-null-terminated strings in `sun_path`. See http://man7.org/linux/man-pages/man7/unix.7.html The previous behavior would potentially read uninitialized memory and attempt to locale decode it Test Plan: New unit test for unix domain sockets, which previously failed on centos
Mixed up types under w32
|
IMHO, this is a lot of complex gymnastics at the Haskell layer for a problem that should likely be more easily and cleanly solved in the underlying C API. The underlying C-functions called by Haskell's Network library that fill-in sockaddr data would need to arrange to handle unterminated sun_path on systems that lack a sun_len field in sockaddr_un, by arranging to ensure any missing null-termination. I would strongly also recommend to always zero the buffer (in C) before calling system calls that fill-in a user-provided sockaddr buffer. Whether on Darwin or not. If the output socket addresses used for AF_UNIX are always zero-filled before use, and are one-byte longer than the maximum size used by the platform, then null-termination is always assured. The C-wrapper can also use the This is not only simpler but uniformly safer. I'd also like too see a lot less Finally, I'm puzzled by the logic below. This type of thing deserves comments explaining what's going on. Why does an empty path get that particular length? Is it the right length on platforms that also have Why not just |
|
If you have preferences about how this should be implemented, please let me know what you want to see where. My goal in this request is to fix a bug upstream that has halted work on a much larger project (I have already patched locally to unblock my team's progress). I am not interested in broader refactors on a repo that is already undergoing a lot of change.
|
|
Indeed you're not obligated to take on substantial refactoring of the code. And my opinion is not dispositive, I am not a maintainer of the project, just an observer with some experience. So I'll leave to you and Kazu et. al. to figure how to move forward. My comments are intended to aid of the long-term quality of the code base. If you have cycles to refactor code that is starting to show wear and tear, super! If not, perhaps someone else can. The risk comes in when code is repeatedly patched in ways that minimize change, but increase long-term costs by adding complexity. When it is time to stop and clean up is a matter of judgement. As to the specifics:
Indeed, and so one might need to add some such functions that present a saner, more uniform API to the upper layers of the system. Heimdal (where I am a maintainer) has "libroken" which deals with platform-specific quirks, so that the rest of the software sees a more uniform platform, and needs a lot fewer
Perhaps so, but if
This cuts to the core of the question of when it is time to stop patching and do some refactoring. I'll leave that call to Kazu et. al. |
author: JonCoens in haskell#305.
|
I created https://github.com/kazu-yamamoto/network/tree/unix-domain by copying your test case. For the first commit, the test case fails on Linux. This is because you check a socket address returned from So, I removed the |
|
In some cases, |
This is due to the original code. I agree that refactoring is necessary. |
For abstract unix domain sockets on Linux, this does not work. I don't understand why yet. |
|
I misunderstood. This is not related to abstract unix domain sockets. Simply |
Unfortunately they do. From the unix man page (http://man7.org/linux/man-pages/man7/unix.7.html) under Bugs: I was only able to consistently repro this on my centos setup. With the Note how the returned SockAddr was I'll attempt to be more concrete on exactly what I've been seeing. An application I'm working on would intermittently get decoding exceptions coming out of a call to
This would then attempt to locale-decode (utf8 in our case) uninitialized memory. If we happened to get something null terminated early enough it would work fine. Other times we'd get invalid byte sequences. It would be great if all linux distros filled |
This is why the sun_path field should be zero-filled by the caller, before calling accept(), or any other socket function that fills-in a caller-provider sockaddr! If the syscall leaves sun_path unmodified, then it will be an empty address! Presumably "untouched" really means 0-length, and the OS writes 0 bytes (and does not write the terminating NUL). On platforms were sockets have a sun_len field. And of course the OS also returns a My view is all the details of this should be handled in a C-wrapper around the various syscalls, so that the Haskell sees uniform behaviour across all platforms. By the time the sockaddr is returned to Haskell it should have a sensible length, and the sun_path should already be NUL terminated.
Also sun_path is surely a ByteString, not a utf8 string. It should not be subject to locale-specific decoding.
Sample C-code and output to demonstrate the behaviour you're seeing would really be helpful. Does zeroing the sockaddr before asking the OS to return the address solve the problem?
I think that the Haskell code that allocates buffers for sockaddr results should allocate one more byte than |
|
I'd like to keep this pull request focused on fixing the bug at hand and not devolve into broader project maintenance ideas. The ideas are good, but I'd prefer they be discussed in separate Issues or subsequent pull requests that implement them, which I would defer to kazu to drive appropriately. There are plenty of more defensive ways to implement things at different layers, but they are out-of-scope of what I'm trying to do here. Are there any concrete requests on how this patch should be implemented that differ from the current implementation? |
| when (rc /= 0) $ | ||
| throwSocketErrorCode "Network.Socket.accept" (fromIntegral rc) | ||
| return new_fd' | ||
| addr' <- peekSocketAddress sa Nothing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the non-threaded else case, why are we able to get away with not specifying the sockaddr size? Is this just a branch that is not reached by the code that needs this PR, or is the issue somehow not applicable here? Is c_acceptNewSock doing something differently?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One was never retrieved in the first place, so my edit was to persist the existing functionality cleanly. I don't know what the intended semantics are for this branch, but any change to its behavior should have a test environment to ensure the "correct" thing is happening.
|
|
||
| serverSetup = do | ||
| sock <- socket AF_UNIX Stream defaultProtocol | ||
| bind sock (SockAddrUnix unixAddr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, to make the test repeatable, before bind we need to unlink the socket path, else the bind will fail.
|
On a purely "functional" level, the patch looks OK to me, other than the question about the "non-threaded" variant on Windows, and the missing |
While modern Linux kernels do generally NUL-terminate So we can avoid running into the issue in every case by providing a buffer that is 109 bytes long and is zeroed, but this should only needed for FWIW, I just tested with a client that does This is a apparently a long-standing |
|
Looks like #306 will be the blessed fix |
|
Superceded by #306 |
On certain linux distros, the returned sockaddr for a unix domain socket isn't guaranteed to be null terminated. This implements the recommended portable method to handle potentially non-null-terminated strings in
sun_path.See http://man7.org/linux/man-pages/man7/unix.7.html
The previous behavior would potentially read uninitialized memory and attempt to locale decode it
Test Plan: New unit test for unix domain sockets, which previously failed on centos
I would like to fix this for 2.6.3, but none of my dev environments can build the 2.6 branch.