Skip to content

Use MSG_EOR to mark fragment boundaries#444

Open
nortti0 wants to merge 1 commit intoservo:mainfrom
nortti0:seqpacket-eor
Open

Use MSG_EOR to mark fragment boundaries#444
nortti0 wants to merge 1 commit intoservo:mainfrom
nortti0:seqpacket-eor

Conversation

@nortti0
Copy link
Copy Markdown
Contributor

@nortti0 nortti0 commented Mar 8, 2026

ipc-channel relies on the fragments being read as atomic chunks, which is not guaranteed to be true for plain recvmsg(2) on SOCK_SEQPACKET cross-platform. Namely, on FreeBSD it's possible for this to return both more and less than a single fragment.

SOCK_SEQPACKET can be thought of as a variant of SOCK_STREAM with the ability to set "end of record" markers using MSG_EOR. A single recvmsg(2) will not read past such a marker. While the fact that the marker has been reached is supposed to be communicated back to the caller through msg_flags, as of now Linux does not properly implement this on Unix-domain sockets.

To handle cases where the recvmsg(2) returns less than a single fragment, include the size of the fragment in the header and continue reads until the entire fragment has been read. Since SOCK_SEQPACKET maintains ordering, as long as sending the fragments is atomic, this will not will not run the danger of interleaving of different messages.

Empirically, sendmsg(2) as it is used by ipc-channel is atomic at least on FreeBSD. To make any possible resultant bugs easier to find, add asserts for that.

@nortti0 nortti0 force-pushed the seqpacket-eor branch 2 times, most recently from 1dc26d0 to 2f49fb2 Compare March 9, 2026 21:38
ipc-channel relies on the fragments being read as atomic chunks, which
is not guaranteed to be true for plain recvmsg(2) on SOCK_SEQPACKET
cross-platform. Namely, on FreeBSD it's possible for this to return both
more and less than a single fragment.

SOCK_SEQPACKET can be thought of as a variant of SOCK_STREAM with the
ability to set "end of record" markers using MSG_EOR. A single
recvmsg(2) will not read past such a marker. While the fact that the
marker has been reached is supposed to be communicated back to the
caller through msg_flags, as of now Linux does not properly implement
this on Unix-domain sockets.

To handle cases where the recvmsg(2) returns less than a single
fragment, include the size of the fragment in the header and continue
reads until the entire fragment has been read. Since SOCK_SEQPACKET
maintains ordering, as long as sending the fragments is atomic, this
will not will not run the danger of interleaving of different messages.

Empirically, sendmsg(2) as it is used by ipc-channel is atomic at least
on FreeBSD. To make any possible resultant bugs easier to find, add
asserts for that.

Signed-off-by: Juhani Krekelä <juhani@krekelä.fi>
@nortti0
Copy link
Copy Markdown
Contributor Author

nortti0 commented Mar 11, 2026

Investigating the failures here has lead me to discover that MSG_EOR is not properly implemented on Linux for Unix-domain sockets. Specifically, unix_seqpacket_sendmsg() delegates to unix_dgram_sendmsg() which doesn't do anything with the flag.

Linux using the dgram logic does mean that it mostly behaves as-if MSG_EOR had been passed, except that it will never signal its presence in msg_flags returned by recvmsg(2). To make it possible to detect that the first fragment has been fully read on both Linux and on systems that properly implement the MSG_EOR semantics on Unix-domain sockets, I extended the header of the fragment to include its own size.

I have also edited the initial pull request message to reflect the current state of the code, as afaiu it will become the final commit message.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant