
IPC medley: message-queue peeking, io_uring, and bus1

By Jonathan Corbet
April 2, 2026
The kernel provides a number of ways for processes to communicate with each other, but they never quite seem to fit the bill for many users. There are currently a few proposals for interprocess communication (IPC) enhancements circulating on the mailing lists. The most straightforward one adds a new system call for POSIX message queues that enables the addition of new features. For those wanting an entirely new way to do interprocess communication, there is a proposal to add a new subsystem for that purpose to io_uring. Finally, the bus1 proposal has made a return after ten years.

Peeking at message queues

The POSIX message-queue API is not heavily used, but there are users out there who care about how well it works. Message queues are named objects that, by default, all share a global namespace, though IPC namespaces can be used to separate them. There is a whole set of system calls for the creation, configuration, use, and destruction of message queues; see the mq_overview man page for an introduction to this subsystem.

Of interest here is mq_timedreceive(), which can be used to receive messages from a message queue:

    ssize_t mq_timedreceive(mqd_t mqdes, char *msg_ptr,
                            size_t msg_len, unsigned int *msg_prio,
                            const struct timespec *abs_timeout);

This call will receive the highest-priority message pending in the queue described by mqdes (which is a file descriptor on Linux systems) into the buffer pointed to by msg_ptr, which must be at least msg_len bytes in length. If abs_timeout is not null, it specifies how long the call should block before returning a timeout error. On successful receipt of a message, the location pointed to by msg_prio (if non-null) will be set to the priority of the received message.

That system call has a fair number of parameters, but Mathura Kumar would like to add some more. Since mq_timedreceive() was not designed for extensibility, that means adding a new system call; thus, Kumar has posted a patch set adding mq_timedreceive2(). But there is an additional constraint here: there are architecture-imposed limits on the number of arguments that can be passed to system calls, and Kumar's plans would exceed those limits. As a result, the new system call is defined as:

    struct mq_timedreceive2_args {
        size_t         msg_len;
        unsigned int  *msg_prio;
        char          *msg_ptr;
    };

    ssize_t mq_timedreceive2(mqd_t mqdes,
                             struct mq_timedreceive2_args *uargs,
                             unsigned int flags,
                             unsigned long index,
                             const struct timespec *abs_timeout);

The msg_len, msg_prio, and msg_ptr arguments have been moved into the new mq_timedreceive2_args structure, freeing up two slots for new parameters to the system call. That structure is passed by pointer, without using the common pattern of passing its length, which would make future additions easier; that may change if this patch series moves forward.

The new arguments are flags and index. In this series, only one flag (MQ_PEEK) is defined; if it is present, the message will be returned as usual, but without removing it from the queue, meaning that it will still be there the next time a receive operation is performed. The index argument indicates which message is of interest; a value of zero will return the highest-priority message, and higher values will return messages further back in the queue.
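A sketch of how a caller might walk a queue non-destructively with the proposed interface follows. This is hypothetical code: mq_timedreceive2() has no assigned system-call number or C-library wrapper, and the MQ_PEEK constant and structure layout are taken from the patch posting as described above, so this cannot be compiled against a released kernel.

```c
/* Hypothetical sketch only: mq_timedreceive2() is a proposed system
 * call; the structure layout and MQ_PEEK flag come from the patch
 * posting and may change before any merge. */
struct mq_timedreceive2_args {
    size_t         msg_len;
    unsigned int  *msg_prio;
    char          *msg_ptr;
};

char buf[8192];
unsigned int prio;
struct mq_timedreceive2_args args = {
    .msg_len  = sizeof(buf),
    .msg_prio = &prio,
    .msg_ptr  = buf,
};

/* Peek at each queued message in turn without dequeuing any of them;
 * index 0 is the highest-priority message, and higher index values
 * walk further back in the queue. */
for (unsigned long index = 0; ; index++) {
    ssize_t n = mq_timedreceive2(mqdes, &args, MQ_PEEK, index, NULL);
    if (n < 0)
        break;          /* no message at this index (or an error) */
    /* inspect buf[0..n-1] and prio here */
}
```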

There are a few use cases for these features described in the patch cover letter. One would be monitoring tools, which may want to look at the message traffic without interfering with it. Another one is Checkpoint/Restore in Userspace, which can read a series of messages out of a queue, then restore them with the rest of the process at a future time.

The series as a whole has not received much attention so far, which is perhaps unsurprising given that few developers have much interest in POSIX message queues. If this work is to proceed, it will need to attract some reviews, and probably go through some more rounds to address the problems that are found.

IPC in io_uring

Since its inception, the io_uring subsystem has steadily gained functionality. After having started as the asynchronous I/O mechanism that Linux has long lacked, it has evolved into a separate system-call interface providing access to increasing amounts of kernel functionality. While io_uring can be used for interprocess communication (by way of Unix-domain sockets, for example), it has not yet acquired its own IPC scheme. This patch series from Daniel Hodges seeks to change that situation, but it probably needs a fair amount of work to get there.

Hodges's goal is to provide a high-bandwidth IPC mechanism, similar to D-Bus, that will perform well on large systems. By using shared ring buffers, processes should be able to communicate with minimal copying of data. It is worth noting that other developers have attempted to solve this problem over the years, generally without success; see, for example, the sad story of kdbus. Hope springs eternal, though, and perhaps io_uring is the platform upon which a successful solution can be built.

There are facilities for direct and broadcast messages. Communication is done through "channels"; it all starts when one process issues at least one IORING_REGISTER_IPC_CHANNEL_CREATE operation to establish an open channel. Other processes can attach to existing channels if the permissions allow. Two basic operations, IORING_OP_IPC_SEND and IORING_OP_IPC_RECV, are used to send and receive messages, respectively. There is no documentation, naturally, but interested readers can look at this patch containing a set of self-tests that exercise the new features.
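A loose sketch of that flow appears below. Everything here beyond the basic liburing setup calls is an assumption drawn from the patch posting: the channel-create and send/receive opcodes exist only in the proposed series, their registration arguments are undocumented, and the details may well change.

```c
/* Hypothetical sketch: IORING_REGISTER_IPC_CHANNEL_CREATE,
 * IORING_OP_IPC_SEND, and IORING_OP_IPC_RECV exist only in the
 * proposed patch series, not in any released io_uring API. */
struct io_uring ring;
io_uring_queue_init(32, &ring, 0);

/* One process creates a channel through the register interface with
 * the proposed IORING_REGISTER_IPC_CHANNEL_CREATE operation; other
 * processes then attach to the channel, permissions permitting. */

/* A sender queues a message on the channel: */
struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
/* sqe->opcode = IORING_OP_IPC_SEND;   (proposed opcode) */

/* A receiver queues a receive and reaps the completion as usual: */
sqe = io_uring_get_sqe(&ring);
/* sqe->opcode = IORING_OP_IPC_RECV;   (proposed opcode) */
io_uring_submit(&ring);
```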

The io_uring maintainer, Jens Axboe, quickly noticed that the patch showed signs of LLM-assisted creation, something that Hodges owned up to. He also noted that the series falls short of being a complete D-Bus replacement, lacking features like credential management. Still, Axboe agreed that an IPC feature for io_uring "makes sense to do" and seemed happy with the overall design of the code. Some questions he asked, though, went unanswered. For this work to proceed, Hodges will need to return and do the hard work of bringing a proof-of-concept patch up to the level needed for integration into a core subsystem like io_uring.

Bus1 returns

Back in 2016, David Rheinsberg (then going by the name David Herrmann) proposed a new kernel subsystem called "bus1", which would provide kernel-mediated interprocess communication along the lines of D-Bus. It allowed the passing of messages, but also of capabilities, represented by bus1 handles and open file descriptors. The proposal attracted some attention, and brought some interesting ideas (see the above-linked article for details), but stalled fairly quickly and was never seriously considered for merging into the mainline kernel.

Ten years later, bus1 is back, posted this time by David Rheinsberg. The code has seen a few changes in the intervening decade:

The biggest change is that we stripped everything down to the basics and reimplemented the module in Rust. It is a delight not having to worry about refcount ownership and object lifetimes, but at the cost of a C<->Rust bridge that brings some challenges.

The core features of bus1 remain similar to what was proposed in 2016. For the time being, Rheinsberg is focusing on the Rust aspects of the work and requesting help from the Rust for Linux community to get that integration into better shape.

At some future time, presumably, the new bus1 implementation will be more widely exposed within the kernel community, at which point we will see if there is an appetite for this kind of in-kernel IPC mechanism or not. For those who would like an early look, this patch contains documentation on how the bus1 API will work, though with a number of details left unspecified.

[Editor's note: we originally missed that David had changed his name. Apologies for the error.]


Index entries for this article
Kernel: bus1
Kernel: io_uring
Kernel: Message passing



Varieties of filesystems and schedulers, so why not for IPC mechanisms too?

Posted Apr 2, 2026 20:48 UTC (Thu) by swilmet (subscriber, #98424) [Link] (5 responses)

There are already lots of choices for filesystems and schedulers (there are maybe other examples). Maybe IPC mechanisms could be treated the same, so the kernel could include more ways of doing IPC over time.

(I'm not a kernel developer, it's a real question).

Maybe it's because IPC has a more direct impact on the kernel's external surface - the API provided to userspace. For each IPC system, a different API needs to be devised, while filesystems and schedulers have more common ground.

Also, I wonder if with too many ways of doing IPC, it could create fragmentation among userspace programs. D-Bus is well-established, there is now Varlink (for semi-standard ways of well-integrating services together). The Linux kernel has an influence on the Linux userspace, so the kernel perhaps needs to move (more) wisely here than in other areas.

Varieties of filesystems and schedulers, so why not for IPC mechanisms too?

Posted Apr 3, 2026 16:03 UTC (Fri) by richard_weinberger (subscriber, #38938) [Link] (2 responses)

As soon as major Linux userspace such as systemd settles on one IPC, there is no choice anymore.

Varieties of filesystems and schedulers, so why not for IPC mechanisms too?

Posted Apr 3, 2026 17:12 UTC (Fri) by intelfx (subscriber, #130118) [Link] (1 response)

> As soon as major Linux userspace such as systemd settles on one IPC, there is no choice anymore.

I really wish they'd settle on a reasonable bus rather than this opaque, non-discoverable varlink thing...

Varieties of filesystems and schedulers, so why not for IPC mechanisms too?

Posted Apr 5, 2026 15:57 UTC (Sun) by bitreader (subscriber, #182986) [Link]

Out of interest, could you elaborate?

Varieties of filesystems and schedulers, so why not for IPC mechanisms too?

Posted Apr 3, 2026 18:51 UTC (Fri) by alison (subscriber, #63752) [Link] (1 responses)

>Also, I wonder if with too many ways of doing IPC, it could create fragmentation among userspace programs. D-Bus is well-established, there is now Varlink (for semi-standard ways of well-integrating services together).

Perhaps desktop applications prefer D-Bus, but for system programming and embedded applications, passing protobuf or ZeroMQ via unix domain or IP sockets is more common.

David Rheinsberg gave an excellent presentation about IPC at the All Systems Go Conference in 2025:

https://cfp.all-systems-go.io/all-systems-go-2025/speaker...

Here is the video:

https://media.ccc.de/v/all-systems-go-2025-347-linux-ipc-...

Varieties of filesystems and schedulers, so why not for IPC mechanisms too?

Posted Apr 3, 2026 18:53 UTC (Fri) by alison (subscriber, #63752) [Link]

Ah, the kernel also has binder, but I've never heard of anyone using it outside of Android.

-- Alison

real value in soft RT

Posted Apr 4, 2026 22:01 UTC (Sat) by dexjensen (guest, #182120) [Link]

POSIX mqueues may be niche in general-purpose software, but they still have real value in soft real-time embedded Linux.

What happened to kdbus?

Posted Apr 5, 2026 6:51 UTC (Sun) by thomas.poulsen (subscriber, #22480) [Link] (4 responses)

Why was kdbus never merged? The linked article is relatively optimistic, but now 12 years later it seems abandoned. What happened?

What happened to kdbus?

Posted Apr 5, 2026 9:54 UTC (Sun) by bluca (subscriber, #118303) [Link] (2 responses)

Because something along the lines of "it's completely wrong for IPC systems to be implemented in the kernel, they belong in userspace, things like these will never be merged" or so.

Some time later Google showed up with Binder and that was merged because, er, reasons.

What happened to kdbus?

Posted Apr 6, 2026 2:40 UTC (Mon) by quotemstr (subscriber, #45331) [Link] (1 response)

How would you implement Binder in userspace without a broker? Keep in mind that Binder does just the one memcpy to a kernel-managed memory-mapped area in the recipient (no page-level fragmentation for mprotect!), that senders and recipients don't trust each other, and that priority inheritance likewise works across these untrusted process edges. It also properly tracks A->B->A->B->A callback causality and ensures that call-response chains are thread-local and don't migrate to other threads (causing all sorts of havoc) as the calls bounce back and forth.

Binder really is a nice system, and it's one that solves a multitude of problems that the dbus crowd doesn't even realize it has. It's a shame people think of it as a "Google thing" despite its general utility.

What happened to kdbus?

Posted Apr 9, 2026 8:43 UTC (Thu) by dvdhrm (subscriber, #85474) [Link]

> Binder really is a nice system, and it's one that solves a multitude of problems that the dbus crowd doesn't even realize it has. It's a shame people think of it as a "Google thing" despite its general utility.

The "dbus crowd" works on "bus1", which is highly inspired by Binder, to solve longstanding issues in D-Bus. I wholeheartedly disagree with the sentiment of your comment.

What happened to kdbus?

Posted Apr 5, 2026 13:21 UTC (Sun) by corbet (editor, #1) [Link]

This article describes what happened when the attempt was made to merge kdbus during the 4.1 cycle. For those wanting more information about what happened both before and after, see the kdbus entry in the LWN kernel index.

Binder

Posted Apr 5, 2026 17:50 UTC (Sun) by quotemstr (subscriber, #45331) [Link] (7 responses)

Binder is already in the tree, already had a Rust implementation, and has been proven on billions of devices. Why are we pretending it doesn't exist and making new IPC subsystems from scratch? We don't need bus1. We don't need an io_uring-based IPC mechanism. (At the most, we should just expose Binder's existing bulk-operation bytecodes through an io_uring gateway.)

Binder already copies minimally, propagates security descriptors, sends and receives file descriptions, does priority inheritance, and integrates with LSMs. IPC is a solved problem. Let's focus on putting what we have in production instead of further fragmenting the infrastructure.

Binder

Posted Apr 5, 2026 20:46 UTC (Sun) by lobachevsky (subscriber, #121871) [Link] (1 response)

But it is a Google thing and nobody outside of Google wants to bet on the continued existence or direction of a Google thing. I find this quite understandable. Sure, the kernel has a don't break userspace policy, but if Google wanted to use binder in a way incompatible with other users' wishes, they would. If they wanted to use a different kernel, they could (they tried with Fuchsia) and almost all the billions of devices using binder on Linux would be gone and so would be the former maintainers.

Binder

Posted Apr 6, 2026 0:46 UTC (Mon) by quotemstr (subscriber, #45331) [Link]

If others used it, it would cease to be a "Google thing". It predates Android. It's originally from Be and Palm. It also does distributed object lifetime tracking in a way you seldom see done right elsewhere, and I think people proposing new IPC subsystems need to at *least* sit down and actually understand Binder, then Chesterton's fence style, explain why it won't work.

Binder

Posted Apr 5, 2026 21:11 UTC (Sun) by kkdwvd (subscriber, #179603) [Link] (4 responses)

I think one of the main reasons D-Bus crew has been proposing bus1 (and kdbus before this) is the supposed need for total ordering guarantees. Unlike kdbus, bus1 (the last time I looked at the code) used Lamport clocks, and broke ties in a way that was consistent across all participants such that a total ordering could be established. User space D-Bus daemons already provide global ordering, since everything bounces through the daemon, so layering them on top of bus1 will be easier, I guess. Even if not layering conventional D-Bus on top, semantically, operating D-Bus APIs on top of bus1 will probably be less surprising.

Apart from that, I do agree that there doesn't seem to be any strong motivation to choose one over the other, except non-technical reasons. I also feel the importance of total ordering is just a relic from the D-Bus days, and you don't actually need it for most cases. systemd has been adopting Varlink which is just JSON marshalled over UDS, where no such guarantee is provided. What I gather to be the real motivation, is that it's probably a PITA to bootstrap an IPC daemon PID 1 itself depends on, creating all kinds of circular dependencies combined with lazy activation, such that having the message bus in the kernel would be more convenient.

Binder

Posted Apr 9, 2026 9:15 UTC (Thu) by dvdhrm (subscriber, #85474) [Link] (3 responses)

You are correct about the ordering. I have to admit, the new RFC replaced the total order with a partial order, where only causal relationships are upheld. This improves performance, because multicasts can be transmitted in a single pass, without any risk of priority-inversion or waiting for locks.

D-Bus provides a total order, but we will try to get away with just causal ordering. Effectively, it means independent broadcasts might be received by different users in a different order (we usually refer to this as breaking "multicast stability"). The upside is that bus1 extends causal ordering to side-channels, which D-Bus cannot provide. And we believe causal ordering is the much more interesting property to provide than multicast stability.

But if we look at the bigger picture, bus1 is highly inspired by Binder. It has a similar node/ref model, it has lifetime notifications, and used to have a similar memory-pool implementation (which is where `memfd_create(2)` originated). The reason Binder is not considered an option is:

1) Binder backs Android, and it is completely unclear what we should do when upstream submissions are rejected by Google. #ifdef everything? Let Google use downstream patches? Let Google decide what gets merged? We created memfds as a replacement for ashmem, and I think we can do something similar to Binder.
2) Binder uses a UAPI that differs a lot from classic Linux UAPIs. Metadata is transmitted _inline_ in the payload, requiring payload patching to transmit FDs or refs. File descriptors are injected into remote processes, instead of being passed along integrated into the unix-gc. File resources are pinned to a task and break when file descriptors are moved around.
3) Binder does not account resource usage. bus1 puts a lot of effort into a quota system that prevents resource exhaustion in cross-domain communication.
4) Binder has no ordered multicasts (doesn't even have multicasts at all). This cannot be used as a transport for D-Bus.

Yes, technically Binder can be adapted and modified to do what bus1 does. It has great features that we would love to see adapted elsewhere (e.g., scheduler integration of back-and-forth communication). And its limitations aren't structural, even though it breaks some of the assumptions of classic Linux user-space (e.g., FD injection breaks the assumption that the FD namespace is purely local). But no distribution ships Binder, the politics behind it are not appealing, and there seems to be little downside in having both separate.

Binder

Posted Apr 9, 2026 10:30 UTC (Thu) by quotemstr (subscriber, #45331) [Link] (2 responses)

Memfd got adopted after becoming mostly superior to ashmem --- although if its authors had more insight into real use cases, they'd have had F_SEAL_FUTURE_WRITE, and if they'd had a bit more taste, they'd have implemented more of it in system calls instead of yet more fcntl multiplexing.

Is bus1 superior to binder? It has no mappable memory pool, so out of the gate it involves at least double the copies of Binder, making the Rust version of bus1 somehow a regression from even the low level of C bus1, which, for its flaws, had a kernel-managed memory pool. Bus1 has no thread-scheduling guarantees. No one-way/two-way message tracking. No callee ID. No refcount. No weak references.

Why does it need to exist? Because we're all supposed not to like the company currently funding most Binder development? Because we hate performance and security?

I see no evidence of understanding of why the way they are in Binder, not even a nod to Chesterton's fence, but merely an assertion that Binder is unsuitable --- so unsuitable, in fact, that it's the most successful IPC capability system in the history of planet Earth.

Yeah, okay: the Binder people, *they're* the ones doing it wrong. We should instead listen to the people who couldn't get bus1 into the kernel last time.

Binder

Posted Apr 9, 2026 11:31 UTC (Thu) by bluca (subscriber, #118303) [Link]

> Why does it need to exist? Because we're all supposed not to like the company currently funding most Binder development? Because we hate performance and security?

Because the company that controls it is very well-known for not playing well with others, arbitrarily rejecting (or merging and then later silently reverting) changes that do not benefit directly their own products, and for outright killing things left and right with no warnings https://killedbygoogle.com/

The technical merits or demerits don't matter one bit, it's simply not a stable political foundation to build upon, it would be like building a house on quicksands.

Binder

Posted Apr 9, 2026 11:56 UTC (Thu) by dvdhrm (subscriber, #85474) [Link]

> Memfd got adopted after becoming mostly superior to ashmem --- although if its authors had more insight into real use cases, they'd have had F_SEAL_FUTURE_WRITE, and if they'd had a bit more taste, they'd have implemented more of it in system calls instead of yet more fcntl multiplexing.

I went through a lengthy review process on LKML with many kernel developers, and memfd_create(2) is the result. It is a shame it is not to your satisfaction.

> Is bus1 superior to binder?

No. It is different.

> It has no mappable memory pool, so out of the gate it involves at least double the copies of Binder, making the Rust version of bus1 somehow a regression from even the low level of C bus1, which, for its flaws, had a kernel managed memory pool.

In the previous submission, Andy Lutomirski asked for bus1 to be submitted without memory pool. It makes review easier, reduces the amount of new features, and can be easily added later on. The memory pool API significantly differs from the classic iovec-copy API, and thus might introduce unexpected accounting difficulties with kmemcg. No specific issues were raised or found, but I can relate to the concern.

Not sure why I wouldn't comply with Andy's wish. Seems reasonable. Sorry to hear that you disagree.

> Bus1 has no thread scheduling guarantees.

Sure it has. Bus1 promotes one peer per thread, rather than sharing a peer. This is required if you need causal ordering, otherwise dequeuing would lose all ordering information.

The cover-letter mentions "anycast nodes" as a future improvement, but we did not want it in the RFC.

> No one-way/two-way message tracking.

Not sure what exactly this refers to.

> No callee ID.

No idea what a "callee ID" is meant to refer to. Bus1 obviously tells you the callee node ID, otherwise you wouldn't know where a message was sent to.

> No refcount.

What? The entire point of handles is to represent a reference count.

> No weak references.

Yes! Binder never made use of this in the upper layers (did this change?), so we decided to strip it from the initial submission. If Android did not see an immediate use, we feel like adding it as an extension later on is preferable.

> Why does it need to exist? Because we're all supposed not to like the company currently funding most Binder development? Because we hate performance and security?
>
> I see no evidence of understanding of why the way they are in Binder, not even a nod to Chesterton's fence, but merely an assertion that Binder is unsuitable --- so unsuitable, in fact, that it's the most successful IPC capability system in the history of planet Earth.
>
> Yeah, okay: the Binder people, *they're* the ones doing it wrong. We should instead listen to the people who couldn't get bus1 into the kernel last time.

"bus1 is highly inspired by Binder"
"[Binder] has great features that we would love to see adapted elsewhere"

Hey I am merely sharing my solution for a problem I face. You seem to be quite agitated by that. I am really not sure why.


Copyright © 2026, Eklektix, Inc.
This article may be redistributed under the terms of the Creative Commons CC BY-SA 4.0 license
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds