So You Want to Rust the Linux Kernel?
There has been much discussion of using the Rust language in the Linux kernel (for example, here, here, and here). The Kangrejos Rust for Linux Workshop (here, here, and here) and the 2021 Linux Plumbers Conference had a number of sessions on this topic, as did the Maintainers Summit. At least two of these sessions raised the question of how Rust is to handle the Linux-kernel memory model (LKMM), and I volunteered to write this blog series on that topic.
This series focuses mostly on use cases and opportunities, rather than on any non-trivial solutions. Please note that I am not in any way attempting to dictate or limit Rust's level of ambition. I am instead noting the memory-model consequences of a few potential levels of ambition, ranging from "portions of a few drivers" through "a few drivers" and "some core code", up to and including "the entire kernel". Greater levels of ambition will require greater willingness to accommodate a wider variety of LKMM requirements.
One could instead argue that portions or even all of the Linux kernel should be hammered into the Rust ownership model. On the other hand, might the rumored sudden merge of the ksmbd driver (https://lwn.net/Articles/871098/) have been due to the implicit threat of its being rewritten in Rust? [1] Nevertheless, in cases where Rust is shown to offer particularly desirable advantages, it is quite possible that Rust and some parts of the Linux kernel might meet somewhere in the middle.
These blog posts will therefore present approaches ranging upwards from trivial workarounds. But be warned that some of the high-quality approaches require profound reworking of compiler backends, a prospect that has thus far failed to spark joy in the hearts of compiler writers. In addition, Rust enjoys considerable use outside of the Linux kernel. To give but one example that I have personally observed, it serves as something into which to rewrite inefficient Python scripts. (A megawatt here, a megawatt there, and pretty soon you are talking about real power consumption!) Therefore, there might well be sharp limits beyond which the core Rust developers are unwilling to go.
The remaining posts in this series (along with their modification dates) are as follows:
- Rust Concurrency Philosophy: A Historical Perspective (October 13, 2021)
- Atomics and Barriers and Locks, Oh My! (October 13, 2021)
- Compiler Writers Hate Dependencies (Control) (October 12, 2021)
- Compiler Writers Hate Dependencies (Address/Data) (October 12, 2021)
- Compiler Writers Hate Dependencies (OOTA) (November 12, 2021)
- Can Rust Code Own Sequence Locks? (October 13, 2021)
- Can Rust Code Own RCU? (October 18, 2021)
- How Much of the Kernel Can Rust Own? (October 12, 2021)
- Will Your Rust Code Survive the Attack of the Zombie Pointers? (October 12, 2021)
- Can the Kernel Concurrency Sanitizer Own Rust Code? (October 28, 2021)
- Summary and Conclusions (October 13, 2021)
- TL;DR: Memory-Model Recommendations for Rusting the Linux Kernel (October 21, 2021)
- Bonus Post: What Memory Model Should the Rust Language Use? (November 4, 2021)
- Bonus Post: Kangrejos 2022: The Rust for Linux Workshop (September 19, 2022)
- Bonus Post: What Does It Mean To Be An RCU Implementation? (January 26, 2023)
Please note that this blog series is not a Rust tutorial. Those wanting to learn how to actually program in Rust might start here, here, here, or of course here.
Endnotes
[1] Some in the Linux-kernel community might be happy with either outcome: (1) The threat of conversion to Rust caused people to push more code into mainline and (2) Out-of-tree code was converted to Rust by Rust advocates and then pushed into mainline. The latter case might need special care for longer-term maintenance of the resulting Rust code, but perhaps the original authors might be persuaded to declare victory, learn Rust, and maintain the code. Who knows? ;-)
History
October 8, 2021: Fix s/LInux/Linux/ typo noted by Miguel Ojeda.
October 12, 2021: Self-review, including making it clear that Rust might have use cases other than rewriting inefficient scripts.
October 13, 2021: Add a link to the recommendations post.
October 22, 2021: This blog series is not a Rust tutorial.
November 3, 2021: Add post on memory model for Rust in general.
September 19, 2022: Add post on Kangrejos 2022.
January 25, 2023: Add post on classifying RCU implementations.
The October 12 update affected the whole series, for example, removing the "under construction" markings. Summary of significant updates to other posts:
- The historical post grew a bit based on feedback elsewhere.
- The sequence-locking post gained a list of Linux-kernel use cases and much else besides. Sequence locking seems to cause about as much trouble for Rust ownership as it does for the C/C++ memory model. I added my view of the properties of a best-case Rust implementation.
- The RCU post saw a lot of change. It gained a list of Linux-kernel use cases and some additional explanations of RCU. Verbiage was added explaining the need to interface to the existing C-language Linux-kernel RCU implementation as opposed to inventing a Rust-only RCU implementation. I added my views on a best-case Rust implementation, including RCU-usage bugs that such an implementation might be able to locate that are currently difficult to find. Finally, I added a list of papers presenting RCU semantics (with varying degrees of formality) as well as papers describing mechanical proofs of correctness for significant portions of Linux-kernel RCU.
- The zombie-pointer post gained a much more detailed description of how zombie pointers can rise from the dead.
- The KCSAN post grew more-detailed descriptions of KCSAN integration and use. Apparently Rust is much farther down the KCSAN road than I would have expected!
- The summary and conclusions gained more details on Linux-kernel undefined-behavior avoidance and on memory models.