Memory unsafety at libc/kernel boundary via argv overread in unix impl of `std::process::Command` #155748

There is a soundness and memory safety issue in the Unix implementation of `std::process::Command` caused by panic-unsafety in `CStringArray`'s `argv` maintenance.

This finding was discovered by OpenAI Codex and validated by @lopopolo. We have been scanning the Rust toolchain, using RUSTSEC-2026-0078 as a seed, for state that is left partially committed (and therefore corrupted) after an unwind.

The issue requires a caught unwind during `Command::arg` at an allocation edge, followed by continued use of the recovered `Command`. On the validated toolchain, the recovered object survives with a malformed unterminated `argv`, and later safe process creation fails with `os error 14` / `EFAULT`.

The root cause is panic-unsafety in the Unix `CStringArray::push` helper. It overwrites the trailing null sentinel first and only afterwards appends a replacement null pointer. If allocation failure reaches `handle_alloc_error` in that second step and the caller catches the unwind, the recovered `Command` keeps an unterminated `argv`. A later `Command::output()` / `spawn()` call passes that malformed pointer array into `posix_spawn` / `execvp`, causing the libc/kernel process-spawn path to walk past the end of the allocation looking for a null sentinel.
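The two-step mutation can be sketched with a simplified model. `ArgvModel`, its `fail_growth` flag, and `terminated_after_caught_unwind` are hypothetical names introduced only for illustration. This is not the actual std source, and the explicit `panic!` merely stands in for `handle_alloc_error` unwinding through an alloc error hook.

```rust
use std::ffi::{c_char, CString};
use std::panic::{self, AssertUnwindSafe};
use std::ptr;

/// Hypothetical stand-in for the null-terminated pointer array behind `argv`.
/// A simplified model for illustration, not the actual std implementation.
pub struct ArgvModel {
    ptrs: Vec<*const c_char>,
}

impl ArgvModel {
    pub fn new() -> Self {
        // Invariant: the last slot is always a null sentinel.
        ArgvModel { ptrs: vec![ptr::null()] }
    }

    /// Two-step push. `fail_growth` simulates an alloc-error unwind
    /// between overwriting the sentinel and appending its replacement.
    pub fn push(&mut self, item: CString, fail_growth: bool) {
        let last = self.ptrs.len() - 1;
        // Step 1: overwrite the sentinel. The invariant is now broken.
        self.ptrs[last] = item.into_raw();
        if fail_growth {
            panic!("simulated alloc error while growing argv");
        }
        // Step 2: append a fresh sentinel, restoring the invariant.
        self.ptrs.push(ptr::null());
    }

    pub fn is_terminated(&self) -> bool {
        self.ptrs.last().is_some_and(|p| p.is_null())
    }
}

impl Drop for ArgvModel {
    fn drop(&mut self) {
        for &p in &self.ptrs {
            if !p.is_null() {
                // SAFETY: every non-null slot came from `CString::into_raw`.
                drop(unsafe { CString::from_raw(p.cast_mut()) });
            }
        }
    }
}

/// Reports whether the model is still null-terminated after a caught unwind.
pub fn terminated_after_caught_unwind() -> bool {
    let mut argv = ArgvModel::new();
    argv.push(CString::new("prog").unwrap(), false);
    let caught = panic::catch_unwind(AssertUnwindSafe(|| {
        argv.push(CString::new("A").unwrap(), true);
    }));
    assert!(caught.is_err());
    argv.is_terminated()
}
```

In this model the caught unwind leaves the last slot pointing at `"A"` with no sentinel behind it, which is the state a later spawn would hand to libc.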

The reproducer below also includes a guard-page proof: a test that allocates a two-page mapping, places an unterminated one-pointer `argv` array at the end of the first page, marks the second page `PROT_NONE`, and calls `posix_spawn`. The test confirms the boundary behavior by asserting that `posix_spawn` returns `EFAULT` when it reads past the end of the pointer array.

```rust
#![cfg_attr(all(test, unix), feature(alloc_error_hook))]
#![cfg_attr(not(test), allow(dead_code, unused_imports))]

//! Repro for a Unix `std::process::Command` unwind-safety bug.
//!
//! The vulnerable shape is a two-step mutation of the internal `argv` buffer:
//! `CStringArray::push` overwrites the trailing null sentinel first and only
//! afterwards appends a replacement null pointer. If allocation failure reaches
//! `handle_alloc_error` in that second step and the caller catches the unwind,
//! the recovered `Command` survives with a malformed argument vector. Later
//! safe process creation then hands that unterminated `argv` array to libc,
//! which can read past the end of the allocation looking for a null sentinel
//! and fail with `EFAULT`.

#[cfg(all(test, unix))]
use std::alloc::{set_alloc_error_hook, take_alloc_error_hook};
#[cfg(unix)]
use std::alloc::{GlobalAlloc, Layout, System};
#[cfg(all(test, unix))]
use std::mem::{align_of, size_of};
#[cfg(unix)]
use std::panic::{self, AssertUnwindSafe};
#[cfg(unix)]
use std::process::{Command, Stdio};
#[cfg(all(test, unix))]
use std::ptr;
#[cfg(unix)]
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};

#[cfg(unix)]
const VALIDATED_TARGET: usize = 2;
#[cfg(unix)]
const CHILD_ENV: &str = "RUST_STD_COMMAND_UNWIND_REPRO_CHILD";

#[cfg(unix)]
struct FailingAlloc;

#[cfg(unix)]
static ALLOCATOR_ARMED: AtomicBool = AtomicBool::new(false);
#[cfg(unix)]
static ALLOCATION_COUNT: AtomicUsize = AtomicUsize::new(0);
#[cfg(unix)]
static FAIL_ON_ALLOCATION: AtomicUsize = AtomicUsize::new(usize::MAX);

#[cfg(unix)]
#[global_allocator]
static GLOBAL_ALLOCATOR: FailingAlloc = FailingAlloc;

#[cfg(unix)]
unsafe impl GlobalAlloc for FailingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        if should_fail_allocation() {
            return std::ptr::null_mut();
        }
        unsafe { System.alloc(layout) }
    }

    unsafe fn alloc_zeroed(&self, layout: Layout) -> *mut u8 {
        if should_fail_allocation() {
            return std::ptr::null_mut();
        }
        unsafe { System.alloc_zeroed(layout) }
    }

    unsafe fn realloc(&self, ptr: *mut u8, layout: Layout, new_size: usize) -> *mut u8 {
        let _ = new_size.max(layout.size());
        if should_fail_allocation() {
            return std::ptr::null_mut();
        }
        unsafe { System.realloc(ptr, layout, new_size) }
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[cfg(unix)]
#[inline(never)]
fn should_fail_allocation() -> bool {
    if !ALLOCATOR_ARMED.load(Ordering::Relaxed) {
        return false;
    }

    let allocation = ALLOCATION_COUNT.fetch_add(1, Ordering::Relaxed) + 1;
    if allocation == FAIL_ON_ALLOCATION.load(Ordering::Relaxed) {
        ALLOCATOR_ARMED.store(false, Ordering::Relaxed);
        return true;
    }
    false
}

#[cfg(all(test, unix))]
#[derive(Debug)]
struct ProbeReport {
    target: usize,
    arg_panic_caught: bool,
    baseline_args: Vec<String>,
    args_visible_after_catch: Vec<String>,
    spawn_succeeded: bool,
    raw_os_error: Option<i32>,
}

#[cfg(all(test, unix))]
fn probe(target: usize) -> ProbeReport {
    let current_exe = std::env::current_exe().expect("current test executable path");
    let mut cmd = Command::new(current_exe);
    cmd.env(CHILD_ENV, "1");
    cmd.stdout(Stdio::piped());
    cmd.stderr(Stdio::inherit());
    let baseline_args = cmd
        .get_args()
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect();

    let panic_hook_guard = PanicHookGuard::install();
    let alloc_error_hook_guard = AllocErrorHookGuard::install();

    ALLOCATION_COUNT.store(0, Ordering::Relaxed);
    FAIL_ON_ALLOCATION.store(target, Ordering::Relaxed);
    ALLOCATOR_ARMED.store(true, Ordering::Relaxed);
    let arg_panic_caught = panic::catch_unwind(AssertUnwindSafe(|| {
        cmd.arg("A");
    }))
    .is_err();
    ALLOCATOR_ARMED.store(false, Ordering::Relaxed);

    drop(alloc_error_hook_guard);
    drop(panic_hook_guard);

    let args_visible_after_catch = cmd
        .get_args()
        .map(|arg| arg.to_string_lossy().into_owned())
        .collect();
    let (spawn_succeeded, raw_os_error) = match cmd.output() {
        Ok(_) => (true, None),
        Err(err) => (false, err.raw_os_error()),
    };

    ProbeReport {
        target,
        arg_panic_caught,
        baseline_args,
        args_visible_after_catch,
        spawn_succeeded,
        raw_os_error,
    }
}

#[cfg(all(test, unix))]
struct PanicHookGuard(Option<Box<dyn Fn(&std::panic::PanicHookInfo<'_>) + Sync + Send + 'static>>);

#[cfg(all(test, unix))]
impl PanicHookGuard {
    fn install() -> Self {
        let previous_hook = std::panic::take_hook();
        std::panic::set_hook(Box::new(|_| {}));
        Self(Some(previous_hook))
    }
}

#[cfg(all(test, unix))]
impl Drop for PanicHookGuard {
    fn drop(&mut self) {
        if let Some(previous_hook) = self.0.take() {
            std::panic::set_hook(previous_hook);
        }
    }
}

#[cfg(all(test, unix))]
struct AllocErrorHookGuard(fn(Layout));

#[cfg(all(test, unix))]
impl AllocErrorHookGuard {
    fn install() -> Self {
        let previous_hook = take_alloc_error_hook();
        set_alloc_error_hook(|_| panic!("reproducer alloc error"));
        Self(previous_hook)
    }
}

#[cfg(all(test, unix))]
impl Drop for AllocErrorHookGuard {
    fn drop(&mut self) {
        set_alloc_error_hook(self.0);
    }
}

#[cfg(all(test, unix))]
struct MappingGuard {
    base: *mut libc::c_void,
    len: usize,
}

#[cfg(all(test, unix))]
impl Drop for MappingGuard {
    fn drop(&mut self) {
        unsafe {
            assert_eq!(libc::munmap(self.base, self.len), 0);
        }
    }
}

#[cfg(all(test, unix))]
fn run_child() {}

#[cfg(all(test, unix))]
#[test]
fn command_arg_keeps_argv_well_formed_after_alloc_error_unwind() {
    if std::env::var_os(CHILD_ENV).is_some() {
        run_child();
        return;
    }

    // Phase 1: hit the validated alloc-error unwind site inside `Command::arg`.
    let report = probe(VALIDATED_TARGET);

    // Phase 2: a correct implementation should catch the panic and roll the
    // failed argument append back out of the recovered `Command`.
    assert!(
        report.arg_panic_caught,
        "Command::arg should unwind through handle_alloc_error on the reproducing target"
    );
    assert!(
        report.args_visible_after_catch == report.baseline_args,
        "the recovered Command should keep its original argument list after the panic"
    );

    // Phase 3: a correct `Command` should still spawn successfully after the
    // caught panic because its `argv` terminator was restored.
    assert!(
        report.spawn_succeeded,
        "later safe process creation should still succeed after the caught panic; raw_os_error={:?}",
        report.raw_os_error
    );
    assert_eq!(
        report.target, VALIDATED_TARGET,
        "the regression should keep probing the validated allocator target"
    );
}

#[cfg(all(test, unix))]
#[test]
fn posix_spawn_faults_when_unterminated_argv_crosses_guard_page() {
    use std::ffi::CString;

    let program = CString::new("/usr/bin/true").expect("static executable path");
    let mut envp = [ptr::null_mut::<libc::c_char>()];

    let mut ok_argv = [program.as_ptr().cast_mut(), ptr::null_mut()];
    let mut pid = 0;
    let rc = unsafe {
        libc::posix_spawn(
            &mut pid,
            program.as_ptr(),
            ptr::null(),
            ptr::null(),
            ok_argv.as_mut_ptr(),
            envp.as_mut_ptr(),
        )
    };
    assert_eq!(rc, 0, "control spawn with a terminated argv should succeed");
    let mut status = 0;
    let waited = unsafe { libc::waitpid(pid, &mut status, 0) };
    assert_eq!(waited, pid, "control child should be waitable");

    let page_size = unsafe { libc::sysconf(libc::_SC_PAGESIZE) as usize };
    assert!(page_size >= size_of::<*mut libc::c_char>());

    let mapping_len = page_size * 2;
    let base = unsafe {
        libc::mmap(
            ptr::null_mut(),
            mapping_len,
            libc::PROT_READ | libc::PROT_WRITE,
            libc::MAP_PRIVATE | libc::MAP_ANON,
            -1,
            0,
        )
    };
    assert_ne!(base, libc::MAP_FAILED, "guard-page mapping should succeed");
    let mapping = MappingGuard { base, len: mapping_len };

    let second_page = unsafe { (mapping.base as *mut u8).add(page_size) } as *mut libc::c_void;
    let rc = unsafe { libc::mprotect(second_page, page_size, libc::PROT_NONE) };
    assert_eq!(rc, 0, "guard page should become inaccessible");

    let argv = unsafe {
        (mapping.base as *mut u8).add(page_size - size_of::<*mut libc::c_char>())
            as *mut *mut libc::c_char
    };
    assert_eq!((argv as usize) % align_of::<*mut libc::c_char>(), 0);

    unsafe {
        argv.write(program.as_ptr().cast_mut());
    }

    let mut pid = 0;
    let rc = unsafe {
        libc::posix_spawn(
            &mut pid,
            program.as_ptr(),
            ptr::null(),
            ptr::null(),
            argv,
            envp.as_mut_ptr(),
        )
    };
    assert_eq!(
        rc,
        libc::EFAULT,
        "unterminated argv at a guard page boundary should fault when libc/kernel reads past the array"
    );
}
```

The deterministic witness in the reproducer is nightly-only because it uses `alloc_error_hook` to turn `handle_alloc_error` into a catchable unwind after a custom global allocator temporarily returns null. It does not rely on unwinding from `GlobalAlloc`.

The straightforward mitigation appears to be reserving capacity fallibly before overwriting the trailing null sentinel, for example via `Vec::try_reserve(1)`, so allocation failure preserves the old `argv` contents and its null terminator.
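A minimal sketch of that reserve-first ordering, using the same kind of simplified model rather than the real `CStringArray`. `ArgvModel`, `try_push`, and `try_push_reserving` are hypothetical names, and the `additional` parameter exists only so a caller can force a reservation failure without a custom allocator.

```rust
use std::collections::TryReserveError;
use std::ffi::{c_char, CString};
use std::ptr;

/// Hypothetical model of a null-terminated `argv` pointer array.
/// Illustration only, not the actual std implementation.
pub struct ArgvModel {
    ptrs: Vec<*const c_char>,
}

impl ArgvModel {
    pub fn new() -> Self {
        // Invariant: the last slot is always a null sentinel.
        ArgvModel { ptrs: vec![ptr::null()] }
    }

    /// Number of arguments, excluding the sentinel.
    pub fn len(&self) -> usize {
        self.ptrs.len() - 1
    }

    pub fn is_terminated(&self) -> bool {
        self.ptrs.last().is_some_and(|p| p.is_null())
    }

    pub fn try_push(&mut self, item: CString) -> Result<(), TryReserveError> {
        self.try_push_reserving(item, 1)
    }

    /// Reserve first, mutate second. Passing a huge `additional`
    /// (for example `usize::MAX`) forces the reservation to fail.
    pub fn try_push_reserving(
        &mut self,
        item: CString,
        additional: usize,
    ) -> Result<(), TryReserveError> {
        // Fallible step: on error nothing has been modified yet, and
        // `item` is dropped normally because `into_raw` was never called.
        self.ptrs.try_reserve(additional.max(1))?;
        let last = self.ptrs.len() - 1;
        self.ptrs[last] = item.into_raw();
        // Capacity for one more slot is already reserved, so this push
        // does not need to allocate.
        self.ptrs.push(ptr::null());
        Ok(())
    }
}

impl Drop for ArgvModel {
    fn drop(&mut self) {
        for &p in &self.ptrs {
            if !p.is_null() {
                // SAFETY: every non-null slot came from `CString::into_raw`.
                drop(unsafe { CString::from_raw(p.cast_mut()) });
            }
        }
    }
}
```

Because `Command::arg` does not return a `Result`, an actual fix would still have to decide what to do on the error branch. One option, stated here as an assumption, is to call `handle_alloc_error` there, which would unwind before any slot has been overwritten.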

Below is the failing nightly repro showing the recovered `Command` later hitting `os error 14`:

```console
$ cargo +nightly test -p rust_std_command_unwind_repro command_arg_keeps_argv_well_formed_after_alloc_error_unwind -- --exact --nocapture
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.04s
     Running unittests src/lib.rs (target/debug/deps/rust_std_command_unwind_repro-93af15781c654a54)

running 1 test

thread 'command_arg_keeps_argv_well_formed_after_alloc_error_unwind' (1770157) panicked at rust-std-command-unwind-repro/src/lib.rs:232:5:
later safe process creation should still succeed after the caught panic; raw_os_error=Some(14)
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
test command_arg_keeps_argv_well_formed_after_alloc_error_unwind ... FAILED

failures:

failures:
    command_arg_keeps_argv_well_formed_after_alloc_error_unwind

test result: FAILED. 0 passed; 1 failed; 0 ignored; 0 measured; 1 filtered out; finished in 0.00s
```

Below is the companion guard-page proof command. The test passes because it asserts that `posix_spawn` returns `EFAULT` when given an unterminated `argv` whose next pointer slot lies in an inaccessible page:

```console
$ cargo +nightly test -p rust_std_command_unwind_repro posix_spawn_faults_when_unterminated_argv_crosses_guard_page -- --exact --nocapture
    Finished `test` profile [unoptimized + debuginfo] target(s) in 0.04s
     Running unittests src/lib.rs (target/debug/deps/rust_std_command_unwind_repro-93af15781c654a54)

running 1 test
test posix_spawn_faults_when_unterminated_argv_crosses_guard_page ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 1 filtered out; finished in 0.00s
```

I previously discussed this with @cuviper and @emilyalbini in the Security WG. We agree that this is a soundness issue, but because triggering it requires nightly APIs, it can be triaged and fixed in public.
