Fix slow WASI stdin reads by passing size hint to worker thread #13256

Merged
alexcrichton merged 2 commits into bytecodealliance:main from hiddenbit:pass-stdin-size-hint on May 4, 2026

@hiddenbit hiddenbit commented May 2, 2026

Contributor

While working on a program that reads a large amount of data through stdin, I was surprised by slow stdin throughput under wasmtime. Piping 1 GiB into a simple WASI program that reads in 64 KiB chunks took about 38 seconds (~28 MiB/s). I expected something in the gigabytes-per-second order of magnitude.

To demonstrate the issue, below is a minimal WASI program that reads stdin in configurable chunks.

Piping 1 GiB of zeroes and reading it in 65536-byte chunks:

# Before the changes in this PR: takes ~38s -> ~28 MiB/s
time dd if=/dev/zero bs=65536 count=16384 2>/dev/null | /path/to/wasmtime stdio_bench.wasm 65536

# With the changes in this PR: takes ~1.2s -> ~900 MiB/s
time dd if=/dev/zero bs=65536 count=16384 2>/dev/null | /path/to/wasmtime stdio_bench.wasm 65536

Benchmark application code:

// build with: cargo build --release --target wasm32-wasip1
use std::env;
use std::io::{self, Read};
use std::process;

fn main() {
    let chunk_size: usize = env::args()
        .nth(1)
        .and_then(|s| s.parse().ok())
        .unwrap_or_else(|| {
            eprintln!("Usage: stdin_bench <chunk-size-bytes>");
            process::exit(1);
        });

    let stdin = io::stdin();
    let mut handle = stdin.lock();
    let mut buf = vec![0u8; chunk_size];
    let mut total: u64 = 0;

    loop {
        match handle.read(&mut buf) {
            Ok(0) => break,
            Ok(n) => total += n as u64,
            Err(e) => {
                eprintln!("read error: {e}");
                process::exit(1);
            }
        }
    }

    eprintln!("{total} bytes read");
}

Root cause

Regardless of how many bytes were requested (e.g. 65536), the worker thread read at most 1024 bytes (a hard-coded limit) per round-trip, which made bulk reads slow.
See worker_thread_stdin.rs:108.
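
To get a feel for the cost, here is a rough back-of-the-envelope sketch of the round-trip count. It assumes every worker read fills its buffer completely (a pipe may return less) and that the new cap is at least 64 KiB, which is an assumption since the value of MAX_READ_SIZE_ALLOC is not stated here. `round_trips` is a hypothetical helper for illustration, not code from wasmtime.

```rust
/// Number of worker round-trips needed to deliver `total` bytes when the
/// guest asks for `requested` bytes per read but the worker returns at most
/// `cap` bytes per round-trip. Hypothetical helper, for illustration only.
fn round_trips(total: u64, requested: u64, cap: u64) -> u64 {
    // Each round-trip moves at most min(requested, cap) bytes, and at least 1.
    let per_trip = requested.min(cap).max(1);
    total.div_ceil(per_trip)
}

fn main() {
    const GIB: u64 = 1 << 30;
    // Old behaviour: hard-coded 1024-byte buffer.
    let before = round_trips(GIB, 65536, 1024);
    // New behaviour, assuming the cap does not limit a 64 KiB request.
    let after = round_trips(GIB, 65536, 65536);
    println!("before: {before} round-trips, after: {after} round-trips");
}
```

Under these assumptions, 1 GiB needs 1,048,576 round-trips with the 1024-byte buffer and 16,384 with a 64 KiB buffer.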

Fix

StdinState::ReadRequested now carries a usize size hint from the caller. The worker thread uses this hint to size its read buffer, clamped to [1, MAX_READ_SIZE_ALLOC] to avoid guest-controlled unbounded allocation while still enabling efficient bulk reads.

Callers now store a size hint when transitioning to ReadRequested; Pollable::ready uses MAX_READ_SIZE_ALLOC because it does not know the size of the following read.
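
A minimal sketch of the idea, not the actual wasmtime code: the enum below is a simplified stand-in (the `Idle` variant and the `buffer_size` helper are hypothetical), and the value given to `MAX_READ_SIZE_ALLOC` is a placeholder chosen for illustration.

```rust
// Placeholder value for illustration; the real constant is defined in wasmtime.
const MAX_READ_SIZE_ALLOC: usize = 64 * 1024;

/// Simplified stand-in for the worker's state machine.
#[derive(Debug, PartialEq)]
enum StdinState {
    /// Hypothetical "nothing requested" state.
    Idle,
    /// A read was requested; the payload is the caller's size hint.
    ReadRequested(usize),
    /// Stdin reached EOF or failed.
    Closed,
}

/// Buffer size the worker would allocate for the current state, if any.
/// The hint is clamped to [1, MAX_READ_SIZE_ALLOC].
fn buffer_size(state: &StdinState) -> Option<usize> {
    match state {
        StdinState::ReadRequested(hint) => Some((*hint).clamp(1, MAX_READ_SIZE_ALLOC)),
        StdinState::Idle | StdinState::Closed => None,
    }
}

fn main() {
    // A read of a known size passes that size along as the hint.
    let from_read = StdinState::ReadRequested(4096);
    // A readiness check does not know the size of the following read,
    // so it asks for the maximum.
    let from_ready = StdinState::ReadRequested(MAX_READ_SIZE_ALLOC);
    // An oversized guest request is capped rather than honoured.
    let oversized = StdinState::ReadRequested(usize::MAX);

    for state in [from_read, from_ready, oversized, StdinState::Idle, StdinState::Closed] {
        println!("{state:?} -> {:?}", buffer_size(&state));
    }
}
```

Carrying the hint inside the state means the worker needs no extra channel to learn how much to read; it sees the size at the same moment it sees the request.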

When reading 1 GiB in 64 KiB chunks, I now get:

| Before | After | Speedup |
|---|---|---|
| ~28 MiB/s | ~900 MiB/s | 32x |

@hiddenbit hiddenbit requested a review from a team as a code owner May 2, 2026 15:08
@github-actions github-actions Bot added the wasi (Issues pertaining to WASI) label May 2, 2026

@alexcrichton alexcrichton left a comment

Member

Thanks! One small comment but otherwise looks good to me 👍

// Extract the size hint from the request and cap it to `MAX_READ_SIZE_ALLOC`
// to avoid guest-controlled unbounded allocation.
let size_hint = match *lock {
StdinState::ReadRequested(size) => size.min(MAX_READ_SIZE_ALLOC).max(1024),

I think it's fine to remove this max(1024) call which I think is just a holdover from before? It seems reasonable to me that the size is clamped but otherwise un-tampered with.

@hiddenbit

Contributor Author

Thanks for the suggestion! You're right that .max(1024) is a holdover from the old hardcoded buffer size and doesn't belong here conceptually.

However, removing max(...) entirely introduces a subtle issue: A WASI guest can call read(0). If that propagates to the worker thread, it allocates BytesMut::zeroed(0) and calls stdin().read(&mut []), which returns Ok(0). The worker then interprets Ok(0) as EOF and transitions to StdinState::Closed, which permanently closes stdin. A subsequent read(42) would fail with StreamError::Closed. (If I didn't miss anything 😅)

I've addressed it like this:

  1. Short-circuit at the call site: InputStream::read now returns Ok(Bytes::new()) immediately when size == 0, avoiding an unnecessary worker wake-up entirely.
  2. Safety floor in the worker: Replaced .max(1024) with .max(1) so that even if a zero somehow reaches the worker (e.g. from a future code path), it can never be misinterpreted as EOF.

This removes the arbitrary 1024 floor while guarding against the edge case.
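
The edge case can be reproduced outside wasmtime with a small sketch. Here an in-memory `Cursor` stands in for stdin, and `ReadOutcome`, `worker_read`, and `guarded_read` are hypothetical names; the real code uses `BytesMut`, `StdinState`, and `InputStream::read` instead.

```rust
use std::io::{Cursor, Read};

/// Simplified result of one worker step (hypothetical type).
#[derive(Debug, PartialEq)]
enum ReadOutcome {
    Data(Vec<u8>),
    Closed,
}

/// Naive worker step: allocates `size` bytes and treats Ok(0) as EOF.
fn worker_read(src: &mut impl Read, size: usize) -> ReadOutcome {
    let mut buf = vec![0u8; size];
    match src.read(&mut buf) {
        // With an empty buffer this arm is hit even though data remains.
        Ok(0) => ReadOutcome::Closed,
        Ok(n) => {
            buf.truncate(n);
            ReadOutcome::Data(buf)
        }
        // Errors are folded into Closed to keep the sketch short.
        Err(_) => ReadOutcome::Closed,
    }
}

/// Guarded variant: short-circuit zero-sized reads at the call site and
/// keep a floor of 1 in the worker as a second line of defence.
fn guarded_read(src: &mut impl Read, size: usize) -> ReadOutcome {
    if size == 0 {
        return ReadOutcome::Data(Vec::new());
    }
    worker_read(src, size.max(1))
}

fn main() {
    let mut naive_src = Cursor::new(b"hello".to_vec());
    // Zero-sized read is misread as EOF although 5 bytes are still pending.
    println!("naive read(0):   {:?}", worker_read(&mut naive_src, 0));

    let mut src = Cursor::new(b"hello".to_vec());
    println!("guarded read(0): {:?}", guarded_read(&mut src, 0));
    println!("guarded read(5): {:?}", guarded_read(&mut src, 5));
}
```

The short-circuit alone would cover today's call sites; the floor of 1 only matters if some later caller forwards a zero.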

@alexcrichton alexcrichton added this pull request to the merge queue May 4, 2026
@alexcrichton

Member

Thanks!

Merged via the queue into bytecodealliance:main with commit 85ae784 May 4, 2026
48 checks passed