Feature Request: Improve SDK ergonomics for Rust consumers and AI agent use cases #218

## Background

I'm building [desktop-assistant](https://github.com/anthropics/desktop-assistant) — a Tauri 2 (Rust) desktop app with an AI agent that uses BoxLite as its sandbox environment. After several months of integration work, I've collected feedback on pain points and suggestions for improving the BoxLite SDK.

The current integration path is: **Tauri (Rust) → spawn Python daemon → Python SDK → BoxLite Rust core**. This works but introduces significant complexity. Most of the workaround code we've written (~600 lines) exists because the SDK doesn't expose capabilities that BoxLite already has internally.

---

## 1. Publish a Rust crate as a first-class SDK

**Pain point:** BoxLite's core is written in Rust, but Rust consumers must go through a Python/Node FFI bridge. In our case, this means:

- A JSON-over-stdin/stdout IPC protocol with a Python daemon subprocess
- Complex daemon script discovery logic (searching CWD, exe dir, ancestor directories)
- Manual daemon crash recovery (detecting closed channels, clearing session state)
- 120s timeout handling for daemon hangs
- Serialization overhead on every call
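To make the per-call cost concrete, here is a minimal sketch of newline-delimited JSON framing over stdin/stdout. The message fields (`id`, `method`, `params`) and both function names are hypothetical; they illustrate the shape of such a protocol, not our actual wire format.

```python
import json


def encode_request(request_id: int, method: str, params: dict) -> bytes:
    """Serialize one request as a single newline-terminated JSON line."""
    # Field names are illustrative assumptions, not a real BoxLite protocol.
    message = {"id": request_id, "method": method, "params": params}
    return (json.dumps(message) + "\n").encode("utf-8")


def decode_response(line: bytes) -> dict:
    """Parse one response line; an empty read means the daemon closed the pipe."""
    if not line:
        raise ConnectionError("daemon closed stdout (crashed or exited)")
    return json.loads(line.decode("utf-8"))
```

Every sandbox call pays for one encode and one decode like this, plus the crash-detection branch, on both sides of the pipe.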

**Suggestion:** Publish a `boxlite` Rust crate so Rust consumers can use the library directly:

```toml
[dependencies]
boxlite = "0.6"
```

```rust
let runtime = BoxliteRuntime::new(Options::default()).await?;
let litebox = runtime.create(BoxOptions { image: "ubuntu:latest".into(), ..Default::default() }).await?;
litebox.start().await?;
let result = litebox.exec("bash", &["-c", "echo hello"]).await?.wait().await?;
```

This would eliminate the entire daemon layer for Rust consumers.

---

## 2. Add `exec_shell()` convenience method

**Pain point:** In AI agent scenarios, 99% of commands are `bash -c "some shell command"`. Having to split program and args every time is tedious:

```python
# Current
execution = await box.exec("/bin/bash", args=["-c", "cd /workspace && grep -rn 'TODO' ."])

# Desired
execution = await box.exec_shell("cd /workspace && grep -rn 'TODO' .")
```

**Suggestion:** Add an `exec_shell(command: &str)` convenience method that wraps `exec("bash", &["-c", command])`.
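On the Python side, the wrapper could be as small as the sketch below. It assumes `box.exec(program, args=[...])` behaves as in the example above; the free function `exec_shell` and the `FakeBox` stand-in are hypothetical and exist only so the sketch runs without BoxLite installed.

```python
import asyncio


async def exec_shell(box, command: str, shell: str = "/bin/bash"):
    """Run `command` through `shell -c`, mirroring the call shape shown above."""
    return await box.exec(shell, args=["-c", command])


class FakeBox:
    """Stand-in for a real box; records calls instead of running them."""

    def __init__(self):
        self.calls = []

    async def exec(self, program, args=None):
        self.calls.append((program, list(args or [])))
        return "execution-handle"


fake = FakeBox()
result = asyncio.run(exec_shell(fake, "cd /workspace && grep -rn 'TODO' ."))
```

Passing the command as a single `-c` argument avoids any host-side quoting, because the string is never re-parsed before it reaches the shell.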

---

## 3. Support in-memory file transfer via `copy_in` / `copy_out`

**Pain point:** Writing files into the VM currently requires a base64-over-bash hack:

```rust
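// Workaround: base64-encode on the host so arbitrary bytes survive the shell, then decode inside the VM.
// `path` is still interpolated into the shell string, so a single quote in it breaks the command.
let encoded = base64::encode(content);
sandbox.execute(&["bash", "-c", &format!("printf '%s' '{}' | base64 -d > '{}'", encoded, path)])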
```

This is fragile (shell escaping, size limits, special characters) and doesn't leverage BoxLite's existing gRPC `Files.Upload` / `Files.Download` RPCs.

**Suggestion:** Ensure the SDK exposes file transfer that supports:
- Writing from an in-memory buffer (not just host file → container file)
- Streaming for large files
- Clear error types (permission denied vs path not found vs disk full)

```python
# From memory buffer
await box.write_file("/workspace/script.py", content=b"print('hello')")

# From host file (existing)
await box.copy_in("/host/path", "/container/path")
```
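Until an in-memory API exists, a less fragile interim option than base64-over-bash might be to stage the buffer in a host temp file and reuse `copy_in`. This is a sketch only: it assumes `copy_in(host_path, container_path)` is awaitable as shown above, and the helper `write_file` and the `FakeBox` stand-in are hypothetical.

```python
import asyncio
import os
import tempfile


async def write_file(box, container_path: str, content: bytes) -> None:
    """Stage `content` in a host temp file, then hand it to `copy_in`."""
    fd, host_path = tempfile.mkstemp(prefix="boxlite-upload-")
    try:
        with os.fdopen(fd, "wb") as staged:
            staged.write(content)
        await box.copy_in(host_path, container_path)
    finally:
        os.unlink(host_path)  # always remove the staging file


class FakeBox:
    """Stand-in for a real box; reads the staged file instead of uploading it."""

    def __init__(self):
        self.files = {}
        self.host_paths = []

    async def copy_in(self, host_path, container_path):
        self.host_paths.append(host_path)
        with open(host_path, "rb") as staged:
            self.files[container_path] = staged.read()


fake = FakeBox()
asyncio.run(write_file(fake, "/workspace/script.py", b"print('hello')"))
```

This sidesteps shell escaping entirely, at the cost of a disk round trip per write, which is why a native buffer-based call would still be preferable.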

---

## 4. Expose `ResizeTty` in the SDK

**Pain point:** The Python SDK doesn't expose the gRPC `Execution.ResizeTty` RPC, so we have to send `stty cols X rows Y\n` via stdin as a workaround. This breaks when a program (vim, top, etc.) is running because the command gets sent as program input.

**Suggestion:** Expose the existing `ResizeTty` capability:

```python
# Current workaround (breaks during program execution)
stdin.send_input(f"stty cols {cols} rows {rows}\n".encode())

# Desired
await execution.resize_tty(cols=120, rows=40)
```

---

## 5. Better health checking and state recovery

**Pain points:**
- VM hangs are only detected after a 120s timeout
- Runtime lock conflicts (`"Another BoxliteRuntime is already using directory"`) require manually killing stale processes
- Consumers must implement their own crash detection and recovery logic

**Suggestions:**
- `box.health_check()` — lightweight check returning VM state (not just "alive")
- `runtime.cleanup_stale()` — auto-clean stale locks and PID files
- Auto-reconnect at the SDK level when the gRPC channel drops, instead of surfacing the error
- Box state change events (callback / event stream) so consumers don't have to poll
- `runtime.get_or_create(name, options)` — idempotent method for reconnecting after app restart
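For comparison, the closest a consumer can get today is a liveness probe with a short deadline, roughly as sketched below. It assumes `box.exec` is awaitable as in the earlier examples; `is_responsive` and the two fake boxes are hypothetical. Note that it only answers "alive or not", which is exactly the limitation a real `health_check()` would remove.

```python
import asyncio


async def is_responsive(box, timeout_s: float = 5.0) -> bool:
    """Return True if a trivial command is accepted within `timeout_s` seconds."""
    try:
        await asyncio.wait_for(
            box.exec("/bin/bash", args=["-c", "true"]), timeout=timeout_s
        )
        return True
    except asyncio.TimeoutError:
        # Only hangs are reported here; other errors propagate to the caller
        # so a dropped channel can be told apart from a stuck VM.
        return False


class HealthyBox:
    async def exec(self, program, args=None):
        return "ok"


class HungBox:
    async def exec(self, program, args=None):
        await asyncio.sleep(60)  # simulates a VM that never answers


healthy = asyncio.run(is_responsive(HealthyBox(), timeout_s=0.05))
hung = asyncio.run(is_responsive(HungBox(), timeout_s=0.05))
```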

---

## 6. Documentation for AI agent integration

BoxLite is a great fit for AI agent sandboxing, but there's no guide for this use case. Suggested topics:

- **Recommended config:** CPU/memory/image for typical agent workloads
- **Concurrency model:** Is it safe to run multiple concurrent `exec()` calls in one VM, or should each agent get its own VM?
- **Timeout handling:** How to ensure processes are killed after exec timeout (we've seen zombie processes)
- **Security boundaries:** Proper readonly volume configuration (we're using `chmod -R a-w` as a workaround)
- **File transfer patterns:** Best approach for many small file writes (exec+base64 vs copy_in vs volume mount)
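On the timeout question, one pattern the guide could evaluate is enforcing the deadline inside the guest with coreutils `timeout`, as in this sketch. It assumes the image ships GNU coreutils; the helper name `with_timeout` is hypothetical. Whether this also reaps processes that detach from the command is something the guide would need to confirm.

```python
def with_timeout(command: str, seconds: int, grace_seconds: int = 5) -> list:
    """Build an argv that runs `command` under coreutils `timeout`.

    `timeout` sends SIGTERM after `seconds`, then SIGKILL after a further
    `grace_seconds` if the command is still running.
    """
    return [
        "timeout",
        f"--kill-after={grace_seconds}",
        str(seconds),
        "/bin/bash",
        "-c",
        command,
    ]


argv = with_timeout("sleep 600", seconds=30)
```

The result would be passed as `box.exec(argv[0], args=argv[1:])`, so the kill happens in the guest even if the host-side call has already given up.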

---

## 7. Minor API improvements

| Current | Suggestion |
|---------|-----------|
| `BoxOptions.volumes` uses tuple `(host, container, mode)` | Use a struct `VolumeMount { host, container, readonly }` for clarity |
| Error messages sometimes lack context (e.g., "Portal error") | Include which box and what operation failed |
| Box name must be provided at create time | Support auto-generated meaningful names or `box.rename()` |
| Only `list_info()` to find existing boxes | Add `runtime.get_or_create(name, options)` for idempotent reconnection (also proposed in section 5) |
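
As an illustration of the first row, a Python-side `VolumeMount` could look like the sketch below. The field names come from the table; the `from_tuple` helper and the assumption that the tuple's mode is the string `"ro"` for read-only mounts are mine, not confirmed SDK behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeMount:
    host: str
    container: str
    readonly: bool = False

    @classmethod
    def from_tuple(cls, volume: tuple) -> "VolumeMount":
        """Convert a (host, container, mode) tuple; "ro" is assumed to mean read-only."""
        host, container, mode = volume
        return cls(host=host, container=container, readonly=(mode == "ro"))


mount = VolumeMount.from_tuple(("/host/data", "/workspace/data", "ro"))
```

Named fields make a swapped host and container path visible at the call site, which a positional tuple does not.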

---

## Summary

BoxLite's core architecture (Rust + libkrun + gRPC) is solid. The main improvement area is **reducing integration friction** — especially providing a direct Rust crate, and optimizing the API surface for the AI agent sandboxing use case. Most of the workaround code we maintain (daemon management, shell escaping, base64 file transfer, resize hacks) exists because the SDK layer doesn't expose capabilities that BoxLite already has internally.

Happy to discuss any of these in more detail or contribute PRs.
