Investigate alternatives to fixed 5ms pacing in the macOS multiline PTY workaround #300762

@jcansdale

Description

Context

This is a follow-up issue for the workaround discussion raised in @Jarred-Sumner's post:

Related VS Code work:

The original bug is real: on macOS, multiline PTY input can get corrupted or stuck when the shell is fed a large multiline payload through VS Code's terminal pipeline. The current PR fixes one concrete VS Code-side bug by making the macOS multiline chunking gate UTF-8 byte-aware instead of relying on JS string length.

This issue is about the remaining design question: whether the current fixed 5 ms pacing heuristic is acceptable, and what viable alternatives exist.

What We Investigated

We looked at two separate concerns raised in the discussion:

  1. Whether the current workaround's 5 ms sleep between chunks is too heuristic, non-deterministic, or slow.
  2. Whether short writes or retries in the lower-level write path might actually be the main issue.

1. UTF-8 predicate mismatch in VS Code

We confirmed a real VS Code-side bug in the current macOS multiline gate:

  • The old gate used JS string length.
  • The PTY limit that matters here is UTF-8 bytes.
  • Multibyte payloads can stay under the UTF-16 gate while exceeding the relevant UTF-8 threshold.

That is now covered by a regression test in the PR and fixed locally by switching the gate and chunk splitting to UTF-8 byte semantics.
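The fix pattern can be sketched as follows. This is an illustrative sketch, not VS Code's actual implementation: `CHUNK_LIMIT_BYTES`, `exceedsPtyLimit`, and `chunkByBytes` are hypothetical names, and the limit value is made up.

```typescript
// Hypothetical sketch: gate on UTF-8 bytes rather than UTF-16 code units.
// CHUNK_LIMIT_BYTES is an illustrative constant, not VS Code's real threshold.
const CHUNK_LIMIT_BYTES = 256;

function exceedsPtyLimit(data: string): boolean {
  // Buffer.byteLength counts UTF-8 bytes; data.length counts UTF-16 code units.
  return Buffer.byteLength(data, 'utf8') > CHUNK_LIMIT_BYTES;
}

// Split on code points so a multibyte character never straddles a chunk boundary.
function* chunkByBytes(data: string, limitBytes: number): Generator<string> {
  let current = '';
  let bytes = 0;
  for (const ch of data) { // for..of iterates by code point, not code unit
    const b = Buffer.byteLength(ch, 'utf8');
    if (bytes + b > limitBytes && current) {
      yield current;
      current = '';
      bytes = 0;
    }
    current += ch;
    bytes += b;
  }
  if (current) yield current;
}

// A multibyte payload can pass a length gate while failing the byte gate:
const payload = '€'.repeat(200);                  // '€' is 1 code unit but 3 UTF-8 bytes
console.log(payload.length);                      // 200
console.log(Buffer.byteLength(payload, 'utf8'));  // 600
console.log(exceedsPtyLimit(payload));            // true
```

The same payload would sail through a `data.length > 256` gate, which is exactly the mismatch the regression test covers.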

2. Lower-level partial write and retry handling

We re-read the node-pty Unix write path and did not find evidence that this specific repro is simply explained by missing short-write handling in VS Code.

At a high level, upstream node-pty already appears to:

  • queue writes,
  • handle partial writes with offsets,
  • retry on EAGAIN.
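
The pattern those three bullets describe looks roughly like the sketch below. This is not node-pty's actual code; `WriteFn`, `drainQueue`, and `scheduleRetry` are illustrative stand-ins for whatever the real write path uses.

```typescript
// Illustrative queue/offset/EAGAIN pattern, not node-pty's implementation.
// A WriteFn returns the number of bytes written and throws an error with
// code 'EAGAIN' when the kernel buffer is full.
type WriteFn = (buf: Buffer) => number;

function drainQueue(
  queue: Buffer[],
  write: WriteFn,
  scheduleRetry: (fn: () => void) => void
): void {
  while (queue.length > 0) {
    const buf = queue[0];
    let n: number;
    try {
      n = write(buf);
    } catch (err: any) {
      if (err.code === 'EAGAIN') {
        // Kernel buffer full: retry later without losing queued data.
        scheduleRetry(() => drainQueue(queue, write, scheduleRetry));
        return;
      }
      throw err;
    }
    if (n < buf.length) {
      // Short write: keep the unwritten tail at the head of the queue.
      queue[0] = buf.subarray(n);
    } else {
      queue.shift();
    }
  }
}
```

If upstream already does all of this, data loss from a missed short write is an unlikely explanation for the repro, which is why we lean toward the backpressure theory below.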

So this still looks like a separate issue from the VS Code-side UTF-16-vs-UTF-8 predicate bug and the canonical-mode or line-editor backpressure behavior seen on macOS.

That said, this area could still use a sharper audit from people who know the PTY stack better.

What We Tried

Baseline fixed-delay workaround

Existing local findings from the macOS multiline workaround investigation:

  • timeout(0) failed the medium and large multiline tests.
  • timeout(1) passed the medium case but still failed larger cases.
  • timeout(5) passed the existing 500-line local probe.
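
For reference, the fixed-delay workaround reduces to something like this sketch. `writeMultilinePaced` and `writeChunk` are hypothetical names; only the 5 ms figure comes from the findings above.

```typescript
// Minimal sketch of fixed-delay pacing, assuming per-line chunks.
// PACING_DELAY_MS mirrors the timeout(5) finding above; writeChunk is a
// stand-in for the real PTY write.
const PACING_DELAY_MS = 5;

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms));

async function writeMultilinePaced(
  lines: string[],
  writeChunk: (s: string) => void
): Promise<void> {
  for (const line of lines) {
    writeChunk(line + '\n');
    // Give the canonical-mode line editor time to drain before the next line.
    await sleep(PACING_DELAY_MS);
  }
}
```

The obvious cost is latency: at 5 ms per line, a 1000-line paste pays roughly 5 seconds of pacing, which is the user-visible concern driving this issue.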

Event-driven pacing spike

We spiked a replacement for the fixed 5 ms delay in TerminalProcess.input():

  • chunk multiline writes,
  • wait for PTY output activity after each chunk,
  • then wait for a short quiet period before sending the next chunk.

We also tried a few variants:

  • wait for the first PTY activity event only,
  • use a 1 ms quiet window after activity,
  • use 0 ms,
  • use a next-turn yield (setTimeout0) instead of 1 ms.

Results

Small regression coverage

The event-driven spike with a 1 ms quiet window looked promising against the current small regression coverage:

  • medium ASCII multiline case passed repeatedly,
  • multibyte UTF-8 regression case passed repeatedly.

Larger payload stress

The same spike did not hold up under larger standalone ASCII payload probes:

  • 100 lines: 0/3 passes,
  • 500 lines: 0/3 passes,
  • 1000 lines: 0/3 passes,
  • 1500 lines: 0/3 passes,
  • 1660 lines: 0/3 passes.

Failure modes included:

  • no output file produced at all,
  • corrupted tail content rather than a clean stall.

The 0 ms and setTimeout0 variants were both too aggressive and regressed even the medium ASCII case.

Secondary issue surfaced by the spike

The event-driven spike also exposed an implementation problem in standalone Node-based probing:

  • the spike's internal timeout(...).cancel() path generated repeated unhandled Canceled promise rejections when run under plain Node 23,
  • the existing Mocha tests did not surface this, but the standalone harness did.

So even setting aside the larger-payload regressions, that spike implementation would need cleanup before it could be considered further.
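
One cleanup direction is to make cancellation resolve the pending delay instead of rejecting it, so a cancelled wait is a normal control-flow outcome rather than an unhandled rejection. A hedged sketch, not VS Code's actual `timeout` utility:

```typescript
// A cancellable delay whose cancellation resolves (to 'cancelled') instead of
// rejecting, avoiding the unhandled Canceled rejections seen in the probe.
function cancellableDelay(ms: number): {
  promise: Promise<'done' | 'cancelled'>;
  cancel: () => void;
} {
  let timer: ReturnType<typeof setTimeout> | undefined;
  let settle!: (v: 'done' | 'cancelled') => void;
  const promise = new Promise<'done' | 'cancelled'>(resolve => {
    settle = resolve;
    timer = setTimeout(() => resolve('done'), ms);
  });
  return {
    promise,
    cancel: () => {
      if (timer) clearTimeout(timer);
      settle('cancelled'); // resolve, not reject: callers branch on the value
    },
  };
}
```

Callers then check the resolved value (`'done'` vs `'cancelled'`) rather than wrapping every await in a try/catch for a cancellation error.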

Current Conclusion

What seems justified right now:

  • Keep the UTF-8 byte-aware fix. That addresses a real bug.
  • Do not assume the event-driven pacing spike is a drop-in replacement for the current fixed delay.
  • Do not assume 0 ms, yield, or setTimeout0 is sufficient. We tried those and they regressed.

At the moment, the fixed-delay workaround still seems to be the only local approach that survives the larger payload probes we ran.

Open Questions

  1. Is there a better PTY-level signal than a fixed sleep for knowing when canonical-mode input is safe to continue writing on macOS?
  2. Is the remaining problem fundamentally shell line-editor backpressure, or is there still a lower-level write-path detail we are missing?
  3. Is there a better mitigation than a flat per-chunk delay, for example adaptive pacing based on observed output or buffer state?
  4. How much of the observed failure boundary is true PTY behavior versus scheduling noise on macOS?
  5. How should we think about the tradeoff between a narrow macOS-only workaround and the user-visible latency concerns Jarred raised?

If people familiar with macOS PTYs, shells in canonical mode, or node-pty internals want to weigh in, that would be useful.

Metadata

Labels

insiders-released: Patch has been released in VS Code Insiders
