Dev-server tests: replace time-based synchronization with event-driven approach #8556

@Boshen

Previous findings

  • Dec 19, 2025 (chore(test-dev-server): add retry mechanism to hmr-full-bundle-mode tests #7588): Browser HMR tests were flaky because files modified during a test stayed modified on retry, so the find-and-replace edits no longer matched and failed. Fixed by calling resetTestFiles() in a beforeEach hook.
  • Dec 18, 2025: Dev-server tests separated into their own CI workflow due to flakiness affecting other test suites.
  • Watch config uses poll-based watching (usePolling: true, 50ms interval) + debouncing (310ms duration, 300ms tick rate) specifically because native FS events are unreliable in CI.
  • sensibleTimeoutInMs multiplies all timeouts by 3 in CI environments.
  • Vitest configs set retry: 3 in CI for both fixture and browser test suites.
  • Test timeout increased from 40s to 90s for Windows compatibility.

Why this matters

The hardcoded sleeps and retries make CI slow. A single fixture test step waits 2130ms (710ms × 3 CI multiplier) between HMR steps, plus 6000ms for module registration, plus 6000ms for reload waits. Multiply by retry count of 3 and multiple fixtures, and the dev-server test suite burns minutes on sleeping alone. Fixing the synchronization model removes both the flakiness and the wasted CI time.
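
To make the arithmetic above concrete, here is a rough sketch of the sleep budget. The durations are the ones quoted in this issue; the helper names and the model itself (one registration wait and one reload wait per attempt, plus one debounce gap per HMR step) are assumptions for illustration, not code from the repo.

```typescript
// Base sleep durations in ms, taken from the figures quoted in this issue.
const BASE_SLEEP_MS = {
  hmrStepGap: 710,
  moduleRegistration: 2000,
  reloadWait: 2000,
} as const;

// CI scaling factor described in the issue.
const CI_MULTIPLIER = 3;

// Hypothetical model: one registration wait and one reload wait per attempt,
// plus one debounce gap per HMR step, all scaled by the multiplier.
function sleepPerAttemptMs(hmrSteps: number, multiplier: number = CI_MULTIPLIER): number {
  const base =
    hmrSteps * BASE_SLEEP_MS.hmrStepGap +
    BASE_SLEEP_MS.moduleRegistration +
    BASE_SLEEP_MS.reloadWait;
  return base * multiplier;
}

// Total time spent sleeping if a fixture runs `attempts` times.
function totalSleepMs(hmrSteps: number, attempts: number): number {
  return sleepPerAttemptMs(hmrSteps) * attempts;
}

console.log(sleepPerAttemptMs(1)); // 14130
console.log(totalSleepMs(1, 3)); // 42390
```

Under these assumptions, a fixture with a single HMR step sleeps about 14 seconds per attempt in CI, and about 42 seconds if it runs three times.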

Problem

The dev-server tests (packages/test-dev-server/) are flaky because they rely on hardcoded sleeps and polling to synchronize with the dev engine pipeline:

File write → FS poll (50ms) → Debounce (310ms+300ms) → Coordinator → Build → HMR compute → JS callback → WebSocket → Client → Observable result

Each step has non-deterministic latency. Tests use fixed timeouts to "wait long enough", which is inherently fragile.

Hardcoded timing hacks

| File | Sleep | Purpose |
| --- | --- | --- |
| fixtures.test.ts:102 | 710ms (2130ms CI) | Debounce gap between HMR steps |
| fixtures.test.ts:130 | 2000ms (6000ms CI) | Wait before sending reload message |
| fixtures.test.ts:184 | 2000ms (6000ms CI) | Wait for module registration |
| test-utils.ts:25 | 1000ms | After every file edit for FS events |
| vitest-setup-playwright.ts:146 | 3000ms | File reset on retry |
| vitest.config.*.mts | retry: 3 in CI | Masks all the above |
| src/utils.ts:20 | All timeouts ×3 in CI | Brute-force margin |

Root causes

  1. FS watcher poll+debounce timing is non-deterministic: A file write takes 360ms-660ms+ to be detected, depending on where in the poll/tick cycle it lands.

  2. No back-channel from dev engine to tests: Tests have zero visibility into dev engine state. They write a file and blindly hope the entire pipeline completes within a hardcoded timeout.

  3. Module registration race (fixtures.test.ts:184): Test sleeps 2s after starting artifact process, hoping WebSocket module registration completes. If the Node process or WebSocket is slow, HMR computation finds zero clients.

  4. Rust-side FIXME (bundle_coordinator.rs:84): State set to Idle before initial build starts — acknowledged as wrong in a FIXME comment.
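
One possible reading of the 360ms-660ms figure in root cause 1 is sketched below. The model is an assumption and is not taken from the watcher's source: detection costs one poll interval plus the debounce duration, and up to one extra debounce tick depending on where the write lands. The type and function names are hypothetical.

```typescript
// Watch settings quoted in this issue.
interface WatchTiming {
  pollIntervalMs: number;
  debounceMs: number;
  debounceTickMs: number;
}

const WATCH_TIMING: WatchTiming = {
  pollIntervalMs: 50,
  debounceMs: 310,
  debounceTickMs: 300,
};

// Assumed model: best case is one poll plus the debounce duration;
// worst case adds one full debounce tick on top of that.
function detectionWindowMs(t: WatchTiming): { min: number; max: number } {
  const min = t.pollIntervalMs + t.debounceMs;
  return { min, max: min + t.debounceTickMs };
}

// Headroom a fixed sleep leaves after the slowest modeled detection.
// Build, HMR computation, and WebSocket delivery all have to fit inside it.
function marginMs(sleepMs: number, t: WatchTiming): number {
  return sleepMs - detectionWindowMs(t).max;
}

console.log(detectionWindowMs(WATCH_TIMING)); // { min: 360, max: 660 }
console.log(marginMs(710, WATCH_TIMING)); // 50
```

If this model is roughly right, the local 710ms sleep leaves only about 50ms for everything downstream of the watcher in the worst case, and the issue's "660ms+" suggests the real upper bound can be higher still.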

Proposed solution

Replace time-based waits with event-driven synchronization:

  1. Have the dev server expose a "pipeline idle" signal (REST endpoint like GET /_dev/status returning { idle: true, buildCount: N }, or a signal file) that tests can await after each file edit.

  2. Replace sleep() calls in fixture tests with: write file → await idle signal → assert result.

  3. For module registration: emit a "client registered" event instead of sleeping 2s.

  4. Fix the coordinator FIXME at bundle_coordinator.rs:84.
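
A minimal sketch of what steps 1 and 2 could look like from the test side. The GET /_dev/status endpoint does not exist yet, so the response shape, the waitForIdle helper, and its options are all assumptions. The status fetcher is injected so the helper can be exercised without a running dev server.

```typescript
// Assumed response shape of the proposed (not yet existing) GET /_dev/status endpoint.
interface DevStatus {
  idle: boolean;
  buildCount: number;
}

type StatusFetcher = () => Promise<DevStatus>;

// Hypothetical helper: resolves once the pipeline reports idle AND has finished
// at least one build since `previousBuildCount`. Checking the build count avoids
// accepting the stale "idle" state the server may still report right after a
// file write, before the watcher has noticed the change.
async function waitForIdle(
  getStatus: StatusFetcher,
  previousBuildCount: number,
  { timeoutMs = 10_000, intervalMs = 25 } = {},
): Promise<DevStatus> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const status = await getStatus();
    if (status.idle && status.buildCount > previousBuildCount) return status;
    if (Date.now() >= deadline) {
      throw new Error(`dev server not idle after ${timeoutMs}ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A test would read the build count, write the file, then call waitForIdle with a fetcher that requests the status endpoint and parses the JSON. The helper still polls, but it polls observable state rather than guessing a duration: the timeout only bounds how long a genuinely broken run takes to fail, so it can be generous without slowing down passing runs.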

This would make tests deterministic regardless of system speed and allow removing the retry count and timeout multipliers.
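
For the module registration race, the "client registered" event could be awaited along these lines. This is only a sketch: the event name, the clientId payload, and the waitForClientRegistered helper are hypothetical, and it assumes the test can reach an EventEmitter (or an equivalent hook) owned by the dev server.

```typescript
import { EventEmitter } from "node:events";

// Hypothetical event name; the dev server does not emit this today.
const CLIENT_REGISTERED = "client-registered";

// Resolves with the (assumed) client id as soon as a client registers,
// or rejects if nothing registers within `timeoutMs`.
function waitForClientRegistered(events: EventEmitter, timeoutMs = 10_000): Promise<string> {
  return new Promise<string>((resolve, reject) => {
    const timer = setTimeout(() => {
      // Remove the pending listener so a timed-out wait leaves nothing behind.
      events.off(CLIENT_REGISTERED, onRegistered);
      reject(new Error(`no client registered within ${timeoutMs}ms`));
    }, timeoutMs);

    function onRegistered(clientId: string): void {
      clearTimeout(timer);
      resolve(clientId);
    }

    events.once(CLIENT_REGISTERED, onRegistered);
  });
}
```

The waiter should be attached before the artifact process is started. Otherwise a fast client could register before the listener exists and the event would be missed.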
