Improve CI timings #8553

@Boshen

Goal

Reduce required CI wall-clock time for faster PR merges, while preserving current required coverage in the first pass.

Current Required Merge Flow

flowchart TD
  A[Detect Changes] --> B[Rust Validation]
  A --> C[Repo Validation]
  A --> D[Spell Check]
  A --> E[cargo-test matrix: ubuntu/macos/windows]
  A --> F[build-rolldown-native matrix: ubuntu/macos/windows]
  A --> I[Node Validation]

  F --> J[node-test: windows/macos/ubuntu]
  F --> K[node-dev-server-test: windows/macos/ubuntu]
  F --> L[Type Check]

Workflow Files (Hotspots)

  • .github/workflows/ci.yml
  • .github/workflows/reusable-node-test.yml
  • .github/workflows/reusable-node-dev-server-test.yml
  • .github/workflows/reusable-cargo-test.yml
  • .github/workflows/reusable-native-build.yml

Timing Baseline (Pre-optimization)

Measured from runs:

  • 22699519602
  • 22678505235

Total required CI wall-clock:

  • ~11.23 min
  • ~11.38 min

Job Family Averages (Pre-optimization)

| Job family | Avg duration | Max duration |
|---|---|---|
| node-dev-server-test | 5.69m | 7.32m |
| cargo-test | 4.98m | 6.23m |
| node-test | 4.38m | 4.75m |
| Node Validation | 4.35m | 4.43m |
| Build @rolldown/browser | 4.23m | 4.60m |
| build-rolldown-wasi | 4.01m | 4.22m |
| build-rolldown-native | 2.83m | 3.88m |

Progress

| PR | Title | Status | Slow steps addressed | Estimated savings |
|---|---|---|---|---|
| #8554 | ci: dedupe type-check from dev server workflow | ✅ Merged | node-dev-server-test (avg 5.69m, worst 7.32m): removed redundant pnpm type-check step | Reduces the slowest CI lane; type-check already covered in node-test-* jobs |
| #8573 | ci: skip redundant native binding build for browser and remove standalone job | ✅ Merged | build-browser job, Build @rolldown/browser step (avg 184.5s) and Node Validation job, Build @rolldown/browser step (avg 114.0s) | Removes standalone build-browser job entirely (~4.23m avg); skips ~60-90s native binding build in browser package scripts |
| #8570 | ci: parallelize Node tests on ubuntu, single Node 24 on macOS/windows | ✅ Merged | node-dev-server-test and node-test: sequential Node 20/22/24 runs parallelized via matrix; reduced matrix on macOS/windows to Node 24 only | ~4min wall-clock savings; 46% reduction in billable minutes (79 vs 146 for node jobs) |
| #8574 | ci: use Dev Drive for Windows CI jobs | ✅ Merged | cargo-test (windows) (avg 4.98m, worst 6.23m) and build-rolldown-native (windows) (avg 2.83m, worst 3.88m): ReFS Dev Drive bypasses Windows Defender minifilter for CARGO_HOME/RUSTUP_HOME | ~33% faster cargo-test-windows, ~20% faster build-rolldown-windows |
| #8577 | ci: remove unnecessary submodule checkouts | ✅ Merged | Checkout step in reusable-native-build, reusable-node-dev-server-test, ci.yml (type-check), reusable-wasi-build, reusable-wasi-test, vite-tests: removed submodules: true from workflows that don't use rollup/ or test262/ | Saves ~10-20s per job by skipping submodule clone |
| #8578 | ci: optimize cache keys to fix race conditions and reduce usage | ✅ Merged | Cache contention across all debug jobs sharing debug-build key: assign unique keys per job type (lint, cargo-test, native-build), remove cache-warmup.yml, disable release build caching | Estimated cache usage ~6.4 GB (down from ~9.9 GB); eliminates cache race conditions |
| #8580 | ci: remove WASI build & test pipeline | ✅ Merged | build-rolldown-wasi (avg 4.01m) and wasi-test matrix (3 OS): removed entire WASI CI pipeline | Removes ~4min from critical path; saves 4 jobs' worth of billable minutes |
| #8586 | ci: move Windows cargo target dir to Dev Drive | ✅ Merged | cargo-test (windows) Build step (252s cold) and build-rolldown-native (windows): moves target/ to Dev Drive via CARGO_TARGET_DIR | Cold build: 252s → ~150-170s; compilation I/O on fast ReFS instead of slow C: drive |
| #8600 | ci: skip Windows CI jobs on PRs | 🔄 Open | cargo-test (windows), build-rolldown-windows, node-test-windows, node-dev-server-test-windows: all skipped on PRs, still run on main | Removes Windows from PR critical path (~9.3m cold / ~6m warm); saves ~4 jobs of billable minutes per PR |
| #8610 | perf: merge 4 integration test binaries into 1 | 🔄 Open | cargo-test Build step: merges 4 integration test binaries (~297MB) into 1 (~82MB), eliminating 3 redundant link steps | Local: -48% CPU, -17% wall; CI: -3s (modest, cached); -215MB disk per build |

Latest timing (run 22813402815 on main, 2026-03-08)

Attempt 1 (cold cache): Total wall-clock 9m16s
Attempt 2 (warm cache on Windows rerun): cargo-test (windows) 358s

cargo-test step breakdown

| OS | Build (compile) | Run Test | Total | Notes |
|---|---|---|---|---|
| Windows (cold cache) | 252s | 131s | 556s | First run, cache miss |
| Windows (warm cache) | 96s | 135s | 358s | Rerun with cache hit; 36% faster |
| Ubuntu | 144s | 91s | 294s | |
| macOS | 151s | 115s | 376s | |
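
The Build and Run Test columns do not add up to the Total column, so each job spends some time in other steps. The sketch below derives that remainder and re-checks the 36% figure using only the numbers in the table. The issue does not itemize the other steps, so `overhead` is just a label chosen here for "everything that is not Build or Run Test".

```python
# Step timings in seconds, copied from the cargo-test step breakdown table.
BREAKDOWN = {
    "windows-cold": {"build": 252, "run_test": 131, "total": 556},
    "windows-warm": {"build": 96, "run_test": 135, "total": 358},
    "ubuntu": {"build": 144, "run_test": 91, "total": 294},
    "macos": {"build": 151, "run_test": 115, "total": 376},
}


def overhead(row):
    """Seconds of the job not attributed to the Build or Run Test steps."""
    return row["total"] - row["build"] - row["run_test"]


def percent_faster(before, after):
    """Relative reduction from `before` to `after`, rounded to a whole percent."""
    return round(100 * (before - after) / before)


for name, row in BREAKDOWN.items():
    print(f"{name}: {overhead(row)}s outside Build and Run Test")

cold = BREAKDOWN["windows-cold"]
warm = BREAKDOWN["windows-warm"]
print("job total:", percent_faster(cold["total"], warm["total"]), "% faster")
print("build step:", percent_faster(cold["build"], warm["build"]), "% faster")
```

On these figures the Windows job spends 173s (cold) and 127s (warm) outside the two measured steps, compared with 59s on Ubuntu and 110s on macOS, which suggests the non-build, non-test steps are also worth measuring.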

All jobs (attempt 1)

| Job | Duration |
|---|---|
| cargo-test (windows) | 556s (9.3m), critical path (cold cache) |
| cargo-test (macos) | 376s (6.3m) |
| cargo-test (ubuntu) | 294s (4.9m) |
| build-rolldown-windows | 266s |
| Node Validation | 252s |
| build-rolldown-macos | 225s |
| node-dev-server-test-windows | 209s |
| Rust Validation | 171s |
| build-rolldown-ubuntu | 149s |
| node-test-windows | 140s |
| node-dev-server-test-ubuntu (×3) | ~117s |
| node-dev-server-test-macos | 115s |
| node-test-ubuntu (×3) | ~97s |
| node-test-macos | 96s |
| Type Check | 45s |

Critical path (warm cache): cargo-test (windows) at 358s. With a warm cache, the Run Test step (135s) is now the dominant cost on Windows, ahead of the Build step (96s).

Critical path (cold cache): cargo-test (windows) at 556s. The Build step (252s) dominates.
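
The cold-cache critical path can be re-derived from the attempt 1 durations and the dependency edges in the merge-flow chart. This is a rough sketch under several assumptions that the issue does not state: each OS's node jobs wait only for the same-OS build, Type Check waits for the ubuntu build, and Detect Changes is treated as taking zero seconds because its duration is not listed. Spell Check is left out for the same reason.

```python
from functools import lru_cache

# Durations in seconds from the "All jobs (attempt 1)" table.
DURATION = {
    "cargo-test (windows)": 556,
    "cargo-test (macos)": 376,
    "cargo-test (ubuntu)": 294,
    "build-rolldown-windows": 266,
    "build-rolldown-macos": 225,
    "build-rolldown-ubuntu": 149,
    "Node Validation": 252,
    "Rust Validation": 171,
    "node-dev-server-test-windows": 209,
    "node-dev-server-test-macos": 115,
    "node-dev-server-test-ubuntu": 117,
    "node-test-windows": 140,
    "node-test-macos": 96,
    "node-test-ubuntu": 97,
    "Type Check": 45,
}

# Assumed per-OS dependencies; the chart only shows them at the matrix level.
NEEDS = {
    "node-dev-server-test-windows": ["build-rolldown-windows"],
    "node-dev-server-test-macos": ["build-rolldown-macos"],
    "node-dev-server-test-ubuntu": ["build-rolldown-ubuntu"],
    "node-test-windows": ["build-rolldown-windows"],
    "node-test-macos": ["build-rolldown-macos"],
    "node-test-ubuntu": ["build-rolldown-ubuntu"],
    "Type Check": ["build-rolldown-ubuntu"],  # assumption: which build it waits on
}


@lru_cache(maxsize=None)
def finish_time(job):
    """Earliest finish: the job starts once all of its dependencies are done."""
    start = max((finish_time(dep) for dep in NEEDS.get(job, [])), default=0)
    return start + DURATION[job]


critical_job = max(DURATION, key=finish_time)
print(critical_job, finish_time(critical_job))
```

Under these assumptions the longest dependent chain, build-rolldown-windows followed by node-dev-server-test-windows, finishes at 475s, so the standalone cargo-test (windows) job at 556s still sets the wall-clock time.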

Next bottleneck: cargo-test parallelization with cargo-nextest

cargo test runs ~2000 tests but parallelizes poorly: it executes the test binaries one after another, so parallelism is limited to the threads within each binary. cargo-nextest runs each test in its own process and schedules tests from all binaries in parallel.

Expected Run Test savings:

  • Windows: 135s → ~45-65s (~70-90s saved)
  • macOS: 115s → ~40-55s
  • Ubuntu: 91s → ~30-45s
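
To see what these estimates would mean for whole-job time, the sketch below subtracts the expected Run Test savings from the measured job totals. It assumes that only the Run Test step changes and that installing cargo-nextest adds no time, neither of which has been measured here. The Windows total is the warm-cache rerun (358s); the macOS and Ubuntu totals are from attempt 1.

```python
# Current Run Test step durations in seconds (from the step breakdown table).
RUN_TEST_NOW = {"windows": 135, "macos": 115, "ubuntu": 91}

# Expected Run Test range with cargo-nextest (low, high), from the list above.
RUN_TEST_NEXTEST = {"windows": (45, 65), "macos": (40, 55), "ubuntu": (30, 45)}

# Measured cargo-test job totals in seconds.
JOB_TOTAL = {"windows": 358, "macos": 376, "ubuntu": 294}


def projected_total(os_name):
    """Return (best, worst) projected job total if only Run Test gets faster."""
    low, high = RUN_TEST_NEXTEST[os_name]
    now = RUN_TEST_NOW[os_name]
    best = JOB_TOTAL[os_name] - (now - low)
    worst = JOB_TOTAL[os_name] - (now - high)
    return best, worst


for os_name in JOB_TOTAL:
    best, worst = projected_total(os_name)
    print(f"{os_name}: {JOB_TOTAL[os_name]}s -> {best}-{worst}s")
```

With these inputs the projection is 268-288s for Windows, 301-316s for macOS, and 233-248s for Ubuntu, so macOS would become the longest cargo-test job when the Windows cache is warm.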
