test(e2e): run tests in parallel #9563

Merged
jdx merged 3 commits into main from codex/parallel-e2e-tests
May 3, 2026

Conversation

@jdx (Owner) commented May 3, 2026

Summary

  • Run e2e tests with bounded parallelism via E2E_JOBS, capped at 4 jobs by default.
  • Capture each test's combined output to a per-test log and print logs in test-list order, so terminal and GitHub step-summary output stay serial.
  • Keep empty-tranche/empty-PIDS cleanup compatible with bash 3.2 under set -u, and avoid marking a child failed if its status file appears during the liveness check.
  • Treat empty status reads as failures explicitly instead of relying on bash string comparison behavior.
  • Keep step-summary rows in original filtered test order, including skipped slow tests.
  • Move shared e2e server port/ready files and debug logs into each test's isolated temp directory, and replace fixed local HTTP ports with OS-assigned ports where practical.
  • Let wait_for_file fail fast when a watched server PID exits before writing its port file, with cleanup traps registered before fallible waits.
  • Pass HTTP server port-file env vars explicitly in task/netrc tests instead of relying on run_test path coupling.
  • Add the missing test:e2e task entry to call e2e/run_all_tests after building.
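
To illustrate the scheduling idea in the first two bullets (bounded parallelism with output replayed in test-list order), here is a minimal Python sketch. It is not the bash implementation in e2e/run_all_tests; the names `run_one` and `run_in_order` and the demo commands are hypothetical, and the default of 4 jobs only mirrors the cap mentioned above.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd):
    """Run one test command, capturing stdout and stderr as a single stream."""
    proc = subprocess.run(
        cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    return proc.returncode, proc.stdout


def run_in_order(commands, jobs=4):
    """Run commands with at most `jobs` in flight; return results in input order."""
    with ThreadPoolExecutor(max_workers=jobs) as pool:
        # Executor.map yields results in submission order, not completion order,
        # so a slow early test never lets a later test's output jump ahead.
        return list(pool.map(run_one, commands))


if __name__ == "__main__":
    demo = [
        [sys.executable, "-c", "import time; time.sleep(0.2); print('slow')"],
        [sys.executable, "-c", "print('fast')"],
    ]
    for status, output in run_in_order(demo, jobs=2):
        print(status, output.strip())
```

Buffering each test's output and replaying it in list order is what keeps the terminal and step-summary output looking serial even though the tests overlap in time.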

Validation

  • mise run build
  • bash -n on changed shell scripts
  • PYTHONDONTWRITEBYTECODE=1 python3 -m py_compile on changed helper scripts
  • shellcheck -x e2e/run_all_tests e2e/assert.sh e2e/tasks/test_task_remote_http e2e/tasks/test_task_standalone e2e/generate/test_generate_tool_stub e2e/backend/test_http_compressed_binaries e2e/backend/test_bun_cross_device_install e2e/config/test_netrc
  • shfmt -d --apply-ignore e2e/run_all_tests e2e/assert.sh e2e/tasks/test_task_remote_http e2e/tasks/test_task_standalone e2e/generate/test_generate_tool_stub e2e/backend/test_http_compressed_binaries e2e/backend/test_bun_cross_device_install e2e/config/test_netrc
  • Concurrent focused runs of remote git task tests
  • Concurrent focused runs of HTTP/tool-stub tests
  • e2e/run_test tasks/test_task_remote_http
  • e2e/run_test tasks/test_task_standalone
  • e2e/run_test generate/test_generate_tool_stub
  • e2e/run_test config/test_netrc
  • TEST_TRANCHE_COUNT=100000 TEST_TRANCHE=99999 e2e/run_all_tests
  • GITHUB_STEP_SUMMARY=<tmp> E2E_JOBS=2 TEST_TRANCHE_COUNT=100 TEST_TRANCHE=0 e2e/run_all_tests
  • E2E_JOBS=2 TEST_TRANCHE_COUNT=100 TEST_TRANCHE=0 e2e/run_all_tests
  • git diff --check
  • Commit hook: hk lint stack, including shellcheck, shfmt, taplo, cargo fmt, cargo check, prettier, markdownlint, schema checks

This PR was generated by an AI coding assistant.


Note

Medium Risk
Changes the e2e test harness to run multiple tests concurrently and rewires how tests coordinate ephemeral ports/log files; failures could be flaky or harder to diagnose if any isolation assumptions are wrong.

Overview
E2E tests now run with bounded parallelism. e2e/run_all_tests queues eligible tests and executes them concurrently (configurable via E2E_JOBS, capped by default), capturing each test’s combined stdout/stderr to per-test log files while printing results in the original order.

Test fixtures were updated to be parallel-safe. A new wait_for_file helper replaces fixed sleeps, HTTP/git helper servers now bind to OS-assigned ports and write the chosen port/ready markers to per-test temp locations via env vars, and multiple tests stop writing debug/output artifacts to shared /tmp paths.
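
As a rough illustration of the wait_for_file behavior described above (poll for a file, but give up early if the watched server process has exited), here is a Python sketch. The real helper is a bash function in e2e/assert.sh; the signature, timeout, and polling interval below are assumptions made for the example.

```python
import os
import subprocess
import sys
import tempfile
import time


def pid_alive(pid):
    """Return True if a process with this PID exists (POSIX semantics)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks existence; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def wait_for_file(path, timeout=10.0, pid=None, interval=0.05):
    """Poll until `path` exists; fail fast if the watched `pid` has exited."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if os.path.exists(path):
            return True
        if pid is not None and not pid_alive(pid):
            # Check once more: the process may have written the file just
            # before exiting.
            return os.path.exists(path)
        time.sleep(interval)
    return os.path.exists(path)


if __name__ == "__main__":
    port_file = os.path.join(tempfile.mkdtemp(), "port")
    # Hypothetical server that exits without ever writing its port file.
    server = subprocess.Popen([sys.executable, "-c", "raise SystemExit(1)"])
    server.wait()  # reap it; an unreaped child would still count as alive
    print(wait_for_file(port_file, timeout=5.0, pid=server.pid))
```

The final existence check after the liveness test is the same precaution the summary mentions for status files: a process that finishes its write and exits between two polls should not be reported as a failure.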

Adds a missing test:e2e task in tasks.toml to run e2e/run_all_tests (after build).

Reviewed by Cursor Bugbot for commit 84b6f9d. Bugbot is set up for automated code reviews on this repo.

@gemini-code-assist (bot, Contributor) left a comment

Code Review

This pull request enables parallel execution of end-to-end tests by refactoring the test runner and individual test scripts. Key changes include implementing a process pool in e2e/run_all_tests, replacing hardcoded ports with dynamic allocation (binding to port 0), and migrating from global /tmp paths to test-specific temporary directories to prevent collisions. Review feedback highlights the need for portable mktemp usage for macOS compatibility, more robust cleanup of background processes on script interruption, and the prevention of race conditions when writing status and port files.

Comment thread e2e/run_all_tests Outdated
exit 1
fi

LOG_DIR="$(mktemp --tmpdir --directory "mise-e2e-logs.XXXXXX")"

medium

The use of GNU-specific flags --tmpdir and --directory for mktemp is not portable to BSD-based systems like macOS. Since this repository supports macOS, consider using the more portable mktemp -d instead.

LOG_DIR="$(mktemp -d)"

Comment thread e2e/run_all_tests Outdated
fi

LOG_DIR="$(mktemp --tmpdir --directory "mise-e2e-logs.XXXXXX")"
trap 'rm -rf "$LOG_DIR"' EXIT

medium

The current trap only cleans up the log directory. When running tests in parallel, it's important to ensure that all background processes are terminated if the main script is interrupted (e.g., via Ctrl+C) to avoid orphaned processes and resource leaks.

trap 'kill "${PIDS[@]}" 2>/dev/null || true; rm -rf "$LOG_DIR"' EXIT INT TERM

Comment thread e2e/run_all_tests Outdated
if [[ $test_status == 0 ]]; then
  "$ROOT/e2e/run_test" "$test_name" || test_status=$?
fi
echo "$test_status" >"$status_file"

medium

Writing the status file directly can lead to a race condition where the main loop detects the file's existence (via [[ -f $status_file ]] on line 154) before the content has been fully written or flushed. Using an atomic move ensures the main loop only sees the file once it is complete and stable.

    echo "$test_status" >"$status_file.tmp" && mv "$status_file.tmp" "$status_file"
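
The same write-then-rename pattern, sketched in Python for comparison. The function name `write_atomic` is hypothetical. The temporary file is created next to the target because a rename is only atomic within a single filesystem.

```python
import os
import tempfile


def write_atomic(path, text):
    """Write `text` to `path` so readers never observe a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the target directory so the rename below does
    # not cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        os.replace(tmp_path, path)  # atomic rename on POSIX filesystems
    except BaseException:
        os.unlink(tmp_path)  # do not leave a stray temp file behind
        raise
```

A poller that tests only for the final path then sees either no file or a complete one, which also removes the need to treat an empty read as a special case caused by timing.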

Comment thread e2e/backend/test_s3_minio_slow Outdated
Comment on lines +14 to +15
MINIO_PORT=$(find_available_port)
MINIO_CONSOLE_PORT=$(find_available_port)

medium

Calling find_available_port twice in quick succession is prone to race conditions, especially in a parallel execution environment. Since the function closes the socket before returning the port, another process (or the second call to find_available_port) could claim the same port before MinIO starts. A more robust approach would be to let the server bind to port 0 and report its assigned port, similar to the improvements made in other test scripts in this PR.
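
For reference, a minimal Python sketch of the bind-to-port-0 approach mentioned above. It is not the code in e2e/helpers/scripts/http_test_server.py; `start_server` and its `port_file` argument are hypothetical names, and MinIO itself would need its own way to report the port it was given.

```python
import http.server
import os
import threading


def start_server(port_file):
    """Start an HTTP server on an OS-assigned port and record it in `port_file`."""
    # Port 0 asks the kernel for a free port. The listening socket stays bound
    # from this point on, so no other process can claim the port between
    # selection and use (the race a probe-then-close helper is exposed to).
    server = http.server.ThreadingHTTPServer(
        ("127.0.0.1", 0), http.server.SimpleHTTPRequestHandler
    )
    port = server.server_address[1]
    tmp_path = port_file + ".tmp"
    with open(tmp_path, "w") as handle:
        handle.write(str(port))
    os.replace(tmp_path, port_file)  # publish the port only once fully written
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Two servers started this way while both are alive always receive distinct ports, which is the property that two back-to-back port probes cannot guarantee.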

Comment on lines +56 to +58
port_file = os.environ.get("MISE_TOOL_STUB_TEST_PORT_FILE")
if port_file:
    Path(port_file).write_text(str(actual_port))

medium

For consistency with other helper scripts and to ensure robustness when the provided path's parent directory doesn't exist, consider creating the parent directory before writing the port file.

        port_file = os.environ.get("MISE_TOOL_STUB_TEST_PORT_FILE")
        if port_file:
            port_file_path = Path(port_file)
            port_file_path.parent.mkdir(parents=True, exist_ok=True)
            port_file_path.write_text(str(actual_port))

@greptile-apps (bot, Contributor) commented May 3, 2026

Greptile Summary

This PR rewires the e2e test harness to run tests with bounded parallelism (E2E_JOBS, capped at 4 by default). Each test gets an isolated subshell and per-test TMPDIR, server processes bind to OS-assigned ports and write them to per-test files, a new wait_for_file helper replaces fixed sleep calls, and the scheduler prints output in original test order. The approach is sound and the isolation model is consistent across all 20 changed files.

  • The cleanup trap kills all accumulated PIDS including already-exited ones; if any PID is recycled by the OS before cleanup fires, an unrelated process could be killed (P2, suggestion provided).
  • The printed drain loop runs inside the outer while but there is no explicit final-drain pass after the loop exits; a last-cycle timing edge could silently drop output for the very last tests (P2, suggestion provided).
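
One way to sidestep the PID-recycling concern in the first bullet is to signal only children that have not been reaped yet. The Python sketch below shows the idea under that assumption; it is not the suggestion attached to the review thread, and `terminate_running` is a hypothetical name.

```python
import subprocess
import sys


def terminate_running(children, timeout=5.0):
    """Terminate only the children that have not exited, then reap them all."""
    for child in children:
        # poll() returns None while the child is still running. A child that
        # has not been reaped keeps its PID reserved, so the signal cannot
        # land on an unrelated process that reused the number.
        if child.poll() is None:
            child.terminate()
    return [child.wait(timeout=timeout) for child in children]


if __name__ == "__main__":
    finished = subprocess.Popen([sys.executable, "-c", "pass"])
    finished.wait()
    sleeper = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    print(terminate_running([finished, sleeper]))
```

In a bash runner the equivalent would be to drop a PID from the tracked list as soon as its status has been collected, so the cleanup trap never sees stale entries.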

Confidence Score: 4/5

Safe to merge; the parallelism model is well-structured with two minor edge-case issues in the scheduler worth addressing before widespread CI use.

Only P2 findings — a PID-recycling risk in cleanup and a theoretical last-cycle output-drain gap. Both are unlikely to trigger in practice but worth hardening. No P0/P1 issues found.

e2e/run_all_tests — the new parallel scheduler, specifically the cleanup trap and the printed drain loop

Important Files Changed

| Filename | Overview |
| --- | --- |
| e2e/run_all_tests | Core scheduler rewrite: bounded parallel execution, per-test log/status/summary files, ordered printing, dead-process recovery; logic appears correct with only minor stale-PID cleanup risk |
| e2e/run_test | Injects MISE_HTTP_TEST_PORT_FILE into each test's isolated env so the HTTP test server port file lands in the per-test TMPDIR |
| e2e/assert.sh | Adds wait_for_file helper that polls for a file with an optional dead-PID fast-fail; replaces the previous fixed sleep calls across tests |
| e2e/helpers/scripts/git_http_backend_server.py | Replaces hardcoded /tmp paths with env-var-configurable paths for parallel-safe isolation |
| e2e/helpers/scripts/http_test_server.py | Removes unsafe fixed-range port scanner, switches to OS-assigned port (port=0), writes port file to env-var-configurable path |
| e2e/backend/test_s3_minio_slow | Adds retry loop for MinIO startup with dynamic port selection; TOCTOU race between port probe and MinIO bind noted in previous threads |
| tasks.toml | Adds missing test:e2e task that depends on build and runs e2e/run_all_tests |

Reviews (6): Last reviewed commit: "test(e2e): address parallel runner feedb..."

Comment thread e2e/run_all_tests
Comment thread e2e/backend/test_s3_minio_slow Outdated
Comment thread e2e/run_all_tests Outdated
Comment thread e2e/run_all_tests Outdated
Comment thread e2e/run_all_tests
@github-actions (bot) commented May 3, 2026

Hyperfine Performance

mise x -- echo

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| mise-2026.4.28 x -- echo | 16.9 ± 0.4 | 15.9 | 19.4 | 1.00 |
| mise x -- echo | 17.3 ± 0.4 | 16.2 | 19.3 | 1.02 ± 0.03 |

mise env

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| mise-2026.4.28 env | 16.5 ± 0.4 | 15.7 | 19.8 | 1.00 |
| mise env | 17.2 ± 0.6 | 16.2 | 20.9 | 1.04 ± 0.04 |

mise hook-env

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| mise-2026.4.28 hook-env | 17.1 ± 0.4 | 15.9 | 22.6 | 1.00 |
| mise hook-env | 17.2 ± 0.4 | 16.3 | 18.7 | 1.01 ± 0.03 |

mise ls

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- |
| mise-2026.4.28 ls | 15.5 ± 0.4 | 14.6 | 16.7 | 1.00 |
| mise ls | 16.8 ± 0.6 | 15.1 | 18.4 | 1.08 ± 0.05 |

xtasks/test/perf

| Command | mise-2026.4.28 | mise | Variance |
| --- | --- | --- | --- |
| install (cached) | 114ms | 119ms | -4% |
| ls (cached) | 65ms | 66ms | -1% |
| bin-paths (cached) | 68ms | 68ms | +0% |
| task-ls (cached) | 706ms | 703ms | +0% |

@jdx jdx force-pushed the codex/parallel-e2e-tests branch from 84b6f9d to b4442db Compare May 3, 2026 14:02
Comment thread e2e/run_all_tests
@jdx jdx force-pushed the codex/parallel-e2e-tests branch from b4442db to 94ae0d9 Compare May 3, 2026 14:21
@cursor (bot) left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 94ae0d9.

Comment thread e2e/config/test_netrc
@jdx jdx force-pushed the codex/parallel-e2e-tests branch from 94ae0d9 to 02a5ff8 Compare May 3, 2026 14:48
@jdx jdx merged commit d2e25fe into main May 3, 2026
38 checks passed
@jdx jdx deleted the codex/parallel-e2e-tests branch May 3, 2026 15:03
jdx added a commit that referenced this pull request May 3, 2026
…mary (#9570)

## Summary
- **Fixes Swift e2e ENOSPC failure.** `e2e/run_test`'s Docker invocation
mounted the container's `/tmp` and `/root` via `--tmpfs`, which is
RAM-backed and capped at ~half of RAM. Multi-GB tool extractions (e.g.
swift) blew past that limit (`No space left on device (os error 28)`).
Replaced with disk-backed bind mounts under `RUNNER_TEMP`, mirroring the
pattern already used in `.github/workflows/registry.yml`.
- **Surfaces failed e2e tests in the GitHub Actions UI.** With parallel
e2e ([#9563](#9563)), per-test
`::group::` blocks are still replayed in order but the runner emitted no
aggregate failure list, making it tedious to find which test failed in
the GitHub Actions log. `e2e/run_all_tests` now tracks failed tests
during the in-order replay and emits:
  - a stderr block at the bottom listing every failed test
  - an `::error title=E2E failures (N)::test1,test2,...` annotation that
    surfaces in the workflow's annotations panel
  - a `### :x: Failed e2e tests (N)` section in the GHA step summary
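
A rough Python sketch of the three outputs listed above, for readers unfamiliar with GitHub Actions workflow commands. The actual logic lives in the bash script `e2e/run_all_tests`; the function name `report_failures` and the exact wording of the stderr block are assumptions, while `GITHUB_ACTIONS` and `GITHUB_STEP_SUMMARY` are the standard variables set by the Actions runner.

```python
import os
import sys


def report_failures(failed, summary_path=None):
    """Emit an aggregate failure list, an annotation, and a step-summary section."""
    if not failed:
        return
    # 1. Plain stderr block so the list is visible at the end of any log.
    print("Failed e2e tests:", file=sys.stderr)
    for name in failed:
        print(f"  {name}", file=sys.stderr)
    # 2. Workflow command, only meaningful when running under GitHub Actions.
    if os.environ.get("GITHUB_ACTIONS") == "true":
        print(f"::error title=E2E failures ({len(failed)})::{','.join(failed)}")
    # 3. Markdown section appended to the step summary file, if one is set.
    summary_path = summary_path or os.environ.get("GITHUB_STEP_SUMMARY")
    if summary_path:
        with open(summary_path, "a") as handle:
            handle.write(f"\n### :x: Failed e2e tests ({len(failed)})\n\n")
            handle.writelines(f"- `{name}`\n" for name in failed)
```

Appending to the summary file, instead of overwriting it, keeps the per-test rows written earlier in the run.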

## Test plan
- [x] `bash -n` and `shellcheck -x` clean on `e2e/run_test` and
`e2e/run_all_tests`
- [x] `mise run lint-fix` clean
- [x] Smoke test of the failure-summary tail (verified stderr,
`::error::` line, and step-summary content shape)
- [ ] Confirm the e2e tranche that previously OOMed on swift now passes
- [ ] Confirm a deliberately-failing test surfaces in the annotations
panel and step summary

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Medium risk because it changes how e2e tests run in Docker (/tmp and
/root mounts) and alters CI reporting; failures could show up as new CI
flakiness or permission/cleanup issues rather than product behavior.
> 
> **Overview**
> Improves e2e CI ergonomics and reliability.
> 
> `e2e/run_test` replaces Docker `--tmpfs` usage for `/tmp` and `/root`
with disk-backed bind mounts (created under
`${MISE_E2E_HOST_TMP:-${RUNNER_TEMP:-${TMPDIR:-/tmp}}}`) to avoid ENOSPC
on large tool installs, and ensures those temp dirs are cleaned up while
preserving the container exit code.
> 
> `e2e/run_all_tests` now tracks which parallel tests failed and, at the
end of the run, emits an aggregated failure list to stderr and (on
GitHub Actions) a `::error` annotation plus a “Failed e2e tests” section
appended to the step summary.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
c5ce6ec. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
mise-en-dev added a commit that referenced this pull request May 3, 2026
### 🚀 Features

- **(conda)** graduate conda backend out of experimental by @jdx in
[#9544](#9544)
- **(deps)** Add dart and flutter providers by @tjarvstrand in
[#9505](#9505)
- **(registry)** add neo4j by @mnm364 in
[#9525](#9525)
- **(registry)** add rustfs by @mnm364 in
[#9530](#9530)
- **(task)** support exclusion patterns in task sources by
@jlarmstrongiv in [#9496](#9496)
- **(vfox)** add stat function to lua file module by @esteve in
[#9497](#9497)

### 🐛 Bug Fixes

- **(backend)** flag regex prerelease versions by @jdx in
[#9500](#9500)
- **(backend)** mark -nightly/-canary/-experimental as prereleases by
@jdx in [#9523](#9523)
- **(backend)** suppress no-versions warning for unresolved-latest
backends by @jdx in [#9548](#9548)
- **(backend)** include dotnet prereleases from package flags by @jdx in
[#9551](#9551)
- **(backend)** scope PEP 440 prerelease detection to Python backends by
@jdx in [#9558](#9558)
- **(cargo)** Apply install_env during cargo install by @c22 in
[#9502](#9502)
- **(copr)** drop epel-9 chroots since rust >= 1.91 is unavailable by
@jdx in [#9484](#9484)
- **(github)** skip attestations on non-default api_url by @jdx in
[#9486](#9486)
- **(github)** retry ip allow list errors without auth by @risu729 in
[#9506](#9506)
- **(http)** update versions host tracking endpoint by @jdx in
[#9527](#9527)
- **(install)** don't warn for configured tools when version is passed
via CLI by @jdx in [#9522](#9522)
- **(install)** refresh latest before installing missing tools by @jdx
in [#9545](#9545)
- **(install)** don't cache nonexistent install paths by @jdx in
[#9553](#9553)
- **(lockfile)** don't propagate ad-hoc CLI overrides into the project
lockfile by @jdx in [#9562](#9562)
- **(plugin)** detect plugin types after cloning by @risu729 in
[#9540](#9540)
- **(release)** pass --no-git-checks to aube publish by @jdx in
[#9483](#9483)
- **(task)** convert PATH to MSYS Unix form when spawning POSIX shells
on Windows by @JamBalaya56562 in
[#9547](#9547)

### 📚 Documentation

- **(contributing)** require popularity check for registry PRs by @jdx
in
[7bbeebe](7bbeebe)
- **(watch)** update pitchfork domain to en.dev by @risu729 in
[#9536](#9536)
- document ghtkn GitHub token setup by @jdx in
[#9546](#9546)
- clarify registry backend acceptance policy by @jdx in
[#9543](#9543)
- Change exec command to use bash for variable echo by @kuboon in
[#9567](#9567)

### 🧪 Testing

- **(e2e)** run test-tool targets in parallel by @jdx in
[#9564](#9564)
- **(e2e)** run tests in parallel by @jdx in
[#9563](#9563)
- **(e2e)** bind-mount /tmp on disk and surface failed tests in CI
summary by @jdx in [#9570](#9570)
- **(tasks)** migrate test_task_help atask to usage field by @jdx in
[#9549](#9549)

### 📦️ Dependency Updates

- update fedora:45 docker digest to 8b838b3 by @renovate[bot] in
[#9507](#9507)
- update ghcr.io/jdx/mise:deb docker digest to f02194c by @renovate[bot]
in [#9509](#9509)
- update taiki-e/install-action digest to 7769b73 by @renovate[bot] in
[#9512](#9512)
- update ghcr.io/jdx/mise:alpine docker digest to 581f8a8 by
@renovate[bot] in [#9508](#9508)
- update rust crate ctor to v0.10.1 by @renovate[bot] in
[#9515](#9515)
- update ghcr.io/jdx/mise:rpm docker digest to a5c9655 by @renovate[bot]
in [#9510](#9510)
- update rust docker digest to a9cfb75 by @renovate[bot] in
[#9511](#9511)
- update rust crate age to v0.11.3 by @renovate[bot] in
[#9514](#9514)
- update rust crate jiff to v0.2.24 by @renovate[bot] in
[#9516](#9516)
- update dependency vitepress-plugin-tabs to ^0.9.0 by @renovate[bot] in
[#9518](#9518)
- update autofix-ci/action action to v1.3.4 by @renovate[bot] in
[#9513](#9513)
- update rust crate usage-lib to v3.2.1 by @renovate[bot] in
[#9517](#9517)
- update apple-actions/import-codesign-certs action to v7 by
@renovate[bot] in [#9519](#9519)
- update taiki-e/install-action digest to 51cd0b8 by @renovate[bot] in
[#9531](#9531)
- exclude taiki-e/install-action from renovate by @jdx in
[#9532](#9532)
- update rust crate blake3 to v1.8.5 by @renovate[bot] in
[#9533](#9533)

### 📦 Registry

- enable shellcheck on windows by @zeitlinger in
[#9487](#9487)
- add google-java-format by @zeitlinger in
[#9488](#9488)
- add expert
([aqua:expert-lsp/expert](https://github.com/expert-lsp/expert)) by
@AlternateRT in [#9498](#9498)
- update entry for checkmake by @eread in
[#9504](#9504)
- add systemctl-tui
([aqua:rgwood/systemctl-tui](https://github.com/rgwood/systemctl-tui))
by @2xdevv in [#9521](#9521)
- add codon by @3w36zj6 in
[#9538](#9538)
- add tool yr (backend:github:VirusTotal/yara-x) by @adam-moss in
[#9542](#9542)
- add tool betterleaks (backend:aqua/betterleaks/betterleaks) by
@adam-moss in [#9541](#9541)
- add `git-filter-repo` by @garysassano in
[#9550](#9550)
- add umoci
([aqua:opencontainers/umoci](https://github.com/opencontainers/umoci))
by @2xdevv in [#9555](#9555)
- add aqua backend for elixir-ls by @AlternateRT in
[#9557](#9557)
- deny inline backend options by @risu729 in
[#9565](#9565)

### Chore

- **(ci)** fail registry tests without summary by @jdx in
[#9559](#9559)
- **(ci)** use !cancelled() instead of always() for test-ci aggregator
by @jdx in [#9569](#9569)
- **(ci)** use namespace runners for ci jobs by @jdx in
[#9561](#9561)
- **(config)** deprecate shorthands_file setting by @risu729 in
[#9534](#9534)
- **(docs)** remove shrill.en.dev analytics script by @jdx in
[#9539](#9539)
- **(release)** replace bc with awk in release-plz star formatting by
@jdx in
[d7f177f](d7f177f)
- bump hk to 1.44.3 by @jdx in
[#9493](#9493)
- invert CLAUDE.md/AGENTS.md so AGENTS.md is canonical by @jdx in
[#9560](#9560)
- set dev profile debug to 1 by @jdx in
[#9572](#9572)

### New Contributors

- @kuboon made their first contribution in
[#9567](#9567)
- @AlternateRT made their first contribution in
[#9557](#9557)
- @2xdevv made their first contribution in
[#9555](#9555)
- @adam-moss made their first contribution in
[#9541](#9541)
- @jlarmstrongiv made their first contribution in
[#9496](#9496)
- @tjarvstrand made their first contribution in
[#9505](#9505)