fix(tests): stabilize Windows CI cases by Takhoffman · Pull Request #48544 · openclaw/openclaw

Takhoffman · 2026-03-16T22:59:51Z

Summary

Problem: Windows CI was failing in isolated-agent tests and in plugin tests with Windows-specific temp-path formatting.
Why it matters: main stayed red on the Windows shards, masking real regressions and slowing follow-up work.
What changed: the cron subagent follow-up helper now resolves fast-test timing from the current env at call time, and the affected tests now compare canonicalized paths / Windows home env more robustly.
What did NOT change (scope boundary): no production runtime behavior, provider auth behavior, or plugin loading behavior changed outside test handling.

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

Related Plugins: remove public extension-api surface #48462

User-visible / Behavior Changes

None.

Security Impact (required)

New permissions/capabilities? No
Secrets/tokens handling changed? No
New/changed network calls? No
Command/tool execution surface changed? No
Data access scope changed? No
If any Yes, explain risk + mitigation:

Repro + Verification

Environment

OS: macOS local verification; failures originally observed on GitHub Actions Windows runners
Runtime/container: Node 24 / Vitest
Model/provider: N/A
Integration/channel (if any): N/A
Relevant config (redacted): temp test homes only

Steps

Run the targeted isolated-agent tests.
Run the targeted plugin/provider auth tests.
Confirm the updated assertions and timing behavior pass.

Expected

Targeted Windows CI tests complete without timeout and without path-format assertion failures.

Actual

Targeted tests pass locally after the fix.

Evidence

Failing test/log before + passing after
Trace/log snippets
Screenshot/recording
Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

Verified scenarios:
- pnpm vitest run src/cron/isolated-agent.auth-profile-propagation.test.ts src/cron/isolated-agent.subagent-model.test.ts
- pnpm vitest run src/plugins/bundle-mcp.test.ts src/plugins/marketplace.test.ts src/infra/provider-usage.auth.normalizes-keys.test.ts
Edge cases checked:
- Windows short-path (RUNNER~1) vs canonical path handling in assertions
- fast-test timing read after env setup instead of module import time
What you did not verify:
- full Windows shard run locally

Review Conversations

I replied to or resolved every bot review conversation I addressed in this PR.
I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

Backward compatible? Yes
Config/env changes? No
Migration needed? No
If yes, exact upgrade steps:

Failure Recovery (if this breaks)

How to disable/revert this change quickly: revert commit ce67a582b5
Files/config to restore: the four touched test/helper files only
Known bad symptoms reviewers should watch for: isolated-agent Windows tests timing out again or plugin test assertions depending on raw Windows temp-path spelling

Risks and Mitigations

Risk: another unrelated Windows flake still exists outside these targeted cases.
- Mitigation: scope is narrow and directly tied to the failing tests pulled from the last main Windows CI run.

greptile-apps · 2026-03-16T23:04:34Z

Greptile Summary

This PR stabilises Windows CI by making three targeted categories of fixes: (1) moving OPENCLAW_TEST_FAST env reads from module-load time to call time in subagent-followup.ts so tests that set the flag after module import see the right timings; (2) normalising Windows path separators and HOMEDRIVE/HOMEPATH env vars in four test files that were asserting on raw path strings; and (3) converting installGaxiosFetchCompat from synchronous to async with a dynamic-import loader so gaxios can be resolved without an eager static import that complicates test isolation.

subagent-followup.ts – lazy resolveCronSubagentTimings() call at the top of waitForDescendantSubagentSummary replaces module-level constants.
gaxios-fetch-compat.ts – async dynamic gaxios loader with a __OPENCLAW_TEST_GAXIOS_CONSTRUCTOR__ override hook and a "shimmed" fallback state when gaxios is absent. Concurrent callers during the "installing" phase return early (resolved) rather than coalescing on the in-flight Promise — currently safe given sequential startup callers, but worth noting.
Test path normalisation – fs.realpath comparisons in bundle-mcp.test.ts, replaceAll("\\","/") in marketplace.test.ts, and HOMEDRIVE/HOMEPATH population in provider-usage.auth handle Windows short-path spellings.
wizard.contract.test.ts – vi.hoisted() ensures OPENCLAW_TEST_FAST=1 is set before module-level code runs.
New openclaw-launcher.e2e.test.ts – covers transitive-import-error surfacing vs. the friendly "missing dist" message.
ci.yml – Python heredoc replaces the inline if/else for base-SHA resolution (pre-existing indentation note already filed in a prior thread).

Confidence Score: 4/5

Safe to merge — all changes are confined to test helpers and CI infra with no production runtime behaviour changes.
The scope is tightly bounded to test stabilisation. The gaxios-fetch-compat.ts async refactor is the most complex change; it is logically correct and all existing callers properly await it. The only non-critical concern is that concurrent callers arriving while installState === "installing" resolve immediately rather than waiting for the install to finish, but current usage is purely sequential so this doesn't affect real-world behaviour.
src/infra/gaxios-fetch-compat.ts — review the concurrent-caller short-circuit if the function is ever called from parallel async paths in the future.

Prompt To Fix All With AI

This is a comment left during a code review.
Path: src/infra/gaxios-fetch-compat.ts
Line: 268-271

Comment:
**Concurrent callers during `"installing"` resolve immediately**

When `installState === "installing"`, any concurrent `await installGaxiosFetchCompat()` call short-circuits and returns a resolved `Promise<void>` before installation actually finishes. Code that follows the `await` in those concurrent callers may run against an unpatched gaxios prototype.

This is benign for the current sequential startup callers (`entry.ts`, `index.ts`), but the guard silently drops the wait contract for any future caller that does:

```ts
await installGaxiosFetchCompat();
// assumes gaxios is patched here – not guaranteed if a concurrent install is in flight
const result = await new Gaxios().request(...);
```

A simple future-proof alternative is to cache the in-flight Promise so all concurrent callers coalesce onto it:

```ts
let installPromise: Promise<void> | null = null;

export function installGaxiosFetchCompat(): Promise<void> {
  if (installState === "installed" || installState === "shimmed") return Promise.resolve();
  if (typeof globalThis.fetch !== "function") return Promise.resolve();
  installPromise ??= (async () => {
    installState = "installing";
    try {
      // ... existing body
    } catch (err) {
      installState = "not-installed";
      installPromise = null;
      throw err;
    }
  })();
  return installPromise;
}
```

Not blocking the PR given current callers are sequential, but worth addressing before this module is used in any concurrent context.

How can I resolve this? If you propose a fix, please make it concise.

_{Last reviewed commit: 81ef12c}

greptile-apps · 2026-03-16T23:04:38Z

.github/workflows/ci.yml

+            import json
+            import os
+
+            with open(os.environ["GITHUB_EVENT_PATH"], "r", encoding="utf-8") as fh:
+                event = json.load(fh)
+
+            if os.environ["GITHUB_EVENT_NAME"] == "push":
+                print(event["before"])
+            else:
+                print(event["pull_request"]["base"]["sha"])
+            PY
+          )"



Heredoc terminator will never be matched

The Python code and its heredoc terminator (PY) sit at 12 spaces of YAML indentation, while the surrounding shell body (e.g. set -euo pipefail, BASE="$(") sits at 10 spaces. Because run: | is a YAML literal block scalar that strips those 10 spaces of common indentation, the shell actually receives the terminator as PY (2 leading spaces), not PY at column 0.

For a <<'PY' heredoc (no - modifier), bash only recognises a terminator that is exactly at column 0 with no surrounding whitespace. PY is never matched, so bash reads through the rest of the script looking for the terminator and eventually hits EOF, producing an "unexpected end of file" error that breaks the entire secrets job.

There are two problems that need to be fixed:

The terminator indentation – the PY line (and the whole Python body) must be placed at the same indentation level as the surrounding bash code (10 spaces in the raw YAML) so that after YAML stripping they appear at column 0.

Python top-level indentation – after YAML stripping the Python lines all start with 2 spaces, which Python 3 rejects with IndentationError: unexpected indent on the very first import statement.

The existing iOS job (lines ~725-798 in the same file) uses the correct pattern: the Python body and its PY terminator are at the same YAML indentation as the surrounding shell code, not at a deeper level.

A corrected version of this block:

BASE="$( python - <<'PY' import json import os with open(os.environ["GITHUB_EVENT_PATH"], "r", encoding="utf-8") as fh: event = json.load(fh) if os.environ["GITHUB_EVENT_NAME"] == "push": print(event["before"]) else: print(event["pull_request"]["base"]["sha"]) PY )"

(The Python source lines and the PY terminator must be flush against the left margin of the YAML run: block — i.e., at the same raw indentation as set -euo pipefail.)

Prompt To Fix With AI

This is a comment left during a code review. Path: .github/workflows/ci.yml Line: 463-475 Comment: **Heredoc terminator will never be matched** The Python code and its heredoc terminator (`PY`) sit at 12 spaces of YAML indentation, while the surrounding shell body (e.g. `set -euo pipefail`, `BASE="$("`) sits at 10 spaces. Because `run: |` is a YAML literal block scalar that strips those 10 spaces of common indentation, the shell actually receives the terminator as ` PY` (2 leading spaces), not `PY` at column 0. For a `<<'PY'` heredoc (no `-` modifier), bash only recognises a terminator that is **exactly** at column 0 with no surrounding whitespace. ` PY` is never matched, so bash reads through the rest of the script looking for the terminator and eventually hits EOF, producing an "unexpected end of file" error that breaks the entire `secrets` job. There are two problems that need to be fixed: 1. **The terminator indentation** – the `PY` line (and the whole Python body) must be placed at the same indentation level as the surrounding bash code (10 spaces in the raw YAML) so that after YAML stripping they appear at column 0. 2. **Python top-level indentation** – after YAML stripping the Python lines all start with 2 spaces, which Python 3 rejects with `IndentationError: unexpected indent` on the very first `import` statement. The existing iOS job (lines ~725-798 in the same file) uses the correct pattern: the Python body and its `PY` terminator are at the **same** YAML indentation as the surrounding shell code, not at a deeper level. A corrected version of this block: ``` BASE="$( python - <<'PY' import json import os with open(os.environ["GITHUB_EVENT_PATH"], "r", encoding="utf-8") as fh: event = json.load(fh) if os.environ["GITHUB_EVENT_NAME"] == "push": print(event["before"]) else: print(event["pull_request"]["base"]["sha"]) PY )" ``` (The Python source lines and the `PY` terminator must be flush against the left margin of the YAML `run:` block — i.e., at the same raw indentation as `set -euo pipefail`.) How can I resolve this? If you propose a fix, please make it concise.

greptile-apps · 2026-03-17T01:32:52Z

src/infra/gaxios-fetch-compat.ts

+export async function installGaxiosFetchCompat(): Promise<void> {
+  if (installState !== "not-installed" || typeof globalThis.fetch !== "function") {
    return;
  }


Concurrent callers during "installing" resolve immediately

When installState === "installing", any concurrent await installGaxiosFetchCompat() call short-circuits and returns a resolved Promise<void> before installation actually finishes. Code that follows the await in those concurrent callers may run against an unpatched gaxios prototype.

This is benign for the current sequential startup callers (entry.ts, index.ts), but the guard silently drops the wait contract for any future caller that does:

await installGaxiosFetchCompat(); // assumes gaxios is patched here – not guaranteed if a concurrent install is in flight const result = await new Gaxios().request(...);

A simple future-proof alternative is to cache the in-flight Promise so all concurrent callers coalesce onto it:

let installPromise: Promise<void> | null = null; export function installGaxiosFetchCompat(): Promise<void> { if (installState === "installed" || installState === "shimmed") return Promise.resolve(); if (typeof globalThis.fetch !== "function") return Promise.resolve(); installPromise ??= (async () => { installState = "installing"; try { // ... existing body } catch (err) { installState = "not-installed"; installPromise = null; throw err; } })(); return installPromise; }

Not blocking the PR given current callers are sequential, but worth addressing before this module is used in any concurrent context.

Prompt To Fix With AI

This is a comment left during a code review. Path: src/infra/gaxios-fetch-compat.ts Line: 268-271 Comment: **Concurrent callers during `"installing"` resolve immediately** When `installState === "installing"`, any concurrent `await installGaxiosFetchCompat()` call short-circuits and returns a resolved `Promise<void>` before installation actually finishes. Code that follows the `await` in those concurrent callers may run against an unpatched gaxios prototype. This is benign for the current sequential startup callers (`entry.ts`, `index.ts`), but the guard silently drops the wait contract for any future caller that does: ```ts await installGaxiosFetchCompat(); // assumes gaxios is patched here – not guaranteed if a concurrent install is in flight const result = await new Gaxios().request(...); ``` A simple future-proof alternative is to cache the in-flight Promise so all concurrent callers coalesce onto it: ```ts let installPromise: Promise<void> | null = null; export function installGaxiosFetchCompat(): Promise<void> { if (installState === "installed" || installState === "shimmed") return Promise.resolve(); if (typeof globalThis.fetch !== "function") return Promise.resolve(); installPromise ??= (async () => { installState = "installing"; try { // ... existing body } catch (err) { installState = "not-installed"; installPromise = null; throw err; } })(); return installPromise; } ``` Not blocking the PR given current callers are sequential, but worth addressing before this module is used in any concurrent context. How can I resolve this? If you propose a fix, please make it concise.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 81ef12c609

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-17T01:34:54Z

src/infra/gaxios-fetch-compat.ts

+  if (installState !== "not-installed" || typeof globalThis.fetch !== "function") {
    return;


Await in-progress gaxios compat install

installGaxiosFetchCompat now returns immediately for any state other than not-installed, including installing. Because the function sets installState = "installing" before awaiting loadGaxiosConstructor(), a second concurrent caller can await installGaxiosFetchCompat() and continue before the adapter patch is applied, so its first gaxios calls may run with unpatched behavior (no compat fetch/proxy dispatcher translation). Keep an in-flight install promise and await it when state is installing instead of returning early.

Useful? React with 👍 / 👎.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 25a1f49c9c

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

chatgpt-codex-connector · 2026-03-17T01:55:32Z

src/plugins/contracts/wizard.contract.test.ts

+    providers = resolvePluginProviders({
+      config,
+      env: process.env,
+    });


Keep contract expectations independent from plugin resolver

This test now builds both the expected contract set and the wizard/model-picker outputs from resolvePluginProviders, so a regression in provider resolution (for example, a bundled provider being dropped by loader/config changes) would make both sides shrink together and still pass. In this file the stated goal is to assert coverage of every registered provider contract, so deriving expectations from the same resolver path removes that protection and can let real wizard coverage regressions ship undetected.

Useful? React with 👍 / 👎.

openclaw-barnacle · 2026-03-20T19:29:25Z

Please don't make PRs for test failures on main.

The team is aware of those and will handle them directly on the codebase, not only fixing the tests but also investigating what the root cause is. Having to sift through test-fix-PRs (including some that have been out of date for weeks...) on top of that doesn't help. There are already way too many PRs for humans to manage; please don't make the flood worse.

Thank you.

openclaw-barnacle bot added scripts Repository scripts size: M maintainer Maintainer-authored PR labels Mar 16, 2026

Takhoffman marked this pull request as draft March 16, 2026 23:02

greptile-apps bot reviewed Mar 16, 2026

View reviewed changes

Takhoffman changed the title ~~Codex/windows ci fix~~ fix(tests): stabilize Windows CI cases Mar 17, 2026

Takhoffman marked this pull request as ready for review March 17, 2026 01:28

greptile-apps bot reviewed Mar 17, 2026

View reviewed changes

chatgpt-codex-connector bot reviewed Mar 17, 2026

View reviewed changes

fix(tests): tighten windows ci follow-ups

25a1f49

Takhoffman force-pushed the codex/windows-ci-fix branch from 81ef12c to 25a1f49 Compare March 17, 2026 01:52

openclaw-barnacle bot added size: S and removed scripts Repository scripts size: M labels Mar 17, 2026

chatgpt-codex-connector bot reviewed Mar 17, 2026

View reviewed changes

fix(tests): unblock current typecheck drift

4f9ab1b

openclaw-barnacle bot added channel: imessage Channel integration: imessage channel: telegram Channel integration: telegram channel: whatsapp-web Channel integration: whatsapp-web labels Mar 17, 2026

chore(ci): retrigger windows matrix

0f1953c

BradGroux mentioned this pull request Mar 17, 2026

WORKING: All Microsoft Issues and PRs #49126

Draft

Takhoffman marked this pull request as draft March 18, 2026 00:07

kiranvk-2011 mentioned this pull request Mar 18, 2026

bug: Multiple CI jobs fail on main branch — contracts, channels, extensions, extension-fast, install-smoke, Windows shards #49848

Closed

vincentkoc added the r: no-ci-pr Auto-response for CI/test-failure PRs label Mar 20, 2026

openclaw-barnacle bot closed this Mar 20, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(tests): stabilize Windows CI cases#48544

fix(tests): stabilize Windows CI cases#48544
Takhoffman wants to merge 3 commits intomainfrom
codex/windows-ci-fix

Takhoffman commented Mar 16, 2026 •

edited

Loading

Uh oh!

greptile-apps bot commented Mar 16, 2026 •

edited

Loading

Uh oh!

greptile-apps bot Mar 16, 2026

Uh oh!

greptile-apps bot Mar 17, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 17, 2026

Uh oh!

chatgpt-codex-connector bot left a comment

Uh oh!

chatgpt-codex-connector bot Mar 17, 2026

Uh oh!

openclaw-barnacle bot commented Mar 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		if (installState !== "not-installed" \|\| typeof globalThis.fetch !== "function") {
		return;

Uh oh!

Conversation

Takhoffman commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Change Type (select all)

Scope (select all touched areas)

Linked Issue/PR

User-visible / Behavior Changes

Security Impact (required)

Repro + Verification

Environment

Steps

Expected

Actual

Evidence

Human Verification (required)

Review Conversations

Compatibility / Migration

Failure Recovery (if this breaks)

Risks and Mitigations

Uh oh!

greptile-apps bot commented Mar 16, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Greptile Summary

Confidence Score: 4/5

Uh oh!

greptile-apps bot Mar 16, 2026

Choose a reason for hiding this comment

Uh oh!

greptile-apps bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

💡 Codex Review

Uh oh!

chatgpt-codex-connector bot Mar 17, 2026

Choose a reason for hiding this comment

Uh oh!

openclaw-barnacle bot commented Mar 20, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Takhoffman commented Mar 16, 2026 •

edited

Loading

greptile-apps bot commented Mar 16, 2026 •

edited

Loading