CI: guard gateway watch against duplicate runtime regressions#49048

Merged
huntharo merged 4 commits into openclaw:main from huntharo:codex/ci-gateway-regression-checks
Mar 17, 2026

Conversation

@huntharo
Member

Summary

Describe the problem and fix in 2–5 bullets:

  • Problem: after pnpm build, a later pnpm gateway:watch regression can repopulate dist-runtime/ with a second runtime graph, which is the same class of failure that previously gave OpenClaw split runtime personalities between core and plugins.
  • Why it matters: when that happens, plugins can observe different global state than core, which can surface as Telegram missing plugin-registered slash commands like /voice, /phone, and /pair, and it also damages local developer startup time.
  • What changed: add a dedicated scripts/check-gateway-watch-regression.mjs harness plus a CI job that runs pnpm build, snapshots dist/ and dist-runtime/, runs a bounded pnpm gateway:watch, and fails if dist-runtime/ grows unexpectedly or if gateway startup takes too long after build.
  • What did NOT change (scope boundary): this does not change plugin loading, bundling behavior, or runtime staging itself; it only adds regression detection and CI coverage around the existing fix.
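
The snapshot-and-compare step described above can be pictured roughly as follows. This is a minimal sketch, not the harness's actual code: `snapshotTree` and `diffSnapshots` are hypothetical helper names, and only the idea of comparing file counts, byte totals, and added paths before and after the watch run is taken from the description.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: map every regular file under rootDir (relative path) to its size in bytes.
function snapshotTree(rootDir) {
  const files = new Map();
  if (!fs.existsSync(rootDir)) {
    return files; // a missing directory counts as an empty tree
  }
  const walk = (dir) => {
    for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
      const fullPath = path.join(dir, entry.name);
      if (entry.isDirectory()) {
        walk(fullPath);
      } else if (entry.isFile()) {
        files.set(path.relative(rootDir, fullPath), fs.statSync(fullPath).size);
      }
    }
  };
  walk(rootDir);
  return files;
}

// Hypothetical helper: compare two snapshots and report growth.
function diffSnapshots(before, after) {
  const totalBytes = (snapshot) => [...snapshot.values()].reduce((sum, size) => sum + size, 0);
  return {
    fileGrowth: after.size - before.size,
    byteGrowth: totalBytes(after) - totalBytes(before),
    addedPaths: [...after.keys()].filter((p) => !before.has(p)),
  };
}
```

A caller would snapshot `dist-runtime/` before and after the bounded watch run and fail when `fileGrowth` or `byteGrowth` exceeds its configured maximum.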

Change Type (select all)

  • Bug fix
  • Feature
  • Refactor
  • Docs
  • Security hardening
  • Chore/infra

Scope (select all touched areas)

  • Gateway / orchestration
  • Skills / tool execution
  • Auth / tokens
  • Memory / storage
  • Integrations
  • API / contracts
  • UI / DX
  • CI/CD / infra

Linked Issue/PR

User-visible / Behavior Changes

  • pnpm test:gateway:watch-regression is available locally and in CI.
  • PRs and main now fail if pnpm gateway:watch after pnpm build starts recreating a large dist-runtime/ tree or regresses badly on startup readiness.
  • The guard explicitly calls out the split-runtime failure mode that can lead to Telegram missing plugin slash commands such as /voice, /phone, and /pair.

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation:

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 24 / pnpm
  • Model/provider: N/A
  • Integration/channel (if any): gateway startup with bundled + configured local channels
  • Relevant config (redacted): existing local gateway config with Telegram enabled

Steps

  1. Run pnpm build.
  2. Run pnpm test:gateway:watch-regression.
  3. Inspect .local/gateway-watch-regression/summary.json and the saved snapshots/diffs.

Expected

  • dist-runtime/ should not explode into a mirrored second JS graph after pnpm gateway:watch.
  • Gateway should reach the listening state quickly after pnpm build.

Actual

  • Cold run after pnpm build: startupReadyMs=4075, distRuntimeFileGrowth=0, distRuntimeByteGrowth=0, distRuntimeAddedPaths=0.
  • Follow-up run with --skip-build: startupReadyMs=2636, distRuntimeFileGrowth=0, distRuntimeByteGrowth=0, addedPaths=0.

Evidence

Attach at least one:

  • Failing test/log before + passing after
  • Trace/log snippets
  • Screenshot/recording
  • Perf numbers (if relevant)

Human Verification (required)

What you personally verified (not just CI), and how:

  • Verified scenarios: ran pnpm check; ran pnpm test:gateway:watch-regression; ran node scripts/check-gateway-watch-regression.mjs --skip-build.
  • Edge cases checked: validated both a cold run after pnpm build and a follow-up run without rebuild; confirmed dist-runtime growth stayed at zero in both cases.
  • What you did not verify: Linux CI timing variance; a synthetic reintroduction of the old duplicate-runtime bug to watch the check fail in CI.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

If a bot review conversation is addressed by this PR, resolve that conversation yourself. Do not leave bot review conversation cleanup for maintainers.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps:

Failure Recovery (if this breaks)

  • How to disable/revert this change quickly: revert this PR or skip the gateway-watch-regression CI job while investigating the guard itself.
  • Files/config to restore: .github/workflows/ci.yml, package.json, scripts/check-gateway-watch-regression.mjs
  • Known bad symptoms reviewers should watch for: false positives from the readiness threshold, or unexpected dist-runtime growth in the saved artifact snapshots.

Risks and Mitigations

List only real risks for this PR. Add/remove entries as needed. If none, write None.

  • Risk: startup-readiness timing could be noisy on slower CI hosts.
    • Mitigation: use a generous 5000ms warning threshold and 8000ms loud-alarm threshold, and save artifacts for inspection on every run.
  • Risk: the structural guard could miss a new duplication shape that keeps file counts flat but still changes runtime identity.
    • Mitigation: the check stores path snapshots, byte counts, and startup logs so regressions can be expanded if a new pattern appears.

@aisle-research-bot

aisle-research-bot bot commented Mar 17, 2026

🔒 Aisle Security Analysis

We found 3 potential security issue(s) in this PR:

| # | Severity | Title |
| --- | --- | --- |
| 1 | 🟠 High | CI harness writes artifacts to attacker-controlled symlinked outputDir (arbitrary file write outside workspace) |
| 2 | 🔵 Low | Unpinned GitHub Actions versions in new CI job (supply-chain risk) |
| 3 | 🔵 Low | Arbitrary process signaling via stale/tampered PID file in gateway watch regression harness |

1. 🟠 CI harness writes artifacts to attacker-controlled symlinked outputDir (arbitrary file write outside workspace)

| Property | Value |
| --- | --- |
| Severity | High |
| CWE | CWE-22 |
| Location | scripts/check-gateway-watch-regression.mjs:9-67 |

Description

The new scripts/check-gateway-watch-regression.mjs writes multiple artifact files under options.outputDir without protecting against symlink traversal.

This is dangerous in CI because the default output directory is inside the repository checkout (${cwd}/.local/gateway-watch-regression), and .local/ is only gitignored (not blocked from being committed). A malicious PR can commit a .local path element (or a deeper element) as a symlink pointing outside the workspace. Node's mkdirSync(..., {recursive:true}) and writeFileSync() will follow symlinks in path components, causing the script to create directories and overwrite files outside the repository.

Additionally, the script accepts --output-dir and resolves it to an absolute path (path.resolve(...)), enabling writes anywhere on disk if that flag is ever passed by CI (directly or indirectly).

Vulnerable behaviors:

  • ensureDir(options.outputDir) creates directories without checking for symlinks.
  • Many subsequent writeFileSync(path.join(outputDir, ...)) calls will follow symlinks and overwrite targets.

Vulnerable code:

const DEFAULTS = {
  outputDir: path.join(process.cwd(), ".local", "gateway-watch-regression"),
};

function ensureDir(dirPath) {
  fs.mkdirSync(dirPath, { recursive: true });
}

async function main() {
  const options = parseArgs(process.argv.slice(2));
  ensureDir(options.outputDir);
  // ... multiple writeFileSync() into options.outputDir
}

Impact in CI:

  • A PR can introduce a symlinked .local/ (or nested path) so the job writes outside $GITHUB_WORKSPACE.
  • This can overwrite files on the runner (within permissions), potentially affecting subsequent steps or leaking/planting data in other locations.

Note: .gitignore does not prevent committing ignored paths; git add -f .local can add a symlink to the repo.

Recommendation

Harden the output directory handling so writes cannot escape the workspace or traverse symlinks.

Recommended defense-in-depth (pick one approach, or combine):

  1. Force output under a safe base directory and verify realpath containment:

const workspace = fs.realpathSync(process.cwd());
const requested = path.resolve(options.outputDir);
const realOutParent = fs.realpathSync(path.dirname(requested));
const realOut = path.join(realOutParent, path.basename(requested));

if (!realOut.startsWith(workspace + path.sep)) {
  throw new Error(`Refusing to write outside workspace: ${realOut}`);
}

  2. Reject symlinks in any path segment (walk from workspace to outputDir and lstatSync each segment; if isSymbolicLink(), abort).

  3. Prefer a temp directory controlled by the runner (avoids repository-controlled path elements):

const outputDir = fs.mkdtempSync(path.join(os.tmpdir(), "gateway-watch-regression-"));

  4. If --output-dir must exist, require it to be relative (no absolute paths) and still apply the realpath/symlink checks.

Also consider hardening CI artifact upload paths to avoid following repository-controlled symlinks when collecting artifacts.
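
The "reject symlinks in any path segment" recommendation above has no accompanying code, so here is a minimal sketch of one way it might look. `assertNoSymlinkSegments` is a hypothetical name, not something in the harness. The sketch assumes the check runs immediately before the directory is created, and it does not close the window between the check and the write.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Hypothetical helper: throw if targetDir is outside baseDir, or if any existing
// path segment between baseDir and targetDir is a symbolic link.
function assertNoSymlinkSegments(baseDir, targetDir) {
  const base = fs.realpathSync(baseDir);
  const relative = path.relative(base, path.resolve(base, targetDir));
  if (relative === ".." || relative.startsWith(".." + path.sep) || path.isAbsolute(relative)) {
    throw new Error(`Refusing to write outside ${base}: ${targetDir}`);
  }
  let current = base;
  for (const segment of relative.split(path.sep).filter(Boolean)) {
    current = path.join(current, segment);
    let stats;
    try {
      stats = fs.lstatSync(current); // lstat inspects the link itself, not its target
    } catch (error) {
      if (error.code === "ENOENT") {
        return; // the rest of the path does not exist yet, so there is nothing to follow
      }
      throw error;
    }
    if (stats.isSymbolicLink()) {
      throw new Error(`Refusing to traverse symlink: ${current}`);
    }
  }
}
```

Under these assumptions, the harness could call `assertNoSymlinkSegments(process.cwd(), options.outputDir)` before `ensureDir(options.outputDir)`.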


2. 🔵 Unpinned GitHub Actions versions in new CI job (supply-chain risk)

| Property | Value |
| --- | --- |
| Severity | Low |
| CWE | CWE-494 |
| Location | .github/workflows/ci.yml:342-362 |

Description

The newly added gateway-watch-regression job references third-party GitHub Actions by a floating tag (@v6, @v7) instead of pinning to an immutable commit SHA.

This creates a supply-chain risk because:

  • GitHub Actions are fetched and executed from the referenced repository at runtime.
  • Tags/major versions can be moved or a maintainer account/release process could be compromised.
  • This workflow runs on push to main (not just PRs), where GITHUB_TOKEN and any configured secrets/permissions for the workflow would be available to the runner.

Vulnerable code (new job):

- name: Checkout
  uses: actions/checkout@v6
...
- name: Upload gateway watch regression artifacts
  uses: actions/upload-artifact@v7

Recommendation

Pin GitHub Actions to a full commit SHA (and optionally keep the major tag as a comment for readability).

Example:

- name: Checkout
  uses: actions/checkout@<FULL_40_CHAR_COMMIT_SHA> # v6.x.x

- name: Upload gateway watch regression artifacts
  uses: actions/upload-artifact@<FULL_40_CHAR_COMMIT_SHA> # v7.x.x

Additionally, consider setting explicit least-privilege workflow/job permissions, e.g.:

permissions:
  contents: read

(and add only the scopes needed for release jobs).


3. 🔵 Arbitrary process signaling via stale/tampered PID file in gateway watch regression harness

| Property | Value |
| --- | --- |
| Severity | Low |
| CWE | CWE-367 |
| Location | scripts/check-gateway-watch-regression.mjs:296-327 |

Description

The regression harness determines which process to terminate by reading a PID from a predictable file path (watch.pid) and then sending signals to that PID.

Because the PID file is not created atomically, not removed before use, and not validated to be freshly written by the spawned watch process, a pre-existing or tampered watch.pid can cause the harness to send SIGTERM/SIGKILL to an unintended process.

Impact:

  • If options.outputDir is reused (e.g., local runs, or a CI workspace/cache that preserves .local/), an old watch.pid can be read before the new watch process overwrites it.
  • PID reuse on the OS can cause the old PID value to refer to a completely different process.
  • In the worst case this can terminate unrelated processes owned by the same user, causing CI/job disruption or local DoS.

Vulnerable code:

let watchPid = null;
for (let attempt = 0; attempt < 50; attempt += 1) {
  if (fs.existsSync(pidFilePath)) {
    watchPid = Number(fs.readFileSync(pidFilePath, "utf8").trim());
    break;
  }
  await sleep(100);
}
...
process.kill(watchPid, "SIGTERM");
...
process.kill(watchPid, "SIGKILL");

Recommendation

Ensure the PID file cannot be stale/tampered and that the PID corresponds to the process started by this run.

Minimum hardening (recommended):

  • Delete any pre-existing PID file before spawning the watch.
  • Wait for the PID file to be recreated, and validate contents are a positive integer.
  • Optionally, write and verify a nonce (or check file mtime is after spawn time) to ensure freshness.

Example fix (delete + mtime freshness check):

const startTime = Date.now();
try { fs.unlinkSync(pidFilePath); } catch {}
// spawn child...

for (let attempt = 0; attempt < 50; attempt += 1) {
  try {
    const st = fs.statSync(pidFilePath);
    if (st.mtimeMs >= startTime) {
      const pid = Number(fs.readFileSync(pidFilePath, "utf8").trim());
      if (Number.isInteger(pid) && pid > 0) {
        watchPid = pid;
        break;
      }
    }
  } catch {}
  await sleep(100);
}

Stronger alternative:

  • Avoid PID files entirely and instead terminate the spawned process tree using process groups (e.g., spawn with { detached: true } and process.kill(-child.pid, signal) on POSIX), or use a library that can kill a process tree safely.
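
The process-group alternative could look roughly like this. It is a sketch under stated assumptions, not the harness's code: `spawnInOwnGroup` and `killTree` are hypothetical names, and the Windows branch (shelling out to `taskkill /T /F`) is an added fallback, because negative PIDs are a POSIX-only convention.

```javascript
const { spawn, spawnSync } = require("node:child_process");

// Hypothetical helper: start a command as the leader of its own process group (POSIX).
function spawnInOwnGroup(command, args) {
  return spawn(command, args, { detached: true, stdio: "ignore" });
}

// Hypothetical helper: signal the whole tree rooted at the spawned child.
// Returns true if a signal was delivered, false if there was nothing to signal.
function killTree(child, signal = "SIGTERM") {
  if (child.pid === undefined) {
    return false; // spawn failed, so there is no process to signal
  }
  if (process.platform === "win32") {
    // Negative PIDs are unsupported on Windows; taskkill /T terminates the tree.
    const result = spawnSync("taskkill", ["/pid", String(child.pid), "/T", "/F"]);
    return result.status === 0;
  }
  try {
    process.kill(-child.pid, signal); // a negative PID targets the process group
    return true;
  } catch (error) {
    if (error.code === "ESRCH") {
      return false; // the group has already exited
    }
    throw error;
  }
}
```

Because the PID comes from the `ChildProcess` object returned by `spawn`, no PID file is read, so a stale or tampered file cannot redirect the signal.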

Analyzed PR: #49048 at commit e57f1b3

Last updated on: 2026-03-17T15:04:27Z

@openclaw-barnacle openclaw-barnacle bot added the labels scripts (Repository scripts), size: M, and maintainer (Maintainer-authored PR) on Mar 17, 2026
@greptile-apps
Contributor

greptile-apps bot commented Mar 17, 2026

Greptile Summary

This PR adds a CI regression harness (scripts/check-gateway-watch-regression.mjs) that guards against pnpm gateway:watch silently re-populating dist-runtime/ with a duplicate runtime graph after a clean pnpm build. The check snapshots dist/ and dist-runtime/ before and after a bounded watch run, verifies that file and byte counts do not exceed configurable thresholds, and confirms the gateway reaches its listening state within the measurement window. A new gateway-watch-regression CI job wires this harness into the existing pipeline with proper scoping guards, artifact uploads, and if: always() so diagnostic data is preserved on failure.

Issues found:

  • keepLogs is declared in DEFAULTS but is never read or applied anywhere in the script — the option has no effect and should either be implemented (conditional log cleanup) or removed to avoid misleading future readers.
  • The startup-readiness detection (chunkText.includes("[gateway] listening on ws://")) checks only the bytes in the current data event rather than the accumulated stdout buffer; a message fragmented across two consecutive chunks would cause a false "never reached listening state" failure. Checking stdout.includes(...) after accumulation would be more robust.
  • startupReadyWarnMs (5 000 ms) pushes to the same failures array and calls process.exit(1) like the hard-fail threshold, making the "warn" label misleading. On slower CI runners this could trigger unexpected failures.

Confidence Score: 4/5

  • Safe to merge — adds CI coverage only, no production code changes; minor script issues are non-blocking.
  • The change is purely additive CI/infra scaffolding with no impact on production code paths. The three issues found are all style-level: dead keepLogs option, chunk-only startup detection (theoretical flakiness risk), and a misleadingly-named warn threshold that actually causes hard failures. None of these block the core purpose of the harness, and the PR evidence shows it passing cleanly on the author's machine. The slight risk of timing-induced CI flakiness on slower runners (5 000 ms soft-fail threshold) is the only realistic concern before merge.
  • scripts/check-gateway-watch-regression.mjs — three minor issues: dead keepLogs field, chunk-only startup detection, and misleading warn-vs-fail threshold naming.

Last reviewed commit: f7ba371

startupReadyFailMs: 8_000,
distRuntimeFileGrowthMax: 200,
distRuntimeByteGrowthMax: 2 * 1024 * 1024,
keepLogs: true,

P2 keepLogs option is dead code

keepLogs is declared in DEFAULTS and spread into options by parseArgs, but it is never read anywhere in the rest of the script — there is no conditional that checks options.keepLogs to decide whether to delete or retain the log files. Because the default is true and no cleanup path exists, logs are always kept regardless of what value this field holds. Either wire up the option (e.g., conditionally delete the watch log files after a successful run) or remove keepLogs from DEFAULTS to avoid misleading future maintainers.

Comment on lines +231 to +237
child.stdout?.on("data", (chunk) => {
  const chunkText = String(chunk);
  stdout += chunkText;
  if (startupReadyMs === null && chunkText.includes("[gateway] listening on ws://")) {
    startupReadyMs = Date.now() - startedAt;
  }
});

P2 Startup detection checks the current chunk, not the accumulated buffer

chunkText.includes("[gateway] listening on ws://") only inspects the bytes that arrived in the current data event. If the log line happens to be fragmented across two consecutive chunks (possible, though unlikely for a single console.log call over a pipe), startupReadyMs would remain null and the harness would fail with "never reached the gateway listening state". Checking the accumulated stdout string after appending is trivially more robust:

Suggested change
- child.stdout?.on("data", (chunk) => {
-   const chunkText = String(chunk);
-   stdout += chunkText;
-   if (startupReadyMs === null && chunkText.includes("[gateway] listening on ws://")) {
-     startupReadyMs = Date.now() - startedAt;
-   }
- });
+ child.stdout?.on("data", (chunk) => {
+   const chunkText = String(chunk);
+   stdout += chunkText;
+   if (startupReadyMs === null && stdout.includes("[gateway] listening on ws://")) {
+     startupReadyMs = Date.now() - startedAt;
+   }
+ });

Comment on lines +379 to +383
  } else if (startupReadyMs > options.startupReadyWarnMs) {
    failures.push(
      `gateway:watch took ${startupReadyMs}ms to reach the listening state after pnpm build, above target ${options.startupReadyWarnMs}ms`,
    );
  }

P2 "Warn" threshold causes a hard CI failure

startupReadyWarnMs (5 000 ms) pushes into the same failures array as startupReadyFailMs (8 000 ms), so exceeding the "warn" threshold also calls process.exit(1) and fails the CI job. The name warn implies a softer outcome, but today the check cannot emit a warning without also exiting non-zero. This is unlikely to be a problem on the current hardware, but could cause unexpected CI failures on slower runners in the future. Consider either renaming the field (e.g., startupReadySoftFailMs) to make the behavior explicit, or actually downgrading this branch to a non-fatal advisory (logged but not added to failures).
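
The "non-fatal advisory" option might be sketched as below. `classifyStartup` is a hypothetical helper, not part of the script. Only the 5 000 ms and 8 000 ms values and the idea of keeping warnings out of `failures` come from the discussion above.

```javascript
// Hypothetical helper: split startup timing into hard failures and advisories.
function classifyStartup(
  startupReadyMs,
  { startupReadyWarnMs = 5_000, startupReadyFailMs = 8_000 } = {},
) {
  const failures = [];
  const warnings = [];
  if (startupReadyMs === null) {
    failures.push("gateway:watch never reached the gateway listening state");
  } else if (startupReadyMs > startupReadyFailMs) {
    failures.push(`startup took ${startupReadyMs}ms, above hard limit ${startupReadyFailMs}ms`);
  } else if (startupReadyMs > startupReadyWarnMs) {
    // Advisory only: reported to the caller but kept out of the failures list.
    warnings.push(`startup took ${startupReadyMs}ms, above target ${startupReadyWarnMs}ms`);
  }
  return { failures, warnings };
}
```

A caller would log `warnings` and exit non-zero only when `failures` is non-empty.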

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: f7ba371a29

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment on lines +234 to +235
if (startupReadyMs === null && chunkText.includes("[gateway] listening on ws://")) {
startupReadyMs = Date.now() - startedAt;

P2 Detect gateway readiness across chunked stdout

Readiness detection is keyed on chunkText.includes("[gateway] listening on ws://"), but stream data chunks are arbitrary boundaries; if that marker is split across two chunks, startupReadyMs remains null and the harness reports a false failure (never reached the gateway listening state) even when the gateway did become ready. This can make the new CI guard flaky under normal buffering differences, so the match should be done against accumulated output (or a rolling buffer) rather than a single chunk.
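
The rolling-buffer variant mentioned above could be sketched like this. `createReadinessDetector` is a hypothetical name. The marker string is the one quoted in the review, and the sketch assumes the marker is plain ASCII so that chunk boundaries cannot split a multi-byte character inside it.

```javascript
const READY_MARKER = "[gateway] listening on ws://";

// Hypothetical helper: returns a function that accepts stdout chunks and reports
// true once the marker has been seen, even when it was split across chunks.
function createReadinessDetector(marker = READY_MARKER) {
  let tail = "";
  let ready = false;
  return (chunk) => {
    if (ready) {
      return true;
    }
    const text = tail + String(chunk);
    ready = text.includes(marker);
    // Keep only enough trailing characters to complete a partially received marker.
    tail = marker.length > 1 ? text.slice(-(marker.length - 1)) : "";
    return ready;
  };
}
```

Compared with matching against the full accumulated `stdout` string, this keeps the per-chunk search bounded by the chunk size plus the marker length.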

await sleep(options.windowMs);

try {
process.kill(-watchPid, "SIGTERM");

P2 Add Windows-safe termination for gateway watch process

The harness spawns pnpm gateway:watch with detached: true and then only tries process.kill(-watchPid, ...); on Windows, negative PID kills are unsupported, so both kill attempts are swallowed and the later await exitPromise can block indefinitely while the watch process keeps running. Because package.json exposes this check for local use (pnpm test:gateway:watch-regression), Windows contributors will hit hangs unless this path falls back to Windows-compatible tree termination.

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 05d1e5e696

distRuntimeAddedPaths,
addedPaths: diff.added.length,
removedPaths: diff.removed.length,
watchExit: watchResult.exit,

P2 Fail when gateway watch exits before window completes

The harness records watchExit but never validates it, so a gateway:watch process that starts, emits the listening line, then crashes before windowMs still produces a passing check as long as dist-runtime growth stays under thresholds. In that scenario CI reports this regression guard as green even though the watched gateway is unstable, which defeats the purpose of this health gate for post-build watch behavior.
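
One way to catch an early exit is to race the observation window against the child's exit. This is a sketch, not the harness's implementation: `waitForWindow` is a hypothetical helper, and it assumes an `exitPromise` that resolves to `{ code, signal }`, as in the snippet quoted elsewhere in this review.

```javascript
// Hypothetical helper: wait for the observation window, but report an early exit
// if the watched process ends first. exitPromise resolves to { code, signal }.
async function waitForWindow(exitPromise, windowMs) {
  let timer;
  const windowElapsed = new Promise((resolve) => {
    timer = setTimeout(() => resolve({ exitedEarly: false, exit: null }), windowMs);
  });
  const earlyExit = exitPromise.then((exit) => ({ exitedEarly: true, exit }));
  try {
    return await Promise.race([windowElapsed, earlyExit]);
  } finally {
    clearTimeout(timer); // do not keep the event loop alive after an early exit
  }
}
```

A caller could add a failure when `exitedEarly` is true, and only send the termination signal when the window elapsed normally.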

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: e57f1b3eed

Comment on lines +233 to +235
return {
command: "/usr/bin/time",
args: [

P2 Resolve timing command without hardcoded /usr/bin path

The harness hardcodes command: "/usr/bin/time" (and later runs /bin/sh), so pnpm test:gateway:watch-regression fails immediately on environments where that absolute path is missing (for example this container, where time is only a shell keyword, and Windows hosts). Since this script is wired into package.json for local use, the check becomes non-runnable outside a narrow runner image and can block contributors from reproducing CI behavior.

Comment on lines +292 to +294
const exitPromise = new Promise((resolve) => {
child.on("exit", (code, signal) => resolve({ code, signal }));
});

P2 Handle spawn errors for the timed watch subprocess

runTimedWatch only listens for exit, so if spawn(...) fails (for example ENOENT when the timing binary/shell is unavailable) Node emits an unhandled error event and the script crashes before writing summary.json or useful artifacts. Converting spawn failures into a controlled failure path would keep this CI guard debuggable instead of terminating with an uncaught exception.
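
A controlled failure path for spawn errors might look like the following sketch. `runAndCapture` is a hypothetical stand-in for the relevant part of `runTimedWatch`, and the `{ code, signal, error }` result shape is an assumption chosen for illustration.

```javascript
const { spawn } = require("node:child_process");

// Hypothetical helper: resolve with the exit status or with the spawn error, so the
// caller can record a failure instead of crashing on an unhandled "error" event.
function runAndCapture(command, args) {
  return new Promise((resolve) => {
    const child = spawn(command, args, { stdio: "ignore" });
    // Emitted when the process could not be spawned (for example ENOENT).
    child.once("error", (error) => resolve({ code: null, signal: null, error }));
    child.once("exit", (code, signal) => resolve({ code, signal, error: null }));
  });
}
```

With this shape, the harness could still write `summary.json` and its artifacts, then report the spawn error as an ordinary failure.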

@huntharo huntharo merged commit f036ed2 into openclaw:main Mar 17, 2026
32 of 38 checks passed
@huntharo huntharo deleted the codex/ci-gateway-regression-checks branch March 17, 2026 14:55
nikolaisid pushed a commit to nikolaisid/openclaw that referenced this pull request Mar 18, 2026
alexey-pelykh pushed a commit to remoteclaw/remoteclaw that referenced this pull request Mar 25, 2026
alexey-pelykh added a commit to remoteclaw/remoteclaw that referenced this pull request Mar 25, 2026
* fix(ci): stop serializing push workflow runs

(cherry picked from commit 0a20c5c)

* test: harden path resolution test helpers

(cherry picked from commit 1ad47b8)

* Fix launcher startup regressions (openclaw#48501)

* Fix launcher startup regressions

* Fix CI follow-up regressions

* Fix review follow-ups

* Fix workflow audit shell inputs

* Handle require resolve gaxios misses

(cherry picked from commit 313e5bb)

* refactor(scripts): move container setup entrypoints

(cherry picked from commit 46ccbac)

* perf(ci): gate install smoke on changed-smoke (openclaw#52458)

(cherry picked from commit 4bd90f2)

* Docs: prototype generated plugin SDK reference (openclaw#51877)

* Chore: unblock synced main checks

* Docs: add plugin SDK docs implementation plan

* Docs: scaffold plugin SDK reference phase 1

* Docs: mark plugin SDK reference surfaces unstable

* Docs: prototype generated plugin SDK reference

* docs(plugin-sdk): replace generated reference with api baseline

* docs(plugin-sdk): drop generated reference plan

* docs(plugin-sdk): align api baseline flow with config docs

---------

Co-authored-by: Onur
Co-authored-by: Vincent Koc
(cherry picked from commit 4f1e12a)

* fix(ci): harden docker builds and unblock config docs

(cherry picked from commit 9f08af1)

* Docs: add config drift baseline statefile (openclaw#45891)

* Docs: add config drift statefile generator

* Docs: generate config drift baseline

* CI: move config docs drift runner into workflow sanity

* Docs: emit config drift baseline json

* Docs: commit config drift baseline json

* Docs: wire config baseline into release checks

* Config: fix baseline drift walker coverage

* Docs: regenerate config drift baselines

(cherry picked from commit cbec476)

* Release: add plugin npm publish workflow (openclaw#47678)

* Release: add plugin npm publish workflow

* Release: make plugin publish scope explicit

(cherry picked from commit d41c9ad)

* build: default to Node 24 and keep Node 22 compat

(cherry picked from commit deada7e)

* ci(android): use explicit flavor debug tasks

(cherry picked from commit 0c2e6fe)

* ci: harden pnpm sticky cache on PRs

(cherry picked from commit 29b36f8)

* CI: add built plugin singleton smoke (openclaw#48710)

(cherry picked from commit 5a2a4ab)

* chore: add code owners for npm release paths

(cherry picked from commit 5c9fae5)

* test add extension plugin sdk boundary guards

(cherry picked from commit 77fb258)

* ci: tighten cache docs and node22 gate

(cherry picked from commit 797b6fe)

* ci: add npm release workflow and CalVer checks (openclaw#42414) (thanks @onutc)

(cherry picked from commit 8ba1b6e)

* CI: add CLI startup memory regression check

(cherry picked from commit c0e0115)

* Add bad-barnacle label to prevent barnacle closures. (openclaw#51945)

(cherry picked from commit c449a0a)

* ci: speed up scoped workflow lanes

(cherry picked from commit d17490f)

* ci: restore PR pnpm cache fallback

(cherry picked from commit e1d0545)

* CI: guard gateway watch against duplicate runtime regressions (openclaw#49048)

(cherry picked from commit f036ed2)

* fix: correct domain reference in docker setup script

* fix: adapt cherry-picks for fork TS strictness

* fix: adapt cherry-picked tests for fork structure

- Dockerfile test: OPENCLAW_ → REMOTECLAW_ ARG names
- ci-changed-scope test: add missing runChangedSmoke field
- doc-baseline test: rename to e2e (needs dist/ build artifacts)
- extension boundary test: update baselines and expectations for fork

* fix: adjust ci-changed-scope test for fork's narrower skills regex

---------

Co-authored-by: Vincent Koc
Co-authored-by: Peter Steinberger
Co-authored-by: Tak Hoffman
Co-authored-by: Bob
Co-authored-by: Onur
Co-authored-by: Altay
Co-authored-by: Ayaan Zaidi
Co-authored-by: Onur Solmaz
Co-authored-by: Harold Hunt
ralyodio pushed a commit to ralyodio/openclaw that referenced this pull request Apr 3, 2026
Labels

maintainer Maintainer-authored PR scripts Repository scripts size: M
