Skip to content

feat(tasks): harden maintenance repair paths#57439

Merged
vincentkoc merged 3 commits intomainfrom
feat/task-health-status-maintenance
Mar 30, 2026
Merged

feat(tasks): harden maintenance repair paths#57439
vincentkoc merged 3 commits intomainfrom
feat/task-health-status-maintenance

Conversation

@vincentkoc
Copy link
Copy Markdown
Member

Summary

  • Problem: task maintenance still left one repair gap: tasks marked lost did not get cleanupAfter until a later pass, and status surfaced audit errors without telling operators the repair command.
  • Why it matters: repair drift keeps old broken rows around longer than intended and makes task health actionable only if the operator already knows the maintenance CLI.
  • What changed: maintenance now stamps cleanupAfter in the same pass that marks a task lost, inspection-time lost projections mirror that retention, and openclaw status prints a direct maintenance hint when task audit errors are present.
  • What did NOT change (scope boundary): no executor changes, no task schema expansion, no Redis/Temporal work, no producer-specific lifecycle refactors.

Change Type (select all)

  • Feature
  • Refactor required for the fix

Scope (select all touched areas)

  • Gateway / orchestration
  • Memory / storage
  • UI / DX

Linked Issue/PR

Root Cause / Regression History (if applicable)

  • Root cause: task maintenance reconciled lost state and retention in two separate passes.
  • Missing detection / guardrail: there was no test asserting that a newly-lost task gets cleanupAfter immediately.
  • Prior context (git blame, prior PR, issue, or refactor if known): follow-up to the task maintenance/status work in feat(tasks): add status health and maintenance command #57423.
  • Why this regressed now: the initial maintenance pass optimized for state reconciliation first and cleanup stamping second.
  • If unknown, what was ruled out: N/A

Regression Test Plan (if applicable)

  • Coverage level that should have caught this:
    • Unit test
    • Seam / integration test
  • Target test or file: src/tasks/task-registry.test.ts, src/commands/status.test.ts
  • Scenario the test should lock in: a task reconciled to lost is immediately cleanup-eligible, and status shows the maintenance command when task audit errors exist.
  • Why this is the smallest reliable guardrail: both behaviors sit at the task-registry and status-command seams.
  • Existing test that already covers this (if any): task maintenance tests already covered prune and missing-cleanup stamping.
  • If no new test is added, why not: N/A

User-visible / Behavior Changes

  • openclaw status now prints a direct openclaw tasks maintenance --apply hint when task audit errors are present.
  • newly-lost tasks now get cleanupAfter immediately in the same maintenance pass.

Diagram (if applicable)

Before:
[maintenance pass] -> [mark lost] -> [later pass] -> [stamp cleanupAfter]
[operator] -> [status shows audit errors] -> [must remember repair command]

After:
[maintenance pass] -> [mark lost + stamp cleanupAfter]
[operator] -> [status shows audit errors + repair command]

Security Impact (required)

  • New permissions/capabilities? (Yes/No) No
  • Secrets/tokens handling changed? (Yes/No) No
  • New/changed network calls? (Yes/No) No
  • Command/tool execution surface changed? (Yes/No) No
  • Data access scope changed? (Yes/No) No
  • If any Yes, explain risk + mitigation: N/A

Repro + Verification

Environment

  • OS: macOS
  • Runtime/container: Node 22 / pnpm workspace
  • Model/provider: N/A
  • Integration/channel (if any): N/A
  • Relevant config (redacted): default local state dir

Steps

  1. Create a task whose backing session is gone and whose last event is past the lost grace window.
  2. Run maintenance.
  3. Inspect the task row and run formatted status with task audit errors present.

Expected

  • maintenance marks the task lost and stamps cleanupAfter immediately.
  • status prints the repair command when task audit errors exist.

Actual

  • Matches expected in scoped tests.

Evidence

  • Failing test/log before + passing after

Human Verification (required)

  • Verified scenarios: lost-task maintenance repair, formatted status maintenance hint.
  • Edge cases checked: inspection-only lost projection still leaves the registry untouched; maintenance still reports cleanupStamped: 0 for the lost reconciliation path because retention is applied as part of the reconciliation itself.
  • What you did not verify: full pnpm test suite; pnpm check on current main.

Review Conversations

  • I replied to or resolved every bot review conversation I addressed in this PR.
  • I left unresolved only the conversations that still need reviewer or maintainer judgment.

Compatibility / Migration

  • Backward compatible? (Yes/No) Yes
  • Config/env changes? (Yes/No) No
  • Migration needed? (Yes/No) No
  • If yes, exact upgrade steps: N/A

Risks and Mitigations

  • Risk: task maintenance semantics could drift from task audit semantics again.
    • Mitigation: the new test asserts single-pass lost reconciliation with immediate cleanup retention.
  • Risk: pnpm check is still red on latest main due to the unrelated src/agents/skills.test-helpers.ts:34 error.
    • Mitigation: scoped task/status tests are green for this change.

AI Assistance

  • AI-assisted: yes
  • Testing: pnpm test -- src/tasks/task-registry.test.ts src/commands/status.test.ts
  • codex review --base origin/main: not re-run in this tiny follow-up because the local transport issue to 127.0.0.1 remained unresolved in the earlier PR cycle.

@vincentkoc vincentkoc self-assigned this Mar 30, 2026
@openclaw-barnacle openclaw-barnacle bot added cli CLI command changes commands Command implementations size: L maintainer Maintainer-authored PR labels Mar 30, 2026
@vincentkoc vincentkoc force-pushed the feat/task-health-status-maintenance branch from 4cf55c9 to 646590f Compare March 30, 2026 03:22
@openclaw-barnacle openclaw-barnacle bot added size: S and removed cli CLI command changes size: L labels Mar 30, 2026
@vincentkoc vincentkoc marked this pull request as ready for review March 30, 2026 03:23
@greptile-apps
Copy link
Copy Markdown
Contributor

greptile-apps bot commented Mar 30, 2026

Greptile Summary

This PR closes the two-pass gap in task maintenance: markTaskLost now stamps cleanupAfter in the same reconciliation pass that marks a task lost, and projectTaskLost mirrors this behaviour so inspection-time projections also carry the retention field. Additionally, openclaw status now prints a direct openclaw tasks maintenance --apply hint when taskAudit.errors > 0, making repair actionable without prior CLI knowledge. Both changes are backed by targeted seam tests.

  • Correctness – single-pass lost reconciliation is sound; projectTaskLost guard (typeof cleanupAfter === "number") correctly handles all existing states.
  • previewTaskRegistryMaintenance intentionally keeps cleanupStamped: 0 for the lost path (cleanup is folded into reconciliation), matching the test expectation.
  • P2 style – the maintenance hint in status.command.ts is missing the blank-line separator (runtime.log("")) that every other conditional hint block in the function uses, which will cause it to appear visually attached to the overview table.
  • P2 stylemarkTaskLost calls resolveCleanupAfter(projectTaskLost(task, now)) but projectTaskLost already computes and stores the same value; using projectTaskLost(task, now).cleanupAfter directly would be simpler.

Confidence Score: 5/5

Safe to merge — all remaining findings are P2 style suggestions that do not affect correctness or runtime behaviour.

The core logic change (single-pass cleanupAfter stamping for lost tasks) is correct and well-tested. The status hint is guarded by the right condition. Only two P2 style issues remain: a missing blank-line separator in the status output and a minor redundancy in the cleanupAfter derivation. Neither affects functionality.

src/commands/status.command.ts (missing blank line before maintenance hint)

Important Files Changed

Filename Overview
src/tasks/task-registry.maintenance.ts markTaskLost now stamps cleanupAfter in the same reconciliation pass; projectTaskLost mirrors this for inspection paths. Logic is correct; minor redundancy in cleanupAfter derivation.
src/commands/status.command.ts Adds maintenance hint after the overview table when taskAudit.errors > 0. Missing blank-line separator before the hint (P2 style).
src/tasks/task-registry.test.ts New test asserts single-pass lost reconciliation with immediate cleanupAfter stamping and cleanupStamped: 0. Coverage is correct and targeted.
src/commands/status.test.ts Adds mocks for getInspectableTaskRegistrySummary and getInspectableTaskAuditSummary plus a test confirming the maintenance hint appears when audit errors > 0.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/commands/status.command.ts
Line: 546-550

Comment:
**Missing blank-line separator before maintenance hint**

Every other conditional hint block in this function (`pairingRecovery`, `pluginCompatibility`) starts with `runtime.log("")` to visually separate it from the preceding table. The new maintenance hint is emitted immediately after the trimmed table output with no blank line, so it will appear "glued" to the table rows on terminals.

```suggestion
  if (summary.taskAudit.errors > 0) {
    runtime.log("");
    runtime.log(
      theme.muted(`Task maintenance: ${formatCliCommand("openclaw tasks maintenance --apply")}`),
    );
  }
```

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/tasks/task-registry.maintenance.ts
Line: 111

Comment:
**Redundant double-computation of `cleanupAfter`**

`projectTaskLost(task, now)` already computes and stores `cleanupAfter` on the returned record (via its own `resolveCleanupAfter` call). Wrapping its result in another `resolveCleanupAfter(...)` call re-derives the same value from `endedAt`/`lastEventAt` — the two computations are always equal, but it's unnecessarily indirect. Using the projected record's field directly is clearer and avoids repeating the derivation logic:

```suggestion
  const cleanupAfter = task.cleanupAfter ?? projectTaskLost(task, now).cleanupAfter;
```

How can I resolve this? If you propose a fix, please make it concise.

Reviews (1): Last reviewed commit: "Merge branch 'main' into feat/task-healt..." | Re-trigger Greptile

@vincentkoc vincentkoc merged commit 8c163c1 into main Mar 30, 2026
9 checks passed
@vincentkoc vincentkoc deleted the feat/task-health-status-maintenance branch March 30, 2026 03:37
@aisle-research-bot
Copy link
Copy Markdown

aisle-research-bot bot commented Mar 30, 2026

🤖 We're reviewing this PR with Aisle

We're running a security check on the changes in this PR now. This usually takes a few minutes. ⌛
We'll post the results here as soon as they're ready.

Progress:

  • Analysis
  • Finalization

Latest run failed. Keeping previous successful results. Trace ID: 019d3cc4927324e6efe411c000d6600a.

Last updated on: 2026-03-30T09:12:07Z

@aisle-research-bot
Copy link
Copy Markdown

aisle-research-bot bot commented Mar 30, 2026

🤖 We're reviewing this PR with Aisle

We're running a security check on the changes in this PR now. This usually takes a few minutes. ⌛
We'll post the results here as soon as they're ready.

Progress:

  • Analysis
  • Finalization

Latest run failed. Keeping previous successful results. Trace ID: 019d3cd0a68268bb05f21edd45727783.

Last updated on: 2026-03-30T09:40:20Z

alexjiang1 pushed a commit to alexjiang1/openclaw that referenced this pull request Mar 31, 2026
* feat(tasks): harden maintenance repair paths

* fix(tasks): address follow-up maintenance review comments
pgondhi987 pushed a commit to pgondhi987/openclaw that referenced this pull request Mar 31, 2026
* feat(tasks): harden maintenance repair paths

* fix(tasks): address follow-up maintenance review comments
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

commands Command implementations maintainer Maintainer-authored PR size: S

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant