feat(tasks): move task ledger to sqlite and add audit CLI#57361
feat(tasks): move task ledger to sqlite and add audit CLI#57361vincentkoc merged 7 commits intomainfrom
Conversation
Greptile SummaryThis PR migrates the shared task ledger from full-snapshot JSON persistence to incremental SQLite storage, and adds a new Two design points are worth a look before merging:
Confidence Score: 5/5Safe to merge; all remaining findings are P2 style/UX concerns that do not affect data integrity or correctness. No P0 or P1 bugs found. The SQLite store is correctly implemented with WAL mode, proper index coverage, autocommit for individual row ops, and a transactional full-snapshot fallback. The two P2 items (redundant missing_cleanup for lost tasks, summary counted on filtered set) are operator UX concerns that do not break any functional path. src/tasks/task-registry.audit.ts (missing_cleanup double-flag for lost tasks) and src/commands/tasks.ts (audit summary scoped to filtered set).
|
| Filename | Overview |
|---|---|
| src/tasks/task-registry.store.sqlite.ts | New SQLite store using node:sqlite with WAL mode, individual row upsert/delete helpers, a full-snapshot fallback wrapped in BEGIN IMMEDIATE, and useful indices. Singleton pattern handled correctly with lazy re-open after close. |
| src/tasks/task-registry.audit.ts | New read-only audit module. The missing_cleanup check fires for lost tasks as well, creating a redundant second finding. Sorting and summarization logic are otherwise sound. |
| src/commands/tasks.ts | Adds tasksAuditCommand with severity/code/limit filtering. Summary counts are computed from the post-filter set rather than the full audit universe, which can be misleading when filters are active. |
| src/tasks/task-registry.store.ts | Store interface extended with optional upsertTask/deleteTask/close hooks. Default store swapped to SQLite. resetTaskRegistryRuntimeForTests now correctly closes the SQLite handle before resetting. |
| src/tasks/task-registry.ts | persistTaskRegistry() calls replaced with granular persistTaskUpsert/persistTaskDelete helpers using incremental SQLite ops when available, falling back to full snapshot for custom stores. Logic is correct. |
| src/tasks/task-registry.paths.ts | Path helpers extracted from the deleted JSON store. Test-env isolation via VITEST/NODE_ENV check is preserved correctly. |
| src/infra/node-sqlite.ts | Thin wrapper that dynamically requires node:sqlite and surfaces a clear actionable error if the builtin module is missing. |
| src/tasks/task-registry.store.json.ts | Deleted as part of the JSON-to-SQLite migration. Expected removal. |
Comments Outside Diff (1)
-
src/tasks/task-registry.audit.ts, line 643-681 (link)losttasks generate redundantmissing_cleanupfindingA task with
status === "lost"will always produce two separate findings in the same audit pass: onelostfinding (error) and onemissing_cleanupfinding (warn), because"lost"is neither"queued"nor"running"and a lost task is unlikely to havecleanupAfterset.The
missing_cleanupwarning adds noise without actionable signal in this case — the operator should address thelostroot cause first. Consider excludinglost(and similarlyabandoned/expired) statuses from themissing_cleanupcheck, or at least documenting the intentional double-flagging.Prompt To Fix With AI
This is a comment left during a code review. Path: src/tasks/task-registry.audit.ts Line: 643-681 Comment: **`lost` tasks generate redundant `missing_cleanup` finding** A task with `status === "lost"` will always produce two separate findings in the same audit pass: one `lost` finding (error) and one `missing_cleanup` finding (warn), because `"lost"` is neither `"queued"` nor `"running"` and a lost task is unlikely to have `cleanupAfter` set. The `missing_cleanup` warning adds noise without actionable signal in this case — the operator should address the `lost` root cause first. Consider excluding `lost` (and similarly `abandoned` / `expired`) statuses from the `missing_cleanup` check, or at least documenting the intentional double-flagging. How can I resolve this? If you propose a fix, please make it concise.
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/tasks/task-registry.audit.ts
Line: 643-681
Comment:
**`lost` tasks generate redundant `missing_cleanup` finding**
A task with `status === "lost"` will always produce two separate findings in the same audit pass: one `lost` finding (error) and one `missing_cleanup` finding (warn), because `"lost"` is neither `"queued"` nor `"running"` and a lost task is unlikely to have `cleanupAfter` set.
The `missing_cleanup` warning adds noise without actionable signal in this case — the operator should address the `lost` root cause first. Consider excluding `lost` (and similarly `abandoned` / `expired`) statuses from the `missing_cleanup` check, or at least documenting the intentional double-flagging.
```suggestion
if (
task.status !== "queued" &&
task.status !== "running" &&
task.status !== "lost" &&
typeof task.cleanupAfter !== "number"
) {
findings.push(
createFinding({
severity: "warn",
code: "missing_cleanup",
task,
ageMs,
detail: "terminal task is missing cleanupAfter",
}),
);
}
```
How can I resolve this? If you propose a fix, please make it concise.
---
This is a comment left during a code review.
Path: src/commands/tasks.ts
Line: 329
Comment:
**Audit summary reflects filtered set, not the total universe**
`summarizeTaskAuditFindings` is called on the already-filtered `findings` list. This means the printed header (`Task audit: N findings · M errors · P warnings`) counts only what matched the `--severity`/`--code` filters, not the full audit. An operator using `--severity error` to triage will see "2 findings · 2 errors · 0 warnings" even if there are 10 warnings in the registry, which can give a misleading picture of overall health.
Consider computing the summary from the full unfiltered findings:
```suggestion
const allFindings = listTaskAuditFindings();
const findings = allFindings.filter((finding) => {
if (severityFilter && finding.severity !== severityFilter) {
return false;
}
if (codeFilter && finding.code !== codeFilter) {
return false;
}
return true;
});
const limit = typeof opts.limit === "number" && opts.limit > 0 ? opts.limit : undefined;
const displayed = limit ? findings.slice(0, limit) : findings;
const summary = summarizeTaskAuditFindings(allFindings);
```
How can I resolve this? If you propose a fix, please make it concise.Reviews (1): Last reviewed commit: "Merge branch 'main' into feat/task-sqlit..." | Re-trigger Greptile
| const limit = typeof opts.limit === "number" && opts.limit > 0 ? opts.limit : undefined; | ||
| const displayed = limit ? findings.slice(0, limit) : findings; | ||
| const summary = summarizeTaskAuditFindings(findings); | ||
|
|
There was a problem hiding this comment.
Audit summary reflects filtered set, not the total universe
summarizeTaskAuditFindings is called on the already-filtered findings list. This means the printed header (Task audit: N findings · M errors · P warnings) counts only what matched the --severity/--code filters, not the full audit. An operator using --severity error to triage will see "2 findings · 2 errors · 0 warnings" even if there are 10 warnings in the registry, which can give a misleading picture of overall health.
Consider computing the summary from the full unfiltered findings:
| const allFindings = listTaskAuditFindings(); | |
| const findings = allFindings.filter((finding) => { | |
| if (severityFilter && finding.severity !== severityFilter) { | |
| return false; | |
| } | |
| if (codeFilter && finding.code !== codeFilter) { | |
| return false; | |
| } | |
| return true; | |
| }); | |
| const limit = typeof opts.limit === "number" && opts.limit > 0 ? opts.limit : undefined; | |
| const displayed = limit ? findings.slice(0, limit) : findings; | |
| const summary = summarizeTaskAuditFindings(allFindings); |
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/commands/tasks.ts
Line: 329
Comment:
**Audit summary reflects filtered set, not the total universe**
`summarizeTaskAuditFindings` is called on the already-filtered `findings` list. This means the printed header (`Task audit: N findings · M errors · P warnings`) counts only what matched the `--severity`/`--code` filters, not the full audit. An operator using `--severity error` to triage will see "2 findings · 2 errors · 0 warnings" even if there are 10 warnings in the registry, which can give a misleading picture of overall health.
Consider computing the summary from the full unfiltered findings:
```suggestion
const allFindings = listTaskAuditFindings();
const findings = allFindings.filter((finding) => {
if (severityFilter && finding.severity !== severityFilter) {
return false;
}
if (codeFilter && finding.code !== codeFilter) {
return false;
}
return true;
});
const limit = typeof opts.limit === "number" && opts.limit > 0 ? opts.limit : undefined;
const displayed = limit ? findings.slice(0, limit) : findings;
const summary = summarizeTaskAuditFindings(allFindings);
```
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d2f55dfc35
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| mkdirSync(resolveTaskRegistryDir(process.env), { recursive: true }); | ||
| const { DatabaseSync } = requireNodeSqlite(); | ||
| const db = new DatabaseSync(pathname); |
There was a problem hiding this comment.
Harden SQLite task store file and directory modes
The SQLite task store now creates tasks/ with mkdirSync(..., { recursive: true }) and opens runs.sqlite without enforcing permissions, so access falls back to the process umask (often world-readable defaults like 0755/0644). The previous JSON persistence path explicitly wrote 0700 directories and 0600 files, so this is a regression: on shared machines or permissive OPENCLAW_STATE_DIR paths, other local users can read task metadata (including session identifiers and task details). Please set explicit restrictive modes for both the directory and database file after creation.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 0fb3862520
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if ( | ||
| task.status !== "queued" && | ||
| task.status !== "running" && | ||
| typeof task.cleanupAfter !== "number" |
There was a problem hiding this comment.
Exclude synthetic lost tasks from missing_cleanup checks
listTaskAuditFindings() defaults to reconcileInspectableTasks(), which can return projected lost tasks before maintenance has persisted cleanupAfter (see reconcileTaskRecordForOperatorInspection in src/tasks/task-registry.maintenance.ts). This condition treats every non-queued/non-running task without cleanupAfter as missing_cleanup, so projected lost tasks are double-reported as both lost and missing_cleanup, creating noisy false positives in tasks audit output.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ca6d601fe3
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| loadSnapshot: loadTaskRegistrySnapshotFromSqlite, | ||
| saveSnapshot: saveTaskRegistrySnapshotToSqlite, |
There was a problem hiding this comment.
Keep a fallback when selecting the default task store
Making the default registry backend SQLite-only here means every task-registry code path now hard-depends on node:sqlite; in Bun-based runs this module is not available (No such built-in module: node:sqlite), so task-related commands and background-task paths fail instead of degrading gracefully as they did with JSON persistence. This is a functional regression for supported Bun workflows, so the default store selection should include a runtime-capability fallback.
Useful? React with 👍 / 👎.
|
🤖 We're reviewing this PR with Aisle We're running a security check on the changes in this PR now. This usually takes a few minutes. ⌛ Progress:
Latest run failed. Keeping previous successful results. Trace ID: Last updated on: 2026-03-30T09:47:27Z |
…7361) * feat(tasks): move task ledger to sqlite * feat(tasks): add task run audit command * style(tasks): normalize audit command formatting * fix(tasks): address audit summary and sqlite perms * fix(tasks): avoid duplicate lost audit findings
…7361) * feat(tasks): move task ledger to sqlite * feat(tasks): add task run audit command * style(tasks): normalize audit command formatting * fix(tasks): address audit summary and sqlite perms * fix(tasks): avoid duplicate lost audit findings
…7361) * feat(tasks): move task ledger to sqlite * feat(tasks): add task run audit command * style(tasks): normalize audit command formatting * fix(tasks): address audit summary and sqlite perms * fix(tasks): avoid duplicate lost audit findings
Summary
openclaw tasks auditfor stale, lost, delivery-failed, and inconsistent task runs.AI-assisted: yes
Testing: fully tested on the touched surface
Change Type (select all)
Scope (select all touched areas)
Linked Issue/PR
Root Cause / Regression History (if applicable)
git blame, prior PR, issue, or refactor if known): this follows the unreleased shared task-run registry work and the status-pressure follow-up merged in feat(status): surface task run pressure #57350.Regression Test Plan (if applicable)
src/tasks/task-registry.store.test.tssrc/tasks/task-registry.audit.test.tssrc/commands/tasks.test.tssrc/cli/program/register.status-health-sessions.test.tssrc/tasks/task-registry.test.tssrc/tasks/task-registry.test.tscovers the shared lifecycle semantics.User-visible / Behavior Changes
openclaw tasksnow reads from a SQLite-backed shared task ledger.openclaw tasks auditreports stale, lost, delivery-failed, missing-cleanup, and timestamp-inconsistent task runs.Diagram (if applicable)
Security Impact (required)
Yes, explain risk + mitigation: task persistence now uses a local SQLite file instead of a JSON snapshot. Scope remains local-only under the state dir, with the same task data as before and no new remote access.Repro + Verification
Environment
Steps
openclaw tasks.openclaw tasks auditagainst stale or broken task fixtures.Expected
Actual
Evidence
Human Verification (required)
pnpm test -- src/tasks/task-registry.store.test.ts src/tasks/task-registry.audit.test.ts src/commands/tasks.test.ts src/cli/program/register.status-health-sessions.test.tspnpm test -- src/tasks/task-registry.test.tspnpm buildpnpm checkReview Conversations
Compatibility / Migration
Risks and Mitigations
node:sqlitein the runtime.