Expose assembly provenance/cache policy and treat openai-codex cache windows as mutation-sensitive #534

@100yenadmin

Description

Summary

lossless-claw/LCM appears to assemble prompt-ready compact context correctly, but the current context-engine result surface is too coarse for the host to reason safely about assembly provenance, fallback mode, raw-history debt, and prompt-cache sensitivity. In the observed OpenClaw integration, this contributed to ambiguity around whether the assembled context should be prompt-authoritative and whether Codex prompt-cache windows should delay prompt-mutating deferred compaction.

This is a companion hardening issue to the OpenClaw host-side bug filed as openclaw/openclaw#74229. The primary live reset loop appears OpenClaw-side, but LCM can make the contract harder to misuse.

Environment observed

  • Installed plugin: @martian-engineering/[email protected]
  • Active OpenClaw model: openai-codex/gpt-5.5
  • Active context window: 258000
  • LCM config highlights:
    • freshTailCount=64
    • freshTailMaxTokens=16000
    • proactiveThresholdCompactionMode=deferred
    • cacheAwareCompaction.enabled=true
    • cacheAwareCompaction.cacheTTLSeconds=300
    • promptAwareEviction=false

Confidence matrix

  • 0.88 - issue: LCM assembly is intended to produce prompt-ready compact context, but the returned SDK surface does not expose enough metadata for hosts to distinguish authoritative assembly from fallback live context or emergency recovery.
  • 0.84 - issue: prompt-cache mutation sensitivity is still Anthropic/Claude-shaped in the current source lineage, while live telemetry shows openai-codex/gpt-5.5 has active hot cache signals.
  • 0.84 - live evidence: DB telemetry shows active openai-codex/gpt-5.5, cache_state=hot, retention=long, last_observed_cache_read=59904, and prompt token telemetry.
  • 0.75 - contributing cause: session reset/rotation in the host makes exact at-failure LCM attribution harder because the DB tracks the active sessionKey through reset rather than preserving a clean per-session assembly snapshot.
  • 0.70 - possible cause: the installed plugin is bundled/minified and does not preserve source-level provenance, making live debugging harder when multiple local PR worktrees exist.
  • 0.65 - possible cause: after a host reset, continuity depends on LCM summaries and fresh tail; if exact artifacts are not preserved in summaries or recall misses, the user experiences partial forgetfulness even though raw data exists in the DB.

Source evidence

LCM assembly is intended to build model-ready context:

  • src/assembler.ts defines assembled messages as ordered messages ready for the model.
  • The assembler builds context under tokenBudget, protects fresh tail, drops older items to fit, and returns chronological messages.
  • Summaries are converted into synthetic user messages with structured XML wrappers before being returned.
  • OpenClaw's SDK contract also states ContextEngine.assemble() returns ordered messages ready for the model.

The current result surface is coarse:

  • LCM returns only { messages, estimatedTokens } through the context-engine interface.
  • Richer internal information such as summary/raw counts, selection mode, fallback reason, cache-aware state, fresh-tail size, and raw-history debt is logged/debug-only rather than returned to the host.
  • ContextEngineInfo only exposes id/name/version plus broad capabilities like ownsCompaction and turnMaintenanceMode; there is no capability indicating whether assembly is prompt-authoritative, whether the result is a fallback, or what host precheck behavior is safe.
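To make the coarseness concrete, here is a hedged sketch of roughly what the current result surface gives a host to work with. The names (`ContextEngineResult`, `AssembledMessage`, `fitsWindow`) are illustrative, not the actual SDK identifiers:

```typescript
// Approximate shape of the current coarse result surface (illustrative names).
type AssembledMessage = { role: "user" | "assistant" | "system"; content: string };

type ContextEngineResult = {
  messages: AssembledMessage[];
  estimatedTokens: number;
  // Nothing here tells the host whether this is authoritative assembly,
  // a fallback to live context, or emergency recovery.
};

// With only a token estimate, a host precheck can do nothing smarter than
// size math against the context window:
function fitsWindow(result: ContextEngineResult, contextWindow: number): boolean {
  return result.estimatedTokens <= contextWindow;
}
```

With only this surface, a fallback-live result and an authoritative compact assembly are indistinguishable to the host, which is the root of the misclassification risk described below.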

Prompt-cache sensitivity appears too provider-specific:

  • Current source lineage has generic cache telemetry and hot-cache handling, but prompt-mutating deferred compaction delay is Anthropic/Claude-family-specific.
  • Live telemetry shows openai-codex/gpt-5.5 with active hot cache and retention=long, so Codex should likely be treated as mutation-sensitive when runtime telemetry says a cache window is active.
  • Cache-write-only usage should probably count as an active cache touch/window rather than defaulting to cold, when lastCacheTouchAt/retention are available.
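A minimal sketch of the classification suggested above, assuming telemetry fields shaped like the ones in this report (`cacheRead`, `cacheWrite`, `lastCacheTouchAt`, `retention` are assumptions, not LCM's actual schema):

```typescript
// Hedged sketch: classify whether telemetry indicates an active,
// mutation-sensitive prompt-cache window. Field names are assumptions.
type CacheTelemetry = {
  cacheRead: number;         // tokens read from the prompt cache
  cacheWrite: number;        // tokens written to the prompt cache
  lastCacheTouchAt?: number; // epoch ms of last observed cache activity
  retention: "long" | "short" | "none";
};

function isActiveCacheWindow(t: CacheTelemetry, nowMs: number, ttlMs: number): boolean {
  if (t.retention === "none") return false;
  // Cache-write-only usage still counts as an active touch, not "cold".
  const touched = t.cacheRead > 0 || t.cacheWrite > 0;
  const recent = t.lastCacheTouchAt !== undefined && nowMs - t.lastCacheTouchAt <= ttlMs;
  return touched && recent;
}
```

Under this sketch the observed telemetry (read=59904, write=0, retention=long, recent touch) classifies as an active window, which is the behavior this issue argues for.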

Live evidence

Read-only DB inspection showed:

conversations: 755
messages: 275264
summaries: 3558
context_items: 35596
conversation_compaction_telemetry: 201

Active main telemetry included:

provider=openai-codex
model=gpt-5.5
cache_state=hot
retention=long
last_observed_cache_read=59904
last_observed_cache_write=0
last_observed_prompt_token_count=60817
last_cache_touch_at=2026-04-29T07:48:04.137Z

Active main context after recovery was compact relative to raw history:

  • raw stored history: tens of thousands of messages and about 20M stored message tokens for the active main conversation
  • current context items: compact summary + recent message set
  • maintenance row: reason=cold-cache-catchup, token_budget=258000, current_token_count=87767, completed successfully

OpenClaw logs in the same incident window show repeated openai-codex/gpt-5.5 overflow prechecks and eventual session reset, so the LCM side needs enough metadata to help the host avoid misclassifying raw-history debt as prompt-admission failure.

Possible causes to investigate

  1. LCM does not expose result-level provenance: assembled, fallback-live, fallback-empty, fallback-no-user-turn, emergency, etc.
  2. LCM does not expose whether the assembled result should be treated as prompt-authoritative by host prechecks.
  3. LCM does not expose raw-history debt separately from prompt-ready assembled size.
  4. LCM does not expose enough cache-policy metadata for hosts to avoid cache-breaking prompt mutations.
  5. Prompt-cache mutation delay is Anthropic/Claude-family-specific even though Codex cache telemetry is observed.
  6. Cache-write-only or cache-touch-only telemetry may be classified as cold when it should represent an active, mutation-sensitive cache window.
  7. Session-key continuity through host resets makes it hard to reconstruct exact per-session assembly state after incidents.
  8. Summary/fresh-tail continuity can be partial after host reset if exact artifacts are not in the assembled summaries or if recall tools miss.

Suggested fixes

Assembly result metadata

Extend the context-engine result or add an optional LCM-specific metadata field with values like:

type LcmAssemblyMetadata = {
  source: "assembled" | "fallback-live" | "fallback-empty" | "fallback-no-user-turn" | "emergency";
  promptAuthoritative: boolean;
  estimatedTokens: number;
  rawHistoryEstimatedTokens?: number;
  rawHistoryDebtTokens?: number;
  contextItemCount?: number;
  summaryCount?: number;
  rawMessageCount?: number;
  freshTailCount?: number;
  freshTailTokens?: number;
  cacheState?: "hot" | "cold" | "unknown";
  cacheMutationSensitive?: boolean;
  fallbackReason?: string;
};

The exact shape can differ, but the host needs to know whether it should treat LCM assembly as the prompt-admission authority.
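As a usage sketch, this is roughly how a host precheck might consume such metadata. The decision logic and names (`precheck`, `PrecheckDecision`) are illustrative, not a prescribed host contract:

```typescript
// Hedged sketch: host-side admission decision driven by assembly metadata.
// Only the fields needed for the decision are reproduced here.
type LcmAssemblyMetadata = {
  source: "assembled" | "fallback-live" | "fallback-empty" | "fallback-no-user-turn" | "emergency";
  promptAuthoritative: boolean;
  estimatedTokens: number;
  rawHistoryDebtTokens?: number;
};

type PrecheckDecision = "admit" | "retry-compaction" | "host-manages";

function precheck(meta: LcmAssemblyMetadata, contextWindow: number): PrecheckDecision {
  if (meta.promptAuthoritative && meta.source === "assembled") {
    // Trust LCM's compact assembly; raw-history debt is bookkeeping,
    // not part of prompt-admission math.
    return meta.estimatedTokens <= contextWindow ? "admit" : "retry-compaction";
  }
  // Fallback/emergency results: host falls back to its own overflow handling.
  return "host-manages";
}
```

The key property is that a large `rawHistoryDebtTokens` (tens of millions in the observed DB) never influences admission when the assembly is authoritative, which directly addresses the misclassification described in the live evidence.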

Cache policy

Replace the Anthropic-only prompt-mutation sensitivity check with a provider/cache-policy helper that includes openai-codex when telemetry shows an active prompt-cache window.

Suggested test cases:

  • openai-codex + cache read + retention long => mutation-sensitive hot window
  • openai-codex + cache write / last cache touch + retention long => mutation-sensitive hot window
  • explicit cache break observation => cold / no delay
  • retention none => no delay
  • Anthropic/Claude behavior remains covered
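The helper and the test cases above can be sketched together as follows. Provider ids, field names, and the `explicitCacheBreak` flag are assumptions mirroring the telemetry in this report, not LCM's actual API:

```typescript
// Hedged sketch of the provider/cache-policy helper suggested above.
type CacheObservation = {
  provider: string;
  cacheRead: number;
  cacheWrite: number;
  lastCacheTouchAt?: number;
  retention: "long" | "short" | "none";
  explicitCacheBreak?: boolean; // an observed cache break ends the window
};

// Providers whose prompt caches warrant delaying prompt-mutating compaction.
const MUTATION_SENSITIVE_PROVIDERS = new Set(["anthropic", "openai-codex"]);

function shouldDelayPromptMutation(o: CacheObservation): boolean {
  if (!MUTATION_SENSITIVE_PROVIDERS.has(o.provider)) return false;
  if (o.explicitCacheBreak) return false;   // explicit break => cold, no delay
  if (o.retention === "none") return false; // no retention => no delay
  // Read, write, or a recorded cache touch all count as an active window.
  return o.cacheRead > 0 || o.cacheWrite > 0 || o.lastCacheTouchAt !== undefined;
}
```

Each suggested test case maps to one branch: Codex read or write with long retention is mutation-sensitive, an explicit break or `retention=none` is not, and Anthropic/Claude behavior is preserved by keeping it in the provider set.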

Observability

Persist enough assembly snapshots or telemetry to reconstruct incident-time state after a host session reset:

  • last assembly source/provenance
  • last assembled token estimate
  • last raw-history estimate/debt
  • last fallback reason
  • last selected context item counts
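A minimal sketch of what such a snapshot record could look like, assuming an in-memory ring per session; column names are illustrative, and a real implementation would persist rows next to `conversation_compaction_telemetry` rather than in memory:

```typescript
// Hedged sketch: incident-survivable assembly snapshot (illustrative schema).
type AssemblySnapshot = {
  sessionKey: string;
  at: string;                   // ISO timestamp of the assembly
  source: string;               // last assembly source/provenance
  estimatedTokens: number;      // last assembled token estimate
  rawHistoryDebtTokens: number; // last raw-history debt estimate
  fallbackReason?: string;
  contextItemCount: number;
};

// In-memory ring keyed by session key; persistence is out of scope here.
const snapshots = new Map<string, AssemblySnapshot[]>();

function recordSnapshot(s: AssemblySnapshot, keep = 10): void {
  const list = snapshots.get(s.sessionKey) ?? [];
  list.push(s);
  snapshots.set(s.sessionKey, list.slice(-keep)); // retain only the newest rows
}
```

Because the ring is keyed by `sessionKey` and bounded, post-incident inspection can recover the last few assemblies even after the host rotates or resets the session.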

Impact

Without richer metadata, hosts can accidentally treat LCM's raw transcript debt as prompt-admission failure, retry compaction unnecessarily, break prompt-cache locality, or reset the session even when LCM has already assembled a compact prompt-ready context. The user-visible result is context blowup, cache disruption, and partial forgetfulness after reset.
