feat: expose prompt-cache runtime context to context engines (#62179)
* Context engine: plumb prompt cache runtime context
Add a typed prompt-cache payload to the context-engine runtime context and populate it from the embedded runner's resolved retention, last-call usage, cache-break observation, and cache-touch metadata. Also pass the same payload through the retry compaction runtime context when a run attempt already has it.
Regeneration-Prompt: |
Expose OpenClaw prompt-cache telemetry to context engines in a narrow,
additive way without changing compaction policy. Keep the public change on
the OpenClaw side only: add a typed promptCache payload to the context-engine
runtime context, thread it into afterTurn, and also into compact where the
existing run loop already has the data cheaply available.
Use OpenClaw's resolved cache retention, not raw config. Use last-call usage
for the new payload, not accumulated retry or tool-loop totals. Reuse the
existing prompt-cache observability result and tracked change causes instead
of inventing a new heuristic. If cache-touch metadata is already available
from the cache-TTL bookkeeping, include it; do not invent expiry timestamps
for providers where OpenClaw cannot know them confidently.
Keep the interface backward-compatible for engines that ignore the new field.
Add focused tests around the existing attempt/context-engine helpers and the
compaction runtime-context propagation path rather than broad new integration
coverage.
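A minimal TypeScript sketch of what an additive, backward-compatible promptCache payload on the context-engine runtime context could look like. All field and type names here are illustrative assumptions, not OpenClaw's actual interface; the only facts taken from the commit are that the payload carries resolved retention, last-call usage, cache-break observation, and optional cache-touch metadata, and that engines ignoring the field keep working.

```typescript
// Hypothetical payload shape; names are assumptions for illustration.
interface PromptCacheRuntimeContext {
  resolvedRetention: "5m" | "1h" | "none"; // OpenClaw's resolved retention, not raw config
  lastCallUsage: {                          // last-call usage, not accumulated retry totals
    inputTokens: number;
    cacheReadTokens: number;
    cacheWriteTokens: number;
  };
  cacheBreak?: { cause: string };           // reused from existing observability, not a new heuristic
  cacheTouch?: { lastTouchedAtMs: number }; // only when cache-TTL bookkeeping already tracks it
}

// The field is optional, so engines that ignore it are unaffected.
interface ContextEngineRuntimeContext {
  promptCache?: PromptCacheRuntimeContext;
}

function summarizePromptCache(ctx: ContextEngineRuntimeContext): string {
  if (!ctx.promptCache) return "no prompt-cache telemetry";
  const u = ctx.promptCache.lastCallUsage;
  return `retention=${ctx.promptCache.resolvedRetention} cacheRead=${u.cacheReadTokens}/${u.inputTokens}`;
}
```

Making the field optional is what keeps the change additive: both afterTurn and compact can pass it when the run loop has the data cheaply, and older engines see no difference.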
* Agents: fix prompt-cache afterTurn usage
Regeneration-Prompt: |
Fix PR #62179 so context-engine prompt-cache metadata uses only the current attempt's usage. The review comment pointed out that early exits could reuse a prior turn's assistant usage when no new assistant message was produced. Restrict the prompt-cache lastCallUsage lookup to assistant messages added after prePromptMessageCount, and fall back to current-attempt usage totals instead of stale snapshot history. Also repair the PR's new context-engine test typings and add a regression test for the stale prior-turn case. Two import-only breakages in doctor-state-integrity and config/talk already existed on origin/main, but they blocked build/check and the gateway-watch regression harness, so include the minimum unblocking imports as well.
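The fix above can be sketched as follows. The message and usage shapes are assumptions; the logic shown is the behavior the commit describes: scan only messages appended during the current attempt, and fall back to current-attempt totals rather than stale prior-turn state.

```typescript
// Illustrative shapes; not OpenClaw's real types.
interface Msg {
  role: "user" | "assistant" | "tool";
  usage?: { inputTokens: number; outputTokens: number };
}

// Only assistant messages at index >= prePromptMessageCount belong to the
// current attempt; scanning earlier ones is how an early exit could pick up
// a prior turn's stale usage.
function lastCallUsage(
  messages: Msg[],
  prePromptMessageCount: number,
  attemptTotals: { inputTokens: number; outputTokens: number },
): { inputTokens: number; outputTokens: number } {
  for (let i = messages.length - 1; i >= prePromptMessageCount; i--) {
    const m = messages[i];
    if (m.role === "assistant" && m.usage) return m.usage;
  }
  // No new assistant message this attempt: use current-attempt totals,
  // not snapshot history from earlier turns.
  return attemptTotals;
}
```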
* Agents: document prompt-cache context
* Agents: address prompt-cache review feedback
* Doctor: drop unused isRecord import
- Memory/wiki: add an opt-in `context.includeCompiledDigestPrompt` flag so memory prompt supplements can append a compact compiled wiki snapshot for legacy prompt assembly and context engines that explicitly consume memory prompt sections. Thanks @vincentkoc.
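A small sketch of how an opt-in flag like this typically gates the supplement; the config shape and helper name here are hypothetical, and only `context.includeCompiledDigestPrompt` comes from the entry above.

```typescript
// Hypothetical context shape; only the flag name is from the changelog entry.
interface MemoryContext {
  includeCompiledDigestPrompt?: boolean;
}

// Off by default: only callers that explicitly opt in get the compact
// compiled wiki snapshot appended to the memory prompt supplement.
function buildMemorySupplement(base: string, ctx: MemoryContext, compiledDigest: string): string {
  if (!ctx.includeCompiledDigestPrompt) return base;
  return `${base}\n\n${compiledDigest}`;
}
```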
- Plugin SDK/context engines: pass `availableTools` and `citationsMode` into `assemble()`, and expose `buildMemorySystemPromptAddition(...)` so non-legacy context engines can adopt the active memory prompt path without reimplementing it. Thanks @vincentkoc.
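A hedged sketch of the API shape this entry describes: `assemble()` receiving `availableTools` and `citationsMode`, and a reusable `buildMemorySystemPromptAddition(...)`. Parameter types and return values are assumptions; only the two parameter names and the function names come from the entry.

```typescript
// Hypothetical parameter shapes; the SDK's real signatures may differ.
interface AssembleParams {
  availableTools: string[];      // tool names visible to the engine
  citationsMode: "on" | "off";   // active citations setting
}

// Exposed so non-legacy engines can reuse the active memory prompt path
// instead of reimplementing it.
function buildMemorySystemPromptAddition(params: AssembleParams): string {
  return `Memory tools available: ${params.availableTools.join(", ")} (citations ${params.citationsMode})`;
}

function assemble(params: AssembleParams): string {
  // A non-legacy engine delegates to the shared helper.
  return buildMemorySystemPromptAddition(params);
}
```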
- Providers/inferrs: add string-content compatibility for stricter OpenAI-compatible chat backends, document `inferrs` setup with a full config example, and add troubleshooting guidance for local backends that pass direct probes but fail on full agent-runtime prompts.
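String-content compatibility for strict OpenAI-compatible backends usually means flattening structured content parts into a plain string. A minimal sketch, assuming the OpenAI-style `{ type: "text", text }` part shape; the function name is illustrative.

```typescript
// OpenAI-style text content part (assumed shape).
type ContentPart = { type: "text"; text: string };

// Stricter backends reject array-form message content; collapse text parts
// into a single string and pass plain strings through unchanged.
function toStringContent(content: string | ContentPart[]): string {
  if (typeof content === "string") return content;
  return content
    .filter((p) => p.type === "text")
    .map((p) => p.text)
    .join("");
}
```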
- Agents/context engine: expose prompt-cache runtime context to context engines and keep current-turn prompt-cache usage aligned with the active attempt instead of stale prior-turn assistant state. (#62179) Thanks @jalehman.