feat(openclaw): skills-based memory architecture with batched extraction #4624
Merged: whysosaket merged 22 commits into main on Apr 1, 2026
The previous exclude rule for credentials was too generic and the extraction LLM failed to recognize credentials embedded in config blocks, setup logs, and tool output (29% of test memories contained leaked credentials). Give the LLM concrete patterns to watch for (sk-, m0-, ak_, ghp_, bot tokens, bearer tokens, webhook URLs, pairing codes) and WRONG/RIGHT examples showing what to store instead of the raw secret. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…ia infer=false Replace passive auto-capture with skills-based memory extraction. The agent evaluates each conversation turn and stores facts directly to mem0 with infer=false, bypassing the extraction LLM entirely. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…ries array memory_store now accepts a `facts` array parameter. Multiple facts from one conversation turn are sent as a single provider.add() call with deduced_memories=[...all facts...], reducing N API calls to 1. Removed maxFactsPerTurn config — the right abstraction is one call per turn with all facts batched, not a rate limiter on call count. Updated inline protocol to instruct the agent to always use the facts array and call memory_store exactly once per turn. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
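The batching described above can be sketched as a small payload builder. This is a minimal illustration, not the plugin's actual code: `AddPayload` and `buildAddPayload` are hypothetical names, and the only details taken from the commit message are `infer=false` and the `deduced_memories` array.

```typescript
// Hypothetical payload shape; only `infer` and `deduced_memories`
// come from the commit message above.
interface AddPayload {
  infer: false;
  deduced_memories: string[];
}

// Build one payload for all facts from a turn, so the caller can make
// a single provider.add() call instead of one call per fact.
function buildAddPayload(facts: string[]): AddPayload | null {
  const cleaned = facts
    .map((fact) => fact.trim())
    .filter((fact) => fact.length > 0);
  // Nothing worth storing: signal the caller to skip the API call.
  if (cleaned.length === 0) return null;
  return { infer: false, deduced_memories: cleaned };
}
```

Returning `null` for an empty batch keeps the "one call per turn" rule from turning into "one call per turn even when there is nothing to store".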
- Switch from before_agent_start to before_prompt_build for context injection. Static memory protocol goes in prependSystemContext (provider-cacheable), dynamic recalled memories in prependContext.
- Use message_received hook to capture clean user message content without OpenClaw sender metadata prefix. Eliminates the regex sanitizer: clean queries at the source, not at search time.
- Remove maxFactsPerTurn from types and plugin schema.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
make install — checks deps, installs OpenClaw if missing, builds plugin, removes existing version, links local build, patches openclaw.json with all required settings (tools.profile, session-memory, skills config), backs up original config, and restarts gateway. Also: make status, make logs, make restart, make uninstall, make clean. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Configure must run AFTER plugin link (link overwrites openclaw.json). Moved Python config logic to scripts/configure.py to avoid Make shell escaping issues with inline Python. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The agent now rewrites search queries for retrieval instead of passing raw user messages. The skill protocol teaches the agent to extract key concepts, remove conversational framing, and match the language of stored memories. Plugin-side sanitization (metadata stripping) remains for the auto-recall path, but all semantic query rewriting is the agent's responsibility. Updated: skill-loader.ts (inline protocol), recall-protocol.md (examples), recall.ts (comments clarifying responsibility boundary). Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Rewrote recall-protocol.md with structured 4-step rewriting process, 7 worked examples with explicit reasoning chains, anti-pattern catalog, and query length guidance. No ambiguity, no room for improvisation. Inline protocol (prependSystemContext) updated with WRONG/RIGHT pairs and compressed 4-step process for per-turn injection. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
ef9878e to 9e90078
Restructured the recall protocol with rigorous query construction process. Generalized all examples across domains (health, business, identity, operations, product, personal, technical) to prevent model anchoring on any single use case. 7 worked examples each show full 4-step reasoning chain. Failure pattern table with 7 anti-patterns and fixes. Action-oriented section headers for better model compliance. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
9e90078 to 7d80bf3
…categories
The agent can now pass structured filters to memory_search alongside the rewritten query. Filters narrow by time (created_at gte/lte), category, or metadata fields. Query handles semantic relevance, filters handle structural constraints.
Changes:
- types.ts: added filters to SearchOptions
- providers.ts: merge agent filters with base user_id/run_id filters using AND (agent filters extend, never override user scoping)
- index.ts: added filters param to memory_search tool with operator docs
- skill-loader.ts: inline protocol teaches filter construction with examples
- recall-protocol.md: full filter section with 6 worked examples, when to add vs when to omit, syntax reference
Operators: eq, ne, gt, gte, lt, lte, in, contains, icontains
Logical: AND, OR, NOT
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
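The AND-merge of base scoping filters and agent filters might look roughly like this. It is a sketch under assumptions: `Filter` and `mergeFilters` are illustrative names, and the nested `{ AND: [...] }` shape is assumed from the logical operators listed above rather than copied from providers.ts.

```typescript
// Hypothetical filter type: a plain object of field constraints or
// logical operators (AND / OR / NOT).
type Filter = Record<string, unknown>;

// Combine plugin-owned scoping (user_id / run_id) with agent-supplied
// filters. The base filter is always one branch of the AND, so agent
// filters can only narrow the result set, never widen the scope.
function mergeFilters(base: Filter, agentFilters?: Filter): Filter {
  if (!agentFilters || Object.keys(agentFilters).length === 0) {
    return base;
  }
  return { AND: [base, agentFilters] };
}

// Example: restrict a user's memories to those created in 2026 or later.
const exampleFilter = mergeFilters(
  { user_id: "u1" },
  { created_at: { gte: "2026-01-01" } },
);
```

Wrapping both sides in `AND`, rather than spreading the agent's keys over the base object, means an agent-supplied `user_id` cannot replace the plugin's own value.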
lastCleanUserMessage was process-global mutable state written by message_received and consumed by before_prompt_build. Under overlapping inbound turns, one session's clean message could shape another session's recall query, injecting wrong memories even with correct user_id scoping.
Fix: removed message_received hook and the shared variable entirely. Query sanitization now runs inline within before_prompt_build where ctx.sessionKey is available and execution is session-scoped. Uses sanitizeQuery() (exported from recall.ts) to strip OpenClaw metadata.
The inherited currentSessionId shared state (line 95, pre-existing in main branch v0.4.0) remains as a documented known limitation. That requires OpenClaw framework changes (session context in tool handlers) to fix properly.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…olicy
The facts array applied one category, importance, expiration_date, and immutable flag to every entry in deduced_memories. Mixed-category batches produced wrong retention policy: identity facts could inherit project TTLs, or project memories could become permanently immutable.
Fix: batch by policy, not by turn. All facts in one memory_store call must share the same category. If a turn produces facts in different categories, the agent makes one call per category (e.g., identity facts in one call, decision facts in another). Two categories = two API calls. This is correct and expected.
Changes:
- Tool schema: facts[] description states same-category requirement
- Tool schema: category description explains it determines retention policy
- Tool schema: importance marked as override (omit to use category default)
- Inline protocol: teaches "batch by category" with mixed-category example showing two separate calls
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
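The "batch by category" rule can be illustrated with a small grouping helper. In the PR the agent does this grouping itself, guided by the prompt; the function below is only a sketch of the resulting call shape, and `CategorizedFact`, `StoreCall`, and `batchByCategory` are hypothetical names.

```typescript
// Hypothetical input: one fact plus the category the agent assigned.
interface CategorizedFact {
  text: string;
  category: string;
}

// Hypothetical output: the arguments for one memory_store call.
interface StoreCall {
  category: string;
  facts: string[];
}

// Group facts so that each memory_store call carries exactly one
// category, and therefore exactly one retention policy.
function batchByCategory(items: CategorizedFact[]): StoreCall[] {
  const byCategory = new Map<string, string[]>();
  for (const { text, category } of items) {
    const bucket = byCategory.get(category) ?? [];
    bucket.push(text);
    byCategory.set(category, bucket);
  }
  // Map preserves insertion order, so calls come out in first-seen order.
  return Array.from(byCategory, ([category, facts]) => ({ category, facts }));
}
```

A turn with two identity facts and one decision fact yields two calls, matching the "two categories = two API calls" rule above.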
loadTriagePrompt() was a hardcoded inline protocol that ignored domain
overlays, category overrides, custom rules, and triage knobs from the
config surface. The plugin advertised a configurable skill system via
skills.domain, customRules, and categories, but the live
before_prompt_build path ran a separate prompt plane.
Fix: loadTriagePrompt() now calls loadSkill("memory-triage") as the
primary path, which merges all config-driven overlays (domain, categories,
custom rules, triage knobs). The inline protocol is now a minimal
fallback for when the SKILL.md file cannot be loaded.
The full SKILL.md content (with all overlays merged) goes into
prependSystemContext, which is provider-cacheable. Tool usage format,
batching rules, and search protocol are appended as operational sections.
One prompt plane. Config surface = what the model sees.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
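The primary-path-plus-fallback behaviour described above could be sketched as follows. The real `loadSkill()` signature is not shown in this PR text, so the loader is passed in as a parameter here; `FALLBACK_PROTOCOL` and `loadTriagePromptSketch` are illustrative names and the fallback string is a placeholder.

```typescript
// Placeholder for the minimal inline protocol used when SKILL.md
// cannot be loaded. The real text is not reproduced here.
const FALLBACK_PROTOCOL = "Minimal memory protocol (fallback).";

// `loadSkill` stands in for the plugin's loader, assumed to return the
// merged SKILL.md text (domain, categories, custom rules, triage knobs)
// or null, and possibly to throw when the file is missing.
function loadTriagePromptSketch(
  loadSkill: (name: string) => string | null,
): string {
  try {
    const skill = loadSkill("memory-triage");
    if (skill !== null && skill.trim().length > 0) {
      return skill; // primary path: config-driven prompt
    }
  } catch (_err) {
    // Fall through to the inline fallback below.
  }
  return FALLBACK_PROTOCOL;
}
```

Keeping the fallback minimal means there is still only one real prompt plane: the fallback exists for resilience, not as a second source of instructions.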
…ructions
Three fixes:
1. SKILL.md taught old memory_store(text, {metadata}) contract while
runtime expects facts[] array batched by category. Updated all 12
examples to use facts: [...], category: "..." format. Added Example 11
showing mixed-category turn with separate calls.
2. recall.enabled=false disabled the search path in index.ts but the
system prompt still included search/filter instructions. Now gated:
search instructions only injected when recall.enabled !== false.
3. Removed "MAXIMUM 3 operations per turn" from SKILL.md (replaced
with "BATCH BY CATEGORY" which is the actual contract).
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
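Fix 2 above, gating the search instructions on `recall.enabled !== false`, can be sketched like this. `RecallConfig` and `buildSystemContext` are hypothetical names, and the way sections are joined is an assumption.

```typescript
// Hypothetical subset of the recall config; only `enabled` matters here.
interface RecallConfig {
  enabled?: boolean;
}

// Assemble the system context. Search/filter instructions are appended
// only when recall is not explicitly disabled, so the prompt never
// describes a search path the runtime has switched off.
function buildSystemContext(
  skillText: string,
  searchInstructions: string,
  recall?: RecallConfig,
): string {
  const sections = [skillText];
  if (recall?.enabled !== false) {
    sections.push(searchInstructions);
  }
  return sections.join("\n\n");
}
```

Using `!== false` rather than a truthiness check treats a missing `enabled` key as "on", which matches the wording of the fix.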
…ime guard
1. Replaced every remaining memory_store("text") example in SKILL.md
entity-grouping section (lines 113-136) and consolidation example
(line 332) with facts: [...], category: "..." format. Zero legacy
examples remain (verified: grep returns empty).
2. Added runtime warning when multi-fact batch arrives without a
category. The prompt teaches batch-by-category but the runtime now
logs when the model disobeys, making violations observable without
silently applying wrong retention policy.
Note on memory_search references in SKILL.md: these appear in
update/consolidation examples where the agent needs to find an existing
memory before replacing it. This is correct even when recall.enabled=false
because that flag disables pre-turn auto-search, not the memory_search
tool itself. The agent can always search explicitly for update operations.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
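The runtime guard from item 2 above might look roughly like this. The function name, the boolean return value, and the warning text are assumptions; the source only states that a multi-fact batch without a category is logged.

```typescript
// Check a memory_store batch against the batch-by-category contract.
// Returns true when the batch is acceptable, false when a warning was
// emitted. The logger is injected so the check stays easy to test.
function checkBatchCategory(
  facts: string[],
  category: string | undefined,
  warn: (message: string) => void,
): boolean {
  if (facts.length > 1 && !category) {
    warn(
      `memory_store: ${facts.length} facts received without a category; ` +
        "retention policy may be wrong",
    );
    return false;
  }
  return true;
}
```

A single uncategorized fact passes silently in this sketch, since the rule above only covers multi-fact batches.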
Subagents were told not to store memories via system prompt, but
memory_store and memory_forget tool handlers did not enforce it.
A disobedient subagent tool call could write to a transient namespace
that is never read again, or delete memories from the parent namespace.
Fix: both memory_store and memory_forget now check
isSubagentSession(currentSessionId) and return an error response
before any provider call. The block is logged for observability.
This closes the gap between prompt-level instruction ("do not store")
and runtime enforcement. memory_search, memory_get, and memory_list
remain available to subagents (read-only access is safe and useful
for subagent context).
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
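The enforcement described above, checking the session before any provider call, can be sketched as a wrapper around write handlers. `ToolResult` and `guardWrite` are hypothetical, and the subagent predicate is injected because the real `isSubagentSession()` implementation is not shown in this PR text.

```typescript
// Hypothetical tool response shape.
interface ToolResult {
  ok: boolean;
  error?: string;
}

// Run `write` only for non-subagent sessions. For subagents, log the
// block and return an error before the provider is ever touched.
function guardWrite(
  toolName: string,
  sessionId: string | undefined,
  isSubagent: (id: string | undefined) => boolean,
  log: (message: string) => void,
  write: () => ToolResult,
): ToolResult {
  if (isSubagent(sessionId)) {
    log(`${toolName} blocked: subagent sessions are read-only`);
    return { ok: false, error: `${toolName} is not available to subagents` };
  }
  return write();
}
```

Read tools such as memory_search would simply not be wrapped, which keeps read-only access available to subagents.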
…tools
Three new tools to close the gap with mem0 MCP:
- memory_update(memoryId, text): Atomic in-place update that preserves history. Preferred over delete-then-store for corrections.
- memory_delete_all(confirm, userId): Bulk deletion with safety gate (confirm: true required). For user-requested memory resets.
- memory_history(memoryId): View edit history of a memory showing all changes over time with old/new values and timestamps.
All three tools enforce subagent write blocks. Both Platform and OSS providers implement the new methods. OSS history() degrades gracefully if the backend does not support it.
Plugin now exposes 8 tools (was 5): memory_search, memory_store, memory_get, memory_list, memory_forget, memory_update, memory_delete_all, memory_history
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The /new and /reset commands produce system prompts ("A new session was
started via /new or /reset. Run your Session Startup sequence...") that
were being sent to mem0 search as the recall query. This wasted API
calls and returned noise.
Fix: detect system/bootstrap prompts by content pattern and skip the
recall search. The memory protocol is still injected (prependSystemContext)
so the agent has the skill instructions, but no search fires for system
commands.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
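Detection by content pattern could be as simple as the sketch below. The single regex is derived from the prompt text quoted above; the actual pattern list in the plugin may be broader, and `BOOTSTRAP_PATTERNS` and `shouldSkipRecall` are illustrative names.

```typescript
// Patterns that identify system/bootstrap prompts. Only the /new and
// /reset message quoted in the commit is covered here (assumption:
// the real list may contain more entries).
const BOOTSTRAP_PATTERNS: RegExp[] = [
  /^A new session was started via \/new or \/reset\b/i,
];

// True when the prompt is a system/bootstrap message, in which case the
// recall search is skipped. The memory protocol is still injected
// elsewhere; only the search is suppressed.
function shouldSkipRecall(prompt: string): boolean {
  const text = prompt.trim();
  return BOOTSTRAP_PATTERNS.some((pattern) => pattern.test(text));
}
```

Anchoring the pattern at the start of the prompt avoids skipping recall for a user message that merely mentions /new or /reset.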
skills.recall.strategy now drives real behavior:
- "smart" (default): long-term search only, no session search. 1 plugin search per turn instead of 2. Agent handles additional search via memory_search tool.
- "manual": zero plugin searches. Agent controls all search. Skill protocol instructs the agent to search proactively at session start and when context is needed.
- "always": long-term + session search every turn (old behavior). 2 plugin searches per turn.
Also: skip recall entirely for system/bootstrap prompts (/new, /reset).
This addresses the feedback that the system was more search-heavy than the first-principles ideal of agent-controlled search. The default "smart" mode halves the always-on search cost while keeping baseline recall. "manual" mode gives full agent control.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…precedence
1. Skill protocol now teaches explicit scope selection:
   - "long-term" for user context (default, most common)
   - "session" for current conversation only
   - "all" only when both scopes truly needed
   Avoids unnecessary backend fan-out from scope="all".
2. Documented enabled/strategy precedence in types.ts: enabled=false overrides strategy. Strategy only consulted when enabled is not false.
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
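The strategy behaviour and the enabled/strategy precedence can be summarised in one function. This is a sketch: the type and function names are hypothetical, and it returns the scopes the plugin would search per turn rather than performing any search.

```typescript
type RecallStrategy = "smart" | "manual" | "always";
type RecallScope = "long-term" | "session";

// Hypothetical subset of the skills.recall config.
interface RecallSettings {
  enabled?: boolean;
  strategy?: RecallStrategy;
}

// Which scopes the plugin searches automatically on each turn.
// enabled=false wins over any strategy; strategy defaults to "smart".
function pluginSearchScopes(recall: RecallSettings = {}): RecallScope[] {
  if (recall.enabled === false) return [];
  switch (recall.strategy ?? "smart") {
    case "manual":
      return []; // agent controls all search
    case "always":
      return ["long-term", "session"]; // old behaviour, 2 searches
    case "smart":
    default:
      return ["long-term"]; // 1 search per turn
  }
}
```

The length of the returned array is the number of plugin searches per turn: 1 for "smart", 0 for "manual", 2 for "always".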
The skill protocol now explains user_id, agent_id, and run_id scoping:
- Default: plugin handles scoping automatically, agent does not need to pass userId or agentId in most cases
- agentId: only for cross-agent queries (accessing a different agent's memory namespace)
- userId: only when explicitly operating on a different user
- run_id: managed by the plugin via the scope parameter, not passed directly
- Multi-agent isolation: each agent has separate memory, main agent memories are separate from subagent memories
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Automatic memory consolidation that triggers based on activity:
- Three gates: time (24h), sessions (5), memory count (20)
- File-based lock prevents concurrent consolidation (1h stale)
- Piggybacks onto next user turn when gates pass
- 4-phase SKILL.md: Orient, Gather, Consolidate, Report
- Uses memory_update (atomic), memory_history, memory_search with filters
- State persists in plugin stateDir across gateway restarts
New: dream-gate.ts (gate logic, lock, state persistence)
Rewrite: skills/memory-dream/SKILL.md (4-phase structured prompt)
Modified: index.ts (wire gates into agent_end + before_prompt_build)
Modified: types.ts (dream.auto, minHours, minSessions, minMemories)
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
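The gate logic might be sketched as below. Several things here are assumptions rather than facts from dream-gate.ts: that all three gates must pass together, the `DreamState` field names, and modelling the file-based lock as a timestamp instead of a lock file. The config keys (`auto`, `minHours`, `minSessions`, `minMemories`) and the thresholds come from the commit message.

```typescript
// Config keys as listed for types.ts in the commit message.
interface DreamConfig {
  auto: boolean;
  minHours: number; // time gate, e.g. 24
  minSessions: number; // session gate, e.g. 5
  minMemories: number; // memory-count gate, e.g. 20
}

// Hypothetical persisted state; field names are illustrative only.
interface DreamState {
  lastRunAt: number; // epoch ms of the last consolidation
  sessionsSince: number; // sessions since the last consolidation
  memoriesSince: number; // memories stored since the last consolidation
  lockAcquiredAt?: number; // epoch ms; stands in for the lock file
}

const HOUR_MS = 60 * 60 * 1000;
const STALE_LOCK_MS = 1 * HOUR_MS; // locks older than 1h are ignored

// True when consolidation should piggyback on the next user turn.
function shouldDream(cfg: DreamConfig, state: DreamState, now: number): boolean {
  if (!cfg.auto) return false;
  const lockHeld =
    state.lockAcquiredAt !== undefined &&
    now - state.lockAcquiredAt < STALE_LOCK_MS;
  if (lockHeld) return false; // another consolidation is in progress
  return (
    now - state.lastRunAt >= cfg.minHours * HOUR_MS &&
    state.sessionsSince >= cfg.minSessions &&
    state.memoriesSince >= cfg.minMemories
  );
}
```

Passing `now` in explicitly, instead of calling `Date.now()` inside, keeps the gate deterministic and easy to test.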
kartik-mem0 approved these changes, Apr 1, 2026
whysosaket approved these changes, Apr 1, 2026
What
Replaces the plugin's passive auto-capture pipeline with an agent-controlled memory system. The agent evaluates each conversation turn, decides what is worth persisting, and stores all facts in a single API call via `infer=false` with a `deduced_memories` array.
Why
The current auto-capture sends every exchange to mem0's extraction LLM. Three problems:
The skills-based approach moves the memory decision to the agent, which already has full conversation context.
Architecture
Backwards compatible. Without `skills` config, the plugin uses existing auto-recall + auto-capture.
Tools (8)
`memory_search`, `memory_store`, `memory_get`, `memory_list`, `memory_forget`, `memory_update`, `memory_delete_all`, `memory_history`
All write/delete tools enforce subagent blocks at runtime.
Skills
`memory-triage`
`memory-dream` (`/memory-dream`)
Domain overlay: `companion.md` for personal AI companion use cases.
Fixes from Team Review
- `created_at` date ranges, category filters, metadata constraints alongside rewritten queries.
- `loadTriagePrompt()` now loads SKILL.md via `loadSkill()` with all config overlays merged. Config surface = what the model sees.
- `facts: [...]` format. Zero legacy `memory_store("text")` patterns remain.
- `memory_store`, `memory_forget`, `memory_update`, `memory_delete_all` all block subagent sessions at runtime.
Local Testing
Quick install
    git clone -b feat/agentic-memory-skills https://github.com/mem0ai/mem0.git
    cd mem0/openclaw
    make install MEM0_API_KEY=m0-your-key
    make status
Configuration
`make install` patches `~/.openclaw/openclaw.json` automatically. Three required settings:
- `tools.profile: "full"` (exposes plugin tools to the model)
- `session-memory: false` (disables built-in workspace file memory)
- `skills` config block on the plugin entry
Test scenarios
- `memory_store` fires with `facts` array. Dashboard: ADD with `infer: false`.
- `agent_end (no auto-capture)`
- injecting N memories. Agent answers from recall.
- `memory_update` or search-forget-store.
Check logs
Known Limitations
- `currentSessionId` shared mutable state: inherited from main branch v0.4.0. Requires OpenClaw framework changes (session context in tool handlers). Documented, not fixable plugin-side.
- `event.prompt` (sanitized), not an agent-rewritten query. Agent query rewriting applies only to explicit `memory_search` calls.