feat(openclaw): skills-based memory architecture with batched extraction #4624

Merged
whysosaket merged 22 commits into main from feat/agentic-memory-skills
Apr 1, 2026
@chaithanyak42 chaithanyak42 commented Mar 30, 2026

What

Replaces the plugin's passive auto-capture pipeline with an agent-controlled memory system. The agent evaluates each conversation turn, decides what is worth persisting, and stores the resulting facts in batched API calls (one per category) using infer=false with a deduced_memories array.

Why

The current auto-capture sends every exchange to mem0's extraction LLM. Three problems:

  1. The extraction LLM lacks context. It re-interprets flat text without knowing what the agent was doing, what was tool output, or what is already stored.
  2. No selectivity. Every turn triggers extraction regardless of information density.
  3. No structured metadata. Extracted memories have no category, importance, or expiration.

The skills-based approach moves the memory decision to the agent, which already has full conversation context.

Architecture

before_prompt_build
  |- prependSystemContext: full SKILL.md with domain overlays, config injection (cached)
  |- prependContext: token-budgeted recalled memories (per-turn)

Agent turn
  |- Agent responds to user
  |- Evaluates turn against triage protocol (4-gate filter)
  |- Calls memory_store(facts: [...], category) batched by category
  |    |- Plugin: provider.add(infer=false, deduced_memories=[...facts...])

agent_end
  |- Log only. No auto-capture.

Backwards compatible. Without a skills config block, the plugin falls back to the existing auto-recall and auto-capture behavior.
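
A minimal sketch of the two injection channels in before_prompt_build, assuming a simple greedy selection for the "token-budgeted" recall. The type names, the 4-characters-per-token estimate, and the function names are all hypothetical; the plugin's actual budgeting logic is not shown in this PR.

```typescript
// Hypothetical shapes: the real plugin types are not shown in this PR.
interface RecalledMemory {
  text: string;
  score: number; // relevance score from search, higher is better
}

interface PromptBuildResult {
  prependSystemContext: string; // static skill protocol (provider-cacheable)
  prependContext: string; // per-turn recalled memories
}

// Rough estimate (~4 characters per token). An assumption, not a real tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Greedily keep the highest-scoring memories that still fit in the budget.
function selectWithinBudget(memories: RecalledMemory[], maxTokens: number): RecalledMemory[] {
  const selected: RecalledMemory[] = [];
  let used = 0;
  for (const memory of [...memories].sort((a, b) => b.score - a.score)) {
    const cost = estimateTokens(memory.text);
    if (used + cost > maxTokens) continue; // does not fit; a smaller one still might
    selected.push(memory);
    used += cost;
  }
  return selected;
}

function buildPromptContext(
  skillText: string,
  memories: RecalledMemory[],
  maxTokens: number,
): PromptBuildResult {
  const lines = selectWithinBudget(memories, maxTokens).map((m) => `- ${m.text}`);
  return {
    prependSystemContext: skillText,
    prependContext: lines.length > 0 ? `Recalled memories:\n${lines.join("\n")}` : "",
  };
}
```

Keeping the static skill text and the per-turn memories in separate fields is what lets the provider cache the former while the latter changes every turn.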

Tools (8)

| Tool | Purpose |
|---|---|
| memory_search | Search with query rewriting, category filters, date range filters |
| memory_store | Store facts batched by category via infer=false |
| memory_get | Retrieve a specific memory by ID |
| memory_list | List all memories with scope filtering |
| memory_forget | Delete by ID or search query |
| memory_update | Atomic in-place update preserving history |
| memory_delete_all | Bulk deletion with safety gate |
| memory_history | View edit history of a memory |

All write/delete tools enforce subagent blocks at runtime.

Skills

| Skill | Type | Purpose |
|---|---|---|
| memory-triage | Always active | Per-turn extraction protocol with 4-gate filter, 8 categories, entity grouping, credential detection |
| memory-dream | User-invocable (/memory-dream) | Periodic consolidation: merge, prune, rewrite, expire |

Domain overlay: companion.md for personal AI companion use cases.

Fixes from Team Review

  • Agent-controlled query rewriting: 4-step deterministic process with 7 worked examples across diverse domains. No raw user messages sent to search API.
  • Search filters: Agent can pass created_at date ranges, category filters, metadata constraints alongside rewritten queries.
  • Batch by category: Facts batched per category (not per turn) so retention policy (TTL, immutability) is correct by construction. Runtime warning when model disobeys.
  • Unified prompt plane: loadTriagePrompt() now loads SKILL.md via loadSkill() with all config overlays merged. Config surface = what the model sees.
  • recall.enabled gate: Search instructions only injected when recall is enabled.
  • SKILL.md contract alignment: All examples updated to facts: [...] format. Zero legacy memory_store("text") patterns remain.
  • lastCleanUserMessage race condition: Removed shared mutable state. Query sanitization runs inline within session-scoped hook.
  • Subagent write enforcement: memory_store, memory_forget, memory_update, memory_delete_all all block subagent sessions at runtime.
  • Generalized examples: All skill examples span 7+ domains to prevent model anchoring on any single use case.

Local Testing

Quick install

git clone -b feat/agentic-memory-skills https://github.com/mem0ai/mem0.git
cd mem0/openclaw
make install MEM0_API_KEY=m0-your-key
make status

Configuration

make install patches ~/.openclaw/openclaw.json automatically. Three required settings:

  1. tools.profile: "full" (exposes plugin tools to the model)
  2. session-memory: false (disables built-in workspace file memory)
  3. skills config block on the plugin entry

Test scenarios

| # | Message | Verify |
|---|---|---|
| 1 | "Hi, I'm Alex. Backend engineer at Stripe, SF, PST. I prefer concise responses." | memory_store fires with facts array. Dashboard: ADD with infer: false. |
| 2 | "What's 2 + 2?" | No memory_store. Log: agent_end (no auto-capture) |
| 3 | "What timezone am I in?" | Log: injecting N memories. Agent answers from recall. |
| 4 | "Switching from Postgres to CockroachDB for multi-region writes." | Stored as decision category. |
| 5 | "My database password is supersecret123." | NOT stored. |
| 6 | "Actually, going with TiDB instead of CockroachDB." | Agent uses memory_update or search-forget-store. |

Check logs

make logs

Known Limitations

  1. currentSessionId shared mutable state: inherited from main branch v0.4.0. Requires OpenClaw framework changes (session context in tool handlers). Documented, not fixable plugin-side.
  2. Pre-turn auto-recall uses event.prompt (sanitized), not an agent-rewritten query. Agent query rewriting applies only to explicit memory_search calls.
  3. Category metadata compliance varies by model. Plugin defaults gracefully when omitted.
  4. Running dream via the CLI only generates the prompt. The prompt must be piped into an agent session to actually execute.

chaithanyak42 and others added 2 commits March 26, 2026 15:41
The previous exclude rule for credentials was too generic and the
extraction LLM failed to recognize credentials embedded in config
blocks, setup logs, and tool output (29% of test memories contained
leaked credentials).

Give the LLM concrete patterns to watch for (sk-, m0-, ak_, ghp_,
bot tokens, bearer tokens, webhook URLs, pairing codes) and
WRONG/RIGHT examples showing what to store instead of the raw secret.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
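
The commit above addresses credential leakage through prompt instructions given to the LLM. As a complementary illustration only, the prefix patterns it lists (sk-, m0-, ak_, ghp_, bearer tokens) could also be checked in code before a fact is stored. This is a swapped-in technique, not something the PR implements; the regexes, minimum lengths, and function names are assumptions.

```typescript
// Hypothetical plugin-side check. The PR itself relies on prompt-level rules.
const CREDENTIAL_PATTERNS: RegExp[] = [
  /\b(?:sk-|m0-|ak_|ghp_)[A-Za-z0-9_-]{8,}/, // prefixed API keys and tokens
  /\bBearer\s+[A-Za-z0-9._~+\/-]{8,}/i, // bearer tokens copied from headers
];

// True when the fact text contains something shaped like a known secret.
function looksLikeCredential(fact: string): boolean {
  return CREDENTIAL_PATTERNS.some((pattern) => pattern.test(fact));
}

// Drop facts that appear to embed a raw secret.
function dropCredentialFacts(facts: string[]): string[] {
  return facts.filter((fact) => !looksLikeCredential(fact));
}
```

A pattern check like this cannot catch free-form secrets such as "My database password is supersecret123", which is why the triage skill still needs the semantic rule.
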
…ia infer=false

Replace passive auto-capture with skills-based memory extraction.
The agent evaluates each conversation turn and stores facts directly
to mem0 with infer=false, bypassing the extraction LLM entirely.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
codecov-commenter commented Mar 30, 2026

@chaithanyak42 chaithanyak42 changed the title from "feat(openclaw): agentic memory skills — agent-controlled extraction via infer=false" to "feat(openclaw): skills-based memory architecture" Mar 30, 2026
…ries array

memory_store now accepts a `facts` array parameter. Multiple facts from
one conversation turn are sent as a single provider.add() call with
deduced_memories=[...all facts...], reducing N API calls to 1.

Removed maxFactsPerTurn config — the right abstraction is one call per
turn with all facts batched, not a rate limiter on call count.

Updated inline protocol to instruct the agent to always use the facts
array and call memory_store exactly once per turn.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
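
A sketch of how a facts array could collapse into one provider.add() payload, as the commit above describes. The infer and deduced_memories fields come from this PR; the AddRequest type, the user_id placement, and the function name are hypothetical.

```typescript
// Hypothetical request shape; only infer and deduced_memories are named in the PR.
interface AddRequest {
  infer: false;
  deduced_memories: string[];
  user_id: string;
}

// Build one payload for all facts instead of issuing one call per fact.
function buildAddRequest(facts: string[], userId: string): AddRequest {
  const cleaned = facts.map((fact) => fact.trim()).filter((fact) => fact.length > 0);
  if (cleaned.length === 0) {
    throw new Error("memory_store requires at least one non-empty fact");
  }
  return { infer: false, deduced_memories: cleaned, user_id: userId };
}
```
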
@chaithanyak42 chaithanyak42 added the enhancement New feature or request label Mar 30, 2026
@chaithanyak42 chaithanyak42 changed the title from "feat(openclaw): skills-based memory architecture" to "feat(openclaw): skills-based memory architecture with batched extraction" Mar 30, 2026
chaithanyak42 and others added 5 commits March 31, 2026 17:46
- Switch from before_agent_start to before_prompt_build for context
  injection. Static memory protocol goes in prependSystemContext
  (provider-cacheable), dynamic recalled memories in prependContext.

- Use message_received hook to capture clean user message content
  without OpenClaw sender metadata prefix. Eliminates the regex
  sanitizer — clean queries at the source, not at search time.

- Remove maxFactsPerTurn from types and plugin schema.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
make install — checks deps, installs OpenClaw if missing, builds plugin,
removes existing version, links local build, patches openclaw.json with
all required settings (tools.profile, session-memory, skills config),
backs up original config, and restarts gateway.

Also: make status, make logs, make restart, make uninstall, make clean.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Configure must run AFTER plugin link (link overwrites openclaw.json).
Moved Python config logic to scripts/configure.py to avoid Make shell
escaping issues with inline Python.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
The agent now rewrites search queries for retrieval instead of passing
raw user messages. The skill protocol teaches the agent to extract key
concepts, remove conversational framing, and match the language of
stored memories.

Plugin-side sanitization (metadata stripping) remains for the auto-recall
path, but all semantic query rewriting is the agent's responsibility.

Updated: skill-loader.ts (inline protocol), recall-protocol.md (examples),
recall.ts (comments clarifying responsibility boundary).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Rewrote recall-protocol.md with structured 4-step rewriting process,
7 worked examples with explicit reasoning chains, anti-pattern catalog,
and query length guidance. No ambiguity, no room for improvisation.

Inline protocol (prependSystemContext) updated with WRONG/RIGHT pairs
and compressed 4-step process for per-turn injection.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@chaithanyak42 chaithanyak42 force-pushed the feat/agentic-memory-skills branch from ef9878e to 9e90078 April 1, 2026 08:43
Restructured the recall protocol with rigorous query construction process.
Generalized all examples across domains (health, business, identity,
operations, product, personal, technical) to prevent model anchoring
on any single use case.

7 worked examples each show full 4-step reasoning chain.
Failure pattern table with 7 anti-patterns and fixes.
Action-oriented section headers for better model compliance.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@chaithanyak42 chaithanyak42 force-pushed the feat/agentic-memory-skills branch from 9e90078 to 7d80bf3 April 1, 2026 08:45
chaithanyak42 and others added 13 commits April 1, 2026 14:22
…categories

The agent can now pass structured filters to memory_search alongside
the rewritten query. Filters narrow by time (created_at gte/lte),
category, or metadata fields. Query handles semantic relevance,
filters handle structural constraints.

Changes:
- types.ts: added filters to SearchOptions
- providers.ts: merge agent filters with base user_id/run_id filters
  using AND (agent filters extend, never override user scoping)
- index.ts: added filters param to memory_search tool with operator docs
- skill-loader.ts: inline protocol teaches filter construction with examples
- recall-protocol.md: full filter section with 6 worked examples,
  when to add vs when to omit, syntax reference

Operators: eq, ne, gt, gte, lt, lte, in, contains, icontains
Logical: AND, OR, NOT

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
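
The "extend, never override" rule from the commit above can be sketched as wrapping both filter sets in an AND. The Filter type and mergeFilters name are hypothetical; the actual merge in providers.ts may differ in detail.

```typescript
// Hypothetical filter type: a plain JSON-like object.
type Filter = Record<string, unknown>;

// Combine the plugin's base scoping with optional agent-supplied filters.
// Because both sides sit under AND, an agent filter can narrow results
// but cannot widen or replace the user scoping.
function mergeFilters(base: Filter, agentFilters?: Filter): Filter {
  if (!agentFilters || Object.keys(agentFilters).length === 0) {
    return base;
  }
  return { AND: [base, agentFilters] };
}
```

For example, a base of `{ user_id: "alex" }` merged with `{ created_at: { gte: "2026-03-01", lte: "2026-03-31" } }` yields a single AND node containing both.
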
lastCleanUserMessage was process-global mutable state written by
message_received and consumed by before_prompt_build. Under overlapping
inbound turns, one session's clean message could shape another session's
recall query, injecting wrong memories even with correct user_id scoping.

Fix: removed message_received hook and the shared variable entirely.
Query sanitization now runs inline within before_prompt_build where
ctx.sessionKey is available and execution is session-scoped. Uses
sanitizeQuery() (exported from recall.ts) to strip OpenClaw metadata.

The inherited currentSessionId shared state (line 95, pre-existing in
main branch v0.4.0) remains as a documented known limitation. That
requires OpenClaw framework changes (session context in tool handlers)
to fix properly.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
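
A reduced illustration of the hazard described in the commit above and of the inline alternative. The hook and event shapes are hypothetical, and trim() stands in for the real sanitizeQuery(), whose metadata-stripping rules are not shown here.

```typescript
// Before: one process-global variable written by every session's message hook.
let lastCleanUserMessage = "";

function onMessageReceived(text: string): void {
  lastCleanUserMessage = text;
}

function recallQueryFromSharedState(): string {
  return lastCleanUserMessage; // whichever session wrote last wins
}

// After: the query is derived from the event passed to this hook invocation.
interface PromptEvent {
  sessionKey: string;
  prompt: string;
}

function recallQueryInline(event: PromptEvent): string {
  return event.prompt.trim(); // stand-in for sanitizeQuery(event.prompt)
}
```

If session A and then session B deliver messages before A's prompt is built, the shared-state version hands A the text from B, while the inline version only ever sees its own event.
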
…olicy

The facts array applied one category, importance, expiration_date, and
immutable flag to every entry in deduced_memories. Mixed-category batches
produced wrong retention policy: identity facts could inherit project TTLs,
or project memories could become permanently immutable.

Fix: batch by policy, not by turn. All facts in one memory_store call
must share the same category. If a turn produces facts in different
categories, the agent makes one call per category (e.g., identity facts
in one call, decision facts in another). Two categories = two API calls.
This is correct and expected.

Changes:
- Tool schema: facts[] description states same-category requirement
- Tool schema: category description explains it determines retention policy
- Tool schema: importance marked as override (omit to use category default)
- Inline protocol: teaches "batch by category" with mixed-category example
  showing two separate calls

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
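
"Batch by policy, not by turn" amounts to grouping a turn's facts by category and issuing one memory_store call per group. A minimal sketch, with hypothetical type and function names:

```typescript
interface CategorizedFact {
  text: string;
  category: string;
}

// One entry per memory_store call the agent would make.
interface StoreCall {
  category: string;
  facts: string[];
}

// Group facts so every call carries exactly one category,
// and therefore exactly one retention policy.
function batchByCategory(facts: CategorizedFact[]): StoreCall[] {
  const groups = new Map<string, string[]>();
  for (const { text, category } of facts) {
    const bucket = groups.get(category) ?? [];
    bucket.push(text);
    groups.set(category, bucket);
  }
  return Array.from(groups, ([category, grouped]) => ({ category, facts: grouped }));
}
```

A turn with two identity facts and one decision fact produces two calls, matching the "two categories = two API calls" rule above.
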
loadTriagePrompt() was a hardcoded inline protocol that ignored domain
overlays, category overrides, custom rules, and triage knobs from the
config surface. The plugin advertised a configurable skill system via
skills.domain, customRules, and categories, but the live
before_prompt_build path ran a separate prompt plane.

Fix: loadTriagePrompt() now calls loadSkill("memory-triage") as the
primary path, which merges all config-driven overlays (domain, categories,
custom rules, triage knobs). The inline protocol is now a minimal
fallback for when the SKILL.md file cannot be loaded.

The full SKILL.md content (with all overlays merged) goes into
prependSystemContext, which is provider-cacheable. Tool usage format,
batching rules, and search protocol are appended as operational sections.

One prompt plane. Config surface = what the model sees.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
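
The primary-path-with-fallback shape described above might look roughly like this. The loader signature, the fallback text, and the function name are assumptions for illustration; the real loadSkill() also merges config overlays, which this sketch leaves to the injected loader.

```typescript
// Hypothetical loader: returns merged SKILL.md text, throws if it cannot be read.
type SkillLoader = (name: string) => string;

// Placeholder text; the plugin's actual inline fallback protocol is longer.
const INLINE_FALLBACK_PROTOCOL =
  "Store durable facts with memory_store(facts: [...], category), batched by category.";

function loadTriagePromptSketch(loadSkill: SkillLoader): string {
  try {
    const skill = loadSkill("memory-triage");
    if (skill.trim().length > 0) return skill; // primary path: full skill with overlays
  } catch {
    // SKILL.md missing or unreadable: fall through to the inline protocol
  }
  return INLINE_FALLBACK_PROTOCOL;
}
```
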
…ructions

Three fixes:

1. SKILL.md taught old memory_store(text, {metadata}) contract while
   runtime expects facts[] array batched by category. Updated all 12
   examples to use facts: [...], category: "..." format. Added Example 11
   showing mixed-category turn with separate calls.

2. recall.enabled=false disabled the search path in index.ts but the
   system prompt still included search/filter instructions. Now gated:
   search instructions only injected when recall.enabled !== false.

3. Removed "MAXIMUM 3 operations per turn" from SKILL.md (replaced
   with "BATCH BY CATEGORY" which is the actual contract).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…ime guard

1. Replaced every remaining memory_store("text") example in SKILL.md
   entity-grouping section (lines 113-136) and consolidation example
   (line 332) with facts: [...], category: "..." format. Zero legacy
   examples remain (verified: grep returns empty).

2. Added runtime warning when multi-fact batch arrives without a
   category. The prompt teaches batch-by-category but the runtime now
   logs when the model disobeys, making violations observable without
   silently applying wrong retention policy.

Note on memory_search references in SKILL.md: these appear in
update/consolidation examples where the agent needs to find an existing
memory before replacing it. This is correct even when recall.enabled=false
because that flag disables pre-turn auto-search, not the memory_search
tool itself. The agent can always search explicitly for update operations.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Subagents were told not to store memories via system prompt, but
memory_store and memory_forget tool handlers did not enforce it.
A disobedient subagent tool call could write to a transient namespace
that is never read again, or delete memories from the parent namespace.

Fix: both memory_store and memory_forget now check
isSubagentSession(currentSessionId) and return an error response
before any provider call. The block is logged for observability.

This closes the gap between prompt-level instruction ("do not store")
and runtime enforcement. memory_search, memory_get, and memory_list
remain available to subagents (read-only access is safe and useful
for subagent context).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
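
A sketch of the runtime guard described above: check the session before any provider call and return an error response instead. isSubagentSession is named in the commit, but its body here (matching ":subagent:" in the id) is an assumption, as are the ToolResult shape and the guardWrite helper.

```typescript
interface ToolResult {
  ok: boolean;
  message: string;
}

// Assumption: subagent sessions are recognisable from the session id format.
function isSubagentSession(sessionId: string | undefined): boolean {
  return sessionId !== undefined && sessionId.includes(":subagent:");
}

// Wrap a write/delete handler so subagents are rejected before it runs.
function guardWrite(
  toolName: string,
  sessionId: string | undefined,
  run: () => ToolResult,
  log: (message: string) => void = () => {},
): ToolResult {
  if (isSubagentSession(sessionId)) {
    log(`${toolName} blocked for subagent session`); // keep the block observable
    return { ok: false, message: `${toolName} is not available to subagents` };
  }
  return run();
}
```

Read-only tools such as memory_search would simply not be wrapped.
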
…tools

Three new tools to close the gap with mem0 MCP:

- memory_update(memoryId, text): Atomic in-place update that preserves
  history. Preferred over delete-then-store for corrections.

- memory_delete_all(confirm, userId): Bulk deletion with safety gate
  (confirm: true required). For user-requested memory resets.

- memory_history(memoryId): View edit history of a memory showing all
  changes over time with old/new values and timestamps.

All three tools enforce subagent write blocks. Both Platform and OSS
providers implement the new methods. OSS history() degrades gracefully
if the backend does not support it.

Plugin now exposes 8 tools (was 5):
  memory_search, memory_store, memory_get, memory_list,
  memory_forget, memory_update, memory_delete_all, memory_history

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
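
The safety gate on memory_delete_all can be pictured as a strict check on confirm before the provider is touched. The parameter names confirm and userId come from the commit; the result shape and the injected deleteAll callback are hypothetical.

```typescript
interface DeleteAllParams {
  confirm?: boolean;
  userId?: string;
}

interface DeleteAllResult {
  deleted: boolean;
  message: string;
}

// Only an explicit confirm: true proceeds; anything else is refused.
function memoryDeleteAllSketch(
  params: DeleteAllParams,
  deleteAll: (userId?: string) => void,
): DeleteAllResult {
  if (params.confirm !== true) {
    return { deleted: false, message: "Refusing bulk delete: pass confirm: true to proceed." };
  }
  deleteAll(params.userId);
  return { deleted: true, message: "All memories in scope deleted." };
}
```
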
The /new and /reset commands produce system prompts ("A new session was
started via /new or /reset. Run your Session Startup sequence...") that
were being sent to mem0 search as the recall query. This wasted API
calls and returned noise.

Fix: detect system/bootstrap prompts by content pattern and skip the
recall search. The memory protocol is still injected (prependSystemContext)
so the agent has the skill instructions, but no search fires for system
commands.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
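
Detection "by content pattern" could be as simple as matching the opening of the bootstrap message quoted above. The pattern list and function names below are illustrative assumptions; the plugin's real pattern set is not shown in this PR.

```typescript
// Hypothetical pattern list, seeded with the message quoted in the commit.
const BOOTSTRAP_PATTERNS: RegExp[] = [
  /^A new session was started via \/new or \/reset\b/i,
];

function isBootstrapPrompt(prompt: string): boolean {
  const text = prompt.trim();
  return BOOTSTRAP_PATTERNS.some((pattern) => pattern.test(text));
}

// Recall search runs only for non-empty, non-bootstrap prompts.
// The skill protocol is injected regardless of this result.
function shouldRunRecall(prompt: string): boolean {
  return prompt.trim().length > 0 && !isBootstrapPrompt(prompt);
}
```
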
skills.recall.strategy now drives real behavior:

- "smart" (default): long-term search only, no session search.
  1 plugin search per turn instead of 2. Agent handles additional
  search via memory_search tool.

- "manual": zero plugin searches. Agent controls all search.
  Skill protocol instructs the agent to search proactively at
  session start and when context is needed.

- "always": long-term + session search every turn (old behavior).
  2 plugin searches per turn.

Also: skip recall entirely for system/bootstrap prompts (/new, /reset).

This addresses the feedback that the system was more search-heavy than
the first-principles ideal of agent-controlled search. The default
"smart" mode halves the always-on search cost while keeping baseline
recall. "manual" mode gives full agent control.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
…precedence

1. Skill protocol now teaches explicit scope selection:
   - "long-term" for user context (default, most common)
   - "session" for current conversation only
   - "all" only when both scopes truly needed
   Avoids unnecessary backend fan-out from scope="all".

2. Documented enabled/strategy precedence in types.ts:
   enabled=false overrides strategy. Strategy only consulted
   when enabled is not false.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
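
The enabled/strategy precedence and the three strategies can be summarised as a small resolver. The strategy names and defaults come from the commits above; the RecallPlan shape and function name are hypothetical.

```typescript
type RecallStrategy = "smart" | "manual" | "always";

interface RecallConfig {
  enabled?: boolean;
  strategy?: RecallStrategy;
}

// Which plugin-side searches run before each turn.
interface RecallPlan {
  longTerm: boolean;
  session: boolean;
}

function resolveRecallPlan(config: RecallConfig = {}): RecallPlan {
  // enabled=false overrides strategy entirely.
  if (config.enabled === false) return { longTerm: false, session: false };

  const strategy = config.strategy ?? "smart";
  if (strategy === "manual") return { longTerm: false, session: false }; // agent searches itself
  if (strategy === "always") return { longTerm: true, session: true }; // old behavior, 2 searches
  return { longTerm: true, session: false }; // "smart": 1 search per turn
}
```
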
The skill protocol now explains user_id, agent_id, and run_id scoping:
- Default: plugin handles scoping automatically, agent does not need
  to pass userId or agentId in most cases
- agentId: only for cross-agent queries (accessing a different agent's
  memory namespace)
- userId: only when explicitly operating on a different user
- run_id: managed by the plugin via the scope parameter, not passed
  directly
- Multi-agent isolation: each agent has separate memory, main agent
  memories are separate from subagent memories

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Automatic memory consolidation that triggers based on activity:
- Three gates: time (24h), sessions (5), memory count (20)
- File-based lock prevents concurrent consolidation (1h stale)
- Piggybacks onto next user turn when gates pass
- 4-phase SKILL.md: Orient, Gather, Consolidate, Report
- Uses memory_update (atomic), memory_history, memory_search with filters
- State persists in plugin stateDir across gateway restarts

New: dream-gate.ts (gate logic, lock, state persistence)
Rewrite: skills/memory-dream/SKILL.md (4-phase structured prompt)
Modified: index.ts (wire gates into agent_end + before_prompt_build)
Modified: types.ts (dream.auto, minHours, minSessions, minMemories)

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
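
A sketch of the gate check, under two assumptions that the commit does not spell out: all three gates must pass together, and the lock is modelled here as an in-memory timestamp rather than the file-based lock in dream-gate.ts. The state field names are hypothetical; the thresholds (24h, 5 sessions, 20 memories, 1h stale lock) are the ones listed above.

```typescript
interface DreamState {
  lastDreamAt: number; // epoch ms of the last consolidation
  sessionsSince: number; // sessions since the last consolidation
  memoryCount: number; // memory count used by the third gate
  lockAcquiredAt?: number; // epoch ms; undefined when no lock is held
}

interface DreamConfig {
  minHours: number;
  minSessions: number;
  minMemories: number;
}

const HOUR_MS = 60 * 60 * 1000;
const LOCK_STALE_MS = 1 * HOUR_MS;
const DEFAULT_DREAM_CONFIG: DreamConfig = { minHours: 24, minSessions: 5, minMemories: 20 };

function shouldDream(
  state: DreamState,
  now: number,
  config: DreamConfig = DEFAULT_DREAM_CONFIG,
): boolean {
  // A lock younger than the stale window means another consolidation is running.
  const lockHeld =
    state.lockAcquiredAt !== undefined && now - state.lockAcquiredAt < LOCK_STALE_MS;
  if (lockHeld) return false;

  // Assumption: every gate must pass.
  return (
    now - state.lastDreamAt >= config.minHours * HOUR_MS &&
    state.sessionsSince >= config.minSessions &&
    state.memoryCount >= config.minMemories
  );
}
```
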
@whysosaket whysosaket merged commit c250ccf into main Apr 1, 2026
7 checks passed
@whysosaket whysosaket deleted the feat/agentic-memory-skills branch April 1, 2026 17:59
Labels

enhancement New feature or request

4 participants