
research(memory): RL-based admission control — MemAgent learned write-gate replaces static threshold (arXiv:2603.04549) #2416

@bug-ops

Description

Finding

MemAgent: Memory-Augmented Agent via Online Reinforcement Learning (ICLR 2026 oral, arXiv:2603.04549)

Trains a write-gate policy with online RL to decide which information to store and which to discard. It outperforms both static-threshold and retrieval-only agents on long-horizon tasks. The gate learns from agent performance feedback, not from content heuristics alone.

Applicability to Zeph

Zeph's A-MAC admission control ([memory.admission] threshold = 0.30) uses a static heuristic combining cosine similarity with an importance score. MemAgent's approach would replace or augment this with a learned policy.
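For reference, the static gate amounts to thresholding a blended score. A minimal sketch (the function name, the 50/50 weighting, and the two-feature shape are illustrative assumptions, not the real A-MAC code):

```rust
/// Hypothetical sketch of a static admission heuristic: blend
/// query-content cosine similarity with an importance score and
/// compare against a fixed threshold (0.30 in Zeph's config).
/// The equal weighting is an assumption for illustration.
fn admit_heuristic(cosine_sim: f32, importance: f32, threshold: f32) -> bool {
    let score = 0.5 * cosine_sim + 0.5 * importance;
    score >= threshold
}
```

The weakness this issue targets is visible here: the weights and threshold are fixed up front and never adapt to whether admitted content actually helped later sessions.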

Proposed design:

  1. Short term: expose admission_model = "heuristic" | "rl" config field
  2. In heuristic mode (current): existing A-MAC logic unchanged
  3. In rl mode: train a lightweight binary classifier on (query, content, session_outcome) triples collected from prior sessions. Positive signal: content that was recalled and led to better agent responses. Negative signal: content saved but never recalled.
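In rl mode, the lightweight classifier could be as simple as logistic regression over the same features the heuristic already computes. A sketch under stated assumptions (the `WriteGate` type, the two-feature input, and the plain gradient-descent loop are illustrative, not the MemAgent training recipe, which uses online RL rather than offline supervised fitting):

```rust
/// Hypothetical learned write-gate: a logistic classifier over two
/// features (cosine similarity, importance score), fit offline on
/// labeled (features, outcome) pairs derived from prior sessions.
struct WriteGate {
    w: [f32; 2],
    b: f32,
}

impl WriteGate {
    fn new() -> Self {
        WriteGate { w: [0.0; 2], b: 0.0 }
    }

    /// Sigmoid of the linear score: probability that storing is worthwhile.
    fn prob_admit(&self, x: [f32; 2]) -> f32 {
        let z = self.w[0] * x[0] + self.w[1] * x[1] + self.b;
        1.0 / (1.0 + (-z).exp())
    }

    /// Gradient descent on log-loss. label = 1.0 for content that was
    /// recalled and helped, 0.0 for content saved but never recalled.
    fn train(&mut self, data: &[([f32; 2], f32)], lr: f32, epochs: usize) {
        for _ in 0..epochs {
            for &(x, y) in data {
                let err = self.prob_admit(x) - y; // d(log-loss)/dz
                self.w[0] -= lr * err * x[0];
                self.w[1] -= lr * err * x[1];
                self.b -= lr * err;
            }
        }
    }

    fn admit(&self, x: [f32; 2]) -> bool {
        self.prob_admit(x) >= 0.5
    }
}
```

A learned gate of this shape is a drop-in replacement for the threshold check, so the `admission_model` switch can select between the two at the single call site where admission is decided.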

Training data can be derived from the existing memory audit trail plus SYNAPSE activation logs (recalled content = positive; never-activated content = negative candidate).
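The labeling rule above can be made explicit. A sketch, assuming a hypothetical audit-record shape (the `AuditRecord` fields are placeholders, not the real zeph-memory schema):

```rust
/// Hypothetical flattened view of one memory's audit-trail +
/// SYNAPSE activation history. Field names are assumptions.
struct AuditRecord {
    id: u64,
    recall_count: u32,
    /// true if some session that recalled this memory ended well
    recalled_in_successful_session: bool,
}

/// Label for the write-gate classifier:
/// Some(true)  = positive (recalled and tied to a good outcome)
/// Some(false) = negative candidate (saved but never recalled)
/// None        = ambiguous (recalled, but no outcome signal); skip it
fn label(rec: &AuditRecord) -> Option<bool> {
    if rec.recall_count == 0 {
        Some(false)
    } else if rec.recalled_in_successful_session {
        Some(true)
    } else {
        None
    }
}
```

Keeping an explicit "ambiguous" bucket matters: recalled-but-unattributed content should be excluded from training rather than forced into either class.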

Priority

P2 — A-MAC is already implemented; RL training data is available from existing logs. Medium implementation effort.

Source

  • arXiv:2603.04549 — MemAgent: Reshaping Long-Context LLM with Multi-turn Memory (ICLR 2026 oral)

Metadata

Labels

  • P2 — High value, medium complexity
  • memory — zeph-memory crate (SQLite)
  • research — Research-driven improvement
