research(memory): RL-based admission control — MemAgent learned write-gate replaces static threshold (arXiv:2603.04549) #2416
Description
Finding
MemAgent: Memory-Augmented Agent via Online Reinforcement Learning (ICLR 2026 oral, arXiv:2603.04549)
Trains a write-gate policy with online RL to decide which information to store and which to discard. Reported to outperform static-threshold and retrieval-only agents on long-horizon tasks. The gate learns from agent performance feedback, not from content heuristics alone.
Applicability to Zeph
Zeph's A-MAC admission control ([memory.admission] threshold = 0.30) uses a static cosine similarity + importance score heuristic. MemAgent's approach replaces or augments this with a learned policy.
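For reference, a static gate of this kind might look roughly like the sketch below. This is not Zeph's actual A-MAC code: the function names, the equal-weight blend of similarity and importance, and the `similarity_weight` parameter are assumptions made for illustration. Only the 0.30 threshold comes from the config quoted above.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def heuristic_admit(
    query_vec: Sequence[float],
    content_vec: Sequence[float],
    importance: float,
    threshold: float = 0.30,          # value of [memory.admission] threshold
    similarity_weight: float = 0.5,   # ASSUMPTION: how the two signals are blended
) -> bool:
    """Admit content when the blended score reaches the static threshold."""
    similarity = cosine_similarity(query_vec, content_vec)
    score = similarity_weight * similarity + (1.0 - similarity_weight) * importance
    return score >= threshold
```

The point of the sketch is that nothing in this gate depends on whether admitted content later proved useful, which is the gap a learned gate is meant to close.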
Proposed design:
- Short term: expose an admission_model = "heuristic" | "rl" config field.
- In heuristic mode (current): existing A-MAC logic unchanged.
- In rl mode: train a lightweight binary classifier on (query, content, session_outcome) triples collected from prior sessions. Positive signal: content that was recalled and led to better agent responses. Negative signal: content saved but never recalled.
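The two modes could be wired together along the lines of the sketch below. It is an illustration under stated assumptions, not a proposed implementation: `LearnedGate`, `admit`, the two-element feature vector, and the 0.5 decision cutoff are all hypothetical, and the classifier is plain supervised logistic regression standing in for the "lightweight binary classifier", not the online RL policy from the paper.

```python
import math
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


@dataclass
class LearnedGate:
    """Hypothetical logistic-regression write gate (supervised, not RL)."""
    n_features: int
    weights: List[float] = field(default_factory=list)
    bias: float = 0.0

    def __post_init__(self) -> None:
        if not self.weights:
            self.weights = [0.0] * self.n_features

    def predict_proba(self, features: Sequence[float]) -> float:
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return sigmoid(z)

    def fit(self, samples: Sequence[Tuple[Sequence[float], int]],
            epochs: int = 200, lr: float = 0.5) -> None:
        """Stochastic gradient descent on the log loss."""
        for _ in range(epochs):
            for features, label in samples:
                error = self.predict_proba(features) - label
                self.weights = [w - lr * error * x
                                for w, x in zip(self.weights, features)]
                self.bias -= lr * error


def admit(admission_model: str, features: Sequence[float], heuristic_score: float,
          gate: Optional[LearnedGate] = None, threshold: float = 0.30) -> bool:
    """Dispatch on the proposed admission_model config value."""
    if admission_model == "heuristic":
        return heuristic_score >= threshold
    if admission_model == "rl":
        if gate is None:
            raise ValueError("rl mode requires a trained gate")
        return gate.predict_proba(features) >= 0.5  # ASSUMPTION: cutoff
    raise ValueError(f"unknown admission_model: {admission_model!r}")
```

Keeping the heuristic path untouched behind the same entry point means the default behaviour stays as it is today until `rl` is explicitly selected.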
Data collection can be derived from existing memory audit trail + SYNAPSE activation logs (recalled content = positive, never-activated = negative candidate).
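A minimal sketch of that labelling step follows. The `AuditEntry` record and the idea that activation logs reduce to a set of memory IDs are assumptions, since the actual audit trail and SYNAPSE log schemas are not described here. The sketch also ignores session outcome, so its positives mean "recalled at least once" rather than "recalled and led to a better response".

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class AuditEntry:
    """Hypothetical shape of one saved-memory record from the audit trail."""
    memory_id: str
    query: str
    content: str


def label_from_logs(
    saved: Iterable[AuditEntry],
    activated_ids: Set[str],
) -> List[Tuple[AuditEntry, int]]:
    """Label each saved entry: 1 if it was ever activated, else 0 (negative candidate)."""
    return [(entry, 1 if entry.memory_id in activated_ids else 0)
            for entry in saved]


if __name__ == "__main__":
    saved = [
        AuditEntry("m1", "deploy steps?", "Use the staging cluster first."),
        AuditEntry("m2", "weather?", "It was sunny on Tuesday."),
    ]
    for entry, label in label_from_logs(saved, activated_ids={"m1"}):
        print(entry.memory_id, label)
```

Never-activated entries are only negative candidates: recent memories may simply not have had a chance to be recalled yet, so some age cutoff would likely be needed before treating them as true negatives.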
Priority
P2 — A-MAC is already implemented; RL training data is available from existing logs. Medium implementation effort.
Source
- arXiv:2603.04549 — MemAgent: Reshaping Long-Context LLM with Multi-turn Memory (ICLR 2026 oral)