research(memory): agentic memory benchmarking harness — hit rate, recall latency, compression ratio metrics (arXiv:2602.19320)

## Finding

**Anatomy of Agentic Memory** (arXiv:2602.19320)

A comprehensive taxonomy and evaluation framework for agent memory systems. It defines standardized metrics: recall hit rate, latency per recall, compression ratio, interference rate (new memories degrading old recalls), and context utilization efficiency.

## Applicability to Zeph

Zeph has no systematic memory benchmarking. The journal tracks qualitative results ("cross-session recall works") but no quantitative metrics. A benchmarking harness would enable data-driven tuning of thresholds (`cross_session_score_threshold`, `compaction_threshold`, `admission.threshold`, etc.).

**Proposed design:**

```bash
# .local/testing/bench-memory.py
# 1. Seed N facts with known content
# 2. Run M recall queries at different time delays
# 3. Report: hit_rate, recall_latency_p50/p99 (ms), compression_ratio, interference_rate
python3 .local/testing/bench-memory.py --facts 50 --queries 100 --sessions 5
```
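The seed-then-query loop described in the comments above could be structured roughly as follows. This is a minimal sketch under stated assumptions, not Zeph's actual harness: `InMemoryStore`, `seed_facts`, `run_queries`, and the `--seed` flag are hypothetical names, and `InMemoryStore.search` is a stand-in for the real `memory_search` call, whose signature is not specified here. Session handling (`--sessions`) and time delays between queries are left out for brevity.

```python
import argparse
import random
import time
from dataclasses import dataclass


@dataclass
class QueryResult:
    fact_id: int
    hit: bool
    latency_ms: float


class InMemoryStore:
    """Hypothetical stand-in for the real memory backend."""

    def __init__(self):
        self._facts = {}

    def add(self, fact_id, text):
        self._facts[fact_id] = text

    def search(self, query):
        # Naive substring match; a real harness would call memory_search here.
        return [text for text in self._facts.values() if query in text]


def seed_facts(store, n):
    """Step 1: seed n facts, each carrying a unique marker token."""
    facts = {i: f"fact-{i:04d} has value {i * 7}" for i in range(n)}
    for fact_id, text in facts.items():
        store.add(fact_id, text)
    return facts


def run_queries(store, facts, m, rng):
    """Step 2: issue m recall queries and time each one."""
    results = []
    ids = list(facts)
    for _ in range(m):
        fact_id = rng.choice(ids)
        start = time.perf_counter()
        found = store.search(f"fact-{fact_id:04d}")
        latency_ms = (time.perf_counter() - start) * 1000.0
        results.append(QueryResult(fact_id, facts[fact_id] in found, latency_ms))
    return results


def main(argv=None):
    parser = argparse.ArgumentParser(description="memory benchmark sketch")
    parser.add_argument("--facts", type=int, default=50)
    parser.add_argument("--queries", type=int, default=100)
    parser.add_argument("--seed", type=int, default=0)  # assumed flag, for repeatable runs
    args = parser.parse_args(argv)

    store = InMemoryStore()
    facts = seed_facts(store, args.facts)
    results = run_queries(store, facts, args.queries, random.Random(args.seed))

    # Step 3 (partial): report the hit rate; other metrics need more inputs.
    hit_rate = sum(r.hit for r in results) / len(results) if results else 0.0
    print(f"hit_rate={hit_rate:.3f}")
    return results
```

Calling `main(["--facts", "50", "--queries", "100"])` runs one pass against the stub store. Swapping `InMemoryStore` for a thin adapter around the real memory backend is the only change the loop itself should need.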

**Metrics to track:**
- `hit_rate` — fraction of seeded facts recalled correctly
- `recall_latency_p50/p99` — milliseconds per `memory_search` call
- `compression_ratio` — tokens before/after compaction per session
- `interference_rate` — fraction of recalled facts that are contaminated by newer, unrelated memories
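The four metrics above can be computed from per-query records with a few small functions. The sketch below rests on several assumptions: `compression_ratio` is taken as tokens before compaction divided by tokens after, percentiles use the nearest-rank method, and the caller decides how a recalled fact gets flagged as contaminated. All function names are hypothetical and not part of Zeph.

```python
import math


def hit_rate(hits):
    """Fraction of recall attempts that returned the seeded fact."""
    return sum(hits) / len(hits) if hits else 0.0


def percentile(values, pct):
    """Nearest-rank percentile (pct in (0, 100]) of a non-empty list."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100.0 * len(ordered)))
    return ordered[rank - 1]


def compression_ratio(tokens_before, tokens_after):
    """Tokens before compaction divided by tokens after (assumed direction)."""
    if tokens_after <= 0:
        raise ValueError("tokens_after must be positive")
    return tokens_before / tokens_after


def interference_rate(contaminated):
    """Fraction of recalled facts flagged as contaminated by newer memories."""
    return sum(contaminated) / len(contaminated) if contaminated else 0.0


def summarize(hits, latencies_ms, tokens_before, tokens_after, contaminated):
    """Bundle all metrics into one report dict."""
    return {
        "hit_rate": hit_rate(hits),
        "recall_latency_p50": percentile(latencies_ms, 50),
        "recall_latency_p99": percentile(latencies_ms, 99),
        "compression_ratio": compression_ratio(tokens_before, tokens_after),
        "interference_rate": interference_rate(contaminated),
    }


report = summarize(
    hits=[True, True, False, True],
    latencies_ms=[12.0, 8.0, 30.0, 9.0],
    tokens_before=4000,
    tokens_after=1000,
    contaminated=[False, True, False],
)
print(report)
```

With only a handful of samples, nearest-rank p99 collapses to the maximum latency, so a run needs on the order of hundreds of queries before p99 says anything beyond the worst case.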

## Priority

P3 — tooling improvement; enables data-driven tuning of existing thresholds.

## Source

- arXiv:2602.19320 — Anatomy of Agentic Memory: A Principled Survey and Evaluation Framework
