⚠️ Experimental research project. Karta is a novel, actively-evolving approach to AI agent memory. APIs, data formats, and benchmark numbers will change. Not production-ready. We're building this in the open because we think agent memory is one of the most important unsolved problems in AI — and our goal is to build the best AI memory system.
An agentic memory system that thinks, not just stores.
Karta is a novel approach that combines structured knowledge graphs with active background reasoning to build memory that organizes itself at write time, links associatively, evolves retroactively, and periodically "dreams" to surface deductions, patterns, gaps, and contradictions that no retrieval system could find because they were never explicitly stored.
- Self-organizing note graph — notes are enriched with LLM-generated context, semantically linked, and retroactively evolved when new information arrives
- Dream engine — 5 types of background inference: deduction, induction, abduction, consolidation, contradiction detection
- Structured output with reasoning — forces chain-of-thought before answers, enabling reliable abstention and contradiction flagging
- Cross-encoder reranking — Jina AI reranker for precise relevance scoring and intelligent abstention
- Temporal awareness — exponential decay scoring, foresight signals with validity windows
- Provenance tracking — every note tagged as FACT or INFERRED with confidence scores
- Forgetting — note lifecycle (Active → Deprecated → Superseded → Archived) with access-based decay
- Embedded by default — LanceDB + SQLite, zero infrastructure.
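The provenance and lifecycle features above might be modeled roughly like this. This is a hypothetical sketch; the type and field names are illustrative, not Karta's actual API:

```rust
// Hypothetical sketch of the note model described above; names are
// illustrative, not Karta's actual types.

/// Whether a note was stored verbatim or produced by background reasoning.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Provenance {
    Fact,
    Inferred,
}

/// Lifecycle states for forgetting: Active → Deprecated → Superseded → Archived.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum NoteStatus {
    Active,
    Deprecated,
    Superseded,
    Archived,
}

pub struct MemoryNote {
    pub content: String,
    pub provenance: Provenance,
    /// Confidence in the note, 0.0..=1.0; verbatim facts start at 1.0.
    pub confidence: f32,
    pub status: NoteStatus,
}
```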
`cargo add karta` and go.
```rust
use karta_core::{Karta, config::KartaConfig};

#[tokio::main]
async fn main() {
    let config = KartaConfig::default();
    let karta = Karta::with_defaults(config).await.unwrap();

    // Add memories
    karta.add_note("Sarah prefers Slack notifications over email").await.unwrap();
    karta.add_note("Brightline requires all integrations on the approved vendor list").await.unwrap();

    // Query with synthesis
    let result = karta.ask("What should I know about Sarah's notification preferences?", 5).await.unwrap();
    println!("{}", result.answer);

    // Run background reasoning
    let dream_run = karta.run_dreaming("workspace", "default").await.unwrap();
    println!("Dreams: {} attempted, {} written", dream_run.dreams_attempted, dream_run.dreams_written);
}
```

- Write path: content → LLM attributes → embed → ANN search → link → evolve → store
- Read path: query → embed → ANN → rerank → multi-hop traverse → synthesize
- Dream path: cluster notes → deduction/induction/abduction/consolidation/contradiction → persist
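The temporal-awareness feature can be pictured as exponential decay over a note's age. The sketch below assumes a half-life parameterization; Karta's actual formula and constants may differ:

```rust
// Illustrative exponential-decay recency score, assuming a half-life
// parameter; not Karta's actual scoring function.

/// Score in (0, 1]: 1.0 for a note written just now, halving every
/// `half_life_hours` of age.
pub fn recency_score(age_hours: f64, half_life_hours: f64) -> f64 {
    0.5_f64.powf(age_hours / half_life_hours)
}
```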
| Layer | Default | Production Options |
|---|---|---|
| Vector + metadata | LanceDB (embedded) | pgvector, Qdrant |
| Graph + state | SQLite (WAL mode) | Postgres, Dolt |
| Provider | Status |
|---|---|
| OpenAI / Azure OpenAI | Built |
| Any OpenAI-compatible (Ollama, vLLM, Groq, Together) | Built |
| Anthropic | Planned |
| Provider | Status |
|---|---|
| Jina AI (cross-encoder) | Built |
| LLM-based (fallback) | Built |
| Noop (disabled) | Built |
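Threshold-based abstention over reranker scores can be sketched as follows. This is a hypothetical illustration, assuming candidates arrive with cross-encoder relevance scores; `filter_or_abstain` is not Karta's API:

```rust
// Hypothetical abstention logic over reranked candidates; names are
// illustrative, not Karta's API.

/// Keep candidates whose relevance score clears the threshold. Returning
/// `None` signals the system should abstain rather than answer from
/// irrelevant context.
pub fn filter_or_abstain(
    scored: Vec<(String, f32)>,
    threshold: f32,
) -> Option<Vec<String>> {
    let kept: Vec<String> = scored
        .into_iter()
        .filter(|(_, score)| *score >= threshold)
        .map(|(text, _)| text)
        .collect();
    if kept.is_empty() { None } else { Some(kept) }
}
```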
```sh
cp .env.example .env
# Fill in your LLM credentials
```

```toml
# Per-operation model flexibility
[llm.default]
provider = "openai"
model = "gpt-4o-mini"

[llm.dream.abduction]
model = "claude-sonnet-4-6"

# Reranker
[reranker]
enabled = true
abstention_threshold = 0.1
```

BEAM 100K — 20 conversations, 400 questions, single run; scores are the arithmetic mean of per-question rubric scores with the BEAM nugget LLM judge:
| Ability | P1 (2026-04-14) |
|---|---|
| Preference Following | 92% |
| Contradiction Resolution | 74% |
| Temporal Reasoning | 71% |
| Instruction Following | 69% |
| Summarization | 66% |
| Multi-session Reasoning | 65% |
| Abstention | 62% |
| Information Extraction | 59% |
| Knowledge Update | 43% |
| Event Ordering | 35% |
| Overall | 61.6% |
Up from the 53.0% P0 baseline (+8.6pp). P1 fixes: expand-then-rerank retrieval, removing the synthesis-level abstention gate, and scoring recency against source note timestamps instead of dream write time.
Active development targets 90%+ via atomic fact decomposition, episode-aware retrieval, and improved write-time organization. See `benchmarks/beam-100k.md` for the full experiment log, failure catalogue, and per-ability history.
```text
karta/
├── crates/
│   ├── karta-core/          # Core engine (Rust)
│   │   ├── src/
│   │   │   ├── note.rs      # MemoryNote, Provenance, NoteStatus
│   │   │   ├── write.rs     # Write path (index, link, evolve)
│   │   │   ├── read.rs      # Search, multi-hop traversal, synthesis
│   │   │   ├── rerank.rs    # Cross-encoder reranking + abstention
│   │   │   ├── dream/       # Dream engine (5 inference types)
│   │   │   ├── store/       # LanceDB + SQLite implementations
│   │   │   └── llm/         # Provider trait + OpenAI + structured output
│   │   └── tests/           # Eval suites + BEAM/LOCOMO/LongMem harnesses
│   └── karta-cli/           # CLI (planned)
├── data/                    # Benchmark preprocessing scripts
├── Cargo.toml
└── README.md
```
```sh
# Run tests (mock LLM, no API keys needed)
cargo test

# Run real eval (requires .env credentials)
cargo test --test real_eval -- --ignored --nocapture

# Run BEAM benchmark
BEAM_DATASET_PATH=data/beam-100k.json cargo test --test beam_100k beam_100k_single -- --ignored --nocapture
```

- `benchmarks/` — benchmark results, reproduction commands, experiment logs
- `docs/landscape.md` — AI memory systems landscape research
- `docs/retrieval-plan.md` — open retrieval experiment backlog
- `CONTRIBUTING.md` — how to contribute
- `CHANGELOG.md` — release notes
MIT