feat(memory): community detection with label propagation #1228
Description
Summary
Phase 5 of graph memory (#1222): automatic community detection and summary generation using label propagation.
Depends on: #1227 (background extraction)
Tasks
1. petgraph Dependency
Add petgraph = { workspace = true, optional = true } to crates/zeph-memory/Cargo.toml. Update feature: graph-memory = ["dep:petgraph"]. Add petgraph = "0.7" to workspace dependencies.
2. Community Detection (graph/community.rs)
detect_communities(store: &GraphStore) -> Result<Vec<DetectedCommunity>>:
- Stream all entities via all_entities_stream() and build the petgraph graph incrementally
- Load all active edges and add them to the graph
- Run label propagation (iterative: each node adopts plurality label of neighbors, until convergence or max iterations)
- Group entities by community label
- Return clusters with entity IDs
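The propagation loop described above can be sketched roughly as follows. This is a dependency-free illustration, not the proposed implementation: it uses a plain adjacency map instead of petgraph, and the names label_propagation and EntityId, the i64 ID type, the in-place update order, and the "smallest label wins a tie" rule are all assumptions made for the example.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical entity identifier; the real store may use a different type.
type EntityId = i64;

/// Groups nodes into communities with label propagation.
/// Returns a map from community label to its sorted member IDs.
pub fn label_propagation(
    nodes: &[EntityId],
    edges: &[(EntityId, EntityId)],
    max_iterations: usize,
) -> BTreeMap<EntityId, Vec<EntityId>> {
    // Build an undirected adjacency list; skip self-loops and unknown endpoints.
    let mut adj: HashMap<EntityId, Vec<EntityId>> =
        nodes.iter().map(|&n| (n, Vec::new())).collect();
    for &(a, b) in edges {
        if a == b || !adj.contains_key(&a) || !adj.contains_key(&b) {
            continue;
        }
        adj.get_mut(&a).unwrap().push(b);
        adj.get_mut(&b).unwrap().push(a);
    }

    // Every node starts in its own community.
    let mut labels: HashMap<EntityId, EntityId> = nodes.iter().map(|&n| (n, n)).collect();

    // Fixed visiting order keeps the result deterministic.
    let mut order: Vec<EntityId> = nodes.to_vec();
    order.sort_unstable();
    order.dedup();

    for _ in 0..max_iterations {
        let mut changed = false;
        for &node in &order {
            // Count how often each label occurs among the neighbors.
            let mut counts: BTreeMap<EntityId, usize> = BTreeMap::new();
            for nb in &adj[&node] {
                *counts.entry(labels[nb]).or_insert(0) += 1;
            }
            // Pick the plurality label; ties go to the smallest label (assumption).
            let mut best: Option<(EntityId, usize)> = None;
            for (&label, &count) in &counts {
                if best.map_or(true, |(_, c)| count > c) {
                    best = Some((label, count));
                }
            }
            // Nodes without neighbors keep their own label.
            if let Some((label, _)) = best {
                if labels[&node] != label {
                    labels.insert(node, label);
                    changed = true;
                }
            }
        }
        // Converged: a full pass changed nothing.
        if !changed {
            break;
        }
    }

    // Group entities by their final label.
    let mut communities: BTreeMap<EntityId, Vec<EntityId>> = BTreeMap::new();
    for &node in &order {
        communities.entry(labels[&node]).or_default().push(node);
    }
    communities
}
```

On two disconnected triangles plus one isolated node, this sketch yields three groups: one per triangle and a singleton. The max_iterations bound matters because a deterministic tie-break can oscillate on some graphs.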
3. Community Summarization
For each detected community:
- Collect entity names + summaries + key edges
- Call LLM: "Summarize this cluster of related entities and their relationships in 1-2 sentences"
- Upsert community in SQLite
- Embed community name + summary → store in the zeph_communities Qdrant collection
4. Incremental Update
When new entity is added (during extraction):
- Assign to community of plurality neighbors (if neighbors exist)
- If no neighbors yet, leave unassigned
Full community refresh triggered when extraction_count % community_refresh_interval == 0.
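A minimal sketch of the two rules above, assuming neighbor communities are already loaded as optional IDs. The function names, the CommunityId type, the tie-break on the smallest ID, and treating an interval of 0 as "refresh disabled" are illustrative assumptions, not part of the plan.

```rust
use std::collections::BTreeMap;

/// Hypothetical community identifier.
type CommunityId = i64;

/// Returns the community held by the plurality of neighbors,
/// or None when no neighbor has a community yet (entity stays unassigned).
pub fn assign_by_plurality(neighbor_communities: &[Option<CommunityId>]) -> Option<CommunityId> {
    let mut counts: BTreeMap<CommunityId, usize> = BTreeMap::new();
    // Unassigned neighbors (None) do not vote.
    for c in neighbor_communities.iter().flatten() {
        *counts.entry(*c).or_insert(0) += 1;
    }
    // Ties go to the smallest community ID (assumption).
    let mut best: Option<(CommunityId, usize)> = None;
    for (&id, &count) in &counts {
        if best.map_or(true, |(_, c)| count > c) {
            best = Some((id, count));
        }
    }
    best.map(|(id, _)| id)
}

/// Mirrors `extraction_count % community_refresh_interval == 0`.
/// An interval of 0 is treated as "never refresh" to avoid a division by zero (assumption).
pub fn should_refresh(extraction_count: u64, community_refresh_interval: u64) -> bool {
    community_refresh_interval != 0 && extraction_count % community_refresh_interval == 0
}
```

Note that the modulo condition as written also holds when extraction_count is 0, so a caller would probably check it after incrementing the counter.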
5. Graph Eviction
Run during community refresh:
- Delete expired edges older than 90 days
- Delete orphan entities (no active edges, last_seen more than 30 days ago)
- Entity count cap (configurable, default unlimited for MVP)
- Cleanup stale Qdrant entity embeddings for deleted entities
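The first two eviction rules can be expressed as simple predicates. The sketch below assumes timestamps are Unix seconds, that an edge carries an optional expired_at timestamp, and that "older than 90 days" is measured from that expiry. None of these details are confirmed by the plan, and the function names are hypothetical.

```rust
/// Seconds in one day.
pub const DAY_SECS: i64 = 86_400;

/// True when an edge expired more than 90 days before `now`.
/// Active edges (no expiry timestamp) are never evicted.
pub fn edge_is_evictable(expired_at: Option<i64>, now: i64) -> bool {
    match expired_at {
        Some(t) => now - t > 90 * DAY_SECS,
        None => false,
    }
}

/// True when an entity has no active edges and was last seen
/// more than 30 days before `now`.
pub fn entity_is_orphan(active_edge_count: usize, last_seen: i64, now: i64) -> bool {
    active_edge_count == 0 && now - last_seen > 30 * DAY_SECS
}
```

IDs of entities that match entity_is_orphan would then be the ones whose Qdrant embeddings need cleanup after deletion.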
6. Community Search
search_communities(query: &str, qdrant: &EmbeddingStore, limit: usize) — embed query, search zeph_communities collection, return community summaries.
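To illustrate what the search step returns, here is an in-memory stand-in that ranks community summaries by cosine similarity to a query vector. It does not call Qdrant or an embedding model: the vectors are assumed to be precomputed, and cosine and top_communities are hypothetical helper names used only for this example.

```rust
/// Cosine similarity of two vectors; returns 0.0 if either has zero length.
pub fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na * nb)
    }
}

/// Returns up to `limit` community summaries, most similar first.
/// `communities` pairs each summary with its (precomputed) embedding.
pub fn top_communities<'a>(
    query: &[f32],
    communities: &'a [(String, Vec<f32>)],
    limit: usize,
) -> Vec<&'a str> {
    let mut scored: Vec<(f32, &'a str)> = communities
        .iter()
        .map(|(summary, vector)| (cosine(query, vector), summary.as_str()))
        .collect();
    // Sort by descending similarity.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.into_iter().take(limit).map(|(_, s)| s).collect()
}
```

In the real search_communities, the vector store would presumably perform this ranking itself. The sketch only shows the expected shape of the result.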
Architecture Reference
See .local/plan/graph-memory-architecture.md Section 7 for label propagation details, eviction policy, and Qdrant cleanup.
Acceptance Criteria
- Label propagation produces meaningful clusters on test graph
- Community summaries generated and stored
- Incremental assignment works for new entities
- Full refresh triggered by extraction counter
- Expired edges and orphan entities cleaned up
- Stale Qdrant embeddings removed
- ~10 tests (8 unit + 2 integration)