
feat(memory): community detection with label propagation #1228

@bug-ops


Summary

Phase 5 of graph memory (#1222): automatic community detection using label propagation, followed by LLM-generated community summaries.

Depends on: #1227 (background extraction)

Tasks

1. petgraph Dependency

Add petgraph = { workspace = true, optional = true } to crates/zeph-memory/Cargo.toml. Update feature: graph-memory = ["dep:petgraph"]. Add petgraph = "0.7" to workspace dependencies.

2. Community Detection (graph/community.rs)

detect_communities(store: &GraphStore) -> Result<Vec<DetectedCommunity>>:

  1. Stream all entities via all_entities_stream() and build the petgraph graph incrementally
  2. Load all active edges and add them to the graph
  3. Run label propagation: each node adopts the plurality label among its neighbors, repeating until no label changes or a maximum iteration count is reached
  4. Group entities by community label
  5. Return clusters with entity IDs
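
The propagation loop in step 3 can be sketched as below. This is a minimal illustration, not the planned implementation: it uses plain adjacency lists from the standard library instead of petgraph, and EntityId, label_propagation, and the tie-breaking rule are assumptions. Ties keep the current label when possible and otherwise pick the smallest label, which makes the result reproducible; implementations usually randomize node order and tie-breaking, because a deterministic rule can merge clusters that are joined by only a single edge.

```rust
use std::collections::{BTreeMap, HashMap};

/// Hypothetical entity identifier; the real GraphStore type may differ.
type EntityId = i64;

/// Groups `nodes` into communities by label propagation over undirected `edges`.
/// Returns a map from community label to the entity IDs carrying that label.
pub fn label_propagation(
    nodes: &[EntityId],
    edges: &[(EntityId, EntityId)],
    max_iterations: usize,
) -> BTreeMap<EntityId, Vec<EntityId>> {
    // Map each entity ID to its position so adjacency can be a plain Vec.
    let index: HashMap<EntityId, usize> =
        nodes.iter().enumerate().map(|(i, &id)| (id, i)).collect();
    let mut adjacency: Vec<Vec<usize>> = vec![Vec::new(); nodes.len()];
    for &(a, b) in edges {
        if let (Some(&i), Some(&j)) = (index.get(&a), index.get(&b)) {
            if i != j {
                adjacency[i].push(j);
                adjacency[j].push(i);
            }
        }
    }

    // Every node starts in its own community, labelled with its own ID.
    let mut labels: Vec<EntityId> = nodes.to_vec();

    for _ in 0..max_iterations {
        let mut changed = false;
        for node in 0..nodes.len() {
            // Count how often each label occurs among the neighbors.
            let mut counts: BTreeMap<EntityId, usize> = BTreeMap::new();
            for &neighbor in &adjacency[node] {
                *counts.entry(labels[neighbor]).or_insert(0) += 1;
            }
            let Some(&top) = counts.values().max() else {
                continue; // isolated node keeps its own label
            };
            // Keep the current label if it already ties for the plurality.
            if counts.get(&labels[node]) == Some(&top) {
                continue;
            }
            // BTreeMap iterates in ascending key order, so this picks the
            // smallest label among those with the top count.
            let best = counts
                .iter()
                .find(|&(_, &count)| count == top)
                .map(|(&label, _)| label)
                .unwrap();
            labels[node] = best;
            changed = true;
        }
        if !changed {
            break; // converged: a full pass changed nothing
        }
    }

    // Group entity IDs by their final label.
    let mut communities: BTreeMap<EntityId, Vec<EntityId>> = BTreeMap::new();
    for (i, &id) in nodes.iter().enumerate() {
        communities.entry(labels[i]).or_default().push(id);
    }
    communities
}
```

Labels are updated in place during a pass (asynchronous updates), so later nodes in the same pass already see the new labels of earlier ones.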

3. Community Summarization

For each detected community:

  1. Collect entity names + summaries + key edges
  2. Call LLM: "Summarize this cluster of related entities and their relationships in 1-2 sentences"
  3. Upsert community in SQLite
  4. Embed community name+summary → store in zeph_communities Qdrant collection
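
As a rough illustration of steps 1 and 2, the prompt could be assembled from the collected entities and edges like this. EntityInfo, EdgeInfo, build_summary_prompt, and the prompt layout are all hypothetical; only the instruction sentence comes from the task description, and the LLM call itself is left out.

```rust
/// Hypothetical view of an entity; field names are assumptions.
pub struct EntityInfo {
    pub name: String,
    pub summary: String,
}

/// Hypothetical view of an edge between two named entities.
pub struct EdgeInfo {
    pub source: String,
    pub relation: String,
    pub target: String,
}

/// Builds the summarization prompt for one community.
pub fn build_summary_prompt(entities: &[EntityInfo], edges: &[EdgeInfo]) -> String {
    let mut prompt = String::from(
        "Summarize this cluster of related entities and their relationships \
         in 1-2 sentences.\n\nEntities:\n",
    );
    // One line per entity: name plus its stored summary.
    for entity in entities {
        prompt.push_str(&format!("- {}: {}\n", entity.name, entity.summary));
    }
    prompt.push_str("\nRelationships:\n");
    // One line per key edge, rendered as "source relation target".
    for edge in edges {
        prompt.push_str(&format!(
            "- {} {} {}\n",
            edge.source, edge.relation, edge.target
        ));
    }
    prompt
}
```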

4. Incremental Update

When a new entity is added (during extraction):

  • Assign it to the community held by a plurality of its neighbors (if it has any neighbors)
  • If it has no neighbors yet, leave it unassigned

A full community refresh is triggered when extraction_count % community_refresh_interval == 0.
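
A minimal sketch of the incremental assignment and the refresh trigger, assuming the caller has already looked up the community of each neighbor. CommunityId, assign_community, should_refresh, the smallest-ID tie-break, and the guard against a zero interval are assumptions, not part of the plan.

```rust
use std::collections::BTreeMap;

/// Hypothetical community identifier.
type CommunityId = i64;

/// Returns the community held by a plurality of the new entity's neighbors.
/// Each slice entry is one neighbor; `None` marks an unassigned neighbor.
/// Returns `None` when there are no assigned neighbors.
pub fn assign_community(neighbor_communities: &[Option<CommunityId>]) -> Option<CommunityId> {
    let mut counts: BTreeMap<CommunityId, usize> = BTreeMap::new();
    for community in neighbor_communities.iter().flatten() {
        *counts.entry(*community).or_insert(0) += 1;
    }
    let top = counts.values().copied().max()?;
    // Ascending key order: ties resolve to the smallest community ID.
    counts
        .into_iter()
        .find(|&(_, count)| count == top)
        .map(|(id, _)| id)
}

/// Mirrors `extraction_count % community_refresh_interval == 0`.
/// Treating an interval of 0 as "never refresh" is an assumption that
/// avoids a division by zero.
pub fn should_refresh(extraction_count: u64, community_refresh_interval: u64) -> bool {
    community_refresh_interval != 0 && extraction_count % community_refresh_interval == 0
}
```

Note that the modulo condition as written also holds for extraction_count == 0, so a counter that starts at zero would trigger a refresh before anything has been extracted.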

5. Graph Eviction

Run during community refresh:

  • Delete expired edges older than 90 days
  • Delete orphan entities (no active edges and last_seen more than 30 days ago)
  • Enforce an entity count cap (configurable, default unlimited for MVP)
  • Clean up stale Qdrant entity embeddings for deleted entities
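
The two age-based rules can be expressed as simple predicates. The sketch below makes several assumptions: timestamps are Unix seconds, an edge is active while expired_at is None, the 90 days are counted from the expiry time, and the struct and function names are hypothetical. The IDs returned by orphan_ids would also be the ones whose Qdrant embeddings need deleting.

```rust
use std::collections::HashSet;

const SECS_PER_DAY: u64 = 86_400;
const EDGE_RETENTION_DAYS: u64 = 90;
const ORPHAN_RETENTION_DAYS: u64 = 30;

/// Hypothetical edge row; `expired_at` is `None` while the edge is active.
pub struct Edge {
    pub source: i64,
    pub target: i64,
    pub expired_at: Option<u64>,
}

/// Hypothetical entity row.
pub struct Entity {
    pub id: i64,
    pub last_seen: u64,
}

/// True when the edge expired more than 90 days before `now`.
pub fn edge_is_evictable(edge: &Edge, now: u64) -> bool {
    match edge.expired_at {
        Some(expired) => now.saturating_sub(expired) > EDGE_RETENTION_DAYS * SECS_PER_DAY,
        None => false, // active edges are never evicted
    }
}

/// IDs of entities with no active edges that were last seen over 30 days ago.
pub fn orphan_ids(entities: &[Entity], edges: &[Edge], now: u64) -> Vec<i64> {
    // Every entity touched by at least one active edge.
    let connected: HashSet<i64> = edges
        .iter()
        .filter(|edge| edge.expired_at.is_none())
        .flat_map(|edge| [edge.source, edge.target])
        .collect();
    entities
        .iter()
        .filter(|entity| {
            !connected.contains(&entity.id)
                && now.saturating_sub(entity.last_seen) > ORPHAN_RETENTION_DAYS * SECS_PER_DAY
        })
        .map(|entity| entity.id)
        .collect()
}
```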

6. Community Search

search_communities(query: &str, qdrant: &EmbeddingStore, limit: usize): embeds the query, searches the zeph_communities collection, and returns the matching community summaries.
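
To make the ranking step concrete, here is an in-memory stand-in for the vector search. In the actual design Qdrant performs the similarity search, so cosine_similarity and top_communities are purely illustrative, and the choice of cosine similarity as the metric is an assumption.

```rust
/// Cosine similarity of two vectors; returns 0.0 if either has zero length.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        0.0
    } else {
        dot / (norm_a * norm_b)
    }
}

/// Returns the summaries of the `limit` communities whose embeddings are
/// most similar to `query_embedding`, best match first.
/// Each community is a (summary, embedding) pair.
pub fn top_communities<'a>(
    query_embedding: &[f32],
    communities: &'a [(String, Vec<f32>)],
    limit: usize,
) -> Vec<&'a str> {
    let mut scored: Vec<(f32, &'a str)> = communities
        .iter()
        .map(|(summary, embedding)| {
            (cosine_similarity(query_embedding, embedding), summary.as_str())
        })
        .collect();
    // Sort by descending similarity; total_cmp gives a total order on f32.
    scored.sort_by(|a, b| b.0.total_cmp(&a.0));
    scored.into_iter().take(limit).map(|(_, summary)| summary).collect()
}
```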

Architecture Reference

See .local/plan/graph-memory-architecture.md Section 7 for label propagation details, eviction policy, and Qdrant cleanup.

Acceptance Criteria

  • Label propagation produces meaningful clusters on a test graph
  • Community summaries generated and stored
  • Incremental assignment works for new entities
  • Full refresh triggered by extraction counter
  • Expired edges and orphan entities cleaned up
  • Stale Qdrant embeddings removed
  • ~10 tests (8 unit + 2 integration)


Labels

  • enhancement: New feature or request
  • graph-memory: Knowledge graph memory feature
  • memory: zeph-memory crate (SQLite)
