perf(graph): guard community detection against large entity counts #1265

@bug-ops

Description

Context

Phase 5 Community Detection (epic #1222, issue #1228) loads all entities and edges into memory before building a petgraph graph for label propagation.

Problem

detect_communities calls store.all_entities() and all_active_edges_stream().try_collect(), fully materializing both tables into Vecs. Memory usage therefore scales with the number of entities and edges and has no upper bound, which becomes a concern at >5K entities. Peak heap includes the entities Vec, the edges Vec, edge_facts_map (cloned facts), and the petgraph adjacency structures.

Suggested Fix

Add a hard cap check before loading: if entity_count() > COMMUNITY_DETECTION_MAX (e.g., 10K), skip community detection and log a warning. Alternatively, build the petgraph incrementally from the stream without collecting.
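A minimal sketch of both options, under stated assumptions: the function names and signatures below are hypothetical, the 10K value is only the example figure from this issue, and the adjacency map is a std-only stand-in for the petgraph structure rather than the actual implementation. The real code would use its own logging and consume the async edge stream instead of a plain iterator.

```rust
use std::collections::HashMap;

/// Example cap taken from the "e.g., 10K" figure above; the real value is undecided.
const COMMUNITY_DETECTION_MAX: usize = 10_000;

/// Option 1: decide whether to run detection from a cheap count,
/// before any entities or edges are loaded.
/// `entity_count` is assumed to come from something like `store.entity_count()`.
fn should_run_community_detection(entity_count: usize) -> bool {
    if entity_count > COMMUNITY_DETECTION_MAX {
        // Placeholder for the project's real warning log.
        eprintln!(
            "skipping community detection: {} entities exceeds cap {}",
            entity_count, COMMUNITY_DETECTION_MAX
        );
        return false;
    }
    true
}

/// Option 2: fold edges into an undirected adjacency map one at a time,
/// so no intermediate Vec of all edges is ever held.
/// Entity ids are modeled as `u64` purely for illustration.
fn build_adjacency<I>(edges: I) -> HashMap<u64, Vec<u64>>
where
    I: IntoIterator<Item = (u64, u64)>,
{
    let mut adjacency: HashMap<u64, Vec<u64>> = HashMap::new();
    for (source, target) in edges {
        // Record the edge in both directions.
        adjacency.entry(source).or_default().push(target);
        adjacency.entry(target).or_default().push(source);
    }
    adjacency
}
```

The guard is the smaller change, since it only needs a count query ahead of the existing load. The incremental build removes the edges Vec from peak heap but still keeps the full graph in memory, so the two options can be combined.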

Source

PERF-01 from Phase 5 performance analysis.

Metadata

Labels

enhancement, performance
