Skip to content

[FEATURE] Performance optimization for deep dependency trees #171

@sergio-sisternes-epam

Description

@sergio-sisternes-epam

Is your feature request related to a problem? Please describe.

As APM adoption grows and packages develop deeper transitive dependency trees, several algorithmic bottlenecks in the install and compile paths become significant:

Bottleneck Location Current Complexity Impact
Primitive conflict detection primitives/models.py _add_with_conflict_detection() O(m²) — linear list scan per add Scales quadratically with primitive count
Cycle detection path tracking deps/apm_resolver.py detect_circular_dependencies() O(V×D)in operator on List[str] Degrades with tree depth
get_nodes_at_depth() during flatten deps/dependency_graph.py O(V × max_depth) — full node scan per depth level Repeated scans across 50 depth levels
from_apm_yml() YAML parsing models/apm_package.py 25 call sites, 0 caching — same files parsed 20-50× per operation Unnecessary disk I/O and CPU
get_config() reads config.py 5-10 disk reads per operation, no caching Redundant file I/O
read_constitution() compilation/constitution.py Read 3× per compile from claude_formatter.py and injector.py Redundant file I/O
Inheritance chain computation compilation/context_optimizer.py _get_inheritance_chain() Recomputed every call, no caching Redundant O(D) walks per instruction
Package downloads cli.py install loop Fully sequential — O(N×T) wall-clock 10 pkgs × 5s = 50s vs ~10s parallel
GitHub API resilience deps/github_downloader.py No 429/rate-limit handling Fails immediately under throttling

What works well today: BFS tree building uses deque + Set for O(1) dedup (queued_keys), glob pattern results are cached in ContextOptimizer, flattening uses Dict-based FlatDependencyMap with O(1) lookups, and output assembly uses list.append() + str.join() (not string concatenation). These should be preserved.

Describe the solution you'd like

A phased performance improvement plan with no breaking changes. All optimizations are internal — same inputs produce identical outputs. Lock file format v1 preserved.

Phase 1 — Performance Baseline (Benchmarks & Fixtures)

Entry gate — must complete before optimization work begins.

  • Add pytest-benchmark test suite in tests/benchmarks/ covering: dependency resolution (50+ packages, 5 depth levels), compilation (200+ primitives, 100+ directories), primitive conflict detection (500+ primitives), from_apm_yml() parse throughput, inheritance chain computation
  • Add deep dependency tree synthetic fixtures in tests/fixtures/: 100 packages, 10 depth levels, diamond dependencies, large primitive counts
  • Record baseline measurements in tests/benchmarks/BASELINE.md

Phase 2 — Data Structure Fixes (High Impact, Low Risk)

Can proceed in parallel with Phase 3.

  • Conflict detection O(m²) → O(m): Add Dict[str, int] name→index alongside each primitive list in PrimitiveCollection. Check dict (O(1)) instead of scanning for i, existing in enumerate(collection). File: src/apm_cli/primitives/models.py
  • Cycle detection O(V×D) → O(V+E): Add companion current_path_set: Set[str] in detect_circular_dependencies(). Use set for O(1) in checks, keep list only for cycle-path reporting. File: src/apm_cli/deps/apm_resolver.py
  • Depth-index for flatten: Pre-compute Dict[int, List[DependencyNode]] in DependencyTree on node insertion. get_nodes_at_depth() becomes O(1). File: src/apm_cli/deps/dependency_graph.py

Phase 3 — Caching Layer (High Impact, Medium Risk)

Can proceed in parallel with Phase 2.

  • from_apm_yml() parse cache: Module-level Dict[Path, APMPackage] keyed by resolved absolute path. No invalidation needed within a CLI invocation. Add clear_apm_yml_cache() for test isolation. File: src/apm_cli/models/apm_package.py
  • get_config() cache: Read-once-per-process module-level cache. File: src/apm_cli/config.py
  • read_constitution() cache: Cache after first read. File: src/apm_cli/compilation/constitution.py
  • Inheritance chain cache: Dict[Path, List[Path]] in ContextOptimizer, populated on first call per directory. File: src/apm_cli/compilation/context_optimizer.py

Phase 4 — Network Parallelization (Highest Wall-Clock Impact, Higher Risk)

Depends on Phases 2-3 stability.

  • Parallel package downloads: concurrent.futures.ThreadPoolExecutor (default 4 workers) for the flat install list after dependency resolution. Thread-safe Rich progress bars, collect-all-errors pattern, cap at 4-6 concurrent clones. Files: src/apm_cli/cli.py, src/apm_cli/deps/github_downloader.py
  • Git sparse-checkout for subdirectory packages: Use git sparse-checkout (git 2.25+) with full-clone fallback for virtual subdirectory packages. File: src/apm_cli/deps/github_downloader.py
  • (Stretch) Pre-fetch next BFS depth level in parallel during transitive resolution

Phase 5 — Resilience & Robustness (Medium Impact, Low Risk)

Independent of other phases.

  • Rate limit handling: Inspect X-RateLimit-Remaining and Retry-After headers. Exponential backoff with max 3 retries for 429/503. File: src/apm_cli/deps/github_downloader.py
  • Skip-if-exists optimization: When lockfile has a commit SHA and local clone HEAD matches, skip re-download. Files: src/apm_cli/deps/github_downloader.py, src/apm_cli/cli.py

Describe alternatives you've considered

  • Async/await rewrite: Full migration to asyncio for network I/O. Rejected as too invasive for the current stage — ThreadPoolExecutor achieves most of the parallel download benefit with far less refactoring.
  • Database-backed caching (SQLite): Persistent cross-invocation cache for parsed manifests. Rejected as over-engineering — CLI invocations are short-lived, and simple in-memory caches per process are sufficient.
  • CDN-based package hosting: Pre-built package tarballs on a CDN instead of git clones. Deferred — requires infrastructure outside APM's scope and git-based resolution works well for the current package scale.
  • npm-style global cache directory: A ~/.apm/cache/ shared across all projects. Considered for Phase 5 skip-if-exists but deferred — lockfile-based local dedup handles the common case.

Additional context

Scope boundaries:

  • All changes are internal — no CLI flag changes (except potentially --parallel-downloads N in Phase 4), no API changes, no lock file format changes.
  • Backward compatibility is a hard requirement: identical inputs must produce byte-identical outputs.
  • Phase 1 benchmarks serve as the regression gate for all subsequent phases.

Dependency graph for phases:

Phase 1 (baseline) ──┬──> Phase 2 (data structures) ──┬──> Phase 4 (parallelization)
                      └──> Phase 3 (caching)     ──────┘
                                                        Phase 5 (resilience) [independent]

Key files involved:

File Concern
src/apm_cli/primitives/models.py O(m²) conflict detection
src/apm_cli/deps/apm_resolver.py Cycle detection, BFS builder, flatten
src/apm_cli/deps/dependency_graph.py get_nodes_at_depth(), tree structures
src/apm_cli/models/apm_package.py Uncached from_apm_yml() (25 call sites)
src/apm_cli/config.py Uncached get_config()
src/apm_cli/compilation/constitution.py Uncached read_constitution()
src/apm_cli/compilation/context_optimizer.py Uncached inheritance chains
src/apm_cli/cli.py Sequential install loop
src/apm_cli/deps/github_downloader.py No parallelism, no rate limits

Design constraint — cache invalidation: The "no invalidation within a process" strategy is intentional. APM is a short-lived CLI tool; files don't change mid-operation. If APM ever becomes a long-running service, this must be revisited.

Thread safety note for Phase 4: Rich progress bars and GitPython may need thread-safe wrappers. A spike/prototype is recommended before committing to ThreadPoolExecutor. Alternative approach: asyncio + subprocess for git operations.

Metadata

Metadata

Assignees

No one assigned

    Labels

    acceptedDeprecated: use status/accepted. Kept for issue history; will be removed in milestone 0.10.0.enhancementDeprecated: use type/feature. Kept for issue history; will be removed in milestone 0.10.0.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions