Skip to content

Refactor: separate package-format detection from classification and normalization strategy #782

@danielmeppiel

Description

@danielmeppiel

Follow-up from #781 (which fixed the immediate ordering bug behind #780).

Problem

detect_package_type() in src/apm_cli/models/validation.py is a positional-priority if/elif cascade that conflates three concerns:

  1. Evidence gathering -- did we see apm.yml, SKILL.md, hooks/*.json, plugin.json, agents/skills/commands/?
  2. Classification -- which PackageType enum value to assign.
  3. Normalization-strategy selection -- which downstream synthesizer to run (e.g. normalize_plugin_directory for MARKETPLACE_PLUGIN).

The cascade's first-match-wins semantics means every new bundled-format introduces a potential ordering bug. #780 was the latest one; the next one is just a matter of time.

Proposed shape

Move to a composition model:

  • FormatDetector interface, one impl per format: ApmYmlDetector, SkillMdDetector, HookJsonDetector, ClaudePluginDetector. Each returns Optional[FormatEvidence] independent of the others.
  • PackageFormatRegistry runs all detectors and produces a DetectionReport (set of evidences -- not a single PackageType).
  • NormalizationPlanner reads the report and decides which normalizers to run. Mixed packages become first-class (e.g. a Claude plugin that legitimately ships hooks runs both the plugin synthesizer AND the hooks pass-through, instead of being shoehorned into one classification).

The DetectionEvidence dataclass + gather_detection_evidence() helper landed in #781 are intentionally a step in this direction -- they decouple the evidence scan from classification so observability code can already use the richer payload.

Benefits

  • New formats are additive -- no ordering bugs possible.
  • Mixed packages are first-class (today's implicit "MARKETPLACE_PLUGIN happens to also handle hooks" becomes explicit).
  • The DetectionReport doubles as the observability payload (verbose detection traces, deploy-summary labels, near-miss warnings).
  • The classification logic stops being a positional puzzle.

Cost

Medium -- touches models/validation.py, install/sources.py, plus all consumers of PackageType (lockfile recording, install summary, drift detection). Should land behind a feature flag for one release if downstream test churn is high.

Out of scope for this issue

  • Tightening _has_hook_json to only count files matching the actual Claude hooks schema (today: any *.json in hooks/).
  • Introducing an explicit MIXED PackageType -- the DetectionReport model makes the enum redundant for mixed cases.

Related

Metadata

Metadata

Assignees

No one assigned

    Labels

    area/cliCLI command surface, flags, help text (cross-cutting).area/testingTest infrastructure, fixtures, e2e harness, coverage.priority/lowAccepted but not time-sensitivestatus/acceptedDirection approved, safe to start work.status/triagedInitial agentic triage complete; pending maintainer ratification (silence = approval).type/refactorInternal restructure, no behavior change.

    Type

    No type

    Projects

    Status

    Todo

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions