Integrate copilot runtime #2

Merged
danielmeppiel merged 9 commits into main from integrate-copilot-runtime
Sep 25, 2025

@danielmeppiel
Collaborator

Documentation and User Guidance Updates:

  • Updated all setup instructions and examples in README.md, docs/getting-started.md, docs/cli-reference.md, docs/runtime-integration.md, and related files to recommend and default to apm runtime setup copilot instead of Codex CLI. This includes new explanations, usage examples, and troubleshooting steps for Copilot CLI.

Copilot CLI Runtime Integration:

  • Added a new script scripts/runtime/setup-copilot.sh to automate installation, configuration, and environment setup for the GitHub Copilot CLI, including token detection, MCP config directory creation, and prerequisite checks for Node.js and npm versions.

Token Management and CLI Logic:

  • Updated src/apm_cli/core/token_manager.py to clarify token precedence for Copilot CLI, remove unused npm token handling, and document the new recommended token environment variables.
  • Adjusted the CLI (src/apm_cli/cli.py) to prioritize Copilot CLI when installing MCP dependencies, ensuring Copilot is checked and suggested before Codex or VSCode.

These changes make Copilot CLI the default and best-supported AI runtime for APM, streamline the onboarding experience, and ensure all documentation and automation scripts are consistent and up to date.

- Add src/apm_cli/adapters/client/copilot.py: MCP client adapter for Copilot CLI
- Add src/apm_cli/runtime/copilot_runtime.py: Runtime adapter for Copilot CLI execution
- Add scripts/runtime/setup-copilot.sh: Copilot CLI installation script

These are zero-risk additions as they are new files that don't modify existing code.
Ready for Phase 2: Runtime infrastructure integration.
- Update runtime factory to register CopilotRuntime as first priority
- Add copilot to supported runtimes in runtime manager
- Update runtime preference order: copilot → codex → llm
- Add npm-based removal logic for copilot runtime
- Export CopilotRuntime in __init__.py

Low-risk changes that integrate the core Copilot files into the runtime system.
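
The preference order listed above (copilot, then codex, then llm) can be read as a first-match lookup. The sketch below is illustrative only and is not the project's runtime factory: `RUNTIME_PREFERENCE`, `pick_runtime`, and the `is_available` probe are hypothetical names, and treating "an executable with that name is on PATH" as the availability test is an assumption.

```python
import shutil
from typing import Callable, Optional, Sequence

# Preference order stated in the commit message: copilot, then codex, then llm.
RUNTIME_PREFERENCE: Sequence[str] = ("copilot", "codex", "llm")


def _on_path(name: str) -> bool:
    """Assumed probe: a runtime counts as available if its executable is on PATH."""
    return shutil.which(name) is not None


def pick_runtime(is_available: Callable[[str], bool] = _on_path) -> Optional[str]:
    """Return the first available runtime in preference order, or None."""
    for name in RUNTIME_PREFERENCE:
        if is_available(name):
            return name
    return None
```

Passing the probe as a parameter keeps the ordering logic testable without any runtime installed, for example `pick_runtime(lambda name: name == "codex")`.
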
- Verified 'apm install --runtime' option includes copilot first
- Confirmed 'apm runtime setup copilot' command works
- Verified runtime status shows copilot as highest priority
- Runtime detection logic already prioritizes copilot correctly
- Error messages already mention copilot CLI installation

No additional changes needed - CLI integration already complete from clean-main branch.
- Verified existing tests already support copilot runtime
- Added comprehensive test_copilot_runtime.py with 12 test cases
- Tests cover runtime detection, initialization, execution, error handling
- All existing runtime factory and detection tests pass with copilot
- Integration tests already handle copilot in multi-runtime scenarios

Low-risk additions that provide comprehensive test coverage for the Copilot runtime.
- Replace references from Codex to GitHub Copilot in README, CLI reference, and getting started guides.
- Modify setup scripts to install GitHub Copilot CLI with MCP configuration.
- Update token management to reflect the removal of GITHUB_NPM_PAT.
- Adjust integration tests to verify Copilot setup.
- Enhance example scripts in apm.yml for Copilot usage.
…e instantiation and enhance runtime info retrieval with mocked subprocess output.
danielmeppiel merged commit 55839cb into main on Sep 25, 2025
15 checks passed
danielmeppiel deleted the integrate-copilot-runtime branch on February 27, 2026 09:42
sergio-sisternes-epam referenced this pull request in sergio-sisternes-epam/apm Mar 2, 2026
- Use LockFile.read() instead of raw yaml.safe_load() in _collect_transitive_mcp_deps (#1)
- Guard against mcp:null in get_mcp_dependencies() (#2)
- Remove inline MCP installation pipeline, defer to follow-up PR (#3/microsoft#7)
- Remove redundant import builtins in _deduplicate_mcp_deps (microsoft#10)
- Add tests for mcp:null, mcp:[], root-over-transitive dedup order (microsoft#9)
- Remove tests for deleted inline pipeline functions
danielmeppiel added a commit that referenced this pull request Mar 31, 2026
- Narrow except Exception to except ImportError for lazy marketplace import (comment #1)
- Fix provenance key mismatch: use dep identity instead of canonical for lockfile lookup (comment #2)
- Include subdir in git-subdir source resolution with path traversal validation (comment #3)
- Include relative path in relative source resolution with traversal validation (comment #4)
- Sanitize marketplace name in cache file paths to prevent path traversal (comment #5)
- Fix docs: stale-if-error, not stale-while-revalidate (comment #6)
- Consolidate CHANGELOG entries into single line with (#503) (comment #7)
- Remove unused _SUPPORTED_SOURCE_TYPES set (comment #8)
- Let auth errors propagate in _auto_detect_path instead of swallowing (comment #9)
- Validate marketplace --name against [a-zA-Z0-9._-]+ charset (comment #10)
- Fix doc examples to use identifier-compatible names (comments #11, #12)
- Update tests to match corrected resolver behavior, add traversal tests
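
The `--name` charset check from comment #10 above might look roughly like this. It is a minimal sketch: only the character class `[a-zA-Z0-9._-]+` comes from the commit message, while `is_valid_marketplace_name` is a hypothetical helper, not the function the CLI actually uses.

```python
import re

# Character class quoted in the commit message: [a-zA-Z0-9._-]+
_MARKETPLACE_NAME_RE = re.compile(r"[a-zA-Z0-9._-]+")


def is_valid_marketplace_name(name: str) -> bool:
    """True when the whole name consists only of the allowed characters."""
    # fullmatch anchors at both ends, so "ok/../x" cannot pass on a partial match.
    return _MARKETPLACE_NAME_RE.fullmatch(name) is not None
```

Note that a name such as `..` still satisfies this charset, which is presumably why the cache-path sanitization in comment #5 is handled as a separate step.
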

Co-authored-by: Copilot <[email protected]>
danielmeppiel added a commit that referenced this pull request Mar 31, 2026
Bug #1 - Format incompatibility with awesome-copilot marketplace:
  - Parser now accepts 'source' key (Copilot CLI) as type discriminator
    fallback when 'type' key is absent, normalizing to 'type' for resolvers
  - GitHub source resolver now accepts 'path' field (Copilot CLI) as
    virtual subdirectory, same as 'subdir' in git-subdir sources
  - Path traversal validation applied to 'path' field
  - Fixes: 8 of 62 plugins in awesome-copilot that use github source
    objects with 'source'+'path' keys instead of 'type'+'subdir'
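
The key normalization described for Bug #1 could be sketched as follows. This is an assumption-laden illustration, not the parser in models.py: `normalize_source` is a hypothetical name, and folding `path` into `subdir` in the same step is a simplification (the commit message says the resolver, not the parser, accepts `path`).

```python
from typing import Any, Dict


def normalize_source(entry: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy with 'source' folded into 'type' and 'path' into 'subdir'."""
    out = dict(entry)  # never mutate the caller's dict
    # Use 'source' as the discriminator only when 'type' is absent.
    if "type" not in out and "source" in out:
        out["type"] = out.pop("source")
    # Treat 'path' as the virtual subdirectory only when 'subdir' is absent.
    if "subdir" not in out and "path" in out:
        out["subdir"] = out.pop("path")
    return out
```

With this shape, an entry like `{"source": "github", "repo": "owner/repo", "path": "plugins/x"}` reaches the resolvers in the same form as a `type`/`subdir` entry, so path traversal validation only has to look at one field.
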

Bug #2 - Lockfile provenance never written:
  - Root cause: install passed raw marketplace refs (NAME@MARKETPLACE)
    as only_packages, but DependencyReference.parse() can't parse those,
    so identity filtering removed all deps -> 'already installed'
  - Fix: use validated_packages (canonical owner/repo strings) instead
    of raw click argument for only_pkgs

Both bugs verified fixed via E2E tests against real marketplaces:
  - github/awesome-copilot (62 plugins)
  - anthropics/skills (3 plugins)
  - microsoft/azure-skills (1 plugin)

Co-authored-by: Copilot <[email protected]>
danielmeppiel added a commit that referenced this pull request Mar 31, 2026
…covery + governance (#503)

* Initial plan

* Initial plan for marketplace integration

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <[email protected]>

* feat: marketplace integration core implementation

- Add marketplace/ package: models, errors, registry, client, resolver
- Add marketplace CLI commands: add, list, browse, update, remove, search
- Add lockfile provenance fields: discovered_via, marketplace_plugin_name
- Add install hook for NAME@MARKETPLACE syntax pre-parse intercept
- Wire marketplace commands in cli.py

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <[email protected]>

* docs: add marketplace integration guide and CLI reference

- Create guides/marketplaces.md covering marketplace concepts,
  registration, browsing, search, install syntax, provenance tracking,
  and cache behavior
- Add apm marketplace and apm search command sections to cli-commands.md
- Update apm install arguments to include NAME@MARKETPLACE syntax
- Update plugins.md Finding Plugins section with marketplace cross-refs

Co-authored-by: danielmeppiel <[email protected]>

* docs: fix marketplace.json format and lockfile field names to match implementation

- Use array-based plugins format matching models.py parser expectations
- Use discovered_via and marketplace_plugin_name matching lockfile.py fields
- Document both Copilot CLI (repository/ref) and Claude Code (source) formats

Co-authored-by: danielmeppiel <[email protected]>

* docs: fix git-subdir and relative source descriptions to match resolver

- git-subdir uses separate repo and subdir fields
- Relative string sources resolve to marketplace repo subdirectory

Co-authored-by: danielmeppiel <[email protected]>

* feat: add marketplace unit tests and docs

- 114 unit tests across 8 test files covering all marketplace modules
- New marketplace guide at docs/src/content/docs/guides/marketplaces.md
- Updated CLI reference with marketplace and search commands
- Updated plugins guide with marketplace integration section
- CHANGELOG entry for marketplace feature

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <[email protected]>

* refactor: address code review feedback

- Use List[MarketplacePlugin] from typing instead of lowercase generic
- Eliminate duplicated condition in install.py marketplace intercept
- Restructure control flow for clarity

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <[email protected]>

* fix: address all 12 PR review comments on marketplace integration

- Narrow except Exception to except ImportError for lazy marketplace import (comment #1)
- Fix provenance key mismatch: use dep identity instead of canonical for lockfile lookup (comment #2)
- Include subdir in git-subdir source resolution with path traversal validation (comment #3)
- Include relative path in relative source resolution with traversal validation (comment #4)
- Sanitize marketplace name in cache file paths to prevent path traversal (comment #5)
- Fix docs: stale-if-error, not stale-while-revalidate (comment #6)
- Consolidate CHANGELOG entries into single line with (#503) (comment #7)
- Remove unused _SUPPORTED_SOURCE_TYPES set (comment #8)
- Let auth errors propagate in _auto_detect_path instead of swallowing (comment #9)
- Validate marketplace --name against [a-zA-Z0-9._-]+ charset (comment #10)
- Fix doc examples to use identifier-compatible names (comments #11, #12)
- Update tests to match corrected resolver behavior, add traversal tests

Co-authored-by: Copilot <[email protected]>

* fix: Copilot CLI format compatibility and marketplace provenance bugs

Bug #1 - Format incompatibility with awesome-copilot marketplace:
  - Parser now accepts 'source' key (Copilot CLI) as type discriminator
    fallback when 'type' key is absent, normalizing to 'type' for resolvers
  - GitHub source resolver now accepts 'path' field (Copilot CLI) as
    virtual subdirectory, same as 'subdir' in git-subdir sources
  - Path traversal validation applied to 'path' field
  - Fixes: 8 of 62 plugins in awesome-copilot that use github source
    objects with 'source'+'path' keys instead of 'type'+'subdir'

Bug #2 - Lockfile provenance never written:
  - Root cause: install passed raw marketplace refs (NAME@MARKETPLACE)
    as only_packages, but DependencyReference.parse() can't parse those,
    so identity filtering removed all deps -> 'already installed'
  - Fix: use validated_packages (canonical owner/repo strings) instead
    of raw click argument for only_pkgs

Both bugs verified fixed via E2E tests against real marketplaces:
  - github/awesome-copilot (62 plugins)
  - anthropics/skills (3 plugins)
  - microsoft/azure-skills (1 plugin)

Co-authored-by: Copilot <[email protected]>

* feat: scope marketplace search to QUERY@MARKETPLACE format

Search now requires QUERY@MARKETPLACE (e.g. apm search security@skills)
to eliminate name collisions across marketplaces. Added search_marketplace()
client function for single-marketplace search.

- Rejects bare queries without @ — clear error with usage example
- Validates marketplace exists before searching
- Updated docs/guides/marketplaces.md with new syntax
- 7 test cases: format validation, unknown marketplace, results, no results
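
The format check for scoped search can be sketched as a small parser. `parse_scoped_query` is a hypothetical helper, and splitting on the last `@` (so that a query may itself contain `@`) is an assumption rather than documented behavior.

```python
from typing import Tuple


def parse_scoped_query(expr: str) -> Tuple[str, str]:
    """Split 'QUERY@MARKETPLACE' into (query, marketplace) or raise ValueError."""
    # rpartition splits on the last '@'; sep is '' when no '@' is present.
    query, sep, marketplace = expr.rpartition("@")
    if not sep or not query or not marketplace:
        raise ValueError(
            f"Expected QUERY@MARKETPLACE, got {expr!r} (e.g. security@skills)"
        )
    return query, marketplace
```

Checking that the marketplace is actually registered would be a separate step after parsing, matching the "validates marketplace exists before searching" bullet.
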

Co-authored-by: Copilot <[email protected]>

* docs: update CLI reference and plugins guide for scoped search syntax

Align all documentation with QUERY@MARKETPLACE search format.

Co-authored-by: Copilot <[email protected]>

* refactor: use centralized path_security for marketplace traversal checks

Replace 3 ad-hoc '..' in x.split('/') checks in marketplace/resolver.py
with validate_path_segments() from utils/path_security.py. Add
defense-in-depth validate_path_segments() call to _sanitize_cache_name()
in client.py.

This ensures marketplace code uses the same cross-platform path safety
utilities (backslash normalization, single-dot rejection) as the rest
of APM.
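
To illustrate why a shared helper is safer than the ad-hoc `'..' in x.split('/')` check, here is a minimal sketch of a segment validator with the two properties named above (backslash normalization and single-dot rejection). The function name `validate_path_segments_sketch` is hypothetical, and the real `validate_path_segments()` in utils/path_security.py may have a different signature and stricter rules.

```python
def validate_path_segments_sketch(path: str) -> str:
    """Reject '.' and '..' segments after normalizing backslashes to '/'."""
    # Normalize first so Windows-style separators cannot hide a '..' segment.
    normalized = path.replace("\\", "/")
    for segment in normalized.split("/"):
        if segment in (".", ".."):
            raise ValueError(f"Unsafe path segment {segment!r} in {path!r}")
    return normalized
```

The ad-hoc check misses inputs such as `a\..\b`, because splitting on `/` alone leaves that string as a single segment.
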

Co-authored-by: Copilot <[email protected]>

* docs: add path safety rule to copilot-instructions.md

Directs contributors to use validate_path_segments() and
ensure_path_within() from utils/path_security.py instead of
ad-hoc traversal checks.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: danielmeppiel <[email protected]>
Co-authored-by: danielmeppiel <[email protected]>
Co-authored-by: Copilot <[email protected]>
danielmeppiel added a commit that referenced this pull request Apr 22, 2026
* feat(policy): W1 foundations for install-time policy enforcement (#827)

Wave 1 of issue #827 implementation. Lays the foundations the install
pipeline gate (W2) will plug into. No behaviour change yet — install
still does NOT enforce policy until W2 wires the gate phase.

What's in:
- policy_checks: new public seam run_dependency_policy_checks(deps,
  lockfile=, policy=, mcp_deps=, effective_target=) accepting a
  resolved dep set; old run_policy_checks(project_root, policy) is now
  a thin wrapper. Honours require_resolution: project-wins for
  version-pin mismatches only. Latent isinstance(allow, list) bug
  fixed for schema's Tuple[str, ...].
- policy/discovery: cache stores merged effective policy with chain
  metadata + fingerprint. Atomic writes via temp + os.replace, with
  pid+thread_id suffix to prevent concurrent-writer collision.
  MAX_STALE_TTL=7d ceiling on cache reuse. PolicyFetchResult expanded
  to express 9 outcomes (found, absent, cached_stale,
  cache_miss_fetch_fail, malformed, disabled, garbage_response,
  no_git_remote, empty).
- diagnostics: CATEGORY_POLICY constant + per-category renderer wired
  into render_summary().
- command_logger: InstallLogger.policy_resolved/violation/disabled
  with per-class actionable error wording (auth/unreachable/malformed/
  blocked).
- tests/fixtures/policy/: 14 policy fixtures + 7 project fixtures
  (denied-direct, denied-transitive, required-missing,
  required-version-mismatch, mcp-denied, target-mismatch,
  unpacked-bundle) covering W4 live matrix scenarios L2/L4/L13 and
  rubber-duck findings I5/I6/I7/N14/C2.
- docs: 12-section Install-time enforcement guide skeleton in both
  enterprise/policy-reference.md and packages/apm-guide skill mirror.
  10 sections filled; sections 7 (snippets) and 10 (error table)
  stubbed for W3-docs-final once W2 lands and W4 captures live output.
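
The atomic cache write mentioned in the policy/discovery item (temp file plus os.replace, with a pid and thread id suffix) follows a common pattern. The sketch below shows that pattern under assumptions: `atomic_write_bytes` and the exact temp-file naming are hypothetical, not taken from the APM source.

```python
import os
import threading
from pathlib import Path


def atomic_write_bytes(target: Path, data: bytes) -> None:
    """Write data to target so readers see either the old or the new file."""
    # pid + thread id keeps concurrent writers from sharing a temp file.
    suffix = f".{os.getpid()}.{threading.get_ident()}.tmp"
    tmp = target.with_name(target.name + suffix)
    try:
        tmp.write_bytes(data)
        # os.replace overwrites the destination in a single rename step.
        os.replace(tmp, target)
    finally:
        # No-op after a successful replace; cleans up if the write failed.
        tmp.unlink(missing_ok=True)
```

The temp file is created next to the target so that the rename stays on one filesystem.
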

Tests:
- tests/unit: 4878 passed (1 pre-existing unrelated MCP failure
  deselected). Includes 41 logger + 29 policy-seam + 38 cache + 21
  fixture-load new tests.

Refs: #827
Co-authored-by: Copilot <[email protected]>

* feat(install): W2A policy enforcement at install time (#827)

Wave 2A wires the three install-time enforcement sites planned for #827:

1. **Pipeline gate phase** (src/apm_cli/install/phases/policy_gate.py):
   New phase running between resolve and targets. Discovers org policy,
   resolves the inheritance chain via resolve_policy_chain, persists the
   merged effective policy + chain refs to cache (chain_refs threading
   per C1 amendment), then calls run_dependency_policy_checks against
   the resolved deps. Routes 9 discovery outcomes (found, absent,
   cached_stale, cache_miss_fetch_fail, malformed, disabled,
   garbage_response, no_git_remote, empty). Block-mode violations raise
   PolicyViolationError to halt the pipeline cleanly.

2. **--mcp branch preflight** (src/apm_cli/policy/install_preflight.py
   + commands/install.py:1091-1125):
   apm install --mcp does NOT enter the install pipeline. New shared
   helper run_policy_preflight() runs discovery + dep checks for any
   non-pipeline command site. Wired into --mcp BEFORE _run_mcp_install
   so denied servers never reach the integrator. Also exports
   PolicyBlockError for callers.

3. **install <pkg> snapshot+rollback** (commands/install.py):
   apm install <pkg> mutates apm.yml BEFORE the pipeline runs. We now
   snapshot apm.yml as raw bytes (not parsed YAML, to avoid round-trip
   drift on whitespace / key-order / comments), and on ANY pipeline
   failure (policy block, download error, etc.) restore byte-for-byte
   via tempfile + os.replace atomic write. Logs '[i] apm.yml restored
   to its previous state.' and exits non-zero.
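
The snapshot and rollback in item 3 can be pictured as a context manager around the pipeline call. This is a sketch under assumptions, not the code in commands/install.py: `restore_on_failure` is a hypothetical name, and catching `Exception` (rather than every possible exit path) is a simplification.

```python
import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterator


@contextmanager
def restore_on_failure(path: Path) -> Iterator[None]:
    """Restore path byte-for-byte if the wrapped block raises."""
    snapshot = path.read_bytes()  # raw bytes, so comments and spacing survive
    try:
        yield
    except Exception:
        # Write the snapshot to a sibling temp file, then swap it in.
        # mkstemp creates the file with mode 0600; real code may want to
        # copy the original file mode as well.
        fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".")
        try:
            with os.fdopen(fd, "wb") as fh:
                fh.write(snapshot)
            os.replace(tmp_name, path)
        finally:
            if os.path.exists(tmp_name):
                os.unlink(tmp_name)
        raise  # the caller still sees the failure and can exit non-zero
```

Usage would look like `with restore_on_failure(Path("apm.yml")): run_pipeline()`, where `run_pipeline` stands in for whatever mutates the manifest and runs the install.
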

InstallContext gains policy_fetch, policy_enforcement_active, no_policy.

Tests: +68 new tests, 4946 unit tests pass total.
- test_policy_gate_phase.py: 27 (covers all 9 outcomes)
- test_mcp_preflight_policy.py: 22 (escape hatches, allow/deny, transport,
  self-defined, trust_transitive, discovery outcomes, return shape)
- test_install_pkg_policy_rollback.py: 19 (byte-equal restore, comments
  preserved, --no-policy bypass, download error rollback, snapshot
  unit tests)

W2B (dry-run, target-aware, escape-hatch CLI flag) and C2 panel review
follow.

Refs: #827
Co-authored-by: Copilot <[email protected]>

* feat(policy): W2B install enforcement - escape hatch, dry-run preview, target-aware check (#827)

W2B completes the enforcement surface:

* policy_target_check.py - new pipeline phase after targets that re-runs
  target/compilation checks with the resolved effective_target. Filters
  to TARGET_CHECK_IDS only to avoid double-emitting dep violations from
  the gate phase. Honors CLI --target override (I6 fix scenario).

* --no-policy escape hatch on apm install / install <pkg> / install --mcp
  / update. APM_POLICY_DISABLE=1 env var equivalent. Both route through
  ctx.no_policy and emit always-visible warnings via
  InstallLogger.policy_disabled() noting that apm audit --ci still fails.

* --dry-run policy preview. run_policy_preflight gains dry_run=True kwarg.
  Emits '[!] Would be blocked by policy: <dep> -- <reason>' (block) or
  '[!] Policy warning: <dep> -- <reason>' (warn) before the would-install
  table. Never raises, never mutates. Direct manifest deps only (resolver
  doesn't run in dry-run; documented limitation).

InstallRequest, InstallService, InstallContext threaded with no_policy.
LOC budget on install.py raised 1625 -> 1650 with documented rationale.

Tests: 5003 unit pass (+57 W2B: 17 target_check + 24 no_policy_flag +
16 dry_run_policy). Full suite green vs main baseline.

Co-authored-by: Copilot <[email protected]>

* fix(policy): C2 panel fixes - transitive MCP enforcement, shared chain discovery, dry-run cap, drop apm update --no-policy (#827)

C2 panel checkpoint surfaced 4 fixes (S1+B1+D2 BLOCKER/PASS-WITH-CONCERN, D1
DevX). All landed; full suite 5032 pass.

S1 (Supply Chain BLOCKER) - transitive MCP enforcement:
  Transitive MCP servers from APM packages were bypassing install-time policy.
  The pipeline gate phase only sees direct apm.yml deps; transitive MCP servers
  are merged later via MCPIntegrator.collect_transitive() and written to
  runtime configs (.copilot/mcp.json, .cursor/mcp.json) with no policy check.
  This defeated #827 on the most security-critical dep category.
  Fix: second run_policy_preflight() call in commands/install.py after the
  transitive merge, before MCPIntegrator.install(). On block: abort MCP config
  writes, exit non-zero. APM packages remain installed (gate phase approved
  them). 15 new unit tests in test_transitive_mcp_policy.py.

B1 (Architect, partial) - shared chain-aware discovery:
  Extract discover_policy_with_chain() into policy/discovery.py so both
  policy_gate.py and install_preflight.py walk the same inheritance chain.
  Closes the gap where --mcp / --dry-run paths could resolve a different
  effective policy than the pipeline path. Gate-phase keeps its 9-outcome
  routing; only the discovery seam moved. 10 new tests in
  test_chain_discovery_shared.py.

D2 (DevX UX) - dry-run noise cap:
  install_preflight._DRY_RUN_PREVIEW_LIMIT = 5. Long deny lists now show
  5 lines per severity bucket + tail '[!] ... and N more would be blocked
  by policy. Run apm audit for full report.' 4 new tests.
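
The noise cap in D2 amounts to truncating each bucket and appending a tail line. A minimal sketch, assuming a hypothetical `cap_preview` helper; only the limit of 5 and the tail wording come from the commit message.

```python
from typing import List

_PREVIEW_LIMIT = 5  # value given for _DRY_RUN_PREVIEW_LIMIT in the commit message


def cap_preview(lines: List[str], limit: int = _PREVIEW_LIMIT) -> List[str]:
    """Keep at most `limit` lines and summarize the remainder in a tail line."""
    shown = list(lines[:limit])
    remaining = len(lines) - len(shown)
    if remaining > 0:
        shown.append(
            f"[!] ... and {remaining} more would be blocked by policy. "
            "Run apm audit for full report."
        )
    return shown
```

Per the description above, this would be applied once per severity bucket, so block and warn lines are capped independently.
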

D1 (DevX UX) - drop apm update --no-policy:
  apm update is the CLI self-updater (refreshes the apm binary), not a
  dependency refresh. The flag was accepted but unused. Removed the option
  and flipped the test to assert the flag is now rejected.

LOC budget on install.py raised 1650 -> 1675 with documented justification.

Tests: 5032 unit pass (+29 new: 15 transitive_mcp + 10 chain_discovery_shared
+ 4 dry_run_noise_cap). 1 pre-existing MCP test deselected.

Co-authored-by: Copilot <[email protected]>

* docs+test(policy): W3 - integration matrix, docs final fill, CHANGELOG, growth (#827)

W3 phase complete. All 5 parallel workstreams landed.

Tests:
  tests/integration/test_policy_install_e2e.py - 17 e2e scenarios I1..I17
  Covers all 9 PolicyFetchResult outcomes + all 6 violation classes via
  CliRunner-driven full-pipeline flows. Mocks discover_policy_with_chain
  at both seams (policy_gate + install_preflight). Uses _build_policy()
  helper for frozen-dataclass safe construction.

Docs:
  docs/src/content/docs/enterprise/policy-reference.md
    sec 7: 8 verbatim CLI snippets (success, block, warn, --no-policy,
    APM_POLICY_DISABLE, --dry-run with overflow tail, install <pkg>
    rollback, transitive MCP block)
    sec 10: outcome table (9 fetch outcomes) + violation table (6 classes)
    Added explicit JSON/SARIF non-goal callout (C1 amendment).
  packages/apm-guide/.apm/skills/apm-usage/governance.md
    Same content, leaner skill version, links back to docs for full text.

CHANGELOG.md:
  Added: --no-policy / APM_POLICY_DISABLE escape hatch, --dry-run preview,
    install <pkg> rollback
  Changed: pipeline gains policy_gate + policy_target_check phases, shared
    chain discovery + atomic cache + MAX_STALE_TTL
  Security (headline): apm install enforces apm-policy.yml; transitive MCP
    checked before runtime config write

Follow-up issue #829 filed: policy.fetch_failure: warn|block schema knob.

Tests: 5049 pass (5032 unit + 17 integration). 1 pre-existing MCP test
deselected.

PR body drafted at session-state/files/pr-body-827.md. Growth strategy
entry + asciinema script staged in WIP (gitignored).

Co-authored-by: Copilot <[email protected]>

* fix(policy): C3 fixes - direct MCP enforcement, malformed posture, warn-mode coverage, doc drift (#827)

C3 final panel + rubber-duck found 5 issues. All fixed.

#1 (CRITICAL) - Direct MCP deps in apm.yml bypassed enforcement:
  ctx.direct_mcp_deps now populated in pipeline.py from
  apm_package.get_mcp_dependencies() before policy_gate runs. policy_gate
  reads direct_mcp_deps (not the dead mcp_deps_to_install) and passes them
  to run_dependency_policy_checks. install.py:1496 second preflight guard
  drops 'and transitive_mcp' so direct-only MCP installs are also caught.

#2 (CRITICAL) - Malformed policy handling inconsistent + broke rollback:
  policy_gate.py replaced sys.exit(1) on malformed with fail-open warn
  (matches install_preflight + cache_miss_fetch_fail/garbage_response
  posture). sys.exit was bypassing the rollback handler in install.py for
  apm install <pkg>. CEO mandate: malformed = warn, fail-closed knob is
  follow-up #829.

#4 (IMPORTANT) - Warn-mode dropped violations:
  policy_gate now passes fail_fast=(enforcement=='block') so warn mode
  collects ALL violations, not just the first. Also emits warnings for
  passed=True checks with non-empty details (project-wins version-pin
  mismatches were silently dropped).

#3 (IMPORTANT) - Chain inheritance is 1-level, not multi-level:
  discover_policy_with_chain only walks one parent. Toned down docs in
  policy-reference.md and governance.md with explicit caution callout.
  Filed follow-up #831 for proper recursive walk + cycle detection.

#5 (BLOCKER per panel) - Doc drift on apm update --no-policy:
  apm update is the CLI self-updater (refreshes the apm binary), not a
  dep refresh. Removed all mentions from both docs. apm deps update is
  the dep-refresh surface (runs install pipeline, gate applies); --no-policy
  is NOT exposed there today.

Tests: 5059 pass (5049 baseline + 10 new: 6 unit gate + 4 integration
I18/I19/I20). New integration tests cover real direct-MCP block, real
malformed fail-open, warn-mode multi-violation. I16 class renamed to
TestI16GarbageResponsePolicy to fix mislabeling.

Follow-ups: #829 (fetch_failure schema knob), #831 (multi-level chain).

Co-authored-by: Copilot <[email protected]>

* fix(policy): in-PR resolution of #834 (warn-mode rendering) and #831 (recursive extends chain) (#827)

Originally filed as follow-ups during C3, moved in-PR per reviewer
request so #832 ships a complete enforcement story.

#834 - Warn-mode policy violations did not render in the install
summary. Root cause: pipeline created a fresh DiagnosticCollector for
install_result.diagnostics while InstallLogger.policy_violation()
pushed warnings into logger.diagnostics. Two collectors, one rendered.
Fix: when a logger is present, reuse logger.diagnostics so policy
records flow through render_summary() (block mode unaffected - it
aborts inline before summary).

#831 - extends: chain only supported one level (parent). Inheritance
machinery (resolve_policy_chain, detect_cycle, MAX_CHAIN_DEPTH=5) was
already N-deep capable; discovery never wired it. Fix: rewrite
_resolve_and_persist_chain as iterative depth-first walk, leaf-first;
cycle detection via inheritance.detect_cycle; honor MAX_CHAIN_DEPTH=5
with explicit pre-append check; partial-chain warning when a mid-chain
ref fails to fetch ('Policy chain incomplete: <ref> unreachable, using
<N> of <M> policies'); single cache write at leaf with full chain
fingerprint.
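
The iterative walk described for #831 can be sketched as a loop over `extends` references. Everything here is illustrative: `walk_extends_chain` and the `fetch` callback are hypothetical, raising `ValueError` on a cycle or depth overflow is an assumption, and so is reporting M as N + 1 in the partial-chain warning. Only MAX_CHAIN_DEPTH=5, the leaf-first order, and the warning wording come from the commit message.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

MAX_CHAIN_DEPTH = 5  # limit named in the commit message


def walk_extends_chain(
    leaf_ref: str,
    fetch: Callable[[str], Optional[Dict]],
) -> Tuple[List[str], Optional[str]]:
    """Follow 'extends' upward; return (refs leaf-first, warning or None)."""
    chain: List[str] = []
    seen: Set[str] = set()
    ref: Optional[str] = leaf_ref
    while ref is not None:
        if ref in seen:
            raise ValueError(f"Cycle in policy chain at {ref}")
        # Depth is checked before appending, so the chain never exceeds the limit.
        if len(chain) >= MAX_CHAIN_DEPTH:
            raise ValueError(f"Policy chain exceeds depth {MAX_CHAIN_DEPTH}")
        policy = fetch(ref)
        if policy is None:
            # Mid-chain fetch failure: keep what was resolved and warn.
            n = len(chain)
            warning = (
                f"Policy chain incomplete: {ref} unreachable, "
                f"using {n} of {n + 1} policies"
            )
            return chain, warning
        seen.add(ref)
        chain.append(ref)
        ref = policy.get("extends")
    return chain, None
```

For a quick check, a plain dict works as the fetcher: `walk_extends_chain("leaf", policies.get)`.
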

Tests: +1 unit (warn-render), +5 unit (3-level full, cycle, depth
limit, partial chain, single-level regression), +1 integration
(TestI21ThreeLevelExtendsChain). 5044 unit pass.

Docs: enterprise/policy-reference.md and apm-usage/governance.md
chain-depth callouts updated.

Co-authored-by: Copilot <[email protected]>

* docs(changelog): record in-PR resolution of #834 and #831 under #827

Co-authored-by: Copilot <[email protected]>

* fix(policy): address review-panel pre-merge findings (#827)

- Security F1 (HIGH): pin extends: chain to leaf policy host; disable
  HTTP redirects in _fetch_from_url and _fetch_github_contents. Closes
  cross-host credential leak vector via git credential fill fallback
  and SSRF/Referer-leak vector via 30x redirects. raw.githubusercontent
  .com is treated as distinct from github.com (strict pin).
- Logging C1+C2 + UX F1/F2/F4/F5/F9: extract InstallLogger.policy_
  discovery_miss() canonical helper covering all 7 discovery outcomes;
  route both policy_gate and install_preflight through it. absent now
  verbose-only; no_git_remote downgraded to [i]; garbage_response gets
  distinct wording (no VPN/firewall noise); cached_stale and cache_
  miss_fetch_fail messages now state enforcement posture explicitly;
  violation messages dedupe dep_ref prefix; wire _policy_reason_blocked
  into block-severity policy_violation as dim secondary line.
- Docs: remove [Planned] banner from policy-reference; update
  enforcement tables (policy-reference + governance skill) to reflect
  install-time blocking; document --no-policy / APM_POLICY_DISABLE in
  cli-commands.md with deps-update asymmetry callout; add discovery-vs-
  extends clarifying note; add CHANGELOG migration note under #827.
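
The host pin from Security F1 above reduces to comparing parsed hostnames rather than substrings. A minimal sketch, assuming a hypothetical `same_host_as_leaf` helper; the real check and its error handling are not shown in this page.

```python
from urllib.parse import urlsplit


def same_host_as_leaf(leaf_url: str, parent_url: str) -> bool:
    """True only when both URLs parse to exactly the same hostname."""
    # .hostname is lowercased and excludes userinfo, port, path, and query.
    leaf_host = urlsplit(leaf_url).hostname
    parent_host = urlsplit(parent_url).hostname
    return leaf_host is not None and leaf_host == parent_host
```

Because the comparison is exact, raw.githubusercontent.com and github.com count as different hosts, which matches the strict pin described above.
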

Tests: 5053 -> 5068 (+15 logging, +9 security host-pin).

Co-authored-by: Copilot <[email protected]>

* feat(policy): ship enterprise hardening pack on top of #827

Four enterprise hardening items shipped in-PR per CISO-arbitrated panel
verdict + CTO threat-model deep dive (PR #832 comments 4294087760 +
4294115069). Closes #829.

1. policy.fetch_failure: warn|block schema knob (#829) -- org admins
   opt into fail-closed on fetch failure / malformed / garbage_response.
   Default 'warn' preserves backwards compat.
2. apm.yml policy.fetch_failure_default: warn|block -- project-side
   complement so a project can lock down behavior even when no policy
   is reachable to read the org-side knob from.
3. apm policy status diagnostic command -- show discovery outcome,
   source, enforcement, cache age, extends chain, effective rule
   counts, and hash-pin state. --json for SIEM ingestion. Trust-but-
   verify tool that makes fail-open acceptable.
4. apm.yml policy.hash: 'sha256:...' consumer-side pin -- closes the
   garbage_response compromised-intermediary vector by verifying raw
   policy bytes against a project-pinned digest. Equivalent of pip
   --require-hashes for the policy itself. ALWAYS fail-closed on
   mismatch, regardless of fetch_failure setting (a hash mismatch is
   an explicit pin violation, not a fetch failure). sha384/sha512
   accepted; md5/sha1 rejected (collision-resistant only).
5. apm audit --ci auto-discovers org policy when --policy-source is
   not provided; --no-policy flag added to skip. Closes the
   audit/install asymmetry that left CI blind to sideloaded primitives.
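
The consumer-side pin in item 4 can be illustrated with hashlib. This is a sketch, not the shipped verifier: `verify_policy_pin` is a hypothetical name, and reading the algorithm from the `sha256:` style prefix is an assumption based on the example value above. The accepted and rejected algorithms follow the commit message.

```python
import hashlib

# Accepted per the commit message; md5 and sha1 are rejected.
_ALLOWED_ALGORITHMS = {"sha256", "sha384", "sha512"}


def verify_policy_pin(raw_policy: bytes, pin: str) -> bool:
    """Check raw policy bytes against a pin of the form '<algorithm>:<hexdigest>'."""
    algorithm, sep, expected = pin.partition(":")
    if not sep or algorithm not in _ALLOWED_ALGORITHMS:
        raise ValueError(f"Unsupported or malformed hash pin: {pin!r}")
    # Hash the bytes exactly as fetched, before any YAML parsing.
    actual = hashlib.new(algorithm, raw_policy).hexdigest()
    return actual == expected.lower()
```

A caller following the description above would treat a `False` result as fail-closed regardless of the fetch_failure setting.
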

Tests: 5068 -> 5157 (+89: hash pin 31, fetch_failure knob, audit
auto-discovery, policy status command, plus updates to existing
discovery tests for the new expected_hash kwarg threading).

Docs: policy-reference §9.5 (fetch_failure), §9.6 (hash pin),
§9.7 (apm policy status), §9.8 (audit auto-discovery); governance.md
skill mirrors all of the above; cli-commands.md gets policy status +
audit --no-policy. CHANGELOG entries under [Unreleased] Added /
Added (Security).

Co-authored-by: Copilot <[email protected]>

* docs(policy): address doc-writer review BLOCKERs (#827)

- policy-reference.md: remove stale 'planned fetch_failure knob' paragraph
  that contradicted the §9.5 entry shipped in the same PR; add Linux
  hash-compute one-liner alongside the macOS shasum example.
- cli-commands.md: add 'apm policy status' command section under a new
  'apm policy' family (synopsis, --policy-source/--no-cache/--json,
  exit-code note, examples). Add --no-policy flag to 'apm audit' options
  list. Reword --policy SOURCE description to reflect that --ci now
  auto-discovers when --policy is omitted. Update audit examples to
  match (drop the now-redundant '--policy org' from auto-discovery
  example, add explicit --no-policy variant).

Co-authored-by: Copilot <[email protected]>

* docs(policy): address doc-writer HIGH+LOW findings (#827)

- manifest-schema.md: add policy: block to schema diagram + new
  section 3.9 documenting fetch_failure_default, hash, hash_algorithm
- policy-reference.md: add fetch_failure: warn to canonical schema
  YAML and a fetch_failure entry under Top-level fields; lift apm
  policy status and apm audit --ci auto-discovery into proper
  numbered subsections (9.7 / 9.8) so anchors match the skill mirror
- governance.md: surface install-time enforcement with link to
  policy-reference#install-time-enforcement
- ci-policy-setup.md: annotate Step 3 noting apm audit --ci
  auto-discovers and --policy org is now an explicit override
- security.md: add Compromised policy intermediary row to attack
  surface comparison, linked to policy.hash consumer-side pin
- cli-commands.md: split --no-policy into 2-line nested bullet
  separating behaviour from env-var equivalence
- apm-guide skill mirror: add fetch_failure: warn to schema overview
  to keep skill aligned with policy-reference

Co-authored-by: Copilot <[email protected]>

* fix(policy): address PR review panel logging+arch findings (#827)

BLOCKING:
- command_logger.policy_discovery_miss: gate no_git_remote info
  message on verbose mode; previously emitted on every install in a
  non-git directory

Architecture:
- New install/errors.py with canonical PolicyViolationError;
  PolicyBlockError kept as re-exported alias to preserve test patches
- New policy/outcome_routing.py::route_discovery_outcome
  consolidating the 9-outcome routing table; policy_gate.py and
  install_preflight.py now delegate instead of duplicating
- pipeline.py: catch PolicyViolationError before bare Exception so
  policy block messages are not double-nested in RuntimeError
- commands/install.py: isinstance(PolicyViolationError) branch in
  the legacy handler for the same reason
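
The handler ordering described above can be illustrated with a minimal sketch. It assumes a simplified pipeline wrapper; `run_step` and the error text are hypothetical stand-ins, and only the `PolicyViolationError` name comes from the commit message.

```python
class PolicyViolationError(Exception):
    """Stand-in for the canonical error type named in the commit message."""


def run_step(step):
    """Hypothetical pipeline wrapper: the specific handler must come first."""
    try:
        step()
    except PolicyViolationError:
        # Re-raise unchanged so the policy block message is not nested.
        raise
    except Exception as exc:
        # Every other failure is wrapped, keeping the original as the cause.
        raise RuntimeError(f"install failed: {exc}") from exc
```

If the broad `except Exception` clause came first, the policy error would be wrapped in `RuntimeError` as well, which is the double-nesting the commit describes.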

Logging UX:
- install_preflight: empty check.details now falls back to
  [check.name] so the block message is never blank
- _extract_dep_ref helper replaces detail.split(":")[0] with
  defensive parsing that falls back to check.name

Security:
- discovery._get_cache_dir asserts containment vs project_root
  (resolves symlinks) instead of an unguarded join
- Removed dead no_policy= kwarg from discover_policy_with_chain;
  env-var defence-in-depth retained on the call site
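
A containment assertion of the kind described for the cache directory might look like the sketch below. The function name, signature, and error type are assumptions for illustration; the real `discovery._get_cache_dir` is not shown in this PR page.

```python
from pathlib import Path


def get_cache_dir(project_root: Path, relative: str) -> Path:
    """Resolve a cache dir and refuse any result outside project_root."""
    root = project_root.resolve()
    # resolve() collapses '..' segments and follows symlinks before the check.
    candidate = (root / relative).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"cache dir escapes project root: {candidate}")
    return candidate
```

Comparing resolved paths, rather than joined strings, is what catches a symlink or `..` segment that points outside the project.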

Tests: +tests/unit/policy/test_pr_832_findings.py covering all 8
  findings; install_logger split into silent/verbose cases. 5176
  unit tests pass, 0 regressions.

Co-authored-by: Copilot <[email protected]>

* test(policy): use urllib.parse for host assertions to silence CodeQL (#827)

CodeQL's py/incomplete-url-substring-sanitization rule fired 6 times
on test_extends_host_pin.py because bare 'host' in msg substring
checks could in theory match a host appearing at an arbitrary URL
position (path, query, userinfo). The assertions are correct in
practice -- they assert on production error messages of known
format -- but the pattern is not safe in general.

Replace each substring check with a precise extractor:

- _assert_extends_host_in_message / _assert_leaf_host_in_message:
  regex-anchor on the production 'extends host: <h>' / 'leaf host:
  <h>' tokens, then exact-compare the captured group.
- _assert_redirect_target_host: regex-extract the redirect target
  URL after 'to ', then urllib.parse.urlparse(...).hostname compare.

No production-code changes; all 9 host-pin tests still pass.
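
A minimal sketch of the two extractor styles described above, assuming message shapes like 'extends host: <h>' and '... to <url>'. The function names and the sample message are illustrative, not the helpers from test_extends_host_pin.py.

```python
import re
from urllib.parse import urlparse


def extract_extends_host(msg):
    """Anchor on the 'extends host: <h>' token and return the captured host."""
    match = re.search(r"extends host: (\S+)", msg)
    return match.group(1) if match else None


def extract_redirect_target_host(msg):
    """Pull the URL after 'to ' and return its parsed hostname."""
    match = re.search(r"\bto (https?://\S+)", msg)
    return urlparse(match.group(1)).hostname if match else None


# Hypothetical message: the trusted host appears only in the URL path.
msg = "refusing redirect to https://evil.example/github.com/policy.yml"
target_host = extract_redirect_target_host(msg)
```

Here a bare `"github.com" in msg` check would pass even though the parsed hostname is `evil.example`, which is the general weakness the CodeQL rule flags.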

Co-authored-by: Copilot <[email protected]>

* fix(policy,audit): address PR #832 DevX UX blockers

- audit --no-policy help text rewritten to describe positive
  behaviour first ("Skip org policy discovery and enforcement"
  instead of the negative "Skip auto-discovery ... in --ci mode"),
  so apm audit --help no longer hides the primary effect behind a
  caveat. Aligns the code with the docs.

- apm policy status --check flag added: exits 1 when outcome is
  not 'found' (i.e. policy unresolvable / absent / disabled /
  fetch-failed), 0 otherwise. Default behaviour unchanged (always
  exit 0) so the diagnostic remains safe for human and SIEM use,
  while CI authors get the npm audit / pip check style contract
  via a single flag.

Updates cli-commands.md, policy-reference.md, and CHANGELOG.md to
document the new flag and exit-code table. Adds TestStatusCheckFlag
covering the found / unresolvable / discovery-exception / json
combinations.
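
The exit-code contract for the flag can be summarised in a few lines. This is a sketch only: the function name is hypothetical, and the only outcome string taken from the commit message is 'found'.

```python
def status_exit_code(outcome: str, check: bool) -> int:
    """Map a discovery outcome to an exit code under the described contract."""
    if not check:
        # Default behaviour: the diagnostic always exits 0.
        return 0
    # With --check, anything other than 'found' is a failure.
    return 0 if outcome == "found" else 1
```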

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>

danielmeppiel pushed a commit that referenced this pull request May 4, 2026

CEO panel recommended landing two in-PR follow-ups before merge:

1. Recovery hint in drift output (cli-logging + devx-ux convergence):
   render_drift_text now appends '[i] Run apm install to re-sync
   deployed files with the lockfile.' so users see WHAT and HOW in one
   message. Honors Message Writing Rule #4 'Include the fix'.

2. Doc-sync (doc-writer + devx-ux convergence):
   - reference/cli-commands.md: add --no-drift to audit options table;
     amend --ci description to mention drift contribution.
   - integrations/ci-cd.md: replace bash 'git status --porcelain'
     workaround under 'Verify Deployed Primitives' with 'apm audit --ci'
     one-liner; update 'We dogfood this' callout text.
   - getting-started/quick-start.md: retarget stale cross-ref from the
     now-superseded ci-cd anchor to the new drift-detection guide.
   - guides/drift-detection.md: drop the self-contradictory case #2 in
     'When to use --no-drift' (strip-mode is auto-skipped, not opt-out).
   - CHANGELOG.md: compress verbose entry to one Keep-a-Changelog line
     pointing readers to the guide for detail.

Tracked as follow-up issues (CEO call):
- supply-chain: verify cache content matches lockfile resolved_commit
  before drift replay trusts it (commit-SHA pinning bypass on shared
  CI caches).
- test-coverage: inverse-normalization unit test asserting BOM/CRLF/
  Build-ID guards do NOT mask real content drift (safety invariant).

Lint clean. 45 drift tests pass.

Co-authored-by: Copilot <[email protected]>

danielmeppiel added a commit that referenced this pull request May 5, 2026

* feat(drift): Phase A infra - guards + diagnostic category

- Add _ReadOnlyProjectGuard context manager (utils/guards.py): snapshots
  stat of protected paths, raises ProtectedPathMutationError on any
  mutation. Defense-in-depth above the scratch-root remap.
- Add CATEGORY_DRIFT + drift() recording method to DiagnosticCollector.
- Add drift_count property and _render_drift_group renderer that groups
  by kind (modified/unintegrated/orphaned) with stable section header
  for machine consumers.
- Tests: 7 unit tests covering happy path, mutation, creation, deletion,
  missing-tolerated, exception-not-masked, single-file protected path.
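
A rough sketch of a stat-snapshot guard of the kind described above. It is an illustration under assumptions, not the code in utils/guards.py: the class body, the (mtime_ns, size) snapshot shape, and the `demo` helper are all hypothetical.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


class ProtectedPathMutationError(RuntimeError):
    """Raised when a protected path changed while the guard was active."""


class ReadOnlyProjectGuard:
    def __init__(self, *paths: Path) -> None:
        self._paths = [Path(p) for p in paths]
        self._before: dict[str, dict[str, tuple[int, int]]] = {}

    @staticmethod
    def _snapshot(root: Path) -> dict[str, tuple[int, int]]:
        if not root.exists():
            return {}  # missing paths are tolerated
        files = [root] if root.is_file() else [p for p in root.rglob("*") if p.is_file()]
        return {str(f): (f.stat().st_mtime_ns, f.stat().st_size) for f in files}

    def __enter__(self) -> "ReadOnlyProjectGuard":
        self._before = {str(p): self._snapshot(p) for p in self._paths}
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is None:
            for p in self._paths:
                if self._snapshot(p) != self._before[str(p)]:
                    raise ProtectedPathMutationError(f"protected path mutated: {p}")
        return False  # never mask an exception raised inside the block


def demo() -> bool:
    """Return True if a file created under a guarded dir is detected."""
    with tempfile.TemporaryDirectory() as d:
        governed = Path(d) / "governed"
        governed.mkdir()
        (governed / "a.md").write_text("original")
        try:
            with ReadOnlyProjectGuard(governed):
                (governed / "b.md").write_text("leak")
        except ProtectedPathMutationError:
            return True
        return False
```

Snapshotting both modification time and size means a same-size rewrite within the same timestamp tick could slip through, so a guard like this is a safety net rather than a proof of immutability.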

Refs #1071. Phase A of WIP/drift/06-final-plan.md.

Co-authored-by: Copilot <[email protected]>

* feat(drift): Phase B+C - replay engine + audit CLI wiring

Implements the drift detection feature per WIP/drift/06-final-plan.md
(closes #1071; scope aligned with #898).

Engine (Phase B):
- src/apm_cli/install/drift.py: ReplayConfig, DriftFinding, CheckLogger,
  CacheMissError, normalization helpers (build-id strip, line endings,
  BOM), run_replay() (cache-only), diff_scratch_against_project(),
  text/json/sarif renderers, atexit scratch cleanup.
- src/apm_cli/install/services.py: scratch_root kwarg with
  ensure_path_within defense-in-depth guard for replay isolation.
- src/apm_cli/policy/ci_checks.py: _check_drift() wrapper returning
  (CheckResult, list[DriftFinding]); graceful CacheMissError handling.
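
The normalization helpers listed for drift.py (build-id strip, line endings, BOM) could be sketched as follows. The Build ID marker format is an assumption, since the PR page does not show it, and the function names are hypothetical.

```python
import re

# Assumption: build IDs are emitted as a standalone HTML comment line.
_BUILD_ID_RE = re.compile(rb"^<!-- Build ID: [^\n]* -->\n?", re.MULTILINE)
_UTF8_BOM = b"\xef\xbb\xbf"


def normalize(data: bytes) -> bytes:
    """Strip differences that should not count as drift."""
    if data.startswith(_UTF8_BOM):
        data = data[len(_UTF8_BOM):]
    data = data.replace(b"\r\n", b"\n")  # CRLF -> LF
    return _BUILD_ID_RE.sub(b"", data)   # drop volatile build-id lines


def has_drift(deployed: bytes, replayed: bytes) -> bool:
    """Compare two file bodies after normalization."""
    return normalize(deployed) != normalize(replayed)
```

The guards should only absorb cosmetic differences: a changed body line must still compare unequal after normalization.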

CLI surface (Phase C):
- src/apm_cli/commands/audit.py: --no-drift opt-out flag with mutex
  against --strip/--file via UsageError. Drift wired into both
  _audit_ci_gate (--ci) and _audit_content_scan (bare project audit)
  paths, default-on per ADR-02. JSON/SARIF/text renderers integrated;
  --no-drift warning gated to text mode (stdout cleanliness).

Tests:
- tests/unit/install/test_drift.py: 13 unit tests (normalization,
  diff cases, renderers).
- Legacy --ci tests opt out of drift via batch --no-drift injection
  (fixture parity, not a behavior change).

7597 unit tests pass; lint clean.

Co-authored-by: Copilot <[email protected]>

* test(drift): Phase D - integration + e2e + perf coverage (43 tests)

Implements the locked test matrix for issue #1071 drift
detection. Floor of 43 tests across three new files closes the
'ULTRA HARDENING OF HELL' coverage requirement.

New files:
- tests/integration/test_drift_check.py (32 tests):
  * Section A: 9 drift cases (modified/unintegrated/orphaned + CRLF/
    BOM/Build-ID false-positive guards)
  * Section B: 4 past-PR regressions (#1067, #882, #889, source-deleted)
  * Section C: 7 edges (no/corrupt lockfile, untracked governed,
    no-write contract, idempotency)
  * Section D: 3 multi-target (copilot/claude/cursor)
  * Section E: 9 default-on / --no-drift opt-out (mutex, stderr
    routing, JSON suppression)
- tests/integration/test_drift_check_e2e.py (10 tests):
  full install->mutate->audit loop with mix_stderr=False, air-gap
  proof, JSON/SARIF stability, 30s smoke
- tests/unit/install/test_drift_perf.py (1 test):
  100 primitives replay+diff under 5s
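
The three drift kinds exercised in Section A can be illustrated with a small classifier. This is a sketch under assumed semantics: 'unintegrated' means the replay produced a file the project lacks, 'orphaned' means the project holds a deployed file the replay did not produce, and 'modified' means the contents differ. The function name and the dict-based inputs are hypothetical.

```python
from __future__ import annotations


def classify_drift(replayed: dict[str, bytes], deployed: dict[str, bytes]) -> dict[str, str]:
    """Compare replay output with deployed files, keyed by relative path."""
    findings: dict[str, str] = {}
    for path in sorted(replayed.keys() | deployed.keys()):
        if path not in deployed:
            findings[path] = "unintegrated"  # expected by replay, absent in project
        elif path not in replayed:
            findings[path] = "orphaned"      # present in project, not reproduced
        elif replayed[path] != deployed[path]:
            findings[path] = "modified"      # same path, different content
    return findings
```

Under these semantics, replaying only one target of a multi-target project leaves the other targets' files unmatched, so they would be reported as orphaned.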

Engine fix surfaced by tests:
- src/apm_cli/install/drift.py: run_replay now reads apm.yml's target
  field via parse_target_field and passes it to resolve_targets.
  Without this, multi-target projects (copilot+claude+cursor) replayed
  only the auto-detected primary target, falsely reporting secondary
  target deployments as orphaned. Helper _read_apm_yml_target() added.

CI wiring:
- scripts/test-integration.sh: two new blocks in run_e2e_tests()
  invoking the integration + e2e suites before the final success log.
  Both safe to run without GITHUB_APM_PAT (cache-only, mocked network).

Verification: 56 drift-domain tests pass; full repo lint clean.

Co-authored-by: Copilot <[email protected]>

* docs(drift): CHANGELOG + Starlight guide + apm-usage skill + ci.yml note

- CHANGELOG.md: Added [Unreleased] entry under Added describing the
  default-on drift detection in apm audit, the three failure modes it
  catches, false-positive guards, --no-drift opt-out + mutex semantics,
  and the JSON/SARIF integration shape. Closes #1071, supersedes #898.
- docs/src/content/docs/guides/drift-detection.md (NEW, sidebar order 7):
  Full user-facing guide -- what drift means, how the cache-only replay
  works (with mermaid diagram), exit-code matrix, when to use --no-drift,
  output formats, and the CI single-line gate that replaces the legacy
  git status --porcelain script.
- packages/apm-guide/.apm/skills/apm-usage/commands.md: Extended the
  audit row with --no-drift flag and added a paragraph documenting the
  drift-by-default behavior, three failure modes, false-positive
  normalization, and JSON/SARIF integration. Aligns the skill that
  ships in apm-guide with the new CLI surface (per
  apm-keep-docs-up-to-date.instructions.md rule 4).
- .github/workflows/ci.yml: Annotated Gate B (legacy bash drift check)
  with a comment marking it redundant once apm-action ships a CLI with
  default-on drift detection (this PR's release). Kept as
  defense-in-depth fallback until then.

Co-authored-by: Copilot <[email protected]>

* fix(drift): address panel feedback - recovery hint + doc-sync

CEO panel recommended landing two in-PR follow-ups before merge:

1. Recovery hint in drift output (cli-logging + devx-ux convergence):
   render_drift_text now appends '[i] Run apm install to re-sync
   deployed files with the lockfile.' so users see WHAT and HOW in one
   message. Honors Message Writing Rule #4 'Include the fix'.

2. Doc-sync (doc-writer + devx-ux convergence):
   - reference/cli-commands.md: add --no-drift to audit options table;
     amend --ci description to mention drift contribution.
   - integrations/ci-cd.md: replace bash 'git status --porcelain'
     workaround under 'Verify Deployed Primitives' with 'apm audit --ci'
     one-liner; update 'We dogfood this' callout text.
   - getting-started/quick-start.md: retarget stale cross-ref from the
     now-superseded ci-cd anchor to the new drift-detection guide.
   - guides/drift-detection.md: drop the self-contradictory case #2 in
     'When to use --no-drift' (strip-mode is auto-skipped, not opt-out).
   - CHANGELOG.md: compress verbose entry to one Keep-a-Changelog line
     pointing readers to the guide for detail.

Tracked as follow-up issues (CEO call):
- supply-chain: verify cache content matches lockfile resolved_commit
  before drift replay trusts it (commit-SHA pinning bypass on shared
  CI caches).
- test-coverage: inverse-normalization unit test asserting BOM/CRLF/
  Build-ID guards do NOT mask real content drift (safety invariant).

Lint clean. 45 drift tests pass.

Co-authored-by: Copilot <[email protected]>

* fix(drift): address Copilot review - exit-code contract + types + diagnostics

Bare 'apm audit' is advisory (exit 0 on drift); 'apm audit --ci' is
the gate (exit 1). Closes the regression introduced when content-scan
escalation accidentally also escalated drift findings.

Also addresses inline review:
- A2: vacuous ASCII-encoding assertion now scopes per-line
- A3: CacheMiss message now drift-specific (no --no-cache confusion)
- A4: tuple[float, int] -> tuple[int, int] in guards.py
- A5: type-annotated _check_drift signature
- A6: clarified DRIFT_ORPHANED comment
- A7: CHANGELOG references PR + closes

Co-authored-by: Copilot <[email protected]>

* docs(drift): link drift detection guide from README security section

Per oss-growth: surfaces drift detection alongside content security
and lockfile integrity in the conversion-critical Production-grade
section, so a reader scanning for 'why APM' sees the supply-chain
story end-to-end.

Co-authored-by: Copilot <[email protected]>

* feat(drift): cache pin marker for stale-cache detection

apm install drops a .apm-pin JSON marker into each cached package
root recording the resolved_commit; apm audit verifies it before
running drift replay. Catches the 'teammate bumped lockfile, did
not reinstall' + 'shared CI runner reused stale apm_modules'
scenarios that would otherwise silently produce misleading drift
output.

LockfileBuilder syncs markers UNCONDITIONALLY (even when the
lockfile YAML is unchanged and even when no install happens), so
existing users self-heal on their next 'apm install'.

This is stale-cache detection, NOT cryptographic integrity --
defending against active cache tampering requires content-addressed
hashes, which is deferred.

Schema (v1): {schema_version: 1, resolved_commit: <sha>}
Marker file: <install_path>/.apm-pin
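
A minimal sketch of writing and verifying a marker with the v1 schema above. Only the file name and the two schema fields come from the commit message; `write_pin`, `verify_pin`, and `demo` are hypothetical helpers, not the functions in cache_pin.py.

```python
import json
import tempfile
from pathlib import Path

PIN_FILENAME = ".apm-pin"
SCHEMA_VERSION = 1


def write_pin(install_path: Path, resolved_commit: str) -> None:
    """Record the resolved commit in the package's cache root."""
    marker = {"schema_version": SCHEMA_VERSION, "resolved_commit": resolved_commit}
    (install_path / PIN_FILENAME).write_text(json.dumps(marker), encoding="utf-8")


def verify_pin(install_path: Path, expected_commit: str) -> bool:
    """Return True only if a well-formed marker matches the lockfile commit."""
    try:
        marker = json.loads((install_path / PIN_FILENAME).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False  # missing, unreadable, or malformed marker
    return (
        isinstance(marker, dict)
        and marker.get("schema_version") == SCHEMA_VERSION
        and marker.get("resolved_commit") == expected_commit
    )


def demo():
    """Check a cache dir before marking, after marking, and after a lockfile bump."""
    with tempfile.TemporaryDirectory() as d:
        pkg = Path(d)
        before = verify_pin(pkg, "abc123")
        write_pin(pkg, "abc123")
        return before, verify_pin(pkg, "abc123"), verify_pin(pkg, "def456")
```

Because the marker is plain JSON that anyone with write access to the cache can edit, a check like this detects staleness only, which matches the commit's note that it is not cryptographic integrity.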

Coverage:
- 14 unit tests in test_cache_pin.py (positive + every error path
  + skip rules + idempotent re-run + self-heal regression)
- 1 integration test in test_drift_check_e2e.py exercising the
  full install -> mark -> verify flow against a synthetic cache

Co-authored-by: Copilot <[email protected]>

* Address panel follow-ups C1-C5 on PR #1137

C1 (supply-chain): Fail closed on unpinned remote deps
- cache_pin.find_unpinned_remote_deps() helper + stderr warning in
  sync_markers_for_lockfile
- drift._materialize_install_path raises CacheMissError for remote
  deps with resolved_commit=None (was silent fail-open)
- Replaced silent-skip test with warning assertion + new helper test

C2 (architecture): Wire _ReadOnlyProjectGuard into run_replay
- run_replay() now wraps the deps loop with _ReadOnlyProjectGuard
  on governed root dirs + apm.lock.yaml + AGENTS.md
- Regression test: monkeypatched leaky integrator triggers
  ProtectedPathMutationError

C3 (cli-logging-ux): Stderr message on swallowed CacheMissError
- audit._audit_content_scan emits '[!] drift check could not run:
  <msg>' to stderr when drift_failed and no findings (covers cache
  miss, missing lockfile, cache-pin error)
- Integration test e10 asserts stderr message in bare-audit path

C4 (docs): Baseline-check phrasing + CHANGELOG link
- governance-guide, ci-cd, cli-commands now read '7 baseline checks
  plus integration drift detection'
- CHANGELOG drift-detection link points to docs site URL

C5 (oss-growth): User-promise framing
- CHANGELOG drift entry leads with the user promise (forgotten
  installs + hand-edits) before mechanism
- drift-detection.md gains a 'Try it now' block at the top
- Before/after CI comparison promoted to its own subsection with
  explicit framing of what the bash workaround missed

Verification: ruff check + format silent; 7621 unit tests + 27 drift
integration tests green.

Co-authored-by: Copilot <[email protected]>

* docs(changelog): trim drift entry to single 'so what?' line

Collapse the two added entries (drift + cache-pin markers) into one
short line that answers the developer 'so what?' and points to the
Drift Detection guide for the full mechanism + opt-out + cache-pin
details. Per maintainer feedback: the previous entries were too long
for a CHANGELOG.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Daniel Meppiel <[email protected]>
Co-authored-by: Copilot <[email protected]>