Add ARM64 Linux support to CI/CD pipeline #4

danielmeppiel merged 15 commits into microsoft:main from pofallon:feature/linux-arm64-release on Oct 28, 2025

@pofallon
Contributor

🚀 New Feature

Description

  • Add ubuntu-24.04-arm runner for native ARM64 builds
  • Update build, integration-tests, and release-validation jobs
  • Add apm-linux-arm64 binary to release artifacts
  • Remove cross-compilation complexity in favor of native builds

I want to use apm in a linux devcontainer running on an M-series Mac host.

Changes Made

  • Feature implementation
  • Tests added
  • Documentation updated

Testing

  • Manual testing completed
  • All existing tests pass
  • New tests added and passing

Checklist

  • LABEL: Apply enhancement or feature label to this PR
  • Code follows project style guidelines
  • Documentation updated (if needed)
  • CHANGELOG.md updated (for significant features)

Fixes # (issue)

Collaborator

@danielmeppiel danielmeppiel left a comment

Looks good. CI is now failing on the smoke tests, specifically with Codex on darwin, which is potentially unrelated. Let's address the two comments I made.

    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
Collaborator

The test phase will also need to run on the new Ubuntu ARM platform.

Contributor Author

Thanks! I added a commit to fix this.

  "darwin_x86_64": "${{ steps.checksums.outputs.darwin-x86_64-sha }}",
- "linux_x86_64": "${{ steps.checksums.outputs.linux-x86_64-sha }}"
+ "linux_x86_64": "${{ steps.checksums.outputs.linux-x86_64-sha }}",
+ "linux_arm64": "${{ steps.checksums.outputs.linux-arm64-sha }}"
Collaborator

We need to update the Homebrew formula at danielmeppiel/homebrew-apm#5 so that it accepts the new binary and installs it correctly.

Comment thread .github/workflows/build-release.yml
pofallon and others added 15 commits October 28, 2025 15:47
- Add ubuntu-24.04-arm runner for native ARM64 builds
- Update build, integration-tests, and release-validation jobs
- Add apm-linux-arm64 binary to release artifacts
- Remove cross-compilation complexity in favor of native builds
- Add aarch64|arm64 architecture detection in setup-common.sh
- Add linux-aarch64 -> aarch64-unknown-linux-gnu mapping in setup-codex.sh
- Fixes smoke test failures on ARM64 Linux runners
- Enables Codex runtime installation on ARM64 Linux systems

Fixes smoke test error: 'Unsupported Linux architecture: aarch64'
- Add aarch64|arm64 -> apm-linux-aarch64 mapping in test-integration.sh
- Fixes integration test failures on ARM64 Linux runners
- Complements runtime setup ARM64 support

This was the missing piece causing integration tests to fail with:
'Unsupported Linux architecture: aarch64'
- Runtime list command was failing with KeyError: 'cross'
- cli.py line 2040 references STATUS_SYMBOLS['cross'] but it wasn't defined
- Added 'cross': '❌' to STATUS_SYMBOLS in console.py

Fixes integration test failures:
- test_runtime_list_command
- test_dual_runtime_installation

Error was: 'Error listing runtimes: 'cross''
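
The failure described in this commit is a plain dictionary lookup on a missing key. A minimal sketch, assuming a symbol table shaped roughly like the one in console.py; the `check` entry and the `render_status` helper are hypothetical and only illustrate the lookup:

```python
# Hypothetical symbol table; only the 'cross' entry is confirmed by the commit message.
STATUS_SYMBOLS = {
    "check": "✅",  # assumed entry, for illustration only
    "cross": "❌",  # the entry this commit adds
}


def render_status(ok: bool, message: str) -> str:
    """Prefix a message with the status symbol that matches the outcome."""
    key = "check" if ok else "cross"
    # Without the 'cross' entry, this lookup raises KeyError: 'cross'.
    return f"{STATUS_SYMBOLS[key]} {message}"
```

With the entry present, `render_status(False, "codex not installed")` returns the message prefixed by the cross symbol instead of raising.
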
- Add GITHUB_APM_PAT as fallback for models token purpose
- Ensure both GITHUB_TOKEN and GITHUB_APM_PAT are available to Codex script
- Fixes 'Using unauthenticated GitHub API request' in CI smoke tests

The smoke test environment only provides GITHUB_APM_PAT but not GITHUB_TOKEN.
The token manager now properly falls back to GITHUB_APM_PAT for GitHub Models
API access when GITHUB_TOKEN is not available, and ensures the Codex setup
script has access to both tokens as expected.
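
The fallback order described above can be sketched as follows. This is an illustration under stated assumptions, not the token manager's actual code: the function name `resolve_models_token` is hypothetical, and only the two variable names come from the commit message.

```python
import os
from typing import Mapping, Optional


def resolve_models_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return the first non-empty token, preferring GITHUB_TOKEN over GITHUB_APM_PAT."""
    env = os.environ if env is None else env
    for name in ("GITHUB_TOKEN", "GITHUB_APM_PAT"):
        value = env.get(name, "").strip()
        if value:
            return value
    # No token available: the caller would fall back to unauthenticated requests.
    return None
```

In the smoke test environment described above, where only GITHUB_APM_PAT is set, this returns the PAT rather than `None`.
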
- Change apm-linux-aarch64 to apm-linux-arm64 in test-integration.sh
- Matches build script normalization: aarch64 -> arm64
- Fixes 'Binary not found: ./dist/apm-linux-aarch64/apm' error

The build script normalizes aarch64 architecture to arm64, creating
apm-linux-arm64 binary, but integration test was looking for
apm-linux-aarch64. Now both use consistent arm64 naming.
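
The naming mismatch is easier to see with the normalization written out. The real logic lives in shell scripts; the Python sketch below only mirrors the aarch64 -> arm64 rule stated in the commit message. The function names, the `amd64` alias, and the general `apm-<os>-<arch>` pattern are assumptions made for illustration.

```python
import platform


def normalize_arch(machine: str) -> str:
    """Map raw machine names onto the architecture labels used in binary names."""
    machine = machine.lower()
    if machine in ("aarch64", "arm64"):
        return "arm64"  # the normalization named in the commit message
    if machine in ("x86_64", "amd64"):  # 'amd64' alias is an assumption
        return "x86_64"
    raise ValueError(f"Unsupported architecture: {machine}")


def binary_name(system: str, machine: str) -> str:
    """Build a name such as apm-linux-arm64 (naming pattern assumed)."""
    return f"apm-{system.lower()}-{normalize_arch(machine)}"


if __name__ == "__main__":
    # On an ARM64 Linux runner, platform.machine() typically reports 'aarch64'.
    print(binary_name(platform.system(), platform.machine()))
```

Routing both the build and the test lookup through one such function would keep the two names from drifting apart again.
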
- Add debug prints to _setup_codex_tokens to see what tokens are available
- This will help diagnose why Codex setup is still using unauthenticated requests
- Temporary debugging commit to understand token flow in CI

Will remove debug prints once issue is identified and fixed.
- Add env=os.environ.copy() to subprocess.run() in run_command()
- Ensures GITHUB_APM_PAT and other environment variables are properly
  passed to the shell scripts in smoke tests
- Fixes authentication issue where runtime setup scripts couldn't
  access GitHub tokens set in CI workflow

Root cause: subprocess.run() with shell=True may not always inherit
all environment variables properly, especially in test environments.
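
A minimal sketch of the change, assuming a helper shaped like `run_command()`; the `extra_env` parameter is hypothetical and added only to make the example testable. Note that `subprocess.run()` already passes the parent environment when `env` is `None`, so the explicit copy mainly makes the hand-off visible and gives the caller a place to add overrides.

```python
import os
import subprocess
from typing import Mapping, Optional


def run_command(
    cmd: str, extra_env: Optional[Mapping[str, str]] = None
) -> subprocess.CompletedProcess:
    """Run a shell command with an explicit copy of the current environment."""
    env = os.environ.copy()  # snapshot of the parent environment, tokens included
    if extra_env:
        env.update(extra_env)  # hypothetical hook for per-call overrides
    return subprocess.run(
        cmd, shell=True, env=env, capture_output=True, text=True, check=False
    )
```

On a POSIX shell, `run_command('echo "$GITHUB_APM_PAT"')` would then print whatever value the CI workflow exported.
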
- Revert debug logging added for troubleshooting
- Keep only the core fix for subprocess environment passing
- Ensures tokens are available for GitHub API requests to fetch latest release
- Fixes 'Using unauthenticated GitHub API request' in CI environments
- API call at line ~97 needs tokens set up early, not at line 171
- Addresses timing issue where tokens were configured after API usage
- Smoke tests now have both GITHUB_TOKEN and GITHUB_APM_PAT like other test jobs
- Maps GH_MODELS_PAT → GITHUB_TOKEN for GitHub API authentication
- Maps GH_CLI_PAT → GITHUB_APM_PAT for APM module access
- Fixes 'Using unauthenticated GitHub API request' in smoke test step
- Makes smoke test environment consistent with integration-tests and release-validation
- Add detailed environment variable debugging to github-token-helper.sh
- Show initial and final token state with character counts
- Add debug logging to setup-codex.sh before GitHub API calls
- Add environment debugging to test_runtime_smoke.py subprocess calls
- Debug output will show exactly which tokens are available and being used
- Helps diagnose why CI shows 'unauthenticated' despite token setup success

This will reveal the exact root cause of authentication failure in CI.
- Change from pull_request to pull_request_target
- Enables secrets access for fork PRs while maintaining security
- Revert to proper secrets (GH_MODELS_PAT, GH_CLI_PAT, GH_PKG_PAT)
- Add security documentation for pull_request_target usage

Fixes GitHub Actions fork PR limitation where custom secrets
are not available, while preserving full functionality for
regular PRs and main branch builds.
- Revert from pull_request_target to pull_request for proper security
- This prevents automatic secrets exposure to untrusted fork code
- Fork PRs will now require manual approval workflow as intended
- Maintains GitHub's security model for open source projects

This addresses the security concern where pull_request_target would
automatically grant secrets access without approval, bypassing
GitHub's built-in fork protection mechanisms.
@danielmeppiel danielmeppiel merged commit b4f122d into microsoft:main Oct 28, 2025
@danielmeppiel
Collaborator

danielmeppiel commented Oct 28, 2025

@pofallon thank you, first binary available here https://github.com/danielmeppiel/apm/actions/runs/18882334077/artifacts/4396277101

Will ship on 4.3 this week

Related finding on fork PR testing #12

sergio-sisternes-epam added a commit to sergio-sisternes-epam/apm that referenced this pull request Mar 2, 2026
…microsoft#5/microsoft#6)

- Add _validate_inline_url() with https/http scheme allowlist
- Add _install_inline_mcp_deps() delegating to ClientFactory adapters
- VSCode: read-merge-write via adapter (full-overwrite API)
- Copilot/Codex: pass merge dict via adapter update_config()
- 15 new tests covering adapter delegation, URL validation, error cases
- All 866 tests pass
danielmeppiel added a commit that referenced this pull request Mar 13, 2026
- Add explicit 'Emojis are banned' rule to console helper section
- Clarify STATUS_SYMBOLS renders ASCII text symbols, not emojis
- Add anti-pattern #4 for emoji ban enforcement

Co-authored-by: Copilot <[email protected]>
danielmeppiel added a commit that referenced this pull request Mar 31, 2026
- Narrow except Exception to except ImportError for lazy marketplace import (comment #1)
- Fix provenance key mismatch: use dep identity instead of canonical for lockfile lookup (comment #2)
- Include subdir in git-subdir source resolution with path traversal validation (comment #3)
- Include relative path in relative source resolution with traversal validation (comment #4)
- Sanitize marketplace name in cache file paths to prevent path traversal (comment #5)
- Fix docs: stale-if-error, not stale-while-revalidate (comment #6)
- Consolidate CHANGELOG entries into single line with (#503) (comment #7)
- Remove unused _SUPPORTED_SOURCE_TYPES set (comment #8)
- Let auth errors propagate in _auto_detect_path instead of swallowing (comment #9)
- Validate marketplace --name against [a-zA-Z0-9._-]+ charset (comment #10)
- Fix doc examples to use identifier-compatible names (comments #11, #12)
- Update tests to match corrected resolver behavior, add traversal tests

Co-authored-by: Copilot <[email protected]>
danielmeppiel added a commit that referenced this pull request Mar 31, 2026
…covery + governance (#503)

* Initial plan

* Initial plan for marketplace integration

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <[email protected]>

* feat: marketplace integration core implementation

- Add marketplace/ package: models, errors, registry, client, resolver
- Add marketplace CLI commands: add, list, browse, update, remove, search
- Add lockfile provenance fields: discovered_via, marketplace_plugin_name
- Add install hook for NAME@MARKETPLACE syntax pre-parse intercept
- Wire marketplace commands in cli.py

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <[email protected]>

* docs: add marketplace integration guide and CLI reference

- Create guides/marketplaces.md covering marketplace concepts,
  registration, browsing, search, install syntax, provenance tracking,
  and cache behavior
- Add apm marketplace and apm search command sections to cli-commands.md
- Update apm install arguments to include NAME@MARKETPLACE syntax
- Update plugins.md Finding Plugins section with marketplace cross-refs

Co-authored-by: danielmeppiel <[email protected]>

* docs: fix marketplace.json format and lockfile field names to match implementation

- Use array-based plugins format matching models.py parser expectations
- Use discovered_via and marketplace_plugin_name matching lockfile.py fields
- Document both Copilot CLI (repository/ref) and Claude Code (source) formats

Co-authored-by: danielmeppiel <[email protected]>

* docs: fix git-subdir and relative source descriptions to match resolver

- git-subdir uses separate repo and subdir fields
- Relative string sources resolve to marketplace repo subdirectory

Co-authored-by: danielmeppiel <[email protected]>

* feat: add marketplace unit tests and docs

- 114 unit tests across 8 test files covering all marketplace modules
- New marketplace guide at docs/src/content/docs/guides/marketplaces.md
- Updated CLI reference with marketplace and search commands
- Updated plugins guide with marketplace integration section
- CHANGELOG entry for marketplace feature

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <[email protected]>

* refactor: address code review feedback

- Use List[MarketplacePlugin] from typing instead of lowercase generic
- Eliminate duplicated condition in install.py marketplace intercept
- Restructure control flow for clarity

Agent-Logs-Url: https://github.com/microsoft/apm/sessions/12a9b016-7930-41b8-a340-c64f11486b71

Co-authored-by: danielmeppiel <[email protected]>

* fix: address all 12 PR review comments on marketplace integration

- Narrow except Exception to except ImportError for lazy marketplace import (comment #1)
- Fix provenance key mismatch: use dep identity instead of canonical for lockfile lookup (comment #2)
- Include subdir in git-subdir source resolution with path traversal validation (comment #3)
- Include relative path in relative source resolution with traversal validation (comment #4)
- Sanitize marketplace name in cache file paths to prevent path traversal (comment #5)
- Fix docs: stale-if-error, not stale-while-revalidate (comment #6)
- Consolidate CHANGELOG entries into single line with (#503) (comment #7)
- Remove unused _SUPPORTED_SOURCE_TYPES set (comment #8)
- Let auth errors propagate in _auto_detect_path instead of swallowing (comment #9)
- Validate marketplace --name against [a-zA-Z0-9._-]+ charset (comment #10)
- Fix doc examples to use identifier-compatible names (comments #11, #12)
- Update tests to match corrected resolver behavior, add traversal tests

Co-authored-by: Copilot <[email protected]>

* fix: Copilot CLI format compatibility and marketplace provenance bugs

Bug #1 - Format incompatibility with awesome-copilot marketplace:
  - Parser now accepts 'source' key (Copilot CLI) as type discriminator
    fallback when 'type' key is absent, normalizing to 'type' for resolvers
  - GitHub source resolver now accepts 'path' field (Copilot CLI) as
    virtual subdirectory, same as 'subdir' in git-subdir sources
  - Path traversal validation applied to 'path' field
  - Fixes: 8 of 62 plugins in awesome-copilot that use github source
    objects with 'source'+'path' keys instead of 'type'+'subdir'

Bug #2 - Lockfile provenance never written:
  - Root cause: install passed raw marketplace refs (NAME@MARKETPLACE)
    as only_packages, but DependencyReference.parse() can't parse those,
    so identity filtering removed all deps -> 'already installed'
  - Fix: use validated_packages (canonical owner/repo strings) instead
    of raw click argument for only_pkgs

Both bugs verified fixed via E2E tests against real marketplaces:
  - github/awesome-copilot (62 plugins)
  - anthropics/skills (3 plugins)
  - microsoft/azure-skills (1 plugin)

Co-authored-by: Copilot <[email protected]>

* feat: scope marketplace search to QUERY@MARKETPLACE format

Search now requires QUERY@MARKETPLACE (e.g. apm search security@skills)
to eliminate name collisions across marketplaces. Added search_marketplace()
client function for single-marketplace search.

- Rejects bare queries without @ — clear error with usage example
- Validates marketplace exists before searching
- Updated docs/guides/marketplaces.md with new syntax
- 7 test cases: format validation, unknown marketplace, results, no results
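
A rough sketch of the QUERY@MARKETPLACE parsing rule, assuming the split happens at the last `@`; the function name and the exact error wording are hypothetical.

```python
from typing import Tuple


def parse_scoped_query(expr: str) -> Tuple[str, str]:
    """Split 'QUERY@MARKETPLACE' into its two parts, rejecting bare queries."""
    query, sep, marketplace = expr.rpartition("@")  # split at the last '@' (assumed)
    if not sep or not query.strip() or not marketplace.strip():
        raise ValueError("Expected QUERY@MARKETPLACE, e.g. security@skills")
    return query.strip(), marketplace.strip()
```

For the example in the commit message, `parse_scoped_query("security@skills")` yields the query `security` and the marketplace `skills`; checking that the marketplace is registered would be a separate step.
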

Co-authored-by: Copilot <[email protected]>

* docs: update CLI reference and plugins guide for scoped search syntax

Align all documentation with QUERY@MARKETPLACE search format.

Co-authored-by: Copilot <[email protected]>

* refactor: use centralized path_security for marketplace traversal checks

Replace 3 ad-hoc '..' in x.split('/') checks in marketplace/resolver.py
with validate_path_segments() from utils/path_security.py. Add
defense-in-depth validate_path_segments() call to _sanitize_cache_name()
in client.py.

This ensures marketplace code uses the same cross-platform path safety
utilities (backslash normalization, single-dot rejection) as the rest
of APM.
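
To illustrate what a centralized segment check buys over `'..' in x.split('/')`, here is a hypothetical stand-in. It is not the real `validate_path_segments()`; it only models the two behaviors the commit message names (backslash normalization and single-dot rejection), and rejecting empty segments is an extra assumption.

```python
def check_path_segments(path: str) -> str:
    """Normalize separators and reject traversal-prone segments (illustrative only)."""
    normalized = path.replace("\\", "/")  # treat Windows separators like '/'
    for segment in normalized.split("/"):
        # '' covers leading, trailing, and doubled slashes (assumed to be rejected).
        if segment in ("", ".", ".."):
            raise ValueError(f"Unsafe path segment {segment!r} in {path!r}")
    return normalized
```

An ad-hoc check that splits only on `/` would let `plugins\..\secrets` through, which is the kind of gap a shared utility closes.
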

Co-authored-by: Copilot <[email protected]>

* docs: add path safety rule to copilot-instructions.md

Directs contributors to use validate_path_segments() and
ensure_path_within() from utils/path_security.py instead of
ad-hoc traversal checks.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: copilot-swe-agent[bot] <[email protected]>
Co-authored-by: danielmeppiel <[email protected]>
Co-authored-by: danielmeppiel <[email protected]>
Co-authored-by: Copilot <[email protected]>
danielmeppiel added a commit that referenced this pull request Apr 22, 2026
* feat(policy): W1 foundations for install-time policy enforcement (#827)

Wave 1 of issue #827 implementation. Lays the foundations the install
pipeline gate (W2) will plug into. No behaviour change yet — install
still does NOT enforce policy until W2 wires the gate phase.

What's in:
- policy_checks: new public seam run_dependency_policy_checks(deps,
  lockfile=, policy=, mcp_deps=, effective_target=) accepting a
  resolved dep set; old run_policy_checks(project_root, policy) is now
  a thin wrapper. Honours require_resolution: project-wins for
  version-pin mismatches only. Latent isinstance(allow, list) bug
  fixed for schema's Tuple[str, ...].
- policy/discovery: cache stores merged effective policy with chain
  metadata + fingerprint. Atomic writes via temp + os.replace, with
  pid+thread_id suffix to prevent concurrent-writer collision.
  MAX_STALE_TTL=7d ceiling on cache reuse. PolicyFetchResult expanded
  to express 9 outcomes (found, absent, cached_stale,
  cache_miss_fetch_fail, malformed, disabled, garbage_response,
  no_git_remote, empty).
- diagnostics: CATEGORY_POLICY constant + per-category renderer wired
  into render_summary().
- command_logger: InstallLogger.policy_resolved/violation/disabled
  with per-class actionable error wording (auth/unreachable/malformed/
  blocked).
- tests/fixtures/policy/: 14 policy fixtures + 7 project fixtures
  (denied-direct, denied-transitive, required-missing,
  required-version-mismatch, mcp-denied, target-mismatch,
  unpacked-bundle) covering W4 live matrix scenarios L2/L4/L13 and
  rubber-duck findings I5/I6/I7/N14/C2.
- docs: 12-section Install-time enforcement guide skeleton in both
  enterprise/policy-reference.md and packages/apm-guide skill mirror.
  10 sections filled; sections 7 (snippets) and 10 (error table)
  stubbed for W3-docs-final once W2 lands and W4 captures live output.

Tests:
- tests/unit: 4878 passed (1 pre-existing unrelated MCP failure
  deselected). Includes 41 logger + 29 policy-seam + 38 cache + 21
  fixture-load new tests.

Refs: #827
Co-authored-by: Copilot <[email protected]>

* feat(install): W2A policy enforcement at install time (#827)

Wave 2A wires the three install-time enforcement sites planned for #827:

1. **Pipeline gate phase** (src/apm_cli/install/phases/policy_gate.py):
   New phase running between resolve and targets. Discovers org policy,
   resolves the inheritance chain via resolve_policy_chain, persists the
   merged effective policy + chain refs to cache (chain_refs threading
   per C1 amendment), then calls run_dependency_policy_checks against
   the resolved deps. Routes 9 discovery outcomes (found, absent,
   cached_stale, cache_miss_fetch_fail, malformed, disabled,
   garbage_response, no_git_remote, empty). Block-mode violations raise
   PolicyViolationError to halt the pipeline cleanly.

2. **--mcp branch preflight** (src/apm_cli/policy/install_preflight.py
   + commands/install.py:1091-1125):
   apm install --mcp does NOT enter the install pipeline. New shared
   helper run_policy_preflight() runs discovery + dep checks for any
   non-pipeline command site. Wired into --mcp BEFORE _run_mcp_install
   so denied servers never reach the integrator. Also exports
   PolicyBlockError for callers.

3. **install <pkg> snapshot+rollback** (commands/install.py):
   apm install <pkg> mutates apm.yml BEFORE the pipeline runs. We now
   snapshot apm.yml as raw bytes (not parsed YAML, to avoid round-trip
   drift on whitespace / key-order / comments), and on ANY pipeline
   failure (policy block, download error, etc.) restore byte-for-byte
   via tempfile + os.replace atomic write. Logs '[i] apm.yml restored
   to its previous state.' and exits non-zero.
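
The snapshot-and-rollback idea in item 3 can be sketched as below. This is a minimal illustration, assuming hypothetical helper names `snapshot` and `restore`; it shows raw-byte capture plus a temp file and `os.replace`, not the project's actual code.

```python
import os
import tempfile
from pathlib import Path


def snapshot(path: Path) -> bytes:
    """Capture the file as raw bytes so comments and whitespace survive."""
    return path.read_bytes()


def restore(path: Path, data: bytes) -> None:
    """Write the snapshot to a temp file in the same directory, then swap it in."""
    fd, tmp = tempfile.mkstemp(dir=str(path.parent), prefix=path.name + ".")
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(data)
        os.replace(tmp, path)  # atomic rename when both paths share a filesystem
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)  # do not leave a stray temp file behind
        raise
```

A caller would take the snapshot before mutating apm.yml, run the pipeline inside `try`, and call `restore()` from the failure path before exiting non-zero.
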

InstallContext gains policy_fetch, policy_enforcement_active, no_policy.

Tests: +68 new tests, 4946 unit tests pass total.
- test_policy_gate_phase.py: 27 (covers all 9 outcomes)
- test_mcp_preflight_policy.py: 22 (escape hatches, allow/deny, transport,
  self-defined, trust_transitive, discovery outcomes, return shape)
- test_install_pkg_policy_rollback.py: 19 (byte-equal restore, comments
  preserved, --no-policy bypass, download error rollback, snapshot
  unit tests)

W2B (dry-run, target-aware, escape-hatch CLI flag) and C2 panel review
follow.

Refs: #827
Co-authored-by: Copilot <[email protected]>

* feat(policy): W2B install enforcement - escape hatch, dry-run preview, target-aware check (#827)

W2B completes the enforcement surface:

* policy_target_check.py - new pipeline phase after targets that re-runs
  target/compilation checks with the resolved effective_target. Filters
  to TARGET_CHECK_IDS only to avoid double-emitting dep violations from
  the gate phase. Honors CLI --target override (I6 fix scenario).

* --no-policy escape hatch on apm install / install <pkg> / install --mcp
  / update. APM_POLICY_DISABLE=1 env var equivalent. Both route through
  ctx.no_policy and emit always-visible warnings via
  InstallLogger.policy_disabled() noting that apm audit --ci still fails.

* --dry-run policy preview. run_policy_preflight gains dry_run=True kwarg.
  Emits '[!] Would be blocked by policy: <dep> -- <reason>' (block) or
  '[!] Policy warning: <dep> -- <reason>' (warn) before the would-install
  table. Never raises, never mutates. Direct manifest deps only (resolver
  doesn't run in dry-run; documented limitation).

InstallRequest, InstallService, InstallContext threaded with no_policy.
LOC budget on install.py raised 1625 -> 1650 with documented rationale.

Tests: 5003 unit pass (+57 W2B: 17 target_check + 24 no_policy_flag +
16 dry_run_policy). Full suite green vs main baseline.

Co-authored-by: Copilot <[email protected]>

* fix(policy): C2 panel fixes - transitive MCP enforcement, shared chain discovery, dry-run cap, drop apm update --no-policy (#827)

C2 panel checkpoint surfaced 4 fixes (S1+B1+D2 BLOCKER/PASS-WITH-CONCERN, D1
DevX). All landed; full suite 5032 pass.

S1 (Supply Chain BLOCKER) - transitive MCP enforcement:
  Transitive MCP servers from APM packages were bypassing install-time policy.
  The pipeline gate phase only sees direct apm.yml deps; transitive MCP servers
  are merged later via MCPIntegrator.collect_transitive() and written to
  runtime configs (.copilot/mcp.json, .cursor/mcp.json) with no policy check.
  This defeated #827 on the most security-critical dep category.
  Fix: second run_policy_preflight() call in commands/install.py after the
  transitive merge, before MCPIntegrator.install(). On block: abort MCP config
  writes, exit non-zero. APM packages remain installed (gate phase approved
  them). 15 new unit tests in test_transitive_mcp_policy.py.

B1 (Architect, partial) - shared chain-aware discovery:
  Extract discover_policy_with_chain() into policy/discovery.py so both
  policy_gate.py and install_preflight.py walk the same inheritance chain.
  Closes the gap where --mcp / --dry-run paths could resolve a different
  effective policy than the pipeline path. Gate-phase keeps its 9-outcome
  routing; only the discovery seam moved. 10 new tests in
  test_chain_discovery_shared.py.

D2 (DevX UX) - dry-run noise cap:
  install_preflight._DRY_RUN_PREVIEW_LIMIT = 5. Long deny lists now show
  5 lines per severity bucket + tail '[!] ... and N more would be blocked
  by policy. Run apm audit for full report.' 4 new tests.
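
The noise cap can be sketched for a single severity bucket as follows. The limit of 5 and the tail wording come from the commit message; the function name is hypothetical, and the per-bucket grouping is left out.

```python
from typing import List

DRY_RUN_PREVIEW_LIMIT = 5  # value named in the commit message


def preview_blocked(lines: List[str], limit: int = DRY_RUN_PREVIEW_LIMIT) -> List[str]:
    """Return at most `limit` lines plus a tail line counting the remainder."""
    shown = list(lines[:limit])
    hidden = len(lines) - len(shown)
    if hidden > 0:
        shown.append(
            f"[!] ... and {hidden} more would be blocked by policy. "
            "Run apm audit for full report."
        )
    return shown
```

With eight blocked dependencies, the preview would carry five lines and one tail line reporting the remaining three.
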

D1 (DevX UX) - drop apm update --no-policy:
  apm update is the CLI self-updater (refreshes the apm binary), not a
  dependency refresh. The flag was accepted but unused. Removed the option
  and flipped the test to assert the flag is now rejected.

LOC budget on install.py raised 1650 -> 1675 with documented justification.

Tests: 5032 unit pass (+29 new: 15 transitive_mcp + 10 chain_discovery_shared
+ 4 dry_run_noise_cap). 1 pre-existing MCP test deselected.

Co-authored-by: Copilot <[email protected]>

* docs+test(policy): W3 - integration matrix, docs final fill, CHANGELOG, growth (#827)

W3 phase complete. All 5 parallel workstreams landed.

Tests:
  tests/integration/test_policy_install_e2e.py - 17 e2e scenarios I1..I17
  Covers all 9 PolicyFetchResult outcomes + all 6 violation classes via
  CliRunner-driven full-pipeline flows. Mocks discover_policy_with_chain
  at both seams (policy_gate + install_preflight). Uses _build_policy()
  helper for frozen-dataclass safe construction.

Docs:
  docs/src/content/docs/enterprise/policy-reference.md
    sec 7: 8 verbatim CLI snippets (success, block, warn, --no-policy,
    APM_POLICY_DISABLE, --dry-run with overflow tail, install <pkg>
    rollback, transitive MCP block)
    sec 10: outcome table (9 fetch outcomes) + violation table (6 classes)
    Added explicit JSON/SARIF non-goal callout (C1 amendment).
  packages/apm-guide/.apm/skills/apm-usage/governance.md
    Same content, leaner skill version, links back to docs for full text.

CHANGELOG.md:
  Added: --no-policy / APM_POLICY_DISABLE escape hatch, --dry-run preview,
    install <pkg> rollback
  Changed: pipeline gains policy_gate + policy_target_check phases, shared
    chain discovery + atomic cache + MAX_STALE_TTL
  Security (headline): apm install enforces apm-policy.yml; transitive MCP
    checked before runtime config write

Follow-up issue #829 filed: policy.fetch_failure: warn|block schema knob.

Tests: 5049 pass (5032 unit + 17 integration). 1 pre-existing MCP test
deselected.

PR body drafted at session-state/files/pr-body-827.md. Growth strategy
entry + asciinema script staged in WIP (gitignored).

Co-authored-by: Copilot <[email protected]>

* fix(policy): C3 fixes - direct MCP enforcement, malformed posture, warn-mode coverage, doc drift (#827)

C3 final panel + rubber-duck found 5 issues. All fixed.

#1 (CRITICAL) - Direct MCP deps in apm.yml bypassed enforcement:
  ctx.direct_mcp_deps now populated in pipeline.py from
  apm_package.get_mcp_dependencies() before policy_gate runs. policy_gate
  reads direct_mcp_deps (not the dead mcp_deps_to_install) and passes them
  to run_dependency_policy_checks. install.py:1496 second preflight guard
  drops 'and transitive_mcp' so direct-only MCP installs are also caught.

#2 (CRITICAL) - Malformed policy handling inconsistent + broke rollback:
  policy_gate.py replaced sys.exit(1) on malformed with fail-open warn
  (matches install_preflight + cache_miss_fetch_fail/garbage_response
  posture). sys.exit was bypassing the rollback handler in install.py for
  apm install <pkg>. CEO mandate: malformed = warn, fail-closed knob is
  follow-up #829.

#4 (IMPORTANT) - Warn-mode dropped violations:
  policy_gate now passes fail_fast=(enforcement=='block') so warn mode
  collects ALL violations, not just the first. Also emits warnings for
  passed=True checks with non-empty details (project-wins version-pin
  mismatches were silently dropped).

#3 (IMPORTANT) - Chain inheritance is 1-level, not multi-level:
  discover_policy_with_chain only walks one parent. Toned down docs in
  policy-reference.md and governance.md with explicit caution callout.
  Filed follow-up #831 for proper recursive walk + cycle detection.

#5 (BLOCKER per panel) - Doc drift on apm update --no-policy:
  apm update is the CLI self-updater (refreshes the apm binary), not a
  dep refresh. Removed all mentions from both docs. apm deps update is
  the dep-refresh surface (runs install pipeline, gate applies); --no-policy
  is NOT exposed there today.

Tests: 5059 pass (5049 baseline + 10 new: 6 unit gate + 4 integration
I18/I19/I20). New integration tests cover real direct-MCP block, real
malformed fail-open, warn-mode multi-violation. I16 class renamed to
TestI16GarbageResponsePolicy to fix mislabeling.

Follow-ups: #829 (fetch_failure schema knob), #831 (multi-level chain).

Co-authored-by: Copilot <[email protected]>

* fix(policy): in-PR resolution of #834 (warn-mode rendering) and #831 (recursive extends chain) (#827)

Originally filed as follow-ups during C3, moved in-PR per reviewer
request so #832 ships a complete enforcement story.

#834 - Warn-mode policy violations did not render in the install
summary. Root cause: pipeline created a fresh DiagnosticCollector for
install_result.diagnostics while InstallLogger.policy_violation()
pushed warnings into logger.diagnostics. Two collectors, one rendered.
Fix: when a logger is present, reuse logger.diagnostics so policy
records flow through render_summary() (block mode unaffected - it
aborts inline before summary).

#831 - extends: chain only supported one level (parent). Inheritance
machinery (resolve_policy_chain, detect_cycle, MAX_CHAIN_DEPTH=5) was
already N-deep capable; discovery never wired it. Fix: rewrite
_resolve_and_persist_chain as iterative depth-first walk, leaf-first;
cycle detection via inheritance.detect_cycle; honor MAX_CHAIN_DEPTH=5
with explicit pre-append check; partial-chain warning when a mid-chain
ref fails to fetch ('Policy chain incomplete: <ref> unreachable, using
<N> of <M> policies'); single cache write at leaf with full chain
fingerprint.
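
A simplified sketch of an iterative, leaf-first walk with cycle detection and a depth ceiling. The names `walk_extends_chain` and `get_parent` are hypothetical, raising on the depth limit is an assumption about the behavior, and the partial-chain warning and cache write are omitted.

```python
from typing import Callable, List, Optional, Set

MAX_CHAIN_DEPTH = 5  # value named in the commit message


def walk_extends_chain(
    leaf_ref: str, get_parent: Callable[[str], Optional[str]]
) -> List[str]:
    """Follow extends: references from the leaf upward, returning refs leaf-first."""
    chain: List[str] = []
    seen: Set[str] = set()
    ref: Optional[str] = leaf_ref
    while ref is not None:
        if ref in seen:
            raise ValueError(f"Cycle detected in extends chain at {ref!r}")
        if len(chain) >= MAX_CHAIN_DEPTH:  # check before appending
            raise ValueError(f"extends chain exceeds MAX_CHAIN_DEPTH={MAX_CHAIN_DEPTH}")
        seen.add(ref)
        chain.append(ref)
        ref = get_parent(ref)  # None means this policy has no parent
    return chain
```

For a three-level chain, the result lists the leaf first and the root last, which is the order a merge step would need to apply parents before children in reverse.
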

Tests: +1 unit (warn-render), +5 unit (3-level full, cycle, depth
limit, partial chain, single-level regression), +1 integration
(TestI21ThreeLevelExtendsChain). 5044 unit pass.

Docs: enterprise/policy-reference.md and apm-usage/governance.md
chain-depth callouts updated.

Co-authored-by: Copilot <[email protected]>

* docs(changelog): record in-PR resolution of #834 and #831 under #827

Co-authored-by: Copilot <[email protected]>

* fix(policy): address review-panel pre-merge findings (#827)

- Security F1 (HIGH): pin extends: chain to leaf policy host; disable
  HTTP redirects in _fetch_from_url and _fetch_github_contents. Closes
  cross-host credential leak vector via git credential fill fallback
  and SSRF/Referer-leak vector via 30x redirects. raw.githubusercontent
  .com is treated as distinct from github.com (strict pin).
- Logging C1+C2 + UX F1/F2/F4/F5/F9: extract InstallLogger.policy_
  discovery_miss() canonical helper covering all 7 discovery outcomes;
  route both policy_gate and install_preflight through it. absent now
  verbose-only; no_git_remote downgraded to [i]; garbage_response gets
  distinct wording (no VPN/firewall noise); cached_stale and cache_
  miss_fetch_fail messages now state enforcement posture explicitly;
  violation messages dedupe dep_ref prefix; wire _policy_reason_blocked
  into block-severity policy_violation as dim secondary line.
- Docs: remove [Planned] banner from policy-reference; update
  enforcement tables (policy-reference + governance skill) to reflect
  install-time blocking; document --no-policy / APM_POLICY_DISABLE in
  cli-commands.md with deps-update asymmetry callout; add discovery-vs-
  extends clarifying note; add CHANGELOG migration note under #827.
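
The strict host pin from the Security F1 item can be illustrated with a hostname comparison. This is a sketch only: the function name is hypothetical, and the real check may compare more than the hostname.

```python
from urllib.parse import urlsplit


def same_host(leaf_url: str, parent_url: str) -> bool:
    """Return True only when both URLs name exactly the same host."""
    leaf = (urlsplit(leaf_url).hostname or "").lower()
    parent = (urlsplit(parent_url).hostname or "").lower()
    # Strict equality: raw.githubusercontent.com does not count as github.com.
    return bool(leaf) and leaf == parent
```

Under this rule a leaf policy on github.com could not pull a parent from raw.githubusercontent.com, matching the strict pin described above.
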

Tests: 5053 -> 5068 (+15 logging, +9 security host-pin).

Co-authored-by: Copilot <[email protected]>

* feat(policy): ship enterprise hardening pack on top of #827

Four enterprise hardening items shipped in-PR per CISO-arbitrated panel
verdict + CTO threat-model deep dive (PR #832 comments 4294087760 +
4294115069). Closes #829.

1. policy.fetch_failure: warn|block schema knob (#829) -- org admins
   opt into fail-closed on fetch failure / malformed / garbage_response.
   Default 'warn' preserves backwards compat.
2. apm.yml policy.fetch_failure_default: warn|block -- project-side
   complement so a project can lock down behavior even when no policy
   is reachable to read the org-side knob from.
3. apm policy status diagnostic command -- show discovery outcome,
   source, enforcement, cache age, extends chain, effective rule
   counts, and hash-pin state. --json for SIEM ingestion. Trust-but-
   verify tool that makes fail-open acceptable.
4. apm.yml policy.hash: 'sha256:...' consumer-side pin -- closes the
   garbage_response compromised-intermediary vector by verifying raw
   policy bytes against a project-pinned digest. Equivalent of pip
   --require-hashes for the policy itself. ALWAYS fail-closed on
   mismatch, regardless of fetch_failure setting (a hash mismatch is
   an explicit pin violation, not a fetch failure). sha384/sha512
   accepted; md5/sha1 rejected (collision-resistant only).
5. apm audit --ci auto-discovers org policy when --policy-source is
   not provided; --no-policy flag added to skip. Closes the
   audit/install asymmetry that left CI blind to sideloaded primitives.
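
The consumer-side pin in item 4 can be pictured with a short sketch. It assumes the pin is written as `<algorithm>:<hex digest>`, as in the `'sha256:...'` form above; `verify_policy_pin` and `HashPinMismatch` are hypothetical names, not the project's API.

```python
import hashlib

# Collision-resistant algorithms only; md5 and sha1 are deliberately absent.
ALLOWED_ALGORITHMS = {"sha256", "sha384", "sha512"}


class HashPinMismatch(Exception):
    """Raised when the fetched policy bytes do not match the pinned digest."""


def verify_policy_pin(raw: bytes, pin: str) -> None:
    # Hypothetical helper: split "sha256:<hex>" into algorithm and digest.
    algorithm, sep, expected = pin.partition(":")
    if not sep or algorithm not in ALLOWED_ALGORITHMS:
        raise ValueError(f"unsupported hash pin: {pin!r}")
    # Hash the raw bytes as fetched, before any parsing or normalization.
    actual = hashlib.new(algorithm, raw).hexdigest()
    if actual != expected.strip().lower():
        # Always an error, independent of any fetch_failure setting.
        raise HashPinMismatch(f"expected {algorithm}:{expected}, got {algorithm}:{actual}")
```

Raising a dedicated exception type keeps a pin violation distinguishable from a fetch failure, which matches the "always fail-closed on mismatch" rule described above.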

Tests: 5068 -> 5157 (+89: hash pin 31, fetch_failure knob, audit
auto-discovery, policy status command, plus updates to existing
discovery tests for the new expected_hash kwarg threading).

Docs: policy-reference §9.5 (fetch_failure), §9.6 (hash pin),
§9.7 (apm policy status), §9.8 (audit auto-discovery); governance.md
skill mirrors all of the above; cli-commands.md gets policy status +
audit --no-policy. CHANGELOG entries under [Unreleased] Added /
Added (Security).

Co-authored-by: Copilot <[email protected]>

* docs(policy): address doc-writer review BLOCKERs (#827)

- policy-reference.md: remove stale 'planned fetch_failure knob' paragraph
  that contradicted the §9.5 entry shipped in the same PR; add Linux
  hash-compute one-liner alongside the macOS shasum example.
- cli-commands.md: add 'apm policy status' command section under a new
  'apm policy' family (synopsis, --policy-source/--no-cache/--json,
  exit-code note, examples). Add --no-policy flag to 'apm audit' options
  list. Reword --policy SOURCE description to reflect that --ci now
  auto-discovers when --policy is omitted. Update audit examples to
  match (drop the now-redundant '--policy org' from auto-discovery
  example, add explicit --no-policy variant).

Co-authored-by: Copilot <[email protected]>

* docs(policy): address doc-writer HIGH+LOW findings (#827)

- manifest-schema.md: add policy: block to schema diagram + new
  section 3.9 documenting fetch_failure_default, hash, hash_algorithm
- policy-reference.md: add fetch_failure: warn to canonical schema
  YAML and a fetch_failure entry under Top-level fields; lift apm
  policy status and apm audit --ci auto-discovery into proper
  numbered subsections (9.7 / 9.8) so anchors match the skill mirror
- governance.md: surface install-time enforcement with link to
  policy-reference#install-time-enforcement
- ci-policy-setup.md: annotate Step 3 noting apm audit --ci
  auto-discovers and --policy org is now an explicit override
- security.md: add Compromised policy intermediary row to attack
  surface comparison, linked to policy.hash consumer-side pin
- cli-commands.md: split --no-policy into 2-line nested bullet
  separating behaviour from env-var equivalence
- apm-guide skill mirror: add fetch_failure: warn to schema overview
  to keep skill aligned with policy-reference

Co-authored-by: Copilot <[email protected]>

* fix(policy): address PR review panel logging+arch findings (#827)

BLOCKING:
- command_logger.policy_discovery_miss: gate no_git_remote info
  message on verbose mode; previously emitted on every install in a
  non-git directory

Architecture:
- New install/errors.py with canonical PolicyViolationError;
  PolicyBlockError kept as re-exported alias to preserve test patches
- New policy/outcome_routing.py::route_discovery_outcome
  consolidating the 9-outcome routing table; policy_gate.py and
  install_preflight.py now delegate instead of duplicating
- pipeline.py: catch PolicyViolationError before bare Exception so
  policy block messages are not double-nested in RuntimeError
- commands/install.py: isinstance(PolicyViolationError) branch in
  the legacy handler for the same reason

Logging UX:
- install_preflight: empty check.details now falls back to
  [check.name] so the block message is never blank
- _extract_dep_ref helper replaces detail.split(":")[0] with
  defensive parsing that falls back to check.name

Security:
- discovery._get_cache_dir asserts containment vs project_root
  (resolves symlinks) instead of an unguarded join
- Removed dead no_policy= kwarg from discover_policy_with_chain;
  env-var defence-in-depth retained on the call site
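
The containment assertion in the first Security item might look something like the sketch below. `contained_cache_dir` is a hypothetical stand-in for `discovery._get_cache_dir`, and the relative cache path used in the usage note is made up for illustration.

```python
from pathlib import Path


def contained_cache_dir(project_root: Path, relative: str) -> Path:
    # Resolve symlinks on both sides before comparing, so a symlinked
    # segment cannot move the cache outside the project root.
    root = project_root.resolve()
    candidate = (root / relative).resolve()
    if candidate != root and root not in candidate.parents:
        raise ValueError(f"cache dir {candidate} escapes project root {root}")
    return candidate
```

For example, `contained_cache_dir(Path("."), "apm_modules/.cache")` returns an absolute path under the project, while `contained_cache_dir(Path("."), "../elsewhere")` raises.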

Tests: +tests/unit/policy/test_pr_832_findings.py covering all 8
  findings; install_logger split into silent/verbose cases. 5176
  unit tests pass, 0 regressions.

Co-authored-by: Copilot <[email protected]>

* test(policy): use urllib.parse for host assertions to silence CodeQL (#827)

CodeQL's py/incomplete-url-substring-sanitization rule fired 6 times
on test_extends_host_pin.py because bare 'host' in msg substring
checks could in theory match a host appearing at an arbitrary URL
position (path, query, userinfo). The assertions are correct in
practice -- they assert on production error messages of known
format -- but the pattern is not safe in general.

Replace each substring check with a precise extractor:

- _assert_extends_host_in_message / _assert_leaf_host_in_message:
  regex-anchor on the production 'extends host: <h>' / 'leaf host:
  <h>' tokens, then exact-compare the captured group.
- _assert_redirect_target_host: regex-extract the redirect target
  URL after 'to ', then urllib.parse.urlparse(...).hostname compare.
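
A rough sketch of the redirect-target extractor described above. The helper name and the exact message wording are assumptions; only the approach (regex-extract the URL after 'to ', then compare `urlparse(...).hostname`) comes from the commit message.

```python
import re
from typing import Optional
from urllib.parse import urlparse


def extract_redirect_host(message: str) -> Optional[str]:
    # Hypothetical helper: find the URL that follows "to " in the message,
    # then return only its hostname component.
    match = re.search(r"\bto (https?://\S+)", message)
    if match is None:
        return None
    return urlparse(match.group(1)).hostname
```

With a message such as `"redirect from https://github.com/a to https://evil.example/github.com/x"`, a bare `"github.com" in msg` check passes even though the redirect target is a different host, whereas the extractor returns `evil.example` for an exact comparison.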

No production-code changes; all 9 host-pin tests still pass.

Co-authored-by: Copilot <[email protected]>

* fix(policy,audit): address PR #832 DevX UX blockers

- audit --no-policy help text rewritten to describe positive
  behaviour first ("Skip org policy discovery and enforcement"
  instead of the negative "Skip auto-discovery ... in --ci mode"),
  so apm audit --help no longer hides the primary effect behind a
  caveat. Aligns the code with the docs.

- apm policy status --check flag added: exits 1 when outcome is
  not 'found' (i.e. policy unresolvable / absent / disabled /
  fetch-failed), 0 otherwise. Default behaviour unchanged (always
  exit 0) so the diagnostic remains safe for human and SIEM use,
  while CI authors get the npm audit / pip check style contract
  via a single flag.
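
The exit-code contract for `apm policy status` described above reduces to a small decision, sketched here with a hypothetical function name. The outcome strings other than 'found' are taken from the list in the commit message.

```python
def status_exit_code(outcome: str, check: bool = False) -> int:
    # Without --check the command is purely diagnostic and always exits 0.
    # With --check, anything other than a found policy exits 1.
    if check and outcome != "found":
        return 1
    return 0
```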

Updates cli-commands.md, policy-reference.md, and CHANGELOG.md to
document the new flag and exit-code table. Adds TestStatusCheckFlag
covering the found / unresolvable / discovery-exception / json
combinations.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
danielmeppiel added a commit that referenced this pull request Apr 30, 2026
…#1073)

* docs(notice): rename NOTICE.md -> NOTICE; add CLA third-party section

Two changes, one file rename:

1. Rename NOTICE.md -> NOTICE, matching the Apache / CNCF convention used
   by upstream third-party-attribution files (kubernetes-sigs/kro,
   kubernetes-sigs/headlamp, etc.). The .md extension was non-idiomatic
   for a generated legal artifact -- NOTICE files are read by tooling
   (license scanners, SBOM generators) that match on the bare filename.
   Generator (scripts/generate-notice.py), Makefile target, and the
   NOTICE Drift Check workflow are all updated to operate on the
   extension-less path.

2. Add a 'Submitted on behalf of a third-party' section to NOTICE,
   crediting five contributors whose pull requests landed before the
   microsoft-github-policy-service CLA bot recorded a signature on
   file. The repo transferred from danielmeppiel/awd-cli to the
   microsoft org; some early PRs predate CLA enforcement, and we
   could not retroactively reach all contributors. Mirrors section 7
   of common CLA texts (the wording adopted by CNCF NOTICE files).

   Driven by a new _third_party_submissions block in
   scripts/notice-metadata.yaml -- legally-significant wording stays
   alongside the per-component data, not buried in code.

   Contributors named (verified via Check Runs API against the
   microsoft-github-policy-service app, license/cla check on every
   merged PR by each suspected author):
     - @pofallon  (PR #4)
     - @richgo    (PRs #8, #25, #26, #33, #34)
     - @ryanfk    (PR #92 -- bot ran with conclusion=null,
                  output: 'Contributor License Agreement is not agreed yet.')
     - @foutoucour (PR #108)
     - @Jah-yee   (PR #184)

   Listed contributors who later sign the CLA (or who were signed
   under a different GitHub account at the time) can request removal
   via issue.

Co-authored-by: Copilot <[email protected]>

* docs(notice): trim third-party section preamble

Strip the historical/CNCF-citation paragraph and the verbatim CLA-section-7
quote. Keep only the active sentence (what the listing means + how to
request removal).

Co-authored-by: Copilot <[email protected]>

* docs(notice): address PR #1073 review

Three fixes from copilot-pull-request-reviewer:

1. Drop spurious leading '---' separator in the third-party-submissions
   renderer. render_component already ends each component with '---\n\n',
   so prepending another '---' produced two consecutive separators in
   NOTICE. Verified: separator count dropped from 17 to 16.

2. Sweep stale 'NOTICE.md' references in scripts/generate-notice.py
   (top-level docstring, Modes section, ComponentMeta and DepSpec field
   docstrings). The constant was renamed; the docs lagged.

3. Append (#1073) PR refs to both CHANGELOG entries; ASCII-correct the
   arrow ('->' instead of the Unicode arrow).

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
danielmeppiel pushed a commit that referenced this pull request May 4, 2026
CEO panel recommended landing two in-PR follow-ups before merge:

1. Recovery hint in drift output (cli-logging + devx-ux convergence):
   render_drift_text now appends '[i] Run apm install to re-sync
   deployed files with the lockfile.' so users see WHAT and HOW in one
   message. Honors Message Writing Rule #4 'Include the fix'.

2. Doc-sync (doc-writer + devx-ux convergence):
   - reference/cli-commands.md: add --no-drift to audit options table;
     amend --ci description to mention drift contribution.
   - integrations/ci-cd.md: replace bash 'git status --porcelain'
     workaround under 'Verify Deployed Primitives' with 'apm audit --ci'
     one-liner; update 'We dogfood this' callout text.
   - getting-started/quick-start.md: retarget stale cross-ref from the
     now-superseded ci-cd anchor to the new drift-detection guide.
   - guides/drift-detection.md: drop the self-contradictory case #2 in
     'When to use --no-drift' (strip-mode is auto-skipped, not opt-out).
   - CHANGELOG.md: compress verbose entry to one Keep-a-Changelog line
     pointing readers to the guide for detail.

Tracked as follow-up issues (CEO call):
- supply-chain: verify cache content matches lockfile resolved_commit
  before drift replay trusts it (commit-SHA pinning bypass on shared
  CI caches).
- test-coverage: inverse-normalization unit test asserting BOM/CRLF/
  Build-ID guards do NOT mask real content drift (safety invariant).

Lint clean. 45 drift tests pass.

Co-authored-by: Copilot <[email protected]>
danielmeppiel added a commit that referenced this pull request May 5, 2026
* feat(drift): Phase A infra - guards + diagnostic category

- Add _ReadOnlyProjectGuard context manager (utils/guards.py): snapshots
  stat of protected paths, raises ProtectedPathMutationError on any
  mutation. Defense-in-depth above the scratch-root remap.
- Add CATEGORY_DRIFT + drift() recording method to DiagnosticCollector.
- Add drift_count property and _render_drift_group renderer that groups
  by kind (modified/unintegrated/orphaned) with stable section header
  for machine consumers.
- Tests: 7 unit tests covering happy path, mutation, creation, deletion,
  missing-tolerated, exception-not-masked, single-file protected path.
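
The guard idea above (snapshot stat data, compare on exit) can be sketched as follows. This is an illustrative approximation rather than the real `_ReadOnlyProjectGuard`: `read_only_guard` is a hypothetical name, and using `(st_mtime_ns, st_size)` as the snapshot is an assumption.

```python
from contextlib import contextmanager
from pathlib import Path
from typing import Dict, Iterable, Iterator, Optional, Tuple


class ProtectedPathMutationError(RuntimeError):
    """Raised when a protected path changed while the guard was active."""


def _snapshot(roots: Iterable[Path]) -> Dict[Path, Optional[Tuple[int, int]]]:
    snap: Dict[Path, Optional[Tuple[int, int]]] = {}
    for root in roots:
        files = [p for p in root.rglob("*") if p.is_file()] if root.is_dir() else [root]
        for f in files:
            try:
                st = f.stat()
                snap[f] = (st.st_mtime_ns, st.st_size)
            except FileNotFoundError:
                snap[f] = None  # a missing path is tolerated and recorded as absent
    return snap


@contextmanager
def read_only_guard(paths: Iterable[Path]) -> Iterator[None]:
    roots = [Path(p) for p in paths]
    before = _snapshot(roots)
    # No try/finally: an exception raised in the body propagates unmasked.
    yield
    after = _snapshot(roots)
    changed = sorted(
        str(p) for p in set(before) | set(after) if before.get(p) != after.get(p)
    )
    if changed:
        raise ProtectedPathMutationError("protected paths mutated: " + ", ".join(changed))
```

Because the comparison covers the union of both snapshots, this sketch reports modification, creation, and deletion alike.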

Refs #1071. Phase A of WIP/drift/06-final-plan.md.

Co-authored-by: Copilot <[email protected]>

* feat(drift): Phase B+C - replay engine + audit CLI wiring

Implements the drift detection feature per WIP/drift/06-final-plan.md
(closes #1071; scope aligned with #898).

Engine (Phase B):
- src/apm_cli/install/drift.py: ReplayConfig, DriftFinding, CheckLogger,
  CacheMissError, normalization helpers (build-id strip, line endings,
  BOM), run_replay() (cache-only), diff_scratch_against_project(),
  text/json/sarif renderers, atexit scratch cleanup.
- src/apm_cli/install/services.py: scratch_root kwarg with
  ensure_path_within defense-in-depth guard for replay isolation.
- src/apm_cli/policy/ci_checks.py: _check_drift() wrapper returning
  (CheckResult, list[DriftFinding]); graceful CacheMissError handling.
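
The normalization helpers named above (build-id strip, line endings, BOM) could be approximated as below. The function name and, in particular, the Build ID comment format are assumptions made for illustration; the real marker syntax is not shown in this commit message.

```python
import re

UTF8_BOM = b"\xef\xbb\xbf"
# ASSUMPTION: a build id appears as an HTML comment on its own line.
_BUILD_ID_LINE = re.compile(rb"^<!-- Build ID: [^\n]*-->\n?", re.MULTILINE)


def normalize_for_diff(data: bytes) -> bytes:
    # Strip a leading UTF-8 byte order mark.
    if data.startswith(UTF8_BOM):
        data = data[len(UTF8_BOM):]
    # Treat CRLF and LF line endings as equivalent.
    data = data.replace(b"\r\n", b"\n")
    # Drop build-id lines, which differ between otherwise identical outputs.
    return _BUILD_ID_LINE.sub(b"", data)
```

Both sides of the diff would be normalized before comparison, so only these three cosmetic differences are ignored and any other byte difference still counts as drift.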

CLI surface (Phase C):
- src/apm_cli/commands/audit.py: --no-drift opt-out flag with mutex
  against --strip/--file via UsageError. Drift wired into both
  _audit_ci_gate (--ci) and _audit_content_scan (bare project audit)
  paths, default-on per ADR-02. JSON/SARIF/text renderers integrated;
  --no-drift warning gated to text mode (stdout cleanliness).

Tests:
- tests/unit/install/test_drift.py: 13 unit tests (normalization,
  diff cases, renderers).
- Legacy --ci tests opt out of drift via batch --no-drift injection
  (fixture parity, not a behavior change).

7597 unit tests pass; lint clean.

Co-authored-by: Copilot <[email protected]>

* test(drift): Phase D - integration + e2e + perf coverage (43 tests)

Implements the locked test matrix for issue #1071 drift
detection. Floor of 43 tests across three new files closes the
'ULTRA HARDENING OF HELL' coverage requirement.

New files:
- tests/integration/test_drift_check.py (32 tests):
  * Section A: 9 drift cases (modified/unintegrated/orphaned + CRLF/
    BOM/Build-ID false-positive guards)
  * Section B: 4 past-PR regressions (#1067, #882, #889, source-deleted)
  * Section C: 7 edges (no/corrupt lockfile, untracked governed,
    no-write contract, idempotency)
  * Section D: 3 multi-target (copilot/claude/cursor)
  * Section E: 9 default-on / --no-drift opt-out (mutex, stderr
    routing, JSON suppression)
- tests/integration/test_drift_check_e2e.py (10 tests):
  full install->mutate->audit loop with mix_stderr=False, air-gap
  proof, JSON/SARIF stability, 30s smoke
- tests/unit/install/test_drift_perf.py (1 test):
  100 primitives replay+diff under 5s

Engine fix surfaced by tests:
- src/apm_cli/install/drift.py: run_replay now reads apm.yml's target
  field via parse_target_field and passes it to resolve_targets.
  Without this, multi-target projects (copilot+claude+cursor) replayed
  only the auto-detected primary target, falsely reporting secondary
  target deployments as orphaned. Helper _read_apm_yml_target() added.

CI wiring:
- scripts/test-integration.sh: two new blocks in run_e2e_tests()
  invoking the integration + e2e suites before the final success log.
  Both safe to run without GITHUB_APM_PAT (cache-only, mocked network).

Verification: 56 drift-domain tests pass; full repo lint clean.

Co-authored-by: Copilot <[email protected]>

* docs(drift): CHANGELOG + Starlight guide + apm-usage skill + ci.yml note

- CHANGELOG.md: Added [Unreleased] entry under Added describing the
  default-on drift detection in apm audit, the three failure modes it
  catches, false-positive guards, --no-drift opt-out + mutex semantics,
  and the JSON/SARIF integration shape. Closes #1071, supersedes #898.
- docs/src/content/docs/guides/drift-detection.md (NEW, sidebar order 7):
  Full user-facing guide -- what drift means, how the cache-only replay
  works (with mermaid diagram), exit-code matrix, when to use --no-drift,
  output formats, and the CI single-line gate that replaces the legacy
  git status --porcelain script.
- packages/apm-guide/.apm/skills/apm-usage/commands.md: Extended the
  audit row with --no-drift flag and added a paragraph documenting the
  drift-by-default behavior, three failure modes, false-positive
  normalization, and JSON/SARIF integration. Aligns the skill that
  ships in apm-guide with the new CLI surface (per
  apm-keep-docs-up-to-date.instructions.md rule 4).
- .github/workflows/ci.yml: Annotated Gate B (legacy bash drift check)
  with a comment marking it redundant once apm-action ships a CLI with
  default-on drift detection (this PR's release). Kept as
  defense-in-depth fallback until then.

Co-authored-by: Copilot <[email protected]>

* fix(drift): address panel feedback - recovery hint + doc-sync

CEO panel recommended landing two in-PR follow-ups before merge:

1. Recovery hint in drift output (cli-logging + devx-ux convergence):
   render_drift_text now appends '[i] Run apm install to re-sync
   deployed files with the lockfile.' so users see WHAT and HOW in one
   message. Honors Message Writing Rule #4 'Include the fix'.

2. Doc-sync (doc-writer + devx-ux convergence):
   - reference/cli-commands.md: add --no-drift to audit options table;
     amend --ci description to mention drift contribution.
   - integrations/ci-cd.md: replace bash 'git status --porcelain'
     workaround under 'Verify Deployed Primitives' with 'apm audit --ci'
     one-liner; update 'We dogfood this' callout text.
   - getting-started/quick-start.md: retarget stale cross-ref from the
     now-superseded ci-cd anchor to the new drift-detection guide.
   - guides/drift-detection.md: drop the self-contradictory case #2 in
     'When to use --no-drift' (strip-mode is auto-skipped, not opt-out).
   - CHANGELOG.md: compress verbose entry to one Keep-a-Changelog line
     pointing readers to the guide for detail.

Tracked as follow-up issues (CEO call):
- supply-chain: verify cache content matches lockfile resolved_commit
  before drift replay trusts it (commit-SHA pinning bypass on shared
  CI caches).
- test-coverage: inverse-normalization unit test asserting BOM/CRLF/
  Build-ID guards do NOT mask real content drift (safety invariant).

Lint clean. 45 drift tests pass.

Co-authored-by: Copilot <[email protected]>

* fix(drift): address Copilot review - exit-code contract + types + diagnostics

Bare 'apm audit' is advisory (exit 0 on drift); 'apm audit --ci' is
the gate (exit 1). Closes the regression introduced when content-scan
escalation accidentally also escalated drift findings.

Also addresses inline review:
- A2: vacuous ASCII-encoding assertion now scopes per-line
- A4: tuple[float, int] -> tuple[int, int] in guards.py
- A5: type-annotated _check_drift signature
- A6: clarified DRIFT_ORPHANED comment
- A7: CHANGELOG references PR + closes
- A3: CacheMiss message now drift-specific (no --no-cache confusion)

Co-authored-by: Copilot <[email protected]>

* docs(drift): link drift detection guide from README security section

Per oss-growth: surfaces drift detection alongside content security
and lockfile integrity in the conversion-critical Production-grade
section, so a reader scanning for 'why APM' sees the supply-chain
story end-to-end.

Co-authored-by: Copilot <[email protected]>

* feat(drift): cache pin marker for stale-cache detection

apm install drops a .apm-pin JSON marker into each cached package
root recording the resolved_commit; apm audit verifies it before
running drift replay. Catches the 'teammate bumped lockfile, did
not reinstall' + 'shared CI runner reused stale apm_modules'
scenarios that would otherwise silently produce misleading drift
output.

LockfileBuilder syncs markers UNCONDITIONALLY (even when the
lockfile YAML is unchanged and even when no install happens), so
existing users self-heal on their next 'apm install'.

This is stale-cache detection, NOT cryptographic integrity --
defending against active cache tampering requires content-addressed
hashes, which is deferred.

Schema (v1): {schema_version: 1, resolved_commit: <sha>}
Marker file: <install_path>/.apm-pin
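
A minimal sketch of writing and checking the marker, using the v1 schema and file name given above. `write_pin` and `pin_matches` are hypothetical helper names; treating an unreadable or malformed marker as a mismatch is an assumption.

```python
import json
from pathlib import Path

MARKER_NAME = ".apm-pin"
SCHEMA_VERSION = 1


def write_pin(install_path: Path, resolved_commit: str) -> None:
    # Record which commit the cached package root was installed from.
    payload = {"schema_version": SCHEMA_VERSION, "resolved_commit": resolved_commit}
    (install_path / MARKER_NAME).write_text(json.dumps(payload), encoding="utf-8")


def pin_matches(install_path: Path, expected_commit: str) -> bool:
    # Stale-cache detection only: this is not a cryptographic integrity check.
    try:
        data = json.loads((install_path / MARKER_NAME).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False  # missing or malformed marker counts as a mismatch
    return (
        isinstance(data, dict)
        and data.get("schema_version") == SCHEMA_VERSION
        and data.get("resolved_commit") == expected_commit
    )
```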

Coverage:
- 14 unit tests in test_cache_pin.py (positive + every error path
  + skip rules + idempotent re-run + self-heal regression)
- 1 integration test in test_drift_check_e2e.py exercising the
  full install -> mark -> verify flow against a synthetic cache

Co-authored-by: Copilot <[email protected]>

* Address panel follow-ups C1-C5 on PR #1137

C1 (supply-chain): Fail closed on unpinned remote deps
- cache_pin.find_unpinned_remote_deps() helper + stderr warning in
  sync_markers_for_lockfile
- drift._materialize_install_path raises CacheMissError for remote
  deps with resolved_commit=None (was silent fail-open)
- Replaced silent-skip test with warning assertion + new helper test
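
The fail-closed rule in C1 amounts to flagging remote dependencies that lack a resolved commit. The sketch below uses a made-up `LockedDep` record and a hypothetical `unpinned_remote_names` function; the real lockfile model and the signature of `cache_pin.find_unpinned_remote_deps()` are not shown in this commit message.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass
class LockedDep:
    # Hypothetical, simplified view of one lockfile entry.
    name: str
    is_local: bool
    resolved_commit: Optional[str] = None


def unpinned_remote_names(deps: Iterable[LockedDep]) -> List[str]:
    # Local deps have no commit to pin; remote deps must carry one.
    return [d.name for d in deps if not d.is_local and not d.resolved_commit]
```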

C2 (architecture): Wire _ReadOnlyProjectGuard into run_replay
- run_replay() now wraps the deps loop with _ReadOnlyProjectGuard
  on governed root dirs + apm.lock.yaml + AGENTS.md
- Regression test: monkeypatched leaky integrator triggers
  ProtectedPathMutationError

C3 (cli-logging-ux): Stderr message on swallowed CacheMissError
- audit._audit_content_scan emits '[!] drift check could not run:
  <msg>' to stderr when drift_failed and no findings (covers cache
  miss, missing lockfile, cache-pin error)
- Integration test e10 asserts stderr message in bare-audit path

C4 (docs): Baseline-check phrasing + CHANGELOG link
- governance-guide, ci-cd, cli-commands now read '7 baseline checks
  plus integration drift detection'
- CHANGELOG drift-detection link points to docs site URL

C5 (oss-growth): User-promise framing
- CHANGELOG drift entry leads with the user promise (forgotten
  installs + hand-edits) before mechanism
- drift-detection.md gains a 'Try it now' block at the top
- Before/after CI comparison promoted to its own subsection with
  explicit framing of what the bash workaround missed

Verification: ruff check + format silent; 7621 unit tests + 27 drift
integration tests green.

Co-authored-by: Copilot <[email protected]>

* docs(changelog): trim drift entry to single 'so what?' line

Collapse the two added entries (drift + cache-pin markers) into one
short line that answers the developer 'so what?' and points to the
Drift Detection guide for the full mechanism + opt-out + cache-pin
details. Per maintainer feedback: the previous entries were too long
for a CHANGELOG.

Co-authored-by: Copilot <[email protected]>

---------

Co-authored-by: Daniel Meppiel <[email protected]>
Co-authored-by: Copilot <[email protected]>