Skip to content

feat(health): chart_pinned signal + declared_coverage descriptor#1293

Merged
mchmarny merged 1 commit into
mainfrom
feat/health-chart-pinned-coverage
Jun 10, 2026
Merged

feat(health): chart_pinned signal + declared_coverage descriptor#1293
mchmarny merged 1 commit into
mainfrom
feat/health-chart-pinned-coverage

Conversation

@mchmarny

Copy link
Copy Markdown
Member

Summary

Layers the two pure-read structural signals onto the pkg/health core (ADR-009 V1), both computed only when a leaf resolves cleanly: chart_pinned (graded) and declared_coverage (descriptor).

Motivation / Context

Next step of the Recipe health tracking epic (ADR-009 V1), building directly on the merged core (#1225). These are the two signals that are pure reads of the resolved RecipeResult — no external integration, no Helm render.

Fixes: #1226
Related: #1224 (epic), #1225 (core, merged), #1227, #1228, #1229

Type of Change

  • New feature (non-breaking change that adds functionality)

Component(s) Affected

  • Recipe engine / data (pkg/recipe) — pkg/health only (reads RecipeResult)

Implementation Notes

  • chart_pinned (graded): iterate RecipeResult.ComponentRefs; every Type == ComponentTypeHelm must have a non-empty Version. Any unpinned Helm component → fail (offending names sorted into Detail), dragging the rolled-up status to fail. ADR-006 layer 1 only — the chart-version pin, not image digests; pure field check, no render. A recipe with no Helm components (pure-Kustomize/empty) scores a vacuous pass with an explanatory Detail, since Kustomize defaultTag pinning is out of scope and the column must not be misread as "supply-chain pinned". Feeds the existing generic rollup unchanged.
  • declared_coverage (descriptor, never graded): from RecipeResult.Validation.{Readiness,Deployment,Performance,Conformance}, record per phase Declared bool, sorted Checks []string, and len(Constraints). Lives in StructureHealth.Coverage (a separate field, not in the Dimensions map), so it can never move the status — a minimal recipe that drops phases is not penalized.
  • Both signals are computed only on a clean resolve (no RecipeResult to read otherwise); a failed/held combo carries neither.
  • Determinism preserved: unpinned names and Checks are sorted; pass-path Detail is a static string only in the vacuous-Kustomize case.

Testing

golangci-lint run -c .golangci.yaml ./pkg/health/...   # 0 issues
make test                                               # pass, total 76.6%
  • Unit: classifyChartPinned (all-pinned pass; unpinned fail with sorted names; pure-Kustomize vacuous pass; mixed); computeCoverage/phaseCoverage (nil → zero, sorted checks, undeclared phases not penalized).
  • Integration through the real builder: a leaf whose registry Helm component declares no default version resolves cleanly (resolves=pass) but chart_pinned=fail → status fail with the component named in Detail.
  • pkg/health coverage: 97.6% (was 95.8%; new helpers fully covered).

Risk Assessment

Rollout notes: N/A

Checklist

  • Tests pass locally (make test with -race)
  • Linter passes (make lint)
  • I did not skip/disable tests to make CI green
  • I added/updated tests for new functionality
  • I updated docs if user-facing behavior changed (N/A — no user-facing surface yet)
  • Changes follow existing patterns in the codebase
  • Commits are cryptographically signed (git commit -S)

@mchmarny mchmarny requested a review from a team as a code owner June 10, 2026 17:57
@mchmarny mchmarny added the theme/recipes Recipe expansion, overlays, mixins, and component registry label Jun 10, 2026
@mchmarny mchmarny self-assigned this Jun 10, 2026
@coderabbitai

coderabbitai Bot commented Jun 10, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 1b27f196-8344-434e-8871-ed68aad70c71

📥 Commits

Reviewing files that changed from the base of the PR and between 0dc108b and 5fb5b02.

📒 Files selected for processing (3)
  • pkg/health/doc.go
  • pkg/health/health.go
  • pkg/health/health_test.go

📝 Walkthrough

Walkthrough

This PR implements two new health signals for recipe validation: chart_pinned, a graded dimension that enforces explicit Helm chart version pins, and declared_coverage, a non-graded descriptor tracking validation phase coverage. The implementation adds a DimChartPinned constant, updates computeCombo to evaluate both signals when resolution succeeds, introduces classifyChartPinned to inspect Helm components and computeCoverage to extract validation configuration details, and expands the test suite with unit tests for individual functions and an end-to-end builder test validating integration.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

  • #1224: Implements the chart_pinned and declared_coverage signals from the ADR-009 V1 recipe health tracking epic and aligns the code with the parent epic's scope.
  • #1225: Builds on the core compute/rollup logic by adding DimChartPinned and making StructureHealth.Coverage optional and computed from resolved validation config.
  • #1228: Affects pkg/health outputs consumed by the recipe-list feature; changes to coverage shape and chart_pinned grading may be relevant to that issue's consumers.

Possibly related PRs

  • NVIDIA/aicr#1291: Related prior PR that extended pkg/health compute and rollup logic; this PR continues that work by adding chart_pinned grading and coverage computation.

Suggested labels

size/XL

Suggested reviewers

  • njhensley
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Linked Issues check ⚠️ Warning Implementation addresses #1226 requirements: chart_pinned graded dimension checks Helm ComponentRefs for non-empty Version; declared_coverage descriptor records per-phase validation data without grading; both gated on clean resolve. However, reviewer feedback identifies a blocking issue with disabled-component handling. Add missing guard: classifyChartPinned must skip components where !ref.IsEnabled() to prevent false failures on disabled unpinned Helm refs, and add corresponding unit test coverage as recommended in PR review.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed Title concisely captures the two primary additions: chart_pinned graded signal and declared_coverage descriptor, matching the main changes across pkg/health files.
Description check ✅ Passed Description comprehensively explains both signals' purposes, implementation scope, testing approach, and risk assessment, directly relating to the changeset.
Out of Scope Changes check ✅ Passed Changes are tightly scoped to pkg/health core (doc.go, health.go, health_test.go) implementing two pure-read signals on resolved RecipeResult. No unrelated modifications detected.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/health-chart-pinned-coverage

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/health/health.go`:
- Around line 118-121: The struct field Coverage is defined as a non-pointer
without omitempty so unresolved recipes emit a zero-valued coverage object;
change the field to a pointer (e.g., Coverage *DeclaredCoverage
`json:"coverage,omitempty" yaml:"coverage,omitempty"`) and update any code that
populates it (the resolver that currently computes DeclaredCoverage at
Resolve/RecipeResult creation around the logic referenced at lines 223-224) to
allocate and assign a pointer when the coverage is available (e.g., set Coverage
= &computedCoverage) and leave it nil on failure/hold so the JSON/YAML omits the
key as intended.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: ea9f9acb-d3ca-4dc7-9d56-f7c613a09ef4

📥 Commits

Reviewing files that changed from the base of the PR and between 10b08f1 and 0dc108b.

📒 Files selected for processing (3)
  • pkg/health/doc.go
  • pkg/health/health.go
  • pkg/health/health_test.go

Comment thread pkg/health/health.go Outdated
@github-actions

github-actions Bot commented Jun 10, 2026

Copy link
Copy Markdown
Contributor

Coverage Report ✅

Metric Value
Coverage 76.5%
Threshold 75%
Status Pass
Coverage Badge
![Coverage](https://img.shields.io/badge/coverage-76.5%25-green)

Merging this branch will increase overall coverage

Impacted Packages Coverage Δ 🤖
github.com/NVIDIA/aicr/pkg/health 97.59% (+1.76%) 👍

Coverage by file

Changed files (no unit tests)

Changed File Coverage Δ Total Covered Missed 🤖
github.com/NVIDIA/aicr/pkg/health/doc.go 0.00% (ø) 0 0 0
github.com/NVIDIA/aicr/pkg/health/health.go 97.59% (+1.76%) 83 (+35) 81 (+35) 2 👍

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

@njhensley

Copy link
Copy Markdown
Member

Code Review

Scope: pkg/health only · chart_pinned graded dimension + declared_coverage descriptor, both pure reads of a cleanly-resolved RecipeResult. Verified against the real recipe types (ComponentRef, RecipeResult, ValidationConfig/ValidationPhase) and the rollup. Clean, well-scoped, well-tested — one should-fix below.

🔴 Blocker

None.

🟠 Major — should fix before merge

1. classifyChartPinned does not skip disabled components — inconsistent with every other consumer, can emit a wrong verdict.
classifyChartPinned iterates all Helm refs. But RecipeResult.ComponentRefs retains components disabled via overrides.enabled: false (the builder does not filter them), and ComponentRef.IsEnabled() exists precisely to gate on this. Five other consumers all skip disabled refs before acting: pkg/bundler/bundler.go, pkg/bundler/deployer/flux/flux.go, pkg/mirror/discover.go, pkg/evidence/attestation/bom.go, validators/deployment/expected_resources.go.

A disabled Helm component left unpinned would score chart_pinned=fail and flip the whole structure rollup to fail — for a component that will never be bundled or deployed. One-line fix:

for _, ref := range result.ComponentRefs {
    if ref.Type != recipe.ComponentTypeHelm || !ref.IsEnabled() {
        continue
    }
    ...
}

Impact today is latent — no current recipe disables a component — so not a hard blocker. But health exists to catch exactly this class of mistake, and emitting a false fail is worse than silence. Add a test (disabled-unpinned Helm → not counted) alongside the fix.

🟡 Medium / Questions

2. chart_pinned checks non-empty, not exact-pin. A floating version (">= 1.0", "*") scores pass, yet a range is not "pinned" in supply-chain terms. In practice the registry uses only exact pins (v26.3.2, 0.18.3), so this is currently theoretical — but the dimension name implies a stronger guarantee than the check delivers. Confirm ADR-006 layer-1 intent is genuinely "non-empty," and add a one-line comment noting it is a presence check, not exact-pin validation.

3. Is vacuous pass the right state for "not applicable"? A pure-Kustomize leaf is arguably not applicable to chart pinning, not passing; the validator already has a skipped precedent. Overloading pass with an explanatory detail keeps the generic rollup unchanged (the stated goal) — a defensible tradeoff, but worth flagging so the choice is deliberate.

🔵 Low / Nits

4. Detail doc comment is now inaccurate. It says "Absent for clean dimensions," but the vacuous-Kustomize branch attaches a detail to a pass dimension. Update the comment.

5. Three-way distinction downstream. For chart_pinned, a matrix renderer (#1228/#1229) will see: pass with detail (vacuous/Kustomize), pass without detail (genuinely pinned), and missing key (failed/held resolve — dimension not added when err != nil). Worth a forward note so the cases render distinctly and a vacuous pass isn't read as "supply-chain pinned."

✅ Strengths (verified)

  • Both signals correctly gated behind err == nil; failed/held combos carry neither.
  • phaseCoverage copies Checks before sort.Strings rather than sorting the underlying RecipeResult slice in place — matches the repo's slice-aliasing guidance.
  • Coverage lives in its own field, never in Dimensions, so it provably cannot move the status. Determinism preserved (sorted names/checks).
  • Declared has no omitempty (verified) — "phase absent" vs "phase present but empty" stays distinguishable in output.
  • Map serialization is handled by contract: Compute's godoc mandates serializer.MarshalYAMLDeterministic; no caller added here.
  • Strong tests: table-driven units + a real-builder integration test driving resolves=pass / chart_pinned=fail / status=fail, plus an embedded-catalog assertion.

Verdict

Approve pending the disabled-component fix (#1). Everything else is a question or nit. Recommend: add the !ref.IsEnabled() guard + test, confirm ADR-006 exact-pin intent (#2), and fix the stale Detail doc comment (#4) in the same push.

Layer the two pure-read structural signals onto the pkg/health core
(ADR-009 V1), both computed only when a leaf resolves cleanly:

- chart_pinned (graded): every enabled Helm component must reference an
  explicit chart version (ADR-006 layer 1 — chart-version presence check,
  not image digests; no Helm render). An unpinned enabled Helm component
  fails and drags the rolled-up status to fail. Disabled components
  (overrides.enabled: false) are skipped — they are never bundled or
  deployed, matching every other ComponentRefs consumer. A recipe with no
  enabled Helm components scores a vacuous pass with an explanatory Detail,
  since Kustomize tag pinning is out of scope and the column must not be
  misread as supply-chain pinned.

- declared_coverage (descriptor, never graded): per validation phase,
  record whether the phase is declared, its sorted named checks, and its
  phase-level constraint count. A minimal recipe that drops phases is not
  penalized. The descriptor is a pointer, omitted on a failed/held resolve
  rather than emitting a misleading all-zero block.

chart_pinned feeds the existing generic rollup unchanged. Adds unit tests
for both signals (including disabled-component skip) plus end-to-end tests
driving a real chart_pinned fail and asserting the coverage omit/populate
contract through the builder.

Closes #1226
@mchmarny mchmarny force-pushed the feat/health-chart-pinned-coverage branch from 0cd52c7 to 5fb5b02 Compare June 10, 2026 19:01
@mchmarny

Copy link
Copy Markdown
Member Author

Thanks for the thorough review — addressed in 5fb5b024.

#1 (Major — disabled components): fixed. classifyChartPinned now skips \!ref.IsEnabled(), matching the bundler / deployers / mirror / BOM consumers, so a disabled unpinned Helm component no longer flips the rollup to fail. The vacuous-pass branch keys off the enabled Helm count too (a recipe whose only Helm component is disabled → vacuous pass). Added two unit cases: disabled unpinned helm is skipped (→ pass) and only disabled unpinned helm is vacuous pass.

#2 (Medium — non-empty vs exact-pin): confirmed + documented. ADR-006 layer 1 is "pin chart versions for every Helm component" and issue #1226 specifies the check as non-empty Version; the registry uses exact pins (v26.3.2, 0.18.3). I added a comment noting this is a presence check, not range/exact-pin validation, so the stronger guarantee the name implies isn't assumed. Tightening to reject floating ranges can follow if ADR-006 ever formalizes that.

#3 (Question — vacuous pass vs skipped): deliberate, per spec. Issue #1226 explicitly says a pure-Kustomize recipe "scores chart_pinned as a vacuous pass", and keeping it in {pass,warn,fail,unknown} avoids adding a state the generic rollup would have to special-case. Flagged the three downstream states in the function doc so a renderer can distinguish vacuous-pass from genuinely-pinned (see #5).

#4 (Nit — stale Detail doc): fixed. Updated the Detail field comment — it now notes the vacuous chart_pinned pass attaches a detail, and only a genuinely clean dimension has no entry.

#5 (Nit — three-way distinction for the renderer): noted in code. Added a paragraph to classifyChartPinned's doc spelling out the three states (pass+detail = vacuous, pass+no-detail = pinned, absent = resolve failed/held) so #1228/#1229 render them distinctly and a vacuous pass isn't read as "supply-chain pinned".

Also folded in CodeRabbit's contract fix: Coverage is now a pointer omitted on unresolved combos. make test green, pkg/health coverage 97.6%, lint clean. Rebased onto latest main (single signed commit).

@njhensley njhensley left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-reviewed at 5fb5b024 — all prior findings resolved. ✅

  • #1 (Major, disabled components): fixed — classifyChartPinned skips !ref.IsEnabled(), helmCount counts only enabled, vacuous-pass keys off enabled Helm count; two new unit cases added. Matches the bundler/deployer/mirror/BOM convention.
  • #2 / #3 / #4 / #5: confirmed + documented (presence-check comment, vacuous-pass per #1226, corrected Detail doc, three-state renderer note).
  • CodeRabbit Coverage pointer fix reviewed independently and sound: *DeclaredCoverage + omitempty, assigned only on clean resolve, distinguishing "resolved, declares nothing" from "unknown"; both ends tested.

CI fully green (Test, Lint, E2E, CLI E2E, Security Scan, CodeRabbit, Analyze); pkg/health 97.6%, project 76.5% (> 75% gate). Tightly scoped, correct, consistent with the codebase.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/recipes size/L theme/recipes Recipe expansion, overlays, mixins, and component registry

Projects

None yet

Development

Successfully merging this pull request may close these issues.

pkg/health: chart_pinned + declared_coverage signals

2 participants