fix(bundler): disable kataSandboxDevicePlugin in gpu-operator values #1343

Merged: mchmarny merged 2 commits into main from fix/1340-disable-kata-sandbox-device-plugin on Jun 12, 2026

@atif1996

Contributor

Summary

Disable kataSandboxDevicePlugin in the base gpu-operator Helm values so the field is never rendered into ClusterPolicy.

Motivation / Context

The gpu-operator chart defaults kataSandboxDevicePlugin.enabled: true, which renders .spec.kataSandboxDevicePlugin into the ClusterPolicy CR. The v26.3.1 CRD schema does not declare that field, causing ArgoCD's structured-merge diff to fail permanently (ComparisonError / Unknown health) on clusters using that chart version. No AICR recipe uses kata containers, so the field is unused.
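For illustration, a rendered ClusterPolicy produced with the chart defaults would carry a stanza roughly like the one below. This is a sketch, not output copied from the chart: the resource name and the exact sub-fields emitted under `kataSandboxDevicePlugin` are assumptions.

```yaml
# Illustrative fragment of a rendered ClusterPolicy (chart defaults).
apiVersion: nvidia.com/v1
kind: ClusterPolicy
metadata:
  name: cluster-policy        # assumed resource name
spec:
  # Per the PR description, the v26.3.1 CRD schema does not declare this
  # field, so ArgoCD's structured-merge diff reports a ComparisonError.
  kataSandboxDevicePlugin:
    enabled: true             # assumed shape; other sub-fields omitted
```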

Fixes: #1340
Related: N/A

Type of Change

  • Bug fix (non-breaking change that fixes an issue)

Component(s) Affected

  • Recipe engine / data (pkg/recipe)

Implementation Notes

Single-line disable in recipes/components/gpu-operator/values.yaml, mirroring the existing ccManager.enabled: false precedent directly above it. Because recipes/overlays/base.yaml pins every recipe's gpu-operator to this values file as its base, the fix covers all services/accelerators/intents without per-overlay duplication.
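The change described above can be sketched as the following excerpt. It is reconstructed from the PR description rather than copied from the file, so the comment wording and the surrounding layout are assumptions.

```yaml
# recipes/components/gpu-operator/values.yaml (illustrative excerpt)

# Existing precedent named in the PR description.
ccManager:
  enabled: false

# The chart default (enabled: true) renders .spec.kataSandboxDevicePlugin
# into ClusterPolicy, which the v26.3.1 CRD schema does not declare.
# (Comment wording assumed, not quoted from the file.)
kataSandboxDevicePlugin:
  enabled: false
```

Placing the override in the shared base values file, rather than in individual overlays, is what lets a single line cover every recipe.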

BOM regeneration (make bom-docs) confirmed no image change — disabling this feature does not remove a container image from the Helm chart's rendered output.

Testing

go test -race ./recipes/... ./pkg/recipe/...
yamllint recipes/components/gpu-operator/values.yaml

All relevant tests pass.

Risk Assessment

  • Low — Isolated change, well-tested, easy to revert

Rollout notes: N/A — disabling an unused feature flag. Existing deployed ClusterPolicies are unaffected; only newly generated bundles will omit the field.

Checklist

  • Tests pass locally (make test with -race)
  • Linter passes (make lint)
  • I did not skip/disable tests to make CI green
  • I added/updated tests for new functionality
  • I updated docs if user-facing behavior changed
  • Changes follow existing patterns in the codebase
  • Commits are cryptographically signed (git commit -S)

Chart default (enabled) renders .spec.kataSandboxDevicePlugin into
ClusterPolicy. The v26.3.1 CRD schema does not declare that field, so
ArgoCD's structured-merge diff fails permanently (ComparisonError /
Unknown health). No AICR recipe uses kata containers.

Mirrors the ccManager.enabled: false precedent in the same file.

Fixes #1340
@github-actions

Contributor

Recipe evidence check

No leaf overlays affected by this PR.

This gate is warning-only and never blocks merge.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: e5a55af9-5cdb-4df9-9717-20700715a056

📥 Commits

Reviewing files that changed from the base of the PR and between 1864f05 and e65ff8c.

📒 Files selected for processing (1)
  • recipes/components/gpu-operator/values.yaml

📝 Walkthrough

This PR adds a Helm values override to the gpu-operator component configuration, setting kataSandboxDevicePlugin.enabled: false. An accompanying comment documents that the chart default (enabled) renders a ClusterPolicy field that the CRD schema does not declare. The override keeps the kata sandbox device plugin field out of the rendered manifest, which resolves the ArgoCD ComparisonError seen on EKS clusters using gpu-operator v26.3.1.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title directly and accurately describes the main change: disabling kataSandboxDevicePlugin in the gpu-operator Helm values file. |
| Description check | ✅ Passed | The description provides comprehensive context about the bug fix, root cause (ArgoCD ComparisonError due to missing CRD field), solution, testing, and risk assessment. |
| Linked Issues check | ✅ Passed | The PR fully addresses the objectives from issue #1340 by disabling kataSandboxDevicePlugin in the gpu-operator values file to prevent ArgoCD ComparisonError failures caused by the missing CRD field. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope: a single-line addition to disable kataSandboxDevicePlugin in gpu-operator values.yaml, directly addressing issue #1340 without unrelated modifications. |

@github-actions

github-actions Bot commented Jun 12, 2026

Contributor

Coverage Report ✅

| Metric | Value |
| --- | --- |
| Coverage | 77.1% |
| Threshold | 75% |
| Status | Pass |
Coverage Badge
![Coverage](https://img.shields.io/badge/coverage-77.1%25-green)

No Go source files changed in this PR.

@mchmarny mchmarny enabled auto-merge (squash) June 12, 2026 22:25
@mchmarny mchmarny merged commit e3aa6b4 into main Jun 12, 2026
117 checks passed
@mchmarny mchmarny deleted the fix/1340-disable-kata-sandbox-device-plugin branch June 12, 2026 22:38

Development

Successfully merging this pull request may close these issues.

gpu-operator ComparisonError: kataSandboxDevicePlugin field not declared in CRD schema (EKS)
