feat(bundler): add helmfile deployer (#632)#899
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds a new helmfile deployer end-to-end: new DeployerHelmfile config and tests; Helmfile data model and release-building helpers; exported Generator that writes per-component chart folders, helmfile.yaml, README, optional provenance/checksums, and data files; bundler wiring and CLI/OpenAPI updates; comprehensive unit/golden tests and a Chainsaw e2e test; embedded README template and multiple test fixtures; CI/action inputs and setup steps to install/verify a pinned helmfile binary; .settings.yaml pin and checksum tooling plus a script and Renovate rule to refresh checksums. Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/contributor/api-server.md`:
- Line 26: The docs line describing value override support for the /v1/bundle
endpoint currently lists deployers as "helm, argocd-helm, flux, and helmfile"
but omits "argocd"; update the sentence referencing `/v1/bundle` and the query
params `?set=bundler:path=value` and `?dynamic=component:path` to include
"argocd" in the supported deployer list so it reads e.g. "helm, argocd,
argocd-helm, flux, and helmfile".
In `@docs/README.md`:
- Around line 86-87: Update the "Deployer Options" and "How it helps" text to
distinguish outputs per deployer: keep the list under "Deployer Options"
(`helm`, `argocd`, `argocd-helm`, `flux`, `helmfile`) but change the "How it
helps" paragraph so it states which deployers produce ready-to-run scripts or
manifests (e.g., `helm` generates a custom install.sh that pre-validates the
environment), while others (e.g., `argocd`, `flux`, `argocd-helm`) produce
Application/HelmRelease manifests and `helmfile` produces a declarative
helmfile-driven release graph — do not imply `helmfile` is script-driven and
avoid promising future features.
In `@docs/user/cli-reference.md`:
- Around line 1458-1463: The release graph example in helmfile.yaml uses the
wrong path pattern for per-release value files; update the example entries that
show values: [./NNN/values.yaml, ./NNN/cluster-values.yaml] so they match the
generated layout (use the NNN-<component> directory naming), e.g. reference
values: [./NNN-<component>/values.yaml, ./NNN-<component>/cluster-values.yaml]
so users edit the correct cluster-values.yaml per component before running
helmfile apply; adjust the example tree and any accompanying text to reflect the
NNN-<component>/ directory naming consistently.
In `@pkg/bundler/deployer/helmfile/helmfile_test.go`:
- Around line 161-163: Tests are currently matching error text substrings
instead of using errors.Is with structured error codes; update the production
errors so tests can use errors.Is: in the Generate() error paths in helmfile.go
(where the code comments reference wrapping lines 114 and 117) return pkg/errors
structured errors using errors.New or errors.Wrap that include StructuredError
with Code set to errors.ErrCodeInvalidRequest and errors.ErrCodeTimeout
respectively, and in releases.go replace fmt.Errorf calls (around the
buildHelmfile() error paths) with pkg/errors errors.Wrap or errors.New so they
produce structured errors; after that, update the tests to assert with
errors.Is(err, &errors.StructuredError{Code: errors.ErrCodeInvalidRequest}) or
errors.ErrCodeTimeout as appropriate.
In `@pkg/bundler/deployer/helmfile/releases.go`:
- Around line 79-86: The componentOverrides map duplicates special-case logic
from the helm deployer template
(pkg/bundler/deployer/helm/templates/deploy.sh.tmpl) and risks drift; replace
the hardcoded map with a single source of truth or add an automated consistency
check: either move the overrides into a shared constant/registry consumed by
both the Go code (componentOverrides / type overrides) and the template, or add
a unit/integration test that loads deploy.sh.tmpl and verifies ASYNC_COMPONENTS
/ COMPONENT_HELM_TIMEOUT entries match the componentOverrides map (e.g.,
validate "kai-scheduler" wait/timeout), ensuring changes to one place fail CI
until synchronized.
- Around line 106-165: Replace the two fmt.Errorf usages inside buildHelmfile
with the project's pkg/errors structured errors: when f.Upstream is nil
(currently "folder %s is KindUpstreamHelm but Upstream is nil") and when
encountering an unsupported folder kind (currently "unsupported folder kind %v
for %s"), construct and return a pkg/errors error value using the appropriate
error code/constructor from pkg/errors (e.g., errors.E with a suitable sentinel
like errors.Invalid or the project's defined code) and include the folder
details in the message; update the imports to use pkg/errors and remove or keep
fmt only if still needed elsewhere.
In `@tests/chainsaw/cli/bundle-helmfile/chainsaw-test.yaml`:
- Around line 129-132: The loop currently skips validation for the first release
(when idx == 1) which lets a regression add a needs: to release[0]; change the
branch so that when idx == 1 you explicitly assert the first release has no
needs entry (using the same variables used in the loop such as idx, prev, and
name) and fail the script if a needs field is present instead of just
continuing; keep the existing behavior of setting prev="$name" afterwards if the
assertion passes.
In `@tools/setup-tools`:
- Around line 410-416: Add SHA256 verification for downloaded helmfile
artifacts: extend .settings.yaml with a testing_tools.helmfile_checksums section
mirroring chainsaw_checksums, add a new helper script
tools/update-helmfile-checksums that fetches the GitHub release checksums
(modeled on tools/update-chainsaw-checksums), and register that script in
Renovate postUpgradeTasks so checksum entries auto-update on version bumps. In
the installer flow (function/section using verify_download_url and variables
HELMFILE_URL, HELMFILE_TMP, HELMFILE_TAR, HELMFILE_OS, HELMFILE_ARCH) read
HELMFILE_SHA256 from .settings.yaml
(testing_tools.helmfile_checksums.${HELMFILE_OS}_${HELMFILE_ARCH}) and call
verify_sha256 on the downloaded "${HELMFILE_TMP}/${HELMFILE_TAR}" before
extracting; keep existing cleanup and move/chmod steps unchanged.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 1fc0b716-bab1-4c80-a629-ba346b680278
📒 Files selected for processing (25)
.settings.yamlapi/aicr/v1/server.yamldocs/README.mddocs/contributor/api-server.mddocs/contributor/component.mddocs/user/api-reference.mddocs/user/cli-reference.mdpkg/bundler/bundler.gopkg/bundler/config/config.gopkg/bundler/config/config_test.gopkg/bundler/deployer/helmfile/doc.gopkg/bundler/deployer/helmfile/helmfile.gopkg/bundler/deployer/helmfile/helmfile_test.gopkg/bundler/deployer/helmfile/releases.gopkg/bundler/deployer/helmfile/templates/README.md.tmplpkg/bundler/deployer/helmfile/testdata/kai_scheduler_async/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/manifest_only/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/mixed_gpu_operator/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/upstream_helm_only/README.mdpkg/bundler/deployer/helmfile/testdata/upstream_helm_only/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/with_dynamic_values/helmfile.yamlpkg/cli/bundle.gotests/chainsaw/cli/bundle-helmfile/chainsaw-test.yamltools/check-toolstools/setup-tools
Coverage Report ✅
Coverage BadgeMerging this branch changes the coverage (1 decrease, 2 increase)
Coverage by fileChanged files (no unit tests)
Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code. |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/bundler/deployer/helmfile/helmfile.go`:
- Around line 159-161: The current error handling after buildHelmfile uses
errors.Wrap with a fixed ErrCodeInternal which can overwrite an existing
structured code returned by buildHelmfile; replace that call so it propagates an
existing structured error code or wraps uncoded errors by using
errors.PropagateOrWrap(err, errors.ErrCodeInternal, "failed to build helmfile
document") (target the error handling immediately after the buildHelmfile call
where err is checked) so existing codes like ErrCodeInvalidRequest are preserved
while uncoded errors get wrapped with ErrCodeInternal and the message.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: e154aa83-1c65-45d0-ab58-b40eda47a4bf
📒 Files selected for processing (30)
.github/actions/cli-e2e/action.yml.github/actions/load-versions/action.yml.github/actions/setup-build-tools/action.yml.github/workflows/qualification.yaml.settings.yamlapi/aicr/v1/server.yamldocs/README.mddocs/contributor/api-server.mddocs/contributor/component.mddocs/user/api-reference.mddocs/user/cli-reference.mdpkg/bundler/bundler.gopkg/bundler/bundler_test.gopkg/bundler/config/config.gopkg/bundler/config/config_test.gopkg/bundler/deployer/helmfile/doc.gopkg/bundler/deployer/helmfile/helmfile.gopkg/bundler/deployer/helmfile/helmfile_test.gopkg/bundler/deployer/helmfile/releases.gopkg/bundler/deployer/helmfile/templates/README.md.tmplpkg/bundler/deployer/helmfile/testdata/kai_scheduler_async/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/manifest_only/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/mixed_gpu_operator/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/upstream_helm_only/README.mdpkg/bundler/deployer/helmfile/testdata/upstream_helm_only/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/with_dynamic_values/helmfile.yamlpkg/cli/bundle.gotests/chainsaw/cli/bundle-helmfile/chainsaw-test.yamltools/check-toolstools/setup-tools
d6e1089 to
b7b7d85
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (2)
tools/setup-tools (1)
410-414:⚠️ Potential issue | 🟠 Major | ⚡ Quick winVerify Helmfile artifact checksum before extraction.
helmfileis downloaded and untarred without averify_sha256check, so this path is weaker than the neighboring Chainsaw installer and allows unnoticed artifact tampering.Suggested fix
if verify_download_url "$HELMFILE_URL" "helmfile ${HELMFILE_VERSION} for ${HELMFILE_OS} ${HELMFILE_ARCH}"; then HELMFILE_TMP=$(mktemp -d) curl -fsSL -o "${HELMFILE_TMP}/${HELMFILE_TAR}" "${HELMFILE_URL}" + HELMFILE_SHA256=$(yq ".testing_tools.helmfile_checksums.${HELMFILE_OS}_${HELMFILE_ARCH}" .settings.yaml) + verify_sha256 "${HELMFILE_TMP}/${HELMFILE_TAR}" "${HELMFILE_SHA256}" tar -xzf "${HELMFILE_TMP}/${HELMFILE_TAR}" -C "${HELMFILE_TMP}" sudo mv "${HELMFILE_TMP}/helmfile" /usr/local/bin/helmfile sudo chmod +x /usr/local/bin/helmfile🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tools/setup-tools` around lines 410 - 414, The Helmfile download block currently extracts the tarball without checksum verification; add a verify_sha256 step using the expected checksum variable (e.g., HELMFILE_SHA256) after downloading to HELMFILE_TMP and before tar -xzf and sudo mv. Update the block around verify_download_url / HELMFILE_TMP / HELMFILE_TAR to call verify_sha256 "${HELMFILE_TMP}/${HELMFILE_TAR}" "${HELMFILE_SHA256}" (or compute and compare sha256sum) and fail/exit if verification fails so only verified artifacts are extracted and installed.pkg/bundler/deployer/helmfile/helmfile.go (1)
159-161:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve structured error codes from
buildHelmfilefailures.Line 160 currently wraps all
buildHelmfileerrors asErrCodeInternal, which can mask existing structured codes (e.g.,ErrCodeInvalidRequest). Propagate coded errors and wrap only uncoded ones.🔧 Proposed fix
- if err != nil { - return nil, errors.Wrap(errors.ErrCodeInternal, "failed to build helmfile document", err) - } + if err != nil { + return nil, errors.PropagateOrWrap(err, errors.ErrCodeInternal, "failed to build helmfile document") + }Based on learnings: use
errors.PropagateOrWrap(err, errCode, message)so structured error codes remain intact.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/bundler/deployer/helmfile/helmfile.go` around lines 159 - 161, The current error handling after calling buildHelmfile wraps every error as errors.ErrCodeInternal, which hides any structured codes from buildHelmfile; update the return path in the function that calls buildHelmfile to use errors.PropagateOrWrap(err, errors.ErrCodeInternal, "failed to build helmfile document") so that if err already carries a code it is preserved and only uncoded errors get wrapped with ErrCodeInternal; locate the check that reads "if err != nil { return nil, errors.Wrap(...)" and replace that wrap call with errors.PropagateOrWrap as described.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@pkg/bundler/deployer/helmfile/helmfile.go`:
- Around line 159-161: The current error handling after calling buildHelmfile
wraps every error as errors.ErrCodeInternal, which hides any structured codes
from buildHelmfile; update the return path in the function that calls
buildHelmfile to use errors.PropagateOrWrap(err, errors.ErrCodeInternal, "failed
to build helmfile document") so that if err already carries a code it is
preserved and only uncoded errors get wrapped with ErrCodeInternal; locate the
check that reads "if err != nil { return nil, errors.Wrap(...)" and replace that
wrap call with errors.PropagateOrWrap as described.
In `@tools/setup-tools`:
- Around line 410-414: The Helmfile download block currently extracts the
tarball without checksum verification; add a verify_sha256 step using the
expected checksum variable (e.g., HELMFILE_SHA256) after downloading to
HELMFILE_TMP and before tar -xzf and sudo mv. Update the block around
verify_download_url / HELMFILE_TMP / HELMFILE_TAR to call verify_sha256
"${HELMFILE_TMP}/${HELMFILE_TAR}" "${HELMFILE_SHA256}" (or compute and compare
sha256sum) and fail/exit if verification fails so only verified artifacts are
extracted and installed.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 4eb198d4-0004-4811-a2a1-5aa548996b3c
📒 Files selected for processing (31)
.github/actions/cli-e2e/action.yml.github/actions/load-versions/action.yml.github/actions/setup-build-tools/action.yml.github/workflows/qualification.yaml.settings.yamlapi/aicr/v1/server.yamldocs/README.mddocs/contributor/api-server.mddocs/contributor/component.mddocs/user/api-reference.mddocs/user/cli-reference.mdpkg/bundler/bundler.gopkg/bundler/bundler_test.gopkg/bundler/config/config.gopkg/bundler/config/config_test.gopkg/bundler/deployer/helmfile/doc.gopkg/bundler/deployer/helmfile/helmfile.gopkg/bundler/deployer/helmfile/helmfile_test.gopkg/bundler/deployer/helmfile/releases.gopkg/bundler/deployer/helmfile/templates/README.md.tmplpkg/bundler/deployer/helmfile/testdata/kai_scheduler_async/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/manifest_only/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/mixed_gpu_operator/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/upstream_helm_only/README.mdpkg/bundler/deployer/helmfile/testdata/upstream_helm_only/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/with_dynamic_values/helmfile.yamlpkg/cli/bundle.gotests/chainsaw/cli/bundle-helmfile/chainsaw-test.yamltools/check-toolstools/setup-toolstools/update-helmfile-checksums
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@tools/setup-tools`:
- Around line 395-427: The script logs failures but continues when helmfile
cannot be installed; modify the branches that handle missing checksum
(HELMFILE_SHA256 null check) and the verify_download_url false branch so the
script exits non-zero after logging (e.g., call exit 1) to fail-fast; ensure the
changes are applied around the HELMFILE_SHA256 null/error handling and the else
branch after verify_download_url so that a missing checksum or failed
verification/installation does not allow the setup to continue.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: ac182feb-6a09-4605-87a0-3d609bee5b46
📒 Files selected for processing (32)
.github/actions/cli-e2e/action.yml.github/actions/load-versions/action.yml.github/actions/setup-build-tools/action.yml.github/renovate.json5.github/workflows/qualification.yaml.settings.yamlapi/aicr/v1/server.yamldocs/README.mddocs/contributor/api-server.mddocs/contributor/component.mddocs/user/api-reference.mddocs/user/cli-reference.mdpkg/bundler/bundler.gopkg/bundler/bundler_test.gopkg/bundler/config/config.gopkg/bundler/config/config_test.gopkg/bundler/deployer/helmfile/doc.gopkg/bundler/deployer/helmfile/helmfile.gopkg/bundler/deployer/helmfile/helmfile_test.gopkg/bundler/deployer/helmfile/releases.gopkg/bundler/deployer/helmfile/templates/README.md.tmplpkg/bundler/deployer/helmfile/testdata/kai_scheduler_async/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/manifest_only/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/mixed_gpu_operator/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/upstream_helm_only/README.mdpkg/bundler/deployer/helmfile/testdata/upstream_helm_only/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/with_dynamic_values/helmfile.yamlpkg/cli/bundle.gotests/chainsaw/cli/bundle-helmfile/chainsaw-test.yamltools/check-toolstools/setup-toolstools/update-helmfile-checksums
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/user/cli-reference.md (1)
1148-1155:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winUpdate the later deployer enumerations to include
helmfile.This table adds
helmfile, but the Attestation section later in the same page still says attestation works with onlyhelm,argocd,argocd-helm, andflux. That leaves the CLI reference internally contradictory about current deployer support.As per coding guidelines, user documentation should document CLI commands and user-facing features accurately.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/user/cli-reference.md` around lines 1148 - 1155, The Attestation section is out of sync with the deployer table — add "helmfile" to the list of deployers that support attestation and describe its support (e.g., that attestation works with helmfile deploys similarly to other GitOps/Helm-based deployers, and any caveats such as using per-release cluster-values.yaml or requiring the helmfile binary). Update the Attestation text to explicitly mention "helmfile" and mirror the same support statement pattern used for "helm", "argocd", "argocd-helm", and "flux" so the CLI reference is consistent.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/actions/load-versions/action.yml:
- Line 193: Update the echo that prints the helmfile output so it includes its
checksum like the architecture checksums: change the single echo of
steps.versions.outputs.helmfile to print both the version and the corresponding
checksum (e.g., use steps.versions.outputs.helmfile and
steps.versions.outputs.helmfile_checksum) so the action logs helmfile version
plus its checksum for consistency and easier troubleshooting.
In `@pkg/bundler/bundler_test.go`:
- Around line 829-832: The test currently only asserts helmfile.yaml exists; add
negative assertions to ensure deploy.sh and undeploy.sh are NOT generated by
checking os.Stat(filepath.Join(tmpDir, "deploy.sh")) and
os.Stat(filepath.Join(tmpDir, "undeploy.sh")) and failing the test with t.Errorf
if those files exist (i.e., if stat returns nil) so the non-goals are locked in;
reuse the same pattern as the existing helmfile check but invert the condition
(t.Errorf when err == nil) and include clear messages identifying deploy.sh and
undeploy.sh.
In `@pkg/bundler/deployer/helmfile/helmfile_test.go`:
- Around line 40-270: Consolidate the five nearly-identical tests
TestGenerate_UpstreamHelmOnly, TestGenerate_DynamicValues,
TestGenerate_KaiSchedulerAsync, TestGenerate_MixedGPUOperator, and
TestGenerate_ManifestOnly into a single table-driven test that iterates
scenarios and calls runScenario(t, g, goldenName, expectedFiles) for each entry;
create a slice of structs (name, gen *Generator, golden string, expected
[]string) and a t.Run subtest per entry, reusing the existing Generator
initializations from each test body and referencing runScenario and the
Generator fields (RecipeResult, ComponentValues, DynamicValues,
ComponentPostManifests, Version) to populate each case; leave all other tests
(TestGenerate_EmptyRecipe, TestGenerate_NilRecipeResult,
TestGenerate_WithChecksums, TestGenerate_WithDataFiles,
TestGenerate_ContextCanceled, TestGenerate_MissingDataFile) unchanged.
- Around line 253-270: Update TestGenerate_MissingDataFile to assert the
structured error code instead of only checking non-nil: after calling
g.Generate(...) keep the existing nil check but then use stderrors.Is(err,
errors.New(errors.ErrCodeInternal, "")) (consistent with
TestGenerate_ContextCanceled) to verify the error returned by Generator.Generate
indicates an internal missing-data-file failure; reference the test name
TestGenerate_MissingDataFile and the Generator.Generate call when making the
assertion.
---
Outside diff comments:
In `@docs/user/cli-reference.md`:
- Around line 1148-1155: The Attestation section is out of sync with the
deployer table — add "helmfile" to the list of deployers that support
attestation and describe its support (e.g., that attestation works with helmfile
deploys similarly to other GitOps/Helm-based deployers, and any caveats such as
using per-release cluster-values.yaml or requiring the helmfile binary). Update
the Attestation text to explicitly mention "helmfile" and mirror the same
support statement pattern used for "helm", "argocd", "argocd-helm", and "flux"
so the CLI reference is consistent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 01d4c143-f578-4906-8e30-35ad27a4b514
📒 Files selected for processing (32)
.github/actions/cli-e2e/action.yml.github/actions/load-versions/action.yml.github/actions/setup-build-tools/action.yml.github/renovate.json5.github/workflows/qualification.yaml.settings.yamlapi/aicr/v1/server.yamldocs/README.mddocs/contributor/api-server.mddocs/contributor/component.mddocs/user/api-reference.mddocs/user/cli-reference.mdpkg/bundler/bundler.gopkg/bundler/bundler_test.gopkg/bundler/config/config.gopkg/bundler/config/config_test.gopkg/bundler/deployer/helmfile/doc.gopkg/bundler/deployer/helmfile/helmfile.gopkg/bundler/deployer/helmfile/helmfile_test.gopkg/bundler/deployer/helmfile/releases.gopkg/bundler/deployer/helmfile/templates/README.md.tmplpkg/bundler/deployer/helmfile/testdata/kai_scheduler_async/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/manifest_only/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/mixed_gpu_operator/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/upstream_helm_only/README.mdpkg/bundler/deployer/helmfile/testdata/upstream_helm_only/helmfile.yamlpkg/bundler/deployer/helmfile/testdata/with_dynamic_values/helmfile.yamlpkg/cli/bundle.gotests/chainsaw/cli/bundle-helmfile/chainsaw-test.yamltools/check-toolstools/setup-toolstools/update-helmfile-checksums
mchmarny
left a comment
There was a problem hiding this comment.
Solid additive change — the helmfile deployer slots into the existing localformat.Write() infrastructure cleanly, reuses SortComponentRefsByDeploymentOrder, and the golden-file scenarios cover the four shapes that actually matter (upstream-only, dynamic-values, kai-scheduler async, mixed + manifest-only). The TestComponentOverrides_ParityWithHelmDeployScript regression guard is the right call — it locks the helmfile and helm deployers together until the special cases move into the recipe schema.
CI is green where it has reported (Lint, Test, E2E, CLI E2E, GPU smoke, GPU inference, analyze, malware/grype/ClamAV); the GPU Training Test was still in progress at review time. CodeQL is neutral — confirm that's the expected outcome and not a regressed config.
Feedback is all low/nit-grade — nothing blocks merge:
- The PR description's "parallel installs where
needs:allows" claim doesn't match the implementation (every release chains to its predecessor, no fan-out). valuesFilesForFolderreferencescluster-values.yamlfor injected-pre/-postfolders too — works, but worth a code comment.- Chainsaw cleanup is attached only to the last step.
- One brittle
count >= 2assertion and a couple of0755vs0o755style nits.
Nice work overall.
Summary
Adds
--deployer helmfilealongsidehelm/argocd/argocd-helm/flux. The bundler emits a declarativehelmfile.yamlrelease graph + per-component chart dirs (via the sharedlocalformatwriter); operators drive rollouts with the upstreamhelmfileCLI (apply/diff/destroy). No bash wrapper, no pre-flight, no scripted post-install — AICR's charter is to generate validated configuration, not orchestrate deployments.Motivation / Context
deploy.sh.tmpl(~450 lines) andundeploy.sh.tmpl(~700 lines) have accreted finalizer-handling, stale-hook cleanup, webhook-orphan detection, and per-operator post-install logic. Seven PRs onundeploy.sh.tmplalone have chased finalizer edge cases (#253, #282, #364, #416, #477, #561, #602). That's orchestration logic, and it belongs in a tool purpose-built or it — not in a bash template AICR maintains.This change is additive: the
helmdeployer stays in place; the two coexist indefinitely. If the helmfile path is well received it opens a future conversation about simplifying the bash deployer, but that's a separate decision for a later issue.Beyond scope alignment, users pick up
helmfile diff(no AICR deployer offers preview today), idempotent apply/destroy, a reviewable release graph (needs:in YAML instead of bash iteration order).closes #632
Type of Change
Component(s) Affected
cmd/aicr,pkg/cli)cmd/aicrd,pkg/api,pkg/server)pkg/recipe)pkg/bundler,pkg/component/*)pkg/collector,pkg/snapshotter)pkg/validator)pkg/errors,pkg/k8s)docs/,examples/)Implementation Notes
Mapping from AICR concepts to helmfile primitives:
DeploymentOrderrelease.needs(immediate predecessor)repositories[], deduped by URLDynamicValues[name]cluster-values.yamlappended to releasevalues:HELM_TIMEOUT="10m"helmDefaults.timeout: 600ASYNC_COMPONENTS(kai-scheduler)wait: falsekai-scheduler → 20mtimeout: 1200chart: ./NNN-name(localformat wraps asKindLocalHelm)VendorCharts./NNN-name/charts/; release stillchart: ./NNN-nameReuses existing infrastructure:
localformat.Write()emits the uniformNNN-<name>/per-component layout. The helmfile deployer doesn't re-classify components; it walks the returnedFolderlist and emits one release per folder.deployer.SortComponentRefsByDeploymentOrderfor ordering.|
ASYNC_COMPONENTS(kai-scheduler) | per-releasewait: false||
kai-scheduler → 20m| per-releasetimeout: 1200|Non-goals (per #632):
deploy.sh/undeploy.shemitted alongsidehelmfile.yaml.prepare:/presync:/postsync:/postuninstall:hooks emitted.undeploy.shCRD/finalizer/webhook cleanup —helmfile destroystays the simple graceful-uninstall primitive.Testing
# Commands run (prefer `make qualify` for non-trivial changes) make qualifyRisk Assessment
Rollout notes:
Checklist
make testwith-race)make lint)git commit -S) — GPG signing info