feat(flux): add bundle flux option#817
Conversation
|
will test with a few more receipt overlays and check if the approach for inline helm-charts (e.g. nodewright-customizations, gpu-operator-post) is working as implemented and if we need to make sure that one of the examples: https://github.com/haarchri/aicr-flux-bundle/blob/main/nodewright-customizations/templates/tuning.yaml |
This comment was marked as resolved.
This comment was marked as resolved.
|
This will close a significant gap in AICR, thanks @haarchri |
|
@mchmarny any upfront review possible ? makes it easier to finish the PR after more testing around |
68504d7 to
57ace34
Compare
There was a problem hiding this comment.
Actionable comments posted: 6
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/user/cli-reference.md`:
- Line 1372: Change the ordered-list prefix on the line starting with "2.
**Accelerated Selector Warning**:" to use "1." to comply with the configured
ordered-list style (1/1/1); update the list item that mentions
`--accelerated-node-selector` so its prefix is "1." instead of "2." ensuring
MD029 passes.
In `@pkg/bundler/deployer/flux/doc.go`:
- Line 16: Update the package comment in package flux to use the AICR
terminology instead of “Cloud Native Stack recipes”; modify the doc string at
the top of pkg/bundler/deployer/flux/doc.go so it reads something like “Package
flux provides Flux manifest generation for AICR (Application-Integration
Configuration Reconciler) behavior” (or similar concise phrasing referencing
AICR) so the package doc aligns with AICR terminology and avoids contributor
confusion.
In `@pkg/bundler/deployer/flux/flux_test.go`:
- Around line 101-105: In the loop that checks expectedFiles (using
filepath.Join to build fullPath and calling os.Stat), change the error handling
so any non-nil statErr causes the test to fail: if os.Stat returns an error,
call t.Fatalf or t.Errorf including the statErr in the message rather than only
checking os.IsNotExist; preserve the existing message for the not-exist case but
include the error details for other filesystem errors so the test cannot
silently pass on permission or I/O errors.
- Around line 35-37: Replace unbounded contexts in the TestGenerate_* tests by
creating a timeout-bounded context (e.g., ctx, cancel :=
context.WithTimeout(context.Background(), 10*time.Second)) and defer cancel()
before calling Generate; update TestGenerate_Success and all other
TestGenerate_* cases to pass this ctx into the Generate call so file I/O ops
cannot hang indefinitely.
- Around line 307-310: The test TestGenerate_ContextCancellation currently only
checks that Generate(ctx, t.TempDir()) returns some error; change the assertion
to verify it's a context cancellation by using errors.Is(err, context.Canceled).
Specifically, after calling g.Generate(ctx, t.TempDir()) update the check from
"if err == nil { t.Fatal(...)" to first assert err is non-nil then use if
!errors.Is(err, context.Canceled) { t.Fatalf("expected context.Canceled, got
%v", err) } so the test fails for unrelated errors.
In `@pkg/bundler/deployer/flux/flux.go`:
- Around line 204-211: The loop over sortedRefs that calls
generateComponentResources should check the context for cancellation to avoid
continuing heavy I/O after ctx is done; before each iteration (inside the for i,
ref := range sortedRefs loop) add a non-blocking check for ctx.Done() and return
early (e.g., return ctx.Err()) if canceled, ensuring generateComponentResources
(and any subsequent append to resources) isn't executed when the parent context
is canceled; reference the loop variables sortedRefs,
generateComponentResources, outputDir, helmSources, gitSources, and output when
locating the insertion point.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 7663d94b-b826-4125-aa89-ccae7c83e770
📒 Files selected for processing (25)
api/aicr/v1/server.yamldocs/README.mddocs/contributor/component.mddocs/user/api-reference.mddocs/user/cli-reference.mdpkg/bundler/bundler.gopkg/bundler/config/config.gopkg/bundler/config/config_test.gopkg/bundler/deployer/deployer.gopkg/bundler/deployer/flux/doc.gopkg/bundler/deployer/flux/flux.gopkg/bundler/deployer/flux/flux_test.gopkg/bundler/deployer/flux/helm.gopkg/bundler/deployer/flux/sources.gopkg/bundler/deployer/flux/templates/README.md.tmplpkg/bundler/deployer/flux/templates/chart.yaml.tmplpkg/bundler/deployer/flux/templates/gitrepo-source.yaml.tmplpkg/bundler/deployer/flux/templates/helmrelease.yaml.tmplpkg/bundler/deployer/flux/templates/helmrepo-source.yaml.tmplpkg/bundler/deployer/flux/templates/kustomization.yaml.tmplpkg/bundler/deployer/flux/testdata/helm_components/cert-manager/helmrelease.yamlpkg/bundler/deployer/flux/testdata/helm_components/gpu-operator/helmrelease.yamlpkg/bundler/deployer/flux/testdata/helm_components/kustomization.yamlpkg/cli/bundle.gotests/chainsaw/cli/bundle-flux/chainsaw-test.yaml
There was a problem hiding this comment.
♻️ Duplicate comments (1)
pkg/bundler/deployer/flux/helm.go (1)
125-133:⚠️ Potential issue | 🟠 Major | ⚡ Quick winPreserve relative manifest paths to prevent silent template overwrites.
Line 125 uses
filepath.Base(name), soa/deploy.yamlandb/deploy.yamlboth becometemplates/deploy.yaml; the later write overwrites the former and drops resources.Suggested fix
- safeName := filepath.Base(name) - filePath, joinErr := deployer.SafeJoin(templatesDir, safeName) + cleanName := filepath.Clean(name) + filePath, joinErr := deployer.SafeJoin(templatesDir, cleanName) if joinErr != nil { return joinErr } + if err := os.MkdirAll(filepath.Dir(filePath), 0750); err != nil { + return errors.Wrap(errors.ErrCodeInternal, + fmt.Sprintf("failed to create template subdirectory for %s", compName), err) + } if err := os.WriteFile(filePath, content, 0600); err != nil { return errors.Wrap(errors.ErrCodeInternal, - fmt.Sprintf("failed to write template %s for %s", safeName, compName), err) + fmt.Sprintf("failed to write template %s for %s", cleanName, compName), err) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/bundler/deployer/flux/helm.go` around lines 125 - 133, The code currently uses filepath.Base(name) to derive safeName which collapses relative paths (e.g., a/deploy.yaml and b/deploy.yaml) and causes silent overwrites when writing via deployer.SafeJoin(templatesDir, safeName); instead preserve the relative path components (sanitized) when constructing the destination path so each manifest remains unique: replace the filepath.Base(name) usage that sets safeName with logic that cleans and joins the original relative path (while preventing path traversal) before calling deployer.SafeJoin, then continue to use os.WriteFile(filePath, content, 0600) and include compName in the error wrap as before.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@pkg/bundler/deployer/flux/helm.go`:
- Around line 125-133: The code currently uses filepath.Base(name) to derive
safeName which collapses relative paths (e.g., a/deploy.yaml and b/deploy.yaml)
and causes silent overwrites when writing via deployer.SafeJoin(templatesDir,
safeName); instead preserve the relative path components (sanitized) when
constructing the destination path so each manifest remains unique: replace the
filepath.Base(name) usage that sets safeName with logic that cleans and joins
the original relative path (while preventing path traversal) before calling
deployer.SafeJoin, then continue to use os.WriteFile(filePath, content, 0600)
and include compName in the error wrap as before.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 5acce34c-90eb-412d-8a88-98bf43ac4b3d
📒 Files selected for processing (1)
pkg/bundler/deployer/flux/helm.go
|
have the 2 special cases working - compName did the trick that the inline helm charts are working: its available here: https://github.com/haarchri/aicr-flux-bundle |
|
validated also flux deployer with Generated a full flux bundle: haarchri/aicr-flux-bundle#1 |
051dc74 to
00f129e
Compare
There was a problem hiding this comment.
Solid first contribution — clean implementation, follows existing deployer patterns (compile-time deployer.Deployer check, pkg/errors codes, SafeJoin everywhere, table-driven tests), and the chainsaw coverage is thorough.
Pragmatic merge view — CodeRabbit has 10 open inline findings. Some of them are actually good ;)
My read of the must-fix-before-merge subset:
- Branding sweep — "Generated by Cloud Native Stack" / "Cloud Native Stack recipes" still in 8 files (
doc.go+ 6 templates + 3 testdata). One commit, global replace. flux.go:213ctx check in loop — project rule, one line.pkg/bundler/deployer/flux/helm.go:192basename flatten — not a bug with today's data but a real foot-gun; ~5 lines.deployer.go:23typo —argocdhelm→argocd-helm.
Nice-to-have, not blocking: test ergonomics (timeout-bounded ctx, errors.Is(err, context.Canceled), os.Stat error checks), writeTemplate using PropagateOrWrap, stronger dependsOn chainsaw assertion. Address as a follow-up if you want this in fast.
Disagreeing with CR on the ComponentTypeKustomize "major" — registry has zero Kustomize components today, so the default-reject is fine. Document the limitation in doc.go and move on.
CI: tests / CLI E2E is failing on argocd/014-prometheus-adapter/values.yaml yamllint indentation — that's in argocd output, not new flux code. Looks unrelated/possibly pre-existing flake (last on-push run on main was green though). Please rerun once you push the cleanup commit; if it persists on the next push without prometheus-adapter changes, file a separate issue and we shouldn't block this PR.
LGTM after the four must-fix items above.
0ad80a4 to
a84beec
Compare
|
wonder what i need todo because of #846 is in ... |
|
reagarding #846 have a PR for it: haarchri/aicr-flux-bundle#4 and will push the code soon |
|
@mchmarny would love to finish this PR - before i need to add more and more on top |
Just saw your latest commits. I think we are there, assuming the tests pass. |
|
@haarchri PR is ready to merge, you will need to amend your commits:
|
Signed-off-by: Christopher Haar <[email protected]>
…hat variables cons. is working Signed-off-by: Christopher Haar <[email protected]>
… flags Signed-off-by: Christopher Haar <[email protected]>
Signed-off-by: Christopher Haar <[email protected]>
Signed-off-by: Christopher Haar <[email protected]>
Head branch was pushed to by a user without write access
5d25748 to
9a88027
Compare
|
@mchmarny rebased and set correct signing |
Summary
Add a
fluxdeployer type to the bundler, generating Flux CD HelmRelease CRs, source CRs, and a root kustomization.yaml for GitOps deployment.Motivation / Context
Organizations using Flux as their GitOps controller had no native bundle output, they had to manually translate Helm bundles into Flux CRDs.
This adds first-class Flux support alongside the existing helm, argocd, and argocd-helm deployers.
Fixes: #617
Type of Change
Component(s) Affected
cmd/aicr,pkg/cli)cmd/aicrd,pkg/api,pkg/server)pkg/recipe)pkg/bundler,pkg/component/*)pkg/collector,pkg/snapshotter)pkg/validator)pkg/errors,pkg/k8s)docs/,examples/)Implementation Notes
First-time contributor, guidance on whether the direction is correct would be appreciated. Happy to adjust if something is missing or needs a different approach.
All components produce HelmRelease CRs. Components with Helm charts reference HelmRepository sources; manifest-only and mixed components are packaged as local Helm charts (Chart.yaml + templates/) referencing a GitRepository source.
This avoids the problem of raw Helm template syntax
({{ .Values }}, {{ .Release }})in plain YAML, Flux's Helm controller renders the templates natively.Testing
Live test bundle: https://github.com/haarchri/aicr-flux-bundle — generated from h100-eks-ubuntu-training overlay,
Risk Assessment
Rollout notes: New deployer type only activated when --deployer flux is explicitly specified. No impact on existing helm, argocd, or argocd-helm deployers. No migration needed.
Checklist
make testwith-race)make lint)git commit -S) — GPG signing info