Skip to content

feat(bundler): drop generated undeploy.sh; delegate to helm uninstall#1095

Merged
mchmarny merged 5 commits into
mainfrom
feat/671-remove-undeploy-script
May 29, 2026
Merged

feat(bundler): drop generated undeploy.sh; delegate to helm uninstall#1095
mchmarny merged 5 commits into
mainfrom
feat/671-remove-undeploy-script

Conversation

@mchmarny

Copy link
Copy Markdown
Member

Summary

Removes the generated undeploy.sh from helm bundles and delegates teardown to the deployer-native uninstall path (helm uninstall, ArgoCD app deletion). Rewrites docs/user/cli-reference.md Undeploy Script Behavior as a Bundle Uninstall section with per-deployer walkthroughs.

Motivation / Context

Per the issue, undeploy.sh accumulated complexity across #253, #282, #364, #416, #477, #561, #602, #661 chasing CRD lifecycle, finalizer resolution, and multi-tenancy safety — problems Helm itself deliberately avoids and that purpose-built deployers handle more safely. AICR's role is to generate validated design-time bundles, not to be the deployer.

The implementation matches the gating decisions made during planning:

  • Immediate removal (no one-release deprecation window)
  • No tools/ utility — verified that nothing in CI / Makefile / chainsaw / KWOK / tools/e2e actually invokes undeploy.sh, so the "minimal scope" choice maps to zero migration target
  • Per-deployer walkthroughs in the docs (helm / argocd / argocd-helm)

Fixes: #671
Related: #610, #602, #661, #669

Type of Change

  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update

Component(s) Affected

  • Bundlers (pkg/bundler, pkg/component/*)
  • Docs/examples (docs/, examples/)

Implementation Notes

Deviation from issue AC #3 (tools/ utility): The issue's acceptance criteria say "tools/ provides equivalent teardown for developer workflows," but a repo-wide sweep found that nothing in CI (.github/workflows/), Makefile, tools/ scripts, chainsaw tests, KWOK workflows, or developer docs (DEVELOPMENT.md, CONTRIBUTING.md) actually invokes undeploy.sh. The only existing chainsaw reference is a negative assertion that helmfile bundles don't leak undeploy.sh. With the "Only what CI/KWOK/Kind exercise" scope choice, the resulting tool would be empty. Skipping the tools/ utility entirely — surfacing this discrepancy here.

Chainsaw assertion: Dropped the undeploy.sh leaked into helmfile bundle check from tests/chainsaw/cli/bundle-helmfile/chainsaw-test.yaml. After this change undeploy.sh is no longer generated by any deployer, so the negative assertion is universally true and adds noise.

Golden testdata regenerated to reflect the new bundle shape — no more undeploy.sh golden files, updated README.md Uninstall section, updated deploy.sh pre-flight error hint.

Helper function cleanup: uniqueNamespaces (only used by undeploy generation) removed. reverseComponents retained — still used by README.md's per-component manual-uninstall block.

Testing

make qualify
  • go test -race ./pkg/bundler/... — all packages pass
  • golangci-lint run -c .golangci.yaml ./... — 0 issues
  • yamllint — clean on the modified chainsaw test
  • Bundle golden tests regenerated and reverified

Coverage delta (pkg/bundler/deployer/helm): 84.2% → 83.7% (-0.5%, at the threshold). The drop comes from removing the 14 TestUndeployScript_* shell-behavior tests that exercised the deleted template. Absolute level still well above the 70% project floor.

Pre-existing pkg/trust failures in the local run are network-related (sandbox can't reach tuf-repo-cdn.sigstore.dev) and reproduce on origin/main; unrelated to this PR.

Risk Assessment

  • High — Breaking change, affects critical paths, or complex rollout

Rollout notes: This is a breaking change for any user who scripted against ./undeploy.sh in a generated helm bundle. Migration:

Before After (helm)
./undeploy.sh helm uninstall <release> -n <namespace>
./undeploy.sh --delete-pvcs kubectl -n <ns> delete pvc --all
./undeploy.sh --keep-namespaces (default; Helm doesn't delete namespaces)
Before After (argocd / argocd-helm)
./undeploy.sh kubectl -n argocd delete application <parent>

Full per-deployer walkthrough (CRD handling, PVC handling, stuck-release recovery) is in docs/user/cli-reference.md#bundle-uninstall. The bundle's README.md now embeds the rendered helm uninstall commands per component and links to the doc section.

Checklist

  • Tests pass locally (make test with -race)
  • Linter passes (make lint)
  • I did not skip/disable tests to make CI green
  • I added/updated tests for new functionality (N/A — net removal; golden tests verify the absence)
  • I updated docs if user-facing behavior changed
  • Changes follow existing patterns in the codebase
  • Commits are cryptographically signed (git commit -S)

… uninstall

The helm deployer no longer emits undeploy.sh. Bundle teardown delegates to
the deployer-native uninstall path (helm uninstall, ArgoCD app deletion).

The script had accumulated complexity across multiple hardening passes
chasing CRD lifecycle, finalizer resolution, and multi-tenancy safety —
problems Helm itself deliberately avoids and that purpose-built deployers
handle more safely. AICR's role is to generate validated design-time
bundles, not to be the deployer.

Migration:

| Before                          | After (helm)                                  |
| ------------------------------- | --------------------------------------------- |
| ./undeploy.sh                   | helm uninstall <release> -n <namespace>       |
| ./undeploy.sh --delete-pvcs     | kubectl -n <ns> delete pvc --all              |
| ./undeploy.sh --keep-namespaces | (default; Helm doesn't delete namespaces)     |

| Before        | After (argocd / argocd-helm)                  |
| ------------- | --------------------------------------------- |
| ./undeploy.sh | kubectl -n argocd delete application <parent> |

See the new "Bundle Uninstall" section in docs/user/cli-reference.md for
full per-deployer walkthroughs, CRD/PVC handling, and stuck-release recovery.

Closes #671
@github-actions

Copy link
Copy Markdown
Contributor

@coderabbitai

coderabbitai Bot commented May 28, 2026

Copy link
Copy Markdown

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 4dabfce8-fd2f-4d41-acdb-2db309a5c241

📥 Commits

Reviewing files that changed from the base of the PR and between 125f49b and 665658c.

📒 Files selected for processing (8)
  • docs/user/cli-reference.md
  • pkg/bundler/deployer/helm/templates/README.md.tmpl
  • pkg/bundler/deployer/helm/testdata/kai_scheduler_present/README.md
  • pkg/bundler/deployer/helm/testdata/manifest_only/README.md
  • pkg/bundler/deployer/helm/testdata/mixed_gpu_operator/README.md
  • pkg/bundler/deployer/helm/testdata/mixed_with_pre/README.md
  • pkg/bundler/deployer/helm/testdata/nodewright_present/README.md
  • pkg/bundler/deployer/helm/testdata/upstream_helm_only/README.md
💤 Files with no reviewable changes (4)
  • pkg/bundler/deployer/helm/testdata/manifest_only/README.md
  • pkg/bundler/deployer/helm/testdata/upstream_helm_only/README.md
  • pkg/bundler/deployer/helm/testdata/nodewright_present/README.md
  • pkg/bundler/deployer/helm/testdata/kai_scheduler_present/README.md

📝 Walkthrough

Walkthrough

This PR removes generation of top-level undeploy.sh from Helm bundles, deletes the undeploy template and related undeploy tests/helpers, adds Generator.Generate logic to remove stale top-level undeploy.sh on regeneration, changes README generation to derive reverse uninstall order from emitted localformat folders, updates templates and generated testdata to instruct deployer-native uninstall flows, and adjusts cross-package docs and tests accordingly.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Suggested labels

area/cli

Suggested reviewers

  • lockwobr
  • yuanchen8911
🚥 Pre-merge checks | ✅ 3 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Linked Issues check ❓ Inconclusive The PR implements most issue #671 objectives: removes undeploy.sh from bundles, updates docs with per-deployer walkthroughs, and updates CI. However, the tools/ utility was intentionally skipped per stated deviation (no CI targets found), and the PR description notes this discrepancy. The PR deviates from issue AC #3 (tools/ utility) by omitting it after finding no CI/KWOK/chainsaw invocations of undeploy.sh. This deviation is documented but represents an incomplete implementation of the original issue scope.
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title accurately describes the main change: removal of generated undeploy.sh and delegation to helm uninstall.
Description check ✅ Passed The description clearly relates to the changeset, explaining the removal of undeploy.sh, delegation to deployer-native uninstall, and documentation updates.
Out of Scope Changes check ✅ Passed Changes are well-scoped to removing undeploy.sh, updating documentation, and regenerating golden testdata. All modifications directly support the PR's stated objectives.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch feat/671-remove-undeploy-script

Comment @coderabbitai help to get the list of available commands and usage tips.

coderabbitai[bot]

This comment was marked as resolved.

@github-actions

github-actions Bot commented May 28, 2026

Copy link
Copy Markdown
Contributor

Coverage Report ✅

Metric Value
Coverage 77.0%
Threshold 75%
Status Pass
Coverage Badge
![Coverage](https://img.shields.io/badge/coverage-77.0%25-green)

Merging this branch will decrease overall coverage

Impacted Packages Coverage Δ 🤖
github.com/NVIDIA/aicr/pkg/bundler 72.99% (ø)
github.com/NVIDIA/aicr/pkg/bundler/deployer/helm 84.04% (-0.17%) 👎
github.com/NVIDIA/aicr/pkg/bundler/deployer/helmfile 86.88% (ø)
github.com/NVIDIA/aicr/pkg/bundler/deployer/localformat 73.48% (ø)

Coverage by file

Changed files (no unit tests)

Changed File Coverage Δ Total Covered Missed 🤖
github.com/NVIDIA/aicr/pkg/bundler/deployer/helm/doc.go 0.00% (ø) 0 0 0
github.com/NVIDIA/aicr/pkg/bundler/deployer/helm/helm.go 84.04% (-0.17%) 94 (-20) 79 (-17) 15 (-3) 👎
github.com/NVIDIA/aicr/pkg/bundler/deployer/helmfile/doc.go 0.00% (ø) 0 0 0
github.com/NVIDIA/aicr/pkg/bundler/deployer/localformat/folder.go 75.00% (ø) 4 3 1
github.com/NVIDIA/aicr/pkg/bundler/deployer/localformat/local_helm.go 75.00% (ø) 60 45 15
github.com/NVIDIA/aicr/pkg/bundler/deployer/localformat/upstream_helm.go 63.64% (ø) 22 14 8
github.com/NVIDIA/aicr/pkg/bundler/deployer/localformat/vendor_folder.go 70.53% (ø) 95 67 28
github.com/NVIDIA/aicr/pkg/bundler/deployer/localformat/writer.go 73.98% (ø) 196 145 51
github.com/NVIDIA/aicr/pkg/bundler/doc.go 0.00% (ø) 0 0 0
github.com/NVIDIA/aicr/pkg/bundler/handler.go 76.92% (ø) 143 110 33

Please note that the "Total", "Covered", and "Missed" counts above refer to code statements instead of lines of code. The value in brackets refers to the test coverage of that file in the old version of the code.

Comment thread pkg/bundler/deployer/helm/templates/README.md.tmpl Outdated
Comment thread pkg/bundler/deployer/helm/helm.go
Comment thread docs/user/cli-reference.md Outdated
Comment thread docs/user/cli-reference.md Outdated
…dback

Addresses three valid review concerns from @yuanchen8911 and matching
inline feedback from coderabbit:

1. README uninstall list missed auxiliary releases. The Uninstall block in
   the generated README.md ranged over recipe components, but deploy.sh
   installs every NNN-* folder including injected *-pre / *-post
   auxiliaries. Threaded localformat.Write's folder list through to the
   README template via a new releaseRef type; the template now enumerates
   every emitted release in reverse-install order. Verified
   testdata/mixed_with_pre and testdata/mixed_gpu_operator now include the
   *-pre / *-post releases. Folder gains a Namespace field so the caller
   doesn't have to re-derive it.

2. Stale undeploy.sh survived bundle regeneration. localformat.Write only
   prunes NNN-* folders; regenerating over a pre-PR bundle left an
   executable, unchecksummed undeploy.sh contradicting the new README.
   Generate() now explicitly removes any top-level undeploy.sh up front,
   ignoring ENOENT but propagating other errors. Covered by happy-path
   and error-path tests.

3. ArgoCD uninstall doc claimed AICR sets resources-finalizer on every
   Application — false; the argocd templates set no finalizer. Rewrote
   the argocd walkthrough to recommend `argocd app delete --cascade` or a
   manual `kubectl patch` to add the finalizer before delete. Also
   corrected the argocd-helm wording: ArgoCD with a Helm source uses
   helm template + native pruning, not `helm uninstall`. Added flux
   (HelmRelease delete) and helmfile (`helmfile destroy`) subsections so
   the section covers every deployer AICR generates.

Tests:
- New TestReverseReleases replaces TestReverseComponents (dead helper
  removed alongside its consumer).
- New TestGenerate_RemovesStaleUndeployScript +
  TestGenerate_StaleUndeployRemovalErrorPropagates lock in the
  stale-removal behavior on both paths.
- Regenerated 6 golden testdata bundles for the new README shape.

Coverage on pkg/bundler/deployer/helm: 84.2% (main) -> 84.0% (-0.2%).
coderabbitai[bot]

This comment was marked as resolved.

@mchmarny mchmarny changed the title feat(bundler)\!: drop generated undeploy.sh; delegate to helm uninstall feat(bundler): drop generated undeploy.sh; delegate to helm uninstall May 29, 2026
@mchmarny mchmarny requested a review from yuanchen8911 May 29, 2026 10:48
@mchmarny mchmarny assigned mchmarny and unassigned yuanchen8911 May 29, 2026
mchmarny and others added 3 commits May 29, 2026 03:53
…nd 2

Two follow-up nits from coderabbit on commit 125f49b:

1. `argocd app delete --cascade=foreground` is not valid CLI syntax;
   `--cascade` is a boolean and foreground propagation is set via
   `--propagation-policy`. Fixed both occurrences in
   docs/user/cli-reference.md (argocd + argocd-helm walkthroughs) to
   use `--cascade --propagation-policy foreground`.

2. README.md.tmpl emitted adjacent fenced code blocks without a blank
   line between them when the bundle has multiple releases (e.g.
   `mixed_with_pre`, `mixed_gpu_operator`), violating markdownlint
   MD031. Tightened the range body's whitespace control so each
   `helm uninstall` block is separated by exactly one blank line, with
   exactly one blank line between the final block and the next
   paragraph. Regenerated all 6 golden bundles.
@mchmarny mchmarny enabled auto-merge (squash) May 29, 2026 11:16
@mchmarny mchmarny disabled auto-merge May 29, 2026 11:34
@mchmarny mchmarny merged commit 9a31cc0 into main May 29, 2026
33 checks passed
@mchmarny mchmarny deleted the feat/671-remove-undeploy-script branch May 29, 2026 11:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Move undeploy.sh to tools/ for dev use; direct users to deployer-native uninstall

3 participants