fix(recipes): disable Dynamo ssh-keygen on Kind #670

Merged
mchmarny merged 1 commit into NVIDIA:main from yuanchen8911:fix/kind-dynamo-disable-ssh-keygen-v2
Apr 24, 2026

Conversation

@yuanchen8911

Contributor

Summary

Disable the Dynamo MPI ssh-keygen Helm hook only for the h100-kind-inference-dynamo recipe.

Motivation / Context

The Kind GPU inference CI workload does not exercise Dynamo's MPI launch path, but the upstream chart still runs a pre-install/pre-upgrade ssh-keygen hook that pulls bitnamisecure/git:latest. In run 24889208608, that hook repeatedly timed out and consumed most of the GPU job budget before the snapshot step was canceled.

Fixes: N/A
Related: https://github.com/NVIDIA/aicr/actions/runs/24889208608/job/72876842190?pr=666

Type of Change

  • Bug fix (non-breaking change that fixes an issue)
  • New feature (non-breaking change that adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation update
  • Refactoring (no functional changes)
  • Build/CI/tooling

Component(s) Affected

  • CLI (cmd/aicr, pkg/cli)
  • API server (cmd/aicrd, pkg/api, pkg/server)
  • Recipe engine / data (pkg/recipe)
  • Bundlers (pkg/bundler, pkg/component/*)
  • Collectors / snapshotter (pkg/collector, pkg/snapshotter)
  • Validator (pkg/validator)
  • Core libraries (pkg/errors, pkg/k8s)
  • Docs/examples (docs/, examples/)
  • Other: ____________

Implementation Notes

Adds an inline override under the Kind Dynamo leaf recipe:

dynamo-operator:
  dynamo:
    mpiRun:
      sshKeygen:
        enabled: false

This is intentionally scoped to h100-kind-inference-dynamo; cloud Dynamo overlays continue to use the chart default.
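The scoping can be illustrated with a minimal sketch of layering a leaf-recipe override on top of chart defaults. This is not aicr's actual merge code: `deep_merge` and `ssh_keygen_enabled` are hypothetical helpers, and the default of `enabled: True` is an assumption inferred from the hook running upstream, not a value quoted from the chart.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict with `override` layered on `base`, recursing into nested dicts."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


def ssh_keygen_enabled(values: dict) -> bool:
    """Read the flag this PR overrides from a merged values dict."""
    return values["dynamo-operator"]["dynamo"]["mpiRun"]["sshKeygen"]["enabled"]


# Assumption: the chart default leaves the hook enabled (implied by the PR,
# not copied from the chart's values file).
CHART_DEFAULTS = {
    "dynamo-operator": {"dynamo": {"mpiRun": {"sshKeygen": {"enabled": True}}}}
}

# The inline override from this PR, carried only by the Kind leaf recipe.
KIND_OVERRIDE = {
    "dynamo-operator": {"dynamo": {"mpiRun": {"sshKeygen": {"enabled": False}}}}
}

kind_values = deep_merge(CHART_DEFAULTS, KIND_OVERRIDE)  # Kind recipe
cloud_values = deep_merge(CHART_DEFAULTS, {})            # overlay with no override

print("kind:", ssh_keygen_enabled(kind_values))
print("cloud:", ssh_keygen_enabled(cloud_values))
```

Because the merge copies rather than mutates, the defaults seen by other overlays are unaffected by the Kind override.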

Testing

go test ./pkg/recipe/...
make lint-yaml
aicr query --service kind --accelerator h100 --intent inference --platform dynamo --data recipes --selector components.dynamo-platform.values.dynamo-operator.dynamo.mpiRun.sshKeygen.enabled
unset GITLAB_TOKEN && make qualify

Results:

  • go test ./pkg/recipe/... passed
  • make lint-yaml passed
  • Query returned false
  • make qualify passed, including race tests, coverage, lint, e2e, scan, and license checks
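The `--selector` check above reads a single value out of the resolved recipe by a dot-separated path. A rough sketch of that kind of lookup follows; the `select` helper and the shape of `RESOLVED_RECIPE` are assumptions inferred from the selector string, not the `aicr query` implementation.

```python
from typing import Any


def select(document: dict, selector: str) -> Any:
    """Walk `document` one dot-separated key at a time and return the value found."""
    node: Any = document
    for key in selector.split("."):
        if not isinstance(node, dict) or key not in node:
            raise KeyError(f"selector segment {key!r} not found")
        node = node[key]
    return node


# Hypothetical resolved recipe, shaped to match the selector used in the PR.
RESOLVED_RECIPE = {
    "components": {
        "dynamo-platform": {
            "values": {
                "dynamo-operator": {
                    "dynamo": {"mpiRun": {"sshKeygen": {"enabled": False}}}
                }
            }
        }
    }
}

SELECTOR = (
    "components.dynamo-platform.values."
    "dynamo-operator.dynamo.mpiRun.sshKeygen.enabled"
)

result = select(RESOLVED_RECIPE, SELECTOR)
print(result)
```

Splitting on `.` handles hyphenated keys such as `dynamo-operator`, but it would not handle keys that themselves contain dots; a real selector syntax may treat those differently.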

Risk Assessment

  • Low — Isolated change, well-tested, easy to revert
  • Medium — Touches multiple components or has broader impact
  • High — Breaking change, affects critical paths, or complex rollout

Rollout notes: N/A

Checklist

  • Tests pass locally (make test with -race)
  • Linter passes (make lint)
  • I did not skip/disable tests to make CI green
  • I added/updated tests for new functionality
  • I updated docs if user-facing behavior changed
  • Changes follow existing patterns in the codebase
  • Commits are cryptographically signed (git commit -S) — GPG signing info

@coderabbitai

coderabbitai Bot commented Apr 24, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 4b5a8db2-dafb-44c7-a577-1ec0b08945d9

📥 Commits

Reviewing files that changed from the base of the PR and between b177ded and fef6600.

📒 Files selected for processing (1)
  • recipes/overlays/h100-kind-inference-dynamo.yaml

📝 Walkthrough

The overlay recipes/overlays/h100-kind-inference-dynamo.yaml was updated to add a Helm override for dynamo-platform that disables the Dynamo operator’s MPI SSH key generation pre-install hook by setting dynamo-operator.dynamo.mpiRun.sshKeygen.enabled: false. This change prevents the chart from creating an MPI SSH secret during Kind inference CI runs.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name | Status | Explanation
Title check | ✅ Passed | The title clearly and specifically identifies the main change: disabling Dynamo ssh-keygen on Kind infrastructure.
Description check | ✅ Passed | The description is well-structured and directly related to the changeset, providing context, motivation, implementation details, testing results, and risk assessment.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.

mchmarny previously approved these changes Apr 24, 2026
yuanchen8911 force-pushed the fix/kind-dynamo-disable-ssh-keygen-v2 branch from d467146 to fef6600 April 24, 2026 15:04
mchmarny enabled auto-merge (squash) April 24, 2026 15:21
mchmarny merged commit 0b98847 into NVIDIA:main Apr 24, 2026
60 of 61 checks passed
yuanchen8911 added the run-gpu-tests label (Trigger GPU CI tests on PR) Apr 24, 2026
yuanchen8911 mentioned this pull request Apr 24, 2026
25 tasks