feat(bundler): warn on open agentgateway inference-gateway exposure#1163
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR adds guardrails for agentgateway's inference-gateway exposure: the bundler emits a non-blocking warning during aicr bundle when agentgateway.allowedSourceRanges is empty or missing, and the conformance validator records LoadBalancer exposure and either warns (default) or fails the check when AICR_REQUIRE_SCOPED_INFERENCE_GATEWAY=true. Tests and documentation updates accompany the behavior. Estimated code review effort🎯 4 (Complex) | ⏱️ ~35 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/user/cli-reference.md`:
- Line 1201: The documentation incorrectly suggests using the `--set-json` flag
with `aicr bundle` to set `agentgateway:allowedSourceRanges`; remove the
`--set-json` remediation path from the sentence that follows the `agentgateway`
warning and instead only instruct users to scope `allowedSourceRanges` via a
recipe `componentRef` override (keep the reference to `aicr bundle`,
`agentgateway`, and `allowedSourceRanges` so readers know the target and where
to apply the override).
In `@pkg/bundler/bundler.go`:
- Around line 1066-1072: Update the warning message constructed in
warnAgentgatewayOpenExposure to remove the unsupported "--set-json" instruction
and instead direct users only to scope the inference-gateway via a recipe
componentRef override; modify the fmt.Sprintf that uses
agentgatewayComponentName and agentgatewaySourceRangesPath so it no longer
suggests "--set-json %s:%s='[\"<cidr>\"]'" and instead references scoping to
trusted networks via a recipe componentRef override (keeping
agentgatewayComponentName and agentgatewaySourceRangesPath in the message for
context).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 55c69022-c026-4ddf-8393-a9293a69ae65
📒 Files selected for processing (8)
docs/contributor/validator.mddocs/user/cli-reference.mddocs/user/component-catalog.mddocs/user/validation.mdpkg/bundler/bundler.gopkg/bundler/bundler_test.govalidators/conformance/inference_gateway_check.govalidators/conformance/inference_gateway_exposure_test.go
b7d26ec to
644c7ac
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@validators/conformance/inference_gateway_check.go`:
- Around line 171-200: assessGatewayExposure currently calls
ctx.Clientset.CoreV1().Services(...).List(ctx.Ctx, ...) with no timeout and
scans svcs.Items without checking ctx.Ctx.Done(); wrap the List call in a
context.WithTimeout using the package timeout (e.g.,
defaults.DefaultRequestTimeout), defer cancel(), and pass that timeout context
to Services(...).List; after obtaining svcs, ensure the scan loop over
svcs.Items selects on ctx.Ctx.Done() (or timeoutCtx.Done()) and returns early
with ctx.Ctx.Err() (wrapped as appropriate) if cancelled, so the operation is
bounded and respects cancellation.
In `@validators/conformance/inference_gateway_exposure_test.go`:
- Around line 127-133: The empty-env subtest is not hermetic because envTruthy
reads os.Getenv and the code only calls t.Setenv for non-empty tt.val; always
set the env var for every test case (including the empty string) so each subtest
controls AICR_TEST_ENV_TRUTHY deterministically. Locate the table-driven loop
and the subtest body that currently does `if tt.val != "" {
t.Setenv("AICR_TEST_ENV_TRUTHY", tt.val) }` and replace it by unconditionally
calling `t.Setenv("AICR_TEST_ENV_TRUTHY", tt.val)` before invoking envTruthy to
ensure isolation.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 255ba13a-46b8-4bb9-8601-67a012d854f2
📒 Files selected for processing (8)
docs/contributor/validator.mddocs/user/cli-reference.mddocs/user/component-catalog.mddocs/user/validation.mdpkg/bundler/bundler.gopkg/bundler/bundler_test.govalidators/conformance/inference_gateway_check.govalidators/conformance/inference_gateway_exposure_test.go
Always t.Setenv AICR_TEST_ENV_TRUTHY (including the empty-string case) so each subtest controls the var and can't read an ambient value. Addresses CodeRabbit review on NVIDIA#1163.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
validators/conformance/inference_gateway_exposure_test.go (1)
92-94:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winMake the test hermetic by always setting the environment variable.
When
requireScopedis false, the test does not callt.Setenv, so it may read any ambientAICR_REQUIRE_SCOPED_INFERENCE_GATEWAYset in the parent process. This is the same issue fixed inTestEnvTruthy(lines 129–131). SinceenvTruthytreats empty string as false (equivalent to unset), set the variable to""for the non-enforcing cases to ensure each subtest fully controls its environment.🔒 Proposed fix to ensure hermetic test execution
t.Run(tt.name, func(t *testing.T) { objs := make([]runtime.Object, 0, len(tt.objects)) for _, o := range tt.objects { objs = append(objs, o) } vctx := &validators.Context{ Ctx: context.Background(), Clientset: fake.NewClientset(objs...), } - if tt.requireScoped { - t.Setenv(requireScopedGatewayEnv, "true") + if tt.requireScoped { + t.Setenv(requireScopedGatewayEnv, "true") + } else { + t.Setenv(requireScopedGatewayEnv, "") } err := assessGatewayExposure(vctx)🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@validators/conformance/inference_gateway_exposure_test.go` around lines 92 - 94, The test is non-hermetic because it only sets requireScopedGatewayEnv when tt.requireScoped is true; always set the env var so the subtest can't inherit parent state: inside the subtest block that currently checks tt.requireScoped, replace the conditional with a t.Setenv(requireScopedGatewayEnv, "true") when tt.requireScoped is true and t.Setenv(requireScopedGatewayEnv, "") when false (same pattern used in TestEnvTruthy), ensuring the env var is explicitly set for both branches so envTruthy sees a controlled value.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@validators/conformance/inference_gateway_exposure_test.go`:
- Around line 92-94: The test is non-hermetic because it only sets
requireScopedGatewayEnv when tt.requireScoped is true; always set the env var so
the subtest can't inherit parent state: inside the subtest block that currently
checks tt.requireScoped, replace the conditional with a
t.Setenv(requireScopedGatewayEnv, "true") when tt.requireScoped is true and
t.Setenv(requireScopedGatewayEnv, "") when false (same pattern used in
TestEnvTruthy), ensuring the env var is explicitly set for both branches so
envTruthy sees a controlled value.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: bde8e7cb-798a-4e6f-a067-f70a9b10b9cf
📒 Files selected for processing (1)
validators/conformance/inference_gateway_exposure_test.go
Always t.Setenv AICR_TEST_ENV_TRUTHY (including the empty-string case) so each subtest controls the var and can't read an ambient value. Addresses CodeRabbit review on NVIDIA#1163.
8772e22 to
16ba61d
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@validators/conformance/inference_gateway_check.go`:
- Around line 193-225: The code only treats an empty
svc.Spec.LoadBalancerSourceRanges as "open" but must also treat an explicit
"0.0.0.0/0" entry as open; update the logic where openSvcs is populated (the
block referencing svc.Spec.LoadBalancerSourceRanges and openSvcs) to append
svc.Name when either the slice is empty OR any element equals "0.0.0.0/0" (and
optionally "::/0" if you want IPv6 coverage), keep the summary output for
sourceRanges as-is (showing "<empty>" for empty and joined values otherwise),
and ensure the existing failure path that constructs msg and checks
envTruthy(requireScopedGatewayEnv) still triggers for these newly-detected open
services so the guardrail isn't bypassed.
In `@validators/conformance/inference_gateway_exposure_test.go`:
- Around line 92-94: Tests conditionally set the environment variable using
t.Setenv only when tt.requireScoped is true, which leaves ambient
AICR_REQUIRE_SCOPED_INFERENCE_GATEWAY values affecting the false cases; make the
setup hermetic by calling t.Setenv(requireScopedGatewayEnv,
strconv.FormatBool(tt.requireScoped)) (or equivalent) for every subtest so each
case explicitly sets the env var based on tt.requireScoped; update the subtest
setup where t.Setenv and tt.requireScoped are used to always set
requireScopedGatewayEnv rather than only on true.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Enterprise
Run ID: 7f60cb5e-b7e6-4c6e-8589-bd0b8a045a77
📒 Files selected for processing (10)
docs/contributor/validator.mddocs/user/cli-reference.mddocs/user/component-catalog.mddocs/user/validation.mdpkg/bundler/bundler.gopkg/bundler/bundler_test.gopkg/validator/v1/job_plan_internal.gopkg/validator/v1/job_plan_test.govalidators/conformance/inference_gateway_check.govalidators/conformance/inference_gateway_exposure_test.go
6384207 to
a5687e5
Compare
|
Stacked on #1162 — this PR sits on top of The open-exposure remediation points users at Because this is a cross-fork PR, the base can't be retargeted to the fork branch |
55eaed1 to
25e4bfc
Compare
25e4bfc to
2f72cd6
Compare
f104f1e to
a9749fa
Compare
mchmarny
left a comment
There was a problem hiding this comment.
Solid PR. The two-layer design — bundle-time warning + conformance finding with opt-in fail-closed escalation — is the right shape for a "open by default but make sure operators see it" guardrail, and the env-var forwarding pattern correctly mirrors the existing HF_TOKEN trust boundary (with a test asserting catalog entries can't override the forwarded value, which is genuinely the right paranoia).
One medium concern: isAnySourceCIDR is duplicated byte-for-byte (including the multi-line "Limitation: range-union math" comment) between pkg/bundler/bundler.go:1201-1208 and validators/conformance/inference_gateway_check.go:279-286. Per AGENTS.md anti-pattern, this should live in a shared package. Not blocking merge — the bytes are identical today — but worth lifting out before someone tightens the /0 policy in one place and forgets the other.
A few low / nit items inline: registry-name fragility on the bundler side (validator side has TestEmbeddedCatalog_InferenceGatewayEntryExists; bundler doesn't have an analog), substring-match Service filter (consistent with existing pattern), and one asymmetry vs. validateGatewayDataPlane on how a List failure degrades — fail-closed is defensible there, just worth a comment.
All 31 CI checks green including GPU training/inference/smoke. Coverage 76.4% above floor. Docs + catalog + tests all updated together. Nothing here blocks merge.
e3f231c to
5bf8448
Compare
|
@mchmarny good review — thank you! Addressed all five threads in 5bf8448:
Rebased onto latest |
…exposure The agentgateway inference-gateway is provisioned as a public LoadBalancer open to 0.0.0.0/0 by default (NVIDIA#1138). The default is defensible, but the exposure was silent — nothing in bundle output or validation flagged that the inference endpoint is internet-facing. Add two guardrails that keep open-by-default while removing the silence. 1. Bundle-time warning (pkg/bundler): when a bundle includes agentgateway with an unscoped allowedSourceRanges, emit a non-blocking warning that the inference-gateway will be open to 0.0.0.0/0, with remediation. Mirrors the existing storageClassName PVC warning and inspects the merged component values, so it catches the open state regardless of override mechanism. 2. Conformance finding (validators/conformance): extend the existing inference-gateway check (already wired into every inference overlay) to assess the gateway's LoadBalancer Service and record its exposure as evidence — scoped source ranges, or an explicit open-to-0.0.0.0/0 finding. Open-by-default stays a non-fatal warning; AICR_REQUIRE_SCOPED_INFERENCE_GATEWAY=true escalates it to a check failure (fail-closed policy). An "unscoped" source-range list is empty OR contains an any-source CIDR (0.0.0.0/0 or ::/0) — a length-only check would let an explicit ["0.0.0.0/0"] pass enforcement. The exposure assessment filters to the inference-gateway Service by name (mirroring the EndpointSlice readiness filter) so a co-located LoadBalancer is neither mislabeled nor able to fail enforce mode. The InferenceGatewayCheckName constant is exported and locked by TestEmbeddedCatalog_InferenceGatewayEntryExists so a catalog rename can't silently no-op enforcement forwarding. Docs: component-catalog gains an "Exposure guardrails" subsection; cli-reference documents the bundle warning; validation.md and the validator env table cover the conformance finding and the enforcement env var. Closes NVIDIA#1160
5bf8448 to
05443e0
Compare
Summary
Surfaces the agentgateway
inference-gateway's open-by-default network exposure with two guardrails — a bundle-time warning and a conformance finding — without changing the intentional open-by-default behavior.Motivation / Context
The agentgateway
inference-gatewayis provisioned as a publicLoadBalancer(HTTP, unauthenticated) whoseallowedSourceRangesdefaults to empty, leaving it open to0.0.0.0/0(#1138). The default is defensible — a baked-in CIDR would firewall every downstream operator to NVIDIA's network — but the exposure was silent: nothing inaicr bundleoutput or validation flagged that the inference endpoint is internet-facing. An operator only found it by inspecting the live Service. This keeps open-by-default while removing the silence.Fixes: #1160
Related: #1161 / #1162 — now merged (
--set-json/--set-fileare the clean CLI mechanism for scopingallowedSourceRangesafter these guardrails fire). This PR is rebased on top of #1162.Type of Change
Component(s) Affected
pkg/bundler,pkg/component/*)pkg/validator,validators/conformance)docs/,examples/)Implementation Notes
1. Bundle-time warning (
pkg/bundler). When a bundle includesagentgatewaywith an emptyallowedSourceRanges,aicr bundleemits a non-blocking warning that the inference-gateway will be open to0.0.0.0/0, with remediation. It inspects the merged component values (mirroring the existingstorageClassNamePVC warning), so it catches the empty state regardless of how the value was (or wasn't) overridden. A non-list value (e.g. a bare string from a mistaken scalar--set) also triggers the warning, since that itself renders an invalid Service.2. Conformance finding (
validators/conformance). Rather than a new check requiring edits to 13 inference overlays + the catalog, this extends the existinginference-gatewayconformance check — already wired into every inference overlay. It lists theLoadBalancerService(s) inagentgateway-systemand records their exposure as evidence: the source ranges if scoped, or an explicit "open to0.0.0.0/0" finding if not. Open-by-default stays a non-fatal warning (preserving #1138); settingAICR_REQUIRE_SCOPED_INFERENCE_GATEWAY=trueon the validator environment escalates an open gateway to a check failure (fail-closed policy). This matches the issue's "warn/fail per policy."Non-goals (per the issue): not baking a CIDR into the shared overlay (would lock external operators out), and not flipping the default to internal/ClusterIP (a separate maintainer decision).
Testing
New tests:
pkg/bundler(warnAgentgatewayOpenExposure: empty / empty-list / bare-string / scoped / absent),validators/conformance(assessGatewayExposure: open→warn, open+policy→fail, scoped→pass, non-LB ignored, nil clientset;envTruthy).Verified end-to-end with a locally built binary against a generated EKS/H100/dynamo inference recipe:
make scan(grype) was not run: no new dependencies were introduced, so the vulnerability surface is unchanged.Risk Assessment
Rollout notes: Backwards compatible.
AICR_REQUIRE_SCOPED_INFERENCE_GATEWAYis opt-in and defaults to off.Checklist
make testwith-race)make lint)git commit -S)