Skip to content

feat(bundler): warn on open agentgateway inference-gateway exposure#1163

Merged
yuanchen8911 merged 3 commits into
NVIDIA:mainfrom
yuanchen8911:feat/agentgateway-guardrails
Jun 4, 2026
Merged

feat(bundler): warn on open agentgateway inference-gateway exposure#1163
yuanchen8911 merged 3 commits into
NVIDIA:mainfrom
yuanchen8911:feat/agentgateway-guardrails

Conversation

@yuanchen8911

@yuanchen8911 yuanchen8911 commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

Summary

Surfaces the agentgateway inference-gateway's open-by-default network exposure with two guardrails — a bundle-time warning and a conformance finding — without changing the intentional open-by-default behavior.

Motivation / Context

The agentgateway inference-gateway is provisioned as a public LoadBalancer (HTTP, unauthenticated) whose allowedSourceRanges defaults to empty, leaving it open to 0.0.0.0/0 (#1138). The default is defensible — a baked-in CIDR would firewall every downstream operator to NVIDIA's network — but the exposure was silent: nothing in aicr bundle output or validation flagged that the inference endpoint is internet-facing. An operator only found it by inspecting the live Service. This keeps open-by-default while removing the silence.

Fixes: #1160
Related: #1161 / #1162 — now merged (--set-json / --set-file are the clean CLI mechanism for scoping allowedSourceRanges after these guardrails fire). This PR is rebased on top of #1162.

Type of Change

  • New feature (non-breaking change that adds functionality)
  • Documentation update

Component(s) Affected

  • Bundlers (pkg/bundler, pkg/component/*)
  • Validator (pkg/validator, validators/conformance)
  • Docs/examples (docs/, examples/)

Implementation Notes

1. Bundle-time warning (pkg/bundler). When a bundle includes agentgateway with an empty allowedSourceRanges, aicr bundle emits a non-blocking warning that the inference-gateway will be open to 0.0.0.0/0, with remediation. It inspects the merged component values (mirroring the existing storageClassName PVC warning), so it catches the empty state regardless of how the value was (or wasn't) overridden. A non-list value (e.g. a bare string from a mistaken scalar --set) also triggers the warning, since that itself renders an invalid Service.

2. Conformance finding (validators/conformance). Rather than a new check requiring edits to 13 inference overlays + the catalog, this extends the existing inference-gateway conformance check — already wired into every inference overlay. It lists the LoadBalancer Service(s) in agentgateway-system and records their exposure as evidence: the source ranges if scoped, or an explicit "open to 0.0.0.0/0" finding if not. Open-by-default stays a non-fatal warning (preserving #1138); setting AICR_REQUIRE_SCOPED_INFERENCE_GATEWAY=true on the validator environment escalates an open gateway to a check failure (fail-closed policy). This matches the issue's "warn/fail per policy."

Non-goals (per the issue): not baking a CIDR into the shared overlay (would lock external operators out), and not flipping the default to internal/ClusterIP (a separate maintainer decision).

Testing

make test-coverage   # 76.4% total (threshold 75%)
make lint            # golangci-lint 0 issues, license headers, MDX-safe docs, agents-sync
make e2e             # 22/22 chainsaw tests pass (incl. cli-bundle-agentgateway-templates)

New tests: pkg/bundler (warnAgentgatewayOpenExposure: empty / empty-list / bare-string / scoped / absent), validators/conformance (assessGatewayExposure: open→warn, open+policy→fail, scoped→pass, non-LB ignored, nil clientset; envTruthy).

Verified end-to-end with a locally built binary against a generated EKS/H100/dynamo inference recipe:

$ aicr bundle -r recipe.yaml -o ./bundle
  ⚠ Warning: agentgateway: inference-gateway will be provisioned as an internet-facing
    LoadBalancer open to 0.0.0.0/0 (allowedSourceRanges is empty). Scope it to trusted
    networks via a recipe componentRef override or --set-json agentgateway:allowedSourceRanges='["<cidr>"]'.

make scan (grype) was not run: no new dependencies were introduced, so the vulnerability surface is unchanged.

Risk Assessment

  • Low — Additive. The bundle warning is non-blocking; the conformance finding defaults to non-fatal (no change to existing pass/fail behavior unless the new opt-in env var is set). Open-by-default is preserved.

Rollout notes: Backwards compatible. AICR_REQUIRE_SCOPED_INFERENCE_GATEWAY is opt-in and defaults to off.

Checklist

  • Tests pass locally (make test with -race)
  • Linter passes (make lint)
  • I did not skip/disable tests to make CI green
  • I added/updated tests for new functionality
  • I updated docs if user-facing behavior changed
  • Changes follow existing patterns in the codebase
  • Commits are cryptographically signed (git commit -S)

@github-actions

github-actions Bot commented Jun 2, 2026

Copy link
Copy Markdown
Contributor

@yuanchen8911 yuanchen8911 marked this pull request as draft June 3, 2026 00:00
@yuanchen8911 yuanchen8911 changed the title feat(bundler,validator): surface agentgateway open inference-gateway exposure WIP: feat(bundler,validator): surface agentgateway open inference-gateway exposure Jun 3, 2026
@coderabbitai

coderabbitai Bot commented Jun 3, 2026

Copy link
Copy Markdown

Review Change Stack

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR adds guardrails for agentgateway's inference-gateway exposure: the bundler emits a non-blocking warning during aicr bundle when agentgateway.allowedSourceRanges is empty or missing, and the conformance validator records LoadBalancer exposure and either warns (default) or fails the check when AICR_REQUIRE_SCOPED_INFERENCE_GATEWAY=true. Tests and documentation updates accompany the behavior.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~35 minutes

Possibly related PRs

  • NVIDIA/aicr#1138: Introduces agentgateway.allowedSourceRanges parameter that maps to spec.loadBalancerSourceRanges, which this PR monitors and warns/fails on when empty.

Suggested labels

area/tests, documentation

Suggested reviewers

  • njhensley
  • mchmarny
🚥 Pre-merge checks | ✅ 4
✅ Passed checks (4 passed)
Check name Status Explanation
Linked Issues check ✅ Passed The PR fully addresses both objectives from #1160: implements bundle-time warning for empty allowedSourceRanges and conformance finding with policy-driven escalation via AICR_REQUIRE_SCOPED_INFERENCE_GATEWAY.
Out of Scope Changes check ✅ Passed All changes are directly scoped to implementing the two guardrails (bundle warning and conformance check) and updating related documentation; no out-of-scope modifications detected.
Description check ✅ Passed The pull request description is directly related to the changeset, clearly describing the two guardrails (bundle-time warning and conformance finding) for surfacing agentgateway inference-gateway network exposure.
Title check ✅ Passed The title accurately describes the main change: adding a warning when agentgateway's inference-gateway exposure is open/unscoped during bundle creation.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/user/cli-reference.md`:
- Line 1201: The documentation incorrectly suggests using the `--set-json` flag
with `aicr bundle` to set `agentgateway:allowedSourceRanges`; remove the
`--set-json` remediation path from the sentence that follows the `agentgateway`
warning and instead only instruct users to scope `allowedSourceRanges` via a
recipe `componentRef` override (keep the reference to `aicr bundle`,
`agentgateway`, and `allowedSourceRanges` so readers know the target and where
to apply the override).

In `@pkg/bundler/bundler.go`:
- Around line 1066-1072: Update the warning message constructed in
warnAgentgatewayOpenExposure to remove the unsupported "--set-json" instruction
and instead direct users only to scope the inference-gateway via a recipe
componentRef override; modify the fmt.Sprintf that uses
agentgatewayComponentName and agentgatewaySourceRangesPath so it no longer
suggests "--set-json %s:%s='[\"<cidr>\"]'" and instead references scoping to
trusted networks via a recipe componentRef override (keeping
agentgatewayComponentName and agentgatewaySourceRangesPath in the message for
context).
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 55c69022-c026-4ddf-8393-a9293a69ae65

📥 Commits

Reviewing files that changed from the base of the PR and between 4d5ac90 and b7d26ec.

📒 Files selected for processing (8)
  • docs/contributor/validator.md
  • docs/user/cli-reference.md
  • docs/user/component-catalog.md
  • docs/user/validation.md
  • pkg/bundler/bundler.go
  • pkg/bundler/bundler_test.go
  • validators/conformance/inference_gateway_check.go
  • validators/conformance/inference_gateway_exposure_test.go

Comment thread docs/user/cli-reference.md Outdated
Comment thread pkg/bundler/bundler.go

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@validators/conformance/inference_gateway_check.go`:
- Around line 171-200: assessGatewayExposure currently calls
ctx.Clientset.CoreV1().Services(...).List(ctx.Ctx, ...) with no timeout and
scans svcs.Items without checking ctx.Ctx.Done(); wrap the List call in a
context.WithTimeout using the package timeout (e.g.,
defaults.DefaultRequestTimeout), defer cancel(), and pass that timeout context
to Services(...).List; after obtaining svcs, ensure the scan loop over
svcs.Items selects on ctx.Ctx.Done() (or timeoutCtx.Done()) and returns early
with ctx.Ctx.Err() (wrapped as appropriate) if cancelled, so the operation is
bounded and respects cancellation.

In `@validators/conformance/inference_gateway_exposure_test.go`:
- Around line 127-133: The empty-env subtest is not hermetic because envTruthy
reads os.Getenv and the code only calls t.Setenv for non-empty tt.val; always
set the env var for every test case (including the empty string) so each subtest
controls AICR_TEST_ENV_TRUTHY deterministically. Locate the table-driven loop
and the subtest body that currently does `if tt.val != "" {
t.Setenv("AICR_TEST_ENV_TRUTHY", tt.val) }` and replace it by unconditionally
calling `t.Setenv("AICR_TEST_ENV_TRUTHY", tt.val)` before invoking envTruthy to
ensure isolation.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 255ba13a-46b8-4bb9-8601-67a012d854f2

📥 Commits

Reviewing files that changed from the base of the PR and between b7d26ec and 644c7ac.

📒 Files selected for processing (8)
  • docs/contributor/validator.md
  • docs/user/cli-reference.md
  • docs/user/component-catalog.md
  • docs/user/validation.md
  • pkg/bundler/bundler.go
  • pkg/bundler/bundler_test.go
  • validators/conformance/inference_gateway_check.go
  • validators/conformance/inference_gateway_exposure_test.go

Comment thread validators/conformance/inference_gateway_check.go
Comment thread validators/conformance/inference_gateway_exposure_test.go
yuanchen8911 added a commit to yuanchen8911/aicr that referenced this pull request Jun 3, 2026
Always t.Setenv AICR_TEST_ENV_TRUTHY (including the empty-string case) so each
subtest controls the var and can't read an ambient value. Addresses CodeRabbit
review on NVIDIA#1163.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
validators/conformance/inference_gateway_exposure_test.go (1)

92-94: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Make the test hermetic by always setting the environment variable.

When requireScoped is false, the test does not call t.Setenv, so it may read any ambient AICR_REQUIRE_SCOPED_INFERENCE_GATEWAY set in the parent process. This is the same issue fixed in TestEnvTruthy (lines 129–131). Since envTruthy treats empty string as false (equivalent to unset), set the variable to "" for the non-enforcing cases to ensure each subtest fully controls its environment.

🔒 Proposed fix to ensure hermetic test execution
 		t.Run(tt.name, func(t *testing.T) {
 			objs := make([]runtime.Object, 0, len(tt.objects))
 			for _, o := range tt.objects {
 				objs = append(objs, o)
 			}
 			vctx := &validators.Context{
 				Ctx:       context.Background(),
 				Clientset: fake.NewClientset(objs...),
 			}
-			if tt.requireScoped {
-				t.Setenv(requireScopedGatewayEnv, "true")
+			if tt.requireScoped {
+				t.Setenv(requireScopedGatewayEnv, "true")
+			} else {
+				t.Setenv(requireScopedGatewayEnv, "")
 			}

 			err := assessGatewayExposure(vctx)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@validators/conformance/inference_gateway_exposure_test.go` around lines 92 -
94, The test is non-hermetic because it only sets requireScopedGatewayEnv when
tt.requireScoped is true; always set the env var so the subtest can't inherit
parent state: inside the subtest block that currently checks tt.requireScoped,
replace the conditional with a t.Setenv(requireScopedGatewayEnv, "true") when
tt.requireScoped is true and t.Setenv(requireScopedGatewayEnv, "") when false
(same pattern used in TestEnvTruthy), ensuring the env var is explicitly set for
both branches so envTruthy sees a controlled value.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@validators/conformance/inference_gateway_exposure_test.go`:
- Around line 92-94: The test is non-hermetic because it only sets
requireScopedGatewayEnv when tt.requireScoped is true; always set the env var so
the subtest can't inherit parent state: inside the subtest block that currently
checks tt.requireScoped, replace the conditional with a
t.Setenv(requireScopedGatewayEnv, "true") when tt.requireScoped is true and
t.Setenv(requireScopedGatewayEnv, "") when false (same pattern used in
TestEnvTruthy), ensuring the env var is explicitly set for both branches so
envTruthy sees a controlled value.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: bde8e7cb-798a-4e6f-a067-f70a9b10b9cf

📥 Commits

Reviewing files that changed from the base of the PR and between 644c7ac and 8772e22.

📒 Files selected for processing (1)
  • validators/conformance/inference_gateway_exposure_test.go

yuanchen8911 added a commit to yuanchen8911/aicr that referenced this pull request Jun 3, 2026
Always t.Setenv AICR_TEST_ENV_TRUTHY (including the empty-string case) so each
subtest controls the var and can't read an ambient value. Addresses CodeRabbit
review on NVIDIA#1163.
@yuanchen8911 yuanchen8911 force-pushed the feat/agentgateway-guardrails branch from 8772e22 to 16ba61d Compare June 3, 2026 03:28

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@validators/conformance/inference_gateway_check.go`:
- Around line 193-225: The code only treats an empty
svc.Spec.LoadBalancerSourceRanges as "open" but must also treat an explicit
"0.0.0.0/0" entry as open; update the logic where openSvcs is populated (the
block referencing svc.Spec.LoadBalancerSourceRanges and openSvcs) to append
svc.Name when either the slice is empty OR any element equals "0.0.0.0/0" (and
optionally "::/0" if you want IPv6 coverage), keep the summary output for
sourceRanges as-is (showing "<empty>" for empty and joined values otherwise),
and ensure the existing failure path that constructs msg and checks
envTruthy(requireScopedGatewayEnv) still triggers for these newly-detected open
services so the guardrail isn't bypassed.

In `@validators/conformance/inference_gateway_exposure_test.go`:
- Around line 92-94: Tests conditionally set the environment variable using
t.Setenv only when tt.requireScoped is true, which leaves ambient
AICR_REQUIRE_SCOPED_INFERENCE_GATEWAY values affecting the false cases; make the
setup hermetic by calling t.Setenv(requireScopedGatewayEnv,
strconv.FormatBool(tt.requireScoped)) (or equivalent) for every subtest so each
case explicitly sets the env var based on tt.requireScoped; update the subtest
setup where t.Setenv and tt.requireScoped are used to always set
requireScopedGatewayEnv rather than only on true.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Enterprise

Run ID: 7f60cb5e-b7e6-4c6e-8589-bd0b8a045a77

📥 Commits

Reviewing files that changed from the base of the PR and between 8772e22 and 16ba61d.

📒 Files selected for processing (10)
  • docs/contributor/validator.md
  • docs/user/cli-reference.md
  • docs/user/component-catalog.md
  • docs/user/validation.md
  • pkg/bundler/bundler.go
  • pkg/bundler/bundler_test.go
  • pkg/validator/v1/job_plan_internal.go
  • pkg/validator/v1/job_plan_test.go
  • validators/conformance/inference_gateway_check.go
  • validators/conformance/inference_gateway_exposure_test.go

Comment thread validators/conformance/inference_gateway_check.go Outdated
Comment thread validators/conformance/inference_gateway_exposure_test.go
@yuanchen8911 yuanchen8911 force-pushed the feat/agentgateway-guardrails branch from 6384207 to a5687e5 Compare June 3, 2026 15:27
@yuanchen8911

Copy link
Copy Markdown
Contributor Author

Stacked on #1162 — this PR sits on top of feat/bundle-set-json (#1162) and should merge after it.

The open-exposure remediation points users at --set-json agentgateway:allowedSourceRanges='["<cidr>"]', which is the flag added by #1162. The branch is rebased onto #1162's head, so that commit currently shows in this PR's diff.

Because this is a cross-fork PR, the base can't be retargeted to the fork branch feat/bundle-set-json (GitHub requires the base to be a branch on NVIDIA/aicr). So the base stays main. Once #1162 merges, this PR's diff collapses to just its own commit automatically — no rebase needed.

@yuanchen8911 yuanchen8911 force-pushed the feat/agentgateway-guardrails branch 3 times, most recently from 55eaed1 to 25e4bfc Compare June 3, 2026 16:53
@yuanchen8911 yuanchen8911 force-pushed the feat/agentgateway-guardrails branch from 25e4bfc to 2f72cd6 Compare June 3, 2026 16:55
@yuanchen8911 yuanchen8911 changed the title WIP: feat(bundler,validator): surface agentgateway open inference-gateway exposure feat(bundler,validator): surface agentgateway open inference-gateway exposure Jun 3, 2026
@yuanchen8911 yuanchen8911 marked this pull request as ready for review June 3, 2026 16:57
@yuanchen8911 yuanchen8911 requested a review from mchmarny June 3, 2026 16:57
@yuanchen8911 yuanchen8911 changed the title feat(bundler,validator): surface agentgateway open inference-gateway exposure feat(bundler): warn on open agentgateway inference-gateway exposure Jun 3, 2026
@yuanchen8911 yuanchen8911 force-pushed the feat/agentgateway-guardrails branch 2 times, most recently from f104f1e to a9749fa Compare June 3, 2026 20:25

@mchmarny mchmarny left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid PR. The two-layer design — bundle-time warning + conformance finding with opt-in fail-closed escalation — is the right shape for a "open by default but make sure operators see it" guardrail, and the env-var forwarding pattern correctly mirrors the existing HF_TOKEN trust boundary (with a test asserting catalog entries can't override the forwarded value, which is genuinely the right paranoia).

One medium concern: isAnySourceCIDR is duplicated byte-for-byte (including the multi-line "Limitation: range-union math" comment) between pkg/bundler/bundler.go:1201-1208 and validators/conformance/inference_gateway_check.go:279-286. Per AGENTS.md anti-pattern, this should live in a shared package. Not blocking merge — the bytes are identical today — but worth lifting out before someone tightens the /0 policy in one place and forgets the other.

A few low / nit items inline: registry-name fragility on the bundler side (validator side has TestEmbeddedCatalog_InferenceGatewayEntryExists; bundler doesn't have an analog), substring-match Service filter (consistent with existing pattern), and one asymmetry vs. validateGatewayDataPlane on how a List failure degrades — fail-closed is defensible there, just worth a comment.

All 31 CI checks green including GPU training/inference/smoke. Coverage 76.4% above floor. Docs + catalog + tests all updated together. Nothing here blocks merge.

Comment thread pkg/bundler/bundler.go Outdated
Comment thread pkg/bundler/bundler.go
Comment thread validators/conformance/inference_gateway_check.go
Comment thread validators/conformance/inference_gateway_check.go Outdated
Comment thread pkg/validator/v1/job_plan_test.go
@yuanchen8911 yuanchen8911 force-pushed the feat/agentgateway-guardrails branch 3 times, most recently from e3f231c to 5bf8448 Compare June 4, 2026 01:22
@yuanchen8911

Copy link
Copy Markdown
Contributor Author

@mchmarny good review — thank you! Addressed all five threads in 5bf8448:

  1. Extracted the duplicated isAnySourceCIDR into a shared pkg/netutil.IsAnySourceCIDR (with a unit test); bundler and validator both call it now.
  2. Added TestAgentgatewayComponentExistsInRegistry so renaming the agentgateway registry entry fails a test instead of silently no-opping the warning.
  3. assessGatewayExposure no longer fails the whole check on a transient apiserver List error in advisory mode — it records best-effort and passes, and fails closed only under enforce mode. Added a reactor-based test for both.
  4. Replaced the strings.Contains filter with isInferenceGatewayProxyName (excludes controller-manager/webhook/metrics), applied to both the exposure Service filter and the EndpointSlice readiness filter so they stay in lockstep. Added predicate + table-case tests.
  5. Left the forwarded-value test as-is per your note.

Rebased onto latest main, make test / lint green on the affected packages. Please take another look.

@yuanchen8911 yuanchen8911 requested a review from mchmarny June 4, 2026 01:26
…exposure

The agentgateway inference-gateway is provisioned as a public LoadBalancer
open to 0.0.0.0/0 by default (NVIDIA#1138). The default is defensible, but the
exposure was silent — nothing in bundle output or validation flagged that the
inference endpoint is internet-facing. Add two guardrails that keep
open-by-default while removing the silence.

1. Bundle-time warning (pkg/bundler): when a bundle includes agentgateway with
   an unscoped allowedSourceRanges, emit a non-blocking warning that the
   inference-gateway will be open to 0.0.0.0/0, with remediation. Mirrors the
   existing storageClassName PVC warning and inspects the merged component
   values, so it catches the open state regardless of override mechanism.

2. Conformance finding (validators/conformance): extend the existing
   inference-gateway check (already wired into every inference overlay) to
   assess the gateway's LoadBalancer Service and record its exposure as
   evidence — scoped source ranges, or an explicit open-to-0.0.0.0/0 finding.
   Open-by-default stays a non-fatal warning; AICR_REQUIRE_SCOPED_INFERENCE_GATEWAY=true
   escalates it to a check failure (fail-closed policy).

An "unscoped" source-range list is empty OR contains an any-source CIDR
(0.0.0.0/0 or ::/0) — a length-only check would let an explicit ["0.0.0.0/0"]
pass enforcement. The exposure assessment filters to the inference-gateway
Service by name (mirroring the EndpointSlice readiness filter) so a co-located
LoadBalancer is neither mislabeled nor able to fail enforce mode. The
InferenceGatewayCheckName constant is exported and locked by
TestEmbeddedCatalog_InferenceGatewayEntryExists so a catalog rename can't
silently no-op enforcement forwarding.

Docs: component-catalog gains an "Exposure guardrails" subsection; cli-reference
documents the bundle warning; validation.md and the validator env table cover
the conformance finding and the enforcement env var.

Closes NVIDIA#1160
@yuanchen8911 yuanchen8911 force-pushed the feat/agentgateway-guardrails branch from 5bf8448 to 05443e0 Compare June 4, 2026 14:20
@mchmarny mchmarny enabled auto-merge (squash) June 4, 2026 16:51
@yuanchen8911 yuanchen8911 disabled auto-merge June 4, 2026 16:59
@yuanchen8911 yuanchen8911 enabled auto-merge (squash) June 4, 2026 16:59
@yuanchen8911 yuanchen8911 merged commit 3c1f525 into NVIDIA:main Jun 4, 2026
32 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

agentgateway: surface open inference-gateway exposure (bundle warning + conformance check)

2 participants