agentgateway: surface open inference-gateway exposure (bundle warning + conformance check) #1160

Description

@yuanchen8911

Summary

The agentgateway inference-gateway is provisioned as a public LoadBalancer (port 80, HTTP, unauthenticated). Its allowedSourceRanges defaults to [], which — by deliberate design (#1138) — leaves the Service open to 0.0.0.0/0. The default itself is defensible (a baked-in CIDR would firewall every downstream operator to NVIDIA's network), but the exposure is silent: nothing in aicr bundle output or validation surfaces that the inference gateway is internet-facing. An operator only discovers it by manually inspecting the live Service.

Discovered during GKE H100 inference e2e validation: the deployed inference-gateway had a public IP with loadBalancerSourceRanges empty (0.0.0.0/0), found only by hand.

Proposal (two guardrails — keep open-by-default, remove the silence)

  1. Bundle-time warning — when the agentgateway component is included with an empty allowedSourceRanges, emit a warning in aicr bundle output, e.g.:

    ⚠ inference-gateway will be provisioned as an internet-facing LoadBalancer open to 0.0.0.0/0. Scope it to trusted networks via agentgateway.allowedSourceRanges (recipe componentRef override). See docs/user/component-catalog.md.

    This mirrors the existing kube-prometheus-stack PVC storageClassName warning already printed by the bundler.

  2. Conformance/security check — a validator check that, on a live cluster, flags an inference-gateway (or any agentgateway-managed) LoadBalancer Service whose spec.loadBalancerSourceRanges is empty while type=LoadBalancer. Reports the open exposure as a finding (warn/fail per policy).

Rationale / scope

  • Preserves the intentional open-by-default UX from feat(agentgateway): scope inference-gateway LB to allowed source ranges #1138 (zero-config external reachability) while making the exposure a conscious decision rather than a silent default.
  • Non-goals: (a) baking a CIDR into the shared overlay/default — would lock external operators out of their own gateway; (b) flipping the default to internal/ClusterIP — a stronger posture worth considering, but a separate maintainer decision with broad blast radius.

Pointers

  • Default + rationale: recipes/components/agentgateway/values.yaml (allowedSourceRanges: [])
  • Render path: recipes/components/agentgateway/manifests/inference-gateway.yaml (loadBalancerSourceRanges under AgentgatewayParameters.spec.service.spec)
  • Bundle warning precedent: kube-prometheus-stack PVC warning in pkg/bundler output

Addressed by: #1163 (bundle-time warning + conformance finding). Related: #1161 / #1162 (--set-json CLI flag for scoping allowedSourceRanges).
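
An added example, not taken from the issue or from the aicr code base: the live-cluster check proposed in item 2, written in Go over the JSON that `kubectl get svc -A -o json` returns. It goes one step beyond the proposal by also treating an explicit 0.0.0.0/0 or ::/0 entry as open. The function name, the sample namespaces and the decision to leave scoping to agentgateway-managed Services to the caller are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// serviceList holds only the ServiceList fields this check reads (encoding/json matches keys case-insensitively).
type serviceList struct {
	Items []struct {
		Metadata struct{ Namespace, Name string }
		Spec     struct {
			Type                     string
			LoadBalancerSourceRanges []string
		}
	}
}

// OpenLoadBalancers (hypothetical name) returns "namespace/name" for each LoadBalancer Service open to any source.
func OpenLoadBalancers(raw []byte) ([]string, error) {
	var list serviceList
	if err := json.Unmarshal(raw, &list); err != nil {
		return nil, fmt.Errorf("parse ServiceList: %w", err)
	}
	var findings []string
	for _, svc := range list.Items {
		if svc.Spec.Type != "LoadBalancer" {
			continue // ClusterIP and NodePort Services are out of scope for this check
		}
		open := len(svc.Spec.LoadBalancerSourceRanges) == 0 // empty means 0.0.0.0/0
		for _, cidr := range svc.Spec.LoadBalancerSourceRanges {
			open = open || cidr == "0.0.0.0/0" || cidr == "::/0" // catch-all entry defeats the other ranges
		}
		if open {
			findings = append(findings, svc.Metadata.Namespace+"/"+svc.Metadata.Name)
		}
	}
	return findings, nil
}

// sample is constructed input, not captured from a real cluster.
const sample = `{"items":[
 {"metadata":{"namespace":"gw-a","name":"inference-gateway"},"spec":{"type":"LoadBalancer"}},
 {"metadata":{"namespace":"gw-b","name":"inference-gateway"},"spec":{"type":"LoadBalancer","loadBalancerSourceRanges":["10.0.0.0/8"]}},
 {"metadata":{"namespace":"gw-c","name":"inference-gateway"},"spec":{"type":"LoadBalancer","loadBalancerSourceRanges":["10.0.0.0/8","0.0.0.0/0"]}},
 {"metadata":{"namespace":"gw-d","name":"metrics"},"spec":{"type":"ClusterIP"}}]}`

func main() {
	fmt.Println(OpenLoadBalancers([]byte(sample)))
}
```

Whether a finding is reported as warn or fail is a policy choice, as the proposal notes; the function only returns the list of open Services.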
