Conversation
**Commit Message** This removes the inference extension related code from the controller to reduce the size of the refactoring PR #629. We need to do the complete redo on inference extension after #629, so this doesn't mean that we drop the support for it. --------- Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
|
ok all good except that i need to backfill the unit tests on the new gateway reconciler and gateway pod mutators. other than that, ready |
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
| // Note: when multiple AIGatewayRoute resources are attached to the same Gateway, and each | ||
| // AIGatewayRoute has a different resource configuration, the ai-gateway will pick one of them | ||
| // to configure the resource requirements of the external processor container. | ||
| // |
There was a problem hiding this comment.
i want some ideas on this - maybe we might want a "gateway-level" CRD to define things like this
There was a problem hiding this comment.
I think we can add the resource requirement to the ai gateway controller flags.
manifests/charts/ai-gateway-helm/templates/admission_webhook.yaml
Outdated
Show resolved
Hide resolved
|
i think this lacks lots of code comments compared to the usual changes i did ... |
|
i think i need to add one additional end to end test where we apply multiple AIGatewayRoutes |
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
aabchoo
left a comment
There was a problem hiding this comment.
minor comments still reviewing
Signed-off-by: Takeshi Yoneda <[email protected]>
|
i will refactor around how certs are created etc tomorrow, following a nice advice by @arkodg |
|
|
||
| var slogLevel slog.Level | ||
| if err = slogLevel.UnmarshalText([]byte(*extProcLogLevelPtr)); err != nil { | ||
| if err := slogLevel.UnmarshalText([]byte(*extProcLogLevelPtr)); err != nil { |
There was a problem hiding this comment.
curious: why use := when err is defined a few lines above?
| // PostVirtualHostModify allows an extension to modify the virtual hosts in the xDS config. | ||
| // | ||
| // Currently, this replaces the route that has "x-ai-eg-selected-route" pointing to "original_destination_cluster" to route to the original destination cluster. | ||
| func (s *Server) PostVirtualHostModify(_ context.Context, req *egextension.PostVirtualHostModifyRequest) (*egextension.PostVirtualHostModifyResponse, error) { |
There was a problem hiding this comment.
i should've left note: this code here was not used at all after InfExt controller code removal #632, and I didn't want to fix the test for this unused code hence I removed
| // httpHeaderName: x-ai-eg-original-dst | ||
| // useHttpHeader: true | ||
| // type: ORIGINAL_DST | ||
| if !extProcUDSExist { |
There was a problem hiding this comment.
note: this uds cluster creation was necessary to attach the UDS cluster to the upstream filter below since now the extension policy was created per-gateway, not per-route, so we needed to create a unique name UDS cluster that can be attached below
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
|
ok it seems e2e is flaky... i need to fix |
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
|
hmm still flaky... |
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
|
@yuzisun @wengyao04 PTAL |
Signed-off-by: Takeshi Yoneda <[email protected]>
|
man flake |
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
# Conflicts: # cmd/aigw/run.go
# Conflicts: # internal/controller/ai_gateway_route.go # internal/controller/ai_gateway_route_test.go
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
Commit Message
This commit refactors the internal on how the ext proc is deployed. Specifically, this switches to insert the ext proc container as a sidecar container of Envoy pods created by Envoy Gateway. This is another large refactoring that turned out necessary for #599. This utilizes the mutating webhook to insert the extproc container Envoy pods.
Making the extproc as as sidecar means that we now have a one-to-one mapping between Gateway and the extproc hence this naturally resolves the previously known limitation #509 and now users can attach multiple AIGatewayRoute(s) to one Gateway.
Implementation note: since the volume mounts only work in the namespace-scoped way, use-created secrets (like API Keys) cannot be mounted by the extproc as it runs in "envoy-gateway-system" namespace. To resolve this, now the controller reads the secret and embed the read credentials into the "extproc secret" (which is previously known as "extproc configmap") together with routing, matching and backend information. That secret is written in the "envoy-gateway-system" namespace hence it can be mounted by the extproc container.
Related Issues/PRs (if applicable)
Resolves #509
Resolves #621