Conversation
Signed-off-by: Takeshi Yoneda <[email protected]>
Force-pushed from 60b9003 to 86ab274
The implementation is mechanical, so nothing special here. I hope someone approves by the time I wake up tomorrow so I can unblock the implementation part of #43 as well as the rate limit stuff.
Signed-off-by: Takeshi Yoneda <[email protected]>
// schema: OpenAI
// backendRoutingHeaderKey: x-backend-name
// modelNameHeaderKey: x-model-name
// selectedBackendHeaderKey: x-envoy-ai-gateway-selected-backend
Not sure if we want to call it selected-backend; the backend may not necessarily be selected by the gateway. The user can still specify the backend in the request.
Well, the backend will be "calculated" and "selected" following the matching rules of LLMRoute, so the word "selected" makes sense regardless of who actually "selects".
This is for internal use only, and not user-facing, so it should be fine.
This is not true; you might not be able to select the backend just based on the model. For example, there is a case where the same Anthropic models are supported by both Google and AWS. In this case the user needs to set the backend header to determine where to route.
see https://aws.amazon.com/bedrock/claude/ and https://cloud.google.com/solutions/anthropic?hl=en.
Wait a second, this has nothing to do with the model...
Say you have
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: LLMRoute
metadata:
name: some-route
namespace: default
spec:
inputSchema:
schema: OpenAI
rules:
- matches:
- headers:
- type: Exact
name: some-random-header-that-can-be-sent-directly-by-clients
value: foo
backendRefs:
- name: some-backend-name
......and a client sends curl -H "some-random-header-that-can-be-sent-directly-by-clients: foo"; then internally extproc sets the header $selectedBackendHeaderKey: some-backend-name. That's what this does, and it is completely an implementation detail. This package is not a CRD but a configuration of extproc itself.
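To make that concrete, here is a minimal Go sketch of the matching step described above. The `Rule`/`HeaderMatch` types and `selectBackend` are illustrative stand-ins, not the actual extproc code:

```go
package main

import "fmt"

// Hypothetical, simplified versions of the extproc routing config types.
type HeaderMatch struct{ Name, Value string } // corresponds to type: Exact
type Rule struct {
	Headers []HeaderMatch // matches[].headers in the LLMRoute
	Backend string        // backendRefs[0].name
}

// selectBackend mimics the LLMRoute rule matching: the first rule whose
// header matches all succeed wins, and its backend name is what extproc
// writes into the selectedBackendHeaderKey header.
func selectBackend(reqHeaders map[string]string, rules []Rule) (string, bool) {
	for _, r := range rules {
		matched := true
		for _, h := range r.Headers {
			if reqHeaders[h.Name] != h.Value {
				matched = false
				break
			}
		}
		if matched {
			return r.Backend, true
		}
	}
	return "", false
}

func main() {
	rules := []Rule{{
		Headers: []HeaderMatch{{Name: "some-random-header-that-can-be-sent-directly-by-clients", Value: "foo"}},
		Backend: "some-backend-name",
	}}
	req := map[string]string{"some-random-header-that-can-be-sent-directly-by-clients": "foo"}
	if backend, ok := selectBackend(req, rules); ok {
		// extproc would then set the internal header, e.g.:
		fmt.Println("x-envoy-ai-gateway-selected-backend:", backend)
	}
}
```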
So you are saying we are not providing a standard AI gateway backend routing header to users; they MUST define the matching rules on LLMRoute for each backend the user creates. Is that correct?
Yes, exactly, currently. Though we could provide some "canonical-backend-choosing-header-key" defined in the api/v1alpha package and prioritize the header value when present, which effectively ignores LLMRoute.Rules if the header exists. That is another topic we should discuss in another issue/PR if you want to support it. This configuration key is a complete implementation detail regardless of whether or not we do that.
I will raise an issue tomorrow!
That is exactly what I was thinking: create those rules for the user, and if this routing header exists then we ignore the rules on the LLMRoute. I think this can provide a better user experience.
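A minimal Go sketch of the precedence being proposed here. The canonical header name and all type/function names are hypothetical; the real name would be decided in the follow-up issue:

```go
package main

import "fmt"

// Hypothetical type mirroring a simplified single-header-match LLMRoute rule.
type rule struct {
	headerName, headerValue string // single Exact header match
	backend                 string
}

// Hypothetical canonical header name; not decided in this PR.
const canonicalBackendHeader = "x-ai-gateway-backend"

// resolveBackend sketches the proposed precedence: if the client set the
// canonical routing header, use it directly and ignore LLMRoute.Rules;
// otherwise fall back to ordinary rule matching.
func resolveBackend(reqHeaders map[string]string, rules []rule) (string, bool) {
	if b := reqHeaders[canonicalBackendHeader]; b != "" {
		return b, true
	}
	for _, r := range rules {
		if reqHeaders[r.headerName] == r.headerValue {
			return r.backend, true
		}
	}
	return "", false
}

func main() {
	rules := []rule{{headerName: "x-user", headerValue: "foo", backend: "rule-backend"}}
	// The canonical header wins over the rules when present.
	b, _ := resolveBackend(map[string]string{canonicalBackendHeader: "aws-bedrock", "x-user": "foo"}, rules)
	fmt.Println(b)
	// Without it, the LLMRoute rules decide.
	b, _ = resolveBackend(map[string]string{"x-user": "foo"}, rules)
	fmt.Println(b)
}
```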
extprocconfig/extprocconfig.go
// where the input of the external processor is in the OpenAI schema, the model name is populated in the header x-model-name,
// The model name header `x-model-name` is used in the header matching to make the routing decision. **After** the routing decision is made,
// the selected backend name is populated in the header `x-backend-name`. For example, when the model name is `llama3.3333`,
// where the input of the external processor is in the OpenAI schema, the model name is populated in the header x-envoy-ai-gateway-llm-model,
Since the user can configure the model routing header name, we can say "the configured modelNameHeaderKey".
Well, no, users do not choose the model routing header name at this point.
ai-gateway/api/v1alpha1/api.go
Line 206 in cb6b2e0
It would be good to make this part of the LLMRoute resource, but the comment here matches what it is now.
Signed-off-by: Takeshi Yoneda <[email protected]>
Signed-off-by: Takeshi Yoneda <[email protected]>
	return fmt.Errorf("failed to get extension policy: %w", err)
}
pm := egv1a1.BufferedExtProcBodyProcessingMode
port := gwapiv1.PortNumber(1063)
Why specifically 1063? Should this port be configurable?
Yeah, we can make this configurable later.
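One way to do that later, sketched in Go. The `portOrDefault` helper and the optional-field shape are hypothetical, not part of the current API:

```go
package main

import "fmt"

// Hypothetical helper: if the API later grows an optional extproc port field,
// the controller can fall back to today's hardcoded 1063 when it is unset.
func portOrDefault(configured *int32) int32 {
	if configured != nil {
		return *configured
	}
	return 1063 // current hardcoded value in the controller
}

func main() {
	custom := int32(9002)
	fmt.Println(portOrDefault(nil), portOrDefault(&custom)) // prints: 1063 9002
}
```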
Signed-off-by: Takeshi Yoneda <[email protected]>
This comes up in a thread in #71 with @yuzisun, and we might want to remove the LLM prefix. The rationale is that the current functionality has nothing to do with "LLM" but is more about general routing and authn/z with "AI providers" where the input is in the OpenAI format. On the other hand, there "will be" LLM-specific CRDs such as configurations for prompt guard, semantic caching, etc. --------- Signed-off-by: Takeshi Yoneda <[email protected]>
This commit adds the initial implementation of the resource
management controllers. As per the official recommendation of
controller-runtime, the controller interface is implemented per
CRD. Therefore, this adds two typical "controllers", one for
LLMRoute and another for LLMBackend.
In addition, this implements the "configuration sink", which is the
sync mechanism across multiple resource events via a Go channel.
It has the global view over all AI Gateway resources and hence has
the full context required to create the final extproc configuration
as well as the HTTPRoute. The following is a (very) simple diagram
of the relationship among k8s events, controllers, and the configSink.
k8s events --async--> {llmRouteCtrl, llmBackendCtrl, ...} --sinkEvent (sync)--> configSink
The follow-up PR will add the end-to-end tests.