Signed-off-by: Takeshi Yoneda <[email protected]>
What's left is the routing code, but almost there!
Nothing special is going on, as I described in the PR description, so I guess I just need a rubber stamp 😉
FYI @nacx
I'd prefer to move this out of internal.
    BackendRoutingHeaderKey string `yaml:"backendRoutingHeaderKey"`
    // Rules is the routing rules to be used by the external processor to make the routing decision.
    // Inside the routing rules, the header ModelNameHeaderKey may be used to make the routing decision.
    Rules []RouteRule `yaml:"rules"`
Why are the rules configured in the extproc ConfigMap?
These rules are expected to be updated quite frequently as users add or remove model endpoints. Or are we translating the LLMRoute into the config for extproc?
> or we are translating the LLMRoute to the config for extproc?

Yes, this is because of the requirement not to restart the extproc when the LLMRoute and related resources are updated.
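To illustrate the "no restart on update" requirement, here is a minimal sketch of one way a process could pick up a changed config file at runtime. It is not taken from this PR: the `Watcher` type, the polling approach, and the `Config.Raw` field are all assumptions made for illustration, and a real implementation would parse the YAML into the actual config struct instead of keeping the raw text. It needs Go 1.19 or later for `atomic.Pointer`.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sync/atomic"
	"time"
)

// Config is a hypothetical stand-in for the extproc configuration.
// A real implementation would hold parsed rules and backends here.
type Config struct {
	Raw string
}

// Watcher (hypothetical) keeps the latest config behind an atomic pointer so
// request handlers can read it without locking while a reload swaps it.
type Watcher struct {
	path    string
	current atomic.Pointer[Config]
}

// NewWatcher loads the file once so that Current never returns nil.
func NewWatcher(path string) (*Watcher, error) {
	w := &Watcher{path: path}
	if _, err := w.Reload(); err != nil {
		return nil, err
	}
	return w, nil
}

// Reload re-reads the file and swaps the config only if the content changed.
// It reports whether a swap happened.
func (w *Watcher) Reload() (bool, error) {
	data, err := os.ReadFile(w.path)
	if err != nil {
		return false, err
	}
	if cur := w.current.Load(); cur != nil && cur.Raw == string(data) {
		return false, nil
	}
	w.current.Store(&Config{Raw: string(data)})
	return true, nil
}

// Current returns the most recently loaded config.
func (w *Watcher) Current() *Config { return w.current.Load() }

// Run polls the file until stop is closed. On error the previous config stays active.
func (w *Watcher) Run(stop <-chan struct{}, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			if _, err := w.Reload(); err != nil {
				fmt.Fprintln(os.Stderr, "reload failed, keeping previous config:", err)
			}
		}
	}
}

// runDemo writes a config file, loads it, rewrites it, and reloads it.
func runDemo() (string, string, error) {
	dir, err := os.MkdirTemp("", "extproc-config")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)
	path := filepath.Join(dir, "config.yaml")
	if err := os.WriteFile(path, []byte("rules: v1"), 0o600); err != nil {
		return "", "", err
	}
	w, err := NewWatcher(path)
	if err != nil {
		return "", "", err
	}
	before := w.Current().Raw
	if err := os.WriteFile(path, []byte("rules: v2"), 0o600); err != nil {
		return "", "", err
	}
	if _, err := w.Reload(); err != nil {
		return "", "", err
	}
	return before, w.Current().Raw, nil
}

func main() {
	before, after, err := runDemo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("before: %q, after: %q\n", before, after)
}
```

In a long-running process, `Run` would be started in its own goroutine and handlers would call `Current()` per request. Comparing file content rather than modification time keeps the sketch deterministic, at the cost of re-reading the file on every poll.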
    Backends []Backend `yaml:"backends"`
}

// Backend corresponds to LLMRouteRuleBackendRef in api/v1alpha1/api.go
If I understand correctly, the goal is for this config not to be tied to AI concepts.
Well, not exactly, I think. I assume semantic caching, prompt guards, etc. will happen in extproc.
yuzisun left a comment
Left some comments, but this is good to merge so we can iterate on this initial implementation.
Follow up on #50 (comment) Signed-off-by: Takeshi Yoneda <[email protected]>
This scaffolds and implements the initial logic based on the
current API. Notably, I intentionally organized the extproc code
so that it is decoupled from EG concepts as well as from the
AI Gateway API-level constructs. The rationale is twofold: it
allows us to test the extproc without relying on EG/AIG, or even
on Kubernetes at all, and some users are asking to use the extproc
outside the ai-gateway project.
As for the implementation, there is nothing special going on:
it is mostly migrated from the MVP with the decoupling in mind.
I will follow up with the end-to-end test in subsequent PRs.
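
The decoupling described above means a routing decision can be exercised as plain Go, with no cluster involved. The following is a rough sketch under stated assumptions, not the code in this PR: only the field names `BackendRoutingHeaderKey`, `Rules`, `Backends`, and `ModelNameHeaderKey` appear in the reviewed diff, while `HeaderMatch`, `SelectBackend`, exact-match semantics, first-match-wins ordering, and the header names `x-model-name` and `x-backend` are hypothetical.

```go
package main

import "fmt"

// Backend is a simplified, hypothetical stand-in for the PR's Backend type.
type Backend struct {
	Name string
}

// HeaderMatch is hypothetical: an exact match on a single request header.
type HeaderMatch struct {
	Name  string
	Value string
}

// RouteRule pairs header matches with candidate backends.
// Requiring every match to succeed is an assumption of this sketch.
type RouteRule struct {
	Headers  []HeaderMatch
	Backends []Backend
}

// Config mirrors only the field names visible in the reviewed diff.
type Config struct {
	ModelNameHeaderKey      string
	BackendRoutingHeaderKey string
	Rules                   []RouteRule
}

// matches reports whether every header match of the rule is satisfied.
func matches(rule RouteRule, headers map[string]string) bool {
	for _, m := range rule.Headers {
		if headers[m.Name] != m.Value {
			return false
		}
	}
	return true
}

// SelectBackend returns the first backend of the first matching rule.
func SelectBackend(cfg *Config, headers map[string]string) (Backend, bool) {
	for _, rule := range cfg.Rules {
		if len(rule.Backends) > 0 && matches(rule, headers) {
			return rule.Backends[0], true
		}
	}
	return Backend{}, false
}

// exampleConfig builds an in-memory config; all values are made up.
func exampleConfig() *Config {
	return &Config{
		ModelNameHeaderKey:      "x-model-name",
		BackendRoutingHeaderKey: "x-backend",
		Rules: []RouteRule{
			{
				Headers:  []HeaderMatch{{Name: "x-model-name", Value: "gpt-4o"}},
				Backends: []Backend{{Name: "openai"}},
			},
			{
				Headers:  []HeaderMatch{{Name: "x-model-name", Value: "llama-3"}},
				Backends: []Backend{{Name: "self-hosted"}},
			},
		},
	}
}

func main() {
	cfg := exampleConfig()
	headers := map[string]string{cfg.ModelNameHeaderKey: "llama-3"}
	if b, ok := SelectBackend(cfg, headers); ok {
		// The selected backend name would be written to the routing header.
		fmt.Printf("%s: %s\n", cfg.BackendRoutingHeaderKey, b.Name)
	} else {
		fmt.Println("no matching rule")
	}
}
```

Because `SelectBackend` takes only a config struct and a header map, it can be covered by ordinary unit tests without EG, AIG, or Kubernetes.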