Our current Envoy integration relies on EnvoyExtensionPolicy and EnvoyPatchPolicy. This is very manual and not sustainable.
(See: #18)
We're trying to settle on a single implementation that this project will extend to support LLMServerPool as a Gateway API backend. This will enable us to run e2e tests against these concepts and iterate more quickly. That implementation should be:
- An existing conformant implementation of Gateway API
- Part of CNCF
- Envoy-based for simplicity of extension mechanisms
- Open to contributions from us to support this new type of backend
We propose extending this existing gateway implementation to act as the controller for the LLMServerPool object (see: https://github.com/kubernetes-sigs/llm-instance-gateway/blob/main/docs/proposals/002-api-proposal/proposal.md#llmserverpool), as well as updating HTTPRoute to support an LLMServerPool as a backendRef.
At a high level we expect this to look like:
- Upon creation of an LLMServerPool, the controller creates an ext-proc deployment/service and an original_dst cluster.
- Upon creation of an HTTPRoute with an LLMServerPool as a backendRef, the controller creates a Listener that routes requests to the appropriate original_dst cluster (there may be multiple LLMServerPools) and configures ext_proc to operate on requests sent to this cluster.
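
To make the two steps above concrete, here is a minimal Go sketch of the mapping the controller might perform. It is not taken from any gateway implementation: the types, the function names (`resourcesForPool`, `listenerRoutes`), and the resource naming scheme are all hypothetical, and the real objects carry many more fields. It only illustrates that each pool yields one ext-proc deployment/service plus one original_dst cluster, and that each route rule pointing at a pool is wired to that pool's cluster and ext-proc service.

```go
package main

import "fmt"

// LLMServerPool is a hypothetical, trimmed-down stand-in for the real API object.
type LLMServerPool struct {
	Namespace string
	Name      string
}

// PoolResources names what the controller would create for one pool.
// The naming scheme below is an assumption made for illustration only.
type PoolResources struct {
	ExtProcDeployment  string
	ExtProcService     string
	OriginalDstCluster string
}

// resourcesForPool derives the per-pool resources described in the first step.
func resourcesForPool(p LLMServerPool) PoolResources {
	base := p.Namespace + "-" + p.Name
	return PoolResources{
		ExtProcDeployment:  base + "-ext-proc",
		ExtProcService:     base + "-ext-proc",
		OriginalDstCluster: base + "-original-dst",
	}
}

// BackendRef keeps only the kind and name of an HTTPRoute backendRef.
type BackendRef struct {
	Kind string
	Name string
}

// RouteRule is a simplified HTTPRoute rule: one path prefix, one backend.
type RouteRule struct {
	PathPrefix string
	Backend    BackendRef
}

// ListenerRoute is the hypothetical per-rule listener configuration:
// where to send traffic and which ext-proc service processes it.
type ListenerRoute struct {
	PathPrefix     string
	Cluster        string
	ExtProcService string
}

// listenerRoutes covers the second step: every rule whose backendRef is an
// LLMServerPool is routed to that pool's original_dst cluster, with ext_proc
// attached. Rules with other backend kinds are left to the existing logic.
func listenerRoutes(namespace string, rules []RouteRule) []ListenerRoute {
	var out []ListenerRoute
	for _, r := range rules {
		if r.Backend.Kind != "LLMServerPool" {
			continue
		}
		res := resourcesForPool(LLMServerPool{Namespace: namespace, Name: r.Backend.Name})
		out = append(out, ListenerRoute{
			PathPrefix:     r.PathPrefix,
			Cluster:        res.OriginalDstCluster,
			ExtProcService: res.ExtProcService,
		})
	}
	return out
}

func main() {
	rules := []RouteRule{
		{PathPrefix: "/chat", Backend: BackendRef{Kind: "LLMServerPool", Name: "pool-a"}},
		{PathPrefix: "/embed", Backend: BackendRef{Kind: "LLMServerPool", Name: "pool-b"}},
		{PathPrefix: "/static", Backend: BackendRef{Kind: "Service", Name: "web"}},
	}
	for _, lr := range listenerRoutes("default", rules) {
		fmt.Printf("%s -> cluster %s (ext_proc via %s)\n", lr.PathPrefix, lr.Cluster, lr.ExtProcService)
	}
}
```

Keeping one original_dst cluster per pool is what lets a single Listener serve multiple LLMServerPools, as noted above; whether clusters are actually keyed this way is a decision for the chosen implementation.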