Description:
Right now, it appears that you need to set a custom header in order to route requests to a single backend (e.g. ollama). For example, `curl ... -H 'aigw-backend-selector: ollama'` is routed to ollama by a match rule like this:
```yaml
- matches:
    - headers:
        - type: Exact
          name: aigw-backend-selector
          value: ollama
  backendRefs:
    - name: ollama
      namespace: default
```
Here's the full configuration I would like to simplify:
Details
```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: GatewayClass
metadata:
  name: aigw-run
spec:
  controllerName: gateway.envoyproxy.io/gatewayclass-controller
---
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: aigw-run
spec:
  gatewayClassName: aigw-run
  listeners:
    - name: http
      protocol: HTTP
      port: 1975
  infrastructure:
    parametersRef:
      group: gateway.envoyproxy.io
      kind: EnvoyProxy
      name: envoy-ai-gateway
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: aigw-run
spec:
  schema:
    name: OpenAI
  targetRefs:
    - name: aigw-run
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    - matches:
        - headers:
            - type: Exact
              name: aigw-backend-selector
              value: ollama
      backendRefs:
        - name: ollama
          namespace: default
---
apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIServiceBackend
metadata:
  name: ollama
spec:
  timeouts:
    request: 3m
  schema:
    name: OpenAI
  backendRef:
    name: ollama
    kind: Backend
    group: gateway.envoyproxy.io
---
apiVersion: gateway.envoyproxy.io/v1alpha1
kind: Backend
metadata:
  name: ollama
spec:
  endpoints:
    - ip:
        address: 0.0.0.0
        port: 11434
```
However, as far as I know, you can't specify a default backend. This makes programmatic access non-portable: every client has to be configured with a gateway-specific header.
For example, here's how to add the header with the openai Python client:

```python
import openai

client = openai.Client(default_headers={"aigw-backend-selector": "ollama"})
```
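To make the requirement concrete with only the Python standard library (the URL and payload here are hypothetical; the header name comes from the route config above), every request has to carry the selector header or no rule matches:

```python
import urllib.request

# Hypothetical local gateway endpoint (port 1975, per the Gateway listener above).
url = "http://localhost:1975/v1/chat/completions"

req = urllib.request.Request(
    url,
    data=b'{"model": "qwen3:0.6b", "messages": []}',
    headers={
        "Content-Type": "application/json",
        # Without this gateway-specific header, the request matches no route.
        "aigw-backend-selector": "ollama",
    },
)

# urllib stores header keys capitalized; the selector is attached to every request.
print(req.get_header("Aigw-backend-selector"))  # prints: ollama
```

Any client library or plain HTTP caller has to repeat this same header plumbing, which is the toil this issue is about.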
In summary, having a default route (backend, not just model) is important. In particular, users with single-backend configurations will find having to configure a header unnecessary toil. Even where it is possible, it adds work that isn't required in alternative platforms.
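To make the ask concrete, here is a purely hypothetical sketch of what a default rule could look like. I'm borrowing the Gateway API convention where a rule with no `matches` acts as a catch-all; whether AIGatewayRoute would adopt the same convention is an open question:

```yaml
rules:
  # Hypothetical: no `matches` stanza, so every request falls through
  # to this backend without needing an aigw-backend-selector header.
  - backendRefs:
      - name: ollama
        namespace: default
```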
Prior art
Here are similar configs in other products that accomplish this. Having something that can do the same in envoy ai-gateway seems a great way to allow folks to transition.
archgw:
```yaml
llm_providers:
  - name: qwen3:0.6b
    provider_interface: openai
    # This configuration is converted to Envoy and run inside Docker.
    endpoint: host.docker.internal:11434
    model: qwen3:0.6b
    default: true
```
litellm:
```yaml
model_list:
  - model_name: "*"
    litellm_params:
      model: openai/*
      api_base: os.environ/OPENAI_BASE_URL
      api_key: os.environ/OPENAI_API_KEY
```
Also, a couple of proxies have even simpler configs:
- open-responses: specifies a target `OPENAI_BASE_URL`, which is the default egress
- llama-stack: has a distribution that bakes in ollama egress: `llamastack/distribution-ollama`