# [Feature]: Add and configure vLLM and Ollama as first-class model providers #2838
### Description
Clawdbot supports multiple model providers, but configuring local inference engines such as vLLM and Ollama is neither straightforward nor fully documented. Users running local LLMs (on a GPU, on-prem, or under WSL) face friction when trying to integrate these providers reliably.
Adding official support for vLLM and Ollama as first-class providers would significantly improve local deployment, performance, and the developer experience.
### Proposed solution
Add built-in support and configuration helpers for:

**Ollama provider**
- Define `baseUrl`, `models`, `contextWindow`, and `maxTokens`
- Support the OpenAI-compatible `/v1/chat/completions` endpoint
- Auto-detect locally installed models (e.g. `qwen3:8b`)
- Clear validation errors and examples
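As a sketch of what the Ollama side would target: Ollama already serves an OpenAI-compatible chat endpoint on its default port 11434, and its `/api/tags` route lists installed models, which a provider could use for the auto-detection mentioned above. The payload below is illustrative; `qwen3:8b` is the example model from this issue.

```shell
# List locally installed Ollama models (a candidate source for auto-detection):
#   curl -s http://localhost:11434/api/tags
# Chat via Ollama's OpenAI-compatible endpoint:
PAYLOAD='{"model":"qwen3:8b","messages":[{"role":"user","content":"Hello"}],"max_tokens":64}'
curl -s http://localhost:11434/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || true  # ignore failure if no local Ollama server is running
```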
**vLLM provider**
- Support OpenAI-compatible vLLM endpoints
- Allow configuration of:
  - `served-model-name`
  - `contextWindow`
  - `maxTokens`
  - GPU / batching options
- Enable seamless switching between cloud and local models
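For vLLM, the fields listed above map onto existing server flags: `--served-model-name` sets the alias clients use, `--max-model-len` bounds the context window, and `--gpu-memory-utilization` / `--max-num-seqs` cover GPU and batching options. A minimal sketch (the model name and alias are illustrative):

```shell
# Launch an OpenAI-compatible vLLM server (requires vLLM and a GPU; illustrative model):
#   vllm serve Qwen/Qwen2.5-7B-Instruct \
#     --served-model-name local-qwen \
#     --max-model-len 32768 \
#     --gpu-memory-utilization 0.90 \
#     --max-num-seqs 64
# The alias from --served-model-name is what clients put in the "model" field:
PAYLOAD='{"model":"local-qwen","messages":[{"role":"user","content":"Hello"}],"max_tokens":64}'
curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d "$PAYLOAD" || true  # ignore failure if no local vLLM server is running
```

Because both servers speak the same `/v1/chat/completions` protocol, switching between cloud and local models reduces to swapping the base URL and model name.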
**CLI / config improvements**
- `clawdbot models add --provider ollama|vllm`
- Better schema validation errors
- Example configs in the docs
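One possible shape for the documented example configs, using the fields named above. The overall schema here is hypothetical and the numbers are placeholders; Clawdbot's actual config format may differ:

```json
{
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434/v1",
      "models": [
        { "name": "qwen3:8b", "contextWindow": 32768, "maxTokens": 4096 }
      ]
    },
    "vllm": {
      "baseUrl": "http://localhost:8000/v1",
      "models": [
        { "name": "local-qwen", "contextWindow": 32768, "maxTokens": 4096 }
      ]
    }
  }
}
```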
### Alternatives considered
- **Manually configuring models via a generic OpenAI-compatible provider**
  → Works partially, but leads to unclear errors, missing validation, and poor UX.
- **External proxy layers**
  → Add unnecessary complexity for users already running vLLM or Ollama locally.