feat(provider): add mlx provider with local context-window detection #275
Merged
LeeCheneler merged 3 commits into main (Apr 10, 2026)
Conversation
MLX server (mlx_lm.server) exposes an OpenAI-compatible API on http://127.0.0.1:8080 by default. Treating it as its own provider type ensures model listing and context-window detection go through the /v1/models path rather than Ollama's native /api/show endpoint.
mlx_lm.server's /v1/models doesn't expose any context size, so look up max_position_embeddings (falling back through max_sequence_length, seq_length, n_positions, sliding_window) directly from the HuggingFace cache on disk. Checks top-level first, then text_config for multimodal models like gemma-3. Honours HF_HUB_CACHE and HF_HOME, and supports absolute local model paths.
Spinner ticks on an 80ms interval, so hardcoding frame 0 (⠋) races with the flush delay. Match any frame in the cycle instead. Also bump flushInkFrames from 25ms to 50ms for more timing headroom on slower hosts.
Summary
Adds MLX (`mlx_lm.server`) as a first-class provider type so users can point tomo at a local MLX server without masquerading it as Ollama. Also teaches the provider to read the context window from the HuggingFace cache on disk, since mlx_lm doesn't expose that info over HTTP.
GitHub Issue
N/A
What Changed
New `mlx` provider type. Adds `"mlx"` to `providerTypeSchema`, wires up a default URL (`http://127.0.0.1:8080`) and an `MLX_API_KEY` env var in `provider/client.ts`, and exposes it in the settings dropdown. Because MLX is OpenAI-compatible, model listing flows through the existing `/v1/models` path, so no new client implementation is required. Previously, users who selected "ollama" and pointed the URL at MLX hit Ollama's native `/api/show` and `/api/tags` endpoints and got confusing results.
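For reference, the wiring looks roughly like the sketch below. This is illustrative, not the actual `provider/client.ts`: the non-MLX entries and the `providerDefaults` shape are assumptions; only `"mlx"`, its default URL, and `MLX_API_KEY` come from this PR.

```ts
import { z } from "zod";

// "mlx" joins the provider-type enum; the other entries are assumed here.
export const providerTypeSchema = z.enum(["openai", "ollama", "mlx"]);
export type ProviderType = z.infer<typeof providerTypeSchema>;

// Default base URL and API-key env var per provider (shape is illustrative).
const providerDefaults: Partial<Record<ProviderType, { baseUrl: string; apiKeyEnv: string }>> = {
  ollama: { baseUrl: "http://127.0.0.1:11434", apiKeyEnv: "OLLAMA_API_KEY" },
  mlx: { baseUrl: "http://127.0.0.1:8080", apiKeyEnv: "MLX_API_KEY" },
};
```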
Context window from HuggingFace `config.json`. `mlx_lm.server`'s `/v1/models` returns only `id`/`object`/`created`, with no context length, and no other endpoint exposes it. Since mlx_lm downloads HF models to the local cache before serving, we read `config.json` directly from `~/.cache/huggingface/hub/models--<org>--<repo>/snapshots/*/`. The field lookup walks `max_position_embeddings` → `max_sequence_length` → `seq_length` → `n_positions` → `sliding_window` at the top level first, then falls back to `text_config` for multimodal models (e.g. gemma-3). The `HF_HUB_CACHE` and `HF_HOME` env vars are honoured, and absolute local model paths are supported. Any failure falls through to the existing 8192 default.
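A minimal sketch of that field walk, assuming a parsed `config.json` (`contextWindowFromConfig` is a hypothetical name; the real logic lives inside `fetchMlxContextWindow`):

```ts
import fs from "node:fs";

// Keys checked in priority order, per the lookup described above.
const CONTEXT_KEYS = [
  "max_position_embeddings",
  "max_sequence_length",
  "seq_length",
  "n_positions",
  "sliding_window",
] as const;

function contextWindowFromConfig(configPath: string): number | undefined {
  const config = JSON.parse(fs.readFileSync(configPath, "utf8"));
  // Top level first, then text_config for multimodal models (e.g. gemma-3).
  for (const source of [config, config.text_config]) {
    if (!source) continue;
    for (const key of CONTEXT_KEYS) {
      const value = source[key];
      if (typeof value === "number" && value > 0) return value;
    }
  }
  return undefined; // caller falls back to the 8192 default
}
```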
Test-stability fixups. Spinner assertions in the chat/model-selector/providers tests hardcoded frame 0 (⠋), which raced against the 80ms spinner tick; they now match any frame in the cycle. Also bumped `flushInkFrames` from 25ms to 50ms to give ink input parsing more headroom on slower hosts.
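Illustratively, the relaxed assertion looks something like this fragment (assuming a vitest + ink-testing-library style harness; the repo's actual helpers may differ):

```ts
// The dots spinner cycles through ten braille frames at 80ms per tick,
// so matching any of them removes the race against the flush delay.
const SPINNER_FRAMES = "⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏";

// Before (flaky): expect(lastFrame()).toContain("⠋");
expect(lastFrame()).toMatch(new RegExp(`[${SPINNER_FRAMES}]`));
```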
Notes for Reviewers
`fetchMlxContextWindow` is synchronous (fs reads are cheap) but is called from the async `fetchContextWindow` wrapper, so the outer try/catch still catches any unexpected throw.
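The call shape is roughly the following (only the two function names come from this PR; the bodies are illustrative):

```ts
// Synchronous on purpose: just a handful of fs reads.
declare function fetchMlxContextWindow(model: string): number | undefined;

async function fetchContextWindow(model: string): Promise<number> {
  try {
    const detected = fetchMlxContextWindow(model);
    if (detected !== undefined) return detected;
  } catch {
    // Unexpected throw (malformed JSON, permissions) falls through.
  }
  return 8192; // existing default
}
```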
Snapshot layout is `models--<org>--<repo>/snapshots/<hash>/`; we take the first snapshot directory found. Multiple snapshots from stale revisions shouldn't matter in practice, since `max_position_embeddings` is architectural, not per-revision.
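Path resolution could look like this sketch (`resolveConfigPath` is hypothetical; the `HF_HOME`/`hub` fallback follows HuggingFace's documented cache layout):

```ts
import fs from "node:fs";
import os from "node:os";
import path from "node:path";

function resolveConfigPath(model: string): string | undefined {
  // Absolute local model paths are used as-is.
  if (path.isAbsolute(model)) return path.join(model, "config.json");

  // HF_HUB_CACHE wins, then HF_HOME/hub, then the default cache location.
  const hubCache =
    process.env.HF_HUB_CACHE ??
    (process.env.HF_HOME
      ? path.join(process.env.HF_HOME, "hub")
      : path.join(os.homedir(), ".cache", "huggingface", "hub"));

  const snapshots = path.join(
    hubCache,
    `models--${model.replace("/", "--")}`,
    "snapshots",
  );
  if (!fs.existsSync(snapshots)) return undefined;

  // First snapshot found; context size is architectural, so revisions agree.
  const [first] = fs.readdirSync(snapshots);
  return first ? path.join(snapshots, first, "config.json") : undefined;
}
```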