Releases: ollama/ollama
Releases · ollama/ollama
v0.21.0
Hermes Agent
ollama launch hermes
Hermes learns with you, automatically creating skills to better serve your workflows. Great for research and engineering tasks.
What's Changed
- Gemma 4 on MLX. Added support for running Gemma 4 via MLX on Apple Silicon, including a text-only MLX runtime for the model. The MLX backend also picked up mixed-precision quantization, better capability detection, and a batch of new op wrappers (Conv2d, Pad, activations, trig, masked SDPA, and RoPE-with-freqs).
- Hermes and GitHub Copilot CLI in
ollama launch. Added both integrations, which can now be configured in one command alongside the rest of the supported coding agents. - OpenCode moved to inline config.
ollama launch opencodenow writes its config inline rather than to a separate file, matching how other integrations are handled. ollama launchno longer rewrites config when nothing changed. Pressing → on a configured multi-model integration, or passing--modelwith the current primary, used to trigger a confirmation prompt and rewrite both the editor's config file andconfig.json. Now it's a no-op when the resolved model list matches what's already saved.- Fixed
ollama launch openclaw --yesso it correctly skips the channels configuration step, so non-interactive setups complete cleanly. - Restored the Gemma 4 nothink renderer with the e2b-style prompt.
- Fixed the Gemma 4 compiler error that was breaking Metal builds.
- Fixed macOS cross-compiles so they no longer trigger
generate, which was breaking cmake builds on some Xcode versions. - Quieted cgo builds by suppressing deprecated warnings during
go build.
Full Changelog: v0.20.7...v0.21.0
v0.20.8
What's Changed
- ROCm: Update to ROCm 7.2.1 on Linux by @saman-amd in #15483
- gemma4: fix nothink case renderer by @drifkin in #15553
- gemma4: fix compiler error on metal by @dhiltgen in #15550
- gemma4: add nothink renderer tests by @drifkin in #15554
- mlx: mixed-precision quant and capability detection improvements by @dhiltgen in #15409
- mlx: add op wrappers for Conv2d, Pad, activations, trig, and masked SDPA by @dhiltgen in #14913
- Revert "gemma4: add nothink renderer tests" by @drifkin in #15555
- cgo: suppress deprecated warning to quiet down go build by @dhiltgen in #15438
- mac: prevent generate on cross-compiles by @dhiltgen in #15120
- Revert "gemma4: fix nothink case renderer" by @drifkin in #15556
- launch/opencode: use inline config by @hoyyeva in #15462
- gemma4: restore e2b-style nothink prompt by @drifkin in #15560
- Gemma4 on MLX by @dhiltgen in #15244
Full Changelog: v0.20.6...v0.20.8-rc0
v0.20.7
What's Changed
- Fix quality of gemma:e2b and gemma:e4b when thinking is disabled
- ROCm: Update to ROCm 7.2.1 on Linux by @saman-amd in #15483
Full Changelog: v0.20.6...v0.20.7
v0.20.6
What's Changed
- Gemma 4 tool calling ability is improved and updated to use Google's latest post-launch fixes
- Parallel tool calling improved for streaming responses
- Hermes agent Ollama integration guide is now available
- Ollama app is updated to fix image attachment errors
New Contributors
@matteocelani made their first contribution in #15272
Full Changelog: v0.20.5...v0.20.6
v0.20.5
OpenClaw channel setup with ollama launch
What's Changed
- OpenClaw channel setup: connect WhatsApp, Telegram, Discord, and other messaging channels through
ollama launch openclaw - Enable flash attention for Gemma 4 on compatible GPUs
ollama launch opencodenow detects curl-based OpenCode installs at~/.opencode/bin- Fix
/savecommand for models imported from safetensors
New Contributors
Full Changelog: v0.20.4...v0.20.5
v0.20.4
What's Changed
- mlx: Improve M5 performance with NAX
- gemma4: enable flash attention
Full Changelog: v0.20.3...v0.20.4
v0.20.3
What's Changed
- Gemma 4 Tool Calling improvements
- Added latest models to Ollama App
- OpenClaw fixes for launching TUI
Full Changelog: v0.20.2...v0.20.3
v0.20.2
What's Changed
- app: default app home view to new chat instead of launch by @jmorganca in #15312
Full Changelog: v0.20.1...v0.20.2
v0.20.1
What's Changed
- bench: add prompt calibration, context size flag, and NumCtx reporting by @dhiltgen in #15158
- model/parsers: fix gemma4 arg parsing when quoted strings contain " by @drifkin in #15254
- ggml: skip cublasGemmBatchedEx during graph reservation by @jessegross in #15301
- gemma4: enable flash attention by @dhiltgen in #15296
- ggml: fix ROCm build for cublasGemmBatchedEx reserve wrapper by @jessegross in #15305
- model/parsers: rework gemma4 tool call handling by @drifkin in #15306
Full Changelog: v0.20.0...v0.20.1
v0.20.0
Gemma 4
Effective 2B (E2B)
ollama run gemma4:e2b
Effective 4B (E4B)
ollama run gemma4:e4b
26B (Mixture of Experts model with 4B active parameters)
ollama run gemma4:26b
31B (Dense)
ollama run gemma4:31b
What's Changed
- docs: update pi docs by @ParthSareen in #15152
- mlx: respect tokenizer add_bos_token setting in pipeline by @dhiltgen in #15185
- tokenizer: add SentencePiece-style BPE support by @dhiltgen in #15162
Full Changelog: v0.19.0...v0.20.0-rc0