
Releases: ollama/ollama

v0.15.0

21 Jan 22:40
c01608b

Pre-release

What's Changed

  • Clean up the manifest and modelpath by @pdevine in #13807
  • x/imagegen: remove qwen_image and qwen_image_edit models by @jmorganca in #13827
  • cmd: handle Enter key pressed during model loading, render multiline better by @ParthSareen in #13839
  • x/imagegen: replace memory estimation with actual weight size by @jmorganca in #13848
  • cmd: ollama config command to help configure integrations to use Ollama by @ParthSareen in #13712 (see the sketch after this list)
  • x/imagegen: add image edit capabilities by @jmorganca in #13846
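
A minimal sketch of the new ollama config command from #13712. These notes don't document its subcommands or flags, so the bare invocation below is an assumption for illustration only:

# Assumed invocation; exact subcommands and flags are not covered in these notes
ollama config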

Full Changelog: v0.14.3...v0.15.0-rc1

v0.14.3

16 Jan 19:53
d6dd430


New image generation models
  • Z-Image Turbo: 6 billion parameter text-to-image model from Alibaba’s Tongyi Lab. It generates high-quality photorealistic images.
  • Flux.2 Klein: Black Forest Labs’ fastest image-generation models to date.

New models

  • GLM-4.7-Flash: As the strongest model in the 30B class, GLM-4.7-Flash offers a new option for lightweight deployment that balances performance and efficiency.
  • LFM2.5-1.2B-Thinking: LFM2.5 is a new family of hybrid models designed for on-device deployment.

What's Changed

  • Fixed issue where Ollama's macOS app would interrupt system shutdown
  • Fixed ollama create and ollama show commands for experimental models
  • The /api/generate API can now be used for image generation (see the example after this list)
  • Fixed minor issues in Nemotron-3-Nano tool parsing
  • Fixed issue where removing an image generation model would cause it to be loaded first
  • Fixed issue where ollama rm would only stop the first model in the list if it was running
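
As a rough illustration of the /api/generate change above, the request below uses the standard /api/generate fields (model and prompt). The model name x/z-image-turbo is the experimental image model referenced in v0.14.1; the response format for generated images is not covered in these notes:

# Sketch: text-to-image request via /api/generate
curl http://localhost:11434/api/generate -d '{
  "model": "x/z-image-turbo",
  "prompt": "a watercolor painting of a lighthouse at dusk"
}'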

Full Changelog: v0.14.2...v0.14.3

v0.14.2

16 Jan 00:50
55d0b6e


New models

  • TranslateGemma: A new collection of open translation models built on Gemma 3, helping people communicate across 55 languages.

What's Changed

  • Shift + Enter (or Ctrl + j) will now insert a newline in Ollama's CLI
  • Improved the /v1/responses API to better conform to the OpenResponses specification (see the example after this list)
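
A minimal sketch of a request against the /v1/responses endpoint, assuming the OpenResponses-style model and input fields; the model name is only a placeholder:

# Sketch: OpenResponses-style request to Ollama
curl http://localhost:11434/v1/responses -d '{
  "model": "gemma3",
  "input": "Say hello in three languages."
}'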


Full Changelog: v0.14.1...v0.14.2

v0.14.1

14 Jan 19:02
4adb9cf


Image generation models (experimental)

Experimental image generation models are available for macOS and Linux (CUDA) in Ollama:

Available models

ollama run x/z-image-turbo

Note: x is a username on ollama.com where experimental models are uploaded

More models coming soon:

  1. Qwen-Image-2512
  2. Qwen-Image-Edit-2511
  3. GLM-Image

What's Changed

  • fix macOS auto-update signature verification failure


Full Changelog: v0.14.0...v0.14.1

v0.14.0

10 Jan 08:33
02a2401


What's Changed

  • ollama run --experimental will now open a new Ollama CLI that includes an agent loop and the bash tool
  • Anthropic API compatibility: support for the /v1/messages API (see the example after this list)
  • A new REQUIRES command for the Modelfile allows declaring which version of Ollama is required for the model (see the sketch after this list)
  • Fixed an integer underflow during memory estimation for older models on low-VRAM systems
  • More accurate VRAM measurements for AMD iGPUs
  • Ollama's app will now highlight Swift source code
  • An error is now returned when embeddings contain NaN or -Inf values
  • Ollama's Linux install bundles now use zstd compression
  • New experimental support for image generation models, powered by MLX
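
A minimal sketch of the Anthropic-compatible /v1/messages endpoint, using the standard Anthropic Messages request shape; the model name is only a placeholder:

# Sketch: Anthropic Messages-style request against Ollama
curl http://localhost:11434/v1/messages -d '{
  "model": "qwen3",
  "max_tokens": 256,
  "messages": [
    {"role": "user", "content": "Why is the sky blue?"}
  ]
}'

And a sketch of the new REQUIRES Modelfile command; the exact version syntax it accepts is an assumption here:

# Modelfile sketch; the version argument format is assumed
FROM llama3.2
REQUIRES 0.14.0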


Full Changelog: v0.13.5...v0.14.0-rc2

v0.13.5

18 Dec 16:39
7325791


New Models

  • FunctionGemma: a specialized version of Google's Gemma 3 270M model fine-tuned explicitly for function calling.

What's Changed

  • bert architecture models now run on Ollama's engine
  • Added built-in renderer & tool parsing capabilities for DeepSeek-V3.1
  • Fixed issue where nested properties in tools may not have been rendered properly


Full Changelog: v0.13.4...v0.13.5

v0.13.4

13 Dec 09:24
89eb795


New Models

  • Nemotron 3 Nano: a new standard for efficient, open, and intelligent agentic models
  • Olmo 3 and Olmo 3.1: a series of open language models designed to enable the science of language models. These models are pre-trained on the Dolma 3 dataset and post-trained on the Dolci datasets.

What's Changed

  • Flash attention is now enabled automatically for models by default (see the note after this list)
  • Fixed handling of long contexts with Gemma 3 models
  • Fixed issue that would occur with Gemma 3 QAT models or other models imported with the Gemma 3 architecture
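
Flash attention now requires no configuration. If it needs to be turned off for troubleshooting, the OLLAMA_FLASH_ATTENTION environment variable can still be set explicitly; whether "0" forces it off after this release is an assumption:

# Sketch: explicitly disabling flash attention when starting the server
OLLAMA_FLASH_ATTENTION=0 ollama serve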


Full Changelog: v0.13.3...v0.13.4-rc0

v0.13.3

09 Dec 02:14
709f842


New models

  • Devstral-Small-2: a 24B model that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents.
  • rnj-1: a family of 8B-parameter open-weight, dense models trained from scratch by Essential AI, optimized for code and STEM with capabilities on par with SOTA open-weight models.
  • nomic-embed-text-v2: nomic-embed-text-v2-moe is a multilingual MoE text embedding model that excels at multilingual retrieval.

What's Changed

  • Improved truncation logic when using /api/embed and /v1/embeddings (see the example after this list)
  • Extended the Gemma 3 architecture to support the rnj-1 model
  • Fixed an error that would occur when running qwen2.5vl with image input
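
A minimal /api/embed request for reference; the truncate field shown is an existing option on this endpoint and presumably what the improved truncation logic applies to, and the model name is only an example:

# Sketch: embedding request with the truncation flag set explicitly
curl http://localhost:11434/api/embed -d '{
  "model": "nomic-embed-text",
  "input": "Why is the sky blue?",
  "truncate": true
}'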

Full Changelog: v0.13.2...v0.13.3

v0.13.2

04 Dec 04:39
0c78723


New models

  • Qwen3-Next: The first installment in the Qwen3-Next series with strong performance in terms of both parameter efficiency and inference speed.

What's Changed

  • Flash attention is now enabled by default for vision models such as mistral-3, gemma3, qwen3-vl and more. This improves memory utilization and performance when providing images as input.
  • Fixed GPU detection on multi-GPU CUDA machines
  • Fixed issue where deepseek-v3.1 would always think even when thinking is disabled in Ollama's app


Full Changelog: v0.13.1...v0.13.2

v0.13.1

27 Nov 02:48


New models

  • Ministral-3: The Ministral 3 family is designed for edge deployment, capable of running on a wide range of hardware.
  • Mistral-Large-3: A general-purpose multimodal mixture-of-experts model for production-grade tasks and enterprise workloads.

What's Changed

  • nomic-embed-text will now use Ollama's engine by default
  • Tool calling support for cogito-v2.1
  • Fixed issues with CUDA VRAM discovery
  • Fixed link to docs in Ollama's app
  • Fixed issue where models would be evicted on CPU-only systems
  • Ollama will now render errors more clearly instead of showing raw Unmarshal: errors
  • Fixed issue where older CUDA GPUs would fail to be detected
  • Added thinking and tool parsing for cogito-v2.1


Full Changelog: v0.13.0...v0.13.1