Update VLM model to Qwen3-VL-4B-Instruct-GGUF by kovtcharov · Pull Request #226 · amd/gaia

kovtcharov · 2026-01-22T18:00:58Z

Summary

This PR updates the Vision Language Model (VLM) from Qwen2.5-VL-7B to Qwen3-VL-4B across the entire codebase.

Changes

VLM Model Updates

LemonadeClient: Update VLM model definition from qwen2.5-vl-7b to qwen3-vl-4b
- Model ID: Qwen3-VL-4B-Instruct-GGUF (was Qwen2.5-VL-7B-Instruct-GGUF)
- Update agent profiles: chat, rag, vlm, and mcp agents now use Qwen3-VL-4B
Documentation: Update VLM references in CLAUDE.md, README.md, EMR agent README
Tests: Update all VLM model references in integration tests
CI/CD: Update RAG workflow to use Qwen3-VL-4B
Scripts: Update PowerShell start-lemonade script examples
Claude Agents: Update RAG specialist agent example
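
As a rough illustration, the swap amounts to changing one model ID and every agent profile that points at it. The sketch below is hypothetical: `VLM_MODEL_ID`, `AGENT_PROFILES`, and `vlm_for` are illustrative names and may not match the actual structure of `src/gaia/llm/lemonade_client.py`. Only the model IDs and the profile names are taken from this PR.

```python
# Hypothetical sketch; the real definitions in lemonade_client.py may differ.
OLD_VLM_MODEL_ID = "Qwen2.5-VL-7B-Instruct-GGUF"  # model ID before this PR
VLM_MODEL_ID = "Qwen3-VL-4B-Instruct-GGUF"        # model ID after this PR

# Illustrative profile table: each agent profile names the VLM it relies on.
AGENT_PROFILES = {
    "chat": {"vlm": VLM_MODEL_ID},
    "rag": {"vlm": VLM_MODEL_ID},
    "vlm": {"vlm": VLM_MODEL_ID},
    "mcp": {"vlm": VLM_MODEL_ID},
}


def vlm_for(profile: str) -> str:
    """Return the VLM model ID configured for an agent profile."""
    return AGENT_PROFILES[profile]["vlm"]
```

Keeping the ID in a single constant means a future model change touches one line instead of every profile.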

Additional Improvements

Lint Tool: Add pre-check to verify GAIA package is installed before running import validation
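
One way such a pre-check could work is to ask the import system whether the package can be located, without actually importing it. This is a minimal sketch using the standard library's `importlib.util.find_spec`; the function names `is_package_installed` and `precheck` are assumptions for illustration, not the code in `util/lint.py`.

```python
import importlib.util
import sys


def is_package_installed(name: str) -> bool:
    """Return True if `name` can be located without importing it."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec can raise for missing parent packages or broken specs.
        return False


def precheck(package: str = "gaia") -> int:
    """Return 0 if the package is importable, otherwise warn and return 1."""
    if is_package_installed(package):
        return 0
    print(
        f"Package '{package}' is not installed; install it before "
        "running import validation.",
        file=sys.stderr,
    )
    return 1
```

A lint script could call `precheck()` first and exit early on a non-zero result, so that a missing install is reported once rather than as a long list of import failures.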

Files Changed (11 total)

.claude/agents/rag-specialist.md - Update RAG VLM example
.github/workflows/test_rag.yml - Update CI VLM model
CLAUDE.md - Update default VLM model reference
README.md - Update vision models feature description
scripts/start-lemonade.ps1 - Update example command
src/gaia/agents/emr/README.md - Update VLM extraction reference
src/gaia/llm/lemonade_client.py - Update VLM model definition and agent profiles
tests/test_lemonade_client.py - Update test assertions
tests/test_rag_integration.py - Update required models list
tests/test_vlm_integration.py - Update VLM initialization
util/lint.py - Add GAIA installation pre-check

Testing

Verified VLM model references are consistent across codebase
Updated all agent profiles (chat, rag, vlm, mcp)
Updated test assertions to match new model
Updated CI workflows to use Qwen3-VL-4B
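
A consistency check of this kind can be approximated by scanning file contents for the old model name. The sketch below is an assumption about how one might automate it, not the procedure used in this PR; `find_stale_references` is a hypothetical helper that takes a mapping of path to file text so it can be exercised without touching the filesystem.

```python
import re

# Matches the old model name in either spelling used in this PR
# (model ID "Qwen2.5-VL-7B..." or short key "qwen2.5-vl-7b").
STALE_VLM = re.compile(r"qwen2\.5-vl-7b", re.IGNORECASE)


def find_stale_references(files: dict[str, str]) -> list[tuple[str, int, str]]:
    """Return (path, line_number, line) for every line naming the old model."""
    hits = []
    for path, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            if STALE_VLM.search(line):
                hits.append((path, lineno, line.strip()))
    return hits


sample = {
    "README.md": "Vision models: Qwen3-VL-4B-Instruct-GGUF",
    "old_notes.md": "uses Qwen2.5-VL-7B-Instruct-GGUF\nnothing here",
}
print(find_stale_references(sample))
# → [('old_notes.md', 1, 'uses Qwen2.5-VL-7B-Instruct-GGUF')]
```

An empty result would indicate that no file in the scanned set still names the old model.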

Impact

Users will need to download the new Qwen3-VL-4B-Instruct-GGUF model (~3.3 GB)
The smaller 4B model provides faster inference while maintaining good quality
All vision-enabled agents (chat, rag, emr, mcp) will use the updated model

- Add `gaia install --lemonade` to install Lemonade Server
- Add `gaia uninstall --lemonade` to uninstall (downloads matching MSI)
- Add `gaia kill --lemonade` to kill Lemonade server on port 8000
- Add minimal installer support for `--profile minimal`
- Update CLI reference docs with new commands
- Update quickstart/setup docs with `gaia init` step
- Add unit tests for minimal installer URL patterns

- Restore Qwen3-0.6B-GGUF model in test_gaia_cli_linux.yml workflow
- Restore lemonade-server-dev commands in Linux CI
- Update default VLM model to Qwen3-VL-4B-Instruct-GGUF in documentation
- Update vision model references in README and EMR agent docs

Co-Authored-By: Claude Sonnet 4.5 (1M context) <[email protected]>

kovtcharov added 7 commits January 21, 2026 17:39

gaia init functionality

53d5472

update installer plan

723e9c6

Improve model download progress display and error handling

c3ed0f4

- Increase download timeout to 2 hours for large models
- Show filename, speed, and detailed progress info
- Add specific error handling for timeout/connection errors
- List models that need downloading before starting
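
The behaviour described in this commit message can be sketched with two small helpers: one that builds a progress line from the filename, bytes transferred, and elapsed time, and one that turns timeout and connection errors into specific messages. This is an illustration only; `DOWNLOAD_TIMEOUT_S`, `format_progress`, and `describe_failure` are hypothetical names, and the actual implementation in GAIA may be structured differently.

```python
DOWNLOAD_TIMEOUT_S = 2 * 60 * 60  # 2 hours, as stated in the commit message


def format_progress(filename: str, done: int, total: int, elapsed_s: float) -> str:
    """Build a one-line progress string with filename, percent, size, and speed."""
    mb = 1024 * 1024
    percent = 100.0 * done / total if total else 0.0
    speed = done / elapsed_s / mb if elapsed_s > 0 else 0.0
    return (
        f"{filename}: {percent:.1f}% "
        f"({done / mb:.1f}/{total / mb:.1f} MB) at {speed:.1f} MB/s"
    )


def describe_failure(exc: Exception) -> str:
    """Map common network failures to a specific message."""
    if isinstance(exc, TimeoutError):
        hours = DOWNLOAD_TIMEOUT_S // 3600
        return f"Download timed out after {hours} hours; try again."
    if isinstance(exc, ConnectionError):
        return "Connection failed; check that the server is running and reachable."
    return f"Download failed: {exc}"


print(format_progress("model.gguf", 512 * 1024 * 1024, 1024 * 1024 * 1024, 10.0))
# → model.gguf: 50.0% (512.0/1024.0 MB) at 51.2 MB/s
```

Checking `TimeoutError` before `ConnectionError` keeps the two cases distinct, since both derive from `OSError` and a broader handler would blur them.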

rich console, force model download

9394736

rich console for model load

3816920

update to qwen3-vl

1871f07

kovtcharov self-assigned this Jan 22, 2026

kovtcharov added the ready_for_ci label (Run CI workflows on draft PR without requesting review) Jan 22, 2026

github-actions bot added the following labels Jan 22, 2026: documentation (Documentation changes), dependencies (Dependency updates), devops (DevOps/infrastructure changes), agents (Agent system changes), rag (RAG system changes), llm (LLM backend changes), cli (CLI changes), tests (Test changes), performance (Performance-critical changes)

kovtcharov and others added 3 commits January 23, 2026 01:28

Merge remote-tracking branch 'origin/main' into kalin/vl

5f50fef

Merge remote-tracking branch 'origin/main' into kalin/vl

7a09e87

Merge branch 'main' into kalin/vl

324bc74

kovtcharov added this to the v0.15.4 milestone Jan 28, 2026

kovtcharov modified the milestones: v0.15.4, v0.16.0 Feb 5, 2026

Claude Code and others added 2 commits February 9, 2026 16:14

Merge remote-tracking branch 'origin/main' into kalin/vl

76d88a8

kovtcharov-amd approved these changes Feb 10, 2026

kovtcharov-amd marked this pull request as ready for review February 10, 2026 03:20

Merge branch 'main' into kalin/vl

4e3c937

kovtcharov-amd enabled auto-merge February 10, 2026 03:20

kovtcharov changed the title ~~Kalin/vl~~ Add gaia init command and update VLM to Qwen3-VL-4B Feb 10, 2026

kovtcharov changed the title ~~Add gaia init command and update VLM to Qwen3-VL-4B~~ Update VLM model to Qwen3-VL-4B-Instruct-GGUF Feb 10, 2026

kovtcharov-amd added this pull request to the merge queue Feb 10, 2026

Merged via the queue into main with commit 3295114 Feb 10, 2026
51 checks passed

kovtcharov-amd deleted the kalin/vl branch February 10, 2026 03:44

kovtcharov-amd merged 13 commits into main from kalin/vl
