Update VLM model to Qwen3-VL-4B-Instruct-GGUF #226

Merged
kovtcharov-amd merged 13 commits into main from kalin/vl on Feb 10, 2026

@kovtcharov kovtcharov commented Jan 22, 2026

Summary

This PR updates the Vision Language Model (VLM) from Qwen2.5-VL-7B to Qwen3-VL-4B across the entire codebase.

Changes

VLM Model Updates

  • LemonadeClient: Update VLM model definition from qwen2.5-vl-7b to qwen3-vl-4b
    • Model ID: Qwen3-VL-4B-Instruct-GGUF (was Qwen2.5-VL-7B-Instruct-GGUF)
    • Update agent profiles: chat, rag, vlm, and mcp agents now use Qwen3-VL-4B
  • Documentation: Update VLM references in CLAUDE.md, README.md, EMR agent README
  • Tests: Update all VLM model references in integration tests
  • CI/CD: Update RAG workflow to use Qwen3-VL-4B
  • Scripts: Update PowerShell start-lemonade script examples
  • Claude Agents: Update RAG specialist agent example
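The model and profile change listed above can be pictured roughly as follows. This is a minimal sketch, not the contents of `src/gaia/llm/lemonade_client.py`: the names `ModelDefinition`, `MODELS`, `AGENT_VLM` and `vlm_model_id_for` are hypothetical, and only the model keys, model IDs and agent names come from this PR.

```python
# Hypothetical sketch: the real structures in lemonade_client.py may differ.
from dataclasses import dataclass

OLD_VLM_MODEL_ID = "Qwen2.5-VL-7B-Instruct-GGUF"  # replaced by this PR
NEW_VLM_MODEL_ID = "Qwen3-VL-4B-Instruct-GGUF"    # new default VLM


@dataclass(frozen=True)
class ModelDefinition:
    key: str       # short lookup name, e.g. "qwen3-vl-4b"
    model_id: str  # identifier of the GGUF model to load


# Registry of known models, keyed by short name (assumed layout).
MODELS = {
    "qwen3-vl-4b": ModelDefinition(key="qwen3-vl-4b", model_id=NEW_VLM_MODEL_ID),
}

# Agent profiles named in the PR, each mapped to the VLM key it uses.
AGENT_VLM = {
    "chat": "qwen3-vl-4b",
    "rag": "qwen3-vl-4b",
    "vlm": "qwen3-vl-4b",
    "mcp": "qwen3-vl-4b",
}


def vlm_model_id_for(agent: str) -> str:
    """Return the VLM model ID configured for the given agent profile."""
    return MODELS[AGENT_VLM[agent]].model_id
```

Keeping the model ID in one registry entry and referring to it by key from each profile means a future model swap touches a single definition rather than every profile.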

Additional Improvements

  • Lint Tool: Add pre-check to verify GAIA package is installed before running import validation
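One way such a pre-check could work is sketched below. It assumes the package is importable under the top-level name `gaia`; the function names and the message text are illustrative and are not taken from `util/lint.py`.

```python
# Hypothetical sketch of an installation pre-check; not the code in util/lint.py.
import importlib.util
import sys


def is_package_installed(name: str) -> bool:
    """Return True if a top-level package can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def precheck(package: str = "gaia") -> int:
    """Return 0 if the package is importable, 1 otherwise (shell-style exit code)."""
    if is_package_installed(package):
        return 0
    print(
        f"{package!r} is not installed; import validation cannot run.",
        file=sys.stderr,
    )
    return 1
```

Using `importlib.util.find_spec` locates the package without executing its import-time code, so a missing installation is reported as one clear message instead of a cascade of import failures.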

Files Changed (11 total)

  • .claude/agents/rag-specialist.md - Update RAG VLM example
  • .github/workflows/test_rag.yml - Update CI VLM model
  • CLAUDE.md - Update default VLM model reference
  • README.md - Update vision models feature description
  • scripts/start-lemonade.ps1 - Update example command
  • src/gaia/agents/emr/README.md - Update VLM extraction reference
  • src/gaia/llm/lemonade_client.py - Update VLM model definition and agent profiles
  • tests/test_lemonade_client.py - Update test assertions
  • tests/test_rag_integration.py - Update required models list
  • tests/test_vlm_integration.py - Update VLM initialization
  • util/lint.py - Add GAIA installation pre-check

Testing

  • Verified VLM model references are consistent across codebase
  • Updated all agent profiles (chat, rag, vlm, mcp)
  • Updated test assertions to match new model
  • Updated CI workflows to use Qwen3-VL-4B
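The consistency check in the first bullet could be automated along these lines. This is a sketch under assumptions: the helper names are hypothetical, and the set of file suffixes is simply inferred from the files listed in this PR.

```python
# Hypothetical helper for spotting leftover references to the old model ID.
from pathlib import Path

OLD_MODEL_ID = "Qwen2.5-VL-7B-Instruct-GGUF"
TEXT_SUFFIXES = {".py", ".md", ".yml", ".ps1"}  # assumed from the changed files


def stale_lines(text: str, needle: str = OLD_MODEL_ID) -> list[int]:
    """Return the 1-based line numbers in `text` that contain `needle`."""
    return [n for n, line in enumerate(text.splitlines(), start=1) if needle in line]


def find_stale_references(root: Path, needle: str = OLD_MODEL_ID) -> list[tuple[Path, int]]:
    """Walk `root` and collect (path, line number) pairs that mention `needle`."""
    hits = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in TEXT_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits.extend((path, n) for n in stale_lines(text, needle))
    return hits
```

Calling `find_stale_references(Path("."))` from the repository root would return an empty list once every reference has been migrated.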

Impact

  • Users will need to download the new Qwen3-VL-4B-Instruct-GGUF model (~3.3 GB)
  • The smaller 4B model provides faster inference while maintaining good quality
  • All vision-enabled agents (chat, rag, emr, mcp) will use the updated model

- Add `gaia install --lemonade` to install Lemonade Server
- Add `gaia uninstall --lemonade` to uninstall (downloads matching MSI)
- Add `gaia kill --lemonade` to kill Lemonade server on port 8000
- Add minimal installer support for --profile minimal
- Update CLI reference docs with new commands
- Update quickstart/setup docs with gaia init step
- Add unit tests for minimal installer URL patterns
- Increase download timeout to 2 hours for large models
- Show filename, speed, and detailed progress info
- Add specific error handling for timeout/connection errors
- List models that need downloading before starting
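The download-related items in this commit message (2-hour timeout, filename and speed in the progress output, dedicated handling for timeout and connection errors) might look roughly like the sketch below. All names and message strings are hypothetical. The error classification uses only Python's built-in `TimeoutError` and `ConnectionError`; whatever HTTP client the project uses may raise its own exception types instead.

```python
# Hypothetical sketch of progress formatting and error classification.
DOWNLOAD_TIMEOUT_SECONDS = 2 * 60 * 60  # 2 hours, per the commit message


def format_progress(filename: str, done_bytes: int, total_bytes: int, elapsed_s: float) -> str:
    """Build a one-line progress message with filename, percentage, size and speed."""
    percent = 100.0 * done_bytes / total_bytes if total_bytes else 0.0
    speed_mb_s = (done_bytes / 1_000_000) / elapsed_s if elapsed_s > 0 else 0.0
    return (
        f"{filename}: {percent:.1f}% "
        f"({done_bytes / 1_000_000:.1f}/{total_bytes / 1_000_000:.1f} MB) "
        f"at {speed_mb_s:.1f} MB/s"
    )


def describe_download_error(exc: Exception) -> str:
    """Map common failure types to a user-facing message."""
    if isinstance(exc, TimeoutError):
        return f"Download timed out after {DOWNLOAD_TIMEOUT_SECONDS} seconds."
    if isinstance(exc, ConnectionError):
        return "Connection failed; check that the server is reachable and retry."
    return f"Download failed: {exc}"
```

For example, `format_progress("model.gguf", 500_000_000, 1_000_000_000, 10.0)` returns `"model.gguf: 50.0% (500.0/1000.0 MB) at 50.0 MB/s"`.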
@kovtcharov kovtcharov self-assigned this Jan 22, 2026
@kovtcharov kovtcharov added the ready_for_ci label (Run CI workflows on draft PR without requesting review) Jan 22, 2026
@github-actions github-actions bot added the documentation, dependencies, devops, agents, rag, llm, cli, tests, and performance labels Jan 22, 2026
@kovtcharov kovtcharov added this to the v0.15.4 milestone Jan 28, 2026
@kovtcharov kovtcharov modified the milestones: v0.15.4, v0.16.0 Feb 5, 2026
Claude Code and others added 2 commits February 9, 2026 16:14
- Restore Qwen3-0.6B-GGUF model in test_gaia_cli_linux.yml workflow
- Restore lemonade-server-dev commands in Linux CI
- Update default VLM model to Qwen3-VL-4B-Instruct-GGUF in documentation
- Update vision model references in README and EMR agent docs

Co-Authored-By: Claude Sonnet 4.5 (1M context) <[email protected]>
@kovtcharov-amd kovtcharov-amd marked this pull request as ready for review February 10, 2026 03:20
@kovtcharov kovtcharov changed the title from "Kalin/vl" to "Add gaia init command and update VLM to Qwen3-VL-4B" Feb 10, 2026
@kovtcharov kovtcharov changed the title from "Add gaia init command and update VLM to Qwen3-VL-4B" to "Update VLM model to Qwen3-VL-4B-Instruct-GGUF" Feb 10, 2026
@kovtcharov-amd kovtcharov-amd added this pull request to the merge queue Feb 10, 2026
Merged via the queue into main with commit 3295114 Feb 10, 2026
51 checks passed
@kovtcharov-amd kovtcharov-amd deleted the kalin/vl branch February 10, 2026 03:44
Labels

  • agents: Agent system changes
  • cli: CLI changes
  • dependencies: Dependency updates
  • devops: DevOps/infrastructure changes
  • documentation: Documentation changes
  • llm: LLM backend changes
  • performance: Performance-critical changes
  • rag: RAG system changes
  • ready_for_ci: Run CI workflows on draft PR without requesting review
  • tests: Test changes
