[telegram] Images not being passed to local Ollama vision model (qwen2.5vl) #7564

@PavDev3

Description

Telegram images are received by the gateway but not passed to the configured imageModel when using a local Ollama vision model.

Environment

  • Clawdbot version: 2026.1.24-3
  • OS: Windows 11
  • Ollama version: not provided (run ollama --version to check)
  • Node version: 22.16.0

Configuration

{
  "agents": {
    "defaults": {
      "imageModel": {
        "primary": "ollama/qwen2.5vl:7b"
      },
      "model": {
        "primary": "ollama/qwen3-coder:latest"
      }
    }
  },
  "models": {
    "providers": {
      "ollama": {
        "api": "openai-completions",
        "baseUrl": "http://127.0.0.1:11434/v1",
        "models": [
          {
            "id": "qwen2.5vl:7b",
            "input": ["text", "image"]
          }
        ]
      }
    }
  }
}
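
As a sanity check on the configuration above, the imageModel reference can be resolved against the provider's model list to confirm the entry declares image input. This is only a sketch: the type shapes are inferred from the JSON shown here, and splitModelRef and imageModelAcceptsImages are hypothetical helper names, not Clawdbot functions.

```typescript
// Hypothetical helpers, not part of Clawdbot. Shapes are inferred from the config above.
type ModelEntry = { id: string; input?: string[] };
type ProviderConfig = { api?: string; baseUrl?: string; models?: ModelEntry[] };
type GatewayConfig = {
  agents?: { defaults?: { imageModel?: { primary?: string }; model?: { primary?: string } } };
  models?: { providers?: Record<string, ProviderConfig> };
};

function splitModelRef(ref: string): { provider: string; modelId: string } {
  // Split on the first "/" only, so ids such as "qwen2.5vl:7b" stay intact.
  const slash = ref.indexOf("/");
  if (slash < 0) throw new Error(`expected "provider/model", got "${ref}"`);
  return { provider: ref.slice(0, slash), modelId: ref.slice(slash + 1) };
}

function imageModelAcceptsImages(config: GatewayConfig): boolean {
  const ref = config.agents?.defaults?.imageModel?.primary;
  if (!ref) return false;
  const { provider, modelId } = splitModelRef(ref);
  const models = config.models?.providers?.[provider]?.models ?? [];
  // True only if the referenced model exists and lists "image" among its inputs.
  return models.some((m) => m.id === modelId && (m.input ?? []).some((kind) => kind === "image"));
}

const exampleConfig: GatewayConfig = {
  agents: {
    defaults: {
      imageModel: { primary: "ollama/qwen2.5vl:7b" },
      model: { primary: "ollama/qwen3-coder:latest" },
    },
  },
  models: {
    providers: {
      ollama: {
        api: "openai-completions",
        baseUrl: "http://127.0.0.1:11434/v1",
        models: [{ id: "qwen2.5vl:7b", input: ["text", "image"] }],
      },
    },
  },
};

console.log(imageModelAcceptsImages(exampleConfig)); // true
```

With the configuration from this report the check passes, which suggests the problem is in how the gateway routes the message rather than in the config itself.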

Steps to Reproduce

  1. Configure Ollama with qwen2.5vl:7b vision model
  2. Set imageModel.primary to ollama/qwen2.5vl:7b
  3. Send an image via Telegram DM to the bot
  4. Observe that the bot replies using the text model (qwen3-coder) and says it cannot see images

Expected Behavior

When an image is sent via Telegram, the gateway should:

  1. Detect the image attachment
  2. Switch to the configured imageModel (qwen2.5vl:7b)
  3. Pass the image to the vision model for analysis
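
The expected routing can be sketched roughly as follows. All type and function names here (Attachment, InboundMessage, selectModel) are hypothetical illustrations of the behavior described above, not Clawdbot's actual internals.

```typescript
// Hypothetical shapes: Clawdbot's real message and config types may differ.
type Attachment = { kind: "photo" | "document" | "audio"; mimeType?: string };
type InboundMessage = { text?: string; attachments: Attachment[] };
type AgentDefaults = { model: { primary: string }; imageModel?: { primary: string } };

function isImage(attachment: Attachment): boolean {
  // Assumption: photos count as images, and so do documents with an image/* MIME type
  // (images sent "as file" in Telegram typically arrive as documents).
  return attachment.kind === "photo" || /^image\//.test(attachment.mimeType ?? "");
}

function selectModel(message: InboundMessage, defaults: AgentDefaults): string {
  const hasImage = message.attachments.some(isImage);
  // Fall back to the text model when no imageModel is configured.
  return hasImage && defaults.imageModel ? defaults.imageModel.primary : defaults.model.primary;
}

const agentDefaults: AgentDefaults = {
  model: { primary: "ollama/qwen3-coder:latest" },
  imageModel: { primary: "ollama/qwen2.5vl:7b" },
};

console.log(selectModel({ attachments: [{ kind: "photo" }] }, agentDefaults)); // ollama/qwen2.5vl:7b
console.log(selectModel({ text: "hi", attachments: [] }, agentDefaults)); // ollama/qwen3-coder:latest
```

The selected model would then receive the image payload along with any caption text.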

Actual Behavior

  • The text model (qwen3-coder) responds instead of the vision model
  • Logs show no mention of media, photo, image, or imageModel
  • The bot says "I'm receiving images but cannot see them"

Verification

The Ollama /v1/chat/completions endpoint does support images. This works correctly:

$imageBase64 = [Convert]::ToBase64String([System.IO.File]::ReadAllBytes("image.png"))
$body = @{
    model = "qwen2.5vl:7b"
    messages = @(@{
        role = "user"
        content = @(
            @{ type = "image_url"; image_url = @{ url = "data:image/png;base64,$imageBase64" } },
            @{ type = "text"; text = "What do you see?" }
        )
    })
} | ConvertTo-Json -Depth 10

Invoke-RestMethod -Uri "http://127.0.0.1:11434/v1/chat/completions" -Method Post -ContentType "application/json" -Body $body
# Returns 450+ prompt tokens, confirming image was processed
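
For comparison, this is roughly the payload the gateway would need to build to match the PowerShell request above. It is a sketch only: buildVisionRequest is a hypothetical name, and the base64 string is a placeholder rather than real image data.

```typescript
// Mirrors the request body from the PowerShell test above; not Clawdbot code.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

type ChatRequest = {
  model: string;
  messages: { role: "user"; content: ContentPart[] }[];
};

function buildVisionRequest(
  model: string,
  imageBase64: string,
  mimeType: string,
  prompt: string,
): ChatRequest {
  return {
    model,
    messages: [
      {
        role: "user",
        content: [
          // The image travels inline as a data URL, the same way as in the PowerShell test.
          { type: "image_url", image_url: { url: `data:${mimeType};base64,${imageBase64}` } },
          { type: "text", text: prompt },
        ],
      },
    ],
  };
}

// "BASE64_IMAGE_DATA" is a placeholder; pass the base64-encoded file contents for a real request.
const requestBody = buildVisionRequest("qwen2.5vl:7b", "BASE64_IMAGE_DATA", "image/png", "What do you see?");
console.log(JSON.stringify(requestBody, null, 2));
```

The serialized body can be POSTed to http://127.0.0.1:11434/v1/chat/completions with Content-Type application/json, as in the PowerShell call.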

Logs

embedded run start: runId=... provider=ollama model=qwen3-coder:latest thinking=off messageChannel=telegram

No log lines show image detection or a switch to the imageModel.

Labels: bug