[Bug]: onboard-custom hardcodes input: ["text"] for non-Azure custom providers, silently disabling image/vision support #51869

@Antsoldier1974

Description
When configuring a custom provider (non-Azure) via openclaw onboard --custom, the generated model config in openclaw.json always sets input: ["text"], with no option to include "image". This silently disables vision/image support for all models under that provider, even when the underlying models (Claude, GPT, Gemini, Qwen, etc.) fully support image input.

Root Cause

File: src/commands/onboard-custom.ts, line ~666:

// Non-Azure path (all custom providers)
{
    id: modelId,
    name: `${modelId} (Custom Provider)`,
    contextWindow: DEFAULT_CONTEXT_WINDOW,
    maxTokens: DEFAULT_MAX_TOKENS,
    input: ["text"] as ["text"],  // ← hardcoded, no image option
    ...
}

Compare with the Azure path, which at least detects reasoning models and adds "image":

// Azure path
input: isLikelyReasoningModel
    ? (["text", "image"] as Array<"text" | "image">)
    : (["text"] as ["text"]),

Impact

  • Users who set up custom providers (OpenAI-compatible endpoints like apiyi, one-api, etc.) with vision-capable models (Claude, GPT-4o, Gemini, Qwen-VL) cannot send images to agents
  • The gateway receives image attachments but silently drops them because the model config declares input: ["text"] only
  • No error or warning is shown — the agent just never receives the image
  • Users have no way to fix this through the onboard wizard; manual JSON editing is required

Steps to Reproduce

  1. Run openclaw onboard --custom with any non-Azure provider
  2. Add a vision-capable model (e.g., claude-sonnet-4-6)
  3. Check ~/.openclaw/openclaw.json — the model will have input: ["text"]
  4. Try sending an image via any channel — agent receives only text

Expected Behavior

  • The onboard wizard should detect common vision-capable models by name pattern (e.g., claude, gpt-4o, gpt-5, gemini, qwen-vl) and default to ["text", "image"]
  • Or: prompt the user to confirm whether the model supports image input
  • Or: default all models to ["text", "image"] since the gateway already validates attachments
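The first suggestion could be sketched as a small helper in the onboard wizard. The function name and the exact pattern list below are assumptions for illustration, not existing OpenClaw code:

```typescript
// Hypothetical sketch: infer input modalities from the model id by name
// pattern. The pattern list is an assumption and would need maintenance.
const VISION_MODEL_PATTERNS: RegExp[] = [
  /claude/i,
  /gpt-4o/i,
  /gpt-5/i,
  /gemini/i,
  /qwen.*vl/i,
];

function inferInputModalities(modelId: string): Array<"text" | "image"> {
  const isVisionCapable = VISION_MODEL_PATTERNS.some((p) => p.test(modelId));
  return isVisionCapable ? ["text", "image"] : ["text"];
}

// The non-Azure branch could then use:
//   input: inferInputModalities(modelId),
console.log(inferInputModalities("claude-sonnet-4-6")); // → [ 'text', 'image' ]
console.log(inferInputModalities("text-embedding-3-small")); // → [ 'text' ]
```

Pattern matching will always lag behind new model names, which is why combining it with the second suggestion (a confirmation prompt) would give users an escape hatch for unrecognized models.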

Workaround

Manually edit ~/.openclaw/openclaw.json and change input: ["text"] to input: ["text", "image"] for each affected model, then restart the gateway.
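For configs with many affected models, the manual edit can be scripted. The walk below makes no assumption about where models live in the JSON tree, but the demo config shape is made up; verify the result before restarting the gateway:

```typescript
// Hypothetical helper for automating the workaround: walk a parsed
// openclaw.json and add "image" to every input: ["text"] declaration.
function enableImageInput(node: unknown): void {
  if (Array.isArray(node)) {
    node.forEach(enableImageInput);
  } else if (node !== null && typeof node === "object") {
    const obj = node as Record<string, unknown>;
    const input = obj.input;
    if (Array.isArray(input) && input.includes("text") && !input.includes("image")) {
      input.push("image");
    }
    Object.values(obj).forEach(enableImageInput);
  }
}

// In-memory demo with a minimal, made-up config shape:
const config = {
  providers: {
    apiyi: {
      models: [{ id: "claude-sonnet-4-6", input: ["text"] }],
    },
  },
};
enableImageInput(config);
console.log(config.providers.apiyi.models[0].input); // → [ 'text', 'image' ]

// To patch the real file (back it up first):
//   const fs = require("fs");
//   const p = `${process.env.HOME}/.openclaw/openclaw.json`;
//   const raw = JSON.parse(fs.readFileSync(p, "utf8"));
//   enableImageInput(raw);
//   fs.writeFileSync(p, JSON.stringify(raw, null, 2));
```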

Environment

  • OpenClaw version: 2026.3.13
  • OS: Linux
  • Provider: apiyi (OpenAI-compatible)
