Member

@daniel-lxs daniel-lxs commented Aug 28, 2025

Summary

Adds an experimental image generation feature using the OpenRouter API.

Changes

  • New generate_image tool for AI-driven image creation
  • Settings UI for OpenRouter API key and model selection
  • Image viewer with zoom, copy, and save functionality
  • Integration with approval system and file permissions
  • i18n support with translation strings

Implementation

  • Tool definition: src/core/prompts/tools/generate-image.ts
  • Tool handler: src/core/tools/generateImageTool.ts
  • UI components: ImageViewer, ImageBlock, ImageGenerationSettings
  • Auto-approval when write permissions granted
  • Collapsible approval dialog with file path and prompt

Testing

  • TypeScript compilation passes
  • Manual testing completed

Models

  • Gemini 2.5 Flash Image Preview (initial support)
  • Additional models can be added as available

Important

Adds an AI-driven image generation tool using OpenRouter API with UI components, i18n support, and integration into the existing system.

  • Behavior:
    • Adds generate_image tool for AI-driven image creation using OpenRouter API.
    • Integrates with approval system and file permissions.
    • Supports auto-approval when write permissions are granted.
    • Includes i18n support with translation strings in multiple languages.
  • Implementation:
    • Tool definition in generate-image.ts and handler in generateImageTool.ts.
    • UI components: ImageViewer, ImageBlock, ImageGenerationSettings.
    • Adds generateImage() method in openrouter.ts for API interaction.
  • Testing:
    • TypeScript compilation passes.
    • Manual testing completed.
  • Models:
    • Initial support for Gemini 2.5 Flash Image Preview.
    • Additional models can be added as available.

This description was created by Ellipsis for 7a5db06.

- Add experimental image generation feature using OpenRouter API
- Implement generate_image tool for AI-driven image creation
- Add ImageViewer component with zoom, copy, and save functionality
- Create settings UI for API key configuration and model selection
- Integrate with approval system and auto-approval for write permissions
- Add collapsible approval dialog matching existing tool patterns
- Support for multiple image generation models (starting with Gemini 2.5 Flash)
- Add i18n support with proper translation strings
- Respect file protection and workspace boundaries
- Display generated images inline in chat with rich controls
@daniel-lxs daniel-lxs requested review from cte, jr and mrubens as code owners August 28, 2025 02:39
@dosubot dosubot bot added the size:XL (This PR changes 500-999 lines, ignoring generated files.) and Enhancement (New feature or request) labels Aug 28, 2025
@daniel-lxs daniel-lxs moved this from Triage to PR [Needs Review] in Roo Code Roadmap Aug 28, 2025
Contributor

@roomote roomote bot left a comment

Thank you for your contribution! I've reviewed the image generation feature and found several issues that need attention before merging.

handleError: HandleError,
pushToolResult: PushToolResult,
removeClosingTag: RemoveClosingTag,
) {
Contributor

Missing test coverage for this critical new tool. Could we add comprehensive tests to ensure the image generation functionality works correctly and handles edge cases properly?

const isWriteProtected = cline.rooProtectedController?.isWriteProtected(relPath) || false

// Get OpenRouter API key from settings or profile
const imageGenerationSettings = (state as any)?.imageGenerationSettings
Contributor

Security concern: Using (state as any) bypasses TypeScript's type safety. Could we properly type the imageGenerationSettings in the provider settings interface instead?
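
A minimal sketch of what typed access could look like, assuming the settings carry only the two optional fields that appear in this PR's schema (openRouterApiKey and selectedModel). The names ImageGenerationSettings, ProviderState, isImageGenerationSettings, and getImageGenerationSettings are hypothetical and are not taken from the codebase:

```typescript
// Hypothetical shape mirroring the two optional fields in the PR's schema.
interface ImageGenerationSettings {
  openRouterApiKey?: string
  selectedModel?: string
}

// Hypothetical state type; the real provider state has many more fields.
interface ProviderState {
  imageGenerationSettings?: ImageGenerationSettings
}

// Type predicate: narrows an unknown value instead of casting with `as any`.
function isImageGenerationSettings(value: unknown): value is ImageGenerationSettings {
  if (typeof value !== "object" || value === null) return false
  const v = value as Record<string, unknown>
  const optionalString = (x: unknown) => x === undefined || typeof x === "string"
  return optionalString(v.openRouterApiKey) && optionalString(v.selectedModel)
}

// Returns the settings only when they are present and well-formed.
function getImageGenerationSettings(state: ProviderState | undefined): ImageGenerationSettings | undefined {
  const settings = state?.imageGenerationSettings
  return isImageGenerationSettings(settings) ? settings : undefined
}
```

With a typed accessor like this, a misspelled field name becomes a compile-time error rather than a silent undefined at runtime.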

headers: {
Authorization: `Bearer ${openRouterApiKey}`,
"Content-Type": "application/json",
"HTTP-Referer": "https://github.com/RooVetGit/Roo-Code",
Contributor

Is this the correct GitHub URL? It appears to reference 'RooVetGit' instead of 'RooCodeInc':

import { getReadablePath } from "../../utils/path"
import { isPathOutsideWorkspace } from "../../utils/pathUtils"
import { EXPERIMENT_IDS, experiments } from "../../shared/experiments"
import { safeWriteJson } from "../../utils/safeWriteJson"
Contributor

The safeWriteJson utility is imported but never used. Since this tool writes binary image data (not JSON), should we remove this unused import?

}),
})

if (!response.ok) {
Contributor

Could we add retry logic here for transient network failures? OpenRouter API might have temporary issues that could be resolved with a simple retry mechanism.
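
A rough sketch of such a retry wrapper, assuming that HTTP 429 and 5xx responses plus network-level rejections count as transient. The helper names (isRetryableStatus, backoffDelayMs, fetchWithRetry) and the default delays are illustrative assumptions, not part of this PR:

```typescript
// Assumption: rate limiting (429) and server errors (5xx) are worth retrying.
function isRetryableStatus(status: number): boolean {
  return status === 429 || (status >= 500 && status <= 599)
}

// Exponential backoff: baseMs, 2 * baseMs, 4 * baseMs, ... capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Calls `request` up to `maxAttempts` times. The final response is always
// returned to the caller, so an existing `if (!response.ok)` branch still runs.
async function fetchWithRetry<T extends { ok: boolean; status: number }>(
  request: () => Promise<T>,
  maxAttempts = 3,
): Promise<T> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const isLastAttempt = attempt === maxAttempts - 1
    try {
      const response = await request()
      if (response.ok || !isRetryableStatus(response.status) || isLastAttempt) return response
    } catch (error) {
      // fetch rejects on network-level failures; give up only on the last attempt.
      if (isLastAttempt) throw error
    }
    await sleep(backoffDelayMs(attempt))
  }
  throw new Error("maxAttempts must be at least 1")
}
```

The call site would wrap the existing request in a closure, for example fetchWithRetry(() => fetch(url, init)), and leave the error handling that follows unchanged.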


// Ensure the file has the correct extension
let finalPath = relPath
if (!finalPath.match(/\.(png|jpg|jpeg)$/i)) {
Contributor

The extension validation only checks for png/jpg/jpeg. Should we also support other common formats like webp or gif that the model might generate?
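
One possible shape for a broader check, sketched under two assumptions: that webp and gif are the extra formats worth accepting, and that a path without a recognized extension gets a default one appended (the snippet above shows only the condition, not what happens inside the branch). The names IMAGE_EXTENSIONS and ensureImageExtension are hypothetical:

```typescript
// Assumption: the accepted formats; the PR itself only checks png/jpg/jpeg.
const IMAGE_EXTENSIONS = ["png", "jpg", "jpeg", "webp", "gif"]

// Case-insensitive match on a trailing ".<ext>"; no global flag, so test() is stateless.
const IMAGE_EXTENSION_PATTERN = new RegExp(`\\.(${IMAGE_EXTENSIONS.join("|")})$`, "i")

// Returns the path unchanged when it already ends in a known image extension,
// otherwise appends the fallback extension.
function ensureImageExtension(relPath: string, fallbackExtension = "png"): string {
  return IMAGE_EXTENSION_PATTERN.test(relPath) ? relPath : `${relPath}.${fallbackExtension}`
}
```

Keeping the list in one constant means a newly supported format only has to be added in a single place.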

<div className="ml-2 space-y-3">
{/* API Key Configuration */}
<div>
<label className="block font-medium mb-1">OpenRouter API Key</label>
Contributor

These UI strings are not internationalized. Could we add i18n support for consistency with the rest of the application?

  • "OpenRouter API Key"
  • "Use API key from current profile"
  • "Enter your OpenRouter API key"
  • "Image Generation Model"
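
To illustrate the idea independently of whichever i18n library the project uses, here is a small self-contained sketch of key-based lookup with an English fallback. The key names, the locales table, and the t helper are all hypothetical, and the single German entry is only there to show the fallback behavior:

```typescript
// Hypothetical translation keys; the real key names in the locale files may differ.
type TranslationKey =
  | "imageGeneration.apiKeyLabel"
  | "imageGeneration.useProfileKey"
  | "imageGeneration.apiKeyPlaceholder"
  | "imageGeneration.modelLabel"

// English holds every key and serves as the fallback locale.
const en: Record<TranslationKey, string> = {
  "imageGeneration.apiKeyLabel": "OpenRouter API Key",
  "imageGeneration.useProfileKey": "Use API key from current profile",
  "imageGeneration.apiKeyPlaceholder": "Enter your OpenRouter API key",
  "imageGeneration.modelLabel": "Image Generation Model",
}

// Other locales may be partial; missing keys fall back to English.
const locales: Record<string, Partial<Record<TranslationKey, string>>> = {
  en,
  de: { "imageGeneration.modelLabel": "Bildgenerierungsmodell" },
}

// Looks up a key in the requested locale, falling back to the English string.
function t(key: TranslationKey, locale = "en"): string {
  return locales[locale]?.[key] ?? en[key]
}
```

A component would then render t("imageGeneration.apiKeyLabel", locale) in place of the hard-coded label text.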


return (
<>
<div
Contributor

Missing ARIA labels for accessibility. Could we add proper ARIA attributes for screen reader support, especially for the zoom controls and modal interactions?

className?: string
}

export function ImageViewer({
Contributor

Missing test coverage for this new UI component. Could we add tests to ensure the image viewer functionality works correctly?


export function getGenerateImageDescription(args: ToolArgs): string {
return `## generate_image
Description: Request to generate an image using AI models through OpenRouter API. This tool creates images from text prompts and saves them to the specified path. Requires OpenRouter API key to be configured in experimental settings.
Collaborator

- Add translations for image.tabs.view in all common.json files
- Add translations for experimental.IMAGE_GENERATION settings in all locales
- Support for 17 languages: ca, de, es, fr, hi, id, it, ja, ko, nl, pl, pt-BR, ru, tr, vi, zh-CN, zh-TW
- All translations verified complete
- Access imageGenerationSettings from apiConfiguration instead of state directly
- Add missing recordToolUsage call for successful image generation
- Ensure API key is exclusively from experimental settings with no profile fallback
if (errorJson.error?.message) {
errorMessage = `Failed to generate image: ${errorJson.error.message}`
}
} catch {
Contributor

In the JSON.parse try/catch block (lines 312-319), consider logging the caught error (e.g. using console.error) to aid debugging of parsing failures.

Suggested change
- } catch {
+ } catch (err) { console.error(err)

This comment was generated because it violated a code review rule: irule_PTI8rjtnhwrWq6jS.
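
As a sketch of how the parsing and the suggested logging could sit together in one helper: the function name describeApiError, the injectable log parameter, and the generic "HTTP <status>" fallback message are assumptions made for illustration, while the "Failed to generate image:" prefix follows the snippet above.

```typescript
// Builds a user-facing message from an error response body that may or may not be JSON.
function describeApiError(
  status: number,
  body: string,
  log: (error: unknown) => void = console.error,
): string {
  // Assumed generic fallback; the PR's actual default message is not shown here.
  let errorMessage = `Failed to generate image: HTTP ${status}`
  try {
    const errorJson = JSON.parse(body)
    if (errorJson?.error?.message) {
      errorMessage = `Failed to generate image: ${errorJson.error.message}`
    }
  } catch (err) {
    // The body was not valid JSON: log the parse failure and keep the generic message.
    log(err)
  }
  return errorMessage
}
```

Passing the logger as a parameter keeps the helper easy to test without touching the console.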

daniel-lxs and others added 3 commits August 27, 2025 22:31
- Remove redundant text from success messages
- Display only the file path in both chat and tool result
- Maintain consistency with other file creation tools
Comment on lines 115 to 122

// Image generation settings (experimental)
imageGenerationSettings: z
.object({
openRouterApiKey: z.string().optional(),
selectedModel: z.string().optional(),
})
.optional(),
Collaborator

I wonder if we can remove this from the base provider schema

@dosubot dosubot bot added the lgtm (This PR has been approved by a maintainer) label Aug 28, 2025
@mrubens mrubens merged commit 2092fb1 into main Aug 28, 2025
10 checks passed
@mrubens mrubens deleted the feat/image-generation-tool branch August 28, 2025 05:57
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Aug 28, 2025
@github-project-automation github-project-automation bot moved this from PR [Needs Review] to Done in Roo Code Roadmap Aug 28, 2025
mini2s added a commit to zgsm-ai/costrict that referenced this pull request Aug 31, 2025
* Follow symlinks in rooignore checks (RooCodeInc#7405)

* Sonic -> Grok Code Fast (RooCodeInc#7426)

* chore: add changeset for v3.26.0 (RooCodeInc#7428)

* Changeset version bump (RooCodeInc#7429)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Matt Rubens <[email protected]>

* feat: Add Vercel AI Gateway provider integration (RooCodeInc#7396)

Co-authored-by: daniel-lxs <[email protected]>
Co-authored-by: cte <[email protected]>

* feat: Enable on-disk storage for Qdrant vectors and HNSW index (RooCodeInc#7182)

* fix: use anthropic protocol for token counting when using anthropic models via Vercel AI Gateway (RooCodeInc#7433)

- Added condition in getApiProtocol to return 'anthropic' for vercel-ai-gateway when modelId starts with 'anthropic/'
- Added tests for Vercel AI Gateway provider protocol detection

This ensures proper token counting for Anthropic models accessed through Vercel AI Gateway, as Anthropic and OpenAI count tokens differently (Anthropic excludes cache tokens from input count, OpenAI includes them).

* fix: remove duplicate cache display in task header (RooCodeInc#7443)

* Random chat text area cleanup (RooCodeInc#7436)

* Update @roo-code/cloud to enable roomote control for cloud agents (RooCodeInc#7446)

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>

* Always set remoteControlEnabled to true for cloud agents (RooCodeInc#7448)

* chore: add changeset for v3.26.1 (RooCodeInc#7459)

* feat: show model ID in API configuration dropdown (RooCodeInc#7423)

* feat: update tooltip component to match native VSCode tooltip shadow styling (RooCodeInc#7457)

Co-authored-by: Roo Code <[email protected]>
Co-authored-by: cte <[email protected]>

* Add support for Vercel embeddings (RooCodeInc#7445)

Co-authored-by: daniel-lxs <[email protected]>

* Remove dot before model display (RooCodeInc#7461)

* Update contributors list (RooCodeInc#7109)

Co-authored-by: mrubens <[email protected]>

* Update 3.26.1 changeset (RooCodeInc#7463)

* Changeset version bump (RooCodeInc#7460)

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Matt Rubens <[email protected]>

* Add type for RooCodeEventName.TaskSpawned (RooCodeInc#7465)

* fix: hide .rooignore'd files from environment details by default (RooCodeInc#7369)

* fix: change default showRooIgnoredFiles to false to hide ignored files

- Changed default value from true to false across all files
- Updated tests to reflect the new default behavior
- This prevents ignored files from appearing in environment details

Fixes RooCodeInc#7368

* fix: update tests to match new showRooIgnoredFiles default

* fix: update test expectation to match new showRooIgnoredFiles default value

The PR changed the default value of showRooIgnoredFiles from true to false,
so the test needs to expect false instead of true when calling formatFilesList.

---------

Co-authored-by: Roo Code <[email protected]>
Co-authored-by: daniel-lxs <[email protected]>

* fix: exclude browser scroll actions from repetition detection (RooCodeInc#7471)

- Modified ToolRepetitionDetector to skip repetition detection for browser_action scroll_down and scroll_up actions
- Added isBrowserScrollAction() helper method to identify scroll actions
- Added comprehensive tests for the new behavior
- Fixes issue where multiple scroll actions were incorrectly flagged as being stuck in a loop

Resolves: RooCodeInc#7470

Co-authored-by: Roo Code <[email protected]>

* Fix GPT-5 Responses API issues with condensing and image support (RooCodeInc#7067)

Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
Co-authored-by: Roo Code <[email protected]>
Co-authored-by: Hannes Rudolph <[email protected]>

* Bump cloud to 0.25.0 (RooCodeInc#7475)

* feat: add image generation tool with OpenRouter integration (RooCodeInc#7474)

Co-authored-by: Matt Rubens <[email protected]>
Co-authored-by: cte <[email protected]>

* Make the default image filename more generic (RooCodeInc#7479)

* Release v3.26.2 (RooCodeInc#7490)

* Support free imagegen (RooCodeInc#7493)

* feat: update OpenRouter API to support input/output modalities and filter image generation models (RooCodeInc#7492)

* Add padding to image model picker (RooCodeInc#7494)

* fix: prevent dirty state on initial mount in ImageGenerationSettings (RooCodeInc#7495)

* Changeset version bump (