feat: add image generation tool with OpenRouter integration #7474
- Add experimental image generation feature using OpenRouter API
- Implement generate_image tool for AI-driven image creation
- Add ImageViewer component with zoom, copy, and save functionality
- Create settings UI for API key configuration and model selection
- Integrate with approval system and auto-approval for write permissions
- Add collapsible approval dialog matching existing tool patterns
- Support for multiple image generation models (starting with Gemini 2.5 Flash)
- Add i18n support with proper translation strings
- Respect file protection and workspace boundaries
- Display generated images inline in chat with rich controls
Thank you for your contribution! I've reviewed the image generation feature and found several issues that need attention before merging.
    handleError: HandleError,
    pushToolResult: PushToolResult,
    removeClosingTag: RemoveClosingTag,
) {
Missing test coverage for this critical new tool. Could we add comprehensive tests to ensure the image generation functionality works correctly and handles edge cases properly?
src/core/tools/generateImageTool.ts
Outdated
const isWriteProtected = cline.rooProtectedController?.isWriteProtected(relPath) || false

// Get OpenRouter API key from settings or profile
const imageGenerationSettings = (state as any)?.imageGenerationSettings
Type-safety concern: casting with 'as any' bypasses TypeScript's type checking. Could we properly type imageGenerationSettings in the provider settings interface instead?
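A minimal sketch of one way to drop the cast, under stated assumptions: the interface name, the type guard, and the getImageGenerationSettings helper are hypothetical and do not come from the PR. Only the two field names (openRouterApiKey, selectedModel) are taken from the schema quoted later in this review.

```typescript
// Hypothetical shape; the two optional fields mirror the schema quoted in this review.
interface ImageGenerationSettings {
    openRouterApiKey?: string
    selectedModel?: string
}

// Runtime guard so an untyped value can be narrowed without `as any`.
function isImageGenerationSettings(value: unknown): value is ImageGenerationSettings {
    if (typeof value !== "object" || value === null) return false
    const v = value as Record<string, unknown>
    const optionalString = (x: unknown): boolean => x === undefined || typeof x === "string"
    return optionalString(v.openRouterApiKey) && optionalString(v.selectedModel)
}

// Hypothetical helper: read the settings from a loosely typed state object.
// Returns undefined when the field is absent or has the wrong shape.
function getImageGenerationSettings(state: unknown): ImageGenerationSettings | undefined {
    if (typeof state !== "object" || state === null) return undefined
    const candidate = (state as Record<string, unknown>).imageGenerationSettings
    return isImageGenerationSettings(candidate) ? candidate : undefined
}
```

If the settings were declared on the state type itself, the guard would be unnecessary; it is shown here only because the quoted line reads from an untyped value.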
src/core/tools/generateImageTool.ts
Outdated
headers: {
    Authorization: `Bearer ${openRouterApiKey}`,
    "Content-Type": "application/json",
    "HTTP-Referer": "https://github.com/RooVetGit/Roo-Code",
Is this the correct GitHub URL? It appears to reference 'RooVetGit' instead of 'RooCodeInc':
import { getReadablePath } from "../../utils/path"
import { isPathOutsideWorkspace } from "../../utils/pathUtils"
import { EXPERIMENT_IDS, experiments } from "../../shared/experiments"
import { safeWriteJson } from "../../utils/safeWriteJson"
The safeWriteJson utility is imported but never used. Since this tool writes binary image data (not JSON), should we remove this unused import?
src/core/tools/generateImageTool.ts
Outdated
    }),
})

if (!response.ok) {
Could we add retry logic here for transient network failures? OpenRouter API might have temporary issues that could be resolved with a simple retry mechanism.
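The retry suggestion could look roughly like the sketch below. It is not the PR's code: withRetry, RetryOptions, and the isTransient callback are hypothetical names, and the exponential backoff policy is an assumption chosen for illustration. The operation is passed in as a function so the sketch stays independent of how the request is actually made.

```typescript
// Hypothetical options; the values used in practice are a judgment call.
interface RetryOptions {
    maxAttempts: number
    baseDelayMs: number
}

const sleep = (ms: number): Promise<void> => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Runs `operation`, retrying only while `isTransient` classifies the failure as retryable.
async function withRetry<T>(
    operation: () => Promise<T>,
    isTransient: (error: unknown) => boolean,
    { maxAttempts, baseDelayMs }: RetryOptions,
): Promise<T> {
    if (maxAttempts < 1) throw new RangeError("maxAttempts must be at least 1")
    let lastError: unknown
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
        try {
            return await operation()
        } catch (error) {
            lastError = error
            // Give up on non-transient errors or once the attempt budget is spent.
            if (!isTransient(error) || attempt === maxAttempts) break
            // Exponential backoff: baseDelayMs, then 2x, 4x, ...
            await sleep(baseDelayMs * 2 ** (attempt - 1))
        }
    }
    throw lastError
}
```

A caller would wrap the request in `operation` and decide in `isTransient` which failures qualify, for example network errors or 5xx responses, while letting authentication errors fail immediately.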
// Ensure the file has the correct extension
let finalPath = relPath
if (!finalPath.match(/\.(png|jpg|jpeg)$/i)) {
The extension validation only checks for png/jpg/jpeg. Should we also support other common formats like webp or gif that the model might generate?
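A broader check might be sketched as follows. The helper name ensureImageExtension, the detectedFormat parameter, and the fallback to "png" are assumptions for illustration. Whether a given model actually returns webp or gif is not established in this review.

```typescript
// Assumed set of accepted formats; extend or trim to match what the models return.
const SUPPORTED_IMAGE_EXTENSIONS = ["png", "jpg", "jpeg", "webp", "gif"] as const

// Returns relPath unchanged when it already ends in a supported extension;
// otherwise appends one based on the detected format (defaulting to png).
function ensureImageExtension(relPath: string, detectedFormat: string = "png"): string {
    const pattern = new RegExp(`\\.(${SUPPORTED_IMAGE_EXTENSIONS.join("|")})$`, "i")
    if (pattern.test(relPath)) return relPath
    const format = detectedFormat.toLowerCase()
    const extension = (SUPPORTED_IMAGE_EXTENSIONS as readonly string[]).includes(format) ? format : "png"
    return `${relPath}.${extension}`
}
```

Keeping the list in one constant means the regular expression and the fallback logic cannot drift apart when a format is added.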
<div className="ml-2 space-y-3">
    {/* API Key Configuration */}
    <div>
        <label className="block font-medium mb-1">OpenRouter API Key</label>
These UI strings are not internationalized. Could we add i18n support for consistency with the rest of the application?
- "OpenRouter API Key"
- "Use API key from current profile"
- "Enter your OpenRouter API key"
- "Image Generation Model"
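To illustrate the idea without assuming which i18n library the project uses, here is a small self-contained sketch. The key names and the makeTranslator helper are hypothetical. Only the four English strings come from the list above.

```typescript
type Catalog = Record<string, string>

// Hypothetical keys; the project's real key names are not shown in this review.
const en: Catalog = {
    "imageGeneration.apiKeyLabel": "OpenRouter API Key",
    "imageGeneration.useProfileKey": "Use API key from current profile",
    "imageGeneration.apiKeyPlaceholder": "Enter your OpenRouter API key",
    "imageGeneration.modelLabel": "Image Generation Model",
}

// Looks a key up in the active locale, then in the fallback catalog,
// and finally returns the key itself so missing strings stay visible.
function makeTranslator(catalog: Catalog, fallback: Catalog = en) {
    return (key: string): string => catalog[key] ?? fallback[key] ?? key
}
```

In the component, the hardcoded label text would then be replaced by a lookup such as `t("imageGeneration.apiKeyLabel")`, using whatever translation function the application already provides.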
return (
    <>
        <div
Missing ARIA labels for accessibility. Could we add proper ARIA attributes for screen reader support, especially for the zoom controls and modal interactions?
    className?: string
}

export function ImageViewer({
Missing test coverage for this new UI component. Could we add tests to ensure the image viewer functionality works correctly?
export function getGenerateImageDescription(args: ToolArgs): string {
    return `## generate_image
Description: Request to generate an image using AI models through OpenRouter API. This tool creates images from text prompts and saves them to the specified path. Requires OpenRouter API key to be configured in experimental settings.
It might be worth including some of the prompting tips from https://ai.google.dev/gemini-api/docs/image-generation#prompt-guide
- Add translations for image.tabs.view in all common.json files
- Add translations for experimental.IMAGE_GENERATION settings in all locales
- Support for 17 languages: ca, de, es, fr, hi, id, it, ja, ko, nl, pl, pt-BR, ru, tr, vi, zh-CN, zh-TW
- All translations verified complete
- Access imageGenerationSettings from apiConfiguration instead of state directly
- Add missing recordToolUsage call for successful image generation
- Ensure API key is exclusively from experimental settings with no profile fallback
    if (errorJson.error?.message) {
        errorMessage = `Failed to generate image: ${errorJson.error.message}`
    }
} catch {
In the JSON.parse try/catch block (lines 312-319), consider logging the caught error (e.g. using console.error) to aid debugging of parsing failures.
- } catch {
+ } catch (err) { console.error(err)
This comment was generated because it violated a code review rule: irule_PTI8rjtnhwrWq6jS.
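Put together, the error-handling path with the suggested logging might look like the sketch below. The function name describeImageError and its fallback parameter are hypothetical. The message format and the error.message field come from the snippet under review.

```typescript
// Builds a user-facing message from an error response body.
// Falls back to a generic message when the body is not JSON or has no error.message.
function describeImageError(responseText: string, fallback = "Failed to generate image"): string {
    try {
        const errorJson: unknown = JSON.parse(responseText)
        const message = (errorJson as { error?: { message?: unknown } } | null)?.error?.message
        if (typeof message === "string" && message.length > 0) {
            return `Failed to generate image: ${message}`
        }
    } catch (err) {
        // Log instead of swallowing, so malformed responses can be diagnosed later.
        console.error("Could not parse error response as JSON:", err)
    }
    return fallback
}
```

Logging inside the catch keeps the user-facing message unchanged while still leaving a trace of the parsing failure.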
- Remove redundant text from success messages
- Display only the file path in both chat and tool result
- Maintain consistency with other file creation tools
// Image generation settings (experimental)
imageGenerationSettings: z
    .object({
        openRouterApiKey: z.string().optional(),
        selectedModel: z.string().optional(),
    })
    .optional(),
I wonder if we can remove this from the base provider schema
* Follow symlinks in rooignore checks (RooCodeInc#7405)
* Sonic -> Grok Code Fast (RooCodeInc#7426)
* chore: add changeset for v3.26.0 (RooCodeInc#7428)
* Changeset version bump (RooCodeInc#7429)
  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
  Co-authored-by: Matt Rubens <[email protected]>
* feat: Add Vercel AI Gateway provider integration (RooCodeInc#7396)
  Co-authored-by: daniel-lxs <[email protected]>
  Co-authored-by: cte <[email protected]>
* feat: Enable on-disk storage for Qdrant vectors and HNSW index (RooCodeInc#7182)
* fix: use anthropic protocol for token counting when using anthropic models via Vercel AI Gateway (RooCodeInc#7433)
  - Added condition in getApiProtocol to return 'anthropic' for vercel-ai-gateway when modelId starts with 'anthropic/'
  - Added tests for Vercel AI Gateway provider protocol detection
  This ensures proper token counting for Anthropic models accessed through Vercel AI Gateway, as Anthropic and OpenAI count tokens differently (Anthropic excludes cache tokens from input count, OpenAI includes them).
* fix: remove duplicate cache display in task header (RooCodeInc#7443)
* Random chat text area cleanup (RooCodeInc#7436)
* Update @roo-code/cloud to enable roomote control for cloud agents (RooCodeInc#7446)
  Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
* Always set remoteControlEnabled to true for cloud agents (RooCodeInc#7448)
* chore: add changeset for v3.26.1 (RooCodeInc#7459)
* feat: show model ID in API configuration dropdown (RooCodeInc#7423)
* feat: update tooltip component to match native VSCode tooltip shadow styling (RooCodeInc#7457)
  Co-authored-by: Roo Code <[email protected]>
  Co-authored-by: cte <[email protected]>
* Add support for Vercel embeddings (RooCodeInc#7445)
  Co-authored-by: daniel-lxs <[email protected]>
* Remove dot before model display (RooCodeInc#7461)
* Update contributors list (RooCodeInc#7109)
  Co-authored-by: mrubens <[email protected]>
* Update 3.26.1 changeset (RooCodeInc#7463)
* Changeset version bump (RooCodeInc#7460)
  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
  Co-authored-by: Matt Rubens <[email protected]>
* Add type for RooCodeEventName.TaskSpawned (RooCodeInc#7465)
* fix: hide .rooignore'd files from environment details by default (RooCodeInc#7369)
  * fix: change default showRooIgnoredFiles to false to hide ignored files
    - Changed default value from true to false across all files
    - Updated tests to reflect the new default behavior
    - This prevents ignored files from appearing in environment details
    Fixes RooCodeInc#7368
  * fix: update tests to match new showRooIgnoredFiles default
  * fix: update test expectation to match new showRooIgnoredFiles default value
    The PR changed the default value of showRooIgnoredFiles from true to false, so the test needs to expect false instead of true when calling formatFilesList.
  ---------
  Co-authored-by: Roo Code <[email protected]>
  Co-authored-by: daniel-lxs <[email protected]>
* fix: exclude browser scroll actions from repetition detection (RooCodeInc#7471)
  - Modified ToolRepetitionDetector to skip repetition detection for browser_action scroll_down and scroll_up actions
  - Added isBrowserScrollAction() helper method to identify scroll actions
  - Added comprehensive tests for the new behavior
  - Fixes issue where multiple scroll actions were incorrectly flagged as being stuck in a loop
  Resolves: RooCodeInc#7470
  Co-authored-by: Roo Code <[email protected]>
* Fix GPT-5 Responses API issues with condensing and image support (RooCodeInc#7067)
  Co-authored-by: ellipsis-dev[bot] <65095814+ellipsis-dev[bot]@users.noreply.github.com>
  Co-authored-by: Roo Code <[email protected]>
  Co-authored-by: Hannes Rudolph <[email protected]>
* Bump cloud to 0.25.0 (RooCodeInc#7475)
* feat: add image generation tool with OpenRouter integration (RooCodeInc#7474)
  Co-authored-by: Matt Rubens <[email protected]>
  Co-authored-by: cte <[email protected]>
* Make the default image filename more generic (RooCodeInc#7479)
* Release v3.26.2 (RooCodeInc#7490)
* Support free imagegen (RooCodeInc#7493)
* feat: update OpenRouter API to support input/output modalities and filter image generation models (RooCodeInc#7492)
* Add padding to image model picker (RooCodeInc#7494)
* fix: prevent dirty state on initial mount in ImageGenerationSettings (RooCodeInc#7495)
* Changeset version bump (
Summary
Adds experimental image generation feature using OpenRouter API.
Changes
generate_image tool for AI-driven image creation
Implementation
src/core/prompts/tools/generate-image.ts
src/core/tools/generateImageTool.ts
Testing
Models
Important
Adds an AI-driven image generation tool using OpenRouter API with UI components, i18n support, and integration into the existing system.
generate_image tool for AI-driven image creation using OpenRouter API.
generate-image.ts and handler in generateImageTool.ts.
ImageViewer, ImageBlock, ImageGenerationSettings.
generateImage() method in openrouter.ts for API interaction.
This description was created for 7a5db06. It will automatically update as commits are pushed.