@hannesrudolph hannesrudolph commented Nov 26, 2025

Summary

This PR enhances the web-evals application with several new features:

Task Log Viewing

  • Added a full-screen dialog to view task logs with syntax highlighting
  • Logs are color-coded for timestamps, log levels (INFO, WARN, ERROR, DEBUG), and task identifiers
  • Copy to clipboard functionality for easy sharing
  • Click on any completed task row to view its log
  • ESC key closes the log dialog
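One way to color-code log lines without injecting HTML is to split each line into typed segments and let the UI render each segment as a plain text child of a styled element. The sketch below is illustrative only: the `LogSegment` type, the `tokenizeLogLine` name, and the assumed log format (ISO-8601 timestamps plus the four level names listed above) are hypothetical and not taken from the PR's code.

```typescript
type LogSegmentKind = "timestamp" | "level" | "text";

interface LogSegment {
  kind: LogSegmentKind;
  text: string;
}

// Splits one log line into segments. Assumes (hypothetically) that timestamps
// are ISO-8601 and that levels are the bare words INFO, WARN, ERROR, DEBUG.
function tokenizeLogLine(line: string): LogSegment[] {
  // A fresh regex per call avoids shared lastIndex state between calls.
  const pattern =
    /(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(?:\.\d+)?Z?)|\b(INFO|WARN|ERROR|DEBUG)\b/g;
  const segments: LogSegment[] = [];
  let cursor = 0;
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(line)) !== null) {
    // Plain text between the previous token and this one.
    if (match.index > cursor) {
      segments.push({ kind: "text", text: line.slice(cursor, match.index) });
    }
    // Group 1 is the timestamp alternative; otherwise the level matched.
    const kind: LogSegmentKind = match[1] !== undefined ? "timestamp" : "level";
    segments.push({ kind, text: match[0] });
    cursor = match.index + match[0].length;
  }

  // Trailing plain text after the last token.
  if (cursor < line.length) {
    segments.push({ kind: "text", text: line.slice(cursor) });
  }
  return segments;
}
```

Because every segment stays a string, a renderer can map `kind` to a CSS class and pass `text` as element content, so markup inside a log line is displayed literally rather than interpreted.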

Export Failed Logs

  • Added "Export Failed Logs" option in the run dropdown menu
  • Downloads a zip file containing all failed task logs for a run
  • Only available when the run has at least one failed task

New Run Options

  • Added "Use Multiple Native Tool Calls" checkbox for all providers (Roo, OpenRouter, Other)
  • Added "Reasoning Effort" dropdown for Roo Code Cloud provider (None, Low, Medium, High)
  • Improved job token field with tooltip showing how to generate tokens
  • Added validation requiring job token for Roo Code Cloud provider
  • Added "Iterations per Exercise" slider (1-10) to run each exercise multiple times
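The form rules above can be sketched as a small validation function. This is a minimal illustration, not the form's actual schema: the provider identifiers (`"roo"`, `"openrouter"`, `"other"`), the field names, and the error messages are all assumptions made for the example.

```typescript
// Hypothetical identifiers for the three providers named in the description.
type Provider = "roo" | "openrouter" | "other";
type ReasoningEffort = "none" | "low" | "medium" | "high";

interface NewRunOptions {
  provider: Provider;
  jobToken?: string;
  reasoningEffort?: ReasoningEffort;
  useMultipleNativeToolCalls: boolean;
  iterations: number;
}

// Returns a list of human-readable problems; an empty list means the form is valid.
function validateNewRun(options: NewRunOptions): string[] {
  const errors: string[] = [];

  // The job token is only mandatory for the Roo Code Cloud provider.
  if (options.provider === "roo" && (options.jobToken ?? "").trim() === "") {
    errors.push("A job token is required for the Roo Code Cloud provider.");
  }

  // The slider is described as covering 1 through 10 inclusive.
  if (
    !Number.isInteger(options.iterations) ||
    options.iterations < 1 ||
    options.iterations > 10
  ) {
    errors.push("Iterations per exercise must be a whole number from 1 to 10.");
  }

  return errors;
}
```

Returning a list rather than throwing lets the form show every problem at once.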

Iterations Support

  • New iterations slider in new run form to run each exercise multiple times (1-10)
  • Task table displays iteration number for repeated exercises (e.g., "go/hello-world (#2)")
  • Database migration adds iteration column to tasks table
  • Supports comparing results across multiple runs of the same exercise
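As a rough sketch of how iterations might expand into task rows, the example below creates one planned task per exercise per iteration and builds the display label shown in the task table. The `PlannedTask` shape and both function names are hypothetical, and whether the first iteration also carries a "(#1)" suffix is an assumption; the description only shows "(#2)".

```typescript
interface PlannedTask {
  exercise: string; // e.g. "go/hello-world"
  iteration: number; // 1-based iteration index
}

// Expands each exercise into `iterations` tasks, keeping exercises grouped together.
function planTasks(exercises: string[], iterations: number): PlannedTask[] {
  if (!Number.isInteger(iterations) || iterations < 1 || iterations > 10) {
    throw new RangeError("iterations must be a whole number from 1 to 10");
  }
  const tasks: PlannedTask[] = [];
  for (const exercise of exercises) {
    for (let iteration = 1; iteration <= iterations; iteration++) {
      tasks.push({ exercise, iteration });
    }
  }
  return tasks;
}

// Adds the iteration suffix only when the run repeats exercises (assumed behavior).
function taskLabel(task: PlannedTask, iterations: number): string {
  return iterations > 1 ? `${task.exercise} (#${task.iteration})` : task.exercise;
}
```

For example, two exercises at three iterations each yield six tasks, and the second task for `go/hello-world` is labeled `go/hello-world (#2)`.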

Docker Configuration

  • Added log file mount in docker-compose.yml so web container can access task logs
  • Added PRODUCTION_DATABASE_URL environment variable
  • Added docker-compose.override.yml for local development

Dependencies

  • Added archiver package for zip file generation
  • Added @types/archiver for TypeScript support

Important

Enhance web-evals with task log viewing, export failed logs, new run options, and database schema updates for iterations.

  • Task Log Viewing:
    • Added full-screen dialog for task logs with syntax highlighting in run.tsx.
    • Logs color-coded for timestamps, log levels, and task identifiers.
    • Copy to clipboard feature for logs.
  • Export Failed Logs:
    • Added API endpoint in route.ts to export failed task logs as a zip file.
    • UI option in run.tsx to trigger log export.
  • New Run Options:
    • Added "Use Multiple Native Tool Calls" and "Reasoning Effort" options in new-run.tsx.
    • Added "Iterations per Exercise" slider in new-run.tsx.
  • Iterations Support:
    • Updated createRun in runs.ts to handle multiple iterations per exercise.
    • Database migration in 0004_sloppy_black_knight.sql to add iteration column to tasks table.
  • Docker Configuration:
    • Updated docker-compose.yml and docker-compose.override.yml to mount log files for web access.
  • Dependencies:
    • Added archiver and @types/archiver for zip file generation in package.json.

This description was created by Ellipsis for 13016ee.

@dosubot dosubot bot added the size:L and Enhancement labels Nov 26, 2025

roomote bot commented Nov 26, 2025

Review updated for the latest commit. One issue in the task log viewer JSX/highlighting remains open.

  • Escape or safely render task log lines instead of using dangerouslySetInnerHTML or unsafe HTML so logs cannot inject HTML or scripts.

@hannesrudolph hannesrudolph added the Issue/PR - Triage label Nov 27, 2025
@dosubot dosubot bot added the size:XL label and removed the size:L label Nov 27, 2025
…n options

- Add task log viewing dialog with syntax highlighting and copy to clipboard
- Add export failed logs functionality (downloads zip file)
- Add 'Use Multiple Native Tool Calls' option for all providers
- Add reasoning effort dropdown for Roo Code Cloud provider
- Improve job token field with tooltip and validation
- Mount log files in docker-compose for web access
- Add archiver dependency for zip exports
…d logs export

- Add /api/runs/[id]/logs/[taskId] route to retrieve individual task logs
- Add /api/runs/[id]/logs/failed route to export failed task logs as zip
- Add archiver dependency for zip file generation
- Remove redundant ESC key handler (Radix Dialog handles this)
- Fixes missing functionality from original PR
- Fix XSS vulnerability in formatLogContent by escaping HTML before injecting spans
- Use async fs.readFile instead of sync fs.readFileSync to avoid blocking event loop
- Add path sanitization to prevent path traversal attacks in log file APIs
- Add defense-in-depth path validation to ensure resolved paths stay within LOG_BASE_PATH
- Add archiver error handler to properly handle archive generation errors
- Fix event listener ordering: register 'end' handler before calling finalize()
- Add empty zip detection: return 404 error if no log files found on disk
- Fix toolProtocol not being applied for 'other' provider in new-run.tsx
- Add iterations slider (1-10) to new run form
- Add iteration column to tasks table schema
- Add ESC key handler to close task log dialog
- Update run display to show iteration number for repeated tasks
- Add database migration for iteration column
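The two path-safety commits above (sanitizing input, then checking that the resolved path stays within LOG_BASE_PATH) can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the `LOG_BASE_PATH` value, the `<runId>/<name>.log` directory layout, and all function names are hypothetical and not taken from the PR.

```typescript
import * as path from "node:path";

// Hypothetical value; the real LOG_BASE_PATH is not shown in this conversation.
const LOG_BASE_PATH = "/var/log/evals";

// Step 1: strip everything except a conservative allow-list of characters,
// which removes path separators and dots used in "../" traversal.
function sanitizePathSegment(segment: string): string {
  return segment.replace(/[^a-zA-Z0-9_-]/g, "");
}

// Step 2 (defense in depth): confirm the resolved path is strictly inside the base.
// Appending the separator prevents "/var/log/evals-other" from matching "/var/log/evals".
function isInsideBase(base: string, candidate: string): boolean {
  const resolvedBase = path.resolve(base);
  const resolvedCandidate = path.resolve(candidate);
  return resolvedCandidate.startsWith(resolvedBase + path.sep);
}

// Returns an absolute log path, or null when the input cannot be made safe.
// Assumes a hypothetical "<base>/<runId>/<name>.log" layout.
function resolveLogPath(runId: string, name: string): string | null {
  const safeRun = sanitizePathSegment(runId);
  const safeName = sanitizePathSegment(name);
  if (safeRun === "" || safeName === "") {
    return null;
  }
  const resolved = path.resolve(LOG_BASE_PATH, safeRun, `${safeName}.log`);
  return isInsideBase(LOG_BASE_PATH, resolved) ? resolved : null;
}
```

With this sanitizer the containment check should never fail, which is the point of defense in depth: it still holds if the allow-list is later loosened.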
@hannesrudolph hannesrudolph force-pushed the feature/web-evals-updates branch from 70f6c8c to ed51c0e November 28, 2025 00:47
@dosubot dosubot bot added the size:XXL label and removed the size:XL label Nov 28, 2025
hannesrudolph and others added 2 commits November 27, 2025 20:21
Co-authored-by: roomote[bot] <219738659+roomote[bot]@users.noreply.github.com>
…log highlighting

- Fixed malformed JSX in formatLogContent function (duplicate nested div elements)
- Replaced HTML string injection with proper React elements for XSS-safe syntax highlighting
- Addresses review feedback about dangerouslySetInnerHTML security concern
@dosubot dosubot bot added the lgtm label Nov 28, 2025
@hannesrudolph hannesrudolph merged commit 3f0a697 into main Nov 28, 2025
13 checks passed
@hannesrudolph hannesrudolph deleted the feature/web-evals-updates branch November 28, 2025 04:15
@github-project-automation github-project-automation bot moved this from Triage to Done in Roo Code Roadmap Nov 28, 2025
@github-project-automation github-project-automation bot moved this from New to Done in Roo Code Roadmap Nov 28, 2025
mini2s added a commit to zgsm-ai/costrict that referenced this pull request Nov 29, 2025