Version 1.2.0 by Yuki-Imajuku · Pull Request #6 · SakanaAI/ALE-Bench

Yuki-Imajuku · 2025-07-17T06:47:38Z

No description provided.

Yuki-Imajuku · 2025-09-09T15:00:24Z

@codex Review this PR

Copilot

Pull Request Overview

This pull request introduces version 1.2.0 of ALE-Bench, adding a new code_run function, refactoring Docker client management, updating dependencies, and improving various components.

Adds a new code_run function that compiles and executes arbitrary code without judging
Refactors Docker client usage to use a context manager pattern for better resource management
Updates project dependencies and includes several bug fixes and improvements
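
To make "executes arbitrary code without judging" concrete, here is a minimal sketch of the idea. It is not the ALE-Bench implementation: the names `RawRunResult` and `run_without_judging` are hypothetical, the fields are assumptions rather than the real `CodeRunResult` schema, and a local Python subprocess stands in for the Docker-based compile-and-run step described in this PR.

```python
import subprocess
import sys
import time
from dataclasses import dataclass


@dataclass
class RawRunResult:
    """Hypothetical result container; NOT the actual CodeRunResult fields."""

    stdout: str
    stderr: str
    exit_code: int
    wall_time_sec: float


def run_without_judging(
    code: str, stdin_text: str = "", timeout_sec: float = 10.0
) -> RawRunResult:
    """Run Python source in a child process and return its raw output.

    No scoring or answer checking happens here: the caller only gets
    stdout, stderr, the exit code, and the elapsed wall-clock time.
    Assumption: a local subprocess replaces the Docker container.
    """
    start = time.perf_counter()
    proc = subprocess.run(
        [sys.executable, "-c", code],
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout_sec,
    )
    elapsed = time.perf_counter() - start
    return RawRunResult(proc.stdout, proc.stderr, proc.returncode, elapsed)


# Usage: feed input on stdin and read back whatever the program printed.
result = run_without_judging("print(int(input()) * 2)", stdin_text="21\n")
failed = run_without_judging("import sys; sys.exit(3)")
```

The point of the design is that a non-zero exit code is reported rather than treated as an error, so the caller decides what to do with a failed run.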

Reviewed Changes

Copilot reviewed 23 out of 25 changed files in this pull request and generated 1 comment.

File	Description
pyproject.toml	Updates version to 1.2.0 and bumps development dependencies
src/ale_bench/result.py	Adds CodeRunResult model for raw code execution results
src/ale_bench/tool_wrappers/code_runner.py	New module implementing code_run functionality
src/ale_bench/session.py	Adds code_run method to Session class with resource tracking
tests/tool_wrappers/test_code_runner.py	New comprehensive test suite for code_run functionality
tests/judge/test_code_run.py	New integration tests for code_run with Docker
src/ale_bench/utils.py	Adds docker_client context manager for proper resource management
Multiple files	Refactors Docker client usage throughout codebase
src/ale_bench/tool_wrappers/case_runner.py	Simplifies function signatures by replacing separate parameters with prefix strings
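
The context manager pattern listed above for `src/ale_bench/utils.py` can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `managed_client` and `FakeClient` are hypothetical names, and the fake client stands in for a real Docker SDK client so the example runs without a Docker daemon.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


class FakeClient:
    """Hypothetical stand-in for a Docker client; only tracks open/closed state."""

    def __init__(self) -> None:
        self.closed = False

    def ping(self) -> bool:
        # A real client would contact the daemon; here we only report state.
        return not self.closed

    def close(self) -> None:
        self.closed = True


@contextmanager
def managed_client(
    factory: Callable[[], FakeClient] = FakeClient,
) -> Iterator[FakeClient]:
    """Create a client, hand it to the caller, and always close it afterwards."""
    client = factory()
    try:
        yield client
    finally:
        # Runs on normal exit and when the body raises, so nothing leaks.
        client.close()


# Usage: the client is only valid inside the with block.
with managed_client() as client:
    alive_inside = client.ping()
alive_after = client.ping()
```

With the Docker SDK for Python, the factory would presumably be `docker.from_env`, whose client also exposes `close()`. Whether the PR's `docker_client` helper is built this way is an assumption.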

Copilot · 2025-09-09T15:00:42Z

tests/tool_wrappers/test_case_runner.py

                memory_usage=0,
            ),
-            id="non_aero_exited_broken_wa",
+            id="non_zero_exited_broken_wa",


Fixed the typo 'non_aero_exited_broken_wa' to 'non_zero_exited_broken_wa', so the test ID now correctly describes a non-zero exit status.

chatgpt-codex-connector · 2025-09-09T15:06:22Z

Codex Review: Didn't find any major issues. Can't wait for the next one!

Yuki-Imajuku and others added 8 commits July 17, 2025 15:47

update pyproject.toml

fcbdbf7

update uv.lock

ef347b5

update docker handling (#7)

15323a4

Add claude GitHub actions (#8)

33ec2c2

* "Claude PR Assistant workflow"
* "Claude Code Review workflow"
* update config

Revert "Add claude GitHub actions (#8)" (#9)

f5e82ad

This reverts commit 33ec2c2.

Add code_run method (#10)

82d3225

* add code_run function
* update test code run

Minor Changes (#12)

b31b99f

* fix MCP server and update scripts
* update MCP pyproject.toml and GitHub Actions config
* revert dev dependencies and fix MCP server docstring

update README and add scripts

f4a4619

Yuki-Imajuku requested a review from Copilot September 9, 2025 15:00

Copilot AI reviewed Sep 9, 2025

Yuki-Imajuku merged commit 1a05735 into main Sep 9, 2025
5 checks passed

Yuki-Imajuku deleted the versions/v1.2.0 branch September 9, 2025 15:09

Yuki-Imajuku merged 8 commits into main from versions/v1.2.0
