
Change coding style #27

Merged
Yuki-Imajuku merged 4 commits into main from chore/change-coding-style
Feb 20, 2026

@Yuki-Imajuku
Collaborator

  • Change type checker (mypy -> ty)
  • Add ruff rules
  • Update dependencies

Copilot AI review requested due to automatic review settings February 20, 2026 09:51
Contributor

Copilot AI left a comment

Pull request overview

This PR modernizes the project’s Python tooling and style configuration (switching from mypy to ty, expanding Ruff lint rules), while also applying consistent formatting and small refactors across runtime and evaluation code.

Changes:

  • Replace mypy with ty for type checking and update CI/docs accordingly.
  • Expand Ruff lint rule set and apply formatting-driven refactors across the codebase.
  • Improve path handling in tests (move away from hard-coded /tmp) and tighten type/error handling in several modules.
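
As an illustration of the test-path change, the sketch below shows the general pattern of swapping a hard-coded /tmp location for pytest's built-in `tmp_path` fixture. `write_input` and the test name are hypothetical and are not taken from this repository.

```python
from pathlib import Path


def write_input(directory: Path, seed: int) -> Path:
    """Hypothetical helper: write a generated input file and return its path."""
    path = directory / f"input_{seed}.txt"
    path.write_text(f"{seed}\n", encoding="utf-8")
    return path


# Before (fragile): every run shares one location on disk.
#     write_input(Path("/tmp"), seed=0)


def test_write_input(tmp_path: Path) -> None:
    # pytest injects a fresh, per-test directory as `tmp_path`.
    path = write_input(tmp_path, seed=0)
    assert path.parent == tmp_path
    assert path.read_text(encoding="utf-8") == "0\n"
```

Because each test receives its own directory, parallel or repeated runs cannot collide on a shared file name.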

Reviewed changes

Copilot reviewed 47 out of 49 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `tests/tool_wrappers/test_local_visualization.py` | Reformat pytest.mark.parametrize argument list to tuple style. |
| `tests/tool_wrappers/test_input_generation.py` | Use tmp_path instead of hard-coded /tmp; adjust volume key assertions; parametrize formatting. |
| `tests/tool_wrappers/test_code_runner.py` | Reformat pytest.mark.parametrize argument list to tuple style. |
| `tests/tool_wrappers/test_case_runner.py` | Replace hard-coded /tmp paths with constants-based temp paths; parametrize formatting. |
| `tests/test_utils.py` | Remove tempfile usage in favor of tmp_path; tighten iteration (zip(..., strict=True)); adjust exception type. |
| `tests/test_session.py` | Improve fixtures (use tmp_path for tool dir); add coverage for estimate_rank_and_performance; minor parametrize formatting. |
| `tests/test_schemas.py` | Make serialized datetimes timezone-aware (UTC) and update expected ISO strings; parametrize formatting. |
| `tests/test_result.py` | Reformat pytest.mark.parametrize argument list to tuple style. |
| `tests/test_data.py` | Reformat parametrizations; adjust typing for rank/performance tests; make datetimes timezone-aware (UTC). |
| `tests/judge/test_tle.py` | Reformat pytest.mark.parametrize argument list to tuple style. |
| `tests/judge/test_mle.py` | Reformat pytest.mark.parametrize argument list to tuple style. |
| `tests/judge/test_ce.py` | Reformat pytest.mark.parametrize argument list to tuple style. |
| `src/ale_bench_eval/shared_async_loop.py` | Refactor singleton storage and cleanup; improve logging style and exception messaging. |
| `src/ale_bench_eval/selection.py` | Add CodeLanguage TypeGuard validation and centralize extraction/validation of selected solution info. |
| `src/ale_bench_eval/scaffolds.py` | Switch to structured logging; simplify token/cost extraction; improve exception logging. |
| `src/ale_bench_eval/safe_generation.py` | Add TYPE_CHECKING imports; extract response validation helper; refine error parsing and messages. |
| `src/ale_bench_eval/prompts/builder.py` | Refactor content merging logic; remove unnecessary branches; tighten type errors. |
| `src/ale_bench_eval/prompts/__init__.py` | Add package docstring. |
| `src/ale_bench_eval/logger.py` | Use pathlib .open(); adjust JSON encode/decode helpers; expand LoggerAdapter passthrough methods. |
| `src/ale_bench_eval/evaluate.py` | Use Session.estimate_rank_and_performance; improve logging and minor logic simplification. |
| `src/ale_bench_eval/calc_cost.py` | Accept both genai_prices.Usage and pydantic_ai.usage.RunUsage; normalize for pricing. |
| `src/ale_bench_eval/analyze_results.py` | Use pathlib .open() consistently; docstring style changes. |
| `src/ale_bench_eval/main.py` | Switch to structured logging; normalize pathlib usage; improve argument passing/readability. |
| `src/ale_bench_eval/__init__.py` | Replace explicit imports with importlib.import_module loop; improve dependency error message. |
| `src/ale_bench/utils.py` | Add module logger; replace print in some paths; add overloads for parse_statement; tighten errors. |
| `src/ale_bench/tool_wrappers/local_visualization.py` | Add module/function docstrings; use zip(..., strict=True) for input/output pairing. |
| `src/ale_bench/tool_wrappers/input_generation.py` | Add module docstring; improve warnings (stacklevel); replace asserts with explicit errors. |
| `src/ale_bench/tool_wrappers/code_runner.py` | Alias requests connection error; replace asserts with explicit type checks; docstring formatting. |
| `src/ale_bench/tool_wrappers/case_runner.py` | Update tmp bind paths via constants.TMP_DIR; alias requests connection error; replace asserts with explicit checks. |
| `src/ale_bench/tool_wrappers/__init__.py` | Add package docstring. |
| `src/ale_bench/start.py` | Add module logger; improve warnings (stacklevel); tweak signature typing; minor refactors. |
| `src/ale_bench/session.py` | Add constants; switch many asserts to explicit exceptions; add estimate_rank_and_performance; improve logging. |
| `src/ale_bench/schemas.py` | Add module docstring; use stdlib Annotated; simplify serializer usage. |
| `src/ale_bench/result.py` | Add module docstring; use collections.abc.Sequence; simplify computed fields and loops. |
| `src/ale_bench/error.py` | Add module docstring; remove redundant pass. |
| `src/ale_bench/data.py` | Add module docstring; replace asserts with explicit errors; strengthen typing and interpolation logic. |
| `src/ale_bench/constants.py` | Add module docstring; introduce TMP_DIR and derive temp-file constants from it. |
| `src/ale_bench/code_language.py` | Add module docstring; simplify conditionals and improve error messages. |
| `src/ale_bench/__init__.py` | Remove runtime Python-version guard (now requires-python governs). |
| `pyproject.toml` | Replace optional dev deps with [dependency-groups]; remove mypy config; expand Ruff rules; add ty config. |
| `mcp/pyproject.toml` | Bump version; switch to [dependency-groups]; update lint/type tooling deps. |
| `docs/session_object.md` | Document estimate_rank_and_performance. |
| `docs/mcp_server.md` | Update uv sync instructions for dev dependency groups. |
| `docs/evaluation.md` | Update uv sync instructions to --no-dev --extra eval. |
| `README.md` | Update supported Python range and uv sync commands (--no-dev / --extra eval). |
| `CONTRIBUTING.md` | Update contributor workflow to uv + ruff + ty; adjust commands. |
| `.github/workflows/check.yml` | Switch CI to uv sync --dev --extra eval and ty check. |

Comment on lines 421 to 424
@@ -413,7 +424,8 @@ def case_gen_eval(
)

Copilot AI Feb 20, 2026

case_gen_eval calls self.case_gen(seed, **gen_kwargs), but case_gen only accepts seed and a single gen_kwargs dict parameter. Expanding gen_kwargs will raise TypeError when any generation options are present (and is also incorrect when empty). Pass the dict as gen_kwargs=gen_kwargs (or as the second positional argument) instead of expanding it.

Yuki-Imajuku merged commit e698721 into main on Feb 20, 2026
10 checks passed
Yuki-Imajuku deleted the chore/change-coding-style branch on February 20, 2026 10:22