Skip to content

Optimize Jinja2 environment caching and ruff batch formatting#2779

Merged
koxudaxi merged 5 commits intomainfrom
perf/rendering-and-formatting-optimizations
Dec 24, 2025
Merged

Optimize Jinja2 environment caching and ruff batch formatting#2779
koxudaxi merged 5 commits intomainfrom
perf/rendering-and-formatting-optimizations

Conversation

@koxudaxi
Copy link
Copy Markdown
Owner

@koxudaxi koxudaxi commented Dec 24, 2025

Summary by CodeRabbit

  • New Features

    • Added defer_formatting option to delay formatting for outputs without filename extensions and enable deferred batch Ruff formatting.
    • New directory-formatting flow that can run Ruff check+format in a combined pipeline for faster batch formatting.
    • Improved template loading to better support custom and absolute template directories.
  • Tests

    • Added unit and integration tests covering deferred formatting, directory batch formatting, and Ruff check/format workflows.

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Dec 24, 2025

📝 Walkthrough

Walkthrough

Adds a defer_formatting flag propagated from public parsers into CodeFormatter, adds directory-level Ruff handling and a combined Ruff check+format pipeline, defers formatting for extension-less outputs, and updates templates/environment helpers and tests to cover the new formatting behaviors.

Changes

Cohort / File(s) Change summary
Public surface / entrypoint
src/datamodel_code_generator/__init__.py
Export and wiring: import CodeFormatter and add logic to compute and pass defer_formatting when generate() is invoked for extension-less outputs; conditionally trigger directory formatting after generation.
Formatting implementation
src/datamodel_code_generator/format.py
Add defer_formatting flag to CodeFormatter; implement format_directory, combined apply_ruff_check_and_format, Ruff path discovery _find_ruff_path, and adjust format_code flow to use combined pipeline when appropriate.
Parser plumbing
src/datamodel_code_generator/parser/base.py, .../parser/graphql.py, .../parser/jsonschema.py, .../parser/openapi.py
Thread new defer_formatting: bool parameter through Parser constructors and pass it into CodeFormatter during setup.
Template environment refactor
src/datamodel_code_generator/model/base.py
Centralize Jinja2 Environment creation with cached factories _get_environment and _get_environment_with_absolute_path; refactor template-loading helpers to use those factories.
Tests
tests/test_format.py, tests/main/test_main_general.py
Add tests for directory formatting, combined Ruff check+format pipeline, and behavior when defer_formatting is true; mock subprocess calls and assert command sequences and call counts.

Sequence Diagram(s)

sequenceDiagram
    autonumber
    actor User
    participant Generate as Generator
    participant Parser
    participant Formatter as CodeFormatter
    participant FS as Filesystem
    participant Ruff as Ruff (subprocess)

    note over Generate,Parser: Generation request (output path may be dir)
    User->>Generate: generate(output=<dir or file>)
    Generate->>Parser: parse(..., defer_formatting=bool)
    Parser->>Formatter: instantiate(defer_formatting=bool)
    Parser->>FS: write generated files
    alt defer_formatting == false
        Formatter->>Ruff: apply_ruff_check_and_format / ruff format / ruff check
        Ruff-->>Formatter: formatted code
        Formatter->>FS: overwrite files with formatted output
    else defer_formatting == true and output is directory
        note right of Formatter: defer formatting during generation\n(Formatter.format_code skips Ruff)
        Generate->>Formatter: after generation, if requested run Formatter.format_directory
        Formatter->>Ruff: run combined ruff check | ruff format across directory
        Ruff-->>Formatter: formatting completed
    end
    Generate-->>User: generation complete
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Poem

"I hopped through templates, tiny paws on keys,
Deferred the polish to land of breeze,
Ruff and I will tidy the meadow soon,
Carrots for tests, and a crescent moon.
— 🐇"

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the two main optimization changes: Jinja2 environment caching and ruff batch formatting.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch perf/rendering-and-formatting-optimizations

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 24, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.48%. Comparing base (0a41d43) to head (38010cf).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff            @@
##             main    #2779    +/-   ##
========================================
  Coverage   99.48%   99.48%            
========================================
  Files          87       87            
  Lines       12908    13023   +115     
  Branches     1547     1556     +9     
========================================
+ Hits        12841    12956   +115     
  Misses         35       35            
  Partials       32       32            
Flag Coverage Δ
unittests 99.48% <100.00%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/datamodel_code_generator/format.py (1)

175-187: Deferred Ruff in format_code() and new format_directory() are well‑designed; consider dropping unused noqa

  • defer_formatting is correctly stored and only affects the Ruff steps in format_code(), leaving Black/isort and custom formatters unchanged.
  • format_directory() encapsulates the “batch Ruff over a directory” behavior that generate() now relies on, with commands and cwd consistent with the rest of CodeFormatter.

One minor nit: Ruff reports the # noqa: FBT001, FBT002 on the defer_formatting parameter as an unused noqa (RUF100). If FBT checks aren’t enabled in your config, you can safely remove that noqa to keep the file clean.

Also applies to: 259-260, 294-299, 346-361

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0a41d43 and 025e226.

📒 Files selected for processing (8)
  • src/datamodel_code_generator/__init__.py
  • src/datamodel_code_generator/format.py
  • src/datamodel_code_generator/model/base.py
  • src/datamodel_code_generator/parser/base.py
  • src/datamodel_code_generator/parser/graphql.py
  • src/datamodel_code_generator/parser/jsonschema.py
  • src/datamodel_code_generator/parser/openapi.py
  • tests/test_format.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/datamodel_code_generator/__init__.py (1)
src/datamodel_code_generator/format.py (3)
  • CodeFormatter (172-361)
  • Formatter (160-166)
  • format_directory (346-361)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/format.py

186-186: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: py312-isort7 on Ubuntu
  • GitHub Check: py312-isort5 on Ubuntu
  • GitHub Check: py312-black23 on Ubuntu
  • GitHub Check: py312-pydantic1 on Ubuntu
  • GitHub Check: 3.13 on Windows
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.12 on Ubuntu
  • GitHub Check: benchmarks
  • GitHub Check: Analyze (python)
🔇 Additional comments (12)
src/datamodel_code_generator/model/base.py (4)

404-420: Verify the intended caching behavior for directory existence checks.

The custom_dir.exists() check on line 411 executes only when the cache misses. If a custom template directory is created after the environment is cached, it won't be detected or used. This optimization assumes template directories are static at runtime.

This is likely intentional for performance, but confirm that this behavior aligns with expected usage patterns.


423-436: LGTM! Clean separation of environment and template caching.

The refactoring properly delegates environment creation to the cached _get_environment function while maintaining the template-level cache. This two-tier caching strategy is efficient and maintains the original behavior.


439-449: LGTM! Proper caching for absolute path templates.

The function correctly creates an environment with fallback to built-in templates. The absence of existence checks is safe since callers (line 721) only invoke this after verifying the template file exists.


452-462: LGTM! Consistent refactoring pattern.

The refactoring mirrors the approach used in _get_template_with_custom_dir, providing consistent two-tier caching for absolute path templates.

src/datamodel_code_generator/parser/jsonschema.py (1)

624-625: defer_formatting wiring in JsonSchemaParser looks correct

The new defer_formatting argument is added with a safe default and forwarded to Parser.__init__ via keyword, so existing callers remain compatible while enabling deferred Ruff behavior.

Also applies to: 733-734

src/datamodel_code_generator/parser/openapi.py (1)

281-282: OpenAPIParser defer_formatting propagation is consistent and non‑breaking

The new defer_formatting flag is added with a default and cleanly forwarded to the base parser, matching the JsonSchemaParser pattern and preserving existing behavior.

Also applies to: 391-392

src/datamodel_code_generator/parser/graphql.py (1)

198-199: GraphQLParser defer_formatting integration is aligned with other parsers

Adding and forwarding defer_formatting mirrors the JSON/OpenAPI parsers and should make GraphQL formatting behavior consistent without affecting existing callers.

Also applies to: 306-307

src/datamodel_code_generator/__init__.py (1)

33-41: Deferred Ruff + batch directory formatting in generate() is coherent

  • Importing CodeFormatter here and computing defer_formatting = output is not None and not output.suffix correctly distinguishes directory/extension‑less outputs from the typical single‑file models.py case.
  • Passing defer_formatting into the parser ensures Ruff is skipped during per‑file format_code when we intend to run it once over the final directory.
  • The final block that instantiates CodeFormatter and calls format_directory(output) only when defer_formatting is true and Ruff is requested gives the desired “batch Ruff over directory” behavior without changing Black/isort semantics.

Net effect: single‑file flows behave as before, while modular/directory flows now avoid N× Ruff invocations in favor of a single batch run.

Also applies to: 711-712, 811-812, 935-951

tests/test_format.py (1)

216-336: New formatting tests thoroughly cover directory and deferred Ruff behavior

The added tests validate:

  • format_directory() calling the correct Ruff commands (check/format/both) with the expected cwd.
  • defer_formatting=True preventing any Ruff subprocess calls from format_code.
  • generate() invoking batch Ruff exactly twice for directory output with both Ruff formatters.

This gives good confidence in the new formatting pipeline wiring.

src/datamodel_code_generator/parser/base.py (3)

779-779: LGTM! Well-placed parameter addition.

The defer_formatting parameter is properly typed, has a sensible default value for backward compatibility, and is logically placed alongside other formatting-related parameters.


955-955: LGTM! Standard instance variable initialization.

The instance variable is properly initialized and logically grouped with related formatting configuration.


2718-2718: LGTM! Correct parameter propagation.

The defer_formatting flag is properly forwarded to CodeFormatter, completing the integration chain from Parser initialization to the formatting stage.

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Dec 24, 2025

CodSpeed Performance Report

Merging #2779 will not alter performance

Comparing perf/rendering-and-formatting-optimizations (38010cf) with main (a0df0b4)

Summary

✅ 73 untouched
⏩ 10 skipped1

Footnotes

  1. 10 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

When both RUFF_CHECK and RUFF_FORMAT formatters are enabled,
use a shell pipe (ruff check --fix - | ruff format -) to run
both processes simultaneously. This reduces process startup
overhead by ~33% compared to running them sequentially.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
@koxudaxi koxudaxi force-pushed the perf/rendering-and-formatting-optimizations branch from c4cfc8d to 8964f69 Compare December 24, 2025 07:09
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 025e226 and 06140a1.

📒 Files selected for processing (2)
  • src/datamodel_code_generator/format.py
  • tests/test_format.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/test_format.py
🧰 Additional context used
🪛 Ruff (0.14.10)
src/datamodel_code_generator/format.py

10-10: Unused noqa directive (non-enabled: S404)

Remove unused noqa directive

(RUF100)


188-188: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.13 on macOS
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.12 on macOS
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.13 on Windows
  • GitHub Check: 3.14 on Windows
  • GitHub Check: benchmarks
  • GitHub Check: Analyze (python)
🔇 Additional comments (4)
src/datamodel_code_generator/format.py (4)

9-11: LGTM! Necessary imports for new functionality.

The shutil and sys imports support the new _find_ruff_path method for locating the Ruff executable.


296-304: Nice optimization: combined Ruff pipeline reduces subprocess overhead.

The logic correctly routes to the combined apply_ruff_check_and_format when both Ruff check and format are enabled, avoiding two separate subprocess calls. The deferred formatting path appropriately skips Ruff processing here, assuming format_directory will be called later.


374-389: LGTM! Safe directory-level formatting implementation.

The method correctly uses tuple arguments with subprocess.run, avoiding shell injection risks. The separate subprocess calls for check and format are appropriate for directory operations where the shell pipe optimization would be less beneficial.


10-10: Remove unused noqa directives.

Static analysis reports that the noqa directives on these lines suppress rules that are not enabled in your configuration.

🔎 Proposed fix
-import subprocess  # noqa: S404
+import subprocess
-        defer_formatting: bool = False,  # noqa: FBT001, FBT002
+        defer_formatting: bool = False,

Also applies to: 188-188

⛔ Skipped due to learnings
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2681
File: tests/cli_doc/test_cli_doc_coverage.py:82-82
Timestamp: 2025-12-18T13:43:16.235Z
Learning: In datamodel-code-generator project, Ruff preview mode is enabled via `lint.preview = true` in pyproject.toml. This enables preview rules like PLR6301 (no-self-use), so `noqa: PLR6301` directives are necessary and should not be removed even if RUF100 suggests they are unused.

Comment thread src/datamodel_code_generator/format.py Outdated
Comment thread src/datamodel_code_generator/format.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
src/datamodel_code_generator/format.py (2)

10-10: Clean up unused noqa directive.

Static analysis indicates the noqa: S404 directive is unused.

🔎 Proposed fix
-import subprocess  # noqa: S404
+import subprocess

188-188: Clean up unused noqa directive.

Static analysis indicates the noqa: FBT001, FBT002 directives are unused (boolean trap rules may not be enabled).

🔎 Proposed fix
-        defer_formatting: bool = False,  # noqa: FBT001, FBT002
+        defer_formatting: bool = False,
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 06140a1 and 38010cf.

📒 Files selected for processing (3)
  • src/datamodel_code_generator/format.py
  • tests/main/test_main_general.py
  • tests/test_format.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/main/test_main_general.py (2)
tests/test_main_kr.py (2)
  • output_file (44-46)
  • output_dir (50-52)
tests/main/conftest.py (3)
  • output_file (98-100)
  • run_main_and_assert (244-408)
  • output_dir (104-106)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/format.py

10-10: Unused noqa directive (non-enabled: S404)

Remove unused noqa directive

(RUF100)


188-188: Unused noqa directive (non-enabled: FBT001, FBT002)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: 3.12 on macOS
  • GitHub Check: py312-isort5 on Ubuntu
  • GitHub Check: 3.10 on macOS
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.12 on Ubuntu
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.13 on Windows
  • GitHub Check: Analyze (python)
  • GitHub Check: benchmarks
🔇 Additional comments (13)
tests/test_format.py (5)

216-234: LGTM! Comprehensive test for format_directory with ruff check.

The test correctly validates that format_directory invokes ruff check --fix with the output directory path and appropriate working directory.


237-255: LGTM! Test correctly validates ruff format for directories.

The test properly verifies that format_directory invokes ruff format on the entire directory when RUFF_FORMAT is configured.


258-283: LGTM! Test validates combined Ruff check and format pipeline.

The test correctly verifies that both ruff check --fix and ruff format are invoked sequentially when both formatters are configured for directory formatting.


286-299: LGTM! Test validates deferred formatting skips Ruff subprocess calls.

The test correctly verifies that when defer_formatting=True, Ruff formatters (which use subprocess) are skipped while other formatters like Black continue to run. The assertion on output content confirms the code is returned (with non-Ruff formatters applied).


302-336: LGTM! Integration test validates batch Ruff formatting for directory output.

The test correctly validates that generate() with directory output and Ruff formatters applies batch formatting via two subprocess calls. The use of mock.ANY for cwd is appropriate since the exact working directory is implementation-dependent.

src/datamodel_code_generator/format.py (4)

296-304: LGTM! Defer formatting logic correctly skips Ruff for batch processing.

The implementation correctly:

  • Applies Black and isort unconditionally (they work per-file)
  • Defers Ruff formatters when defer_formatting=True for batch directory processing
  • Uses combined pipeline when both Ruff formatters are present
  • Applies custom formatters unconditionally

The defer logic is sound since Ruff formatting is applied later via format_directory when generating to directories.


341-364: LGTM! Process chaining correctly implements combined Ruff pipeline.

The implementation properly chains ruff check --fix and ruff format using subprocess pipes:

  • Creates both processes with appropriate pipe configuration
  • Closes check_proc.stdout in parent to enable proper SIGPIPE handling
  • Uses communicate() for clean output capture
  • Waits for both processes to complete

The # type: ignore comments are reasonable since mypy doesn't track that stdin is guaranteed to be a pipe.


366-374: LGTM! Windows compatibility correctly implemented.

The method properly handles platform differences:

  • Uses ruff.exe on Windows, ruff elsewhere
  • Checks virtual environment first (respects isolated environments)
  • Falls back to PATH search via shutil.which

This addresses the Windows compatibility concern from previous reviews.


387-402: LGTM! Batch directory formatting implementation is efficient.

The method correctly applies Ruff formatters to entire directories in batch mode, which is more efficient than per-file formatting. The behavior is consistent with existing Ruff methods (apply_ruff_lint, apply_ruff_formatter) in silently ignoring formatter errors via check=False.

tests/main/test_main_general.py (4)

1362-1379: LGTM! Integration test validates combined Ruff pipeline end-to-end.

The test correctly exercises the combined ruff check and ruff format pipeline through the CLI interface and validates the formatted output matches expectations.


1382-1399: LGTM! Test validates ruff check formatter independently.

The test correctly verifies that ruff check --fix alone produces properly formatted output through the CLI.


1402-1419: LGTM! Test validates ruff format formatter independently.

The test correctly verifies that ruff format alone produces properly formatted output through the CLI.


1422-1434: LGTM! Test validates batch Ruff formatting for directory output.

The test correctly verifies that Ruff formatters are applied in batch mode when generating to a directory with multiple files. The assertions confirm that the generated files contain expected content with proper formatting.

@koxudaxi koxudaxi merged commit c93a6d3 into main Dec 24, 2025
37 checks passed
@koxudaxi koxudaxi deleted the perf/rendering-and-formatting-optimizations branch December 24, 2025 08:15
@github-actions
Copy link
Copy Markdown
Contributor

Breaking Change Analysis

Result: No breaking changes detected

Reasoning: PR #2779 introduces performance optimizations without breaking changes: (1) Jinja2 environment caching is an internal optimization that doesn't change template loading behavior, (2) The new defer_formatting parameter has a default value of False preserving existing behavior, (3) The batch ruff formatting for directory outputs produces the same final code output as per-file formatting, (4) No CLI changes were made, (5) No changes to generated code format or template variables, (6) New API parameters (defer_formatting in Parser and CodeFormatter constructors) are backward compatible with defaults that match previous behavior.


This analysis was performed by Claude Code Action

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant