Optimize Jinja2 environment caching and ruff batch formatting#2779
Optimize Jinja2 environment caching and ruff batch formatting#2779
Conversation
📝 WalkthroughWalkthroughAdds a defer_formatting flag propagated from public parsers into CodeFormatter, adds directory-level Ruff handling and a combined Ruff check+format pipeline, defers formatting for extension-less outputs, and updates templates/environment helpers and tests to cover the new formatting behaviors. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
actor User
participant Generate as Generator
participant Parser
participant Formatter as CodeFormatter
participant FS as Filesystem
participant Ruff as Ruff (subprocess)
note over Generate,Parser: Generation request (output path may be dir)
User->>Generate: generate(output=<dir or file>)
Generate->>Parser: parse(..., defer_formatting=bool)
Parser->>Formatter: instantiate(defer_formatting=bool)
Parser->>FS: write generated files
alt defer_formatting == false
Formatter->>Ruff: apply_ruff_check_and_format / ruff format / ruff check
Ruff-->>Formatter: formatted code
Formatter->>FS: overwrite files with formatted output
else defer_formatting == true and output is directory
note right of Formatter: defer formatting during generation\n(Formatter.format_code skips Ruff)
Generate->>Formatter: after generation, if requested run Formatter.format_directory
Formatter->>Ruff: run combined ruff check | ruff format across directory
Ruff-->>Formatter: formatting completed
end
Generate-->>User: generation complete
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2779 +/- ##
========================================
Coverage 99.48% 99.48%
========================================
Files 87 87
Lines 12908 13023 +115
Branches 1547 1556 +9
========================================
+ Hits 12841 12956 +115
Misses 35 35
Partials 32 32
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/datamodel_code_generator/format.py (1)
175-187: Deferred Ruff informat_code()and newformat_directory()are well‑designed; consider dropping unusednoqa
defer_formattingis correctly stored and only affects the Ruff steps informat_code(), leaving Black/isort and custom formatters unchanged.format_directory()encapsulates the “batch Ruff over a directory” behavior thatgenerate()now relies on, with commands andcwdconsistent with the rest of CodeFormatter.One minor nit: Ruff reports the
# noqa: FBT001, FBT002on thedefer_formattingparameter as an unusednoqa(RUF100). If FBT checks aren’t enabled in your config, you can safely remove thatnoqato keep the file clean.Also applies to: 259-260, 294-299, 346-361
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
src/datamodel_code_generator/__init__.pysrc/datamodel_code_generator/format.pysrc/datamodel_code_generator/model/base.pysrc/datamodel_code_generator/parser/base.pysrc/datamodel_code_generator/parser/graphql.pysrc/datamodel_code_generator/parser/jsonschema.pysrc/datamodel_code_generator/parser/openapi.pytests/test_format.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/datamodel_code_generator/__init__.py (1)
src/datamodel_code_generator/format.py (3)
CodeFormatter(172-361)Formatter(160-166)format_directory(346-361)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/format.py
186-186: Unused noqa directive (non-enabled: FBT001, FBT002)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: py312-isort7 on Ubuntu
- GitHub Check: py312-isort5 on Ubuntu
- GitHub Check: py312-black23 on Ubuntu
- GitHub Check: py312-pydantic1 on Ubuntu
- GitHub Check: 3.13 on Windows
- GitHub Check: 3.10 on Windows
- GitHub Check: 3.11 on Windows
- GitHub Check: 3.12 on Windows
- GitHub Check: 3.14 on Windows
- GitHub Check: 3.12 on Ubuntu
- GitHub Check: benchmarks
- GitHub Check: Analyze (python)
🔇 Additional comments (12)
src/datamodel_code_generator/model/base.py (4)
404-420: Verify the intended caching behavior for directory existence checks.The
custom_dir.exists()check on line 411 executes only when the cache misses. If a custom template directory is created after the environment is cached, it won't be detected or used. This optimization assumes template directories are static at runtime.This is likely intentional for performance, but confirm that this behavior aligns with expected usage patterns.
423-436: LGTM! Clean separation of environment and template caching.The refactoring properly delegates environment creation to the cached
_get_environmentfunction while maintaining the template-level cache. This two-tier caching strategy is efficient and maintains the original behavior.
439-449: LGTM! Proper caching for absolute path templates.The function correctly creates an environment with fallback to built-in templates. The absence of existence checks is safe since callers (line 721) only invoke this after verifying the template file exists.
452-462: LGTM! Consistent refactoring pattern.The refactoring mirrors the approach used in
_get_template_with_custom_dir, providing consistent two-tier caching for absolute path templates.src/datamodel_code_generator/parser/jsonschema.py (1)
624-625:defer_formattingwiring in JsonSchemaParser looks correctThe new
defer_formattingargument is added with a safe default and forwarded toParser.__init__via keyword, so existing callers remain compatible while enabling deferred Ruff behavior.Also applies to: 733-734
src/datamodel_code_generator/parser/openapi.py (1)
281-282: OpenAPIParserdefer_formattingpropagation is consistent and non‑breakingThe new
defer_formattingflag is added with a default and cleanly forwarded to the base parser, matching the JsonSchemaParser pattern and preserving existing behavior.Also applies to: 391-392
src/datamodel_code_generator/parser/graphql.py (1)
198-199: GraphQLParserdefer_formattingintegration is aligned with other parsersAdding and forwarding
defer_formattingmirrors the JSON/OpenAPI parsers and should make GraphQL formatting behavior consistent without affecting existing callers.Also applies to: 306-307
src/datamodel_code_generator/__init__.py (1)
33-41: Deferred Ruff + batch directory formatting ingenerate()is coherent
- Importing
CodeFormatterhere and computingdefer_formatting = output is not None and not output.suffixcorrectly distinguishes directory/extension‑less outputs from the typical single‑filemodels.pycase.- Passing
defer_formattinginto the parser ensures Ruff is skipped during per‑fileformat_codewhen we intend to run it once over the final directory.- The final block that instantiates
CodeFormatterand callsformat_directory(output)only whendefer_formattingis true and Ruff is requested gives the desired “batch Ruff over directory” behavior without changing Black/isort semantics.Net effect: single‑file flows behave as before, while modular/directory flows now avoid N× Ruff invocations in favor of a single batch run.
Also applies to: 711-712, 811-812, 935-951
tests/test_format.py (1)
216-336: New formatting tests thoroughly cover directory and deferred Ruff behaviorThe added tests validate:
format_directory()calling the correct Ruff commands (check/format/both) with the expectedcwd.defer_formatting=Truepreventing any Ruff subprocess calls fromformat_code.generate()invoking batch Ruff exactly twice for directory output with both Ruff formatters.This gives good confidence in the new formatting pipeline wiring.
src/datamodel_code_generator/parser/base.py (3)
779-779: LGTM! Well-placed parameter addition.The
defer_formattingparameter is properly typed, has a sensible default value for backward compatibility, and is logically placed alongside other formatting-related parameters.
955-955: LGTM! Standard instance variable initialization.The instance variable is properly initialized and logically grouped with related formatting configuration.
2718-2718: LGTM! Correct parameter propagation.The
defer_formattingflag is properly forwarded toCodeFormatter, completing the integration chain from Parser initialization to the formatting stage.
CodSpeed Performance ReportMerging #2779 will not alter performanceComparing Summary
Footnotes
|
When both RUFF_CHECK and RUFF_FORMAT formatters are enabled, use a shell pipe (ruff check --fix - | ruff format -) to run both processes simultaneously. This reduces process startup overhead by ~33% compared to running them sequentially. 🤖 Generated with [Claude Code](https://claude.com/claude-code)
c4cfc8d to
8964f69
Compare
There was a problem hiding this comment.
Actionable comments posted: 2
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/datamodel_code_generator/format.pytests/test_format.py
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/test_format.py
🧰 Additional context used
🪛 Ruff (0.14.10)
src/datamodel_code_generator/format.py
10-10: Unused noqa directive (non-enabled: S404)
Remove unused noqa directive
(RUF100)
188-188: Unused noqa directive (non-enabled: FBT001, FBT002)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: 3.11 on Windows
- GitHub Check: 3.13 on macOS
- GitHub Check: 3.10 on Windows
- GitHub Check: 3.12 on macOS
- GitHub Check: 3.12 on Windows
- GitHub Check: 3.13 on Windows
- GitHub Check: 3.14 on Windows
- GitHub Check: benchmarks
- GitHub Check: Analyze (python)
🔇 Additional comments (4)
src/datamodel_code_generator/format.py (4)
9-11: LGTM! Necessary imports for new functionality.The
shutilandsysimports support the new_find_ruff_pathmethod for locating the Ruff executable.
296-304: Nice optimization: combined Ruff pipeline reduces subprocess overhead.The logic correctly routes to the combined
apply_ruff_check_and_formatwhen both Ruff check and format are enabled, avoiding two separate subprocess calls. The deferred formatting path appropriately skips Ruff processing here, assumingformat_directorywill be called later.
374-389: LGTM! Safe directory-level formatting implementation.The method correctly uses tuple arguments with
subprocess.run, avoiding shell injection risks. The separate subprocess calls for check and format are appropriate for directory operations where the shell pipe optimization would be less beneficial.
10-10: Remove unusednoqadirectives.Static analysis reports that the
noqadirectives on these lines suppress rules that are not enabled in your configuration.🔎 Proposed fix
-import subprocess # noqa: S404 +import subprocess- defer_formatting: bool = False, # noqa: FBT001, FBT002 + defer_formatting: bool = False,Also applies to: 188-188
⛔ Skipped due to learnings
Learnt from: koxudaxi Repo: koxudaxi/datamodel-code-generator PR: 2681 File: tests/cli_doc/test_cli_doc_coverage.py:82-82 Timestamp: 2025-12-18T13:43:16.235Z Learning: In datamodel-code-generator project, Ruff preview mode is enabled via `lint.preview = true` in pyproject.toml. This enables preview rules like PLR6301 (no-self-use), so `noqa: PLR6301` directives are necessary and should not be removed even if RUF100 suggests they are unused.
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/datamodel_code_generator/format.py (2)
10-10: Clean up unused noqa directive.Static analysis indicates the
noqa: S404directive is unused.🔎 Proposed fix
-import subprocess # noqa: S404 +import subprocess
188-188: Clean up unused noqa directive.Static analysis indicates the
noqa: FBT001, FBT002directives are unused (boolean trap rules may not be enabled).🔎 Proposed fix
- defer_formatting: bool = False, # noqa: FBT001, FBT002 + defer_formatting: bool = False,
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
src/datamodel_code_generator/format.pytests/main/test_main_general.pytests/test_format.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/main/test_main_general.py (2)
tests/test_main_kr.py (2)
output_file(44-46)output_dir(50-52)tests/main/conftest.py (3)
output_file(98-100)run_main_and_assert(244-408)output_dir(104-106)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/format.py
10-10: Unused noqa directive (non-enabled: S404)
Remove unused noqa directive
(RUF100)
188-188: Unused noqa directive (non-enabled: FBT001, FBT002)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: 3.12 on macOS
- GitHub Check: py312-isort5 on Ubuntu
- GitHub Check: 3.10 on macOS
- GitHub Check: 3.10 on Windows
- GitHub Check: 3.14 on Windows
- GitHub Check: 3.12 on Windows
- GitHub Check: 3.12 on Ubuntu
- GitHub Check: 3.11 on Windows
- GitHub Check: 3.13 on Windows
- GitHub Check: Analyze (python)
- GitHub Check: benchmarks
🔇 Additional comments (13)
tests/test_format.py (5)
216-234: LGTM! Comprehensive test for format_directory with ruff check.The test correctly validates that
format_directoryinvokesruff check --fixwith the output directory path and appropriate working directory.
237-255: LGTM! Test correctly validates ruff format for directories.The test properly verifies that
format_directoryinvokesruff formaton the entire directory whenRUFF_FORMATis configured.
258-283: LGTM! Test validates combined Ruff check and format pipeline.The test correctly verifies that both
ruff check --fixandruff formatare invoked sequentially when both formatters are configured for directory formatting.
286-299: LGTM! Test validates deferred formatting skips Ruff subprocess calls.The test correctly verifies that when
defer_formatting=True, Ruff formatters (which use subprocess) are skipped while other formatters like Black continue to run. The assertion on output content confirms the code is returned (with non-Ruff formatters applied).
302-336: LGTM! Integration test validates batch Ruff formatting for directory output.The test correctly validates that
generate()with directory output and Ruff formatters applies batch formatting via two subprocess calls. The use ofmock.ANYforcwdis appropriate since the exact working directory is implementation-dependent.src/datamodel_code_generator/format.py (4)
296-304: LGTM! Defer formatting logic correctly skips Ruff for batch processing.The implementation correctly:
- Applies Black and isort unconditionally (they work per-file)
- Defers Ruff formatters when
defer_formatting=Truefor batch directory processing- Uses combined pipeline when both Ruff formatters are present
- Applies custom formatters unconditionally
The defer logic is sound since Ruff formatting is applied later via
format_directorywhen generating to directories.
341-364: LGTM! Process chaining correctly implements combined Ruff pipeline.The implementation properly chains
ruff check --fixandruff formatusing subprocess pipes:
- Creates both processes with appropriate pipe configuration
- Closes
check_proc.stdoutin parent to enable proper SIGPIPE handling- Uses
communicate()for clean output capture- Waits for both processes to complete
The
# type: ignorecomments are reasonable since mypy doesn't track thatstdinis guaranteed to be a pipe.
366-374: LGTM! Windows compatibility correctly implemented.The method properly handles platform differences:
- Uses
ruff.exeon Windows,ruffelsewhere- Checks virtual environment first (respects isolated environments)
- Falls back to PATH search via
shutil.whichThis addresses the Windows compatibility concern from previous reviews.
387-402: LGTM! Batch directory formatting implementation is efficient.The method correctly applies Ruff formatters to entire directories in batch mode, which is more efficient than per-file formatting. The behavior is consistent with existing Ruff methods (
apply_ruff_lint,apply_ruff_formatter) in silently ignoring formatter errors viacheck=False.tests/main/test_main_general.py (4)
1362-1379: LGTM! Integration test validates combined Ruff pipeline end-to-end.The test correctly exercises the combined
ruff checkandruff formatpipeline through the CLI interface and validates the formatted output matches expectations.
1382-1399: LGTM! Test validates ruff check formatter independently.The test correctly verifies that
ruff check --fixalone produces properly formatted output through the CLI.
1402-1419: LGTM! Test validates ruff format formatter independently.The test correctly verifies that
ruff formatalone produces properly formatted output through the CLI.
1422-1434: LGTM! Test validates batch Ruff formatting for directory output.The test correctly verifies that Ruff formatters are applied in batch mode when generating to a directory with multiple files. The assertions confirm that the generated files contain expected content with proper formatting.
Breaking Change AnalysisResult: No breaking changes detected Reasoning: PR #2779 introduces performance optimizations without breaking changes: (1) Jinja2 environment caching is an internal optimization that doesn't change template loading behavior, (2) The new This analysis was performed by Claude Code Action |
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.