Skip to content

Simplify generate() function signature using Unpack[GenerateConfigDict]#2874

Merged
koxudaxi merged 4 commits intomainfrom
simplify-generate-function-signature
Dec 30, 2025
Merged

Simplify generate() function signature using Unpack[GenerateConfigDict]#2874
koxudaxi merged 4 commits intomainfrom
simplify-generate-function-signature

Conversation

@koxudaxi
Copy link
Copy Markdown
Owner

@koxudaxi koxudaxi commented Dec 30, 2025

Summary by CodeRabbit

  • New Features

    • Added ability to pass a full configuration object to generate() as an alternative to keyword options
    • Several configuration constants and types are now publicly available
  • Refactor

    • generate() now uses a config-driven interface and simplified parameter handling
    • Aliases support widened to accept either a single string or a list of strings
  • Bug Fixes

    • Now raises an explicit error if both a config object and keyword options are supplied

✏️ Tip: You can customize this high-level summary in your review settings.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Dec 30, 2025

Warning

Rate limit exceeded

@koxudaxi has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 18 minutes and 47 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

📥 Commits

Reviewing files that changed from the base of the PR and between 21d6248 and 58d215b.

📒 Files selected for processing (1)
  • tests/main/test_public_api_signature_baseline.py
📝 Walkthrough

Walkthrough

The PR refactors the public generate() API to accept either a GenerateConfig object or keyword options, widens alias types to allow string-or-list values across config and parser types, expands public exports, and updates tests to reflect the new config-driven usage and mutual-exclusion error when both config and kwargs are provided.

Changes

Cohort / File(s) Summary
Public API Migration
src/datamodel_code_generator/__init__.py
Replaced long generate(...) parameter list with `generate(input_, *, config: GenerateConfig
Config & Types
src/datamodel_code_generator/config.py, src/datamodel_code_generator/_types/generate_config_dict.py, src/datamodel_code_generator/_types/parser_config_dict.py
Broadened aliases type from `Mapping[str, str]
Parser constructors & resolver
src/datamodel_code_generator/parser/*.py, src/datamodel_code_generator/reference.py
Widened aliases parameter type in parser constructors and ModelResolver (graphql.py, jsonschema.py, openapi.py, base.py, reference.py) to accept `Mapping[str, str
Tests
tests/main/test_main_general.py, tests/main/test_public_api_signature_baseline.py
Updated tests to pass output via GenerateConfig, added assertion for ValueError when both config and kwargs provided, and adjusted baseline signature checks to compare against GenerateConfigDict annotations (including aliases change).

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant generate
    participant GenerateConfig
    participant Pydantic
    participant ParserFactory
    participant Codegen

    Caller->>generate: call generate(input_, config=cfg) or generate(input_, **options)
    alt config provided
        generate->>GenerateConfig: use provided config
    else kwargs provided
        generate->>GenerateConfig: Construct from options (Unpack[GenerateConfigDict])
    end
    generate->>Pydantic: validate/rebuild config (v2 vs v1 guarded)
    Pydantic-->>generate: resolved config
    generate->>ParserFactory: instantiate parser(s) with config.parsers.* and aliases
    ParserFactory->>Codegen: parse input and produce models
    Codegen-->>generate: generated modules / string / None
    generate-->>Caller: return result
Loading

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

  • #2844: Alters GenerateConfig and generate() to a config-driven approach; touches the same config/typedict structures.
  • #2832: Related baseline-signature changes for generate(); overlaps in public API validation tests.
  • #2791: Changes public exports and generate() signature surface; touches related exported constants and strategies.

Suggested labels

breaking-change-analyzed

Suggested reviewers

  • ilovelinux

Poem

🐰 Hopping through configs, parameters unwind,
Aliases gather — strings and lists combined.
Exports extend, the API's anew,
A rabbit applauds the tidy review! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately summarizes the main change: refactoring the generate() function to use Unpack[GenerateConfigDict] instead of explicit parameters.
Docstring Coverage ✅ Passed Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%.

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Dec 30, 2025

@codspeed-hq
Copy link
Copy Markdown

codspeed-hq Bot commented Dec 30, 2025

CodSpeed Performance Report

Merging #2874 will not alter performance

Comparing simplify-generate-function-signature (58d215b) with main (72f3300)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 11 untouched
⏩ 98 skipped1

Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
tests/main/test_public_api_signature_baseline.py (1)

398-415: Config‑dict baseline check is solid; consider also asserting generate’s signature shape

Comparing keyword‑only parameter names from _baseline_generate (minus "config") to GenerateConfigDict.__annotations__.keys() is a good way to enforce parity between the legacy kwargs surface and the new GenerateConfigDict backing type, and the mismatch message is clear.

One thing you lose versus the previous version is any direct assertion about generate itself (e.g. that it still has a config kw‑only parameter plus a **options var‑kwarg using GenerateConfigDict). To keep this test guarding the actual public function as well, you might additionally assert, for example:

  • inspect.signature(generate) has a kw‑only config parameter, and
  • its VAR_KEYWORD parameter is named as expected (e.g. options).

That would preserve the new dict‑based check while still catching future accidental signature changes.

src/datamodel_code_generator/__init__.py (2)

475-493: Consider removing unused noqa directives for local imports.

Static analysis indicates the # noqa: PLC0415 comments on lines 475, 483-484, and 489-490 are unused. These are local imports inside the function, and the directive for local imports (PLC0415) may not be enabled in your linter configuration.

If local imports are intentional (to avoid circular imports or for lazy loading), you can safely remove the noqa comments since the rule isn't enabled.

🔎 Proposed fix to remove unused noqa directives
-    from datamodel_code_generator.config import GenerateConfig  # noqa: PLC0415
+    from datamodel_code_generator.config import GenerateConfig

     if config is None:
         if is_pydantic_v2():
-            from datamodel_code_generator.model.pydantic_v2 import UnionMode  # noqa: PLC0415
-            from datamodel_code_generator.types import StrictTypes  # noqa: PLC0415
+            from datamodel_code_generator.model.pydantic_v2 import UnionMode
+            from datamodel_code_generator.types import StrictTypes

             GenerateConfig.model_rebuild(_types_namespace={"StrictTypes": StrictTypes, "UnionMode": UnionMode})
             config = GenerateConfig.model_validate(options)
         else:
-            from datamodel_code_generator.enums import UnionMode  # noqa: PLC0415
-            from datamodel_code_generator.types import StrictTypes  # noqa: PLC0415
+            from datamodel_code_generator.enums import UnionMode
+            from datamodel_code_generator.types import StrictTypes

             GenerateConfig.update_forward_refs(StrictTypes=StrictTypes, UnionMode=UnionMode)
             config = GenerateConfig(**options)

495-616: Verbose but necessary for backward compatibility.

The extraction of ~120 config values to local variables is verbose but maintains backward compatibility with the existing function body that references these variables. This avoids a larger refactor of the function internals.

For future consideration: the function body could be refactored to use config.field_name directly, which would eliminate this extraction block. However, that's a larger change that can be deferred.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c3e159c and 1457063.

📒 Files selected for processing (5)
  • docs/cli-reference/model-customization.md
  • docs/cli-reference/template-customization.md
  • src/datamodel_code_generator/__init__.py
  • tests/main/test_main_general.py
  • tests/main/test_public_api_signature_baseline.py
🧰 Additional context used
🧬 Code graph analysis (1)
tests/main/test_public_api_signature_baseline.py (1)
src/datamodel_code_generator/_types/generate_config_dict.py (1)
  • GenerateConfigDict (38-158)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/__init__.py

450-450: Unused noqa directive (non-enabled: PLR0912, PLR0914, PLR0915)

Remove unused noqa directive

(RUF100)


475-475: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


483-483: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


484-484: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


489-489: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


490-490: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)

🔇 Additional comments (9)
docs/cli-reference/model-customization.md (1)

1371-1379: Literal typing for Api.version example looks consistent and precise

Importing Literal and tightening Api.version to Literal['v1'] alongside Field('v1', const=True, ...) accurately reflects the const constraint and aligns this Pydantic v1 example with the other literal-based examples in the section.

docs/cli-reference/template-customization.md (1)

76-77: Tightened MapState literals and added import are well-aligned with the schema

Adding Literal to the typing imports and switching map_view_mode, map_split_mode, and is_split to Literal[...] types (while keeping the existing Field(..., const=True, alias=...)) matches the JSON Schema const constraints and makes the example’s static types more precise without changing the intended semantics.

Also applies to: 2562-2564, 2574-2578

tests/main/test_public_api_signature_baseline.py (1)

605-612: Passing output via GenerateConfig makes the comparison test more representative

Moving output=output_config into the GenerateConfig(...) call and invoking generate(input_=schema, config=config) ensures this test now exercises the pure “config‑only” path versus the pure “kwargs‑only” path, without mixing shared options across both. That more accurately validates that config‑driven and kwargs‑driven generation remain behaviorally equivalent.

tests/main/test_main_general.py (2)

1972-1995: LGTM!

The test correctly validates that GenerateConfig now accepts an output parameter and uses it properly. The test structure follows existing patterns.


1997-2014: LGTM!

The new test properly validates the mutual exclusivity constraint between config and keyword arguments. The test:

  • Creates a config object without an output path
  • Attempts to pass output as a keyword argument alongside the config
  • Correctly expects a ValueError with the appropriate message

This ensures the API contract is enforced.

src/datamodel_code_generator/__init__.py (4)

477-479: LGTM!

The mutual exclusivity check is correctly implemented. Raising ValueError when both config and **options are provided ensures a clear API contract and prevents ambiguous configurations.


635-640: LGTM!

Using DataclassArguments() to create an empty typed dict is correct. The subsequent conditional logic properly populates frozen and kw_only keys based on the config flags.


1080-1107: LGTM!

The __all__ exports are appropriately expanded to include newly exposed public API elements like DEFAULT_FORMATTERS, DEFAULT_SHARED_MODULE_NAME, AllOfMergeMode, CollapseRootModelsNameStrategy, FieldTypeCollisionStrategy, GraphQLScope, OpenAPIScope, PythonVersionMin, and ReuseScope.


450-454: The noqa directives on line 450 are technically unused and can be safely removed.

The rules PLR0912, PLR0914, and PLR0915 are explicitly listed in lint.ignore in pyproject.toml, so Ruff will not check these violations regardless of the noqa directives. Since the rules are disabled at the configuration level, the directives have no effect. They can be removed without consequence, though keeping them for documentation purposes is also acceptable given the function's 603-line span.

@koxudaxi koxudaxi force-pushed the simplify-generate-function-signature branch from 1457063 to 356d859 Compare December 30, 2025 18:00
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/datamodel_code_generator/__init__.py (1)

450-450: Remove unused noqa directives.

Static analysis reports that several noqa directives are unused:

  • Line 450: PLR0912, PLR0914, PLR0915 (complexity warnings)
  • Lines 475, 483-484, 489-490: PLC0415 (import-outside-top-level)

However, the runtime imports in lines 475, 483-484, 489-490 appear to still be present. Please verify whether these noqa directives are actually needed, and if not, remove them to keep the code clean.

🔎 Proposed fix
-def generate(  # noqa: PLR0912, PLR0914, PLR0915
+def generate(
     input_: Path | str | ParseResult | Mapping[str, Any],
     *,
     config: GenerateConfig | None = None,
     **options: Unpack[GenerateConfigDict],
 ) -> str | GeneratedModules | None:
     """Generate Python data models from schema definitions or structured data.
 
     This is the main entry point for code generation. Supports OpenAPI, JSON Schema,
     GraphQL, and raw data formats (JSON, YAML, Dict, CSV) as input.
 
     Args:
         input_: The input source (file path, string content, URL, or dict).
         config: A GenerateConfig object with all options. Cannot be used together with **options.
         **options: Individual options matching GenerateConfig fields. Cannot be used together with config.
 
     Returns:
         - When output is a Path: None (writes to file system)
         - When output is None and single module: str (generated code)
         - When output is None and multiple modules: GeneratedModules (dict mapping
           module path tuples to generated code strings)
 
     Raises:
         ValueError: If both config and **options are provided.
     """
-    from datamodel_code_generator.config import GenerateConfig  # noqa: PLC0415
+    from datamodel_code_generator.config import GenerateConfig
 
     if config is not None and options:
         msg = "Cannot specify both 'config' and keyword arguments. Use one or the other."
         raise ValueError(msg)
 
     if config is None:
         if is_pydantic_v2():
-            from datamodel_code_generator.model.pydantic_v2 import UnionMode  # noqa: PLC0415
-            from datamodel_code_generator.types import StrictTypes  # noqa: PLC0415
+            from datamodel_code_generator.model.pydantic_v2 import UnionMode
+            from datamodel_code_generator.types import StrictTypes
 
             GenerateConfig.model_rebuild(_types_namespace={"StrictTypes": StrictTypes, "UnionMode": UnionMode})
             config = GenerateConfig.model_validate(options)
         else:
-            from datamodel_code_generator.enums import UnionMode  # noqa: PLC0415
-            from datamodel_code_generator.types import StrictTypes  # noqa: PLC0415
+            from datamodel_code_generator.enums import UnionMode
+            from datamodel_code_generator.types import StrictTypes
 
             GenerateConfig.update_forward_refs(StrictTypes=StrictTypes, UnionMode=UnionMode)
             config = GenerateConfig(**options)

Note: Only remove the noqa comments if linting confirms they're genuinely unused. The complexity warnings may be valid to suppress.

Also applies to: 475-475, 483-484, 489-490

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1457063 and 356d859.

📒 Files selected for processing (6)
  • src/datamodel_code_generator/__init__.py
  • src/datamodel_code_generator/_types/generate_config_dict.py
  • src/datamodel_code_generator/_types/parser_config_dict.py
  • src/datamodel_code_generator/config.py
  • tests/main/test_main_general.py
  • tests/main/test_public_api_signature_baseline.py
🧰 Additional context used
🧬 Code graph analysis (3)
tests/main/test_public_api_signature_baseline.py (1)
src/datamodel_code_generator/_types/generate_config_dict.py (1)
  • GenerateConfigDict (38-158)
src/datamodel_code_generator/__init__.py (7)
src/datamodel_code_generator/_types/generate_config_dict.py (1)
  • GenerateConfigDict (38-158)
src/datamodel_code_generator/config.py (1)
  • GenerateConfig (59-191)
src/datamodel_code_generator/util.py (1)
  • is_pydantic_v2 (52-57)
src/datamodel_code_generator/enums.py (3)
  • UnionMode (192-196)
  • StrictTypes (213-220)
  • DataclassArguments (15-27)
src/datamodel_code_generator/types.py (1)
  • model_rebuild (414-421)
src/datamodel_code_generator/parser/jsonschema.py (1)
  • model_rebuild (228-230)
src/datamodel_code_generator/model/base.py (4)
  • model_rebuild (200-207)
  • base_class (837-839)
  • class_name (848-850)
  • class_name (853-857)
tests/main/test_main_general.py (4)
tests/test_main_kr.py (1)
  • output_file (44-46)
tests/main/conftest.py (1)
  • output_file (98-100)
src/datamodel_code_generator/config.py (1)
  • GenerateConfig (59-191)
src/datamodel_code_generator/__init__.py (1)
  • generate (450-1051)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/__init__.py

450-450: Unused noqa directive (non-enabled: PLR0912, PLR0914, PLR0915)

Remove unused noqa directive

(RUF100)


475-475: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


483-483: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


484-484: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


489-489: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


490-490: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)
  • GitHub Check: py312-isort5 on Ubuntu
  • GitHub Check: py312-pydantic1 on Ubuntu
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.14 on macOS
  • GitHub Check: 3.12 on macOS
  • GitHub Check: 3.11 on Ubuntu
  • GitHub Check: py312-black24 on Ubuntu
  • GitHub Check: py312-isort7 on Ubuntu
  • GitHub Check: py312-black22 on Ubuntu
  • GitHub Check: py312-isort6 on Ubuntu
  • GitHub Check: 3.12 on Ubuntu
  • GitHub Check: 3.13 on Windows
  • GitHub Check: 3.10 on macOS
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.13 on Ubuntu
  • GitHub Check: 3.11 on macOS
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.14 on Windows
  • GitHub Check: Analyze (python)
  • GitHub Check: benchmarks
🔇 Additional comments (13)
src/datamodel_code_generator/_types/parser_config_dict.py (1)

49-49: LGTM! Type expansion for aliases field.

The change from Mapping[str, str] to Mapping[str, str | list[str]] allows each alias to map to either a single string or a list of strings, providing greater flexibility. This is consistent with parallel changes in GenerateConfigDict and the config models.

src/datamodel_code_generator/_types/generate_config_dict.py (1)

55-55: LGTM! Consistent type expansion.

The aliases field type expansion matches the change in ParserConfigDict, maintaining consistency across the configuration type definitions.

src/datamodel_code_generator/config.py (2)

88-88: LGTM! Consistent type expansion in GenerateConfig.

The aliases field type matches the corresponding TypedDict definition, maintaining type consistency across the configuration models.


223-223: LGTM! Consistent type expansion in ParserConfig.

The change aligns with the GenerateConfig model and the TypedDict definitions, ensuring uniform alias handling across parsers.

tests/main/test_public_api_signature_baseline.py (2)

452-469: LGTM! Test correctly validates new config-driven signature.

The test now verifies that GenerateConfigDict keys match the baseline function's keyword-only arguments (excluding 'config'), which appropriately validates backward compatibility for the new **options: Unpack[GenerateConfigDict] signature.


659-666: LGTM! Test correctly demonstrates config-driven usage.

The test now passes output via the GenerateConfig object rather than as a keyword argument to generate(), correctly demonstrating the new config-driven API pattern while ensuring functional parity with the kwargs path.

tests/main/test_main_general.py (2)

979-995: LGTM! Test updated for config-driven usage.

The test now correctly passes output through the GenerateConfig object, demonstrating the intended config-driven pattern for the new API.


998-1014: LGTM! Test correctly validates mutual exclusivity.

The test appropriately verifies that passing both a config object and keyword arguments raises a ValueError with the expected message, ensuring proper enforcement of the mutual exclusivity constraint.

src/datamodel_code_generator/__init__.py (5)

12-13: LGTM! Import changes support config-driven API.

The new imports (defaultdict, Mapping, Unpack, GenerateConfigDict) are all necessary for the refactored signature and config handling logic.

Also applies to: 28-28, 63-63


450-494: LGTM! Clean migration to config-driven API.

The signature refactoring successfully:

  • Simplifies the API by accepting either a GenerateConfig object or individual options via **options: Unpack[GenerateConfigDict]
  • Enforces mutual exclusivity between config and **options with a clear error message
  • Properly handles Pydantic v1 vs v2 differences (model_validate vs init, model_rebuild vs update_forward_refs)

The approach maintains backward compatibility through the **options path while enabling the cleaner config object pattern.


495-616: LGTM! Systematic config extraction maintains compatibility.

The extraction of all config values to local variables ensures the rest of the function continues to work unchanged. The special handling for extra_template_data (converting to defaultdict when provided) is appropriate given the function's internal expectations.


636-636: LGTM! Proper initialization of DataclassArguments.

The explicit initialization of DataclassArguments() ensures the variable is properly typed and prevents potential issues when conditionally adding keys.


1080-1111: LGTM! Expanded public exports for convenience.

The additions to __all__ expose commonly used constants and enums (DEFAULT_FORMATTERS, DEFAULT_SHARED_MODULE_NAME, AllOfMergeMode, CollapseRootModelsNameStrategy, FieldTypeCollisionStrategy, GraphQLScope, OpenAPIScope, PythonVersionMin, ReuseScope), improving the public API surface without breaking changes.

@codecov
Copy link
Copy Markdown

codecov Bot commented Dec 30, 2025

Codecov Report

❌ Patch coverage is 98.79518% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 99.39%. Comparing base (72f3300) to head (58d215b).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
tests/main/test_public_api_signature_baseline.py 90.00% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2874   +/-   ##
=======================================
  Coverage   99.39%   99.39%           
=======================================
  Files          91       91           
  Lines       15591    15618   +27     
  Branches     1831     1836    +5     
=======================================
+ Hits        15496    15523   +27     
  Misses         50       50           
  Partials       45       45           
Flag Coverage Δ
unittests 99.39% <98.79%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@koxudaxi koxudaxi force-pushed the simplify-generate-function-signature branch from 356d859 to 8cfddbb Compare December 30, 2025 18:15
@koxudaxi koxudaxi force-pushed the simplify-generate-function-signature branch from 8cfddbb to 8134931 Compare December 30, 2025 18:17
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (2)
tests/main/test_public_api_signature_baseline.py (1)

349-408: Union-string normalization helper looks solid; consider guarding for older Python

_normalize_union_str plus its parametrized tests correctly canonicalize unions (including nested generics like Mapping[str, str | list[str]] | None]) so that config vs TypedDict types compare reliably.

One nuance: this relies on ast.unparse, which is only available on Python ≥3.9. The test module already depends on types.UnionType (Python ≥3.10), but if you ever need the test suite to run on <3.9, this will fail at import time.

If cross-version testing matters, a cheap defensive guard would be:

Optional compatibility guard for <3.9
 def _normalize_union_str(type_str: str) -> str:
-    """Normalize a union type string by sorting its components recursively."""
-    try:
-        tree = ast.parse(type_str, mode="eval")
-    except SyntaxError:  # pragma: no cover
-        return type_str
+    """Normalize a union type string by sorting its components recursively."""
+    # ast.unparse is only available in Python 3.9+; on older versions just
+    # return the original string so tests still run.
+    if not hasattr(ast, "unparse"):  # pragma: no cover
+        return type_str
+    try:
+        tree = ast.parse(type_str, mode="eval")
+    except SyntaxError:  # pragma: no cover
+        return type_str

Can you confirm your CI and supported test Python versions so this remains safe?

src/datamodel_code_generator/__init__.py (1)

496-618: Consider refactoring the config extraction block.

This ~120-line block that extracts every field from config into local variables is quite verbose and creates a maintenance burden. Each change to GenerateConfig requires updating this extraction logic.

Since you're using "Chill" review mode, this is an optional suggestion for future refactoring rather than a blocking issue. The current approach works and might be intentional to keep the PR scope limited to signature changes.

💡 Potential refactoring approach

Consider these alternatives for future iterations:

  1. Pass config directly to parser: Update the parser constructor to accept config: GenerateConfig and extract fields internally
  2. Use config attributes directly: Reference config.field_name throughout the function instead of extracting to locals
  3. Dataclass unpacking: Use **config.model_dump() or similar if the parser constructor signature can be updated

These would reduce the code size and make maintenance easier, but would require more extensive refactoring.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 356d859 and 21d6248.

📒 Files selected for processing (10)
  • src/datamodel_code_generator/__init__.py
  • src/datamodel_code_generator/_types/generate_config_dict.py
  • src/datamodel_code_generator/_types/parser_config_dict.py
  • src/datamodel_code_generator/config.py
  • src/datamodel_code_generator/parser/base.py
  • src/datamodel_code_generator/parser/graphql.py
  • src/datamodel_code_generator/parser/jsonschema.py
  • src/datamodel_code_generator/parser/openapi.py
  • src/datamodel_code_generator/reference.py
  • tests/main/test_public_api_signature_baseline.py
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/datamodel_code_generator/config.py
🧰 Additional context used
🧬 Code graph analysis (2)
src/datamodel_code_generator/__init__.py (5)
src/datamodel_code_generator/_types/generate_config_dict.py (1)
  • GenerateConfigDict (39-160)
src/datamodel_code_generator/config.py (1)
  • GenerateConfig (60-193)
src/datamodel_code_generator/util.py (1)
  • is_pydantic_v2 (52-57)
src/datamodel_code_generator/enums.py (3)
  • UnionMode (199-203)
  • StrictTypes (220-227)
  • DataclassArguments (15-27)
src/datamodel_code_generator/types.py (1)
  • model_rebuild (414-421)
tests/main/test_public_api_signature_baseline.py (1)
src/datamodel_code_generator/_types/generate_config_dict.py (1)
  • GenerateConfigDict (39-160)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/__init__.py

451-451: Unused noqa directive (non-enabled: PLR0912, PLR0914, PLR0915)

Remove unused noqa directive

(RUF100)


476-476: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


484-484: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


485-485: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


490-490: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)


491-491: Unused noqa directive (non-enabled: PLC0415)

Remove unused noqa directive

(RUF100)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: 3.10 on Ubuntu
  • GitHub Check: py312-isort6 on Ubuntu
  • GitHub Check: 3.11 on Ubuntu
  • GitHub Check: 3.11 on macOS
  • GitHub Check: py312-black23 on Ubuntu
  • GitHub Check: 3.12 on Windows
  • GitHub Check: 3.11 on Windows
  • GitHub Check: 3.10 on Windows
  • GitHub Check: 3.14 on Windows
  • GitHub Check: 3.13 on Ubuntu
  • GitHub Check: 3.13 on Windows
  • GitHub Check: Analyze (python)
  • GitHub Check: benchmarks
🔇 Additional comments (14)
src/datamodel_code_generator/_types/generate_config_dict.py (1)

56-56: LGTM! Consistent with ParserConfigDict changes.

The aliases type widening matches the corresponding change in ParserConfigDict, maintaining consistency across configuration types. This enables Pydantic v2 AliasChoices support at the generation level.

src/datamodel_code_generator/parser/jsonschema.py (1)

661-661: LGTM! Implementation already supports list aliases.

The parameter type widening aligns with the TypedDict changes and is well-supported by the existing implementation. Multiple locations in the codebase (lines 1227-1233, 2367-2372, 2548-2565) already handle both string and list aliases correctly using isinstance checks:

if isinstance(alias, list):
    validation_aliases = alias
else:
    single_alias = alias

This backward-compatible change enables Pydantic v2 AliasChoices functionality.

src/datamodel_code_generator/_types/parser_config_dict.py (1)

50-50: Type widening for multi-alias support aligns with Pydantic v2 integration.

The change from Mapping[str, str] to Mapping[str, str | list[str]] correctly widens the type to support multiple aliases per field, consistent with Pydantic v2's AliasChoices feature. This is a backward-compatible change since existing Mapping[str, str] values remain valid. The type definition itself is sound and properly reflects the intended schema.

src/datamodel_code_generator/parser/base.py (1)

690-711: Alias type widening in Parser.__init__ is consistent

Changing aliases to Mapping[str, str | list[str]] | None matches ModelResolver / FieldNameResolver expectations and cleanly enables multi-alias (AliasChoices) without affecting existing string-only callers.

src/datamodel_code_generator/reference.py (1)

523-575: ModelResolver aliases type aligned with FieldNameResolver

The aliases: Mapping[str, str | list[str]] | None parameter is correctly propagated into FieldNameResolver, keeping alias semantics consistent while allowing multiple aliases per field.

src/datamodel_code_generator/parser/graphql.py (1)

103-136: GraphQLParser aliases signature matches core parser API

aliases: Mapping[str, str | list[str]] | None is correctly widened and forwarded to Parser.__init__, keeping the GraphQL parser in sync with the new multi-alias support.

src/datamodel_code_generator/parser/openapi.py (1)

185-216: OpenAPIParser aliases parameter updated appropriately

The aliases kwarg type is widened to Mapping[str, str | list[str]] | None and passed through to the base parser, matching the rest of the alias-handling stack.

tests/main/test_public_api_signature_baseline.py (3)

55-75: Baseline aliases parameters updated to multi-alias type

The _baseline_generate and _BaselineParser.__init__ signatures now use aliases: Mapping[str, str | list[str]] | None, matching the real API and GenerateConfigDict / ParserConfigDict types so the baseline tests accurately track the public surface.

Also applies to: 184-205


455-472: Signature baseline now directly validated against GenerateConfigDict

test_generate_signature_matches_baseline comparing baseline kw-only args (excluding config) to GenerateConfigDict.__annotations__ is a nice tightening: it ensures the Unpack-based generate() signature stays aligned with the TypedDict keys and gives a clear diff when something drifts.


634-674: Config vs kwargs output equivalence test is precise and robust

test_generate_with_config_produces_same_result_as_kwargs now writes to two explicit files (output_kwargs.py and output_config.py) and compares their contents, with GenerateConfig(output=output_config, ...) mirroring the kwargs path. This is a good, minimal check that the new config-driven path is behaviorally identical to the kwargs path.

src/datamodel_code_generator/__init__.py (4)

1083-1115: LGTM! Public API expansion is correctly implemented.

The newly exported constants and types are all properly imported from their respective modules and align with the PR objective to expand the public API surface. This makes these types more accessible for users who want to configure generation programmatically.


482-494: Performance: Forward reference resolution executes on every call; remove unused noqa directives.

Two concerns:

  1. Performance issue: Lines 487 and 493 call model_rebuild (Pydantic v2) or update_forward_refs (Pydantic v1) every time generate() is invoked without a config parameter. These operations should be performed once at module initialization time, not on every function call, to avoid unnecessary overhead.

  2. Unused noqa directives: Lines 484, 485, 490, 491 have unused noqa: PLC0415 directives that should be removed per static analysis.

💡 Suggested approach for forward reference resolution

Consider moving the forward reference resolution to module initialization:

# Near the top of the module, after imports
if is_pydantic_v2():
    from datamodel_code_generator.model.pydantic_v2 import UnionMode
    from datamodel_code_generator.types import StrictTypes
    from datamodel_code_generator.config import GenerateConfig
    
    GenerateConfig.model_rebuild(_types_namespace={"StrictTypes": StrictTypes, "UnionMode": UnionMode})
else:
    from datamodel_code_generator.enums import UnionMode
    from datamodel_code_generator.types import StrictTypes
    from datamodel_code_generator.config import GenerateConfig
    
    GenerateConfig.update_forward_refs(StrictTypes=StrictTypes, UnionMode=UnionMode)

Then in the generate() function, simply construct the config without rebuilding:

if config is None:
    if is_pydantic_v2():
        config = GenerateConfig.model_validate(options)
    else:
        config = GenerateConfig(**options)
⛔ Skipped due to learnings
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/model/pydantic/__init__.py:43-43
Timestamp: 2025-12-25T09:22:22.481Z
Learning: In datamodel-code-generator project, defensive `# noqa: PLC0415` directives should be kept on lazy imports (imports inside functions/methods) even when Ruff reports them as unused via RUF100, to prepare for potential future Ruff configuration changes that might enable the import-outside-top-level rule.
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/util.py:49-66
Timestamp: 2025-12-25T09:23:08.506Z
Learning: In datamodel-code-generator, the is_pydantic_v2() and is_pydantic_v2_11() functions in src/datamodel_code_generator/util.py intentionally use global variable caching (_is_v2, _is_v2_11) on top of lru_cache for performance optimization. This dual-layer caching eliminates function call overhead and cache lookup overhead for frequently-called version checks. The PLW0603 linter warnings should be suppressed with # noqa: PLW0603 as this is a deliberate design choice.
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2681
File: tests/cli_doc/test_cli_doc_coverage.py:82-82
Timestamp: 2025-12-18T13:43:16.235Z
Learning: In datamodel-code-generator project, Ruff preview mode is enabled via `lint.preview = true` in pyproject.toml. This enables preview rules like PLR6301 (no-self-use), so `noqa: PLR6301` directives are necessary and should not be removed even if RUF100 suggests they are unused.

476-481: Remove unused noqa directive, but the mutual exclusion logic is correct.

Line 476 has an unused noqa: PLC0415 directive that should be removed. However, the mutual exclusion check itself is correctly implemented and provides a clear error message.

🔎 Proposed fix
-    from datamodel_code_generator.config import GenerateConfig  # noqa: PLC0415
+    from datamodel_code_generator.config import GenerateConfig
⛔ Skipped due to learnings
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/model/pydantic/__init__.py:43-43
Timestamp: 2025-12-25T09:22:22.481Z
Learning: In datamodel-code-generator project, defensive `# noqa: PLC0415` directives should be kept on lazy imports (imports inside functions/methods) even when Ruff reports them as unused via RUF100, to prepare for potential future Ruff configuration changes that might enable the import-outside-top-level rule.

451-451: Remove unused noqa directive.

The complexity warnings (PLR0912, PLR0914, PLR0915) are not being triggered, so the noqa directive can be removed.

🔎 Proposed fix
-def generate(  # noqa: PLR0912, PLR0914, PLR0915
+def generate(
     input_: Path | str | ParseResult | Mapping[str, Any],
⛔ Skipped due to learnings
Learnt from: koxudaxi
Repo: koxudaxi/datamodel-code-generator PR: 2799
File: src/datamodel_code_generator/model/pydantic/__init__.py:43-43
Timestamp: 2025-12-25T09:22:22.481Z
Learning: In datamodel-code-generator project, defensive `# noqa: PLC0415` directives should be kept on lazy imports (imports inside functions/methods) even when Ruff reports them as unused via RUF100, to prepare for potential future Ruff configuration changes that might enable the import-outside-top-level rule.

@koxudaxi koxudaxi enabled auto-merge (squash) December 30, 2025 18:43
@koxudaxi koxudaxi merged commit 0610118 into main Dec 30, 2025
37 checks passed
@koxudaxi koxudaxi deleted the simplify-generate-function-signature branch December 30, 2025 18:44
@github-actions
Copy link
Copy Markdown
Contributor

Breaking Change Analysis

Result: Breaking changes detected

Reasoning: The PR removes the ability to pass both a config object and individual keyword arguments to the generate() function. Previously, the function would merge the two, with keyword arguments taking precedence. Now it raises a ValueError if both are provided. This is confirmed by the test change from test_generate_with_config_object_extra_template_data_override to test_generate_with_config_and_kwargs_raises_error. The aliases type expansion from Mapping[str, str] to Mapping[str, str | list[str]] is backward compatible. The signature change to **options: Unpack[GenerateConfigDict] maintains backward compatibility for existing keyword argument usage since tests verify all parameter names, types, and default values match the baseline.

Content for Release Notes

API/CLI Changes

  • Mixing config and keyword arguments now raises ValueError - Previously, generate() allowed passing both a config object and individual keyword arguments, with keyword arguments overriding config values. Now, providing both raises ValueError: "Cannot specify both 'config' and keyword arguments. Use one or the other." Users must choose one approach: either pass a GenerateConfig object or use keyword arguments, but not both. (Simplify generate() function signature using Unpack[GenerateConfigDict] #2874)
    # Before (worked): keyword args overrode config values
    generate(input_=schema, config=config, output=some_path)
    
    # After (raises ValueError): must use one or the other
    # Option 1: Use config only (include output in config)
    config = GenerateConfig(output=some_path, ...)
    generate(input_=schema, config=config)
    
    # Option 2: Use keyword args only
    generate(input_=schema, output=some_path, ...)

This analysis was performed by Claude Code Action

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented Jan 1, 2026

🎉 Released in 0.51.0

This PR is now available in the latest release. See the release notes for details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant