Return str or dict when output=None in generate() by koxudaxi · Pull Request #2787 · koxudaxi/datamodel-code-generator

koxudaxi · 2025-12-24T17:43:12Z

Fixes: #423

Summary by CodeRabbit

New Features
- generate() can return generated code as a string (single module) or a mapping of modules to code (multi-module)
- CLI prints generated code to stdout when no output path is specified
- Custom file headers supported
Documentation
- Usage guide updated with "Getting Generated Code as String", multi-module examples, file-writing guidance, and a Return Value Summary table
Behavior Changes
- Supplying a file path for multiple modules now raises an error; use a directory path
Tests
- Expanded tests covering return values, header handling, multi-module outputs, and CLI stdout

coderabbitai · 2025-12-24T17:43:22Z

📝 Walkthrough

This PR makes generate() return generated code when output=None: a single module returns a string, multiple modules return a GeneratedModules dict. The CLI now prints such returned content to stdout. Docs and tests are updated to reflect and verify these behaviors.
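A caller that wants a single code path for all three return shapes could normalize the result first. The sketch below deliberately avoids importing the library; `GeneratedModulesLike`, `normalize_result`, and the `single_name` default are hypothetical names, and the tuple-of-path-parts key shape is an assumption based on how this thread describes `GeneratedModules`.

```python
from typing import Dict, Optional, Tuple, Union

# Assumed shape of the multi-module result: module path parts -> source code.
GeneratedModulesLike = Dict[Tuple[str, ...], str]


def normalize_result(
    result: Union[str, GeneratedModulesLike, None],
    single_name: str = "model.py",
) -> GeneratedModulesLike:
    """Turn any of the three described return shapes into one mapping."""
    if result is None:
        # An output path was given, so the code was written to disk instead.
        return {}
    if isinstance(result, str):
        # Single module: wrap it under a caller-chosen (hypothetical) file name.
        return {(single_name,): result}
    # Multi-module: already a mapping; copy it so the caller can mutate freely.
    return dict(result)
```

With this in place, downstream code only ever iterates over a mapping, regardless of whether the schema produced one module or many.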

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core API `src/datamodel_code_generator/__init__.py` | Adds public `GeneratedModules` TypeAlias; extends `generate()` return type to `str \| GeneratedModules \| None`; adds `_build_module_content()` helper to assemble headers/body and relocate `__future__` imports when custom headers exist; adjusts single vs multi-module return behavior. |
| CLI / Entrypoint `src/datamodel_code_generator/__main__.py` | Captures `generate()` result and prints to stdout when `output=None`, handling string and mapping results (iterates and prints values). |
| Documentation `docs/using_as_module.md` | Replaces previous single-module guidance with examples for returning code as string, multiple-module output (`GeneratedModules`), writing to files, and a Return Value Summary table; clarifies file vs directory output rules. |
| Tests — Behavior & Export `tests/main/test_main_general.py`, `tests/test_main_kr.py` | Adds tests verifying return types/values for single and multiple modules, custom_file_header permutations, file-writing behavior, and that `GeneratedModules` is exported; updates `test_main_modular_no_file` to use `capsys` and assert stdout. |

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~35 minutes

Possibly related PRs

Optimize Jinja2 environment caching and ruff batch formatting #2779 — overlaps changes to generate() and header/module output handling.
Add --use-generic-base-class option for DRY model config #2726 — modifies generate() signature/behavior; related API-level changes.
feat: Add --module-split-mode option to generate one file per model (#1170) #2685 — edits public API and __all__, similar export/type additions.

Suggested labels

breaking-change-analyzed

Poem

🐰 Hoppity hop, code in a string,
No temp files, I dance and sing.
One module, one tidy line,
Many modules, a mapping fine.
Toast to simpler dev-time spring! ✨

Pre-merge checks and finishing touches

✅ Passed checks (5 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: returning string or dict from generate() when output=None, which is the primary objective of this PR. |
| Linked Issues check | ✅ Passed | The PR fully implements the requirements from issue #423 by enabling generate() to return generated content as a string (or dict for multiple modules) when output=None, eliminating the need for temporary directories. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to implementing the core feature: updating generate() return behavior, adding documentation, updating tests, and handling stdout printing for CLI when no output path is specified. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (1)

src/datamodel_code_generator/__init__.py (1)

965-992: Consider consolidating duplicate future import extraction logic.

The future import extraction logic (lines 965-992) duplicates the logic in _build_module_content (lines 459-485). While the file writing path uses print() statements versus string concatenation, this could potentially be refactored to reduce duplication.

This is a minor observation; the current implementation is correct and functional.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5cceb4f and cdc64fd.

📒 Files selected for processing (5)

docs/using_as_module.md
src/datamodel_code_generator/__init__.py
src/datamodel_code_generator/__main__.py
tests/main/test_main_general.py
tests/test_main_kr.py

🧰 Additional context used

🧬 Code graph analysis (3)

tests/test_main_kr.py (3)

tests/main/openapi/test_main_openapi.py (1)

test_main_modular_no_file (463-470)

tests/main/conftest.py (1)

run_main_with_args (215-241)

src/datamodel_code_generator/__main__.py (1)

Exit (94-100)

src/datamodel_code_generator/__init__.py (1)

src/datamodel_code_generator/format.py (1)

Formatter (162-168)

src/datamodel_code_generator/__main__.py (1)

src/datamodel_code_generator/__init__.py (1)

generate (489-1016)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)

GitHub Check: 3.10 on macOS
GitHub Check: py312-black24 on Ubuntu
GitHub Check: 3.10 on Windows
GitHub Check: py312-isort5 on Ubuntu
GitHub Check: 3.11 on macOS
GitHub Check: benchmarks
GitHub Check: py312-pydantic1 on Ubuntu
GitHub Check: 3.11 on Ubuntu
GitHub Check: py312-isort6 on Ubuntu
GitHub Check: 3.13 on Windows
GitHub Check: 3.12 on macOS
GitHub Check: py312-isort7 on Ubuntu
GitHub Check: Analyze (python)
GitHub Check: 3.13 on macOS
GitHub Check: 3.13 on Ubuntu
GitHub Check: 3.11 on Windows
GitHub Check: 3.12 on Ubuntu
GitHub Check: 3.14 on Windows
GitHub Check: 3.14 on Ubuntu
GitHub Check: 3.12 on Windows

🔇 Additional comments (13)

tests/test_main_kr.py (1)

104-109: LGTM!

The test correctly validates the new behavior where modular output without a file path now succeeds and prints generated content to stdout. The assertions for "class Chocolate" and "class Source" verify that the expected models are present in the output.

src/datamodel_code_generator/__main__.py (1)

849-855: LGTM!

The stdout output logic correctly handles both single-module (string) and multi-module (mapping) return types from generate(). Adding a trailing newline ensures proper stdout formatting.
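The actual `__main__.py` change is not shown in this thread, so the following is only a sketch of the behavior the comment describes: print a string result directly, or iterate over the mapping's values, keeping the output newline-terminated. `emit_generated` and its `stream` parameter are hypothetical, and how the real CLI adds the trailing newline is an assumption.

```python
import io
import sys
from typing import Mapping, Optional, TextIO, Tuple, Union


def emit_generated(
    result: Union[str, Mapping[Tuple[str, ...], str], None],
    stream: Optional[TextIO] = None,
) -> None:
    """Write generated code to a stream, one module after another."""
    out = stream if stream is not None else sys.stdout
    if result is None:
        return  # code went to a file or directory; nothing to print
    contents = [result] if isinstance(result, str) else list(result.values())
    for content in contents:
        out.write(content)
        if not content.endswith("\n"):
            out.write("\n")  # keep the output newline-terminated


# Usage with an in-memory stream instead of the real stdout.
buffer = io.StringIO()
emit_generated({("a.py",): "A = 1", ("b.py",): "B = 2\n"}, stream=buffer)
```

Accepting the stream as a parameter keeps the function easy to test without capturing the process-wide stdout.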

docs/using_as_module.md (2)

15-62: LGTM!

The documentation clearly explains the new return value behavior with practical examples. The type hint str | GeneratedModules and the handling pattern for both cases are well-documented.

190-198: LGTM!

The return value summary table is a helpful addition that clearly documents the behavior matrix for different output parameter scenarios.

src/datamodel_code_generator/__init__.py (4)

85-90: LGTM!

The GeneratedModules type alias is well-documented and provides a clear contract for the multi-module return type.

448-486: LGTM!

The _build_module_content helper correctly handles future import extraction and placement when a custom file header is provided. The logic properly preserves the docstring position while inserting __future__ imports in the correct location.

606-616: LGTM!

The updated return type and docstring accurately describe the three possible return scenarios.

921-935: LGTM!

The new return logic correctly handles both single-module (string) and multi-module (GeneratedModules dict) cases when output is None.

tests/main/test_main_general.py (5)

10-21: LGTM!

The new imports for inline_snapshot and GeneratedModules are appropriate for the added tests.

1442-1464: LGTM!

This test validates the core new functionality: generate() returns a string when output=None for single-file schemas. The snapshot testing approach ensures the output format is verified.

1518-1554: LGTM!

These tests correctly verify that:

- generate() returns None when an output path is provided
- The file content matches what would be returned with output=None

The .strip() comparison on line 1554 appropriately handles potential trailing whitespace differences.

1557-1642: LGTM!

This test validates the multi-module return behavior using a directory input with cross-file references. The snapshot correctly captures the GeneratedModules dict structure with module path tuples as keys.
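Because the keys are module path tuples, a caller holding such a mapping can lay it out on disk without help from the library. The sketch below assumes each key is a tuple of path components ending in a file name (for example `("models", "pet.py")`); `write_modules` and the sample mapping are hypothetical and not taken from the PR.

```python
import tempfile
from pathlib import Path
from typing import List, Mapping, Tuple


def write_modules(modules: Mapping[Tuple[str, ...], str], out_dir: Path) -> List[Path]:
    """Write each module under out_dir, creating package directories as needed."""
    written = []
    for parts, content in sorted(modules.items()):
        target = out_dir.joinpath(*parts)
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        written.append(target)
    return written


# Usage with a throwaway directory and a made-up two-module mapping.
with tempfile.TemporaryDirectory() as tmp:
    paths = write_modules(
        {("__init__.py",): "", ("models", "pet.py"): "class Pet: ...\n"},
        Path(tmp),
    )
    relative = [p.relative_to(tmp).as_posix() for p in paths]
    pet_source = (Path(tmp) / "models" / "pet.py").read_text(encoding="utf-8")
```

Sorting the items first gives a deterministic write order, which keeps logs and test output stable.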

1650-1717: LGTM!

These tests comprehensively cover custom file header scenarios:

- Basic custom header
- Custom header with code after docstring (testing __future__ import placement)
- Custom header with disable_future_imports=True

coderabbitai

Actionable comments posted: 0

🧹 Nitpick comments (2)

src/datamodel_code_generator/__init__.py (2)

448-487: Code duplication between _build_module_content and file-writing logic.

The future imports extraction logic at lines 463-477 is nearly identical to lines 969-985 in the file-writing path. Consider extracting shared logic to reduce duplication, though this is not critical for this PR.

🔎 Potential refactor to reduce duplication

You could extract the future imports handling into a separate helper:

def _extract_future_imports(body: str) -> tuple[str, str]:
    """Extract future imports from body, returning (extracted, body_without_future)."""
    lines = body.split("\n")
    future_indices = [i for i, line in enumerate(lines) if line.strip().startswith("from __future__")]
    if not future_indices:
        return "", body
    extracted = "\n".join(lines[i] for i in future_indices)
    remaining = [line for i, line in enumerate(lines) if i not in future_indices]
    return extracted, "\n".join(remaining).lstrip("\n")

This could be used in both _build_module_content and the file-writing path.
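To make the suggestion concrete, here is a small self-contained run of a helper adapted from the snippet above (renamed without the leading underscore and typed with `typing.Tuple`). The sample `body` and `header` are invented for illustration, and the way the pieces are joined is an assumption, not the PR's `_build_module_content` logic. Python requires `from __future__` imports to come before any other code except the module docstring, which is why the header goes first.

```python
from typing import Tuple


def extract_future_imports(body: str) -> Tuple[str, str]:
    """Split body into (future import lines, body without them)."""
    lines = body.split("\n")
    future_indices = [
        i for i, line in enumerate(lines)
        if line.strip().startswith("from __future__")
    ]
    if not future_indices:
        return "", body
    extracted = "\n".join(lines[i] for i in future_indices)
    remaining = [line for i, line in enumerate(lines) if i not in future_indices]
    return extracted, "\n".join(remaining).lstrip("\n")


# Hypothetical generated body and custom header, for illustration only.
body = (
    "from __future__ import annotations\n\n"
    "from pydantic import BaseModel\n\n\n"
    "class Pet(BaseModel):\n    name: str\n"
)
header = '"""Custom header."""'

extracted, rest = extract_future_imports(body)
# Header first, then the future import, then everything else.
module = "\n\n".join(part for part in (header, extracted, rest) if part)

# compile() does not execute the module; it raises SyntaxError
# if a __future__ import ends up in an invalid position.
compile(module, "<generated>", "exec")
```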

958-1000: Consider using a context manager for file handling.

The file is opened at line 960 but closed manually at line 1000. Using a context manager would be safer against exceptions during write operations.

🔎 Proposed refactor using context manager

     for path, (body, future_imports, filename) in modules.items():
         if not path.parent.exists():
             path.parent.mkdir(parents=True)
-        file = path.open("wt", encoding=encoding)
+        with path.open("wt", encoding=encoding) as file:

-        safe_filename = filename.replace("\n", " ").replace("\r", " ") if filename else ""
-        effective_header = custom_file_header or header.format(safe_filename)
+            safe_filename = filename.replace("\n", " ").replace("\r", " ") if filename else ""
+            effective_header = custom_file_header or header.format(safe_filename)

-        if custom_file_header and body:
-            # ... (indent rest of the block)
+            if custom_file_header and body:
+                # ... (rest of the block)
-        file.close()

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cdc64fd and 7a6c76b.

📒 Files selected for processing (5)

docs/using_as_module.md
src/datamodel_code_generator/__init__.py
src/datamodel_code_generator/__main__.py
tests/main/test_main_general.py
tests/test_main_kr.py

🚧 Files skipped from review as they are similar to previous changes (1)

tests/test_main_kr.py

🧰 Additional context used

🧬 Code graph analysis (2)

tests/main/test_main_general.py (1)

src/datamodel_code_generator/__init__.py (4)

AllExportsScope (259-267)

DataModelType (226-234)

generate (489-1016)

InputFileType (204-214)

src/datamodel_code_generator/__main__.py (1)

src/datamodel_code_generator/__init__.py (1)

generate (489-1016)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (20)

GitHub Check: py312-black23 on Ubuntu
GitHub Check: 3.10 on Ubuntu
GitHub Check: py312-isort5 on Ubuntu
GitHub Check: py312-black22 on Ubuntu
GitHub Check: py312-pydantic1 on Ubuntu
GitHub Check: 3.11 on Windows
GitHub Check: 3.10 on macOS
GitHub Check: py312-isort7 on Ubuntu
GitHub Check: 3.11 on macOS
GitHub Check: py312-black24 on Ubuntu
GitHub Check: 3.11 on Ubuntu
GitHub Check: 3.13 on Windows
GitHub Check: 3.12 on Ubuntu
GitHub Check: 3.14 on Windows
GitHub Check: 3.10 on Windows
GitHub Check: 3.12 on macOS
GitHub Check: 3.13 on Ubuntu
GitHub Check: 3.12 on Windows
GitHub Check: 3.13 on macOS
GitHub Check: benchmarks

🔇 Additional comments (10)

src/datamodel_code_generator/__main__.py (1)

734-855: LGTM! Clean implementation of stdout printing for generated content.

The changes correctly:

- Capture the return value from generate()
- Print string results directly when output is None
- Iterate over GeneratedModules dict values for multi-module output

docs/using_as_module.md (2)

15-62: Documentation looks comprehensive and well-structured.

The examples clearly demonstrate:

- Getting generated code as a string (single module)
- Handling GeneratedModules dict for multi-module schemas
- The isinstance(result, dict) check correctly distinguishes between return types

190-199: Return Value Summary table is accurate and helpful.

The table correctly documents the behavior, including that a file path with multiple modules raises an error.

src/datamodel_code_generator/__init__.py (3)

85-91: Well-documented type alias.

The GeneratedModules TypeAlias with its docstring clearly explains the purpose and structure of the return type for multi-module generation.

921-935: Clean implementation of in-memory return for both single and multi-module cases.

The logic correctly:

- Returns a str for single-file output
- Returns a GeneratedModules dict with sorted keys for deterministic ordering
- Applies headers consistently to all modules

1052-1052: LGTM! GeneratedModules correctly exported.

Adding GeneratedModules to __all__ makes it part of the public API, enabling users to type-hint their code when working with multi-module generation.

tests/main/test_main_general.py (4)

1439-1461: Good test coverage for basic string return.

The test correctly validates:

- Return type is str when output=None
- Generated content structure with header, imports, and model

1554-1639: Comprehensive test for multi-module generation.

The test properly validates the GeneratedModules return type with tuple keys mapping to generated code strings.

Note: The snapshot shows test_generate_returns_dict_for0 as the filename in __init__.py (lines 1597-1598), which appears truncated. This is likely because the directory name (tmp_path) is used as input_filename fallback, which is expected behavior for directory inputs.

1647-1714: Excellent edge case coverage for custom file headers.

The three tests thoroughly validate:

- Basic custom header with future imports placement
- Custom header with docstring and code - future imports inserted after docstring
- Custom header when disable_future_imports=True - no future import handling needed

1642-1644: Simple but effective export verification.

The test confirms GeneratedModules is importable from the public API. The assertion is minimal but sufficient since the import at line 16 would fail if the export was missing.

codspeed-hq · 2025-12-24T17:52:05Z

CodSpeed Performance Report

Merging #2787 will not alter performance

Comparing feature/generate-return-string-or-dict (7a6c76b) with main (5cceb4f)

Summary

✅ 73 untouched
⏩ 10 skipped¹

¹ 10 benchmarks were skipped, so the baseline results were used instead.

codecov · 2025-12-24T17:52:51Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.48%. Comparing base (a46ceb8) to head (7a6c76b).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff            @@
##             main    #2787    +/-   ##
========================================
  Coverage   99.47%   99.48%            
========================================
  Files          88       88            
  Lines       13213    13348   +135     
  Branches     1556     1565     +9     
========================================
+ Hits        13144    13279   +135     
  Misses         36       36            
  Partials       33       33

Flag	Coverage Δ
unittests	`99.48% <100.00%> (+<0.01%)`	⬆️

github-actions · 2025-12-24T17:53:46Z

Breaking Change Analysis

Result: Breaking changes detected

Reasoning: This PR contains several breaking changes:

1. The generate() function's return type changed from None to str | GeneratedModules | None. When output=None, it now returns the generated code as a string (single module) or a dictionary (multiple modules). This is a significant API change that could affect type checking and code that relies on the previous behavior.
2. The CLI now prints generated code to stdout when no --output flag is specified, which is a new default behavior that could affect scripts or pipelines expecting silent operation.
3. An error message was slightly changed, which could affect error parsing.

The removal of the "Modular references require an output directory" error for the output=None case is not breaking, since generate() now returns the dict instead of raising an error.

Content for Release Notes

API/CLI Changes

generate() function return type changed - Previously returned None, now returns str | GeneratedModules | None. When output=None, returns str for single module or GeneratedModules dict for multiple modules. Code that explicitly checks generate() is None or ignores the return value will continue to work, but type checkers may flag this change. (Return str or dict when output=None in generate() #2787)

Default Behavior Changes

CLI now prints to stdout when no output path specified - When running datamodel-codegen without the --output flag, generated code is now printed to stdout instead of silently doing nothing. This enables piping output but may affect scripts that expected no output. (Return str or dict when output=None in generate() #2787)

# Before: No output
datamodel-codegen --input schema.json

# After: Prints generated code to stdout
datamodel-codegen --input schema.json

Error Handling Changes

Error message changed for multi-module output without directory - The error message when attempting multi-module generation with a file path changed from "Modular references require an output directory" to "Modular references require an output directory, not a file". Scripts parsing error messages may need updates. (Return str or dict when output=None in generate() #2787)

This analysis was performed by Claude Code Action

* Add --collapse-root-models-name-strategy option
* docs: update CLI reference documentation and prompt data
  🤖 Generated by GitHub Actions
* Add pragma no cover for defensive edge cases
* Achieve 100% diff coverage for collapse-root-models-name-strategy
* Use cast instead of type ignore comment
* Remove line comments from collapse-root-models implementation
* Add complex e2e tests for collapse-root-models-name-strategy
* Update reference metadata when renaming in parent strategy
* Refactor collapse-root-models tests to use parameterization for v1/v2
* Add schema path context to error messages (#2786)
* Return str or dict when output=None in generate() (#2787)
* Add --http-timeout CLI option (#2788)
* Add --http-timeout CLI option for configurable HTTP request timeout
* docs: update CLI reference documentation and prompt data
  🤖 Generated by GitHub Actions
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Pass schema extensions to templates (#2790)
* Pass schema extensions to templates
* Move model_base import to top of file
* Add schema extensions documentation
  Document how x-* schema extensions are passed to custom templates via the extensions variable, with examples for database model configuration and other use cases.
  🤖 Generated with [Claude Code](https://claude.com/claude-code)
* Add propertyNames and x-propertyNames support (#2789)
* Add propertyNames and x-propertyNames support
* Fix Pydantic v1 compatibility for x-propertyNames
  Use the model_validate utility function from util module instead of calling JsonSchemaObject.model_validate() directly, which only exists in Pydantic v2.
  🤖 Generated with [Claude Code](https://claude.com/claude-code)
* Add test for x-propertyNames non-dict branch coverage
  Test that x-propertyNames with non-dict value (e.g., boolean) is correctly ignored, achieving 100% diff coverage.
  🤖 Generated with [Claude Code](https://claude.com/claude-code)
* Add support for additional_imports in extra-template-data JSON (#2793)
* Update zensical to 0.0.15 (#2794)
* Add --use-field-description-example option (#2792)
* Add --use-field-description-example option
* docs: update CLI reference documentation and prompt data
  🤖 Generated by GitHub Actions
* Add tests for complete branch coverage of docstring property
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Fix formatting in test file
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Return str or dict when output=None in generate()

7a6c76b

koxudaxi force-pushed the feature/generate-return-string-or-dict branch from cdc64fd to 7a6c76b Compare December 24, 2025 17:45

coderabbitai Bot reviewed Dec 24, 2025

koxudaxi enabled auto-merge (squash) December 24, 2025 17:50

koxudaxi merged commit f3029e8 into main Dec 24, 2025
35 checks passed

koxudaxi deleted the feature/generate-return-string-or-dict branch December 24, 2025 17:51

github-actions Bot added breaking-change-analyzed breaking-change labels Dec 24, 2025

koxudaxi added a commit that referenced this pull request Dec 25, 2025

Return str or dict when output=None in generate() (#2787)

6fac5b5

coderabbitai Bot mentioned this pull request Dec 25, 2025

Add --collapse-root-models-name-strategy option #2791

Merged
