
Fix for #3045 #3046

Merged
koxudaxi merged 5 commits into koxudaxi:main from ashipilov:fix-dict-default
Mar 10, 2026

Conversation

@ashipilov
Contributor

@ashipilov ashipilov commented Mar 10, 2026

Fixes #3045 by handling dictionaries in the same way as lists.

Summary by CodeRabbit

  • New Features

    • Generated Pydantic v2 models now support default values for dict fields whose values are model instances, correctly handling empty and pre-populated defaults and validating each entry into the appropriate model type.
  • Tests

    • Added parameterized tests covering empty, non-empty and nullable default dictionaries to verify generation and validation behavior.

@coderabbitai

coderabbitai Bot commented Mar 10, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: ea1297e0-421e-4162-9d7a-0131b04d1c99

📥 Commits

Reviewing files that changed from the base of the PR and between e66aa2f and 9ef313c.

📒 Files selected for processing (2)
  • src/datamodel_code_generator/parser/base.py
  • tests/main/jsonschema/test_main_jsonschema.py

📝 Walkthrough

Walkthrough

Adds correct handling for dict-typed field defaults whose values are Pydantic BaseModel instances. Empty dict defaults now use default_factory=dict (STANDARD_DICT); non-empty dict defaults produce a default_factory lambda that parses and validates each dict value into the BaseModel. Tests and expected outputs for Pydantic v2 were added.
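The two branches described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `render_dict_default_factory` is a hypothetical name, and the sketch assumes STANDARD_DICT renders as the builtin name `dict` and that the parse method for Pydantic v2 is `model_validate`.

```python
from typing import Any

STANDARD_DICT = "dict"  # assumption: the constant renders as the builtin name


def render_dict_default_factory(
    class_name: str, parse_method: str, default: dict[str, Any]
) -> str:
    """Return source text for a default_factory (hypothetical helper)."""
    if not default:
        # Empty default: a plain dict factory is enough.
        return STANDARD_DICT
    # Non-empty default: validate every value into the model at call time.
    return (
        f"lambda: {{k: {class_name}.{parse_method}(v) "
        f"for k, v in {default!r}.items()}}"
    )


empty = render_dict_default_factory("ItemModel", "model_validate", {})
populated = render_dict_default_factory(
    "ItemModel", "model_validate", {"jedi": {"name": "Yoda"}}
)
print(empty)
print(populated)
```

The returned strings are meant to be pasted into generated source as the `default_factory` argument, which is why the helper returns text rather than a callable.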

Changes

| Cohort / File(s) | Summary |
|---|---|
| Core Logic: src/datamodel_code_generator/model/pydantic_base.py | Import STANDARD_DICT and extend _get_default_as_pydantic_model to handle dict-typed fields with BaseModel values: return STANDARD_DICT for empty dict defaults or a lambda that parses/validates each dict value for non-empty defaults. |
| Parser Adjustment: src/datamodel_code_generator/parser/base.py | Treat single-field dict data types as using the existing validated default model factory path when the contained reference supports validated defaults (aligns dict-wrapped single refs with list/ref cases). |
| Expected Generated Models (pydantic v2): tests/data/expected/main/jsonschema/pydantic_v2_model_default_dict_empty.py, tests/data/expected/main/jsonschema/pydantic_v2_model_default_dict_non_empty.py, tests/data/expected/main/jsonschema/pydantic_v2_model_default_nullable_dict_empty.py, tests/data/expected/main/jsonschema/pydantic_v2_model_default_nullable_dict_non_empty.py | New expected output files defining ItemModel and ParentModel variants where dict_with_defaults uses default_factory=dict for empty defaults or a default_factory that builds/validates dict entries into ItemModel for non-empty defaults; includes nullable variants. |
| Tests: tests/main/jsonschema/test_main_jsonschema.py | Adds parameterized test test_main_generate_pydantic_v2_model_default_dict covering empty/non-empty and nullable dict default scenarios for Pydantic v2 model generation. Note: the diff contains a duplicate insertion of the same test block. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

breaking-change-analyzed

Poem

🐰 I hopped through code with twitchy nose,

Found stray dict defaults where confusion grows,
Empty nests now use a proper tray,
Non-empty ones validate and stay,
Hooray — tidy models, off I go to doze! 🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Title check | ❓ Inconclusive | The PR title 'Fix for #3045' is vague and generic, referring only to an issue number without describing what the actual fix addresses or which component is affected. | Use a more descriptive title like 'Fix default dict handling in Pydantic models' to clearly communicate the change in the PR history. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | The PR successfully addresses issue #3045 by handling dictionary defaults correctly in Pydantic code generation, treating them like list defaults with default_factory=dict instead of incorrect validation. |
| Out of Scope Changes check | ✅ Passed | All changes are scoped to fixing dictionary default value handling in Pydantic models; no unrelated modifications were introduced beyond the primary objective. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 85.71% which is sufficient. The required threshold is 80.00%. |



@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (1)
src/datamodel_code_generator/model/pydantic_base.py (1)

158-162: Consider updating the TODO in the loop to handle nested dict types.

The new code at lines 144-156 handles direct dict types, but the loop still has a TODO for dict handling. For complex union types where the dict is nested within data_types, the loop would skip it without processing.

This creates asymmetry - direct dicts are handled, but dicts within unions may not be. Consider whether this TODO should be addressed for completeness, or document why it's intentionally left as-is.
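One way to picture the asymmetry is a lookup that tries the direct dict type first and then each union member. The sketch below uses a hypothetical `SketchType` stand-in rather than the generator's real data type class, and its attribute `value_model` is invented for illustration; only `is_dict` and `data_types` are names taken from the review comment.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class SketchType:
    """Hypothetical stand-in for the generator's data type object."""

    is_dict: bool = False
    value_model: Optional[str] = None  # invented: class name of the dict's value model
    data_types: list["SketchType"] = field(default_factory=list)  # union members


def dict_factory(data_type: SketchType, default: Any) -> Optional[str]:
    """Factory source for one dict type, or None when it does not apply."""
    if not (data_type.is_dict and data_type.value_model and isinstance(default, dict)):
        return None
    if not default:
        return "dict"
    return (
        f"lambda: {{k: {data_type.value_model}.model_validate(v) "
        f"for k, v in {default!r}.items()}}"
    )


def default_factory_for(data_type: SketchType, default: Any) -> Optional[str]:
    """Try the direct dict first, then each union member instead of skipping it."""
    direct = dict_factory(data_type, default)
    if direct is not None:
        return direct
    for member in data_type.data_types:
        nested = dict_factory(member, default)
        if nested is not None:
            return nested
    return None


# A union whose second member is a dict of ItemModel values.
union = SketchType(
    data_types=[SketchType(), SketchType(is_dict=True, value_model="ItemModel")]
)
print(default_factory_for(union, {"a": {"name": "x"}}))
```

Routing both paths through one function is what keeps direct dicts and union-member dicts from drifting apart.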

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/datamodel_code_generator/model/pydantic_base.py` around lines 158 - 162,
The loop over self.data_type.data_types currently skips entries where
data_type.is_dict (TODO) causing nested dicts inside unions to be ignored;
update the loop in the method using self.data_type.data_types to handle nested
dicts the same way as the direct dict branch (the logic implemented earlier at
lines ~144-156) by parsing the dict model to compute the default and not simply
continue when data_type.is_dict is true, using the same helper/parse routine
used for direct dict handling so union-member dict types are correctly
processed.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 3c7cf1df-abf3-44ec-ad64-6fd7a417104b

📥 Commits

Reviewing files that changed from the base of the PR and between 95daa4e and 7ae8de6.

⛔ Files ignored due to path filters (2)
  • tests/data/jsonschema/pydantic_v2_model_default_dict_empty.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/jsonschema/pydantic_v2_model_default_dict_non_empty.json is excluded by !tests/data/**/*.json and included by none
📒 Files selected for processing (4)
  • src/datamodel_code_generator/model/pydantic_base.py
  • tests/data/expected/main/jsonschema/pydantic_v2_model_default_dict_empty.py
  • tests/data/expected/main/jsonschema/pydantic_v2_model_default_dict_non_empty.py
  • tests/main/jsonschema/test_main_jsonschema.py

Comment thread src/datamodel_code_generator/model/pydantic_base.py Dismissed
@codspeed-hq

codspeed-hq Bot commented Mar 10, 2026

Merging this PR will improve performance by 22.66%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

⚡ 10 improved benchmarks
✅ 1 untouched benchmark
⏩ 98 skipped benchmarks [1]

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| WallTime | test_perf_kubernetes_style_pydantic_v2 | 2.8 s | 2.4 s | +19.14% |
| WallTime | test_perf_openapi_large | 3.4 s | 2.8 s | +22.66% |
| WallTime | test_perf_aws_style_openapi_pydantic_v2 | 1.9 s | 1.8 s | +10.29% |
| WallTime | test_perf_graphql_style_pydantic_v2 | 823.2 ms | 740.7 ms | +11.15% |
| WallTime | test_perf_all_options_enabled | 6.6 s | 5.9 s | +12.46% |
| WallTime | test_perf_duplicate_names | 1,093 ms | 951.3 ms | +14.9% |
| WallTime | test_perf_deep_nested | 6.2 s | 5.5 s | +12.77% |
| WallTime | test_perf_large_models_pydantic_v2 | 3.7 s | 3.3 s | +12.79% |
| WallTime | test_perf_complex_refs | 2.3 s | 1.9 s | +18.41% |
| WallTime | test_perf_multiple_files_input | 3.7 s | 3.3 s | +11.04% |
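
The Efficiency column appears consistent with BASE / HEAD - 1 expressed as a percentage. That formula is inferred from the rows above, not taken from CodSpeed documentation, and the displayed timings are rounded, so recomputed values can differ in the last digit. A minimal check, with `efficiency_pct` as a hypothetical helper name:

```python
def efficiency_pct(base: float, head: float) -> float:
    """Relative speedup of HEAD over BASE, in percent (assumed formula)."""
    return (base / head - 1.0) * 100.0


# test_perf_duplicate_names: 1,093 ms on BASE, 951.3 ms on HEAD
duplicate_names = efficiency_pct(1093.0, 951.3)
print(f"{duplicate_names:+.1f}%")
```

For this row the recomputed value lands close to the reported +14.9%. Rows shown in whole or tenth seconds are too coarse to reproduce the two-decimal figures.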

Comparing ashipilov:fix-dict-default (9ef313c) with main (f3f3912)


Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.


@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
src/datamodel_code_generator/model/pydantic_base.py (1)

162-174: Consider extracting shared dict-handling logic.

This block is nearly identical to lines 145-157. While the duplication is consistent with how list handling is structured (direct at 132-144, nested at 176-188), a helper method could reduce this repetition.

Since refactoring would also affect the existing list handling, this can be deferred.

♻️ Optional: Extract helper method
def _get_dict_default_factory(self, data_type_value) -> str | None:
    """Get default factory for dict field with BaseModel values."""
    if (
        data_type_value.reference
        and isinstance(data_type_value.reference.source, BaseModelBase)
        and isinstance(self.default, dict)
    ):
        if not self.default:
            return STANDARD_DICT
        class_name = data_type_value.alias or data_type_value.reference.source.class_name
        return (
            f"lambda :{{k: {class_name}."
            f"{self._PARSE_METHOD}(v) for k, v in {self.default!r}.items()}}"
        )
    return None

Then call this helper in both locations.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/datamodel_code_generator/model/pydantic_base.py` around lines 162 - 174,
Extract the duplicated dict default-factory logic into a helper method (e.g.,
_get_dict_default_factory) that accepts data_type_value and returns either
STANDARD_DICT, the generated lambda string, or None; the helper should check the
same conditions (data_type_value.reference,
isinstance(data_type_value.reference.source, BaseModelBase),
isinstance(self.default, dict)) and use data_type_value.alias or
data_type_value.reference.source.class_name and self._PARSE_METHOD when building
the lambda, then replace the two nearly identical blocks that reference
data_type/data_type_value/self.default with calls to this helper and return its
result when non-None.
tests/main/jsonschema/test_main_jsonschema.py (1)

1938-1950: Add one runtime assertion for the generated defaults.

These tests currently only snapshot the emitted source. Since the regression is about default_factory behavior, I'd also instantiate the generated ParentModel and assert the empty case yields {} and the populated case yields ItemModel values. That makes the regression harder to reintroduce if the expected fixtures are updated alongside the generator.

Example direction
 def test_main_generate_pydantic_v2_model_default_dict_non_empty(tmp_path: Path) -> None:
     ...
     assert_file_content(output_file, "pydantic_v2_model_default_dict_non_empty.py")
+    namespace: dict[str, object] = {}
+    exec(compile(output_file.read_text(encoding="utf-8"), str(output_file), "exec"), namespace)
+    parent = namespace["ParentModel"]()
+    item_model = namespace["ItemModel"]
+    assert parent.dict_with_defaults == {
+        "jedi": item_model(name="Yoda", description="The wise old jedi")
+    }

Also applies to: 1953-1965, 1968-1980, 1983-1995

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/main/jsonschema/test_main_jsonschema.py` around lines 1938 - 1950, Add
runtime assertions to the
test_main_generate_pydantic_v2_model_default_dict_empty test (and the sibling
tests at ranges 1953-1965, 1968-1980, 1983-1995): after generate(...) and
assert_file_content(...), import or load the generated module and instantiate
ParentModel to verify default behavior (e.g., ParentModel() yields
dict_with_defaults == {}), and for the populated-case tests instantiate
ParentModel without arguments and assert that dict_with_defaults contains the
expected ItemModel instances/values; use the generated class names ParentModel
and ItemModel and keep these assertions alongside the existing snapshot checks
to prevent regressions in default_factory handling.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b3622713-6cd0-446a-aa9c-c524d0af2ece

📥 Commits

Reviewing files that changed from the base of the PR and between 7ae8de6 and b5c2369.

⛔ Files ignored due to path filters (2)
  • tests/data/jsonschema/pydantic_v2_model_default_nullable_dict_empty.json is excluded by !tests/data/**/*.json and included by none
  • tests/data/jsonschema/pydantic_v2_model_default_nullable_dict_non_empty.json is excluded by !tests/data/**/*.json and included by none
📒 Files selected for processing (4)
  • src/datamodel_code_generator/model/pydantic_base.py
  • tests/data/expected/main/jsonschema/pydantic_v2_model_default_nullable_dict_empty.py
  • tests/data/expected/main/jsonschema/pydantic_v2_model_default_nullable_dict_non_empty.py
  • tests/main/jsonschema/test_main_jsonschema.py

@koxudaxi koxudaxi merged commit c48bdb2 into koxudaxi:main Mar 10, 2026
36 checks passed
@github-actions
Contributor

Breaking Change Analysis

Result: No breaking changes detected

Reasoning: This PR is a bug fix that adds proper handling for dict fields with model value defaults in Pydantic v2. The previous code had a TODO comment ("TODO: Parse dict model for default") and would skip proper handling, producing incorrect output. The new code produces correct, working code with proper model_validate calls. While generated output will differ for affected schemas, this corrects broken behavior rather than breaking working functionality. The change follows the same pattern already established for list handling.
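
To see why a factory with validation calls behaves differently from a raw dict default, the sketch below evaluates such a factory against a stand-in class. `ItemModel` here is a plain dataclass that imitates the Pydantic v2 `model_validate` classmethod so the example runs without Pydantic; the default data is illustrative, not copied from the test fixtures.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class ItemModel:
    """Stand-in for a generated model; NOT a real Pydantic class."""

    name: str

    @classmethod
    def model_validate(cls, raw: dict[str, Any]) -> "ItemModel":
        # Imitates the Pydantic v2 classmethod of the same name.
        return cls(**raw)


RAW_DEFAULT = {"jedi": {"name": "Yoda"}}  # illustrative default data


def factory() -> dict[str, ItemModel]:
    """What a generated default_factory would do for a non-empty default."""
    return {k: ItemModel.model_validate(v) for k, v in RAW_DEFAULT.items()}


first, second = factory(), factory()
print(first)
```

Each call builds a fresh dict whose values are model instances, so instances of the parent model neither share state nor hold unvalidated raw dicts.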


This analysis was performed by Claude Code Action

@codecov

codecov Bot commented Mar 10, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (f3f3912) to head (9ef313c).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##              main     #3046   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           86        86           
  Lines        18071     18091   +20     
  Branches      2101      2108    +7     
=========================================
+ Hits         18071     18091   +20     

| Flag | Coverage Δ |
|---|---|
| unittests | 100.00% <100.00%> (ø) |


@github-actions
Contributor

🎉 Released in 0.55.0

This PR is now available in the latest release. See the release notes for details.


Development

Successfully merging this pull request may close these issues.

Default values for python dictionaries are incorrectly rendered
