Fix allOf with $ref to root model losing constraints#2676
Conversation
WalkthroughAdds root-model-aware allOf handling to the JSON Schema parser: new helpers detect root primitive schemas, merge primitive constraints per AllOfMergeMode, and short-circuit parse_all_of to produce constrained root models when an allOf mixes a $ref with extra constraints. Tests and expected outputs for both merge modes were added. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (5 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
CodSpeed Performance ReportMerging #2676 will not alter performanceComparing Summary
Benchmarks breakdown
Footnotes
|
…' into fix/allof-root-model-constraints
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2676 +/- ##
========================================
Coverage 99.51% 99.52%
========================================
Files 79 79
Lines 11001 11162 +161
Branches 1318 1347 +29
========================================
+ Hits 10948 11109 +161
Misses 32 32
Partials 21 21
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
…' into fix/allof-root-model-constraints
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (2)
src/datamodel_code_generator/parser/jsonschema.py (2)
1494-1510: Remove unusednoqadirective.The logic correctly identifies root model schemas by excluding arrays, compositions (allOf/oneOf/anyOf), objects with properties, pattern properties, and enums. This aligns with the
parse_raw_objmethod's routing logic.Apply this diff to remove the unused
noqadirective:- def _is_root_model_schema(self, obj: JsonSchemaObject) -> bool: # noqa: PLR6301 + def _is_root_model_schema(self, obj: JsonSchemaObject) -> bool:Based on static analysis,
PLR6301is not enabled in your Ruff configuration.
1512-1573: Remove unusednoqadirective.The method correctly handles the core fix for issue #1901:
- Filters out inline properties (only applies to named schema definitions)
- Identifies allOf with exactly one
$refto a root model- Collects additional constraint items from other allOf members
- Validates type compatibility (allowing integer/number cross-compatibility)
- Merges constraints using
_merge_primitive_schemas_for_allof- Generates a root model with the merged constraints
The logic correctly excludes schemas with
propertiesoritemsfrom constraint items (line 1550), ensuring this handler only applies to primitive types with constraints.Apply this diff to remove the unused
noqadirective:- def _handle_allof_root_model_with_constraints( # noqa: PLR0911, PLR0912 + def _handle_allof_root_model_with_constraints(Based on static analysis,
PLR0911andPLR0912are not enabled in your Ruff configuration.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
src/datamodel_code_generator/parser/jsonschema.py(4 hunks)tests/data/expected/main/jsonschema/allof_root_model_constraints.py(1 hunks)tests/data/expected/main/jsonschema/allof_root_model_constraints_merge.py(1 hunks)tests/data/jsonschema/allof_root_model_constraints.json(1 hunks)tests/main/jsonschema/test_main_jsonschema.py(1 hunks)tests/parser/test_jsonschema.py(3 hunks)
🧰 Additional context used
🧬 Code graph analysis (4)
tests/main/jsonschema/test_main_jsonschema.py (1)
tests/main/conftest.py (2)
output_file(94-96)run_main_and_assert(196-352)
tests/parser/test_jsonschema.py (4)
src/datamodel_code_generator/__init__.py (1)
AllOfMergeMode(273-283)src/datamodel_code_generator/parser/jsonschema.py (6)
JsonSchemaObject(186-450)parse_obj(2955-2996)_is_root_model_schema(1494-1510)_merge_primitive_schemas_for_allof(1165-1194)_handle_allof_root_model_with_constraints(1512-1573)_load_ref_schema_object(1109-1120)src/datamodel_code_generator/model/pydantic/base_model.py (1)
Constraints(35-47)src/datamodel_code_generator/model/pydantic_v2/base_model.py (1)
Constraints(43-61)
src/datamodel_code_generator/parser/jsonschema.py (1)
src/datamodel_code_generator/__init__.py (1)
AllOfMergeMode(273-283)
tests/data/expected/main/jsonschema/allof_root_model_constraints_merge.py (1)
tests/data/expected/main/jsonschema/allof_root_model_constraints.py (3)
StringDatatype(12-15)ConstrainedStringDatatype(18-21)Model(76-87)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/parser/jsonschema.py
1494-1494: Unused noqa directive (non-enabled: PLR6301)
Remove unused noqa directive
(RUF100)
1512-1512: Unused noqa directive (non-enabled: PLR0911, PLR0912)
Remove unused noqa directive
(RUF100)
🔇 Additional comments (13)
tests/data/jsonschema/allof_root_model_constraints.json (1)
1-110: Comprehensive test schema for allOf root model constraints.The schema provides excellent coverage of edge cases including:
- Basic string/integer constraints (lines 4-33)
- Format constraints like email (lines 34-40)
- Object inheritance vs root model (lines 41-53)
- Multiple refs, no constraints, incompatible types (lines 54-73)
- Constraints with properties/items that should fall back (lines 74-87)
- Number/integer type compatibility (lines 88-94)
This test data aligns well with the PR objective to fix issue #1901.
tests/data/expected/main/jsonschema/allof_root_model_constraints_merge.py (2)
18-21: Merged pattern uses lookaheads correctly.The regex
r'(?=^\S(.*\S)?$)(?=^[A-Z].*)'correctly combines the base pattern (^\S(.*\S)?$) and child pattern (^[A-Z].*) using positive lookaheads. This ensures both constraints must be satisfied, which is the semantically correct behavior for allOf merging.
50-67: Empty class bodies for fallback cases are expected.
MultiRefAllOf,NoConstraintAllOf,IncompatibleTypeAllOf, andConstraintWithItemscorrectly fall back to emptypassbodies when the allOf root model handler cannot process them (multiple refs, no constraints, incompatible types, or items constraint). This matches the handler's documented behavior of returningNoneto let existing logic handle these cases.tests/data/expected/main/jsonschema/allof_root_model_constraints.py (2)
18-21: NoMerge mode correctly uses only child pattern.In non-merge mode,
ConstrainedStringDatatypeuses onlyr'^[A-Z].*'(the child constraint) rather than merging with the base pattern. This correctly demonstrates theAllOfMergeMode.NoMergebehavior where constraints overwrite rather than combine.
1-87: Expected output structure is correct.The file correctly represents the expected generated code for the allOf root model constraints test case with
--allof-merge-mode none. All type mappings (EmailStr, conint, constr) and model structures match the schema definitions.tests/parser/test_jsonschema.py (4)
14-14: New imports added for testing allOf root model handling.The imports of
AllOfMergeModeandSPECIAL_PATH_MARKERare necessary for the new test cases validating the allOf merge behavior and special path handling.Also applies to: 26-26
935-954: Comprehensive parametrized test for_is_root_model_schema.The test covers all conditions in the helper method: arrays, allOf/oneOf/anyOf, properties, patternProperties, object type, enum, and valid root model cases (plain string, string with constraints). This provides good coverage of the detection logic.
957-999: Tests for_merge_primitive_schemas_for_allofcover key scenarios.The tests verify:
- Single item passthrough (lines 957-962)
- NoMerge mode constraint handling (lines 965-975)
- Format preservation in both NoMerge and Constraints modes (lines 978-999)
This validates the merge behavior respects
allof_merge_modesettings correctly.
1002-1111: Edge case tests for_handle_allof_root_model_with_constraints.The tests comprehensively cover the early-return conditions:
- SPECIAL_PATH_MARKER in path (lines 1002-1013)
- Multiple refs (lines 1016-1026)
- No refs (lines 1029-1039)
- No constraint items (lines 1042-1052)
- Constraint with properties (lines 1055-1066)
- Constraint with items (lines 1069-1080)
- Incompatible types (lines 1083-1094)
- Ref to non-root model (lines 1097-1111)
Using lambda to mock
_load_ref_schema_objectis appropriate for isolating the unit under test.tests/main/jsonschema/test_main_jsonschema.py (1)
4297-4320: Well-structured integration tests for allOf root model constraints.Both tests follow the established patterns in this file:
@pytest.mark.benchmarkdecorator for performance tracking- Clear docstrings referencing issue #1901
run_main_and_assertwith appropriate parameters- Testing both merge modes (
noneandconstraints) for complete coverageThe tests will validate the end-to-end behavior of the fix for the issue where allOf combining a
$refto a root model with additional constraints would generate an empty class body.src/datamodel_code_generator/parser/jsonschema.py (3)
63-63: LGTM!The import of
SPECIAL_PATH_MARKERis necessary for the new root model constraint handling logic.
1984-1987: LGTM!The integration correctly adds an early return path for root model with constraints. The placement is optimal:
- After attempting to merge allOf objects (line 1973)
- Before the general allOf parsing logic (line 1989+)
This ensures the specific root-model-with-constraints case is handled before falling back to general object inheritance logic.
1165-1194: Implementation correctly respectsallof_merge_modesetting with appropriate test coverage.The method properly handles both merge modes:
- Non-NoMerge modes (lines 1170-1176): Uses intersection logic via
_merge_primitive_schemas, then preserves format from any item that defines it- NoMerge mode (lines 1178-1193): Takes base type and overlays constraint fields without intersection (last value wins), with format preservation
This correctly addresses the allOf use case where constraints should be added/overridden rather than intersected. Comprehensive test cases validate the behavior:
test_merge_primitive_schemas_for_allof_nomerge_mode,test_merge_primitive_schemas_for_allof_nomerge_mode_with_format, andtest_merge_primitive_schemas_for_allof_constraints_mode_with_formatconfirm expected constraint merging and format handling.
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
src/datamodel_code_generator/parser/jsonschema.py (3)
1494-1494: Remove unusednoqadirective.The static analysis tool correctly identified that the
noqadirective forPLR6301is unused. Remove it to keep the code clean.Apply this diff:
- def _is_root_model_schema(self, obj: JsonSchemaObject) -> bool: # noqa: PLR6301 + def _is_root_model_schema(self, obj: JsonSchemaObject) -> bool:
1512-1512: Remove unusednoqadirectives.The static analysis tool correctly identified unused
noqadirectives forPLR0911andPLR0912. Remove them to keep the code clean.Apply this diff:
- def _handle_allof_root_model_with_constraints( # noqa: PLR0911, PLR0912 + def _handle_allof_root_model_with_constraints(
1554-1559: Consider expanding type compatibility beyond integer/number.The type compatibility check currently only handles integer/number pairs:
compatible_type_pairs = { ("integer", "number"), ("number", "integer"), }While this is correct for the current use case, consider whether other type compatibility scenarios should be supported (e.g., number subtypes, string format variations). This could be left as a future enhancement if not needed now.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/datamodel_code_generator/parser/jsonschema.py(4 hunks)tests/data/expected/main/jsonschema/allof_root_model_constraints.py(1 hunks)tests/data/expected/main/jsonschema/allof_root_model_constraints_merge.py(1 hunks)tests/data/jsonschema/allof_root_model_constraints.json(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/datamodel_code_generator/parser/jsonschema.py (2)
src/datamodel_code_generator/reference.py (2)
reference(77-79)is_url(988-990)src/datamodel_code_generator/__init__.py (1)
AllOfMergeMode(273-283)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/parser/jsonschema.py
1494-1494: Unused noqa directive (non-enabled: PLR6301)
Remove unused noqa directive
(RUF100)
1512-1512: Unused noqa directive (non-enabled: PLR0911, PLR0912)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
- GitHub Check: 3.10 on Windows
- GitHub Check: 3.9 on Windows
- GitHub Check: 3.12 on macOS
- GitHub Check: 3.14 on macOS
- GitHub Check: 3.12 on Windows
- GitHub Check: 3.13 on Windows
- GitHub Check: 3.9 on macOS
- GitHub Check: 3.11 on Windows
- GitHub Check: 3.14 on Windows
- GitHub Check: Analyze (python)
- GitHub Check: benchmarks
🔇 Additional comments (6)
tests/data/jsonschema/allof_root_model_constraints.json (1)
1-197: LGTM! Comprehensive test schema.The JSON Schema test file provides excellent coverage of allOf scenarios with root models and constraints, including edge cases like incompatible types, empty constraints, nested allOf, and various primitive type combinations.
src/datamodel_code_generator/parser/jsonschema.py (2)
63-63: LGTM! Import addition is necessary.The
SPECIAL_PATH_MARKERimport is correctly added and used in the new_handle_allof_root_model_with_constraintsmethod to filter out inline properties.
1985-1987: LGTM! Integration point is well-placed.The call to
_handle_allof_root_model_with_constraintsis correctly positioned after object merging but before the inheritance-based processing, allowing the new root model path to take precedence when applicable while preserving existing behavior for object inheritance.tests/data/expected/main/jsonschema/allof_root_model_constraints.py (2)
50-67: Expected empty classes for edge cases.The empty classes like
MultiRefAllOf,NoConstraintAllOf, andIncompatibleTypeAllOfare expected outputs for edge cases where:
- Multiple refs (not handled by new code)
- No constraints added beyond the ref
- Incompatible types in allOf
This correctly reflects that these schemas don't benefit from the new root model constraint handling.
134-153: LGTM! Model class correctly aggregates all types.The
Modelclass properly references all the defined constraint types with appropriate optional field declarations.tests/data/expected/main/jsonschema/allof_root_model_constraints_merge.py (1)
18-21: Merged regex pattern uses positive lookahead for intersection.The generated regex
r'(?=^\S(.*\S)?$)(?=^[A-Z].*)'correctly merges two patterns using positive lookahead assertions to enforce both constraints. This aligns with allOf semantics where both patterns must be satisfied.Note: The generated pattern is more complex than hand-written alternatives (e.g.,
r'^[A-Z]\S*(\S| \S+)*$'), but ensures correctness when merging arbitrary patterns.
…ce allOf processing
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (1)
src/datamodel_code_generator/parser/jsonschema.py (1)
1165-1200: Consider documenting format handling behavior.The format handling is improved from the previous concern (multiple different formats now cause the method to return
Noneat line 1172, preventing conflicts), but the behavior could be more explicit:
- When all items share the same format or only one has a format: applied ✓
- When multiple different formats exist: returns
None(parent falls back to other handling) ✓- The distinction between merge modes (intersection vs last-wins for constraints) is clear
Consider adding a docstring note explaining the format handling strategy, especially the conflict resolution approach. This addresses the previous review feedback.
Based on learnings from past review comments about format handling.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/datamodel_code_generator/parser/jsonschema.py(4 hunks)tests/data/expected/main/jsonschema/allof_root_model_constraints.py(1 hunks)tests/data/expected/main/jsonschema/allof_root_model_constraints_merge.py(1 hunks)tests/data/jsonschema/allof_root_model_constraints.json(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/data/jsonschema/allof_root_model_constraints.json
🧰 Additional context used
🧬 Code graph analysis (2)
tests/data/expected/main/jsonschema/allof_root_model_constraints_merge.py (1)
tests/data/expected/main/jsonschema/allof_root_model_constraints.py (30)
StringDatatype(12-15)ConstrainedStringDatatype(18-21)IntegerDatatype(24-25)NonNegativeIntegerDatatype(28-29)BoundedIntegerDatatype(32-35)EmailDatatype(38-39)FormattedStringDatatype(42-43)ObjectBase(46-47)ObjectWithAllOf(50-51)MultiRefAllOf(54-55)NoConstraintAllOf(58-59)IncompatibleTypeAllOf(62-63)ConstraintWithProperties(66-67)ConstraintWithItems(70-71)NumberIntegerCompatible(74-77)RefWithSchemaKeywords(80-83)ArrayDatatype(86-87)RefToArrayAllOf(90-91)ObjectNoPropsDatatype(94-95)RefToObjectNoPropsAllOf(98-99)PatternPropsDatatype(102-103)RefToPatternPropsAllOf(106-107)NestedAllOfDatatype(110-111)RefToNestedAllOfAllOf(114-115)ConstraintsOnlyDatatype(118-119)RefToConstraintsOnlyAllOf(122-123)NoDescriptionAllOf(126-129)EmptyConstraintItemAllOf(132-135)ConflictingFormatAllOf(138-139)Model(142-162)
tests/data/expected/main/jsonschema/allof_root_model_constraints.py (1)
tests/data/expected/main/jsonschema/allof_root_model_constraints_merge.py (3)
StringDatatype(12-15)ConstrainedStringDatatype(18-21)Model(142-162)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/parser/jsonschema.py
1500-1500: Unused noqa directive (non-enabled: PLR6301)
Remove unused noqa directive
(RUF100)
1518-1518: Unused noqa directive (non-enabled: PLR0911, PLR0912)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
- GitHub Check: 3.11 on Windows
- GitHub Check: 3.10 on macOS
- GitHub Check: 3.12 on macOS
- GitHub Check: 3.9 on macOS
- GitHub Check: 3.10 on Windows
- GitHub Check: 3.12 on Windows
- GitHub Check: 3.11 on macOS
- GitHub Check: 3.13 on macOS
- GitHub Check: 3.14 on Windows
- GitHub Check: 3.13 on Windows
- GitHub Check: benchmarks
- GitHub Check: Analyze (python)
🔇 Additional comments (13)
src/datamodel_code_generator/parser/jsonschema.py (4)
63-63: LGTM! Imports support new allOf root model handling.The added imports are necessary for the new functionality that handles allOf with root model constraints.
1500-1516: LGTM! Root model detection logic is correct.The method correctly identifies root primitive schemas by excluding arrays, combined schemas (allOf/oneOf/anyOf), objects with properties/patternProperties, and enums. This aligns with the conditions in
parse_raw_obj.Note: The static analysis warning about the unused
noqadirective at line 1500 is a false positive—the method intentionally remains an instance method for consistency with other parser methods.
1518-1581: Excellent fix for the core issue!This method correctly handles the case where
allOfcombines a$refto a root primitive model with additional constraints. Key strengths:
- Validates single $ref: Lines 1534-1537 ensure exactly one reference
- Type compatibility: Lines 1559-1565 properly allow integer/number coercion
- Constraint collection: Lines 1552-1566 gather additional constraints while filtering out schemas with properties/items
- Proper merging: Lines 1571-1574 delegate to
_merge_primitive_schemas_for_allofrespecting merge mode- Description preservation: Lines 1576-1579 ensure parent description is retained
- Graceful fallback: Multiple
Nonereturns allow otherallOfhandling to take over when this pattern doesn't applyThis directly addresses the issue #1901 where such schemas generated empty class bodies.
Note: The static analysis warning at line 1518 about unused
noqadirectives is a false positive—the method's complexity justifies keeping it readable without splitting.
1992-1995: Clean integration of the fix.The placement after
_merge_all_of_objectand before the general allOf handling is correct. The early return pattern ensures the new handler takes precedence only when applicable, preserving all existing behavior.tests/data/expected/main/jsonschema/allof_root_model_constraints.py (6)
1-40: LGTM! Basic root models are correctly generated.The generated root models properly demonstrate non-merge mode behavior, where allOf constraints are not merged with the referenced base model's constraints. For example,
ConstrainedStringDatatypeshows only the new regex pattern without merging the baseStringDatatypepattern.
46-72: LGTM! Object allOf and edge cases correctly handled.Object-based allOf correctly uses inheritance (
ObjectWithAllOfextendsObjectBase), while empty classes represent legitimate edge cases where allOf merging cannot produce meaningful results (e.g., incompatible types, multiple refs).
80-84: LGTM! Key test case demonstrating the fix for issue #1901.
RefWithSchemaKeywordscorrectly demonstrates the core fix: an allOf combining a $ref to a root model (StringDatatype) with additional schema keywords (minLength,maxLength) now generates a proper root model with merged constraints instead of an empty class body.
110-140: LGTM! Edge cases comprehensively covered.The remaining edge cases properly demonstrate:
- Inheritance for nested allOf structures
Anytype for constraint-only schemas without explicit types- Partial constraint merging where applicable
- Empty classes for irreconcilable conflicts (e.g., format conflicts)
142-162: LGTM! Comprehensive test model aggregator.The
Modelclass effectively aggregates all test datatypes as optional fields, ensuring comprehensive test coverage of the allOf root model constraint scenarios.
86-88: No action needed—ArrayDatatype is correctly generated.ArrayDatatype lacks a
Fieldwrapper not because of inconsistency, but because its JSON schema source has nodescriptionfield. Other root models likeStringDatatypeandIntegerDatatypehave descriptions in the schema and thus receiveField(..., description=...)wrapping in the generated code. The generator correctly reflects the source schema: models with descriptions get Field wrappers; models without descriptions don't. The code is working as intended.tests/data/expected/main/jsonschema/allof_root_model_constraints_merge.py (3)
18-21: LGTM! Merge mode correctly combines regex patterns.
ConstrainedStringDatatypedemonstrates proper constraint merging: the regex patterns from both the baseStringDatatype(^\S(.*\S)?$) and the allOf item (^[A-Z].*) are combined using positive lookahead assertions:(?=^\S(.*\S)?$)(?=^[A-Z].*). This is the correct behavior for merge mode.
1-84: LGTM! Merge and non-merge modes produce identical output where appropriate.Models without multiple regex patterns (e.g.,
StringDatatype,RefWithSchemaKeywords, integer models) correctly produce identical output in both merge and non-merge modes. Only models combining multiple regex patterns show differences, which is the expected behavior.
86-162: LGTM! Remaining models and edge cases correctly identical to non-merge mode.The remaining models, edge cases, and the
Modelaggregator are correctly identical to the non-merge file, as these cases don't involve multiple regex patterns requiring merging.
Summary
allOfcombining a$refto a root model (primitive type with constraints) and additional constraints would generate an empty class body instead of properly merging the constraints.Closes #1901
Summary by CodeRabbit
New Features
New Files
Tests
✏️ Tip: You can customize this high-level summary in your review settings.