Preserve Python types (Set, Mapping, Sequence) in --input-model#2837
Preserve Python types (Set, Mapping, Sequence) in --input-model#2837
Conversation
|
Warning Rate limit exceeded@koxudaxi has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 9 minutes and 24 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📒 Files selected for processing (6)
📝 WalkthroughWalkthroughThis change introduces systematic preservation of Python typing semantics (Set, FrozenSet, Mapping, Sequence) across JSON Schema conversion by annotating schemas with x-python-type metadata during model loading, then propagating and applying these flags during code generation to restore accurate type hints in output models. Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant __main__
participant JsonSchema
participant jsonschema parser
participant types
participant Output
User->>__main__: Load model (Pydantic/dataclass)
__main__->>__main__: Generate JSON Schema
__main__->>__main__: _serialize_python_type() for each property
__main__->>JsonSchema: Annotate with x-python-type metadata<br/>(Set→"set", Mapping→"Mapping", etc.)
JsonSchema-->>__main__: Augmented schema with x-python-type
__main__->>jsonschema parser: Parse augmented schema
jsonschema parser->>jsonschema parser: _get_python_type_flags(parent_obj)<br/>Extract x-python-type from extras
jsonschema parser->>jsonschema parser: Apply is_set/is_mapping/is_sequence<br/>to data_type construction
jsonschema parser-->>types: Property data_type<br/>with preserved flags
types->>types: type_hint(): Prefer FrozenSet,<br/>Mapping, Sequence based on flags
types-->>Output: Generate code with<br/>correct Python types
Output-->>User: Model with preserved types<br/>(Set[str], Mapping[str, int], etc.)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
CodSpeed Performance ReportMerging #2837 will not alter performanceComparing
|
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #2837 +/- ##
==========================================
- Coverage 99.52% 99.51% -0.01%
==========================================
Files 90 90
Lines 14092 14245 +153
Branches 1674 1697 +23
==========================================
+ Hits 14025 14176 +151
- Misses 36 37 +1
- Partials 31 32 +1
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
bc520e7 to
632e736
Compare
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
src/datamodel_code_generator/types.py (1)
336-347: New container flags are wired correctly, butbase_type_hintignores themThe additions of
is_frozen_set,is_mapping, andis_sequenceare cleanly integrated intoimportsandtype_hint, so emitted hints and imports forFrozenSet,Mapping, andSequenceshould be correct.One behavioral gap:
base_type_hintstill only considersis_list,is_set, andis_dict. For types marked with the new flags,base_type_hintwill effectively return just the inner type (e.g.intinstead ofMapping[str, int]orstrinstead ofSequence[str]). Ifbase_type_hintis ever used downstream for these preserved containers (beyond the current regex/RootModel use cases), this could be surprising.Consider mirroring the
type_hintcontainer wrapping inbase_type_hintforis_frozen_set,is_mapping, andis_sequenceto keep the two accessors consistent, or explicitly documenting thatbase_type_hintmay drop these container wrappers.Also applies to: 489-533, 579-681
src/datamodel_code_generator/parser/jsonschema.py (1)
1226-1234: Clean up unusednoqadirectives flagged by RuffRuff reports unused
noqadirectives here:
- Line 1226:
# noqa: PLR6301on_get_python_type_flags- Line 2458:
# noqa: PLR0912onparse_property_names- Line 2462:
# noqa: FBT001onadditional_propertiesparameterGiven these rules are not enabled in this project’s configuration, the
noqacomments just generate RUF100 noise. Removing them keeps lint output clean without changing behavior.Also applies to: 2458-2464
src/datamodel_code_generator/__main__.py (1)
603-608: Remove unusednoqapragmas on local imports and global mutationRuff’s RUF100 warnings about unused
noqadirectives apply here:
# noqa: PLC0415on the local imports inside_init_preserved_type_origins,_serialize_python_type,_find_models_in_type, and_get_type_hints_safe.# noqa: PLW0603on theglobal _PRESERVED_TYPE_ORIGINSline.Since those rules are not enabled, the
noqaannotations are unnecessary and now themselves cause lint noise. Dropping thesenoqacomments will keep the linter quiet without affecting runtime behavior.Also applies to: 624-624, 635-635, 690-690, 702-702
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
src/datamodel_code_generator/__main__.pysrc/datamodel_code_generator/parser/jsonschema.pysrc/datamodel_code_generator/types.pytests/data/python/input_model/dataclass_models.pytests/data/python/input_model/pydantic_models.pytests/test_input_model.py
🧰 Additional context used
🧬 Code graph analysis (2)
src/datamodel_code_generator/parser/jsonschema.py (1)
src/datamodel_code_generator/types.py (1)
DataType(296-805)
src/datamodel_code_generator/types.py (7)
src/datamodel_code_generator/model/typed_dict.py (1)
imports(164-170)src/datamodel_code_generator/model/pydantic/base_model.py (2)
imports(268-276)field(92-106)src/datamodel_code_generator/model/base.py (3)
imports(324-349)imports(810-815)field(403-405)src/datamodel_code_generator/model/pydantic_v2/types.py (1)
imports(62-67)src/datamodel_code_generator/model/dataclass.py (2)
imports(139-144)field(147-152)src/datamodel_code_generator/model/enum.py (1)
imports(119-121)src/datamodel_code_generator/model/type_alias.py (1)
imports(28-34)
🪛 Ruff (0.14.10)
src/datamodel_code_generator/parser/jsonschema.py
1226-1226: Unused noqa directive (non-enabled: PLR6301)
Remove unused noqa directive
(RUF100)
2458-2458: Unused noqa directive (non-enabled: PLR0912)
Remove unused noqa directive
(RUF100)
2462-2462: Unused noqa directive (non-enabled: FBT001)
Remove unused noqa directive
(RUF100)
src/datamodel_code_generator/__main__.py
603-603: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
604-604: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
605-605: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
606-606: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
607-607: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
608-608: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
624-624: Unused noqa directive (non-enabled: PLW0603)
Remove unused noqa directive
(RUF100)
635-635: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
690-690: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
702-702: Unused noqa directive (non-enabled: PLC0415)
Remove unused noqa directive
(RUF100)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
- GitHub Check: py312-black24 on Ubuntu
- GitHub Check: 3.10 on Windows
- GitHub Check: py312-isort6 on Ubuntu
- GitHub Check: 3.11 on Windows
- GitHub Check: 3.12 on Windows
- GitHub Check: 3.13 on Windows
- GitHub Check: 3.14 on Windows
- GitHub Check: Analyze (python)
- GitHub Check: benchmarks
🔇 Additional comments (5)
tests/data/python/input_model/dataclass_models.py (1)
3-25: Dataclass fixture cleanly exercises the preserved Python container typesThe new
DataclassWithPythonTypes(and associated imports) correctly coversSet,FrozenSet,Mapping, andSequenceusing standard-library types and__future__annotations. This is a good, minimal fixture for the new--input-modelbehavior and aligns with the Pydantic model counterparts.src/datamodel_code_generator/parser/jsonschema.py (1)
1226-1253: x-python-type flag handling is well-scoped and keeps existing behavior as a fallbackThe new
_get_python_type_flagshelper and its use in:
parse_property_names(viaparent_obj),- the
additionalPropertiesobject branch inparse_item, andparse_array_fieldsfor arrayscleanly thread
x-python-typethrough toDataTypeflags (is_set,is_frozen_set,is_mapping,is_sequence) without disturbing the previous defaults:
- When
x-python-typeis absent or not a string, you still fall back to{"is_dict": True}or{"is_list": True}.- For tuples (
is_tuple=True), you intentionally skip container overrides, preserving the existing tuple handling.- Using
parent_objforpropertyNamesis sensible, since the extension describes the container rather than the key-schema itself.This looks consistent with the
DataTypechanges and should be backwards compatible for schemas withoutx-python-type.Also applies to: 2458-2539, 2621-2630, 2705-2715, 2839-2841
src/datamodel_code_generator/__main__.py (1)
597-760: Python-type preservation pipeline for--input-modelis well factoredThe new helpers:
_init_preserved_type_origins/_get_preserved_type_origins_serialize_python_typeand_simple_type_name_collect_nested_models/_find_models_in_type_get_type_hints_safe_add_python_type_to_properties,_add_python_type_info, and_add_python_type_info_genericare cohesive and safely scoped:
- You only touch schemas produced by
BaseModel.model_json_schema()andTypeAdapter(obj).json_schema(), so non---input-modelflows are unaffected.- Container origins are mapped for both builtins (
set,frozenset) and ABCs (Mapping,Sequence,Mutable*variants), which matches the imports used in your test fixtures.- Nested models in
$defsare handled viamodel_fieldsand recursive discovery, ensuring that things likeTag.values: FrozenSet[str]getx-python-typeeven when defined as a referenced definition.- The dataclass/TypedDict path via
_add_python_type_info_genericis a reasonable, simpler approximation that still captures per-field containers viaget_type_hints.Given the parser changes, this should provide the end-to-end preservation you’re testing (Set/FrozenSet/Mapping/Sequence in both top-level and nested Pydantic models, plus dataclasses/TypedDict).
Also applies to: 762-867
tests/data/python/input_model/pydantic_models.py (1)
3-41: Pydantic fixtures comprehensively cover preserved type scenariosThe new
Tag,ModelWithPythonTypes, andRecursiveNodemodels are well chosen:
- They mirror the dataclass fixtures and exercise
Set,FrozenSet,Mapping,Sequence, nestedMapping[str, Set[int]], and recursiveSetusage.from __future__ import annotationsmakes the recursivechildren: Optional[list[RecursiveNode]]safe for Pydantic v2.These look like solid inputs for the new
--input-modeltests.tests/test_input_model.py (1)
464-558: New--input-modeltests accurately validate preserved container typesThe added “x-python-type preservation” tests:
- Exercise Set/FrozenSet/Mapping/Sequence on Pydantic models (
ModelWithPythonTypes) and confirm they survive into generated code.- Verify that preservation holds when targeting
typing.TypedDictanddataclasses.dataclass.- Cover dataclass input (
DataclassWithPythonTypes) and a recursive model (RecursiveNode), which is important for the nested-model collection logic.The use of
SKIP_PYDANTIC_V1and the sharedrun_input_model_and_asserthelper keeps these tests consistent with the rest of the suite. The expectations ("set[str]","frozenset[int]","Mapping[str, int]","Sequence[str]") match the intended output under the new pipeline.
632e736 to
d01a0b1
Compare
Breaking Change AnalysisResult: Breaking changes detected Reasoning: This PR changes the generated code output when using the Content for Release NotesCode Generation Changes
Custom Template Update Required
This analysis was performed by Claude Code Action |
|
🎉 Released in 0.51.0 This PR is now available in the latest release. See the release notes for details. |
Summary by CodeRabbit
New Features
Tests
✏️ Tip: You can customize this high-level summary in your review settings.