Support $recursiveRef/$dynamicRef in JSON Schema and OpenAPI #2982
📝 Walkthrough
This PR adds JSON Schema recursive (
Changes
Sequence Diagram(s)
sequenceDiagram
participant Loader as Schema Loader
participant Parser as JSONSchemaParser
participant IndexBuilder as _build_anchor_indexes
participant Resolver as _resolve_recursive/_dynamic_ref
participant ItemParser as parse_item
Loader->>Parser: load root schema (_parse_file / _parse_raw_obj)
Parser->>IndexBuilder: _build_anchor_indexes(obj, path)
IndexBuilder->>IndexBuilder: traverse definitions and anchors
IndexBuilder-->>Parser: populate _recursive_anchor_index and _dynamic_anchor_index
Parser->>ItemParser: parse_item(item, path)
ItemParser->>Resolver: item.recursiveRef? -> _resolve_recursive_ref(item, path)
Resolver->>Resolver: lookup in _recursive_anchor_index[path] -> resolved $ref?
Resolver-->>ItemParser: return resolved $ref or fallback
ItemParser->>Resolver: item.dynamicRef? -> _resolve_dynamic_ref(item)
Resolver->>Resolver: lookup in _dynamic_anchor_index -> resolved $ref?
Resolver-->>ItemParser: return resolved $ref or fallback
ItemParser->>Parser: continue standard parsing with resolved $ref
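The index-then-resolve flow in the diagram above can be sketched roughly as follows. This is a simplified illustration, not the parser's actual code: `build_dynamic_anchor_index`, `resolve_dynamic_ref`, the plain-dict schema, and the "return the reference unchanged" fallback are all assumptions made for the example.

```python
# Hypothetical, simplified stand-in for the parser's _dynamic_anchor_index.
# Maps a root key (path parts) to {anchor name: resolved $ref}.
DynamicAnchorIndex = dict[tuple[str, ...], dict[str, str]]


def build_dynamic_anchor_index(schema: dict, root_key: tuple[str, ...]) -> DynamicAnchorIndex:
    """Record the root's $dynamicAnchor and those of its $defs entries."""
    index: DynamicAnchorIndex = {}
    anchor = schema.get("$dynamicAnchor")
    if anchor:
        index.setdefault(root_key, {}).setdefault(anchor, "#")
    for name, definition in schema.get("$defs", {}).items():
        anchor = definition.get("$dynamicAnchor")
        if anchor:
            # First registration wins, so an earlier anchor is never overwritten.
            index.setdefault(root_key, {}).setdefault(anchor, f"#/$defs/{name}")
    return index


def resolve_dynamic_ref(dynamic_ref: str, index: DynamicAnchorIndex, root_key: tuple[str, ...]) -> str:
    """Turn '#name' into a plain $ref when the anchor is indexed.

    Assumption: unknown anchors fall back to the original reference string.
    """
    name = dynamic_ref.lstrip("#")
    return index.get(root_key, {}).get(name, dynamic_ref)
```

Once a `$dynamicRef` has been rewritten to an ordinary `$ref`, the rest of the parsing can proceed as it would for any other reference, which is the last step the diagram shows.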
sequenceDiagram
participant ParserLogic as base.py logic
participant Field as field.data_type
participant Guard as Any-like guard
participant Discriminator as discriminator applier
ParserLogic->>Guard: __apply_discriminator_type(field)
Guard->>Field: inspect field.data_type.data_types for ANY-like variants
alt Any-like variant present
Guard->>Field: remove discriminator extras and clear discriminator
Guard-->>ParserLogic: skip discriminator application
else No Any-like variants
Guard->>Discriminator: apply standard discriminator mapping
Discriminator-->>ParserLogic: discriminator set on union
end
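The guard in the diagram above can be illustrated with a small sketch. The `DataType` dataclass, the `ANY` constant, and `apply_discriminator` here are hypothetical stand-ins for the project's own types; only the any-like predicate follows the expression quoted in the review comments.

```python
from dataclasses import dataclass, field
from typing import Optional

ANY = "Any"  # stand-in for the project's ANY constant (assumption)


@dataclass
class DataType:
    """Minimal stand-in exposing only the attributes the guard inspects."""
    type: Optional[str] = None
    reference: Optional[str] = None
    data_types: list["DataType"] = field(default_factory=list)
    literals: list[str] = field(default_factory=list)


def has_any_like_variant(data_types: list[DataType]) -> bool:
    """True if any union member is Any or effectively empty (non-discriminable)."""
    return any(
        dt.type == ANY or (not dt.reference and not dt.data_types and not dt.literals and not dt.type)
        for dt in data_types
    )


def apply_discriminator(members: list[DataType], discriminator: str) -> Optional[str]:
    """Return the discriminator to set on the union, or None when it must be skipped."""
    if has_any_like_variant(members):
        return None  # skip: an Any-like member cannot carry a discriminator field
    return discriminator
```

A union of two referenced models keeps its discriminator, while the same union with an extra `Any` member has it cleared.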
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~55 minutes
🚥 Pre-merge checks: ✅ 5 passed
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@docs/supported_formats.md`:
- Around line 41-43: The table header row for the new "Recursive/Dynamic
References" section has only one cell but the table requires six columns; update
the header row (the line containing "**Recursive/Dynamic References**") to
include six pipe-separated header cells (e.g., repeat or name the six column
headers or use empty cells) so the header has the same column count as the
subsequent rows and MD056 is resolved; ensure the header separator row uses the
same number of columns as well.
In `@src/datamodel_code_generator/parser/jsonschema.py`:
- Around line 1607-1645: The function _resolve_recursive_ref currently
initializes best to anchors[0], which can return a non-enclosing anchor when no
prefix match exists; change the initialization to set best = "#" (the safe
fallback) and best_len = 0, then keep the existing loop logic that updates best
and best_len only when an anchor is a proper prefix match so the function
returns "#" unless a valid enclosing anchor is found (ensure you still handle
the special case for anchor_ref == "#" as before).
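A minimal sketch of the suggested fix, assuming anchors and the current location are both JSON-pointer-style strings. The function name and signature are illustrative rather than the real `_resolve_recursive_ref`, and skipping `"#"` inside the loop is one possible way to treat the root anchor as the fallback.

```python
def resolve_recursive_ref(anchors: list[str], current_ref: str) -> str:
    """Return the innermost anchor that encloses current_ref, or '#' if none does."""
    best = "#"    # safe fallback: the document root
    best_len = 0
    for anchor_ref in anchors:
        if anchor_ref == "#":
            continue  # the root encloses everything and is already the fallback
        # An anchor encloses the location if it equals it or is a path-segment prefix.
        encloses = current_ref == anchor_ref or current_ref.startswith(anchor_ref + "/")
        if encloses and len(anchor_ref) > best_len:
            best, best_len = anchor_ref, len(anchor_ref)
    return best
```

Starting from `"#"` instead of `anchors[0]` means an unrelated anchor can never be returned just because it happened to be registered first.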
🧹 Nitpick comments (4)
src/datamodel_code_generator/parser/base.py (1)
1983-1995: Guard in `__collapse_root_models` correctly mirrors the discriminator guard.
The logic correctly wraps the discriminator assignment in a `not has_any_variant` check, preventing invalid discriminators when Any-like variants are present in the collapsed root model's data types.
Note: the `has_any_variant` detection expression is identical to the one at lines 1456–1459. If a third call site is added, extract a shared `_has_any_variant(data_types)` helper.
♻️ Optional: Extract shared helper

    +    @staticmethod
    +    def _has_any_variant(data_types: list[DataType]) -> bool:
    +        """Check if any data type in the list is Any or effectively empty (non-discriminable)."""
    +        return any(
    +            dt.type == ANY or (not dt.reference and not dt.data_types and not dt.literals and not dt.type)
    +            for dt in data_types
    +        )

Then replace both inline checks with `self._has_any_variant(...)`.
src/datamodel_code_generator/parser/jsonschema.py (3)
369-372: Unused `noqa` directives on new fields.
Ruff reports that the `noqa` directives `N815` and `UP045` on lines 369–372 are non-enabled rules, so the `# noqa:` comments are unnecessary. Consider removing them to keep the suppressions consistent with the active ruleset, or enable those rules if suppression was intentional.
1587-1605: Redundant guard before `setdefault` on `_dynamic_anchor_index`.
The `if root_key not in self._dynamic_anchor_index` check on line 1603 can be folded into a chained `setdefault` call, which creates the inner dict when the key is missing. This also matches `_recursive_anchor_index` (line 1601), which uses `setdefault` directly without a guard.
Suggested simplification

     if obj.dynamicAnchor:
    -    if root_key not in self._dynamic_anchor_index:  # pragma: no cover
    -        self._dynamic_anchor_index[root_key] = {}
    -    self._dynamic_anchor_index[root_key].setdefault(obj.dynamicAnchor, ref_path)
    +    self._dynamic_anchor_index.setdefault(root_key, {}).setdefault(obj.dynamicAnchor, ref_path)
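To show that the chained form behaves like the guarded one, here is a small standalone sketch. The two helper functions and the plain-dict index are illustrative only and are not taken from the parser.

```python
def register_guarded(index: dict, root_key: tuple, anchor: str, ref_path: str) -> None:
    """Explicit membership check followed by setdefault on the inner dict."""
    if root_key not in index:
        index[root_key] = {}
    index[root_key].setdefault(anchor, ref_path)


def register_chained(index: dict, root_key: tuple, anchor: str, ref_path: str) -> None:
    """Single expression: the outer setdefault creates the inner dict when missing."""
    index.setdefault(root_key, {}).setdefault(anchor, ref_path)
```

Both versions keep the first registration for a given anchor name, so a later call with a different path does not overwrite it.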
4156-4191: Anchor index pre-population in `_parse_file` duplicates logic from `_build_anchor_indexes`.
The manual anchor indexing here (for both root and definitions) repeats the same pattern as `_build_anchor_indexes`. Pre-populating before parsing begins is necessary, but the duplication is fragile: if the indexing logic changes in `_build_anchor_indexes`, these blocks must be updated in lockstep.
Consider extracting the root/definition anchor registration into a shared helper, or calling `_build_anchor_indexes` with the appropriate path for each definition during the pre-scan loop (lines 4176–4191) instead of inlining the logic.
Generated by GitHub Actions
CodSpeed Performance Report
Merging this PR will not alter performance
|
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
docs/supported_formats.md (1)
175-181: ⚠️ Potential issue | 🟡 Minor
Supported features listed under "Unsupported Features" heading is misleading.
Lines 180-181 now show `$recursiveRef`/`$recursiveAnchor` and `$dynamicRef`/`$dynamicAnchor` as "✅ Supported" in a table whose section heading is "JSON Schema - Unsupported Features". This is confusing for readers scanning the limitations section. Consider either removing these rows (since they're already listed as supported in the Feature Compatibility Matrix above) or moving them to a separate "Recently Added Support" subsection.
🧹 Nitpick comments (3)
src/datamodel_code_generator/parser/base.py (1)
1479-1487: Any-like variant guard in discriminator application is sound.
This correctly prevents a Pydantic v2 `PydanticUserError` when a discriminated union includes an `Any`-like variant (which cannot have a discriminator field). The guard checks both explicit `ANY` types and "empty" data types that lack any reference, nested types, literals, or type; all of these would be non-discriminable.
One observation: the condition on lines 1480-1483 is duplicated in `__collapse_root_models` (lines 2007-2011). Consider extracting a helper to keep this in sync.
♻️ Optional: extract shared helper

    +    @staticmethod
    +    def _has_any_like_variant(data_types: list[DataType]) -> bool:
    +        """Check if any data type in a union is Any-like and cannot participate in a discriminated union."""
    +        return any(
    +            dt.type == ANY or (not dt.reference and not dt.data_types and not dt.literals and not dt.type)
    +            for dt in data_types
    +        )

Then use `self._has_any_like_variant(field.data_type.data_types)` in both call sites.
src/datamodel_code_generator/parser/jsonschema.py (2)
369-372: Unused `noqa` directives on new field declarations.
Ruff reports that the `noqa: N815, UP045` directives on lines 369–372 are for non-enabled rules. Remove them to keep lint suppression accurate and avoid masking future issues.

    -    recursiveRef: Optional[str] = Field(default=None, alias="$recursiveRef")  # noqa: N815, UP045
    -    recursiveAnchor: Optional[bool] = Field(default=None, alias="$recursiveAnchor")  # noqa: N815, UP045
    -    dynamicRef: Optional[str] = Field(default=None, alias="$dynamicRef")  # noqa: N815, UP045
    -    dynamicAnchor: Optional[str] = Field(default=None, alias="$dynamicAnchor")  # noqa: N815, UP045
    +    recursiveRef: Optional[str] = Field(default=None, alias="$recursiveRef")
    +    recursiveAnchor: Optional[bool] = Field(default=None, alias="$recursiveAnchor")
    +    dynamicRef: Optional[str] = Field(default=None, alias="$dynamicRef")
    +    dynamicAnchor: Optional[str] = Field(default=None, alias="$dynamicAnchor")
4160-4195: Duplicate anchor registration: `_parse_file` inline code duplicates `_build_anchor_indexes`, and definitions get double-appended.
The anchor-building logic here (lines 4160–4195) duplicates what `_build_anchor_indexes` already does. For definitions, this code registers anchors, and then `parse_raw_obj` (line 4207) calls `_build_anchor_indexes` again, causing duplicate entries in `_recursive_anchor_index` (since it uses `list.append`). While not a correctness bug (the prefix-matching algorithm tolerates duplicates), this is a DRY violation that could diverge over time.
Consider delegating to `_build_anchor_indexes` here instead of inlining the logic:
♻️ Sketch: replace inline anchor building with method calls

     self.parse_id(root_obj, path_parts)
    -# Build $recursiveAnchor index for root object
    -if root_obj.recursiveAnchor:
    -    root_key = tuple(path_parts)
    -    self._recursive_anchor_index.setdefault(root_key, []).append("#")
    -# Build $dynamicAnchor index for root object
    -if root_obj.dynamicAnchor:
    -    root_key = tuple(path_parts)
    -    if root_key not in self._dynamic_anchor_index:
    -        self._dynamic_anchor_index[root_key] = {}
    -    self._dynamic_anchor_index[root_key].setdefault(root_obj.dynamicAnchor, "#")
    +self._build_anchor_indexes(root_obj, path_parts)

Then for definitions, remove the inline registration (lines 4184–4195) since `parse_raw_obj` at line 4207 already calls `_build_anchor_indexes`. If you keep both call sites, guard `_recursive_anchor_index` against duplicates (e.g., check before appending).
Codecov Report
✅ All modified and coverable lines are covered by tests.
Additional details and impacted files
@@ Coverage Diff @@
## main #2982 +/- ##
==========================================
Coverage 100.00% 100.00%
==========================================
Files 94 94
Lines 17813 17913 +100
Branches 2055 2070 +15
==========================================
+ Hits 17813 17913 +100
|
Breaking Change Analysis
Result: No breaking changes detected
Reasoning: This PR adds support for `$recursiveRef`/`$recursiveAnchor` (JSON Schema 2019-09) and `$dynamicRef`/`$dynamicAnchor` (JSON Schema 2020-12). The change is purely additive: schemas using these features that previously failed or were unsupported now generate valid recursive models. The discriminator change is a bug fix that prevents invalid code generation when Any-like variants exist in unions. No CLI options, API changes, default behavior changes, or template changes are required. Existing schemas without these features continue to work identically.
This analysis was performed by Claude Code Action
|
🎉 Released in 0.54.0
This PR is now available in the latest release. See the release notes for details.
Fixes: #2972