
Support $recursiveRef/$dynamicRef in JSON Schema and OpenAPI#2982

Merged
koxudaxi merged 5 commits into main from fix/recursive-dynamic-ref-support
Feb 10, 2026

@koxudaxi
Owner

@koxudaxi koxudaxi commented Feb 7, 2026

Fixes: #2972

Summary by CodeRabbit

  • New Features

    • Support for JSON Schema recursive ($recursiveRef/$recursiveAnchor) and dynamic ($dynamicRef/$dynamicAnchor) references across relevant drafts and OpenAPI paths; feature flags exposed.
  • Bug Fixes

    • Improved discriminator handling to skip invalid discriminators when Any-like/ambiguous variants are present.
  • Documentation

    • Updated supported formats and feature matrix to reflect recursive/dynamic reference support and refreshed status indicators.
  • Tests

    • Added tests and expected outputs covering recursive/dynamic references, Pydantic v2 variants, and discriminator scenarios.

@coderabbitai

coderabbitai Bot commented Feb 7, 2026

Walkthrough

This PR adds JSON Schema recursive ($recursiveRef/$recursiveAnchor) and dynamic ($dynamicRef/$dynamicAnchor) reference handling via anchor indexes and resolution, introduces schema feature flags for recursive/dynamic refs, and adds guards to skip discriminators when Any-like union variants exist. Tests and expected outputs for Pydantic v1/v2 were added.
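
The walkthrough can be made concrete with a minimal sketch. It is not the code in `jsonschema.py`: it works on plain dicts rather than `JsonSchemaObject`, and the names `build_dynamic_anchor_index` and `resolve_dynamic_ref` are hypothetical. It only illustrates the idea of indexing `$dynamicAnchor` names up front and then rewriting a `$dynamicRef` into an ordinary pointer that the normal `$ref` machinery could follow.

```python
from typing import Any


def build_dynamic_anchor_index(schema: dict[str, Any], pointer: str = "#") -> dict[str, str]:
    """Map each $dynamicAnchor name to the JSON pointer of the schema declaring it."""
    index: dict[str, str] = {}
    anchor = schema.get("$dynamicAnchor")
    if isinstance(anchor, str):
        index.setdefault(anchor, pointer)  # first declaration wins
    # Assumption: only $defs, properties and items are walked in this sketch.
    for keyword in ("$defs", "properties"):
        for name, sub in schema.get(keyword, {}).items():
            if isinstance(sub, dict):
                nested = build_dynamic_anchor_index(sub, f"{pointer}/{keyword}/{name}")
                for key, value in nested.items():
                    index.setdefault(key, value)
    items = schema.get("items")
    if isinstance(items, dict):
        for key, value in build_dynamic_anchor_index(items, f"{pointer}/items").items():
            index.setdefault(key, value)
    return index


def resolve_dynamic_ref(dynamic_ref: str, index: dict[str, str]) -> str:
    """Turn '#name' into the indexed pointer, or return the ref unchanged as a fallback."""
    return index.get(dynamic_ref.lstrip("#"), dynamic_ref)


schema = {
    "$defs": {
        "node": {
            "$dynamicAnchor": "node",
            "type": "object",
            "properties": {"children": {"type": "array", "items": {"$dynamicRef": "#node"}}},
        }
    }
}
index = build_dynamic_anchor_index(schema)
resolved = resolve_dynamic_ref("#node", index)
```

Here `resolved` is `#/$defs/node`, while an unknown name such as `#missing` is returned unchanged, mirroring the "resolved $ref or fallback" step described above.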

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| **Docs**: `docs/supported_formats.md` | Updated feature matrices/tables to mark support for $recursiveRef/$recursiveAnchor and $dynamicRef/$dynamicAnchor, adjusting status indicators. |
| **JSON Schema Parser**: `src/datamodel_code_generator/parser/jsonschema.py` | Added recursiveRef/recursiveAnchor/dynamicRef/dynamicAnchor fields to JsonSchemaObject; added _dynamic_anchor_index and _recursive_anchor_index; implemented _build_anchor_indexes, _resolve_recursive_ref, _resolve_dynamic_ref; integrated anchor-index building and early resolution into parse flow. |
| **Discriminator Handling**: `src/datamodel_code_generator/parser/base.py` | Imported ANY and added guards in __apply_discriminator_type and __collapse_root_models to detect Any-like union variants and remove/skip discriminators for those cases. |
| **Schema Version Features**: `src/datamodel_code_generator/parser/schema_version.py` | Added recursive_ref boolean feature and promoted dynamic_ref to supported in JsonSchema/OpenAPI feature structs; updated factory methods to propagate these flags for relevant versions. |
| **Generated Test Outputs**: `tests/data/expected/main/jsonschema/*recursive_ref*.py`, `tests/data/expected/main/jsonschema/*dynamic_ref*.py`, `tests/data/expected/main/openapi/recursive_ref_discriminator*.py` | Added/generated Pydantic v1/v2 modules demonstrating recursive/dynamic refs and discriminator-based recursive unions with forward-ref handling (update_forward_refs() or model_rebuild()). |
| **Tests — Test Cases**: `tests/main/jsonschema/test_main_jsonschema.py`, `tests/main/openapi/test_main_openapi.py`, `tests/parser/test_schema_version.py`, `tests/parser/test_graphql.py` | Added tests covering recursive/dynamic refs (including in $defs, no-anchor cases, and Pydantic v2 variants) and updated schema feature assertions to include recursive_ref and dynamic_ref. |

Sequence Diagram(s)

sequenceDiagram
    participant Loader as Schema Loader
    participant Parser as JSONSchemaParser
    participant IndexBuilder as _build_anchor_indexes
    participant Resolver as _resolve_recursive/_dynamic_ref
    participant ItemParser as parse_item

    Loader->>Parser: load root schema (_parse_file / _parse_raw_obj)
    Parser->>IndexBuilder: _build_anchor_indexes(obj, path)
    IndexBuilder->>IndexBuilder: traverse definitions and anchors
    IndexBuilder-->>Parser: populate _recursive_anchor_index and _dynamic_anchor_index

    Parser->>ItemParser: parse_item(item, path)
    ItemParser->>Resolver: item.recursiveRef? -> _resolve_recursive_ref(item, path)
    Resolver->>Resolver: lookup in _recursive_anchor_index[path] -> resolved $ref?
    Resolver-->>ItemParser: return resolved $ref or fallback
    ItemParser->>Resolver: item.dynamicRef? -> _resolve_dynamic_ref(item)
    Resolver->>Resolver: lookup in _dynamic_anchor_index -> resolved $ref?
    Resolver-->>ItemParser: return resolved $ref or fallback
    ItemParser->>Parser: continue standard parsing with resolved $ref
sequenceDiagram
    participant ParserLogic as base.py logic
    participant Field as field.data_type
    participant Guard as Any-like guard
    participant Discriminator as discriminator applier

    ParserLogic->>Guard: __apply_discriminator_type(field)
    Guard->>Field: inspect field.data_type.data_types for ANY-like variants
    alt Any-like variant present
        Guard->>Field: remove discriminator extras and clear discriminator
        Guard-->>ParserLogic: skip discriminator application
    else No Any-like variants
        Guard->>Discriminator: apply standard discriminator mapping
        Discriminator-->>ParserLogic: discriminator set on union
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related PRs

  • #2934: Related work touching schema_version and feature flags; closely connected to recursive/dynamic feature additions.
  • #2722: Changes to discriminator handling during parsing; related to the Any-variant discriminator guards.
  • #2890: Adjusts parse_item/reference-resolution flow and special-case ref handling; related to anchor/ref resolution changes.

Suggested labels

breaking-change-analyzed

Poem

🐰 I hopped through anchors, traced each looping thread,

I stitched dynamic paths where recursive refs led.
I nudged Any‑like variants so discriminators rest,
Forward refs rebuilt — the models now nest.
A tiny rabbit cheers for schemas passing the test!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Support $recursiveRef/$dynamicRef in JSON Schema and OpenAPI' accurately describes the main change: adding support for these two JSON Schema keywords across both JSON Schema and OpenAPI parsers. |
| Linked Issues check | ✅ Passed | The PR successfully implements support for $recursiveRef and $dynamicRef, which resolves the core issue #2972 where OpenAI OpenAPI models failed with discriminator-related Pydantic errors due to missing support for recursive/dynamic references. |
| Out of Scope Changes check | ✅ Passed | All changes are directly scoped to supporting $recursiveRef/$dynamicRef: parser enhancements, schema definitions, test coverage, and documentation updates. No unrelated refactoring or feature additions detected. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 95.12% which is sufficient. The required threshold is 80.00%. |



@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In `@docs/supported_formats.md`:
- Around line 41-43: The table header row for the new "Recursive/Dynamic
References" section has only one cell but the table requires six columns; update
the header row (the line containing "**Recursive/Dynamic References**") to
include six pipe-separated header cells (e.g., repeat or name the six column
headers or use empty cells) so the header has the same column count as the
subsequent rows and MD056 is resolved; ensure the header separator row uses the
same number of columns as well.

In `@src/datamodel_code_generator/parser/jsonschema.py`:
- Around line 1607-1645: The function _resolve_recursive_ref currently
initializes best to anchors[0], which can return a non-enclosing anchor when no
prefix match exists; change the initialization to set best = "#" (the safe
fallback) and best_len = 0, then keep the existing loop logic that updates best
and best_len only when an anchor is a proper prefix match so the function
returns "#" unless a valid enclosing anchor is found (ensure you still handle
the special case for anchor_ref == "#" as before).
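
The suggested fix amounts to a longest-enclosing-prefix search that defaults to `#`. The sketch below is a hypothetical stand-alone version: `pick_enclosing_anchor` and the pointer-string representation are assumptions for illustration, not the signature or data model of `_resolve_recursive_ref`.

```python
def pick_enclosing_anchor(anchors: list[str], current: str) -> str:
    """Return the deepest anchor pointer that encloses `current`, else '#'."""
    best, best_len = "#", 0  # safe fallback instead of anchors[0]
    for anchor in anchors:
        if anchor == "#":
            continue  # the root encloses everything and is already the fallback
        # An anchor encloses `current` if it is the same pointer or a parent path.
        if current == anchor or current.startswith(anchor + "/"):
            if len(anchor) > best_len:
                best, best_len = anchor, len(anchor)
    return best


anchors = ["#/$defs/tree", "#/$defs/other"]
inside = pick_enclosing_anchor(anchors, "#/$defs/tree/properties/children/items")
outside = pick_enclosing_anchor(anchors, "#/properties/unrelated")
```

With `best` initialised to `anchors[0]`, the `outside` lookup would have returned `#/$defs/tree` even though that anchor does not enclose the path; with the `#` default it falls back to the document root.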
🧹 Nitpick comments (4)
src/datamodel_code_generator/parser/base.py (1)

1983-1995: Guard in __collapse_root_models correctly mirrors the discriminator guard.

The logic correctly wraps the discriminator assignment in a not has_any_variant check, preventing invalid discriminators when Any-like variants are present in the collapsed root model's data types.

Note: the has_any_variant detection expression is identical to the one at lines 1456–1459. If a third call site is added, extract a shared _has_any_variant(data_types) helper.

♻️ Optional: Extract shared helper
+    @staticmethod
+    def _has_any_variant(data_types: list[DataType]) -> bool:
+        """Check if any data type in the list is Any or effectively empty (non-discriminable)."""
+        return any(
+            dt.type == ANY or (not dt.reference and not dt.data_types and not dt.literals and not dt.type)
+            for dt in data_types
+        )

Then replace both inline checks with self._has_any_variant(...).
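
To see what the predicate accepts and rejects, here is a small runnable sketch. `FakeDataType` is a hypothetical stand-in that exposes only the four attributes the guard reads, and the value of `ANY` is an assumption; neither is taken from `base.py`.

```python
from dataclasses import dataclass, field
from typing import Optional

ANY = "Any"  # assumed stand-in for the constant imported in base.py


@dataclass
class FakeDataType:
    """Hypothetical stand-in exposing only the attributes the guard inspects."""

    type: Optional[str] = None
    reference: Optional[str] = None
    data_types: list["FakeDataType"] = field(default_factory=list)
    literals: list[str] = field(default_factory=list)


def has_any_variant(data_types: list[FakeDataType]) -> bool:
    """True if a variant is Any or effectively empty, so it cannot carry a discriminator."""
    return any(
        dt.type == ANY or (not dt.reference and not dt.data_types and not dt.literals and not dt.type)
        for dt in data_types
    )


discriminable = [FakeDataType(reference="Cat"), FakeDataType(reference="Dog")]
with_any = [FakeDataType(reference="Cat"), FakeDataType(type=ANY)]
with_empty = [FakeDataType(reference="Cat"), FakeDataType()]
```

Only the first union would keep its discriminator; the other two contain a variant with no field to discriminate on.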

src/datamodel_code_generator/parser/jsonschema.py (3)

369-372: Unused noqa directives on new fields.

Ruff reports that the # noqa directives on lines 369–372 reference rules (N815 and UP045) that are not enabled, so the suppressions are unnecessary. Consider removing them to keep the suppressions consistent with the active ruleset, or enable those rules if suppression was intentional.


1587-1605: Redundant guard before setdefault on _dynamic_anchor_index.

Line 1603's if root_key not in self._dynamic_anchor_index check is unnecessary because setdefault on line 1605 already handles the missing-key case. This contrasts with _recursive_anchor_index (line 1601) which uses setdefault directly without a guard.

Suggested simplification
         if obj.dynamicAnchor:
-            if root_key not in self._dynamic_anchor_index:  # pragma: no cover
-                self._dynamic_anchor_index[root_key] = {}
-            self._dynamic_anchor_index[root_key].setdefault(obj.dynamicAnchor, ref_path)
+            self._dynamic_anchor_index.setdefault(root_key, {}).setdefault(obj.dynamicAnchor, ref_path)
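
A quick way to convince yourself that the two forms behave the same is to run both over identical input. The module-level dicts and the `register_*` helpers below are hypothetical and exist only for the comparison.

```python
guarded: dict[tuple[str, ...], dict[str, str]] = {}
chained: dict[tuple[str, ...], dict[str, str]] = {}


def register_guarded(root_key: tuple[str, ...], anchor: str, ref_path: str) -> None:
    # Original shape: explicit membership check, then setdefault on the inner dict.
    if root_key not in guarded:
        guarded[root_key] = {}
    guarded[root_key].setdefault(anchor, ref_path)


def register_chained(root_key: tuple[str, ...], anchor: str, ref_path: str) -> None:
    # Suggested shape: the outer setdefault creates the inner dict on demand.
    chained.setdefault(root_key, {}).setdefault(anchor, ref_path)


calls = [
    (("a.json",), "node", "#/$defs/node"),
    (("a.json",), "node", "#/other"),  # duplicate anchor name: first one wins
    (("b.json",), "leaf", "#"),
]
for call in calls:
    register_guarded(*call)
    register_chained(*call)
```

Both dicts end up identical, including the first-registration-wins behaviour for the repeated `node` anchor.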

4156-4191: Anchor index pre-population in _parse_file duplicates logic from _build_anchor_indexes.

The manual anchor indexing here (for both root and definitions) repeats the same pattern as _build_anchor_indexes. I understand the need to pre-populate before parsing begins — but the duplication is fragile. If the indexing logic changes in _build_anchor_indexes, these blocks must be updated in lockstep.

Consider extracting the root/definition anchor registration into a shared helper or calling _build_anchor_indexes with the appropriate path for each definition during the pre-scan loop (lines 4176–4191) instead of inlining the logic.

@github-actions
Contributor

github-actions Bot commented Feb 7, 2026

Comment thread src/datamodel_code_generator/parser/base.py Dismissed
@codspeed-hq

codspeed-hq Bot commented Feb 7, 2026

CodSpeed Performance Report

Merging this PR will not alter performance

Comparing fix/recursive-dynamic-ref-support (8d0ab61) with main (9554fb6)

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

Summary

✅ 11 untouched benchmarks
⏩ 98 skipped benchmarks (see footnote 1)

Footnotes

  1. 98 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, archive them to remove them from the performance reports.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/supported_formats.md (1)

175-181: ⚠️ Potential issue | 🟡 Minor

Supported features listed under "Unsupported Features" heading is misleading.

Lines 180-181 now show $recursiveRef/$recursiveAnchor and $dynamicRef/$dynamicAnchor as "✅ Supported" in a table whose section heading is "JSON Schema - Unsupported Features". This is confusing for readers scanning the limitations section. Consider either removing these rows (since they're already listed as supported in the Feature Compatibility Matrix above) or moving them to a separate "Recently Added Support" subsection.

🧹 Nitpick comments (3)
src/datamodel_code_generator/parser/base.py (1)

1479-1487: Any-like variant guard in discriminator application is sound.

This correctly prevents Pydantic v2 PydanticUserError when a discriminated union includes an Any-like variant (which cannot have a discriminator field). The guard checks both explicit ANY types and "empty" data types that lack any reference, nested types, literals, or type — all of which would be non-discriminable.

One observation: the condition on lines 1480-1483 is duplicated in __collapse_root_models (lines 2007-2011). Consider extracting a helper to keep this in sync.

♻️ Optional: extract shared helper
+    @staticmethod
+    def _has_any_like_variant(data_types: list[DataType]) -> bool:
+        """Check if any data type in a union is Any-like and cannot participate in a discriminated union."""
+        return any(
+            dt.type == ANY or (not dt.reference and not dt.data_types and not dt.literals and not dt.type)
+            for dt in data_types
+        )

Then use self._has_any_like_variant(field.data_type.data_types) in both call sites.

src/datamodel_code_generator/parser/jsonschema.py (2)

369-372: Unused noqa directives on new field declarations.

Ruff reports that the noqa: N815, UP045 directives on lines 369–372 are for non-enabled rules. Remove them to keep lint suppression accurate and avoid masking future issues.

-    recursiveRef: Optional[str] = Field(default=None, alias="$recursiveRef")  # noqa: N815, UP045
-    recursiveAnchor: Optional[bool] = Field(default=None, alias="$recursiveAnchor")  # noqa: N815, UP045
-    dynamicRef: Optional[str] = Field(default=None, alias="$dynamicRef")  # noqa: N815, UP045
-    dynamicAnchor: Optional[str] = Field(default=None, alias="$dynamicAnchor")  # noqa: N815, UP045
+    recursiveRef: Optional[str] = Field(default=None, alias="$recursiveRef")
+    recursiveAnchor: Optional[bool] = Field(default=None, alias="$recursiveAnchor")
+    dynamicRef: Optional[str] = Field(default=None, alias="$dynamicRef")
+    dynamicAnchor: Optional[str] = Field(default=None, alias="$dynamicAnchor")

4160-4195: Duplicate anchor registration: _parse_file inline code duplicates _build_anchor_indexes, and definitions get double-appended.

The anchor-building logic here (lines 4160–4195) duplicates what _build_anchor_indexes already does. For definitions, this code registers anchors, and then parse_raw_obj (line 4207) calls _build_anchor_indexes again, causing duplicate entries in _recursive_anchor_index (since it uses list.append). While not a correctness bug (the prefix-matching algorithm tolerates duplicates), this is a DRY violation that could diverge over time.

Consider delegating to _build_anchor_indexes here instead of inlining the logic:

♻️ Sketch: replace inline anchor building with method calls
                 self.parse_id(root_obj, path_parts)
-                # Build $recursiveAnchor index for root object
-                if root_obj.recursiveAnchor:
-                    root_key = tuple(path_parts)
-                    self._recursive_anchor_index.setdefault(root_key, []).append("#")
-                # Build $dynamicAnchor index for root object
-                if root_obj.dynamicAnchor:
-                    root_key = tuple(path_parts)
-                    if root_key not in self._dynamic_anchor_index:
-                        self._dynamic_anchor_index[root_key] = {}
-                    self._dynamic_anchor_index[root_key].setdefault(root_obj.dynamicAnchor, "#")
+                self._build_anchor_indexes(root_obj, path_parts)

Then for definitions, remove the inline registration (lines 4184–4195) since parse_raw_obj at line 4207 already calls _build_anchor_indexes. If you keep both call sites, guard _recursive_anchor_index against duplicates (e.g., check before appending).
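
If both call sites are kept, the duplicate guard could look like the following sketch. The `recursive_anchor_index` dict and `register_recursive_anchor` function are hypothetical names for illustration, not parser attributes.

```python
recursive_anchor_index: dict[tuple[str, ...], list[str]] = {}


def register_recursive_anchor(root_key: tuple[str, ...], ref_path: str) -> None:
    """Append ref_path for root_key unless it has already been registered."""
    anchors = recursive_anchor_index.setdefault(root_key, [])
    if ref_path not in anchors:  # guard against the pre-scan and parse pass both registering
        anchors.append(ref_path)


root_key = ("schema.json",)
register_recursive_anchor(root_key, "#/$defs/tree")  # pre-scan pass
register_recursive_anchor(root_key, "#/$defs/tree")  # later pass over the same definition
register_recursive_anchor(root_key, "#")
```

The membership test is linear in the number of anchors per file, which should be acceptable if only a handful of anchors are expected; a set alongside the list would be an alternative if order must be kept and the lists grow.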

@codecov

codecov Bot commented Feb 7, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (9554fb6) to head (8d0ab61).
⚠️ Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##              main     #2982    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files           94        94            
  Lines        17813     17913   +100     
  Branches      2055      2070    +15     
==========================================
+ Hits         17813     17913   +100     
| Flag | Coverage Δ |
| --- | --- |
| unittests | 100.00% <100.00%> (ø) |


@koxudaxi koxudaxi merged commit 4be7699 into main Feb 10, 2026
40 checks passed
@koxudaxi koxudaxi deleted the fix/recursive-dynamic-ref-support branch February 10, 2026 08:42
@github-actions
Contributor

Breaking Change Analysis

Result: No breaking changes detected

Reasoning: This PR adds new feature support for $recursiveRef/$recursiveAnchor (JSON Schema 2019-09) and $dynamicRef/$dynamicAnchor (JSON Schema 2020-12) which is purely additive - schemas using these features that previously failed or were unsupported now generate valid recursive models. The discriminator change is a bug fix that prevents invalid code generation when Any-like variants exist in unions. No CLI options, API changes, default behavior changes, or template changes are required. Existing schemas without these features continue to work identically.
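
For readers unfamiliar with the two keyword pairs, the following sketch shows the general shape of a self-referential tree schema in each draft, written as Python dicts. These are illustrative schemas made up for this note, not the fixtures added under `tests/data`, and nothing here asserts what the generator emits for them.

```python
import json

# JSON Schema 2019-09: $recursiveAnchor is a boolean and $recursiveRef is always "#".
RECURSIVE_2019_09 = {
    "$schema": "https://json-schema.org/draft/2019-09/schema",
    "$recursiveAnchor": True,
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "children": {"type": "array", "items": {"$recursiveRef": "#"}},
    },
}

# JSON Schema 2020-12: $dynamicAnchor is a name and $dynamicRef points at "#<name>".
DYNAMIC_2020_12 = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "$dynamicAnchor": "node",
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "children": {"type": "array", "items": {"$dynamicRef": "#node"}},
    },
}

# Serialise to JSON text, e.g. to save as input files for the generator.
recursive_text = json.dumps(RECURSIVE_2019_09, indent=2)
dynamic_text = json.dumps(DYNAMIC_2020_12, indent=2)
```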


This analysis was performed by Claude Code Action

@github-actions
Contributor

🎉 Released in 0.54.0

This PR is now available in the latest release. See the release notes for details.

Development

Successfully merging this pull request may close these issues.

Generate OpenAI OpenAPI for python pydantic2.datamodel

2 participants