
Add support for prefixItems to emit tuples#2537

Merged
koxudaxi merged 18 commits into koxudaxi:main from saulshanabrook:prefixItems on Dec 20, 2025

Conversation

@saulshanabrook
Contributor

@saulshanabrook saulshanabrook commented Nov 11, 2025

Closes #1546 by adding support for emitting tuples from prefixItems when minItems and maxItems both equal the number of prefix items and no items schema is specified.

Summary by CodeRabbit

  • New Features

    • Added JSON Schema prefixItems support: generates fixed-length tuple-typed fields when applicable, with automatic fallback to list semantics when lengths vary.
  • Documentation

    • Added a "Tuple validation" section with examples.
    • Updated docs/examples to use PEP 604 union syntax (e.g., X | None) and modern list syntax.
  • Tests

    • Added tests covering prefixItems tuple generation and non-tuple fallback behavior.


@codecov

codecov Bot commented Nov 12, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 99.34%. Comparing base (4423a49) to head (dc14dee).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #2537   +/-   ##
=======================================
  Coverage   99.33%   99.34%           
=======================================
  Files          81       81           
  Lines       11480    11535   +55     
  Branches     1367     1387   +20     
=======================================
+ Hits        11404    11459   +55     
  Misses         45       45           
  Partials       31       31           
Flag Coverage Δ
unittests 99.34% <100.00%> (+<0.01%) ⬆️


@codspeed-hq

codspeed-hq Bot commented Nov 12, 2025

CodSpeed Performance Report

Merging #2537 will not alter performance

Comparing saulshanabrook:prefixItems (dc14dee) with main (4423a49)

Summary

✅ 52 untouched
⏩ 10 skipped [1]

Footnotes

  1. 10 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

Comment thread src/datamodel_code_generator/types.py Outdated
Comment thread src/datamodel_code_generator/types.py Outdated
Comment thread docs/jsonschema.md Outdated
Comment thread docs/jsonschema.md
Comment thread src/datamodel_code_generator/parser/jsonschema.py
@koxudaxi
Owner

I have merged the main branch into this branch and applied the new test assertion helper function to the test.

@saulshanabrook
Contributor Author

Thank you! Sorry I haven't had time to address the feedback yet. I should be able to in a week or so, but feel free to make any changes as well.

Copilot AI review requested due to automatic review settings December 6, 2025 10:21

Copilot AI left a comment


Pull request overview

This pull request adds support for generating Python tuple type annotations from JSON Schema's prefixItems keyword. When a JSON Schema array has prefixItems defined with matching minItems and maxItems constraints (equal to the number of prefix items) and no additional items specification, the code generator now emits precise Tuple[...] types instead of generic List[...] types.

Key Changes

  • Added prefixItems field to JsonSchemaObject model for parsing JSON Schema 2020-12 tuple validation syntax
  • Implemented tuple detection logic in array field parsing when prefix items count matches min/max items constraints
  • Extended DataType class with is_tuple flag and tuple-specific type hint generation
  • Added IMPORT_TUPLE constant and integrated tuple import handling for non-standard collections

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/datamodel_code_generator/parser/jsonschema.py | Added prefixItems field to schema object model; implemented tuple detection logic in parse_array_fields; extended parse_ref and parse_id to handle prefixItems references |
| src/datamodel_code_generator/types.py | Added is_tuple flag to DataType; implemented tuple type hint generation with support for both Tuple and tuple (standard collections); added tuple import handling |
| src/datamodel_code_generator/imports.py | Added IMPORT_TUPLE constant for importing typing.Tuple |
| tests/main/jsonschema/test_main_jsonschema.py | Added parametrized test covering both pydantic and msgspec output models for prefix items |
| tests/data/jsonschema/prefix_items.json | Added test input schema demonstrating tuple validation with prefixItems, minItems, and maxItems |
| tests/data/expected/main/jsonschema/prefix_items.py | Added expected pydantic output with Tuple[Span, str] annotation |
| tests/data/expected/main/jsonschema/prefix_items_msgspec.py | Added expected msgspec output with Tuple[Span, str] annotation |
| docs/jsonschema.md | Added documentation section explaining tuple validation with prefixItems including example schema and generated code |


Comment thread tests/main/jsonschema/test_main_jsonschema.py Outdated
Comment thread docs/jsonschema.md Outdated
@coderabbitai

coderabbitai Bot commented Dec 18, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds JSON Schema prefixItems support: parser records prefixItems, detects fixed-length tuple arrays when minItems == maxItems == len(prefixItems) and items is absent/false, marks DataType as tuple, emits Tuple imports/type hints, updates docs, tests, and expected outputs.

Changes

| Cohort / File(s) | Change Summary |
| --- | --- |
| Imports: src/datamodel_code_generator/imports.py | Added IMPORT_TUPLE for typing.Tuple. |
| Type constants & type system: src/datamodel_code_generator/types.py | Added TUPLE, STANDARD_TUPLE; added is_tuple: bool to Config and DataType; extended type_hint and imports to emit tuple hints and tuple imports; tuple handling short-circuits union generation. |
| JSON Schema parser: src/datamodel_code_generator/parser/jsonschema.py | Added prefixItems: Optional[list[JsonSchemaObject]]; updated is_array detection, traversal (_traverse_schema_objects), _add_id_callback, parse_array_fields, and get_object_field to handle prefixItems, detect fixed-length tuples (when minItems == maxItems == len(prefixItems) and items absent/false), suppress min/max constraints for tuples, and set is_tuple. |
| Documentation: docs/jsonschema.md, docs/cli-reference/general-options.md, docs/graphql.md, docs/jsondata.md, docs/openapi.md | Updated examples to use PEP 604 unions and added "Tuple validation" section in docs/jsonschema.md; aligned example imports and type hints across docs. |
| Schemas / Fixtures: tests/data/jsonschema/prefix_items.json, tests/data/jsonschema/prefix_items_no_tuple.json | Added JSON Schema fixtures: one where minItems == maxItems == len(prefixItems) (tuple case) and one where min/max differ (list fallback). |
| Expected generated outputs: tests/data/expected/main/jsonschema/prefix_items.py, .../prefix_items_msgspec.py, .../prefix_items_no_tuple.py | Added expected outputs: Tuple-typed models for tuple case (uses Tuple[...] import where appropriate); list-typed fallback when bounds differ. |
| Tests: tests/main/jsonschema/test_main_jsonschema.py | Added test_main_jsonschema_prefix_items (parameterized for output models) and test_main_jsonschema_prefix_items_no_tuple validating tuple detection and list fallback. |

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant Parser as JSON Schema Parser
    participant Types as Type System
    participant Generator as Code Generator

    User->>Parser: Provide schema with `prefixItems`, `minItems`, `maxItems`, `items`
    Parser->>Parser: Parse schema, record `prefixItems`, `minItems`, `maxItems`, `items`
    alt fixed-length tuple detected (min==max==len(prefixItems) and items is None/false)
        Parser->>Types: Create DataType (is_tuple = true) with element types from prefixItems
        Parser->>Parser: Suppress minItems/maxItems in field constraints
    else
        Parser->>Types: Create DataType for list/array, retain items and constraints
    end
    Types->>Types: Build type_hint (Tuple[...] or list[...]) and resolve imports (typing.Tuple if needed)
    Types->>Generator: Emit type hint and import statements
    Generator->>User: Output generated model code

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Review focus:
    • src/datamodel_code_generator/parser/jsonschema.py — traversal, _add_id_callback, parse_array_fields, get_object_field, and constraint suppression for tuples.
    • src/datamodel_code_generator/types.py — is_tuple propagation, type_hint precedence, import emission, and handling of empty/unknown inner types.
    • Tests and expected outputs — verify tuple vs list fallback across output model targets and Python versions.

Poem

🐰 I nibbled schemas under moonlit shoots,
Prefix hops arranged in tidy boots,
Tiny tuples snug where lists once sprawled,
Ordered hops now neatly called,
🥕 Precise types — my joyful loot

Pre-merge checks and finishing touches

❌ Failed checks (1 inconclusive)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Out of Scope Changes check | ❓ Inconclusive | Documentation updates to examples (jsonschema.md, cli-reference, graphql.md, jsondata.md, openapi.md) showing PEP 604 union syntax align with modernizing type hints but appear tangential to prefixItems core feature. | Clarify whether documentation updates modernizing type hint syntax from Optional/List to PEP 604 unions are intentional scope or should be separated into a distinct PR. |
✅ Passed checks (4 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | Title accurately describes the main feature addition: implementing prefixItems support to generate tuple type annotations. |
| Linked Issues check | ✅ Passed | PR implements the exact feature requested in #1546: emitting properly typed Tuple annotations from JSON Schema prefixItems with equal minItems/maxItems. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 7028455 and dc14dee.

📒 Files selected for processing (5)
  • docs/cli-reference/general-options.md (1 hunks)
  • docs/graphql.md (2 hunks)
  • docs/jsondata.md (1 hunks)
  • docs/jsonschema.md (1 hunks)
  • docs/openapi.md (1 hunks)
🔇 Additional comments (6)
docs/jsondata.md (1)

50-50: LGTM!

Type annotation updates properly reflect Python 3.10+ union syntax and modern conventions.

Also applies to: 54-54

docs/jsonschema.md (2)

52-65: LGTM!

Person example correctly demonstrates updated type annotations using PEP 604 unions and modern imports.


67-119: LGTM!

Tuple validation section clearly documents the prefixItems feature with accurate conditions and a well-structured example. The class name Defaults correctly matches the schema title, addressing previous feedback. The example demonstrates the heterogeneous tuple annotation tuple[Span, str] properly.

docs/graphql.md (1)

57-57: LGTM!

GraphQL example properly updated with modern type annotations, using PEP 604 unions and lowercase list syntax throughout.

Also applies to: 75-75, 77-78, 82-82, 85-85, 90-91, 96-96, 97-98

docs/cli-reference/general-options.md (1)

1222-1222: LGTM!

The --check example correctly demonstrates updated type annotations with modern imports and PEP 604 unions.

Also applies to: 1228-1233

docs/openapi.md (1)

133-133: LGTM!

OpenAPI example correctly demonstrates RootModel usage for array-type schemas, consistent with Pydantic v2 conventions. Pet type annotation also properly updated.

Also applies to: 139-139, 142-143



@koxudaxi koxudaxi requested a review from ilovelinux December 18, 2025 16:46
@koxudaxi
Owner

@ilovelinux Can you review the PR again?


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (4)
src/datamodel_code_generator/parser/jsonschema.py (4)

312-312: Remove unused noqa directive.

Static analysis indicates N815 and UP045 rules are not enabled, making this noqa comment unnecessary.

🔎 Apply this diff:
-    prefixItems: Optional[list[JsonSchemaObject]] = None  # noqa: N815, UP045
+    prefixItems: Optional[list[JsonSchemaObject]] = None

2378-2382: Mutating input object has potential side effects.

Setting obj.minItems = obj.maxItems = None directly modifies the JsonSchemaObject instance. While this achieves the goal of excluding these constraints from output (via obj.dict() at line 2415), mutating input parameters can cause unexpected behavior if the same object is referenced elsewhere.

Consider creating a modified constraints dict instead of mutating the source object.

🔎 Alternative approach (defer constraints filtering):

Instead of mutating obj, filter out the constraints when building the field:

 elif obj.prefixItems is not None and obj.minItems == obj.maxItems == len(obj.prefixItems):
-    # Set these to None so that it won't output max item constraints
-    obj.minItems = obj.maxItems = None
     items = obj.prefixItems
     is_tuple = True
+    # Mark for later: exclude minItems/maxItems from constraints
+    exclude_item_constraints = True
+else:
+    exclude_item_constraints = False

Then at line 2415:

constraints = obj.dict()
if exclude_item_constraints:
    constraints.pop('minItems', None)
    constraints.pop('maxItems', None)
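
The non-mutating approach suggested above can be sketched in isolation. This is a minimal illustration using plain dictionaries rather than the project's JsonSchemaObject; the function name array_constraints and the dict-based schema are assumptions made for this example and are not part of datamodel-code-generator.

```python
from typing import Any


def array_constraints(schema: dict[str, Any], is_tuple: bool) -> dict[str, Any]:
    """Return a copy of the schema's constraints, dropping length bounds for tuples.

    Hypothetical helper: the caller's schema dict is never modified.
    """
    constraints = dict(schema)  # shallow copy, so the input stays untouched
    if is_tuple:
        # A fixed-length tuple annotation already encodes its length.
        constraints.pop("minItems", None)
        constraints.pop("maxItems", None)
    return constraints


schema = {"prefixItems": [{"type": "string"}], "minItems": 1, "maxItems": 1}
filtered = array_constraints(schema, is_tuple=True)
```

Because the bounds are removed from a copy, any other code holding a reference to the original schema still sees minItems and maxItems.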

2355-2355: Remove unused noqa directive.

The PLR0912 rule is not enabled, making this noqa comment unnecessary.

🔎 Apply this diff:
-    def parse_array_fields(  # noqa: PLR0912
+    def parse_array_fields(

2941-2941: Remove unused noqa directive.

The PLR0912 rule is not enabled, making this noqa comment unnecessary.

🔎 Apply this diff:
-    def _add_id_callback(self, obj: JsonSchemaObject, path: list[str]) -> None:  # noqa: PLR0912
+    def _add_id_callback(self, obj: JsonSchemaObject, path: list[str]) -> None:
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 67ea24b and e8fb818.

📒 Files selected for processing (10)
  • docs/jsonschema.md (1 hunks)
  • src/datamodel_code_generator/imports.py (1 hunks)
  • src/datamodel_code_generator/parser/jsonschema.py (7 hunks)
  • src/datamodel_code_generator/types.py (5 hunks)
  • tests/data/expected/main/jsonschema/prefix_items.py (1 hunks)
  • tests/data/expected/main/jsonschema/prefix_items_msgspec.py (1 hunks)
  • tests/data/expected/main/jsonschema/prefix_items_no_tuple.py (1 hunks)
  • tests/data/jsonschema/prefix_items.json (1 hunks)
  • tests/data/jsonschema/prefix_items_no_tuple.json (1 hunks)
  • tests/main/jsonschema/test_main_jsonschema.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (3)
src/datamodel_code_generator/types.py (4)
src/datamodel_code_generator/model/base.py (3)
  • imports (238-255)
  • imports (583-588)
  • type_hint (217-235)
src/datamodel_code_generator/model/enum.py (1)
  • imports (118-120)
src/datamodel_code_generator/model/typed_dict.py (2)
  • imports (150-155)
  • type_hint (132-137)
src/datamodel_code_generator/model/type_alias.py (1)
  • imports (28-34)
src/datamodel_code_generator/parser/jsonschema.py (2)
src/datamodel_code_generator/model/base.py (1)
  • path (664-666)
src/datamodel_code_generator/reference.py (1)
  • add_id (633-635)
tests/data/expected/main/jsonschema/prefix_items.py (1)
tests/data/expected/main/jsonschema/prefix_items_msgspec.py (2)
  • Span (12-13)
  • Defaults (16-17)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/parser/jsonschema.py

312-312: Unused noqa directive (non-enabled: N815, UP045)

Remove unused noqa directive

(RUF100)


2355-2355: Unused noqa directive (non-enabled: PLR0912)

Remove unused noqa directive

(RUF100)


2941-2941: Unused noqa directive (non-enabled: PLR0912)

Remove unused noqa directive

(RUF100)

🔇 Additional comments (15)
src/datamodel_code_generator/imports.py (1)

164-164: LGTM!

The IMPORT_TUPLE constant follows the established pattern for typing imports and is correctly placed alongside related constants.

tests/data/jsonschema/prefix_items.json (1)

1-23: LGTM!

This test fixture correctly exercises the tuple generation conditions: prefixItems present with two items, minItems and maxItems both equal to 2, and no items specified.

tests/data/jsonschema/prefix_items_no_tuple.json (1)

1-23: LGTM!

This test fixture correctly validates the fallback behavior when minItems (1) and maxItems (3) don't match the prefixItems length (2), ensuring a List type is generated instead of Tuple.

tests/data/expected/main/jsonschema/prefix_items.py (1)

1-17: LGTM!

The expected output correctly demonstrates tuple generation with Tuple[Span, str] matching the prefixItems schema definition.

tests/data/expected/main/jsonschema/prefix_items_no_tuple.py (1)

12-17: LGTM!

The expected output correctly falls back to List[Any] with min/max length constraints when the tuple conditions aren't satisfied. The Span class is still generated from $defs as expected, even though it's not referenced in the fallback type.

tests/data/expected/main/jsonschema/prefix_items_msgspec.py (1)

1-17: LGTM!

The msgspec variant correctly generates Tuple[Span, str] with msgspec.Struct base class, demonstrating that tuple support works across different output model types.

tests/main/jsonschema/test_main_jsonschema.py (2)

2886-2915: LGTM!

The test is well-structured with appropriate parameterization for both pydantic_v2.BaseModel and msgspec.Struct output models. The @freeze_time decorator ensures reproducible timestamps, and the black version skip maintains compatibility.


2918-2932: LGTM!

Good coverage of the fallback behavior when minItems != maxItems, ensuring the generator correctly produces List[Any] instead of a tuple type.

src/datamodel_code_generator/types.py (3)

79-83: LGTM on constant placement.

The constants are correctly organized: TUPLE with the typing module generics (lines 73-79) and STANDARD_TUPLE with the builtin type names (lines 80-83).


464-487: Tuple import handling is correct.

The IMPORT_TUPLE is added only when use_standard_collections=False, which aligns with using typing.Tuple. When use_standard_collections=True, the builtin tuple type is used (no import required). This follows the same pattern as other container types.


533-537: Tuple type hint generation looks good.

The implementation correctly:

  • Selects between tuple and Tuple based on use_standard_collections
  • Builds inner types from data_types for heterogeneous tuple elements
  • Handles the empty tuple case with tuple[()]
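
The three points above could be sketched roughly as follows. The helper name render_tuple_hint and its string-based inputs are hypothetical and only illustrate the formatting rules described in this comment; the actual logic lives in DataType.type_hint and may differ in detail.

```python
def render_tuple_hint(inner_types: list[str], use_standard_collections: bool) -> str:
    """Hypothetical helper that formats a fixed-length tuple annotation."""
    # Builtin "tuple" when standard collections are requested, typing's "Tuple" otherwise.
    name = "tuple" if use_standard_collections else "Tuple"
    if not inner_types:
        # An empty tuple is spelled with "()" as its only type argument.
        return f"{name}[()]"
    return f"{name}[{', '.join(inner_types)}]"


print(render_tuple_hint(["Span", "str"], use_standard_collections=True))   # tuple[Span, str]
print(render_tuple_hint(["Span", "str"], use_standard_collections=False))  # Tuple[Span, str]
print(render_tuple_hint([], use_standard_collections=True))                # tuple[()]
```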
src/datamodel_code_generator/parser/jsonschema.py (4)

384-386: LGTM on is_array update.

The inclusion of prefixItems in the array detection is correct per JSON Schema semantics.


2397-2402: LGTM on DataType construction.

The mutual exclusivity of is_tuple and is_list is correctly enforced.


2916-2918: LGTM on prefixItems traversal.

The traversal correctly handles prefixItems for reference resolution, consistent with the existing items handling pattern.


2951-2953: LGTM on prefixItems ID resolution.

The _add_id_callback correctly processes prefixItems for ID registration, matching the pattern used for other schema structures.

Comment thread docs/jsonschema.md

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/datamodel_code_generator/parser/jsonschema.py (1)

2388-2417: Critical: Tuple detection incorrectly handles schemas with items set.

The condition at line 393 determines whether to emit a tuple type, but it doesn't verify that items is absent. According to JSON Schema 2020-12:

  • prefixItems defines schemas for the first N elements
  • items defines the schema for any additional elements beyond prefixItems

If both prefixItems and items are present, the array can have more elements than prefixItems, making it not a fixed-length tuple.

Example that would be incorrectly handled:

{
  "prefixItems": [{"type": "string"}, {"type": "number"}],
  "items": {"type": "boolean"},
  "minItems": 2,
  "maxItems": 3
}

This allows arrays like ["a", 1] or ["a", 1, true], which is not a tuple. The current code would incorrectly emit Tuple[str, float].
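
To make the point concrete, here is a small hand-written checker covering only prefixItems, items, minItems, and maxItems. It is a sketch for this example, not a JSON Schema validator, and the names CHECKS and accepts are assumptions introduced here.

```python
from typing import Any

# Minimal type checks for this example only. bool is excluded from "number"
# because Python's bool is a subclass of int.
CHECKS = {
    "string": lambda v: isinstance(v, str),
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "boolean": lambda v: isinstance(v, bool),
}


def accepts(schema: dict[str, Any], value: list[Any]) -> bool:
    """Check a list against prefixItems, items, minItems and maxItems only."""
    if "minItems" in schema and len(value) < schema["minItems"]:
        return False
    if "maxItems" in schema and len(value) > schema["maxItems"]:
        return False
    prefix = schema.get("prefixItems", [])
    # Positional schemas apply to the leading elements.
    for element, sub in zip(value, prefix):
        if not CHECKS[sub["type"]](element):
            return False
    rest = schema.get("items")
    extra = value[len(prefix):]
    if rest is False:
        return not extra  # no additional elements allowed
    if isinstance(rest, dict):
        return all(CHECKS[rest["type"]](e) for e in extra)
    return True  # items absent: additional elements are unconstrained


schema = {
    "prefixItems": [{"type": "string"}, {"type": "number"}],
    "items": {"type": "boolean"},
    "minItems": 2,
    "maxItems": 3,
}
print(accepts(schema, ["a", 1]), accepts(schema, ["a", 1, True]))
```

Both arrays pass, so the schema describes values of two different lengths and a two-element tuple annotation would be too narrow.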

🔎 Proposed fix
-        elif obj.prefixItems is not None and obj.minItems == obj.maxItems == len(obj.prefixItems):
+        elif obj.prefixItems is not None and obj.minItems == obj.maxItems == len(obj.prefixItems) and obj.items in (None, False):
             # Set these to None so that it won't output max item constraints
             obj.minItems = obj.maxItems = None
             items = obj.prefixItems
             is_tuple = True

This ensures tuples are only emitted when:

  1. prefixItems is specified
  2. Array length is fixed (minItems == maxItems == len(prefixItems))
  3. No additional items are allowed (items is None or False)
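
The three conditions can be expressed as a single predicate. The sketch below works on plain dictionaries and uses a hypothetical function name, is_fixed_length_tuple; it mirrors the proposed condition but is not the parser's actual code.

```python
from typing import Any


def is_fixed_length_tuple(schema: dict[str, Any]) -> bool:
    """Return True only when the array schema describes a fixed-length tuple."""
    prefix = schema.get("prefixItems")
    if prefix is None:
        return False  # condition 1: prefixItems must be specified
    n = len(prefix)
    if schema.get("minItems") != n or schema.get("maxItems") != n:
        return False  # condition 2: length fixed at len(prefixItems)
    items = schema.get("items")
    return items is None or items is False  # condition 3: no additional items


two = [{"type": "string"}, {"type": "number"}]
print(is_fixed_length_tuple({"prefixItems": two, "minItems": 2, "maxItems": 2}))
print(is_fixed_length_tuple({"prefixItems": two, "minItems": 1, "maxItems": 3}))
```

Schemas that fail any of the three checks would fall back to list semantics.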
♻️ Duplicate comments (2)
src/datamodel_code_generator/types.py (2)

49-49: LGTM: Tuple constants are correctly positioned.

The TUPLE constant is appropriately placed with other typing module constants (after LIST), and STANDARD_TUPLE is grouped with other standard/builtin type constants. The organization is consistent with the existing pattern.

Also applies to: 77-77, 81-81


461-484: LGTM: Import logic correctly handles all branches.

The tuple import handling is correct across all configuration combinations:

  • When use_standard_collections=True: Uses builtin tuple, no import needed
  • When use_standard_collections=False: Uses Tuple from typing, import added at lines 475 and 483

The implementation properly accounts for both use_generic_container settings.

🧹 Nitpick comments (1)
src/datamodel_code_generator/parser/jsonschema.py (1)

313-313: Optional: Remove unused noqa directives.

Static analysis indicates the noqa directives at these lines suppress rules (N815, PLR0912, UP045) that are not enabled in your configuration. While these directives are defensive and don't cause issues, removing them would clean up the code.

Suggested cleanup

Line 313:

-    prefixItems: Optional[list[JsonSchemaObject]] = None  # noqa: N815, UP045
+    prefixItems: Optional[list[JsonSchemaObject]] = None  # noqa: N815

Line 2370:

-    def parse_array_fields(  # noqa: PLR0912
+    def parse_array_fields(

Line 2956:

-    def _add_id_callback(self, obj: JsonSchemaObject, path: list[str]) -> None:  # noqa: PLR0912
+    def _add_id_callback(self, obj: JsonSchemaObject, path: list[str]) -> None:

Note: The N815 suppression for prefixItems should be kept since the field name uses camelCase to match the JSON Schema specification.

Also applies to: 2370-2370, 2956-2956

📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between e8fb818 and 75cf4d5.

📒 Files selected for processing (7)
  • docs/jsonschema.md (1 hunks)
  • src/datamodel_code_generator/imports.py (1 hunks)
  • src/datamodel_code_generator/parser/jsonschema.py (7 hunks)
  • src/datamodel_code_generator/types.py (5 hunks)
  • tests/data/expected/main/jsonschema/prefix_items.py (1 hunks)
  • tests/data/expected/main/jsonschema/prefix_items_msgspec.py (1 hunks)
  • tests/data/expected/main/jsonschema/prefix_items_no_tuple.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (5)
  • docs/jsonschema.md
  • src/datamodel_code_generator/imports.py
  • tests/data/expected/main/jsonschema/prefix_items_no_tuple.py
  • tests/data/expected/main/jsonschema/prefix_items.py
  • tests/data/expected/main/jsonschema/prefix_items_msgspec.py
🧰 Additional context used
🧬 Code graph analysis (1)
src/datamodel_code_generator/parser/jsonschema.py (2)
src/datamodel_code_generator/model/base.py (1)
  • path (689-691)
src/datamodel_code_generator/reference.py (1)
  • add_id (632-634)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/parser/jsonschema.py

313-313: Unused noqa directive (non-enabled: N815, UP045)

Remove unused noqa directive

(RUF100)


2370-2370: Unused noqa directive (non-enabled: PLR0912)

Remove unused noqa directive

(RUF100)


2956-2956: Unused noqa directive (non-enabled: PLR0912)

Remove unused noqa directive

(RUF100)

🔇 Additional comments (4)
src/datamodel_code_generator/types.py (1)

530-534: LGTM: Tuple type hint generation is well implemented.

The tuple type hint logic correctly:

  • Selects between tuple (builtin) and Tuple (typing) based on use_standard_collections
  • Extracts inner element types from nested data_types
  • Formats as tuple[type1, type2, ...] for heterogeneous tuples
  • Handles empty tuples with tuple[()]
src/datamodel_code_generator/parser/jsonschema.py (3)

387-387: LGTM: Correctly extends array detection to include prefixItems.

Arrays can be defined solely through prefixItems in JSON Schema 2020-12 without an items field, so this check is necessary and correct.


2931-2933: LGTM: Traversal correctly processes prefixItems.

The recursive traversal of prefixItems matches the existing pattern for items and ensures all nested schema objects are processed by the callback.
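
As a rough illustration of this kind of traversal, the sketch below walks a plain-dict schema and invokes a callback for every nested schema under prefixItems and items. The function name traverse and the path layout are assumptions for this example; the project's _traverse_schema_objects works on JsonSchemaObject instances and handles more keywords.

```python
from typing import Any, Callable

Callback = Callable[[dict[str, Any], list[str]], None]


def traverse(schema: dict[str, Any], path: list[str], callback: Callback) -> None:
    """Visit a schema and every schema nested under prefixItems or items."""
    callback(schema, path)
    # prefixItems is always a list of positional schemas.
    for index, sub in enumerate(schema.get("prefixItems") or []):
        traverse(sub, [*path, "prefixItems", str(index)], callback)
    items = schema.get("items")
    if isinstance(items, dict):
        traverse(items, [*path, "items"], callback)
    elif isinstance(items, list):
        for index, sub in enumerate(items):
            traverse(sub, [*path, "items", str(index)], callback)


visited: list[str] = []
schema = {"prefixItems": [{"$ref": "#/$defs/Span"}, {"type": "string"}]}
traverse(schema, ["#"], lambda obj, path: visited.append("/".join(path)))
print(visited)
```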


2960-2981: LGTM: ID callback correctly handles prefixItems.

The _add_id_callback method now processes prefixItems for ID registration, consistent with how it handles items, additionalProperties, and other schema collections.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
src/datamodel_code_generator/parser/jsonschema.py (3)

313-313: Remove unused noqa directive.

The noqa: N815, UP045 directive is flagged as unused by static analysis. These rules are not being violated here, so the directive should be removed.

🔎 Proposed fix
-    prefixItems: Optional[list[JsonSchemaObject]] = None  # noqa: N815, UP045
+    prefixItems: Optional[list[JsonSchemaObject]] = None

2370-2370: Remove unused noqa directive.

The noqa: PLR0912 directive is flagged as unused. It should be removed.

🔎 Proposed fix
-    def parse_array_fields(  # noqa: PLR0912
+    def parse_array_fields(

2960-2960: Remove unused noqa directive.

The noqa: PLR0912 directive is flagged as unused. It should be removed.

🔎 Proposed fix
-    def _add_id_callback(self, obj: JsonSchemaObject, path: list[str]) -> None:  # noqa: PLR0912
+    def _add_id_callback(self, obj: JsonSchemaObject, path: list[str]) -> None:
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between 75cf4d5 and 29671e0.

📒 Files selected for processing (1)
  • src/datamodel_code_generator/parser/jsonschema.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/datamodel_code_generator/parser/jsonschema.py (2)
src/datamodel_code_generator/model/base.py (1)
  • path (689-691)
src/datamodel_code_generator/reference.py (1)
  • add_id (632-634)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/parser/jsonschema.py

313-313: Unused noqa directive (non-enabled: N815, UP045)

Remove unused noqa directive

(RUF100)


2370-2370: Unused noqa directive (non-enabled: PLR0912)

Remove unused noqa directive

(RUF100)


2960-2960: Unused noqa directive (non-enabled: PLR0912)

Remove unused noqa directive

(RUF100)

🔇 Additional comments (4)
src/datamodel_code_generator/parser/jsonschema.py (4)

387-387: LGTM! Correct array detection enhancement.

The is_array property now correctly identifies arrays that use prefixItems, which is essential for proper tuple type generation.


2935-2937: LGTM! Correct traversal of prefixItems.

The traversal logic properly iterates through prefixItems and recursively processes each schema object, consistent with how other collection fields are handled.


2970-2972: LGTM! Correct ID propagation for prefixItems.

The ID parsing logic for prefixItems is correctly implemented, following the same pattern as items handling.


388-403: Incorrect line number references in review comment.

The review references lines 388-403 but the actual prefixItems handling is located around line 2393-2403. Additionally, the behavior described is intentional and already tested: when prefixItems conditions don't match a fixed-length tuple (minItems == maxItems == len(prefixItems)), the code correctly generates List[Any] with field constraints. A test case for this scenario already exists in prefix_items_no_tuple.json.

Likely an incorrect or invalid review comment.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
src/datamodel_code_generator/parser/jsonschema.py (1)

2393-2403: Address previous review feedback: Add test for prefixItems with mismatched item counts.

A previous reviewer requested a test case for when prefixItems is set but minItems != maxItems. Currently, when this condition occurs, the code falls through to items = [] (line 2403), which results in Types.any being used (line 2414).

Please verify this is the intended behavior and add a test case to document it.

Based on learnings from previous review by ilovelinux.

🧹 Nitpick comments (3)
src/datamodel_code_generator/parser/jsonschema.py (3)

313-313: Remove unused noqa directive.

The static analysis tool correctly identified that the noqa directive for N815 and UP045 is unnecessary since these checks are not enabled in the project configuration.

🔎 Proposed fix
-    prefixItems: Optional[list[JsonSchemaObject]] = None  # noqa: N815, UP045
+    prefixItems: Optional[list[JsonSchemaObject]] = None  # noqa: UP045

2370-2370: Remove unused noqa directive.

The noqa directive for PLR0912 is unnecessary as this check is not enabled.

🔎 Proposed fix
-    def parse_array_fields(  # noqa: PLR0912
+    def parse_array_fields(

2960-2960: Remove unused noqa directive.

The noqa directive for PLR0912 is unnecessary as this check is not enabled.

🔎 Proposed fix
-    def _add_id_callback(self, obj: JsonSchemaObject, path: list[str]) -> None:  # noqa: PLR0912
+    def _add_id_callback(self, obj: JsonSchemaObject, path: list[str]) -> None:
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 29671e0 and 1b4b435.

📒 Files selected for processing (1)
  • src/datamodel_code_generator/parser/jsonschema.py (7 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/datamodel_code_generator/parser/jsonschema.py (2)
src/datamodel_code_generator/model/base.py (1)
  • path (689-691)
src/datamodel_code_generator/reference.py (1)
  • add_id (632-634)
🪛 Ruff (0.14.8)
src/datamodel_code_generator/parser/jsonschema.py

313-313: Unused noqa directive (non-enabled: N815, UP045)

Remove unused noqa directive

(RUF100)


2370-2370: Unused noqa directive (non-enabled: PLR0912)

Remove unused noqa directive

(RUF100)


2960-2960: Unused noqa directive (non-enabled: PLR0912)

Remove unused noqa directive

(RUF100)

🔇 Additional comments (3)
src/datamodel_code_generator/parser/jsonschema.py (3)

387-387: LGTM! Correctly identifies prefixItems schemas as arrays.

The addition of self.prefixItems is not None to the is_array check properly handles JSON Schema tuple validation where prefixItems defines the array structure without requiring items to be set.
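
A minimal sketch of the array check described here. `SchemaSketch` is a hypothetical stand-in for the project's `JsonSchemaObject`, and the exact shape of the real `is_array` property is an assumption; only the `prefixItems is not None` clause comes from the review.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


@dataclass
class SchemaSketch:
    """Hypothetical stand-in for a parsed schema node (not the real JsonSchemaObject)."""

    type: str | None = None
    items: Any = None
    prefixItems: list[Any] | None = None

    @property
    def is_array(self) -> bool:
        # A node counts as an array if it declares items, declares prefixItems,
        # or sets type to "array".
        return (
            self.items is not None
            or self.prefixItems is not None
            or self.type == "array"
        )
```

With this check, a schema that only lists `prefixItems` is still routed through array handling even when `items` and `type` are absent.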


2935-2937: LGTM! Correctly traverses prefixItems.

The prefixItems traversal follows the same pattern as the items list traversal and properly propagates the include_one_of parameter through the recursive calls.


2962-2985: LGTM! Comprehensive ID propagation through all schema constructs.

The expanded _add_id_callback now correctly handles ID propagation through prefixItems, items (both object and list forms), additionalProperties, patternProperties, anyOf, allOf, and properties. The prefixItems handling at lines 2970-2972 correctly mirrors the items list handling pattern.
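
The propagation described above could be sketched as follows, working on plain dictionaries instead of `JsonSchemaObject`. The function name `collect_ids` and the path representation are hypothetical; the real `_add_id_callback` registers IDs with the model resolver rather than returning a mapping.

```python
from typing import Any


def collect_ids(schema: Any, path: tuple[str, ...] = ("#",)) -> dict[str, tuple[str, ...]]:
    """Map each $id found in a schema tree to the path where it was declared."""
    found: dict[str, tuple[str, ...]] = {}
    if not isinstance(schema, dict):
        return found  # None and booleans such as `items: false` carry no $id
    if "$id" in schema:
        found[schema["$id"]] = path
    # Keywords whose value is a list of sub-schemas.
    for key in ("prefixItems", "anyOf", "allOf"):
        for i, sub in enumerate(schema.get(key) or []):
            found.update(collect_ids(sub, (*path, key, str(i))))
    # Keywords whose value maps names to sub-schemas.
    for key in ("properties", "patternProperties"):
        for name, sub in (schema.get(key) or {}).items():
            found.update(collect_ids(sub, (*path, key, name)))
    # items may be a single schema or a list of schemas.
    items = schema.get("items")
    if isinstance(items, list):
        for i, sub in enumerate(items):
            found.update(collect_ids(sub, (*path, "items", str(i))))
    else:
        found.update(collect_ids(items, (*path, "items")))
    # additionalProperties may be a schema or a boolean.
    found.update(
        collect_ids(schema.get("additionalProperties"), (*path, "additionalProperties"))
    )
    return found
```

In this sketch `prefixItems` sits in the same loop as the other list-valued keywords, which reflects the point that it needs no handling beyond what the `items` list form already gets.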

Comment thread src/datamodel_code_generator/parser/jsonschema.py

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (3)
src/datamodel_code_generator/parser/jsonschema.py (3)

313-313: Remove unused noqa directives.

The noqa directives for N815 and UP045 are flagged as unused by Ruff, indicating these rules aren't enabled in your configuration.

🔎 Suggested cleanup
-    prefixItems: Optional[list[JsonSchemaObject]] = None  # noqa: N815, UP045
+    prefixItems: Optional[list[JsonSchemaObject]] = None

1018-1027: Consider extracting duplicate tuple detection logic.

The same condition for detecting fixed-length tuples appears in both get_object_field (lines 1020-1025) and parse_array_fields (lines 2404-2408). While both locations need this check, extracting it to a helper method would improve maintainability and ensure consistency.

🔎 Suggested refactor

Add a helper method to the JsonSchemaParser class:

def _is_fixed_length_tuple(self, field: JsonSchemaObject) -> bool:
    """Check if field represents a fixed-length tuple via prefixItems."""
    return (
        field.prefixItems is not None
        and field.minItems == field.maxItems == len(field.prefixItems)
        and field.items in {None, False}
    )

Then use it in both locations:

 def get_object_field(...):
     constraints = field.dict() if self.is_constraints_field(field) else None
-    # Suppress minItems/maxItems for fixed-length tuples
-    if (
-        constraints
-        and field.prefixItems is not None
-        and field.minItems == field.maxItems == len(field.prefixItems)
-        and field.items in {None, False}
-    ):
+    if constraints and self._is_fixed_length_tuple(field):
         constraints.pop("minItems", None)
         constraints.pop("maxItems", None)
 def parse_array_fields(...):
-    elif (
-        obj.prefixItems is not None
-        and obj.minItems == obj.maxItems == len(obj.prefixItems)
-        and obj.items in {None, False}
-    ):
+    elif self._is_fixed_length_tuple(obj):
         suppress_item_constraints = True
         items = obj.prefixItems
         is_tuple = True

Also applies to: 2398-2414
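
To see the suggested helper and the constraint suppression working together outside the parser, here is a standalone sketch on plain dictionaries. The names `is_fixed_length_tuple` and `field_constraints` are hypothetical, and the identity checks on `items` are a deliberate departure from the set-membership test in the suggestion, since a set lookup has to hash its operand and `items` can be a list.

```python
from typing import Any


def is_fixed_length_tuple(schema: dict[str, Any]) -> bool:
    """True when prefixItems fully determines the array's length and contents."""
    prefix = schema.get("prefixItems")
    items = schema.get("items")
    return (
        prefix is not None
        and schema.get("minItems") == schema.get("maxItems") == len(prefix)
        # Identity checks avoid hashing items, which may be a dict or list.
        and (items is None or items is False)
    )


def field_constraints(schema: dict[str, Any]) -> dict[str, Any]:
    """Return length constraints, dropping them when the tuple type implies them."""
    constraints = {k: schema[k] for k in ("minItems", "maxItems") if k in schema}
    if is_fixed_length_tuple(schema):
        constraints.pop("minItems", None)
        constraints.pop("maxItems", None)
    return constraints
```

For a two-element `prefixItems` with `minItems` and `maxItems` both set to 2, `field_constraints` returns an empty dict, because the tuple annotation already fixes the length.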


2975-3000: Remove unused noqa directive and consider method complexity.

The expansion of _add_id_callback to handle prefixItems and other properties is correct, but static analysis flags an unused PLR0912 (too-many-branches) directive on line 2975.

🔎 Suggested cleanup
-    def _add_id_callback(self, obj: JsonSchemaObject, path: list[str]) -> None:  # noqa: PLR0912
+    def _add_id_callback(self, obj: JsonSchemaObject, path: list[str]) -> None:
         """Add $id to model resolver."""

If complexity warnings appear after removing the directive, consider extracting property traversal into a separate helper method.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1b4b435 and 7028455.

📒 Files selected for processing (1)
  • src/datamodel_code_generator/parser/jsonschema.py (9 hunks)
🧰 Additional context used
🪛 Ruff (0.14.8)
src/datamodel_code_generator/parser/jsonschema.py

313-313: Unused noqa directive (non-enabled: N815, UP045)

Remove unused noqa directive

(RUF100)


2380-2380: Unused noqa directive (non-enabled: PLR0912)

Remove unused noqa directive

(RUF100)


2975-2975: Unused noqa directive (non-enabled: PLR0912)

Remove unused noqa directive

(RUF100)

🔇 Additional comments (3)
src/datamodel_code_generator/parser/jsonschema.py (3)

387-387: LGTM! Correct array type detection.

Including prefixItems is not None in the array check correctly identifies schemas using prefixItems as array types, aligning with JSON Schema specifications.


2950-2952: LGTM! Correct traversal of prefixItems.

The traversal logic correctly handles prefixItems similarly to how items is processed, ensuring all nested schemas are visited by the callback.


2404-2414: This behavior is already tested and documented. The code correctly handles two cases:

  1. Fixed-length tuples (minItems == maxItems == len(prefixItems)): Generates tuple[...] using prefixItems schemas.
  2. Variable-length arrays (minItems != maxItems, or either differs from len(prefixItems)): Falls back to list[Any] with min_length/max_length constraints preserved via Field.

The test test_main_jsonschema_prefix_items_no_tuple explicitly validates this fallback behavior with the test input prefix_items_no_tuple.json (minItems: 1, maxItems: 3) and confirms the expected output is list[Any] = Field(..., max_length=3, min_length=1). Dropping the prefixItems schema information in variable-length cases is intentional: it trades schema specificity for simpler code, and variable-length handling is outside the scope of fixed-length tuple support.
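
The two cases can be illustrated with a small renderer that turns an array schema into the annotation text a generator might emit. This is a sketch under stated assumptions, not the parser's implementation: `annotate` and the `PRIMITIVES` mapping are hypothetical, and the `prefixItems` contents in the usage example are made up, since the thread only states the `minItems` and `maxItems` values of the test input.

```python
from typing import Any

# Assumed mapping from JSON Schema primitive types to Python type names.
PRIMITIVES = {"string": "str", "integer": "int", "number": "float", "boolean": "bool"}


def annotate(schema: dict[str, Any]) -> str:
    """Render a tuple or list annotation for an array schema that uses prefixItems."""
    prefix = schema.get("prefixItems") or []
    lo, hi = schema.get("minItems"), schema.get("maxItems")
    items = schema.get("items")
    if prefix and lo == hi == len(prefix) and (items is None or items is False):
        # Case 1: the length is pinned to the number of prefix schemas.
        names = (PRIMITIVES.get(sub.get("type"), "Any") for sub in prefix)
        return "tuple[" + ", ".join(names) + "]"
    # Case 2: lengths vary, so fall back to list[Any] and keep the bounds on Field.
    bounds = []
    if hi is not None:
        bounds.append(f"max_length={hi}")
    if lo is not None:
        bounds.append(f"min_length={lo}")
    return "list[Any] = Field(" + ", ".join(["...", *bounds]) + ")"
```

Calling `annotate` with two prefix schemas (`string`, `integer`) and `minItems == maxItems == 2` yields `tuple[str, int]`, while a schema with `minItems: 1` and `maxItems: 3` yields `list[Any] = Field(..., max_length=3, min_length=1)`.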

- Replace Optional[X] with X | None
- Replace List[X] with list[X]
- Replace Tuple[X] with tuple[X]
- Update Pydantic v1 examples to v2 syntax (RootModel instead of __root__)
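
For readers unfamiliar with the syntax changes listed in this commit, here is a before/after sketch using standard-library dataclasses. The class and field names are hypothetical, and the Pydantic `RootModel` change is left out because it needs a third-party import.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Before:
    # Older typing-module spellings.
    name: Optional[str] = None
    tags: Optional[List[str]] = None
    point: Optional[Tuple[float, float]] = None


@dataclass
class After:
    # PEP 604 unions and built-in generics.
    name: str | None = None
    tags: list[str] | None = None
    point: tuple[float, float] | None = None
```

Both classes accept the same values; only the spelling of the annotations differs. The `from __future__ import annotations` line keeps the annotations as strings, so the newer spelling also parses on interpreters older than Python 3.10.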
@saulshanabrook
Contributor Author

Thanks for fixing this! I ended up just taking some time off last month and am now coming back to things.


Development

Successfully merging this pull request may close these issues.

Feature request: support JSON Schema's prefixItems with precisely typed tuples instead of imprecise lists
