Add regex patterns reflecting Decimal Constraints to the Decimal type's string schema by Jack70248 · Pull Request #11420 · pydantic/pydantic

Jack70248 · 2025-02-08T01:55:22Z

Change Summary

This PR builds on and addresses remaining requirements from #11016

Based on the remaining requirements from the previous PR I have:

Updated the patterns and corresponding logic in json_schema.py. These should now handle the following cases for both validation and serialization:
- decimal_places and max_digits set: both decimal_places and integer_places are enforced via repetition quantifiers; the sum of their upper bounds enforces max_digits
- decimal_places set, max_digits not set: decimal_places is enforced via a repetition quantifier; the lookahead is omitted and the upper bound of the integer_places quantifier is left blank
- decimal_places not set, max_digits set: max_digits is enforced via a positive lookahead; the upper bounds of the integer_places and decimal_places quantifiers are left blank
Removed the test that was added in the previous PR, and added a few more parametrize test cases for test_constraints_schema_validation() and test_constraints_schema_serialization() tests - ruff format split these out over quite a few lines due to their length
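
The three cases above can be sketched roughly as follows. This is an illustration only, not the code in json_schema.py: the helper name sketch_decimal_pattern, the shape of the lookahead, and the assumption that max_digits is greater than decimal_places are all hypothetical, and leading or trailing zeroes are ignored.

```python
import re
from typing import Optional


def sketch_decimal_pattern(max_digits: Optional[int], decimal_places: Optional[int]) -> str:
    """Hypothetical helper: build a pattern using bracketed quantifiers only."""
    # A blank upper bound means "unbounded", e.g. \d{1,} or \d{0,}.
    dec_upper = '' if decimal_places is None else str(decimal_places)
    int_upper = ''
    lookahead = ''
    if max_digits is not None and decimal_places is not None:
        if max_digits <= decimal_places:
            raise ValueError('this sketch assumes max_digits > decimal_places')
        # Integer places plus decimal places add up to max_digits.
        int_upper = str(max_digits - decimal_places)
    elif max_digits is not None:
        # Only max_digits is known: count digits in a positive lookahead,
        # letting each digit be followed by an optional decimal point.
        lookahead = rf'(?=(?:\d\.?){{1,{max_digits}}}$)'
    return rf'^-?{lookahead}\d{{1,{int_upper}}}(?:\.\d{{0,{dec_upper}}})?$'


print(sketch_decimal_pattern(5, 2))     # ^-?\d{1,3}(?:\.\d{0,2})?$
print(sketch_decimal_pattern(None, 2))  # ^-?\d{1,}(?:\.\d{0,2})?$
print(sketch_decimal_pattern(4, None))  # ^-?(?=(?:\d\.?){1,4}$)\d{1,}(?:\.\d{0,})?$
```

The same template serves all three cases because a blank upper bound is still a valid quantifier.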

I've been able to reuse the patterns for all cases by using bracketed repetition quantifiers, leaving the upper bound empty where it doesn't need to be enforced by max_digits or decimal_places. The downside is that the json_schema() output will be slightly less succinct for the user, with {0,} or {1,} in places where * or + would feel more natural. The new test cases show what this looks like.
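
As a quick sanity check of that trade-off, the bracketed forms behave the same as the shorthand quantifiers in Python's re module. The helper name same_matches and the sample strings are made up for this check.

```python
import re


def same_matches(pattern_a: str, pattern_b: str, samples: list[str]) -> bool:
    """Return True if both patterns accept exactly the same samples."""
    return all(
        bool(re.fullmatch(pattern_a, s)) == bool(re.fullmatch(pattern_b, s))
        for s in samples
    )


samples = ['', '0', '007', '12.5', 'abc']
print(same_matches(r'\d{0,}', r'\d*', samples))  # True
print(same_matches(r'\d{1,}', r'\d+', samples))  # True
```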

I also have a couple of questions:

~~Do we want to allow for trailing zeroes in validation and/or serialization?~~ Implemented for validation
We currently match against something like 1. as a valid Decimal, but wouldn't match against .1 due to the pattern requiring at minimum one integer place.
- Do we want to match against entries like .1? We can drop the requirement for a minimum of one integer place, but then we'd also match . and empty entries. The best way to handle this might be a minimum-digit requirement in the positive lookahead, where we don't care whether the digit is an integer or decimal place, and then always include the lookahead.
- Alternatively, if we shouldn't be matching entries like 1., this should be an easy fix: require a minimum of one digit in the optional decimal place capture group
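
The two options can be compared with simplified patterns. These are sketches rather than the patterns from this PR: constraints and leading or trailing zeroes are left out, and the constant names and the accepted helper are hypothetical.

```python
import re

# Simplified form of the current behaviour: at least one integer digit, optional fraction.
CURRENT_STYLE = r'^-?\d{1,}(?:\.\d{0,})?$'
# Option 1: no minimum on integer digits, but a lookahead demands a digit somewhere.
ALLOW_LEADING_DOT = r'^-?(?=.*\d)\d{0,}(?:\.\d{0,})?$'
# Option 2: if a decimal point is present, at least one digit must follow it.
REQUIRE_FRACTION_DIGIT = r'^-?\d{1,}(?:\.\d{1,})?$'


def accepted(pattern: str, candidates=('1', '1.', '.1', '1.0', '.', '')) -> list[str]:
    """List the candidates that the pattern matches."""
    return [c for c in candidates if re.match(pattern, c)]


print(accepted(CURRENT_STYLE))           # ['1', '1.', '1.0']
print(accepted(ALLOW_LEADING_DOT))       # ['1', '1.', '.1', '1.0']
print(accepted(REQUIRE_FRACTION_DIGIT))  # ['1', '1.0']
```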

Related issue number

fix #10867

Checklist

The pull request title is a good summary of the changes - it will be used in the changelog
Unit tests for the changes exist
Tests pass on CI
Documentation reflects the changes where applicable
My PR is ready to review, please add a comment including the phrase "please review" to assign reviewers

Selected Reviewer: @sydney-runkle

github-actions · 2025-02-08T02:00:53Z

Coverage report

File: pydantic/json_schema.py

This report was generated by python-coverage-comment-action

codspeed-hq · 2025-02-08T02:03:06Z

CodSpeed Performance Report

Merging #11420 will not alter performance

Comparing Jack70248:decimal_constraint (88da428) with main (10af6a8)

Summary

✅ 46 untouched benchmarks

Jack70248 · 2025-02-09T02:46:30Z

please review - thanks!

Jack70248 · 2025-02-11T08:58:26Z

Apologies, I've spotted a couple of outstanding issues. I'll pull this PR back to draft until I can push an update:

The lookahead is redundant when both max_digits and decimal_places are set
~~The lookahead isn't currently working with the requirement for arbitrary leading zeroes in validation~~ This was actually fine

I'll also add in support for arbitrary trailing zeroes in validation, on the basis that Pydantic successfully validates 1111.000 with a constraint of 'max_digits':4

Jack70248 · 2025-02-22T03:21:46Z

This should be ready for review now, thanks @sydney-runkle

Viicos · 2025-03-04T14:03:21Z

+# ##### Regex for Decimal JSON Schema Generation #####
+
+_DECIMAL_JSON_VALIDATION_MAX_DIGIT_LOOKAHEAD_PATTERN = (
+    r'(?=\d{{0,{max_digits}}}0*$'  # Positive lookahead for max_digits, allows trailing zeroes


This will match arbitrary trailing zeros, which is only allowed after the . separator:

ta = TypeAdapter(Annotated[Decimal, Field(max_digits=4)])
ta.validate_python("30000000")  # ValidationError, passes regex:
ta.json_schema()['anyOf'][1]['pattern']
#> '^-?0*(?=\\d{0,4}0*$|(?=.*\\..*)[\\d\\.]{0,5}0*$)\\d{1,}(?:\\.\\d{0,})?0*$'

But I don't know if this is reasonably easy to support, so perhaps we should use a simpler (but one that accepts invalid inputs in some cases) regex when having only max_digits specified.
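
The regex side of this mismatch can be checked with the standard library alone, using the pattern string quoted in the comment above with the string escaping removed. Only the pattern is exercised here; the ValidationError side is as reported above. The extra candidates 12345 and 12.34 are added for contrast.

```python
import re

# Pattern copied from the json_schema() output quoted above.
PATTERN = r'^-?0*(?=\d{0,4}0*$|(?=.*\..*)[\d\.]{0,5}0*$)\d{1,}(?:\.\d{0,})?0*$'

for candidate in ('30000000', '12345', '12.34'):
    print(candidate, bool(re.match(PATTERN, candidate)))
# 30000000 True   (the trailing 0* in the lookahead absorbs the extra integer zeroes)
# 12345 False
# 12.34 True
```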

Jack70248 · 2025-03-11

Thanks @Viicos. I've realised this issue also occurs when both max_digits and decimal_places are set. I've been able to allow trailing zeroes only if a decimal point exists, by using the (?(id/name)yes-pattern|no-pattern) syntax in the max_digits lookahead and always including the lookahead if max_digits is set.

While looking into a solution I've also been able to simplify the max_digits lookahead a bit.
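
Python's re module supports the conditional syntax mentioned above. A minimal sketch of the idea, not the pattern from this PR: trailing zeroes are accepted only when an optional group has captured a decimal point. The pattern and its four-digit bound are made up for illustration.

```python
import re

# Group 1 captures the decimal point if there is one.
# (?(1)0*) then permits trailing zeroes only when group 1 took part in the match.
CONDITIONAL = re.compile(r'^\d{1,4}(\.)?(?(1)0*)$')

for candidate in ('1234', '1234.000', '12340000'):
    print(candidate, bool(CONDITIONAL.match(candidate)))
# 1234 True
# 1234.000 True
# 12340000 False
```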

Viicos · 2025-03-27T09:27:45Z

Thanks and sorry for the delayed response. We are almost there. One issue I noticed is that during serialization, we don't make use of the normalized version:

import re
from decimal import Decimal
from typing import Annotated

from pydantic import TypeAdapter, Field


ta = TypeAdapter(Annotated[Decimal, Field(max_digits=4)])

reg = ta.json_schema(mode='serialization')['pattern']

val = ta.validate_python("0000001.001000")
print(val)
#> 1.001000
dumped = ta.dump_python(val, mode='json')
print(dumped)
#> 1.001000

re.compile(reg).match(dumped)  # None

Jack70248 · 2025-04-03T06:15:57Z

Thanks @Viicos, and no problem on the delay. I've got a few things on for the next week but will be able to push an update later next week.

Jack70248 · 2025-04-13T01:12:47Z

Hi @Viicos

I might need a hand to wrap my head around the solution to this problem. It sounds like we want the output of TypeAdapter's dump_python() method (and BaseModel's model_dump() method?) to return the normalised Decimal form, str(Decimal("1.001000").normalize()) -> "1.001", so that it matches the serialization regex pattern.

It looks like the default serializers for this come from pydantic_core, and there are mechanisms to override them by defining custom serializers such as a PlainSerializer or by using something like core_schema.plain_serializer_function_ser_schema(). It's not clear to me what the best way is to override the TypeAdapter/BaseModel serializer with a custom one, or whether this should instead be changed in pydantic-core.

Let me know if I've misunderstood anything, thanks!
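
For reference, the normalisation in question can be tried with the standard library alone. One caveat, shown in the second half: normalize() switches whole numbers with trailing zeroes to exponent notation, so a serializer built on it would probably need fixed-point formatting as well. This sketch says nothing about how Pydantic itself serializes.

```python
from decimal import Decimal

value = Decimal('1.001000')
print(str(value))              # 1.001000
print(str(value.normalize()))  # 1.001

whole = Decimal('100')
print(str(whole.normalize()))          # 1E+2
print(format(whole.normalize(), 'f'))  # 100
```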

Viicos · 2025-05-08T13:46:42Z

I might need a hand to wrap my head around the solution to this problem. It sounds like we want the output of TypeAdapter's dump_python() method (and BaseModel's model_dump() method?) to return the normalised Decimal form, str(Decimal("1.001000").normalize()) -> "1.001", so that it matches the serialization regex pattern.

We could, but this would be a breaking change so it can only be considered for v3.

Viicos · 2025-07-11T11:50:40Z

Handled in #11420.

KCui0327 and others added 4 commits February 8, 2025 13:36

Add decimal precision constraints to JSON schema

1e916d4

Addressed all feedbacks and added more test cases

890850d

Update decimal regex patterns and related logic in GenerateJsonSchema…

f430c1b

….decimal_schema() method

update tests and docs with new decimal constraint patterns

7014353

github-actions Bot added the relnotes-fix label Feb 8, 2025

Jack70248 changed the title ~~Decimal constraint~~ Add regex patterns reflecting user-defined constraints to the Decimal type's string schema Feb 8, 2025

Jack70248 changed the title ~~Add regex patterns reflecting user-defined constraints to the Decimal type's string schema~~ Add regex patterns reflecting Decimal Constraints to the Decimal type's string schema Feb 8, 2025

Jack70248 added 5 commits February 8, 2025 15:53

Revert "update tests and docs with new decimal constraint patterns"

a67d5b9

This reverts commit 7014353.

Update tests, add test cases to test various decimal regex patterns

753b828

Add guard so we don't add a pattern if both max_digits and decimal_pl…

6446895

…aces are None

Tidy up patterns, address coverage missing line

fd366fa

fix bug in max digit lookahead pattern

805c785

Jack70248 marked this pull request as ready for review February 9, 2025 02:46

pydantic-hooky Bot added the ready for review label Feb 9, 2025

pydantic-hooky Bot assigned sydney-runkle Feb 9, 2025

Jack70248 marked this pull request as draft February 11, 2025 08:58

sydney-runkle added the relnotes-feature and awaiting author revision labels and removed the ready for review and relnotes-fix labels Feb 12, 2025

sydney-runkle mentioned this pull request Feb 12, 2025

Add decimal precision constraints to JSON schema #11016

Closed

5 tasks

Allow for trailing zeroes in validation, refine when a max digit look…

69a0e37

…ahead is used

Jack70248 marked this pull request as ready for review February 22, 2025 03:21

sydney-runkle assigned Viicos and unassigned sydney-runkle Feb 28, 2025

Viicos requested changes Mar 4, 2025

fix trailing zero bug and simplify max_digit lookahead

88da428

Dima-Bulavenko mentioned this pull request Jun 16, 2025

Add regex patterns to JSON schema for Decimal type #11987

Merged

5 tasks

Viicos closed this Jul 11, 2025
