More consistent and intuitive alias behavior for validation and serialization #8379
Comments
Plus one having this behavior match and an option to pass to |
It would be great if this could be considered for V2 as well, because it looks more like a bug currently, no? The current behavior is not what the original plan for the alias system #5426 (comment) had specified (every other row in that specification table currently misbehaves). Also, the current behavior means that it is effectively not possible in V2 to decouple the Python field name from its serialized name -- which is kind of the primary feature of the alias system. |
Though I agree that the current behavior is not great, changing the API is enough of a breaking change that I don't think it makes sense to do in V2. Thanks for linking that comment - that's a helpful reference to have in the future on this issue.
Could you please say more about this? I don't think I understand the issue you're having. |
I'm often seeing variations of code that contains funny comments like these:

class SomeModel(BaseModel):
    # Please don't change these field names even though they are poorly named!
    # They are determined by external API X / third-party JSON schema Y / ...
    foo: int
    bar: int

or even

class SomeModel(BaseModel):
    # Please don't fix the typo in this field name to avoid breaking backwards compatibility.
    field_wit_typo: int

This raises the question: Can we not simply decouple the field name visible in the code from the representation in the serialized layer? The alias system seems to be the feature that should provide exactly that. Ideally the code should look like:

class SomeModel(BaseModel):
    # We are decoupling the field names from external API X for better readability
    better_named_foo: int = Field(alias="foo")
    better_named_bar: int = Field(alias="bar")

However, currently in V2 the alias system is a rather dangerous pitfall: It suggests that it addresses this use case but it doesn't! The above code looks right, and a shallow experiment may even wrongly conclude that it does what one expects. But as demonstrated in #8551, none of the 8 possible combinations of Identifying each and every occurrence of Having to use a custom serializer on top doesn't feel right either. Currently the alias system is non-intuitive, does not cover its primary use case, does not follow the docs and the specs in #5426 (comment), so it looks a bit more like a bug, no? |
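To make the pitfall concrete, here is a minimal sketch of the round trip with the model above, as I understand the current V2 defaults. The variable names are only for illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class SomeModel(BaseModel):
    # Python-side names differ from the external names "foo" and "bar".
    better_named_foo: int = Field(alias="foo")
    better_named_bar: int = Field(alias="bar")


# Validation uses the aliases by default.
m = SomeModel.model_validate({"foo": 1, "bar": 2})

# Serialization uses the Python field names unless by_alias=True is passed.
dumped_default = m.model_dump()
dumped_by_alias = m.model_dump(by_alias=True)

# The default dump cannot be validated back into the same model,
# because validation expects the aliases.
try:
    SomeModel.model_validate(dumped_default)
    round_trip_ok = True
except ValidationError:
    round_trip_ok = False
```

So the external names are honored on the way in but not on the way out, and a plain dump-then-validate round trip fails unless every serialization call passes `by_alias=True`.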
This is a pretty major pitfall for my current attempt at upgrading a codebase to 2.x. While V2 aliases seemed promising at first, I am still running into a fair amount of frustration with regards to the by_alias=False default and surprising behavior where sometimes the wrong name is being serialized into the json. |
Are we just waiting for PRs on this, or are we not sold on a particular solution? Where are we at with this?

diff --git a/pydantic/config.py b/pydantic/config.py
index 7edf7c60..504d627d 100644
--- a/pydantic/config.py
+++ b/pydantic/config.py
@@ -161,6 +161,11 @@ class ConfigDict(TypedDict, total=False):
         3. The model is populated by the field name `'name'`.
     """
 
+    serialize_by_name: bool
+    """
+    counter part to populate_by_name
+    """
+
     use_enum_values: bool
     """
     Whether to populate models with the `value` property of enums, rather than the raw enum.
diff --git a/pydantic/main.py b/pydantic/main.py
index 525c8f98..fc61d8a7 100644
--- a/pydantic/main.py
+++ b/pydantic/main.py
@@ -363,6 +363,8 @@ class BaseModel(metaclass=_model_construction.ModelMetaclass):
         Returns:
             A JSON string representation of the model.
         """
+        if self.model_config.get("serialize_by_name", False):
+            by_alias = True
         return self.__pydantic_serializer__.to_json(
             self,
             indent=indent,
|
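Until a config option like the one in this diff exists, a similar effect can be approximated in user code. This is only a sketch: `AliasDumpingModel` and `Item` are hypothetical names, not part of pydantic, and the override simply changes the default of the existing `by_alias` argument.

```python
from typing import Any

from pydantic import BaseModel, ConfigDict, Field


class AliasDumpingModel(BaseModel):
    """Hypothetical base class: dump by alias unless the caller says otherwise."""

    # Accept both the field name and the alias on input.
    model_config = ConfigDict(populate_by_name=True)

    def model_dump(self, **kwargs: Any) -> dict[str, Any]:
        # Only set by_alias when the caller did not pass it explicitly.
        kwargs.setdefault("by_alias", True)
        return super().model_dump(**kwargs)

    def model_dump_json(self, **kwargs: Any) -> str:
        kwargs.setdefault("by_alias", True)
        return super().model_dump_json(**kwargs)


class Item(AliasDumpingModel):
    item_name: str = Field(alias="itemName")


item = Item(item_name="x")
```

One limitation of this approach: the override only takes effect when the dump starts at a model derived from the base class. If such a model is nested inside a plain `BaseModel` that is dumped without `by_alias=True`, the override is never consulted.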
I think we can add an option to config now. We can't change the default behaviour until V3. |
@samuelcolvin I have not made any contributions to |
I started learning Rust in order to contribute to pydantic core but so far have not been able to get anything meaningful working. IIRC it's going to be a change to core, which is to say I'm not working on it at the moment. |
I don't know of anyone working on it at the moment. This will definitely require changes in @fortify-avnenciu or @jammymalina, feel free to take a stab! |
I have a different case that needs this feature, so I decided to share it in this thread too: If I have a superclass model which has fields with aliases, I would like to be able to define how to serialize in the subclass. For example, I have a class which converts a date object to month and year as ints, and I have two models which inherit from this model:

from datetime import date
from typing import Optional

from pydantic import BaseModel, ConfigDict, Field, computed_field


class DateToMonthYearSchema(BaseModel):
    model_config = ConfigDict(populate_by_name=True)

    issue_date: date = Field(
        exclude=True,
        alias='start_date',
    )
    expire_date: Optional[date] = Field(
        default=None,
        exclude=True,
        alias='end_date',
    )

    @computed_field(alias='start_date_month')  # type: ignore[misc]
    @property
    def issue_date_month(self) -> int:
        return self.issue_date.month

    @computed_field(alias='start_date_year')  # type: ignore[misc]
    @property
    def issue_date_year(self) -> int:
        return self.issue_date.year


class UserWorkExperienceSchema(DateToMonthYearSchema):
    id: int
    currently_working: bool


class CertificateSchema(DateToMonthYearSchema):
    id: int

Ideally, I can use `.model_dump(by_alias=True)`; however, these two models are also part of another model:

class UserProfileSchema(BaseModel):
    work_experiences: list[UserWorkExperienceSchema] = []
    certificates: list[CertificateSchema] = []

So, I get either |
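A reduced sketch of the nesting situation, using hypothetical `Inner` and `Outer` models rather than the schemas above: as far as I can tell, `by_alias` is a single flag for the whole `model_dump` call, so it applies to every nested model at once and cannot be chosen per subclass.

```python
from pydantic import BaseModel, Field


class Inner(BaseModel):
    issue_date_month: int = Field(alias="start_date_month")


class Outer(BaseModel):
    # The outer field has no alias; only the nested model defines one.
    certificates: list[Inner] = []


outer = Outer(certificates=[Inner(start_date_month=5)])

# The flag passed at the top level decides the key names at every depth.
by_name = outer.model_dump()
by_alias = outer.model_dump(by_alias=True)
```

With `by_alias=True` the nested dicts use `start_date_month`; without it they use `issue_date_month`. There is no built-in way in this sketch to mix the two within one call.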
There doesn't seem to be any open PRs for this on pydantic-core, so I am going to look at this.. 🧐 Very keen to see this feature. |
Any progress? I can also take a look at this for v2.10, if desired. |
Hey @sydney-runkle I had only started to look at this when you commented, so tbf feel free to reassign if this is high in your list of priorities; otherwise I can probably put a PR up for this within the next 7-10 days. |
Is there a related pydantic-core issue for keeping track of failed attempts yet? |
Great, ping me when you're ready :). |
It would be great to see some way of doing:

class SomeModel(BaseModel):
    # We are decoupling the field names from external API X for better readability
    better_named_foo: int = Field(alias="foo")
    better_named_bar: int = Field(alias="bar")

I have recently started using pydantic and have been really impressed, but I am working with a lot of camelCase APIs and am having to replicate this in all of my models, which feels very awkward. I would be happy to put some time into a solution to this if there is agreement on what it should look like - there have been a lot of suggestions in different threads. Maybe if v3 is going to fix the behaviour this could be backported under a different argument to |
It would be great if the Pydantic model could completely separate the Python attribute name for a field from the dictionary key used when serializing and deserializing JSON/YAML/etc.

class SomeModel(BaseModel):
    some_field: str

    model_config = ConfigDict(serialized_key_generator=to_camel, extra="forbid")

SomeModel(some_field="x")  # Legal
SomeModel(someField="x")  # Illegal: someField not allowed here
SomeModel.model_validate_json('{"someField": "x"}')  # Legal
SomeModel.model_validate_json('{"some_field": "x"}')  # Illegal: some_field not allowed here
SomeModel(some_field="x").model_dump_json()  # {"someField": "x"}

The Python attribute name has constraints that are very different from the serialized form. Currently it seems as if the only options are:
I think this is addressing the same problem as #8379 (comment) only without changing how aliases work? Also, I think this could be maybe done in v2 since it's opt-in? |
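For comparison, the closest existing V2 mechanism I know of is `alias_generator` combined with `populate_by_name`. The sketch below uses only existing pydantic calls. Note that it is looser than the proposal: with `populate_by_name=True` the field name is accepted everywhere on input, and the camelCase key only appears on output when `by_alias=True` is passed.

```python
from pydantic import BaseModel, ConfigDict
from pydantic.alias_generators import to_camel


class SomeModel(BaseModel):
    # Generate a camelCase alias for every field, and also accept field names.
    model_config = ConfigDict(alias_generator=to_camel, populate_by_name=True)

    some_field: str


# Both construction paths work with this configuration.
from_json = SomeModel.model_validate_json('{"someField": "x"}')
from_python = SomeModel(some_field="x")

# The camelCase key is only produced when by_alias=True is passed explicitly.
json_by_alias = from_python.model_dump_json(by_alias=True)
json_default = from_python.model_dump_json()
```

So the strict split asked for above (Python name only in Python, serialized key only in JSON) is not what this configuration gives you.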
@mrob95 @alicederyn doesn't |
As far as I can tell, no, it does not. |
Just wants to give a shot at #6762 here. Ideally when we use A new model config might be unavoidable before we can make it the new default behavior. cc @sydney-runkle who seems to be working on it. |
@stevapple how about my suggestion of adding a new metadata type, |
@alicederyn As long as others are comfortable about the migration from With your proposed change it looks like |
I plan on working on this for v2.11 👍 |
We are excited that the pydantic team has planned this for v2.11! I don't know what the API will look like, but I wrote some test code as a small contribution. If I understand correctly, this is the behavior users should expect.

import pytest
from pydantic import BaseModel, Field, ValidationError


class SomeModel(BaseModel):
    some_field: str = Field(alias="someField")  # NOTE: Replace "alias" with the new API


def test_construct() -> None:
    # Currently, ValidationError is raised at runtime and the type checker reports an error
    m = SomeModel(some_field="x")
    assert m.some_field == "x"
    assert m.model_dump_json() == '{"someField":"x"}'


def test_construct_illegal() -> None:
    with pytest.raises(ValidationError):
        SomeModel(someField="x")  # Currently, DID NOT RAISE


def test_validate() -> None:
    m = SomeModel.model_validate_json('{"someField":"x"}')
    assert m.some_field == "x"
    # Currently, AssertionError '{"some_field":"x"}'
    assert m.model_dump_json() == '{"someField":"x"}'


def test_validate_illegal() -> None:  # Currently, OK
    with pytest.raises(ValidationError):
        SomeModel.model_validate_json('{"some_field":"x"}')
|
Yes, we're excited to have this land in v2.11 as well. I'll be starting work on this today :) |
I've merged #11468 which makes some significant strides towards a unified API here :) |
Right now, we have some inconsistent behavior in terms of using aliases in validation and serialization.

By default, if an `alias` or `validation_alias` is defined on a field, we use the alias for validation. This behavior can be changed by setting `populate_by_name` to `True` on the `model_config`.

Conversely, if an `alias` or `serialization_alias` is defined on a field, that alias is not used by default for serialization. We must specify `by_alias=True` in the call to `model_dump` + other serialization functions.

I propose that in V3:
- Use `alias` by default for both validation and serialization
- Add settings to `ConfigDict` to support different behavior than the default

This is a breaking change, hence the `V3` label.

Requests to change the inconsistent default behavior have been made for a few years, so I'm going to comb through issues and close those so we can centralize discussion here.
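The validation side of the inconsistency described above can be sketched as follows. `Strict`, `Lenient`, `Split`, and the `accepts` helper are hypothetical names for illustration; the pydantic calls themselves are existing V2 API.

```python
from pydantic import BaseModel, ConfigDict, Field, ValidationError


class Strict(BaseModel):
    # Default config: only the alias is accepted on input.
    name: str = Field(alias="username")


class Lenient(BaseModel):
    # populate_by_name=True: the alias and the field name are both accepted.
    model_config = ConfigDict(populate_by_name=True)

    name: str = Field(alias="username")


class Split(BaseModel):
    # Separate aliases for the input and output directions.
    name: str = Field(validation_alias="user_name", serialization_alias="userName")


def accepts(model: type[BaseModel], data: dict) -> bool:
    """Return True if `data` validates against `model`."""
    try:
        model.model_validate(data)
        return True
    except ValidationError:
        return False


results = {
    "strict_alias": accepts(Strict, {"username": "a"}),
    "strict_name": accepts(Strict, {"name": "a"}),
    "lenient_alias": accepts(Lenient, {"username": "a"}),
    "lenient_name": accepts(Lenient, {"name": "a"}),
}

split = Split.model_validate({"user_name": "a"})
split_default = split.model_dump()
split_by_alias = split.model_dump(by_alias=True)
```

Validation is alias-first and opt-in for names (a config setting), while serialization is name-first and opt-in for aliases (a per-call argument). That asymmetry in both the default and the place where it is configured is what this issue is about.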