Stop serializing columns as YAML by default#47422
Conversation
2d9da0a to
7deaa5d
Compare
313d6b3 to
768801c
Compare
|
Note to self: Let's ship the refactor without the change of default for now. |
|
I don't think we need to provide a default since we do provide a global configuration, that to me is good enough if people wants a default. |
516df83 to
da3fef6
Compare
YAML is great for configuration and such, but for serializing arbitrary
data it has many problem, both in term of performance and efficiency
but also in term of security.
As such, we shouldn't let it be the default.
The open question is wether we should provide another default, or
just let users chose what they want based on their own tradeoffs.
Many people would probably suggest JSON as the new default, unfortunately
I don't think it's a good fit either because the parsers available in
Ruby have some wonky behaviors:
```ruby
>> ActiveSupport::JSON.decode(ActiveSupport::JSON.encode(Object.new))
=> {}
>> JSON.load(JSON.dump(Object.new))
=> "#<Object:0x000000012b61a068>"
```
If we were to select another default, I beleive it would need several
properties:
- It should only serialized a safe list of primitive types.
- It should explictly raise if attempting to serialize complex types.
da3fef6 to
cee1555
Compare
|
cc @tenderlove @jhawthorn as we discussed this a couple days ago. We mentioned maybe implementing a strict JSON encoder. The performance likely wouldn't be good and it's a big can of worm I'm not sure I want to get into. In the meantime I implemented a In the meantime I think I'm ok with no default. Serializing data inside the database is a decision with big consequences, so I think I like that users now need to make that choice themselves. So I think I'll merge this for now, but if there are strong opinions we can revert or set another default. |
| active_record.raise_on_assign_to_attr_readonly = true | ||
| active_record.belongs_to_required_validates_foreign_key = false | ||
| active_record.before_committed_on_all_records = true | ||
| active_record.default_column_serializer = nil |
There was a problem hiding this comment.
This should be in 7.0, and setting it to YAML. Whether we want to be broken out of the box for new applications is debatable, but we definitely don't want to break for upgrades.
There was a problem hiding this comment.
@matthewd the default is set here:
It's still YAML for all existing applications, we only set it to nil if you have load_default 7.1.
There was a problem hiding this comment.
This causes different behaviour of active_record when used by itself vs being used in a Rails 7.1 application. Would it not be better to have active_record default to nil also so at least the behaviour is consistent and gems which depend on active_record can have parity between their tests and behaviour when used in Rails apps (see rpush/rpush#668 for example)?
There was a problem hiding this comment.
@benlangfeld unfortunately no, because the default must be YAMLColumn if load_defaults is never called.
This is the way we do it for all default changes.
Followup: #47463
YAML is great for configuration and such, but for serializing arbitrary data it has many problem, both in term of performance and efficiency but also in term of security.
As such, we shouldn't let it be the default.
The open question is wether we should provide another default, or just let users chose what they want based on their own tradeoffs.
Many people would probably suggest JSON as the new default, unfortunately I don't think it's a good fit either because the parsers available in Ruby have some wonky behaviors:
If we were to select another default, I beleive it would need several properties:
NB: I have quite a lot of documentation to update, but making sure CI passes first.