Assign auto populated columns on Active Record object creation #48241
Conversation
|
Can you take a look at the test failures when you're back on Thursday? |
79473d8 to
a9f0f3e
Compare
Done. Was a memoization issue so we added a reset of the There are a few |
ec4f723 to
46054c2
Compare
I suspect this might be the best moment to change that: we can vary the return value by whether the new parameter is supplied. That does mean our caller is going to need to know how to figure out which return strategy the adapter supports, but I think that's already true: prioritising the PK is sensible, but in the case of MySQL, it's not true that we can get the new value for one column of our choosing [e.g. if the PK is not auto-populated] -- we will always get the value of the autoincrement column, whatever it is, and otherwise nothing. 99% of the time that's equivalent, but for when it's not, I think we do need to know about the distinction. (That is, by my reading, the current implementation would be wrong on MySQL when faced with a PK that defaults to |
Right, for some reason I thought we can't do it without a deprecation cycle but we can if we only change the signature in case if the new My only concern here that regardless of implementation the multi-return values solution will only work with PostgreSQL unless we want to add a special case for non-postgresql databases which will make an additional query to fetch values for all auto-populated columns. Though it may be too much for a single PR
That's correct. Not an excuse but just a clarification that current implementation on Though I agree that since we are making changes we have a great opportunity to fix it as well. Because it is impossible to have more than one I'll work on extending the return values of insert methods (either through an argument or using a new method) along with a strategy, will post an update when its ready |
a14ee55 to
ecb6bee
Compare
aa162ce to
23ff3fd
Compare
This commit extends Active Record creation logic to allow for a database
auto-populated attributes to be assigned on object creation.
Given a `Post` model represented by the following schema:
```ruby
create_table :posts, id: false do |t|
t.integer :sequential_number, auto_increment: true
t.string :title, primary_key: true
t.string :ruby_on_rails, default: -> { "concat('R', 'o', 'R')" }
end
```
where `title` is being used as a primary key, the table has an
integer `sequential_number` column populated by a sequence and
`ruby_on_rails` column has a default function - creation of
`Post` records should populate the `sequential_number` and
`ruby_on_rails` attributes:
```ruby
new_post = Post.create(title: 'My first post')
new_post.sequential_number # => 1
new_post.ruby_on_rails # => 'RoR'
```
* At this moment MySQL and SQLite adapters are limited to only one
column being populated and the column must be the `auto_increment`
while PostgreSQL adapter supports any number of auto-populated
columns through `RETURNING` statement.
23ff3fd to
c929332
Compare
matthewd
left a comment
There was a problem hiding this comment.
Looks great!
I think it's worth some follow-up to make sure adapters that don't know how to do full-featured RETURNING will instead raise if they're asked to return a column(s) they can't... but no need to squeeze that into this PR.
Thanks, that's being done right now in parallel and a PR will be ready soon! |
Implementing the full `supports_insert_returning?` contract means the SQLite3 adapter supports auto-populated columns (rails#48241) as well as custom primary keys.
Fixes rails#49502. This bug was introduced in rails#48241. Since `IDENTITY` columns in PostgreSQL do not have defaults, they are not auto populated via the changes made in that PR. Thus we need to consider them separately.
See rails/rails#48241 and rails/rails#49692 48241 added support for setting additional db-controlled attributes via a returning clause. As part of these changes, that PR removed the [setting of the in-memory object's id](https://github.com/rails/rails/pull/48241/files#diff-11b42664eb9834972953ecd5725c75fd6020af6d4ee5da37ef6d8e0c146f7773L1242) in Persistance#_create_record. Instead, those changes expect some `Column`s to be true for `auto_incremented_by_db?` so they can be matched with the id value (and possibly other database values) returned by insert and then used to update attributes on the in memory activerecord object in [`_insert_record`](https://github.com/rails/rails/pull/48241/files#diff-11b42664eb9834972953ecd5725c75fd6020af6d4ee5da37ef6d8e0c146f7773R1248). Unless an adapter adds the ability to identify some Column as `auto_incremented_by_db?` the insert happens but the newly created object does not get its id value updated from nil to the new id value. This commit updates `column_defintions` and `Column` to identify the primary key column and make it selectable by `auto_incremented_by_db?` so the corresponding attribute gets updated with the id_value from the `insert`. 49692 makes further changes to support id_value in `insert` but I still need to explicity pass `returning: nil` and wrap the return from `super` in Array to get the id to update in tests. Limitations: This approach will probably need to be reconsidered if the adapter adds support for composite primary keys, but for now this gets `SomeModel.create` to set the id after the record is inserted and fixes many current test failures. PR 48241 also supports updating more via `returning` but this commit does not add that feature. It looks like the seeds of an implementation are there in `DatabaseStatements#sql_for_insert` and `DatabaseStatements.exec_insert` but the goal of this commit is to fix setting id during create. Performance: Using a left join to bring in the primary key since its performance is pretty close to the performance of the original query and better than using a WITH clause: using benchmark-ips: Warming up -------------------------------------- old_column_definitions 1.000 i/100ms with clause column_definitions 1.000 i/100ms left join column_definitions 1.000 i/100ms Calculating ------------------------------------- old_column_definitions 10.093 (± 9.9%) i/s - 51.000 in 5.070277s with clause column_definitions 9.031 (± 0.0%) i/s - 46.000 in 5.106131s left join column_definitions 9.919 (± 0.0%) i/s - 50.000 in 5.049687s Comparison: old_column_definitions: 10.1 i/s left join column_definitions: 9.9 i/s - same-ish: difference falls within error with clause column_definitions: 9.0 i/s - 1.12x slower
rails/rails#48241 Also relates to change to sqlite3 adapter getting support to 'supports_insert_returning?' true
This PR extends Active Record creation logic to allow for a database
auto-populated attributes to be assigned on object creation.
Given a
Postmodel represented by the following schema:where
titleis being used as a primary key, the table has aninteger
sequential_numbercolumn populated by a sequence andruby_on_railscolumn has a default function - creation ofPostrecords should populate thesequential_numberandruby_on_railsattributes:At this moment MySQL and SQLite adapters are limited to only one
column being populated and the column must be the
auto_incrementwhile PostgreSQL adapter supports any number of auto-populated
columns through
RETURNINGstatement.Implementation details
Overall solution is trying to be generic and extensible to allow for additions while still maintains (for MySQL and SQLite) the fact that currently Rails only populates a single column on creation.
exec_insertandinsertpublic methods. We are adding a newreturningkeyword which accepts either a single value or an array of columns to control the return value of the method and define what column values to be returned after row insertion. Currently only supported by Postgresql usingRETURNINGstatement. The behavior of thereturningargument is defined as following:nilis the default value and doesn't change the behavior of the methods - they continue to return a single primary key orLAST_INSERT_ID()value. If an array of column names is passed - an array is returned back. However non-nilvalue for the argument is only supported by postgresql adapter at the moment.ConnectionAdapters::Columngets a new adapter-abstract attribute calledauto_incremented_by_db?which indicates whether a given column is being auto-incremented by the database regardless of the way it's being incremented. For mysql we alias it toauto_incrementand for postgresql we alias it toserial.ConnectionAdapters::Columnalso gets a newauto_populated?method which allows us to identify columns auto-populated by the database regardless of the way how the column is being populated. Currently only checks forauto_incremented_by_dbor a default functionSQLite3::Columnhas a newrowidproperty which is an implicit way to have an integer column being autoincrementedModelSchema._returning_columns_for_inserthas been added to gather a list of columns values for which we want to get back after insertion. Adapter defines the selection rule. postgresql will select any auto-populated column while mysql and sqlite will select onlyauto_incremented_by_db?which by implication limits the selection to a single column.Long term plans
The feature is flexible enough to be extended to support not just postgresql adapter but any other database that is capable of returning column values after creation.
The closest candidates are:
We'll have to teach adapters how to build a
RETURNINGstatement insql_for_inserthttps://github.com/rails/rails/blob/aa162cefe19b81daa5d7f8c2b9ee4fe2564e73fc/activerecord/lib/active_record/connection_adapters/abstract/database_statements.rb#L632-L633
and then just extend the
return_value_after_insert?method to ask for more than just an auto-increment being returned after an insert.https://github.com/rails/rails/blob/aa162cefe19b81daa5d7f8c2b9ee4fe2564e73fc/activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb#L284-L285
Databases that don't support
RETURNINGstatement or an alternative will require an additionalSELECTquery to be performed to fetch values for every auto-populated columns. An additional query may not be desirable for every application so the behavior should be configurable which brings more complexity and one of the reasons why we are not trying to tackle it all at once.