Skip to content

Comments

Add support for generated columns in PostgreSQL#39368

Closed
fanfilmu wants to merge 1 commit intorails:mainfrom
fanfilmu:ar-postgres-virtual-columns
Closed

Add support for generated columns in PostgreSQL#39368
fanfilmu wants to merge 1 commit intorails:mainfrom
fanfilmu:ar-postgres-virtual-columns

Conversation

@fanfilmu
Copy link
Contributor

@fanfilmu fanfilmu commented May 20, 2020

Summary

Related topic: https://discuss.rubyonrails.org/t/activerecord-postgresql-support-for-generated-virtual-columns/75302

This PR adds support for generated (virtual) columns to PostgreSQL adapter.
Those are available in PSQL since version 12: https://www.postgresql.org/docs/12/ddl-generated-columns.html

Other Information

I heavily relied on the MySQL implementation for this case, with small adjustments - for example, generated columns are always stored in PostgreSQL, so there is no stored option.

I was running the tests for PSQL locally (test:postgresql) - they pass against version 11 and 12. I don't think there is a Postgres version specified in CI?

Also, it's my first contribution to Rails, so please let me know if anything is out of order here :)

@fanfilmu fanfilmu force-pushed the ar-postgres-virtual-columns branch 3 times, most recently from 5289960 to 3f7dfc4 Compare May 24, 2020 11:13
@fanfilmu fanfilmu force-pushed the ar-postgres-virtual-columns branch from 3f7dfc4 to 0f9610d Compare June 4, 2020 08:58
@sergiopantoja
Copy link

+1 for generated column support on Postgres. For reference, here's the initial commit for adding generated columns to the MySQL adapter: 65bf1c6

I think we should have the stored: true option. Postgres 12 only supports STORED for now, but this would make the t.virtual signature match the MySQL adapter and make this ready for when Postgres does eventually add the VIRTUAL option.

If stored: true isn't specified and the Postgres adapter is being used, then we can throw an error saying the VIRTUAL option isn't implemented in Postgres yet, so please use stored: true.

@fanfilmu fanfilmu force-pushed the ar-postgres-virtual-columns branch from 0f9610d to 39d2b52 Compare July 2, 2020 05:58
@fanfilmu
Copy link
Contributor Author

fanfilmu commented Jul 2, 2020

Thank you for your comment @sergiopantoja !

I added handling of the stored option, where it raises ArgumentError if it's not passed (or is false) along with as.

I am thinking though if maybe it would be better to simply let PostgreSQL raise error in this case. On one hand, currently users will get nice readable error message, but the adapter code will have to be adjusted if Postgres actually implements non-persistent generated columns.

So the alternative would be:

1. Adapter side - simply add STORED to the SQL if stored: true
2. User side - if option is not supplied, PostgreSQL will complain about lack of STORED (until they implement it)

Let me know what you think!

EDIT: It will require some work on the adapter anyway, due to schema dumping. Currently PostgreSQL marks (stored) generated columns with s (see here), if they introduce not stored ones, it will be something else, but we have no way of knowing.

So I guess raising error is the way to go!

@md5
Copy link
Contributor

md5 commented Jul 14, 2020

Has there been any discussion of including these values in the RETURNING clause currently used to fetch the assigned value for primary keys? Without that functionality, I believe a #reload is needed to see these values for new records.

There is some related discussion in #34237

As pointed out by @matthewd in #34237 (comment), the RETURNING would ideally be there for both INSERT and UPDATE

@fanfilmu
Copy link
Contributor Author

@md5 You are correct, they do not get populated after insert/update, I have noticed it too. I figure it's a bit different feature though, maybe you can try asking around in https://discuss.rubyonrails.org/ ?

@fanfilmu fanfilmu force-pushed the ar-postgres-virtual-columns branch from 39d2b52 to 9ff315c Compare July 15, 2020 06:34
@fanfilmu fanfilmu force-pushed the ar-postgres-virtual-columns branch from 9ff315c to fc71338 Compare July 15, 2020 06:38
@md5
Copy link
Contributor

md5 commented Jul 15, 2020

@fanfilmu yes, I think you are right. The situation will be similar to the current situation with function-valued defaults, which also require a #reload to be reflected in the model when they are not explicitly set.

@mlt
Copy link

mlt commented Aug 7, 2020

I might be wrong but we need to exclude those column while loading fixtures. Currently build_fixture_sql tries to stuff those with DEFAULT. I do not see that addressed in the patch. It looks like d8d6bd5 (database_statements.rb circa line 442) took care of it as a side effect. Although I'd love to see this feature somehow backported to 5.2.x :/

There seem to be a bug in PG as DEFAULT is an allowed keyword for inserts into generated columns, but multiple rows fail to insert https://www.db-fiddle.com/f/2NxbvvNvsFBTzYEif2jTWK/0 . Here is the PG mailing list thread.

Backport for 5.2.x if anybody needs https://github.com/mlt/rails/commits/ar-postgres-virtual-columns-52

@marksiemers
Copy link
Contributor

We recently upgraded from PG10 to 12 and are about to add quite a few generated columns.

Anything I can do here to help get this PR merged (and backported to rails 5.2)?

@simi
Copy link
Contributor

simi commented Sep 28, 2020

@marksiemers AFAIK new features are not backported.

@marksiemers
Copy link
Contributor

@marksiemers AFAIK new features are not backported.

Thanks @simi - we'll plan to upgrade to rails 6 (assuming we can make sure the apartment gem still works).

Anything that can be done to help get this merged and available in Rails 6?

@jeremy jeremy added this to the 6.1.0 milestone Sep 29, 2020
Copy link
Member

@jeremy jeremy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for contributing this work @fanfilmu !

raise ArgumentError, <<~MSG
PostgreSQL currently does not support VIRTUAL (not persisted) generated columns.
Specify 'stored: true' option for '#{options[:column].name}'
MSG
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we omit STORED, does it result in an error? That may be preferable, considering future versions may support non-stored virtual columns.

Copy link
Contributor

@marksiemers marksiemers Oct 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As of PG 12 and PG 13, it is required:

GENERATED ALWAYS AS ( generation_expr ) STORED
...
The keyword STORED is required to signify that the column will be computed on write and will be stored on disk.

Source: https://www.postgresql.org/docs/12/sql-createtable.html
Source: https://www.postgresql.org/docs/13/sql-createtable.html

It looks like their plan is to require the keyword STORED and in future versions, require the keyword VIRTUAL

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we implement this with an expectation of compatibility with future PG versions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Predicting the future is always a bit tricky :)

In regard to your original suggestion, the thought is to let the exception happen deeper down the stack (with the pg gem) and let that error be the guide - as opposed to creating this error. Is that right?

If that is correct, I think the trade-off is:

  • As written

    • Con: More brittle in handling postgres updates
    • Pro: Gives the developer a precise fix in the error message (Specify 'stored: true' option for ...)
  • Letting pg gem raise the error

    • Pro: handles future postgresql updates more accurately/flexibly
    • Con: Leaves it up to the developer to google the error to find the fix

@fanfilmu - WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Giving the developer a precise fix in the error message is definitely a nice feature.

```ruby
create_table :users do |t|
t.string :name
t.virtual :name_upcased, type: :string, as: 'upper(name)'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will fail without stored: true ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a good idea to add stored: true to this example, even if it wouldn't fail.

pg_get_expr(d.adbin, d.adrelid), a.attnotnull, a.atttypid, a.atttypmod,
c.collname, col_description(a.attrelid, a.attnum) AS comment
c.collname, col_description(a.attrelid, a.attnum) AS comment,
#{supports_virtual_columns? ? 'attgenerated' : quote('')} as attgenerated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In PG 12, do non-generated columns return an empty string here (quote('')) or NULL?

Copy link
Contributor

@marksiemers marksiemers Oct 1, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested on our PG12 server, and it looks like an empty string (pgAdmin will show [null] for NULL values):
Screen Shot 2020-10-01 at 3 55 49 PM

From this query:

  SELECT attgenerated
   FROM pg_attribute a
   LEFT JOIN pg_attrdef d ON a.attrelid = d.adrelid AND a.attnum = d.adnum
   LEFT JOIN pg_type t ON a.atttypid = t.oid
   LEFT JOIN pg_collation c ON a.attcollation = c.oid AND a.attcollation <> t.typcollation
    WHERE a.attnum > 0 AND NOT a.attisdropped
  LIMIT 10;

## all documents matching 'cat & dog'
Document.where("textsearchable_index_col @@ to_tsquery(?)", "cat & dog")
```

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice touch sharing this usage.

end

def virtual?
@generated == "s"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be any non-nil value? Assuming "s" is specific to stored columns, anticipating that non-stored columns may be supported in PG 13+.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It may be some time before virtual are supported. PG12 nor 13 support it.

The current devel documentation also states that only stored are supported: https://www.postgresql.org/docs/devel/ddl-generated-columns.html

Would something like this work?

VIRTUAL_COLUMN_KEYS = %w[s].freeze # when PG supports virtual switch to %w[s v].freeze

# ...

def virtual?
  VIRTUAL_COLUMN_KEYS.include?(@generated)
end

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# when PG supports virtual switch to %w[s v].freeze

This makes Rails brittle with respect to specific PostgreSQL versions supported, however.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, so the alternative would be to check for a non-null non-empty string value. The presumption being that will be definitive no matter the value of the string.

Sounds good to me.

@marksiemers
Copy link
Contributor

@fanfilmu - Do you have time to make the requested updates? Would you like someone else to take a shot at it?

@colinux
Copy link

colinux commented Oct 14, 2020

I might be wrong but we need to exclude those column while loading fixtures.
There seem to be a bug in PG as DEFAULT is an allowed keyword for inserts into generated columns, but multiple rows fail to insert https://www.db-fiddle.com/f/2NxbvvNvsFBTzYEif2jTWK/0 . Here is the PG mailing list thread.

Backport for 5.2.x if anybody needs https://github.com/mlt/rails/commits/ar-postgres-virtual-columns-52

Agreed, especially since this bug still exists in PG 13 and is breaking fixtures.

@Intrepidd
Copy link
Contributor

Would love to see this hit master :) Any work I can do ?

@rafaelfranca rafaelfranca removed this from the 6.1.0 milestone Nov 24, 2020
# db/migrate/20131220144913_create_users.rb
create_table :users do |t|
t.string :name
t.virtual :name_upcased, type: :string, as: 'upper(name)'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
t.virtual :name_upcased, type: :string, as: 'upper(name)'
t.virtual :name_upcased, type: :string, as: 'upper(name)', stored: true

as per https://github.com/rails/rails/pull/39368/files#r496404108

Base automatically changed from master to main January 14, 2021 17:01

def new_column_definition(name, type, **options) # :nodoc:
case type
when :virtual
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has there been discussion about the history behind the virtual name/keyword here?

The entry in the PostgreSQL v13 documentation is under the "Generated Columns" heading, the SQL command itself contains GENERATED ALWAYS AS, and the opposite of the STORED flag at the end of the statement is presumed to be VIRTUAL. As of January 2021, the STORED option is required.

My gut tells me that generated would be a more appropriate name in all the cases where this PR introduces virtual. What am I missing here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, in this comment: #39368 (comment)

I think we should have the stored: true option. Postgres 12 only supports STORED for now, but this would make the t.virtual signature match the MySQL adapter and make this ready for when Postgres does eventually add the VIRTUAL option.

The idea is that using the name virtual with a stored option should:

  1. Match an existing method for MySQL and make this feature more DB-agnostic
  2. Support a (non-stored) "virtual" column once postgresql supports it

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for clarifying @marksiemers. I was unaware of the MySQL analog.

@Yuji-Kuroko
Copy link

Hello. I need this feature very much!
Is there anything that I can help with?

@dsounded
Copy link
Contributor

dsounded commented May 5, 2021

Currently if we wanna use it

def change
    execute <<-SQL
      ALTER TABLE table_name
      ADD COLUMN month integer GENERATED ALWAYS AS (EXTRACT (MONTH FROM date)) STORED
    SQL
    add_index :table_name, :month
  end

^ production use case.

it works okay, but somehow it generates wrong default option in schema

e.g. , default: -> { "date_part('month'::text, date)" }
Which causes an error during rake db:schema:load for postgres (that you cannot use fields values as a default option part).

Temporary solution is just to remove default from schema and create a custom script to support test db.

That's not that handy, so having this type of functionality in Rails will be really nice.

@ghiculescu
Copy link
Member

@dsounded to make that accurate, you could set config.active_record.schema_format = :sql and switch to using db:structure instead of db:schema. At least until this functionality is natively added.

@dsounded
Copy link
Contributor

dsounded commented May 7, 2021

@ghiculescu Yeah I understand that, but if this is going to be merged soon then I can wait for it, I'd like to avoid sql schema way

@judy
Copy link

judy commented May 11, 2021

Adding my interest for this PR (or the redux by @MSNexploder). Our team is also using schema.rb currently for a large project and switching to structure.sql feels like it would be a big shift for the team. Not sure how to get around the issues mentioned above without it though.

@JuanCrg90
Copy link

Any update about this change? I tried to add a generated column and the fixtures started to fail.

@mlt
Copy link

mlt commented Aug 1, 2021

@JuanCrg90 If you wait long enough, it should work with PG 14+. The bug with DEFAULT in multirow INSERT that causes fixtures to fail was fixed.

@ghiculescu
Copy link
Member

This was implemented in #41856 so I'm going to close this one. Thanks for everyone who helped on it.

@ghiculescu ghiculescu closed this Sep 16, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.