fix: Dialect requires derived table alias#12994
fix: Dialect requires derived table alias#12994goldmedal merged 6 commits intoapache:mainfrom peasee:feat/dialect-requires-derived-table-alias
Conversation
* fix: Add Dialect option for requiring table aliases * feat: Add CustomDialectBuilder for requires_table_alias * docs: Spelling * refactor: rename requires_derived_table_alias * refactor: rename requires_derived_table_alias
| expected: | ||
| // top projection sort gets derived into a subquery | ||
| // for MySQL, this subquery needs an alias | ||
| "SELECT `j1_min` FROM (SELECT min(`ta`.`j1_id`) AS `j1_min`, min(`ta`.`j1_id`) FROM `j1` AS `ta` ORDER BY min(`ta`.`j1_id`) ASC) AS `derived_sort` LIMIT 10", |
There was a problem hiding this comment.
I found the test only covers the derived_sort case but some cases aren't covered, such as derived_projection, derived_limit,...
Could you add more tests for them?
There was a problem hiding this comment.
I added another test case for a more complex SQL that results in 2 nested derives for derived_sort and derived_distinct in the same query.
I'm not too sure what SQL to use to trigger derived_limit or derived_projection though... any ideas that I could include?
There was a problem hiding this comment.
For derived_projection, I found the SQL can trigger:
select j1_id from (select 1 as j1_id)
-> SELECT `j1_id` FROM (SELECT 1 AS `j1_id`) AS `derived_projection`However, I guess it's not a valid SQL for MySQL, right? I tried it in DataFusion and DuckDB, they accept it but Postgres doesn't allow it. Maybe, it could be a DataFusion dialect to MySQL dialect case.
For derived_limit, a similar case can trigger it:
select * from (select * from j1 limit 10)
-> SELECT * FROM (SELECT * FROM `j1` LIMIT 10) AS `derived_limit`There was a problem hiding this comment.
It is valid for MySQL, I just tested it out in my CLI. I've added both as new cases, thanks!
|
By the way, I tried to pull your branch and run the tests on my laptop but the tests always fail. After merging with the latest main branch, the tests are passed. |
There was a problem hiding this comment.
It's a good improvement for the derived table. 👍 It makes sense to me.
I noticed some databases don't support this kind of unnamed subquery, see #12896 (comment)
This PR would be helpful.
|
Maybe @phillipleblanc and @sgrebnov want to take a look at this PR. |
|
Looks good to me, thanks @goldmedal and @peasee |
|
@goldmedal perhaps you would like to merge this PR as a way to verify your permissions are configured correctly? |
Sure, I plan to merge this PR if no more comments tomorrow. |
Let's hold off on merging this for now. I think I've introduced a regression that I'd like to test more first before merging. |
Okay, looks like my PR didn't introduce the regression but there's a somewhat related bug currently on This gets rewritten in the Which is invalid, because the subquery is un-aliased. I tested it in DuckDB to confirm: Let's merge my PR, and I can raise a new issue for this other bug @goldmedal ? |
It's valid for DataFusion Generally,
How about or If it's the first one, I think it's an issue because it isn't valid. We can file an issue for it. |
My bad, I should've clarified better. It is the first one with |
I see. I think it's an issue but we can fix it in the follow-up PR. Let's merge this PR first. |
|
|
Thanks @peasee and @phillipleblanc @alamb for reviewing. 👍 |
Which issue does this PR close?
Closes #12993.
Rationale for this change
Fixes a bug preventing derived logical plans from running in MySQL due to a requirement that every derived table must have an alias:
What changes are included in this PR?
Includes a dialect update to specify whether derived tables require aliases. For dialects that do, set a static alias. I've simply selected the name of the operation prefixed with
derived_, likederived_sort, etc.Are these changes tested?
Yes
Are there any user-facing changes?
No