Skip to content

Comments

Fix unparser invalid sql for query with order#11527

Merged
alamb merged 5 commits intoapache:mainfrom
spiceai:fix-unparser-invalid-sql-for-query-with-order
Jul 22, 2024
Merged

Fix unparser invalid sql for query with order#11527
alamb merged 5 commits intoapache:mainfrom
spiceai:fix-unparser-invalid-sql-for-query-with-order

Conversation

@y-f-u
Copy link
Contributor

@y-f-u y-f-u commented Jul 18, 2024

Which issue does this PR close?

This PR addresses two issues:

  • missing order/limit/distinct on derived table
    Before the fix, query like select a from (select a, b from table order by b desc) tmp order by a will only keep the order by b desc at a wrong nesting level after unparsing.

  • order by columns not on select
    Before the fix, query like select a from table order by b will be unparsed into select table.a from (select table.a, table.b from table) order by table.b which is invalid as the derived table misses alias.

Rationale for this change

To prevent unparser generating invalid sql for above two common scenarios

What changes are included in this PR?

  • relation derive can happen for projection, limit, order, distinct as they can be top-level logical plan node.
  • rewrite the logicalplan when it has a pattern which parsed from sort columns are not in select. See code comment for more details

Are these changes tested?

Yes

Are there any user-facing changes?

No

y-f-u added 2 commits July 18, 2024 15:50
…th limit/sort/distinct; fix wrong unparsed query for original query with sort column that is not in select
@github-actions github-actions bot added the sql SQL Planner label Jul 18, 2024
Copy link
Contributor

@goldmedal goldmedal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @y-f-u. LGTM. Just one minor comment. I also tried other cases, and they work well.

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @y-f-u and @goldmedal for the review

I think it would be good to implement some of the suggestions in this PR from @goldmedal but I think we could also do it as a follow on PR too.

@y-f-u
Copy link
Contributor Author

y-f-u commented Jul 19, 2024

Thanks @alamb and @goldmedal . I've addressed all the comments, also fixed one missing case that the projected column is an alias. Now all works well :)

@alamb
Copy link
Contributor

alamb commented Jul 19, 2024

https://github.com/apache/datafusion/actions/runs/10001535587/job/27645407724?pr=11527

Doesn't seem related to the changes in this PR. I'll restart the check and if it passes let's file a ticket about the intermittent failure

failures:

---- fuzz_cases::join_fuzz::test_anti_join_1k_filtered stdout ----
thread 'fuzz_cases::join_fuzz::test_anti_join_1k_filtered' panicked at datafusion/core/tests/fuzz_cases/join_fuzz.rs:537:17:
assertion `left == right` failed: HashJoinExec and SortMergeJoinExec produced different row counts, batch_size: 1
  left: 952
 right: 953
stack backtrace:

@alamb
Copy link
Contributor

alamb commented Jul 19, 2024

Filed #11555 for intermittent test failure

@alamb
Copy link
Contributor

alamb commented Jul 19, 2024

Oh no! It has another conflict to resolve.

I can't push commits to the spiceai repo or I would fix it myself.

Screenshot 2024-07-19 at 5 48 42 PM

@y-f-u
Copy link
Contributor Author

y-f-u commented Jul 21, 2024

I will fix the conflict

@alamb alamb merged commit ecf5323 into apache:main Jul 22, 2024
@alamb
Copy link
Contributor

alamb commented Jul 22, 2024

Thanks @y-f-u

@y-f-u y-f-u deleted the fix-unparser-invalid-sql-for-query-with-order branch July 22, 2024 23:17
Lordworms pushed a commit to Lordworms/arrow-datafusion that referenced this pull request Jul 23, 2024
* wip

* fix wrong unparsed query for original query that has derived table with limit/sort/distinct; fix wrong unparsed query for original query with sort column that is not in select

* clippy

* addressed the comments, also fix one issue when selected column is aliased - see test
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

sql SQL Planner

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants