Details
- Type: Bug
- Status: In Progress
- Priority: Major
- Resolution: Unresolved
- Affects Version/s: 3.0.0
- Fix Version/s: None
- Component/s: None
Description
The analyzer can sometimes fail to fully resolve aggregate functions. For example:
select max(id) from range(10) group by id having count(1) >= 1 order by max(id)
The analyzed plan of this query is:
== Analyzed Logical Plan ==
max(id): bigint
Project [max(id)#91L]
+- Sort [max(id#88L) ASC NULLS FIRST], true
   +- Project [max(id)#91L, id#88L]
      +- Filter (count(1)#93L >= cast(1 as bigint))
         +- Aggregate [id#88L], [max(id#88L) AS max(id)#91L, count(1) AS count(1)#93L, id#88L]
            +- Range (0, 10, step=1, splits=None)
Note how an aggregate function appears outside of any Aggregate operator in the fully analyzed plan: the Sort operator orders by max(id#88L) directly (Sort [max(id#88L) ASC NULLS FIRST], true), which makes the plan invalid. The analyzer should instead have rewritten the ORDER BY to reference the Aggregate's output attribute max(id)#91L.
Trying to run this query fails at codegen time, but the root cause is in the analyzer:
java.lang.UnsupportedOperationException: Cannot generate code for expression: max(input[1, bigint, false])
  at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode(Expression.scala:291)
  at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode$(Expression.scala:290)
  at org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression.doGenCode(interfaces.scala:87)
  at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$3(Expression.scala:138)
  at scala.Option.getOrElse(Option.scala:138)
  at org.apache.spark.sql.catalyst.expressions.Expression.genCode(Expression.scala:133)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateOrdering$.$anonfun$createOrderKeys$1(GenerateOrdering.scala:82)
  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
  at scala.collection.TraversableLike.map(TraversableLike.scala:237)
  at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
  at scala.collection.AbstractTraversable.map(Traversable.scala:108)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateOrdering$.createOrderKeys(GenerateOrdering.scala:82)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateOrdering$.genComparisons(GenerateOrdering.scala:91)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateOrdering$.create(GenerateOrdering.scala:152)
  at org.apache.spark.sql.catalyst.expressions.codegen.GenerateOrdering$.create(GenerateOrdering.scala:44)
  at org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator.generate(CodeGenerator.scala:1194)
  at org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering.<init>(GenerateOrdering.scala:195)
  at org.apache.spark.sql.catalyst.expressions.codegen.LazilyGeneratedOrdering.<init>(GenerateOrdering.scala:192)
  at org.apache.spark.sql.execution.TakeOrderedAndProjectExec.executeCollect(limit.scala:153)
  at org.apache.spark.sql.Dataset.collectFromPlan(Dataset.scala:3302)
  at org.apache.spark.sql.Dataset.$anonfun$head$1(Dataset.scala:2470)
  at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3291)
  at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:87)
  at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:147)
  at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:74)
  at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3287)
  at org.apache.spark.sql.Dataset.head(Dataset.scala:2470)
  at org.apache.spark.sql.Dataset.take(Dataset.scala:2684)
  at org.apache.spark.sql.Dataset.getRows(Dataset.scala:262)
  at org.apache.spark.sql.Dataset.showString(Dataset.scala:299)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:753)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:712)
  at org.apache.spark.sql.Dataset.show(Dataset.scala:721)
The test case "SPARK-23957 Remove redundant sort from subquery plan (scalar subquery)" in SubquerySuite was disabled because it hit this issue, which was caught by SPARK-26735. We should re-enable that test once this bug is fixed.
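For reference, the query itself is well-formed: each id forms its own single-row group, so max(id) equals id, every group passes the HAVING clause, and the result should simply be 0 through 9 in order. This can be sanity-checked against another SQL engine; a minimal sketch using Python's sqlite3, where a table t stands in for Spark's range(10) (the table name and setup here are illustrative, not part of the report):

```python
import sqlite3

# Stand-in for Spark's range(10): a table t(id) holding 0..9.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(10)])

# Same shape as the failing Spark query: aggregate in SELECT,
# HAVING, and ORDER BY, grouped by id.
rows = conn.execute(
    "SELECT MAX(id) FROM t "
    "GROUP BY id HAVING COUNT(1) >= 1 ORDER BY MAX(id)"
).fetchall()

# Each id is its own group, so the result is just 0..9 in order.
print([r[0] for r in rows])  # → [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

This shows the failure is purely a planning bug, not a problem with the query's semantics.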