Commit 06c155c

[SPARK-20908][SQL] Cache Manager: Hint should be ignored in plan matching
### What changes were proposed in this pull request?

In Cache manager, the plan matching should ignore Hint.

```Scala
val df1 = spark.range(10).join(broadcast(spark.range(10)))
df1.cache()
spark.range(10).join(spark.range(10)).explain()
```

The output plan of the above query shows that the second query is not using the cached data of the first query.

```
BroadcastNestedLoopJoin BuildRight, Inner
:- *Range (0, 10, step=1, splits=2)
+- BroadcastExchange IdentityBroadcastMode
   +- *Range (0, 10, step=1, splits=2)
```

After the fix, the plan becomes

```
InMemoryTableScan [id#20L, id#23L]
   +- InMemoryRelation [id#20L, id#23L], true, 10000, StorageLevel(disk, memory, deserialized, 1 replicas)
         +- BroadcastNestedLoopJoin BuildRight, Inner
            :- *Range (0, 10, step=1, splits=2)
            +- BroadcastExchange IdentityBroadcastMode
               +- *Range (0, 10, step=1, splits=2)
```

### How was this patch tested?

Added a test.

Author: Xiao Li <[email protected]>

Closes #18131 from gatorsmile/HintCache.
1 parent 3969a80 commit 06c155c

2 files changed, 9 additions and 1 deletion

sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/hints.scala

Lines changed: 2 additions & 0 deletions
@@ -40,6 +40,8 @@ case class ResolvedHint(child: LogicalPlan, hints: HintInfo = HintInfo())

   override def output: Seq[Attribute] = child.output

+  override lazy val canonicalized: LogicalPlan = child.canonicalized
+
   override def computeStats(conf: SQLConf): Statistics = {
     val stats = child.stats(conf)
     stats.copy(hints = hints)
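
The effect of delegating `canonicalized` to the child can be illustrated with a toy model. The sketch below does not use Spark's classes: `Plan`, `Relation`, `Join`, `Hint`, and `HintCanonicalizationDemo` are hypothetical stand-ins, and the comparison rule (equality of canonical forms) is a simplifying assumption about how plan matching behaves, not a copy of Spark's implementation.

```scala
// Toy model only: these types are hypothetical and merely echo the names in the patch.
sealed trait Plan {
  // Normalized form used when comparing plans; a leaf is its own canonical form.
  def canonicalized: Plan = this

  // Assumption for this sketch: two plans match when their canonical forms are equal.
  def sameResult(other: Plan): Boolean = canonicalized == other.canonicalized
}

final case class Relation(name: String) extends Plan

final case class Join(left: Plan, right: Plan) extends Plan {
  override def canonicalized: Plan = Join(left.canonicalized, right.canonicalized)
}

// A hint carries planner advice but does not change the rows produced,
// so its canonical form is that of its child (the idea behind the patch).
final case class Hint(child: Plan, broadcast: Boolean = true) extends Plan {
  override def canonicalized: Plan = child.canonicalized
}

object HintCanonicalizationDemo {
  val hinted: Plan = Join(Relation("t"), Hint(Relation("t")))
  val plain: Plan = Join(Relation("t"), Relation("t"))

  def main(args: Array[String]): Unit = {
    println(hinted == plain)          // false: structural equality still sees the Hint node
    println(hinted.sameResult(plain)) // true: the Hint disappears during canonicalization
  }
}
```

Without the override in `Hint`, the default `canonicalized` would return the hint node itself, and the two joins would no longer compare as equal. This mirrors the cache miss described in the commit message.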

sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/plans/SameResultSuite.scala

Lines changed: 7 additions & 1 deletion
@@ -20,7 +20,7 @@ package org.apache.spark.sql.catalyst.plans
 import org.apache.spark.SparkFunSuite
 import org.apache.spark.sql.catalyst.dsl.expressions._
 import org.apache.spark.sql.catalyst.dsl.plans._
-import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan, Union}
+import org.apache.spark.sql.catalyst.plans.logical.{LocalRelation, LogicalPlan, ResolvedHint, Union}
 import org.apache.spark.sql.catalyst.util._

 /**
@@ -66,4 +66,10 @@ class SameResultSuite extends SparkFunSuite {
     assertSameResult(Union(Seq(testRelation, testRelation2)),
       Union(Seq(testRelation2, testRelation)))
   }
+
+  test("hint") {
+    val df1 = testRelation.join(ResolvedHint(testRelation))
+    val df2 = testRelation.join(testRelation)
+    assertSameResult(df1, df2)
+  }
 }
