Description
With adaptive execution enabled and spark.sql.adaptive.coalescePartitions.initialPartitionNum set to 1000, a GROUP BY plans its shuffle Exchange with 1000 partitions, but DISTRIBUTE BY still plans with the default 200:

spark.sql("CREATE TABLE spark_31220(id int)")
spark.sql("set spark.sql.adaptive.coalescePartitions.initialPartitionNum=1000")
spark.sql("set spark.sql.adaptive.enabled=true")
scala> spark.sql("SELECT id from spark_31220 GROUP BY id").explain
== Physical Plan ==
AdaptiveSparkPlan(isFinalPlan=false)
+- HashAggregate(keys=[id#5], functions=[])
   +- Exchange hashpartitioning(id#5, 1000), true, [id=#171]
      +- HashAggregate(keys=[id#5], functions=[])
         +- FileScan parquet default.spark_31220[id#5] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/root/opensource/apache-spark/spark-warehouse/spark_31220], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int>
scala> spark.sql("SELECT id from spark_31220 DISTRIBUTE BY id").explain
== Physical Plan ==
AdaptiveSparkPlan(isFinalPlan=false)
+- Exchange hashpartitioning(id#5, 200), false, [id=#179]
   +- FileScan parquet default.spark_31220[id#5] Batched: true, DataFilters: [], Format: Parquet, Location: InMemoryFileIndex[file:/root/opensource/apache-spark/spark-warehouse/spark_31220], PartitionFilters: [], PushedFilters: [], ReadSchema: struct<id:int>
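The linked duplicate SPARK-31841 suggests the same gap exists in the Dataset API, since Dataset.repartition by expression plans the same hashpartitioning Exchange as DISTRIBUTE BY. A minimal spark-shell sketch of that repro (assuming the spark_31220 table created above; not a standalone program, it needs an active SparkSession):

```scala
// Sketch for spark-shell: repartitioning by a key column is expected to
// ignore initialPartitionNum the same way DISTRIBUTE BY does.
spark.sql("set spark.sql.adaptive.enabled=true")
spark.sql("set spark.sql.adaptive.coalescePartitions.initialPartitionNum=1000")

// DataFrame equivalent of "SELECT id FROM spark_31220 DISTRIBUTE BY id"
// ($"id" relies on spark.implicits._, which spark-shell imports by default)
val df = spark.table("spark_31220").repartition($"id")
df.explain()
// Expected: Exchange hashpartitioning(id#..., 200) rather than 1000
```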
Issue Links
- is duplicated by: SPARK-31841 Dataset.repartition leverage adaptive execution (Resolved)
- is related to: SPARK-32056 Repartition by key should support partition coalesce for AQE (Resolved)