Spark 3.4: Action to compute table stats#11106
Merged
szehon-ho merged 2 commits into apache:main on Sep 13, 2024
singhpk234 approved these changes Sep 11, 2024
huaxingao reviewed Sep 11, 2024
spark/v3.3/build.gradle
    implementation project(':iceberg-parquet')
    implementation project(':iceberg-arrow')
    implementation("org.scala-lang.modules:scala-collection-compat_${scalaVersion}:${libs.versions.scala.collection.compat.get()}")
    implementation("org.apache.datasketches:datasketches-java:${libs.versions.datasketches.get()}")
Contributor
Do we need to change v3.3 build.gradle?
dramaticlly reviewed Sep 11, 2024
Comment on lines +76 to +82
    return spark
        .read()
        .format("iceberg")
        .option(SparkReadOptions.SNAPSHOT_ID, snapshot.snapshotId())
        .load(table.name())
        .select(toAggColumns(colNames))
        .first();
Contributor
Do we need to backport #10984 to Spark 3.4 as well, per Anton's comment in https://github.com/apache/iceberg/pull/10288/files#r1726000959? Happy to help.
dramaticlly approved these changes Sep 11, 2024
Contributor
dramaticlly left a comment
LGTM. It looks like CI ran into a transient test failure.
dramaticlly added a commit to dramaticlly/iceberg that referenced this pull request Sep 11, 2024
Backport of apache#10984; tests can be backported together with apache#11106
szehon-ho approved these changes Sep 11, 2024
Member
Merged, thanks @karuppayya and all for the additional review.
tedyu reviewed Oct 3, 2024
        .option(SparkReadOptions.SNAPSHOT_ID, snapshot.snapshotId())
        .load(table.name())
        .select(toAggColumns(colNames))
        .first();
Should we consider calling .cache() before .first()?
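For context on the question above: in Spark, `.cache()` is lazy and only marks a Dataset for reuse by later actions, so it would probably not help a chain that runs a single action such as `.first()`. The following is a minimal plain-Java analogy, not Spark or Iceberg code. The class `CacheBeforeFirstSketch` and its helpers `memoize` and `evaluations` are hypothetical names introduced only for illustration, and the counter stands in for the table scan and aggregation.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Plain-Java analogy (no Spark involved): memoizing a lazy computation only pays off on reuse. */
public class CacheBeforeFirstSketch {

  /** Wraps a supplier so its result is computed at most once, loosely like a cached dataset. */
  static <T> Supplier<T> memoize(Supplier<T> delegate) {
    return new Supplier<T>() {
      private boolean computed = false;
      private T value;

      @Override
      public synchronized T get() {
        if (!computed) {
          value = delegate.get();
          computed = true;
        }
        return value;
      }
    };
  }

  /** Counts how often the stand-in "scan and aggregate" step runs for a given number of reads. */
  static int evaluations(boolean cached, int reads) {
    AtomicInteger scans = new AtomicInteger();
    Supplier<Long> aggregate = () -> {
      scans.incrementAndGet(); // stands in for the table scan plus aggregation
      return 42L;              // placeholder result, not a real statistic
    };
    Supplier<Long> source = cached ? memoize(aggregate) : aggregate;
    for (int i = 0; i < reads; i++) {
      source.get();
    }
    return scans.get();
  }

  public static void main(String[] args) {
    System.out.println("one read, uncached:    " + evaluations(false, 1));
    System.out.println("one read, cached:      " + evaluations(true, 1));
    System.out.println("three reads, uncached: " + evaluations(false, 3));
    System.out.println("three reads, cached:   " + evaluations(true, 3));
  }
}
```

With a single read, the cached and uncached variants both evaluate the computation once. The memoized variant only saves work when the same result is read repeatedly, which the quoted snippet does not appear to do.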
zachdisc pushed a commit to zachdisc/iceberg that referenced this pull request Dec 23, 2024
parthchandra pushed a commit to parthchandra/iceberg that referenced this pull request Oct 22, 2025
…n) (apache#1343)
* API, Spark 3.5: Action to compute table stats (apache#10288) (cherry picked from commit 2f6e7e6)
* Spark 3.4: Action to compute table stats (apache#11106) (cherry picked from commit 5582b0c)
* Spark 3.4: Add utility to load table state reliably (apache#11115) (cherry picked from commit d5b21d8)
* Cherry-pick data-sketches lib version change from apache@cbe391d#diff-697f70cdd88ba88fe77eebda60c7e143f6ad1286bca75017421e93ad84fb87df
---------
Co-authored-by: Karuppayya <[email protected]>
Co-authored-by: Hongyue/Steve Zhang <[email protected]>
Backport of #10288
cc: @aokolnychyi @szehon-ho