Core: Fix query failure when using projection on top of partitions metadata table by singhpk234 · Pull Request #4720 · apache/iceberg

singhpk234 · 2022-05-07T08:13:19Z

As I understand it, planFiles() for the partitions metadata table should pass scan.tableSchema() rather than scan.schema() to transformSpec.

Because scan.schema() was used, the query from the reported ticket failed with:

22/05/06 16:12:25 ERROR SparkSQLDriver: Failed in [select file_count from spark_catalog.monitoring.test.partitions]
java.lang.IllegalArgumentException: Cannot find source column: partition.date

On master, the same query fails with a different message:

Cannot find source column: 1000
java.lang.NullPointerException: Cannot find source column: 1000

Testing done:
Added a unit test that fails without this change.

cc @szehon-ho @RussellSpitzer

RussellSpitzer · 2022-05-07T12:12:12Z

core/src/main/java/org/apache/iceberg/PartitionsTable.java

    LoadingCache<Integer, ManifestEvaluator> evalCache = Caffeine.newBuilder().build(specId -> {
      PartitionSpec spec = table.specs().get(specId);
-      PartitionSpec transformedSpec = transformSpec(scan.schema(), spec);
+      PartitionSpec transformedSpec = transformSpec(scan.tableSchema(), spec);


So in this case "schema" is the projected schema, which is missing all the partition columns, while "tableSchema" is the full, unprojected schema that still has the columns the partition spec refers to. LGTM
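To illustrate the point above, here is a minimal sketch of why resolving a partition source column against a projected schema fails while the full schema succeeds. It is a toy model only: plain maps stand in for Iceberg's Schema, and the class name ProjectionSketch, the helper methods, the field ids, and the column names are all hypothetical, not the actual PartitionsTable code.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a "schema" is a map of field id -> column name.
// Nothing here is Iceberg API; all names and ids are made up for illustration.
public class ProjectionSketch {

  // Keep only the selected columns, as a projection such as "select file_count" would.
  static Map<Integer, String> project(Map<Integer, String> full, List<String> selected) {
    Map<Integer, String> projected = new LinkedHashMap<>();
    for (Map.Entry<Integer, String> field : full.entrySet()) {
      if (selected.contains(field.getValue())) {
        projected.put(field.getKey(), field.getValue());
      }
    }
    return projected;
  }

  // Resolve a partition field's source column id against a schema; fail if it is absent.
  static String resolveSource(Map<Integer, String> schema, int sourceId) {
    String name = schema.get(sourceId);
    if (name == null) {
      throw new IllegalArgumentException("Cannot find source column: " + sourceId);
    }
    return name;
  }

  public static void main(String[] args) {
    Map<Integer, String> tableSchema = new LinkedHashMap<>();
    tableSchema.put(1000, "partition.date"); // hypothetical partition source column
    tableSchema.put(2, "record_count");
    tableSchema.put(3, "file_count");

    // The projection drops the partition column entirely.
    Map<Integer, String> projected = project(tableSchema, Arrays.asList("file_count"));

    // The full schema resolves the source column.
    System.out.println(resolveSource(tableSchema, 1000));

    // The projected schema cannot resolve it and throws.
    try {
      resolveSource(projected, 1000);
    } catch (IllegalArgumentException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Running main prints the resolved column name for the full schema and then the "Cannot find source column" message for the projected one, which mirrors the shape of the failure reported above under these simplified assumptions.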

RussellSpitzer · 2022-05-07T12:17:21Z

...2/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMetadataTables.java

    Assert.assertEquals("Metadata table should return one data file", 1, actualDataFiles.size());
    TestHelpers.assertEqualsSafe(filesTableSchema.asStruct(), expectedDataFiles.get(0), actualDataFiles.get(0));

+    List<Row> actualPartitionsWithProjection =


While I like this test, I believe we should also add one to https://github.com/apache/iceberg/blob/4ae2002bd46bf8e1c20db03cffc6319237e4d74a/core/src/test/java/org/apache/iceberg/TestMetadataTableScans.java to make sure we have one in the core module.

@singhpk234 ^

+1, added a unit test for the same. Thanks @RussellSpitzer!

Thanks! Otherwise we wouldn't have an automated test running on this with "Core" only changes.

abmo-x · 2022-05-07T21:22:45Z

Thanks @singhpk234 for the quick turnaround, appreciate it!

RussellSpitzer · 2022-05-09T17:12:32Z

Thanks @singhpk234 for the PR and @abmo-x for review!

szehon-ho · 2022-05-09T17:36:01Z

Arg, sorry about the initial bug, thanks @singhpk234 for the fix and all for review

github-actions bot added core spark labels May 7, 2022

Fix query failure when using projection on top of partitions table

3fd3889

singhpk234 force-pushed the fix/ICEBERG-4718 branch from 6f74a52 to 3fd3889 Compare May 7, 2022 08:24

singhpk234 changed the title ~~[Core] : Fix query failure when using projection on top of partitions table~~ Core: Fix query failure when using projection on top of partitions table May 7, 2022

singhpk234 mentioned this pull request May 7, 2022

Iceberg 0.13 with Spark 3.2 - list partitions query always need partition.date and partition.hour columns in the result #4718

Closed

singhpk234 changed the title ~~Core: Fix query failure when using projection on top of partitions table~~ Core: Fix query failure when using projection on top of partitions metadata table May 7, 2022

RussellSpitzer reviewed May 7, 2022

RussellSpitzer approved these changes May 7, 2022

RussellSpitzer reviewed May 7, 2022

Add ut to check plan files for partition table

42ad42f

abmo-x approved these changes May 9, 2022

RussellSpitzer merged commit 0390e17 into apache:master May 9, 2022

szehon-ho added a commit to szehon-ho/iceberg that referenced this pull request May 28, 2022

[0.13] Core: Fix query failure when using projection on top of partit…

ead48e0

…ions metadata table (apache#4720)

szehon-ho mentioned this pull request May 28, 2022

[0.13] Core: Fix query failure when using projection on top of partitions metadata table (#4720) #4890

Merged

rdblue pushed a commit that referenced this pull request May 29, 2022

Core: Fix query failure when using projection on top of partitions me…

487b428

…tadata table (#4720) (#4890)

RussellSpitzer pushed a commit that referenced this pull request Jun 1, 2022

Core: Fix query failure when using projection on top of partitions me…

73f91a3

…tadata table (#4720) (#4890)

sunchao pushed a commit to sunchao/iceberg that referenced this pull request May 9, 2023

Core: Fix query failure when using projection on top of partitions me…

ef99449

…tadata table (apache#4720) (apache#619) Co-authored-by: Prashant Singh <[email protected]>

RussellSpitzer merged 2 commits into apache:master from singhpk234:fix/ICEBERG-4718
