Add server and client support for the new generic table baseLocation field #2122

Merged
gh-yzou merged 5 commits into apache:main from gh-yzou:yzou-generic-table-location-support
Jul 19, 2025

Contributor

gh-yzou commented Jul 16, 2025

We have added `baseLocation` to the spec, but the server does not support it yet. This PR adds server support, and also Spark client support, for the new location field.

github-project-automation bot moved this to PRs In Progress in Basic Kanban Board Jul 16, 2025
gh-yzou changed the title from "Server side generic table support for location" to "Add server and client support for the new generic table baseLocation field" Jul 17, 2025
gh-yzou marked this pull request as ready for review July 17, 2025 04:44
Contributor

dimas-b left a comment

I had a quick look - no concerns from my side, but I could not review in depth for an approval :)

gh-yzou force-pushed the yzou-generic-table-location-support branch from 510420f to be11ee0 July 17, 2025 18:57
Contributor Author

gh-yzou commented Jul 17, 2025

@dimas-b Thanks! Could you help take a look sometime?

Contributor

nit: If location and path are both set, it might be worth a debug log explaining that location is taking precedence

Contributor Author

sg! Added a debug log for the case where both are configured
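
The precedence rule discussed in this thread can be sketched roughly as follows. This is not the code from the PR: the class name `TableLocationResolver`, the option keys `location` and `path`, and the use of `java.util.logging` (with `FINE` standing in for a debug level) are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.logging.Logger;

/** Hypothetical helper illustrating the review thread; not the PR's implementation. */
final class TableLocationResolver {
  private static final Logger LOGGER =
      Logger.getLogger(TableLocationResolver.class.getName());

  // Option keys are assumptions taken from the wording of the review comment.
  static final String LOCATION_KEY = "location";
  static final String PATH_KEY = "path";

  /** Returns the effective location, or null when neither option is set. */
  static String resolveLocation(Map<String, String> options) {
    String location = options.get(LOCATION_KEY);
    String path = options.get(PATH_KEY);
    if (location != null && path != null) {
      // Both are configured: explain at debug (FINE) level why path is ignored.
      LOGGER.fine(() -> "Both location and path are set; location takes precedence: " + location);
      return location;
    }
    // At most one of the two is set; fall back to path when location is absent.
    return location != null ? location : path;
  }

  public static void main(String[] args) {
    Map<String, String> options = new HashMap<>();
    options.put(LOCATION_KEY, "s3://bucket/tables/t1");
    options.put(PATH_KEY, "s3://bucket/other/t1");
    System.out.println(resolveLocation(options)); // prints s3://bucket/tables/t1
  }
}
```

Logging only in the branch where both values are present keeps the common single-option case silent, which matches the reviewer's suggestion that the message explain the precedence rather than report every lookup.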

Contributor

Does this assert not work without the if?

Contributor

+1

Contributor Author

It works; I was trying to make it more explicit that it is checking for null. I simplified it to just use the same assertion.
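
The test code under discussion is not shown on this page, so the following is only a guess at the shape of the simplification. It assumes the guarded version branched on a null expected value, and it uses a small stand-in `assertEqual` helper (hypothetical, built on `Objects.equals`) instead of a real test framework so the sketch stays self-contained.

```java
import java.util.Objects;

/** Hypothetical illustration of dropping a null guard around an equality assertion. */
final class NullSafeAssertionSketch {

  /** Minimal stand-in for a test framework's null-safe equality assertion. */
  static void assertEqual(Object expected, Object actual) {
    if (!Objects.equals(expected, actual)) {
      throw new AssertionError("expected <" + expected + "> but was <" + actual + ">");
    }
  }

  /** Guarded form (assumed): the branch only spells out the null case. */
  static void checkWithGuard(String expected, String actual) {
    if (expected == null) {
      assertEqual(null, actual);
    } else {
      assertEqual(expected, actual);
    }
  }

  /** Simplified form: one null-safe assertion covers both branches. */
  static void checkSimplified(String expected, String actual) {
    assertEqual(expected, actual);
  }

  public static void main(String[] args) {
    checkWithGuard(null, null);
    checkSimplified(null, null);
    checkSimplified("s3://bucket/tables/t1", "s3://bucket/tables/t1");
    System.out.println("all checks passed");
  }
}
```

Because the equality check already treats two nulls as equal and a null against a non-null value as a mismatch, both forms accept and reject the same inputs.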

Contributor

ditto

Contributor Author

simplified

Contributor

ditto

Contributor Author

simplified

eric-maynard previously approved these changes Jul 18, 2025
github-project-automation bot moved this from PRs In Progress to Ready to merge in Basic Kanban Board Jul 18, 2025
dimas-b previously approved these changes Jul 18, 2025
Contributor

dimas-b left a comment

This change looks reasonable to me.

I believe tests can be improved as @eric-maynard commented.

gh-yzou dismissed stale reviews from dimas-b and eric-maynard via 026cf8b July 19, 2025 00:57
gh-yzou force-pushed the yzou-generic-table-location-support branch from be11ee0 to 026cf8b July 19, 2025 00:57
Contributor

singhpk234 left a comment

LGTM as well, thanks @gh-yzou!

gh-yzou merged commit b48cfb6 into apache:main Jul 19, 2025
12 checks passed
github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Jul 19, 2025
snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* chore(deps): update dependency mypy to >=1.17, <=1.17.0 (apache#2114)

* Spark 3.5.6 and Iceberg 1.9.1 (apache#1960)

* Spark 3.5.6 and Iceberg 1.9.1

* Cleanup

* Add `pathStyleAccess` to AwsStorageConfigInfo (apache#2012)

* Add `pathStyleAccess` to AwsStorageConfigInfo

This change allows configuring the "path-style" access
mode in S3 clients (both in Polaris Servers and Iceberg
REST Catalog API clients).

This change is applicable both to AWS storage and to
non-AWS S3-compatible storage (apache#1530).

* Add TestFileIOFactory helper (apache#2105)

* Add FileIOFactory.wrapExisting helper

* fix(deps): update dependency gradle.plugin.org.jetbrains.gradle.plugin.idea-ext:gradle-idea-ext to v1.2 (apache#2125)

* fix(deps): update dependency boto3 to v1.39.7 (apache#2124)

* Abstract polaris-runtime-service tests for all persistence implementations (apache#2106)

The NoSQL persistence implementation has to run the Iceberg table & view catalog tests plus the Polaris-specific tests as well. Reusing existing tests is beneficial to avoid a lot of code duplication.

This change moves the actual tests to `Abstract*` classes and refactors the existing tests to extend those. The NoSQL persistence work extends the same `Abstract*` classes but runs with different Quarkus test profiles.

* Add IMPLICIT authentication support to the CLI (apache#2121)

PRs apache#1925 and apache#1912 were merged around the same time. This PR connects the two changes and enables the CLI to accept the IMPLICIT authentication type.

Since Hadoop federated catalogs rely purely on IMPLICIT authentication, the CLI parsing test has been updated to reflect the same.

* feat(helm): Add support for external authentication (apache#2104)

* fix(deps): update dependency org.apache.iceberg:iceberg-bom to v1.9.2 (apache#2126)

* fix(deps): update quarkus platform and group to v3.24.4 (apache#2128)

* fix(deps): update dependency boto3 to v1.39.8 (apache#2129)

* fix(deps): update dependency io.smallrye.config:smallrye-config-core to v3.13.3 (apache#2130)

* Add newIcebergCatalog helper (apache#2134)

creation of `IcebergCatalog` instances was quite redundant as tests
mostly use the same parameters most of the time.

also remove an unused field in 2 other tests.

* Add server and client support for the new generic table `baseLocation` field (apache#2122)

* Use Makefile to simplify setup and commands (apache#2027)

* Use Makefile to simplify setup and commands

* Add targets for minikube state management

* Add podman support and spark plugin build

* Add version target

* Update README.md for Makefile usage and relation to the project

* Fix nit

* Package polaris client as python package (apache#2049)

* Package polaris client as python package

* Package polaris client as python package

* Change owner to spark when copying files from local into Dockerfile

* CI: Address failure from accessing GH API (apache#2132)

CI sometimes fails with this error:
```
* What went wrong:
Execution failed for task ':generatePomFileForMavenPublication'.
> Unable to process url: https://api.github.com/repos/apache/polaris/contributors?per_page=1000
```

The request that sometimes fails fetches the list of contributors to be published in the "root" POM. Unauthenticated GH API requests are limited to 60 requests per hour per source IP. Authenticated requests have a much higher rate limit. We do have a GitHub token available in every CI run, which can be used in GH API requests. This change adds the `Authorization` header to the failing GH API request to leverage the higher rate limit and let CI not fail (that often).

* fix(deps): update dependency com.nimbusds:nimbus-jose-jwt to v10.4 (apache#2139)

* fix(deps): update dependency com.diffplug.spotless:spotless-plugin-gradle to v7.2.0 (apache#2142)

* fix(deps): update dependency software.amazon.awssdk:bom to v2.32.4 (apache#2146)

* fix(deps): update dependency org.xerial.snappy:snappy-java to v1.1.10.8 (apache#2138)

* fix(deps): update dependency org.junit:junit-bom to v5.13.4 (apache#2147)

* fix(deps): update dependency boto3 to v1.39.9 (apache#2137)

* fix(deps): update dependency com.fasterxml.jackson:jackson-bom to v2.19.2 (apache#2136)

* Python client: add support for endpoint, sts-endpoint, path-style-access (apache#2127)

This change adds support for endpoint, sts-endpoint, path-style-access to the Polaris Python client.

Amends apache#1913 and apache#2012

* Remove PolarisEntityManager.getCredentialCache (apache#2133)

`PolarisEntityManager` itself is not using the `StorageCredentialCache` but just hands it out via `getCredentialCache`.
the only caller of `getCredentialCache` is `FileIOUtil.refreshAccessConfig`, which in turn is only called by `DefaultFileIOFactory` and `IcebergCatalog`.

note that in a follow-up we will likely be able to remove `PolarisEntityManager` usage completely from `IcebergCatalog`.

additional cleanups:
- use `StorageCredentialCache` injection in tests (but we need to invalidate all entries on test start)
- remove unused `UserSecretsManagerFactory` from `PolarisCallContextCatalogFactory`

* chore(deps): update registry.access.redhat.com/ubi9/openjdk-21-runtime docker tag to v1.22-1.1752676419 (apache#2150)

* fix(deps): update dependency com.diffplug.spotless:spotless-plugin-gradle to v7.2.1 (apache#2152)

* fix(deps): update dependency boto3 to v1.39.10 (apache#2151)

* chore: fix class reference in the javadoc of TableLikeEntity (apache#2157)

* fix(deps): update dependency commons-codec:commons-codec to v1.19.0 (apache#2160)

* fix(deps): update dependency boto3 to v1.39.11 (apache#2159)

* Last merged commit 395459f

---------

Co-authored-by: Mend Renovate
Co-authored-by: Yong Zheng
Co-authored-by: Dmitri Bourlatchkov
Co-authored-by: Christopher Lambert
Co-authored-by: Pooja Nilangekar
Co-authored-by: Alexandre Dutra
Co-authored-by: Yun Zou
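
The CI fix in the commit message above (apache#2132) sends the `Authorization` header with the contributors request. A minimal sketch of that idea, assuming Java's built-in `java.net.http` API rather than the project's actual build logic, might look like the following. The class name `ContributorsRequestSketch`, the `Bearer` scheme, the `Accept` header, and the `GITHUB_TOKEN` environment variable name are assumptions; only the URL comes from the error message quoted in the commit.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical sketch; the real change lives in the project's build scripts. */
final class ContributorsRequestSketch {
  // URL copied from the error message in the commit description.
  static final String CONTRIBUTORS_URL =
      "https://api.github.com/repos/apache/polaris/contributors?per_page=1000";

  /** Builds the GET request, adding an Authorization header only when a token is available. */
  static HttpRequest build(String token) {
    HttpRequest.Builder builder =
        HttpRequest.newBuilder(URI.create(CONTRIBUTORS_URL))
            .header("Accept", "application/vnd.github+json")
            .GET();
    if (token != null && !token.isBlank()) {
      // Authenticated requests are counted against the higher rate limit.
      builder.header("Authorization", "Bearer " + token);
    }
    return builder.build();
  }

  public static void main(String[] args) {
    // GITHUB_TOKEN is an assumed variable name; the request is built but never sent.
    HttpRequest request = build(System.getenv("GITHUB_TOKEN"));
    boolean authenticated = request.headers().firstValue("Authorization").isPresent();
    System.out.println(authenticated ? "authenticated request" : "anonymous request");
  }
}
```

Falling back to an anonymous request when no token is present keeps local builds working, at the cost of the lower unauthenticated limit.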