Skip to content

Add support for S3 request signing#2280

Closed
adutra wants to merge 4 commits intoapache:mainfrom
adutra:request-signing
Closed

Add support for S3 request signing#2280
adutra wants to merge 4 commits intoapache:mainfrom
adutra:request-signing

Conversation

@adutra
Copy link
Contributor

@adutra adutra commented Aug 6, 2025

@singhpk234
Copy link
Contributor

Thank you so much for this @adutra !

when ready please let us know (would be really helpful if you can share some description to walk us through your thought process), happy to help with reviews :) !

Copy link
Member

@snazy snazy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like good 1st step.

As all signing requests will have to access the backend for every little data file chunk being accessed by clients, this implementation will cause quite a slow experience for users. Until we don't have a faster implementation (signed access-rules), I think we should label this feature as "experimental".

.method(method)
.headers(signingRequest.headers());

// FIXME is this correct?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope.
See ListObjects + ListObjectsV2 REST spec.

Be careful: Not all GET /{bucket/ requests are list-objects requests!
The "v1" list-objects has no required parameter, so you cannot just check for the presence of a req param. It could also be a get-bucket-cors or get-bucket-encryption or ...-policy or ...-website or ......
Nice, no?

Maybe it's okay to assume that every client uses the v2 api, which has a unique, required request param.

@adutra adutra force-pushed the request-signing branch 14 times, most recently from 6bf40a9 to 8d1f354 Compare August 8, 2025 13:08
@adutra adutra marked this pull request as ready for review August 8, 2025 13:09
@adutra adutra force-pushed the request-signing branch 2 times, most recently from df10af7 to 6a26337 Compare August 8, 2025 14:14
@adutra
Copy link
Contributor Author

adutra commented Aug 8, 2025

Ready for review. The test failures are unrelated to this change.

\cc @singhpk234 @snazy @dimas-b @metadaddy

mapOf(
"IcebergErrorResponse" to "org.apache.iceberg.rest.responses.ErrorResponse",
"S3SignRequest" to "org.apache.polaris.service.aws.sign.model.PolarisS3SignRequest",
"SignS3Request200Response" to
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI The name of the logical response type is a bit weird because it is inlined, instead of being a $ref in the OpanAPI spec.

private String currentCatalogName;
private Map<String, String> restCatalogConfig;
private URI externalCatalogBase;
private URI externalCatalogBaseLocation;
Copy link
Contributor Author

@adutra adutra Aug 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI I refactored this class a bit, notably to facilitate subclassing, but there is no functional change in this class.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor request here, but you could pull this and the new IntegerationTest Helpers out into a separate PR and reduce the size of this PR by a bit. I think that would also be a very fast review

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup good idea!

#2384

}

dependencies {
implementation(project(":polaris-core"))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not directly referenced in this module.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is another change I would try to pull out to another PR, it's not related to this changeset correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct:

#2385

}
}

private static boolean isStaticFacade(CatalogEntity catalog) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Refactored to become a method of CatalogEntity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! It definitely helps to keep this sort of refactor out of the larger functional change

* <p>The returned provider is not meant to be vended directly to clients, but rather used with
* STS, unless credential subscoping is disabled.
*/
default AwsCredentialsProvider awsSystemCredentials() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to awsSystemCredentials because these are not "just" credentials for STS, they are server credentials.

prefix,
catalog -> {
S3SignResponse response = catalog.signS3Request(s3SignRequest, tableIdentifier);
return Response.status(Response.Status.OK).entity(response).build();
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory the response should include Cache-Control headers. This is left for a later improvement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... but that would require clients to respect those 🤷

if (baseCatalog.tableExists(tableIdentifier)) {

// If the table exists, get allowed locations from the table metadata
Table table = baseCatalog.loadTable(tableIdentifier);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for each SIGN request a loadTable call is super expensive, if someone enables this in prod by accident for a table million of files the service might become unresponsive ?
[suggestion / optional to address] wondering for now if we should add a prod readiness check to keep this disabled for prod ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for each SIGN request a loadTable call is super expensive

Indeed, that's where M2 will help because it will save us this loadTable call.

I will add a FIXME for now.

[suggestion / optional to address] wondering for now if we should add a prod readiness check to keep this disabled for prod ?

Good idea, will do 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update on this topic: I was able to remove the loadTable call, replaced with a direct check of the PolarisEntity properties. That's going to be faster imho.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we store the 3 properties in the entity properties asking because
i only see the StorageUtil#getLocationsUsedByTable passing TableMetada#properties and also not seeing it actually persisted in StorageUtil, am i missing something ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, which 3 properties? I'm not sure I understand your question 😅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

iceberg properties location, write.data.path, write.metadata.path, which is used for credential vending, I am not sure we persist them Entity, as the function StorageUtil#getLocationsUsedByTable has always worked in TableMetadata and not on Entity properties

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah now I get it! Good question! I think there is a problem indeed.

For the base location, we're good, it is stored in the properties map:

return getPropertiesAsMap().get(PolarisEntityConstants.ENTITY_BASE_LOCATION);

But afaict write.data.path and write.metadata.path are not stored in the entity 😞

I see a few options:

  1. We could store write.data.path and write.metadata.path in the entity, but that will be a problem for catalogs created before the change.
  2. We could go back to baseCatalog.loadTable(tableIdentifier) but that would be very slow.
  3. We could just add a FIXME here + create a GitHub issue and wait for a better solution with M2.

Wdyt?

I would go with 3) since we know that this code path will be improved, but I'm fine with 2) as well.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would personally prefer 2 over 3 as IMHO 3 is not quite correct ?

one thing i was thinking is every sign request will make a corresponding loadTable to get signer configs, should be backfill with an updateEntity if these props are missing i.e during read, wdyt ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the late reply!

I went back to option 2 : baseCatalog.loadTable(tableIdentifier).

one thing i was thinking is every sign request will make a corresponding loadTable to get signer configs, should be backfill with an updateEntity if these props are missing i.e during read, wdyt ?

That's an interesting idea, however, it would still be a temporary measure until we get faster signings. So all in all, I'm not sure if it's worth investing on this change.

Wdyt?

}

private Set<PolarisStorageActions> getStorageActions(PolarisS3SignRequest s3SignRequest) {
// TODO M2: better handling of DELETE and LIST
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[for my understanding] can you please elaborate this case more ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By just looking at the request URI, it's not easy to know if the request is a write or a delete, or if it is a read or a list.

For example: the ListObjects operation is conceptually a LIST operation, and GetObject is conceptually a READ. But both requests use the GET method, so it's not easy to disambiguate.

Another example: the DeleteObject operation uses the DELETE method, but the DeleteObjects operation uses... the POST method 🤷‍♂️

I will add some comments to clarify.

@adutra
Copy link
Contributor Author

adutra commented Nov 19, 2025

Heads up: since this PR creates constant merge conflicts, I squashed the older commits to facilitate conflict resolution.

@binarycat0
Copy link
Contributor

Hello everyone 👋 I would like to propose my help in implementing M2 (from proposed milestones).
As I see M2 can be split into 3 independent features: locations-check, nessir-style-signed-params, support-proxies.
I already have a solution for check-locations and ready to create a follow-up PR after this will be merged.

Could we agree on a timeline to merge?

snazy added a commit to snazy/polaris that referenced this pull request Nov 20, 2025
* Refactor Authenticator and PolarisPrincipal (apache#2307)

The main goal of this change is to facilitate future integration of federated principals:

- `AuthenticatedPolarisPrincipal` becomes an interface `PolarisPrincipal`, as the original class leaks implementation details (references to `PrincipalEntity` and thus to the storage layer). The new interface does not reference the storage layer. This is one step further towards easy pluggability of authentication in Polaris.

- The `Authenticator.authenticate()` method does not return an `Optional` anymore, as this was ambiguous (returning `Optional.empty()` vs throwing `NotAuthorizedException`).

- Also the `Authenticator` interface is not generic anymore. This was an artifact of times when there were two kinds of `Authenticators` in Polaris (one for internal auth, the other for external) and is not necessary anymore.

* Add PolarisDiagnostics field to TransactionalMetaStoreManagerImpl (apache#2361)

the ultimate goal is removing the PolarisCallContext parameter from every
PolarisMetaStoreManager interface method, so we make steps towards reducing
its usage first.

* Support HMS Federation (apache#2355)

Supports federating to HiveCatalog using the Iceberg REST library. 
All hive dependencies are added in an independent module, i.e., `polaris-extensions-federation-hive` and can be removed/converted to a compile time flag if necessary. 
Similar to HadoopCatalog, HMS federation support is currently restricted to `IMPLICIT` auth. The underlying authentication can be any form that Hive supports, however Polaris will not store and manage any of these credentials. Again, similar to HadoopCatalog, this version supports federating to a single Hive instance. 

This PR relies on Polaris discovering the `hive-site.xml` file to get the configuration options from the classpath (including `HADOOP_CONF_DIR`).

The spec change has been discussed in the [dev mailing list](https://lists.apache.org/thread/5qktjv6rzd8pghcl6f4oohko798o2p2g), followed by a discussion in the Polaris community sync on Aug 7, 2025. 

Testing: 
Modified the regression test to locally test that Hive federation works as expected. The next step would be to add a regression test once the change is baked into the Polaris docker image (for CI builds).


This PR primarily builds on apache#1305 and apache#1466. Thank you @dennishuo and @eric-maynard for helping out with this!

* Add PolarisDiagnostics field to TransactionWorkspaceMetaStoreManager (apache#2359)

the ultimate goal is removing the `PolarisCallContext` parameter from every
`PolarisMetaStoreManager` interface method, so we make steps towards
reducing its usage first.

* Rat-ignore user-settings for hugo-run-in-docker (apache#2376)

* Modularize generic table federation (apache#2379)

In apache#2369 Iceberg table federation was refactored around the new `IcebergRESTExternalCatalogFactory` type based on discussion in the community sync. This has unblocked the ability to federate to more non-Iceberg catalogs, such as in apache#2355. This PR refactors generic table federation to go through the same mechanism. After this, we can go through and implement generic table federation for the existing `IcebergRESTExternalCatalogFactory` implementations.

* Update community meeting dates (apache#2382)

* Reduce getRealmConfig calls (apache#2337)

Classes with a `CallContext` field should call `getRealmConfig` once and store it as a field as well.
The idea is that long term we would want to stop relying on the `CallContext` itself but instead inject its individual items. Thus we also add `RealmConfig` to `TestServices`.

* Python client: make S3 role-ARN optional and add missing endpoint-internal property (apache#2339)

* fix(deps): update dependency io.prometheus:prometheus-metrics-exporter-servlet-jakarta to v1.4.1 (apache#2377)

* chore(deps): bump s3mock from 3.11.0 to 4.7.0 (apache#2375)

Updates S3Mock testcontainer dependency from 3.11.0 to 4.7.0 and refactors usage into a centralized wrapper class in runtime/test-common.

Changes

    Upgraded S3Mock testcontainer to 4.7.0
    Created S3Mock wrapper class for consistent configuration
    Consolidated S3 config properties generation
    Updated integration tests to use new wrapper

No functional changes to test behavior.

* Nit: extract getResolvedCatalogEntity method in IcebergCatalogHandler (apache#2387)

* Nit: remove transitive dependencies from runtime/server/build.gradle.kts (apache#2385)

* Nit: add methods isExternal and isStaticFacade to CatalogEntity (apache#2386)

* Minor refactor of integration test classes (apache#2384)

This change promotes `CatalogConfig` and `RestCatalogConfig` to top-level, public annotations and introduces a few "hooks" in `PolarisRestCatalogIntegrationBase` that can be overridden by subclasses.

This change is a preparatory work for apache#2280 (S3 remote signing).

* Remove BaseMetaStoreManager.serializeProperties (apache#2374)

similar to 7af85be we should prefer
the existing helper methods on the entity instead

* fix: minor corrections of documentation (apache#2397)

- fixed dead link to catalog definition in Iceberg docs on Entities page
- removed single quotes from credential parameter in the cmdline example for connecting a local spark-sql: env variables need to be resolved in cmdline, they will not be resolved by spark-sql itself.

* chore(deps): update azure/setup-helm action to v4.3.1 (apache#2402)

* Add 1.0.1 release to the website (apache#2400)

* Add PolarisDiagnostics field to AbstractTransactionalPersistence (apache#2372)

The ultimate goal is removing the `PolarisCallContext` parameter from every `PolarisMetaStoreManager` interface method, so we make steps towards reducing its usage first.

* NoSQL: javadoc nit

* Last merged commit fcd4777

---------

Co-authored-by: Alexandre Dutra <[email protected]>
Co-authored-by: Christopher Lambert <[email protected]>
Co-authored-by: Pooja Nilangekar <[email protected]>
Co-authored-by: Eric Maynard <[email protected]>
Co-authored-by: JB Onofré <[email protected]>
Co-authored-by: Mend Renovate <[email protected]>
Co-authored-by: Artur Rakhmatulin <[email protected]>
Co-authored-by: olsoloviov <[email protected]>
@adutra adutra force-pushed the request-signing branch 2 times, most recently from f3c8c34 to 2af5104 Compare December 12, 2025 17:18
@adutra
Copy link
Contributor Author

adutra commented Feb 2, 2026

A few updates on remote signing:

The community decided to push some changes down to the Iceberg code, which resulted in two PRs:

I have been working actively on a new version that incorporates the above changes, and also provides the full feature, including the "M2" topics:

  • Reverse proxy support
  • Encrypted signing token
  • Full authz checks including table locations

I think it makes sense to close this PR and open a new one later, not only because it is stale, but because the authorization model will change significantly (it will go from a model that checks on every signing request to a model that checks authz only once at load table).

I'm confident we could include the Iceberg changes in the upcoming Iceberg 1.11 release, then have remote signing released (as beta of course) for Polaris 1.5. But this requires upgrading Polaris to Java 17.

\cc @singhpk234

@adutra adutra closed this Feb 2, 2026
@github-project-automation github-project-automation bot moved this from Ready to merge to Done in Basic Kanban Board Feb 2, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE REQUEST] On-Premise S3 & Remote Signing

7 participants