AWS: S3SignerServlet should strip out more request headers for caching #15417

@steveloughran

Description

Feature Request / Improvement

Recent AWS SDK traces show more headers in the request which can be ignored for signing, specifically User-Agent and Referer. The User-Agent is often built from the JVM and library versions; the Referer is used by s3a for its audit tracing (mapping requests to operations and principals).

Not signing them allows cached signed requests to be reused more often, which reduces the number of signings needed.

Currently the exclusion list is restricted to range and some SDK-internal headers:

"range", "x-amz-date", "amz-sdk-invocation-id", "amz-sdk-retry"

I'm not sure whether x-amz-content-sha256 can or should be left out of signing; it is safest to leave it in.
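A minimal sketch of the proposed filtering, to make the idea concrete. This is not the actual S3SignerServlet code: the class name `SigningHeaderFilter`, the method `headersToSign`, and the two constants are hypothetical, and the only details taken from this issue are the four currently excluded header names and the two proposed additions. x-amz-content-sha256 is deliberately left in the signed set.

```java
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical helper; illustrates the proposal, not the real servlet code. */
public class SigningHeaderFilter {

  // Exclusion list as quoted in this issue.
  public static final Set<String> CURRENT_EXCLUSIONS =
      Set.of("range", "x-amz-date", "amz-sdk-invocation-id", "amz-sdk-retry");

  // Proposed additions: both vary per client or per operation.
  public static final Set<String> PROPOSED_EXCLUSIONS = Set.of("user-agent", "referer");

  /**
   * Returns the headers that would still be signed, keyed by lower-cased name.
   * Header names are compared case-insensitively; if the input contains the
   * same name in two different cases, the last one seen wins in this sketch.
   */
  public static Map<String, List<String>> headersToSign(
      Map<String, List<String>> requestHeaders) {
    Map<String, List<String>> result = new TreeMap<>();
    for (Map.Entry<String, List<String>> entry : requestHeaders.entrySet()) {
      String name = entry.getKey().toLowerCase(Locale.ROOT);
      if (!CURRENT_EXCLUSIONS.contains(name) && !PROPOSED_EXCLUSIONS.contains(name)) {
        result.put(name, List.copyOf(entry.getValue()));
      }
    }
    return result;
  }
}
```

With this filter, two requests that differ only in User-Agent, Referer, or X-Amz-Date produce the same map, so anything keyed on the filtered headers could be shared between them.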

"HEAD /hadoop HTTP/1.1[\r][\n]"
2026-02-23 14:32:34,107 [main] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-0 >> "Host: stevel-london.s3.eu-west-2.amazonaws.com[\r][\n]"
2026-02-23 14:32:34,110 [main] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-0 >> "amz-sdk-invocation-id: 16d6a005-24fa-25cf-13b0-ab764b68bf08[\r][\n]"
2026-02-23 14:32:34,111 [main] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-0 >> "amz-sdk-request: attempt=1; max=3[\r][\n]"
2026-02-23 14:32:34,111 [main] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-0 >> "Authorization: AWS4-HMAC-SHA256 Credential=<CUT>/20260223/eu-west-2/s3/aws4_request, SignedHeaders=amz-sdk-invocation-id;amz-sdk-request;host;referer;x-amz-content-sha256;x-amz-date, Signature=<CUT>[\r][\n]"
2026-02-23 14:32:34,111 [main] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-0 >> "Referer: https://audit.example.org/hadoop/1/op_get_file_status/916d5993-4525-4cf3-b5b1-679d08f27734-00000005/?op=op_get_file_status&p1=hadoop&pr=stevel&ps=fdefc55c-8504-42ee-bcfc-738033ab93b4&cm=EtagCommand&id=916d5993-4525-4cf3-b5b1-679d08f27734-00000005&t0=1&fs=916d5993-4525-4cf3-b5b1-679d08f27734&t1=1&ts=1771857153489[\r][\n]"
2026-02-23 14:32:34,111 [main] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-0 >> "User-Agent: Hadoop 3.4.3 aws-sdk-java/2.35.4 md/io#sync md/http#Apache ua/2.1 api/S3#2.35.x os/Mac_OS_X#26.3 lang/java#17.0.17 md/OpenJDK_64-Bit_Server_VM#17.0.17+10-LTS md/vendor#Amazon.com_Inc. md/en_GB m/F,G hll/cross-region[\r][\n]"
2026-02-23 14:32:34,111 [main] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-0 >> "x-amz-content-sha256: UNSIGNED-PAYLOAD[\r][\n]"
2026-02-23 14:32:34,111 [main] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-0 >> "X-Amz-Date: 20260223T143233Z[\r][\n]"
2026-02-23 14:32:34,111 [main] DEBUG http.wire (Wire.java:wire(73)) - http-outgoing-0 >> "Connection: Keep-Alive[\r][\n]"

Query engine

None

Willingness to contribute

  • I can contribute this improvement/feature independently
  • I would be willing to contribute this improvement/feature with guidance from the Iceberg community
  • I cannot contribute this improvement/feature at this time
