Automatically detect S3 regions / redirect correctly #402

@alamb

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

I would like to be able to retrieve data stored in S3 without having to explicitly specify a region (see apache/datafusion#16306 for the DataFusion use case).

Here is an example program.

use object_store::aws::AmazonS3Builder;
use object_store::ObjectStore;
use object_store::path::Path;

#[tokio::main]
async fn main() {
    // Goal is to read a parquet file from S3 without having to specify the region
    // s3://clickhouse-public-datasets/hits_compatible/athena_partitioned/hits_1.parquet
    let path = "hits_compatible/athena_partitioned/hits_1.parquet";

    let store = AmazonS3Builder::new()
        .with_bucket_name("clickhouse-public-datasets")
        .with_skip_signature(true)
        // NOTE the region is not specified. If it is set to `eu-central-1`
        // this example works great, but if it is not set, it fails with:
        // ```
        // Error getting object from S3: Generic S3 error: Error performing  in 120.898541ms -
        // Received redirect without LOCATION, this normally indicates an incorrectly configured region
        // ```
        //.with_region("eu-central-1") 
        .build()
        .expect("Failed to create S3 object store");


    let path = Path::from(path);
    let result = match store.get(&path).await {
        Ok(result) => result,
        Err(e) => {
            eprintln!("Error getting object from S3: {e}");
            return;
        }
    };
    println!("Successfully retrieved object from S3");
    println!("Result meta: {:#?}", result.meta);
}

Describe the solution you'd like
I want the request to succeed without having to explicitly specify the region.

Describe alternatives you've considered

Using curl, you can see that the 301 response from AWS actually includes the correct region in the x-amz-bucket-region header:

< x-amz-bucket-region: eu-central-1

curl -v https://s3.us-east-1.amazonaws.com/clickhouse-public-datasets/hits_compatible/athena_partitioned/hits_1.parquet
...
* using HTTP/1.x
> GET /clickhouse-public-datasets/hits_compatible/athena_partitioned/hits_1.parquet HTTP/1.1
> Host: s3.us-east-1.amazonaws.com
> User-Agent: curl/8.7.1
> Accept: */*
>
* Request completely sent off
< HTTP/1.1 301 Moved Permanently
< x-amz-bucket-region: eu-central-1
< x-amz-request-id: SG3WAD43HD3NADYW
< x-amz-id-2: qKq7zFq+trkB04Ie5CrBIZmYBMS5Lk4gVYH6Pq5NrO1rC+pmTPG8bxelOM3y3vRcweQiS1pBDwU=
< Content-Type: application/xml
< Transfer-Encoding: chunked
< Date: Fri, 06 Jun 2025 14:35:38 GMT
< Server: AmazonS3
<
<?xml version="1.0" encoding="UTF-8"?>
* Connection #0 to host s3.us-east-1.amazonaws.com left intact
<Error><Code>PermanentRedirect</Code><Message>The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.</Message><Endpoint>clickhouse-public-datasets.s3.eu-central-1.amazonaws.com</Endpoint><Bucket>clickhouse-public-datasets</Bucket><RequestId>SG3WAD43HD3NADYW</RequestId><HostId>qKq7zFq+trkB04Ie5CrBIZmYBMS5Lk4gVYH6Pq5NrO1rC+pmTPG8bxelOM3y3vRcweQiS1pBDwU=</HostId></Error>
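To illustrate the idea, here is a minimal sketch of how the region could be recovered from a redirect like the one above. It uses only the standard library and is not how `object_store` currently behaves; the function names `region_from_redirect` and `region_from_endpoint` are hypothetical. It assumes the response headers are available as name/value pairs, and that the `<Endpoint>` element has the `bucket.s3.REGION.amazonaws.com` shape seen in the curl output.

```rust
/// Hypothetical helper (not part of `object_store`): given the headers and body
/// of an S3 redirect response, try to work out the bucket's actual region.
pub fn region_from_redirect(headers: &[(&str, &str)], body: &str) -> Option<String> {
    // Prefer the `x-amz-bucket-region` header. HTTP header names are
    // case-insensitive, so compare without regard to ASCII case.
    let from_header = headers
        .iter()
        .find(|(name, _)| name.eq_ignore_ascii_case("x-amz-bucket-region"))
        .map(|(_, value)| value.trim().to_string())
        .filter(|region| !region.is_empty());

    // Otherwise fall back to the <Endpoint> element of the XML error body.
    from_header.or_else(|| region_from_endpoint(body))
}

/// Fallback: extract REGION from
/// `<Endpoint>bucket.s3.REGION.amazonaws.com</Endpoint>`.
/// Assumption: the endpoint has exactly this shape, as in the curl output.
fn region_from_endpoint(body: &str) -> Option<String> {
    let start = body.find("<Endpoint>")? + "<Endpoint>".len();
    let end = start + body[start..].find("</Endpoint>")?;
    let endpoint = &body[start..end];

    // The region is the DNS label that follows `.s3.`.
    let (_, rest) = endpoint.split_once(".s3.")?;
    let (region, tail) = rest.split_once('.')?;

    // Reject the global endpoint `bucket.s3.amazonaws.com`, which has no
    // region label between `.s3.` and `amazonaws.com`.
    if !region.is_empty() && tail.starts_with("amazonaws.com") {
        Some(region.to_string())
    } else {
        None
    }
}
```

With a region in hand, the client could retry the request against the regional endpoint rather than returning the "Received redirect without LOCATION" error.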

Additional context
