bug: object_store path will be percent-encoded twice #6129

@Colerar

Describe the bug

The object-store-opendal crate handles the object-store Path type incorrectly. A Path is percent-encoded, so location.as_ref() yields a percent-encoded string. The following code passes that string straight to opendal, which expects a raw (unencoded) path:

async fn get_opts(
    &self,
    location: &Path,
    options: GetOptions,
) -> object_store::Result<GetResult> {
    let meta = {
        let mut s = self.inner.stat_with(location.as_ref());
        if let Some(version) = &options.version {
            s = s.version(version.as_str())
        }
        if let Some(if_match) = &options.if_match {
            s = s.if_match(if_match.as_str());
        }
        if let Some(if_none_match) = &options.if_none_match {
            s = s.if_none_match(if_none_match.as_str());
        }
        if let Some(if_modified_since) = options.if_modified_since {
            s = s.if_modified_since(if_modified_since);
        }
        if let Some(if_unmodified_since) = options.if_unmodified_since {
            s = s.if_unmodified_since(if_unmodified_since);
        }
        s.into_send()
            .await
            .map_err(|err| format_object_store_error(err, location.as_ref()))?
    };

As a result, opendal percent-encodes the already-encoded path a second time when it builds the request URL:

pub fn http_head_request(&self, path: &str, args: &OpStat) -> Result<Request<Buffer>> {
    let p = build_rooted_abs_path(&self.root, path);
    let url = format!("{}{}", self.endpoint, percent_encode_path(&p));
    let mut req = Request::head(&url);
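
To make the effect concrete, here is a minimal, std-only sketch of what happens when an already-encoded segment is encoded again. The `encode_path` helper is a hypothetical stand-in written for this illustration. It is not opendal's actual `percent_encode_path`, and it assumes the encoder escapes `%` like any other reserved byte:

```rust
// Hypothetical stand-in for a path encoder: keeps unreserved characters
// and '/', escapes every other byte as %XX. Not opendal's real code.
fn encode_path(path: &str) -> String {
    let mut out = String::with_capacity(path.len());
    for &b in path.as_bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' | b'/' => {
                out.push(b as char)
            }
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

fn main() {
    // The string handed over by the object_store Path: already percent-encoded.
    let from_object_store = "refs%2Fconvert%2Fparquet/default/train/0000.parquet";

    // Encoding it again escapes the '%' itself, turning "%2F" into "%252F".
    let sent = encode_path(from_object_store);
    assert_eq!(sent, "refs%252Fconvert%252Fparquet/default/train/0000.parquet");
    println!("{sent}");
}
```

Under these assumptions the server would be asked for a `refs%252Fconvert%252Fparquet` segment, which is not the object the caller meant.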

Steps to Reproduce

#[tokio::test]
async fn test() {
    let url = Url::parse("https://huggingface.co/datasets/Anthropic/persuasion/resolve/refs%2Fconvert%2Fparquet/default/train/0000.parquet").unwrap();
    let endpoint = format!(
        "{}://{}{}",
        url.scheme(),
        url.host_str().unwrap(),
        url.port().map_or("".to_string(), |p| format!(":{p}"))
    );
    let object_store_url = ObjectStoreUrl::parse(&endpoint).unwrap();
    println!("{object_store_url}");
    let path = url.path().to_string();
    let path = Path::parse(path).unwrap();
    println!("{path}");
    let builder = Http::default().endpoint(&endpoint);
    let op = Operator::new(builder).unwrap();
    let op = op.finish();
    let object_store = Arc::new(ObjectStoreCache::new(OpendalStore::new(op)));
    object_store.head(&path).await.unwrap();
}

Expected Behavior

A Cow-like structure may be needed to record whether a path is already percent-encoded, so that it is never encoded or decoded more than once.
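
One possible shape for such a structure, as a rough sketch only: a wrapper that tags the string as raw or already encoded and encodes at most once on demand. `StorePath` and `encode_path` are hypothetical names invented for this illustration and exist in neither object_store nor opendal:

```rust
use std::borrow::Cow;

// Hypothetical encoder used only for this sketch: keeps unreserved
// characters and '/', escapes every other byte as %XX.
fn encode_path(path: &str) -> String {
    let mut out = String::with_capacity(path.len());
    for &b in path.as_bytes() {
        match b {
            b'A'..=b'Z' | b'a'..=b'z' | b'0'..=b'9' | b'-' | b'_' | b'.' | b'~' | b'/' => {
                out.push(b as char)
            }
            _ => out.push_str(&format!("%{:02X}", b)),
        }
    }
    out
}

/// Hypothetical wrapper that remembers whether the path is already encoded.
enum StorePath<'a> {
    Raw(Cow<'a, str>),
    Encoded(Cow<'a, str>),
}

impl<'a> StorePath<'a> {
    /// Returns the percent-encoded form, encoding only if it is still raw.
    fn to_encoded(&self) -> Cow<'_, str> {
        match self {
            StorePath::Raw(p) => Cow::Owned(encode_path(p)),
            // Already encoded: borrow as-is, no second pass.
            StorePath::Encoded(p) => Cow::Borrowed(&**p),
        }
    }
}

fn main() {
    let raw = StorePath::Raw(Cow::Borrowed("my file.parquet"));
    let encoded = StorePath::Encoded(Cow::Borrowed("my%20file.parquet"));

    // Both variants produce the same single-encoded string.
    assert_eq!(raw.to_encoded(), encoded.to_encoded());
    println!("{}", raw.to_encoded());
}
```

The trade-off in this design is that the caller has to state which form it holds when constructing the value. In exchange, the encoding step can be skipped safely for paths that are already encoded.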

Additional Context

From XiangpengHao/parquet-viewer#19

Are you willing to submit a PR to fix this bug?

  • Yes, I would like to submit a PR.
