
bug: Performance Issue: opendal slower than object_store for Large File Uploads to S3 #5929

@chitralverma

Description

Describe the bug

I have been comparing the performance of opendal with object_store when copying a 1GB file from local storage to Amazon S3 with a custom endpoint. I observed that object_store consistently outperforms opendal in terms of speed and consistency of results.

Steps to Reproduce

I created a Criterion.rs benchmark to compare the two crates.

🔹 Environment:

  • Rust version: rustc 1.85.0-nightly (4ba4ac612 2024-12-18)
  • opendal version: 0.52 (latest at the time of writing)
  • object_store version: 0.11.2
  • S3 Bucket Region: EU (Germany)
  • File size: 1GB

Here's my benchmark code

use anyhow::{Context, Result};
use criterion::{Criterion, criterion_group, criterion_main};
use futures::{StreamExt, TryStreamExt};
use object_store::aws::AmazonS3Builder;
use object_store::local::LocalFileSystem;
use object_store::path::Path;
use object_store::{ObjectStore, WriteMultipart};
use opendal::services::{Fs, S3};
use opendal::Operator;
use tokio::runtime::Runtime;

const TEST_FILE: &str = "/tmp/test_1gb_file"; // Ensure this file exists

const DEST_PATH_OPENDAL: &str = "test_upload_opendal";
const DEST_PATH_OBJECT_STORE: &str = "test_upload_objectstore";

const DEST_BUCKET_NAME: &str = "bucket_name";
const DEST_ENDPOINT: &str = "endpoint";
const DEST_ACCESS_KEY: &str = "access";
const DEST_SECRET_KEY: &str = "secret";

async fn upload_with_opendal() -> Result<()> {
    let src_operator = Operator::new(Fs::default().root("/"))?.finish();

    let dest_operator = Operator::new(
        S3::default()
            .root("/")
            .bucket(DEST_BUCKET_NAME)
            .access_key_id(DEST_ACCESS_KEY)
            .secret_access_key(DEST_SECRET_KEY)
            .endpoint(DEST_ENDPOINT),
    )?
    .finish();

    let reader = src_operator
        .reader_with(TEST_FILE)
        .concurrent(8)
        .chunk(8 * 1024 * 1024)
        .await?
        .into_bytes_stream(..)
        .await?;

    let writer = dest_operator
        .writer_with(DEST_PATH_OPENDAL)
        .concurrent(8)
        .await?
        .into_bytes_sink();

    reader
        .forward(writer)
        .await
        .context("Failed to upload file with opendal")
}

async fn upload_with_object_store() -> Result<()> {
    let src_store = LocalFileSystem::new_with_prefix("/")?;

    let dest_store = AmazonS3Builder::new()
        .with_access_key_id(DEST_ACCESS_KEY)
        .with_secret_access_key(DEST_SECRET_KEY)
        .with_endpoint(DEST_ENDPOINT)
        .with_bucket_name(DEST_BUCKET_NAME)
        .build()?;

    // Open a reader stream over the source file
    let stream = src_store.get(&Path::from(TEST_FILE)).await?.into_stream();

    // Open a multipart writer for the destination
    let upload = dest_store
        .put_multipart(&Path::from(DEST_PATH_OBJECT_STORE))
        .await?;

    let mut write = WriteMultipart::new(upload);

    stream
        .try_for_each(|chunk| {
            write.put(chunk);
            async move { Ok(()) }
        })
        .await?;

    write
        .finish()
        .await
        .context("Failed to upload file with object_store")?;
    Ok(())
}

fn benchmark_upload(c: &mut Criterion) {
    // Use the Criterion instance passed in by `criterion_group!` instead of
    // discarding it; the sample size is configured on the group.
    let mut group = c.benchmark_group("comparison");
    group.sample_size(10);

    let rt = Runtime::new().unwrap();

    group.bench_function("upload_with_object_store", |b| {
        b.iter(|| {
            rt.block_on(upload_with_object_store()).unwrap();
        })
    });
    group.bench_function("upload_with_opendal", |b| {
        b.iter(|| {
            rt.block_on(upload_with_opendal()).unwrap();
        })
    });

    group.finish();
}

criterion_group!(benches, benchmark_upload);
criterion_main!(benches);

Expected Behavior

As the snippet shows, with opendal I had to manually tune the concurrent and chunk settings to get performance close to what object_store delivered out of the box.

Other values of concurrent and chunk on the reader and writer gave worse performance.

The expected behavior is for opendal to ship with optimal defaults that users can override if needed; out of the box, its performance should be equivalent to object_store's.
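For scale: with the 8 MiB chunk size and concurrency of 8 set in the benchmark, a 1 GiB upload splits into 128 multipart parts with at most 8 in flight at once. A minimal sketch of that arithmetic (the constants mirror the benchmark above, not opendal's defaults):

```rust
/// Number of multipart parts for a file of `file_size` bytes split
/// into `chunk_size`-byte chunks (the last part may be smaller).
fn part_count(file_size: u64, chunk_size: u64) -> u64 {
    file_size.div_ceil(chunk_size)
}

fn main() {
    let file_size: u64 = 1024 * 1024 * 1024; // 1 GiB, as in the benchmark
    let chunk_size: u64 = 8 * 1024 * 1024;   // 8 MiB, as set via `.chunk(...)`
    let concurrency: u64 = 8;                // as set via `.concurrent(8)`

    let parts = part_count(file_size, chunk_size);
    // Upper bound on sequential "waves" if all 8 slots finish in lockstep.
    let waves = parts.div_ceil(concurrency);
    println!("{parts} parts, roughly {waves} waves of {concurrency}");
    // → 128 parts, roughly 16 waves of 8
}
```

This suggests the gap is less about raw part count and more about how each library schedules and pipelines those parts.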

Additional Context

Log of cargo bench --bench s3_upload_benchmark

Benchmarking comparison/upload_with_object_store
Benchmarking comparison/upload_with_object_store: Warming up for 3.0000 s

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 200.8s.
Benchmarking comparison/upload_with_object_store: Collecting 10 samples in estimated 200.82 s (10 iterations)
Benchmarking comparison/upload_with_object_store: Analyzing
comparison/upload_with_object_store
                        time:   [17.578 s 19.972 s 22.836 s]
Found 2 outliers among 10 measurements (20.00%)
  2 (20.00%) high mild
Benchmarking comparison/upload_with_opendal
Benchmarking comparison/upload_with_opendal: Warming up for 3.0000 s

Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 285.9s.
Benchmarking comparison/upload_with_opendal: Collecting 10 samples in estimated 285.93 s (10 iterations)
Benchmarking comparison/upload_with_opendal: Analyzing
comparison/upload_with_opendal
                        time:   [27.903 s 32.448 s 38.408 s]
Found 4 outliers among 10 measurements (40.00%)
  2 (20.00%) low mild
  2 (20.00%) high severe
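Translating the median times above into effective throughput (simple division, assuming the full 1 GiB transfers each iteration):

```rust
/// Effective throughput in MiB/s for `bytes` transferred in `seconds`.
fn throughput_mib_s(bytes: u64, seconds: f64) -> f64 {
    bytes as f64 / (1024.0 * 1024.0) / seconds
}

fn main() {
    let one_gib: u64 = 1024 * 1024 * 1024;
    // Median times from the criterion output above.
    println!("object_store: {:.1} MiB/s", throughput_mib_s(one_gib, 19.972)); // 51.3 MiB/s
    println!("opendal:      {:.1} MiB/s", throughput_mib_s(one_gib, 32.448)); // 31.6 MiB/s
}
```

So at the medians, opendal reaches roughly 60% of object_store's throughput in this setup.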

Violin plot: included in the attached criterion reports.

Attached the reports from criterion as a zip

criterion.zip

Are you willing to submit a PR to fix this bug?

  • Yes, I would like to submit a PR.

Labels: bug