Skip to content

Comments

WIP: Enhance py build#6622

Closed
chitralverma wants to merge 12 commits intoapache:mainfrom
chitralverma:enhance-py-build
Closed

WIP: Enhance py build#6622
chitralverma wants to merge 12 commits intoapache:mainfrom
chitralverma:enhance-py-build

Conversation

@chitralverma
Copy link
Contributor

@chitralverma chitralverma commented Oct 6, 2025

@Xuanwo
Copy link
Member

Xuanwo commented Oct 13, 2025

This PR is too large for me to read the diff on the web page. Can we split it? For example, Bump pyo3 to 0.26.0 deserves its own PR.

I personally hate Makefile, can we use Justfile instead?

@chitralverma
Copy link
Contributor Author

chitralverma commented Oct 13, 2025

This PR is too large for me to read the diff on the web page. Can we split it? For example, Bump pyo3 to 0.26.0 deserves its own PR.

I personally hate Makefile, can we use Justfile instead?

@Xuanwo pyo3 upgrade was already in in s separate PR #6631
I can replace Makefile, with Justfile

@Xuanwo
Copy link
Member

Xuanwo commented Oct 13, 2025

Thank you! Welcome to pin me while this PR is ready.

@chitralverma
Copy link
Contributor Author

chitralverma commented Oct 14, 2025

Hello team, I'm providing an update on the refactoring effort.

While I've completed about 70% of the work, I've hit a significant roadblock with the async methods in AsyncOperator, AsyncLister, and AsyncFile. The issue stems from a fundamental mismatch between how opendal-python handles asynchronous operations and how pyo3-stub-gen expects them to be structured.

The Core Issue: A Mismatch in Async Handling

  1. pyo3-stub-gen's Expectation: To generate correct async def stubs, the tool requires that the Rust functions themselves are declared as async fn. This typically involves enabling PyO3's experimental-async feature.

  2. OpenDAL's Implementation: We use the recommended pattern from pyo3-async-runtimes, where a synchronous Rust function acts as a "factory" that returns a Python future (by calling future_into_py). This is the correct way to avoid !Send lifetime issues when bridging Rust async code to Python's event loop.

Because our Rust functions are not literally async fn, pyo3-stub-gen generates incorrect synchronous stubs (e.g., def read(...) instead of async def read(...)). This defeats the purpose of type hinting for our entire async API.

Possible Paths Forward

I see two potential solutions to this problem:

1. Overhaul OpenDAL's Async Implementation (Not Recommended)

We could refactor all our async methods to use async fn in Rust, likely by creating a custom wrapper around the Tokio runtime. I've drafted a proof-of-concept for this below.

However, I won't recommend this as it would be a substantial and invasive change to our core async logic, introducing complexity and potential risks, purely for the sake of stubs.

Click to see a proof-of-concept `future_into_py_async` wrapper
/// Spawns a future on the shared Tokio runtime and returns a Python awaitable.
/// Relies on the runtime from `pyo3-async-runtimes`.
pub async fn future_into_py_async<F, T>(fut: F) -> PyResult<T>
where
    F: std::future::Future<Output = PyResult<T>> + Send + 'static,
    T: Send + 'static,
{
    let rt = pyo3_async_runtimes::tokio::get_runtime();
    let handle = rt.handle().clone();

    // Use a channel to bridge the result back from the spawned task
    let (tx, rx) = tokio::sync::oneshot::channel();

    // Spawn the future on the Tokio runtime
    handle.spawn(async move {
        let result = fut.await;
        let _ = tx.send(result);
    });

    // Await the result from the channel
    match rx.await {
        Ok(result) => result,
        Err(e) => Err(crate::errors::Unexpected::new_err(format!(
            "Async task panicked or was dropped: {e}"
        ))),
    }
}

2. Enhance pyo3-stub-gen (Recommended Path) ✅

The more pragmatic and correct solution is to address this limitation in the stub generation tool itself. The tool should provide a way to manually hint that a synchronous Rust function produces an awaitable.

To that end, I have already opened an issue on the pyo3-stub-gen repository requesting this feature:

Side Note: The recent migration from chrono to jiff will require a separate, smaller issue to be filed with pyo3-stub-gen to add support for jiff::Timestamp. I will handle that.

Looking forward to your thoughts on this approach.
cc @Xuanwo @messense @PsiACE

@Xuanwo
Copy link
Member

Xuanwo commented Oct 14, 2025

Thank you @chitralverma for working on this! I didn’t take a close look at the changes to pyo3. What’s the current state of writing async fn directly? I recall their page at https://pyo3.rs/v0.26.0/async-await saying we can just use:

use std::{thread, time::Duration};
use futures::channel::oneshot;
use pyo3::prelude::*;

#[pyfunction]
#[pyo3(signature=(seconds, result=None))]
async fn sleep(seconds: f64, result: Option<Py<PyAny>>) -> Option<Py<PyAny>> {
    let (tx, rx) = oneshot::channel();
    thread::spawn(move || {
        thread::sleep(Duration::from_secs_f64(seconds));
        tx.send(()).unwrap();
    });
    rx.await.unwrap();
    result
}

What are the limitations?

@chitralverma
Copy link
Contributor Author

What are the limitations?

@Xuanwo your comment is in ref to point 1 in my message earlier. the snippet your shared is a simpler version of future_into_py_async that i attached above, please take a look.

the difference between option 1 (self implemented future_into_py_async) and existing approach (pyo3_async_runtimes's future_into_py) is as following:


Short answer

They’re similar in purpose but serve different roles.

Long answer

High-Level Comparison

Aspect future_into_py_async pyo3_async_runtimes::future_into_py
Return type PyResult<T> (awaits Rust side, returns result) PyResult<Bound<PyAny>> (a Python asyncio.Future)
Runs inside Rust Yes — your async method awaits it fully before returning to Python No — it returns control to Python immediately, and Python awaits the future
Python integration Minimal — looks like a normal async fn to PyO3 Deep — integrates with Python's event loop (TaskLocals)
Cancellation needs more work to do this Full support (cancellation, contextvars, error propagation)
Error handling handled JoinError manually Handles panics, cancellation, and wraps into Python exceptions
Use case Async inside PyO3 async functions (you await in Rust) Async from sync PyO3 methods (return Python awaitables to user)

Conceptual Difference: Two Approaches to Async Bridging

There's a fundamental difference in the design and use case between the standard future_into_py and the proposed future_into_py_async helper.

pyo3_async_runtimes::future_into_py

This function acts as a bridge from Rust to Python's event loop.

  • Analogy: "Here, Python, take this promise (asyncio.Future). I'll tell you when the work is done."
  • How it works: It immediately returns a Python asyncio.Future object to the caller. The Python event loop then takes control and awaits this future. Meanwhile, Rust spawns the work in the background.
  • Integration: It provides deep, "heavyweight but correct" integration, handling cancellation, contextvars, and proper error propagation between the two runtimes.
  • Use Case: The canonical way to expose a Rust async operation to Python from a synchronous Rust function.

The proposed future_into_py_async

This function is a self-contained async operation within Rust.

  • Analogy: "Wait here. I'm going to finish this Rust task, and I'll give you the final result when I'm back."
  • How it works: It is an async fn itself that awaits the given Rust future completely within Rust. It only returns the final, concrete PyResult<T> to Python after the Rust-side await is complete.
  • Integration: Minimal. From PyO3's perspective, it's just another async function. It doesn't manage Python futures or interact with the event loop.
  • Use Case: A helper for running a Send-able Rust future inside an existing async PyO3 method.

@chitralverma
Copy link
Contributor Author

chitralverma commented Oct 14, 2025

I think i should split this PR in smaller ones maybe, because if we have to try option 1, that will definitely need its own big PR

@Xuanwo
Copy link
Member

Xuanwo commented Oct 14, 2025

@Xuanwo your comment is in ref to point 1 in my message earlier. the snippet your shared is a simpler version of future_into_py_async that i attached above, please take a look.

Hi, I don't think they are the same thing.

I thought we can do:

use std::{thread, time::Duration};
use futures::channel::oneshot;
use pyo3::prelude::*;

#[pyfunction]
#[pyo3(signature=(seconds, result=None))]
async fn sleep(seconds: f64, result: Option<Py<PyAny>>) -> Option<Py<PyAny>> {
    tokio::sleep(10s).await;
    result
}

Can we?

@chitralverma
Copy link
Contributor Author

chitralverma commented Oct 15, 2025

Can we?

  1. the snippet that you shared does not work. it fails with:
pyo3_runtime.PanicException: there is no reactor running, must be called from the context of a Tokio 1.x runtime
  1. This example from pyo3 docs works but doesn't use tokio. Same example you mentioned in one of the comments.

  2. I managed to merge the two and the following works with tokio. This is because i inject the tokio runtime that was being internally used in the existing implementation

    #[pyo3(signature=(seconds, result=None))]
    async fn sleep(&self, seconds: f64, result: Option<Py<PyAny>>) -> Option<Py<PyAny>> {
        let (tx, rx) = tokio::sync::oneshot::channel();
        let rt = pyo3_async_runtimes::tokio::get_runtime();    // <- tokio runtime injected
        let handle = rt.handle().clone();    // <- injected runtime used to create a cheap handle

        handle.spawn(async move {
            tokio::time::sleep(tokio::time::Duration::from_secs_f64(seconds)).await;
            let _ = tx.send(result);
        });

        rx.await.unwrap()
    }

@Xuanwo
Copy link
Member

Xuanwo commented Oct 15, 2025

Got it, thanks @chitralverma for the explanation. I agree that Jij-Inc/pyo3-stub-gen#326 is the right direction. Maybe we can collaborate with upstream to make it happen.

@chitralverma
Copy link
Contributor Author

raised Jij-Inc/pyo3-stub-gen#329 for jiff compatiiblity

@chitralverma
Copy link
Contributor Author

closed via #6677 #6690 #6703 #6720

@chitralverma chitralverma deleted the enhance-py-build branch December 8, 2025 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

new feature: Improve automatic stub generation for python bindings

3 participants