Skip to content

Comments

feat(bindings/python)!: Generate stubs for Operator overloads and Scheme#6729

Merged
Xuanwo merged 16 commits intoapache:mainfrom
chitralverma:gen-services
Oct 27, 2025
Merged

feat(bindings/python)!: Generate stubs for Operator overloads and Scheme#6729
Xuanwo merged 16 commits intoapache:mainfrom
chitralverma:gen-services

Conversation

@chitralverma
Copy link
Contributor

@chitralverma chitralverma commented Oct 22, 2025

Which issue does this PR close?

Closes #6686.

Rationale for this change

This uses an alternate approach to generate the overloads previously in __base.pyi.
Instead of rendering __base.pyi directly, this approach generates a checked rust code instead which then gets submitted to pyo3-stub-gen and the resultant stub services.pyi is safely generated with all other stubs at the same time.

I aim to make this the parent class of AsyncOperator and Operator so that they can get all the overloads.
image

What changes are included in this PR?

  • Redo python generator to fix fix:(bindings/python) changes required in dev/src/generate/python.j2 #6686 along with rest of the stubs
    • Overload __new__ in Operator for each service
    • Overload __new__ in AysncOperator for each service
    • All overloads come with typed configurations and valid docstrings
      • Config descriptions (comments ) required a bit of of preprocessing and formatting, add that.
  • Remove old __base.pyi
  • Introduce services py module containing a Scheme which can also be used in the constructors of operators

Are there any user-facing changes?

No breaking changes included.
Users get better docs and typing.

@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Oct 22, 2025
@dosubot dosubot bot added the releases-note/feat The PR implements a new feature or has a title that begins with "feat" label Oct 22, 2025
A closed file cannot be used for further I/O operations.
"""
def __aenter__(self) -> typing.Self: ...
def __aenter__(self) -> typing_extensions.Self: ...
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typing.Self was not available in python 3.10 which opendal supports, so fixed this

}

#[gen_stub(override_return_type(type_repr="typing.Self", imports=("typing")))]
#[gen_stub(override_return_type(type_repr="typing_extensions.Self", imports=("typing_extensions")))]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See.

[group('build')]
stub-gen: setup
@echo "{{ BOLD }}--- Generating Python type stubs ---{{ NORMAL }}"
@cargo run --quiet --manifest-path=../../dev/Cargo.toml -- generate -l python
Copy link
Contributor Author

@chitralverma chitralverma Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overloads and Scheme generates as part of the stub-gen process

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This made me think that can we move stub_gen to odev instead.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This made me think that can we move stub_gen to odev instead.

actually these are 2 separate processes that cannot be merged into the binding or in odev from what i understood.

  1. pyo3-stub-gen requires.... well, pyo3 which is available in python bindings. this step is responsible for generating most of the python stubs from rust code.
  2. the stuff in odev uses jinja templating because it has parsing logic of services code used for java as well. if odev had lib target then it could be used as a build dependency but still would be separate from pyo3 stuff.

@chitralverma chitralverma marked this pull request as draft October 22, 2025 19:35
@@ -0,0 +1,155 @@
// Licensed to the Apache Software Foundation (ASF) under one
Copy link
Contributor Author

@chitralverma chitralverma Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this file is temporary. will replace python.j2 with this.
done

Comment on lines +28 to +30
"etcd" | "foundationdb" | "ftp" | "hdfs" | "rocksdb" | "tikv" | "sftp" | "github"
| "cloudflare_kv" | "monoiofs" | "dbfs" | "surrealdb" | "d1" | "opfs" | "compfs"
| "lakefs" | "pcloud" | "vercel_blob" => false,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will re enable some of these in another PR

Comment on lines +93 to +95
pub fn format_text(text: &str, indent: usize, max_line_length: usize) -> String {
// Precompute indentation string
let indent_str = " ".repeat(indent);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

was trying to avoid an ext. crate for this this task.

Comment on lines +225 to +233
ConfigType::Bool => "builtins.bool",
ConfigType::Duration => "typing.Any",
ConfigType::I64
| ConfigType::Usize
| ConfigType::U64
| ConfigType::U32
| ConfigType::U16 => "_int",
ConfigType::Vec => "_strings",
ConfigType::String => "str",
| ConfigType::U16 => "builtins.int",
ConfigType::Vec => "typing.Any",
ConfigType::String => "builtins.str",
Copy link
Contributor Author

@chitralverma chitralverma Oct 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be able to cast automatically

@chitralverma chitralverma changed the title feat(bindings/python)!: Generate new ServiceBase and add Scheme using stub-gen feat(bindings/python)!: Generate stubs for Operator overloads and Scheme Oct 22, 2025
Comment on lines +21 to +41
from opendal._opendal import ( # noqa: F403
capability,
exceptions,
file,
layers,
services,
types,
)
from opendal.operator import AsyncOperator, Operator # pyright:ignore

__version__: builtins.str

__all__ = _opendal.__all__ # noqa: F405 # pyright:ignore
__all__ += ["AsyncOperator", "Operator"] # pyright:ignore
__all__ = [
"capability",
"exceptions",
"file",
"layers",
"services",
"types",
"AsyncOperator",
"Operator",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes things explicit

Comment on lines 69 to 80
{% for srv in srvs %}
submit! {
gen_methods_from_python! {
r#"
import builtins
import typing
import typing_extensions
class Operator:
def __new__(cls,
scheme: typing.Literal["{{ snake_to_kebab_case(srv) }}"],
/,
{%- if srvs[srv].config %}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Operator overloads

Comment on lines 108 to 116
import builtins
import typing
import typing_extensions
class AsyncOperator:
def __new__(cls,
scheme: typing.Literal["{{ snake_to_kebab_case(srv) }}"],
/,
{%- if srvs[srv].config %}
*,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AsyncOperator overloads

Comment on lines 82 to 84
/// The new operator.
#[gen_stub(skip)]
#[new]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

helps IDEs

@chitralverma chitralverma marked this pull request as ready for review October 22, 2025 21:18
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. and removed size:XXL This PR changes 1000+ lines, ignoring generated files. labels Oct 22, 2025
@dosubot dosubot bot removed the size:L This PR changes 100-499 lines, ignoring generated files. label Oct 22, 2025
@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Oct 22, 2025
@chitralverma
Copy link
Contributor Author

@Xuanwo this is ready for your review, when ever you're free.
This effectively finishes the refactoring for #6253 and fixes #6686.

Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for working on this, mostly LGTM, just some questions.

import typing

@typing.final
class Scheme(enum.Enum):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a goal to finally split OpenDAL into separate crates so users can easily implement and use their own services. On the Python side, I want to dynamically load them. Is there a way to enable this?

This is not a blocker: we can still merge this PR as is, but let’s keep this idea in mind as we move forward.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

separate crates how? by bindings or services ?

On the Python side, I want to dynamically load them. Is there a way to enable this?

in python world, yes via dependency groups, in rust world, yes by feature gating but in rust-python world via pyo3, i've never done this before so not sure. but then i also never did stub-gen before this, so will take a shot at it when that happens.

assuming a service level split:
makes me think that if there was a way to generate language agnostic linkable code for the services, the bindings can actually be generated via some sort of a spec driven development model. Elasticsearch and aws do this but they're server based so its easier for them.

duckdb does this as well (think of opendal services as duckdb extensions - cross platform and language agnostic) but need to think more on this. is there a issue/ discussion/ rfc for this where i can contribute better ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is shown here: #5206

In short, the OpenDAL Rust core will eventually become opendal-service-s3, opendal-service-azblob, and more. I want to know if this change can benefit Python packages. Can we avoid delivering large wheels to users?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

raised astral-sh/uv#16465, which if answered will help.
maybe @messense has some suggestions from maturin side.

[group('build')]
stub-gen: setup
@echo "{{ BOLD }}--- Generating Python type stubs ---{{ NORMAL }}"
@cargo run --quiet --manifest-path=../../dev/Cargo.toml -- generate -l python
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This made me think that can we move stub_gen to odev instead.

paths: [python]
options:
docstring_style: google
docstring_style: numpy
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this change for? You perfer numpy style?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well before all these PRs, the code had a mix of docstring conventions which i wanted to standardize as well so i while doing this (with the help of LLMs), all the generated docstrings turned out to be in numpy convention.

today i checked that mkdocs was not rendering it properly so made this change to get it working properly.
numpy style is slightly verbose but its a very well established standard as well.

@chitralverma chitralverma requested a review from Xuanwo October 23, 2025 18:33
@chitralverma
Copy link
Contributor Author

postgresql test is stuck.

@chitralverma
Copy link
Contributor Author

@Xuanwo any other comments from your side before this can be merged?

Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#6729 (comment) may need a follow-up, but other changes LGTM, let's move!

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Oct 27, 2025
@Xuanwo Xuanwo merged commit 9d8ccf0 into apache:main Oct 27, 2025
63 checks passed
@chitralverma chitralverma deleted the gen-services branch December 8, 2025 12:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

lgtm This PR has been approved by a maintainer releases-note/feat The PR implements a new feature or has a title that begins with "feat" size:XXL This PR changes 1000+ lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

fix:(bindings/python) changes required in dev/src/generate/python.j2

2 participants