Skip to content

Comments

feat(bindings/cpp): expose reader #3004

Merged
Xuanwo merged 7 commits intomainfrom
cpp-more-operation
Sep 5, 2023
Merged

feat(bindings/cpp): expose reader #3004
Xuanwo merged 7 commits intomainfrom
cpp-more-operation

Conversation

@silver-ymz
Copy link
Member

@silver-ymz silver-ymz commented Sep 3, 2023

Explanation

std::istream is more like BufRead in rust, so I expose struct Reader(BufReader<od::BlockingReader>). Otherwise, we need to impl a BufReader in c++ which will bring more unsafe code and it's more difficult to maintain.

std::streambuf use three following pointers to track buffer state. c++ side needs to update these pointers with rust buffer state. My implementation is to call fill_buf and seek in needed and assign buffer result to the three pointers. And we call consume in the next fill_buf and seek operation to update c++ info for rust side.

image

Some updated explanation is on #3004 (comment)

@github-actions github-actions bot added the releases-note/feat The PR implements a new feature or has a title that begins with "feat" label Sep 3, 2023
Signed-off-by: silver-ymz <[email protected]>
@Xuanwo
Copy link
Member

Xuanwo commented Sep 4, 2023

@silver-ymz
Copy link
Member Author

The reader implementation mimics gcs sdk. It only uses std to provide a reader suitable for std::istream.

If we use boost to provide a more high-level API. The Input-seekable device of iostream library seems better. asio::basic_file seems not suitable to derive and make custom io file.

@Xuanwo
Copy link
Member

Xuanwo commented Sep 4, 2023

If we use boost to provide a more high-level API.

Yes, I prefer to have a more high-level API. OpenDAL's Reader is not a underlying file object and can't be operated as a file.


What makes me feel the most uneasy is the direction of our cpp binding. Let's revisit the VISION for our opendal binding: to provide a native experience and easier-to-maintain bindings based on opendal.

  • Naitve Experience: Makes OpenDAL looks like written in Cpp
  • Easier to maintain: Make maintainers for both rust core and cpp bindings happy

So our bindings is composed by two parts: ffi and cpp.

  • ffi is used to expose rust structs to cpp structs, it's just a bridge.
  • cpp is used to build public API for users.

Take a naming issue as an example:

image

It does ok to expose as just fill_buf, but:

  • Users should not call fill_buf as first, it's not the public API we want to expose. (are you agree with this?)
  • fill_buf is confused with other APIs. I prefer to have something that opendal::ffi::reader_fill_buf, and our cpp code should call opendal::ffi::reader_fill_buf internally. (note, there are two layers here, we should not expose ffi to users directly)

@silver-ymz
Copy link
Member Author

It does ok to expose as just fill_buf, but:

Users should not call fill_buf as first, it's not the public API we want to expose. (are you agree with this?)
fill_buf is confused with other APIs. I prefer to have something that opendal::ffi::reader_fill_buf, and our cpp code should call opendal::ffi::reader_fill_buf internally. (note, there are two layers here, we should not expose ffi to users directly)

We don't expose fill_buf as public API.
our public headers don't inlcude lib.rs.h. So idealy, user won't see or use anything in opendal::ffi namesapce inlucing all functions of Reader. If users try using it according to some magic ways, we don't have any ways to avoid it.
Although we show the definition of Reader in opendal.hpp, our users can't get a Reader by any normal ways. What user will use is ReaderStream, all functions about it are same with std::istream.

@Xuanwo
Copy link
Member

Xuanwo commented Sep 4, 2023

Although we show the definition of Reader in opendal.hpp, our users can't get a Reader by any normal ways. What user will use is ReaderStream, all functions about it are same with std::istream.

Great! Can we organize them in a better way so that we can know the bounary?

@silver-ymz
Copy link
Member Author

silver-ymz commented Sep 4, 2023

I use boost iostreams library to make the API look better.

Update:

  • reader() will return Reader type to provide basic read and seek function. If users don't want iostream to make a buffer read, they can use Reader directly.
  • We use boost iostreams library to deal with all operations about buffer. This can significantly reduce maintain difficulty.
  • rename original Reader to InternalReader and move it into current Reader class definition. Users now can have a clearer idea of what type they need.

@silver-ymz silver-ymz requested a review from Xuanwo September 4, 2023 15:09
@silver-ymz
Copy link
Member Author

@Xuanwo Could you make review again? Previously, I didn't notice boost iostreams library. All low-level details are completed by ourselves which results some APIs aren't easy to use and maintain. With boost iostreams library, I think most problems previously mentioned are solved.

Copy link
Member

@Xuanwo Xuanwo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@Xuanwo Xuanwo merged commit 89627a6 into main Sep 5, 2023
@Xuanwo Xuanwo deleted the cpp-more-operation branch September 5, 2023 09:37
@silver-ymz silver-ymz mentioned this pull request Sep 1, 2023
8 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

releases-note/feat The PR implements a new feature or has a title that begins with "feat"

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants