Add examples of DataFrame::write* methods without S3 dependency #8606
metesynnada merged 5 commits into apache:main
async fn main() -> Result<(), DataFusionError> {
    let ctx = SessionContext::new();
    let local = Arc::new(LocalFileSystem::new_with_prefix("./").unwrap());
    let local_url = Url::parse("file://local").unwrap();
I think we can simplify this by removing the LocalFileSystem registration. We can show the default behavior here, and show how to extend it in the S3 example. Let's make the first-time user climb the stairs one by one.
Thanks, that is definitely simpler.
metesynnada left a comment
Thanks for the effort. LGTM.
alamb left a comment
Thank you @devinjdangelo -- I think this looks great and thank you for doing it.
Can you please also add an entry to the README here: https://github.com/apache/arrow-datafusion/tree/main/datafusion-examples#single-process ?
    .write_table("test", DataFrameWriteOptions::new())
    .await?;

df.clone()
I wonder if it would be valuable to update one of these calls to show how DataFrameWriteOptions is used, something like:
// you can use DataFrameWriteOptions to control how the dataframe output is created
// for example:
....
?
Just pushed up an update with a DataFrameWriteOptions example and added the new example to the README file.
df.clone()
    .write_csv(
        "./datafusion-examples/test_csv/",
        // DataFrameWriteOptions contains options which control how data is written
Which issue does this PR close?
Closes #8551
Rationale for this change
We currently have no example of DataFrame::write_table, nor of any other DataFrame::write* method, that does not depend on an external S3 bucket.
What changes are included in this PR?
Adds examples of DataFrame::write_table and other write* methods using the LocalFileSystem object store.
Are these changes tested?
Via existing tests
Are there any user-facing changes?
No