Add DataFrame reference to the user guide#3067
Conversation
|
@kmitchener WDYT? |
Codecov Report
@@ Coverage Diff @@
## master #3067 +/- ##
==========================================
+ Coverage 85.85% 85.93% +0.08%
==========================================
Files 289 289
Lines 51890 52118 +228
==========================================
+ Hits 44548 44790 +242
+ Misses 7342 7328 -14
📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more |
|
Looks great! |
Co-authored-by: Loïc Sharma <[email protected]>
Co-authored-by: Loïc Sharma <[email protected]>
Yeah, this is nice and was completely missing from the docs before. Re the function listing, should we combine the dataframe functions/conditional expressions, etc with the SQL version? So one definition of the function with maybe 2 examples -- one for SQL, one for Dataframe. Although that consolidation can come later in a follow-on. |
…afusion into user-guide-dataframe
| // TODO(kszucs): this seems buggy, unary_scalar_expr! is used for many | ||
| // varying arity functions |
I'm not sure if we should combine them or not. Combining them is easier for the maintainers for sure, but having separate references for DataFrame and SQL might be easier for users? |
|
Here is the related PR to add SQL function documentation: #3090 |
alamb
left a comment
There was a problem hiding this comment.
❤️
This looks like a really nice improvment @andygrove . Thank you!
| unary_scalar_expr!(Log10, log10); | ||
| unary_scalar_expr!(Ln, ln); | ||
| unary_scalar_expr!(NullIf, nullif); | ||
| unary_scalar_expr!(Sqrt, sqrt, "square root of a number"); |
| under the License. | ||
| --> | ||
|
|
||
| # DataFrame API |
There was a problem hiding this comment.
I wonder if it is important to mention somewhere that the computations are deferred until collect() is called? Maybe that is common across other dataframe implementations and can be assumed.
| col("a").gt(lit(5)).and(col("b").lt(lit(7))) | ||
| ``` | ||
|
|
||
| ## Identifiers |
There was a problem hiding this comment.
Since these are not data frame specific, maybe we should put them into a different guide -- and reference that here
Something like docs/source/user-guide/expressions.md perhaps
This is just a suggestion, I think it would be fine to do in a follow on PR or never
There was a problem hiding this comment.
I will move into a separate page as a follow-on. How would you see users using expressions outside the context of DataFrame usage?
There was a problem hiding this comment.
We use them all the time in IOx when we need to make Exprs -- for example to create LogicalPlans / use the LogicalPlanBuilder. I also think they are useful for writing tests when people are writing extensions for DataFusion
| | case | CASE expression. Example: `case(expr).when(expr, expr).when(expr, expr).otherwise(expr).end()`. | | ||
| | nullif | Returns a null value if value1 equals value2; otherwise it returns value1. This can be used to perform the inverse operation of the `coalesce` expression. | | ||
|
|
||
| ## String Expressions |
There was a problem hiding this comment.
I think it is ok to leave the Notes column blank for now and fill it out going forward
There was a problem hiding this comment.
Or maybe we can link to the content in docs/source/user-guide/sql/scalar_functions.md
Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
|
Benchmark runs are scheduled for baseline = 098f0b0 and contender = 1e44417. 1e44417 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Which issue does this PR close?
Closes #3066
Rationale for this change
We need to document what is available otherwise how will users know where to start?
What changes are included in this PR?
New section in user guide (see rendered version)
This is just a starting point and we'll need to continue improving and extending once this is merged.
Are there any user-facing changes?
No