Simplify display format of AggregateFunctionExpr, add Expr::sql_name#15253
Simplify display format of AggregateFunctionExpr, add Expr::sql_name#15253alamb merged 17 commits intoapache:mainfrom
AggregateFunctionExpr, add Expr::sql_name#15253Conversation
| struct SqlDisplay<'a>(&'a Expr); | ||
| impl Display for SqlDisplay<'_> { | ||
| fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result { | ||
| match self.0 { | ||
| Expr::Column(c) => { | ||
| write!(f, "{}", c.name) | ||
| } | ||
| Expr::Literal(_) => { | ||
| write!(f, "aa") | ||
| } | ||
| Expr::ScalarVariable(..) => { | ||
| write!(f, "bb") | ||
| } | ||
| Expr::OuterReferenceColumn(..) => { | ||
| write!(f, "cc") | ||
| } | ||
| Expr::Placeholder(_) => { | ||
| write!(f, "dd") | ||
| } | ||
| Expr::Wildcard { .. } => { | ||
| write!(f, "ee") | ||
| } | ||
| Expr::AggregateFunction(AggregateFunction { func, params }) => { | ||
| match func.sql_name(params) { | ||
| Ok(name) => { | ||
| write!(f, "{name}") | ||
| } | ||
| Err(e) => { | ||
| write!(f, "got error from schema_name {}", e) | ||
| } | ||
| } | ||
| } | ||
| _ => write!(f, "{}", self.0.schema_name()), | ||
| } | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
Just a demo here.
AggregateFunctionExpr's name is constructed at logical expr phase, we need a new member sql_name for tree explain, and a new constructor SqlDisplay to handle the inner Expr recursively. cc @alamb 👀
| 13)-----------------------------│ AggregateExec │ | ||
| 14)-----------------------------│ -------------------- │ | ||
| 15)-----------------------------│ aggr: count(Int64(1)) │ | ||
| 15)-----------------------------│ aggr: count(aa) │ |
There was a problem hiding this comment.
We can see the final display changed.
30efcd5 to
03e33a4
Compare
AggregateFunctionExprAggregateFunctionExpr, add Expr::sql_name
There was a problem hiding this comment.
Thank you @irenjj -- this looks really nice
I think there are a few ways we should improve the comments, but I'll directly push those to this branch.
Thanks again!
Update; I don't have time right now to make the comment changes. Let me know what you think @irenjj and if you are willing to make them in this PR. If not I'll make a follow on PR tomorrow.
Thanks again ❤️
| } | ||
|
|
||
| struct SqlDisplay<'a>(&'a Expr); | ||
| impl Display for SqlDisplay<'_> { |
There was a problem hiding this comment.
We already have code to convert from Expr to SQL, in https://docs.rs/datafusion/latest/datafusion/sql/unparser/struct.Unparser.html#method.expr_to_sql
However, datafusion-expr doesn't currently depend on the datafusion-sql crate which is probably a good thing
So I think we should add a note in Expr::sql_name that if the user's want syntactically correct SQL to feed to some other system, they should use the Unparser and leave a link to https://docs.rs/datafusion/latest/datafusion/sql/unparser/struct.Unparser.html
datafusion/expr/src/expr.rs
Outdated
| schema_name_from_exprs_inner(exprs, ",") | ||
| } | ||
|
|
||
| /// Get `sql_name` for Vector of expressions. |
There was a problem hiding this comment.
Instead of adding so many new so specific functions, what do you think about adding something like the following. I think the API would be easier to understand and it would also let us save a String copy when formatting as sql
/// Formats a list of `&Expr` with a custom separator
struct ExprListDisplay<'a> {
exprs: &'a [Expr],
sep: &'a str,
}
...
impl <'a> Display for ExprListDisplay<'a> {
...
}| 51)│ files: 1 │ | ||
| 52)│ format: csv │ | ||
| 53)└───────────────────────────┘ | ||
| 07)│ group_by: string_col │ |
| 70)│ format: memory ││ format: memory │ | ||
| 71)│ rows: 1 ││ rows: 1 │ | ||
| 72)└───────────────────────────┘└───────────────────────────┘ | ||
| 10)│ aggr: count(1) │ |
Thanks @alamb for your review, I will revise it later according to your suggestions. |
| } | ||
|
|
||
| struct SqlDisplay<'a>(&'a Expr); | ||
| impl Display for SqlDisplay<'_> { |
There was a problem hiding this comment.
What is the main difference between Display? I found alias is different but others looks similar
There was a problem hiding this comment.
That's true, it's redundant for most cases.
There was a problem hiding this comment.
Maybe we should take nested expr into consideration, like aggr(case aggr() when...)
datafusion/expr/src/expr.rs
Outdated
| } | ||
| } | ||
| } | ||
| _ => write!(f, "{}", self.0.schema_name()), |
There was a problem hiding this comment.
I think by default should be Display, self.0.to_string()
datafusion/expr/src/expr.rs
Outdated
| } | ||
|
|
||
| /// Get `sql_name` for Vector of expressions. | ||
| pub(crate) fn sql_name_from_exprs_comma_separated_without_space( |
There was a problem hiding this comment.
without_space is the old format and ideally we should standardize the format #10364 so better not to introduce another format without good reason
|
Is it possible to modify |
| 04)│ aggr: sum(bigint_col) │ | ||
| 05)│ │ | ||
| 06)│ group_by: │ | ||
| 07)│ string_col@0 as string_col│ |
There was a problem hiding this comment.
I think it'll be helpful to keep the index @0, what do you think?
There was a problem hiding this comment.
It's true that column indices provide precise tracking and tracing capabilities for complex query analysis, but I think tree explain should be as simple as possible🤔, as doc says:
/// TreeRender, displayed in the `tree` explain type.
///
/// This format is inspired by DuckDB's explain plans. The information
/// presented should be "user friendly", and contain only the most relevant
/// information for understanding a plan. It should NOT contain the same level
/// of detail information as the [`Self::Default`] format.
There was a problem hiding this comment.
I think the key difference Display and the now named human_format is that Display has more details -- I tried to add comments that explain this
I haven't tried it, not sure how it will affect other logic. |
Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
|
Thanks @alamb, @jayzhan211 and @xudong963 for your review, here are two points that remain unclear:
|
Co-authored-by: Andrew Lamb <[email protected]>
I don't think we should keep the index for the tree explain plan
I think this would be a more disruptive change as it would change the existing |
|
I pushed a commit to add more documentation explaining the different options for printing I also pushed a commit to rename Let me know what you think |
| /// | ||
| /// [`ExprFunctionExt`]: crate::expr_fn::ExprFunctionExt | ||
| /// | ||
| /// # Printing Expressions |
There was a problem hiding this comment.
I added this section and the one below to try and explain the uses of the different explain variants.
|
I'll plan to merge this tomorrow unless others would like time to review |
|
Thank you @irenjj @jayzhan211 and @xudong963 🙏 |
|
@irenjj |
I believe that the Perhaps what we should really focus on is ensuring that the explain output correctly displays the |
Which issue does this PR close?
AggregateFunctionExpr#15252Rationale for this change
What changes are included in this PR?
Are these changes tested?
Are there any user-facing changes?