Supply consistent format output for FileScanConfig params #6202
alamb merged 5 commits into apache:main
"AggregateExec: mode=Final, gby=[], aggr=[COUNT(2)]",
"AggregateExec: mode=Partial, gby=[], aggr=[COUNT(1)]",
"ParquetExec: limit=None, partitions={1 group: [[x]]}, projection=[a, b, c]",
"ParquetExec: file_groups={1 group: [[x]]}, projection=[a, b, c], limit=None, output_ordering=[]",
Would it be possible to keep the existing behavior and only print out limit, projection and output_ordering if they were non empty?
So like in this case it would be
file_groups={1 group: [[x]]}, projection=[a, b, c]",
Because limit is None and output_ordering is empty?
The rationale is to keep the display easier to read by keeping the output concise
Yup, on second thought I agree that a compact output will be better.
Made the change accordingly!
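A minimal sketch of the compact behavior requested above, assuming a simplified stand-in for the real config. `ScanConfig` and its plain `String` fields are hypothetical, and the plural "groups" wording is an assumption. The optional parts (`projection`, `limit`, `output_ordering`) are written only when they carry information:

```rust
use std::fmt::{self, Display, Formatter};

/// Hypothetical, simplified stand-in for `FileScanConfig`; the real struct
/// holds richer types (partitioned files, schema, sort expressions).
struct ScanConfig {
    file_groups: Vec<Vec<String>>,
    projection: Vec<String>,
    limit: Option<usize>,
    output_ordering: Vec<String>,
}

impl Display for ScanConfig {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        // file_groups is always printed.
        let n = self.file_groups.len();
        let groups: Vec<String> = self
            .file_groups
            .iter()
            .map(|g| format!("[{}]", g.join(", ")))
            .collect();
        write!(
            f,
            "file_groups={{{} group{}: [{}]}}",
            n,
            if n == 1 { "" } else { "s" }, // assumed pluralization
            groups.join(", ")
        )?;

        // The remaining fields are skipped when empty or None.
        if !self.projection.is_empty() {
            write!(f, ", projection=[{}]", self.projection.join(", "))?;
        }
        if let Some(limit) = self.limit {
            write!(f, ", limit={}", limit)?;
        }
        if !self.output_ordering.is_empty() {
            write!(f, ", output_ordering=[{}]", self.output_ordering.join(", "))?;
        }
        Ok(())
    }
}
```

With one file group `[x]`, projection `[a, b, c]`, no limit and no ordering, `to_string()` on this sketch yields `file_groups={1 group: [[x]]}, projection=[a, b, c]`, which is the compact form suggested in the review.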
super::FileGroupsDisplay(&self.base_config.file_groups),
self.base_config.limit,
)
write!(f, "AvroExec: {}", self.base_config())
❤️ that really looks much nicer
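To illustrate why the delegated form reads better, here is a rough sketch in which every scan operator reuses one shared `Display` impl and appends only its own extras. `BaseConfig`, `AvroScan` and `CsvScan` are hypothetical names; the real operators format themselves through their plan-formatting method (`fmt_as`) rather than a direct `Display` impl, so this is a simplification:

```rust
use std::fmt::{self, Display, Formatter};

/// Hypothetical shared config; stands in for `FileScanConfig`.
struct BaseConfig {
    file_groups: String,
    projection: Vec<String>,
}

impl Display for BaseConfig {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "file_groups={}, projection=[{}]",
            self.file_groups,
            self.projection.join(", ")
        )
    }
}

/// Hypothetical operator with no format-specific options.
struct AvroScan {
    base_config: BaseConfig,
}

/// Hypothetical operator with one format-specific option.
struct CsvScan {
    base_config: BaseConfig,
    has_header: bool,
}

impl Display for AvroScan {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        // The shared part comes entirely from BaseConfig's Display impl.
        write!(f, "AvroExec: {}", self.base_config)
    }
}

impl Display for CsvScan {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        // Format-specific fields are appended after the shared part.
        write!(f, "CsvExec: {}, has_header={}", self.base_config, self.has_header)
    }
}
```

Any change to the shared field order or naming then happens in one place, and each operator's output stays consistent with the others.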
}
}
fn make_output_ordering_string(ordering: &[PhysicalSortExpr]) -> String {
I think you could avoid the extra copy / materialization by doing something like
struct DisplayableOrdering<'a>(&'a [PhysicalSortExpr]);
impl<'a> Display for DisplayableOrdering<'a> {
    fn fmt(&self, f: &mut Formatter) -> FmtResult {
        // use self.0 here and call write! to f
    }
}
Alternatively, you could probably use join here instead as well: https://doc.rust-lang.org/std/primitive.slice.html#method.join
Thanks I made the change with a wrapper.
However, I kept the imperative style instead of using join, since I think join would do an actual copy, though I don't think performance matters much on this path.
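For reference, the wrapper approach discussed here could look roughly like the sketch below. `SortExpr` is a hypothetical stand-in for `PhysicalSortExpr`, and its formatting is illustrative only. The wrapper borrows the slice and writes each element straight into the formatter, so no intermediate `String` is built:

```rust
use std::fmt::{self, Display, Formatter};

/// Hypothetical stand-in for `PhysicalSortExpr`; formatting is illustrative.
struct SortExpr {
    expr: String,
    descending: bool,
    nulls_first: bool,
}

impl Display for SortExpr {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        let dir = if self.descending { "DESC" } else { "ASC" };
        let nulls = if self.nulls_first { "NULLS FIRST" } else { "NULLS LAST" };
        write!(f, "{} {} {}", self.expr, dir, nulls)
    }
}

/// Borrowing wrapper: holds a slice, owns nothing.
struct DisplayableOrdering<'a>(&'a [SortExpr]);

impl Display for DisplayableOrdering<'_> {
    fn fmt(&self, f: &mut Formatter<'_>) -> fmt::Result {
        write!(f, "[")?;
        for (i, e) in self.0.iter().enumerate() {
            // Separator goes before every element except the first.
            if i > 0 {
                write!(f, ", ")?;
            }
            write!(f, "{}", e)?;
        }
        write!(f, "]")
    }
}
```

A caller would then write something like `write!(f, ", output_ordering={}", DisplayableOrdering(&ordering))`. The slice `join` alternative would first require turning each element into a `String` and then allocates the joined result, which is the extra copy mentioned above.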
TableScan: aggregate_test_100_with_order projection=[c1]
physical_plan
GlobalLimitExec: skip=0, fetch=10
CsvExec: file_groups={1 group: [[WORKSPACE_ROOT/testing/data/csv/aggregate_test_100.csv]]}, projection=[c1], limit=None, output_ordering=[c1@0 ASC NULLS LAST], has_header=true
}
}
impl Display for FileScanConfig {
this looks great -- thank you
I took the liberty of merging this branch from main, fixing clippy and updating some remaining sqllogictests to get CI to pass

Thanks again @tz70s

Thanks @alamb for the review & help!
Which issue does this PR close?
Closes #6194.
Rationale for this change
Implemented the Display trait for FileScanConfig so that it is applied consistently in the formatting of all Exec plan blocks. Only the fields currently in use are included.
Highlight:
project method while calling fmt_as without introducing cyclic dependency, should be manageable as explain is not critical.
file_groups, projection, limit, output_ordering) to make the output deterministic and consistent, e.g. previously we print limit=None rather than omitting it, keep this behavior. The drawback of this approach is that each time we add a new field (even one not appearing in the plan), the test output changes massively, but even omitting the field may not fully solve the problem for extending fields.
What changes are included in this PR?
Implemented the Display trait for FileScanConfig so that it is applied consistently in the formatting of all Exec plan blocks.
Are these changes tested?
Yes
Are there any user-facing changes?
Yes (explain output)