[wip] a prototype for window functions#18222
Conversation
| if (auto it = window_descriptions.find(window_description.window_name); | ||
| it != window_descriptions.end()) | ||
| { | ||
| assert(it->second.full_sort_description |
There was a problem hiding this comment.
This check looks a little bit clumsy ...
I would rather group by sort full_sort_description initially.
Also, I suppose it may be reasonable group windows by partition_by, cause we can start common sorting for a window by it, and then add finish sorting by full_sort_description for each window function separately.
There was a problem hiding this comment.
I think I'll try to write the complete window matching (group the windows declared ad hoc with same parameters) + window declaration grammar, to figure this out.
There was a problem hiding this comment.
Some windows are just totally identical, and then to some windows we can apply the optimization you describe (group them so that they have a common sorting prefix). I have both these points here in the "rough edges" list: #18097
| // Populate the reverse map. | ||
| for (const auto & f : window_function_descriptions) | ||
| { | ||
| window_descriptions[f.window_name].window_functions.push_back(f); |
There was a problem hiding this comment.
So, we use window_name as grouping key, so it is equal for each window function from one window. OK.
It is interesting if window_name is used somewhere else. If not, we can remove this field form WindowFunctionDescription
There was a problem hiding this comment.
It turned out to be redundant after we group the functions by windows, yes. But I'm not sure where else to put it... Maybe we can just group right away, and remove the common array of window functions and this separate grouping step.
There was a problem hiding this comment.
I managed to remove it.
There was a problem hiding this comment.
This also addresses a couple of the above comments about maps and duplication of window_name.
| col.column = col.type->createColumn(); | ||
| col.name = f.column_name; | ||
|
|
||
| step.actions()->addInput(col); |
There was a problem hiding this comment.
Hm, it is a little bit strange to add window function name as input.
I think it would be logical to add a new action, after the calculation of window function arguments step, which will have this input.
Also it is not clear why we add window function column name to the list of required columns. I suppose it is possible that window function name is not actually used, and we can skip window function evaluation?
There was a problem hiding this comment.
Also, it probably may be done the same way as with aggregation.
The idea is that we can evaluate the list of columns after aggregation (aggregated_columns), and start building new ExpressionActionsChain starting from this list. Maybe we can have the same fixed list of columns after window function.
I don't say if it is a good approach and will be easier, though.
There was a problem hiding this comment.
Added a comment about the various approaches.
The aggregation one is logically similar to INPUT actions, but with more code (i.e. it also add columns as an input to the action chain, but in a different way). And we'll have to remove it anyway and add a proper WINDOW action. So I think we can keep the INPUT actions for now.
| * NOTE: if `ast` is a SELECT query from a table, the structure of this table should not change during the lifetime of ExpressionAnalyzer. | ||
| */ | ||
| class ExpressionAnalyzer : protected ExpressionAnalyzerData, private boost::noncopyable | ||
| class ExpressionAnalyzer : public ExpressionAnalyzerData, private boost::noncopyable |
There was a problem hiding this comment.
I think it should have been an all-public immutable struct. "Private data + accessors/modifiers" are for things that change over time and have to maintain complex invariants. On the contrary, the ExpressionAnalyzerData is built once at a particular stage of query analysis, and doesn't change. So "all-public immutable" would have been more suitable. But I don't feel like redoing it all now, so for uniformity let's change it back to private add an accessor for window descriptions.
|
@akuzm |
Took it from Postgres. Probably we should convert all our tests to echo queries, just did it the quick way for now. |
|
The build failure after merge is because of infrastructure problem. |
Changelog category (leave one):