Short circuit optimization for functions executed over Nullable arguments#60129
Short circuit optimization for functions executed over Nullable arguments#60129Avogar merged 61 commits intoClickHouse:masterfrom
Conversation
|
This is an automated comment for commit 3386cbb with description of existing statuses. It's updated for the latest CI running ❌ Click here to open a full report in a separate page
Successful checks
|
|
We should add a setting to be able to turn it on/off |
I'm afraid current changes will slow down performance. Let's keep it as draft. |
|
But we still can have it under a setting that will be disabled by default, it will be useful in some cases. For example, afaiu it can solve this problem: #34854 (comment) |
I'll try. |
Done. Hope for you reviews, thanks! |
|
@Avogar I wonder why "ClickHouse special build check" in CI is always pending : https://s3.amazonaws.com/clickhouse-test-reports/60129/7cf67df78d19c7e2722b423ba311872f6628964c/clickhouse_special_build_check/report.html Do you known why ? |
No I don't. Fast test is failed and in this case we don't run any build checks. Let's ask @Felixoid why it shows pending. |
|
Fast tests is a hard check before other tasks are run, like other builds or full stateless tests. |
|
Both reports are pending because of the crashed server, @Algunenano is right. |
Probably the question was why it is pending when we will not run any builds (as fast test fail), pending status may be misleading that it will be actually run |
Yes, this is the most important thing I care about. Where could I find any valuable clues to fix it ? |
7cf67df to
eb20833
Compare
Crash is fixed now. Let wait for ci. |
|
It is ready for review now. cc @Avogar |
|
@Avogar I can't reproduce the performance regression in my machine. Let's trigger the CI again and wait for the result. |
|
@Avogar do you know why there is no performance tests in the newest CI ? |
Reproduces for me: This PR: Most likely the difference is because of executing |
|
The fasttest is broken |
|
Performance tests are ok now. Failied test |
|
@Avogar |
|
It would be nice to have a performance test that verifies the impact of this feature in the real world. Right now all I see is 300 lines of code changed and some dubious performance reports with several degradations. |
|
@Algunenano we will test this feature in apache gluten. cc @baibaichen |
That's not ok. Changes to CH upstream should have tests in CH upstream. |
|
As suspected and shown by the existing perf tests, it's degrading the performance:
As the perf test show no advantage from having this change, I'm reverting this. Please resubmit with proper proof of the impact and without ignoring degradations in the existing benchmarks. |
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add 2 new settings
short_circuit_function_evaluation_for_nullsandshort_circuit_function_evaluation_for_nulls_thresholdthat allow to execute functions overNullablecolumns in short-circuit manner when the ratio of NULL values in the block of data exceeds the specified threshold. It means that the function will be executed only on rows with non-null values. It applies only to functions that return NULL value for rows where at least one argument is NULL.It also closes #34854
Documentation entry:
Short circuit optimization for defaultImplementationForNulls. It makes sense only when useDefaultImplementationForNulls() = true. For functions with useDefaultImplementationForNulls() = true, and result_type is Nullable(T):
short_circuit_function_evaluation_for_nullsand the ratio of rows containing nulls to total rows exceedsshort_circuit_function_evaluation_for_nulls_threshold, skip evaluation for those rows containing nulls