Short circuit optimization for functions executed over Nullable arguments by taiyang-li · Pull Request #60129 · ClickHouse/ClickHouse

taiyang-li · 2024-02-19T10:40:19Z

Changelog category (leave one):

Performance Improvement

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Add two new settings, short_circuit_function_evaluation_for_nulls and short_circuit_function_evaluation_for_nulls_threshold, that allow functions over Nullable columns to be executed in a short-circuit manner when the ratio of NULL values in a block of data exceeds the specified threshold. The function is then executed only on rows with non-NULL values. This applies only to functions that return NULL for rows where at least one argument is NULL.

It also closes #34854

Documentation entry:

Short-circuit optimization for defaultImplementationForNulls. It applies only when useDefaultImplementationForNulls() = true. For functions with useDefaultImplementationForNulls() = true whose result_type is Nullable(T):

- If every row contains an argument with a NULL value, skip function evaluation and directly return a column in which all rows are NULL.
- If short_circuit_function_evaluation_for_nulls is enabled and the ratio of rows containing NULLs to total rows exceeds short_circuit_function_evaluation_for_nulls_threshold, skip evaluation for the rows containing NULLs.
- Otherwise, process the rows as before.
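
The three cases above can be sketched in a few lines of Python. This is only a conceptual model, not the ClickHouse C++ implementation: columns are plain lists, None stands for NULL, and the names run_over_nullable, default, enabled and threshold are hypothetical. The "process as before" branch is modelled by evaluating the function on every row with a placeholder in place of NULL and masking the result afterwards, which is an assumption about the old behaviour.

```python
from typing import Callable, List, Optional, Sequence


def run_over_nullable(
    func: Callable[..., object],
    columns: Sequence[Sequence[Optional[object]]],
    default: object,
    enabled: bool,
    threshold: float,
) -> List[Optional[object]]:
    """Conceptual model of the three cases; None plays the role of NULL."""
    n_rows = len(columns[0]) if columns else 0
    # A row counts as NULL if at least one of its arguments is NULL.
    has_null = [any(col[i] is None for col in columns) for i in range(n_rows)]
    null_rows = sum(has_null)

    # Case 1: every row has a NULL argument, so nothing is evaluated at all.
    if n_rows > 0 and null_rows == n_rows:
        return [None] * n_rows

    # Case 2: setting enabled and NULL ratio above the threshold,
    # so the function runs only on rows without NULLs.
    if enabled and n_rows > 0 and null_rows / n_rows > threshold:
        return [
            None if has_null[i] else func(*[col[i] for col in columns])
            for i in range(n_rows)
        ]

    # Case 3 (assumed old behaviour): evaluate every row with a placeholder
    # substituted for NULL, then mask the NULL rows in the result.
    results = [
        func(*[default if col[i] is None else col[i] for col in columns])
        for i in range(n_rows)
    ]
    return [None if has_null[i] else results[i] for i in range(n_rows)]


# Usage: three of four rows are NULL, so with threshold 0.5 only one row is evaluated.
x = [None, None, None, "a"]
y = ["hello", "hello", "hello", "hello"]
result = run_over_nullable(lambda a, b: a + b, [x, y], "", True, 0.5)
```

In this model the trade-off discussed in the thread is visible: cases 1 and 2 save calls to func, but computing has_null costs an extra pass over every argument even when no row is NULL.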

SQL:
with null::Nullable(String) as x, 'hello' as y, ' clickhouse' as z  select concat(materialize(x), materialize(y), materialize(z)) from numbers(10000000) format Null;   

Before: 
0 rows in set. Elapsed: 0.253 sec. Processed 10.00 million rows, 80.00 MB (39.57 million rows/s., 316.54 MB/s.)
Peak memory usage: 4.00 MiB.

After:
0 rows in set. Elapsed: 0.518 sec. Processed 10.00 million rows, 80.00 MB (19.30 million rows/s., 154.40 MB/s.)
Peak memory usage: 11.33 MiB.

robot-clickhouse-ci-2 · 2024-02-19T10:41:22Z

This is an automated comment for commit 3386cbb with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Check name	Description	Status
Integration tests	The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests	❌ failure

Successful checks

Check name	Description	Status
AST fuzzer	Runs randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help	✅ success
Builds	There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS	✅ success
ClickBench	Runs [ClickBench](https://github.com/ClickHouse/ClickBench/) with instant-attach table	✅ success
Compatibility check	Checks that clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help	✅ success
Docker keeper image	The check to build and optionally push the mentioned image to docker hub	✅ success
Docker server image	The check to build and optionally push the mentioned image to docker hub	✅ success
Docs check	Builds and tests the documentation	✅ success
Fast test	Normally this is the first check that is run for a PR. It builds ClickHouse and runs most of stateless functional tests, omitting some. If it fails, further checks are not started until it is fixed. Look at the report to see which tests fail, then reproduce the failure locally as described here	✅ success
Flaky tests	Checks if newly added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer, and additional randomization of thread scheduling. Integration tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests, read the doc	✅ success
Install packages	Checks that the built packages are installable in a clear environment	✅ success
Performance Comparison	Measure changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests	✅ success
Stateful tests	Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	✅ success
Stateless tests	Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc	✅ success
Stress test	Runs stateless functional tests concurrently from several clients to detect concurrency-related errors	✅ success
Style check	Runs a set of checks to keep the code style clean. If some of tests failed, see the related log from the report	✅ success
Unit tests	Runs the unit tests for different release types	✅ success
Upgrade check	Runs stress tests on server version from last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully startup without any errors, crashes or sanitizer asserts	✅ success

Avogar · 2024-02-19T12:55:53Z

We should add a setting to be able to turn it on/off

taiyang-li · 2024-02-24T13:10:59Z

> We should add a setting to be able to turn it on/off

I'm afraid the current changes will slow down performance. Let's keep it as a draft.

Avogar · 2024-02-26T12:11:22Z

But we can still have it under a setting that is disabled by default; it will be useful in some cases. For example, AFAIU it can solve this problem: #34854 (comment)

taiyang-li · 2024-02-27T02:11:38Z

> But we can still have it under a setting that is disabled by default; it will be useful in some cases. For example, AFAIU it can solve this problem: #34854 (comment)

I'll try.

taiyang-li · 2024-02-28T08:01:44Z

> But we can still have it under a setting that is disabled by default; it will be useful in some cases. For example, AFAIU it can solve this problem: #34854 (comment)

Done. Looking forward to your review, thanks!

taiyang-li · 2024-03-06T07:48:44Z

@Avogar I wonder why "ClickHouse special build check" in CI is always pending: https://s3.amazonaws.com/clickhouse-test-reports/60129/7cf67df78d19c7e2722b423ba311872f6628964c/clickhouse_special_build_check/report.html

Do you know why?

Avogar · 2024-03-06T12:09:16Z

> Do you know why?

No, I don't. The Fast test failed, and in that case we don't run any build checks. Let's ask @Felixoid why it shows as pending.
BTW, the server crashes in the Fast test and the crash is related to these changes; please take a look and fix it.

Algunenano · 2024-03-06T12:16:23Z

The Fast test is a hard check that must pass before other tasks run, such as other builds or the full stateless tests.

Felixoid · 2024-03-06T13:22:21Z

Both reports are pending because of the crashed server, @Algunenano is right.

Avogar · 2024-03-06T13:45:57Z

> Both reports are pending because of the crashed server, Algunenano is right.

The question was probably why the status is pending when no builds will be run (because the Fast test failed); a pending status misleadingly suggests that they will actually run.

taiyang-li · 2024-03-18T02:41:43Z

> > Both reports are pending because of the crashed server, Algunenano is right.
>
> The question was probably why the status is pending when no builds will be run (because the Fast test failed); a pending status misleadingly suggests that they will actually run.

Yes, this is what I care about most. Where can I find clues to fix it?

taiyang-li · 2024-04-02T06:54:09Z

> Both reports are pending because of the crashed server, @Algunenano is right.

The crash is fixed now. Let's wait for CI.

taiyang-li · 2024-04-12T07:23:43Z

It is ready for review now. cc @Avogar

taiyang-li · 2024-11-04T04:08:25Z

@Avogar I can't reproduce the performance regression on my machine. Let's trigger the CI again and wait for the result.

taiyang-li · 2024-11-05T03:37:48Z

@Avogar Do you know why there are no performance tests in the latest CI run?

Avogar · 2024-11-13T13:08:49Z

> @Avogar I can't reproduce the performance regression on my machine. Let's trigger the CI again and wait for the result.

Reproduces for me:
Master:

:) SELECT count() FROM (SELECT toNullable(materialize(1)) AS x1, toNullable(materialize(1)) AS x2 FROM zeros(100000000)) WHERE NOT ignore(xor(x1,x2))

   ┌───count()─┐
1. │ 100000000 │ -- 100.00 million
   └───────────┘

1 row in set. Elapsed: 0.029 sec.

:) SELECT sumCountIf(key, key != -1) FROM ( SELECT materialize(toNullable(number)) AS key FROM numbers(100000000) ) FORMAT Null

Ok.

0 rows in set. Elapsed: 0.072 sec.

This PR:

:) SELECT count() FROM (SELECT toNullable(materialize(1)) AS x1, toNullable(materialize(1)) AS x2 FROM zeros(100000000)) WHERE NOT ignore(xor(x1,x2))


   ┌───count()─┐
1. │ 100000000 │ -- 100.00 million
   └───────────┘

1 row in set. Elapsed: 0.152 sec.

:) SELECT sumCountIf(key, key != -1) FROM ( SELECT materialize(toNullable(number)) AS key FROM numbers(100000000) ) FORMAT Null

Ok.

0 rows in set. Elapsed: 0.144 sec.

Most likely the difference comes from executing extractInvertedMask on the null mask of each Nullable argument. So we probably can't enable the new behaviour by default.
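
To make the suspected overhead concrete, here is a minimal Python sketch of combining the null maps of several Nullable arguments into one "row has no NULL" mask. It is an assumption about what the mask extraction amounts to conceptually (invert each null map and AND it into a running mask), not the actual ClickHouse routine; the name combined_not_null_mask is hypothetical.

```python
from typing import List, Sequence


def combined_not_null_mask(null_maps: Sequence[Sequence[bool]]) -> List[bool]:
    """Return True for rows in which no argument is NULL.

    null_maps holds one null map per Nullable argument; True means NULL.
    """
    n_rows = len(null_maps[0]) if null_maps else 0
    mask = [True] * n_rows
    # One full pass over the rows for each Nullable argument,
    # paid even when the data contains no NULL at all.
    for null_map in null_maps:
        for i, is_null in enumerate(null_map):
            mask[i] = mask[i] and not is_null
    return mask
```

For cheap functions such as xor over 100 million rows, an extra pass of this kind per argument can plausibly be comparable to the cost of the function itself, which would be consistent with the slowdown measured above when no rows are NULL.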

taiyang-li · 2024-11-14T03:30:01Z

@Avogar I have improved the performance in 932caea. Let's wait for the latest performance report.

Avogar · 2024-11-14T12:14:28Z

The fasttest is broken

Avogar · 2024-11-18T12:04:42Z

Performance tests are OK now. The failed test 02809_storage_set_analysis_bug is related to the changes (it's flaky and fails only sometimes; you can run this test with clickhouse-test --test-runs 100 -j 10 to reproduce).

taiyang-li · 2024-11-19T03:39:04Z

@Avogar 02809_storage_set_analysis_bug fails because, for the in function, the first argument contains NULL in every row. Execution is short-circuited, so instead of throwing an exception it returns 0.
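
The effect described here, where skipping evaluation also skips an error the function would have raised, can be shown with a small Python sketch. It is illustrative only: None stands for NULL, failing_lookup is a hypothetical stand-in for a function whose evaluation throws, and the sketch returns NULL placeholders rather than the 0 mentioned above.

```python
from typing import Callable, List, Optional, Sequence


def evaluate_all_rows(func: Callable, column: Sequence[Optional[object]]) -> List[object]:
    """Old path: the function body runs for every row, so any error surfaces."""
    return [func(value) for value in column]


def evaluate_short_circuit(func: Callable, column: Sequence[Optional[object]]) -> List[object]:
    """Short-circuit path: rows whose argument is NULL never reach the function."""
    return [None if value is None else func(value) for value in column]


def failing_lookup(value: object) -> object:
    # Hypothetical function that always fails when it is actually evaluated.
    raise RuntimeError("cannot evaluate")


all_null = [None, None, None]
# evaluate_all_rows(failing_lookup, all_null) raises RuntimeError,
# while the short-circuit path below completes without calling failing_lookup.
skipped = evaluate_short_circuit(failing_lookup, all_null)
```

A test that expects the exception therefore behaves differently once the short-circuit path is taken, which matches the failure mode reported for the flaky test.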

Algunenano · 2024-11-19T13:58:50Z

It would be nice to have a performance test that verifies the impact of this feature in the real world. Right now all I see is 300 lines of code changed and some dubious performance reports with several degradations.

taiyang-li · 2024-11-20T07:08:07Z

@Algunenano We will test this feature in Apache Gluten. cc @baibaichen

Algunenano · 2024-11-20T11:27:17Z

> @Algunenano we will test this feature in apache gluten. cc @baibaichen

That's not ok. Changes to CH upstream should have tests in CH upstream.

Algunenano · 2024-11-22T11:49:26Z

As suspected and shown by the existing perf tests, it's degrading the performance:

SELECT count() FROM (SELECT toNullable(materialize(1)) AS x1, toNullable(materialize(1)) AS x2 FROM zeros(100000000000)) WHERE NOT ignore(xor(x1,x2))

Before the change: 7.70 GB/s.
After the change: 6.02 GB/s.

As the perf tests show no advantage from this change, I'm reverting it. Please resubmit with proper proof of the impact and without ignoring degradations in the existing benchmarks.

short circuit for defaultImplementationForNulls

2febfb4

robot-clickhouse-ci-2 added the pr-performance Pull request with some performance improvements label Feb 19, 2024

alexey-milovidov added the can be tested Allows running workflows for external contributors label Feb 19, 2024

Avogar self-assigned this Feb 19, 2024

taiyang-li marked this pull request as draft February 24, 2024 13:11

taiyang-li added 2 commits February 27, 2024 20:00

Merge branch 'master' into short_circut_func

9715292

fix wrong uts

2ffbb7c

taiyang-li marked this pull request as ready for review February 28, 2024 07:13

add settings allow_short_circuit_default_implementations_for_null

ef1e64a

taiyang-li added 2 commits April 2, 2024 12:20

Merge remote-tracking branch 'origin/master' into short_circut_func

e498e76

short_circut_func

eb20833

taiyang-li force-pushed the short_circut_func branch from 7cf67df to eb20833 Compare April 2, 2024 06:51

Avogar reviewed Apr 2, 2024

taiyang-li added 3 commits April 7, 2024 11:03

Merge remote-tracking branch 'origin/master' into short_circut_func

edbfb2d

change as requested

89004bd

fix conflicts

2e4d312

Merge branch 'master' into short_circut_func

755e02f

Merge branch 'ClickHouse:master' into short_circut_func

3d78c49

Merge branch 'ClickHouse:master' into short_circut_func

88cf6ef

taiyang-li added 3 commits November 14, 2024 11:24

improve performance

932caea

Merge branch 'short_circut_func' of https://github.com/bigo-sg/ClickH…

1e1ea07

…ouse into short_circut_func

Merge remote-tracking branch 'origin/master' into short_circut_func

02de47c

taiyang-li added 3 commits November 18, 2024 10:30

fix failed uts

f4e1866

Merge branch 'master' into short_circut_func

b706458

Merge branch 'ClickHouse:master' into short_circut_func

b1e816f

taiyang-li added 2 commits November 19, 2024 11:43

fix failed uts

47944a4

Merge branch 'short_circut_func' of https://github.com/bigo-sg/ClickH…

3386cbb

…ouse into short_circut_func

Avogar approved these changes Nov 19, 2024

Avogar changed the title ~~Short circuit optimization for defaultImplementationForNulls~~ Short circuit optimization for functions executed over Nullable arguments Nov 19, 2024

Avogar added this pull request to the merge queue Nov 19, 2024

Merged via the queue into ClickHouse:master with commit e3e4e45 Nov 19, 2024

robot-ch-test-poll2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Nov 19, 2024

Algunenano mentioned this pull request Nov 22, 2024

Revert "Short circuit optimization for functions executed over Nullable arguments" #72258

Merged

taiyang-li mentioned this pull request Jan 8, 2025

Revert "Revert "Short circuit optimization for functions executed over Nullable arguments"" #73820

Merged

23 tasks
