Rewrite IN (subquery) so that it can be executed as JOIN instead of CreatingSets by davenger · Pull Request #83991 · ClickHouse/ClickHouse

davenger · 2025-07-18T16:02:50Z

The implementation rewrites
x IN subquery
to
EXISTS (SELECT 1 FROM (SELECT * AS _unique_name_ FROM subquery) WHERE x = _unique_name_ LIMIT 1)
and the EXISTS expression is then rewritten into a JOIN by the decorrelation logic.
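
To make the shape of the rewrite concrete, here is a minimal, self-contained C++ sketch that assembles the rewritten text from its parts. It is only an illustration and not the code in this PR, which works on query tree nodes rather than strings. The function `rewriteInToExists` and its parameters are hypothetical names. The `negated` flag reflects the NOT IN to NOT EXISTS case named in the commit list.

```cpp
#include <string>

// Hypothetical helper (not part of ClickHouse): renders the EXISTS form
// that `lhs IN subquery` is rewritten to, as plain text.
// `unique_name` stands for the generated alias (_unique_name_ above).
// `negated` covers NOT IN, which becomes NOT EXISTS.
std::string rewriteInToExists(const std::string & lhs,
                              const std::string & subquery,
                              const std::string & unique_name,
                              bool negated = false)
{
    // SELECT * AS _unique_name_ FROM subquery
    std::string inner = "SELECT * AS " + unique_name + " FROM " + subquery;

    // EXISTS (SELECT 1 FROM (<inner>) WHERE lhs = _unique_name_ LIMIT 1)
    std::string exists = "EXISTS (SELECT 1 FROM (" + inner + ") WHERE "
        + lhs + " = " + unique_name + " LIMIT 1)";

    return negated ? "NOT " + exists : exists;
}
```

Called with `x`, `subquery` and `_unique_name_`, the helper returns the EXISTS text quoted above.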

Changelog category (leave one):

Not for changelog (changelog entry is not required)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

...

Documentation entry for user-facing changes

Documentation is written (mandatory for new features)

clickhouse-gh · 2025-07-18T16:03:15Z

Workflow [PR], commit [da741fb]

novikd · 2025-08-20T12:18:16Z

src/Analyzer/Resolve/QueryAnalyzer.cpp

+        {
+            auto & in_second_argument = function_in_arguments_nodes[1];
+
+            if (in_second_argument->as<QueryNode>())


You can't check if it's a query until you resolve this node. Example:

WITH t AS (SELECT number FROM numbers(10)) SELECT * FROM numbers(20) WHERE number IN t

The second argument here will be an IdentifierNode.

Added resolving of the node here.
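
The point about resolution order can be shown with a toy model. The sketch below is a hedged illustration, not the ClickHouse classes: `Node`, `resolve`, `makeNode` and `exampleCtes` are hypothetical stand-ins. It only shows why a type check done before resolution misses a CTE reference such as `number IN t`.

```cpp
#include <map>
#include <memory>
#include <string>

// Toy stand-in for a query tree node; not the real IdentifierNode/QueryNode.
struct Node
{
    enum class Kind { Identifier, Query };
    Kind kind;
    std::string text; // identifier name, or query text for Query nodes
};

using NodePtr = std::shared_ptr<Node>;
using CTEs = std::map<std::string, NodePtr>;

NodePtr makeNode(Node::Kind kind, const std::string & text)
{
    NodePtr node = std::make_shared<Node>();
    node->kind = kind;
    node->text = text;
    return node;
}

// One CTE named `t`, as in the example query above.
CTEs exampleCtes()
{
    CTEs ctes;
    ctes["t"] = makeNode(Node::Kind::Query, "SELECT number FROM numbers(10)");
    return ctes;
}

// Replace an identifier that names a CTE with the query it refers to.
NodePtr resolve(const NodePtr & node, const CTEs & ctes)
{
    if (node->kind == Node::Kind::Identifier)
    {
        CTEs::const_iterator it = ctes.find(node->text);
        if (it != ctes.end())
            return it->second;
    }
    return node;
}

// Checking the kind before resolution: `t` still looks like an identifier.
bool isSubqueryUnresolved(const NodePtr & arg)
{
    return arg->kind == Node::Kind::Query;
}

// Resolving first lets the same check see the underlying query.
bool isSubqueryResolved(const NodePtr & arg, const CTEs & ctes)
{
    return resolve(arg, ctes)->kind == Node::Kind::Query;
}
```

For the identifier `t`, `isSubqueryUnresolved` returns false while `isSubqueryResolved` returns true, which is the gap the review comment describes.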

novikd · 2025-08-20T12:20:49Z

src/Analyzer/Resolve/QueryAnalyzer.cpp

+                internal_exists_subquery->getProjection().getNodes().push_back(std::make_shared<IdentifierNode>(Identifier{unique_column_name}));
+                internal_exists_subquery->getJoinTree() = std::move(subquery_node);
+
+                /// SELECT 1 FROM (SELECT * AS _unique_name_ FROM subquery) WHERE a = _unique_name_ LIMIT 1


Does it work with tuples?

Added support for subqueries returning multiple columns.

novikd · 2025-08-20T12:21:39Z

src/Analyzer/Resolve/QueryAnalyzer.cpp

+        (function_name == "in" || function_name == "notIn") &&
+        scope.context->getSettingsRef()[Setting::rewrite_in_to_join])
+    {
+        if (!scope.context->getSettingsRef()[Setting::allow_experimental_correlated_subqueries])


Maybe fall back to CreatingSets instead of throwing an exception? Both options are okay for me.

I think it's better not to silently fall back here, because that might hide errors.
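
The choice discussed here (fail loudly rather than fall back) can be sketched as follows. This is a simplified model under stated assumptions, not the code from the diff: the `Settings` struct, `makeSettings`, `shouldRewriteIn` and `throwsFor` are hypothetical, and `std::runtime_error` with its message stands in for whatever exception the real code raises. Only the two setting names and the `in` / `notIn` function names are taken from the diff.

```cpp
#include <stdexcept>
#include <string>

// Toy settings holder; the field names mirror the settings in the diff.
struct Settings
{
    bool rewrite_in_to_join;
    bool allow_experimental_correlated_subqueries;
};

Settings makeSettings(bool rewrite, bool correlated)
{
    Settings s;
    s.rewrite_in_to_join = rewrite;
    s.allow_experimental_correlated_subqueries = correlated;
    return s;
}

// Returns true when IN / NOT IN should be rewritten.
// If the rewrite is requested but correlated subqueries are not allowed,
// it throws instead of quietly keeping the CreatingSets path, so a
// misconfiguration is visible to the user.
bool shouldRewriteIn(const std::string & function_name, const Settings & settings)
{
    bool is_in = function_name == "in" || function_name == "notIn";
    if (!is_in || !settings.rewrite_in_to_join)
        return false;

    if (!settings.allow_experimental_correlated_subqueries)
        throw std::runtime_error(
            "rewrite_in_to_join requires allow_experimental_correlated_subqueries");

    return true;
}

// Demonstration helper: reports whether the call throws.
bool throwsFor(const std::string & function_name, const Settings & settings)
{
    try
    {
        shouldRewriteIn(function_name, settings);
    }
    catch (const std::runtime_error &)
    {
        return true;
    }
    return false;
}
```

In this model, enabling only `rewrite_in_to_join` makes an `in` call throw, while functions other than `in` and `notIn` are left untouched.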

EmeraldShift · 2025-08-28T19:27:33Z

Does this transformation work when the key x is in the primary key, or has special index analysis, like _part_offset + _part_starting_offset? For my use case, I have a query like this:

WITH offsets AS
    (
        SELECT _part_starting_offset + _parent_part_offset AS offset
        FROM mergeTreeProjection('otel', 'spans', 'by_span')
        WHERE trace_id = unhex({trace:String})
    )
SELECT *
FROM otel.spans
WHERE (_part_offset + _part_starting_offset) IN (offsets)

There are ~90 million matching rows in the projection, and CreatingSetsTransformation takes a very long time:

CreatingSetsTransform: Created Set with 89792618 entries from 89792618 rows in 9.025697714 sec.

But the rest of the query is very fast because the part offsets are included in primary index analysis, and can quickly filter parts and granules.
However, if I try to manually rewrite the query as this PR suggests:

WITH offsets AS
    (
        SELECT _part_starting_offset + _parent_part_offset AS offset
        FROM mergeTreeProjection('otel', 'spans', 'by_span')
        WHERE trace_id = unhex({trace:String})
    )
SELECT *
FROM otel.spans
WHERE exists((
    SELECT 1
    FROM
    (
        SELECT offset
        FROM offsets
    )
    WHERE (_part_offset + _part_starting_offset) = offset
    LIMIT 1
))

then the query takes even longer, and it appears to perform a full table scan.

Is there some way to eliminate the lag of CreatingSets, but also utilize the primary key for part offsets? Then, at any size, it would be fast to join against the right side of IN.

davenger · 2025-08-29T10:03:50Z

> Is there some way to eliminate the lag of CreatingSets, but also utilize the primary key for part offsets?

I think you can try to play with this sub-query to speed it up. Does it also take 9 sec when run separately?

 SELECT _part_starting_offset + _parent_part_offset AS offset
         FROM mergeTreeProjection('otel', 'spans', 'by_span')
         WHERE trace_id = unhex({trace:String}) FORMAT Null

> Then, at any size, it would be fast to join against the right side of IN

You're right, just rewriting IN to a join will not help; there needs to be an optimization that allows skipping data during reading. Maybe this optimization can help: #81526

EmeraldShift · 2025-08-29T11:15:12Z

> I think you can try to play with this sub-query to speed it up. Does it also take 9 sec when run separately?

The subquery only takes ~2 seconds to complete. Then, after it's complete, the CreatingSetsTransformation takes an additional ~9 seconds to transform the result into a set for use with IN, and finally the main query runs quickly, due to the primary index analysis on the part offset columns.

> Maybe this optimization can help #81526

Does this work for the primary index too? Notably I am not utilizing any skip indexes in the main query, just the special part offset columns.

At any rate, it seems there are two separate issues:

1. CreatingSets can be very slow for large sets (is this expected? Maybe it can be optimized? Or maybe it's bad and that's why this PR exists?)
2. This JOIN transformation eliminated the cost of CreatingSets for my query (yay!) but also seems to have eliminated index analysis (sad!) on the intermediate result. I don't know how to recover the original behavior of the main query with this transformation.

novikd

LGTM

novikd · 2025-09-29T15:19:27Z

src/Analyzer/Resolve/resolveFunction.cpp

+                /// SELECT * AS _unique_name_ FROM subquery
+                auto internal_exists_subquery = std::make_shared<QueryNode>(Context::createCopy(scope.context));
+                internal_exists_subquery->setIsSubquery(true);
+                internal_exists_subquery->getProjection().getNodes().push_back(std::make_shared<IdentifierNode>(Identifier{unique_column_name}));


In the future, this can be replaced with a ColumnNode.

novikd · 2025-09-29T15:19:37Z

src/Analyzer/Resolve/resolveFunction.cpp

+
+                    auto & copy_of_in_first_parameter = function_in_arguments_nodes[0];
+
+                    auto subquery_projection = std::make_shared<IdentifierNode>(Identifier{unique_column_name});


davenger · 2025-09-30T07:02:08Z

Test failures:
AST fuzzer (amd_tsan) #86957
Integration tests (arm_binary, distributed plan, 1/4) #87787

EmeraldShift · 2025-09-30T12:57:36Z

Just double checking, is this transformation going to affect queries like my earlier comment, which leverage the primary key analysis between running the subquery and the main query? If it will have a negative impact, then is the transformation optional?

davenger · 2025-09-30T13:19:31Z

@EmeraldShift This transformation is optional and disabled by default; it is controlled by the rewrite_in_to_join setting.
Currently, the rewritten query will not use indexes for the IN condition that was transformed.

davenger marked this pull request as draft July 18, 2025 16:02

clickhouse-gh bot added the pr-not-for-changelog This PR should not be mentioned in the changelog label Jul 18, 2025

davenger force-pushed the rewrite_in_subquery branch from 3743db6 to 80832c4 Compare July 18, 2025 16:22

novikd self-assigned this Jul 21, 2025

davenger force-pushed the rewrite_in_subquery branch from aa5a628 to 68b1e3c Compare July 24, 2025 15:44

davenger force-pushed the rewrite_in_subquery branch 4 times, most recently from a5d861e to 4ad68dc Compare August 11, 2025 07:49

davenger added 6 commits August 11, 2025 16:22

Rewrite IN (subquery) so that it can be executed as JOIN instead of CreatingSets

3140d7f

Rewrite only if allow_experimental_correlated_subqueries is enabled

580b0a1

typo

17ec500

Tests for various cases of rewriting IN

7f720fa

Change rewriting logic to address column name collisions from inner and outer scopes

1acc09b

More test cases

016e4d9

davenger force-pushed the rewrite_in_subquery branch from e148585 to 4be2a7f Compare August 11, 2025 14:22

davenger added 3 commits August 11, 2025 16:28

Rewrite NOT IN to NOT EXISTS

7ac67b3

Tests for NOT IN

54bfb28

Add an experimental setting to enable rewriting IN to JOIN

63c89b5

davenger force-pushed the rewrite_in_subquery branch from 4be2a7f to 63c89b5 Compare August 11, 2025 14:29

davenger marked this pull request as ready for review August 12, 2025 07:11

Merge branch 'master' into rewrite_in_subquery

6a5d63e

novikd reviewed Aug 20, 2025

robot-clickhouse-ci-1 added the pr-synced-to-cloud The PR is synced to the cloud repo label Aug 20, 2025

EmeraldShift mentioned this pull request Aug 28, 2025

Support rewrite to optimize order by limit #82478

Closed

1 task

Merge branch 'master' into rewrite_in_subquery

aaa7cd2

davenger added 10 commits September 4, 2025 15:27

Merge branch 'master' into rewrite_in_subquery

8a966b3

Test cases with CTEs and with tuples

0d348b2

Support subqueries as CTEs and subqueries returning multiple columns

771cc1e

Disable explain descriptions

e033e7b

Merge branch 'master' of github.com:ClickHouse/ClickHouse into rewrite_in_subquery

a4d930b

Update setting description

b1c9f88

Merge branch 'master' into rewrite_in_subquery

5dbbdb3

Merge branch 'master' into rewrite_in_subquery

f0ad635

Merge branch 'master' into rewrite_in_subquery

1519ca4

Merge branch 'master' into rewrite_in_subquery

f6adec7

Felixoid removed the pr-synced-to-cloud The PR is synced to the cloud repo label Sep 26, 2025

novikd approved these changes Sep 29, 2025

Update test reference

da741fb

davenger enabled auto-merge September 30, 2025 07:03

davenger added this pull request to the merge queue Sep 30, 2025

Merged via the queue into master with commit a6df40d Sep 30, 2025
119 of 123 checks passed

davenger deleted the rewrite_in_subquery branch September 30, 2025 07:17

robot-ch-test-poll4 added the pr-synced-to-cloud The PR is synced to the cloud repo label Sep 30, 2025

PedroTadim mentioned this pull request Oct 15, 2025

rewrite_in_to_join SEGV #88569

Closed

