Skip to content

JOIN condition optimization #26928

@UnamedRus

Description

@UnamedRus

Use case

Some BI tools really like to produce queries which has weird JOIN clauses.
As an example from power bi:

    ) AS ITBL ON (((toFloat64(OTBL.DestAirportID) = toFloat64(ITBL.C1)) AND isNotNull(OTBL.DestAirportID)) AND isNotNull(ITBL.C1)) OR (isNull(OTBL.DestAirportID) AND isNull(ITBL.C1))

It's being done in such strange way to join by Null = Null condition also.
But in clickhouse it's quite common to avoid using Nullable datatypes at all.
So clickhouse could get rig of always true/false clauses and make join condition much simpler.

    ) AS ITBL ON (((toFloat64(OTBL.DestAirportID) = toFloat64(ITBL.C1))

If both DestAirportID and C1 are non nullable datatypes it's possible to drop AND isNotNull(OTBL.DestAirportID)) AND isNotNull(ITBL.C1)) because it's always true. And drop OR (isNull(OTBL.DestAirportID) AND isNull(ITBL.C1)) because it's always false.

Describe the solution you'd like
Clickhouse would simplify and remove all unnecessary conditions from join clause.

Additional context
Related #24420

Metadata

Metadata

Assignees

Labels

comp-joinsJOINs end-to-end (planning hooks + runtime join operators/algorithms). Single bucket to avoid pla...comp-query-optimizerQuery plan optimization: physical plan steps, plan-level rewrites and optimizations (QueryPlan pa...feature

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions