-
Notifications
You must be signed in to change notification settings - Fork 8.3k
JOIN condition optimization #26928
Description
Use case
Some BI tools really like to produce queries which has weird JOIN clauses.
As an example from power bi:
) AS ITBL ON (((toFloat64(OTBL.DestAirportID) = toFloat64(ITBL.C1)) AND isNotNull(OTBL.DestAirportID)) AND isNotNull(ITBL.C1)) OR (isNull(OTBL.DestAirportID) AND isNull(ITBL.C1))
It's being done in such strange way to join by Null = Null condition also.
But in clickhouse it's quite common to avoid using Nullable datatypes at all.
So clickhouse could get rig of always true/false clauses and make join condition much simpler.
) AS ITBL ON (((toFloat64(OTBL.DestAirportID) = toFloat64(ITBL.C1))
If both DestAirportID and C1 are non nullable datatypes it's possible to drop AND isNotNull(OTBL.DestAirportID)) AND isNotNull(ITBL.C1)) because it's always true. And drop OR (isNull(OTBL.DestAirportID) AND isNull(ITBL.C1)) because it's always false.
Describe the solution you'd like
Clickhouse would simplify and remove all unnecessary conditions from join clause.
Additional context
Related #24420