-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Description
In SQL intersect may be modified with the all modifier. For example:
if table A contains a column a with values [1,1,2,3,4] and B contains a column b with values [1, 1, 2]:
select *
from (select a from A)
intersect (select b from B)
Returns [1, 2] while
select *
from (select a from A)
intersect ALL (select b from B)
returns [1,1,2].
Currently Pinot accepts the ALL modifier and it is shown in the explain plan, but the semantics are always the same. Specifically, Pinot semantics are the ones of intersect without all modifier.
You can verify that by running ColocatedJoinEngineQuickStart and executing:
select userUUID
from (select userUUID from userGroups)
intersect all
(select userUUID from userGroups)
with and without all.
intersect all should return the same number of rows than count() (which is 2494) while intersect should return the same number of rows than count(distinct(userUUID)) (which is 2470).
But the returned number of rows is 2470 with and without all modifier
As a short term solution we can fail if all modifier is supplied, as it would be better than returning incorrect results.