Repro:
SELECT
  COUNT(*) -- the issue happens even if you have COUNT(A.playerID)
FROM
  baseballStats_OFFLINE AS A
  JOIN baseballStats_OFFLINE AS B ON A.playerID = B.playerID
WHERE
  A.hits > 10 AND B.hits < 5
The issue happens even if you use a sub-query:
SELECT
  COUNT(*)
FROM
  (
    SELECT
      A.playerID
    FROM
      baseballStats_OFFLINE AS A
      JOIN baseballStats_OFFLINE AS B ON A.playerID = B.playerID
    WHERE
      A.hits > 10
      AND B.hits < 5
  )
Both queries above end up reading all the columns of the table in the table-scan stage, even though only playerID and hits are referenced. The reason is that Calcite does not create a projection node above the scan.
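One way to check this is to inspect the logical plan for the repro query. The following is a minimal sketch, assuming the query goes through the multi-stage engine and that `EXPLAIN PLAN FOR` is available in your build. The exact plan text varies by version, so no expected output is shown.

```sql
-- Sketch: print the plan for the first repro query.
-- If the scans of baseballStats_OFFLINE have no projection above them
-- that narrows the row to playerID and hits, every column is being read.
EXPLAIN PLAN FOR
SELECT
  COUNT(*)
FROM
  baseballStats_OFFLINE AS A
  JOIN baseballStats_OFFLINE AS B ON A.playerID = B.playerID
WHERE
  A.hits > 10 AND B.hits < 5
```

Running the same statement against a build with a candidate fix should make it easy to compare the two plans side by side.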
I am testing out a few approaches for a fix in this PR: #10122
Btw, this is what GPT recommends:
cc: @walterddr
