Skip to content

SQL: NULL values absent from JOIN output #5984

@philrz

Description

@philrz

When Postgres runs this query, it has NULL values for k and l in the bottom output record.

$ super -f parquet -o integers.parquet -c "
values
  {i:1::int32,j:2::int32},
  {i:2::int32,j:3::int32},
  {i:3::int32,j:4::int32}
" &&
super -f parquet -o integers2.parquet -c "
values
  {k:1::int32,l:10::int32},
  {k:2::int32,l:20::int32}
" &&
super -c super -c "SELECT * FROM integers.parquet LEFT OUTER JOIN integers2.parquet ON integers.i=integers2.k ORDER BY i;"

{i:1::int32,j:2::int32,k:1::int32,l:10::int32}
{i:2::int32,j:3::int32,k:2::int32,l:20::int32}
{i:3::int32,j:4::int32}

Details

Repro is with super commit a90585f. This was found via a query from a sqllogictest.

Here's the query running in Postgres with a client setting to make it easier to see NULL values.

$ psql --version
psql (PostgreSQL) 17.6 (Homebrew)

$ psql postgres -P null=NULL -c "
DROP TABLE IF EXISTS integers;
DROP TABLE IF EXISTS integers2;
CREATE TABLE integers(i INTEGER, j INTEGER);
INSERT INTO integers VALUES (1, 2), (2, 3), (3, 4);
CREATE TABLE integers2(k INTEGER, l INTEGER);
INSERT INTO integers2 VALUES (1, 10), (2, 20);
SELECT * FROM integers LEFT OUTER JOIN integers2 ON integers.i=integers2.k ORDER BY i;"

DROP TABLE
DROP TABLE
CREATE TABLE
INSERT 0 3
CREATE TABLE
INSERT 0 2

 i | j |  k   |  l   
---+---+------+------
 1 | 2 |    1 |   10
 2 | 3 |    2 |   20
 3 | 4 | NULL | NULL
(3 rows)

Whereas in SuperDB, the bottom output record lacked values for the fields k and l.

Now, as a user, I know that if I want the NULL values, I can just tack on a piped fuse.

$ super -version
Version: a90585f6d

$ super -c "
SELECT * FROM integers.parquet LEFT OUTER JOIN integers2.parquet ON integers.i=integers2.k ORDER BY i
| fuse"

{i:1::int32,j:2::int32,k:1::int32,l:10::int32}
{i:2::int32,j:3::int32,k:2::int32,l:20::int32}
{i:3::int32,j:4::int32,k:null::int32,l:null::int32}

I'm not sure if we should be doing something similar automatically for a pure SQL query to match the Postgres behavior.

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions