-
Notifications
You must be signed in to change notification settings - Fork 8.3k
FULL OUTER JOIN works incorrectly with Cluster functions #89996
Copy link
Copy link
Closed
Labels
Description
Describe the unexpected behaviour
A FULL OUTER JOIN between two identical s3Cluster table functions on the same dataset returns only left-side rows with NULL values on the right side, instead of matching pairs. The same join using plain s3 table functions produces the correct result with matched rows.
Same with icebergS3Cluster.
SELECT *
FROM s3Cluster(replicated_cluster, 'http://minio:9000/warehouse/data2/data/**.parquet', 'admin', 'password') AS t1
FULL OUTER JOIN s3Cluster(replicated_cluster, 'http://minio:9000/warehouse/data2/data/**.parquet', 'admin', 'password') AS t2 ON t1.boolean_col = t2.boolean_col
ORDER BY tuple(*) ASC
Query id: 561496ba-623f-4e34-a5a8-6da1f6e8567a
┌─boolean_col─┬─long_col─┬─t2.boolean_col─┬─t2.long_col─┐
1. │ false │ 2000 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │
2. │ true │ 1000 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │
└─────────────┴──────────┴────────────────┴─────────────┘
2 rows in set. Elapsed: 0.025 sec.Expected:
SELECT *
FROM s3('http://minio:9000/warehouse/data2/data/**.parquet', 'admin', 'password') AS t1
FULL OUTER JOIN s3('http://minio:9000/warehouse/data2/data/**.parquet', 'admin', 'password') AS t2 ON t1.boolean_col = t2.boolean_col
ORDER BY tuple(*) ASC
Query id: 1da58346-23c0-4bce-aa79-e53094a4b455
┌─boolean_col─┬─long_col─┬─t2.boolean_col─┬─t2.long_col─┐
1. │ false │ 2000 │ false │ 2000 │
2. │ true │ 1000 │ true │ 1000 │
└─────────────┴──────────┴────────────────┴─────────────┘
2 rows in set. Elapsed: 0.015 sec. Additional context
SELECT *
FROM s3Cluster(replicated_cluster, 'http://minio:9000/warehouse/data2/data/**.parquet', 'admin', 'password')
Query id: 5f8c80ba-c897-4772-ae74-8349304d664c
┌─boolean_col─┬─long_col─┐
1. │ true │ 1000 │
2. │ false │ 2000 │
└─────────────┴──────────┘
2 rows in set. Elapsed: 0.019 sec. SELECT *
FROM icebergS3Cluster(replicated_cluster, 'http://minio:9000/warehouse/data2', 'admin', 'password') AS t1
FULL OUTER JOIN icebergS3Cluster(replicated_cluster, 'http://minio:9000/warehouse/data2', 'admin', 'password') AS t2 ON t1.boolean_col = t2.boolean_col
ORDER BY tuple(*) ASC
Query id: 128b9db1-f53e-45c7-a009-75ab47f7266d
┌─boolean_col─┬─long_col─┬─t2.boolean_col─┬─t2.long_col─┐
1. │ false │ 2000 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │
2. │ true │ 1000 │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │
└─────────────┴──────────┴────────────────┴─────────────┘
2 rows in set. Elapsed: 0.026 sec. Reactions are currently unavailable