Skip to content

Wrong result in case of SubQueryConversion = true #8911

@EPluribusUnum

Description

@EPluribusUnum

Hi *!

Tested with latest 5.0.4 snapshot (1767)

CREATE TABLE PARENT_TABLE (
    ID INTEGER NOT NULL PRIMARY KEY,
    LINK_ID INTEGER
);

CREATE TABLE CHILD_TABLE (
    ID INTEGER NOT NULL PRIMARY KEY,
    PARENT_ID INTEGER,
    AMOUNT DOUBLE PRECISION,
    REF_ID INTEGER
);

INSERT INTO PARENT_TABLE (ID, LINK_ID) VALUES (1, NULL);

INSERT INTO CHILD_TABLE (ID, PARENT_ID, AMOUNT, REF_ID) VALUES (101, 1, 100, NULL);
INSERT INTO CHILD_TABLE (ID, PARENT_ID, AMOUNT, REF_ID) VALUES (102, 1, 100, NULL);
INSERT INTO CHILD_TABLE (ID, PARENT_ID, AMOUNT, REF_ID) VALUES (103, 1, 100, NULL);

-- EXPECTED: 1 row (OK with SubQueryConversion = false)
-- ACTUAL: 3 rows (same row repeated) with SubQueryConversion = true
SELECT
    p.id
FROM parent_table p
WHERE p.id = 1 AND
    EXISTS(
        SELECT 1
        FROM child_table c
        WHERE c.parent_id = p.id AND
            c.amount - COALESCE(
                (SELECT SUM(c2.amount)
                FROM child_table c2
                    JOIN parent_table p2 ON p2.link_id = p.id -- Correlation to outer p
                WHERE c2.ref_id = c.id), 0) > 0);

-- Same query WITHOUT the JOIN works correctly
SELECT
    p.id
FROM parent_table p
WHERE p.id = 1 AND
    EXISTS(
        SELECT 1
        FROM child_table c
        WHERE c.parent_id = p.id AND
            c.amount - COALESCE(
                (SELECT SUM(c2.amount)
                FROM child_table c2
                WHERE c2.ref_id = c.id), 0) > 0);

--Even DISTINCT does NOT help
SELECT DISTINCT p.id
FROM parent_table p
WHERE p.id = 1 AND
      EXISTS(SELECT 1
             FROM child_table c
             WHERE c.parent_id = p.id AND
                   c.amount - COALESCE((SELECT SUM(c2.amount)
                                        FROM child_table c2
                                        JOIN parent_table p2 ON p2.link_id = p.id
                                        WHERE c2.ref_id = c.id), 0) > 0);

Analysis (by AI)

The bug triggers when ALL conditions are met:

  1. Outer query selects from parent table (p)
  2. WHERE clause uses EXISTS with child table (c) correlated to p
  3. Condition inside EXISTS includes a scalar subquery
  4. Scalar subquery JOINs back to parent table using correlation
    to the OUTERMOST query (p2.link_id = p.id)

Result: Row count incorrectly multiplied by child records matching EXISTS

The semantic of the query means the inner JOIN can never match any rows
(link_id is NULL), yet it causes the outer query to repeat rows.

Thank you!

Metadata

Metadata

Assignees

Type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions