Skip to content

SQL: FULL OUTER JOIN support #6310

@philrz

Description

@philrz

Right now super parses a SQL FULL OUTER JOIN query, but the result doesn't match the result from Postgres with the same data/query.

$ super -f parquet -o za.parquet -c 'values {id:1,name:"foo",src:"za"},{id:3,name:"qux",src:"za"}' &&
  super -f parquet -o zb.parquet -c 'values {id:1,name:"foo",src:"zb"},{id:2,name:"bar",src:"zb"}' &&
  super -c "select * from za.parquet as za
            full outer join zb.parquet as zb
            on za.id=zb.id"

{id:1,name:"foo",src:"zb"}
{id:error("missing"),name:error("missing"),src:error("missing")}

Details

Repro is with super commit bdb38bb. This issue was surfaced by a user in a community Slack thread.

Here's Postgres returning the expected result with the same data/query.

$ psql --version
psql (PostgreSQL) 17.6 (Homebrew)

$ psql postgres -P null='NULL' -c "
DROP TABLE IF EXISTS za;
CREATE TABLE za (id INTEGER, name VARCHAR, src VARCHAR);
INSERT INTO za (id, name, src) VALUES (1, 'foo', 'za');
INSERT INTO za (id, name, src) VALUES (3, 'qux', 'za');
DROP TABLE IF EXISTS zb;
CREATE TABLE zb (id INTEGER, name VARCHAR, src VARCHAR);
INSERT INTO zb (id, name, src) VALUES (1, 'foo', 'zb');
INSERT INTO zb (id, name, src) VALUES (2, 'bar', 'zb');
select * from za
  full outer join zb
  on za.id=zb.id;"

DROP TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
DROP TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1

  id  | name | src  |  id  | name | src  
------+------+------+------+------+------
    1 | foo  | za   |    1 | foo  | zb
    3 | qux  | za   | NULL | NULL | NULL
 NULL | NULL | NULL |    2 | bar  | zb
(3 rows)

I suspect that if we went ahead and tried to implement FULL OUTER JOIN right now the output still would not match that from Postgres because of the effect of other open issues (#5977, #5984) so perhaps in the near term we might just want to reject it. The fact super is only returning two rows makes me wonder if right now it's maybe falling back on running as a LEFT JOIN.

$ super -version
Version: bdb38bbc4

$ super -c "
select * from za.parquet as za
left join zb.parquet as zb
on za.id=zb.id"

{id:1,name:"foo",src:"zb"}
{id:error("missing"),name:error("missing"),src:error("missing")}

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions