-
Notifications
You must be signed in to change notification settings - Fork 71
Open
Labels
Description
Right now super parses a SQL FULL OUTER JOIN query, but the result doesn't match the result from Postgres with the same data/query.
$ super -f parquet -o za.parquet -c 'values {id:1,name:"foo",src:"za"},{id:3,name:"qux",src:"za"}' &&
super -f parquet -o zb.parquet -c 'values {id:1,name:"foo",src:"zb"},{id:2,name:"bar",src:"zb"}' &&
super -c "select * from za.parquet as za
full outer join zb.parquet as zb
on za.id=zb.id"
{id:1,name:"foo",src:"zb"}
{id:error("missing"),name:error("missing"),src:error("missing")}
Details
Repro is with super commit bdb38bb. This issue was surfaced by a user in a community Slack thread.
Here's Postgres returning the expected result with the same data/query.
$ psql --version
psql (PostgreSQL) 17.6 (Homebrew)
$ psql postgres -P null='NULL' -c "
DROP TABLE IF EXISTS za;
CREATE TABLE za (id INTEGER, name VARCHAR, src VARCHAR);
INSERT INTO za (id, name, src) VALUES (1, 'foo', 'za');
INSERT INTO za (id, name, src) VALUES (3, 'qux', 'za');
DROP TABLE IF EXISTS zb;
CREATE TABLE zb (id INTEGER, name VARCHAR, src VARCHAR);
INSERT INTO zb (id, name, src) VALUES (1, 'foo', 'zb');
INSERT INTO zb (id, name, src) VALUES (2, 'bar', 'zb');
select * from za
full outer join zb
on za.id=zb.id;"
DROP TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
DROP TABLE
CREATE TABLE
INSERT 0 1
INSERT 0 1
id | name | src | id | name | src
------+------+------+------+------+------
1 | foo | za | 1 | foo | zb
3 | qux | za | NULL | NULL | NULL
NULL | NULL | NULL | 2 | bar | zb
(3 rows)
I suspect that if we went ahead and tried to implement FULL OUTER JOIN right now the output still would not match that from Postgres because of the effect of other open issues (#5977, #5984) so perhaps in the near term we might just want to reject it. The fact super is only returning two rows makes me wonder if right now it's maybe falling back on running as a LEFT JOIN.
$ super -version
Version: bdb38bbc4
$ super -c "
select * from za.parquet as za
left join zb.parquet as zb
on za.id=zb.id"
{id:1,name:"foo",src:"zb"}
{id:error("missing"),name:error("missing"),src:error("missing")}