Why don't the following queries produce the same result?
with l as (select $1 id from values(1), (2), (3))
, r as (select $1 id from values(1), (4))
select l.*,r.* from l full outer join r using(id);
ID ID
1 1
2 2
3 3
4 4
with l as (select $1 id from values(1), (2), (3))
, r as (select $1 id from values(1), (4))
select l.*,r.* from l full outer join r on r.id = l.id;
ID ID
1 1
2 NULL
3 NULL
NULL 4
The JOIN docs say: o1 join o2 using (key_column) is equivalent to o1 join o2 on o2.key_column = o1.key_column
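Posted 2020-10-14 01:26:09
I guess this counts as non-standard usage, so don't do it. Specifically, the docs say:
To use the USING clause properly, the projection list (the list of columns and other expressions after the SELECT keyword) should be "*".
SparkSQL gives the same result as Snowflake, whereas psql (last example below) gives the result I expected, so... I guess the behavior is inconsistent across engines.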
scala> spark.sql("with l as (select col1 id from values(1), (2), (3)) , r as (select col1 id from values(1), (4)) select * from l full outer join r using(id)").show()
+---+
| id|
+---+
| 1|
| 3|
| 4|
| 2|
+---+
scala> spark.sql("with l as (select col1 id from values(1), (2), (3)) , r as (select col1 id from values(1), (4)) select * from l full outer join r on l.id = r.id").show()
+----+----+
| id| id|
+----+----+
| 1| 1|
| 3|null|
|null| 4|
| 2|null|
+----+----+
psql> with l as (select $1 id from values(1), (2), (3))
, r as (select $1 id from values(1), (4))
select l.*,r.* from l full outer join r using(id);
id id
1 1
2 (null)
3 (null)
(null) 4
Posted 2020-10-14 02:49:03
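This behavior does look odd. The recommended approach is to use ON instead of USING.
From a discussion in the Snowflake community:
According to the ANSI standard,
FROM t1
FULL OUTER JOIN t2
USING (c)
generates the following expression: coalesce(t1.c, t2.c) as c. References to t1.c and t2.c after the join are therefore not actually defined by the standard. MySQL, Postgres, and Snowflake all accept such references but give them different semantics. In Snowflake, t1.c and t2.c are simply aliases for c.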
https://community.snowflake.com/s/question/0D50Z00008WRZBBSA5/bug-with-join-using-
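As a sketch of the ON-based approach recommended above, the query below writes out the coalesce(t1.c, t2.c) expression by hand, so the merged key and each side's own value stay distinguishable no matter how an engine treats l.id and r.id after USING. It reuses the question's Snowflake syntax; the aliases l_id and r_id are illustrative names that do not appear in the original, and the query has not been checked against every engine mentioned here.

```sql
-- Sketch only: l_id and r_id are illustrative aliases, not from the original post.
with l as (select $1 id from values(1), (2), (3))
, r as (select $1 id from values(1), (4))
select coalesce(l.id, r.id) as id  -- the merged key that USING (id) would produce
     , l.id as l_id                -- NULL when the row exists only in r
     , r.id as r_id                -- NULL when the row exists only in l
from l full outer join r on r.id = l.id
order by 1;                        -- sort by the first output column
```

Under standard full outer join semantics this should return the ids 1 through 4, with l_id NULL only for id 4 and r_id NULL for ids 2 and 3.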
https://stackoverflow.com/questions/64339904