我有这个SQL查询,我正在尝试转换它,以便它可以在HiveQL 2.1.1上运行。
SELECT p.id FROM page p, comments c, users u,
WHERE c.commentid= p.id
AND u.id = p.creatorid
AND u.upvotes IN (
SELECT MAX(upvotes)
FROM users u WHERE u.date > p.date
)
AND EXISTS (
SELECT 1 FROM links l WHERE l.relid > p.id
)这在Hive上不起作用,因为它有超过一个SubQuery (不支持)
从SQL到Hive的EXISTS或IN替换如下:
WHERE A.aid IN (SELECT bid FROM B...)可代之以:
A LEFT SEMI JOIN B ON aid=bid但是,我想不出用附加的MAX()函数来实现这一点的方法。
发布于 2020-04-23 17:20:32
使用标准联接语法,而不是用逗号分隔:
SELECT p.id
FROM page p INNER JOIN
comments c
ON c.commentid= p.id INNER JOIN
users u
ON u.id = p.creatorid INNER JOIN
links l
ON l.relid > p.id
WHERE u.upvotes IN (SELECT MAX(upvotes)
FROM users u
WHERE u.date > p.date
);发布于 2020-04-23 18:34:56
我不知道upvotes逻辑应该做什么。links逻辑易于处理。蜂巢可以处理这件事:
SELECT p.id
FROM page p JOIN
comments c
ON c.commentid = p.id JOIN
users u
ON u.id = p.creatorid CROSS JOIN
(SELECT MAX(l.relid) as max_relid
FROM links l
) l
WHERE l.max_relid > p.id AND
u.upvotes IN (SELECT MAX(upvotes)
FROM users u
WHERE u.date > p.date
);https://stackoverflow.com/questions/61393077
复制相似问题