首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >使用子查询和联接时的不同结果:SQL1454。活动用户

使用子查询和联接时的不同结果:SQL1454。活动用户
EN

Stack Overflow用户
提问于 2022-11-29 05:00:37
回答 1查看 27关注 0票数 0
代码语言:javascript
复制
Input: 
Accounts table:
+----+----------+
| id | name     |
+----+----------+
| 1  | Winston  |
| 7  | Jonathan |
+----+----------+
Logins table:
+----+------------+
| id | login_date |
+----+------------+
| 7  | 2020-05-30 |
| 1  | 2020-05-30 |
| 7  | 2020-05-31 |
| 7  | 2020-06-01 |
| 7  | 2020-06-02 |
| 7  | 2020-06-02 |
| 7  | 2020-06-03 |
| 1  | 2020-06-07 |
| 7  | 2020-06-10 |
+----+------------+
Output: 
+----+----------+
| id | name     |
+----+----------+
| 7  | Jonathan |
+----+----------+

说明:用户温斯顿的id =1只在两天内登录2次,所以,温斯顿不是一个活跃的用户。用户Jonathan,id =7,在6天内登录7次,其中5天是连续的,所以Jonathan是一个活动用户。

活动用户是指连续五天或更长时间登录帐户的用户。

编写SQL查询以查找id和活动用户的名称。

返回按id排序的结果表

我的代码在做这个问题时遇到了一些问题。

我的密码:

代码语言:javascript
复制
SELECT DISTINCT a.id, a.name
FROM Accounts a
LEFT JOIN Logins l
ON a.id = l.id
JOIN Logins l1
ON l.id=l1.id AND DATEDIFF(l.login_date, l1.login_date) BETWEEN 1 AND 4
GROUP BY l.login_date
HAVING COUNT(DISTINCT l1.login_date) = 4

输入

代码语言:javascript
复制
{"headers":{"Accounts":["id","name"],"Logins":["id","login_date"]},"rows":{"Accounts":[[182,"Gavriel"],[119,"Naftali"],[31,"Yaakov"],[136,"Menachem"],[142,"Sarah"],[204,"Daniel"],[49,"Ezra"],[27,"David"]],"Logins":[[142,"2020-6-27"],[119,"2020-6-29"],[31,"2020-6-26"],[27,"2020-6-27"],[182,"2020-7-2"],[136,"2020-6-28"],[142,"2020-7-5"],[27,"2020-6-29"],[136,"2020-6-27"],[49,"2020-7-1"],[204,"2020-7-1"],[49,"2020-7-5"],[204,"2020-7-3"],[49,"2020-7-3"],[31,"2020-7-3"],[204,"2020-7-3"],[142,"2020-6-30"],[119,"2020-6-26"],[142,"2020-6-29"],[136,"2020-7-2"],[49,"2020-7-2"],[182,"2020-7-4"],[119,"2020-6-29"],[49,"2020-6-30"],[136,"2020-7-5"],[27,"2020-7-2"],[136,"2020-6-28"],[31,"2020-6-29"],[204,"2020-7-3"],[142,"2020-6-29"],[31,"2020-6-30"],[204,"2020-6-27"],[204,"2020-7-2"],[182,"2020-6-27"],[31,"2020-7-3"],[119,"2020-7-4"],[142,"2020-6-27"],[119,"2020-6-27"],[27,"2020-6-26"],[142,"2020-7-2"],[27,"2020-6-28"],[136,"2020-6-26"],[119,"2020-6-27"],[142,"2020-7-1"],[27,"2020-7-1"],[31,"2020-6-29"],[204,"2020-6-28"],[136,"2020-6-28"],[204,"2020-7-3"],[31,"2020-6-28"],[182,"2020-6-29"],[49,"2020-7-4"],[204,"2020-6-27"],[136,"2020-7-5"],[142,"2020-7-4"],[31,"2020-7-2"],[182,"2020-7-1"],[204,"2020-6-28"],[31,"2020-7-4"],[136,"2020-7-1"],[136,"2020-6-26"],[27,"2020-7-4"],[27,"2020-6-29"],[31,"2020-7-2"]]}}

我的产出:

代码语言:javascript
复制
{"headers": ["id", "name"], "values": [[49, "Ezra"], [136, "Menachem"], [142, "Sarah"], [182, "Gavriel"]]}

预期产出:

代码语言:javascript
复制
{"headers":["id","name"],"values":[[49,"Ezra"]]}

正确答案:

代码语言:javascript
复制
SELECT DISTINCT l1.id,
(SELECT name FROM Accounts WHERE id = l1.id) AS name
FROM Logins l1
JOIN Logins l2 ON l1.id = l2.id AND DATEDIFF(l2.login_date, l1.login_date) BETWEEN 1 AND 4
GROUP BY l1.id, l1.login_date
HAVING COUNT(DISTINCT l2.login_date) = 4

在我的答案中,我加入了Accounts表,而不是使用subquery作为答案。但是,我不明白为什么我的代码会导致如此不同的输出。

EN

回答 1

Stack Overflow用户

发布于 2022-11-29 06:33:58

请尝试以下查询:

代码语言:javascript
复制
with data as (
select 
id,l_date,
--lag(l_date) over(partition by id order by l_date) lag_date,
l_date - lag(l_date) over(partition by id order by l_date),
case when l_date - lag(l_date) over(partition by id order by l_date) > 1 then 1 else 0 end as session_change

from login
),
staging_data as (
select 
id,l_date,sum(session_change)
over(partition by id order by l_date rows between unbounded preceding and current row) as session
from data )
select id
from staging_data
group by id,session having count(*) >=5
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/74609339

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档