Input:
Accounts table:
+----+----------+
| id | name |
+----+----------+
| 1 | Winston |
| 7 | Jonathan |
+----+----------+
Logins table:
+----+------------+
| id | login_date |
+----+------------+
| 7 | 2020-05-30 |
| 1 | 2020-05-30 |
| 7 | 2020-05-31 |
| 7 | 2020-06-01 |
| 7 | 2020-06-02 |
| 7 | 2020-06-02 |
| 7 | 2020-06-03 |
| 1 | 2020-06-07 |
| 7 | 2020-06-10 |
+----+------------+
Output:
+----+----------+
| id | name |
+----+----------+
| 7 | Jonathan |
+----+----------+说明:用户温斯顿的id =1只在两天内登录2次,所以,温斯顿不是一个活跃的用户。用户Jonathan,id =7,在6天内登录7次,其中5天是连续的,所以Jonathan是一个活动用户。
活动用户是指连续五天或更长时间登录帐户的用户。
编写SQL查询以查找id和活动用户的名称。
返回按id排序的结果表
我的代码在做这个问题时遇到了一些问题。
我的密码:
SELECT DISTINCT a.id, a.name
FROM Accounts a
LEFT JOIN Logins l
ON a.id = l.id
JOIN Logins l1
ON l.id=l1.id AND DATEDIFF(l.login_date, l1.login_date) BETWEEN 1 AND 4
GROUP BY l.login_date
HAVING COUNT(DISTINCT l1.login_date) = 4输入
{"headers":{"Accounts":["id","name"],"Logins":["id","login_date"]},"rows":{"Accounts":[[182,"Gavriel"],[119,"Naftali"],[31,"Yaakov"],[136,"Menachem"],[142,"Sarah"],[204,"Daniel"],[49,"Ezra"],[27,"David"]],"Logins":[[142,"2020-6-27"],[119,"2020-6-29"],[31,"2020-6-26"],[27,"2020-6-27"],[182,"2020-7-2"],[136,"2020-6-28"],[142,"2020-7-5"],[27,"2020-6-29"],[136,"2020-6-27"],[49,"2020-7-1"],[204,"2020-7-1"],[49,"2020-7-5"],[204,"2020-7-3"],[49,"2020-7-3"],[31,"2020-7-3"],[204,"2020-7-3"],[142,"2020-6-30"],[119,"2020-6-26"],[142,"2020-6-29"],[136,"2020-7-2"],[49,"2020-7-2"],[182,"2020-7-4"],[119,"2020-6-29"],[49,"2020-6-30"],[136,"2020-7-5"],[27,"2020-7-2"],[136,"2020-6-28"],[31,"2020-6-29"],[204,"2020-7-3"],[142,"2020-6-29"],[31,"2020-6-30"],[204,"2020-6-27"],[204,"2020-7-2"],[182,"2020-6-27"],[31,"2020-7-3"],[119,"2020-7-4"],[142,"2020-6-27"],[119,"2020-6-27"],[27,"2020-6-26"],[142,"2020-7-2"],[27,"2020-6-28"],[136,"2020-6-26"],[119,"2020-6-27"],[142,"2020-7-1"],[27,"2020-7-1"],[31,"2020-6-29"],[204,"2020-6-28"],[136,"2020-6-28"],[204,"2020-7-3"],[31,"2020-6-28"],[182,"2020-6-29"],[49,"2020-7-4"],[204,"2020-6-27"],[136,"2020-7-5"],[142,"2020-7-4"],[31,"2020-7-2"],[182,"2020-7-1"],[204,"2020-6-28"],[31,"2020-7-4"],[136,"2020-7-1"],[136,"2020-6-26"],[27,"2020-7-4"],[27,"2020-6-29"],[31,"2020-7-2"]]}}我的产出:
{"headers": ["id", "name"], "values": [[49, "Ezra"], [136, "Menachem"], [142, "Sarah"], [182, "Gavriel"]]}预期产出:
{"headers":["id","name"],"values":[[49,"Ezra"]]}正确答案:
SELECT DISTINCT l1.id,
(SELECT name FROM Accounts WHERE id = l1.id) AS name
FROM Logins l1
JOIN Logins l2 ON l1.id = l2.id AND DATEDIFF(l2.login_date, l1.login_date) BETWEEN 1 AND 4
GROUP BY l1.id, l1.login_date
HAVING COUNT(DISTINCT l2.login_date) = 4在我的答案中,我加入了Accounts表,而不是使用subquery作为答案。但是,我不明白为什么我的代码会导致如此不同的输出。
发布于 2022-11-29 06:33:58
请尝试以下查询:
with data as (
select
id,l_date,
--lag(l_date) over(partition by id order by l_date) lag_date,
l_date - lag(l_date) over(partition by id order by l_date),
case when l_date - lag(l_date) over(partition by id order by l_date) > 1 then 1 else 0 end as session_change
from login
),
staging_data as (
select
id,l_date,sum(session_change)
over(partition by id order by l_date rows between unbounded preceding and current row) as session
from data )
select id
from staging_data
group by id,session having count(*) >=5https://stackoverflow.com/questions/74609339
复制相似问题