我有一个表,其中包含每个日期的作业ID (varchar)和用户ID。
User Date Job
mid1 2019-10-10 jid1
mid1 2019-10-10 jid2
mid1 2019-10-10 jid3
mid1 2019-10-10 jid4
mid1 2019-10-10 jid5
mid1 2019-10-11 jid3
mid1 2019-10-11 jid5
mid1 2019-10-11 jid6
mid1 2019-10-11 jid7
mid1 2019-10-11 jid8
mid1 2019-10-11 jid9
mid1 2019-10-12 jid3
mid1 2019-10-12 jid9
mid1 2019-10-12 jid10
mid2 2019-10-10 jid100
mid2 2019-10-10 jid101
mid2 2019-10-10 jid102
...现在,我需要一个表,其中包含每个用户数据的时序中新作业(“传入”)和已完成(“传出”)作业的数量。
User Date Jobs Incoming Outgoing
mid1 2019-10-10 5 5 0
mid1 2019-10-11 6 4 3
mid1 2019-10-12 3 1 4
mid2 ...如果它只计算唯一的作业id (有重复的),那就更好了。但除此之外,我可以提前消除它们。
使用Teradata SQL可以做到这一点吗?
发布于 2019-10-17 20:05:47
SELECT
User
,Date
,Count(*) AS Jobs
-- new jobs today
,Sum(firstdate) AS Incoming
-- finished jobs today
,Sum(lastdate)
-- finished jobs the day before
,Lag(Sum(lastdate),1,0) Over (PARTITION BY User ORDER BY Date) AS Outgoing
FROM
(
SELECT
User
,Job
,Date
-- flag indicating job is present on the current day but absent the day before
,CASE WHEN Date = Lag(Date) Over (PARTITION BY User, job ORDER BY Date) + 1 THEN 0 ELSE 1 END AS firstdate
-- flag indicating job is present on the current day but absent the day after
,CASE WHEN Date = Lead(Date) Over (PARTITION BY User, job ORDER BY Date) - 1 THEN 0 ELSE 1 END AS lastdate
FROM your_table
-- to remove duplicate rows add
-- GROUP BY 1,2,3
) AS dt
GROUP BY 1,2
ORDER BY 1,2如果你是Teradata版本不支持滞后/领先(即< 16.10),你必须重写它:
SELECT
User
,Date
,Count(*) AS Jobs
-- new jobs today
,Sum(firstdate) AS Incoming
-- finished jobs today
,Sum(lastdate)
-- finished jobs the day before
,Coalesce(Min(Sum(lastdate)) Over (PARTITION BY User ORDER BY Date ROWS BETWEEN 1 Preceding AND 1 Preceding), 0) AS Outgoing
FROM
(
SELECT
User
,Job
,Date
-- flag indicating job is present on the current day but absent the day before
,CASE WHEN Date = Min(Date) Over (PARTITION BY User, job ORDER BY Date ROWS BETWEEN 1 Preceding AND 1 Preceding) + 1 THEN 0 ELSE 1 END AS firstdate
-- flag indicating job is present on the current day but absent the day after
,CASE WHEN Date = Min(Date) Over (PARTITION BY User, job ORDER BY Date ROWS BETWEEN 1 Following AND 1 Following ) - 1 THEN 0 ELSE 1 END AS lastdate
FROM your_table
-- to remove duplicate rows add
-- GROUP BY 1,2,3
) AS dt
GROUP BY 1,2
ORDER BY 1,2https://stackoverflow.com/questions/58427517
复制相似问题