我有一个表,里面有快递数据。我每天都有送货安排。我想统计一下失败的投递次数。如果连续几天投递失败,则计入1。例如,投递安排在2021年4月的每一天。4月15日交货失败。在那之后,从4月18日到4月20日,交货再次失败。虽然交付失败的天数是4天,但我希望将失败的天数计为2,因为连续失败的天数将被计为1。
DeliveryId DeliveryDate Status
1 2021-04-14 Success
2 2021-04-15 Failure
3 2021-04-16 Success
4 2021-04-17 Success
5 2021-04-18 Failure
6 2021-04-19 Failure
7 2021-04-20 Failure
8 2021-04-21 Success我想要一个新的列,它将显示第一次失败的deliveryId,如下所示。
DeliveryId DeliveryDate Status FailedDeliveryId
1 2021-04-14 Success
2 2021-04-15 Failure 2
3 2021-04-16 Success
4 2021-04-17 Success
5 2021-04-18 Failure 5
6 2021-04-19 Failure 5
7 2021-04-20 Failure 5
8 2021-04-21 Success 我已经尝试了几个选项,但未能达到上述结果。我在查询中使用了LAG函数来查找以前的交付状态。但问题是,如果交付失败超过3或4天,那么我将不得不使用滞后功能,以3或4天后,并检查状态。我想让它成为一个动态查询。下面是我用过的
SELECT *,
CASE WHEN Status='Failure' AND Prev_Status='Success' THEN DeliveryId
WHEN Status='Failure' AND Prev_Status='Failure' THEN Prev_DeliveryId
END AS FailureInstance
FROM (
SELECT *,
LAG(Status,1) OVER(ORDER BY DeliveryDate ASC) Prev_Status,
LAG(DeliveryId,1) OVER(ORDER BY DeliveryDate ASC) Prev_DeliveryId
FROM table1
) A发布于 2021-06-28 17:15:52
…
declare @t table(DeliveryId int, DeliveryDate date, Status varchar(10));
insert into @t(DeliveryId, DeliveryDate, Status)
values
(10, '2021-04-14', 'Success'),
(20, '2021-04-15', 'Failure'),
(30, '2021-04-16', 'Success'),
(40, '2021-04-17', 'Success'),
(70, '2021-04-18', 'Failure'),
(60, '2021-04-19', 'Failure'),
(50, '2021-04-20', 'Failure'),
(80, '2021-04-21', 'Success');
select *,
case when Status='Failure' then min(grpDeliveryId) over(partition by grp) end as FailedDeliveryId,
case when Status='Failure' then datediff(minute, min(DeliveryDate) over(partition by grp), min(grpSuccessDate) over(partition by grp) ) end as MinutesDiffFailSuccess
from
(
select *,
sum(addorcountme) over(order by DeliveryDate) as grp
from
(
select *,
case when Status='Failure' and lag(Status) over(order by DeliveryDate)='Failure' then null else 1 end as addorcountme,
case when Status='Failure' and lag(Status) over(order by DeliveryDate)='Failure' then null else DeliveryId end as grpDeliveryId,
case when Status='Failure' and lead(Status) over(order by DeliveryDate)='Success' then lead(DeliveryDate) over(order by DeliveryDate) end as grpSuccessDate
from @t
) as t
) as g;
/*
select *, case when Status='Failure' then min(DeliveryId) over(partition by grp) end as FailedDeliveryId
from
(
select *,
sum(sumorcountme) over(order by DeliveryDate) as grp
from
(
select *, case when Status='failure' and lag(Status) over(order by DeliveryDate)='Failure' then null else 1 end as sumorcountme
from @t
) as t
) as g; */发布于 2021-06-28 18:44:49
一种方法只使用窗口函数。您可以根据每行的成功次数为每组故障分配一个组。然后,对于每个故障组,只需获取故障的最小id:
select t.*,
(case when status = 'Failure'
then min(case when status = 'Failure' then DeliveryId end) over (partition by grp)
end) as first_failureId
from (select t.*,
sum(case when status = 'Success' then 1 else 0 end) over (order by DeliveryDate) as grp
from t
) t;Here是一个db<>fiddle。
实际上,稍微简单一点的版本以相反的顺序分配组,因此第一条记录是失败的,而不是成功的,因此min()中的case是不必要的:
select t.*,
(case when status = 'Failure'
then min(DeliveryId) over (partition by grp)
end) as first_failureId
from (select t.*,
sum(case when status = 'Success' then 1 else 0 end) over (order by DeliveryDate desc) as grp
from t
) t
order by DeliveryId;嗯。..。..。另一种方法使用lag()来检测状态何时发生变化。然后,仅对更改使用累积最大值:
select t.*,
(case when status = 'Failure'
then max(case when prev_status is null or prev_status <> status then DeliveryId end) over (order by DeliveryDate)
end) as first_failureId
from (select t.*,
lag(status) over (order by DeliveryDate) as prev_status
from t
) t
order by DeliveryId;发布于 2021-06-28 16:32:09
这是一个“差距和岛屿”的问题。我最喜欢的文章是https://blog.jooq.org/2016/04/25/10-sql-tricks-that-you-didnt-think-were-possible/,技巧#4。
解析孤岛的关键是使用key (即date) - row_number,这将使用相同的编号对孤岛进行分组。结果与您的类似,但它可以在不使用lag的情况下处理任何数量的故障。您希望在此处仅使用故障:
select *,DeliveryDate-row_number() over (order by DeliveryId asc) as grp
from table1
where Status='Failure'现在你有了这个,你可以在上面添加dense_rank() (order by grp)来获得失败的次数,也许还可以合并成功的结果:
;with cte as
(
select *,DeliveryDate-row_number() over (order by DeliveryId asc) as grp
from table1
where Status='Failure'
)
select
cte.*,dense_rank() over (order by grp) as FailureNum
from cte
union all
select *,null as grp,null as FailureNum
from table1
where Status='Success'https://stackoverflow.com/questions/68158445
复制相似问题