首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >在TSQL中对连续值计数一次

在TSQL中对连续值计数一次
EN

Stack Overflow用户
提问于 2021-06-28 14:22:30
回答 4查看 107关注 0票数 0

我有一个表,里面有快递数据。我每天都有送货安排。我想统计一下失败的投递次数。如果连续几天投递失败,则计入1。例如,投递安排在2021年4月的每一天。4月15日交货失败。在那之后,从4月18日到4月20日,交货再次失败。虽然交付失败的天数是4天,但我希望将失败的天数计为2,因为连续失败的天数将被计为1。

代码语言:javascript
复制
DeliveryId     DeliveryDate     Status
1              2021-04-14       Success
2              2021-04-15       Failure
3              2021-04-16       Success
4              2021-04-17       Success
5              2021-04-18       Failure
6              2021-04-19       Failure
7              2021-04-20       Failure
8              2021-04-21       Success

我想要一个新的列,它将显示第一次失败的deliveryId,如下所示。

代码语言:javascript
复制
DeliveryId     DeliveryDate     Status     FailedDeliveryId
1              2021-04-14       Success     
2              2021-04-15       Failure     2
3              2021-04-16       Success     
4              2021-04-17       Success     
5              2021-04-18       Failure     5
6              2021-04-19       Failure     5
7              2021-04-20       Failure     5
8              2021-04-21       Success     

我已经尝试了几个选项,但未能达到上述结果。我在查询中使用了LAG函数来查找以前的交付状态。但问题是,如果交付失败超过3或4天,那么我将不得不使用滞后功能,以3或4天后,并检查状态。我想让它成为一个动态查询。下面是我用过的

代码语言:javascript
复制
SELECT *, 
       CASE WHEN Status='Failure' AND Prev_Status='Success' THEN DeliveryId 
            WHEN Status='Failure' AND Prev_Status='Failure' THEN Prev_DeliveryId 
            END AS FailureInstance 
FROM (
         SELECT *, 
                LAG(Status,1) OVER(ORDER BY DeliveryDate ASC) Prev_Status,
                LAG(DeliveryId,1) OVER(ORDER BY DeliveryDate ASC) Prev_DeliveryId
         FROM   table1 
     ) A
EN

回答 4

Stack Overflow用户

回答已采纳

发布于 2021-06-28 17:15:52

代码语言:javascript
复制
declare @t table(DeliveryId int, DeliveryDate date, Status varchar(10));

insert into @t(DeliveryId, DeliveryDate, Status)
values
(10, '2021-04-14', 'Success'),
(20, '2021-04-15', 'Failure'),
(30, '2021-04-16', 'Success'),
(40, '2021-04-17', 'Success'),
(70, '2021-04-18', 'Failure'),
(60, '2021-04-19', 'Failure'),
(50, '2021-04-20', 'Failure'),
(80, '2021-04-21', 'Success');


select *, 
case when Status='Failure' then min(grpDeliveryId) over(partition by grp) end as FailedDeliveryId,
case when Status='Failure' then datediff(minute, min(DeliveryDate) over(partition by grp), min(grpSuccessDate) over(partition by grp) ) end as MinutesDiffFailSuccess
from
(
select *, 
sum(addorcountme) over(order by DeliveryDate) as grp
from
(
select *, 
case when Status='Failure' and lag(Status) over(order by DeliveryDate)='Failure' then null else 1 end as addorcountme,
case when Status='Failure' and lag(Status) over(order by DeliveryDate)='Failure' then null else DeliveryId end as grpDeliveryId,
case when Status='Failure' and lead(Status) over(order by DeliveryDate)='Success' then lead(DeliveryDate) over(order by DeliveryDate) end as grpSuccessDate
from @t
) as t
) as g;



/*
select *, case when Status='Failure' then min(DeliveryId) over(partition by grp) end as FailedDeliveryId
from
(
select *, 
sum(sumorcountme) over(order by DeliveryDate) as grp
from
(
select *, case when Status='failure' and lag(Status) over(order by DeliveryDate)='Failure' then null else 1 end as sumorcountme
from @t
) as t
) as g; */
票数 1
EN

Stack Overflow用户

发布于 2021-06-28 18:44:49

一种方法只使用窗口函数。您可以根据每行的成功次数为每组故障分配一个组。然后,对于每个故障组,只需获取故障的最小id:

代码语言:javascript
复制
select t.*,
       (case when status = 'Failure'
             then min(case when status = 'Failure' then DeliveryId end) over (partition by grp)
        end) as first_failureId
from (select t.*,
             sum(case when status = 'Success' then 1 else 0 end) over (order by DeliveryDate) as grp
      from t
     ) t;

Here是一个db<>fiddle。

实际上,稍微简单一点的版本以相反的顺序分配组,因此第一条记录是失败的,而不是成功的,因此min()中的case是不必要的:

代码语言:javascript
复制
select t.*,
       (case when status = 'Failure'
             then min(DeliveryId) over (partition by grp)
        end) as first_failureId
from (select t.*,
             sum(case when status = 'Success' then 1 else 0 end) over (order by DeliveryDate desc) as grp
      from t
     ) t
order by DeliveryId;

嗯。..。..。另一种方法使用lag()来检测状态何时发生变化。然后,仅对更改使用累积最大值:

代码语言:javascript
复制
select t.*,
       (case when status = 'Failure'
             then max(case when prev_status is null or prev_status <> status then DeliveryId end) over (order by DeliveryDate)
        end) as first_failureId
from (select t.*,
             lag(status) over (order by DeliveryDate) as prev_status
      from t
     ) t
order by DeliveryId;
票数 1
EN

Stack Overflow用户

发布于 2021-06-28 16:32:09

这是一个“差距和岛屿”的问题。我最喜欢的文章是https://blog.jooq.org/2016/04/25/10-sql-tricks-that-you-didnt-think-were-possible/,技巧#4。

解析孤岛的关键是使用key (即date) - row_number,这将使用相同的编号对孤岛进行分组。结果与您的类似,但它可以在不使用lag的情况下处理任何数量的故障。您希望在此处仅使用故障:

代码语言:javascript
复制
select *,DeliveryDate-row_number() over (order by DeliveryId asc) as grp
from table1
where Status='Failure'

现在你有了这个,你可以在上面添加dense_rank() (order by grp)来获得失败的次数,也许还可以合并成功的结果:

代码语言:javascript
复制
;with cte as
(
    select *,DeliveryDate-row_number() over (order by DeliveryId asc) as grp
    from table1
    where Status='Failure'
)
    select 
        cte.*,dense_rank() over (order by grp) as FailureNum
    from cte
union all
    select *,null as grp,null as FailureNum
    from table1
    where Status='Success'
票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/68158445

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档