首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >如何优化这个工作中的TAT查询?

如何优化这个工作中的TAT查询?
EN

Stack Overflow用户
提问于 2021-06-25 17:58:03
回答 1查看 108关注 0票数 0

我的任务是编写一份不包括周末和假日的TAT报告。这篇文章的目的是确定我所做的解决方案是否可以更好地优化,通过改进或使用另一种选择。

当我最初着手这项任务时,我想加入一个data表,但这似乎有问题,因为如果我有这样的数据集:

代码语言:javascript
复制
create table #example (ID varchar(50), startdate datetime2, enddate datetime2)

insert into #example
values
('4CF662A3-57EE-446B-9FC3-3F105219F253',      '2021-05-07 21:45:00.000',              '2021-05-15 08:46:32.000')
,('8B58294D-C20E-4E87-8B9C-186EFB3DCF19',      '2021-04-29 08:52:00.000',              '2021-05-06 20:02:58.000')
,('6177A53C-B285-4087-9F03-BF3324992DEE',      '2021-05-20 10:47:00.000',              '2021-05-27 22:05:30.000')
,('26E3A73E-9D5A-472C-890A-57D51B038C79',      '2021-05-04 20:24:00.000',              '2021-05-12 09:11:47.000')
,('ECC7DB03-A2D9-4D03-AC8D-47B5204C5512',      '2021-05-13 20:24:00.000',              '2021-05-21 10:29:03.000')
,('0A673471-CA22-4AA5-8024-EEF7CEBA8A0D',      '2021-05-10 20:24:00.000',              '2021-05-18 11:03:41.000')
,('27ADB272-A9E0-42ED-AE3C-E42F051C60DF',      '2021-05-03 20:06:00.000',              '2021-05-11 10:59:29.000')
,('5B9D97EC-A994-45E3-BB18-E583C9702389',      '2021-05-11 18:39:00.000',              '2021-05-19 09:56:18.000')
,('DE88A793-4F1C-479D-BB28-4245D9D30606',      '2021-05-10 20:07:00.000',              '2021-05-18 11:48:17.000')
,('42962CBB-79AF-44CB-8EB9-565A75EEA565',      '2021-04-26 21:35:00.000',              '2021-05-04 13:16:25.000')
,('62A980BF-D6B5-47F3-A671-D7F3DE046747',      '2021-05-20 18:08:00.000',              '2021-05-28 10:06:56.000')
,('E1D89979-7D77-4D63-983E-8127A1CB81FA',      '2021-05-18 20:28:00.000',              '2021-05-26 12:27:32.000')
,('D3D923AC-21E0-401E-A9DB-7BE1BE87A6B1',      '2021-05-13 19:56:00.000',              '2021-05-21 11:55:22.000')
,('364F5BF9-55A3-430E-9F41-F3EB41D30633',      '2021-05-13 19:58:00.000',              '2021-05-21 12:09:43.000')

select * from #example drop table #example

然后加入一个日期表,只需告诉我加入的日期是否是工作日、假日等等。

(注:我见过有人试图通过加入日期表来创建一个Business报告,但它们似乎不像预期的那样准确,或者有一些奇怪的问题。由于这些原因,我本人没有深入探讨这个选择。)

接下来,我使用一个循环查看了一个标量UDF。我的灵感主要来自于本文:https://anyonconsulting.com/business_intelligence/how-to-count-the-number-of-business-days-between-two-dates-which-exclude-weekends-and-holidays/。这是我最终开发的查询:

代码语言:javascript
复制
declare @DateFrom as datetime2 = '2021-05-17 19:06:00.000';
declare @DateTo as datetime2 = '2021-05-31 17:59:51.000';

-- Switches things around
if @DateFrom > @DateTo
    begin
        declare
            @T datetime2 = @DateTo
            , @F datetime2 = @DateFrom
        select @DateFrom = @T
            , @DateTo = @F
    end

declare @temptable table (
CalendarDate datetime2 not null primary key
, DayID as (datepart(weekday,[CalendarDate]))
, isHoliday as 
    (case  
        -- New Year's Day, Observed 1st Day of Year
        when month([CalendarDate]) = 1  and day([CalendarDate]) = 1 and datepart(weekday,[CalendarDate]) in (2,3,4,5,6) then 1 --New Year’s Day
        when month([CalendarDate]) = 12  and day([CalendarDate]) = 31 and datepart(weekday,[CalendarDate]) =6 then 1 --New Year’s Day
        when month([CalendarDate]) = 1  and day([CalendarDate]) = 2 and datepart(weekday,[CalendarDate]) =2 then 1 --New Year’s Day

        -- Martin Luther King Jr. Day, Observed 3rd Monday of January
        when month([CalendarDate]) = 1  and day([CalendarDate]) >= 15 and day([CalendarDate]) <= 21 and datepart(weekday,[CalendarDate]) = 2 then 1

        -- Presidents' Day, Officially Washington’s Birthday, Observed 3rd Monday of February
        when month([CalendarDate]) = 2  and day([CalendarDate]) >= 15 and day([CalendarDate]) <= 21 and datepart(weekday,[CalendarDate]) = 2 then 1 

        -- Memorial Day, Observed Last Monday of May
        when month([CalendarDate]) = 5  and day([CalendarDate]) >= 25 and datepart(weekday,[CalendarDate]) = 2 then 1

        -- Independence Day, July 4th
        when month([CalendarDate]) = 7  and day([CalendarDate]) = 4 and datepart(weekday,[CalendarDate]) IN (2,3,4,5,6) then 1
        when month([CalendarDate]) = 7  and day([CalendarDate]) = 3 and datepart(weekday,[CalendarDate]) =6 then 1
        when month([CalendarDate]) = 7  and day([CalendarDate]) = 5 and datepart(weekday,[CalendarDate]) =2 then 1

        -- Labor Day, Observed First Monday of September
        when month([CalendarDate]) = 9  and day([CalendarDate]) <= 7 and datepart(weekday,[CalendarDate]) = 2 then 1

        -- Thanksgiving Day, Observed Fourth Thursday of November
        when month([CalendarDate]) = 11 and day([CalendarDate]) >= 22 and day([CalendarDate]) <= 28 AND datepart(weekday,[CalendarDate]) = 5 then 1

        -- Christmas Day, Observed 25th of December
        when month([CalendarDate]) = 12 and day([CalendarDate]) = 25 and datepart(weekday,[CalendarDate]) IN (2,3,4,5,6) then 1
        when month([CalendarDate]) = 12 and day([CalendarDate]) = 24 and datepart(weekday,[CalendarDate]) =6 then 1
        when month([CalendarDate]) = 12 and day([CalendarDate]) = 26 and datepart(weekday,[CalendarDate]) =2 then 1

        else 0

    end))

-- Fill the Calendar table
declare @D datetime2 = @DateFrom
while @D <= @DateTo
begin
   insert into @temptable (CalendarDate)
      select @D
   set @D = dateadd(hour, 1, @D)
end

declare
    @Count as int = 0
    , @Date as datetime2 = @DateFrom
while 
    @Date <= @DateTo
begin
    if ((datepart(weekday, @Date) in (1, 7))
        or exists
            (select *
            from @temptable
            where CalendarDate = @Date
                and isHoliday = 1
                and (DayID <> 7 or DayID <> 1)))
        begin
            select @Count = @Count + 1
        end

    select @Date = dateadd(hour, 1, @Date)
end

select convert(decimal(38,2) , (datediff(hour, @DateFrom, @DateTo) - (@Count)) / 24.0) as [TAT]

这个脚本,当被放入一个函数中时,是相当准确的,但是性能很慢。在一组43,057行的测试中,它花费了1分19秒(让它执行得很好对我来说很重要,因为查询已经足够长了)。由于性能不佳,我开始考虑其他方法来做到这一点。我研究了一段时间,并对此表示赞赏,其中一个是:https://www.sqlservercentral.com/articles/calculating-work-days。但不幸的是,它对我来说还不够精确,因为我想要小数天的数字(所以我想要4.68,而不是5)。

然后,我开始考虑某种基于集合的函数,灵感来自于这个https://www.sqlservercentral.com/articles/the-joy-of-numbers以及Itzik的based查询,第216-218页关于填充数字表。在进行了一些实验之后,我得到了以下查询:

代码语言:javascript
复制
declare 
    @TotalTimeDiff int
    , @low int
    , @DateFrom datetime2
    , @DateTo datetime2;

set @DateFrom = '2021-05-17 19:06:00.000';
set @DateTo = '2021-05-31 17:59:51.000';

set @low = 1;
set @TotalTimeDiff = (datediff(hour, @DateFrom, @DateTo))

;with 
    L0 as (select C from (values(1),(1)) as D(c)),
    L1 as (select 1 as c from L0 as A cross join L0 as B),
    L2 as (select 1 as c from L1 as A cross join L1 as B),
    L3 as (select 1 as c from L2 as A cross join L2 as B),
    L4 as (select 1 as c from L3 as A cross join L3 as B),
    L5 as (select 1 as c from L4 as A cross join L4 as B), -- Up to 4,294,967,296 rows
    
    Nums as (select ROW_NUMBER() over(order by (select null)) as rownum from L5),

    Times as (select top (@TotalTimeDiff) @low + rownum - 1 as n from Nums)

select
    [TAT] = isnull(((count(BusinessHours.n) - sum(BusinessHours.NonBusinessHours)) / 24.0), datediff(minute, @DateFrom, @DateTo) / 1440.0)
from (
    select tms.n
    , [NonBusinessHours] = 
        case 
            when datename(dw, dateadd(hour, tms.n, @DateFrom)) in ('Sunday', 'Saturday') then 1
            
            -- New Year's Day, Observed 1st Day of Year
            when month(dateadd(hour, tms.n, @DateFrom)) = 1  and day(dateadd(hour, tms.n, @DateFrom)) = 1 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) in (2,3,4,5,6) then 1 --    New Year’s Day
            when month(dateadd(hour, tms.n, @DateFrom)) = 12  and day(dateadd(hour, tms.n, @DateFrom)) = 31 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) =6 then 1 --New Year’s    Day
            when month(dateadd(hour, tms.n, @DateFrom)) = 1  and day(dateadd(hour, tms.n, @DateFrom)) = 2 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) =2 then 1 --New Year’s  Day

            -- Martin Luther King Jr. Day, Observed 3rd Monday of January
            when month(dateadd(hour, tms.n, @DateFrom)) = 1  and day(dateadd(hour, tms.n, @DateFrom)) >= 15 and day(dateadd(hour, tms.n, @DateFrom)) <= 21 and datepart(weekday,dateadd (hour, tms.n, @DateFrom)) = 2 then 1

            -- Presidents' Day, Officially Washington’s Birthday, Observed 3rd Monday of February
            when month(dateadd(hour, tms.n, @DateFrom)) = 2  and day(dateadd(hour, tms.n, @DateFrom)) >= 15 and day(dateadd(hour, tms.n, @DateFrom)) <= 21 and datepart(weekday,dateadd (hour, tms.n, @DateFrom)) = 2 then 1 

            -- Memorial Day, Observed Last Monday of May
            when month(dateadd(hour, tms.n, @DateFrom)) = 5  and day(dateadd(hour, tms.n, @DateFrom)) >= 25 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) = 2 then 1

            -- Independence Day, July 4th
            when month(dateadd(hour, tms.n, @DateFrom)) = 7  and day(dateadd(hour, tms.n, @DateFrom)) = 4 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) IN (2,3,4,5,6) then 1
            when month(dateadd(hour, tms.n, @DateFrom)) = 7  and day(dateadd(hour, tms.n, @DateFrom)) = 3 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) =6 then 1
            when month(dateadd(hour, tms.n, @DateFrom)) = 7  and day(dateadd(hour, tms.n, @DateFrom)) = 5 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) =2 then 1

            -- Labor Day, Observed First Monday of September
            when month(dateadd(hour, tms.n, @DateFrom)) = 9  and day(dateadd(hour, tms.n, @DateFrom)) <= 7 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) = 2 then 1 

            -- Thanksgiving Day, Observed Fourth Thursday of November
            when month(dateadd(hour, tms.n, @DateFrom)) = 11 and day(dateadd(hour, tms.n, @DateFrom)) >= 22 and day(dateadd(hour, tms.n, @DateFrom)) <= 28 AND datepart(weekday,dateadd (hour, tms.n, @DateFrom)) = 5 then 1

            -- Christmas Day, Observed 25th of December
            when month(dateadd(hour, tms.n, @DateFrom)) = 12 and day(dateadd(hour, tms.n, @DateFrom)) = 25 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) IN (2,3,4,5,6) then 1
            when month(dateadd(hour, tms.n, @DateFrom)) = 12 and day(dateadd(hour, tms.n, @DateFrom)) = 24 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) =6 then 1
            when month(dateadd(hour, tms.n, @DateFrom)) = 12 and day(dateadd(hour, tms.n, @DateFrom)) = 26 and datepart(weekday,dateadd(hour, tms.n, @DateFrom)) =2 then 1

        else 0 end
    from Times as tms) as [BusinessHours]

此查询在转换为标量UDF后,在17秒钟内计算出43,057行的业务TAT,这是对循环查询的明显改进。(顺便说一句,这就是Server 2017。)

首先,我想知道最后一个解决方案是否是人们可以考虑的“基于集”的解决方案。它似乎比明显的过程/迭代循环更基于集合,但它不是根据数据集本身计算TAT,而是从表中逐项输入(@DateFrom和@DateTo)。我认为纯“基于集”的解决方案意味着使用从数据集本身到日期表的复杂连接逻辑。然而,这个函数是基于设置的,也就是说,一旦函数本身运行,它就不会经过一个循环,一个接一个地确定每一个小时,而是给出一个数字列表,在相关的日期/时间集中推入逐项列出的小时,以及每个小时是否是一个工作小时。

第二,有人知道有什么方法可以进一步改进吗?或者是另一种方法,以达到这种精确度(将两个日期之间的每一个小时划分为一个营业小时)?比这更快的东西?

谢谢。如果这是不恰当的发布(在这里看起来确实不那么具体),我会感谢重定向,所以我可以在那里张贴这个。我主要是尝试众包这个功能,看看它能走多远,以及把这个函数放在那里,因为它是有趣的收敛后,看看多个解决方案。

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2021-07-12 17:51:06

多亏@Charlieface,我提出了一个非常精确的解决方案,只使用一个日期表。因为我只在生产数据上做了这个解决方案,所以我不会在这里发布,但这是可能的。话虽如此,我只能在一定程度上改善这种解决办法的问题。引起的问题是:

  • ,如果某件事在星期天完成了怎么办?
  • ,我们怎么知道哪几天是真正的工作?
  • ,他们星期六也工作吗?

这个解决方案确实比我上面的第二个解决方案执行得更快(大约快5-7秒),但是,考虑到一些不准确的地方,我选择了上面的第二个解决方案。希望这篇文章能帮助那些希望根据自己的自定义业务逻辑计算业务TAT的人提供指导。

票数 0
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/68135326

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档