假设有如下记录:
Employee_id, work_start_date, work_end_date
1, 01-jan-2014, 07-jan-2014
1, 03-jan-2014, 12-jan-2014
1, 23-jan-2014, 25-jan-2014
2, 15-jan-2014, 25-jan-2014
2, 07-jan-2014, 15-jan-2014
2, 09-jan-2014, 12-jan-2014所需的是编写一个select状态,该状态将汇总按employee_id分组的工作日,但不包括重叠期间(意思是--只计算它们一次)。
预期的产出将是:
Employee_id, worked_days
1, 13
2, 18日期范围内工作日的计算如下:如果work_start_date =5和work_end_date =9,那么worked_days =4(9-5)。
我可以编写一个pl/sql函数来解决这个问题(手动迭代记录并执行计算),但我确信可以使用SQL来获得更好的性能。
谁能帮我指出正确的方向吗?
谢谢!
发布于 2014-04-01 16:54:38
这是一个来自类似问题的稍微修改的查询:
compute sum of values associated with overlapping date ranges
SELECT "Employee_id",
SUM( "work_end_date" - "work_start_date" )
FROM(
SELECT "Employee_id",
"work_start_date" ,
lead( "work_start_date" )
over (Partition by "Employee_id"
Order by "Employee_id", "work_start_date" )
As "work_end_date"
FROM (
SELECT "Employee_id", "work_start_date"
FROM Table1
UNION
SELECT "Employee_id","work_end_date"
FROM Table1
) x
) x
WHERE EXISTS (
SELECT 1 FROM Table1 t
WHERE t."work_start_date" > x."work_end_date"
AND t."work_end_date" > x."work_start_date"
OR t."work_start_date" = x."work_start_date"
AND t."work_end_date" = x."work_end_date"
)
GROUP BY "Employee_id"
;演示:http://sqlfiddle.com/#!4/4fcce/2
发布于 2014-04-01 16:11:07
这是个棘手的问题。例如,您不能使用lag(),因为重叠期可能不是“前一个”周期。或者不同的时段可以在同一天开始或停止。
我们的想法是重建这些时期。怎么做?找到周期开始的记录-也就是说,没有与任何其他的重叠。然后将其用作标志,并累计计数此标志以计数重叠组。那么,获取工作日只是从那里聚合出来的:
with ps as (
select e.*,
(case when exists (select 1
from emps e2
where e2.employee_id = e.employee_id and
e2.work_start_date <= e.work_start_date and
e2.work_end_date >= e.work_end_date
)
then 0 else 1
) as IsPeriodStart
from emps e
)
select employee_id, sum(work_end_date - work_start_date) as Days_Worked
from (select employee_id, min(work_start_date) as work_start_date,
max(work_end_date) as work_end_date
from (select ps.*,
sum(IsPeriod_Start) over (partition by employee_id
order by work_start_date
) as grp
from ps
) ps
group by employee_id, grp
) ps
group by employee_id;发布于 2014-04-01 16:24:33
date_tbl型
创建或替换包RG_TYPE类型为date_tbl是日期表;
函数(结果为包含两个参数之间的日期的表)
创建或替换函数日期( p_from date,p_to date )返回rg_type.date_tbl流水线的是l_idx date:=p_from;开始循环如果l_idx>nvl(p_to,p_from)然后退出;end if;管道行(L_idx);l_idx:=l_idx+1;end循环;返回;end;
SQL:
选择employee_id,sum(c) from (选择e.employee_id,d.column_value,count(distinct w.employee_id)作为c from (从works中选择distinct employee_id ) e,表(dates(选择min(work_start_date)作为一个from ),(选择max(work_end_date)作为b来自works) )d工作w其中e.employee_id=w.employee_id和d.column_value>=w.work_start_date以及d.column_value
https://stackoverflow.com/questions/22790678
复制相似问题