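I am trying to generate, for each date, a cumulative average of value across ids, such that only the latest value of each id is taken into account. For example, the initial table looks like this:
id value y m d
1 1 2020 3 10
2 2 2020 3 10
3 1 2020 3 11
2 4 2020 3 11
I want to produce:
date average
2020-3-10 1.5
2020-3-11 2
Here, the cumulative average for date 2020-3-11 is calculated as (1+4+1)/3: the value from the first row (id 1) plus the values from the last two rows (ids 3 and 2). The earlier value 2 for id 2 is superseded by its later value 4.
I tried to solve the problem this way: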
SELECT date_parse(cast(c.y*10000+c.m*100+c.d as varchar), '%Y%m%d') as date, avg(s.value) as cum_aver FROM
(SELECT *
FROM (
SELECT id, value, date_parse(cast(y*10000+m*100+d as varchar), '%Y%m%d') as date,
ROW_NUMBER () OVER (PARTITION BY id ORDER BY date_parse (cast(y*10000+m*100+d as varchar), '%Y%m%d') DESC, id DESC) rn) from table
WHERE rn = 1) s
join table c ON
s.date <= date_parse (cast(c.y*10000+c.m*100+c.d as varchar), '%Y%m%d')
group by c.y, c.m, c.d
But it does not give me the desired output.
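Posted on 2020-03-16 20:49:46
Getting the average of the most recent value for each id is rather tricky. The idea is to take the sum of the latest value for each id and divide it by the number of distinct ids.
Both parts are tricky. To get the sum, one method is to keep the first value of each id and then take successive differences within that id. The running sum of these differences equals the sum of the latest values at any point in time. As for the number of distinct ids, count each id the first time you see it.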
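select y, m, d,
       -- running sum of per-id changes = sum of the latest value of every id seen so far;
       -- "* 1.0" avoids integer division if value is an integer column
       (sum(sum(value - prev_value)) over (order by y, m, d) * 1.0 /
        -- running count of first appearances = number of distinct ids seen so far
        sum(sum(case when seqnum = 1 then 1 else 0 end)) over (order by y, m, d)
       ) as average
from (select t.*,
             row_number() over (partition by id order by y, m, d) as seqnum,
             -- previous value for the same id, defaulting to 0 on its first row
             lag(value, 1, 0) over (partition by id order by y, m, d) as prev_value
      from t
     ) t
group by y, m, d;
Here is a db<>fiddle.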
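To check the successive-differences idea against the sample data from the question, here is a minimal Python sketch. It is not part of the original answer. The function name `cumulative_latest_average` and the in-memory `rows` list are made up for illustration, and the sketch assumes at most one row per id per date.

```python
from itertools import groupby

# Sample rows from the question: (id, value, y, m, d)
rows = [
    (1, 1, 2020, 3, 10),
    (2, 2, 2020, 3, 10),
    (3, 1, 2020, 3, 11),
    (2, 4, 2020, 3, 11),
]


def cumulative_latest_average(rows):
    """Return [((y, m, d), average)] using per-id successive differences.

    Hypothetical helper for illustration; assumes one row per id per date.
    """
    last_value = {}   # id -> most recent value seen (plays the role of lag())
    running_sum = 0   # running sum of (value - previous value for that id)
    results = []
    ordered = sorted(rows, key=lambda r: r[2:])
    for date, group in groupby(ordered, key=lambda r: r[2:]):
        for id_, value, *_ in group:
            # A first appearance contributes the full value (previous = 0);
            # later rows contribute only the change.
            running_sum += value - last_value.get(id_, 0)
            last_value[id_] = value
        # len(last_value) is the number of distinct ids seen so far.
        results.append((date, running_sum / len(last_value)))
    return results


for date, avg in cumulative_latest_average(rows):
    print("-".join(map(str, date)), avg)
# prints:
# 2020-3-10 1.5
# 2020-3-11 2.0
```

On 2020-3-11 the running sum is 3 + 1 (new id 3) + (4 - 2) (change for id 2) = 6 over 3 distinct ids, which matches the (1+4+1)/3 in the question.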
https://stackoverflow.com/questions/60712980