I have a table that is missing data for some weeks (shown below):
week | customer_id | score
-----------|--------------|---------
2019-10-27 | 1 | 3
2019-11-10 | 1 | 4
2019-10-20 | 2 | 5
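2019-10-27 | 2 | 8

So I used BigQuery's GENERATE_DATE_ARRAY function to fill in the missing weeks for each customer (between 2019-10-20 and 2019-11-10), which leaves customer_id and score NULL for the missing weeks (shown below).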
week | customer_id | score
-----------|--------------|---------
2019-10-20 | NULL | NULL
2019-10-27 | 1 | 3
2019-11-03 | NULL | NULL
2019-11-10 | 1 | 4
2019-10-20 | 2 | 5
2019-10-27 | 2 | 8
2019-11-03 | NULL | NULL
2019-11-10 | NULL | NULL

I want to forward fill customer_id and score for each customer using the last non-null value, so that the table looks like this:
week | customer_id | score
-----------|--------------|---------
2019-10-20 | NULL | NULL
2019-10-27 | 1 | 3
2019-11-03 | 1 | 3
2019-11-10 | 1 | 4
2019-10-20 | 2 | 5
2019-10-27 | 2 | 8
2019-11-03 | 2 | 8
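2019-11-10 | 2 | 8

I wrote this query, but because customer_id is NULL in some rows, I can't partition by that field, and NULL values are returned instead. If I filter WHERE customer_id = 1 and remove the PARTITION BY clause, I get the desired result, but I can't make it work for multiple customers.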
WITH weeks AS
  (SELECT week
   FROM UNNEST(GENERATE_DATE_ARRAY('2019-10-20', '2019-11-10', INTERVAL 1 WEEK)) week
  ),
table AS
  (SELECT *, DATE_TRUNC(EXTRACT(DATE FROM created_at), WEEK) AS week,
   FROM score
  )
SELECT weeks.week,
  COALESCE(table.customer_id, LAST_VALUE(table.customer_id IGNORE NULLS) OVER (PARTITION BY table.customer_id ORDER BY weeks.week)) AS customer_id,
  COALESCE(table.score, LAST_VALUE(table.score IGNORE NULLS) OVER (PARTITION BY table.customer_id ORDER BY weeks.week)) AS score,
FROM weeks
LEFT JOIN table
  ON weeks.week = table.week

I'd like to know how to generate this date array for each customer and then somehow forward fill any missing data using that customer's last customer_id and score values. Any help would be much appreciated!
Posted on 2021-05-28 05:26:09
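The most efficient approach is to generate the data only where it is needed:
select the_week, t.customer_id, t.score
from (select DATE_TRUNC(EXTRACT(DATE FROM created_at), WEEK) AS week,
             customer_id, score,
             -- Week of this customer's next row. The last row has no next row, so
             -- default to the week after the range end (2019-11-10); that keeps the
             -- last row and fills it forward through the end of the range.
             lead(DATE_TRUNC(EXTRACT(DATE FROM created_at), WEEK), 1, DATE '2019-11-17')
               over (partition by customer_id order by created_at) as next_week
      from t
     ) t cross join
     unnest(generate_date_array(t.week,
                                date_add(t.next_week, interval -1 week),
                                interval 1 week
                               )) the_week;

By generating only the weeks each row has to cover, you don't need to "fill in" anything. The default passed to lead() matters: without it, next_week is NULL on each customer's last row, generate_date_array returns NULL, and the cross join drops that row. The one drawback is that you get no rows before a customer's first week. You could fill those in if you really wanted to, but that doesn't seem very useful.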
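For comparison, the route the question describes (build the full week range for every customer, then forward fill) might look like the sketch below. It is only an illustration under stated assumptions: the CTE names weekly_scores and customers are hypothetical, and the question's sample rows are inlined, already truncated to weeks, so the query is self-contained. The key point is that PARTITION BY uses the customer_id from the customers list, which is never NULL, rather than the one from the left-joined table.

```sql
-- Hypothetical inline copy of the question's sample data (already one row per week).
WITH weekly_scores AS (
  SELECT DATE '2019-10-27' AS week, 1 AS customer_id, 3 AS score UNION ALL
  SELECT DATE '2019-11-10', 1, 4 UNION ALL
  SELECT DATE '2019-10-20', 2, 5 UNION ALL
  SELECT DATE '2019-10-27', 2, 8
),
-- Every week in the reporting range.
weeks AS (
  SELECT week
  FROM UNNEST(GENERATE_DATE_ARRAY(DATE '2019-10-20', DATE '2019-11-10', INTERVAL 1 WEEK)) AS week
),
-- Every customer that appears in the data.
customers AS (
  SELECT DISTINCT customer_id FROM weekly_scores
)
SELECT
  w.week,
  c.customer_id,  -- taken from the customer list, so it is never NULL
  -- Most recent non-null score for this customer up to and including this week.
  LAST_VALUE(s.score IGNORE NULLS) OVER (
    PARTITION BY c.customer_id
    ORDER BY w.week
    ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
  ) AS score
FROM customers AS c
CROSS JOIN weeks AS w          -- one row per (customer, week)
LEFT JOIN weekly_scores AS s
  ON s.customer_id = c.customer_id
 AND s.week = w.week
ORDER BY c.customer_id, w.week;
```

On the sample data this should give customer 1 the scores NULL, 3, 3, 4 and customer 2 the scores 5, 8, 8, 8 across the four weeks. Unlike the desired table in the question, the 2019-10-20 row for customer 1 carries customer_id 1 with a NULL score, because there is no earlier value to carry forward.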
https://stackoverflow.com/questions/67730146