Below is the dataset I am working with:
customer_id, event_date, status, credit_limit
1, 2019-1-1, C, 1000
1, 2019-1-5, F, 1000
1, 2019-3-10, [NULL], 1000
1, 2019-3-10, [NULL], 1000
1, 2019-8-27, L, 1000
2, 2019-1-1, L, 2000
2, 2019-1-5, [NULL], 2500
2, 2019-3-10, [NULL], 2500
3, 2019-1-1, S, 5000
3, 2019-1-5, [NULL], 6000
3, 2019-3-10, B, 5000
4, 2019-3-10, B, 10000
I am trying to solve the following problem:
For each customer_id, show account status at month end for the year 2019
I tried the window function last_value(), but it does not give me the latest date within each month. Here is my query:
with cte1 as
(select customer_id, status,
event_date,
last_value(date_format(event_date, '%Y-%m-%d')) over ( partition by customer_id, event_date
order by event_date) as l_v
from cust_acct ca
where event_date between "2019-01-01 00:00:00" and "2019-12-31 11:59:59")
select * from cte1
It returns:
Customer_id, Status, Event_date, L_v
1, C, 2019-01-01 00:00:00, 2019-01-01
1, F, 2019-01-05 00:00:00, 2019-01-05
1, [NULL], 2019-03-10 00:00:00, 2019-03-10
1, [NULL], 2019-03-10 00:00:00, 2019-03-10
1, L, 2019-08-27 00:00:00, 2019-08-27
2, L, 2019-01-01 00:00:00, 2019-01-01
2, [NULL], 2019-01-05 00:00:00, 2019-01-05
2, [NULL], 2019-03-10 00:00:00, 2019-03-10
3, S, 2019-01-01 00:00:00, 2019-01-01
3, [NULL], 2019-01-05 00:00:00, 2019-01-05
3, B, 2019-03-10 00:00:00, 2019-03-10
4, B, 2019-03-10 00:00:00, 2019-03-10
For customer_id 1 and the month 2019-01, the l_v column should hold a last_value of '2019-01-05'. Why does the query show both dates in the l_v column?
Posted on 2022-08-04 20:58:55
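LAST_VALUE() is not the right window function in this case.
Your query partitions by customer_id, event_date, so every partition holds a single date and l_v can only be that date. Even with one partition per month, an ORDER BY without a frame clause uses the default frame RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, so LAST_VALUE() would stop at the current row and its peers.
It only works if you partition by month and extend the window frame: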
WITH cte1 AS (
SELECT customer_id, status, event_date,
LAST_VALUE(DATE(event_date)) OVER (
PARTITION BY customer_id, DATE_FORMAT(event_date, '%Y-%m')
ORDER BY event_date
ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
) AS l_v
FROM cust_acct ca
WHERE event_date BETWEEN '2019-01-01 00:00:00' AND '2019-12-31 11:59:59'
)
SELECT * FROM cte1;
You should use FIRST_VALUE() instead:
WITH cte1 AS (
SELECT customer_id, status, event_date,
FIRST_VALUE(DATE(event_date)) OVER (
PARTITION BY customer_id, DATE_FORMAT(event_date, '%Y-%m')
ORDER BY event_date DESC
) AS l_v
FROM cust_acct ca
WHERE event_date BETWEEN '2019-01-01 00:00:00' AND '2019-12-31 11:59:59'
)
SELECT * FROM cte1;
Or, better still, MAX():
WITH cte1 AS (
SELECT customer_id, status, event_date,
MAX(DATE(event_date)) OVER (
PARTITION BY customer_id, DATE_FORMAT(event_date, '%Y-%m')
) AS l_v
FROM cust_acct ca
WHERE event_date BETWEEN '2019-01-01 00:00:00' AND '2019-12-31 11:59:59'
)
SELECT * FROM cte1;
See the demo.
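The queries above fix the l_v column, but the stated goal is the account status at month end. One possible way to get there is sketched below, under several assumptions that are not in the question: "status at month end" is taken to mean the status on the customer's latest event in that month, months without any event produce no row, a NULL status is reported as-is, and the column types in the CREATE TABLE are guesses. The CTE name last_per_month and the aliases event_month, rn and last_event_date are made up for illustration. The date filter uses a half-open range rather than the BETWEEN from the question, because '2019-12-31 11:59:59' would leave out events from the afternoon of December 31.

```sql
-- Hypothetical setup: the column types are assumed, only the names come from the question.
CREATE TABLE cust_acct (
    customer_id  INT,
    event_date   DATETIME,
    status       CHAR(1),
    credit_limit INT
);

-- Sample rows copied from the question.
INSERT INTO cust_acct (customer_id, event_date, status, credit_limit) VALUES
    (1, '2019-01-01', 'C',  1000),
    (1, '2019-01-05', 'F',  1000),
    (1, '2019-03-10', NULL, 1000),
    (1, '2019-03-10', NULL, 1000),
    (1, '2019-08-27', 'L',  1000),
    (2, '2019-01-01', 'L',  2000),
    (2, '2019-01-05', NULL, 2500),
    (2, '2019-03-10', NULL, 2500),
    (3, '2019-01-01', 'S',  5000),
    (3, '2019-01-05', NULL, 6000),
    (3, '2019-03-10', 'B',  5000),
    (4, '2019-03-10', 'B', 10000);

-- Number the events of each customer within each month, newest first,
-- then keep only the newest one (rn = 1).
WITH last_per_month AS (
    SELECT customer_id, status, event_date,
           DATE_FORMAT(event_date, '%Y-%m') AS event_month,
           ROW_NUMBER() OVER (
               PARTITION BY customer_id, DATE_FORMAT(event_date, '%Y-%m')
               ORDER BY event_date DESC
           ) AS rn
    FROM cust_acct
    -- Half-open range covering all of 2019, including the whole last day.
    WHERE event_date >= '2019-01-01' AND event_date < '2020-01-01'
)
SELECT customer_id, event_month, DATE(event_date) AS last_event_date, status
FROM last_per_month
WHERE rn = 1
ORDER BY customer_id, event_month;
```

With the sample data, customer 1 gets one row for 2019-01 with last_event_date 2019-01-05 and status F. When two rows share the same latest timestamp, as customer 1 does on 2019-03-10, ROW_NUMBER() picks one of them arbitrarily; add a tiebreaker column to the ORDER BY if that matters.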
https://stackoverflow.com/questions/73241626