现有订单操作记录表t_order_log,包含用户ID,订单ID,操作状态,操作时间。该表中操作状态包含下单、取消、改单,三种状态,用户取消订单后,则该订单不能再做修改。
有效订单:指未取消的订单。
查询出用户最新的有效订单下单记录
样例数据
+----------+-----------+---------+----------------------+
| user_id | order_id | status | create_time |
+----------+-----------+---------+----------------------+
| U001 | ord001 | 1 | 2025-06-01 12:01:00 |
| U001 | ord001 | 3 | 2025-06-01 12:02:00 |
| U001 | ord001 | 2 | 2025-06-01 12:03:00 |
| U001 | ord002 | 1 | 2025-06-01 12:05:00 |
| U001 | ord003 | 1 | 2025-06-01 12:15:00 |
| U001 | ord003 | 2 | 2025-06-01 12:20:00 |
| U002 | ord004 | 1 | 2025-06-01 15:01:00 |
| U002 | ord004 | 2 | 2025-06-01 15:02:00 |
| U003 | ord005 | 1 | 2025-06-01 20:01:00 |
| U003 | ord006 | 1 | 2025-06-01 20:02:00 |
+----------+-----------+---------+----------------------+数据说明:
期望结果
+----------+-----------+---------+----------------------+
| user_id | order_id | status | create_time |
+----------+-----------+---------+----------------------+
| U001 | ord002 | 1 | 2025-06-01 12:05:00 |
| U003 | ord006 | 1 | 2025-06-01 20:02:00 |
+----------+-----------+---------+----------------------+订单状态通常是对状态进行更新,在数据表中存储日志这种设计确实不合适。虽然设计不合适但是遇到我们得会处理。
处理难度相对简单,理清逻辑即可。
维度 | 评分 |
|---|---|
题目难度 | ⭐️⭐️⭐️ |
题目清晰度 | ⭐️⭐️⭐️⭐️⭐ |
业务常见度 | ⭐️⭐️⭐️⭐️ |
处理逻辑:
先查询所有的下单记录t1,再查询取消订单记录t2,使用orderid将t1 left join t2,剔除有取消的订单后,查询每个用户的最新记录。
查询SQL
select user_id,
order_id,
status,
create_time
from (select t1.user_id,
t1.order_id,
t1.status,
t1.create_time,
row_number() over (partition by t1.user_id order by t1.create_time desc) as rn
from (
--下单记录
select *
from t_order_log
where status = 1) t1
left join
(
--取消单记录
select *
from t_order_log
where status = 2) t2
on t1.order_id = t2.order_id
where t2.order_id is null--限定非取消单
) tt
where tt.rn = 1执行结果
+----------+-----------+---------+----------------------+
| user_id | order_id | status | create_time |
+----------+-----------+---------+----------------------+
| U001 | ord002 | 1 | 2025-06-01 12:05:00 |
| U003 | ord006 | 1 | 2025-06-01 20:02:00 |
+----------+-----------+---------+----------------------+处理逻辑:
如果我们将用户下单和取消的记录取出来,对每个订单进行统计,如果用户存在取消记录,则订单存在2条记录,如果用户不存在取消记录,则订单存在1条记录。 取出仅有一条记录的订单记录,然后根据用户进行分组,取出最新记录。
查询SQL
select user_id,
order_id,
status,
create_time
from (select user_id,
order_id,
status,
create_time,
row_number() over (partition by user_id order by create_time desc) as rn
from (
--订单行数
select user_id,
order_id,
status,
create_time,
count(1) over (partition by order_id) ord_cnt
from t_order_log
where status in (1, 2)) t
where ord_cnt = 1 --仅有下单记录的订单记录
) tt
where tt.rn = 1执行结果
+----------+-----------+---------+----------------------+
| user_id | order_id | status | create_time |
+----------+-----------+---------+----------------------+
| U001 | ord002 | 1 | 2025-06-01 12:05:00 |
| U003 | ord006 | 1 | 2025-06-01 20:02:00 |
+----------+-----------+---------+----------------------+CREATE TABLE IF NOT EXISTS t_order_log (
user_id string, --用户ID
order_id STRING, --订单ID
status bigint, --操作类型1-下单,2-取消,3-改单
create_time string --操作时间
);
insert into t_order_log(user_id, order_id,status,create_time) values
('U001','ord001',1,'2025-06-01 12:01:00'),
('U001','ord001',3,'2025-06-01 12:02:00'),
('U001','ord001',2,'2025-06-01 12:03:00'),
('U001','ord002',1,'2025-06-01 12:05:00'),
('U001','ord003',1,'2025-06-01 12:15:00'),
('U001','ord003',2,'2025-06-01 12:20:00'),
('U002','ord004',1,'2025-06-01 15:01:00'),
('U002','ord004',2,'2025-06-01 15:02:00'),
('U003','ord005',1,'2025-06-01 20:01:00'),
('U003','ord006',1,'2025-06-01 20:02:00');本文同步在微信公众号”数据仓库技术“和个人博客”数据仓库技术“发表。