我想通过以下条件从数据库表中取样:我的示例应该在每天每小时从特定类别列中至少有100个样本。下表如下:
id recieved_time catgory
1 2022-08-01 4:05 blue
1 2022-08-01 5:05 red
1 2022-08-01 7:05 red
...有人能帮我写一个高效的sql查询吗?(顺便说一句,如果这有帮助的话,我正在使用clickhouse )
发布于 2022-10-17 01:56:27
如果您只希望每类随机抽样100个,请尝试不按条件订购的ROW_NUMBER():
SELECT
*
FROM
(
SELECT
*,
ROW_NUMBER() OVER (
PARTITION BY
category,
DATE_TRUNC('hour', received_time)
ORDER BY
(SELECT NULL) -- change if you care
) AS category_rownumber
FROM
TableName
) X
WHERE
category_rownumber <= 100
ORDER BY
category,
received_time DESC
;https://stackoverflow.com/questions/74086031
复制相似问题