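I need to run aggregate statistics on a table at multiple aggregation levels.
An inelegant way to do this is to UNION several SELECT queries that use the same field names, each with a different field in its GROUP BY.
Example: the total population (sum of the population field) for each municipality (the smallest geographic unit), county, nation, and the world.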
-- aggregate dummy table population_statistics by
-- 1:municipality 2:county 3:nation 4:world
SELECT
"municipality" AS geo_level,
municipality_name AS geo_name,
SUM(population) AS population
FROM population_statistics
GROUP BY municipality_name
UNION
SELECT
"county" AS geo_level,
county_name AS geo_name,
SUM(population) AS population
FROM population_statistics
GROUP BY county_name
UNION
SELECT
"nation" AS geo_level,
country_name AS geo_name,
SUM(population) AS population
FROM population_statistics
GROUP BY country_name
UNION
SELECT
"world" AS geo_level,
"world" AS geo_name,
SUM(population) AS population
FROM population_statistics
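;
I know that libraries such as R's tidyr and Python's pandas offer cleaner ways to run multi-level aggregations on a table. But is it possible to run a multi-level aggregation in pure SQL? Perhaps with common table expressions (recently added in MySQL 8), which might also make the aggregation more efficient?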
Posted on 2021-07-15 13:47:56
In MySQL 8.0+, you can create a CTE that contains all the levels you want to aggregate, and then apply the aggregation once:
WITH cte AS (
SELECT 1 sort_order, 'municipality' geo_level, municipality_name geo_name, population FROM population_statistics
UNION ALL
SELECT 2, 'county', county_name, population FROM population_statistics
UNION ALL
SELECT 3, 'nation', country_name, population FROM population_statistics
UNION ALL
SELECT 4, 'world', 'world', population FROM population_statistics
)
SELECT geo_level, geo_name, SUM(population) population
FROM cte
GROUP BY sort_order, geo_level, geo_name
ORDER BY sort_order
See the demo.
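For anyone who wants to try the CTE query without a MySQL server, here is a minimal sketch that runs it against an in-memory SQLite database from Python. SQLite is only a stand-in for MySQL 8 here, and the sample rows, the column types, and the helper name aggregate_levels are made up for illustration. The aliases are written with explicit AS, and geo_name is added to the ORDER BY so the row order within each level is deterministic; otherwise the query is the one above.

```python
import sqlite3

# Hypothetical sample rows:
# (municipality_name, county_name, country_name, population)
ROWS = [
    ("Alpha", "North", "Freedonia", 100),
    ("Beta", "North", "Freedonia", 200),
    ("Gamma", "South", "Freedonia", 300),
    ("Delta", "East", "Sylvania", 400),
]

# The CTE stacks one copy of the table per level, then a single
# GROUP BY aggregates all levels at once.
QUERY = """
WITH cte AS (
  SELECT 1 AS sort_order, 'municipality' AS geo_level,
         municipality_name AS geo_name, population
  FROM population_statistics
  UNION ALL
  SELECT 2, 'county', county_name, population FROM population_statistics
  UNION ALL
  SELECT 3, 'nation', country_name, population FROM population_statistics
  UNION ALL
  SELECT 4, 'world', 'world', population FROM population_statistics
)
SELECT geo_level, geo_name, SUM(population) AS population
FROM cte
GROUP BY sort_order, geo_level, geo_name
ORDER BY sort_order, geo_name
"""


def aggregate_levels(rows):
    """Load rows into a throwaway table and return the multi-level totals."""
    conn = sqlite3.connect(":memory:")
    try:
        # Assumed schema: the original post does not show the table definition.
        conn.execute(
            "CREATE TABLE population_statistics ("
            "municipality_name TEXT, county_name TEXT, "
            "country_name TEXT, population INTEGER)"
        )
        conn.executemany(
            "INSERT INTO population_statistics VALUES (?, ?, ?, ?)", rows
        )
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for level, name, total in aggregate_levels(ROWS):
        print(level, name, total)
```

With these four sample rows the result has ten rows: four municipalities, three counties, two nations, and one world total. Note that every level is grouped by name alone, so two counties with the same name in different countries would be merged, exactly as in the original UNION query.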
https://stackoverflow.com/questions/68381511