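How can I get the most frequently occurring category for each tag in MySQL? Ideally, I'd like to emulate an aggregate function that computes the mode of a column.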
SELECT
t.tag
, s.category
FROM tags t
LEFT JOIN stuff s
USING (id)
ORDER BY tag;
+-------------+----------+
| tag         | category |
+-------------+----------+
| automotive  |        8 |
| ba          |        8 |
| bamboo      |        8 |
| bamboo      |        8 |
| bamboo      |        8 |
| bamboo      |        8 |
| bamboo      |        8 |
| bamboo      |       10 |
| bamboo      |        8 |
| bamboo      |        9 |
| bamboo      |        8 |
| bamboo      |       10 |
| bamboo      |        8 |
| bamboo      |        9 |
| bamboo      |        8 |
| banana tree |        8 |
| banana tree |        8 |
| banana tree |        8 |
| banana tree |        8 |
| bath        |        9 |
+-------------+----------+
Posted on 2009-09-10 22:05:52
SELECT t1.*
FROM (SELECT tag, category, COUNT(*) AS count
FROM tags INNER JOIN stuff USING (id)
GROUP BY tag, category) t1
LEFT OUTER JOIN
(SELECT tag, category, COUNT(*) AS count
FROM tags INNER JOIN stuff USING (id)
GROUP BY tag, category) t2
ON (t1.tag = t2.tag AND (t1.count < t2.count
OR t1.count = t2.count AND t1.category < t2.category))
WHERE t2.tag IS NULL
ORDER BY t1.count DESC;
I agree this is a bit much for a single SQL query. Any use of GROUP BY inside a subquery makes me wince. You can make it look simpler by using a view:
CREATE VIEW count_per_category AS
SELECT tag, category, COUNT(*) AS count
FROM tags INNER JOIN stuff USING (id)
GROUP BY tag, category;
SELECT t1.*
FROM count_per_category t1
LEFT OUTER JOIN count_per_category t2
ON (t1.tag = t2.tag AND (t1.count < t2.count
OR t1.count = t2.count AND t1.category < t2.category))
WHERE t2.tag IS NULL
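ORDER BY t1.count DESC;
But it's basically doing the same work behind the scenes.
You commented that you could easily do a similar operation in application code. So why don't you do that? Run the simpler query to get the count for each tag and category: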
SELECT tag, category, COUNT(*) AS count
FROM tags INNER JOIN stuff USING (id)
GROUP BY tag, category;
Then sort through the results in application code.
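As a rough illustration of the application-side step, here is a minimal Python sketch. It assumes the rows of the GROUP BY tag, category query have already been fetched as (tag, category, count) tuples; the function name mode_per_tag and the hard-coded rows are hypothetical, with the counts tallied from the sample data in the question. Ties are broken in favor of the larger category value, which is meant to mirror the tie-break in the join-based query earlier in this answer.

```python
def mode_per_tag(rows):
    """Return {tag: (category, count)} with the most frequent category per tag.

    rows: iterable of (tag, category, count) tuples, one per (tag, category)
    pair. On equal counts the larger category value wins (an assumption
    chosen to match the SQL tie-break).
    """
    best = {}
    for tag, category, count in rows:
        current = best.get(tag)
        # Compare (count, category): count decides first, category breaks ties.
        if current is None or (count, category) > (current[1], current[0]):
            best[tag] = (category, count)
    return best


# Hypothetical fetch result, tallied by hand from the question's sample data.
rows = [
    ("automotive", 8, 1),
    ("ba", 8, 1),
    ("bamboo", 8, 9),
    ("bamboo", 9, 2),
    ("bamboo", 10, 2),
    ("banana tree", 8, 4),
    ("bath", 9, 1),
]

for tag, (category, count) in sorted(mode_per_tag(rows).items()):
    print(tag, category, count)
```

A single pass with a dictionary avoids sorting the whole result set; if a different tie-break is wanted, only the comparison needs to change.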
Posted on 2009-09-11 15:59:19
SELECT tag, category
FROM (
SELECT @tag <> tag AS _new,
@tag := tag AS tag,
category, COUNT(*) AS cnt
FROM (
SELECT @tag := ''
) vars,
stuff
GROUP BY
tag, category
ORDER BY
tag, cnt DESC
) q
WHERE _new
For your data, this returns the following:
'automotive', 8
'ba', 8
'bamboo', 8
'bananatree', 8
'bath', 9
Here is the test script:
CREATE TABLE stuff (tag VARCHAR(20) NOT NULL, category INT NOT NULL);
INSERT
INTO stuff
VALUES
('automotive',8),
('ba',8),
('bamboo',8),
('bamboo',8),
('bamboo',8),
('bamboo',8),
('bamboo',8),
('bamboo',10),
('bamboo',8),
('bamboo',9),
('bamboo',8),
('bamboo',10),
('bamboo',8),
('bamboo',9),
('bamboo',8),
('bananatree',8),
('bananatree',8),
('bananatree',8),
('bananatree',8),
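('bath',9);
Posted on 2009-09-12 14:30:48
(Edit: forgot the DESC in the ORDER BYs)
This is easy to do with a LIMIT in a subquery. Does MySQL still have the restriction against LIMIT in subqueries? The example below uses PostgreSQL.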
=> select tag,
     (select category from stuff z
       where z.tag = s.tag
       group by tag, category
       order by count(*) DESC limit 1) AS category,
     (select count(*) from stuff z
       where z.tag = s.tag
       group by tag, category
       order by count(*) DESC limit 1) AS num_items
   from stuff s
   group by tag;
    tag     | category | num_items
------------+----------+-----------
 ba         |        8 |         1
 automotive |        8 |         1
 bananatree |        8 |         4
 bath       |        9 |         1
 bamboo     |        8 |         9
(5 rows)
The third column is only needed if you want the count.
https://stackoverflow.com/questions/1407723