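I'm looking for duplicates so that they can be removed.
I have a table named categories with the columns uid, qid, and value.
uid is the table's unique ID.
qid is a question ID.
value is a tag for that qid.
So each qid can have many rows, but the values within each qid should be unique.
For example: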
mysql> SELECT * FROM categories WHERE qid=6869;
+-------+------+-----------+
| uid | qid | value |
+-------+------+-----------+
| 19838 | 6869 | Sport |
| 19839 | 6869 | Football |
| 19840 | 6869 | Sport |
| 19841 | 6869 | Athletics |
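+-------+------+-----------+
As you can see, it has Sport twice. We have more than 8,000 qids, each with 3-8 tags. I really don't want to check every qid by hand.
So at the very least I'd like a list of the qids that have this problem, and at best, delete all the duplicates.
What I've tried:
SELECT count(value) AS cnt FROM categories GROUP BY value HAVING cnt>1;
This gave me a table with a lot of numbers, but I couldn't print any more columns, because I got this error: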
mysql> SELECT *, count(value) AS cnt FROM categories GROUP BY value HAVING cnt>1;
ERROR 1055 (42000): Expression #1 of SELECT list is not in GROUP BY clause and contains nonaggregated column 'quizmastershop.categories.uid' which is not functionally dependent on columns in GROUP BY clause; this is incompatible with sql_mode=only_full_group_by
Besides, this isn't what I need anyway, because it only gives me one count per value.
Any ideas?
Cheers
Edit: version data
mysql> SELECT VERSION();
+-------------------------+
| VERSION() |
+-------------------------+
| 5.7.21-0ubuntu0.16.04.1 |
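+-------------------------+
Edit 2: I removed ONLY_FULL_GROUP_BY from the sql_mode string. The query that produced the error above still doesn't give me any useful information.
Edit 3: I tried the suggested code, and this is exactly the output I need :-)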
+-------+------+-------------------+
| uid | qid | value |
+-------+------+-------------------+
| 470 | 170 | Children's |
| 472 | 170 | Children's |
| 570 | 204 | Geography |
| 572 | 204 | Geography |
| 575 | 205 | Geography |
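| 577 | 205 | Geography |

Answered on 2018-07-19 17:09:17
Try the code below. Basically, the inner query picks out the (qid, value) pairs that have more than one entry. The outer query joins that result back to the categories table to get the uid of each row.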
SELECT DISTINCT c.uid, c.qid, c.value
FROM categories c
JOIN (
SELECT qid, value, COUNT(*)
FROM categories
GROUP BY qid, value
HAVING COUNT(*) > 1
) a ON a.qid = c.qid AND a.value = c.value

Answered on 2018-07-19 16:37:37
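The join query above can be checked against the sample rows from the question, and the same grouping can drive the cleanup the question asks for. This is a minimal sketch under stated assumptions: it uses Python's built-in sqlite3 with an in-memory database as a stand-in for the MySQL server, and the delete step (keep the lowest uid per qid/value pair) is a hypothetical policy, not something the original posts specify. It has not been run against the MySQL 5.7.21 server from the question.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories (uid INTEGER PRIMARY KEY, qid INTEGER, value TEXT)"
)
# Sample rows copied from the question (qid 6869).
conn.executemany(
    "INSERT INTO categories (uid, qid, value) VALUES (?, ?, ?)",
    [
        (19838, 6869, "Sport"),
        (19839, 6869, "Football"),
        (19840, 6869, "Sport"),
        (19841, 6869, "Athletics"),
    ],
)

# The join query from the answer: every row whose (qid, value) pair
# occurs more than once. ORDER BY is added only for stable output.
FIND_DUPLICATES = """
SELECT DISTINCT c.uid, c.qid, c.value
FROM categories c
JOIN (
    SELECT qid, value, COUNT(*) AS cnt
    FROM categories
    GROUP BY qid, value
    HAVING COUNT(*) > 1
) a ON a.qid = c.qid AND a.value = c.value
ORDER BY c.uid
"""
duplicates = conn.execute(FIND_DUPLICATES).fetchall()

# Hypothetical cleanup policy: keep the lowest uid of each (qid, value)
# pair and delete the rest. The extra derived table (keepers) is there
# because MySQL normally refuses a subquery that selects directly from
# the table being deleted from.
DELETE_DUPLICATES = """
DELETE FROM categories
WHERE uid NOT IN (
    SELECT keep_uid FROM (
        SELECT MIN(uid) AS keep_uid
        FROM categories
        GROUP BY qid, value
    ) AS keepers
)
"""
deleted = conn.execute(DELETE_DUPLICATES).rowcount
conn.commit()

remaining = conn.execute(
    "SELECT uid, qid, value FROM categories ORDER BY uid"
).fetchall()

print(duplicates)
print(deleted)
print(remaining)
```

On real data it is probably safest to run the SELECT first, review the rows it lists, and take a backup of the table before running any DELETE.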
SELECT * FROM categories WHERE value IN (SELECT value FROM categories GROUP BY value HAVING count(value)>1)

https://stackoverflow.com/questions/51427690
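Since the question treats a tag as a duplicate only when it repeats within the same qid, it may help to compare grouping by value alone with grouping by (qid, value). The sketch below is illustrative only: the four rows are made up, and SQLite (through Python's sqlite3) is assumed as a stand-in for MySQL.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the MySQL database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories (uid INTEGER PRIMARY KEY, qid INTEGER, value TEXT)"
)
# Hypothetical rows, not taken from the question.
conn.executemany(
    "INSERT INTO categories (uid, qid, value) VALUES (?, ?, ?)",
    [
        (1, 100, "Sport"),    # duplicated within qid 100
        (2, 100, "Sport"),
        (3, 200, "Sport"),    # same tag on another qid: not a duplicate
        (4, 200, "History"),
    ],
)

# Grouping by value only: flags every row whose tag appears more than
# once anywhere in the table, regardless of qid.
by_value = conn.execute(
    """
    SELECT uid FROM categories
    WHERE value IN (
        SELECT value FROM categories GROUP BY value HAVING COUNT(value) > 1
    )
    ORDER BY uid
    """
).fetchall()

# Grouping by (qid, value): flags only rows whose tag repeats within
# the same qid.
by_pair = conn.execute(
    """
    SELECT c.uid FROM categories c
    JOIN (
        SELECT qid, value FROM categories
        GROUP BY qid, value
        HAVING COUNT(*) > 1
    ) d ON d.qid = c.qid AND d.value = c.value
    ORDER BY c.uid
    """
).fetchall()

print(by_value)
print(by_pair)
```

With these rows the value-only grouping also returns uid 3, whose tag is unique within its own qid, while the (qid, value) grouping returns only uids 1 and 2.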