This is the more direct way:

<pre><code>select afield1,count(afield1) from atable 
group by afield1 having count(afield1) &gt; 1
</code></pre>
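To make the behavior concrete, here is a minimal sketch that runs the same grouping query against an in-memory SQLite database. The sample rows are made up for illustration, and SQLite stands in for whatever engine you actually use, so timings will not carry over.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (afield1 TEXT, afield2 TEXT)")
conn.executemany(
    "INSERT INTO atable VALUES (?, ?)",
    [("a", "x"), ("a", "y"), ("b", "z"), ("c", "p"), ("c", "q"), ("c", "r")],
)

# Group by the key and keep only the groups that contain more than one row.
dupes = conn.execute(
    """
    SELECT afield1, COUNT(afield1) AS n
    FROM atable
    GROUP BY afield1
    HAVING COUNT(afield1) > 1
    ORDER BY afield1
    """
).fetchall()

print(dupes)  # [('a', 2), ('c', 3)]
```

Note that this returns one row per duplicated key with its count, not the duplicate rows themselves.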

You could try:

<pre><code>select afield1, afield2 from afile a
where afield1 in
( select afield1
 from afile
 group by afield1
 having count(*) &gt; 1
);
</code></pre>
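A small sketch of the query above, again using an in-memory SQLite database with invented sample rows. The index name `idx_afile_afield1` is an assumption added for illustration; whether an index helps on your engine and data is something to measure rather than take for granted.

```python
import sqlite3

# Hypothetical sample data; table and column names follow the answer above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE afile (afield1 TEXT, afield2 TEXT)")
conn.executemany(
    "INSERT INTO afile VALUES (?, ?)",
    [("a", "x"), ("a", "y"), ("b", "z")],
)

# Assumption: an index on the key column, so grouping and lookup can use it.
conn.execute("CREATE INDEX idx_afile_afield1 ON afile (afield1)")

# The subquery is uncorrelated: it is computed once, not once per outer row.
rows = conn.execute(
    """
    SELECT afield1, afield2
    FROM afile
    WHERE afield1 IN (
        SELECT afield1
        FROM afile
        GROUP BY afield1
        HAVING COUNT(*) > 1
    )
    ORDER BY afield1, afield2
    """
).fetchall()

print(rows)  # [('a', 'x'), ('a', 'y')]
```

Unlike the correlated subquery in the question, the inner query here does not reference the outer row, which is the main reason this form tends to be cheaper.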

A similar question was asked last week. There are some good answers there.

<a href="https://stackoverflow.com/questions/182544/sql-to-find-duplicate-entries-within-a-group">SQL to find duplicate entries (within a group)</a>

In that question, the OP was interested in all the columns (fields) in the table (file),
but rows belonged in the same group if they had the same key value (afield1).

There are three kinds of answers:

Subqueries in the where clause, like some of the other answers here.

An inner join between the table and the groups viewed as a table (my answer).

Analytic queries (something that's new to me).
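The second and third kinds can be sketched as follows. This is an illustration written for this thread's `afile` table, not a copy of the answers in the linked question; the aliases `g`, `t`, and `n` are made up, and the window function requires SQLite 3.25 or later.

```python
import sqlite3

# Hypothetical sample data for the afile table used in this thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE afile (afield1 TEXT, afield2 TEXT)")
conn.executemany(
    "INSERT INTO afile VALUES (?, ?)",
    [("a", "x"), ("a", "y"), ("b", "z")],
)

# Inner join between the table and the duplicated groups viewed as a table.
joined = conn.execute(
    """
    SELECT a.afield1, a.afield2
    FROM afile a
    JOIN (
        SELECT afield1
        FROM afile
        GROUP BY afield1
        HAVING COUNT(*) > 1
    ) g ON g.afield1 = a.afield1
    ORDER BY a.afield1, a.afield2
    """
).fetchall()

# Analytic (window) query: attach the group size to every row, then filter.
analytic = conn.execute(
    """
    SELECT afield1, afield2
    FROM (
        SELECT afield1, afield2,
               COUNT(*) OVER (PARTITION BY afield1) AS n
        FROM afile
    ) t
    WHERE n > 1
    ORDER BY afield1, afield2
    """
).fetchall()

print(joined)    # [('a', 'x'), ('a', 'y')]
print(analytic)  # [('a', 'x'), ('a', 'y')]
```

Both forms return every row whose key occurs more than once, so the choice between them is mostly about what your engine optimizes well.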

By the way, if anyone wants to remove the duplicates, I have used this:

<pre><code>delete from MyTable where MyTableID in (
 select max(MyTableID)
 from MyTable
 group by Thing1, Thing2, Thing3
 having count(*) &gt; 1
)
</code></pre>
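One thing to be aware of: the statement above removes only the highest-ID row of each duplicated group, so a group with three or more copies needs several runs. The sketch below shows this on an in-memory SQLite database with invented rows; the loop is an illustration, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE MyTable ("
    "MyTableID INTEGER PRIMARY KEY, Thing1 TEXT, Thing2 TEXT, Thing3 TEXT)"
)
# Hypothetical data: three copies of one row (IDs 1-3) and one unique row (ID 4).
conn.executemany(
    "INSERT INTO MyTable (Thing1, Thing2, Thing3) VALUES (?, ?, ?)",
    [("a", "b", "c"), ("a", "b", "c"), ("a", "b", "c"), ("x", "y", "z")],
)

delete_sql = """
    DELETE FROM MyTable WHERE MyTableID IN (
        SELECT MAX(MyTableID)
        FROM MyTable
        GROUP BY Thing1, Thing2, Thing3
        HAVING COUNT(*) > 1
    )
"""

# Each pass deletes one row per duplicated group; repeat until nothing is left to delete.
passes = 0
while conn.execute(delete_sql).rowcount > 0:
    passes += 1

remaining = conn.execute(
    "SELECT MyTableID, Thing1 FROM MyTable ORDER BY MyTableID"
).fetchall()

print(passes)     # 2
print(remaining)  # [(1, 'a'), (4, 'x')]
```

Some engines restrict deleting from a table that the subquery also reads, so the statement may need rewriting there.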

This should be reasonably fast (even faster if the dupeFields are indexed).

<pre><code>SELECT DISTINCT a.id, a.dupeField1, a.dupeField2
FROM TableX a
JOIN TableX b
ON a.dupeField1 = b.dupeField1
AND a.dupeField2 = b.dupeField2
AND a.id != b.id
</code></pre>

I guess the only downside to this query is that because you're not doing a <code>COUNT(*)</code> you can't check for the number of times it is duplicated, only that it appears more than once.
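As a rough sketch of the self-join approach, the example below compares each field with its counterpart (`dupeField1` with `dupeField1`, `dupeField2` with `dupeField2`) on an in-memory SQLite database. The sample rows and the index name `idx_tablex_dupes` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableX ("
    "id INTEGER PRIMARY KEY, dupeField1 TEXT, dupeField2 TEXT)"
)
# Hypothetical data: IDs 1, 2 and 5 share the same pair of values.
conn.executemany(
    "INSERT INTO TableX (dupeField1, dupeField2) VALUES (?, ?)",
    [("a", "1"), ("a", "1"), ("a", "2"), ("b", "2"), ("a", "1")],
)

# Assumption: a composite index on the compared fields to support the join.
conn.execute("CREATE INDEX idx_tablex_dupes ON TableX (dupeField1, dupeField2)")

# A row is a duplicate if another row with a different id has the same field values.
# DISTINCT is needed because a row with several twins matches once per twin.
rows = conn.execute(
    """
    SELECT DISTINCT a.id, a.dupeField1, a.dupeField2
    FROM TableX a
    JOIN TableX b
      ON a.dupeField1 = b.dupeField1
     AND a.dupeField2 = b.dupeField2
     AND a.id != b.id
    ORDER BY a.id
    """
).fetchall()

print(rows)  # [(1, 'a', '1'), (2, 'a', '1'), (5, 'a', '1')]
```

The result lists every row that has at least one twin, but, as noted above, it does not say how many twins each row has.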

What is an example of fast SQL for finding duplicates in datasets with hundreds of thousands of records? I typically use something like:

<pre><code>SELECT afield1, afield2 FROM afile a 
WHERE 1 &lt; (SELECT count(afield1) FROM afile b WHERE a.afield1 = b.afield1);
</code></pre>

But this is quite slow.

Fastest "Get Duplicates" SQL script
