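I have a SQL Server database with quite a lot of duplicate rows in it. Removing them by hand is no fun, so I'm wondering whether there is some kind of SQL programming or scripting that can automate it.
Below is my query, which returns the ID and Code of the duplicated rows.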
select a.ID, a.Code
from Table1 a
inner join (
SELECT Code
FROM Table1 GROUP BY Code HAVING COUNT(Code)>1)
x on x.Code = a.Code

I get results like the following, for example:
5163 51727
5164 51727
5165 51727
5166 51728
5167 51728
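5168 51728

This output shows three rows per Code (i.e. one main "good" record and two duplicates). That isn't always the case, though: there can be up to n duplicates, although 2-3 seems to be the norm.
I just want to somehow loop through this result set and delete all but one record for each Code. Which records get deleted is arbitrary, since any of them can be the one that is kept.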

Answered 2015-11-21 01:16:00
You can use ROW_NUMBER to drive your delete, i.e.
CREATE TABLE #table1
(id INT,
code int
);
WITH cte AS
(select a.ID, a.Code, ROW_NUMBER() OVER(PARTITION BY a.Code ORDER BY a.ID) AS rn
from #table1 a
)
DELETE x
FROM #table1 x
JOIN cte ON x.id = cte.id
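WHERE cte.rn > 1

But... if you are doing a large number of deletes from a very large table, you may be better off selecting the rows you want to keep into a temp table, then truncating your table and re-inserting the rows you need. That keeps the transaction log from getting hammered and your clustered index (CI) from getting fragmented, and it should be quicker too!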
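
The truncate-and-reload idea could look roughly like the sketch below. It rests on assumptions that are not stated in the thread: Table1 has only the ID and Code columns, ID is not an IDENTITY column, and no foreign keys reference Table1 (TRUNCATE TABLE fails if they do). The temp table name #keep is made up for illustration.

```sql
-- Sketch only. Assumes Table1(ID, Code) has no other columns, ID is not an
-- IDENTITY column, and no foreign keys reference Table1.
SET XACT_ABORT ON;  -- roll the whole transaction back if any statement fails

BEGIN TRANSACTION;

-- 1. Copy one row per Code (the one with the lowest ID) into a temp table.
--    #keep is a hypothetical name.
SELECT ID, Code
INTO #keep
FROM (
    SELECT ID, Code,
           ROW_NUMBER() OVER (PARTITION BY Code ORDER BY ID) AS rn
    FROM Table1
) AS ranked
WHERE rn = 1;

-- 2. Empty the original table.
TRUNCATE TABLE Table1;

-- 3. Put the surviving rows back.
INSERT INTO Table1 (ID, Code)
SELECT ID, Code
FROM #keep;

COMMIT TRANSACTION;

DROP TABLE #keep;
```

If the table has more columns, they would need to be listed in both the SELECT ... INTO and the INSERT. With an IDENTITY column you would also need SET IDENTITY_INSERT Table1 ON around the reload to preserve the original ID values.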

Answered 2015-12-09 11:36:08
It's actually very simple:
DELETE FROM Table1
WHERE ID NOT IN
(SELECT MAX(ID)
FROM Table1
GROUP BY CODE)

Answered 2015-11-21 01:46:52
A self-join solution, performance-tested against the CTE approach.
create table codes(
id int IDENTITY(1,1) NOT NULL,
code int null,
CONSTRAINT [PK_codes_id] PRIMARY KEY CLUSTERED
(
id ASC
))
declare @counter int, @code int
set @counter = 1
set @code = 1
while (@counter <= 1000000)
begin
print ABS(Checksum(NewID()) % 1000)
insert into codes(code) select ABS(Checksum(NewID()) % 1000)
set @counter = @counter + 1
end
GO
set statistics time on;
delete a
from codes a left join(
select MIN(id) as id from codes
group by code) b
on a.id = b.id
where b.id is null
set statistics time off;
--set statistics time on;
-- WITH cte AS
-- (select a.id, a.code, ROW_NUMBER() OVER(PARTITION by code ORDER BY id) AS rn
-- from codes a
-- )
-- delete x
-- FROM codes x
-- JOIN cte ON x.id = cte.id
-- WHERE cte.rn > 1
--set statistics time off;

Performance test results:
With join:
SQL Server Execution Times:
CPU time = 3198 ms, elapsed time = 3200 ms.
(999000 row(s) affected)
With CTE:
SQL Server Execution Times:
CPU time = 4197 ms, elapsed time = 4229 ms.
(999000 row(s) affected)

https://stackoverflow.com/questions/33832244