首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >Server中的MERGing重复记录

Server中的MERGing重复记录
EN

Stack Overflow用户
提问于 2017-09-28 13:45:33
回答 2查看 13.7K关注 0票数 5

SQL Fiddle

我有下表

代码语言:javascript
复制
CREATE TABLE __EpiTest
(
    `ActivityRecordID` int, 
    `ActCstID` varchar(6), 
    `ResCstID` varchar(6), 
    `VolAmt` int, 
    `ActCnt` int, 
    `TotOCst` int, 
    `TotCst` int
);

INSERT INTO __EpiTest (`ActivityRecordID`, `ActCstID`, `ResCstID`, `VolAmt`, `ActCnt`, `TotOCst`, `TotCst`)
VALUES (15652, 'DIM008', 'CPF005', 30.455249786377, 1, 0, 0.375024198767061),
       (15652, 'DIM008', 'CSC004', 30.455249786377, 1, 7.62176510799961, 11.932578069479),
       (15652, 'DIM008', 'REC001', 30.455249786377, 1, 0.17902367836393, 0.384881520159455),
       (15652, 'OUT001', 'CPF002', 15, 0, 0, 16.9408193013078),
       (15652, 'OUT001', 'CSC001', 15, 0, 2.36971564207042,  2.36971564207042),
       (15652, 'OUT001', 'CSC004', 15, 0, 12.3230666021278, 12.3760690367354),
       (15652, 'OUT001', 'REC001', 15, 0, 0.377459387378349, 3.0275278374102),
       (15652, 'SUP001', 'CPF002', 1, 1, 0, 0.00108648359810756),
       (15652, 'SUP001', 'CPF011', 1, 1, 0, -1.89799880202357E-14),
       (15652, 'SUP001', 'CPF020', 1, 1, 0, 1.31058251625567E-05),
       (15652, 'SUP001', 'CPF021', 1, 1, 0, 25.0942308512551),
       (15652, 'SUP001', 'CPF021', 1, 1, 0, 25.0942308512551),
       (15652, 'SUP001', 'CSC001', 1, 1, 1.9628769103451, 1.9628769103451),
       (15652, 'SUP001', 'CSC001', 1, 1, 1.9628769103451, 1.9628769103451),
       (15652, 'SUP001', 'CSC002', 1, 1, 0, 10.2266625467779),
       (15652, 'SUP001', 'CSC004', 1, 1, 16.3451721608005, 16.3513319060046),
       (15652, 'SUP001', 'CSC004', 1, 1, 16.3451721608005, 16.3513319060046),
       (15652, 'SUP001', 'REC001', 1, 1, 0.254410386701976, -6.27048795659376),
       (15652, 'SUP001', 'REC001', 1, 1, 0.254410386701976, -6.27048795659376),
       (15652, 'SUP001', 'REC002', 1, 1, 0, 1.10781732547441);

注意,有些行具有[ActivityRecordID][ActCstID][ResCstID]的匹配值。我希望合并这些值,并将[TotOCst][TotCst]中的值相加。要做到这一点,我尝试使用MERGE

代码语言:javascript
复制
MERGE [__EpiTest] AS Tgt 
USING (
    SELECT [ActivityRecordID], 
           [ActCstID], 
           [ResCstID], 
           SUM([TotOCst]) AS TotOCst, 
           SUM([TotCst]) AS TotCst 
    FROM [__EpiTest]  
    GROUP BY [ActivityRecordID], 
             [ActCstID], 
             [ResCstID]) AS Src 
ON (Tgt.[ActivityRecordID] = Src.[ActivityRecordID] AND 
     Tgt.[ActCstID] = Src.[ActCstID] AND 
     Tgt.[ResCstID] = Src.[ResCstID])  
WHEN MATCHED THEN 
    UPDATE 
    SET [TotOCst] = Src.[TotOCst], 
        [TotCst] = Src.[TotCst] 
WHEN NOT MATCHED BY SOURCE THEN 
    DELETE;
GO

这将匹配并正确更新每个重复的[TotOCst][TotCst]中的值,但是它会将重复的行保留在表中,而我希望删除除一个之外的所有数据行。我怎样才能做到这一点?

注意,目标表很大,所以我想通过使用上面的MERGE查询的变体或其他选择来尝试使用单个操作来实现这一点。多个查询对我来说太昂贵了.

Illustration

我得到了

代码语言:javascript
复制
...
15652   SUP001  CPF021  1   1   0                   12.5471154256275
15652   SUP001  CPF021  1   1   0                   12.5471154256275
15652   SUP001  CSC001  1   1   0.98143845517255    0.98143845517255
15652   SUP001  CSC001  1   1   0.98143845517255    0.98143845517255
15652   SUP001  CSC002  1   1   0                   10.2266625467779
15652   SUP001  CSC004  1   1   8.17258608040024    8.17566595300228
15652   SUP001  CSC004  1   1   8.17258608040024    8.17566595300228
15652   SUP001  REC001  1   1   0.127205193350988   -3.13524397829688
15652   SUP001  REC001  1   1   0.127205193350988   -3.13524397829688
...

但我想

代码语言:javascript
复制
...
15652   SUP001  CPF021  1   1   0                   12.5471154256275
15652   SUP001  CSC001  1   1   0.98143845517255    0.98143845517255 
15652   SUP001  CSC002  1   1   0                   10.2266625467779
15652   SUP001  CSC004  1   1   8.17258608040024    8.17566595300228
15652   SUP001  REC001  1   1   0.127205193350988   -3.13524397829688
...
EN

回答 2

Stack Overflow用户

回答已采纳

发布于 2017-09-28 14:09:48

为了使它在MERGE语句中工作,您需要对行更加具体一些。我调整了下面的查询:

代码语言:javascript
复制
MERGE [__EpiTest] AS Tgt 
USING (
    SELECT [ActivityRecordID], 
           [ActCstID], 
           [ResCstID], 
           SUM([TotOCst]) AS TotOCst, 
           SUM([TotCst]) AS TotCst ,
           VolAmt,
           ActCnt
    FROM [__EpiTest]  
    GROUP BY [ActivityRecordID], 
             [ActCstID], 
             [ResCstID], 
             VolAmt,
             ActCnt) AS Src 
ON (Tgt.[ActivityRecordID] = Src.[ActivityRecordID] AND 
     Tgt.[ActCstID] = Src.[ActCstID] AND 
     Tgt.[ResCstID] = Src.[ResCstID] AND
     Tgt.TotOCst = Src.TotOCst AND
     Tgt.TotCst = Src.TotCst
     ) 
WHEN NOT MATCHED BY TARGET THEN
    INSERT ( ActivityRecordID, ActCstID, ResCstID, TotOCst, TotCst, VolAmt, ActCnt )
    VALUES ( ActivityRecordID, ActCstID, ResCstID, TotOCst, TotCst, VolAmt, ActCnt )
WHEN NOT MATCHED BY SOURCE THEN 
    DELETE;
GO

基本上,我已经对其进行了更改,以便合并的源包含其他看似常量的列,以便以后可以插入这些列。我已经将MATCH条件更改为“该行完全相同,并且将保持不变”,因此删除了WHEN语句。

然后,我添加了一个INSERT语句,以便对于已更改的行(将由非匹配的目标删除它们的行),使用这些值插入一个新的行。

在相同的数据中,这将返回以下结果集:

代码语言:javascript
复制
ActivityRecordID    ActCstID    ResCstID    VolAmt  ActCnt  TotOCst TotCst
15652   DIM008  CPF005  30.455249786377 1   0   0.375024198767061
15652   DIM008  CSC004  30.455249786377 1   7.62176510799961    11.932578069479
15652   DIM008  REC001  30.455249786377 1   0.17902367836393    0.384881520159455
15652   OUT001  CPF002  15  0   0   16.9408193013078
15652   OUT001  CSC001  15  0   2.36971564207042    2.36971564207042
15652   OUT001  CSC004  15  0   12.3230666021278    12.3760690367354
15652   OUT001  REC001  15  0   0.377459387378349   3.0275278374102
15652   SUP001  CPF002  1   1   0   0.00108648359810756
15652   SUP001  CPF011  1   1   0   -1.89799880202357E-14
15652   SUP001  CPF020  1   1   0   1.31058251625567E-05
15652   SUP001  CSC002  1   1   0   10.2266625467779
15652   SUP001  REC002  1   1   0   1.10781732547441
15652   SUP001  CPF021  1   1   0   50.1884617025102
15652   SUP001  CSC001  1   1   3.9257538206902 3.9257538206902
15652   SUP001  CSC004  1   1   32.690344321601 32.7026638120092
15652   SUP001  REC001  1   1   0.508820773403952   -12.5409759131875
票数 5
EN

Stack Overflow用户

发布于 2017-09-28 14:03:20

在__EpiTest中有重复的值

代码语言:javascript
复制
SELECT * FROM @__EpiTest 
where ResCstID = 'REC001' and ActCstID ='SUP001'
order by ActivityRecordID ,  ActCstID , ResCstID;
ActivityRecordID ActCstID ResCstID VolAmt      ActCnt      TotOCst                                 TotCst
---------------- -------- -------- ----------- ----------- --------------------------------------- ---------------------------------------
15652            SUP001   REC001   1           1           0.2544103867                            -6.2704879566
15652            SUP001   REC001   1           1           0.2544103867                            -6.2704879566

要简单地选择单个值

代码语言:javascript
复制
  SELECT DISTINCT * FROM @__EpiTest 
    where ResCstID = 'REC001' and ActCstID ='SUP001'
    order by ActivityRecordID ,  ActCstID , ResCstID;
    ActivityRecordID ActCstID ResCstID VolAmt      ActCnt      TotOCst                                 TotCst
    ---------------- -------- -------- ----------- ----------- --------------------------------------- ---------------------------------------
    15652            SUP001   REC001   1           1           0.2544103867                            -6.2704879566

删除

删除重复值并维护第一个

代码语言:javascript
复制
;WITH cte as (
SELECT Row_number() OVER (PARTITION BY ActivityRecordID ,  ActCstID , ResCstID ORDER BY (SELECT NULL)) Rn, * FROM @__EpiTest 
where ResCstID = 'REC001' and ActCstID ='SUP001'
)
Delete from cte where Rn > 1 

SELECT * FROM @__EpiTest 
where ResCstID = 'REC001' and ActCstID ='SUP001' 
ActivityRecordID ActCstID ResCstID VolAmt      ActCnt      TotOCst                                 TotCst
---------------- -------- -------- ----------- ----------- --------------------------------------- ---------------------------------------
15652            SUP001   REC001   1           1           0.2544103867                            -6.2704879566

使用合并时要小心,合并状态有严重的性能问题。

代码语言:javascript
复制
  ;WITH myResult as (
    SELECT Row_number() OVER (PARTITION BY ActivityRecordID ,  ActCstID , ResCstID ORDER BY (SELECT NULL)) Rn, * FROM @__EpiTest 
    )
    MERGE myResult AS Tgt 
    USING myResult AS Src 
    ON (Tgt.[ActivityRecordID] = Src.[ActivityRecordID] AND 
         Tgt.[ActCstID] = Src.[ActCstID] AND 
         Tgt.[ResCstID] = Src.[ResCstID] AND 
         Tgt.Rn = Src.Rn AND
         Src.Rn = 1)  
    WHEN MATCHED THEN 
        UPDATE 
        SET [TotOCst] = Src.[TotOCst], 
            [TotCst] = Src.[TotCst] 
    WHEN NOT MATCHED BY SOURCE THEN 
        DELETE;
     SELECT * FROM @__EpiTest 
    where ResCstID = 'REC001' and ActCstID ='SUP001' 
Rn                   ActivityRecordID ActCstID ResCstID VolAmt      ActCnt      TotOCst                                 TotCst
-------------------- ---------------- -------- -------- ----------- ----------- --------------------------------------- ---------------------------------------
1                    15652            SUP001   REC001   1           1           0.2544103867                            -6.2704879566
2                    15652            SUP001   REC001   1           1           0.2544103867                            -6.2704879566

(2 row(s) affected)

(20 row(s) affected)

ActivityRecordID ActCstID ResCstID VolAmt      ActCnt      TotOCst                                 TotCst
---------------- -------- -------- ----------- ----------- --------------------------------------- ---------------------------------------
15652            SUP001   REC001   1           1           0.2544103867                            -6.2704879566
票数 2
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/46470773

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档