首页
学习
活动
专区
圈层
工具
发布

sql微调
EN

Stack Overflow用户
提问于 2017-12-07 10:36:12
回答 1查看 93关注 0票数 1

在表中,我需要根据列(在本例中为列1.WORK_ORDER_NUM,2.ESN3.PLANT4..REMD_PART_NUM 5. REMD_PART_SERIAL)将2条记录组合成一条记录。如果这5列相等,那么我必须在其他列中使用聚合函数使其成为一个record.In --在这种情况下,列是LLP_TRACKD_PART_IND、REMD_PART_TSN和REMD_PART_CSN。

这就是我尝试过的:

代码语言:javascript
复制
SELECT decode (PLANT ,'ECL','CELMA','EDS','CELMA',PLANT)PLANT,
 COUNT(*) RECORD_COUNT,
 COUNT(DISTINCT OFF.REMD_PART_NUM) REMD_PART_NUM_COUNT,
 COUNT(DISTINCT OFF.REMD_PART_SERIAL) REMD_PART_SER_NUM_COUNT,
  COUNT(DECODE(LLP_TRACKD_PART_IND,'LL',LLP_TRACKD_PART_IND,NULL)) LL_COUNT,
 COUNT(DECODE(LLP_TRACKD_PART_IND,'LR',LLP_TRACKD_PART_IND,NULL)) LR_COUNT,
 COUNT(DECODE(LLP_TRACKD_PART_IND,'TR',LLP_TRACKD_PART_IND,NULL)) TR_COUNT,
 SUM(OFF.REMD_PART_QTY) TOTAL_REMD_PART_QTY, 
  SUM(decode(LLP_TRACKD_PART_IND,null,0,
     CASE
       WHEN REGEXP_LIKE(REMD_PART_TSN, '^-?\d+(\.\d+)?$')
            THEN CAST(REMD_PART_TSN AS NUMBER)
        ELSE 0
    END
  )) TOTAL_TSN,
 SUM(decode(LLP_TRACKD_PART_IND,null,0,
    CASE
        WHEN REGEXP_LIKE(REMD_PART_CSN, '^-?\d+(\.\d+)?$')
            THEN CAST(REMD_PART_CSN AS NUMBER)
        ELSE 0
    END
 )) TOTAL_CSN
FROM (with t as ( SELECT distinct PLANT,
WORK_ORDER_NUM,ESN,REMD_PART_NUM,REMD_PART_SERIAL,REMD_PART_IIN,
LLP_TRACKD_PART_IND,
REMD_PART_QTY,REMD_PART_TSN,REMD_PART_CSN,REMD_PART_TSO,REMD_PART_CSO
,REMD_PART_TSC,REMD_PART_CSC,REMD_CYCLE_REMAIN          
FROM <TABLE1> 
WHERE 
REMD_PART_NUM is not null 
)
select DISTINCT PLANT,WORK_ORDER_NUM,ESN,REMD_PART_NUM,REMD_PART_SERIAL 
,REMD_PART_IIN
 ,(select max(LLP_TRACKD_PART_IND) from t bb  where aa.PLANT=bb.PLANT and 
aa.WORK_ORDER_NUM=bb.WORK_ORDER_NUM 
 and aa.ESN=bb.ESN 
  and aa.REMD_PART_NUM=bb.REMD_PART_NUM
  and aa.REMD_PART_SERIAL=bb.REMD_PART_SERIAL) LLP_TRACKD_PART_IND
 ,REMD_PART_QTY
 ,(select max(REMD_PART_TSN) from t bb  where aa.PLANT=bb.PLANT and 
aa.WORK_ORDER_NUM=bb.WORK_ORDER_NUM 
 and aa.ESN=bb.ESN 
 and aa.REMD_PART_NUM=bb.REMD_PART_NUM
 and aa.REMD_PART_SERIAL=bb.REMD_PART_SERIAL) REMD_PART_TSN
 ,
(select max(REMD_PART_CSN) from t bb  where aa.PLANT=bb.PLANT and             
aa.WORK_ORDER_NUM=bb.WORK_ORDER_NUM 
  and aa.ESN=bb.ESN 
  and aa.REMD_PART_NUM=bb.REMD_PART_NUM
  and aa.REMD_PART_SERIAL=bb.REMD_PART_SERIAL) REMD_PART_CSN
  from t aa) OFF
WHERE 
REMD_PART_NUM is not null
GROUP BY decode (PLANT ,'ECL','CELMA','EDS','CELMA',PLANT) 

和它大约需要8个小时才能完成。还有其他方法可以更快地完成它吗? 它花了8个小时才完成

EN

回答 1

Stack Overflow用户

回答已采纳

发布于 2017-12-08 10:05:50

从更好的查询格式开始,它将允许您轻松地理解代码并注意重复的模式:

代码语言:javascript
复制
SELECT decode (PLANT ,'ECL','CELMA','EDS','CELMA',PLANT)PLANT,
    COUNT(*) RECORD_COUNT,
    COUNT(DISTINCT OFF.REMD_PART_NUM) REMD_PART_NUM_COUNT,
    COUNT(DISTINCT OFF.REMD_PART_SERIAL) REMD_PART_SER_NUM_COUNT,
    COUNT(DECODE(LLP_TRACKD_PART_IND,'LL',LLP_TRACKD_PART_IND,NULL)) LL_COUNT,
    COUNT(DECODE(LLP_TRACKD_PART_IND,'LR',LLP_TRACKD_PART_IND,NULL)) LR_COUNT,
    COUNT(DECODE(LLP_TRACKD_PART_IND,'TR',LLP_TRACKD_PART_IND,NULL)) TR_COUNT,
    SUM(OFF.REMD_PART_QTY) TOTAL_REMD_PART_QTY, 
    SUM(decode(LLP_TRACKD_PART_IND,null,0,
            CASE
                WHEN REGEXP_LIKE(REMD_PART_TSN, '^-?\d+(\.\d+)?$')
                THEN CAST(REMD_PART_TSN AS NUMBER)
                ELSE 0
            END
        )) TOTAL_TSN,
    SUM(decode(LLP_TRACKD_PART_IND,null,0,
            CASE
                WHEN REGEXP_LIKE(REMD_PART_CSN, '^-?\d+(\.\d+)?$')
                THEN CAST(REMD_PART_CSN AS NUMBER)
                ELSE 0
            END
        )) TOTAL_CSN
FROM (
    with t as ( 
        SELECT distinct PLANT, WORK_ORDER_NUM,ESN,REMD_PART_NUM,REMD_PART_SERIAL,REMD_PART_IIN,
                LLP_TRACKD_PART_IND,
                REMD_PART_QTY,REMD_PART_TSN,REMD_PART_CSN,REMD_PART_TSO,REMD_PART_CSO
                ,REMD_PART_TSC,REMD_PART_CSC,REMD_CYCLE_REMAIN          
        FROM <TABLE1> 
        WHERE REMD_PART_NUM is not null 
    )
    select DISTINCT PLANT,WORK_ORDER_NUM,ESN,REMD_PART_NUM,REMD_PART_SERIAL 
            ,REMD_PART_IIN
            ,(  select max(LLP_TRACKD_PART_IND) 
                from t bb  
                where aa.PLANT=bb.PLANT 
                    and aa.WORK_ORDER_NUM=bb.WORK_ORDER_NUM 
                    and aa.ESN=bb.ESN 
                    and aa.REMD_PART_NUM=bb.REMD_PART_NUM
                    and aa.REMD_PART_SERIAL=bb.REMD_PART_SERIAL
            ) LLP_TRACKD_PART_IND
            ,REMD_PART_QTY
            ,(  select max(REMD_PART_TSN) from t bb  
                where aa.PLANT=bb.PLANT 
                    and aa.WORK_ORDER_NUM=bb.WORK_ORDER_NUM 
                    and aa.ESN=bb.ESN 
                    and aa.REMD_PART_NUM=bb.REMD_PART_NUM
                    and aa.REMD_PART_SERIAL=bb.REMD_PART_SERIAL
            ) REMD_PART_TSN
            ,
            (   select max(REMD_PART_CSN) from t bb  
                where aa.PLANT=bb.PLANT 
                    and aa.WORK_ORDER_NUM=bb.WORK_ORDER_NUM 
                    and aa.ESN=bb.ESN 
                    and aa.REMD_PART_NUM=bb.REMD_PART_NUM
                    and aa.REMD_PART_SERIAL=bb.REMD_PART_SERIAL
            ) REMD_PART_CSN
    from t aa
) OFF
WHERE REMD_PART_NUM is not null
GROUP BY decode (PLANT ,'ECL','CELMA','EDS','CELMA',PLANT)
;

您将看到以下模式被重复了3次(3个几乎相同的子查询):

代码语言:javascript
复制
select max( some_field ) 
from t bb  
where aa.PLANT=bb.PLANT 
  and aa.WORK_ORDER_NUM=bb.WORK_ORDER_NUM 
  and aa.ESN=bb.ESN 
  and aa.REMD_PART_NUM=bb.REMD_PART_NUM
  and aa.REMD_PART_SERIAL=bb.REMD_PART_SERIAL

您还可以很容易地看到,这个查询中有3种非常昂贵的排序操作--在最内部的子查询中有一个不同的操作,在另一个子查询中有一个不同的排序操作,然后在顶级查询中按操作分组(一种不同的操作)。

只查看您的查询,就可以很容易地用这种方式使用分析函数消除一种(不同的)类型:

代码语言:javascript
复制
SELECT * FROM (
    SELECT PLANT,WORK_ORDER_NUM,ESN,REMD_PART_NUM,REMD_PART_SERIAL,
           REMD_PART_IIN, REMD_PART_QTY,
           max( LLP_TRACKD_PART_IND ) over 
              (partition by PLANT, WORK_ORDER_NUM, ESN, REMD_PART_NUM, REMD_PART_SERIAL) 
           as LLP_TRACKD_PART_IND,
           max( REMD_PART_TSN )       over 
              (partition by PLANT, WORK_ORDER_NUM, ESN, REMD_PART_NUM, REMD_PART_SERIAL) 
           as REMD_PART_TSN,
           max( REMD_PART_CSN )       over 
              (partition by PLANT, WORK_ORDER_NUM, ESN, REMD_PART_NUM, REMD_PART_SERIAL) 
           as REMD_PART_CSN,
           row_number() over 
              (partition by PLANT, WORK_ORDER_NUM, ESN, REMD_PART_NUM, REMD_PART_SERIAL, REMD_PART_IIN, REMD_PART_QTY 
                   order by PLANT) as Rn
    FROM TABLE1
    WHERE REMD_PART_NUM is not null 
)
WHERE rn = 1

因此,最后的查询是:

代码语言:javascript
复制
SELECT decode (PLANT ,'ECL','CELMA','EDS','CELMA',PLANT)PLANT,
    COUNT(*) RECORD_COUNT,
    COUNT(DISTINCT OFF.REMD_PART_NUM) REMD_PART_NUM_COUNT,
    COUNT(DISTINCT OFF.REMD_PART_SERIAL) REMD_PART_SER_NUM_COUNT,
    COUNT(DECODE(LLP_TRACKD_PART_IND,'LL',LLP_TRACKD_PART_IND,NULL)) LL_COUNT,
    COUNT(DECODE(LLP_TRACKD_PART_IND,'LR',LLP_TRACKD_PART_IND,NULL)) LR_COUNT,
    COUNT(DECODE(LLP_TRACKD_PART_IND,'TR',LLP_TRACKD_PART_IND,NULL)) TR_COUNT,
    SUM(OFF.REMD_PART_QTY) TOTAL_REMD_PART_QTY, 
    SUM(decode(LLP_TRACKD_PART_IND,null,0,
            CASE
                WHEN REGEXP_LIKE(REMD_PART_TSN, '^-?\d+(\.\d+)?$')
                THEN CAST(REMD_PART_TSN AS NUMBER)
                ELSE 0
            END
        )) TOTAL_TSN,
    SUM(decode(LLP_TRACKD_PART_IND,null,0,
            CASE
                WHEN REGEXP_LIKE(REMD_PART_CSN, '^-?\d+(\.\d+)?$')
                THEN CAST(REMD_PART_CSN AS NUMBER)
                ELSE 0
            END
        )) TOTAL_CSN
FROM (
    SELECT PLANT,WORK_ORDER_NUM,ESN,REMD_PART_NUM,REMD_PART_SERIAL,REMD_PART_IIN, REMD_PART_QTY,
           max( LLP_TRACKD_PART_IND ) over (partition by PLANT, WORK_ORDER_NUM, ESN, REMD_PART_NUM, REMD_PART_SERIAL) as LLP_TRACKD_PART_IND,
           max( REMD_PART_TSN )       over (partition by PLANT, WORK_ORDER_NUM, ESN, REMD_PART_NUM, REMD_PART_SERIAL) as REMD_PART_TSN,
           max( REMD_PART_CSN )       over (partition by PLANT, WORK_ORDER_NUM, ESN, REMD_PART_NUM, REMD_PART_SERIAL) as REMD_PART_CSN,
           row_number()               over (partition by PLANT, WORK_ORDER_NUM, ESN, REMD_PART_NUM, REMD_PART_SERIAL, REMD_PART_IIN, REMD_PART_QTY 
                                            order by PLANT) as Rn
    FROM TABLE1
    WHERE REMD_PART_NUM is not null 
)
WHERE rn = 1
GROUP BY decode (PLANT ,'ECL','CELMA','EDS','CELMA',PLANT)

我觉得这个查询可以进一步优化,但它需要深入了解表的结构和对业务需求的了解。

不过,仍有一些微优化的可能。

这种模式:

代码语言:javascript
复制
SUM(decode(LLP_TRACKD_PART_IND,null,0,
        CASE
            WHEN REGEXP_LIKE(REMD_PART_TSN, '^-?\d+(\.\d+)?$')
            THEN CAST(REMD_PART_TSN AS NUMBER)
            ELSE 0
        END
    )) TOTAL_TSN,

可以用这个替换:

代码语言:javascript
复制
coalesce( 
    SUM( 
        CASE
            WHEN REGEXP_LIKE(REMD_PART_TSN, '^-?\d+(\.\d+)?$')
            THEN CAST(REMD_PART_TSN AS NUMBER)
            ELSE 0
        END
    ), 0
)

由于SUM忽略空值,那么检查每个记录的空值都是浪费。对于少量的记录(<100万)来说,这并不重要,但是对于数亿条记录来说,你可以达到刻度的效果--比方说,0.05毫秒用于检查每条记录乘以10.000.000记录可以给500秒。

票数 1
EN
页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/47693075

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档