
Q: Ensure correct input data with SQL by identifying incorrect use of 0 & NULL (COALESCE logic)

Stack Overflow user

Asked 2017-10-26 12:22:19

Answers: 1, views: 46, followers: 0, votes: 0

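Background:

There is a table named "ProductCosts". The first sample data set shows correctly entered data. The data is entered in Excel and picked up by an ETL process. The table holds several cost columns: "4_Costs" is the most recently given cost, then "3_Costs", and so on.

In this case, "3_Costs" is the most recently given cost: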

Category                Product ISOMonth    1_Costs     2_Costs     3_Costs     4_Costs     
----------------------------------------------------------------------------------------
ProductCategory1        Stuff   2017-10     40,000.00   40,000.00   50,000.00   NULL    
ProductCategory1        Stuff   2017-10     10,000.00   10,000.00   00.00       NULL    
ProductCategory1        Stuff   2017-10     10,000.00   10,000.00   00.00       NULL

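As you can see, in the second and third rows the 10,000.00 from "2_Costs" is replaced by 00.00 in "3_Costs". To identify the CurrentCosts, the following simple logic is applied (see COALESCE):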

SELECT Category
     , Product
     , ISOMonth
     , COALESCE([4_Costs], [3_Costs], [2_Costs], [1_Costs]) AS CurrentCosts
FROM [ProductCosts]

Correct result:

Category                Product ISOMonth    CurrentCosts
-----------------------------------------------------------
ProductCategory1        Stuff   2017-10     50,000.00
ProductCategory1        Stuff   2017-10     00.00
ProductCategory1        Stuff   2017-10     00.00

Finally, CurrentCosts is summed up. If the input data is correct, the total is 50,000.00 and this works well.
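The summation step is not shown in the question. A minimal sketch of what it might look like, assuming the total is taken per Category, Product and ISOMonth (the grouping and the alias TotalCurrentCosts are assumptions, not part of the original post):

```sql
-- Sketch only: grouping columns and the alias TotalCurrentCosts are assumed.
-- COALESCE picks, per row, the most recent non-NULL cost; SUM then adds them up.
SELECT Category
     , Product
     , ISOMonth
     , SUM(COALESCE([4_Costs], [3_Costs], [2_Costs], [1_Costs])) AS TotalCurrentCosts
FROM [ProductCosts]
GROUP BY Category, Product, ISOMonth;
```

Because COALESCE works row by row, a single forgotten 00.00 silently falls back to an older cost column for that row only.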

Wrong data:

Category                Product ISOMonth    1_Costs     2_Costs     3_Costs     4_Costs     CurrentCosts
---------------------------------------------------------------------------------------------------------
ProductCategory1        Stuff   2017-10     40,000.00   40,000.00   50,000.00   NULL        50,000.00
ProductCategory1        Stuff   2017-10     10,000.00   10,000.00   NULL        NULL        10,000.00
ProductCategory1        Stuff   2017-10     10,000.00   10,000.00   NULL        NULL        10,000.00

In this case, the user forgot to enter 00.00 in the second and third rows of the "3_Costs" column. This leads to a wrong result in the CurrentCosts column:

Category                Product ISOMonth    CurrentCosts
--------------------------------------------------------
ProductCategory1        Stuff   2017-10     50,000.00
ProductCategory1        Stuff   2017-10     10,000.00
ProductCategory1        Stuff   2017-10     10,000.00

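Finally, CurrentCosts is summed up to 70,000.00, which is a wrong result, just because the user forgot to overwrite the previous 10,000.00 with 00.00.

Assertion: If a column like "3_Costs" has a non-NULL value in one row per Category, Product and ISOMonth (here 50,000.00), the other rows must not be NULL either.

Wrong data example: see the data set "Wrong data". If "3_Costs" has 50,000.00 in the first row, there must be a value in the second and third rows as well.

A SQL query that returns a flag like "has_incomplete_cost_column" would be fine. Then I would know that the data is inconsistent.

Constraints: I have to keep the existing data model and concept, because it is already implemented this way. The input data is provided via Excel sheets, so there is no user interface that could catch these errors.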

sql

sql-server

tsql

1 Answer

Stack Overflow user

Accepted answer

Answered 2017-10-26 12:43:47

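How about using analytic (window) functions with a CASE expression, or subqueries, to get the total of each column and then use the same column for every row?

Demo:

The basic problem is that the coalesce has to be evaluated on the sum of each column, not on the individual row; then only the row value is displayed, not the sum.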

-- Demo data: the "wrong data" rows from the question plus one all-NULL row
WITH ProductCosts (Category, Product, ISOMonth, [1_Costs], [2_Costs], [3_Costs], [4_Costs]) AS (
SELECT 'ProductCategory1', 'Stuff', '2017-10', 40000.00, 40000.00, 50000.00, CAST(NULL AS numeric(10,2)) UNION ALL
SELECT 'ProductCategory1', 'Stuff', '2017-10', 10000.00, 10000.00, NULL,     CAST(NULL AS numeric(10,2)) UNION ALL
SELECT 'ProductCategory1', 'Stuff', '2017-10', 10000.00, 10000.00, NULL,     CAST(NULL AS numeric(10,2)) UNION ALL
SELECT 'ProductCategory1', 'Stuff', '2017-10', NULL,     NULL,     NULL,     CAST(NULL AS numeric(10,2)))

-- Per row, take the value from the most recent column whose group total is > 0
SELECT Category, Product, ISOMonth,
       CASE WHEN SUM([4_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [4_Costs]
            WHEN SUM([3_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [3_Costs]
            WHEN SUM([2_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [2_Costs]
            WHEN SUM([1_Costs]) OVER (PARTITION BY Category, Product, ISOMonth) > 0 THEN [1_Costs]
       END AS currentprice
FROM ProductCosts A

Giving us (with either the approach above or the one below):

+----+------------------+---------+----------+--------------+
|    |     Category     | Product | ISOMonth | currentprice |
+----+------------------+---------+----------+--------------+
|  1 | ProductCategory1 | Stuff   | 2017-10  | 50000,00     |
|  2 | ProductCategory1 | Stuff   | 2017-10  | NULL         |
|  3 | ProductCategory1 | Stuff   | 2017-10  | NULL         |
|  4 | ProductCategory1 | Stuff   | 2017-10  | NULL         |
+----+------------------+---------+----------+--------------+

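A few notes:

I'm not a big fan of column names that start with a number, but that's my hang-up.
By using analytics, we identify the first column that has a value and consistently use that column's values.
The analytic basically has to run for every record/row. This may be slower than calculating it once, or once per column, putting the result into a variable and using that variable in the CASE.
I'm not positive about the partition for each analytic, so you may need to adjust it.
This allows for a missing 0.00 in any row (including the first one).
What's to say someone forgot to put 0.00 in 4_Costs, when all the totals should have been based on 0.00 in 4_Costs? You can't control that kind of human error. However, if the total of a column is 0, you could check for that and report it to the user as a "warning" to check the input.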
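The question also asks for a flag such as "has_incomplete_cost_column". A minimal sketch of such a check, assuming "incomplete" means that a cost column is filled in for some but not all rows of a Category, Product and ISOMonth group (this reading of the assertion is an assumption, and the query is not part of the original answer):

```sql
-- Sketch only: flags groups where a cost column is partially filled.
-- COUNT(column) counts non-NULL values; COUNT(*) counts all rows in the group.
SELECT Category
     , Product
     , ISOMonth
     , CASE WHEN (COUNT([1_Costs]) > 0 AND COUNT([1_Costs]) < COUNT(*))
              OR (COUNT([2_Costs]) > 0 AND COUNT([2_Costs]) < COUNT(*))
              OR (COUNT([3_Costs]) > 0 AND COUNT([3_Costs]) < COUNT(*))
              OR (COUNT([4_Costs]) > 0 AND COUNT([4_Costs]) < COUNT(*))
            THEN 1 ELSE 0
       END AS has_incomplete_cost_column
FROM [ProductCosts]
GROUP BY Category, Product, ISOMonth;
```

For the question's "wrong data" set, "3_Costs" has one non-NULL value among three rows, so the group is flagged with 1. For the correct data set every column is either completely filled or completely NULL, so the flag is 0.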

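Alternative approach: without testing, I'm not sure whether the repeated analytics or the subqueries would be faster. I think the subqueries would be, since I believe they are evaluated only once, whereas the analytic has to run for every row; but the engine may know this and optimize accordingly.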

-- Same logic with one aggregated subquery per cost column, joined back to each row
SELECT PC.Category, PC.Product, PC.ISOMonth,
       CASE WHEN D.[4_Costs] > 0 THEN PC.[4_Costs]
            WHEN C.[3_Costs] > 0 THEN PC.[3_Costs]
            WHEN B.[2_Costs] > 0 THEN PC.[2_Costs]
            WHEN A.[1_Costs] > 0 THEN PC.[1_Costs]
       END AS currentprice
FROM ProductCosts PC
INNER JOIN (SELECT SUM([4_Costs]) [4_Costs], Category, Product, ISOMonth FROM ProductCosts GROUP BY Category, Product, ISOMonth) D
   ON D.Category = PC.Category
  AND D.Product = PC.Product
  AND D.ISOMonth = PC.ISOMonth
INNER JOIN (SELECT SUM([3_Costs]) [3_Costs], Category, Product, ISOMonth FROM ProductCosts GROUP BY Category, Product, ISOMonth) C
   ON C.Category = PC.Category
  AND C.Product = PC.Product
  AND C.ISOMonth = PC.ISOMonth
INNER JOIN (SELECT SUM([2_Costs]) [2_Costs], Category, Product, ISOMonth FROM ProductCosts GROUP BY Category, Product, ISOMonth) B
   ON B.Category = PC.Category
  AND B.Product = PC.Product
  AND B.ISOMonth = PC.ISOMonth
INNER JOIN (SELECT SUM([1_Costs]) [1_Costs], Category, Product, ISOMonth FROM ProductCosts GROUP BY Category, Product, ISOMonth) A
   ON A.Category = PC.Category
  AND A.Product = PC.Product
  AND A.ISOMonth = PC.ISOMonth
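Both queries test whether a column's total is greater than 0, so a column that contains only 0.00 entries is skipped. A possible variant, assuming that any non-NULL entry means the column has been entered, counts non-NULL values instead and aggregates all four columns in a single derived table (the aliases T, n1 to n4 are hypothetical, and this is a sketch rather than the original answer's method):

```sql
-- Sketch only: picks the most recent column that has at least one non-NULL
-- value in the group, then returns that column's value for every row.
SELECT PC.Category
     , PC.Product
     , PC.ISOMonth
     , CASE WHEN T.n4 > 0 THEN PC.[4_Costs]
            WHEN T.n3 > 0 THEN PC.[3_Costs]
            WHEN T.n2 > 0 THEN PC.[2_Costs]
            WHEN T.n1 > 0 THEN PC.[1_Costs]
       END AS currentprice
FROM ProductCosts PC
INNER JOIN (SELECT Category, Product, ISOMonth
                 , COUNT([4_Costs]) AS n4   -- non-NULL entries per column
                 , COUNT([3_Costs]) AS n3
                 , COUNT([2_Costs]) AS n2
                 , COUNT([1_Costs]) AS n1
            FROM ProductCosts
            GROUP BY Category, Product, ISOMonth) T
   ON T.Category = PC.Category
  AND T.Product  = PC.Product
  AND T.ISOMonth = PC.ISOMonth;
```

Rows where the chosen column is NULL still return NULL, which SUM ignores, so a forgotten 00.00 no longer falls back to an older cost.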

Votes: 2

Original page content provided by Stack Overflow.

Original link:

https://stackoverflow.com/questions/46954226
