我们收集客户调查,其中一个问题是“您推荐我们的服务的可能性有多大?”计算净启动得分的问题。(我讨厌这种计算,但我将把讨论推迟到不同的时间。)
我们将调查存储在我们的数据库中,而且,由于问题每年都在变化,我创建了这个观点,以获得计算NPS的关键数据,这是我们还收集到的一个总体满意度指标,以及其他一些一般的报告需求。
CREATE VIEW [dbo].[SatSurvey] AS
SELECT SSV1.ID, SSV1.ClinicID, CC.ShortName, NULL as HashValue, NULL as LoadDate, CollectionDate, RecommendNPS, RecommendNPSReason, OverallExperience, Comments, NULL as StartDateTime, NULL as EndDateTime, Method,
CorporateAffiliation, NULL as DoNotReport
FROM SatSurveyV1 SSV1 inner join ClinicConfig CC on SSV1.ClinicID = cc.ClinicID
UNION ALL
SELECT SSV2.ID, SSV2.ClinicID, CC.ShortName, HashValue, LoadDate, CollectionDate, RecommendNPS, RecommendNPSReason, OverallExperience, Comments, StartDateTime, EndDateTime, Method,
CorporateAffiliation, DoNotReport
FROM SatSurveyV2 SSV2 inner join ClinicConfig CC on SSV2.ClinicID = cc.ClinicID
UNION ALL
SELECT SSV2018.ID, SSV2018.ClinicID, cc.ShortName, HashValue, LoadDate, CollectionDate, RecommendNPS, RecommendNPSReason, OverallExperience, Comments, StartDateTime, EndDateTime, Method,
CorporateAffiliation, DoNotReport
FROM SatSurveyV2018 SSV2018 inner join ClinicConfig CC on SSV2018.ClinicID = CC.ClinicID注意: HashValue、LoadDate和DoNotReport列是在第一次调查更改后添加的,因此它们被选择为最早的数据的NULL。
我通过在代码中选择此视图中的数据来计算NPS分数,但我认为它将更有效(特别是因为我现在需要计算多个不同时期的YTD分数),才能从存储过程返回它。
我编写了这个存储过程来计算NPS,但是似乎有一种更有效的方法来计算它。
CREATE PROCEDURE [dbo].[CalculateNPSOverPeriod] (
@NPS decimal (5,2) output,
@startDate date,
@endDate date,
@clinicName nvarchar(50) = NULL)
AS
BEGIN
SET NOCOUNT ON;
SELECT @NPS = ((SUM(Promoters) - SUM(Detractors)) / (SUM(Responses) * 1.0) * 100.0)
FROM (
SELECT clinicid, YEAR(CollectionDate) Yr, MONTH(CollectionDate) Mo, COUNT(*) Promoters, 0 Neutrals, 0 Detractors, 0 Responses
FROM SatSurvey
WHERE CollectionDate >= @startDate
AND CollectionDate < DATEADD(d, 1, @endDate)
AND RecommendNPS > 8
AND DoNotReport IS NULL
GROUP BY ClinicID, YEAR(CollectionDate), MONTH(CollectionDate)
/*
UNION
SELECT clinicid, YEAR(CollectionDate), MONTH(CollectionDate), 0 Promoters, COUNT(*) Neutrals, 0 Detractors, 0 Responses
FROM SatSurvey
WHERE CollectionDate >= @startDate
AND CollectionDate < DATEADD(d, 1, @endDate)
AND RecommendNPS BETWEEN 7 AND 8
AND DoNotReport IS NULL
GROUP BY ClinicID, YEAR(CollectionDate), MONTH(CollectionDate)
*/
UNION
SELECT clinicid, YEAR(CollectionDate), MONTH(CollectionDate), 0 Promoters, 0 Neutrals, COUNT(*) Detractors, 0 Responses
FROM SatSurvey
WHERE CollectionDate >= @startDate
AND CollectionDate < DATEADD(d, 1, @endDate)
AND RecommendNPS < 7
AND DoNotReport IS NULL
GROUP BY ClinicID, YEAR(CollectionDate), MONTH(CollectionDate)
UNION
SELECT clinicid, YEAR(CollectionDate), MONTH(CollectionDate), 0 Promoters, 0 Neutrals, 0 Detractors, COUNT(*) Responses
FROM SatSurvey
WHERE CollectionDate >= @startDate
AND CollectionDate < DATEADD(d, 1, @endDate)
AND RecommendNPS IS NOT NULL
AND DoNotReport IS NULL
GROUP BY ClinicID, YEAR(CollectionDate), MONTH(CollectionDate)
) p INNER JOIN ClinicConfig CC on p.ClinicID = cc.ClinicID
WHERE cc.GenerateSatSurvey = 1
AND (@clinicName IS NULL OR (cc.MidName LIKE @clinicName))
OPTION (Recompile)
RETURN @NPS
END我知道我能做的第一件事是提取注释的代码(以及其他SELECT语句中的所有D8),因为实际上不需要知道中立响应的数量来计算NPS。我以前在检查其他代码时就使用了它,以确保我得到了正确的结果并进行了故障排除。
UNION处理所有单个子查询更好的方法吗?Promoters、Detractors和Responses,然后在最后进行计算吗?CollectionDate、RecommendNPS、DoNotReport和ClinicID上设置索引,会为我提供最好的服务吗?output参数可能有点不合常规,但是,我选择这样做,因为我有一个标准过程,用于调用存储过程,我决定总是首先列出输出参数(如果有)是最简单的(也许是懒惰的)。这很可能是另一个需要完成的重构,但是这个问题已经超出了范围。发布于 2018-08-21 15:09:37
最后,我在聊天中得到了一些很好的帮助(尽管我似乎无法说服任何人发帖),这就是我想出的:
CREATE PROCEDURE [dbo].[CalculateNPSOverPeriod] (
@NPS decimal (5,2) output,
@startDate date,
@endDate date,
@clinicName nvarchar(50) = NULL)
AS
BEGIN
SET NOCOUNT ON; -- this suppresses the "x rows affected" message which breaks returning the value
DECLARE @Promoters decimal (10,2)
SELECT @Promoters= CAST(COUNT(*) as decimal (10,2))
FROM SatSurvey SS INNER JOIN ClinicConfig CC ON SS.ClinicID = CC.ClinicID
WHERE CollectionDate >= @startDate
AND CollectionDate < DATEADD(d, 1, @endDate)
AND RecommendNPS > 8
AND DoNotReport IS NULL
AND cc.GenerateSatSurvey = 1
AND (@clinicName IS NULL OR (cc.MidName LIKE @clinicName))
OPTION (Recompile)
DECLARE @Detractors decimal (10,2)
SELECT @Detractors = CAST(COUNT(*) as decimal (10,2))
FROM SatSurvey SS INNER JOIN ClinicConfig CC ON SS.ClinicID = CC.ClinicID
WHERE CollectionDate >= @startDate
AND CollectionDate < DATEADD(d, 1, @endDate)
AND RecommendNPS < 7
AND DoNotReport IS NULL
AND cc.GenerateSatSurvey = 1
AND (@clinicName IS NULL OR (cc.MidName LIKE @clinicName))
OPTION (Recompile)
DECLARE @Responses decimal(10,2)
SELECT @Responses = CAST(COUNT(*) as decimal (10,2))
FROM SatSurvey SS INNER JOIN ClinicConfig CC ON SS.ClinicID = CC.ClinicID
WHERE CollectionDate >= @startDate
AND CollectionDate < DATEADD(d, 1, @endDate)
AND RecommendNPS IS NOT NULL
AND DoNotReport IS NULL
AND cc.GenerateSatSurvey = 1
AND (@clinicName IS NULL OR (cc.MidName LIKE @clinicName))
OPTION (Recompile)
DECLARE @100Percent as decimal(5,2) = 100.00
SELECT @NPS = CAST((@Promoters - @Detractors) / NULLIF((@Responses * @100Percent),0) as decimal(5,2))
END通过设置SET STATISTICS IO ON;和SET STATISTICS TIME ON;,我能够确定查询的这个版本比UNION版本快大约3ms (79对82 3ms,IIRC)。
在运行查询时(在实际使用SP之前),包括实际的执行计划表明,我需要对视图的底层表进行一些索引。
在从推荐的索引开始并进行列排序之后,我最终得到了以下结果:
CREATE NONCLUSTERED INDEX [nu_nc_DNR_CD_RNPS_SSV2018] ON [dbo].[table]
(
[ClinicID] ASC,
[CollectionDate] ASC,
[RecommendNPS] ASC,
[DoNotReport] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF,
DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]我从79毫秒降到了49毫秒。
https://codereview.stackexchange.com/questions/201824
复制相似问题