我有一个sql查询,在这里我计算收据。我想数一数:( a)所有收据;( b)有关客户在同一年/月内登记的收据;( c)有关客户曾最少4次光顾该商店的收据。查询按年和月分组。我希望这是可以理解的。
SELECT t.Year, t.Month,
COUNT(DISTINCT fs.Receipt) AS NumberOfGuests, //fs.Receipt is Receipt Number
COUNT(DISTINCT
(CASE
WHEN YEAR(c.RegistrationDate) = t.Year AND MONTH(c.RegistrationDate) = t.Month
THEN fs.Receipt END)) AS NumberOfNewGuests,
COUNT(DISTINCT
(CASE WHEN
(
SELECT COUNT(DISTINCT fs_sub.Receipt)
FROM Dimension_Time AS t_sub
LEFT OUTER JOIN Fact_Sales AS fs_sub ON t_sub.ID = fs_sub.Time AND fs_sub.Store = @storeID
WHERE t_sub.Time = t.Time AND fs_sub.CustomerID = fs.CustomerID
) > 3
THEN fs.Receipt END)) AS NumberOfRegularGuests
FROM Dimension_Time AS t
LEFT OUTER JOIN Fact_Sales AS fs ON t.ID = fs.Time AND fs.Store = @storeID
LEFT OUTER JOIN Dimension_Customer AS c ON fs.Customer = c.ID
WHERE (t.Time >= DATEFROMPARTS(2021 - 1, 5, 1)) AND (t.Time <= EOMONTH(DATEFROMPARTS(2021, 5, 1)))
GROUP BY t.Month, t.Year
ORDER BY t.Year, t.MonthNumberOfGuests和NumberOfNewGuests工作得很好,但是使用NumberOfRegularGuests,我得到了一个错误:
不能对包含聚合或子查询的表达式执行聚合函数。
是否有可能以另一种方式构建查询以避免错误?
发布于 2021-05-12 21:11:57
可以将子查询放入APPLY中。
SELECT t.Year, t.Month,
COUNT(DISTINCT fs.Receipt) AS NumberOfGuests, //fs.Receipt is Receipt Number
COUNT(DISTINCT
(CASE
WHEN YEAR(c.RegistrationDate) = t.Year AND MONTH(c.RegistrationDate) = t.Month
THEN fs.Receipt END)) AS NumberOfNewGuests,
COUNT(DISTINCT
(CASE WHEN t2.countReceipts > 3 THEN fs.Receipt END)
) AS NumberOfRegularGuests
FROM Dimension_Time AS t
LEFT OUTER JOIN Fact_Sales AS fs ON t.ID = fs.Time AND fs.Store = @storeID
LEFT OUTER JOIN Dimension_Customer AS c ON fs.Customer = c.ID
OUTER APPLY (
SELECT COUNT(DISTINCT fs_sub.Receipt) AS countReceipts
FROM Dimension_Time AS t_sub
LEFT OUTER JOIN Fact_Sales AS fs_sub ON t_sub.ID = fs_sub.Time AND fs_sub.Store = @storeID
WHERE t_sub.Time = t.Time AND fs_sub.CustomerID = fs.CustomerID
) t2
WHERE (t.Time >= DATEFROMPARTS(2021 - 1, 5, 1)) AND (t.Time <= EOMONTH(DATEFROMPARTS(2021, 5, 1)))
GROUP BY t.Month, t.Year
ORDER BY t.Year, t.Month;但是这种类型的子查询通常效率很低(尽管这可能取决于索引和基数,即在对日期进行过滤之后实际使用了多少Fact_Sales )。它可以用加窗的聚合来代替。
SELECT t.Year, t.Month,
COUNT(DISTINCT fs.Receipt) AS NumberOfGuests, //fs.Receipt is Receipt Number
COUNT(DISTINCT
(CASE
WHEN YEAR(c.RegistrationDate) = t.Year AND MONTH(c.RegistrationDate) = t.Month
THEN fs.Receipt END)) AS NumberOfNewGuests,
COUNT(DISTINCT
(CASE WHEN fs.countReceipts > 3 THEN fs.Receipt END)
) AS NumberOfRegularGuests
FROM Dimension_Time AS t
LEFT OUTER JOIN (
SELECT *,
COUNT(*) OVER (PARTITION BY fs.CustomerID) AS countReceipts
FROM Fact_Sales
) AS fs ON t.ID = fs.Time AND fs.Store = @storeID
LEFT OUTER JOIN Dimension_Customer AS c ON fs.Customer = c.ID
WHERE (t.Time >= DATEFROMPARTS(2021 - 1, 5, 1)) AND (t.Time <= EOMONTH(DATEFROMPARTS(2021, 5, 1)))
GROUP BY t.Month, t.Year
ORDER BY t.Year, t.Month;不清楚为什么您有COUNT(DISTINCT而不仅仅是COUNT,这通常是一个错误的连接的迹象。
如果确实存在多个具有相同Fact_Sales的Receipt,则需要更改窗口聚合,因为不能使用COUNT(DISTINCT...) OVER...。
SELECT t.Year, t.Month,
COUNT(DISTINCT fs.Receipt) AS NumberOfGuests, //fs.Receipt is Receipt Number
COUNT(DISTINCT
(CASE
WHEN YEAR(c.RegistrationDate) = t.Year AND MONTH(c.RegistrationDate) = t.Month
THEN fs.Receipt END)) AS NumberOfNewGuests,
COUNT(DISTINCT
(CASE WHEN fs.countReceipts > 3 THEN fs.Receipt END)
) AS NumberOfRegularGuests
FROM Dimension_Time AS t
LEFT OUTER JOIN (
SELECT *,
MAX(rn) OVER (PARTITION BY fs.CustomerID) AS countReceipts
FROM (
SELECT *,
DENSE_RANK() OVER (PARTITION BY fs.CustomerID ORDER BY fs.Receipt) AS rn
FROM Fact_Sales
) AS fs
) AS fs ON t.ID = fs.Time AND fs.Store = @storeID
LEFT OUTER JOIN Dimension_Customer AS c ON fs.Customer = c.ID
WHERE (t.Time >= DATEFROMPARTS(2021 - 1, 5, 1)) AND (t.Time <= EOMONTH(DATEFROMPARTS(2021, 5, 1)))
GROUP BY t.Month, t.Year
ORDER BY t.Year, t.Month;https://stackoverflow.com/questions/67508778
复制相似问题