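I want to count returning visitors within a given date range. Every ipAddress that appears more than once should be counted as a returning visitor. How can this be done?
Table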
ipAddress | last_update
416.246.227.151 | 2020-11-11 19:29:45
416.246.227.151 | 2020-11-11 20:29:45
173.252.127.119 | 2020-11-11 21:29:45
816.246.227.151 | 2020-11-13 13:53:16
816.246.227.151 | 2020-11-13 15:53:16
816.246.227.151 | 2020-11-13 19:53:16
373.252.127.119 | 2020-11-13 22:53:16
673.252.127.119 | 2020-11-13 20:53:16
Query
SELECT last_update, COUNT(ipAddress) as returningVisitor
FROM geolocation
WHERE last_update BETWEEN '2020-11-01' AND '2020-12-01'
GROUP BY date(last_update)
HAVING (returningVisitor>1)
Result
last_update | returningVisitor
2020-11-11 19:29:45 | 3
2020-11-13 19:53:16 | 5
Expected result
last_update | returningVisitor
2020-11-11 19:29:45 | 1
2020-11-13 19:53:16 | 1
In fact, the result for each of these two dates should be 1 returningVisitor. I tried COUNT(DISTINCT ipAddress), but the result was still not right.
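Posted on 2020-11-16 03:56:18
We can handle this with two levels of aggregation. First, aggregate by date and IP address to get a visit count for each date/IP pair, and keep only the pairs with more than one visit on that date. Then aggregate that result by date alone and count the returning visitors across all IP addresses.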
WITH cte AS (
    -- one row per (date, IP) pair that has more than one visit on that date
    SELECT DATE(last_update) AS last_update, ipAddress
    FROM geolocation
    WHERE last_update >= '2020-11-01' AND last_update < '2020-12-01'
    GROUP BY DATE(last_update), ipAddress
    HAVING COUNT(*) > 1
)
-- count the returning IPs per date
SELECT last_update, COUNT(*) AS returningVisitor
FROM cte
GROUP BY last_update;
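To check the two-level aggregation against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. SQLite is an assumption made only so the example runs without a database server; the question does not say which database is in use. The function name count_returning_visitors and the ORDER BY clause are additions for this demo and are not part of the original answer.

```python
import sqlite3

# Sample rows copied from the table in the question.
ROWS = [
    ("416.246.227.151", "2020-11-11 19:29:45"),
    ("416.246.227.151", "2020-11-11 20:29:45"),
    ("173.252.127.119", "2020-11-11 21:29:45"),
    ("816.246.227.151", "2020-11-13 13:53:16"),
    ("816.246.227.151", "2020-11-13 15:53:16"),
    ("816.246.227.151", "2020-11-13 19:53:16"),
    ("373.252.127.119", "2020-11-13 22:53:16"),
    ("673.252.127.119", "2020-11-13 20:53:16"),
]

# The query from the answer, plus an ORDER BY (demo addition) for stable output.
QUERY = """
WITH cte AS (
    SELECT DATE(last_update) AS last_update, ipAddress
    FROM geolocation
    WHERE last_update >= '2020-11-01' AND last_update < '2020-12-01'
    GROUP BY DATE(last_update), ipAddress
    HAVING COUNT(*) > 1
)
SELECT last_update, COUNT(*) AS returningVisitor
FROM cte
GROUP BY last_update
ORDER BY last_update;
"""


def count_returning_visitors(rows):
    """Hypothetical helper: load rows into an in-memory table and run QUERY."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE geolocation (ipAddress TEXT, last_update TEXT)")
        conn.executemany("INSERT INTO geolocation VALUES (?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for day, visitors in count_returning_visitors(ROWS):
        print(day, "|", visitors)
```

Note that last_update in the output is the date only, since the inner query groups on DATE(last_update). The half-open range (>= '2020-11-01' and < '2020-12-01') covers all of November without picking up a row stamped exactly at midnight on 2020-12-01, which BETWEEN would include.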

https://stackoverflow.com/questions/64852499