
How to calculate a day-by-day rolling count over a year in SQL

Stack Overflow user
Asked 2020-10-03 18:11:34

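I have to calculate a day-by-day rolling total of active users over the past two years (2019 and 2020). I have date, email, and last_visit columns. A user counts as active if last_visit > current day - 90. The first problem is that I don't know how to tell SQL what the current day is. I tried using the date column, but it produced an error, and the row count was the same as before: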

 WITH users_list as
  (SELECT SUBSTRING([agent_email], CHARINDEX('@', [agent_email])+1, LEN([agent_email])) AS DOMAIN,
          SUBSTRING(last_visit, 1, +10) as _date,
         VS.agent_email,
         VS.last_visit,
         vs.agent_license_type,
         vs.custom_templic
  FROM test.visitor AS VS
  WHERE VS.id NOT LIKE '%@rule.com'
    AND VS.agent_company NOT LIKE '%Rule%'
    AND last_visit> _date - 90
  GROUP BY VS.agent_email,
           VS.last_visit,
           vs.agent_license_type,
           vs.custom_templic

The second problem is that I don't know how to calculate the total number of active users for 2019 and 2020. I tried this script:

SELECT _date, 
   count (agent_email)
   over(order by _date ROWS BETWEEN 1 PRECEDING AND CURRENT ROW) as number_of_active_users
FROM users_list

But it also produced an error.


1 Answer

Stack Overflow user

Answered 2020-10-03 19:59:59

Unfortunately, Redshift does not support the range window frame specification on dates.

Your sample code implements logic that is not in the problem description. This answer addresses the problem description.

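So, an alternative approach is to keep track of when each user moves into and out of the active range. This starts with determining the active periods for each user, which is a gaps-and-islands problem: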

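-- One row per user and activity period (island).
-- agent_email is the column name from the question; last_visit is assumed to be a date or timestamp.
select agent_email, min(last_visit) as active_start,
       max(last_visit) + interval '90 day' as active_end
from (select v.*,
             -- a new period starts when the previous visit is 90 or more days earlier (or missing)
             sum(case when prev_last_visit > last_visit - interval '90 day' then 0 else 1 end) over
                 (partition by agent_email order by last_visit
                  rows between unbounded preceding and current row) as active_period
      from (select v.*,
                   lag(last_visit) over (partition by agent_email order by last_visit) as prev_last_visit
            from test.visitor v
           ) v
     ) v
group by agent_email, active_period;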
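
To make the gaps-and-islands step easier to follow, here is a minimal Python sketch of the same idea for a single user. It is only an illustration, not part of the original answer: the function name `active_periods` and the `WINDOW` constant are made up here, and it assumes the 90-day rule from the question and that visits are plain dates.

```python
from datetime import date, timedelta

WINDOW = timedelta(days=90)  # assumed activity window, taken from the question


def active_periods(visits):
    """Merge one user's visit dates into (active_start, active_end) periods."""
    periods = []  # each entry is [first_visit, last_visit] of one island
    for visit in sorted(visits):
        # Same test as the SQL: stay in the current island only if the
        # previous visit is less than 90 days before this one.
        if periods and visit - periods[-1][1] < WINDOW:
            periods[-1][1] = visit
        else:
            periods.append([visit, visit])
    # A user stays active for the window length after the last visit.
    return [(first, last + WINDOW) for first, last in periods]


visits = [date(2019, 1, 1), date(2019, 2, 1), date(2019, 9, 1)]
for start, end in active_periods(visits):
    print(start, end)
```

The first two visits are 31 days apart, so they fall into one period; the September visit starts a new one.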

Then, with this information, we can unpivot the periods to track when someone becomes active and when they stop being active. So:

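-- agent_email is the column name from the question; last_visit is assumed to be a date or timestamp.
with actives as (
      select agent_email, min(last_visit) as active_start,
             max(last_visit) + interval '90 day' as active_end
      from (select v.*,
                   -- a new period starts when the previous visit is 90 or more days earlier (or missing)
                   sum(case when prev_last_visit > last_visit - interval '90 day' then 0 else 1 end) over
                     (partition by agent_email order by last_visit
                      rows between unbounded preceding and current row) as active_period
            from (select v.*,
                         lag(last_visit) over (partition by agent_email order by last_visit) as prev_last_visit
                  from test.visitor v
                 ) v
           ) v
      group by agent_email, active_period
      )
-- +1 on the day a period starts, -1 on the day it ends; the running sum is the active count
select dte,
       sum(inc) as change_on_day,
       sum(sum(inc)) over (order by dte rows between unbounded preceding and current row) as actives_on_day
from ((select agent_email, active_start as dte, 1 as inc
       from actives
      ) union all
      (select agent_email, active_end as dte, -1 as inc
       from actives
      )
     ) a
group by dte;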
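
The unpivot-and-running-sum step can also be sketched outside SQL. The Python below is a rough illustration under assumptions, not the answer's own code: `actives_by_change_day` is a hypothetical helper, and it takes already-computed `(active_start, active_end)` pairs as input.

```python
from collections import defaultdict
from datetime import date


def actives_by_change_day(periods):
    """Return (day, change_on_day, actives_on_day) rows, one per day with an event."""
    change = defaultdict(int)
    for start, end in periods:
        change[start] += 1  # user becomes active
        change[end] -= 1    # user stops being active
    rows = []
    running = 0
    for day in sorted(change):
        running += change[day]  # cumulative sum, like sum(sum(inc)) over (...)
        rows.append((day, change[day], running))
    return rows


periods = [
    (date(2019, 1, 1), date(2019, 4, 1)),
    (date(2019, 2, 1), date(2019, 5, 1)),
]
for row in actives_by_change_day(periods):
    print(row)
```

As in the query, only days on which some period starts or ends produce a row; days without events keep the previous total.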

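If you want to filter on a date range, wrap this query in an additional subquery or CTE and apply the filter in the outermost query, so that the running total is computed before any rows are removed.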
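
To show why the filter belongs after the running total, here is a small Python sketch. It is an illustration under assumptions rather than part of the answer: `daily_actives` is a hypothetical helper, its input is assumed to be sorted `(day, actives_on_day)` pairs like the query's output, and it also carries the last total forward to days that have no row, which the query itself does not do.

```python
from datetime import date, timedelta


def daily_actives(change_rows, first_day, last_day):
    """Return {day: active count} for every calendar day in [first_day, last_day]."""
    result = {}
    current = 0
    i = 0
    day = first_day
    while day <= last_day:
        # Consume every row up to and including this day. Rows before
        # first_day still matter: they set the starting total.
        while i < len(change_rows) and change_rows[i][0] <= day:
            current = change_rows[i][1]
            i += 1
        result[day] = current
        day += timedelta(days=1)
    return result


rows = [(date(2018, 12, 30), 1), (date(2019, 1, 2), 2), (date(2019, 1, 4), 1)]
for day, count in daily_actives(rows, date(2019, 1, 1), date(2019, 1, 5)).items():
    print(day, count)
```

The row from 2018-12-30 is outside the requested range, yet it determines the count reported for 2019-01-01. Filtering the input before accumulating would lose it.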

Original page content provided by Stack Overflow.
Original link:

https://stackoverflow.com/questions/64183067
