首页
学习
活动
专区
圈层
工具
发布
社区首页 >问答首页 >Bigquery事件分析在subselect语句中联接

Bigquery事件分析在subselect语句中联接
EN

Stack Overflow用户
提问于 2016-07-12 03:50:08
回答 0查看 206关注 0票数 1

我试图从bigquery返回一个查询结果,该结果返回每个会话发生的事件数。我一直在参考下面这篇文章:

http://developer.streak.com/2013/11/using-google-bigquery-for-event-tracking.html

数据库模式是非常简单的sessionId,eventType,createdAt返回的结果集类似于谷歌分析中的事件工作流。像sessionId,num_event1,num_event2,..。

该方法是根据事件类型和时间戳生成子查询,然后创建连接每个事件子查询的结果的附加子查询。我能够单独执行Step1、step2、step3子查询:

代码语言:javascript
复制
SELECT COUNT(first_event_timestamp) AS number_first_events,
       COUNT(second_event_timestamp) AS number_second_events,
       COUNT(third_event_timestamp) AS number_third_events
FROM

(SELECT eventUid AS eventUid1,
        createdAt AS timestamp1
 FROM [events_table]
 WHERE eventType = 'first-event') step1,

 (SELECT eventUid AS eventUid2,
        createdAt AS timestamp2
  FROM [events_table]
  WHERE eventType = 'second-event') step2,

 (SELECT
    eventUid as sessionId3,
    createdAt as timestamp3         
  FROM
    [events_table]         
  WHERE
    eventType = "third_event") step3

加上steps1_2,steps1_2_3是我碰壁的地方。我得到一个错误,即表中缺少数据集名称。下面是完整的查询:

代码语言:javascript
复制
SELECT COUNT(first_event_timestamp) AS num_first,
       COUNT(second_event_timestamp) AS num_second,
       COUNT(third_event_timestamp) AS num_third
FROM (SELECT
             sessionId
             first_event_timestamp,
             second_event_timestamp,
             third_event_timestamp
      FROM steps1_2_3
      GROUP BY sessionId),

      (SELECT
            sessionId AS sessionId1,
            createdAt AS timestamp1         
         FROM
            [events_table]         
         WHERE
            eventType = "first_event") step1,           (SELECT
            eventUid AS sessionId2,
            createdAt AS timestamp2         
         FROM
            [events_table]         
         WHERE
            eventType = "second_event") step2,       (SELECT
            eventUid AS sessionId3,
            createdAt AS timestamp3         
         FROM
            [events_table]         
         WHERE
            eventType = "third_Event") step3,         (SELECT sessionId1,
                    timestamp1,
                    IF(timestamp1 < timestamp2, timestamp2, NULL) AS timestamp2
             FROM
                  (SELECT sessionId1,
                          timestamp1,
                          timestamp2
                   FROM step1
                   LEFT JOIN step2
                   ON sessionId1 = sessionId2) ) steps1_2,  (SELECT sessionId1 as sessionId,
              timestamp1 as first_event_timestamp,
              timestamp2 as second_event_timestamp,
              IF(timestamp2 < timestamp3, timestamp3, NULL) as  third_event_timestamp
       FROM
            (SELECT sessionId2,
                    timestamp2,
                    timestamp3
             FROM steps1_2
             LEFT JOIN step3
             ON sessionId1 = sessionId3)
             ) steps1_2_3

理想的结果集应如下所示: sessionId num_first_event num_second_event num_third_event S1 1 null S2 2 3 null S3 4 5 6

我的第一个问题是,是否可以连接子查询steps1_2,steps1_2_3?

在bigquery中实现像工作流这样的事件的替代方法,而不是计算时间戳的数量?

非常感谢您提供的任何技巧或建议的文档。另外,感谢您的时间和考虑。

EN

回答

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持
原文链接:

https://stackoverflow.com/questions/38315064

复制
相关文章

相似问题

领券
问题归档专栏文章快讯文章归档关键词归档开发者手册归档开发者手册 Section 归档