Count the number of events before and after an event "A" until another event "A" is encountered in BigQuery?

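I have a table containing date, event, and user columns. There is an event named 'A'. I want to find out, in BigQuery SQL, how many times a particular kind of event occurred before and after event 'A'. Event A can appear multiple times, but in both the before and the after direction the count should only run until another event A is encountered.
For example,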

    User         Date                  Events
    123          2018-02-14            X.Y.A
    123          2018-02-12            X.Y.B
    134          2018-02-10            Y.Z.A
    123          2018-02-11            A
    123          2018-02-01            X.Y.Z
    134          2018-02-05            X.Y.B
    134          2018-02-04            A
    123          2018-02-13            A

The output would look like this.

User       Event    Before   After
123          A      1        1
123          A      0        1
134          A      0        1

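All other conditions stay the same.

This question is an extension of my previous question.

Details are in that question.

The events I have to count carry a particular prefix. That is, I have to check for events that start with X.Y. followed by some event name. So X.Y.SomeEvent is the kind of event I need to keep a counter for. Any suggestions?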

This is a more general version of that problem. You can use the same idea with lag() and lead():

select userid,
       -- the outer WHERE keeps only 'A' rows, so lag/lead return the neighbouring A's position
       (seqnum - lag(seqnum, 1, 0) over (partition by userid order by date) - 1) as before,
       -- for the last A, use cnt + 1 as the "next A" position so trailing events are counted
       (coalesce(lead(seqnum) over (partition by userid order by date), cnt + 1) - seqnum - 1) as after
from (select t.*,
             row_number() over (partition by userid order by date) as seqnum,
             count(*) over (partition by userid) as cnt
      from t
      where event like 'X.Y%' or event = 'A'
     ) t
where event = 'A';
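
To make the position arithmetic easier to check by hand, here is a minimal Python sketch of the same idea, run on the sample rows from the question. The names `ROWS` and `count_around_a` are made up for illustration, and the sketch assumes dates are ISO strings that sort correctly as text. For the last A of each user it treats `cnt + 1` as the position of the "next A", so that events after it are still counted.

```python
# Hypothetical sample rows (user, date, event) copied from the question.
ROWS = [
    (123, "2018-02-14", "X.Y.A"),
    (123, "2018-02-12", "X.Y.B"),
    (134, "2018-02-10", "Y.Z.A"),
    (123, "2018-02-11", "A"),
    (123, "2018-02-01", "X.Y.Z"),
    (134, "2018-02-05", "X.Y.B"),
    (134, "2018-02-04", "A"),
    (123, "2018-02-13", "A"),
]


def count_around_a(rows):
    """Return (user, date, before, after) for every 'A' row."""
    # Keep only 'A' and prefixed events, like the WHERE in the subquery.
    kept = [r for r in rows if r[2] == "A" or r[2].startswith("X.Y")]
    out = []
    for user in sorted({r[0] for r in kept}):
        events = sorted((r for r in kept if r[0] == user), key=lambda r: r[1])
        cnt = len(events)  # count(*) over (partition by userid)
        # 1-based positions of the 'A' rows, i.e. their row_number().
        a_pos = [i for i, r in enumerate(events, start=1) if r[2] == "A"]
        for k, pos in enumerate(a_pos):
            prev_pos = a_pos[k - 1] if k > 0 else 0                     # lag default
            next_pos = a_pos[k + 1] if k + 1 < len(a_pos) else cnt + 1  # lead default
            out.append((user, events[pos - 1][1], pos - prev_pos - 1, next_pos - pos - 1))
    return out


for row in count_around_a(ROWS):
    print(row)
```

Each count is just the gap between two consecutive A positions minus one, which is why the filtered subquery must contain nothing but A rows and the events being counted.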

The following is for BigQuery Standard SQL

#standardSQL
WITH grps AS (
  -- grp = running number of 'A' events seen so far for this user
  SELECT user, dt, event, 
    COUNTIF(event = 'A') OVER(PARTITION BY user ORDER BY dt) grp
  FROM `project.dataset.events`
)
SELECT dt, user, event, before, after 
FROM (
  SELECT dt, user, event, 
    -- before: prefixed events in the previous group; after: prefixed events in the group this 'A' opens
    COUNTIF(event LIKE 'X.Y.%') OVER(PARTITION BY user ORDER BY grp RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING ) before,
    COUNTIF(event LIKE 'X.Y.%') OVER(PARTITION BY user ORDER BY grp RANGE BETWEEN CURRENT ROW AND CURRENT ROW) after
  FROM grps
)
WHERE event = 'A'
-- ORDER BY user  

You can test / play with the above using dummy data from your example, as below

#standardSQL
WITH `project.dataset.events` AS (
  SELECT 123 user,  '2018-02-14' dt, 'X.Y.A' event UNION ALL
  SELECT 123,       '2018-02-13', 'A'     UNION ALL
  SELECT 123,       '2018-02-12', 'X.Y.B' UNION ALL
  SELECT 123,       '2018-02-11', 'A'     UNION ALL
  SELECT 123,       '2018-02-01', 'X.Y.Z' UNION ALL
  SELECT 134,       '2018-02-10', 'Y.Z.A' UNION ALL
  SELECT 134,       '2018-02-05', 'X.Y.B' UNION ALL
  SELECT 134,       '2018-02-04', 'A'     
), grps AS (
  SELECT user, dt, event, 
    COUNTIF(event = 'A') OVER(PARTITION BY user ORDER BY dt) grp
  FROM `project.dataset.events`
)
SELECT dt, user, event, before, after 
FROM (
  SELECT dt, user, event, 
    COUNTIF(event LIKE 'X.Y.%') OVER(PARTITION BY user ORDER BY grp RANGE BETWEEN 1 PRECEDING AND 1 PRECEDING ) before,
    COUNTIF(event LIKE 'X.Y.%') OVER(PARTITION BY user ORDER BY grp RANGE BETWEEN CURRENT ROW AND CURRENT ROW) after
  FROM grps
)
WHERE event = 'A'
ORDER BY user  

with the result

Row dt          user    event   before  after    
1   2018-02-11  123     A       1       1    
2   2018-02-13  123     A       1       1    
3   2018-02-04  134     A       0       1
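
As a cross-check of the grouping idea behind the BigQuery query, here is a minimal Python sketch: each 'A' starts a new group per user, events with the 'X.Y.' prefix are tallied per group, and an A in group g reports the tally of group g - 1 as before and of group g as after. The names `EVENTS` and `count_by_group` are made up for illustration, and the sketch assumes each user has at most one event per date.

```python
from collections import defaultdict

# Hypothetical sample rows (user, dt, event), the same dummy data as above.
EVENTS = [
    (123, "2018-02-14", "X.Y.A"),
    (123, "2018-02-13", "A"),
    (123, "2018-02-12", "X.Y.B"),
    (123, "2018-02-11", "A"),
    (123, "2018-02-01", "X.Y.Z"),
    (134, "2018-02-10", "Y.Z.A"),
    (134, "2018-02-05", "X.Y.B"),
    (134, "2018-02-04", "A"),
]


def count_by_group(rows):
    """Return (dt, user, before, after) for every 'A' row, ordered by user then dt."""
    by_user = defaultdict(list)
    for user, dt, event in rows:
        by_user[user].append((dt, event))

    out = []
    for user in sorted(by_user):
        grp = 0                      # running count of 'A' events (the grp column)
        per_grp = defaultdict(int)   # grp -> number of 'X.Y.' events in that group
        a_rows = []                  # (dt, grp) for each 'A'
        for dt, event in sorted(by_user[user]):
            if event == "A":
                grp += 1
                a_rows.append((dt, grp))
            elif event.startswith("X.Y."):
                per_grp[grp] += 1
        for dt, g in a_rows:
            # before: previous group's tally; after: tally of the group this A opens
            out.append((dt, user, per_grp[g - 1], per_grp[g]))
    return out


for row in count_by_group(EVENTS):
    print(row)
```

On the dummy data this yields the same three rows as the result table above.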