Calculating the average lag between timestamps in BigQuery

My data is structured roughly as follows:

client_id | visit_number | session_start_time | hit_count

I'm currently using:

SELECT client_id, visit_number,
       SUM(hit_count) OVER (PARTITION BY client_id ORDER BY visit_number),
       session_start_time - LAG(session_start_time) OVER (PARTITION BY client_id ORDER BY visit_number)
FROM session_table

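Ideally, I'd like a rolling sum of hits per client (this seems to work fine). The average delta between consecutive sessions would also be handy. I think my approach to computing the delta for the current session is correct, but I'm not sure of a sensible way to compute the average delta.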

One idea is to wrap the query above in a CTE and then compute the average with another window function, but I believe it can be done in a single query.

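If you want the average time between sessions up to the current session, you can compute it as the current session's start time minus the client's first start time, divided by one less than the number of sessions so far: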

SELECT client_id, visit_number,
       -- running total of hits per client
       SUM(hit_count) OVER (PARTITION BY client_id ORDER BY visit_number),
       -- gap between this session and the previous one (NULL for the first session)
       session_start_time - LAG(session_start_time) OVER (PARTITION BY client_id ORDER BY visit_number) AS delta,
       -- time elapsed since the client's first session, divided by the number of gaps so far
       (session_start_time -
        MIN(session_start_time) OVER (PARTITION BY client_id)
       ) / NULLIF(ROW_NUMBER() OVER (PARTITION BY client_id ORDER BY session_start_time) - 1, 0) AS avg_delta
FROM session_table;
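
The reason a single subtraction is enough is that the consecutive deltas telescope. As a short sketch, assume a client's session start times are t_1 < t_2 < ... < t_n in visit order (the symbols t_k and n are introduced here for illustration and do not appear in the query). The average of the n - 1 gaps up to session n is then:

```latex
\[
\frac{1}{n-1}\sum_{k=2}^{n}\bigl(t_k - t_{k-1}\bigr)
  = \frac{(t_2 - t_1) + (t_3 - t_2) + \dots + (t_n - t_{n-1})}{n-1}
  = \frac{t_n - t_1}{n-1}
\]
```

Here t_n corresponds to the current row's session_start_time, t_1 to the MIN(...) over the client's partition, and n to ROW_NUMBER(). For the first session n - 1 is 0, which is why the query wraps the divisor in NULLIF and returns NULL instead of failing on a division by zero. The identity relies on session_start_time increasing with visit_number. If that may not hold in your data, averaging the per-row deltas directly would be the safer route.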