Adding grouping to the framing clause of a window when creating partitions

Taking the dataset hosted on Google (MLB data) as an example, this is what I am trying to do: get the last 3 weeks of runs scored for a given venue.

My aggregated dataset looks like this, without the strikes_3wk column:

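The logic for the strikes_3wk column is to partition the aggregated dataset by venueName, order it by the YearWeek column, and then sum the aggregated strikes over the most recent 3 weeks.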

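This is the query I have written so far. The window function is where I think the logic needs to change. Is there a way to add grouping inside the window function, or is there another way to do this?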

In the image I have added a new column, 'expected', showing the values for two weeks.

select inr.*
       ,sum(inr.strikes) over (Venue_Week rows between current row and 2 following) as strikes_3wk
from
(
    select seasonType
        ,gameStatus
        ,homeTeamName
        ,awayTeamName
        ,venueName
        ,CAST(
        CONCAT(
            CAST(EXTRACT(YEAR FROM createdAt) as string)
            ,CAST(EXTRACT(WEEK(Monday) FROM createdAt) as string)
            ) as INT64)
            as YearWeek
        ,sum(homeFinalRuns) as homeFinalRuns
        ,sum(strikes) as strikes
    from  `bigquery-public-data.baseball.games_wide`
    where   createdAt is not null
    group by seasonType
        ,gameStatus
        ,homeTeamName
        ,awayTeamName
        ,venueName
        ,YearWeek
)inr
window Venue_Week as (
    partition by inr.venueName
    order by inr.YearWeek desc
)

So you are looking for strikes per venue, regardless of which team they came from, right?

Maybe something like this:

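-- One row per venue and week (YearWeek is derived here; it is not a column
-- of games_wide). YEAR * 100 + WEEK keeps the key sortable across years.
WITH weekly AS (
  SELECT venueName,
         EXTRACT(YEAR FROM createdAt) * 100
           + EXTRACT(WEEK(MONDAY) FROM createdAt) AS YearWeek,
         SUM(strikes) AS strikes
  FROM `bigquery-public-data.baseball.games_wide`
  WHERE createdAt IS NOT NULL
  GROUP BY venueName, YearWeek
)
SELECT INR.*, STATS.strikes_3wk
FROM `bigquery-public-data.baseball.games_wide` INR
  LEFT JOIN (
    SELECT venueName, SUM(strikes) AS strikes_3wk
    FROM (
      -- BigQuery has no TOP, so rank each venue's weeks, newest first.
      SELECT *, ROW_NUMBER() OVER (PARTITION BY venueName ORDER BY YearWeek DESC) AS rn
      FROM weekly
    )
    WHERE rn <= 3  -- keep the 3 most recent weeks for each venue
    GROUP BY venueName
  ) STATS
    ON INR.venueName = STATS.venueName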
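
If what you need is the rolling figure described in the question (each row gets the strikes for its own week plus the two weeks before it, per venue), a RANGE frame may be a better fit than ROWS. ROWS counts physical rows, and the aggregated data has several rows per venue and week, so "current row and 2 following" does not line up with three weeks. RANGE works on the value of the ORDER BY key, so all rows in the same week are included together. Below is a minimal sketch, not run against the live dataset. The names week_start and venue_week are introduced here for illustration, and it assumes createdAt can be passed to DATE(). It also replaces the concatenated YearWeek key, because without zero padding week 9 of 2017 becomes 20179, which sorts before 201610.

```sql
WITH inr AS (
  SELECT seasonType, gameStatus, homeTeamName, awayTeamName, venueName,
         -- Monday that starts the week; a DATE sorts correctly across years.
         DATE_TRUNC(DATE(createdAt), WEEK(MONDAY)) AS week_start,
         SUM(homeFinalRuns) AS homeFinalRuns,
         SUM(strikes) AS strikes
  FROM `bigquery-public-data.baseball.games_wide`
  WHERE createdAt IS NOT NULL
  GROUP BY seasonType, gameStatus, homeTeamName, awayTeamName, venueName, week_start
)
SELECT inr.*,
       SUM(inr.strikes) OVER venue_week AS strikes_3wk
FROM inr
WINDOW venue_week AS (
  PARTITION BY inr.venueName
  -- RANGE needs a numeric key: days since 1970-01-01 for the week's Monday.
  ORDER BY UNIX_DATE(inr.week_start)
  -- 14 days back covers this week's Monday and the two Mondays before it.
  RANGE BETWEEN 14 PRECEDING AND CURRENT ROW
)
```

This treats "3 weeks" as calendar weeks, so a week with no games at a venue simply contributes nothing. If you instead want the last 3 weeks in which the venue actually hosted games, one option is to compute DENSE_RANK() over week_start per venue in a subquery and use that rank as the ORDER BY key with RANGE BETWEEN 2 PRECEDING AND CURRENT ROW.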