How to write a subquery which references the original table for an aggregation?
I have two tables that track the details of games and how each team performed in them. The schema is roughly:
Game
- id
- date [TIMESTAMPTZ]
- team_a_id
- team_b_id
TeamStats
- game_id
- team_id
- stat_a [INTEGER]
- stat_b [INTEGER]
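For each game, I want to summarize how each team performed across all of its earlier games. The desired output looks like this, where the average columns are taken over all games dated before the given game_id: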
- game_id
- team_a_avg_stat_a
- team_a_avg_stat_b
- team_b_avg_stat_a
- team_b_avg_stat_b
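I thought I wanted something like the query below, which joins the game table to a subquery that averages a given team's stats over a given time range: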
-- Example for team_a, would repeat with another join for team_b
SELECT g.id, ats.avg_stat_a as team_a_avg_stat_a, ats.avg_stat_b as team_a_avg_stat_b
FROM game g
INNER JOIN LATERAL (
SELECT game_id, AVG(stat_a) AS avg_stat_a, AVG(stat_b) as avg_stat_b
FROM teamstats its
INNER JOIN game ig
ON its.game_id = ig.id
WHERE ig.date < g.date AND its.team_id = g.team_a_id
GROUP BY its.game_id
) ats
ON ats.game_id = g.id;
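However, when I run the query above I get zero rows. I expected one result row for every row in the game table.
My first attempt did not use a lateral join at all, but when I tried that I got the following error, which is what sent me down the correlated-subquery path: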
/* ERROR: invalid reference to FROM-clause entry for table "g"
LINE 8: WHERE ig.date < g.date AND its.team_id = g.team_a_id
^
HINT: There is an entry for table "g", but it cannot be referenced from this part of the query. */
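What am I missing?
There is also a constraint I forgot to mention at first: I want to be able to restrict the averages to games within a specific period (for example, 90 days) before the game's date.
Hmmm . . . I think window functions do what you want: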
select g.*,
avg(ts_a.stat_a) over (partition by ts_a.team_id order by g.date) as avg_a_a,
avg(ts_a.stat_b) over (partition by ts_a.team_id order by g.date) as avg_a_b,
avg(ts_b.stat_a) over (partition by ts_b.team_id order by g.date) as avg_b_a,
avg(ts_b.stat_b) over (partition by ts_b.team_id order by g.date) as avg_b_b
from game g join
teamstats ts_a
on ts_a.game_id = g.id and ts_a.team_id = g.team_a_id join
teamstats ts_b
on ts_b.game_id = g.id and ts_b.team_id = g.team_b_id
I think you can use window functions, but you need a rows frame so that only previous games are considered:
-- tsa / tsb are the teamstats rows for team A and team B of each game
select g.id,
avg(tsa.stat_a) over(
partition by tsa.team_id
order by g.date rows between unbounded preceding and 1 preceding
) team_a_avg_stat_a,
avg(tsa.stat_b) over(
partition by tsa.team_id
order by g.date rows between unbounded preceding and 1 preceding
) team_a_avg_stat_b,
avg(tsb.stat_a) over(
partition by tsb.team_id
order by g.date rows between unbounded preceding and 1 preceding
) team_b_avg_stat_a,
avg(tsb.stat_b) over(
partition by tsb.team_id
order by g.date rows between unbounded preceding and 1 preceding
) team_b_avg_stat_b
from game g
inner join teamstats tsa
on tsa.game_id = g.id
and tsa.team_id = g.team_a_id
inner join teamstats tsb
on tsb.game_id = g.id
and tsb.team_id = g.team_b_id
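As for why the LATERAL attempt in the question returns zero rows: the subquery groups by its.game_id, so every row it produces carries the id of an earlier game, and the outer condition ON ats.game_id = g.id can never match. The earlier error has a related cause: without the LATERAL keyword, a subquery in FROM is not allowed to reference g at all. A minimal sketch of a corrected version, assuming PostgreSQL and the table and column names from the question, drops the GROUP BY, joins ON true, and adds the 90-day restriction. The aliases a and b are only illustrative.

```sql
-- Sketch for PostgreSQL; names follow the schema in the question.
SELECT g.id         AS game_id,
       a.avg_stat_a AS team_a_avg_stat_a,
       a.avg_stat_b AS team_a_avg_stat_b,
       b.avg_stat_a AS team_b_avg_stat_a,
       b.avg_stat_b AS team_b_avg_stat_b
FROM game g
-- An aggregate without GROUP BY always yields exactly one row,
-- so every game keeps its row even when there is no history.
LEFT JOIN LATERAL (
    SELECT AVG(its.stat_a) AS avg_stat_a,
           AVG(its.stat_b) AS avg_stat_b
    FROM teamstats its
    INNER JOIN game ig ON its.game_id = ig.id
    WHERE its.team_id = g.team_a_id
      AND ig.date <  g.date                        -- earlier games only
      AND ig.date >= g.date - INTERVAL '90 days'   -- optional lookback window
) a ON true
LEFT JOIN LATERAL (
    SELECT AVG(its.stat_a) AS avg_stat_a,
           AVG(its.stat_b) AS avg_stat_b
    FROM teamstats its
    INNER JOIN game ig ON its.game_id = ig.id
    WHERE its.team_id = g.team_b_id
      AND ig.date <  g.date
      AND ig.date >= g.date - INTERVAL '90 days'
) b ON true;
```

Because the subqueries filter teamstats by team_id rather than by slot, a team's earlier games count whether it was listed as team_a or team_b, which the window-function queries above do not cover since they partition only on the team in one slot. The averages come back as NULL for a team with no games inside the window, and removing the INTERVAL condition gives the all-previous-games version.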