Compute running sum in a window function
I have this running sum problem in Redshift (which uses Postgres 8):
select extract(month from registration_time) as month
, extract(week from registration_time)%4+1 as week
, extract(day from registration_time) as day
, count(*) as count_of_users_registered
, sum(count(*)) over (ORDER BY (1,2,3))
from loyalty.v_user
group by 1,2,3
order by 1,2,3
;
The error I get is:
ERROR: 42601: Aggregate window functions with an ORDER BY clause require a frame clause
You can run a window function on the result of an aggregate function on the same query level. It is just much simpler to use a subquery in this case:
SELECT *, sum(count_registered_users) OVER (ORDER BY month, week, day) AS running_sum
FROM (
SELECT extract(month FROM registration_time)::int AS month
, extract(week FROM registration_time)::int%4+1 AS week
, extract(day FROM registration_time)::int AS day
, count(*) AS count_registered_users
FROM loyalty.v_user
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3
) sub;
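The error message in the question asks for a frame clause, which the query above does not spell out. Here is a minimal sketch of the same subquery approach with an explicit frame, assuming Redshift accepts ROWS UNBOUNDED PRECEDING here (the form used in its own window function examples). I also moved the ORDER BY from the subquery to the outer query, since a sort inside the subquery does not guarantee the order of the final result.

```sql
-- Sketch: running sum over the grouped rows, with an explicit frame clause.
-- ROWS UNBOUNDED PRECEDING = from the first row of the window up to the current row.
SELECT *
     , sum(count_registered_users) OVER (ORDER BY month, week, day
                                         ROWS UNBOUNDED PRECEDING) AS running_sum
FROM  (
   SELECT extract(month FROM registration_time)::int     AS month
        , extract(week  FROM registration_time)::int%4+1 AS week
        , extract(day   FROM registration_time)::int     AS day
        , count(*) AS count_registered_users
   FROM   loyalty.v_user
   GROUP  BY 1, 2, 3
   ) sub
ORDER  BY month, week, day;
```

Because each (month, week, day) combination appears only once after GROUP BY, a ROWS frame and a RANGE frame should give the same running sum here.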
I also fixed the syntax of the expression computing week. extract() returns double precision, but the modulo operator % does not accept double precision numbers. I cast all three to integer while at it.
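To see the cast in isolation, here is a small sketch. The timestamp literal is hypothetical and only stands in for registration_time.

```sql
-- Cast first, then apply modulo: extract() yields double precision here,
-- and % needs an integer operand.
SELECT extract(week FROM timestamp '2015-03-10 12:00')::int % 4 + 1 AS week;

-- Without the cast, the same expression is expected to fail on this Postgres version:
-- SELECT extract(week FROM timestamp '2015-03-10 12:00') % 4 + 1;
```

Note that the cast binds tighter than %, so ::int applies to the result of extract() only, not to the whole expression.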
As has been pointed out, you cannot use positional references in the ORDER BY clause of a window function (unlike in the ORDER BY clause of the query).
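However, you cannot use over (order by registration_time) in this query either, because you are grouping by month, week, day. registration_time is neither aggregated nor listed in the GROUP BY clause. At that stage of query evaluation, you can no longer access the column.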
You could repeat the expressions of the first three SELECT items in the ORDER BY clause of the window function to make it work:
SELECT extract(month FROM registration_time)::int AS month
, extract(week FROM registration_time)::int%4+1 AS week
, extract(day FROM registration_time)::int AS day
, count(*) AS count_registered_users
, sum(count(*)) OVER (ORDER BY
extract(month FROM registration_time)::int
, extract(week FROM registration_time)::int%4+1
, extract(day FROM registration_time)::int) AS running_sum
FROM loyalty.v_user
GROUP BY 1, 2, 3
ORDER BY 1, 2, 3;
But that seems rather noisy. (Performance would be good, though.)
Aside: I do wonder about the purpose behind week%4+1 ... The whole query might be simpler.
Related:
- Get the distinct sum of a joined table column
- PostgreSQL: running count of rows for a query 'by minute'