How to count rows after the occurrence of a value by group (PostgreSQL)
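For example, I have the following table:

Name | Day | Healthy |
---|---|---|
Jon | 1 | No |
Jon | 2 | Yes |
Jon | 3 | Yes |
Jon | 4 | Yes |
Jon | 5 | No |
Mary | 1 | Yes |
Mary | 2 | No |
Mary | 3 | Yes |
Mary | 4 | No |
Mary | 5 | Yes |
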
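I want to add a column that counts the number of days a person was healthy from day X onward (inclusive):

Name | Day | Healthy | Number of days the person was healthy after day X (incl.) |
---|---|---|---|
Jon | 1 | No | 3 |
Jon | 2 | Yes | 3 |
Jon | 3 | Yes | 2 |
Jon | 4 | Yes | 1 |
Jon | 5 | No | 0 |
Mary | 1 | Yes | 3 |
Mary | 2 | No | 2 |
Mary | 3 | Yes | 2 |
Mary | 4 | No | 1 |
Mary | 5 | Yes | 1 |
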
Is it possible to create such a column using some kind of window function? Thank you very much for your help!
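There are a couple of ways to do this with window functions. One is to order by day descending and use the default window frame. The other is to specify a frame that runs from the current row to the end of the partition.

This example casts the boolean `healthy` to `int` so that it can be summed. If your table has literal `Yes` and `No` strings instead, you can use `sum((healthy = 'Yes')::int) over (...)` to achieve the same thing. Note that the comparison is case-sensitive, so the literal must match the stored spelling.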

    select name, day,
           sum(healthy::int)
             over (partition by name
                       order by day
                   rows between current row
                            and unbounded following) as num_subsequent_health_days
      from my_table;

name | day | num_subsequent_health_days
:--- | --: | -------------------------:
Jon | 1 | 3
Jon | 2 | 3
Jon | 3 | 2
Jon | 4 | 1
Jon | 5 | 0
Mary | 1 | 3
Mary | 2 | 2
Mary | 3 | 2
Mary | 4 | 1
Mary | 5 | 1
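
For the case where `healthy` is stored as text rather than as a boolean, the comparison-and-cast variant mentioned above might look like the following sketch. The inline `my_table` CTE is only stand-in sample data copied from the question, and it assumes the strings are spelled exactly `Yes` and `No`.

```sql
-- Stand-in sample data (assumption): healthy stored as text, not boolean.
with my_table(name, day, healthy) as (
    values ('Jon', 1, 'No'),   ('Jon', 2, 'Yes'),  ('Jon', 3, 'Yes'),
           ('Jon', 4, 'Yes'),  ('Jon', 5, 'No'),
           ('Mary', 1, 'Yes'), ('Mary', 2, 'No'),  ('Mary', 3, 'Yes'),
           ('Mary', 4, 'No'),  ('Mary', 5, 'Yes')
)
select name, day, healthy,
       -- The comparison yields a boolean; casting it to int gives 1 or 0.
       sum((healthy = 'Yes')::int)
         over (partition by name
                   order by day
               rows between current row
                        and unbounded following) as num_subsequent_health_days
  from my_table
 order by name, day;
```

With these sample rows the last column should match the expected column from the question.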
I assume your relation has the following schema:

    CREATE TABLE test(name text, day int, healthy boolean);

Then this should produce the expected result:

    SELECT name, day,
           sum(mapped) OVER (PARTITION BY name
                             ORDER BY day DESC
                             RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    FROM (SELECT name, day,
                 CASE WHEN healthy THEN 1 ELSE 0 END AS mapped
          FROM test) sub
    ORDER BY name, day;

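
Because `RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW` is the default frame whenever a window has an `ORDER BY`, the frame clause can also be left out, which gives the descending-order approach with the default window. A minimal sketch, assuming the `test` schema above filled with the rows from the question; the temporary table, the `FILTER` form of the count, and the alias `healthy_days_from_here` are illustrative choices, not part of the original answers:

```sql
-- Assumed schema from above, created as a temporary table so the sketch is self-contained.
CREATE TEMP TABLE test(name text, day int, healthy boolean);

-- Sample rows copied from the question (Yes -> true, No -> false).
INSERT INTO test(name, day, healthy) VALUES
    ('Jon', 1, false),  ('Jon', 2, true),   ('Jon', 3, true),
    ('Jon', 4, true),   ('Jon', 5, false),
    ('Mary', 1, true),  ('Mary', 2, false), ('Mary', 3, true),
    ('Mary', 4, false), ('Mary', 5, true);

SELECT name, day,
       -- Walking the days in descending order, the default frame covers
       -- every row from the last day back to the current one.
       count(*) FILTER (WHERE healthy)
           OVER (PARTITION BY name ORDER BY day DESC) AS healthy_days_from_here
FROM test
ORDER BY name, day;
```

The default frame also includes peer rows that tie on `day`, so this relies on each person having at most one row per day.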