Window function within group by in PostgreSQL
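For each user, I need to count how many times the action field changes to the value 1. If the first entry is 1, that counts too. The rows are stored out of order, but they should be counted in order of action_date.
In other words, what I think needs to be done is: group the rows by user_id, order them by timestamp, and then count how often action = 1 and action differs from the previous row.
Example: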
create table t (
user_id int,
action_date timestamp,
action int
);
Insert into t(user_id, action_date, action)
values
(1, '2017-01-01 00:00:00', 1),
(2, '2017-01-01 00:00:00', 0),
(1, '2017-01-03 00:00:00', 1),
(2, '2017-01-03 00:00:00', 0),
(1, '2017-01-02 00:00:00', 1),
(2, '2017-01-02 00:00:00', 1),
(1, '2017-01-04 00:00:00', 1),
(2, '2017-01-04 00:00:00', 1);
The result should be:
user_id | count
---------+-------
1 | 1
2 | 2
With the help of an earlier answer, I can get the result for a single account this way:
select user_id, count(*)
from (select user_id, action_date,action,lag(action) over(order by action_date) as prev_action
from t where user_id=2
) t
where (action<>prev_action and action=1) or (action=1 and prev_action is null)
group by user_id;
but I am stuck trying to extend it to all users.
Use the lag() function together with partition by, so that the previous row is looked up within each user's own rows:
select user_id, count(*)
from (select t.*,
lag(action) over (partition by user_id order by action_date) as prev_action
from t
) t
where (action = 1) and (prev_action is distinct from 1)
group by user_id;
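One thing to note about the query above: because the condition sits in the WHERE clause, a user with no qualifying rows drops out of the result entirely. If such users should appear with a count of 0, a possible variant moves the condition into the aggregate. This is only a sketch, assuming PostgreSQL 9.4 or later for the aggregate FILTER clause; the subquery alias s is just an illustrative name.

```sql
-- Sketch: same counting rule, but every user_id is kept in the output.
-- Assumes PostgreSQL 9.4+ (aggregate FILTER clause).
select user_id,
       -- count only rows where action is 1 and the previous action
       -- (per user, by action_date) is not 1; "is distinct from"
       -- treats a NULL prev_action (the user's first row) as different from 1
       count(*) filter (where action = 1
                          and prev_action is distinct from 1) as count
from (select t.*,
             lag(action) over (partition by user_id
                               order by action_date) as prev_action
      from t
     ) s
group by user_id
order by user_id;
```

On the sample data this gives the same counts as the query above (1 for user 1, 2 for user 2); the difference only shows up for a user whose action never switches to 1.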