Finding the length of a series in Postgres
A tricky query for Postgres. Imagine I have a set of rows with a boolean column called, say, success. Like this:
id | success
9 | false
8 | false
7 | true
6 | true
5 | true
4 | false
3 | false
2 | true
1 | false
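I need to compute the length of the latest (un)successful series. In this case that would be "3" for the successful one and "2" for the unsuccessful one. Or, with window functions, something like this: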
id | success | length
9 | false | 2
8 | false | 2
7 | true | 3
6 | true | 3
5 | true | 3
4 | false | 2
3 | false | 2
2 | true | 1
1 | false | 1
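(Note that I usually only need the length of the latest series, not all of them.)
The closest answer I have found so far is this article:
https://jaxenter.com/10-sql-tricks-that-you-didnt-think-were-possible-125934.html
(see #5)
However, Postgres does not support the "IGNORE NULLS" option, so the query does not work. Without "IGNORE NULLS" it just returns nulls in the length column.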
This is the closest I could get:
WITH
trx1(id, success, rn) AS (
SELECT id, success, row_number() OVER (ORDER BY id desc)
FROM results
),
trx2(id, success, rn, lo, hi) AS (
SELECT trx1.*,
CASE WHEN coalesce(lag(success) OVER (ORDER BY id DESC), FALSE) != success THEN rn END,
CASE WHEN coalesce(lead(success) OVER (ORDER BY id DESC), FALSE) != success THEN rn END
FROM trx1
)
SELECT trx2.*, 1
- last_value (lo) IGNORE nulls OVER (ORDER BY id DESC ROWS BETWEEN
UNBOUNDED PRECEDING AND CURRENT ROW)
+ first_value(hi) OVER (ORDER BY id DESC ROWS BETWEEN CURRENT ROW
AND UNBOUNDED FOLLOWING)
AS length FROM trx2;
Do you have any ideas for a query like this?
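One possible way to get the attempt above running on Postgres, offered as a sketch rather than a definitive answer. Because rn only increases in the id DESC ordering, the latest non-null lo up to the current row is the same as max(lo) over that frame, and the next non-null hi is the same as min(hi), so IGNORE NULLS is not needed. The sketch also swaps the coalesce(..., FALSE) comparison for IS DISTINCT FROM, since the original leaves lo empty on the first row whenever that row is false. It assumes the results table from the query above, with a unique id.

```sql
WITH trx1 AS (
    -- Number the rows from the newest id to the oldest.
    SELECT id, success, row_number() OVER (ORDER BY id DESC) AS rn
    FROM results
),
trx2 AS (
    SELECT id, success, rn,
           -- lo is set on the first row of each series (newest side).
           CASE WHEN lag(success) OVER (ORDER BY id DESC)
                     IS DISTINCT FROM success THEN rn END AS lo,
           -- hi is set on the last row of each series (oldest side).
           CASE WHEN lead(success) OVER (ORDER BY id DESC)
                     IS DISTINCT FROM success THEN rn END AS hi
    FROM trx1
)
SELECT id, success,
       1
       -- max() skips nulls, so this is the start of the current series.
       - max(lo) OVER (ORDER BY id DESC
                       ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
       -- min() skips nulls, so this is the end of the current series.
       + min(hi) OVER (ORDER BY id DESC
                       ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
       AS length
FROM trx2
ORDER BY id DESC;
```

On the nine sample rows at the top of the question this gives lengths 2, 2, 3, 3, 3, 2, 2, 1, 1 for ids 9 down to 1.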
You can use the window function row_number() to identify the series:
select max(id) as max_id, success, count(*) as length
from (
select *, row_number() over wa - row_number() over wp as grp
from my_table
window
wp as (partition by success order by id desc),
wa as (order by id desc)
) s
group by success, grp
order by 1 desc
max_id | success | length
--------+---------+--------
9 | f | 2
7 | t | 3
4 | f | 2
2 | t | 1
1 | f | 1
(5 rows)
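Why this works: inside one series both row numbers advance by one per row, so their difference stays constant. When the other value appears, the overall counter keeps advancing while the per-value counter does not, so the next series of the same value gets a larger grp. The sketch below reuses that idea to produce the per-row length column from the question and then only the latest series. The temp table is a hypothetical setup that mirrors the sample data, and it assumes success is never null.

```sql
-- Hypothetical setup that mirrors the sample data from the question.
CREATE TEMP TABLE my_table (id int PRIMARY KEY, success boolean NOT NULL);
INSERT INTO my_table (id, success) VALUES
    (9, false), (8, false), (7, true), (6, true), (5, true),
    (4, false), (3, false), (2, true), (1, false);

-- Per-row series length: rows of one series share the same (success, grp).
SELECT id, success,
       count(*) OVER (PARTITION BY success, grp) AS length
FROM (
    SELECT id, success,
           row_number() OVER (ORDER BY id DESC)
         - row_number() OVER (PARTITION BY success ORDER BY id DESC) AS grp
    FROM my_table
) s
ORDER BY id DESC;

-- Only the latest series: keep the group that contains the highest id.
SELECT success, count(*) AS length
FROM (
    SELECT id, success,
           row_number() OVER (ORDER BY id DESC)
         - row_number() OVER (PARTITION BY success ORDER BY id DESC) AS grp
    FROM my_table
) s
GROUP BY success, grp
ORDER BY max(id) DESC
LIMIT 1;
```

With the sample data the first query returns lengths 2, 2, 3, 3, 3, 2, 2, 1, 1 for ids 9 down to 1, and the second returns a single row: success f, length 2.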
Although Klin's answer is entirely correct, I'd like to post another solution that a friend suggested:
with last_success as (
select max(id) id from my_table where success
)
select count(mt.id) last_fails_count
from my_table mt, last_success lt
where mt.id > lt.id;
--------------------
| last_fails_count |
--------------------
| 2 |
--------------------
If I only need the last series of failures or successes, it is twice as fast.
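Note that the query above counts only trailing failures. It returns 0 when the latest row is a success, and also when the table has no successful row at all, because max(id) is then null. A possible generalization that handles either kind of latest series is sketched below. It assumes the same my_table with a non-null success column; the CTE names latest and boundary are made up for illustration.

```sql
WITH latest AS (
    -- Value of the newest row: this decides which kind of series we count.
    SELECT success FROM my_table ORDER BY id DESC LIMIT 1
),
boundary AS (
    -- Newest row with the opposite value; null if no such row exists.
    SELECT max(mt.id) AS id
    FROM my_table mt
    CROSS JOIN latest l
    WHERE mt.success <> l.success
)
SELECT l.success, count(*) AS length
FROM my_table mt
CROSS JOIN latest l
CROSS JOIN boundary b
-- Count every row newer than the boundary, or all rows if there is none.
WHERE b.id IS NULL OR mt.id > b.id
GROUP BY l.success;
```

On the sample data this returns success f with length 2. On an empty table it returns no rows.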