Set all values in a column equal to the first non-null value over a window in Postgres
Take this table as an example:
+----+------------+------+
| id | date       | flag |
+----+------------+------+
| A  | 01/01/2020 |    0 |
| A  | 01/02/2020 |    0 |
| A  | 01/03/2020 |    0 |
| A  | 01/04/2020 |    1 |
| A  | 01/05/2020 |    1 |
| B  | 01/01/2020 |    0 |
| B  | 01/02/2020 |    1 |
| B  | 01/03/2020 |    1 |
| B  | 01/04/2020 |    1 |
| B  | 01/05/2020 |    1 |
+----+------------+------+
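Each row has a flag set to 0 or 1. I want to create a new column called day_flagged that contains the date on which the flag first became 1. For example, for id A that would be 01/04/2020, and for id B it would be 01/02/2020.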
This is what I have so far:
SELECT x.id,
       x.date,
       ( CASE
           WHEN prev_flag = 0
                AND next_flag = 1
                AND x.flag = 1 THEN 1
           ELSE NULL
         END ) AS flagged
FROM   (SELECT id,
               date,
               flag,
               Lag(flag)
                 OVER (
                   partition BY id
                   ORDER BY date ASC) AS prev_flag,
               Lead(flag)
                 OVER (
                   partition BY id
                   ORDER BY date ASC) AS next_flag
        FROM   tableA) AS x;
The result looks like this:
+----+------------+---------+
| id | date       | flagged |
+----+------------+---------+
| A  | 01/01/2020 |    null |
| A  | 01/02/2020 |    null |
| A  | 01/03/2020 |    null |
| A  | 01/04/2020 |       1 |
| A  | 01/05/2020 |    null |
| B  | 01/01/2020 |    null |
| B  | 01/02/2020 |       1 |
| B  | 01/03/2020 |    null |
| B  | 01/04/2020 |    null |
| B  | 01/05/2020 |    null |
+----+------------+---------+
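I was able to determine when the value of flag first changes from 0 to 1 for each id, and I store that in flagged. How can I get the date value of the row where flagged is 1 and insert that date as day_flagged into every row of the partition?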
Desired result:
+----+------------+------+-------------+
| id | date       | flag | day_flagged |
+----+------------+------+-------------+
| A  | 01/01/2020 |    0 | 01/04/2020  |
| A  | 01/02/2020 |    0 | 01/04/2020  |
| A  | 01/03/2020 |    0 | 01/04/2020  |
| A  | 01/04/2020 |    1 | 01/04/2020  |
| A  | 01/05/2020 |    1 | 01/04/2020  |
| B  | 01/01/2020 |    0 | 01/02/2020  |
| B  | 01/02/2020 |    1 | 01/02/2020  |
| B  | 01/03/2020 |    1 | 01/02/2020  |
| B  | 01/04/2020 |    1 | 01/02/2020  |
| B  | 01/05/2020 |    1 | 01/02/2020  |
+----+------------+------+-------------+
DB Fiddle: https://www.db-fiddle.com/f/wJsTnvNkYELHqLjHRx1pie/4
I understand that you want the date of the first 1 for each id.
If so, a conditional window min() seems to do what you need:
select
    t.*,
    min(date) filter(where flag = 1) over(partition by id) day_flagged
from tableA t
| id | date | flag | day_flagged |
| --- | ---------- | ---- | ----------- |
| A | 01/01/2020 | 0 | 01/04/2020 |
| A | 01/02/2020 | 0 | 01/04/2020 |
| A | 01/03/2020 | 0 | 01/04/2020 |
| A | 01/04/2020 | 1 | 01/04/2020 |
| A | 01/05/2020 | 1 | 01/04/2020 |
| B | 01/01/2020 | 0 | 01/02/2020 |
| B | 01/02/2020 | 1 | 01/02/2020 |
| B | 01/03/2020 | 1 | 01/02/2020 |
| B | 01/04/2020 | 1 | 01/02/2020 |
| B | 01/05/2020 | 1 | 01/02/2020 |
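For anyone who wants to try this without the fiddle, here is a minimal self-contained sketch. The temp table tablea_demo and the extra id C are hypothetical, added only for illustration. It assumes date is a real date column; if the dates were stored as MM/DD/YYYY text, min() would compare them as strings and could pick the wrong row once the data spans more than one year. The query uses a CASE expression inside min() instead of filter; it should give the same result and may be handy on databases that lack the filter clause.

```sql
-- Hypothetical demo table mirroring the sample data; the real tableA may differ.
CREATE TEMP TABLE tablea_demo (id text, date date, flag int);

INSERT INTO tablea_demo (id, date, flag) VALUES
    ('A', DATE '2020-01-01', 0),
    ('A', DATE '2020-01-02', 0),
    ('A', DATE '2020-01-03', 0),
    ('A', DATE '2020-01-04', 1),
    ('A', DATE '2020-01-05', 1),
    ('B', DATE '2020-01-01', 0),
    ('B', DATE '2020-01-02', 1),
    ('B', DATE '2020-01-03', 1),
    ('B', DATE '2020-01-04', 1),
    ('B', DATE '2020-01-05', 1),
    ('C', DATE '2020-01-01', 0);  -- assumed extra id that is never flagged

-- The CASE yields the date only for flagged rows and NULL otherwise.
-- min() ignores NULLs, so each partition gets its earliest flagged date.
SELECT t.*,
       min(CASE WHEN flag = 1 THEN date END)
           OVER (PARTITION BY id) AS day_flagged
FROM tablea_demo t
ORDER BY id, date;
```

Every row of A and B gets that id's earliest flagged date, and the row for C gets NULL in day_flagged because no row in its partition has flag = 1. Like the filter version, this returns the earliest date with flag = 1 even if the flag later goes back to 0.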