Set all values in a column equal to the first non-null value over a window in Postgres

Take this table as an example:

+----+------------+------+
| id |    date    | flag |
+----+------------+------+
| A  | 01/01/2020 |    0 |
| A  | 01/02/2020 |    0 |
| A  | 01/03/2020 |    0 |
| A  | 01/04/2020 |    1 |
| A  | 01/05/2020 |    1 |
| B  | 01/01/2020 |    0 |
| B  | 01/02/2020 |    1 |
| B  | 01/03/2020 |    1 |
| B  | 01/04/2020 |    1 |
| B  | 01/05/2020 |    1 |
+----+------------+------+

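Each row has a flag set to 0 or 1. I want to create a new column named day_flagged that contains the date on which the flag first became 1. For example, for id A that would be 01/04/2020, and for id B it would be 01/02/2020.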

This is what I have so far:

SELECT x.id, 
       x.date, 
       ( CASE 
           WHEN prev_flag = 0 
                AND next_flag = 1 
                AND x.flag = 1 THEN 1 
           ELSE NULL 
         END ) AS flagged 
FROM   (SELECT id, 
               date, 
               flag, 
               Lag(flag) 
                 OVER ( 
                   partition BY id 
                   ORDER BY date ASC) AS prev_flag, 
               Lead(flag) 
                 OVER ( 
                   partition BY id 
                   ORDER BY date ASC) AS next_flag 
        FROM   tableA) AS x;

The result looks like this:

+----+------------+---------+
| id |    date    | flagged |
+----+------------+---------+
| A  | 01/01/2020 | null    |
| A  | 01/02/2020 | null    |
| A  | 01/03/2020 | null    |
| A  | 01/04/2020 | 1       |
| A  | 01/05/2020 | null    |
| B  | 01/01/2020 | null    |
| B  | 01/02/2020 | 1       |
| B  | 01/03/2020 | null    |
| B  | 01/04/2020 | null    |
| B  | 01/05/2020 | null    |
+----+------------+---------+

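I'm able to identify when the value of flag first changes from 0 to 1 for each id, and I store that in flagged. How can I take the date value of the row where flagged is 1 and insert that date as day_flagged for every row of the partition?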

Desired result:

+----+------------+------+-------------+
| id |    date    | flag | day_flagged |
+----+------------+------+-------------+
| A  | 01/01/2020 |    0 | 01/04/2020  |
| A  | 01/02/2020 |    0 | 01/04/2020  |
| A  | 01/03/2020 |    0 | 01/04/2020  |
| A  | 01/04/2020 |    1 | 01/04/2020  |
| A  | 01/05/2020 |    1 | 01/04/2020  |
| B  | 01/01/2020 |    0 | 01/02/2020  |
| B  | 01/02/2020 |    1 | 01/02/2020  |
| B  | 01/03/2020 |    1 | 01/02/2020  |
| B  | 01/04/2020 |    1 | 01/02/2020  |
| B  | 01/05/2020 |    1 | 01/02/2020  |
+----+------------+------+-------------+

DB Fiddle: https://www.db-fiddle.com/f/wJsTnvNkYELHqLjHRx1pie/4

I understand that you want the date of the first 1 for each id.

If so, a conditional window min() seems to do what you need:

select
    t.*,
    min(date) filter(where flag = 1) over(partition by id) day_flagged
from tableA t

Demo on DB Fiddle:

| id  | date       | flag | day_flagged |
| --- | ---------- | ---- | ----------- |
| A   | 01/01/2020 | 0    | 01/04/2020  |
| A   | 01/02/2020 | 0    | 01/04/2020  |
| A   | 01/03/2020 | 0    | 01/04/2020  |
| A   | 01/04/2020 | 1    | 01/04/2020  |
| A   | 01/05/2020 | 1    | 01/04/2020  |
| B   | 01/01/2020 | 0    | 01/02/2020  |
| B   | 01/02/2020 | 1    | 01/02/2020  |
| B   | 01/03/2020 | 1    | 01/02/2020  |
| B   | 01/04/2020 | 1    | 01/02/2020  |
| B   | 01/05/2020 | 1    | 01/02/2020  |
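
For anyone who wants to try this without the fiddle, here is a minimal, self-contained sketch. The column types (text, date, int) and the reading of the sample dates as January 1 to 5, 2020 are assumptions, since the question does not show the schema. The second query is an equivalent form using CASE instead of FILTER, which may help on PostgreSQL versions older than 9.4, where FILTER is not available.

```sql
-- Hypothetical setup mirroring the question's sample data.
-- Assumed types: id text, date date, flag int (the real schema is not shown).
CREATE TEMP TABLE tableA (id text, date date, flag int);

INSERT INTO tableA (id, date, flag) VALUES
    ('A', '2020-01-01', 0),
    ('A', '2020-01-02', 0),
    ('A', '2020-01-03', 0),
    ('A', '2020-01-04', 1),
    ('A', '2020-01-05', 1),
    ('B', '2020-01-01', 0),
    ('B', '2020-01-02', 1),
    ('B', '2020-01-03', 1),
    ('B', '2020-01-04', 1),
    ('B', '2020-01-05', 1);

-- FILTER limits which rows feed min(); the window still spans the whole
-- partition, so every row of an id receives the same day_flagged value.
SELECT t.*,
       min(date) FILTER (WHERE flag = 1) OVER (PARTITION BY id) AS day_flagged
FROM   tableA t
ORDER  BY id, date;

-- Equivalent without FILTER: CASE yields NULL for flag = 0 rows,
-- and min() ignores NULLs.
SELECT t.*,
       min(CASE WHEN flag = 1 THEN date END) OVER (PARTITION BY id) AS day_flagged
FROM   tableA t
ORDER  BY id, date;
```

Two things to keep in mind: an id that never has flag = 1 gets NULL in day_flagged, and if the date column is actually stored as text in MM/DD/YYYY form, min() compares strings, which only sorts correctly while all dates fall in the same year.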