SELECT the previous group's value based on a condition in Google BigQuery

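The goal is to populate every applicable row with the value of the previous group, subject to a condition. The condition is specified in the STATUS column. I need to adjust the query for the condition STATUS = A.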

The data looks like this:

DATE        ID  VALUE          GROUP_ID  STATUS
2021-06-01  1   New York       1         A
2021-06-02  1   New York       1         A
2021-06-03  1   New York       1         B
2021-06-04  1   New York       1         A
2021-06-05  1   Boston         2         A
2021-06-06  1   Boston         2         A
2021-06-07  1   San Francisco  3         A
2021-06-08  1   San Francisco  3         A
2021-06-09  1   New York       4         A

Expected result:

DATE        ID  VALUE          GROUP_ID  STATUS  PREVIOUS_VALUE
2021-06-01  1   New York       1         A       NA
2021-06-02  1   New York       1         A       NA
2021-06-03  1   New York       1         B       NA
2021-06-04  1   New York       1         A       NA
2021-06-05  1   Boston         2         A       New York
2021-06-06  1   Boston         2         A       New York
2021-06-07  1   San Francisco  3         A       Boston
2021-06-08  1   San Francisco  3         A       Boston
2021-06-09  1   New York       4         A       San Francisco

What I have tried so far:

select *, last_value(VALUE IGNORE NULLS) OVER (partition by ID, GROUP_ID order by DATE) from (
table) A;

select *, lag(VALUE) OVER (partition by ID, GROUP_ID order by DATE) from (
table) A;

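My fallback plan is to create a table that holds the unique values according to the condition and then run an UPDATE statement based on the closest smaller GROUP_ID, but I would prefer a more sustainable solution.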

Thanks.

First edit: the goal is to exclude the value 'Boston'. I tried using case when value not in ('Boston') then ...

DATE        ID  VALUE          GROUP_ID  STATUS  PREVIOUS_VALUE
2021-06-01  1   New York       1         A       NA
2021-06-02  1   New York       1         A       NA
2021-06-03  1   New York       1         B       NA
2021-06-04  1   New York       1         A       NA
2021-06-05  1   Boston         2         A       New York
2021-06-06  1   Boston         2         A       New York
2021-06-07  1   San Francisco  3         A       San Francisco
2021-06-08  1   San Francisco  3         A       San Francisco
2021-06-09  1   New York       4         A       San Francisco

How about using a window frame specification?

select t.*,
       max(value) over (order by group_id
                        range between 1 preceding and 1 preceding
                       ) as prev_value
from t;
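
To make the frame's behavior concrete, here is a self-contained sketch that applies the same window to the sample rows from the question. The inline t CTE, the column name dt (a stand-in for the DATE column), and the partition by id clause are assumptions added for illustration; the sample holds a single ID, so the partition does not change the result here. With order by group_id, the frame range between 1 preceding and 1 preceding covers exactly the rows whose group_id equals the current group_id minus 1, so max(value) returns the previous group's value. If a previous group held several distinct values, max would return the alphabetically largest one.

```sql
-- Inline copy of the question's sample rows (assumption: dt stands in for DATE).
with t as (
  select date '2021-06-01' as dt, 1 as id, 'New York' as value, 1 as group_id, 'A' as status union all
  select date '2021-06-02', 1, 'New York',      1, 'A' union all
  select date '2021-06-03', 1, 'New York',      1, 'B' union all
  select date '2021-06-04', 1, 'New York',      1, 'A' union all
  select date '2021-06-05', 1, 'Boston',        2, 'A' union all
  select date '2021-06-06', 1, 'Boston',        2, 'A' union all
  select date '2021-06-07', 1, 'San Francisco', 3, 'A' union all
  select date '2021-06-08', 1, 'San Francisco', 3, 'A' union all
  select date '2021-06-09', 1, 'New York',      4, 'A'
)
select t.*,
       -- The frame holds only rows with group_id = current group_id - 1.
       max(value) over (partition by id
                        order by group_id
                        range between 1 preceding and 1 preceding
                       ) as prev_value
from t
order by dt;
```

On this sample, the group 1 rows get NULL (the NA of the expected result), the Boston rows get New York, the San Francisco rows get Boston, and the final New York row gets San Francisco.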

If group_id is sequential but has gaps, you can use dense_rank() to generate a gap-free one:

select t.*,
       max(value) over (order by dense_group_id
                        range between 1 preceding and 1 preceding
                       ) as prev_value
from (select t.*,
             dense_rank() over (order by group_id) as dense_group_id
      from t
     ) t
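
The question also asks for the STATUS = A condition, which the queries above do not apply. One possible way to fold it in, offered as a sketch rather than a confirmed solution, is to null out the non-matching rows inside the aggregate so that only rows with status 'A' in the previous group can contribute. The sample rows, the gapped group_id values (10, 25, 40), and the partition by id clause are hypothetical and exist only to make the example self-contained.

```sql
-- Hypothetical sample with gaps in group_id.
with t as (
  select 1 as id, 'New York' as value, 10 as group_id, 'A' as status union all
  select 1, 'New York',      10, 'B' union all
  select 1, 'Boston',        25, 'A' union all
  select 1, 'San Francisco', 40, 'A'
),
ranked as (
  select t.*,
         -- Gap-free group number per id: 10 -> 1, 25 -> 2, 40 -> 3.
         dense_rank() over (partition by id order by group_id) as dense_group_id
  from t
)
select ranked.*,
       -- Assumption: only status 'A' rows of the previous group should count.
       max(if(status = 'A', value, null))
         over (partition by id
               order by dense_group_id
               range between 1 preceding and 1 preceding
              ) as prev_value
from ranked
order by group_id, status;
```

Whether rows whose own status is not 'A' should still receive a prev_value is not stated in the question; this sketch fills them like any other row.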