Inconsistencies while using LEAD and LAG when trying to replace null values on Snowflake

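I'm new to Snowflake and am trying to write a SQL query that replaces null values with the last recorded ip for each id, based on date. The ids are in descending order, and the dates within each id are also in descending order.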

Here is my example table

Table name: customers

id   date                 ip
113  2022-02-05 11:40:42  Null
113  2021-12-05 11:40:42  Null
113  2021-08-05 11:40:42  Null
113  2021-07-05 11:40:42  Null
113  2022-02-05 11:40:42  83:93:225:63
112  2022-02-05 11:40:42  Null
112  2021-02-05 11:40:42  3:9:225:63
112  2020-02-05 11:40:42  8:9:225:63

What I want to achieve

id   date                 ip
113  2022-02-05 11:40:42  83:93:225:63
113  2021-12-05 11:40:42  83:93:225:63
113  2021-08-05 11:40:42  83:93:225:63
113  2021-07-05 11:40:42  83:93:225:63
113  2022-02-05 11:40:42  83:93:225:63
112  2022-02-05 11:40:42  3:9:225:63
112  2021-02-05 11:40:42  3:9:225:63
112  2020-02-05 11:40:42  8:9:225:63

I wasn't sure whether you wanted the "next", "last", or "prior" value, so all three are here.

Edited to use the same sort order you asked for:

SELECT column1 as id
    ,to_date(column2) as date
    ,column3 as ip
    -- last non-null ip in the id's partition, ordered by date
    ,LAST_VALUE(ip) IGNORE NULLS OVER (PARTITION BY id ORDER BY date) last_value
    ,NVL(ip, last_value) AS current_or_last
    -- nearest non-null ip from a later row (by date)
    ,LEAD(ip) IGNORE NULLS OVER (PARTITION BY id ORDER BY date) as next_value
    ,NVL(ip, next_value) AS current_or_next
    -- nearest non-null ip from an earlier row (by date)
    ,LAG(ip) IGNORE NULLS OVER (PARTITION BY id ORDER BY date) as prior_value
    ,NVL(ip, prior_value) AS current_or_prior
FROM VALUES
    (113,'2022-02-05 11:40:42',null),
    (113,'2021-12-05 11:40:42',null),
    (113,'2021-08-05 11:40:42',null),
    (113,'2021-07-05 11:40:42',null),
    (113,'2022-02-05 11:40:42','83:93:225:63'),
    (112,'2022-02-05 11:40:42',null),
    (112,'2021-02-05 11:40:42','3:9:225:63'),
    (112,'2020-02-05 11:40:42','8:9:225:63')
ORDER BY 1 DESC,2 DESC;

Which gives:

ID   DATE        IP            LAST_VALUE    CURRENT_OR_LAST  NEXT_VALUE    CURRENT_OR_NEXT  PRIOR_VALUE   CURRENT_OR_PRIOR
113  2022-02-05  NULL          83:93:225:63  83:93:225:63     83:93:225:63  83:93:225:63     NULL          NULL
113  2022-02-05  83:93:225:63  83:93:225:63  83:93:225:63     NULL          83:93:225:63     NULL          83:93:225:63
113  2021-12-05  NULL          83:93:225:63  83:93:225:63     83:93:225:63  83:93:225:63     NULL          NULL
113  2021-08-05  NULL          83:93:225:63  83:93:225:63     83:93:225:63  83:93:225:63     NULL          NULL
113  2021-07-05  NULL          83:93:225:63  83:93:225:63     83:93:225:63  83:93:225:63     NULL          NULL
112  2022-02-05  NULL          3:9:225:63    3:9:225:63       NULL          NULL             3:9:225:63    3:9:225:63
112  2021-02-05  3:9:225:63    3:9:225:63    3:9:225:63       NULL          3:9:225:63       8:9:225:63    3:9:225:63
112  2020-02-05  8:9:225:63    3:9:225:63    8:9:225:63       3:9:225:63    8:9:225:63       NULL          8:9:225:63
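
The prior and next columns above each leave gaps on their own, because LAG only looks at earlier rows and LEAD only at later ones. If the intent is "take the nearest earlier value, and fall back to the nearest later one when there is none", a minimal sketch could chain them with COALESCE. The prior-then-next fallback order is my assumption, not something stated in the question. On the sample data this should fill every null the way the desired table shows.

```sql
-- Sketch: keep ip when present, else the nearest earlier non-null ip,
-- else the nearest later non-null ip, all within the same id.
SELECT column1 AS id
    ,to_date(column2) AS date
    ,COALESCE(
        column3,
        LAG(column3) IGNORE NULLS OVER (PARTITION BY column1 ORDER BY to_date(column2)),
        LEAD(column3) IGNORE NULLS OVER (PARTITION BY column1 ORDER BY to_date(column2))
    ) AS ip
FROM VALUES
    (113,'2022-02-05 11:40:42',null),
    (113,'2021-12-05 11:40:42',null),
    (113,'2021-08-05 11:40:42',null),
    (113,'2021-07-05 11:40:42',null),
    (113,'2022-02-05 11:40:42','83:93:225:63'),
    (112,'2022-02-05 11:40:42',null),
    (112,'2021-02-05 11:40:42','3:9:225:63'),
    (112,'2020-02-05 11:40:42','8:9:225:63')
ORDER BY 1 DESC, 2 DESC;
```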

I'm guessing you want current_or_last, combined into a single expression like this:

SELECT column1 as id
    ,to_date(column2) as date
    ,NVL(column3, LAST_VALUE(column3) IGNORE NULLS OVER (PARTITION BY id ORDER BY date)) AS ip
FROM VALUES
    (113,'2022-02-05 11:40:42',null),
    (113,'2021-12-05 11:40:42',null),
    (113,'2021-08-05 11:40:42',null),
    (113,'2021-07-05 11:40:42',null),
    (113,'2022-02-05 11:40:42','83:93:225:63'),
    (112,'2022-02-05 11:40:42',null),
    (112,'2021-02-05 11:40:42','3:9:225:63'),
    (112,'2020-02-05 11:40:42','8:9:225:63')
ORDER BY 1 DESC,2 DESC;

Which gives:

ID   DATE        IP
113  2022-02-05  83:93:225:63
113  2022-02-05  83:93:225:63
113  2021-12-05  83:93:225:63
113  2021-08-05  83:93:225:63
113  2021-07-05  83:93:225:63
112  2022-02-05  3:9:225:63
112  2021-02-05  3:9:225:63
112  2020-02-05  8:9:225:63
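
One thing worth noting about the queries above: in the first output, the 2020-02-05 row for id 112 shows LAST_VALUE returning 3:9:225:63, a value from a later date. That suggests the window covers the whole partition, which as far as I know is Snowflake's default frame for LAST_VALUE. If "last recorded" should instead mean the most recent value at or before each row's date, here is a minimal sketch with an explicit running frame. It is a variant under that assumption, not what the question's desired table shows.

```sql
-- Sketch: running forward-fill. The frame ends at the current row, so each
-- row sees only its own ip and ips from earlier rows of the same id.
SELECT column1 AS id
    ,to_date(column2) AS date
    ,LAST_VALUE(column3) IGNORE NULLS OVER (
        PARTITION BY column1
        ORDER BY to_date(column2)
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS ip
FROM VALUES
    (113,'2022-02-05 11:40:42',null),
    (113,'2021-12-05 11:40:42',null),
    (113,'2021-08-05 11:40:42',null),
    (113,'2021-07-05 11:40:42',null),
    (113,'2022-02-05 11:40:42','83:93:225:63'),
    (112,'2022-02-05 11:40:42',null),
    (112,'2021-02-05 11:40:42','3:9:225:63'),
    (112,'2020-02-05 11:40:42','8:9:225:63')
ORDER BY 1 DESC, 2 DESC;
```

With this frame, rows that have no earlier non-null ip in their id (the 2021 rows for id 113 in the sample) stay null. The two id 113 rows tied on 2022-02-05 have no defined order between them, so the null one may or may not pick up the value.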