SQL:ANSI Output a table with IDs when values first move from 0 to any positive number
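I have a table that contains an ID, a date, and a value. For each unique ID, I want to retrieve the row (date and value) where the value first moves specifically from 0 to any positive number.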
ID DATE Value
1 2019-01-31 0
2 2019-02-27 0
3 2019-03-31 0
2 2019-01-31 5
1 2019-02-31 1
3 2019-04-31 5
2 2019-04-30 5
1 2019-05-31 10
3 2020-01-31 0
2 2020-02-28 3
1 2019-06-31 5
3 2020-04-30 5
Desired output:
ID DATE Value
1 2019-02-31 1
2 2019-02-28 3
3 2019-04-31 5
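I am trying to do this in Snowflake, not sure if that makes any difference.
QUALIFY and ROW_NUMBER() can be used for this.
That is, if you want the first non-zero value per ID, although that is not quite what you asked for: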
SELECT *
FROM values
(1, '2019-01-31'::date,0 ),
(2, '2019-02-27'::date,0 ),
(3, '2019-03-31'::date,0 ),
(2, '2019-01-31'::date,5 ),
(1, '2019-02-28'::date,1 ),
(3, '2019-04-30'::date,5 ),
(2, '2019-04-30'::date,5 ),
(1, '2019-05-31'::date,10 ),
(3, '2020-01-31'::date,0 ),
(2, '2020-02-28'::date,3 ),
(1, '2019-06-30'::date,5 ),
(3, '2020-04-30'::date,5 )
t(ID , DATE , Value )
QUALIFY value > 0 AND row_number() over(partition by id, value > 0 order by date ) = 1
ORDER BY 1,2
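The trick to note is that you exclude every value that is not greater than 0, and you also partition the row_number by that same condition, so the positive rows are numbered separately from the zero rows.
This gives: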
ID | DATE | VALUE |
---|---|---|
1 | 2019-02-28 | 1 |
2 | 2019-01-31 | 5 |
3 | 2019-04-30 | 5 |
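For readers without QUALIFY, the same "first non-zero value" idea can be sketched by filtering in WHERE and numbering the remaining rows in a subquery. The harness below is an assumption made only so the query can be run and checked: it uses Python's `sqlite3` module (window functions need SQLite 3.25 or later), a table named `table_data`, and dates stored as ISO text. None of that comes from the original answer, which targets Snowflake.

```python
import sqlite3

# Sample rows from the question, with the impossible dates corrected the
# same way the answer's VALUES list does (e.g. 2019-02-31 -> 2019-02-28).
ROWS = [
    (1, "2019-01-31", 0), (2, "2019-02-27", 0), (3, "2019-03-31", 0),
    (2, "2019-01-31", 5), (1, "2019-02-28", 1), (3, "2019-04-30", 5),
    (2, "2019-04-30", 5), (1, "2019-05-31", 10), (3, "2020-01-31", 0),
    (2, "2020-02-28", 3), (1, "2019-06-30", 5), (3, "2020-04-30", 5),
]

FIRST_NON_ZERO = """
SELECT id, date, value
FROM (
    SELECT id, date, value,
           row_number() OVER (PARTITION BY id ORDER BY date) AS rn
    FROM table_data
    WHERE value > 0          -- drop the zero rows before numbering
) AS numbered
WHERE rn = 1
ORDER BY 1, 2
"""


def first_non_zero(rows):
    """Return the earliest row with value > 0 for each id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_data (id INTEGER, date TEXT, value INTEGER)")
    conn.executemany("INSERT INTO table_data VALUES (?, ?, ?)", rows)
    result = conn.execute(FIRST_NON_ZERO).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_non_zero(ROWS):
        print(row)
```

Because the zero rows are removed before the window function runs, there is no need to partition by `value > 0` in this form. The three rows returned match the table above.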
Take two:
The first transition from 0 to non-zero.
First, let's sort the data so that we are talking about the same thing:
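ID | DATE | VALUE | wanted |
---|---|---|---|
1 | 2019-01-31 | 0 | |
1 | 2019-02-28 | 1 | this |
1 | 2019-05-31 | 10 | |
1 | 2019-06-30 | 5 | |
2 | 2019-01-31 | 5 | |
2 | 2019-02-27 | 0 | |
2 | 2019-04-30 | 5 | this |
2 | 2020-02-28 | 3 | |
3 | 2019-03-31 | 0 | |
3 | 2019-04-30 | 5 | this |
3 | 2020-01-31 | 0 | |
3 | 2020-04-30 | 5 | |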
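This can be done with two nested QUALIFYs: the inner one keeps rows whose previous value was 0, and the outer one keeps the first such row per ID: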
SELECT * FROM (
SELECT *
FROM values
(1, '2019-01-31'::date,0 ),
(2, '2019-02-27'::date,0 ),
(3, '2019-03-31'::date,0 ),
(2, '2019-01-31'::date,5 ),
(1, '2019-02-28'::date,1 ),
(3, '2019-04-30'::date,5 ),
(2, '2019-04-30'::date,5 ),
(1, '2019-05-31'::date,10 ),
(3, '2020-01-31'::date,0 ),
(2, '2020-02-28'::date,3 ),
(1, '2019-06-30'::date,5 ),
(3, '2020-04-30'::date,5 )
t(ID , DATE , Value )
QUALIFY lag(value)over(partition by id order by date) = 0
)
QUALIFY row_number() over(partition by id order by date ) = 1
ORDER BY 1,2
This gives:
ID | DATE | VALUE |
---|---|---|
1 | 2019-02-28 | 1 |
2 | 2019-04-30 | 5 |
3 | 2019-04-30 | 5 |
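One case the sample data does not exercise: `lag(value) = 0` is also true for a row whose own value is still 0 (a 0 to 0 step). The sketch below runs the same lag-then-row_number logic on a hypothetical ID 4 that is not in the question. It uses SQLite through Python's `sqlite3` module (an assumption, chosen only so the example runs outside Snowflake; window functions need SQLite 3.25 or later) and a table named `table_data`. Adding `value > 0` to the filter is a suggested guard, not part of the original answer.

```python
import sqlite3

# Hypothetical rows, not from the question: the value stays at 0 for two
# consecutive dates before it turns positive.
EDGE_ROWS = [(4, "2021-01-31", 0), (4, "2021-02-28", 0), (4, "2021-03-31", 7)]

# {extra} is filled with an optional additional filter on the current row.
TRANSITION_QUERY = """
SELECT id, date, value
FROM (
    SELECT id, date, value,
           row_number() OVER (PARTITION BY id ORDER BY date) AS rn
    FROM (
        SELECT id, date, value,
               lag(value) OVER (PARTITION BY id ORDER BY date) AS lag_val
        FROM table_data
    ) AS a
    WHERE a.lag_val = 0 {extra}
) AS b
WHERE rn = 1
ORDER BY 1, 2
"""


def first_transition(rows, require_positive):
    """First row per id whose previous value was 0.

    With require_positive=True the current value must also be > 0.
    """
    extra = "AND a.value > 0" if require_positive else ""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_data (id INTEGER, date TEXT, value INTEGER)")
    conn.executemany("INSERT INTO table_data VALUES (?, ?, ?)", rows)
    result = conn.execute(TRANSITION_QUERY.format(extra=extra)).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(first_transition(EDGE_ROWS, require_positive=False))
    print(first_transition(EDGE_ROWS, require_positive=True))
```

Without the guard, the query returns the second zero row for ID 4 (2021-02-28, value 0). With it, the query returns the row where the value actually becomes positive (2021-03-31, value 7). On the question's own data both variants return the same three rows.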
ANSI SQL:
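If you need ANSI SQL, which has no QUALIFY, you should use this form, where each QUALIFY becomes a subquery filtered by WHERE: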
SELECT
b.ID,
b.DATE,
b.Value
FROM (
SELECT
a.ID,
a.DATE,
a.Value,
row_number() over(partition by a.id order by a.date ) as rn
FROM (
SELECT
ID,
DATE,
Value,
lag(value)over(partition by id order by date) as lag_val
FROM table_data
) AS a
WHERE a.lag_val = 0
) AS b
WHERE b.rn = 1
ORDER BY 1,2
I tend to find that expressing the desired output in the least amount of code is clearer, because it best expresses the task at hand.
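As a readability alternative to nested subqueries, the same two steps (compute the previous value, then keep the first matching row per ID) can be written as named common table expressions. This is a restructuring, not the answer's own query. The sketch runs it against the sample data in SQLite through Python's `sqlite3` module. The harness, the CTE names `with_lag` and `transitions`, and storing dates as ISO text are all assumptions made for illustration, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows as used in the answer's VALUES list (dates already corrected).
SAMPLE_ROWS = [
    (1, "2019-01-31", 0), (2, "2019-02-27", 0), (3, "2019-03-31", 0),
    (2, "2019-01-31", 5), (1, "2019-02-28", 1), (3, "2019-04-30", 5),
    (2, "2019-04-30", 5), (1, "2019-05-31", 10), (3, "2020-01-31", 0),
    (2, "2020-02-28", 3), (1, "2019-06-30", 5), (3, "2020-04-30", 5),
]

CTE_QUERY = """
WITH with_lag AS (
    -- step 1: attach the previous value within each id, ordered by date
    SELECT id, date, value,
           lag(value) OVER (PARTITION BY id ORDER BY date) AS lag_val
    FROM table_data
),
transitions AS (
    -- step 2: keep rows that follow a 0 and number them per id
    SELECT id, date, value,
           row_number() OVER (PARTITION BY id ORDER BY date) AS rn
    FROM with_lag
    WHERE lag_val = 0
)
SELECT id, date, value
FROM transitions
WHERE rn = 1
ORDER BY 1, 2
"""


def first_zero_to_positive(rows):
    """Run the CTE form of the query and return the first transition per id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE table_data (id INTEGER, date TEXT, value INTEGER)")
    conn.executemany("INSERT INTO table_data VALUES (?, ?, ?)", rows)
    result = conn.execute(CTE_QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in first_zero_to_positive(SAMPLE_ROWS):
        print(row)
```

On the sample data this returns the same three rows as the nested QUALIFY version: (1, 2019-02-28, 1), (2, 2019-04-30, 5), and (3, 2019-04-30, 5).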