How do I restart row_number() when a partition reappears after a different partition?
"Partition by" in the OVER clause groups all rows that share the same values, much like "Distinct" or "Group by".
This is how it works in my query with row_number():
id st t row_number
-------------------
1 1 1 1
1 1 2 2
1 1 3 3
2 1 3 1
1 2 4 1
1 1 10 4
And this is what I want:
id st t uniq_row_number
------------------
1 1 1 1
1 1 2 2
1 1 3 3
2 1 3 1
1 2 4 1
1 1 10 1
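It does not matter whether the same values appeared earlier: every time the partition values change, the rows should be treated as a new partition.
While the partition values repeat on consecutive rows, uniq_row_number goes up by 1. As soon as a row with different partition values arrives, uniq_row_number starts again at 1.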
My SQL query:
SELECT id, st, t,
row_number() OVER (PARTITION BY id, st ORDER BY id, st) cat_num,
min(t) over (PARTITION BY id, st) min_t,
max(t) over (PARTITION BY id, st) max_t
FROM tabl ORDER BY t;
SQL code is here: http://sqlfiddle.com/#!18/d4290/2
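This is called a "gaps-and-islands" problem. You need to define a group for each "island" of similar values. Then you can use row_number().
The difference of row numbers is a convenient way to define the islands: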
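select t.*,
       row_number() over (partition by id, st, seqnum_t - seqnum_it
                          order by t
                         ) as uniq_row_number
from (select t.*,
             -- position of the row in the overall ordering (id, st break ties on t)
             row_number() over (order by t, id, st) as seqnum_t,
             -- position of the row within its own (id, st) group;
             -- st is included because the question partitions by id, st
             row_number() over (partition by id, st order by t) as seqnum_it
      from tabl t
     ) t;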
The best way to understand how this works is to look at the results of the subquery. You should be able to see how the difference of the row numbers defines the groups you care about.
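To make that inspection concrete, here is a minimal sketch that exposes the intermediate columns next to the final result. It rests on a few assumptions that are not in the original posts: the inline CTE named tabl stands in for the real table and holds only the six sample rows from the question, the names numbered and grp are purely illustrative, both row numbers take st into account because the question partitions by id, st, and id, st are used as a tie-breaker for the two rows with t = 3. The syntax is meant for SQL Server (the fiddle's engine) and should also run on PostgreSQL.

```sql
-- Sample rows copied from the question; this CTE is a stand-in for the real table.
with tabl (id, st, t) as (
    select v.id, v.st, v.t
    from (values (1, 1, 1), (1, 1, 2), (1, 1, 3),
                 (2, 1, 3), (1, 2, 4), (1, 1, 10)) as v (id, st, t)
),
numbered as (
    select id, st, t,
           -- position in the overall ordering; id, st break the tie at t = 3
           row_number() over (order by t, id, st) as seqnum_t,
           -- position within the row's own (id, st) group
           row_number() over (partition by id, st order by t) as seqnum_it
    from tabl
)
select id, st, t, seqnum_t, seqnum_it,
       -- constant for consecutive rows of the same (id, st); it changes
       -- once rows from another group have come in between
       seqnum_t - seqnum_it as grp,
       row_number() over (partition by id, st, seqnum_t - seqnum_it
                          order by t) as uniq_row_number
from numbered
order by seqnum_t;

-- Result for the sample rows:
-- id  st  t   seqnum_t  seqnum_it  grp  uniq_row_number
-- 1   1   1   1         1          0    1
-- 1   1   2   2         2          0    2
-- 1   1   3   3         3          0    3
-- 2   1   3   4         1          3    1
-- 1   2   4   5         1          4    1
-- 1   1   10  6         4          2    1
```

The last row is the interesting one: it belongs to the same (id, st) pair as the first three rows, but two other rows sit between them in the overall ordering, so its grp value is 2 instead of 0 and the numbering starts over at 1.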