Rank function using ORDER BY on TIMESTAMP_TZ(9) in Snowflake is not working properly

In Snowflake, the RANK function ordered by identical ROW_MODIFIED_TMST values is generating unique numbers.

For example:

Table1

Column1  ROW_MODIFIED_TMST                    
A        2022-04-03 17:42:41.009 +0000        
b        2022-04-03 17:42:41.009 +0000        
c        2022-04-03 17:42:41.009 +0000        
d        2022-04-03 17:42:41.009 +0100

select 
rank() over(partition by column1 order by ROW_MODIFIED_TMST desc) from table1

Column1  ROW_MODIFIED_TMST                    RANK
A        2022-04-03 17:42:41.009 +0000        1
b        2022-04-03 17:42:41.009 +0000        2
c        2022-04-03 17:42:41.009 +0000        3
d        2022-04-03 17:42:41.009 +0100        4

Here rank function should be 1,1,1,2 instead of 1,2,3,4

Please suggest

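In this example, the PARTITION BY on Column1 is the problem. Every value is different: A, b, c, d, so each row is ranked alone within its own partition.

To avoid gaps, use DENSE_RANK instead of RANK.

The code should be: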

select *, dense_rank() over(order by ROW_MODIFIED_TMST desc) 
from table1

So, if we take the example data and run it:

with table1(Column1,ROW_MODIFIED_TMST) as (      
    SELECT * FROM VALUES
    ('A', '2022-04-03 17:42:41.009 +0000'::timestamp_tz),    
    ('b', '2022-04-03 17:42:41.009 +0000'::timestamp_tz),      
    ('c', '2022-04-03 17:42:41.009 +0000'::timestamp_tz),    
    ('d', '2022-04-03 17:42:41.009 +0100'::timestamp_tz)
)
select 
rank() over(partition by column1 order by ROW_MODIFIED_TMST desc) from table1

RANK() OVER(PARTITION BY COLUMN1 ORDER BY ROW_MODIFIED_TMST DESC)
1
1
1
1

It does exactly what I expected, and what Lukazs pointed out.

But you said:

Here rank function should be 1,1,1,2 instead of 1,2,3,4

But you do not get 1,2,3,4, and you should not get 1,1,1,2 either, because the four Column1 values are all different.
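
To illustrate why every row gets 1, here is a minimal Python sketch that mimics RANK() OVER (PARTITION BY column1 ORDER BY ROW_MODIFIED_TMST DESC) on the sample data. It is not Snowflake code, and the helper names (parse_tz, rank_per_partition) are made up for this illustration; it only assumes that a row's rank is 1 plus the number of rows in the same partition with a strictly later timestamp.

```python
from collections import defaultdict
from datetime import datetime

# Sample rows from the question: (column1, row_modified_tmst)
ROWS = [
    ("A", "2022-04-03 17:42:41.009 +0000"),
    ("b", "2022-04-03 17:42:41.009 +0000"),
    ("c", "2022-04-03 17:42:41.009 +0000"),
    ("d", "2022-04-03 17:42:41.009 +0100"),
]


def parse_tz(text):
    """Parse 'YYYY-MM-DD HH:MM:SS.fff +HHMM' into a timezone-aware datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f %z")


def rank_per_partition(rows):
    """Emulate RANK() OVER (PARTITION BY column1 ORDER BY tmst DESC)."""
    # Group the timestamps by partition key (column1).
    partitions = defaultdict(list)
    for key, text in rows:
        partitions[key].append(parse_tz(text))

    ranked = []
    for key, text in rows:
        ts = parse_tz(text)
        # Rank = 1 + count of rows in the same partition that sort earlier
        # (strictly later timestamps, because the order is DESC).
        ranked.append((key, 1 + sum(other > ts for other in partitions[key])))
    return ranked


if __name__ == "__main__":
    for key, rank in rank_per_partition(ROWS):
        print(key, rank)
```

Each partition holds a single row, so there is never anything to rank against and the result is 1 for all four rows.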

Now, if you remove the PARTITION BY on those four distinct Column1 values, we can see how the two types of RANK work and compare them with ROW_NUMBER():

select 
    rank() over(order by ROW_MODIFIED_TMST desc) as sparse,
    dense_rank() over(order by ROW_MODIFIED_TMST desc) as dense,
    row_number() over(order by ROW_MODIFIED_TMST desc) as rn
from table1

gives:

SPARSE DENSE RN
1 1 1
1 1 2
1 1 3
4 2 4
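
The same three numberings can be reproduced outside Snowflake with a small Python sketch, which may help show where the gap in RANK comes from. This is an emulation under stated assumptions, not how Snowflake implements window functions, and parse_tz and window_numbers are hypothetical helper names. Note that row d sorts last because 17:42:41.009 +0100 is the instant 16:42:41.009 UTC, one hour earlier than the other three rows.

```python
from datetime import datetime

# Sample rows from the question: (column1, row_modified_tmst)
ROWS = [
    ("A", "2022-04-03 17:42:41.009 +0000"),
    ("b", "2022-04-03 17:42:41.009 +0000"),
    ("c", "2022-04-03 17:42:41.009 +0000"),
    ("d", "2022-04-03 17:42:41.009 +0100"),
]


def parse_tz(text):
    """Parse 'YYYY-MM-DD HH:MM:SS.fff +HHMM' into a timezone-aware datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f %z")


def window_numbers(rows):
    """Return (column1, rank, dense_rank, row_number), ordered by timestamp DESC."""
    # Aware datetimes compare by instant, so the +0100 row sorts after the others.
    # Python's sort is stable, so tied rows keep their input order here; in SQL
    # the ROW_NUMBER order among ties is not guaranteed.
    ordered = sorted(rows, key=lambda r: parse_tz(r[1]), reverse=True)

    result = []
    rank = 0
    dense = 0
    previous = None
    for row_number, (key, text) in enumerate(ordered, start=1):
        ts = parse_tz(text)
        if ts != previous:
            rank = row_number  # RANK jumps to the row position, leaving gaps
            dense += 1         # DENSE_RANK advances by one per distinct value
            previous = ts
        result.append((key, rank, dense, row_number))
    return result


if __name__ == "__main__":
    print("COL SPARSE DENSE RN")
    for key, rank, dense, rn in window_numbers(ROWS):
        print(key, rank, dense, rn)
```

The tied rows share one RANK and one DENSE_RANK value, while ROW_NUMBER keeps counting, which matches the SPARSE, DENSE, and RN columns above.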