Rank in hive from count(column) from table
Suppose I want to select count(user_name), country from a table in Hive. What command should I use to get the top 2 countries with the most user_name values?
How do I use the rank function?
id | user_name | country
1 | a | UK
2 | b | US
3 | c | AUS
4 | d | ITA
5 | e | UK
6 | f | US
The result should be:
rank | num_user_name | country
1 | 2 | US
1 | 2 | UK
2 | 1 | ITA
2 | 1 | AUS
You can use the dense_rank analytic function:
with cte as (
select country,
count(user_name) as num_user_name
from tbl
group by country
), cte2 as (
select dense_rank() over (order by num_user_name desc) as ranked,
num_user_name,
country
from cte
)
select ranked,
num_user_name,
country
from cte2
where ranked <= 2
order by 1
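As a side note on why dense_rank is used here rather than rank: with ties, rank() leaves gaps in the numbering while dense_rank() does not, so the `ranked <= 2` filter behaves differently. A small sketch against the same sample data (the column aliases r, dr and rn are only illustrative names, not part of the original query):

```sql
-- Compare the three ranking functions on the per-country counts.
-- Counts in the sample data: US = 2, UK = 2, ITA = 1, AUS = 1.
with cte as (
  select country,
         count(user_name) as num_user_name
  from tbl
  group by country
)
select country,
       num_user_name,
       rank()       over (order by num_user_name desc) as r,   -- 1, 1, 3, 3 (gap after ties)
       dense_rank() over (order by num_user_name desc) as dr,  -- 1, 1, 2, 2 (no gap)
       row_number() over (order by num_user_name desc) as rn   -- 1..4, order among ties is arbitrary
from cte;
```

With rank(), a filter of `<= 2` would return only US and UK, because ITA and AUS would get rank 3. dense_rank() gives them rank 2, which matches the expected result in the question.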
A subquery is not needed to compute the rank:
select dense_rank() over (order by count(*) desc) as rank,
country,
count(*) as num_user_name
from t
group by country
order by count(*) desc, country;
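Note that this query ranks every country but does not limit the output to the top 2 ranks. A window function result cannot be referenced in the WHERE clause of the same SELECT, so the filter needs one wrapping query. A minimal sketch, assuming the same table t and that your Hive version accepts an aggregate inside the window's ORDER BY (the alias ranked and the subquery alias r are illustrative names):

```sql
-- Rank countries by user count in the inner query,
-- then keep only the top 2 ranks in the outer query.
select ranked,
       num_user_name,
       country
from (
  select dense_rank() over (order by count(*) desc) as ranked,
         count(*) as num_user_name,
         country
  from t
  group by country
) r
where ranked <= 2
order by ranked, country;
```

This is equivalent in intent to the two-CTE version above, just with the grouping and ranking folded into a single inner query.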