Selecting rows with the most repeated values in a specific column
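General problem: I need to select values from one table by referencing the value that is repeated most often in another table.
The tables have the following structure: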
screenshot
screenshot2
The task is to find the country that has the most sportsman results associated with it.
First, I INNER JOIN the tables to relate results to countries:
SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id);
Then I count how many times each country appears:
SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id))
GROUP BY country;
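This gives me the following: screenshot3
Now it feels like I am only one step away from the solution :)
I guess it is possible with one more SELECT FROM (SELECT ...) and MAX(), but I can't work out how to wrap it.
P.S.
I did manage it by duplicating the query like this, but I suspect it would be far too inefficient with millions of rows: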
SELECT country
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
)
WHERE highest_participation = (SELECT MAX(highest_participation)
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
))
I also did it with a view:
CREATE VIEW temp AS
SELECT country as country_with_most_participations, COUNT(country) as country_participate_in_#_comp
FROM(
SELECT country, competition_id FROM result
INNER JOIN sportsman USING(sportsman_id)
)
GROUP BY country;
SELECT country_with_most_participations FROM temp
WHERE country_participate_in_#_comp = (SELECT MAX(country_participate_in_#_comp) FROM temp);
But I am not sure whether this is the simplest way.
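One way to keep the MAX() idea without repeating the aggregation or creating a view is a common table expression. The following is only a minimal sketch, not a definitive answer: it runs the query against an in-memory SQLite database from Python so that it is self-contained, the sample rows are invented for illustration, and only the table and column names (result, sportsman, sportsman_id, competition_id, country) come from the question. The CTE name country_counts and the alias participations are hypothetical.

```python
import sqlite3

# Hypothetical sample data; only the table and column names come from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sportsman (sportsman_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE result (competition_id INTEGER, sportsman_id INTEGER);
    INSERT INTO sportsman VALUES (1, 'USA'), (2, 'USA'), (3, 'Kenya'), (4, 'Japan');
    INSERT INTO result VALUES (10, 1), (10, 3), (11, 2), (11, 3), (12, 1), (12, 4);
""")

# The aggregation is written once in the CTE and referenced twice below.
QUERY = """
    WITH country_counts AS (
        SELECT country, COUNT(*) AS participations
        FROM result
        INNER JOIN sportsman USING (sportsman_id)
        GROUP BY country
    )
    SELECT country
    FROM country_counts
    WHERE participations = (SELECT MAX(participations) FROM country_counts)
    ORDER BY country
"""

top_countries = [row[0] for row in conn.execute(QUERY)]
print(top_countries)
```

The WITH clause itself is standard SQL and should also be accepted by Oracle. It removes the duplication from the query text; whether the database actually evaluates the aggregation only once is up to its optimizer.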
SELECT country
FROM (SELECT country, COUNT(country) AS highest_participation
FROM (SELECT competition_id, country FROM result
INNER JOIN sportsman USING (sportsman_id)
) GROUP BY country
order by 2 desc
)
where rownum=1
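You seem to be overcomplicating this. Starting from your existing join query, you can aggregate the results, sort them, and keep only the top row(s):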
select s.country, count(*) cnt
from sportsman s
inner join result r using (sportsman_id)
group by s.country
order by cnt desc
fetch first 1 row with ties
Note that this allows ties for the top spot: if several countries share the highest count, all of them are returned.
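If I understand correctly, you want to rank the countries by their number of competitions and show the top-ranked country (or countries) together with that count. I suggest using RANK for the ranking: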
select country, competition_count
from
(
select
s.country,
count(*) as competition_count,
rank() over (order by count(*) desc) as rn
from sportsman s
inner join result r using (sportsman_id)
group by s.country
) ranked_by_count
where rn = 1
order by country;
If the order of the result rows doesn't matter, you can shorten this to:
select s.country, count(*) as competition_count
from sportsman s
inner join result r using (sportsman_id)
group by s.country
order by count(*) desc
fetch first rows with ties;
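To see how ties are handled, the RANK query above can be tried on a small data set in which two countries share the highest count. This is a hedged sketch rather than a reference implementation: it uses Python's sqlite3 module instead of Oracle so that it runs anywhere, it assumes the bundled SQLite is version 3.25 or later (needed for window functions), and the sample rows are made up. SQLite has no FETCH FIRST ... WITH TIES, so only the RANK variant is shown.

```python
import sqlite3

# Hypothetical sample data: USA and Kenya are tied with two results each.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sportsman (sportsman_id INTEGER PRIMARY KEY, country TEXT);
    CREATE TABLE result (competition_id INTEGER, sportsman_id INTEGER);
    INSERT INTO sportsman VALUES (1, 'USA'), (2, 'Kenya'), (3, 'Japan');
    INSERT INTO result VALUES (10, 1), (11, 1), (10, 2), (11, 2), (10, 3);
""")

# Same shape as the RANK query in the answer: rank the per-country counts,
# then keep every country whose rank is 1.
RANK_QUERY = """
    SELECT country, competition_count
    FROM (
        SELECT s.country,
               COUNT(*) AS competition_count,
               RANK() OVER (ORDER BY COUNT(*) DESC) AS rn
        FROM sportsman s
        INNER JOIN result r USING (sportsman_id)
        GROUP BY s.country
    ) ranked_by_count
    WHERE rn = 1
    ORDER BY country
"""

tied_top = conn.execute(RANK_QUERY).fetchall()
print(tied_top)
```

Both tied countries come back, which is the difference from a filter such as rownum = 1 that keeps a single row regardless of ties.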