Extract the most common (highest count) entry by group
I have the following table:
ID height
personA 182
personA 182
personA 182
personA 192
personA 172
personB 175
personB 175
I want to extract the most frequently occurring height for each person, because I suspect 192 is a typo. So far I have:
select ID, height, count(ID,height) as cnt
from tbl
group by ID, height
having max(cnt);
My desired output is:
ID height
personA 182
personB 175
You can use a window function to rank each ID's heights by how often they occur, then keep the top-ranked height per ID.
WITH cte AS (
SELECT
ID
, height
, ROW_NUMBER() OVER (PARTITION BY ID ORDER BY COUNT(height) DESC) rn
FROM dbo.tbl
GROUP BY
ID,
height)
SELECT
ID,
height
FROM cte WHERE rn = 1
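You can also use the max() function to get the largest entry per ID:
select ID, max(height)
from tbl
group by ID
Note that this returns the greatest height for each ID (192 for personA), not the most frequent one.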
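You need to use BigQuery's analytic functions. An analytic function partitions your table by the columns you choose. I used the row_number() function; you can also use rank().
To learn more about analytic functions: https://hevodata.com/learn/bigquery-row-number-function/
Code: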
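-- Count rows per (ID, height), number the heights of each ID from most
-- to least frequent, and keep only the most frequent one.
SELECT ID, height
FROM (SELECT ID, height,
             row_number() OVER (PARTITION BY ID ORDER BY COUNT(*) DESC) AS rn
      FROM students
      GROUP BY ID, height)
WHERE rn = 1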
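You can simply use mode, which is designed for your use case. Note that it does not handle ties: when several heights are tied for most frequent, only one of them is returned.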
select id, mode(height) as height
from t
group by id;
Another alternative, which does not use analytic functions and also handles ties:
with cte as
(select id, height, count(*) as cnt
from t
group by id, height)
select id, height
from cte
where (id, cnt) in (select id, max(cnt)
from cte
group by id)
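Some dialects (SQL Server, for instance) do not accept a multi-column IN comparison. A minimal sketch of the same idea written as a join, assuming the same table t; the aliases c and m and the column name max_cnt are illustrative names, not part of the original answer:

```sql
WITH cte AS (
    -- One row per (id, height) with its frequency
    SELECT id, height, COUNT(*) AS cnt
    FROM t
    GROUP BY id, height
)
SELECT c.id, c.height
FROM cte AS c
JOIN (
    -- Highest frequency reached within each id
    SELECT id, MAX(cnt) AS max_cnt
    FROM cte
    GROUP BY id
) AS m
  ON m.id = c.id
 AND m.max_cnt = c.cnt;
```

As with the IN version, every height tied for the highest count is returned.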
If you want to implement the above with the qualify clause, which is used neatly in Lukasz's answer, you can write:
select id, height
from t
group by id, height
qualify max( count(*) ) over (partition by id) = count(*)
Using QUALIFY:
SELECT ID, height
FROM tab
GROUP BY ID, height
QUALIFY RANK() OVER(PARTITION BY ID ORDER BY COUNT(*) DESC) = 1;
RANK is used to handle ties.
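To see what tie handling means in practice, here is a minimal sketch, assuming a dialect that supports QUALIFY (Snowflake, for example). The table name tab follows the query above; personC and its rows are hypothetical data added only to produce a tie.

```sql
-- Sample data: the question's rows plus a hypothetical personC
-- whose two heights each occur once (a tie).
CREATE TABLE tab (ID VARCHAR(20), height INT);
INSERT INTO tab (ID, height) VALUES
  ('personA', 182), ('personA', 182), ('personA', 182),
  ('personA', 192), ('personA', 172),
  ('personB', 175), ('personB', 175),
  ('personC', 180), ('personC', 181);

-- RANK gives every height tied for first place the rank 1,
-- so personC yields two rows.
SELECT ID, height
FROM tab
GROUP BY ID, height
QUALIFY RANK() OVER (PARTITION BY ID ORDER BY COUNT(*) DESC) = 1;

-- ROW_NUMBER keeps exactly one row per ID. The second ORDER BY key
-- (smallest height first) is an assumed tie-breaker that makes the
-- choice deterministic.
SELECT ID, height
FROM tab
GROUP BY ID, height
QUALIFY ROW_NUMBER() OVER (PARTITION BY ID ORDER BY COUNT(*) DESC, height) = 1;
```

Use RANK when all tied heights should be reported, and ROW_NUMBER with an explicit tie-breaker when exactly one row per ID is required.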