T-SQL query getting the max of a sub element

Question

Consider a table like this:

Category	Subcategory	Item
Foo	Apple	i1
Foo	Apple	i2
Foo	Apple	i3
Foo	Pear	i4
Foo	Pear	i5
Bar	Blackberry	i6
Bar	Blueberry	i7
Bar	Blueberry	i8

For each category, I want the subcategory with the highest item count. I don't care about the identity of the items (or even how many there are). So I expect the final result to be

Category	Subcategory
Foo	Apple
Bar	Blueberry

I tried

WITH pool AS (
    SELECT
        category,
        subcategory,
        COUNT(item) AS "itemCount"
    FROM table
    GROUP BY category, subcategory
),
maxItems AS (
    SELECT
        category,
        MAX(subcategory) AS subcategory -- In real life, this is a numeric column
    FROM pool
    GROUP BY category
    HAVING itemCount = MAX(itemCount)
)
-- and final query here filtered on the category -> subcategory mapping from above

But the HAVING clause raises this error:

is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause.

Of course it's not in the GROUP BY. I don't want to group by the max count; I want to filter by it.

I can make it work with a subquery in maxItems, changing it to

maxItems AS (
    SELECT
        pool.category,
        MAX(pool.subcategory) AS subcategory -- In real life, this is a numeric column
    FROM pool
    JOIN (
        -- highest item count within each category
        SELECT
            category,
            MAX(itemCount) AS "itemCount"
        FROM pool
        GROUP BY category
    ) AS "maxFilter"
        ON pool.category = maxFilter.category
        AND maxFilter.itemCount = pool.itemCount
    GROUP BY pool.category
)

But I really feel it would be more elegant and clearer if HAVING worked, and I don't understand why it doesn't.

Answer 1

This gets the subcategory with the highest count in each category and, if the counts are the same, returns both subcategories:

select a.category, a.subcategory, itemcounts.total
from table a
cross apply ( select top 1 b.subcategory, count(b.item) as total
            from table b 
            where b.category = a.category
            group by b.subcategory
            order by count(b.item) desc) itemcounts
group by a.category, a.subcategory, itemcounts.total
having count(a.item) = itemcounts.total

Answer 2

Here is one way to do it, which also handles ties:

select * from (
   select category,Subcategory,rank() over (partition by category order by count(*) desc) rn 
   from tablename
   group by category,Subcategory
)t where rn = 1

db<>fiddle here
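
To see the tie handling concretely, here is a minimal sketch that runs the same RANK() query against the question's sample rows. The table variable @items and the extra row i9 are assumptions added for illustration (i9 gives Blackberry and Blueberry two items each, which creates a tie within Bar):

```sql
-- Hypothetical table variable holding the question's sample data.
DECLARE @items TABLE (
    Category    varchar(20),
    Subcategory varchar(20),
    Item        varchar(10)
);

INSERT INTO @items (Category, Subcategory, Item) VALUES
    ('Foo', 'Apple',      'i1'),
    ('Foo', 'Apple',      'i2'),
    ('Foo', 'Apple',      'i3'),
    ('Foo', 'Pear',       'i4'),
    ('Foo', 'Pear',       'i5'),
    ('Bar', 'Blackberry', 'i6'),
    ('Bar', 'Blueberry',  'i7'),
    ('Bar', 'Blueberry',  'i8'),
    ('Bar', 'Blackberry', 'i9');  -- assumed extra row, not in the question

-- Rank subcategories inside each category by item count; tied counts share rank 1.
SELECT Category, Subcategory
FROM (
    SELECT Category, Subcategory,
           RANK() OVER (PARTITION BY Category ORDER BY COUNT(*) DESC) AS rn
    FROM @items
    GROUP BY Category, Subcategory
) AS t
WHERE rn = 1
ORDER BY Category, Subcategory;
-- Returns: (Bar, Blackberry), (Bar, Blueberry), (Foo, Apple)
```

Swapping RANK() for ROW_NUMBER() would keep exactly one row per category instead, with the choice between tied subcategories left unspecified.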

Answer 3

You can do it with the FIRST_VALUE() window function:

SELECT DISTINCT Category,
       FIRST_VALUE(Subcategory) OVER (PARTITION BY Category ORDER BY COUNT(*) DESC) Subcategory
FROM tablename 
GROUP BY Category, Subcategory;

See the demo.
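
Note that FIRST_VALUE() yields a single subcategory per category, so when counts are tied the winner depends on an ordering the query does not pin down. One possible way to make it deterministic is sketched below, assuming an alphabetical tie-break is acceptable; the table variable @items and its rows are made up for illustration:

```sql
-- Hypothetical sample data: Bar has a tie (one item in each subcategory).
DECLARE @items TABLE (
    Category    varchar(20),
    Subcategory varchar(20),
    Item        varchar(10)
);

INSERT INTO @items (Category, Subcategory, Item) VALUES
    ('Foo', 'Apple',      'i1'),
    ('Foo', 'Apple',      'i2'),
    ('Foo', 'Pear',       'i4'),
    ('Bar', 'Blackberry', 'i6'),
    ('Bar', 'Blueberry',  'i7');

SELECT DISTINCT Category,
       FIRST_VALUE(Subcategory) OVER (
           PARTITION BY Category
           ORDER BY COUNT(*) DESC,
                    Subcategory      -- assumed tie-breaker: alphabetical order
       ) AS Subcategory
FROM @items
GROUP BY Category, Subcategory
ORDER BY Category;
-- Returns: (Bar, Blackberry), (Foo, Apple)
```

Without the second ORDER BY key, either Blackberry or Blueberry could come back for Bar.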

sql

tsql

sql-server

group-by

window-functions