T-SQL query getting the max of a sub element
Consider a table like this:
Category | Subcategory | Item |
---|---|---|
Foo | Apple | i1 |
Foo | Apple | i2 |
Foo | Apple | i3 |
Foo | Pear | i4 |
Foo | Pear | i5 |
Bar | Blackberry | i6 |
Bar | Blueberry | i7 |
Bar | Blueberry | i8 |
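For each category, I want to get the subcategory with the highest count of items. I don't care about the identity of the items (or even how many there are). So I'd like the final result to be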
Category | Subcategory |
---|---|
Foo | Apple |
Bar | Blueberry |
I tried
    WITH pool AS (
        SELECT
            category,
            subcategory,
            COUNT(item) AS "itemCount"
        FROM table
        GROUP BY category, subcategory
    ),
    maxItems AS (
        SELECT
            category,
            MAX(subcategory) AS subcategory -- In real life, this is a numeric column
        FROM pool
        GROUP BY category
        HAVING itemCount = MAX(itemCount)
    )
    -- and final query here filtered on the category -> subcategory mapping from above
but the HAVING clause fails with the error
is invalid in the HAVING clause because it is not contained in either an aggregate function or the GROUP BY clause.
Of course it isn't in the GROUP BY. I don't want to group by the max count; I want to filter by it.
I can get it to work with a subquery in maxItems, changing it to
    maxItems AS (
        SELECT
            pool.category,
            MAX(pool.subcategory) AS subcategory -- In real life, this is a numeric column
        FROM pool
        JOIN (
            -- highest per-subcategory count within each category
            SELECT
                category,
                MAX(itemCount) AS "itemCount"
            FROM pool
            GROUP BY category
        ) AS "maxFilter"
            ON pool.category = maxFilter.category
            AND maxFilter.itemCount = pool.itemCount
        GROUP BY pool.category
    )
But I really feel it would be more elegant and clearer if the HAVING worked, and I don't understand why it doesn't.
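One way to read the error: HAVING is evaluated once per group produced by GROUP BY category, and inside a single category group itemCount can take several values (one per subcategory), so there is no single itemCount to compare against MAX(itemCount). The following is a minimal sketch, not taken from the original posts, of filtering by the per-category maximum with a window aggregate instead. The `src` and `withMax` CTE names and the `maxItemCount` column are hypothetical; `src` just inlines the sample rows from the question so the query runs on its own.

```sql
WITH src (category, subcategory, item) AS (
    -- Sample rows from the question, inlined (assumption: stands in for the real table)
    SELECT * FROM (VALUES
        ('Foo', 'Apple',      'i1'),
        ('Foo', 'Apple',      'i2'),
        ('Foo', 'Apple',      'i3'),
        ('Foo', 'Pear',       'i4'),
        ('Foo', 'Pear',       'i5'),
        ('Bar', 'Blackberry', 'i6'),
        ('Bar', 'Blueberry',  'i7'),
        ('Bar', 'Blueberry',  'i8')
    ) AS v (category, subcategory, item)
),
pool AS (
    -- One row per (category, subcategory) with its item count
    SELECT category, subcategory, COUNT(item) AS itemCount
    FROM src
    GROUP BY category, subcategory
),
withMax AS (
    -- Attach the per-category maximum to every row without collapsing rows
    SELECT category,
           subcategory,
           itemCount,
           MAX(itemCount) OVER (PARTITION BY category) AS maxItemCount
    FROM pool
)
SELECT category, subcategory
FROM withMax
WHERE itemCount = maxItemCount;
```

On the sample data this should return Foo/Apple and Bar/Blueberry (row order is not guaranteed). If two subcategories tie for the maximum, both rows are returned.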
This returns the subcategory with the highest count in each category; if counts are tied, it returns both subcategories:
    select a.category, a.subcategory, itemcounts.total
    from table a
    cross apply (
        select top 1 b.subcategory, count(b.item) as total
        from table b
        where b.category = a.category
        group by b.subcategory
        order by count(b.item) desc
    ) itemcounts
    group by a.category, a.subcategory, itemcounts.total
    having count(a.item) = itemcounts.total
Here is one way to do it, which also handles ties:
    select * from (
        select category, Subcategory,
               rank() over (partition by category order by count(*) desc) rn
        from tablename
        group by category, Subcategory
    ) t
    where rn = 1
db<>fiddle here
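To try the RANK() approach locally, here is a minimal sketch that loads the question's sample rows into a temporary table and runs the same query against it. The temp table name `#items` and the column types are assumptions made for illustration, not part of the original answer.

```sql
-- Hypothetical temp table holding the sample rows from the question
CREATE TABLE #items (
    Category    varchar(20) NOT NULL,
    Subcategory varchar(20) NOT NULL,
    Item        varchar(20) NOT NULL
);

INSERT INTO #items (Category, Subcategory, Item)
VALUES ('Foo', 'Apple',      'i1'),
       ('Foo', 'Apple',      'i2'),
       ('Foo', 'Apple',      'i3'),
       ('Foo', 'Pear',       'i4'),
       ('Foo', 'Pear',       'i5'),
       ('Bar', 'Blackberry', 'i6'),
       ('Bar', 'Blueberry',  'i7'),
       ('Bar', 'Blueberry',  'i8');

-- Rank subcategories within each category by item count, then keep rank 1
SELECT Category, Subcategory
FROM (
    SELECT Category,
           Subcategory,
           RANK() OVER (PARTITION BY Category ORDER BY COUNT(*) DESC) AS rn
    FROM #items
    GROUP BY Category, Subcategory
) AS t
WHERE rn = 1;

DROP TABLE #items;
```

On this data the query should return Foo/Apple and Bar/Blueberry. Because RANK() gives tied rows the same rank, a category whose top subcategories have equal counts would return all of them.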
You can do it with the FIRST_VALUE() window function:
    SELECT DISTINCT Category,
           FIRST_VALUE(Subcategory) OVER (PARTITION BY Category ORDER BY COUNT(*) DESC) Subcategory
    FROM tablename
    GROUP BY Category, Subcategory;
See the demo.
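Unlike the RANK() version, FIRST_VALUE() yields one subcategory per category, and when two subcategories have the same count the ORDER BY does not say which one comes first. The sketch below adds a secondary sort key so the pick is deterministic. The tie-breaker choice (alphabetical by Subcategory), the `src` CTE, and the extra row `i9` are assumptions added here to create a tie; they are not part of the original answer or data.

```sql
WITH src (Category, Subcategory, Item) AS (
    SELECT * FROM (VALUES
        ('Foo', 'Apple',      'i1'),
        ('Foo', 'Apple',      'i2'),
        ('Foo', 'Apple',      'i3'),
        ('Foo', 'Pear',       'i4'),
        ('Foo', 'Pear',       'i5'),
        ('Bar', 'Blackberry', 'i6'),
        ('Bar', 'Blackberry', 'i9'),  -- hypothetical extra row: Bar now has a 2-2 tie
        ('Bar', 'Blueberry',  'i7'),
        ('Bar', 'Blueberry',  'i8')
    ) AS v (Category, Subcategory, Item)
)
SELECT DISTINCT
       Category,
       FIRST_VALUE(Subcategory) OVER (
           PARTITION BY Category
           ORDER BY COUNT(*) DESC, Subcategory  -- second key breaks ties alphabetically
       ) AS Subcategory
FROM src
GROUP BY Category, Subcategory;
```

With the tie-breaker in place, Bar should resolve to Blackberry and Foo to Apple. If all tied subcategories are wanted instead, the RANK() approach is the better fit.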