Limit results of each group
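I want to limit the records in each group so that, when I aggregate them into a JSON object in the select statement, it only takes the N conversations with the highest count.
Any ideas?
My query: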
select
    dt.id as app_id,
    json_build_object(
        'rows', array_agg(
            json_build_object(
                'url', dt.started_at_url,
                'count', dt.count
            )
        )
    ) as data
from (
    select a.id, c.started_at_url, count(c.id)
    from apps a
    left join conversations c on c.app_id = a.id
    where started_at_url is not null and c.started_at::date > (current_date - (7 || ' days')::interval)::date
    group by a.id, c.started_at_url
    order by count desc
) as dt
where dt.id = 'ASnYW1-RgCl0I'
group by dt.id
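Your problem is similar to the groupwise-max problem, and there are many ways to solve it.
Filtering on the row_number window function
A simple approach is to use the row_number() window function and keep only the rows whose row number is <= N (using 5 as an example):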
select
    dt.id as app_id,
    json_build_object(
        'rows', array_agg(
            json_build_object(
                'url', dt.started_at_url,
                'count', dt.count
            )
        )
    ) as data
from (
    select
        a.id, c.started_at_url,
        count(c.id) as count,
        row_number() over(partition by a.id order by count(c.id) desc) as rn
    from apps a
    left join conversations c on c.app_id = a.id
    where started_at_url is not null and c.started_at > (current_date - (7 || ' days')::interval)::date
    group by a.id, c.started_at_url
    order by count desc
) as dt
where
    dt.id = 'ASnYW1-RgCl0I'
    and dt.rn <= 5 /* get top 5 only */
group by dt.id
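Since no sample dataset was provided, here is a minimal sketch for trying the row_number() filter on made-up data. It uses Python's built-in sqlite3 module instead of PostgreSQL (an assumption made only so the example is self-contained; window functions need SQLite 3.25 or newer), and the table contents, the app ids and TOP_N are all hypothetical. The JSON aggregation is left out because json_build_object is PostgreSQL-specific.

```python
import sqlite3

# Hypothetical sample data: one (app_id, started_at_url) pair per conversation.
rows = [
    ("app1", "/home"), ("app1", "/home"), ("app1", "/home"),
    ("app1", "/pricing"), ("app1", "/pricing"),
    ("app1", "/docs"),
    ("app2", "/blog"), ("app2", "/blog"),
    ("app2", "/home"),
]

conn = sqlite3.connect(":memory:")
conn.execute("create table conversations (app_id text, started_at_url text)")
conn.executemany("insert into conversations values (?, ?)", rows)

TOP_N = 2  # keep the 2 most frequent URLs per app

query = """
with counts as (
    -- one row per (app, url) with its conversation count
    select app_id, started_at_url, count(*) as cnt
    from conversations
    group by app_id, started_at_url
),
ranked as (
    -- number the urls inside each app, highest count first
    select app_id, started_at_url, cnt,
           row_number() over (partition by app_id order by cnt desc) as rn
    from counts
)
select app_id, started_at_url, cnt
from ranked
where rn <= ?
order by app_id, cnt desc
"""

top_urls = conn.execute(query, (TOP_N,)).fetchall()
for row in top_urls:
    print(row)
```

The counting and the ranking are split into two CTEs here purely for readability; the filter on rn is the same idea as in the query above. Note that row_number() breaks ties arbitrarily, so with equal counts at the cut-off the chosen rows may vary.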
Using LATERAL
Another option is to use LATERAL with LIMIT to return only the results you are interested in:
select
    a.id as app_id,
    json_build_object(
        'rows', array_agg(
            json_build_object(
                'url', dt.started_at_url,
                'count', dt.count
            )
        )
    ) as data
from
    apps a, lateral(
        select
            c.started_at_url,
            count(*) as count
        from
            conversations c
        where
            c.app_id = a.id /* here is why lateral is necessary */
            and c.started_at_url is not null
            and c.started_at > (current_date - (7 || ' days')::interval)::date
        group by
            c.started_at_url
        order by
            count(*) desc
        limit 5 /* get top 5 only */
    ) as dt
where
    a.id = 'ASnYW1-RgCl0I'
group by
    a.id
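OBS: I haven't tried it, so there may be typos. If you would like some testing, you can provide a sample dataset.
OBS 2: If you really are filtering by app_id in the final query, then you don't even need that GROUP BY clause.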
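To make the LATERAL idea concrete: conceptually, the subquery with LIMIT is evaluated once for each row of apps. SQLite has no LATERAL, so the sketch below only emulates that behaviour by running the limited query once per app from Python's sqlite3 module. It is an illustration of the semantics under that assumption, not a replacement for the PostgreSQL query, and the tables, ids and TOP_N are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table apps (id text primary key)")
conn.execute("create table conversations (app_id text, started_at_url text)")

# Hypothetical sample data; app3 has no conversations at all.
conn.executemany("insert into apps values (?)", [("app1",), ("app2",), ("app3",)])
conn.executemany(
    "insert into conversations values (?, ?)",
    [
        ("app1", "/home"), ("app1", "/home"), ("app1", "/home"),
        ("app1", "/pricing"), ("app1", "/pricing"),
        ("app1", "/docs"),
        ("app2", "/blog"), ("app2", "/blog"),
        ("app2", "/home"),
    ],
)

TOP_N = 2

# Plays the role of the lateral subquery: it depends on the current app id.
per_app_query = """
select started_at_url, count(*) as cnt
from conversations
where app_id = ? and started_at_url is not null
group by started_at_url
order by cnt desc
limit ?
"""

result = {}
app_ids = [r[0] for r in conn.execute("select id from apps order by id")]
for app_id in app_ids:
    top = conn.execute(per_app_query, (app_id, TOP_N)).fetchall()
    # "apps a, lateral (...)" is a cross join, so an app whose subquery
    # returns no rows disappears from the output; mirror that here.
    if top:
        result[app_id] = [{"url": url, "count": cnt} for url, cnt in top]

print(result)
```

In PostgreSQL, if apps without matching conversations should still appear, `left join lateral (...) on true` is the usual alternative to the comma join.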