PostgreSQL: getting daily, weekly, and monthly averages of the occurrences of an event in one query

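Currently I have a fairly large query that consists of:

  1. Aggregating the daily, weekly, and monthly counts into intermediate tables by taking the count() of events grouped by event name and date.
  2. Selecting the average count from each intermediate table via avg() grouped by event, then taking the union of the results. Because I want a separate column each for daily, weekly, and monthly, I put the filler value 0 into the otherwise empty columns.
  3. Summing all the columns, where the 0s essentially act as no-ops, which leaves just one value per event.

The query is big, and I feel like I'm doing a lot of repetitive work. Is there a way to write this query better or make it smaller? I haven't really written queries like this before, so I'm not sure.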

WITH monthly_counts as (
  SELECT
    event,
    count(*) as count
  FROM tracking_stuff
  WHERE
    event = 'thing'
    OR event = 'thing2'
    OR event = 'thing3'
  GROUP BY event, date_trunc('month', created_at)
),
weekly_counts as (
  SELECT
    event,
    count(*) as count
  FROM tracking_stuff
  WHERE
    event = 'thing'
    OR event = 'thing2'
    OR event = 'thing3'
  GROUP BY event, date_trunc('week', created_at)
),
daily_counts as (
  SELECT
    event,
    count(*) as count
  FROM tracking_stuff
  WHERE
    event = 'thing'
    OR event = 'thing2'
    OR event = 'thing3'
  GROUP BY event, date_trunc('day', created_at)
),
query as (
  SELECT
    event,
    0 as daily_avg,
    0 as weekly_avg,
    avg(count) as monthly_avg
  FROM monthly_counts
  GROUP BY event
  UNION
  SELECT
    event,
    0 as daily_avg,
    avg(count) as weekly_avg,
    0 as monthly_avg
  FROM weekly_counts
  GROUP BY event
  UNION
  SELECT
    event,
    avg(count) as daily_avg,
    0 as weekly_avg,
    0 as monthly_avg
  FROM daily_counts
  GROUP BY event
)
SELECT
  event,
  sum(daily_avg) as daily_avg,
  sum(weekly_avg) as weekly_avg,
  sum(monthly_avg) as monthly_avg
FROM query
GROUP BY event;

I would write the query this way:

select event, daily_avg, weekly_avg, monthly_avg
from (
    select event, avg(count) monthly_avg
    from (
        select event, count(*)
        from tracking_stuff
        where event in ('thing1', 'thing2', 'thing3')
        group by event, date_trunc('month', created_at)
    ) s
    group by 1
) monthly
join (
    select event, avg(count) weekly_avg
    from (
        select event, count(*)
        from tracking_stuff
        where event in ('thing1', 'thing2', 'thing3')
        group by event, date_trunc('week', created_at)
    ) s
    group by 1
) weekly using(event)
join (
    select event, avg(count) daily_avg
    from (
        select event, count(*)
        from tracking_stuff
        where event in ('thing1', 'thing2', 'thing3')
        group by event, date_trunc('day', created_at)
    ) s
    group by 1
) daily using(event)
order by 1;

If the where condition eliminates a significant part of the data (say, more than half), using a CTE can slightly speed up query execution:

with the_data as (
    select event, created_at
    from tracking_stuff
    where event in ('thing1', 'thing2', 'thing3')
    )

select event, daily_avg, weekly_avg, monthly_avg
from (
    select event, avg(count) monthly_avg
    from (
        select event, count(*)
        from the_data
        group by event, date_trunc('month', created_at)
    ) s
    group by 1
) monthly
--  etc ... 
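
For reference, here is a minimal sketch of how the CTE version might look when written out in full. It assumes the weekly and daily branches mirror the monthly one, as in the query without the CTE. The `as materialized` keyword is an addition that needs PostgreSQL 12 or later, where a CTE referenced only once may otherwise be inlined into the outer query; `the_data` is referenced three times here, so it should be materialized by default anyway, and the keyword only makes that explicit. The `tracking_stuff` table is assumed to exist as in the question.

```sql
-- Sketch only: "as materialized" requires PostgreSQL 12+; drop it on older versions.
with the_data as materialized (
    -- Filter once, so each aggregation below scans only the matching rows.
    select event, created_at
    from tracking_stuff
    where event in ('thing1', 'thing2', 'thing3')
)
select event, daily_avg, weekly_avg, monthly_avg
from (
    -- Average of the per-month counts for each event.
    select event, avg(cnt) as monthly_avg
    from (
        select event, count(*) as cnt
        from the_data
        group by event, date_trunc('month', created_at)
    ) s
    group by event
) monthly
join (
    -- Average of the per-week counts for each event.
    select event, avg(cnt) as weekly_avg
    from (
        select event, count(*) as cnt
        from the_data
        group by event, date_trunc('week', created_at)
    ) s
    group by event
) weekly using (event)
join (
    -- Average of the per-day counts for each event.
    select event, avg(cnt) as daily_avg
    from (
        select event, count(*) as cnt
        from the_data
        group by event, date_trunc('day', created_at)
    ) s
    group by event
) daily using (event)
order by event;
```

Whether this is actually faster depends on how selective the filter is, so it is worth timing both forms on your own data.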

Out of curiosity, I ran a test on some data:

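-- random_int() is not built in; assumed helper returning an integer in 1..$1
create function random_int(int) returns int language sql as
    'select floor(random() * $1 + 1)::int';
create table tracking_stuff (event text, created_at timestamp);
insert into tracking_stuff
    select 'thing' || random_int(9), '2016-01-01'::date + random_int(365)
    from generate_series(1, 1000000);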

In each query I replaced thing with thing1, so the queries eliminate about 2/3 of the rows.

Average execution time over 10 tests:

Original query          1106 ms
My query without cte    1077 ms
My query with cte        902 ms
Clodoaldo's query       5187 ms

In 9.5+, use grouping sets. From the PostgreSQL documentation:

The data selected by the FROM and WHERE clauses is grouped separately by each specified grouping set, aggregates computed for each group just as for simple GROUP BY clauses, and then the results returned.

select event,
    avg(total) filter (where day is not null) as avg_day,
    avg(total) filter (where week is not null) as avg_week,
    avg(total) filter (where month is not null) as avg_month    
from (
    select
        event,
        date_trunc('day', created_at) as day,
        date_trunc('week', created_at) as week,
        date_trunc('month', created_at) as month,
        count(*) as total
    from tracking_stuff
    where event in ('thing','thing2','thing3')
    group by grouping sets ((event, 2), (event, 3), (event, 4))
) s
group by event
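
To see why the `filter (where ... is not null)` clauses pick out the right rows, it may help to look at what the inner query returns on a tiny data set. The sketch below uses made-up inline rows in place of the real `tracking_stuff` table and only two grouping sets, and it spells out the grouping expressions instead of using column positions.

```sql
-- Toy data (assumed values): three events on two days of the same week.
with tracking_stuff (event, created_at) as (
    values
        ('thing', timestamp '2016-01-04 10:00'),
        ('thing', timestamp '2016-01-04 11:00'),
        ('thing', timestamp '2016-01-05 09:00')
)
select
    event,
    date_trunc('day', created_at) as day,
    date_trunc('week', created_at) as week,
    count(*) as total
from tracking_stuff
group by grouping sets (
    (event, date_trunc('day', created_at)),   -- one row per event and day
    (event, date_trunc('week', created_at))   -- one row per event and week
)
order by day, week;
```

The day grouping set contributes two rows (totals 2 and 1) with `week` set to NULL, and the week grouping set contributes one row (total 3) with `day` set to NULL. Averaging `total` over the rows where `day is not null` therefore gives the daily average (1.5), and likewise for the week (3). This relies on `created_at` never being NULL; if it can be, the `grouping()` function is a more robust way to tell the grouping sets apart.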

To learn more about grouping sets, consider these tutorials: one, two