How to fill the time gaps after grouping date records by month in Postgres

I have a table with the following records:

date                n_count
2020-02-19 00:00:00  4
2020-07-14 00:00:00  1
2020-07-17 00:00:00  1
2020-07-30 00:00:00  2
2020-08-03 00:00:00  1
2020-08-04 00:00:00  2
2020-08-25 00:00:00  2
2020-09-23 00:00:00  2
2020-09-30 00:00:00  3
2020-10-01 00:00:00  11
2020-10-05 00:00:00  12
2020-10-19 00:00:00  1
2020-10-20 00:00:00  1
2020-10-22 00:00:00  1
2020-11-02 00:00:00  376
2020-11-04 00:00:00  72
2020-11-11 00:00:00  1

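I want to group all records by month to get the sum for each month. That works, but months with no records are missing from the result. How can I fill these gaps?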

time           month_count
"2020-02-01"    4
"2020-07-01"    4
"2020-08-01"    5
"2020-09-01"    5
"2020-10-01"    26
"2020-11-01"    449

This is what I tried:

SELECT (date_trunc('month', date))::date AS time,
       sum(n_count) as month_count      
FROM table1
group by time
order by time asc

You can use generate_series() to generate the start of every month between the earliest and latest dates in the table, then left join the table against that series:

select d.dt, coalesce(sum(t.n_count), 0) as month_count      
from (
    select generate_series(date_trunc('month', min(date)), date_trunc('month', max(date)), '1 month') as dt 
    from table1
) as d(dt)
left join table1 t on t.date >= d.dt and t.date < d.dt + interval '1 month'
group by d.dt
order by d.dt
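
To try the left join approach without creating a table, here is a minimal sketch that runs on its own. The `sample` CTE is a hypothetical stand-in for table1 and holds only the February and July rows from the question; the `months` CTE name is also just for illustration.

```sql
-- Hypothetical inline sample standing in for table1 (a subset of the question's rows).
with sample(date, n_count) as (
    values
        (timestamp '2020-02-19', 4),
        (timestamp '2020-07-14', 1),
        (timestamp '2020-07-17', 1),
        (timestamp '2020-07-30', 2)
),
months as (
    -- One row per month start between the first and last month in the sample.
    select generate_series(
               date_trunc('month', min(date)),
               date_trunc('month', max(date)),
               interval '1 month'
           ) as dt
    from sample
)
select m.dt::date as time,
       coalesce(sum(s.n_count), 0) as month_count  -- months with no match get 0
from months m
left join sample s
       on s.date >= m.dt
      and s.date <  m.dt + interval '1 month'
group by m.dt
order by m.dt;
```

With this sample, February and July should each sum to 4, and March through June should appear with a month_count of 0 instead of being skipped.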

I would simply UNION in a date series generated from the MIN and MAX dates:

demo:db<>fiddle

WITH cte AS (                                      -- 1
    SELECT
        *,
        date_trunc('month', date)::date AS time
    FROM
        t
)
SELECT 
    time,
    SUM(n_count) as month_count                    --3
FROM (
    SELECT
        time,
        n_count
    FROM cte

    UNION ALL                                     -- ALL keeps repeated (time, n_count) rows

    SELECT                                        -- 2
        generate_series(
            (SELECT MIN(time) FROM cte),
            (SELECT MAX(time) FROM cte),
            interval '1 month'
        )::date,
        0
) s
GROUP BY time
ORDER BY time
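  1. The CTE is used so that date_trunc is computed only once. You could omit it if you don't mind querying your table twice in the UNION.
  2. Generate a monthly date series from the MIN to the MAX date, giving each month an n_count value of 0, and add it to the table rows.
  3. Group by month and sum.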
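
One thing to watch with this approach: a plain UNION removes duplicate rows, so two days in the same month that share the same n_count collapse into a single row before the SUM runs, while UNION ALL keeps them. The sketch below compares the two on hypothetical inline rows modeled on July from the question (daily counts 1, 1 and 2); the column aliases are made up for illustration.

```sql
-- Hypothetical inline rows: three July days already truncated to the month,
-- two of which have the same n_count.
with cte(time, n_count) as (
    values
        (date '2020-07-01', 1),
        (date '2020-07-01', 1),
        (date '2020-07-01', 2)
)
select
    (select sum(n_count) from (
        select time, n_count from cte
        union                       -- the two (2020-07-01, 1) rows collapse into one
        select date '2020-07-01', 0
    ) u) as sum_with_union,
    (select sum(n_count) from (
        select time, n_count from cte
        union all                   -- every row is kept
        select date '2020-07-01', 0
    ) a) as sum_with_union_all;
```

Here sum_with_union comes out as 3 and sum_with_union_all as 4, which matches the July total from the question.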