How to get the maximum number of concurrent events in PostgreSQL?

I have a table named events that looks like this:

id: int
source_id: int
start_datetime: timestamp
end_datetime: timestamp  

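These events may overlap, and I want to know the maximum number of events that overlap at the same time within a given period. For example, in a situation like this: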

id | source_id | start_datetime     | end_datetime
----------------------------------------------------------
1  | 23        | 2017-1-1T10:20:00  | 2017-1-1T10:40:00
1  | 42        | 2017-1-1T10:30:00  | 2017-1-1T10:35:00
1  | 11        | 2017-1-1T10:37:00  | 2017-1-1T10:50:00  

The answer is 2, because at most 2 events overlap, from 10:30 until 10:35.
I am using Postgres 9.6.
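
In case it helps anyone reproduce this, here is a rough setup sketch. The column types come from the description above; the id values are copied exactly as shown in the sample (all 1), so no primary key is declared here, which is an assumption on my part.

```sql
-- Minimal reproduction of the table described above.
-- No primary key: the sample shows id = 1 on every row.
create table events (
    id             int,
    source_id      int,
    start_datetime timestamp,
    end_datetime   timestamp
);

insert into events (id, source_id, start_datetime, end_datetime) values
    (1, 23, timestamp '2017-01-01 10:20:00', timestamp '2017-01-01 10:40:00'),
    (1, 42, timestamp '2017-01-01 10:30:00', timestamp '2017-01-01 10:35:00'),
    (1, 11, timestamp '2017-01-01 10:37:00', timestamp '2017-01-01 10:50:00');
```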

I'm not entirely sure how the id and source_id columns should be handled, but based on your description, maybe something like this:

select e1.source_id, 
       count(distinct e2.source_id) as overlap_count, 
       array_agg(e2.source_id) as overlap_events
from events e1
  join events e2 
    on e1.source_id <> e2.source_id
   and (e1.start_datetime, e1.end_datetime) overlaps (e2.start_datetime, e2.end_datetime) 
group by e1.source_id
order by overlap_count desc;

With your sample data, this returns:

source_id | overlap_count | overlap_events
----------+---------------+---------------
       23 |             2 | {42,11}       
       11 |             1 | {23}          
       42 |             1 | {23}          

To get only the row with the highest count, add limit 1 to the query.
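
As a rough sketch of what that looks like, the query below applies limit 1 to the first query. So that it runs on its own, the sample rows are inlined in a CTE called sample_events, a hypothetical name used only for this illustration; against the real table you would keep events.

```sql
-- sample_events is a stand-in for the real events table (illustration only).
with sample_events (source_id, start_datetime, end_datetime) as (
    values (23, timestamp '2017-01-01 10:20:00', timestamp '2017-01-01 10:40:00'),
           (42, timestamp '2017-01-01 10:30:00', timestamp '2017-01-01 10:35:00'),
           (11, timestamp '2017-01-01 10:37:00', timestamp '2017-01-01 10:50:00')
)
select e1.source_id,
       count(distinct e2.source_id) as overlap_count
from sample_events e1
  join sample_events e2
    on e1.source_id <> e2.source_id
   and (e1.start_datetime, e1.end_datetime) overlaps (e2.start_datetime, e2.end_datetime)
group by e1.source_id
order by overlap_count desc
limit 1;   -- keep only the event with the most overlapping events
-- source_id | overlap_count
--        23 |             2
```

If several events tie for the highest count, limit 1 returns just one of them, and which one is not defined unless you add a tie-breaker to the order by.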

Another (possibly slower) option, if you need the complete rows from the events table:

select e1.id, e1.source_id, e1.start_datetime, e1.end_datetime, 
       (select count(*)
        from events e2
        where e2.source_id <> e1.source_id
          and (e1.start_datetime, e1.end_datetime) overlaps (e2.start_datetime, e2.end_datetime)
       )  as overlap_count
from events e1
order by overlap_count desc;

Another option is to use range types and the && operator instead of overlaps:

select e1.source_id, 
       count(distinct e2.source_id) as overlap_count, 
       array_agg(e2.source_id) as overlap_events
from events e1
  join events e2 on e1.source_id <> e2.source_id
             and tsrange(e1.start_datetime, e1.end_datetime,'[]') && tsrange(e2.start_datetime, e2.end_datetime, '[]') 
group by e1.source_id
order by overlap_count desc;
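
One thing to be aware of: the two approaches can disagree when events only touch at an endpoint. overlaps treats each period as half-open (start <= time < end), whereas a range built with '[]' includes both ends. The sketch below compares the two on a pair of made-up periods that share only the instant 10:30; the timestamps are illustrative and not taken from the question.

```sql
-- Two periods that share only the endpoint 10:30.
select
    -- overlaps: half-open periods, so a shared endpoint does not count
    (timestamp '2017-01-01 10:00:00', timestamp '2017-01-01 10:30:00')
        overlaps
    (timestamp '2017-01-01 10:30:00', timestamp '2017-01-01 11:00:00')
        as with_overlaps,           -- false
    -- '[]' ranges include both bounds, so the shared endpoint counts
    tsrange(timestamp '2017-01-01 10:00:00', timestamp '2017-01-01 10:30:00', '[]')
        && tsrange(timestamp '2017-01-01 10:30:00', timestamp '2017-01-01 11:00:00', '[]')
        as with_closed_ranges,      -- true
    -- '[)' ranges exclude the upper bound, matching overlaps here
    tsrange(timestamp '2017-01-01 10:00:00', timestamp '2017-01-01 10:30:00', '[)')
        && tsrange(timestamp '2017-01-01 10:30:00', timestamp '2017-01-01 11:00:00', '[)')
        as with_half_open_ranges;   -- false
```

Which behaviour is right depends on whether an event ending at 10:30 should count as concurrent with one starting at 10:30; if not, '[)' is probably the closer match to the overlaps queries above.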

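The idea is as follows: count each start as +1 and each end as -1. The running sum of these values, ordered by time, gives the net number of events in progress at each point in time. The rest is just aggregation: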

with e as (
      select start_datetime as dte, 1 as inc
      from events
      union all
      select end_datetime as dte, -1 as inc
      from events
     )
select max(concurrent)
from (select dte, sum(sum(inc)) over (order by dte) as concurrent
      from e
      group by dte
     ) e;

The subquery shows the number of overlapping events at each point in time.
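
To make that concrete, here is a sketch that runs just the inner subquery against the sample rows from the question. The rows are inlined in a CTE named sample_events (a hypothetical name, used only so the query is runnable without the real table).

```sql
-- sample_events stands in for the events table (illustration only).
with sample_events (source_id, start_datetime, end_datetime) as (
    values (23, timestamp '2017-01-01 10:20:00', timestamp '2017-01-01 10:40:00'),
           (42, timestamp '2017-01-01 10:30:00', timestamp '2017-01-01 10:35:00'),
           (11, timestamp '2017-01-01 10:37:00', timestamp '2017-01-01 10:50:00')
),
e as (
    select start_datetime as dte,  1 as inc from sample_events   -- an event starts
    union all
    select end_datetime   as dte, -1 as inc from sample_events   -- an event ends
)
select dte,
       sum(sum(inc)) over (order by dte) as concurrent   -- running total of the per-time net change
from e
group by dte
order by dte;
-- dte                 | concurrent
-- 2017-01-01 10:20:00 |          1
-- 2017-01-01 10:30:00 |          2
-- 2017-01-01 10:35:00 |          1
-- 2017-01-01 10:37:00 |          2
-- 2017-01-01 10:40:00 |          1
-- 2017-01-01 10:50:00 |          0
```

The maximum of the concurrent column is 2, which matches the expected answer. Because of the group by dte, an end and a start at exactly the same timestamp cancel out, so back-to-back events are not counted as concurrent.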

You can also get the time range during which the maximum occurs:

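-- "e" is the same start/end CTE as in the previous query
with e as (
      select start_datetime as dte, 1 as inc
      from events
      union all
      select end_datetime as dte, -1 as inc
      from events
     )
select dte, next_dte, concurrent
from (select dte, sum(sum(inc)) over (order by dte) as concurrent,
             lead(dte) over (order by dte) as next_dte  -- next change point in time order
      from e
      group by dte
     ) e
order by concurrent desc
fetch first 1 row only;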