Merge Datetime Ranges in Oracle SQL or PL/SQL

I have been struggling to merge datetime ranges in Oracle SQL or PL/SQL (Database Standard Edition 11gR2).

I am trying to merge datetime ranges so that the following data

order_id    start_date_time         end_date_time
3933        04/02/2020 08:00:00     04/02/2020 12:00:00
3933        04/02/2020 13:30:00     04/02/2020 17:00:00
3933        04/02/2020 14:00:00     04/02/2020 19:00:00
3933        05/02/2020 13:40:12     05/02/2020 14:34:48
3933        05/02/2020 14:00:00     05/02/2020 18:55:12
3933        05/02/2020 14:49:48     05/02/2020 15:04:48
3933        06/02/2020 08:00:00     06/02/2020 12:00:00
3933        06/02/2020 13:30:00     06/02/2020 17:00:00
3933        06/02/2020 14:10:12     06/02/2020 18:49:48
3933        07/02/2020 08:00:00     07/02/2020 10:30:00
3933        07/02/2020 08:00:00     07/02/2020 12:00:00
3933        07/02/2020 13:30:00     07/02/2020 17:00:00
11919       14/05/2020 09:00:00     14/05/2020 17:00:00
11919       14/05/2020 09:00:00     14/05/2020 17:00:00
11919       14/05/2020 15:00:00     14/05/2020 16:30:00
11919       15/05/2020 08:40:12     15/05/2020 16:30:00
11919       15/05/2020 09:40:12     15/05/2020 16:30:00
11919       15/05/2020 10:15:00     15/05/2020 12:15:00
11919       15/05/2020 13:19:48     15/05/2020 16:00:00
11919       18/05/2020 08:49:48     18/05/2020 09:45:00
11919       18/05/2020 10:00:00     18/05/2020 17:00:00
11919       18/05/2020 10:00:00     18/05/2020 16:58:12
11919       18/05/2020 15:34:48     18/05/2020 16:10:12
11919       18/05/2020 16:30:00     18/05/2020 16:45:00
...         ...                     ...

is transformed into the following result set

--after merge (this is the result I am seeking)
order_id    start_date_time         end_date_time
3933        04/02/2020 08:00:00     04/02/2020 12:00:00
3933        04/02/2020 13:30:00     04/02/2020 19:00:00
3933        05/02/2020 13:40:12     05/02/2020 18:55:12
3933        06/02/2020 08:00:00     06/02/2020 12:00:00
3933        06/02/2020 13:30:00     06/02/2020 18:49:48
3933        07/02/2020 08:00:00     07/02/2020 12:00:00
3933        07/02/2020 13:30:00     07/02/2020 17:00:00
11919       14/05/2020 09:00:00     14/05/2020 17:00:00
11919       15/05/2020 08:40:12     15/05/2020 16:30:00
11919       18/05/2020 08:49:48     18/05/2020 17:00:00
...         ...                     ...

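The format of start_date_time and end_date_time is DD/MM/YYYY HH24:MI:SS (day/month/year).

Any suggestion or solution on how to do this merge in Oracle SQL or PL/SQL?

I thought this would be a trivial problem, but I have not been able to find a solution on the internet.

Thanks in advance.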

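Adapted from a linked answer, which includes an explanation of the code. The only changes are adding PARTITION BY order_id, so that the date ranges are computed separately for each order_id, and returning the ranges themselves (rather than computing a total value, as the linked answer does):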

SELECT order_id,
       start_date_time,
       end_date_time
FROM   (
  SELECT order_id,
         LAG( dt ) OVER ( PARTITION BY order_id ORDER BY dt ) AS start_date_time,
         dt AS end_date_time,
         start_end
  FROM   (
    SELECT order_id,
           dt,
           CASE SUM( value ) OVER ( PARTITION BY order_id ORDER BY dt ASC, value DESC, ROWNUM ) * value
             WHEN 1 THEN 'start'
             WHEN 0 THEN 'end'
           END AS start_end
    FROM   table_name
    UNPIVOT ( dt FOR value IN ( start_date_time AS 1, end_date_time AS -1 ) )
  )
  WHERE start_end IS NOT NULL
)
WHERE  start_end = 'end';
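
To make the counting logic in that query easier to follow, here is a minimal Python sketch of the same idea outside the database. It is an illustration only, not the answer's code: the function name merge_by_event_count and the sample rows are assumptions made for this example. Each row becomes a +1 event at its start and a -1 event at its end; a merged range opens when the running count rises to 1 and closes when it falls back to 0.

```python
from datetime import datetime


def merge_by_event_count(rows):
    """Merge overlapping (order_id, start, end) rows using a running event counter."""
    events = []
    for order_id, start, end in rows:
        events.append((order_id, start, +1))  # plays the role of start_date_time AS 1
        events.append((order_id, end, -1))    # plays the role of end_date_time AS -1
    # Order by order_id, then time; on equal times, starts (+1) come before ends (-1),
    # so ranges that merely touch are merged rather than split.
    events.sort(key=lambda e: (e[0], e[1], -e[2]))

    merged = []
    open_count = 0      # returns to 0 at the end of every order, so no reset is needed
    range_start = None
    for order_id, dt, value in events:
        open_count += value
        if value == 1 and open_count == 1:
            range_start = dt                            # first open range: a 'start'
        elif value == -1 and open_count == 0:
            merged.append((order_id, range_start, dt))  # last range closed: an 'end'
    return merged


# Hypothetical sample: the three rows for order 3933 on 04/02/2020.
sample = [
    (3933, datetime(2020, 2, 4, 8, 0), datetime(2020, 2, 4, 12, 0)),
    (3933, datetime(2020, 2, 4, 13, 30), datetime(2020, 2, 4, 17, 0)),
    (3933, datetime(2020, 2, 4, 14, 0), datetime(2020, 2, 4, 19, 0)),
]
for row in merge_by_event_count(sample):
    print(row)
```

In the SQL version, the running SUM plays the role of open_count, and the LAG call pairs each 'end' with the 'start' that precedes it.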

From Oracle 12c onward, you can use MATCH_RECOGNIZE for row-by-row processing:

SELECT *
FROM   table_name
MATCH_RECOGNIZE(
  PARTITION BY order_id
  ORDER     BY start_date_time
  MEASURES
    FIRST(start_date_time) AS start_date_time,
    MAX(end_date_time)     AS end_date_time
  ONE ROW PER MATCH
  PATTERN (overlapping_rows* last_row)
  DEFINE
    overlapping_rows AS NEXT(start_date_time) <= MAX(end_date_time)
);
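
For readers without access to Oracle 12c, the row-by-row pass that this pattern describes can be sketched in Python. This is a rough analogue under stated assumptions, not a translation of how Oracle evaluates the clause: the function name merge_row_by_row and the sample rows are made up for the example. Rows are visited in start order, and a row is absorbed into the current match while its start is not later than the largest end seen so far in that match.

```python
from datetime import datetime


def merge_row_by_row(rows):
    """Merge (order_id, start, end) rows in one ordered pass, tracking the running max end."""
    merged = []
    # Sorting the tuples orders them by order_id, then start (then end).
    for order_id, start, end in sorted(rows):
        if merged and merged[-1][0] == order_id and start <= merged[-1][2]:
            # Same order and start <= running MAX(end): extend the current match.
            prev_id, prev_start, prev_end = merged[-1]
            merged[-1] = (prev_id, prev_start, max(prev_end, end))
        else:
            # New order, or a gap before this row: begin a new match.
            merged.append((order_id, start, end))
    return merged


# Hypothetical sample: the five rows for order 11919 on 18/05/2020.
sample = [
    (11919, datetime(2020, 5, 18, 8, 49, 48), datetime(2020, 5, 18, 9, 45)),
    (11919, datetime(2020, 5, 18, 10, 0), datetime(2020, 5, 18, 17, 0)),
    (11919, datetime(2020, 5, 18, 10, 0), datetime(2020, 5, 18, 16, 58, 12)),
    (11919, datetime(2020, 5, 18, 15, 34, 48), datetime(2020, 5, 18, 16, 10, 12)),
    (11919, datetime(2020, 5, 18, 16, 30), datetime(2020, 5, 18, 16, 45)),
]
for row in merge_row_by_row(sample):
    print(row)
```

Tracking the maximum end, rather than the previous row's end, is what lets a short range nested inside a longer one be absorbed without cutting the match short.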

So, for your test data:

CREATE TABLE table_name (
  order_id NUMBER,
  start_date_time DATE,
  end_date_time DATE
);

INSERT INTO table_name ( order_id, start_date_time, end_date_time )
SELECT 3933, TIMESTAMP '2020-02-04 08:00:00', TIMESTAMP '2020-02-04 12:00:00' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-04 13:30:00', TIMESTAMP '2020-02-04 17:00:00' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-04 14:00:00', TIMESTAMP '2020-02-04 19:00:00' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-05 13:40:12', TIMESTAMP '2020-02-05 14:34:48' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-05 14:00:00', TIMESTAMP '2020-02-05 18:55:12' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-05 14:49:48', TIMESTAMP '2020-02-05 15:04:48' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-06 08:00:00', TIMESTAMP '2020-02-06 12:00:00' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-06 13:30:00', TIMESTAMP '2020-02-06 17:00:00' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-06 14:10:12', TIMESTAMP '2020-02-06 18:49:48' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-07 08:00:00', TIMESTAMP '2020-02-07 10:30:00' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-07 08:00:00', TIMESTAMP '2020-02-07 12:00:00' FROM DUAL UNION ALL
SELECT 3933, TIMESTAMP '2020-02-07 13:30:00', TIMESTAMP '2020-02-07 17:00:00' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-14 09:00:00', TIMESTAMP '2020-05-14 17:00:00' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-14 09:00:00', TIMESTAMP '2020-05-14 17:00:00' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-14 15:00:00', TIMESTAMP '2020-05-14 16:30:00' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-15 08:40:12', TIMESTAMP '2020-05-15 16:30:00' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-15 09:40:12', TIMESTAMP '2020-05-15 16:30:00' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-15 10:15:00', TIMESTAMP '2020-05-15 12:15:00' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-15 13:19:48', TIMESTAMP '2020-05-15 16:00:00' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-18 08:49:48', TIMESTAMP '2020-05-18 09:45:00' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-18 10:00:00', TIMESTAMP '2020-05-18 17:00:00' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-18 10:00:00', TIMESTAMP '2020-05-18 16:58:12' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-18 15:34:48', TIMESTAMP '2020-05-18 16:10:12' FROM DUAL UNION ALL
SELECT 11919, TIMESTAMP '2020-05-18 16:30:00', TIMESTAMP '2020-05-18 16:45:00' FROM DUAL;

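Both queries output the following. Note that the two ranges for order 11919 on 18/05/2020 stay separate, because 08:49:48 to 09:45:00 does not overlap 10:00:00 to 17:00:00, whereas the expected result in the question shows them as a single row: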

ORDER_ID | START_DATE_TIME     | END_DATE_TIME      
-------: | :------------------ | :------------------
    3933 | 2020-02-04 08:00:00 | 2020-02-04 12:00:00
    3933 | 2020-02-04 13:30:00 | 2020-02-04 19:00:00
    3933 | 2020-02-05 13:40:12 | 2020-02-05 18:55:12
    3933 | 2020-02-06 08:00:00 | 2020-02-06 12:00:00
    3933 | 2020-02-06 13:30:00 | 2020-02-06 18:49:48
    3933 | 2020-02-07 08:00:00 | 2020-02-07 12:00:00
    3933 | 2020-02-07 13:30:00 | 2020-02-07 17:00:00
   11919 | 2020-05-14 09:00:00 | 2020-05-14 17:00:00
   11919 | 2020-05-15 08:40:12 | 2020-05-15 16:30:00
   11919 | 2020-05-18 08:49:48 | 2020-05-18 09:45:00
   11919 | 2020-05-18 10:00:00 | 2020-05-18 17:00:00


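The solution below uses a common technique known as the "start of group" method.

The idea is to order the intervals by start date (separately for each order_id) and assign them to groups as follows. For each interval, check whether its start time is strictly greater than the maximum of the end times of all preceding intervals. If it is, a new group begins. The rest is easy: just select the MIN start date and the MAX end date for each group.

Here is the approach implemented with analytic functions: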

with
  has_sog_flags (order_id, start_date_time, end_date_time, flag) as (
    select order_id, start_date_time, end_date_time,
           case when start_date_time >
                       max(end_date_time) over (
                         partition by order_id
                         order by start_date_time
                         rows between unbounded preceding and 1 preceding)
                then 1 end
    from   table_name
  )
, has_groups (order_id, start_date_time, end_date_time, grp) as (
    select order_id, start_date_time, end_date_time,
           sum(flag) over (partition by order_id order by start_date_time)
    from   has_sog_flags
  )
select order_id, min(start_date_time) as start_date_time, 
       max(end_date_time) as end_date_time
from   has_groups
group  by order_id, grp
order  by order_id, start_date_time
;

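An interesting question is how to handle open-ended intervals (for example, a null end_date_time meaning "open ended into the future"). The query can be adjusted relatively easily to cover such an extended problem statement.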
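
One possible way to handle that extension, sketched in Python rather than SQL, is to replace a missing end with a far-future sentinel before applying the start-of-group steps, and to turn the sentinel back into a missing value at the end. This is an assumption about how the adjustment could look, not the answer's own method; the names FAR_FUTURE and merge_start_of_group and the sample rows are hypothetical.

```python
from datetime import datetime

# Assumed sentinel standing in for "open ended into the future".
FAR_FUTURE = datetime.max


def merge_start_of_group(rows):
    """Start-of-group merge of (order_id, start, end) rows; end may be None (open-ended)."""
    # Replace None ends with the sentinel, then order by order_id and start.
    normalised = sorted(
        (order_id, start, FAR_FUTURE if end is None else end)
        for order_id, start, end in rows
    )

    groups = {}            # (order_id, grp) -> (min start, max end)
    prev_order = None
    running_max = None     # max end over all preceding rows of the same order
    grp = 0
    for order_id, start, end in normalised:
        if order_id != prev_order:
            prev_order, running_max, grp = order_id, None, 0
        elif start > running_max:
            grp += 1       # start-of-group flag: strictly after every earlier end
        running_max = end if running_max is None else max(running_max, end)
        lo, hi = groups.get((order_id, grp), (start, end))
        groups[(order_id, grp)] = (min(lo, start), max(hi, end))

    # Map the sentinel back to None in the output.
    return [
        (order_id, lo, None if hi == FAR_FUTURE else hi)
        for (order_id, _), (lo, hi) in groups.items()
    ]


# Hypothetical sample: order 1 has an open-ended row that swallows a later range.
sample = [
    (1, datetime(2020, 5, 14, 8, 0), datetime(2020, 5, 14, 12, 0)),
    (1, datetime(2020, 5, 14, 11, 0), None),
    (1, datetime(2020, 5, 14, 20, 0), datetime(2020, 5, 14, 21, 0)),
    (2, datetime(2020, 5, 14, 9, 0), datetime(2020, 5, 14, 10, 0)),
]
for row in merge_start_of_group(sample):
    print(row)
```

Because the sentinel is larger than any real start, every later interval of the same order falls into the open-ended group, which seems to be the natural reading of "open ended into the future".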