Oracle SQL Group Issue

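I'm trying to summarize an employee table that holds multiple records for the periods an employee spends in a team. I've tried GROUP BY, MIN/MAX with PARTITION BY, and LEAD/LAG on the team name, but every result has the same flaw: an agent who moved from one team to another and later returned to the original team is grouped as a single occurrence, even when I order by date.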

Sample data:

Employee Name | Employee ID | Team Leader | Location | Start Date | End Date

John Smith    | 123123      | Team A      | Site A   | 01/JAN/19  | 02/JAN/19

John Smith    | 123123      | Team A      | Site A   | 02/JAN/19  | 03/JAN/19

John Smith    | 123123      | Team B      | Site A   | 03/JAN/19  | 04/JAN/19

John Smith    | 123123      | Team A      | Site A   | 04/JAN/19  | 05/JAN/19

John Smith    | 123123      | Team B      | Site A   | 05/JAN/19  | 06/JAN/19

When I run a sample query:

SELECT
Employee Name
,Employee ID
,Team Leader
,Location
,MIN(Start Date) OVER(PARTITION BY Team Leader ORDER BY Employee ID, Start Date) AS Starting Date
,MAX(End Date) OVER(PARTITION BY Team Leader ORDER BY Employee ID, End Date) AS End Date
FROM TABLE 1

The result is:

Employee Name | Employee ID | Team Leader | Location | Start Date | End Date

John Smith    | 123123      | Team A      | Site A   | 01/JAN/19  | 05/JAN/19

John Smith    | 123123      | Team B      | Site A   | 03/JAN/19  | 06/JAN/19

Could you help me get the expected result:

Employee Name | Employee ID | Team Leader | Location | Start Date | End Date

John Smith    | 123123      | Team A      | Site A   | 01/JAN/19  | 03/JAN/19

John Smith    | 123123      | Team B      | Site A   | 03/JAN/19  | 04/JAN/19

John Smith    | 123123      | Team A      | Site A   | 04/JAN/19  | 05/JAN/19

John Smith    | 123123      | Team B      | Site A   | 05/JAN/19  | 06/JAN/19

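This looks like a form of gaps-and-islands, where records are linked by date range.

Here is one approach: it uses a left join to find where each island starts, then a cumulative sum to identify the groups, and finally aggregates: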

select employeename, employeeid, teamleader, location,
       min(startdate), max(enddate)
from (select t1.*,
             -- running count of island starts; columns are qualified with t1
             -- because both sides of the self-join expose the same names
             sum(case when tprev.employeeid is null  -- no adjoining previous row: new group
                      then 1 else 0
                 end) over (partition by t1.employeeid, t1.teamleader, t1.location
                            order by t1.startdate
                           ) as grp
      from table1 t1 left join
           table1 tprev
           on t1.startdate = tprev.enddate and
              t1.employeeid = tprev.employeeid and
              t1.teamleader = tprev.teamleader and
              t1.location = tprev.location
     ) t
-- every non-aggregated column in the select list must appear in GROUP BY
group by employeename, employeeid, teamleader, location, grp
order by employeeid, min(startdate);
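
For comparison, the same start-of-island flag can be computed with LAG instead of a self-join. This is only a sketch, not the query above: the simplified column names (ename, team, start_date, end_date) and the inline test data are assumptions, and the names flagged, grouped, is_start and grp are made up for illustration.

```sql
with test (ename, team, start_date, end_date) as
 (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
  select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
  select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
  select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
  select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
 ),
flagged as
 (select t.*,
         -- 0 when the previous row (by date) is the same team and ends
         -- exactly where this row starts; 1 marks the start of a new island
         case when lag(team)     over (partition by ename order by start_date) = team
               and lag(end_date) over (partition by ename order by start_date) = start_date
              then 0 else 1
         end as is_start
  from test t
 ),
grouped as
 (select f.*,
         -- running total of the start flags gives each island its own number
         sum(is_start) over (partition by ename order by start_date) as grp
  from flagged f
 )
select ename, team, min(start_date) start_date, max(end_date) end_date
from grouped
group by ename, team, grp
order by ename, start_date;
```

On the five sample rows the flags come out as 1, 0, 1, 1, 1, so the running sum yields four groups, one per stint.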

Here is one option:

  • the test CTE represents your data (slightly simplified)
  • the useful code starts at line 8

SQL> with test (ename, team, start_date, end_date) as
  2    (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
  3     select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
  4     select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
  5     select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
  6     select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
  7    ),
  8  temp as
  9    (select ename, team, start_date, end_date,
 10       row_number() over (partition by ename order by start_date) rn,
 11       row_number() over (partition by ename, team order by start_date) rna
 12     from test
 13    )
 14  select ename, team, min(start_date) start_date, max(end_date) end_date
 15  from temp
 16  group by ename, team, (rn - rna)
 17  order by 3;

ENAM T START_DATE  END_DATE
---- - ----------- -----------
John A 01/jan/2019 03/jan/2019
John B 03/jan/2019 04/jan/2019
John A 04/jan/2019 05/jan/2019
John B 05/jan/2019 06/jan/2019

SQL>
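
To see why grouping by (rn - rna) separates the stints, it can help to look at the two row numbers side by side. The query below is a sketch for inspection only: it reuses the test and temp CTEs from the answer, numbers rn per employee (an assumption that only matters once there is more than one employee), and the column name grp is made up here.

```sql
with test (ename, team, start_date, end_date) as
 (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
  select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
  select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
  select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
  select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
 ),
temp as
 (select ename, team, start_date, end_date,
         -- position of the row in the employee's whole history
         row_number() over (partition by ename order by start_date) rn,
         -- position of the row among the employee's rows for this team only
         row_number() over (partition by ename, team order by start_date) rna
  from test
 )
select ename, team, start_date, rn, rna, rn - rna as grp
from temp
order by ename, start_date;

-- For the five sample rows:
--   team A, 01-JAN: rn 1, rna 1, grp 0
--   team A, 02-JAN: rn 2, rna 2, grp 0
--   team B, 03-JAN: rn 3, rna 1, grp 2
--   team A, 04-JAN: rn 4, rna 3, grp 1
--   team B, 05-JAN: rn 5, rna 2, grp 3
```

While consecutive rows stay in the same team, both counters advance together and the difference stays constant; as soon as another team's row comes in between, rn moves ahead of rna and the difference changes. Note that this technique looks only at row order, not at whether the dates actually adjoin.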

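If you are on 12c or later, row pattern matching is a nice alternative. Unlike the "gaps and islands" solutions, this one also handles overlapping ranges. The test data is in the WITH clause; the solution follows it.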

with test (ename, team, start_date, end_date) as
 (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
  select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
  select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
  select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
  select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
 )
select * from test
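match_recognize(
  partition by ename, team order by start_date, end_date
  -- max() rather than last(): an earlier row may end later than the final one
  measures first(start_date) start_date, max(end_date) end_date
  -- a = row that touches or overlaps the next row; b = last row of the island
  pattern(a* b)
  define a as max(end_date) >= next(start_date)
)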
order by ename, start_date;

ENAM T START_DATE       END_DATE        
---- - ---------------- ----------------
John A 2019-01-01 00:00 2019-01-03 00:00
John B 2019-01-03 00:00 2019-01-04 00:00
John A 2019-01-04 00:00 2019-01-05 00:00
John B 2019-01-05 00:00 2019-01-06 00:00
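
When tuning a pattern like this, it can be useful to see which input rows land in which match. The sketch below is a debugging aid under simplifying assumptions, not the solution itself: it only treats exactly adjoining ranges as one island (start_date = previous end_date), and the output column names island_no and pattern_var are made up for illustration.

```sql
with test (ename, team, start_date, end_date) as
 (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
  select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
  select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
  select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
  select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
 )
select ename, team, start_date, end_date, island_no, pattern_var
from test
match_recognize(
  partition by ename, team order by start_date
  -- match_number() numbers the matches within each partition;
  -- classifier() reports which pattern variable the row was mapped to
  measures match_number() as island_no,
           classifier()   as pattern_var
  -- keep every input row instead of one summary row per match
  all rows per match
  pattern(a b*)
  -- b continues an island only when it starts where the previous row ended
  define b as start_date = prev(end_date)
)
order by ename, start_date;
```

Each input row comes back with its match number and the variable it matched, which makes it easier to spot a DEFINE condition that splits or merges islands unexpectedly.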