Oracle SQL Group Issue
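I'm trying to summarize an employee table that holds multiple records for each period an employee spends in a team. I have tried GROUP BY, MIN/MAX with PARTITION BY, and LEAD/LAG on the team name, but every attempt ends the same way: an agent who moved to another team and later returned to the original team shows up only once for that team, even when I order by date.
Sample data: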
Employee Name | Employee ID | Team Leader | Location | Start Date | End Date
John Smith | 123123 | Team A | Site A | 01/JAN/19 | 02/JAN/19
John Smith | 123123 | Team A | Site A | 02/JAN/19 | 03/JAN/19
John Smith | 123123 | Team B | Site A | 03/JAN/19 | 04/JAN/19
John Smith | 123123 | Team A | Site A | 04/JAN/19 | 05/JAN/19
John Smith | 123123 | Team B | Site A | 05/JAN/19 | 06/JAN/19
When I run a sample query:
SELECT
Employee Name
,Employee ID
,Team Leader
,Location
,MIN(Start Date) OVER(PARTITION BY Team Leader ORDER BY Employee ID, Start Date) AS Starting Date
,MAX(End Date) OVER(PARTITION BY Team Leader ORDER BY Employee ID, End Date) AS End Date
FROM TABLE 1
The result is:
Employee Name | Employee ID | Team Leader | Location | Start Date | End Date
John Smith | 123123 | Team A | Site A | 01/JAN/19 | 05/JAN/19
John Smith | 123123 | Team B | Site A | 03/JAN/19 | 06/JAN/19
Can you help me get the expected result:
Employee Name | Employee ID | Team Leader | Location | Start Date | End Date
John Smith | 123123 | Team A | Site A | 01/JAN/19 | 03/JAN/19
John Smith | 123123 | Team B | Site A | 03/JAN/19 | 04/JAN/19
John Smith | 123123 | Team A | Site A | 04/JAN/19 | 05/JAN/19
John Smith | 123123 | Team B | Site A | 05/JAN/19 | 06/JAN/19
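This looks like a form of gaps-and-islands, where the records are linked by their date ranges.
Here is one approach: use a left join to find where each island starts, then use a cumulative sum to identify the groups, and aggregate: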
select employeename, employeeid, teamleader, location,
       min(startdate) as start_date, max(enddate) as end_date
from (select t1.*,
             -- running count of island starts; stays constant within one island
             sum(case when tprev.employeeid is null -- new group
                      then 1 else 0
                 end) over (partition by t1.employeeid, t1.teamleader, t1.location
                            order by t1.startdate
                           ) as grp
      from table1 t1 left join
           table1 tprev
           on t1.startdate = tprev.enddate and
              t1.employeeid = tprev.employeeid and
              t1.teamleader = tprev.teamleader and
              t1.location = tprev.location
     ) t
-- every non-aggregated column in the select list must appear here
group by employeename, employeeid, teamleader, location, grp
order by employeeid, min(startdate);
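To see why the cumulative sum identifies the islands, it can help to look at the intermediate values before aggregating. The following is only a sketch: the CTE stands in for the real table, and the shortened column list and the names is_start and grp are assumptions made for illustration.

```sql
-- Hypothetical test data mirroring the question (location and name omitted).
with table1 (employeeid, teamleader, startdate, enddate) as
  (select 123123, 'Team A', date '2019-01-01', date '2019-01-02' from dual union all
   select 123123, 'Team A', date '2019-01-02', date '2019-01-03' from dual union all
   select 123123, 'Team B', date '2019-01-03', date '2019-01-04' from dual union all
   select 123123, 'Team A', date '2019-01-04', date '2019-01-05' from dual union all
   select 123123, 'Team B', date '2019-01-05', date '2019-01-06' from dual
  )
select t1.teamleader, t1.startdate, t1.enddate,
       -- 1 when no row for the same employee and team ends where this one starts
       case when tprev.employeeid is null then 1 else 0 end as is_start,
       -- running total of the flags within each employee/team
       sum(case when tprev.employeeid is null then 1 else 0 end)
         over (partition by t1.employeeid, t1.teamleader
               order by t1.startdate) as grp
from table1 t1
left join table1 tprev
  on  t1.startdate  = tprev.enddate
  and t1.employeeid = tprev.employeeid
  and t1.teamleader = tprev.teamleader
order by t1.startdate;
```

On the sample rows, the second Team A record (starting 02/JAN/19) is the only one that finds a predecessor, so it is the only row with is_start = 0 and it shares grp = 1 with the first Team A record. The Team A record starting 04/JAN/19 gets grp = 2, which is what keeps it apart in the final GROUP BY.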
Here's one option; the test CTE represents your data (simplified a bit). The useful code begins at line 8.
SQL> with test (ename, team, start_date, end_date) as
2 (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
3 select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
4 select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
5 select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
6 select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
7 ),
8 temp as
9 (select ename, team, start_date, end_date,
10 row_number() over (partition by ename order by start_date) rn,
11 row_number() over (partition by ename, team order by start_date) rna
12 from test
13 )
14 select ename, team, min(start_date) start_date, max(end_date) end_date
15 from temp
16 group by ename, team, (rn - rna)
17 order by 3;
ENAM T START_DATE END_DATE
---- - ----------- -----------
John A 01/jan/2019 03/jan/2019
John B 03/jan/2019 04/jan/2019
John A 04/jan/2019 05/jan/2019
John B 05/jan/2019 06/jan/2019
SQL>
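The grouping expression rn - rna is easier to follow when the two row numbers are listed side by side. This is a minimal sketch on the same test data; the alias grp is introduced here for illustration, and rn is numbered per employee (partition by ename) so that the numbering would also hold up with more than one employee, which the single-employee sample cannot confirm.

```sql
with test (ename, team, start_date, end_date) as
  (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
   select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
   select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
   select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
   select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
  )
select ename, team, start_date,
       -- position of the row among all of the employee's rows
       row_number() over (partition by ename order by start_date) as rn,
       -- position of the row among the employee's rows for this team only
       row_number() over (partition by ename, team order by start_date) as rna,
       -- the difference stays constant while the team does not change
       row_number() over (partition by ename order by start_date)
     - row_number() over (partition by ename, team order by start_date) as grp
from test
order by start_date;
```

For the five sample rows the (team, grp) pairs come out as (A, 0), (A, 0), (B, 2), (A, 1), (B, 3), so the two consecutive Team A rows collapse into one group and the later Team A stint stays separate. Note that this technique only reacts to a change of team in the row sequence; it does not look at the dates themselves, so a break in the dates within the same team would not start a new group.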
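If you are on 12c or later, row pattern matching is a nice alternative solution. Unlike the "gaps and islands" solutions, this one also handles overlaps. The test data is in the WITH clause; the solution starts right after it.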
with test (ename, team, start_date, end_date) as
(select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
)
select * from test
match_recognize(
partition by ename, team order by start_date
measures first(start_date) start_date, last(end_date) end_date
pattern(a b*)
define b as start_date <= a.end_date
)
order by ename, start_date;
ENAM T START_DATE END_DATE
---- - ---------------- ----------------
John A 2019-01-01 00:00 2019-01-03 00:00
John B 2019-01-03 00:00 2019-01-04 00:00
John A 2019-01-04 00:00 2019-01-05 00:00
John B 2019-01-05 00:00 2019-01-06 00:00
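To inspect how the pattern a b* maps onto individual rows, the same query can be run with ALL ROWS PER MATCH and the built-in MATCH_NUMBER() and CLASSIFIER() functions as measures. This is a sketch for exploration rather than part of the solution; the aliases mn and cls are made up for the example.

```sql
with test (ename, team, start_date, end_date) as
  (select 'John', 'A', date '2019-01-01', date '2019-01-02' from dual union all
   select 'John', 'A', date '2019-01-02', date '2019-01-03' from dual union all
   select 'John', 'B', date '2019-01-03', date '2019-01-04' from dual union all
   select 'John', 'A', date '2019-01-04', date '2019-01-05' from dual union all
   select 'John', 'B', date '2019-01-05', date '2019-01-06' from dual
  )
select ename, team, start_date, end_date, mn, cls
from test
match_recognize(
  partition by ename, team order by start_date
  measures match_number() as mn,  -- sequence number of the match within the partition
           classifier()   as cls  -- pattern variable the row was mapped to
  all rows per match              -- keep every input row instead of one row per match
  pattern(a b*)
  define b as start_date <= a.end_date
)
order by ename, start_date;
```

Rows that share the same mn within a partition are the ones the original query merges into a single output row. Because b is compared against a.end_date, that is, the end date of the row that opened the match, it is worth running this on data with longer chains of consecutive records to check that they group as intended.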