Select start/end date for each group

I have an employee job assignment table in the following format:

emp_id, dept_id, assignment,  start_dt,    end_dt
1,      10,      project 1,   2001-01-01,  2001-12-31
1,      10,      project 2,   2002-01-01,  2002-12-31
1,      20,      project 3,   2003-01-01,  2003-12-31
1,      20,      project 4,   2004-01-01,  2004-12-31
1,      10,      project 5,   2005-01-01,  2005-12-31

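From the table above I need to summarize each employee's department history, i.e. how long the employee worked for a particular department before being transferred to another one.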

The expected output looks like this:

emp_id, dept_id,  start_dt,    end_dt
1,      10,       2001-01-01,  2002-12-31
1,      20,       2003-01-01,  2004-12-31
1,      10,       2005-01-01,  2005-12-31

I have tried to solve this with Oracle analytic functions, but could not get the desired output:

    select distinct emp_id, dept_id, start_dt, end_dt 
    from ( 
       select emp_id, dept_id, 
              min(start_date) 
                 over (partition by emp_id, dept_id order by emp_id, dept_id 
                 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as start_dt,
              max(end_date)   
                 over (partition by emp_id, dept_id order by emp_id, dept_id 
                 RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as end_dt
       from employee_job_assignment
    )
    where emp_id = 1;

The query above returns:

emp_id, dept_id,  start_dt,    end_dt
1,      10,       2001-01-01,  2005-12-31
1,      20,       2003-01-01,  2004-12-31

You can try the following:

    -- grp is constant within each run of consecutive rows for the same department
    select emp_id, dept_id, min(start_date) as start_Date, max(end_date) as end_date
    from (
        select t.*,
               row_number() over (partition by emp_id order by start_date)
             - row_number() over (partition by emp_id, dept_id order by start_date) as grp
        from t
    ) a
    group by grp, dept_id, emp_id

Output:

emp_id  dept_id start_Date              end_date
 1       10      01/01/2001 00:00:00    31/12/2002 00:00:00
 1       10      01/01/2005 00:00:00    31/12/2005 00:00:00
 1       20      01/01/2003 00:00:00    31/12/2004 00:00:00
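
To see why the difference of two row numbers identifies each run, here is a minimal Python sketch that traces the idea on the sample data from the question. It is an illustration only, not Oracle code: the function name `islands_by_row_number` and the tuple layout are assumptions made for this example, and both counters are kept per employee.

```python
from collections import defaultdict
from datetime import date

# Sample rows from the question: (emp_id, dept_id, start_date, end_date)
rows = [
    (1, 10, date(2001, 1, 1), date(2001, 12, 31)),
    (1, 10, date(2002, 1, 1), date(2002, 12, 31)),
    (1, 20, date(2003, 1, 1), date(2003, 12, 31)),
    (1, 20, date(2004, 1, 1), date(2004, 12, 31)),
    (1, 10, date(2005, 1, 1), date(2005, 12, 31)),
]

def islands_by_row_number(rows):
    """Group consecutive same-department rows via a row-number difference."""
    overall = defaultdict(int)    # row number per emp_id
    per_dept = defaultdict(int)   # row number per (emp_id, dept_id)
    groups = {}
    for emp, dept, start, end in sorted(rows, key=lambda r: (r[0], r[2])):
        overall[emp] += 1
        per_dept[(emp, dept)] += 1
        # Both counters advance together inside a run, so the difference is stable
        grp = overall[emp] - per_dept[(emp, dept)]
        key = (emp, dept, grp)
        if key in groups:
            first, last = groups[key]
            groups[key] = (min(first, start), max(last, end))
        else:
            groups[key] = (start, end)
    # Flatten to (emp_id, dept_id, start, end), ordered by employee and start date
    return sorted(
        ((emp, dept, s, e) for (emp, dept, _), (s, e) in groups.items()),
        key=lambda r: (r[0], r[2]),
    )

history = islands_by_row_number(rows)
for row in history:
    print(row)
```

The difference alone is not unique (here both the 2003-2004 and the 2005 run get the value 2), which is why the grouping key also has to include the department.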

The key to the solution is to split the rows into groups according to your logic. You can do that with the LAG() function. For example:

    select
      emp_id,
      max(dept_id) as dept_id,
      min(start_dt) as start_dt,
      max(end_dt) as end_dt
    from (
      select
        x.*,
        -- running total of department changes = group number per employee
        sum(inc) over(partition by emp_id order by start_dt) as grp
      from (
        select
          eja.*,
          -- inc = 1 when the department differs from the previous row's
          case when lag(dept_id) over(partition by emp_id order by start_dt)
                    <> dept_id then 1 else 0 end as inc
        from employee_job_assignment eja
      ) x
    ) y
    group by emp_id, grp
    order by emp_id, grp
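
As a rough illustration of what the `inc` and `grp` columns contain, the following Python sketch mimics the LAG() comparison and the running sum on the question's sample data. The helper names `tag_groups` and `summarize` are hypothetical and exist only for this example.

```python
from datetime import date

# Sample rows from the question: (emp_id, dept_id, start_dt, end_dt)
rows = [
    (1, 10, date(2001, 1, 1), date(2001, 12, 31)),
    (1, 10, date(2002, 1, 1), date(2002, 12, 31)),
    (1, 20, date(2003, 1, 1), date(2003, 12, 31)),
    (1, 20, date(2004, 1, 1), date(2004, 12, 31)),
    (1, 10, date(2005, 1, 1), date(2005, 12, 31)),
]

def tag_groups(rows):
    """Mimic LAG(dept_id) plus a running SUM(inc), per employee."""
    prev_dept = {}  # emp_id -> dept_id of the previous row (the LAG value)
    running = {}    # emp_id -> running sum of inc so far
    tagged = []
    for emp, dept, start, end in sorted(rows, key=lambda r: (r[0], r[2])):
        # inc is 1 only when a previous row exists and its department differs
        inc = 1 if emp in prev_dept and prev_dept[emp] != dept else 0
        running[emp] = running.get(emp, 0) + inc
        prev_dept[emp] = dept
        tagged.append((emp, dept, start, end, inc, running[emp]))
    return tagged

def summarize(tagged):
    """Aggregate each (emp_id, grp) group into one history row."""
    out = {}
    for emp, dept, start, end, _inc, grp in tagged:
        if (emp, grp) in out:
            _, _, s, e = out[(emp, grp)]
            out[(emp, grp)] = (emp, dept, min(s, start), max(e, end))
        else:
            out[(emp, grp)] = (emp, dept, start, end)
    return [out[k] for k in sorted(out)]

tagged = tag_groups(rows)
history = summarize(tagged)
for row in history:
    print(row)
```

On the sample data the `inc` values are 0, 0, 1, 0, 1 and the running sums are 0, 0, 1, 1, 2, so each stint in a department gets its own group number even when the employee later returns to an earlier department.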

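This is a gaps-and-islands problem, but with a twist: in this case you might also want to account for gaps within the same department. For example: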

emp_id, dept_id, assignment,  start_dt,    end_dt
1,      10,      project 1,   2001-01-01,  2001-12-31
1,      10,      project 2,   2003-01-01,  2003-12-31

This should return two rows rather than one.

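To handle this, determine where each island starts by comparing the previous end date with the current start date: a row starts a new island unless the previous row ended exactly one day before it begins. That defines the start of each group. The rest is just aggregation: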

    select emp_id, dept_id, min(start_dt) as start_dt, max(end_dt) as end_dt
    from (select eja.*,
                 -- cumulative count of island starts; the ORDER BY makes the sum a running total
                 sum(case when prev_end_dt = start_dt - 1
                          then 0 else 1
                     end) over (partition by emp_id, dept_id order by start_dt) as grp
          from (select eja.*,
                       lag(end_dt) over (partition by emp_id, dept_id order by start_dt) as prev_end_dt
                from employee_job_assignment eja
               ) eja
         ) eja
    group by emp_id, dept_id, grp;
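
The date comparison can be sketched in Python as follows. This is only an illustration of the rule "a new island starts unless the previous row ended the day before", assuming rows are tuples of (emp_id, dept_id, start_dt, end_dt); the function name `islands_by_date_gap` is made up for this example.

```python
from datetime import date, timedelta

def islands_by_date_gap(rows):
    """Merge rows of one (emp_id, dept_id) only when their dates are contiguous."""
    islands = []
    prev_end = {}  # (emp_id, dept_id) -> end date of that pair's previous row
    # Sorting by (emp_id, dept_id, start_dt) keeps each pair's rows adjacent
    for emp, dept, start, end in sorted(rows, key=lambda r: (r[0], r[1], r[2])):
        key = (emp, dept)
        contiguous = prev_end.get(key) == start - timedelta(days=1)
        if contiguous:
            # Extend the island opened by the previous row of the same pair
            e, d, s, old_end = islands[-1]
            islands[-1] = (e, d, s, max(old_end, end))
        else:
            islands.append((emp, dept, start, end))
        prev_end[key] = end
    return islands

# The two-row example with a gap inside department 10
gap_rows = [
    (1, 10, date(2001, 1, 1), date(2001, 12, 31)),
    (1, 10, date(2003, 1, 1), date(2003, 12, 31)),
]

# The sample data from the question
sample_rows = [
    (1, 10, date(2001, 1, 1), date(2001, 12, 31)),
    (1, 10, date(2002, 1, 1), date(2002, 12, 31)),
    (1, 20, date(2003, 1, 1), date(2003, 12, 31)),
    (1, 20, date(2004, 1, 1), date(2004, 12, 31)),
    (1, 10, date(2005, 1, 1), date(2005, 12, 31)),
]

gap_result = islands_by_date_gap(gap_rows)
sample_result = islands_by_date_gap(sample_rows)
print(len(gap_result), len(sample_result))
```

With this rule the gap example stays as two separate rows, while the question's data still collapses into three department stints.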