SQL Group by dates

I have read a few threads about grouping by sequence, which is almost what I need, but I could not find a solution to my problem.

I have a table like this:

PlanificatorPozitieID JalonID     DataStart               DataFinal               
--------------------- ----------- ----------------------- ----------------------- 
26                    46          2012-05-21 00:00:00.000 2012-05-31 00:00:00.000 
28                    48          2012-06-01 00:00:00.000 2012-06-01 00:00:00.000 
27                    60          2012-06-02 00:00:00.000 2012-06-02 00:00:00.000 
29                    60          2012-06-07 00:00:00.000 2012-06-08 00:00:00.000 
37                    60          2012-06-08 00:00:00.000 2012-06-10 00:00:00.000 
30                    65          2012-06-10 00:00:00.000 2012-06-13 00:00:00.000 
31                    65          2012-06-18 00:00:00.000 2012-06-24 00:00:00.000 
32                    65          2012-06-23 00:00:00.000 2012-07-01 00:00:00.000 
33                    66          2012-07-02 00:00:00.000 2012-07-02 00:00:00.000 
34                    66          2012-07-02 00:00:00.000 2012-07-05 00:00:00.000 
36                    66          2012-07-06 00:00:00.000 2012-07-10 00:00:00.000 


Desired output:

PlanificatorPozitieID JalonID     DataStart               DataFinal               
--------------------- ----------- ----------------------- ----------------------- 
26                    46          2012-05-21 00:00:00.000 2012-05-31 00:00:00.000 
28                    48          2012-06-01 00:00:00.000 2012-06-01 00:00:00.000 
27                    60          2012-06-02 00:00:00.000 2012-06-02 00:00:00.000 
29                    60          2012-06-07 00:00:00.000 2012-06-10 00:00:00.000 
30                    65          2012-06-10 00:00:00.000 2012-06-13 00:00:00.000 
31                    65          2012-06-18 00:00:00.000 2012-07-01 00:00:00.000 
33                    66          2012-07-02 00:00:00.000 2012-07-05 00:00:00.000 
36                    66          2012-07-06 00:00:00.000 2012-07-10 00:00:00.000 

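So I have to group by JalonID, but rows should be grouped only when DataFinal >= DataStart, that is, when one row's DataFinal is on or after the next row's DataStart. I want the time periods for each JalonID, but only the continuous periods, without gaps.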

I hope I have made myself clear. This is what I have so far:

select MIN(pp.DataStart) as DataStart, MAX(pp.DataFinal) as DataFinal, pp.JalonID FROM #PlanPozitii pp
GROUP BY pp.JalonID 

But this query does not satisfy my condition of grouping by continuous periods.

To clarify, take this example:

30                    65          2012-06-10 00:00:00.000 2012-06-13 00:00:00.000 
31                    65          2012-06-18 00:00:00.000 2012-06-24 00:00:00.000 
32                    65          2012-06-23 00:00:00.000 2012-07-01 00:00:00.000

2012-06-13 00:00:00.000 < 2012-06-18 00:00:00.000, so PlanificatorPozitieID 30 and 31 are not grouped. But 2012-06-24 00:00:00.000 > 2012-06-23 00:00:00.000, so PlanificatorPozitieID 31 and 32 are grouped together.

So from these 3 rows we get two rows:

30                    65          2012-06-10 00:00:00.000 2012-06-13 00:00:00.000 
31                    65          2012-06-18 00:00:00.000 2012-07-01 00:00:00.000 





DECLARE @YourTable TABLE(PlanificatorPozitieID INT, JalonID INT,DataStart DATETIME, DataFinal DATETIME)
INSERT INTO @YourTable VALUES
(39,1223,'2015-02-16 00:00:00.000','2015-02-20 00:00:00.000'),
(43,1223,'2015-02-19 00:00:00.000','2015-02-24 00:00:00.000'),
(40,1223,'2015-02-23 00:00:00.000','2015-02-27 00:00:00.000'),
(42,1223,'2015-03-09 00:00:00.000','2015-03-13 00:00:00.000')
;WITH cte AS
(
SELECT  a.PlanificatorPozitieID, 
        a.JalonID, 
        a.DataStart, 
        COALESCE(b.DataFinal,a.datafinal) AS [DataFinal],
        ROW_NUMBER() OVER (PARTITION BY  a.JalonID ORDER BY DATEDIFF(dd,a.datastart, COALESCE(b.DataFinal,a.datafinal))) [rn],
        COUNT(*) OVER (PARTITION BY  a.JalonID) [cnt]
FROM    @YourTable a
        LEFT JOIN  @YourTable b 
        ON  a.JalonID = b.JalonID AND 
            b.DataStart BETWEEN a.DataStart AND a.DataFinal AND  
            a.PlanificatorPozitieID <> b.PlanificatorPozitieID AND 
            DATEDIFF(dd,a.DataStart,a.DataFinal) < DATEDIFF(dd,a.DataStart,b.DataFinal)
)
SELECT * 
FROM   cte 
WHERE  rn= 1 OR rn=cnt

Result:

PlanificatorPozitieID JalonID     DataStart               DataFinal               rn                   cnt
--------------------- ----------- ----------------------- ----------------------- -------------------- -----------
40                    1223        2015-02-23 00:00:00.000 2015-02-27 00:00:00.000 1                    4
43                    1223        2015-02-19 00:00:00.000 2015-02-27 00:00:00.000 4                    4

Expected result:

PlanificatorPozitieID JalonID     DataStart               DataFinal               
--------------------- ----------- ----------------------- ----------------------- 
39                    1223        2015-02-16 00:00:00.000 2015-02-27 00:00:00.000 
42                    1223        2015-03-09 00:00:00.000 2015-03-13 00:00:00.000 

https://msdn.microsoft.com/en-us/library/hh231256.aspx

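I have been looking at the LAG() function, which lets you access earlier rows in the query. My idea is to compute the difference between the current row and the previous row to get a column to group on: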

with a as (
select 1 as ID, 1 as A, 2 as B union all
select 1 as ID, 2 as A, 3 as B union all
select 1 as ID, 3 as A, 4 as B union all
select 1 as ID, 4 as A, 5 as B union all
select 1 as ID, 6 as A, 7 as B
) 
-- order by A: ID is the same on every row, so ordering by ID leaves the row order undefined
select ID, A, B, A-LAG(B,1,0) OVER (order by A) as koe from a where B > A

If you run that query, you get this result:

ID          A           B           koe
----------- ----------- ----------- -----------
1           1           2           1
1           2           3           0
1           3           4           0
1           4           5           0
1           6           7           1

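Assuming A is DataStart and B is DataFinal, the computed koe is the difference between a row's start and the previous row's end. As you can see, it works for every row except the first: the first row has no previous row, so LAG falls back to its default of 0 and koe comes out as 1 - 0 = 1. But this is the direction I would start experimenting in.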
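One way the LAG() idea could be carried through to the finished result is the usual "flag the start of each period, then take a running total of the flags" pattern. The following is only a sketch under some assumptions: SQL Server 2012 or later, the table variable @YourTable and the names IsNewPeriod and PeriodNo are made up for illustration, and the lowest PlanificatorPozitieID is assumed to be the one that should represent each merged period. Plain LAG only looks one row back, so a long row that fully contains several later rows could still be split; a running MAX(DataFinal) over all earlier rows is one way around that.

```sql
-- Sample rows from the question (YYYYMMDD literals do not depend on language settings).
DECLARE @YourTable TABLE (PlanificatorPozitieID INT, JalonID INT, DataStart DATETIME, DataFinal DATETIME);
INSERT INTO @YourTable VALUES
(26,46,'20120521','20120531'),
(28,48,'20120601','20120601'),
(27,60,'20120602','20120602'),
(29,60,'20120607','20120608'),
(37,60,'20120608','20120610'),
(30,65,'20120610','20120613'),
(31,65,'20120618','20120624'),
(32,65,'20120623','20120701'),
(33,66,'20120702','20120702'),
(34,66,'20120702','20120705'),
(36,66,'20120706','20120710');

WITH flagged AS
(
    SELECT  PlanificatorPozitieID, JalonID, DataStart, DataFinal,
            -- 0 when the row starts on or before the previous row's DataFinal,
            -- 1 when there is a gap or no previous row (LAG returns NULL).
            CASE WHEN DataStart <= LAG(DataFinal) OVER (PARTITION BY JalonID
                                                        ORDER BY DataStart, DataFinal, PlanificatorPozitieID)
                 THEN 0 ELSE 1 END AS IsNewPeriod
    FROM    @YourTable
),
numbered AS
(
    SELECT  PlanificatorPozitieID, JalonID, DataStart, DataFinal,
            -- Running total of the flags: rows of one continuous period share a number.
            SUM(IsNewPeriod) OVER (PARTITION BY JalonID
                                   ORDER BY DataStart, DataFinal, PlanificatorPozitieID
                                   ROWS UNBOUNDED PRECEDING) AS PeriodNo
    FROM    flagged
)
SELECT  MIN(PlanificatorPozitieID) AS PlanificatorPozitieID, -- assumption: lowest ID represents the period
        JalonID,
        MIN(DataStart) AS DataStart,
        MAX(DataFinal) AS DataFinal
FROM    numbered
GROUP BY JalonID, PeriodNo
ORDER BY JalonID, MIN(DataStart);
```

On the sample rows this should give the eight rows of the desired output in the question: for example, rows 29 and 37 collapse into 2012-06-07 to 2012-06-10, while row 30 stays separate from rows 31 and 32.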

I don't know whether it works on real data, since I have not tested it rigorously, but here is a solution:

DECLARE @YourTable TABLE(PlanificatorPozitieID INT, JalonID INT,DataStart DATETIME, DataFinal DATETIME)
INSERT INTO @YourTable VALUES
(26,46,'2012-05-21 00:00:00.000','2012-05-31 00:00:00.000'), 
(28,48,'2012-06-01 00:00:00.000','2012-06-01 00:00:00.000'), 
(27,60,'2012-06-02 00:00:00.000','2012-06-02 00:00:00.000'), 
(29,60,'2012-06-07 00:00:00.000','2012-06-08 00:00:00.000'), 
(37,60,'2012-06-08 00:00:00.000','2012-06-10 00:00:00.000'), 
(30,65,'2012-06-10 00:00:00.000','2012-06-13 00:00:00.000'), 
(31,65,'2012-06-18 00:00:00.000','2012-06-24 00:00:00.000'), 
(32,65,'2012-06-23 00:00:00.000','2012-07-01 00:00:00.000'), 
(33,66,'2012-07-02 00:00:00.000','2012-07-02 00:00:00.000'), 
(34,66,'2012-07-02 00:00:00.000','2012-07-05 00:00:00.000'), 
(36,66,'2012-07-06 00:00:00.000','2012-07-10 00:00:00.000')

;WITH cte AS
(
SELECT  a.PlanificatorPozitieID, 
        a.JalonID, 
        a.DataStart, 
        COALESCE(b.DataFinal,a.datafinal) AS [DataFinal],
        ROW_NUMBER() OVER (PARTITION BY  a.JalonID ORDER BY DATEDIFF(dd,a.datastart, COALESCE(b.DataFinal,a.datafinal))) [rn],
        COUNT(*) OVER (PARTITION BY  a.JalonID) [cnt]
FROM    @YourTable a
        LEFT JOIN  @YourTable b 
        ON  a.JalonID = b.JalonID AND 
            b.DataStart BETWEEN a.DataStart AND a.DataFinal AND  
            a.PlanificatorPozitieID <> b.PlanificatorPozitieID AND 
            DATEDIFF(dd,a.DataStart,a.DataFinal) < DATEDIFF(dd,a.DataStart,b.DataFinal)
)
SELECT * 
FROM   cte 
WHERE  rn= 1 OR rn=cnt

I found a solution. It is not very efficient because it uses 2 cursors, but it works, in case someone needs an example.

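-- Drop the temp tables only if they are left over from an earlier run
IF OBJECT_ID('tempdb..#DateTest') IS NOT NULL DROP TABLE #DateTest
IF OBJECT_ID('tempdb..#Pozitii') IS NOT NULL DROP TABLE #Pozitii
IF OBJECT_ID('tempdb..#PozitiiJaloaneStandard') IS NOT NULL DROP TABLE #PozitiiJaloaneStandard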
CREATE TABLE #DateTest (JalonStandardID int,DataStart datetime,DataFinal datetime)


INSERT INTO #DateTest VALUES (1,'2015-05-05','2015-05-08')
INSERT INTO #DateTest VALUES (1,'2015-05-09','2015-05-13')
INSERT INTO #DateTest VALUES (1,'2015-05-12','2015-05-15')
INSERT INTO #DateTest VALUES (1,'2015-05-16','2015-05-18')
INSERT INTO #DateTest VALUES (1,'2015-05-14','2015-05-19')
INSERT INTO #DateTest VALUES (2,'2015-05-05','2015-05-06')
INSERT INTO #DateTest VALUES (2,'2015-05-06','2015-05-07')
INSERT INTO #DateTest VALUES (2,'2015-05-06','2015-05-09')
INSERT INTO #DateTest VALUES (3,'2015-05-05','2015-05-07')
INSERT INTO #DateTest VALUES (3,'2015-05-08','2015-05-10')
INSERT INTO #DateTest VALUES (4,'2015-05-05','2015-05-08')
INSERT INTO #DateTest VALUES (5,'2015-05-07','2015-05-07')
INSERT INTO #DateTest VALUES (5,'2015-05-08','2015-05-08')
INSERT INTO #DateTest VALUES (5,'2015-05-09','2015-05-12')
INSERT INTO #DateTest VALUES (5,'2015-05-11','2015-05-12')
INSERT INTO #DateTest VALUES (6,'2015-05-05','2015-05-20')
INSERT INTO #DateTest VALUES (6,'2015-05-15','2015-05-18')




CREATE TABLE #Pozitii (DataStart datetime, DataFinal datetime)

CREATE TABLE #PozitiiJaloaneStandard ( JalonStandardID int, DataStart datetime, DataFinal datetime)



Declare @JalonStandarID int
DEclare @PlanificatorPozitieID int
Declare @DataStartPozitie datetime
Declare @DataFinalPozitie datetime 


DEclare @DataStartMin datetime
Declare @PozitieMinima int
Declare @DataFinalMin datetime


Declare @DataStartJalonStandard datetime
Declare @DataFinalJalonStandard datetime


Declare Crs_JaloaneStandard Cursor For
Select DISTINCT JalonStandardID -- one pass per JalonStandardID instead of one per row
From #DateTest
ORDER BY JalonStandardID
Open Crs_JaloaneStandard
Fetch Next From Crs_JaloaneStandard Into 
@JalonStandarID
While @@Fetch_Status = 0 
    Begin
            INSERT INTO #Pozitii
            SELECT pp.DataStart,pp.DataFinal            
            FROM #DateTest pp
            WHERE pp.JalonStandardID = @JalonStandarID
            GROUP BY  pp.DataStart,pp.DataFinal



            SELECT @DataStartMin = MIN(DataStart) FROM #Pozitii

            SELECT   @DataFinalMin = DataFinal FROM 
             #Pozitii WHERE DataStart = @DataStartMin



            Declare Crs_Pozitii Cursor For
            SELECT 
            p.DataStart,p.DataFinal
            FROM #Pozitii p
            ORDER by p.DataStart ASC
            Open Crs_Pozitii
            Fetch Next From Crs_Pozitii Into
            @DataStartPozitie,@DataFinalPozitie
            while @@FETCH_STATUS = 0 
            begin

                    if (@DataFinalMin > @DataStartPozitie) and (@DataFinalMin <= @DataFinalPozitie )
                    begin 
                        set @DataFinalMin = @DataFinalPozitie
                    end
                    if (@DataFinalMin <= @DataStartPozitie) begin
                         INSERT INTO #PozitiiJaloaneStandard VALUES (@JalonStandarID,@DataStartMin,@DataFinalMin)
                         set @DataFinalMin = @DataFinalPozitie
                         set @DataStartMin = @DataStartPozitie      
                         print @DataStartPozitie
                         print @DataFinalPozitie                    
                    end



            Fetch Next From Crs_Pozitii Into 
            @DataStartPozitie,@DataFinalPozitie
            End 

            INSERT INTO #PozitiiJaloaneStandard VALUES (@JalonStandarID,@DataStartMin,@DataFinalMin)

            DELETE FROM #Pozitii
            Close Crs_Pozitii
            Deallocate Crs_Pozitii

    Fetch Next From Crs_JaloaneStandard Into 
    @JalonStandarID
    End

Close Crs_JaloaneStandard
Deallocate Crs_JaloaneStandard




SELECT * FROM #PozitiiJaloaneStandard
GROUP BY JalonStandardID,DataStart,DataFinal
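
For comparison, the same merging can probably be done without cursors. This is a rough sketch, not a tested replacement: it assumes SQL Server 2012 or later, uses a table variable @DateTest holding a few of the rows above, and the names src, IsNewPeriod and PeriodNo are made up. It compares each row with the running MAX(DataFinal) of all earlier rows, so a row that lies completely inside an earlier one does not break the period. The comparison is strict (<), like the cursor code, so periods that only touch stay separate; changing it to <= would merge them, as the question asks.

```sql
DECLARE @DateTest TABLE (JalonStandardID INT, DataStart DATETIME, DataFinal DATETIME);
INSERT INTO @DateTest VALUES
(1,'20150505','20150508'),
(1,'20150509','20150513'),
(1,'20150512','20150515'),
(1,'20150516','20150518'),
(1,'20150514','20150519'),
(2,'20150505','20150506'),
(2,'20150506','20150507'),
(2,'20150506','20150509'),
(6,'20150505','20150520'),
(6,'20150515','20150518');

WITH src AS
(
    -- Remove exact duplicates first, as the cursor version does with GROUP BY.
    SELECT DISTINCT JalonStandardID, DataStart, DataFinal
    FROM   @DateTest
),
flagged AS
(
    SELECT  JalonStandardID, DataStart, DataFinal,
            -- 0 when the row starts before the latest DataFinal seen so far,
            -- 1 otherwise (including the first row, where the MAX is NULL).
            CASE WHEN DataStart < MAX(DataFinal) OVER (PARTITION BY JalonStandardID
                                                       ORDER BY DataStart, DataFinal
                                                       ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING)
                 THEN 0 ELSE 1 END AS IsNewPeriod
    FROM    src
),
numbered AS
(
    SELECT  JalonStandardID, DataStart, DataFinal,
            -- Running total of the flags numbers the periods within each JalonStandardID.
            SUM(IsNewPeriod) OVER (PARTITION BY JalonStandardID
                                   ORDER BY DataStart, DataFinal
                                   ROWS UNBOUNDED PRECEDING) AS PeriodNo
    FROM    flagged
)
SELECT  JalonStandardID,
        MIN(DataStart) AS DataStart,
        MAX(DataFinal) AS DataFinal
FROM    numbered
GROUP BY JalonStandardID, PeriodNo
ORDER BY JalonStandardID, MIN(DataStart);
```

For these rows I would expect two periods for JalonStandardID 1 (05-05 to 05-08 and 05-09 to 05-19), two for 2 (05-05 to 05-06 and 05-06 to 05-09) and a single one for 6 (05-05 to 05-20).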