Continuous Date Span T-Sql

I am trying to merge the continuous date spans that exist in this data:

ID_NBR  START_DT    END_DT
22  20120101    20120131
22  20120201    20120731
22  20120801    20121231
22  20130201    20131231
22  20140101    20151231
22  20160101    20160131
22  20160201    20160430
22  20160601    20160630
22  20160701    99991231

and would like the result to look like this:

ID_NBR  START_DT    END_DT
22  20120101    20121231
22  20130201    20160430
22  20160601    99991231

Obviously I don't want the gaps filled in, so this is what I have so far, but I really think there must be a simpler way:

SELECT 
    s1.ID_NBR,
    s1.START_DT, 
    MIN(t1.END_DT) AS END_DT,
    ROW_NUMBER() OVER(ORDER BY s1.START_DT) AS Sequence_ID
FROM MEM s1 
INNER JOIN MEM t1 
ON t1.ID_NBR=s1.ID_NBR
AND s1.START_DT <= t1.END_DT
AND NOT EXISTS (
    SELECT * FROM MEM t2
    WHERE t2.ID_NBR = t1.ID_NBR
      AND (t1.END_DT + 1) >= t2.START_DT
      AND t1.END_DT < t2.END_DT
)
WHERE NOT EXISTS (
    SELECT * FROM MEM s2
    WHERE s2.ID_NBR = s1.ID_NBR
      AND s1.START_DT > s2.START_DT
      AND (s1.START_DT - 1) <= s2.END_DT
)
GROUP BY s1.ID_NBR,s1.START_DT

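In Teradata TD14.10 there is a simple way to merge overlapping periods using SELECT NORMALIZE. The implementation is based on the PERIOD data type, which includes the start date but excludes the end date. Since your data stores inclusive end dates, you have to adjust them for the calculation and then split the period back into separate columns: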

SELECT ID_NBR,
   Begin(pd), -- get the start date
   Last(pd)   -- adjust the end date
FROM
 (
   SELECT NORMALIZE 
      ID_NBR, 
      -- periods are [inclusive..exclusive[ 
      PERIOD(START_DT,CASE WHEN END_DT = DATE '9999-12-31' THEN END_DT ELSE END_DT + 1 END) AS pd
   FROM tab
 ) AS dt
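
Where NORMALIZE is not available (the question mentions T-SQL), this kind of merge is commonly written as a "gaps and islands" query with window functions. Here is a minimal sketch of that idea, not a drop-in answer. It is run through Python's sqlite3 only so that it is self-contained. It assumes ISO-formatted date strings, a SQLite build with window functions (3.25 or later), and the MEM table from the question. The names ROWS, MERGE_SQL and merge_spans are made up for illustration, and date(x, '-1 day') is SQLite syntax that would have to be translated for another engine.

```python
import sqlite3

# Sample rows from the question, rewritten as ISO date strings (an assumption).
ROWS = [
    (22, "2012-01-01", "2012-01-31"),
    (22, "2012-02-01", "2012-07-31"),
    (22, "2012-08-01", "2012-12-31"),
    (22, "2013-02-01", "2013-12-31"),
    (22, "2014-01-01", "2015-12-31"),
    (22, "2016-01-01", "2016-01-31"),
    (22, "2016-02-01", "2016-04-30"),
    (22, "2016-06-01", "2016-06-30"),
    (22, "2016-07-01", "9999-12-31"),
]

MERGE_SQL = """
WITH prev AS (
    -- Latest end date seen on any earlier row of the same ID_NBR.
    SELECT ID_NBR, START_DT, END_DT,
           MAX(END_DT) OVER (
               PARTITION BY ID_NBR
               ORDER BY START_DT, END_DT
               ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
           ) AS PREV_END
    FROM MEM
),
flagged AS (
    -- A row opens a new span unless it starts no later than one day
    -- after everything seen so far has ended.
    SELECT ID_NBR, START_DT, END_DT,
           CASE WHEN PREV_END IS NOT NULL
                     AND date(START_DT, '-1 day') <= PREV_END
                THEN 0 ELSE 1 END AS IS_NEW
    FROM prev
),
grouped AS (
    -- The running total of the flags numbers the merged spans.
    SELECT ID_NBR, START_DT, END_DT,
           SUM(IS_NEW) OVER (
               PARTITION BY ID_NBR
               ORDER BY START_DT, END_DT
           ) AS GRP
    FROM flagged
)
SELECT ID_NBR, MIN(START_DT) AS START_DT, MAX(END_DT) AS END_DT
FROM grouped
GROUP BY ID_NBR, GRP
ORDER BY ID_NBR, MIN(START_DT)
"""


def merge_spans(rows):
    """Load rows into an in-memory MEM table and return the merged spans."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE MEM (ID_NBR INTEGER, START_DT TEXT, END_DT TEXT)"
        )
        conn.executemany("INSERT INTO MEM VALUES (?, ?, ?)", rows)
        return conn.execute(MERGE_SQL).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for span in merge_spans(ROWS):
        print(span)
```

With the sample rows this returns the three spans from the expected result. Comparing START_DT minus one day against the previous end, rather than adding a day to the end, avoids stepping past 9999-12-31.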

If your dates are actually stored as DECIMAL(38,0) (which is simply wrong), you first need to convert them to dates for the NORMALIZE query using

Cast(start_dt - 19000000 AS DATE)
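
The subtraction works because Teradata's numeric-to-DATE cast expects a value of the form (year - 1900) * 10000 + month * 100 + day, so a YYYYMMDD number has to be shifted by 19000000 first. Here is a small Python sketch of that arithmetic. The helper name teradata_int_to_date is made up for illustration, and nothing here runs in Teradata itself.

```python
from datetime import date


def teradata_int_to_date(n):
    """Decode (year - 1900) * 10000 + month * 100 + day into a date.

    Hypothetical helper that mirrors the arithmetic only.
    """
    year_offset, month_day = divmod(n, 10000)
    month, day = divmod(month_day, 100)
    return date(year_offset + 1900, month, day)


if __name__ == "__main__":
    for yyyymmdd in (20120101, 20160430, 99991231):
        shifted = yyyymmdd - 19000000  # same shift as in the CAST expression
        print(yyyymmdd, shifted, teradata_int_to_date(shifted))
```

For example, 20120101 - 19000000 = 1120101, which decodes as year 112 + 1900 = 2012, month 01, day 01.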