Overlapping DateTime Correction in SQL Server

I haven't written any SQL in ages, and I'm struggling with the final stage of a data-cleaning script. Sample output from my existing script is:

MRN ID     ADTM                    SDTM                    WardDays    WardMins
45  45_1   2016-03-24 06:28:00.000 2016-03-24 18:15:00.000 0           707
45  45_2   2016-03-24 11:07:00.000 2016-03-24 18:15:00.000 0           428

MRN ID     ADTM                    SDTM                    TDays   Tminutes
381 381_1  2016-01-30 00:25:00.000 2016-01-31 16:53:00.000 0       1415
381 381_1  2016-01-31 00:00:00.000 2016-01-31 16:53:00.000 0       1013
381 381_2  2016-01-31 11:30:00.000 2016-01-31 16:53:00.000 0       323
381 381_3  2016-01-31 16:53:00.000 2016-02-01 17:50:00.000 0       427
381 381_3  2016-02-01 00:00:00.000 2016-02-01 17:50:00.000 0       1070

The problem is overlapping dates for the same [non-unique] [ID] field. For the first case, my desired output (corrections in italics) is:

MRN ID     ADTM                    SDTM                        WardDays    WardMins
45  45_1   2016-03-24 06:28:00.000 _2016-03-24 11:07:00.000_   0           279
45  45_2   2016-03-24 11:07:00.000 2016-03-24 18:15:00.000     0           428

And for the second set of records:

MRN ID     ADTM                    SDTM                        TDays   Tminutes
381 381_1  2016-01-30 00:25:00.000 _2016-01-31 00:00:00.000_   0       1415
381 381_1  2016-01-31 00:00:00.000 _2016-01-31 11:30:00.000_   0       690
381 381_2  2016-01-31 11:30:00.000 2016-01-31 16:53:00.000     0       323
381 381_3  2016-01-31 16:53:00.000 _2016-02-01 00:00:00.000_   0       427
381 381_3  2016-02-01 00:00:00.000 2016-02-01 17:50:00.000     0       1070

So, as you can see, I don't want the end datetime [SDTM] of any record to overlap the start datetime [ADTM] of the next record. I see this being done in two stages:

  1. Update the dates according to the logic outlined in the datasets above.

  2. Update TDays and TMinutes for each record.

To set up the dataset, use:

CREATE TABLE T (
    MRN int, ID varchar(5), ADTM varchar(23), SDTM varchar(23), TDays int, TMinutes int);

INSERT INTO T
    (MRN, ID, ADTM, SDTM, TDays, TMinutes)
VALUES
    (45, '45_1', '2016-03-24 06:28:00.000', '2016-03-24 18:15:00.000', 0, 707),
    (45, '45_2', '2016-03-24 11:07:00.000', '2016-03-24 18:15:00.000', 0, 428),
    (381, '381_1', '2016-01-30 00:25:00.000', '2016-01-31 16:53:00.000', 0, 1415),
    (381, '381_1', '2016-01-31 00:00:00.000', '2016-01-31 16:53:00.000', 0, 1013),
    (381, '381_3', '2016-01-31 16:53:00.000', '2016-02-01 17:50:00.000', 0, 427),
    (381, '381_3', '2016-02-01 00:00:00.000', '2016-02-01 17:50:00.000', 0, 1070),
    (381, '381_2', '2016-01-31 11:30:00.000', '2016-01-31 16:53:00.000', 0, 323);
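
With that table in place, the overlap condition described above can be checked directly. The following is only a sketch, not part of my existing script: it assumes "next record" means the row with the next-highest ADTM within the same MRN, and the aliases `cur`, `nxt` and the column name `NextADTM` are made up for illustration. It avoids `LEAD` so that it also runs on SQL Server 2008.

```sql
-- Sketch: list rows whose end (SDTM) runs past the start (ADTM) of the
-- next row for the same MRN. Assumes ADTM/SDTM always use the fixed-width
-- 'yyyy-mm-dd hh:mi:ss.mmm' layout, so comparing the varchar values
-- gives the same order as comparing real datetimes.
SELECT  cur.MRN,
        cur.ID,
        cur.ADTM,
        cur.SDTM,
        nxt.ADTM AS NextADTM
FROM    T AS cur
        CROSS APPLY (SELECT TOP (1) n.ADTM
                     FROM   T AS n
                     WHERE  n.MRN = cur.MRN
                            AND n.ADTM > cur.ADTM
                     ORDER BY n.ADTM
                    ) AS nxt          -- rows with no later row drop out here
WHERE   cur.SDTM > nxt.ADTM           -- strict: touching intervals are fine
ORDER BY cur.MRN, cur.ADTM;
```

On the sample data this should flag the same four rows that carry an italicised SDTM in the desired output.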

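For part 1, I have been working with a CTE query, but it just merges the overlapping records. I need to query the neighbouring record to check for the required condition, but I'm lost.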

; WITH StartD AS
(
    SELECT ID, ADTM, ROW_NUMBER() 
    OVER(PARTITION BY ID ORDER BY ADTM) AS Rn 
    FROM
        WD AS t
    WHERE 
        NOT EXISTS
        (
            SELECT *
            FROM WD AS p 
            WHERE p.ID = t.ID 
                AND p.ADTM < t.ADTM  
                AND t.ADTM <= DATEADD(day, 1, p.SDTM) 
        )
) , EndD AS
(
    SELECT ID, SDTM, ROW_NUMBER() 
    OVER(PARTITION BY ID ORDER BY SDTM) AS Rn 
    FROM
        WD AS t
    WHERE
        NOT EXISTS
        ( 
            SELECT *
            FROM WD AS p
            WHERE p.ID = t.ID
                AND DATEADD(day, -1, p.ADTM) <= t.SDTM
                AND t.SDTM < p.SDTM
        )
) SELECT s.ID, s.ADTM, e.SDTM
  FROM StartD AS s JOIN EndD AS e
      ON  e.ID = s.ID AND e.Rn = s.Rn;

Can anyone give me any advice on how to accomplish this?

Thanks for your time.


This case is not resolved by the accepted answer:

MRN ID     ADTM                    SDTM                    TDays   Tminutes
381 381_1  2016-01-30 00:25:00.000 2016-01-31 00:00:00.000 0       1415
381 381_2  2016-01-31 11:30:00.000 2016-02-01 00:00:00.000 0       323
381 381_3  2016-01-31 16:53:00.000 2016-02-01 00:00:00.000 0       1070

The new table is:

CREATE TABLE T (
    MRN int, ID varchar(5), ADTM varchar(23), SDTM varchar(23), TDays int, TMinutes int);

INSERT INTO T
    (MRN, ID, ADTM, SDTM, TDays, TMinutes)
VALUES
    (45, '45_1', '2016-03-24 06:28:00.000', '2016-03-24 18:15:00.000', 0, 707),
    (45, '45_2', '2016-03-24 11:07:00.000', '2016-03-24 18:15:00.000', 0, 428),
    (381, '381_1', '2016-01-30 00:25:00.000', '2016-01-31 00:00:00.000', 0, 1415),
    (381, '381_2', '2016-01-31 11:30:00.000', '2016-02-01 00:00:00.000', 0, 323),
    (381, '381_3', '2016-01-31 16:53:00.000', '2016-02-01 00:00:00.000', 0, 427);

This should do what you need in SQL Server 2008:

SELECT  t1.ID,
        t1.ADTM,
        COALESCE(t2.ADTM,t1.SDTM) SDTM,
        DATEDIFF(MINUTE,t1.ADTM,COALESCE(t2.ADTM,t1.SDTM)) Tminutes
FROM    T t1
        OUTER APPLY (SELECT TOP 1
                            *
                     FROM   T t2
                     WHERE  t2.MRN = t1.MRN
                            AND t2.ADTM > t1.ADTM
                            AND t2.ADTM <> t1.SDTM
                     ORDER BY adtm
                    ) t2
ORDER BY t1.ID
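
The `OUTER APPLY ... TOP 1` above is there to fetch the next row, because SQL Server 2008 has no `LEAD`. On SQL Server 2012 or later (an assumption, the question does not say which version is in use) the next-row lookup could be sketched with `LEAD` instead. Note that this is a variant, not an equivalent rewrite: it has no `t2.ADTM <> t1.SDTM` filter and it keeps the earlier of the row's own SDTM and the next ADTM. The alias `x` and the column `NextADTM` are illustrative names.

```sql
-- Sketch for SQL Server 2012+: cap each row's SDTM at the next ADTM
-- within the same MRN. Relies on the fixed-width
-- 'yyyy-mm-dd hh:mi:ss.mmm' strings sorting chronologically.
SELECT  x.MRN,
        x.ID,
        x.ADTM,
        -- NextADTM is NULL for the last row of each MRN, so the CASE
        -- falls through to the row's own SDTM.
        CASE WHEN x.NextADTM < x.SDTM THEN x.NextADTM ELSE x.SDTM END AS SDTM
FROM   (SELECT MRN, ID, ADTM, SDTM,
               LEAD(ADTM) OVER (PARTITION BY MRN ORDER BY ADTM) AS NextADTM
        FROM   T
       ) AS x
ORDER BY x.MRN, x.ADTM;
```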

This seems like the right way to start:

declare @T table ( MRN int, ID varchar(5), ADTM varchar(23), SDTM varchar(23),
                  TDays int, TMinutes int);

INSERT INTO @T (MRN, ID, ADTM, SDTM, TDays, TMinutes) VALUES
(45, '45_1', '2016-03-24 06:28:00.000', '2016-03-24 18:15:00.000', 0, 707),
(45, '45_2', '2016-03-24 11:07:00.000', '2016-03-24 18:15:00.000', 0, 428),
(381, '381_1', '2016-01-30 00:25:00.000', '2016-01-31 16:53:00.000', 0, 1415),
(381, '381_1', '2016-01-31 00:00:00.000', '2016-01-31 16:53:00.000', 0, 1013),
(381, '381_3', '2016-01-31 16:53:00.000', '2016-02-01 17:50:00.000', 0, 427),
(381, '381_3', '2016-02-01 00:00:00.000', '2016-02-01 17:50:00.000', 0, 1070),
(381, '381_2', '2016-01-31 11:30:00.000', '2016-01-31 16:53:00.000', 0, 323);

;With Ordered as (
    select
        *,
        ROW_NUMBER() OVER (PARTITION BY MRN order by ADTM) as rn
    from
        @T
), Ends as (
    select
        o1.MRN,
        o1.ID,
        o1.ADTM,
        CASE WHEN o2.ADTM < o1.SDTM THEN o2.ADTM ELSE o1.SDTM END as SDTM
    from
        Ordered o1
            left join
        Ordered o2
            on
                o1.MRN = o2.MRN and
                o1.rn=  o2.rn - 1
)
select
    *,
    DATEDIFF(minute,ADTM,SDTM) as TMinutes
from Ends

Results:

MRN         ID    ADTM                    SDTM                    TMinutes
----------- ----- ----------------------- ----------------------- -----------
45          45_1  2016-03-24 06:28:00.000 2016-03-24 11:07:00.000 279
45          45_2  2016-03-24 11:07:00.000 2016-03-24 18:15:00.000 428
381         381_1 2016-01-30 00:25:00.000 2016-01-31 00:00:00.000 1415
381         381_1 2016-01-31 00:00:00.000 2016-01-31 11:30:00.000 690
381         381_2 2016-01-31 11:30:00.000 2016-01-31 16:53:00.000 323
381         381_3 2016-01-31 16:53:00.000 2016-02-01 00:00:00.000 427
381         381_3 2016-02-01 00:00:00.000 2016-02-01 17:50:00.000 1070

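Unless your sample data is incomplete or I'm missing something, we always just match each row with the next row after it (simply ordered by ADTM within each MRN), and then take either the current SDTM or the next row's ADTM, whichever is earlier (via the CASE).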
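
If the corrected values need to be written back rather than just selected, the same row-to-next-row logic could be wrapped in an `UPDATE`. This is a minimal sketch under some assumptions: it targets the permanent table `T` from the question rather than `@T`, it assumes (MRN, ID, ADTM) identifies exactly one row (true for the sample data, not guaranteed in general), and `NewSDTM` and `tgt` are illustrative names. `TDays` is left untouched because the question does not define how it is derived.

```sql
-- Sketch only: persist the capped SDTM and recompute TMinutes.
;WITH Ordered AS (
    SELECT MRN, ID, ADTM, SDTM,
           ROW_NUMBER() OVER (PARTITION BY MRN ORDER BY ADTM) AS rn
    FROM   T
), Ends AS (
    SELECT o1.MRN, o1.ID, o1.ADTM,
           -- String comparison is safe here only because every value uses
           -- the fixed-width 'yyyy-mm-dd hh:mi:ss.mmm' layout.
           CASE WHEN o2.ADTM < o1.SDTM THEN o2.ADTM ELSE o1.SDTM END AS NewSDTM
    FROM   Ordered AS o1
           LEFT JOIN Ordered AS o2
                  ON o2.MRN = o1.MRN
                 AND o2.rn  = o1.rn + 1
)
UPDATE tgt
SET    SDTM     = e.NewSDTM,
       -- Style 121 parses 'yyyy-mm-dd hh:mi:ss.mmm' explicitly, so the
       -- result does not depend on the session's language or DATEFORMAT.
       TMinutes = DATEDIFF(minute,
                           CONVERT(datetime, tgt.ADTM, 121),
                           CONVERT(datetime, e.NewSDTM, 121))
FROM   T AS tgt
       JOIN Ends AS e
         ON e.MRN  = tgt.MRN
        AND e.ID   = tgt.ID
        AND e.ADTM = tgt.ADTM;
```

Storing ADTM and SDTM as `datetime` instead of `varchar(23)` would remove the need for the `CONVERT` calls and for the fixed-width assumption.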