SQL or SSIS to find missing date ranges?

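I have an Agreement table and a Plan table. Plans are children of agreements and have an FK reference to them.

All the plans for a given agreement should together cover the agreement's entire date range. However, some agreements are missing plans.

Is there a way in SQL or SSIS to get a list of the "missing" plans?

Agreement table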

+--------------+----------------------+--------------------+
| Agreement Id | Agreement Start Date | Agreement End Date |
+--------------+----------------------+--------------------+
|            1 | 1/1/2010             | 12/31/2016         |
+--------------+----------------------+--------------------+

Plan table

+--------------+---------+-----------------+---------------+
| Agreement Id | Plan Id | Plan Start Date | Plan End Date |
+--------------+---------+-----------------+---------------+
|            1 |       1 | 1/1/2010        | 12/31/2010    |
|            1 |       2 | 1/1/2012        | 12/31/2012    |
|            1 |       3 | 1/1/2014        | 12/31/2016    |
+--------------+---------+-----------------+---------------+

Desired Plan table

+--------------+---------+-----------------+---------------+
| Agreement Id | Plan Id | Plan Start Date | Plan End Date |
+--------------+---------+-----------------+---------------+
|            1 |       1 | 1/1/2010        | 12/31/2010    |
|            1 |       4 | 1/1/2011        | 12/31/2011    |
|            1 |       2 | 1/1/2012        | 12/31/2012    |
|            1 |       5 | 1/1/2013        | 12/31/2013    |
|            1 |       3 | 1/1/2014        | 12/31/2016    |
+--------------+---------+-----------------+---------------+

Basically, I want to get the missing plans for agreement 1, i.e. these rows:

+--------------+---------+-----------------+---------------+
| Agreement Id | Plan Id | Plan Start Date | Plan End Date |
+--------------+---------+-----------------+---------------+
|            1 |       4 | 1/1/2011        | 12/31/2011    |
|            1 |       5 | 1/1/2013        | 12/31/2013    |
+--------------+---------+-----------------+---------------+
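
To reproduce this, the sample data above could be loaded roughly as follows. This is only a sketch: the table names dbo.Agreements and dbo.Plans, the snake_case column names, and the DATE column type are assumptions (the names follow the identifiers used in the queries further down rather than the headers shown above).

```sql
-- Assumed schema: names and types are illustrative, not taken from the real database.
CREATE TABLE dbo.Agreements
(
    agreement_id INT  NOT NULL PRIMARY KEY,
    start_date   DATE NOT NULL,
    end_date     DATE NOT NULL
);

CREATE TABLE dbo.Plans
(
    plan_id      INT  NOT NULL PRIMARY KEY,
    agreement_id INT  NOT NULL REFERENCES dbo.Agreements (agreement_id),
    start_date   DATE NOT NULL,
    end_date     DATE NOT NULL
);

-- Sample rows from the tables above ('YYYYMMDD' literals are language-independent).
INSERT INTO dbo.Agreements (agreement_id, start_date, end_date)
VALUES (1, '20100101', '20161231');

INSERT INTO dbo.Plans (plan_id, agreement_id, start_date, end_date)
VALUES
    (1, 1, '20100101', '20101231'),
    (2, 1, '20120101', '20121231'),
    (3, 1, '20140101', '20161231');
```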

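The query below was written for MS SQL Server, so you may need to adjust the date functions if you're coding for MySQL, but I believe it should work. I don't think I've covered the case where a plan is missing at the start of the agreement, so I'll think about that and add code shortly: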

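SELECT
    DATEADD(DAY, 1, P.end_date) AS start_date,
    -- The gap ends the day before the next plan starts, or at the agreement end if there is no later plan
    COALESCE(DATEADD(DAY, -1, P3.start_date), A.end_date) AS end_date
FROM
    dbo.Agreements A
INNER JOIN dbo.Plans P ON
    P.agreement_id = A.agreement_id
-- P2: a plan that starts the day after P ends (if one exists, there is no gap after P)
LEFT OUTER JOIN dbo.Plans P2 ON P2.agreement_id = A.agreement_id AND P2.start_date = DATEADD(DAY, 1, P.end_date)
-- P3: any plan that starts after P ends (candidate for the next plan)
LEFT OUTER JOIN dbo.Plans P3 ON P3.agreement_id = A.agreement_id AND P3.start_date > P.end_date
-- P4: a plan that starts strictly between P and P3 (if one exists, P3 is not the next plan).
-- Strict bounds keep a one-day plan P from matching itself here.
LEFT OUTER JOIN dbo.Plans P4 ON P4.agreement_id = A.agreement_id AND P4.start_date > P.end_date AND P4.start_date < P3.start_date
WHERE
    P.end_date <> A.end_date AND
    P2.agreement_id IS NULL AND
    P4.agreement_id IS NULL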
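
One possible way to cover the gap at the start of an agreement, which the query above does not handle, is sketched here. It assumes the same dbo.Agreements and dbo.Plans tables and DATE-typed columns, and it only looks at agreements that have at least one plan (an agreement with no plans at all would need a separate check). Treat it as an untested sketch rather than a finished solution.

```sql
-- Leading gap: the earliest plan starts after the agreement itself starts.
SELECT
    A.agreement_id,
    A.start_date AS start_date,
    DATEADD(DAY, -1, MIN(P.start_date)) AS end_date
FROM
    dbo.Agreements A
INNER JOIN dbo.Plans P ON
    P.agreement_id = A.agreement_id
GROUP BY
    A.agreement_id,
    A.start_date
HAVING
    MIN(P.start_date) > A.start_date
```

For the sample data in the question this returns no rows, because plan 1 starts on the agreement's start date.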

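The following approach should also catch missing plans at the start and end of an agreement, but it uses the window function ROW_NUMBER to line things up. You can do this without ROW_NUMBER, but it's much more complicated. I'm also not sure whether there's a simpler way to do this in SQL, but this is the first thing that came to mind as I started typing: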

;WITH CTE_MissingEndDates AS
(
    -- End dates of gaps: the day before a plan starts, when no plan ends on that day
    SELECT agreement_id, missing_end_date, ROW_NUMBER() OVER (PARTITION BY agreement_id ORDER BY missing_end_date) AS row_num
    FROM
    (
        SELECT
            A.agreement_id,
            DATEADD(DAY, -1, P1.start_date) AS missing_end_date
        FROM
            dbo.Agreements A
        INNER JOIN dbo.Plans P1 ON P1.agreement_id = A.agreement_id
        LEFT OUTER JOIN dbo.Plans P2 ON P2.agreement_id = A.agreement_id AND P2.end_date = DATEADD(DAY, -1, P1.start_date)
        WHERE
            P1.start_date > A.start_date AND
            P2.agreement_id IS NULL    -- keep only starts with no plan ending the day before
        UNION
        -- The agreement's own end date, when no plan ends on it
        SELECT A2.agreement_id, A2.end_date FROM dbo.Agreements A2 WHERE NOT EXISTS (SELECT * FROM dbo.Plans WHERE agreement_id = A2.agreement_id AND end_date = A2.end_date)
    ) SQ
),
CTE_MissingStartDates AS
(
    -- Start dates of gaps: the day after a plan ends, when no plan starts on that day
    SELECT agreement_id, missing_start_date, ROW_NUMBER() OVER (PARTITION BY agreement_id ORDER BY missing_start_date) AS row_num
    FROM
    (
        SELECT
            A.agreement_id,
            DATEADD(DAY, 1, P1.end_date) AS missing_start_date
        FROM
            dbo.Agreements A
        INNER JOIN dbo.Plans P1 ON P1.agreement_id = A.agreement_id
        LEFT OUTER JOIN dbo.Plans P2 ON P2.agreement_id = A.agreement_id AND P2.start_date = DATEADD(DAY, 1, P1.end_date)
        WHERE
            P1.end_date < A.end_date AND
            P2.agreement_id IS NULL    -- keep only ends with no plan starting the day after
        UNION
        -- The agreement's own start date, when no plan starts on it
        SELECT A2.agreement_id, A2.start_date FROM dbo.Agreements A2 WHERE NOT EXISTS (SELECT * FROM dbo.Plans WHERE agreement_id = A2.agreement_id AND start_date = A2.start_date)
    ) SQ
)
-- Pair the n-th missing start with the n-th missing end within each agreement
SELECT
    MSD.missing_start_date,
    MED.missing_end_date
FROM
    CTE_MissingStartDates MSD
INNER JOIN CTE_MissingEndDates MED ON
    MED.agreement_id = MSD.agreement_id AND
    MED.row_num = MSD.row_num
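
On SQL Server 2012 or later, a possibly simpler alternative is to use the LEAD window function to look at the next plan directly. This is a different technique from the ROW_NUMBER pairing above, offered only as a sketch. It assumes the same dbo.Agreements and dbo.Plans tables, DATE-typed columns, at least one plan per agreement, and plans that do not overlap within an agreement. The CTE name and the gap_start / gap_end aliases are made up for illustration.

```sql
;WITH CTE_Candidates AS
(
    -- After each plan: from the day after it ends to the day before the next plan starts.
    -- For the last plan, LEAD falls back to the day after the agreement ends.
    SELECT
        P.agreement_id,
        DATEADD(DAY, 1, P.end_date) AS gap_start,
        DATEADD(DAY, -1,
            LEAD(P.start_date, 1, DATEADD(DAY, 1, A.end_date))
                OVER (PARTITION BY P.agreement_id ORDER BY P.start_date)) AS gap_end
    FROM
        dbo.Plans P
    INNER JOIN dbo.Agreements A ON A.agreement_id = P.agreement_id
    UNION ALL
    -- Before the first plan: from the agreement start to the day before the earliest plan.
    SELECT
        A.agreement_id,
        A.start_date,
        DATEADD(DAY, -1, MIN(P.start_date))
    FROM
        dbo.Agreements A
    INNER JOIN dbo.Plans P ON P.agreement_id = A.agreement_id
    GROUP BY
        A.agreement_id,
        A.start_date
)
SELECT
    agreement_id,
    gap_start,
    gap_end
FROM
    CTE_Candidates
WHERE
    gap_start <= gap_end    -- drop candidates where the neighbours are already contiguous
ORDER BY
    agreement_id,
    gap_start
```

Against the sample data in the question, the candidates that survive the filter are 2011-01-01 to 2011-12-31 and 2013-01-01 to 2013-12-31. If plans can overlap, the candidate ranges would need extra handling.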