SQL or SSIS to find missing date ranges?
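I have an Agreement table and a Plan table. Plans are children of agreements and have an FK reference to them.
All plans for the same agreement should together cover the agreement's entire date range. However, some agreements are missing plans.
Is there a way in SQL or SSIS to get a list of the "missing" plans?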
Agreement table
+--------------+----------------------+--------------------+
| Agreement Id | Agreement Start Date | Agreement End Date |
+--------------+----------------------+--------------------+
| 1 | 1/1/2010 | 12/31/2016 |
+--------------+----------------------+--------------------+
Plan table
+--------------+---------+-----------------+---------------+
| Agreement Id | Plan Id | Plan Start Date | Plan End Date |
+--------------+---------+-----------------+---------------+
| 1 | 1 | 1/1/2010 | 12/31/2010 |
| 1 | 2 | 1/1/2012 | 12/31/2012 |
| 1 | 3 | 1/1/2014 | 12/31/2016 |
+--------------+---------+-----------------+---------------+
Desired Plan table
+--------------+---------+-----------------+---------------+
| Agreement Id | Plan Id | Plan Start Date | Plan End Date |
+--------------+---------+-----------------+---------------+
| 1 | 1 | 1/1/2010 | 12/31/2010 |
| 1 | 4 | 1/1/2011 | 12/31/2011 |
| 1 | 2 | 1/1/2012 | 12/31/2012 |
| 1 | 5 | 1/1/2013 | 12/31/2013 |
| 1 | 3 | 1/1/2014 | 12/31/2016 |
+--------------+---------+-----------------+---------------+
Basically, I want to get the missing plans for agreement 1, i.e. these rows:
+--------------+---------+-----------------+---------------+
| Agreement Id | Plan Id | Plan Start Date | Plan End Date |
+--------------+---------+-----------------+---------------+
| 1 | 4 | 1/1/2011 | 12/31/2011 |
| 1 | 5 | 1/1/2013 | 12/31/2013 |
+--------------+---------+-----------------+---------------+
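For anyone who wants to reproduce the example, here is a minimal setup sketch. The table and column names (dbo.Agreements, dbo.Plans, agreement_id, plan_id, start_date, end_date) follow the queries further down; the INT and DATE column types and the key constraints are assumptions, since the real schema is not shown.

```sql
-- Assumed schema: the types and constraints are guesses, not the real tables.
CREATE TABLE dbo.Agreements (
    agreement_id INT  NOT NULL PRIMARY KEY,
    start_date   DATE NOT NULL,
    end_date     DATE NOT NULL
);

CREATE TABLE dbo.Plans (
    plan_id      INT  NOT NULL PRIMARY KEY,
    agreement_id INT  NOT NULL REFERENCES dbo.Agreements (agreement_id),
    start_date   DATE NOT NULL,
    end_date     DATE NOT NULL
);

-- Sample rows from the tables above (dates written as YYYYMMDD).
INSERT INTO dbo.Agreements (agreement_id, start_date, end_date)
VALUES (1, '20100101', '20161231');

INSERT INTO dbo.Plans (agreement_id, plan_id, start_date, end_date)
VALUES (1, 1, '20100101', '20101231'),
       (1, 2, '20120101', '20121231'),
       (1, 3, '20140101', '20161231');
```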
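This is written for MS SQL Server, so if you are coding for MySQL you may need to adjust the date functions, but I believe this should work. I don't think it covers the case where a plan is missing at the start of the agreement, so I'll think about that and add code shortly: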
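SELECT
    A.agreement_id,
    DATEADD(DAY, 1, P.end_date) AS start_date,
    -- The gap ends the day before the next plan, or at the agreement end if no plan follows
    COALESCE(DATEADD(DAY, -1, P3.start_date), A.end_date) AS end_date
FROM
    dbo.Agreements A
    INNER JOIN dbo.Plans P ON
        P.agreement_id = A.agreement_id
    -- P2: a plan starting the day after P ends (so no gap follows P)
    LEFT OUTER JOIN dbo.Plans P2 ON
        P2.agreement_id = A.agreement_id AND
        P2.start_date = DATEADD(DAY, 1, P.end_date)
    -- P3: a later plan, candidate for the one that closes the gap
    LEFT OUTER JOIN dbo.Plans P3 ON
        P3.agreement_id = A.agreement_id AND
        P3.start_date > P.end_date
    -- P4: another plan between P and P3, which would mean P3 is not the nearest one
    LEFT OUTER JOIN dbo.Plans P4 ON
        P4.agreement_id = A.agreement_id AND
        P4.start_date BETWEEN P.end_date AND P3.start_date AND
        P4.plan_id <> P3.plan_id AND
        P4.plan_id <> P.plan_id  -- a single-day plan must not match itself
WHERE
    P.end_date < A.end_date AND
    P2.agreement_id IS NULL AND
    P4.agreement_id IS NULL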
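The query above does not report a gap between the agreement start and its first plan. One way that case might be handled on its own is sketched below, assuming the same dbo.Agreements and dbo.Plans tables and that every agreement has at least one plan. This is an illustration, not part of the original query.

```sql
-- Sketch: report a gap between the agreement start and its earliest plan.
-- Agreements with no plans at all are not returned because of the INNER JOIN.
SELECT
    A.agreement_id,
    A.start_date AS start_date,
    DATEADD(DAY, -1, MIN(P.start_date)) AS end_date
FROM
    dbo.Agreements A
    INNER JOIN dbo.Plans P ON
        P.agreement_id = A.agreement_id
GROUP BY
    A.agreement_id,
    A.start_date
HAVING
    MIN(P.start_date) > A.start_date;
```

With the sample data this returns no rows, because plan 1 starts on the agreement's own start date. Its result could be combined with the query above using UNION ALL.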
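This approach should also catch missing plans at the start and end of the agreement, but it uses the ROW_NUMBER window function to line things up. You could do it without ROW_NUMBER, but it would be much more complicated. I'm also not sure whether there is a simpler way to do this in SQL, but it's the first thing that came to mind as I started typing: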
;WITH CTE_MissingEndDates AS
(
    -- Last day of each gap, numbered in date order per agreement
    SELECT agreement_id, missing_end_date, ROW_NUMBER() OVER (PARTITION BY agreement_id ORDER BY missing_end_date) AS row_num
    FROM
    (
        -- The day before a plan starts, when no other plan ends on that day
        SELECT
            A.agreement_id,
            DATEADD(DAY, -1, P1.start_date) AS missing_end_date
        FROM
            dbo.Agreements A
            INNER JOIN dbo.Plans P1 ON P1.agreement_id = A.agreement_id
            LEFT OUTER JOIN dbo.Plans P2 ON P2.agreement_id = A.agreement_id AND P2.end_date = DATEADD(DAY, -1, P1.start_date)
        WHERE
            P1.start_date > A.start_date AND
            P2.agreement_id IS NULL  -- without this filter every plan start would count as a gap end
        UNION
        -- The agreement end date, when no plan ends on it
        SELECT A2.agreement_id, A2.end_date FROM dbo.Agreements A2 WHERE NOT EXISTS (SELECT * FROM dbo.Plans WHERE agreement_id = A2.agreement_id AND end_date = A2.end_date)
    ) SQ
),
CTE_MissingStartDates AS
(
    -- First day of each gap, numbered in date order per agreement
    SELECT agreement_id, missing_start_date, ROW_NUMBER() OVER (PARTITION BY agreement_id ORDER BY missing_start_date) AS row_num
    FROM
    (
        -- The day after a plan ends, when no other plan starts on that day
        SELECT
            A.agreement_id,
            DATEADD(DAY, 1, P1.end_date) AS missing_start_date
        FROM
            dbo.Agreements A
            INNER JOIN dbo.Plans P1 ON P1.agreement_id = A.agreement_id
            LEFT OUTER JOIN dbo.Plans P2 ON P2.agreement_id = A.agreement_id AND P2.start_date = DATEADD(DAY, 1, P1.end_date)
        WHERE
            P1.end_date < A.end_date AND
            P2.agreement_id IS NULL  -- without this filter every plan end would count as a gap start
        UNION
        -- The agreement start date, when no plan starts on it
        SELECT A2.agreement_id, A2.start_date FROM dbo.Agreements A2 WHERE NOT EXISTS (SELECT * FROM dbo.Plans WHERE agreement_id = A2.agreement_id AND start_date = A2.start_date)
    ) SQ
)
-- Pair the n-th gap start with the n-th gap end within each agreement
SELECT
    MSD.agreement_id,
    MSD.missing_start_date,
    MED.missing_end_date
FROM
    CTE_MissingStartDates MSD
    INNER JOIN CTE_MissingEndDates MED ON
        MED.agreement_id = MSD.agreement_id AND
        MED.row_num = MSD.row_num
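On SQL Server 2012 or later, a shorter variant might be possible with the LEAD window function instead of pairing two ROW_NUMBER lists. The sketch below is a different technique from the query above, offered only as an illustration. It assumes that plans within an agreement never overlap and always lie inside the agreement's date range; the CTE names Covered and Ordered are made up for this example.

```sql
;WITH Covered AS
(
    SELECT agreement_id, start_date, end_date FROM dbo.Plans
    UNION ALL
    -- Sentinel row ending the day before the agreement starts
    SELECT agreement_id, DATEADD(DAY, -1, start_date), DATEADD(DAY, -1, start_date)
    FROM dbo.Agreements
    UNION ALL
    -- Sentinel row starting the day after the agreement ends
    SELECT agreement_id, DATEADD(DAY, 1, end_date), DATEADD(DAY, 1, end_date)
    FROM dbo.Agreements
),
Ordered AS
(
    -- For each row, look up the start date of the next row in date order
    SELECT
        agreement_id,
        end_date,
        LEAD(start_date) OVER (PARTITION BY agreement_id ORDER BY start_date) AS next_start_date
    FROM Covered
)
SELECT
    agreement_id,
    DATEADD(DAY, 1, end_date) AS missing_start_date,
    DATEADD(DAY, -1, next_start_date) AS missing_end_date
FROM Ordered
WHERE next_start_date > DATEADD(DAY, 1, end_date)  -- more than one day apart means a gap
ORDER BY agreement_id, missing_start_date;
```

The two sentinel rows let gaps at the very start or end of an agreement, and agreements with no plans at all, fall out of the same comparison. With the sample data this should return the 1/1/2011 to 12/31/2011 and 1/1/2013 to 12/31/2013 ranges for agreement 1.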