SQL Server PIVOT on multiple columns is losing data
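I have spent far too much time on this problem and still have not solved it. Before you jump to the conclusion that this is a duplicate of every other multi-column PIVOT question on SO, please read it through.
We have properties and units, and a table that tracks when a unit changes. We cannot change the table's structure because it belongs to a vendor application.
Objective: extract the start and end dates of each period during which a unit's unavailable code is "model".
Problem: I need to filter out the dates in between, when the unit was available, but every approach I try drops one row of data (for unit 105).
What I have tried: PIVOT and CROSS APPLY, combined with LEAD/LAG.
Here is the SQLFiddle link: http://sqlfiddle.com/#!6/29592/2/0
The rest of the question contains the T-SQL from the SQLFiddle, including the results I get. The desired result is at the end.
Create the table and insert sample data: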
DROP TABLE IF EXISTS testModelUnit;
CREATE TABLE testModelUnit(
propertykey INT NOT NULL
,unitNumber VARCHAR(10) NOT NULL
,rowStartDate DATETIME NOT NULL
,rowEndDate DATETIME NOT NULL
,unavailableCode varchar(10) NULL
,CONSTRAINT pk_testModelUnit PRIMARY KEY (propertykey, unitNumber, rowStartDate )
)
GO
INSERT INTO testModelUnit VALUES
(33,'105', '2010-11-11 00:00:00.000','2016-11-11 00:00:00.000','MODEL')
,(33,'105', '2016-11-11 00:00:00.000','2016-12-14 07:51:03.307','MODEL')
,(33,'105', '2016-12-14 07:51:03.307','2017-01-01 00:00:00.000',NULL)
,(33,'105', '2017-01-01 00:00:00.00','2017-03-21 12:21:13.703','MODEL')
,(33,'105', '2017-03-21 12:21:13.703','2017-04-21 12:21:13.703','MODEL')
,(33,'105', '2017-04-21 12:21:13.703','9999-12-31 00:00:00.000','MODEL')
,(33,'2606','2017-04-21 12:21:23.207','9999-12-31 00:00:00.000','MODEL')
,(33,'2606','2017-04-19 10:30:09.227','2017-04-21 12:21:23.207','MODEL')
,(33,'2703','2016-12-14 07:51:03.307','2017-04-19 10:29:47.970','MODEL')
,(33,'2703','2011-11-11 00:00:00.000','2016-12-14 07:51:03.307','MODEL')
GO
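This gives you all the data needed to test it; note that unit 105 was available for a short period at the end of 2016.
Attempt 1: use LEAD/LAG to determine whether a date is the first or last in a series, then use multiple PIVOT statements.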
SELECT
propertykey
,unitNumber
,firstDate
,lastDate
FROM (
SELECT
propertykey
,unitNumber
,rowStartDate
,rowEndDate
,CASE
WHEN propertykey = LAG(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND unitNumber = LAG(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND LAG(rowEndDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowStartDate THEN NULL
ELSE 'firstDate'
END ISFIRST
,CASE
WHEN propertykey = LEAD(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND unitNumber = LEAD(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND LEAD(rowStartDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowEndDate THEN NULL
ELSE 'lastDate'
END ISLAST
FROM testModelUnit
WHERE UnavailableCode = 'model'
) SRC
PIVOT (
MAX(rowStartDate)
FOR isfirst in ([firstDate])
) as pivotFirst
PIVOT (
MAX(rowEndDate)
FOR islast in ([lastDate])
) as pivotLast
The result is:
propertykey unitNumber firstDate lastDate
33 105 NULL 9999-12-31 00:00:00.000
33 105 2010-11-11 00:00:00.000 NULL
33 105 2017-01-01 00:00:00.000 NULL
33 2606 NULL 9999-12-31 00:00:00.000
33 2606 2017-04-19 10:30:09.227 NULL
33 2703 NULL 2017-04-19 10:29:47.970
33 2703 2011-11-11 00:00:00.000 NULL
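The problem is twofold: first, the NULLs end up in separate rows; second, I am missing an end date for unit 105. (Reversing the order of the two PIVOT statements flips the problem: I then lose a start date instead.)
Attempt 2: use LAG/LEAD as before, but this time use CROSS APPLY to put the first/last values into a single column, then pivot the result.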
SELECT
propertykey
,unitNumber
,firstDate
,lastDate
FROM(
SELECT
propertykey
,unitNumber
,ca.col
,ca.value
FROM
(
SELECT
propertykey
,unitNumber
,rowStartDate
,rowEndDate
,CASE
WHEN propertykey = LAG(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND unitNumber = LAG(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND LAG(rowEndDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowStartDate THEN NULL
ELSE 'firstDate'
END ISFIRST
,CASE
WHEN propertykey = LEAD(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND unitNumber = LEAD(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND LEAD(rowStartDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowEndDate THEN NULL
ELSE 'lastDate'
END ISLAST
FROM testModelUnit
WHERE UnavailableCode = 'model'
) sub
OUTER APPLY (
SELECT ISFIRST, rowStartDate
UNION ALL
SELECT ISLAST, rowEndDate
) CA (col, value)
WHERE col IS NOT NULL
)src
PIVOT
(
max(value)
for col in ([firstDate],[lastDate])
) AS pivoted
Result:
propertykey unitNumber firstDate lastDate
33 105 2017-01-01 00:00:00.000 9999-12-31 00:00:00.000
33 2606 2017-04-19 10:30:09.227 9999-12-31 00:00:00.000
33 2703 2011-11-11 00:00:00.000 2017-04-19 10:29:47.970
Problem: I got rid of the NULL rows, but I am still missing one record for unit 105.
Desired result:
propertykey unitNumber firstDate lastDate
33 105 2010-11-11 00:00:00.000 2016-12-14 07:51:03.307
33 105 2017-01-01 00:00:00.000 9999-12-31 00:00:00.000
33 2606 2017-04-19 10:30:09.227 9999-12-31 00:00:00.000
33 2703 2011-11-11 00:00:00.000 2017-04-19 10:29:47.970
Are you looking for a query like the one below?
Select PropertyKey, UnitNumber, Min(RowStartDate) as FirstDate, Max(rowEndDate) as LastDate from (
Select *, Bucket = Row_number() over(partition by propertykey, unitnumber order by rowStartDate) -
Row_number() over(partition by propertykey, unitnumber, unavailablecode order by rowStartDate)
from testModelUnit
) a
Where a.unavailableCode is not null
group by propertykey, unitNumber, Bucket
Output:
+-------------+------------+-------------------------+-------------------------+
| PropertyKey | UnitNumber | FirstDate | LastDate |
+-------------+------------+-------------------------+-------------------------+
| 33 | 105 | 2010-11-11 00:00:00.000 | 2016-12-14 07:51:03.307 |
| 33 | 105 | 2017-01-01 00:00:00.000 | 9999-12-31 00:00:00.000 |
| 33 | 2606 | 2017-04-19 10:30:09.227 | 9999-12-31 00:00:00.000 |
| 33 | 2703 | 2011-11-11 00:00:00.000 | 2017-04-19 10:29:47.970 |
+-------------+------------+-------------------------+-------------------------+
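Why this works, as far as I can tell: PIVOT implicitly groups by every column it is not pivoting (here propertykey and unitNumber), so MAX() folds both unavailable periods of unit 105 into one row. The difference of the two row numbers stays constant within a run of consecutive rows that share the same code, so it gives GROUP BY an extra key that keeps the runs apart. The sketch below is only an illustration, not part of the original fiddle: it uses Python's sqlite3 module as a stand-in for SQL Server (this assumes SQLite 3.25 or later for window functions), stores the dates as ISO text, and loads only the rows for unit 105.

```python
import sqlite3

# Rows for unit 105 only, copied from the sample data (dates kept as ISO text).
rows = [
    (33, "105", "2010-11-11 00:00:00.000", "2016-11-11 00:00:00.000", "MODEL"),
    (33, "105", "2016-11-11 00:00:00.000", "2016-12-14 07:51:03.307", "MODEL"),
    (33, "105", "2016-12-14 07:51:03.307", "2017-01-01 00:00:00.000", None),
    (33, "105", "2017-01-01 00:00:00.000", "2017-03-21 12:21:13.703", "MODEL"),
    (33, "105", "2017-03-21 12:21:13.703", "2017-04-21 12:21:13.703", "MODEL"),
    (33, "105", "2017-04-21 12:21:13.703", "9999-12-31 00:00:00.000", "MODEL"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE testModelUnit (propertykey INT, unitNumber TEXT, "
    "rowStartDate TEXT, rowEndDate TEXT, unavailableCode TEXT)"
)
conn.executemany("INSERT INTO testModelUnit VALUES (?, ?, ?, ?, ?)", rows)

# Same bucket expression as the query above: overall row number minus the
# row number within rows that share the same unavailableCode.
bucket_sql = """
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber
                              ORDER BY rowStartDate)
         - ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber, unavailableCode
                              ORDER BY rowStartDate) AS bucket
    FROM testModelUnit
"""

# Step 1: look at the bucket assigned to each row, in date order.
buckets = [
    b for (b,) in conn.execute(
        f"SELECT bucket FROM ({bucket_sql}) a ORDER BY rowStartDate"
    )
]
print(buckets)  # [0, 0, 2, 1, 1, 1]

# Step 2: drop the available (NULL) rows and collapse each bucket to one period.
islands = conn.execute(
    f"""
    SELECT unitNumber, MIN(rowStartDate) AS firstDate, MAX(rowEndDate) AS lastDate
    FROM ({bucket_sql}) a
    WHERE unavailableCode IS NOT NULL
    GROUP BY propertykey, unitNumber, bucket
    ORDER BY firstDate
    """
).fetchall()
for row in islands:
    print(row)
```

The two MODEL runs get buckets 0 and 1, so they stay in separate groups. Note that the bucket only reflects row order, not whether the dates are contiguous, and if the table held more than one non-NULL code you would probably want unavailableCode in the GROUP BY as well.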