SQL Server PIVOT on multiple columns is losing data

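I have spent far too much time on this problem and still have not reached the end of it. Before you jump to the conclusion that this is a duplicate of all the other pivot-on-multiple-columns questions on SO, please read it through.

We have properties and units, and a table that tracks when a unit changes. We cannot change the structure of the table because it belongs to a vendor application.

Objective: extract the start and end dates of the periods when a unit's unavailable code is "model".

Problem: I need to filter out the dates in between when the unit was available, but every attempt omits one row of data (for unit 105).

What I have tried: PIVOT, and CROSS APPLY combined with LEAD/LAG.

Here is a link to the SQLFiddle: http://sqlfiddle.com/#!6/29592/2/0

The rest of the question contains the T-SQL from the SQLFiddle, including the results I get. The desired result is at the end.

Create the table and insert sample data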

DROP TABLE IF EXISTS testModelUnit; 
CREATE TABLE testModelUnit(
    propertykey         INT             NOT NULL
    ,unitNumber         VARCHAR(10)     NOT NULL
    ,rowStartDate       DATETIME        NOT NULL
    ,rowEndDate         DATETIME        NOT NULL
    ,unavailableCode    varchar(10)     NULL
    ,CONSTRAINT pk_testModelUnit PRIMARY KEY (propertykey, unitNumber, rowStartDate )
)
GO

INSERT INTO testModelUnit VALUES 

(33,'105',  '2010-11-11 00:00:00.000','2016-11-11 00:00:00.000','MODEL')
,(33,'105', '2016-11-11 00:00:00.000','2016-12-14 07:51:03.307','MODEL')
,(33,'105', '2016-12-14 07:51:03.307','2017-01-01 00:00:00.000',NULL)
,(33,'105', '2017-01-01 00:00:00.00','2017-03-21 12:21:13.703','MODEL')
,(33,'105', '2017-03-21 12:21:13.703','2017-04-21 12:21:13.703','MODEL')
,(33,'105', '2017-04-21 12:21:13.703','9999-12-31 00:00:00.000','MODEL')
,(33,'2606','2017-04-21 12:21:23.207','9999-12-31 00:00:00.000','MODEL')
,(33,'2606','2017-04-19 10:30:09.227','2017-04-21 12:21:23.207','MODEL')
,(33,'2703','2016-12-14 07:51:03.307','2017-04-19 10:29:47.970','MODEL')
,(33,'2703','2011-11-11 00:00:00.000','2016-12-14 07:51:03.307','MODEL')

GO 

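This gives you all the data needed to test it, because unit 105 was available for a short time at the end of 2016.

Attempt 1 - use LEAD/LAG to determine whether a date is the first or last in a series, then use multiple PIVOT statements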

SELECT
    propertykey         
    ,unitNumber     
    ,firstDate
    ,lastDate   
FROM (
    SELECT 
        propertykey         
        ,unitNumber         
        ,rowStartDate       
        ,rowEndDate     
        ,CASE 
            WHEN propertykey = LAG(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                AND unitNumber = LAG(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                AND LAG(rowEndDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowStartDate THEN NULL            
            ELSE 'firstDate'
        END ISFIRST
        ,CASE 
            WHEN propertykey = LEAD(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                AND unitNumber = LEAD(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                AND LEAD(rowStartDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowEndDate THEN NULL           
            ELSE 'lastDate'
        END ISLAST
    FROM testModelUnit
    WHERE UnavailableCode = 'model'
) SRC
PIVOT (
    MAX(rowStartDate)
    FOR isfirst in ([firstDate])
) as pivotFirst
PIVOT (
    MAX(rowEndDate)
    FOR islast in ([lastDate])
) as pivotLast

The result is:

propertykey  unitNumber  firstDate                  lastDate
33           105         NULL                       9999-12-31 00:00:00.000
33           105         2010-11-11 00:00:00.000    NULL
33           105         2017-01-01 00:00:00.000    NULL
33           2606        NULL                       9999-12-31 00:00:00.000
33           2606        2017-04-19 10:30:09.227    NULL
33           2703        NULL                       2017-04-19 10:29:47.970
33           2703        2011-11-11 00:00:00.000    NULL

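The problem is twofold: first, the NULLs end up in separate rows; second, I am missing an end date for unit 105 (reversing the order of the two PIVOT statements just flips the problem, and I am then missing the start date instead).

Attempt 2 - use LAG/LEAD as before, but this time use CROSS APPLY to put the first/last values into a single column, then pivot the result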

SELECT 
    propertykey
    ,unitNumber
    ,firstDate
    ,lastDate
FROM(
    SELECT
        propertykey
        ,unitNumber
        ,ca.col
        ,ca.value       
    FROM 
    (
        SELECT 
            propertykey         
            ,unitNumber         
            ,rowStartDate       
            ,rowEndDate     
            ,CASE 
                WHEN propertykey = LAG(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                    AND unitNumber = LAG(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                    AND LAG(rowEndDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowStartDate THEN NULL            
                ELSE 'firstDate'
            END ISFIRST
            ,CASE 
                WHEN propertykey = LEAD(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                    AND unitNumber = LEAD(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) 
                    AND LEAD(rowStartDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowEndDate THEN NULL           
                ELSE 'lastDate'
            END ISLAST
        FROM testModelUnit
        WHERE UnavailableCode = 'model'
    ) sub
    OUTER APPLY (
        SELECT ISFIRST, rowStartDate
        UNION ALL
        SELECT ISLAST, rowEndDate
    ) CA (col, value)
    WHERE col IS NOT NULL
)src
PIVOT
(
    max(value)
    for col in ([firstDate],[lastDate])
) AS pivoted

Result:

propertykey  unitNumber firstDate                 lastDate
33           105        2017-01-01 00:00:00.000   9999-12-31 00:00:00.000
33           2606       2017-04-19 10:30:09.227   9999-12-31 00:00:00.000
33           2703       2011-11-11 00:00:00.000   2017-04-19 10:29:47.970

Problem: I got rid of the NULL rows, but I am still missing one record for unit 105.

Desired result:

propertykey  unitNumber firstDate                 lastDate
33           105        2010-11-11 00:00:00.000   2016-12-14 07:51:03.307
33           105        2017-01-01 00:00:00.000   9999-12-31 00:00:00.000
33           2606       2017-04-19 10:30:09.227   9999-12-31 00:00:00.000
33           2703       2011-11-11 00:00:00.000   2017-04-19 10:29:47.970

Are you looking for a query like the one below?

Select PropertyKey, UnitNumber, Min(RowStartDate) as FirstDate, Max(rowEndDate) as LastDate from (
    Select *, Bucket = Row_number() over(partition by propertykey, unitnumber order by rowStartDate) - 
            Row_number() over(partition by propertykey, unitnumber, unavailablecode order by rowStartDate) 
    from testModelUnit
) a
Where a.unavailableCode is not null
group by propertykey, unitNumber, Bucket

The output is as follows:

+-------------+------------+-------------------------+-------------------------+
| PropertyKey | UnitNumber |        FirstDate        |        LastDate         |
+-------------+------------+-------------------------+-------------------------+
|          33 |        105 | 2010-11-11 00:00:00.000 | 2016-12-14 07:51:03.307 |
|          33 |        105 | 2017-01-01 00:00:00.000 | 9999-12-31 00:00:00.000 |
|          33 |       2606 | 2017-04-19 10:30:09.227 | 9999-12-31 00:00:00.000 |
|          33 |       2703 | 2011-11-11 00:00:00.000 | 2017-04-19 10:29:47.970 |
+-------------+------------+-------------------------+-------------------------+
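To see why the difference of the two ROW_NUMBER() values groups consecutive rows, it may help to look at the intermediate columns for unit 105. This is only an illustrative sketch against the sample table from the question; the aliases rnAll and rnPerCode are my own names and do not appear in the query above.

```sql
-- Illustrative only: expose the two row numbers behind the Bucket expression.
-- rnAll and rnPerCode are hypothetical aliases added for this sketch.
SELECT
    propertykey
    ,unitNumber
    ,rowStartDate
    ,unavailableCode
    -- position of the row among ALL rows of the unit
    ,ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber
                        ORDER BY rowStartDate) AS rnAll
    -- position of the row among rows of the unit with the SAME code
    ,ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber, unavailableCode
                        ORDER BY rowStartDate) AS rnPerCode
    -- the difference stays constant while the code does not change
    ,ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber
                        ORDER BY rowStartDate)
     - ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber, unavailableCode
                          ORDER BY rowStartDate) AS Bucket
FROM testModelUnit
WHERE unitNumber = '105'
ORDER BY rowStartDate;

-- With the sample data this returns, in rowStartDate order:
--   unavailableCode  rnAll  rnPerCode  Bucket
--   MODEL            1      1          0
--   MODEL            2      2          0
--   NULL             3      1          2
--   MODEL            4      3          1
--   MODEL            5      4          1
--   MODEL            6      5          1
```

The NULL row in the middle advances rnAll without advancing rnPerCode for 'MODEL', so the later 'MODEL' rows land in a different Bucket. Note that the sample data only contains one non-NULL code; if the real table has several distinct codes, two runs with different codes can end up with the same Bucket value, so it is probably safer to add unavailableCode to the GROUP BY in that case.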

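If you would rather keep the date-contiguity test from the question (a new period starts whenever the previous rowEndDate does not equal the current rowStartDate), a different technique is to flag each period start with LAG and number the periods with a running SUM. This is a sketch under that assumption, not a drop-in replacement; the names flagged, numbered, isIslandStart and islandId are made up for the example.

```sql
-- Alternative sketch: LAG flags the start of each run, a running SUM numbers the runs.
-- flagged, numbered, isIslandStart and islandId are hypothetical names.
WITH flagged AS (
    SELECT
        propertykey
        ,unitNumber
        ,rowStartDate
        ,rowEndDate
        -- 0 when this row continues the previous 'MODEL' row, 1 when it starts a new run
        -- (the first row of a unit has no previous row, so LAG is NULL and it gets 1)
        ,CASE
            WHEN LAG(rowEndDate) OVER (PARTITION BY propertykey, unitNumber
                                       ORDER BY rowStartDate) = rowStartDate
            THEN 0
            ELSE 1
         END AS isIslandStart
    FROM testModelUnit
    WHERE unavailableCode = 'MODEL'
),
numbered AS (
    SELECT
        *
        -- running total of the start flags gives every run its own number
        ,SUM(isIslandStart) OVER (PARTITION BY propertykey, unitNumber
                                  ORDER BY rowStartDate
                                  ROWS UNBOUNDED PRECEDING) AS islandId
    FROM flagged
)
SELECT
    propertykey
    ,unitNumber
    ,MIN(rowStartDate) AS firstDate
    ,MAX(rowEndDate)   AS lastDate
FROM numbered
GROUP BY propertykey, unitNumber, islandId
ORDER BY propertykey, unitNumber, firstDate;
```

On the sample data this should produce the same four rows as the desired result. The difference from the row-number approach is that a gap in the dates splits a period even when no row with a NULL code sits in between, which matches the LAG/LEAD checks in the original attempts.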