Select final state of a row with versioning
I have a table like this:
ID Value1 Value2 value3 Versioning
1 sport tennis 2 1
1 NULL NULL 4 2
1 NULL football NULL 3
1 game NULL NULL 4
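This is actually a custom replication table, copied from one database to another. The logic is as follows:
The first row (versioning = 1) is copied with all of its fields. After that, every time the original table is updated, only the changed values are copied rather than the whole row. So after 4 versions we end up with the data above. What I need is a query that reads this table and returns a single row that represents the final state.
Using our example table, the result I want is: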
ID Value1 Value2 Value3
1 game football 4
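To explain the result: for Value1, the first version has 'sport', the second and third versions make no change, and the fourth version updates it to 'game'. Likewise, for Value2 we have tennis -> no change -> football -> no change, and for Value3 we have 2 -> 4 -> no change -> no change, where each step represents one version.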
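This is rather tricky in SQL Server, because versions before SQL Server 2022 do not support the ignore nulls option on window functions. You can use repeated applys, one per column: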
select t.id, t1.value1, t2.value2, t3.value3
from (values (1)) t(id) outer apply
(select top (1) t1.value1
from yourtable t1
where t1.id = t.id and t1.value1 is not null
order by t1.versioning desc
) t1 outer apply
(select top (1) t2.value2
from yourtable t2
where t2.id = t.id and t2.value2 is not null
order by t2.versioning desc
) t2 outer apply
(select top (1) t3.value3
from yourtable t3
where t3.id = t.id and t3.value3 is not null
order by t3.versioning desc
) t3;
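The query above hard-codes the ID through (values (1)). As a minimal sketch of how the same idea might be extended to every ID in the table, the applies can be driven from the distinct IDs instead. The table variable @T and the aliases i, v1, v2 and v3 are illustrative names, not part of the original answer.

```sql
-- Sample data from the question, held in a table variable (@T is a stand-in name).
DECLARE @T TABLE
(
    ID INT NOT NULL,
    Value1 VARCHAR(50),
    Value2 VARCHAR(50),
    Value3 INT,
    Versioning INT NOT NULL,
    PRIMARY KEY (ID, Versioning)
);

INSERT @T (ID, Value1, Value2, Value3, Versioning)
VALUES (1, 'sport', 'tennis', 2, 1),
       (1, NULL, NULL, 4, 2),
       (1, NULL, 'football', NULL, 3),
       (1, 'game', NULL, NULL, 4);

-- One OUTER APPLY per column: each picks the newest non-null value for the current ID.
SELECT i.ID, v1.Value1, v2.Value2, v3.Value3
FROM (SELECT DISTINCT ID FROM @T) AS i
OUTER APPLY (SELECT TOP (1) t.Value1
             FROM @T AS t
             WHERE t.ID = i.ID AND t.Value1 IS NOT NULL
             ORDER BY t.Versioning DESC) AS v1
OUTER APPLY (SELECT TOP (1) t.Value2
             FROM @T AS t
             WHERE t.ID = i.ID AND t.Value2 IS NOT NULL
             ORDER BY t.Versioning DESC) AS v2
OUTER APPLY (SELECT TOP (1) t.Value3
             FROM @T AS t
             WHERE t.ID = i.ID AND t.Value3 IS NOT NULL
             ORDER BY t.Versioning DESC) AS v3;
```

Because OUTER APPLY keeps the driving row even when the subquery finds nothing, an ID whose column was never populated should still appear, with NULL in that column.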
With a CTE that returns the [Versioning] of the latest non-null value for each of the [ValueX] columns, which is then joined back to the table:
with cte as (
select [ID],
max(case when [Value1] is not null then [Versioning] end) v1,
max(case when [Value2] is not null then [Versioning] end) v2,
max(case when [Value3] is not null then [Versioning] end) v3
from tablename
group by [ID]
)
select c.[ID], t1.[Value1], t2.[Value2], t3.[Value3]
from cte c
inner join tablename t1 on t1.[ID] = c.[ID] and t1.[Versioning] = c.v1
inner join tablename t2 on t2.[ID] = c.[ID] and t2.[Versioning] = c.v2
inner join tablename t3 on t3.[ID] = c.[ID] and t3.[Versioning] = c.v3
See the demo.
Results:
> ID | Value1 | Value2 | Value3
> -: | :----- | :------- | :-----
> 1 | game | football | 4
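One thing worth checking with the inner joins above: if a column has never held a non-null value for some ID, its max(...) is NULL, the join finds no match, and that ID drops out of the result. A possible variant, sketched here with LEFT JOINs and a hypothetical second ID (2) whose Value2 was never set, keeps such rows. The table variable @T is an illustrative stand-in for tablename.

```sql
DECLARE @T TABLE
(
    ID INT NOT NULL,
    Value1 VARCHAR(50),
    Value2 VARCHAR(50),
    Value3 INT,
    Versioning INT NOT NULL,
    PRIMARY KEY (ID, Versioning)
);

INSERT @T (ID, Value1, Value2, Value3, Versioning)
VALUES (1, 'sport', 'tennis', 2, 1),
       (1, NULL, NULL, 4, 2),
       (1, NULL, 'football', NULL, 3),
       (1, 'game', NULL, NULL, 4),
       (2, 'music', NULL, 7, 1);   -- hypothetical row: Value2 never populated

WITH cte AS (
    -- Latest version at which each column was non-null, per ID.
    SELECT ID,
           MAX(CASE WHEN Value1 IS NOT NULL THEN Versioning END) AS v1,
           MAX(CASE WHEN Value2 IS NOT NULL THEN Versioning END) AS v2,
           MAX(CASE WHEN Value3 IS NOT NULL THEN Versioning END) AS v3
    FROM @T
    GROUP BY ID
)
SELECT c.ID, t1.Value1, t2.Value2, t3.Value3
FROM cte AS c
-- LEFT JOIN keeps the ID even when v1, v2 or v3 is NULL.
LEFT JOIN @T AS t1 ON t1.ID = c.ID AND t1.Versioning = c.v1
LEFT JOIN @T AS t2 ON t2.ID = c.ID AND t2.Versioning = c.v2
LEFT JOIN @T AS t3 ON t3.ID = c.ID AND t3.Versioning = c.v3
ORDER BY c.ID;
```

With this sample data, ID 1 should still come back as game, football, 4, and ID 2 as music, NULL, 7, whereas the inner-join form would return only ID 1.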
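You can do this by combining your versioning and your value into a single binary column and then taking the maximum. The query in its shortest form would be: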
SELECT t.ID,
Value1 = CONVERT(VARCHAR(50), SUBSTRING(MAX(CONVERT(BINARY(4), t.Versioning)
+ CONVERT(VARBINARY(50), t.Value1)), 5, 50)),
Value2 = CONVERT(VARCHAR(50), SUBSTRING(MAX(CONVERT(BINARY(4), t.Versioning)
+ CONVERT(VARBINARY(50), t.Value2)), 5, 50)),
Value3 = CONVERT(INT, SUBSTRING(MAX(CONVERT(BINARY(4), t.Versioning)
+ CONVERT(VARBINARY(50), t.Value3)), 5, 50))
FROM YourTable AS t
GROUP BY ID;
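To explain what is going on, I will focus on Value3 only, with cut-down sample data.
The first step of the process is simply to combine the ordering column and the value column into a single binary value: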
SELECT *,
BinaryValue3 = CONVERT(BINARY(2), t.Versioning) + CONVERT(BINARY(2), t.Value3)
FROM (VALUES (1, 2, 1), (1, 4, 2), (1, NULL, 3)) AS t (ID, Value3, Versioning)
Which gives:
ID Value3 Versioning BinaryValue3
--------------------------------------
1 2 1 0x00010002
1 4 2 0x00020004
1 NULL 3 NULL
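We then take the maximum of the binary value. This relies on two things:
- Concatenating NULL yields NULL, so only non-null records have a binary value
- Because binary values sort from left to right, the MAX function will always pick the binary value with the highest version number
Then, once we have the maximum binary value (0x00020004), it is just a case of extracting the right-hand part and converting it back to the original data type.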
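The last two steps can be sketched on the same cut-down data. This is only an illustration of the explanation above: the column names MaxBinaryValue3 and BinaryValue3 and the alias b are made up for the example, and the 2-byte widths match the simplified demo rather than the BINARY(4) used in the full query.

```sql
SELECT t.ID,
       -- Largest combined value; NULL rows are ignored by MAX.
       MaxBinaryValue3 = MAX(b.BinaryValue3),
       -- Skip the 2-byte version prefix and convert the rest back to INT.
       Value3 = CONVERT(INT, SUBSTRING(MAX(b.BinaryValue3), 3, 2))
FROM (VALUES (1, 2, 1), (1, 4, 2), (1, NULL, 3)) AS t (ID, Value3, Versioning)
CROSS APPLY
(   SELECT BinaryValue3 = CONVERT(BINARY(2), t.Versioning) + CONVERT(BINARY(2), t.Value3)
) AS b
GROUP BY t.ID;
```

For this data the maximum is 0x00020004, so the extracted Value3 is 4. Version 3 contributes nothing because its combined value is NULL.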
Full working demo:
DECLARE @T TABLE
(
ID INT NOT NULL,
Value1 VARCHAR(50),
Value2 VARCHAR(50),
value3 INT,
Versioning INT NOT NULL,
PRIMARY KEY (ID, Versioning)
);
INSERT @T (ID, Value1, Value2, Value3, Versioning)
VALUES
(1, 'sport', 'tennis', 2, 1),
(1, NULL, NULL, 4, 2),
(1, NULL, 'football', NULL, 3),
(1, 'game', NULL, NULL, 4);
SELECT t.ID,
Value1 = CONVERT(VARCHAR(50), SUBSTRING(MAX(CONVERT(BINARY(4), t.Versioning)
+ CONVERT(VARBINARY(50), t.Value1)), 5, 50)),
Value2 = CONVERT(VARCHAR(50), SUBSTRING(MAX(CONVERT(BINARY(4), t.Versioning)
+ CONVERT(VARBINARY(50), t.Value2)), 5, 50)),
Value3 = CONVERT(INT, SUBSTRING(MAX(CONVERT(BINARY(4), t.Versioning)
+ CONVERT(VARBINARY(50), t.Value3)), 5, 50))
FROM @T AS t
GROUP BY ID;
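You can also use this method with window functions to add the last non-null value to every row, so if you want to fill in all the NULLs you can use the last non-null value: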
DECLARE @T TABLE
(
ID INT NOT NULL,
Value1 VARCHAR(50),
Value2 VARCHAR(50),
value3 INT,
Versioning INT NOT NULL,
PRIMARY KEY (ID, Versioning)
);
INSERT @T (ID, Value1, Value2, Value3, Versioning)
VALUES
(1, 'sport', 'tennis', 2, 1),
(1, NULL, NULL, 4, 2),
(1, NULL, 'football', NULL, 3),
(1, 'game', NULL, NULL, 4);
SELECT t.ID,
ActualValue1 = t.Value1,
ActualValue2 = t.Value2,
ActualValue3 = t.Value3,
LastNonNUllValue1 = CONVERT(VARCHAR(50), SUBSTRING(MAX(Value1Bin) OVER(PARTITION BY t.ID ORDER BY t.Versioning), 5, 50)),
LastNonNUllValue2 = CONVERT(VARCHAR(50), SUBSTRING(MAX(Value2Bin) OVER(PARTITION BY t.ID ORDER BY t.Versioning), 5, 50)),
LastNonNUllValue3 = CONVERT(INT, SUBSTRING(MAX(Value3Bin) OVER(PARTITION BY t.ID ORDER BY t.Versioning), 5, 50)),
t.Versioning
FROM @T AS t
CROSS APPLY
( SELECT Value1Bin = CONVERT(BINARY(4), t.Versioning) + CONVERT(VARBINARY(50), t.Value1),
Value2Bin = CONVERT(BINARY(4), t.Versioning) + CONVERT(VARBINARY(50), t.Value2),
Value3Bin = CONVERT(BINARY(4), t.Versioning) + CONVERT(VARBINARY(50), t.Value3)
) AS b
ORDER BY t.Versioning;
Which gives:
ID ActualValue1 ActualValue2 ActualValue3 LastNonNUllValue1 LastNonNUllValue2 LastNonNUllValue3 Versioning
------------------------------------------------------------------------------------------------------------------------------
1 sport tennis 2 sport tennis 2 1
1 NULL NULL 4 sport tennis 4 2
1 NULL football NULL sport football 4 3
1 game NULL NULL game football 4 4
For further reading, see Itzik Ben-Gan's The Last non NULL Puzzle
Yet another option.
Here we unpivot your data and then pivot it back.
Example
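For comparison, on SQL Server 2022 or later the same fill-forward could probably be written without the binary trick, since LAST_VALUE accepts IGNORE NULLS there. This is a hedged sketch that assumes that version; it is not part of the original answer, and the output column names simply mirror the ones used above.

```sql
-- Assumes SQL Server 2022 (16.x) or later, where IGNORE NULLS is available.
DECLARE @T TABLE
(
    ID INT NOT NULL,
    Value1 VARCHAR(50),
    Value2 VARCHAR(50),
    Value3 INT,
    Versioning INT NOT NULL,
    PRIMARY KEY (ID, Versioning)
);

INSERT @T (ID, Value1, Value2, Value3, Versioning)
VALUES (1, 'sport', 'tennis', 2, 1),
       (1, NULL, NULL, 4, 2),
       (1, NULL, 'football', NULL, 3),
       (1, 'game', NULL, NULL, 4);

SELECT t.ID,
       -- Most recent non-null value up to and including the current version.
       LastNonNullValue1 = LAST_VALUE(t.Value1) IGNORE NULLS
           OVER (PARTITION BY t.ID ORDER BY t.Versioning
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
       LastNonNullValue2 = LAST_VALUE(t.Value2) IGNORE NULLS
           OVER (PARTITION BY t.ID ORDER BY t.Versioning
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
       LastNonNullValue3 = LAST_VALUE(t.Value3) IGNORE NULLS
           OVER (PARTITION BY t.ID ORDER BY t.Versioning
                 ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW),
       t.Versioning
FROM @T AS t
ORDER BY t.Versioning;
```

On earlier versions this syntax is not available, which is why the binary concatenation approach above is useful.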
Select *
From (
Select Top 1 with ties
A.ID
,B.*
From YourTable A
Cross Apply ( values ('Value1',Value1)
,('Value2',Value2)
,('Value3',convert(varchar(50),Value3))
) B(Item,Value)
Where Value is not null
Order By row_number() over (partition by id,item order by versioning desc)
) src
Pivot (max(value) for item in ([Value1],[Value2],[Value3]) ) pvt
Returns
ID Value1 Value2 Value3
1 game football 4
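Assuming your table is 'tablename', the following code returns the latest values and can be extended to any other columns. Note that it does not filter or group by ID, so as written it only works when the table holds a single ID: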
select
(SELECT TOP 1 Value1 FROM tablename WHERE Value1 IS NOT NULL ORDER BY Versioning desc) Value1,
(SELECT TOP 1 Value2 FROM tablename WHERE Value2 IS NOT NULL ORDER BY Versioning desc) Value2,
(SELECT TOP 1 Value3 FROM tablename WHERE Value3 IS NOT NULL ORDER BY Versioning desc) Value3