Get successive differences of rows, including both the first and last row, grouped by one or more columns

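I am trying to get successive differences of rows of data in SQL, including the differences between the first and last rows and 0, where the rows are grouped by multiple columns.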

I have two tables that look like this

Date                       Value

+------------+-------+     +------------+-------+------+------+
| Date       | Name  |     | Date       | Value | Name | Type |
+------------+-------+     +------------+-------+------+------+
| 2019-10-10 | A     |     | 2019-10-11 | 10    | A    | X    |
| 2019-10-11 | A     |     | 2019-10-12 | 11    | A    | X    |
| 2019-10-12 | A     |     | 2019-10-14 | 20    | A    | X    |
| 2019-10-13 | A     |     | 2019-10-11 | 10    | A    | Y    |
| 2019-10-14 | A     |     | 2019-10-12 | 22    | A    | Y    |
| 2019-10-15 | A     |     | 2019-10-14 | 30    | A    | Y    |
| 2019-10-10 | B     |     | 2019-10-11 | 10    | B    | X    |
| 2019-10-11 | B     |     | 2019-10-12 | 33    | B    | X    |
| 2019-10-12 | B     |     | 2019-10-14 | 40    | B    | X    |
| 2019-10-13 | B     |     | 2019-10-11 | 10    | B    | Y    |
| 2019-10-14 | B     |     | 2019-10-12 | 44    | B    | Y    |
| 2019-10-15 | B     |     | 2019-10-15 | 50    | B    | Y    |
+------------+-------+     +------------+-------+------+------+

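The Date table contains a range of dates for each name. The Value table contains values of different types for each name. I want to get a set of successive differences for each value, grouped by Name and Type.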

The end result I am looking for is

+------------+-------+------+-------+---------------+------------+
| Date       | Name  | Type | Value | PreviousValue | Difference |
+------------+-------+------+-------+---------------+------------+
| 2019-10-11 | A     | X    | 10    | 0             | 10         |
| 2019-10-12 | A     | X    | 11    | 10            | 1          |
| 2019-10-14 | A     | X    | 20    | 11            | 9          |
| 2019-10-15 | A     | X    | 0     | 20            | -20        |
| 2019-10-11 | A     | Y    | 10    | 0             | 10         |
| 2019-10-12 | A     | Y    | 22    | 10            | 12         |
| 2019-10-14 | A     | Y    | 30    | 22            | 8          |
| 2019-10-15 | A     | Y    | 0     | 30            | -30        |
| 2019-10-11 | B     | X    | 10    | 0             | 10         |
| 2019-10-12 | B     | X    | 33    | 10            | 23         |
| 2019-10-14 | B     | X    | 40    | 33            | 7          |
| 2019-10-15 | B     | X    | 0     | 40            | -40        |
| 2019-10-11 | B     | Y    | 10    | 0             | 10         |
| 2019-10-12 | B     | Y    | 44    | 10            | 34         |
| 2019-10-15 | B     | Y    | 50    | 44            | 6          |
+------------+-------+------+-------+---------------+------------+

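Note that the rows for the B/Y group illustrate an important point: we may already have a value for the last date, in which case no "extra" row is needed for that group.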

The closest I can get right now is

SELECT
    d.[Date],
    d.[Name],
    v.[Type],
    v.[Value],
    [PreviousValue] = COALESCE(LAG(v.[Value]) OVER (PARTITION BY d.[Name], v.[Type] ORDER BY d.[Date]), 0),
    [Difference] = v.[Value] - COALESCE(LAG(v.[Value]) OVER (PARTITION BY d.[Name], v.[Type] ORDER BY v.[Date]), 0)
FROM
    [Dates] d
LEFT JOIN
    [Values] v
ON
    d.[Date] = v.[Date]
    AND d.[Name] = v.[Name]

But this does not produce the difference for the last row.

Just use lag() with the default-value argument:

[PreviousValue] = COALESCE(LAG(v.Value, 1, 0) OVER (PARTITION BY d.[Name], v.[Type] ORDER BY d.[Date]), 0)
[Difference] = v.[Value] - COALESCE(LAG(v.Value, 1, 0) OVER (PARTITION BY d.[Name], v.[Type] ORDER BY v.[Date]), 0)
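
As a minimal sketch of what the third argument does (the inline rows are just the A/X values from the question, and the alias t is made up for illustration), LAG(expr, 1, 0) returns 0 instead of NULL when the partition has no previous row:

```sql
-- Sketch only: inline rows copied from the A/X group in the question.
-- The alias t and its column list are hypothetical, not part of the real schema.
SELECT
    t.[Date],
    t.[Value],
    -- offset 1 = previous row; default 0 is used when no previous row exists
    [PreviousValue] = LAG(t.[Value], 1, 0) OVER (ORDER BY t.[Date]),
    [Difference] = t.[Value] - LAG(t.[Value], 1, 0) OVER (ORDER BY t.[Date])
FROM (VALUES
    (CAST('2019-10-11' AS DATE), 10),
    (CAST('2019-10-12' AS DATE), 11),
    (CAST('2019-10-14' AS DATE), 20)
) AS t([Date], [Value])
ORDER BY t.[Date];
```

For these three rows, PreviousValue comes out as 0, 10, 11 and Difference as 10, 1, 9. Note that the default only covers the first row of each partition; it does not by itself add a trailing row for a date that has no value.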

Since some data is missing on both sides, you have to make up for it somehow.

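One trick is to create the missing data through careful joins.
The example below first joins the types to the Dates data. That way the FULL JOIN to the Values data can also be done on the type.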

Then, after adding enough COALESCE or ISNULL calls, calculating the metrics becomes easy.

CREATE TABLE [Dates](
   [Date] DATE  NOT NULL,
   [Name] VARCHAR(8) NOT NULL,
   PRIMARY KEY ([Date], [Name])
);
INSERT INTO [Dates]
([Date], [Name]) VALUES
  ('2019-10-10','A')
, ('2019-10-11','A')
, ('2019-10-12','A')
, ('2019-10-13','A')
, ('2019-10-14','A')
, ('2019-10-15','A')
, ('2019-10-10','B')
, ('2019-10-11','B')
, ('2019-10-12','B')
, ('2019-10-13','B')
, ('2019-10-15','B')
;
CREATE TABLE [Values](
   [Id] INT IDENTITY(1,1) PRIMARY KEY,
   [Date]  DATE  NOT NULL,
   [Name] VARCHAR(8) NOT NULL,
   [Value] INTEGER  NOT NULL,
   [Type] VARCHAR(8) NOT NULL
);
INSERT INTO [Values]
([Date], [Value], [Name], [Type]) VALUES
  ('2019-10-11', 10, 'A', 'X')
, ('2019-10-12', 11, 'A', 'X')
, ('2019-10-14', 20, 'A', 'X')
, ('2019-10-11', 10, 'A', 'Y')
, ('2019-10-12', 22, 'A', 'Y')
, ('2019-10-14', 30, 'A', 'Y')
, ('2019-10-11', 10, 'B', 'X')
, ('2019-10-12', 33, 'B', 'X')
, ('2019-10-14', 40, 'B', 'X')
, ('2019-10-11', 10, 'B', 'Y')
, ('2019-10-12', 44, 'B', 'Y')
, ('2019-10-15', 50, 'B', 'Y')
;
WITH CTE_DATA AS
(
  SELECT 
    [Name] = COALESCE(d.[Name],v.[Name])
  , [Type] = COALESCE(tp.[Type],v.[Type])
  , [Date] = COALESCE(d.[Date],v.[Date])
  , [Value] = ISNULL(v.[Value], 0)
  FROM [Dates] AS d
  INNER JOIN 
  (
    SELECT [Name], [Type], MAX([Date]) AS [Date]
    FROM [Values]
    GROUP BY [Name], [Type]
  ) AS tp
    ON tp.[Name] = d.[Name]
  FULL JOIN [Values] AS v
    ON v.[Date] = d.[Date]
   AND v.[Name] = d.[Name]
   AND v.[Type] = tp.[Type]
  WHERE v.[Type] IS NOT NULL
     OR d.[Date] > tp.[Date]
)
SELECT 
  [Name], [Type], [Date], [Value]
, [PreviousValue] = ISNULL(LAG([Value]) OVER (PARTITION BY [Name], [Type] ORDER BY [Date]), 0)
, [Difference] = [Value] - ISNULL(LAG([Value]) OVER (PARTITION BY [Name], [Type] ORDER BY [Date]), 0)
FROM CTE_DATA
ORDER BY [Name], [Type], [Date];

Name | Type | Date                | Value | PreviousValue | Difference
:--- | :--- | :------------------ | ----: | ------------: | ---------:
A    | X    | 11/10/2019 00:00:00 |    10 |             0 |         10
A    | X    | 12/10/2019 00:00:00 |    11 |            10 |          1
A    | X    | 14/10/2019 00:00:00 |    20 |            11 |          9
A    | X    | 15/10/2019 00:00:00 |     0 |            20 |        -20
A    | Y    | 11/10/2019 00:00:00 |    10 |             0 |         10
A    | Y    | 12/10/2019 00:00:00 |    22 |            10 |         12
A    | Y    | 14/10/2019 00:00:00 |    30 |            22 |          8
A    | Y    | 15/10/2019 00:00:00 |     0 |            30 |        -30
B    | X    | 11/10/2019 00:00:00 |    10 |             0 |         10
B    | X    | 12/10/2019 00:00:00 |    33 |            10 |         23
B    | X    | 14/10/2019 00:00:00 |    40 |            33 |          7
B    | X    | 15/10/2019 00:00:00 |     0 |            40 |        -40
B    | Y    | 11/10/2019 00:00:00 |    10 |             0 |         10
B    | Y    | 12/10/2019 00:00:00 |    44 |            10 |         34
B    | Y    | 15/10/2019 00:00:00 |    50 |            44 |          6

A test on db<>fiddle here